This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
oo. Lemma 3 .13 For all i and u, 00 f C_ hj f e(Y)G+(i, j; dy) : 1(u) < C+ > hj u u} coincide. In particular, ,b(u,T) = P(VT > u),
e(1tL)G+(i,j;dy).
0 G+(i,7; dy) = aj f Bj(dy - x ) R(i,j; dx).
00
Thus
C+ > hj f"o e7(Y-u)G +(i,
j; dy)
jEE u 0
00
C+ ijhj f R(i, j; dx) f e7( v-u)Bj (dy - x) jEE
00
u
0 //^^ C+E,3jhj
// f 00
jEE
R(i, j; dx)
100 ery(&-u+x)Bj (dy) C . Bj(u Bj (u - x)
0
R(i, j; dx ) Bj (u - x ) = Gi(u),
E Qj f jEE
°O
x)
3. CHANGE OF MEASURE VIA EXPONENTIAL FAMILIES
167 ❑
proving the upper inequality, and the proof of the lower one is similar.
Proof of Theorem 3.13 Let first cp=°)(u) = C_ hie-"u in Lemma 3.13. We claim by induction that then cpin) (u) > C_ hie-7u for all n, from which the lower inequality follows by letting n -* oo. Indeed, this is obvious if n = 0, and assuming it shown for n, we get
Wo n +1) (u) = ? 7 i ( U ) + E J u gyp;n) ( u - y)G+(i, j; dy)
(3.10)
jEE o
C_ 1 f hje7(y- u)G+(i, j; dy) jEE u
U +C_ hje7( y-u)G
+(i, j; dy)
jEE""
C_e-7u 57 O+[i, j; y]hj = C_ e-7uhi,
(3.11)
jEE
estimating the first term in (3.10 ) by Lemma 3 . 13 and the second by the induction hypothesis , and using Lemma 3 . 9 for the last equality in (3.11). ❑ The proof of the upper inequality is similar , taking cps°) (u) = 0. Here is an estimate of the rate of convergence of the finite horizon ruin probabilities 'i (u, T) = Pi (7- (u) < T ) to 0i (u) which is different from Theorem 3.10: Theorem 3 . 14 Let yo > 0 be the solution of 'c'(yo ) = 0, let C+(yo) be as in (3.8) with -y replaced by yo and hi by h=7o ), and let 8 = e'(70). Then
0<
Vi (u )
-
0i(u,
T) < C+(')' o)hi7u)e-7ou8T .
(3.12)
Proof We first note that just as in the proof of Theorem 3.11, it follows that Vi(u) < C_(yo)
h=70)e-7ou.
(3.13)
Hence, letting MT = maxo
=
C h(7o)e-7ou8T . +i
168
CHAPTER VI. MARKOVIAN ENVIRONMENT
Notes and references The results and proofs are from Asmussen and Rolski [44]. Further related discussion is given in Grigelionis [176], [177].
4 Comparisons with the compound Poisson model 4a Ordering of the ruin functions For two risk functions 0', ", we define the stochastic ordering by 0' < s.o. V)" if z/i'(u) <'c"" (u), u > 0. (4.1) Obviously, this correponds to the usual stochastic ordering of the maxima M', M" of the corresponding two claim surplus proceses (note that 0'(u) _ P(M' > u), 0"(u) = P(M" > u)) Now consider the risk process in a Markovian environment and define i' (u) _ >iEE irioi(u). It was long conjectured that -0* Vi,,, where o*(u) is the ruin probability for the averaged compound Poisson model defined in Section 1 and ,0,, is the one for the Markov-modulated one in the stationary case (the distribution of J0 is 7r). The motivation that such a result should be true came in part from numerical studies, in part from the folklore principle that any added stochastic variation increases the risk, and finally in part from queueing theory, where it has been observed repeatedly that Markov-modulation increases waiting times and in fact some partial results had been obtained. The results to be presented show that quite often this is so, but that in general the picture is more diverse. The conditions which play a role in the following are: ,31:5)32 ... < ,3p.
(4.2)
Bl <_s.o. B2 <_s.o.... <s.o. Bp.
(4.3)
The Markov process {Jt} is stochastically monotone (4.4) To avoid trivialities, we also assume that there exist i # j such that either /3i <,33 or Bi 0 Bj. Occasionally we strengthen (4.3) to B = Bi does not depend on i.
(4.5)
Note that whereas (4.2) alone just amounts to an ordering of the states, this is not the case for (4.3). For the notion of monotone Markov processes, we refer to
4. COMPARISONS WITH THE COMPOUND POISSON MODEL
169
Stoyan [352], Section 4.2; note that (4.4) is automatic in some simple examples like birth-death processes or p = 2 . Conditions (4.2)-(4.4) say basically that if i < j , then j is the more risky state , and it is in fact easy to show that Vii(u ) < t/j(u) (this is used in the derivation of (4.9 ) below). Theorem 4 .1 Assume that conditions (4.2)-(4.4) hold. Then V,* For the proof, we need two lemmas. The first is a standard result going back to Chebycheff and appearing in a more general form in Esary, Proschan & Walkup [140], the second follows from an extension of Theorem I1.6.5 (cf. also Proposition 2.1) which with basically the same proof can be found in Asmussen & Schmidt [49]. Lemma 4 . 2 If al < ... < a,,, b1 < ... < bp and 7ri > 0 (i = 1, ... , p), ^i 7ri = 1, then P
P
P 7rjbj.
1:7riaibi > E 7riai i=1 i=1 j=1
The equality holds if and only if a1 = ... = aP or b1 = ... = b,,. Lemma 4 . 3 (a) P,r (JT(o) = i, 7-(0) < oo) = pirf+), where 7r2+) = QiµBilri/p, (b) P,r (Sr(o) E dx Jr(o) = i, T(0) < oo) = Bi(x) dx/tcai . Proof of Theorem 4.1. Conditioning upon the first ladder epoch, we obtain (cf. Proposition 2.1 for the first term in (4.7) and Lemma 4.3 for the second) *(u)
_
/3 *B* (u) +,13*
J0 u 0*(u - x)B*(x) dx, (4.6)
fl*B*(u) + p> s=1
7r=
+)
b
f0 u
(u -
x)Bt (x) /pB,
dx (4.7)
7ri,3iBi(x)YPi(u - x)dx _ /3*B*(u) + f u / ^ t=1 ^j
B* ( ) > 3 *+ f
Tri/iBd(x) . E 7r i Wi(u - x) dx u o i =1 i=1
Q*B*(u)+,3* f uB(x) z/^,r(u -x)dx. 0
(4.8)
(4.9) (4.10)
Here (4.9) follows by considering the increasing functions 3iBi (x) and Oi (u - x) of i and using Lemma 4.2. Comparing (4.10) and (4.6), it follows by a standard
CHAPTER VI. MARKOVIAN ENVIRONMENT
170
argument from renewal theory that tk.., dominates the solution 0* to the renewal equation (4.6). ❑ Here is a counterexample showing that the inequality tp* (u) < V),, (u) is not in general true: Proposition 4.4 Assume that ,3µi < 1 for all i, that P
P
/^2 /^ ^i/ji pBi < 1il3i i=1
i=1
/^ 1r1NiµBi
(4.11)
i=1`
and that A has the form eAo for some fixed intensity matrix A0. Then i/i*(u) < ,0,r (u ) fails for all sufficiently small e > 0. Proof Since 0..(0) = V,* (0), it is sufficient to show that 0',(0) < b *'(0) for e small enough. Using (4.6), (4.8) we get P
'*' (0) =
P
-3* + /3*1* (0) _ > lri'3qqi • E 7i/ipBi - /3*, i=1
i=1
7'r(0) _ EFioiwi(0) - 0*•
i=1
But it is intuitively clear (see Theorem 3.2.1 of [145] for a formal proof) that z/ii(u) converges to the ruin probability for the compound Poisson model with parameters ,3i, Bi as e J. 0. For u = 0, this ruin probability is /3iPBi, and from this the claim follows. ❑ To see that Proposition 4.4 is not vacuous, let = ( 1/2 1/2 ) , 01 = 10-3, Q2 = 1, µB, = 102, µB2 = 10-4. Then the l.h.s. of (4.11) is of order 10-4 and the r.h.s. of order 10-1. Notes and references The results are from Asmussen, Frey, Rolski & Schmidt [32]. As is seen, they are at present not quite complete. What is missing in relation to Theorem 4.1 and Proposition 4.4 is the understanding of whether the stochastic monotonicity condition (4.4) is essential (the present author conjectures it is).
4b Ordering of adjustment coefficients Despite the fact that V)* (u) < *,, (u) may fail for some u, it will hold for all sufficiently large u, except possibly for a very special situation . Recall that
4. COMPARISONS WITH THE COMPOUND POISSON MODEL
171
the adjustment coefficient for the Markov-modulated model is defined as the solution -y > 0 of ic(-y) = 0 where c(a) is the eigenvalue with maximal real part of the matrix A + (rci(a))diag where rci(a) = ai(Bi[a] - 1) - a. The adjustment coefficient -y* for the averaged compound Poisson model is the solution > 0 of rc*(ry*) = 0 where rc*(a) _ 13*(B*[a] - 1) - a = E irirci(a). (4.12) iEE
Theorem 4.5 y < ry*, with strict inequality unless rci (y*) does not depend on iEE. Lemma 4.6 Let (di)iEE be a given set of constants satisfying EiEE iribi = 0 and define A(a) as the eigenvalue with maximal real part of the matrix A + a(bi)diag• Then )t(a) > 0, with strict inequality unless a = 0 or bi = 0 for all i E E. Proof Define
X= f &ids. Then {(Jt, Xt)} is a Markov additive process (a so-called Markovian fluid model, cf. e.g. Asmussen [20]) as discussed in 11.5, and by Proposition II.5.2 we have (Ei[e"X'; Jt = i])' EE = vA+n(6.)a..g.
Further (see Corollary 11.5.7) )i is convex with
A'(0) = lim
EXt
t-ioo t
=
70i
= 0, (4.13)
iEE
A„(O) iioo varXt t t
(4.14)
By convexity, (4.13) implies A(a) > 0 for all a.
Now we can view {Xt} as a cumulative process (see A.ld) with generic cycle w = inf{t>0: Jt_54 k,Jt=kI A (the return time of k) where k E E is some arbitrary but fixed state. It is clear that the distribution of X,, is non-degenerate unless bi does not depend on i E E, which in view of EiEE 1ibi = 0 is only possible if Si = 0 for all i E E. Hence if 5i 54 0 for some i E E, it follows by Proposition A1.4(b) that the limit in (4.14) is non-zero so that A"(0) > 0. This implies that A is strictly convex, in particular .(a) > 0 for all a 0 0. 0
172
CHAPTER VI. MARKOVIAN ENVIRONMENT
Proof of Theorem 4.5. Let bi = rci(y*), a = 1 in Lemma 4.6. Then > risi = 0 because of (4.12) and rc*(y*) = 0. Further a(1) = rc(y*) by definition of A(.) and rc (•). Hence rc (y*) > 0. Since ic is convex with rc'(0) < 0 , this implies that the solution y > 0 of K(y) = 0 must satisfy y < y*. If rci(y* ) is not a constant function of i E E, we get rc (y*) > 0 which in a similar manner implies that ❑ y < y*. Notes and references Theorem 4.5 is from Asmussen & O'Cinneide [40], improving upon more incomplete results from Asmussen, Frey, Rolski & Schmidt [32].
4c Sensitivity estimates for the adjustment coefficient Now assume that the intensity matrix for the environment is Ae = Ao/ e, whereas the ,Qi and Bi are fixed . The corresponding adjustment coefficient is denoted by ry(e). Thus -y(e) -* y* as e 10, and our aim is to compute the sensitivity
ay
ae
E=O
A dual result deals with the limit a -4 oo. Here we put a = 1/e, note that y(a) -+ mins=1,...,p yi and compute
8y 8a
a=0
In both cases, the basic equation is (A + (rci(y))diag)h = 0, where A, y, h depend on the parameter (e or a). In the case of e, multiply the basic equation by a to obtain 0 = (A0 + e(r£i(y))diag)h,
0 = ((ri(-Y))diag + ery (4{('Y))diag)h + (A0 + e(?i'Y))diag)h'. (4.15)
Normalizing h by 7rh = 0, we have 7rh' = 0, h(0) = e. Hence letting e = 0 in (4.15) yields
0 = (Ii(y*)) diage + Aoh'(0) = (rci('Y*)) diage + (Ao - eir)h'(0), h'(0) = -(Ao - e7r)-1 (Ici(Y*))diage. (4.16) Differentiating (4.15) once more and letting e = 0 we get
5. THE MARKOVIAN ARRIVAL PROCESS
173
0 = 27'(0)(r-i(`Y *)) diage + 2(ci('Y* )) diag h' (0) + Aoh" (0)
, (4.17)
0 = 27'(0)p+27r(rs;i(7' *))diagh'(0),
(4.18)
multiplying (4.17) by 7r to the left to get (4.18). Inserting (4.16) yields Proposition 4.7 8ry
aE
= 1 7r(ci ('Y*))diag ( Ao -e7r)-1(Xi(-Y*))diage *=0 P
Now turn to the case of a. We assume that 0 < -y < 7i,
i = 2, ... , p. (4.19)
Then 'y -^ ryl as a ^ 0 and we may take h(0) = el (the first unit vector). We get
0 = (aAo + ( lc&Y))diag)h, 0 = (Ao + ry'(ii(-Y)) diag )h + (aAo + (Ki(7'))diag)h'. (4.20) Letting a = 0 in (4.20) and multiplying by el to the left we get 0 = All + 7'(0)rci (0) + 0 (here we used icl (ry(0)) = 0 to infer that the first component of K[7(0)]h'( 0) is 0), and we have proved:
All
Proposition 4.8 If (4.19) holds, then 8a a=o
rci (0)
Notes and references The results are from Asmussen, Frey, Rolski & Schmidt [32]. The analogue of Proposition 4.8 when ryi < 0 for some i is open.
5 The Markovian arrival process We shall here briefly survey an extension of the model, which has recently received much attention in the queueing literature, and may have some relevance in risk theory as well (though this still remains to be implemented). The additional feature of the model is the following: • Certain transitions of {Jt} from state i to state j are accompanied by a claim with distribution Bid; the intensity for such a transition (referred to as marked in the following) is denoted by Aii l and the remaining intensity
CHAPTER VI. MARKOVIAN ENVIRONMENT
174
f o r a transition i -+ j by A
. For i = j, we use the convention that
a1i = f3i where 3i is the Poisson rate in state i, that Bii = Bi , and that are determined by A = A(l ) +A(2) where A is the intensity matrix the governing {Jt}. Thus , the Markov-modulated compound Poisson model considered sofar corresponds to A(l) = (,6i ) diag, A(1) = A - (13i )diag, Bii = Bi ; the definition of Bij is redundant for i i4 j. Note that the case that 0 < qij < 1, where qij is the probability that a transition i -* j is accompanied by a claim with distribution, is neither 0 or 1 is covered by letting Bij have an atom of size qij at 0. Again , the claim surplus is a Markov additive process (cf. II.4). The extension of the model can also be motivated via Markov additive processes: if {Nt} is the counting process of a point process, then {Nt} is a Markov additive process if and only if it corresponds to an arrival mechanism of the type just considered. Here are some main examples: Example 5 .1 (PHASE-TYPE RENEWAL ARRIVALS) Consider a risk process where the claim sizes are i.i.d. with common distribution B, but the point process of arrivals is not Poisson but renewal with interclaim times having common distribution A of phase-type with representation (v, T). In the above setting, we may let {Jt} represent the phase processes of the individual interarrival times glued together (see further VIII.2 for details), and the marked transitions are then the ones corresponding to arrivals. This is the only way in which arrivals can occur, and thus
1i = 0, A(l) = T, A(l) = tv, Bij = B; the definition of Bi is redundant because of f3i = 0.
❑
Example 5 .2 (SUPERPOSITIONS) A nice feature of the set-up is that it is closed under superposition of independent arrival streams . Indeed, let { Jt 1) }, j(2)
} be two independent environmental processes and let E(k), A(1'k) A(2 k1),
B;^) etc. refer to
{ Jt k)
}. We then let (see the Appendix for the Kronecker
notation)
E = E(1) x E(2), Jt = (Jtl), Jt2)) (2;2) A(1) = A(' 1) ® A(1;2), A ( 2) = A (2`1 ) ® A,
5. THE MARKOVIAN ARRIVAL PROCESS
175
-
Bij,kj = Bik) B13 4k = Bak)
(the definition of the remaining Bij,kl is redundant). In this way we can model, e.g., superpositions of renewal processes. ❑ Example 5 .3 (AN INDIVIDUAL MODEL) In contrast to the collective assumptions (which underly most of the topics treated sofar in this book and lead to Poisson arrivals), assume that there is a finite number N of policies. Assume further that the ith policy leads to a claim having distribution Ci after a time which is exponential, with rate ai, say, and that the policy then expires. This means that the environmental states are of the form i1i2 • • • iN with il, i2i ... E 10, 11, where ik = 0 means that the kth policy has not yet expired and ik = 1 that it has expired. Thus, claims occur only at state transitions for the environment so that AN2... iN,1i2 ... iN = all BOi2... iN,1i2...iN C17 AilO...iN,iil...iN = a2, Bilo...iN,iil...iN = C27
All other off-diagonal elements of A are zero so that all other Bii are redundant. Similarly, all Al i2...iN are zero and all Bi are redundant. Easy modifications apply to allow for • the time until expiration of the kth policy is general phase-type rather than exponential;
• upon a claim, the kth policy enters a recovering state, possibly having a general phase-type sojourn time, after which it starts afresh.
Example 5 .4 (A SINGLE LIFE INSURANCE POLICY ) Consider the life insurance of a single policy holder which can be in one of several states, E = { WORKING, RETIRED, MARRIED, DIVORCED, WIDOWED, INVALIDIZED, DEAD etc.}. The individual pays at rate pi when in state i and receives an amount having distribution Bij when his/her state changes from i to j.
❑
Notes and references The point process of arrivals was studied in detail by Neuts [267] and is often referred to in the queueing literature as Neuts ' versatile point process , or, more recently, as the Markovian arrival process ( MAP). However , the idea of arrivals at transition epochs can be found in Hermann [193] and Rudemo [313]. The versatility of the set-up is even greater than for the Markov-modulated model. In fact , Hermann [193 ] and Asmussen & Koole [37] showed that in some appropriate
CHAPTER VI. MARKOVIAN ENVIRONMENT
176
sense any arrival stream to a risk process can be approximated by a model of the type studied in this section : any marked point process is the weak limit of a sequence of such models . For the Markov-modulated model, one limitation for approximation purposes is the inequality Var Nt > ENt which needs not hold for all arrival streams. Some main queueing references using the MAP are Ramaswami [298], Sengupta [336], Lucantoni [248], Lucantoni et at. [248], Neuts [271] and Asmussen & Perry [42].
6 Risk theory in a periodic environment 6a The model We assume as in the previous part of the chapter that the arrival mechanism has a certain time-inhomogeneity, but now exhibiting (deterministic) periodic fluctuations rather than (random ) Markovian ones. Without loss of generality, let the period be 1; for s E E = [0, 1), we talk of s as the 'time of the year'. The basic assumptions are as follows: • The arrival intensity at time t of the year is 3(t) for a certain function /3(t), 0 < t < 1; • Claims arriving at time t of the year have distribution B(t); • The premium rate at time t of the year is p(t). By periodic extension, we may assume that the functions /3(t), p(t) and B(t) are defined also for t t [0, 1). Obviously, one needs to assume also (as a minimum) that they are measurable in t; from an application point of view, continuity would hold in presumably all reasonable examples. We denote throughout the initial season by s and by P(8) the corresponding governing probability measure for the risk process. Thus at time t the premium rate is p(s + t), a claim arrives with rate /3(s + t) and is distributed according to B(8+0 . Let 1
1
/3* _ f /3(t) dt, B* =
J
t
1
B(t) ((*) dt, p * = 0 p(t) dt. )3
J
(6.1)
Then the average arrival rate is /3* and the safety loading rt is 77 = (p* - p)/p, where
f
i f00 p = f /3(v) dv xB(°) (dx) _ ,3*µs • 0 0
(6.2)
Note that p is the average net claim amount per unit time and µ* = p//3* the average mean claim size.
6. RISK THEORY IN A PERIODIC ENVIRONMENT 177 In a similar manner as in Proposition 1.8, one may think of the standard compound Poisson model with parameters 3*, B*, p* as an averaged version of the periodic model, or, equivalently, of the periodic model as arising from the compound Poisson model by adding some extra variability. Many of the results given below indicate that the averaged and the periodic model share a number of main features. In particular, it turns out that they have the same adjustment coefficient. In contrast, for Markov-modulated model typically the adjustment coefficient is larger than for the averaged model (cf. Section 4b), in agreement with the general principle of added variation increasing the risk (cf. the discussion in 111.9). The behaviour of the periodic model needs not to be seen as a violation of this principle, since the added variation is deterministic, not random. Example 6 .1 As an example to be used for numerical illustration throughout this section, let ,3(t) = 3A(1 + sin 27rt), p(t) = A and let B(t) be a mixture of two exponential distributions with intensities 3 and 7 and weights w(t) _ (1 +cos27rt)/2 and 1 - w(t), respectively. It is easily seen that ,3* = 3A, p* = A whereas B* is a mixture of exponential distributions with intensities 3 and 7 and weights 1/2 for each (1/2 = ff w(t)dt = f o (1- w(t)) dt). Thus, the average compound Poisson model is the same as in III.(3.1) and Example 1.10, and we recall from there that the ruin probability is
*(u) _ 3245e-u1 + 35e-6u. (6.3) Note that A enters just as a scaling factor of the time axis, and thus the averaged standard compound Poisson models have the same risk for all A. In contrast, we shall see that for the periodic model increasing A increases the effect of the periodic fluctuations. ❑ Remark 6 .2 Define T
6(T) = p(t ) dt, St = Se-I(t). 0 Then (by standard operational time arguments )
{St}
is a periodic risk process
with unit premium rate and the same infinite horizon ruin probabilities. We ❑ assume in the rest of this section that p(t) - 1. The arrival process {Nt}t>0 is a time-inhomogeneous Poisson process with intensity function {/3(s + t)}t>0 . The claim surplus process {St } two is defined in the obvious way as St = ^N° Ui - t.
Thus , the conditional distribution
CHAPTER VL MARKOVIAN ENVIRONMENT
178
of U; given that the ith claim occurs at time t is B(8+t). As usual, r(u) _ inf It > 0 : St > u} is the time to ruin , and the ruin probabilities are
0(8) (U) = P(s )(r(u) < 00), 0 (5)(u,T) = P(8)(r(u)
Jt = (s + t) mod 1
P(8) - a.s .. (6.4)
At a first sight this point of view may appear quite artificial, but it turns out to have obvious benefits in terms of guidelining the analysis of the model as a parallel of the analysis for the Markovian environment risk process. Notes and references The model has been studied in risk theory by, e.g., Daykin et.al. [101] , Dassios & Embrechts [98] and Asmussen & Rolski [43], [44] (the literature in the mathematical equivalent setting of queueing theory is somewhat more extensive, see the Notes to Section 7). The exposition of the present chapter is basically an extract from [44], with some variants in the proofs.
6b Lundberg conjugation Motivated by the discussion in Chapter II.5 (see in particular Remark 11.5.8), we start by deriving formulas giving the m.g.f. of the claim surplus process. To this end, let
f 8+1 tc *(a) _ (B* [a] - 1) -a =
J8
,3(v)(B(vl [a] - 1) dv - a
be the c.g.f. of the averaged compound Poisson model (the last expression is independent of s by periodicity), and define h(s;a)
= exp { - ^8 [,Q(v) (B(„) [a] - 1) - a - tc* (a)] dv
then h (.; a) is periodic on R. J Theorem 6 . 3 E(8)eaSt = h(s; a) etw*(a) h(s+t;a) Proof Conditioning upon whether a claim occurs in [t, t + dt] or not, we obtain E.(8) [eaSt+dt
I7t]
=
(1 - (3(s + t)dt)e«St -adt + /3(s + t)dt - east B(8+t) [a]
=
east - (1 - adt +,3(s + t)dt[B(8+t)[a] - 1]) ,
179
6. RISK THEORY IN A PERIODIC ENVIRONMENT E(8)east+ dt
=
E(8)east (1 - adt +,0(s + t)dt[B(8+t)[a] - 1]) ,
Et.(8)east
=
E (8)east (-a +,3(s +
dt log E(8)east
=
-a + f3(s + t) [B(8+t) [a] - 1],
l og E(8) et
=
-at +
d
t)[D(8
+t)[a] - 1]) ,
+ v)(B([a] - 1)dv f
=
log h(s + t; a) - log h(s; a),
where
atetk•(a) h(t; a) = exp I f t3(v)(kv)[a) - 1)dv -
o
h(t; a)
Thus E(8)east = h(s + t; a) = h(s; a) et,.* (a) h(s; a) h(s + t; a)
Corollary 6.4 For each 0 such that the integrals in the definition of h(t ; 0) exist and are finite, h(s + t; B) eoSt -t,c* (e) {Le,t}t>o = h(s; 9)
Lo
is a P ( 8)-martingale with mean one. Proof In the Markov additive sense of (6.4), we can write
Jt;9) east-t,t.(e) Let = h( h(Jo; 0) P(8)-a.s. so that obviously {Lo,t} is a multiplicative functional for the Markov process { (Jt, St)} . According to Remark 11.2.6 , it then suffices to note that E(8)Le,t = 1 by Theorem 6.3. ❑ Remark 6.5 The formula for h(s) = h(s; a) as well as the fact that rc = k` (a) is the correct exponential growth rate of Eeast can be derived via Remark 11.5.9 as follows. With g the infinitesimal generator of {Xt} = {(Jt, St)} and
CHAPTER VI. MARKOVIAN ENVIRONMENT
180
ha(s,y) = eayh(s), the requirement is cha(i,0) = Kh(s). However, as above E (s) ha(Jdt, Sdt) = h(s + dt) e-adt (1 -,(3(s)dt) +,3(s)dt • B(s)[a]h(s) = gha(s, 0) =
h(s) + dt {-ah(s) -,3(s)h(s) + h'(s) +,3(s)ks)[a]h(s)} -ah(s) -13(s)h(s) + h'(s) +,3(s)B(s) [a]h(s).
Equating this to rch (s) and dividing by h(s) yields h(s ) = h(s)
=
a + ,6 ( s ) exp { -
0( s )&s) [a] + tc ,
J s [,3(v)( Bi"i [a] - 1) - a - tc] dv}
(normalizing by h(0) = 1). That rc = is*(a) then follows by noting that h(1) _ ❑ h(0) by periodicity. For each 0 satisfying the conditions of Corollary 6.4, it follows by Theorem II.2.5 that we can define a new Markov process {(Jt, St)} with governing probability measures Fes), say, such that for any s and T < oo, the restrictions of Plsi and Pest to Ft are equivalent with likelihood ratio Le,T. Proposition 6.6 The P(s), 0 < s < 1, correspond to a new periodic risk model with parameters
ex ,60(t) = a(t)B(t)[0], Bet)(dx) = ^ B(t ) (dx). Proof (i) Check that m.g.f. of St is as for the asserted periodic risk model, cf. Proposition 6.3; (ii) use Markov-modulated approximations (Section 6c); ( iii) use approximations with piecewiese constant /3(s), B(s); (iv) finally, see [44] for 11 a formal proof. Now define 'y as the positive solution of the Lundberg equation for the averaged model. That is, -y solves n* (-y) = 0. When a = y, we put for short h(s) = h(s;'y). A further important constant is the value -yo (located in (0, ry)) at which n* (a) attains its minimum. That is, -yo is determined by 0 = k* (70) = QB*, [70] - 1. Lemma 6 .7 When a > -yo, P(s) (T(u) < oo) = 1 for all u > 0.
6. RISK THEORY IN A PERIODIC ENVIRONMENT
181
Proof According to (6.2), the mean number of claims per unit time is p«
dv J ' xe«xB (°) (dx) Jo 1,6(v) ✓✓ o r^ xe«xB'(dx) = Q'B' [ a] = ^' J 0
=
= ^c"'(a) + 1, ❑
which is > 1 by convexity.
The relevant likelihood ratio representation of the ruin probabilities now follows immediately from Corollary 11.2.4. Here and in the following, ^(u) = ST(u) - u is the overshoot and 9(u) = (T(u) + s) mod 1 the season at the time of ruin.
Corollary 6.8 The ruin probabilities can be computed as (u)+T(u)k'(a)
^/i(8) (u, T)
= h(s; a) e-«uE(8 ) e «^ ; T(u) < (6.7) h(B(u); a) TI
0(')(u) = h(s; a)e-«uE (a
h(9(u); a) a > ry0 (6.8)
iP(s) (u) = h( s)e-7uE(` ) h(O(u))
(6.9)
To obtain the Cramer-Lundberg approximation from Corollary 3.1, we need the following auxiliary result . The proof involves machinery from the ergodic theory of Markov chains on a general state space, which is not used elsewhere in the book, and we refer to [44]. Lemma 6 .9 Assume that there exist open intervals I C [0, 1), J C R+ such that the B(8), s E I, have components with densities b(8)(x) satisfying
inf sEI, xEJ
0 (s)b(8)(x) > 0.
(6.10)
Then for each a, the Markov process {(^(u),9(u))} u>0, considered with governing probability measures { E(8) }E[ , has a unique stationary distribution, say s0,1)
the distribution of (l: (oo), B(oo)), and no matter what is the initial season s, Wu), 0(u)) -* (b(oo), e(cc)) Letting u --> oo in (6.9) and noting that weak convergence entails convergence of E f (^(u), 9(u)) for any bounded continuous function (e.g. f (x, q) = e-ryx/h(q)), we get:
CHAPTER VI. MARKOVIAN ENVIRONMENT
182
Theorem 6.10 Under the condition (6.10) of Lemma 3.1, Vi(8) (u) - Ch(s)e-ry", where
u -+ oo, (6.11)
e- -W- ) C = E1 h(B(oo))
Note that ( 6.11) gives an interpretation of h(s ) as a measure of how the risks of different initial seasons s vary. For our basic Example 6 . 1, elementary calculus yields h(s) = exp
{ A C 2^
cos 2irs -
4^ sin 21rs + 11 cos 41rs - 16,ir) }
Plots of h for different values of A are given in Fig. 6.1, illustrating that the effect of seasonality increases with A.
A=1/4 A=1
A=4
0
Figure 6.1 In contrast to h, it does not seem within the range of our methods to compute C explicitly, which may provide one among many motivations for the Markovmodulated approximation procedure to be considered in Section 6c. Among other things, this provides an algorithm for computing C as a limit. At this stage , Theorem 6 . 10 shows that certainly ry is the correct Lundberg exponent. Noting that ^(u) > 0 in ( 6.9), we obtain immediately the following version of Lundberg ' s inequality which is a direct parallel of the result given in Corollary 3.6 for the Markov-modulated model: Theorem 6 . 11 7/'O (u) < C+°)h(s) e-ry", where C(o) = 1
+ info < t
6. RISK THEORY IN A PERIODIC ENVIRONMENT
183
Thus, e.g., in our basic example with A = 1, we obtain Co) = 1.42 so that tp(8) (u) < 1.42 • exp {J_ 1 cos 27rs - 47r sin 27rs + 167r cos 47rs - 167r I Cu. (6.12) As for the Markovian environment model, Lundberg's inequality can be con-
siderably sharpened and extended. We state the results below; the proofs are basically the same as in Section 3 and we refer to [44] for details. Consider first the time-dependent version of Lundberg's inequality. Just as in IV.4, we substitute T = yu in 0(u, T) and replace the Lundberg exponent ry by ryy = ay - yr. (ay), where ay is the unique solution of
(6.13)
W(ay) =y•
Elementary convexity arguments show that we always have ryy > -Y and ay > ry, r.(ay) > 0 when y < 1/ic' (7), whereas ay < -y, #c( ay) < 0 when y > 1/tc'('y). Theorem 6 .12 Let 00)(y)
1 Then info < t
,(8) (u, yu) 000 (u) - 0(8) (u+ yu)
< C+)(y)h(s)
e-7yu,
(6.15)
The next result improves upon the constant C+) in front of e-ryu in Theorem 6.11 as well as it supplements with a lower bound. Theorem 6.13 Let = 1 B(t) C o
(6.16
Then for all $ E [0, 1 ) and all u > 0, C_h(s)e-7u < V,(s)(u) < C+h(s)e-7".
(6.17)
In order to apply Theorem 6.13 to our basic example, we first note that the function u{w • 3e-3x + ( 1 - w) .7e - 7x j dx
fu° ex-u {w • 3e - 3x + (1 - w ) • 7e
_7x }
_ 6w + 6(1 - w)e-4u dx 9w + 7(1 - w)e-4u
184
CHAPTER VI. MARKOVIAN ENVIRONMENT
attains its minimum 2 /3 for u = oo and its maximum 6 /(7 + 2w) for u = 0. Thus C_ = 2 inf ex cos 2irs - 1 sin 27rs + 1 cos 47rs - 9 3 0<8<1 p 27r 47r 167r 161r
2 0.013,\ _ _e3 C+ = sup
6 exp { -A (- cos 27rs - -L sin 27rs + 1 I cos 47rs - 19 }
0 <8<1
8 + cos 21rs
Thus e.g. for A = 1 (where 3 e-0.013,\ = 0 .66, C+ = 1.20), 1/i18 1 s (u) > 0.66. exp 2^ cos 21rs - 4^ sin 2irs + 16^ cos 41rs - 16- I e-u, ,181 s(u) <
1.20 •exp { 2n cos 27rs - 1 sin 2irs + 16_ cos 47rs - 19
I
e-u.
Finally, we have the following result: Theorem 6 . 14 Let C+('yo) be as in (6.16) with 'y replaced by -yo and h(t) by h(t; -yo), and let 8 = er' (Y0). Then -7oudT . 0 <'p(8)(u ) -,(8)(u,T) < C+('Yo)h( s;'Yo)e
(6.18)
Notes and references The material is from Asmussen & Rolski [44]. Some of the present proofs are more elementary by avoiding the general point process machinery of [44], but thereby also slightly longer.
6c Markov-modulated approximations A periodic risk model may be seen as a varying environment model, where the environment at time t is (s + t) mod 1 E [0, 1), with s the initial season. Of course, such a deterministic periodic environment may be seen as a special case of a Markovian one (allowing a continuous state space E = [0, 1) for the environment), and in fact, much of the analysis of the preceding section is modelled after the techniques developed in the preceding sections for the case of a finite E. This observation motivates to look for a more formal connection between the periodic model and the one evolving in a finite Markovian environment. The idea is basically to approximate the (deterministic) continuous clock by a discrete (random) Markovian one with n 'months'. Thus, the nth Markovian environmental process {Jt} moves cyclically on {1, ... , n}, completing a cycle
7. DUAL QUEUEING MODELS
185
within one unit of time on the average , so that the intensity matrix is A(n) given by -n n 0 ••• 0 0 -n n ••• 0
(6.19)
A(n) _
n
0 0
••• -n
Arrivals occur at rate /3ni and their claim sizes are distributed according to Bni if the governing Markov process is in state i. We want to choose the /3ni and Bni in order to achieve good convergence to the periodic model. To this end, one simple choice is Oni = 0(
i - 1 ((i 1)/n) ) and Bni = B ,
(6.20)
but others are also possible. We let {Stn)}
be the claim surplus process of t>o the nth approximating Markov-modulated model, M(n) = Supt>o Stn), and the ruin probability corresponding to the initial state i of the environment is then Y'yn)(t) = F (M(n) > t),
(6.21)
which serves as an approximation to 0(1)(u) whenever n is large and i/n s. Notes and references
See Rolski [306].
7 Dual queueing models The essence of the results of the present section is that the ruin probabilities i/ (u), z/'i (u, T) can be expressed in a simple way in terms of the waiting time probabilities of a queueing system with the input being the time-reversed input of the risk process. This queue is commonly denoted as the Markov-modulated M/G/1 queue and has received considerable attention in the last decade. Thus, since the settings are equivalent from a mathematical point of view, it is desirable to have formulas permitting freely to translate from one setting into the other. Let 0j, Bi, A be the parameters defining the risk process in a random environment and consider a queueing system governed by a Markov process {Jt } ('Markov-modulated') as follows: • The intensity matrix for {Jt } is the time-reversed intensity matrix At _ A Aii'r?/7ri())i,jEE of the risk process, AE= • The arrival intensity is /3i when Jt = i;
CHAPTER VI. MARKOVIAN ENVIRONMENT
186
• Customers arriving when Jt = i have service time distribution Bi; • The queueing discipline is FIFO. The actual waiting time process 1W-1.=1 , 2 .... and the virtual waiting time (workload) process {Vt}too are defined exactly as for the renewal model in Chapter V. Proposition 7.1 Assume V0 = 0. Then (VT > u, JJ = i). (7.1) Pi(T(u) < T, JT = j) = LjPj 7ri In particular,
,0i (u , T) =
1 P.n(VT > u, JT =
7ri
i)
= 'P. (VT > u I JT = 2), (7.2)
Oi(u) = -1- P(V > u, J* = i) = P.T(V > u I J* = i), (7.3) 7ri
where (V, J*) is the steady-state limit of (Vt, Jt ). Proof Consider stationary versions of {Jt}o
and (7.1) follows. For (7.2), just sum (7.1) over j, and for (7.3), let T - oo in ❑ (7.2) and use that limF (VT > u, JT = i) = P(V > u, J* = i) for all j. Now let In denote the environment when customer n arrives and I* the steady-state limit. Proposition 7.2 The relation between the steady-state distributions of the actual and the virtual waiting time distribution is given by F(W > u, I* )3i P(V > u, J* = i), (7.4) where 0* = >jEE 7rj/3j. In particular, P(W > u, I* = i).
ii (u) = it /3
7. DUAL QUEUEING MODELS
187
Proof Identifying the distribution of (W, I*) with the time-average , we have N
1: I(W, >u,I,,=i) a4. P(W >u,I *=i), N -* oo. n=1
However, if T is large, on average 0*T customers arrive in [0, T], and of these, on average /32TP(V > u, J* = i) see W > u, I* = i. Taking the ratio yields (7.4), and (7.5) follows from (7.4) and (7.3).
❑
Notes and references One of the earliest papers drawing attention to the Markovmodulated M/G/1 queue is Burman & Smith [84]. The first comprehensive solution of the waiting time problem is Regterschot & de Smit [301], a paper relying heavily on classical complex plane methods. A more probabilistic treatment was given by Asmussen [17], and further references (to which we add Prabhu & Zhu [296]) can be found there. Proposition 7.1 is from Asmussen [16], with (7.3) improving somewhat upon (2.7) of that paper. The relation (7.4) can be found in Regterschot & de Smit [301]; a general formalism allowing this type of conclusion is 'conditional PASTA', see Regterschot & van Doorn [123]. In the setting of the periodic model of Section 6, the dual queueing model is a periodic M/G/1 queue with arrival rate 0(-t) and service time distribution B(-') at time t of the year (assuming w.l.o.g. that /3(t), B(t) have been periodically extended to negative t). With {Vt} denoting the workload process of the periodic queue, p < 1 then ensures that V(*) = limN-loo VN+9 exists in distribution, and one has PI'>(rr(u) < T) = P(-'_T)(VT > u),
(7.6)
P(- T)(T(u)
(7.7)
P(1-')(r(u) < oo) = P(')(00) > u).
(7.8)
For treatments of periodic M/G/1 queue, see in particular Harrison & Lemoine [186], Lemoine [242], [243], and Rolski [306].
This page is intentionally left blank
Chapter VII
Premiums depending on the current reserve 1 Introduction We assume as in Chapter III that the claim arrival process {Nt} is Poisson with rate ,6, and that the claim sizes U1, U2, ... are i.i.d. with common distribution B and independent of {Nt}. Thus, the aggregate claims in [0, t] are Nt
At = Ui
(1.1)
(other terms are accumulated claims or total claims). However , the premium charged is assumed to depend upon the current reserve Rt so that the premium rate is p(r) when Rt = r. Thus in between jumps, {Rt} moves according to the differential equation R = p(R), and the evolution of the reserve may be described by the equation
Rt = u - At + p(R8) ds. Zt
(1.2)
As earlier, z/i(u) = F IinffRt< 0IRo=u 1
tk(u,T) = FloinfTRt< OIRo=u1
denote the ruin probabilities with/initial reserve u and infinite, resp . finite horizon, and T(u) = inf {t > 0 : Rt < u} is the time to ruin starting from Ro = u so that '(u) = F(T(u) < oo), i&(u,T) = F(T(u) < T). 189
190
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
The following examples provide some main motivation for studying the model: Example 1 .1 Assume that the company reduces the premium rate from pi to p2 when the reserve comes above some critical value v. That is, pi > p2 and p(r) =
One reason could be competition, where one would try to attract new customers as soon as the business has become reasonably safe. Another could be the payout of dividends: here the premium paid by the policy holders is the same for all r, but when the reserve comes above v, dividends are paid out at rate pi - p2. Example 1.2 (INTEREST) If the company charges a constant premium rate p ❑ but invests its money at interest rate e, we get p(r) = p + er. Example 1.3 (ABSOLUTE RUIN) Consider the same situation as in Example 1.2, but assume now that the company borrows the deficit in the bank when the reserve goes negative, say at interest rate b. Thus at deficit x > 0 (meaning Rt = -x), the payout rate of interest is Sx and absolute ruin occurs when this exceeds the premium inflow p, i.e. when x > p/S, rather than when the reserve itself becomes negative. In this situation, we can put Rt = Rt + p/S, P(r) _ p + e(r - p/S) r > p/S p-5(p/5-r) 0
❑
Now return to the general model. Proposition 1.4 Either i,i(u) = 1 for all u, or o(u) < 1 for all u. Proof Obviously '(u) < ilb(v) when u > v. Assume 0(u) < 1 for some u. If Ro = v < u, there is positive probability, say e, that {Rt} will reach level u before the first claim arrives. Hence in terms of survival probabilities, 1 - Vi(v) ❑ > e(1 - '(u)) > 0 so that V'(v) < 1. A basic question is thus which premium rules p(r) ensure that 'O(u) < 1. No tractable necessary and sufficient condition is known in complete generality of the model. However, it seems reasonable to assume monotonicity (p(r) is
191
1. INTRODUCTION
decreasing in Example 1.1 and increasing in Example 1.2) for r sufficiently large so that p(oo) = limr.+ p(r) exists. This is basically covered by the following result (but note that the case p(r) .I3IB requires a more detailed analysis and that µB < oo is not always necessary for O(u) < 1 when p(r) -4 oo, cf. [APQ] pp. 296-297): Theorem 1.5 (a) If p(r) < /.3µB for all sufficiently large r, then ?(u) = 1 for all u; (b) If p(r) > /3µB + e for all sufficiently large r and some e > 0, then l/i(u) < 1 for all u, and P(Rt -+ oo) > 0. Proof This follows by a simple comparison with the compound Poisson model. Let Op(u) refer to the compound Poisson model with the same 0, B and (constant) premium rate p. In case (a), let uo be chosen such that p(r) < p = /3µB for r > uo. Starting from Ro = uo, the probability that Rt < uo for some t is at least tp(0) = 1 (cf. Proposition I1I.1.2(d)), hence Rt < uo also for a whole sequence of is converging to oo. However, obviously infu
(1.5)
and the process {Vt} has a proper limit in distribution , say V, if and only if V)(u) < 1 for all u. Then
0(u) = P(V > u).
(1.6)
192
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
In order to make Theorem 1.6 applicable, we thus need to look more into the stationary distribution G, say, for the storage process {Vt}. It is intuitively obvious and not too hard to prove that G is a mixture of two components, one having an atom at 0 of size 'yo, say, and the other being given by a density g(x) on (0, oo). It follows in particular that 0(u) = fg(Y)dy. (1.7) Proposition 1.7 p(x)g(x) = -tofB (x) + a f
(x - y)g(y) dy.
(1.8)
Proof In stationarity, the flow of mass from [0, x] to (x, oo) must be the same as the flow the other way. In view of the path structure of {V t }, this means that the rate of upcrossings of level x must be the same as the rate of downcrossings. Now obviously, the l.h.s. of (1.8) is the rate of downcrossings (the event of an arrival in [t, t + dt] can be neglected so that a path of {Vt} corresponds to a downcrossing in [t, t + dt] if and only if Vt E [x, x + p(x)dt]). An attempt of an upcrossing occurs as result of an arrival, say when {Vt} is in state y, and is succesful if the jump size is larger than x - y. Considering the cases y = 0 and 0 < y < x separately, we arrive at the desired interpretation of the r.h.s. of (1.8) as the rate of upcrossings. ❑ Define
^x 1 w(x) Jo p(t) dt.
Then w(x) is the time it takes for the reserve to reach level x provided it starts with Ro = 0 and no claims arrive. Note that it may happen that w (x) = oo for all x > 0, say if p(r) goes to 0 at rate 1 /r or faster as r j 0. Corollary 1.8 Assume that B is exponential with rate b, B(x) = e- 6x and that w(x) < oo for all x > 0. Then the ruin probability is tp (u) = f' g(y)dy, where g(x) = p( ^ ^ exp {,Qw(x) - Sx}, yo
1 + oo Q exp {,6w(x) - Sx} dx. Jo AX) (1.9)
Proof We may rewrite (1.8) as g(x) = p 1
{yo13e_6x +, Oe-ax f x e'Yg (y) dy } = p) e-axa(x)
193
1. INTRODUCTION where c(x) = 1o + fo elyg(y) dy so that (x) = eaxg(x) _
1 p(x)
nkx).
Thus log rc(x) = log rc(0) + Jo X L dt = log rc(0) + /3w(x), p(t) c(x) = rc (0)em"lxl = Yoes"lxl, g(x) = e-axK' (x) = e-6x ,Yo)3w'(x)e'6"lxl which is the same as the expression in (1.9). That 'Yo has the asserted value is ❑ a consequence of 1 = I I G I I = yo + f g• Remark 1.9 The exponential case in Corollary 1.8 is the only one in which explicit formulas are known (or almost so; see further the notes to Section 2), and thus it becomes important to develop algorithms for computing the ruin probabilities. We next outline one possible approach based upon the integral equation (1.8) (another one is based upon numerical solution of a system of differential equations which can be derived under phase-type assumptions, see further VIII.7). A Volterra integral equation has the general form x g(x) = h(x) + f K(x, y)9(y) dy, 0
(1.10)
where g(x) is an unknown function (x > 0), h(x) is known and K(x,y) is a suitable kernel. Dividing (1.8) by p(x) and letting K(x, y) _ ,QB(x - y) _ 'YoIB(x) p(x) , h(x) p(x) we see that for fixed -to, the function g(x) in (1.8) satisfies (1.10). For the purpose of explicit computation of g(x) (and thereby -%(u)), the general theory of Volterra equations does not seem to lead beyond the exponential case already treated in Corollary 1.8. However, one might try instead a numerical solution. We consider the simplest possible approach based upon the most basic numerical integration procedure, the trapezoidal rule dx = 2hfxN() [f ( xo) + 2f (xi) + 2f ( x2) + ... + 2f (XN-1) + f (xN)1 p
194
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
where xk = x0 + kh. Fixing h > 0, letting x0 = 0 (i.e. xk = kh) and writing 9k = 9(xk ), Kk,e = K(xk, xe), this leads to h 9N = hN + 2 {KN,09o+KN,N9N}+h{KN,191+'''+KN,N-19N-1},
i.e.
hN+ ZKN ,ogo +h{KN,lgl+•••+KN,N-19N-1} 1 - ZKNN
9 N=
(
1.11
)
In the case of (1.8), the unknown yo is involved. However, (1.11) is easily seen to be linear in yo. One therefore first makes a trial solution g*(x) corresponding to yo = 1, i.e. h(x) = h*(x) = (3B(x)/p(x), and computes f o' g*(x)dx numerically (by truncation and using the gk). Then g(x) = yog*(x), and IIGII = 1 then yields f 00 g*(x)dx (1.12) 1= 1+ 'Yo from which yo and hence g(x) and z/'(u) can be computed. ❑
la Two-step premium functions We now assume the premium function to be constant in two levels as in Example 1.1, p(r) _ J 1'1 r < v P2 r > v.
(1.13)
We may think of the risk reserve process Rt as pieced together of two risk reserve processes R' and Rt with constant premiums p1, P2, such that Rt coincide with Rt under level v and with above level v. For an example of a sample path, Rt see Fig. 1.1.
Rt
V
Figure 1.1
195
1. INTRODUCTION
Proposition 1.10 Let V)' (u) denote the ruin probability of {Rt}, define a = inf It > 0 : Rt < v}, let pi ( u) be the probability of ruin between a and the next upcrossing of v (including ruin possibly at a), and let q(u) = 1 - V" (u)
0 < u < v. (1.14)
Then 0
1 - q(u) + q ( u)z,b(v) p1(v)
u
=
v
1 + pi (v ) - '02 (0) pi (u) + (0, (u - v) - pi (u)) z/i(v )
v < u < oo.
Proof Let w = inf{ t > 0 1 Rt= v or Rt < 0} and let Q1 (u) = Pu(RC,, = v) be the probability of upcrossing level v before ruin given the process starts at u < v. If we for a moment consider the process under level v, Rt , only, we get Vil (u ) = 1 - q, (u ) + g1(u),O1( v). Solving for ql (u), it follows that q1 (u) = q(u). With this interpretation of q(u) is follows that if u < v then the probability of ruin will be the sum of the probability of being ruined before upcrossing v, 1 - q(u), and the probability of ruin given we hit v first , q(u)z'(v). Similarly, if u > v then the probability of ruin is the sum of being ruined between a and the next upcrossing of v which is pl (u), and the probability of ruin given the process hits v before (- oo, 0) again after a, (Pu(a < oo ) - p1(u))''(v) = (Vi2(u - v) - p1 (u))''(v)• This yields the expression for u > v, and the one for u = v then immediately follows. ❑ Example 1 .11 Assume that B is exponential, B(x) = e-62. Then 01 (u)
_
0 e -.yiu ,,2 (u) = )3 e -72u p1S P2S
where ry; = S - ,Q/p;, so that
q
-
1 - ~ e-ry1u p1S 1 - Q e-ryly P1S
Furthermore , for u > v P(a < oo ) = 02(u - v) and the conditional distribution of v - Ro given a < oo is exponential with rate S . If v - Ro < 0, ruin occurs at time a . If v - R, = x E [0, v], the probability of ruin before the next upcrossing of v is 1 - q(v - x). Hence
196
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
( pi(u) _ 02 ( u - v){ a-av + J (1 - q(v - x))be-dxdx 0 I 1- a e- 7i(v -x)
P2,ee- 7z(u-v)
1
_
P16 0 1 - a e-7iv P16
1 - e -6V Qbe-72(u-v) P2
Se-6xdx
a
e -71v (e(71 -6)v - 1)
1 - p1(71 - b)
1
p2be- 7z(u-v) 1 _
-
Ie-71v P16
1 - e-71v a
1 - -e -7iv P '6
0 Also for general phase-type distributions, all quantities in Proposition 1.10 can be found explicitly, see VIII.7. Notes and references Some early references drawing attention to the model are Dawidson [100] and Segerdahl [332]. For the absolute ruin problem, see Gerber [155] and Dassios & Embrechts [98]. Equation (1.6) was derived by Harrison & Resnick [186] by a different approach, whereas (1.5) is from Asmussen & Schock Petersen [50]; see further the notes to II.3. One would think that it should be possible to derive the representations (1.7), (1.8) of the ruin probabilities without reference to storage processes. No such direct derivation is, however, known to the author. For some explicit solutions beyond Corollary 1.8, see the notes to Section 2 Remark 1.9 is based upon Schock Petersen [288]; for complexity- and accuracy aspects, see the Notes to VIII.7. Extensive discussion of the numerical solution of Volterra equations can be found in Baker [57]; see also Jagerman [209], [210].
2 The model with interest In this section, we assume that p(x) = p + Ex. This example is of particular application relevance because of the interpretation of f as interest rate. However, it also turns out to have nice mathematical features.
197
2. THE MODEL WITH INTEREST
A basic tool is a representation of the ruin probability in terms of a discounted stochastic integral (2.1)
Z = - f e-EtdSt 0
w.r.t. the claim surplus process St = At - pt = EN` U; - pt of the associated compound Poisson model without interest . Write Rt") when Ro = u. We first note that: Proposition 2.1 Rt") = eetu + Rt°) Proof The result is obvious if one thinks in economic terms and represents the reserve at time t as the initial reserve u with added interest plus the gains/deficit from the claims and incoming premiums. For a more formal mathematical proof, note that
dR(u) = p + eR(u) - dAt, d [R(") - eetu] = p + e [R(u) - eEtu] - dAt . Since R( ;u) - eE'0u = 0 for all u, Rt") - eEtu must therefore be independent of u which yields the result. 0 Let
Zt = e-etR(0) = e-et (ft (p + eR(°)) ds - At I Then dZt
= e -Et
(_edt
f t (p + eR°) ds + (p + eR°)) dt + e dt A- dA
= e_et (pdt - dAt) = -e-EtdSt. / Thus v Z,, = - e-etdSt,
0 where the last integral exists pathwise because {St} is of locally bounded variation. Proposition 2.2 The r.v. Z in (2.1) is well-defined and finite, with distribution H(z) = P(Z < z) given by the m.g.f.
H[a] = Ee" = exp where k(a) _
(-ae-Et) dt} = exp {f °° k
13(B[a] - 1) - pa. Further Zt a ' Z
k
{fa
as t --+ oo.
(-y) dy}
198
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
Proof Let Mt =At -tAUB. Then St = Mt+t(/3pB-p) and {M„} is a martingale. e-EtdMt} From this it follows immediately that {fo is again a martingale. The mean is 0 and (since Var(dMt) = /3PB2)dt)
Var (
Z
'
e-'tdMt )
/' v
(2)
J e- eft/3p(B)dt = a2B (1 - e-2ev). o
Hence the limit as v -3 oo exists by the convergence theorem for L2-bounded martingales, and we have
v
v -
Zv =
e-EtdSt = -f e-t(dMt + (,3pB - p)dt)
J
0 - f0"
e-Et
a' 0 - f 0
oo
o o
(dMt + (3p$ -
p)dt)
e-EtdSt = Z.
Now if X1i X2, ... are i.i.d. with c.g.f. 0 and p < 1, we obtain the c .g.f. of E0° p'Xn at c as 00
00
log E fl ea°n X„ n=1
00
= log 11 e0(av ") _ n=1
E 0(apn). n=1
Letting p = e-Eh, Xn = Snh - S( n+1)h, we have q5(a) = hic(- a), and obtain the c.g.f. of Z = - f0,30 e-'tdSt as 00 00 00 lim E 0(apn ) = li h E rc(-ae -Fnh) = f tc (-ae-t) dt; n=1
1
n=1
0
the last expression for H[a] follows by the substitution y = ae-Et Theorem 2.3 z/'(u) =
H(-u) E [H(-RT(u)) I r(u) < oo] .
Proof Write r = r(u) for brevity. On {r < oo }, we have
u + Z =
(u + Zr ) + ( Z - Zr) = e
-
ET {e
(u + Zr) - f '* e-E(t-T )dSt] T
e-
ET [
R( u)
+ Z`],
❑
199
2. THE MODEL WITH INTEREST
where Z* = - K* e-E(t-T)dSt is independent of F, and distributed as Z. The last equality followed from Rt") = eEt(Zt + u), cf. Proposition 2.1, which also yields r < oo on {Z < -u}. Hence H(-u)
= P(u + Z < 0) = P(RT + Z* < 0; r < oo) zb(u)E [P(RT + Z* < 0 I)7T, r < oo)] _ O(u)E [H(-RT(")) I r(u) < oo] .
Corollary 2.4 Assume that B is exponential, B(x) = e-6', and that p(x) _ p + Ex with p > 0. Then o€Q/E -Ir, (8(p + cu);.
V) (u)
1\ E E
aA/Epal Ee -6n1 E +^3E1 / E
1r
Cbp;
E El al
where 1'(x; i) = f 2°° tn-le-tdt is the incomplete Gamma function. Proof 1 We use Corollary 1.8 and get
w(x) fo P + Etdt = g(x) = p +0x
e log(p + Ex) - e loge,
exp { - log(p + Ex) - - log p - 6x }
pal(p ryo)3 + ex)plE-1e-6^ J 70 = 1 + J p) exp {Ow(x) - Sx} dx x r^ = 1+ + ' /E (p + Ex)01'-le-ax dx 0
fJ
= 1+
a Epo/ E
f yI/ E- 1e- 6(Y -P)/E dy P
1+ OEA/E- 1e6 P /Er 60/e po/ e
(
,;,3 ) E E
lp(u) = -to foo a exp {w(x) - bx} AX) acO/E" 1 ePE l
Yo
50 1epolE
(
+ cu); 0)
5(p
E E
200
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS ❑
from which (2.2) follows by elementary algebra. Proof 2 We use Theorem 2.3. From ic(a) = ,3a/ (5 - a) - pa, it follows that = f 1 c(-y)dy = 1 f '(p-a/(a +y))dy f 0 0 Ey
logH[a]
R/E 1 [pa + )3log 8 - /3 log(b + a)] = log ePa/f (a + a )
e which shows that Z is distributed as p/E - V, where V is Gamma(b, 13/E), i.e. with density x(3/e-1aQ/e fV (x) _
e
-6X
' x > 0.
r (j3/E)
In particular, H(-u)
=
P(Z
r < -u) = P(V
>
u + p/E) =
(8(p + Eu)/E; ,13 /E) r (,3/E)
By the memoryless property of the exponential distribution, -RT(u) has an exponential distribution with rate (S) and hence E [H(-RT(u))I r(u) < oo]
L Pe-6'r (P/C - V < x)]0 + f P(V > p/E ) +
/' P/ ' (p/
-
e-by fv (p/E - x) dx
x)p/e -150/f
e- b P/E dx
I' (/3/E) (6P1'E;01'E) + (p/E)al aO l fe-bP/E } IF (0 /0 jF
From this (2.2) follows by elementary algebra.
/^
❑
Example 2 .5 The analysis leading to Theorem 2.3 is also valid if {Rt} is obtained by adding interest to a more general process {Wt} with stationary independent increments. As an example, assume that {Wt} is Brownian motion with drift µ and variance v2; then {Rt} is the diffusion with drift function p+Ex and constant variance a2. The process {St} corresponds to {-Wt} so that c(a) or2a2/2 - pa, and the c.g.f. of Z is IogH[a]
= f ytc(-y)dy = e fa (0,2y +µ ) dy
3. THE LOCAL ADJUSTMENT COEFFICIENT
201
_ Q2a2 pa 4e E
I.e., Z is normal (p/E, Q2/2E), and since RT = 0 by the continuity of Brownian motion, it follows that the ruin probability is
Cu)
H(-u) H(0)
11 Notes and references Theorem 2.3 is from Harrison [185]; for a martingale proof, se e.g. Gerber [157] p. 134 (the time scale there is discrete but the argument is easily adapted to the continuous case). Corollary 2.4 is classical. The formula (2.3) was derived by Emanuel et at. [129] and Harrison [185]; it is also used as basis for a diffusion approximation by these authors. Paulsen & Gjessing [286] found some remarkable explicit formulas for 0(u) beyond the exponential case in Corollary 1.8. The solution is in terms of Bessel functions for an Erlang(2) B and in terms of confluent hypergeometric functions for a H2 B (a mixture of two exponentials). It must be noted, however, that the analysis does not seem to carry over to general phase-type distributions, not even Erlang(3) or H3, or to non-linear premium rules p(•). A r.v. of the form Ei° p"X" with the X„ i.i.d. as in the proof of Proposition 2.2 is a special case of a perpetuity; see e.g. Goldie & Griibel [167]. Further studies of the model with interest can be found in Boogaert & Crijns [71], Gerber [155], Delbaen & Haezendonck [104], Emanuel et at. [129], Paulsen [281], [282], [283], Paulsen & Gjessing [286] and Sundt & Teugels [356], [357]. Some of these references also go into a stochastic interest rate.
3 The local adjustment coefficient. Logarithmic asymptotics For the classical risk model with constant premium rule p(x) - p*, write y* for the solution of the Lundberg equation
f3(B[ry *] - 1) - -Y*p*
= 0,
write Vi* (u) for the ruin probability etc., and recall Lundberg 's inequality W*(u) < e-ry*u
202
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
and the Cramer-Lundberg approximation (3.3)
V,*(u) - C*e--f*".
When trying to extend these results to the model of this chapter where p(x) depends on x, a first step is the following: Theorem 3 .1 Assume that for some 0 < 5o < oo, it holds that f3[s] T oo, log ?i(u) < < 00 -JO . If 60 s f 6o, and that p(x) -* oo, x -* oo. Then lim sup u->oo u
and e -E''p(r) -+ 0, e(1o+e)2 (x ) u -> 00.
oo for all E > 0, then
log u (u)
In the proof as well as in the remaining part of the section , we will use the local adjustment coefficient 'y(x), i.e. the function -y(x) of the reserve x obtained by for a fixed x to define -y(x) as the adjustment coefficient of the classical risk model with p* = p(x), i.e. as solution of the equation n(x,'y ( x)) = 0
where
r. (x, a) = f3(B[a] - 1) - ap(x);
(3.4)
we assume existence of -y(x) for all x, as will hold under the steepness assumption of Theorem 3.1, and (for simplicity) that
inf p(x) > (3µs ,
(3.5)
x>0
which implies inf.,>o 7(x) > 0. The intuitive idea behind introducing local adjustment coefficients is that the classical risk model with premium rate p* = p(x) serves as a 'local approximation ' at level x for the general model when the reserve is close to x. Proof of Theorem 3.1. The steepness assumption and p(x) -+ oo ensure 'y(x) -* So. Let y* < So, let p* be a in (3. 1) and for a given E > 0, choose uo such that p( x) > p* when x > u0E. When u > uo, obviously O(u) can be bounded with the probability that the Cramer -Lundberg compound Poisson model with premium rate p* downcrosses level uE starting from u , which in turn by Lundberg's inequality can be bounded by e-ry*(1-E)" Hence limsup„,.log '(u)/u < -ry*(1 - E). Letting first E -* 0 and next ry * T 5o yields the first statement of the theorem. For the last asssertion , choose c(,1 ), c(,2) such that p(x) < c(.i)eex, B(x) > C(2)e-(ao+f)x for all x. Then we have the following lower bound for the time for the reserve to go from level u to level u + v without a claim: w(u + v) - w (u) J
dt > c(3)e-eu v 1 p(u+ t)
3. THE LOCAL ADJUSTMENT COEFFICIENT
203
where c,(3) = (1 - e-a°/(ecf1)). Therefore the probability that a claim arrives before the reserve has reached level u + v is at least c(,4)e-E" Given such an arrival, ruin will occur if the claim is at least u + v, and hence '(u) > c(4)e-euc( 2)e-(do+e)u
The truth of this for all e > 0 implies lim inf log V,(u) > -so.
❑
Obviously, Theorem 3.1 only presents a first step, and in particular, the result is not very informative if bo = oo. The rest of this section deals with tail estimates involving the local adjustment coefficient. The first main result in this direction is the following version of Lundberg's inequality: Theorem 3 . 2 Assume that p(x) is a non-decreasing function of x and let I(u) = fo ry(x)dx. Then ,' (u) < e-I("). (3.6) The second main result to be derived states that the bound in Theorem 3.2 is also an approximation under appropriate conditions. The form of the result is superficially similar to the Cramer-Lundberg approximation, noting that in many cases the constant C is close to 1. However, the limit is not u -+ oo but the slow Markov walk limit in large deviations theory (see e.g. Bucklew [81]).
For e > 0, let 0e (u) be evaluated for the process only with 3 replaced by /0/e and U; by cU2.
{Rte)}
defined as in (1.2),
Theorem 3 .3 Assume that either (a) p(r) is a non -decreasing function of r, or (b) Condition 3.13 below holds. Then lim-elog l/ie (u) = I(u). (3.7) CIO
Remarks: 1. Condition 3.13 is a technical condition on the claim size distribution B, which essentially says that an overshoot r.v. UJU > x cannot have a much heavier tail than the claim U itself. 2. If p(x) = pis constant , then Rte) = CRtie for all t so that V), (u) = O(u/e), I.e., the asymptotics u -* oo and c -- 0 are the same. 3. The slow Markov walk limit is appropriate if p(x) does not vary too much compared to the given mean interarrival time 1/0 and the size U of the claims; one can then assume that e = 1 is small enough for Theorem 3.3 to be reasonably precise and use e` (u) as approximation to 0 (u).
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
204
4. One would expect the behaviour in 2) to be important for the quantitative performance of the Lundberg inequality (3.6). However, it is formally needed only for Theorem 3.3. 5. As typical in large deviations theory, the logaritmic form of (3.7) is only captures 'the main term in the exponent' but is not precise to describe the asymptotic form of O(u) in terms of ratio limit theorems (the precise asymptotics could be logI(u)e-1(U) or I(u)"e_I(u), say, rather than e-I(u)).
3a Examples Before giving the proofs of Theorems 3.2, 3.3, we consider some simple examples. First, we show how to rewrite the explicit solution for ti(u) in Corollary 1.8 in terms of I(u) when the claims are exponential: Example 3 .4 Consider again the exponential case B(x) = e-ax as in Corollary 1.8. Then y(x) = b - (3/p(x), and r
j
v(x)dx = bu -
a J0
p(x)-ldx =
Integrating by parts, we get 1 'Yo
/' oo
exp {(3w (x) - bx} dx J" AX) = 1 + J0 dodx(x) exp {,(iw(x) - bx} dx fo
= 1+
00
1 + [exp {/(3w(x) - bx}]o + b
exp low (x) - bx} dx
1+0- 1 + b f e-,(x) dx, J0
1
^oo
70 Ju
exp low(x) - bx} dx
g(x ) dx f AX)
lexp IOW (X )
bx + b
oo exp low(x) bx dx u
r oo = b
J
u
exp low (x) - bx} dx - exp {/33w(u) - bu},
3. THE LOCAL ADJUSTMENT COEFFICIENT
205
and hence
f°° e-I(v )dy - e- I ( u) fool,
ry(x
/b -I u
e- I ( v )dy
o e -f0 °° e -
+u) dxdy - 1/8 . (3.8)
fo
7(x)dxdy
1-1 We next give a direct derivations of Theorems 3.2, 3.3 in the particularly simple case of diffusions: Example 3.5 Assume that {Rt} is a diffusion on [0, oo) with drift µ(x) and variance a2 (x) > 0 at x. The appropriate definition of the local adjustment coefficient 7(x) is then as the one 2p(x)la2(x) for the locally approximating Brownian motion. It is well known that (see Theorem XI.1.10 or Karlin & Taylor [222] pp. 191-195) that
1P (U) = fu0 e-I(v)dy = e-I(u) follo e- fory(x+u)dxdy ( 3.9 ) 11000 e-I(v)dy f000 e- f y(x)dxd y
If 7(x) is increasing , applying the inequality 7(x + u) > 7(x) yields immediately the conclusion of Theorem 3.2. For Theorem 3.3, note first that the appropriate slow Markov walk assumption amounts to u, (X) = µ(x), 0,2(X) = ev2(x) so that 7e(x) = 7(x)/e, IE(u) = I(u)/e, and (3.9) yields
-e log ,0, (u) = I(u) + AE - BE, (3.10) where AE = e log
000 e- fo 7(x)dx/Edy f ,
Be = e log U000 e- fa
7(x+u)dx/Edy
o The analogue of (3.5) is infx>o 7(x) > 0 which implies that f °O ... in the definition of AE converges to 0. In particular, the integral is bounded by 1 eventually and hence lim sup AE < lim sup a log 1 = 0. Choosing yo, 70 > 0 such that 7(x) < 7o for y < yo, we get
r
00 e- fo 7(x) dx /E dy >
Yo
a-v 'yo /Edy = E (1 - e-v 0 O /E) J0 70 70
This implies lim inf A, > lime log e = 0 and AE -* 0. Similarly, BE -* 0, and (3.7) follows. ❑
206
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
The analogue of Example 3.5 for risk processes with exponential claims is as follows: Example 3 .6 Assume that B is exponential with rate S. Then the solution of the Lundberg equation is -y* = b - ,6/p* so that u
1 dx. I (U) = bu - /3 1 AX ) Note that this expression shows up also in the explicit formula for lk(u) in the form given in Example 3.4. Ignoring 1/5 in the formula there, this leads to (3.6) exactly as in Example 3.5. Further, the slow Markov walk assumption means 5E = b/c, 0, _ ,0/e. Thus 7e(x) _7(x)/e and (3.10) holds if we redefine AE as AE = flog (j °° efo 7(x)dx/edy _ E/5 I
and similarly for B. As in Example 3.5, lim sup Af < lim sup c log(1 - 0) = 0. E-+o
e-*O
By (3.5) and 7* = 5 -,Q/p*, we have 5 > 7o and get
C
lim inf AE > lime log e - 7o
15 - 1 I I > 0. -
Now (3.7) follows just as in Example
3.5.
0
We next investigate what the upper bound / approximation a-I (°) looks like in the case p(x) = a + bx (interest) subject to various forms of the tail B(x) of B. Of course, 7(x) is typically not explicit, so our approach is to determine standard functions Gl (u), . . . , G. (u) representing the first few terms in the asymptotic expansion of I(u) as u -+ oo. I.e.,
G,(u) oo,
G;+1 (u) = o( 1), I(u ) = G1(u) + ... + Gq(u) + o(G9(u))• Gi (u)
It should be noted , however , that the interchange of the slow Markov walk oo is not justified and in fact, the slow Markov limit a -* 0 and the limit u walk approximation deteriorates as x becomes large. Nevertheless , the results are suggestive in their form and much more explicit than anything else in the literature.
3. THE LOCAL ADJUSTMENT COEFFICIENT
207
Example 3 .7 Assume that B(x) - clxa-le-5x
(3.11)
with a > 0. This covers mixtures or convolutions of exponentials or, more generally, phase-type distributions (Example 1. 2.4) or gamma distributions; in the phase-type case , the typical case is a = 1 which holds , e.g., if the phase generator is irreducible ( Proposition VIII. 1.8). It follows from (3.11) that b[s] -* co as s f S and hence 7* T S as p* -+ oo. More precisely, B[s] = 1 + s exB(x)dx = 1 +c1SF(a) ('+o(')) (S - s)C' f "o
o
as s T S, and hence (3.1) leads to (S-7T N Ocp
a,
,Y ,:;C2p*
fu I(u) Su - c2
J
Su
a + bx 1/
0
(
C2 = (3clr( a))11',
)
Su
a<1 Su - c3 logu a= 1
a dx -
c4ul -1/° a > 1
❑
where c3 = c2 /b, c4 = c2b -1/'/(1 - 1/a).
Example 3 .8 Assume next that B has bounded support, say 1 is the upper limit and B(x) - cs(1 - x)n-1, x T 1,
(3.12)
with y > 1. For example, 77 = 1 if B is degenerate at 1, y = 2 if B is uniform on (0, 1) and 17 = k + 1 if B is the convolution of k uniforms on (0,1/k). Here B[s] is defined for all s and B[s] - 1 =$
e"B(x)dx = e8 f
cse8 Sn
-1
f
f
Jo s e-IB ( 1 - y/s)dy
' e-vy'7-ldy = cse8r(T7)
sn -1
as s T oc. Hence (3.1) leads to ,3cse7*I7(77) - ry*°p*, ry* loge*+ g7loglogp*, I(u) Pt; u(logu + r7loglogu).
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
208
Example 3 .9 As a case intermediate between (3.11) and (3.12), assume that B(x) CO -x2/2c7, x f oo . (3.13)
We get b[s] - 1
Cgs o"O 0
e-c78)2/2c7
esxe-x2/2c7 dx = cgsec782/2
dx
f - css 2%rc7eC782/2,
C7
- log p*, 7 * - c8
where c8 =
,
log
I (u) c8u
log u
2/c7.
0
3b Proof of Theorem 3.2 We first remark that the definition (3.4) of the local adjustment coefficient is not the only possible one: whereas the motivation for (3.4) is the formula
h
logEues ( Rh-u) ,•, ,3 (B[s] - 1) - sp(u), h 10, (3.14)
for the m .g.f. of the increment in a small time interval [0, h], one could also have considered the increment ru (T1) - u - Ul up to the first claim (here ru (•) denotes the solution of i = p (r) starting from ru(0) = u). This leads to an alternative local adjustment coefficient 7o(u) defined as solution of
1 = Ee''o(u)(vi+u - ru(TI)) - B[7o (u)] .
1
3e- Ote7o( u)(u- r^.(t))dt.
(3.15)
0
Proposition 3.10 Assume that p(x) is a non-decreasing function of x. Then: (a) -y(x) and 7o(x) are also non-decreasing functions of x; (b) 'y(x) <'Yo(x)• Proof That 7(x) is non-decreasing follows easily by inspection of (3.4). The assumption implies that ru(t) - u is a non-decreasing function of u. Hence for u
By convexity of the m . g.f. of U1 + v - r„(Ti), this is only possible if 7o(v) 2 7o(u)•
3. THE LOCAL ADJUSTMENT COEFFICIENT
209
For (b), note that the assumption implies that ru(t) - u > tp(u). Hence 1
= Ee-Yo(u)(U1+u-ru(T1)) < E,e70(u)(U1-P(u)T1)
0 + 7o(u)p(u)'
0
<_ 00['Yo( u)] - 1) - 7o (u)p(u)•
Since (3.4) considered as function of 7 is convex and 0 for -y = 0, this is only possible if -yo(u) > 7(u). ❑ We prove Theorem 3.2 in terms of 7o; the case of 7 then follows immediately by Proposition 3.10(b): Theorem 3.11 Assume that p(x) is a non-decreasing function of x. Then (u) < efo Yo(x)dx.
(3.16)
Proof Define 411(n)(u) = P('r(u) < on) as the ruin probability after at most n claims (on = TI + • • • + Tn). We shall show(' by induction that Y'(n) (u) <
e-
fo
'Yo(x)dx
(3.17)
from which the theorem follows by letting n -+ oo. The case n = 0 is clear since here To = 0 so that ik(°)(u) = 0. Assume (3.17) shown for n and let Fu(x) = P(U1 + u - ru(T1) < x). Separating after whether ruin occurs at the first claim or not, we obtain „I,(n+l) (u)
1 - Fu(u ) +
J ✓
^(n)(u - x)Fu(dx) 00
U efo J
o (y) dYF
(dx)
)+f
I
=
11
/' / 00 e f oFu
J
fu dx) + of u :7o(Y)dYFu(dx)
u
00
l`
1
Considering the cases x > 0 and x < 0 separately, it is easily seen that fu x7o(y)dy < x-yo (u). Also, fa 7o(y)dy < u7o(u) < x-yo (u) for x > u. Hence „/,(n+1) (u)
e-fo Yo(x)dxI^"Q exyo( I
u
fo -yo( x)dx j,u[70(u)] fo e-
e-
-yo(x)dx
u)Fu(dx )+ J - es'Yo(u)Fu(dx)} o0
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
210
where the last identity immediately follows from (3.15); we used also Proposition 3.10(a) for some of the inequalities. 0 It follows from Proposition 3.10(b ) that the bound provided by Theorem 3.11 is sharper than the one given by Theorem 3.2. However, yo(u) appears more difficult to evaluate than y(u). Also, for either of Theorems 3.2, 3.11 be reasonably tight something like the slow Markov walk conditions in Theorem 3.3 is required, and here it is easily seen that yo(u) ,: y(u). For these reasons, we have chosen to work with -y(u) as the fundamental local adjustment coefficient.
3c Proof of Theorem 3.3 The idea of the proof is to bound { R( f) } above and below in a small interval [x - x/n, x + x/n] by two classical risk processes with a constant p and appeal to the classical results (3.2), (3.3). To this end, define
uk,n =
ku,
P k,n
=
p(x),
sup
inf
pk n =
n uk-1,n
uk_l,n
u k}1,n AX),
and, in accordance with the notation i/iE (u), op*. (u), let Op*.;E (u) denote the ruin probability for the classical model with 0 replaced by ,3/e and U; by €U=. Lemma 3.12 lim sup4^o -f log O, (u) < I(u). Proof For ruin to occur, {RtE)} (starting from u = un,n) must first downcross un-l,n. The probability of this is at least n n;E (u/n), the probability that ruin occurs in the Cramer-Lundberg model with p* = pn,n (starting from u/n) without that 2u/n is upcrossed before ruin. Further, given downcrossing occurs, the value of {R(E)} at the time of downcrossing is < un-l,n so that n,n;E (u/ n) Y'E (un - I,n) pn niE (u /n)
n_1 n;E ( u/n) ^•e. (un-2,n)
n
>
II v ^k n;E (u/n)
k =1
Now as e . 0,
-Y*u /E, 0;.;e (u) = v';. W O ,., C*e-
where the first equality follows by an easy scaling argument and the approximation by (3.3). Let Ck,n, ryk,nbe C*, resp. y* evaluated for p* = Pk,n; in
3. THE LOCAL ADJUSTMENT COEFFICIENT
211
particular, since ry' is an increasing function of p', also sup ?'(x).
ryk,n =
uk_1,n <X
Clearly, *p•;E (u/n) -Op•;E (urn)
<
+^p•;F (2u/n),
\
, /' (u/n) -'T nk,n cE (2u/n)
*I,n;E (u/n) OP-
Ck ne-7k,nu /fn( 1 Ck
-
ne-7k,nu/en(1 +
e- 7k,nu /En) o(1)),
where o(1) refers to the limit e .. 0 with n and u fixed. It follows that n
log Ypk,,i;! (u/n)
-log V'C (u) k =1
n
n
m 7k,n - log Ck,n + 0(1),
limsup-elogv), (u) CIO
<
k=1 k=1 n u _ nE7 k,nk=1
Letting n -4 oo and using a Riemann sum approximation completes the proof. 11 Theorem 3. 3 now follows easily in case (a). Indeed , in obvious notation one has -tC (x) = y(x)/e, so that Theorem 3.2 gives 7PE (u) < e-Ii"i/f =
lim inf -Clog 0E (u) > I (u). 40
Combining with the upper bound of Lemma 3.12 completes the proof. In case (b), we need the following condition: Condition 3.13 There exists a r. v. V < oo such that (i) for any u < oo there exist Cu < oo and a (u) > supy <„ 7(x) such that P(V > x) < Cue- a( u)z; (3.18) (ii) the family of claim overshoot distributions is stochastically dominated by V, i.e. for all x , y > 0 it holds that
F(U>x
+yIU>x)
B(x + y) < F (V > y). B(x)
(3.19)
212 CHAPTER VII. RESERVE-DEPENDENT PREMIUMS To complete the proof, let v < u and define ^(E) (u, v ) = v - R<) (u v).
T(E) (u, v ) = inf { t > 0 : R(c ) < v R) = u } , Then Y'E (u)
,,,^''
(R,( . ) (u
l
<
oo]
E
=
E [OE (u/n - ^(E) (u, u/n)) ; T() (u, u/n) < oo]
=
E [OE (u/n - E(E) (u, u/n)) I T(E) (u, u/n) < oo] . P (T(E) (u, u/n) < oo)
<
EV), (u/n - eV) • P (T(E) (u, u/n) < oo) .
[
,,
u
/n)) ; T(E) (u, u /n)
=
Write EO, (u/n - EV) = El + E2, where El is the contribution from the event that the process does not reach level 2u/n before ruin and E2 is the rest. Then the standard Lundberg inequality yields (u/En - V)
El < E?;1 n;E (u/n - EV) = EiI 1 ,n <
e-ry1,nu /EnE [e71,n V; V < u/En] + P(V > u/En)
= e-71 nu/Eno(l)
(using (3.18) for the last equality). For E2, we first note that the number of downcrossings of 2u/n starting from RoE) = 2u/n is bounded by a geometric r.v. N with
EN < 1 = infx>2u/nA(x) = 0(1), infx>2u /n P(x) - ,QEU 1 - of:>2 in n(x);E (0) cf. (3.5) and the standard formula for b(0). The probability of ruin in between two downcrossings is bounded by Epp ;E (2u/n - EV) = e- 2-y 1 ' . u /en 0(i) _n
so that E2 < e-2ryl nu/En0(1),
Ei + E2 < e-71,nu/En0(1)
3. THE LOCAL ADJUSTMENT COEFFICIENT
213
Hence lim inf -e log Ali, (u) 40
lim inf -e log(Ei { +E2) + logP (r(`) (u, u/n) < oo 40 )I U nryl n+liminf-elogP (T(')(u,u/n) < oo) CI -
> u
n
n ryi n' i=1
❑
Another Riemann sum approximation completes the proof.
Notes and references With the exception of Theorem 3.1, the results are from Asmussen & Nielsen [39]; they also discuss simulation based upon 'local exponential change of measure' for which the likelihood ratio is (
/'t
/'t
Ns
Lt = exp S - J y(Rs-)dR,) = exp - J -r(Rs)p(R,)ds + -Y(R2;-)Ui } . l o JJJ o ;=1 J An approximation similar to (3.7) for ruin probabilities in the presence of an upper barrier b appears in Cottrell et al. [89], where the key mathematical tool is the deep Wentzell-Freidlin theory of slow Markov walks (see e .g. Bucklew [81]). Djehiche [122] gives an approximation for tp(u,T) = P „(info
(3.20)
(with ic(x, s) as in (3.4) and the prime meaning differentiation w.r.t. s), whereas the most probable path leading to ruin is the solution of r(x) _ -k (x,7(x))
(3.21)
(the initial condition is r(0) = u in both cases). Whereas the result of [122] is given in terms of an action integral which does not look very explicit, one can in fact arrive at the optimal path by showing that the approximation for 0(u, T) is maximized over T by taking T as the time for (3.21) to pass from u to 0; the approximation (3.7) then comes out (at least heuristically) by analytical manipulations with the action integral. Similarly, it might be possible to show that the limits e . 0 and b T 00 are interchangeable in the setting of [89]. Typically, the rigorous implementation of these ideas via large deviations techniques would require slightly stronger smoothness conditions on p(x) than ours and conditions somewhat different from Condition 3.13,
214
CHAPTER VII. RESERVE-DEPENDENT PREMIUMS
the simplest being to require b[s] to be defined for all s > 0 (thus excluding , e.g., the exponential distribution ). We should like, however, to point out as a maybe much more important fact that the present approach is far more elementary and self-contained than that using large deviations theory. For different types of applications of large deviations to ruin probabilities , see XI.3.
Chapter VIII
Matrix-analytic methods 1 Definition and basic properties of phase-type distributions Phase-type distributions are the computational vehicle of much of modern applied probability. Typically, if a problem can be solved explicitly when the relevant distributions are exponentials, then the problem may admit an algorithmic solution involving a reasonable degree of computational effort if one allows for the more general assumption of phase-type structure, and not in other cases. A proper knowledge of phase-type distributions seems therefore a must for anyone working in an applied probability area like risk theory. A distribution B on (0, oo) is said to be of phase-type if B is the distribution of the lifetime of a terminating Markov process {Jt}t>o with finitely many states and time homogeneous transition rates. More precisely, a terminating Markov process {Jt} with state space E and intensity matrix T is defined as the restriction to E of a Markov process {Jt}o
0
.
We often write p for the number of elements of E. Note that since (1.1) is 'Here as usual , P; refers to the case Jo = i; if v = (vi)iEE is a probability distribution, we write Pv for the case where Jo has distribution v so that Pv = KER viPi•
215
216
CHAPTER VIII. MATRIX-ANALYTIC METHODS
the intensity matrix of a non-terminating Markov process, the rows sum to one which in matrix notation can be rewritten as t + Te = 0 where e is the column E-vector with all components equal to one. In particular, T is a subintensity matrix2, and we have t = -Te.
(1.2)
The interpretation of the column vector t is as the exit rate vector, i.e. the ith component ti gives the intensity in state i for leaving E and going to the absorbing state A. We now say that B is of phase-type with representation (E, a, T) (or sometimes just (a,T)) if B is the Pa-distribution of the absorption time C = inf{t > 0 : it = A}, i.e. B(t) = Fa(^ < t ). Equivalently, C is the lifetime sup It > 0 : Jt E E} of {Jt}. A convenient graphical representation is the phase diagram in terms of the entrance probabilities ai, the exit rates ti and the transition rates (intensities) tij:
tj
3 aj ai i
tjk
ti tk
FkJ ak
Figure 1.1 The phase diagram of a phase-type distribution with 3 phases, E = {i, j, k}. The initial vector a is written as a row vector. Here are some important special cases: Example 1 .1 Suppose that p = 1 and write ,0 = -t11. Then a = a1 = 1, t1 = /3, and the phase-type distribution is the lifetime of a particle with constant failure rate /3, that is, an exponential distribution with rate parameter ,3. Thus the phase-type distributions with p = 1 is exactly the class of exponential distributions. 0 2this means that tii < 0, tij > 0 for i 54 j and EjEE tij < 0
1. PHASE-TYPE DISTRIBUTIONS 217 Example 1.2 The Erlang distribution EP with p phases is defined Gamma distribution with integer parameter p and density bp XP-1 -6x (p- 1)!e
Since this corresponds to a convolution of p exponential densities with the same rate S, the EP distribution may be represented by the phase diagram (p = 3)
Figure 1.2 corresponding to E = {1, . . . , p}, a = (1 0 0 ... 00))
-S s o ... 0 0 0 0 -S 6 ... 0 0 0 T=
t= 0 ••• -S S 0 0 0 0 0 0 ... 0 -S 6
Example 1.3 The hyperexponential distribution HP with p parallel channels is defined as a mixture of p exponential distributions with rates 51, ... , 6, so that the density is P E ai6ie-6,x i=1
Thus E _ -Si 0
0 -S2
0 0
... •..
0 0
0 0
0
0
0
•••
-Sp-1
0
0
0
0 •..
T
t=
and the phase diagram is (p = 2)
0
-SP
CHAPTER VIII. MATRIX-ANALYTIC METHODS
218
Figure 1.3 0 Example 1 .4 (COXIAN DISTRIBUTIONS) This class of distributions is popular in much of the applied literature, and is defined as the class of phase-type distributions with a phase diagram of the following form: 2
1
b2- t2
617 ti
yt bP- 1 tP-1
t2
1
Figure 1.4 For example, the Erlang distribution is a special case of a Coxian distribution. The basic analytical properties of phase -type distributions are given by the following result . Recall that the matrix-exponential eK is defined by the standard series expansion Eo K"/n! 3. Theorem 1 . 5 Let B be phase-type with representation (E, a, T). Then: (a) the c.d.f is B (x) = 1 - aeTxe; (b) the density is b(x ) = B'(x) = aeTxt; (c) the m.g.f. B[s] = f0°O esxB (dx) is a(-sI -T)-lt (d) the nth moment f0°O xnB(dx) is (- 1)"n! aT-"e. Proof Let P8 = (p ^) be the s-step EA x EA transition matrix for {Jt } and P8 the s-step E x E-transition matrix for {Jt} , i.e. the restriction of P8 to E. Then for i , j E E, the backwards equation for {Jt} (e.g. [APQ ] p. 36) yields s . p: dp; d-. . ds^
=
ds'
= ttlaj +
tikpkj. E t ikp kj = kEE kEE
3For a number of additional important properties of matrix-exponentials and discussion of computational aspects , see A.3
1. PHASE-TYPE DISTRIBUTIONS
219
That is, d8 P8 = TP8, and since obviously P° = I, the solution is P8 = eT8. Since
1 - B(x) = 1'a (( > x) = P., (Jx E E) =
1: aipF. = aPxe, i,jEE
this proves (a), and (b) then follows from B'(x) _ -cx Pxe = -aeTxTe = aeTxt (since T and eTx commute). For (c), the rule (A.12) for integrating matrixexponentials yields
B[s] =
esxaeTxt dx = a ( f°°e(81+T)dx ) t
J
a(-sI - T) -1t. Alternatively, define hi = Eie8S. Then h -tit ti + ti3 h j - tii -tii - s j# -tii i
(1.5)
Indeed , - tii is the rate of the exponential holding time of state i and hence (-tii)/(-tii - s) is the m .g.f. of the initial sojourn in state i. After that, we i w.p. tij / - tii and have an additional time to absorption either go to state j which has m .g.f. hj , or w.p. ti/ - tii we go to A, in which case the time to absorption is 0 with m .g.f. 1. Rewriting ( 1.5) as hi(tii + s)
=
-ti
-
t ij hj, j#i
tijhj + his = -ti,
jEE
this means in vector notation that (T + sI)h = -t, i.e. h = -(T + sI)-1t, and since b[s] = ah, we arrive once more at the stated expression for B[s]. Part (d) follows by differentiating the m.g.f.,
d" dsn a
(- s I - T) -'t
=
B(n)[0] = _
(- 1 ) n +l n ! a (s I + T ) - n -lt , (-1) n+1n!aT - n-1t = (-1)nn!aT-n-1Te (-1)nn! aT-ne.
Alternatively, for n = 1 we may put ki = Ei( and get as in (1.5)
-L j-
ki = 1 + tii
j:Ai
-tii
(1.6)
220 CHAPTER VIII. MATRIX-ANALYTIC METHODS which is solved as above to get k = -aT-le.
0
Example 1.6 Though typically the evaluation of matrix-exponentials is most conveniently carried out on a computer, there are some examples where it is appealing to write T on diagonal form, making the problem trivial. One obvious instance is the hyperexponential distribution, another the case p = 2 where explicit diagonalization formulas are always available, see the Appendix. Consider for example
3 9 a= (2 2), T= 2 111 so that 2 2
Then (cf. Example A3.7) the diagonal form of T is
T
9
9
10
70
6
1
9
10
70
7
1
7
10
10
0
9 110
where the two matrices on the r.h.s. are idempotent. This implies that we can compute the nth moment as
(-1)"n! aT -"e
9 9 10 70
1"n! 1 1 22
7 1 10 10 1 9
+6- "n!
(
( l 2 2 ) 17 9 0 \ 1 / 10 10
32 n! 35 +n!353 6" Similarly, we get the density as
aeTyt = e x
9 9 6 (1 1) 10 7 1 0 10
2
221
1. PHASE-TYPE DISTRIBUTIONS
+e -6x (1
2
1 10
11 2
7 10
9 6 70 7 9 10
2
35e-x + 18e-6x 35
The following result becomes basic in Sections 4, 5 and serves at this stage to introduce Kronecker notation and calculus (see A.4b for definitions and basic rules): Proposition 1.7 If B is phase-type with representation (v,T), then the matrix m.g.f. B[Q] of B is
f3[Q] =
J
e'1zB(dx) _ (v (9 I)(-T ® Q)-1(t ® I). (1.7)
Proof According to (A.29) and Proposition A4.4, 00
B[Q] =
Jf0
veTxteQx dx
(v (& I)
(
= (v ® I) (
(T ®Q)xdx fo o" e
f° eT
x edx I (t I)
t ® I) _ (v ® I)(-T ® Q)-1(t ® I). )( 0
Sometimes it is relevant also to consider phase-type distributions, where the initial vector a is substochastic, hail = E=EE a; < 1. There are two ways to interpret this: • The phase-type distribution B is defective, i.e 11BIJ = 1laDD < 1; a random variable U having a defective phase-type distribution with representation (a, T) is then defined to be oo on a set of probability 1- 11aDD, or one just lets U be undefined on this additional set. • The phase-type distribution B is zero-modified, i.e a mixture of a phasetype distribution with representation (a/llall,T) with weight hall and an atom at zero with weight 1 - hall. This is the traditional choice in the literature, and in fact one also most often there allows a to have a component ao at A.
222
CHAPTER VIII. MATRIX-ANALYTIC METHODS
la Asymptotic exponentiality Writing T on the Jordan canonical form, it is easily seen that the asymptotic form of the tail of a general phase-type distribution has the form
B(x) _ Cxke-nx, where C, 77 > 0 and k = 0, 1, 2.. .. The Erlang distribution gives an example where k > 0 (in fact, here k = p-1), but in many practical cases, one has k = 0. Here is a sufficient condition: Proposition 1.8 Let B be phase-type with representation (a, T), assume that T is irreducible , let -,q be the eigenvalue of largest real part of T, let v, h be the corresponding left and right eigenvectors normalized by vh = 1 and define C = ah • ve . Then the tail B(x) is asymptotically exponential, B(x) - Ce-7'.
(1.8)
Proof By Perron-Frobenius theory (A.4c), i is real and positive, v, h can be chosen with strictly positive component, and we have eTx - hve-7x, x -* oo. Using B(x) = aeTxe , the result follows (with C = (ah)(ve)). 0 Of course, the conditions of Proposition 1.8 are far from necessary ( a mixture of phase-type distributions with the respective T(') irreducible has obviously an asymptotically exponential tail, but the relevant T is not irreducible, cf. Example A5.8). In Proposition A5.1 of the Appendix, we give a criterion for asymptotical exponentiality of a phase-type distribution B, not only in the tail but in the whole distribution. Notes and references The idea behind using phase-type distributions goes back to Erlang, but todays interest in the topic was largely initiated by M.F. Neuts, see his book [269] (a historical important intermediate step is Jensen [214]). Other expositions of the basic theory of phase-type distributions can be found in [APQ], Lipsky [247], Rolski, Schmidli, Schmidt & Teugels [307] and Wolff [384]. All material of the present section is standard; the text is essentially identical to Section 2 of Asmussen [26]. In older literature, distributions with a rational m.g.f. (or Laplace transform) are often used where one would now work instead with phase-type distributions. See in particular the notes to Section 6. O'Cinneide [276] gave a necessary and sufficient for a distribution B with a rational m.g.f. B[s] = p(s)/q(s) to be phase-type: the density b(x) should be strictly positive for x > 0 and the root of q(s) with the smallest real part should be unique (not necessarily simple, cf. the Erlang case). No satisfying
223
2. RENEWAL THEORY
algorithm for finding a phase representation of a distribution B (which is known to be phase-type and for which the m.g.f. or the density is available ) is, however, known. A related important unsolved problem deals with minimal representations: given a phase-type distribution , what is the smallest possible dimension of the phase space E?
2 Renewal theory A summary of the renewal theory in general is given in A.1 of the Appendix, but is in part repeated below. Let U1, U2, ... be i.i.d. with common distribution B and define4
U(A)
= E# {n = 0,1, ...: U1 + ... +UnEA} 00 = EEI(U1 +...+UnEA). n=O
We may think of the U; as the lifetimes of items (say electrical bulbs) which are replaced upon failure, and U(A) is then the expected number of replacements (renewals) in A. For this reason, we refer to U as the renewal measure; if U is absolutely continuous on (0, oo) w.r.t. Lebesgue measure, we denote the density by u(x) and refer to u as the renewal density. If B is exponential with rate 0, the renewals form a Poisson process and we have u(x) = 0. The explicit calculation of the renewal density (or the renewal measure) is often thought of as infeasible for other distributions, but nevertheless, the problem has an algorithmically tractable solution if B is phase-type: Theorem 2.1 Consider a renewal process with interarrivals which are phasetype with representation (cr,T). Then the renewal density exists and is given
by u(x) = ae(T+ta)xt.
(2.1)
Proof Let {Jtk)} be the governing phase process for Uk and define {Jt} by piecing the { J(k) } together, JtJt1)
0
Jt={Jt?ul}, U1
is Markov and has two types of jumps , the jumps of the j(k) and the it } k) to the next J( k+l) A jump jumps corresponding to a transition from one Jt Then {
4Here the empty sum U1 +... + U0 is 0
224 CHAPTER VIII. MATRIX-ANALYTIC METHODS of the last type from i to j occurs at rate tiaj , and the jumps of the first type are governed by T. Hence the intensity matrix is T + ta, and the distribution of Jx is ae ( T+t«)x. The renewal density at x is now just the rate of jumps of the second type, which is ti in state i. Hence ( 2.1) follows by the law of total probability. ❑ The argument goes through without change if the renewal process is terminating, i.e. B is defective , and hence ( 2.1) remains valid for that case. However, the phase-type assumptions also yield the distribution of a further quantity of fundamental importance in later parts of this chapter , the lifetime of the renewal process. This is defined as U1 + ... + Uit_1 where s; is the first k with Uk = 00, that is, as the time of the last renewal; since Uk = oo with probability 1 - IIBII which is > 0 in the defective case, this is well-defined. Corollary 2.2 Consider a terminating renewal process with interarrivals which are defective phase-type with representation (a,T), i.e. IIafl < 1. Then the lifetime is zero-modified phase -type with representation (a,T + ta). Proof Just note that { it } is a governing phase process for the lifetime.
❑
Returning to non-terminating renewal processes , define the excess life e(t) at time t as the time until the next renewal following t, see Fig. 2.1. fi(t)
U2 U1 - U1
U3
U2
U3
U4
Figure 2.1 Corollary 2.3 Consider a renewal process with interarrivals which are phasetype with representation (a, T), and let µB = -aT-le be the mean of B. Then: (a) the excess life t(t) at time t is phase-type with representation ( vt,T) where vt = ae (T+ta)t .,
(b) £(t) has a limiting distribution as t -* oo, which is phase -type with representation (v,T) where v = -aT-1 /µB. Equivalently, the density is veTxt = B(x)/µB.
225
2. RENEWAL THEORY
Proof Consider again the process { Jt } in the proof of Theorem 2.1. The time of the next renewal after t is the time of the next jump of the second type, hence e(t) is phase-type with representation (vt,T) where vt is the distribution of it which is obviously given by the expression in (a). Hence in (b) it is immediate that v exists and is the stationary limiting distribution of it, i.e. the unique positive solution of ve = 1, (2.2) v(T + ta) = 0. Here are two different arguments that this yields the asserted expression: (i) Just check that -aT-1/µB satisfies (2.2): -aT-1 e = AB = 1 µB µB -a + aT-'Tea -aT-1(T + ta) µB
PB -a + aea -a + a µB
=0.
µB
(ii) First check the asserted identity for the density: since T, T-1 and eTx commute, we get B(x) aeTxe aT-1eTxTe µB µB PB
= veTxt.
Next appeal to the standard fact from renewal theory that the limiting distribution of e(x) has density B(x)/µB, cf. Al.e. ❑ Example 2 .4 Consider a non-terminating renewal process with two phases. The formulas involve the matrix-exponential of the intensity matrix Q = T + to =
( tll + tlal t12 + t2al
tlz + tlaz _ -q1 ql t22 + t2a2
(say).
q2 -q2
According to Example A3.6, we first compute the stationary distribution of Q, = qz ql (x1 xz) = ql + qz ql + q ' and the non-zero eigenvalue A = -ql - q2. The renewal density is then aeQtt = (al a2) ( 7i
7"2.) ( t2 )
CHAPTER VIII. MATRIX-ANALYTIC METHODS
226
+
e;`t (al a2)
J
(7r1 7r2) ( t2 7rltl +
7r2t2
+
C
11 172 -ir12 / \ t 2 ) r1
+ eAt (al a2) ( 71(t2 - tl)
eat (a17r2
- a27rl) (tl - t2)
1 + eat (a17r2 - a27r1) (t1 - t2) . t1B 0
Example 2 .5 Let B be Erlang(2). Then
Q=
0
55
)+(1o)=( j
ad ).
Hence 7r = (1/2 1/2), A = -25, and Example 2.4 yields the renewal density as u(t) = 2 (1 - e-2bt) 13
Example 2 .6 Let B be hyperexponential. Then _ Q
51 0
0 -52
+
51 52
_
)
-5152
51a2
52a1
-62a1
(al a2)
Hence
Slat + 52a1
51a2 51a2+52a1
A = -51a2 - 52a1, and Example 2.4 yields the renewal density as
u(t) = 5152
e- (biaz + aza, )t (51 - 52) 25152
51x2+5251 51a2+5251
Notes and references Renewal theory for phase-type distributions is treated in Neuts [268] and Kao [221]. The present treatment is somewhat more probabilistic.
3. THE COMPOUND POISSON MODEL 227
3 The compound Poisson model 3a Phase-type claims Consider the compound Poisson (Cramer-Lundberg) model in the notation of Section 1, with 0 denoting the Poisson intensity, B the claim size distribution, r(u) the time of ruin with initial reserve u, {St} the claim surplus process, G+(.) = F(ST(o) E •, T(0) < oo) the ladder height distribution and M = supt>o St. We asssume that B is phase-type with representation (a, T). Corollary 3.1 Assume that the claim size distribution B is phase-type with representation (a, T). Then: (a) G+ is defective phase-type with representation (a+, T) where a+ is given by a+ = - f3aT-1, and M is zero-modified phase-type with representation (a+, T + to+). (b) V,(u) = a+e(T+tQ+)u
Note in particular that p = IIG+II = a+e. Proof The result follows immediately by combining the Pollaczeck-Khinchine formula by general results on phase-type distributions: for (a), use the phasetype representation of Bo, cf. Corollary 2.3. For (b), represent the maximum M as the lifetime of a terminating renewal process and use Corollary 2.2. Since the results is so basic, we shall, however, add a more self-contained explanation of why of the phase-type structure is preserved. The essence is contained in Fig. 3.1 on the next page. Here we have taken the terminating Markov process underlying B with two states, marked by thin and thick lines on the figure. Then each claim (jump) corresponds to one (finite) sample path of the Markov process. The stars represent the ladder points ST+(k). Considering the first, we see that the ladder height Sr+ is just the residual lifetime of the Markov process corresponding to the claim causing upcrossing of level 0, i.e. itself phasetype with the same phase generator T and the initial vector a+ being the distribution of the upcrossing Markov process at time -ST+_. Next, the Markov processes representing ladder steps can be pieced together to one {my}. Within ladder steps, the transitions are governed by T whereas termination of ladder steps may lead to some additional ones: a transition from i to j occurs if the ladder step terminates in state i, which occurs at rate ti, and if there is a subsequent ladder step starting in j whic occurs w.p. a+j. Thus the total rate is tip + tia+.i, and rewriting in matrix form yields the phase generator of {my} as T + ta+. Now just observe that the initial vector of {mx} is a+ and that the lifelength is M.
228 CHAPTER VIII. MATRIX-ANALYTIC METHODS t -- M---------------------------------------
--------
S
ST+-(2-) -
{mx}
.,.t t
d kkt
--S.-------
Figure 3.1 This derivation is a complete proof except for the identification of a+ with -,QaT-1. This is in fact a simple consequence of the form of the excess distribution B0, see Corollary 2.3. 0 Example 3.2 Assume that ,Q = 3 and
b(x) = - 1 , 3e-3x + - . 7e-7x 2 2 Thus b is hyperexponential (a mixture of exponential distributions) with a (2 2 ), T = (-3 - 7)diag so that
a+
=
T+ta+ =
-QaT 1 = -3
(
3 2 2)
0
3 0 07/+( 7I \ 2 14
3 9 2 14 7
11
2
2
229
4. THE RENEWAL MODEL This is the same matrix as is Example 1.6, so that as there 1
9 9 e(T+ta+)u
9
10 70
e_u 10 70
7
1
7
9
10
) + e6'410 ( 10
10
Thus ,^(u) = a+e( T+ta+)ue = 24e-u + 1 e-6u 35 35 0
Notes and references Corollary 3.1 can be found in Neuts [269] (in the setting of M/G/1 queues, cf. the duality result given in Corollary 11.4.6), but that such a simple and general solution exists does not appear to have been well known to the risk theoretic community. The result carries over to B being matrix-exponential, see Section 6. In the next sections, we encounter similar expressions for the ruin probabilities in the renewal- and Markov-modulated models, but there the vector a+ is not explicit but needs to be calculated (typically by an iteration).
The parameters of Example 3.2 are taken from Gerber [157]; his derivation of +'(u) is different. For further more or less explicit computations of ruin probabilities, see Shin [340]. It is notable that the phase-type assumption does not seem to simplify the computation of finite horizon ruin probabilities substantially. For an attempt, see Stanford & Stroinski [351] .
4 The renewal model We consider the renewal model in the notation of Chapter V, with A denoting the interarrival distribution and B the service time distribution. We assume p = PB/µA < 1 and that B is phase-type with representation (a, T). We shall derive phase-type representations of the ruin probabilities V) (u), 0(8) (u) (recall that z/i(u) refers to the zero-delayed case and iY(8) (u) to the stationary case). For the compound Poisson model, this was obtained in Section 3, and the argument for the renewal case starts in just the same way (cf. the discussion around Fig. 3.1 which does not use that A is exponential) by noting that the distribution G+ of the ascending ladder height ST+ is necessarily (defective) phase-type with representation (a+, T) for some vector a+ = (a+;j). That is, if we define {mz} just as for the Poisson case (cf. Fig. 3.1): Proposition 4.1 In the zero-delayed case, (a) G+ is of phase-type with representation (a+,T), where a+ is the (defective)
CHAPTER VIII. MATRIX-ANALYTIC METHODS
230 distribution of mo;
(b) The maximum claim surplus M is the lifetime of {mx}; (c) {mx } is a (terminating) Markov process on E, with intensity matrix Q given by Q = T + to+.
The key difference from the Poisson case is that it is more difficult to evaluate a+. In fact, the form in which we derive a+ for the renewal model is as the unique solution of a fixpoint problem a+ = cp(a+), which for numerical purposes can be solved by iteration. Nevertheless, the calculation of the first ladder height is simple in the stationary case:
Proposition 4.2 The distribution G(s) of the first ladder height of the claim surplus process {Ste) } for the stationary case is phase -type with representation (a(8),T), where a(8) = -aT-1/PA. Proof Obviously, the Palm distribution of the claim size is just B. Hence by Theorem 11.6.5, G(') = pBo, where B0 is the stationary excess life distribution corresponding to B. But by Corollary 2.3, B0 is phase-type with representation ❑
(-aT-1/µa,T)• Proposition 4.3 a+ satisfies a+ = V(a+), where
w(a +) = aA[T + to+) = a
J0
e(T+t-+)1A(dy).
(4.1)
Proof We condition upon T1 = y and define {m.*} from {St+y - Sy-} in the same way as {mx} is defined from {St}, cf. Fig. 4.1. Then {m,*'} is Markov with the same transition intensities as {mx}, but with initial distribution a rather than a+. Also, obviously mo = m. Since the conditional distribution of my given T1 = y is ae4y, it follows by integrating y out that the distribution a+ ❑ of mo is given by the final expression in (4.1). We have now almost collected all pieces of the main result of this section:
Theorem 4 .4 Consider the renewal model with interarrival distribution A and the claim size distribution B being of phase-type with representation (a,T). Then
4. THE RENEWAL MODEL
231
I
i
- M----------------------------- •..
{mx} ------------------- ----------
y
^--
T1= y -`•r---------------
Figure 4.1 ,^(u) = a+e ( T+ta+)xe,
,^(8)(u) = a ( 8)e(T+ta +) xe,
(4.2)
where a+ satisfies (4.1) and a(8) _ -aT- 1/pA. Furthermore , a+ can be computed by iteration of (4.1), i.e. by a+ = lim a +n)
where a+°) - 0, a+l ) = cp (a+°)) , a+2) = ^p (a+l)) , .... (4.3)
Proof The first expression in (4.2 ) follows from Proposition 4.1 by noting that the distribution of mo is a+. The second follows in a similar way by noting that only the first ladder step has a different distribution in the stationary case, and that this is given by Proposition 4.2; thus , the maximum claim surplus for the stationary case has a similar representation as in Proposition 4.1(b), only with initial distribution a(*) for mo. It remains to prove convergence of the iteration scheme (4.3). The term tf3 in cp(i3) represents feedback with rate vector t and feedback probability vector (3. Hence ^p(,3) (defined on the domain of subprobability vectors ,0) is an increasing function of /3. In particular , a+) > 0 = a+o) implies a+) _ (a+)
> W (a+)) = a+)
CHAPTER VIII. MATRIX-ANALYTIC METHODS
232
and (by induction ) that { a+ n) } is an increasing sequence such that limn,. a+ ) exists . Similarly, 0 = a+) < a+ yields a+) _ (a+0))
(a+) = a+
(n and by induction that a(n) < a+ for all n . Thus , limn-4oo a ) < a+. To prove the converse inequality, we use an argument similar to the proof of Proposition VI.2.4. Let Fn = {T1 + • • • + Tn+1 > r+}be the event that {my} has at most n arrivals in [T1, 7-+ ], and let &+".) = P(mTl = i; Fn ). Obviously, &+n) T a+, so to complete the proof it suffices to show that &+n) < a+) for all n. For n = 0, both quantities are just 0 . Assume the assertion shown for n - 1. Then each subexcursion of {St+Tl - ST,-} can contain at most n - 1 arrivals (n arrivals are excluded because of the initial arrival at time T1 ). It follows that on Fn the feedback to {mz} after each ladder step cannot exceed &+n-1) so that a+ n) < a f ^ e(T+ t&+ -1))YA(dy) o
<
a is e(T+t«+-1')YA(dy) _ w (a+-1 )) = a+n). 0
0 We next give an alternative algorithm, which links together the phase-type setting and the classical complex plane approach to the renewal model (see further the notes). To this end, let F be the distribution of U1 - T1. Then F[s] = a(-sI - T)-'t • A[-s] (4.4) whenever EeR(S)U < oo. However, (4.4) makes sense and provides an analytic continuation of F[•] as long as -s ¢ sp(T). Theorem 4.5 Let s be some complex number with k(s) > 0, -s ¢ sp(T). Then -s is an eigenvalue of Q = T + ta+ if and only if 1 =,P[s] = A[-s]B[s], with B[s], F[s] being interpreted in the sense of the analytical continuation of the m.g.f. In that case, the corresponding right eigenvector may be taken as (-sI - T)-It.
Proof Suppose first Qh = -sh. Then e4'h = e-82h and hence -sh = Qh = (T + taA[Q])h = Th + A[-s]tah. (4.5) Since -s $ sp(T), this implies that ahA[-s] # 0, and hence we may assume that h has been normalized such that ahA[-s] = 1. Then (4.5) yields h = (-sI - T)-1t. Thus by (4.4), the normalization is equivalent to F(s) = 1.
4. THE RENEWAL MODEL
233
Suppose next F(s) = 1. Since R(s) > 0 and G _ is concentrated on (-oo, 0), we have IG_ [s] I < 1 , and hence by the Wiener-Hopf factorization identity (A.9) we have G+[s] = 1 which according to Theorem 1.5(c) means that a+(-sI T)-1t = 1. Hence with h = (-sI -T)- lt we get
Qh = (T + to+)h = T(-sI - T)-lt + t = -s(-sI - T)-lt = -sh.
Let d denote the number of phases. Corollary 4.6 Suppose u < 0,' that the equation F(s) = 1 has d distinct roots p1, ... , Pd in the domain ER(s) > 0 , and define hi = (-piI - T)-It, Q = CD-1 where C is the matrix with columns hl,..., hd, D that with columns -p1 hl, ... , -pdhd. Then G+ is phase- type with representation (a+, T) with a+ = a(Q-T)/at. Further, letting vi be the left eigenvector of Q corresponding to -pi and normalised by vihi = 1 , Q has diagonal form d
d
Q = -dpivi®hi = -dpihivi. (4.6) i=1
i=1
Proof Appealing to Theorem 4.5, the matrix Q in Theorem 2.1 has the d distinct eigenvalues - p1i ... , -Pd with corresponding eigenvectors hl,..., hd. This immediately implies that Q has the form CD-1 and the last assertion on the diagonal form . Given T has been computed, we get at a(Q - T) = 1 ata+ = a+.
Notes and references Results like those of the present section have a long history, and the topic is classic both in risk theory and queueing theory (recall that we can identify 0(u) with the tail P(W > u) of the GI/PH /1 waiting time W; in turn, W v M(d) in the notation of Chapter V). In older literature , explicit expressions for the ruin/ queueing probabilities are most often derived under the slightly more general assumption that b is rational (say with degree d of the polynomial in the denominator) as discussed in Section 6. As in Corollary 4.6, the classical algorithm starts by looking for roots in the complex plane of the equation f3[y]A[-ry] = 1, t(ry) > 0. The roots are counted and located by Rouche' s theorem (a classical result from complex analysis giving a criterion for two complex functions to have the same number of zeros within the unit circle ). This gives d roots 'y,,. .. , -yd satisfying R(ryi) > 0, and the solution is
CHAPTER VIII. MATRIX-ANALYTIC METHODS
234 then in transform terms
d
F 1 + a J e°" ip(u) du = Ee°w =
11(--t,) d
(see, e.g., Asmussen & O'Cinneide [ 41] for a short self- contained derivation). In risk theory, a pioneering paper in this direction is Tacklind [373], whereas the approach was introduced in queueing theory by Smith [350]; similar discussion appears in Kemperman [227] and much of the queueing literature like Cohen [88]. This complex plane approach has been met with substantial criticism for a number of reasons like being lacking probabilistic interpretation and not giving the waiting time distribution / ruin probability itself but only the transform. In queueing theory, an alternative approach (the matrix-geometric method ) has been developed largely by M.F. Neuts and his students, starting around in 1975. For surveys , see Neuts [269], [270] and Latouche & Ramaswami [241]. Here phase- type assumptions are basic, but the models solved are basically Markov chains and -processes with countably many states ( for example queue length processes ). The solutions are based upon iterations schemes like in Theorem 4.4; the fixpoint problems look like
R=Ao+RAI+R2A2+ , where R is an unknown matrix, and appears already in some early work by Wallace [377]. The distribution of W comes out from the approach but in a rather complicated form . The matrix- exponential form of the distribution was found by Sengupta [335] and the phase-type form by the author [18]. The exposition here is based upon [18], which contains somewhat stronger results concerning the fixpoint problem and the iteration scheme. Numerical examples appear in Asmussen & Rolski [43]. For further explicit computations of ruin probabilities in the phase-type renewal case , see Dickson & Hipp [118], [119].
5 Markov-modulated input We consider a risk process {St } in a Markovian environment in the notation of Chapter VI. That is , the background Markov process with p states is {Jt}, the intensity matrix is A and the stationary row vector is ir . The arrival rate in background state i is a; and the distribution of an arrival claim is B;. We assume that each B; is phase-type, with representation say (a(' ), T('), E(t)). The number of elements of El=> is denoted by q;. It turns out that subject to the phase- type assumption , the ruin probability can be found in matrix-exponential form just as for the renewal model, involving
5. MARKOV-MODULATED INPUT
235
some parameters like the ones T or a+ for the renewal model which need to be determined by similar algorithms. We start in Section 5a with an algorithm involving roots in a similar manner as Corollary 4.6. However, the analysis involves new features like an equivalence with first passage problems for Markovian fluids and the use of martingales (these ideas also apply to phase-type renewal models though we have not given the details). Section 5b then gives a representation along the lines of Theorem 4.4. The key unknown is the matrix K, for which the relevant fixpoint problem and iteration scheme has already been studied in VI.2.
5a Calculations via fluid models. Diagonalization Consider a process {(It, Vt)}t>o such that {It} is a Markov process with a finite state space F and {Vt} has piecewiese linear paths, say with slope r(i) on intervals where It = i. The version of the process obtained by imposing reflection on the V component is denoted a Markovian fluid and is of considerable interest in telecommunications engineering as model for an ATM (Asynchronuous Transfer Mode) switch. The stationary distribution is obtained by finding the maximum of the V-component of the version of {(It,Vt)} obtained by time reversing the I component. This calculation in a special case gives also the ruin probabilities for the Markov-modulated risk process with phase-type claims. The connection between the two models is a fluid representation of the Markov-modulated risk process given in Fig. 5.1. (a) 0
o
0 ♦ o ° tl ♦ • 0 0 o } o
(b)
0
}
o
♦ •
f
0
o
Figure 5.1 In Fig. 5.1, p = ql = Q2 = 2. The two environmental states are denoted o, •, the phase space E(°) for B. has states o, O, and the one E(•) for B. states
CHAPTER VIII. MATRIX-ANALYTIC METHODS
236
4, 4. A claim in state i can then be represented by an E()-valued Markov process as on Fig. 5.1(a). The fluid model on Fig . 5.1(b) {(It ,Vt)} is then obtained by changing the vertical jumps to segments with slope 1. Thus F = {o, o, V, •, 4, 4}. In the general formulation , F is the disjoint union of E and the Eli), r(i, a) = 1.
F = E U { (i, a) : i E E, a E E(i) } , r(i) _ -1, i E E, The intensity matrix for { It} is (taking p = 3 for simplicity)
AI =
0
0
A - (Ni)diag
'31a(1) 0
f32a(2)
0
0 t(2) 0
0 T1 0 0
0 0 T(2) 0
'33a(3) 0 0 T(3)
I
t(1) 0 0
0 0 t(3)
The reasons for using the fluid representation are twofold. First, the probability in the Markov-modulated model of upcrossing level u in state i of {Jt} and phase a E Eli) is the same as the probability that the fluid model upcrosses level u in state (i, a) of {It}. Second, in the fluid model Eel', < oo for all s, t, whereas Ee8s' = oo for all t and all s > so where so < oo. This implies that in the fluid context, we have more martingales at our disposal. Recall that in the phase-type case, Bi[s] = -a(i)(T(i) + sI)-it('). Let E denote the matrix
-,31a(l) 0
(/3i)diag - A
Or 1A/ _
t(i)
0
0 0
t(2)
0
0 0 t(3)
0 T1 0 0
0 - 92a(2) 0 0 T(2) 0
0 0
-f33a(3) 0 0 T(3)
with the four blocks denoted by Ei„ i, j = 1, 2, corresponding to the partitioning + Epp). of E into components indexed by E, resp. Eli) + Proposition 5.1 A complex number s satisfies
'A+
(f3i(Bi[-s] - 1))diag + sII = 0 (5.1)
if and only if s is an eigenvalue of E. If s is such a number, consider the vector a satisfying (A + (13i(Bi[ -s] - 1))diag ) a = -sa and the eigenvector b =
5. MARKOV-MODULATED INPUT
(a>
237
of 0* 1 AI, where c, d correspond to the partitioning of b into components
indexed by E, resp . E(1) + + E(P). Then (up to a constant) c = a, d = (sI -
E22)-1E21a
= E ai(sI - T('))-1t(i) . iEE
Proof Using the well-known determinant identity Ell E12 E21 E22
E22 I ' I Ell - E12E22 E21 I ,
with Eii replaced by Eii - sI, it follows that if
(/3i)diag
t(1)
0 0
0 t(2) 0
0 0
-Qla(1) 0 0 -,32a(2)
- A - sI 0 0 t(3)
- Nla(1) 0 0 T 1- sI 0 0 0 T(2) - sI 0 0 0 T(3) - sI
= 0,
then also ()3i)diag - A - sI+ ((3ia(i)(T(i) - sI)-1t)) iag
I = 0
which is the same as (5.1). For the assertions on the eigenvectors, assume that a is chosen as asserted which means (Ell - sI + E12 (sI - E22)-1 E21) a = 0, and let d = (sI - E22)-1 E21a, c = a. Then E21c+E22d =
E21a - (sI - E22 - sI) (sI - E22)-1 E21a E21a - E21a + sd = sd.
Noting that E11c + E12d = se by definition, it follows that Ell E12
( E 21 E22) (d) = s 1 d I . 0
CHAPTER VIII. MATRIX-ANALYTIC METHODS
238
Theorem 5.2 Assume that E = Or 'Al has q = ql + + qp distinct eigenvalues si, ... , sq with $2s,, < 0 and let b(v) = I d(„)) be the right eigenvector corresponding to s,,, v = 1, . . . , q. Then ,,/'
u = e'
(esiuc ( 1)
... e89uc(e)) (d(1) ... d("))-1 e.
Proof Writing Or-'Alb( v) = svb( v) as (AI - O,.sv)b(v) = 0, it follows by Proposition II.5.4 that {e--"1b(v) is a martingale . For u, v > 0, define w(u,v)=inf{t >0:Vtu orVt=- v}, w(u)=inf{t >O:Vt-u}, pi(u, v;
p i( u , v; j) pi( u ; j, a)
(j, a)), I' i( Vw(u,v) = -v) I,( u,v) = j), P2 (w (u) < oo, Iw(u,v) = (j, a)).
= Pi (Vw(u,v) =
j, a)
=
=
u)
Iw(u,v) =
Optional stopping at time w (u, v) yields C{V) = e8 ,upi(u, v;
Letting v -^ oo and using
Rsv
j, a )d(a
+ e8 °vpi (u ,v;j)c
v
.
< 0 yields
e8'uc = Epi(u;j,a)d^ ). j,a
Solving for the pi(u; j, a) and noting that i1 (u) = >I j,,,,pi(u; j, a), the result ❑ follows. Example 5 .3 Consider the Poisson model with exponential claims with rate 5. Here E has one state only. To determine 0 (u), we first look for the negative eigenvalue s of E = I -0 I which is s = -ry with yy = b -,Q. We can take a = c = 1 and get d = (s + b)-16 = 5/(3 = 1/p. Thus 0(u) = esu/d = pe-7 ° as ❑ should be. Example 5 .4 Assume that E has two states and that B1, B2 are both exponential with rates 51 i b2. Then we get V)i (u) as sum of two exponential terms where the rates s1, s2 are the negative eigenvalues of Al +01 -A1
E _
-A 2 b1
A2 +32 0
0 52
239
5. MARKOV-MODULATED INPUT
5b Computations via K Recall the definition of the matrix K from VI.2. In terms of K, we get the following phase-type representation for the ladder heights (see the Appendix for the definition of the Kronecker product 0 and the Kronecker sum ®): Proposition 5.5 G+(i, j; •) is phase-type with representation (E(i), 8^')IT(j)) where e 3^') =,33(e = 0 a(j))(-K ®T ( j))(ej (9 I). Proof We must show that G+ (i, j; (y, oo))
j)ye. (') a T(
However , according to VI.( 2.2) the l.h.s. is
0 /3 f R(i , j; dx)Bj(y - x) 00 f ° (') (j) eT (y-y)edx ,3j eye- xxej • a 00 oo f
el ,Qj eie
x T(j)y eej (j)a T(') e dx e e
0
00
eKx
®
e T(')'
dx (ej (& I)e T(')ye
00 eKa®T(')x dx (ej (9 I)eT(') Ye
e(i)eT(')ye. 0
Theorem 5 .6 For i E E, the Pi-distribution of M is phase-type with representation (E(1) + + E(P), 9('), U) where t(j) + t(j)O(j j = k
uja,k.y = to
B k7
j # k
In particular,
i,b (u) = Pi(M > u) = 9(i)euue. (5.3)
240
CHAPTER VIII. MATRIX-ANALYTIC METHODS
Proof We decompose M in the familiar way as sum of ladder steps . Associated with each ladder step is a phase process, with phase space EU> whenever the corresponding arrival occurs in environmental state j (the ladder step is of type j). Piecing together these phase processes yields a terminating Markov process with state space EiEE E('), intensity matrix U, say, and lifelength M, and it just remains to check that U has the asserted form. Starting from Jo = i, the initial value of (i, a) is obviously chosen according to e(`). For a transition from (j, a) to (k, ,y) to occur when j # k, the current ladder step of type j must terminate, which occurs at rate t(i), and a new ladder step of type k must start in phase y, which occurs w.p. Bk7 . This yields the asserted form of uja,k y. For j = k, we have the additional possibility of a phase change from a to ry within the ladder step, which occurs at rate t^^7.
❑
Notes and references Section 5a is based upon Asmussen [21] and Section 5b upon Asmussen [17]. Numerical illustrations are given in Asmussen & Rolski [43].
6 Matrix-exponential distributions When deriving explicit or algorithmically tractable expressions for the ruin probability, we have sofar concentrated on a claim size distribution B of phase-type. However, in many cases where such expressions are available there are classical results from the pre-phase-type-era which give alternative solutions under the slightly more general assumption that B has a Laplace transform (or, equivalently, a m.g.f.) which is rational, i.e. the ratio between two polynomials (for the form of the density, see Example 1.2.5). An alternative characterization is that such a distribution is matrix-exponential, i.e. that the density b(x) can be written as aeTxt for some row vector a, some square matrix T and some column vector t (the triple (a, T, t) is the representation of the matrix-exponential distribution/density): Proposition 6.1 Let b(x) be an integrable function on [0, oo) and b* [0] = f °O e-Bxb(x) dx the Laplace transform. Then b*[0] is rational if and only b(x) is matrix-exponential. Furthermore, if
b* [0] =
b1 +b20+b302 +... +bn0i-1 0n +a10n-1 +... +aii-10+anI
then a matrix-exponential representation is given by b(x) = aeTxt where
a = (b1 b2 ... bn-1 bn), t = (0 0 ... 0 1)', (6.2)
6. MATRIX-EXPONENTIAL DISTRIBUTIONS
T =
241
0 1 0 0 0 ... 0 0 0 0 1 0 0 ... 0 0 .. .(6.3) 0 0 0 0 0 ... 0 1 -an -an-1 -an _2 - an_3 -an _ 4 ... -a2 -a1
Proof If b(x) = aeTxt, then b*[0] = a(0I -T)-1t which is rational since each element of (01 - T)-1 is so. Thus, matrix-exponentiality implies a rational transform. The converse follows from the last statement of the theorem. For a proof, see Asmussen & Bladt [29] (the representation (6.2), (6.3) was suggested by Colm O'Cinneide, personal communication). ❑ Remark 6.2 A remarkable feature of Proposition 6.1 is that it gives an explicit Laplace tranform inversion which may appear more appealing than the first attempt to invert b* [0] one would do, namely to asssume the roots 6l, . . . , bn of the denominator to be distinct and expand the r.h.s. of (6.1) as E 1 c;/(0 + bi), ❑ giving b(x) = E 1 cie-biz/bY. Example 6 .3 A set of necessary and sufficient conditions for a distribution to be phase-type are given in O'Cinneide [276]. One of his elementary criteria, b(x) > 0 for x > 0, shows that the distribution B with density b(x) = c(1 cos(21r x))e-x, where c = 1 + 1/47r 2, cannot be phase-type. Writing b(x) = c(-e( 2ni-1 ) y/2 - e(-tai-1)x/2 + e-'T) it follows that a matrix-exponential representation ()3, S, s) is given by
27r i - 1 0 0 )3 = (111), S =
f -c/2
0 -21ri - 1 0 , s = -c/ 2 . (6.4) 0 0 -1 c
This representation is complex, but as follows from Proposition 6.1, we can always obtain a real one (a, T, t). Namely, since 1 + 4ir2 03 + 302 + (3 + 47x2)0 + 1 + 47r2 it follows by (6.2), (6.3) that we can take
0 1 0 0 a= (1 + 47r2 0 0), T= 0 0 1 , t= 0 . -1 - 47r2 -3 - 47x2 -3 1
0
CHAPTER VIII. MATRIX-ANALYTIC METHODS
242
Example 6 .4 This example shows why it is sometimes useful to work with matrix-exponential distributions instead of phase-type distributions: for dimension reasons . Consider the distribution with density
b(x)
=
15 ((2e-2x - 1)2 + 6). 7 + 155e-x
Then it is known from O'Cinneide [276] that b is phase-type when 6 > 0, and that the minimal number of phases in a phase-type representation increases to 0o as 5 , 0, leading to matrix calculus in high dimensions when b is small. But since
15(1 +6)02 + 1205 0 + 2255 + 105 b* [9] _ (7 + 155)03 + (1355 + 63)92 + (161 + 3455)9 + 2256 + 105 Proposition 6.1 shows that a matrix-exponential representation can always be ❑ obtained in dimension only 3 independently of J. As for the role of matrix-exponential distributions in ruin probability calculations, we shall only consider the compound Poisson model with arrival rate 0 and a matrix-exponential claim size distribution B, and present two algorithms for calculating '(u) in that setting. For the first, we take as starting point a representation of b* [0] as p( O)/q(9) where p, q are polynomials without common roots. Then (cf. Corollary 111.3.4) the Laplace transform of the ruin probability is
/g(e)-PO 0*[e] _ /' e-eu^G(u)dU = 0 9(/3--a0p(-9)ap (9)/q(9)) .
(6.5)
Thus, we have represented ti* [0] as ratio between polynomials (note that 0 must necessarily be a root of the numerator and cancels), and can use this to invert by the method of Proposition 6.1 to get i (u) = f3esus. For the second algorithm, we use a representation (a, T, t) of b(x). We recall (see Section 3; recall that t = -Te) that if B is phase-type and (a, T, t) a phase-type representation with a the initial vector, T the phase generator and t = -Te, then 5(u) = -a+e(T+t-+)uT-le
where a+ = -/3aT-1. (6.6)
The remarkable fact is, that despite that the proof of (6.6) in Section 3 seems to use the probabilistic interpretation of phase-type distribution in an essential way, then: Proposition 6.5 (6.6) holds true also in the matrix-exponential case.
6. MATRIX-EXPONENTIAL DISTRIBUTIONS 243 Proof Write b* = a(9I - T)-1t, b+ = a+(9I - T)- 't, b+ = a +(BI - T)-1T-1t. Then in Laplace transform formulation , the assertion is equivalent to -a+(BI - T - to+)-1T - 1t du =
9(
, - 6b* - b* (6.7)
cf. (6.5 ), (6.6). Presumably, this can be verified by analytic continuation from the phase-type domain to the matrix-exponential domain , but we shall give an algebraic proof. From the general matrix identity ([331] p. 519)
(A + UBV )- 1 = A-1 - A - 1UB(B + BVA-1UB)- 1BVA-1, with A = 91-T, U =- t,B=land V=a+, we get (91- T - to+)-1 = (BI - T)-1 + (6I - T)-1t ( l - a+(9I - T)-1t)-1a +(9I - T)-1 (91- T)-1 + 1 ib* (91- T)-1ta+(OI - T)-1 so that b* b** b** -a+(9I - T - to+)-1T - 1t = -b* - 1 + b+ = b++ 1 . Now, since (91-T)-1T - 1 = ^(T-1 + ( 91-T)-1),
(91- T)-1T -2 =
IT-2 + 82T - 1 + 82 (9I - T)-1
and 1 =
J0 00 b(x) dx =
-aT-1t,
xb(x) dx = aT2t,
AB
f
we get b+ = -0aT-1(9I -T)- 1t = -f3a (0I -T)-1T-1t
CHAPTER VIII. MATRIX-ANALYTIC METHODS
244
- 8 a(T-1 + (01- T)-1)t = 8 (1 - b*), -/3aT-1(0I - T)-1T- 1t = -/3a (9I - T)-1T-2t -,3a
(1 0
T -2
+
1
1 T -1 + (9I 02 02
-T)-1) t
-P + 7- 82b*.
From this it is straightforward to check that b**/(b+ - 1) is the same as the r.h.s. of (6.7). 0 Notes and references As noted in the references to section 4, some key early references using distributions with a rational transform for applied probability calculations are Tacklind [373] (ruin probabilities) and Smith [350] (queueing theory). A key tool is identifying poles and zeroes of transforms via Wiener-Hopf factorization. Much of the flavor of this classical approach and many examples are in Cohen [88]. For expositions on the general theory of matrix-exponential distributions, see Asmussen & Bladt [29], Lipsky [247] and Asmussen & O'Cinneide [41]; a key early paper is Cox [90] (from where the distribution in Example 6.3 is taken). The proof of Proposition 6.5 is similar to arguments used in [29] for formulas in renewal theory.
7 Reserve-dependent premiums We consider the model of Chapter VII with Poisson arrivals at rate 0, premium rate p(r) at level r of the reserve {Rt} and claim size distribution B which we assume to be of phase-type with representation (E, a, T). In Corollary VII.1.8, the ruin probability(u) was found in explicit form for the case of B being exponential. (for some remarkable explicit formulas due to Paulsen & Gjessing [286], see the Notes to VII.1, but the argument of [286] does not apply in any reasonable generality). We present here first a computational approach for the general phase-type case (Section 7a) and next (Section 7b) a set of formulas covering the case of a two-step premium rule, cf. VII.la.
7a Computing O(u) via differential equations The representation we use is essentially the same as the ones used in Sections 3 and 4, to piece together the phases at downcrossing times of {Rt} (upcrossing times of {St}) to a Markov process {mx} with state space E. See Fig. 7.1, which is self-explanatory given Fig. 3.1.
7. RESERVE-DEPENDENT PREMIUMS
245
Rt l0 -u
--------------------- 1z I.
Figure 7.1 The difference from the case p(r) = p is that {m2}, though still Markov, is no longer time-homogeneous. Let P(tl,t2) be the matrix with ijth element P (mt2 =j I mtl = i), O<- tl < t2 < u. Define further vi(u) as the probability that the risk process starting from RD = u downcrosses level u for the first time in phase i. Note that in general >iEE Vi (U) < 1. In fact, >iEE Vi (U) is the ruin probability for a risk process with initial reserve 0 and premium function p(u + •). Also, in contrast to Section 3, the definition of {m8} depends on the initial reserve u = Ro. Since v(u) = (vi(u))iEE is the (defective) initial probability vector for {m8}, we obtain V)(u) = P(m„ E E) = v(u)P(0,u)e = A(u)e (7.1) where A(t) = v(u)P(0, t) is the vector of state probabilities for mt, i.e. Ai(t) = P(mt = i). Given the v(t) have been computed, the A(t) and hence Vi(u) is available by solving differential equations: Proposition 7.1 A(0) = v(u) and A'(t) = A(t)(T + tv(u - t)), 0 < t < u. Proof The first statement is clear by definition. By general results on timeinhomogeneous Markov processes, P(tl, t2) = exp
{
tq
f Q(v) dvl
t1 1
where Q(t) = ds [P(t, t + s) - I] I 8-0
CHAPTER VIII. MATRIX-ANALYTIC METHODS
246
However, the interpretation of Q(t) as the intensity matrix of {my} at time t shows that Q(t) is made up of two terms: obviously, {mx} has jumps of two types, those corresponding to state changes in the underlying phase process and those corresponding to the present jump of {Rt} being terminated at level u - t and being followed by a downcrossing. The intensity of a jump from i to j is tij for jumps of the first type and tivj(u - t) for the second. Hence Q(t) _ T + tv(u - t), A'(t) = A(t)Q(t) = A(t)(T + tv(u - t)).
0 Thus, from a computational point of view the remaining problem is to evaluate the v(t), 0 < t < u. Proposition 7.2 For i E E,
-vi,(u)
p ( u)
=
,(tai
+ vi(u)
E
vj(u)tjp (u) -
jEE
Q
+ vj (u)tjip ( u). (7.4)
jEE
Proof Consider the event A that there are no arrivals in the interval [0, dt], the probability of which is 1 -,3dt. Given A', the probability that level u is downcrossed for the first time in phase i is ai. Given A, the probability that level u + p(u)dt is downcrossed for the first time in phase j is vj (u + p(u)dt). Given this occurs, two things can happen: either the current jump continues from u + p(u)dt to u, or it stops between level u + p(u)dt and u. In the first case, the probability of downcrossing level u in phase i is
8ji(1 + p(u)dt • tii) + (1 - Sj i)p(u)dt • tji = Sji
+ p(u)tji dt,
whereas in the second case the probability is p(u)dt • tjvi(u). Thus, given A, the probability of downcrossing level u in phase i for the first time is E vj (u + p(u)dt) (Sji + p( u)dt • tji + p(u)dt • tjvi(u)) jEE
vi(u) + vi' (u)p(u)dt + p(u) dt E {tji + tjvi(u)} jEE
Collecting terms, we get vi(u) = aidt + (1 -,Qdt) vi(u) + vi'(u)p(u)dt + p(u) dt E{tji+tjvi(u)}. jEE
7. RESERVE-DEPENDENT PREMIUMS 247 Subtracting v; (u) on both side and dividing by dt yields the asserted differential ❑ equation. When solving the differential equation in Proposition 7.2, we face the difficulty that no boundary conditions is immediately available. To deal with this, consider a modification of the original process {Rt} by linearizing the process with some rate p, say, after a certain level v, say. Let p" (t), Rt , F" etc. refer to the modified process. Then pv(r)
p(r) r < v p r>v '
and (no matter how p is chosen) we have: Lemma 7.3 For any fixed u > 0, vi (U) = lim v= (u). V - 00
Proof Let A be the event that the process downcrosses level u in phase i given that it starts at u and let B" be the event By={o,
where o, denotes the time of downcrossing level u . Then P(B,) is the tail of a (defective) random variable so that P(Bv) -+ 0 as v -4 oo, and similarly P"(Bv) - ^ 0. Since the processes Rt and Rt coincide under level B,,, then P(A n Bv) _ P"(A n BV'). Now since both P(A n Bv) -3 0 and P"(A n Bv) -- 0 as v -+ 00 we have
P(A) -P"(A) = P(AnBv)+P(AnBv) -P"(AnB,,) -P"(AnBv) = P(AnB,)-P"(AnB,)
-+ 0 ❑
as v -+ oo. From Section 3, we have p(r) = p =
vi (u) -0aTe;, P
which implies that v, (v) is given by the r.h.s. of (7.5). Thus, we can first for a given v solve (7.4) backwards for {va (t)}v>t>o, starting from v"(v) = -,i7rT-1/p. This yields v, (u) for any values of u and v such that u < v.
248
CHAPTER VIII. MATRIX-ANALYTIC METHODS
Next consider a sequence of solutions obtained from a sequence of initial values {v; (u)},, where, say, v = u, 2u, 3u etc. Thus we obtain a convergent sequence of solutions that converges to {vi(t)}u>t>o• Notes and references The exposition is based upon Asmussen & Bladt [30] which also contains numerical illustrations. The algorithm based upon numerical solution of a Volterra integral equation (Remark VII.1.9, numerically implemented in Schock Petersen [288]) and the present one based upon differential equations require both discretization along a discrete grid 0, 1/n, 2/n,.... However, typically the complexity in n is 0(n2) for integral equations but 0(n) for integral equations. The precision depends on the particular quadrature rule being employed. The trapezoidal rule used in [288] gives a precision of 0(n 3), while the fourth-order Runge-Kutta method implemented in [30] gives 0(n-5).
7b Two-step premium rules We now assume the premium function to be constant in two levels as in VII.1a, p(r)
P, r
We may think of process Rt as pieced together of two standard risk processes RI and Rte with constant premiums p1, p2, such that Rt coincide with RI under level v and with Rt above level v. Let ii'( u) = a+'ie(T+ta +^)"e denote the ruin probability for R't where a+ = a+i) = -laT-1/pi, cf. Corollary 3.1. We recall from Propositon VII.1.10 that in addition to the O'(•), the evaluation of Vi(u) requires q(u) = 1 - zp1(u)/(1 - z51(v)), 0 < u < v, which is available since the z/i'(.) are so, as well p1(u), the probability of ruin between a and the next upcrossing of v, where v = inf It > 0 : Rt < v}. To evaluate p1(u), let v(u) = a+2ieiT +ta+>)(u-v), assuming u > v for the moment. Then v(u) is the initial distribution of the undershoot when downcrossing level v given that the process starts at u, i.e. for u > v the distribution of v - RQ (defined for or < oo only) is defective phase-type with representation (v(u), T). Recall that q(w) is the probability of upcrossing level v before ruin given the process starts at w < v. Therefore u vvueTa t 1vueTva (7.7) pl( ) = ( ) ( q(v q( dx ))+( ) f o (the integral is the contribution from {R, > 0} and the last term the contribu-
tion from {R, < 0}). The f iin
in (7.7) equals
-01 (v - x) dx f v(u)eT xt dx - v v(u)eTat 1 1 - V" M 0
7. RESERVE-DEPENDENT PREMIUMS 1
1 - v(u)eTVe - 1
249
1 - v(u ) eTV e
-
- ^1(v)
J
v(u)eTxtz/)l (v - x) dx} V
from which we see that
1
pl (u) = 1 + 1
-1(v) f V
v(u) eTxt,01 (v - x) dx -
1 - v(u)eTve). 1 -^(v) ( (7.8)
The integral in (7.8) equals v v(u)eTxta+2) e(T+ta +))( v-x)edx which using Kronecker calculus (see A.4) can be written as (Y(u)
®a+)e(T+t°+>)°1
(T ® (-T - to+))1-1
{e{T®(-T-toy+ ))}„ - jl (t ®e)
Thus, all quantities involved in the computation of b(u) have been found in matrix form.
Example 7.4 Let {Rt } be as in Example 3.2. I.e., B is hyperexponential corresponding to
-3 0 3 a-(2 2)' T= ( 0 7 t- (7 The arrival rate is (i = 3. Since µB = 5/21, p2 < 3.21 = ? yields 0(u) = 1, so we consider the non-trivial case example p2 = 4 and p1 = 1. From Example 3.2, 01(u)
_ 24 -u + 35 1 e-6u 35 e
4(u)
_ 35 - 24e- u - e-6u 35 - 24e-v - e-6v
Let Al = -3 + 2V'2- and A2 = -3 - 2V"2- be the eigenvalues of T + to( 2 ). Then one gets
f
X20 20 21
1ea1(u -v) + 1
3
3 ^ A 2(u e
- v)
1eai(u -v) +
7
31 ^') eA2 (u- v) + (2^ + 3v2 ea'(u "
1
7
e\2(u
-v)
CHAPTER VIII. MATRIX-ANALYTIC METHODS
250
From (7.7) we see that we can write pi (u) = v(u)V2 where V2 depends only on v, and one gets 12e5" - 2 35e6v - 24e5v - 1 V2 = 4e5"+6 35e6v - 24es" - 1 Thus, pi (u) = p12(u)/p1 l(u) where p1i(u)
35e6v - 24es" - 1,
p12(u)
) e sv + ( 2v/2- + it (3 4'I 1 ea2(u-v
e1\2(u-")
7
+
(
32 +4,/-2-) ea 1(u - v)esv + 7
4_
2,,/2-
ea1(u-")
.
21 3
In particular, 192esv + 8 P1 - 21(35e6v - 24e5v - 1)' ?,b(v) =
192esv +8 35e6v + 168esv + 7*
Thus all terms involved in the formulae for the ruin probability have been ex❑ plicitly derived. Notes and references [30].
The analysis and the example are from Asmussen & Bladt
Chapter IX
Ruin probabilities in the presence of heavy tails 1 Subexponential distributions We are concerned with distributions B with a heavy right tail B(x) = 1- B(x). A rough distinction between light and heavy tails is that the m.g.f. B[s] = f e8x B(dx) is finite for some s > 0 in the light-tailed case and infinite for all s > 0 in the heavy-tailed case. For example, the exponential change of measure techniques discussed in II.4, III.4-6 and at numerous later occasions require a light tail. Some main cases where this light-tail criterion are violated are (a) distributions with a regularly varying tail, B(x) = L(x)/x" where a > 0 and L(x) is slowly varying, L(tx)/L(x) -4 1, x -4 oo, for all t > 0; (b) the lognormal distribution (the distribution of eu where U - N(µ, a2)) with density 1 e-(logy-Fh) 2/2az .
x 2iror2 (c) the Weibull distribution with decreasing failure rate , B(x) = e-x0 with 0<0<1. For further examples, see I.2b. The definition b[s] = oo for all s > 0 of heavy tails is too general to allow for a general non-trivial results on ruin probabilities, and instead we shall work within the class S of subexponential distributions . For the definition , we require that B is concentrated on (0, oo ) and say then that B is subexponential (B E S) if 251
252 CHAPTER IX. HEAVY TAILS
B*2\ 2, B(x)
Here B*2 is the convolution square, that is, the distribution of independent r.v.'s X1, X2 with distribution B. In terms of r.v.'s, (1.1) then means P(X1 +X2 > x) 2P(Xi > x). To capture the intuition behind this definition, note first the following fact: Proposition 1.1 Let B be any distribution on (0, oo). Then: (a) P(max(Xi, X2) > x) ^' 2B(x), x -3 00.
(b) liminf
BB(()
) > 2.
Proof By the inclusion-exclusion formula, P(max(Xi, X2) > x) is P(X1 > x) + P(X2 > x) - F(X1 > x, X2 > x) = 2B(x) - B(x)2 - 2B(x), proving (a). Since B is concentrated on (0, oo), we have {max(Xi, X2) > x} C {X1 + X2 > x}, and thus the lim inf in (b) is at least lim inf P(max(Xi, X2) > ❑ x)/B(x) = 2. The proof shows that the condition for B E S is that the probability of the set {X1 + X2 > x} is asymptotically the same as the probability of its subset {max(Xi, X2) > x}. That is, in the subexponential case the only way X1 + X2 can get large is by one of the Xi becoming large. We later show: Proposition 1.2 If B E S, then P(X1>xI X1+X2>x)--* 2, P(Xi
x x where U is uniform on (0, 1). Thus , if X1 + X2 is large , then (with high proba❑ bility) so are both of X1, X2 but none of them exceeds x.
253
1. SUBEXPONENTIAL DISTRIBUTIONS Here is the simplest example of subexponentiality: Proposition 1.4 Any B with a regularly varying tail is subexponential.
Proof Assume B(x) = L(x)/xa with L slowly varying and a > 0. Let 0 < 5 < 1/2. If X1 + X2 > x, then either one of the Xi exceeds (1 - S)x, or they both exceed Sx. Hence lim sup a--+oo
B*2(x) 2B((1 - S)x + B(Sx)2 < lim sup x-aoo
B(x)
lim sup 2L((1 x-^oo
B(x)
- 6)x)/((1 - 5)x)' + 0 _ 2 L(x)l xa (1-6)-
Letting S 10, we get limsupB*2(x)/B(x) < 2, and combining with Proposition ❑ 1.1(b) we get B*2(x)/B(x) -* 2. We now turn to the mathematical theory of subexponential distributions. Proposition 1.5 If B E S, then B(B(x)y) -* 1 uniformly in y E [0, yo] as X -+ 00. [In terms of r.v.'s: if X - B E S, then the overshoot X - xIX > x converges in distribution tooo. This follows since the probability of the overshoot to exceed y is B (x + y)/B(x ) which has limit 1.] Proof Consider first a fixed y. Using the identity + 1)(x) 1+ 2 1 - B*n(x - z B(x) - B*(n ) B(dz) (1.2) B(x) B(x ) B(x) Jo
B*(n+1)(x) = 1+
with n = 1 and splitting the integral into two corresponding to the intervals [0, y] and (y, x], we get
BZ(x)) > 1 + B(y) + B(B(-)y) (B(x) - B(y)) . If lim sup B(x - y)/B(x) > 1, we therefore get lim sup B*2(x)/B(x) > 1+B(y)+ 1 - B(y) = 2, a contradiction. Finally lim inf B(x - y)/B(x) > 1 since y > 0. The uniformity now follows from what has been shown for y = yo and the obvious inequality
1 < B(x ) Y) B( < B( x) 0), B(
y E [0,yo].
0
254 CHAPTER IX. HEAVY TAILS Corollary 1.6 If B E 8, then e"R(x) -* oo,
b[c] = oo for all e > 0.
Proof For 0 < 5 < e, we have by Proposition 1.5 that B(n) > e-6B(n - 1) for all large n so that B(n) > cle-6n for all n. This implies B(x) > c2e-5x for all x, and this immediately yields the desired conclusions. 0 Proof of Proposition 1.2. P(X1 > xIX1 + X2 > x)
_ P(Xi > x) _ B(x) 1 P(X1 + X2 > x) B2(x) 2 1
y
P(X1
2
using Proposition 1.5 and dominated convergence. O The following result is extremely important and is often taken as definition of the class S; its intuitive content is the same as discussed in the case n = 2 above. Proposition 1.7 If B E S, then for any n B*n(x)/B(x) -* n, x
oo.
Proof We use induction. The case n = 2 is just the definition, so assume the proposition has been shown for n. Given e > 0, choose y such that IB*n(x)/B(x) - nI < e for x > y. Then by (1.2), B*(n+1) (x
I x-y + Jxx y) W-(x - z ) ) = 1 + (^ B(x - z) B(dz). B(x) \Jo _ B(x - z) B(x)
Here the second integral can be bounded by B*n(y) B(x) - B(x - y) sup v>o B(v) B(x) which converges to 0 by Proposition 1.5 and the induction hypothesis. The first integral is y B(x - z) B(dz) (n + O(e)) ^x JO B(x) B (x) - B*2 (x) -
(n + 0(0) I
B(x)
(x - z) B(dz) _yBB(x) 111 Lx
255
1. SUBEXPONENTIAL DISTRIBUTIONS
Here the first term in {•} converges to 1 (by the definition of B E S) and the second to 0 since it is bounded by (B(x) - B(x - y))/B(x). Combining these estimates and letting a 4.0 completes the proof. 0 Lemma 1.8 If B E S, e > 0, then there exists a constant K = KE such that B*n(x) < K(1 + e)nB(x) for all n and x. Proof Define 5 > 0 by (1+5)2 = 1+e, choose T such that (B(x)-B*2(x))/B(x) < 1 + b for x > T and let A = 1/B(T), an = supx>o B*n(x)/B(x). Then by (1.2), an+1
fX B*n( *n(x - z) B(x - z) B(dz) x - z) B(dz ) + sup < 1 + sup - z) B(x) f x
=
(1
+
A)/e.
0
Proposition 1.9 Let A1, A2 be distributions on (0, oo) such that Ai (x) _ aiB(x) for some B E S and some constants al, a2 with a1 + a2 > 0. Then Al * A2 (x) - (al + a2)B(x).
Proof Let X1, X2 be independent r.v.'s such that Xi has distribution Ai. Then Al * A2(x) = P(X1 + X2 > x). For any fixed v, Proposition 1.5 easily yields P(X1 + X2 > x, Xi <=v) f (x - y)Ai(dy) v Ai o - ajB(x)Ai(v) = ajB( x)(1+o„(1))
(j = 3 - i). Since P(X1+X2 > x,X1 > x-v,X2 > x-v) < A1(x-v)A2(x -v)
- ala2B(x)2
which can be neglected, it follows that it is necessary and sufficient for the assertion to be true that JX_VA
(x - y)Ai(dy) = (x)o(1)
(1.3)
Using the necessity part in the case Al = A2 = B yields
f x-v B(x - y)B(dy) = B(x)ov (1)• v
(1.4)
256
CHAPTER LX. HEAVY TAILS
Now (1.3) follows if 'V-V B(x - y)Ai(dy) = B(x)o„(1).
(1.5)
f" By a change of variables, the l.h.s. of (1.5) becomes x B(x - v)Ai(v) - Ai(x - v)B(v) + -_'U Aq(x - y)B(dy). V
Here approximately the last term is B(x)o„(1) by ( 1.4), whereas the two first yield B(x)(Ai(v) - aiB(v)) = B(x)o„(1). ❑ Corollary 1.10 The class S is closed under tail-equivalence. That is, if q(x) aB(x) for some B E S and some constant a > 0, then A E S. Proof Taking Al = A2 = A, a1 = a2 = a yields A*2(x) - 2aB(x) - 2A(x).
❑
Corollary 1.11 Let B E S and let A be any distribution with a ligther tail, A(x) = o(B(x)). Then A * B E S and A * B(x) - B(x) Proof Take Al = A, A2 = B so that a1 = 0, a2 = 1.
❑
It is tempting to conjecture that S is closed under convolution. That is, it should hold that B1 * B2 E S and B1 * B2 (x) - Bl (x) + B2 (x) when B1, B2 E S. However, B1 * B2 E S does not hold in full generality (but once B1 * B2 E S has been shown, B1 * B2 (x) - Bl (x) + B2 (x) follows precisely as in the proof of Proposition 1.9). In the regularly varying case, it is easy to see that if L1, L2 are slowly varying, then so is L = L1 + L2. Hence Corollary 1.12 Assume that Bi(x) = Li(x)lxa, i = 1,2, with a > 0 and L1, L2 slowly varying. Then L = L1 + L2 is slowly varying and B1 * B2(x) sim L(x)/x«. We next give a classical sufficient (and close to necessary) condition for subexponentiality due to Pitman [290]. Recall that the failure rate A(x) of a distribution B with density b is A(x) = b(x)/B(x) Proposition 1.13 Let B have density b and failure rate A(x) such that .(x) is decreasing for x > x0 with limit 0 at oo. Then B E S provided
fo
"O
exA(x) b(x)
dx < oo.
257
1. SUBEXPONENTIAL DISTRIBUTIONS
Proof We may assume that A(x) is everywhere decreasing (otherwise, replace B by a tail equivalent distribution with a failure rate which is everywhere decreasing). Define A(x) = fo .(y) dy. Then B(x) = e-A(x). By (1.2), B*2(x) - 1 B(x) eA( x)-A(x-v )-A(y)A(y) dy f B(x - y ) b(y)dy = B (x) o ox _
J
= ox/2 eA( x)-A(x-y )- A(y)\(y)
Jo
dy + fox/ 2 eA(x
)- A(x-y)-A ( y).(x - y) dy.
0
For y < x/2, A(x) - A(x - y) < yA(x - y) y\(y)• The rightmost bound shows that the integrand in the first integral is bounded by ey"(v)- A(y)a(y ) = ev'(y) b(y), an integrable function by assumption. The middle bound shows that it converges to b(y) for any fixed y since \ (x - y) -* 0. Thus by dominated convergence , the first integral has limit 1 . Since ) (x - y) < A (y) for y < x/2, we can use the same domination for the second integral but now the integrand has limit 0 . Thus B*2(x )/ B(x) - 1 has limit 1 + 0, proving B E S. Example 1.14 Consider the DFR Weibull case B(x) = e-x0 with 0 <,3 < 1. Then b(x) = Ox0-le-xp, a(x) = ax0-1. Thus A(x) is everywhere decreasing, and exa(x)b(x) = (3x0-1e-(1-0)x9 is integrable. Thus, the DFR Weibull distri❑ bution is subexponential. Example 1.15 In the lognormal distribution, x - e-009x-v)2/2a2/(x 2irv2) logx ( ) 't (-(logx - U) /or)
v 2x
This yields easily that ex,`(x)b(x) is integrable. Further, elementary but tedious calculations (which we omit) show that A(x) is ultimately decreasing. Thus, the ❑ lognormal distribution is subexponential. In the regularly varying case, subexponentiality has alrady been proved in Corollary 1.12. To illustrate how Proposition 1.13 works in this setting, we first quote Karamata's theorem (Bingham, Goldie & Teugels [66]): Proposition 1.16 For L(x) slowly varying and a > 1, f ' L(y) dy ,,, L(x) y° (a - 1)xcl-1
258
CHAPTER IX. HEAVY TAILS
From this we get Proposition 1.17 If B has a density of the form b(x) = aL(x)/x°+1 with L(x) slowly varying and a > 1, then B(x) - L(x)/x" and )t(x) - a/x. Thus exa(x)b(x) - ea b(x) is integrable. However, the monotonicity condition in Proposition 1.13 may present a problem in some cases so that the direct proof in Proposition 1.4 is necessary in full generality. We conclude with a property of subexponential distributions which is often extremely important: under some mild smoothness assumptions, the overshoot properly normalized has a limit which is Pareto if B is regularly varying and exponential for distributions like the lognormal or Weibull. More precisely, let X W = X - xjX > x, 'y(x) = EXix>. Then: Proposition 1.18 (a) If B has a density of the form b(x) = aL(x)/xa with L(x) slowly varying and a > 1, then 7(x) x/(a - 1) and
P(X (,)/-Y(x) > y) (1 + y/(a - 1))^ ' (b) Assume that for any yo )t(x + y/A(x)) 1 A(x) uniformly for y E (0, yo] . Then 7(x) - 1/A(x) and P(X ixil'Y (x) > y) -* e-'; (c) Under the assumptions of either ( a) or (b), f O B(y) dy - y(x)B(x). Proof ( a): Using Karamata's theorem, we get
EX(x) - E(X - x)+ _ 1 °° P(X > x) P(X>x )J L
PX >y)dy
1 x L(y)/y-dy L(x)/((a1)x'-1) x ( )l ° J °° ()l a x
a-1 Further P ((a - 1)X(x)/x > y) = P(X > x[1 + y/(a - 1)] I X > x) L(x[1 + y/(a - 1)]) xa L(x) (x[1 + y/(a - 1)])a 1 1 . (1 + y/(a - 1))a .
(1.6)
259
2. THE COMPOUND POISSON MODEL
We omit the proof of (c) and that EX (x) - 1/.(x). The remaining statement (1.8) in (b) then follows from
P (A(x)X (x) > y)
= F(X > x + y/.A(x) I X > x) = exp {A(x) - A(x + y/A(x))}
fY
ex p = = + x) dx ex P - f yl a(x) a(x 0 0
A( x + u /A( x))
}
a(x) du
= exp {-y (1 + 0(1))} 0
The property (1.7) is referred to as 1/A(x) being self-neglecting. It is trivially verified to hold for the Weibull- and lognormal distributions , cf. Examples 1.14, 1.15. Notes and references A good general reference for subexponential distribution is Embrechts, Kliippelberg & Mikosch [134].
2 The compound Poisson model Consider the compound Poisson model with arrival intensity /3 and claim size distribution B. Let St = Ei ` Ui - t be the claim surplus at time t and M = sups>0 St, r(u) = inf it > 0; St > u}. We assume p = /3µB < 1 and are interested in the ruin probability V)(u) = P(M > u) = P(r(u) < oo). Recall that B0 denotes the stationary excess distribution, Bo(x) = f0 B(y) dy / µB. Theorem 2 .1 If Bo E S, then Vi(u) P Bo(u). P The proof is based upon the following lemma (stated slightly more generally than needed at present). Lemma 2.2 Let Y1, Y2, ... be i. i. d. with common distribution G E S and let K be an independent integer-valued r.v. with EzK < oo for some z > 1. Then P(Y1 + • • • + YK > u) - EK G(u). nG(u), u -a oo, and that for each Proof Recall from Section 1 that G*n (u) z > 1 there is a D < oo such that G*n(u) < G(u)Dzn for all u. We get p(yl+...+YK> = n)G* n(u ) -- n-0 = ^•P(K 1 •P(K= n)•n = EK, 0 G(u) L G(u) -u)nn-.
CHAPTER IX. HEAVY TAILS
260
❑
using dominated convergence with >2 P(K = n) Dz" as majorant.
Proof of Theorem 2.1. The Pollaczeck-Khinchine formula states that (in the set-up of Lemma 2.2) M = Yl + • • • +YK where the Yt have distribution Bo and K is geometric with parameter p, P(K = k) = (1- p)p'. Since EK = p/(1- p) and EzK < oo whenever pz < 1, the result follows immediately from Lemma 2.2. ❑ The condition Bo E S is for all practical purposes equivalent to B E S. However, mathematically one must note that there exist (quite intricate) examples where B E S, Bo ¢ S, as well as examples where B ¢ S, Bo E S. The tail of Bo is easily expressed in terms of the tail of B and the function y(x) in Proposition 1.18, _
B(x^sx Bo(x) µ8 I aoB(y
)dy =
(x).
(2.1)
(^) - ?(xµ 8
In particular , in our three main examples (regular variation , lognormal , Weibull) one has
(
( ) B(x) - x^
lox - µ J
B(x) _ f or
B(x) = e-x'
Bo(x
) - µB(01 - 1)xa-1' vxe-(109x-11)2/202 2 +° /2 µB = eµ Bo(x) eµ+O2/2(log x)2 27r' = µB = F(1/0 )
Bo(x
1
xl-Qe-xp
) ,., r(1/Q)
From this , Bo E S is immediate in the regularly varying case, and for the lognormal and Weibull cases it can be verified using Pitman 's criterion (Proposition 1.13). Note that in these examples , Bo is more heavy-tailed than B . In general: Proposition 2.3 If B E S, then Bo(x)/B(x) -+ 00, x -4 00. Proof Since B(x + y)/B(x) -* 1 uniformly in y E [0, a], we have x+a
fx B(y)dy = a B0 (x) > lim inf lim inf x-+oo B(x) - x-400 PBB(x) PB Leta-+oo.
❑
Notes and references Theorem 2.1 is essentially due to von Bahr [56], Borovkov [73] and Pakes [280]. See also Embrechts & Veraverbeeke [136].
The approximation in Theorem 2.1 is notoriously not very accurate. The problem is a very slow rate of convergence as u -^ oo. For some numerical studies, see Abate,
261
3. THE RENEWAL MODEL
Choudhury & Whitt [1]. Kalashnikov [219] and Asmussen & Binswanger [27]. E.g., in [219] p. 195 there are numerical examples where tp(u) is of order 10-5 but Theorem 2.1 gives 10-10. This shows that even the approximation is asymptotically correct in the tail, one may have to go out to values of 1/'(u) which are unrealistically small before the fit is reasonable. In [1], also a second order term is introduced but unfortunately it does not present a great improvement. Somewhat related work is in Omey & Willekens [278], [279]. Based upon ideas of Hogan [200], Asmussen & Binswanger [27] suggested an approximation which is substantially better than Theorem 2.1 when u is small or moderately large.
3 The renewal model We consider the renewal model with claim size distribution B and interarrival distribution A as in Chapter V. Let U= be the ith claim , T1 the ith interarrival time and Xi = U; - Ti,
Snd) = Xl +... + Xn, M =
sup s$ , t9(u) = inf {n : Snd> > u} . {n= 0,1,...}
Then ik(u) = F ( M > u) = P(i9 (u) < oo). We assume positive safety loading, i.e. p = iB /µA < 1. The main result is: Theorem 3 . 1 Assume that (a) the stationary excess distribution Bo of B is subexponential and that (b) B itself satisfies B(x - y)/B (x) -> 1 uniformly on compact y -internals. Then l/i(u) 1 P
Bo(u)
u -+ 00. (3.1)
P [Note that (b) in particular holds if B E S.] The proof is based upon the observation that also in the renewal setting, there is a representation of M similar to the Pollaczeck-Khinchine formula. To
this end , let t9+ = i9(0) be the first ascending ladder epoch of
{Snd>
},
G+ (A) = P(Sq+ E A,,9+ < oo) = P(S,+ E A, T+ < oo) where r+ = T1 + • • • + T,y + as usual denotes the first ascending ladder epoch of the continuous time claim surplus process {St}. Thus G+ is the ascending ladder height distribution (which is defective because of PB < PA). Define further 0 = IIG+II = P(r9+ < oo). Then K
M=EY, i=1
CHAPTER IX. HEAVY TAILS
262
where K is geometric with parameter 9, P(K = k) = (1 - 9)9'' and Y1,Y2,... are independent of K and i.i.d. with distribution G+/9 (the distribution of S,y+ given r+ < oo). As for the compound Poisson model, this representation will be our basic vehicle to derive tail asymptotics of M but we face the added difficulties that neither the constant 9 nor the distribution of the Yi are explicit.
Let F denote the distribution of the Xi and F1 the integrated tail, FI (x) _ fz ° F(y) dy, x > 0. Lemma 3 .2 F(x) - B(x), x -* oo, and hence FI(x) - PBBo(x). Proof By dominated convergence and (b),
B(x) _ J
O°
B(B(x)y) A(dy) f
1 . A(dy) = 1.
0 The lemma implies that (3.1) is equivalent to P(M > u) " -- FI(u),
u -a 00, (3.3)
and we will prove it in this form (in the next Section, we will use the fact that the proof of (3.1) holds for a general random walk satisfying the analogues of (a), (b) and does not rely on the structure Xi = Ui - Ti). Write G+( x) = G+ ( x, oo) = F(S,g+ > x, d+ < oo). Let further 19_ _ inf {n > 0: S^d^ <
0}
be the first descending ladder epoch, G_(A) = P(S,y_ E
A) the descending ladder height distribution (IIG -II = 1 because of PB < P A) and let PG_ be the mean of G_. Lemma 3 .3 G+ (x) - FI(x) /IPG_I, x -+
oo.
Proof Let R+(A) = E E'+ -' I(S,(,d)) E A) denote the pre-19+ occupation measure and let and U_ = Eo G'_" be the renewal measure corresponding to G_. Then 0 0 F( x - y) R+(dy ) _ j (x_y)U_(dY) G+ (x) =
J
00
00
(the first identity is obvious and the second follows since an easy time reversion argument shows that R+ = U_, cf. A.2). The heuristics is now that because of (b), the contribution from the interval (- N, 0] to the integral is O(F(x)) = o(FI(x)), whereas for large y , U_ (dy) is close to Lebesgue measure on (- oo, 0] normalized by IPG_ I so that we should have to G+(x) - 1
IPG_ I
/
F(x - y) dy = 1 Pi (X) oo
IPG_ I
263
3. THE RENEWAL MODEL
We now make this precise. If G_ is non-lattice, then by Blackwell 's renewal theorem U_ (-n - 1, -n] -+ 1/I µG_ I. In the lattice case, we can assume that the span is 1 and then the same conclusion holds since then U-(-n - 1, -n] is just the probability of a renewal at n. Given e, choose N such that F(n - 1)/F(n) < 1 + e for n > N (this is possible by (b) and Lemma 3.2), and that U_(-n - 1, -n] < (1 + e)/1µc_ I for n > N. We then get lim sup G+(x) x-ro0 Fj(x) o F(x - y) U- (dy)
< lim sup
fN FI ( x)
X---)00
+ lim sup Z-Y00
N F(x - y) U_ (dy) 00
FI (x)
< lim sup F(x) U-(-N, 0] x-+00 FI(x)
00 + lim up 1 x) E F(x + n) U_ (-n - 1, -n] F1 ( n=N _1 1+e E F(x+n) 0 + limsup x-r00 FI(x) FAG- I n=N
E)2 r00 F(x + y) dy + e) lim sup - 1 I ,UG_ I x-,oo Fj(x) N
(1
J
(1 +6)2 I {IC_
I
lim sup X-400
FI(x + N) _ (1 + e)z (x) I Pi µ G_ I
Here in the third step we used that (b) implies B(x)/Bo(x) -+ 0 and hence F(x)/FI(x) -4 0, and in the last that FI is asymptotically proportional to Bo E S. Similarly, > (1 - e) z lim inf G+(x) -
FI (x)
Ip G_ I
❑
Letting a 10, the proof is complete.
Proof of Theorem 3.1. By Lemma 3.3, F(Y= > x) FI(x)/(OIp _ 1). Hence using dominated convergence precisely as for the compound Poisson model, (3.2) yields 00 F F I (u) P(M > u) _ E(1 - 0)0k k I(u) A;=1
BIp G_ I (1- 9)IpG_ I
Differentiating the Wiener-Hopf factorization identity (A.9) 1 - F[s] = (1 - O-[s])(1 - G+[s])
264
CHAPTER IX. HEAVY TAILS
and letting s = 0 yields
-µF = -(1 - 1)6+[0] - (1 - IIG+II)µc_ = -(1 - 0)ua_ . Therefore by Lemma 3.2, FJ(u) UBBO(U) PBo(u) N
(1-0)Ipc_I
=
JUA - AB i-P
We conclude by a lemma needed in the next section: Lemma 3 .4 For any a < oo, P(M > u, S+9(u) - Se(u)_1 < a) = o(Fj(u)). Proof Let w(u) = inf {n : Sid) E (u - a, u), Mn < u}. Then P(M E (u - a, u)) > P(w(u) < oo)(i -lp (0))• On the other hand, on the set {M > u, Sty(u) - Sty(u)_I < a} we have w(u) < oo, and {Su,(u)+n - SS(u)}n=o,l,... must attain a maximum > 0 so that P(M > u, S+q(u) - So( u)_1 < a) < P (w(u) < oo)j/i(0) < 0(0) P(M E (u - a, u)). 1-0(0) But since P(M > u - a) N P(M > u), we have P(M E (u - a,u)) = o(P (M > u)) = o(FI(u)).
Notes and references Theorem 3.1 is due to Embrechts & Veraverbeke [136], with roots in von Bahr [56] and Pakes [280].
Note that substantially sharper statements than Lemma 3.4 on the joint distribution of (S,yiui_1,So(u)) are available, see Asmussen & Kliippelberg [36].
4 Models with dependent input We now generalize one step further and consider risk processes with dependent interclaim times, allowing also for possible dependence between the arrival process and the claim sizes. In view of the `one large claim' heuristics it seems reasonable to expect that similar results as for the compound Poisson and renewal models should hold in great generality even when allowing for such dependence.
4. MODELS WITH DEPENDENT INPUT
265
Various criteria for this to be true were recently given by Asmussen, Schmidli & Schmidt [47]. We give here one of them, Theorem 4.1 based upon a regenerative assumption, and apply it to the Markov-modulated model of Chapter VI. For further approaches, examples and counterexamples, see [47]. Assume that the claim surplus process {St}t>o has a regenerative structure in the sense that there exists a renewal process Xo = 0 < Xl <- X2 < ... such that
{SXo+t
-
SXo}0
- Sxi}0
(viewed as random elements of the space of D-functions with finite lifelengths) are i.i.d. and the distribution of {Sxk+t - Sxk}o
Figure 4.1 Note that no specific sample path structure of {St} (like in Fig. 4.1) is assumed. We return to this point in Example 4.4 below. Define
S.
= Sx, +1,
M,1
=
max
S,
M* =
max
S;,,
M = sup St.
k=0,...,n n=0,1,... t>0
The idea is now to observe that in the zero-delayed case, {Sn}n=o,1,... (corresponding to the filled circles on Fig. 4.1 except for the first one) is a random walk. Thus the assumption ,F*(X) = P0(Si > x) ,., G(x)
(4.1)
CHAPTER IX. HEAVY TAILS
266
for some G such that both G E S and Go E S makes (3.3) applicable so that F(M* > u) 141 F*(u), u -p 00. (4.2) Imposing suitable conditions on the behaviour of {St} within a cycle will then ensure that M and M* are sufficiently close to be tail equivalent. The one we focus on is Fo (Mix) > x) ,,, Fo(Si > X), (4.3) where
Mnx) = sup
Sxn +t - Sxn =
o
sup Sxn+t - S.* -i o
Since clearly M(x) > Sl , the assumption means that Mix) and Sl are not too far away. See Fig. 4.2.
--------------N N Xi=0
N Figure 4.2 Theorem 4.1 Assume that (4.1) and (4.3) hold. Then '00 (u) = Fo(M > u) ,,, jF11 F* (U). Proof Since M > M*, it suffices by (4.2) to show F(M* > u) > 1. (4.4) liminf u->oo F(M > u)
4. MODELS WITH DEPENDENT INPUT
267
Define 79* (u) = inf {n = 1 , 2, ...: Sn > u} , /3(u)
= inf{n=1,2,...: S;,+Mn+1>u}
(note that {M> u} = {3(u) < oo}). Let a > 0 be fixed. We shall use the estimate
Po(M > u) Miu^+ 1 < a) = o(Po (M > u)) (4.5) which follows since Po (M > u, MW O(u)+1 < a) IN
(
U
A1; E (u - a, u)}
n=1
< P(M* E (u - a, u))/P(M* = 0) = o(Po(M* > u)). Given e > 0, choose a such that Po(Si > x ) > (1 - e)Po (MMX> > x), x > a. Then by Lemma 3.4, Po(M* > u)
- Po (M* > u, S;(u) - S;( u)-1 > a) 00 1: Po(MnaV(u-Sn*)) n=1
00 > (1-E)EPo(MnaV(u
-S;,
))
n=1 00
> (1 - E) Po ( n
max St u, Mn+l > a V (u - Sn 0
( 1 - e)Po (M > u, M^xu)+l > a) - (1 - e)Po (M > U). Letting first u -+ oo and next e . 0 yields (4.4).
❑
Under suitable conditions , Theorem 4.1 can be rewritten as (4.6)
00 (U)
1 p pBo(u)
where B is the Palm distribution of claims and p - 1 = limti00 St/t. To this end, assume the path structure Nt
St = EUi-t+Zt i=1
CHAPTER IX. HEAVY TAILS
268 with {Zt} continuous, independent of {>
N` U;} and satisfying Zt/t
a4'
0. Then
the Palm distribution of claims is N,
B(x) = E N Eo
I( U1 < x) .
x
0
(4.8)
i=1
Write ,Q = EoNx/EoX. Corollary 4.2 Assume that {St} is regenerative and satisfies (4.7). Assume further that (i) both B and Bo are subexponential; (ii) EozNX < oo for some z > 1; (iii) For some o -field Y, X and N. are F-measurable and NX
Po
J:U=>x i=1
(iv) Po
o(B(x))
sup Zt > x / (0:5t<x
Then (4.6) holds with p = ,3PB. Proof It is easily seen that the r.v.'s
FNX U; and ENX Ui - X both have tails of
order EoNx • B(x), cf. the proof of Lemma 4.6 below. The same is true for Sl, since the tail of Zx is lighter than B(x) by (iv), and also for Mix) since Nx
Mix) < > UE + i=1
sup Zt.
o
Thus Theorem 4.1 is in force, and the rest is just rewriting of constants: since p = 1+tlim St = 1+ . oX (see Proposition A1.4), we get 00 (u)
1
J
Po(Sl > x) dx
IPF- I u
1 EoNxB(x) dx EoX(1 - p) Ju P Bo(u) 1-p
0
4. MODELS WITH DEPENDENT INPUT
269
Example 4 .3 As a first quick application, consider the periodic model of VI.6 with arrival rate /3(t) at time t (periodic with period 1) and claims with distribution B (independent of the time at which they arrive). Assume that B E S, Bo E S, i.e. (i) holds. The regenerative assumption is satisfied if we take Xo = Xi = 0, X2 = 1, X3 = 2,.. ., Zt - 0 (thus (iv) is trivial). The number N. of claims arriving in [0, 1) is Poisson with rate /3 = fo /3(s) ds so that (ii) holds, and taking F = o,(NX), (iii) is obvious. Thus we conclude that (4.6) ❑ holds. Example 4 .4 Assume that St = Zt - t + EN'I Ui where {>N`1 Ui - t} is standard compound Poisson and {Zt} an independent Brownian motion with mean zero and variance constant a2. Again , we assume that B E S, Bo E S; then (iv) holds since the distribution of supo
x-+oo
B2(x)
G(x)
= ci
for some distribution G such that both G and the integrated tail fx°O G(y) dy are subexponential , and for some constants ci < oo such that cl + • • • + c, > 0. The average arrival rate /3 and the Palm distribution B of the claim sizes are given by P
Q
ir i/i,
=
P
B = >2 7riaiBi
i=1
and we assume p = 01-4 B =
Ep
ri/3ipB;
i=1
< 1.
Theorem 4.5 Consider the Markov-modulated risk model with claim size distributions satisfying (4.9). Then (4.6) holds.
The key step of the proof is the following lemma.
CHAPTER IX. HEAVY TAILS
270
Lemma 4 . 6 Let (N1, ... , NP ) be a random vector in {0, 1, 2 , ...}P, X > 0 a r.v. and F a a-algebra such that (N1, ... , NP ) and X are .F-measurable. Let {Fi}t=1 P be a family of distributions on [0, oo) and define p Ni
Yx = EEX'i - X i=1 j=1
where conditionally upon F the Xi, are independent with distribution Fi for Xij. Assume EzN-1+"'+Np < oo for some z > 1 and all i, and that for some + cp distribution G on [0, oo) such that G E S and some c1, ... , cp with cl + > 0 it holds that Fi(x) - ciG(x). Then P
P(Yx > x) - c'(x)
where c = ciENi . i=1
Proof Consider first the case X = 0. It follows by a slight extension of results from Section 1 that P
P(Yo > x I Y) G( x) ci Ni,
P(Yo > x
I
+Np
^ ) < CG(x)zN1+
i =1
for some C = C(z) < oo. Thus dominated convergence yields
( P(Yo>x P(Yo>x .^•) G(x)
= E\
G(x)
P -^ E ciNi = C. i-1
In the general case, as x -a oo, P
P(YX
> x I.F) = P(Yo > X+x I •^)
G (x +x)>2ciNi i=1
P
- G( x )
> ciNi ,
i=1
and
P(Yx > x ^) < P(Y0 > x I.F) < CG(x)zn'1+,"+Np . The same dominated convergence argument completes the proof.
❑
Proof of Theorem 4.5. If Jo = i, we can define the regenerations points as the times of returns to i, and the rest of the argument is then just as the proof of Corollary 4.2. An easy conditioning argument then yields the result when Jo is ❑ random. For light-tailed distributions, Markov-modulation typically decreases the adjustment coefficient -y and thereby changes the order of magnitude of the ruin
5. FINITE-HORIZON RUIN PROBABILITIES
271
probabilities for large u, cf. VI.4. It follows from Theorem 4.5 that the effect of Markov-modulation is in some sense less dramatical for heavy-tailed distributions: the order of magnitude of the ruin probabilities remains ft°° B(x) dx. Within the class of risk processes in a Markovian environment, Theorem 4.5 shows that basically only the tail dominant claim size distributions (those with c, > 0) matter for determining the order of magnitude of the ruin probabilities in the heavy-tailed case. In contrast, for light-tailed distributions the value of the adjustment coefficient -y is given by a delicate interaction between all B. Notes and references Theorem 4. 5 was first proved by Asmussen, Floe Henriksen & Kliippelberg [31] by a lengthy argument which did not provide the constant in front of Bo(u) in final form. An improvement was given in Asmussen & Hojgaard [33], and the final reduction by Jelenkovic & Lazar [213]. The present approach via Theorem 4.1 is from Asmussen, Schmidli & Schmidt [47]. That paper also contains further criteria for regenerative input (in particular also a treatment of the delayed case which we have omitted here), as well as a condition for (4.6) to hold in a situation where the inter-claim times (T1,T2.... ) form a general stationary sequence and the U; i.i.d. and independent of (T1,T2.... ); this is applied for example to risk processes with Poisson cluster arrivals. For further studies of perturbations like in Corollary 4.2 and Example 4.4, see Schlegel [316].
5 Finite-horizon ruin probabilities We consider the compound Poisson model with p = /3pB < 1 and the stationary excess distribution Bo subexponential. Then O(u) - pl(1 - p)Bo(u), cf. Theorem 2.1. As usual, r(u) is the time of ruin and as in IV.7, we let PN"N = P(. I T(u) < oo). The main result of this section, Theorem 5.4, states that under mild additional conditions, there exist constants -Y(u) such that the F(u)distribution of r(u)/y(u) has a limit which is either Pareto (when B is regularly varying) or exponential (for B's such as the lognormal or DFR Weibull); this should be compared with the normal limit for the light-tailed case, cf. IV.4. Combined with the approximation for O(u), this then easily yields approximations for the finite horizon ruin probabilities (Corollary 5.7). We start by reviewing some general facts which are fundamental for the analysis. Essentially, the discussion provides an alternative point of view to some results in Chapter IV, in particular Proposition 2.3.
5a Excursion theory for Markov processes Let until further notice {St} be an arbitrary Markov process with state space E (we write Px when So = x) and m a stationary measure, i.e. m is a (or-finite)
272 CHAPTER IX. HEAVY TAILS measure on E such that
L for all measurable A C E and all t > 0. Then there is a Markov process {Rt} on E such that
fE m(dx)h(x)Exk(Rt) = Lm(dy)k(y)Eyh(St)
(5.2)
for all bounded measurable functions h, k on E; in the terminology of general Markov process theory, {St} and {Rt} are in classical duality w. r. t. m. The simplest example is a discrete time discrete state space chain, where we can take h, k as indicator functions, for states i, j, say, and (5.2) with t = 1 means m;rij = mjsji where r13,s=j are the transition probabilities for {St}, resp. {Rt}. Thus, a familiar case is time reversion (here m is the stationary distribution); but the example of relevance for us is the following: Proposition 5.1 A compound Poisson risk process {Rt} and its associated claim surplus process {St} are in classical duality w .r.t. Lebesgue measure. Proof Starting from Ro = x, Rt is distributed as x + t - >N` Ui, and starting from So = y, St is distributed as y - t + EI U; (note that we allow x, y to vary in, the whole of R and not as usual impose the restrictions x > 0, y = 0). Let G denote the distribution of ENt U, - t. Then (5.2) means
ffh(a,)k(x - z) dx G(dz) = ffh(y + z) k(y)dy G(dz). The equality of the l.h.s. to the r.h.s. follows by the substitution y = x - z.
❑
For F C E, an excursion in F starting from x E F is the (typically finite) piece of sample path' {St}o
where w(Fc) = inf It > 0: St 0 F} .
We let QS be the corresponding distribution and Qx,y = Qx (. Sw(F.)- = y, w(Fc) < oo ) 'In general Markov process theory, a main difficulty is to make sense to such excursions also when Px(w(F°) = 0) = 1. Say {St} is reflected Brownian motion on [0,00), x = 0+ and F = (0, oo). For the present purposes it suffices , however , to consider only the case Px(w(F`) = 0) 0.
5. FINITE-HORIZON RUIN PROBABILITIES 273 y E F (in discrete time, Sw(Fo)_ should be interpreted as Sw(F^)_1). Thus, Qx y is the distribution of an excursion of {St} conditioned to start in x E F and terminate in y E F. QR and QRy are defined similarly, and we let Qy y refer to the time reversed excursion . That is,
Qx,y(-) = P ({SW(F`)-t-} 0
w(0, oo) = r(0)
x=
St
y (a)
Figure 5.1 The sample path in (a) is the excursion of {St} conditioned to start in x = 0 and to end in y > 0, the one in (b) is the time reversed path. The theorem states that the path in (b) has the same distribution as an excursion of {Rt} conditioned to start in y < 0 and to end in x = 0. But in the risk theory example (corresponding to which the sample paths are drawn), this simply means the distribution of the path of {Rt} starting from y and stopped when 0 is hit. In particular: Corollary 5.3 The distribution of r(0) given r(0) < oo, S,(0)_ = y < 0 is the same as the distribution of w(-y) where w(z) = inf It > 0 : Rt = z}, z > 0. [note that w(z) < oo a.s. when p = ,13AB < 1] Proof of Theorem 5.2. We consider the discrete time discrete state space case only (well-behaved cases such as the risk process example can then easily be handled by discrete approximations). We can then view Qy,y as a measure on all strings of the form i0i1 ... in with i0, i1, ... , in E F, io = x, in = y, /^s x (S1 = Z1, ... , Sn = in = y; Sn+1 E Fc) nx,y(2p21 ...itt) = P Px(w(Fc) < 00, Sw(F)-1 = y)
CHAPTER IX. HEAVY TAILS
274 note that Fx(w(Fc) < 00, S, (Fc)-1 = y) 00
E E Px (Si = 21i ... , S. = in = y; Sn+1 E Fc) n=1 i1,...,i„_iEF
Similarly, t' y and Qy x are measures on all strings of the form ipi l ... in with 20,ii ,...,in E F, i0 =
QyR x(2p21 ...
y,
in)
in =
-
x,
Pt' (R1 = ii, ... , Rn = in = x; Rn+1 E
F (w(Fc) < 00,
F`)
R ,(F<)-1 = Y)
S S and Qx y( ipil ... in) = Qx,J (i.in-1 ... 2p). To show Q y x (i0 i 1 ... 2n) = Qx,TI( 2n2n _1 ... 2p) when 20, 21 , ... , in E
in = x, note first that Pt'
(R l = il, ... , Rn = in = x; Rn+1 E FC) TioilTili2...rin_1in
E Txj jEFC
m21 s2120 m2252221
m in Ssn n-1
mjSjx
m2p mil min-1 jEF`
1 Sinin _ 1
...
Si l io
MY
Mx
E mjSjx.
jEF^
Thus
Sxin-1 ... Silt' Qx(ioii ... in) = oo E Sxik_1 ... .gilt' k=1 ii .....ik-1EF
Similarly but easier Sxin_1 ... Si11 S 1 ... i0) Q x,y(inin _
E SYj jEF`
00 Sxik _1 ... Silt' E SO k=1 i1, ...,ik_1EF Sxin_1 ... Si1y 00
E E 5xik_ 1 ... Si1y k=1 i1 ,...,ik_1EF
jEF°
F,
20 =
y,
5. FINITE-HORIZON RUIN PROBABILITIES 275
5b The time to ruin Our approach to the study of the asymptotic distribution of the ruin time is to decompose the path of { St} in ladder segments . To clarify the ideas we first consider the case where ruin occurs already in the first ladder segment , that is, the case r (O) < oo, ST(o) > y. Let Y = Yl = Sr+( 1) be the value of the claim surplus process just after the first ladder epoch , Z = Zl = ST+( 1)_ the value just before the first ladder epoch (these r.v.'s are defined w.p. 1 w .r.t. P(o) ), see Fig. 5.2.
U T(O) = T (u)
Y
Figure 5.2 The distribution of (Y, Z) is described in Theorem 111.2.2. The formulation relevant for the present purposes states that Y has distribution Bo and that conditionally upon Y = y, Z follows the excess distribution B(Y) given by B(Y) (x) _ B(y + x)/B(y). We are interested in the conditional distribution of T(u) = T(0) given {T(0) < oo, S,(o) > y} = {T(0) < oo, Y > y} , that is, the distribution w.r.t. P(") = P(. 7-(0) < oo, Y > u). Now the P(u,')distribution of Y-u is Bo"). That is, the P(u,')-density of Y is B(y)/[,UBBo(u)], y > u. Bo") is also the P(u,')-distribution of Z since
P(Z>aIY>u) =
1
°° B(y) B(y + a) dy FLBBo(u) B (y)
J°° (z) dy - B(a) +a PBBo(u)
276
CHAPTER IX. HEAVY TAILS
Let {w(z)}Z^,o be defined by w(z) = inf It > 0: Rt = z} where {Rt} is is independent of {St}, in particular of Z. Then Corollary 5.3 implies that the P("'1)-distribution of T(u) = r(0) is that of w(Z). Now Bo E S implies that the Bo ")(a) -+ 0 for any fixed a, i.e. P(Z < a I Y > u) -3 0. Since w(z)/z a$. 1/(1 - p), z -^ oo, it therefore follows that T(u)/Z converges in Pi"'')probability to 1/(1 - p). Since the conditional distribution of Z is known (viz. Bo") ), this in principle determines the asymptotic behaviour of r(u). However, a slight rewriting may be more appealing. Recall the definition of the auxiliary function y(x) in Section 1. It is straightforward that under the conditions of Proposition 1.18(c)
Bo")(yY (u)) -+ P(W > y)
( 5.3)
where the distribution of W is Pareto with mean one in case ( a) and exponential with mean one in case (b). That is , Z/'y(u) -* W in Pi "' ')-distribution . r(u)/Z -4 1/(1 - p) then yields the final result T(u)/y(u) -+ W/(1 - p) in Pi"'')distribution. We now turn to the general case and will see that this conclusion also is true in P(")-distribution:
Theorem 5 . 4 Assume that Bo E S and that (5.3) holds. Then 7-(u)/-y(u) --^ W/(1 - p) in F(u) -distribution.
In the proof, let r+(1) = T(0),T+(2),... denote the ladder epochs and let Yk, Zk be defined similarly as Y = Y1, Z = ZI but relative to the kth ladder segment, cf. Fig. 5.3. Then, conditionally upon r+ (n) < oo, the random vectors (YI, Z1),. .. , (Y,,, Zn), are i.i.d. and distributed as (Y, Z). We let K(u) = inf In = 1, 2, ...: r+ (n) < oo, Y1 + • • • + Yn > u} denote the number of ladder steps leading to ruin and P("'n) = P(• I r(u) < oo, K(u) = n). The idea is now to observe that if K(u) = n, then by the subexponential property Yn must be large, i.e. > u with high probability, and YI,... , Yn_1 'typical'. Hence Z,, must be large and Z1,.. . , Zn_1 'typical' which implies that the first n-1 ladder segment must be short and the last long; more precisely, the duration T+ (n) - r+ (n - 1) of the last ladder segment can be estimated by the same approach as we used above when n = 1, and since its dominates the first n - 1, we get the same asymptotics as when n = 1.
5. FINITE-HORIZON RUIN PROBABILITIES 277
16
Z3
Z1
r+(1) T+(1) T+(1)
Figure 5.3 In the following, II ' II denotes the total variation norm between probability measures and ® product measure. Lemma 5.5 Ilp(u,n) (y1, ... , Y„-1, Yn - u) E •) - Bo (ri-1) ®B(,,u) II 0. Proof We shall use the easily proved fact that if A'(u), A"(u) are events such that P(A'(u) AA"(u)) = o(F (A'(u)) (A = symmetrical difference of events), then
IIP( I A'(u)) -
P(. I A"(u ))II -+ 0.
Taking A'(u) = {Y,, > u}, A"(u) _ {K(u)=n} = {Y1+
+Yn-1u},
the condition on A'(u) A A"(u) follows from Bo being subexponential (Proposition 1.2, suitably adapted). Further, P(. I A'(u)) = P(u,n),
P (Yj,...,Yn-1iYn - u) E • I A'(u)) = Bo (n-1) ®Bou) .
CHAPTER IX. HEAVY TAILS
278 Lemma 5 .6
IIPIu'n )
((Z1'..., Zn) E •) - Bo (n-1) ®Bo'
0.
Proof Let (Y11, Z11),..., (Y,,, Zn), be independent random vectors such that the conditional distribution of Zk given Y.' = y is BM, k = 1, ... , n, and that Yk has marginal distribution B0 for k = 1, . . . , n - 1 and Y„ - u has distribution Bout That is, the density of Yn is B(y)/[IBBO(u)], y > u. The same calculation as given above when n = 1 shows then that the marginal distribution of Zn is Bou). Similarly (replace u by 0), the marginal distribution of Zk is Bo for k < n, and clearly Zi, ... , Zn are independent. Now use that if the conditional distribution of Z' given Y' is the same as the conditional distribution of Z given Y and JIF(Y E •) - P(Y' E •)II -* 0, then 11P(Z E •) - P(Z' E •)II -> 0 (here Y, Y', Z, Z' are arbitrary random vectors, in our example Y = (Y1, ... , Y") ❑ etc.). Proof of Theorem 5.4. The first step is to observe that K(u) has a proper limit distribution w.r.t. P(u) since by Theorem 2.1, n_1 < u, Y1 +... + Y" > u) Flul (K (u ) = n) _ Cu) P"F(1'i +...+y 1 p"F(Yn > u) P)Pn-1 P/(1 - P) Bo(u) for n = 1, 2, .... It therefore suffices to show that the P(u'")-distribution of T(u) has the asserted limit. Let {wl(z)},..., {wn(z)} be i.i.d. copies of {w(z)}. Then according to Section 5a, the F'-distribution of r(u) is the same as the P'-distribution of w1(Zl) + • • • + wn(Zn). By Lemma 5.6, wk(Zk) has a proper limit distribution as u -+ oo for k < n, whereas wn(Zn) has the same limit behaviour as when n = 1 (cf. the discussion just before the statement of Theorem 5.4). Thus F(u'n)(T(u) /7(u) >
y)
=
F(u'n)((wl (Z1)
+ ...
+wn(Z n))l7( u )
>
1y)
^' P(u'n)(wn (Zn)/7(u) > y) -4 NW/(1 - P) > y)
Corollary 5.7 O (u,,y(u)T) - 1 P PBo(u) • P(W/(1 - p) < y).
Notes and references Excursion theory for general Markov processes is a fairly abstract and advanced topic. For Theorem 5.2, see Fitzsimmons [144]), in particular his Proposition (2.1).
6. RESERVE-DEPENDENT PREMIUMS 279 The results of Section 5b are from Asmussen & Kluppelberg [36] who also treated the renewal model and gave a sharp total variation limit result . Extensions to the Markov-modulated model of Chapter VI are in Asmussen & Hojgaard [33]. Asmussen & Teugels [53] studied approximations of i (u, T) when T -+ oo with u fixed; the results only cover the regularly varying case.
6 Reserve-dependent premiums We consider the model of Chapter VII with Poisson arrivals at rate /3, claim size distribution B, and premium rate p(x) at level x of the reserve. Theorem 6 .1 Assume that B is subexponential and that p(x) -> 00, x -> oo. Then (6.1)
0 (u) Qf "O ^) dy. u
The key step in the proof is the following lemma on the cycle maximum of the associated storage process {Vt}, cf. Corollary II. 3.2. Assume for simplicity that {Vt} regenerates in state 0 , i.e. that fo p(x)-1 dx < oo, and define the cycle as a = inf{t>0: Vt=0, max VB>0I Vo=0^ o<s
and the result follows.
J
u
B(y) dy , p(Y)
CHAPTER IX. HEAVY TAILS
280
Define D(u) as the steady-state rate of downcrossings of {Vt} of level u and Da (u) as the expected number of downcrossings of level u during a cycle. Then D(u) = f(u)p(u) and, by regenerative process theory, D(u) = DQ(u)/µ. Further the conditional distribution of the number of downcrossings of u during a cycle given Mo > u is geometric with parameter q(u) = P(Mo > u I Vo = u). Hence f (u)r(u) = D(u) = Do(u) - P(MT > u) $B(u) Ft µ(1 - q ( u)) 1 - q(u) Now just use that p(x) -* oo implies q (x) -+ 0.
❑
Notes and references The results are from Asmussen [22], where also the (easier) case of p(x) having a finite limit is treated . It is also shown in that paper that typically, there exist constants c(u) -4 0 such that the limiting distribution of r(u)/c(u) given r(u) < oo is exponential.
Chapter X
Simulation methodology 1 Generalities This section gives a summary of some basic issues in simulation and Monte Carlo methods . We shall be brief concerning general aspects and refer to standard textbooks like Bratley, Fox & Schrage [77], Ripley [304], Rubinstein [310] or Rubinstein & Melamed [311] for more detail ; topics of direct relevance for the study of ruin probabilities are treated in more depth.
la The crude Monte Carlo method Let Z be some random variable and assume that we want to evaluate z = EZ in a situation where z is not available analytically but Z can be simulated. The crude Monte Carlo ( CMC) method then amounts to simulating i.i.d. replicates Zl,... , ZN, estimating z by the empirical mean (Z1 + • • + ZN)/N and the variance of Z by the empirical variance N s2
=
N
z) 2 = Zit NE i-i i-i
E(Z{
-
2.
According to standard central limit theory , vrN-(z - z) 4 N(0, 4Z), where a2 = Var(Z ). Hence 1.96s z f (1.2) is an asymptotic 95% confidence interval , and this is the form in which the result of the simulation experiment is commonly reported.
281
282
CHAPTER X. SIMULATION METHODOLOGY
In the setting of ruin probabilities, it is straightforward to use the CMC method to simulate the finite horizon ruin probability z = i,b(u, T): just simulate the risk process {Rt} up to time T (or T n 7-(u)) and let Z be the indicator that ruin has occurred,
Z = I inf Rt < 0 (0
= I('r(u) < T).
The situation is more intricate for the infinite horizon ruin probability 0(u). The difficulty in the naive choice Z = I(T(u) < oo) is that Z can not be simulated in finite time: no finite segment of {St} can tell whether ruin will ultimately occur or not. Sections 2-4 deal with alternative representations of Vi(u) allowing to overcome this difficulty.
lb Variance reduction techniques The purpose of the techniques we study is to reduce the variance on a CMC estimator Z of z, typically by modifying Z to an alternative estimator Z' with EZ' = EZ = z and (hopefully) Var(Z') < Var(Z). This is a classical area of the simulation literature, and many sophisticated ideas have been developed. Typically variance reduction involves both some theoretical idea (in some cases also a mathematical calculation), an added programming effort, and a longer CPU time to produce one replication. Therefore, one can argue that unless Var(Z') is considerable smaller than Var(Z), variance reduction is hardly worthwhile. Say that Var(Z') = Var(Z)/2. Then replacing the number of replications N by 2N will give the same precision for the CMC method as when simulating N' = N replications of Z', and in most cases this modest increase of N is totally unproblematic. We survey two methods which are used below to study ruin probabilities, conditional Monte Carlo and importance sampling. However, there are others which are widely used in other areas and potentially useful also for ruin probabilities. We mention in particular ( regression adjusted) control variates and common random numbers. Conditional Monte Carlo Let Z be a CMC estimator and Y some other r . v. generated at the same time as Z. Letting Z' = E[Z I Y], we then have EZ = EZ = z, so that Z' is a candidate for a Monte Carlo estimator of z. Further, writing
Var(Z) = Var(E [Z I Y]) + E(Var[Z I Y])
283
1. GENERALITIES
and ignoring the last term shows that Var(Z') < Var(Z) so that conditional Monte Carlo always leads to variance reduction. Importance sampling The idea is to compute z = EZ by simulating from a probability measure P different from the given probability measure F and having the property that there exists a r.v. L such that z = EZ = E[LZ]. (1.3) Thus, using the CMC method one generates (Z1, L1),. .. , (ZN, LN) from P and uses the estimator N
zrs = N > L:Zj i=1
and the confidence interval zrs f
1.96 sis v^
N 2 1 where srs =N
j(LiZi
2 1 N 2 2 2
- zrs) =
i=1
N > Lt Zi
- zrs.
i=1
In order to achieve (1.3), the obvious possibility is to take F and P mutually equivalent and L = dP/dP as the likelihood ratio.
Variance reduction may or may not be obtained: it depends on the choice of the alternative measure P, and the problem is to make an efficient choice. To this end, a crucial observation is that there is an optimal choice of P: define P by dP/dP = Z/EZ = Z/z, i.e. L = z/Z (the event {Z = 0} is not a concern because P(Z = 0) = 0). Then z Var(LZ) = E(LZ)2 - [E(LZ)] = E Z2 Zz - E [Z Z]2 = z2 - z2 = 0. Thus, it appears that we have produced an estimator with variance zero. However, the argument cheats because we are simulating since z is not avaliable analytically. Thus we cannot compute L = Z/z (further, it may often be impossible to describe P in such a way that it is straightforward to simulate from
P). Nevertheless, even if the optimal change of measure is not practical, it gives a guidance: choose P such that dP/dP is as proportional to Z as possible. This may also be difficult to assess , but tentatively, one would try to choose P to make large values of Z more likely.
CHAPTER X. SIMULATION METHODOLOGY
284
1c Rare events simulation The problem is to estimate z = P(A) when z is small , say of the order 10-3 or less. I.e., Z = I(A) and A is a rare event. In ruin probability theory, A = {T(u) < T} or A = {r(u) < oo} and the rare events assumption amount to u being large, as is the case of typical interest. The CMC method leads to a variance of oZ = z(1 - z) which tends to zero as z ^ 0. However, the issue is not so much that the precision is good as that relative precision is bad: oZ z(1 - z) 1 -> 00.
Z
z
V5
In other words , a confidence interval of width 10 -4 may look small, but if the point estimate z is of the order 10-5, it does not help telling whether z is of the magnitude 10-4, 10 - 5 or even much smaller . Another way to illustrate the problem is in terms of the sample size N needed to acquire a given relative precision , say 10%, in terms of the half-width of the confidence interval. This leads to the equation 1.96oz /(zV) = 0.1, i.e.
N - 100 - 1.96 2Z ( 1 - z) 100-1.96 2 z2
z
increases like z-1 as z .0. Thus, if z is small, large sample sizes are required. We shall focuse on importance sampling as a potential (though not the only) way to overcome this problem. The optimal change of measure ( as discussed above) is given by
P(B) = E
[Z i;B ]
= iP(AB) = P(BIA). z
I.e., the optimal P is the conditional distribution given A. However, just the same problem as for importance sampling in general comes up: we do not know z which is needed to compute the likelihood ratio and thereby the importance sampling estimator, and further it is usually not practicable to simulate from P(•IA). Again, we may try to make P look as much like P(•IA) as possible. An example where this works out nicely is given in Section 3. Two established efficiency criteria in rare events simulation are bounded relative error and logarithmic efficiency. To introduce these, assume that the rare event A = A(u) depends on a parameter u (say A = {r(u) < oo}). For each u, let z(u) = P(A(u)), assume that the A(u) are rare in the sense that z(u) -* 0, u -+ oo, and let Z(u) be a Monte Carlo estimator of z(u). We then
2. SIMULATION VIA THE POLLACZECK-KHINCHINE FORMULA
285
say that {Z(u)} has bounded relative error if Var(Z(u))/z(u)2 remains bounded as u -3 oo. According to the above discussion, this means that the sample size N = NE(u) required to obtain a given fixed relative precision (say a =10%) remains bounded. Logarithmic efficiency is defined by the slightly weaker requirement that one can get as close to the power 2 as desired: Var(Z(u)) should go to 0 as least as fast as z(u)2-E, i.e. Var(Z(u)) hm sup U-+00 z (u) 2-E
<
oo
(1.4)
for any e > 0. This allows Var(Z(u)) to decrease slightly slower than z(u)2, so that NE (u) may go to infinity. However, the mathematical definition puts certain restrictions on this growth rate, and in practice, logarithmic efficiency is almost as good as bounded relative error. The term logarithmic comes from the equivalent form - log Var(Z(u)) lim inf > 2 u-+oo - log z(u)
of (1.4). Notes and references For surveys on rare events simulation, see Asmussen & Rubinstein [45] and Heidelberger [190].
2 Simulation via the Pollaczeck-Khinchine formula For the compound Poisson model, the Pollaczeck-Khinchine formula III.(2.1) may be written as V) (u) = P(M > u), where M = X1 + • • • + XK, where X1, X2, ... are i.i.d. with common density bo(x) = B(x)/µB and K is geometric with parameter p, P(K = k) = (1 - p)pk. Thus, O (u) = z = EZ, where Z = I(M > u) may be generated as follows: 1. Generate K as geometric, F(K = k) = (1 - p)pk.
2. Generate X1, ... , XK from the density bo(x). Let M - X1 + + XK. 3. If M > u, let Z +- 1. Otherwise, let Z +- 0. The algorithm gives a solution to the infinite horizon problem , but as a CMC method , it is not efficient for large u . Therefore , it is appealing to combine with some variance reduction method . We shall here present an algorithm developed by Asmussen & Binswanger [ 271, which gives a logarithmically efficient estimator
286
CHAPTER X. SIMULATION METHODOLOGY
when the claim size distribution B (and hence Bo) has a regularly varying tail. So, assume in the following that Bo(x) - L(x)/x`' with a > 0 and L(x) slowly varying. Then (cf. Theorem IX.2.1) V)(u) - p/(l - p)Bo(x), and the problem is to produce an estimator Z(u) with a variance going to zero not slower (in the logarithmic sense ) than Bo(u)2. A first obvious idea is to use conditional Monte Carlo: write i,b(u)
=
P (Xl
+•••+XK>u)
= EF[Xl + ...+XK > uIXl,...,XK-1] = EBo(u-X1 - ...-XK_1).
Thus, we generate only X1, ... , XK-1, compute Y = u - X1 - - XK_1 and let Z( 1)(u) = Bo (Y) (if K = 0, Z(1)(u) is defined as 0). As a conditional Monte Carlo estimator , Z(1) (u) has a smaller variance than Zl (x). However, asymptotically it presents no improvement : the variance is of the same order of magnitude F(x). To see this, just note that EZ(1)(u ) 2
> E[Bo (x - Xl - ... - SK-1)2; Xl > x, K > 2] = P2p(Xl > x) = P2Bo(x)
(here we used that by positivity of the X;, X1 + + XK_ 1 > x when X1 > x, and that Bo(y) = 1, y < 0). This calculation shows that the reason that this algorithm does not work well is that the probability of one single Xi to become large is too big. The idea of [27] is to avoid this problem by discarding the largest X; and considering only the remaining ones. For the simulation, we thus generate K and X1i ... , XK, form the order statistics X(1) < X(2) < ... < X(K)
throw away the largest one X(K), and let Z(2)(u)
=
P (SK
> u I X(l),X(2),...,X(K-1))
_
B0((u
- S( K_1)) V X(K-1))
/ Bo(X(K -1))
where S(K_l) = X(1) + X(2) + • • • + X(K_1). conditional probability, note first that
P(X(n) > x I X(1),X(2),...,X(n_1))
To check the formula for the
Bo(X(„_l) V X) Bo(X(n-1))
3. IMPORTANCE SAMPLING VIA LUNDBERG CONJUGATION 287 We then get
P(S"
> x I
X( 1), X(2), ...
, X(n-1))
X X(1), X(2),
P(X(TZ) + S(,_1) > P(X(n) > _
Bo((x
X
... ,
X (. -l))
- S(n_1) I X(1), X(2), ... , X(n-1))
- S (n-1)) V
X (. -l))
BO(X(n-1)) Theorem 2 . 1 Assume that Bo (x) = L(x)/x° with L(x) slowly varying. Then the algorithm given by { Z (2) (u) } is logarithmically efficient. Notes and references The proof of Theorem 2.1 is elementary but lengty, and we refer to [27]. The algorithm is sofar the only efficient one which has been developed for the heavy-tailed case. Asmussen , Binswanger and HOjgaard of [28] give a general survey of rare events simulation for heavy -tailed distributions , and that paper contains one more logarithmically efficient algorithm for the compound Poisson model using the Pollaczeck- Khinchine formula and importance sampling . However , it must be noted that a main restriction of both algorithms is that they are so intimately tied up with the compound Poisson model because the explicit form of the Pollaczeck-Khinchine formula is crucial (say, in the renewal or Markov- modulated model P(r+ < oo) and G+ are not explicit ). Also in other respects the findings of [28] are quite negative: the large deviations ideas which are the main approach to rare events simulation in the light-tailed case do not seem to work for heavy tails.
3 Importance sampling via Lundberg conjugation We consider again the compound Poisson model and assume the conditions of Ce-7", use the the Cramer-Lundberg approximation so that z(u) = '(u) - u is the = ST(") where ^(u) = e-7"ELe-7E(") representation 0(u) = e-7sr(u) instead of 13L, BL overshoot (cf. 111.5), and simulate from FL, that is, using = e-rysr(u). 0, B, for the purpose of recording Z(u) For practical purposes, the continuous-time process {St} is simulated by considering it at the discrete epochs {Qk} corresponding to claim arrivals. Thus, the algorithm for generating Z = Z(u) is: 1. Compute -y > 0 as solution of the Lundberg equation 0 = K(y) = )3(B[y] - 1) - y, and define )3L, BL by I3L = /3B[-y], BL(dx) = e7sB(dx)/B[y].
288
CHAPTER X. SIMULATION METHODOLOGY
2. Let Sf-0
3. Generate T as being exponential with parameter ,l3 and U from B. Let S - S+U - T. 4. If S > u, let Z F e_'s. Otherwise, return to 3. There are various intuitive reasons that this should be a good algorithm. It resolves the infinite horizon problem since FL(,r(u) < oo) = 1. We may expect a small variance since we have used our knowledge of the form of 0(u) to isolate what is really unknown, namely ELe-ry£("), and avoid simulating the known part e-7". More precisely, the results of IV.7 tell that P(. r(u) < oo) and FL (both measures restricted to.F,(u)) are asymptotically coincide on {r(u)} < oo, so that changing the measure to FL is close to the optimal scheme for importance sampling , cf. the discussion at the end of Section 1b. In fact: Theorem 3.1 The estimator Z(u) = e-'rs* "u) (simulated from FL) has bounded relative error. Proof Just note that EZ(u)2 < e - 2ryu _ z (u)2/C2.
❑
It is tempting to ask whether choosing importance sampling parameters ,Q, b different from ,QL, BL could improve the variance of the estimator . The answer is no. In detail , to deal with the infinite horizon problem , one must restrict attention to the case 4µB > 1. The estimator is then
M(u) /3e-QT' dB Z(u) (Ui) j=1 )3 e $Ti dB where M(u) is the number of claims leading to ruin, and we have: Theorem 3.2 The estimator (3.1) (simulated with parameters ^3, B) is not logarithmically efficient when (/3, b) # (/3L, BL).
The proof is given below as a corollary to Theorem 3.3. The algorithm generalizes easily to the renewal model . We formulate this in a slightly more general random walk setting '. Let X1, X2, ... be i.i.d. with distribution F, let S,, = X1 + ... + X,,, M(u) = inf {n : S„ > u}, and assume that µF < 0 and that F[y] = 1, P'[-y] < oo for some ry > 0. Let FL (dx) = 'For the renewal model, Xi = U; -Ti, and the change of measure F -r FL corresponds to B -> BL, A -> AL as in Chapter V.
3. IMPORTANCE SAMPLING VIA LUNDBERG CONJUGATION
289
e7yF(dx). The importance sampling estimator is then Z( u) = e-'rSM( ). More generally, let F be an importance sampling distribution equivalent to F and
M(u) dF Z(u) _ I -(Xi) . (3.2) dF Theorem 3.3 The estimator (3.2) (simulated with distribution F of the X3 has bounded relative error when .P = FL. When F # FL, is not logarithmically efficient. Proof The first statement is proved exactly as Theorem 3 . 1. For the second, write
W(F IF) _ -F(XI)... -F(XM(u)). By the chain rule for Radon-Nikodym derivatives,
EFZ(u)2
= EeW2(FIF) = Ep [W2(FIFL)W2(FLIF)] = EL [W2 ( FIFL)w(FLIF)] = ELexp {Kl+...+KM(u)},
where
Kl og
(X) (j) 2
)
= -log dFL (Xi) - 2'X1 .
Here ELK; = c'- 2ryELXi, where
e' = -EL Iog dFL (Xi) > 0 by the information inequality. Since K1, K2, ... are i.i.d., Jensen's inequality and Wald's identity yield EpZ(u)2
> exp {EL(K1 + ... + KM(u))} = exp {ELM(u)(E - 2ryELXi)} .
Since ELM(u)/u -+ 1/ELXi, it thus follows that for 0 < e < e'/ELXi, EFZ(u)2 EFZ(u)2 lim sup z(u)2eeU = lim cop C2e-2,yu+elu u -+oo e-try' 1 > lim up C2e-2,yu = G,2 > 0,
290
CHAPTER X. SIMULATION METHODOLOGY ❑
which completes the proof.
Proof of Theorem 3.2. Consider compound Poisson risk process with intensities /3', /3", generic interarrival times T' , T", claim size distributions B', B" and generic claim sizes U', U". Then according to Theorem 3.3, all that needs to be shown is that if U' - T' = U" - T", then /3' B' = B". First by the memoryless distribution of the exponential distribution , U' - T' has a left exponential tail with rate /3' and U" - T" has a left exponential tail with rate /3'. This immediately yields /3' = 3". Next, from
P(U'-T'>x) ^ =
^
/3'e-Q'YR'( x + y) dy = ,3'eO'x f f
e-Q'zB (z) dz,
P (U" - T" > x)
J
/3"e-0 yB (x + y) dy = ,3"eQ x
0
J
e-Q zB (z) dz
x
(x > 0) and /3' = /3", U' - T' D U" - T", we conclude by differentiation that Bo(x)=B' (x)forallx > 0,i.e.B'=B".
❑
Notes and references The importance sampling method was suggested by Siegmund [343] for discrete time random walks and further studied by Asmussen [ 13] in the setting of compound Poisson risk models . The optimality result Theorem 3.1 is from Lehtonen & Nyrhinen [244], with the present (shorter and more elementary) proof taken from Asmussen & Rubinstein [45]. In [13], optimality is discussed in a heavy traffic limit y 10 rather than when u -+ oo. The extension to the Markovian environment model is straightforward and was suggested in Asmussen [ 16]. Further discussion is in Lehtonen & Nyrhinen [245]. The queueing literature on related algorithms is extensive , see e.g. the references in Asmussen & Rubinstein [45] and Heidelberger [190].
4 Importance sampling for the finite horizon case The problem is to produce efficient simulation estimators for '0 (u, T) with T < oo. As in IV.4, we write T = yu. The results of IV.4 indicate that we can expect a major difference according to whether y < 1/r,'(-y) or y > 1/r.'(-y). The easy case is y > 1/k'(-y) where O(u, yu) is close to zk(u), so that one would expect the change of measure F -4 FL to produce close to optimal results. In fact: Proposition 4.1 If y > 1/ic'('y), then the estimator Z(u) = e-7Sr(°)I(r(u) < yu) (simulated with parameters /3L, BL) has bounded relative error.
4. IMPORTANCE SAMPLING FOR THE FINITE HORIZON CASE
291
Proof The assumption y > 1/n'(-y) ensures that 1fi(u, yu)/z,(u) -* 1 (Theorem IV.4.1) so that z(u) = zP(u, yu) is of order of magnitude a-71. Bounding ❑ ELZ(u)2 above by a-7u, the result follows as in the proof of Theorem 3.1. We next consider the case y < 1/r.'(7). We recall that ay is defined as the solution of a'(a) = 1/y, that ryy = ay - yk(ay) determines the order of magnitude of z'(u, yu) in the sense that - log 4')u) -4 7y
u (Theorem IV.4.8), and that ryy > ry. Further
,O(u, yu) = e-ayu Eay Le-ay^(u)+r(u)K(ay); T(u) < yu] . (4.2) Since the definition of ay is equivalent to Eay r(u) - yu, one would expect that the change of measure F Pay is in some sense optimal. The corresponding estimator is Z(u) = e-avS' ( u)+T(u)K (ay)I(T( u) < yu),
(4.3)
and we have: Theorem 4.2 The estimator (4.3) (simulated with parameters /gay, Bay) is logarithmically efficient. Proof Since ryy > -y, we have ic(ay ) > 0 and get
Eay Z(u)2 =
Eay
[e
-
2aySr( u)+2r(u )r.(ay);
e-2ryyuEay
e
le-
T( u) < yu]
2ay^(u); T(u) <
yu]
-27yu .
Hence by (4.1), lim inf u--oo
- log Var(Z(u)) _ - log Var(Z(u)) l im of .yy> 2 - to g x ( u ) u
so that (1.5) follows.
❑
Remark 4 . 3 Theorem IV.4.8 has a stronger conclusion than (4.1), and in fact, (4.1) which is all that is needed here can be showed much easier . Let Qy2 =
CHAPTER X. SIMULATION METHODOLOGY
292
Vara„ (-r(u))/u so that (T(u) - yu)/(uyu1/2) . N(0,1) (see Proposition IV.4.2). Then z(u) =
Eay Z(u)
>
Eay
avS'(u)+T( u)k(av
Le-
1 ); yu - o ,u1/2 < r(u) < yu
l
= e- a yu +l/ur' (av)Ei`av re-av^(u)+(T(++)(U)
yu - o.yu1/2 <1 T(u) < yu
r > e-7vu +avul/ 2r.(av)Eav l e- a vt(u); yu - Qyu1/2 < T(u) C yu e- ryyu
+oy u1/2K'(av)Eo
] l
l
1/2)
v
where the last step follows by Stam's lemma (Proposition IV.4.4). Hence lira inf log
-ryyu + vyu 1/2 tc(ay) - -7y x(u) > hm inf
u-+Oo U - u-aoo U
That lim sup < follows similarly but easier as when estimating En,, Z (u)2 above. 0 Notes and references The results of the present section are new. In Asmussen [13], related discussion is given in a heavy traffic limit q J. 0 rather than when u -3 oo.
5 Regenerative simulation Our starting point is the duality representations in 11.3: for many risk processes {Rt }, there exists a dual process { V t} such that i,b(u,T)
=
P
inf Rt < 0 = P(VT > u),
O
'%(u) = P I info Rt < 0) = P(VV > u), (5.1) where the identity for Vi(u) requires that Vt has a limit in distribution V. In most of the simulation literature (say in queueing applications), the object of interest is {Vt} rather than {Rt}, and (5.1) is used to study Voo by simulating {Rt} (for example, the algorithm in Section 3 produces simulation estimates for the tail P(W > u) of the GI/G/1 waiting time W). However, we believe that there are examples also in risk theory where (5.1) may be useful. One main example is {Vt} being regenerative (see A.1): then by Proposition A1.3, zi(u) = INV. > u) = -E f I(VV > u) dt 0
(5.2)
293
5. REGENERATIVE SIMULATION
where w is the generic cycle for {Vt}. The method of regenerative simulation, which we survey below , provides estimates for F ( V,, > u) (and more general expectations Eg(V... )). Thus the method provides one answer on to how to avoid to simulate { Rt} for an infinitely long time period. For details , consider first the case of independent cycles . Simulate a zerodelayed version of {V t } until a large number N of cycles have been completed.
For the ith cycle, record Zi'i = (Z1'),
Z2'>) where Zi'i = w, is the cycle length,
Z2'> the time during the cycle where { Vt} exceeds u and zj = EZJ'), j = 1, 2. Then Z(1), ... , Z(N) are i . i.d. and
EZ1'i = z1 = Ew, EZ2'i = z2 = E
J0 'o I (Vt > u) dt .
Thus, letting
Z1 = (Zl1i +... + Z1N>) ,
Zl the LLN yields Z1
a$'
z 1, Z2
a4*
Z2 =
Z(1)
N (X21' + ... + Z2N)) ,
+...+Z(N)
z2,
(u) ?2 = E fo I(Vt > u) dt = 0( u ) zl Ew as N -> oo. Thus, the regenerative estimator z%(u) is consistent.
To derive confidence intervals , let E denote the 2 x 2 covariance matrix of Z('). Then (Z1-z1i Z2-z2 ) 4 N2(0,E).
Therefore , a standard transformation technique (sometimes called the delta method) yields 1 V 2 (h (Zi, Z2 - h (zl, z2)) -> N(O, oh) for h : R2 -^ R and Ch = VhEVh, Vh = (8h/8z1 8h/ 8z2). Taking h(zl, z2) z2/z1 yields Vh = (-z2/z2i 1/zl),
(^(u) - t(u)) 4
N(0, 02)
(5.3)
294
CHAPTER X. SIMULATION METHODOLOGY
where
2
01 2
=
Z2
Z2
Eli
z1
+ 2 E22 - 2 E1 2 z1 z1
The natural estimator for E is the empirical covariance matrix N
S = N 1 12 (ZW - Z) ^Z(=) - z^ i=1
so a2 can be estimated by 2 2 = 72 S11+ 12 S22 - 2- g S12 (5.5)
Z1 Z1 Z1 and the 95% confidence interval is z1 (u) ± 1.96s/v"N-. The regenerative method is not likely to be efficient for large u but rather a brute force one. However , in some situations it may be the only one resolving the infinite horizon problem , say risk processes with a complicated structure of the point process of claim arrivals and heavy -tailed claims . There is potential also for combining with some variance reduction method. Notes and references The literature on regenerative simulation is extensive, see e.g. Rubinstein [310] and Rubinstein & Melamed [311].
6 Sensitivity analysis We return to the problem of 111 . 9, to evaluate the sensitivity z/i( (u ) = (d/d() 0(u) where ( is some parameter governing the risk process . In 111.9, asymptotic estimates were derived using the renewal equation for z /i(u). We here consider simulation algorithms which have the potential of applying to substantially more complex situations. Before going into the complications of ruin probabilities , consider an extremely simple example , the expectation z = EZ of a single r.v. Z of the form Z = ^p(X) where X is a r . v. with distribution depending on a parameter (. Here are the ideas of the two main appfoaches in today 's simulation literature: The score function ( SF) method . Let X have a density f (x, () depending on C. Then z(() = f cp(x) f (x, () dx so that differentiation yields zS d( fco(x)f(x,C)dx = f w(x) d( f ( x, () dx f Ax) (dl d()f (x' () f ( z, () dx = E[SZ] f(X,0
295
6. SENSITIVITY ANALYSIS where
S = (d/d()f (X, () = d log f (X, C) f(X,() d( is the score function familiar from statistics . Thus, SZ is an unbiased Monte Carlo estimator of z(. Infinitesimal perturbation analysis (IPA) uses sample path derivatives. So assume that a r.v. with density f (x, () can be generated as h(U, () where U is uniform(0,1). Then z(() = Ecp(h(U, ()),
zc = E [d( co(h(U, C)), = E [`d (h(U, ()) d( hc(U, C), where h( (u, () = (8/8()h (u, () Thus, cp' (h(U, ()) h((U, () is an unbiased Monte Carlo estimator of zS. For example , if f (x, () _ (e-Sx, one can take h (U, () = - log U/(, giving h( (U, () = log U/(2. The derivations of these two estimators is heuristic in that both use an interchange of expectation and differentiation that needs to be justified. For the SF method, this is usually unproblematic and involves some application of dominated convergence . For IPA there are, however , non-pathological examples where sample path derivatives fail to produce estimators with the correct expectation. To see this, just take cp as an indicator function , say W(x) = I(x > xo) and assume that h(U, () is increasing in C. Then , for some Co = (o(U), cp(h(U, ()) is 0 for C < Co and 1 for C > Co so that the sample path derivative cp'(h(U, ()) is 0 w . p. one. Thus , IPA will estimate zS by 0 which is obviously not correct. In the setting of ruin probabilities , this phenomenon is particularly unpleasant since indicators occur widely in the CMC estimators . A related difficulty occurs in situations involving the Poisson number Nt of claims: also here the sample path derivative w.r.t. /3 is 0. The following example demonstrates how the SF method handles this situation. Example 6 .1 Consider the sensitivity tka(u) w.r.t. the Poisson rate /3 in the compound Poisson model. Let M(u) be the number of claims up to the time r(u) of ruin (thus, r(u) = Tl + • • • +TM(u)). The likelihood ratio up to r(u) for two Poisson processes with rates /3, /3o is M(u)
Oe
-(3T:
11 /3oe-OoT; I(r(u)
< oo) .
296
CHAPTER X. SIMULATION METHODOLOGY
Taking expectation, differentiating w.r.t. j3 and letting flo = 0, we get
1 M(u) 00(u)
= E
(_Ti)I(T(U)<)
E [(M(u) - T(u)) I(T(u) < co)
]
.
To resolve the infinite horizon problem , change the measure to FL as when simulating tp(u). We then arrive at the estimator
ZZ(u) =
(M(u) - T(u)) e-7ue--rVu)
for ?P,3 (u) (to generate Zp (u), the risk process should be simulated with parameters ,3L, BL). We recall (Proposition 111.9 . 4) that V5,3 (u) is of the order of magnitude ue-7u. Thus, the estimation of z(ip(u) is subject to the same problem concerning relative precision as in rare events simulation . However, since ELZp(u)2
we have
\ 1
< (M(U) _T(u)
VarL(ZQ(u)) ZO(u)2
)
2 a-2ryu = O(u2)e-27u,
O(u2)e-2
-yu - 0(1)
u2e-2ryu
so that in fact the estimator Zf(u) has bounded relative error. 0 Notes and references A survey of IPA and references is given by Glasserman [161] (see also Suri [358] for a tutorial), whereas for the SF method we refer to Rubinstein & Shapiro [312]. Example 6.1 is from Asmussen & Rubinstein [46] who also work out a number of similar sensitivity estimators, in part for different measures of risk than ruin probabilities, for different models and for the sensitivities w.r.t. different parameters.
There have been much work on resolving the difficulties associated with IPA pointed out above. In the setting of ruin probabilities, a relevant reference is VazquezAbad [374].
Chapter XI
Miscellaneous topics 1 The ruin problem for Bernoulli random walk and Brownian motion. The two-barrier ruin problem The two-barrier ruin probability 0,,(u) is defined as the probability of being ruined (starting from u) before the reserve reaches level a > u. That is, Y'a(U) = P(T
(u)
= r+(a)) =
1 - P(•r(u, a) = r(u)),
wherel T(u) = inf {t > 0 : Rt < 0} ,
T+(a) = inf It > 0 : Rt > al,
T(u, a) = r(u) A T+(a).
Besides its intrinsic interest , Oa(U ) can also be a useful vehicle for computing t/i(u) by letting a -* oo. Consider first a Bernoulli random walk, defined as Ro = u (with u E {0,1,.. .... }), R„ = u+X,+• • •+X,, where X1, X2, ... are i.i.d. and {-1,1}-valued with P(Xk = 1) = 9.
'Note that in the definition of r(u ) differs from the rest of the book where we use r(u) = inf {t > 0 : Rt < 0} ( two sharp inequalities ); in most cases , either this makes no difference (P(R.,(u) = 0 ) = 0) or it is trivial to translate from one set-up to the other, as e.g. in the Bernoulli random walk example below.
297
CHAPTER XI. MISCELLANEOUS TOPICS
298
Proposition 1.1 For a Bernoulli random walk with 0 0 1/2, C1_0\a- (1-B)u
I\ e
oJ 0- (u)
=
1 oa ' ()i
a
=
u,
u + 1,.... (1.1)
o
If 0 = 1/ 2, then 'Oa(u) _
au a
We give two proofs , one elementary but difficult to generalize to other models, and the other more advanced but applicable also in some other settings. Proof 1. Conditioning upon X1 yields immediately the recursion 'a(1) = 1-9+00a(2), _ (1 - o)T/la (1) + 8z/'u(3),
tba(2)
7/la(a - 2)
= (1-9)4/'0(a-3)+9ba(a-1),
Oa(a - 1)
= (1 - o)'t/1a(a - 2),
❑
and insertion shows that ( 1.1) is solution.
Proof 2. In a general random walk setting , Wald's exponential martingale is defined as in 11.(4.4) by ea(u+Xl+...+Xn) F[ a]n n=0,1,...
where a is any number such that Ee°X = F[a]
= z°P
(RT ( u,a)
= 0) + zap ( R,r(u,a)
= z°Va(u) + za(1 -
Y,a(u)),
1. RANDOM WALK; BROWNIAN MOTION; TWO BARRIERS 299 and solving for 4/la(u) yields t/ia(u) = (za - zu)/(za - 1).
If 9 = 1/2, (1.2) is trivial (z = 1). However, {R,,} is then itself a martingale and we get in a similar manner
u = ER° = ER ra( u) = 0 •
Y'a (u)
+ all -
a-u Y'a( u)),
pa( u)
_
u
Corollary 1.2 For a Bernoulli random walk with 9 > 1/2, a el u
1h (u) =
\1
If 9 < 1/ 2, then Vi(u) = 1. ❑
Proof Let a-+ oo in (1.1).
Proposition 1.3 Let {Rt} be Brownian motion starting from u and with drift p and unit variance . Then for p 0 0, ,•
a-2µa
-
e-2µu
,ba(u) = e-2µa - 1
If p = 0, then
'Oa (U)
--
a-u a
Proof Since
Eea(R°- u) =
et(a2
/2 +aµ)
the Lundberg equation is rye/2-'yp = 0 with solution y = 2p. Applying optional stopping to the exponential martingale {e-7R, } yields e-7u
= Ee-7R° = e°Wa(u) + e-7a(1 - 0a(u)),
and solving for 9/la(u) yields Z/)a(u) = (e -76 - e-7u)/(e-7° - 1) for p # 0. If p = 0, {Rt} is itself a martingale and just the same calculation as in the ❑ proof of Proposition 1.1 yields 't/la(u) = (a - u)/u.
Corollary 1.4 For a Brownian motion with drift u > 0, i1(u) = e-211 . If p<0, thenz1 (u)=1.
(1.5)
CHAPTER XI. MISCELLANEOUS TOPICS
300
❑
Proof Let a -* oo in (1.4).
The reason that the calculations work out so smoothly for Bernoulli random walks and Brownian motion is the skip-free nature of the paths, implying R(u,a) = a on {r (u,a) = -r+ (a)} and similarly for the boundary 0. For most standard risk processes , the paths are upwards skip-free but not downwards, and thus one encounters the problem of controlling the undershoot under level 0. Here is one more case where this is feasible: Example 1.5 Consider the compound Poisson model with exponential claims (with rate, say, 5). Here the undershoot under 0 is exponential with rate 5, and hence e-7u =
=
Ee-7Ro E [e-7R(,, , a) I R(u a ) < 0] P (R(u ,a) < 0) + e-7°P (R(u,a) = a)
5 y =
P (R (u,a ) < 0) + e -7aF ( R (u,a) = a )
Ic 5-ry 'pa(u)
+ e -' ° ( 1 - 7/la(u)).
Using y = 6 - 3, we obtain 'Oa a-7u - e-7a (u) = 6
/0 - e-7a
Again , letting a -* oo yields the standard expression pe-7u for .0 (u) (where ❑ p =,616), valid if p < 1 (otherwise , 7/'(u) = 1). However, passing to even more general cases the method quickly becomes unfeasible (see, however, VIII.5a). It may then be easier to first compute the one-barrier ruin probability O(u): Proposition 1.6 If the paths of {Rt} are upwards skipfree and 7//(a) < 1, 0. (u) _ O(u) - 0(a) 0 < u < a. 1 - vi(a) Proof By the upwards skip-free property, 7O(u) = 7/la(u) + (1 - +^a(u))^(a) If 7k(a) < 1, this immediately yields (1.7).
(1.7)
301
1. RANDOM WALK; BROWNIAN MOTION; TWO BARRIERS
Note thas this argument has already been used in VII.1a for computing ruin probabilities for a two-step premium function. We now return to Bernoulli random walk and Brownian motion to consider finite horizon ruin probabilities. For the symmetric (drift 0) case these are easily computable by means of the reflection principle: Proposition 1.7 For Brownian motion with drift 0, 0(u, T ) = P(T(u) < T ) = 241, (i). (1.8) Proof In terms of the claim surplus process { St} = {u - Rt}, we have ili(u,T) P(MT > u) where MT = maxo
= P(ST > u) + P(ST > u) (1.9) = 2P(ST > u).
Corollary 1.8 Let {Rt} be Brownian motion with drift - µ so that {St} is Brownian motion with drift µ . Then the density and c.d.f. of -r(u) are ( U2 Pµ (T(u ) E dT) =
Pµ (T(u) < T)
2^T -3/2 exp µu - 2
( 1.10)
, + µ2T) } ,
)
!....µ%T (1.11) = 1 - 4) I = - µ ✓T I + e2µ"4) ( - VIT
Proof For p = 0, (1.11 ) is the same as (1.8 ), and (1 . 10) follows then by straightforward differentiation. For µ # 0, the density dPµ / dP0 of St is eµst-tµ2/2, and hence Pµ('r(u) E dT)
= Eo [e µsr(,.)_ _( u)µ2 /2; T(u) E dT, = eµu-Tµ2/2Po (T(
u) E dT) 2
eµu-Tµ2/2
u T-3/2 ex p u 27r p 1-2 T
CHAPTER XI. MISCELLANEOUS TOPICS
302
which is the same as (1.10). (1.11) then follows by checking that the derivative of the r.h.s. is (1.10) and that the value at 0 is 0. ❑ Small modifications also apply to Bernoulli random walks: Proposition 1.9 For Bernoulli random walk with 9 = 1/2, Vi(u,T) = P(ST = u) + 2P (ST > u),
(1.12)
whenever u, T are integer-valued and non-negative. Here {2-T( (v-}TT)/2) v=-T,-T+2,...,T-2,T
P(ST = v) = 0 otherwise. Proof The argument leading to ( 1.9) goes through unchanged, and (1.12) is the same as ( 1.9). The expression for F ( ST = v) is just a standard formula for the ❑ binomial distribution. The same argument as used for Corollary 1.8 also applies to the case 9 54 1/2, but we omit the details. We finally consider a general diffusion {Rt} on [0, oo) with drift µ(x) and variance a2 (x) at x. We assume that u(x) and a2 (x) are continuous with a2 (x) > 0 for x > 0. Thus, close to x {Rt} behaves as Brownian motion with drift µ = u(x) and variance a2 = a2(x), and in a similar spirit as in VII.3 we can define the local adjustment coefficient y(x) as the one -2µ(x)/a2(x) for the locally approximating Brownian motion. Let
s(y) = ef0 ry(.T)dx, S(x) = f x s(y)dy, S(oo) = f c s(y)dy. 0 0
(1.13)
The following results gives a complete solution of the ruin problem for the diffusion subject to the assumption that S(x), as defined in (1.13) with 0 as lower limit of integration, is finite for all x > 0. If this assumption fails, the behaviour at the boundary 0 is more complicated and it may happen, e.g, that 0(u), as defined above as the probability of actually hitting 0, is zero for all u > 0 but that nevertheless Rt ^4 0 (the problem leads into the complicated area of boundary classification of diffusions, see e.g. Breiman [78] or Karlin & Taylor [222] p. 226). Theorem 1.10 Consider a diffusion process {Rt} on [0, oo), such that the drift µ(x) and the variance a2(x) are continuous functions of x and that a2(x) > 0
1. RANDOM WALK; BROWNIAN MOTION; TWO BARRIERS
303
for x > 0. Assume further that S (x) as defined in (1.13) is finite for all x > 0. If (1.14) S(oo) < 00, then 0 < 2l.(u) < 1 for all u > 0 and (1 . 15)
^ S^ Conversely, if (1.14) fails, then,0(u) = 1 for all u > 0.
Lemma 1.11 Let 0 < b < u < a and let t&0,b(u) be the probability that {Rt} hits b before a starting from u. Then YIa,b(u) = S(a) - S(u) (1.16) S(a) - S(b) Proof Recall that under mild conditions on q, E„ q(Rdt) = q(u)+Lq(u)dt, where Lq(u) = 0'22u) q "(u) + p(u)q(u) is the differential operator associated with the diffusion. If b < u < a, we can ignore the possibility of ruin or hitting the upper barrier a before dt, so that Y)n,b('u) = Eu &0,b(Rdt), and we get Wo,b('u) = Eu , O,b (Rdt) = Oa,b(u) + L,ba,b(u)dt,
i.e LVa, b = 0. Using s'/ s = -2p/a2, elementary calculus shows that we can rewrite L as Lq(u)
d
1a2 (u)s(u)d
?
[ s (u)
] . (1.17)
Hence L,ba,b = 0 implies that VQ b/s is constant, i.e. Wa,b = a+/3S. The obvious boundary conditions '0a,b(b) = 1, 1', ,b(a) = 0 then yield the result. 0 Proof of Theorem 1.10. Letting b J. 0 in (1.16) yields 4b (u) = 1 - S(u)/S(a). Letting a T oo and considering the cases S(oo) = oo, S(oo) < oo separately ❑ completes the proof. Notes and references All material of the present section is standard. A classical reference for further aspects of Bernoulli random walks is Feller [142]. For generalizations of Proposition 1.6 to Markov-modulated models , see Asmussen & Perry [42]. Further references on two-barrier ruin problems include Dickson & Gray [116], [117]. A good introduction to diffusions is in Karlin & Taylor [222]; see in particular pp. 191-195 for material related to Theorem 1.10. In view of (1.16), the function S(x) is
304
CHAPTER XI. MISCELLANEOUS TOPICS
referred to as the natural scale in the general theory of diffusions (in case of integrability problems at 0, one works instead with a lower limit 5 > 0 of integration in (1.13)). Another basic quantity is the speed measure M , defined by the density 1/va(u)s(u) showing up in (1.17). Markov-modulated Brownian models , with the drift and the variance depending on an underlying Markov process , is currently an extremely active area of research; much of the literature dels with the pure drift case, correponding to piecewise linear paths or , equivalently, variance 0, which is motivated from the study of modern ATM (asynchronous transfer mode ) technology in telecommunications. The emphasis is often on stationary distributions , but by duality, information on ruin probabilities can be obtained . See Asmussen [20] and Rogers [305] for some recent treatments and references to the vast literature.
2 Further applications of martingales Consider the compound Poisson model with adjustment coefficient ry and the following versions of Lundberg 's inequality (see Theorems 111.5.2, 111 .6.3, IV.4.5): _ z/'(u) < e 7u,
(2.1)
C_e-7u < t(u) < C+e _7u,
(2.2)
where C_ =
B(x) _ B(x) sup 2no fy° e7(Y )B(dy)' f2e7(Y-2)B(dy)' C+ i/i(u, yu) '+/1(u) - t&(u, yu)
<
e-7yu,
1 y < k (y), (2.3)
< e -7yu, y > - (7) , (2.4)
where W (ay) = y, 7y = ay - ytc (ay). (2.5) A martingale proof of (2.1 ) was given already in II.1, and here are alternative martingale proofs of the rest . They all use the fact that ( tx(a) l ( e-aRt -
I.
= e-au + aSt-tx(a)
Lo
I.
Lo
is a martingale (cf. Remark 11.4.9 ) and optional stopping applied to the stopping time r(u) A T, yielding e-au = Ee- aRo - o•K(a) = Ee - aR,(,.)AT - (T(u)AT) r.(a)
(2.6)
2. FURTHER APPLICATIONS OF MARTINGALES
305
(we cannot use the stopping time r(u) directly because P(-r(u) = oo) > 0 and also because the conditions of the optional stopping time theorem present a problem). Proof of ( 2.2): As noted in Proposition II.1.1 , it follows easily from (2.6) with = 'y that e--yu
- E [e- 7R,( u ) I
T(U)
<
00]
.
Let H(dt, dr) denote the conditional distribution of (T(u), RT(u)_) given r(u) < oo. A claim leading to ruin at time t has c.d.f. (B(y) - B(r))/B(r), y > r, when Rt_ = r. Equivalently, -Rt has distribution B(r + dy)/B(r). Hence ^00 H( dt, dr
E [e-7Rr (u) Jr(u) < ool
JO Zoo
) f e7'B(r + dy) B(r) Jo
^00 ^00 H(dt, dr) e 7( y-r)B(dy) B(r) fr oo o 0 >
dr) 1 = 1 I0 /oH(dt, C+ C+
From this the upper inequality follows, and the proof of the lower inequality is similar. ❑ Proof of (2.3), (2.4): We take a = ay in (2.6). For (2.3), we have tc(ay) > 0 and we can bound (2.6) below by 1 E Le-7Rr(,.)-r(u)r.(ay)I T(u) < yu] P(r(u) < yu)
> e-yu"(ay ) iYj(u,yu)
(using RT(u) < 0), so that i/1(uL yu) < e-ayu , eyuk (ay) = e-7yu e
Similarly for (2.4), we have ic(ay ) < 0 and use the lower bound E [e-7Rr („)- T(u)K(ay) I yu < r(u) < T] F(yu < r(u) < T) > e- yuk (ay)(u&(u,T) -
V,(u,yu))•
Letting T -+ oo yield e_ayu
Notes and references
> e-yur4ay)(0(u) -
See II.1.
b(u, yu))•
CHAPTER XI. MISCELLANEOUS TOPICS
306
3 Large deviations The area of large deviations is a set of asymptotic results on rare event probabilities and a set of methods to derive such results. The last decades have seen a boom in the area and a considerable body of applications in queueing theory, not quite so much in insurance risk. The classical result in the area is Cramer's theorem. Cramer considered a random walk Sn = X1 + ... + X. such that the cumulant generating function r.(B) = log EeOX 1 is defined for sufficiently many 0, and gave sharp asymptotics for probabilities of the form P (S,,/n E I) for intervals I C R. For example, if x > EX1, then
C
S.
P
> ^ x e -nn 1 n 0o 2xn
(3.1)
where we return to the values of 0, ri, v2 later. The limit result (3.1) is an example of sharp asymptotics : - means (as at other places in the book) that the ratio is one in the limit (here n -* oo). However , large deviations results have usually a weaker form, logarithmic asymptotics , which in the setting of (3.1) amounts to the weaker statement lim 1 log P I Sn > x I = -17. n--roo n n ///
Note in particular that (3.1) does not capture the \ in (3. 1) but only the dominant exponential term - the correct sharp asymptotics might as well have +,3na with a < 1. Thus , large deviations results been, e.g., cle - nn or C2e-,?n typically only give the dominant term in an asymptotic expression . Accordingly, logarithmic asymptotics is usually much easier to derive than sharp asymptotics but also less informative . The advantage of the large deviations approach is, however , its generality, in being capable of treating many models beyond simple random walks which are not easily treated by other models , and that a considerable body of theory has been developed. og gn if For sequences fn, gn with fn -+ 0 , gn -4 0, we will write fn 1-
lim 109 fn = 1
n-ioo
log gn
(later in this section, the parameter will be u rather than n). Thus, (3.2) can be rewritten as F (Sn/n > x) 1-g a-'fin. Example 3.1 We will go into some more detail concerning (3.1), (3.2).
3. LARGE DEVIATIONS
307
Define rc* as the convex conjugate of rc, rc*(x) = sup(Ox - r.(0)) e (other names are the entropy, the Legendre-Fenchel transform or just the Legendre transform or the large deviations rate function). Most often, the sup in the definition of rc* can be evaluated by differentiation: rc*(x) = Ox - rc(0) where 0 = 0(x) is the solution of x = rc'(0), which is a saddlepoint equation - the mean rc'(0) of the distribution of X1 exponentially tilted with 0, i.e. of P(X1 E dx) = E[e9X1-K.(e)i XI E dx],
(3.3)
is put equal to x. In fact, exponential change of measure is a key tool in large deviations methods. Define ,q = rc* (x). Since
P
nn
>
x)
=
E {e_8
' ( 9).
S
rtn
> x
1,
replacing Sn in the exponent and ignoring the indicator yields the Chernoff bound P Sn > x 1 < e-°n (3.4) n Next, since Sn is asymptotically normal w.r.t. P with mean nx and variance no, 2 where o2 = o2(x) = rc"(0), we have
P(nx < Sn < nx + 1.960/) -* 0.425, and hence for large n P(Sn/n > x) >
>
E [e- 9S„+n' ( 9); nx < Sn < nx + 1.96o /]
0.4 e-nn +1.sseo f
which in conjunction with (3.4) immediately yields (3.2). More precisely, if we replace Sn by nx + o / V where V is N(0,1), we get P(Sn/n > x)
E [e-9nx +nK(9)-9" '; V > 0 e-9o^y
e- tin f o o')
= e-tin
1 Bo 27rn
1
1 e-y2/2 dy
21r
CHAPTER XI. MISCELLANEOUS TOPICS
308
which is the same as (3.1), commonly denoted as is the saddlepoint approximation. The substitution by V needs, however, to be made rigorous; see Jensen ❑ [215] or [APQ] p. 260 for details. Further main results in large deviations theory are the Gartner-Ellis theorem, which is a version of Cramer's theorem where independence is weakened to the existence of c(O) = limn,o log Ee9Sn /n, Sanov's theorem which give rare events asymptotics for empirical distributions, Mogulskii's theorem which gives path asymptotics, that is, asymptotics for probabilities of the form P ({S[nti/n}o
For the proof, we introduce a change of measure for X1, ... , Xn given by Fn(dxl,...,dxn) = 05n-Kn(7)Fn(dx1,...,dxn)
where Fn is the distribution of (X1i ... , Xn) and sn = x1 + • • • + xn (note that the r.h.s. integrates to 1 by the definition of Icn). We further write µ = tc'(ry). We shall need: Lemma 3 .3 For each i > 0, there exists z E (0, 1) and no such that Sn
n for n n0.
- p > 7 < zn,
>7 < zn
Pn Sn-1 - /^
3. LARGE DEVIATIONS
309
Proof Let 0 < 9 < e where a is as in Theorem 3.2. Clearly, P n(Sn/n > {c+77) < e
no(µ
+7)- r (7) +n)Enees n n = e- ne(µ +n)elcn(B
limsup 1 log Pn (Sn/n > µ + 17) < ic(9 + ry) - Bµ - 077 n-^oo n
and by Taylor expansion and (iv ), the r . h.s. is of order - 91) + o(O ) as 0 J. 0, in particular the r. h.s. can be chosen strictly negative by taking 9 small enough. This proves the existence of z < 1 and no such that Pn (Sn/n > µ.+r7) < zn for n > no. The corresponding claim for Pn(Sn/n < µ - 77) follows by symmetry (note that the argument did not use µ > 0). This establishes the first claim of the lemma , for Sn. For Sn-1i we have Fn(Sn -1/n > µ+r7) <
e-ne(µ+ 1?)EneeS„-1 = e-ne ( µ+n)EneeSn-eX„ e-no(µ +n) Ee(e+7)Sn -ex„ -wn (7)
< e- n e(µ +o)-w"(7) [Eep(B
+7)Sn]1 /p
[Ee-goX,.]1/q
= e- ne(p+ 17)- Kn(7)e'n (p(O +7))/p I Ee -geXn]1/q where we used Holder's inequality with 1/p+ 1/q = 1 and p chosen so close to 1 and 0 so close to 0 that j p(0 +,y) - 71 < e and jq9j < e. Since I Ee-qOX „ ] 1/q is bounded for large n by (ii), we get lim sup 1 log Pn (Sn-1 /n > µ + r7) < -0(1i + r7) + i(p(0 +'Y))/p n-+oo n
and by Taylor expansion, it is easy to see that the r.h.s. can be chosen strictly negative by taking p close enough to 1 and 0 close enough to 0. The rest of the argument is as before. ❑ Proof of Theorem 3.2. We first show that lim inf„_,,,,, log zl'(u)/u > -'y. Let r7 > 0 be given and let m = m(77) = [u(1 + 77)/µ] + 1. Then
V, ( U)
P(S,n > u ) =
Em 1e- -YS- +r-- W;
km e-7Sm+n.m(7). S. >
[
1
[em
mµ
+17] m(7);
Em
Sm > u]
S,n - > - µ?7
m µ 1 + rl
CHAPTER XI. MISCELLANEOUS TOPICS
310
>
]Em I e- YS +^c
IL exp
1-_
(7); I
`S- - I < µl1 I 1+77 M
1+277 S,n Yµ 1 + m + r ('Y) } U n \ 77
m µ
µ7 1
< 1+ 77 )
Here E,,,(•) goes to 1 by Lemma 3. 3, and since Ic,n(ry)/u -4 0andm/u-* (1 + r7)/µ, we get lum inf
z/i(u) +12r7 >_1-ry + 77
Letting r7 J. 0 yields liminfu __,,,. logO(u)/u > -ry. For lim supu,0 log i'(u )/u < -'y, we write 'i/I(u) _ E00P(T(u) = n) = Il + I2 + I3 + I4 n=1
where n(b) Lu(1-0/µJ Ii = 1: F(T(u) = n), I2 = F(T(u) = n), n=n(b)+1
n=1
Lu(1 +6) /µJ 13
00
P(T(u) n), 14 = = E Lu(1-6)/aJ+1 Lu(1+6)/µJ+l
F( T (u)
=
= n)
and n(S) is chosen such that icn('y )/n < 6 A (- log z) /2 and Sn Fn\
n
> la+ 8 I < zn >lb+S)
(3.6)
Pn \
for some z < 1 and all n > n(E); this is possible by (iii), (iv) and Lemma 3.3. Obviously,
P(T(u) = n) < P(Sn > u) = En [e-7S,.+wn(7); Sn > u] < e-Yu+Kn(7)pn(Sn > u) (3.7) so that
n(b) I1 < e-'Yu E en.(-Y), n=1
311
3. LARGE DEVIATIONS Lu(1-6)/µJ
I2
e'n(Y)P(Sn > u)
< e-"u n=n(6)+1
Lu(1-6)/µJ ^, e-ryu e-n logz/2p n nt n, -µ n=n(6)+1 \
<
1u(1-6)/µ1 00 1 zn < e-7u E Z n/2 < e--(U xn/2 E
e--Yu =
1 - zl/z
n=n(6)+1 n=0
[u(1 +6)/µJ
13
1u (1 +6) /µJ ekn(7)
E
C"
<
< e'
Lu(1-6)/µJ+l1
< e-7U
C
26u
+1 I
en6
Yu l
u(1-6)/lij+1
e6u(1+6)/µ
(3.10)
`p /
Finally, 00
I4
<
E F(Sn_1 < u, S. > u) Lu(1+6) /µJ +l 00 )^n 'YSn+kn (7) ; Sn-1 C U, Sn > U] [ e(u(1+6)/µJ+l
< e--Yu
e-n('Y ) fPn (u(1+6)/µJ+1
-7u r 00 e L^
1 n x n / 2x
(I Sn n 1 - ' 1 + b) e-7u x 1 /2
(3.11)
[u(1+6)/µJ+1 1 -
Thus an upper bound for z/'(u) is n(6) e-'Yu
eKn (7) + 2
1-
n=1
zl /2
+ (28U + 1) e6u(1+6)/µ Fi
and using (i), we get lim sup log u-/00
Letbl0.
O (U) < -y + b(1 + b) U
❑
CHAPTER XI. MISCELLANEOUS TOPICS
312
The following corollary shows that given that ruin occurs, the typical time is u/rc'(7) just as for the compound Poisson model, cf. IV.4. Corollary 3.4 Under the assumptions of Theorem 3.2, it holds for each b > 0 that 0(u) 1' g F(T(u) E (u(1 - b)/i(7), u(1 + b)/i(7))
Proof Since V,(u) = I1+I2+I3+I4'^ e-ry( u), 13 = P(T (u) E (u(1 -b)l^ (7),u(1+b)/rc'(7)), it suffices to show that for j = 1, 2, 4 there is an aj > 0 and a cj < oo such that Ij < c3e- 7' a-"ju. For 14, this is straightforward since the last inequality in (3.11 ) can be sharpened to x
[u(1+6)/µJ /2
4
1 - z 1/z
For I1, I2, we need to redefine n(b) as L,3ui where ,Q is so small that w = 1 - 4/3rc'(-y) > 0. For 12, the last steps of (3.9) can then be sharpened to x LQuJ /2
I2
< e-7u
1 - xl/2
to give the desired conclusion. For I,, we replace the bound P(Sn > u ) < 1 used in (3.8) by P(S,,
> u)
<
e-"'
E eIsn
=
e-ctueKn (a+'Y)-Kn(7)
where 0 < a < e and a is so small that r. (7 + a) < 2arc'(7). Then for n large, say n n1, we have rcn (a + 7) < 2n^c(7 + a) < 4narc' (7). Letting c11 = maxn
Il
Lou] E exp {-( 7 + a)u + Kn(a +7)} n=1
Lou] exp {-(-y + a)u} { 111 + exp {4narc'(7)} n=1
exp {-('y + a)u} c1 exp {4/3uarc'(7)} = clewhere a1 = aw.
ryue-«iu
, ❑
3.,-LARGE DEVIATIONS
313
Example 3 .5 Assume the Xn form a stationary Gaussian sequence with mean p < 0. It is then well-known and easy to prove that Sn has a normal distribution with mean np and a variance wn satisfying 00
i lim -wn = wz = Var(X1 ) + 2 E Cov(Xl, Xk+l)
n-aoo n
k=1
provided the sum converges absolutely. Hence z
z\
2
z
nr-n(9) _ n Cn0p+BZn/ -* ,(O) = 9µ+02 for all 9 E R, and we conclude that Theorem 3 . 2 is in force with -y = -2p/wz. 11
Inspection of the proof of Theorem 3.2 shows that the discrete time structure is used in an essential way. Obviously many of the most interesting examples have a continuous time scale. If {St}t> 0 is the claims surplus process, the key condition similar to (iii), (iv) becomes existence of a limit tc(9) of tct(9) _ log Ee8S° It and a y > 0 with a(y) = 0, r.'(-y) > 0. Assuming that the further regularity conditions can be verified, Theorem 3.2 then immediately yields the estimate log F( sup Skh > u) a-7u (3.12) k=0,1,...
for the ruin probability z/-'h(u) of any discrete skeleton {Skh}k=0,1,.... The problem is whether this is also the correct logarithmic asymptotics for the (larger) ruin probability O(u) of the whole process, i.e. whether P ( sup St > u ltg a ^" 0
(3.13)
One would expect this to hold in considerable generality, and in fact, criteria are given in Duffield & O'Connell [124]. To verify these in concrete examples may well present considerable difficulties, but nevertheless, we shall give two continuous time examples and tacitly assume that this can be done. The reader not satisfied by this gap in the argument can easily construct a discrete time version of the models! The following formula (3.14) is needed in both examples . Let {Nt}t>0 be a possibly inhomogeneous Poisson process with arrival rate ,3(s) at time s. An event occuring at time s is rewarded by a r.v. V(s) with m.g.f. 09(9). Thus the total reward in the interval [0, t] is
Rt = E V (Un) n: o„
CHAPTER XI. MISCELLANEOUS TOPICS
314 where the
an
are the event times. Then
rt
logEeOR° =
J0 /3(s)(^8(9) - 1) ds
(3.14)
(to see this , derive , e.g., a differential equation in t). Example 3.6 We assume that claims arrive according to a homogeneous Poisson process with intensity 0 , but that a claim is not settled immediately. More precisely, if the nth claim arrives at time a,, , then the payments from the company in [on, O'n +S] is a r .v. Un(s). Thus, assuming a continuous premium inflow at unit rate, we have
S, = U„ ( t - Q„) - t, n: o,
which is a shot-noise process. We further assume that the processes {U1(s)}8>0 are i. i.d., non-decreasing and with finite limits Un as s T oo ( thus, Un represents the total payment for the nth claim). We let ic (9) = 3(EeWU° - 1) - 0 and assume there are -y, e > 0 such that ic('y) = 0 and that r. (9) < oo for 9 < 'y + C. If the nth claim arrives at time Qn = s, it contributes to St by the amount Un(t - s). Thus by (3.14),
Kt (0)
t (Ee9U"it-8i - 1) ds - 9t = /3 t (Ee8U° i8l - 1) ds - 9t, J J0 0
and since EeOUn(8) -+ Ee°U^ as s -* oo, we have rct (9)/t -4 ic (9). Since the remaining conditions of Theorem 3.2 are trivial to verify, we conclude that Cu) log e-7 u (cf. the above discussion of discrete skeletons). It is interesting and intuitively reasonable to note that the adjustment coefficient ry for the shot - noise model is the same as the one for the Cramer -Lundberg model where a claim is immediately settled by the amount Un. Of course, the Cramer- Lundberg model has the larger ruin probability. 0 Example 3 . 7 Given the safety loading 77, the Cramer-Lundberg model implicitly assumes that the Poisson intensity /3 and the claim size distribution B (or at least its mean µB) are known. Of course , this is not realistic . An apparent solution to this problem is to calculate the premium rate p = p(t) at time t based upon claims statistics . Most obviously, the best estimator of /3µB based upon Ft-, where Ft = a(A8 : 0 < s < t), At = ;'`1 U;, is At - It. Thus, one would take p(t) = (1 + rt)At-/ t, leading to
St = At-(1+77)
Jo t
S8 ds.
(3.15)
315
3. LARGE DEVIATIONS With the Qi the arrival times, we have Nt
St =
Ui
t
N.
Ui
- (1 +i) f > i= 1
i=1 o
Nt
/
t
1 - (1 + r7) log t (3.16)
ds = E Ui
01i
s
i=1
Let ict (a) = log Eeast . It then follows from (3.14) that
rt _ 13
(a [1_( i+77)log]) ds_flt = t (a)
Jo
K(a)
_
o 1 O (a[I + (1 + 77) log u]) du
(3.17)
-)3. (3.18)
f
Thus (iii) of Theorem 3.2 hold, and since the remaining conditions are trivial to verify, we conclude that t,b(u) IN a-'Yu (cf. again the above discussion of discrete skeletons) where y solves ic('y) = 0
It is interesting to compare the adjustment coefficient y with the one y* of the Cramer-Lundberg model, i.e. the solution of (3.19)
/3(Eelu - 1) - (1 + 17)0µB = 0. Indeed, one has y
>
y'
(3.20)
with equality if and only if U is degenerate. Thus, typically the adaptive premium rule leads to a ruin probability which is asymptotically smaller than for the Cramer-Lundberg model . To see this , rewrite first rc as
te(a) _ /3E
1
eau 1 - /3. (3.21)
1 +(1+77)aUJ
This follows from the probabilistic interpretation Si
EN '1 Yi where
Yi = Ui( 1+(1 +r7)log ©i) = Ui(1-(1 +17)Vi) where the Oi are i .i.d. uniform (0,1) or , equivalently, the Vi = - log Oi are i.i.d. standard exponential , which yields eau f 1 t(1+n )audtl = E r Ee°Y = E [O(1+n)aueaul = E [eau J L Jo J L1+(l+r))aUJ
316
CHAPTER XI. MISCELLANEOUS TOPICS
Next, the function k(x) = e7*x - 1 - (1 + ri)y*x is convex with k(oo) = 00, k(0) = 0, k'(0) < 0, so there exists a unique zero xo = xo(r7) > 0 such that k(x) > 0, x > x0, and k(x) < 0, 0 < x < x0. Therefore
e7'U _ k(U) E [1+(1+77)y*U] - 1 E [1+(1+77)y*U] 0 k (+ *y B(+ 1 + (1(+71)y*y B(dy) L xa 1 +
f
+ (1 + rl) Y* xo jJxo k(y) B(dy
) + f' k(y) B(dy) } = 0,
using that Ek(U) = 0 because of (3.19). This implies n(y*) < 0, and since tc(s), a* (s) are convex with tc'(0) < 0 , rc*' (0 ) < 0, this in turn yields y > y*. Further, y = y* can only occur if U - xo. 11 Notes and references Some standard textbooks on large deviations are Bucklew [81], Dembo & Zeitouni [105] and Shwartz & Weiss [339]. In addition to Glynn & Whitt [163], see also Nyrhinen [275] for Theorem 3.2. For Example 3.7, see Nyrhinen [275] and Asmussen [25]; the proof of (3.20) is due to Tatyana Turova.
Further applications of large deviations idea in risk theory occur in Djehiche [122], Lehtonen & Nyrhinen [244], [245], Martin-L6f [256], [257] and Nyrhinen [275].
4 The distribution of the aggregate claims We study the distribution of the aggregate claims A = ^N' U; at time t, assuming that the U; are i.i.d. with common distribution B and independent of Nt. In particular, we are interested in estimating P(A > x) for large x. This is a topic of practical importance in the insurance business for assessing the probability of a great loss in a period of length t, say one year. Further, the study is motivated from the formulas in IV.2 expressing the finite horizon ruin probabilities in terms of the distribution of A.
The main example is Nt being Poisson with rate fit. For notational simplicity, we then take t = 1 so that p,, = P(N = n) = e-(3an
However, much of the analysis carries over to more general cases, though we do not always spell this out.
4. THE DISTRIBUTION OF THE AGGREGATE CLAIMS
317
4a The saddlepoint approximation We impose the Poisson assumption (4.1). Then Ee"A = e'(") where x(a) _ 0(B[a] - 1). The exponential family generated by A is given by Pe(A E dx) = E [eeA -K(9); A E dx] . In particular, no(a) = logE9e'A = rc(a + 9) - ic(9) = ,3e(bo[a] - 1) where )30 = ,3B[9] and Be is the distribution given by
eox B9(dx) = B [9] B(dx). This shows that the Pe-distribution of A has a similar compound Poisson form as the F-distribution, only with 0 replaced by a9 and B by B9. The analysis largely follows Example 3.1. For a given x, we define the saddlepoint 9 = 9(x) by EBA = x, i.e. K'(0) _ ic'(9) = x. Proposition 4.1 Assume that lim8T8. B"[s] = oo, B"' [s] lim (B",[s])3/2 = 0, 818' where s' = sup{s : B[s] < oo}. Then as x -* oo, e-9x+K(°) P(A > x)
B 2ir /3 B" [9] Proof Since EBA = x, Vare(A) = s;"(0) = ,3B"[9], (4.2) implies that the limiting Pe-distribution of (A - x)//3B"[9] is standard normal. Hence
P(A > x) = E e [e-9A+ ic(9); A > x)] = e-ex+K( e)E9 [e - 9(A-x); A > x)
x)]]
e-ex+K(e ) e-e AB°[ely 1 e-v2/2 dy 0 2^ 00 -9x+p(e) e e-ze-z2/(2BZpB „[9)) dz 9 27r/3B" [9] fo e-ex+w ( e)
J
oo z
e-ex+w(B)
dz
0 27r /3B" [9] o e 9 2 /3B" [9]
318
CHAPTER XI. MISCELLANEOUS TOPICS
It should be noted that the heavy-tailed asymptotics is much more straightforward. In fact, just the same dominated convergence argument as in the proof of Theorem 2.1 yields: Proposition 4.2 If B is subexponential and EzN < oo for some z > 1, then P(A > x) - EN B(x). Notes and references Proposition 4.1 goes all the way back to Esscher [141], and (4.2) is often referred to as the Esscher approximation. The present proof is somewhat heuristical in the CLT steps. For a rigorous proof, some regularity of the density b(x) of B is required. In particular, either of the following is sufficient: A. b is gamma-like, i.e. bounded with b(x) - ycix °-ie-6x B. b is log-concave, or, more generally, b(x) = q(x)e-h(z), where q(x) is bounded away from 0 and oo and h (x) is convex on an interval of the form [xo,x') where x' = sup {x : b(x) > 0}. Furthermore 00 b(x)Sdx < oo for some ( E (1, 2). For example, A covers the exponential distribution and phase-type distributions, B covers distributions with finite support or with a density not too far from a-x° with a > 1. For details, see Embrechts et al. [138], Jensen [215] and references therein.
4b The NP approximation In many cases , the distribution of A is approximately normal . For example, under the Poisson assumption (4.1), it holds that EA = ,l3pB, Var(A) _ ^3p.2i and that (A - (3µB)/(0µB^))1/2 has a limiting standard normal distribution as Q -^ oo, leading to
P(A > x) :; 1 - (D X - Q{AB
(4.3)
The result to be surveyed below improve upon this and related approximations by taking into account second order terms from the Edgeworth expansion. Remark 4 . 3 A word of warning should be said right away : the CLT (and the Edgeworth expansion) can only be expected to provide a good fit in the center of the distribution . Thus , it is quite questionable to use (4.3) and related results ❑ for the case of main interest , large x. The (first order) Edgeworth expansion states that if the characteristic function g(u) = Ee"`}' of a r.v. Y satisfies 9(u) ti e-u2/2(1 + ibu3) (4.4)
4. THE DISTRIBUTION OF THE AGGREGATE CLAIMS
319
where b is a small parameter, then P(Y < y) 4(y) - 6(1 - y2)^P(y)• Note as a further warning that the r.h.s. of (4.5) may be negative and is not necessarily an increasing function of y for jyj large. Heuristically, (4.5) is obtained by noting that by Fourier inversion, the density of Y is
9(y) =
°° _1 e-iuy f(u) du 2x _. f °o
1 e-'uye -u2/2(1 + iSu3) du 27r
_ cc(y) - 5 (y3 - 3& (y), and from this (4.5) follows by integration. In concrete examples , the CLT for Y = Y6 is usually derived via expanding the ch.f. as u2 u3 u4 9(u) = Ee'uY = exp {iuci - 2X2 - 2K3 + 4i 64 + .. . where Kl , ,c2i... are the cumulants ; in particular, s;l = EY,
K2 = Var (Y), K3 = E(Y - EY)3.
Thus if EY = 0, Var(Y) = 1 as above , one needs to show that 163, K4 .... are small. If this holds , one expects the u3 term to dominate the terms of order u4, u5, ... so that 1(u)
3 exp { - 2 2 - i 3 K3 } Pt^ exp - 2 ^ \1 - i 6 r 1 3
so that we should take b = -ic3/6 in (4.5). Rather than with the tail probabilities F(A > x), the NP (normal power) approximation deals with the quantile al_E, defined as the the solution of P(A < yl-e) = 1 - e. A particular case is a.99, which is often denoted VaR (the Value at Risk). Let Y = (A - EA)/ Var(A) and let yl_E, zl_e be the 1 - e-quantile in the distribution of Y, resp. the standard normal distribution. If the distribution of Y is close to N(0,1), yl-E should be close to zl_E (cf., however, Remark 4.3!), and so as a first approximation we obtain
a1_E = EA + yl-e Var(A) .: EA + zl_E Var(A) . (4.6)
CHAPTER XI. MISCELLANEOUS TOPICS
320
A correction term may be computed from (4.5) by noting that the 4;(y) terms dominate the S(1 - y2)cp( y) term. This leads to
1-
-t( yl -E) - 6 (1 - yi- E)A1✓ l -E)
E
4)(yl -E)
- 5(1 - zl-
E)^o(zl -E)
^' ..
4(z1-E) + ( yl-E - zl -E)V(zl_E) - S(1 - zl-E)W(zl-E)
=
1 - E + (yl- E - zl-E )w(zl _E)
- S(1 - zi- E )Azl -E)
which combined with S = -EY3/6 leads to
1
q^
Y1 - E = z1-E + S(zi_E
Using Y = (A - EA ) /
- 1)EY3.
Var(A), this yields the NP approximation
6(Z1 _E - 1) E (A - EA)3 a1_E = EA + z1_E(Var (A))1/2 + 1
Var(A)
Under the Poisson assumption (4.1), the kth cumulant of A is /3PBk' and so s;k = /3µB^1 / (,6pBki) d/2. In particular , k3 is small for large /3 but dominates 1c4, K5 .... as required . We can rewrite (4.7) as 1 (3) a1-E = Qµa +z1 - E(/3PB^1 )1^2 + s(z1-E - 1)^ 2) µ'E
Notes and references We have followed largely Sundt [354]. Another main reference is Daykin et at. [101]. Note, however, that [101] distinguishes between the NP and Edgeworth approximations.
4c Panjer 's recursion Consider A =
EN 1 U%, let pn
= P(N = n), and assume that there exist
constants a, b such that
Pn =
(a+
n
) Pn_i , n = 1, 21 ....
For example, this holds with a = 0, b = /3 for the Poisson distribution with rate /3 since Pn
^e-Q
,3n-i /3
= -Pn-1 n! n (n - 1)! n
4. THE DISTRIBUTION OF THE AGGREGATE CLAIMS
321
Proposition 4.4 Assume that B is concentrated on {0, 1, 2,.. .} and write gj = 2 , . . fj = P(A = j), j = 0,1..... Then fo = >20 9onpn and
fi = 1 E
(a + b!) 1-ag k_1 3
gkfj- k
, j = 1, 2, ... .
(4.10)
In particular, if go = 0, then j
f o = po, fj = E (a+ b
)9kfi_k
, j = 1, 2.....
(4.11)
k =1
Remark 4.5 The crux of Proposition 4.4 is that the algorithm is much faster than the naive method, which would consist in noting that (in the case go = 0)
fj = pn9jn
(4.12)
n=1
where g*n is the nth convolution power of g, and calculating the gj*n recursively by 9*1 = 9j, j-1
g; n =
9k(n-1 )9j -k •
(4.13)
k=n-1
Namely, the complexity (number of arithmetic operations required) is O(j3) for (4.12), (4.13) but only O(j2) for Proposition 4.4. ❑ Proof of Proposition 4.4. The expression for fo is obvious. By symmetry, E[a +bU=I >Ui =j l
(4.14)
i=1 J
is independent of i = 1, . . . , n. Since the sum over i is na + b, the value of (4.14) is therefore a + b/n. Hence by (4.9), (4.12) we get for j > 0 that
fj
a
-
b +
n
*n
n p
n-lgj
00 U I n *n = E a+b-1 Ui=j pn-19j n=1
j
i=1
CC)
EE n=1 Ia
n
+b Ul i=1
Ui
=j pn_1
CHAPTER XI. MISCELLANEOUS TOPICS
322 00 J
EE (a + bk I gkg3 _ k lien-i n=ik=0 (a+bk l gkE g j'`kpn = E (a+b!)9kfi_k n=0 k=0 k=0 ^I 1 E(a+b. agofj+ k Jgkfj-k, k=i /
and (4.9) follows . (4.11) is a trivial special case. and
❑
If the distribution B of the Ui is non-lattice , it is natural to use a discrete approximation . To this end, let U(;+, U(h) be U; rounded upwards, resp. downwards , to the nearest multiple of h and let A}h) = EN U. An obvious modification of Proposition 4.4 applies to evaluate the distribution F(h) of A(h) letting f( ) = P(A() = jh) and
g(h)
= P (U(h2 = kh) = B((k + 1)h) - B(kh ), k = 0, 1, 2, ... ,
gkh+
= P (U4;+ = kh) = B(kh) - B (( k - 1)h) = gk - l,-, k = 1, 2, ... .
Then the error on the tail probabilities (which can be taken arbitrarily small by choosing h small enough ) can be evaluated by 00
00
f! h)
< P(A > x ) f (h) j=Lx/hl j=Lx/hl Further examples ( and in fact the only ones , cf. Sundt & Jewell [355]) where (4.9) holds are the binomial distribution and the negative binomial (in particular, geometric ) distribution . The geometric case is of particular importance because of the following result which immediately follows from by combining Proposition 4.4 and the Pollaczeck-Khinchine representation: Corollary 4.6 Consider a compound Poisson risk process with Poisson rate 0 and claim size distribution B. Then for any h > 0, the ruin probability zb(u) satisfies 00 00 f^,h) Cu) < E ff,+, j=Lu/hJ j=Lu/hJ
(4.15)
5. PRINCIPLES FOR PREMIUM CALCULATION
323
where f^ +, f^ h) are given by the recursions (h)
3 (h)
(h)
fj,+ = P 9k fj-k,+ ' I = 17 2, .. . k=1
(h)
=
P
3 (h)
(h)
f9,- - (h) gk,-fA-k,- e 1 - ago,- k=1
j = 1+2,
starting from fo + = 1 - p, f(h) 07 = (1 - p)/(1 - pgoh-) and using g(kh)
1 (k+1)h
=
Bo((k + 1 ) h) - Bo(kh ) = - f AB
gkh+
Bo(kh ) - Bo((k - 1 ) h) = 9kh)1 ,
B(x) dx, k = 0, 1, 2, ... ,
kh
k = 1,2 .....
Notes and references The literature on recursive algorithms related to Panjer's recursion is extensive, see e.g. Dickson [115] and references therein.
5 Principles for premium calculation The standard setting for discussing premium calculation in the actuarial literature does not involve stochastic processes, but only a single risk X > 0. By this we mean that X is a r.v. representing the random payment to be made (possibly 0). A premium rule is then a [0, oo)-valued function H of the distribution of X, often written H(X), such that H(X) is the premium to be paid, i.e. the amount for which the company is willing to insure the given risk. The standard premium rules discussed in the literature (not necessarily the same which are used in practice!) are the following: The net premium principle H(X) = EX (also called the equivalence principle). As follows from the fluctuation theory of r.v.'s with mean, this principle will lead to ruin if many independent risks are insured. This motivates the next principle, The expected value principle H(X) = (1 + 77)EX where 77 is a specified safety loading. For 77 = 0, we are back to the net premium principle. A criticism of the expected value principle is that it does not take into account the variability of X which leads to The variance principle H(X) = EX+77Var(X). A modification (motivated from EX and Var(X) not having the same dimension) is
324
CHAPTER XI. MISCELLANEOUS TOPICS
The standard deviation principle H(X) = EX +rl
Var(X).
The principle of zero utility. Here v(x) is a given utility function, assumed to be concave and increasing with (w.lo.g) v(O) = 0; v(x) represents the utility of a capital of size x . The zero utility principle then means v(0) = Ev (H(X) - X);
(5.1)
a generalization v(u) = Ev (u + H(X) - X ) takes into account the initial reserve u of the company. By Jensen 's inequality, v(H(X) - EX) > Ev(H(X) - X) = 0 so that H(X) > EX. For v(x) = x, we have equality and are back to the net premium principle. There is also an approximate argument leading to the variance principle as follows. Assuming that the Taylor approximation
v(H(X) - X) ^ 0 +v'(0)(H (X) - X) + v 0 (H(X) - X)2 ,/2 is reasonable , taking expectations leads to the quadratic v"H(X )2 + H(X) (2v' - 2v"EX) + v"EX2 - 2v'EX = 0 (with v', v" evaluated at 0) with solution
H(X)=EX-v^±V(- ^ )2-Var(X). Write ( vI ) 2
\
-Var(X) v^ - 2v^Var(X)/ I - (
, Var(X) )2
If v"/v' is small, we can ignore the last term. Taking +f then yields H(X) ,:: EX -
2v'(0) VarX;
since v"(0) < 0 by concavity, this is approximately the variance principle. The most important special case of the principle of zero utility is The exponential principle which corresponds to v(x) = (1 - e-6x)/a for some a > 0. Here (5.1) is equivalent to 0 = 1 - e-0H(X)EeaX, and we get
H(X) = 1 log Ee 0X . a
325
5. PRINCIPLES FOR PREMIUM CALCULATION
Since m.g.f.'s are log-concave, it follows that H,, (X) = H(X) is increasing as function of a. Further, limQyo Ha (X) = EX (the net premium princiHa (X) = b (the premium ple) and, provided b = ess supX < oo, lim,, but is clearly not maximal loss principle is called the H(X) = b principle very realistic). In view of this, a is called the risk aversion The percentile principle Here one chooses a (small ) number a, say 0.05 or 0.01, and determines H(X) by P(X < H(X)) = 1 - a (assuming a continuous distribution for simplicity). Some standard criteria for evaluating the merits of premium rules are 1. 77 > 0, i .e. H(X) > EX. 2. H(X) < b when b (the ess sup above ) is finite 3. H(X + c) = H(X) + c for any constant c
4. H(X + Y) = H(X) + H(Y) when X, Y are independent 5. H(X) = H(H(XIY)). For example , if X = EN U= is a random sum with the U; independent of N, this yields
H
C^
U; I = H(H(U)N)
(where, of course, H(U) is a constant). Note that H(cX) = cH(X) is not on the list! Considering the examples above, the net premium principle and the exponential principle can be seen to the only ones satisfying all five properties. The expected value principle fails to satisy, e.g., 3), whereas (at least) 4) is violated for the variance principle, the standard deviation principle, and the zero utility principle (unless it is the exponential or net premium principle). For more detail, see e.g. Gerber [157] or Sundt [354]. Proposition 5.1 Consider the compound Poisson case and assume that the premium p is calculated using the exponential principle with time horizon h > 0. That is, N,,
Ev I P - E U; i =1
= 0 where
v(x) = 1(1 - e-°x a
Then ry = a, i.e. the adjustment coefficient 'y coincides with the risk aversion a.
326
CHAPTER XI. MISCELLANEOUS TOPICS
Proof The assumption means
0 a (1 - e-areo (B[a1-1)
l
i.e. /3(B[a] - 1) - ap = 0 which is the same as saying that a solves the Lundberg ❑ equation. Notes and references The theory exposed is standard and can be found in many texts on insurance mathematics, e.g. Gerber [157], Heilman [191] and Sundt [354]. For an extensive treatment, see Goovaerts et al. [165].
6 Reinsurance Reinsurance means that the company (the cedent) insures a part of the risk at another insurance company (the reinsurer). Again, we start by formulation the basic concepts within the framework of a single risk X _> 0. A reinsurance arrangement is then defined in terms of a function h(x) with the property h(x) < x. Here h(x) is the amount of the claim x to be paid by the reinsurer and x - h(x) by the the amount to be paid by the cedent. The function x - h(x) is referred to as the retention function. The most common examples are the following two: Proportional reinsurance h(x) = Ox for some 0 E (0, 1). Also called quota share reinsurance. Stop-loss reinsurance h(x) = (x - b)+ for some b E (0, oo), referred to as the retention limit. Note that the retention function is x A b. Concerning terminology, note that in the actuarial literature the stop-loss transform of F(x) = P(X < x) (or, equivalently, of X), is defined as the function
b -* E(X - b)+ =
f
(s - b)F(dx) _ f
(x) dx.
6 00
An arrangement closely related to stop-loss reinsurance is excess-of-loss reinsurance, see below. Stop-loss reinsurance and excess-of-loss reinsurance have a number of nice optimality properties. The first we prove is in terms of maximal utility: Proposition 6.1 Let X be a given risk, v a given concave non-decreasing utility function and h a given retention function. Let further b be determined by E(X b)+ = Eh(X). Then for any x,
Ev(x - {X - h(X)}) < Ev(x - X A b).
327
6. REINSURANCE
Remark 6 .2 Proposition 6.1 can be interpreted as follows. Assume that the cedent charges a premium P > EX for the risk X and is willing to pay P1 < P for reinsurance. If the reinsurer applies the expected value principle with safety loading q, this implies that the cedent is looking for retention functions with Eh(X) = P2 = P1/(1 + 77). The expected utility after settling the risk is thus
Ev(u + P - P1 - {X - h(X)}) where u is the initial reserve . Letting x = u + P - P1, Proposition 6.1 shows that the stop-loss rule h (X) = (X - b)+ with b chosen such that E(X - b)+ ❑ = P2 maximizes the expected utility. For the proof of Proposition 6.1, we shall need the following lemma: Lemma 6 .3 (OHLIN'S LEMMA) Let X1, X2 be two risks with the same mean, such that Fj(x) < F2 (x), x < b, Fi(x) ? F2(x), x > b for some b where Fi(x) = P(Xi < x). Then Eg(X1) < g(X2) for any convex function g. Proof Let Yi=XiAb, Zi=Xivb.
Then P(Yl < x) _ Fi(x) <_ F2 (x) = P(Y2 < x) x < b 1=P(Y2<x) x>b so that Y1 is larger than Y2 in the sense of stochastical ordering . Similarly, P(Zl < x) _
0 = P(Z2 < x) x < b Fi(x) > F2(x) = P(Z2 < x) x > b
so that Z2 is larger than Zl in stochastical ordering. Since by convexity, v(x) = g(x) - g(b) - g'(b)(x - b) is non-increasing on [0, b] and non-decreasing on [b, oo), it follows that Ev(Y1) < Ev(Y2), Ev(Zi) < Ev(Z2). Using v(Yi) + v(Zi) = v(Xi), it follows that
0 < Ev(X2) - Ev(Xi) = Eg(X2) - Eg(X1), using EX1 = EX2 in the last step.
❑
Proof of Proposition 6.1. It is easily seen that the asssumptions of Ohlin' s lemma hold when X1 = X A b, X2 = X - h(X); in particular, the requirement EX1
CHAPTER XI. MISCELLANEOUS TOPICS
328
= EX2 is then equivalent to E(X - b)+ = Eh(X). Now just note that -v is convex. ❑ We now turn to the case where the risk can be written as N
X = Ui i=1
with the Ui independent; N may be random but should then be independent of the Ui. Typically, N could be the number of claims in a given period, say a year, and the Ui the corresponding claim sizes. A reinsurance arrangement of the form h(X) as above is called global; if instead h is applied to the individual claims so that the reinsurer pays the amount EN h(Ui), the arrangement is called local (more generally, one could consider EN hi(Ui) but we shall not discuss this). The following discussion will focus on maximizing the adjustment coefficient. For a global rule with retention function h* (x) and a given premium P* charged for X - h* (X), the cedents adjustment coefficient -y* is determined by
(6.2)
1 = Eexp {ry*[X - h*(X) - P*]}, for a local rule corresponding to h(u) and premium P for X look instead for the ry solving
[ X_P_^
J _f
1 = Eexp
N 1 h (Ui), we
[ Ei [U - h(Ui)] -P
= Eexp{ry
h(Ui)]
J}
l (6.3) This definition of the adjustment coefficients is motivated by considering ruin at a sequence of equally spaced time points, say consecutive years, such that N is the generic number of claims in a year and P, P* the total premiums charged in a year, and referring to the results of V.3a. The following result shows that if we compare only arrangements with P = P*, a global rule if preferable to a local one. Proposition 6.4 To any local rule with retention function h(u) and any N
P > E X - N h(Ui) 4 =1
there is a global rule with retention function h* (x) such that N
Eh*(X) = Eh(U1) i=1
and 'y* > ry where ry* is evaluated with P* = P in (6.3).
(6.4)
6. REINSURANCE
329
Proof Define N
h* (x) = E > h(Ui) X = x ; then (6.5) holds trivially. Applying the inequality Ecp(Y ) > EW(E (YIX )) (with W convex ) to W(y ) = eryy, y = Ei [Ui - h(Ui)] - P, we get N
1 = Eexp
ry E[Ui i-i
- h(Ui)] - P
> EexP{7[X - h * (X) - P]}.
But since ry > 0, ry* > 0 because of (6.4), this implies 7* > 7.
❑
Remark 6.5 Because of the independence assumptions , expectations like those in (6.3), (6.4), (6.5) reduce quite a lot. Assuming for simplicity that the Ui are i.i.d., we get EX = EN • EU, N
E X - h( UU) = EN • E[U - h(U)],
Eexp
7 [E ' [Ui - h(Ui)] - P I = EC [7]N,
(6.6)
i-i where C[ry] = Ee'r(u-4(u)), and so on.
❑
The arrangement used in practice is, however, as often local as global. Local reinsurance with h(u) = (u - b)+ is referred to as excess-of-loss reinsurance and plays a particular role: Proposition 6.6 Assume the Ui are i. i.d. Then for any local retention function u - h(u) and any P satisfying (6.4), the excess -of-loss rule hl (u) = (u - b)+ with b determined by E(U - b)+ = Eh(U) (and the same P) satisfies 71 > ry. Proof As in the proof of Proposition 6.4, it suffices to show that
Eexp
'UiAb- P } < 1 = Eexp E[Ui- h(Ui)-P JJJ l:='l {ry
{ry i-i
]
or, appealing to (6.6), that 01[ry] < 0[-y] where 0[-y] = Ee'r(U^') . This follows by taking Xl = U A b, X2 = U - h(U) (as in the proof of Proposition 6.4) and ❑ g(x) = e7x in Ohlin's lemma.
330
CHAPTER XI. MISCELLANEOUS TOPICS
Notes and references The theory exposed is standard and can be found in.many texts on insurance mathematics, e.g. Bowers et at. [76], Heilman [191] and Sundt [354]. See further Hesselager [194] and Dickson & Waters [120]. The original reference for Ohlin's lemma is Ohlin [277]. The present proof is from van Dawen [99]; see also Sundt [354].
Appendix Al Renewal theory la Renewal processes and the renewal theorem By a simple point process on the line we understand a random collection of time epochs without accumulation points and without multiple points. The mathematical representation is either the ordered set 0 < To < T1 < ... of epochs or the set Y1, Y2, ... of interarrival times and the time Yo = To of the first arrival (that is, Y,, = T„ - T„_1). The point process is called a renewal process if Yo, Y1, ... are independent and Y1, Y2, ... all have the same distribution, denoted by F in the following and referred to as the interarrival distribution; the distribution of Yo is called the delay distribution. If Yo = 0, the renewal process is called zero-delayed. The number max k : Tk_j < t of renewals in [0, t] is denoted by Nt. The associated renewal measure U is defined by U = u F*" where F*" is the nth convolution power of F. That is, U(A) is the expected number of renewals in A C R in a zero-delayed renewal process; note in particular that U({0}) = 1. The renewal theorem asserts that U(dt) is close to dt/µ, Lebesgue measure dt normalized by the mean to of F, when t is large. Technically, some condition is needed: that F is non-lattice, i.e. not concentrated on {h, 2h,.. .} for any h > 0. Then Blackwell 's renewal theorem holds, stating that U(t+a)-U (t) -^ a, t -00
(A.1)
(here U(t) = U([0, t]) so that U(t + a) - U(t) is the expected number of renewals in (t, t +a]). If F satisfies the stronger condition of being spread-out (F*' is nonsingular w .r.t. Lebesgue measure for some n > 1), then Stone 's decomposition holds : U = U, + U2 where U1 is a finite measure and U2(dt) = u(t)dt where 331
332
APPENDIX
u(t) has limit 1/µ as t -4 oo. Note in particular that F is spread-out if F has a density f. A weaker (and much easier to prove) statement than Blackwell's renewal theorem is the elementary renewal theorem, stating that U(t)/t --> 1/p. Both result are valid for delayed renewal processes, the statements being
EN(t + a) - EN(t) - a, resp.
ENt -4 1
lb Renewal equations and the key renewal theorem The renewal equation is the convolution equation U Z(u - x)F(dx),
Z(u) = z(u) +
(A.2)
f where Z(u) is an unknown function of u E [0 , oo), z(u) a known function, and F(dx) a known probability measure . Equivalently, in convolution notation Z = z + F * Z. Under weak regularity conditions (see [APQJ Ch. IV), (A.2) has the unique solution Z = U * z, i.e.
Z(u) =
J0 u z(x)U(dx).
(A.3)
Further, the asymptotic behavior of Z(u) is given by the key renewal theorem: Proposition A1.1 if F is non-lattice and z (u) is directly Riemann integrable (d.R.i.; see [APQ] Ch. IV), then
Z(u) -i
f0
z(x)dx .
(A.4)
µF
If F is spread- out, then it suffices for (A.4) that z is Lebesgue integrable with limZ.i". z(x) = 0. In 111.9, wee shall need the following less standard parallel to the key renewal theorem: Proposition A1.2 Assume that Z solves the renewal equation (A.2), that z(u) has a limit z(oo) (say) as u -4 oo, and that F has a bounded density2. Then
Z(u) -4 z(oo), u
-4 00.
u PF
2This condition can be weakened considerably , but suffices for the present purposes
(A.5)
APPENDIX
333
Proof The condition on F implies that U(dx) has a bounded density u(x) with limit 1/µF as x -* oo. Hence by dominated convergence, Z(u) U
=
1 u 1 u f z(u - x)u(x) dx = z(u( 1 - t))u(ut) dt 0 0
J
f z(oo) • 1 dt = z(OO). 0 PF µF 11
In risk theory, a basic reason that renewal theory is relevant is the renewal equation II.(3.3) satisfied by the ruin probability for the compound Poisson model. Here the relevant F does not have mass one (F is defective). However, asymptotic properties can easily be obtained from the key renewal equation by an exponential transformation also when F(dx) does not integrate to one. To this end, multiply (A.2) by e7x to obtain Z = z +P * Z where Z(x) = e'Y'Z(x), z(x) = e7xz(x), F(dx) = e7xF(dx). Assuming that y can be chosen such that f °° Ox F(dx) = 1, i.e. that F is a probability measure, results from the case fo F(dx) = 1 can then be used to study Z and thereby Z. This program has been carried out in III.5a. Note, however, that the existence of y may fail for heavy-tailed F.
1c Regenerative processes Let {T,,} be a renewal process. A stochastic process {Xt}t>0 with a general state space E is called regenerative w.r.t. {Tn} if for any k, the post-Tk process {XT,k+t }t>o is independent of To, T1,. .. , Tk (or, equivalently, of Yo, Y1 , . . • , Yk ), and its distribution does not depend on k. The distribution F of Y1, Y2.... is called the cycle length distribution and as before, we let µ denote its mean. We let FO, Eo etc. refer to the zero-delayed case. The simplest case is when {Xt} has i.i.d. cycles. The kth cycle is defined as {XTk+t}o
334
APPENDIX
Proposition A1.3 Consider a regenerative process such that the cycle length distribution is non-lattice with p < oo. Then Xt -Di X,,,, where the distribution of X,,,, is given by Eg(Xoo) = 1 E0 f Ylg (Xt)dt. µ 0
(A.6)
If F is spread-out, then Xt - + X,, in total variation.
id Cumulative processes Let {Tn} be a renewal process with i.i.d. cycles (we allow a different distribution of the first cycle). Then {Zt}t^,0 is called cumulative w.r.t. {Tn} if the processes {ZT +t - ZT }0
then Zt /t a$• EU1/µ; (b) If in addition Var(Ul ) < oo, then (Zt - tEU1/µ)/f has a limiting normal distribution with mean 0 and variance Var(Ui) + (!)2Var (Yi)_ 2EU1 Cov(U1, Y1)
le Residual and past lifetime Consider a renewal process and define e ( t) as the residual lifetime of the renewal interval straddling t, i.e. fi (t) = inf {Tk - t : t < Tk}, and q(t) = sup It - Tk : t < Tk} as the age. Then {e(t)}, {i7(t)} are Markov with state spaces (0, oo), resp . [0, oo). If p = oo, then e (t) - oo (i.e. P(C ( t) < a) -4 0 for any a < oo) and ij (t) * oo. Otherwise , under the condition of Blackwell's renewal theorem, C(t) and ij (t) both have a limiting stationary distribution F0 given by the density F (x)/p. We denote the limiting r.v.'s by e, r,. Then it (ii, C), and we have: holds more generally that (rl(t), e(t ))
APPENDIX
335
Theorem A1.5 Under the condition of Blackwell's renewal theorem, the joint distribution of (rl, ^) is given by the following four equivalent statements: (a) P (77
> x, ^ > y) = 1 f
(z)dz; +Y
(b) the joint distribution of (ri, l:) is the same as the distribution of (VW, (1 V)W) where V, W are independent, V is uniform on (0, 1) and W has distribution Fw given by dFw/dF(x) = x/pF; (c) the marginal distribution of q is FO, and the conditional distribution of given 17 = y is the overshoot distribution R0(Y) given by FO(Y) (z) = Fo (y+z)/Fo(y); (d) the marginal distribution of ^ is FO, and the conditional distribution of ri given l; = z is Foz)
The proof of (a) is straightforward by viewing {(r,(t),^(t))} as a regenerative process, and the equivalence of (a) with (b)-(d) is an easy exercise. In IV.4, we used: Proposition A1.6 Consider a renewal process with µ < oo. Then fi(t)/t a4' 0 and, if in addition EYo < oo, EC(t)/t -+ 0. Proof The number Nt of renewal before t satisfies Nt/t a4' p. Hence for t large enough, we can bound e(t) by M(t) = max {Yk : k < 2t/p}. Since the maximum Mn of n i.i.d. r.v.'s with finite mean satisfies Mn/n a$• 0 (BorelCantelli), the first statement follows. For the second, assume first the renewal process is zero-delayed. Then Eo^(t) satisfies a renewal equation with z(t) _ E[Y1 - t; Yl > t]. Hence
t t lt ) = f U(dy)z(t y) = f U(t dy Eoe(t )z(y) < c ^ l z(k) 0
0
k=o
where c = sup., U(x + 1) - U(x) (c < oo because it is easily seen that U(x + 1) - U(x) < U( 1)). Since z ( k) < E[Yi ; Y1 > t] -4 0, the sum is o(t) so that Eo£(t)/t -+ 0 . In the general case, use
t E^(t)/t = E[Yo - t; Yo > 0] + f Eo^ (t - y)P(Yo E dy) . 0
If Markov renewal theory By a Markov renewal process we understand a point process where the interarrival times Yo , Y1i Y2, ... are not i.i.d. but governed by a Markov chain {Jn} (we
336
APPENDIX
assume here that /the state space E is// finite) in the sense that
P(Y.
< yIJ)
=
Fij( y)
on {Jn= i,
Jn +1=j}
where J = a(JO, J1 i ...) and (Fij )i,jEE is a family of distributions on (0, oo). A stochastic process {Xt}t>o is called semi-regenerative w.r.t. the Markov renewal process if for any n, the conditional distribution of {XT„+t}t>o given Yo, Y1, . . . , Yn, Jo, . . . , Jn_1, Jn = i is the same as the P; distribution ofjXt}t>o itself where Pi refers to the case Jo = i. A Markov renewal process {Tn} contains an imbedded renewal process, namely {Twk } where {Wk } is the sequence of instants w where Jo., = io for some arbitrary but fixed reference state io E E. The semi-regenerative process is then regenerative w.r.t. IT,,,,}. These facts allow many definitions and results to be reduced to ordinary renewal- and regenerative processes. For example, the semi-regenerative process is called non-lattice if {T,,,,} is non-lattice (it is easily seen that this definition does not depend on i). Further: Proposition A1.7 Consider a non-lattice semi-regenerative process. Assume that uj = EjYo < oo for all j and that {J„} is irreducible with stationary distribution (v3)jEE. Then Xt 4 Xo,, where the distribution of X,,. is given by Eg(X00) = 1
YO vjEj f g(Xt) dt
µ jEE o
where p = ujEEViAj.
Notes and references Renewal theory and regenerative processes are treated, e.g., in [APQ], Alsmeyer [5] and Thorisson [372].
A2 Wiener-Hopf factorization Let F be a distribution which is not concentrated on (-oo, 0] or (0 , oo). Let X1, X2, ... be i.i .d. with common distribution F, Sn = X1 + • • • + Xn the associated random walk, and define r+=inf{n>0: Sn>0}, T_=inf{n>0: Sn<0}, G+(x) = P(S,+ < x, -r+ < oo), G_(x) = P(ST_ < x,T_ < oo).
We call r+ (T_) the strict ascending (weak descending) ladder epoch and G+ (G_) the corresponding ladder height distributions.
APPENDIX
337
Probabilistic Wiener-Hopf theory deals with the relation between F, G+, G_, the renewal measures 00
00
n=0
n=0
U+=>G+, U- =EGn, and the T+- and r_ pre-occupation measures T+-1
R+(A) = E E I(Sn E A),
r_-1
R_(A) = E I(Sn E A).
n -0
n=0
The basic identities are the following: Theorem A2.1 (a) F = G+ + G_ - G+ * G_: (b) G_ (A) = f °° F(A - x)R_ (dx), A C (-oo, 0); (c) G+(A) = f °. F(A - x)R+(dx), A C (0, oo); (d) R+ = U_; (e) R_ = U+. Proof Considering the restrictions of measures to (-oc, 0] and (0, oo), we may rewrite (a) as
G_ (A) = G+(A) =
F(A) + (G+ * G_)(A), A C (-oo, 0], F(A) + (G+ * G_)(A), A C (0, oo)
(A.7) (A.8)
(e.g. (A.7) follows since G+(A) = 0 when A C (-oo, 0]). In (A.7), F(A) is the contribution from the event {T_ = 1} = {X1 < 0}. On {T_ > 2}, define w as the time where the pre-T_ path S1, ... , Sr_ _1 is at its minimum . More rigorously, we consider the last such time (to make w unique) so that {w=m,T_=n} = {S,-S.. >0, 0<j<m, S,-S.>0, m<j
u ,r- =n w=m i
Figure A.1
3
8
APPENDIX
Reversing the time points 0,1, ... , m it follows (see Fig. A.1) that P(Sj -Sn,.>0, 0<j<m, SmEdu) = P(T+=m, ST+Edu). Aso, clearly (Sj -Sm>0, m < j
F (7-- = n, S,_ E A) n-1
P(r_=nw=m
f
Sm
EduSrEA)
m=1 n-1
F(r+=mSr+Edu).F(r_n_mSrEA_u). m=1 f
S mming over n = 2,3.... and reversing the order of summation yields P(T_ > 2, ST_ E A) P(T+ = m, S,+ E du) E P(S,_ = n - m, ST_ E A - u) f0m m=1 00 n=m+1
J0 OO P(S,+ E du)P(S,_ E A - du) (G+ * G-)(A)• C llecting terms, (A.7) follows, and the proof of (A.8) is similar. (b) follows from 00
G+ (A) _ E F(Sn E A, -r+ = n) n=1
C-0 E fF(Sk< 0,0
n=1 0 -
00 00 1: F(A - x)P(Sk < 0, 0 < k < n, Sn-1 E dx)
f0 f0
n=1
F(A - x)R+(dx),
-
APPENDIX
339
and the proof of (c) is similar. For (d), consider a fixed n and let Xk = Xn_k+l, Sk = X1 + • • • + Xk = Sn - Sn_k. Then for A C (-oo, 0], P(SnEA ,T+> n) = P(Sk < O,O
is the probability that n is a weak descending ladder point with Sn E A. Summing over n yields R+ (A) = U_ (A), and the proof of (e) is similar. ❑ Remark A2.2 In terms of m.g.f.'s, we can rewrite (a) as 1 - F[s] = (1 - 0+[s])(1 - G_[s]) (A.9) whenever F[s], 6+ [s], G_ [s] are defined at the same time; this holds always on the line its = 0, and sometimes in a larger strip. Since G+ is concentrated on (0, oo), H+ (s) = 1-G+[s] is defined and bounded in the half-plane Is : ERs < 0} and non-zero in Is: Rs < 01 (because IIG+lI _< 1), and similarly H_ (s) = 1 - G_ [s] is defined and bounded in the half-plane is : ERs > 01 and non-zero in Is : ERs > 0}. The classical analytical form of the Wiener-Hopf problem is to write 1 -.P as a product H+H_ of functions with such properties. ❑
Notes and references In its above discrete time version, Wiener-Hopf theory is only used at a few places in this book. However, it serves as model and motivation for a number of results and arguments in continuous time. E.g., the derivation of the form of G+ for the compound Poisson model (Theorem 11.6.1), which is basic for the Pollaczeck-Khinchine formula, is based upon representing G+ as in (b), and using time-reversion as in (d) to obtain the explicit form of R+ (Lebesgue measure). In continuous time, the analogue of a random walk is a process with stationary independent increments (a Levy process, cf. 11.4). In this generality of, there is no direct analogue of Theorem A2.1. For example, if {St} is Brownian motion, then T+ = inf It > 0 : St = 0} is 0 a.s., and G+, G_ are trivial, being concentrated at 0. Nevertheless, a number of related identities can be derived; see for example Bingham [65]. Another main extension of the theory deals with Markov dependence. In discrete time, there are direct analogues of Theorem A2.1; see e.g. the survey [15] by the author and the extensive list of references there. Again, such developments motivate the approach in Chapter VI on the Markovian environment model. The present proof of Theorem A2.1(a) is from Kennedy [228].
APPENDIX
340
3 Matrix-exponentials T e exponential eA of a p x p matrix A is defined by the usual series expansion 00 An eA
n=0
n!
he series is always convergent because A' = O(nk Ialn) for some integer k < p, ere A is the eigenvalue of largest absolute value, JAI = max {Jjt : µ E sp(A)} and sp(A) is the set of all eigenvalues of A (the spectrum). Some fundamental properties are the following: sp(eA) = {e' : A E sp(A)} (A.10) d dteAt = AeAt = eAtA
(A.11)
A f eAtdt = eA, _I 0
(A.12)
eA-'AO = A-le AA (A.13) henever A is a diagonal matrix with all diagonal elements non-zero. It is seen from Theorem VIII. 1.5 that when handling phase -type distributi ons, one needs to compute matrix -inverses Q-1 and matrix -exponentials eQt ( r just eQ ). Here it is standard to compute matrix-inverses by Gauss-Jordan el imination with full pivoting , whereas there is no similar single established a proach in the case of matrix -exponentials. Here are, however , three of the c rrently most widely used ones: xample A3.1 (SCALING AND SQUARING) The difficulty in directly applying t e series expansion eQ = Eo Q"/n! arises when the elements of Q are large.
hen the elements of Q"/n! do not decrease very rapidly to zero and may contribute a non-negligible amount to eQ even when n is quite large and very any terms of the series may be needed (one may even experience floating point overflow when computing Qn). To circumvent this, write eQ = (eK)m where = Q/m for some suitable integer m (this is the scaling step). Thus, if m is s fficiently large, Eo Kn/n! converges rapidly and can be evaluated without p oblems, and eQ can then be computed as the mth power (by squaring if
=
2).
0
APPENDIX
341
Example A3.2 (UNIFORMIZATION) Formally, the procedure consists in choosing some suitable i > 0, letting P = I + Q/i and truncating the series in the identity
= e-17t 00 Pn(,]t)n (A.14) E n n=0
which is easily seen to be valid as a consequence of eqt = en(P-r)t = e-ntenpt The idea which lies behind is uniformization of a Markov process {Xt}, i.e. construction of {Xt} by realizing the jump times as a thinning of a Poisson process {Nt } with constant intensity 77. To this end, assume that Q is the intensity matrix for {Xt} and choose q with rt > max J%J = max -qii• 1,3
(A.15)
i
Then it is easily checked that P is a transition matrix , and we may consider a new Markov process {Xt} which has jumps governed by P and occuring at epochs of {Nt} only (note that since pii is typically non-zero , some jumps are dummy in the sense that no state transition occurs ). However , the intensity matrix Q is the same as the one Q for {Xt} since a jump from i to j 1-1 i occurs at rate qij = 77pij = q22. The probabilistic reason that (A.14) holds is therefore that the t-step transition matrix for {fft} is °O
eQt = E e-nt (,7t) n=0
n
Pn
n!
(to see this, condition upon the number n of Poisson events in [Olt]) -
❑
Example A3.3 (DIFFERENTIAL EQUATIONS) Letting Kt = eQt, we have k = QK (or KQ) which is a system of p2 linear differential equations which can be solved numerically by standard algorithms (say the Runge-Kutta method) subject to the boundary condition Ko = I.
In practice, what is needed is quite often only Zt = TreQt (or eQth) with it (h) a given row (column) vector. One then can reduce to p linear differential equations by noting that k = ZQ, Zo = a (Z = QZ, Zo = h). The approach is in particular convenient if one wants eQt for many different ❑ values of t. Here is a further method which appears quite appealing at a first sight: Example A3 .4 (DIAGONALIZATION) Assume that Q has diagonal form, i.e. p different eigenvalues Aj i ... , Ap. Let vi,... , vp be the corresponding left
342
APPENDIX
(row) eigenvectors and hl,..., hp the corresponding right (column) eigenvectors, v5Q = Aivi, Qhi = vihi. Then vihj = 0, i # j, and vihi ¢ 0, and we may adapt some normalization convention ensuring vihi = 1. Then P
Q =
P
eQt =
P
> Aihivi = E Aihi (9 vi, i= 1 i=1 P
E e\`thivi = E ea:thi ® vi. i=1
(A.16) (A.17)
i=1
Thus, we have an explicit formula for eQt once the A j, vi, hi have been computed; this last step is equivalent to finding a matrix H such that H-1QH is a diagonal matrix, say A = (Ai)diag, and writing eQt as eQt = He°tH-1 = H (e\it)di.g H-1. (A.18)
Namely, we can take H as the matrix with columns hl,..., hp.
❑
There are, however, two serious drawbacks of this approach: Numerical instability : If the A5 are too close, (A.18) contains terms which almost cancel and the loss of digits may be disasterous. The phenomenon occurs not least when the dimension p is large. In view of this phenomenon alone care should be taken when using diagonalization as a general tool for computing matrix-exponentials. Complex calculus : Typically, not all ai are real, and we need to have access to software permitting calculations with complex numbers or to perform the cumbersome translation into real and imaginary parts. Nevertheless, some cases remain where diagonalization may still be appealing.
Example A3.5 If Q= ( 411 ( q21
q12 q22
is 2 x 2, the eigenvalue, say Al, of largest real part is often real (say, under the conditions of the Perron-Frobenius theorem), and hence A2 is so because of A2 = tr(Q). Everything is nice and explicit here: 411+q2+-D' )12_g11+q2-^^ where (411-422 D = z ) + 4412421. 2 2
343
APPENDIX
Write 7r (= v1) for the left eigenvector corresponding to a1 and k (= hl) for the right eigenvector. Then 7r =
(ir1
7r2 ) = a (q21
Al
- q, 1) ,
k -
C k2
) =b
( A1
q 1 Q11 /
where a , b are any constants ensuring//Irk = 1, i.e. l ab (g12g21 +
(A1
-
411) 2)
= 1.
Of course, v2 and h2 can be computed in just the same way, replacing ai by A2. However, it is easier to note that 7rh2 = 0 and v2k = 1 implies v2 = (k2 - k1), h2 = Thus, eqt = eNlt ( ir1ki i2k1 \ ir1 k2 72 k2
+ e
azt
7r2k2 -i2k1 -7ri k2 7r1 k1
(A.19)
Example A3 .6 A particular important case arises when Q = -q1 qi ) q2 -q2 J is an intensity matrix. Then Al = 0 and the corresponding left and right eigenvectors are the stationary probability distribution 7r and e. The other eigenvalue is A = A2 = -q1 - Q2i and after some trivial calculus one gets eQt =
ir =
7r 1
112
+ eat
7r1 7r2 / (7fl 7r2) =
(
7r2
-1r2
-7r1 IF,
q2 ql qi +q 2 9l +q2
where
(A.20)
(A.21)
Here the first term is the stationary limit and the second term thus describes the rate of convergence to stationarity. ❑
Example A3.7 Let 3
9
2 14 7 11 2 2
344
APPENDIX
Then 7 T4 -2 =52,
D= 2+ 11)'
x1 -3/2 - 11/2 + 5 -1, A2 = -3/2 - 11/2 - 5 - - -6, 2 2 1 3 2 2)'
1=ab(142+(-1+2)2 ) = tab, ir =a(2 9
9
k=b
14
14 =b -1+ 2
ir1
k1
7r1 k2
ir2
k1
2
_
7r2 k2
9 2 10
9 70
5 7
1 '
10
e4" = e_,.
9 9 10 10 + 7 1 10 10
e_6u
10 1 10 7 10
9 70 9 10 0
A4 Some linear algebra 4a Generalized inverses A generalized inverse of a matrix A is defined as any matrix A- satisfying AA-A = A. (A.22) Note that in this generality it is not assumed that A is necessarily square, but only that dimensions match , and a generalized inverse may not unique. Generalized inverses play an important role in statistics. They are most often constructed by imposing some additional properties , for example
AA+A = A, A+AA+ = A+, (AA+)' = AA+, (A+A)' = A+A. (A.23)
APPENDIX
345
A matrix A+ satisfying (A.23) is called the Moore-Penrose inverse of A, and exists and is unique (see for example Rao [300]). E.g., if A is a possibly singular covariance matrix (non-negative definite), then there exists an orthogonal matrix C such that A = CDC' where
0 0 D =
AP Here we can assume that the A , are ordered such that Al > 0,. .. , Am > 0, Am+1 = ... _ A,, = 0 where m < p is the rank of A, and can define
/ail
A+ = C
0 0
0
A' 0 0 0
0
0 0 0
C' .
01
In applied probability, one is also faced with singular matrices , most often either an intensity matrix Q or a matrix of the form I-P where P is a transition matrix. Assume that a unique stationary distribution w exists . Rather than with generalized inverses , one then works with Q = (Q - eir )-1, (I - P)- = (I - P + e7r)-1 (here ( I - P + e7r )- 1 goes under the name fundamental matrix of the Markov chain). These matrices are not generalized inverses but act roughly as inverses except that 7r and e play a particular role - e.g. ( Q - eir )- 1Q = Q(Q - eir)-1 = I - ew. Here is a typical result on the role of such matrices in applied probability: Proposition A4.1 Let A be an irreducible intensity matrix with stationary row vector it, and define D = (A - e ® 7r)-1. Then for some b > 0,
lt o
eAx dx =
te7r + D(eAt - I)
(A.24)
= te7r - D + O(e-bt), (A.25)
346
APPENDIX
t
2
xe Ax dx
= eir + t(D + e-7r) + D(eAt - I) - DZ(ent - I) (A.26) 2
= 2 e7r + tD - 2e7r - D + D2 + O(e-bt).
(A.27)
Proof Let A(t), B(t) denote the l.h.s. of (A.24), resp. the r.h.s. Then A(O) _ B(O) = 0, B'(t) = e7r + DAeAt = eir + (I - eir)eAt = eAt = A'(t). (A.26) follows by integration by parts: t f t /' xeAx dx = [x {xe7r + D(eAx - I)}, - J {xe^r + D(e - I)} dx. o Finally, the formulas involving O(e-6t) follow by Perron-Frobenius theory, see below. ❑ 4b The Kronecker product ® and the Kronecker sum
We recall that if A(1) is a k1 x ml and A(2) a k2 x m2 matrix, then the Kronecker (tensor) product A(') ®A(2) is the (k1 x k2) x (ml x m2) matrix with (il i2) (jl j2)th entry a;91a(2) . Equivalently, in block notation i2h A®B= ( a11B a21 B
a12B a22 B
Example A4.2 Let it be a row vector with m components and h a column vector with k components. Interpreting 7r, h as 1 x m and k x 1 matrices, respectively, it follows that h ® it is the k x m matrix with ijth element hi7rj . I.e., h ® it reduces to hit in standard matrix notation. Note that h ® it has rank 1; the rows are proportional to it, and the columns to h, and in fact any rank 1 matrix can be written on this form. For example,
()®(6
f 6/ 7f 8^ 7 8 )=! ^)( 6 7 8 )=(6^ 7^ 8^) \ ❑
Example A4.3 Let
2 A= 4
3 Vf' N7 5 )' B= ( 8 ).
347
APPENDIX
Then
A®B =
2 f 20- 3v'6- 3vV/72f 20- 3V8- 3f 4v/6 4vf7 5v/-6 504f 4-,A9- 5v'-8 5vf9-
11 A fundamental formula is (A1B1C1) ®(A2B2C2) = (A1 (9 A2)(B1 (9 B2)(C1®C2).
(A.28)
In particular, if Al = vi, A2 = v2 are row vectors and C1 = h1, C2 = h2 are column vectors, then v1B1h1 and v2B2h2 are real numbers, and v1B1h1 • v2B2h2 = v1B1h1 ® v2B2h2 = ( v1(&v2 )( B1(&B2 )( h1(&h2 ) .(A.29) If A and B are both square (k1 = ml and k2 = m2), then the Kronecker sum is defined by
A(1) ®A(2) = A(1) ®Ik2 + k ®A(2). (A.30) A crucial property is the fact that the functional equation for the exponential
eA+B = eAeB function generalizes to Kronecker notation (note that in contrast typically only holds when A and B commute): Proposition A4.4 eA® B = eA ®eB. Proof We shall use the binomial formula
t / l (A ®B)t = I k Ak 0 B1-k
(A.31)
k=0
Indeed,
(AED B)1 = (A®I+I(9 B)l is the sum of all products of t factors, each of which is A ® I or I ® B; if A ® I occurs k times, such a factor is Ak (&B 1-k according to (A.29), and the number of such factors is precisely given by the relevant binomial coefficient.
Using (A.31), it follows that 0o
e® ® e B
_An
oo
J
oo Bn
®_
Ak ®Bl-k
r
7 I F n! = ` k! (I - k)! ( n-0 n=0 t=0 k=0
^. (A B)' = eA®B e! L 1=0
0
APPENDIX
348
Remark A4.5 Many of the concepts and results in Kronecker calculus have p(2) is the intuitive illustrations in probabilistic terms. Thus , p = P(1) ® {X }, X ) }, where transition matrix of the bivariate Markov chain {X n1),
n2
n1 )
{X(2) } are independent Markov chains with transition matrices P(1), P(2), and Q = Q(1) ® Q
(2)
= Q(1) ® I + I ® Q(2)
(A.32)
is the intensity matrix of the bivariate continuous Markov process {Yt(1), Yt(2) independent Markov processes with intensity matri{y(2) } are {Y(1) }, first term on the r . h.s. represents ces Q( 1), Q(2); in the definition (A.32), the {Yt(2) } transitions in the {Yt(1) } component and the second transitions in the component , and the form of the bivariate intensity matrix reflects the fact that Yt(2) } cannot change state in both components at due to independence ,
where
{Yt(1),
the same time. A special case of Proposition A4.4 can easily be obtained by probabilistic be the s-step transition reasoning along the same lines . Let P8f P(Sl), P(t) Yt(2) }. From what has been said about matrices of {Yt( 1), Yt(2 ) }, { 1't(1) }, resp . { On the other hand, independent Markov chains, we have P8 = Pal) ® p(2). P8 = exp {sQ} = exp {s (Q(1) ®Q(2)) } , Ps 1) = exp {sQ ( 1) } > p(2 ) = exp {sQ(2) } can therefore be rewritten as Taking s = 1 for simplicity , P8 = Pal ) ® P82) exp {Q ( 1) ® Q(2)1 = eXp {Q( 1) } ® exp {Q(2) }
Also the following formula is basic: B are both square such that a +,3 < 0 Lemma A4 .6 Suppose that A and of B. Let further it, v whenever a is an eigenvalue of A and 0 is an eigenvalue be any row vectors and h, k any column vectors. Then 2 0
ire At h • ve
Bt kdt = (^®v)(A®B)-1(e A®Ba - I)(h ® k).
(A.33)
APPENDIX
349
Proof According to (A.29), the integrand can be written as ( 7r (9 v)( eAt ® eBt )(h ®k ) =
( 7r
®v)(eA (DBt)(h (& k).
Now note that the eigenvalues of A ® B are of the form a +,3 whenever a is an eigenvalue of A and 3 is an eigenvalue of B, so that by asssumption A ® B is ❑ invertible, and appeal to (A.12).
4c The Perron-Frobenius theorem Let A be a p x p-matrix with non-negative elements. We call A irreducible if the pattern of zero and non-zero elements is the same as for an irreducible transition matrix. That is, f o r each i, j = 1, ... , p there should exist io, il,... , in such that io = i, i,, = j and atk_li,. > 0 for k = 1, . . . , n. Similarly, A is called aperiodic if the pattern of zero and non-zero elements is the same as for an aperiodic transition matrix. Here is the Perron-Frobenius theorem, which can be found in a great number of books, see e.g. [APQ] X.1 and references there (to which we add Berman & Plemmons [63]): Theorem A4.7 Let A be a p x p-matrix with non-negative elements. Then: (a) The spectral radius Ao = max{JAI : A E sp(A)} is itself a strictly positive and simple eigenvalue of A, and the corresponding left and right eigenvectors v, h can be chosen with strictly positive elements; (b) if in addition A is aperiodic, then IN < Ao for all A E sp(A), and if we normalize v, h such that vh = 1, then
An = Aohv+O(µ") = Aoh®v+O(µ")
(A.34)
for some u. E (0, ao). Note that for a transition matrix, we have AO = 1, h = e and v = 7r (the stationary row vector). .The Perron-Frobenius theorem has an analogue for matrices B with properties similar to intensity matrices: Corollary A4.8 Let B be an irreducible3 p x p-matrix with non-negative offdiagonal elements. Then the eigenvalue Ao with largest real part is simple and real, and the corresponding left and right eigenvectors v, h can be chosen with 3By this, we mean that the pattern of non-zero off-diagonal elements is the same as for an irreducible intensity matrix.
350
APPENDIX
strictly positive elements. Furthermore, if we normalize v, h such that vh = 1, then eBt = ea0thv + O(eµt) = eA0th ® v + O(et t) (A.35) for some p E (-oo, Ao). Note that for an intensity matrix, we have A0 = 0, h = e and v = 7r (the stationary row vector).
Corollary A4.8 is most often not stated explicitly in textbooks, but is an easy consequence of the Perron-Frobenius theorem. For example, one can consider A = 77I + B where rl > 0 is so large that all diagonal elements of A are strictly positive (then A is irreducible and aperiodic), relate the eigenvalues of B to those of B via (A. 10) and use the formula 00 Antn
-me at e Bt = e
= e - n t AL n=0
n!
(cf. the analogy of this procedure with unformization, Example A3.2).
A5 Complements on phase-type distributions 5a Asymptotic exponentiality In Proposition VIII.1.8, it was shown that under mild conditions the tail of a phase-type distribution B is asymptotical exponential. The next result gives a condition for asymptotical exponentiality, not only in the tail but in the whole distribution. The content is that B is approximately exponential if the exit rates ti are small compared to the feedback intensities tij (i # j). To this end, note that we can write the phase generator T as Q - (ti)diag where Q = T + (ti)diag is a proper intensity matrix (Qe = 0). I.e., the condition is that t is small compared to Q. Proposition A5.1 Let Q be a proper irreducible intensity matrix with stationary distribution a, let t = (ti)iEE # 0 have non-negative entries and define T(°) = aQ - (ti)ding.
Then for any (3, the phase-type distribution B(a)
with representation (,(3, T(°)) is asymptotically exponential with parameter t* _ r EiEE aiti as a -4 oo, Bi° (x) -+ a-t*x
Proof Let { 4 } be the phase process associated with B(a) and (°) its lifelength, let {Yti°i } be a Markov process with initial distribution a and intensity
351
APPENDIX
matrix aQ , and write Yt = Yt(1),
((1) etc. We can assume that Jta) = Yt(°), t < (a), and that Yt(a) = Yat for all t. Let further V be exponential with intensity V and independent of everything else. We can think of ( ( a) as the first event in an inhomogeneous Poisson process ( Cox process ) with intensity process
{t
Y( a) } v>0
((a) =
. Hence we can represent ( (a) as
inf { t > O : f tY( )dv=V
} ^l = inf { t > O : t adv = V }
jat inf{t > 0: tydv =aV} = JJJ a
where o (x) = inf {t >0: fo tY dv = x}.
J
l
J
By the law of large numbers for
Markov processes , fo tY dv/t a$' t*, and this easily yields a(x)/x a-' 1/t*. Hence O ((a) aa. v/ t-. We shall , in fact , prove a somewhat more general result which was used in the proof of Proposition VI.1.9. In addition to the asymptotic exponentiality, it states that the state, from which the phase process is terminated , has a limit distribution: Proposition A5.2 Pi (c(a) > x, J(()) _ = i) -+ a-t•x
t tt' .
Proof Assume first ti > 0 for all i and let I. = YQ(x). Then {Ix} is a Markov process with to = Yo. Conditioning upon whether { Yt} changes state in [0, dx/ti] or not, we get dx F (Idx = j) = (1 + qij t )Sij + qij dt,x (1 - bij) Hence the intensity matrix of { Ix} is (qij/ti)i,jEE, from which it is easily checked that the limiting stationary distribution is (aiti/t*)iEE• Now let a' -4 oo with a in such a way that a' < a, a'/a -+ 1, a - a' -+ oo (e.g. a' = a - aE where 0 < e < 1). Then a(a'V)/a (aV) a' 1. Since JJ(.)_ = Y(a) = 1'aS(a) = Ya(av)^ it follows that Pi ((,(a) > x , J^O)_ = j)
Pi (v(aaV) > x,YQ(av) = j)
Pi ( ci(a'V) > x,Yj(av) = j f
352
APPENDIX
rr Ia(a'V) Ei I ( > x) P
L
at
(Yo (aV)
,., Et II I a(a^V) > x) at' .+ a-t*x • a't' L ` at t* t*
J
Reducing the state space of {Ix } to {i E E : t, > 0}, an easy modification of the argument yields finally the result for the case where t; = 0 for one or more ❑
i.
Notes and references Propositions A5.1 and A5.2 do not appear to be in the literature. However, these results are in the spirit of rare events theory for regenerative processes (e.g. Keilson [223], Gnedenko & Kovalenko [164] and Glasserman & Kou [162]). See also Korolyuk, Penev & Turbin [238].
5b Discrete phase-type distributions The theory of discrete phase-type distributions is a close parallel of the continuous case, so we shall be brief. A distribution B on {1, 2, ...} is said to be discrete phase-type with representation (E, P, a) if B is the lifelength of a terminating Markov chain (in discrete time) on E which has transition matrix P = (p,j) and initial distribution a. Then P is substochastic and the vector of exit probabilities is p = e - Pe. Example A5.3 As the exponential distribution is the simplest continuous phasetype distribution, so is the geometric distribution, with point probabilities bk = (1 - p)k-1 p, k = 1, 2, ..., the simplest discrete phase-type distribution: here E has only one element, and thus the parameter p of the geometric distribution ❑ can be identified with the exit probability vector p. Example A5.4 Any discrete distribution B with finite support, say bk = 0, K}, a = b = (bk)k=1,...,x k > K, is discrete phase-type. Indeed, let E and Pkj
j=k-1, 1 k=1 1 0k>1, otherwise, ' pk 0 k>1
11 Theorem A5.5 Let B be discrete phase-type with representation (P, a). Then: (a) The point probabilities are bk = aPk-lp; (b) the generating function b[z] _ E' , zkbk is za(I - zP)-'p; (c) the nth moment k 1 k"bkis 1)"n!aP-"p.
APPENDIX
353
5c Closure properties Example A5.6 (CONVOLUTIONS) Let B1, B2 be phase-type with representations (E(1),a(1),T(1)), resp. (E(2),a(2),T(2)). Then the convolution B = B1 * B2 is phase-type with representation (E, a, T) where E = E(1) + E(2) is the disjoint union of E(1) and E(2), and a=1), a' - { 0, _
i E
E(1) T(1)
i E E(2) ,
t(1)a(2)
T= ( 0 T(2) )
(A.36)
in block-partitioned notation (where we could also write a as (a (1) 0)). A reduced phase diagram (omitting transitions within the two blocks) is
am
E(1)
t(1) a(2)
(2)
t(2)
Figure A.2 The form of these results is easily recognized if one considers two independent phase processes { Jt 1) }, { Jt 2) } with lifetimes U1 , resp. U2, and piece the processes together by it =
41) 0
Then {Jt} has lifetime U1 + U2 , initial distribution a and phase generator T. 11 Example A5.7 (THE NEGATIVE BINOMIAL DISTRIBUTION) The most trivial special case of Example A5.6 is the Erlang distribution Er which is the convolution of r exponential distributions. The discrete counterpart is the negative binomial distribution with point probabilities
bk k1) (1 k = r,r + 1,.... r - 1 This corresponds to a convolution of r geometric distributions with the same parameter p, and hence the negative binomial distribution is discrete phase❑ type, as is seen by minor modifications of Example A5.6.
354
APPENDIX
Example A5.8 (FINITE MIXTURES) Let B1, B2 be phase-type with representations (E(1),a(1),T(1)), resp. (E(2),a(2),T(2)). Then the mixture B = 9B1 + (1 - O)B2 (0 < 0 < 1) is phase-type with representation (E, a, T) where E = E(1) + E(2) is the disjoint union of E(1) and E(2), and o'i
=IT
Oa;'), i E E(1) T 0 I (A.37) (1) (1 - 0)ai2), i E E(2) 0 T(2)
(in block-partitioned notation, this means that a = (Oa(1) (1 - 0)a(2))). A reduced phase diagram is
0a(1)
E(1) A
- 0)a(2)
E(2)
Figure A.3 In exactly the same way, a mixture of more than two phase-type distributions is seen to be phase-type. In risk theory, one obvious interpretation of the claim ❑ size distribution B to be a mixture is several types of claims. Example A5.9 (INFINITE MIXTURES WITH T FIXED) Assume that a = a(°) depends on a parameter a E A whereas E and T are the same for all a. Let B(") be the corresponding phase-type distribution, and consider B(") = fA B(a) v(da) where v is a probability measure on A. Then it is trivial to see that B(") is ❑ phase-type with representation (a("),T,E) where a(°) = fAa(a)v(da). Example A5.10 (GEOMETRIC COMPOUNDS) Let B be phase-type with representation (E, a, T) and C = EO°_1(1 - p)pn-1B*n. Equivalently, if U1, U2,... are i.i.d. with common distribution and N is independent of the Uk and geometrically distributed with parameter p, P(N = n) = (1 - p)pn-1, then C is the distribution of Ul + • • • + UN. To obtain a phase process for C, we need to restart the phase process for B w.p. p at each termination. Thus, a reduced phase diagram is
f
a
E
t
Figure A.4
APPENDIX
355
and C is phase-type with representation (E, α, T + ptα). Minor modifications of the argument show that
1. if U1 has a different initial vector, say ν, but the same T, then U1 + ⋯ + UN is phase-type with representation (E, ν, T + ptα);
2. if B is defective and N + 1 is the first n with U_n = ∞, then U1 + ⋯ + UN is zero-modified phase-type with representation (E, α, T + tα). Note that this was exactly the structure of the lifetime of a terminating renewal process, cf. Corollary VIII.2.2. ❑

Example A5.11 (OVERSHOOTS) The overshoot of U over x is defined as the distribution of (U − x)+. It is zero-modified phase-type with representation (E, αe^{Tx}, T) if U is phase-type with representation (E, α, T). Indeed, if {J_t} is a phase process for U, then J_x has distribution αe^{Tx}. If we replace x by a r.v. X independent of U, say with distribution F, it follows by mixing (Example A5.9) that (U − X)+ is zero-modified phase-type with representation (E, αF̂[T], T) where
F̂[T] = ∫_0^∞ e^{Tx} F(dx)

is the matrix m.g.f. of F, cf. Proposition VIII.1.7.
❑
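Here is a minimal scipy sketch of the mixing version of Example A5.11 (the numerical choices are my own illustration): for X exponential with rate μ the matrix m.g.f. is F̂[T] = μ(μI − T)^{−1}, and the tail of (U − X)+ computed from the representation (E, αF̂[T], T) is compared with direct numerical integration.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad

# Phase-type U: Erlang(2) with rate 3 (arbitrary test case).
alpha = np.array([1.0, 0.0])
T = np.array([[-3.0, 3.0],
              [0.0, -3.0]])
e = np.ones(2)

# X independent of U, exponential with rate mu, so F^[T] = mu (mu I - T)^{-1}.
mu = 1.5
F_T = mu * np.linalg.inv(mu * np.eye(2) - T)

y = 0.7
# Tail of (U - X)^+ from the zero-modified representation (E, alpha F^[T], T):
tail_ph = alpha @ F_T @ expm(T * y) @ e
# Direct computation by conditioning on X:
tail_direct, _ = quad(lambda x: mu * np.exp(-mu * x)
                      * (alpha @ expm(T * (x + y)) @ e), 0, np.inf)
print(tail_ph, tail_direct)   # should agree
```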
Example A5.12 (PHASE-TYPE COMPOUNDS) Let f1, f2, ... be the point probabilities of a discrete phase-type distribution with representation (E, α, P), let B be a continuous phase-type distribution with representation (F, ν, T) and C = ∑_{n=1}^∞ f_n B^{*n}. Equivalently, if U1, U2, ... are i.i.d. with common distribution B and N is independent of the U_k with P(N = n) = f_n, then C is the distribution of U1 + ⋯ + UN. To obtain a phase representation for C, let the phase space be E × F = {ij : i ∈ E, j ∈ F}, let the initial vector be α ⊗ ν and let the phase generator be I ⊗ T + P ⊗ (tν). ❑

Example A5.13 (MINIMA AND MAXIMA) Let U1, U2 be random variables with distributions B1, B2 of phase-type with representations (E^{(1)}, α^{(1)}, T^{(1)}), resp. (E^{(2)}, α^{(2)}, T^{(2)}). Then the minimum U1 ∧ U2 and the maximum U1 ∨ U2 are again phase-type. To see this, let {J_t^{(1)}}, {J_t^{(2)}} be independent with lifetimes U1, resp. U2. For U1 ∧ U2, we then let the governing phase process be {J_t} = {(J_t^{(1)}, J_t^{(2)})}, interpreting exit of either of {J_t^{(1)}}, {J_t^{(2)}} as exit of {J_t}. Thus the representation is (E^{(1)} × E^{(2)}, α^{(1)} ⊗ α^{(2)}, T^{(1)} ⊕ T^{(2)}).
For U1 ∨ U2, we need to allow {J_t^{(2)}} to go on (on E^{(2)}) when {J_t^{(1)}} exits, and vice versa. Thus the state space is E^{(1)} × E^{(2)} ∪ E^{(1)} ∪ E^{(2)}, the initial vector is (α^{(1)} ⊗ α^{(2)}  0  0), and the phase generator is

( T^{(1)} ⊕ T^{(2)}   I ⊗ t^{(2)}   t^{(1)} ⊗ I )
( 0                   T^{(1)}       0           )
( 0                   0             T^{(2)}     )
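The Kronecker constructions are straightforward to code. Below is a small numpy/scipy sketch (function names and the test distributions are my own) of the representation of the minimum U1 ∧ U2; the check uses the fact that for independent U1, U2 the tail of the minimum is the product of the individual tails.

```python
import numpy as np
from scipy.linalg import expm

def ph_min(alpha1, T1, alpha2, T2):
    """Representation of U1 ^ U2 from Example A5.13: Kronecker product of the
    initial vectors and Kronecker sum of the phase generators."""
    I1, I2 = np.eye(len(alpha1)), np.eye(len(alpha2))
    alpha = np.kron(alpha1, alpha2)
    T = np.kron(T1, I2) + np.kron(I1, T2)          # T^{(1)} (+) T^{(2)}
    return alpha, T

def ph_tail(alpha, T, x):
    """P(U > x) = alpha exp(Tx) e."""
    return alpha @ expm(T * x) @ np.ones(len(alpha))

# U1 Erlang(2) with rate 2, U2 exponential with rate 5 (arbitrary test case).
a1, T1 = np.array([1.0, 0.0]), np.array([[-2.0, 2.0], [0.0, -2.0]])
a2, T2 = np.array([1.0]), np.array([[-5.0]])

alpha, T = ph_min(a1, T1, a2, T2)
x = 0.3
print(ph_tail(alpha, T, x), ph_tail(a1, T1, x) * ph_tail(a2, T2, x))  # agree
```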
Notes and references The results of the present section are standard, see Neuts [269] (where the proof, however, relies more on matrix algebra than on the probabilistic interpretation exploited here).
5d Phase-type approximation
A fundamental property of phase-type distributions is denseness: any distribution B on (0, ∞) can be approximated 'arbitrarily closely' by a phase-type distribution:

Theorem A5.14 To a given distribution B on (0, ∞), there is a sequence {B_n} of phase-type distributions such that B_n → B weakly as n → ∞.

Proof Assume first that B is a one-point distribution, say degenerate at b, and let B_n be the Erlang distribution E_n(δ_n) with δ_n = n/b. The mean of B_n is n/δ_n = b and the variance is n/δ_n² = b²/n. Hence it is immediate that B_n → B. The general case now follows easily from this, the fact that any distribution B can be approximated arbitrarily closely by a distribution with finite support, and the closedness of the class of phase-type distributions under the formation of finite mixtures, cf. Example A5.8. Here are the details at two somewhat different levels of abstraction:

(diagonal argument, elementary) Let {b_k} be any dense sequence of continuity points for B(x). Then we must find phase-type distributions B_n with B_n(b_k) → B(b_k) for all k. Now we can first find a sequence {D_n} of distributions with finite support such that D_n(b_k) → B(b_k) for all k as n → ∞. By the diagonal argument (subsequent thinnings), we can assume that |D_n(b_k) − B(b_k)| ≤ 1/n for n ≥ k. Let the support of D_n be {x_1(n), ..., x_{q(n)}(n)}, with weight p_i(n) for x_i(n). Then from above,

C_{r,n} = ∑_{i=1}^{q(n)} p_i(n) E_r(r/x_i(n))  →  ∑_{i=1}^{q(n)} p_i(n) δ_{x_i(n)} = D_n,   r → ∞

(δ_x denoting the one-point distribution at x).
Hence we can choose r(n) in such a way that |C_{r(n),n}(b_k) − D_n(b_k)| ≤ 1/n, k ≤ n. Then

|C_{r(n),n}(b_k) − B(b_k)| ≤ 2/n,   k ≤ n,

and we can take B_n = C_{r(n),n}.
❑
(abstract topological) The essence of the argument above is that the closure (w.r.t. the topology for weak convergence) of the class of phase-type distributions contains all one-point distributions. Since this closure is also closed under the continuous operation of formation of finite mixtures, it contains all finite mixtures of one-point distributions, i.e. all discrete distributions. But the closure of the class of discrete distributions is the class of all distributions on [0, ∞). Hence the closure of the class of phase-type distributions is the class of all distributions on [0, ∞). ❑

Theorem A5.14 is fundamental and can motivate phase-type assumptions, say on the claim size distribution B in risk theory, in at least two ways:

insensitivity Suppose we are able to verify a specific result when B is of phase-type, say that two functionals φ1(B) and φ2(B) coincide. If φ1(B) and φ2(B) are weakly continuous, then it is immediate that φ1(B) = φ2(B) for all distributions B on [0, ∞).

approximation Assume that we can compute a functional φ(B) when B is phase-type, and that φ is known to be continuous. For a general B0, we can then approximate B0 by a phase-type B, compute φ(B) and use this quantity as an approximation to φ(B0). In particular, if information on B0 is given in terms of observations (i.i.d. replications), one would use the B given by some statistical fitting procedure (see below).

It should be noted, however, that this procedure should be used with care if φ(B) is the ruin probability ψ(u) and u is large.

Let E be the class of functions f : [0, ∞) → [0, ∞) such that f(x) = O(e^{ax}), x → ∞, for some a < ∞.

Corollary A5.15 To a given distribution B on (0, ∞) and any f1, f2, ... ∈ E, there is a sequence {B_n} of phase-type distributions such that B_n → B weakly as n → ∞ and ∫_0^∞ f_i(x)B_n(dx) → ∫_0^∞ f_i(x)B(dx), i = 1, 2, ....
Proof By Fatou's lemma, B_n → B implies that

liminf_{n→∞} ∫_0^∞ f_i(x)B_n(dx) ≥ ∫_0^∞ f_i(x)B(dx)

for each i, and hence it is sufficient to show that we can obtain

limsup_{n→∞} ∫_0^∞ f_i(x)B_n(dx) ≤ ∫_0^∞ f_i(x)B(dx),   i = 1, 2, ....    (A.38)
We first show that for each f ∈ E,

∫_0^∞ f(x)B_n(dx) → ∫_0^∞ f(x)B(dx)    when B = δ_z, B_n = E_n(n/z).    (A.39)

Indeed, if f(x) = e^{ax}, then

∫_0^∞ f(x)B_n(dx) = ( (n/z) / (n/z − a) )^n = (1 − az/n)^{−n} → e^{az} = f(z) = ∫_0^∞ f(x)B(dx),

and the case of a general f then follows from the definition of the class E and a uniform integrability argument. Now returning to the proof of (A.38), we may assume that in the proof of Theorem A5.14 D_n has been chosen such that
∫_0^∞ f_i(x)D_n(dx) ≤ (1 + 1/n) ∫_0^∞ f_i(x)B(dx),   i = 1, ..., n.
By (A.39),

∫_0^∞ f_i(x)C_{r,n}(dx) → ∫_0^∞ f_i(x)D_n(dx),   r → ∞,

and hence we may choose r(n) such that

∫_0^∞ f_i(x)C_{r(n),n}(dx) ≤ (1 + 2/n) ∫_0^∞ f_i(x)B(dx),   i = 1, ..., n. ❑
Corollary A5.16 To a given distribution B on (0, ∞), there is a sequence {B_n} of phase-type distributions such that B_n → B weakly as n → ∞ and all moments converge, ∫_0^∞ x^i B_n(dx) → ∫_0^∞ x^i B(dx), i = 1, 2, ....
In compound Poisson risk processes with arrival intensity β and claim size distribution B satisfying βμ_B < 1, the adjustment coefficient γ = γ(B, β) is defined as the unique solution γ > 0 of B̂[γ] = 1 + γ/β. The adjustment coefficient is a fundamental quantity, and therefore the following result is highly relevant as support for phase-type assumptions in risk theory:

Corollary A5.17 To a given β > 0 and a given distribution B on (0, ∞) with B̂[γ + ε] < ∞ for some ε > 0, where γ = γ(B, β), there is a sequence {B_n} of phase-type distributions such that B_n → B weakly as n → ∞ and γ_n → γ where γ_n = γ(B_n, β).

Proof Let f_i(x) = e^{(γ+ε_i)x} for some sequence {ε_i} with ε_i ∈ (0, ε) and ε_i ↓ 0 as i → ∞. If ε_i > 0, then

B̂_n[γ + ε_i] → B̂[γ + ε_i] > 1 + (γ + ε_i)/β

implies that γ_n ≤ γ + ε_i for all sufficiently large n. I.e., lim sup γ_n ≤ γ; lim inf γ_n ≥ γ is proved similarly. ❑

We state without proof the following result:

Corollary A5.18 In the setting of Corollary A5.16, one can obtain γ(B_n, β) = γ for all n.

Notes and references Theorem A5.14 is classical; the remaining results may be slightly stronger than those given in the literature, but are certainly not unexpected.
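Before turning to fitting, here is a rough numerical illustration of Theorem A5.14 and Corollary A5.17 (the example and the helper function are my own, not from the text): approximate a deterministic claim size δ_b by Erlang distributions E_n(n/b) and watch the adjustment coefficient γ(B_n, β) approach γ(δ_b, β).

```python
import numpy as np
from scipy.optimize import brentq

beta, b = 0.5, 1.0                      # arrival intensity and claim size; beta*b < 1

def lundberg_root(mgf, upper):
    """Solve B^[gamma] = 1 + gamma/beta for the positive root."""
    g = lambda s: mgf(s) - 1.0 - s / beta
    lo = 1e-6
    while g(lo) > 0:                    # move the lower bracket past s = 0 if needed
        lo *= 2
    return brentq(g, lo, upper)         # g(lo) < 0 < g(upper)

# Deterministic claim at b: B^[s] = exp(s b).
gamma_det = lundberg_root(lambda s: np.exp(s * b), 10.0)

# Erlang approximations E_n(n/b): B^[s] = (1 - s b / n)^(-n), s < n/b.
for n in (1, 2, 5, 20, 100):
    mgf = lambda s, n=n: (1.0 - s * b / n) ** (-n)
    gamma_n = lundberg_root(mgf, n / b - 1e-9)
    print(n, gamma_n)

print("limit:", gamma_det)
```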
5e Phase-type fitting
As has been mentioned a number of times already, there is substantial advantage in assuming the claim sizes to be phase-type when one wants to compute ruin probabilities. For practical purposes, the problem thus arises of how to fit a phase-type distribution B to a given set of data ζ_1, ..., ζ_N. The present section is a survey of some of the available approaches and software for implementing this. We shall formulate the problem in the slightly broader setting of fitting a phase-type distribution B to a given set of data ζ_1, ..., ζ_N or a given distribution B_0. This is motivated in part from the fact that a number of non-phase-type distributions like the lognormal, the loggamma or the Weibull have been argued to provide adequate descriptions of claim size distributions, and in part from the fact that many of the algorithms that we describe below have been formulated within the set-up of fitting distributions. However, from a more conceptual
point of view the two sets of problems are hardly different: an equivalent representation of a set of data ζ_1, ..., ζ_N is the empirical distribution B_e, giving mass 1/N to each ζ_i. Of course, one could argue that the results of the preceding section concerning phase-type approximation contain a solution to our problem: given B_0 (or B_e), we have constructed a sequence {B_n} of phase-type distributions such that B_n → B_0, and as fitted distribution we may take B_n for some suitably large n. The problem is that the constructions of {B_n} are not economical: the number of phases grows rapidly, and in practice this sets a limitation to the usefulness (the curse of dimensionality; we do not want to perform matrix calculus in hundreds or thousands of dimensions).

A number of approaches restrict the phase-type distribution to a suitable class of mixtures of Erlang distributions. The earliest such reference is Bux & Herzog [85] who assumed that the Erlang distributions have the same rate parameter, and used a non-linear programming approach. The constraints were the exact fit of the two first moments and the objective function to be minimized involved the deviation of the empirical and fitted c.d.f. at a number of selected points. In a series of papers (e.g. [216]), Johnson & Taaffe considered a mixture of two Erlangs (with different rates) and matched (when possible) the first three moments. Schmickler (the MEDA package; e.g. [317]) has considered an extension of this set-up, where more than two Erlangs are allowed and in addition to the exact matching of the first three moments a more general deviation measure is minimized (e.g. the L1 distance between the c.d.f.'s). The characteristic of all of these methods is that even if the number of parameters may be low (e.g. three for a mixture of two Erlangs), the number of phases required for a good fit will typically be much larger, and this is what matters when using phase-type distributions as a computational vehicle in say renewal theory, risk theory, reliability or queueing theory.

It seems therefore a key issue to develop methods allowing for a more general phase diagram, and we next describe two such approaches which also have the feature of being based upon the traditional statistical tool of maximum likelihood. A method developed by Bobbio and co-workers (see e.g. [70]) restricts attention to acyclic phase-type distributions, defined by the absence of loops in the phase diagram. The likelihood function is maximized by a local linearization method allowing the use of linear programming techniques. Asmussen & Nerman [38] implemented maximum likelihood in the full class of phase-type distributions via the EM algorithm; a program package written in C for the SUN workstation or the PC is available as shareware, cf. [202].

The observation is that the statistical problem would be straightforward if the whole (E_Δ-valued) phase process {J_t^{(k)}}_{0≤t≤ζ_k} associated with each observation ζ_k was available. In fact, then the estimators would be of simple occurrence-exposure type,
α̂_i = (1/N) ∑_{k=1}^N I(J_0^{(k)} = i),    t̂_{ij} = N_{ij} / T_i,    i ∈ E, j ∈ E_Δ,

where

T_i = ∑_{k=1}^N ∫_0^{ζ_k} I(J_t^{(k)} = i) dt,    N_{ij} = ∑_{k=1}^N #{t ∈ [0, ζ_k] : J_{t−}^{(k)} = i, J_t^{(k)} = j}

(T_i is the total time spent in state i and N_{ij} is the total number of jumps from i to j). The general idea of the EM algorithm ([106]) is to replace such unobserved quantities by the conditional expectation given the observations; since this is parameter-dependent, one is led to an iterative scheme, e.g.

t_{ij}^{(n+1)} = E_{α^{(n)},T^{(n)}}(N_{ij} | ζ_1, ..., ζ_N) / E_{α^{(n)},T^{(n)}}(T_i | ζ_1, ..., ζ_N),
and similarly for α_i^{(n+1)}. The crux is the computation of the conditional expectations. E.g., it is easy to see that

E_{α^{(n)},T^{(n)}}(T_i | ζ_1, ..., ζ_N) = ∑_{k=1}^N E_{α^{(n)},T^{(n)}} ∫_0^{ζ_k} I(J_t = i) dt
 = ∑_{k=1}^N ∫_0^{ζ_k} ( α^{(n)} e^{T^{(n)}x} e_i ⋅ e_i' e^{T^{(n)}(ζ_k − x)} t^{(n)} / α^{(n)} e^{T^{(n)}ζ_k} t^{(n)} ) dx,
and this and similar expressions are then computed by numerical solution of a set of differential equations. In practice, the methods of [70] and [38] appear to produce almost identical results. Thus, it seems open whether the restriction to the acyclic case is a severe loss of generality.
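To fix ideas, here is a heavily simplified Python sketch of one EM iteration for a single observation (all function names, the starting values and the observation are my own; the conditional expectations are approximated by crude Riemann sums on a grid rather than by the differential-equation approach used in [38]).

```python
import numpy as np
from scipy.linalg import expm

def em_step(alpha, T, zeta, grid=400):
    """One EM iteration for a single observation zeta, with the conditional
    expectations approximated by Riemann sums on a grid."""
    p = len(alpha)
    t = -T @ np.ones(p)                                     # exit rate vector
    xs = np.linspace(0.0, zeta, grid)
    dx = xs[1] - xs[0]
    a_x = np.array([alpha @ expm(T * x) for x in xs])       # alpha e^{Tx}
    b_x = np.array([expm(T * (zeta - x)) @ t for x in xs])  # e^{T(zeta-x)} t
    dens = alpha @ expm(T * zeta) @ t                       # density of B at zeta

    # E-step: expected sufficient statistics given the observation.
    B = alpha * b_x[0] / dens                               # P(J_0 = i | zeta)
    Ti = (a_x * b_x).sum(axis=0) * dx / dens                # expected time in i
    Nij = T * (a_x[:, :, None] * b_x[:, None, :]).sum(axis=0) * dx / dens
    Nexit = a_x[-1] * t / dens                              # expected exits from i

    # M-step: occurrence/exposure estimates.
    alpha_new = B
    T_new = Nij / Ti[:, None]
    np.fill_diagonal(T_new, 0.0)
    np.fill_diagonal(T_new, -(T_new.sum(axis=1) + Nexit / Ti))
    return alpha_new, T_new

# Iterate from an arbitrary 2-phase starting point on one artificial observation.
alpha, T = np.array([0.6, 0.4]), np.array([[-1.0, 0.5], [0.2, -2.0]])
for _ in range(5):
    alpha, T = em_step(alpha, T, zeta=1.3)
print(alpha)
print(T)
```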
Bibliography [1] J. Abate, G.L. Choudhury & W. Whitt (1994) Waiting-time tail probabilities in queues with long-tail service -time distributions . Queueing Systems 16, 311-338.
[2] J. Abate & W. Whitt ( 1992) The Fourier-series method for inverting transforms of probability distributions . Queueing Systems 10, 5-87. [3] J. Abate & W. Whitt (1998) Explicit M/G/1 waiting-time distributions for a class of long-tail service time distributions . Preprint, AT&T. [4] M. Abramowitz & I. Stegun ( 1972 ) Handbook of Mathematical Functions (10th ed.). Dover, New York.
[5] G. Alsmeyer (1991) Erneuerungstheorie. B.G. Teubner, Stuttgart. [6] V. Anantharam (1988 ) How large delays build up in a GI/GI11 queue. Queueing Systems 5 , 345-368. [7] E. Sparre Andersen (1957) On the collective theory of risk in the case of contagion between the claims . Transactions XVth International Congress of Actuaries, New York, II, 219-229. [8] G. Arfwedson ( 1954) Research in collective risk theory. Skand. Aktuar Tidskr. 37, 191-223.
[9] G. Arfwedson ( 1955) Research in collective risk theory. The case of equal risk sums. Skand. Aktuar Tidskr. 38, 53-100. [10] K. Arndt (1984) On the distribution of the supremum of a random walk on a Markov chain . In: Limit Theorems and Related Problems (A.A. Borokov, ed.), pp. 253-267. Optimizations Software, New York. [11) S. Asmussen ( 1982 ) Conditioned limit theorems relating a random walk to its associate, with applications to risk reserve processes and the GI /G/1 queue. Adv. Appl. Probab. 14, 143-170. [12] S. Asmussen (1984) Approximations for the probability of ruin within finite time . Scand. Act. J. 1984 , 31-57 ; ibid. 1985, 57. [13] S. Asmussen (1985) Conjugate processes and the simulation of ruin problems. Stoch. Proc. Appl. 20, 213-229. [14] S. Asmussen (1987) Applied Probability and Queues. John Wiley & Sons, Chichester New York. [15] S. Asmussen (1989a) Aspects of matrix Wiener-Hopf factorisation in applied probability. The Mathematical Scientist 14, 101-116.
[16] S. Asmussen (1989b) Risk theory in a Markovian environment. Scand. Act. J. 1989 , 69-100. [17] S. Asmussen (1991) Ladder heights and the Markov-modulated M/G/1 queue. Stoch. Proc. Appl. 37, 313-326. [18] S. Asmussen (1992a) Phase-type representations in random walk and queueing problems. Ann. Probab. 20, 772-789. [19] S. Asmussen (1992b) Light traffic equivalence in single server queues. Ann. Appl. Probab. 2, 555-574. [20] S. Asmussen (1992c) Stationary distributions for fluid flow models and Markovmodulated reflected Brownian motion. Stochastic Models 11, 21-49. [21] S. Asmussen (1995) Stationary distributions via first passage times. Advances in Queueing: Models, Methods V Problems (J. Dshalalow ed.), 79-102. CRC Press, Boca Raton, Florida. [22] S. Asmussen (1998a) Subexponential asymptotics for stochastic processes: extremal behaviour, stationary distributions and first passage probabilities. Ann. Appl. Probab. 8, 354-374. [23] S. Asmussen (1998b) Extreme value theory for queues via cycle maxima. Extremes 1 , 137-168. [24] S. Asmussen (1998c) A probabilistic look at the Wiener-Hopf equation. SIAM Review 40, 189-201.
[25] S. Asmussen (1999) On the ruin problem for some adapted premium rules. Probabilistic Analysis of Rare Events (V.K. Kalashnikov & A.M. Andronov, eds.), 3-15. Riga Aviation University. [26] S. Asmussen (2000) Matrix-analytic models and their analysis. Scand. J. Statist. 27, 193-226. [27] S. Asmussen & K. Binswanger (1997) Simulation of ruin probabilities for subexponential claims. Astin Bulletin 27, 297-318. [28] S. Asmussen, K. Binswanger & B. Hojgaard (2000) Rare events simulation for heavy-tailed distributions. Bernoulli 6, 303-322.
[29] S. Asmussen & M. Bladt (1996) Renewal theory and queueing algorithms for matrix-exponential distributions. Matrix-Analytic Methods in Stochastic Models (A.S. Alfa & S. Chakravarty, eds.), 313-341. Marcel Dekker, New York. [30] S. Asmussen & M. Bladt (1996) Phase-type distributions and risk processes with premiums dependent on the current reserve. Scand. Act. J. 1996, 19-36. [31] S. Asmussen, L. Floe Henriksen & C. Kliippelberg (1994) Large claims approximations for risk processes in a Markovian environment. Stoch. Proc. Appl. 54, 29-43.
[32] S. Asmussen, A. Frey, T. Rolski & V. Schmidt (1995) Does Markov-modulation increase the risk? ASTIN Bull. 25, 49-66. [33] S. Asmussen & B. Hojgaard (1996) Finite horizon ruin probabilities for Markovmodulated risk processes with heavy tails. Th. Random Processes 2, 96-107. [34] S. Asmussen & B. Hojgaard (1999) Approximations for finite horizon ruin prob-
abilities in the renewal model. Scand. Act. J. 1999 , 106-119.
[35] S. Asmussen, B. Hojgaard & M. Taksar (2000) Optimal risk control and dividend distribution policies . Example of excess-off-loss reinsurance for an insurance corporation. Finance and Stochastics 4, 299-324. [36] S. Asmussen & C. Kliippelberg (1996) Large deviations results for subexponential tails, with applications to insurance risk Stoch. Proc. Appl. 64, 103-125. [37] S. Asmussen & G. Koole (1993). Marked point processes as limits of Markovian
arrival streams. J. Appl. Probab. 30, 365-372. [38] S. Asmussen , O. Nerman & M. Olsson (1996). Fitting phase-type distributions via the EM algorithm. Scand. J. Statist. 23, 419-441. [39] S. Asmussen & H.M. Nielsen (1995) Ruin probabilities via local adjustment coefficients . J. Appl. Prob 32 , 736-755. [40] S. Asmussen & C.A. O'Cinneide (2000/01) On the tail of the waiting time in a Markov-modulated M/G/1 queue. Opns. Res. (to appear). [41] S. Asmussen & C.A. O'Cinneide (2000/2001) Matrix-exponential distributions [Distributions with a rational Laplace transform] Encyclopedia of Statistical Sciences, Supplementary Volume (Kotz, Johnson, Read eds.). Wiley. [42] S. Asmussen & D. Perry (1992) On cycle maxima, first passage problems and extreme value theory for queues. Stochastic Models 8, 421-458.
[43] S. Asmussen & T. Rolski (1991) Computational methods in risk theory: a matrix-algorithmic approach. Insurance: Mathematics and Economics 10, 259274. [44] S. Asmussen & T. Rolski (1994) Risk theory in a periodic environment: Lundberg's inequality and the Cramer-Lundberg approximation. Math. Oper. Res. 410-433. [45] S. Asmussen & R.Y. Rubinstein (1995) Steady-state rare events simulation in queueing models and its complexity properties. Advances in Queueing: Models, Methods 8 Problems (J. Dshalalow ed.), 429-466. CRC Press, Boca Raton, Florida. [46] S. Asmussen & R.Y. Rubinstein (1999) Sensitivity analysis of insurance risk models. Management Science 45, 1125-1141.
[47] S. Asmussen, H. Schmidli & V. Schmidt (1999) Tail approximations for nonstandard risk and queueing processes with subexponential tails. Adv. Appl. Probab. 31, 422-447. [48] S. Asmussen & V. Schmidt (1993) The ascending ladder height distribution for a class of dependent random walks. Statistica Neerlandica 47, 1-9. [49] S. Asmussen & V. Schmidt (1995) Ladder height distributions with marks. Stoch. Proc. Appl. 58, 105-119. [50] S. Asmussen & S. Schock Petersen (1989) Ruin probabilities expressed in terms of storage processes. Adv. Appl. Probab. 20, 913-916.
[51] S. Asmussen & K. Sigman (1996) Monotone stochastic recursions and their duals. Probab. Th. Eng. Inf. Sc. 10, 1-20. [52] S. Asmussen & M. Taksar (1997) Controlled diffusion models for optimal dividend payout. Insurance: Mathematics and Economics 20, 1-15.
[53] S. Asmussen & J.L. Teugels ( 1996) Convergence rates for M /G/1 queues and ruin problems with heavy tails . J. Appl. Probab. 33, 1181-1190. [54] K.B . Athreya & P. Ney ( 1972) Branching Processes . Springer , Berlin.
[55] B. von Bahr (1974) Ruin probabilities expressed in terms of ladder height distributions. Scand. Act. J. 1974, 190-204. [56] B. von Bahr (1975) Asymptotic ruin probabilities when exponential moments do not exist. Scand. Act. J. 1975, 6-10. [57] C.T.H. Baker (1977) The Numerical Solution of Integral Equations. Clarendon Press, Oxford. [58] O. Barndorff-Nielsen (1978) Information and Exponential Families in Statistical Theory. Wiley, New York. [59] O. Barndorff-Nielsen & H. Schmidli (1995) Saddlepoint approximations for the probability of ruin in finite time. Scand. Act. J. 1995, 169-186. [60] J. Beekman (1969) A ruin function approximation. Trans. Soc. Actuaries 21, 41-48, 275-279. [61] J. Beekman (1974) Two Stochastic Processes. Halsted Press, New York. [62] J.A. Beekman (1985) A series for infinite time ruin probabilities. Insurance: Mathematics and Economics 4, 129-134. [63] A. Berman & R.J. Plemmons (1994) Nonnegative Matrices in the Mathematical Sciences. SIAM. [64] P. Billingsley (1968) Convergence of Probability Measures. Wiley, New York. [65] N.H. Bingham (1975) Fluctuation theory in continuous time. Adv. Appl. Probab. 7, 705-766. [66] N.H. Bingham, C.M. Goldie & J.L. Teugels (1987) Regular Variation. Cambridge University Press, Cambridge.
[67] T. Bjork & J. Grandell (1985) An insensitivity property of the ruin probability. Scand. Act. J. 1985 , 148-156. [68] T. Bjork & J. Grandell (1988) Exponential inequalities for ruin probabilities in the Cox case. Scand. Act. J. 1988 , 77-111.
[69] P. Bloomfield & D.R. Cox (1972) A low traffic approximation for queues. J. Appl. Probab. 9, 832-840. [70] A. Bobbio & M. Telek (1994) A bencmark for PH estimation algorithms: results for acyclic PH. Stochastic Models 10, 661-667. [71] P. Boogaert & V. Crijns (1987) Upper bounds on ruin probabilities in case of negative loadings and positive interest rates. Insurance: Mathematics and Economics 6, 221-232. [72] P. Boogaert & A. de Waegenaere (1990) Simulation of ruin probabilities. Insurance: Mathematics and Economics 9, 95-99. [73] A.A. Borovkov (1976) Asymptotic Methods in Queueing Theory. SpringerVerlag.
[74] O. Boxma & J.W. Cohen (1998) The M/G/1 queue with heavy-tailed service time distribution. IEEE J. Sel. Area Commun. 16, 749-763. [75] O. Boxma & J.W. Cohen (1999) Heavy-traffic analysis for the GI/G/1 queue with heavy-tailed service time distributions . Queueing Systems 33, 177-204. [76] N.L. Bowers , Jr., H.U. Gerber, J.C. Hickman, D.A. Jones & C.J. Nesbitt (1986)
Actuarial Mathematics . The Society of Actuaries , Itasca, Illinois. [77] P. Bratley, B.L. Fox & L. Schrage (1987) A Guide to Simulation. Springer, New York. [78] L. Breiman ( 1968) Probability. Addison-Wesley, Reading. [79] P.J. Brockwell , S.I. Resnick & R.L. Tweedie (1982) Storage processes with general release rule and additive inputs . Adv. Appl. Probab. 14, 392-433. [80] F. Broeck , M. Goovaerts & F. De Vylder (1986) Ordering of risks and ruin probabilities. Insurance : Mathematics and Economics 5, 35-39. [81] J.A. Bucklew ( 1990) Large Deviation Techniques in Decision, Simulation and Estimation. Wiley, New York. [82] H. Biihlmann ( 1970) Mathematical Methods in Risk Theory. Springer-Verlag, Berlin. [83] D.Y . Burman & D .R. Smith ( 1983) Asymptotic analysis of a queueing system with bursty traffic. Bell. Syst. Tech. J. 62, 1433-1453. [84] D.Y. Burman & D.R. Smith ( 1986) An asymptotic analysis of a queueing system with Markov-modulated arrivals . Oper. Res. 34, 105-119. [85] W. Bux & U. Herzog (1977) The phase concept : approximations of measured data and performance analysis. Computer Performance (K.M. Chandy & M. Reiser eds.), 23-38. North-Holland, Amsterdam.
[86] K.L. Chung (1974) A Course in Probability Theory (2nd ed.). Academic Press, New York San Francisco London. [87] E. Qinlar (1972) Markov additive processes. II. Z. Wahrscheinlichkeitsth. verve. Geb. 24, 93-121. [88] J.W. Cohen (1982) The Single Server Queue (2nd ed.) North-Holland, Amsterdam. [89] M. Cottrell, J.-C. Fort & G. Malgouyres (1983) Large deviations and rare events in the study of stochastic algorithms . IEEE Trans. Aut. Control AC-28, 907920. [90] D.R. Cox (1955) Use of complex probabilities in the theory of stochastic processes. Proc. Cambr. Philos. Soc. 51, 313-319. [91] H. Cramer (1930) On the Mathematical Theory of Risk. Skandia Jubilee Volume, Stockholm.
[92] H. Cramer (1955) Collective risk theory. The Jubilee volume of Forsakringsbolaget Skandia, Stockholm. [93] K. Croux & N. Veraverbeke (1990). Non-parametric estimators for the probability of ruin. Insurance : Mathematics and Economics 9, 127-130.
[94] M. Csorgo & J. Steinebach ( 1991 ) On the estimation of the adjustment coefficient in risk theory via intermediate order statistics . Insurance: Mathematics and Economics 10, 37-50. [95] M. Csorgo & J.L. Teugels ( 1990) Empirical Laplace transform and approximation of compound distributions . J. Appl. Probab. 27, 88-101.
[96] D. Daley & T. Rolski (1984) A light traffic approximation for a single server queue. Math. Oper. Res. 9 , 624-628. [97] D. Daley & T. Rolski (1991) Light traffic approximations in queues. Math. Oper. Res. 16, 57-71. [98] A. Dassios & P. Embrechts (1989). Martingales and insurance risk. Stochastic Models 5, 181-217. [99] R. van Dawen (1986) Ein einfacher Beweis fur Ohlins Lemma. Blotter der deutschen Gesellschaft fur Versicherungsmathematik XVII, 435-436. [100] A. Davidson (1946) On the ruin problem of collective risk theory under the assumption of a variable safety loading (in Swedish). Forsiikringsmatematiska Studier Tillagnade Filip Lundberg, Stockholm. English version published in Skand. Aktuar. Tidskr. Suppl., 70-83 (1969).
[101] C.D. Daykin, T. Pentikainen & E. Pesonen (1994) Practical Risk Theory for Actuaries. Chapman & Hall. [102] P. Deheuvels & J. Steinebach (1990) On some alternative estimators of the adjustment coefficient in risk theory. Scand. Act. J. 1990 , 135-159.
[103] F. Delbaen & J. Haezendonck (1985) Inversed martingales in risk theory. Insurance: Mathematics and Economics 4, 201-206. [104] F. Delbaen & J. Haezendonck (1987) Classical risk theory in an economic environment. Insurance: Mathematics and Economics 6, 85-116. [105] A. Dembo & O. Zeitouni (1992) Large Deviations Techniques and Applications. Jones and Bartlett, Boston. [106] A.P. Dempster, N.M. Laird & D.B. Rubin (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 22, 583-602. [107] L. Devroye (1986) Non-uniform Random Variate Generation. Springer, New York.
[108] F. De Vylder (1977) A new proof of a known result in risk theory. J. Comp. Appl. Math. 3, 227-229. [109] F. De Vylder (1978) A practical solution to the problem of ultimate ruin probability. Scand. Actuarial J., 114-119. [110] F. De Vylder (1996) Advanced Risk Theory. Editions de 1'Universite de Bruxelles. [111] F. De Vylder & M. Goovaerts (1984) Bounds for classical ruin probabilities. Insurance: Mathematics and Economics 3, 121-131. [112] F. De Vylder & M. Goovaerts (1988) Recursive calculation of finite-time ruin probabilities. Insurance: Mathematics and Economics 7, 1-7.
[113] D.C.M. Dickson (1992) On the distribution of the surplus prior to ruin . Insurance : Mathematics and Economics 11, 191-207.
[114] D.C.M. Dickson (1994) An upper bound for the probability of ultimate ruin. Scand. Act. J. 1994 , 131-138. [115] D.C.M. Dickson (1995) A review of Panjer's recursion formula and its applications. British Actuarial J. 1, 107-124.
[116] D.C.M. Dickson & J.R. Gray (1984) Exact solutions for ruin probability in the presence of an upper absorbing barrier. Scand. Act. J. 1984 , 174-186. [117] D.C.M. Dickson & J.R. Gray (1984) Approximations to the ruin probability in the presence of an upper absorbing barrier. Scand. Act. J. 1984 , 105-115. [118] D.C.M. Dickson & C. Hipp (1998) Ruin probabilities for Erlang (2) risk processes. Insurance : Mathematics and Economics 22, 251-262.
[119] D.C.M. Dickson & C. Hipp (1999) Ruin problems for phase-type(2) risk processes. Scand. Act. J. 2000 , 147-167. [120] D.C.M. Dickson & H.R. Waters (1996 ) Reinsurance and ruin . Insurance: Mathematics and Economics 19, 61-80. [121] D.C.M. Dickson & H.R. Waters (1999) Ruin probabilities wih compounding. Insurance: Mathematics and Economics 25, 49-62. [122] B. Djehiche (1993) A large deviation estimate for ruin probabilities. Scand. Act. J. 1993 , 42-59.
[123] E. van Doorn & G.J.K. Regterschot (1988) Conditional PASTA. Oper. Res. Letters 7, 229-232. [124] N.G. Duffield & N. O'Connell (1995) Large deviations and overflow probabilities for the general single-server queue, with applications. Math. Proc. Camb. Philos. Soc. 118, 363-374. [125] F. Dufresne & H.U. Gerber (1988) The surpluses immediately before and at ruin, and the amount of claim causing ruin. Insurance : Mathematics and Economics 7, 193-199.
[126] F. Dufresne & H.U. Gerber (1991) Risk theory for the compound Poisson process that is perturbed by a diffusion. Insurance : Mathematics and Economics 10, 5159. [127] F. Dufresne, H.U. Gerber & E.S.W. Shiu (1991) Risk theory with the Gamma process. Astin Bulletin 21, 177-192. [128] E.B. Dynkin (1965) Markov Processes I. Springer , Berlin Gottingen Heidelberg.
[129] D .C. Emanuel , J.M. Harrison & A.J. Taylor ( 1975) A diffusion approximation for the ruin probability with compounding assets. Scand. Act. J. 1975 , 37-45. [130] P. Embrechts ( 1988 ). Ruin estimates for large claims . Insurance: Mathematics and Economics 7, 269-274. [131] P. Embrechts , J. Grandell & H. Schmidli (1993) Finite-time Lundberg inequalities in the Cox case . Scand. Act. J. 1993 , 17-41.
[132] P. Embrechts, R. Griibel & S.M. Pitts (1993) Some applications of the fast Fourier transform in insurance mathematics . Statistica Neerlandica 47, 59-75. [133] P. Embrechts & T. Mikosch (1991). A bootstrap procedure for estimating the adjustment coefficient. Insurance: Mathematics and Economics 10, 181-190. [134] P. Embrechts, C. Kliippelberg & T. Mikosch (1997) Modelling Extremal Events for Finance and Insurance. Springer, Heidelberg.
[135] P. Embrechts & H. Schmidli (1994) Ruin estimation for a general insurance risk model. Adv. Appl. Probab. 26, 404-422. [136] P. Embrechts & N. Veraverbeke (1982) Estimates for the probability of ruin with special emphasis on the possibility of large claims . Insurance : Mathematics and Economics 1, 55-72. [137] P. Embrechts & J.A. Villasenor (1988) Ruin estimates for large claims. Insurance: Mathematics and Economics 7, 269-274. [138] P. Embrechts, J.L. Jensen, M. Maejima & J.L. Teugels (1985). Approximations for compound Poisson and Polya processes. Adv. Appl. Probab. 17, 623-637. [139] A.K. Erlang (1909) Sandsynlighedsregning og telefonsamtaler . Nyt Tidsskrift for Matematik B20, 33-39. Reprinted 1948 as 'The theory of probabilities and telephone conversations' in The Life and work of A.K. Erlang 2, 131-137. Trans. Danish Academy Tech. Sci. [140] J.D. Esary, F. Proschan & D.W. Walkup (1967) Association of random variables, with applications. Ann. Math. Statist. 38, 1466-1474.
[141] F. Esscher (1932) On the probability function in the collective theory of risk. Skand. Akt. Tidsskr. 175-195.
[142] W. Feller (1966) An Introduction to Probability Theory and its Applications I (3nd ed.). Wiley, New York. [143] W. Feller (1971) An Introduction to Probability Theory and its Applications II (2nd ed.). Wiley, New York. [144] P.J. Fitzsimmons (1987). On the excursions of Markov processes in classical duality. Probab. Th. Rel. Fields 75, 159-178. [145] P. Franken, D. Konig, U. Arndt & V. Schmidt (1982) Queues and Point Processes. Wiley. [146] E.W. Frees (1986) Nonparametric estimation of the probability of ruin. Astin Bulletin 16, 81-90. [147] M. Frenz & V. Schmidt (1992) An insensitivity property of ladder height distributions. J. Appl. Probab. 29, 616-624. [148] C.D. Fuh (1997) Corrected ruin probabilities for ruin probabilities in Markov random walks. Adv. Appl. Probab. 29, 695-712. [149] C.D. Fuh & T. Lai (1998) Wald's equations, first passage times and moments of ladder variables in Markov random walks. J. Appl. Probab. 35, 566-580. [150] H. Furrer (1998) Risk processes perturbed by a-stable Levy motion. Scand. Act. J. 1998 , 59-74.
[151] H. Furrer (1996) A note on the convergence of the infinite-time ruin probabilities when the weak approximation is a-stable Levy motion. Unpublished, contained in the authors PhD thesis, ETH Zurich. [152] H. Furrer, Z. Michna & A. Weron (1997) Stable Levy motion approximation in collective risk theory. Insurance: Mathematics and Economics 20, 97-114.
[153] H. Furrer & H. Schmidli (1994) Exponential inequalities for ruin probabilities of risk processes perturbed by diffusion. Insurance: Mathematics and Economics 15, 23-36.
[154] H.U. Gerber (1970) An extension of the renewal equation and its application in the collective theory of risk. Skand. Aktuarietidskrift 1970 , 205-210. [155] H.U. Gerber (1971) Der Einfluss von Zins auf die Ruinwahrscheinlichkeit. Mitt. Ver. Schweiz. Vers. Math.71, 63-70.
[156] H.U. Gerber (1973) Martingales in risk theory. Mitt. Ver. Schweiz. Vers. Math. 73, 205-216. [157] H.U. Gerber (1979) An Introduction to Mathematical Risk Theory. S.S. Huebner Foundation Monographs, University of Pennsylvania.
[158] H.U. Gerber (1981) On the probability of ruin in the presence of a linear dividend barrier. Scand. Act. J. 1981 , 105-115. [159] H.U. Gerber (1986). Life Insurance Mathematics. Springer. [160] H.U. Gerber (1988) Mathematical fun with ruin theory. Insurance : Mathematics and Economics 7, 15-23.
[161] P. Glasserman (1991) Gradient Estimation via Perturbation Analysis. Kluwer, Boston, Dordrecht, London. [162] P. Glasserman & S.-G. Kou (1995) Limits of first passage times to rare sets in regenerative processes. Ann. Appl. Probab. 5, 424-445. [163] P.W. Glynn & W. Whitt (1994) Logarithmic asymptotics for steady-state tail probabilities in a single-server queue. In Studies in Applied Probability (J. Galambos & J. Gani, eds.). J. Appl. Probab. 31A, 131-156.
[164] B.V. Gnedenko & I.N. Kovalenko (1989) Introduction to Queueing Theory (2nd ed.). Birkhauser , Basel. [165] M.J. Goovaerts, F. de Vylder & J. Haezendonck (1984) Insurance Premiums. North-Holland, Amsterdam. [166] M.J. Goovaerts, R. Kaas, A.E. van Heerwarden & T. Bauwelinckx (1990) Effective Actuarial Methods. North-Holland, Amsterdam. [167] C.M. Goldie & R. Griibel (1996) Perpetuities with thin tails. Adv. Appl. Probab. 28, 463-480. [168] J. Grandell (1977) A class of approximations of ruin probabilities. Scand. Act. J., Suppl., 1977, 37-52. [169] J. Grandell (1978) A remark on 'A class of approximations of ruin probabilities'. Scand. Act. J. 1978 , 77-78.
[170] J. Grandell (1979) Empirical bounds for ruin probabilities. Stoch. Proc. Appl. 8,243-255. [171] J. Grandell (1990) Aspects of Risk Theory. Springer-Verlag. [172] J. Grandell ( 1992) Finite time ruin probabilities and martingales . Informatica 2, 3-32. [173] J. Grandell (1997) Mixed Poisson Processes. Chapman & Hall. [174] J. Grandell (1999) Simple approximations of ruin functions. Preprint, KTH. [175] J. Grandell & C.-O. Segerdahl (1971) A comparison of some approximations of ruin probabilities. Skand. Aktuar Tidskr. 54, 143-158. [176] B. Grigelionis ( 1992) On Lundberg inequalities in a Markovian environment. Proc . Winter School on Stochastic Analysis and Appl .. Akademie-Verlag, Berlin. [177] B. Grigelionis ( 1993) Two-sided Lundberg inequalities in a Markovian environment. Liet. Matem. Rink. 33, 30-41. [178] B. Grigelionis (1996) Lundberg-type stochastic processes. Probability Theory and Mathematical Statistics (I.A. Ibragimov, ed.), 167-176. [179] R. Grnbel (1991) G/G/1 via FFT. Statistical Algorithm 265. Applied Statistics 40, 355-365.
[180] R. Griibel & R. Hermesmeier (1999) Computation of compound distributions I: aliasing errors and exponential tilting. Astin Bulletin 29, 197-214. [181] D.V. Gusak & V.S. Korolyuk (1969) On the joint distribution of a process with stationary increments and its maximum . Th. Probab. Appl. 14, 400-409. [182] A. Gut (1988 ) Stopped Random Walks . Theory and Applications. SpringerVerlag, New York. [183] M. Gyllenberg & D. Silvestrov (2000) Cram€r-Lundberg approximations for nonlinearly perturbed risk processes . Insurance: Mathematics and Economics 26, 75-90.
[184] H. Hadwiger (1940) Über die Wahrscheinlichkeit des Ruins bei einer grossen Zahl von Geschäften. Archiv für mathematische Wirtschafts- und Sozialforschung 6, 131-135. [185] J.M. Harrison (1977) Ruin problems with compounding assets. Stoch. Proc. Appl. 5, 67-79.
[186] J. M. Harrison & A.J. Lemoine (1977) Limit theorems for periodic queues. J. Appl. Probab. 14, 566-576. [187] J. M. Harrison & S.I. Resnick (1976 ) The stationary distribution and first exit probabilities of a storage process with general release rule . Math. Opns . Res. 1, 347-358.
[188] J. M. Harrison & S.I. Resnick (1977) The recurrence classification of risk and storage processes . Math. Opns . Res. 3 , 57-66. [189] A. E. van Heerwarden (1991) Ordering of Risks: Theory and Actuarial Applications. Tinbergen Institute Research Series 20, Amsterdam.
[190] P. Heidelberger (1995) Fast simulation of rare events in queueing and reliability models. ACM TOMACS 6, 43-85. [191] W.-R. Heilmann (1987) Grundbegriffe der Risikotheorie. Verlag Versicherungswirtschaft e.V., Karlsruhe. [192] U. Herkenrath (1986) On the estimation of the adjustment coefficient in risk theory by means of stochastic approximation procedures. Insurance: Mathematics and Economics 5, 305-313.
[193] U. Hermann (1965) Bin Approximationssatz fiir Verteilungen stationarer zufalliger Punktfolgen. Math. Nachrichten 30, 377-381. [194] 0. Hesselager (1990) Some results on optimal reinsurance in terms of the adjustment coefficient. Scand. Act. J. 1990 , 80-95. [195] B.M. Hill (1975) A simple general approach to inference about the tail of a distribution. Ann. Statist. 3, 1163-1174. [196] C. Hipp (1989a) Efficient estimators for ruin probabilities. Proc. Foruth Prague Symp. on Asymptotic Statistics (P. Mandl & M. Huskova, eds.), 259-268. [197] C. Hipp (1989b) Estimators and bootstrap confidence intervals for ruin probabilities. Astin Bulletin 19, 57-70. [198] C. Hipp & R. Michel (1990) Risikotheorie: Stochastische Modelle and Statistische Methoden. Versicherungswirtschaft, Karlsruhe.
[199] R.V. Hogg & S.A. Klugman (1984) Loss Distributions. Wiley, New York. [200] M.L. Hogan (1986) Comment on corrected diffusion approximations in certain random walk problems. J. Appl. Probab. 23, 89-96. [201] L. Horvath & E. Willekens (1986) Estimates for the probability of ruin starting with a large initial reserve. Insurance: Mathematics and Economics 5, 285-293. [202] 0. Haggstrom, S. Asmussen & 0. Nerman (1992) EMPHT - a program for fitting phase-type distributions. Studies in Statistical Quality Control and Reliability 1992 :4. Mathematical Statistics, Chalmers University of Technology and the University of Goteborg. [203] T. Hoglund (1974) Central limit theorems and statistical inference for Markov chains. Z. Wahrscheinlichkeitsth. verve. Geb. 29, 123-151. [204] T. Hoglund (1990) An asymptotic expression for the probability of ruin within finite time. Ann. Probab. 18, 378-389.
[205] T. Hoglund (1991). The ruin problem for finite Markov chains. Ann. Prob. 19, 1298-1310. [206] B. Hojgaard & M. Taksar (2000) Optimal proportional reinsurance policies for diffusion models. Scand. Act. J. 1998 , 166-180.
[207] D.L. Iglehart (1969) Diffusion approximations in collective risk theory. J. Appl. Probab. 6, 285-292.
[208] V.B. Iversen & L. Staalhagen (1999) Waiting time distributions in M/D/1 queueing systems. Electronic Letters 35, No. 25.
[209] D. Jagerman (1985) Certain Volterra integral equations arising in queueing. Stochastic Models 1, 239-256. [210] D. Jagerman (1991) Analytical and numerical solution of Volterra integral equations with applications to queues. Manuscript, NEC Research Institute, Princeton, N.J. [211] J. Janssen (1980) Some transient results on the M/SM/1 special semi-Markov model in risk and queueing theories. Astin Bull. 11, 41-51. [212] J. Janssen & J.M. Reinhard (1985) Probabilities de ruine pour une classe de modeles de risque semi-Markoviens. Astin Bull. 15, 123-133.
[213] P.R. Jelenkovic & A.A. Lazar (1998) Subexponential asymptotics of a Markovmodulated random walk with a queueing application. J. Appl. Probab. 35, 325247. [214] A. Jensen (1953) A Distribution Model Applicable to Economics. Munksgaard, Copenhagen. [215] J.L. Jensen (1995) Saddle Point Approximations. Clarendon Press, Oxford. [216] M. Johnson & M. Taaffe (1989/90) Matching moments to phase distributions. Stochastic Models 5, 711-743; ibid. 6, 259-281; ibid. 6, 283-306.
[217] R. Kaas & M.J. Goovaerts (1986) General bound on ruin probabilities. Insurance: Mathematics and Economics 5, 165-167. [218] V. Kalashnikov (1996) Two-sided bounds of ruin probabilities. Scand. Act. J. 1996, 1-18. [219] V. Kalashnikov (1997) Geometric Sums: Bounds for Rare Event with Applications. Kluwer.
[220] V. Kalashnikov (1999) Bounds for ruin probabilities in the presence of large claims and their comparison. N. Amer. Act. J. 3, 116-129. [221] E.P.C. Kao (1988) Computing the phase-type renewal and related functions. Technometrics 30, 87-93. [222] S. Karlin & H.M. Taylor (1981) A Second Course in Stochastic Processes. Academic Press, New York. [223] J. Keilson (1966) A limit theorem for passage times in ergodic regenerative processes. Ann. Math. Statist. 37, 866-870. [224] J. Keilson & D.M.G. Wishart (1964) A central limit theorem for processes defined on a finite Markov chain. Proc. Cambridge Philos. Soc. 60, 547-567. [225] J. Keilson & D.M.G. Wishart (1964) Boundary problems for additive processes defined on a finite Markov chain. Proc. Cambridge Philos. Soc. 61, 173-190. [226] J. Keilson & D.M.G. Wishart (1964). Addenda to for processes defined on a finite Markov chain. Proc. Cambridge Philos. Soc. 63, 187-193. [227] J.H.B. Kemperman (1961) The Passage Problem for a Markov Chain. University of Chicago Press, Chicago. [228] J. Kennedy (1994) Understanding the Wiener-Hopf factorization for the simple random walk. J. Appl. Probab. 31, 561-563.
[229] J.F.C. Kingman (1961) A convexity property of positive matrices . Quart. J. Math. Oxford 12, 283-284. [230] J.F.C. Kingman (1962) On queues in heavy traffic. Quart. J. Roy. Statist. Soc. B24, 383-392. [231] J.F.C. Kingman (1964) A martingale inequality in the theory of queues. Proc. Camb. Philos. Soc. 60 , 359-361. [232] C. Kliippelberg (1988) Subexponential distributions and integrated tails. J. Appl. Prob. 25, 132-141.
[233] C. Klüppelberg (1989) Estimation of ruin probabilities by means of hazard rates. Insurance: Mathematics and Economics 8, 279-285. [234] C. Klüppelberg (1993) Asymptotic ordering of risks and ruin probabilities. Insurance: Mathematics and Economics 12, 259-264.
(235] C. Klnppelberg & T. Mikosch (1995) Explosive Poisson shot noise with applications to risk retention. Bernoulli 1 , 125-147. [236] C. Klnppelberg & T. Mikosch (1995) Modelling delay in claim settlement. Scand. Act. J. 1995 , 154-168. [237] C. Kliippelberg & U. Stadtmiiller (1998) Ruin probabilities in the presence of heavy-tails and interest rates . Scand. Act. J. 1998 , 49-58. [238] V.S. Korolyuk, I.P. Penev & A.F. Turbin (1973) An asymptotic expansion for the absorption time of a Markov chain distribution. Cybernetica 4, 133-135 (in Russian). [239] H. Kunita (1976) Absolute continuity of Markov processes. Seminaire de Probabilties X . Lecture Notes in Mathematics 511, 44-77. Springer-Verlag. [240] U. Kuchler & M. Sorensen (1997) Exponential Families of Stochastic Processes. Springer-Verlag.
[241] G. Latouche & V. Ramaswami (1999) Introduction to Matrix-Analytic Methods in Stochastic Modelling. SIAM. [242] A.J. Lemoine (1981) On queues with periodic Poisson input. J. Appl. Probab. 18, 889-900. [243] A.J. Lemoine (1989) Waiting time and workload in queues with periodic Poisson input. J. Appl. Probab. 26, 390-397. [244] T. Lehtonen & H. Nyrhinen (1992a) Simulating level-crossing probabilities by importance sampling. Adv. Appl. Probab. 24, 858-874. [245] T. Lehtonen & H. Nyrhinen (1992b) On asymptotically efficient simulation of ruin probabilities in a Markovian environment. Scand. Actuarial J., 60-75. [246] D. Lindley (1952) The theory of queues with a single server. Proc. Cambr. Philos. Soc. 48, 277-289. [247] L. Lipsky (1992) Queueing Theory - a Linear Algebraic Approach. Macmillan, New York. [248] D. Lucantoni (1991) New results on the single server queue with a batch Markovian arrival process. Stochastic Models 7, 1-46.
[249] D. Lucantoni, K.S. Meier-Hellstern & M.F. Neuts (1990) A single server queue with server vacations and a class of non-renewal arrival processes. Adv. Appl. Probab. 22, 676-705. [250] F. Lundberg (1903) I Approximerad Framställning av Sannolikhetsfunktionen. II Återförsäkring av Kollektivrisker. Almqvist & Wiksell, Uppsala. [251] F. Lundberg (1926) Försäkringsteknisk Riskutjämning. F. Englunds Boktryckeri AB, Stockholm.
[252] A.M. Makowski ( 1994) On an elementary characterization of the increasing convex order , with an application . J. Appl. Probab . 31, 834-841. [253] V. Mammitzsch ( 1986 ) A note on the adjustment coefficient in ruin theory. Insurance : Mathematics and Economics 5, 147-149. [254] V .K. Malinovskii ( 1994) Corrected normal approximation for the probability of ruin within finite time. Scand. Act. J. 1994, 161-174.
[255] V.K. Malinovskii ( 1996) Approximation and upper bounds on probabilities of large deviations of ruin within finite time. Scand. Act. J. 1996 , 124-147. [256] A. Martin-L6f (1983) Entropy estimates for ruin probabilities . Probability and Mathematical Statistics (A. Gut & J. Hoist eds.), 29-39.
[257] A. Martin-Löf (1986) Entropy, a useful concept in risk theory. Scand. Act. J. 1986, 223-235. [258] K.S. Meier (1984) A fitting algorithm for Markov-modulated Poisson processes having two arrival rates. Europ. J. Opns. Res. 29, 370-377.
[259] Z. Michna (1998) Self-similar processes in collective risk theory. J. Appl. Math. Stoch. Anal. 11, 429-448. [260] H.D. Miller (1961) A convexity property in the theory of random variables defined on a finite Markov chain. Ann. Math. Statist. 32, 1261-1270.
[261] H.D. Miller (1962 ) A matrix factorization problem in the theory of random variables defined on a finite Markov chain . Proc. Cambridge Philos. Soc. 58, 268-285. [262] H .D. Miller ( 1962 ) Absorption probabilities for sums of random variables defined on a finite Markov chain . Proc. Cambridge Philos. Soc. 58 , 286-298.
[263] M . Miyazawa & V. Schmidt ( 1993) On ladder height distributions of general risk processes . Ann. Appl. Probab. 3, 763-776. , [264] G.V. Moustakides ( 1999) Extension of Wald 's first lemma to Markov processes. J. Appl. Probab. 36, 48-59.
[265] S.V. Nagaev (1957) Some limit theorems for stationary Markov chains. Th. Probab. Appl. 2, 378-406. [266] P. Ney & E. Nummelin (1987) Markov additive processes I. Eigenvalue properties and limit theorems . Ann. Probab. 15, 561-592.
[267] M .F. Neuts (1977) A versatile Markovian point process. J. Appl. Probab. 16, 764-779. [268] M .F. Neuts ( 1978) Renewal processes of phase type. Naval Research Logistics Quarterly 25, 445-454. [269] M .F. Neuts ( 1981 ) Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press, Baltimore . London. [270] M .F. Neuts ( 1989) Structured Stochastic Matrices of the M/G/1 Type and their Applications. Marcel Dekker, New York.
[271] M.F. Neuts (1992) Models based on the Markovian arrival process. IEICE Trans. Commun. E75-B , 1255-1265. [272] J . Neveu ( 1961 ) Une generalisation des processus a accroisances independantes. Sem. Math. Abh. Hamburg. [273] R. Norberg ( 1990) Risk theory and its statistics environment . Statistics 21, 273-299. [274] H . Nyrhinen ( 1998 ) Rough descriptions of ruin for a general class of surplus processes. Adv. Appl. Probab. 30, 1008-1026. [275] H. Nyrhinen (1999) Large deviations for the time of ruin. J. Appl. Probab. 36, 733-746.
[276] C.A. O'Cinneide (1990) Characterization of phase-type distributions. Stoch. Models 6, 1-57. [277] J. Ohlin (1969) On a class of measures for dispersion with application to optimal insurance. Astin Bulletin 5 , 249-266.
[278] E. Omey & E. Willekens (1986) Second order behaviour of the tail of a subordinated probability distributions. Stoch. Proc. Appl. 21, 339-353. [279] E. Omey & E. Willekens (1987) Second order behaviour of distributions subordinate to a distribution with finite mean . Stochastic Models 3 , 311-342.
[280] A.G. Pakes (1975) On the tail of waiting time distributions. J. Appl. Probab. 12, 555-564. [281] J. Paulsen (1993) Risk theory in a stochastic economic environment. Stoch. Proc. Appl. 46, 327-361. [282] J . Paulsen ( 1998) Sharp conditions for certain ruin in a risk process with stochastic return on investments. Stoch. Proc. Appl. 75, 135-148. [283] J. Paulsen (1998) Ruin theory with compounding assets - a survey. Insurance: Mathematics and Economics 22, 3-16. [284] J. Paulsen & H.K. Gjessing (1997a) Optimal choice of dividend barriers for a risk process with stochastic return on investments. Insurance: Mathematics and Economics 20, 215-223.
[285] J. Paulsen & H.K. Gjessing (1997b) Present value distributions with applications to ruin theory and stochastic equations. Stoch. Proc. Appl. 71, 123-144.
[286] J. Paulsen & H.K. Gjessing (1997c) Ruin theory with stochastic return on investments. Adv. Appl. Probab. 29, 965-985. [287] F. Pellerey (1995) On the preservation of some orderings of risks under convolution. Insurance: Mathematics and Economics 16, 23-30. [288] S. Schock Petersen (1989) Calculation of ruin probabilities when the premium depends on the current reserve. Scand. Act. J. 1989 , 147-159.
[289] C. Philipson (1968) A review of the collective theory of risk. Skand. Aktuar. Tidskr. 61, 45-68, 117-133. [290] E.J.G. Pitman (1980) Subexponential distribution functions. J. Austr. Math. Soc. 29A, 337-347. [291] S. Pitts (1994) Nonparametric estimation of compound distributions with applications in insurance. Ann. Inst. Stat. Math. 46, 537-555. [292] S.M. Pitts, R. Griibel & P. Embrechts (1996) Confidence bounds for the adjustment coefficient. Adv. Appl. Probab. 28, 820-827.
[293] N.U. Prabhu (1961). On the ruin problem of collective risk theory. Ann. Math. Statist. 32, 757-764. [294] N.U. Prabhu (1965) Queues and Inventories. Wiley. [295] N.U. Prabhu (1980) Stochastic Storage Processes. Queues, Insurance Risk, and Dams. Springer, New York, Heidelberg, Berlin.
[296] N.U. Prabhu & Zhu (1989) Markov-modulated queueing systems. Queueing Systems 5, 215-246. [297] R. Pyke (1959) The supremum and infimum of the Poisson process. Ann. Math. Statist. 30, 568-576.
[298] V. Ramaswami (1980) The N/G/1 queue and its detailed analysis. Adv. Appl. Probab. 12, 222-261. [299] C.M. Ramsay (1984) The asymptotic ruin problem when the healthy and sick periods form an alternating renewal process. Insurance: Mathematics and Economics 3, 139-143. [300] C.R. Rao (1965 ) Linear Statistical Inference and Its Applications. Wiley.
[301] G.J.K. Regterschot & J.H.A. de Smit (1986) The queue M/G/1 with Markovmodulated arrivals and services. Math. Oper. Res. 11, 465-483. [302] J.M. Reinhard (1984) On a class of semi-Markov risk models obtained as classical risk models in a markovian environment. Astin Bulletin XIV, 23-43. [303] S. Resnick & G. Samorodnitsky (1997) Performance degradation in a single server exponential queueing model with long range dependence. Opns. Res., 235-243. [304] B. Ripley (1987) Stochastic Simulation . Wiley, New York.
[305] C.L.G. Rogers (1994) Fluid models in queueing theory and Wiener-Hopf factorisation of Markov chains. Ann. Appl. Probab. 4, 390-413. [306] T. Rolski (1987) Approximation of periodic queues. J. Appl. Probab . 19, 691707.
[307] T. Rolski, H. Schmidli, V. Schmidt & J. Teugels (1999) Stochastic Processes for Insurance and Finance. Wiley. [308] S.M. Ross (1974) Bounds on the delay distribution in GI/G/1 queues. J. Appi. Probab. 11, 417-421. [309] H.-J. Rossberg & G--Siegel (1974) Die Bedeutung von Kingmans Integralgleichungen bei der Approximation der stationiiren Wartezeitverteilung im Modell GI/C/1 mit and ohne Verzogerung beim Beginn einer Beschiiftigungsperiode. Math. Operationsforsch. Statist. 5, 687-699.
[310] R.Y. Rubinstein (1981) Simulation and the Monte Carlo Method. Wiley. [311] R.Y. Rubinstein & B. Melamed (1998) Modern Simulation and Modelling. Wiley. [312] R.Y. Rubinstein & A. Shapiro (1993) Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization via the Score Function Method. Wiley.
[313] M. Rudemo (1973) Point processes generated by transitions of a Markov chain. Adv. Appl. Probab. 5, 262-286. [314] T. Ryden (1994) Parameter estimation for Markov modulated Poisson processes. Stochastic Models 10, 795-829.
[315] T. Ryden (1996) An EM algorithm for estimation in Markov-modulated Poisson processes. Comp. Statist. Data Anal. 21, 431-447
[316] S. Schlegel ( 1998) Ruin probabilities in perturbed risk models. Insurance: Mathematics and Economics 22, 93-104. [317] L. Schmickler ( 1992 ) MEDA : mixed Erlang distributions as phase-type representations of empirical distribution functions . Stochastic Models 6, 131-156. [318] H. Schmidli ( 1994) Diffusion approximations for a risk process with the possibility of borrowing and interest. Stochastic Models 10 , 365-388. [319] H. Schmidli ( 1995 ) Cramer-Lundberg approximations for ruin probabilities of risk processes perturbed by a diffusion . Insurance: Mathematics and Economics 16, 135-149. [320] H. Schmidli ( 1996) Martingales and insurance risk. Lecture Notes of the 8th Summer School on Probability and Mathematical Statistics ( Varna), 155-188. Science Culture Technology Publishing , Singapore.
[321] H. Schmidli ( 1997a) Estimation of the Lundberg coefficient for a Markov modulated risk model . Scand. Act. J. 1997, 48-57. [322] H. Schmidli ( 1997b) An extension to the renewal theorem and an application in risk theory. Ann. Appl. Probab . 7, 121-133. [323] H. Schmidli ( 1999a) On the distribution of the surplus prior and at ruin. Astin Bulletin 29, 227-244. [324] H . Schmidli ( 1999b) Perturbed risk processes: a review . Th. Stoch. Proc. 5, 145-165. [325] H . Schmidli ( 2001 ) Optimal proportional reinsurance policies in a dynamic setting . Scand. Act. J. 2001 , 40-68. [326] H.L. Seal ( 1969) The Stochastic Theory of a Risk Business. Wiley.
[327] H.L. Seal (1972) Numerical calculation of the probability of ruin in the Poisson/Exponential case. Mitt. Verein Schweiz. Versich. Math. 72, 77-100.
[328] H.L. Seal (1972) Risk theory and the single server queue. Mitt. Verein Schweiz. Versich. Math. 72, 171-178.
[329] H.L. Seal (1974) The numerical calculation of U(w, t), the probability of non-ruin in an interval (0, t). Scand. Act. J. 1974, 121-139.
[330] H.L. Seal (1978) Survival Probabilities. Wiley, New York.
[331] G.A.F. Seber (1984) Multivariate Observations. Wiley.
[332] C.-O. Segerdahl (1942) Über einige risikotheoretische Fragestellungen. Skand. Aktuar. Tidskr. 61, 43-83.
[333] C.-O. Segerdahl (1955) When does ruin occur in the collective theory of risk? Skand. Aktuar. Tidskr. 1955, 22-36.
[334] C.-O. Segerdahl (1959) A survey of results in the collective theory of risk. In Probability and Statistics - the Harald Cramér Volume (ed. Grenander), pp. 279-299. Almqvist & Wiksell, Stockholm.
[335] B. Sengupta (1989) Markov processes whose steady-state distribution is matrix-geometric with an application to the GI/PH/1 queue. Adv. Appl. Probab. 21, 159-180.
[336] B. Sengupta (1990) The semi-Markov queue: theory and applications. Stochastic Models 6, 383-413.
[337] M. Shaked & J.G. Shanthikumar (1993) Stochastic Orders and Their Applications. Academic Press.
[338] E.S. Shtatland (1966) On the distribution of the maximum of a process with independent increments. Th. Probab. Appl. 11, 483-487.
[339] A. Shwartz & A. Weiss (1995) Large Deviations for Performance Analysis. Chapman & Hall.
[340] E.S.W. Shiu (1987) Convolution of uniform distributions and ruin probability. Scand. Act. J. 1987, 191-197.
[341] E.S.W. Shiu (1989) Ruin probability by operational calculus. Insurance: Mathematics and Economics 8, 243-249.
[342] D. Siegmund (1975) The time until ruin in collective risk theory. Mitt. Verein Schweiz. Versich. Math. 75, 157-166.
[343] D. Siegmund (1976) Importance sampling in the Monte Carlo study of sequential tests. Ann. Statist. 4, 673-684.
[344] D. Siegmund (1976) The equivalence of absorbing and reflecting barrier problems for stochastically monotone Markov processes. Ann. Probab. 4, 914-924.
[345] D. Siegmund (1979) Corrected diffusion approximations in certain random walk problems. Adv. Appl. Probab. 11, 701-719.
[346] D. Siegmund (1985) Sequential Analysis. Springer-Verlag.
[347] K. Sigman (1992) Light traffic for workload in queues. Queueing Systems 11, 429-442.
[348] K. Sigman (1994) Stationary Marked Point Processes: An Intuitive Approach. Chapman & Hall.
[349] E. Slud & C. Hoesman (1989) Moderate- and large-deviation probabilities in actuarial risk theory. Adv. Appl. Probab. 21, 725-741.
[350] W.L. Smith (1953) Distribution of queueing times. Proc. Cambridge Philos. Soc. 49, 449-461.
[351] D.A. Stanford & K.J. Stroinski (1994) Recursive method for computing finite-time ruin probabilities for phase-distributed claim sizes. Astin Bulletin 24, 235-254.
[352] D. Stoyan (1983) Comparison Methods for Queues and Other Stochastic Models (D.J. Daley ed.). Wiley, New York.
[353] E. Straub (1988) Non-Life Insurance Mathematics. Springer, New York.
[354] B. Sundt (1993) An Introduction to Non-Life Insurance Mathematics (3rd ed.). Verlag Versicherungswirtschaft e.V., Karlsruhe.
[355] B. Sundt & W.S. Jewell (1981) Further results on recursive evaluation of compound distributions. Astin Bulletin 12, 27-39.
[356] B. Sundt & J.L. Teugels (1995) Ruin estimates under interest force. Insurance: Mathematics and Economics 16, 7-22.
[357] B. Sundt & J.L. Teugels (1997) The adjustment coefficient in ruin estimates under interest force. Insurance: Mathematics and Economics 19, 85-94.
[358] R. Suri (1989) Perturbation analysis: the state of the art and research issues explained via the GI/G/1 queue. Proceedings of the IEEE 77, 114-137.
[359] L. Takács (1967) Combinatorial Methods in the Theory of Stochastic Processes. Wiley.
[360] G.C. Taylor (1976) Use of differential and integral inequalities to bound ruin and queueing probabilities. Scand. Act. J. 1976, 197-208.
[361] G.C. Taylor (1978) Representation and explicit calculation of finite-time ruin probabilities. Scand. Act. J. 1978, 1-18.
[362] G.C. Taylor (1979) Probability of ruin under inflationary conditions or under experience ratings. Astin Bulletin 16, 149-162.
[363] G.C. Taylor (1979) Probability of ruin with variable premium rate. Scand. Act. J. 1980, 57-76.
[364] G.C. Taylor (1986) Claims Reserving in Non-Life Insurance. North-Holland.
[365] J.L. Teugels (1982) Estimation of ruin probabilities. Insurance: Mathematics and Economics 1, 163-175.
[366] O. Thorin (1974) On the asymptotic behaviour of the ruin probability when the epochs of claims form a renewal process. Scand. Act. J. 1974, 81-99.
[367] O. Thorin (1977) Ruin probabilities prepared for numerical calculations. Scand. Act. J. 1977, 65-102.
[368] O. Thorin (1982) Probabilities of ruin. Scand. Act. J. 1982, 65-102.
[369] O. Thorin (1986) Ruin probabilities when the claim amounts are gamma distributed. Unpublished manuscript.
[370] O. Thorin & N. Wikstad (1977) Numerical evaluation of ruin probabilities for a finite period. Astin Bulletin 7, 137-153.
[371] O. Thorin & N. Wikstad (1977) Calculation of ruin probabilities when the claim distribution is lognormal. Astin Bulletin 9, 231-246.
[372] H. Thorisson (2000) Coupling, Stationarity and Regeneration. Springer-Verlag.
[373] S. Täcklind (1942) Sur le risque dans les jeux inéquitables. Skand. Aktuar. Tidskr. 1942, 1-42.
[374] F. Vazquez-Abad (1999) RPA pathwise derivatives of ruin probabilities. Submitted.
[375] N. Veraverbeke (1993) Asymptotic estimates for the probability of ruin in a Poisson model with diffusion. Insurance: Mathematics and Economics 13, 57-62.
[376] A. Wald (1947) Sequential Analysis. Wiley.
[377] V. Wallace (1969) The solution of quasi birth and death processes arising from multiple access computer systems. Unpublished Ph.D. thesis, University of Michigan.
[378] H.R. Waters (1983) Probability of ruin for a risk process with claims cost inflation. Scand. Act. J. 1983, 148-164.
[379] M. Van Wouwe, F. De Vylder & M. Goovaerts (1983) The influence of reinsurance limits on infinite time ruin probabilities. In: Premium Calculation in Insurance (F. De Vylder, M. Goovaerts, J. Haezendonck eds.). Reidel, Dordrecht/Boston/Lancaster.
[380] W. Whitt (1989) An interpolation approximation for the mean workload in a GI/G/1 queue. Oper. Res. 37, 936-952.
[381] W. Willinger, M. Taqqu, R. Sherman & D. Wilson (1995) Self-similarity in high-speed traffic: analysis and modeling of Ethernet traffic measurements. Statistical Science 10, 67-85.
[382] G.E. Willmot (1994) Refinements and distributional generalizations of Lundberg's inequality. Insurance: Mathematics and Economics 15, 49-63.
[383] G.E. Willmot & X. Lin (1994) Lundberg bounds on the tails of compound distributions. J. Appl. Probab. 31, 743-756.
[384] R.W. Wolff (1990) Stochastic Modeling and the Theory of Queues. Prentice-Hall.
Index

adjustment coefficient 17, 70-79, 93-96, 97, 170-173, 180-182, 201-214, 308, 314-316, 328-330, 359
aggregate claims 103-106, 189, 316-323
Bessel function 102, 201
Brownian motion 3, 25-26, 40, 117-128, 200-201, 269, 299, 301
central limit theorem 60, 94-96, 110-113, 281, 293-294, 318-320
change of measure 26-30, 34-36, 38-39, 44-47, 67-79, 98-99, 100, 111-117, 121-129, 135, 137-141, 160-167, 178-184, 203, 283, 287-292, 307-312
compound Poisson model 4, 11-12, 14-15, 24-25, 37, 39, 48-51, 57-96, 97-129, 135, 227-229, 242, 259-261, 285-292, 323
Coxian distribution 147, 218
Cox process 4, 5, 12
Cramér-Lundberg approximation 16-17, 71-79, 138-139, 162-164, 182, 203, 308
Cramér-Lundberg model: see compound Poisson model
cumulative process 334
dams: see storage process
differential equation 16, 245-248, 341, 361
diffusion 3, 15, 17, 205, 302-303
diffusion approximation 17, 117-127; corrected 121-127
duality 13-14, 30-32, 33-34, 141-144, 185-187, 272, 292-293
Edgeworth expansion 113, 318-319
Erlang distribution 7, 86, 217, 226, 360
excursion 155-156, 271-274, 278
gamma distribution 6-7, 79, 91, 207
heavy-tailed distribution 6, 14, 17, 18-19, 251-280
heavy traffic 76, 80-81, 82-83
hyperexponential distribution 7, 78-79, 86, 150, 217, 226, 228-229, 249-250
integral equation 16; Lindley 143; renewal 64, 74-75, 89, 332-333; Volterra 192-194, 248; Wiener-Hopf 144
interest rate 190, 196-201
inverse Gaussian distribution 76, 92-93, 119, 122, 301
Kronecker product and sum 221, 239, 249, 346-349
ladder heights 47-56, 61-62, 71, 100, 106-108, 152-160, 227-230, 261-264, 275-278, 336-339
Laplace transform 15, 65, 99, 108-109, 123, 234, 240-244, 339
large deviations 129, 203-204, 213-214, 306-316
Lévy process 3, 15, 36-39, 57-58, 108
life insurance 5, 134, 175
light traffic 81-83
likelihood ratio: see change of measure
Lindley integral equation 143; process 33-34, 142
lognormal distribution 9, 251, 257, 260
Lundberg conjugation 69-79, 98-99, 112-113, 128-129, 134-135, 161-164, 178-182, 287-291; equation 16, 25, 69-70, 75-76, 134-135, 161, 180, 287, 315; inequality 17-18, 25, 71-79, 113-114, 138, 162, 203
Markov additive process 12, 39-47, 52-53, 139-141, 148, 160-161, 171, 178; -modulation 12, 132-133, 145-187, 234-240, 269-271, 304; process 28-30, 38, 39-47, 154, 271-274, 348; terminating 215-216, 227-228, 245
martingale 24-26, 27-30, 35, 39, 42, 44, 108, 161, 238, 298-299, 304-305
matrix equation, non-linear 155, 157, 230, 234
matrix-exponential distribution 240-244
matrix-exponentials 14, 16, 41, 44-46, 218-221, 340-350
multiplicative functional 28-30, 35, 38, 44, 179
NP approximation 318-320
Palm distribution 52-53, 149, 267-269
Panjer's recursion 320-323
Pareto distribution 9-10, 86
periodicity 12, 176-185, 269
Perron-Frobenius theory 41-42, 349-350
perturbation 172-173, 295; see also sensitivity analysis
phase-type distribution 8, 14, 16, 133, 146-148, 174, 201, 215-250, 350-361
Poisson process: Markov-modulated 12; periodic 12, 176-185; non-homogeneous 60
Pollaczeck-Khinchine formula 61-67, 80, 259-261, 285-287
queue 14, 141-144, 185-187; GI/G/1 141-144; M/D/1 66-67; M/G/1 13, 32, 37, 96, 144, 229; M/M/1 101; Markov-modulated 185-187; periodic 187
random walk 33-36, 59, 133, 137-139, 261-264, 288-290, 297-299, 302, 336-339
rational Laplace transform 8, 222, 240; see also matrix-exponential distribution
regenerative process 264-268, 280, 292-294, 333-334
regular variation 10, 251, 253, 256-258, 260
reinsurance 8, 326-330
renewal process 131, 146, 174, 223-226, 331-336; equation 64, 74-75, 89, 332-333; model 12, 131-144, 229-234, 261-264
reserve-dependent premiums 14, 189-214, 244-250, 279-280
Rouché roots 158, 233-234, 238
saddlepoint method 115-117, 123, 307-308, 317-318
semi-Markov 147, 162, 335-336
sensitivity analysis 86-93, 172-173, 294-296
shot-noise process 314
simulation 19, 213, 281-296
stable process 15, 120
statistics x, 11, 12, 18-19, 96-93, 152, 314, 359-361
stochastic control x
stochastic ordering 18, 83-86, 168-172
storage process 13, 30-32, 191-192, 279-280
subexponential distribution 11, 251-280
time change 4, 60, 87, 147, 177
time-reversion 14, 31, 49-50, 54-55, 107, 154-157, 186, 273-274, 338
utility 324, 327
waiting time 141, 186-187; virtual: see workload
Weibull distribution 9, 251, 257, 260
Wiener-Hopf theory 144, 160, 233, 244, 262-263, 336-339
workload 13, 37, 141-144, 186-187
Advanced Series on Statistical Science & Applied Probability - Vol. 2
Ruin Probabilities
The book is a comprehensive treatment of classical and modern ruin probability theory. Some of the topics are Lundberg's inequality, the Cramér-Lundberg approximation, exact solutions, other approximations (e.g. for heavy-tailed claim size distributions), finite horizon ruin probabilities, and extensions of the classical compound Poisson model to allow for reserve-dependent premiums, Markov-modulation or periodicity. Special features of the book are the emphasis on change of measure techniques, phase-type distributions as a computational vehicle and the connection to other applied probability areas like queueing theory.
"This book is a must for anybody working in applied probability. It is a comprehensive treatment of the known results on ruin probabilities..." Short Book Reviews
ISBN 981-02-2293-9
www.worldscientific.com