This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Pi —El
A f Q - p f VI 0
„
Pi
\ wfki^ •As** '"
_E2L_ P% JWo^P?92
P* )
(8)
118
where Ma = Ylf=iPf<
and
Vi=v^=:
pm°-p?y*
,
i = 1> ...,*.
(9)
If a = 0, then H ^ is the as same as the one in (7) and v\a' is the as same as the one in (6). By comparing the values of v ^ s , one can find that the larger a is, the more patients will be assigned to a better treatment. So, the design Example 2.4 assigns more patients to better treatments than the PWC rule does. Also, when there is failure, the assignment is random in the design defined in this example. However, for this example, ]Cm=i ll^m — ^11 = o^1/2) does not hold. So, the asymptotic normality is not implied by Theorem 2.1. This leads us to consider the following design (c.f., Zhang [36]). Adaptive Design 2.1 Suppose the response sequence {£„} is an i.i.d. sequence and each response £ m ; is a random variable coming from a distribution family {Pe^. Without losing of generality, we assume that ®i — Efmti i = 1,...,K and write 0 = ( 0 i , . . . ,0/r). Suppose the previous m — 1 patients are assigned and the responses observed. Let ©m-i = {6m-i,i,---,9m-i,K) be an estimate of 0 , where 0m_i,fc = E ^ W * - ^ + i + e ° ' f c ' k = X' • • • ' K - H e r e ' 0 O = (*°.i»' • • ' 9°.«) i s a S u e s s e d value of 0 , or an estimate of 0 from other early trials. Now, if the m-th patient is assigned to the treatment i and the response observed, then we assign the (m + l)-th patient to the treatment j with probability dy(0 m _i,f m ,i), j = l,...,K. That is
'\-"-m+l,j
=
I|~' mi •^•m,i
=
IiSi7i,iJ
=
">ij\"m—liSm.ijj
\^l
i,j = h,...,K. Here Fn =0, H(x) -» H =: H ( 0 ) as x -> 0 and /or 5 > 0, H(x) - H = ^
^ T ^
(*k - ®k) + 0 ( | | x - 0|| 1 + *)
as x -> 0 .
119
Let Ai = 1, A2,..., XK be the eigenvalues o / H and v be the left eigenvector corresponding to Ai = 1 with v l ' = 1. Write a\ = Var(£„fc), k = 1 , . . . , K. Define H = H — l ' v and £ = (diag{v) -
f fc =v^?
x=0
H'diag(y)H), (11)
P=(f{|...,f^)'.
If IAi| < 1, i = 2,...,if, t/ien possibly in a richer underlying probability space in which there exist two independent d-dimensional standard Brownian motions {Bt} and {Wt}, we can redefine the sequence { X n , £ n } , without changing its distribution, such that 0 n - 0 = -Wndiag{ai/y/v^, n
••• ,
N n - nv = [ B ^ 1 / 2 + / ; 3 f * d M $ r . • • • • ^ )
F
] (! "
(12) fi _1
)
1 2 K
+o(n / - ) a.5., for some K > 0. in particular,
s
t7-- ^)*»hL-a»-
"V\-1T?I :
t
//
A = (I-H')-1(S + 2F'EtP)(I-H)-1, S t = diag{a\lvi, ••• , o-\jvd). 3. Randomized play-the-Winner rule and generalized Polya urn RPW rule. As pointed out in Wei and Durham [33] and Wei [32], the PW rule is too deterministic and is not applicable when we have delayed responses from patients to treatments. Motivated as an extension to Zelen's [34] idea, Wei and Durham [33] introduced the following randomized playthe-Winner (RPW) rule for a two-arm clinical trial: We start with (a, a) balls (type 1 and 2 respectively) in the urn. If a type 1 6a// is drawn, a patient is assigned to the treatment 1; If a type 2 ball is drawn, a patient is assigned to the treatment 2. The ball is replaced and the patient response is observed. A success on the treatment 1 or a failure on the treatment 2 generates a type 1 ball in the urn; A success on the treatment 2 or a failure on the treatment 1 generates a type 2 ball in the urn. The RPW rule may be regarded as a generalized Polya urn (GPU) model (Wei [32]). Further,
120
let Yni (Yn2) be the number of balls of type 1 (2) after n stage. Prom the results of Athreya and Karlin [1], we have Y
m 92 , Nnl q2 T}——^7 > • a.s. and • a.s.. *nl + *n2 91+92 1 9l + 92 When P1+P2 < 1.5 (or Qi + q2 > 0.5), we have the following asymptotic normality:
and «
91+92
where
„2 ^
_ 9i92[5-2(gi+ g2 )] " [2(9 1 +92)-l](9i+92) 2 -
(16)
The asymptotic normality was first given in Smythe and Rosenberger [30]. When 9 1 + 9 2 < 0.5, the asymptotic distributions of both the urn composition and the proportions of patients assigned to each treatments are unknown. GPU model. One large family of randomized adaptive designs can be developed from the GPU model. Consider an urn containing balls of K types. Initially, the urn contains Y 0 = (loi,-- - ,YQK) balls, where Yoi denotes the number of balls of type i, i = l,--- ,K. A ball is drawn at random from the urn. Its type is observed and the ball is then replaced. At the n-th stage, following a type i drawn, Dij(n) balls of type j , for j = 1, • • • ,K, are added to the urn. In the most general sense, J5y(n) can be random and can be some function of a random process outside the urn process (in the case of adaptive designs, Dij (n) will be a random function of patient's response £ n ). A ball must always be added at each stage (in addition to the replacement), and the expectation of the total numbers of balls added in each stage is the same (say 7), so P{Dij{n) = 0, for all j = 1, • • • , K} = 0, K
^)E{Aj(n)|Jrn-i}=7,
M = ! , • • • ,K,n
= 1,2,-•• ,
where J-n is the sigma field generated by the events up to the stage n. Without losing generality, we can assume 7 = 1. Let H n be the matrix
121 comprising element { ^ ( n ) = E{Djj(n)|.F n _i}}. We refer to D„ as the rules and H n as the design matrices. If H n = H for all n, the model is said to be homogenous. In general, it is assumed that H n —» H. Let Y n = {Yn\, • • • , YnK), where Yni represents the number of balls in the urn of type i after n-th stage, and N„ = (N„i, ••• , Nnx), where Nni represents the number of times a type i ball drawn in the first n draws. In the clinical trials, Nni is the number of patients assigned to treatment i in the first n stages. Let v = (v\, • • • ,VK) be the left eigenvector corresponding to the largest eigenvalue of H with v\ + ... + VK = 1- Then Vi is just the limiting proportion of both the patients assigned to treatment i and the type i balls in the urn, i.e., N
•
Y •
a.s. and —r? * v% o-s. (17) L-<j=l 1n, (c.f., Athreya and Karlin [1]). Smyth [29], Bai and Hu [3] [4] showed the normality of Y n and N„. Bai, Hu and Zhang [7] and Hu and Zhang [18] gives the strong approximation of Y n and N n . Let Ai = 7 = 1, A2,..., XK be the eigenvalues of H, and A = max{ite(A2), • • •, RB(XK)}- Let Vgki(n) = Cov{£>,fc(n),£>,t(n)|J'n_i} and V „ , = (Vqki{n))£l=v q,k,l = 1,.. .,K. T h e o r e m 3.1. Assume that for some 0 < e < 1, 3|D„||2+e
n = l,2,...,
(18)
n fe=i ^HHfc-HHoCn1-'), fc=i and let S i = diag{\) — v'v and S2 = Ylq=ivg^gIf X < 1/2, then on a suitably enlarged probability space with two independent K-dimensional standard Brownian motions B t j and B t 2, we can redefined the process ( Y n , N n ) without changing the distribution, such that for some K > 0, Yn-EYn
= Gn + o(n 1/2 ~ K ) a.s.,
/2 dx(I - l ' v ) + otn 1 / 2 -*) a.i N, - £ N n = B n l S j + / " SlZ. x Jo where Gt is a Gaussian process which is solution of the following type equation:
G t = B t l £ } / 2 H + B t 2 £ 2 / 2 + / — ( H - l ' v ) ds, Jo
s
t > 0,
G 0 = 0. (19)
122
Furthermore n
£Y n - nv = o(nx/2-K) + ]T v(Hfc - H), k-l n
£N n - nv = oin1'2-*) + ]T v(Hfc - H). fc=i
in particular, the asymptotic combining distribution of Yn and N n can be obtained. Example 3.1. (General R P W Rule) Suppose K = 2 and the adding rule matrices are denoted by D
= ( ^nl \ 1 _ £n2
1
~ ^nX\ £n2 / '
where {£„ = (£ n i, £7,2), n = 1,2...} is an i.i.d. sequence with 0 < £ni < 1, Var(£ni) = of, E£„j = pj and g, = 1 — p, for i = 1,2. For this design, one has
H = E[D„] = f P l 9 l V \ f t P2/ Ai = 1, A2 = pi + p2 - 1 and v = (52/(91 + ft), ft/(?i + ft))- Let a 2 = ftft/Ozi + ft)2 and 62 = (cri^ + 0"2
+ o{n1'^)
a.s.
Example 3.2. (Wei's Rule) Suppose K > 2 and the response sequence {£„} is an i.i.d. sequence with £„& = 0 or 1. £nfc = 1 or 0 corresponds to the response of the n patient on the treatment A: is a success or failure, respectively, i = 1 , . . . , K. Let the adding rule matrices D„ = (Dij(n))i . be denoted by Du(n) = £ni and Dij = (1 — t,ni)/(K — 1) for i ^ j . This design is proposed by Wei [32]. For this design, the design matrix H =
123
E[Dn] is the same as the one in (7), and v is the same as the one in (5). If A < 1/2, then Y„ - nv = G n + o(n 1/2 ~ K ) a.s., N„ - nv = B n l S J / 2 + [ —x dx(I - l ' v ) + o(n 1 / 2 - K ) a.s. Jo Designs with delay responses. Typically, clinical trials do not result in immediate outcomes, i.e., individual patient outcomes may not be immediately available prior to the randomization of the next patient. Consequently, the urn cannot be updated immediately, but can be updated only when the outcomes become available. Fortunately, it is verified that stochastic staggered entry and delay mechanisms do not affect the asymptotic properties of both the urn composition Y„ and the sample fractions N n for a wide class of designs defined by GPU (c.f., Bai, Hu and Rosenberger [5] and Hu and Zhang [21]). Designs dependent on estimated unknown parameters. Hu and Zhang [19] considered a general model with rules of the type D n = D ( 0 n _ i , £ n ) and design matrices of the type H n = H ( 0 n _ i ) dependent on the estimated unknown parameter, where 0 n _ i is the sample estimator of the unknown parameter 0 based on the responses of previous n — 1 stages. Strong consistency and the asymptotic normalities are established for both the urn composition and the number of patients assigned to each treatment. For such design, we also have the following strong approximation. Theorem 3.2. Suppose {£„} is an i.i.d. sequence, Q = E£n and 0 n is defined as in Adaptive Design 2.1. Suppose the conditions in Theorem 2.2 are satisfied and the condition (18) in Theorem 3.1 is satisfied. If A < 1, then • v a.s. and n
• v a.s., n
also, on a suitably enlarged probability space with an K-dimensional standard Brownian motion W(, we can redefined the process (Y n , N n , £ n ) without changing the distribution, such that (12) holds for some K > 0. If A < 1/2, then on a suitably enlarged probability space with three independent K-dimensional standard Brownian motions B t i , Bt2 and Wt, we can redefined the process ( Y n , N n , £ n ) without changing the distribution,
124
such that for some K > 0, (12) holds and Y n - nv = G n + o(n1/2~K)
a.s.,
Nn - nv = B n l S } / 2 + [ —x dx(I - l'v) + o ^ 2 - * ) a.s. Jo where Gt is a Gaussian process which is solution of the following type equation:
Gt = BtlEJ/2H + Bt2*r + f
W dx
* dia9( X
JO
%.. y/v{
+ f — ( H - l'v) ds, t > 0, Jo s and F is defined in (11).
Go - 0,
''
y/VK
(20)
By comparing the two equations (19) and (20), one finds that (20) has a more term than (19) has. This term is generated by the randomness of Q„s. Example 3.3. (General Wei Rule) Suppose K > 2 and the response sequence {£„} is the same as in Wei's rule, and 0Pm-i,j = j v - ^ . + i . a n d Sm-i,j denotes the number of successes of the j - t h treatment in all the Nm-i,j trials of previous n—1 stages, j = 1 , . . . , K. When a = 0, this design model is just Wei's rule, and when a = 1, it is the model proposed by Bai, Hu and Shen [6]. Write Pm-i = (Pm-1,1, • • • ,Pm,K)- In this case, H m = E[D m |J" m -i] - H ( p m _ i ) , where H(x) is defined as in Example 2.4. Also, H m -> H = H and *«
(a)
iv = v w
j
^ "
and
(a)
• v = vK >
a.s., n n where H ^ and v(Q) is defined as in (8) and (9), respectively. If A < 1/2, then the conclusions in Theorem 3.2 hold. In particular, n 1 / 2 ( Y n / n - v W ) £ iV(0, A+) for some A^ and A".
and
n 1 / 2 ( N „ / n - v< a) ) 3 JV(0, A")
125
4. Doubly adaptive biased coin designs Two-arm case. We come back to the PW rule and RPW rule. As it is known, the PW rule is too deterministic and is not applicable when we have delayed responses from patients of treatments. The RPW rule and its generalizations seem solve this problem. However, in using the RPW rule, when qi+q2 < 0.5, the limiting distributions of the proportions of patients assigned to each treatments are unknown. But in practice, both q\ and <j2 are usually very small. So, the RPW rule is not practical in such cases. The asymptotic variation of the proportion becomes a big problem in using the RPW rule (even the adaptive designs based on the GPU model) for 9i + ?2 < 0.5. Even in the case that q\ + 92 > 0.5, if 91 + 92 is near 0.5, aRPW is much larger than apw That is to say, the RPW rule is too random so that the asymptotic variance of proportion of patients assigned to each treatment is very large when the cure rates are large (near 1) and so it is much less stable than the PW rule. Also, in using the multi-arm RPW, when K > 3, the expression of A becomes very complex, it is very hard (even impossible in more general cases) to verify the condition A < 1/2. Now, with keeping the desired allocation proportions V\ = 92/(91 +92) and v2 = 91/(91+92)) just as the case of the PW rule and RPW rule, our goal is to reduce the asymptotic variance. A natural way is as follows. At the (m+l)-th stage, we assign a patient to a certain treatment by comparing the value Nmi/m with v\, or Nm2/m with v?,. If Nmi/m is larger than v\, then we assign a patient to the treatment 1 with a probability less than vi; If Nmi/m is less than v\, then we assign a patient to the treatment 1 with a probability larger than v\\ If ATml/m equals v\, then we assign a patient to the treatment 1 with probability v\ and to the treatment 2 with probability vi- By choosing suitable allocation function, we may minimize the asymptotic variance. However, v\ and V2 are unknown, so they should be replaced by their estimators based on the sample of the previous m stages. So, the following adaptive design of clinical trial is considered and proposed. At the first stage, a patient is assigned to each treatment with the same probability 1/2. After m assignments, we let Smk be the number of successes of all the Nmk patients on the treatment k in the first m assignments, k = 1,2, as usual. And let pmk = (Smk + l/2)/(Nmk + 1) be the sample estimate of pk, and write qmk = 1 — pmh, k = 1,2. At the (m + l)-th stage, the (m + l)-th patient is assigned to the treatment 1 with probability g(Nmi/m,vmi), and to the treatment 2 with probability
126
1 - g(Nmi/m,vmi), where vm\ — qm2/(qmi + 9m2) is the sample estimate of Vi = 92/(91 + qi). The function g(x,p) is called allocation rule. A large class of functions can be chosen as a allocation rule. If it is of the following form: 9(0, p) = 1, 5(1, P) = 0, g{x, p) =
P(f)fl p(f)ai + (i-p)(TE£)a'
where a > 0, then we have W„i n where
-vi=0{\
/log log n V
n
/-/-^"i ) o.s. and Vn( —— ~ *>i) -* n
5 _ 2 _ glg2(Pl+P2) , 2gig2 *ZMBC - *« + (i + 2a)(ft+ft)3 {qi+q2)3
N(0,erDABC)
,„n
(21)
(c.f., Hu and Zhang [20]). It should be noted that the asymptotic normality holds for all 0 < pi < 1 and 0 < P2 < 1. Recall that ap\y and CTRPW are defined in (1) and (16), respectively. It is easily seen that a\ is a strictly monotonous decreasing function of a > 0 and a% —> crpw as a —• +00. Also, a2a < a\pW for all a > 1, whenever 91+92 > 1/2. Furthermore, if 91 + 92 is near 1/2, then aa is much smaller than ORPW SO, this adaptive design is more stable than the RPW rule. This design is compromise between the stability in the PW rule and the randomization in the RPW rule. Such design can keep the spirit of the RPW rule in that it assigns more patients to the better treatment and allows delayed responses by the patients (c.f., Hu, Zhang, Chan and Cheung [22]). In such kind of designs, the assignments are adapted by both the results of responses and the current proportions of patients assigned, and its original idea came from Efron's [13] biased coin design. So, they are called doubly adaptive biased coin (DABC) design, first introduced by Eisele [10] and Eisele and Woodroofe [12] in the two-arm case. Hu and Zhang [20] filled a gap of their proof and studied a general multi-arm case. Multi-arm case. Now, consider an K-treatment clinical trial. Assume the response sequence {£ m } is a sequence of i.i.d random vectors. Suppose the desired allocation proportion of patients assigned to each treatment is a function of some unknown parameters of the response {£„}• That is, the goal of the allocation scheme is to have N m / m —• v = p ( 0 ) , where P(v) = (pi(y), • • •, PK{Y)) '• R ^ ~* (0, l)K is a vector-valued function satisfying p(y)l' — 1, & — (0I,...,6K) is a vector in RK, and 8k is an unknown parameter of the distribution of £1^, k — 1,...,K. Without
127 loss of generality, we assume that Ok — E£i,k, k = 1,...,K. Choose a ®o = (#o,i, • - •, ®O,K) S K K as the first estimate of 0 . If m patients are assigned and the responses are observed, we use the sample means 9m,k, which are based on the responses observed, to estimate the parameters Ok as we do in Adaptive Design 2.1, k — I,... ,K, i.e., °m,k-
Nm,k+1
'
K-l,...,n,
and write 0m,fc = (0 m ,i, • • •, 0m,/r)Here, adding 1 in the denominator is for avoiding the case of 0/0, and adding 0o,fe m the numerator is for using 0o,fc to estimate 8k when no patient is assigned to the treatment fc, k — 1,... ,K. Usually, the 0 o is chosen for avoiding pfc(0 m ) = 0, k = 1 , . . . , K. In practice, © 0 is the guessed value of 0 , or an estimate of 0 from other early trials. The following is the general BADC design. General D A B C Design Let g(x, p) = {gx (x, p),...,gK{x, p)) :J0,1]* x [0,1]* -> [0,1]* be the allocation rule with g(x,p)l' = 1. Let 0 O = 0 O and 0 m be estimated as in (22) from the first m observations, m = 1,2,.... Then the (m + l)-th patient is assigned to the treatment fc with probability Pm,k = 3 f c ( N m / m , p m ) , k = l,...,K, where Pm = P{®m)
(23)
is the sample estimate of v = (vi, •.., VK) = p ( 0 ) based on the responses observed from the first m patients. Theorem 4 . 1 . Suppose for some e > 0 and 0 < 5 < 1, £]|£ n || 2 + e < oo and K
dp
p(y) = p(©) + £(y fc -0 fc )~f | e + o(l|y-©ll1+*) yk
fe=l
Let S i = diag(v) — v ' v , a2k = Var{^k),
k=
S t = diag{o-\/vi,...,
fc=i
l,...,K, a2K/vK),
asy^e.
128
Choose the allocation function g(x, p) to be
where a > 0 and L > 1 are constants. Then in a possibly richer underlying probability space in which there exists two independent K- dimensional standard Brownian motions { W t } and {B t }, we can redefine the sequence { X n , £ n } , without changing its distribution, such that N n - nv = G n + o(n 1 / 2 _ K )
a.s.
and =
B £ v2i1/32_ + n
R
/2_K)
^
where
is a solution of the equation:
G t = W t s l / 2 + (l + a ) ^ 6£ f - a r ^ ,
t
Jo
x
i > 0 , G o = O.
In particular,
V^(N„/n-v,pn-v)^iV(0,A), where A=
5.
(l^^i + ^T^^Ssj
(26)
The drop-the-loss rule
Ivanova [23] proposed a randomization procedure so called the drop-theloss rule. He considered an urn containing balls of K + 1 types, type k, k = 1,...,K, and type 0. A ball is drawn at random. If it is type k, k = 1,...,K, the corresponding treatment is assigned and the patient's response is observed. If it is a success, the ball is replaced and the urn remains unchanged. If it is a failure, the ball is not replaced. If a type 0 ball is drawn, no subject is treated, and the ball is returned to the urn
129
together with one ball of each type k,k = l,...,K. Ivanova [23] shows that the limiting allocation proportion of this procedure is the same as that the PWC rule and Wei's rule (see (5)). For the two-treatment case, Ivanova [23] also shows that y/-n{NM__S^_)v
n
2
9i + 92
where c 2 = <M2fai +P2)/{qi + qif = VPW The drop-the-loss rule is fully randomized. For the case of K > 2, the asymptotic distribution is unknown up to today. It is also an open problem whether or not the delay of responses effects the asymptotic properties. 6.
The minimum asymptotic variance
Recall (1), (16) and (21). Comparing the PW rule, the RPW rule and the DABC design, one finds that, keeping the same limiting proportion v = (92/(91 + Q2),QI/(QI + 92)) of assignments, the asymptotic variance in using PW rule and the drop-the-loss rule is the smallest, and the one in using RPW is the largest, i.e., apw < &DABC = Oa< &RPW, a > 1.
Hu and Rosenberger [16] points out that, if the observed allocation proportions are asymptotically normal, the asymptotic power is an decreasing function of the asymptotic variance of the allocation proportions. Recently, Hu, Rosenberger and Zhang [17] obtains the lower bound of the asymptotic variance. Suppose the response sequence {£„} is an i.i.d. sequence. And suppose the limiting proportion v = p ( 0 ) is a function of the parameter 0 = E£x. Assume that for each fc = 1,...,K, the response £i,fc comes from a family of exponential distributions: C(uk)exp{£lkUk}dn,uk
G Uk C K
and 6k = E£i,fc = 0fc(«k)> and let Ik{uk) be the Fisher information of uk for this family. Then
^=Var(ei,fc) = ( ^ ) 2 / 4 K ) . Hu, Rosenberger and Zhang [17] shows the following theorem.
130
Theorem 6.1. Suppose
n ^ N J n - v ) £ JV(0,V(u)). Under some regularity conditions, there exists a Uo C U = U i ® • • • ® U # with Lebesgue measure 0 such that for every u G U — Uo,
0p(e(u))' V(u) >
gu
I
x
^ P (e(u)) (u,v)
^
I(u, v) = diag{vih{u{),...
u
A
= B(u),
,vKIi{uK))-
Here, for two covariance matrices A and B , A > B means that A — B is non-negatively definite. We can rigorously define an asymptotically best response-adaptive procedure as one in which V(u) attains the lower bound B(u) for a particular target allocation p(6). It is easily seen that
9 P (0)'90(u)' , ae(u) dP(@y (U V)_ "90- ~ ^ r * ' ^ 90" _ 9 p ( 0 ) ' A- (/ 11 (.aVi\Z d6>i.2 1 x , atd6 ^ ,K2^dp{@) \op^ 5 l j 1 d@ U 1 / 1 (u 1 ) dn 1 " ' " % / I M & K J / 90 B(U) =
= - ^ - ^ - ^ . . . . . — c r ^ - ^ -
= £3(0),
where E 3 = E 3 ( 0 ) is denned in (24). In the case K = 2,
Further, if the responses are dichotomous (success and failure) and v = (92/(91 + 92),9i/(9i + 92)), then cr3 = apiy- It follows that the PW rule and the drop-the-loss rule are asymptotically best response-adaptive procedure. However, the PW rule is too deterministic. On the surface, the drop-the-loss rule would seem to give us everything we want in a responseadaptive randomization procedure: it is fully randomized and its asymptotic variance attains the lower bound. However, it can only target the PW rule allocation, which is not optimal in any formal sense, and previously reported simulations have shown that it can be slower to converge for large
131
values of PA and ps and becomes more deterministic for small values of PA and PB (see Hu and Rosenberger [16]). The DABC procedure solve some of these deficiencies, in that they can target any desired allocation, and can approach the lower bound for large values of a. However, the procedure becomes more deterministic as a becomes larger, and hence careful tuning of a must be done in order to counter the tradeoff. When K > 3, no of the designs mentioned above is showed asymptotically best. However, in using the general DABC design, if the allocation rule is chosen as in (25), then the asymptotic variance of n 1 , / 2 (N„/n — v) satisfies 1 „ 2(1 + oA ^ ^ ; ——-Si + ) S 3 \ S 3 as a / oo. 1 + 2a 1 + 2a If a is large enough, the asymptotic variance of n 1 , / 2 (N„/n—v) is very close to the lower bound S 3 . So, the DABC design is nearly asymptotically most powerful if a is chosen large. Remark 6.1. Usually, many cases, in which the parameter 0 is not a mean of the response £ n , can be transferred to the case we studied. In fact, if for each k, an estimate 9nk = 0nfc(£jfc : j = 1,2, • • • , n) of 0 can be written in the following form: 1 n *nfc = - £ A&fc) + o(n-^2-6),
for some S > 0,
(27)
then in the adaptive designs, we can define #m-l,fc = 0JV m _i lfc ,k(£jk
:
Xjk
— l , j = 1 ) 2 , ••• ,J7l - 1)
and then
0m-i,k = j :
-T
JVm-l,fc + 1
X
M£jk)+°(N™-U5)-
E ,
Many maximum likelihood estimators and moment estimators satisfy (27). Appendix A. Strong approximations for martingale vectors The limiting theorems for martingales are main tools in studying the asymptotic properties of adaptive designs. We study the asymptotic properties of adaptive designs by first approximating the models to certain multidimensional martingale vectors and then approximating the martingales
132
to Gaussian processes. In this appendix, we present some strong approximations for martingale vectors. We assume that {Z n ,.F„;n > 1} is a square-integrable sequence of Kd-valued martingale differences, denned on (fi,.F, P), IFO is the trivial1}. Denote Sn(m) = E r = m + i Zfc and S„ = S n (0). Let
E[ZnZn\!Fn-i},
also let £ „ = £ £ = 1 ak, S„(m) = YlT=m+i ak and
K = M£n) = E[||z fe || 2 |j- fc _ 1 ]. The first strong approximation theorem for martingales can be found in Strassen's fundamental paper [31], where the strong approximation is established for one-dimensional martingale via the Skorohod embedding theorem. For d > 2, the strong approximations are also studied by many authors, though it is impossible to embed a general Rd-valued martingale in an Revalued Gaussian process (c.f., Monrad and Philipp [25]). One can refer to Morrow and Philipp [27], Eberlein [9], Monrad and Philipp [26] etc. However, in the approximations appeared in literatures, it is always assumed that the conditioned covariance matrix S n is convergent in L\ with some convergence rates. For example, Eberlein [9] assumed the following condition: for some 6 > 0 ||E[2 n (m)|.F m ] - nTHi = Oin1'6),
uniformly in m,
and Monrad and Philipp [26] assumed that for some 0 < p < 1 Emax{||S n - TVn\\/f(Vn)}p
< oo.
n>l
For the martingales which can approximate our adaptive design models, it is usual hard to verify that S n converges in L\ (or Lp) with the needed convergence rates. However, it is more easy to show that S n converges almost surely with some convergence rates. So, we present the following approximation theorems for martingale vectors, whose proofs can be found in Zhang [35]. T h e o r e m A . l . Suppose that there exist constants 0 < 6, e < 1 and a covariance matrix T, measurable with respect to T^ for some k > 0, such that E„ - n T = 0{nl~9)
a.s.
or
||S„ - n T ^ =
0{nl-e),
133 oo
£ E[||Z n || 2 /{||Z n || 2 > n
1
" ^ ^ ] / ^ < oo a.s.
(A.l)
n=l
Then there exists a sequence {Yn;n > 1} of i.i.d. Rd-valued standard normal random vectors, independent ofT, such that
Here K > 0 is a constant depending only on 6, e and d. T h e o r e m A.2. Suppose that there exists constants 0 < e < 1 such that (A.l) is satisfied, and that T is a covariance matrix measurable with respect t° 3~k for some k > 0. Then for any 8 > 0, there exists an K > 0 and a sequence {Y n ;n > 1} of i.i.d. M.d-valued standard normal random vectors, independent ofT, such that S„ - £
Y m T 1 / 2 = 0(n1'2-*)
+ 0{alJ2+5)
a.s.
Here an = m a x ^ , , | | S m - mT||. References 1. Athreya, K. B. and Karlin, S. (1968). Embedding of urn schemes into continuous time branching processes and related limit theorems. Ann. Math. Statist, 39: 1801-1817. 2. Bai, Z. D., Chen, Y. M., Hu, F. and Lin, Z. Y. (2001). Adaptive designs based on Markov chains. Manuscript. 3. Bai, Z. D. and Hu, F. (1999). Asymptotic theorem for urn models with nonhomogeneous generating matrices. Stochastic Process. Appl., 80: 87-101. 4. Bai, Z. D. and Hu, F. (2000). Strong consistency and asymptotic normality for urn models. Manuscript. 5. Bai, Z. D., Hu, F. and Rosenberger (2002). Asymptotic properties of adaptive designs for clinical trials with delayed response, Ann. Statist., 30: 122-139. 6. Bai, Z. D., Hu, F. and Shen, L. (2002). An adaptive design for multi-arm clinical trials. J. Multi. Anal., 81:1-18. 7. Bai, Z. D., Hu, F. and Zhang L. X. (2002). The Gaussian approximation theorems for urn models and their applications. Ann. Appl. Probab., 12:11491173. 8. Durham. S. and Flournoy, N. (1995). Up-and-Down designs I: Stationary Treatment Distributions. In Adaptive Designs (Flournoy, N. and Rosenberger, W. F. eds.) Hayward, CA: Institute of Mathematical Statistics, pp. 139-157. 9. Eberlein, E. (1986). On strong invariance principles under dependence, Ann. Probab., 14: 260-270.
134
10. Eisele, J. (1994). The doubly adaptive biased coin design for sequential clinical trials. J. Statist. Plann. Inf., 38: 249-262. 11. Eisele, J. (1995). Biased coin designs: some properties and applications. In Adaptive Designs (Flournoy, N. and Rosenberger, W. F. eds.) Hayward, CA: Institute of Mathematical Statistics, pp. 48-64. 12. Eisele, J. and Woodroofe, M. (1995). Central limit theorems for doubly adaptive biased coin designs. Ann. Statist., 23:234-254. 13. Efron, B. (1971). Forcing a sequential experiment to be balanced. Biometrika, 62: 347-352. 14. Hoel, D. G. and Sobel, M. (1972). Comparison of sequential procedures for selecting the best binomial population. Proc. Sixth. Berkeley Symp. Math. Statist, and Probabilities IV:53-69. 15. Hoel, D. G., Sobel, M. and Weiss, G. H. (1975). A survey of adaptive sampling for clinical trials. In Advances in Biometry, (ed. Elashoff, R. M. ). Academic Press, New York. 16. Hu, F. and Rosenberger, W. F. (2003). Optimality, variability, power: evaluating response-adaptive randomization procedures for treatment comparisons. J. Amer. Statist. Assoc, in press. 17. Hu, F., Rosenberger, W. F. and Zhang, L.X. (2003). Asymptotically best response-adaptive randomization procedures. Manuscript. 18. Hu, F. and Zhang L. X. (2001a). The weak and strong approximation for generalized Friedman's urn model. Manuscript. 19. Hu, F. and Zhang, L. X. (2001b). The asymptotic theorems of adaptive designs for clinical trials with generations depending on the estimated success rates. Manuscript. 20. Hu, F. and Zhang, L.X. (2003a). Asymptotic properties of doubly adaptive biased coin designs for multi-treatment clinical trials. Ann. Statist, in press. 21. Hu, F. and Zhang, L. X. (2003b). The asymptotic normality of adaptive designs for clinical trials with delayed response. Bernoulli, to appear. 22. Hu, F., Zhang, L. X., Chan, W. S. and Cheung, S. H. (2003). Some properties of doubly adaptive biased coin designs. Manuscript. 23. IVANOVA, A. V. (2003). A play-the-winner type urn model with reduced variability. Metrika, in press. 24. Lin, Z. Y. and Zhang, L. X. (2001). Strong approximation of a nonhomogenous markov chain. Manuscript. 25. Monrad, D. and Philipp, W. (1990). The problem of embedding vector-valued martingales in a Gaussian process, Tear. Veroyatn. Primen., 35: 384-387. 26. Monrad, D. and Philipp, W. (1991). Nearby variables with nearby conditional laws and a strong approximation theorem for Hilbert space valued martingales, Probab. Theory Relat. Fields, 88:381-404. 27. Morrow, G. J. and Philipp, W. (1982). An almost sure invariance principle for Hilbert space valued martingales, Trans. Amer. Math. Soc, 273: 231-251. 28. Rosenberger, W. F. (1996). New directions in adaptive designs. Statist. Sci., 11: 137-149. 29. Smythe, R. T. (1996). Central limit theorems for urn models. Stochastic Process. Appl., 65: 115-137.
135 30. Smythe, R. T. and Rosenberger, W. F. (1995). Play-the-winner designs, generalized Polya urns, and Markov branching processes. In Adaptive Designs (Flournoy, N. and Rosenberger, W. F. eds.) Hayward, CA: Institute of Mathematical Statistics, pp. 13-22. 31. Strassen V. (1967). Almost sure behavior of sums of independent random variables and martingales, Proc. Fifth Berkeley Symp. Math. Stat. Prob., Vol II, Part. 1, pp: 315-343. 32. Wei, L. J. (1979). The generalized Polya's urn design for sequential medical trials. Ann. Statist, 7: 291-296. 33. Wei, L. J. and Durham, S. (1978). The randomized play-the-winner rule in medical trials. J. Amer. Statist. Assoc, 73: 840-843. 34. Zelen, M. (1969). Play the winner rule and the controlled clinical trial. J. Amer. Statist. Assoc, 64: 131-146. 35. Zhang L. X. (2001). Strong approximations of martingale vectors and its applocations in Markov-chain adaptive designs. Manuscript. 36. Zhang, L. X. (2002). A kind of multi-treatment adaptive designs with assignment probabilities depending on the estimated parameters. Manuscript.
J O H N S O N - M E H L TESSELLATIONS: A S Y M P T O T I C S A N D INFERENCES
S. N. CHIU Department of Mathematics Hong Kong Baptist University Kowloon Tong Hong Kong E-mail: [email protected]
Seeds are randomly scattered in R d according to a spatial-temporal point process. Each seed has its own potential germination time. Each seed that succeeds in germinating will be the centre of a growing spherical inhibited region that prohibits germination of any seed with later potential germination time. The radius of an inhibited region at time t after t h e germination of t h e seed at its centre is vt. The set of locations first reached by the growth of the inhibited region originated from x is called the cell of x. T h e space will be partitioned into cells and this space-filling structure is called the Johnson-Mehl tessellation. We show that the time until a large cube is totally inhibited has an extreme value distribution. In particular, for d — 1, we obtain the exact distribution of this time by transforming the original process to a Markov process. Moreover, we prove a central limit theorem for the number of germinations within [0, L)d. Finally, the maximum likelihood estimation for v, a nonparametric estimation for the intensity measure and for its density, and the maximum likelihood estimation for the parameters of the intensity with known analytical form are proposed.
1. Introduction Consider a set of n < oo distinct, isolated points {x,}, called seeds, in Rd. A seed at Xi will be stimulated by an internal or external stimulus after a time t{. A seed, once stimulated, immediately tries to germinate and at the same time to prohibit other seeds from germination by generating a spherical inhibited region the radius of which grows at a positive speed v. A seed stimulated at time t* fails to germinate if and only if its location has been inhibited on or before U. Such a germination-growth process was motivated by applications in many diverse fields, such as crystal growth (Johnson and Mehl 15 , Kolmogorov16), DNA replication (Vanderbei and Shepp 26 , Cowan et al.10), synapses (Bennett and Robinson1, Quine and 136
137
Robinson 23 ' 24 ) and so on (see Okabe et al.22 for more details). The set of locations f llx — ii|I ||a; — Xj\\ ,,. , .1 ) x : J! lii + t < II 111 + 1 Vj ^ i } { v v J first reached by the growth of the inhibited region originated from Xi is called the cell of xi. The space Rd will be partitioned into at most n cells, the interior of which are disjoint. Such a space-filling structure is known as the Johnson-Mehl tessellation. The most important special case of Johnson-Mehl tessellation is that all ij's are the same and so all seeds germinate simultaneously. As a result, all edges are segments of the bisectors between the seeds of two neighbouring cells and such a special JohnsonMehl tessellation is the well-known Voronoi diagram (Okabe et al.22). In this paper, seed locations and stimulation times are assumed to form a spatial-temporal Poisson process with intensity measure dxdA(t), where A is a non-decreasing function satisfying A(t) = 0, for t < 0, A(t) < oo, for t < oo, A(oo) > 0. That is to say, the number of seeds located in A C M.d and stimulated during [ti,^] follows a Poisson distribution with mean /
didA(t).
The Lebesgue measure dx means that seed locations are spatially homogeneous in R d . Characteristics of the typical cell of the Johnson-Mehl tessellation generated by a spatial-temporal Poisson process have been studied in details by Evans 11 , Gilbert 12 , Meijering18 and M0ller 20,21 . In this paper limit theorems and statistics problems will be addressed. 2. Asymptotics 2.1. Time until a large cube is totally
inhibited
Denote by TL the time until the cube [0, L]d is totally inhibited. The investigation of TL was motivated by a study 10 of the replication time of a DNA molecule in higher animals. Such a molecule is topologically linear and its replication is one of the most important process inside the bodies of animals. The replication commences at specific sites called 'origins
138
of replication'. These origins are randomly scattered along the molecule. Thus, we get Poisson seeds on the real line. Each origin of replication is recognised, after some random delay time, by an enzyme-complex which then binds to the site. The complex immediately initiates, via its influence on other enzymes, a bi-directional movement along the DNA molecule. At this moving 'frontier', the helical structure unwinds and separates into two single strands. Replication of these single-stranded regions then takes place (see Kornberg 17 for biological details). Therefore, an enzyme-complex approaches an origin that has been passed over by a moving frontier will not initiate any more unwinding and separation. In other words, the locations that have been passed over by moving frontiers are inhibited. Such a process can be modelled well by the germination-growth process described above with d = 1. Then the time TL in this context is the time until a molecule of length L is completely separated. Let TX be the inhibition time of the location x S Rd, then TL = max-fri : x £ [0, L]d} is the maximum of a random field and so when A satisfies certain regularity conditions, TL has an extreme value distribution: T h e o r e m 2.1. For each real u, P{aLTL -bL
e~9e~u as L -> oo,
where expressions for CLL, hi, and 6 can be explicitly found when A is given. Theorem 2.1 generalises the results in Vanderbei and Shepp 26 and Cowan et al.10 from A(i) = A and A(£) = Aae~ a t , respectively, to a very general class of A, as well as from d = 1 to an arbitrary d. This theorem has been proved in Chiu 2 , in which the extreme value theory has not been used because of the strong correlation between TX. Instead, the problem was expressed in terms of a coverage problem as follows. Consider a union U(t) of balls; the centres of the balls are scattered according to a stationary Poisson process {xi} with intensity A(t) and the radius of the ball centred at Xi is v(t — t%), where U is a random variable having the distribution function A(-)/A(i) and is independent of {XJ} and {tj : j ^ i}. The event {[0, L]d C U(t)} is the same as {TL < t} and so the distribution of TL can be obtained from the probability of complete coverage13. Letting u — ±E6L in Theorem 2.1 leads to an approximation of TL for large L: Corollary 2.1. As L —» oo, — bL
• 1 in probability.
139
In particular, consider the case that d = 1. The inhibition time {T X } forms a bi-directional growth process: Bi-directional growths start from germinated seeds. The growth of a line in a direction stops whenever it meets another line coming from the other direction. See Figure 1(a) for a realisation. If we take the shear transformation (x, t) — i > (x + vt, t), the bidirectional process will be transformed to a uni-directional growth process {T£}, see Figure 1(b), which is actually a Markov process.
(b) A realisation of {r^.}.
(a) A realisation of {TX}.
Figure 1. A realisation of the germination-growth process {TX} for d = 1 and the unidirectional process {T'X} obtained by taking the shear transformation of the realisation in (a).
Denote by £z the value of x (on the line representing that locations) at which the process {TX} first hits the level z on the time axis. Because of stationarity, the event {Tj, < z} in the bi-directional process has the same probability as the event {£z > L} in the uni-directional Markov process. The Laplace transform of £z can be obtained from the transition semigroup {%} of operators, where %f{t) : = E t / ( & )
= 11 - J A(t + u)du 1 f(t + x) + { I A{t + u)du\ o
[ f{u)^-du A(t)
+ o(x)
= {1 - xA{t + 5X)} f(t + x)+ xA(t + Sx) f f(u)^$-du J0
+ o(x),
AW
for some 5X in (0,:r), where E t denotes the conditional expectation given TQ =t and / is a bounded measurable real-valued function on [0, oo). Thus,
140
the infinitesimal generator A is given by
Af(t) = f'(t)-A(t)f(t) + [ f(u)X(u)du. Jo 9
Using this generator, Chiu and Yin obtained the exact distribution of Ti,. Theorem 2.2. For t > 0, let P t denote the distribution of the Markov process {r'x} with initial value t. For z > t, we have P t ( T L
U + ak(z) J ea"^u+A^du\
,
where a\(z) > a2{z) > 03(2;) > ••• are all negative zeros of g(a) := 1 + a / 0 Z e a u + A ( u ) a X A(u) = /0" A(t)dt and a\{z) J0Z Uea*(*)"+*Wdu - 1' Also, they 9 strengthened the convergence in probability in Corollary 2.1 to strong convergence. However, this approach does not work for an arbitrary d because we do not know what transformation can be used in general. So, the exact distribution of TL for an arbitrary d remains unknown. We will also see that usually stronger results can be obtained when d — 1.
2.2. Number of germinations
in a large cube
Another interesting characteristic is the number NL of germinations in the cube [0, L]d. This investigation was motivated by a study 23 of release of neurotransmitters at synapses. The terminal of a neuronal axon at the neuromuscular junction has branches consisting of strands containing many randomly scattered sites. Thus, we have Poisson seeds on intervals. At a synapse an action potential triggers the release of neurotransmitters at these sites. Such a release is regarded as a germination. Each quantum released is assumed to cause release of an inhibitory substance which diffuses along the terminal at a constant rate preventing further releases in the inhibited region. In other words, a spherical inhibited region grows with a constant rate. Therefore, we have a germination-growth process with d = 1. It has been shown that the distribution of NL, after normalisation, converges weakly to the standard normal distribution.
141
Theorem 2.3. Let H= J
exp i - J u)dvd{t - u) d dA(u) 1 dA(t), Ld
L-oo
where LUJ — VT^/T(1 and a > 0, then
+ d/2) is the volume of a unit ball inRd.
N -nLd
—L,
„
>Z
. ,
.
If fi < oo
.
L in distribution,
where Z has the standard normal distribution. Such a central limit theorem was first proved by Quine and Robinson 23 for d = 1 and A(i) = At and then generalised to more general A by Chiu 3 and Hoist et o/.14. Chiu and Quine6 established the asymptotic normality for a general class of A and d > 2, and in particular, when d = 1, they showed that NL satisfies the functional central limit theorem, meaning that after suitable normalisation and interpolation, NL behaves asymptotically like a Brownian motion. If A(t) ~ KV for some positive K and 1 < j < oo or A(i) = A/ 0 ya~1e~y/T(a)dy for some positive finite a and A, then the rate of convergence of the central limit theorem is 0(L~d/2 logd Ld) and that of the functional central limit theorem is 0(L~l^4+£) where e > 0. The tool used in Chiu and Quine6 was strong mixing, which means that the dependence between the numbers of germinated seeds in two sets are vanishing when the distance between the two sets is increasing. This fact comes from two model assumptions. The first one is that seeds form a Poisson process, which implies that the numbers of seeds in two sets are independent whenever they are disjoint. The second one is that a seed located at x and stimulated at t will germinate if and only if there is no seed in the cone {(y, s) G Rd x [0, oo) : \\y - x\\ < v(t — s),s < t}, since if there is a seed in the interior of this cone, the seed at x will be prohibited from germination at some time before t by the growth of the inhibited region from the seed in the interior. From these two properties, the numbers of germinated seeds located in A and B are not independent only if such cones corresponding to seeds in A intersect those corresponding to seeds in B. When the distance between A and B is large, this event happens only if at least one of these cones has a large height. However, the probability that there is no seed in a cone tends to zero as the height of the cone goes
142
to infinity, and thus the dependence between the numbers of germinated seeds located in A and B is diminishing when their distance is increasing. Under such a setting, if the word "intersect" is replaced by the phrase "are within a distance m of", the above argument remains valid whenever m is finite, because we are talking about infinite distance, and adding a finite constant to infinity does not change anything. Therefore, "Poisson" can be replaced by "spatially m-dependent", which means that the numbers of seeds located in two disjoint sets are independent if the distance between the two sets is at least m. That is, the asymptotic normality of Ni still holds for spatially m-dependent processes of seeds (Chiu and Quine 7 ). Such a model allows some short-range dependence between seed locations. For example, the locations can be a Poisson clustered process with bounded cluster radius or a dependently thinned Poisson process with bounded thinning radius. It is very natural to conjecture that the m-dependence condition can be further relaxed to strong mixing. I believe it is true but have not yet got a proof. Quine and Szczotka25 considered more general point processes of seeds but their approach works only for d = 1. For central limit theorems obtained by using strong mixing, the positivity of the normalised variance a2 is usually assumed instead of proved. The reason is that the calculation of the variance requires a thorough knowledge of the dependence structure, which is typically difficult to obtain. Chiu and Quine6 showed this positivity numerically for d = 1, 2, 3 and 4, and A(£) — Xt. By considering the Markov process obtained by taking the shear transformation, Chiu and Lee4 showed that if d = 1, u 2 is always positive for a very general class of A. Moreover they established several other strong limits such as the strong invariance principle and the strong law of large numbers for NL in the case d = 1. 3. Statistics The statistical problem discussed here is the estimation of the parameters, namely A and v, from a realisation of the germinated seeds observed in a bounded window. Such an estimation problem was motivated by the same application in neurobiology 5,8 ' 19 ' 24 described in Section 2.2, in which d = 1. Nevertheless, all methods discussed below can be applied for d > 1. 3.1. Estimation
of the speed
To estimate the speed v is the same as to estimate the absolute value of the slopes of the cones arising along the time axis from the germinated seeds
143
{(xi,ti)} observed, i.e. {(y,s) 6 M d x[0,oo) : | | y - i i | | < v(s—ti),s > t}, the lower envelope of the union of which forms the sample path of the process {Tx}> which are not observable except the local minima, at which we have germinated seeds. For d = 1, the sample path is piecewise linear and v is the absolute value of the slopes of the lines that form the sample path. The estimation can be done by maximising the likelihood. The likelihood of such a realisation is the product of the likelihood ri(x u •) dA(£jj) of having spatial-temporal Poisson points at the location Xij and germination time Uj of the observed seeds and the probability exp{— fB(v\ dxdA(t)} that there are no points below the sample path, so that the observed seeds are germinated seeds, where B(v) is the space-time region which lies beneath the sample path. Although the sample path is not observable, when the germinated seeds are known, the sample path is uniquely determined by the parameter v, which appears only in the probability exp{— JBM dxdA(t)} and such that the probability increases as v increases, because as v increases, the cones are bigger, or the lines of the sample path are flatter for d = 1, leading to a smaller region B(v). Thus, no matter what A is, the probability increases with v, and so does the likelihood as far as v < Umax, where vmax is the maximum possible speed of a given collection of germinated seeds. The value vmax is the maximum possible speed if at least one germinated seed lies above the sample path corresponding to v whenever v > vmax. The likelihood drops to zero if v > vmax, because when a seed is above the sample path, it should not have germinated. So the maximum likelihood estimator v of v is just u max , which is calculated as the minimum of the ratios ||XJ — Xj\\/\ti — tj\ over all distinct pairs of germinated seeds (xi,U) and (xj,tj) and is simply the reciprocal of the maximum slope among the lines joining adjacent germinated seeds if d = 1. If there are n independent realisations with at least two germinations, the maximum likelihood estimator is the minimum of these n individual estimates. With an unnormalised gamma density dA(t) = ^ f c - 1 e - 7 t d i ,
(1)
where A = 5, 7 = 2 and k — 4, a series of simulations on an interval of length L have been performed and the estimates of v are given in Table 1, from which we can see that the maximum likelihood estimation works fine and the biases are small for moderate n and L.
144 Table 1. Maximum likelihood estimates of v, where the true value is v = 0.2. Tl = l
5 10 50 100
L = 5 0.2694858 0.2083324 0.2060151 0.2000026 0.2006374
3.2. Nonparametric
10
50
100
0.2107414 0.2031187 0.2002706 0.2001489 0.2000903
0.2071159 0.2000478 0.2005522 0.2000029 0.2000017
0.2001104 0.2001629 0.2000303 0.2000076 0.2000141
estimation
of the
intensity
To estimate A, we can re-construct the process by using the estimated v. For ease of presentation, we consider d = 1 first. If the germination-growth process occurs on the whole real line, then at each fixed time level t, we will get a union of intervals the midpoints of which form a stationary Poisson process with intensity A(i). Because the midpoints form a Poisson process, the left-end points of the intervals also form a Poisson process with the same intensity, and the probability p(t) that a left-end point of an interval is contained in the interior of other intervals is simply the proportion of the length of the union of these intervals in the whole real line. Since we can observe only via a finite window, we estimate this probability by the observed proportion p(t) in the window. Thus, at each fixed time level t, we can estimate A(£) pointwise by taking the ratio of the number of exposed left-end points in the window to 1 — p(t), which is an estimate of the probability that a left-end point is exposed. Figure 2 shows a realisation of the process with an unnormalised gamma density given in (1) where L = Q, A = 5, 7 = 2, fc = 4 and v = 0.2; there are 37 seeds and 18 of them germinated. The nonparametric pointwise estimate of A, together with a parametric estimate obtained by minimising the absolute deviations between the nonparametric estimate and a function with an unnormalised gamma density, are given. This estimation was done with a single realisation. If we have n realisations, we pool the data together to get one estimate. Table 2 shows the estimates of the parameters for various n and L by minimising the absolute deviations. Chiu et al.8 simulated 50 times such an experiment with n = 10 and L = 25, and the means and standard deviations of the estimates are shown in Table 3. This estimation can also be applied when d > 2. At each fixed time t, we get a union of balls. Fix a particular direction and consider a hyperplane that is perpendicular to this direction. Move the hyperplane along this direction so that each ball has one and only one tangent points, the collection of which forms a Poisson process in Rd with intensity A(i). The
145
A realisation of the process
Estimates for this realisation
CM
Intensity 8
True Parametric Nonpara
<
4
6
y^
0
2
X^"
1
2
3
4
5
6
0.0
0.5
1.0
1.5 t
20
2.5
3.0
Figure 2. A realisation of the germination-growth process on an interval and the estimates for this realisation. Table 2. Estimates of (A, 7, fc) from n realisations on an interval of length L. A (A = 5) 7 (7 = 2)
fc (fc = 4)
( n , £ ) = (10,10) 5.575 1.891 3.972
(10,50) 4.549 2.110 4.124
(50,10) 5.477 1.907 3.975
(50,50) 5.046 2.039 4.064
Table 3. Mean and standard deviation of 50 estimates of (A, 7, fc, v) for n = 10 and L = 25.
Mean Standard deviation
A (A = 5) 5.821 1.689
7 (7 = 2) 1.888 0.903
k (fc = 4) 3.884 1.102
v («=0.2) 0.201 0.001
probability that a tangent point is not exposed equals the proportion p(t) of the volume occupied by balls. Thus, A(t) can be estimated by the ratio of the number of exposed tangent points in the observation window at time t to 1 — p(i), where p(t) is the observed volume proportion at time t.
146
3.3. Estimation
of the density function
by kernel
smoothing
However, such a nonparametric pointwise estimation does not necessarily give a monotonic estimate for the non-decreasing A. This problem can be solved if we assume that A has a density function A and consequently the germinated seeds also has a density function, say u, on the time axis and such that /i(*) = A ( t ) { l - p ( t ) } .
(2)
Since germinated seeds are observable, \x can be estimated by using kernel smoothing fi(t) = ^2t. k({t - ti}/h), where k is a kernel and h the bandwidth. Replacing /j,(t) in equation (2) by £(£), we can obtain a smoothed estimate of X(t) by two methods. The first is to replace p(t) by its estimate obtained from the re-construction of the sample path by using the estimate v. The second is to substitute p(t) = exp{— JQ u>dVd(t - s)dX(s)ds} into equation (2) to get an integral equation that can be solved numerically, where u>d is the volume of a unit ball in Rd. Numerous simulation studies for various density functions suggested that the integral equation method and the reconstruction method produce very similar results. Figure 3 shows the estimates of A given in equation (1) with A = 6, 7 = 1, k = 2, v = 0.2, d = 1 and L = 50 from ten independent simulated realisations by using the integral equation method. Further simulation results can be found in Molchanov and Chiu 19 .
0.0
0.5
1.0
1.5
2.0
2.5
3.0
0.0
0.5
1.0
1.5
2.0
2.5
3.0
Figure 3. Left: Estimates of A from ten independent simulated realisations by using the integral equation method. Right: The mean and pointwise 95% confidence bounds from these ten estimates; the true A is shown as a solid line.
3.4. Maximum
likelihood
estimation
of the
parameters
The parameter estimation mentioned in Section 3.2 can also be improved because in the estimation of the speed v we already have the explicit form of
147
the likelihood of a given realisation if A is of known analytical form. Thus, if there are only a finite number of unknown parameters, the maximum likelihood estimates can be obtained. In Section 3.1 we have the maximum likelihood estimator for v if we know the germination times and locations. An important feature of the approach in the current section is that if only the germination times, but not the locations, are known, we are still able to write down the likelihood as a function of v and the parameters of A. Chiu et al.5 compared the two different cases and the results are given in Tables 4 and 5. A difficulty of the maximum likelihood estimation in this context is that unless A is a very simple function, we do not have a closed form for the optimal solution and so have to rely on numerical approximation. Table 4. Means and standard deviations of the maximum likelihood estimates obtained from 50 replicates of 100 independent realisations for d = 1.
times and locations times only
S (a = 6) 6.069 (0.598) 5.912 (1.048)
7 (7 = 1) 1.004 (0.174) 1.080 (0.273)
k (fc = 2) 2.008 (0.175) 2.036 (0.199)
v (u = 0.2) 0.202 (0.002) 0.192 (0.031)
Table 5. Means and standard deviations of the maximum likelihood estimates obtained from 50 replicates of 100 independent realisations for d = 2.
times and locations times only
5 ( a = 6) 5.988 (0.400) 6.026 (0.654)
7 (7 = 1) 1.054 (0.115) 1.020 (0.170)
k (k = 2) 2.060 (0.118) 2.012 (0.162)
v (v = 0.2) 0.201 (0.001) 0.193 (0.022)
Acknowledgments This paper results from several joint work with various collaborators. I thank all of them and especially thank M. P. Quine (Sydney), from whom I obtained Tables 1 and 2 and Figure 2, and I. S. Molchanov (Berne), from whom I obtained Figure 3. This research is partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. HKBU/2075/98P).
148
References
1. BENNETT, M. R. AND ROBINSON, J. (1990). Probabilistic secretion of quanta from nerve terminals at synaptic sites on muscle cells: nonuniformity, autoinhibition and the binomial hypothesis. Proc. R. Soc. London, Series B 239, 329-358.
2. CHIU, S. N. (1995). Limit theorems for the time of completion of Johnson-Mehl tessellations. Adv. Appl. Prob. 27, 889-910.
3. CHIU, S. N. (1997). A central limit theorem for linear Kolmogorov's birth-growth models. Stochastic Process. Appl. 66, 97-106.
4. CHIU, S. N. AND LEE, H. Y. (2002). A regularity condition and strong limit theorems for linear birth-growth processes. Math. Nachr. 241, 21-27.
5. CHIU, S. N., MOLCHANOV, I. S. AND QUINE, M. P. (2003). Maximum likelihood estimation for germination-growth processes with application to neurotransmitters data. J. Statist. Comp. Simulation, to appear.
6. CHIU, S. N. AND QUINE, M. P. (1997). Central limit theory for the number of seeds in a growth model in R^d with inhomogeneous Poisson arrivals. Ann. Appl. Probab. 7, 802-814.
7. CHIU, S. N. AND QUINE, M. P. (2001). Central limit theorem for germination-growth models in R^d with non-Poisson locations. Adv. Appl. Prob. 33, 751-755.
8. CHIU, S. N., QUINE, M. P. AND STEWART, M. (2000). Nonparametric and parametric estimation for a linear germination-growth model. Biometrics 56, 755-760.
9. CHIU, S. N. AND YIN, C. C. (2000). The time of completion of a linear birth-growth model. Adv. Appl. Prob. 32, 620-627.
10. COWAN, R., CHIU, S. N. AND HOLST, L. (1995). A limit theorem for the replication time of a DNA molecule. J. Appl. Prob. 32, 296-303.
11. EVANS, U. R. (1945). The laws of expanding circles and spheres in relation to the lateral growth of surface films and the grain size of metals. Trans. Faraday Soc. 41, 365-374.
12. GILBERT, E. N. (1962). Random subdivisions of space into crystals. Ann. Math. Statist. 33, 958-972.
13. HALL, P. (1985). Distribution of size, structure and number of vacant regions in a high-intensity mosaic. Z. Wahrsch. verw. Geb. 70, 237-261.
14. HOLST, L., QUINE, M. P. AND ROBINSON, J. (1996). A general stochastic model for nucleation and linear growth. Ann. Appl. Probab. 6, 903-921.
15. JOHNSON, W. A. AND MEHL, R. F. (1939). Reaction kinetics in processes of nucleation and growth. Trans. Amer. Inst. Min. Metal. Petro. Eng. 135, 410-458.
16. KOLMOGOROV, A. N. (1937). On statistical theory of metal crystallisation. Izvestia Academii Nauk SSSR, Ser. Mat. 3, 355-360 (in Russian).
17. KORNBERG, A. (1980). DNA Replication. Freeman, San Francisco.
18. MEIJERING, J. L. (1953). Interface area, edge length, and number of vertices in crystal aggregates with random nucleation. Philips Res. Rep. 8, 270-290.
19. MOLCHANOV, I. S. AND CHIU, S. N. (2000). Smoothing techniques and estimation methods for non-stationary Boolean models with applications to coverage processes. Biometrika 87, 263-283.
20. MØLLER, J. (1992). Random Johnson-Mehl tessellations. Adv. Appl. Prob. 24, 814-844.
21. MØLLER, J. (1995). Generation of Johnson-Mehl crystals and comparative analysis of models for random nucleation. Adv. Appl. Prob. 27, 367-383.
22. OKABE, A., BOOTS, B., SUGIHARA, K. AND CHIU, S. N. (2000). Spatial Tessellations. Concepts and Applications of Voronoi Diagrams. Second edition. Wiley, Chichester.
23. QUINE, M. P. AND ROBINSON, J. (1990). A linear random growth model. J. Appl. Prob. 27, 499-509.
24. QUINE, M. P. AND ROBINSON, J. (1992). Estimation for a linear growth model. Statist. Probab. Letters 15, 293-297.
25. QUINE, M. P. AND SZCZOTKA, W. (2000). A general linear birth and growth model. Adv. Appl. Prob. 32, 1-23.
26. VANDERBEI, R. J. AND SHEPP, L. A. (1988). A probabilistic model for the time to unravel a strand of DNA. Stochastic Models 4, 299-314.
RAPID SIMULATION OF CORRELATED DEFAULTS AND THE VALUATION OF BASKET DEFAULT SWAPS
ZHIFENG ZHANG, KIN PANG, PETER COTTON, CHAK WONG AND SHIKHAR RANJAN
Morgan Stanley Dean Witter, 1585 Broadway, New York, NY 10036, USA
E-mail: Zhifeng.Zhang@morganstanley.com

Basket default swaps are complex credit derivatives that are difficult to price analytically and are typically priced by using Monte Carlo simulations. The pricing and risk management of basket default swaps present challenging computational problems. We present a method for efficiently generating correlated default times whose marginal distributions are consistent with a reduced form stochastic hazard rate model.
1. Introduction

The global credit market has been growing rapidly. According to the annual surveys carried out by the International Swaps and Derivatives Association (ISDA), the global notional outstanding volume of credit derivatives transactions has grown steadily from USD 631.5 billion during the first half of 2001 to about USD 1.6 trillion by the middle of 2002. With the recent slowdown in the global financial markets and the marked increase in corporate and sovereign defaults, these products have an obvious appeal to market participants in managing risk in times of volatility and uncertainty. While conventional credit default swaps continue to dominate the market, more complicated derivatives such as basket default swaps are quickly gaining popularity. As trading volumes increase, pricing and risk management of these more complicated credit derivatives are becoming increasingly important. In this paper we describe an algorithm for efficient simulation of correlated defaults and how this algorithm is employed in a Monte Carlo pricing methodology for the valuation of basket default swaps. After a brief description of two of the most popular credit derivative products in the market, we lay down a general framework for treating similar pricing and hedging problems.
1.1. Credit default swaps
Perhaps the simplest form of a credit derivative is the "plain vanilla" credit default swap, also referred to herein as a "single name default swap". The buyer of a credit default swap is protected against default by a reference entity and other actions constituting a "credit event". Typically the credit event is defined with respect to a particular reference security such as a bond or loan. Thus a credit default swap is a claim contingent on the time of a credit event, if any, of the reference asset and may therefore be considered a derivative. The ISDA definition of credit event is quite broad and includes the categories: bankruptcy; obligation acceleration; obligation default; failure to pay; repudiation or moratorium; and restructuring.

Protection in the form of a credit default swap is most commonly provided up until expiry at maturity or prior exercise of the default swap by the buyer of protection. The buyer of protection will usually make premium payments up until this terminal time, proportional to both an agreed notional value of the reference security and the swap rate. In cases where the reference security is considered to be in distress, the buyer will sometimes make an upfront payment either in addition to or in lieu of the ongoing premium payments. In exchange for the premium payments and contingent on a credit event occurring, the credit default swap gives the protection buyer the right to "deliver" the reference security to the seller of protection, and claim in return the notional value of the reference security. Therefore, in case of an exercise, the protection seller incurs an economic loss equal to the difference between the notional value of the reference security delivered and its market value post default.

In a standard valuation methodology the risk neutral expectation of the present value of this potential economic loss is equal, for some particular swap rate, to the risk neutral expectation of the present value of cashflows received from the buyer. This "fair" swap rate is referred to as the par swap rate or par premium. It will in general be a function of maturity, with the five year par swap rate providing the most commonly cited benchmark. In this framework, a default swap deal entered into today at the par swap rate will be valued at zero and marked accordingly. Existing or "off-market" default swaps will in general fail to match the maturity of quoted deals and therefore require a valuation methodology. The methodology must value both the "fixed leg" and the "floating leg": the contingent payments from buyer to seller and from seller to buyer respectively.
1.2. Basket default swaps
A basket default swap contract offers the buyer protection against credit events for a number of distinct reference securities comprising a "basket" or "reference portfolio". Almost invariably the aggregate loss - in the sense of the economic loss incurred by the seller in the case of a vanilla default swap - sustained across all reference securities at any given time defines the floating leg (credit event) payments from seller to buyer. The aggregate loss is calculated by assigning a notional to each reference security and subtracting from the notional the recovered amount or market value of the security after default.

In the case of large reference portfolios, the potential credit event payments are typically restricted by upper and lower attachment points for the cumulative loss in the reference portfolio. Credit event payments begin only after a given cumulative loss is sustained by the reference portfolio, and cease once this cumulative loss exceeds a specified upper bound. These lower and upper attachment points may be said to define the "tranche" protected. A partition of the reference portfolio into mutually exclusive, collectively exhaustive tranches is popular and may be seen to replicate a simple subordination scheme in a synthetic setting, analogous to notes paid out by vehicles for physical pooled and tranched cashflows such as Collateralized Bond Obligations (CBOs). Investment banks and other market participants may attempt to issue multiple tranches which together comprise a partition of the reference portfolio, since the aggregate credit event payments across all tranches are equivalent to the aggregate credit event payments across a portfolio of default swaps with the same securities and notionals as the reference portfolio. In practice this is rarely achieved.

For small baskets comprising a small number of reference securities, it is customary for protection to be limited by attachment points corresponding to aggregate notional defaults rather than aggregate loss. Furthermore, it is common for all reference securities to share a common notional, and for tranche attachment points to correspond to integer multiples of this common notional amount. For illustration we assume this common notional per name is ten million dollars. As with the single name default swap contract, the payment equivalent to the economic loss on the security made from buyer to seller of protection is actually an exchange of the defaulted security for a notional cash amount. The protection buyer has the right to deliver bonds or loan obligations issued by reference entities included in the
basket, for their notional values, at the time of the credit event.

A simple special example will motivate our study. In a notional protection first-to-default basket, the first attachment point is placed at zero. Beginning with the first default in the reference portfolio, the protection buyer claims the notional value of the securities in exchange for the defaulted security. This continues until the cumulative defaulted notional exceeds a threshold, at which point the contract terminates. For example, a first-to-default basket swap may be written on thirty reference entities with equal notional amounts, giving the protection buyer the right to exercise ten times within the protection period. This would be referred to as a "first ten of thirty" basket default swap. The complementary tranche for this type of basket is the second-to-default basket swap, allowing exercise only after a specified cumulative default notional has been exceeded. In this case the first attachment point is at one hundred million dollars notional and the product might be labeled a second-to-default "eleven through thirty" basket.

Rules for premium payments (the fixed leg) may vary. Ordinarily the buyer of protection must pay at specified dates an amount proportional to the amount of protection currently afforded. In the "first ten of thirty" example above, annualized fixed leg payments from buyer to seller are initially equal to the (tranche) premium multiplied by the notional of one hundred million. However, as defaults occur thereafter this is prorated according to the actual amount protected. This actual amount protected decreases from one hundred million to zero as the first through tenth defaults in the reference portfolio occur.

The terminology for single name default swap valuation carries over verbatim to tranche valuation. For at-the-money baskets the risk neutral expectation of the premium payments received is assumed (or defined) equal to the risk neutral expectation of the credit event payoffs. The mark-to-market value of an off-market (existing) basket default swap is equal to the difference in these two risk neutral expectations.

2. Hazard rate model and calibration

2.1. Default event and survival probability
Risk neutral valuation of contingent cashflows for parties entering a single name credit default swap typically presupposes a distribution for the time of default. Except in cases of very high default probabilities, a first order consideration is the probability of a credit event occurring before the maturity of the contract. In this paper the distribution of default times is determined in accordance with a dynamic model for the hazard rate. The default process is modeled as a doubly stochastic Poisson process, terminology motivated by the stochastic nature of the hazard rate and also the random nature of the default event given a particular hazard rate.

Let τ be the default arrival time for a given reference entity and A_t the indicator of the default event, i.e. A_t = 1 if τ ≤ t and A_t = 0 otherwise. Let N_t be a counting process with

    N_0 = 0.    (1)

It is well known that

    dN_t = h_t dt + dM_t,    (2)

where M_t is a martingale process and h_t is the rate of the arrival times. We define the first arrival time of N_t to be the default arrival time τ, i.e. N_τ = 1 and N_t = 0 for t < τ. Formally, the connection between A_t and h_t can be viewed as

    A_t = N_{t∧τ} = ∫_0^t (1 - A_u) dN_u, or dA_t = (1 - A_t)(h_t dt + dM_t).    (3)

In this model, the survival probability is given by

    P(τ > t) = E[ exp( -∫_0^t h_u du ) ].    (4)

2.2. The hazard rate process

Let g(t, T, x; α, β) be the generating function

    g(t, T, x; α, β) = E[ exp( -α h_T - β ∫_t^T h_u du ) | h_t = x ].    (5)

We obtain the function g(t, T, x; α, β) by solving the PDE

    ∂_t g + L g = β x g    (6)

with the terminal condition g(T, T, x; α, β) = e^{-αx}, where L is the generator of h_t. An exponential affine process h_t has the property that

    g(t, T, x; α, β) = exp{ -a(T - t; α, β) - b(T - t; α, β) x },
where a(·) and b(·) are deterministic functions (see DK92 and BS93). An example of such a process is the Cox-Ingersoll-Ross (CIR) process with jumps:

    dh_t = (κθ - (κ + λ) h_t) dt + σ √h_t dW_t + dJ_t,

where W_t is a standard Brownian motion and J_t is a pure jump process with jump intensity λ_J and an exponentially distributed jump size with mean μ_J. The exponent functions a(·) and b(·) for such a process are listed in Appendix A. Some special cases of the function g are listed below:

(1) The survival probability (from (4)) can be computed as

    P(τ > t) = g(0, t, x; 0, 1).    (7)

(2) The characteristic function of h_T is given by g(0, T, x; iu, 0).

2.3. Pricing single name default swaps and calibration
The buyer of a default swap pays a stream of cashflows (the promised payments) {t_i, s_i}_{i=1}^{M} to the seller and in return receives, contingent on default, a net payment equivalent to 1 - R (the default payment) from the seller when default occurs. Here t_i is the cashflow date and s_i is the amount paid at t_i with unit notional, M is the total number of promised payments, t_M is the maturity and R is the recovery rate. When the recovery rate is assumed constant, the pricing formula is given by

    PV = Σ_{i=1}^{M} P(τ > t_i) B(t_i) s_i - (1 - R) ∫_0^{t_M} B(u) P(τ ∈ du).    (8)
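The following is a minimal sketch of formula (8), assuming the survival curve P(τ > t) and discount factors B(t) are already available on a time grid; the discretisation of the protection-leg integral and all function names are assumptions of this sketch.

```python
import numpy as np

def cds_pv(coupons, surv_pay, disc_pay, recovery, surv_grid, disc_grid):
    """Present value of a single name default swap following equation (8).

    coupons, surv_pay, disc_pay : s_i, P(tau > t_i) and B(t_i) at the premium dates
    surv_grid, disc_grid        : survival and discount curves on a fine grid of [0, t_M],
                                  used to discretise the protection-leg integral (an assumption)
    """
    premium_leg = np.sum(surv_pay * disc_pay * coupons)
    dP = -np.diff(surv_grid)                       # P(tau in (t_{k-1}, t_k]) on the fine grid
    protection_leg = (1.0 - recovery) * np.sum(disc_grid[1:] * dP)
    return premium_leg - protection_leg
```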
2.4. Calibration

Calibration of model parameters for the hazard rate process is undertaken for each particular credit curve. A credit curve is a list of triples {t_i, c_i, p_i}_{i=1}^{N}, where t_i, c_i and p_i are the maturity, market quote and price respectively, and N is the total number of points in the credit curve. In the case of a default swap curve, c_i is the premium and p_i = 0, while in the case of a bond curve, c_i is the par coupon and p_i = 1. Let Θ be the parameter of the hazard rate process. For example, for the CIR process we have Θ = (κ, θ, λ, σ, λ_J, μ_J). Let PV_i(Θ, x) be the price corresponding to the i-th point in the credit curve using the model with parameter Θ, where x = h_0 is the initial value of the process h_t. Calibration comprises a search in the parameters Θ and x where violation of the constraints

    PV_i(Θ, x) = p_i,  for i = 1, 2, ..., N,    (9)

is penalized.
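A minimal sketch of such a penalised search is given below. The pricing routine `model_pv` and the choice of a least-squares penalty with a Nelder-Mead search are assumptions for illustration, not the authors' production calibration.

```python
import numpy as np
from scipy.optimize import minimize

def calibrate(theta0, x0, model_pv, target_prices):
    """Search over (Theta, x) penalising violations of PV_i(Theta, x) = p_i in (9).

    model_pv(theta, x, i) is a placeholder returning the model price of the
    i-th credit-curve point; target_prices is the list of p_i.
    """
    def penalty(z):
        theta, x = z[:-1], z[-1]
        return sum((model_pv(theta, x, i) - p) ** 2
                   for i, p in enumerate(target_prices))
    res = minimize(penalty, np.append(theta0, x0), method="Nelder-Mead")
    return res.x[:-1], res.x[-1]
```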
3. Pricing basket default swaps

From a pricing perspective, a basket default swap written on notional loss is essentially an option on the aggregate loss for a given portfolio. The loss amount is a function of default timing and recovery rate. Default timing in turn is driven by the hazard rate level and the default correlation. Therefore the main risk factors that impact the price of a basket default swap are the hazard rates for each name, the default correlation and the recovery rate distribution. Analytic solutions for the price of the basket default swap may run up against a combinatorial explosion in the number of possible orderings of defaulting assets. In this paper we present a practical Monte Carlo approach to valuation, where the default times of multiple assets are simulated. We first describe the simulation of default times for a single asset.
3.1. Simulation of default time for a single asset
A "compensator" method may achieve simulation of the time of default by means of path-wise simulation of the hazard rate. We suggest in this paper that the computationally intensive method of simulating hazard rates may be replaced with a faster algorithm termed the (deterministic proxy algorithm). The proposed deterministic proxy is equivalent to a lookup function for the convexity adjusted integrated hazard rate, as compared with the actual integrated hazard rate which is a random variable. We note that
£ > - l o g E exp -
hu du
• E
Jo
exp
• / K du Jo
(r > t),
where £ is an exp(l) random variable. Therefore, if we define T*^infU:£<
- l o g E exp
/ hu du Jo
(10)
then for any t > 0,

    P(τ* > t) = P( ξ > -log E[ exp( -∫_0^t h_u du ) ] ) = E[ exp( -∫_0^t h_u du ) ] = P(τ > t).

This demonstrates that by using a convexity adjusted lookup in place of simulation of the hazard rate, one may achieve simulation of the default time τ* with the same marginal distribution as the default time τ. For affine processes, we have a closed form expression for the right hand side of equation (10), i.e.

    E[ exp( -∫_0^t h_u du ) ] = e^{-a(t) - b(t) h_0}

with deterministic a(t) and b(t). Therefore simulating the default time τ using formula (10) requires only the simulation of the exp(1) variable ξ, and simulating the process h_t can be avoided. Figure 1 illustrates the algorithm for finding the default arrival time τ.
Figure 1. Fast algorithm: a sample path for finding τ.
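The following is a minimal sketch of this single-name deterministic proxy, assuming the affine exponents a(t) and b(t) are available as callables (e.g. from the jump-CIR formulas in Appendix A); the grid lookup and the flat-hazard usage example are assumptions of the sketch.

```python
import numpy as np

def default_time_proxy(a_fn, b_fn, h0, t_grid, rng):
    """Simulate one default time via formula (10): tau = inf{t : xi <= a(t) + b(t) h0}."""
    xi = rng.exponential(1.0)                      # unit exponential trigger
    cum_hazard = a_fn(t_grid) + b_fn(t_grid) * h0  # convexity adjusted integrated hazard
    hit = np.nonzero(xi <= cum_hazard)[0]
    return t_grid[hit[0]] if hit.size else np.inf  # np.inf: no default before the horizon

# Hypothetical check: a(t) = 0, b(t) = t reproduces an exponential default time with rate h0.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 30.0, 3001)
print(default_time_proxy(lambda t: 0.0 * t, lambda t: t, h0=0.03, t_grid=grid, rng=rng))
```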
3.2. Modeling default correlation and simulating joint default times

The deterministic proxy method described above is now adapted for multivariate default time simulation in a general setting.
To simulate joint default arrival times, we need a joint distribution P(τ_1 ≤ t_1, τ_2 ≤ t_2, ..., τ_d ≤ t_d) of default times, where τ_i is the default time for the i-th reference entity. In our case, the marginal distributions P(τ_i ≤ t) are given but the joint distribution is not. Intuitively speaking, if we are only concerned with the correlation among the τ_i's, we can use the compensator algorithm to simulate each τ_i but with correlated exp(1) random variables ξ_i in formula (10). We can formulate this idea as follows. Observe that if Z is a standard Gaussian random variable with distribution function Φ(x), i.e. Z ~ N(0,1), then -log(1 - Φ(Z)) is an exp(1) random variable. Set ξ_i = -log(1 - Φ(Z_i)) and (Z_1, Z_2, ..., Z_d) ~ N_d(0, Σ), where Σ = (ρ_ij)_{d×d} and ρ_ii = 1; then the exp(1) random variables ξ_1, ξ_2, ..., ξ_d are correlated. The joint distribution of default arrival times is given by

    P(τ_1 ≤ t_1, ..., τ_d ≤ t_d) = Φ_d( Φ^{-1}(F_1(t_1)), ..., Φ^{-1}(F_d(t_d)) ),    (11)

where F_i(t) = P(τ_i ≤ t), and the joint distribution of the exponential thresholds is

    P(ξ_1 ≤ x_1, ..., ξ_d ≤ x_d) = Φ_d( Φ^{-1}(1 - e^{-x_1}), ..., Φ^{-1}(1 - e^{-x_d}) ).    (12)

The default indicator correlation is given by

    Corr( A_i(t_i), A_j(t_j) ) = [ C(1, ..., p_i(t_i), ..., p_j(t_j), ..., 1) - p_i(t_i) p_j(t_j) ] / √( p_i(t_i) p_j(t_j) (1 - p_i(t_i)) (1 - p_j(t_j)) ),    (13)

where C(u_1, ..., u_d) = Φ_d(Φ^{-1}(u_1), ..., Φ^{-1}(u_d)) is the copula implicit in (11), p_i(t) = P(τ_i ≤ t) and p_j(t) = P(τ_j ≤ t). The conditional default probabilities are given in (14).

3.3. The pricing algorithm
In combination with the compensator proxy algorithm described above, we can price a basket default swap as follows:

(1) Generate (u_1, ..., u_n) ~ Φ_n( Φ^{-1}(u_1), ..., Φ^{-1}(u_n); Σ ) by
    (a) generating (y_1, ..., y_n) ~ Φ_n(y_1, ..., y_n; Σ) from the correlated normal distribution;
    (b) setting u_i = Φ(y_i).
(2) Set ξ_i = -log(1 - u_i). Then the default time of the i-th name is given by τ_i = inf_{t > 0} { t : ξ_i ≤ a_i(t) + b_i(t) h_{i,0} }, where a_i, b_i, h_{i,0} are the parameters for the generating function of the affine process described in the previous section.
(3) Feed the default times into the cashflow generator (this varies according to the deal microstructure and waterfall specifications).
(4) Discount all cash flows using the risk free discount curve to obtain the present value for the sample path.
(5) Repeat for a pre-defined number of sample paths. The basket price is the average over all paths.

Figure 2 is a numerical example showing how correlation affects the default distribution for a 30 name basket. In this example, the average hazard rate is about 300 basis points. Figure 3 shows the impact of correlation on the premium for the first 10 to default and last 20 to default baskets based on the same portfolio.
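The following is a minimal sketch of steps (1)-(2) above, assuming per-name affine exponents a_i(t), b_i(t) and initial hazard rates h_{i,0} are available; the grid lookup, the flat-hazard example and the parameter values are assumptions, not the production cashflow generator.

```python
import numpy as np
from scipy.special import ndtr   # standard normal CDF Phi

def correlated_default_times(a_fns, b_fns, h0, corr, t_grid, rng):
    """One Monte Carlo draw of joint default times via Gaussian-copula exp(1) triggers."""
    n = len(h0)
    y = np.linalg.cholesky(corr) @ rng.standard_normal(n)   # correlated N(0,1) draws
    xi = -np.log(1.0 - ndtr(y))                              # correlated exp(1) triggers
    taus = np.empty(n)
    for i in range(n):
        cum = a_fns[i](t_grid) + b_fns[i](t_grid) * h0[i]    # integrated hazard lookup
        hit = np.nonzero(xi[i] <= cum)[0]
        taus[i] = t_grid[hit[0]] if hit.size else np.inf
    return taus

# Hypothetical example: 30 names, flat hazards of 3%, pairwise correlation 0.2.
rng = np.random.default_rng(1)
n_names, horizon = 30, 3.5
corr = np.full((n_names, n_names), 0.2); np.fill_diagonal(corr, 1.0)
grid = np.linspace(0.0, horizon, 1401)
flat_a, flat_b = [lambda t: 0.0 * t] * n_names, [lambda t: t] * n_names
h0 = np.full(n_names, 0.03)
n_defaults = np.mean([
    np.sum(correlated_default_times(flat_a, flat_b, h0, corr, grid, rng) <= horizon)
    for _ in range(1000)])
print("average number of defaults before maturity:", n_defaults)
```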
[Figure 2 panels: histograms of the number of defaulted names for default indicator correlations of 0%, 10%, 20% and 40%; horizontal axis: number of names.]
Figure 2. Impact of correlation on the default distribution; 30 reference entities, average spread ≈ 170bp, maturity = 3.5 years.
[Figure 3 panels: first-to-default and second-to-default par premium plotted against the default indicator correlation (0.1 to 0.6).]
Figure 3. First (10/30) and second (20/30) to default swap; average spread ≈ 170bp, maturity = 3.5 years.
4. Conclusion

We have presented a concrete algorithm for rapid simulation of correlated default times and in particular the Monte Carlo valuation of basket default swaps. The general simulation techniques are immediately applicable across a wide range of financial products whose cashflows are contingent on the default or otherwise of multiple reference assets. These include both funded and unfunded Collateralized Bond, Debt and Loan Obligations, as well as a menagerie of custom structured credit products. The compensator proxy algorithm presented here allows us to simulate default times more rapidly than methods requiring the simulation of individual hazard rates, and in many cases appears to give results similar to those of these substantially more computationally expensive procedures. We note that the use of a deterministic proxy as advocated herein is not theoretically equivalent to the use of hazard rate simulation. The deterministic lookup ensures agreement only for the marginal distributions of default times for assets comprising the basket. Closer agreement in the joint distribution may be achieved by methods disclosed in a future paper. We have noted the connection to copula methods. In future papers we also intend to describe additional techniques for the simulation of default times, including specific noise reduction approaches, methods for sensitivity calculation and analytic approximations applicable to particular products.

Appendix A. Explicit solution of the jump CIR generating function

For the jump CIR model of hazard rates, we can solve for the coefficients of the generating function in (5) and get, for a(t; α, β),
27
v
AjAtj(a 7 + 2/3 - a(K + X))t (1 + nja)-y + (1 - IIJO){K + A) + 2j3uj + aa2 2\JHJ(2P 2
2
- 2a(K
+ A) -
aV)
(1 + /jja) 7 - [(1 - / y a ) ( « + A) + 2/3/xj + a
and

    b(t; α, β) = [ (1 - e^{-γt})(2β - (κ + λ)α) + γα(1 + e^{-γt}) ] / [ 2γ e^{-γt} + (κ + λ + γ + σ²α)(1 - e^{-γt}) ],

where γ = √((κ + λ)² + 2σ²β). If instead of being constant over the entire duration to maturity [0, T], the parameters are piecewise constant on each interval of the set {[T_i, T_{i+1}]; 0 ≤ i < M; t = T_0 < T_1 < T_2 < ... < T_M = T}, then

    g(t, T, x; α, β) = E_{t,x}[ exp( -α h_T - β ∫_t^T h_u du ) ]

can be computed by conditioning on h_{T_{M-1}}:

    g(t, T, x; α, β) = E_{t,x}[ exp( -β ∫_t^{T_{M-1}} h_u du ) g(T_{M-1}, T, h_{T_{M-1}}; α, β) ]
                     = E_{t,x}[ exp( -β ∫_t^{T_{M-1}} h_u du ) exp( -a(T - T_{M-1}; α, β) - b(T - T_{M-1}; α, β) h_{T_{M-1}} ) ]
                     = exp[ -a(T - T_{M-1}; α, β) ] g(t, T_{M-1}, x; b(T - T_{M-1}; α, β), β).

By recursion, we have

    g(t, T, x; α, β) = exp[ -Σ_{i=1}^{M-1} a(T_{i+1} - T_i; α_{i+1}, β) ] g(t, T_1, x; α_1, β),

where α_M = α and α_i = b(T_{i+1} - T_i; α_{i+1}, β) for i = 1, 2, ..., M - 1.
Appendix B. Copula functions

Denote the cumulative distribution function (CDF) of the i-th marginal by F_i(t). Let F(t_1, ..., t_n) denote the joint distribution function of n variates. Consistency with the marginal distributions requires that F_i(t) = F(t_1 < ∞, ..., t_{i-1} < ∞, t_i ≤ t, t_{i+1} < ∞, ..., t_n < ∞), i = 1, ..., n. By Sklar's theorem (Nel99), F(t_1, ..., t_n) has to be of the following form

    F(t_1, ..., t_n) = C( F_1(t_1), ..., F_n(t_n) ),

for some function C(x_1, ..., x_n) satisfying the following conditions:

(1) C(1, ..., 1, x_i, 1, ..., 1) = x_i for all i = 1, ..., n and 0 ≤ x_i ≤ 1, and C(x_1, ..., x_n) = 0 if any x_i = 0.
(2) Given any hypercube with corners a = (a_1, ..., a_n) and b = (b_1, ..., b_n), 0 ≤ a_j, b_j ≤ 1, the volume must be non-negative, i.e.

    Σ_{i_1=1}^{2} ... Σ_{i_n=1}^{2} (-1)^{i_1 + ... + i_n} C(x_{1,i_1}, x_{2,i_2}, ..., x_{n,i_n}) ≥ 0,

where x_{j,1} = a_j and x_{j,2} = b_j.

Such a function C(x_1, ..., x_n) is called a copula. Sklar's theorem also states that given a copula function and a set of marginals, the resulting function is in fact a multivariate distribution function. Moreover, for continuous distributions, the copula function is unique. Thus, any distribution can equivalently be formulated by the joint distribution of random variates that are marginally uniform(0, 1). Thus for a chosen copula, joint variates can be simulated by the following algorithm:

(1) Simulate the vector (u_1, ..., u_n) ~ C(u_1, ..., u_n). In other words, simulate from the multivariate uniform distribution represented by the copula function. Note that by construction 0 ≤ u_i ≤ 1.
(2) Find τ_i such that F_i(τ_i) = u_i. If we can evaluate F_i^{-1}(·) analytically, this root finding process is not difficult. If speed is desired, one can store the values of F_i(t) at different points, so that the root finding becomes a table look-up operation.

The simulated (τ_1, ..., τ_n) ~ C(F_1(τ_1), ..., F_n(τ_n)).

References
BS93. R. Brown and S. Schaefer. Interest rate volatility and the shape of the term structure. Philosophical Transactions of the Royal Society: Physical Sciences and Engineering, 347:449-598, 1993.
DK92. D. Duffie and R. Kan. A yield-factor model of interest rates. Research Paper, Graduate School of Business, Stanford University, 1992.
Nel99. R. B. Nelson. An Introduction to Copulas. Springer, New York, 1999.
OPTIMAL CONSUMPTION AND PORTFOLIO IN A MARKET WHERE THE VOLATILITY IS DRIVEN BY FRACTIONAL BROWNIAN MOTION*
YAOZHONG HU
Department of Mathematics, University of Kansas, 405 Snow Hall, Lawrence, Kansas 66045
E-mail: hu@math.ukans.edu
This paper deals with the problem of optimal consumption and portfolio for a generalized Black and Scholes market where the volatility is driven by a standard Brownian motion and by a fractional Brownian motion at the same time.
1. Introduction

In their famous theory of option pricing, Black and Scholes assume that the asset prices considered follow geometric Brownian motion. A famous optimal consumption and portfolio problem appears in [11]. Many researchers have extended this theory to more general models. One extension is to replace the geometric Brownian motions by more general diffusion processes, possibly with jumps. Another important extension is to consider stochastic volatility. We shall not give a historical account of the idea of using stochastic volatility; see for example [6] and in particular the references therein. Researchers also believe that the financial market has long memory and try to employ long range dependent stochastic processes (see for example [1], [10]). Since fractional Brownian motion is one of the simplest classes of long memory processes, people have tried to use fractional Brownian motion to model stock prices and other assets. However, a naive way of simply replacing the Brownian motion in the Black and Scholes market model

*This work is supported in part by the National Science Foundation under grant no. DMS 0204613 and no. EPS-9874732, matching support from the State of Kansas and the General Research Fund of the University of Kansas. AMS 2002 subject classifications: 60H05, 60H10, 91B28. Key words and phrases: fractional Brownian motion, Black and Scholes model, stochastic volatility, optimum, portfolio, consumption, Girsanov formula, Clark representation.
by the fractional Brownian motion produces arbitrage (see [12]). In [8], this problem is studied in the framework of fractional white noise analysis and the arbitrage problem is eliminated. A problem of optimal consumption and portfolio is also solved in [2], [9]. In this paper we consider the problem of optimal consumption and portfolio in a market where the volatility is random and satisfies a stochastic differential equation driven by fractional Brownian motion. Namely, let us consider a market with two securities. One is called the bond, whose price A_t satisfies

    dA_t = r_t A_t dt,  A_0 = 1,    (1)

where r_t ≥ 0 is a (deterministic) continuous function. The other asset is called the stock, whose price follows a generalized geometric Brownian motion

    dS_t = μ_t S_t dt + σ_t S_t dB_t,    (2)

where (B_t, t ≥ 0) is a standard Brownian motion and σ_t = f(Y_t) for a certain continuous function f, and Y_t satisfies the following stochastic differential equation:

    dY_t = b(t, Y_t) dt + σ_1(t, Y_t) dB_t + σ_2(t, Y_t) dB_t^H,  Y_0 given and deterministic,    (3)

where (B_t^H, 0 ≤ t ≤ T) is a fractional Brownian motion independent of (B_t, 0 ≤ t ≤ T). We shall assume the existence and uniqueness of the solution of the above equation (the existence and uniqueness of the solution to (3) will be discussed elsewhere). An option pricing theory analogous to the Black and Scholes one is given in [7]. This paper is concerned with the problem of optimal consumption and portfolio of Merton's type for the model (1)-(3). The novelty of this stock price model is that the price process itself is a semimartingale, but the volatility is random, long range dependent, and may not be a semimartingale. It remains to validate the model numerically from real financial data, and this will be done elsewhere. Since the solution to (3) may not be a Markov process, we will not use the Hamilton-Jacobi-Bellman method. We shall use the approach of [3], [4] and [9]. An important ingredient for this approach is the Clark representation formula. We shall study our problem by enlarging the filtration. More precisely, we assume that the whole volatility process (σ_t, 0 ≤ t ≤ T) is observed at any time. This is relevant to the so-called insider trading problem. However, let us note that although the optimal consumption and portfolio are searched among all admissible portfolios in the sense that we know σ_t
for all 0 ≤ t ≤ T at any time, the optimal consumption is adapted to the filtration generated by the standard Brownian motion and the fractional Brownian motion B^H. This means that the availability of future information on the volatility has no impact on the optimal consumption rate. This phenomenon is striking. It is my conjecture that the optimal portfolio is also adapted to the above filtration. It is also worth pointing out that this approach seems applicable to any stochastic volatility model.

2. General Results

Let (B_t, t ≥ 0) be a standard Brownian motion and (B_t^H, t ≥ 0) be a fractional Brownian motion independent of the standard Brownian motion. Let F_t = σ(B_s, 0 ≤ s ≤ t) be the σ-algebra generated by B_s, 0 ≤ s ≤ t, let G_t be the enlarged filtration under which the volatility process is observable, and consider a market with two securities.

(1) A bank account or a bond whose price per share A_t at time t ≥ 0 is given by

    dA_t = r_t A_t dt,  A_0 = 1  (so A_t = e^{∫_0^t r_s ds}),    (1)

where r_t ≥ 0 is a given F_t-adapted stochastic process satisfying E(∫_0^T |r_t| dt) < ∞.

(2) A stock whose price S_t, t ≥ 0, is given by the solution of a generalized "geometric Brownian motion" with stochastic volatility

    dS_t = μ_t S_t dt + σ_t S_t dB_t,  S_0 > 0 given,    (2)

where μ_t (> r_t) is a given F_t-adapted stochastic process and σ_t = f(t, Y_t) for a certain continuous function f, and

    dY_t = a(t, Y_t) dt + b(t, Y_t) dB_t + c(t, Y_t) dB_t^H,    (3)

where the differential dB_t^H is the Itô type fractional Brownian motion differential used in [5], [8].
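The following is a minimal simulation sketch of a market of this type. All concrete choices (Cholesky construction of the fractional Brownian path, an Euler scheme that treats the fractional differential pathwise, constant μ, f(y) = exp(y), and an Ornstein-Uhlenbeck-type drift for Y) are assumptions for illustration, not the paper's specification; in particular the Euler step is only a crude stand-in for the Itô type fractional integral used in the paper.

```python
import numpy as np

def fbm_path(n_steps, T, H, rng):
    """Fractional Brownian motion on a uniform grid via Cholesky of its covariance
    E[B^H_s B^H_t] = 0.5 (s^{2H} + t^{2H} - |t - s|^{2H})."""
    t = np.linspace(T / n_steps, T, n_steps)
    cov = 0.5 * (t[:, None]**(2 * H) + t[None, :]**(2 * H)
                 - np.abs(t[:, None] - t[None, :])**(2 * H))
    cov += 1e-12 * np.eye(n_steps)                 # tiny jitter for numerical stability
    path = np.linalg.cholesky(cov) @ rng.standard_normal(n_steps)
    return np.concatenate(([0.0], path))

T, n, H = 1.0, 250, 0.7
dt = T / n
rng = np.random.default_rng(0)
dB = rng.standard_normal(n) * np.sqrt(dt)          # increments of the standard BM
dBH = np.diff(fbm_path(n, T, H, rng))              # increments of the fractional BM
S, Y = np.empty(n + 1), np.empty(n + 1)
S[0], Y[0] = 1.0, np.log(0.2)
kappa, nu1, nu2, mu = 2.0, 0.1, 0.3, 0.08          # hypothetical coefficients
for i in range(n):
    sigma = np.exp(Y[i])                           # sigma_t = f(Y_t), an assumed f
    S[i + 1] = S[i] + mu * S[i] * dt + sigma * S[i] * dB[i]
    Y[i + 1] = Y[i] - kappa * Y[i] * dt + nu1 * dB[i] + nu2 * dBH[i]
print("terminal stock price:", S[-1])
```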
167
by the investor at time t respectively. In this paper we assume that the volatility is observable to simplify the setting. We first search the optimal portfolio among all portfolios which are adapted to the filtration Gt. And then we will show that the optimal portfolio in this class is in fact Ht = a(Bs , Bf, 0 < s
(4)
Let c = (ct, 0 < t < T) be a given adapted process, denoting the investor's consumption rate. We assume that ct > 0 and rT
Jo /o
Ctdt < oo almost surely.
The set of all such (c t , 0 < t < T) is denoted by C. A portfolio 8t = (at, /3t), 0 < t < T is called self-financing with respect to the consumption rate c if dz\ = atdAt + PtdSt - ctdt,
0 < t < T.
(5)
(where z\ is given by (4).) Denote $
= e -Jo
A self-financing portfolio 6 = (at,pt) if EU
»-.<*>.
(6)
is called admissible with respect to c
\PtatSS\2dt\
(7)
The set of all admissible portfolio is denoted by A. We shall denote by zt 'c an investor's wealth at time t corresponding to a self-financing portfolio 9. Namely, zf'c satisfies (4)-(5) and (7). Let (j> and tp be two given continuous concave functions. Define rT
J(6, c) = E
(8)
Jo
The problem that will be studied in this paper is the following P r o b l e m I: Find an admissible portfolio 6* £ A and a consumption rate C* such that J{6* ,c*)>J(6,c),
V6&A,
ceC.
(9)
168
W h e n rt, fit, <*t are constants, a similar problem was proposed and solved by R. Merton n . If Yt is driven only by standard Brownian motion, namely, in the equation (3), a-2(t,Yt) = 0, the problem has been discussed in t h e literature, see for example, 6 . In this paper we shall assume t h a t the volatility process is observable and driven both by a standard Brownian motion and a fractional Brownian motion. Unlike in t h e only one standard Brownian motion case we are no longer in Markovian case. We shall use t h e method of 3 , 4 , 9 . L e m m a 2 . 1 . Define Ps =
(10)
and let \ps\2ds\
Eexpl-
(11)
Let Q be a probability measure on (Q, F) defined by dQ_ ^ = n(T), dP
(12)
where 1
rt
rj(t) := exp I / psdBs
rt
- - /
p2sds
(13)
Denote £ t = exp (— J* rsdsj , 0 < t < T. Assume that F is a given random variable which is FT measurable. Then the following statements about F are equivalent: (i) There are admissible portfolio 6 and consumption rate c such that zQ'c = z and zT,c = F. (ii) G = &F + JQ £sCsds is square integrable and EQ [G]=z.
(14)
Proof Prom (4) we see t h a t a self-financing portfolio is uniquely determined by Pt, 0 < t < T. In fact, we have at
=
^tS
i
=
^t{zt_PtSt)
(15)
At
which substituted into (5) yields dzt = rtztdt
- ctdt + (fit - rt)PtStdt
+
atPtStdBt.
169
Or dzt - rtztdt + ctdt = at/3tSt
dBt
f^Hldt
+
CTt
= at/3tSt [dBt + Ptdt] . Using the definition of £t we may write the above equation as d(&zt) + itctdt =
a&PtStdBt,
(16)
where
Bt=Bt
+ Ht-rt
(17)
Hence ZTZT + / £scsds = z+ as£3(33SsdBs. (18) Jo Jo If Q is a probability measure on (Q,F) defined by (12), then from the Girsanov theorem Bf is a Brownian motion on the probability space (Q,, F, Q) Therefore by (7) we have EQ
&ZT+
L /
tsCsds
On the other hand, let
j
G(w) = £TF + /
&.&.
(19)
Jo
If E® (G) = z and G is square integrable, then there is a unique G t -adapted stochastic process ft such that G = E® [G] + f Jo
ftdBt.
Define ft
fit
(20)
and at by (15). Then 6t = (at,fit) is the portfolio. • Prom this lemma we see that the problem I is equivalent to the following problem. P r o b l e m II: Find a ct and an FT measurable random variable F subject to EQ
I
&F+ /
tesds
(21)
170
which maximizes EQ
(22)
(cs)ds Jo
We shall use the Lagrangian multiplier method to solve this constraint problem II. Consider for each A > 0 the following unconstrained optimization problem Vx(z) = sup IE c,F>0
ip(cs)ds
-XE®
f
Ztctdt + frF
Jo
(23)
Lemma 2.2. Suppose that for each A > 0 one can find V\(z) and corresponding c\(t,w) > 0,F\ > 0. / / there exists A* > 0 such that c\*,F\* satisfy the constraint (21), ie EQ
I
£sc\* {s)ds + £TF\.
(24)
z,
Jo then c\*, F \ . solve the constrained problem II. Proof E
If c > 0, F > 0 is any pair satisfying the constraint then
[/
ip(ct)dt +(F)
.JoE [ =
<E
iP(ct)dt +(F) -X*EQ
f iP(c*t)dt + 4>{F*) •
-X*EQ
f
£scsds + frF + X*z
Jo
[ ZsC*s ds + £TF* + \*z
Jo
T
/ £sc*sds +(F*) Jo This proves the lemma. • Prom this lemma it follows that Problem I is equivalent to the following problem: Problem III: Find F* and c* which maximizes = E
Jx*{0,c) = E
f(F) -\*EQ [ Jo Jo
Zscsds + ZTF (25)
for the fixed A* and the following holds for A*: rT EQ
f Zscx.{s)ds + tiTFx. Jo
(26)
171
Now we outline the general method to solve the unconstrained optimization problem (25)-(26). Using the definition of Tjt, we can write Vx(z) = sup CiF > 0 E [/0T (ij(ct) - Xrtrttct) dt + <j>(F) = sup CiF > 0 E [/ 0 r (iP(ct) - Xv&ct) dt + 4(F) -
XVT^TF
AT?T^TF]
. (27)
The problem (25) can be solved by maximizing the following two functions gt(c) = i>(c) - \r)t£tc ; c > 0
(28)
ht(F) =0
(29)
for each t £ [0,T] and u € ft. Since(w) is the solution to (25)-(26). We shall give more explicit solutions for some specific utility functions. 3. Some Particular Utility Functions Now we assume that ^(X)
=
and
D^
*{x)
=
D^
In this case 9t(c) = — c1 - Xti{t,u)ttc ; c > 0 7 h{F) = — F 7 for all t e [0, T] and u G 0 . We have g't(c) = 0 for c= c ,
7
- \n{t,u)ZTF
M
=
; F >0
[ ^ ] ^
(30) (31)
(32)
and by concavity this is the maximum point of gt. Similarly F = FX(OJ)
D2
(33)
172
is the maximum point of ht. We now seek A* such that (35) holds, i.e. E
'^fat'
/ fat
(34)
L Di \
Jo or
X^N
= z,
where N = E
f
fat
T Jo„^±T
fir
Jo
fat Di
L>2
tl)dt+^E[r,f
EUr1
(35)
D7_1
Dy-
Hence A* =
z\7-i
(s)
(36)
Substituted into (32) and (33) yields the optimal consumption rate
c\'{t,u) =
1 fayN\DiJ
(37)
and
*•<«>-*(^r
(38)
logx If >(x) — ^ p and tp(x) = - ^ p , then in similar way we have
(39)
A* D2
(40)
X*^TVT
and c*(*,w;
D2 ^fat
It is easy to see that c*(t,u) is Ht adapted.
(41)
4. Conclusion
In this paper we study the optimal consumption and portfolio problem (9) under the constraints (1)-(5). The problem is reduced to Problem III, namely (25)-(26), which can be solved explicitly for some specific utility functions. Solving Problem III yields c_t and F. Here, (c_t, t ≥ 0) is the optimal consumption rate process. To find the optimal portfolio process θ_t = (α_t, β_t), one uses (19)-(20) to find β_t. One may use (16) to find z_t and then use (15) to find α_t.
References
1. Beran, J. Statistics for long-memory processes. Monographs on Statistics and Applied Probability, 61. Chapman and Hall, New York, 1994.
2. Biagini, F., Hu, Y., Øksendal, B. and Sulem, A. A stochastic maximum principle for processes driven by fractional Brownian motion. Stochastic Process. Appl. 100 (2002), 233-253.
3. Cox, J. and Huang, C. F. Optimal consumption and portfolio policies when asset prices follow a diffusion process. Journal of Economic Theory 49 (1989), 33-83.
4. Cox, J. and Huang, C. F. A variational problem arising in financial economics. J. Mathematical Economics 20 (1991), 465-487.
5. Duncan, T. E., Hu, Y. and Pasik-Duncan, B. Stochastic calculus for fractional Brownian motion. I. Theory. SIAM J. Control Optim. 38 (2000), 582-612.
6. Fouque, J.-P., Papanicolaou, G. and Sircar, K. R. Derivatives in financial markets with stochastic volatility. Cambridge University Press, Cambridge, 2000.
7. Hu, Y. Option pricing in a market where the volatility is driven by fractional Brownian motions. Recent developments in mathematical finance (Shanghai, 2001), 49-59, World Sci. Publishing, River Edge, NJ, 2002.
8. Hu, Y. and Øksendal, B. Fractional white noise calculus and applications to finance. Preprint, University of Oslo, 1999.
9. Hu, Y., Øksendal, B. and Sulem, A. Optimal portfolio in a fractional Black & Scholes market. Mathematical physics and stochastic analysis (Lisbon, 1998), 267-279, World Sci. Publishing, River Edge, NJ, 2000.
10. Mandelbrot, B. B. Fractals and Scaling in Finance: Discontinuity, Concentration, Risk. Springer-Verlag, 1997.
11. Merton, R. C. Optimum consumption and portfolio rules in a continuous-time model. J. Econom. Theory 3 (1971), no. 4, 373-413.
12. Rogers, L. C. G. Arbitrage with fractional Brownian motion. Math. Finance 7 (1997), 95-105.
MLE FOR CHANGE-POINT IN ARMA-GARCH MODELS WITH A CHANGING DRIFT*
SHIQING LING
Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay
E-mail: [email protected]
This paper investigates the maximum likelihood estimator (MLE) of structure-changed ARMA-GARCH models. The convergence rates of the estimated change-point and of the other estimated parameters are obtained. After suitable normalization, it is shown that the estimated change-point has the same asymptotic distribution as that in Picard (1985) and Yao (1987). The other estimated parameters are shown to be asymptotically normal. As special cases, we obtain the asymptotic distributions of the MLEs for structure-changed GARCH models, structure-changed ARMA models with structure-unchanged GARCH errors, and structure-changed ARMA models with i.i.d. errors, respectively.
*This research is supported by the competitive earmarked research grant #HKUST6113/02P.

1. Introduction

Consider the following autoregressive moving-average (ARMA) model with generalized autoregressive conditional heteroscedasticity (GARCH) errors:

    y_t = Σ_{i=1}^{p} φ_i y_{t-i} + Σ_{i=1}^{q} ψ_i ε_{t-i} + ε_t,   ε_t = η_t √h_t,    (1.1)

    h_t = α_0 + Σ_{i=1}^{r} α_i ε_{t-i}² + Σ_{i=1}^{s} β_i h_{t-i},    (1.2)

where p, q, r and s are known positive integers and the η_t are independent and identically distributed (i.i.d.). Equations (1.1)-(1.2) are called the ARMA-GARCH model and denoted by M(λ), where λ = (m', δ')' with m = (φ_1, ..., φ_p, ψ_1, ..., ψ_q)' and δ = (α_0, α_1, ..., α_r, β_1, ..., β_s)'. Denote Y_1^k = (y_1, ..., y_k)'. Y_1^k ∈ M(λ_0) means that y_1, ..., y_k are generated by model (1.1)-(1.2) with the true parameter λ = λ_0. We say that Y_1^n follows a structure-changed ARMA-GARCH model if there exist k_0 ∈ [1, n-1], λ_0 ∈ Θ and λ_{01} ∈ Θ with λ_0 ≠ λ_{01} so that

    Y_1^{k_0} ∈ M(λ_0) and Y_{k_0+1}^{n} ∈ M(λ_{01}).    (1.3)
176
model, and derived the asymptotic distributions of the change-points in these models. Chong (2001) developed a comprehensive theorem for the structure-changed AR(1) model. A general theory for estimating changepoints in time series models with a fixed drift was established by Ling (2002b). In this paper, we use Picard's method to model M(kQ, Ao, Aoi). The convergent rates of the estimated change-point and other estimated parameters are obtained. After suitably normalized, it is shown that the estimated change-point has the same asymptotic distribution as that in Picard(1985) and Yao (1987). Other estimated parameters are shown to be asymptotically normal. As special cases of model M(fco, Ao,Aoi), this paper obtains the asymptotic distributions of MLEs for structure-changed GARCH models, structure-changed ARM A models with structure-unchanged GARCH errors, and structure-changed ARMA models with i.i.d. errors, respectively. This paper proceeds as follows. Section 2 presents main results. In Section 3 we report some Monte Carlo results. We give the proofs of main results in Apendix A.
2.
Main Results
As common practice, we parameterize the change-point as fco = ["•To], where TQ e (0,1) and [x] represents the integer part of x. We assume that Aoi = Aon is changed over n with dn = Aon — Ao —> 0 as n —> oo. It is reasonable to allow the changed parameters to have such a small shift. Some arguments on this can be found in Picard (1985) and Bai et al. (1998). Let Y" € M([nr 0 ], Ao, A0n) and the corresponding unknown parameter model be M([TIT), A, AI), where (r, A, Ai) € (0,1) x O 2 and 6 be a compact subset of Rl with I =p + q + r + s + l. Suppose that (TQ, AO, Aon) is an interior point in (0,1) x 0 2 and, for each A € 0 , it follows that Assumption 1. All the roots ofiz — • • • — pzp and ip(z) — \ + -ty\z-\ \-ipqzq are outside the unit circle and have no common root, 4>p =£0 and ipq ^ 0; Assumption 2. 0 < QQ < Qo < ao, cx-i > 0, i = 1, • • • , r — 1, ar y^ 0, (3j > §_ > 0, j = l,--- ,s, £ i = 1 a j + £ j = i / 8 j < 1 and E [ = 1 Q j ^ and 1 — Yli=i PiZ% have no common root.
177
Assumption 3. The spectral radius p[E(At ® At)] < 1, where faiVt
OtrVt
A%2 O (frr_- l ) X s
Ar-l)x(r-l) 0 ( r _ i ) x i
At = «1
V
/W \
ov 0(s-l)xr
^(s-l)x(s-l) 0 ( s - l ) x l /
wz£ft Ikxk is the k x k identity matrix, and ® denotes the Kronecker product, j Assumption 3 is the necessary and sufficient condition for Ey\ < oo, see Ling (1999) and Ling and McAleer (2002b). Conditional on Fo — (yo> 2/-ii • • • )> the log-likelihood function (ignoring a constant) can be written as I'K J
It
Ln(r,XAi)=[YlhW+
£
W
t=[nr]+l . [«Tol
[£>(A0) + t' = l_
£
it(Aon)"
(1)
t=[nr0]+l
where lt(X) = -£ 2 (A)/2/i t (A)-2 _ 1 log /it(A), and et(A) and /it(A) are defined as £t and ht, respectively, but now they are the functions of V", Yo and A. Since we do not assume that nt is normal, (2.1) is called the quasi-likelihood function, and its maximizer on the parameter space (0,1) X 0 2 , denoted by (fn, An, A i n ), is called the quasi-maximum likelihood estimator (QMLE) of (To,Ao,Aon)- In practice, the initial value Yo is not available and can be replaced by any constant. This does not affect the asymptotic behavior of the QMLE, see Ling and Li (1997). When yt € M(A), yt is a fixed function of A and {rjt, f]t-i,- • •}, and hence when Yf° € M(A 0n ), it strictly constitutes a triangular array of the type {ytn : £ = 1,2, • • • ;n = l , 2 , - - - } . In order not to overburden notation, we simply refer to the time series generated by M(AQ„) as yt. Our first result gives the rates of convergence of the QMLE and it plays an important role in the proof of Theorem 2.2. Theorem 2.1. Suppose that F/ 1 e M([nr 0 ], A0, A0„), £ | % | 4 + l < oo for some i > 0, and Assumptions 1-3 hold. If dn —> 0 and *Jn\\dn\\/logn —» oo, then
An - AQ = Op(-j=\
and Ai„ - A0n = Op(—=\
178
Now, we state our second main result which shows the limiting distribution of the QMLE. In the following, F is the distribution with density / on R: r°°
1
-,2
This distribution was first found by Picard (1985) and Yao (1987). The latter also tabulated its numerical approximation. Theorem 2.2. If the conditions in Theorem 2.1 are satisfied, then fn, An and Ain are asymptotically independent, and C-ln{d'nmn){fn MK
- r 0 ) — » L F,
~ A0) — > L N(0, - O - ^ f i - 1 ) , TO
A/^(AI„
- A0n) —
L
JV(0, - ^ f i - i S f i - 1 ) , 1 -T
0
where C n = {d'Jld^id'Jldn)-1, Q = E[d2lt(X0)/dXdX'}, and £ = E[(0Jt(A o )/0A) (dJt(A0) ld\')]. As mentioned in Section 1, model (1.1)-(1.2) implies some important special cases. We first consider the GARCH model: r
St = Vty/ht,
s
ht = ao + ^2a^i
+ J^0iht-i-
(2)
Denote model (2.2) by M(5) and let e\ = (e,, • • • ,£&)'. We say that e? follows a structure-changed GARCH model if there exists fco £ [l,n — 1] so that £i° e M(60) and e£ o + 1 G M(<50„) with 80 ^ 50n- This structurechanged model is denoted by M(ko, So, 5on). Let (fn, 8n, 5\n) be the QMLE of (ro,<So,<5on)- From Theorem 2.2, we immediately obtain the following result. Corollary 2.1. Suppose that e? e M([nT0],<50,<W), -EM4"1"' < oo for some t > 0, and Assumption 2 holds. If dn = Son — So —> 0 and •v/n||dn||/k>gn _> ooj then f„, <5n and <5in are asymptotically independent, and nK~l{d'nQdn){Tn
ML
- T0) — > L F,
~ So) — > L ^ ( 0 , - £ - J * - 1 ) . 4To
Mhn
- Son) —L N{0,
K
.R-1).
4 ( 1 - To)
Next, we consider the structure-changed ARMA model with structureunchanged GARCH errors. We say that Y™ follows such a model if there
179
exists k0 G [l,n - 1] so that Yf0 G M(A 0 ) = M(m0,<50) and Y £ + 1 G •^(-^On) = -^("loni^o) with m 0 ^ mo„. This structure-changed model is denoted by M(ko,mo,mon,5o)- Let (tn,mn,min,5n) be the QMLE of (ro,TOo,mon,5o). Prom Theorem 2.2, we obtain the following corollary. Corollary 2.2. Suppose that Y™ G M([nTo],mo,mo„,<5o), Assumptions 1-3 hold, r/t has a symmetric density and J5|7y(|4+t < oo for some t > 0. If ^n = "ion _ m o —> 0 and i/n||d n ||/logn —> oo, then f„, m n , mi„ and Sn are asymptotically independent, and C~ln{d'n9.mdn){fn
- TO) —> L F,
Vn{rhin - m 0 „) — > L N{0, n^T,mn^), I -TO v ^ ( m n - m 0 ) — » L iV(0, - f i - ^ f i - 1 ) , TO
VS(4-*o)—•L^(O,^-1),
where Cn = (d^E m d n )(d^f2 m d n ) - 1 , fim = P + 2Q, and S m = P + KQ. Finally, we consider the ARM A model: v
q
+ £*>
yt = Yl fay*-* + Yl ^t-i i=l
(3)
i=l
where et are i.i.d. with mean zero and variance a2. We denote model (2.3) by M(m). We say that Y™ follows a structure-changed ARMA model with i.i.d. errors if there exists fco G [l,n - 1] so that Y^0 G M(mo) and Yj£ +1 G M(mo„) with mo ^ mo„. This structure-changed model is denoted by M(fc 0 ,mo,mo„). In this case, maximizing the log-likelihood function (2.1) is equivalent to minimizing: H
n
L„(T,m,mi) = Y^£t(m)+ t=l
t(mi)
t=[nrj + l
[nro]
-^e?(m0)t=l
e
^2
n
^
e?(m 0 n ).
(4)
t=[nr 0 ] + l
Usually, we call the minimizer of L„(T, m, mi) the conditional least squares estimator (CLSE), denoted by (fLn, mLn,rh,nn). From Theorem 2.2, we obtain the following result. Corollary 2.3. Suppose that Y™ G M([nTo],m 0 ,m 0 „), Assumption 1 holds and E\et\2+l < oo for some t > 0. If dn = m0n - m 0 —> 0 and
180
\/n||dn||/logn -> oo, then fLn, mLn and rriLin are asymptotically independent, and n(d'nVdn)a-2(fLn
- T0) —>L F, a2
Vn(mLn - m0) —> L iV(0, — V'1), TO
2
Vn(rhLin
V'1),
- mon) — > L N{0, I
-To
where V = E[{det(Xo)/dm)(det(Xo)/dm')]. Remark 2.1. When Y™ e M([nro], mo, mon, ^o), the objective function (2.4) can be used to estimate (To,mo,mon)- Using a similar approach as for Theorems 2.1-2.2, we can show that C^n(d'nVdn)(tLn
- TO) — > L F,
(5)
where CLn = {d'nVdn)(d'nVdn)~l and V E[(det(Xo)/dm)(d£t(Xo)/dm') e2]. It is interesting to compare the two estimators fn in Corollary 2.2 and TLn in (2.5). The optimality of estimated change-points has not been established in the literature. We define efficiency as follows: Assume that fi„ and f%n are two estimators of TO such that •Din(Tin - To) —>L F and D2n(f2n — To) ->L F. We say that f i n is more efficient than Tin if D\n > D-m for all n. The reason for this definition is that f i n can always provide sharper confidence intervals than T2n at any significant level. Under this definition, it is easy to see that the QMLE fn in Corollary 2.2 is more efficient than the CLSE TLn in (2.5) if nt is normal since (d'nVdn)2 < (d'nPdn){d'nVdn) by Chebyshev's inequality. We can show that, even when nt is not normal, the QMLE is also more efficient than its CLSE.
3.
Simulation Studies
We first examine the performance of our asymptotic results in the finite samples via some Monte Carlo experiments. The following three models
181
axe used: Model 1 : yt = 0o2/t-i^{tk0} + Eu £t = Vty/ht and ht = ("00 + ao£?_i + /3oht-i)I{t fc 0 }
Model 2 : £t = rjty/ht and ^t = ("oo + ao£?_i + /?o/it-i)^{tk0} Model 3 : j/t = 0o2/t-i-f{t 0nyt-\I{t>ko} + £t, £t = »7f\/^t a n ( i ftt = ooo + "o£?_i + A)/i*-ii where r/t ~ i.i.d.JV(0,1). The true observations are generated through these models with parameters: o = 0.5, Qoo = 1-0, c*o = 0.2, 0o = 0.7, on = -0.6, a0on = 0.5, a 0 n = 0.57 and /?o„ = 0.02. We use 4000 replications in all the experiments. These experiments are carried out by Fortran 77 and the optimization algorithm from Fortran subroutine DBCOAH in the IMSL library is used. In Tables 1 and 2, we summarize the empirical means and standard deviations (SD) of the MLEs of A0 and Aon, A„ = {^n,6ton,OintPn)' a n d A ln = (<j)in, doin, &in, Pin)'• From the two tables, we see that the SDs of An and Ai„ are decreased and increased, respectively, as To is increased from 0.496 to 0.504. This is consistent with our theoretical results in Section 2. As the sample size n is increased from 250 to 400, both the corresponding biases and SDs become smaller. This is the same as the usual results in the structure-unchanged AR-GARCH models. For Models 2 and 3, the Monte Carlo results are similar and hence are not reported here. TABLE 1 Mean and Standard Deviation of MLE of Ao and Aon in Model 1 n = 250 and 4000 Replications A0==(0.5, 1.0, 0.2, 0.7) Mean
SD .500
Mean
SD .504
0n
ln
aoin
&ln
.4954 .0847
.9566 .1748
.1707 .1105
.7251 .0985
-.5884 .0726
.4933 .1271
.5128 .1949
Pin .0457 .0971
.4952 .0845
.9580 .1692
.1706 .1104
.7251 .0980
-.5885 .0728
.4934 .1277
.5131 .1960
.0449 .0988
.4952 .0841
.9579 .1686
.1712 .1098
.7246 .0975
-.5887 .0730
.4937 .1291
.5119 .1973
.0452 .0996
4>n
TO
.496
Mean
SD
Aon=(--0.6, 0.5 0.57, 0.02)
&0n
&n
182
TABLE 2 Mean and Standard Deviation of MLE of Ao and Aon in Model 1 n = 400 and 4000 Replications Aon=(--0.6, 0.5 0.57, 0.02)
A0==(0.5, 1.0 0.2, 0.7) Mean
SD .500
Mean
SD .504
&0n
&n
Pn
4>ln
"Oln
ain
.4966 .0655
.9884 .1696
.1823 .0858
.7144 .0843
-.5968 .0556
.4851 .0994
.5408 .1524
Pin .0495 .0741
.4977 .0650
.9690 .1331
.1815 .0852
.7162 .0744
-.5931 .0562
.4943 .0996
.5288 .1534
.0399 .0755
.4977 .0646
.9692 .1317
.1816 .0848
.7161 .0738
-.5932 .0566
.4944 .0999
.5288 .1543
.0399 .0762
4>n
TO
.496
Mean
SD
We use two methods to estimate Model 3, that is, the MLE and the CLSE as in Corollary 2.3 and Remark 2.4. Model 3 is denoted by Model 3a and Model 3b as it is estimated by the MLE and the CLSE, respectively. In Tables 3, 4 and 5, we report the median (Med), mean, 90% range, estimated asymptotic confidence interval (EACI), and asymptotic confidence interval (ACI) of the change-point fc0 = [nT0] with T 0 = 0.496,0.500,0.504 and the sample sizes n = 250,400. The empirical mean is the average of kn from the 4000 replications. The empirical median and the 90% range are respectively the 50%-quantile and the range between the 5% and 95% quantiles of the distribution of kn. The EACI and ACI are computed, respectively, by the following formulas:
fc„-[AFw/2]-l,fcn-[Ai^/2]+l
and
fco-[AFw/2]-l,fco-[AFw/2]+l
,
where F w / 2 is the wth quantile of the distribution F and A = (d'n£ldn) 1, (d'nQdn)-\ (d'^dn)-1 and (d'nVdn){d'nVdn)-2 for Model 1, Model 2, Model 3a and Model 3b, respectively. Using the density function f(x) in Section 2, we obtain F0.05 = 7.792. ft is estimated by 1 2 Q, V and V are similarly estimated. Under n- Y^=id k(Xn)/dXdX'. Assumptions 1-3, these estimators are consistent in probability (see Ling
183
TABLE 3 MLE and Confidence Interval of the Change-point fco 4000 Replications n=250 ro = 0.496 Model 1 Model 2 Model 3a Model 3b
Med 124 124 124 124
Mean 124 124 124 124
90% Range [119, 128] [114, 130] [115, 134] [115, 135] n=400
EACI [121, 127] [119, 129] [118, 130] [113, 135]
ACI [121, [119, [118, [113,
127] 129] 130] 135]
r 0 = 0.496 Model 1 Model 2 Model 3a Model 3b
Med 198 198 198 198
Mean 198 198 198 198
90% Range [193, 201] [188, 204] [189, 208] [188, 210]
EACI [195, 201] [193, 203] [192, 204] [187, 209]
ACI [195, [193, [192, [187,
201] 203] 204] 209]
From Tables 3, 4 and 5, we see that the median and mean are unbiased in all cases, and the ACI is exactly the same as EACI. The 90% range is slightly wider than EACI and ACI in all cases. As n is increased from 250 to 400, the EACIs and ACIs of kn—ko have not been improved. This is because the rate of convergence of kn — ko is Op(l/d'ndn), which only depends on d'ndn, while dn is fixed for n = 250 and 400 in our experiments. This finding is similar to those in Bai (1995) for the structure-changed regression model and in Bai et al.(1998) for the structure-changed multivariate AR models
184
and cointegrating time series models. Comparing Model 1 TABLE 4 MLE and Confidence Interval of the Change-point fco 4000 Replications n=250 TO = 0.500 Model 1 Model 2 Model 3a Model 3b
Med 125 125 125 125
Mean 125 125 125 125
90% Range [120, 129] [115, 131] [116, 135] [115, 136]
EACI [122, 128] [120, 130] [119, 131] [114, 136]
ACI [122, [120, [119, [114,
EACI [197, 203] [195, 205] [194, 206] [189, 211]
ACI [197, 203] [195, 205] [194, 206] [189, 211]
128] 130] 131] 136]
n=400 TO = 0.500 Model 1 Model 2 Model 3a Model 3b
Med 200 200 200 200
Mean 200 200 200 200
90% Range [195, 204] [191, 207] [191, 210] [190, 211]
TABLE 5 MLE and Confidence Interval of the Change-point fco TO = 0.504 and 4000 Replications n=250 Model Model Model Model
1 2 3a 3b
Med 126 126 126 126
Mean 126 126 126 126
90% Range [121, 131] [116, 132] [117, 136] [116, 137]
EACI [123, 129] [121, 131] [120, 132] [115, 137]
ACI [123, [121, [120, [115,
129] 131] 132] 137]
EACI [199, 205] [197, 207] [192, 208] [191, 213]
ACI [199, [197, [192, [191,
205] 207] 208] 213]
n=400 Model Model Model Model
1 2 3a 3b
Med 202 202 202 202
Mean 202 202 202 202
90% Range [197, 206] [193, 209] [193, 211] [192, 213]
with Model 3a, we can see that both the 90% range and the ACI for Model 1 are, respectively, tighter than those for Model 3a. This means that we can estimate fco more precisely when changed coefficients also exist in the GARCH part. Comparing Model 3a with Model 3b, we find that both the 90% range and the ACI for Model 3a are, respectively, tighter than those for Model 3b. This is consistent with our discussion in Remark 2.1.
185
Appendix A.
Proofs of Theorems 2.1-2.2
Ling (2002c) proved that Assumptions A-C in Ling (2002b) are satisfied if Assumpptions 1-3 hold. In the following, we only present the different parts from those of Theorem 2.1 in Ling (2002b). We reparameterize model M(r, A, Ai) by letting 9 = X — Ao, #i = Ai — Aon; and 7 = C~ln(d'nSldn)(T — To), and only prove the case with 7 < 0. The proof for the case with 7 > 0 is similar. The log-likelihood function (2.1) can be rewritten as [nr]
LnbM)
= X>(Ao + 9) - lt(X0)} t=i n
+
Yl
[h(X0n+9l)-k(\0n)}
t=[nr 0 ]+l [nro]
+
£
l't(Ao +0i+<**)-!*(*«,)].
(A. 1)
t=[nr]+l
The basic idea in the proof of Theorem 2.1 is to investigate the asymptotic behavior of each term in (A.l). We need the following seven lemmas. The notation Aon = Ao or Aon will be used. Lemma A . l . If the assumptions of Theorem 2.1 holds, then, for any f>0, [TIT]
sup max y ^ t ( A o n + 0) - lt{Xon)] = Op(l).
-<-
e
i=i
Lemma A.2. If the assumptions of Theorem 2.1 holds, then, for any e > 0, there exists a constant D such that, as n is large enough, '( sup max V[Zt(Ao + 0) ~/t(A 0 )] > D l o g n ) ^0
Q
^
< e.
'
L e m m a A.3. If the assumptions of Theorem 2.1 holds, then, for any e > 0, there exists a constant D such that, as n is large enough, (nro)
P[
sup max
V
[lt(\o + 6)-lt(X0)}>
Dlogn)
<e.
The proofs of Lemma A.1-A.3 are similar to those of Lemma 5.1 in Ling (2002b) and hence the detalis are omitted.
186
L e m m a A.4. Let C, H and e be any positive constants. If the assumptions of Theorem 2.1 holds, then there exists 7 > 0 such that, as n is large enough, [nro]
P(
sup V
sup
V
[lt(\o
0)-lt{\o)}>-H)<e,
+
TC||dn||t=^+1
/
where Tn(j) £ ( 0 , T 0 ) such that 7 = -C-ln{d'Jldn)\Tn(i) - r 0 ]. Proof. Denote f = T„(7) and k = [nr n (7)]. As the proof of Lemma 5.1 in Ling (2002b), we can show that there exists a constant A such that [nr0]
P{
sup
V
sup
T
lh(\o +
9)-lt{\o))>-H)
T < r n ( 7 ) | | 9 | | > C | | d „ | |=t =[nr] r ^ ++1
'
[nr0]
sup [9 £
^
l - \\9f([nr0) - [nr])A] > ~H)
(A- 2)
+f Since 0 and E are positive definite, we can show that C2KH2A2([nT0]-[nf]) 16 2 2 2 1 = C l|dni| A Cn K^n)([nr0]-[nf])
>
where K > 0 is some constant. As 7 is large enough, (A.2) is bounded by P(sup
sup [9 £ ^°)_I||6l||2([„7b]-[„r])A]>0) + i ll«ll>c|l*.ll t = [ n r ] + 1 [nTo]
^
P
E
( ™ P { |
». . . .
^^|-^K||([nro]-[nr])A}>0)
+
t=[nr] + l
V* < p L
fc
o-fct^
p
_ L ^
9A 2
4 ^ 4 > i t )
+
!.
(A. 4)
Now, using a similar method as the proof of Lemma 6.2 in Ling (2002b), we can completes the proof. •
187
Lemma A.5. Let H and e be any positive constants, f e (0,1), and Mn = log n or 1. If the assumptions of Theorem 2.1 holds, then there exists a constant C > 0 such that, as n is large enough, [ rl
/
r[
"
_
-
x
sup sup V ?CM„7^
y
Lemma A.6. Let C, H and e be any positive constants, and f £ (0,1). If the assumptions of Theorem 2.1 holds, then, as n is large enough, [n-ro]
P(sup V
sup
J2
[h(Xo + 9) - lt(X0)} > -Hlogn)
< e.
T<*IIV5»ll>Cl0Bn t=[nT]+1
'
The proofs of Lemma A.5-A.6 are similar to those of Lemma 5.2, 5.3 and 5.4(a) in Ling (2002b) and hence the details are omitted. Lemma A.7. If the assumptions of Theorem 2.1 holds, then for each 7 > 0, there exists a constant C > 0 such that, as n is large enough, ["To]
P(
sup
sup
V*
[lt(\0 + 9)- Jt(A0)] > C) < e,
V n (7)
'
where rn(j) e (0,r 0 ) such that 7 = -C~1n(d'nQdn)[Tn(;y) - r 0 ] . Proof. Denote f = T „ ( 7 ) . There is an N such that, as n > N, \\0\\ < 2||d n || < 77. We can show that, as n> N, [nro]
P(
sup
V
sup
Y]
[lt(Xo +
e)-lt(Xo)}>c)
f
sup
sup
V
[0
/
> t=
[^j+1
-([nTo]-[nr])||0||2A]>c). Note that ||dn||2([nTo] —
[TIT])
\
«A
(A. 5)
< 7C1, where Ci is some constant. For any
188
e > 0, as C is large enough, (A.5) is bounded by [nr0]
,x
QI
t=[nr]+l
£,
N
Wo)
'(iKIl Ssup U
dX
>
r0]-[nf]
= P(\\dn
sup 0
£
t=[nT]+l [nr 0 ]-[nf]
= P(\\dn
sup 0
t=[nr] + l [nr 0 ]-[nf]
= P(\\dn
sup 0
= P(\\dn t 2
£
dk(Xo) > dX
CN
dk(Xo)
t=i
sup JZ %(Ao) dA
0
>
dX
dx >
t=l
dk(Xo) > dX
c-
CN
t = 1
Kirn(T0-r)=O(C-^)<e
(A. 6)
where the second last step holds by the maximal inequality for martingale given by Birnbaum and Marshall (1961). This completes the proof. • Proof of Theorem 2.1. Let 9 i „ = {(7, M i ) : \"f\ > 7} and G 2 „ = {(7> #,0i) : |T| < 7i l l v ^ l l > M or ||V"#i|| > M}- It is sufficient to prove that, for any e > 0, (a), there exists a constant 7 > 0 such that, as n is large enough, P(supL n (7,0,6>i) > e ) < e . (b). for any 7 > 0, there exists a constant M > 0 such that, as n is large enough, P(sup 1^(7,0,0!) > e ) < e . e2„ The proof of (a) is similar to that of Theorem 6.5 in Ling (2002). Now, we prove (b). For each 7 > 0, let r„(7) £ (0,r 0 ) such that 7 = —C~xn «fid„)[T„(7) - r 0 ] , and fcn(7) = [nTn(j)]. Let 9 2 i „ = {(7,0,6>i) : \\y/n0\\ > M, \\dn+0i\\ < 2\\dn\\,kn(nr)0 such that r n (7) > eo for all n. By Lemma A.5 with Mn = 1, for any H, e > 0, there exists a constant
189
M > 0 such that, as n is large enough, k
P( sup TUXo
+ 0)- lt(Xo)} > -H)
< e.
Thus, for any e > 0, there exists a constant M > 0 such that, as n is large enough, P(supLn(7,0,0i)>e)<e.
(A. 7)
Let 9 2 2 n = { ( 7 , M i ) = l l v ^ i l l > M, \\dn + 9^ < 2||d n ||,fc n (7) < k < k0}. By Lemma A.l, sup 0 2 2 n St=i['t(^o + 0) - lt(\o)] = O p (l) because r n (7) > e0- By Lemma A.7, sup 022?i E ^ f c + i [ ' t ( A o + d n + «i)-it(Ao)] = O p ( l ) . By Lemma A.5, for any H, e > 0, there exists a constant M > 0 such that n
PI sup Ve
V
{lt(X0n + 0i) - It(Xon)}
>-H)<e.
""t=t^i
Thus, for any e > 0, there exists a constant M > 0 such that, as n is large enough, P ( sup L n (7, M i ) > £ ) < £ •
(A. 8)
©22n
Let e 2 3 n = { ( 7 , M i ) : K Lemma A.2, sup e 2 3 n £t=ifr(Ao A.3, supe 23n E£fe+i[^( A o + dn \\y/n9i\\ > \\^{h+dn)\\-\\y/^dn\\ for H, e > 0, it follows that, as n
+ 0i|| > 2||d n || > k n ( 7 ) < k + 0) - lt(\o)] = O p (logn). +#i) - lt(\o)\ = O p (logn). >Vn\\dn\\ >Clogn. By is large enough,
< k0}. By By Lemma For any C, Lemma A.5,
n
P(sup J^ 923
[/t(Aon + 0 i ) - Z t ( A o „ ) ] > - i n o g n ) < e .
t=k0+i
Thus, for any e > 0, there exists a constant M such that, as n is large enough, P(supL„(7,Mi) >£)<£•
(A. 9)
©23n
Note that 9 2 „ = 6 2 i„ U 9 2 2 „ U 9 2 3 n . By (A.7)-(A.9), it follows that (b) holds for any e > 0. This completes the proof. •
190
Proof of Theorem 2.2. Let Jnu = 9 and ^/nui - Oi. We reparameterised the log-likelihood ratio Ln(j,0,6i) in (A.l) as k
L„(7, u, uj) = ^ [ / t ( A 0 + -p=) - h(Xo)} t=i
n
^ n
['t(A0n + " ^ ) - «t(A0„)]
+ J]
V n
t=fe0 + l fco
^ A ° + y= + dn)-
+ E t=fc+i
^
it(Ao)].
(A. 10)
n
By Theorem 2.1, we only need to consider the asymptotic behavior of Ln(-y,u,ui) on the bounded region: QM — {(l,u,ui) : \j\ < j,\\u\\ < M and ||ui|| < M}. On 9 M , by Taylor's expansion, we can show that, when k < fco,
. ,
,
u' ^di t (Xo) sfn ^ v^
n ^~f dXOX' 9A
-
n
t=fc 0 +l
v!_ X v
u'Ad%(M
dX
dlt(Xo)
^-
t=fc 0 +l
d2lt(X0)
u' X
t=fc+i
d\(X^, SAdA — « i
t=fc+i
^t(Ao)
d2lt(X0 dxdX'
, X * 2s
d
d
+ n 2_.
dX
t=k+l
dn
t=k+l
+ Aln-A2n-A3n
+ op{l),
(A. 11)
where o p (l) holds uniformly on 0 M and _ui-V Aln
- ^ v
X
2s
dlt(Xo) dX
'
t=k+l
ui_u'
n
* d\(x0). t—' t=k+l
aXoX'
A2n
_2(u[-u')
X
~
2s dxdx' "'
^ v
d\(XQ)j
t=k+l
. , 2(Ul-uy ^ n
^
^t(Ao) oXaX'
t=k+l
When fc >fco,a similar Taylor's expansion can be shown. It is easy to see that £HAinll2 is bounded by 2MS|fc 0 -fc|/n = Q(7/n||d n |] 2 ) andE||A 2 n || is bounded by 2Mfi|fc0 - k\dn/y/n = 0(j/y/n\\dn\\). Thus, Ain = o p (l) and A^n = o p (l), uniformly on 0 M - Similarly, we can show that A%n = o p (l) uniformly on 0 « .
191
By Theorem 4.2 in Ling (2002b), we have >2/t(A0. dXdX
'n E ^S^"^-^ 1 )' t=k0+l
ji
V^
d
9 /t(Aon) ,
*E
dxdx'
_
dn = Cnl + 0p(
, .
^
t=k+l
Thus, for any 7, we have that
., v>
d\(X0)
t=k+l
+d'n J2
^^dnI{k>ko]=Cnj
+ op(l),
(A. 12)
t=fc0 + l
where o p (l) holds uniformly on OM- From (A.11)-(A.12), we have that T
.
u'^d2lt(X0)
v! ^dlt(Xo) M'I ^ v
^p ^
dlt(X0n) dX
ui n
t=fc 0 +l
, 9 2 /t(A 0n ) dXdX'
^p ^
+ C„[W„(7)-7], where
Wn(j)
=
Ul
t=fco+l
l
C~ [l{k
(A-13) Etk+i
®t(X0)
/dX +
I{k>ko]d'n
dl X
T,t=k0 + l t( On) /dX}. When 7 > 0, by strict stationarity, we have
E
dlt(Xon)
±
V"^ ®lt(Xon)
~~ ^
dX
dX
/A , ,•,
'
l
j
where x — y means that x and y have the same distribution. Thus, by Lemma A.6, it follows that k
C~ld'n £
^2)_>LB1(7)inU[0,oo)1
(A. 15)
t=fc0 + l
as n —> 00, where {I?i(7),0 < 7 < 00} is a Brownian motion with Bi(0) = 0. Similarly, when 7 < 0, Cn dn } _ , t=fc+i
dX
=Cn
dn2_^ t=i
—gr
>L B i ( - 7 ) in £>(-oo,0],
192
as n —> oo. Now, we define a process {£(7) : 7 € (-00,00)} such that Bin) = 5i(7K{ 7 >0} + 3i(-7)/{ 7L 5 ( 7 ) in £>(-oo, 00),
(A. 16)
as n —> 00. By Theorems 4.2-4.3 in Ling (2002b), it follows that j= L \fn f^
- 3dX T - - -n 2^ _ - m flTT" ^ dXdX
u'i
\p
v^
A
dlt(X0n)
<9A
V ^ « £1 - TOU fi«(A. 17)
u\
A
g 2 / t (A 0n )
n
A
<9A9A'
t=[nr 0 ]+l
Ul
t=[nT 0 ]+l
V T ^ u i f c - (1 - TDjuiflui,
(A. 18)
as n —> 00, where £1 and £2 are normal random vectors with the covariance matrix S. It is easy to show that Wn, ra_1/2]Ct=i dlt(Xo)/dX and , + 1 dlt{Xon)/dX are asymptotically uncorrelated. Thus, £(7), n - i / 2 £2"_, £1 and £2 are mutually independent. In the following, we derive the limiting distribution of (fn, An, Ai„). When the limit of Cn does not exist, we cannot obtain the limiting distribution of (f n ,A„,A ln ) directly from L n (7,u,wi). However, let {(f ni , Arai, AinJ : i = 1,2, • • • } be an arbitrary subsequence of of {(fn, A n .Ai n ) : n = 1,2, • • •}. Using a similar method as the proof of Theorem 2.1 in Ling (2002b), we can show that there are a further subsequence {(f„ i ., A ni ., ^lrii.) : 3 — 1,2, • • • } such that Cnlid'n^dn^nijiTn^
- r 0 ) —»£ argmin 7 {S( 7 ) - 7},
y/n~{Xni. - A 0 ) — > L V ^ f i l / 2 6 > ^/^"(Ain;. - A0nij.) — > L v7! -r 0 fi 1 / 2 C2, as riij —> 00. Since the subsequence {(fni, A n i ,Ai n J : i = 1,2, •••} is arbitrary, we can claim that the conclusion in Theorem 2.2 holds. This completes the proof. • References 1. D.W.K. Andrews Tests for parameter instability and structural change with unknown change point. Econometrica 61, 821-856(1993).
193
2. D.W.K. Andrews and W. Ploberger Optimal tests when a nuisance parameter is present only under the alternative. Econometrica 62, 1383-1414(1994). 3. J. Bai Least squares estimation of a shift in linear processes. J. Time Ser. Anal. 15, 453-472(1994). 4. J. Bai Least absolute deviation estimation of a shift. Econometric 11, 403436(1995). 5. J. Bai Testing for parameter constancy in linear regressions: an empirical distribution function approach. Econometrica 64, 597-622(1996). 6. J. Bai Estimating multiple breaks one at a time. Econometric Theory 13, 315-352(1997). 7. J. Bai and P. Perron Estimating and testing linear models with multiple structural changes. Econometrica 66, 47-78(1998). 8. J. Bai, R.L. Lumsdaine and J.H. Stock Testing for and dating common breaks in multivariate time series. Rev. Econom. Stud. 65, 395-432(1998). 9. Z.W. Birnbaum and A.W. Marshall Some multivariate Chebyshew inequalities with extensions to continuous parameter processes. Ann. Math. Statist. 32, 687-703(1961). 10. T.T.L. Chong Structural change in AR(1) models. Econometric Theory 17, 87-155(2001). 11. G.C. Chow Tests of equality between sets of coefficients in two linear regressions. Econometrica 28 591-605(1960). 12. R.A. Davis and D.W. Huang and Y.C. Yao Testing for a change in the parameter values and order of an autoregressive model. Ann. Statist. 23, 282304(1995). 13. B.E. Hansen Tests for parameters instability in regression with 1(1) processes. Journal of Business and economic Statistics 10, 321-335(1993). 14. D.V. Hinkley Inference about the change-point in a sequence of random variables. Biometrika57, 1-17(1970). 15. D.V. Hinkley and E.A. Hinkley Inference about the change-point in a sequence of binomial variables. Biometrika 57, 477-488(1970). 16. T.L. Lai Sequential changepoint detection in quality control and dynamical systems. With discussion and a reply by the author. J. Roy. Statist. Soc. Ser. B 57, 613-658(1995). 17. S. Ling On the probabilistic properties of a double threshold ARM A conditional heteroskedasticity model. Journal of Applied Probability 36, 688705(1999). 18. S. Ling Exact testing for change-points in time series models. Working paper, HKUST(2002a). 19. S. Ling Asymptotic theory for estimating change-points in time series models. Working paper, HKUST(2002b). 20. S. Ling Estimation for change-point in ARMA-GARCH models with a fixed drift. Working paper, HKUST(2002c). 21. S. Ling and W.K. Li On fractionally integrated autoregressive movingaverage time series models with conditional heteroskedasticity. J. Amer. Statist. Assoc. 92, 1184-1194(1997) . 22. S. Ling and M. McAleer Necessary and sufficient moment conditions for the
194 GARCH and Asymmetric Power GARCH models. Econometric Theory 18, 722-729(2002b). 23. D. Picard Testing and estimating change-points in time series. Adv. in Appl. Probab. 17, 841-867(1985). 24. R.E. Quandt Tests of the hypothesis that a linear regression system obeys two separate regimes. J. Amer. Statist. Assoc. 55, 324-330(1960). 25. Y.C. Yao Approximating the distribution of the maximum likelihood estimate of the change-point in a sequence of independent random variables. Ann. Statist. 15, 1321-1328(1987).
D Y N A M I C PROTECTION W I T H OPTIMAL WITHDRAWAL
HANS U. GERBER Ecole des hautes etudes commerciales Universite de Lausanne CH-1015 Lausanne, Switzerland E-mail: [email protected] ELIAS S. W. SHIU Department of Mathematics The Hong Kong Polytechnic University Hung Horn, Hong Kong, China and Department of Statistics and Actuarial Science The University of Iowa Iowa City, Iowa 52242-1409, U.S.A. E-mail: [email protected] For t > 0, let Si(t) and S2M be the time-t prices of two stocks. Consider an American option that provides the amount II(t) = S2(t) max < 1, max -—7-f f \
o
if it is exercised at time t, t > 0. The option payoff is guaranteed not to fall below the price of stock 1 and is indexed by the price of stock 2 in the sense that, if II(t) > 5i(t), the instantaneous growth rate of II(t) is that of S2(t). Assuming that the stock prices are geometric Brownian motions and that the option does not have an expiration date, we present two equivalent, closed-form formulas for pricing the option. We also show that this price has a direct connection with that of the perpetual maximum option.
1. I n t r o d u c t i o n The maximum option (also called alternative option or greater-of option) is an option whose payoff is max {Si(i),S 2 (t)} ,
(1)
if it is exercised at time t. Assuming that the stock prices are geometric Brownian motions, Gerber and Shiu [3; 4] have derived closed-form formulas 195
196
for pricing perpetual American maximum options. The word "perpetual" means that the option has no expiry date, and hence it cannot be a European option. Although the qualifier "American" is not necessary, it is added to emphasize that the option holder can exercise the option at any time. The payoff (1) can be written as ftWn»x{l,||}
.
(2)
This is readily comparable with the payoff of the dynamic protection option, which is n(t) = 5 a ( t ) m « { l
> (
g«t|^}.
(3)
Here, there is an assumption that 5i(0) < 5 2 (0) .
(4)
Note that the payoff (3) is path-dependent. A special case of the dynamic protection option is the Russian option proposed by Shepp and Shiryaev [7]. Another special case is the dynamic fund protection considered by Gerber and Pafumi [1]. A main purpose of this paper is to present closed-form formulas for pricing a perpetual American dynamic protection option. See (45) and (54) below. Also, it is shown that there is a direct connection between the price of a perpetual dynamic protection option and that of a perpetual maximum option. See (60). 2. Interpretation and Motivation Let us elaborate on the financial meaning of the dynamic protection option. We let the option payoff be denoted by U(t), if the option is exercised at time t. The idea is to construct a payoff that is guaranteed not to fall below the price of stock 1 and is indexed by the price of stock 2 in the sense that, if H(t) > S\(t), the instantaneous growth rate of IT(£) is that of Sz(i), dU{t) n(t)
dS2(t) s2(t) •
[
'
To derive a formula for the payoff function Il(£), consider a contract that always provides a sufficient number of shares of stock 2 so that the total value of these shares is at least the price of one share of stock 1 at any time.
197
Let TI(T) denote the number of shares of stock 2 at time T, r > 0. Thus we have the condition TI(T)S2(T)
> SI(T)
for all T > 0 .
(6)
We start with one share of stock 2, i.e., n(0) = 1 .
(7)
Note that condition (6) is satisfied for T = 0, because of assumption (4). If the contract further stipulates that additional shares can be credited but they can never be taken away afterwards, then the function n(r) is nondecreasing, n(t) >
for 0 <
TI(T)
T
< t .
(8)
If follows from conditions (8) and (6) that n ( 0 > ^ T
forO
S2{T)
and hence n(t) > max . v ' oma*{l,max|jlj}.
(9)
Obviously, there are infinitely many functions n(t) satisfying these three conditions. For the least cost, we choose the smallest such function, that is the one with equality in (9): n(t) = max i 1, max - ^ i
1 .
(10)
O
(In the next section, we shall formally state as one of the assumptions that all securities are perfectly divisible. Thus the share-number function, n(t), does not need to be integer-valued.) Then the total value of the shares of stock 2 at time t is n(t)S2(t) = max | l , j n a ^ | ^ j J S2(t) , which is the right-hand side of (3).
198
3. Model of Stock Prices We assume that the market is frictionless and trading of securities can be done at any point of time. There are no taxes, no transaction costs, and no restriction on borrowing or short sales. All securities are perfectly divisible. The instantaneous risk-free interest rate is constant and is denoted by r. For j = 1,2, Sj(t) is the price of a share of stock j at time t. We assume that each stock pays a continuous stream of dividends, at a rate proportional to its price, i.e., for j = 1,2, there is a positive constant £,-, called dividend-yield rate or pay-out rate, such that the amount of dividends paid per share of stock j between time t and t + dt is Sj(t)£jdt. Let Xj(t)=ln[Sj(t)/Sj(0)}
(11)
X(t) = (X1(t),X2(t)).
(12)
and
We assume that {X(t); t > 0} is a two-dimensional Wiener process (Brownian motion) with E[X{t)] = (mt,li2t),
(13)
Va,[XW] = ( ' ? ( t ) / T ' ) •
(14
>
and — 1 < p < 1. To avoid a discussion on changing probability measure, we assume that these parameters are those of the risk-neutral measure. In the absence of arbitrage opportunities, the price of a (random) payment is its expected discounted value. Suppose that all dividends are reinvested in the stock; then each share of stock j at time 0 grows to exp(^i) shares at time t. Since 5,(0) is the price for exp(£j£) shares of stock j at time t, it follows that 5 j (0) = e - r t £ ; [ e ^ t 5 i ( t ) ] >
(15)
or 2 a
H=r-t,i- -f
, J = 1,2.
(16)
We remark that, for j = 1,2, the stochastic process {e--ti)*Sj(t)\ is a martingale.
t > 0}
(17)
199
4. A Pricing Formula We are to price a perpetual option with payoff function (3). Thus its time-0 price is ^ ( S l , S 2 ) = sup£[e- r T IT(T)] ,
(18)
T
where T is any exercise time and Sj = Sj(0), j = 1,2. The stopping time T for which the maximum is attained is called the optimal exercise strategy. Note that {(5i(i),n(t))} t >o is a stationary Markov process. The optimization problem is an optimal stopping problem for a stationary Markov process with a stationary payoff function. The optimal continuation region is C={(s1,s2)\V(s1,s2)>s2}
.
(19)
The optimal exercise strategy is to exercise the option at the first time t when (S\(t), Ti(t)) is not in the region C. Because the stock prices are geometric Brownian motions, we see that V(s\, s2) is a homogeneous function of degree 1. Thus V(Sl,s2)
= s2V(Sl/s2,l).
(20)
If follows that C={(s1,s2)\V(s1/s2,l)>l}
.
(21)
Because V(z, 1) is a nondecreasing function of z, z > 0, we conclude that (22)
C=l(Sl,s2)
for some number ip. Hence the optimal exercise strategy is one of the form Tv = min {t\Si(t) =
(23)
with 0 < 0 < 1. That is, the option is exercised as soon as the ratio 5i(t)/II(t) falls to the level ?. Let V{8Us2;tp)
= E[e-^U(TV)]
,
(24)
denote the value of such an option exercise strategy, with Sj = Sj(Q), j = 1,2. The function V(s\, s2; ip) can be evaluated by means of martingales as in Gerber and Shiu [2; 3; 4]. Now, V(s\,s2;ip) is also a homogeneous function of degree 1, i.e., for each a > 0, V(asi,as2;(p)
= aV(si,S2;ip),
(25)
200
Consider T = min {t\Si(t) =
(26)
for first time when the ratio of prices, Si(t)/S2(t), attains the value
=
rT
+ E[e- V(S2{T),S2(T)\
= S2(T))}
rT
= S [ e - 5 2 ( T ) / ( 5 1 ( T ) = tpS2(T)j\ + Ele-^&mnSriT)
= S2(T))]V (1,1;?)
(27)
by (25) with a = S2(T). Here /(•) denotes the indicator function of an event. With the definitions A(ai, s2; ip) = E [ e - r r 5 2 ( r ) J ( 5 i ( r ) = >52(T))]
(28)
and B(Sl,s2;ip) = E^S^T^S^T)
= S2(T))} ,
(29)
formula (27) becomes V{8i,32;
+ B(8i,82',tp)V(l,l\ip)
.
(30)
To get closed-form expressions for A(si,s2;
(31)
becomes a martingale. Equivalently, we seek 9 so that r e -rt+ex l (t)+(i-e)x 2 (0l
(32)
is a martingale, which means e-rE[eexiW+(i-e)xa(i)]
= 1
_
(33)
Condition (33) gives rise to the quadratic equation 0 = -r + E[9X1(l)
+ (1 - 0)X2(1)] + ^Var[0Xi(l) + (1 - 9)X2(1)] 2
= - r + ^9 + n2(l -9) + ^92
2
+ ^ ( 1 - 9f + pa1(r29(l - 9) , (34)
with n\ and \L2 given by (16). Note that, because of (16), the expression on right-hand side of (34) equals - £ 2 for 9 = 0 and equals - £ i for 9 = 1. Hence, one solution of (34) is negative, and the other is greater than one,
201
say 9i <0,92> 1. With 6 — 9j, the stochastic process (31) is a martingale; if we stop it at time T and apply the optional sampling theorem, we obtain E[e~'TS2{T)[S1{T)/S2(T)]ei]
3* si-"' =
= A(s1,82]
j" = 1,2 .
(35)
These are two linear equations for the functions A and B. Their solutions are Jx
l-ei _ J2
^.« 3 ;
i-e2 a
.
(36)
^ n ^
•
(37)
Substituting (36) and (37) in the right-hand side of (30) yields V(si,s2;tp) (£,01 _ (£02
(38) To determine V(l, 1;
= 0.
(39)
Sl=S2
(An intuitive explanation of this condition is that when s2 is "close" to si, the guarantee or protection will be used instantaneously, and so the value of V is unaffected by marginal changes in s 2 . For a rigorous derivation of a similar condition, see Goldman, Sosin and Gatto [6].) Differentiating (38) with respect to s2 and applying (39), we obtain the equation
o = [(i - ex) - (i - e2)} + [(i - o2)
v(i,i;
Thus, with the definition h(z) = (02 - I)*' 1 + (1 - 6i)z02 ,
z >0,
(40)
we have
Substituting (41) in (38) and simplifying, we obtain
visi s
> ^=nwS2-
(42)
202
Because the option price is V(si, s2) = supV(si, s2;oo for z —> oo and for z - > 0 . Also, h"(z) = (92 - 1)(1 - 0i)( - OxJ*-2 + 92ze>-2) > 0 . Hence the graph of the function h(z), z > 0, is {/-shaped, and the function h{z) has a unique minimum. Let ip be the unique value satisfying h'{ip) = 0 .
(43)
Then
The number (p is between 0 and 1 because both —#i/(l —0i) and (92 — l)/92 are between 0 and 1. The optimal exercise strategy is to exercise the option the first time when Si(t) = ^ll(t), if si >
u
si
W
#2 , (45) if 0 < — < ip \ s2 with h(-) given by (40). In the next section, we shall give an alternative expression for the option. The payoff of the option is path dependent, but with a simple structure: If the option has not been exercised by time t, t > 0, then
/ l (5 1 (i)/n(i)) 1 h($)
-n(f).
(46)
5. An Alternative Pricing Formula We now derive an alternative expression for the option price V(si, s2). The first-order condition (43) is the same as (1 - 9l)62Cpe* = -{92 - 1)9^
,
(47)
203
applying which to (40) yields the formulas h^)=9-^(62-l)
(48)
and
/#) = -\^(l-W 2 -
(49)
It follows from (48) that (fl2 - l)z &1
e2-el\(p)
(50)
"
Similarly, it follows from (49) that (1 - 6»i)z02
- 0 ! (z\<>*
-ar® •
(51
»
Define ,
z
v
92x91 - OixH
g( > =
e2-e\
'
,N
x>0
-
(52)
By (40) the sum of the left-hand sides of (50) and (51) is h(z)/h(
= $(4),
^>0.
(53)
Prom (53) and (45) we obtain the alternative expression for the price of the perpetual dynamic protection option, V(s1,s2)
f 5 K( ^J> 2
= \
^
s2 *•
if 0 < ^ < 1 #2 ^ . if 0 < — < $
(54)
s2
In the special case where S2(t) = m, a positive constant, which can be interpreted as the maximal price of stock 1 in the past, formula (54) was first given by Shepp and Shiryaev [7]. 6. Connection with the Perpetual Maximum Option Price By comparing (2) with (3), we see that the maximum option is cheaper than the dynamic protection option. If both options are perpetual, there is a direct connection between the prices of the two options. See (60) below.
204
Prom Gerber and Shiu [3; 4], a formula for pricing the perpetual maximum option is:
W(Sl,s2)={
s2
if — < b
S29
[il<sS
(t) ^bs '
2
i^^ s
2
S\
2
w
if — > C S2
where g is denned by (52), si = Si(0), s2 = ^2(0),
and Note that
(58)
jj = 0 by (44), and that 0 < 6 < 1< c.
(59)
Observe that the condition
<1
is equivalent to b < cs\/s2 < c by (58). Hence, with this condition, it follows from (54), (58) and (55) that V(si,s2) = s2g(-^-) = s2g(~) = W(csi,s2) , (60) x \ips2J bs2' which gives a simple connection between the two prices. The discussion by Kwok and Chu on [5] provides an elegant explanation for this result in terms of optimal resets. Remark Section 5 of [5] has shown that, for exercise strategy Tv, each time when II(i) falls to the level of S\(t), the replicating portfolio consists of exactly V(l, 1; ip) units(s) of stock 1. Now, V(l,l;$) = V{l,l) = W{&,l) = c.
(61)
by (60) and (55). Thus, for the optimal exercise strategy of the perpetual dynamic protection option, each time when U(t) falls to the level of S\(t), the replicating portfolio consists of exactly c units of stock 1.
205
Acknowledgment Elias Shiu gratefully acknowledges the generous support from the Principal Financial Group Foundation and Robert J. Myers, F.C.A., F.C.A.S., F.S A .
References 1. H. U. Gerber and G. Pafumi. Pricing Dynamic Investment Fund Protection. North American Actuarial Journal 4(2), 28-37 (2000); Discussion 5(1), 153157 (2001). 2. H. U. Gerber and E. S. W. Shiu. Pricing Financial Contracts with Indexed Homogeneous Payoff. Bulletin of the Swiss Association of Actuaries 94 143166 (1994). 3. H. U. Gerber and E. S. W. Shiu. Martingale Approach to Pricing Perpetual American Options on Two Stocks. Mathematical Finance 6 303-322 (1996). 4. H. U. Gerber and E. S. W. Shiu. Pricing Perpetual Options on Two Stocks. Derivatives and Financial Mathematics, edited by John F. Price, Nova Science Publishers, Commack, NY, 91-117 (1997). 5. H. U. Gerber and E. S. W. Shiu. Pricing Perpetual Fund Protection with Withdrawal Option. North American Actuarial Journal 7(2) 60-77; Discussions 77-92 (2003). 6. M. B. Goldman, H. B. Sosin and M. A. Gatto. Path Dependent Options: 'Buy at the Low, Sell at the High'. Journal of Finance 34 1111-1127 (1979). 7. L. Shepp and A. N. Shiryaev. The Russian Option: Reduced Regret. Annals of Applied Probability 3 631-640 (1993).
RUIN PROBABILITY FOR A MODEL U N D E R MARKOVIAN SWITCHING REGIME
H. YANG* Department of Statistics and Actuarial Science The University of Hong Kong Pokfulam Road, Hong Kong E-mail: [email protected]
G.YIN1 Department of Mathematics Wayne State University Detroit, MI 4.8202
This work is concerned with insurance risk models, in which continuous dynamics are intertwined with discrete events. We assume t h a t the system under consideration is allowed to have jumps in the regimes and the jump process is modeled as a continuous-time Markov chain. Under suitable conditions, Lunderberg type upper bounds and non-exponential upper bounds for the ruin probability are obtained. In addition, renewal type system of equations for ruin probability and the exponential claim sizes case are discussed.
1. Introduction In the classical insurance risk model, compound Poisson processes are used to model the surplus process. The premium is assumed to be a constant and the claim is assumed to follow a compound Poisson process where the claim sizes are i.i.d. (independent and identically distributed) random variables and the number of claims is assumed to be a Poisson process. This model captures certain basic properties of the insurance portfolio, but it is far from realistic. There is a huge amount of literature devoted to the generalization of the classical model in different ways. For more detailed discussions on the "The research of this author was supported in part by Research Grants Council of HKSAR (Project no: HKU 7139/01H). t T h e research of this author was supported in part by the National Science Foundation under grants DMS-9877090. 206
207
ruin probability under classical models and various extensions, see Gerber (1979), Grandell (1991, 1997), Klugman et al. (1998), Rolski et al. (1999), Asmussen (2000) and the references therein. Recently, people in finance and actuarial science have also started paying attention to regime switching models. Di Masi et al. (1994) considered the European options under the Black-Scholes formulation of the market in which the underlying economy switches among a finite number of states. Buffington and Elliott (2001) discussed the American options under this set-up. Hardy (2001) used monthly data from the Standard and Poor's 500 and the Toronto Stock Exchange 300 indices to fit a regime-switching lognormal model. The fit of the regime-switching model to the data is compared with other econometric models. In this paper, we consider a Markovian regime switching formulation to model the insurance surplus process. This model can capture the feature of insurance policies may need to change if the economical or political environment changes. The model under consideration is a hybrid system, in which continuous dynamics are intertwined with discrete events. The systems are subject to jumps or switches in regime-episodes across which the behavior of the corresponding dynamic systems are markedly different. To model such a regime change, we use a continuous-time Markov chain. Our interest is to figure out the ruin probability of the underlying systems. This model is a special case of the Markov modulate risk model considered in Asmussen (1989) and Janssen (1980). In Asmussen (1989) and Janssen (1980) (see also Asmussen (2000)), the exponential upper bound was obtained for the ruin probability. Under the proposed regime switching model, we also obtain the Lundberg type bound for the ruin probability. Compare to the work of Asmussen (1989) and Janssen (1980), our method used here is simple and has the advantage that the same method can be used to obtain non-exponential upper bounds. We also obtain the integro-differential equation system satisfied by the ruin probability. The rest of the paper is arranged as follows. Section 2 begins with the precise problem formulation. Section 3 proceeds with the derivation of Lunderberg type inequality. Section 4 generalizes the results to nonexponential upper bounds. Section 5 derives a coupled system of integrodifferential equation satisfied by the ruin probability. Finally, Section 6 deals with exponential claim sizes.
208
2. The Model Let £(£) represent a homogeneous continuous-time markov chain taking values in a finite set M = { 1 , 2 , . . . , m}. We assume that P t = ( P / , . . . , P t m ) with PI = Pr{£ ( = i}, i = l , . . . , m , satisfies the Kolmogorov forward equation, i.e. -£- = ptQ,
0
Po = P
(1)
where P is the initial probability of the process {£t} and Q = (g^) is the stationary transition rate matrix of the process {£*,£ G [0,T]}, with qij denotes the transition probability rate from state i to state j and satisfies «« > 0
(2)
9» = -Qu =
2J
9
(3)
*J
For more discussion on Markov chains, see Chung (1967), Yin and Zhang (1998), and the references therein. In this paper, given an i € M, we assume that the sizes of the claims, at a given state i, Xi(i),X2{i), • • • are i.i.d. and nonnegative. The number of claims up to time t, Nt is a Poisson process satisfying P{Nt = n} = ^fe~xt , n = 0,l,2,... . n! Suppose that Nt and Xk(i) are independent. Denote the distribution function, the mean, and the moment generating function by Fx{i)(x)
= P{X(i)<x}
x>0,
I* = E(X(i)), Mx(r)
= E[erX],
respectively. At each state i, we assume that the premium is payable at rate c(i) continuously. Let U(t, i) be the surplus process given the initial suplus x and initial state i: U(x,i) = x+ / c(£(s))ds - St, Jo where N(t)
St=J2 X{£{Ti)),
209
Ti is the ith claim time. We further assume that the following net profit condition is satisfied. E
[ c(tts))ds
Jo
= (l + 0)E[St],
where 9 > 0 such that min.,{c(j)} > Amaxj{/z.,}. Here 6 is called relative security loading. If U(t, i) < 0, we will say that ruin occurs. Let T = inf{t : U(x,i) < 0} be the time of ruin (T = oo if U(t,i) > 0 for all t > 0)._Let 4>{x, i) = P{T = oo\U(Q, ^(0)) = x, £(0) = i) be the ultimate survival probability, and ip{x,i) = 1 — 4>{x,i) be the ultimate ruin probability. 3. Lundberg Type Inequality Let R{t) be the smallest positive solution of the following equation: /•OO
e-Xt-^o
\E[Mxm)){r)
(4)
Jo Denote R = inftg(0]OO){i?(t)}, which is referred to as the adjustment coefficient. We have the following Lemma. L e m m a 3.1. Equation (4) has a positive solution and /•OO
XE[Mx{m)(R)
e-M-RJic(as))dsdt^l
^
Jo for all t S (0,oo). Remark: When m = 1, equation (4) becomes the classical adjustment coefficient equation under the compound Poisson model. Proof: Let /•OO
/(r, t) = XE [Mxm)
(r) / Jo
e" Xt ~ r K c^s»ds}
dt.
210
It is obvious that /(0, t) = 1, therefore 0 is a solution of (4). Since /•OO
f'(0,t)
= \E[M'XU)(0)
e~Xtdt] Jo /•OO
-XE[Mxm))(Q)
ft
e~xt / c{i{s))dsdt] Jo
/ Jo /•OO
< maxf/ij} — A / e~xttmin{c(j)}dt Jo , , minjcfj)} A for all t G (0, oo), /(r,£) is decreasing at 0. Notice that, a s r ^ oo, /•OO
f(r,t)
> XE[Mx{m)(r) > 1
/ ./o
e^-max^)}*^
^ T T T E ^ K W ) ] -
oo.
A+rmax{c(j)} The lemma is thus proved. The following result is a Lundberg type inequality. Theorem 3.1. For x > 0 iP(x,i)<e~Rx,
(6)
where R is the adjustment coefficient defined above, x is the initial surplus, and for i £ M, £(0) = i is the initial state of the Markov chain.
Proof: Define ipn{x, i) to be the probability of ruin on or before the n claim for n = 0,1,2,... We claim that tpn{x, i) < e~Rx for each nonnegative integer n. We prove this by induction. First,
tpo(x,i) = 0 < e
-Rx
211
Suppose ipn(x,i) < e ipn+i(x,i)
Rx
. We proceed to estimate
ipn+i(x,i).
= P{ruin on or before the (n + l ) t n claim} = P{ruin on the 1 s t claim} + P{ruin not on the 1 s t claim, but on or before the next n t n claim} = E J™ X e" A * 11 - Fxm)) /
(x + j T c(£(s))d.
i>n[x+ poo
<E
I
(
Xe-^i
c(£(s))ds -y)
dFxm)(y)
I •/*+;„««
c(«(s))d.
f
i+J 0 'c(f(s))d.
!
+
\ dt
poo
I
Jo
dFx(m{y)
"I e-R(x+H
°M>))*-v)dFxm(y)
/ { Jx+JZ
c(Us))ds
\ dt
Jo
<E
+
/0
Ae_At
I
= E
e-*(*+Si°M»)*-v)dFxm)(y)\dt
'£ Xe~xt{ f
= E T
e R{x+I c{i(s))ds y)
~
Xe-Xte-R*-R
°
~ ^^m)(y))d{
/o = « W ) * | r /•OO
e~HXE
xm))\
Jo
<e -Rx
Therefore, i)n+1(x,i)
<e~Rx,
and we have ^n{x, i) < e~Rx
for all n = 0 , 1 , 2 , . . .
Thus, we conclude that i/>(x,i) = lim ipn(x) < e~Rx. n—>tx
The expectation above is with respect to £.
eRydFx{m(y)\dt\
212
4. Non-exponential Upper Bounds In this section, we provide non-exponential upper bounds for the ruin probability. Henceforth, we say a distribution B(X) is a new worse than used (NWU) distribution if B(x) is a distribution function of a nonnegative random variable such that for x > 0 and y > 0, B(x) = 1 - B(x), B(x)B(y) 0 and y > 0, B{x)B(y)>B(x
+ y).
Let B{x) be a NBU distribution function. Let R(t) be the smallest positive solution of the following equation: XE[Mx{m)(r)
J™ e'XtB
Q f c(£(s))ds\ dt] = 1
(7)
Denote R = mf{R(t)}. Then R is called the adjustment coefficient again. Similar as before, we have the following lemma. Lemma 4.1. Equation (7) has a positive solution and XE[Mxm)(R)
J™ e~xtB (J
c(^s))ds\
dt] < 1
(8)
for all t 6 (0,oo). The following theorem provides us with non-exponential upper bounds for the ruin probability. Theorem 4.1. Suppose that B(x) is a NBU distribution function, and B(y-x)
y>x
(9)
Then ip{x,i)
(10)
where x is the initial surplus, and £(0) = i, for i G M is the initial state of the Markov chain. Proof: Define ipn(x, i) to be the probability of ruin on or before the n*11 claim, n — 0,1,2,... Similar as before, we use induction to carry out the proof. First, ipQ(x,i) = 0
.
213
Suppose ipn{x,i) < B(x) is true. For the NBU distribution function B(x), we assume that B(0) = 1 and if we extend the domain of B(x) to x < 0, we have B(x) > 1 for all x < 0. Xe-X4
iPn+1(x,i) = E[J~
1 - Fxm)
+ / f
k/o
dFxm)(y) { Jx+ti
c(i(a))da
rx+S^c(i(s))ds _ / I B[x+ /'OO
r
<E
/
~ \
ft I
\ \ c{Z{s))ds-yjdFx{m{y)\dt
/»00
J
Xe Xt
/
l
dFx«(t))(w)
t
J0 [ Jx+J* c«(s))(fa ,*+/„« c(Z(s))ds _ / ft
<
E[J°°
e
\dt
/*O0
Xe-Xt\
/
+
c(£(s))ds^
Vn ( x + j o c(S(s))ds - yj dFx{m(y) /•OO
<£
(x + jf
~i/0 4
A e - A t | J™
B(X
\
)
+ J* c^{ ))ds\ +
= B(x)^[M x ( e ( t ) ) (i?) ^ ° ° Xe~xtB
S
(J
eRydFx{m(y)
•
\ dt]
c(t;(s))ds ) dt
< B(x). From this we conclude that ip(x,i) = lim ipn(x,i) < B{x). 5. A Coupled System of Integro-differential Equations for Ruin Probability In this section, we show that the ruin probability ip(x, j) satisfies a coupled system of integro-differential equations. Consider the ultimate survival probability cf>(x, i) = 1 — tj)(x, i), we have the following system of integro-differential equations for (j>(x,i). T h e o r e m 5.1. 4>(x,i) satisfies c(i)(p'(x,i) + ^2qi:j(t>(x,j)=X(t>{x,i)-X
(p(x-y,i)dFx{i){y), Jo
(11)
214
where x is the initial surplus and £(0) = i. Proof: By the property of the Poisson process, we have 4>(x,i) = (l-XAt)E[4>(x
+j At
fx+{0
*c(£(a))cfa,£(At)j]
c(Z(s))ds
I
,At
x +
+ XAtE[J^
[
\
c
J
s
(£( ))^-y,£(A£)J
^Fx(S(At))(j/)] o(At)
+
Rearranging the terms, we have E[(x + J XAtE
c(Z(s))ds, e(At) J - >{x, a At))} + E[(x, £(A*))] -
4>(x + J
c(£(S))
/•x+/ 0 A t c(i(s))ds
- XAtE [J
I
,At
\
I a; + jf
c(g(s))ds - y, £( At) 1 dF X ( C ( A t ) ) (y)
+ o{At). Therefore, /•At
/ JO
,-At
c(£(s))ds £ [ # z + / Jo
c(e(s))ds, e(At)) - 0 Or, £(At))]
At
+E
Jo
c(S(s))ds
-(j>(x,Z{At))-(t>{x,i) At pAt
= XE[4>(x + J^
c(£{s))ds,ttAt)\]
rx+jf* c(S(s))ds
-XE[j
( t> X +
' \
rAt
J0
\
C s))ds
^
y
At)
~ > ^ J <***<«**» (?/)]
+ o(l). Letting At —> 0, we obtain equation (11). It is not easy to solve the system of equations (11) in general. We now look at the Laplace transform of the survival probability. Define /•OO
0(t, — /
Jo
e~tx<j>(x,i)dx.
215
Then we have the following result: Theorem 5.2. 4>(x,i) satisfies the following linear system [c(i)t - A + A/x W (t)]4>(t, t) + £
Qakt, J) = c(t)0(O. t)
i = 1,2,
...,m.
Proof: Note that /*oo
/ Jo
roo
e-tx'{x, i)dx = I Jo
e-txd(j>(x,i)
= 0(O,i) + t0(M) The theorem can be proved by taking Laplace transform of both sides of equation (11). By virtue of this theorem, if we know the value of <j)(0, i), then we can obtain the Laplace transform of the (j>(x, i) by solving a linear system. We use the method proposed in Dickson and Hipp (1998) to obtain the 0(0, i). Let Qn(t) 921
Qi2
•••
9in
92r,
922 (*) • • •
Mt) 9ml
9m2
...q^m(t)
where
qUt)=c(i)t-\
+ \fx(i)(t)+qil
=
l,...,m,
and let to be the positive solution of |-A0(£)| = 0. Since the Laplace transformation cannot equal to 0, we must have |-Aj(£0)| = 0, i — l , 2 , . . . m , where Ai(t) is the matrix Ao(t) with the ith column replaced by (c(l)<£(0,1), c(2)<j>{0,2),..., c(m)0(O, m))1. Therefore we have obtained a linear system |^(to)|=0
l,...,m.
Prom this linear system, if the above equation has m different positive roots, we can obtain(0, i), i = 1 , . . . , m.
216
6. Exponential Claim Sizes In this section, we assume that the claim size random variables follow exponential distributions. That is, X(i) is exponentially distributed with mean Pi-
Prom equation (11), we have
c(i)(j)'(x, i) + ^21a4>{^, J) i
r i _JL ~ X(j)(x,i) — X I
= X<j>(x, i)
_J£_
/
_£_
e n / Mi
(p(z,i)e"idz.
Jo
Differentiating both sides of the equation above with respect to x, we have c(i)<j>"{x,i) +
Y^QijfiixJ) i A
i
fx
z
= \<j>'(x,i) H — 2 e ~ ^ / Pi Jo = \4>'{x,i) Pi
=
<j>{z,i)e^dz
X
Pi
{x,i) + — Pi \
X<j>(x,i) - c(i)<j)'(x,i) i
-Y]qi:j(j)(x,j) J
(x-^)'(x,i)-±-^qij (x,j). \
Pi J
Mi •
Therefore we obtain the following system of second-order linear ordinary differential equations satisfied by <j)(x,i):
c(i)4>"(x,i) + J2'(x,j) 3
Pi
Pi
7. Concluding Remarks In this paper, we have provided a simple method to deal with the upper bounds of ruin probability under a Markovian regime switching model. Our method enables us to obtain both exponential and non-exponential upper bounds for the ruin probability. We have also derived a system of integro-differential equations satisfied by the ruin probability. The Markovian regime switching insurance risk model is of both theoretical interest
217
and practical importance. This is a relatively new model and there are many research problems in the context of risk theory under this model. We will further investigate this model in future research.
References 1. Asmussen, S. Risk Theory in a Markovian environment. Scand. Act. J., 69100 (1989). 2. Asmussen, S. Ruin Probabilities. World Scientific, Singapore, (2000). 3. Chung, K. L. Markov Chains with Stationary Transition Probabilities, Second Edition, Springer-Verlag, New York, (1967). 4. Di Masi, G. B., Kabanov, Y. M. and Runggaldier, W. J. Mean Variance Hedging of Options on Stocks with Markov Volatility, Theory of Probability and Applications, 39, 173-181 (1994). 5. Dickson, D. C. M. and Hipp, C. Ruin probabilities for Erlang(2) risk process. Insurance: Math, and Econom., 22, 251-262 (1998). 6. Buffington, J. and Elliott, R. J. American Options with Regime Switching. QMF Conference 2001, Sydney, Australia, (2001). 7. ardy, M. R. A Regime-Switching Model of Long-Term Stock Returns. North American Actuarial Journal, 5(2), 41-53 (2001). 8. Gerber, H. U. An Introduction to Mathematical Risk Theory. S.S. Huebner Foundation Monograph Series No.8, R.Irwin, Homewood, IL, (1979). 9. Grandell, J. Aspects of Risk Theory. Springer-Verlag, New York (1991). 10. Grandell, J. Mixed Poisson processes. Chapman and Hall, London, (1997). 11. Janssen, J. (1980). Some transient results on the M/SM/1 Special SemiMarkov Model in Risk and Queneing Theories, ASTIN Bulletin, 11, 41-51 (1980). 12. Klugman, S. A., Panjer, H. H. and Willmot, G. E. Loss Models Prom Data to Decision. Wiley, New York, (1998). 13. Rolski, T., Schmidli, H., Schmidt, V. and Teugels, J. StochasticProcesses for Insurance and Finance. Wiley and Sons, New York, (1999). 14. Yin, G. and Zhang, Q. Continuous-Time Markov Chains and Applications. Applications of Mathematics, Vol. 37, Springer, New York, (1998).
HEAVY-TAILED D I S T R I B U T I O N S A N D T H E I R APPLICATIONS *
CHUN SU and QIHE TANG Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, Anhui, China E-mail: [email protected] Department of Quantitative Economics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands E-mail: [email protected]
Based on some recent investigations by the authors, this paper gives a short review on the concept of heavy-tailedness. Some commonly used and newly introduced classes of heavy-tailed distributions are examined. Applications to precise large deviations and to finite time ruin probabilities are proposed.
1. Heavy-tailed Distributions Let X be a random variable (r.v.) with a distribution function (d.f.) F(x) = V(X < x). Throughout, we denote by F(x) = 1 - F(x) the tail probability of the r.v. X and write by Fe(x) = dx+y1
[*F(z)dz Jo the equilibrium distribution of F provided 0 < /i+ -
/ Jo
(1.1)
~F(z)dz < oo.
Hereafter, we will be using the notation n~p silently. We say that the r.v. X or its d.f. F is heavy tailed to the right if its moment generating function Eexp{tX} = oo
Vt>0;
(1.2)
"This work was supported by the national science foundation of china (no.10371117) and the dutch organization for scientific research (no. nwo 42511013). 218
219
otherwise, we say that it is light tailed to the right. The most important class of heavy-tailed distributions is the subexponential class, denoted by S. This class was first introduced by Chistyakov (1964) in a more general case. Let {Xk, k > 1} be a sequence of independent and identically distributed (i.i.d.) r.v.'s with common d.f. F supported on (0,oo). By definition, the d.f. F belongs to the class S if and only if for each n > 2, the tail probabilities of the sum Sn and the maximum X^n^ of the first n r.v.'s of {Xk, k > 1} are asymptotically of the same order, i.e., P ( 5 n > x ) ~ P ( X ( n ) >x)
asx->oo.
.
(1.3)
In the recent literature, the d.f. F supported on (—oo, oo) is still said to belong to S if the d.f. of the r.v. X+ = .XI(x>o) belongs to S. Some other important classes of heavy-tailed distributions are listed as follows, where the d.f. F involved is always assumed to be supported on (—00,00):
1. the class C (long-tailed): F is said to belong to the class C if x->oo
F(x)
holds for any z > 0 (or equivalently for some z > 0). 2. the class V (of dominated variation): F is said to belong to the class V if hm sup — x—00
< 00
F(x)
holds for any 0 < z < 1 (or equivalently for some 0 < z < 1). 3. the class <S*: F is said to belong to the class S* if 0 < ji J < 00 and
lim r*bAT{z)dz
x
^°° Jo
=
F(x) F(x)
2ti.
4. the class C (of consistent variation): F is said to belong to the class Cif lim lim sup 3 ^ 2 = 1.
(1.4)
m x-^oo F(x) 5. the class ERV (with extended regular variation): F is said to belong to the class ERV(—a, —/?) for some 0 < a < / 3 < o o i f «-/3
x-*oo
F(x)
~
x^oo
F(x)
~
Vw > 1.
(1.5)
220
Especially, if a = fi > 0 then relation (1.5) describes the class Tl-a, which is a well-known subclass of subexponential distributions. The regularity property in (1.4) was first introduced and called "intermediate regular varying" property by Cline (1994). This class has been used in different studies of applied probability such as queueing system and ruin theory; see, for example, Schlegel (1998), Jelenkovic and Lazar (1999), Ng et al. (2004), and references therein. Specifically, the class C covers the class ERV. Most recently Cai and Tang (2004) constructed a simple, but non-trivial, example to illustrate that the inclusion ERV C C is strict; another similar example can be found in Cline and Samorodnitsky (1994). For the classes above, we have, for any 0 < a < 7 < / 3 < o o , H.1cEJ?y(-a,^)cCcDn£c5c£,
S
V£S.
(1.6)
Furthermore, if 0 < yip < oo, then FeX>n£=*F(E<S*C<S.
(1.7)
For more details and related discussions, we refer to Bingham et al. (1987), Embrechts et al. (1997), and Goldie and Kliippelberg (1998), among others. Recently, we have introduced some other classes of heavy-tailed distributions. 6. the class A* (Konstantinides et al. (2002)): F is said to belong to the class A* if 0 < yip < oo, Fe £ S and
liminf .ofg*}, >0.
(1.8)
7. the class M (Tang (2001), Su et al. (2002) and Su and Tang (2003)): F is said to belong to the class M if 0 < jip < oo and
z-»°° J
F(u)au
8. the class M* (Tang (2001), Su et al. (2002) and Su and Tang (2003)): F is said to belong to the class M* if 0 < fx£ < oo and lim sup
xF{x) -
f00
< oo.
221
2. Some asymptotics and bounds 2.1. Applications
of the classes C and S to
asymptotics
Let {Xk,k > 1} be a sequence of independent r.v.'s, each Xk with a d.f. Fk supported on (—oo, oo), k > 1. With XQ — 0, we write n X
(n) ~ 3?$ xk, v '
0
s
n= V ^ , *—' A:=0
S(ln) = max Sk, '
0
n > 1.
The following result shows that, for a large class of heavy-tailed distributions, the tail probability of the maxima of the partial sum is asymptotically equal to the tail probability of the sum of the individual random variables: Theorem 2.1. (Ng et al. (2002)) If Fk G C, k> 1, then for any n G N, F(S{n)>x)~F(Sn>x).
(2.1)
Now we reduce the class C to the class S and obtain Theorem 2.2. (Ng et al. (2002)) Assume that \imx->00F~k'(x)/F~(x) = ck holds for some constants Cfc > 0 and some d.f. F G <S, k > 1. Then for any n G N, we have Hm
x->oo
F(S^>x) F(x)
= Um
*-»<»
P(5„>.) F(x)
=
^ x
->°°
F(Xln)>x) F(x)
=£ck
~
(2.2) Let F be a d.f. with 0 < /xj < oo. Recall its equilibrium distribution function defined by (1.1). Clearly, Fe is absolutely continuous and supported on (0, oo) with a hazard rate function re(x) = -ll,Te{x)
=1J^t,
z>0,
(2.3)
where the latter equality holds almost everywhere with respect to the Lebesgue measure. The following result extends (2.2) to the random case: Theorem 2.3. (Ng and Tang (2004)) Let {Xk,k > 1} be a sequence of i.i.d. r.v. 's with common d.f. F and finite expectation HF- Assume that r is a nonnegative and integer-valued r.v. with finite expectation and independent of {Xk, k > 1}. / / one of the following three conditions holds: (1) F £ <S and for some 6 > 0 we have E exp{5r} < oo (2) F G V n C, u < 0 and P(r > x) = o (r e (x)) (3) F e ERV(~a, -/3), 1 < a < (3 < oo, and uF < 0
222
then P (5 ( T ) > x) ~ P (ST > x) ~ P ( X ( T ) > x) ~ ErF(a;).
(2.4)
Compared with (1.3), the asymptotic relations (2.2) and (2.4) describe some very natural properties of subexponential distributions. Theorem 2.2 partly improves the related result in Sgibnev (1996). The second sufficient condition of Theorem 2.3 weakens the conditions of Theorem 2.3 in Ng et al. (2002). We refer the reader to Kaas and Tang (2003) and Foss and Zachary (2003) for closely related discussions in more general probabilistic models. The asymptotic relation P (ST > x) ~ ETF(X) in (2.4) describes the tail asymptotic property of the compound sum oo
W(ST < x) = J ^ P ( r =
n)F{-n\x),
n=0
where and throughout F^ (x) denotes the n-fold convolution of the d.f. F whereas F^(x) denotes a d.f. degenerate at 0. Related discussions can be found in Chover et al. (1973), Embrechts et al. (1979), Embrechts and Goldie (1982), Cline (1987), among others. 2.2. Sums of random variables
with dominated
variation
A classical inequality in the literature says that if F £ S, then for any e > 0, there exists a constant C€ > 0 such that + e)nF(x)
P {Sn >x)
(2.5)
holds for all n £ N and x > 0. In the original work by Chistyakov (1964) and Athreya and Ney (1972), the d.f. F is assumed to be supported on (0, oo), but the extension of this result to the general case of F supported on (—oo, oo) is immediate. We remark that the bound given in (2.5) is too rough since it grows exponentially fast with respect to n. Now we present some more precise bounds as follows: Theorem 2.4. (Tang and Yan (2002)) Assume that d.f. F £ V has a finite expectation /J,. Then (1) for any 7 > 0, there exists C(j) > 0 such that FM(X)
> C(j)nF(x)
holds for any n > 1 and x > jn;
(2.6)
223
(2) for any 7 > max{/z, 0}, there exists D(j) > 0 such that jM{x)
< D{j)nF(x)
(2.7)
holds for any n > 1 and x > jn. A closely related reference in this field is Cline and Hsing (1991), who considered more general x-regions Tn. We think that the most important z-region for applications is Tn — [yn, 00), 7 > 0. By Theorem 2.4, for any 7 > max {/j,, 0}, we have 0 < liminf inf n->oo x>jn
_ nF(x)
< limsup sup n-*oo x>-yn
_
< 00.
(2.8)
nF{X)
i.e. the relation F(™)(x) x nF{x) holds uniformly for x > 771 as n —> 00. The following corollary is immediate: Corollary 2.1. (Tang and Yan (2002)) Assume that d.f. F e V has a finite expectation fi. Then for any 7 > /x, there exist two positive constants C(7) and D(j) such that C(j)nF(x
- nn) < F~^){x) < D{-y)nF(x - n/i)
(2.9)
holds for any n > 1 and x > jn. 2.3. Applications
of the class S* to local
asymptotics
For notational convenience we introduce Qh(X;x)
= Qh(F;x) = ^ (F(x) - F(x + h)) = ±-F{x,x + h],
h > 0.
This section is devoted to describing the local asymptotic behavior of the sums Sn and ST and the maxima 5(n) and 5( r ). Recent investigations in this field can be found in Kluppelberg (1989), Bertoin and Doney (1994), Sgibnev (1996), Tang (2002), and Asmussen et al. (20026, 2003). Hereafter, the d.f. F is always assumed to satisfy that Qh(F; x) ~ Cf(x)
for some C > 0 and any h > 0,
(2.10)
where the function f(x) : M+ —>R+ is proportional to the tail of some d.f. from the class S*. Without loss of generality, we assume that f{x) is non-increasing in x G (0,00).
224
Asmussen et al. (2002^, Lemma 1) obtained that if i<\ and F 2 are supported on (0, oo) and satisfy condition (2.10) with the coefficients Ci, Ci > 0, then it holds for any h > 0 that Qh{F1*F2;x)~(C1
+ C2)f{x).
(2.11)
This indicates that the operator Qh(-', x) is additive in the asymptotic sense. The extension of (2.11) to the general case where Ft are supported on (—00,00) is immediate. By induction, we can derive from (2.11) that Qh(Sn;x) ~ nQh(X;x), n > 1. On the other hand, based on (2.11), going along the same line as the proof of Theorem in Sgibnev (1996), we can also obtain that Qh (S(ny,x) ~ nQh{X;x), n > 1. Furthermore, the proof for Qh (X( n );x) ~ nQh (X;x), n > 1, is straightforward. Hence, we summarize: Theorem 2.5. (Ng and Tang (2004)) If the d.f. F satisfies (2.10), then it holds for any n 6 N and h > 0 that Qh{S{ny,x) ~ Q h ( 5 n ; x ) ~Qh{X(n);x)
~n®h(X;x).
(2.12)
Now we aim to extend Theorem 2.5 to the randomized case. By copying the proof of Lemma 2 in Asmussen et al. (2002b) with some adjustments, we can obtain that if F, supported on (−∞, ∞), satisfies condition (2.10), then for any ε > 0 there exists some C(ε, h) > 0 such that
$$ Q_h\!\left(F^{(n)}; x\right) \le C(\varepsilon, h)(1 + \varepsilon)^{n} f(x) \qquad (2.13) $$
holds for all n ∈ N and x > 0. Since $Q_h(S_{(n)}; x) \le \sum_{k=1}^{n} Q_h(S_k; x)$, from (2.13) we immediately obtain that for any ε > 0 there exists some D(ε, h) > 0 such that
$$ Q_h\!\left(S_{(n)}; x\right) \le D(\varepsilon, h)(1 + \varepsilon)^{n} f(x) \qquad (2.14) $$
holds for all n ∈ N and all x > 0. Hence, if we assume E exp{δτ} < ∞ for some δ > 0, then by the dominated convergence theorem it follows from (2.12) that Q_h(S_τ; x) ~ Eτ · Q_h(X; x) holds. Similarly, we can obtain Q_h(S_(τ); x) ~ Eτ · Q_h(X; x). Finally, the proof of the relationship Q_h(X_(τ); x) ~ Eτ · Q_h(X; x) is also immediate. Hence, we summarize again:

Theorem 2.6. (Ng and Tang (2004)) If the d.f. F satisfies (2.10) and E exp{δτ} < ∞ for some δ > 0, then it holds for any h > 0 that
$$ Q_h(S_{(\tau)}; x) \sim Q_h(S_\tau; x) \sim Q_h(X_{(\tau)}; x) \sim \mathrm{E}\tau \cdot Q_h(X; x). \qquad (2.15) $$
3. The classes M and M*

3.1. Introduction

In the literature, heavy-tailed distributions are often assumed to be long tailed. The following example, however, indicates that even the union D ∪ L still fails to contain some heavy-tailed distributions of interest:

Example 3.1. (Su and Tang (2003)) Assume that the r.v. X is geometrically distributed with a parameter p ∈ (0, 1). Then for any r > 1, the d.f. F_r, say, of the r.v. X^r is heavy tailed but F_r ∉ D ∪ L.

In our recent investigation of heavy-tailed distributions we introduced two other classes, M and M*; see Tang (2001), Su et al. (2002), Su and Tang (2003). Let F be a d.f. with 0 < μ_F < ∞. Recall the equilibrium hazard rate r_e(x) of F defined by (2.3). We see that the d.f. F belongs to the class M if and only if
$$ \lim_{x\to\infty} r_e(x) = 0, \qquad (3.1) $$
and that the d.f. F belongs to the class M* if and only if
$$ \limsup_{x\to\infty} x\, r_e(x) < \infty. \qquad (3.2) $$
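A small numerical sketch (ours; the parameters p and r are arbitrary) makes Example 3.1 and the defining conditions (3.1) and (3.2) concrete: for a geometric X, the tail of X^r fails both the long-tail and the dominated-variation ratio tests, yet e^{tx} times the tail still diverges for every t > 0.

    import math

    # Example 3.1 illustrated numerically. For X geometric on {1, 2, ...} with
    # P(X = k) = p (1-p)^(k-1), the r.v. X^r has tail
    #   P(X^r > x) = P(X > x^(1/r)) = (1-p)^floor(x^(1/r)).
    p, r = 0.5, 2.0
    Fbar = lambda x: (1.0 - p) ** math.floor(x ** (1.0 / r))

    for k in (10, 30, 100):
        x = k ** r - 0.5                     # just below a jump point of X^r
        print(f"x={x:9.1f}  Fbar(x+1)/Fbar(x)={Fbar(x + 1) / Fbar(x):.3f}  "
              f"Fbar(2x)/Fbar(x)={Fbar(2 * x) / Fbar(x):.2e}  "
              f"exp(0.01x)*Fbar(x)={math.exp(0.01 * x) * Fbar(x):.2e}")
    # The first ratio stays at 1-p, so F_r is not long tailed; the second tends
    # to 0, so F_r is not in D; the last column eventually explodes for any t > 0,
    # so F_r is heavy tailed, and its equilibrium hazard rate tends to 0 (class M).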
It is easy to check that the d.f. F_r in Example 3.1 belongs to the class M and that any d.f. F ∈ D ∪ L with 0 < μ_F < ∞ is an element of the class M. Recently, we extensively investigated the classes M and M* and discovered the following implications and inclusions; see Tang (2001), Su et al. (2002), Su and Tang (2003) and Su and Hu (2003):

(1) F ∈ M ⟺ F_e ∈ L;
(2) F ∈ M* ⟺ F_e ∈ C ⟺ F_e ∈ D;
(3) F ∈ M ∩ M* ⟺ F_e ∈ ERV(−α_F, −β_F), where
$$ 0 < \alpha_F =: \liminf_{x\to\infty} \frac{x\, \overline{F}(x)}{\int_x^{\infty} \overline{F}(t)\,dt} \le \limsup_{x\to\infty} \frac{x\, \overline{F}(x)}{\int_x^{\infty} \overline{F}(t)\,dt} =: \beta_F < \infty; \qquad (3.3) $$
(4) M ∩ M* ⊂ D.

3.2. Tails of distributions from the classes M and M*
A mistake often encountered in the literature is the assertion that relation (1.2) implies
$$ \lim_{x\to\infty} e^{tx}\, \overline{F}(x) = \infty \quad \text{for all } t > 0. \qquad (3.4) $$
In fact, counterexamples to this assertion can easily be provided. For any r.v. X with a d.f. F supported on (−∞, ∞), we define its moment index by
$$ I_F = I(X) = \sup\{ u : \mathrm{E}\,(X^+)^u < \infty \}. \qquad (3.5) $$
This index indeed describes a characteristic of the right tail of the r.v. X. We refer to Daley (2001) and Tang and Tsitsiashvili (2003) for some interesting discussions. Our further investigation discovers the following fact:

Theorem 3.1. (Su and Hu (2003)) If F ∈ M has a moment index I_F > 1, then (3.4) holds. But there exist d.f.'s in the class M with only a finite 1st moment for which (3.4) does not hold.

Hence, distributions with only a finite 1st moment possess some abnormal properties, which may not have been discovered so far. Moreover, we have the following result:

Theorem 3.2. (Su and Hu (2003)) If F ∈ M* has a moment index I_F > 1, then we have
$$ \lim_{x\to\infty} x^{\gamma}\, \overline{F}(x) = \infty \quad \text{for all } \gamma > \frac{I_F\, \beta_F}{I_F - 1}, \qquad (3.6) $$
where the quantity β_F is defined by (3.3). But there exist d.f.'s in the class M* with only a finite 1st moment for which (3.6) does not hold.

The theorem shows that, as an extension of D, the class M* inherits some properties of the class D. For the equilibrium distribution of a d.f. F in the class M*, a more interesting property is discovered:

Theorem 3.3. (Su and Hu (2003)) For F ∈ M*, we have
$$ \lim_{x\to\infty} x^{\gamma}\, \overline{F}_e(x) = \infty \quad \text{for all } \gamma > \beta_F. \qquad (3.7) $$
3.3. Closure of the classes M and M*

The following theorem indicates the closure of the classes M and M* under convolution:

Theorem 3.4. (Su et al. (2002)) Let F = F_1 * F_2. If $\overline{F}_2(x) = O(\overline{F}_1(x))$, then
$$ F \in M^* \Longleftrightarrow F_1 \in M^*. \qquad (3.8) $$
If $\overline{F}_2(x) = o\!\left(\int_x^{\infty} \overline{F}_1(t)\,dt\right)$, then
$$ F_1 \in M \Longleftrightarrow F \in M. \qquad (3.9) $$
By Theorem 3.4 we know that, for any n ∈ N,
$$ F \in M \Longleftrightarrow F^{(n)} \in M, \qquad F \in M^* \Longleftrightarrow F^{(n)} \in M^*. $$
Now we consider the difference
$$ X = Z - Y, \qquad (3.10) $$
where Z and Y are two independent and nonnegative r.v.'s which are not degenerate at 0, and Z is distributed by G. The d.f. F, say, of the difference X is therefore concentrated on (−∞, ∞). The following result describes the intimate relationship between the right tails of the r.v.'s X and Z.

Theorem 3.5. (Su et al. (2002)) Consider the independent difference (3.10). If EZ > EY, then
$$ G \in M^* \Longleftrightarrow F \in M^*, \qquad G \in M \Longleftrightarrow F \in M. $$
4. Precise large deviations

Throughout this section, {X, X_k, k ≥ 1} denotes a sequence of i.i.d. nonnegative r.v.'s with common d.f. F and finite expectation μ. As before, we denote by S_n the nth partial sum, n ≥ 1, of the sequence {X_k, k ≥ 1}. All limit relationships are for n → ∞ or t → ∞ unless stated otherwise. The mainstream of research on precise large deviation probabilities has concentrated on the study of the asymptotics
$$ \mathrm{P}(S_n - n\mu > x) \sim n\, \overline{F}(x) \qquad (4.1) $$
uniformly for some x-region T_n. The uniformity in (4.1) is understood in the following sense:
$$ \lim_{n\to\infty}\, \sup_{x \in T_n} \left| \frac{\mathrm{P}(S_n - n\mu > x)}{n\, \overline{F}(x)} - 1 \right| = 0. $$
Some classical work on precise large deviations can be found in Nagaev (1969a, b) and Heyde (1967a, b, 1968). See also Nagaev (1973, 1979), who studied the asymptotic relation (4.1) for r.v.'s with d.f. F ∈ R_{−α} for some α > 1. A nice review of recent research on precise large deviations is Mikosch and Nagaev (1998). Other reviews can be found in Nagaev (1979) and Rozovski (1993). See also the monographs Vinogradov (1994), Gnedenko and Korolev (1996) and Meerschaert and Scheffler (2001). Recently, precise large deviations for the randomly indexed sum (random sum)
$$ S_{N(t)} = \sum_{k=1}^{N(t)} X_k, \qquad t \ge 0, $$
have been investigated by many researchers, such as Cline and Hsing (1991), Klüppelberg and Mikosch (1997), Mikosch and Nagaev (1998) and Tang et al. (2001), among others. Here the random index N(t) is a nonnegative and integer-valued process, independent of the sequence {X, X_k, k ≥ 1}. We always suppose that the mean function λ(t) = EN(t) → ∞ as t → ∞. Most recently, Ng et al. (2003) established a large deviations result for a so-called prospective-loss process under a very mild condition on the counting process N(t). We cite here the following result, Theorem 3.1 in Klüppelberg and Mikosch (1997); see also Proposition 7.1 in Mikosch and Nagaev (1998) for a further extension:

Proposition 4.1. (Klüppelberg and Mikosch (1997)) Let the common d.f. F ∈ ERV(−α, −β) for some 1 < α ≤ β < ∞. If the process N(t) satisfies
$$ N(t)/\lambda(t) \xrightarrow{\ \mathrm{P}\ } 1 \qquad \text{(Assumption A)} $$
and, for some ε > 0 and any δ > 0,
$$ \sum_{k > (1+\delta)\lambda(t)} (1 + \varepsilon)^{k}\, \mathrm{P}(N(t) > k) = o(1), \qquad \text{(Assumption B)} $$
then for any γ > 0 we have
$$ \mathrm{P}\!\left(S_{N(t)} - \mu\lambda(t) > x\right) \sim \lambda(t)\, \overline{F}(x) \qquad \text{uniformly for } x \ge \gamma\lambda(t). \qquad (4.2) $$
As verified in Klüppelberg and Mikosch (1997), the homogeneous Poisson process satisfies both Assumptions A and B above. Some applications of result (4.2) to insurance and finance can be found in Chapter 8 of Embrechts et al. (1997) and in some of the above-mentioned references. Lately, Su et al. (2001) and Tang et al. (2001) weakened Assumptions A and B into the single condition
$$ \mathrm{E}\!\left[(N(t))^{1+\varepsilon}\, \mathbf{1}_{\{N(t) > (1+\delta)\lambda(t)\}}\right] = O(\lambda(t)) \qquad \text{(Assumption C)} $$
for some ε > 0 and any δ > 0. This is a much weaker condition, which is satisfied, for example, by the general renewal process and the compound renewal process. Recently, we have made some progress on precise large deviations.

Theorem 4.1. (Ng et al. (2004)) If F ∈ C has a finite expectation μ, then for any fixed γ > 0 it holds that
$$ \mathrm{P}(S_n - n\mu > x) \sim n\, \overline{F}(x) \qquad \text{uniformly for } x \ge \gamma n. \qquad (4.3) $$
Using the method of Klüppelberg and Mikosch (1997), we can extend Theorem 4.1 to the random case. To this end, we introduce the following notation. Let X be a r.v. with a d.f. F supported on (−∞, ∞). For any y > 0 we set
$$ F_*(y) = \liminf_{x\to\infty} \frac{\overline{F}(xy)}{\overline{F}(x)}, \qquad F^*(y) = \limsup_{x\to\infty} \frac{\overline{F}(xy)}{\overline{F}(x)}, \qquad (4.4) $$
and then define
$$ J_F^{+} = \inf\left\{ -\frac{\log F_*(y)}{\log y} : y > 1 \right\} = -\lim_{y\to\infty} \frac{\log F_*(y)}{\log y}, \qquad (4.5) $$
$$ J_F^{-} = \sup\left\{ -\frac{\log F^*(y)}{\log y} : y > 1 \right\} = -\lim_{y\to\infty} \frac{\log F^*(y)}{\log y}. \qquad (4.6) $$
In the terminology of Bingham et al. (1987), the quantities J_F^+ and J_F^- are the upper and lower Matuszewska indices of the function f(x) = (\overline{F}(x))^{-1}, x > 0. Without any confusion, we simply call J_F^+ / J_F^- the upper/lower Matuszewska index of the d.f. F. The latter equalities in (4.5) and (4.6) are due to Theorem 2.1.5 in Bingham et al. (1987). Trivially, for any d.f. F it holds that 0 ≤ J_F^- ≤ J_F^+ ≤ ∞. But from inequality (2.1.9) in Theorem 2.1.8 of Bingham et al. (1987) we see that F ∈ D if and only if J_F^+ < ∞. Moreover, if F ∈ R_{−α} for some α > 0, then J_F^- = J_F^+ = α. For more details on the Matuszewska indices, see Chapter 2.1 of Bingham et al. (1987), Cline and Samorodnitsky (1994) and Tang and Tsitsiashvili (2003).

Theorem 4.2. (Ng et al. (2004)) If F ∈ C has a finite expectation μ and {N(t), t ≥ 0} satisfies
$$ \mathrm{E}\!\left[ N^{p}(t)\, \mathbf{1}_{\{N(t) > (1+\delta)\lambda(t)\}} \right] = O(\lambda(t)) \qquad \text{(Assumption D)} $$
for some p > J_F^+ and any δ > 0, then the large deviations result (4.2) holds.
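The Matuszewska indices defined in (4.4)-(4.6) can be read off numerically from tail ratios. The sketch below (our own illustration, with arbitrarily chosen tails and evaluation points) evaluates the quantity -log(Fbar(xy)/Fbar(x))/log y for a regularly varying tail, where it sits at the index alpha, and for a heavy-tailed Weibull tail, where it grows without bound, reflecting J_F^+ = infinity and hence a d.f. outside the class D.

    import math

    # Finite-x proxy for the index in (4.5): -log(Fbar(x*y)/Fbar(x)) / log(y).
    pareto  = lambda x: x ** (-1.5)              # regularly varying, alpha = 1.5
    weibull = lambda x: math.exp(-x ** 0.5)      # heavy-tailed Weibull, shape 0.5

    def index_proxy(tail, x, y):
        return -math.log(tail(x * y) / tail(x)) / math.log(y)

    x = 100.0
    for y in (2.0, 10.0, 100.0):
        print(f"y={y:6.0f}  Pareto: {index_proxy(pareto, x, y):6.3f}   "
              f"Weibull: {index_proxy(weibull, x, y):8.2f}")
    # The Pareto value stays at 1.5 (both Matuszewska indices equal alpha), while
    # the Weibull value keeps growing with x and y, i.e. J_F^+ = infinity.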
Now we propose a new application of Theorem 4.2. In the context of insurance risk theory, the claim sizes X_k, k ≥ 1, are often assumed to be i.i.d. nonnegative r.v.'s with common d.f. F. Their occurrence times σ_k, k ≥ 1, constitute an ordinary renewal counting process
$$ N(t) = \sup\{ n : \sigma_n \le t \}, \qquad t \ge 0, \qquad (4.7) $$
with λ(t) = EN(t) < ∞ for any t ≥ 0. With σ_0 = 0, we can write the interarrival times as Y_k = σ_k − σ_{k−1}, k ≥ 1, which therefore form a sequence of i.i.d. nonnegative r.v.'s. These are standard assumptions in the ordinary renewal model. Now we consider the model of Denuit et al. (2002). In their model, the kth insurance policy, k ≥ 1, is associated with a Bernoulli variable I_k, and the variables I_k have a common expectation 0 < q ≤ 1, where q is the claim occurrence probability of the kth policy, k ≥ 1. Then the total claim amount up to time t is
$$ S_I(t) = \sum_{k=1}^{N(t)} X_k I_k, \qquad t \ge 0. \qquad (4.8) $$
We assume that the three sequences {X_k, k ≥ 1}, {Y_k, k ≥ 1}, and {I_k, k ≥ 1} are mutually independent and that the sequence {I_k, k ≥ 1} is negatively associated. Therefore, S_I(t) is a non-standard random sum unless the parameter q = 1. Generally speaking, a sequence {I_k, k ≥ 1} is said to be negatively associated (NA) if, for any disjoint finite subsets A and B of {1, 2, ...} and any coordinatewise monotonically increasing functions f and g, the inequality
$$ \mathrm{Cov}\!\left[ f(I_i ; i \in A),\; g(I_j ; j \in B) \right] \le 0 $$
holds whenever the moments involved exist. For details about the notion of NA, please refer to Alam and Saxena (1981), Joag-Dev and Proschan (1983), and Shao (2000), among others. We proved the following result:

Theorem 4.3. (Ng et al. (2004)) If F ∈ C has a finite expectation μ, then it holds for any γ > 0 and all x ≥ γλ(t) that
$$ \mathrm{P}\!\left(S_I(t) - \mu q \lambda(t) > x\right) \sim q\, \lambda(t)\, \overline{F}(x). $$
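The following sketch (ours; the Poisson arrivals, Pareto claims and parameter values are arbitrary assumptions) simulates the thinned random sum S_I(t) of (4.8) and compares the centred exceedance probability with q λ(t) Fbar(x), in the spirit of Theorem 4.3. Independent Bernoulli indicators are used, which form a special case of a negatively associated sequence.

    import numpy as np

    # Monte Carlo sketch for Theorem 4.3 (illustrative parameters only).
    rng = np.random.default_rng(2)
    alpha, q, t, gamma = 1.5, 0.5, 100.0, 5.0
    mu = 1.0 / (alpha - 1.0)                 # mean claim size
    lam_t = t                                # EN(t) = t for unit-rate Poisson arrivals
    x = gamma * lam_t
    Fbar = lambda u: (1.0 + u) ** (-alpha)   # Pareto claim tail (class C)

    M, hits = 200_000, 0
    for _ in range(M):
        n = rng.poisson(lam_t)               # number of policies written up to time t
        claims = rng.pareto(alpha, n)
        occurs = rng.random(n) < q           # Bernoulli claim-occurrence indicators
        s = claims[occurs].sum()             # thinned total claim amount S_I(t)
        hits += s - mu * q * lam_t > x
    print(hits / M, q * lam_t * Fbar(x))
    # Both numbers are of the same order; the agreement sharpens as t and x grow.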
5. The finite time ruin probability

The following renewal risk model has been extensively investigated in the literature since it was introduced by Sparre Andersen (1957). As described in Section 4, the costs of claims X_i, i ≥ 1, form a sequence of i.i.d. nonnegative r.v.'s with common d.f. F, and the inter-occurrence times Y_i, i ≥ 1, form another sequence of i.i.d. nonnegative r.v.'s which are not degenerate at 0. We assume that the sequences {X_i, i ≥ 1} and {Y_i, i ≥ 1} are mutually independent. The occurrence times of the successive claims constitute a renewal process defined by (4.7). The surplus process of the insurance company is then expressed as
$$ R(t) = x + ct - \sum_{i=1}^{N(t)} X_i, \qquad t \ge 0, $$
where x ≥ 0 denotes the initial surplus, c > 0 denotes the constant premium rate, and $\sum_{i=1}^{0} X_i = 0$ by convention. If Y_1 is exponentially distributed, hence N(t) is a homogeneous Poisson process, the model above is just the classical risk model, which is also called the compound Poisson risk model. We define, as usual, the time to ruin of this model with initial surplus x ≥ 0 as
$$ \tau(x) = \inf\{ t > 0 : R(t) < 0 \mid R(0) = x \}. $$
Hence, the probability of ruin within finite time t > 0 can be defined as the bivariate function
$$ \psi(x; t) = \mathrm{P}(\tau(x) \le t \mid R(0) = x), \qquad (5.1) $$
and the ultimate ruin probability can be defined as
$$ \psi(x; \infty) = \lim_{t\to\infty} \psi(x; t) = \mathrm{P}(\tau(x) < \infty \mid R(0) = x). $$
In order for ultimate ruin not to be certain, it is natural to assume that the safety loading condition μ = cEY_1 − EX_1 > 0 holds. Under the assumption that the equilibrium d.f. F_e defined by (1.1) is subexponential, Veraverbeke (1977) and Embrechts and Veraverbeke (1982) established a celebrated asymptotic relation for the ultimate ruin probability:
$$ \psi(x; \infty) \sim \frac{1}{\mu} \int_x^{\infty} \overline{F}(u)\,du. \qquad (5.2) $$
As universally admitted, the study of the finite time ruin probability is more practical, but much harder, than that of the infinite time ruin probability. The problem of finding accurate approximations to the finite time ruin probability in the renewal model has a long history, and many methods have been developed over that period. In this section we aim at a simple asymptotic relation for the finite time ruin probability ψ(x; t) as x → ∞, with the special requirement that this asymptotic relation should be uniform for t in a relevant infinite time interval. Under some mild assumptions on the distribution functions of the heavy-tailed claim size X_1 and of the inter-occurrence time Y_1, and applying a recent result by Tang (2004) on the tail probability of maxima over a finite horizon, inspired by Korshunov (2002), we obtained the following result:

Theorem 5.1. (Tang (2003)) Consider the renewal risk model with the safety loading condition. If F ∈ C with upper Matuszewska index J_F^+ and EY_1^p < ∞ for some p > J_F^+ + 1, then for arbitrarily given t_0 > 0 such that λ(t_0) > 0, it holds uniformly for t ∈ [t_0, ∞] that
$$ \psi(x; t) \sim \frac{1}{\mu} \int_x^{x + \mu\lambda(t)} \overline{F}(u)\,du. \qquad (5.3) $$
Moreover, relation (5.3) can be rewritten as
$$ \psi(x; t) \sim \frac{1}{\mu} \int_x^{x + \mu\lambda t} \overline{F}(u)\,du $$
if the horizon t is restricted to a smaller region [t(x), ∞] for an arbitrarily given function t(x) → ∞, where λ = 1/EY_1.

The uniformity of the asymptotic relation (5.3) allows us to vary the horizon t flexibly as the initial surplus x increases. For example, under the conditions of Theorem 5.1, from (5.3) we can obtain that the relation
$$ \psi(x; t(x)) \sim \frac{1}{\mu} \int_x^{x + \mu\lambda(t(x))} \overline{F}(u)\,du \qquad (5.4) $$
holds for any real function t(x) such that t_0 ≤ t(x) ≤ ∞. If we specify the d.f. F ∈ R_{−α} for some α > 1, a more explicit formula can be derived immediately, which extends Corollary 1.6(a) of Asmussen and Klüppelberg (1996) to the renewal model. The uniformity of relation (5.3) also makes it possible to change the horizon t into a random variable, as long as it is independent of the risk system, as done by Asmussen et al. (2002a) and Avram and Usabel (2003). In fact, we have the following consequence of Theorem 5.1:

Corollary 5.1. (Tang (2003)) Let the conditions of Theorem 5.1 be valid and let T be an independent r.v. with a d.f. H satisfying ET < ∞ and P(T ≥ t_0) = 1 for some t_0 > 0 such that λ(t_0) > 0. Then it holds that
$$ \psi(x; T) \sim \mathrm{E}\lambda(T) \cdot \overline{F}(x). \qquad (5.5) $$
It is clear that
$$ \psi(x; T) = \mathrm{P}\left( \max_{1 \le n \le N(T)}\; \sum_{i=1}^{n} (X_i - cY_i) > x \right). \qquad (5.6) $$
A result similar to (5.5) has recently been established by Foss and Zachary (2003). However, we remark that our Corollary 5.1 is not a special case of theirs, since the r.v. N(T) in (5.6) cannot be interpreted as a stopping time in their sense.
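To close this section, here is a simulation sketch of the finite time ruin probability (entirely our own illustration; the exponential interarrival times, Pareto claims, premium rate and horizon are arbitrary assumptions). It estimates ψ(x; t) by simulating the surplus process at claim instants and compares the estimate with the Embrechts-Veraverbeke integral (1/μ) times the integral of Fbar from x to infinity in (5.2), which it approaches as the horizon grows.

    import numpy as np

    # Simulation sketch of psi(x; t) in the renewal risk model (illustrative only).
    rng = np.random.default_rng(3)
    alpha, c, x, t = 1.5, 3.0, 200.0, 1000.0
    mu = c * 1.0 - 1.0 / (alpha - 1.0)       # safety loading c*E[Y] - E[X] = 1 > 0

    def ruined():
        n_max = int(1.3 * t) + 100           # enough exponential interarrivals (E[Y] = 1)
        Y = rng.exponential(1.0, n_max)
        k = int(np.searchsorted(np.cumsum(Y), t))   # claims arriving before time t
        if k == 0:
            return False
        X = rng.pareto(alpha, k)             # Pareto claims, tail (1+u)^(-alpha)
        # ruin occurs iff the aggregate loss exceeds x at some claim instant
        return np.cumsum(X - c * Y[:k]).max() > x

    M = 10_000
    psi_hat = sum(ruined() for _ in range(M)) / M
    psi_inf = (1.0 + x) ** (1.0 - alpha) / ((alpha - 1.0) * mu)   # (1/mu) * int_x^inf Fbar
    print(psi_hat, psi_inf)
    # psi_hat lies below the infinite-horizon value and climbs towards it as t grows,
    # in line with the uniform finite time asymptotics of Theorem 5.1.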
References

1. Alam, K., Saxena, K.M.L., 1981. Positive dependence in multivariate distributions. Comm. Statist. A-Theory Methods 10, no. 12, 1183-1196.
2. Asmussen, S., Avram, F., Usabel, M., 2002a. Erlangian approximations for finite-horizon ruin probabilities. Astin Bull. 32, no. 2, 267-281.
3. Asmussen, S., Foss, S., Korshunov, D.A., 2003. Asymptotics for sums of random variables with local subexponential behaviour. J. Theor. Probab. 16, no. 2, 489-518.
4. Asmussen, S., Kalashnikov, V., Konstantinides, D., Klüppelberg, C., Tsitsiashvili, G., 2002b. A local limit theorem for random walk maxima with heavy tails. Statist. Probab. Lett. 56, no. 4, 399-404.
5. Asmussen, S., Klüppelberg, C., 1996. Large deviations results for subexponential tails, with applications to insurance risk. Stochastic Process. Appl. 64, no. 1, 103-125.
6. Athreya, K.B., Ney, P.E., 1972. Branching Processes. Springer-Verlag, New York-Heidelberg.
7. Avram, F., Usabel, M., 2003. Finite time ruin probabilities with one Laplace inversion. Insurance Math. Econom. 32, no. 3, 371-377.
8. Bertoin, J., Doney, R.A., 1994. On the local behaviour of ladder height distributions. J. Appl. Probab. 31, no. 3, 816-821.
9. Bingham, N.H., Goldie, C.M., Teugels, J.L., 1987. Regular Variation. Cambridge University Press, Cambridge.
10. Cai, J., Tang, Q., 2004. On max-sum-equivalence and convolution closure of heavy-tailed distributions and their applications. J. Appl. Probab. 41, no. 1, to appear.
11. Chistyakov, V.P., 1964. A theorem on the sums of independent positive random variables and its applications to branching random processes. Theory Prob. Appl. 9, 640-648.
12. Chover, J., Ney, P., Wainger, S., 1973. Functions of probability measures. J. Analyse Math. 26, 255-302.
13. Cline, D.B.H., 1987. Convolutions of distributions with exponential and subexponential tails. J. Austral. Math. Soc. Ser. A 43, 347-365.
14. Cline, D.B.H., 1994. Intermediate regular and Π variation. Proc. London Math. Soc. (3) 68, no. 3, 594-616.
15. Cline, D.B.H., Hsing, T., 1991. Large deviation probabilities for sums and maxima of random variables with heavy or subexponential tails. Preprint, Texas A&M University.
16. Cline, D.B.H., Samorodnitsky, G., 1994. Subexponentiality of the product of independent random variables. Stochastic Process. Appl. 49, no. 1, 75-98.
17. Daley, D.J., 2001. The moment index of minima. J. Appl. Probab. 38A, 33-36.
18. Denuit, M., Lefèvre, C., Utev, S., 2002. Measuring the impact of dependence between claims occurrences. Insurance Math. Econom. 30, no. 1, 1-19.
19. Embrechts, P., Goldie, C.M., 1982. On convolution tails. Stoch. Proc. Appl. 13, 263-278.
20. Embrechts, P., Goldie, C.M., Veraverbeke, N., 1979. Subexponentiality and infinite divisibility. Z. Wahrsch. Verw. Gebiete 49, no. 3, 335-347.
21. Embrechts, P., Klüppelberg, C., Mikosch, T., 1997. Modelling Extremal Events for Insurance and Finance. Springer, Berlin.
22. Embrechts, P., Veraverbeke, N., 1982. Estimates for the probability of ruin with special emphasis on the possibility of large claims. Insurance Math. Econom. 1, no. 1, 55-72.
23. Foss, S., Zachary, S., 2003. The maximum on a random time interval of a random walk with long-tailed increments and negative drift. Ann. Appl. Probab. 13, no. 1, 37-53.
24. Gnedenko, B.V., Korolev, V.Y., 1996. Random Summation. Limit Theorems and Applications. CRC Press.
25. Goldie, C.M., Klüppelberg, C., 1998. Subexponential distributions. In: A Practical Guide to Heavy Tails: Statistical Techniques and Applications. Eds. Adler, R., Feldman, R., Taqqu, M.S. Birkhäuser, Boston.
26. Heyde, C.C., 1967a. A contribution to the theory of large deviations for sums of independent random variables. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 7, 303-308.
27. Heyde, C.C., 1967b. On large deviation problems for sums of random variables which are not attracted to the normal law. Ann. Math. Statist. 38, 1575-1578.
28. Heyde, C.C., 1968. On large deviation probabilities in the case of attraction to a nonnormal stable law. Sankhya 30, 253-258.
29. Jelenkovic, P.R., Lazar, A.A., 1999. Asymptotic results for multiplexing subexponential on-off processes. Adv. Appl. Prob. 31, 394-421.
30. Joag-Dev, K., Proschan, F., 1983. Negative association of random variables with applications. Ann. Statist. 11, 286-295.
31. Kaas, R., Tang, Q., 2003. Note on the tail behavior of random walk maxima with heavy tails and negative drift. North Amer. Actuar. J. 7, no. 3, 57-61.
32. Kalashnikov, V., 1997. Geometric Sums: Bounds for Rare Events with Applications. Risk analysis, reliability, queueing. Kluwer Academic Publishers Group, Dordrecht.
33. Klüppelberg, C., 1989. Subexponential distributions and characterization of related classes. Probab. Th. Rel. Fields 82, no. 2, 259-269.
34. Klüppelberg, C., Mikosch, T., 1997. Large deviations of heavy-tailed random sums with applications in insurance and finance. J. Appl. Prob. 34, 293-308.
35. Konstantinides, D., Tang, Q., Tsitsiashvili, G., 2002. Estimates for the ruin probability in the classical risk model with constant interest force in the presence of heavy tails. Insurance Math. Econom. 31, no. 3, 447-460.
36. Korshunov, D.A., 2002. Large-deviation probabilities for maxima of sums of independent random variables with negative mean and subexponential distribution. Theory Prob. Appl. 46, no. 2, 355-366.
37. Meerschaert, M.M., Scheffler, H.P., 2001. Limit Distributions for Sums of Independent Random Vectors. Heavy Tails in Theory and Practice. Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., New York.
38. Mikosch, T., Nagaev, A.V., 1998. Large deviations of heavy-tailed sums with applications in insurance. Extremes 1, no. 1, 81-110.
39. Nagaev, A.V., 1969a. Integral limit theorems for large deviations when Cramér's condition is not fulfilled I, II. Theory Prob. Appl. 14, 51-64, 193-208.
40. Nagaev, A.V., 1969b. Limit theorems for large deviations when Cramér's conditions are violated (in Russian). Fiz-Mat. Nauk. 7, 17-22.
41. Nagaev, S.V., 1973. Large deviations for sums of independent random variables. In: Trans. Sixth Prague Conf. on Information Theory, Random Processes and Statistical Decision Functions. Academia, Prague, 657-674.
42. Nagaev, S.V., 1979. Large deviations of sums of independent random variables. Ann. Prob. 7, 745-789.
43. Ng, K., Tang, Q., 2004. Asymptotic behavior of sums of subexponential random variables on the real line. J. Appl. Prob. 41, no. 1, to appear.
44. Ng, K., Tang, Q., Yang, H., 2002. Maxima of sums of heavy-tailed random variables. Astin Bull. 32, no. 1, 43-55.
45. Ng, K., Tang, Q., Yan, J., Yang, H., 2003. Precise large deviations for the prospective-loss process. J. Appl. Probab. 40, no. 2, 391-400.
46. Ng, K., Tang, Q., Yan, J., Yang, H., 2004. Precise large deviations for sums of random variables with consistently varying tails. J. Appl. Probab. 41, no. 1, to appear.
47. Rozovski, L.V., 1993. Probabilities of large deviations on the whole axis. Theory Probab. Appl. 38, 53-79.
48. Schlegel, S., 1998. Ruin probabilities in perturbed risk models. The interplay between insurance, finance and control (Aarhus, 1997). Insurance Math. Econom. 22, no. 1, 93-104.
49. Sgibnev, M.S., 1996. On the distribution of the maxima of partial sums. Stat. Prob. Letters 28, 235-238.
50. Shao, Q.M., 2000. A comparison theorem on moment inequalities between negatively associated and independent random variables. J. Theoret. Probab. 13, no. 2, 343-356.
51. Sparre Andersen, E., 1957. On the collective theory of risk in case of contagion between the claims. In: Transactions of the XV International Congress of Actuaries.
52. Su, C., Hu, Z., 2003. On two broad classes of heavy-tailed distributions. Submitted to Statistica Sinica.
53. Su, C., Tang, Q., 2003. Characterizations on heavy-tailed distributions by means of hazard rate. Acta Math. Appl. Sinica (English Ser.) 19, no. 1, 135-142.
54. Su, C., Tang, Q., Chen, Y., Liang, H., 2002. Two broad classes of heavy-tailed distributions and their applications to insurance. Obozrenie Prikladnoi i Promyshlennoi Matematiki, to appear.
55. Su, C., Tang, Q., Jiang, T., 2001. A contribution to large deviations for heavy-tailed random sums. Sci. China Ser. A 44, no. 4, 438-444.
56. Tang, Q., 2001. Extremal Values of Risk Processes for Insurance and Finance: with Special Emphasis on the Possibility of Large Claims. Doctoral Thesis, University of Science and Technology of China.
57. Tang, Q., 2002. An asymptotic relationship for ruin probabilities under heavy-tailed claims. Science in China (Series A) 45, no. 5, 632-639.
58. Tang, Q., 2004. Uniform estimates for the tail probability of maxima over finite horizons with subexponential tails. Probab. Engrg. Inform. Sci. 18, no. 1, 71-86.
59. Tang, Q., 2003. Precise large deviations for the finite time ruin probability in the renewal risk model with consistently varying tails. Submitted.
60. Tang, Q., Su, C., Jiang, T., Zhang, J., 2001. Large deviations for heavy-tailed random sums in compound renewal model. Stat. Prob. Letters 52, no. 1, 91-100.
61. Tang, Q., Tsitsiashvili, G., 2003. Precise estimates for the ruin probability in finite horizon in a discrete-time model with heavy-tailed insurance and financial risks. Stochastic Process. Appl. 108, no. 2, 299-325.
62. Tang, Q., Yan, J., 2002. A sharp inequality for the tail probabilities of sums of i.i.d. r.v.'s with dominatedly varying tails. Science in China (Series A) 45, no. 8, 1006-1011.
63. Veraverbeke, N., 1977. Asymptotic behaviour of Wiener-Hopf factors of a random walk. Stochastic Processes Appl. 5, no. 1, 27-37.
64. Vinogradov, V., 1994. Refined Large Deviation Limit Theorems. Longman, Harlow.
THE INSURANCE REGULATORY REGIME IN HONG KONG (WITH AN EMPHASIS ON THE ACTUARIAL ASPECT)

AUGUST CHOW
Office of the Commissioner of Insurance
21/F Queensway Government Offices
66 Queensway, Hong Kong
E-mail: [email protected]
1. The Insurance Regulatory Regime in Hong Kong

In this article, I would like to share with you the insurance regulatory regime in Hong Kong and, in particular, some of the actuarial applications used for monitoring the life insurance industry in Hong Kong today.

2. Outline

This article will cover three areas. First, to give you a better understanding of what we do, I would like to provide an overview of the life insurance market and of our insurance regulatory philosophy in Hong Kong. Then, I would like to cover the Appointed Actuary System that we have adopted in Hong Kong and the roles played by actuaries in the market place. Finally, I would like to provide an example of an application on Reserving for Investment Guarantees with respect to the retirement scheme business sold in Hong Kong.

3. An Overview of the Hong Kong Life Insurance Market

Compared with other territories in the region, Hong Kong is a free and open city with many insurers competing in the market, and it has the highest number of insurers per capita. However, the penetration ratio, which is defined as 100% times the number of policies per capita, is only 70% in Hong Kong, which is relatively
low compared to over 200% in the United States and Japan. Therefore, there is still ample room for life insurance to prosper further in Hong Kong. Under our prudential regulatory regime, insurers in Hong Kong have the freedom to set premium rates and design products.

4. Number of Authorized Life Insurers by Place of Incorporation as of 30 June 2002

At the end of June 2002, there were 48 life insurers and 18 composite insurers authorized to write long term business in and from Hong Kong. About one third of them (21) were incorporated locally in Hong Kong, with the remaining two thirds incorporated overseas (10 in Bermuda, 8 in the UK, 7 in the US and the rest in China, Canada, Switzerland, the Isle of Man, etc.).

Figure 1. Number of authorized life insurers by place of incorporation as at 30 June 2002.
5. View of the Hong Kong Insurance Regulatory Framework

The insurance industry in Hong Kong is regulated by the Insurance Companies Ordinance, which provides a legislative framework for the prudential supervision of the insurance industry in Hong Kong. The objectives of the Insurance Authority are to protect the interests of policyholders and to maintain the general stability of the market place. In addition to being an insurance regulator, the Insurance Authority is a proactive market enabler, welcoming and standing ready to facilitate the
entry of newcomers, as well as a referee of insurers. As mentioned earlier, we adopt a prudential supervisory approach rather than close supervision, with emphasis on self-control mechanisms and appropriate checks and balances.

6. Appointed Actuary System in Hong Kong

At the beginning of 2001, we introduced a fully fledged Appointed Actuary System for the life insurance business in Hong Kong. The system was adopted to strengthen the monitoring of the financial condition of life insurers and to be in line with developments in other developed countries. Under this system, appointed actuaries have many responsibilities beyond the traditional work of product pricing and valuation. Appointed Actuaries now have responsibilities to the company's board of directors and to the regulators. To enable the Appointed Actuary to carry out these responsibilities properly, the actuary is given unrestricted access to all of the company's relevant financial and non-financial data, which must be complete and accurate.

7. Specific Reference to Actuaries in the Insurance Regulation

As mentioned earlier, one of the important duties of the appointed actuary is to monitor the financial condition of the insurance company and to report on it to the board of directors. To fulfill this duty, the Insurance Authority has imposed regulations on the valuation of long term liabilities and the determination of the solvency margin. Moreover, the appointed actuary has a whistle-blowing function under our Insurance Companies Ordinance. If the Appointed Actuary is aware of a situation that could threaten the company's financial condition, the Appointed Actuary has an obligation to report it to the company's management and the Board. If the company does not correct the situation, the Appointed Actuary is obligated to report it directly to the Insurance Authority so that proper action can be taken. In order that the Appointed Actuaries can fulfill their duties, the Insurance Authority has prescribed a professional standard for the Appointed Actuaries to follow. The professional standard was issued by a local actuarial body, the Actuarial Society of Hong Kong (ASHK), on which I will comment further later on. Every year, the appointed actuary is required to issue
an actuarial certification indicating compliance with this professional standard or with other standards acceptable to the Insurance Authority.

8. Roles of Actuaries in the Market Place

Apart from all the regulatory responsibilities, actuaries also play a very important role in the operation of a life insurance company. Their work can include identifying and targeting profitable market segments, designing sensible product features, setting correct premium rates, assessing and controlling risk through reinsurance, and managing assets and liabilities. For participating policyholders, the Appointed Actuary is also required to preserve the Policyholders' Reasonable Expectations (PRE). Policyholders' Reasonable Expectations often have to do with the amount of bonus or dividends expected to be payable to policyholders in the future.

9. Actuarial Professional Body

In order to have a strong Appointed Actuary system, it is very important that the actuarial professional body provides a professional code of conduct and standards of practice to guide its members, so that they can carry out the responsibilities given to them. The Actuarial Society of Hong Kong (the ASHK) has taken up this important role. The ASHK has issued a number of professional standards and guidance notes over the years, such as PS 1, which I have already mentioned, and GN 3, additional guidance for the Appointed Actuary with respect to the determination of reserves and the solvency margin. The ASHK also conducts training and professional development meetings, such as the Appointed Actuaries Symposium held last summer.

10. Actuarial Application: Reserving for Investment Guarantees

Finally, I would like to talk about an actuarial application concerning reserving standards for investment guarantees. In January 2001, the Insurance Authority issued a guidance note, GN 7, which requires life insurers carrying on Class G retirement scheme insurance business with investment guarantees on capital or return to set up an adequate reserve. The Class G business includes the Mandatory Provident Fund (MPF) Scheme, which is a community-wide scheme providing a
compulsory retirement saving vehicle for the entire working population in Hong Kong, except for the very low income group. To better assist its members in complying with GN 7, the ASHK has recently released a draft guideline which provides further guidance on satisfying the requirements of GN 7.

11. Details of the Reserving for Investment Guarantees

The guidance note details the approaches for determining the provision, such as the annual projection of asset and policy liability cash flows based on stochastic scenarios at a 99% level of confidence. If a deterministic scenario or factor approach is used, a stochastic adequacy test must be performed on the total provision at least once a year, using adverse scenarios at a 99% level of confidence. The provisions for investment guarantees must be revised at least quarterly to reflect current information. Provisions at month-ends within a quarter may be extrapolated from the previous quarter-end values. However, where there are significant fluctuations in the underlying factors modeled by the stochastic analysis, the scenario testing should be done on a monthly basis.

12. Stochastic Approach

Under the stochastic approach, the process for completing a full stochastic investigation includes the modeling and projection of investment returns as well as of the in-force business liabilities. In the modeling and projection of investment returns, one needs to determine the relevant economic and market risk factors to be included in the model, the underlying investment policy of the assets and its investment returns, and the stochastic model used to simulate the evolution of these quantities. For the modeling and projection of in-force business liabilities, one needs to extract the relevant in-force policy data and to determine the assumptions used in the projection, such as mortality, lapses, withdrawals, future contributions, expenses, policyholder behavior and management strategies. The investment and business liability models are then run with over 1,000 economic scenarios. The results of the scenarios, each being the present value of claims less the present value of income, are ranked in order of the severity of the required reserve.
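As a purely illustrative sketch of the shape of such a calculation (the fund dynamics, guarantee design, discount rate and all parameter values below are our own simplifying assumptions, far cruder than an actual GN 7 model), one can simulate economic scenarios, value the guarantee shortfall in each, and read the provision off the ranked results at the 99% level of confidence:

    import numpy as np

    # Toy stochastic reserving sketch (not the GN 7 methodology itself): a single
    # cohort invests `premium` today, the insurer guarantees return of capital
    # after `horizon` years, and the fund earns lognormal returns.
    rng = np.random.default_rng(4)
    n_scen, horizon = 5000, 5                # comfortably more than 1,000 scenarios
    premium, guarantee = 100.0, 100.0        # capital guarantee at maturity
    mu_r, sigma_r = 0.04, 0.15               # assumed fund drift and volatility
    disc = 0.03                              # assumed risk-free discount rate

    # terminal fund value under each simulated economic scenario
    log_growth = rng.normal((mu_r - 0.5 * sigma_r ** 2) * horizon,
                            sigma_r * np.sqrt(horizon), n_scen)
    fund = premium * np.exp(log_growth)

    # present value of claims less income here reduces to the guarantee shortfall
    shortfall = np.maximum(guarantee - fund, 0.0) * np.exp(-disc * horizon)

    provision = np.quantile(shortfall, 0.99) # provision at the 99% confidence level
    print(f"mean shortfall PV = {shortfall.mean():.2f}, 99% provision = {provision:.2f}")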
Finally, the total provision for the investment guarantees is established at a level consistent with a 99% level of confidence. A transitional arrangement is provided for certain Class G insurance policies which are not related to MPF schemes. For these non-MPF policies, the level of reserves for guarantees is required to reach the 99% level of confidence by the year 2004. It is also required that the results of the stochastic testing be traceable and reasonably reproducible for audit purposes.

13. Other Actuarial Applications

Apart from this, there are other actuarial applications too. Due to the time constraint, I am not going to cover them in detail with you. I would just mention that the Insurance Authority has requested the ASHK to look into the possibility of implementing risk-based capital and dynamic solvency testing for the purpose of enhancing the financial monitoring of life insurers. Starting from the end of 2001, the Insurance Authority has also required general insurers to engage an actuary to certify the reserves of certain compulsory lines of business.

14. Conclusion

In conclusion, actuaries and actuarial mathematics have played, and will continue to play, a significant role in the regulatory framework of the insurance market. I hope you will agree with me that the demand for actuarial skills is growing in Hong Kong, and particularly in China, with the new opportunities arising from China's entry into the WTO.

References
1. GN 7 Guidance Note on Reserving Standards for Investment Guarantees.
2. PS 1 Professional Standard for Fellow Members of the Actuarial Society of Hong Kong.