Lecture Notes in Mathematics Edited by A. Dold and B. Eckmann
529 Jan Grandell
Doubly Stochastic Poisson Processes
Springer-Verlag Berlin. Heidelberg New York 1976
Author Jan Grandell Department of Mathematics The Royal Institute of Technology S-10044 Stockholm 70
Library of Congress Cataloging in Publication Data
Grandell, Jan, 194~iDoubly stochastic Poisson processes. (Lecture notes in mathematics ; 529) Bibliography: p. Includes index. 1. Poisson processes, Doubly stochastic. 2. Measure theory. 3. Prediction theory. I. Title. II. Series: Lecture notes in mathematics (Berlin) ; 529. QA3.L28 vol. 529 [QA274.42] 510'.8s [519.2'3]
76-20626
A M S Subject Classifications (1970): 60F05, 6 0 G 2 5 , 6 0 G 5 5 , 62M15
ISBN 3-540-0??95-2 ISBN 0 - 3 8 ? - 0 ? ? 9 5 - 2
Springer-Verlag Berlin 9 Heidelberg 9 N e w York Springer-Verlag N e w York 9 Heidelberg 9 Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under w 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher. 9 by Springer-Verlag Berlin. Heidelberg 1976 Printed in Germany. Printing and binding: Beltz Offsetdruck, Hemsbach/Bergstr.
PREFACE
The doubly stochastic Poisson process is a generalization of the ordinary Poisson process in the sense that stochastic variation in the intensity is allowed. Some authors call these processes processes'
'Cox
since they were proposed by Cox (1955). Later on Mecke
(1968) studied doubly stochastic Poisson processes within the framework of the general theory of point processes and random measures.
Point processes have been studied from both a theoretical and a practical point of view. Good expositions of theoretical aspects are given by Daley and Vere-Jones
(1972), Jagers (1974), Kallenberg
(1975:2) and Kerstan~ Matthes and Mecke
(1974). Accounts of more
practical aspects are given by Cox and Lewis (1966) and Snyder (1975).
The exposition in this monograph is based on the general theory of point processes and random measures, but much of it can be read without knowledge of that theory. My objective is to place myself somewhere between the purely theoretical school and the more applied one, since doubly stochastic Poisson processes are of both theoretical and practical interest.
I am quite aware of the risk that some readers
will find this monograph rather shallow while others will find it too abstract. Of course I hope - although perhaps in vain - that a reader who is from the beginning only interested in applications will also find some of the more theoretical parts worth reading. I have, however, tried to make most of the more applied parts understandable without knowledge of the more abstract parts. Also in most of the more theoretical parts I have included examples and numerical illustrations.
JV
All readers are assumed to have a basic knowledge of the theory of probability and stochastic processes. The required knowledge above that basic level varies from section to section. The three appendices, in which I have collected most of the non-standard results needed, may be of some help.
In section 1.2 doubly stochastic Poisson processes are defined in terms of random measures. A reader not interested in the more theoretical aspects may leave that section after a cursory reading.
In sec-
tion 1.3.1 the same definition is given in terms of continuous parameter stochastic processes and finally in section 1.4 in terms of discrete parameter stochastic processes. Sometimes alternative definitions, given in sections 1.3.2 - 1.3.4 are convenient. Generally I have used the definition in section 1.2 in the more theoretical parts. Section 1.5 contains some fundamental theoretical properties of doubly stochastic Poisson processes and requires knowledge of random measures. In section 1.6 mean values, variances and covariances are discussed. Only the first part of it requires some knowledge of random measures.
In section 2 mainly special models are treated. In sections 2.2, 2.3.2 and 2.3.3 some knowledge of renewal theory is helpful.
In section 2.3
and 2.4 the distribution of the waiting time up to an event is considered. Palm probabilities, to which section 2.4 is devoted, belong to the difficult part of point process theory. I have tried to lighten the section by including a heuristic and very non-mathematical introduction to the subject.
Section 3 is purely theoretical and illustrates how doubly stochastic Poisson processes can be used as a tool in proving theorems about random measures.
In section 4 the behaviour of doubly stochastic Poisson processes after long 'time'
is considered.
In section 4.2 knowledge of weak
convergence of probability measures
in metric spaces is helpful.
Some of the required results are summarized in section At.
In section 5 'estimation of random variables'
is considered.
Here
estimation is meant in the sense of prediction and not in the sense of parameter estimation. ful. In section 5.1 tion 5.2 'linear'
Some knowledge of random measures
'non-linear'
is help-
estimation is treated and in sec-
estimation is treated. The main mathematical tools
used are, in section 5.1, the theory of conditional distributions and, in section 5.2, the theory of Hilbert spaces.
In section A2 the
required results of Hilbert spaces are summarized.
In sections 6 and 7 the discrete parameter case is treated. tion 6 'linear estimation of random variables' section 7 estimation of covariances treated.
In sec-
is considered.
In
and of the spectral density is
In both sections methods from the analysis of time series
are used. These sections require no knowledge of random measures depend only on section
1.4 and the last part of section
and
1.6. A rather
complete review of the required theory of time series are given in section A3.
All definitions,
theorems,
lemmata,
corollaries,
examples and remarks
are consecutively numbered within each main section. definition 5 in section
1.2 is referred to as 'definition
whole of section I and as 'definition the'List of definitions,
So, for example, 5' in the
1.5' in the other sections.
...' it is seen that definition
From
1.5 is given
on page 7. The end of each proof, example or remark is signaled by ~
.
VI
There are of course many topics related to doubly stochastic Poisson processes which are not treated in this monograph.
In particular we
shall not consider line processes, i.e. random systems of oriented lines in the plane, or their generalizations to flat (hyperplane) processes. A line process can be viewed as a point process on a cylinder by identifying lines with a pair of parameters which determine the line, e.g. the orientation and the signed distance to the origin. It turns out that 'well-behaved'
stationary line processes correspond
to doubly stochastic Poisson processes. What 'well-behaved'
shall really
mean is as yet not settled. To my knowledge the best results are due to Kallenberg (1976) where results of Davidson, Krickeberg and Papangelou are improved.
There are many persons to whom I am greatly indepted, but the space only allows me to mention a small number of them. In a lecture Harald Cram@r, see Cram@r (1969), gave me the idea of studying doubly stochastic Poisson processes.
In my first works on this subject I
received much help from Jan Gustavsson. Peter Jagers introduced me to the general theory of point processes and random measures.
From
many discussions with him and with Olav Kallenberg and Klaus Matthes I have learnt much about that theory. The extent to which I have benefitted from Mats Rudemo~s advice and comments on early versions of this monograph can hardly be overestimated.
In the preparation of
the final version I was much helped by Bengt yon Bahr, Georg Lindgren and Torbj6rn Thed@en. Finally, I am much indepted to Margit Holmberg for her excellent typing.
Stockholm, March 1976
Jan Grandell
LIST OF DEFINITIONS, THEOREMS, LEMMATA, COROLLARIES, EXAMPLES AND REMARKS number
page
number
page
number
page
5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8
87 88 88 116 116 118 121 142
6.1
162
AI.1 AI.2 AI.3 A1.4
206 206 208 208
16
4 4 5 5 7 11 17 23
1.1 1.2 1.3 1.4 1.5 1.6 1.7
18 19 19 2O 21 25 28
4.1 4.2
69 81
5.1 5.2 5.3 5.4 5.5
89 116 118 123 141
AI.5 AI.6 AI.7 AI.8 AI.9 AI.10
207 209 209 2O9 210 211
2.1 2.2
35 57
A2.1 A2.2
212 214
7.1
196
3.1 3.2 3.3
65 66 68
AI.1 AI.2 AI.3 AI.4
205 206 207 207
A3.1 A3.2 A3.3 A3.4 A3.5
216 217 218 220 224
1.1 1.2 1.3a 1.3b 1.4
5 10 23 24 27
3.1
67
5.1
122
4.1 4.2 4.3
77 78 80
6.1
180
Corollaries
1.1
22
2.1
37
4.1 4.2
72 72
Examples
2.1 2.2 2.3 2.4
47 48 59 60
164 167 170 183 187
83 84
95 97 107 127 128 129 132 140
6.1 6.2 6.3 6.4 6.5
4.1 4.2
5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9
7.1 7.2
193 198
5.1
94
1.1 1.2 1.3
20 24 25
4.2 4.3
78 83
5.5
126
5.1 5.2 5.3 5.4
93 118 120 125
6.1 6.2 6.3 6.4
162 165 166 182
Definitions
I I 2 1 3 14 1 5 1 5' 1 5"
Theorems
L er~mata
Remarks
2.1
55
4.1
74
CONTENTS
I,
Definitions
and basic properties
1.1
A heuristic
introduction
1.2
The general definition
1.3
Doubly stochastic Poisson processes on the real line
9
1.3.1
Recapitulation of the definition
9
1.3.2
An alternative
1.3.3
Classes of doubly stochastic Poisson processes
12
1.3.4
A definition based on interoccurrence
15
1.4
Doubly stochastic Poisson sequences
17
1.5
Some basic properties
18
1.6
Second order properties
22
1.7
A characterization
of ergodicity
27
2.
Some miscellaneous
results
31
2.1
The weighted Poisson process
31
2.2
Doubly stochastic Poisson processes and renewal processes
33
2.3
Some reliability models
4O
2.3.1
An application on precipitation aerosol particle
10
definition
times
scavenging of an 4O
A model with an intensity generated by a renewal process
44
A model with an intensity generated by an alternating renewal process
5O
2.4
Palm probabilities
53
2.4.1
Palm probabilities for doubly stochastic Poisson processes in the general case
53
2.4.2
Some special models
58
2.5
Some random generations
63
2.3.2
2.3.3
Characterization and convergence of non-atomic random measures
65
4.
Limit theorems
68
4.1
0he-dimensional limit theorems
69
4.2
A functional limit theorem
74
5.
Estimation of random variables
86
5.1
Non-linear estimation
87
5.2
Linear estimation
115
5.3
Some empirical comparisons between non-linear and linear estimation
143
Linear estimation of random variables in stationary doubly stochastic Poisson sequences
158
6.1
Finite number of observations
158
6.2
Asymptotic results
161
7.
Estimation of second order properties of stationary doubly stochastic Poisson sequences
190
7.1
Estimation of covariances
192
7.2
Estimation of the spectral density
195
A1
Point processes and random measures
2o5
A2
Hilbert space and random variables
212
A3
Some time series analysis
214
3.
6.
References
226
Index
232
I.
DEFINITIONS
I. I
A heu~tic
AND BASIC PROPERTIES
introduction
We will start the discussion
of doubly stochastic
Poisson procecesses
in a very informal way~ in order not to hide simple ideas behind notations
and terminology.
mathematical points
Consider therefore
model is needed for the description
in some space.
To be concrete,
in time and assume that multiple model describing The simplest
a situation where
a situation
such model,
events
of the location events
do not occur.
except perhaps
for a deterministic
intensity
in each time interval
dent.
Depending
in disjoint
occurring
A mathematical
intervals
is Poisson
distributed
different
with
Further,
are stochastically
of course on the situation
one, is
X. In this model the
mean value equal to X times the length of the interval. number of events
of
of this kind is called a point process.
the Poisson process with constant number of events
we consider
a
the
indepen-
objections
may
be raised against the use of this simple model. We will here discuss some objections
in such a way that we are led to a doubly stochastic
Poisson process.
(i)
Assume that the model seems realistic
know the value of the parameter
~, a rather
except that we do not common situation.
then natural to use some estimate
of ~. There exist, however,
tions where this is not possible.
Consider
insurance business dent pattern
and suppose that
follows
situa-
an automobile
for each policy-holder
the acci-
a Poisson process but that each policy-holder
has his own value of ~. The insurance knowledge
for example
It is
of how ~ varies
company may have a rather good
among its policy-holders.
For a new policy-
h o l d e r it may therefore as a constant
be reasonable
to treat his value
but as a r a n d o m variable.
w e i g h t e d P o i s s o n process is frequently
(ii)
In both the P o i s s o n
In many
variations
~. The number
situations
or other trends.
of events
in a time
perform
Formally this
is not a serious
a transformation
the m o d e l with constant
of the time X (cf Cram$r
this
complication
is more
X(t)
is required.
Thus
Suppose plays
S t a r t i n g with the ~(t) i n s t e a d of the con-
interval
is then P o i s s o n
a model
an important
variation,
complication
over the
since we may
scale which leads us b a c k to (1955, p 19)).
since k n o w l e d g e
for ~(t)
In practice
of the
function
is needed.
role.
variation.
There may of course be different To he concrete
at least partly,
depends
again we assume
on w e a t h e r
and the weather.
In spite of this
cessary to use a stochastic model In such a situation
it is thus
tion of a stochastic process. then led to a doubly
dependenc@
natural to regard
stochastic
that the In m a n y
b e t w e e n the time of
in order to describe
As indicated
reasons
conditions.
of the w o r l d there is a strong dependence
the y e a r
of ~(t)
now that we are in a s i t u a t i o n where the seasonal v a r i a t i o n
for a seasonal
parts
serious
~ was
~ will vary with time
d i s t r i b u t e d with m e a n value equal to the integral interval.
In fact this
and the w e i g h t e d P o i s s o n model
P o i s s o n model we are led to use a function stant
model.
a
used in insurance m a t h e m a t i c s .
a s s u m e d to be constant. due to seasonal
We are then led to use
as our m a t h e m a t i c a l
model
of ~ not
~(t)
it may be nethe weather. as a realiza-
in the p r e f a c e we are
Poisson process.
1.2
The general d e f i n i t i o n
In this section a general process will be given.
definition
of a doubly stochastic
The definition will be based on the theory of
random measures
and point processes.
are e~g. Jagers
(1974) and Kerstan, Matthes
vey is, however,
In section
Sometimes
As in Jagers
in time were considered,
there is a need for more general
state
i.e. R 2, is often
(1974) X will be assumed to be a
compact Hausdorff topological else is stated.
(1974). A sur-
are located will be called the state
In e.g. ecological models the plane,
natural.
for that theory
and Mecke
1.1~ where point processes
X was the real line. spaces.
Good references
given in section At.
The space X where the points space.
Poisson
locally
space with countable basis when nothing
A reader not interested
in topological
think of X as the real line or, perhaps better,
concepts may
as R 2" Often, how-
ever~ we will consider X = R when its natural order of real numbers is convenient
or X = Z where
Z is the set of integers.
Let B(X) be the Borel algebra on X, i.e. the a-algebra open sets. A Borel measure negative measure
that is finite on compact
all Borel measures. space.
(or Radon measure)
on (X,B(X))
sets.
Endowed with the vague topology M is a Polish
advised to turn to the beginning
Borel algebra on M. Let
N@B(M)
concepts
is
of section AI for definitions.
may also be helpful to read section
and B(N)
is a non-
Let M be the set of
(A reader not familiar with these topological
valued elements
generated by
1.3.1
first.)
Denote by
It
B(M)
the
be the set of all integer or infinite
of M. Endowed with the relative
denotes the Borel algebra on N. Usually
will be denoted by ~ and ~ respectively.
topology elements
N is Polish in M and N
Definition
I
A random measure
is a measurable mapping from some probability
(W, W, ~) into (M,
space
B(M)).
Usually a random measure will be denoted by A. The distribution is the probability measure H on (M, B(M)) H(B M) = ~ ( w ~ W
; A ( w ) & B M) for B M E
on (M, B(M)) we may take (W, mapping,
B(M).
For any probability measure H) and A as the identity
i.e. A(~) = ~. Thus any probability measure
talk about a random measure
random measure,
on
(M, B(M))
We may, and shall, thus
A with distribution
ference to an underlying probability
known,
induced by A, i.e.
W, ~) = (M, B(M),
the distribution of some random measure.
of A
H without any re-
space. When we talk about a
it is tacitly understood that its distribution
is
and it is often convenient to use the notation P r ( A E B M)
instead of H(B M) for B M ~ B ( M ) .
Let a random measure
A with distribution
H and a set B ~ B ( X )
be
given. We will talk about the random variable A(B), which is nonnegative
and possibly extended,
see theorem AI.2.
Similarly
for
given B I .... ,Bn~B(X) we talk about the random vector
(A(B 1 ) . . . . .
A[Bn)).
Definition
2
A random measure with distribution H is called a point process if ~(N) = I.
Usually a point process will be denoted by N. We will, whenever convenient
and without comments,
of a point process
assume that all realizations
are in N and interpret its distribution
probability measure on
(N, B(N)).
as a
is
Definition 3
Ar a n d o m ~ measure A is completely random if A{BI},...,A{Bn } are independent random variables whenever B I , . . . , B n ~ B ( x )
In e.g. Kerstan, Matthes
and Mecke
are disjoint.
(1974, p 24) it is shown that for
every ~ E M there exists exactly one p r o b a b i l i t y measure H
(N, B(N))
which is the distribution
on
of a completely random point pro-
cess N with
Pr{N(B} = k} = P{B)k e-P(B}
kl for all k = 0,1,...
and all bounded B ~ ( X ) .
I n this paper a set is
called bounded if it has compact closure.
Definition 4
A point process N with distribution
H
is called a Poisson process
with intensity measure ~.
We note that if N is a Poisson process with intensity measure ~ and if B is an unbounded set in B(X) with ~{B} = ~ then Pr{N{B}
We will now give the general definition Poisson process.
= ~} = I.
of a doubly stochastic
In order to justify the definition the following
lemma is needed.
Lemma
I
For every B e B ( N ) measurable.
the function ~ ~ H {B} from M into E0,1]
is
B(M)-
Proof
This lemma is a consequence
of 1.6.2 in Kerstan,
(1974, pp 64-65).
We will, however,
that the function
U ~ H {BN} is B(M)-measurable
form {v~ N; v{B I} = k I
~"
..,v{B
n
Matthes
give a proof.
and Mecke
We will first show
for sets B N of the
) = k } where BI,...,B n are disjoint n
sets in B(X) and kl,..,,k n are finite nonnegative
integers.
In this
case we have k.
n
~{9.i } ~
i=I
-~{B i } if all ~{B.}
e
<
1
k.~ i
H { B N) = 0
if some u{B.} = 1
and thus H { B N} is a measurable
function
~{B I) ..... ~{Bn }' Since for all B ~ B ( X ) B(M)-measurable
Since H
also the function
is a measure
B(M)-measurable
in the variables
the function
U ~U{B}
is
u ~ H {B N} is B(M)-measurable.
for each ~ & M
the function
~ ~ ~ {B } is
also for sets of the form n
BN = {v~N;
(v(B 1},...,v(Bn))~
E}, E C Z +
where Z+ = 0,1...
To see this we consider n = 2 and E = (kl,k2).
{~6N; =
(~{BI},~{B2})
~J J1+J2=kl
:
{~eN;~{BI~B
and ~.
Then
(kl,k2)} : 2) = J1' ~ { B I N B 2} = J2' ~ { B 2 k B I }
= J3 }
J2+J3=k2 Thus ~ ~ H {B N} is B(M)-measurable closed under intersection and the comments
for a class of B N sets which
and which,
after that theorem,
D = {D ~ B(N); H { D }
as follows generates
is B(~)-measurable}
we have
is
from theorem AI.1
B(N)
. If
(i)
N6N
(ii)
DI,D2s
(iii)
DI,D 2 .... ~ D
H
,
is a measure.
(cf Bauer
DICD2
~
D2~,D16P
and D i N D .
= ~
for i # j ~ L J
D . 6 D since
Thus ~ is a Dynkin system and thus
(1968, pp 17-18)). Thus H { B N} is
~ =
B(N)
B(M)-measurable
v BN6 B(N) .
m
It follows from lemma I that the set function P{BN] = i H~{BN}H{d~} measures
(M,B(M)).
H on
measure on
is well-defined
(N,B(N))
Since H
for all probability
for all ~ 6 M
it follows by monotone
P is a probability measure on
(N,B(N)).
is a probability
convergence that also
We will use the notation
P = S H H{d~} for that probability measure. M Definition 5 A point process N with distribution measure H on
(M,B(M))
S H H{dp} for some probability M is called a doubly stochastic Poisson process.
If A is a random measure with distribution H and N a doubly stochastic Poisson process with distribution P = ~ H H{dp} we sometimes call N the doubly stochastic Poisson process corresponding to A. For any bounded B E B(X) it follows in this case that
Pr{N{B)
) r
= k) = P{~
~{B} k k~
N; ~{B} = k )
= e-~{B}H{d~]
(A(B) k E , k~
=
-A{B) e
) .
M We will often consider N and A defined on the same probability space.
Intuitively we shall then think of a realization
doubly stochastic Poisson process N corresponding
of a
to a random
measure A as generated in the following way. First a realization
of A is generated,
and then a realization of a Poisson process
with intensity measure ~ is generated. reasoning precise we must introduce
In order to make this
some notations.
Let N•
be the
product of N and M~ which endowed with the product topology is Polish, and let B(N)•
be the ~-algebra generated by all rec-
tangles BNXB M. Note (cf e.g. Billingsley B(N)xB(M) Polish.
(1968, p 225)) that
equals the Borel algebra B(N•
on N•
since NxM is
(N,A) is a measurable mapping from some probability
into (NxM,B(N•
space
with a distribution determined by
Pr(NE BN, A ~ B M) = ~
H (BN}H{d~)
M for all B N ~ B ( N ) , H
B M ~ B(M).
In terms of conditional probabilities
is the distribution of N given A = ~. For more details we refer
to section 5.1.
Sometimes it is natural to consider Borel measures
in some sub-
space M o C M
may e.g. be
as the possible
all non-atomic measures, or all a b s o l u t e l y
intensity measures.
i.e. ~ & M
continuous
o
~
measures.
M
o
~{{x}} = 0 for all x ~ X , If
M ~B(M) o
we r e s t r i c t
ourselves to cases where H{M } = 1. If, however, M is not a Borel o o set~
a doubly
stochastic
Poisson
process
may t h e n
be defined
as
definition 5 except that M and B(M) are replaced by Mo and B(M o) where M ~ is endowed with the relative topology and B(M o) is the Borel algebra on M . o
in
1.3
Doubly stochastic Poisson processes on the r e .
line
Recapitulation of the d e f i n i t i o n
1.3.1
In the general definition in section 1.2 point processes were treated as random measures. A realization of a point process was thus regarded as an element v in N. On the real line, which is the traditional state space for point processes, it is sometimes convenient to regard a realization of a point process as a stepfunction v(x). Formally we put I the number of points in (O,x_~ ~(x)
Iminus
if
x > 0
the number of points in (x,O~
In the same way any Borel measure ~
if
x ~ O.
M corresponds to a non-
decreasing rightcontinuous function ~(x) on R such that ~(0) : 0
and
l~(x)I
< ~
for x ~ R . ~{(O,x]}
Formally we have the relation
if
x > 0
~(x) =I
t~{(x,0~}
if x ! 0
Thus the equivalence between the two points of view is not deeper than that a probability law of a random variable may he given either by a probability measure or by a distribution function. Since the 'random measure approach' may seem somewhat abstract, though appealing to intuition, we have a feeling that a short recapitulation of section 1.2 may be advisable.
Let M be the set of functions correponding to Borel measures endowed with the a-algebra B(M) generated by { ~ x,y~R.
B(M).
M; ~(x) ~ y},
It follows from theorem A 1.1 that B(M) corresponds to
Let N be the set of integervalued functions in M. Any pro-
bability measure H on (M,B(M) is the distribution of a stochastic process with its sample functions in M. If H(N) = I the process is called a point process.
For each ~E M a point process with
10
(i)
Pr{N(x) for
(ii)
- N(y)
(k(x)
= k} =
- ~ < y < x < ~
N(x) - N(y)
and
and
k: ~(y))k
-(~(x)-~(y))
k = 0,1 ,2,...
N(t) - N(s)
~
e
are independent
whenever
<~
is called a Poisson ~rocess with leadin~
function
~ and its distribu~
tion is denoted by H . P A doubly stochastic bution
Poisson process
is a point process
j H H{dp} for some probability P
1.3.2
measure
with distri-
H on (M,B(M)).
An alternative definitZon
Consider
for ~ I , P 2 ~ M
Obviously
~1oP2~M.
the function x ~ plO~2(x) Consider
f : MxM § M where
In order to justify the definition following
= p1(~2(x)). f(p1,~2)
= ~io~2 .
to be given we need the
lemma.
Lemma 2 The function
f is B(M)xB(M)-measurable.
Proof In order to prove the lemma, we will use a method Billingsley follows
(1968, p 232).
From the definition
given by e.g.
of B(~)
it
that it is enough to show that
{(p1,~2); ~leU2(x) ~ y } ~ B(M)xB(M) for a l l x,y~R. If for each p~ M we put p(n)(x) i not smaller than ~(x), 2n thus for ~ i ~ 2 ~ tions
it follows that p(n)(x)
M also plO~2(n)fx~ ~ , + pl ~ P2(x),
(pl,P2) ~ p1~p~n)(x)
(pl,P2) ~
equal to the smallest
p1~2(x)
converge pointwise
ratio
+ ~(x) and
i.e. the functo
as n § ~. It is thus enough to show the
11
measurability however,
of (~I'P2)
~Pl ~
(n), ix) for each x @ R. This follows,
since for each x,y@ R
{(PI'P2);Pl~
--< Y} =~((UI'~2);uI(iiEz7 ) -< y' i-I i
~2(x)@
(2-~"-,2--~]}
where each set in the union belongs to 8(M)xB(M).
Consider for any two probability measures H I and H 2 on (M,B(M)) the probability product measure. probability
space
(MxM,B(M)•215
, where
HIxH 2 is the
The function f is a measurable mapping from that
space into (M,B(M)) which implies that given two inde-
pendent stochastic processes A I and A 2 with sample functions in M. Then the process AIoA 2 with sample functions in M is welldefined.
Call a Poisson process with leading function ~(x) = x a Poisson ~rocess with intensity one.
Let A be a stochastic process with sample functions in M and let N be a Poisson process with intensinty one, defined on the same probability space in such a way that they are independent.
Consider the
point process NoA.
Definition 5' A point process N is called a doubly stochastic Poisson process if it has the same distribution
as N~A for some A.
It follows from the definition of a Poisson process that for any DE M the process Nc~ is a Poisson process with leading function p. Thus this definition of a doubly stochastic Poisson process in terms
12
of a random time transformation is equivalent to the ordinary one. For a detailed discussion of the relation between the two kind of definitions we refer to Serfozo (1968:2, pp 307-309).
1.3.3
Class~ of doubly stochastic Poison processes
Consider a Poisson process with leading function ~ lently with intensity measure p ~ M .
M or equiva-
We will in this section consider
the set M C M of Borel measures that are absolutely continuous with O
respect to the Lebesgue measure.
Thus to each U~ M
O
there exists a
function q such that p{B) = f q(x) dx for all B ~ B(R). Such a funcB tion will be called an intensity funct~qn. Let I be the set of intensity functions, i.e. I is the set of nonnegative
(Lebesgue)
measurable functions with finite integral over all bounded sets B E B(R). Denote the mapping I § M 0 by f. From the final remarks in section 1.2 it follows that any probability measure on
(Mo,B(Mo))
defines the distribution of a doubly stochastic Poisson process. In many applications it is, however, more natural to consider a stochastic process A = {Z(x); x E R) as a model for the intensity. We will then need some conditions to ensure that ~ generates a random measure.
These will guarantee that ~ has its sample functions in I
and that a certain measurability condition is fulfilled.
Suppose that the finite-dimensional distributions of ~ are specified. A stochastic process X is a measurable mapping from some probability space (W,W,~) into
(F,B(F)) where
real-valued functions on R and
B(F)
{qEF;
F is the set of all-
is the u-algebra generated by
q(x) ~ y}, for all x , y ~ R. Thus A generates a probability
measure on
(F,B(F)) which
is uniquely determined by the finite-
dimensional distributions. Let Z(w,x) denote the value of ~(w) at
13
the point x. A measurable if B { w ~ W WxB(R)
; 11(w,x)
= 1(w,x)}
the completion
measure, in Doob
mapping
11
: W § F is called a version
= I for all x E R .
of B(R), W and WxB(R)
B and the product
{(w,y)6 WxR
; 1(w,y)
< x}E
ml1<x>I
<
Denote by B(R), W and
with respect
of B and Lebesgue measure
(1953, p 60) we call I measurable ~xB(R)
of I
to Lebesgue
respectively.
As
if
for all x ~ R .
If I is measurable
and if
f
B for some B ~ B ( R )
then
(cf Doob 1953, p 62)
f 11(w,x)]dx B
is W-measurable
and finite a.s.
Assume now that the given finite-dimensional following
distributions
satisfy the
conditions:
(i)
Pr{1(x)
(ii)
lim Pr{11(y) y§
< O} = 0 for all x ( R . - 1(x) I h s} = 0 for all s > 0 and almost all
(Lebesgue measure) (iii)
(F).
xER.
f E 1(x)dx < co for all bounded B @ B(R). B
From (ii) it follows separable
chosen.
(cf Doob
and measurable
Then 1(w) is a.s.
that 1(w) is a.s.
(1953, p 61)) that there exists a
version
of t.
Assume t h a t
(~) B(R)-measurable.
(~) non-negative.
this
From
Thus the set W
version
is
(i) it follows = (wEW;
1(w) is
O
non-negative
and B(R)-measurable}
has ~-measure
one.
From ( i i i )
it
n
follows that the sets W n = ( w E W ~ ; measure
one for all n = I ,2, . . . .
l i m W = W' = { w ~ W ; t ( w ) E I } n
mapping
A : W § M
by 0
f 1(w,x)dx
< ~} also have E-
Since W n + 1 ~ W n and since
also
W' h a s B - m e a s u r e
one.
Define
the
14
if w6 w'
# o ~(w)
A<w) = if w ~ W - W' 0
where ~
for example is the measure with ~ {R) = O. 0
Thus ( w & W
0
; A(w) (B} <_x}(~W for all x ~ R
and all B ~ B ( R )
and thus
a random measure is uniquely determined since, see theorem AI.1,
B(Mo)
is generated by {~ 6 M o ; B{B} ~ x}, x & R , B E B(R).
An approach more in line with our treatment of random measures is to consider some subset
loCI
endowed with the a-algebra B(I o) generated
by ( n ~ I ~ ; ~(y) ~ x}, x , y 6 R , cess {X(x)
; x6R)
and to suppose that a stochastic pro-
with sample functions in I
is specified by its 0
distribution,
i.e. by a probability measure on
(Io,B(Io)). Such
a
probability measure is usually given by a description of the development of the realizations.
Then we have a good apprehension of a
natural set I . It is therefore
convenient to have simple conditions
0
on
Io, without
any reference to a probability measure, under which
the mapping f: Io § Mo, which exists since l o C I ,
is measurable.
We
will not go deeper into this question than to show that the mapping f is
B(I0 )-measurable
for I
equal to a set of Riemann integrable
func-
0
tions in I. This restriction is due to the facts that Riemann integrability is easy to check and that many cases of practical interest are covered.
Let I C I be a set of Riemann integrable o
functions. b
approximation with Riemann sums that ( ~ 6 1 o ;
It follows by
S n(y)dy ~ x}~
B(Io)
a
for all a < b and all x 6 R. From theorem AI.1 it then follows that f is
B(I0 )-measurable.
It is often natural to consider models where all n in I rightcontinuous
since then the instantaneous
o
are also
intensity defined by
15
x+A I lim ~ f 4(y)dy (cf Khintchine(1960, A+0 x 4 ~ I and all x E R.
p 22)) equals n(x) for all
O
1.3.4
A d e f i n i t i o n b ~ e d on i n t e r o c c u r r e n c e times
Up to now, all our discussions
about point processes
based on ~counting p properties, the number of points
have been
i.e. the basic quantities have been
(or events)
in certain sets of the state space.
On the real line a point process may also be defined by considering the epochs of the events or the interoccurrence
times b e t w e e n
successive events together with the epoch of some specified event as basic quantities.
We say that such a definition is based on inter-
occurrence
The renewal processes
times.
constitute
an important
class of point processes where a definition b a s e d on interoccurrenee times is natural.
In our opinion it is, however,
to base definitions these properties
of point processes
are m e a n i n g f u l
in general preferable
on counting properties,
on more general state spaces.
tion 2.2 we will, however, make use of interoccurrence ties for doubly stochastic Poisson processes
since
In sec-
times proper-
and therefore
a short
discussion of these properties will be given.
Let for any ~ E M the inverse ~ -I
(x) = sup
-I
of p be defined by
(y : ~(y)
< x)
If p(y) > x for all y ~ R we put ~ continuous nondecreasing
For any ~ E N
(x) = - ~. Thus p
function from (- ~, ~) into
-I
is a right-
~- ~, ~
we put t k = ~-1(k) and consider the infinite vector
t = (...,t_2,t_1,t0,tl,t2,...). have
-I
According to the properties
... ~ t_2 ~ t_1 ~ 0 < t O ~ t I ~ t 2 ~
of ~ we
... and further
lim t k = • ~. If ~ is considered as a realization k§177
of a point pro-
16
cess, the tk:s are the epoch of events of v provided the possible non-finite tk:s are properly
interpreted.
Let T be the set of all vectors t and let T be endowed with the oalgebra B(T) generated by { t ~ T ; t k ~ x}, k = 0,• As p o i n t e d out by e.g. Daley and Vere-Jones probability measure on (T, B(T)) generates
x6 ~
(1972, pp 308-309)
~, ~ . any
a p r o b a b i l i t y measure on
(N, B(N)), and conversely.
Let ~ N
and B E M be given and consider v = m v o ~. Put t k = v
and t k = v
-I
(k)
(k). Then we have
t k = sup(y : mv(~(y)) < k) < sup(y : ~(y) < m tk ) = B-I ( ~ k ) " On the other hand,
for every ~ > 0 we have
t k = sup(y
: v(~(y)) < k) > sup(y
: ~(y) < ~ t k - s) = ~ -1(t k - s)
-I m -I and thus t k = ~ (t k) p r o v i d e d ~ (x) is continuous
at x = t k.
Let N = N o A be a doubly stochastic Poisson process
as defined in
section
1.3.2.
L e t T and T be t h e random v e c t o r s
d e f i n e d by
%
Tk = N - l ( k ) pendent
and Tk = ~ - l ( k )
respectively.
S i n c e ~ and A a r e i n d e -
it follows that ~ and A -I almost surely have no common points
of discontinuity.
T=
Thus -I m
( .... A
(T_I),
-I m
A
(To),
A-I m
(~rl)...)
a.s.
and thus the two random vectors
are equally distributed.
This rela-
tion may serve as a definition,
based on interoccurrence
times, of
doubly stochastic Poisson processes.
Kingman
has used the above relation as definition by Serfozo
(1972:1, pp 290-291).
(1964),
see section 2.2,
and it has been discussed
17
1.4
Doubly stochastic Poisson sequences
Consider now the case X = Z, i.e. when the state space is the integers. A Borel measure on Z is a measure
assigning nonnegative
finite
mass to each integer and is completely determined b y these masses. Thus we may identify Borel measures
on Z and sequences of nonnegative
finite numbers.
By a point process or point sequence N with state space Z we m e a n a sequence of random variables Z+ = {0,1,2,...}. = {Uk ; k ~ Z }
A Poisson sequence with intensity measure
is then a sequence of independent
random variables all n ~ Z + .
{N k ; k @ Z} taking values in
such that
Poisson distributed
(~k)n -~k
Pr{N k = n} =
nl
e
for all k 6 Z and
By a random measure s with state space Z we mean a sequence
of random variables
{Zk ; k ~ Z }
taking values in R+.
The following definition is equivalent with definition 5.
Definition
5"
A point sequence N is called a doubly stochastic Poisson sequence if, for some random measure Z,
nk. m
Pr {n
m
{~k.
j=l
(Lk.)
= nk ) } = E { ~ j
j
j=1
J
'~
-~k e
J}
nk. J
for any positive integer m, any integers k I < k 2 < ... < k m and any nonnegative
integers
Parts of this paper Poisson sequences. applying methods
nkl,...,nkm. are devoted to the study of doubly stochastic
The main reason is that we are interested in
of time series analysis.
that in many cases observations
We will, however, point out
of a point process
are for measure-
ment reasons given in this form. There also exist cases where there is impossible to observe the exact ~time ~ of a point.
In e.g. sickness
18
statistics the number of people reported sick each day can be observed, but the exact time of the start of a disease is impossible to observe and even perhaps to define.
1.5
Some basic properties
We recall from section 1.2 that to each probability measure H on
(M, B(M))
the probability measure / H H(d~), which in this section M is denoted by PH' on (N, B(N)) is the distribution of a doubly stochastic Poisson process. In terms of Laplace transforms
(see defi-
nition A 1.2) we have the relation LpH(f) = LH(I - e -f) (cf B a r t l e t t % contribution to the discussion of Cox (1955, p 159) and Mecke (1968, P 75)). From this relation some theorems, most of them due to Krickeberg (1972) (cf also Kummer and Matthes (1970) and Kerstan, Matthes and Mecke (1974, pp 311-320)), follow as simple consequences.
Theorem
1
PHI = PH2
if and only if
H I = H 2.
Proof If H I = H 2 then PH
= PH2 follows from the definition. The converse I is proved by Krickeberg (1972, p 163) and will be reproduced here.
Assume that PHI = PH2 , which implies LPH I (f) = LPH2(f) and thus
LHI(I - e -f) = LH2(I - e -f) for all f~CK+.
Thus LH](g) = LH2(g) for
for all gE OK+ with sup g ~ I since to each such g there exists a f~ CK+ such that g = (I - e-f). To see this we just have to observe that f = - log(1 - g)~ CK+ for all g of the above kind. Consider now an arbitrary f~CK+.
Then LH1(sf) = LH2(sf) for all non-negative
s ~ (sup f)-1. Since f E C K +
it follows that sup f ~ ~ and thus
19
(sup f)-1 > 0. Since L(sf), as a function of s, is the Laplace transform of the random variable ; f(x)A(dx} where A is a random measure X with distribution H, it follows that L(sf) is determined by its values on ~O,a) for any a > 0. Thus L H (f) = LH2(f) for all f ~ C K + I and thus (see theorem A 1.3) H I = H 2.
Krickeberg (1972, p 165) notes that PHI~H 2 = PHIXPH2 for any H I and H 2 where ~ means convolution as defined in section A I.
Now we give a similar theorem about weak convergence, a concept which is discussed in section A I.
Theorem 2 Hn
_~w PH "
W,H if and only if PH n
Proof If Hn
W~ ~
then L H (f)--~ L~(f) and thus LPH (f)--+ Lp (f) which n H n implies (see theorem A 1.6) PH w~ PI[ " n If PH
n
w PH t h e n LH ( g ) - ~ LH(g) f o r a l l n
g~CK+ w i t h sup g < 1
and thus for an arbitrary f ~ CK4 it follows that L H (sf)-~ LH(sf) n for all n o n n e g a t i v e s < (sup f ) - I and t h u s LE (f)---* LH(f) (compare
n the proof of theorem I and the continuity theorem for Laplace transforms of random v a r i a b l e s )
Let
N o E B(N)
which i m p l i e s t h a t H ~
n
be the set of simple elements in N and let M
of n o n - a t o m i c e l e m e n t s i n M ( s e e d e f i n i t i o n
A 1.1).
theorem is due to Krickeberg (1972, p 164).
Theorem 3
Mo~ B(M)
~ .
and PH{No} = I if and only if H{M o} = I.
9 o
be the set
The f o l l o w i n g
20
Proof It is known that H {N ] = I if and only if bE M (cf e.g. Kerstan, o o Matthes and Mecke
(1974, p 31)) i.e. M
it follows from lemma I that PH{N~
= I H~{N~
o
= {~6M;
H IN } = I}. Thus ~ o
M ~ B(M) since N 6 B(N) and further o
o
= I if and only if ~ ( M ~ a.s.
(H).
9
Consider X = R and a random measure A with distribution H on
(M,B(M)). A (or H) is called strictly stationary if n Pr { ~ 2=1 n
=
{A{B. + y} < x . } } 1 -- 1 Bi ~ B ( R )
1,2,...,
is
independent
of y for
all
y ~ R,
a n d x i E R+ . (B + y = {x ; x - Y E B } ) .
Remark I This definition has an obvious extension to X = R k and may be further extended
(cf e.g. Mecke
(1967)) so that e.g. X = Z is
included.
We will sometimes consider strict stationarity when n X = R+. Then we mean that Pr {~] {A{B i + y] ~ xi}} is indepeni=I dent of y for all Y E R+, n = 1,2, .... Bi6 B(R+).
Theorem 4 PH is strictly stationary if and only if H is strictly stationary.
Proof It follows from theorems A 1.3 and A 1.4 that a random measure A is strictly stationary if and only if the distribution of f f(x - y)A{dx] is independent
of y for all f ~ C K + .
R
Define Ty : CK+-'* CK+ by T y f ( X )
= f(x
- y).
A is
stationary if and only if LH(Tyf) is independent f 6 CK+. S i n c e theorem
I.
Ty(1 - e - f )
= 1 - e-Ty f the
theorem
thus
strictly
of y for all
follows
from
21
Now we leave the stationary case and consequently X need not be the real line. L e t ~ d e n o t e The sets P g ~
the set of probability measures on
of all probability measures on
(N,B(N))
(M,B(M)).
and D ~ P
of all
distributions of doubly stochastic Poisson processes are of special interest to us.
Let D
: P § P for p ~ [~,I] denote the p-thinning operator, i.e. for P
any point process with distribution P6P the distribution of the point process obtained by independent selection of points with probability p is D P. The operator D is one to one (cf Kerstan, Matthes and Mecke P P (1974, p 311)). Mecke (1968) and (1972) has shown that D =
~ D P 0
for all
for c > 0 denote the c-amplifying operator,
C
i.e. for any random measure A with distribution H the distribution of cA is A H. It is not difficult to realize that (cf Kerstan, Matthes C
and Mecke (1974, p 312))
PAH
Dc PH
if
0 < c < I
D~ ] PH
if
c ~ ]
=
C
C
and thus D C
N D P is rather obvious. 0
The following theorem is due to Kallenberg (1975:1). Theorem 5 Let p],p2,...~ (0,I~ such that lim Pn = 0 and P]'P2 .... ~ P be given. n-~ Then Dpn Pn w some P0~P if and only if Apn Pn w some H0~ ~ and in this case P0 = PH 0'
Proof The proof is given by Kallenberg (1975:1) and will not be reproduced here.
9
22
Theorem 5 generalizes earlier limit results for p-thinnings. As noted by Kallenberg also Mecke~s characterization of doubly stochastic Poisson processes is a simple consequence. To see that, let P0 ~ P n
~ D P be given. Let pl,P2.., be as in theorem 5 and put O
= D-I P0" Since D P = P0 there exists H 0 6 ~ s u c h Pn Pn n
that
H 0 and thus P0 = PHO i.e. PO ~ D. Further Apn p n w
An P n ~
is the same as Ap D~IpH n n 0
w
H0
HO' which is a result due to Kerstan,
Matthes and Mecke (1974, p 315), from which theorem I follows. Another consequence of theorem 5 is the following corollary which will be used in the proof of lemma 3.1. Corollary I If PH
w
some H ~
then H = PH0 for some H 0 ~ .
n
Proof Since N is closed in M it follows from Billingsley (1968, p 12) that -I = D PH " Since p n n w+ H E p there exists H 0 ~ ~ such that H = PH0.
HE P. Using the notations in theorem 5 we put P DP nPn = PHn
n
From theorem 2 and corollary I it follows that if PH ~
some H ~
n
then H n
w
some H O ~
and H = pH 0 w h i c h i s
a result
due t o K e r s t a n ,
Matthes and Mecke ( 1 9 7 4 , p 3 1 7 ) .
1.6 Second orde~ properties Consider a random measure A with distribution H on
(M,B(M)) and
assume that E A2(B} < ~ for all bounded B E B(X). Remember that a set is called bounded if it has compact closure. Let N, defined on the same probability space as A, be the corresponding doubly stochastic Poisson process as defined in section 1.2.
23
Definition 6 For bounded B , B I , B 2 ~ B ( X ) the set function M given by
M{B} = E A{B) = I ~{B} ~{d~} is called the expectation or mean and the set function R given by R(BI,B 2} = Cov(A{BI},A{B2}) is called the covarianee.
It follows by monotone convergence that M 6 M .
R(B,'} is for fixed
bounded B E B(X) a signed measure, i.e. the difference of two Borel measures, on (X,B(X)), and further R{BI,B 2} may be extended to a signed measure on (XxX~B(XxX)) (cf Daley and Vere-Jones (1972, p 319)).
For a Poisson process the mean equals the intensity measure.
Lemma 3a For bounded B,BI,B 2 6 B ( X ) we have E N{B} = M{B} Var N{B} = H{B} + Var A{B} Coy (N{BI},N{B2}) = M { B I ~ B 2 }
+ R{BI,B2}.
Proof
N{B} = ~[~(N{B}IA{B} ~
= ~ A{B} = M{B}.
Var N{B} = E ~Var(N{B}IA{B} ~
+ Var ~-_E(N{B}IA{B}~
=
= E A{B} + Var A{B}. For random variables YI,Y2 and Z the relation 2 Cov (YI,Y2) = = Var (YI + Y2 + Z) - Var (YI + Z) - Var (Y2 + Z) + Var (Z) holds (cf Daley and Vere-Jones (1972, p 320)). Applying this relation to YI = A{BI}' Y2 = A{B2}~ Z = - A { B I ~ B 2} and to YI = N{BI}' Y2 = N{B2}' Z = - N{BIg~B 2} respectively, the result for the covariance follows.
24
It shall be noted that the results in Lemma 3a was already given by Cox (1955, p 135), where it was pointed out that Var N{B} ~ M{B) i.e. the doubly stochastic Poisson process is over-dispersed
relative
to the Poisson process.
Lemma 3b For bounded B(X)-measurable
functions
f and g with compact support
we have
E
f
f(x)~{dx} =
X coy
f
f(x)max}
X
(f f(x)N{~x}, f g(x)N{~}) X
= f f(x)g(x)m~x}
X
+
X
+ f f f(x)g(y)R{~,dy}. XX Proof By approximating
f and g with simple functions the results follows
from lemma 3a.
Remark 2 The lemmata can be extended to higher order moments. Assume that E Ak{B} < ~ for all bounded B ~ B ( X ) .
Then Krickeberg
has shown that for bounded B(X)-measurable
(1972, p 164)
functions fl,...,fk
with compact support it holds that k
k
E (j=1X
where {J1,...,Jm}
m=1
m {J1 .... Jm )
runs through all partitions
i=I X
J*Ji
of {1,...,k} into m
disjoint non-empty sets.
The following theorem will be of some importance.
25
Theorem 6 For all bounded B I , B 2 @ B ( X )
the random variables N{B I} - A{B I}
and A{B 2} are uncorrelated. There exist, however, BI,B 2 ~ B ( X ) such that N{B I} - A{B I} and A{B 2} are dependent unless N is a Poisson process. Further N{B I} - A{B 1} and N{B 2} - A{B 2} are uncorrelated for all bounded and disjoint BI,B 2 ~ B ( X ) .
Proof For all bounded B I and B 2 we have Coy (N{B I} - A{BI},A{B2}) = EL(N{B ~ I} - A{BI})A{B2~
=
= EFA{B2)E(N{B I} - A{B])IA{B]~A{B2}) ] = 0 i.e. the variables are uncorrelated. In a similar way it is shown that N{B]} - A{B]} and N{B 2} - A{B 2} are uncorrelated for disjoint B] and B 2. Assume that N is a Poisson process. Then A{B} = M{B} for all B~8(X)
and thus N{B]} - A{B]} and A{B 2} are independent. Assume
that N{B} - A{B} and A{B} are independent for all bounded B ~ B ( X ) . Since E[(N{B} - A{B})2[A{B}] = A{B} almost surely it follows from the independence assumption that A{B} almost surely equals some constant, and thus it follows from theorem A 1.4 that N is a Poisson process.
Remark 3 In the proof of independence part of theorem 6, the assumption about existing second moment was not used.
9
Consider now the case X = R @ . With the notations used in section 1.3 we define M(x) = E A(x) and R(x,y) = C o v
(A(x),A(y)). From lemma 3a
we get immediately E N(x) = M(x) and Cov (N(x),N(y)) = M(min(x,y)) + + R(x,y).
26
Consider now X = R and assume that A{B} = S ~(x)dx for all B @ $ ( R ) B as in section 1.3.3 and assume that E ~2(x) < ~ for all x 6 R . Define m(x) = E l(x) and r(x,y)
=Cov
(l(x),l(y)).
Thus we have
M{B} = S m(x)dx and R{B I,B2} = S S r(x,y)dxdy B B I B2
for all bounded
B,B] ,B26 B(I~).
For a doubly stochastic the notations
Poisson
sequence
m k = E Zk and rk, j = C o v
(see section
Zk,Zj.
1.4) we use
Thus we have
E N k = m k and Coy Nk,N j = 6k_jm k + rk,j. where
~k =
1
if
k = 0
0
if
k#0
Consider X = R and a random measure stationary
A. A is called
(weakly)
if M{B + y} and R{B I + y,B 2 + y) are independent
for all y ~ R
and all bounded B,B I , B 2 ~ B ( R ) .
corresponding
doubly
stochastic
and only if A is stationary. for doubly stochastic called stationary corresponding
Poisson
It is obvious that the
Poisson process N is stationary
if
We will make most use of stationarity sequences.
A random measure
if m k = m and rk, j = r k-j. for all k , j ~ Z .
doubly
of y
stochastic
Poisson
sequence
s is For the
N we thus have
N E N k = m and Cov (Nk,N j) = r . = m~k_ j + r .. Since both Z and N k-j k-~ are stationary
we have
(see section A3)
rk =
i eikx Fs
r kN =
i eikx FN{dx}
and thus
FN(x) = m(x + w) + FZ(x) 2w
27
The functions FZ(x) and FN(x) are called spectral distribution functions. An important special case is when Fs
is absolutely con-
tinuous with spectral densit~ f~(x). Then also FN(x) is absolutely continuous with spectral density fN(x) = ~ fN(x) > ~ --
1.7
> 0 for all x ~ E - w,wJ
m
+ fZ(x). Thus
a fact that will be useful.
2w
A ch~acteriz~gion of ergodi~ity
The result to be given in this section will not be used in the sequel but the topic has some relevance to the questions treated in section 4.
Consider X = R and define for all Y 6 R the shift operator T : M + M Y by (Ty~){A} = ~{A + y} for all A ~ B ( R ) A + y = {x~R
; x - y ~A}.
and recall that
For any B ~ B(M) we put
TyB = {~6 M ; T_y~ 6 B}. For general properties of the shift operator we refer to Kerstan, Matthes and Mecke (1974, pp 133-140). A set
BEB(M) is called invariant if T B = B for all y ~ R. Let A be a Y
strictly stationary random measure with distribution H. Then H{TyB} is independent of y for all y ~ R and B ~ B ( M ) .
H (or A) is called
er~odic if H{B} = 0 or I for all invariant B 6 B ( M ) .
Let A(M) be the algebra
[~ B (M) where B (M) is the G-algebra gene, n= I n n rated by {~& M ; ~ { A N ~- n,n~} ~ x} for all A 6 B(R) and x ~ R + . From
theorem A 1.1 it follows that A(M) generates B(M).
The following lemma contains all ergodic theory to be used in this section.
Ler~ma 4
Let H be the distribution of a strictly stationary random measure. The following statements are equivalent:
28
(i) (ii)
is ergodic. For all B(M)-measurable I h(~)H{d~}
lim-~ t -~ (iii)
functions h : M § R+ with
< ~ we have
t f h(Ty~)dy = -t
For all B 1,B2ff
A(M)
h(~)~{d~}
a.s. (9).
we h a v e
lim ~-~ 9{B Ig]TyB2}dy = H{B I}9{B2}. t-~ -t
(iv)
Any representation H = ~91 + (I - ~)92
(0 < ~ < I)
of 9 as a mixture of stationary distributions
H I and 92
is a trivial, i.e. ~ = 0 or a = I or H 1 = H 2.
Proof (i) @
(ii) ~:~ (iv) follows from Kerstan, Matthes and Mecke
(1974, p 141).
(i) <:~ (iii) follows e.g. from Billingsley
(1965,
p 17).
Let, like in section 1.5, P9 denote the distribution of the doubly stochastic Poisson process corresponding to a random measure with distribution H.
Theorem 7 Let H be the distribution of a strictly stationary random measure. Then PH is ergodic if and only if 9 is ergodic.
Before proving theorem 7 we note that the result is due to Westcott (1972, p 463). He derives the theorem as a corollary of his characterization of ergodicity in terms of probability generating functionals. We will, however, give a rather simple direct proof.
29
Proof Assume that H is not ergodic. It follows from lemma 4 (iv) that there exist ~ ( 0 , 1 )
and two strictly stationary H I and ~2 with H I r H 2
such that H = aH I @ (I - ~)H 2. From the definition of PH it follows that PH = ~PH I + (I - ~)PH2. PH 2
From theorem 4 it follows that PHI and
are stationary and from theorem I that
PHI
#
PH 2
. Thus, using
lemma 4 (iv) again it follows that PH is not ergodic. Assume now that H is ergodic. Let B I and B 2 be two arbitrary sets in A(M) a n d l e t
n o be such that
B 1 a n d B2 b e l o n g s
to
Bn (M). o
From definition 5, Fubini's theorem and dominated convergence it follows that t lim ~ t~
PH{BI ~TyB2}dY = t
= lim~-~ t-~
H {B1~TyB2}H{d~}dy =
I
-- lim S ~ t~
t
tS ~{BI~TyB2}dY
~{d~} --
M
t I = ~ (limt_~~-~ _! H {BiOTyB2}dY)H{d~} t provided lim ~-~ I t S H~{BI~TyB2}dY exists a.s. (H). Since H~ is comt+~ pletely random we have H {BINTyB 2} = H {BI}H {TyB 2} for IYl > 2n o and further (cf Kerstan, Matthes and Mecke (1974, p 134)) we have H {TNB2} = HT_y ~ {B2} . Thus t I t-~=lim~-~ _~ H {B1~TyB2}dY = t I = t+~lim~-~_~ H {BI} H {TyB2}dY =
3O
i! t
= H {B I} lim ~-~ t-~ _
H
{B2}dY T_y~
.
From lemma 4 (ii) it follows that
i! t
lim ~-~ t-~ _
H
{B2}dY = PH{B2} a.s. (H) T_y~
and thus t
llm ~-~ ] t-~
=
~ PH{BI~TyB2}dY -
=
/ II]j{B1}PII{B2]II{d~ ) = M
PH{B]}PH{B2}.
From lemma 4 (iii) it then follows that PH is ergodic.
I
31
2.
SOME M I S C E ~ E O U S
RESULTS
The sections under this title are almost independent of each other, with exception of section 2.4.2 where results from section 2.3 are used. Common to most topics in the different sections are that special doubly stochastic Poisson models are considered. A survey of such models has been written by Lawrance (1972, pp 218-235). To our opinion the most interesting model discussed by Lawrance, and not touched upon in this section, is the one where the intensity process {~(x)
; x ~ R} is the square of a normal process or the sum
of squares of normal processes. In example 5.4 this model will, however, be considered.
2. I
The w e i g h t e d P o i s s o n p r o c e s s
Let ~ be a Borel measure on X, that is ~ negative random variable.
Then A = ~
M, and let ~ be a non-
is a random measure, and the
doubly stochastic Poisson process corresponding to it is called a weighted Poisson process or a mixed Poisson process. When X is equal to or a part of the real line, then ~ is usually understood to be the Lebesgue measure.
Consider now X = [0, ~) and ~ equal to the Lebesgue measure. Then it is natural to consider a weighted Poisson process as a process in the class studied in section 1.3.3 such that X(t) = ~ for all t > 0 where is a nonnegative random variable with distribution function U. The first systematic treatment of weighted Poisson processes is due to Lundberg (1940). Lundberg called these processes
'compound Poisson
processes' a name that still is used in insurance mathematics.
Among many other things Lundberg showed that the weighted Poisson process {N(t)
; t ~ 0} is a continuous time Markov chain with timedepen-
dent transition intensities Pn(t) given by
32
I
Pn(t) = lim 7~, Pr{N(t + h) = n + 11N(t) = n} = h+0
f = E(~IN(t)
= n) =
x
n+1
e
-xt
U{dx}
0 S xn e -xt U{dx) 0
N is called a P$1ya process if
~ x ~-I e-~X if
r(B) u'(x)
x>
0
( a , 8 > O)
=
0
if
x<0
and in this case
Pnlt ~ J = B + n ~ + t
Lundberg
(19~0, p 99) showed that N is a P61ya process
if and only
if Pn(t) is linear in n, i.e. if Pn(t) = a(t) + b(t).n.
We may observe that for a weighted Poisson process N, we have
Pr{N(t) = 0} = f e -xt U{dx), 0 which,
considered as a function of t, is the L a p l a c e - t r a n s f o r m
of U.
Thus Pr{N(t) = 0} for t ~ 0 determines U uniquely and thus the distribution of N. Compare this with theorem A 1.5.
If on the other hand for some point process
{N(t)
; t > 0}
oo
Pr{N(t) = n) = f (xt)~n e -xt U{dx} 0 for all t > 0 and n = 0,1,2,...
and some distribution
point process need not be a weighted Poisson process. berg
(1969, p 123) gives an example.
If we, however,
function U this Jung and Lundassume that N is
33
a weakly
stationary
weighted
P o i s s o n process
(In section
doubly
1.6 stationarity
form of Pr(N(t)
rity it follows R(s,t)
U, and thus
where
2
'only if' direction
but the modifica-
1.1.) This
follows and
= ~2st.
from lemma
)%2 s
= (t -
Thus E(A(t) the
from
a random variable
it follows
- A(s))
since
Var N(t) =
= Var ~. From the assumption
that Var(A(t)
= Cov(A(s),A(t))
then N is a
= n) is as above.
for t ~ R
X here merely means
= o2(t2 + t 2 - 2t 2) = 0 w h i c h proves
2.2
process
= n) we get E N(t) = t E~
function
Var A(t) = t2~ 2
is defined
see remark
= t EX + t 2 Var ~, where distribution
Poisson
if and only if Pr(N(t)
tion to t > 0 is obvious, the
stochastic
with
1.3a that of stationa-
and thus
- tA(1)) 2 =
'if' direction.
The
is obvious.
Doubly stochastic Poisson proc~ses and r e n e w ~ p r o c ~ s e s
In this
section we will study the class
are both doubly
stochastic
Poisson
processes
Since both kinds of point processes the Poisson process,
interest,
is both a doubly
Kingman
in this
section
Kingman~s considered
In this
Poisson
may be helpful
(1964) has characterized
Poisson processes
for x < 0.
which
Such a study may also have
process
a process
a certain
out that
as a 'variation
which
and a renewal process,
in the analysis
which
of
common to the two
class
is somewhat
our p r e s e n t a t i o n
of the process.
of doubly
also are renewal processes.
give a discussion
we will point
section
generalizations
since if we are considering
stochastic
both representations
are natural
interest.
which
and renewal processes.
a study of the processes
classes m a y have a theoretical a practical
of point processes
stochastic
Although broader
we will than
may at most be
on a theme by Kingman'
all distribution
functions
are assumed
to be zero
34
We will consider tion
point processes
N = {N(x)
; x L @which,
1.3.4, may be defined by a random vector
To avoid
some minor
trouble
to zero with positive renewal process where
probability.
k = 1,2,...,
Since we only allow
finitely
< I. T O is allowed
tion H. The variables bility
and in that
renewal process distribution
have a common many events
function
intervals
+ ~ with positive transient.
if and only if at least
H and F are defective,
takes the value
i.e.
F.
we require
distribution
process
a
random variables
distribution
to have a different
is thus transient
T O to be equal
N is called
in finite
case we call the renewal
ding r a n d o m variable
section
are independent
T k may take the value
functions
T = (T0,TI,T2,...).
A point process
if T0,TI-T0,T2-TI,...
T k - Tk_1,
that F(0)
we allow in this
see sec-
funcprobaA
one of the
if the correspon-
+ ~ with positive
probability.
If
H(~)
= F(x)
the corresponding
I - e -x
if
x>
0
0
if
x<
0
=
renewal process
is a Poisson
process
with intensity
one.
Let {A(x) cess,
; x > O) be a n o n d e c r e a s i n g
of section
1.3, such that A(O-)
rightcontinuous < 0 < A(O).
stochastic
For the same reason
as w h e n we allowed T O to be equal to zero with positive we allow Pr{A(O)
A-1(x)
> 0} > 0. The process
= sup
is called the inverse A-I(0)
(y
:
A(y)
; x > 0} defined by
of A. Due to the assumption
vector ~ = (~0,~i,~2 .... ) define
(~1) .... )
A(0-)
< 0 we have
= + ~} > 0. Let the random
a Poisson
1.3.4 it then follows
T = (A -I(TO),A
probability
< x)
> O. Further we allow Pr{A-1(x)
From section
{A-1(x)
pro-
process
with intensity
that the random vector
one.
35
defines a doubly stochastic Poisson process on R+.
Put oo
f(~)
=
S e-S~
F{dx}
0 and oo
~(s) =
S e-SX
H{dx}
0 where F and H are the distribution functions in the definition of a renewal process.
A point process N, with Pr{N(x) = 0 for all x > 0} = I, is both a doubly stochastic Poisson process and a renewal process. This uninteresting case will be left out of considerations.
Theorem I (i)
A doubly stochastic Poisson process corresponding to A is a
renewal process if and only if A -I has stationary and independent increments.
(ii) A renewal process is a doublx stochastic Poisson process if and only if
~(s) :
I I - log ~(s)
and
~(s) = ~o(S)~(s)
where g(s) = S e-SX G{dx} for some infinitely divisible distribution 0 function G with G(O) < I and go(S) = S e-SX Go{dX} for some distribu0 tion function G . O
(iii) The two representations are related through
E e -sA-1(~
= ~o(S)
36
and E e
-s(A-1(1) - A-I(0))
= g(s).
Proof (i) The 'only if' part, which is the difficult part is proved by Kingman (1964, pp 929-930) and will not be reproduced here. Con^
sider now the 'if' part. Let go and g be given by part (iii) of the theorem. For any n > 0 we have n
E exp{- s0T 0 - kZ=1 sk (Tk - Tk_1)} =
= E exp{- s0A-1(0) - So(A-I(T~ O ) - A-I(0)) n
-
Z sk
(i-1(~k)
- i -I (~T k _ 1 ) ) ) =
k=l
TO)
n
= ~o(So) E(~(s o)
Tk
n E(~(s k)
) =
k=l =
go(SO)
n
I
I - Zog ~(s o)
k:1
I - Zog ~(s~)
which proves part (i) of the theorem.
(iii) This follows from the proof of the 'if' part of (i).
(ii) To any G and Go, defective or not, satisfying the conditions in (ii) there exists a process A -I with stationary and independent increments such that g and go satisfy the relations in (iii). Conversely, for any process A
-I
with stationary and independent in-
crements g and go given by (iii) satisfy the conditions in (ii), since if G(O) = I then the corresponding doubly stochastic Poisson process will not necessarily have only finitely many events in finite intervals. Thus (ii) follows from (i) and (iii).
37
Now we will consider the class of point processes which are both doubly stochastic Poisson processes and renewal processes in more detail. In the analysis we will alternate freely between the two representations. We will follow Kingman and consider the stationary case. A renewal process is called stationary, provided F is not defective and has finite expectation p, if
1
(1
i
~(x) = ~ o
- F(y))dy
.
A stationary renewal process is a strictly stationary point process.
Corollary I A stationary renewal process is a doubly stochastic Poisson process if and only if
~<s) = [I + bs + f
0 and some measure B on (0,~) such that f x B{dx}
<
0
Proof For any infinitely divisible G, defective or not, we have (cf e.g. Feller (1971, p 450))
~(s) = e -r where
r
=bs
+ f (I - e -sx) B{dx}
+b
0 for some b, b
> 0 and some measure B on (0,~) with
f 7-Yqx x B{dx} < 0 For the distribution function 0 (and thus also F) is defective if an only if b
> 0 .
Thus in the stationary case b
= 0.
38
co
Kingman
(1964, p 925) showed that ~ = b + f x B{dx},
co
and thus
0
f x B{dx} = ~ - b < ~. Thus the 'only if' part follows from 0 theorem I (ii). The 'if' part also follows from theorem I (ii) ^
if a distribution exists.
Kingman
function G o such that h(s) = go(S)f(s)
always
(1964, p 925) has shown that X
co
I
7(b +# # ~{dz}dy)
if xLO
0 y
0o(X) = 0
if
x<
0
satisfies the required condition.
From theorem
I (i) and the p r o o f of corollary
I it follows that a
doubly stochastic Poisson process corresponding to A is a stationary renewal process
if and only if A -I has stationary and inde-
pendent increments with -s(A-1(1) - A-I(o))
= exp{ - (bs + ] (I - e-SX)B{dx})} 0
Ee and X
co
b + ~ f B{dz}dy ou
if
x>
0
if
x<
0
b + ~ y B{dy} 0 Pr{A-I(o)
< x} =
0
for some b >_ 0 and some measure B on (0,co) such that co
x B{dx} _< 0 We may observe that since F(0) = lim f(s) we have F(0) > 0 if and oo
S-~co
only if b = 0 and 5 B{dx} < co. Since a stationary renewal process 0 is simple,
see definition A 1.1, if and only if F(0) = 0 it follows
from theorem
1.3 that A(t) is continuous
a.s. unless b = 0 and
39
f B{dx} < ~. 0 If b = 0 and S B{dx} = e < ~ we define the p r o b a b i l i t y measure C by 0 C{dx} = 2 B{dx}. Then C
9(s) = c f (I - e -sx) C{dx} = c(I - f e -sx C{dx}) 0 0 and thus A properties
-I
is compound Poisson process.
of A
-I
U s i n g the sample function
it is not difficult to see that A has the represen-
tation
9(x) k~1 ~
if
~(x) > 0
0
if
~(x) = 0
A(x) =
where N is a stationary renewal process with interoccurrence
distribution
C a n d {~k}k=l i s
a sequence of independent
variables all b e i n g exponentially
time
random 1
distributed with mean --. e
In the case b = 0 Kingman Pr{D+A(x) where D+A(x)
(1964, p 926) showed that
= 0 for almost all x ~ O} =
I
is the right-hand derivative.
Thus, if b = 0 and S B{dx} = *, almost all realizations of A are 0 continuous and, considered as measures, singular with respect to Lebesgue measure.
Kingman considered the important class of doubly stochastic Poisson processes,
discussed in section
1.3.3, where
X
A(x) = S ~(y)dy 0 for some stochastic process
{l(x)
; x > O} measurable
in the sense of
Doob and not identically equal to zero. He showed that a stationary
4o
renewal process can be expressed as such a doubly stochastic Poisson process if and only if b > 0. In this case ~(x) alternates between I the values 0 and ~ in such a way that ~(x) is proportional to a stationary regenerative phenomenon (cf Kingman 1972, p 48).
If f B(dx~ ~ ~ and if c and C are defined as above, it follows, see 0 I
Kingman (1964, p 9 2 8 ) , t h a t X(x) i s e q u a l t o 0 and ~ a l t e r n a t i v e l y on intervals whose lengths are independent random variables. The I lengths on the intervals where X(x) = ~ a r e
exponentially distributed
with mean ~ and the lengths where ~(x) = 0 have distribution function C. C
2.3
Some r e l i a b i l i t y models
Consider a doubly stochastic Poisson process (N(t)
; t ~ 0~. In this
section, with perhaps a somewhat misleading title, we will consider the distribution of the waiting time T for the first event. Since (T > t~ = (N(t) = 0~ this is the same problem as calculating the probability of no events in an interval.
2.3.1
An application on precipitation scavenging of an aerosol particle
In this section we will study a model, due to Rodhe and Grandell (1972), for precipitation scavenging of an aerosol particle from the atmosphere.
Information about the distribution of the waiting
time for the first event is of interest in connection with air pollution problems.
The intensity for the removal of a particle from the atmosphere is highly dependent on the weather. In the model we assume that the removal intensity only depends on whether it is raining or not. Let ~d denote the removal intensity during a dry period, i.e. during a dry period a particle has the probability ~d h + o(h) of getting
41
scavenged from the atmosphere in an interval of length h, and let
P
denote the removal intensity during a precipitation period. Let X(t) be a stochastic process defined by kd
if dry period at time t
kp
if precipitation period at time t
k(t) :
It is further assumed that k(t) is a continuous time Markov chain with stationary transition intensities qd and qp defined by I qd = lim ~ Pr(~(h) = ~pl~(O) = ~d } h+O I qp = lim ~ Pr{~(h) = XdlX(O) = Xp} , h+O and with initial distribution
Pd
: Pr{~(0)
: ~d }
pp = Pr{k(0) = k } . P For some discussion of the relevance of this model we refer to Rodhe and Grandell (1972).
Consider a particle which enters the atmosphere at time 0 and let T be the time for the removal of that particle from the atmosphere. Define G(t) by t G(t) = Pr t} = E(exp( - f k(s)ds}). 0 Put
Gd(t) = Pr{T > tl~(0) = kd } G(t)
: Pr t1~(0) : ~p}
and thus G(t) = PdGd(t) + ppGp(t) .
42
The chosen initial distribution describes the knowledge of the weather when the particle enters the atmosphere.
From the properties of k(t) it follows that
E(exp{ -
t+h 5 l(s)ds}ll(h)) h
is independent of h and by considering the possible changes of k(.) during (O,h) we get -Idh Gd(t + h) = (I - qdh)e
Gd(t) + q d h % ( t )
+ o(h)
and thus h + 0 gives
G~(t) = - (qd + Id ) Gd(t) + qdGp ( t ) and similarly G'(t)p = %
Gd(t) - (qp + Ip) Gp(t)
.
From the general theory of systems of linear differential equations it follows that -rlt Gd(t) = a d e
-r2t + Bd e
-rlt Gp(t) = ap e
-r2t + Bp e
where
rl
=
r2 =
1
-
I (qd+%+Xd+Xp) +
<
(qd+%+~d+Xp) _XdXp_Xd%_Xpqd"
Thus -rlt G(t) = ~e
-r2t + ae
Assume that r I > 0, which holds in all non-trivial cases.
43
Since G(0) = I we have e + B = I and thus -rlt G(t) = ~e
-r2t + (I - ~) e
From this we get
S G(t)dt = ~-- + I 0 rl r2 which by definition is equal to E T.
Integration of the differential Gd(~) = ~ ( ~ )
equations gives, since
= o ,
- I = - (qd + Id) ~ Gd(t)dt + qd ~ Gp(t)dt
- I : qp ! Gd(t)dt-
(qp + Ip) ~ Gp(t)dt
and thus
f G(t)dt = 0
qd + ~
+ Pdlp + Ppld
qdlp + qpl d + Idl p
which determines ~.
In Rodhe and Grandell
(1972) the above derivation is given in more
detail and further the model is illustrated by numerical examples. Of special interest is the case Pd = qd + qp which corresponds situation where the particle enters the atmosphere weather.
to the
independently
of the
In this case +
qd + ~
+ pplp qdld qd + ~ - rl
qd + qp + Id + Ip - 2r I We conclude this section by mention a natural generalization model for precipitation
scavenging.
of the
Let l(t) still be a Markov chain
44
with stationary transition probabilities
but let the possible values
of k(t) be kl,...,kK where K may be infinite.
In the precipitation
scavenging example kl may be the removal intensity during a dry period and k2,...,kK the removal intensities
during precipitation
periods classified according to the intensity of the precipitation. It is easy to see that for finite K the method of calculating G(t) in the case of only two possible values of k(t) also applies to this situation.
It is, however,
in general not possible to find an explicit
solution of the system of differential equations.
Doubly stochastic Poisson processes treated by for example Neuts
of the above kind have been
(1971) and Rudemo (1972) for finite K
and by Rudemo (1973:1) for infinite K. Rudemo's derivation of G(t) differs from ours and has connections with non-linear estimation.
We
will return to this in example 5.3. Neuts uses the process as the input in a queueing model.
2.3.2
A model w i t h an i n t e n s i t y process generated by a renewal pro cess
Consider now a (not transient)
renewal process N = {N(t)
in which the times between successive variables with a common distribution
; t > 0}
events are independent
random
function F. We assume that
F(0) = 0. F is called arithmetic with span T if T is the largest number such that F is concentrated on a set of points of the form T, 2T, 3T, ... and otherwise non-arithmetic.
Further the distri-
bution function of the time to the first event is denoted by H. The % interesting choices of H are H = F which makes N an ordinary renewal process and,
provided F has finite expectation,
45
X
S(x)
=
/ (I - F(y))dy 0 oo
F(y))dy
S (I 0
which makes N a stationary
Let {Xk}k=0 be a sequence
renewal process.
of independent
n o n n e g a t i v e random v a r i a b l e s further these variables
Define
a stochastic
be a doubly stochastic intensity Pr{N(t)
with distribution
be independent
process
and identically function
distributed U and l e t
of the renewal process
N.
~(t) by ~(t) = ~ ( t ), Let N = {N(t)
Poisson process with X(t) as a model
in the sense as discussed
in section
; t > 0]
for the
1.3.3. Let G(t) denote
= 0}.
Consider
first the case H = F and put u(t) = S e -tx U{dx}. Note that 0 u(t) = Pr{N(t) = 01~(t) = 0}. Separating the two cases N(t) = 0 and N(t) > 0 we get t G(t) = (I - F(t)) u(t) + ~ u(s) G(t - s)F{ds} 0 and thus we have obtained
Following
Feller
a (defective)
renewal equation
for G.
(1971, p 376) we assume that there exists a K > 0
such that co
f e ~t u(t)F{dt} 0
= I
and further we assume that S eKt u(t)(1 0
- F(t))dt
<
The equation
eKtG(t)
= eKtu(t)(1
is a proper renewal equation.
t - F(t)) + f eK(t-s)G(t-s)e<Su(s)F{ds} 0
~6
Since, mann
as w i l l
integrable
362))
p
be
shown
it
below,
follows
e
from
u(t)(1
the
- F(t))
renewal
is d i r e c t l y
theorem
Rie-
(cf Feller
(1971,
that co
f e ms u(s)(1 e
- F(s))ds
0
§
f
s e <s
[(
s)F{ds}
0
as t § ~ i f F is n o n - a r i t h m e t i c
and
that
co
Z
T e K(t+nT)
G(t+nT)
eK(t+JT)u(t+jT)(I
- F(t+jT))
§
f
S e ~s ~( s ) F { d s }
0 as n § co i f F is
To
see that
be
the
arithmetic
e Kt u ( t ) ( 1
largest,
- F(t))
a n d m. t h e J
m. -< e Kt u ( t ) ( 1 --j
with
- F(t))
span
is
co
directly
smallest
_< m.j f o r
9 and
number
(j -
1)h
0 < t < ~ .
Riemann such
integrable
l e t m.
that
< t < jh.
Then
co
m.
Z
j=l
Z
a
e <jh
u((j
1)h)(1
-
-
F((j
1)h))
-
=
j=l
-
oo
=
h
e
h
Z
e K(J-1)h
~l(jh)(1
- F(jh))
<
(e 2 K h
I) f e < t u ( t ) ( 1 - F ( t ) ) d t 0
j=1
co
+
e2
h
Z j=1
m. -J
Thus oo
j=1
which
tends
to
co
co
Z m. < h e K h + m. - h O j--1 - J -
zero
as h ~ 0 a n d t h u s
-
(cf Feller
u(t)(1
- F(t))
is
directly
Riemann
integrable.
1971,
p 361))
47
Example
I
A~sume that F is a one-point
distribution
with F(T) - F(T-0) = I.
Then < is determined by e
KT
~(T)
= I
and thus
eK(t+nT)G(t+nT)
§ 9 e
u(~)
e KT
e
u(t)
or
G(t+nT)
~
~(~)n u(t)
Thus we get, in this example,
.
the exact value
for all T and t.
B Assume now that F is non-arithmetic GH(t) denote Pr{N(t)
but let H be arbitrary.
Let
= 0}. Then we have
t GH(t) = (I - H(t)) Q(t) + ~ u(S)GF(t-s)H{ds} 0 or
Assume,
GH(t) = e
t - H(t)) + f e<Su(s)e<(t-S)GF(t-s)H{ds}. 0
in addition to the assumptions
in the case H = F, that
co
e <s & ( s )
~{d~}
<
0 and that
e Kt u(t)(l
and thus,
- H(t)) § 0 as t §
since e
convergence
that
bounded,
we get by dominated
48
oo
eKtGH(t)
oo
f e Ks u(s)(1 0
§
j-
- F(s))ds
f e Ks u(s)H{ds} 0
oo
Ks
S e
u(s)F
ds
0
Example
2
Let N be a Poisson
process
with
intensity
F(x) = H(x) = I - e -#x for x h 0. Then
p, i.e.
K is determined
by
co
f e Kt u(t) 0
e -pt dt = 1
and co
f e Kt u(t)(1 0
- F(t))dt
= I
and thus
e
Kt
G(t) §
o~
2
Consider
-
Then K is the smallest
Pu 1 p +
tl
-
x
+
Ks
U(Xk-0)
^..
u[s)
, k =
solution
e
- Hs
+
-
12
-
ds
distribution
with
1,2.
of
Pu 2 p
some calculations
K =-~(~,1+12+p)
and
) S e 0
the case where U is a two-point
% = U [kt ) ' "
and after
r
x
-
1
we get
(t1+t2+p)
2 -
tlt2
-
UlP,X 1 -
u2~t
2
49
2 ~ Ks f s e ~(s) 0
~ + 11 + 12 - 2< e -~s ds = ~ + u112 + u211
We note that 1(t) is a M a r k o v probabilities
chain w i t h
stationary
- <
transition
with parameters
Pk = Pr{l(O)
= Ik } = u k
and I qk = lim ~ Pr{l(h) h+O I = l i m ~ {h~(1 h+O
- Uk)
# Ikll(O)
+ o(h)
= I k} =
= ~(I
- u k)
.
For 11 = I d
12 = I
P
= qd+
qp
% Ul = qd + qp
qd u2 =
qd+~
we thus get the p r e c i p i t a t i o n
Pd = qd + qp
2 :j s 0
e
scavenging
model provided
, and it is seen that < equals
<s u(s)
e -~s d ~
equals
r I and that
I
--
m
5O
2.3.3
A model with an i n t e n s i t y process generated by an a l t e r n a t i n g renewal process
Consider again the precipitation scavenging model and let Fd(F p) denote the distribution function of a dry (precipitation) period. In -qd x -qpX the model Fd(X) = I - e and F (s) = I - e P Now we drop the assumption of exponential distributions and consider arbitrary F d and Fp, but keep the assumption of independence between the duration of different periods. X(t) is thus generated by an alternating renewal process. Let further Gd(t)(Gp(t)) denote the probability that T > t for a particle which enters the atmosphere exactly when a dry (precipitation) period starts.
Put
~d(S} = f e -s~ ;d{~} 0
9 (s) = S e-SX ; {dx} P
0
P
Gd(S) = ~ e - s x Gd(X)d x 0 ~(S)
= f e -sx G (x)dx . 0
P
Since -Xd t Gd(t) = e
t -XdT (I - Fd(t) ) + f e Gp(t-m) Fd{dT} 0
we get
~d (s) = I - 9d(S + ~d ) s + ~d + }d(s + ~d)Op(S) and in the same way
I - ?(s + xp) P
51
and thus e.g.
{
8d(S)= I - fd(s + Id)fp(S + lp)
I - 5(s + ~d(S + ~d )
1
-
~d(S
+
~a)
s + Id
+ ~ID) } .
s + I P
This formula is given by Gaver (1963, p 224) with, in our opinion, a more complicated proof. When the durations of the periods are exponentially distributed this definition of G d (Gp) coincides with the one given in section 2.3.1 because of the lack of memory of the exponential distribution. An inversion of the Laplace transform seems, however, to be as difficult as our direct derivation. Lawrance (1972, pp 228-233) has a further discussion of this model. We will now consider the behaviour of Gd(t) for large values of t.
Let ~d and ~
be the durations of the first dry period and the subseP
quent precipitation period respectively. Put ~ = ~d + ~p and let F = Fd ~ F
P
denote the distribution of T. Define the functions Ad(t )
and B(t) by Ad(t) = Pr{T > tl~ > t} B(t)
= Pr{T > tl~ = t} = E(e
-~dTd-Xp~ ~
PI~ = t) .
Note that it is irrelevant for B(t) if we start with a dry period or with a precipitation period. It can be shown that
Ad(t) (I - F(t)) = -~d t = e
t -XdT-~p(t-T) (I - F(t)) + 6 e (I - Fp(t-T))Fd{dT}
and that this function is non-increasing. q~
Separating the two cases T > t and ~ < t we get
52
t - F(t)) + f B(T) G d ( t 0
Gd(t) : Ad(t)(1
T)F{dT}.
Assume now, like in section 2.3.2, that there exists a K > 0 such that co
f e
~
e ~t
Ad(t)(1-
F(t)) dt < co
0 Since, in the same way as in section 2.3.2, it can be shown that e Kt Ad(t)(1
- F(t)) is directly Riemann integrable it
follows that
oo
e "s
l i m e Kt Gd(t) = t§
Ad(~)(1
-
F(s))
as
0 co
S s
e KS s(s)
F{ds}
0
It may simplify the calculations
to observe that
f e Kt B(t) F{dt} : E (e (K-ld)~dlE(e(K-lp)~p) 0 and thus K is the smallest solution of }d(id - X)fp(lp - x) and that co
f e Ks 0
Ad(S)( 1 i F(s))ds
--
^
I -
fa(~d-
Id - K and 0
^
K)
+
^
c
fd~Id
-
~)
I -
fp(~
1 P
-
K
K)
=
I
53
2.4
Palm p r o b a b i g i t i ~
In section 2.3 we considered the distribution of the waiting time for the first event. In this section we will consider the waiting time for the first event after the occurrence of an event. In order to do this we will introduce the concept of Palm probabilities.
2.4.1
Palm p r o b a b i l i t i ~ for doubly stochastic Poisson p r o c ~ s ~ in the general case
Intuitively a Palm probability for a point process is a conditional probability given that a certain x 6 X happens to be one of the points of occurrence of the process. There are several attempts made to make the intuitive notion of a Palm probability precise. A short discussion of such approaches is given by Daley and VereJones
(1972, p 360). Our approach will essentially be the one due
to Ryll-Nardzewski
(1961). Since the theory of Palm probabilities
will only be used in this section, we content ourselves with refering to Jagers (1973) for a discussion of a general theory of Palm probabilities based on Ryll-Nardzewski's approach.
Formally a PaZun probability will be defined as a Radon-Nikodym derivative. We hope that the following heuristic reasoning will be of some help for the understanding of the precise definition. Let a point process N and an event B E B(N) be given. The problem is to give a precise (and reasonable) meaning of P r { N 6 BIN({x)} > 0}. For a point process we have N{{x}} = N{dx). Let B(x) denote the event that x 6 X is one of the points of oectucrence of N, i.e. B(x) = { ~ 6 N ;
~{dx} > 0). Led by the elementary definition of condi-
tional probabilities we consider the ratio
Pr{N6 Bg~ B(x) } Pr{NEB(x)}
. Let IB be the
f
if N 6 B indicator for B, i.e. IB(N) =I IB(N)IB.x,(N ) " . ~function )" " , 1 if N~ B ' and consider the ratio E E ]B(x)(N) 9 If N is a simple point process, i.e. multiple pointsE ~IN)N{dx}d~ not occtu-, we have IB(x)(N) = N[dx}. Thus we are led to consider
E N(dx}
and this will be our Palm probability. It will turn out, see theorem 2,
54
that it is convenient for our purposes to use this definition of PaLm probabilities also for point processes which are not simple and for general random measures. For a random measure A, we by formal analogy define a Palm E IB(A)A{dx} p r ~ a b i l i t y by E A{dx}
We conclude this heuristic reasoning by some remarks on the definition used by Jagers (1973). He considers point processes N, not necessarily simple, and associates to N a simple point process N~ defined by ~ { { x } }
= min(N{{x}},~)
for all x 6 X. Then IB(x)(N) = N~{dx} and he defines a Palm probability by
s 1~(N) ~{ax} 9 For a simple point process N we have N ~ = N, and ~hus the m N~{dx} two definitions agree in that case.
Let A be a random measure with distribution and let B 6 B(M) be an arbitrary but be the indicator
function,
i.e.
I
if
~9 B
0
if
~
H such that EA = M 6 M
fixed set. Let IB : M §
[},I~
IB(~) =
and consider the measure with respect unique
a.e.
B
E IB(A)A which is absolutely
to M. By the Radon-Nikodym (M) determined
continuous
theorem there exists
function 11 {B}
a
: X § ~0,I~ such that
X
A for all A 9 8(X).
The family {Hx{B} p 258))
; x 6 X, B ~
so that H
B(M)}
can be chosen
is a probability
measure
on
(cf Bauer
B(M)
(1968,
for all x ~ X
X
and so that for each B 6 B ( M )
the function x + H {B} is B(X)X
measurable. H
In the sequel we assume this to be done.
will be called the Palm measure.
The measure
To simplify notations
X
regard
{x6X
two Palm measures
1I( 1 ) X
a n d 1I( 2 )
; H(1)x r H(2)} has M-measure
X
as equal
zero.
if
the
set
we will
55
Remark I Our definition of Palm probabilities
is indicated by Jagers
(1973,
p 20), although it differs from the one used there. For simple point processes,
see definition A 1.1, the two definitions
the discussion by Jagers
agree and thus
(1973) about the link to the intuitive
meaning of Palm probabilities
also applies to the present defini-
tion. For a point process where multiple points are possible this link is generally lost. We will illustrate this fact with a simple example.
Let N be a point process on the two-point
space X = (x,y)
and put H.
= Pr{N{{x}} = i, N{{y}} = k} .
l~k
Then the natural definition of H {B} for B = {v~ N ; ~{{y}} = k} X
is H {B} = Pr{N{{y}} = klN{{x}} > 0} = X
oo
i= I
1,k
=
co
oo
Z Z H. k=O i=I 1 ,k
provided Pr{N{{x}} > O} > O, while from our definition we get co
Z in. i=I i ,k H
{B}
=
X
co
co
Z Z i H. k=O i=I m ,k
For strictly stationary point processes the link to the intuitive meaning is exhaustively discussed by Kerstan, Matthes and Mecke (1974, pp 182-185).
From that discussion
it follows, however,
that our definition of Palm probabilities may he of practical use also in the presence of multiple points.
9
56
Let for x ~ X the Dirac measure
I
if
x~A
0
if
x~A
6
for x be defined by
x
for all A ~ B ( X )
6x{A} =
and let the Dirac measure A
Ax {B} =
Note that ~ 6 N x
I
]
if
0
if
x 6x~B
and that &
x
be defined by
for all BeB(~I)
.
is the distribution of a random
x
measure with almost surely 6
Mecke
for 6
x
as realization.
x
(1967, p 43) has shown that a probability measure
P on (N,B(N)) is the distribution sity measure
~, see definition
of a Poisson process with inten-
1.4, if and only if
X•
N X
for all B(X)•
function~
sider, as an example,
f : X•
f(x,v) = IA(X)IB(~)
~ R+. If we con-
and P = H
we get
f ~{A} H {d~} = f (I IB (" + 6 x) H {dD}) H{dx} = B ~ A N
= / (Ax~ ~) {B} ~{dx} A where the first equality follows from Mecke's c h a r a c t e r i z a t i o n the second one from the definition of convolution
and
given in section
A 1. Thus, since for a Poisson process the intensity measure equals the mean measure, we have Palm measures
(H~)x = AxXH~" With the definition of
used by Jagers
true if ~ is non-atomic a similar relation
(]973) this result is only generally
(cf Jagers
(]973, p 22)). We will now give
for doubly stochastic Poisson processes.
57
Theorem 2 For any random measure A with distribution H, such that EA = M 6 M, we have (PH)
= Ax~P H
X
where PH X
is %he distribution of a doubly X
stochastic Poisson process corresponding to a random measure with distribution H . x Proof The theorem is a consequence of Kummer and Matthes (1970, p 1636) but, as to be shown, it is also a rather simple consequence of Meckes characterization of Poisson processes. We have to show that
/ (Ax~P~ ){B}M{dx} = / ~{A]Pn{d~} A
x
B
for all A~ B(X) and all B~B(N). Using I.
the definition of convolution
2.
the definition of doubly stochastic Poisson processes
3.
approximation of r IB(V + 6x) H {dr]
with simple functions of
the form Z ak IAk(X ) IBk(~ ) for An~ B(X) and Bk~ ~(M) and the definition of Palm measures 4.
Mecke~s characterization with f(x,v) = IA(X) IB(V) and P = H
5.
approximation of IB(V) v{A} with simple functions
we get (the figures over the signs of equality refer to the above table)
(Ax~ ~ ) {~}M{dx} ~ / /
I B (~
+ ~x ) PK {d~}~{dx} :
A
x
AN
~ ~/ A~
IB(v + 6x)H~{dv} H= {d~}M{dx} x
x
~ ~ ; IB(V + 6x) H {dv}~{dx]H{d~} = ~ s IB(V)v{A} H {dr} H{d~} 2~5 f v{A} PH{dv} B which was to be proved.
58
Some special models
2.4.2
Consider X = R and, see section the intensity
is a stochastic
(Io,B(Io))where
tion H on
process
integrable
functions.
E l(x) = m(x)
< ~ for all x ~ R .
Poisson process
probabilities
{l(x)
; x@R}
Assume
rightconti-
further that
In this case the corresponding
is simple
for
with distribu-
I ~ is a set of nonnegative
nuous Riemann
stochastic
1.3.3, the case where the model
and from the definition
doubly
of Palm
we get
S Hx{B} m ( x ) d x A
=
S n(x) AxB
H{dn}
i.e.
x
{B} m(x)
/ n(x) n{an}
=
a.s.
(Lebesgue measure)
B
B(Io).
for all B e
Consider now the corresponding and define G(x,t)
G(x,t)
and Gx(x,t)
= PH{V
doubly
stochastic
Poisson process
N
, t > x, by
= O} = Pr{N(t)
; v(x,t]
- N(x) = 0}
and Gx(x,t) =
= (PH)x
ol={{x}}
>
{v
; V(x,t~
= O) = P r { N ( t )
o}
where the last sign of equality tion of Palm probabilities
refers to the intuitive
for simple point processes.
section 2.3, we have t
S n(y)~ G(x,t ) = S I
e
x
~{dn}
o
and, from theorem 2 and the fact that Ax~P H x = PH
{V ; V(x,tl = 0}, we have x
- N(x) =
interpretaThus, compare
59
t
f n(y)dy
Gx(x,t ) =
f
e
x
~x{dn}
o or, by the definition
of Hx, t - f n(y)~y
0x(x,t)m(x) = ~ n(x) e
x
~{dn}.
o
Example
3
Consider the case where transition where
I
l(x) is a Markov
probabilities
and with distribution
is the set of rightcontinuous
o
chain with stationary H on (Io,B(Io))
piecewise
constant
func-
tions R § R+ = [0,~) with only a finite number of jumps in every finite interval
and with range
finite or infinite.
Hki(X)
{lk ; k = ],...,K} where K may be
Put
= Pr{l(x + s) = lill(s)
= lk } ,
x6R+,
and ~k(X)
= Pr{~(x)
:
Xk } ,
xem
Of c o u r s e Hki(X) and Wk(X) must f u l f i l l regularity measure
m(x) =
conditions
In section
< ~.
G(x,t),
or strictly
speaking
upon the case
K.
be arbitrary
late G(x,t)
and
assume that
for the case K = 2 and commented
with arbitrary
Let x ~ R
Further
2.3.1 we calculated
only G(O,t),
consistency
in order to correspond to a probability
H on (Io,B(Io). K Z IkWk(X) k=]
certain
but such that m(x)
> 0. In order to calcu-
it is enough to know the restriction
of H to the o-
6o
algebra generated by ( ~ restriction
is
I ~ ; ~(y) ~ z} , y ~ x and z ~ R ,
determined
b y H{B} f o r
B~
B(I o)
of the
and this
form
(n~ ~o ; n(x) = Xk, ~(x I) = Xkl ..... n(x n) = Xkn } for n = 1,2,... , k , k 1 , . . . , k
n = I , . . .K and x < x] < x 2 < ... < x n < ~ .
We have shown that G ( x , t )
is
x
calculated
as G(x,t)
if
H is
replaced
with H . For B of the above form we have x
H(B) ~ Wk(X)Hkk1(Xl-X)Hklk2(X2-Xl)...Hkn_ikn(Xn-Xn_1).
Since
I
I
Hx(B} = ~
f ~(x)H(d~} - m(x) Xk H(B} B
we get
Hx(B} =
m(x)
Thus the restriction of H
Hkk1(Xl-X)'''Hkn_ikn (Xn-Xn-1) "
x
to the above mentioned a-algebra is the
distribution of a Markov chain (~x(y) ; y ~ x} with stationary transition probabilities and with sample functions in I . Thus for x with o m(x) > 0 the methods of calculating G(x,t) also applies to G (x,t) ~kWk(X ) x provided ~k(X) is replaced with m(x) . For the stationary case, i.e. Wk(X) = Wk' this result is obtained by Rudemo (1973:2, p 279).
It may be observed that the above reasoning holds also if the transition probabilities are not stationary, but then we have no methods for calculating G(x,t).
Example 4 Consider the case studied in section 2.3.2 where ~(x) = ~N(x) and assume that N(x) is a stationary renewal process. It was shown that under certain regularity assumptions
9
63
l i m e Kt G(t) = C. t~ Assume these regularity assumptions
and assume further that
0 < E kk < ~" Put m = E Xk'
Thus t
-
i fI ~
Gx(x't) --
n(x)
e
f n(y)dy x
o
and since N is stationary it follows that there exists a uniquely determined function G~
G~
such that (for almost every x)
= Gx(x,t+x)
For the general theory of Palm probabilities case we refer to Jagers
in the stationary
(1973, pp 25 - 26). Thus t
G~
i
=
f n(y)dy 0
e
n(o) m
~{dn}
o
and thus G~
is a monotone function.
We will now consider the behaviour of G~
for large values of t.
For arbitrary x E (0,t) we have x
x
f o~
=
0
f Oy(y,t)dy = 0 t
-• --
n(y) m
OI
e
f
n(z)dz
Y
0 t
t
- S n(z)dz
-•
(e x
-m
e
I o
= !
m
(o(t-x)
- f n(z)d~
- o(t))
0
) ~(dn}
=
62
which is a Palm-Khintchine
equation
(cf
e.g.
Daley and Vere-Jones
(1972, p 358)).
From the monotonicity
xG~
of G ~ it follows that
< A (G(t-x) - G(t)) < xG~ -- m
and thus
xe
~im (e<(t-x) G(t-x) e <x - e
< xe <(t-x) G~
e <x
for all t > 0 and all x ~ (0,t).
Thus
x lim sup e
< C (eKX _ I) < xe <x lim inf e
m
which implies that l i m e Kt G~ t-~
lira t-~o
e Kt
G~
=
t§
exists and that
<__qC m
We conclude this example by sketching an approach more in line with the one used in example 3.
Let {X~
; x ~ 0} be a stochastic process defined exactly like X(X),
except that " l 0 has distribution function V(x) = ~ Put v(t) = S e-tx V{dx}. Then H ~ 0 I
rl~
= m
S n(O)
i0 y
U{dy} instead of U.
defined by
II{dq} for all B
B(IO) ,
B
can be shown to be the distribution of {X~
; x S0}.
Thus
t
- S x~ G~
= E e
dx
0
and, see section 2.3.2 for notations,
G~
t = (I - H(t))v(t) + S v(s) GF(t - s) H{ds}. 0
63
Thus the asymptotic behaviour of C~
can be studied by methods, similar to
those used in section 2.3.2.
A detailed derivation of l i m e
2.5
m
Some random generations
We will in this section describe stochastic
Poisson
for illustration
sequences
reasons
s
and s
independent
for continuous
with s
exponentially
distri-
p, 0 ~ p ~ I, and
I - p. Then we have
I
rk, j = E(s k - I)(s
- I) :
plk-Jl
For this process we made seven generations ding {N k} for k = 1,2,... ,n. In table generations
time are presented.
be equal with probability
Zk=
5.2 and
I, i.e. Pr(s k ~ x} = I - e -x for x ~ 0. Let
with probability
m=E
of doubly
1.4) which will be used
6 and 7. In sections
; k ~ Z} be a Markov process
buted with parameter further
(see section
in sections
5.3 some similar generations
Let s = (s
some random generations
are given.
of (~k } and the correspon-
I some characteristics
of the
64
Name of
I n
P
n
n Z ~k --
generation
I
n
I
--~ Nk n I
I
n
--n Z (Nk-~,k)2 1
GI
500
0.0
0.993
0.982
0.951
G2
500
0.0
1.025
1.052
I .074
G3
500
0.0
1.018
1.046
0.856
G4
500
0.75
0.929
0.884
0.717
G5
500
0.75
0.878
0. 788
0.809
G6
500
0.75
0.933
0.976
0.978
G7
50
0.75
1.090
o. 86o
0.702
Table I: Some characteristics
The generations
for the generations
GI - 06 are the same as used by Grandell
(1972:2).
Generation G7 is illustrated in figure I.
25
5O
In each point k the height of the spike represents N k and the value of the plece~ise constant curve represents
Lk .
Figure I: Illustration of generation G7
65
3.
CHARACTERIZATION AND CONVERGENCE OF NON-ATOMIC R A N D O M M E A S U R E S
In this section we will illustrate how doubly stochastic Poisson processes may be used in proving theorems about random measures. Roughly speaking, any theorem about characterization or convergence of point processes that can be applied to doubly stochastic Poisson processes gives, due to theorem 1.1 and 1.2, a similar theorem for random measures.
Kallenberg (1973:1) gave general theorems (see theorem A 1.4 and A 1.7) for random measures and important improvements
(see theorem
A 1.5 and A 1.8) for simple point processes. We will use the theorems for simple point processes in order to get results for nonatomic random measures
(see definition A 1.1). The idea to use con-
tinuity assumptions in theorems of the kind to be given in this section is not new. Kallenberg (1971) announced such theorems for stochastic processes. The main results of this section were independently derived by Grandell (1973) and Kerstan, Matthes and Mecke (1974, pp 314-316) using identical methods. Using different methods the results were extended to certain classes of random signed measures by Kallenberg (1973:2) where also his announced theorems for stochastic processes were given.
Let A C B ( X ) be an algebra containing some basis for the topology on X.
Theorem I Let A be a non-atomic random measure with distribution H. Then ~ is uniquely determined by the distribution of A{A) for all bounded
A~A.
66
Proof The doubly stochastic Poisson process N with distribution PH' see section
1.5, is simple,
see theorem
1.3, and thus PH is uniquely
determined by Pr{N{A} = O} = E e -A{A} = f e-P{A}H{ d~} for all
M bounded A~A,
see t h e o r e m A 1.5. Since the distributions
of A{A}
determine E e -A{A} they also determine PH and thus H, see t h e o r e m 1.1.
It may be observed that in fact we have proved a little more, namely that H is determined by E e
-A{A}
for all b o u n d e d A ~ A.
In theorem A 1.5 the assumption that N is simple is essential, e.g. Kerstan, Matthes
and Mecke
assumption that A is non-atomic not mean that theorem
see
(1964, p 17), and thus also the is essential.
This does of course
I is true only for non-atomic A, in fact it
is of course true for simple point processes, but some condition on A is needed.
It is therefore of interest to find conditions
of the distributions
of A{A} for A ~ A which ensure that A is a non-
atomic random measure.
Jagers
(1974, p 203) shows that a point process
N is simple if there exists a non-atomic = o(p{A})
in terms
~M
such that Pr{N{A} > I} =
as p{A} + 0 and A ~ A .
Thus for any random measure A it follows that A is non-atomic
1 - El(1
+ A{A}) e-A{A~
if
= o(~{A}).
Denote the boundary of a set A by 3A.
Theorem 2 Let A be a non-atomic
random measure with Pr{A{~A} = 0} = I for all co
bounded A@A A
n
and let {An} I be a sequence of random measures.
d --~A if and o n l y i f
d
A {A}--* A{A} f o r n
all
hounded A~A.
Then
67
Proof The 'only if' part follows since ~ *
~{A} is continuous
for Borel
sets A with ~{~A} = 0.
Now we consider the 'if' part. We will use theorem A 1.9 in the same way as theorem A 1.5 was used in the proof of theorem I. The doubly stochastic Poisson process N corresponding to A is simple and
Pr{N{~A} = 0] = I ~
Let N
n
E e -A{~A} = I ~:@ Pr{A{~A} = 0} = I.
be the doubly stochastic Poisson process corresponding to A . n
For bounded A ~ A the random variable N {A} can be considered as a n
doubly stochastic Poisson process with an one-point
state space and
thus from theorem 1.2 it follows that A {A}-J-~ A{A} implies that n
d d N {A}---~ N{A) and thus, by theorem A 1.9 N ---+ N which, by theorem n n d 1.2. again, implies that A --+ A.
n
m
It seems natural that if theorem A 1.8 is used in the proof of theorem 2 a somewhat stronger result will come out. In order to do this the following lemma is needed.
Lemma I o~
oo
A sequence {PH }I is tight if and only if {Hn) I is tight. n
Proof Assume that {Hn)~ is tight and thus by Prohorov's theorem relative compact. {PH
For any subsequence
}I converges weakly, nk
{Hnk) ~ which is weakly convergent also
see theorem 1.2, and thus {PH }I is relan
tive compact and by Prohorov~s theorem in its other direction thus tight. The 'only if' part is proved in the same way except that corollary
1.1 is used together with theorem 1.2.
B
68
Theorem 3 d Let A,AI,A2,...
A if and only if
and A be as in theorem 2. Then A n
-i {A} n
(i)
E e
(ii)
E A {A} e
.~ E e
-A{A)
for all bounded A ~ A
-An{A}
-A{A} ," E A(A} e
for all b o u n d e d A s A
n
co
(iii)
{An) I is tight.
Proof Proceed as in the p r o o f of theorem 2 up to the construction of N n. From assumption
(iii) and lemma I it follows that {Nn} I is tight
and from assumptions
(i) and (ii) that Pr{N {A} = 0} § Pr{N{A} n
and that P r { N {A} = 1} § P r { N { A ) = 1} a n d t h u s
it
follows
= 0}
by theo-
n
d
rem A 1.8 that N ~
d
N which, by theorem
1.2, implies A --~ A .
n
n
m
4.
LIMIT THEOREMS
In this section we will consider asymptotic properties stochastic Poisson processes
Let A = {A(t) rightcontinuous
; t~R+}
on B+ = [0,~).
be a stochastic process with nondecreasing
sample functions,
for all t > 0. Thus, see section measure.
Let ~ = {N(t)
of doubly
such that A(0) = 0 and A(t) < ].3.1, A corresponds to a random
; t 6 R+} be a Poisson process with inten-
sity one and independent
of A and let, see section
doubly stochastic Poisson process N = {N(t)
; t~R+}
%
to A be defined by N = NoA, i.e. N(t) = N(A(t)).
1.3.2, the corresponding
68
Theorem 3 d Let A,AI,A2,...
A if and only if
and A be as in theorem 2. Then A n
-i {A} n
(i)
E e
(ii)
E A {A} e
.~ E e
-A{A)
for all bounded A ~ A
-An{A}
-A{A} ," E A(A} e
for all b o u n d e d A s A
n
co
(iii)
{An) I is tight.
Proof Proceed as in the p r o o f of theorem 2 up to the construction of N n. From assumption
(iii) and lemma I it follows that {Nn} I is tight
and from assumptions
(i) and (ii) that Pr{N {A} = 0} § Pr{N{A} n
and that P r { N {A} = 1} § P r { N { A ) = 1} a n d t h u s
it
follows
= 0}
by theo-
n
d
rem A 1.8 that N ~
d
N which, by theorem
1.2, implies A --~ A .
n
n
m
4.
LIMIT THEOREMS
In this section we will consider asymptotic properties stochastic Poisson processes
Let A = {A(t) rightcontinuous
; t~R+}
on B+ = [0,~).
be a stochastic process with nondecreasing
sample functions,
for all t > 0. Thus, see section measure.
Let ~ = {N(t)
of doubly
such that A(0) = 0 and A(t) < ].3.1, A corresponds to a random
; t 6 R+} be a Poisson process with inten-
sity one and independent
of A and let, see section
doubly stochastic Poisson process N = {N(t)
; t~R+}
%
to A be defined by N = NoA, i.e. N(t) = N(A(t)).
1.3.2, the corresponding
69
One-dime~ional l i m i t theorems
4.1
We will n o w consider This question and
(1972:2)
the asymptotic
has been independently and Grandell
It is well-known
of N(t)
t r e a t e d by Serfozo
as t § ~. (1972:1)
(197]).
that
~(t) - t
d --§
where W is a normally
as
t §
distributed
d and Vat W = ] and where - - ~ means In many
distribution
cases there
exists
d
'convergence
constants
> 0 and a r a n d o m variable A(t) - Kt Y
r a n d o m variable
with E W = 0
in distribution'.
K~ y~ 8 with y > 6 > 0 and
S such that i
~S
as
t +~.
from Dobrushin
(1955)
t8 Then it follows
N(t)
-
Kt Y
that
S + /~KW if
V =
S
y < 26
26
d
as
t§
t 6
where
S and W are independent.
that the specific
if
It follows
form of the norming
constants
portant~
and we are led to the
(1972:1~
p 293 and ]972:2 pp 3]2-3]3).
Theorem
]
Suppose
that there
~t lim 8 t = ~ and lim ~ t-~o t§ 8t such that
A(t)
- ~t
= K, 0 _< K < ~
d ~S
Bt
following
exist nonnegative
as
t§
from Dobrushin~s
are not too im-
theorem
constants
proof
due to Serfozo
~t and Bt with
and a random
variable
S
7o
Then N(t) - ~t
d ~S+
~Was
t
§
oo
St where S and W are independent.
Proof We will give a proof slightly different from the one given by Serfozo.
Since N(t) : N(A(t)) one may suspect that N(t) behaves somewhat alike A(t) + N(st) - st. Put, in order to simplify the notations, No(t) = N(t) - t. Thus we have
N(t) - ~t
A(t) - ~t
~o(~t )
=
+
St
-
Bt
-
Bt
Due to the assumptions A(t) - ~t
d §
as
t §
St For the second term we have
N~
m ~ t~ Bt
If K > 0
then
=
N~
~-~
at + ~
~t _-~§ v/K'K as
~t
as
t
t § ~
-~
ce
~t and
~o(~t )
d
---~W
as
t §
and
No (A(t)) - ~o(~t ) +
Bt
7]
If K = 0 then No (mt ) mt Var (~-~j----)= -~ § 0
as
t §
~t Thus, if the last term is shown to tend to zero in probability
as
t § ~, the theorem is proved.
Since N and A are independent we have
d
No(A(t)) - N(mt)
No(I A(t) - mt I)
Bt
=~
Bt
IA(t) - mt[" INo(IA(t) - mtl) I Bt2
where = means
N iA(t) _ mt I
'equality in distribution'
preted as zero. From Chebyshev~s
and where o
gg
is inter-
inequality it follows that
(t) Pr {L ~
N
<
]
} > 1 -
~
for
allt
> 0 a~d
(t)
{ o V~
all
~ > O,
-
; t ~ 0} is tight, and thus also {N~
IA(t) - mt I is tight. Since
i.e.
st
I), t > 0} ~IA(t) - ~tl --
d * IS1 as t § ~ it follows that
~t
i IA(t)
- ~tl
9 tends to zero in probability
as t § ~ and thus also
Bt No(IA(t ) - ~tl ) tends to zero in probability
as t § ~.
Bt
m
Consider now the case when E A2(t) < ~ for all t > O. Put M(t) = E A(t) and V(t) = Vat A(t). From lemma 1.3a it follows that E N(t) = M(t) and Vat N(t) = M(t) + V(t).
72
The
following
Grandell
(1971,
Corollary Suppose (i)
corollary
is a slight
reformulation
of results
due to
pp 207-213).
I
that
lim M(t) t-~
If k = ~
then,
N(t)
- M(t)
= ~
and
d ---+ W
as
M(t) l i m ~T-77~+ ~ =k,O
t § ~.
/Var N(t) (ii)
If k < ~ A(t)
and if there - M(t)
exists
d
a random
variable
Z such that
as t §
Z
V~(t) Then N(t)
where
- M(t)
--+d @
I " k v--;---2 z + ~T-~-f
W
as
t +~
Z and W are i n d e p e n d e n t .
Proof
(i)
Put
s t = M(t)
and
8t = M(m
(ii)
Put
s t = M(t)
and
Bt
=
7
/~.
Then
S = 0 a.s.
Then
d Z = S.
.
m In the
important
A(t)
we have
the
Corollary Suppose
-
special M(t)
following
case w h e n
d ~W
as
t§
result.
2
that
lira M(t) t+~
A(t)
- M(t)
= ~ and that
d --+ W
as
t § ~.
73
Then N(t) - M(t)
d
.~ W
as
t§
~Var N(t)
Proof Let
{ t n }n=1
be an arbitrary
sequence
such that t n § ~ as n § ~
M(t) Since V--~7 ~ 0 for all t > 0 there exists a subsequence M(tn,) {tn,}n,=1 g {t n} such that nlim'§ V(tn,------~ = k 6 [0,%~ where k may depend on {t '}. From the proof of theorem n
I and from corollary
I
it follows that N(tn,)
- M(tn,)
d ----+ W
as n' + ~
~Var N(tn,)
and, since {t ) was arbitrary,
thus
(cf Billingsley
(1968, p 16))
n
N(t) - M(t)
d ----+ W
as t § ~
~Var N(t)
l The following corollary
shows the existence
2 but not by corollary
of independent variables
example
and identically
I
9
of cases covered by
Let {~k }~ k=1 be a sequence
distributed
nonnegative
random
with E ~k = I and Var ~k = I and put
it] A(t) =
Z
~'~ means
'integer part'
Then M(t) = It] + exp {[log(t + I)~}, V(t) = It] and A(t) - M(t) V(/~
d_~ W
but
I lim inf ~M(t) = I + -t-~~ e
M(t) lira sup ~ = 2. t-~
Added
in proof
See page 86 for further remarks.
and
74
4.2
A fund~ion~g l i m i t theorem
The main purpose of this section is to present results corresponding to theorem I but where the statement
'convergence in distribution'
is given in a more informative sense.
Consider therefore the space D of rightcontinuous functions with lefthand limits defined on [0,~) endowed with the natural extension of the Skorohod J1 topology. Some properties of D are given in section A I. The Borel algebra B(D) on D is equal to the a-algebra generated by {xED
; x(t) ~ y }
for y 6 R and t in some set dense in E0,~). Let
D CD o
be the subset of non-decreasing functions x with x(0) = 0 and
let C ~ D be the set of continuous functions and put C
o
= CAD
o
. All
these subsets are closed sets. We topologize these sets by relativizing the topology of D.
We call a measurable mapping from some probability space into (D,B(D)) a stochastic process in D. The same terminology will be used for the subspaces and also for products of these spaces.
Remark 1 Let us consider the set of functions in D
o
and denote this set by
when it is endowed with the vague topology. Let h : M § D identity function. It is measurable,
o
be the
since from theorem A I. I and
the properties of B(Do) it follows that B(M) = B(Do). Thus every probability measure on (M~B(M)) is also well-defined on (Do,B(Do)). Let ~,~i,~2,. .. be functions in D O such that ~n § ~ in D o ~ i.e. in the Skorohod sense. Then (cf Billingsley
(1968,p 112))
~n(t) § ~(t) at continuity points t of U and thus, see theorem A 1.7, ~n + W in M, i.e. in the vague sense. On the other hand
75
0
if
0
< I---
--
~n(t) =
I
1 -!<
if
t n
2
I n
< 1 +1
--
n
1 + l--
if
n
- -
converges to 0
if
0
2
if
I
< 1
~(t)
in M but not in D o . Thus the (extended) Skorohod topology is strictly finer than the vague topology so h is not continuous.
If
A,AI,A2,... are stochastic processes in D o it thus follows that d d A - - 4 A (in M) does in general not imply that A ----* A (in D ). n
n
The function h is, however,
continuous
o
at continuous
functions,
at functions in C O . To see this, let ~I,~2,... be in D
i.e.
and let ~ be O
in C O . Then ~n § ~ in M means that ~n(t) § ~(t) for all t ~[0,~) while ~n § ~ in D o means that
sup l~n(S) - ~(s) I § 0 for all t 6 [0,~). O~s<_t It can be shown that ~n § ~ (in ~) implies ~n § ~ (in Do) almost exactly in the same way as it is shown that pointwise convergence distribution
functions to a continuous
distribution
form (cf e.g. Bauer (1968, p 191)). Thus A
n
d *
A
of
function is uni(in ~) implies
d An -
'~ A ( i n
DO ) p r o v i d e d
Pr
{A~ Co } = 1. T h e o r e m s 3 . 2
and 3.3 thus
d
hold true if the conclusion A
> A is interpreted in the Skorohod n
sense. Jagers
(1974, pp 211-212) has shown the same to be true for
theorems A 1.8 and A 1.9.
It is not sufficient to assume that A is continuous in probability d d in order to ensure that A n - - 4 A (in M) implies A n ~ A (in Do) as was erroneously stated by Bingham (1971, p 6). To see this, let 8 be a non-negative
random variable with continuous
distribution
and put
76
0
An(t) =
n(t -
@)
if
0
if
8
< t
<
@ I
< @ +--
--
if
I
t
> --
n
@ +--
1 n
and 0
if
0
I
if
@>
< @
A(t) t
d ---~ A (in M), but since A
Then A n
is a process in C n
is closed in D it follows that A
n
and since C o
does not converge in distribu-
tion to A in the Skorohod sense.
Let X be a stochastic process in D, m an element in D and Bt a positive function such that lim Bt = ~. Define for every t > 0 a stochast-~ tic process Xt in D by
Xt(T) =
x(t~) - m(tT) Bt
d If Xt
~ S as t § =, for some stochastic process S in D, we talk
about a functional limit theorem for X.
If for every T > 0 we have
~tT _- Tp t-~=limB t , p finite,
then Bt is said to var~ regulary at infinity with exponent p or shorter to be p-varying.
We will, in the functional limit theorem to be given, assume that Bt is p-varying. The following lemma, and the remark after it, indicates that this in fact is a mild restriction.
77
nemma
I
Let X be a stochastic limit theorem holds
process
in D and assume that a functional
for X. If B 6 D
O
and if Pr {S e 0} < I then
~t is p-varying with 0 < p < ~.
Proof A very similar result sidered convergence underlying
is due to Lamperti
of finite-dimensional
assumptions
were somewhat
proof to be given is, except
(1962, p 74). He condistributions
different
for technicalities,
and his
from ours. The the same as
Lamperti's.
It follows
from Billingsley
that the set T S consisting
(1968, pp 123-124)
and theorem A 1.10
of 0 and those T > 0 for which
Pr {S(T) - S(T-) = 0} = I is dense in E0,~). The statement d X t --~ S implies
= Xt(T)
d~
S(~) for all T ~ T S and since lim Bt t-~o we have Pr {S(O) = O} = I. Since B(D) is generated by { x e D , x(t) < y}, y ~ R and t ~ T S it follows that there a ~o~Ts,
t ~ > O, such that Pr {S(T o) r 0} > 0 since otherwise
Pr {S - O} = I. Then for every positive
s(T)
d
x(t~o~)
BtT
x(tTT o) - ~(tTT O)
= - -
BtT -~tT
BtT
O
The second factor of the right member S(~o) , and therefore
T > TS, when t §
- ~(tTo~)
* - -
Bt~
Thus,
exists
BtT
O
converges
in distribution
to
tends to a finite limit. O
since B ~ D o, it follows
(of Feller
(1971, pp 275-276))
that
78
~tT lim ~ t- ~ 8t~
i.e.
o
T B t - to = lim - t -~~ 8t
Bt is p-varying.
p = (~---) for all o
~ > 0,
Since lim ~t = ~ we have 0 >_ 0.
If O = 0 then S(T) =d S('c ) f o r a l l
o
ments of D are rightcontinuous,
positive
"c~-TS. S i n c e a l l
this implies
ele-
(cf Billingsley
(1968,
p 126)) that S(T ) =d S(0) and thus Pr (S -= 0) = I. This is a cono tradiction
and therefore 0 > 0.
B
Remark 2 If Pr (S(T) - S(T-) = 0) = I for all T ~ R+, then T S = R+ and the p r o o f of lemma I goes through if B is only assumed to be measurable
Lemma
(cf Feller
(1971, p 277)).
2
The function ~ : D x D and it is continuous given by r
o
+ D given by @(x,y) = x~y is measurable
for ( x , y ) 6 C
= x + y where
and it is continuous
x D . The function @ : D x D § D o
(x + y)(t) = x(t) + y(t) is m e a s u r a b l e
for (x,y)6 C x D.
Proof This lemma is a consequence of more general results given by Whitt (1972). A p r o o f will, however, be given. Consider first the function %. The m e a s u r a b i l i t y
follows from the p r o o f of lemma 1.2
slightly modified according to B i l l i n g s l e y observations xl,x2,... ~ D ,
about the Borel-algebras
(1968, p 232) and the
in remark
1. Let now
x 6 C and y,yl,Y2,... ~ D O be given. We will show that
if Xn + x and Yn § y than XnOY n + xoy. From the definitions
given
79
in s e c t i o n
A I it follows
that
Yn § y means
that
there
R~C
{Yn } =I' Yn 6 F '
such
that
ynOYn
exists
U
~ y
and Yn
~ e.
Since
U~C
x g C it follows tE
that
x
n
§ x means
that
x
) x. For
n
any
[0,~) we have
sup O<s
Let
IXn~YneYn(S)
- x~y(s)l
<
sup O<s
IXn~Yn~Yn(S)
+
sup O<s
[x.Y~Yn(S
e > 0 be g i v e n
continuous
and put
on [ O , t o .~ t h e r e
sup
<
- XOYnO7n(S)l
) - x~y(s)l
to = sup n
exists
+
.
(ynOYn(t)).
Since
x is u n i f o r m l y
~ > 0 such that
Ix(s 1) - x(s2)
I < E .
O<__sI 's2~t [s1-s21<6 U~C Since
x
U~C X and y n O Y n
n
IXn(S)
sup O<s
--
y
,,
- x(s)[
there
exists
<
0
and
sup O<s
[ynOYn(S)
- y(s)
I < 6
for all n > n . --
Thus
o
for n > n --
o
sup O<s
IXnOY,nO%#n(S)
IXn~S) -
sup
xs
- x,y(s)[
+
~ ~
<
2~
O<s
and thus
- -
0
@ is c o n t i n u o u s
for
(x,y)6
C x D . o
n
o
such
that
8O
Consider
the function
the continuity
~. This
assume
sup O<s
function
that x n § x ~ C
I(x n + yn ) ~ Yn(S)
is obviously
- (x + y)(s) I !
sup Ixn o Vn(S) - x(s)I +
sup ly n o Yn(S) - y(s) I O<s
The first term tends
to zero due to the
(put Yn = Yn and y = e) and the second
Let A be a stochastic
process
in D . From lemma o
in D
o
first part of this
independent
Define
lemma
since Yn § y"
2 it follows
N = N~A ~ in D o is well-defined. % N
For
and Yn + y' Then
O<s
a process
measurable.
of N, which
that the stochastic
No by No(T)
=
also is process
T) - T and No, t by
N (tT) o,t
(T) = ~
o
9 In section
4.1 we made use of the fact that
%
No(~)
d --+ W as § ~. In order to state
now let W be the Wiener
Lemma
process
a functional
in D with E W(t)
correspondence
= 0 and V a r W(t)
we = t.
3 N
o~t
d ----+ W as t § ~.
Proof
Since No, t is a process follows
from S k o r o h o d
it is sufficient
o,t for each ~ R
(~)
with i1957,
stationary
snd independent
p 151) combined
increments
it
with t h e o r e m A 1.10 that
to show that d
w(~)
+ as t § ~. For fixed
(t~) ~o,t(T ) = ~
o
T we have
d ~ ~w(1)
d =w(T).
I
81
We are now ready to state the functional limit theorem corresponding to t h e o r e m I.
Theorem 2 Let A be a stochastic process
in D
and let N = NoA be the corresponO
ding doubly stochastic Poisson process. and St a positive
p-varying
Let m be an element in D
O
function with 0 < p < ~ and S a stochas-
tic process in D. Define A t by At(T) = A(tT) - m(tT) St
and N t by
Nt(~ ) : N(tT) - m(tT) St
If
~(t) lim ~ = ~, 0 < ~ < ~ ~ and if t§
At
d
~ S
as
t § ~ , then
St d N
~ S + Woh
t
where h(~) = KT 2p
as
t -~
and S and W are independent.
Proof We have
Nt(T )
N(A(tT)')
- m(tm) = N(A(tT)) St
=
No (A(tT) St
- A(tT)
+ A(tT)
St (A(tT))
+ At(T) = N o,S 2 t
2 St
- m(tT) St
9 + At(z)
Thus we have
Nt = N
2 o At + At o,S t
where ~t(T )
A(tT) 2 St
At(~) St
~(tT) 2 St
From t h e o r e m A 1.10 and the fact that lim S t = ~(cf Feller (1971, t-~
=
82
p 277)) it follows that
At
P s0
as
t§
St P where --~ means
'convergence in probability'
and 0 is the function
identically equal to zero.
Further it follows from Feller (1971, p 277) that ~(tT)
u,c
KT2p
2 St
~(tT)
a(tT) = h(w)
since
2 St
=
~2 tT
B2 tT 2 St
for
T > 0
(strictly speaking it follows from Feller that the convergence is uniform on compacts not including zero,
but since p > 0 and ~ & D
o
it is not difficult to realize that zero may be included) and thus P t
'~ h
as
t §
From the assumptions
it follows that At----~ S and thus (cf Billingsd ley (1968, p 27)) (At,A t) ~ (S,h). Since lim St = ~ it follows from t-~ d lemma 3 that N
2 o,B t
pendent it follows ()
~ ~ W and thus, since N
(cf B i l l i n g s l e y
2 and (At,A t ) are indeo,B t
(1968, p 21)) t h a t
2,At,~t) _ d, (W,S,h). From lemma 2 it follows that (N~
2 ~ At'At )
o,~ t ,St is well-defined and that, since W is a process a.s. in C and At d a process in Do, (N 2 o At,At) ---* (Woh, S). o,~ t Since h E C it follows that W~h is a process a.s. in C and thus it o
d follows from the second part of lemma 2 that N t
~ S + WOh.
m
83
Remark 3 If it is possible to choose m(t) = bt and Bt = / ~ t h e n theorem 2 is very close to a result due to Serfozo (1972:2, p 314).
m Example I We will consider the case studied in section 2.3.2 where the intensity process is generated by a renewal process. Let N = {N(t); t > 0) be an ordinary renewal process which is independent of a sequence ^
co
{Xk}k= 0 of independent and identically distributed nonnegative random variables with distribution function U. (The use of ~k and N instead of Xk and ~ as in section 2.3.2 is only since the notation N is used in a different sense in this section.) Let T0,TI,... be the epochs of events corresponding to N, i.e. Tk = ~-1(k). Put ~0 = @0 and ^
co
[k = Tk - Tk-1" Then {$k)k=0 is a sequence of independent and identically distributed random variables. Let F denote their common distribution function.
Put ^
~(t) = ~N(t) " Then
A(t)
N(t)-1 t = / ~(s)ds = 0 k=0
2 < ~ and E ~2 Assume that E ~k k < ~ and put m = E ~k' 2
= Var ~k'
^2 fl = E ~k and f2 = E ~k" Then we have
A(t)
- mt =
N(t)-I Z k=O
(~k - m)~k + ( ~ N ( t )
- m)(t
- T~(t)_l).
Thus, except for the last term which has no influence on the asymptotic behaviour, A(t) is expressed as a randomly indiced sum of independent and identically distributed random variables with zero mean and finite variance. Put
84
At(~ ) = A(tT>
- mtT
Since N(t) ~ t
1 fl
and Var (~k - m)~k = ~2f2
it follows from the proofs given by Billingsley (1968, pp 146 and 149) that
A
--~
~
W
as t §
t From theorem 2 it then follows that d
2
Nt ~
f2 fU
~
+ m W
ast§
where N(tT)
Nt =
-
mtT
/~
m
Example 2 We will consider a case which illustrates the advantage of the rather general formulations of the theorems.
Let, like in example I
{ k)k=0 be a sequence of independent and
identically distributed nonnegative random variables with distribution function U. Put l(t) = ~ [ ~
where ~
means 'integer part'.
If E 22 < ~ this case is included in the case considered in k example I and therefore we assume that E 22k = ~" It follows from Feller (1971, p 577) that a non-degenerate (one-dimensional) limit theorem for A(t) exists if and only if U belongs to the domain of attraction of a stable distribution. Then for some Y , 0 < y ~ there exist norming constants a
t
and B t such that
2,
85
A(t)
-
d
st
S(1)
as
t +~
Bt
where
S(I ) is a r a n d o m
exponent E
y and
with
y = I will
f r o m n o w on b e
A t be d e f i n e d
a stable
I ~ -varying
B t is a n e c e s s a r i l y
= ~ if 0 < y < I and m = E I
k
Let
variable
distribution function.
Then
< ~ i f I < y < 2. The --
k
with
case
excluded.
by
At(~ ) = A(tT) - m(tT) #t where =] 0
~(t)
Let
S be
Imt
a stochastic
if
0 < Y < I
if
I
process
< 2
in D w i t h
stationary
and independent
d increments
such t h a t
At(1)
cess
if y = 2 and this
cess
in C. F u r t h e r
For a l l y w e h a v e
It f o l l o w s
~ S(1)
is e x a c t l y
as t § ~
the only
S is a p r o c e s s
in D
O
. S is a W i e n e r
case w h e r e
if and only
pro-
S is a p r o -
i f 0 < u < I.
d I = t ~ S(1).
S(t)
from Skorohod
(1957,
p
151) t h a t
d At
Since
~ S
as
^2 = ~ w e h a v e E ~k
t §
K = lim t-~
from theorem
N
t
2, or f r o m t h e o r e m d ---+ S
~(t)2 = 0 for a l l y a n d t h u s
it f o lY l o w s
Bt I and Skorohod~s
result,
that
also
as t § ~
where
Nt(T ) = N(t~)
- m(t~) Bt
m
86
Added
in p r o o f
Rootz6n
(1975) has
Suppose
that
shown the following
a t and 8t are constants
N(t)
- at
one-dimensional
with
lim t-~
d ---+ some r a n d o m variable
8t
=
~
result.
Then
.
R as t § ~
~t if and only if a t -- < ~2
< = lira sup t+~
and furthermore,
A(t)
-
at]) = E (e iuS)
(exp{iu
E
exp{-(<
at u 2 - -~) ~-}
~t
+ o(I)
In this
independent
5.
and W is normally
ESTIMATION
Consider
for some random variable
d R = S + {~-<W where,
case
a r a n d o m variable
estimate
t +
like in section
distributed
is, given
~ defined
4.1,
S and W are
with EW = 0 and Var W = I.
on the same p r o b a b i l i t y
A and a point process
M, is a Poisson
considered
S as
OF R A N D O M VARIABLES
as a r a n d o m measure A = ~
+
Bt
process
with intensity
an observation
of ~ in terms
N, where
N for given
~. The p r o b l e m
of N on X E B(X), o
of the observation.
space
to be
to find an
86
Added
in p r o o f
Rootz6n
(1975) has
Suppose
that
shown the following
a t and 8t are constants
N(t)
- at
one-dimensional
with
lim t-~
d ---+ some r a n d o m variable
8t
=
~
result.
Then
.
R as t § ~
~t if and only if a t -- < ~2
< = lira sup t+~
and furthermore,
A(t)
-
at]) = E (e iuS)
(exp{iu
E
exp{-(<
at u 2 - -~) ~-}
~t
+ o(I)
In this
independent
5.
and W is normally
ESTIMATION
Consider
for some random variable
d R = S + {~-<W where,
case
a r a n d o m variable
estimate
t +
like in section
distributed
is, given
~ defined
4.1,
S and W are
with EW = 0 and Var W = I.
on the same p r o b a b i l i t y
A and a point process
M, is a Poisson
considered
S as
OF R A N D O M VARIABLES
as a r a n d o m measure A = ~
+
Bt
process
with intensity
an observation
of ~ in terms
N, where
N for given
~. The p r o b l e m
of N on X E B(X), o
of the observation.
space
to be
to find an
87
Non-linea~ ~gimagion
5.1
Let ~ = N • M • R with elements ~ = (v,~,z) be our sample space. Let N and M be endowed with the vague topology and R with the usual topology.
With the product topology ~ is a Polish space (cf Bauer
(1968, p 169)). Let
B(~) be the Borel algebra on ~ and let Q be a
probability measure on (~,B(~)), i.e. Q is the distribution of (N,A,~).
Let thus
(~,B(~),Q) be the probability
space on which the
random elements, to be considered in this section, are defined. Since ~ is Polish, there exists for every sub-~-algebra of B(~) a conditional distribution
relative that sub-~-algebra
(cf Bauer
(1968, p 258)). Conditioning will be denoted by a superscript.
Let
B(N),B(M) and
respectively.
B(R) denote the Borel algebras on N, M and R
B(~) is thus the product of these algebras.
by BN, B M and B R sets in set B N • M • R. By
B(N), B(M)
B'(N)~B(~)
way B'(M), B'(R) and B~,
Denote
and B(R) and by B~ the cylinder
we mean (B~ ;
BNEB(N)}.
In the same
B R' are defined.
Consider the probability measure Q. Since N is assumed to be a Poisson process trarily.
for given A = ~
M we can not choose Q quite arbi-
For each Q on (~,B(~)) we may define a marginal probability
measure H
(for A)
on
(M,B(M)) by
Q {B N • B M • R} = f n BM Consider a set X o E B ( X ) (~60
H(B M) = Q(B~}. Thus Q must satisfy
{B N} H{d~)
.
and let O' be the ~-algebra generated by
; v(B} ~ x} for all x 6 R
and B 6 B ( X O) where B(X O) is the re-
striction of B(X) to Xo, i.e. B(Xo) = { B N X ~ ; B 6 B ( X ) } .
Definition
I
A set 0 ' ~ 0' is called an observation of N on X . O
88
Definition 2 An 0'-measurable
function sx : ~ § R is called an estimate of ~ in
terms of N on X .
We need some criterion to decide whether an estimate is good or not. Consider therefore a loss function L, i.e. a B(R) x B(R)-measurable function L : R • R § R+
~-F0,~)"
Definition 3 An e s t i m a t e
~
of ~ in terms
according to L if E ( L ( ~ , ~ ) )
o f N on X
0
~ E(L(f,~))
is
called
the best
estimate
for any 0'-measurable
func-
tion f : ~ § R.
In a given situation a best estimate need neither exist nor be unique a.s(Q). We will in example 2 consider a case where an estimate, which is not a best estimate according to any loss function L, may be reasonable.
If E 0' L ( ~ , ~ )
< E 0' L(f,~) a.s(Q) for any 0'-measurable
f : ~ § R then ~
function
is the best estimate of ~, and especially if
L(x,y) = (x - y)2 we get ~
= E0'~. Thus we consider Q0'. Although
Q0' is the a.s.(Q) unique 0'-measurable
solution of
S Q0, (.) aQ = Q (0'O.} 0' for all 0'~ 0', this may be of little help.
If Q{O'} > 0 it follows from the elementary definition probabilities
that Q{ O'{]B~}
Q{Ba]0' }
=
Q{0' }
of conditional
89
where Q(BolO') is the conditional probability of B ~ @ B ( O )
We will now consider the case where X
given 0'.
is bounded. In theorem I it
O
will be shown that Q0' may be calculated as a limit of elementary de-fined conditional probabilities.
It will further be shown that
what in 'every day language' is meant with an observation of N on Xo, really is an observation in the sense of definition I.
Let X vo @ N
0
be bounded. Then v(X ) < ~ for all v ~ N . 0
the set O'(Vo) by O'(v o) = ( ~
For Vl,V 2 ~ N
Define for any
~ ; v(B) = Vo(B)for all B~B(Xo)).
the sets O'(v I) and O'(v 2) are either disjoint or equal
and further
~ O ' ( v ) = ~. Let d be a metric metrizing the topology vaN on X. Let (Bnl,...,Bnr) be a sequence of finer and finer partitions n
of X ~ (i.e. for each n and ~ = 1,...,r n the set Bnj. is a union of certain Bn+1,j, j = 1,...,rn+1, sets) such that B n j 6 B ( X o) and
lim n+~
max diam (Bnj) = 0. 1~_j~rn
Put 0 n ( v o) = ( ~ g 2
. ; V(Bnj)
=
Vo ( B nj.) for
I -~ j
~ r n ). --
0 !
For each v6 N thus O'(V)~n 0'. Define Qn (B~) : ~ § [~,I] by
Q ~B~t0~(~)~
if
~o~(~)
and
Q~O~(~)~ ~ 0
0
if
~0~(~)
and
Q~O~(~)~ = 0
QO'n (B2)(w) =
for every B ~ B ( ~ ) .
Theorem I For each ~ N
the set O'(v) is an observation,
Further for each B~E B(~) we have
i.e. O'(v)~O'.
O' lie Qn {B~} = QO'(B~} a.s. (Q). n-~oo
9o
Proof Consider any ~o 6 N. The set 0'(v o) is characterized by the vector (xl,nl,x2,n2,...,Xm,nm) where Xl,...,x m are the only points in X ~ with w ({x}) > 0 and where n. = v ({x.}}. Denote for each x 6 X by o j o J o Bn(X) the set among Bn],...,Bnr
which contains x. Then Bn(X) + {x}. n
Thus there exists n o such that for n > n o the sets Bn(Xl),...,Bn(Xm) are disjoint. Thus 0~(~ o) + O'(v o) for n > n o which implies that 0'(~o)6 0' since O~(Vo)6 0' for each n.
Let 0'n be the a-algebra generated by {wE 2 ; ~{Bnj} _< x}, x 6 R, j = 1,...,r n. Thus 0~(~)~ ~0'n and 0~g~ 0'n+1 " Define 0'~ to be the ~algebra generated by
[j 0'. If we can show that 0' = 0' then the n n=1 theorem follows from Doob (1953, pp 611-612).
Since 0 " ( 0 '
and since 0" is a c-algebra it is enough to show that
v{B)(~) : ~ § Z is 0"-measurable for each B E B(Xo). Put n
D = { D E B ( X o) ; ~{D}(~) is 0"-measurable}
[j B nj .and . Since X ~ = j=1
since v(.)(~) is a measure for each ~ it follows (cf the proof of lemma 1.1) thatP is a Dynkin system.
For any closed set F in X the set X o ~ F 6 lim n§
[~ X6Xo~F
Bn(X) = X o N F .
D since
Since B(X o) is generated by (Xo~ F ;
F closed in X) and since for closed sets F I and F 2 also FI~]F 2 is closed and thus X o ~ F I ~ F 2 ~
~ it follows (of Bauer 1968, pp 17-18) that
p = B(Xo).
m Consider now the case where N and ~ are conditionally independent given A. To motivate a special study of this case, we just note that this is the case if ~ = A{B} for some B 6 B(X). In order to make our formulae somewhat more handsome, we denote the marginal distribution
of (A,~) by V. Thus V is a probability measure on (M • R, ~(M x R))
91
defined by V{B M x BE} = Q{N x BM x BR} . From the conditional independence we get Q(B N x BM x BR) = f
QB'(M)(B~) QB'(M)(B~} dQ =
= / n~{BN}QB'(M){B~}Q { N
• d~ • S} =
BM
BM
since
BM
i~ (W{Bn(Xi)} )niI e -w{X~ =I
Since H {O'(v)} = const. n
,
where Bn(X) and the vector (x],n 1,...,xm,n m) characterizing 0'(v) are defined in the proof of theorem I and where the constant depends on n and v but not on ~, it follows from theorem I that a.s. (Q)
(B(Bn(Xi)))niI e
~.!
Q~
=
lim
-~[x o )
V{d~ x BR }
I
n'+~
( ~ {Bn (x i ) } )
o
e
ff[d~)
1
for m~ O'(v) characterized by (xl,n I .... ,Xm,nm). Specializing further we consider X = R and X ~
(0,
and, see sec-
tion 1.3.3, the case where the model for the intensity is a stochastic process {l(x) ; x s
with distribution H on
(Io,B(Io))where
Io is a set of nonnegative Riemann integrable functions with finite integral over bounded sets. Let the space (~,B(~),Q) be modified in the obvious way. Then a.s. (Q)
92
t
m
QO'(BI)(~)
=
-S n(y)dy
I
( H n(y)dy) e o i=I B n x i)
Zim
o
V{dn x BR}
t
n-~co
-f ~(y)dy
m
/(~
n(y)ay)
I
e
0
H{dn}
I 0 i=l B n x i)
for ~O'(v)
by ( X l , 1 , . . . , X m , 1 ) .
characterized
(Multiple points do not occur.)
(t(j 2n
tj 2 Z]
=
n6 I
~(x) is continuous
O
such t h a t
2n
Jim n +~
l-
-
1)
Choose e.g. Bnj
f
Then for each x.E m (O,t] and each a t x = x. we h a v e 1
n(y)dy = n(x i) .
gn(X i )
Thus a.s. (Q), since a Riemann integrable
function is a.e. continuous,
t
-f
m
H (Bnlx i )
n(y)~y
n(y)dy) e 0
i=I lim
t_ (2 n)
n-~m
m
t = ( n n(xi))
e 0
i=I If e.g. t
-fn(yl~y supfl n
I 0
m 2n 0 H ((t--)B~n n(y)dy) e i=I (x i )
I+~ I
H{dn} < ~ a.s.(Q)
93
for some ~ > 0 it follows by uniform integrability
(cf Billingsley
(1968, p 32)) that
t
- f ~(y)dy
m
f ( H n(xi)) I Q0'{B{)(~ ) =
e
0
V{dn x BR}
i=I o
t
- f ~(y)dy
m
f ( ~ n(xi)) e I
a.s. (Q) .
0
~{an}
i=l
o
Remark I Two 'extreme'
cases where the condition of uniform integrability
holds are when
sup f O<x
~m(x) [{d~} < ~
for all m or when I
is o
0
the set of constant functions which means that N is a weighted Poisson process.
To see this we will show that t
- f n(y)dy
m 2n I =f ( H ((7-) Io i=I
f n(y)dy) e Bn(X i)
0
)2 H{dn}
for all n,m and Xl,...x m is less than a constant which does not depend on n.
Assume that Cm =
sup
f
~m(x) H{d~]
<
9 Using H 6 1 d e r %
O<x
0
equality twice we get
2 n 2m
i• 2 n 2m (7-)
m
f
n
I~
i=I
( f
n(y)dy)2~{dn} •
Bn(X i )
I m H (f ( f ~(y)dy)2m H {d~}) TM i=I I ~ Bn(X i )
in-
94
2 n 2m
(~)
m
<__ (7-)
H i=I
1
0
1
- - 2m
(
(f I
(I _ 1 ) 2m
2
l
n2m(y)dy) 2m)
)~
H{dn}
Bn(X i ) I
2n
m
= -t- ~i=I (f o (Bn(Xi J )
q 2m ( y ) d y )
lI{dq})
TM
<_.
I
2n < (~-)
--
m
(t__
H i=I
2n
m C2m)
= C2m 9
If N is a weighted Poisson process we have
2m
i = fI Example
(qm e-tq)2 ~ { d n }
<_ ( t )
e
0
I
Let PI'P2 ~ M
be given and define the distribution of A by
Pr{A = pk } = Wk, k = 0,1 , where w0 + Wl
Put { = k
if
=
I.
A = Pk" Then
EO'L(f,~)
= QO'{m; p = PO } L ( f , O ) + qO'{m; P = Pl } L(f,1).
Thus, if
L(x,k) =I 0
if
x = k
I I
if
x#k
the best estimate of 6 is equal to k if QO'{~; z = k) > QO'{~; z = I - k) .
=
95
Example 2 Consider X = R and X ~
=[0] ,t
Put like in section 2.1
and let N be a weighted Poisson process
X(t) = ~ where ~ is a random variable with
distribution function U. Put $ = ~. As already shown
X
QO'{~;
f ym e-Yt U{dy} 0
z ~ x} =
oo
f ym e-Yt U{dy} 0 where m = v{[0,t]}. For L(x,y) : (x - y)2 the best estimate is
co
~ ym+1 e-Yt U{dy} 0 co
f ym e-Yt U{dy} 0 This result, as pointed out in section 2.1, was given by Lundberg (1940, p 71) but motivated from a different point of view.
Assume now that N is a P$1ya process, i.e. B U'(x)
xB-1
e
-ax
r(B)
if
x>O
0
if
x<0
x
m+~-1 (~ + t)m+B ~
0
r(m + B)
=
where ~,~ > 0.
Then
Q0'
{~; z < x} = f -
and thus for L(x,y) = (x
~+t
_
y
)2
e
-(~+t)y
the best estimate is
dy a.s. (q)
96
Instead of this criterion we may consider L(x,y) = Ix - YI" Then the best estimate In t ~ s
case ~
~
is a median of the conditional
can not be analytically
given.
It may be regarded as natural to choose ~ tional distribution.
distribution.
as the mode of the condi-
Since the density of that distribution is propor-
tional to
ym+B-1
e-(a+t)y
for y > 0 we get
~
= max(O,
B - I + m)
~§
'
This last estimate is not a best estimate in the sense of definition 3.
To compare these estimates we have for a = B = I and t = 5 computed ~x for m = 0,1,2,...,10.
In figure 2 these estimate are drawn. Though
the estimates only have m e a n i n g for m = 0,1,... m varies continuously.
they are drawn as if
97
L(x,y) mode
= (x - y)2
L(x,y)
Ix
yl
1.5
0.5
0
i
|
I
1
2
3
m
9
,
9
9
9
~
5
6
7
8
9
9
i
m
lO
Figure 2: lllustration of estimates in a P61ya process
Example 3 In this example some results derived by Rudemo, specialized to doubly stochastic Poisson processes, will be surveyed.
Consider the case indicated in section 2.3.1 where (~(t) ; t ~ 0} is a Markov chain with stationary transition probabilities and with distribution H on (Io,B(Io)). Here I
is the set of rightcontinuous O
piecewise constant functions R+ § R+ = ~,~) with only a finite number of jumps in every finite time interval and with range (~k ; k = 1,2,...K) where K may be finite or infinite. Put Hki(t) = Pr(~(s + t) = ~iI~(s) = ~k ) qki
ki (0)
(right-hand derivative)
wk(t) = Pr(~(t) = ~k ).
98
Consider X ~ = D , t ]
9
Let 0' be the c-algebra of observations t
and put
0 !
w~(slt) = Pr t { k ( s )
= kk }
and K
~(slt
) =
where thus ~ ( s l t )
Z ~k Wk( s l t ) k=1 is the best estimate of X(s) in terms of N on
~0,t] according to L(x,y) = (x - y)2. To simplify wk(t) = w k ( t l t ) a n d
~x(t)=
notations put
~x(tlt ).
This example turns out to be a special case of a partially observed Markov chain, since the vector process
(N(t),~(t)) is a Markov chain.
Consider first the case K < ~ , treated by Rudemo (1972). Rudemo (1972, p 323) shows that in intervals between events
K
Wk' (t) =
Z i=I
w~(t) qik + (k~(t)
- kk ) w~(t) a.s.
while if an event occurs at t
wk(t) = ~ k ( t -
O) ~(t
~k - O)
a.s.
Consider now the general case where K is not assumed to be finite.
Following Rudemo (1973:1) and (1973:2) we define for t > 0
Hki(t) = Pr{k(s + t) = k.m, N(s + t) - N(s) : OI~(s) = kk}.
These probabilities may be obtained from
Hki(t)
= ~. Hkj(t) qJi - ~i Hki (t) J
99
and I ~
i~
~ = i
Hki(O) = 6ki = I 0
if
k # i
Let H(t) be the matrix with elements Hki(t) and D the diagonal matrix with elements 6kik i and let w(t) and w~(t) be the row vectors with components wk(t) and w~(t) respectively.
Then
wa(t) = ~(0) H(t0) D H(t I - t0)D...D H(t - tv(t)_l ~ S
a.s. (Q) where v(t) = v { ~ , t ] } ,
tk = v
-I
(k) (cf section 1.3.4) and
S is the normalizing operator on row vectors, defined by
Pk
(PS)k = Z pj
for p satisfying p~ ~ 0 and 0 < Z. P~o < ~" From this it
9
j
J can be shown, see Rudemo (1973:2, 271), that Ik a.s.
wk(t) : wk(t - 0) ~(t
- 0)
at events, like in the case K < ~, while t
<(t) :
<(u)qik + (~'(u) - ~) ~(~du
f s
i
a.s. for an interval (s,t] without events. Since In(u) appears in the equation it is not linear. Rudemo (1973:1, p 597) shows that ~(t)
= (~(t)S)k in (s,t] without events where w(t) is a row vector
with components given by t
~(t)
= ~k(s) + f [z ~i(u)qik - ~k ~k (u)] du s
i
a.s. which is a linear equation. These systems of equations are sometimes more suited for solution with a computer than the vector-matrix product representation.
100
In section 5.3 we will consider the calculation of w~(t) for a special case with K = 2.
Consider now, following Rudemo
(1975), the calculation of ~k(slt).
If
s > t and if H(t) is the matrix with elements Hki(t) we have w~(slt) = ~ ( t )
H(s - t) where w*(slt) is the row vector with compo-
nents ~k(slt).
If s < t and if P(t,s) is a matrix with elements 0 r
Pr t{l(s) : li' l(t) = Ik } Pk,i(t, s ~(t)
provided Wk(t) > 0 we have w (sit) = ~x(t) P(t,s) a.s.
As mentioned above the vector-matrix product representations be suited for computer calculations.
Rudemo
may not
(1975) shows that for
s > t s
=
+ I z t
(urt)
qik du
i
and for all s < t (i.e. also at events)
t
9{
~(slt) = ~(t) + f z (~(ult)qki ~k (u) s i
~. (u) l
x
- ~k(~It)
qik w~(u)
) du .
The above formulae seem suitable when t is fixed and s is varying. Rudemo
(1975) also gives recursive equations
for fixed s and varying
t and for both t and s varying but t-s constant.
101
With minor changes all the results given in this example hold true also when the intensity is a function of a Markov chain with stationary transition probabilities.
Problems
of this kind have also been studied by Snyder
(1972:1) and
(1972:2) when the intensity is a function of a vector M a r k o v process.
We will indicate how some of Rudemo's results may be proved. Assume that K < ~ since then all regularity assumptions are fulfilled. Define the random variables Z(t) and ~k(t) by
t Z(t) =
v(t)-1 H l(tk)) e k=O
S ~(u)d~ 0
and I
if
l(t) = kk
0
if
k(t) # kk
~k(t) =
From our general results it follows that
=~ (slt) = E ~kCS) Z(t) E zCt) and
=
E Z(t)
Consider s = t.
Assume that an event occurs at t, i.e. that ~(t) - ~(t - O) = I.
Then
~[t)
= E ~k(t) Z(t)
E ~k(t) k(t) Z(t - O)
E Z(t)
E k(t) Z(t - O)
k
~k E ~k(t) Z(t - O)
~k ~k(t - O)
=
E l(t).Z(t - O)
~(t
- O)
since k(t) = k(t - O) a . s . .
Assume that no event occurs at t.
Put wk(t) = E Ck(t) Z(t) and thus ~ ( t )
= (~(t)S) k. Note that
~k(t) = wk(t) E{Z(t)IX(t) = kk }.
For A > 0 such that . . . . . . t . . . . ur in the interval ~,t§
we have
102
t+A f x(u)du
t
- ~k(t)) Z(t)} =
- Ck(t))IA(t)
: ki}E(Z(~)Ik(t)
- ~k(t))IX(t)
= A i} ~i (t) .
wk(t+A) - wk(t ) = E{(~k(t+A)
e
t+A
S x(u)a~
-
= Z E{(~k(t+A) i
t
e
: Xi)~i(t)
t+A
S ~(u)au
-
= Z E{(~k(t+A) i
t
e
If i # k we have t+A
S
-
E {(s
+ ~) e
X(uldu
t
_ ~k(t))Ik(t ) = ki } =
t+A -
S
= E {e
x(u)du
t
IA(t + A) = kk' l(t) = k i} ~ik(A) =
= (I + O(A))(qikA
+ o(A)) = qik A + o(A)
and if i = k we have t+A -
E {(~k(t + A) e
S x(u)au $
- ~k(t)lil(t ) = ik } :
t+A = E
-
{(e
S xCu)du t
-
1)[x(t + a) = ~k' ~(~) = Xk } nkk(~) -
(I - nkk(n)) = - xka(1 + qkk A + o(A)) + qkk~ + o(~) =
= qkk A - ikA + o(A) 9
Thus we have
~k(t + A) - ~k(t) = A Z ~i(t ) - AlkWk(t ) + o(A) i qik and since a similar reasoning
goes through for A < 0 we have
~'wk(t) = iZ #i(t)qik - Ik#k(t)
.
:
103
Consider s > t. For A > 0 we have
E (~k(S + A) - ~k(s))Z(t) E Z(t)
= Z E {(~k(S + A) - ~k(S))lk(s) = ki} ,~(slt ) = i
iWk
= i~k (qikA + o(A)) ~ ( s l t )
= ~ i~ ~ ( s l t )
+ (qkk A + o(A)) ~ ( s l t )
=
qik + o(A) .
Since a similar reasoning goes through for A < 0 we have
@'~(slt) i
Consider s < t. For A > 0 such that no events OCCUr in the interval (s, s + A) we have, whether ~n event occurs at s or not,
"a(Sk + Aft) - ~ ( s l t ) k '
= E {(~k(S + A) - ~k(S)) Z(t)} E Z(t)
E {Z(s)(~k(S + A) - ~k(S))
Z(s + ~) Z(s)
z(t) } Z(s + A) =
E z(t)
= .E. (~i(s)~ij(A) E (Z(s)IA(s) = k i) "
E Z(t)
Since I + O(A) if i#k,j=k
E{(~(s+~)
- ~k(s)) ~IX(s)
= X i, ~(s*~) = Xj} =
-I + O(A) if i=k,j#k 0
otherwise
~o4
Ni~(4 ) = qij4 + o(4)
if
i # j
and
~i(s)E{Z<s)11(s) = ~i } S { ~
= ~j} =
=~(s) E{Z(s)} E{Z(t)11(s+a) = I.} E{Z(t)} m{z(s+a)lx(s+4) = t.} J
~](s) ~{z(s)} ,](~+~It) ~ <(~) ~](slt) + o(4)
E{z(s+~)},](s+~)
~](s)
we have
<(~
- Z (qki A + o(A))
(1 + 0 ( 4 ) )
( [k(s)~
i
<(~)k
+
(sl~)" + 0 ( 4 ) )
:
o(~))(
~
o(~)) -
~(s)
(<(sit)
= 4 z
qik ~ ( s )
~(slt ) qki..N~(s))+ o(~).
T,k( s )
,~(s)
i
For 4 < 0 a similar reasoning goes through if no event occurs at a. Therefore we consider A < 0 if an event occurs at s. Put 4' = - A. For A' such that no events occur in the interval (s-4',s) we have
~ ( s l t) - ~ ( s - & ' b )
= .Z. q ( s - & ' ) H i ~ ( A ' ) l,J Z(S)
rZ(t) ~ s) =
9 E 'z-%7
E{Z(s-A');~(s-A')
A(S-A') = ~i' l(s) = I-} '
.
~j }
Z Z(t)
In this case we have
z(s) ~J1(s-A
E {(~k(S) -
=
15 ( I + 0(A'))
if
i # k, j = k
lj (-I + 0(A'))
if
i = k, j # k
0
otherwise
) = li,1(s) = 13} =
= Xi]
105
and ~i( s-~ ~ ) z { z ( s - ~ ' ) l x ( s - ^ ' ) E
= xi}
E
{~l~(s)
= xj}
Z(t)
E Z(t)
E {Z(s)ll(s) = i.} J
9
*
E Z(s)
O(A')
=
~(s)
+ o(~' ).
Thus we h~ve
Thus Wk(slt) is continuous
in s for all s. If no event occurs at s we have
Z (Wk(sJ t)
q i k
~ ( s It)
qki
Consider finally the problem of calculating G(t
= Pr{N(t) = 0}.
It is shown by Rudemo
(1972, p 325) for K < ~ and by Rudemo (1973:1) t in general that G(t) = exp ~ / ~ (u) d ~ where ~ ( u ) is calculated 0 as if no events occur in -.~(0,u]" i
This relation between estimation
and calculation of G(t) holds true
for a large class of doubly stochastic Poisson processes.
To see this, we consider a nonnegative (X(t)
stochastic process
; t ~ 0} possible to use as a model for the intensity
section 1.3.3) and such that
(cf
~o6
U
f
-
d
for almost
Assume,
EE(e
all u
~(v)dv
0
= -
E(I(u)
e
~(v)dv
0
> 0.
for example,
that
Pr{Jl(t
lim
f
-
)]
+ 4)
- t(t)
I L s} = 0
4§
for all E > 0 and almost
E(12(t))
for almost
< C <
all t > O. We will
sufficient. bility
all t > 0 and that
show that these
From the discussion
in the sense of Doob,
to use as a model
in section
it follows
u+4
1
~
1.3.3 about
that
{l(t)}
are
integra-
is possible
for the intensity.
Further
lim k+O
assumptlons
(e
U
- f ~(v)dv 0
- f l(v)~v 0 --
e
}
U
- f ~(v)dv e 0
= - t(u)
for all realizations
Since surely
for almost continuous
of derivation however,
are continuous
all u > 0 our assumptions in t = u it only remains
and integration
by u n i f o r m
p 32)) since
of l(t) which
imply that l(t)
(cf e.g.
is almost
to verify that the order
may be interchanged.
integrability
in t = u.
This
Billingsley
follows, (1968,
107
u+A
f
-
sup ~I~ (e A>-u
u
l(v)dv
o
-
-
e
f
)1
u+A
<_ sup
2
l(v)dv
o
•
u+A u+A
~1 ~( /u l(v)dv)2 : sup ~2
A>-u
A>-u
f
f
u
u
~,X(v)~(w)dvdw <_c.
0 T
For almost all u h 0 the estimate k~(u) = E u(k(u)) is given by U
-S X(v)dv
u
- f X(v)dv
0 k~(U) = El(U) e u
= _ d__ log E (e du
0
)
S X(~)dv E e
0
if no events occur in (0,u].
Thus we have t
- f X(u)du
t exp [- f X~(u)dul = exp[iog E (e 0
0
)] =
t 0 and this without any Markov assumption. A similar relation has been shown by Rubin (1972, p 549) to hold also for point processes which are not doubly stochastic Poisson processes.
M Example 4 In this example we will consider a special case of a model for weak optical signals. The model has been studied by Macchi (1971) and Macchi and Picinbono (1972, pp 566 - 567).
Let {X(t) ; t ~ [0,T]) and {Y(t) ; t ~ [0,T]) be two independent and identically distributed normal processes. Assume that E X(t) = 0
lO8
and that C(s,t)
=Cov
that the sample
functions
and Leadbetter
(X(s)
=
Assume
(cf Cram6r
(1967, p 183)).
IZ(t)l 2
and define
X2(t) + y2(t)
=
events of an observation [0,T] corresponding
.
{l(t)
; t @ [0,T]} by
Let x I . .. ,xm be the epochs of ,
of a doubly stochastic
to l(t). Since
with mean value 2C(t,t)
Poisson process
l(t) is exponentially
it follows
I that the condition
of uniform integrability
distributed
from
holds,
and thus
T - f ~(s)ds
m
@
l(Xj))
E(l(t)(
~(t)
on
that
max E lk(t) = ( max 2C(t,t)) k k! < ~. Thus it follows O
further
of X(t) and Y(t) are continuous
Put Z(t) = X(t) + i Y(t) ~(t)
, X(t)) is continuous.
e
)
j=1
=
T - f ~(s)ds
m
z(( n a(xj ))
e
0
)
j=l is the best estimate to L(x,y)
of l(t) in terms of the observation
according
: (x - y)2.
Let t I ..... t n be in [0,T]. an expression
Following
Maechi
(1971) we will derive
for
n
-
T / ~(s)ds
j=1 with help of the Karhunen-Lo~ve
expansion
(cf Parzen
(1959, pp
278-283)).
Let {@k(t)
; k = 1,2 .... , t E D , T ] )
of the covariance eigenvalues.
be the normalized
eigenfunctions
C and let {~k ; k = 1,2,...} be the corresponding
This means that
I09
T / C(t,s) @k(S)ds = ~k@k(t) 0 and T / #k(t)#j(t) d t = 0
for all k,j = 1,2, ....
I
if
k=j
0
if
k~j
Then we have
C(t,s) = Z ~kCk(t)@k(S) k and
X(t) = Z @k(t) ~ k k Y(t) = Z @k(t)
k
where XI,X2,...
Xk
~'k'Yk
, YI,Y2,...
are independent normally distributed
random variables with mean values zero and variances one. Thus
Z(t) = Z @k(t) 2 ~ k k
Zk
where
zk=
Xk+iY
k
We will make use of the fact that
zk
=
where Z k and @k
Iz l e
i@ k
are independent random variables.
tially distributed with mean value one and
IZk 12 is exponen-
@k is uniformly distri-
buted on [0,2~.
Now we will introduce some notations. A vector (al,...,a n) is denoted by -~a. The set of all permutations of al,...,a n is denoted by
110
P(a ). Let i --I]
--~
be a vector of positive integers and denote by
the number of components in i
~n
equal to k. Let P
#(i) k
denote the set of
n
permutations of 1,2,...,n.
We have T -
E (( H l(tk)) k=1
T
X(s)ds )
f
0
e
n ~ ''IZ(tk)l 2)
= E ((
e
k=1
n
2 ~.~_~.i j @i(tk)@j(tk)Zi~j)
"
k=1 i ,j T 9 exm{-f
(Z 2 ~ '~~ "O ~i(s)@j(s)ZiZj)ds}] i,j
0
=
n
( H
2/Pi ~. '%. (tk) ~. (t_)Z i ~,. ) 9 k Jk ik Jk K k Jk
k=1
~'~n
-2 s pj Jzj 12 9 e
J
~
=
n
2/~.
~. '~. (t_)@.
ik Jk lk
K
Jk
(tk))
"
-2~ ~jrzjl2
n
" E (( H Z. Z. ) e k=1 ik Ok
J
).
We have
n E (( H
-2 Z ~jlZjJ2 Z.
k=1
ik
Z. ) e
8
)=
Jk 2
= ~ m (z. j
J
~
e-2njlzj + #(A)j)
J 9 e-2~JlzJ121
)=
J
eie'J (~(~)j
f
0
IZ(s)12ds )
111
which is equal to
co
-(l+2uj)x
1
f
x #Ln)J
e
d.x=H
jo
(I + 2#j) #(i-'n)3+1
J
#(&).~ "~ )(kn i (I + ) H= "1 + 2 p ')' 2~j I ik
(n j
if #(i_n)j =#(j~)j
for all j, i.e. if & 6 P(i_n), and otherwise
equal to zero. Thus T
~(s)ds
- f
n
E(( ]I l(tk) ) e k=1
0
)=
n
=
Z
Z
2~ik k=1 I + 2~ik r162
#(i__n)v Hv I + 2Pv
-n
n 2Ui H (I + 2Pv)-1 Z Z H I + __~Uik@ik(tk)r v -hi m---n6 Pn k=1
n
2~
Hv (I + 2~v)-I ~n~PnZ (k=IH iZ _ ~+1 12W i @i(tk)r
It is seen from the derivation that T - f ?~(s)ds
E(e
0
) = II (1 + 21237)-1
mk ) =
))"
112
Define the function f : EO,T'~ 2 § R by
2~ i f(t,s) = g i
I + 2~ i
el(t)
r (s)
9
In order to calculate f, its given form will mostly be of little help. We may, however, observe that
T f f(t,y) C(y,s)dy = 0 T
2~ i
=/(zz1+s . 0 a z
@i (t) @i (y)~j %j (y)r (s) )dy = z
~i + 2 ~ -
2~ = Y r162 i I + 2~ i
= Z i
I +
2~i
~i r162
(s) =
I
= C(t,s) - ~ f(t,s)
and thus f satisfies T f(t,s) + 2 f f(t,y)C(y,s)dy : 2 C(t,s) for t , s 6 EO,T]. 0
This equation is of the same kind as certain equations which will be discussed in the next section in connection with linear estimation. It follows from theorem 4 that f, at least among functions which are square integrable in each variable, is the unique solution of the equation. This also follows from the theory of Fredholm integral equations.
Let us, for notational reasons, define the matrix
F(t_.n) = {f(ti,tj)}
, i,j = 1,2 ..... n ,
113
and its permanent (cf e.g. Marcus and Minc (1965))
n
Per F(t ) : Z -n ~n~Pn
H f(t.,t. ) i=I z Ji
Put Per F(t O) = I.
Thus we have, and this is the result of Macchi (1971), T - f k(s)ds
n
0
E(( H k(t k)) e k=1
) =
H v
(I + 2 ~
)-i (Per
F(t_n))
and thus the expression for ~%(t) given by Macchi and Picinbono (1972, p 566) follows.
Let us, however, summarize. Let Xl,...,x m be the epochs of events of the observed doubly stochastic Poisson process on [O,T~. Then
Per F(t,_~n) k~(t) =
Per F(x ) --m
for t E ~ , ~
where f(t,s) is the solution of
T f(t,s) + 2 f f(t,y)C(y,s)dy = 2 C(t,s) , t , s 6 [O,T]. 0
If C(t,s) ~ C for s,t~ [O,T] it follows that k(t) ~ k for almost all sample function, where k is an exponentially distributed random variable with mean value 2 C. Thus the corresponding doubly stochastic Poisson process is a P61ya process and it follows from example 2 that k~ft~J = 2C(I + m) I +2CT This result may, of course, also be derived from the result given in this example. We have f -=
2C 1
+
2CT
and Per F ( t )
= n:f n"
and thus
114
t~(t) =
(m + 1 ) ' f m+l " mlfm
(1
=
+
m)f
=
2C(1 + m) '
1 + 2CT
Let us now specialize to the case C(s,t)
=
'
"
o
2 e -~ I s - t l
This means that X(t) and Y(t) are Ornstein-Uhlenbeck will consider the calculation of the distribution
~
&,C
2
>
O.
processes.
We
of the w a i t i n g
time to the first event. Put, see section 2.3, T O
G(T o).. = E e
- f ~(s)ds 0
and, see section 2.4 example
1.4, T O
- f ~(s)ds 0 GO(To ) = E(I(0) e
)
E X(O)
Let l~(t) be the best estimate of l(t) when no events occur in ,T]. From example 9 in section 5.2 it follows that
-2BT I +B e I - B 2 e -2~T
where
8 =
2
+ 4 2
and B =
9 From the final remarks in
example 3 it follows that
T
0
f G(T O].. = e
0
~(~)dT
I - B2
= e-(8-a)To
I - B 2 e-2BT~
= e
~T~
(c~
~ + 2~ +--7-------
This result has been derived by Siegert different 9
sinh(BT
o
))-I
(1957). His methods
are
115
Consider now G~
) . We have E I(0) = 2~ 2 and thus we get
G(T o ) G~
) = ~2o2
~
G(T o )
ITo(O) = ~ 2 o2
I~o(To) =
-2BT - a e-(B-a)To (I - B 2) 202
1+Be
o
(I - B 2 e-2BT~ 2
We end up this example by some comments on the case where ~ (t) is the sum of n squares of independent and identically Ornstein-Uhlenbeck processes. This case has been studied by Barndorff-Nielsen
and Yeo
(1969). Let Gn(To) and G~(To) be the quantities corresponding to G(T o) and G~
for this process. We note that G = G 2 and G ~ = G2.o
It is not difficult to realize that
Gn(T o) = (GI(To)) n = (G(To))n/2 and G~(T O) = G~(To)(GI(To))n-I=
G~
(n-2)/2
m 5.2
Linear estimation
Let N,A and ~ be defined on the same probability space as in section 5.1 and assume as in section 1.6 that E A2(B} < ~ for all bounded BEB(X)
and that E ~2 < ~ . Recall that M{B) = E A{B}and that
R{BI,B 2) = Cov(A(BI},A{B2~)
for bounded B, BI, B 2 E B ( X ) .
later purpose p by p{B} = Cov(~,N{B})
Define for
for bounded B ~ B ( X ) .
Let H be the Hilbert space L2(~,B(~),Q) , i.e. the set of B(~)measurable functions n : ~ § R such that f n 2 dQ < ~, with inner
776
product f ~i~2 dQ for n 1 ~ 2 ~ H .
In section A2 some facts about Hil-
bert spaces are summarized. H is our basic space in which the Hilbert spaces considered in this section are subspaces.
Let L(X o) for
X ~ B(X) be the Hilbert space spanned by N(B) for all bounded o B ~ B ( X o) and the constant one.
Definition 4 An element N
on
~ o
X
~ in L(X~ ) is called a linear estimate of ~ in terms of
.
Definition 5 A linear estimate estimate
~
of ~ in terms of N on X
o
is called the best linear
if E(~ ~ - ~)2 ~ E(n - 6) 2 for all ~ L ( X o ) .
As shown in theorem A 2.2 it is no restriction to assume E~ = 0 and to let a linear estimate be an element in L(X o) = S((N(B) bounded B ~ B(Xo))).
- M(B)
;
From the projection theorem it then follows that
the best linear estimate ~
of ~ is the unique solution of
E(~ ~ - ~)(N(B) - M(B]) = 0 for all bounded B ~ B ( X o ) .
This solution
will sometimes be denoted by E(~iL(Xo)).
In order to calculate the best linear estimate theorems 2 and 3, which we believe have independent
interest,
are helpful.
Theorem 2 For bounded X o ~ B(X) every ~ L ( X
f(x) X
(N(ax)
o
) has the representation
- M(~x~)
o
for some a.e.(M) uniquely determined B(Xo)-measurable f : X
o
§
9
function
117
Proof For every q E L(X o) there exists ql,q2,...,
nn = ~
such that for any n
fn (x) No{dX} o
where fn is a simple function and No{B} = N{B} - M{B} for all B e B(Xo) , with the property lim E(q n - q)2 = 0. Since lim E(q n - q)2 = 0 n-~ n§ implies f X
lim E(q n - qm )2 = 0 and since E(q n - qm )2 n,m+~
(fn(X) - fm(X)) 2 M{dx}
o measurable
(cf lemma 1.3b) there exists a B(X ~
_
function f such that
lim
f
n~
X
(fn(X) - f(x)) 2 M{dx} = 0 . o
The function f is determined and finite a.e.
(M), i.e. on X
o
- E
where M{E} = 0. Since M{E} = 0 implies P{N {E} = 0} = I and since o X
o
b o u n d e d implies M{X } < ~ o
and v{X } < ~ for ~ii ~ @ N the random o
vari able
= S X
f(x) ~ o {~x} o
is determined a.s.
The uniqueness
(Q).
follows from the above, and if we can show
E(q - ~)2 = 0, then the theorem is proved.
There exists a subsequence xEX
o
lira f k§ X ~
{fnk} such that
- E where M{E} = 0. Thus for all v 6 N fnk(x)v{dx}
= f X
f(x)v{dx}
o o N over a b o u n d e d set X
M{X o} < co it follows that
o
k-~limfnk(x) = f(x) for all with v{E} = 0 we have
since an integral with respect to a
is reduced to a finite sum. Since lim 5 k-~ X
f o
nk
(x)M{dx} = f X
f(x)M{dx} o
118
irrespective
of the chosen subsequence.
implies Pr{N{E}
Thus,
since M{E} = 0
= 0) = I, it follows that nnk § ~ a.s.
lim E(q _ q)2 = 0 it follows that 6 = q a.s. k-~ nk
~(~
-
n) 2
=
o
(Q). Since
(Q) and thus
.
Remark 2 In the proof of theorem 2 the condition that X
is b o u n d e d is only O
used to ensure that M{X ) < * and that v{X } < ~ for all v 6 N . O
only v{X } < ~ a.s.
Since
O
(Q) is required,
it is sufficient to assume that
O
M{X } < ~. This is a slight generalization
since X
O
bounded implies O
M{X } < ~ but there exist cases where M{X ) < ~ even if X O
O
is unO
bounded.
9
Now we drop the assumption of b o u n d e d X . When X O
is b o u n d e d it is no O
p r o b l e m to interpret a representation
q = f X
f(x)
Since
an increasing
(N{dx} - M{dx}).
O
pact
X is
a-compact
it
always
sets {Kn} I such that
creasing
sequence
exists
sequence
m] Kn = X and thus { K n N X o } I n=1
of bounded
sets
such that
0 n=]
of com-
is an in-
(Kn~"IX o) = Xo.
Definition 6 An element q ~ L ( X o ) q = f
f(x) (N{dx} - M{dx})
x
bounde~
is said to have the representation if for every i n c r e a s i n g sequence of
~
~
sets {Xn) I such that X n E B ( X o )
and
X n = X ~ we have n=1
Z i m E{q - f f(x) n~ X n
( N { d x ) - M { ~ } ) } 2 = O.
Theorem 3 Let X o ~ B(X) be arbitrary.
If for any b o u n d e d B(Xo)-measurable
tion g : X ~ § R with compact support
I X xX 0
g(x)g(y)R{dx,dy} 0
func-
_< c f g2 dM X 0
119
for some c < = then every n 6 L(X o) has the representation
n = ~ X for some a.e. f : X
o
f(x)(N
(M) uniquely determined S(Xo)-measurable
function
§
Proof Consider like in the proof of theorem 2 for every ~ L ( X sequence ~i,~2,...
such that n n = ~
o) a
fn(X)No {dx} where fn is a
simple function with compact suppor~ and lim E(~
n
- q~
= 0 and
a function f such that lim f (fn - f)2 dM = 0. n -~= X o Let { ~ } 1
be any increasing sequence of bounded sets with
U xk k=1
=
X
o
and define
~(k) : f f(X)No{~X} = S f(k)(x)No {~} X X~ and
(k)= ~
fn(X)No{dX}
: f f~k)(X)No{dX}. X
o
The variables nn(k) and n (k) are a.s. , (k) - ~m(k))2 _< (I + c) ~ E~nn thus n I(k) ,n 2(k) , .
(fn
(Q) determined by n. We have
fm )2 dM since X k is bounded and
is . a .Cauchy . sequence
From the proof of theorem 2
it then follows that lira E(nn(k) - n (k))2 = O. n-~o0 We have lim E(n (k) - n) 2 k§
< (I + c) lim lira f k-~ n+~ X
= (I +c) lim k--
f
o
= lim lim E(~ (k) _ n )2 < k-~ n-~
n
--
(f(nk) - fn )2 dM = (I + c) lira f k-~ X
f2
=O.
(f(k) _ f)2 dM = o
9
120
Remark 3 The condition in theorem 3 is sufficient but not necessary. that we will give an example where X
To see
is bounded but where the conO
dition in theorem 3 is not fulfilled.
Consider X = R and X
= (0,13 O
and A defined by A{B} = S X(x)dx B and {Ik)k=1
where
is a sequence of independent
l(x) = ~k
if
x6 ( I ~I k+1'
random variables with
2 E ~k = I and Var lk = Ok" Put
if
gn (x> =11 0
xs
(0,11nJ
elsewhere
If the condition in theorem 3 were fulfilled, then we must have Var A{(0,1/n]} < c/n for all n and some c < ~. 2 ok Since Vat A((0,1/nj}
example take
=
Z k=n
02 = k 2"5
it is seen that if we for ki(k+1)2
then Var A((0,1/n]}
I
Thus
2~ Var A((0,1/n]}
< ~ for all n but the condition in theorem 3 is not
fulfilled.
Consider now X = X section
O
= Z and let s be a stationary measure
(cf
1.6). Then the condition in theorem 3 is that for any
finite Z C Z and any sequence of real numbers O
{gk : k ~ Z
O
)
there exists some c < ~ such that 2 Z gkgj rk_ j ~ cm Z gk" Define y(x) by y(x) = E k,j~Z ~ k~Z k~Z O
An equivalent cm with
a.e.
ikx
gk"
O
form of the condition is then
flY(x)l 2 dx
e
fIy(x)I 2 FZ{dx}
which is fulfilled if F ~ is absolutely continuous
bounded density.
From t h e o r e m
bounded density is necessary.
A 3.3
it
follows
that
a.e.
For stationary doubly stochastic
Poisson sequences the condition in theorem 3 is thus the 'correct' condition.
121
It is seen from the p r o o f of the t h e o r e m that if it exists a Borel measure W on (Xo,B(Xo)) is non-negative
such that C{BI,B 2} = R{BI,B 2} - W { B I ~ B 2}
definite then the theorem holds under the weaker con-
dition that for some c < ~ and any b o u n d e d function g with compact
support
.r X • 0
g(x)g(y)C~ax,ay}
<_ e
f
g2 d(M + W). A simple example
X
0
0
is when A is completely random,
see definition
I .3, since then we
put W{B} = R{B,B}.
m
Definition 7 Let M be the mean measure
and R the covariance measure of a r a n d o m
measure. We say that R is absolutely
f
dominated by M on X ~
if
IR{ax,~}l A c M(B}.
BxX
0
for some c < ~ and every B ~ B ( X o ) .
If X ~ is b o u n d e d and if R{Xo,B} ~ c' M{B} for some c' < ~ and any B E B(X o) then R is absolutely dominated by M on X ~ since
f
]R~,~}]
_<
BxX
E A~dx}A<~} +M~X~ }M(B} =
f BxX
O
O
= E A{Xo}A{B} + M{X 0 }M{B} = R{Xo,B} + 2M{Xo}M{B}
<
< (c' + 2M{x }) M{B}. --
0
For arbitrary,
i.e. not n e c e s s a r i l y bounded,
S( X ~ )-measurable
Xo~S(X)
and any
function g : X ~ § R it follows by Schwarz's
equality that if R is absolutely dominated by M on X o then
f X • 0
Ig(x)g(y)R<~,~}l
0
< c f
g2(x) M{~x} O
g2(x)
f X xX
0
X
<__
0
J~{~x,~v}I <
in-
122
and thus the condition in theorem 3 is fulfilled.
The following lemma is a variation on lemma 1.3b.
Lemma I If R is absolutely dominated by M on X
it holds for every O
6,q~ L(X o) with representations
g(x) (N{ax}
max])
-
X 0
f(x) (N{ax] - M{ax}) X 0
that
~
f
f
g(x)f(x)M{ax} +
g(x)f(y)~{ax,~}.
X •
X O
O
O
Proof Choose
~n = ~
gn (x)NO{dx} and nn = ! fn(X)No{dX), where gn 0
and f
f
n
0
are simple functions with compact support, such that
(gn- g)2 ~ §
f
and
X
(fn- f)2 a M §
0
and
X
0
0
lim E(
gn(X)fn(x)M{dx} +
0
+ lim f n+ ~ X • O
gn(X)fn(X)R(dx,dy). By usual Hilbert space theory O
gf dM. Since R is absolutely dominated by M on n+~ X
X O
X
O ~
O
it follows that
f X xX
I e(f X
O
g2(x)M{dx})2 (f X 0
Ig(x)f(y)R(dx,dy} I < O
I
f2(x)M{dx))2 < ~ 0
and thus the integral
123
f
g(x)f(y)R{dx,dy} is well-defined.
X xX 0
0
Further
I f X xX O
X xX O
0
I
<
X • 0
<
g(x)f(y)R{dx,ay}[
gn(X)fn(Y)R{dx,dy} O
Ign(X)fn(y ) - g(x)f(y)llR{dx,dy}l 0
I
X xX 0
Ign(X)fn(y ) - gn(X)f(y)l + 0
+ Ign(X)f(y) - g(x)f(y)llR{dx,dy}l I c (f X
I
g~(x)M{dx}) ~ (~ X O
(fn(X) - f(x)) 2 M{dx}) ~ + O
]
I
+ c (f
f2(x) M{~x}) 2 (f
X
X O
(gn(X) - g(x)) 2 M{dx}) 2 O
which tends to 0 as n § ~, and thus the lemma is proved.
Consider now a random variable ~ with E~ = 0 and E$ 2 < ~ and recall that p{B} = Cov(~,N{B)) for bounded B ~ B ( X ) .
Theorem 4 If R is absolutely dominated by M on X
the best linear estimate ~ O
of $ in terms of N on X
is given by O
f
f(x) (N{ax} - M{dx})
X
O
where f(x) is the unique a.e. (M) square integrable (with respect to f(y) R{B,dy} = p{B} for all
M over X o) solution of f f(x) M{dx} + f B X O
]24
Further E(6 ~ - 6) 2 = E[ 2 - / X o
bounded B s
f(x)p[ax).
Proof From t h e o r e m A 2.1, i.e. the projection theorem,
it follows that 6~
is
all
the
unique
solution
o f E(6 ~ - 6) N(B} = 0 f o r
Since R is absolutely dominated by M on X representation
6 = / X
bounded BEB(Xo).
it follows that 6 ~ has the
o
f(x)N {dx) for some function f with f o X
o
f2 dM < o
From lemma I it follows that
E(6 ~ - 6) N~B)
=
/ f(x)M~d~ B
+
/ B•
o
and thus f is a solution of the equation in the theorem.
Let g be an other solution of the equation with / X h
f
g. Then / h(x)M(dx) + / B X
B E B(Xo).
Since also i X
exists a sequence
h(y)R(B,dy)
g2 dM < ~ and put o
= 0 for all bounded
o
h dim < ~ and since X is ~-compact there o
{h } of simple functions with compact support such n
that / (hn - h) 2 dM + 0. Since S hnh dM + f h (x)h(y)R(dx,dy} n X X X • 0 0 0 0
= 0
it follows by the argument used in the p r o o f of lemma I, put gn = g = f = h
and
f
n
= h , that also / h2dM + / h(x)h(y)R(dx,dy) n X X xX o o o
= 0. Since R is non-negative
definite it follows that h = 0 a.e.
(M).
Further it follows from the p r o j e c t i o n t h e o r e m that E(6 ~ - 6)2- = = E~ 2
-
E(6~) 2. Define 6n ~ = ~
fn(X)No{d.x} where f n is a simple
func-
o tion with c o m p a c t s u p p o r t / X
o
such that
E ( 6 ~ - 6~) 2 + 0 a n d t h u s
(fn - f)2 dM + 0. Then E(6~) 2 = lim E6n~ n-~
Since
= lim S n§ X
fn (x)0(dx)" o
=
125
Ill
S
B
If(y)llR{dx,
l
BxX O
it follows
for any 8(Xo)-measurable
flg( )ll {d )l_<S X
0
function g : X ~ § R that
Igll l +
X
f
X xX
O
0
I
(1
+
Thus lim f n~ X
X
0
fn(X)p[dx) =
f
{dx,dy)l
I
e) (S g2 ~)2 (f X
Ig(x)llf(Y)ll
0
f2 ~)2 . 0
f(x)p{dx}
since
X
o
O
Ifn(X) - f( )llP{ )l
If %(x) - f(x)p{a~'l ~ f X
X
0
0
1
(f
< (I + e)
f2dM) 2
1
(f
X
(fn - f)2 dM)2
X O
O
which tends to 0 as n §
II
Remark 4 The condition that R is absolutely dominated by M on X ~ is used to ensure that integrals with respect to R are w e l l - d e f i n e d transfer convergence
in the function space with
(f,g) = f
f
fg dM +
X0
f(x)g(y)R{dx,dy}
and to
'inner-product'
to essentially ordinary
XO •0
L2(Xo,B(Xo)~M)-convergenee. More p r e c i s e l y the condition that R is absolutely dominated by M on X
f X0 •
0
o
is used to ensure that I I
tf(x)g(y)R{ax,ax}] s e(f f2 ~ ) ~ (f g2 aM)~ . X0 X0
We have a feeling that lemma I and t h e o r e m 4 ought to be true under the sole conditions that.
of theorem 2 and 3 but we have failed to prove
~26
We may m e n t i o n the last through
Remark
that
comment after
in remark
simple
dominated
3, the proofs
by M + W on Xo~
of lemma
I and t h e o r e m
see 4 go
changes.
5
For every E ~ B ( X o ) Therefore thus
if C is absolutely
with M{E} : 0 we have Pr{A{E)
p and R{.,B}
are absolutely
continuous
also to any ~ s M to which M is absolutely
Radon-Nikodym
= O} = Pr{N{E} with respect
continuous.
= O} = I.
to M and
Thus the
derivatives
m(x)
=
~{~}
~(x)
=
P~ {{d~x}}
~{~x}
and
r(x,R}
exist
Thus
uniquely
=
R{dx,B} ~{<x}
a.e.
(U).
an alternative
f(~)m(~) + S X
completely
formulation
f(y)r(x,dy}
= p(x)
for a.e.
in t h e o r e m
(~)xEXo,
4 is
provided
a
o
additive
version
If X e R n and M is absolutely measure
of the equation
it is n a t u r a l
X ~ Z it is n a t u r a l
of r(x,']
continuous
to choose
to choose
is found for a.e.
w i t h respect
~ as Lebesgue
~ as the m e a s u r e
measure
(~)x6X
o
.
to Lebesgue while
w i t h mass
if
one in
each integer.
In Grandell approach
(1972:2)
as here
linear e s t i m a t i o n
was
studied with
for the case X = R and X ~ =
under the assumption
that
a similar
EO ,T~ , 0 < T < ~,
A{B} = S l(x)dx where B
l(x) is a stochastic
127
process with El(x) -- m and Cov(l(x),l(y))
= r(x,y)
such that
T
f r(x,x) dx < ~. These assumptions imply 0 TT
T
T
f 3[ le(x)g(y)r(x,y)Idx
f
0 0
0
0
T
I --
T
Ig(y)~V'r(y,y)dy <
I
T
- -
<_ f r(x,~l~x (f f2(~)~x)2 (f g2(x)~x)2 0 which,
0
0
see remark 4, is the condition required in lemma I and t h e o r e m
4. We may note that if k(x) is stationary, T
then
f r(x,x)ax : T Vat ~(0) < ~. 0
Example 5 Assume that R is absolutely dominated by M on X
o
and let A be a
b o u n d e d set in B(X).
Consider ~ = N{A) - M{A}.
If Af]X ~ = ~ this corresponds to prediction.
We have p{B} = M{A~'~B} + R{A,B} that the best linear estimate
f
f(x)mdx}
+
B
f
X
for all b o u n d e d B [ B(Xo).
and thus it follows
~--f X
from t h e o r e m 4
f(x) N {dx} is determined by o
o
f(y)R{B,ay} = M{Af]B] + R{~,B} o Further
E(<~ _ ~)2 = M{A} + R{A,A} A~X
=M{A\X o} + ~{A,A\X o} - f X
o
X
o
f(x)R{A\Xo,dX}. o
From our point of view it is more i n t e r e s t i n g to consider = A{A} - M{A}. mate 6~ = f X
Then p{B} = R{A,B} and thus the best linear esti-
f(x)No{dX} o
is determined by
128
f f(x)M{~}
+
S
B
X
f(y)R{B,dy} = R{A,B}
o
for all bounded B s B(Xo). Further E([M _ [)2 = R{A,A} - f X
E(~
f(x)R{A,dx} which if A ~ X ~ is reduced to o
_ ~)2 = S f(x)M{dx}. A
If A lies outside X
it is seen that the best linear estimates of
o
N{A} and A{A} coincide.
Example 6 Consider X = Z and let X ~ be a finite set. Put m k = E s ri, j = C o v
Zi,s
and
(see section 1.4 and 1.6) and Pk = Coy ~,N k
where as usual $ is a random variable with E~ = 0 and E$ 2 < ~. Since m k = 0 implies rk, k = 0 and since Z Iri,kl < ~ i~X i
z iex
Z /~ri,i ieX o
it follows that
Iri,kl ~ c o
holds for
max{ril i ; i 6 X o}
Z
r~-~--..
i~X ~
m,l
c = min{m i ; i E X o, m i > 0}
Thus it follows from theorem 4 that the best linear estimate fk(Nk - ink) is determined by k~X
o fkmk + iEX Z
firk'i = p k
for
k~ Xo
o This result is easy to derive by direct calculations Hilbert space theory.
without any
The example shows, however, that even if the
assumption that R is absolutely dominated by M on X ~ is unnecessary
129
(see remark 4) at least it does not have an absurd consequence
in
this case.
9
Example 7 Consider a weighted Poisson process in the general sense as defined in section 2.1, i.e. let M E M
and a nonnegative random variable ~ be
given. Define A by A{B} = XM{B} for all B E B ( X ) .
Put m = E~ and
2 = Vat ~ which implies M{B} = m~{B} and R{BI,B 2} = 2 { B I
for B,B1,B 2 E B ( X ) .
Then
5 IR{dx,dy}l BxX
}~{B 2)
= d2~{Xo}~{B} = (a2/m)~{Xo }M{B}
0
and thus R is absolutely dominated by M on X
o
if and only if
~{X o} < ~. Put ~ = ~ - m which implies that p{B} = d2~{B}. For ~{X o) < ~ it follows from theorem 4 that the best linear estimate ~
is determined by
m f f(x)~{dx} B
+ d2~{B) f X
f(x)~{dx}
= d2~{B)
0
for all B E B(Xo). This equation has the solution 2 f(x)
=
m + ~2~{X o}
and thus the best linear estimate of X is m 2 + a2N{X } ~t~
=
~
+
O
m
m
+ d2~{X
} 0
and E(A~ _ A)2 = E ( ~
_ ~)2 :
2 m~
m + o2u{X o}
oo
Let {Xk)k= I be an increasing sequence of bounded sets in B(X) and let ~
n
be the best linear estimate of ~ in terms of N on X . Then n
130
2
m(7 lim E(X ~ - ~)2 : 2 n n+~ m + ~ lira u{X } 0
n->-~
Thus if lim ~{X n} = ~ and X ~ ~ X k then X n -~ k=1 does not have an i n t e g r a l r e p r e s e n t a t i o n .
Consider
now X = R, X ~ =
E0,t~
and ~ Lebesgue
m@L(Xo)
but
measure.
Then
~
m
8 + o2st { [o,t~ }
~=m
m+~t
2
or with m = ~/a and o 2 = B/a 2
~ + t
which,
as shown
L(x,y)
= (x - y)2 in the case of a P~lya process.
p 99) has the best implies is
this
shown, estimate
that
'better'
managed
in example
2, is the best estimate
see section coincide
for any other
2.1, that the best
distribution
than the best
is the worst
the same mean value
Lundberg
linear
to
(1940,
estimate
if and only if ~ is r-distributed.
linear
to give any plausible
sense
according
This
of ~ the best estimate
estimate.
explanation
distribution
and
We have,
however,
of not
why the r-distribution
among all distributions
in
with
and variance.
m Suppose
now that N is observed
be the random variable integral wellknown
equation
which
from e.g.
Assume
set X I and let ~(E~ = 0)
is to be estimated.
given in theorem
In most
4 is difficult
linear p r e d i c t i o n
set X o ~ X 1 is considered solution.
on a b o u n d e d
theory
it m a y be much
that R is absolutely
that
cases the
to solve.
It is
if some u n b o u n d e d
easier to get an explicit
dominated
by M on X ~ and define
131
gl
=.L (glL(xl))
go = ~ (~[L(Xo))
/
=
X o
f(x)N {dx} o
and g~
f(X)No{aX}.
= S
appr.
XI
gappr. ~ is a reasonable approximation of the best linear estimate gl
if
E (g:ppr. ~)2 E (g~ - g)2 is close to one. This quantity is mostly difficult to calculate, but since
E (gappr. -
~)2
.
E
E (g] - ~)2
(gappr.
E (~
E ( ~
~)2
-
_ ~)2.
~)2
~_
~appr. - go
+ E (go
E (g: - g)2 g:)2
E(g:ppr.= 1 +
(go
E
_
g)2
(1 . e)
S
< --
f2(x) M{~}
Xo~X I
<1+
~2 _ f
X
f(x)p{~:}
o
we are on the safe side if
(I + c)
S
f2(x) M(~x}
Xo\X 1
E~ 2 - f f(x)p{~} X or
o
)2
132
E (~r. ~)2o E (g~ ~)2 -
are close to zero. These quantities may be fairly simple to calculate since only a solution of the integral equation corresponding to an observation
of N on X
is required. O
Example 8 This example, which is not quite trivial and which we believe have some practical interest, will be considered in some detail.
Let X be the real line and let A be given by A{B} = S X(x)dx B where ~(x) is a stochastic process with E ~(x) - I and Cov (~(x),~(y))
= e -~Ix-yl
Thus we have H{B}
= S
for some ~ > 0. Put ~ = ~(0) - I.
dx, X{B 1,~2 } =
B I
I
~dy
and
dx. Further we have
I
S 5 e-~Ix-Yl --~o
e -~lx-yl
I
p{B} = S e-~Ixl B co
S B 1•
dxdy = (2/~)M{B} and thus R is absolutely dominated by
B
M on the real line.
Assume that N is observed on Is,t]. If both -s and t are large, it seems natural to consider X ~ = R. Then ~o~ is determined by
f(x) +
f f(y) e -~Ix-yl
which has the solution
f<x)
and thus
=
dy = e -~Ixl
for x ~ R
133
E(I~(O)
_ t(0))2
=
c~
~/c~2 +
2~
where 1{t0)" " =
E(<:ppr.
6:)2
E(6:-
<)2
-2(~)t =
e
(j+
-2t)
which is less than e.g. 0.01 for t > 2~.6 if ~ = 0.01, for t > 6.07 if ~ = 0.1, for t > 1.18 if a = I and for t > 0.101 if ~ = 10.
If only -s is large it seems natural to consider X ~ = (-~,t]. In 9{
this case
f(x) +
t _alx I / f(y) e -~Ix-yl dy = e for x < t -oo
which has the solution
~ f(x)
;{e- (~7~+2~)IxI
+
(~ + I - ~ )
e -(~+2~)(2t-x)}
if t > 0
=
(+~~ - ~ 2 ~ -
~)e~t+(~
)(x-t)
ift
< 0
Let the function E{I~(0) - l(0)}2(t) denote the value of E(/~(0) - I(0)) 2 when N is observed on (-~,t~.
Instead of giving the rather complicated formula we have in figure 3 illustrated this function for some values of ~.
134
E
{A~(O)
-
A(O)} 2
(t)
1.0 '
~=i0
9
a = l
9
a=O.l
o.5
J ......
-I0
9
a
=
|
~
t
0.01
i0
Figure 3: Illustration of E{l~(O) - l(O)} 2 (t)
Consider t = 0, i.e. N is observed on Es,~
and X O
E{la(O)
- ~(0)} 2 = ~
--
Then
-
and
E( ~ p p r .
o
= e-2
27J
Ist
E(~o~_ g)2
2a
I
which is less than 0.01 for Is] > 22.9 if a = 0.01, for Isl > 5.66 if a = 0.1, for Is I > 1.04 if ~ = I and for isr > o.o713 if a = 1o.
For notational reasons we change the situation and consider the case when N is observed on ~ , T ] , estimated for t 6 E0,T~.
where T is large, and X(t) is to be
Since i(x) is stationary there is no real
change in the situation.
In this case we have ~ ( t l T ) = E(~(t)li~0,T]) where ft(x)__ is determined by
T = 1 + S ft(x)No{dX} 0
135
T ft(x) + ~ ft(y) e -~It-yl 0
~y = e -~It'x]
for x E EO,T]
which has the solution (ef van Trees (1968, pp 321-322))
ft(x ) : ~ {e-Slt-xl
A
+
[e-B(t+x)
+ e-B(2T-t-x)
+
1 - A2 e - 2 B T
+ A e-B(2T+t-x) + A e-B(2T-t+x)]}
where
B =
~2
+ 2a
and
A = ~ + 1 -
B.
For large T it seems reasonable to approximate this rather complicated T estimate with l~ (tiT) = I + f gt(x) N (dx} where appr. 0 o
gt(x)
=
~ {e -BIt-xl
+
A Ee - % ( t + x )
+ e-g(2T-t-xO
)
.
We have
E (X ~
appr.
(tiT)
- ~(tlT))
2
E (Xappr.(tlT)
- X~(tlT)) 2
<
E (ta(tlT)
-
t(t))
{
2
B T <--~ (I + 2 ) ~ (ft(x) _ gt(x))2 dx --
C~
(Z
which for all t E ~,T]
is less than e.g. 0.01 for T > 30.6 if ~ = 0.01,
for T > 5.7 if m = 0.I and for all T if ~ = I or 10.
Consider now l~(t) = E (l(t)Ii([O,t]). Then we have
t = I + f f(x) No{dX} 0
~(t)
where
e-~(t-x) + A e -(t+x) f(x)
=
(6
-
~) I - A 2 e-2~t
"
136
From this and the previous discussion in this example it follows t that I + f g(x) No(dX} with 0
g(x) = (B - e) e -B(t-x)
is a reasonable
approximation
of l~(t) p r o v i d e d t is large.
Consider as a further illustration
some random generations, of a
model within the class studied in section 2.3.2, which may be looked upon as continuous parameter analogues to the generations
described
in section 2.5.
Put in these generations T = 50 and ~(x) = I (N(x)
where
; x 6 [0~50]} is a Poisson process with parameter ~ and inde-
pendent of a sequence distribution
generations presented
{lk}k= 0 of independent
function U(x) = I - e
-X
random variables with
, x > O. In figures 4-6 these
together with some linear estimates of l(t) are
for e = 0.01, ~ = 0.1 and a = I. In the case ~ = 10 the
illustration value turned out to be very low, and this case is therefore omitted.
For e = I the curves representing E ( l ( t ) I L ( E O , t ] ) ) a n d
its approximation
coincide within the accuracy of the diagram.
137
I
% 5O
25 (a) The piecewise constant curve represents A(t). The continuous curve represents the approximation of E(
z(t) l[([0,50])).
5O
25 (b) The piecewise constant curve represents X(t). The picewise conti . . . . . . . . . . . . p ..... ts E(X(t)
i [([0,t])).
50
25 (C) The piecewise constant curve represents X(t). The p~ecewise conti ....... urve rep ..... ts the approximation of E(l(t)
IIIII II
i lJ
0
17( [0,t'])).
IN I
I
I
L 5O
25 (d) The spike train represents the location of the points of N.
Fi~ulre
4~
Illustration
of
linear
estimation
in
the
case
~ =
0.01.
t
138
!
!
25
50
(a) The piecewise constant curve represents ~(t). The continuous
curve represents the approximatlon of E( k(t) l ~ ( [0,50] )).
!
f
5O
25 (b) ~ e piecewise constant curve represents l(t). The piecewise cont~ ..... e~..... presonts ~(x(t)lY([o,t])).
50
25
(C) The piecewise constant curve represents A(t). The piecewise continuous curve represents the spgroximaton of E( t(t)[ [([0 t])).
i; IIHIli IIill l; l;IIrIIlllil]II IJiili~HlIIli ;lilil; o
50
25
(d) The spike train represents the location of the points of N.
Figure
5: I l l u s t r a t i o n
of linear
estimation
in the case
a = 0.1
139
0
!
9
25
50
(a) The piecewise constant curve represents l(t). The continuous curve represents the approximation of E(~(t) I T([0,50])).
i
25
0
(b) § (c)
5O
The piaeewise c o n s t ~ t curve represents ~ ( t ) . The pieeewise co~ti~uo~s ~ r v e ~epre~e~t~ ~( ~(t) l [ ( [ O , t ] ) ) .
rlllll lit I I il I III,I I ll!lllII o
11, 50
25 (d) The spike train represents the location of the points of N.
Figure
6:
Illustration
of
linear
estimation
in the
case
a =
I.
140
Example
9
We will now consider
a simple
generalization
example
8. Put X = R and let A have
chastic
process
of the case studied
density
l(x) where
with E l(x) = m and Cov(l(x),l(y))
= l(0) - m. If m = 62 = I we have the case
Suppose
that N is observed
in
X(x) is a sto-
= 62 e -~Ix-yl . Put
studied
in example
8.
on X . Then O
~
= ]~(r
S(x)
= /
(N{dx}
- m
dx)
X 0
where
f is the solution
m f(x)
+
O
= (-~,t]
{e-Blxl+
for x ~ X O.
~+~
e-B'2t-x'}(~
if
t
> 0
if
t < 0
=
(B-s) e
and if X
dy = 62 e -alxl
0
we have 62~
f(x)
e- l -Yl
62 J X
If X
of
= (s,0J we have
~t+6 (x-t)
(cf van Trees
(1968,
pp 321-322))
O
e
f(x)
=
-BIll 1 -
where
~T
eB(l l-21sl)
(B-h 2 ~
e
-2BIsl
in both cases
B =
Assume
+
(~B-~)
2 262e c~ + ~ m
that N is observed
on a b o u n d e d
set X
and let ~(E ~ = 0) be O
the random variable
which
is to be estimated.
We will n o w consider
141
a different kind of approximation
of [~ = ^'E([IL(Xo)) which may be
useful.
Let, like in theorem
finer partitions
lira n+~
I, (Bnl,...,Bnr} n
be a sequence
of finer and
of X ~ such that Bnj ~ B(X o) and
max I <j
diam(Bnj ) = 0 . n
Put Ln(Xo) = S(N(Bnj)
- M(Bnj)
; I ~ j ~ rn). From example
6 it
follows that
[~n always
=
]~([ILn(Xo))
can be computed.
the c a l c u l a t i o n
In section 6.1 it is shown that in general
o f ~n r e q u i r e s
tedious
a r xr matrix has to be inverted. n n
numerical
Anyhow,
computations
since
with a digital computer
~n can be computed and t h e o r e m 5 shows i n which s e n s e ~n may be regarded
Theorem
as an approximation
of ~ .
5
For bounded
it holds that lim E ( ~
X O
- ~)2
=
0
.
n-~oo
Proof Put L (X o) = S(N{Bnj) follows
=
; I < j < rn, n = 1,2 .... ). It
167) that lim E(~ n~ - q )2 = 0 where n-~ (Xo)). Thus the theorem is proved if we show that
from Doob
q = E(~IL
L (xo)
- M(Bnj)
(1953
,p
L(Xo).
We will do this in almost the same way as it was shown in the proof of theorem
I that 0' = 0'.
We will need the following:
For any AI,A2,... E B ( X o) such that
142
Ak+ I C A k and lim A k : r (the empty set) we have lim E(N{A k} - M{Ak}) 2 = 0. ke~o n-~O This follows, however, immediately from properties of measures and dominated convergence.
Since L ( X o ) C L ( X o) and since L (Xo) is a Hilbert space it is enough to show that N{B) - M { B ) ~ L (Xo) for each B ~ B ( X o ) . D = { D ~ B ( X o) ; N{D} - M { D } ~ L
Put
(Xo). For disjoint DI,D 2 .... 6 D
it
follows that
~ D i ~ D by putting A k = [J D i. Thus D is a Dynkin i=I i=k system since the other requirements are obvious.
Word for word the last part of the proof of theorem
I may be repeated
and thus D = B(Xo).
Up to now we have only considered estimates
of a random variable ~.
It is sometimes of interest to estimate a random vector = (~1,...,~n).
In this case our basic sample space is ~ = N•215 n.
The required modifications A vector ~ ~ = (g],...
are obvious and will not be discussed. with ~ ff[
, k = 1,...,n, is called
a linear estimate of ~ in terms of N on X o.
Definition 8 is called the best
A linear estimate ~__~of ~ in terms of N on X O
linear estimate if the matrix
is non-negative
definite for all ~ = (nl,...,n n) with ~k ~[(Xo).
It is almost trivial to show that ~
= (E(~IIL(Xo)) ..... E(~nlT(Xo)),
i.e. the best linear estimate of a random vector is the vector or best linear estimates ~(~Z - A)'
<~
of each component,
- ~> = ~ ' ~
- ~I~>'~ ~
and that
143
Assume that R is absolutely dominated by M on X 9 Then it follows o from theorem 4 that
+
f
X
s
(N{~x} - M{dx})
o
where ~(x) = (f1(x),...,fn(X))
/ ~(x)mdx} + / s B
X
is determined by
R{B,dy} = ~{B}
, B6B(Xo),
o
where ~{B) = (Cov(~I~N{B}) ..... Cov(~n,N{B}) ).
Further it follows almost immediately
from the proof of theorem 4 that
X
(~' denotes the
5.3
transpose
o
of ~ and not the derivative.)
Some empirical comparisons be~een non-linea~ and linear estimation
The very restricted purpose of this section is to consider some random generations
illustrating a case where it seems reasonable
to believe that non-linear estimates are much
'better' than linear
ones.
Put X = R+ and X ~ = [0,t] and consider the process described in I example 3 for the special case K = 2, w1(0 ) = w2(0 ) = ~ q
if
k#i
-q
if
k = i
and
qki =
This means that {1(x)
; x ~ R+) is a Markov chain with stationary
transition probabilities,
alternating between the values 11 and 12
144
in such a way that Hki(Y) = Pr{1(x+y) = till(x) = Ik } = qy + o(y) if k # i, and hence
~I ( 1 Hki(Y
e-2qy
)
if"
k#i
if
k = i
= 1
7
(1 + e -2qy)-
Thus I
m ~(x
=~
r(x,y
= Cov(l(x),l(y))
( l 1 + ~2 )
and =
I
~ (11 - t2 )2 e - 2 q l x - y l
O' In this section we will use the notations XL(t) for ~(X(t)l[EO,q)
IB(t)' for E t(1(t)) and
i.e. IB(t) is the best estimate of ~(t) in
terms of N on EO,t] according to L(x,y) = (x-y) 2 and IL(t) is the corresponding best linear estimate.
Consider first the case q = O. Then N is a weighted Poisson process and it follows from example 2 that
~(t)
N(t)+1 e-11t ~N(t)+1 e-12t = 11 + A2 .N(t) e-11t .N(t) e-12t AI + A2
and from example 7 that
~(t) =
(I I + 12)2
+ (~I
- 12)2 N(t)
2(I I + 12 ) + (i I - 12 )2 t
In figures 7 and 8 these estimates are illustrated by random generations for t 6 [O.50~ and (11,12) = (0.5, 1.5).
145
We note that if ~I = 0 then -~2 t Z2 e if
~(t)
if
N(t) > 0
=
o
-~2 t I +e X~(t) = ~2
and
~(t)
X2(1 + N ( t ) ) 2 + ~2 t
and further
E(X~(t) -
~)2
_
2 -Z2 t ~2 e
2(1 - e-x2t) and
s(~(t)
2 12 _ ~)2 = 4 + 212t
where ~ = l(t).
Thus for large values of ~2 t the best estimate l~(t) is much 'better' than l~(t).
Consider now the more interesting case q > 0. From the results of Rudemo, described in example 3, it follows that
z~(t) = ~i~i(t) ~ + ~2~2(t) where w~(t) , k = 1,2, is determined by
146
o) = 2
~(~-o) at epochs of events and
~1 ' ( ~ )
= (~(=)
- Xl - q) ~I (T) + q ~2 (~)
~ 72~' (T) = q ~I(T) + ( k~(t)
- ~2 - q) w2(T)
in intervals between events.
Using the linear equations
for ~k(T) it follows that if no events
occur in (Sl,S2~ then for 9 ~ s 2 - s I
~+ eBT[Tr~(Sl)(J3+~)+qw2(sl)]+e ~TI(S 1 T ) =
-ST
x ~, [ITI(Sl)(B-(~)--qTr2~Sl) ]
eB~ [~+q+~ i ~1< s~ >-~I ~ I>] +e -~ [B-q-~ <<<s I l-~I s~ II]
where ~ = ~ (k2 - ~I )
and B =
q2 +
.
From section 5.2 and especially example 9 it follows that
X~(t)
~I + ~2
= ~
2
t
+ I f t (m) No{d~} 0
where ft(T) is determined by the equation
i (kl+k2)ft(T) 2
+ 62 ft fy(X)e_2qlm_xl dx = ~2e_2qlt_T I 0
for T E [0,t], which has the solution
y +c& ft(T) = ,Y-Q I [ ~
e2YT + e-2Y T e2yt
I e 2yt] ~+q
147
where 'L
~q2 7
=
2q62 +
XI+X2 "
In figures 9-12 these estimates are illustrated by random generations for t ~ [0.50j, q = 0. I and some combinations
of ~I and 12"
!
I
25
"50
t
(a) The constant curve represents k. The piecewise continuous curve represents l~(t).
,-,
I
I
25
5o
t
(b) The constant _-urve represents k. The piecewise continuous curve represents XL(t).
II 0
II II III I IIIIIII I I IIIII I III 25 (c) The spike train representsthe epochs of events of N.
5o
Figure 7: Illustration of estimates in the case (~i,~2,q) = (0.5,1.5,0)
148
!
25
I. 5O
(a) ~le constant curve represents ~. The piecewise continuous curve represents
X~(t),
!
25
L
t
5O
(b) The constant ~urve represents I. The pieeewise continuous curve represents
X~(t).
lIIIII IIIIIIIIIIIIIIIIIIIIIIIIIII I IIIIIIIIIIIII 0
25
50
(c) The spike train represents the epochs of events of N.
Figure 8: Illustration of estimates in the case (X1,12,q) = (0.5,1.5,0)
149
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
L
l
25
50
t
(a) The p~ecewise constant curve represents k(t). The piecewise continuous curve represents X~(t).
5o
25 (b) The p i e c e w i s e c o n s t a r t c ur ve r e p r e s e n t s ~ ( t ) .
The p i e e e w i s e
c o n t i n u o u s curve r e p r e s e n t s k=~(t).
IfIfIll ~ HIII If (c) The s p i k e t r a i n
Figure
IIlll II If!If 5o
25
0
9: I l l u s t r a t i o n
r e p r e s e n t s t h e epoeh~ o f e v e n t s o f
of e s t i m a t e s
in the
case
N.
(~1,X2,q)
= (0.75,1.25,0.1)
~50
(a) The piecewise continuous
!
I
25
50
constant curve represents
curve represents
l(t). Tile piecewlse
IB(t).
_~--~'~.-'~ ---A (b) The piecewise continuous
25 constant curve represents
curve represents
l~(t). b
i IfJi II II fJlll
Jiflli~l[llli I 111 i o
5o l(t). The piecewise
25
!
5o
(c) The spike train represents the epochs of events of N.
Figure
10: Illustration of estimates in the case (~i,~2,q)
= (0.5,1.5,0.1)
151
!
I
50
25 (a) The piecewise constant curve represents continuous
curve represents
l(t). The piecewise
k~(t).
[
25 (b) The piecewise constsmt curve represents continuous
curve represents
r
5o R(t). The piecewise
~[(t).
JllJgll1111JnlJlJ~fnI ~llJllJLJII Jlfill HIllIII]IIJl ]J~Ji, 25
0
50
(C) The spike train represents the epochs of events of N.
Figure
11: Illustration of estimates
in the case
(~i,12,q) = (0.25,1.75,0.I)
152
I
25 (a) The piecewise constant curve represents continuous
curve represents
50 l(t). The pJecewise
l~(t).
I
25 (b) The piecewise constant curve represents continuous
IIIIIlllnli I~II o
curve represents
I
50 k(t). The piecewise
~(t).
IIHIIII , IIIIIIIIIIIIIIIIII~IIIHIIIIII, 25
50
(C) The spike train represents the epochs of events of N.
Fi6ure 12: lllustration of estimates in the case (~i,~2,q) = (0,2,0.1)
Added in proof Snyder (1975~ pp 329 and 352) has illustrated these estimates in the case t E [0,11] and (~i,~2,q) = (0.1,20.1,2).
153
We will now consider estimation of the intensity at other epochs than the right end point of X . In order to simplify the numerical computao tions we will consider discrete time, see sections 1.4 and 1.6. Thus we put X = Z and X
Let {Z. ; j ~ Z } J
= {1,2,...,n}.
o
be a stationary Markov chain, alternating between the
values 11 and 12 in such a way that Pr{~j+ I = lilZ j = Ik} = q if i # k. for k ~ { 1 , 2 , . .. ,n}. Let ~1,...,~n be
We will consider estimation of s the observation of NI,...,N n.
We will now consider the best estimate Z~(kln) of Zk in terms of N on {1,2,...,n}. Define the random variables Z(k) and ~j(k) by
Z(k)
k
:
~.
-]~'
1
if
s
0
if
Lk J
H (~ae
J)
j=1
and
~j(k)
=
= I.
J
X. J
Put
~(k)
= E [ (k) Z ( k )
m
E Z(k)
and
E ~(k) w~ (kln) =
Then
E Z(n)
Z(n)
154
After some ealculations~ similar to those in example 3, we get for k = 1,2,...,n
vk
X1
~(k) = Vk _t 1 Xl e
where ~
~(
{(
~(
O) =
e
-X 1
{(1-q)~l(k-1)+q~%(k-1)}
1-q)~(k-1)+q~(k-1)}+t
~k 2 e-t2{qw~(k_l)+(l_q)w~(k_l)}
1 O) = ~ , and f o r k = 1 , 2 , . . , n - 1
(1-q)~l(k)~(k+l
] n)
( 1-q)~T 1 (k)+q'rr2(k)
~
+
q~-i (~:)',,2(k 1In) q~l (z)+(1-cl)~2(k)
and
w~(nln)
=
~ . ~1(n)
We will now consider the best linear estimate Z~(kln) of of N on {1,2, .... n}. Put
~1 + X2
2
1~1 - k21
2 and p = 1 - 2q . Then E~.
=m J
and
co~ (~3,~3+k) = ~2pl~l
s
in terms
155
From example 6 it follows that n
ZL(kln)
= m +
Z
fj(vj
- m)
j;1 where f1'"" "'fn are determined by the equations f~+62
n Z i=I
J
Ij-il
f.p l
= ~2 p l k - j l
for
j = 1,2 . . . .
,n.
After some calculations, compare example 9 and example 6.3, we get the following approximative solution of the equations
fJ
= 2,2~
1 - ~
Iz-Jl
{b
( b 2 n - k - J +2 + b k + J ) } +
I - bp
where
A
= I
~ (1 + p
2
62
+ --
(1 - p 2 ) )
m
and A-
~A 2-
92
b = P In figures 13-16 these estimates are illustrated for n = 25, q = 0.2 and some combinations of 11 and 12"
156
4 3 2 I
At each point k the height of the spike represents Nk, the }lecewise constant curve rep ..... ts Ik, th . . . . ti . . . . . . . . . . . . p ..... ts ~ ( k 1 2 5 )
and the lowest angle of the
triangle represents the approximation of ~[(k125).
Fisure
13: Illustration
of estimates
in the case (~i,~2,q) = (1.5,2.5,0.2)
8 7
6
5 L 3 2 1
II
I
At each point k the height of tile spike represents Nk, the piecewise constant curve represents Ik, the continuous curve represents l~(kI25) and the lowest angle of the triangle represents the approximation of I~(k125).
Figure
14: Illustration
of estimates
in the case (X1,12,q) = (1,3,0.2)
157
+ At each point k the height of the spike represents Nk, the plecewise constant curve rep ..... ts ~ ,
th . . . . ti . . . . . . . . ve rep .... ~ts /~(k125) and the i .... t angle of the
triangle represents the approximation of ~ ( k i 2 5 ) .
Figure
15: Illustration
of estimates
in the case (l 1,k2,q) = (0.5,3.5,0.2)
lO 9
8
7 6 5 .
4 3 2 1
y
I I
r
'I I
ITITIT
At each point k the height of the spike represents Nk, the piecewise constant curve represents Lk, the continuous curve represents /~(k125) and the lowest angle of the triangle represents the approximation of /~(ki25).
Figure 16: Illustration
of estimates
in the case (k1,k2,q) = (0,4,0.2)
158
6.
LINEAR ESTIMATION OF RANDOM VARIABLES IN STATIONARY DOUBLY STOCHASTIC POISSON SEQUENCES
Let {N k ; k @ Z } {s
; k~Z}
be the underlying random measure,
Assume that s m = E s
be a doubly stochastic Poisson sequence and let see section
1.4.
is (weakly) stationary and use the notations
r~ = C o v
s163
and F s for the spectral distribution
function as in section 1.6. Let further ~ be a random variable with E 6 = 0, Var ~ = o
2
and Pk = Coy <,N k. The problem to be
studied is to find the best linear estimate ~
of 6 in terms of
an observation of the doubly stochastic Poisson sequence on Z or a part of Z. This is the same problem as studied in section 5.2 for the general case. We will, however, results by using methods
derive more explicit
from the analysis of time series. A survey
of needed such methods is given in section A3.
6. I
Finite number of observatio~
Consider now the situation where {Nk} is observed for k = nl,n1+1,...,n 2 where n I and n 2 are finite. From theorem 5.4 and example 5.6 it follows that the best linear estimate ~
of ~ in terms of Nnl ,.
by n2
z
fk(Nk - m)
k=n I
n
where {f, } 2 K
n]
is the unique solution of
m%
+ .z
n2
fjrLj
= Pk
J=n I
for k = nl,...,n 2
and further
'',Nn2
is given
159
n2 S (~
- ~)2 = 2
-
fkPk 9 k=n I
Using the notations --
P
=
(Pnl
~=(
--
fnl
~,
,.
"''Pn2
9 ,fn2
)
)
= (Nnl .... ,Nn2)
= (s
.... ,s
! = (I,...,I)
(n 2 - n I + I components)
R s = {r~_j}, n I ~ i, j s n 2
I
= {6i_j}, n I ! i, j ~ n 2
~.
=
where
a
1
if
j =0
o
i~
j#o
we get ~
=
f(N'
-
m1')
where
f = p(mI + RZ) -1 and
E (~
-
6) 2
= c
2
-
fp'
=
Now we specialize to the case ~ = s Zk~ = ~
c
2 - i ( m Z + R~) -I s
- m and use the notation
+ m. In this case
s
= m + 9_k(mI +
(N'
.
-
m !' )
160
where ~k
= (r ~
nl-k'''''
r~
)
n2-k
"
We may observe that
~o,,~
I
2
and thus
(s --
: (s
s nI
)' : m1'
From the very last part of section E (s
- !)'
= mRZ(ml
Any (&k ; k ~ Z }
+ Rs
+ Rs -I
(N' - m1')
n2
5.2 it follows that
(~f - _Z) = R Z - RZ(mI + Rs -I R Z =
+ Rs -I = ml - m2(ml + RL) -I
has the spectral
Zk - m =
representation
/ e ikx ZZ{dx} --7
and thus n2 l.i.m. n2-nl -~
(The notation ~, ~i,n2,...
1 n2_n1+ I
Z Zk = m + Z~{{0]). k=n I
l.i.m, means l.i.m,
We call m + ZZ{(0}]
nn
'limes in mean'
~ stands
From the spectral
for lim E (n n
for random variables ~)2
0 .)
the level of the process.
Consider the case ~ = Z~{{O]}.
~k=m+~+Sk
i.e.
Define
{c k ; k ~ Z }
by
.
representation
it follows that E (~s k) = 0 for
161
all k. Thus R Z = ~21'I + R e where R e = {r2_j}={Cov n I ~ i, j ~ n2, and ~ = 2 ~ .
t~
=
(ei,ej)} ,
From this it follows that
d2l(ml + a2i'l + R~) -I (~, - mi' )
and further
E (~
- 6) 2 = d2(1 - ~2A(mI + d21'1 + Re) -1 ! ' )
If n 2 - n I is large the inversion of ml + R ~ requires tedious numerical calculations
and therefore good approximations
are help-
ful. This is the problem to which most of section 6.2 is devoted.
6.2
Asymptotic result~
We will start with a general formulation of the kind of problems to be studied in this section.
Let Z I ,Z2,... C Z
be finite sets such that Z k C Z k + I and let LZk
denote the Hilbert space spanned by {N. - m ; j EZk}. J applications
In all
Z k will be of the form In I ,n2] where n I and n 2 are
finite and depend on k. In section 6.1 we considered the calculaoo
tion of E (
=
[J Zk and let iZ k=1
be the Hilbert space
spanned by {Nj - m ; j 6 Z
}. Put <~Zk = E (6}iZk)
, k = ~,I,2 .....
Then it follows from Doob (1953, p 167) that
l.i.m. k~
~
= Zk
~Z
and thus
lim E ( ~ k-~
_
k
~)2
=E
(tz
_ g)2
Therefore it is of interest to study ~Z
and E (~ oo
_ ~)2 oo
162
The main motivation
for this section is, however , that ~xZk generally
requires tedious computations
and therefore it is of interest to find
/Zk such that ~Zk is rather simple to compute and in some sense qZk~ serves as an approximation of ~ Zk '
Definition
I
A sequence
{~Zk} , ~Zk ~ IZk , is said to be asym~totically
efficient
in the linear sense if
lim
k-~
= I .
E ([~k -
g)2
Remark I If {~Zk} is asymptotically
efficient it does of course not mean that
~Zk is a reasonable approximation of ~Zk for fixed k. In section 5.2 we discussed what to mean with a reasonable
approximation.
l Now we assume that F ~ is absolutely continuous.
Then also {N k ; k ~ Z }
has absolutely continuous
F N and, using the nota-
spectral distribution
tions in section 1.6, we have
fN(x)
= ~ m + f~ ( x )
where fZ and fN are the spectral densities. culation of ~Z n or Z
= {0,•
and E (~
~
-
g)2
for Z
We will consider the cal-
= {n,n-1,...}
for some finite
This is exactly the same situation as discussed
in the beginning of section A3. We will, however, repeat some of the notations.
Thus ~ is a random variable with E ~ = 0, Var ~ = d
2
and
163
Coy (~,N k) = Pk" The f u n c t i o n cross s~ectral density.
r
_ ~ I2~
Z
Pk e
-ikx
. is called the
From section A3 it follows that
W
9{
tZ
S h(x)ZN{dx]
=
co
-IT
where Z N is the spectral process
S eikx
Nk - m =
zN{dx]
in the representation
,
--'iT
and h a function depending on ~ and Z . Define g(z) by
4w g(z) : exp {A-
For Z
= {n,n-1,...]
h(x/:
the notation
S -w
e-lY" + z log fNCy) dy) e -my - z
it follows from t h e o r e m A3.2 that
1. f~l gIe-~X/ L~/e~X~ '
[ I n is defined in section A3, and further
E ( ~Z
~
)2 = a 2 - i
ix
12
dx .
-~ [g(e )~nI For Z
= (0,•177
it follows from theorem A3.1 that
h(x) : r fN(x) and further
co
-~
fN(x)
~64
Example
I
In o r d e r
to i l l u s t r a t e
prediction
one
= Nn~ I - m
step
how these
ahead.
formulae
Consider
m a y be u s e d w e w i l l
therefore
Z
= {n,n-1,...}
consider a n d put
.
Then we have
r
=
I ~w
~ Z
N -ikx e rk_n_ 1
~ e
-i(n+1)x
fN(x)
k=-~
a n d thus
)2 E
([Z
-
71
N
[
r0
=
oo
dx =
-T[
~ix g(e-~x~ o r2)
17
f
~i(n+1)x g(e-iX)]nl2
f
-
(?~(~)
-
dx
=
--'/T
17
f (fN(x)-
g(e -ix) - go 12) ax =
--17
17
f (gO[g(e-ix) + g(eiX)] - g2o) dx = --17
17
=
217
go2 =
217 exp
~I
~ log f~(x) dx --17
which
is t h e w e l l - k n o w n
result
shown by Kolmogorov
(1939).
m For later purposes
2
we d e f i n e
%red=217exp~
I
'5
flogf~(~) dx. --17
Assume
n o w that
s f (x) <_ c < ~ for a l m o s t 17
a l l x~[-17,w]
and
consider
165
where I ~ ~
h(x)
s
~
e
-ikx
j[z and put
=
qZ k
Since ~-~w ~m
Z h. (N. - m) . jEZk J J
fN(x ) _< __mm +2w
c
it follows from theorem A3.3 that
l.i.m, nZk = 6Z k-~oo
co
and thus, provided E ( ~ Z
- ~)2 # 0 , {qZk} is asymptotically
efficient.
Remark 2 It may be a matter of taste if the q Z k % to compute'. If ~
are called 'rather simple
is rational, i.e. a ratio between two trigono-
metric polynomials, then g is simple to compute. It may be observed that if f~ is rational, then also fN is rational.
If the condition fZ(x) ~ c < ~ is strengthened to
Z
Ir~l <
jEz then the methods used in section 5.2 may be applied, i.e. the sequence {hi}, j E Z
mh. + Z J i~Z for j ~ Z
co
, is the unique solution of
hi
ri-j
=pj
.
Now we specialize to the case ~ = Zk + m and use the notation ~k~ = ~Z~ + m. In this case
I
166
oo
~(x)
Thus for Z
-~ 7
z j=_~
r.
e -ijk
a-k
e -ikx f~(x)
=
.
= {n,n-1 .... ) we have
h(x) =
Fee" J_kx
I g(e-iX)
f~(x) ~
L- g (eix)
n
and 2 g (Z k
For Z
L: -2orx) J n
= r0 -
dx
= {0, I,...} we have
h(x) = e
ikx
fs
fN(x) and
(fE(x))2
_ ~k )2 -~ w
f
dx =
fN(x)
m f~(x) dx
9
-w m + 2z f~(x)
Remark 3 It may be observed that N k - m = (s
- m) + (Nk - s ) where,
theorem 1.6, N k - m may be looked upon as an observation 'signal'
Zk - m with the
'noise' N k - Zk added.
see
of the
The question of
signal measurements
is treated by e.g. Hannan (1970, pp 168-179)
and the estimate s
could as well be derived by using the results in
Hannan.
Compare also Snyders
(1972).
I
167
Example
2
This is strictly speaking no example but some observations.
To
motivate these observations we look upon the doubly stochastic Poisson sequence as a model for the number of claims in an insurance portfolio.
o(n)~ for m + E (s In this example we will use the notation ~k with Z
Let N
n
= {n,n-1,...}
.
be the number of claims which occur during year n and
assume that each claim causes the insurance company the cost of one monetary unit. A natural risk premium for year n+1 is then E Nn+ I = m. If, however, the value of Zn+1 was known, a more
'fair'
risk premium
would be Zn+I" Of course, within the present model, s
, Nn,Nn_ I ...
but if the company has knowledge of the risk premium.
is unknown
it may use
~(n)~ n+1
This reasoning is in insurance mathematics
'experience rating'
and
as
called
g n+1 ( n ) ~ may be called the 'credibility premium'.
After year n+1, when the company has got knowledge of Nn+1, the updated credibility premium z(n+I)~ n+1
may be calculated.
Define Un+ I by
~(n+1)~ + ~(n)x n+1 = Un+I ~n+1 . The quantity Un+ I may be called the quantity'.
In insurance terminology
'updating
it corresponds to 'minus the
bonus'.
Consider E (s
- Zk)2 , a quantity which in general may be rather
2 hard to evaluate. We will show that if k = n or n+1 then only Opred has to be computed.
We have
-
elkX(fN(x ) _
E (Zk(n)a- s )2 = r~ - i --I[
-- r 0 -
l ikx
g(e -ix
_.~
2 dx =
g(e Ix)
>] mFei X 2 n
2"~ Lg(elX)~ n
dx .
]68
Specializing to k = n and n+1 we get
E , n+1
- ~n+1
= ro~ - -wi
=
r@
=
" ei(n+l)x g(e-ZX)
- go ei(n+1)x 2 dx =
fN(x) + g 20 - go (g(e ix ) + g( e-iX) )
-
dx
=
-IT
2 2w go
R, =
r0
-~
(~
-
2 pred
~Fm +
--
r0
+
2-
4w
go'[~ :
2 2w go
-
m
=
m
and
E (s (n)~ _ s
s =
ro
=ro
-
-
i -w
='
einX
/ If
" g(e-ZX)
N(X)
+
m
I
2w
-7-
-w r0
s -
einX 2 go
- -m
2w go
dx
[m
+
r0
m
(g(e ix) + g ( e - i X ) ~
2
2 m
+
2
2m]
=
2w go
m
2
2w go
2 m -
m
2 pred
Because of stationarity we have
E (s
I )~
Zn+1
)2
Consider now the two estimates
We have
~-
go
L :
m 27
(n)~ = E (Ln
s )2 n
~(n)~ z(n+1)~ n+1 and n+1
dx =
169
s n+1
w I ~i(n+1)x fs = m + ~ . -w g(e-lx) L g (eix)
1
= m +
f I. -w g(e -Ix)
m e zN{ax} = 2w g(e Ix) I n
= m +
f -~
I. g(e -Ix)
i(n+1)x g(e-lX)
[el(n+1)x g(e-lX)
= ~ + f e i(n+1 )x
n
zN{dx} =
_ ei(n+1)x
~g(e-ix) - goI zN{dx}
L(o-ix)
and z(n+1)~ w [ei(n+1) n+1 = m + f I. x fZ(x zN{dx} = -w g(e -Ix) g(e Ix) ]n+1
= m +
= m +
= m +
lei(n+1)x g(e-lX) " f 1. -~ g(e -Ix)
~
I
-~
g(e -Ix)
i(n+1)x
ei(n+1)xl m
zN{ dx] =
2~ go
Ig (e-ix ) _ m 2g0] ZN{dx) = fw e i(n+1)x . -w g(e -mx) ~pred ] i(n+1)x
= m + f -~
:m+
g(e-~X)
m_ e__~.+ i(n 1)x __I zN{dx} = 2~ g(e Ix) ]n+1
(I
m
2 pred
e
g(e -Ix)
1
g(e-lX_ go 2m ) g(e-lX) + ~m ~pred ~pred
m ) - m) + m , (n)~ m) = 2 (Nn+1 7 - - - ~n+1 ~pred ~pred ~(n)~ + (I n+1
m ) 2 Nn+1 ~pred
zN{dx)=
170
We have
= s Un+1
s n+1
~ = -~
n+1
-e-(n+1)x ----:--. - ig g(e-lX) 0
m ~ 2w g
zN{dx}
and thus
EU 2 n*l
=
~
m = 2"rr
2 pred
+
0 -
2 m 2 epred
~
-
2
2Tr g'
2m
= 2~T
2+( 0
)
_
=
0
.
In Grandell (1972:1, pp 548-552) similar results were derived by use of Toeplitz forms. In these derivations the spectral distribution F Z was not assumed to be absolutely continuous, but since theorems A3.I and A3.2 can be generalized to not necessarily absolutely continuous spectral distributions, the results in Grandell (1972:1) can be derived by the method used in this example.
Let us go back to the application to insurance models.
The simple formula
z(n+1)~ _ n+1
m 2 pred
z(n)~ m ) n+1 + (I - - 7 - - Nn+ I pred
seems attractive, since it means that the policy holder has a possibility to understand how the number of claims year n+1, i.e. Nn+1, affect the bonus year n+1.
Example
3
Consider the case m = I and r~ = g
plJl, IpI< I , and
e x a m p l e may be r e g a r d e d as t h e d i s c r e t e
parameter
~ = Zk - m. This correspondence to
171
the
case
following
studied
in e x a m p l e
Hannan
(1970,
5.8.
pp
We w i l l
give
a direct
derivation
171-172).
We have
f~(x)
2 1 - p 2 1 + p - 2p cos x
= 1 27
and
fN(x ) = 1
2(1
27
In order
to derive
tion
g. We w i l l
g(z)
is a n a l y t i c
-
p cos x )
1 + p2 _ 2p
the
estimates
use the
facts
and without
cos
x
Zk we have
that
to c a l c u l a t e
g(eiX) 9 g(e -ix)
zeros
in
the
= fN(x)
and that
Izl < I.
Since
fN(x ) = I_ 27
.. 2 -
p e
2
l+p
ix
-p
- p e ix e -p
--ix e
-ix
we c o n s i d e r
I
I
2w
1
=_I_
-
+
p
p_
2~
b
pz
2
-
pz
pz -
-I
I
pz
-1
(z-h)(bz-
2~
I)=
(z - p ) ( p z
- I)
where
b =
1-
~/1-p
2"
P I
Thus I - bz
g(z) =
I - pz
+
(z - b ) ( z
(~
p 2wb
p)(z
-
I - bz I
-
pz
- b -I) = p-1 ) -
I - bz I
-
pz
func-
-I -I
172
Since
Ibl <_ Ipl
it is seen that g(z) is analytic and without zeros
in Izl_< 1.
Consider now, despite we loose some systematics, the case Z
= {0,•177
In this case we do not need to calculate g, but
still a similar factorization will be used. In this case
e ikx fZ(x)
h(x)
=
=
eikx(1 _ p2)
* '
fN(x)
2(I - p cos x)
e ikx (1 - p2)b"
p(1
- h eiX)(1 - b e -ix )
eikx (j -- @2)b I P(1 b 2)
1 -p
~z
2
j=-~
I -- + I - b e ix I - b e -ix
hlJl ei(J+k) x
2
= ~-P 0
hlJ-kl 2
and ~I
E(~
Consider now Z
p2'
- ~k )2 = ~1 -
= {n,n-1,...}. Then
h(x) = I
re
g(e -ix) [
#I
-
2
i.e.
h.
-
~kx f~(x) g(e ix)
: n
I] =
S 2~ ~z j=-~
b
lJkleiJ=
173
feikx
I - p2
2zb
I - p e -ix
I
P
1 - b e-iX
/
l;
P'
L
V
2Trb
b(1 - p2)
2w(I - p elX)(1 - p
I - p e
/
I -belX I
-~ eix
e " I - b eZX)(1
I - b e -ix
p
]
e~X)I In
n
- p e -ix
We have
I
ikx
]
ikx
e ---- e
(I - b eiX)(1 - p e-ZX) n
e I - p b
[,'eiXi,1 eiXi[ n-k
I + -ix - b e ix I - p e
n-k
and thus
h(x) =
2
#I - p I + T ~ _-p- 2
e
ikx(! I
P e -ix) b e-iX
I
I beiX + 1-pe -ix -
We separate the two cases k < n and k > n. Consider first k < n. Since n - k > 0 we have
h(x)
=
#1-p 2 I+-~7
e
ikx(1_pe-iX) l-be -ix
eikX(1_pe -ix)
1+~-p
l-be
--iX
(1-bp)e ikx -
be
n-k 9 .. I {( Z bJe IJx) + "} = j=1 1-pe -ix
ixn-k ei(n-k)x) I1-b 1_be ix
1_pe-iX)b n-k+1 ei(n+1)x
(1-beZX)(1-be -zx)
+
-
I
-
1_oe -ix
~n-k
"
174
oo
= ,, ~
{(1_bp)eikX
I z b Ij leijX 1-b 2 j=-.
_ (1_pe-iX)bn-k+lei(n+1)x)
Thus
hj = --~{(~ bp) blkJl
bn-k+1
bln+l-Jl + phi-k+1 bln-Jl}
or
{blk-Jt
2 h.
+ b2n-k-J +2}
if
j
if
j>n
=
J 0
and thus
X /- 1p2
E(Z k - s ) 2 -
Consider
2
{1 + b 2(n-k+1)}
.
k > n. Then
{I h(x)
1 +
eikx(
-ix 1 - pe ) -ix I - b e
- p2
=
2 1 - p ~ _ p
P
k-n
I -be
e
1 - p e -i
inx -ix
Thus
~1-p h,
J
2
pk-n
bn-D
if
j
if
j >n
~-
0
n-k
175
Further
1T
E(,~-
.~k)2 = 1 -
]" Ih(x)l 2 f"(x)
d~ =
--'IT
=
1 -
(1
-
s
p2(k-n)
~
2
1 -
=
dx
-~
I1 - p e
ix
I
2
p2(k-n) 1 +
ql
-
p2
Put k = 0 and consider E(s
"
- ~)2
as a function
of n. To summarize
we have
21nl I
P
ifn
I+~i-~~
E(~% - ~o )2 =
To illustrate figure
-
2
p
{1
E(Z~ - Z0 )2 we consider
17 this function
2
J
2 ~1
+
(I -
p
~I
- p
2n+2 )
--
it as a function
is given for some values
} if n > 0
of n.
of 0. In
176
1.0
0.5
0,25
0.5O
0.75
Figure 17: Illustration of E(L~ - ~0 )2
Consider now Zn = {1,2,...,n). Then a natural approximation of Lk' when ~ = Zk - I for k ~ Z n ,
~
2
p2 ~Z
n
=~
I~
{i~p2
is
n b2n_k_j+2) Z (b Ik-jl + (Nj-I) j=l
if k equal to or near n
n j=1
h Ik-jl
n (blk-J
Z j=1
(Nj-
I +
I)
b~+j ) (Nj-I
if k not near n or I
)
if k equal to or near I
As illustration we have applied this approximation to the random generation G7 described in section 2.5. Thus we use 1 + nZ5 0 as approximation of Lk. In figure 18 which is equal to figure I with the approximations of Zk added, this is illustrated. Figure 18 may be compared with figures 4(a), 5(a) and 6(a).
177
50
25 In e&ch point k the height of the spike represents N k , the value of the plecewise constant curve represents ~k smd the v~lue of the continuous curve represents the approximation of Z k.
Fisure
18: l l l u s t r a t i o n
A natural
question
of e s t i m a t i o n
on g e n e r a t i o n
is a r e a s o n a b l e
is now if qZ
G7.
approximation
M
of ~Z n
n in the sense that E(q Z
_ ~)2 ~
E(~
n this we have c o n s i d e r e d means
integer part.
_ ~)2
To get some idea of
n ~ = ~ n+1 - I and ~ = ~ - I, where En~] n
In table
2 and 3 we have
calculated
E
E(q Z
] _ ~)2
n and E ( ~
_ ~)2 for some values
of n and some values
of ~. The tables
n illustrate
both h o w
convergence
'good'
of E(q Z
the a p p r o x i m a t i o n s
_ ~)2 and E ( ~ n
where
~ = Z
are and the rate of the
_ ~)2 to their
limits.
In table
2,
n
- I is c o n s i d e r e d ,
the a p p r o x i m a t i o n
n
n
~I-P2 : ----Z--
~Z
(I + b 2)
n
is used. In table
z j=1
3~where
Ln ] -
~ = ~ n+1
b n-J (~
- m) J
I is c o n s i d e r e d ~ t h e
approximation
178
1_~,2
nzn is
=
2
n+1 l IL~]-Dl
n Z
j=1
b
(N'-S -
m)
used.
~ =s
p=o.25
n
1
0=0.5o
p=o.75
E(~Zn-~)2
E(nZn-~)2
E(~Zn-~)2
E(nZn-~)2
E(~Zn-$)2
E(nzn-$)2
i
0.50000
O.5OO13
o.50o00
o.5o258
0.50000
0.52076
2
I0.492o6
0.49207
0.46667
0.46686
0.41818
0.42311
3
0.49194
0.49194
0.46429
0.46430
0.40217
0.40321
4
0.49193
0.49193
0.46411
0.46412
0.39894
0.39915
0.46410
0.46410
0.39828
0.39832
I
0.39815
0.39816
7
0.39812
0.39812
8
0.39811
0.39811
0.39811
0.39811
5 6
|
0.49193
0.49193
o.4641o
Table 2: Illustration of the rate of convergence.
o.4641o
'goodness'
of approximations
and the
179
~ =s
~=o~5
n
1
-i
~:o5o
~~
E(~n-g)2
E(RZn-~)2
E(~n-~)2
E(.Zn-~)2
E(~n-~)2
E(~Zn-g)2
0.50000
0.50050
O.5O0O0
O.50898
0.50000
0.55731
2
0.49206
0.49221
0.46667
0.47011
0.41818
0.45201
3
10.48438
0.48438
0.43750
0.43798
0.35938
0.37178
4
0.48425
0.48425
0.43541
O.43561
0.34749
0.35371
5
0.48413
0.48413
0.43333
0.43336
0.33636
0.33850
6
0.48412
0.48413
0.43318
0.43320
0.33410
0.33521
0.48412
0.43304
0.43304
0.33186
0.33224
8
0.43303
0.43303
0.33141
0.33161
9
0.43301
0.43301
0.33095
0.33102
i0
0.33086
0.33090
25
O.33072
O.33072
O. 33072
O. 33072
7
0.&8412
Table
3:
0.48412
lllustration
0.43301
of the
0.43301
'goodness ' of approximations
and the
rate o f c o n v e r g e n c e .
Now we will give
any help.
First the A3.2
consider
we
case
some
For the
consider
cases
rest
linear
estimation
compute
the
case w h e r e
the
not
absolutely
continuous.
l.i.m. n-~o
of this
~k = m + ~ + Sk w h e r e
in o r d e r t o
where
~Z
spectral
theorems
section
We will then which
of the get
~Z
- A3.3 Z
n
i.e.
Formally
the theorem
distribution
-- Z N. = m + ~, a r e s u l t n j=1 J
we p u t
of the level,
~ = Z{{0}}.
' since
A3.1
we
can b e
= {1,...,n}.
we
consider
can u s e t h e o r e m generalized
observed = ~ since
is o f no help.
do not
process
to is
18o
We use the n o t a t i o n
Rs = n
{r[
~-0
.}
=
{Cov(si,E
and further we denote by I vector
(I,...,I)
n
J
)}
the n•
with n components.
1 < i --'
'
j
< n --
identity matrix
'
and by ~
the
It shall be r e m e m b e r e d that
and s k are u n c o r r e l a t e d
for all k.
The f o l l o w i n g t e c h n i c a l
lemma will be used in order to i n v e s t i g a t e
the asymptotic
properties
of ~Z
" n
Lemma
I
For all n > I we have
1 =
(mI
1 --I1
+ o 2 1' n
-I1
1
+ RE ) - I
~
2'1
0
2
1 +
1'
1
-"n
-"rl
(ml
+ RS) -I I' n
n
--n
Proof
In Grandell given 9
(1972, pp
103-104)
a probabilistic
The p r o o f to be given here was
Put R = ml n + R ne
and -a = oI -n'
p r o o f of this
lemma is
s u g g e s t e d by B. yon Bahr.
Thus we shall prove that
(~ (R + a'_a)-I _a')-1 = I + (a_ R -I _a')-1
Let B be a symmetric positive put ~ = ~ B -I
definite m a t r i x
Let C be an o r t o n o r m a l m a t r i x
such that B 2 = R and
I~I
such that b C =
i
I n
where Ikl = (~ _~ b2) 2
and ! = (1,0 ..... 0). Then we have
a = bB =
I and thus
(~ (R + a'~) -I S ) -I =
= <1~12 i c,B (B B + I~12 B
c
i ' i C ' B ) -1 B
C i')
-1
=
I~I~ c,B
181
:
(Ib_..I2 i
= (l_b-I 2 i
c'
(z + }b__l2 c i ' i
( I + Ibl 2 i ' i )
c ' ) -1
c i ' ) -1 =
-1 i ' ) -1 :
= (Ib_l 2 (1 + I A 1 2 ) - 1 ) -1 = 1 + I s
and
(_~ R-I _~,)-I = (b_ B B-IB-IB b_,)-1 = Lkl-2
and the lemma follows.
Using lemma I and the expression
_ ~)2 given in section
for E (<~ n
6.1
we
get
u
E
(~ z
I (ml + RE) -I I' --n n n --n I u2 + I (mI + R~) -1 1'
_ ~ 2= ,
n
~
If F e, i.e. the spectral
distribution
absolutely
a neighbourhood
version then,
continuous
2
in
fS(x) of dFS(x-----~) is continuous dx
n
n
~
for {Sk }, is assumed to be of
x = 0 and
in a neighbourhood
see theorem A3.4,
= m + 2~ rE(0) + ~ I
--n
(ml n + Re) -I
n
I'
--n
and thus
lim n E ( ~ n+~ n
if
~)2 = m + 2w fs(0)
.
some
of x = 0
182
From theorem A3.4 it follows that under the above assumed regularity conditions
a natural approximation
of ~Z
is n
n
I =--
~_
Z N. - m. j=1 0
n
n
We have n
1
E (8Zn
Z N. - m - ~)2 = j=1 g
$)2 = E (n
n
= E ( -1 n
7.
N.
j=1
-
s
+
a
o
~T. a
s
s
-
m
-
~)2
=
O
n
=E
(1
Z ~=~ j
~
n
:
s
I (~-
+ s.) 2 = a
N. _ s . ) 2 + E ( 1 a a
z j=l
= I___ A 2
a
n
n
z j=l
~ .)2 -J
(mIn + Rs) I' n --n
n
and thus, see theorem A3.4,
lim n E (n Z _ ~)2 = m + 2w fs(0) n- ~ n
from which it follows that, provided F s fulfils regularity
assumptions,
{qZ ) is asymptotically n
the assumed efficient.
Remark 4 E s t i m a t i o n of the level is considered in more detail in Grandell (1972:2, pp 92-106). conditions sequence
We just mention that without
any regularity
on F C , except the condition F~(0) - FS(0 -) = 0, the
183
~Z
= n
I (ml + RE) -I N' --n n n --n -I I (ml + R E) I' --T1
where ~ both
= (NI,...,N n
~Z
a n d
tion
n
,
, is a s y m p t o t i c a l l y
requires
n does n o t
of n Z
- m
n
an i n v e r s i o n
efficient. of a n•
Calculation
matrix
but
of
calcula-
2 require
knowledge
of o
.
n
9
4
Example
Consider
case m = I a n d r~ = P I k l
the
-1
< p < 1
Then
lim n E
(~Z
n -~~176
In f i g u r e s
<
)2
=
2 1 - p
n
19-21
we h a v e
drawn
n E
(~Z
~)2
for some v a l u e s
of
n
n a n d for 2 n E
(n Z
= I a n d a 2 = 10 r e s p e c t i v e l y .
_ ~)2
Further
we h a v e
drawn
where
n
~Z
for t h e
~Z
is
n
1 --- n
same v a l u e s
n
~ j=1
N. - I 8
o f n.
Figures
as an approximation
o f ~Z
n
19-21
thus
and the
rate
illustrate
of convergence
n
n E (~
_ ~)2 a n d n
n E (n Z
_ ~)2 n
to their
how
limits.
'good'
of
184
Figure
19:
I
I
I
I
I
I
I
I
i
I
I
i
2
3
h
5
6
7
8
9
lo
25
lllustration rate
~ f the
of convergence
curves
represent
'goodness' for t h e
n E
(qZ
= I0
and
n E(~
I
of approximations
c a s e p = 0.25. _ ~)2
n E
From
(~
n
2
.
n
and the above
the
_ ~)2 for n
_ ~ ~2
for
o2 = 1
n
Figure
20:
I
I
I
I
I
I
I
I
I
I
I
I
I
2
3
4
5
6
7
8
9
lo
25
|
Illustration rate
of the
of c o n v e r g e n c e
curves
represent
'goodness' for the
(qZ
n E
of a p p r o x i m a t i o n s
case
p = 0.50.
2
=
10
and
n E
x (~Z
- 5) n
and the above
- ~ ) 2 , n E (~Z~ - ~)2 f o r n
o
From
n
n
2
for
o
2
=
I
the
185
I
I
I
I
I
I
I
I
I
I
I
I
i
2
3
~
5
6
7
8
9
10
25
Figure 21: lllustration of the
'goodness'
rate of convergence curves represent
of approximations
for the case p : 0.75. From above the
n E(n Z
_ g)2, n E(g Z n
and n E ( ~
and the
6) 2 for 2
g
)2
for ~
2
= 10
n = I
n
9
To end this section we will consider a case where the variable to be estimated depends on Z . The case we have in mind is when the average n intensity
~ n
Since ~
I nZ Z. is to be estimated in terms : -n j=1 J
depends on n, theorems A3.I-A3.3
of NI,...,N n.
are not applicable.
From
n
section 6.1 it follows that the best linear estimate ~
n
terms of N I,...,N n is given by
~
n
1 I : m + -n~
R s (ml n + R~ )- 1 (N - m ~ n n --
),
and 2 E (Z:-
~ )2 n
m n
m n
"2 1--n (mln +
Rs -I I' n --n
of [
in n
186
Thus it follows from section A3 that if Fs
E (#n~ - #n )2
- Fs
-) > 0, then
2 m n
m 2 n
I
Fs
- Fs
-) + o(1)
Thus
lim n E (s~ - s )2 = m
and since
-n n E (N
~n )2 =
m,
n
where N
"~ Z Nj, it follows, with a slight m o d i f i c a t i o n n n j=l
definition
1, t h a t
{N } i s n
asymptotically
More interesting is the case Fs Fs
is absolutely
continuous
that some version fs
of
efficient.
- Fs
-) = 0. We assume that
in a n e i g h b o u r h o o d
dFs dx
of
is continuous
of x = 0 and in a n e i g h b o u r h o o d
of x = 0. Then it follows from t h e o r e m A3.4 that
E
(#~ - s )2
m
m
n
2 2
I
m + 2w fs
+ o(I)
Thus
2 lim n E (~n ~ _ ~ )2 = m n n+~
Obviously however,
m
. . =
m + 2w fs
{N } is in this case not asymptotically n consider
aN
n
+ b.
T h e n we h a v e
2w mfs m + 2w fs
efficient.
Let
US
187
-
2
n E "--,(aNn + b - ,%n ).
=
= n
(1
E
~a(N n
-
~n ) -
= n ~a._~ + ( 1 - a )
2
-
a)(~ n
- m) + b -
2w f ' % ( O ) +
o(1)+
(1 -
(b-
a)m] 2 =
(1-a)m)21
=
n = a2m + (1 - a) 2 2w f ~ ( O )
+ o(1)
+ n(b
-
(1 - a)m) 2 .
Thus we must have b = (I - a)m. To get the asymptotically best choice of a we minimize a2m + (I - a) 2 2w fZ(0) and thus we get
a ---
2w f Z ( O ) m + 2~
f~(0)
and m
b =
2
m + 2~ f~(0)
For this choice o2 a and b we have
lim n§
n E (aN
+ b - ~ )2 = n
n
2w mfs m + 2w fZ (0)
and thus, p r o v i d e d fZ(0) > 0, it follows that
m2 + 2w f~(O)
n
m + 27 f'%(O)
is asymptotically
Example
efficient.
5 (Continuation of example 3)
We have m = I and 2w fZ(0) =
1-p (1
-
2 p)2
= l+p 1 -
p
188
Thus
lim n E (s ~ - s )2 = n-~co
I + P 2
and
1 - p + (1 + p) N n ]
I
2
ia asymptotically
In figure 1 -
efficient.
22 we have p +
(1
+ p)
n E (
N
n E ( ~~ - ~n )2
for some values
0.50 and 0.75 respectively
1 -
p +
(1
'good'
rate of convergence
and
2
n _ ~ ) n
2 p = 0.25,
drawn
+ p)
N
n
of n and for
in order to illustrate
is as an a p p r o x i m a t i o n
of the drawn quantities
of ~
n
how
and the
to their limits.
189
0.9 --
0 ~ 0 . 7 5
--
p=0.50
__
p:0.25
0.8
=
0.7
-0.6
0.5
I
I
I
i
2
3
I
J4
Figure 22: lllustration
I
I
I
I
I
5
6
?
8
9
of the
'goodness'
rate of convergence.
I
!
io
of approximations
1-p+(I+p)Nn n E (
and the
)2 ~
and n
2 -
I
For each value of p the curves
represent from above
nE(~ - ~n)
.
25
2
Consider now the r ~ n d o m generations
GI-G7 described in section 2.6.
In table 4 we give the values of ~
taken from table
n
mative estimates
I
~ 1 +p p + 2(I + p) Nn -
approximation
of
I, the approxi-
p + (1
E ( 2
+ o)
and N
2n 2
n _ ~ ) n
w h i c h is an
190
I-0+( 1+p )N n
Name of n
p
generation
n
GI
500
0.0
0.993
0.991
0.032
G2
500
0.0
1.025
I .026
0.032
G3
500
0.0
1.018
I .023
0.032
G4
500
0.75
0.929
0.899
0.042
G5
500
0.75
0.878
0.815
O.042
G6
500
0.75
0.933
0.979
0.042
G7
50
0.75
0.860
0.876
0. 132
Table 4: lllustration
7.
of estimates on random generations.
ESTIMATION OF SECOND ORDER PROPERTIES
OF STATIONARY DOUBLY
STOCHASTIC P01SSON SEQUENCES
Consider~ like in section 6, a stationary doubly stochastic Poisson sequence N = {N k ; k E Z} together with its underlying random measure = {Zk ; k s
In section 6, where linear estimation of random
variables was treated, we assumed m = E ~k to be known.
In general these quantities
and
rk = C o v
are unknown,
(~j ,~j+k )
and therefore
have to be estimated.
In this section we will study estimates of the
covariance structure.
We will, however,
assume m to be known.
If it was possible to observe Z the problem to find the estimates were
'standard'
time series analysis.
observed and we have to find estimates
In general ~ can not be in terms of an observation
Since also N is a stationary time series, we do really never leave
of N.
191
'standard'
time series analysis.
We will in this section assume that we have an observation NI,...,N n of N and we will compare natural estimates natural estimates
in terms of N with the
if we had an observation s
of Z.
In section 6.1, where linear estimates was studied for finite observations, the results were b a s e d on the covariances
r~ while in sec-
tion 6.2 the results were b a s e d on the spectral density fs therefore
study estimates
We will
of r k when n is 'small' and estimates of
fZ when n is 'large'. This division will also from the point of v i e w of estimation be natural.
As will be seen in example
I, the word
'small' has to be liberally interpreted.
We will always assume that E s
4
< ~ and that s is stationary up to
the 4th order. Thus the quantities
m
=Es
rk
= E (s
rk, ~
= E (.~) - m ) ( ~ . v + k - m)(.~ +~
-m)
r k , j , .i
= E (~)
-
D
exist and are independent
- m) ( ~ u + k
-m)
- m) (g~)+k - m ) ( g v + j
m)(s
-
m)
of ~. Observe that m and r k are defined
as before and, more important, that rk, j must not be confused with Cov (Zk,Zj) for a non-stationary
stochastic
sequence.
We note that, contrary to the situation when linear estimation is studied
(cf remark 6.3), N k - m can not here be considered as an
observation of a 'signal'
~k - m with an independent
Nk - Zk added. The reason is that here properties
'noise'
up to the 4th
order are needed, while for linear estimation only properties to the 2nd order are needed
(cf t h e o r e m
1.6).
up
]92
The quantities
n lkl Ck =
(~. J
Z j=1
-
)
m (s
I
-
m)
and
n-lkl CkN =
j=IZ
will be important
7. I
(Nj - m)(Nj+ k - m)
for the construction
of the estimates.
Esgimation of t h e cova~iances
Suppose
that ~]'''''~n
rk = ~
is a natural
and N I,...,N n are observed.
Ck
estimate
of r k in terms of s
Vat rk = 0( 1 ) under rather general see section
and it is known that
conditions.
Since r N = 6km + r k,
1.6,
=
is a natural
We observe
Then
ckN
I
estimate
_ 6k m
of r kZ in terms
of N.
that E r k = E r k = r~ and will compare Var r k with
Var r k. After some calculations,
cf Grandell
(1971, pp 227-229),
we get n
n Vat r 0 = n Var r O + Var
+ m + 2 In2 + ( 6 - m ) r ~
and for k r 0 and 2k < n
n
(1 E ~.) + 2 E (n-j) r~ n j=1 J n j=-n 'J
+ 2r~,~
193
(n-k) Var rk = (n-k) Vat r k + (n-k)
[m2 + r k~ + 2mr 0 +
~ + rk, k + r0,k] + 2(n-2k)
[Zr~2k + r kZ ,2k]
For large values of n these formulae do not give much information on the behaviour of Var r k. If {~k - m} is a linear process,
closed
forms for lim Var r k exist, but unfortunately these formulae can not n§ be applied to Var r k since {N k - m} is not a linear process. An unpleasant property of estimates of r k is that in general lira n Coy (rk, r~) is equal to a non-zero n-~ A good discussion of estimation Hannan
Example
constant
of covariances
also when k ~ j.
is found in e.g.
(1960, pp 34-45).
I
In order to get some idea of the relation between Var r k and Var r k we consider the case described in section 2.5. These random generations were used by Grandell tion of the estimates.
In spite of what is said above
of r k for large values of n, we have in figure 23
drawn Var
lim--
109-113) as illustra-
It shall be observed that in this case
{~k - m} is not a linear process. using estimates
(1972:2,
~k
n+~ Var r k
as a function of p for s'ome values of k.
194
lira
n~
Vat
~k
Vat
r~ k
I
0.25
Figure
23:
Illustration a n d Var r_ K
Consider
the
estimate
of the
I
1
I
0,50
0.75
I
asymptotic
relation
between
V a r rk
.
r0"
F o r this
estimate
we h a v e
(cf G r a n d e l l
(1972:2, p 110))
l i m n V a r ro =
and thus must
to f u l f i l
the r a t h e r
modest
requirement
V a ~ r r0 ~ 0.1 we
have
n
Thus,
13 1 + p + 21 I - p
~
100 {13
as e x a m p l e s ,
p = 0.25,
n
~
6000
1 + p + 21} l - p
we m u s t if
have
n
p = 0.50
.
~
3400 and
if n
~
p = 0, n 11200
if
~
4267
if
p = 0.75.
195
7.2
Estimation of the spectral density
Assume that F ~ is absolutely continuous and consider estimates of the spectral density f~ (see section 1.6). In section A3 a short discussion of spectral estimation is given. Suppose, like in section 7.1, that Z1,...,~n and NI,...,N n are observed. Since
f~(x) : 2~m + f~(x) (see section 1.6) it is natural to compare the estimates
~(x)
:
I
n-1
z
2~n
(n)(x)
wk
z
Ck e
-ikx
k=-n+1 and
Rx) :
I 2~n
I
2wn
of fs
n-1 (n) (x) N e-ikx _ m__ = k=_~n+1 Wk Ck 2w
n-1 E
(n)(x) ( N nm6k ) e-ikx wk Ck -
k=-n+ I
The coefficients w~n)(x) correspond to the chosen weight
function Wn(Y:X ). Since E f(x) = E f~(x) we do not consider the bias of the estimates.
If s is Quasi-normal, see section A3, we have good knowledge of the asymptotic behaviour of Var f~(x), see theorem A3.5. It is thus natural to investigate under which conditions on Z also N is quasi-normal, since then we also have good knowledge of the asymptotic behaviour of
Va~ }(~).
Put
. . - r.r. Pk,j,i = rk,j,i - rk r l-j J 1-k - r.r. i j-k"
is quasi-normal if, in addition to the general assumptions given in the beginning of this section
196
oo
z
I < Iik
<
*
k=-~
and
z
Ip~k,j,il
<
~
9
k,j ,i~Z
Theorem
I
Let s be a quasi-normal if
E k,j~Z
Jrk,-'j k I
<
sequence.
Then N is quasi-normal
if and only
~
Proof Since s is quasi-normal
it is stationary
up to the 4th order. oo
too is stationary
up to the 4th order.
Since
Z
s irkl < ~
Then N and
k=-oo oo
since rN = 6km + r~
also
N
Put
Z
IrNI < ~ .
s
d~,j,i : Pk,j,i
N
- Pk,j,i
N =
Pk,j,i
r
. k,g,i
-
where
NN rkr.
1-j.
-
NN r.r.
8 m-k
-
NN r.r.
i a-k
and N
rk,j ,i : E (N O - m)(N k - m)(Nj
Since
E
k , j ,ieZ if and only if
s .IPk,4, i] Z k,j ,i~Z
< ~
we h a v e
- m)(N i - m)
E
k , j ,i~Z I~,j,iI~
(d = dk,j, i)
<
< ~. it is enough to consider the
the case 0 < k < j < i. After rather lengthy that
N '~lPdj,il
calculations
it follows
197
d = 0
if
O
if
O=k
d = rk, i
if
O
< i
d = rk, j
if
O
=i
r.~ s l + 3r0,i
if
O=k=g
d = r k + 3 r k~ ,k
if
O
=i
if
O=k
= i .
d=r.
d=
d=
j,i
r~. + r.
g
+r
j ,j
O,j
.
In order to illustrate the calculations we consider the case 0 = k < j = i. In this case
d = ~ ~0 - E ~Z0
- ~)2(~j
-m)~-
- m)2(Zj
- m) 2] + (r 0)
= ~ [i~o _ ~)2(~j
(r0N)2 - 2(rjN.)2
2
~)2
+ 2(r
- m ) 2] - E [ ( s
m)2(s
=
m)2] - m2 - 2mr~
g
Since
E [(N O - m ) 2 ( N j
= s[~
- m) 2] =
{(N o - ~)2 I ~o,s
s {(Nj - m) 2 [ ~o,~j}]
= E {m + (~0 - m) + (~0 - m ) 2 } { m
=
+ m
+
+ (~j
+
- m) + (Zj
- ml*I*j
- m) 2} =
-
mS
we get
d = r L. + r ~. s J J ,J + r0, j co
From the different forms of d it follows, since
k,jZiEz.
,j,i
I < ~ if
and o n l y i f
E Irk,jl k ,jEZ
s
Z ,,Irkl < ~, that k=-co < co
198
If both ~ and N are quasi-normal
processes,
it follows
from theorem
A3.5 that
Var f~(x)
2w(fZ(x))2 n
~ w2( n
--7
:x) y
and 2 m
Var f(x) ~
2~(f~(x) + ~ n
)
S W2n(Y:X) dy -w
if f~(x) > 0 and x # 0,w . (For x = 0 the figure 2 has to be changed to 4). Thus for x ~ w and f~(x) > 0 we have
Vat f(x) lim
independent
Example
of the chosen weight
2 (continuation
Consider
f~(x) + =
functions.
of example
the random generations
I)
described
in section
2.5.
In this
case we have for 0 < k < j < i
m
=
I
rk
= p
rk, j
= 2p J
Pk,j,i
= 8pz - 2PZ+J-k
k
In generations p = 0.75.
GI-G2 we have p = 0 and in generations
From theorem
G4-G6 we have
I it follows that both Z and N are quasi-
normal.
Consider the simple
'truncated'
choose m500 = 10. The reason
estimate
described
in section A3. We
for such a small choice of m500 is two-
199
fold. Firstly the standard deviations of the estimates of rk are rather large, and secondly the estimates r~ and rk based on the random generations GI-G6 are given by Grandell (1972, pp 112-113) for k = 0,1,...,10 and we want to use the same random generations as illustration of the estimates of the spectral density.
For the random generations GI-G6 we have calculated f~(1~) and ^ p~ f(10 ) for p = 0,I,...,9. In figures 25-27 and 29-31 these estimates are drawn. In figures 24 and 28 the values of
Var f~(
) and
^
Var f(10 ) , taken from the asymptotic formulae, are drawn for p = 0,I,...,9 .
0
0
i .....
i ....
i ....
| ....
~ ....
1
2
3
4
5
i ....
6
i ....
7
i .....
i
8
9
p
Fi6ure 24: Standard deviations in the case p = 0. The solid lines connect the approximative / V a r and the dotted lines the / V a r
f (i~)'- values
f~(1~)'- values.
200
0.3 0.2 0.i 0 0
!
2
3
4
5
6
7
8
9
Figure 25: lllustration of estimates for generation GI. The solid lines connect the f (10) - values and the dotted lines t h e f (10) - v a l u e s .
The ' + ' - s i g n s
represent
fl(~).
0,3 0.2 9 .+ .... +
/ ...+....-F
0.i I
I
0 0
i
I
I
I
I
I
I
I
I
2
3
4
5
6
7
8
9
Figure 26: lllustration of estimates for generation G2. The solid lines connect the f (10) - values and the dotted lines the f (i~) - values. The '+'-signs represent fl(1~).
201
0.3
0.2 0,iI I
I
I
I
0
1
2
3
I
]
I
I
I
I
4
5
6
7
8
9
Figure 27: lllustration of estimates
for generation G3. The solid
lines connect the f (I0) pw - values and the dotted lines
~.p~.
the f ~i0 ) - values.
The
'+'-signs represent
Dw fg(~).
i
0.2 ?
J
] .....
0
l
2
l ......
3
I ......
4
4 .....
~ .....
,r ....
j ....
j
5
6
7
8
9
Figure 28: Standard deviations
in the case 0 = 0.75. The solid
lines connect the approximative and the dotted lines the / V a r
/Var
f (i~i - values
f ~i0 ) - values.
202
1.0 I
0.5
0
i
2
3
4
5
6
7
8
9
Figure 29: lllustration of estimates for generation G4. The solid lines connect the f (10) pw - values and the dotted lines the f ~i0) - values. The '+'-signs represent f/(
).
203
+ 1.0
o.5
o
i
2
3
Figure 30: Illustration
4 ~5
/ 7
of estimates
8 ~'-"
for generation G5. The solid
lines connect the f (i~) - values and the dotted lines the f [10 ) - values.
The
'+'-signs represent
Dw fg(~).
204
05
....~
0
i
2
3
4
5
6
7
8
....
+ 9
Figure 31: lllustration of estimates for generation G6. The solid lines connect the f (i~) - values and the dotted lines the f~(1~)U -values.
The '+'-signs represent fg(~w).iu
20.5
At.
POINT PROCESSES AND RANDOM MEASURES
In this survey, results given without references taken from Jagers
or proofs, are
(1974).
Let X be a locally compact Hausdorff topological basis. X is then o-compact
space with countable
and metrizable with a complete metric.
Let B(X) be the Borel algebra on X, i.e. the o-algebra generated by open sets, and let M be the set of Borel measures The set of continuous C K. Let ~ , ~ I , ~ 2 , . . . ~ M
X
on (X,B(X)).
functions with compact support is denoted by and define ~n § ~ by
X
for all f ~ C K as n § ~. This definition of convergence
corresponds
naturally to the vague topology on M (cf Bauer (1968, pp 182-191)), which makes M Polish
(cf Kerstan, Matthes and Mecke (1974, p 238))
i.e. separable and metrizable with a complete metric.
Let
B(M)
be
the Borel algebra on M. A class of sets is called a z-system if it is closed under finite intersection. Theorem I Let A ~ B(X) be a w-system containing a basis for the topology o on X. Then {~M
B(M)
is equal to the o-algebra generated by the sets
; w{A} ~ x} for all A 6 A o ,
x6R.
Consider some subspace M'~ M endowed with the relative topology. Borel algebra
B(M')
on M' is equal to { B ~ M '
; B~B(M)}
The
(cf Bauer
(1968, p 166)) and thus equal to the o-algebra generated by {~eM'
; ~{A} ~ x}, A ~ Ao, x E E .
an intersection
M' is Polish if and only if it is
of a countable number of open sets. Especially every
closed subspace of M is Polish.
2o6
Let N be the set of all integer or infinite valued elements in M. N is closed, and thus
N~B(M) and N is Polish.
Let A be a random measure
(see definition
1.1).
Theorem 2 For any B(X)-measurable
function f : X § R which is bounded and has
compact support or is nonnegative random variable
f f(x) A{dx} is a real-valued X (except that the value + ~ might be allowed).
An element v ~ N is said to be sim~le if v{{x)} = 0 or I for all x ( X . The set N
of such elements
o
An element ~ M The set M
is said to be non-atomic
of such elements
o
Definition
is a Borel set, i.e.
N ~ B(N). o
if ~{{x}} : 0 for all x ~ X .
is a Borel set, see t h e o r e m 1.3.
I
A point process N with distribution P is called simple if P{N } = I o and a r a n d o m measure A with distribution H is called non-atomic
if
o
Definition 2 The Laplace t r a n s f o r m L~ of a probability measure ~ on
(M,B(M)) is
the function
f
f /e p E- f f/x) M
= L /f)
x
defined for f ~ CK+ (i.e. for non-negative
continuous
functions with
compact support).
For any two probability measures ~I and ~2 on
(M, BIM)) we define
207
the convolution ~I M H2 by
H1 ~ H2{B} = i ~ 1B(#1 + p2) Rl{dPl} H2{dP2} for
B
6B(M)
where
IB(~) =
I
if
p6B
0
if
IJ~B
and where Pl + U2 is defined by (~I + ~ 2 ) { A }
= ~I{A}
+ >2{A}
for a~l A~8(•
If A I and A 2 are two independent
random measures with distributions
H I and H2, then A I + A 2 has distribution H I ~ H 2,
Theorem 3 (Uniqueness) A probability measure ~ on (M,B(M)) is uniquely determined by L H
Theorem 4 Let A be a random measure with distribution H and let A C 8(X) o be a w-system
containing a basis for the topology on X. Then H is
uniquely determined by all distributions
of (A{A]} .... ,A{Aj}) for
j = 1,2,... and bounded sets AI,...,A j ~ A o.
The following theorem is essentially due to MSneh tion follows from Kallenberg
(1972). Our formula-
(1973, p 11).
Theorem 5 Let N be a simple point process with distribution P and let A C B(X) be an algebra containing a basis for the topology on X. Then P is uniquely determined by all Pr{N{A} = O} for bounded A 6 A .
208
Let S be a metric space with Borel algebra B(S) and let ~,~i,~2,... be S-valued random variables with distributions ~,H1,n2,...
on
(S,S(S)).
Definition 3 If f _ f dH n § _ f f d~ for all bounded and continuous functions f : S § R S S w we say that H conver~es weakly to H and use the notation H ---* H or --n n d that -~n ~ converges in distribution to ~ and use the notation _ ~n ~ ~"
The standard reference
for the theory of weak convergence
is Billings-
ley (1968) where special attention is given to the Polish spaces CEO,I ~ of continuous
functions
on [-0,I~I endowed with the uniform topo-
logy and D[O,I~j endowed with the Skorohod J1 topology.
We will con-
sider the space DEO,~) later in this section.
Definition 4 A sequence
{Hn} I of probability measures on (S,B(S)) is called tight
if for every ~ > 0 there exists a compact set K C S such' that H {K} > I - ~ for all n. n A sequence {~n)1 is called tight if the corresponding of distributions
sequence {H n)
is tight.
A sequence of probability measures on (S,B(S)) is called relatively compact if each subsequence
of it contains a further subsequence which
is weakly convergent.
For Polish spaces Prohorov~s theorem (cf Billingsley states the equivalence between tightness
(1968, pp 35-40))
and relative compactness,
and this fact explains the importance of tightness.
The main motivation for the study of weak convergence
is that if h
209
is a measurable mapping from S into another metric space S' and if d Sn---* < then also h(6 n)
d h(~5) provided P r { ~
the set of dis-
continuity points of h} = 0. Thus the finer the topology the stronger a weak convergence result.
Consider now convergence in distribution
of random measures.
Theorem 6 (continuity) Let A,A I ,A2, ... be random measures with distributions H ,H I ,H 2, . . . . Then An
d .... A if and only if ~
(f) § ~ ( f )
This result is due to v. Waldenfels
(1968). His proposition is
stronger and formulated for characteristic p 13) gives a similar strengthening
for all f 6 C K +
functionals.
for Laplace transforms.
The following two theorems are weaker formulations Kallenberg
Mecke (1972,
of results due to
(1973, pp 10-11). For any subset A of X we denote its
boundary by 8A.
Theorem 7 Let A,AI,A2,...
be random measures and let A o C B ( X )
~-system containing
a basis
be a
on X s u c h t h a t d Pr(A{~A} = 0} = I for all bounded A ~ A . Then A ---~ A if and only o n if (An{A I} .... ,An{Aj)) all bounded A 1 , . . . , A j ~
d
for the topology
(A{AI],...,A{Aj})
for all j = 1,2 .... and
Ao.
Theorem 8 Let NI,~2,... be point processes, and let A C B ( X ) b e an a l g e b r a
let N be a simple point process
containing
a basis
for the topology d on X such that Pr{N{~A} = 0} = I for all bounded A ~ A . Then N : N n
210
if and only if (i)
Pr{N {A) = 0)
§
Pr{N{A}
= 0)
for all bounded
A
~
Pr{N{A)
> I}
for all bounded A
n
(ii)
Pr{N (A) > I} n
(iii) {Nn) I
is tight.
This is the first time in our discussion tion of random measures needed.
The explicit
where
a tightness
condition
by the following weaker
of convergence condition
is, however,
formulation
in distribu-
is explicitly
easy to remove
of theorem
as seen
8.
Theorem 9 Let N,NI,N2,... if N {A)
d~
and A be as in theorem
N{A)
for all bounded A ~ A
d 8. Then N n ---* N if and only .
n
Proof We have to show that N {A} n Tightness
of {N n}
compact K C X
d
N(A) implies
is equivalent
that {Nn}l is tight.
to tightness
of (Nn(K)} I for all
and thus we only have to show that tightness
of
oo
{Nn{A}) 1 implies tightness
of (Nn{K}} I
Take a compact K C X. Since X can be covered by countably many bounded basis many.
sets it follows that K can be covered by finitely
Thus there exists
> 0 there exists
for all n. Since
space
on s and A, such that
Pr{N {K) < k) > Pr{N {A} < k) n -n --
with left hand limits
DF0,A ~ , A ~ ~, of all rightcontinuous defined
dowed with the Skorohod J1 topology. properties
For every
{Nn{K)) T is tight.
Consider now the function functions
such that A ~ K .
a real number k, depending
Pr{N {A) ~ k} > I - s n it follows that
a bounded A ~ A
as DEO,I],
for which
on E0,A]. Let DE0,A ~ be en-
The space DE0,A ~ has the same
Billingsley
(1968)
is the standard
211
reference. on E0,|
In many situations
it is natural to consider
Let D be the set of all rightcontinuous
functions
functions with
left ha~d limits defined on E0,~). The following topology on D is studied by Lindvall
(1973:1) and (1973:2) who develops
Stone (1963) and Whitt
(1970).
Let F be the set of strictly increa-
sing, continuous mappings of F0, ~) onto itself. identity element of F. Take X,Xl,X2,. .. ~ D .
Let e denote the
Let x n + x mean that
U,C
there exist u where
U
U
F such that Xn~ Yn
~ stands for u n i f o r m convergence
form convergence
ideas due to
x and y n and
on compact subsets of D , ~ ) .
U~C
~
~ e
ands for uni-
With thirstsdefinition
of convergence D is Polish.
Let for A ~ ~ , ~ )
the function r A : D § DE0,A] be the restriction
operator to ~,AI,_ i.e. rA(x)(t) theorem given by Lindvall
= x(t)
, t 6 E0,A]. The following
(1973:2, p 21) and (1973:1, p 120) brings
the question about weak convergence of stochastic processes
in D
back to the finite interval case.
T h e o r e m 10 Let X,X I,X2,... be stochastic processes
in D. Suppose there exists
co
a sequence
{Ai)i= I , A.l > 0 and A.1 § ~ as i § ~
d
rA. (Xn) --~ rA. (X) 1
1
for i = 1,2, . . . .
X
d n
~ X
Then
as
n§
as
n
-~ oo
, such that
212
A2.
HILBERT SPACE AND RANDOM VARIABLES
The reader is assumed to be acquainted with the formal definition of a Hilbert space. A good introduction well suited for our purposes is, however,
given by Cram6r and Leadbetter
(1967, pp 96-104).
Let H be a Hilhert s~ace. Let h,hl,h2~ H. In general
(hl,h 2) denotes
the inner ~roduct between h I and h 2 and Ilhll = (h/~-~,h)denotes the norm of h. Let h,hl,h 2 .... { H .
Convergence h
§ h means that
n
H is complete in its norm. The operations
llhn - hll + O.
of addition and multiplica-
tions with real or complex numbers are defined for the elements in H. If (hl,h 2) is real for all hl,h 2 6 H , space.
then H is called a real Hilbert
Let {hi ; j E J } be a family of elements in H. Let H(J) be the
collection of all finite linear combinations
of elements in {hi
or limits of sequences of such combinations.
H(J) is Hilbert subspace
of H and is called the Hilbert space spanned by {hi denoted by S({hj
; j EJ]).
; jEJ}
; j ~J}
and often
It is obvious that if Jo is a subset of J
then H(J o) is a Hilbert subspace of H(J). For our applications
of
Hilbert space geometry the following theorem is of great importance.
Theorem I. Let H h
o
6H
The projection theorem
be a Hilbert subspace of H and let h 6 H .
o
called the projection of h on H
o
two equivalent
llh
o
There exists a unique
which satisfies the following
conditions:
- hJl
= rain Ilu - hPi uGH
o
o (ii)
(h ~ - h, u) = 0
Further
Ilho-hll2=
V uE H0
[Ihll 2 -
Ilholl 2
Our formulation of the projection theorem is close to the formulation given by Parzen
(1959, p 306).
213
We sometimes denote the projection of h on H
o
by E(h I H ). Projeco
tions have similar properties as conditional expectations (cf Doob (1953, p 155)). Examples of such properties are E(alh I + a2h 2 I H o) = alE(h I I H o) + a2E(h 2 I H o) where a I and a2 are real or complex numbers and E(E(h I H2) I H I) = E(h I H I) for H I C H 2 C
H .
Consider now a measure space (~,A,~) where ~ is an arbitrary set, A a a-algebra of subsets of ~ and p a measure on A. From the RieszFischer theorem it follows that the set of all square integrable Ameasurable functions f forms a Hilbert space with (fl,f2) = f flf2 d~ if functions differing on a set of u-measure zero are identified. This Hilbert space is denoted by L2(~,A,p) or shorter by L2(P). If only real-valued functions are considered L2(P) will denote the corresponding real Hilbert space.
Let f,fl,f2 .... ~L2(p) be functions such that fn § f' i.e. llfn - fll § 0. Then there exists a subsequence f such that f § f nk nk
a.e. (p) when
k § ~, i.e. there exists a set E with ~{E) = 0 such that lira f (~) = f(~) for all ~ k+~ nk
\E.
The case when ~ is a probability measure will be of a particular interest to us. In that case the measure is denoted by P and the functions are called random variables. For hl,h 2 ~ L 2 ( P ) we have (hl,h 2) = E hlh 2.
Consider a family (Xj ; j ~ J ) for all j ~ J
of random variables in L2(P). If E X.~ = 0
the inner product in S({Xj ; j ~ J } )
variance. Let h @ L 2 ( P ) .
is equal to the co-
Let H(J) denote the space S({Xj
; j~J),
I)
where I is the constant one. E(h I H(J)) is called the best linear estimate of h in terms of {X. ; j ~ J ) . J For reference reasons the following simple result is given as a theorem.
214
Theorem 2 E(h
i H(J))
= E h + E [ h - Eh
I S ( { X . - EX. J J
; Js
Proof It follows
from theorem
E[{E(h - Eh ueH(J).
I that we have to show that
I S({X. - EX. 9 j ~ J } ) ) J J '
- (h - Eh)} u] = 0
It is enough to consider u = I and Xj, j ~ J .
u = (u - Eu) + Eu. Since u - E u 6 S ( { X j - EXj E(X. - EX.) = 0 for all j ~ J the result J J
A3.
; j6J})
Put and since
follows.
[]
SOME TIME SERIES ANALYSIS
Consider a real-valued time series or stationary {X. J
for all
; j~Z}
such that
sequence
E X. = O, V a r X. = r < ~ and j j o
Coy (Xj,Xj+k) = r k. Then
rk =
i
eikx
F{dx}
-7
where the s~ectral distribution rightcontinuous
function F(x) is non-decreasing,
and b o u n d e d and further n o r m a l i z e d by F(-w) = 0.
Although the time series is real-valued it is convenient to use the complex form of the spectral representation.
A derivative
f of
the absolutely continuous
component of F is called spectral density.
Since {X.} is real-valued J
f(x) is symmetric.
f(x) ~ e I > 0 for all x E [ - ~ , ~ restriction.
We assume that
since for our purposes this is no
The time series itself has the spectral representation
xk =
f ei~z{d~} --IT
215
where,
in differential
notations,
E(Z(dx} Z(dy})
the process
F{dx}
if
x = y
0
if
x r y
Z(x) fulfils
=
(The reader is assumed to be acquainted
with the formal defini-
tion of this kind of representations.)
Define the Hilbert spaces

$$L_n = S\{X_j \; ; \; j \leq n\}, \qquad L = S\{X_j \; ; \; j \in Z\}$$

with inner product $E\, h_1 \overline{h_2}$, and

$$\hat{L}_n = S\{e^{ijx} \; ; \; j \leq n\}, \qquad \hat{L} = S\{e^{ijx} \; ; \; j \in Z\}$$

with inner product $\int_{-\pi}^{\pi} h_1(x)\, \overline{h_2(x)}\, F\{dx\}$. For all $n$ (including $\infty$) $L_n$ and $\hat{L}_n$ are isomorphic under the linear mapping with $X_j \leftrightarrow e^{ijx}$, $j \leq n$.
For any integrable function $h$ from $[-\pi, \pi]$ into the complex plane we use the notation

$$h(x) \doteq \sum_{k=-\infty}^{\infty} h_k e^{ikx} \quad \text{where} \quad h_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} h(x)\, dx.$$

The sign $\doteq$ means merely that $h$ corresponds to its Fourier series, see for example Doob (1953, p 150). For square integrable functions $h$ we define $[h(x)]_n$ by

$$[h(x)]_n = \sum_{k=-\infty}^{n} h_k e^{ikx}.$$
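A small numerical sketch of these notations, assuming NumPy and an arbitrary smooth $h$ of our own choosing:

```python
import numpy as np

# Fourier coefficients h_k and the truncation [h(x)]_n on a grid
M = 4096
x = np.linspace(-np.pi, np.pi, M, endpoint=False)
h = np.exp(np.cos(x)) + 0.5*np.sin(3*x)       # an arbitrary smooth example

def coeff(k):
    # h_k = (1/2 pi) ∫ e^{-ikx} h(x) dx, approximated by a Riemann sum
    return (np.exp(-1j*k*x) * h).sum() / M

K = 40                                        # numerical cutoff for the tail
hn = sum(coeff(k)*np.exp(1j*k*x) for k in range(-K, 5 + 1))  # ~ [h(x)]_5

# with both tails kept the Fourier series reproduces h
full = sum(coeff(k)*np.exp(1j*k*x) for k in range(-K, K + 1))
print(np.allclose(full.real, h, atol=1e-8))   # True
```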
Consider a real-valued random variable $\xi$ with $E \xi = 0$ and $\operatorname{Var} \xi = \sigma^2$, correlated with $\{X_j\}$, and define $\hat{\xi} = E(\xi \mid L)$ and $\hat{\xi}_n = E(\xi \mid L_n)$. Put $\rho_k = E(\xi X_k)$. From theorem A2.1 we have $E\, \hat{\xi} X_k = \rho_k$, and since $\hat{\xi} \in L$ we have $\hat{\xi} = \int h(x)\, Z\{dx\}$ for some function $h$ with $\int |h(x)|^2\, F\{dx\} < \infty$ and $\rho_k = \int_{-\pi}^{\pi} e^{ikx} h(x)\, F\{dx\}$. Thus if $F$ is absolutely continuous there corresponds to $\xi$ a function

$$\varphi(x) = h(x) f(x) \doteq \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \rho_k e^{-ikx}$$

with

$$\int_{-\pi}^{\pi} \frac{|\varphi(x)|^2}{f(x)}\, dx < \infty.$$

The function $\varphi$ will be called the cross spectral density.
Consider the function $g(z)$, $z$ complex, defined by

$$g(z) = \exp\left\{ \frac{1}{4\pi} \int_{-\pi}^{\pi} \frac{e^{-iy} + z}{e^{-iy} - z}\, \log f(y)\, dy \right\}$$

for $|z| < 1$. $g(z)$ is analytic and without zeros in the unit circle $|z| < 1$ (cf. Grenander and Rosenblatt (1956, pp 67-69)) and thus

$$g(z) = \sum_{k=0}^{\infty} g_k z^k \quad \text{for } |z| < 1.$$

Further

$$g(e^{ix}) = \lim_{r \uparrow 1} g(r e^{ix})$$

fulfils $g(e^{ix})\, g(e^{-ix}) = f(x)$. Since $f(x)$ is symmetric we have $\overline{g(e^{ix})} = g(e^{-ix})$. The function $1/g(z)$ is analytic in $|z| < 1$.
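The factorization can be checked numerically; the sketch below (assuming NumPy) evaluates the defining integral for the AR(1) density used above, for which the closed form of $g$ is elementary:

```python
import numpy as np

# AR(1) density f(y) = s2/(2 pi |1 - a e^{-iy}|^2); its outer function is
# known in closed form: g(z) = sqrt(s2/(2 pi)) / (1 - a z)
a, s2 = 0.6, 1.0
y = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
dy = y[1] - y[0]
f = s2 / (2*np.pi*np.abs(1 - a*np.exp(-1j*y))**2)

def g(z):
    # g(z) = exp{ (1/4 pi) ∫ (e^{-iy} + z)/(e^{-iy} - z) log f(y) dy }
    e = np.exp(-1j*y)
    return np.exp(((e + z)/(e - z) * np.log(f)).sum() * dy / (4*np.pi))

for z in (0.3, 0.5*np.exp(1j), 0.8*np.exp(-2j)):
    print(g(z), np.sqrt(s2/(2*np.pi))/(1 - a*z))   # pairs agree closely
```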
Following essentially Doob (1953, pp 590-594) we get the following two theorems.
Theorem 1

Let $\{X_k\}$ have an absolutely continuous spectral distribution with spectral density $f$ and let $\xi$ be a random variable which has the cross spectral density $\varphi$ to $\{X_k\}$, mean $0$ and variance $\sigma^2$. Then

$$\hat{\xi} = \int_{-\pi}^{\pi} \frac{\varphi(x)}{f(x)}\, Z\{dx\}$$

and further

$$E(\xi - \hat{\xi})^2 = \sigma^2 - \int_{-\pi}^{\pi} \frac{|\varphi(x)|^2}{f(x)}\, dx.$$

Proof

Since

$$E\, \xi X_k = \int_{-\pi}^{\pi} e^{ikx} \varphi(x)\, dx = \int_{-\pi}^{\pi} e^{ikx}\, \frac{\varphi(x)}{f(x)}\, f(x)\, dx = E\, \hat{\xi} X_k$$

the result follows from theorem A2.1. □
Theorem 2

Let $\{X_k\}$ and $\xi$ be as in theorem 1. Then

$$\hat{\xi}_n = \int_{-\pi}^{\pi} \frac{1}{g(e^{-ix})} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n Z\{dx\}$$

and further

$$E(\xi - \hat{\xi}_n)^2 = \sigma^2 - \int_{-\pi}^{\pi} \left| \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 dx.$$

Proof

Let $\hat{\xi}_n$ be as in the formulation of the theorem. Since $|g(e^{-ix})|^2 = f(x)$ we have

$$\int_{-\pi}^{\pi} \left| \frac{1}{g(e^{-ix})} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 f(x)\, dx = \int_{-\pi}^{\pi} \left| \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 dx < \infty,$$

and it follows that $\hat{\xi}_n \in L_n$. Thus it follows from theorem A2.1 that the theorem is proved if we show that $E\, \xi X_k = E\, \hat{\xi}_n X_k$ for $k \leq n$. We have

$$E\, \hat{\xi}_n X_k = \int_{-\pi}^{\pi} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \frac{1}{g(e^{-ix})}\, e^{-ikx} f(x)\, dx = \int_{-\pi}^{\pi} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n g(e^{ix})\, e^{-ikx}\, dx.$$

From the Fourier series corresponding to the functions in the integral it follows that $E\, \xi X_k = E\, \hat{\xi}_n X_k$ for $k \leq n$. Further

$$E(\xi - \hat{\xi}_n)^2 = \sigma^2 - E\, \hat{\xi}_n^2 = \sigma^2 - \int_{-\pi}^{\pi} \left| \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 dx. \qquad \square$$
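Theorem 2 can be illustrated numerically. For an AR(1) sequence the best linear predictor of $X_{n+1}$ from the past $\{X_j \; ; \; j \leq n\}$ is known to be $a X_n$, with prediction variance equal to the innovation variance; the sketch below (assuming NumPy, and using a long finite past as a numerical stand-in for the spectral formula) recovers this:

```python
import numpy as np

# illustrative AR(1) example with covariance r_k = s2 a^|k|/(1 - a^2)
a, s2, p = 0.6, 1.0, 30
r = lambda k: s2 * a**abs(k) / (1 - a**2)

# normal equations for predicting X_{n+1} from X_{n-p+1}, ..., X_n
R = np.array([[r(i - j) for j in range(p)] for i in range(p)])
rho = np.array([r(p - i) for i in range(p)])      # Cov(X_{n+1}, past)
c = np.linalg.solve(R, rho)

print(np.round(c[-3:], 6))        # ~ (0, 0, a): all weight on X_n
print(r(0) - rho @ c)             # ~ s2: the prediction variance
```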
The following theorem is given by Rozanov (1967, pp 77-78 and 201).
Theorem 3

Any element $h \in L$ has the unique representation

$$h = \sum_{k=-\infty}^{\infty} h_k X_k,$$

which converges in mean square to the same limit independent of the order of summation, if and only if $F$ is absolutely continuous and $c_1 \leq f(x) \leq c_2$ for almost all $x \in [-\pi, \pi]$, where $0 < c_1 \leq c_2 < \infty$.
Proof

The theorem is proved by Rozanov (1960). The 'if' part is also given by Rozanov (1967, p 78) and will be reproduced here because of our interest in this kind of results.
For any $h = \int_{-\pi}^{\pi} h(x)\, Z\{dx\}$ we have

$$c_1 \int_{-\pi}^{\pi} |h(x)|^2\, dx \;\leq\; E\, |h|^2 \;\leq\; c_2 \int_{-\pi}^{\pi} |h(x)|^2\, dx$$

since $c_1 \leq f(x) \leq c_2$. Consider

$$\sum_{k=-N}^{N} h_k X_k \quad \text{where} \quad h_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} h(x)\, dx.$$

We have

$$E\, \Bigl| h - \sum_{k=-N}^{N} h_k X_k \Bigr|^2 = \int_{-\pi}^{\pi} \Bigl| h(x) - \sum_{k=-N}^{N} h_k e^{ikx} \Bigr|^2 f(x)\, dx \leq c_2 \int_{-\pi}^{\pi} \Bigl| h(x) - \sum_{k=-N}^{N} h_k e^{ikx} \Bigr|^2 dx = c_2\, 2\pi \sum_{|k| > N} |h_k|^2 \to 0.$$

Thus $h$ has the representation of the theorem. The representation is unique since if $h = \sum g_k X_k$ is a representation of $h$ we have

$$0 = \lim_{N \to \infty} E\, \Bigl| h - \sum_{k=-N}^{N} g_k X_k \Bigr|^2 \geq \lim_{N \to \infty} c_1 \int_{-\pi}^{\pi} \Bigl| h(x) - \sum_{k=-N}^{N} g_k e^{ikx} \Bigr|^2 dx = c_1\, 2\pi \sum_{k=-\infty}^{\infty} |h_k - g_k|^2.$$

To complete the proof it suffices to remark that the series $\sum h_k e^{ikx}$ converges to $h(x)$ in mean square for any permutation of its terms. □
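The norm equivalence behind the proof is easily checked numerically (assuming NumPy; the density and the element $h$ below are arbitrary choices of ours):

```python
import numpy as np

# when c1 <= f <= c2:  c1 ∫|h|^2 dx  <=  E|h|^2 = ∫|h|^2 f dx  <=  c2 ∫|h|^2 dx
a, s2 = 0.6, 1.0
x = np.linspace(-np.pi, np.pi, 8192, endpoint=False)
dx = x[1] - x[0]
f = s2 / (2*np.pi*(1 - 2*a*np.cos(x) + a**2))
c1, c2 = f.min(), f.max()

h = np.cos(2*x) + 1j*np.sin(5*x) + 0.3        # an arbitrary element
l2 = (np.abs(h)**2).sum() * dx                # ∫ |h(x)|^2 dx
nF = (np.abs(h)**2 * f).sum() * dx            # E|h|^2 under F{dx} = f dx
print(c1*l2 <= nF <= c2*l2)                   # True
```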
Consider now the stationary sequence $Y_k = m + X_k$ where $m$ is an unknown constant. Suppose that $Y_k$ is observed for $k = 1, \ldots, n$. We use the notations

$$\underline{Y}_n = (Y_1, \ldots, Y_n), \qquad \underline{1}_n = (1, \ldots, 1) \quad (n \text{ components})$$

and

$$R_n = E\, (\underline{Y}_n - m \underline{1}_n)' (\underline{Y}_n - m \underline{1}_n).$$

$R_n$ is assumed to be positive definite. Then the best linear unbiased estimate $m_n^*$ of $m$ in terms of $\underline{Y}_n$ is given by

$$m_n^* = \frac{\underline{1}_n R_n^{-1} \underline{Y}_n'}{\underline{1}_n R_n^{-1} \underline{1}_n'}$$

and further

$$\operatorname{Var} m_n^* = \frac{1}{\underline{1}_n R_n^{-1} \underline{1}_n'}.$$

It is natural to compare this estimate with $\overline{Y}_n = \frac{1}{n} \sum_{k=1}^{n} Y_k$. From e.g. Grenander and Rosenblatt (1956, pp 89-90) it follows that

$$\lim_{n \to \infty} \operatorname{Var} m_n^* = \lim_{n \to \infty} \operatorname{Var} \overline{Y}_n = F(0) - F(0-).$$
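A numerical illustration of the two variance formulas for an AR(1) covariance (assuming NumPy; the parameters are arbitrary):

```python
import numpy as np

# BLUE of the mean versus the sample mean under an AR(1) covariance
a, s2, n = 0.6, 1.0, 200
idx = np.arange(n)
R = s2 * a**np.abs(idx[:, None] - idx[None, :]) / (1 - a**2)
one = np.ones(n)

var_blue = 1.0 / (one @ np.linalg.solve(R, one))   # 1 / (1 R^{-1} 1')
var_mean = one @ R @ one / n**2                    # Var of the sample mean

# both n*variances approach 2 pi f(0) = s2/(1 - a)^2 = 6.25 (theorem 4)
print(n*var_blue, n*var_mean, s2/(1 - a)**2)
```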
Theorem 4

If $F(x)$ is absolutely continuous in a neighbourhood of $x = 0$, and if there is a version $f(x)$ of $F'(x)$ which is continuous in a neighbourhood of $x = 0$ with $f(x) > c_1 > 0$, then

$$\lim_{n \to \infty} n \operatorname{Var} m_n^* = \lim_{n \to \infty} n \operatorname{Var} \overline{Y}_n = 2\pi f(0).$$
Proof

From e.g. Grenander and Szegő (1958, p 211) it follows that the theorem is true if $F$ is absolutely continuous and if $f(x)$ is continuous. We will now use an argument due to Grenander (1951, p 567) to show the theorem.
There exists $\varepsilon > 0$ such that $F(x)$ is absolutely continuous and $f(x)$ is continuous for $x \in [-\varepsilon, \varepsilon]$. Define

$$f_1(x) = \begin{cases} c_1 & \text{for } x \in [-\pi, -\varepsilon) \\[1ex] 2\left(1 + \dfrac{x}{\varepsilon}\right) f(x) - \left(\dfrac{2x}{\varepsilon} + 1\right) c_1 & \text{for } x \in [-\varepsilon, -\tfrac{\varepsilon}{2}) \\[1ex] f(x) & \text{for } x \in [-\tfrac{\varepsilon}{2}, \tfrac{\varepsilon}{2}] \\[1ex] 2\left(1 - \dfrac{x}{\varepsilon}\right) f(x) + \left(\dfrac{2x}{\varepsilon} - 1\right) c_1 & \text{for } x \in (\tfrac{\varepsilon}{2}, \varepsilon] \\[1ex] c_1 & \text{for } x \in (\varepsilon, \pi]. \end{cases}$$

With this construction $f_1(x)$ is continuous and $f_1(x) \leq f(x)$ for all $x \in [-\pi, \pi]$.
Following Grenander (1951, p 567) we split $\{X_k\}$ in two orthogonal components $\{X_k^{(1)}\}$ and $\{X_k^{(2)}\}$, such that $X_k = X_k^{(1)} + X_k^{(2)}$, where $\{X_k^{(1)}\}$ has absolutely continuous spectral distribution with spectral density $f_1(x)$. Since, if $F_2$ is the spectral distribution of $\{X_k^{(2)}\}$, $F_2(\tfrac{\varepsilon}{2}) - F_2(-\tfrac{\varepsilon}{2}) = 0$, we have

$$\lim_{n \to \infty} n \operatorname{Var} \overline{Y}_n = 2\pi f_1(0) = 2\pi f(0),$$

from which we get

$$\limsup_{n \to \infty} n \operatorname{Var} m_n^* \leq 2\pi f(0).$$
On the other hand, if $A_n$ is the set of vectors $\underline{a}_n = (a_1, \ldots, a_n)$ with $\underline{a}_n \underline{1}_n' = 1$,

$$\operatorname{Var} m_n^* = \inf_{\underline{a}_n \in A_n} \operatorname{Var}(\underline{a}_n \underline{Y}_n') \geq \inf_{\underline{a}_n \in A_n} \operatorname{Var}\Bigl( \sum_{j=1}^{n} a_j X_j^{(1)} \Bigr)$$

and thus

$$\liminf_{n \to \infty} n \operatorname{Var} m_n^* \geq 2\pi f_1(0) = 2\pi f(0),$$

which proves the theorem. □

We will now consider the estimation of the spectral density. This is one of the most studied problems of time series analysis, and was first studied for normal processes. We will consider only (asymptotic) mean values and variances of the estimates, and therefore only assumptions on moments up to the 4th order are required. Let $\{X_k \; ; \; k \in Z\}$ be a time series with known mean value $E X_k = m$, such that $\sum_{k=-\infty}^{\infty} |r_k| < \infty$ and such that

$$E\, (X_\nu - m)(X_{\nu+k} - m)(X_{\nu+j} - m)(X_{\nu+i} - m)$$

exists and is independent of $\nu$ for all $k, j, i \in Z$. Put

$$\rho_{k,j,i} = E\, (X_\nu - m)(X_{\nu+k} - m)(X_{\nu+j} - m)(X_{\nu+i} - m) - r_k r_{i-j} - r_j r_{i-k} - r_i r_{j-k},$$

a quantity which for a normal process is equal to zero. Let us further assume that

$$\sum_{k,j,i \in Z} |\rho_{k,j,i}| < \infty.$$

A time series fulfilling all the assumptions above will be called quasi-normal.
E Irkl < = it follows that {X~}~ has an absolutely continuous k=-~ spectral distribution and that the spectral density f can be chosen
From
continuous
and bounded.
Further
oo
1
f(x) = T w
z
rk e
-ikx
k_--oo
where the sum is absolutely
f(x)
convergent,
n
I
~ -2w
z
and thus
-ikx r k
e
k=-n
if f(x) > O.
Let $X_1, \ldots, X_n$ be observed and put

$$c_k = \sum_{j=1}^{n-|k|} (X_j - m)(X_{j+|k|} - m).$$
Then the periodogram

$$I_n(x) = \frac{1}{2\pi n} \sum_{k=-n+1}^{n-1} c_k e^{-ikx} = \frac{1}{2\pi n} \Bigl| \sum_{k=1}^{n} (X_k - m) e^{-ikx} \Bigr|^2$$

might seem to be a good estimate of $f(x)$. This estimate has, however, some unpleasant properties and we are led to consider weighted estimates of the form
$$f_n^*(x) = \int_{-\pi}^{\pi} W_n(y:x)\, I_n(y)\, dy$$

where

$$\int_{-\pi}^{\pi} W_n(y:x)\, dy = 1$$

and where the weight functions $W_n(y:x)$ for all $x$ accumulate mass in the neighbourhood of $y = x$ at a 'suitable' rate as $n \to \infty$. Put

$$w_k^{(n)}(x) = \int_{-\pi}^{\pi} e^{ik(x-y)}\, W_n(y:x)\, dy$$

and thus we get

$$f_n^*(x) = \frac{1}{2\pi n} \sum_{k=-n+1}^{n-1} w_k^{(n)}(x)\, c_k\, e^{-ikx}.$$

Usually only estimates $f_n^*(x)$ where $w_k^{(n)}(x)$ is independent of $x$ and where $w_k^{(n)}(x) = 0$ for $|k| > m_n$ ($m_n$ much smaller than $n$) are considered.
The simplest such estimate is the 'truncated' estimate (cf e.g. Grenander and Rosenblatt (1956, p 148)) given by

$$f_n^*(x) = \frac{1}{2\pi n} \sum_{k=-m_n}^{m_n} c_k\, e^{-ikx}$$

where $m_n \to \infty$ as $n \to \infty$ in such a way that $m_n / n \to 0$.
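An illustrative simulation of the 'truncated' estimate (assuming NumPy; the AR(1) model and all parameters are ad hoc choices):

```python
import numpy as np

# truncated estimate for a simulated AR(1) series with known spectral
# density; m = 0 is known here, and the estimate is random but close
rng = np.random.default_rng(2)
a, s2, n, m_n = 0.6, 1.0, 4000, 25
X = np.zeros(n)
for j in range(1, n):
    X[j] = a*X[j-1] + rng.standard_normal()

def c(k):
    k = abs(k)
    return np.dot(X[:n - k], X[k:])           # c_k with m = 0

x = np.linspace(0.0, np.pi, 7)
f_true = s2 / (2*np.pi*(1 - 2*a*np.cos(x) + a**2))
f_est = np.array([sum(c(k)*np.exp(-1j*k*x0) for k in range(-m_n, m_n + 1)).real
                  for x0 in x]) / (2*np.pi*n)
print(np.round(f_true, 3))
print(np.round(f_est, 3))
```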
The following theorem is taken from Rosenblatt (1959, pp 253-255), where also the required conditions on the sequence of weight functions $W_n(y:x)$ are given.
Theorem 5

Let $\{X_k \; ; \; k \in Z\}$ be a quasi-normal time series and let $W_n(y:x)$ be a sequence of 'suitable' weight functions. Then

$$\operatorname{Var} f_n^*(x) \sim \begin{cases} \dfrac{2\pi}{n}\, f^2(x) \displaystyle\int_{-\pi}^{\pi} W_n^2(y:x)\, dy & \text{if } x \neq 0, \pm\pi \\[2ex] \dfrac{4\pi}{n}\, f^2(x) \displaystyle\int_{-\pi}^{\pi} W_n^2(y:x)\, dy & \text{if } x = 0, \pm\pi \end{cases}$$

if $f(x) > 0$. Further, if $0 \leq x_1 < x_2 \leq \pi$ and if $f(x_1) > 0$ and $f(x_2) > 0$, the estimates $f_n^*(x_1)$ and $f_n^*(x_2)$ are asymptotically uncorrelated.
It may be observed that $\operatorname{Var} f_n^*(x)$ tends to zero slower than $\frac{1}{n}$ since $\int_{-\pi}^{\pi} W_n^2(y:x)\, dy \to \infty$ as $n \to \infty$. For the 'truncated' estimate we have

$$\int_{-\pi}^{\pi} W_n^2(y:x)\, dy \sim \frac{m_n}{\pi}.$$
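The variance approximation of theorem 5, combined with this evaluation of $\int W_n^2$, can be checked by simulation; the sketch below (assuming NumPy, and Gaussian white noise as a deliberately simple special case) does so:

```python
import numpy as np

# Monte Carlo check of Var f_n*(x) ~ (2 pi/n) f^2(x) m_n/pi = 2 f^2(x) m_n/n
# at x != 0, +-pi, for Gaussian white noise with f(x) = s2/(2 pi), s2 = 1
rng = np.random.default_rng(3)
n, m_n, reps, x0 = 2000, 20, 2000, np.pi/2
ks = np.arange(-m_n, m_n + 1)
w = np.exp(-1j*ks*x0)

est = np.empty(reps)
for i in range(reps):
    X = rng.standard_normal(n)
    c = np.array([np.dot(X[:n - abs(k)], X[abs(k):]) for k in ks])
    est[i] = (c * w).sum().real / (2*np.pi*n)

f0 = 1.0/(2*np.pi)
print(est.var(), 2*f0**2*m_n/n)        # close (asymptotic approximation)
```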
REFERENCES
Barndorff-Nielsen, O. and Yeo, G.F. (1969). Negative binomial processes. J. Appl. Prob. 6, 633-647. Correction in J. Appl. Prob. 7, 249.

Bauer, H. (1968). Wahrscheinlichkeitstheorie und Grundzüge der Masstheorie. Walter de Gruyter & Co. Berlin.

Billingsley, P. (1965). Ergodic theory and information. John Wiley and Sons. New York.

Billingsley, P. (1968). Convergence of probability measures. John Wiley and Sons. New York.

Bingham, N.H. (1971). Limit theorems for occupation times of Markov processes. Z. Wahrscheinlichkeitstheorie verw. Geb. 17, 1-22.

Cox, D.R. (1955). Some statistical methods connected with series of events. J. R. Statist. Soc. B 17, 129-164.

Cox, D.R. and Lewis, P.A.W. (1966). The statistical analysis of series of events. Methuen. London, and Barnes and Noble. New York.

Cramér, H. (1955). Collective risk theory. The jubilee volume of Försäkringsbolaget Skandia. Stockholm.

Cramér, H. and Leadbetter, M.R. (1967). Stationary and related stochastic processes. John Wiley and Sons. New York.

Cramér, H. (1969). On streams of random events. Skand. Aktuar. Tidskrift 52 Suppl., 13-23.

Daley, D.J. and Vere-Jones, D. (1972). A summary of the theory of point processes. Stochastic point processes: Statistical analysis, theory and applications. Ed. by Lewis, P.A.W., 299-383. Wiley-Interscience. New York.

Dobrushin, R.L. (1955). A lemma on the limit of a composite random function (in Russian). Uspekhi Mat. Nauk 10, no. 2 (64), 157-159.

Doob, J.L. (1953). Stochastic processes. John Wiley and Sons. New York.

Feller, W. (1971). An introduction to probability theory and its applications. Vol. II. 2nd ed. John Wiley and Sons. New York.
Gaver, D.P. (1963). Random hazard in reliability problems. Technometrics 5, 211-226.

Grandell, J. (1971). On stochastic processes generated by a stochastic intensity function. Skand. Aktuar. Tidskrift 54, 204-240.

Grandell, J. (1972:1). On the estimation of intensities in a stochastic process generated by a stochastic intensity sequence. J. Appl. Prob. 9, 542-556.

Grandell, J. (1972:2). Statistical inference for doubly stochastic Poisson processes. Stochastic point processes: Statistical analysis, theory and applications. Ed. by Lewis, P.A.W., 90-121. Wiley-Interscience. New York.

Grandell, J. (1973). A note on characterization and convergence of non-atomic random measures. Int. conf. on prob. theory and math. stat., Abstracts of communications T.1, 175-176, Vilnius.

Grenander, U. (1951). On Toeplitz forms and stationary processes. Arkiv för Matematik 1, 555-571.

Grenander, U. and Rosenblatt, M. (1956). Statistical analysis of stationary time series. Almqvist & Wiksell. Stockholm, and John Wiley and Sons. New York.

Grenander, U. and Szegő, G. (1958). Toeplitz forms and their applications. Univ. of California Press. Berkeley and Los Angeles.

Hannan, E.J. (1960). Time series analysis. Methuen & Co. London.

Hannan, E.J. (1970). Multiple time series. John Wiley and Sons. New York.

Jagers, P. (1973). On Palm probabilities. Z. Wahrscheinlichkeitstheorie verw. Geb. 26, 17-32.

Jagers, P. (1974). Aspects of random measures and point processes. Advances in probability and related topics 3. Ed. by Ney, P., 179-239. Marcel Dekker. New York.

Jung, J. and Lundberg, O. (1969). Risk processes connected with the compound Poisson process. Skand. Aktuar. Tidskrift, Suppl., 118-131.
Kallenberg, O. (1971). Lecture at the Gothenburg conference on point processes.

Kallenberg, O. (1973:1). Characterization and convergence of random measures and point processes. Z. Wahrscheinlichkeitstheorie verw. Geb. 27, 9-21.

Kallenberg, O. (1973:2). Characterization of continuous random processes and signed measures. Studia Sci. Math. Hungarica 8, 473-477.

Kallenberg, O. (1975:1). Limits of compound and thinned point processes. J. Appl. Prob. 12, 269-278.

Kallenberg, O. (1975:2). Random measures. Schriftenreihe des Zentralinstituts für Mathematik und Mechanik der ADW der DDR, Akademie-Verlag. Berlin.

Kallenberg, O. (1976). On the structure of stationary flat processes. Tech. Rep., Dept. of math., Gothenburg.

Kerstan, J., Matthes, K. and Mecke, J. (1974). Unbegrenzt teilbare Punktprozesse. Akademie-Verlag. Berlin.

Khintchine, A.Y. (1960). Mathematical methods in the theory of queueing. Charles Griffin. London.

Kingman, J.F.C. (1964). On doubly stochastic Poisson processes. Proc. Camb. Phil. Soc. 60, 923-930.

Kingman, J.F.C. (1972). Regenerative phenomena. John Wiley and Sons. New York.

Kolmogorov, A.N. (1939). Sur l'interpolation et extrapolation des suites stationnaires. C.R. Acad. Sc. Paris 208, 2043-2045.

Krickeberg, K. (1972). The Cox process. Symposia Mathematica IX, 151-167.

Kummer, G. and Matthes, K. (1970). Verallgemeinerung eines Satzes von Sliwnjak III. Rev. Roum. Math. Pures et Appl. 15:10, 1631-1642.

Lamperti, J. (1962). Semi-stable stochastic processes. Trans. Amer. Math. Soc. 104, 62-78.
Lawrance, A.J. (1972). Some models for stationary series of events. Stochastic point processes: Statistical analysis, theory and applications. Ed. by Lewis, P.A.W., 199-256. Wiley-Interscience. New York.

Lindvall, T. (1973:1). Weak convergence of probability measures and random functions in the function space D[0,∞). J. Appl. Prob. 10, 109-121.

Lindvall, T. (1973:2). Weak convergence in the function space D[0,∞) and diffusion approximations of certain Galton-Watson branching processes. Tech. Rep., Dept. of math., Gothenburg.

Lundberg, O. (1940). On random processes and their application to sickness and accident statistics. 2nd ed. 1964, Almqvist & Wiksell. Uppsala.

Macchi, O. (1971). Distribution statistique des instants d'émission des photoélectrons d'une lumière thermique. C. R. Acad. Sc. Paris 272, sér. A, 437-440.

Macchi, O. and Picinbono, B. (1972). Estimation and detection of weak optical signals. IEEE Trans. Inform. Theory 18, 562-573.

Marcus, M. and Minc, H. (1965). Permanents. Amer. Math. Monthly 72, 577-591.

Mecke, J. (1967). Stationäre zufällige Masse auf lokalkompakten Abelschen Gruppen. Z. Wahrscheinlichkeitstheorie verw. Geb. 9, 36-58.

Mecke, J. (1968). Eine charakteristische Eigenschaft der doppelt stochastischen Poissonschen Prozesse. Z. Wahrscheinlichkeitstheorie verw. Geb. 11, 74-81.

Mecke, J. (1972). Zufällige Masse auf lokalkompakten Hausdorffschen Räumen. Beiträge zur Analysis 3, 7-30.

Mönch, G. (1971). Verallgemeinerung eines Satzes von A. Rényi. Studia Sci. Math. Hungar. 6, 81-90.

Neuts, M.F. (1971). A queue subject to extraneous phase changes. Adv. Appl. Prob. 3, 78-119.
Parzen, E. (1959). Statistical inference on time series by Hilbert space methods. I. Published in Parzen, E. (1967). Time series analysis papers. Holden-Day. San Francisco.

Rodhe, H. and Grandell, J. (1972). On the removal time of aerosol particles from the atmosphere by precipitation scavenging. Tellus 24, 443-454.

Rootzén, H. (1975). A note on the central limit theorem for doubly stochastic Poisson processes. Tech. report, The University of North Carolina.

Rosenblatt, M. (1959). Statistical analysis of stochastic processes with stationary residuals. Probability and statistics - The Harald Cramér volume. Ed. by Grenander, U., 246-257. Almqvist & Wiksell. Stockholm, and John Wiley and Sons. New York.

Rozanov, Yu. A. (1960). On stationary sequences forming a basis. Soviet Math. - Doklady 1, 155-158.

Rozanov, Yu. A. (1967). Stationary random processes. Holden-Day. San Francisco.

Rubin, I. (1972). Regular point processes and their detection. IEEE Trans. Inform. Theory 18, 547-557.

Rudemo, M. (1972). Doubly stochastic Poisson processes and process control. Adv. Appl. Prob. 4, 318-338.

Rudemo, M. (1973:1). State estimation for partially observed Markov chains. J. Math. Anal. Appl. 44, 581-611.

Rudemo, M. (1973:2). Point processes generated by transitions of Markov chains. Adv. Appl. Prob. 5, 262-286.

Rudemo, M. (1975). Prediction and smoothing for partially observed Markov chains. J. Math. Anal. Appl. 49, 1-23.

Ryll-Nardzewski, C. (1961). Remarks on processes of calls. Proc. 4th Berkeley Symp. 2, 465-471.

Serfozo, R. (1972:1). Conditional Poisson processes. J. Appl. Prob. 9, 288-302.
Serfozo, R. (1972:2). Processes with conditional stationary independent increments. J. Appl. Prob. 9, 303-315.

Siegert, A.J.F. (1957). A systematic approach to a class of problems in the theory of noise and other random phenomena: Part II. IRE Trans. Inform. Theory 3, 37-43.

Skorohod, A.V. (1957). Limit theorems for stochastic processes with independent increments. Theory Prob. Applications 2, 138-171.

Snyder, D.L. (1972:1). Filtering and detection for doubly stochastic Poisson processes. IEEE Trans. Inform. Theory 18, 91-102.

Snyder, D.L. (1972:2). Smoothing for doubly stochastic Poisson processes. IEEE Trans. Inform. Theory 18, 558-562.

Snyder, D.L. (1975). Random point processes. John Wiley and Sons. New York.

Snyders, J. (1972). Error formulae for optimal linear filtering, prediction and interpolation of stationary time series. Ann. Math. Statist. 43, 1935-1943.

Stone, C. (1963). Weak convergence of stochastic processes defined on a semifinite time interval. Proc. Amer. Math. Soc. 14, 694-696.

van Trees, H.L. (1968). Detection, estimation, and modulation theory. Part I. John Wiley and Sons. New York.

Waldenfels, W. v. (1968). Charakteristische Funktionale zufälliger Masse. Z. Wahrscheinlichkeitstheorie verw. Geb. 10, 279-283.

Westcott, M. (1972). The probability generating functional. J. Aust. Math. Soc. 14, 448-466.

Whitt, W. (1970). Weak convergence of probability measures on the function space D[0,∞). Tech. report, Yale University.

Whitt, W. (1972). Continuity of several functions on the function space D. A revised version is sometimes referred to as 'to appear in Ann. Prob.'
INDEX
Absolutely dominated 121
Additive see completely random
Asymptotically efficient 162
Average intensity 185
Best estimate 88
Best linear estimate 116, 142, 213
Borel
  - algebra 3, 205
  - measure 3
Bounded set 5
Completely random 5
Completion 13
Convergence
  - in distribution 69, 74, 208
  - vague 205
  - weak 19, 208
Convolution 207
Covariance 23
Cox process see doubly stochastic Poisson process
Cross spectral density 163, 216
Diffuse see non-atomic
Doubly stochastic Poisson process 7
  alternative definitions of a - 10, 12, 16
Doubly stochastic Poisson sequence 17
Dynkin system 7
Ergodic 27
Estimate 88
  best - 88
  best linear - 116, 142, 213
  best linear unbiased - 220
  linear - 116
Functional limit theorem 76
Hilbert space 115, 212
Instantaneous intensity 14, 15
Intensity
  average - 185
  instantaneous - 14, 15
  - function 12
  - measure 5
Laplace-transform 18, 206
Leading function 10
Level 160
Linear estimate 116
Local convergence see vague convergence
Loss function 88
Mean 23
Measurable process 13
Mixed Poisson process see weighted Poisson process
Non-atomic
  - measure 8, 206
  - random measure 19, 206
Observation process 87
Operator
  c-amplifying - 21
  p-thinning - 21
  shift - 27
Palm measure 54
π-system 205
Point process 4
  simple - 19, 206
Poisson process with
  - intensity measure 5
  - intensity one 11
  - leading function 10
Polish space 3, 205
Pólya process 32
Quasi-normal 195, 222
Radon measure see Borel measure
Random measure 4
  distribution of a - 4
  non-atomic - 19, 206
Regular variation 76
Relatively compact 208
Renewal process 34
  alternating - 50
  arithmetic - 44
  non-arithmetic - 44
  ordinary - 44
  stationary - 37
  transient - 34
Simple point process 19, 206
Skorohod topology 74, 210, 211
Spectral
  - density 27, 214
  - distribution 27, 214
Standard Poisson process see Poisson process with intensity one
State space 3
Stationary
  strictly - 20
  (weakly) - 26
Thinning 21
Tight 208
Topology
  Skorohod - 74, 210, 211
  vague - 3, 205
"Truncated" estimate 198, 224
Vague
  - convergence 205
  - topology 3, 205
Version 13
Weak convergence 19, 208
Weighted Poisson process 31
Without after-effects see completely random
Without multiple points see simple