Lecture Notes in Mathematics Edited by A. Dold and B. Eckmann
529 Jan Grandell
Doubly Stochastic Poisson Processes
Springer-Verlag Berlin. Heidelberg New York 1976
Author Jan Grandell Department of Mathematics The Royal Institute of Technology S-10044 Stockholm 70
Library of Congress Cataloging in Publication Data
Grandell, Jan, 194~iDoubly stochastic Poisson processes. (Lecture notes in mathematics ; 529) Bibliography: p. Includes index. 1. Poisson processes, Doubly stochastic. 2. Measure theory. 3. Prediction theory. I. Title. II. Series: Lecture notes in mathematics (Berlin) ; 529. QA3.L28 vol. 529 [QA274.42] 510'.8s [519.2'3]
76-20626
A M S Subject Classifications (1970): 60F05, 6 0 G 2 5 , 6 0 G 5 5 , 62M15
ISBN 3-540-0??95-2 ISBN 0 - 3 8 ? - 0 ? ? 9 5 - 2
Springer-Verlag Berlin 9 Heidelberg 9 N e w York Springer-Verlag N e w York 9 Heidelberg 9 Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under w 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher. 9 by Springer-Verlag Berlin. Heidelberg 1976 Printed in Germany. Printing and binding: Beltz Offsetdruck, Hemsbach/Bergstr.
PREFACE
The doubly stochastic Poisson process is a generalization of the ordinary Poisson process in the sense that stochastic variation in the intensity is allowed. Some authors call these processes processes'
'Cox
since they were proposed by Cox (1955). Later on Mecke
(1968) studied doubly stochastic Poisson processes within the framework of the general theory of point processes and random measures.
Point processes have been studied from both a theoretical and a practical point of view. Good expositions of theoretical aspects are given by Daley and Vere-Jones
(1972), Jagers (1974), Kallenberg
(1975:2) and Kerstan~ Matthes and Mecke
(1974). Accounts of more
practical aspects are given by Cox and Lewis (1966) and Snyder (1975).
The exposition in this monograph is based on the general theory of point processes and random measures, but much of it can be read without knowledge of that theory. My objective is to place myself somewhere between the purely theoretical school and the more applied one, since doubly stochastic Poisson processes are of both theoretical and practical interest.
I am quite aware of the risk that some readers
will find this monograph rather shallow while others will find it too abstract. Of course I hope - although perhaps in vain - that a reader who is from the beginning only interested in applications will also find some of the more theoretical parts worth reading. I have, however, tried to make most of the more applied parts understandable without knowledge of the more abstract parts. Also in most of the more theoretical parts I have included examples and numerical illustrations.
JV
All readers are assumed to have a basic knowledge of the theory of probability and stochastic processes. The required knowledge above that basic level varies from section to section. The three appendices, in which I have collected most of the non-standard results needed, may be of some help.
In section 1.2 doubly stochastic Poisson processes are defined in terms of random measures. A reader not interested in the more theoretical aspects may leave that section after a cursory reading.
In sec-
tion 1.3.1 the same definition is given in terms of continuous parameter stochastic processes and finally in section 1.4 in terms of discrete parameter stochastic processes. Sometimes alternative definitions, given in sections 1.3.2 - 1.3.4 are convenient. Generally I have used the definition in section 1.2 in the more theoretical parts. Section 1.5 contains some fundamental theoretical properties of doubly stochastic Poisson processes and requires knowledge of random measures. In section 1.6 mean values, variances and covariances are discussed. Only the first part of it requires some knowledge of random measures.
In section 2 mainly special models are treated. In sections 2.2, 2.3.2 and 2.3.3 some knowledge of renewal theory is helpful.
In section 2.3
and 2.4 the distribution of the waiting time up to an event is considered. Palm probabilities, to which section 2.4 is devoted, belong to the difficult part of point process theory. I have tried to lighten the section by including a heuristic and very non-mathematical introduction to the subject.
Section 3 is purely theoretical and illustrates how doubly stochastic Poisson processes can be used as a tool in proving theorems about random measures.
In section 4 the behaviour of doubly stochastic Poisson processes after long 'time'
is considered.
In section 4.2 knowledge of weak
convergence of probability measures
in metric spaces is helpful.
Some of the required results are summarized in section At.
In section 5 'estimation of random variables'
is considered.
Here
estimation is meant in the sense of prediction and not in the sense of parameter estimation. ful. In section 5.1 tion 5.2 'linear'
Some knowledge of random measures
'non-linear'
is help-
estimation is treated and in sec-
estimation is treated. The main mathematical tools
used are, in section 5.1, the theory of conditional distributions and, in section 5.2, the theory of Hilbert spaces.
In section A2 the
required results of Hilbert spaces are summarized.
In sections 6 and 7 the discrete parameter case is treated. tion 6 'linear estimation of random variables' section 7 estimation of covariances treated.
In sec-
is considered.
In
and of the spectral density is
In both sections methods from the analysis of time series
are used. These sections require no knowledge of random measures depend only on section
1.4 and the last part of section
and
1.6. A rather
complete review of the required theory of time series are given in section A3.
All definitions,
theorems,
lemmata,
corollaries,
examples and remarks
are consecutively numbered within each main section. definition 5 in section
1.2 is referred to as 'definition
whole of section I and as 'definition the'List of definitions,
So, for example, 5' in the
1.5' in the other sections.
...' it is seen that definition
From
1.5 is given
on page 7. The end of each proof, example or remark is signaled by ~
.
VI
There are of course many topics related to doubly stochastic Poisson processes which are not treated in this monograph.
In particular we
shall not consider line processes, i.e. random systems of oriented lines in the plane, or their generalizations to flat (hyperplane) processes. A line process can be viewed as a point process on a cylinder by identifying lines with a pair of parameters which determine the line, e.g. the orientation and the signed distance to the origin. It turns out that 'well-behaved'
stationary line processes correspond
to doubly stochastic Poisson processes. What 'well-behaved'
shall really
mean is as yet not settled. To my knowledge the best results are due to Kallenberg (1976) where results of Davidson, Krickeberg and Papangelou are improved.
There are many persons to whom I am greatly indepted, but the space only allows me to mention a small number of them. In a lecture Harald Cram@r, see Cram@r (1969), gave me the idea of studying doubly stochastic Poisson processes.
In my first works on this subject I
received much help from Jan Gustavsson. Peter Jagers introduced me to the general theory of point processes and random measures.
From
many discussions with him and with Olav Kallenberg and Klaus Matthes I have learnt much about that theory. The extent to which I have benefitted from Mats Rudemo~s advice and comments on early versions of this monograph can hardly be overestimated.
In the preparation of
the final version I was much helped by Bengt yon Bahr, Georg Lindgren and Torbj6rn Thed@en. Finally, I am much indepted to Margit Holmberg for her excellent typing.
Stockholm, March 1976
Jan Grandell
LIST OF DEFINITIONS, THEOREMS, LEMMATA, COROLLARIES, EXAMPLES AND REMARKS number
page
number
page
number
page
5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8
87 88 88 116 116 118 121 142
6.1
162
AI.1 AI.2 AI.3 A1.4
206 206 208 208
16
4 4 5 5 7 11 17 23
1.1 1.2 1.3 1.4 1.5 1.6 1.7
18 19 19 2O 21 25 28
4.1 4.2
69 81
5.1 5.2 5.3 5.4 5.5
89 116 118 123 141
AI.5 AI.6 AI.7 AI.8 AI.9 AI.10
207 209 209 2O9 210 211
2.1 2.2
35 57
A2.1 A2.2
212 214
7.1
196
3.1 3.2 3.3
65 66 68
AI.1 AI.2 AI.3 AI.4
205 206 207 207
A3.1 A3.2 A3.3 A3.4 A3.5
216 217 218 220 224
1.1 1.2 1.3a 1.3b 1.4
5 10 23 24 27
3.1
67
5.1
122
4.1 4.2 4.3
77 78 80
6.1
180
Corollaries
1.1
22
2.1
37
4.1 4.2
72 72
Examples
2.1 2.2 2.3 2.4
47 48 59 60
164 167 170 183 187
83 84
95 97 107 127 128 129 132 140
6.1 6.2 6.3 6.4 6.5
4.1 4.2
5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9
7.1 7.2
193 198
5.1
94
1.1 1.2 1.3
20 24 25
4.2 4.3
78 83
5.5
126
5.1 5.2 5.3 5.4
93 118 120 125
6.1 6.2 6.3 6.4
162 165 166 182
Definitions
I I 2 1 3 14 1 5 1 5' 1 5"
Theorems
L er~mata
Remarks
2.1
55
4.1
74
CONTENTS
I,
Definitions
and basic properties
1.1
A heuristic
introduction
1.2
The general definition
1.3
Doubly stochastic Poisson processes on the real line
9
1.3.1
Recapitulation of the definition
9
1.3.2
An alternative
1.3.3
Classes of doubly stochastic Poisson processes
12
1.3.4
A definition based on interoccurrence
15
1.4
Doubly stochastic Poisson sequences
17
1.5
Some basic properties
18
1.6
Second order properties
22
1.7
A characterization
of ergodicity
27
2.
Some miscellaneous
results
31
2.1
The weighted Poisson process
31
2.2
Doubly stochastic Poisson processes and renewal processes
33
2.3
Some reliability models
4O
2.3.1
An application on precipitation aerosol particle
10
definition
times
scavenging of an 4O
A model with an intensity generated by a renewal process
44
A model with an intensity generated by an alternating renewal process
5O
2.4
Palm probabilities
53
2.4.1
Palm probabilities for doubly stochastic Poisson processes in the general case
53
2.4.2
Some special models
58
2.5
Some random generations
63
2.3.2
2.3.3
Characterization and convergence of non-atomic random measures
65
4.
Limit theorems
68
4.1
0he-dimensional limit theorems
69
4.2
A functional limit theorem
74
5.
Estimation of random variables
86
5.1
Non-linear estimation
87
5.2
Linear estimation
115
5.3
Some empirical comparisons between non-linear and linear estimation
143
Linear estimation of random variables in stationary doubly stochastic Poisson sequences
158
6.1
Finite number of observations
158
6.2
Asymptotic results
161
7.
Estimation of second order properties of stationary doubly stochastic Poisson sequences
190
7.1
Estimation of covariances
192
7.2
Estimation of the spectral density
195
A1
Point processes and random measures
2o5
A2
Hilbert space and random variables
212
A3
Some time series analysis
214
3.
6.
References
226
Index
232
I.
DEFINITIONS
I. I
A heu~tic
AND BASIC PROPERTIES
introduction
We will start the discussion
of doubly stochastic
Poisson procecesses
in a very informal way~ in order not to hide simple ideas behind notations
and terminology.
mathematical points
Consider therefore
model is needed for the description
in some space.
To be concrete,
in time and assume that multiple model describing The simplest
a situation where
a situation
such model,
events
of the location events
do not occur.
except perhaps
for a deterministic
intensity
in each time interval
dent.
Depending
in disjoint
occurring
A mathematical
intervals
is Poisson
distributed
different
with
Further,
are stochastically
of course on the situation
one, is
X. In this model the
mean value equal to X times the length of the interval. number of events
of
of this kind is called a point process.
the Poisson process with constant number of events
we consider
a
the
indepen-
objections
may
be raised against the use of this simple model. We will here discuss some objections
in such a way that we are led to a doubly stochastic
Poisson process.
(i)
Assume that the model seems realistic
know the value of the parameter
~, a rather
except that we do not common situation.
then natural to use some estimate
of ~. There exist, however,
tions where this is not possible.
Consider
insurance business dent pattern
and suppose that
follows
situa-
an automobile
for each policy-holder
the acci-
a Poisson process but that each policy-holder
has his own value of ~. The insurance knowledge
for example
It is
of how ~ varies
company may have a rather good
among its policy-holders.
For a new policy-
h o l d e r it may therefore as a constant
be reasonable
to treat his value
but as a r a n d o m variable.
w e i g h t e d P o i s s o n process is frequently
(ii)
In both the P o i s s o n
In many
variations
~. The number
situations
or other trends.
of events
in a time
perform
Formally this
is not a serious
a transformation
the m o d e l with constant
of the time X (cf Cram$r
this
complication
is more
X(t)
is required.
Thus
Suppose plays
S t a r t i n g with the ~(t) i n s t e a d of the con-
interval
is then P o i s s o n
a model
an important
variation,
complication
over the
since we may
scale which leads us b a c k to (1955, p 19)).
since k n o w l e d g e
for ~(t)
In practice
of the
function
is needed.
role.
variation.
There may of course be different To he concrete
at least partly,
depends
again we assume
on w e a t h e r
and the weather.
In spite of this
cessary to use a stochastic model In such a situation
it is thus
tion of a stochastic process. then led to a doubly
dependenc@
natural to regard
stochastic
that the In m a n y
b e t w e e n the time of
in order to describe
As indicated
reasons
conditions.
of the w o r l d there is a strong dependence
the y e a r
of ~(t)
now that we are in a s i t u a t i o n where the seasonal v a r i a t i o n
for a seasonal
parts
serious
~ was
~ will vary with time
d i s t r i b u t e d with m e a n value equal to the integral interval.
In fact this
and the w e i g h t e d P o i s s o n model
P o i s s o n model we are led to use a function stant
model.
a
used in insurance m a t h e m a t i c s .
a s s u m e d to be constant. due to seasonal
We are then led to use
as our m a t h e m a t i c a l
model
of ~ not
~(t)
it may be nethe weather. as a realiza-
in the p r e f a c e we are
Poisson process.
1.2
The general d e f i n i t i o n
In this section a general process will be given.
definition
of a doubly stochastic
The definition will be based on the theory of
random measures
and point processes.
are e~g. Jagers
(1974) and Kerstan, Matthes
vey is, however,
In section
Sometimes
As in Jagers
in time were considered,
there is a need for more general
state
i.e. R 2, is often
(1974) X will be assumed to be a
compact Hausdorff topological else is stated.
(1974). A sur-
are located will be called the state
In e.g. ecological models the plane,
natural.
for that theory
and Mecke
1.1~ where point processes
X was the real line. spaces.
Good references
given in section At.
The space X where the points space.
Poisson
locally
space with countable basis when nothing
A reader not interested
in topological
think of X as the real line or, perhaps better,
concepts may
as R 2" Often, how-
ever~ we will consider X = R when its natural order of real numbers is convenient
or X = Z where
Z is the set of integers.
Let B(X) be the Borel algebra on X, i.e. the a-algebra open sets. A Borel measure negative measure
that is finite on compact
all Borel measures. space.
(or Radon measure)
on (X,B(X))
sets.
Endowed with the vague topology M is a Polish
advised to turn to the beginning
Borel algebra on M. Let
N@B(M)
concepts
is
of section AI for definitions.
may also be helpful to read section
and B(N)
is a non-
Let M be the set of
(A reader not familiar with these topological
valued elements
generated by
1.3.1
first.)
Denote by
It
B(M)
the
be the set of all integer or infinite
of M. Endowed with the relative
denotes the Borel algebra on N. Usually
will be denoted by ~ and ~ respectively.
topology elements
N is Polish in M and N
Definition
I
A random measure
is a measurable mapping from some probability
(W, W, ~) into (M,
space
B(M)).
Usually a random measure will be denoted by A. The distribution is the probability measure H on (M, B(M)) H(B M) = ~ ( w ~ W
; A ( w ) & B M) for B M E
on (M, B(M)) we may take (W, mapping,
B(M).
For any probability measure H) and A as the identity
i.e. A(~) = ~. Thus any probability measure
talk about a random measure
random measure,
on
(M, B(M))
We may, and shall, thus
A with distribution
ference to an underlying probability
known,
induced by A, i.e.
W, ~) = (M, B(M),
the distribution of some random measure.
of A
H without any re-
space. When we talk about a
it is tacitly understood that its distribution
is
and it is often convenient to use the notation P r ( A E B M)
instead of H(B M) for B M ~ B ( M ) .
Let a random measure
A with distribution
H and a set B ~ B ( X )
be
given. We will talk about the random variable A(B), which is nonnegative
and possibly extended,
see theorem AI.2.
Similarly
for
given B I .... ,Bn~B(X) we talk about the random vector
(A(B 1 ) . . . . .
A[Bn)).
Definition
2
A random measure with distribution H is called a point process if ~(N) = I.
Usually a point process will be denoted by N. We will, whenever convenient
and without comments,
of a point process
assume that all realizations
are in N and interpret its distribution
probability measure on
(N, B(N)).
as a
is
Definition 3
Ar a n d o m ~ measure A is completely random if A{BI},...,A{Bn } are independent random variables whenever B I , . . . , B n ~ B ( x )
In e.g. Kerstan, Matthes
and Mecke
are disjoint.
(1974, p 24) it is shown that for
every ~ E M there exists exactly one p r o b a b i l i t y measure H
(N, B(N))
which is the distribution
on
of a completely random point pro-
cess N with
Pr{N(B} = k} = P{B)k e-P(B}
kl for all k = 0,1,...
and all bounded B ~ ( X ) .
I n this paper a set is
called bounded if it has compact closure.
Definition 4
A point process N with distribution
H
is called a Poisson process
with intensity measure ~.
We note that if N is a Poisson process with intensity measure ~ and if B is an unbounded set in B(X) with ~{B} = ~ then Pr{N{B}
We will now give the general definition Poisson process.
= ~} = I.
of a doubly stochastic
In order to justify the definition the following
lemma is needed.
Lemma
I
For every B e B ( N ) measurable.
the function ~ ~ H {B} from M into E0,1]
is
B(M)-
Proof
This lemma is a consequence
of 1.6.2 in Kerstan,
(1974, pp 64-65).
We will, however,
that the function
U ~ H {BN} is B(M)-measurable
form {v~ N; v{B I} = k I
~"
..,v{B
n
Matthes
give a proof.
and Mecke
We will first show
for sets B N of the
) = k } where BI,...,B n are disjoint n
sets in B(X) and kl,..,,k n are finite nonnegative
integers.
In this
case we have k.
n
~{9.i } ~
i=I
-~{B i } if all ~{B.}
e
<
1
k.~ i
H { B N) = 0
if some u{B.} = 1
and thus H { B N} is a measurable
function
~{B I) ..... ~{Bn }' Since for all B ~ B ( X ) B(M)-measurable
Since H
also the function
is a measure
B(M)-measurable
in the variables
the function
U ~U{B}
is
u ~ H {B N} is B(M)-measurable.
for each ~ & M
the function
~ ~ ~ {B } is
also for sets of the form n
BN = {v~N;
(v(B 1},...,v(Bn))~
E}, E C Z +
where Z+ = 0,1...
To see this we consider n = 2 and E = (kl,k2).
{~6N; =
(~{BI},~{B2})
~J J1+J2=kl
:
{~eN;~{BI~B
and ~.
Then
(kl,k2)} : 2) = J1' ~ { B I N B 2} = J2' ~ { B 2 k B I }
= J3 }
J2+J3=k2 Thus ~ ~ H {B N} is B(M)-measurable closed under intersection and the comments
for a class of B N sets which
and which,
after that theorem,
D = {D ~ B(N); H { D }
as follows generates
is B(~)-measurable}
we have
is
from theorem AI.1
B(N)
. If
(i)
N6N
(ii)
DI,D2s
(iii)
DI,D 2 .... ~ D
H
,
is a measure.
(cf Bauer
DICD2
~
D2~,D16P
and D i N D .
= ~
for i # j ~ L J
D . 6 D since
Thus ~ is a Dynkin system and thus
(1968, pp 17-18)). Thus H { B N} is
~ =
B(N)
B(M)-measurable
v BN6 B(N) .
m
It follows from lemma I that the set function P{BN] = i H~{BN}H{d~} measures
(M,B(M)).
H on
measure on
is well-defined
(N,B(N))
Since H
for all probability
for all ~ 6 M
it follows by monotone
P is a probability measure on
(N,B(N)).
is a probability
convergence that also
We will use the notation
P = S H H{d~} for that probability measure. M Definition 5 A point process N with distribution measure H on
(M,B(M))
S H H{dp} for some probability M is called a doubly stochastic Poisson process.
If A is a random measure with distribution H and N a doubly stochastic Poisson process with distribution P = ~ H H{dp} we sometimes call N the doubly stochastic Poisson process corresponding to A. For any bounded B E B(X) it follows in this case that
Pr{N{B)
) r
= k) = P{~
~{B} k k~
N; ~{B} = k )
= e-~{B}H{d~]
(A(B) k E , k~
=
-A{B) e
) .
M We will often consider N and A defined on the same probability space.
Intuitively we shall then think of a realization
doubly stochastic Poisson process N corresponding
of a
to a random
measure A as generated in the following way. First a realization
of A is generated,
and then a realization of a Poisson process
with intensity measure ~ is generated. reasoning precise we must introduce
In order to make this
some notations.
Let N•
be the
product of N and M~ which endowed with the product topology is Polish, and let B(N)•
be the ~-algebra generated by all rec-
tangles BNXB M. Note (cf e.g. Billingsley B(N)xB(M) Polish.
(1968, p 225)) that
equals the Borel algebra B(N•
on N•
since NxM is
(N,A) is a measurable mapping from some probability
into (NxM,B(N•
space
with a distribution determined by
Pr(NE BN, A ~ B M) = ~
H (BN}H{d~)
M for all B N ~ B ( N ) , H
B M ~ B(M).
In terms of conditional probabilities
is the distribution of N given A = ~. For more details we refer
to section 5.1.
Sometimes it is natural to consider Borel measures
in some sub-
space M o C M
may e.g. be
as the possible
all non-atomic measures, or all a b s o l u t e l y
intensity measures.
i.e. ~ & M
continuous
o
~
measures.
M
o
~{{x}} = 0 for all x ~ X , If
M ~B(M) o
we r e s t r i c t
ourselves to cases where H{M } = 1. If, however, M is not a Borel o o set~
a doubly
stochastic
Poisson
process
may t h e n
be defined
as
definition 5 except that M and B(M) are replaced by Mo and B(M o) where M ~ is endowed with the relative topology and B(M o) is the Borel algebra on M . o
in
1.3
Doubly stochastic Poisson processes on the r e .
line
Recapitulation of the d e f i n i t i o n
1.3.1
In the general definition in section 1.2 point processes were treated as random measures. A realization of a point process was thus regarded as an element v in N. On the real line, which is the traditional state space for point processes, it is sometimes convenient to regard a realization of a point process as a stepfunction v(x). Formally we put I the number of points in (O,x_~ ~(x)
Iminus
if
x > 0
the number of points in (x,O~
In the same way any Borel measure ~
if
x ~ O.
M corresponds to a non-
decreasing rightcontinuous function ~(x) on R such that ~(0) : 0
and
l~(x)I
< ~
for x ~ R . ~{(O,x]}
Formally we have the relation
if
x > 0
~(x) =I
t~{(x,0~}
if x ! 0
Thus the equivalence between the two points of view is not deeper than that a probability law of a random variable may he given either by a probability measure or by a distribution function. Since the 'random measure approach' may seem somewhat abstract, though appealing to intuition, we have a feeling that a short recapitulation of section 1.2 may be advisable.
Let M be the set of functions correponding to Borel measures endowed with the a-algebra B(M) generated by { ~ x,y~R.
B(M).
M; ~(x) ~ y},
It follows from theorem A 1.1 that B(M) corresponds to
Let N be the set of integervalued functions in M. Any pro-
bability measure H on (M,B(M) is the distribution of a stochastic process with its sample functions in M. If H(N) = I the process is called a point process.
For each ~E M a point process with
10
(i)
Pr{N(x) for
(ii)
- N(y)
(k(x)
= k} =
- ~ < y < x < ~
N(x) - N(y)
and
and
k: ~(y))k
-(~(x)-~(y))
k = 0,1 ,2,...
N(t) - N(s)
~
e
are independent
whenever
<~
is called a Poisson ~rocess with leadin~
function
~ and its distribu~
tion is denoted by H . P A doubly stochastic bution
Poisson process
is a point process
j H H{dp} for some probability P
1.3.2
measure
with distri-
H on (M,B(M)).
An alternative definitZon
Consider
for ~ I , P 2 ~ M
Obviously
~1oP2~M.
the function x ~ plO~2(x) Consider
f : MxM § M where
In order to justify the definition following
= p1(~2(x)). f(p1,~2)
= ~io~2 .
to be given we need the
lemma.
Lemma 2 The function
f is B(M)xB(M)-measurable.
Proof In order to prove the lemma, we will use a method Billingsley follows
(1968, p 232).
From the definition
given by e.g.
of B(~)
it
that it is enough to show that
{(p1,~2); ~leU2(x) ~ y } ~ B(M)xB(M) for a l l x,y~R. If for each p~ M we put p(n)(x) i not smaller than ~(x), 2n thus for ~ i ~ 2 ~ tions
it follows that p(n)(x)
M also plO~2(n)fx~ ~ , + pl ~ P2(x),
(pl,P2) ~ p1~p~n)(x)
(pl,P2) ~
equal to the smallest
p1~2(x)
converge pointwise
ratio
+ ~(x) and
i.e. the functo
as n § ~. It is thus enough to show the
11
measurability however,
of (~I'P2)
~Pl ~
(n), ix) for each x @ R. This follows,
since for each x,y@ R
{(PI'P2);Pl~
--< Y} =~((UI'~2);uI(iiEz7 ) -< y' i-I i
~2(x)@
(2-~"-,2--~]}
where each set in the union belongs to 8(M)xB(M).
Consider for any two probability measures H I and H 2 on (M,B(M)) the probability product measure. probability
space
(MxM,B(M)•215
, where
HIxH 2 is the
The function f is a measurable mapping from that
space into (M,B(M)) which implies that given two inde-
pendent stochastic processes A I and A 2 with sample functions in M. Then the process AIoA 2 with sample functions in M is welldefined.
Call a Poisson process with leading function ~(x) = x a Poisson ~rocess with intensity one.
Let A be a stochastic process with sample functions in M and let N be a Poisson process with intensinty one, defined on the same probability space in such a way that they are independent.
Consider the
point process NoA.
Definition 5' A point process N is called a doubly stochastic Poisson process if it has the same distribution
as N~A for some A.
It follows from the definition of a Poisson process that for any DE M the process Nc~ is a Poisson process with leading function p. Thus this definition of a doubly stochastic Poisson process in terms
12
of a random time transformation is equivalent to the ordinary one. For a detailed discussion of the relation between the two kind of definitions we refer to Serfozo (1968:2, pp 307-309).
1.3.3
Class~ of doubly stochastic Poison processes
Consider a Poisson process with leading function ~ lently with intensity measure p ~ M .
M or equiva-
We will in this section consider
the set M C M of Borel measures that are absolutely continuous with O
respect to the Lebesgue measure.
Thus to each U~ M
O
there exists a
function q such that p{B) = f q(x) dx for all B ~ B(R). Such a funcB tion will be called an intensity funct~qn. Let I be the set of intensity functions, i.e. I is the set of nonnegative
(Lebesgue)
measurable functions with finite integral over all bounded sets B E B(R). Denote the mapping I § M 0 by f. From the final remarks in section 1.2 it follows that any probability measure on
(Mo,B(Mo))
defines the distribution of a doubly stochastic Poisson process. In many applications it is, however, more natural to consider a stochastic process A = {Z(x); x E R) as a model for the intensity. We will then need some conditions to ensure that ~ generates a random measure.
These will guarantee that ~ has its sample functions in I
and that a certain measurability condition is fulfilled.
Suppose that the finite-dimensional distributions of ~ are specified. A stochastic process X is a measurable mapping from some probability space (W,W,~) into
(F,B(F)) where
real-valued functions on R and
B(F)
{qEF;
F is the set of all-
is the u-algebra generated by
q(x) ~ y}, for all x , y ~ R. Thus A generates a probability
measure on
(F,B(F)) which
is uniquely determined by the finite-
dimensional distributions. Let Z(w,x) denote the value of ~(w) at
13
the point x. A measurable if B { w ~ W WxB(R)
; 11(w,x)
= 1(w,x)}
the completion
measure, in Doob
mapping
11
: W § F is called a version
= I for all x E R .
of B(R), W and WxB(R)
B and the product
{(w,y)6 WxR
; 1(w,y)
< x}E
ml1<x>I
<
Denote by B(R), W and
with respect
of B and Lebesgue measure
(1953, p 60) we call I measurable ~xB(R)
of I
to Lebesgue
respectively.
As
if
for all x ~ R .
If I is measurable
and if
f
B for some B ~ B ( R )
then
(cf Doob 1953, p 62)
f 11(w,x)]dx B
is W-measurable
and finite a.s.
Assume now that the given finite-dimensional following
distributions
satisfy the
conditions:
(i)
Pr{1(x)
(ii)
lim Pr{11(y) y§
< O} = 0 for all x ( R . - 1(x) I h s} = 0 for all s > 0 and almost all
(Lebesgue measure) (iii)
(F).
xER.
f E 1(x)dx < co for all bounded B @ B(R). B
From (ii) it follows separable
chosen.
(cf Doob
and measurable
Then 1(w) is a.s.
that 1(w) is a.s.
(1953, p 61)) that there exists a
version
of t.
Assume t h a t
(~) B(R)-measurable.
(~) non-negative.
this
From
Thus the set W
version
is
(i) it follows = (wEW;
1(w) is
O
non-negative
and B(R)-measurable}
has ~-measure
one.
From ( i i i )
it
n
follows that the sets W n = ( w E W ~ ; measure
one for all n = I ,2, . . . .
l i m W = W' = { w ~ W ; t ( w ) E I } n
mapping
A : W § M
by 0
f 1(w,x)dx
< ~} also have E-
Since W n + 1 ~ W n and since
also
W' h a s B - m e a s u r e
one.
Define
the
14
if w6 w'
# o ~(w)
A<w) = if w ~ W - W' 0
where ~
for example is the measure with ~ {R) = O. 0
Thus ( w & W
0
; A(w) (B} <_x}(~W for all x ~ R
and all B ~ B ( R )
and thus
a random measure is uniquely determined since, see theorem AI.1,
B(Mo)
is generated by {~ 6 M o ; B{B} ~ x}, x & R , B E B(R).
An approach more in line with our treatment of random measures is to consider some subset
loCI
endowed with the a-algebra B(I o) generated
by ( n ~ I ~ ; ~(y) ~ x}, x , y 6 R , cess {X(x)
; x6R)
and to suppose that a stochastic pro-
with sample functions in I
is specified by its 0
distribution,
i.e. by a probability measure on
(Io,B(Io)). Such
a
probability measure is usually given by a description of the development of the realizations.
Then we have a good apprehension of a
natural set I . It is therefore
convenient to have simple conditions
0
on
Io, without
any reference to a probability measure, under which
the mapping f: Io § Mo, which exists since l o C I ,
is measurable.
We
will not go deeper into this question than to show that the mapping f is
B(I0 )-measurable
for I
equal to a set of Riemann integrable
func-
0
tions in I. This restriction is due to the facts that Riemann integrability is easy to check and that many cases of practical interest are covered.
Let I C I be a set of Riemann integrable o
functions. b
approximation with Riemann sums that ( ~ 6 1 o ;
It follows by
S n(y)dy ~ x}~
B(Io)
a
for all a < b and all x 6 R. From theorem AI.1 it then follows that f is
B(I0 )-measurable.
It is often natural to consider models where all n in I rightcontinuous
since then the instantaneous
o
are also
intensity defined by
15
x+A I lim ~ f 4(y)dy (cf Khintchine(1960, A+0 x 4 ~ I and all x E R.
p 22)) equals n(x) for all
O
1.3.4
A d e f i n i t i o n b ~ e d on i n t e r o c c u r r e n c e times
Up to now, all our discussions
about point processes
based on ~counting p properties, the number of points
have been
i.e. the basic quantities have been
(or events)
in certain sets of the state space.
On the real line a point process may also be defined by considering the epochs of the events or the interoccurrence
times b e t w e e n
successive events together with the epoch of some specified event as basic quantities.
We say that such a definition is based on inter-
occurrence
The renewal processes
times.
constitute
an important
class of point processes where a definition b a s e d on interoccurrenee times is natural.
In our opinion it is, however,
to base definitions these properties
of point processes
are m e a n i n g f u l
in general preferable
on counting properties,
on more general state spaces.
tion 2.2 we will, however, make use of interoccurrence ties for doubly stochastic Poisson processes
since
In sec-
times proper-
and therefore
a short
discussion of these properties will be given.
Let for any ~ E M the inverse ~ -I
(x) = sup
-I
of p be defined by
(y : ~(y)
< x)
If p(y) > x for all y ~ R we put ~ continuous nondecreasing
For any ~ E N
(x) = - ~. Thus p
function from (- ~, ~) into
-I
is a right-
~- ~, ~
we put t k = ~-1(k) and consider the infinite vector
t = (...,t_2,t_1,t0,tl,t2,...). have
-I
According to the properties
... ~ t_2 ~ t_1 ~ 0 < t O ~ t I ~ t 2 ~
of ~ we
... and further
lim t k = • ~. If ~ is considered as a realization k§177
of a point pro-
16
cess, the tk:s are the epoch of events of v provided the possible non-finite tk:s are properly
interpreted.
Let T be the set of all vectors t and let T be endowed with the oalgebra B(T) generated by { t ~ T ; t k ~ x}, k = 0,• As p o i n t e d out by e.g. Daley and Vere-Jones probability measure on (T, B(T)) generates
x6 ~
(1972, pp 308-309)
~, ~ . any
a p r o b a b i l i t y measure on
(N, B(N)), and conversely.
Let ~ N
and B E M be given and consider v = m v o ~. Put t k = v
and t k = v
-I
(k)
(k). Then we have
t k = sup(y : mv(~(y)) < k) < sup(y : ~(y) < m tk ) = B-I ( ~ k ) " On the other hand,
for every ~ > 0 we have
t k = sup(y
: v(~(y)) < k) > sup(y
: ~(y) < ~ t k - s) = ~ -1(t k - s)
-I m -I and thus t k = ~ (t k) p r o v i d e d ~ (x) is continuous
at x = t k.
Let N = N o A be a doubly stochastic Poisson process
as defined in
section
1.3.2.
L e t T and T be t h e random v e c t o r s
d e f i n e d by
%
Tk = N - l ( k ) pendent
and Tk = ~ - l ( k )
respectively.
S i n c e ~ and A a r e i n d e -
it follows that ~ and A -I almost surely have no common points
of discontinuity.
T=
Thus -I m
( .... A
(T_I),
-I m
A
(To),
A-I m
(~rl)...)
a.s.
and thus the two random vectors
are equally distributed.
This rela-
tion may serve as a definition,
based on interoccurrence
times, of
doubly stochastic Poisson processes.
Kingman
has used the above relation as definition by Serfozo
(1972:1, pp 290-291).
(1964),
see section 2.2,
and it has been discussed
17
1.4
Doubly stochastic Poisson sequences
Consider now the case X = Z, i.e. when the state space is the integers. A Borel measure on Z is a measure
assigning nonnegative
finite
mass to each integer and is completely determined b y these masses. Thus we may identify Borel measures
on Z and sequences of nonnegative
finite numbers.
By a point process or point sequence N with state space Z we m e a n a sequence of random variables Z+ = {0,1,2,...}. = {Uk ; k ~ Z }
A Poisson sequence with intensity measure
is then a sequence of independent
random variables all n ~ Z + .
{N k ; k @ Z} taking values in
such that
Poisson distributed
(~k)n -~k
Pr{N k = n} =
nl
e
for all k 6 Z and
By a random measure s with state space Z we mean a sequence
of random variables
{Zk ; k ~ Z }
taking values in R+.
The following definition is equivalent with definition 5.
Definition
5"
A point sequence N is called a doubly stochastic Poisson sequence if, for some random measure Z,
nk. m
Pr {n
m
{~k.
j=l
(Lk.)
= nk ) } = E { ~ j
j
j=1
J
'~
-~k e
J}
nk. J
for any positive integer m, any integers k I < k 2 < ... < k m and any nonnegative
integers
Parts of this paper Poisson sequences. applying methods
nkl,...,nkm. are devoted to the study of doubly stochastic
The main reason is that we are interested in
of time series analysis.
that in many cases observations
We will, however, point out
of a point process
are for measure-
ment reasons given in this form. There also exist cases where there is impossible to observe the exact ~time ~ of a point.
In e.g. sickness
18
statistics the number of people reported sick each day can be observed, but the exact time of the start of a disease is impossible to observe and even perhaps to define.
1.5
Some basic properties
We recall from section 1.2 that to each probability measure H on
(M, B(M))
the probability measure / H H(d~), which in this section M is denoted by PH' on (N, B(N)) is the distribution of a doubly stochastic Poisson process. In terms of Laplace transforms
(see defi-
nition A 1.2) we have the relation LpH(f) = LH(I - e -f) (cf B a r t l e t t % contribution to the discussion of Cox (1955, p 159) and Mecke (1968, P 75)). From this relation some theorems, most of them due to Krickeberg (1972) (cf also Kummer and Matthes (1970) and Kerstan, Matthes and Mecke (1974, pp 311-320)), follow as simple consequences.
Theorem
1
PHI = PH2
if and only if
H I = H 2.
Proof If H I = H 2 then PH
= PH2 follows from the definition. The converse I is proved by Krickeberg (1972, p 163) and will be reproduced here.
Assume that PHI = PH2 , which implies LPH I (f) = LPH2(f) and thus
LHI(I - e -f) = LH2(I - e -f) for all f~CK+.
Thus LH](g) = LH2(g) for
for all gE OK+ with sup g ~ I since to each such g there exists a f~ CK+ such that g = (I - e-f). To see this we just have to observe that f = - log(1 - g)~ CK+ for all g of the above kind. Consider now an arbitrary f~CK+.
Then LH1(sf) = LH2(sf) for all non-negative
s ~ (sup f)-1. Since f E C K +
it follows that sup f ~ ~ and thus
19
(sup f)-1 > 0. Since L(sf), as a function of s, is the Laplace transform of the random variable ; f(x)A(dx} where A is a random measure X with distribution H, it follows that L(sf) is determined by its values on ~O,a) for any a > 0. Thus L H (f) = LH2(f) for all f ~ C K + I and thus (see theorem A 1.3) H I = H 2.
Krickeberg (1972, p 165) notes that PHI~H 2 = PHIXPH2 for any H I and H 2 where ~ means convolution as defined in section A I.
Now we give a similar theorem about weak convergence, a concept which is discussed in section A I.
Theorem 2 Hn
_~w PH "
W,H if and only if PH n
Proof If Hn
W~ ~
then L H (f)--~ L~(f) and thus LPH (f)--+ Lp (f) which n H n implies (see theorem A 1.6) PH w~ PI[ " n If PH
n
w PH t h e n LH ( g ) - ~ LH(g) f o r a l l n
g~CK+ w i t h sup g < 1
and thus for an arbitrary f ~ CK4 it follows that L H (sf)-~ LH(sf) n for all n o n n e g a t i v e s < (sup f ) - I and t h u s LE (f)---* LH(f) (compare
n the proof of theorem I and the continuity theorem for Laplace transforms of random v a r i a b l e s )
Let
N o E B(N)
which i m p l i e s t h a t H ~
n
be the set of simple elements in N and let M
of n o n - a t o m i c e l e m e n t s i n M ( s e e d e f i n i t i o n
A 1.1).
theorem is due to Krickeberg (1972, p 164).
Theorem 3
Mo~ B(M)
~ .
and PH{No} = I if and only if H{M o} = I.
9 o
be the set
The f o l l o w i n g
20
Proof It is known that H {N ] = I if and only if bE M (cf e.g. Kerstan, o o Matthes and Mecke
(1974, p 31)) i.e. M
it follows from lemma I that PH{N~
= I H~{N~
o
= {~6M;
H IN } = I}. Thus ~ o
M ~ B(M) since N 6 B(N) and further o
o
= I if and only if ~ ( M ~ a.s.
(H).
9
Consider X = R and a random measure A with distribution H on
(M,B(M)). A (or H) is called strictly stationary if n Pr { ~ 2=1 n
=
{A{B. + y} < x . } } 1 -- 1 Bi ~ B ( R )
1,2,...,
is
independent
of y for
all
y ~ R,
a n d x i E R+ . (B + y = {x ; x - Y E B } ) .
Remark I This definition has an obvious extension to X = R k and may be further extended
(cf e.g. Mecke
(1967)) so that e.g. X = Z is
included.
We will sometimes consider strict stationarity when n X = R+. Then we mean that Pr {~] {A{B i + y] ~ xi}} is indepeni=I dent of y for all Y E R+, n = 1,2, .... Bi6 B(R+).
Theorem 4 PH is strictly stationary if and only if H is strictly stationary.
Proof It follows from theorems A 1.3 and A 1.4 that a random measure A is strictly stationary if and only if the distribution of f f(x - y)A{dx] is independent
of y for all f ~ C K + .
R
Define Ty : CK+-'* CK+ by T y f ( X )
= f(x
- y).
A is
stationary if and only if LH(Tyf) is independent f 6 CK+. S i n c e theorem
I.
Ty(1 - e - f )
= 1 - e-Ty f the
theorem
thus
strictly
of y for all
follows
from
21
Now we leave the stationary case and consequently X need not be the real line. L e t ~ d e n o t e The sets P g ~
the set of probability measures on
of all probability measures on
(N,B(N))
(M,B(M)).
and D ~ P
of all
distributions of doubly stochastic Poisson processes are of special interest to us.
Let D
: P § P for p ~ [~,I] denote the p-thinning operator, i.e. for P
any point process with distribution P6P the distribution of the point process obtained by independent selection of points with probability p is D P. The operator D is one to one (cf Kerstan, Matthes and Mecke P P (1974, p 311)). Mecke (1968) and (1972) has shown that D =
~ D P 0
for all
for c > 0 denote the c-amplifying operator,
C
i.e. for any random measure A with distribution H the distribution of cA is A H. It is not difficult to realize that (cf Kerstan, Matthes C
and Mecke (1974, p 312))
PAH
Dc PH
if
0 < c < I
D~ ] PH
if
c ~ ]
=
C
C
and thus D C
N D P is rather obvious. 0
The following theorem is due to Kallenberg (1975:1). Theorem 5 Let p],p2,...~ (0,I~ such that lim Pn = 0 and P]'P2 .... ~ P be given. n-~ Then Dpn Pn w some P0~P if and only if Apn Pn w some H0~ ~ and in this case P0 = PH 0'
Proof The proof is given by Kallenberg (1975:1) and will not be reproduced here.
9
22
Theorem 5 generalizes earlier limit results for p-thinnings. As noted by Kallenberg also Mecke~s characterization of doubly stochastic Poisson processes is a simple consequence. To see that, let P0 ~ P n
~ D P be given. Let pl,P2.., be as in theorem 5 and put O
= D-I P0" Since D P = P0 there exists H 0 6 ~ s u c h Pn Pn n
that
H 0 and thus P0 = PHO i.e. PO ~ D. Further Apn p n w
An P n ~
is the same as Ap D~IpH n n 0
w
H0
HO' which is a result due to Kerstan,
Matthes and Mecke (1974, p 315), from which theorem I follows. Another consequence of theorem 5 is the following corollary which will be used in the proof of lemma 3.1. Corollary I If PH
w
some H ~
then H = PH0 for some H 0 ~ .
n
Proof Since N is closed in M it follows from Billingsley (1968, p 12) that -I = D PH " Since p n n w+ H E p there exists H 0 ~ ~ such that H = PH0.
HE P. Using the notations in theorem 5 we put P DP nPn = PHn
n
From theorem 2 and corollary I it follows that if PH ~
some H ~
n
then H n
w
some H O ~
and H = pH 0 w h i c h i s
a result
due t o K e r s t a n ,
Matthes and Mecke ( 1 9 7 4 , p 3 1 7 ) .
1.6 Second orde~ properties Consider a random measure A with distribution H on
(M,B(M)) and
assume that E A2(B} < ~ for all bounded B E B(X). Remember that a set is called bounded if it has compact closure. Let N, defined on the same probability space as A, be the corresponding doubly stochastic Poisson process as defined in section 1.2.
23
Definition 6 For bounded B , B I , B 2 ~ B ( X ) the set function M given by
M{B} = E A{B) = I ~{B} ~{d~} is called the expectation or mean and the set function R given by R(BI,B 2} = Cov(A{BI},A{B2}) is called the covarianee.
It follows by monotone convergence that M 6 M .
R(B,'} is for fixed
bounded B E B(X) a signed measure, i.e. the difference of two Borel measures, on (X,B(X)), and further R{BI,B 2} may be extended to a signed measure on (XxX~B(XxX)) (cf Daley and Vere-Jones (1972, p 319)).
For a Poisson process the mean equals the intensity measure.
Lemma 3a For bounded B,BI,B 2 6 B ( X ) we have E N{B} = M{B} Var N{B} = H{B} + Var A{B} Coy (N{BI},N{B2}) = M { B I ~ B 2 }
+ R{BI,B2}.
Proof
N{B} = ~[~(N{B}IA{B} ~
= ~ A{B} = M{B}.
Var N{B} = E ~Var(N{B}IA{B} ~
+ Var ~-_E(N{B}IA{B}~
=
= E A{B} + Var A{B}. For random variables YI,Y2 and Z the relation 2 Cov (YI,Y2) = = Var (YI + Y2 + Z) - Var (YI + Z) - Var (Y2 + Z) + Var (Z) holds (cf Daley and Vere-Jones (1972, p 320)). Applying this relation to YI = A{BI}' Y2 = A{B2}~ Z = - A { B I ~ B 2} and to YI = N{BI}' Y2 = N{B2}' Z = - N{BIg~B 2} respectively, the result for the covariance follows.
24
It shall be noted that the results in Lemma 3a was already given by Cox (1955, p 135), where it was pointed out that Var N{B} ~ M{B) i.e. the doubly stochastic Poisson process is over-dispersed
relative
to the Poisson process.
Lemma 3b For bounded B(X)-measurable
functions
f and g with compact support
we have
E
f
f(x)~{dx} =
X coy
f
f(x)max}
X
(f f(x)N{~x}, f g(x)N{~}) X
= f f(x)g(x)m~x}
X
+
X
+ f f f(x)g(y)R{~,dy}. XX Proof By approximating
f and g with simple functions the results follows
from lemma 3a.
Remark 2 The lemmata can be extended to higher order moments. Assume that E Ak{B} < ~ for all bounded B ~ B ( X ) .
Then Krickeberg
has shown that for bounded B(X)-measurable
(1972, p 164)
functions fl,...,fk
with compact support it holds that k
k
E (j=1X
where {J1,...,Jm}
m=1
m {J1 .... Jm )
runs through all partitions
i=I X
J*Ji
of {1,...,k} into m
disjoint non-empty sets.
The following theorem will be of some importance.
25
Theorem 6 For all bounded B I , B 2 @ B ( X )
the random variables N{B I} - A{B I}
and A{B 2} are uncorrelated. There exist, however, BI,B 2 ~ B ( X ) such that N{B I} - A{B I} and A{B 2} are dependent unless N is a Poisson process. Further N{B I} - A{B 1} and N{B 2} - A{B 2} are uncorrelated for all bounded and disjoint BI,B 2 ~ B ( X ) .
Proof For all bounded B I and B 2 we have Coy (N{B I} - A{BI},A{B2}) = EL(N{B ~ I} - A{BI})A{B2~
=
= EFA{B2)E(N{B I} - A{B])IA{B]~A{B2}) ] = 0 i.e. the variables are uncorrelated. In a similar way it is shown that N{B]} - A{B]} and N{B 2} - A{B 2} are uncorrelated for disjoint B] and B 2. Assume that N is a Poisson process. Then A{B} = M{B} for all B~8(X)
and thus N{B]} - A{B]} and A{B 2} are independent. Assume
that N{B} - A{B} and A{B} are independent for all bounded B ~ B ( X ) . Since E[(N{B} - A{B})2[A{B}] = A{B} almost surely it follows from the independence assumption that A{B} almost surely equals some constant, and thus it follows from theorem A 1.4 that N is a Poisson process.
Remark 3 In the proof of independence part of theorem 6, the assumption about existing second moment was not used.
9
Consider now the case X = R @ . With the notations used in section 1.3 we define M(x) = E A(x) and R(x,y) = C o v
(A(x),A(y)). From lemma 3a
we get immediately E N(x) = M(x) and Cov (N(x),N(y)) = M(min(x,y)) + + R(x,y).
26
Consider now X = R and assume that A{B} = S ~(x)dx for all B @ $ ( R ) B as in section 1.3.3 and assume that E ~2(x) < ~ for all x 6 R . Define m(x) = E l(x) and r(x,y)
=Cov
(l(x),l(y)).
Thus we have
M{B} = S m(x)dx and R{B I,B2} = S S r(x,y)dxdy B B I B2
for all bounded
B,B] ,B26 B(I~).
For a doubly stochastic the notations
Poisson
sequence
m k = E Zk and rk, j = C o v
(see section
Zk,Zj.
1.4) we use
Thus we have
E N k = m k and Coy Nk,N j = 6k_jm k + rk,j. where
~k =
1
if
k = 0
0
if
k#0
Consider X = R and a random measure stationary
A. A is called
(weakly)
if M{B + y} and R{B I + y,B 2 + y) are independent
for all y ~ R
and all bounded B,B I , B 2 ~ B ( R ) .
corresponding
doubly
stochastic
and only if A is stationary. for doubly stochastic called stationary corresponding
Poisson
It is obvious that the
Poisson process N is stationary
if
We will make most use of stationarity sequences.
A random measure
if m k = m and rk, j = r k-j. for all k , j ~ Z .
doubly
of y
stochastic
Poisson
sequence
s is For the
N we thus have
N E N k = m and Cov (Nk,N j) = r . = m~k_ j + r .. Since both Z and N k-j k-~ are stationary
we have
(see section A3)
rk =
i eikx Fs
r kN =
i eikx FN{dx}
and thus
FN(x) = m(x + w) + FZ(x) 2w
27
The functions FZ(x) and FN(x) are called spectral distribution functions. An important special case is when Fs
is absolutely con-
tinuous with spectral densit~ f~(x). Then also FN(x) is absolutely continuous with spectral density fN(x) = ~ fN(x) > ~ --
1.7
> 0 for all x ~ E - w,wJ
m
+ fZ(x). Thus
a fact that will be useful.
2w
A ch~acteriz~gion of ergodi~ity
The result to be given in this section will not be used in the sequel but the topic has some relevance to the questions treated in section 4.
Consider X = R and define for all Y 6 R the shift operator T : M + M Y by (Ty~){A} = ~{A + y} for all A ~ B ( R ) A + y = {x~R
; x - y ~A}.
and recall that
For any B ~ B(M) we put
TyB = {~6 M ; T_y~ 6 B}. For general properties of the shift operator we refer to Kerstan, Matthes and Mecke (1974, pp 133-140). A set
BEB(M) is called invariant if T B = B for all y ~ R. Let A be a Y
strictly stationary random measure with distribution H. Then H{TyB} is independent of y for all y ~ R and B ~ B ( M ) .
H (or A) is called
er~odic if H{B} = 0 or I for all invariant B 6 B ( M ) .
Let A(M) be the algebra
[~ B (M) where B (M) is the G-algebra gene, n= I n n rated by {~& M ; ~ { A N ~- n,n~} ~ x} for all A 6 B(R) and x ~ R + . From
theorem A 1.1 it follows that A(M) generates B(M).
The following lemma contains all ergodic theory to be used in this section.
Ler~ma 4
Let H be the distribution of a strictly stationary random measure. The following statements are equivalent:
28
(i) (ii)
is ergodic. For all B(M)-measurable I h(~)H{d~}
lim-~ t -~ (iii)
functions h : M § R+ with
< ~ we have
t f h(Ty~)dy = -t
For all B 1,B2ff
A(M)
h(~)~{d~}
a.s. (9).
we h a v e
lim ~-~ 9{B Ig]TyB2}dy = H{B I}9{B2}. t-~ -t
(iv)
Any representation H = ~91 + (I - ~)92
(0 < ~ < I)
of 9 as a mixture of stationary distributions
H I and 92
is a trivial, i.e. ~ = 0 or a = I or H 1 = H 2.
Proof (i) @
(ii) ~:~ (iv) follows from Kerstan, Matthes and Mecke
(1974, p 141).
(i) <:~ (iii) follows e.g. from Billingsley
(1965,
p 17).
Let, like in section 1.5, P9 denote the distribution of the doubly stochastic Poisson process corresponding to a random measure with distribution H.
Theorem 7 Let H be the distribution of a strictly stationary random measure. Then PH is ergodic if and only if 9 is ergodic.
Before proving theorem 7 we note that the result is due to Westcott (1972, p 463). He derives the theorem as a corollary of his characterization of ergodicity in terms of probability generating functionals. We will, however, give a rather simple direct proof.
29
Proof Assume that H is not ergodic. It follows from lemma 4 (iv) that there exist ~ ( 0 , 1 )
and two strictly stationary H I and ~2 with H I r H 2
such that H = aH I @ (I - ~)H 2. From the definition of PH it follows that PH = ~PH I + (I - ~)PH2. PH 2
From theorem 4 it follows that PHI and
are stationary and from theorem I that
PHI
#
PH 2
. Thus, using
lemma 4 (iv) again it follows that PH is not ergodic. Assume now that H is ergodic. Let B I and B 2 be two arbitrary sets in A(M) a n d l e t
n o be such that
B 1 a n d B2 b e l o n g s
to
Bn (M). o
From definition 5, Fubini's theorem and dominated convergence it follows that t lim ~ t~
PH{BI ~TyB2}dY = t
= lim~-~ t-~
H {B1~TyB2}H{d~}dy =
I
-- lim S ~ t~
t
tS ~{BI~TyB2}dY
~{d~} --
M
t I = ~ (limt_~~-~ _! H {BiOTyB2}dY)H{d~} t provided lim ~-~ I t S H~{BI~TyB2}dY exists a.s. (H). Since H~ is comt+~ pletely random we have H {BINTyB 2} = H {BI}H {TyB 2} for IYl > 2n o and further (cf Kerstan, Matthes and Mecke (1974, p 134)) we have H {TNB2} = HT_y ~ {B2} . Thus t I t-~=lim~-~ _~ H {B1~TyB2}dY = t I = t+~lim~-~_~ H {BI} H {TyB2}dY =
3O
i! t
= H {B I} lim ~-~ t-~ _
H
{B2}dY T_y~
.
From lemma 4 (ii) it follows that
i! t
lim ~-~ t-~ _
H
{B2}dY = PH{B2} a.s. (H) T_y~
and thus t
llm ~-~ ] t-~
=
~ PH{BI~TyB2}dY -
=
/ II]j{B1}PII{B2]II{d~ ) = M
PH{B]}PH{B2}.
From lemma 4 (iii) it then follows that PH is ergodic.
I
31
2.
SOME M I S C E ~ E O U S
RESULTS
The sections under this title are almost independent of each other, with exception of section 2.4.2 where results from section 2.3 are used. Common to most topics in the different sections are that special doubly stochastic Poisson models are considered. A survey of such models has been written by Lawrance (1972, pp 218-235). To our opinion the most interesting model discussed by Lawrance, and not touched upon in this section, is the one where the intensity process {~(x)
; x ~ R} is the square of a normal process or the sum
of squares of normal processes. In example 5.4 this model will, however, be considered.
2. I
The w e i g h t e d P o i s s o n p r o c e s s
Let ~ be a Borel measure on X, that is ~ negative random variable.
Then A = ~
M, and let ~ be a non-
is a random measure, and the
doubly stochastic Poisson process corresponding to it is called a weighted Poisson process or a mixed Poisson process. When X is equal to or a part of the real line, then ~ is usually understood to be the Lebesgue measure.
Consider now X = [0, ~) and ~ equal to the Lebesgue measure. Then it is natural to consider a weighted Poisson process as a process in the class studied in section 1.3.3 such that X(t) = ~ for all t > 0 where is a nonnegative random variable with distribution function U. The first systematic treatment of weighted Poisson processes is due to Lundberg (1940). Lundberg called these processes
'compound Poisson
processes' a name that still is used in insurance mathematics.
Among many other things Lundberg showed that the weighted Poisson process {N(t)
; t ~ 0} is a continuous time Markov chain with timedepen-
dent transition intensities Pn(t) given by
32
I
Pn(t) = lim 7~, Pr{N(t + h) = n + 11N(t) = n} = h+0
f = E(~IN(t)
= n) =
x
n+1
e
-xt
U{dx}
0 S xn e -xt U{dx) 0
N is called a P$1ya process if
~ x ~-I e-~X if
r(B) u'(x)
x>
0
( a , 8 > O)
=
0
if
x<0
and in this case
Pnlt ~ J = B + n ~ + t
Lundberg
(19~0, p 99) showed that N is a P61ya process
if and only
if Pn(t) is linear in n, i.e. if Pn(t) = a(t) + b(t).n.
We may observe that for a weighted Poisson process N, we have
Pr{N(t) = 0} = f e -xt U{dx), 0 which,
considered as a function of t, is the L a p l a c e - t r a n s f o r m
of U.
Thus Pr{N(t) = 0} for t ~ 0 determines U uniquely and thus the distribution of N. Compare this with theorem A 1.5.
If on the other hand for some point process
{N(t)
; t > 0}
oo
Pr{N(t) = n) = f (xt)~n e -xt U{dx} 0 for all t > 0 and n = 0,1,2,...
and some distribution
point process need not be a weighted Poisson process. berg
(1969, p 123) gives an example.
If we, however,
function U this Jung and Lundassume that N is
33
a weakly
stationary
weighted
P o i s s o n process
(In section
doubly
1.6 stationarity
form of Pr(N(t)
rity it follows R(s,t)
U, and thus
where
2
'only if' direction
but the modifica-
1.1.) This
follows and
= ~2st.
from lemma
)%2 s
= (t -
Thus E(A(t) the
from
a random variable
it follows
- A(s))
since
Var N(t) =
= Var ~. From the assumption
that Var(A(t)
= Cov(A(s),A(t))
then N is a
= n) is as above.
for t ~ R
X here merely means
= o2(t2 + t 2 - 2t 2) = 0 w h i c h proves
2.2
process
= n) we get E N(t) = t E~
function
Var A(t) = t2~ 2
is defined
see remark
= t EX + t 2 Var ~, where distribution
Poisson
if and only if Pr(N(t)
tion to t > 0 is obvious, the
stochastic
with
1.3a that of stationa-
and thus
- tA(1)) 2 =
'if' direction.
The
is obvious.
Doubly stochastic Poisson proc~ses and r e n e w ~ p r o c ~ s e s
In this
section we will study the class
are both doubly
stochastic
Poisson
processes
Since both kinds of point processes the Poisson process,
interest,
is both a doubly
Kingman
in this
section
Kingman~s considered
In this
Poisson
may be helpful
(1964) has characterized
Poisson processes
for x < 0.
which
Such a study may also have
process
a process
a certain
out that
as a 'variation
which
and a renewal process,
in the analysis
which
of
common to the two
class
is somewhat
our p r e s e n t a t i o n
of the process.
of doubly
also are renewal processes.
give a discussion
we will point
section
generalizations
since if we are considering
stochastic
both representations
are natural
interest.
which
and renewal processes.
a study of the processes
classes m a y have a theoretical a practical
of point processes
stochastic
Although broader
we will than
may at most be
on a theme by Kingman'
all distribution
functions
are assumed
to be zero
34
We will consider tion
point processes
N = {N(x)
; x L @which,
1.3.4, may be defined by a random vector
To avoid
some minor
trouble
to zero with positive renewal process where
probability.
k = 1,2,...,
Since we only allow
finitely
< I. T O is allowed
tion H. The variables bility
and in that
renewal process distribution
have a common many events
function
intervals
+ ~ with positive transient.
if and only if at least
H and F are defective,
takes the value
i.e.
F.
we require
distribution
process
a
random variables
distribution
to have a different
is thus transient
T O to be equal
N is called
in finite
case we call the renewal
ding r a n d o m variable
section
are independent
T k may take the value
functions
T = (T0,TI,T2,...).
A point process
if T0,TI-T0,T2-TI,...
T k - Tk_1,
that F(0)
we allow in this
see sec-
funcprobaA
one of the
if the correspon-
+ ~ with positive
probability.
If
H(~)
= F(x)
the corresponding
I - e -x
if
x>
0
0
if
x<
0
=
renewal process
is a Poisson
process
with intensity
one.
Let {A(x) cess,
; x > O) be a n o n d e c r e a s i n g
of section
1.3, such that A(O-)
rightcontinuous < 0 < A(O).
stochastic
For the same reason
as w h e n we allowed T O to be equal to zero with positive we allow Pr{A(O)
A-1(x)
> 0} > 0. The process
= sup
is called the inverse A-I(0)
(y
:
A(y)
; x > 0} defined by
of A. Due to the assumption
vector ~ = (~0,~i,~2 .... ) define
(~1) .... )
A(0-)
< 0 we have
= + ~} > 0. Let the random
a Poisson
1.3.4 it then follows
T = (A -I(TO),A
probability
< x)
> O. Further we allow Pr{A-1(x)
From section
{A-1(x)
pro-
process
with intensity
that the random vector
one.
35
defines a doubly stochastic Poisson process on R+.
Put oo
f(~)
=
S e-S~
F{dx}
0 and oo
~(s) =
S e-SX
H{dx}
0 where F and H are the distribution functions in the definition of a renewal process.
A point process N, with Pr{N(x) = 0 for all x > 0} = I, is both a doubly stochastic Poisson process and a renewal process. This uninteresting case will be left out of considerations.
Theorem I (i)
A doubly stochastic Poisson process corresponding to A is a
renewal process if and only if A -I has stationary and independent increments.
(ii) A renewal process is a doublx stochastic Poisson process if and only if
~(s) :
I I - log ~(s)
and
~(s) = ~o(S)~(s)
where g(s) = S e-SX G{dx} for some infinitely divisible distribution 0 function G with G(O) < I and go(S) = S e-SX Go{dX} for some distribu0 tion function G . O
(iii) The two representations are related through
E e -sA-1(~
= ~o(S)
36
and E e
-s(A-1(1) - A-I(0))
= g(s).
Proof (i) The 'only if' part, which is the difficult part is proved by Kingman (1964, pp 929-930) and will not be reproduced here. Con^
sider now the 'if' part. Let go and g be given by part (iii) of the theorem. For any n > 0 we have n
E exp{- s0T 0 - kZ=1 sk (Tk - Tk_1)} =
= E exp{- s0A-1(0) - So(A-I(T~ O ) - A-I(0)) n
-
Z sk
(i-1(~k)
- i -I (~T k _ 1 ) ) ) =
k=l
TO)
n
= ~o(So) E(~(s o)
Tk
n E(~(s k)
) =
k=l =
go(SO)
n
I
I - Zog ~(s o)
k:1
I - Zog ~(s~)
which proves part (i) of the theorem.
(iii) This follows from the proof of the 'if' part of (i).
(ii) To any G and Go, defective or not, satisfying the conditions in (ii) there exists a process A -I with stationary and independent increments such that g and go satisfy the relations in (iii). Conversely, for any process A
-I
with stationary and independent in-
crements g and go given by (iii) satisfy the conditions in (ii), since if G(O) = I then the corresponding doubly stochastic Poisson process will not necessarily have only finitely many events in finite intervals. Thus (ii) follows from (i) and (iii).
37
Now we will consider the class of point processes which are both doubly stochastic Poisson processes and renewal processes in more detail. In the analysis we will alternate freely between the two representations. We will follow Kingman and consider the stationary case. A renewal process is called stationary, provided F is not defective and has finite expectation p, if
1
(1
i
~(x) = ~ o
- F(y))dy
.
A stationary renewal process is a strictly stationary point process.
Corollary I A stationary renewal process is a doubly stochastic Poisson process if and only if
~<s) = [I + bs + f
0 and some measure B on (0,~) such that f x B{dx}
<
0
Proof For any infinitely divisible G, defective or not, we have (cf e.g. Feller (1971, p 450))
~(s) = e -r where
r
=bs
+ f (I - e -sx) B{dx}
+b
0 for some b, b
> 0 and some measure B on (0,~) with
f 7-Yqx x B{dx} < 0 For the distribution function 0 (and thus also F) is defective if an only if b
> 0 .
Thus in the stationary case b
= 0.
38
co
Kingman
(1964, p 925) showed that ~ = b + f x B{dx},
co
and thus
0
f x B{dx} = ~ - b < ~. Thus the 'only if' part follows from 0 theorem I (ii). The 'if' part also follows from theorem I (ii) ^
if a distribution exists.
Kingman
function G o such that h(s) = go(S)f(s)
always
(1964, p 925) has shown that X
co
I
7(b +# # ~{dz}dy)
if xLO
0 y
0o(X) = 0
if
x<
0
satisfies the required condition.
From theorem
I (i) and the p r o o f of corollary
I it follows that a
doubly stochastic Poisson process corresponding to A is a stationary renewal process
if and only if A -I has stationary and inde-
pendent increments with -s(A-1(1) - A-I(o))
= exp{ - (bs + ] (I - e-SX)B{dx})} 0
Ee and X
co
b + ~ f B{dz}dy ou
if
x>
0
if
x<
0
b + ~ y B{dy} 0 Pr{A-I(o)
< x} =
0
for some b >_ 0 and some measure B on (0,co) such that co
x B{dx} _< 0 We may observe that since F(0) = lim f(s) we have F(0) > 0 if and oo
S-~co
only if b = 0 and 5 B{dx} < co. Since a stationary renewal process 0 is simple,
see definition A 1.1, if and only if F(0) = 0 it follows
from theorem
1.3 that A(t) is continuous
a.s. unless b = 0 and
39
f B{dx} < ~. 0 If b = 0 and S B{dx} = e < ~ we define the p r o b a b i l i t y measure C by 0 C{dx} = 2 B{dx}. Then C
9(s) = c f (I - e -sx) C{dx} = c(I - f e -sx C{dx}) 0 0 and thus A properties
-I
is compound Poisson process.
of A
-I
U s i n g the sample function
it is not difficult to see that A has the represen-
tation
9(x) k~1 ~
if
~(x) > 0
0
if
~(x) = 0
A(x) =
where N is a stationary renewal process with interoccurrence
distribution
C a n d {~k}k=l i s
a sequence of independent
variables all b e i n g exponentially
time
random 1
distributed with mean --. e
In the case b = 0 Kingman Pr{D+A(x) where D+A(x)
(1964, p 926) showed that
= 0 for almost all x ~ O} =
I
is the right-hand derivative.
Thus, if b = 0 and S B{dx} = *, almost all realizations of A are 0 continuous and, considered as measures, singular with respect to Lebesgue measure.
Kingman considered the important class of doubly stochastic Poisson processes,
discussed in section
1.3.3, where
X
A(x) = S ~(y)dy 0 for some stochastic process
{l(x)
; x > O} measurable
in the sense of
Doob and not identically equal to zero. He showed that a stationary
4o
renewal process can be expressed as such a doubly stochastic Poisson process if and only if b > 0. In this case ~(x) alternates between I the values 0 and ~ in such a way that ~(x) is proportional to a stationary regenerative phenomenon (cf Kingman 1972, p 48).
If f B(dx~ ~ ~ and if c and C are defined as above, it follows, see 0 I
Kingman (1964, p 9 2 8 ) , t h a t X(x) i s e q u a l t o 0 and ~ a l t e r n a t i v e l y on intervals whose lengths are independent random variables. The I lengths on the intervals where X(x) = ~ a r e
exponentially distributed
with mean ~ and the lengths where ~(x) = 0 have distribution function C. C
2.3
Some r e l i a b i l i t y models
Consider a doubly stochastic Poisson process (N(t)
; t ~ 0~. In this
section, with perhaps a somewhat misleading title, we will consider the distribution of the waiting time T for the first event. Since (T > t~ = (N(t) = 0~ this is the same problem as calculating the probability of no events in an interval.
2.3.1
An application on precipitation scavenging of an aerosol particle
In this section we will study a model, due to Rodhe and Grandell (1972), for precipitation scavenging of an aerosol particle from the atmosphere.
Information about the distribution of the waiting
time for the first event is of interest in connection with air pollution problems.
The intensity for the removal of a particle from the atmosphere is highly dependent on the weather. In the model we assume that the removal intensity only depends on whether it is raining or not. Let ~d denote the removal intensity during a dry period, i.e. during a dry period a particle has the probability ~d h + o(h) of getting
41
scavenged from the atmosphere in an interval of length h, and let
P
denote the removal intensity during a precipitation period. Let X(t) be a stochastic process defined by kd
if dry period at time t
kp
if precipitation period at time t
k(t) :
It is further assumed that k(t) is a continuous time Markov chain with stationary transition intensities qd and qp defined by I qd = lim ~ Pr(~(h) = ~pl~(O) = ~d } h+O I qp = lim ~ Pr{~(h) = XdlX(O) = Xp} , h+O and with initial distribution
Pd
: Pr{~(0)
: ~d }
pp = Pr{k(0) = k } . P For some discussion of the relevance of this model we refer to Rodhe and Grandell (1972).
Consider a particle which enters the atmosphere at time 0 and let T be the time for the removal of that particle from the atmosphere. Define G(t) by t G(t) = Pr t} = E(exp( - f k(s)ds}). 0 Put
Gd(t) = Pr{T > tl~(0) = kd } G(t)
: Pr t1~(0) : ~p}
and thus G(t) = PdGd(t) + ppGp(t) .
42
The chosen initial distribution describes the knowledge of the weather when the particle enters the atmosphere.
From the properties of k(t) it follows that
E(exp{ -
t+h 5 l(s)ds}ll(h)) h
is independent of h and by considering the possible changes of k(.) during (O,h) we get -Idh Gd(t + h) = (I - qdh)e
Gd(t) + q d h % ( t )
+ o(h)
and thus h + 0 gives
G~(t) = - (qd + Id ) Gd(t) + qdGp ( t ) and similarly G'(t)p = %
Gd(t) - (qp + Ip) Gp(t)
.
From the general theory of systems of linear differential equations it follows that -rlt Gd(t) = a d e
-r2t + Bd e
-rlt Gp(t) = ap e
-r2t + Bp e
where
rl
=
r2 =
1
-
I (qd+%+Xd+Xp) +
<
(qd+%+~d+Xp) _XdXp_Xd%_Xpqd"
Thus -rlt G(t) = ~e
-r2t + ae
Assume that r I > 0, which holds in all non-trivial cases.
43
Since G(0) = I we have e + B = I and thus -rlt G(t) = ~e
-r2t + (I - ~) e
From this we get
S G(t)dt = ~-- + I 0 rl r2 which by definition is equal to E T.
Integration of the differential Gd(~) = ~ ( ~ )
equations gives, since
= o ,
- I = - (qd + Id) ~ Gd(t)dt + qd ~ Gp(t)dt
- I : qp ! Gd(t)dt-
(qp + Ip) ~ Gp(t)dt
and thus
f G(t)dt = 0
qd + ~
+ Pdlp + Ppld
qdlp + qpl d + Idl p
which determines ~.
In Rodhe and Grandell
(1972) the above derivation is given in more
detail and further the model is illustrated by numerical examples. Of special interest is the case Pd = qd + qp which corresponds situation where the particle enters the atmosphere weather.
to the
independently
of the
In this case +
qd + ~
+ pplp qdld qd + ~ - rl
qd + qp + Id + Ip - 2r I We conclude this section by mention a natural generalization model for precipitation
scavenging.
of the
Let l(t) still be a Markov chain
44
with stationary transition probabilities
but let the possible values
of k(t) be kl,...,kK where K may be infinite.
In the precipitation
scavenging example kl may be the removal intensity during a dry period and k2,...,kK the removal intensities
during precipitation
periods classified according to the intensity of the precipitation. It is easy to see that for finite K the method of calculating G(t) in the case of only two possible values of k(t) also applies to this situation.
It is, however,
in general not possible to find an explicit
solution of the system of differential equations.
Doubly stochastic Poisson processes treated by for example Neuts
of the above kind have been
(1971) and Rudemo (1972) for finite K
and by Rudemo (1973:1) for infinite K. Rudemo's derivation of G(t) differs from ours and has connections with non-linear estimation.
We
will return to this in example 5.3. Neuts uses the process as the input in a queueing model.
2.3.2
A model w i t h an i n t e n s i t y process generated by a renewal pro cess
Consider now a (not transient)
renewal process N = {N(t)
in which the times between successive variables with a common distribution
; t > 0}
events are independent
random
function F. We assume that
F(0) = 0. F is called arithmetic with span T if T is the largest number such that F is concentrated on a set of points of the form T, 2T, 3T, ... and otherwise non-arithmetic.
Further the distri-
bution function of the time to the first event is denoted by H. The % interesting choices of H are H = F which makes N an ordinary renewal process and,
provided F has finite expectation,
45
X
S(x)
=
/ (I - F(y))dy 0 oo
F(y))dy
S (I 0
which makes N a stationary
Let {Xk}k=0 be a sequence
renewal process.
of independent
n o n n e g a t i v e random v a r i a b l e s further these variables
Define
a stochastic
be a doubly stochastic intensity Pr{N(t)
with distribution
be independent
process
and identically function
distributed U and l e t
of the renewal process
N.
~(t) by ~(t) = ~ ( t ), Let N = {N(t)
Poisson process with X(t) as a model
in the sense as discussed
in section
; t > 0]
for the
1.3.3. Let G(t) denote
= 0}.
Consider
first the case H = F and put u(t) = S e -tx U{dx}. Note that 0 u(t) = Pr{N(t) = 01~(t) = 0}. Separating the two cases N(t) = 0 and N(t) > 0 we get t G(t) = (I - F(t)) u(t) + ~ u(s) G(t - s)F{ds} 0 and thus we have obtained
Following
Feller
a (defective)
renewal equation
for G.
(1971, p 376) we assume that there exists a K > 0
such that co
f e ~t u(t)F{dt} 0
= I
and further we assume that S eKt u(t)(1 0
- F(t))dt
<
The equation
eKtG(t)
= eKtu(t)(1
is a proper renewal equation.
t - F(t)) + f eK(t-s)G(t-s)e<Su(s)F{ds} 0
~6
Since, mann
as w i l l
integrable
362))
p
be
shown
it
below,
follows
e
from
u(t)(1
the
- F(t))
renewal
is d i r e c t l y
theorem
Rie-
(cf Feller
(1971,
that co
f e ms u(s)(1 e
- F(s))ds
0
§
f
s e <s
[(
s)F{ds}
0
as t § ~ i f F is n o n - a r i t h m e t i c
and
that
co
Z
T e K(t+nT)
G(t+nT)
eK(t+JT)u(t+jT)(I
- F(t+jT))
§
f
S e ~s ~( s ) F { d s }
0 as n § co i f F is
To
see that
be
the
arithmetic
e Kt u ( t ) ( 1
largest,
- F(t))
a n d m. t h e J
m. -< e Kt u ( t ) ( 1 --j
with
- F(t))
span
is
co
directly
smallest
_< m.j f o r
9 and
number
(j -
1)h
0 < t < ~ .
Riemann such
integrable
l e t m.
that
< t < jh.
Then
co
m.
Z
j=l
Z
a
e <jh
u((j
1)h)(1
-
-
F((j
1)h))
-
=
j=l
-
oo
=
h
e
h
Z
e K(J-1)h
~l(jh)(1
- F(jh))
<
(e 2 K h
I) f e < t u ( t ) ( 1 - F ( t ) ) d t 0
j=1
co
+
e2
h
Z j=1
m. -J
Thus oo
j=1
which
tends
to
co
co
Z m. < h e K h + m. - h O j--1 - J -
zero
as h ~ 0 a n d t h u s
-
(cf Feller
u(t)(1
- F(t))
is
directly
Riemann
integrable.
1971,
p 361))
47
Example
I
A~sume that F is a one-point
distribution
with F(T) - F(T-0) = I.
Then < is determined by e
KT
~(T)
= I
and thus
eK(t+nT)G(t+nT)
§ 9 e
u(~)
e KT
e
u(t)
or
G(t+nT)
~
~(~)n u(t)
Thus we get, in this example,
.
the exact value
for all T and t.
B Assume now that F is non-arithmetic GH(t) denote Pr{N(t)
but let H be arbitrary.
Let
= 0}. Then we have
t GH(t) = (I - H(t)) Q(t) + ~ u(S)GF(t-s)H{ds} 0 or
Assume,
GH(t) = e
t - H(t)) + f e<Su(s)e<(t-S)GF(t-s)H{ds}. 0
in addition to the assumptions
in the case H = F, that
co
e <s & ( s )
~{d~}
<
0 and that
e Kt u(t)(l
and thus,
- H(t)) § 0 as t §
since e
convergence
that
bounded,
we get by dominated
48
oo
eKtGH(t)
oo
f e Ks u(s)(1 0
§
j-
- F(s))ds
f e Ks u(s)H{ds} 0
oo
Ks
S e
u(s)F
ds
0
Example
2
Let N be a Poisson
process
with
intensity
F(x) = H(x) = I - e -#x for x h 0. Then
p, i.e.
K is determined
by
co
f e Kt u(t) 0
e -pt dt = 1
and co
f e Kt u(t)(1 0
- F(t))dt
= I
and thus
e
Kt
G(t) §
o~
2
Consider
-
Then K is the smallest
Pu 1 p +
tl
-
x
+
Ks
U(Xk-0)
^..
u[s)
, k =
solution
e
- Hs
+
-
12
-
ds
distribution
with
1,2.
of
Pu 2 p
some calculations
K =-~(~,1+12+p)
and
) S e 0
the case where U is a two-point
% = U [kt ) ' "
and after
r
x
-
1
we get
(t1+t2+p)
2 -
tlt2
-
UlP,X 1 -
u2~t
2
49
2 ~ Ks f s e ~(s) 0
~ + 11 + 12 - 2< e -~s ds = ~ + u112 + u211
We note that 1(t) is a M a r k o v probabilities
chain w i t h
stationary
- <
transition
with parameters
Pk = Pr{l(O)
= Ik } = u k
and I qk = lim ~ Pr{l(h) h+O I = l i m ~ {h~(1 h+O
- Uk)
# Ikll(O)
+ o(h)
= I k} =
= ~(I
- u k)
.
For 11 = I d
12 = I
P
= qd+
qp
% Ul = qd + qp
qd u2 =
qd+~
we thus get the p r e c i p i t a t i o n
Pd = qd + qp
2 :j s 0
e
scavenging
model provided
, and it is seen that < equals
<s u(s)
e -~s d ~
equals
r I and that
I
--
m
5O
2.3.3
A model with an i n t e n s i t y process generated by an a l t e r n a t i n g renewal process
Consider again the precipitation scavenging model and let Fd(F p) denote the distribution function of a dry (precipitation) period. In -qd x -qpX the model Fd(X) = I - e and F (s) = I - e P Now we drop the assumption of exponential distributions and consider arbitrary F d and Fp, but keep the assumption of independence between the duration of different periods. X(t) is thus generated by an alternating renewal process. Let further Gd(t)(Gp(t)) denote the probability that T > t for a particle which enters the atmosphere exactly when a dry (precipitation) period starts.
Put
~d(S} = f e -s~ ;d{~} 0
9 (s) = S e-SX ; {dx} P
0
P
Gd(S) = ~ e - s x Gd(X)d x 0 ~(S)
= f e -sx G (x)dx . 0
P
Since -Xd t Gd(t) = e
t -XdT (I - Fd(t) ) + f e Gp(t-m) Fd{dT} 0
we get
~d (s) = I - 9d(S + ~d ) s + ~d + }d(s + ~d)Op(S) and in the same way
I - ?(s + xp) P
51
and thus e.g.
{
8d(S)= I - fd(s + Id)fp(S + lp)
I - 5(s + ~d(S + ~d )
1
-
~d(S
+
~a)
s + Id
+ ~ID) } .
s + I P
This formula is given by Gaver (1963, p 224) with, in our opinion, a more complicated proof. When the durations of the periods are exponentially distributed this definition of G d (Gp) coincides with the one given in section 2.3.1 because of the lack of memory of the exponential distribution. An inversion of the Laplace transform seems, however, to be as difficult as our direct derivation. Lawrance (1972, pp 228-233) has a further discussion of this model. We will now consider the behaviour of Gd(t) for large values of t.
Let ~d and ~
be the durations of the first dry period and the subseP
quent precipitation period respectively. Put ~ = ~d + ~p and let F = Fd ~ F
P
denote the distribution of T. Define the functions Ad(t )
and B(t) by Ad(t) = Pr{T > tl~ > t} B(t)
= Pr{T > tl~ = t} = E(e
-~dTd-Xp~ ~
PI~ = t) .
Note that it is irrelevant for B(t) if we start with a dry period or with a precipitation period. It can be shown that
Ad(t) (I - F(t)) = -~d t = e
t -XdT-~p(t-T) (I - F(t)) + 6 e (I - Fp(t-T))Fd{dT}
and that this function is non-increasing. q~
Separating the two cases T > t and ~ < t we get
52
t - F(t)) + f B(T) G d ( t 0
Gd(t) : Ad(t)(1
T)F{dT}.
Assume now, like in section 2.3.2, that there exists a K > 0 such that co
f e
~
e ~t
Ad(t)(1-
F(t)) dt < co
0 Since, in the same way as in section 2.3.2, it can be shown that e Kt Ad(t)(1
- F(t)) is directly Riemann integrable it
follows that
oo
e "s
l i m e Kt Gd(t) = t§
Ad(~)(1
-
F(s))
as
0 co
S s
e KS s(s)
F{ds}
0
It may simplify the calculations
to observe that
f e Kt B(t) F{dt} : E (e (K-ld)~dlE(e(K-lp)~p) 0 and thus K is the smallest solution of }d(id - X)fp(lp - x) and that co
f e Ks 0
Ad(S)( 1 i F(s))ds
--
^
I -
fa(~d-
Id - K and 0
^
K)
+
^
c
fd~Id
-
~)
I -
fp(~
1 P
-
K
K)
=
I
53
2.4
Palm p r o b a b i g i t i ~
In section 2.3 we considered the distribution of the waiting time for the first event. In this section we will consider the waiting time for the first event after the occurrence of an event. In order to do this we will introduce the concept of Palm probabilities.
2.4.1
Palm p r o b a b i l i t i ~ for doubly stochastic Poisson p r o c ~ s ~ in the general case
Intuitively a Palm probability for a point process is a conditional probability given that a certain x 6 X happens to be one of the points of occurrence of the process. There are several attempts made to make the intuitive notion of a Palm probability precise. A short discussion of such approaches is given by Daley and VereJones
(1972, p 360). Our approach will essentially be the one due
to Ryll-Nardzewski
(1961). Since the theory of Palm probabilities
will only be used in this section, we content ourselves with refering to Jagers (1973) for a discussion of a general theory of Palm probabilities based on Ryll-Nardzewski's approach.
Formally a PaZun probability will be defined as a Radon-Nikodym derivative. We hope that the following heuristic reasoning will be of some help for the understanding of the precise definition. Let a point process N and an event B E B(N) be given. The problem is to give a precise (and reasonable) meaning of P r { N 6 BIN({x)} > 0}. For a point process we have N{{x}} = N{dx). Let B(x) denote the event that x 6 X is one of the points of oectucrence of N, i.e. B(x) = { ~ 6 N ;
~{dx} > 0). Led by the elementary definition of condi-
tional probabilities we consider the ratio
Pr{N6 Bg~ B(x) } Pr{NEB(x)}
. Let IB be the
f
if N 6 B indicator for B, i.e. IB(N) =I IB(N)IB.x,(N ) " . ~function )" " , 1 if N~ B ' and consider the ratio E E ]B(x)(N) 9 If N is a simple point process, i.e. multiple pointsE ~IN)N{dx}d~ not occtu-, we have IB(x)(N) = N[dx}. Thus we are led to consider
E N(dx}
and this will be our Palm probability. It will turn out, see theorem 2,
54
that it is convenient for our purposes to use this definition of PaLm probabilities also for point processes which are not simple and for general random measures. For a random measure A, we by formal analogy define a Palm E IB(A)A{dx} p r ~ a b i l i t y by E A{dx}
We conclude this heuristic reasoning by some remarks on the definition used by Jagers (1973). He considers point processes N, not necessarily simple, and associates to N a simple point process N~ defined by ~ { { x } }
= min(N{{x}},~)
for all x 6 X. Then IB(x)(N) = N~{dx} and he defines a Palm probability by
s 1~(N) ~{ax} 9 For a simple point process N we have N ~ = N, and ~hus the m N~{dx} two definitions agree in that case.
Let A be a random measure with distribution and let B 6 B(M) be an arbitrary but be the indicator
function,
i.e.
I
if
~9 B
0
if
~
H such that EA = M 6 M
fixed set. Let IB : M §
[},I~
IB(~) =
and consider the measure with respect unique
a.e.
B
E IB(A)A which is absolutely
to M. By the Radon-Nikodym (M) determined
continuous
theorem there exists
function 11 {B}
a
: X § ~0,I~ such that
X
A for all A 9 8(X).
The family {Hx{B} p 258))
; x 6 X, B ~
so that H
B(M)}
can be chosen
is a probability
measure
on
(cf Bauer
B(M)
(1968,
for all x ~ X
X
and so that for each B 6 B ( M )
the function x + H {B} is B(X)X
measurable. H
In the sequel we assume this to be done.
will be called the Palm measure.
The measure
To simplify notations
X
regard
{x6X
two Palm measures
1I( 1 ) X
a n d 1I( 2 )
; H(1)x r H(2)} has M-measure
X
as equal
zero.
if
the
set
we will
55
Remark I Our definition of Palm probabilities
is indicated by Jagers
(1973,
p 20), although it differs from the one used there. For simple point processes,
see definition A 1.1, the two definitions
the discussion by Jagers
agree and thus
(1973) about the link to the intuitive
meaning of Palm probabilities
also applies to the present defini-
tion. For a point process where multiple points are possible this link is generally lost. We will illustrate this fact with a simple example.
Let N be a point process on the two-point
space X = (x,y)
and put H.
= Pr{N{{x}} = i, N{{y}} = k} .
l~k
Then the natural definition of H {B} for B = {v~ N ; ~{{y}} = k} X
is H {B} = Pr{N{{y}} = klN{{x}} > 0} = X
oo
i= I
1,k
=
co
oo
Z Z H. k=O i=I 1 ,k
provided Pr{N{{x}} > O} > O, while from our definition we get co
Z in. i=I i ,k H
{B}
=
X
co
co
Z Z i H. k=O i=I m ,k
For strictly stationary point processes the link to the intuitive meaning is exhaustively discussed by Kerstan, Matthes and Mecke (1974, pp 182-185).
From that discussion
it follows, however,
that our definition of Palm probabilities may he of practical use also in the presence of multiple points.
9
56
Let for x ~ X the Dirac measure
I
if
x~A
0
if
x~A
6
for x be defined by
x
for all A ~ B ( X )
6x{A} =
and let the Dirac measure A
Ax {B} =
Note that ~ 6 N x
I
]
if
0
if
x 6x~B
and that &
x
be defined by
for all BeB(~I)
.
is the distribution of a random
x
measure with almost surely 6
Mecke
for 6
x
as realization.
x
(1967, p 43) has shown that a probability measure
P on (N,B(N)) is the distribution sity measure
~, see definition
of a Poisson process with inten-
1.4, if and only if
X•
N X
for all B(X)•
function~
sider, as an example,
f : X•
f(x,v) = IA(X)IB(~)
~ R+. If we con-
and P = H
we get
f ~{A} H {d~} = f (I IB (" + 6 x) H {dD}) H{dx} = B ~ A N
= / (Ax~ ~) {B} ~{dx} A where the first equality follows from Mecke's c h a r a c t e r i z a t i o n the second one from the definition of convolution
and
given in section
A 1. Thus, since for a Poisson process the intensity measure equals the mean measure, we have Palm measures
(H~)x = AxXH~" With the definition of
used by Jagers
true if ~ is non-atomic a similar relation
(]973) this result is only generally
(cf Jagers
(]973, p 22)). We will now give
for doubly stochastic Poisson processes.
57
Theorem 2 For any random measure A with distribution H, such that EA = M 6 M, we have (PH)
= Ax~P H
X
where PH X
is %he distribution of a doubly X
stochastic Poisson process corresponding to a random measure with distribution H . x Proof The theorem is a consequence of Kummer and Matthes (1970, p 1636) but, as to be shown, it is also a rather simple consequence of Meckes characterization of Poisson processes. We have to show that
/ (Ax~P~ ){B}M{dx} = / ~{A]Pn{d~} A
x
B
for all A~ B(X) and all B~B(N). Using I.
the definition of convolution
2.
the definition of doubly stochastic Poisson processes
3.
approximation of r IB(V + 6x) H {dr]
with simple functions of
the form Z ak IAk(X ) IBk(~ ) for An~ B(X) and Bk~ ~(M) and the definition of Palm measures 4.
Mecke~s characterization with f(x,v) = IA(X) IB(V) and P = H
5.
approximation of IB(V) v{A} with simple functions
we get (the figures over the signs of equality refer to the above table)
(Ax~ ~ ) {~}M{dx} ~ / /
I B (~
+ ~x ) PK {d~}~{dx} :
A
x
AN
~ ~/ A~
IB(v + 6x)H~{dv} H= {d~}M{dx} x
x
~ ~ ; IB(V + 6x) H {dv}~{dx]H{d~} = ~ s IB(V)v{A} H {dr} H{d~} 2~5 f v{A} PH{dv} B which was to be proved.
58
Some special models
2.4.2
Consider X = R and, see section the intensity
is a stochastic
(Io,B(Io))where
tion H on
process
integrable
functions.
E l(x) = m(x)
< ~ for all x ~ R .
Poisson process
probabilities
{l(x)
; x@R}
Assume
rightconti-
further that
In this case the corresponding
is simple
for
with distribu-
I ~ is a set of nonnegative
nuous Riemann
stochastic
1.3.3, the case where the model
and from the definition
doubly
of Palm
we get
S Hx{B} m ( x ) d x A
=
S n(x) AxB
H{dn}
i.e.
x
{B} m(x)
/ n(x) n{an}
=
a.s.
(Lebesgue measure)
B
B(Io).
for all B e
Consider now the corresponding and define G(x,t)
G(x,t)
and Gx(x,t)
= PH{V
doubly
stochastic
Poisson process
N
, t > x, by
= O} = Pr{N(t)
; v(x,t]
- N(x) = 0}
and Gx(x,t) =
= (PH)x
ol={{x}}
>
{v
; V(x,t~
= O) = P r { N ( t )
o}
where the last sign of equality tion of Palm probabilities
refers to the intuitive
for simple point processes.
section 2.3, we have t
S n(y)~ G(x,t ) = S I
e
x
~{dn}
o
and, from theorem 2 and the fact that Ax~P H x = PH
{V ; V(x,tl = 0}, we have x
- N(x) =
interpretaThus, compare
59
t
f n(y)dy
Gx(x,t ) =
f
e
x
~x{dn}
o or, by the definition
of Hx, t - f n(y)~y
0x(x,t)m(x) = ~ n(x) e
x
~{dn}.
o
Example
3
Consider the case where transition where
I
l(x) is a Markov
probabilities
and with distribution
is the set of rightcontinuous
o
chain with stationary H on (Io,B(Io))
piecewise
constant
func-
tions R § R+ = [0,~) with only a finite number of jumps in every finite interval
and with range
finite or infinite.
Hki(X)
{lk ; k = ],...,K} where K may be
Put
= Pr{l(x + s) = lill(s)
= lk } ,
x6R+,
and ~k(X)
= Pr{~(x)
:
Xk } ,
xem
Of c o u r s e Hki(X) and Wk(X) must f u l f i l l regularity measure
m(x) =
conditions
In section
< ~.
G(x,t),
or strictly
speaking
upon the case
K.
be arbitrary
late G(x,t)
and
assume that
for the case K = 2 and commented
with arbitrary
Let x ~ R
Further
2.3.1 we calculated
only G(O,t),
consistency
in order to correspond to a probability
H on (Io,B(Io). K Z IkWk(X) k=]
certain
but such that m(x)
> 0. In order to calcu-
it is enough to know the restriction
of H to the o-
6o
algebra generated by ( ~ restriction
is
I ~ ; ~(y) ~ z} , y ~ x and z ~ R ,
determined
b y H{B} f o r
B~
B(I o)
of the
and this
form
(n~ ~o ; n(x) = Xk, ~(x I) = Xkl ..... n(x n) = Xkn } for n = 1,2,... , k , k 1 , . . . , k
n = I , . . .K and x < x] < x 2 < ... < x n < ~ .
We have shown that G ( x , t )
is
x
calculated
as G(x,t)
if
H is
replaced
with H . For B of the above form we have x
H(B) ~ Wk(X)Hkk1(Xl-X)Hklk2(X2-Xl)...Hkn_ikn(Xn-Xn_1).
Since
I
I
Hx(B} = ~
f ~(x)H(d~} - m(x) Xk H(B} B
we get
Hx(B} =
m(x)
Thus the restriction of H
Hkk1(Xl-X)'''Hkn_ikn (Xn-Xn-1) "
x
to the above mentioned a-algebra is the
distribution of a Markov chain (~x(y) ; y ~ x} with stationary transition probabilities and with sample functions in I . Thus for x with o m(x) > 0 the methods of calculating G(x,t) also applies to G (x,t) ~kWk(X ) x provided ~k(X) is replaced with m(x) . For the stationary case, i.e. Wk(X) = Wk' this result is obtained by Rudemo (1973:2, p 279).
It may be observed that the above reasoning holds also if the transition probabilities are not stationary, but then we have no methods for calculating G(x,t).
Example 4 Consider the case studied in section 2.3.2 where ~(x) = ~N(x) and assume that N(x) is a stationary renewal process. It was shown that under certain regularity assumptions
9
63
l i m e Kt G(t) = C. t~ Assume these regularity assumptions
and assume further that
0 < E kk < ~" Put m = E Xk'
Thus t
-
i fI ~
Gx(x't) --
n(x)
e
f n(y)dy x
o
and since N is stationary it follows that there exists a uniquely determined function G~
G~
such that (for almost every x)
= Gx(x,t+x)
For the general theory of Palm probabilities case we refer to Jagers
in the stationary
(1973, pp 25 - 26). Thus t
G~
i
=
f n(y)dy 0
e
n(o) m
~{dn}
o
and thus G~
is a monotone function.
We will now consider the behaviour of G~
for large values of t.
For arbitrary x E (0,t) we have x
x
f o~
=
0
f Oy(y,t)dy = 0 t
-• --
n(y) m
OI
e
f
n(z)dz
Y
0 t
t
- S n(z)dz
-•
(e x
-m
e
I o
= !
m
(o(t-x)
- f n(z)d~
- o(t))
0
) ~(dn}
=
62
which is a Palm-Khintchine
equation
(cf
e.g.
Daley and Vere-Jones
(1972, p 358)).
From the monotonicity
xG~
of G ~ it follows that
< A (G(t-x) - G(t)) < xG~ -- m
and thus
xe
~im (e<(t-x) G(t-x) e <x - e
< xe <(t-x) G~
e <x
for all t > 0 and all x ~ (0,t).
Thus
x lim sup e
< C (eKX _ I) < xe <x lim inf e
m
which implies that l i m e Kt G~ t-~
lira t-~o
e Kt
G~
=
t§
exists and that
<__qC m
We conclude this example by sketching an approach more in line with the one used in example 3.
Let {X~
; x ~ 0} be a stochastic process defined exactly like X(X),
except that " l 0 has distribution function V(x) = ~ Put v(t) = S e-tx V{dx}. Then H ~ 0 I
rl~
= m
S n(O)
i0 y
U{dy} instead of U.
defined by
II{dq} for all B
B(IO) ,
B
can be shown to be the distribution of {X~
; x S0}.
Thus
t
- S x~ G~
= E e
dx
0
and, see section 2.3.2 for notations,
G~
t = (I - H(t))v(t) + S v(s) GF(t - s) H{ds}. 0
63
Thus the asymptotic behaviour of C~
can be studied by methods, similar to
those used in section 2.3.2.
A detailed derivation of l i m e
2.5
m
Some random generations
We will in this section describe stochastic
Poisson
for illustration
sequences
reasons
s
and s
independent
for continuous
with s
exponentially
distri-
p, 0 ~ p ~ I, and
I - p. Then we have
I
rk, j = E(s k - I)(s
- I) :
plk-Jl
For this process we made seven generations ding {N k} for k = 1,2,... ,n. In table generations
time are presented.
be equal with probability
Zk=
5.2 and
I, i.e. Pr(s k ~ x} = I - e -x for x ~ 0. Let
with probability
m=E
of doubly
1.4) which will be used
6 and 7. In sections
; k ~ Z} be a Markov process
buted with parameter further
(see section
in sections
5.3 some similar generations
Let s = (s
some random generations
are given.
of (~k } and the correspon-
I some characteristics
of the
64
Name of
I n
P
n
n Z ~k --
generation
I
n
I
--~ Nk n I
I
n
--n Z (Nk-~,k)2 1
GI
500
0.0
0.993
0.982
0.951
G2
500
0.0
1.025
1.052
I .074
G3
500
0.0
1.018
1.046
0.856
G4
500
0.75
0.929
0.884
0.717
G5
500
0.75
0.878
0. 788
0.809
G6
500
0.75
0.933
0.976
0.978
G7
50
0.75
1.090
o. 86o
0.702
Table I: Some characteristics
The generations
for the generations
GI - 06 are the same as used by Grandell
(1972:2).
Generation G7 is illustrated in figure I.
25
5O
In each point k the height of the spike represents N k and the value of the plece~ise constant curve represents
Lk .
Figure I: Illustration of generation G7
65
3.
CHARACTERIZATION AND CONVERGENCE OF NON-ATOMIC R A N D O M M E A S U R E S
In this section we will illustrate how doubly stochastic Poisson processes may be used in proving theorems about random measures. Roughly speaking, any theorem about characterization or convergence of point processes that can be applied to doubly stochastic Poisson processes gives, due to theorem 1.1 and 1.2, a similar theorem for random measures.
Kallenberg (1973:1) gave general theorems (see theorem A 1.4 and A 1.7) for random measures and important improvements
(see theorem
A 1.5 and A 1.8) for simple point processes. We will use the theorems for simple point processes in order to get results for nonatomic random measures
(see definition A 1.1). The idea to use con-
tinuity assumptions in theorems of the kind to be given in this section is not new. Kallenberg (1971) announced such theorems for stochastic processes. The main results of this section were independently derived by Grandell (1973) and Kerstan, Matthes and Mecke (1974, pp 314-316) using identical methods. Using different methods the results were extended to certain classes of random signed measures by Kallenberg (1973:2) where also his announced theorems for stochastic processes were given.
Let A C B ( X ) be an algebra containing some basis for the topology on X.
Theorem I Let A be a non-atomic random measure with distribution H. Then ~ is uniquely determined by the distribution of A{A) for all bounded
A~A.
66
Proof The doubly stochastic Poisson process N with distribution PH' see section
1.5, is simple,
see theorem
1.3, and thus PH is uniquely
determined by Pr{N{A} = O} = E e -A{A} = f e-P{A}H{ d~} for all
M bounded A~A,
see t h e o r e m A 1.5. Since the distributions
of A{A}
determine E e -A{A} they also determine PH and thus H, see t h e o r e m 1.1.
It may be observed that in fact we have proved a little more, namely that H is determined by E e
-A{A}
for all b o u n d e d A ~ A.
In theorem A 1.5 the assumption that N is simple is essential, e.g. Kerstan, Matthes
and Mecke
assumption that A is non-atomic not mean that theorem
see
(1964, p 17), and thus also the is essential.
This does of course
I is true only for non-atomic A, in fact it
is of course true for simple point processes, but some condition on A is needed.
It is therefore of interest to find conditions
of the distributions
of A{A} for A ~ A which ensure that A is a non-
atomic random measure.
Jagers
(1974, p 203) shows that a point process
N is simple if there exists a non-atomic = o(p{A})
in terms
~M
such that Pr{N{A} > I} =
as p{A} + 0 and A ~ A .
Thus for any random measure A it follows that A is non-atomic
1 - El(1
+ A{A}) e-A{A~
if
= o(~{A}).
Denote the boundary of a set A by 3A.
Theorem 2 Let A be a non-atomic
random measure with Pr{A{~A} = 0} = I for all co
bounded A@A A
n
and let {An} I be a sequence of random measures.
d --~A if and o n l y i f
d
A {A}--* A{A} f o r n
all
hounded A~A.
Then
67
Proof The 'only if' part follows since ~ *
~{A} is continuous
for Borel
sets A with ~{~A} = 0.
Now we consider the 'if' part. We will use theorem A 1.9 in the same way as theorem A 1.5 was used in the proof of theorem I. The doubly stochastic Poisson process N corresponding to A is simple and
Pr{N{~A} = 0] = I ~
Let N
n
E e -A{~A} = I ~:@ Pr{A{~A} = 0} = I.
be the doubly stochastic Poisson process corresponding to A . n
For bounded A ~ A the random variable N {A} can be considered as a n
doubly stochastic Poisson process with an one-point
state space and
thus from theorem 1.2 it follows that A {A}-J-~ A{A} implies that n
d d N {A}---~ N{A) and thus, by theorem A 1.9 N ---+ N which, by theorem n n d 1.2. again, implies that A --+ A.
n
m
It seems natural that if theorem A 1.8 is used in the proof of theorem 2 a somewhat stronger result will come out. In order to do this the following lemma is needed.
Lemma I o~
oo
A sequence {PH }I is tight if and only if {Hn) I is tight. n
Proof Assume that {Hn)~ is tight and thus by Prohorov's theorem relative compact. {PH
For any subsequence
}I converges weakly, nk
{Hnk) ~ which is weakly convergent also
see theorem 1.2, and thus {PH }I is relan
tive compact and by Prohorov~s theorem in its other direction thus tight. The 'only if' part is proved in the same way except that corollary
1.1 is used together with theorem 1.2.
B
68
Theorem 3 d Let A,AI,A2,...
A if and only if
and A be as in theorem 2. Then A n
-i {A} n
(i)
E e
(ii)
E A {A} e
.~ E e
-A{A)
for all bounded A ~ A
-An{A}
-A{A} ," E A(A} e
for all b o u n d e d A s A
n
co
(iii)
{An) I is tight.
Proof Proceed as in the p r o o f of theorem 2 up to the construction of N n. From assumption
(iii) and lemma I it follows that {Nn} I is tight
and from assumptions
(i) and (ii) that Pr{N {A} = 0} § Pr{N{A} n
and that P r { N {A} = 1} § P r { N { A ) = 1} a n d t h u s
it
follows
= 0}
by theo-
n
d
rem A 1.8 that N ~
d
N which, by theorem
1.2, implies A --~ A .
n
n
m
4.
LIMIT THEOREMS
In this section we will consider asymptotic properties stochastic Poisson processes
Let A = {A(t) rightcontinuous
; t~R+}
on B+ = [0,~).
be a stochastic process with nondecreasing
sample functions,
for all t > 0. Thus, see section measure.
Let ~ = {N(t)
of doubly
such that A(0) = 0 and A(t) < ].3.1, A corresponds to a random
; t 6 R+} be a Poisson process with inten-
sity one and independent
of A and let, see section
doubly stochastic Poisson process N = {N(t)
; t~R+}
%
to A be defined by N = NoA, i.e. N(t) = N(A(t)).
1.3.2, the corresponding
68
Theorem 3 d Let A,AI,A2,...
A if and only if
and A be as in theorem 2. Then A n
-i {A} n
(i)
E e
(ii)
E A {A} e
.~ E e
-A{A)
for all bounded A ~ A
-An{A}
-A{A} ," E A(A} e
for all b o u n d e d A s A
n
co
(iii)
{An) I is tight.
Proof Proceed as in the p r o o f of theorem 2 up to the construction of N n. From assumption
(iii) and lemma I it follows that {Nn} I is tight
and from assumptions
(i) and (ii) that Pr{N {A} = 0} § Pr{N{A} n
and that P r { N {A} = 1} § P r { N { A ) = 1} a n d t h u s
it
follows
= 0}
by theo-
n
d
rem A 1.8 that N ~
d
N which, by theorem
1.2, implies A --~ A .
n
n
m
4.
LIMIT THEOREMS
In this section we will consider asymptotic properties stochastic Poisson processes
Let A = {A(t) rightcontinuous
; t~R+}
on B+ = [0,~).
be a stochastic process with nondecreasing
sample functions,
for all t > 0. Thus, see section measure.
Let ~ = {N(t)
of doubly
such that A(0) = 0 and A(t) < ].3.1, A corresponds to a random
; t 6 R+} be a Poisson process with inten-
sity one and independent
of A and let, see section
doubly stochastic Poisson process N = {N(t)
; t~R+}
%
to A be defined by N = NoA, i.e. N(t) = N(A(t)).
1.3.2, the corresponding
69
One-dime~ional l i m i t theorems
4.1
We will n o w consider This question and
(1972:2)
the asymptotic
has been independently and Grandell
It is well-known
of N(t)
t r e a t e d by Serfozo
as t § ~. (1972:1)
(197]).
that
~(t) - t
d --§
where W is a normally
as
t §
distributed
d and Vat W = ] and where - - ~ means In many
distribution
cases there
exists
d
'convergence
constants
> 0 and a r a n d o m variable A(t) - Kt Y
r a n d o m variable
with E W = 0
in distribution'.
K~ y~ 8 with y > 6 > 0 and
S such that i
~S
as
t +~.
from Dobrushin
(1955)
t8 Then it follows
N(t)
-
Kt Y
that
S + /~KW if
V =
S
y < 26
26
d
as
t§
t 6
where
S and W are independent.
that the specific
if
It follows
form of the norming
constants
portant~
and we are led to the
(1972:1~
p 293 and ]972:2 pp 3]2-3]3).
Theorem
]
Suppose
that there
~t lim 8 t = ~ and lim ~ t-~o t§ 8t such that
A(t)
- ~t
= K, 0 _< K < ~
d ~S
Bt
following
exist nonnegative
as
t§
from Dobrushin~s
are not too im-
theorem
constants
proof
due to Serfozo
~t and Bt with
and a random
variable
S
7o
Then N(t) - ~t
d ~S+
~Was
t
§
oo
St where S and W are independent.
Proof We will give a proof slightly different from the one given by Serfozo.
Since N(t) : N(A(t)) one may suspect that N(t) behaves somewhat alike A(t) + N(st) - st. Put, in order to simplify the notations, No(t) = N(t) - t. Thus we have
N(t) - ~t
A(t) - ~t
~o(~t )
=
+
St
-
Bt
-
Bt
Due to the assumptions A(t) - ~t
d §
as
t §
St For the second term we have
N~
m ~ t~ Bt
If K > 0
then
=
N~
~-~
at + ~
~t _-~§ v/K'K as
~t
as
t
t § ~
-~
ce
~t and
~o(~t )
d
---~W
as
t §
and
No (A(t)) - ~o(~t ) +
Bt
7]
If K = 0 then No (mt ) mt Var (~-~j----)= -~ § 0
as
t §
~t Thus, if the last term is shown to tend to zero in probability
as
t § ~, the theorem is proved.
Since N and A are independent we have
d
No(A(t)) - N(mt)
No(I A(t) - mt I)
Bt
=~
Bt
IA(t) - mt[" INo(IA(t) - mtl) I Bt2
where = means
N iA(t) _ mt I
'equality in distribution'
preted as zero. From Chebyshev~s
and where o
gg
is inter-
inequality it follows that
(t) Pr {L ~
N
<
]
} > 1 -
~
for
allt
> 0 a~d
(t)
{ o V~
all
~ > O,
-
; t ~ 0} is tight, and thus also {N~
IA(t) - mt I is tight. Since
i.e.
st
I), t > 0} ~IA(t) - ~tl --
d * IS1 as t § ~ it follows that
~t
i IA(t)
- ~tl
9 tends to zero in probability
as t § ~ and thus also
Bt No(IA(t ) - ~tl ) tends to zero in probability
as t § ~.
Bt
m
Consider now the case when E A2(t) < ~ for all t > O. Put M(t) = E A(t) and V(t) = Vat A(t). From lemma 1.3a it follows that E N(t) = M(t) and Vat N(t) = M(t) + V(t).
72
The
following
Grandell
(1971,
Corollary Suppose (i)
corollary
is a slight
reformulation
of results
due to
pp 207-213).
I
that
lim M(t) t-~
If k = ~
then,
N(t)
- M(t)
= ~
and
d ---+ W
as
M(t) l i m ~T-77~+ ~ =k,O
t § ~.
/Var N(t) (ii)
If k < ~ A(t)
and if there - M(t)
exists
d
a random
variable
Z such that
as t §
Z
V~(t) Then N(t)
where
- M(t)
--+d @
I " k v--;---2 z + ~T-~-f
W
as
t +~
Z and W are i n d e p e n d e n t .
Proof
(i)
Put
s t = M(t)
and
8t = M(m
(ii)
Put
s t = M(t)
and
Bt
=
7
/~.
Then
S = 0 a.s.
Then
d Z = S.
.
m In the
important
A(t)
we have
the
Corollary Suppose
-
special M(t)
following
case w h e n
d ~W
as
t§
result.
2
that
lira M(t) t+~
A(t)
- M(t)
= ~ and that
d --+ W
as
t § ~.
73
Then N(t) - M(t)
d
.~ W
as
t§
~Var N(t)
Proof Let
{ t n }n=1
be an arbitrary
sequence
such that t n § ~ as n § ~
M(t) Since V--~7 ~ 0 for all t > 0 there exists a subsequence M(tn,) {tn,}n,=1 g {t n} such that nlim'§ V(tn,------~ = k 6 [0,%~ where k may depend on {t '}. From the proof of theorem n
I and from corollary
I
it follows that N(tn,)
- M(tn,)
d ----+ W
as n' + ~
~Var N(tn,)
and, since {t ) was arbitrary,
thus
(cf Billingsley
(1968, p 16))
n
N(t) - M(t)
d ----+ W
as t § ~
~Var N(t)
l The following corollary
shows the existence
2 but not by corollary
of independent variables
example
and identically
I
9
of cases covered by
Let {~k }~ k=1 be a sequence
distributed
nonnegative
random
with E ~k = I and Var ~k = I and put
it] A(t) =
Z
~'~ means
'integer part'
Then M(t) = It] + exp {[log(t + I)~}, V(t) = It] and A(t) - M(t) V(/~
d_~ W
but
I lim inf ~M(t) = I + -t-~~ e
M(t) lira sup ~ = 2. t-~
Added
in proof
See page 86 for further remarks.
and
74
4.2
A fund~ion~g l i m i t theorem
The main purpose of this section is to present results corresponding to theorem I but where the statement
'convergence in distribution'
is given in a more informative sense.
Consider therefore the space D of rightcontinuous functions with lefthand limits defined on [0,~) endowed with the natural extension of the Skorohod J1 topology. Some properties of D are given in section A I. The Borel algebra B(D) on D is equal to the a-algebra generated by {xED
; x(t) ~ y }
for y 6 R and t in some set dense in E0,~). Let
D CD o
be the subset of non-decreasing functions x with x(0) = 0 and
let C ~ D be the set of continuous functions and put C
o
= CAD
o
. All
these subsets are closed sets. We topologize these sets by relativizing the topology of D.
We call a measurable mapping from some probability space into (D,B(D)) a stochastic process in D. The same terminology will be used for the subspaces and also for products of these spaces.
Remark 1 Let us consider the set of functions in D
o
and denote this set by
when it is endowed with the vague topology. Let h : M § D identity function. It is measurable,
o
be the
since from theorem A I. I and
the properties of B(Do) it follows that B(M) = B(Do). Thus every probability measure on (M~B(M)) is also well-defined on (Do,B(Do)). Let ~,~i,~2,. .. be functions in D O such that ~n § ~ in D o ~ i.e. in the Skorohod sense. Then (cf Billingsley
(1968,p 112))
~n(t) § ~(t) at continuity points t of U and thus, see theorem A 1.7, ~n + W in M, i.e. in the vague sense. On the other hand
75
0
if
0
< I---
--
~n(t) =
I
1 -!<
if
t n
2
I n
< 1 +1
--
n
1 + l--
if
n
- -
converges to 0
if
0
2
if
I
< 1
~(t)
in M but not in D o . Thus the (extended) Skorohod topology is strictly finer than the vague topology so h is not continuous.
If
A,AI,A2,... are stochastic processes in D o it thus follows that d d A - - 4 A (in M) does in general not imply that A ----* A (in D ). n
n
The function h is, however,
continuous
o
at continuous
functions,
at functions in C O . To see this, let ~I,~2,... be in D
i.e.
and let ~ be O
in C O . Then ~n § ~ in M means that ~n(t) § ~(t) for all t ~[0,~) while ~n § ~ in D o means that
sup l~n(S) - ~(s) I § 0 for all t 6 [0,~). O~s<_t It can be shown that ~n § ~ (in ~) implies ~n § ~ (in Do) almost exactly in the same way as it is shown that pointwise convergence distribution
functions to a continuous
distribution
form (cf e.g. Bauer (1968, p 191)). Thus A
n
d *
A
of
function is uni(in ~) implies
d An -
'~ A ( i n
DO ) p r o v i d e d
Pr
{A~ Co } = 1. T h e o r e m s 3 . 2
and 3.3 thus
d
hold true if the conclusion A
> A is interpreted in the Skorohod n
sense. Jagers
(1974, pp 211-212) has shown the same to be true for
theorems A 1.8 and A 1.9.
It is not sufficient to assume that A is continuous in probability d d in order to ensure that A n - - 4 A (in M) implies A n ~ A (in Do) as was erroneously stated by Bingham (1971, p 6). To see this, let 8 be a non-negative
random variable with continuous
distribution
and put
76
0
An(t) =
n(t -
@)
if
0
if
8
< t
<
@ I
< @ +--
--
if
I
t
> --
n
@ +--
1 n
and 0
if
0
I
if
@>
< @
A(t) t
d ---~ A (in M), but since A
Then A n
is a process in C n
is closed in D it follows that A
n
and since C o
does not converge in distribu-
tion to A in the Skorohod sense.
Let X be a stochastic process in D, m an element in D and Bt a positive function such that lim Bt = ~. Define for every t > 0 a stochast-~ tic process Xt in D by
Xt(T) =
x(t~) - m(tT) Bt
d If Xt
~ S as t § =, for some stochastic process S in D, we talk
about a functional limit theorem for X.
If for every T > 0 we have
~tT _- Tp t-~=limB t , p finite,
then Bt is said to var~ regulary at infinity with exponent p or shorter to be p-varying.
We will, in the functional limit theorem to be given, assume that Bt is p-varying. The following lemma, and the remark after it, indicates that this in fact is a mild restriction.
77
nemma
I
Let X be a stochastic limit theorem holds
process
in D and assume that a functional
for X. If B 6 D
O
and if Pr {S e 0} < I then
~t is p-varying with 0 < p < ~.
Proof A very similar result sidered convergence underlying
is due to Lamperti
of finite-dimensional
assumptions
were somewhat
proof to be given is, except
(1962, p 74). He condistributions
different
for technicalities,
and his
from ours. The the same as
Lamperti's.
It follows
from Billingsley
that the set T S consisting
(1968, pp 123-124)
and theorem A 1.10
of 0 and those T > 0 for which
Pr {S(T) - S(T-) = 0} = I is dense in E0,~). The statement d X t --~ S implies
= Xt(T)
d~
S(~) for all T ~ T S and since lim Bt t-~o we have Pr {S(O) = O} = I. Since B(D) is generated by { x e D , x(t) < y}, y ~ R and t ~ T S it follows that there a ~o~Ts,
t ~ > O, such that Pr {S(T o) r 0} > 0 since otherwise
Pr {S - O} = I. Then for every positive
s(T)
d
x(t~o~)
BtT
x(tTT o) - ~(tTT O)
= - -
BtT -~tT
BtT
O
The second factor of the right member S(~o) , and therefore
T > TS, when t §
- ~(tTo~)
* - -
Bt~
Thus,
exists
BtT
O
converges
in distribution
to
tends to a finite limit. O
since B ~ D o, it follows
(of Feller
(1971, pp 275-276))
that
78
~tT lim ~ t- ~ 8t~
i.e.
o
T B t - to = lim - t -~~ 8t
Bt is p-varying.
p = (~---) for all o
~ > 0,
Since lim ~t = ~ we have 0 >_ 0.
If O = 0 then S(T) =d S('c ) f o r a l l
o
ments of D are rightcontinuous,
positive
"c~-TS. S i n c e a l l
this implies
ele-
(cf Billingsley
(1968,
p 126)) that S(T ) =d S(0) and thus Pr (S -= 0) = I. This is a cono tradiction
and therefore 0 > 0.
B
Remark 2 If Pr (S(T) - S(T-) = 0) = I for all T ~ R+, then T S = R+ and the p r o o f of lemma I goes through if B is only assumed to be measurable
Lemma
(cf Feller
(1971, p 277)).
2
The function ~ : D x D and it is continuous given by r
o
+ D given by @(x,y) = x~y is measurable
for ( x , y ) 6 C
= x + y where
and it is continuous
x D . The function @ : D x D § D o
(x + y)(t) = x(t) + y(t) is m e a s u r a b l e
for (x,y)6 C x D.
Proof This lemma is a consequence of more general results given by Whitt (1972). A p r o o f will, however, be given. Consider first the function %. The m e a s u r a b i l i t y
follows from the p r o o f of lemma 1.2
slightly modified according to B i l l i n g s l e y observations xl,x2,... ~ D ,
about the Borel-algebras
(1968, p 232) and the
in remark
1. Let now
x 6 C and y,yl,Y2,... ~ D O be given. We will show that
if Xn + x and Yn § y than XnOY n + xoy. From the definitions
given
79
in s e c t i o n
A I it follows
that
Yn § y means
that
there
R~C
{Yn } =I' Yn 6 F '
such
that
ynOYn
exists
U
~ y
and Yn
~ e.
Since
U~C
x g C it follows tE
that
x
n
§ x means
that
x
) x. For
n
any
[0,~) we have
sup O<s
Let
IXn~YneYn(S)
- x~y(s)l
<
sup O<s
IXn~Yn~Yn(S)
+
sup O<s
[x.Y~Yn(S
e > 0 be g i v e n
continuous
and put
on [ O , t o .~ t h e r e
sup
<
- XOYnO7n(S)l
) - x~y(s)l
to = sup n
exists
+
.
(ynOYn(t)).
Since
x is u n i f o r m l y
~ > 0 such that
Ix(s 1) - x(s2)
I < E .
O<__sI 's2~t [s1-s21<6 U~C Since
x
U~C X and y n O Y n
n
IXn(S)
sup O<s
--
y
,,
- x(s)[
there
exists
<
0
and
sup O<s
[ynOYn(S)
- y(s)
I < 6
for all n > n . --
Thus
o
for n > n --
o
sup O<s
IXnOY,nO%#n(S)
IXn~S) -
sup
xs
- x,y(s)[
+
~ ~
<
2~
O<s
and thus
- -
0
@ is c o n t i n u o u s
for
(x,y)6
C x D . o
n
o
such
that
8O
Consider
the function
the continuity
~. This
assume
sup O<s
function
that x n § x ~ C
I(x n + yn ) ~ Yn(S)
is obviously
- (x + y)(s) I !
sup Ixn o Vn(S) - x(s)I +
sup ly n o Yn(S) - y(s) I O<s
The first term tends
to zero due to the
(put Yn = Yn and y = e) and the second
Let A be a stochastic
process
in D . From lemma o
in D
o
first part of this
independent
Define
lemma
since Yn § y"
2 it follows
N = N~A ~ in D o is well-defined. % N
For
and Yn + y' Then
O<s
a process
measurable.
of N, which
that the stochastic
No by No(T)
=
also is process
T) - T and No, t by
N (tT) o,t
(T) = ~
o
9 In section
4.1 we made use of the fact that
%
No(~)
d --+ W as § ~. In order to state
now let W be the Wiener
Lemma
process
a functional
in D with E W(t)
correspondence
= 0 and V a r W(t)
we = t.
3 N
o~t
d ----+ W as t § ~.
Proof
Since No, t is a process follows
from S k o r o h o d
it is sufficient
o,t for each ~ R
(~)
with i1957,
stationary
snd independent
p 151) combined
increments
it
with t h e o r e m A 1.10 that
to show that d
w(~)
+ as t § ~. For fixed
(t~) ~o,t(T ) = ~
o
T we have
d ~ ~w(1)
d =w(T).
I
81
We are now ready to state the functional limit theorem corresponding to t h e o r e m I.
Theorem 2 Let A be a stochastic process
in D
and let N = NoA be the corresponO
ding doubly stochastic Poisson process. and St a positive
p-varying
Let m be an element in D
O
function with 0 < p < ~ and S a stochas-
tic process in D. Define A t by At(T) = A(tT) - m(tT) St
and N t by
Nt(~ ) : N(tT) - m(tT) St
If
~(t) lim ~ = ~, 0 < ~ < ~ ~ and if t§
At
d
~ S
as
t § ~ , then
St d N
~ S + Woh
t
where h(~) = KT 2p
as
t -~
and S and W are independent.
Proof We have
Nt(T )
N(A(tT)')
- m(tm) = N(A(tT)) St
=
No (A(tT) St
- A(tT)
+ A(tT)
St (A(tT))
+ At(T) = N o,S 2 t
2 St
- m(tT) St
9 + At(z)
Thus we have
Nt = N
2 o At + At o,S t
where ~t(T )
A(tT) 2 St
At(~) St
~(tT) 2 St
From t h e o r e m A 1.10 and the fact that lim S t = ~(cf Feller (1971, t-~
=
82
p 277)) it follows that
At
P s0
as
t§
St P where --~ means
'convergence in probability'
and 0 is the function
identically equal to zero.
Further it follows from Feller (1971, p 277) that ~(tT)
u,c
KT2p
2 St
~(tT)
a(tT) = h(w)
since
2 St
=
~2 tT
B2 tT 2 St
for
T > 0
(strictly speaking it follows from Feller that the convergence is uniform on compacts not including zero,
but since p > 0 and ~ & D
o
it is not difficult to realize that zero may be included) and thus P t
'~ h
as
t §
From the assumptions
it follows that At----~ S and thus (cf Billingsd ley (1968, p 27)) (At,A t) ~ (S,h). Since lim St = ~ it follows from t-~ d lemma 3 that N
2 o,B t
pendent it follows ()
~ ~ W and thus, since N
(cf B i l l i n g s l e y
2 and (At,A t ) are indeo,B t
(1968, p 21)) t h a t
2,At,~t) _ d, (W,S,h). From lemma 2 it follows that (N~
2 ~ At'At )
o,~ t ,St is well-defined and that, since W is a process a.s. in C and At d a process in Do, (N 2 o At,At) ---* (Woh, S). o,~ t Since h E C it follows that W~h is a process a.s. in C and thus it o
d follows from the second part of lemma 2 that N t
~ S + WOh.
m
83
Remark 3 If it is possible to choose m(t) = bt and Bt = / ~ t h e n theorem 2 is very close to a result due to Serfozo (1972:2, p 314).
m Example I We will consider the case studied in section 2.3.2 where the intensity process is generated by a renewal process. Let N = {N(t); t > 0) be an ordinary renewal process which is independent of a sequence ^
co
{Xk}k= 0 of independent and identically distributed nonnegative random variables with distribution function U. (The use of ~k and N instead of Xk and ~ as in section 2.3.2 is only since the notation N is used in a different sense in this section.) Let T0,TI,... be the epochs of events corresponding to N, i.e. Tk = ~-1(k). Put ~0 = @0 and ^
co
[k = Tk - Tk-1" Then {$k)k=0 is a sequence of independent and identically distributed random variables. Let F denote their common distribution function.
Put ^
~(t) = ~N(t) " Then
A(t)
N(t)-1 t = / ~(s)ds = 0 k=0
2 < ~ and E ~2 Assume that E ~k k < ~ and put m = E ~k' 2
= Var ~k'
^2 fl = E ~k and f2 = E ~k" Then we have
A(t)
- mt =
N(t)-I Z k=O
(~k - m)~k + ( ~ N ( t )
- m)(t
- T~(t)_l).
Thus, except for the last term which has no influence on the asymptotic behaviour, A(t) is expressed as a randomly indiced sum of independent and identically distributed random variables with zero mean and finite variance. Put
84
At(~ ) = A(tT>
- mtT
Since N(t) ~ t
1 fl
and Var (~k - m)~k = ~2f2
it follows from the proofs given by Billingsley (1968, pp 146 and 149) that
A
--~
~
W
as t §
t From theorem 2 it then follows that d
2
Nt ~
f2 fU
~
+ m W
ast§
where N(tT)
Nt =
-
mtT
/~
m
Example 2 We will consider a case which illustrates the advantage of the rather general formulations of the theorems.
Let, like in example I
{ k)k=0 be a sequence of independent and
identically distributed nonnegative random variables with distribution function U. Put l(t) = ~ [ ~
where ~
means 'integer part'.
If E 22 < ~ this case is included in the case considered in k example I and therefore we assume that E 22k = ~" It follows from Feller (1971, p 577) that a non-degenerate (one-dimensional) limit theorem for A(t) exists if and only if U belongs to the domain of attraction of a stable distribution. Then for some Y , 0 < y ~ there exist norming constants a
t
and B t such that
2,
85
A(t)
-
d
st
S(1)
as
t +~
Bt
where
S(I ) is a r a n d o m
exponent E
y and
with
y = I will
f r o m n o w on b e
A t be d e f i n e d
a stable
I ~ -varying
B t is a n e c e s s a r i l y
= ~ if 0 < y < I and m = E I
k
Let
variable
distribution function.
Then
< ~ i f I < y < 2. The --
k
with
case
excluded.
by
At(~ ) = A(tT) - m(tT) #t where =] 0
~(t)
Let
S be
Imt
a stochastic
if
0 < Y < I
if
I
process
< 2
in D w i t h
stationary
and independent
d increments
such t h a t
At(1)
cess
if y = 2 and this
cess
in C. F u r t h e r
For a l l y w e h a v e
It f o l l o w s
~ S(1)
is e x a c t l y
as t § ~
the only
S is a p r o c e s s
in D
O
. S is a W i e n e r
case w h e r e
if and only
pro-
S is a p r o -
i f 0 < u < I.
d I = t ~ S(1).
S(t)
from Skorohod
(1957,
p
151) t h a t
d At
Since
~ S
as
^2 = ~ w e h a v e E ~k
t §
K = lim t-~
from theorem
N
t
2, or f r o m t h e o r e m d ---+ S
~(t)2 = 0 for a l l y a n d t h u s
it f o lY l o w s
Bt I and Skorohod~s
result,
that
also
as t § ~
where
Nt(T ) = N(t~)
- m(t~) Bt
m
86
Added
in p r o o f
Rootz6n
(1975) has
Suppose
that
shown the following
a t and 8t are constants
N(t)
- at
one-dimensional
with
lim t-~
d ---+ some r a n d o m variable
8t
=
~
result.
Then
.
R as t § ~
~t if and only if a t -- < ~2
< = lira sup t+~
and furthermore,
A(t)
-
at]) = E (e iuS)
(exp{iu
E
exp{-(<
at u 2 - -~) ~-}
~t
+ o(I)
In this
independent
5.
and W is normally
ESTIMATION
Consider
for some random variable
d R = S + {~-<W where,
case
a r a n d o m variable
estimate
t +
like in section
distributed
is, given
~ defined
4.1,
S and W are
with EW = 0 and Var W = I.
on the same p r o b a b i l i t y
A and a point process
M, is a Poisson
considered
S as
OF R A N D O M VARIABLES
as a r a n d o m measure A = ~
+
Bt
process
with intensity
an observation
of ~ in terms
N, where
N for given
~. The p r o b l e m
of N on X E B(X), o
of the observation.
space
to be
to find an
86
Added
in p r o o f
Rootz6n
(1975) has
Suppose
that
shown the following
a t and 8t are constants
N(t)
- at
one-dimensional
with
lim t-~
d ---+ some r a n d o m variable
8t
=
~
result.
Then
.
R as t § ~
~t if and only if a t -- < ~2
< = lira sup t+~
and furthermore,
A(t)
-
at]) = E (e iuS)
(exp{iu
E
exp{-(<
at u 2 - -~) ~-}
~t
+ o(I)
In this
independent
5.
and W is normally
ESTIMATION
Consider
for some random variable
d R = S + {~-<W where,
case
a r a n d o m variable
estimate
t +
like in section
distributed
is, given
~ defined
4.1,
S and W are
with EW = 0 and Var W = I.
on the same p r o b a b i l i t y
A and a point process
M, is a Poisson
considered
S as
OF R A N D O M VARIABLES
as a r a n d o m measure A = ~
+
Bt
process
with intensity
an observation
of ~ in terms
N, where
N for given
~. The p r o b l e m
of N on X E B(X), o
of the observation.
space
to be
to find an
87
Non-linea~ ~gimagion
5.1
Let ~ = N • M • R with elements ~ = (v,~,z) be our sample space. Let N and M be endowed with the vague topology and R with the usual topology.
With the product topology ~ is a Polish space (cf Bauer
(1968, p 169)). Let
B(~) be the Borel algebra on ~ and let Q be a
probability measure on (~,B(~)), i.e. Q is the distribution of (N,A,~).
Let thus
(~,B(~),Q) be the probability
space on which the
random elements, to be considered in this section, are defined. Since ~ is Polish, there exists for every sub-~-algebra of B(~) a conditional distribution
relative that sub-~-algebra
(cf Bauer
(1968, p 258)). Conditioning will be denoted by a superscript.
Let
B(N),B(M) and
respectively.
B(R) denote the Borel algebras on N, M and R
B(~) is thus the product of these algebras.
by BN, B M and B R sets in set B N • M • R. By
B(N), B(M)
B'(N)~B(~)
way B'(M), B'(R) and B~,
Denote
and B(R) and by B~ the cylinder
we mean (B~ ;
BNEB(N)}.
In the same
B R' are defined.
Consider the probability measure Q. Since N is assumed to be a Poisson process trarily.
for given A = ~
M we can not choose Q quite arbi-
For each Q on (~,B(~)) we may define a marginal probability
measure H
(for A)
on
(M,B(M)) by
Q {B N • B M • R} = f n BM Consider a set X o E B ( X ) (~60
H(B M) = Q(B~}. Thus Q must satisfy
{B N} H{d~)
.
and let O' be the ~-algebra generated by
; v(B} ~ x} for all x 6 R
and B 6 B ( X O) where B(X O) is the re-
striction of B(X) to Xo, i.e. B(Xo) = { B N X ~ ; B 6 B ( X ) } .
Definition
I
A set 0 ' ~ 0' is called an observation of N on X . O
88
Definition 2 An 0'-measurable
function sx : ~ § R is called an estimate of ~ in
terms of N on X .
We need some criterion to decide whether an estimate is good or not. Consider therefore a loss function L, i.e. a B(R) x B(R)-measurable function L : R • R § R+
~-F0,~)"
Definition 3 An e s t i m a t e
~
of ~ in terms
according to L if E ( L ( ~ , ~ ) )
o f N on X
0
~ E(L(f,~))
is
called
the best
estimate
for any 0'-measurable
func-
tion f : ~ § R.
In a given situation a best estimate need neither exist nor be unique a.s(Q). We will in example 2 consider a case where an estimate, which is not a best estimate according to any loss function L, may be reasonable.
If E 0' L ( ~ , ~ )
< E 0' L(f,~) a.s(Q) for any 0'-measurable
f : ~ § R then ~
function
is the best estimate of ~, and especially if
L(x,y) = (x - y)2 we get ~
= E0'~. Thus we consider Q0'. Although
Q0' is the a.s.(Q) unique 0'-measurable
solution of
S Q0, (.) aQ = Q (0'O.} 0' for all 0'~ 0', this may be of little help.
If Q{O'} > 0 it follows from the elementary definition probabilities
that Q{ O'{]B~}
Q{Ba]0' }
=
Q{0' }
of conditional
89
where Q(BolO') is the conditional probability of B ~ @ B ( O )
We will now consider the case where X
given 0'.
is bounded. In theorem I it
O
will be shown that Q0' may be calculated as a limit of elementary de-fined conditional probabilities.
It will further be shown that
what in 'every day language' is meant with an observation of N on Xo, really is an observation in the sense of definition I.
Let X vo @ N
0
be bounded. Then v(X ) < ~ for all v ~ N . 0
the set O'(Vo) by O'(v o) = ( ~
For Vl,V 2 ~ N
Define for any
~ ; v(B) = Vo(B)for all B~B(Xo)).
the sets O'(v I) and O'(v 2) are either disjoint or equal
and further
~ O ' ( v ) = ~. Let d be a metric metrizing the topology vaN on X. Let (Bnl,...,Bnr) be a sequence of finer and finer partitions n
of X ~ (i.e. for each n and ~ = 1,...,r n the set Bnj. is a union of certain Bn+1,j, j = 1,...,rn+1, sets) such that B n j 6 B ( X o) and
lim n+~
max diam (Bnj) = 0. 1~_j~rn
Put 0 n ( v o) = ( ~ g 2
. ; V(Bnj)
=
Vo ( B nj.) for
I -~ j
~ r n ). --
0 !
For each v6 N thus O'(V)~n 0'. Define Qn (B~) : ~ § [~,I] by
Q ~B~t0~(~)~
if
~o~(~)
and
Q~O~(~)~ ~ 0
0
if
~0~(~)
and
Q~O~(~)~ = 0
QO'n (B2)(w) =
for every B ~ B ( ~ ) .
Theorem I For each ~ N
the set O'(v) is an observation,
Further for each B~E B(~) we have
i.e. O'(v)~O'.
O' lie Qn {B~} = QO'(B~} a.s. (Q). n-~oo
9o
Proof Consider any ~o 6 N. The set 0'(v o) is characterized by the vector (xl,nl,x2,n2,...,Xm,nm) where Xl,...,x m are the only points in X ~ with w ({x}) > 0 and where n. = v ({x.}}. Denote for each x 6 X by o j o J o Bn(X) the set among Bn],...,Bnr
which contains x. Then Bn(X) + {x}. n
Thus there exists n o such that for n > n o the sets Bn(Xl),...,Bn(Xm) are disjoint. Thus 0~(~ o) + O'(v o) for n > n o which implies that 0'(~o)6 0' since O~(Vo)6 0' for each n.
Let 0'n be the a-algebra generated by {wE 2 ; ~{Bnj} _< x}, x 6 R, j = 1,...,r n. Thus 0~(~)~ ~0'n and 0~g~ 0'n+1 " Define 0'~ to be the ~algebra generated by
[j 0'. If we can show that 0' = 0' then the n n=1 theorem follows from Doob (1953, pp 611-612).
Since 0 " ( 0 '
and since 0" is a c-algebra it is enough to show that
v{B)(~) : ~ § Z is 0"-measurable for each B E B(Xo). Put n
D = { D E B ( X o) ; ~{D}(~) is 0"-measurable}
[j B nj .and . Since X ~ = j=1
since v(.)(~) is a measure for each ~ it follows (cf the proof of lemma 1.1) thatP is a Dynkin system.
For any closed set F in X the set X o ~ F 6 lim n§
[~ X6Xo~F
Bn(X) = X o N F .
D since
Since B(X o) is generated by (Xo~ F ;
F closed in X) and since for closed sets F I and F 2 also FI~]F 2 is closed and thus X o ~ F I ~ F 2 ~
~ it follows (of Bauer 1968, pp 17-18) that
p = B(Xo).
m Consider now the case where N and ~ are conditionally independent given A. To motivate a special study of this case, we just note that this is the case if ~ = A{B} for some B 6 B(X). In order to make our formulae somewhat more handsome, we denote the marginal distribution
of (A,~) by V. Thus V is a probability measure on (M • R, ~(M x R))
91
defined by V{B M x BE} = Q{N x BM x BR} . From the conditional independence we get Q(B N x BM x BR) = f
QB'(M)(B~) QB'(M)(B~} dQ =
= / n~{BN}QB'(M){B~}Q { N
• d~ • S} =
BM
BM
since
BM
i~ (W{Bn(Xi)} )niI e -w{X~ =I
Since H {O'(v)} = const. n
,
where Bn(X) and the vector (x],n 1,...,xm,n m) characterizing 0'(v) are defined in the proof of theorem I and where the constant depends on n and v but not on ~, it follows from theorem I that a.s. (Q)
(B(Bn(Xi)))niI e
~.!
Q~
=
lim
-~[x o )
V{d~ x BR }
I
n'+~
( ~ {Bn (x i ) } )
o
e
ff[d~)
1
for m~ O'(v) characterized by (xl,n I .... ,Xm,nm). Specializing further we consider X = R and X ~
(0,
and, see sec-
tion 1.3.3, the case where the model for the intensity is a stochastic process {l(x) ; x s
with distribution H on
(Io,B(Io))where
Io is a set of nonnegative Riemann integrable functions with finite integral over bounded sets. Let the space (~,B(~),Q) be modified in the obvious way. Then a.s. (Q)
92
t
m
QO'(BI)(~)
=
-S n(y)dy
I
( H n(y)dy) e o i=I B n x i)
Zim
o
V{dn x BR}
t
n-~co
-f ~(y)dy
m
/(~
n(y)ay)
I
e
0
H{dn}
I 0 i=l B n x i)
for ~O'(v)
by ( X l , 1 , . . . , X m , 1 ) .
characterized
(Multiple points do not occur.)
(t(j 2n
tj 2 Z]
=
n6 I
~(x) is continuous
O
such t h a t
2n
Jim n +~
l-
-
1)
Choose e.g. Bnj
f
Then for each x.E m (O,t] and each a t x = x. we h a v e 1
n(y)dy = n(x i) .
gn(X i )
Thus a.s. (Q), since a Riemann integrable
function is a.e. continuous,
t
-f
m
H (Bnlx i )
n(y)~y
n(y)dy) e 0
i=I lim
t_ (2 n)
n-~m
m
t = ( n n(xi))
e 0
i=I If e.g. t
-fn(yl~y supfl n
I 0
m 2n 0 H ((t--)B~n n(y)dy) e i=I (x i )
I+~ I
H{dn} < ~ a.s.(Q)
93
for some ~ > 0 it follows by uniform integrability
(cf Billingsley
(1968, p 32)) that
t
- f ~(y)dy
m
f ( H n(xi)) I Q0'{B{)(~ ) =
e
0
V{dn x BR}
i=I o
t
- f ~(y)dy
m
f ( ~ n(xi)) e I
a.s. (Q) .
0
~{an}
i=l
o
Remark I Two 'extreme'
cases where the condition of uniform integrability
holds are when
sup f O<x
~m(x) [{d~} < ~
for all m or when I
is o
0
the set of constant functions which means that N is a weighted Poisson process.
To see this we will show that t
- f n(y)dy
m 2n I =f ( H ((7-) Io i=I
f n(y)dy) e Bn(X i)
0
)2 H{dn}
for all n,m and Xl,...x m is less than a constant which does not depend on n.
Assume that Cm =
sup
f
~m(x) H{d~]
<
9 Using H 6 1 d e r %
O<x
0
equality twice we get
2 n 2m
i• 2 n 2m (7-)
m
f
n
I~
i=I
( f
n(y)dy)2~{dn} •
Bn(X i )
I m H (f ( f ~(y)dy)2m H {d~}) TM i=I I ~ Bn(X i )
in-
94
2 n 2m
(~)
m
<__ (7-)
H i=I
1
0
1
- - 2m
(
(f I
(I _ 1 ) 2m
2
l
n2m(y)dy) 2m)
)~
H{dn}
Bn(X i ) I
2n
m
= -t- ~i=I (f o (Bn(Xi J )
q 2m ( y ) d y )
lI{dq})
TM
<_.
I
2n < (~-)
--
m
(t__
H i=I
2n
m C2m)
= C2m 9
If N is a weighted Poisson process we have
2m
i = fI Example
(qm e-tq)2 ~ { d n }
<_ ( t )
e
0
I
Let PI'P2 ~ M
be given and define the distribution of A by
Pr{A = pk } = Wk, k = 0,1 , where w0 + Wl
Put { = k
if
=
I.
A = Pk" Then
EO'L(f,~)
= QO'{m; p = PO } L ( f , O ) + qO'{m; P = Pl } L(f,1).
Thus, if
L(x,k) =I 0
if
x = k
I I
if
x#k
the best estimate of 6 is equal to k if QO'{~; z = k) > QO'{~; z = I - k) .
=
95
Example 2 Consider X = R and X ~
=[0] ,t
Put like in section 2.1
and let N be a weighted Poisson process
X(t) = ~ where ~ is a random variable with
distribution function U. Put $ = ~. As already shown
X
QO'{~;
f ym e-Yt U{dy} 0
z ~ x} =
oo
f ym e-Yt U{dy} 0 where m = v{[0,t]}. For L(x,y) : (x - y)2 the best estimate is
co
~ ym+1 e-Yt U{dy} 0 co
f ym e-Yt U{dy} 0 This result, as pointed out in section 2.1, was given by Lundberg (1940, p 71) but motivated from a different point of view.
Assume now that N is a P$1ya process, i.e. B U'(x)
xB-1
e
-ax
r(B)
if
x>O
0
if
x<0
x
m+~-1 (~ + t)m+B ~
0
r(m + B)
=
where ~,~ > 0.
Then
Q0'
{~; z < x} = f -
and thus for L(x,y) = (x
~+t
_
y
)2
e
-(~+t)y
the best estimate is
dy a.s. (q)
96
Instead of this criterion we may consider L(x,y) = Ix - YI" Then the best estimate In t ~ s
case ~
~
is a median of the conditional
can not be analytically
given.
It may be regarded as natural to choose ~ tional distribution.
distribution.
as the mode of the condi-
Since the density of that distribution is propor-
tional to
ym+B-1
e-(a+t)y
for y > 0 we get
~
= max(O,
B - I + m)
~§
'
This last estimate is not a best estimate in the sense of definition 3.
To compare these estimates we have for a = B = I and t = 5 computed ~x for m = 0,1,2,...,10.
In figure 2 these estimate are drawn. Though
the estimates only have m e a n i n g for m = 0,1,... m varies continuously.
they are drawn as if
97
L(x,y) mode
= (x - y)2
L(x,y)
Ix
yl
1.5
0.5
0
i
|
I
1
2
3
m
9
,
9
9
9
~
5
6
7
8
9
9
i
m
lO
Figure 2: lllustration of estimates in a P61ya process
Example 3 In this example some results derived by Rudemo, specialized to doubly stochastic Poisson processes, will be surveyed.
Consider the case indicated in section 2.3.1 where (~(t) ; t ~ 0} is a Markov chain with stationary transition probabilities and with distribution H on (Io,B(Io)). Here I
is the set of rightcontinuous O
piecewise constant functions R+ § R+ = ~,~) with only a finite number of jumps in every finite time interval and with range (~k ; k = 1,2,...K) where K may be finite or infinite. Put Hki(t) = Pr(~(s + t) = ~iI~(s) = ~k ) qki
ki (0)
(right-hand derivative)
wk(t) = Pr(~(t) = ~k ).
98
Consider X ~ = D , t ]
9
Let 0' be the c-algebra of observations t
and put
0 !
w~(slt) = Pr t { k ( s )
= kk }
and K
~(slt
) =
where thus ~ ( s l t )
Z ~k Wk( s l t ) k=1 is the best estimate of X(s) in terms of N on
~0,t] according to L(x,y) = (x - y)2. To simplify wk(t) = w k ( t l t ) a n d
~x(t)=
notations put
~x(tlt ).
This example turns out to be a special case of a partially observed Markov chain, since the vector process
(N(t),~(t)) is a Markov chain.
Consider first the case K < ~ , treated by Rudemo (1972). Rudemo (1972, p 323) shows that in intervals between events
K
Wk' (t) =
Z i=I
w~(t) qik + (k~(t)
- kk ) w~(t) a.s.
while if an event occurs at t
wk(t) = ~ k ( t -
O) ~(t
~k - O)
a.s.
Consider now the general case where K is not assumed to be finite.
Following Rudemo (1973:1) and (1973:2) we define for t > 0
Hki(t) = Pr{k(s + t) = k.m, N(s + t) - N(s) : OI~(s) = kk}.
These probabilities may be obtained from
Hki(t)
= ~. Hkj(t) qJi - ~i Hki (t) J
99
and I ~
i~
~ = i
Hki(O) = 6ki = I 0
if
k # i
Let H(t) be the matrix with elements Hki(t) and D the diagonal matrix with elements 6kik i and let w(t) and w~(t) be the row vectors with components wk(t) and w~(t) respectively.
Then
wa(t) = ~(0) H(t0) D H(t I - t0)D...D H(t - tv(t)_l ~ S
a.s. (Q) where v(t) = v { ~ , t ] } ,
tk = v
-I
(k) (cf section 1.3.4) and
S is the normalizing operator on row vectors, defined by
Pk
(PS)k = Z pj
for p satisfying p~ ~ 0 and 0 < Z. P~o < ~" From this it
9
j
J can be shown, see Rudemo (1973:2, 271), that Ik a.s.
wk(t) : wk(t - 0) ~(t
- 0)
at events, like in the case K < ~, while t
<(t) :
<(u)qik + (~'(u) - ~) ~(~du
f s
i
a.s. for an interval (s,t] without events. Since In(u) appears in the equation it is not linear. Rudemo (1973:1, p 597) shows that ~(t)
= (~(t)S)k in (s,t] without events where w(t) is a row vector
with components given by t
~(t)
= ~k(s) + f [z ~i(u)qik - ~k ~k (u)] du s
i
a.s. which is a linear equation. These systems of equations are sometimes more suited for solution with a computer than the vector-matrix product representation.
100
In section 5.3 we will consider the calculation of w~(t) for a special case with K = 2.
Consider now, following Rudemo
(1975), the calculation of ~k(slt).
If
s > t and if H(t) is the matrix with elements Hki(t) we have w~(slt) = ~ ( t )
H(s - t) where w*(slt) is the row vector with compo-
nents ~k(slt).
If s < t and if P(t,s) is a matrix with elements 0 r
Pr t{l(s) : li' l(t) = Ik } Pk,i(t, s ~(t)
provided Wk(t) > 0 we have w (sit) = ~x(t) P(t,s) a.s.
As mentioned above the vector-matrix product representations be suited for computer calculations.
Rudemo
may not
(1975) shows that for
s > t s
=
+ I z t
(urt)
qik du
i
and for all s < t (i.e. also at events)
t
9{
~(slt) = ~(t) + f z (~(ult)qki ~k (u) s i
~. (u) l
x
- ~k(~It)
qik w~(u)
) du .
The above formulae seem suitable when t is fixed and s is varying. Rudemo
(1975) also gives recursive equations
for fixed s and varying
t and for both t and s varying but t-s constant.
101
With minor changes all the results given in this example hold true also when the intensity is a function of a Markov chain with stationary transition probabilities.
Problems
of this kind have also been studied by Snyder
(1972:1) and
(1972:2) when the intensity is a function of a vector M a r k o v process.
We will indicate how some of Rudemo's results may be proved. Assume that K < ~ since then all regularity assumptions are fulfilled. Define the random variables Z(t) and ~k(t) by
t Z(t) =
v(t)-1 H l(tk)) e k=O
S ~(u)d~ 0
and I
if
l(t) = kk
0
if
k(t) # kk
~k(t) =
From our general results it follows that
=~ (slt) = E ~kCS) Z(t) E zCt) and
=
E Z(t)
Consider s = t.
Assume that an event occurs at t, i.e. that ~(t) - ~(t - O) = I.
Then
~[t)
= E ~k(t) Z(t)
E ~k(t) k(t) Z(t - O)
E Z(t)
E k(t) Z(t - O)
k
~k E ~k(t) Z(t - O)
~k ~k(t - O)
=
E l(t).Z(t - O)
~(t
- O)
since k(t) = k(t - O) a . s . .
Assume that no event occurs at t.
Put wk(t) = E Ck(t) Z(t) and thus ~ ( t )
= (~(t)S) k. Note that
~k(t) = wk(t) E{Z(t)IX(t) = kk }.
For A > 0 such that . . . . . . t . . . . ur in the interval ~,t§
we have
102
t+A f x(u)du
t
- ~k(t)) Z(t)} =
- Ck(t))IA(t)
: ki}E(Z(~)Ik(t)
- ~k(t))IX(t)
= A i} ~i (t) .
wk(t+A) - wk(t ) = E{(~k(t+A)
e
t+A
S x(u)a~
-
= Z E{(~k(t+A) i
t
e
: Xi)~i(t)
t+A
S ~(u)au
-
= Z E{(~k(t+A) i
t
e
If i # k we have t+A
S
-
E {(s
+ ~) e
X(uldu
t
_ ~k(t))Ik(t ) = ki } =
t+A -
S
= E {e
x(u)du
t
IA(t + A) = kk' l(t) = k i} ~ik(A) =
= (I + O(A))(qikA
+ o(A)) = qik A + o(A)
and if i = k we have t+A -
E {(~k(t + A) e
S x(u)au $
- ~k(t)lil(t ) = ik } :
t+A = E
-
{(e
S xCu)du t
-
1)[x(t + a) = ~k' ~(~) = Xk } nkk(~) -
(I - nkk(n)) = - xka(1 + qkk A + o(A)) + qkk~ + o(~) =
= qkk A - ikA + o(A) 9
Thus we have
~k(t + A) - ~k(t) = A Z ~i(t ) - AlkWk(t ) + o(A) i qik and since a similar reasoning
goes through for A < 0 we have
~'wk(t) = iZ #i(t)qik - Ik#k(t)
.
:
103
Consider s > t. For A > 0 we have
E (~k(S + A) - ~k(s))Z(t) E Z(t)
= Z E {(~k(S + A) - ~k(S))lk(s) = ki} ,~(slt ) = i
iWk
= i~k (qikA + o(A)) ~ ( s l t )
= ~ i~ ~ ( s l t )
+ (qkk A + o(A)) ~ ( s l t )
=
qik + o(A) .
Since a similar reasoning goes through for A < 0 we have
@'~(slt) i
Consider s < t. For A > 0 such that no events OCCUr in the interval (s, s + A) we have, whether ~n event occurs at s or not,
"a(Sk + Aft) - ~ ( s l t ) k '
= E {(~k(S + A) - ~k(S)) Z(t)} E Z(t)
E {Z(s)(~k(S + A) - ~k(S))
Z(s + ~) Z(s)
z(t) } Z(s + A) =
E z(t)
= .E. (~i(s)~ij(A) E (Z(s)IA(s) = k i) "
E Z(t)
Since I + O(A) if i#k,j=k
E{(~(s+~)
- ~k(s)) ~IX(s)
= X i, ~(s*~) = Xj} =
-I + O(A) if i=k,j#k 0
otherwise
~o4
Ni~(4 ) = qij4 + o(4)
if
i # j
and
~i(s)E{Z<s)11(s) = ~i } S { ~
= ~j} =
=~(s) E{Z(s)} E{Z(t)11(s+a) = I.} E{Z(t)} m{z(s+a)lx(s+4) = t.} J
~](s) ~{z(s)} ,](~+~It) ~ <(~) ~](slt) + o(4)
E{z(s+~)},](s+~)
~](s)
we have
<(~
- Z (qki A + o(A))
(1 + 0 ( 4 ) )
( [k(s)~
i
<(~)k
+
(sl~)" + 0 ( 4 ) )
:
o(~))(
~
o(~)) -
~(s)
(<(sit)
= 4 z
qik ~ ( s )
~(slt ) qki..N~(s))+ o(~).
T,k( s )
,~(s)
i
For 4 < 0 a similar reasoning goes through if no event occurs at a. Therefore we consider A < 0 if an event occurs at s. Put 4' = - A. For A' such that no events occur in the interval (s-4',s) we have
~ ( s l t) - ~ ( s - & ' b )
= .Z. q ( s - & ' ) H i ~ ( A ' ) l,J Z(S)
rZ(t) ~ s) =
9 E 'z-%7
E{Z(s-A');~(s-A')
A(S-A') = ~i' l(s) = I-} '
.
~j }
Z Z(t)
In this case we have
z(s) ~J1(s-A
E {(~k(S) -
=
15 ( I + 0(A'))
if
i # k, j = k
lj (-I + 0(A'))
if
i = k, j # k
0
otherwise
) = li,1(s) = 13} =
= Xi]
105
and ~i( s-~ ~ ) z { z ( s - ~ ' ) l x ( s - ^ ' ) E
= xi}
E
{~l~(s)
= xj}
Z(t)
E Z(t)
E {Z(s)ll(s) = i.} J
9
*
E Z(s)
O(A')
=
~(s)
+ o(~' ).
Thus we h~ve
Thus Wk(slt) is continuous
in s for all s. If no event occurs at s we have
Z (Wk(sJ t)
q i k
~ ( s It)
qki
Consider finally the problem of calculating G(t
= Pr{N(t) = 0}.
It is shown by Rudemo
(1972, p 325) for K < ~ and by Rudemo (1973:1) t in general that G(t) = exp ~ / ~ (u) d ~ where ~ ( u ) is calculated 0 as if no events occur in -.~(0,u]" i
This relation between estimation
and calculation of G(t) holds true
for a large class of doubly stochastic Poisson processes.
To see this, we consider a nonnegative (X(t)
stochastic process
; t ~ 0} possible to use as a model for the intensity
section 1.3.3) and such that
(cf
~o6
U
f
-
d
for almost
Assume,
EE(e
all u
~(v)dv
0
= -
E(I(u)
e
~(v)dv
0
> 0.
for example,
that
Pr{Jl(t
lim
f
-
)]
+ 4)
- t(t)
I L s} = 0
4§
for all E > 0 and almost
E(12(t))
for almost
< C <
all t > O. We will
sufficient. bility
all t > 0 and that
show that these
From the discussion
in the sense of Doob,
to use as a model
in section
it follows
u+4
1
~
1.3.3 about
that
{l(t)}
are
integra-
is possible
for the intensity.
Further
lim k+O
assumptlons
(e
U
- f ~(v)dv 0
- f l(v)~v 0 --
e
}
U
- f ~(v)dv e 0
= - t(u)
for all realizations
Since surely
for almost continuous
of derivation however,
are continuous
all u > 0 our assumptions in t = u it only remains
and integration
by u n i f o r m
p 32)) since
of l(t) which
imply that l(t)
(cf e.g.
is almost
to verify that the order
may be interchanged.
integrability
in t = u.
This
Billingsley
follows, (1968,
107
u+A
f
-
sup ~I~ (e A>-u
u
l(v)dv
o
-
-
e
f
)1
u+A
<_ sup
2
l(v)dv
o
•
u+A u+A
~1 ~( /u l(v)dv)2 : sup ~2
A>-u
A>-u
f
f
u
u
~,X(v)~(w)dvdw <_c.
0 T
For almost all u h 0 the estimate k~(u) = E u(k(u)) is given by U
-S X(v)dv
u
- f X(v)dv
0 k~(U) = El(U) e u
= _ d__ log E (e du
0
)
S X(~)dv E e
0
if no events occur in (0,u].
Thus we have t
- f X(u)du
t exp [- f X~(u)dul = exp[iog E (e 0
0
)] =
t 0 and this without any Markov assumption. A similar relation has been shown by Rubin (1972, p 549) to hold also for point processes which are not doubly stochastic Poisson processes.
M Example 4 In this example we will consider a special case of a model for weak optical signals. The model has been studied by Macchi (1971) and Macchi and Picinbono (1972, pp 566 - 567).
Let {X(t) ; t ~ [0,T]) and {Y(t) ; t ~ [0,T]) be two independent and identically distributed normal processes. Assume that E X(t) = 0
lO8
and that C(s,t)
=Cov
that the sample
functions
and Leadbetter
(X(s)
=
Assume
(cf Cram6r
(1967, p 183)).
IZ(t)l 2
and define
X2(t) + y2(t)
=
events of an observation [0,T] corresponding
.
{l(t)
; t @ [0,T]} by
Let x I . .. ,xm be the epochs of ,
of a doubly stochastic
to l(t). Since
with mean value 2C(t,t)
Poisson process
l(t) is exponentially
it follows
I that the condition
of uniform integrability
distributed
from
holds,
and thus
T - f ~(s)ds
m
@
l(Xj))
E(l(t)(
~(t)
on
that
max E lk(t) = ( max 2C(t,t)) k k! < ~. Thus it follows O
further
of X(t) and Y(t) are continuous
Put Z(t) = X(t) + i Y(t) ~(t)
, X(t)) is continuous.
e
)
j=1
=
T - f ~(s)ds
m
z(( n a(xj ))
e
0
)
j=l is the best estimate to L(x,y)
of l(t) in terms of the observation
according
: (x - y)2.
Let t I ..... t n be in [0,T]. an expression
Following
Maechi
(1971) we will derive
for
n
-
T / ~(s)ds
j=1 with help of the Karhunen-Lo~ve
expansion
(cf Parzen
(1959, pp
278-283)).
Let {@k(t)
; k = 1,2 .... , t E D , T ] )
of the covariance eigenvalues.
be the normalized
eigenfunctions
C and let {~k ; k = 1,2,...} be the corresponding
This means that
I09
T / C(t,s) @k(S)ds = ~k@k(t) 0 and T / #k(t)#j(t) d t = 0
for all k,j = 1,2, ....
I
if
k=j
0
if
k~j
Then we have
C(t,s) = Z ~kCk(t)@k(S) k and
X(t) = Z @k(t) ~ k k Y(t) = Z @k(t)
k
where XI,X2,...
Xk
~'k'Yk
, YI,Y2,...
are independent normally distributed
random variables with mean values zero and variances one. Thus
Z(t) = Z @k(t) 2 ~ k k
Zk
where
zk=
Xk+iY
k
We will make use of the fact that
zk
=
where Z k and @k
Iz l e
i@ k
are independent random variables.
tially distributed with mean value one and
IZk 12 is exponen-
@k is uniformly distri-
buted on [0,2~.
Now we will introduce some notations. A vector (al,...,a n) is denoted by -~a. The set of all permutations of al,...,a n is denoted by
110
P(a ). Let i --I]
--~
be a vector of positive integers and denote by
the number of components in i
~n
equal to k. Let P
#(i) k
denote the set of
n
permutations of 1,2,...,n.
We have T -
E (( H l(tk)) k=1
T
X(s)ds )
f
0
e
n ~ ''IZ(tk)l 2)
= E ((
e
k=1
n
2 ~.~_~.i j @i(tk)@j(tk)Zi~j)
"
k=1 i ,j T 9 exm{-f
(Z 2 ~ '~~ "O ~i(s)@j(s)ZiZj)ds}] i,j
0
=
n
( H
2/Pi ~. '%. (tk) ~. (t_)Z i ~,. ) 9 k Jk ik Jk K k Jk
k=1
~'~n
-2 s pj Jzj 12 9 e
J
~
=
n
2/~.
~. '~. (t_)@.
ik Jk lk
K
Jk
(tk))
"
-2~ ~jrzjl2
n
" E (( H Z. Z. ) e k=1 ik Ok
J
).
We have
n E (( H
-2 Z ~jlZjJ2 Z.
k=1
ik
Z. ) e
8
)=
Jk 2
= ~ m (z. j
J
~
e-2njlzj + #(A)j)
J 9 e-2~JlzJ121
)=
J
eie'J (~(~)j
f
0
IZ(s)12ds )
111
which is equal to
co
-(l+2uj)x
1
f
x #Ln)J
e
d.x=H
jo
(I + 2#j) #(i-'n)3+1
J
#(&).~ "~ )(kn i (I + ) H= "1 + 2 p ')' 2~j I ik
(n j
if #(i_n)j =#(j~)j
for all j, i.e. if & 6 P(i_n), and otherwise
equal to zero. Thus T
~(s)ds
- f
n
E(( ]I l(tk) ) e k=1
0
)=
n
=
Z
Z
2~ik k=1 I + 2~ik r162
#(i__n)v Hv I + 2Pv
-n
n 2Ui H (I + 2Pv)-1 Z Z H I + __~Uik@ik(tk)r v -hi m---n6 Pn k=1
n
2~
Hv (I + 2~v)-I ~n~PnZ (k=IH iZ _ ~+1 12W i @i(tk)r
It is seen from the derivation that T - f ?~(s)ds
E(e
0
) = II (1 + 21237)-1
mk ) =
))"
112
Define the function f : EO,T'~ 2 § R by
2~ i f(t,s) = g i
I + 2~ i
el(t)
r (s)
9
In order to calculate f, its given form will mostly be of little help. We may, however, observe that
T f f(t,y) C(y,s)dy = 0 T
2~ i
=/(zz1+s . 0 a z
@i (t) @i (y)~j %j (y)r (s) )dy = z
~i + 2 ~ -
2~ = Y r162 i I + 2~ i
= Z i
I +
2~i
~i r162
(s) =
I
= C(t,s) - ~ f(t,s)
and thus f satisfies T f(t,s) + 2 f f(t,y)C(y,s)dy : 2 C(t,s) for t , s 6 EO,T]. 0
This equation is of the same kind as certain equations which will be discussed in the next section in connection with linear estimation. It follows from theorem 4 that f, at least among functions which are square integrable in each variable, is the unique solution of the equation. This also follows from the theory of Fredholm integral equations.
Let us, for notational reasons, define the matrix
F(t_.n) = {f(ti,tj)}
, i,j = 1,2 ..... n ,
113
and its permanent (cf e.g. Marcus and Minc (1965))
n
Per F(t ) : Z -n ~n~Pn
H f(t.,t. ) i=I z Ji
Put Per F(t O) = I.
Thus we have, and this is the result of Macchi (1971), T - f k(s)ds
n
0
E(( H k(t k)) e k=1
) =
H v
(I + 2 ~
)-i (Per
F(t_n))
and thus the expression for ~%(t) given by Macchi and Picinbono (1972, p 566) follows.
Let us, however, summarize. Let Xl,...,x m be the epochs of events of the observed doubly stochastic Poisson process on [O,T~. Then
Per F(t,_~n) k~(t) =
Per F(x ) --m
for t E ~ , ~
where f(t,s) is the solution of
T f(t,s) + 2 f f(t,y)C(y,s)dy = 2 C(t,s) , t , s 6 [O,T]. 0
If C(t,s) ~ C for s,t~ [O,T] it follows that k(t) ~ k for almost all sample function, where k is an exponentially distributed random variable with mean value 2 C. Thus the corresponding doubly stochastic Poisson process is a P61ya process and it follows from example 2 that k~ft~J = 2C(I + m) I +2CT This result may, of course, also be derived from the result given in this example. We have f -=
2C 1
+
2CT
and Per F ( t )
= n:f n"
and thus
114
t~(t) =
(m + 1 ) ' f m+l " mlfm
(1
=
+
m)f
=
2C(1 + m) '
1 + 2CT
Let us now specialize to the case C(s,t)
=
'
"
o
2 e -~ I s - t l
This means that X(t) and Y(t) are Ornstein-Uhlenbeck will consider the calculation of the distribution
~
&,C
2
>
O.
processes.
We
of the w a i t i n g
time to the first event. Put, see section 2.3, T O
G(T o).. = E e
- f ~(s)ds 0
and, see section 2.4 example
1.4, T O
- f ~(s)ds 0 GO(To ) = E(I(0) e
)
E X(O)
Let l~(t) be the best estimate of l(t) when no events occur in ,T]. From example 9 in section 5.2 it follows that
-2BT I +B e I - B 2 e -2~T
where
8 =
2
+ 4 2
and B =
9 From the final remarks in
example 3 it follows that
T
0
f G(T O].. = e
0
~(~)dT
I - B2
= e-(8-a)To
I - B 2 e-2BT~
= e
~T~
(c~
~ + 2~ +--7-------
This result has been derived by Siegert different 9
sinh(BT
o
))-I
(1957). His methods
are
115
Consider now G~
) . We have E I(0) = 2~ 2 and thus we get
G(T o ) G~
) = ~2o2
~
G(T o )
ITo(O) = ~ 2 o2
I~o(To) =
-2BT - a e-(B-a)To (I - B 2) 202
1+Be
o
(I - B 2 e-2BT~ 2
We end up this example by some comments on the case where ~ (t) is the sum of n squares of independent and identically Ornstein-Uhlenbeck processes. This case has been studied by Barndorff-Nielsen
and Yeo
(1969). Let Gn(To) and G~(To) be the quantities corresponding to G(T o) and G~
for this process. We note that G = G 2 and G ~ = G2.o
It is not difficult to realize that
Gn(T o) = (GI(To)) n = (G(To))n/2 and G~(T O) = G~(To)(GI(To))n-I=
G~
(n-2)/2
m 5.2
Linear estimation
Let N,A and ~ be defined on the same probability space as in section 5.1 and assume as in section 1.6 that E A2(B} < ~ for all bounded BEB(X)
and that E ~2 < ~ . Recall that M{B) = E A{B}and that
R{BI,B 2) = Cov(A(BI},A{B2~)
for bounded B, BI, B 2 E B ( X ) .
later purpose p by p{B} = Cov(~,N{B})
Define for
for bounded B ~ B ( X ) .
Let H be the Hilbert space L2(~,B(~),Q) , i.e. the set of B(~)measurable functions n : ~ § R such that f n 2 dQ < ~, with inner
776
product f ~i~2 dQ for n 1 ~ 2 ~ H .
In section A2 some facts about Hil-
bert spaces are summarized. H is our basic space in which the Hilbert spaces considered in this section are subspaces.
Let L(X o) for
X ~ B(X) be the Hilbert space spanned by N(B) for all bounded o B ~ B ( X o) and the constant one.
Definition 4 An element N
on
~ o
X
~ in L(X~ ) is called a linear estimate of ~ in terms of
.
Definition 5 A linear estimate estimate
~
of ~ in terms of N on X
o
is called the best linear
if E(~ ~ - ~)2 ~ E(n - 6) 2 for all ~ L ( X o ) .
As shown in theorem A 2.2 it is no restriction to assume E~ = 0 and to let a linear estimate be an element in L(X o) = S((N(B) bounded B ~ B(Xo))).
- M(B)
;
From the projection theorem it then follows that
the best linear estimate ~
of ~ is the unique solution of
E(~ ~ - ~)(N(B) - M(B]) = 0 for all bounded B ~ B ( X o ) .
This solution
will sometimes be denoted by E(~iL(Xo)).
In order to calculate the best linear estimate theorems 2 and 3, which we believe have independent
interest,
are helpful.
Theorem 2 For bounded X o ~ B(X) every ~ L ( X
f(x) X
(N(ax)
o
) has the representation
- M(~x~)
o
for some a.e.(M) uniquely determined B(Xo)-measurable f : X
o
§
9
function
117
Proof For every q E L(X o) there exists ql,q2,...,
nn = ~
such that for any n
fn (x) No{dX} o
where fn is a simple function and No{B} = N{B} - M{B} for all B e B(Xo) , with the property lim E(q n - q)2 = 0. Since lim E(q n - q)2 = 0 n-~ n§ implies f X
lim E(q n - qm )2 = 0 and since E(q n - qm )2 n,m+~
(fn(X) - fm(X)) 2 M{dx}
o measurable
(cf lemma 1.3b) there exists a B(X ~
_
function f such that
lim
f
n~
X
(fn(X) - f(x)) 2 M{dx} = 0 . o
The function f is determined and finite a.e.
(M), i.e. on X
o
- E
where M{E} = 0. Since M{E} = 0 implies P{N {E} = 0} = I and since o X
o
b o u n d e d implies M{X } < ~ o
and v{X } < ~ for ~ii ~ @ N the random o
vari able
= S X
f(x) ~ o {~x} o
is determined a.s.
The uniqueness
(Q).
follows from the above, and if we can show
E(q - ~)2 = 0, then the theorem is proved.
There exists a subsequence xEX
o
lira f k§ X ~
{fnk} such that
- E where M{E} = 0. Thus for all v 6 N fnk(x)v{dx}
= f X
f(x)v{dx}
o o N over a b o u n d e d set X
M{X o} < co it follows that
o
k-~limfnk(x) = f(x) for all with v{E} = 0 we have
since an integral with respect to a
is reduced to a finite sum. Since lim 5 k-~ X
f o
nk
(x)M{dx} = f X
f(x)M{dx} o
118
irrespective
of the chosen subsequence.
implies Pr{N{E}
Thus,
since M{E} = 0
= 0) = I, it follows that nnk § ~ a.s.
lim E(q _ q)2 = 0 it follows that 6 = q a.s. k-~ nk
~(~
-
n) 2
=
o
(Q). Since
(Q) and thus
.
Remark 2 In the proof of theorem 2 the condition that X
is b o u n d e d is only O
used to ensure that M{X ) < * and that v{X } < ~ for all v 6 N . O
only v{X } < ~ a.s.
Since
O
(Q) is required,
it is sufficient to assume that
O
M{X } < ~. This is a slight generalization
since X
O
bounded implies O
M{X } < ~ but there exist cases where M{X ) < ~ even if X O
O
is unO
bounded.
9
Now we drop the assumption of b o u n d e d X . When X O
is b o u n d e d it is no O
p r o b l e m to interpret a representation
q = f X
f(x)
Since
an increasing
(N{dx} - M{dx}).
O
pact
X is
a-compact
it
always
sets {Kn} I such that
creasing
sequence
exists
sequence
m] Kn = X and thus { K n N X o } I n=1
of bounded
sets
such that
0 n=]
of com-
is an in-
(Kn~"IX o) = Xo.
Definition 6 An element q ~ L ( X o ) q = f
f(x) (N{dx} - M{dx})
x
bounde~
is said to have the representation if for every i n c r e a s i n g sequence of
~
~
sets {Xn) I such that X n E B ( X o )
and
X n = X ~ we have n=1
Z i m E{q - f f(x) n~ X n
( N { d x ) - M { ~ } ) } 2 = O.
Theorem 3 Let X o ~ B(X) be arbitrary.
If for any b o u n d e d B(Xo)-measurable
tion g : X ~ § R with compact support
I X xX 0
g(x)g(y)R{dx,dy} 0
func-
_< c f g2 dM X 0
119
for some c < = then every n 6 L(X o) has the representation
n = ~ X for some a.e. f : X
o
f(x)(N
(M) uniquely determined S(Xo)-measurable
function
§
Proof Consider like in the proof of theorem 2 for every ~ L ( X sequence ~i,~2,...
such that n n = ~
o) a
fn(X)No {dx} where fn is a
simple function with compact suppor~ and lim E(~
n
- q~
= 0 and
a function f such that lim f (fn - f)2 dM = 0. n -~= X o Let { ~ } 1
be any increasing sequence of bounded sets with
U xk k=1
=
X
o
and define
~(k) : f f(X)No{~X} = S f(k)(x)No {~} X X~ and
(k)= ~
fn(X)No{dX}
: f f~k)(X)No{dX}. X
o
The variables nn(k) and n (k) are a.s. , (k) - ~m(k))2 _< (I + c) ~ E~nn thus n I(k) ,n 2(k) , .
(fn
(Q) determined by n. We have
fm )2 dM since X k is bounded and
is . a .Cauchy . sequence
From the proof of theorem 2
it then follows that lira E(nn(k) - n (k))2 = O. n-~o0 We have lim E(n (k) - n) 2 k§
< (I + c) lim lira f k-~ n+~ X
= (I +c) lim k--
f
o
= lim lim E(~ (k) _ n )2 < k-~ n-~
n
--
(f(nk) - fn )2 dM = (I + c) lira f k-~ X
f2
=O.
(f(k) _ f)2 dM = o
9
120
Remark 3 The condition in theorem 3 is sufficient but not necessary. that we will give an example where X
To see
is bounded but where the conO
dition in theorem 3 is not fulfilled.
Consider X = R and X
= (0,13 O
and A defined by A{B} = S X(x)dx B and {Ik)k=1
where
is a sequence of independent
l(x) = ~k
if
x6 ( I ~I k+1'
random variables with
2 E ~k = I and Var lk = Ok" Put
if
gn (x> =11 0
xs
(0,11nJ
elsewhere
If the condition in theorem 3 were fulfilled, then we must have Var A{(0,1/n]} < c/n for all n and some c < ~. 2 ok Since Vat A((0,1/nj}
example take
=
Z k=n
02 = k 2"5
it is seen that if we for ki(k+1)2
then Var A((0,1/n]}
I
Thus
2~ Var A((0,1/n]}
< ~ for all n but the condition in theorem 3 is not
fulfilled.
Consider now X = X section
O
= Z and let s be a stationary measure
(cf
1.6). Then the condition in theorem 3 is that for any
finite Z C Z and any sequence of real numbers O
{gk : k ~ Z
O
)
there exists some c < ~ such that 2 Z gkgj rk_ j ~ cm Z gk" Define y(x) by y(x) = E k,j~Z ~ k~Z k~Z O
An equivalent cm with
a.e.
ikx
gk"
O
form of the condition is then
flY(x)l 2 dx
e
fIy(x)I 2 FZ{dx}
which is fulfilled if F ~ is absolutely continuous
bounded density.
From t h e o r e m
bounded density is necessary.
A 3.3
it
follows
that
a.e.
For stationary doubly stochastic
Poisson sequences the condition in theorem 3 is thus the 'correct' condition.
121
It is seen from the p r o o f of the t h e o r e m that if it exists a Borel measure W on (Xo,B(Xo)) is non-negative
such that C{BI,B 2} = R{BI,B 2} - W { B I ~ B 2}
definite then the theorem holds under the weaker con-
dition that for some c < ~ and any b o u n d e d function g with compact
support
.r X • 0
g(x)g(y)C~ax,ay}
<_ e
f
g2 d(M + W). A simple example
X
0
0
is when A is completely random,
see definition
I .3, since then we
put W{B} = R{B,B}.
m
Definition 7 Let M be the mean measure
and R the covariance measure of a r a n d o m
measure. We say that R is absolutely
f
dominated by M on X ~
if
IR{ax,~}l A c M(B}.
BxX
0
for some c < ~ and every B ~ B ( X o ) .
If X ~ is b o u n d e d and if R{Xo,B} ~ c' M{B} for some c' < ~ and any B E B(X o) then R is absolutely dominated by M on X ~ since
f
]R~,~}]
_<
BxX
E A~dx}A<~} +M~X~ }M(B} =
f BxX
O
O
= E A{Xo}A{B} + M{X 0 }M{B} = R{Xo,B} + 2M{Xo}M{B}
<
< (c' + 2M{x }) M{B}. --
0
For arbitrary,
i.e. not n e c e s s a r i l y bounded,
S( X ~ )-measurable
Xo~S(X)
and any
function g : X ~ § R it follows by Schwarz's
equality that if R is absolutely dominated by M on X o then
f X • 0
Ig(x)g(y)R<~,~}l
0
< c f
g2(x) M{~x} O
g2(x)
f X xX
0
X
<__
0
J~{~x,~v}I <
in-
122
and thus the condition in theorem 3 is fulfilled.
The following lemma is a variation on lemma 1.3b.
Lemma I If R is absolutely dominated by M on X
it holds for every O
6,q~ L(X o) with representations
g(x) (N{ax}
max])
-
X 0
f(x) (N{ax] - M{ax}) X 0
that
~
f
f
g(x)f(x)M{ax} +
g(x)f(y)~{ax,~}.
X •
X O
O
O
Proof Choose
~n = ~
gn (x)NO{dx} and nn = ! fn(X)No{dX), where gn 0
and f
f
n
0
are simple functions with compact support, such that
(gn- g)2 ~ §
f
and
X
(fn- f)2 a M §
0
and
X
0
0
lim E(
gn(X)fn(x)M{dx} +
0
+ lim f n+ ~ X • O
gn(X)fn(X)R(dx,dy). By usual Hilbert space theory O
gf dM. Since R is absolutely dominated by M on n+~ X
X O
X
O ~
O
it follows that
f X xX
I e(f X
O
g2(x)M{dx})2 (f X 0
Ig(x)f(y)R(dx,dy} I < O
I
f2(x)M{dx))2 < ~ 0
and thus the integral
123
f
g(x)f(y)R{dx,dy} is well-defined.
X xX 0
0
Further
I f X xX O
X xX O
0
I
<
X • 0
<
g(x)f(y)R{dx,ay}[
gn(X)fn(Y)R{dx,dy} O
Ign(X)fn(y ) - g(x)f(y)llR{dx,dy}l 0
I
X xX 0
Ign(X)fn(y ) - gn(X)f(y)l + 0
+ Ign(X)f(y) - g(x)f(y)llR{dx,dy}l I c (f X
I
g~(x)M{dx}) ~ (~ X O
(fn(X) - f(x)) 2 M{dx}) ~ + O
]
I
+ c (f
f2(x) M{~x}) 2 (f
X
X O
(gn(X) - g(x)) 2 M{dx}) 2 O
which tends to 0 as n § ~, and thus the lemma is proved.
Consider now a random variable ~ with E~ = 0 and E$ 2 < ~ and recall that p{B} = Cov(~,N{B)) for bounded B ~ B ( X ) .
Theorem 4 If R is absolutely dominated by M on X
the best linear estimate ~ O
of $ in terms of N on X
is given by O
f
f(x) (N{ax} - M{dx})
X
O
where f(x) is the unique a.e. (M) square integrable (with respect to f(y) R{B,dy} = p{B} for all
M over X o) solution of f f(x) M{dx} + f B X O
]24
Further E(6 ~ - 6) 2 = E[ 2 - / X o
bounded B s
f(x)p[ax).
Proof From t h e o r e m A 2.1, i.e. the projection theorem,
it follows that 6~
is
all
the
unique
solution
o f E(6 ~ - 6) N(B} = 0 f o r
Since R is absolutely dominated by M on X representation
6 = / X
bounded BEB(Xo).
it follows that 6 ~ has the
o
f(x)N {dx) for some function f with f o X
o
f2 dM < o
From lemma I it follows that
E(6 ~ - 6) N~B)
=
/ f(x)M~d~ B
+
/ B•
o
and thus f is a solution of the equation in the theorem.
Let g be an other solution of the equation with / X h
f
g. Then / h(x)M(dx) + / B X
B E B(Xo).
Since also i X
exists a sequence
h(y)R(B,dy)
g2 dM < ~ and put o
= 0 for all bounded
o
h dim < ~ and since X is ~-compact there o
{h } of simple functions with compact support such n
that / (hn - h) 2 dM + 0. Since S hnh dM + f h (x)h(y)R(dx,dy} n X X X • 0 0 0 0
= 0
it follows by the argument used in the p r o o f of lemma I, put gn = g = f = h
and
f
n
= h , that also / h2dM + / h(x)h(y)R(dx,dy) n X X xX o o o
= 0. Since R is non-negative
definite it follows that h = 0 a.e.
(M).
Further it follows from the p r o j e c t i o n t h e o r e m that E(6 ~ - 6)2- = = E~ 2
-
E(6~) 2. Define 6n ~ = ~
fn(X)No{d.x} where f n is a simple
func-
o tion with c o m p a c t s u p p o r t / X
o
such that
E ( 6 ~ - 6~) 2 + 0 a n d t h u s
(fn - f)2 dM + 0. Then E(6~) 2 = lim E6n~ n-~
Since
= lim S n§ X
fn (x)0(dx)" o
=
125
Ill
S
B
If(y)llR{dx,
l
BxX O
it follows
for any 8(Xo)-measurable
flg( )ll {d )l_<S X
0
function g : X ~ § R that
Igll l +
X
f
X xX
O
0
I
(1
+
Thus lim f n~ X
X
0
fn(X)p[dx) =
f
{dx,dy)l
I
e) (S g2 ~)2 (f X
Ig(x)llf(Y)ll
0
f2 ~)2 . 0
f(x)p{dx}
since
X
o
O
Ifn(X) - f( )llP{ )l
If %(x) - f(x)p{a~'l ~ f X
X
0
0
1
(f
< (I + e)
f2dM) 2
1
(f
X
(fn - f)2 dM)2
X O
O
which tends to 0 as n §
II
Remark 4 The condition that R is absolutely dominated by M on X ~ is used to ensure that integrals with respect to R are w e l l - d e f i n e d transfer convergence
in the function space with
(f,g) = f
f
fg dM +
X0
f(x)g(y)R{dx,dy}
and to
'inner-product'
to essentially ordinary
XO •0
L2(Xo,B(Xo)~M)-convergenee. More p r e c i s e l y the condition that R is absolutely dominated by M on X
f X0 •
0
o
is used to ensure that I I
tf(x)g(y)R{ax,ax}] s e(f f2 ~ ) ~ (f g2 aM)~ . X0 X0
We have a feeling that lemma I and t h e o r e m 4 ought to be true under the sole conditions that.
of theorem 2 and 3 but we have failed to prove
~26
We may m e n t i o n the last through
Remark
that
comment after
in remark
simple
dominated
3, the proofs
by M + W on Xo~
of lemma
I and t h e o r e m
see 4 go
changes.
5
For every E ~ B ( X o ) Therefore thus
if C is absolutely
with M{E} : 0 we have Pr{A{E)
p and R{.,B}
are absolutely
continuous
also to any ~ s M to which M is absolutely
Radon-Nikodym
= O} = Pr{N{E} with respect
continuous.
= O} = I.
to M and
Thus the
derivatives
m(x)
=
~{~}
~(x)
=
P~ {{d~x}}
~{~x}
and
r(x,R}
exist
Thus
uniquely
=
R{dx,B} ~{<x}
a.e.
(U).
an alternative
f(~)m(~) + S X
completely
formulation
f(y)r(x,dy}
= p(x)
for a.e.
in t h e o r e m
(~)xEXo,
4 is
provided
a
o
additive
version
If X e R n and M is absolutely measure
of the equation
it is n a t u r a l
X ~ Z it is n a t u r a l
of r(x,']
continuous
to choose
to choose
is found for a.e.
w i t h respect
~ as Lebesgue
~ as the m e a s u r e
measure
(~)x6X
o
.
to Lebesgue while
w i t h mass
if
one in
each integer.
In Grandell approach
(1972:2)
as here
linear e s t i m a t i o n
was
studied with
for the case X = R and X ~ =
under the assumption
that
a similar
EO ,T~ , 0 < T < ~,
A{B} = S l(x)dx where B
l(x) is a stochastic
127
process with El(x) -- m and Cov(l(x),l(y))
= r(x,y)
such that
T
f r(x,x) dx < ~. These assumptions imply 0 TT
T
T
f 3[ le(x)g(y)r(x,y)Idx
f
0 0
0
0
T
I --
T
Ig(y)~V'r(y,y)dy <
I
T
- -
<_ f r(x,~l~x (f f2(~)~x)2 (f g2(x)~x)2 0 which,
0
0
see remark 4, is the condition required in lemma I and t h e o r e m
4. We may note that if k(x) is stationary, T
then
f r(x,x)ax : T Vat ~(0) < ~. 0
Example 5 Assume that R is absolutely dominated by M on X
o
and let A be a
b o u n d e d set in B(X).
Consider ~ = N{A) - M{A}.
If Af]X ~ = ~ this corresponds to prediction.
We have p{B} = M{A~'~B} + R{A,B} that the best linear estimate
f
f(x)mdx}
+
B
f
X
for all b o u n d e d B [ B(Xo).
and thus it follows
~--f X
from t h e o r e m 4
f(x) N {dx} is determined by o
o
f(y)R{B,ay} = M{Af]B] + R{~,B} o Further
E(<~ _ ~)2 = M{A} + R{A,A} A~X
=M{A\X o} + ~{A,A\X o} - f X
o
X
o
f(x)R{A\Xo,dX}. o
From our point of view it is more i n t e r e s t i n g to consider = A{A} - M{A}. mate 6~ = f X
Then p{B} = R{A,B} and thus the best linear esti-
f(x)No{dX} o
is determined by
128
f f(x)M{~}
+
S
B
X
f(y)R{B,dy} = R{A,B}
o
for all bounded B s B(Xo). Further E([M _ [)2 = R{A,A} - f X
E(~
f(x)R{A,dx} which if A ~ X ~ is reduced to o
_ ~)2 = S f(x)M{dx}. A
If A lies outside X
it is seen that the best linear estimates of
o
N{A} and A{A} coincide.
Example 6 Consider X = Z and let X ~ be a finite set. Put m k = E s ri, j = C o v
Zi,s
and
(see section 1.4 and 1.6) and Pk = Coy ~,N k
where as usual $ is a random variable with E~ = 0 and E$ 2 < ~. Since m k = 0 implies rk, k = 0 and since Z Iri,kl < ~ i~X i
z iex
Z /~ri,i ieX o
it follows that
Iri,kl ~ c o
holds for
max{ril i ; i 6 X o}
Z
r~-~--..
i~X ~
m,l
c = min{m i ; i E X o, m i > 0}
Thus it follows from theorem 4 that the best linear estimate fk(Nk - ink) is determined by k~X
o fkmk + iEX Z
firk'i = p k
for
k~ Xo
o This result is easy to derive by direct calculations Hilbert space theory.
without any
The example shows, however, that even if the
assumption that R is absolutely dominated by M on X ~ is unnecessary
129
(see remark 4) at least it does not have an absurd consequence
in
this case.
9
Example 7 Consider a weighted Poisson process in the general sense as defined in section 2.1, i.e. let M E M
and a nonnegative random variable ~ be
given. Define A by A{B} = XM{B} for all B E B ( X ) .
Put m = E~ and
2 = Vat ~ which implies M{B} = m~{B} and R{BI,B 2} = 2 { B I
for B,B1,B 2 E B ( X ) .
Then
5 IR{dx,dy}l BxX
}~{B 2)
= d2~{Xo}~{B} = (a2/m)~{Xo }M{B}
0
and thus R is absolutely dominated by M on X
o
if and only if
~{X o} < ~. Put ~ = ~ - m which implies that p{B} = d2~{B}. For ~{X o) < ~ it follows from theorem 4 that the best linear estimate ~
is determined by
m f f(x)~{dx} B
+ d2~{B) f X
f(x)~{dx}
= d2~{B)
0
for all B E B(Xo). This equation has the solution 2 f(x)
=
m + ~2~{X o}
and thus the best linear estimate of X is m 2 + a2N{X } ~t~
=
~
+
O
m
m
+ d2~{X
} 0
and E(A~ _ A)2 = E ( ~
_ ~)2 :
2 m~
m + o2u{X o}
oo
Let {Xk)k= I be an increasing sequence of bounded sets in B(X) and let ~
n
be the best linear estimate of ~ in terms of N on X . Then n
130
2
m(7 lim E(X ~ - ~)2 : 2 n n+~ m + ~ lira u{X } 0
n->-~
Thus if lim ~{X n} = ~ and X ~ ~ X k then X n -~ k=1 does not have an i n t e g r a l r e p r e s e n t a t i o n .
Consider
now X = R, X ~ =
E0,t~
and ~ Lebesgue
m@L(Xo)
but
measure.
Then
~
m
8 + o2st { [o,t~ }
~=m
m+~t
2
or with m = ~/a and o 2 = B/a 2
~ + t
which,
as shown
L(x,y)
= (x - y)2 in the case of a P~lya process.
p 99) has the best implies is
this
shown, estimate
that
'better'
managed
in example
2, is the best estimate
see section coincide
for any other
2.1, that the best
distribution
than the best
is the worst
the same mean value
Lundberg
linear
to
(1940,
estimate
if and only if ~ is r-distributed.
linear
to give any plausible
sense
according
This
of ~ the best estimate
estimate.
explanation
distribution
and
We have,
however,
of not
why the r-distribution
among all distributions
in
with
and variance.
m Suppose
now that N is observed
be the random variable integral wellknown
equation
which
from e.g.
Assume
set X I and let ~(E~ = 0)
is to be estimated.
given in theorem
In most
4 is difficult
linear p r e d i c t i o n
set X o ~ X 1 is considered solution.
on a b o u n d e d
theory
it m a y be much
that R is absolutely
that
cases the
to solve.
It is
if some u n b o u n d e d
easier to get an explicit
dominated
by M on X ~ and define
131
gl
=.L (glL(xl))
go = ~ (~[L(Xo))
/
=
X o
f(x)N {dx} o
and g~
f(X)No{aX}.
= S
appr.
XI
gappr. ~ is a reasonable approximation of the best linear estimate gl
if
E (g:ppr. ~)2 E (g~ - g)2 is close to one. This quantity is mostly difficult to calculate, but since
E (gappr. -
~)2
.
E
E (g] - ~)2
(gappr.
E (~
E ( ~
~)2
-
_ ~)2.
~)2
~_
~appr. - go
+ E (go
E (g: - g)2 g:)2
E(g:ppr.= 1 +
(go
E
_
g)2
(1 . e)
S
< --
f2(x) M{~}
Xo~X I
<1+
~2 _ f
X
f(x)p{~:}
o
we are on the safe side if
(I + c)
S
f2(x) M(~x}
Xo\X 1
E~ 2 - f f(x)p{~} X or
o
)2
132
E (~r. ~)2o E (g~ ~)2 -
are close to zero. These quantities may be fairly simple to calculate since only a solution of the integral equation corresponding to an observation
of N on X
is required. O
Example 8 This example, which is not quite trivial and which we believe have some practical interest, will be considered in some detail.
Let X be the real line and let A be given by A{B} = S X(x)dx B where ~(x) is a stochastic process with E ~(x) - I and Cov (~(x),~(y))
= e -~Ix-yl
Thus we have H{B}
= S
for some ~ > 0. Put ~ = ~(0) - I.
dx, X{B 1,~2 } =
B I
I
~dy
and
dx. Further we have
I
S 5 e-~Ix-Yl --~o
e -~lx-yl
I
p{B} = S e-~Ixl B co
S B 1•
dxdy = (2/~)M{B} and thus R is absolutely dominated by
B
M on the real line.
Assume that N is observed on Is,t]. If both -s and t are large, it seems natural to consider X ~ = R. Then ~o~ is determined by
f(x) +
f f(y) e -~Ix-yl
which has the solution
f<x)
and thus
=
dy = e -~Ixl
for x ~ R
133
E(I~(O)
_ t(0))2
=
c~
~/c~2 +
2~
where 1{t0)" " =
E(<:ppr.
6:)2
E(6:-
<)2
-2(~)t =
e
(j+
-2t)
which is less than e.g. 0.01 for t > 2~.6 if ~ = 0.01, for t > 6.07 if ~ = 0.1, for t > 1.18 if a = I and for t > 0.101 if ~ = 10.
If only -s is large it seems natural to consider X ~ = (-~,t]. In 9{
this case
f(x) +
t _alx I / f(y) e -~Ix-yl dy = e for x < t -oo
which has the solution
~ f(x)
;{e- (~7~+2~)IxI
+
(~ + I - ~ )
e -(~+2~)(2t-x)}
if t > 0
=
(+~~ - ~ 2 ~ -
~)e~t+(~
)(x-t)
ift
< 0
Let the function E{I~(0) - l(0)}2(t) denote the value of E(/~(0) - I(0)) 2 when N is observed on (-~,t~.
Instead of giving the rather complicated formula we have in figure 3 illustrated this function for some values of ~.
134
E
{A~(O)
-
A(O)} 2
(t)
1.0 '
~=i0
9
a = l
9
a=O.l
o.5
J ......
-I0
9
a
=
|
~
t
0.01
i0
Figure 3: Illustration of E{l~(O) - l(O)} 2 (t)
Consider t = 0, i.e. N is observed on Es,~
and X O
E{la(O)
- ~(0)} 2 = ~
--
Then
-
and
E( ~ p p r .
o
= e-2
27J
Ist
E(~o~_ g)2
2a
I
which is less than 0.01 for Is] > 22.9 if a = 0.01, for Isl > 5.66 if a = 0.1, for Is I > 1.04 if ~ = I and for isr > o.o713 if a = 1o.
For notational reasons we change the situation and consider the case when N is observed on ~ , T ] , estimated for t 6 E0,T~.
where T is large, and X(t) is to be
Since i(x) is stationary there is no real
change in the situation.
In this case we have ~ ( t l T ) = E(~(t)li~0,T]) where ft(x)__ is determined by
T = 1 + S ft(x)No{dX} 0
135
T ft(x) + ~ ft(y) e -~It-yl 0
~y = e -~It'x]
for x E EO,T]
which has the solution (ef van Trees (1968, pp 321-322))
ft(x ) : ~ {e-Slt-xl
A
+
[e-B(t+x)
+ e-B(2T-t-x)
+
1 - A2 e - 2 B T
+ A e-B(2T+t-x) + A e-B(2T-t+x)]}
where
B =
~2
+ 2a
and
A = ~ + 1 -
B.
For large T it seems reasonable to approximate this rather complicated T estimate with l~ (tiT) = I + f gt(x) N (dx} where appr. 0 o
gt(x)
=
~ {e -BIt-xl
+
A Ee - % ( t + x )
+ e-g(2T-t-xO
)
.
We have
E (X ~
appr.
(tiT)
- ~(tlT))
2
E (Xappr.(tlT)
- X~(tlT)) 2
<
E (ta(tlT)
-
t(t))
{
2
B T <--~ (I + 2 ) ~ (ft(x) _ gt(x))2 dx --
C~
(Z
which for all t E ~,T]
is less than e.g. 0.01 for T > 30.6 if ~ = 0.01,
for T > 5.7 if m = 0.I and for all T if ~ = I or 10.
Consider now l~(t) = E (l(t)Ii([O,t]). Then we have
t = I + f f(x) No{dX} 0
~(t)
where
e-~(t-x) + A e -(t+x) f(x)
=
(6
-
~) I - A 2 e-2~t
"
136
From this and the previous discussion in this example it follows t that I + f g(x) No(dX} with 0
g(x) = (B - e) e -B(t-x)
is a reasonable
approximation
of l~(t) p r o v i d e d t is large.
Consider as a further illustration
some random generations, of a
model within the class studied in section 2.3.2, which may be looked upon as continuous parameter analogues to the generations
described
in section 2.5.
Put in these generations T = 50 and ~(x) = I (N(x)
where
; x 6 [0~50]} is a Poisson process with parameter ~ and inde-
pendent of a sequence distribution
generations presented
{lk}k= 0 of independent
function U(x) = I - e
-X
random variables with
, x > O. In figures 4-6 these
together with some linear estimates of l(t) are
for e = 0.01, ~ = 0.1 and a = I. In the case ~ = 10 the
illustration value turned out to be very low, and this case is therefore omitted.
For e = I the curves representing E ( l ( t ) I L ( E O , t ] ) ) a n d
its approximation
coincide within the accuracy of the diagram.
137
I
% 5O
25 (a) The piecewise constant curve represents A(t). The continuous curve represents the approximation of E(
z(t) l[([0,50])).
5O
25 (b) The piecewise constant curve represents X(t). The picewise conti . . . . . . . . . . . . p ..... ts E(X(t)
i [([0,t])).
50
25 (C) The piecewise constant curve represents X(t). The p~ecewise conti ....... urve rep ..... ts the approximation of E(l(t)
IIIII II
i lJ
0
17( [0,t'])).
IN I
I
I
L 5O
25 (d) The spike train represents the location of the points of N.
Fi~ulre
4~
Illustration
of
linear
estimation
in
the
case
~ =
0.01.
t
138
!
!
25
50
(a) The piecewise constant curve represents ~(t). The continuous
curve represents the approximatlon of E( k(t) l ~ ( [0,50] )).
!
f
5O
25 (b) ~ e piecewise constant curve represents l(t). The piecewise cont~ ..... e~..... presonts ~(x(t)lY([o,t])).
50
25
(C) The piecewise constant curve represents A(t). The piecewise continuous curve represents the spgroximaton of E( t(t)[ [([0 t])).
i; IIHIli IIill l; l;IIrIIlllil]II IJiili~HlIIli ;lilil; o
50
25
(d) The spike train represents the location of the points of N.
Figure
5: I l l u s t r a t i o n
of linear
estimation
in the case
a = 0.1
139
0
!
9
25
50
(a) The piecewise constant curve represents l(t). The continuous curve represents the approximation of E(~(t) I T([0,50])).
i
25
0
(b) § (c)
5O
The piaeewise c o n s t ~ t curve represents ~ ( t ) . The pieeewise co~ti~uo~s ~ r v e ~epre~e~t~ ~( ~(t) l [ ( [ O , t ] ) ) .
rlllll lit I I il I III,I I ll!lllII o
11, 50
25 (d) The spike train represents the location of the points of N.
Figure
6:
Illustration
of
linear
estimation
in the
case
a =
I.
140
Example
9
We will now consider
a simple
generalization
example
8. Put X = R and let A have
chastic
process
of the case studied
density
l(x) where
with E l(x) = m and Cov(l(x),l(y))
= l(0) - m. If m = 62 = I we have the case
Suppose
that N is observed
in
X(x) is a sto-
= 62 e -~Ix-yl . Put
studied
in example
8.
on X . Then O
~
= ]~(r
S(x)
= /
(N{dx}
- m
dx)
X 0
where
f is the solution
m f(x)
+
O
= (-~,t]
{e-Blxl+
for x ~ X O.
~+~
e-B'2t-x'}(~
if
t
> 0
if
t < 0
=
(B-s) e
and if X
dy = 62 e -alxl
0
we have 62~
f(x)
e- l -Yl
62 J X
If X
of
= (s,0J we have
~t+6 (x-t)
(cf van Trees
(1968,
pp 321-322))
O
e
f(x)
=
-BIll 1 -
where
~T
eB(l l-21sl)
(B-h 2 ~
e
-2BIsl
in both cases
B =
Assume
+
(~B-~)
2 262e c~ + ~ m
that N is observed
on a b o u n d e d
set X
and let ~(E ~ = 0) be O
the random variable
which
is to be estimated.
We will n o w consider
141
a different kind of approximation
of [~ = ^'E([IL(Xo)) which may be
useful.
Let, like in theorem
finer partitions
lira n+~
I, (Bnl,...,Bnr} n
be a sequence
of finer and
of X ~ such that Bnj ~ B(X o) and
max I <j
diam(Bnj ) = 0 . n
Put Ln(Xo) = S(N(Bnj)
- M(Bnj)
; I ~ j ~ rn). From example
6 it
follows that
[~n always
=
]~([ILn(Xo))
can be computed.
the c a l c u l a t i o n
In section 6.1 it is shown that in general
o f ~n r e q u i r e s
tedious
a r xr matrix has to be inverted. n n
numerical
Anyhow,
computations
since
with a digital computer
~n can be computed and t h e o r e m 5 shows i n which s e n s e ~n may be regarded
Theorem
as an approximation
of ~ .
5
For bounded
it holds that lim E ( ~
X O
- ~)2
=
0
.
n-~oo
Proof Put L (X o) = S(N{Bnj) follows
=
; I < j < rn, n = 1,2 .... ). It
167) that lim E(~ n~ - q )2 = 0 where n-~ (Xo)). Thus the theorem is proved if we show that
from Doob
q = E(~IL
L (xo)
- M(Bnj)
(1953
,p
L(Xo).
We will do this in almost the same way as it was shown in the proof of theorem
I that 0' = 0'.
We will need the following:
For any AI,A2,... E B ( X o) such that
142
Ak+ I C A k and lim A k : r (the empty set) we have lim E(N{A k} - M{Ak}) 2 = 0. ke~o n-~O This follows, however, immediately from properties of measures and dominated convergence.
Since L ( X o ) C L ( X o) and since L (Xo) is a Hilbert space it is enough to show that N{B) - M { B ) ~ L (Xo) for each B ~ B ( X o ) . D = { D ~ B ( X o) ; N{D} - M { D } ~ L
Put
(Xo). For disjoint DI,D 2 .... 6 D
it
follows that
~ D i ~ D by putting A k = [J D i. Thus D is a Dynkin i=I i=k system since the other requirements are obvious.
Word for word the last part of the proof of theorem
I may be repeated
and thus D = B(Xo).
Up to now we have only considered estimates
of a random variable ~.
It is sometimes of interest to estimate a random vector = (~1,...,~n).
In this case our basic sample space is ~ = N•215 n.
The required modifications A vector ~ ~ = (g],...
are obvious and will not be discussed. with ~ ff[
, k = 1,...,n, is called
a linear estimate of ~ in terms of N on X o.
Definition 8 is called the best
A linear estimate ~__~of ~ in terms of N on X O
linear estimate if the matrix
is non-negative
definite for all ~ = (nl,...,n n) with ~k ~[(Xo).
It is almost trivial to show that ~
= (E(~IIL(Xo)) ..... E(~nlT(Xo)),
i.e. the best linear estimate of a random vector is the vector or best linear estimates ~(~Z - A)'
<~
of each component,
- ~> = ~ ' ~
- ~I~>'~ ~
and that
143
Assume that R is absolutely dominated by M on X 9 Then it follows o from theorem 4 that
+
f
X
s
(N{~x} - M{dx})
o
where ~(x) = (f1(x),...,fn(X))
/ ~(x)mdx} + / s B
X
is determined by
R{B,dy} = ~{B}
, B6B(Xo),
o
where ~{B) = (Cov(~I~N{B}) ..... Cov(~n,N{B}) ).
Further it follows almost immediately
from the proof of theorem 4 that
X
(~' denotes the
5.3
transpose
o
of ~ and not the derivative.)
Some empirical comparisons be~een non-linea~ and linear estimation
The very restricted purpose of this section is to consider some random generations
illustrating a case where it seems reasonable
to believe that non-linear estimates are much
'better' than linear
ones.
Put X = R+ and X ~ = [0,t] and consider the process described in I example 3 for the special case K = 2, w1(0 ) = w2(0 ) = ~ q
if
k#i
-q
if
k = i
and
qki =
This means that {1(x)
; x ~ R+) is a Markov chain with stationary
transition probabilities,
alternating between the values 11 and 12
144
in such a way that Hki(Y) = Pr{1(x+y) = till(x) = Ik } = qy + o(y) if k # i, and hence
~I ( 1 Hki(Y
e-2qy
)
if"
k#i
if
k = i
= 1
7
(1 + e -2qy)-
Thus I
m ~(x
=~
r(x,y
= Cov(l(x),l(y))
( l 1 + ~2 )
and =
I
~ (11 - t2 )2 e - 2 q l x - y l
O' In this section we will use the notations XL(t) for ~(X(t)l[EO,q)
IB(t)' for E t(1(t)) and
i.e. IB(t) is the best estimate of ~(t) in
terms of N on EO,t] according to L(x,y) = (x-y) 2 and IL(t) is the corresponding best linear estimate.
Consider first the case q = O. Then N is a weighted Poisson process and it follows from example 2 that
~(t)
N(t)+1 e-11t ~N(t)+1 e-12t = 11 + A2 .N(t) e-11t .N(t) e-12t AI + A2
and from example 7 that
~(t) =
(I I + 12)2
+ (~I
- 12)2 N(t)
2(I I + 12 ) + (i I - 12 )2 t
In figures 7 and 8 these estimates are illustrated by random generations for t 6 [O.50~ and (11,12) = (0.5, 1.5).
145
We note that if ~I = 0 then -~2 t Z2 e if
~(t)
if
N(t) > 0
=
o
-~2 t I +e X~(t) = ~2
and
~(t)
X2(1 + N ( t ) ) 2 + ~2 t
and further
E(X~(t) -
~)2
_
2 -Z2 t ~2 e
2(1 - e-x2t) and
s(~(t)
2 12 _ ~)2 = 4 + 212t
where ~ = l(t).
Thus for large values of ~2 t the best estimate l~(t) is much 'better' than l~(t).
Consider now the more interesting case q > 0. From the results of Rudemo, described in example 3, it follows that
z~(t) = ~i~i(t) ~ + ~2~2(t) where w~(t) , k = 1,2, is determined by
146
o) = 2
~(~-o) at epochs of events and
~1 ' ( ~ )
= (~(=)
- Xl - q) ~I (T) + q ~2 (~)
~ 72~' (T) = q ~I(T) + ( k~(t)
- ~2 - q) w2(T)
in intervals between events.
Using the linear equations
for ~k(T) it follows that if no events
occur in (Sl,S2~ then for 9 ~ s 2 - s I
~+ eBT[Tr~(Sl)(J3+~)+qw2(sl)]+e ~TI(S 1 T ) =
-ST
x ~, [ITI(Sl)(B-(~)--qTr2~Sl) ]
eB~ [~+q+~ i ~1< s~ >-~I ~ I>] +e -~ [B-q-~ <<<s I l-~I s~ II]
where ~ = ~ (k2 - ~I )
and B =
q2 +
.
From section 5.2 and especially example 9 it follows that
X~(t)
~I + ~2
= ~
2
t
+ I f t (m) No{d~} 0
where ft(T) is determined by the equation
i (kl+k2)ft(T) 2
+ 62 ft fy(X)e_2qlm_xl dx = ~2e_2qlt_T I 0
for T E [0,t], which has the solution
y +c& ft(T) = ,Y-Q I [ ~
e2YT + e-2Y T e2yt
I e 2yt] ~+q
147
where 'L
~q2 7
=
2q62 +
XI+X2 "
In figures 9-12 these estimates are illustrated by random generations for t ~ [0.50j, q = 0. I and some combinations
of ~I and 12"
!
I
25
"50
t
(a) The constant curve represents k. The piecewise continuous curve represents l~(t).
,-,
I
I
25
5o
t
(b) The constant _-urve represents k. The piecewise continuous curve represents XL(t).
II 0
II II III I IIIIIII I I IIIII I III 25 (c) The spike train representsthe epochs of events of N.
5o
Figure 7: Illustration of estimates in the case (~i,~2,q) = (0.5,1.5,0)
148
!
25
I. 5O
(a) ~le constant curve represents ~. The piecewise continuous curve represents
X~(t),
!
25
L
t
5O
(b) The constant ~urve represents I. The pieeewise continuous curve represents
X~(t).
lIIIII IIIIIIIIIIIIIIIIIIIIIIIIIII I IIIIIIIIIIIII 0
25
50
(c) The spike train represents the epochs of events of N.
Figure 8: Illustration of estimates in the case (X1,12,q) = (0.5,1.5,0)
149
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
L
l
25
50
t
(a) The p~ecewise constant curve represents k(t). The piecewise continuous curve represents X~(t).
5o
25 (b) The p i e c e w i s e c o n s t a r t c ur ve r e p r e s e n t s ~ ( t ) .
The p i e e e w i s e
c o n t i n u o u s curve r e p r e s e n t s k=~(t).
IfIfIll ~ HIII If (c) The s p i k e t r a i n
Figure
IIlll II If!If 5o
25
0
9: I l l u s t r a t i o n
r e p r e s e n t s t h e epoeh~ o f e v e n t s o f
of e s t i m a t e s
in the
case
N.
(~1,X2,q)
= (0.75,1.25,0.1)
~50
(a) The piecewise continuous
!
I
25
50
constant curve represents
curve represents
l(t). Tile piecewlse
IB(t).
_~--~'~.-'~ ---A (b) The piecewise continuous
25 constant curve represents
curve represents
l~(t). b
i IfJi II II fJlll
Jiflli~l[llli I 111 i o
5o l(t). The piecewise
25
!
5o
(c) The spike train represents the epochs of events of N.
Figure
10: Illustration of estimates in the case (~i,~2,q)
= (0.5,1.5,0.1)
151
!
I
50
25 (a) The piecewise constant curve represents continuous
curve represents
l(t). The piecewise
k~(t).
[
25 (b) The piecewise constsmt curve represents continuous
curve represents
r
5o R(t). The piecewise
~[(t).
JllJgll1111JnlJlJ~fnI ~llJllJLJII Jlfill HIllIII]IIJl ]J~Ji, 25
0
50
(C) The spike train represents the epochs of events of N.
Figure
11: Illustration of estimates
in the case
(~i,12,q) = (0.25,1.75,0.I)
152
I
25 (a) The piecewise constant curve represents continuous
curve represents
50 l(t). The pJecewise
l~(t).
I
25 (b) The piecewise constant curve represents continuous
IIIIIlllnli I~II o
curve represents
I
50 k(t). The piecewise
~(t).
IIHIIII , IIIIIIIIIIIIIIIIII~IIIHIIIIII, 25
50
(C) The spike train represents the epochs of events of N.
Fi6ure 12: lllustration of estimates in the case (~i,~2,q) = (0,2,0.1)
Added in proof Snyder (1975~ pp 329 and 352) has illustrated these estimates in the case t E [0,11] and (~i,~2,q) = (0.1,20.1,2).
153
We will now consider estimation of the intensity at other epochs than the right end point of X . In order to simplify the numerical computao tions we will consider discrete time, see sections 1.4 and 1.6. Thus we put X = Z and X
Let {Z. ; j ~ Z } J
= {1,2,...,n}.
o
be a stationary Markov chain, alternating between the
values 11 and 12 in such a way that Pr{~j+ I = lilZ j = Ik} = q if i # k. for k ~ { 1 , 2 , . .. ,n}. Let ~1,...,~n be
We will consider estimation of s the observation of NI,...,N n.
We will now consider the best estimate Z~(kln) of Zk in terms of N on {1,2,...,n}. Define the random variables Z(k) and ~j(k) by
Z(k)
k
:
~.
-]~'
1
if
s
0
if
Lk J
H (~ae
J)
j=1
and
~j(k)
=
= I.
J
X. J
Put
~(k)
= E [ (k) Z ( k )
m
E Z(k)
and
E ~(k) w~ (kln) =
Then
E Z(n)
Z(n)
154
After some ealculations~ similar to those in example 3, we get for k = 1,2,...,n
vk
X1
~(k) = Vk _t 1 Xl e
where ~
~(
{(
~(
O) =
e
-X 1
{(1-q)~l(k-1)+q~%(k-1)}
1-q)~(k-1)+q~(k-1)}+t
~k 2 e-t2{qw~(k_l)+(l_q)w~(k_l)}
1 O) = ~ , and f o r k = 1 , 2 , . . , n - 1
(1-q)~l(k)~(k+l
] n)
( 1-q)~T 1 (k)+q'rr2(k)
~
+
q~-i (~:)',,2(k 1In) q~l (z)+(1-cl)~2(k)
and
w~(nln)
=
~ . ~1(n)
We will now consider the best linear estimate Z~(kln) of of N on {1,2, .... n}. Put
~1 + X2
2
1~1 - k21
2 and p = 1 - 2q . Then E~.
=m J
and
co~ (~3,~3+k) = ~2pl~l
s
in terms
155
From example 6 it follows that n
ZL(kln)
= m +
Z
fj(vj
- m)
j;1 where f1'"" "'fn are determined by the equations f~+62
n Z i=I
J
Ij-il
f.p l
= ~2 p l k - j l
for
j = 1,2 . . . .
,n.
After some calculations, compare example 9 and example 6.3, we get the following approximative solution of the equations
fJ
= 2,2~
1 - ~
Iz-Jl
{b
( b 2 n - k - J +2 + b k + J ) } +
I - bp
where
A
= I
~ (1 + p
2
62
+ --
(1 - p 2 ) )
m
and A-
~A 2-
92
b = P In figures 13-16 these estimates are illustrated for n = 25, q = 0.2 and some combinations of 11 and 12"
156
4 3 2 I
At each point k the height of the spike represents Nk, the }lecewise constant curve rep ..... ts Ik, th . . . . ti . . . . . . . . . . . . p ..... ts ~ ( k 1 2 5 )
and the lowest angle of the
triangle represents the approximation of ~[(k125).
Fisure
13: Illustration
of estimates
in the case (~i,~2,q) = (1.5,2.5,0.2)
8 7
6
5 L 3 2 1
II
I
At each point k the height of tile spike represents Nk, the piecewise constant curve represents Ik, the continuous curve represents l~(kI25) and the lowest angle of the triangle represents the approximation of I~(k125).
Figure
14: Illustration
of estimates
in the case (X1,12,q) = (1,3,0.2)
157
+ At each point k the height of the spike represents Nk, the plecewise constant curve rep ..... ts ~ ,
th . . . . ti . . . . . . . . ve rep .... ~ts /~(k125) and the i .... t angle of the
triangle represents the approximation of ~ ( k i 2 5 ) .
Figure
15: Illustration
of estimates
in the case (l 1,k2,q) = (0.5,3.5,0.2)
lO 9
8
7 6 5 .
4 3 2 1
y
I I
r
'I I
ITITIT
At each point k the height of the spike represents Nk, the piecewise constant curve represents Lk, the continuous curve represents /~(k125) and the lowest angle of the triangle represents the approximation of /~(ki25).
Figure 16: Illustration
of estimates
in the case (k1,k2,q) = (0,4,0.2)
158
6.
LINEAR ESTIMATION OF RANDOM VARIABLES IN STATIONARY DOUBLY STOCHASTIC POISSON SEQUENCES
Let {N k ; k @ Z } {s
; k~Z}
be the underlying random measure,
Assume that s m = E s
be a doubly stochastic Poisson sequence and let see section
1.4.
is (weakly) stationary and use the notations
r~ = C o v
s163
and F s for the spectral distribution
function as in section 1.6. Let further ~ be a random variable with E 6 = 0, Var ~ = o
2
and Pk = Coy <,N k. The problem to be
studied is to find the best linear estimate ~
of 6 in terms of
an observation of the doubly stochastic Poisson sequence on Z or a part of Z. This is the same problem as studied in section 5.2 for the general case. We will, however, results by using methods
derive more explicit
from the analysis of time series. A survey
of needed such methods is given in section A3.
6. I
Finite number of observatio~
Consider now the situation where {Nk} is observed for k = nl,n1+1,...,n 2 where n I and n 2 are finite. From theorem 5.4 and example 5.6 it follows that the best linear estimate ~
of ~ in terms of Nnl ,.
by n2
z
fk(Nk - m)
k=n I
n
where {f, } 2 K
n]
is the unique solution of
m%
+ .z
n2
fjrLj
= Pk
J=n I
for k = nl,...,n 2
and further
'',Nn2
is given
159
n2 S (~
- ~)2 = 2
-
fkPk 9 k=n I
Using the notations --
P
=
(Pnl
~=(
--
fnl
~,
,.
"''Pn2
9 ,fn2
)
)
= (Nnl .... ,Nn2)
= (s
.... ,s
! = (I,...,I)
(n 2 - n I + I components)
R s = {r~_j}, n I ~ i, j s n 2
I
= {6i_j}, n I ! i, j ~ n 2
~.
=
where
a
1
if
j =0
o
i~
j#o
we get ~
=
f(N'
-
m1')
where
f = p(mI + RZ) -1 and
E (~
-
6) 2
= c
2
-
fp'
=
Now we specialize to the case ~ = s Zk~ = ~
c
2 - i ( m Z + R~) -I s
- m and use the notation
+ m. In this case
s
= m + 9_k(mI +
(N'
.
-
m !' )
160
where ~k
= (r ~
nl-k'''''
r~
)
n2-k
"
We may observe that
~o,,~
I
2
and thus
(s --
: (s
s nI
)' : m1'
From the very last part of section E (s
- !)'
= mRZ(ml
Any (&k ; k ~ Z }
+ Rs
+ Rs -I
(N' - m1')
n2
5.2 it follows that
(~f - _Z) = R Z - RZ(mI + Rs -I R Z =
+ Rs -I = ml - m2(ml + RL) -I
has the spectral
Zk - m =
representation
/ e ikx ZZ{dx} --7
and thus n2 l.i.m. n2-nl -~
(The notation ~, ~i,n2,...
1 n2_n1+ I
Z Zk = m + Z~{{0]). k=n I
l.i.m, means l.i.m,
We call m + ZZ{(0}]
nn
'limes in mean'
~ stands
From the spectral
for lim E (n n
for random variables ~)2
0 .)
the level of the process.
Consider the case ~ = Z~{{O]}.
~k=m+~+Sk
i.e.
Define
{c k ; k ~ Z }
by
.
representation
it follows that E (~s k) = 0 for
161
all k. Thus R Z = ~21'I + R e where R e = {r2_j}={Cov n I ~ i, j ~ n2, and ~ = 2 ~ .
t~
=
(ei,ej)} ,
From this it follows that
d2l(ml + a2i'l + R~) -I (~, - mi' )
and further
E (~
- 6) 2 = d2(1 - ~2A(mI + d21'1 + Re) -1 ! ' )
If n 2 - n I is large the inversion of ml + R ~ requires tedious numerical calculations
and therefore good approximations
are help-
ful. This is the problem to which most of section 6.2 is devoted.
6.2
Asymptotic result~
We will start with a general formulation of the kind of problems to be studied in this section.
Let Z I ,Z2,... C Z
be finite sets such that Z k C Z k + I and let LZk
denote the Hilbert space spanned by {N. - m ; j EZk}. J applications
In all
Z k will be of the form In I ,n2] where n I and n 2 are
finite and depend on k. In section 6.1 we considered the calculaoo
tion of E (
=
[J Zk and let iZ k=1
be the Hilbert space
spanned by {Nj - m ; j 6 Z
}. Put <~Zk = E (6}iZk)
, k = ~,I,2 .....
Then it follows from Doob (1953, p 167) that
l.i.m. k~
~
= Zk
~Z
and thus
lim E ( ~ k-~
_
k
~)2
=E
(tz
_ g)2
Therefore it is of interest to study ~Z
and E (~ oo
_ ~)2 oo
162
The main motivation
for this section is, however , that ~xZk generally
requires tedious computations
and therefore it is of interest to find
/Zk such that ~Zk is rather simple to compute and in some sense qZk~ serves as an approximation of ~ Zk '
Definition
I
A sequence
{~Zk} , ~Zk ~ IZk , is said to be asym~totically
efficient
in the linear sense if
lim
k-~
= I .
E ([~k -
g)2
Remark I If {~Zk} is asymptotically
efficient it does of course not mean that
~Zk is a reasonable approximation of ~Zk for fixed k. In section 5.2 we discussed what to mean with a reasonable
approximation.
l Now we assume that F ~ is absolutely continuous.
Then also {N k ; k ~ Z }
has absolutely continuous
F N and, using the nota-
spectral distribution
tions in section 1.6, we have
fN(x)
= ~ m + f~ ( x )
where fZ and fN are the spectral densities. culation of ~Z n or Z
= {0,•
and E (~
~
-
g)2
for Z
We will consider the cal-
= {n,n-1,...}
for some finite
This is exactly the same situation as discussed
in the beginning of section A3. We will, however, repeat some of the notations.
Thus ~ is a random variable with E ~ = 0, Var ~ = d
2
and
163
Coy (~,N k) = Pk" The f u n c t i o n cross s~ectral density.
r
_ ~ I2~
Z
Pk e
-ikx
. is called the
From section A3 it follows that
W
9{
tZ
S h(x)ZN{dx]
=
co
-IT
where Z N is the spectral process
S eikx
Nk - m =
zN{dx]
in the representation
,
--'iT
and h a function depending on ~ and Z . Define g(z) by
4w g(z) : exp {A-
For Z
= {n,n-1,...]
h(x/:
the notation
S -w
e-lY" + z log fNCy) dy) e -my - z
it follows from t h e o r e m A3.2 that
1. f~l gIe-~X/ L~/e~X~ '
[ I n is defined in section A3, and further
E ( ~Z
~
)2 = a 2 - i
ix
12
dx .
-~ [g(e )~nI For Z
= (0,•177
it follows from theorem A3.1 that
h(x) : r fN(x) and further
co
-~
fN(x)
~64
Example
I
In o r d e r
to i l l u s t r a t e
prediction
one
= Nn~ I - m
step
how these
ahead.
formulae
Consider
m a y be u s e d w e w i l l
therefore
Z
= {n,n-1,...}
consider a n d put
.
Then we have
r
=
I ~w
~ Z
N -ikx e rk_n_ 1
~ e
-i(n+1)x
fN(x)
k=-~
a n d thus
)2 E
([Z
-
71
N
[
r0
=
oo
dx =
-T[
~ix g(e-~x~ o r2)
17
f
~i(n+1)x g(e-iX)]nl2
f
-
(?~(~)
-
dx
=
--'/T
17
f (fN(x)-
g(e -ix) - go 12) ax =
--17
17
f (gO[g(e-ix) + g(eiX)] - g2o) dx = --17
17
=
217
go2 =
217 exp
~I
~ log f~(x) dx --17
which
is t h e w e l l - k n o w n
result
shown by Kolmogorov
(1939).
m For later purposes
2
we d e f i n e
%red=217exp~
I
'5
flogf~(~) dx. --17
Assume
n o w that
s f (x) <_ c < ~ for a l m o s t 17
a l l x~[-17,w]
and
consider
165
where I ~ ~
h(x)
s
~
e
-ikx
j[z and put
=
qZ k
Since ~-~w ~m
Z h. (N. - m) . jEZk J J
fN(x ) _< __mm +2w
c
it follows from theorem A3.3 that
l.i.m, nZk = 6Z k-~oo
co
and thus, provided E ( ~ Z
- ~)2 # 0 , {qZk} is asymptotically
efficient.
Remark 2 It may be a matter of taste if the q Z k % to compute'. If ~
are called 'rather simple
is rational, i.e. a ratio between two trigono-
metric polynomials, then g is simple to compute. It may be observed that if f~ is rational, then also fN is rational.
If the condition fZ(x) ~ c < ~ is strengthened to
Z
Ir~l <
jEz then the methods used in section 5.2 may be applied, i.e. the sequence {hi}, j E Z
mh. + Z J i~Z for j ~ Z
co
, is the unique solution of
hi
ri-j
=pj
.
Now we specialize to the case ~ = Zk + m and use the notation ~k~ = ~Z~ + m. In this case
I
166
oo
~(x)
Thus for Z
-~ 7
z j=_~
r.
e -ijk
a-k
e -ikx f~(x)
=
.
= {n,n-1 .... ) we have
h(x) =
Fee" J_kx
I g(e-iX)
f~(x) ~
L- g (eix)
n
and 2 g (Z k
For Z
L: -2orx) J n
= r0 -
dx
= {0, I,...} we have
h(x) = e
ikx
fs
fN(x) and
(fE(x))2
_ ~k )2 -~ w
f
dx =
fN(x)
m f~(x) dx
9
-w m + 2z f~(x)
Remark 3 It may be observed that N k - m = (s
- m) + (Nk - s ) where,
theorem 1.6, N k - m may be looked upon as an observation 'signal'
Zk - m with the
'noise' N k - Zk added.
see
of the
The question of
signal measurements
is treated by e.g. Hannan (1970, pp 168-179)
and the estimate s
could as well be derived by using the results in
Hannan.
Compare also Snyders
(1972).
I
167
Example
2
This is strictly speaking no example but some observations.
To
motivate these observations we look upon the doubly stochastic Poisson sequence as a model for the number of claims in an insurance portfolio.
o(n)~ for m + E (s In this example we will use the notation ~k with Z
Let N
n
= {n,n-1,...}
.
be the number of claims which occur during year n and
assume that each claim causes the insurance company the cost of one monetary unit. A natural risk premium for year n+1 is then E Nn+ I = m. If, however, the value of Zn+1 was known, a more
'fair'
risk premium
would be Zn+I" Of course, within the present model, s
, Nn,Nn_ I ...
but if the company has knowledge of the risk premium.
is unknown
it may use
~(n)~ n+1
This reasoning is in insurance mathematics
'experience rating'
and
as
called
g n+1 ( n ) ~ may be called the 'credibility premium'.
After year n+1, when the company has got knowledge of Nn+1, the updated credibility premium z(n+I)~ n+1
may be calculated.
Define Un+ I by
~(n+1)~ + ~(n)x n+1 = Un+I ~n+1 . The quantity Un+ I may be called the quantity'.
In insurance terminology
'updating
it corresponds to 'minus the
bonus'.
Consider E (s
- Zk)2 , a quantity which in general may be rather
2 hard to evaluate. We will show that if k = n or n+1 then only Opred has to be computed.
We have
-
elkX(fN(x ) _
E (Zk(n)a- s )2 = r~ - i --I[
-- r 0 -
l ikx
g(e -ix
_.~
2 dx =
g(e Ix)
>] mFei X 2 n
2"~ Lg(elX)~ n
dx .
]68
Specializing to k = n and n+1 we get
E , n+1
- ~n+1
= ro~ - -wi
=
r@
=
" ei(n+l)x g(e-ZX)
- go ei(n+1)x 2 dx =
fN(x) + g 20 - go (g(e ix ) + g( e-iX) )
-
dx
=
-IT
2 2w go
R, =
r0
-~
(~
-
2 pred
~Fm +
--
r0
+
2-
4w
go'[~ :
2 2w go
-
m
=
m
and
E (s (n)~ _ s
s =
ro
=ro
-
-
i -w
='
einX
/ If
" g(e-ZX)
N(X)
+
m
I
2w
-7-
-w r0
s -
einX 2 go
- -m
2w go
dx
[m
+
r0
m
(g(e ix) + g ( e - i X ) ~
2
2 m
+
2
2m]
=
2w go
m
2
2w go
2 m -
m
2 pred
Because of stationarity we have
E (s
I )~
Zn+1
)2
Consider now the two estimates
We have
~-
go
L :
m 27
(n)~ = E (Ln
s )2 n
~(n)~ z(n+1)~ n+1 and n+1
dx =
169
s n+1
w I ~i(n+1)x fs = m + ~ . -w g(e-lx) L g (eix)
1
= m +
f I. -w g(e -Ix)
m e zN{ax} = 2w g(e Ix) I n
= m +
f -~
I. g(e -Ix)
i(n+1)x g(e-lX)
[el(n+1)x g(e-lX)
= ~ + f e i(n+1 )x
n
zN{dx} =
_ ei(n+1)x
~g(e-ix) - goI zN{dx}
L(o-ix)
and z(n+1)~ w [ei(n+1) n+1 = m + f I. x fZ(x zN{dx} = -w g(e -Ix) g(e Ix) ]n+1
= m +
= m +
= m +
lei(n+1)x g(e-lX) " f 1. -~ g(e -Ix)
~
I
-~
g(e -Ix)
i(n+1)x
ei(n+1)xl m
zN{ dx] =
2~ go
Ig (e-ix ) _ m 2g0] ZN{dx) = fw e i(n+1)x . -w g(e -mx) ~pred ] i(n+1)x
= m + f -~
:m+
g(e-~X)
m_ e__~.+ i(n 1)x __I zN{dx} = 2~ g(e Ix) ]n+1
(I
m
2 pred
e
g(e -Ix)
1
g(e-lX_ go 2m ) g(e-lX) + ~m ~pred ~pred
m ) - m) + m , (n)~ m) = 2 (Nn+1 7 - - - ~n+1 ~pred ~pred ~(n)~ + (I n+1
m ) 2 Nn+1 ~pred
zN{dx)=
170
We have
= s Un+1
s n+1
~ = -~
n+1
-e-(n+1)x ----:--. - ig g(e-lX) 0
m ~ 2w g
zN{dx}
and thus
EU 2 n*l
=
~
m = 2"rr
2 pred
+
0 -
2 m 2 epred
~
-
2
2Tr g'
2m
= 2~T
2+( 0
)
_
=
0
.
In Grandell (1972:1, pp 548-552) similar results were derived by use of Toeplitz forms. In these derivations the spectral distribution F Z was not assumed to be absolutely continuous, but since theorems A3.I and A3.2 can be generalized to not necessarily absolutely continuous spectral distributions, the results in Grandell (1972:1) can be derived by the method used in this example.
Let us go back to the application to insurance models.
The simple formula
z(n+1)~ _ n+1
m 2 pred
z(n)~ m ) n+1 + (I - - 7 - - Nn+ I pred
seems attractive, since it means that the policy holder has a possibility to understand how the number of claims year n+1, i.e. Nn+1, affect the bonus year n+1.
Example
3
Consider the case m = I and r~ = g
plJl, IpI< I , and
e x a m p l e may be r e g a r d e d as t h e d i s c r e t e
parameter
~ = Zk - m. This correspondence to
171
the
case
following
studied
in e x a m p l e
Hannan
(1970,
5.8.
pp
We w i l l
give
a direct
derivation
171-172).
We have
f~(x)
2 1 - p 2 1 + p - 2p cos x
= 1 27
and
fN(x ) = 1
2(1
27
In order
to derive
tion
g. We w i l l
g(z)
is a n a l y t i c
-
p cos x )
1 + p2 _ 2p
the
estimates
use the
facts
and without
cos
x
Zk we have
that
to c a l c u l a t e
g(eiX) 9 g(e -ix)
zeros
in
the
= fN(x)
and that
Izl < I.
Since
fN(x ) = I_ 27
.. 2 -
p e
2
l+p
ix
-p
- p e ix e -p
--ix e
-ix
we c o n s i d e r
I
I
2w
1
=_I_
-
+
p
p_
2~
b
pz
2
-
pz
pz -
-I
I
pz
-1
(z-h)(bz-
2~
I)=
(z - p ) ( p z
- I)
where
b =
1-
~/1-p
2"
P I
Thus I - bz
g(z) =
I - pz
+
(z - b ) ( z
(~
p 2wb
p)(z
-
I - bz I
-
pz
- b -I) = p-1 ) -
I - bz I
-
pz
func-
-I -I
172
Since
Ibl <_ Ipl
it is seen that g(z) is analytic and without zeros
in Izl_< 1.
Consider now, despite we loose some systematics, the case Z
= {0,•177
In this case we do not need to calculate g, but
still a similar factorization will be used. In this case
e ikx fZ(x)
h(x)
=
=
eikx(1 _ p2)
* '
fN(x)
2(I - p cos x)
e ikx (1 - p2)b"
p(1
- h eiX)(1 - b e -ix )
eikx (j -- @2)b I P(1 b 2)
1 -p
~z
2
j=-~
I -- + I - b e ix I - b e -ix
hlJl ei(J+k) x
2
= ~-P 0
hlJ-kl 2
and ~I
E(~
Consider now Z
p2'
- ~k )2 = ~1 -
= {n,n-1,...}. Then
h(x) = I
re
g(e -ix) [
#I
-
2
i.e.
h.
-
~kx f~(x) g(e ix)
: n
I] =
S 2~ ~z j=-~
b
lJkleiJ=
173
feikx
I - p2
2zb
I - p e -ix
I
P
1 - b e-iX
/
l;
P'
L
V
2Trb
b(1 - p2)
2w(I - p elX)(1 - p
I - p e
/
I -belX I
-~ eix
e " I - b eZX)(1
I - b e -ix
p
]
e~X)I In
n
- p e -ix
We have
I
ikx
]
ikx
e ---- e
(I - b eiX)(1 - p e-ZX) n
e I - p b
[,'eiXi,1 eiXi[ n-k
I + -ix - b e ix I - p e
n-k
and thus
h(x) =
2
#I - p I + T ~ _-p- 2
e
ikx(! I
P e -ix) b e-iX
I
I beiX + 1-pe -ix -
We separate the two cases k < n and k > n. Consider first k < n. Since n - k > 0 we have
h(x)
=
#1-p 2 I+-~7
e
ikx(1_pe-iX) l-be -ix
eikX(1_pe -ix)
1+~-p
l-be
--iX
(1-bp)e ikx -
be
n-k 9 .. I {( Z bJe IJx) + "} = j=1 1-pe -ix
ixn-k ei(n-k)x) I1-b 1_be ix
1_pe-iX)b n-k+1 ei(n+1)x
(1-beZX)(1-be -zx)
+
-
I
-
1_oe -ix
~n-k
"
174
oo
= ,, ~
{(1_bp)eikX
I z b Ij leijX 1-b 2 j=-.
_ (1_pe-iX)bn-k+lei(n+1)x)
Thus
hj = --~{(~ bp) blkJl
bn-k+1
bln+l-Jl + phi-k+1 bln-Jl}
or
{blk-Jt
2 h.
+ b2n-k-J +2}
if
j
if
j>n
=
J 0
and thus
X /- 1p2
E(Z k - s ) 2 -
Consider
2
{1 + b 2(n-k+1)}
.
k > n. Then
{I h(x)
1 +
eikx(
-ix 1 - pe ) -ix I - b e
- p2
=
2 1 - p ~ _ p
P
k-n
I -be
e
1 - p e -i
inx -ix
Thus
~1-p h,
J
2
pk-n
bn-D
if
j
if
j >n
~-
0
n-k
175
Further
1T
E(,~-
.~k)2 = 1 -
]" Ih(x)l 2 f"(x)
d~ =
--'IT
=
1 -
(1
-
s
p2(k-n)
~
2
1 -
=
dx
-~
I1 - p e
ix
I
2
p2(k-n) 1 +
ql
-
p2
Put k = 0 and consider E(s
"
- ~)2
as a function
of n. To summarize
we have
21nl I
P
ifn
I+~i-~~
E(~% - ~o )2 =
To illustrate figure
-
2
p
{1
E(Z~ - Z0 )2 we consider
17 this function
2
J
2 ~1
+
(I -
p
~I
- p
2n+2 )
--
it as a function
is given for some values
} if n > 0
of n.
of 0. In
176
1.0
0.5
0,25
0.5O
0.75
Figure 17: Illustration of E(L~ - ~0 )2
Consider now Zn = {1,2,...,n). Then a natural approximation of Lk' when ~ = Zk - I for k ~ Z n ,
~
2
p2 ~Z
n
=~
I~
{i~p2
is
n b2n_k_j+2) Z (b Ik-jl + (Nj-I) j=l
if k equal to or near n
n j=1
h Ik-jl
n (blk-J
Z j=1
(Nj-
I +
I)
b~+j ) (Nj-I
if k not near n or I
)
if k equal to or near I
As illustration we have applied this approximation to the random generation G7 described in section 2.5. Thus we use 1 + nZ5 0 as approximation of Lk. In figure 18 which is equal to figure I with the approximations of Zk added, this is illustrated. Figure 18 may be compared with figures 4(a), 5(a) and 6(a).
177
50
25 In e&ch point k the height of the spike represents N k , the value of the plecewise constant curve represents ~k smd the v~lue of the continuous curve represents the approximation of Z k.
Fisure
18: l l l u s t r a t i o n
A natural
question
of e s t i m a t i o n
on g e n e r a t i o n
is a r e a s o n a b l e
is now if qZ
G7.
approximation
M
of ~Z n
n in the sense that E(q Z
_ ~)2 ~
E(~
n this we have c o n s i d e r e d means
integer part.
_ ~)2
To get some idea of
n ~ = ~ n+1 - I and ~ = ~ - I, where En~] n
In table
2 and 3 we have
calculated
E
E(q Z
] _ ~)2
n and E ( ~
_ ~)2 for some values
of n and some values
of ~. The tables
n illustrate
both h o w
convergence
'good'
of E(q Z
the a p p r o x i m a t i o n s
_ ~)2 and E ( ~ n
where
~ = Z
are and the rate of the
_ ~)2 to their
limits.
In table
2,
n
- I is c o n s i d e r e d ,
the a p p r o x i m a t i o n
n
n
~I-P2 : ----Z--
~Z
(I + b 2)
n
is used. In table
z j=1
3~where
Ln ] -
~ = ~ n+1
b n-J (~
- m) J
I is c o n s i d e r e d ~ t h e
approximation
178
1_~,2
nzn is
=
2
n+1 l IL~]-Dl
n Z
j=1
b
(N'-S -
m)
used.
~ =s
p=o.25
n
1
0=0.5o
p=o.75
E(~Zn-~)2
E(nZn-~)2
E(~Zn-~)2
E(nZn-~)2
E(~Zn-$)2
E(nzn-$)2
i
0.50000
O.5OO13
o.50o00
o.5o258
0.50000
0.52076
2
I0.492o6
0.49207
0.46667
0.46686
0.41818
0.42311
3
0.49194
0.49194
0.46429
0.46430
0.40217
0.40321
4
0.49193
0.49193
0.46411
0.46412
0.39894
0.39915
0.46410
0.46410
0.39828
0.39832
I
0.39815
0.39816
7
0.39812
0.39812
8
0.39811
0.39811
0.39811
0.39811
5 6
|
0.49193
0.49193
o.4641o
Table 2: Illustration of the rate of convergence.
o.4641o
'goodness'
of approximations
and the
179
~ =s
~=o~5
n
1
-i
~:o5o
~~
E(~n-g)2
E(RZn-~)2
E(~n-~)2
E(.Zn-~)2
E(~n-~)2
E(~Zn-g)2
0.50000
0.50050
O.5O0O0
O.50898
0.50000
0.55731
2
0.49206
0.49221
0.46667
0.47011
0.41818
0.45201
3
10.48438
0.48438
0.43750
0.43798
0.35938
0.37178
4
0.48425
0.48425
0.43541
O.43561
0.34749
0.35371
5
0.48413
0.48413
0.43333
0.43336
0.33636
0.33850
6
0.48412
0.48413
0.43318
0.43320
0.33410
0.33521
0.48412
0.43304
0.43304
0.33186
0.33224
8
0.43303
0.43303
0.33141
0.33161
9
0.43301
0.43301
0.33095
0.33102
i0
0.33086
0.33090
25
O.33072
O.33072
O. 33072
O. 33072
7
0.&8412
Table
3:
0.48412
lllustration
0.43301
of the
0.43301
'goodness ' of approximations
and the
rate o f c o n v e r g e n c e .
Now we will give
any help.
First the A3.2
consider
we
case
some
For the
consider
cases
rest
linear
estimation
compute
the
case w h e r e
the
not
absolutely
continuous.
l.i.m. n-~o
of this
~k = m + ~ + Sk w h e r e
in o r d e r t o
where
~Z
spectral
theorems
section
We will then which
of the get
~Z
- A3.3 Z
n
i.e.
Formally
the theorem
distribution
-- Z N. = m + ~, a r e s u l t n j=1 J
we p u t
of the level,
~ = Z{{0}}.
' since
A3.1
we
can b e
= {1,...,n}.
we
consider
can u s e t h e o r e m generalized
observed = ~ since
is o f no help.
do not
process
to is
18o
We use the n o t a t i o n
Rs = n
{r[
~-0
.}
=
{Cov(si,E
and further we denote by I vector
(I,...,I)
n
J
)}
the n•
with n components.
1 < i --'
'
j
< n --
identity matrix
'
and by ~
the
It shall be r e m e m b e r e d that
and s k are u n c o r r e l a t e d
for all k.
The f o l l o w i n g t e c h n i c a l
lemma will be used in order to i n v e s t i g a t e
the asymptotic
properties
of ~Z
" n
Lemma
I
For all n > I we have
1 =
(mI
1 --I1
+ o 2 1' n
-I1
1
+ RE ) - I
~
2'1
0
2
1 +
1'
1
-"n
-"rl
(ml
+ RS) -I I' n
n
--n
Proof
In Grandell given 9
(1972, pp
103-104)
a probabilistic
The p r o o f to be given here was
Put R = ml n + R ne
and -a = oI -n'
p r o o f of this
lemma is
s u g g e s t e d by B. yon Bahr.
Thus we shall prove that
(~ (R + a'_a)-I _a')-1 = I + (a_ R -I _a')-1
Let B be a symmetric positive put ~ = ~ B -I
definite m a t r i x
Let C be an o r t o n o r m a l m a t r i x
such that B 2 = R and
I~I
such that b C =
i
I n
where Ikl = (~ _~ b2) 2
and ! = (1,0 ..... 0). Then we have
a = bB =
I and thus
(~ (R + a'~) -I S ) -I =
= <1~12 i c,B (B B + I~12 B
c
i ' i C ' B ) -1 B
C i')
-1
=
I~I~ c,B
181
:
(Ib_..I2 i
= (l_b-I 2 i
c'
(z + }b__l2 c i ' i
( I + Ibl 2 i ' i )
c ' ) -1
c i ' ) -1 =
-1 i ' ) -1 :
= (Ib_l 2 (1 + I A 1 2 ) - 1 ) -1 = 1 + I s
and
(_~ R-I _~,)-I = (b_ B B-IB-IB b_,)-1 = Lkl-2
and the lemma follows.
Using lemma I and the expression
_ ~)2 given in section
for E (<~ n
6.1
we
get
u
E
(~ z
I (ml + RE) -I I' --n n n --n I u2 + I (mI + R~) -1 1'
_ ~ 2= ,
n
~
If F e, i.e. the spectral
distribution
absolutely
a neighbourhood
version then,
continuous
2
in
fS(x) of dFS(x-----~) is continuous dx
n
n
~
for {Sk }, is assumed to be of
x = 0 and
in a neighbourhood
see theorem A3.4,
= m + 2~ rE(0) + ~ I
--n
(ml n + Re) -I
n
I'
--n
and thus
lim n E ( ~ n+~ n
if
~)2 = m + 2w fs(0)
.
some
of x = 0
182
From theorem A3.4 it follows that under the above assumed regularity conditions
a natural approximation
of ~Z
is n
n
I =--
~_
Z N. - m. j=1 0
n
n
We have n
1
E (8Zn
Z N. - m - ~)2 = j=1 g
$)2 = E (n
n
= E ( -1 n
7.
N.
j=1
-
s
+
a
o
~T. a
s
s
-
m
-
~)2
=
O
n
=E
(1
Z ~=~ j
~
n
:
s
I (~-
+ s.) 2 = a
N. _ s . ) 2 + E ( 1 a a
z j=l
= I___ A 2
a
n
n
z j=l
~ .)2 -J
(mIn + Rs) I' n --n
n
and thus, see theorem A3.4,
lim n E (n Z _ ~)2 = m + 2w fs(0) n- ~ n
from which it follows that, provided F s fulfils regularity
assumptions,
{qZ ) is asymptotically n
the assumed efficient.
Remark 4 E s t i m a t i o n of the level is considered in more detail in Grandell (1972:2, pp 92-106). conditions sequence
We just mention that without
any regularity
on F C , except the condition F~(0) - FS(0 -) = 0, the
183
~Z
= n
I (ml + RE) -I N' --n n n --n -I I (ml + R E) I' --T1
where ~ both
= (NI,...,N n
~Z
a n d
tion
n
,
, is a s y m p t o t i c a l l y
requires
n does n o t
of n Z
- m
n
an i n v e r s i o n
efficient. of a n•
Calculation
matrix
but
of
calcula-
2 require
knowledge
of o
.
n
9
4
Example
Consider
case m = I a n d r~ = P I k l
the
-1
< p < 1
Then
lim n E
(~Z
n -~~176
In f i g u r e s
<
)2
=
2 1 - p
n
19-21
we h a v e
drawn
n E
(~Z
~)2
for some v a l u e s
of
n
n a n d for 2 n E
(n Z
= I a n d a 2 = 10 r e s p e c t i v e l y .
_ ~)2
Further
we h a v e
drawn
where
n
~Z
for t h e
~Z
is
n
1 --- n
same v a l u e s
n
~ j=1
N. - I 8
o f n.
Figures
as an approximation
o f ~Z
n
19-21
thus
and the
rate
illustrate
of convergence
n
n E (~
_ ~)2 a n d n
n E (n Z
_ ~)2 n
to their
how
limits.
'good'
of
184
Figure
19:
I
I
I
I
I
I
I
I
i
I
I
i
2
3
h
5
6
7
8
9
lo
25
lllustration rate
~ f the
of convergence
curves
represent
'goodness' for t h e
n E
(qZ
= I0
and
n E(~
I
of approximations
c a s e p = 0.25. _ ~)2
n E
From
(~
n
2
.
n
and the above
the
_ ~)2 for n
_ ~ ~2
for
o2 = 1
n
Figure
20:
I
I
I
I
I
I
I
I
I
I
I
I
I
2
3
4
5
6
7
8
9
lo
25
|
Illustration rate
of the
of c o n v e r g e n c e
curves
represent
'goodness' for the
(qZ
n E
of a p p r o x i m a t i o n s
case
p = 0.50.
2
=
10
and
n E
x (~Z
- 5) n
and the above
- ~ ) 2 , n E (~Z~ - ~)2 f o r n
o
From
n
n
2
for
o
2
=
I
the
185
I
I
I
I
I
I
I
I
I
I
I
I
i
2
3
~
5
6
7
8
9
10
25
Figure 21: lllustration of the
'goodness'
rate of convergence curves represent
of approximations
for the case p : 0.75. From above the
n E(n Z
_ g)2, n E(g Z n
and n E ( ~
and the
6) 2 for 2
g
)2
for ~
2
= 10
n = I
n
9
To end this section we will consider a case where the variable to be estimated depends on Z . The case we have in mind is when the average n intensity
~ n
Since ~
I nZ Z. is to be estimated in terms : -n j=1 J
depends on n, theorems A3.I-A3.3
of NI,...,N n.
are not applicable.
From
n
section 6.1 it follows that the best linear estimate ~
n
terms of N I,...,N n is given by
~
n
1 I : m + -n~
R s (ml n + R~ )- 1 (N - m ~ n n --
),
and 2 E (Z:-
~ )2 n
m n
m n
"2 1--n (mln +
Rs -I I' n --n
of [
in n
186
Thus it follows from section A3 that if Fs
E (#n~ - #n )2
- Fs
-) > 0, then
2 m n
m 2 n
I
Fs
- Fs
-) + o(1)
Thus
lim n E (s~ - s )2 = m
and since
-n n E (N
~n )2 =
m,
n
where N
"~ Z Nj, it follows, with a slight m o d i f i c a t i o n n n j=l
definition
1, t h a t
{N } i s n
asymptotically
More interesting is the case Fs Fs
is absolutely
continuous
that some version fs
of
efficient.
- Fs
-) = 0. We assume that
in a n e i g h b o u r h o o d
dFs dx
of
is continuous
of x = 0 and in a n e i g h b o u r h o o d
of x = 0. Then it follows from t h e o r e m A3.4 that
E
(#~ - s )2
m
m
n
2 2
I
m + 2w fs
+ o(I)
Thus
2 lim n E (~n ~ _ ~ )2 = m n n+~
Obviously however,
m
. . =
m + 2w fs
{N } is in this case not asymptotically n consider
aN
n
+ b.
T h e n we h a v e
2w mfs m + 2w fs
efficient.
Let
US
187
-
2
n E "--,(aNn + b - ,%n ).
=
= n
(1
E
~a(N n
-
~n ) -
= n ~a._~ + ( 1 - a )
2
-
a)(~ n
- m) + b -
2w f ' % ( O ) +
o(1)+
(1 -
(b-
a)m] 2 =
(1-a)m)21
=
n = a2m + (1 - a) 2 2w f ~ ( O )
+ o(1)
+ n(b
-
(1 - a)m) 2 .
Thus we must have b = (I - a)m. To get the asymptotically best choice of a we minimize a2m + (I - a) 2 2w fZ(0) and thus we get
a ---
2w f Z ( O ) m + 2~
f~(0)
and m
b =
2
m + 2~ f~(0)
For this choice o2 a and b we have
lim n§
n E (aN
+ b - ~ )2 = n
n
2w mfs m + 2w fZ (0)
and thus, p r o v i d e d fZ(0) > 0, it follows that
m2 + 2w f~(O)
n
m + 27 f'%(O)
is asymptotically
Example
efficient.
5 (Continuation of example 3)
We have m = I and 2w fZ(0) =
1-p (1
-
2 p)2
= l+p 1 -
p
188
Thus
lim n E (s ~ - s )2 = n-~co
I + P 2
and
1 - p + (1 + p) N n ]
I
2
ia asymptotically
In figure 1 -
efficient.
22 we have p +
(1
+ p)
n E (
N
n E ( ~~ - ~n )2
for some values
0.50 and 0.75 respectively
1 -
p +
(1
'good'
rate of convergence
and
2
n _ ~ ) n
2 p = 0.25,
drawn
+ p)
N
n
of n and for
in order to illustrate
is as an a p p r o x i m a t i o n
of the drawn quantities
of ~
n
how
and the
to their limits.
189
0.9 --
0 ~ 0 . 7 5
--
p=0.50
__
p:0.25
0.8
=
0.7
-0.6
0.5
I
I
I
i
2
3
I
J4
Figure 22: lllustration
I
I
I
I
I
5
6
?
8
9
of the
'goodness'
rate of convergence.
I
!
io
of approximations
1-p+(I+p)Nn n E (
and the
)2 ~
and n
2 -
I
For each value of p the curves
represent from above
nE(~ - ~n)
.
25
2
Consider now the r ~ n d o m generations
GI-G7 described in section 2.6.
In table 4 we give the values of ~
taken from table
n
mative estimates
I
~ 1 +p p + 2(I + p) Nn -
approximation
of
I, the approxi-
p + (1
E ( 2
+ o)
and N
2n 2
n _ ~ ) n
w h i c h is an
190
I-0+( 1+p )N n
Name of n
p
generation
n
GI
500
0.0
0.993
0.991
0.032
G2
500
0.0
1.025
I .026
0.032
G3
500
0.0
1.018
I .023
0.032
G4
500
0.75
0.929
0.899
0.042
G5
500
0.75
0.878
0.815
O.042
G6
500
0.75
0.933
0.979
0.042
G7
50
0.75
0.860
0.876
0. 132
Table 4: lllustration
7.
of estimates on random generations.
ESTIMATION OF SECOND ORDER PROPERTIES
OF STATIONARY DOUBLY
STOCHASTIC P01SSON SEQUENCES
Consider~ like in section 6, a stationary doubly stochastic Poisson sequence N = {N k ; k E Z} together with its underlying random measure = {Zk ; k s
In section 6, where linear estimation of random
variables was treated, we assumed m = E ~k to be known.
In general these quantities
and
rk = C o v
are unknown,
(~j ,~j+k )
and therefore
have to be estimated.
In this section we will study estimates of the
covariance structure.
We will, however,
assume m to be known.
If it was possible to observe Z the problem to find the estimates were
'standard'
time series analysis.
observed and we have to find estimates
In general ~ can not be in terms of an observation
Since also N is a stationary time series, we do really never leave
of N.
191
'standard'
time series analysis.
We will in this section assume that we have an observation NI,...,N n of N and we will compare natural estimates natural estimates
in terms of N with the
if we had an observation s
of Z.
In section 6.1, where linear estimates was studied for finite observations, the results were b a s e d on the covariances
r~ while in sec-
tion 6.2 the results were b a s e d on the spectral density fs therefore
study estimates
We will
of r k when n is 'small' and estimates of
fZ when n is 'large'. This division will also from the point of v i e w of estimation be natural.
As will be seen in example
I, the word
'small' has to be liberally interpreted.
We will always assume that E s
4
< ~ and that s is stationary up to
the 4th order. Thus the quantities
m
=Es
rk
= E (s
rk, ~
= E (.~) - m ) ( ~ . v + k - m)(.~ +~
-m)
r k , j , .i
= E (~)
-
D
exist and are independent
- m) ( ~ u + k
-m)
- m) (g~)+k - m ) ( g v + j
m)(s
-
m)
of ~. Observe that m and r k are defined
as before and, more important, that rk, j must not be confused with Cov (Zk,Zj) for a non-stationary
stochastic
sequence.
We note that, contrary to the situation when linear estimation is studied
(cf remark 6.3), N k - m can not here be considered as an
observation of a 'signal'
~k - m with an independent
Nk - Zk added. The reason is that here properties
'noise'
up to the 4th
order are needed, while for linear estimation only properties to the 2nd order are needed
(cf t h e o r e m
1.6).
up
]92
The quantities
n lkl Ck =
(~. J
Z j=1
-
)
m (s
I
-
m)
and
n-lkl CkN =
j=IZ
will be important
7. I
(Nj - m)(Nj+ k - m)
for the construction
of the estimates.
Esgimation of t h e cova~iances
Suppose
that ~]'''''~n
rk = ~
is a natural
and N I,...,N n are observed.
Ck
estimate
of r k in terms of s
Vat rk = 0( 1 ) under rather general see section
and it is known that
conditions.
Since r N = 6km + r k,
1.6,
=
is a natural
We observe
Then
ckN
I
estimate
_ 6k m
of r kZ in terms
of N.
that E r k = E r k = r~ and will compare Var r k with
Var r k. After some calculations,
cf Grandell
(1971, pp 227-229),
we get n
n Vat r 0 = n Var r O + Var
+ m + 2 In2 + ( 6 - m ) r ~
and for k r 0 and 2k < n
n
(1 E ~.) + 2 E (n-j) r~ n j=1 J n j=-n 'J
+ 2r~,~
193
(n-k) Var rk = (n-k) Vat r k + (n-k)
[m2 + r k~ + 2mr 0 +
~ + rk, k + r0,k] + 2(n-2k)
[Zr~2k + r kZ ,2k]
For large values of n these formulae do not give much information on the behaviour of Var r k. If {~k - m} is a linear process,
closed
forms for lim Var r k exist, but unfortunately these formulae can not n§ be applied to Var r k since {N k - m} is not a linear process. An unpleasant property of estimates of r k is that in general lira n Coy (rk, r~) is equal to a non-zero n-~ A good discussion of estimation Hannan
Example
constant
of covariances
also when k ~ j.
is found in e.g.
(1960, pp 34-45).
I
In order to get some idea of the relation between Var r k and Var r k we consider the case described in section 2.5. These random generations were used by Grandell tion of the estimates.
In spite of what is said above
of r k for large values of n, we have in figure 23
drawn Var
lim--
109-113) as illustra-
It shall be observed that in this case
{~k - m} is not a linear process. using estimates
(1972:2,
~k
n+~ Var r k
as a function of p for s'ome values of k.
194
lira
n~
Vat
~k
Vat
r~ k
I
0.25
Figure
23:
Illustration a n d Var r_ K
Consider
the
estimate
of the
I
1
I
0,50
0.75
I
asymptotic
relation
between
V a r rk
.
r0"
F o r this
estimate
we h a v e
(cf G r a n d e l l
(1972:2, p 110))
l i m n V a r ro =
and thus must
to f u l f i l
the r a t h e r
modest
requirement
V a ~ r r0 ~ 0.1 we
have
n
Thus,
13 1 + p + 21 I - p
~
100 {13
as e x a m p l e s ,
p = 0.25,
n
~
6000
1 + p + 21} l - p
we m u s t if
have
n
p = 0.50
.
~
3400 and
if n
~
p = 0, n 11200
if
~
4267
if
p = 0.75.
195
7.2
Estimation of the spectral density
Assume that F ~ is absolutely continuous and consider estimates of the spectral density f~ (see section 1.6). In section A3 a short discussion of spectral estimation is given. Suppose, like in section 7.1, that Z1,...,~n and NI,...,N n are observed. Since
f~(x) : 2~m + f~(x) (see section 1.6) it is natural to compare the estimates
~(x)
:
I
n-1
z
2~n
(n)(x)
wk
z
Ck e
-ikx
k=-n+1 and
Rx) :
I 2~n
I
2wn
of fs
n-1 (n) (x) N e-ikx _ m__ = k=_~n+1 Wk Ck 2w
n-1 E
(n)(x) ( N nm6k ) e-ikx wk Ck -
k=-n+ I
The coefficients w~n)(x) correspond to the chosen weight
function Wn(Y:X ). Since E f(x) = E f~(x) we do not consider the bias of the estimates.
If s is Quasi-normal, see section A3, we have good knowledge of the asymptotic behaviour of Var f~(x), see theorem A3.5. It is thus natural to investigate under which conditions on Z also N is quasi-normal, since then we also have good knowledge of the asymptotic behaviour of
Va~ }(~).
Put
. . - r.r. Pk,j,i = rk,j,i - rk r l-j J 1-k - r.r. i j-k"
is quasi-normal if, in addition to the general assumptions given in the beginning of this section
196
oo
z
I < Iik
<
*
k=-~
and
z
Ip~k,j,il
<
~
9
k,j ,i~Z
Theorem
I
Let s be a quasi-normal if
E k,j~Z
Jrk,-'j k I
<
sequence.
Then N is quasi-normal
if and only
~
Proof Since s is quasi-normal
it is stationary
up to the 4th order. oo
too is stationary
up to the 4th order.
Since
Z
s irkl < ~
Then N and
k=-oo oo
since rN = 6km + r~
also
N
Put
Z
IrNI < ~ .
s
d~,j,i : Pk,j,i
N
- Pk,j,i
N =
Pk,j,i
r
. k,g,i
-
where
NN rkr.
1-j.
-
NN r.r.
8 m-k
-
NN r.r.
i a-k
and N
rk,j ,i : E (N O - m)(N k - m)(Nj
Since
E
k , j ,ieZ if and only if
s .IPk,4, i] Z k,j ,i~Z
< ~
we h a v e
- m)(N i - m)
E
k , j ,i~Z I~,j,iI~
(d = dk,j, i)
<
< ~. it is enough to consider the
the case 0 < k < j < i. After rather lengthy that
N '~lPdj,il
calculations
it follows
197
d = 0
if
O
if
O=k
d = rk, i
if
O
< i
d = rk, j
if
O
=i
r.~ s l + 3r0,i
if
O=k=g
d = r k + 3 r k~ ,k
if
O
=i
if
O=k
= i .
d=r.
d=
d=
j,i
r~. + r.
g
+r
j ,j
O,j
.
In order to illustrate the calculations we consider the case 0 = k < j = i. In this case
d = ~ ~0 - E ~Z0
- ~)2(~j
-m)~-
- m)2(Zj
- m) 2] + (r 0)
= ~ [i~o _ ~)2(~j
(r0N)2 - 2(rjN.)2
2
~)2
+ 2(r
- m ) 2] - E [ ( s
m)2(s
=
m)2] - m2 - 2mr~
g
Since
E [(N O - m ) 2 ( N j
= s[~
- m) 2] =
{(N o - ~)2 I ~o,s
s {(Nj - m) 2 [ ~o,~j}]
= E {m + (~0 - m) + (~0 - m ) 2 } { m
=
+ m
+
+ (~j
+
- m) + (Zj
- ml*I*j
- m) 2} =
-
mS
we get
d = r L. + r ~. s J J ,J + r0, j co
From the different forms of d it follows, since
k,jZiEz.
,j,i
I < ~ if
and o n l y i f
E Irk,jl k ,jEZ
s
Z ,,Irkl < ~, that k=-co < co
198
If both ~ and N are quasi-normal
processes,
it follows
from theorem
A3.5 that
Var f~(x)
2w(fZ(x))2 n
~ w2( n
--7
:x) y
and 2 m
Var f(x) ~
2~(f~(x) + ~ n
)
S W2n(Y:X) dy -w
if f~(x) > 0 and x # 0,w . (For x = 0 the figure 2 has to be changed to 4). Thus for x ~ w and f~(x) > 0 we have
Vat f(x) lim
independent
Example
of the chosen weight
2 (continuation
Consider
f~(x) + =
functions.
of example
the random generations
I)
described
in section
2.5.
In this
case we have for 0 < k < j < i
m
=
I
rk
= p
rk, j
= 2p J
Pk,j,i
= 8pz - 2PZ+J-k
k
In generations p = 0.75.
GI-G2 we have p = 0 and in generations
From theorem
G4-G6 we have
I it follows that both Z and N are quasi-
normal.
Consider the simple
'truncated'
choose m500 = 10. The reason
estimate
described
in section A3. We
for such a small choice of m500 is two-
199
fold. Firstly the standard deviations of the estimates of rk are rather large, and secondly the estimates r~ and rk based on the random generations GI-G6 are given by Grandell (1972, pp 112-113) for k = 0,1,...,10 and we want to use the same random generations as illustration of the estimates of the spectral density.
For the random generations GI-G6 we have calculated f~(1~) and ^ p~ f(10 ) for p = 0,I,...,9. In figures 25-27 and 29-31 these estimates are drawn. In figures 24 and 28 the values of
Var f~(
) and
^
Var f(10 ) , taken from the asymptotic formulae, are drawn for p = 0,I,...,9 .
0
0
i .....
i ....
i ....
| ....
~ ....
1
2
3
4
5
i ....
6
i ....
7
i .....
i
8
9
p
Fi6ure 24: Standard deviations in the case p = 0. The solid lines connect the approximative / V a r and the dotted lines the / V a r
f (i~)'- values
f~(1~)'- values.
200
0.3 0.2 0.i 0 0
!
2
3
4
5
6
7
8
9
Figure 25: lllustration of estimates for generation GI. The solid lines connect the f (10) - values and the dotted lines t h e f (10) - v a l u e s .
The ' + ' - s i g n s
represent
fl(~).
0,3 0.2 9 .+ .... +
/ ...+....-F
0.i I
I
0 0
i
I
I
I
I
I
I
I
I
2
3
4
5
6
7
8
9
Figure 26: lllustration of estimates for generation G2. The solid lines connect the f (10) - values and the dotted lines the f (i~) - values. The '+'-signs represent fl(1~).
201
0.3
0.2 0,iI I
I
I
I
0
1
2
3
I
]
I
I
I
I
4
5
6
7
8
9
Figure 27: lllustration of estimates
for generation G3. The solid
lines connect the f (I0) pw - values and the dotted lines
~.p~.
the f ~i0 ) - values.
The
'+'-signs represent
Dw fg(~).
i
0.2 ?
J
] .....
0
l
2
l ......
3
I ......
4
4 .....
~ .....
,r ....
j ....
j
5
6
7
8
9
Figure 28: Standard deviations
in the case 0 = 0.75. The solid
lines connect the approximative and the dotted lines the / V a r
/Var
f (i~i - values
f ~i0 ) - values.
202
1.0 I
0.5
0
i
2
3
4
5
6
7
8
9
Figure 29: lllustration of estimates for generation G4. The solid lines connect the f (10) pw - values and the dotted lines the f ~i0) - values. The '+'-signs represent f/(
).
203
+ 1.0
o.5
o
i
2
3
Figure 30: Illustration
4 ~5
/ 7
of estimates
8 ~'-"
for generation G5. The solid
lines connect the f (i~) - values and the dotted lines the f [10 ) - values.
The
'+'-signs represent
Dw fg(~).
204
05
....~
0
i
2
3
4
5
6
7
8
....
+ 9
Figure 31: lllustration of estimates for generation G6. The solid lines connect the f (i~) - values and the dotted lines the f~(1~)U -values.
The '+'-signs represent fg(~w).iu
20.5
At.
POINT PROCESSES AND RANDOM MEASURES
In this survey, results given without references taken from Jagers
or proofs, are
(1974).
Let X be a locally compact Hausdorff topological basis. X is then o-compact
space with countable
and metrizable with a complete metric.
Let B(X) be the Borel algebra on X, i.e. the o-algebra generated by open sets, and let M be the set of Borel measures The set of continuous C K. Let ~ , ~ I , ~ 2 , . . . ~ M
X
on (X,B(X)).
functions with compact support is denoted by and define ~n § ~ by
X
for all f ~ C K as n § ~. This definition of convergence
corresponds
naturally to the vague topology on M (cf Bauer (1968, pp 182-191)), which makes M Polish
(cf Kerstan, Matthes and Mecke (1974, p 238))
i.e. separable and metrizable with a complete metric.
Let
B(M)
be
the Borel algebra on M. A class of sets is called a z-system if it is closed under finite intersection. Theorem I Let A ~ B(X) be a w-system containing a basis for the topology o on X. Then {~M
B(M)
is equal to the o-algebra generated by the sets
; w{A} ~ x} for all A 6 A o ,
x6R.
Consider some subspace M'~ M endowed with the relative topology. Borel algebra
B(M')
on M' is equal to { B ~ M '
; B~B(M)}
The
(cf Bauer
(1968, p 166)) and thus equal to the o-algebra generated by {~eM'
; ~{A} ~ x}, A ~ Ao, x E E .
an intersection
M' is Polish if and only if it is
of a countable number of open sets. Especially every
closed subspace of M is Polish.
2o6
Let N be the set of all integer or infinite valued elements in M. N is closed, and thus
N~B(M) and N is Polish.
Let A be a random measure
(see definition
1.1).
Theorem 2 For any B(X)-measurable
function f : X § R which is bounded and has
compact support or is nonnegative random variable
f f(x) A{dx} is a real-valued X (except that the value + ~ might be allowed).
An element v ~ N is said to be sim~le if v{{x)} = 0 or I for all x ( X . The set N
of such elements
o
An element ~ M The set M
is said to be non-atomic
of such elements
o
Definition
is a Borel set, i.e.
N ~ B(N). o
if ~{{x}} : 0 for all x ~ X .
is a Borel set, see t h e o r e m 1.3.
I
A point process N with distribution P is called simple if P{N } = I o and a r a n d o m measure A with distribution H is called non-atomic
if
o
Definition 2 The Laplace t r a n s f o r m L~ of a probability measure ~ on
(M,B(M)) is
the function
f
f /e p E- f f/x) M
= L /f)
x
defined for f ~ CK+ (i.e. for non-negative
continuous
functions with
compact support).
For any two probability measures ~I and ~2 on
(M, BIM)) we define
207
the convolution ~I M H2 by
H1 ~ H2{B} = i ~ 1B(#1 + p2) Rl{dPl} H2{dP2} for
B
6B(M)
where
IB(~) =
I
if
p6B
0
if
IJ~B
and where Pl + U2 is defined by (~I + ~ 2 ) { A }
= ~I{A}
+ >2{A}
for a~l A~8(•
If A I and A 2 are two independent
random measures with distributions
H I and H2, then A I + A 2 has distribution H I ~ H 2,
Theorem 3 (Uniqueness) A probability measure ~ on (M,B(M)) is uniquely determined by L H
Theorem 4 Let A be a random measure with distribution H and let A C 8(X) o be a w-system
containing a basis for the topology on X. Then H is
uniquely determined by all distributions
of (A{A]} .... ,A{Aj}) for
j = 1,2,... and bounded sets AI,...,A j ~ A o.
The following theorem is essentially due to MSneh tion follows from Kallenberg
(1972). Our formula-
(1973, p 11).
Theorem 5 Let N be a simple point process with distribution P and let A C B(X) be an algebra containing a basis for the topology on X. Then P is uniquely determined by all Pr{N{A} = O} for bounded A 6 A .
208
Let S be a metric space with Borel algebra B(S) and let ~,~i,~2,... be S-valued random variables with distributions ~,H1,n2,...
on
(S,S(S)).
Definition 3 If f _ f dH n § _ f f d~ for all bounded and continuous functions f : S § R S S w we say that H conver~es weakly to H and use the notation H ---* H or --n n d that -~n ~ converges in distribution to ~ and use the notation _ ~n ~ ~"
The standard reference
for the theory of weak convergence
is Billings-
ley (1968) where special attention is given to the Polish spaces CEO,I ~ of continuous
functions
on [-0,I~I endowed with the uniform topo-
logy and D[O,I~j endowed with the Skorohod J1 topology.
We will con-
sider the space DEO,~) later in this section.
Definition 4 A sequence
{Hn} I of probability measures on (S,B(S)) is called tight
if for every ~ > 0 there exists a compact set K C S such' that H {K} > I - ~ for all n. n A sequence {~n)1 is called tight if the corresponding of distributions
sequence {H n)
is tight.
A sequence of probability measures on (S,B(S)) is called relatively compact if each subsequence
of it contains a further subsequence which
is weakly convergent.
For Polish spaces Prohorov~s theorem (cf Billingsley states the equivalence between tightness
(1968, pp 35-40))
and relative compactness,
and this fact explains the importance of tightness.
The main motivation for the study of weak convergence
is that if h
209
is a measurable mapping from S into another metric space S' and if d Sn---* < then also h(6 n)
d h(~5) provided P r { ~
the set of dis-
continuity points of h} = 0. Thus the finer the topology the stronger a weak convergence result.
Consider now convergence in distribution
of random measures.
Theorem 6 (continuity) Let A,A I ,A2, ... be random measures with distributions H ,H I ,H 2, . . . . Then An
d .... A if and only if ~
(f) § ~ ( f )
This result is due to v. Waldenfels
(1968). His proposition is
stronger and formulated for characteristic p 13) gives a similar strengthening
for all f 6 C K +
functionals.
for Laplace transforms.
The following two theorems are weaker formulations Kallenberg
Mecke (1972,
of results due to
(1973, pp 10-11). For any subset A of X we denote its
boundary by 8A.
Theorem 7 Let A,AI,A2,...
be random measures and let A o C B ( X )
~-system containing
a basis
be a
on X s u c h t h a t d Pr(A{~A} = 0} = I for all bounded A ~ A . Then A ---~ A if and only o n if (An{A I} .... ,An{Aj)) all bounded A 1 , . . . , A j ~
d
for the topology
(A{AI],...,A{Aj})
for all j = 1,2 .... and
Ao.
Theorem 8 Let NI,~2,... be point processes, and let A C B ( X ) b e an a l g e b r a
let N be a simple point process
containing
a basis
for the topology d on X such that Pr{N{~A} = 0} = I for all bounded A ~ A . Then N : N n
210
if and only if (i)
Pr{N {A) = 0)
§
Pr{N{A}
= 0)
for all bounded
A
~
Pr{N{A)
> I}
for all bounded A
n
(ii)
Pr{N (A) > I} n
(iii) {Nn) I
is tight.
This is the first time in our discussion tion of random measures needed.
The explicit
where
a tightness
condition
by the following weaker
of convergence condition
is, however,
formulation
in distribu-
is explicitly
easy to remove
of theorem
as seen
8.
Theorem 9 Let N,NI,N2,... if N {A)
d~
and A be as in theorem
N{A)
for all bounded A ~ A
d 8. Then N n ---* N if and only .
n
Proof We have to show that N {A} n Tightness
of {N n}
compact K C X
d
N(A) implies
is equivalent
that {Nn}l is tight.
to tightness
of (Nn(K)} I for all
and thus we only have to show that tightness
of
oo
{Nn{A}) 1 implies tightness
of (Nn{K}} I
Take a compact K C X. Since X can be covered by countably many bounded basis many.
sets it follows that K can be covered by finitely
Thus there exists
> 0 there exists
for all n. Since
space
on s and A, such that
Pr{N {K) < k) > Pr{N {A} < k) n -n --
with left hand limits
DF0,A ~ , A ~ ~, of all rightcontinuous defined
dowed with the Skorohod J1 topology. properties
For every
{Nn{K)) T is tight.
Consider now the function functions
such that A ~ K .
a real number k, depending
Pr{N {A) ~ k} > I - s n it follows that
a bounded A ~ A
as DEO,I],
for which
on E0,A]. Let DE0,A ~ be en-
The space DE0,A ~ has the same
Billingsley
(1968)
is the standard
211
reference. on E0,|
In many situations
it is natural to consider
Let D be the set of all rightcontinuous
functions
functions with
left ha~d limits defined on E0,~). The following topology on D is studied by Lindvall
(1973:1) and (1973:2) who develops
Stone (1963) and Whitt
(1970).
Let F be the set of strictly increa-
sing, continuous mappings of F0, ~) onto itself. identity element of F. Take X,Xl,X2,. .. ~ D .
Let e denote the
Let x n + x mean that
U,C
there exist u where
U
U
F such that Xn~ Yn
~ stands for u n i f o r m convergence
form convergence
ideas due to
x and y n and
on compact subsets of D , ~ ) .
U~C
~
~ e
ands for uni-
With thirstsdefinition
of convergence D is Polish.
Let for A ~ ~ , ~ )
the function r A : D § DE0,A] be the restriction
operator to ~,AI,_ i.e. rA(x)(t) theorem given by Lindvall
= x(t)
, t 6 E0,A]. The following
(1973:2, p 21) and (1973:1, p 120) brings
the question about weak convergence of stochastic processes
in D
back to the finite interval case.
T h e o r e m 10 Let X,X I,X2,... be stochastic processes
in D. Suppose there exists
co
a sequence
{Ai)i= I , A.l > 0 and A.1 § ~ as i § ~
d
rA. (Xn) --~ rA. (X) 1
1
for i = 1,2, . . . .
X
d n
~ X
Then
as
n§
as
n
-~ oo
, such that
212
A2.
HILBERT SPACE AND RANDOM VARIABLES
The reader is assumed to be acquainted with the formal definition of a Hilbert space. A good introduction well suited for our purposes is, however,
given by Cram6r and Leadbetter
(1967, pp 96-104).
Let H be a Hilhert s~ace. Let h,hl,h2~ H. In general
(hl,h 2) denotes
the inner ~roduct between h I and h 2 and Ilhll = (h/~-~,h)denotes the norm of h. Let h,hl,h 2 .... { H .
Convergence h
§ h means that
n
H is complete in its norm. The operations
llhn - hll + O.
of addition and multiplica-
tions with real or complex numbers are defined for the elements in H. If (hl,h 2) is real for all hl,h 2 6 H , space.
then H is called a real Hilbert
Let {hi ; j E J } be a family of elements in H. Let H(J) be the
collection of all finite linear combinations
of elements in {hi
or limits of sequences of such combinations.
H(J) is Hilbert subspace
of H and is called the Hilbert space spanned by {hi denoted by S({hj
; j EJ]).
; jEJ}
; j ~J}
and often
It is obvious that if Jo is a subset of J
then H(J o) is a Hilbert subspace of H(J). For our applications
of
Hilbert space geometry the following theorem is of great importance.
Theorem I. Let H h
o
6H
The projection theorem
be a Hilbert subspace of H and let h 6 H .
o
called the projection of h on H
o
two equivalent
llh
o
There exists a unique
which satisfies the following
conditions:
- hJl
= rain Ilu - hPi uGH
o
o (ii)
(h ~ - h, u) = 0
Further
Ilho-hll2=
V uE H0
[Ihll 2 -
Ilholl 2
Our formulation of the projection theorem is close to the formulation given by Parzen
(1959, p 306).
213
We sometimes denote the projection of h on H
o
by E(h I H ). Projeco
tions have similar properties as conditional expectations (cf Doob (1953, p 155)). Examples of such properties are E(alh I + a2h 2 I H o) = alE(h I I H o) + a2E(h 2 I H o) where a I and a2 are real or complex numbers and E(E(h I H2) I H I) = E(h I H I) for H I C H 2 C
H .
Consider now a measure space (~,A,~) where ~ is an arbitrary set, A a a-algebra of subsets of ~ and p a measure on A. From the RieszFischer theorem it follows that the set of all square integrable Ameasurable functions f forms a Hilbert space with (fl,f2) = f flf2 d~ if functions differing on a set of u-measure zero are identified. This Hilbert space is denoted by L2(~,A,p) or shorter by L2(P). If only real-valued functions are considered L2(P) will denote the corresponding real Hilbert space.
Let f,fl,f2 .... ~L2(p) be functions such that fn § f' i.e. llfn - fll § 0. Then there exists a subsequence f such that f § f nk nk
a.e. (p) when
k § ~, i.e. there exists a set E with ~{E) = 0 such that lira f (~) = f(~) for all ~ k+~ nk
\E.
The case when ~ is a probability measure will be of a particular interest to us. In that case the measure is denoted by P and the functions are called random variables. For hl,h 2 ~ L 2 ( P ) we have (hl,h 2) = E hlh 2.
Consider a family (Xj ; j ~ J ) for all j ~ J
of random variables in L2(P). If E X.~ = 0
the inner product in S({Xj ; j ~ J } )
variance. Let h @ L 2 ( P ) .
is equal to the co-
Let H(J) denote the space S({Xj
; j~J),
I)
where I is the constant one. E(h I H(J)) is called the best linear estimate of h in terms of {X. ; j ~ J ) . J For reference reasons the following simple result is given as a theorem.
214
Theorem 2 E(h
i H(J))
= E h + E [ h - Eh
I S ( { X . - EX. J J
; Js
Proof It follows
from theorem
E[{E(h - Eh ueH(J).
I that we have to show that
I S({X. - EX. 9 j ~ J } ) ) J J '
- (h - Eh)} u] = 0
It is enough to consider u = I and Xj, j ~ J .
u = (u - Eu) + Eu. Since u - E u 6 S ( { X j - EXj E(X. - EX.) = 0 for all j ~ J the result J J
A3.
; j6J})
Put and since
follows.
[]
SOME TIME SERIES ANALYSIS
Consider a real-valued time series or stationary {X. J
for all
; j~Z}
such that
sequence
E X. = O, V a r X. = r < ~ and j j o
Coy (Xj,Xj+k) = r k. Then
rk =
i
eikx
F{dx}
-7
where the s~ectral distribution rightcontinuous
function F(x) is non-decreasing,
and b o u n d e d and further n o r m a l i z e d by F(-w) = 0.
Although the time series is real-valued it is convenient to use the complex form of the spectral representation.
A derivative
f of
the absolutely continuous
component of F is called spectral density.
Since {X.} is real-valued J
f(x) is symmetric.
f(x) ~ e I > 0 for all x E [ - ~ , ~ restriction.
We assume that
since for our purposes this is no
The time series itself has the spectral representation
xk =
f ei~z{d~} --IT
215
where,
in differential
notations,
E(Z(dx} Z(dy})
the process
F{dx}
if
x = y
0
if
x r y
Z(x) fulfils
=
(The reader is assumed to be acquainted
with the formal defini-
tion of this kind of representations.)
Define the Hilbert spaces

$$L_n = S\{X_j \; ; \; j \leq n\}, \qquad L = S\{X_j \; ; \; j \in Z\}$$

with inner product $E\, h_1 \overline{h_2}$, and

$$\hat{L}_n = S\{e^{ijx} \; ; \; j \leq n\}, \qquad \hat{L} = S\{e^{ijx} \; ; \; j \in Z\}$$

with inner product $\int_{-\pi}^{\pi} h_1(x)\, \overline{h_2(x)}\, F\{dx\}$. For all $n$ (including $\infty$) $L_n$ and $\hat{L}_n$ are isomorphic under the linear mapping with $X_j \leftrightarrow e^{ijx}$, $j \leq n$.
For any integrable function $h$ from $[-\pi, \pi]$ into the complex plane we use the notation

$$h(x) \doteq \sum_{k=-\infty}^{\infty} h_k e^{ikx} \quad \text{where} \quad h_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} h(x)\, dx.$$

The sign $\doteq$ means merely that $h$ corresponds to its Fourier series, see for example Doob (1953, p 150). For square integrable functions $h$ we define $[h(x)]_n$ by

$$[h(x)]_n = \sum_{k=-\infty}^{n} h_k e^{ikx}.$$
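A small numerical sketch of these notations, assuming NumPy and an arbitrary smooth $h$ of our own choosing:

```python
import numpy as np

# Fourier coefficients h_k and the truncation [h(x)]_n on a grid
M = 4096
x = np.linspace(-np.pi, np.pi, M, endpoint=False)
h = np.exp(np.cos(x)) + 0.5*np.sin(3*x)       # an arbitrary smooth example

def coeff(k):
    # h_k = (1/2 pi) ∫ e^{-ikx} h(x) dx, approximated by a Riemann sum
    return (np.exp(-1j*k*x) * h).sum() / M

K = 40                                        # numerical cutoff for the tail
hn = sum(coeff(k)*np.exp(1j*k*x) for k in range(-K, 5 + 1))  # ~ [h(x)]_5

# with both tails kept the Fourier series reproduces h
full = sum(coeff(k)*np.exp(1j*k*x) for k in range(-K, K + 1))
print(np.allclose(full.real, h, atol=1e-8))   # True
```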
Consider a real-valued random variable $\xi$ with $E \xi = 0$ and $\operatorname{Var} \xi = \sigma^2$, correlated with $\{X_j\}$, and define $\hat{\xi} = E(\xi \mid L)$ and $\hat{\xi}_n = E(\xi \mid L_n)$. Put $\rho_k = E(\xi X_k)$. From theorem A2.1 we have $E\, \hat{\xi} X_k = \rho_k$, and since $\hat{\xi} \in L$ we have $\hat{\xi} = \int h(x)\, Z\{dx\}$ for some function $h$ with $\int |h(x)|^2\, F\{dx\} < \infty$ and $\rho_k = \int_{-\pi}^{\pi} e^{ikx} h(x)\, F\{dx\}$. Thus if $F$ is absolutely continuous there corresponds to $\xi$ a function

$$\varphi(x) = h(x) f(x) \doteq \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \rho_k e^{-ikx}$$

with

$$\int_{-\pi}^{\pi} \frac{|\varphi(x)|^2}{f(x)}\, dx < \infty.$$

The function $\varphi$ will be called the cross spectral density.
Consider the function $g(z)$, $z$ complex, defined by

$$g(z) = \exp\left\{ \frac{1}{4\pi} \int_{-\pi}^{\pi} \frac{e^{-iy} + z}{e^{-iy} - z}\, \log f(y)\, dy \right\}$$

for $|z| < 1$. $g(z)$ is analytic and without zeros in the unit circle $|z| < 1$ (cf. Grenander and Rosenblatt (1956, pp 67-69)) and thus

$$g(z) = \sum_{k=0}^{\infty} g_k z^k \quad \text{for } |z| < 1.$$

Further

$$g(e^{ix}) = \lim_{r \uparrow 1} g(r e^{ix})$$

fulfils $g(e^{ix})\, g(e^{-ix}) = f(x)$. Since $f(x)$ is symmetric we have $\overline{g(e^{ix})} = g(e^{-ix})$. The function $1/g(z)$ is analytic in $|z| < 1$.
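The factorization can be checked numerically; the sketch below (assuming NumPy) evaluates the defining integral for the AR(1) density used above, for which the closed form of $g$ is elementary:

```python
import numpy as np

# AR(1) density f(y) = s2/(2 pi |1 - a e^{-iy}|^2); its outer function is
# known in closed form: g(z) = sqrt(s2/(2 pi)) / (1 - a z)
a, s2 = 0.6, 1.0
y = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
dy = y[1] - y[0]
f = s2 / (2*np.pi*np.abs(1 - a*np.exp(-1j*y))**2)

def g(z):
    # g(z) = exp{ (1/4 pi) ∫ (e^{-iy} + z)/(e^{-iy} - z) log f(y) dy }
    e = np.exp(-1j*y)
    return np.exp(((e + z)/(e - z) * np.log(f)).sum() * dy / (4*np.pi))

for z in (0.3, 0.5*np.exp(1j), 0.8*np.exp(-2j)):
    print(g(z), np.sqrt(s2/(2*np.pi))/(1 - a*z))   # pairs agree closely
```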
Following essentially Doob (1953, pp 590-594) we get the following two theorems.
Theorem 1

Let $\{X_k\}$ have an absolutely continuous spectral distribution with spectral density $f$ and let $\xi$ be a random variable which has the cross spectral density $\varphi$ to $\{X_k\}$, mean $0$ and variance $\sigma^2$. Then

$$\hat{\xi} = \int_{-\pi}^{\pi} \frac{\varphi(x)}{f(x)}\, Z\{dx\}$$

and further

$$E(\xi - \hat{\xi})^2 = \sigma^2 - \int_{-\pi}^{\pi} \frac{|\varphi(x)|^2}{f(x)}\, dx.$$

Proof

Since

$$E\, \xi X_k = \int_{-\pi}^{\pi} e^{ikx} \varphi(x)\, dx = \int_{-\pi}^{\pi} e^{ikx}\, \frac{\varphi(x)}{f(x)}\, f(x)\, dx = E\, \hat{\xi} X_k$$

the result follows from theorem A2.1. □
Theorem 2

Let $\{X_k\}$ and $\xi$ be as in theorem 1. Then

$$\hat{\xi}_n = \int_{-\pi}^{\pi} \frac{1}{g(e^{-ix})} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n Z\{dx\}$$

and further

$$E(\xi - \hat{\xi}_n)^2 = \sigma^2 - \int_{-\pi}^{\pi} \left| \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 dx.$$

Proof

Let $\hat{\xi}_n$ be as in the formulation of the theorem. Since $|g(e^{-ix})|^2 = f(x)$ we have

$$\int_{-\pi}^{\pi} \left| \frac{1}{g(e^{-ix})} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 f(x)\, dx = \int_{-\pi}^{\pi} \left| \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 dx < \infty,$$

and it follows that $\hat{\xi}_n \in L_n$. Thus it follows from theorem A2.1 that the theorem is proved if we show that $E\, \xi X_k = E\, \hat{\xi}_n X_k$ for $k \leq n$. We have

$$E\, \hat{\xi}_n X_k = \int_{-\pi}^{\pi} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \frac{1}{g(e^{-ix})}\, e^{-ikx} f(x)\, dx = \int_{-\pi}^{\pi} \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n g(e^{ix})\, e^{-ikx}\, dx.$$

From the Fourier series corresponding to the functions in the integral it follows that $E\, \xi X_k = E\, \hat{\xi}_n X_k$ for $k \leq n$. Further

$$E(\xi - \hat{\xi}_n)^2 = \sigma^2 - E\, \hat{\xi}_n^2 = \sigma^2 - \int_{-\pi}^{\pi} \left| \left[ \frac{\varphi(x)}{g(e^{ix})} \right]_n \right|^2 dx. \qquad \square$$
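Theorem 2 can be illustrated numerically. For an AR(1) sequence the best linear predictor of $X_{n+1}$ from the past $\{X_j \; ; \; j \leq n\}$ is known to be $a X_n$, with prediction variance equal to the innovation variance; the sketch below (assuming NumPy, and using a long finite past as a numerical stand-in for the spectral formula) recovers this:

```python
import numpy as np

# illustrative AR(1) example with covariance r_k = s2 a^|k|/(1 - a^2)
a, s2, p = 0.6, 1.0, 30
r = lambda k: s2 * a**abs(k) / (1 - a**2)

# normal equations for predicting X_{n+1} from X_{n-p+1}, ..., X_n
R = np.array([[r(i - j) for j in range(p)] for i in range(p)])
rho = np.array([r(p - i) for i in range(p)])      # Cov(X_{n+1}, past)
c = np.linalg.solve(R, rho)

print(np.round(c[-3:], 6))        # ~ (0, 0, a): all weight on X_n
print(r(0) - rho @ c)             # ~ s2: the prediction variance
```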
The following theorem is given by Rozanov (1967, pp 77-78 and 201).
Theorem 3

Any element $h \in L$ has the unique representation

$$h = \sum_{k=-\infty}^{\infty} h_k X_k,$$

which converges in mean square to the same limit independent of the order of summation, if and only if $F$ is absolutely continuous and $c_1 \leq f(x) \leq c_2$ for almost all $x \in [-\pi, \pi]$, where $0 < c_1 \leq c_2 < \infty$.
Proof

The theorem is proved by Rozanov (1960). The 'if' part is also given by Rozanov (1967, p 78) and will be reproduced here because of our interest in this kind of results.
For any $h = \int_{-\pi}^{\pi} h(x)\, Z\{dx\}$ we have

$$c_1 \int_{-\pi}^{\pi} |h(x)|^2\, dx \;\leq\; E\, |h|^2 \;\leq\; c_2 \int_{-\pi}^{\pi} |h(x)|^2\, dx$$

since $c_1 \leq f(x) \leq c_2$. Consider

$$\sum_{k=-N}^{N} h_k X_k \quad \text{where} \quad h_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} h(x)\, dx.$$

We have

$$E\, \Bigl| h - \sum_{k=-N}^{N} h_k X_k \Bigr|^2 = \int_{-\pi}^{\pi} \Bigl| h(x) - \sum_{k=-N}^{N} h_k e^{ikx} \Bigr|^2 f(x)\, dx \leq c_2 \int_{-\pi}^{\pi} \Bigl| h(x) - \sum_{k=-N}^{N} h_k e^{ikx} \Bigr|^2 dx = c_2\, 2\pi \sum_{|k| > N} |h_k|^2 \to 0.$$

Thus $h$ has the representation of the theorem. The representation is unique since if $h = \sum g_k X_k$ is a representation of $h$ we have

$$0 = \lim_{N \to \infty} E\, \Bigl| h - \sum_{k=-N}^{N} g_k X_k \Bigr|^2 \geq \lim_{N \to \infty} c_1 \int_{-\pi}^{\pi} \Bigl| h(x) - \sum_{k=-N}^{N} g_k e^{ikx} \Bigr|^2 dx = c_1\, 2\pi \sum_{k=-\infty}^{\infty} |h_k - g_k|^2.$$

To complete the proof it suffices to remark that the series $\sum h_k e^{ikx}$ converges to $h(x)$ in mean square for any permutation of its terms. □
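The norm equivalence behind the proof is easily checked numerically (assuming NumPy; the density and the element $h$ below are arbitrary choices of ours):

```python
import numpy as np

# when c1 <= f <= c2:  c1 ∫|h|^2 dx  <=  E|h|^2 = ∫|h|^2 f dx  <=  c2 ∫|h|^2 dx
a, s2 = 0.6, 1.0
x = np.linspace(-np.pi, np.pi, 8192, endpoint=False)
dx = x[1] - x[0]
f = s2 / (2*np.pi*(1 - 2*a*np.cos(x) + a**2))
c1, c2 = f.min(), f.max()

h = np.cos(2*x) + 1j*np.sin(5*x) + 0.3        # an arbitrary element
l2 = (np.abs(h)**2).sum() * dx                # ∫ |h(x)|^2 dx
nF = (np.abs(h)**2 * f).sum() * dx            # E|h|^2 under F{dx} = f dx
print(c1*l2 <= nF <= c2*l2)                   # True
```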
Consider now the stationary sequence $Y_k = m + X_k$ where $m$ is an unknown constant. Suppose that $Y_k$ is observed for $k = 1, \ldots, n$. We use the notations

$$\underline{Y}_n = (Y_1, \ldots, Y_n), \qquad \underline{1}_n = (1, \ldots, 1) \quad (n \text{ components})$$

and

$$R_n = E\, (\underline{Y}_n - m \underline{1}_n)' (\underline{Y}_n - m \underline{1}_n).$$

$R_n$ is assumed to be positive definite. Then the best linear unbiased estimate $m_n^*$ of $m$ in terms of $\underline{Y}_n$ is given by

$$m_n^* = \frac{\underline{1}_n R_n^{-1} \underline{Y}_n'}{\underline{1}_n R_n^{-1} \underline{1}_n'}$$

and further

$$\operatorname{Var} m_n^* = \frac{1}{\underline{1}_n R_n^{-1} \underline{1}_n'}.$$

It is natural to compare this estimate with $\overline{Y}_n = \frac{1}{n} \sum_{k=1}^{n} Y_k$. From e.g. Grenander and Rosenblatt (1956, pp 89-90) it follows that

$$\lim_{n \to \infty} \operatorname{Var} m_n^* = \lim_{n \to \infty} \operatorname{Var} \overline{Y}_n = F(0) - F(0-).$$
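A numerical illustration of the two variance formulas for an AR(1) covariance (assuming NumPy; the parameters are arbitrary):

```python
import numpy as np

# BLUE of the mean versus the sample mean under an AR(1) covariance
a, s2, n = 0.6, 1.0, 200
idx = np.arange(n)
R = s2 * a**np.abs(idx[:, None] - idx[None, :]) / (1 - a**2)
one = np.ones(n)

var_blue = 1.0 / (one @ np.linalg.solve(R, one))   # 1 / (1 R^{-1} 1')
var_mean = one @ R @ one / n**2                    # Var of the sample mean

# both n*variances approach 2 pi f(0) = s2/(1 - a)^2 = 6.25 (theorem 4)
print(n*var_blue, n*var_mean, s2/(1 - a)**2)
```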
Theorem 4

If $F(x)$ is absolutely continuous in a neighbourhood of $x = 0$, and if there is a version $f(x)$ of $F'(x)$ which is continuous in a neighbourhood of $x = 0$ with $f(x) > c_1 > 0$, then

$$\lim_{n \to \infty} n \operatorname{Var} m_n^* = \lim_{n \to \infty} n \operatorname{Var} \overline{Y}_n = 2\pi f(0).$$
Proof

From e.g. Grenander and Szegő (1958, p 211) it follows that the theorem is true if $F$ is absolutely continuous and if $f(x)$ is continuous. We will now use an argument due to Grenander (1951, p 567) to show the theorem.
There exists $\varepsilon > 0$ such that $F(x)$ is absolutely continuous and $f(x)$ is continuous for $x \in [-\varepsilon, \varepsilon]$. Define

$$f_1(x) = \begin{cases} c_1 & \text{for } x \in [-\pi, -\varepsilon) \\[1ex] 2\left(1 + \dfrac{x}{\varepsilon}\right) f(x) - \left(\dfrac{2x}{\varepsilon} + 1\right) c_1 & \text{for } x \in [-\varepsilon, -\tfrac{\varepsilon}{2}) \\[1ex] f(x) & \text{for } x \in [-\tfrac{\varepsilon}{2}, \tfrac{\varepsilon}{2}] \\[1ex] 2\left(1 - \dfrac{x}{\varepsilon}\right) f(x) + \left(\dfrac{2x}{\varepsilon} - 1\right) c_1 & \text{for } x \in (\tfrac{\varepsilon}{2}, \varepsilon] \\[1ex] c_1 & \text{for } x \in (\varepsilon, \pi]. \end{cases}$$

With this construction $f_1(x)$ is continuous and $f_1(x) \leq f(x)$ for all $x \in [-\pi, \pi]$.
Following Grenander (1951, p 567) we split $\{X_k\}$ in two orthogonal components $\{X_k^{(1)}\}$ and $\{X_k^{(2)}\}$, such that $X_k = X_k^{(1)} + X_k^{(2)}$, where $\{X_k^{(1)}\}$ has absolutely continuous spectral distribution with spectral density $f_1(x)$. Since, if $F_2$ is the spectral distribution of $\{X_k^{(2)}\}$, $F_2(\tfrac{\varepsilon}{2}) - F_2(-\tfrac{\varepsilon}{2}) = 0$, we have

$$\lim_{n \to \infty} n \operatorname{Var} \overline{Y}_n = 2\pi f_1(0) = 2\pi f(0),$$

from which we get

$$\limsup_{n \to \infty} n \operatorname{Var} m_n^* \leq 2\pi f(0).$$
On the other hand, if $A_n$ is the set of vectors $\underline{a}_n = (a_1, \ldots, a_n)$ with $\underline{a}_n \underline{1}_n' = 1$,

$$\operatorname{Var} m_n^* = \inf_{\underline{a}_n \in A_n} \operatorname{Var}(\underline{a}_n \underline{Y}_n') \geq \inf_{\underline{a}_n \in A_n} \operatorname{Var}\Bigl( \sum_{j=1}^{n} a_j X_j^{(1)} \Bigr)$$

and thus

$$\liminf_{n \to \infty} n \operatorname{Var} m_n^* \geq 2\pi f_1(0) = 2\pi f(0),$$

which proves the theorem. □

We will now consider the estimation of the spectral density. This is one of the most studied problems of time series analysis, and was first studied for normal processes. We will consider only (asymptotic) mean values and variances of the estimates, and therefore only assumptions on moments up to the 4th order are required. Let $\{X_k \; ; \; k \in Z\}$ be a time series with known mean value $E X_k = m$, such that $\sum_{k=-\infty}^{\infty} |r_k| < \infty$ and such that

$$E\, (X_\nu - m)(X_{\nu+k} - m)(X_{\nu+j} - m)(X_{\nu+i} - m)$$

exists and is independent of $\nu$ for all $k, j, i \in Z$. Put

$$\rho_{k,j,i} = E\, (X_\nu - m)(X_{\nu+k} - m)(X_{\nu+j} - m)(X_{\nu+i} - m) - r_k r_{i-j} - r_j r_{i-k} - r_i r_{j-k},$$

a quantity which for a normal process is equal to zero. Let us further assume that

$$\sum_{k,j,i \in Z} |\rho_{k,j,i}| < \infty.$$

A time series fulfilling all the assumptions above will be called quasi-normal.
E Irkl < = it follows that {X~}~ has an absolutely continuous k=-~ spectral distribution and that the spectral density f can be chosen
From
continuous
and bounded.
Further
oo
1
f(x) = T w
z
rk e
-ikx
k_--oo
where the sum is absolutely
f(x)
convergent,
n
I
~ -2w
z
and thus
-ikx r k
e
k=-n
if f(x) > O.
Let $X_1, \ldots, X_n$ be observed and put

$$c_k = \sum_{j=1}^{n-|k|} (X_j - m)(X_{j+|k|} - m).$$
Then the periodogram

$$I_n(x) = \frac{1}{2\pi n} \sum_{k=-n+1}^{n-1} c_k e^{-ikx} = \frac{1}{2\pi n} \Bigl| \sum_{k=1}^{n} (X_k - m) e^{-ikx} \Bigr|^2$$

might seem to be a good estimate of $f(x)$. This estimate has, however, some unpleasant properties and we are led to consider weighted estimates of the form
$$f_n^*(x) = \int_{-\pi}^{\pi} W_n(y:x)\, I_n(y)\, dy$$

where

$$\int_{-\pi}^{\pi} W_n(y:x)\, dy = 1$$

and where the weight functions $W_n(y:x)$ for all $x$ accumulate mass in the neighbourhood of $y = x$ at a 'suitable' rate as $n \to \infty$. Put

$$w_k^{(n)}(x) = \int_{-\pi}^{\pi} e^{ik(x-y)}\, W_n(y:x)\, dy$$

and thus we get

$$f_n^*(x) = \frac{1}{2\pi n} \sum_{k=-n+1}^{n-1} w_k^{(n)}(x)\, c_k\, e^{-ikx}.$$

Usually only estimates $f_n^*(x)$ where $w_k^{(n)}(x)$ is independent of $x$ and where $w_k^{(n)}(x) = 0$ for $|k| > m_n$ ($m_n$ much smaller than $n$) are considered.
The simplest such estimate is the 'truncated' estimate (cf e.g. Grenander and Rosenblatt (1956, p 148)) given by

$$f_n^*(x) = \frac{1}{2\pi n} \sum_{k=-m_n}^{m_n} c_k\, e^{-ikx}$$

where $m_n \to \infty$ as $n \to \infty$ in such a way that $m_n / n \to 0$.
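An illustrative simulation of the 'truncated' estimate (assuming NumPy; the AR(1) model and all parameters are ad hoc choices):

```python
import numpy as np

# truncated estimate for a simulated AR(1) series with known spectral
# density; m = 0 is known here, and the estimate is random but close
rng = np.random.default_rng(2)
a, s2, n, m_n = 0.6, 1.0, 4000, 25
X = np.zeros(n)
for j in range(1, n):
    X[j] = a*X[j-1] + rng.standard_normal()

def c(k):
    k = abs(k)
    return np.dot(X[:n - k], X[k:])           # c_k with m = 0

x = np.linspace(0.0, np.pi, 7)
f_true = s2 / (2*np.pi*(1 - 2*a*np.cos(x) + a**2))
f_est = np.array([sum(c(k)*np.exp(-1j*k*x0) for k in range(-m_n, m_n + 1)).real
                  for x0 in x]) / (2*np.pi*n)
print(np.round(f_true, 3))
print(np.round(f_est, 3))
```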
The following theorem is taken from Rosenblatt (1959, pp 253-255), where also the required conditions on the sequence of weight functions $W_n(y:x)$ are given.
Theorem 5

Let $\{X_k \; ; \; k \in Z\}$ be a quasi-normal time series and let $W_n(y:x)$ be a sequence of 'suitable' weight functions. Then

$$\operatorname{Var} f_n^*(x) \sim \begin{cases} \dfrac{2\pi}{n}\, f^2(x) \displaystyle\int_{-\pi}^{\pi} W_n^2(y:x)\, dy & \text{if } x \neq 0, \pm\pi \\[2ex] \dfrac{4\pi}{n}\, f^2(x) \displaystyle\int_{-\pi}^{\pi} W_n^2(y:x)\, dy & \text{if } x = 0, \pm\pi \end{cases}$$

if $f(x) > 0$. Further, if $0 \leq x_1 < x_2 \leq \pi$ and if $f(x_1) > 0$ and $f(x_2) > 0$, the estimates $f_n^*(x_1)$ and $f_n^*(x_2)$ are asymptotically uncorrelated.
It may be observed that $\operatorname{Var} f_n^*(x)$ tends to zero slower than $\frac{1}{n}$ since $\int_{-\pi}^{\pi} W_n^2(y:x)\, dy \to \infty$ as $n \to \infty$. For the 'truncated' estimate we have

$$\int_{-\pi}^{\pi} W_n^2(y:x)\, dy \sim \frac{m_n}{\pi}.$$
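The variance approximation of theorem 5, combined with this evaluation of $\int W_n^2$, can be checked by simulation; the sketch below (assuming NumPy, and Gaussian white noise as a deliberately simple special case) does so:

```python
import numpy as np

# Monte Carlo check of Var f_n*(x) ~ (2 pi/n) f^2(x) m_n/pi = 2 f^2(x) m_n/n
# at x != 0, +-pi, for Gaussian white noise with f(x) = s2/(2 pi), s2 = 1
rng = np.random.default_rng(3)
n, m_n, reps, x0 = 2000, 20, 2000, np.pi/2
ks = np.arange(-m_n, m_n + 1)
w = np.exp(-1j*ks*x0)

est = np.empty(reps)
for i in range(reps):
    X = rng.standard_normal(n)
    c = np.array([np.dot(X[:n - abs(k)], X[abs(k):]) for k in ks])
    est[i] = (c * w).sum().real / (2*np.pi*n)

f0 = 1.0/(2*np.pi)
print(est.var(), 2*f0**2*m_n/n)        # close (asymptotic approximation)
```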
REFERENCES
Barndorff-Nielsen, O. and Yeo, G.F. (1969). Negative binomial processes. J. Appl. Prob. 6, 633-647. Correction in J. Appl. Prob. 7, 249.

Bauer, H. (1968). Wahrscheinlichkeitstheorie und Grundzüge der Masstheorie. Walter de Gruyter & Co. Berlin.

Billingsley, P. (1965). Ergodic theory and information. John Wiley and Sons. New York.

Billingsley, P. (1968). Convergence of probability measures. John Wiley and Sons. New York.

Bingham, N.H. (1971). Limit theorems for occupation times of Markov processes. Z. Wahrscheinlichkeitstheorie verw. Geb. 17, 1-22.

Cox, D.R. (1955). Some statistical methods connected with series of events. J. R. Statist. Soc. B 17, 129-164.

Cox, D.R. and Lewis, P.A.W. (1966). The statistical analysis of series of events. Methuen. London, and Barnes and Noble. New York.

Cramér, H. (1955). Collective risk theory. The jubilee volume of Försäkringsbolaget Skandia. Stockholm.

Cramér, H. and Leadbetter, M.R. (1967). Stationary and related stochastic processes. John Wiley and Sons. New York.

Cramér, H. (1969). On streams of random events. Skand. Aktuar. Tidskrift 52 Suppl., 13-23.

Daley, D.J. and Vere-Jones, D. (1972). A summary of the theory of point processes. Stochastic point processes: Statistical analysis, theory and applications. Ed. by Lewis, P.A.W., 299-383. Wiley-Interscience. New York.

Dobrushin, R.L. (1955). A lemma on the limit of a composite random function (in Russian). Uspekhi Mat. Nauk 10, no. 2 (64), 157-159.

Doob, J.L. (1953). Stochastic processes. John Wiley and Sons. New York.

Feller, W. (1971). An introduction to probability theory and its applications. Vol. II. 2nd ed. John Wiley and Sons. New York.
Gaver, D.P. (1963). Random hazard in reliability problems. Technometrics 5, 211-226.

Grandell, J. (1971). On stochastic processes generated by a stochastic intensity function. Skand. Aktuar. Tidskrift 54, 204-240.

Grandell, J. (1972:1). On the estimation of intensities in a stochastic process generated by a stochastic intensity sequence. J. Appl. Prob. 9, 542-556.

Grandell, J. (1972:2). Statistical inference for doubly stochastic Poisson processes. Stochastic point processes: Statistical analysis, theory and applications. Ed. by Lewis, P.A.W., 90-121. Wiley-Interscience. New York.

Grandell, J. (1973). A note on characterization and convergence of non-atomic random measures. Int. conf. on prob. theory and math. stat., Abstracts of communications T.1, 175-176, Vilnius.

Grenander, U. (1951). On Toeplitz forms and stationary processes. Arkiv för Matematik 1, 555-571.

Grenander, U. and Rosenblatt, M. (1956). Statistical analysis of stationary time series. Almqvist & Wiksell. Stockholm, and John Wiley and Sons. New York.

Grenander, U. and Szegő, G. (1958). Toeplitz forms and their applications. Univ. of California Press. Berkeley and Los Angeles.

Hannan, E.J. (1960). Time series analysis. Methuen & Co. London.

Hannan, E.J. (1970). Multiple time series. John Wiley and Sons. New York.

Jagers, P. (1973). On Palm probabilities. Z. Wahrscheinlichkeitstheorie verw. Geb. 26, 17-32.

Jagers, P. (1974). Aspects of random measures and point processes. Advances in probability and related topics 3. Ed. by Ney, P., 179-239. Marcel Dekker. New York.

Jung, J. and Lundberg, O. (1969). Risk processes connected with the compound Poisson process. Skand. Aktuar. Tidskrift, Suppl., 118-131.
Kallenberg, O. (1971). Lecture at the Gothenburg conference on point processes.

Kallenberg, O. (1973:1). Characterization and convergence of random measures and point processes. Z. Wahrscheinlichkeitstheorie verw. Geb. 27, 9-21.

Kallenberg, O. (1973:2). Characterization of continuous random processes and signed measures. Studia Sci. Math. Hungarica 8, 473-477.

Kallenberg, O. (1975:1). Limits of compound and thinned point processes. J. Appl. Prob. 12, 269-278.

Kallenberg, O. (1975:2). Random measures. Schriftenreihe des Zentralinstituts für Mathematik und Mechanik der ADW der DDR, Akademie-Verlag. Berlin.

Kallenberg, O. (1976). On the structure of stationary flat processes. Tech. Rep., Dept. of math., Gothenburg.

Kerstan, J., Matthes, K. and Mecke, J. (1974). Unbegrenzt teilbare Punktprozesse. Akademie-Verlag. Berlin.

Khintchine, A.Y. (1960). Mathematical methods in the theory of queueing. Charles Griffin. London.

Kingman, J.F.C. (1964). On doubly stochastic Poisson processes. Proc. Camb. Phil. Soc. 60, 923-930.

Kingman, J.F.C. (1972). Regenerative phenomena. John Wiley and Sons. New York.

Kolmogorov, A.N. (1939). Sur l'interpolation et extrapolation des suites stationnaires. C.R. Acad. Sc. Paris 208, 2043-2045.

Krickeberg, K. (1972). The Cox process. Symposia Mathematica IX, 151-167.

Kummer, G. and Matthes, K. (1970). Verallgemeinerung eines Satzes von Sliwnjak III. Rev. Roum. Math. Pures et Appl. 15:10, 1631-1642.

Lamperti, J. (1962). Semi-stable stochastic processes. Trans. Amer. Math. Soc. 104, 62-78.
Lawrance, A.J. (1972). Some models for stationary series of events. Stochastic point processes: Statistical analysis, theory and applications. Ed. by Lewis, P.A.W., 199-256. Wiley-Interscience. New York.

Lindvall, T. (1973:1). Weak convergence of probability measures and random functions in the function space D[0,∞). J. Appl. Prob. 10, 109-121.

Lindvall, T. (1973:2). Weak convergence in the function space D[0,∞) and diffusion approximations of certain Galton-Watson branching processes. Tech. Rep., Dept. of math., Gothenburg.

Lundberg, O. (1940). On random processes and their application to sickness and accident statistics. 2nd ed. 1964, Almqvist & Wiksell. Uppsala.

Macchi, O. (1971). Distribution statistique des instants d'émission des photoélectrons d'une lumière thermique. C. R. Acad. Sc. Paris 272, sér. A, 437-440.

Macchi, O. and Picinbono, B. (1972). Estimation and detection of weak optical signals. IEEE Trans. Inform. Theory 18, 562-573.

Marcus, M. and Minc, H. (1965). Permanents. Amer. Math. Monthly 72, 577-591.

Mecke, J. (1967). Stationäre zufällige Masse auf lokalkompakten Abelschen Gruppen. Z. Wahrscheinlichkeitstheorie verw. Geb. 9, 36-58.

Mecke, J. (1968). Eine charakteristische Eigenschaft der doppelt stochastischen Poissonschen Prozesse. Z. Wahrscheinlichkeitstheorie verw. Geb. 11, 74-81.

Mecke, J. (1972). Zufällige Masse auf lokalkompakten Hausdorffschen Räumen. Beiträge zur Analysis 3, 7-30.

Mönch, G. (1971). Verallgemeinerung eines Satzes von A. Rényi. Studia Sci. Math. Hungar. 6, 81-90.

Neuts, M.F. (1971). A queue subject to extraneous phase changes. Adv. Appl. Prob. 3, 78-119.
Parzen, E. (1959). Statistical inference on time series by Hilbert space methods. I. Published in Parzen, E. (1967). Time series analysis papers. Holden-Day. San Francisco.

Rodhe, H. and Grandell, J. (1972). On the removal time of aerosol particles from the atmosphere by precipitation scavenging. Tellus 24, 443-454.

Rootzén, H. (1975). A note on the central limit theorem for doubly stochastic Poisson processes. Tech. report, The University of North Carolina.

Rosenblatt, M. (1959). Statistical analysis of stochastic processes with stationary residuals. Probability and statistics - The Harald Cramér volume. Ed. by Grenander, U., 246-257. Almqvist & Wiksell. Stockholm, and John Wiley and Sons. New York.

Rozanov, Yu. A. (1960). On stationary sequences forming a basis. Soviet Math. - Doklady 1, 155-158.

Rozanov, Yu. A. (1967). Stationary random processes. Holden-Day. San Francisco.

Rubin, I. (1972). Regular point processes and their detection. IEEE Trans. Inform. Theory 18, 547-557.

Rudemo, M. (1972). Doubly stochastic Poisson processes and process control. Adv. Appl. Prob. 4, 318-338.

Rudemo, M. (1973:1). State estimation for partially observed Markov chains. J. Math. Anal. Appl. 44, 581-611.

Rudemo, M. (1973:2). Point processes generated by transitions of Markov chains. Adv. Appl. Prob. 5, 262-286.

Rudemo, M. (1975). Prediction and smoothing for partially observed Markov chains. J. Math. Anal. Appl. 49, 1-23.

Ryll-Nardzewski, C. (1961). Remarks on processes of calls. Proc. 4th Berkeley Symp. 2, 465-471.

Serfozo, R. (1972:1). Conditional Poisson processes. J. Appl. Prob. 9, 288-302.
Serfozo, R. (1972:2). Processes with conditional stationary independent increments. J. Appl. Prob. 9, 303-315.

Siegert, A.J.F. (1957). A systematic approach to a class of problems in the theory of noise and other random phenomena: Part II. IRE Trans. Inform. Theory 3, 37-43.

Skorohod, A.V. (1957). Limit theorems for stochastic processes with independent increments. Theory Prob. Applications 2, 138-171.

Snyder, D.L. (1972:1). Filtering and detection for doubly stochastic Poisson processes. IEEE Trans. Inform. Theory 18, 91-102.

Snyder, D.L. (1972:2). Smoothing for doubly stochastic Poisson processes. IEEE Trans. Inform. Theory 18, 558-562.

Snyder, D.L. (1975). Random point processes. John Wiley and Sons. New York.

Snyders, J. (1972). Error formulae for optimal linear filtering, prediction and interpolation of stationary time series. Ann. Math. Statist. 43, 1935-1943.

Stone, C. (1963). Weak convergence of stochastic processes defined on a semifinite time interval. Proc. Amer. Math. Soc. 14, 694-696.

van Trees, H.L. (1968). Detection, estimation, and modulation theory. Part I. John Wiley and Sons. New York.

Waldenfels, W. v. (1968). Charakteristische Funktionale zufälliger Masse. Z. Wahrscheinlichkeitstheorie verw. Geb. 10, 279-283.

Westcott, M. (1972). The probability generating functional. J. Aust. Math. Soc. 14, 448-466.

Whitt, W. (1970). Weak convergence of probability measures on the function space D[0,∞). Tech. report, Yale University.

Whitt, W. (1972). Continuity of several functions on the function space D. A revised version is sometimes referred to as 'to appear in Ann. Prob.'
INDEX
Absolutely dominated 121
Additive see completely random
Asymptotically efficient 162
Average intensity 185
Best estimate 88
Best linear estimate 116, 142, 213
Borel
  - algebra 3, 205
  - measure 3
Bounded set 5
Completely random 5
Completion 13
Convergence
  - in distribution 69, 74, 208
  - vague 205
  - weak 19, 208
Convolution 207
Covariance 23
Cox process see doubly stochastic Poisson process
Cross spectral density 163, 216
Diffuse see non-atomic
Doubly stochastic Poisson process 7
  alternative definitions of a - 10, 12, 16
Doubly stochastic Poisson sequence 17
Dynkin system 7
Ergodic 27
Estimate 88
  best - 88
  best linear - 116, 142, 213
  best linear unbiased - 220
  linear - 116
Functional limit theorem 76
Hilbert space 115, 212
Instantaneous intensity 14, 15
Intensity
  average - 185
  instantaneous - 14, 15
  - function 12
  - measure 5
Laplace-transform 18, 206
Leading function 10
Level 160
Linear estimate 116
Local convergence see vague convergence
Loss function 88
Mean 23
Measurable process 13
Mixed Poisson process see weighted Poisson process
Non-atomic
  - measure 8, 206
  - random measure 19, 206
Observation process 87
Operator
  c-amplifying - 21
  p-thinning - 21
  shift - 27
Palm measure 54
π-system 205
Point process 4
  simple - 19, 206
Poisson process with
  - intensity measure 5
  - intensity one 11
  - leading function 10
Polish space 3, 205
Pólya process 32
Quasi-normal 195, 222
Radon measure see Borel measure
Random measure 4
  distribution of a - 4
  non-atomic - 19, 206
Regular variation 76
Relatively compact 208
Renewal process 34
  alternating - 50
  arithmetic - 44
  non-arithmetic - 44
  ordinary - 44
  stationary - 37
  transient - 34
Simple point process 19, 206
Skorohod topology 74, 210, 211
Spectral
  - density 27, 214
  - distribution 27, 214
Standard Poisson process see Poisson process with intensity one
State space 3
Stationary
  strictly - 20
  (weakly) - 26
Thinning 21
Tight 208
Topology
  Skorohod - 74, 210, 211
  vague - 3, 205
"Truncated" estimate 198, 224
Vague
  - convergence 205
  - topology 3, 205
Version 13
Weak convergence 19, 208
Weighted Poisson process 31
Without after-effects see completely random
Without multiple points see simple