The Modelling of Systems with Small Observation Sets


Author: J.M. Maciejowski


Lecture Notes in Control and Information Sciences Edited by A.V. Balakrishnan and M.Thoma

10 Jan M. Maciejowski

The Modelling of Systems with Small Observation Sets

Springer-Verlag Berlin Heidelberg New York 1978

Series Editors
A. V. Balakrishnan • M. Thoma

Advisory Board
A. G. J. MacFarlane • H. Kwakernaak • Ya. Z. Tsypkin

Author
Dr. Jan Marian Maciejowski
Maudslay Research Fellow, Pembroke College, Cambridge
also with the Control and Management Systems Group,
Cambridge University Engineering Department
Mill Lane, Cambridge CB2 1RX, England

ISBN 3-540-09004-5 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-09004-5 Springer-Verlag New York Heidelberg Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.

© by Springer-Verlag Berlin Heidelberg 1978
Printed in Germany

SUMMARY

The p r o b l e m systems,

when

is i n t r o d u c e d defined

of a s s e s s i n g

only

of a system,

of a v a i l a b l e algorithm un d e r

A general "information more no

of models,

criteria

information gain

including

and its c o m p u t a t i o n gain

for the The

language about

of m o d e l l i n g ,

to the p r o b l e m

of s y s t e m

account

of the size of the set

A model

is d e f i n e d

observation

to be an set of a s y s t e m

to find

that

in the s e n s e the m o d e l

with

gain.

nonlinear

criterion

dynamical

with

gain

program.

i n s i g n i f i c a n t as the o b s e r v a t i o n

that

models,

is d e m o n s t r a t e d . requires

The c h o i c e

the m o d e l l e r ' s

It is s h o w n

class

The use of i n f o r m a t i o n

of rival m o d e l s

of i n f o r m a t i o n

for a w i d e

stochastic

is s t r a i g h t f o r w a r d .

is a s s o c i a t e d

with

It is p r o v e d

c a n exist,

in general,

its

consistency

is d i s c u s s e d .

algorithm"

as a c o m p u t e r

the system.

of a model,

and its

is a s u i t a b l e

assessment

calculation

be e x p r e s s e d

solution

is proposed,

modelling

Information

information

a characterisation

of the q u a l i t y

it is not possible,

the h i g h e s t

accounts

of a l g o r i t h m i c

the o u t p u t

criterion

conventional

that

is

restrictions.

gain",

"universal

identification

for

observations.

specified

are available,

to a t h e o r y w h i c h

taking

for c o m p u t i n g

of

of

a partial

while

models

f r o m a set of o b s e r v a t i o n s

on to d e v e l o p

constitutes

identification,

System

The c o n c e p t s

are d r a w n

interpreting

of o b s e r v a t i o n s

and discussed.

that b e h a v i o u r .

which

sets

as the p r o g r e s s i o n

the b e h a v i o u r

theory

small

and

this

that

of p r o g r a m m i n g

a priori choice

sets b e c o m e

the m o d e l

beliefs

becomes

large.

A detailed investigation shows that it is possible to speak precisely of "the smallest language" required to run a particular program. A priori knowledge assumed about a system can therefore be considered to be defined by the smallest language required to run the model.

Finally, the effect on model assessment of the manner in which system observations are coded is examined. It is found that a "safe" coding exists, which often leads to the same assessment as would the use of most other codings.

ACKNOWLEDGEMENTS

The idea of examining modelling in the light of algorithmic information theory is due to Professor A.G.J. MacFarlane. His constant encouragement and enthusiasm, as well as detailed criticism, has been an essential ingredient of this work.

I have also benefited from discussions with many members of the Control and Management Systems Group, of whom Dr. F.P. Kelly, Dr. M.B. Beck and Dr. S.R. Watson deserve special mention. The quotation from Newton in the last chapter was pointed out to me by Dr. A.T. Fuller.

Financial support for this research came from the Science Research Council, and in the final stages from Pembroke College.

Roberta Hill has produced her usual excellent standard of typing, but special thanks are due to her for struggling so successfully through chapter 5.

My wife has asked me not to write one of those embarrassing acknowledgements, saying how impossible this research would have been without her constant encouragement and support; consequently I shall leave this to the reader's imagination.

CONTENTS

1. Introduction
2. Survey of Related Work
3. A Characterisation of Modelling
4. Incorporation of A Priori Knowledge
5. Fragments of Programming Languages
6. h-Comparability
7. Table Look-Up Codings
8. Discussion and Conclusion

References

Appendices:
A Formal Semantics of Programming Languages
B Syntax of the Algol W-Support of the Gas-Furnace Models
C Table Look-Ups for the Gas-Furnace Models

Diagrams

1. INTRODUCTION

1.1 Motivation

The areas in which the scientific method has been demonstrably and spectacularly successful are characterised by the possibility of performing experiments, or making observations, more or less freely whenever these are deemed desirable. The result of this has been that explicit consideration of the size of the set of observations from which a model is hypothesised, and to which a model is fitted, has been neglected. Any doubts which arise about the model can be resolved by further experimentation and observation. This pleasant property increasingly disappears as one enters the domains of complex industrial processes, environmental control systems, management systems, and socio-economic systems. The work described here aims to clarify the relationship between the smallness of the available observation sets for such systems and the degree of usefulness of the models obtained for them.

Until recently, the class of models which could be used in scientific investigations was restricted by a very practical consideration. The behaviour of the model had to be understood, and that understanding could only be obtained from the theory of the model. The model was constrained to be sufficiently simple for theoretical investigation to be possible.

The availability of the computer has changed this situation radically. It is now possible to investigate the behaviour of a model by simulation, with hardly any theoretical understanding of it. Consequently this constraint on the complexity of useful models has been removed, or at least greatly relaxed. It is now possible to postulate a complicated model structure, to observe its simulated behaviour, and to adjust the details of the model until its simulated behaviour resembles the behaviour of the system being

investigated. When

is such a model useful? When does it give any understanding of how the system really works? When can it be used as a reliable guide to how the system will behave in the future? The purpose of the work reported in this thesis is to throw some light on these questions.

A further aim is to investigate how rival models of the same system should be assessed. Most of the thesis is ostensibly concerned with the details of rival model assessment, but it is clear that the ability to distinguish between competing models is intimately connected with the ability to say how good an isolated model is.

Why should a simulation model of the type described above not be useful or reliable? If it reproduces the observed system behaviour, is that not sufficient evidence to indicate the quality of the model? In fact, is it not clear that the better the reproduction of the observed behaviour, the better the model?

is the p o s s i b i l i t y complexity checked

against

the

time.

clear

the only

is no m o r e value.

agrees w i t h model

of some v a r i a b l e

no o t h e r

that v a l u e s

the v a l u e

in some

taken,

model,

then

It n e v e r

observations, prediction

also

confidence

amounts

assessment

say,

w o u l d be

than

confidence

increases

little

(which does

in

value

but

predictions, measurements of the

very quickly. after

doubt not

in the

to say

the p r e d i c t i o n s

of course,

have

is

any o t h e r

If further

in the m o d e l

third

it is

of c o n f i d e n c e

are b e t t e r

agree w i t h

correct

then

It is now p o s s i b l e

guesses.

one w o u l d

at some

It

is taken w h i c h

of the model,

to certainty,

it.

The p r e d i c t e d sense)

by the m o d e l

than m e r e

that

time of the v a r i a b l e

is nil.

increases.

and these

about

of the two o b s e r v a t i o n s ,

the p r e d i c t i o n

sense,

its

at two d i f f e r e n t

of the v a r i a b l e

with

reasonable

predicted

since

Suppose

information

if a third m e a s u r e m e n t

immediately

reason

and it is b e i n g

example.

(in an i n t u i t i v e

However,

The b a s i c

the model,

simple

of the m o d e l

likely

behaviour,

set of data.

are t a k e n

on the b a s i s

that

is no.

unconstrained,

following

to p r e d i c t

the p r e d i c t i o n

that

imply

only

ten

the next

that it

be). The

model

answer

If a linear v a r i a t i o n

proposed,

of the o b s e r v e d

"overfitting"

and that w e have

is d e s i r e d

would

of

a small

two m e a s u r e m e n t s

are

Our

is r e l a t i v e l y

Consider

times,

reproduction

confidence

clearly

which

depends

one is w i l l i n g

on the d i f f e r e n c e

to ascribe between

to this

the n u m b e r

of o b s e r v a t i o n s observations

required

the a v a i l a b l e then we have situation

of a r b i t r a r y

number

it "explains"

no

to c o n s t r u c t

that

by s a y i n g

then w e

if the n u m b e r

about

it fit the o b s e r v a t i o n s , i s

of o b s e r v a t i o n s ,

the model, This

that

have been m a d e

of

If all of

in its p r e d i c t i o n s .

also be d e s c r i b e d decisions

the model.

are used

confidence

to m a k e

and the n u m b e r

to c o n s t r u c t

observations

can

in o r d e r

which

the m o d e l ,

the same

have no c o n f i d e n c e

as~the

in the

model. This

point was made succinctly by Poincaré, when he dismissed Jeans' classical explanation of the ultraviolet catastrophe and the specific heat of solids (1):

"It is obvious that by giving suitable dimensions to the communicating tubes between his reservoirs and giving suitable values to the leaks, Jeans can account for any experimental results whatever. But this is not the role of physical theories. They should not introduce as many arbitrary constants as there are phenomena to be explained; they should establish connections between different experimental facts, and above all they should allow predictions to be made."

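Poincaré's objection can be made concrete with a small numerical sketch. The following Python fragment is illustrative only and is not taken from the text: it assumes a hypothetical set of four observations and fits a polynomial with four free coefficients (by Lagrange interpolation), so that there are as many arbitrary constants as phenomena. Under that assumption the fit is exact whatever the data are, which is why exact reproduction on its own carries no evidence about the model.

```python
from fractions import Fraction


def lagrange_model(times, values):
    """Return a polynomial model with len(times) free constants.

    The model passes through every (time, value) pair exactly, so it has as
    many adjustable constants as there are observations to be explained.
    """
    points = [(Fraction(t), Fraction(v)) for t, v in zip(times, values)]

    def model(t):
        t = Fraction(t)
        total = Fraction(0)
        for i, (ti, vi) in enumerate(points):
            term = vi
            for j, (tj, _) in enumerate(points):
                if j != i:
                    term *= (t - tj) / (ti - tj)
            total += term
        return total

    return model


# Hypothetical observations (illustrative values only).
times = [0, 1, 2, 3]
observed = [2, 3, 5, 4]
scrambled = [4, 2, 3, 5]  # the same values in a different order

fit_a = lagrange_model(times, observed)
fit_b = lagrange_model(times, scrambled)

# Both data sets are reproduced exactly, so exact reproduction says nothing
# about whether either model captures the system.
exact_a = all(fit_a(t) == v for t, v in zip(times, observed))
exact_b = all(fit_b(t) == v for t, v in zip(times, scrambled))

# The two "perfect" models nevertheless disagree about the next time step.
next_a = fit_a(4)
next_b = fit_b(4)
print(exact_a, exact_b, next_a, next_b)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that "exact reproduction" can be checked with equality rather than a tolerance.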
On the other hand, the accuracy with which the model reproduces the observed behaviour is clearly significant. If only a slight increase in complexity results in a large increase in accuracy, then in some sense fewer "arbitrary constants" have been added to it than the additional "number of phenomena" which it now explains. What is required for model assessment is some "trade-off" between the complexity of a model and its accuracy.

A prerequisite

for this a wide

is a m e a s u r e

class

appears

casting

of m o d e l s

in such

of fit of m o d e l

behaviour

to the o b s e r v e d

is the

as a c o m p o n e n t is thus

a suitable

of m o d e l

achieved

assessment

qrthodox would

be

f r o m a small

approach

a form,

in

that behaviour

The r e q u i r e d

model

class,

ment problem

as a s t a t i s t i c a l has

of the

complexity

in

indeed been

follow

of m o d e l s

some

statistical

to f o r m u l a t e

decision

the a s s e s s -

problem.

investigated,

such

of m o d e l

assessment

then be p o s s i b l e

type e n c o u n t e r e d

We do not

the

and to p o s t u l a t e

It m a y

of a p p r o a c h

to the p r o b l e m

to e x a m i n e

framework.

(5).

introduced

complexity.

by a s s e s s i n g

to

manner.

A more

models

is a p p l i c a b l e

innovation

trade-off

chosen

of models.

which

A major

this w o r k poorness

of c o m p l e x i t y

even

in c o n t r o l

an a p p r o a c h

This

type

for d y n a m i c a l

studies for the

(2)(3)(4) following

reasons. Any m e t h o d w i l l be

arrived

appropriate

(such as l i n e a r

Such

compared

investigated market,

for a n a r r o w

(statistical)

corrupted

a method will

are b e i n g

only

difference-equation

set in a p a r t i c u l a r "observations

at from s t a t i s t i c a l

by w h i t e ,

n o t be u s e f u l - for e x a m p l e ,

is the b e h a v i o u r

it may be d e s i r e d

Forrester's

"Industrial

class

of m o d e l s

models,

for e x a m p l e ) ,

environment Gaussian,

(such

firms

models

being in some

a model based

techniques

noise").

different

if the s y s t e m

to c o m p a r e

as

additive

if two very

of c o m p e t i n g

Dynamics"

considerations

on

(6) w i t h

a model

which

uses

market's

game

theory

firms'

elements. usually

simulation

When

the p r o b a b i l i t y Furthermore, economic

difficult

when

under

conditions.

few o b s e r v a t i o n s

and there

is little

the

statistical

specification

may

i t s e l f be very u n c e r t a i n .

by n o t a s s u m i n g conclusions These fruitful

it to be known;

considerations

to i n v e s t i g a t e

by a p a i n s t a k i n g

three

of r e l e v a n t are

about

it,

environment little

is lost

misleading

indicate

that

by m a k i n g

the g e n e r a l

and d i f f i c u l t

it may be m o r e of m o d e l s

of complex,

as few a s s u m p t i o n s situation,

analysis

rather

as

than

of each m o d e l

as it arises.

Overview

We

case

in fact,

the a s s e s s m e n t

systems

and e x a m i n i n g

structure,

knowledge

these,

may be avoided.

understood

possible

behaviours

of a s y s t e m

of the s y s t e m ' s In this

(8).

and s o c i o -

stationariness

a priori

of

When modelling

processes. available,

it is

and i m p o r t a n t

to assume

Finally,when

nonlinear

variables

environmental

it may not be a p p r o p r i a t e

1.2

and the

the e v o l u t i o n

of r e l e v a n t

interesting

transient

contain

also d y n a m i c a l ,

to d e s c r i b e

investigating the m o s t

often

are

distributions

systems,

occur

models

such m o d e l s

extremely

poorly

actions

responses.

Realistic

often

(7) to e x p l a i n

develop

of A p p r o a c h

and Results.

a characterisation

"components":

the s y s t e m

of m o d e l l i n g

to be m o d e l l e d ,

which

has

a model

of

this system, The

and a c r i t e r i o n

system

pair of sets

of q u a l i t y

to be m o d e l l e d

of o b s e r v a t i o n s are

and accuracy,

observation

Each

each

therefore

discrete-time that this

of d a t a detail

does n o t

time,

of this

become

of such

reflects

evident

a system

to

the r e a l i t i e s

be d e f i n e d

in m o r e

which

implies

compute

a reversed

time

obtained.

exercise

interest,

ordering.

functions

defined

These

It only

are u s e l e s s

in a n e w s i t u a t i o n exercise),

as a r e f e r e n c e ,

with

will

of a p a r t i c u l a r

subsets

to a d m i t

be of m u c h

of the m o d e l l i n g

to m o d e l s

which

is b r o a d e n o u g h

system may behave

serve

the o u t p u t

a lack of any

of the m o d e l l i n g

Any r e s t r i c t i o n

onto

not n o r m a l l y

observations

the goal

by s p e c i f y i n g

The

is any a l g o r i t h m w h i c h maps

or even

h o w the

type w i l l

the success

It m e r e l y

interpretation

algorithms

on the p a r t i c u l a r

(presumably

finite.

it w i l l

the m o d e l s

definition

would

such as those w h o s e

for d e d u c i n g

resolution

a set of d i s c r e t e - s t a t e ,

of the o b s e r v a t i o n s

which

allows

limited

to be r a t i o n a l .

to be

However,

system

This

of

and output.

is a s s u m e d

A system will

of the

observations.

direction

by a

1.3.

subsets

algorithms

like

category.

in sec.

input

is a s s u m e d

constrain

collection.

A model certain

looks

to be d e f i n e d

obtained with

measurements.

be of the same

also

always

set of o b s e r v a t i o n s

system

is t a ke n

of its

Since m e a s u r e m e n t s

of the model.

but models

respect

to w h i c h

be assessed. type

of the o b s e r v a t i o n s

is a c c o m p l i s h e d lie

in the

domain

of the a l g o r i t h m ,

observations

are

deterministic

successive

successive

outputs,

the W i e n e r

- Kolmogorov

blocks

of i n p u t

elements

to be the c o r r e s p o n d i n g

For example, n e e d o n l y map

and w h i c h

images.

difference

blocks

whereas

of the o u t p u t

of input

stochastic

or K a l m a n

and p a s t o u t p u t

equation

models

observations

predicting

types m u s t map

observations

to

models

of

successive

to s u c c e s s i v e

outputs. The

term

program".

Thus

the o u t p u t specified

"algorithm" we

think

observations, subsets

may be

interpreted

of m o d e l s and these

as p r o g r a m s programs

of the o b s e r v a t i o n s

task.

This

it w e r e

not

for the p o w e r

of C h u r c h ' s

states

that

any p r o c e d u r e

which

notion

of an " a l g o r i t h m "

equivalent hence

viewpoint

the m o d e l

some p r o g r a m m i n g taken

to be the

the n u m b e r the p r o g r a m which

have

is w r i t t e n

shortness

is a m e a s u r e

the o u t p u t

criterion

program

in

of q u a l i t y

is

as m e a s u r e d

with which

the o b s e r v a t i o n s

The

length

of a r b i t r a r y

to the p r o g r a m m i n g Furthermore,

observations were

and

program.

of the n u m b e r

to c o m p u t e

(9), w h i c h

of a l g o r i t h m s ,

in the program.

the model.

if

in any one of the

of that p r o g r a m ,

(relative

in this

the i n t u i t i v e

as a c o m p u t e r

the

them

arbitrary,

Thesis

theory

as a c o m p u t e r

of c h a r a c t e r s

in c o n s t r u c t i n g

of the

lanaguage,

been m~e

be e x c e s s i v e l y

satisfies

for c o m p u t i n g

may use the

to help

can be e x p r e s s e d

formalisations

can be e x p r e s s e d When

would

as " c o m p u t e r

originally

of

decisions

language)

a model

exactly

by

is r e q u i r e d

(to the a c c u r a c y made).

In order to do this, the model must generate internally those terms which would conventionally be thought of as "fitting errors". Since the programming language has a finite number of terminals, the length of the model increases when these terms increase. The criterion of quality thus incorporates a particular trade-off between complexity and approximation.

The above characterisation of modelling is explained in more detail in Chapter 3. Support for it is given in section 2.2. The essence of this support is that the length of the shortest program required to compute a sequence displays properties analogous to the properties of the entropy associated with a probability space. In particular, a long sequence, which requires a maximally long program to compute it, passes every effective test for randomness (asymptotically, with probability 1). This suggests that the amount by which it is possible to "compress" the program (model) required to compute a set of observations (system) represents the amount of information which it has been possible to extract from the observations. If the only model which has been found is one that merely reads out the observations from a look-up table, then no "compression" has been achieved, and such a model conveys no information about the observations.

A consequence of our characterisation is that no algorithm can exist for finding the best model (according to the above criterion of quality) of an arbitrary system.
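The look-up-table remark and the treatment of "fitting errors" can be illustrated with a toy calculation. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes Python source text as the programming language, measures program length as a plain character count, and uses a hypothetical observation sequence generated by 3t + 1. The names `run` and `length` are introduced here for the example only.

```python
# Hypothetical output observation set (illustrative values only).
observations = [3 * t + 1 for t in range(50)]

# Model 1: a look-up table that merely reads the observations back out.
table_program = "output = " + repr(observations)

# Model 2: a short rule that computes the same observations.
rule_program = "output = [3 * t + 1 for t in range(50)]"


def run(program):
    """Execute a model program and return the output it computes."""
    scope = {}
    exec(program, scope)
    return scope["output"]


def length(program):
    """Program length in characters, relative to the chosen language."""
    return len(program)


# Both models reproduce the observations exactly, as the text requires.
assert run(table_program) == observations
assert run(rule_program) == observations

# "Compression": how much shorter the rule is than the bare table.
compression = length(table_program) - length(rule_program)

# A second observation set that the rule fits only approximately.
noisy = list(observations)
noisy[7] += 2
noisy[31] -= 1
noisy_table_program = "output = " + repr(noisy)

# To stay exact, the model must generate the "fitting errors" internally,
# and every stored error lengthens the program.
errors = {t: y - (3 * t + 1) for t, y in enumerate(noisy) if y != 3 * t + 1}
noisy_rule_program = (
    "e = " + repr(errors) + "\n"
    "output = [3 * t + 1 + e.get(t, 0) for t in range(50)]"
)
assert run(noisy_rule_program) == noisy

print(length(table_program), length(rule_program), compression)
print(length(noisy_table_program), length(noisy_rule_program))
```

A character count of Python source is only a crude stand-in for program length in a formally specified language, and nothing here finds a shortest program; the sketch merely compares three given candidates, which is consistent with the statement that no general algorithm for finding the best model can exist.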


The choice of programming language to be used, for assessing the quality of a model, can be viewed as the specification of "what is to be taken for granted". It should therefore be made in the light of the modeller's a priori knowledge about the system, and of the purposes of the modelling exercise. In Chapter 4 this connection is examined more closely. It is shown that, if the observation sets are large enough, then the results of model assessment are independent of the choice of programming language. This can be interpreted to mean that the modeller's a priori beliefs become less significant as the set of observations available to him grows.

Nevertheless, the assessment of models of small observation sets is dependent on the modeller's specification of his a priori beliefs. Consequently such an assessment cannot be taken to be definitive. However, this is mitigated by the fact that the modeller does not need to choose between mutually exclusive sets of a priori beliefs: he can stipulate programming languages which imply a greater or smaller state of knowledge.

Several different models, even when written in the same language, will rarely use exactly the same features of that language. It is therefore questionable whether a comparison of their lengths gives a measure of their complexity relative to the same set of assumptions. Chapters 5 and 6 resolve this difficulty. Chapter 5 develops a formal equivalent of "a program makes use of such-and-such facilities of a language". A prerequisite for this is a formal method of defining the semantics of programming languages. One such method is outlined in Appendix A. Chapter 6 then uses the concepts developed in Chapter 5 to specify some conditions under which models may be meaningfully compared. It is demonstrated that model assessment is not much affected if these conditions are not met exactly.

The details of the complexity/approximation trade-off, which is inherent in our proposed method of model assessment, depend on the precise manner in which the observations are coded in the programming language. It is convenient to separate this aspect of the selection of a suitable programming language from those aspects considered in Chapter 4; consequently the coding of observations is discussed in Chapter 7. A distinguished minimal coding is shown to exist, and it is argued that this is a natural coding to use for model assessment.

The modelling of one particular system (Box and Jenkins' gas-furnace data (10)) is used as an example throughout. The rival models considered for this system are very simple and in no way represent the range of possibilities discussed in sec. 1.1. Nevertheless, the considerations raised there apply even to these simple models, as will be seen in Chapter 3. It will become apparent that the assessment method proposed in this thesis is immediately applicable to a much larger class of models.
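The claim in this overview that assessment results become independent of the programming language for large enough observation sets has a well-known counterpart in algorithmic information theory, the invariance theorem. It is stated here as background only; the notation is an assumption of this note rather than the author's, with $K_A(x)$ denoting the length of the shortest program in a universal language $A$ that computes the sequence $x$. The precise result proved in Chapter 4 may be formulated differently.

```latex
% Invariance theorem (standard background result; notation assumed here).
% For any two universal programming languages A and B there is a constant
% c_{AB}, depending on A and B but not on x, such that for every sequence x
\[
  \bigl| K_A(x) - K_B(x) \bigr| \;\le\; c_{AB}.
\]
% Dividing by K_B(x) shows that the relative difference vanishes for
% sequences whose shortest programs grow without bound:
\[
  \left| \frac{K_A(x)}{K_B(x)} - 1 \right| \;\le\; \frac{c_{AB}}{K_B(x)}
  \;\longrightarrow\; 0
  \qquad \text{as } K_B(x) \to \infty .
\]
```

For small observation sets the constant $c_{AB}$ can be comparable to the program lengths themselves, which matches the remark that the assessment of models of small observation sets depends on the chosen language.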


1.3 System Identification, Realisation and Modelling

Modern developments of systems theory emphasise the notion of a dynamical system as an abstract summary of experimental data (11), (12), (13). Modelling is concerned with the inference of system behaviour under specified but not yet observed conditions, from observations of past behaviour, under known conditions. The method by which this is achieved is the postulation of abstract structures for the system, which are compatible with the observations, and the selection, from these candidate structures, of one that is preferred on the basis of some criterion. The modern view of a system, with its heavy emphasis on observations, is therefore a more natural one to adopt, when discussing modelling, than the older view of a system as "an interconnection of components". However, if a system is to be modelled, and the success of the modelling is to be gauged by reference to the observations, then as little structure as possible should be imposed upon it, before the modelling has begun. Consequently, we adopt the following definition:

Definition (1.3.1)

A system $S$ is defined to be an ordered pair of observations, $S = (U, Y)$, where:

(i) $U = (u_1, u_2, \ldots, u_M)$ and $Y = (y_1, y_2, \ldots, y_N)$ are the input and output observation sets respectively; $u_i = (u^i_1, u^i_2, \ldots, u^i_{\ell_i})$ and $y_i = (y^i_1, y^i_2, \ldots, y^i_{m_i})$ are ordered sets of observations carried out at time $t_i$, where $t_1, t_2, \ldots, t_N$ is the natural time ordering; $u^i_j \in \{\text{rationals}\} \cup \{b\}$, where $b$ (blank) denotes a missing observation; similarly for $y^i_j$; and

(ii) with the convention that $(b, b, \ldots, b) = b$: if $u_i = b$ then $\ell_i = 0$; if $y_i = b$ then $m_i = 0$; if $u_i \neq b$ then $u^i_{\ell_i} \neq b$; if $y_i \neq b$ then $y^i_{m_i} \neq b$; and $y_N \neq b$.

Conditions (ii) serve only to ensure that adding on a set of blanks (missing observations) does not create a new system.
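Definition (1.3.1) translates fairly directly into a data structure. The sketch below is one possible reading, with assumed names (`BLANK`, `normalise_block`, `normalise_set`, `make_system`) that do not appear in the text: observations are rationals, a missing observation is represented by `None`, and conditions (ii) are enforced by trimming trailing blanks, so that padding with blanks cannot create a new system.

```python
from fractions import Fraction

BLANK = None  # stands for the blank symbol b (a missing observation)


def normalise_block(block):
    """Trim trailing blanks from one observation block u_i or y_i.

    An all-blank block collapses to the empty tuple, which plays the role
    of b with length zero (the convention (b, b, ..., b) = b).
    """
    items = [BLANK if x is BLANK else Fraction(x) for x in block]
    while items and items[-1] is BLANK:
        items.pop()
    return tuple(items)


def normalise_set(blocks, trim_tail):
    """Normalise an observation set U or Y.

    A set consisting only of blank blocks collapses to the empty tuple.
    With trim_tail=True, trailing blank blocks are dropped (y_N != b).
    """
    result = [normalise_block(b) for b in blocks]
    if all(b == () for b in result):
        return ()
    if trim_tail:
        while result and result[-1] == ():
            result.pop()
    return tuple(result)


def make_system(inputs, outputs):
    """Build a system S = (U, Y) as an ordered pair of observation sets."""
    U = normalise_set(inputs, trim_tail=False)
    Y = normalise_set(outputs, trim_tail=True)
    return (U, Y)


# Padding with blanks does not create a new system.
s1 = make_system([[1, 2], [3]], [[5], [6, None]])
s2 = make_system([[1, 2, None], [3, None, None]],
                 [[5], [6, None, None], [None]])
print(s1 == s2)

# An autonomous device (no inputs) has the form (b, Y).
oscillator = make_system([], [[0], [1], [0], [-1]])
print(oscillator[0] == ())
```

Blanks in the interior of a block are kept, since only the last entry of a non-blank block is required to be non-blank.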

For concreteness, we have specified that $u_i$, $y_i$ refer to observations made at time $t_i$, since we are interested primarily in dynamical models. However, this interpretation is not essential. Also, each $u_i$, $y_i$ could be a multidimensional finite array of observations, rather than a one-dimensional array, without affecting later results.

The input observation set is allowed to be empty, in order to admit devices such as noise generators and oscillators, as systems of the form $(b, Y)$. It has been argued that when stating the general problem of system identification, it should not be necessary to distinguish between input and output (14). The two should be lumped together as a "system behaviour", and the task of system identification should


include

the

it seems the

two

separation

essential cases

shown

and

internal

structures

procedure inputs

must

have

sets.

The

f r o m the sets

lead

form

Our

especially field

difference

a system

define

of

cc. c e ~ n ~ d r

however,

interaction have

is s o m e

cbservaLions concise

referred we prefer the

with

with

to above

o f its

the

set of observations

of observation

themselves and

systems

(b, Y). seem odd,

theory.

by

by

In t h i s

a set

of

examining

equations. process.

We We

of a system

hehaviour.

"laws"

- such

"explain"

The as t h e

this

set of equations as

a "system".

reason

the o b s e r v a t i o n

a system

reverse

this

are

assume

because

eD~TircFme~.t - 'I o t h e r w o r d s ,

- which

to regard

control

the e x i s t e n c e

set of

are

for

the

observations

at f i r s t

of these

the

its

that

a

identification

unless

pair

input

its b e h a v i o u r

solutions

of

with

to define

properties

aware

may

any

It is

t h a t U # b)

familiar

and investigate

we

a system,

"system"

equations,

its

Note

different

between

of both

the

between

labelled

very

But

as an o r d e r e d

distinguishes

i t is c o n v e n t i o n a l

the

point.

distinguished.

of

to t h o s e

t h a t ?,e are

the

same model

(b, U) ( p r o v i d i n g

definition

boxes

-

observations.

U a n d Y, w h i c h

o f the

The black

consider

However,

of distinguishing

to h a v e

are

ordering

"output".

can be expected

to the

defined

output

i.

and an earthing

and outputs

that we

and

a means

in Fig.

"sink"

generator

"input"

to have

"source"

signal

of

goal

of because

of modelling

set of equations

interaction. as a " m o d e l " ,

Hence and

The definition of "system" which is proposed above is much cruder than the definitions usually encountered. It is worth stating in full one such definition - that of Kalman, Falb and Arbib (11):

Definition (1.3.2)

A dynamical system (input/output sense) is a composite mathematical concept defined as follows:

(a) (i) There is a given time set $T$, a set of input values $U$, a set of acceptable input functions $\Omega = \{\omega : T \to U\}$, a set of output values $Y$, and a set of output functions $\Gamma = \{\gamma : T \to Y\}$.

(ii) (Direction of time). $T$ is an ordered subset of the reals.

(iii) The input space $\Omega$ satisfies the following conditions:

(1) (Nontriviality). $\Omega$ is nonempty.

(2) (Concatenation of inputs). An input segment $\omega_{(t_1,t_2)}$ is $\omega \in \Omega$ restricted to $(t_1,t_2) \cap T$. If $\omega, \omega' \in \Omega$ and $t_1 < t_2 < t_3$, there is an $\omega'' \in \Omega$ such that $\omega''_{(t_1,t_2)} = \omega_{(t_1,t_2)}$ and $\omega''_{(t_2,t_3)} = \omega'_{(t_2,t_3)}$.

(b) There is given a set $A$ indexing a family of functions $F = \{f_\alpha : T \times \Omega \to Y,\ \alpha \in A\}$; each member of $F$ is written explicitly as $f_\alpha(t,\omega) = y(t)$, which is the output resulting at time $t$ from the input $\omega$ under the experiment $\alpha$. Each $f_\alpha$ is called an input/output function and has the following properties:

(i) (Direction of time). There is a map $\iota : A \to T$ such that $f_\alpha(t,\omega)$ is defined for all $t \ge \iota(\alpha)$.

(ii) (Causality). Let $\tau, t \in T$ and $\tau < t$. If $\omega, \omega' \in \Omega$ and $\omega_{(\tau,t)} = \omega'_{(\tau,t)}$, then $f_\alpha(t,\omega) = f_\alpha(t,\omega')$ for all $\alpha$ such that $\tau = \iota(\alpha)$.

The reason why our definition (1.3.1) can be cruder than definition (1.3.2) is that we consider our system to be defined by observations of reality. Our system is not so much "an abstract summary of experimental data", as the system of definition (1.3.2) is, but rather is the data itself. We do not have to include conditions ensuring "causality" or "concatenation of inputs", because we consider these to be conditions which will be imposed on the class of models which we are willing to consider for the system, rather than conditions on the system itself.

Definition (1.3.2) is undoubtedly very suitable for the deductive development of theories of system behaviour, but it is inappropriate as a starting point for the study of system identification, primarily because it assumes that the main task of system identification has already been accomplished. To see that this is so, consider the family of input/output functions $F$. Under a particular experiment $\alpha$ the corresponding function $f_\alpha$ determines the future output behaviour of the system for any acceptable input $\omega$. But the task of system identification is precisely to determine how the system will behave in response to certain inputs. In other words, it is to determine some of the $f_\alpha$'s.


Furthermore, the division of observations into separate "experiments" is properly a part of the identification process. If a machine is operated, switched off, and then operated again, the usual decision not to take into account its behaviour when it is switched off is already a reflection of some abstract conception of the machine. Someone examining a continuous record of its behaviour, and who does not know that it has been switched off, may find a suitable division of the record into separate "experiments" to be far from obvious.

In our view, system identification involves a progression from a system in the form of definition (1.3.1) to a system in the form of definition (1.3.2). This involves first of all a division of the observations into experiments, whereupon the system appears as a set of restrictions of the functions $F$ to certain subsets of $T \times \Omega$. The problem of system identification is essentially that of the determination, from these restrictions, of the extensions of the functions $F$ on larger subsets of $T \times \Omega$ (commonly on all of $T \times \Omega$). The trouble is that there are infinitely many possibilities to choose from, even for a particular $\omega \in \Omega$.

It seems as well to distinguish the problem of system identification from that of system realisation, as defined by Kalman, Falb and Arbib (11). To do this, it is necessary to quote again from (11) yet another definition of "dynamical system" -


Definition (1.3.3)

A dynamical system (state space sense) is a composite mathematical concept defined by the following axioms:

(a) There are given sets $T$, $U$, $\Omega$, $Y$, $\Gamma$ satisfying all the properties required by definition (1.3.2).

(b) There is given a state set $X$ and a state-transition function $\phi : T \times T \times X \times \Omega \to X$ whose value is the state $x(t) = \phi(t;\tau,x,\omega) \in X$ resulting at time $t \in T$ from the initial state $x = x(\tau) \in X$ at initial time $\tau \in T$ under the action of the input $\omega \in \Omega$. $\phi$ has the following properties:

(i) (Direction of time). $\phi$ is defined for all $t \ge \tau$, but not necessarily for all $t < \tau$.

(ii) (Consistency). $\phi(t;t,x,\omega) = x$ for all $t \in T$, all $x \in X$, and all $\omega \in \Omega$.

(iii) (Composition property). For any $t_1 < t_2 < t_3$ we have $\phi(t_3;t_1,x,\omega) = \phi(t_3;t_2,\phi(t_2;t_1,x,\omega),\omega)$ for all $x \in X$ and all $\omega \in \Omega$.

(iv) (Causality). If $\omega, \omega' \in \Omega$ and $\omega_{(\tau,t)} = \omega'_{(\tau,t)}$, then $\phi(t;\tau,x,\omega) = \phi(t;\tau,x,\omega')$.

(c) There is given a readout map $\eta : T \times X \to Y$ which defines the output $y(t) = \eta(t,x(t))$. The map $(\tau,t) \to Y$ given by $\sigma \mapsto \eta(\sigma,\phi(\sigma;\tau,x,\omega))$, $\sigma \in (\tau,t)$, is an output segment, that is, the restriction $\gamma_{(\tau,t)}$ of some $\gamma \in \Gamma$ to $(\tau,t)$.
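As a concrete illustration of the state-space definition, the following is a minimal sketch in Python, assuming a hypothetical scalar discrete-time system with integer time set. The class `ScalarSystem`, its coefficients, and the inputs `ramp` and `ramp_then_junk` are invented for the example and do not come from the text; in discrete time the state at `t` depends on the input at times `tau, ..., t-1`, which is one possible reading of the interval convention. The sketch checks the consistency, composition and causality properties on one instance; it does not prove them.

```python
from dataclasses import dataclass
from typing import Callable

Input = Callable[[int], int]  # an input function omega: T -> U, with T the integers


@dataclass(frozen=True)
class ScalarSystem:
    """Hypothetical example: x(t+1) = a*x(t) + b*omega(t), y(t) = c*x(t)."""
    a: int
    b: int
    c: int

    def phi(self, t: int, tau: int, x: int, omega: Input) -> int:
        """State-transition function phi(t; tau, x, omega), defined only for t >= tau."""
        if t < tau:
            raise ValueError("phi is not defined for t < tau")
        for s in range(tau, t):
            x = self.a * x + self.b * omega(s)
        return x

    def eta(self, t: int, x: int) -> int:
        """Readout map eta(t, x) giving the output y(t)."""
        return self.c * x


def ramp(s: int) -> int:
    return s % 3


def ramp_then_junk(s: int) -> int:
    # Agrees with `ramp` for s < 6 and differs afterwards.
    return s % 3 if s < 6 else 99


system = ScalarSystem(a=2, b=1, c=3)
consistency = system.phi(5, 5, 7, ramp) == 7
composition = system.phi(6, 1, 7, ramp) == system.phi(6, 3, system.phi(3, 1, 7, ramp), ramp)
causality = system.phi(6, 1, 7, ramp) == system.phi(6, 1, 7, ramp_then_junk)
print(consistency, composition, causality)
```

Integer coefficients are used so that the equalities are exact rather than subject to floating-point rounding.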


The problem of system realisation is now defined to be the problem of constructing a dynamical system in the sense of definition (1.3.3) from a system in the sense of definition (1.3.2). Kalman, Falb and Arbib (11) state that this is "simply an abstract way of looking at the problem of scientific model building". We disagree. If this were so, then scientific model building would be "merely" a mathematical problem. But the major problems with model building are philosophical ones - questions concerning the possibility and nature of inference, the validity of induction, and, in fact, all the problems arising out of our uncertainty about the nature of scientific method. These problems do not arise in connection with system realisation. They all arise, however, when the question of system identification, as defined above, is considered.

That is not to say, however, that system identification is "an abstract way of looking at scientific model building", any more than system realisation is. In order to be useful, a model must not only specify the input/output functions of a system; it must usually provide also a means of computing them. To do this it must take the form of a system in the sense of definition (1.3.3). A better abstract formulation of "scientific model building" is, then: the problem of constructing a system in the sense of definition (1.3.3) from a system in the sense of definition (1.3.1). This includes within it both the essentially philosophical problem of system identification, and the essentially mathematical one of system realisation.

It is important to note that this division is a purely conceptual one; its aim is to bring to the surface the precise nature of the modelling problem. It is not intended to imply that a modeller first carries out a process of identification, and then of realisation. Indeed, a model in state-space form is often used to obtain the input-output function of the model, rather than the other way round. But the point is that in going apparently directly from the data to a state-space model, the modeller has implicitly carried out a process of identification, and should be aware of the philosophical difficulties associated with this before accepting the model as a representation of the system.

An illustration of the above distinctions is provided by Ho's algorithm (11). This algorithm constructs a system in state-space form from a sequence of data. The existence of such an algorithm may appear to imply that for any sequence of data it is possible to determine uniquely a realisation, and that this must therefore be the "true" model of the system generating the sequence. However, the use of Ho's algorithm entails the assumptions that the system input/output function is linear, is time-invariant, and that its minimal realisation has the smallest dimension compatible with the data sequence. These assumptions of course specify the input/output function completely. Thus the decision to use Ho's algorithm itself constitutes the process of system identification.
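The assumptions listed above can be made visible in a small sketch. The Python below is not Ho's algorithm as published; it is a simplified Hankel-matrix realisation for the single-input, single-output case, written under stated assumptions: the data `h` are taken to be the Markov parameters (impulse response) of a linear time-invariant system, arithmetic is exact over the rationals, and the factorisation of the Hankel matrix is the trivial one (observability factor equal to the identity). All function names are hypothetical. The routine returns the lowest-order triple `(A, B, C)` that reproduces the whole data sequence.

```python
from fractions import Fraction


def hankel(h, n, shift=0):
    """n x n Hankel matrix with entries h[i + j + shift]."""
    return [[Fraction(h[i + j + shift]) for j in range(n)] for i in range(n)]


def inverse(m):
    """Gauss-Jordan inverse over the rationals; returns None if m is singular."""
    n = len(m)
    a = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return None
        a[c], a[pivot] = a[pivot], a[c]
        d = a[c][c]
        a[c] = [v / d for v in a[c]]
        for r in range(n):
            if r != c and a[r][c] != 0:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[n:] for row in a]


def matmul(p, q):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*q)] for row in p]


def markov(A, B, C, count):
    """First `count` Markov parameters C A^k B, k = 0, 1, ..."""
    out, v = [], list(B)
    for _ in range(count):
        out.append(sum(c * x for c, x in zip(C, v)))
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return out


def realise(h):
    """Lowest-order (A, B, C) whose Markov parameters reproduce all of h, or None."""
    target = [Fraction(v) for v in h]
    for n in range(1, len(h) // 2 + 1):
        H_inv = inverse(hankel(h, n))
        if H_inv is None:
            continue
        A = matmul(hankel(h, n, shift=1), H_inv)      # shifted Hankel times inverse Hankel
        B = [Fraction(h[i]) for i in range(n)]         # first column of the Hankel matrix
        C = [Fraction(int(j == 0)) for j in range(n)]  # first unit row vector
        if markov(A, B, C, len(h)) == target:
            return A, B, C
    return None


A, B, C = realise([1, 1, 2, 3, 5, 8])
print(len(B), A)
```

The point made in the text shows up directly in the code: linearity, time-invariance and minimal order are built into `realise`, so the routine will return a model for data that no linear system generated, provided the finite sequence happens to be compatible with one.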

The term "identification" is used somewhat differently above than is conventional.

being to d i s t i n g u i s h the m a t h e m a t i c a l Our precise system

clearly

problems

definition

of

- c o u l d be used

set of time

functions

of all the v a l u e s Indeed,

own d e f i n i t i o n But they b o t h

whose

of a s y s t e m share w i t h that m a k e s

identification

process,

the a s s u m p t i o n

that

them

rather

(i.e.

as a the

in s p i r i t

pair

(1.3.2)

set (12). to our

of o b s e r v a t i o n s .

the e s s e n t i a l

"outputs"

suitable

situation

as

of the

"inputs":

the b e h a v i o u r

of

is specified. term

"system"

it can be u n d e r s t o o d

is p e r h a p s

in so m a n y

definitions

really be said to d e f i n e

models

for not d o i n g

"model"

for

persist

in c a l l i n g

emphasise

(1.3.2),

is a r e l a t i o n

suitable

In a s t u d y of m o d e l l i n g ,

reason

space

are c l o s e r

than

of a

(function)

of a s y s t e m

as an o r d e r e d

in any new

The use of the because

and o u t p u t

functions)

definition

concept

(13), - a r e l a t i o n

notion

approaches

characteristic

the s y s t e m

Zadeh's

attainable

time

not rely on the

of d e f i n i t i o n

similar

of the

b o t h of these

does

of the i n p ut

very

and

in m o d e l l i n g .

object"

in p l a c e

the aim

the p h i l o s o p h i c a l

(1.3.2).

abstract

product

could W i n d e k n e c h t ' s

between

involved

form of d e f i n i t i o n

on the c a r t e s i a n

is d e l i b e r a t e ,

identification

as an " o r i e n t e d

spaces

This

something

the

are d e r i v e d

fact

our b a s i c that

from its

rather

specific.

all

data

than

and

with

senses. (1.3.3)

systems.

to r e s e r v e On the o t h e r

a "system",

abstract

interaction

different

(1.3.2)

so is that we w i s h more

unfortunate,

The

the term hand,

in o r d e r

conceptions

should

about

its e n v i r o n m e n t

we

to a system - in


other words, from our observations of it. Strictly speaking, the separation of the observations into "inputs" and "outputs" already reflects some abstract conceptions about the system, but, as has already been explained, this separation is considered to be an essential part of the "input" to the process of identification. Such aspects as the choice of measurement scales, and even how these measurements are arranged in the arrays $u_i$, $y_i$ (in definition (1.3.1)) also imply abstract preconceptions about the system. However, some preconceptions must be assumed, and we choose to regard observations as primitive. This is equivalent to considering that "objectivity" for the modeller is defined by the observations available to him.

2. SURVEY OF RELATED WORK

2.1 Complexity Measures

This

not

literature

since

survey

particular,

such

will

of m u c h

of m o d e l s

of s c i e n c e philosophy

authors

them. who

namely,

inference

from s a m p l e s constitute

Instead, for the with

the s u r v e y w i l l

the

that

the n o t i o n

the

shortness

also e x a m i n e

considered

In

and s t a t i s t i c a l

thesis

most

those w h o have

using

a

and p h i l o s o p h y .

It w i l l

are

the e n t i r e

of s c i e n c e

can be a s s o c i a t e d

study:

of s c i e n t i f i c

of h y p o t h e s e s

of s u p p o r t

for c o m p u t i n g

the p r e s e n t

to c o v e r

not be included.

the d e v e l o p m e n t

w o r k of those

WORK

an a t t e m p t w o u l d

the r e l e v a n t

literature

programs

attempt

on the i n f e r e n c e

of b e h a v i o u r ,

quality

RELATED

Measures

survey will

literature

trace

OF

of

the

relevant

examined

to

the n a t u r e

of c o m p u t a t i o n a l

complexity.

It is convenient to begin by stating some axioms introduced by Blum (15), (16), since these provide some unifying ideas to which most of the other work can be related. Blum's aim is a characterisation of the complexity of computing computable functions. His method is the development of an axiomatic theory whose theorems do not depend on the class of machines which is being considered. Our main interest is in his axioms, which will help to classify some other work.

Let $N$ denote the set of nonnegative integers, $\{\phi_i\}$ an effective listing of all partial recursive functions of one variable, and $\{M_i\}$ a set of "machines", such that $M_i$ computes $\phi_i$.

The precise nature of $M_i$ is not specified. It can be viewed as a program.

Axioms for Size (Blum (15)). A recursive function $|\cdot|$ mapping $N$ (viewed as the set of indices) into $N$ (viewed as the set of sizes) is called a measure of the size of machines, $|i|$ being called the size of $M_i$, if and only if

(1) there exist at most a finite number of machines of any given size and

(2) there exists an effective procedure for deciding, for any $y$, which machines are of size $y$.

Axioms for Step-counting (Blum (16)). The set $\{\Phi_i : i = 0, 1, \ldots\}$ is a step-counting measure on $\{\phi_i : i = 0, 1, \ldots\}$ if and only if

(1) $\Phi_i$ is a partial recursive function,

(2) $\Phi_i(n)$ converges if and only if $\phi_i(n)$ converges,

(3) $M(i,n,m) = \begin{cases} 1 & \text{if } \Phi_i(n) = m \\ 0 & \text{otherwise} \end{cases}$ is (total) recursive. ($M$ is a measure on computation.)

An example of a size measure is the length of a program, if measured by the number of characters appearing in it. The number of statements in a program is not usually a size measure, because it violates axiom (1).
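The two sets of axioms can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not Blum's construction: "machines" are modelled as generator functions that yield once per computation step, the program alphabet `ALPHABET` is hypothetical, and the names `halve_until_odd`, `loop_on_zero`, `M` and `programs_of_size` are invented here. It shows why step-counting axiom (3) can hold even for machines that diverge (one only needs to run for at most `m` steps), and why character length satisfies both size axioms (finitely many programs of each length, and they can be listed).

```python
from itertools import product


def halve_until_odd(n):
    """Toy machine: one step per halving; returns n with its factors of 2 removed."""
    while n % 2 == 0 and n != 0:
        n //= 2
        yield
    return n


def loop_on_zero(n):
    """Toy machine that diverges on input 0 and halts at once on any other input."""
    while n == 0:
        yield
    return n


def M(machine, n, m):
    """Return 1 if `machine` halts on n after exactly m steps, else 0. Always terminates."""
    run = machine(n)
    for steps in range(m + 1):
        try:
            next(run)
        except StopIteration:
            return 1 if steps == m else 0
    return 0  # still running after m steps, so the step count is not m


ALPHABET = "01+"  # hypothetical program alphabet


def programs_of_size(y):
    """All programs of character length y: a finite, effectively listable set."""
    return ["".join(chars) for chars in product(ALPHABET, repeat=y)]


print(M(halve_until_odd, 8, 3), M(loop_on_zero, 0, 100), len(programs_of_size(2)))
```

Note that `M` never needs to know whether the machine halts at all: a budget of `m + 1` calls to `next` is enough to decide whether the step count equals `m`.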

The length $\ell(p)$ of a program $p$ is not a step-counting measure because it violates axiom (2). Examples of step-counting measures are the number of statements executed by a program, and the amount of time taken by a program (measured in CPU clock units). These measures are not size measures, since they depend on the argument $n$.

In the literature, size measures are often called "static" complexity measures, and step-counting measures are called "dynamic" complexity measures. Lofgren (17) gives them the names "complexity of description" and "complexity of interpretation", respectively. In some of the literature (e.g. Hartmanis and Hopcroft (18)), "complexity measure" is used as a synonym for "step-counting measure".

The theorems obtained by Blum are not sharp enough to be useful for our purposes. This is an inevitable consequence of the fact that they hold for any class of machines. For example (Blum (15)): Let $|\cdot|_M$ and $|\cdot|_T$ be measures of the sizes of the machines $\{M_i\}$ and $\{T_i\}$ respectively, where $\{M_i\}$ and $\{T_i\}$ are so ordered that $M_i$ and $T_i$ both compute $\phi_i$. Then there exists a recursive function $g$ such that, for all $i$:

(1) $g(|i|_M) \ge |i|_T$

(2) $|i|_M \le g(|i|_T)$

Thus, the sizes of $M_i$ and $T_i$ are not "too different". However, the bound imposed by $g$ is much greater than any bound which we may find useful. (It is only a bound in the sense that there are functions which grow more quickly than any recursive function.)

A result of more interest to us is that there exists a primitive recursive function whose smallest primitive

recursive derivation is considerably larger than its smallest general recursive derivation. The practical implication of this is that if we characterise modelling as a search for short algorithms, then we should be prepared to use programming languages which allow general recursion, even if we only wish to compute primitive recursive functions.

2.2 Algorithmic Information Theory

2.2.1 Kolmogorov Complexity

Much of the support for our view of modelling is provided by the theory of complexity and information developed by Kolmogorov (19), (20). Kolmogorov complexity is defined in terms of lengths of programs, and is therefore more akin to a static than a dynamic measure. It is, however, a property of sequences (observed behaviours in our application) rather than of machines for computing them. The results presented below have been taken from the excellent survey paper by Zvonkin and Levin (21). Proofs of most of them can be found there.

The results in which we are interested are concerned with finite binary sequences, which we call words. By means of the bijection:

$$\Lambda \leftrightarrow 0,\quad 0 \leftrightarrow 1,\quad 1 \leftrightarrow 2,\quad 00 \leftrightarrow 3,\quad 01 \leftrightarrow 4,\ \ldots$$

we can regard words as nonnegative integers, and conversely (here $\Lambda$ is the empty word). Thus effective procedures transforming words into words are viewed as partial recursive functions mapping integers into integers. We denote by $\ell(x)$ the length of the word $x$, i.e. the number of bits it contains, and by $xy$ the concatenation of the words $x$ and $y$. Whether $x$ denotes a word or the corresponding integer will always be clear from the context. For example, if $x = 00$, then by $\ell(x)$ is meant $\ell(00)$, but by $\log_2(x)$ is meant $\log_2(3)$. Thus we do not distinguish between $x = 00$ and $x = 3$. Note that the above bijection is not the usual binary coding of integers. In particular, sequences which differ only in the number of leading zeros are associated with different integers.

Clearly, the following result holds:

$$\ell(xy) = \ell(x) + \ell(y) \qquad (2.1)$$

Also, it can easily be shown that

$$|\ell(x) - \log_2(x)| \le 1, \quad (x > 0) \qquad (2.2)$$

Definition (2.2.1)

Let $F^1$ be an arbitrary partial recursive function of one variable. Then the Kolmogorov complexity of the word $x$ with respect to $F^1$ is defined as:

$$K_{F^1}(x) = \begin{cases} \min \ell(p), & \text{s.t. } F^1(p) = x \\ \infty, & \text{if no such } p \text{ exists} \end{cases} \qquad (2.3)$$
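The correspondence between words and nonnegative integers used above can be sketched in a few lines of Python. The function names are hypothetical; the construction (write $n + 1$ in binary and drop the leading 1) is one standard way of realising the listed pairs, assumed here because it reproduces them exactly. The sketch also checks results (2.1) and (2.2) numerically over a small range, which is a sanity check rather than a proof.

```python
import math


def word_to_int(word: str) -> int:
    """Map a binary word to its integer: '' -> 0, '0' -> 1, '1' -> 2, '00' -> 3, ..."""
    return int("1" + word, 2) - 1


def int_to_word(n: int) -> str:
    """Inverse map: write n + 1 in binary and strip the '0b1' prefix."""
    return bin(n + 1)[3:]


def length(n: int) -> int:
    """l(x): the number of bits in the word associated with the integer n."""
    return len(int_to_word(n))


pairs = [(w, word_to_int(w)) for w in ["", "0", "1", "00", "01"]]
additive = length(word_to_int("00" + "101")) == length(word_to_int("00")) + length(word_to_int("101"))
bound_holds = all(abs(length(x) - math.log2(x)) <= 1 for x in range(1, 5000))
print(pairs, additive, bound_holds)
```

Words differing only in leading zeros, such as `'1'` and `'01'`, map to different integers (2 and 4), as the text notes.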

A generalisation of this is the concept of conditional complexity:

Definition (2.2.2)

The conditional Kolmogorov complexity of the word $x$, for a given $y$, with respect to the partial recursive function $F^2$ (of two variables) is defined to be

$$K_{F^2}(x \mid y) = \begin{cases} \min \ell(p), & \text{s.t. } F^2(p, y) = x \\ \infty, & \text{if no such } p \text{ exists} \end{cases} \qquad (2.4)$$

Complexity was introduced by Kolmogorov (19), although it was also used by Solomonoff (22) and Chaitin (23), who used the terminology "the length of the shortest program required to compute $x$ from $y$, using a Turing Machine which computes the function $F^2$". This terminology shows the intended interpretation of the definitions: $p$ is thought of as a code or program for $x$, and $F$ is thought of as a decoding device or computer. Since $p$ is the shortest program which will do the job, the (conditional) complexity is in some sense the "smallest amount of information required to obtain $x$ (from $y$), using $F$".
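The definition of complexity with respect to a fixed decoder can be tried out directly when the decoder is total and simple. The sketch below assumes a hypothetical toy decoder `decode`, invented for this example, which treats a program beginning with `0` as a literal copy of the rest, and a program beginning with `1` as an instruction to repeat the pattern `01` a number of times given by the remaining bits. The function `complexity` searches programs in order of length up to a bound and returns `None` where the definition would give infinity within that bound. Such a search is only possible because this decoder always halts.

```python
from itertools import product


def decode(p: str):
    """Hypothetical toy decoder F(p); undefined (None) on the empty program."""
    if p == "":
        return None
    if p[0] == "0":
        return p[1:]                   # literal copy of the remaining bits
    repeats = int("1" + p[1:], 2) - 1  # remaining bits read as an integer
    return "01" * repeats              # highly regular output


def complexity(x: str, max_len: int):
    """Length of a shortest program p with decode(p) == x, searching up to max_len bits."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if decode("".join(bits)) == x:
                return n
    return None


regular = "01" * 8
print(complexity(regular, 6), complexity("0110", 6), complexity("0110", 3))
```

The regular 16-bit word has a 4-bit program under this decoder, whereas the irregular word `0110` can only be produced by a literal copy, one bit longer than itself.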

Theorem (2.2.3)

There exists a partial recursive function $F_0^2$ (called optimal), such that, for any partial recursive function $G^2$, there exists a constant $C$ (depending only on $F_0^2$ and $G^2$), such that

$$K_{F_0^2}(x \mid y) \le K_{G^2}(x \mid y) + C \qquad (2.5)$$

This theorem is due to Kolmogorov (19) and Solomonoff (22). If $F_0^2$ is thought of as a general purpose computer, then the

theorem is easily seen to be true. At worst, the program for obtaining $x$ using $G^2$ is prefixed by a program for $F_0^2$ which causes it to simulate $G^2$, and the resulting program computes $x$ using $F_0^2$. $C$ is the length of the simulation program. However, $K_{F_0^2}(x \mid y)$ may be less than $K_{G^2}(x \mid y) + C$, because the greater flexibility of $F_0^2$ may allow $x$ to be obtained by some shorter program.

As a corollary of this theorem we have:

Theorem (2.2.4)

For any two optimal partial recursive functions $F^2$ and $G^2$, there exists a constant $C$ (depending only on $F^2$ and $G^2$), such that

$$|K_{F^2}(x \mid y) - K_{G^2}(x \mid y)| \le C \qquad (2.6)$$

We henceforth assume that a fixed optimal function has been chosen, omit the subscript, and refer to complexities simply as $K(x)$ or $K(x \mid y)$.

At this point it is useful to reflect on the significance of Kolmogorov complexity, in the light of Church's Thesis. It is an intuitively appealing measure of the smallest amount of information required to obtain an entity, $x$, by any effective procedure. Furthermore, and remarkably, result (2.6) together with

$$\lim_{x \to \infty} K(x \mid y) = \infty \qquad (2.7)$$

show that, for most such entities, this measure is approximately invariant. If the program $p$ is considered to be the coding of a theory which relates $x$ to $y$, then it measures the number of arbitrary decisions embodied in a theory. It can therefore be viewed as measuring the quality of theories, since it is generally believed that, other things being equal, a simpler theory is better than a more complex one.

The following theorem has most important consequences for modelling, as will be seen later.

Theorem (2.2.5)

The function $K(x)$ is not partial recursive. Moreover, no partial recursive function $\phi(x)$, defined on an infinite set of points, can coincide with $K(x)$ in the whole of its domain of definition.

In other words, there is no effective way of obtaining a measure of the (intuitive) complexity of theories (the corresponding theorem holds for $K(x \mid y)$). The theorem was first stated without proof by Kolmogorov (19). Zvonkin and Levin (21) prove the theorem on the following lines. Suppose that a partial recursive function $\phi(x)$, defined on an infinite set of points and coincident with $K(x)$ on its domain of definition, does exist. Then we can imagine a computer which operates as follows: for each $m$, it uses $\phi(x)$ to find an $x$ which is in the domain of definition of $\phi(x)$, and for which

$K(x) > m$. Denote this value $F(m)$. Then $K(F(m)) > m$. However, this computer (call it $F$) requires only to be given $m$ in order to find $F(m)$. Hence $K_F(F(m)) \le \ell(m)$. But we know that for some $C$,

$$K(F(m)) \le K_F(F(m)) + C \le \ell(m) + C.$$

Hence we know that, for some $C$, and for all $m$, $m < \ell(m) + C$, which is false. Hence such a $\phi$ cannot exist.

The above proof relies on $F(m)$ being general recursive. That this can be arranged for is shown as follows: the domain of $\phi(x)$ is by supposition an infinite recursively enumerable set. But every such set contains an infinite recursive subset (Rogers (9), theorem 5-IV). Hence it is possible for $F(m)$ to examine only integers $x$ for which $\phi(x)$ is defined, and therefore $F(m)$ is defined for each $m$.

2.2.2 Randomness

Note that there exists a $C$, independent of $x$, such that

$$K(x) \le \ell(x) + C \qquad (2.8)$$

This result says that if all else fails, we can always compute $x$ by making $p$ a copy of $x$, together with instructions telling the optimal computer simply to copy its input, symbol by symbol. This corresponds to computing $x$ by using a "table look-up".
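The "table look-up" bound can be illustrated with an ordinary compressor, with heavy caveats. In the sketch below, `zlib` stands in for a decoding machine; it is not an optimal function in the sense of Theorem (2.2.3), lengths are counted in bytes rather than bits, and the names `describe` and `reconstruct` are hypothetical. A description is a one-byte flag followed either by a literal copy of the data or by a compressed form, whichever is shorter, so its length never exceeds the data length by more than the constant 1.

```python
import random
import zlib


def describe(x: bytes) -> bytes:
    """Shortest of two descriptions: flag 0 + literal copy, or flag 1 + zlib payload."""
    literal = b"\x00" + x
    packed = b"\x01" + zlib.compress(x, 9)
    return packed if len(packed) < len(literal) else literal


def reconstruct(p: bytes) -> bytes:
    """The fixed decoder: copy the input, or decompress it, according to the flag."""
    return p[1:] if p[:1] == b"\x00" else zlib.decompress(p[1:])


regular = b"01" * 500
rng = random.Random(0)  # fixed seed so that the example is repeatable
noisy = bytes(rng.randrange(256) for _ in range(1000))

for name, data in [("regular", regular), ("noisy", noisy)]:
    p = describe(data)
    print(name, len(data), len(p), reconstruct(p) == data)
```

The regular sequence receives a description far shorter than itself, while the pseudo-random one typically falls back to the literal copy; in both cases the description length is bounded by the data length plus a constant.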

Theorem (2.2.6)

The proportion of words of length $\ell(x)$ for which $K(x) < \ell(x) - m$ does not exceed $2^{-m+1}$.

This means that most finite sequences have nearly maximal complexity. Kolmogorov and Chaitin proposed that the property of maximal complexity is equivalent to the property of randomness. In other words, when we say that a sequence is "random", what we mean is that we have no way of computing it, other than by looking up its terms in a table. This idea is a development of Church's suggestion that von Mises' "Law of Excluded Gambling Systems" (25), (26) for collectives (random sequences) can be formalised by stipulating that no effective procedure can exist for computing successful gambles on the outcomes of such sequences, and associating "effective procedure" with "partial recursive function".

Sequences are considered to be non-random if they contain sufficiently many regularities. A regularity is "any verifiable property of a sequence inherent only in a narrower class". More precisely, the measure of the set of sequences containing more than $m$ bits of regularity cannot exceed $2^{-m}$. It is essential that the regularities are verifiable, so that, as Zvonkin and Levin (21) say: "We regard as random those sequences which under any algorithmic test and in any algorithmic experiment behave as random sequences".

To explain the above paragraph more carefully, we shift our attention from finite to infinite binary sequences. We denote an infinite binary sequence by $\omega$, and the set of all such sequences by $\Omega$. The initial segment of $\omega$, of length $n$, is denoted by $(\omega)_n$.

Definition (2.2.7)

Let $P$ be a probability measure on $\Omega$. A correct method of proof of $P$-regularity, or $P$-test, is defined to be a function $F(x)$ which satisfies the following conditions:

(a) It is general recursive.

(b) For $m > 0$, $P\{\omega : F(\omega) \ge m\} < 2^{-m}$, where $F(\omega) = \sup_n F((\omega)_n)$.

$F(\omega)$, which is the "quantity of regularities" found by a test, is taken to be the value of the test. The $P$-test $F$ is said to reject $\omega$ if $F(\omega) = \infty$.

Let $\Gamma_x$ denote the set of all binary sequences whose initial segment is the word $x$. A probability measure on $\Omega$ (strictly, on the Borel $\sigma$-algebra of subsets of $\Omega$) can be defined by giving its values on the sets $\Gamma_x$. (To see this, imagine infinite binary sequences to be binary expansions of real numbers in $[0,1)$. Then $\Gamma_0$, for example, corresponds to $[0,\tfrac{1}{2})$.)

Definition (2.2.8)

A probability measure $P$ on $\Omega$ is computable if there exist general recursive functions $F(x,n)$ and $G(x,n)$, such that the rational number

$$\alpha_P(x,n) = \frac{F(x,n)}{G(x,n)} \qquad (2.9)$$

approximates $P(\Gamma_x)$ to within an accuracy of $2^{-n}$.

Theorem (2.2.9)

For any computable measure $P$ there exists a $P$-test $F$, called universal, such that for any $P$-test $G$ a constant $C$ can be found such that, for all $\omega$,

$$G(\omega) \le F(\omega) + C \qquad (2.10)$$

Definition (2.2.10)

A sequence $\omega$ is called random with respect to a measure $P$ if it withstands any $P$-test.

With this definition, every sequence which is random with respect to P satisfies every conceivable effectively verifiable law of probability theory, since the violation of such a law would constitute a regularity which would be detected by some P-test.

Now consider a finite sequence x. The following construction can be used to define the "number of regularities", p(x), in x, with respect to the uniform measure L, defined by $L\{\Gamma_x\} = 2^{-\ell(x)}$ (L corresponds to Lebesgue measure on [0,1), and is the measure which corresponds to Bernoulli sequences with generating probability ½). Let F(x,n) denote the minimum value of the universal L-test on words of length n beginning with x. Then

$$p(x) = \lim_{n \to \infty} F(x,n) \qquad (2.11)$$
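The correspondence between the sets $\Gamma_x$ and intervals of [0,1), and the uniform measure L built on it, can be made concrete with a short sketch. The function names `cylinder_interval` and `uniform_measure` are illustrative assumptions, not notation from the text; exact rational arithmetic is used so that the dyadic end points are not rounded.

```python
from fractions import Fraction


def cylinder_interval(x: str):
    """Return (low, high) for the interval [low, high) matching Gamma_x.

    x is a finite binary word; Gamma_x is the set of infinite binary
    sequences beginning with x, read as binary expansions in [0, 1).
    """
    if any(bit not in "01" for bit in x):
        raise ValueError("x must be a binary word")
    width = Fraction(1, 2 ** len(x))
    # The empty word corresponds to the whole interval [0, 1).
    low = Fraction(int(x, 2), 2 ** len(x)) if x else Fraction(0)
    return low, low + width


def uniform_measure(x: str) -> Fraction:
    """L{Gamma_x} = 2^(-l(x)): the length of the matching interval."""
    low, high = cylinder_interval(x)
    return high - low


# Gamma_0 corresponds to [0, 1/2), as in the example in the text.
print(cylinder_interval("0"))
# The measure is additive over the two one-symbol extensions of a word.
print(uniform_measure("01") == uniform_measure("010") + uniform_measure("011"))
```

Only the uniform measure is sketched here; a general computable measure would need the approximating functions F(x,n) and G(x,n) of Definition (2.2.8) instead of a closed formula.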

The quantity $\ell(x) - p(x)$ is analogous in several respects to complexity, and is related to it by the following theorem:

Theorem (2.2.11)

There exists a constant C such that

$$|(\ell(x) - p(x)) - K(x)| \le 4\ell(\ell(x)) + C \qquad (2.12)$$

As a corollary of this we obtain (since a random sequence has a finite number of regularities):

Theorem (2.2.12)

For any sequence $\omega$, random with respect to L, there exists a constant C, such that

$$K((\omega)_n) \ge n - 4\ell(n) - C \qquad (2.13)$$

The above development is due to Martin-Löf (24), and supports the contention that "random" is equivalent to "maximally complex".

2.2.3 Information

As was remarked in section 2.2.1, Kolmogorov complexity is an appealing measure of the intuitive concept of the "amount of information" required to obtain, or reconstruct, an object by any effective procedure. An analogy can be discerned between complexity and entropy: it is generally accepted that entropy is the "average amount of information" required to select (i.e. predict) an event from a given probability space. Furthermore, entropy is usually taken to be a measure of the randomness of a collection of events. Section 2.2.2 suggests that complexity is a suitable measure of the randomness of a sequence. Pursuing the analogy, we make the following definition, originally proposed by Kolmogorov (19):

Definition (2.2.13)

The quantity of information in y about x is defined to be

$$I(y:x) = K(x) - K(x|y) \qquad (2.14)$$

This is in direct analogy with the classical Shannon quantity of information, in one random variable about another, which is

$$J(\eta:\xi) = H(\xi) - H(\xi|\eta) \qquad (2.15)$$

where H denotes entropy. This classical quantity has the following properties:

$$J(\eta:\xi) \ge 0 \qquad (2.16)$$

$$J(\xi:\xi) = H(\xi) \qquad (2.17)$$

$$J(\xi:\eta) = J(\eta:\xi) \qquad (2.18)$$

For the algorithmic quantity of information, the corresponding properties hold only approximately:

Theorem (2.2.14)

There exist positive constants, $C_1$, $C_2$, $C_3$, (independent of x and y) such that

$$I(x:y) \ge -C_1 \qquad (2.19)$$

$$|I(x:x) - K(x)| \le C_2 \qquad (2.20)$$

$$|I(x:y) - I(y:x)| \le 12\,\ell(K(xy)) + C_3 \qquad (2.21)$$

A concrete link between complexity and entropy is established by the following theorem:

Theorem (2.2.15)

Suppose that a word x, of length $\ell(x) = ir$, consists of i words, each of length r. Suppose that each of the $2^r$ possible words of length r (label such a word k) (k = 1, ..., $2^r$) occurs in x with frequency $q_k$. Then there exists a constant C, such that

$$K(x) < i\,(H(q_k) + \varepsilon(i)) + C \qquad (2.22)$$

where

$$H(q_k) = -\sum_{k=1}^{2^r} q_k \log_2 q_k \qquad (2.23)$$

$$\varepsilon(i) = C_r \frac{\ln i}{i} \to 0 \ \text{ as } \ i \to \infty \qquad (2.24)$$

A closer connection between algorithmic and probabilistic information can be established for arbitrary stationary ergodic random processes (see Zvonkin and Levin (21)).
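The leading term $i\,H(q_k)$ of the bound in Theorem (2.2.15) is easy to evaluate for a given word, which may help in seeing what the theorem says. The sketch below is only an illustration under stated assumptions: the names `block_entropy` and `leading_term` are hypothetical, the correction $\varepsilon(i)$ and the constant C are ignored, and nothing here computes K(x) itself, which is not computable.

```python
import math
from collections import Counter


def block_entropy(x: str, r: int) -> float:
    """Entropy H(q_k), in bits, of the frequencies of the length-r blocks of x."""
    if r <= 0 or len(x) % r != 0:
        raise ValueError("the length of x must be a positive multiple of r")
    blocks = [x[j:j + r] for j in range(0, len(x), r)]
    i = len(blocks)
    counts = Counter(blocks)
    # q_k = count / i; blocks that never occur contribute nothing to the sum.
    return -sum((c / i) * math.log2(c / i) for c in counts.values())


def leading_term(x: str, r: int) -> float:
    """The term i * H(q_k) from the right-hand side of (2.22)."""
    return (len(x) // r) * block_entropy(x, r)


# A periodic word uses a single block type, so the leading term vanishes.
print(leading_term("01" * 8, 2))
# A word in which all four 2-bit blocks are equally frequent gives
# a leading term equal to its own length.
print(leading_term("00011011", 2))
```

In the second example the bound is no better than storing the word itself, which matches the reading of entropy as a measure of randomness given in this section.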

2.2.4 The Work of Chaitin and Schnorr

Chaitin (23) also investigated the properties of binary sequences which need maximal-length programs for their computation. His formalism was different from that used above: he worked directly with particular Turing machines. Chaitin showed that most sequences have maximal complexity, and he obtained a result relating complexity to entropy, similar to theorem (2.2.15). Also, he showed that the predicate $[\ell(x) = n \ \&\ K(x) < m]$ is not decidable. This is a weaker result than theorem (2.2.5), since from it one can deduce that K(x) is not general recursive, but not that K(x) is not partial recursive.

Chaitin suggested that "maximal complexity" is an appropriate explication for "randomness". He also hinted that complexity may be a measure of the poorness of scientific theories, arguing that "static" complexity is a more appropriate measure for this than "dynamic" complexity (he did not use these terms of course, since they were not yet current at the time).

Schnorr (27) has investigated related questions without using the concept of program. Instead, he has examined Gödel numberings of partial recursive functions. He has shown that there exist optimal Gödel numberings, relative to which the lowest Gödel number of each partial recursive function is rather low when compared with the lowest Gödel number of the same function relative to any other Gödel numbering. Optimal Gödel numberings correspond closely to Kolmogorov's optimal partial recursive functions: if an optimal partial recursive function is universal, then it is an optimal Gödel numbering. (However, Schnorr shows that not all optimal functions are universal). It should be noted that Schnorr investigates properties of functions, whereas Kolmogorov investigates properties of sequences, without specifying the functions which are to compute them.

Schnorr's main result is that an optimal Gödel numbering is essentially unique in the following sense: given any two optimal Gödel numberings, one can obtain a Gödel number of a function f, relative to one of the numberings, from a Gödel number of f relative to the other numbering, by a recursive isomorphism t, such that t and $t^{-1}$ are linearly bounded (i.e. $\limsup_n t(n)/n < \infty$ and similarly for $t^{-1}$). However, Meyer (28) has shown that the sets of lowest Gödel numbers relative to any two Gödel numberings are not necessarily recursively isomorphic. Furthermore, Meyer remarks that this result extends to sets of "s-minimal" indices defined by any recursive function s which is a size-measure in the sense of Blum.

Thus a particular interpretation of these results of Schnorr and Meyer is the following: if one knows the shortest program for computing some function, written in a universal language, then one can effectively find a program, written in some other universal language, which computes the same function, such that the lengths of these two programs are "not too different". But one cannot, in general, effectively find the shortest program, written in the second language, which computes the same function.

2.3 Asymptotic Inference

We now turn to investigations of the process of inference of hypotheses from observations. It is convenient to classify these into two categories. In the first are those investigations which concentrate on good "asymptotic" results, namely on eventually obtaining good results if the set of observations grows without limit. The second category consists of work which emphasises inference from a fixed, finite sample. In this section we examine the first of these categories.

2.3.1 Grammatical Inference

The problem of finding a model for a behaviour is often posed as follows: given a sample set of strings (words) from a language, identify the grammar which generates that language (29), (30). This formulation differs considerably from the characterisation of the modelling process which is developed in Chapter 3, and is mentioned here only for completeness. The main difference is that we regard a model as a function from observations to observations, whereas a grammar will in general be a relation between observations. In other words, a model in our sense can be regarded as a particular set of derivations in a grammar, rather than the grammar itself.

Another difference is that when considering grammatical inference it is often assumed that a "teacher" is available, who is able to supply not only those strings generated by the grammar to be inferred, but also some strings which cannot be generated by that grammar, and who "informs" the inference machine which type of string it is being shown. We do not allow this possibility, since it implies the existence of an agent who knows the "true model" of the system being investigated. Gold (31) has shown that the existence of such a teacher makes a great difference to the asymptotic capability of a machine for inferring grammars. A third difference is that it is usually assumed that each string in the language will eventually be presented to such a machine.

There are also similarities between grammatical inference and modelling. The most general class of grammars investigated, the so-called "type 0 grammars" or "general rewriting systems", are as powerful as Turing machines, in the sense that the set of strings generated by any such grammar is the range of some Turing machine, and conversely (32). Thus inferring a grammar is in general equivalent to inferring a Turing machine. This obviously makes the two problems strongly related.

Another similarity is that any language can be generated by more than one grammar, and any (finite) sample of strings which is used for inference can belong to more than one language. Consequently, some criterion is needed for choosing between rival grammars. One way of obtaining such a criterion is to regard the grammar as stochastic - each production in it occurs with a certain probability. Selection from among candidate grammars is then possible either by using Bayes' theorem to indicate the most probable grammar, or by using statistical testing techniques (33). The Bayesian approach requires the specification of a priori probabilities for the candidate grammars.

A second way of obtaining a criterion, and one which is of more interest to us, is to choose the least complex of the candidate grammars (29), (34). Another significant difference now emerges between the problems of grammatical inference and modelling as we understand it. If the complexity measure used is a "static" one, such as the sum of the lengths of the productions in the grammar, then the grammar which will usually be chosen is a universal grammar which generates the language consisting of all possible strings from the alphabet. Consequently the complexity measure used must include a component which is a "dynamic" measure, such as the number of derivation steps required to generate the sample set of strings. When such a measure is used, it is possible to effectively find the best grammar (relative to the complexity measure) which generates a particular sample (34). In our formulation of modelling, on the other hand, the use of a static complexity measure is appropriate, and it is not usually possible to find the best model in an effective manner (see chapter 3).
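The contrast between a static and a dynamic complexity measure for grammars can be illustrated with a toy comparison. The following is a minimal sketch under several assumptions that are not taken from the text: grammars are right-linear with the single nonterminal `S`, the static measure counts the symbols on both sides of every production, the dynamic measure is the fewest derivation steps needed for each sample string, and the two grammars and the sample are invented for the illustration.

```python
from collections import deque


def static_complexity(grammar):
    """Sum of the lengths of all productions (left side plus right side)."""
    return sum(len(lhs) + len(rhs) for lhs, rhs in grammar)


def derivation_steps(grammar, target, start="S"):
    """Fewest production applications deriving `target`, or None if none is found.

    Assumes a right-linear grammar whose only nonterminal is `start`.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        form, steps = queue.popleft()
        if form == target:
            return steps
        pos = form.find(start)
        if pos < 0:
            continue  # no nonterminal left to rewrite
        for lhs, rhs in grammar:
            if lhs != start:
                continue
            new_form = form[:pos] + rhs + form[pos + len(start):]
            prefix = new_form.split(start)[0]
            n_terminals = len(new_form.replace(start, ""))
            # Prune forms that can no longer lead to the target.
            if (n_terminals <= len(target) and target.startswith(prefix)
                    and new_form not in seen):
                seen.add(new_form)
                queue.append((new_form, steps + 1))
    return None


def dynamic_complexity(grammar, sample):
    """Total number of derivation steps needed to generate the sample."""
    return sum(derivation_steps(grammar, s) for s in sample)


sample = ["aab", "abb", "bab"]
# Universal grammar: generates every string over {a, b}.
universal = [("S", "aS"), ("S", "bS"), ("S", "")]
# Ad hoc grammar: generates exactly the sample.
ad_hoc = [("S", "aab"), ("S", "abb"), ("S", "bab")]

for name, g in [("universal", universal), ("ad hoc", ad_hoc)]:
    print(name, static_complexity(g), dynamic_complexity(g, sample))
```

On this sample the universal grammar is the smaller of the two by the static measure but needs more derivation steps, which is the effect described in the paragraph above. How the two components should be combined into one measure is not specified here and is left open in the sketch.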

We shall not discuss particular algorithms for grammatical inference, since these are not applicable to the modelling problem posed in the next chapter.

2.3.2 Inductive Inference

We use the term "inductive inference", in contradistinction to "grammatical inference", to denote the problem of inferring, from an observed behaviour, the algorithm which produced that behaviour as output. It will be clear from section 2.3.1 that inductive and grammatical inference have much in common. In this section we review the two papers which we consider to be by far the most significant in this field, namely Solomonoff (22), and Blum and Blum (35).

Solomonoff considered the problem of extrapolating a very long sequence of symbols. He formulated this as the problem of finding the degree of confirmation c(a,T) of the hypothesis that the sequence a will occur, given the evidence that the sequence T has just occurred. He considered this degree of confirmation to be a logical probability in the sense of Carnap (36).

Solomonoff's distinctive contribution to the solution of this problem was that he regarded the observed and predicted sequences as outputs of some Turing machine, and examined the properties of those binary "programs" for this machine which caused the observed and predicted sequences to be computed. He was apparently the first to examine the problem in these terms.

He presented several alternative schemes for calculating c(a,T). The first gives c(a,T) a high value if the concatenated sequence Ta can be computed by short programs and/or if it can be computed by many programs. Short programs are favoured because they represent simple hypotheses about the structure of the observed sequence, while sequences with numerous programs are favoured because of the feeling that if they can have many alternative "causes" then they are more "likely". The Turing machine to be used is a "universal machine", namely one which can simulate another universal machine by prefixing its programs with a fixed set of "translation instructions". Such a machine corresponds to Kolmogorov's "optimal function", and so a theorem similar to theorem (2.2.4) holds. This is used to show that c(a,T) is fairly independent of which universal machine is used, providing that T is long enough.

The main drawback of this scheme is that the evaluation of c(a,T) requires the summation of an infinite number of terms, most of which are not effectively computable. Solomonoff tries to overcome this by deriving suitable approximations, but the necessary approximations depend heavily on the nature of the sequences being extrapolated.
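The flavour of the first scheme, in which both short programs and numerous programs raise the degree of confirmation, can be seen in a deliberately crude toy. Everything in the sketch below is an assumption made for illustration and is not Solomonoff's construction: the "machine" simply repeats its program forever instead of being a universal Turing machine, each program p is given weight $2^{-\ell(p)}$, only programs up to `max_len` symbols are summed, and c(a,T) is taken as the ratio of the weight of programs producing Ta to the weight of programs producing T. Truncating the sum is exactly what sidesteps the infinite, non-computable summation mentioned above.

```python
from fractions import Fraction
from itertools import product


def toy_output(program: str, n: int) -> str:
    """First n symbols printed by the toy machine: `program` repeated forever."""
    reps = n // len(program) + 1
    return (program * reps)[:n]


def mass(sequence: str, max_len: int) -> Fraction:
    """Total weight 2^(-len(p)) of toy programs whose output begins with `sequence`."""
    total = Fraction(0)
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            program = "".join(bits)
            if toy_output(program, len(sequence)) == sequence:
                total += Fraction(1, 2 ** length)
    return total


def toy_confirmation(a: str, T: str, max_len: int = 8) -> Fraction:
    """Toy analogue of c(a,T): weight of programs for Ta over weight of programs for T."""
    denominator = mass(T, max_len)
    if denominator == 0:
        raise ValueError("no toy program of the allowed length produces T")
    return mass(T + a, max_len) / denominator


T = "010101"
print(toy_confirmation("0", T), toy_confirmation("1", T))
```

For the alternating evidence T the continuation "0" receives the larger value, because the shortest consistent program is the two-symbol pattern. The numbers depend on the toy machine and on `max_len`, so they should not be read as values of Solomonoff's c(a,T).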

His examples are: a Bernoulli sequence, a finite-order Markov chain, and the extrapolation of a set of strings generated by some formal language (i.e. grammatical inference).

The details of the other schemes proposed are not very relevant to us, but they all suffer from the same conceptual flaw - they require the computation of quantities which cannot be computed. Solomonoff's success in finding suitable approximations for certain special cases provides evidence in support of the validity of his schemes, but it does not make them into practical procedures. To make this point clearer: the amount of knowledge that is needed, in order to make a suitable approximation, is sufficient for proceeding direct to a solution, without any need to use Solomonoff's schemes.

It is important to note that Solomonoff's prediction is not based on just a single "best" program, but on the set of all programs which compute the sequence Ta. Thus c(a,T) is based on a "weighting" of all possible models, rather than on the "best" model.

The problem of inductive inference investigated by Blum and Blum (35) is very close to the problem of modelling, as formulated in Chapters 1 and 3. The observations are assumed to be a set of pairs (x,y), and the law which explains them to be an algorithm which computes f, such that f(x)=y. The problem is to find an algorithm for computing f. This is the same setting as that investigated by Solomonoff, but the formulation is slightly different. Blum and Blum study this problem in terms of "inference machines".

Blum and Blum attempt to characterise those functions f which it is possible to identify - that is, those functions for which it is possible to infer a correct algorithm. The characterisations which they obtain are in terms of step-counting (dynamic) complexity measures. Generally speaking, it is possible to infer a function if it is not too difficult to compute it. These results give the clearest connections between inference and complexity that have been established to date. However, a step-counting measure is very different in nature from a size measure, so these results can not be used to support the hypothesis that small models are good models. Nevertheless, the authors state their conviction that "to have quality, a hypothesis n bits long must explain more than could merely be encoded ... in that many bits".

The machines which Blum and Blum construct in their proofs invariably identify functions by a process of enumeration: the search through a Gödel numbering of all partial recursive functions is of course in order of increasing size, the Gödel number of an algorithm being a suitable size measure for it. (It is not possible for them to find "small" algorithms, in general, in the sense that the Gödel number which is inferred is the smallest Gödel number of the function). In one example, they nevertheless employ an interesting construction to ensure that the inferred algorithm is small enough to be meaningful. A machine is to identify some arbitrarily difficult 0-1 valued recursive functions. (A machine converges in the limit to i if it eventually outputs i and then never outputs a different number, upon being presented with some infinite sequence of pairs. It can identify f if, whenever it is given a complete sequence of pairs (x,f(x)), it converges to i and the partial recursive function $\varphi_i$ is an extension of f).

The machine works as follows. If its last conjecture is i, and it finds that $\varphi_i(y)$ is defined and $\varphi_i(y) \neq f(y)$, then it conjectures i+1. On the other hand, if $\varphi_i(y) = f(y)$ for all y < x then it tests $\varphi_i(x)$ in the following manner. First it constructs an upper bound, then it tests whether $\varphi_i(x) = f(x)$ within this upper bound. If so, it accepts i, otherwise it conjectures i+1. The interesting part is the construction of the upper bound: first a recursive function h is fixed. Let $\{\Phi_i\}$ be the set of step-counting measures being used. Then a j is searched for, such that $\varphi_j$ converges faster (i.e. in fewer steps) to f on inputs 0,1,...,max(2j,2x) than $\varphi_i$ converges on x. When a suitable j is found, take the upper bound to be

$$\max\bigl(h(x,f(x)),\ \max(\Phi_j(y) : y \le \max(2j,2x))\bigr)$$

Thus conjecture i is abandoned if it is discovered that some algorithm computes a restriction of the function to be inferred more quickly than i, for a considerably larger set of values.
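The enumeration skeleton of such a machine can be sketched in a few lines. This is a simplification made under explicit assumptions and is not Blum and Blum's construction: the candidates are a finite list of total Python functions standing in for $\varphi_0, \varphi_1, \ldots$, so every test terminates, and the step-counting upper bound (the interesting part described above) is omitted entirely. The name `identify_by_enumeration` is hypothetical.

```python
def identify_by_enumeration(candidates, observations):
    """Yield the current conjecture (an index) after each observed pair (x, y).

    The conjecture i is kept while candidates[i] agrees with every pair seen
    so far, and is replaced by i+1 as soon as a disagreement is found.
    None is yielded once the list of candidates is exhausted.
    """
    i = 0
    seen = []
    for x, y in observations:
        seen.append((x, y))
        while i < len(candidates) and any(
            candidates[i](u) != v for u, v in seen
        ):
            i += 1  # abandon the conjecture and move to the next index
        yield i if i < len(candidates) else None


# Hypothetical 0-1 valued candidates, in order of increasing index.
candidates = [
    lambda n: 0,          # index 0: constant zero
    lambda n: n % 2,      # index 1: parity
    lambda n: 1,          # index 2: constant one
]
observations = [(n, n % 2) for n in range(4)]  # pairs (x, f(x)) for f = parity
print(list(identify_by_enumeration(candidates, observations)))
```

The sequence of conjectures settles on index 1 and never changes again, which is convergence in the limit in the sense defined above. With partial recursive functions the inner test could fail to terminate, which is why the machine in the text needs its upper bound on the number of steps.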

Now the reason why j can be regarded as a potentially meaningful explanation of the data is this: think of j as the jth program for some universal machine. If it is written in a binary alphabet its length is roughly $\log_2 j$, and it is therefore not large enough to store (f(0),f(1),...,f(2j)) in a look-up table (recall that $f(n) \in \{0,1\}$).

2.4 Small-Sample Inference

2.4.1 Wrinch and Jeffreys

In 1921 Wrinch and Jeffreys (37) proposed that competing models of a set of observations should be assessed on the basis of simplicity. They suggested that any model in physics could be formulated as a differential equation; if two differential equations explained the same set of data, then the one with the fewer parameters should be preferred. In fact they proposed assigning probabilities to models in this form, the probability being a decreasing function of the number of parameters. Popper (38) suggested a similar but vaguer scheme, but argued that Wrinch and Jeffreys should have regarded the simpler models as the less probable ones. This argument seems to arise from an almost wilful confusion of two separate concepts, and it is convenient to clarify these at this stage.

It seems clear that Wrinch and Jeffreys use the term "probability" in the sense of "degree of confirmation". That is, having observed some data, and having conjectured some hypotheses which are capable of explaining the data, they are assessing the relative likelihoods of the various predictions entailed by those hypotheses. This is equivalent to assessing the quality of the competing hypotheses. In this sense "probability" is synonymous with Popper's "degree of corroboration", which, he insists, increases with simplicity. Thus Popper's view is consistent with that of Wrinch and Jeffreys, if this interpretation of "probability" is admitted. (Popper argues, however, that "degree of corroboration" cannot be interpreted as a probability. This is probably correct, but it is a separate point. An argument which supports it is given in Chapter 8).

Whereas Wrinch and Jeffreys refer to something like "the probability that this model correctly predicts future behaviour, in the light of the behaviour already observed and the models which have been conjectured", Popper uses "probability" to mean, roughly, "given that some observations are to be made, what is the probability that it will be possible to explain them using a model with n parameters?" Now it is quite reasonable, intuitively, that this probability should increase with the number of parameters, as Popper suggests (if it is defined at all).

Thus no conflict arises between Wrinch and Jeffreys and Popper, providing that it is borne in mind that Wrinch and Jeffreys use "probability" to indicate a property of a model, in the light of observations, whereas Popper uses it to denote an a priori property of the observations. This thesis is concerned entirely with the properties of models.

As it stands, the criterion of quality proposed by Wrinch and Jeffreys is not a practical one. Empirical data would usually need a very high order differential equation model to fit it. In practice, some approximation is tolerated in order to allow a simpler model, and some trade-off between approximation and complexity has to be made. Also, the only support for their proposal is the intuitive feeling that the number of parameters is the appropriate measure of complexity. Apart from this, however, the criterion of Wrinch and Jeffreys is very similar to the proposed criterion of model quality which is introduced in Chapter 3, and is close to being a special case of it.

2.4.2 Gaines

Gaines has r e c e n t l y p r o p o s e d general

system identification problem

i s a t i o n of C h a p t e r Gaines'

a f o r m u l a t i o n of the

proposals.

Gaines

considers

of m o d e l s w h i c h are of interest, p a r t i a l o r d e r i n g of m o d e l s

of m o d e l s

The c h a r a c t e r -

3 can be v i e w e d as a special case of

p r o b l e m to be d e f i n e d by an o b s e r v e d

b e i n g called

(14).

"complexity"),

an i d e n t i f i c a t i o n behaviour,

a class

an a r b i t r a r y but fixed

in this class

(this o r d e r i n g

and an a r b i t r a r y partial o r d e r i n g

in this class w h i c h is i n d u c e d by the p a r t i c u l a r

behaviour being observed (this ordering being called "approximation"). The "admissible subset" of models is then the set of models which has the property that, if m is a member of this subset, then no model exists which is both less complex than m, and gives a better approximation to the behaviour than m. Which model from among those in the admissible subset is to be preferred depends on the particular circumstances of each modelling exercise. Gaines considers the identification of the admissible subset to be the solution to the identification problem. We go further than this in our characterisation, and in effect propose a particular trade-off between approximation and complexity.

A point which will be of interest later (in Chapter 8) is that Gaines associates changes in the ordering relations with Kuhn's "scientific revolutions" (39). Thus a change in either or both of these relations corresponds to a new "view of science".

Gaines has investigated the special case of the behaviours to be modelled being sequences of symbols, the class of models being finite-state automata (both deterministic and probabilistic), complexity being measured by the number of states, and the measure of approximation being based on the Hamming distance between observed and computed behaviours. In this case the set of admissible models is effectively computable. However, it is not of practical interest to us to identify the admissible subset in this manner, partly because of excessive computational requirements, but mainly because many behaviours, which can be produced by compact algorithms, can be modelled only by maximally complex finite-state automata. Thus the

use of this class of models does not lead to a discernible structure when used for the identification of many simple behaviours. An example of such a behaviour is that produced by the program:

    n := 1;
    loop: n := n*(n+1);
    write(n);
    goto loop;

namely, the sequence 2, 6, 42, 1806, .... This program cannot be correctly implemented by either a deterministic or stochastic finite state automaton.
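The program quoted above can be transcribed directly. The following is a minimal Python sketch, in which the generator name `sequence` and the choice to stop after five terms are ours; it only illustrates how quickly the written values grow.

```python
from itertools import islice


def sequence():
    """Yield the values written by: n := 1; loop: n := n*(n+1); write(n); goto loop."""
    n = 1
    while True:
        n = n * (n + 1)
        yield n


# The original loop never halts; here we stop after five terms.
first_terms = list(islice(sequence(), 5))
print(first_terms)
```

The number of digits roughly doubles at each step, so the values soon exceed anything that a fixed, finite set of states could hold.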

2.4.3 Löfgren

Löfgren (40) has made the rather unlikely suggestion that "as soon as a scientist believes that he has produced a theory for some phenomenon, he should try to formalise the theory so as to make it effectively communicable". By "formalise" Löfgren means that the theory should be translated into one of the formal logical systems! However, if "formalise" is reinterpreted to mean that the theory should be expressed as an algorithm for computing the observed phenomenon, then Löfgren's views on the quality of theories become virtually identical with our views on models. For example, his key hypothesis is:

"Let S and S' be two formal theories with the same logical basis, both of which explain one set of experimental facts with comparatively short and few proper axioms and proper rules of inference. Then, if both S and S' predict further experimental results (as being proper theorems or not), the theory with the simplest proper axioms and proper rules has the greater predictive power, in the sense that its predictions are more likely to agree with further experiments. The simplicity of the proper axioms and proper rules of inference can be measured by the total length of all corresponding well formed formulae".

This hypothesis will be seen to be very close to the basic idea underlying this thesis, that the quality of a model can be measured by the shortness of the program required to realise it. Also, Löfgren has anticipated one of the key concepts which will be introduced later:

"Let $\{h_1,\ldots,h_n\}$ be a set of experimentally obtained facts. This set can be considered the theoremhood of a formal theory $S_0$ with $\{h_1,\ldots,h_n\}$ as the set of axioms, with no rules of inference, and thus with a logical basis which is empty .... Such a mere listing of the experimentally obtained facts has no predictive power whatsoever".

The theory $S_0$ corresponds to what we shall later call the "trivial model", and which will serve as a standard against which the quality of models will be measured.

Löfgren also investigated the connection between randomness of a sequence and the length of program required to compute it. He proved a weaker form of theorem (2.2.5) by using the recursion theorem (see Rogers (9)), and showed

that there can be no algorithm for finding a sequence whose Kolmogorov complexity is greater than a given value.

2.5 Inference of Parameters of Gauss-Markov Process

Recently, Rissanen (41) has investigated the estimation of the structure, as well as the real-valued parameters, of a Gauss-Markov process, by using the idea that a good model is a short description of the observed data. Suppose that the set $y = (y(0),\ldots,y(N-1))^T$ has been observed, and that it was generated by the process

$x(t+1) = Ax(t) + Be(t)$
$y(t) = Cx(t) + e(t)$, $x(0) = 0$, $t = 0,\ldots,N-1$.

Then $((A,B,C);\ e(0),\ldots,e(N-1))$ can be considered to be a description of y, since y can be recovered from it by using $y = Te$, where

$T = \begin{pmatrix} I & & & \\ CB & I & & \\ CAB & CB & I & \\ \vdots & & & \ddots \end{pmatrix}$

y is supposed to be recorded with an accuracy of a fixed number of fractional bits. It is therefore meaningful to consider the shortest description of y. It is known that the expected length of such a description is bounded below by the entropy of y. Rissanen demonstrates that, if the triple (A,B,C) is such that the sequence e is serially uncorrelated, and if the distribution is Gaussian, then e can be encoded as a binary string whose expected length nearly attains the lower bound. He also shows that the bound is nearly reached (and in the scalar case is in fact reached) by simply writing each e(t) as a binary number.

There are many triples (A,B,C) which give rise to the same sequence e. Each triple (A,B,C) can be identified with a pair $(s,\theta)$, where $\theta$ is a vector of real-valued parameters and s represents the structure. s is an integer which, by an enumeration (42), is associated with a set of integers which define the order of the system and the positions of the components of $\theta$. Thus another form of description of y is a triple $(s,\theta,\varepsilon)$, where $\varepsilon$ is a code of e. A description $(s^*,\theta^*,\varepsilon^*)$ of y is called concise if $s^*,\theta^*$ make e uncorrelated, and $\varepsilon^*$ is the shortest coding of e. An estimator for $s,\theta$ is one which codes the observations, $y \to (s,\theta)$. An estimator is concise if the description $(s,\theta,\varepsilon)$ associated with it is concise.

Given a structure s, a candidate for the best estimator of $\theta$ is one which at least minimises the estimated entropy of e. For a Gaussian distribution, such an estimator coincides asymptotically with the maximum likelihood estimator of $\theta$ given s. Thus, in addition to minimising the mean length of the description of $\varepsilon$, a concise estimator also minimises the length of the description of $\theta$. If $s^*$ is the true structure of the process, then the estimates given by this estimator are asymptotically Gaussian at $s = s^*$.
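The claim that $((A,B,C); e)$ is a description of y, recoverable through $y = Te$, can be checked numerically. The sketch below is restricted to the scalar case and uses exact rational arithmetic; the function names and the sample values of a, b, c and e are illustrative assumptions, not taken from Rissanen's paper.

```python
from fractions import Fraction


def simulate(a, b, c, e):
    """Run x(t+1) = a*x(t) + b*e(t), y(t) = c*x(t) + e(t), starting from x(0) = 0."""
    x, y = Fraction(0), []
    for e_t in e:
        y.append(c * x + e_t)
        x = a * x + b * e_t
    return y


def impulse_matrix(a, b, c, n):
    """Lower-triangular T with unit diagonal and T[i][j] = c * a**(i-j-1) * b below it."""
    T = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = Fraction(1)
        for j in range(i):
            T[i][j] = c * a ** (i - j - 1) * b
    return T


def innovations(a, b, c, y):
    """Invert the recursion: recover the sequence e from y for the same (a, b, c)."""
    x, e = Fraction(0), []
    for y_t in y:
        e_t = y_t - c * x
        e.append(e_t)
        x = a * x + b * e_t
    return e


a, b, c = Fraction(1, 2), Fraction(1), Fraction(2)   # hypothetical scalar triple
e = [Fraction(v) for v in (1, -2, 0, 3)]             # hypothetical innovations
y = simulate(a, b, c, e)
T = impulse_matrix(a, b, c, len(e))
y_from_T = [sum(T[i][j] * e[j] for j in range(len(e))) for i in range(len(e))]
print(y == y_from_T, innovations(a, b, c, y) == e)
```

Because T has a unit diagonal it is always invertible, so for a fixed triple the sequences y and e determine one another; this is what allows e, rather than y, to be the part of the description that is encoded.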

Consequently it is possible to estimate the entropy of $\theta$, conditional on $s^*$. Furthermore, $H(\theta|s) \geq H(\theta|s^*)$, where H denotes entropy, and Rissanen shows how $H(\theta|s)$ may be estimated. If $H(\theta|s)$ is minimised, then not only has a $\theta$ been found whose shortest expected length of description is the shortest possible, but the associated s must also be the true structure. Thus an estimator which minimises an estimate of $H(e) + \frac{1}{N}H(\theta|s)$, where the minimisation is over $(s,\theta)$, is an asymptotically concise estimator, namely one which eventually gives the shortest possible description of the observed data y, and which also estimates correctly the structure s. Furthermore, Rissanen shows that $H(\theta|s)$ is small when the error covariance is highly sensitive to variations in $\theta$.

2.6 Summary

The above survey has, hopefully, filled in the background to the treatment of models which will be developed in this thesis - that is, a treatment based on ideas of computational complexity. Blum's axioms help to distinguish two very different notions of complexity - the complexity required to describe something, and the complexity of interpreting that description.

A detailed review has been given of the theory developed from the notion of Kolmogorov complexity. This complexity is shown to be not effectively computable. The results which have been given support the idea that complexity represents the amount of information required to obtain an entity by any effective procedure. They also indicate that the property of being maximally complex is equivalent to the property of being random. These ideas are the chief support of our approach to modelling. Kolmogorov complexity is asymptotically invariant with respect to a wide class of universal computers, but it will be seen later that this invariance property does not appear sufficiently quickly for the purposes of practical model assessment, and the choice of computer (in fact, of language) constitutes a major problem.

Aspects of grammatical inference have been briefly examined. Although grammatical inference is conceptually very close to the inference of any type of model from observations, the statement of the problem is sufficiently different from what we understand as the modelling problem, to make the nature of possible solutions very different. The use of dynamic complexity measures is appropriate (in fact, essential) for grammatical inference, and this allows an effective procedure to exist for finding the "best" grammar, within a sufficiently restricted class. (Admittedly the procedures attempted so far are based on exhaustive enumeration, and so are not practical, but the problem can at least be solved in principle).

Solomonoff's approach to inductive inference is the

58

progenitor

of the w o r k

Solomonoff

emphasised

confirmation"

and ours

behaviour,

terms.

is that

confirmation"

one - the b e s t The p a p e r it is c l e a r l y generally.

we

one

that

consider

which

That

because

Gaines

refer

inference

relevant

to this

to the a s y m p t o t i c

with

the

situation

"Small-Sample

thesis

situation,

of h a v i n g

suggests

in our

L~fgren

a fixed,

similar

The a p p r o a c h e s a characteristic a more

features,

to our

none

scheme

to i n f e r e n c e

less p l a u s i b l e of them

identification

and these

features

theories

for a s s e s s i n g

models.

surveyed

each of t h e m seems

account

can be a p p l i e d

of the

can

problem.

for a s s e s s i n g

w h i c h w e have

while

considered

case of our

of the m o d e l l i n g

a scheme

in common:

or

any s y s t e m

we

is i n t e r e s t i n g

as a s p e c i a l

that

formulation

suggests

Inference"

and J e f f r e y s

be r e g a r d e d

the same b a s i c

of inference,

for i n d u c t i v e

of W r i n c h

it can a l m o s t

is v e r y

provide

for a

of b e h a v i o u r .

three papers.

Finally,

programs

of

and B l u m has b e e n m e n t i o n e d

it is not v e r y

the h e a d i n g

be d i s c e r n e d

"degree

by only

are c o n c e r n e d

has

Solomonoff's

it to be d e t e r m i n e d

important

we

problem

by all the p o s s i b l e

of

sualmation

considered

whereas

proposals.

the

Solomonoff

all its r e s u l t s

because

required

"degree

between

of B l u m

However,

Under

and this

of

available.

most

sample

measure

However,

difference

because

finite

thesis.

A major

is d e c i d e d

whereas

in this

a quantitative

of a theory,

of u n c o m p u t a b l e work

reported

to

abstract

to a u s e f u l

have

nature

59

range

of p r a c t i c a l

a v i e w of m o d e l l i n g and w h i c h

which

can be d i r e c t l y

encountered

in c o n t r o l

In s e c t i o n which

models.

is b a s e d applied

assumes

a very

a particular

complexity

by an i n v e s t i g a t i o n

Gauss-Markov the d a t a

leads

estimation.

develop

of c o m p l e x i t y , of the type

with

search

to an e x t e n s i o n This w o r k g i v e s

of the c h a r a c t e r i s a t i o n

of the

of R i s s a n e n ,

v i e w of m o d e l l i n g .

explicit

the model.

the

the work

similar

can r e p l a c e

process,

to m o d e l s

probabilistically

Rissanen

associated

on ideas

examined

situation,

entropy

3 we shall

studies.

2.5 w e have

implicitly

By e x a m i n i n g

In C h a p t e r

described

consideration

classical For the

for the

of

Shannon

case of a

shortest

model

of

of m a x i m u m - l i k e l i h o o d strong

support

w h i c h we shall

to the p l a u s i b i l i t y

d e v e l o p below.

3. A CHARACTERISATION OF MODELLING

3.1 Introduction

In this chapter we propose a partial solution to the modelling problem introduced in section 1.3. The solution is partial because it does not provide an algorithm for obtaining a model for a system, but only a criterion for choosing between competing models. However, a consequence of our characterisation of modelling is that such an algorithm cannot exist. Our solution is, therefore, as complete as can be expected. (More complete solutions can be obtained if the class of candidate models is sufficiently restricted).

3.2 Systems

Systems have already been introduced in definition (1.3.1). For each system S there exists a smallest positive integer n, such that multiplying every $u_j^i$, $y_j^i$ appearing in S by n (except for blanks, which are left unchanged) results in an integer system $S_I = (U_I, Y_I)$; that is, a system as defined previously, but now each observation is either an integer or a blank. Each integer system can be identified with a countable equivalence class of systems, each of which is mapped into $S_I$ by the above transformation. We shall consider the integer systems to be the entities which are to be modelled. The prefix "integer" will usually be omitted.

We need to establish that the set of integer systems can be identified with the set of nonnegative integers. However, we do not need to know the details of the correspondence.

Theorem (3.2.1)

There exists an effective bijection between the set of integer systems and the set of nonnegative integers.

Proof. For each $u_j^i$ in an integer system $S_I$, write 0 if $u_j^i = b$, $u_j^i + 1$ if $u_j^i \geq 0$, and $u_j^i$ otherwise. Similarly for each $y_j^i$. $S_I$ has now been replaced by an ordered pair of arrays of integers. Each integer can be mapped into a nonnegative integer by the function $\rho$, where:

$\rho(n) = 2|n|$ if $n \leq 0$, $\quad \rho(n) = 2n-1$ if $n > 0$.

Instead of considering $S_I$, we can now consider $S_I^+$, where

$S_I^+ = (((u_1^1,\ldots,u_{\ell_1}^1),\ldots,(u_1^M,\ldots,u_{\ell_M}^M)),\ ((y_1^1,\ldots,y_{m_1}^1),\ldots,(y_1^N,\ldots,y_{m_N}^N)))$,

and where each $u_j^i$, $y_j^i$ is now a nonnegative integer. Rogers (9) demonstrates the existence of a recursive bijection $\tau: \mathbb{N}\times\mathbb{N} \to \mathbb{N}$, which he calls a pairing function, and he uses it to define recursively a coding of k-tuples of nonnegative integers onto the nonnegative integers:

$\tau^k(n_1,n_2,\ldots,n_k) = \tau(n_1, \tau^{k-1}(n_2,\ldots,n_k))$, with $\tau^1(n) = n$.

If $S_I^+$ is now replaced by an "indexed" version:

$((M,(\ell_1,u_1^1,\ldots,u_{\ell_1}^1),\ldots,(\ell_M,u_1^M,\ldots,u_{\ell_M}^M)),\ (N,(m_1,y_1^1,\ldots,y_{m_1}^1),\ldots,(m_N,y_1^N,\ldots,y_{m_N}^N)))$,

then it can be coded as the single nonnegative integer:

$\tau(\tau^{M+1}(M,\tau^{\ell_1+1}(\ell_1,\ldots,u_{\ell_1}^1),\ldots,\tau^{\ell_M+1}(\ell_M,\ldots,u_{\ell_M}^M)),\ \tau^{N+1}(N,\tau^{m_1+1}(m_1,\ldots,y_{m_1}^1),\ldots,\tau^{m_N+1}(m_N,\ldots,y_{m_N}^N)))$.

Since an "indexed" version of $S_I^+$ has been coded, it is possible to recover $S_I^+$ uniquely from this single integer (since $\tau$ is invertible). Also, it is obviously possible to recover $S_I$ from $S_I^+$. It is now possible to search through the sequence $(0,1,2,\ldots)$ for the sequence of integers $(n_0,n_1,n_2,\ldots)$ from which a valid $S_I^+$ can be recovered. By establishing the correspondence $n_0 \leftrightarrow 0$, $n_1 \leftrightarrow 1$, ..., the required bijection is obtained.
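The coding used in the proof can be sketched concretely. The following Python fragment is illustrative only: it assumes the Cantor pairing function in place of the particular pairing function given by Rogers, represents the blank by `None`, and stops at the coding of the "indexed" version, omitting the final search through $(0,1,2,\ldots)$ that turns the injection into a bijection. All function names are ours.

```python
from math import isqrt

BLANK = None  # stands for the blank observation b


def to_nonnegative(v):
    """Both proof steps combined: blank -> 0, v >= 0 -> 2v + 1, v < 0 -> 2|v|."""
    if v is BLANK:
        return 0
    shifted = v + 1 if v >= 0 else v          # first step of the proof
    return 2 * abs(shifted) if shifted <= 0 else 2 * shifted - 1   # rho


def pair(x, y):
    """Cantor pairing function (an assumed choice of recursive bijection N x N -> N)."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z):
    """Inverse of pair."""
    w = (isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return w - y, y


def encode_tuple(values):
    """tau^k(n1, ..., nk) = tau(n1, tau^(k-1)(n2, ..., nk)), with tau^1(n) = n."""
    if len(values) == 1:
        return values[0]
    return pair(values[0], encode_tuple(values[1:]))


def decode_tuple(code, k):
    """Recover the k-tuple coded by encode_tuple."""
    out = []
    for _ in range(k - 1):
        head, code = unpair(code)
        out.append(head)
    return out + [code]


def encode_sequences(seqs):
    """Code the indexed array (M, tau(l_1, u...), ..., tau(l_M, u...))."""
    blocks = [encode_tuple([len(s)] + [to_nonnegative(v) for v in s]) for s in seqs]
    return encode_tuple([len(seqs)] + blocks)


def encode_system(inputs, outputs):
    """Single nonnegative integer coding the indexed version of an integer system."""
    return pair(encode_sequences(inputs), encode_sequences(outputs))


code_a = encode_system([[1, BLANK]], [[-2]])
code_b = encode_system([[1, 0]], [[-2]])
print(code_a != code_b, decode_tuple(encode_tuple([3, 0, 5]), 3))
```

The lengths are coded ahead of the observations so that decoding knows how many times to unpair, which is the role of the "indexed" version in the proof.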

Theorem (3.2.1) establishes that a nonnegative integer can be uniquely associated with every integer system, and that it therefore makes sense to refer to "the ith (integer) system, $S_i$". But it was pointed out in section 1.3 that every input observation set U (except U=b) and every output observation set can itself be regarded as a system. Consequently a nonnegative integer can be associated with each of these sets. As in section 2.2, we shall frequently not distinguish between a system and the integer associated with it.

3.3 Models

It is tempting to regard a model as an algorithm which operates on the input observation set to produce the output observation set of a system. This would be sufficient for many purposes, in particular for the treatment of deterministic models. However, in some cases it is desirable to allow models to use some of the output observations for computing others. The most common example of this occurs in the use of models for the prediction of system behaviour in a stochastic environment. In this case it would be unreasonable not to allow the model the use of the most recent information on system behaviour. (In the deterministic case this is unnecessary, since the model can determine system behaviour from the initial conditions and the input history).

It is therefore necessary to allow a model to be an effective procedure which operates on some outputs as well as on inputs, in order to compute the output. This capability must be restricted, however, if one is to exclude, for instance, models which compute the output observation set simply by copying it, or those which use future observations to compute previous ones. We accomplish this restriction in a rather roundabout way. Instead of defining models so as to exclude these useless types of algorithms, we define models with reference to certain sets of subsets of the observations. These sets determine which input and output observations may be used for the computation of each output. Restriction to the classes of models which are of interest is then accomplished by suitably delimiting these sets.

We distinguish between abstract models, which are partial recursive functions, and concrete models, which are computer programs. This terminology follows Chaitin (43).

Definition (3.3.1)

Given a system S=(U,Y), let $A = \{A_i\}$ be a set of ordered subsets of U, let $B = \{B_i\}$ be a set of ordered subsets of Y, and let $C = \{C_i\}$ be a set of m disjoint ordered subsets of Y, which is complete in the sense that every $y_j \in Y$ occurs in some $C_i$. The ordering of the elements of $A_i$, $B_i$ and $C_i$ is to be the same as their ordering in U and Y. Let D be a set of ordered pairs $D_i = (A_j, B_k)$, $(i=1,\ldots,m)$, and finally let E be a set of ordered pairs $E_i = (D_i, C_i)$, $(i=1,\ldots,m)$. Then an abstract E-model of the system S=(U,Y) is a partial recursive function $M: \mathbb{N}\times\mathbb{N} \to \mathbb{N}$, such that, for each $E_i \in E$, $(i=1,\ldots,m)$,

$M(i, D_i) = C_i$ .................. (3.1)

This definition is best illuminated by some examples. We use the notation $S = ((u_1,\ldots,u_N),(y_1,\ldots,y_N))$.

Example (3.3.2)

If only non-dynamic models were of interest, we might specify the sets A,B,C,D,E as follows:

$A_i = u_i$, $i=1,\ldots,N$,
$B_1 = \emptyset$ ($\emptyset$ denotes the empty set),
$C_i = y_i$, $i=1,\ldots,N$,
$D_i = (A_i, B_1) = (u_i,\emptyset)$, $i=1,\ldots,N$,
$E_i = (D_i, C_i) = ((u_i,\emptyset), y_i)$, $i=1,\ldots,N$,

so that $E = \{((u_i,\emptyset), y_i) : i=1,\ldots,N\}$. In this case, an abstract E-model of S is a partial recursive function M, such that $M(i,(u_i,\emptyset)) = y_i$ for $i=1,\ldots,N$.

Example (3.3.3)

If we were interested in dynamical, deterministic, finite-memory (say, two-period) models, a suitable specification of the sets might be:

$A = \{\emptyset, (u_1,u_2,u_3), (u_2,u_3,u_4), \ldots, (u_{N-2},u_{N-1},u_N)\}$,
$B = \{\emptyset\}$,
$C_i = y_i$, $i=1,\ldots,N$,
$D_i = (\emptyset,\emptyset)$ for $i=1,2$; $\quad D_i = ((u_{i-2},u_{i-1},u_i),\emptyset)$ for $i=3,\ldots,N$,
$E_i = ((\emptyset,\emptyset), y_i)$ for $i=1,2$; $\quad E_i = (((u_{i-2},u_{i-1},u_i),\emptyset), y_i)$ for $i=3,\ldots,N$.

This time an abstract E-model of S is a partial recursive function M, such that $M(i,(\emptyset,\emptyset)) = y_i$ for $i=1,2$, and $M(i,((u_{i-2},u_{i-1},u_i),\emptyset)) = y_i$ for $i=3,\ldots,N$.

Example (3.3.4)

If we were interested in one-step-ahead predicting models, which were allowed to use all past observations, we could specify the sets as follows: (We take U=b for simplicity in this case)

$A = \{\emptyset\}$,
$B = \{\emptyset, y_1, (y_1,y_2), (y_1,y_2,y_3), \ldots, (y_1,\ldots,y_{N-1})\}$,
$C_i = y_i$, $i=1,\ldots,N$,
$D_i = (\emptyset,\emptyset)$ for $i=1$; $\quad D_i = (\emptyset,(y_1,\ldots,y_{i-1}))$ for $i=2,\ldots,N$,
$E_i = ((\emptyset,\emptyset), y_i)$ for $i=1$; $\quad E_i = ((\emptyset,(y_1,\ldots,y_{i-1})), y_i)$ for $i=2,\ldots,N$.

In this case, an abstract E-model of S is a partial recursive function M, such that $M(1,(\emptyset,\emptyset)) = y_1$ and $M(i,(\emptyset,(y_1,\ldots,y_{i-1}))) = y_i$ for $i=2,\ldots,N$.

M takes two arguments so that the first can act as an index, which "tells" M which block of observations it is operating on. The significance of this can be seen most clearly if we anticipate a little, and consider M to be a computer program. The program may be designed to operate on different blocks in different ways, for example by operating on them with different subroutines. In order to do this, the program must be told which block it is currently operating on. It would be possible to allow M only one argument, but it would then be necessary to regard a model as a set of such partial recursive functions, each of which corresponded to a different computational algorithm.

It is not usual to specify the class of models which is of interest by specifying the sets A,B,C,D,E in the manner of the above examples, because of the risk of inadvertently excluding some models which may be of interest. A more usual method is to specify conditions which these sets must satisfy. For example, if it were desired merely to exclude the possibility of a model "copying" the output observation set, it would be sufficient to consider only those E-models for which E satisfied the condition:

for each $E_i \in E$, $(y_k \in B_j \ \&\ B_j \in D_i) \Rightarrow y_k \notin C_i$.

Here "$y_k \in B_j$" denotes "$B_j = (\ldots,y_k,\ldots)$", and "$B_j \in D_i$" denotes "$D_i = (\ldots,B_j)$". Similarly, if models which use future observations to compute previous ones are of no interest, they can be excluded by imposing suitable conditions on E:

Definition (3.3.5)

A set E, defined as in definition (3.3.1), is nonanticipative if, for each $E_i \in E$:

(i) $\max(p : y_p \in B_j \ \&\ B_j \in D_i) < \min(p : y_p \in C_i)$ and
(ii) $\max(p : u_p \in A_j \ \&\ A_j \in D_i) \leq \min(p : y_p \in C_i)$.

An E-model will be said to be nonanticipative if E is nonanticipative.

We wish to consider models to be computer programs. We formalise a computer (together with a programming language) as a (not necessarily universal) 3-place partial recursive function F, and a program as the first argument of this function.
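The conditions placed on E, namely the exclusion of "copying" and definition (3.3.5), can be checked mechanically once the sets are written down. In the following sketch each $E_i$ is represented by three tuples of observation indices (those of $A_j$, $B_k$ and $C_i$); this representation, the function names, and the convention that a condition over an empty set holds vacuously are our assumptions.

```python
def one_step_ahead_sets(n):
    """E for one-step-ahead prediction from all past outputs, no inputs (Example (3.3.4))."""
    return [((), tuple(range(1, i)), (i,)) for i in range(1, n + 1)]


def two_period_sets(n):
    """E for the finite-memory case: y_i from (u_{i-2}, u_{i-1}, u_i) when i >= 3 (Example (3.3.3))."""
    return [((i - 2, i - 1, i) if i >= 3 else (), (), (i,)) for i in range(1, n + 1)]


def excludes_copying(E):
    """No output index used in B_j may also occur in the C_i being computed."""
    return all(not set(b) & set(c) for _, b, c in E)


def is_nonanticipative(E):
    """Conditions (i) and (ii) of definition (3.3.5); empty A_j or B_j pass vacuously."""
    for a, b, c in E:
        first_output = min(c)
        if b and max(b) >= first_output:   # (i): strict inequality on outputs
            return False
        if a and max(a) > first_output:    # (ii): non-strict inequality on inputs
            return False
    return True


print(is_nonanticipative(one_step_ahead_sets(5)),
      is_nonanticipative(two_period_sets(5)),
      is_nonanticipative([((), (2,), (1,))]))
```

The last set in the usage line uses $y_2$ to compute $y_1$, and so fails condition (i).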

Definition (3.3.6)

A concrete (F,E)-model of the system S=(U,Y) is an integer p, such that

$F(p, i, D_i) = C_i$ .................. (3.2)

for every $E_i \in E$, where $C_i, D_i, E_i, E$ are defined as in definition (3.3.1), and $F: \mathbb{N}\times\mathbb{N}\times\mathbb{N} \to \mathbb{N}$ is a 3-place partial recursive function.

Theorem (3.3.7)

There exists an F, such that to every abstract E-model M of S there corresponds a concrete (F,E)-model p of S, such that

$F(p,x,y) = M(x,y)$ .................. (3.3)

for all $x,y \in \mathbb{N}$.

Proof: Choose F to be any 3-place universal partial recursive function. Then there exists p, such that $F(p,x,y) = M(x,y)$ for all $x,y \in \mathbb{N}$ (Rogers (9), p.22). But $M(i,D_i) = C_i$ for every $E_i \in E$. Therefore $F(p,i,D_i) = C_i$ for every $E_i \in E$. Consequently p is an (F,E)-model of S.

Theorem (3.3.8)

To every concrete (F,E)-model p of S there corresponds an abstract E-model M of S, such that

$M(x,y) = F(p,x,y)$ .................. (3.4)

for all $x,y \in \mathbb{N}$.

Proof: By the s-m-n theorem (Rogers (9), p.23), there exists a 2-place partial recursive function M, such that $M(x,y) = F(p,x,y)$ for all $x,y \in \mathbb{N}$. But $F(p,i,D_i) = C_i$ for every $E_i \in E$. Consequently $M(i,D_i) = C_i$ for every $E_i \in E$, and therefore M is an abstract E-model of S.

Theorems (3.3.7) and (3.3.8) show that abstract and concrete models are essentially equivalent. The function F which appears in definition (3.3.6) can be associated with some computing facility, by invoking Church's Thesis. The nonnegative integers can be used to enumerate the programs for such a facility in some standard manner (cf section 2.2.1). Thus concrete models can be associated with programs.

Note that every system has infinitely many abstract E-models, and that to each of these abstract E-models there will usually correspond infinitely many concrete (F,E)-models (for a fixed F). If F is universal then there will always correspond infinitely many concrete (F,E)-models to each abstract E-model, since the enumeration of the set of programs for F then constitutes a Gödel numbering for the partial recursive functions: it is known that in every Gödel numbering, each partial recursive function has infinitely many Gödel numbers.

The task of modelling is to find an appropriate abstract model for a system. In practice this requires the postulation and examination of concrete models (programs) for it.
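The relation between concrete and abstract models can be illustrated with a toy computing facility. In the sketch below, F is a hypothetical interpreter over a three-entry table of program texts (a real facility would enumerate all programs of a language), the system has outputs (1, 3, 6, 10) with the sets E chosen as in Example (3.3.4), and each $C_i$ is written as a bare integer rather than a one-element sequence. None of these choices comes from the text.

```python
PROGRAMS = [
    "result = 0",                                         # program 0: not a model
    "result = (d[1][-1] if d[1] else 0) + i",             # program 1
    "prev = d[1][-1] if d[1] else 0\nresult = prev + i",  # program 2: same function as 1
]


def F(p, x, y):
    """Toy 3-place facility: run program number p on the arguments (x, y)."""
    env = {"i": x, "d": y}
    exec(PROGRAMS[p], {}, env)
    return env["result"]


# E_i = (D_i, C_i) with D_i = (empty input block, past outputs).
E = [(((), ()), 1), (((), (1,)), 3), (((), (1, 3)), 6), (((), (1, 3, 6)), 10)]


def is_concrete_model(p, E):
    """Definition (3.3.6): F(p, i, D_i) = C_i for every E_i."""
    return all(F(p, i, D) == C for i, (D, C) in enumerate(E, start=1))


def abstract_model(p):
    """Theorem (3.3.8): fixing p yields a 2-place function M(x, y) = F(p, x, y)."""
    return lambda x, y: F(p, x, y)


M = abstract_model(1)
print([is_concrete_model(p, E) for p in range(len(PROGRAMS))], M(3, ((), (1, 3))))
```

Programs 1 and 2 are different texts computing the same function, which is the sense in which several concrete models correspond to one abstract model.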

According to the discussion of section 1.3, the goal of the modelling exercise is a dynamical system in the sense of definition (1.3.3). To reconcile this with the above statement, we point out that a nonanticipative abstract E-model M can be regarded as such a dynamical system, providing that each $C_i$ has the form $C_i = (y_k, y_{k+1}, \ldots, y_{k+n_i})$ (in other words, there are no gaps in the set $C_i$). It suffices to take the time set T to be the set of integers $(0,1,2,\ldots,m)$ with the usual ordering, to take the initial time $\tau = 1$, and to identify time t with the computation of $C_t$ (using the terminology of definitions (1.3.3) and (3.3.1)). The state at time t can be taken to be the input and output history $x(t) = ((u_1,u_2,\ldots,u_j),(y_1,y_2,\ldots,y_k))$, where $j = \max(p : u_p \in A_q \ \&\ A_q \in D_i,\ i \leq t)$ and $k = \max(p : y_p \in B_q \ \&\ B_q \in D_i,\ i \leq t)$. The input at time t is the sequence $\omega(t) = (u_{j+1},\ldots,u_r)$ where $r = \max(p : u_p \in A_q \ \&\ A_q \in D_t)$, and the output is $C_t$. Since E is nonanticipative, all the elements of $D_{t+1}$ appear in $(x(t),\omega(t+1))$, and so $D_{t+1}$ can be "assembled" from $(x(t),\omega(t+1))$. Let $\alpha$ be an algorithm for doing this. Then $\alpha$ and M together determine the state transition function $\varphi$:

$\varphi(t+1;\ t,\ ((u_1,\ldots,u_j),(y_1,\ldots,y_k)),\ \omega(t+1)) = ((u_1,\ldots,u_j,\omega(t+1)),\ (y_1,\ldots,y_k,M(t+1,\alpha(t+1,x(t),\omega(t+1)))))$.

The initial state x(0) is a pair of empty sequences, and the readout map $\eta$ is defined by $\eta(t,((u_1,\ldots,u_\ell),(y_1,\ldots,y_m,C_t))) = C_t$, for $t > 0$.

Conventionally, models are allowed to approximate system behaviour, rather than reproduce it exactly. Definitions (3.3.1) and (3.3.6), however, require models to compute the observed system behaviour exactly. This does not mean that the class of models which we can treat is any smaller than the class of models which are usually of interest. It merely means that whereas a conventional model may reproduce a system behaviour approximately, the corresponding model in our formalism has the additional task of generating the "corrections" which must be applied to the approximate behaviour, in order to produce the exact system behaviour. Fig. 2 shows the correspondence between a type of conventional model commonly encountered in control studies, and a model which satisfies definitions (3.3.1) and (3.3.6).

It will be recalled that theorem (2.2.12) suggests that "random" is equivalent to "can be computed only by using a table look-up". If a model is considered to be a summary of knowledge about a system, then those computations of the model which have to be performed by using a table look-up correspond to those aspects of the system behaviour which are not understood, and cannot be predicted - in fact, those that appear to be random. The role of these computations may be very different from that shown in fig. 2. For example, if they are "corrections", they need not be additive. But more generally, the terms computed by table look-up need not play the role of "corrections". They may, for instance, be parameters, which would conventionally be viewed as "randomly varying".

3.4 Criterion of Quality

The third component of our characterisation of modelling is a criterion of quality of a model. Let F represent a computing facility, together with a programming language. Let c be an injective function from the integers to the set of strings of terminals in the programming language, which is used to represent the integers in programs. (c therefore is included in the definition of the programming language F. The definition of programming languages is reviewed in Appendix A; more details of c are given in Chapter 7). Let S be an integer system as defined in section 3.2, with input and output observations $u_j^i$, $y_j^i$.

Definition (3.4.1)

The trivial F model of S is the shortest program which is a concrete (F,E)-model of S, such that each $c(y_j^i)$ appears in it, where the minimisation of length ranges over all possible sets E (defined by def. (3.3.1)).

It is assumed that the length of a program is measured by the number of terminals appearing in it. The trivial model of a system is one which computes the output observation set by simply reading it out from a table look-up. It is a model which the modeller has available right at the beginning of the modelling exercise, before he has found any structure or pattern in the system behaviour.

For any system S, let the sets $C_i$, $D_i$ be those defined by def. (3.3.1). One can think of the length of a concrete (F,E)-model of S as the "perceived complexity", relative to F, of the set $(C_1,\dots,C_m)$, conditional on the set $((1,D_1),\dots,(m,D_m))$. The greatest lower bound of this "perceived complexity", taken over all concrete (F,E)-models of S, is just the conditional Kolmogorov complexity $K_F((C_1,\dots,C_m) \mid ((1,D_1),\dots,(m,D_m)))$. (Although Kolmogorov complexity was developed for binary sequences and binary programs, it can be readily generalised to sequences and programs containing any finite number of symbols). An approximate upper bound for this Kolmogorov complexity is the length of the trivial model of S.

The length of the trivial F model of S is the "perceived complexity" of $(C_1,\dots,C_m)$ before any structure has been discovered in the system behaviour. If a shorter model of S is found, then its "perceived complexity" will be reduced. Recalling the analogy between complexity and entropy, it is appealing to measure the "perceived quantity of information" in $((1,D_1),\dots,(m,D_m))$ about $(C_1,\dots,C_m)$ as the difference between these two "perceived complexities". Since Kolmogorov complexity is not effectively computable, the only upper bound on this "perceived quantity of information" which is available, in general, is the length of the trivial model. Thus the length of the trivial model is a measure of the amount of information potentially to be conveyed by the modelling exercise.

Definition (3.4.2)

Let p be a concrete (F,E)-model of S, and let t be the trivial F model of S. Then the information gain I(p) of p is the difference

    $I(p) = \ell(t) - \ell(p)$    (3.5)

where $\ell(\cdot)$ denotes the length of a program.

In section 1.1 a simple example was presented, which suggested that the confidence which one has in a model depends on the difference between the number of observations which the model explains and the number of observations required to construct the model. The information gain is a measure of this difference. If the information gain is zero, then all of the output observations have been used to construct the model; the trivial model is, of course, the prime example of such a model.
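The information gain of Definition (3.4.2) can be tried out on toy program texts. The sketch below is only illustrative: it assumes that every non-whitespace character counts as one terminal, and the three "programs" are written in a made-up notation that is not Algol W or any language used in the text.

```python
def program_length(source: str) -> int:
    """Length of a program text, counting non-whitespace characters.

    Assumption: each such character is one terminal of the language.
    """
    return sum(1 for ch in source if not ch.isspace())


def information_gain(trivial: str, model: str) -> int:
    """I(p) = l(t) - l(p), following Definition (3.4.2)."""
    return program_length(trivial) - program_length(model)


# Hypothetical programs in an invented notation, for illustration only.
trivial = "OUT(I) FROM 7,7,7,7,7,7,7,7,7,7,7,7"                # pure table look-up
constant = "OUT(I) IS 7"                                        # exploits the regularity
padded = "OUT(I) FROM 7,7,7,7,7,7,7,7,7,7,7,7 PLUS 0,0,0"       # superfluous parameters

print(information_gain(trivial, trivial))    # zero gain: nothing explained
print(information_gain(trivial, constant))   # positive gain
print(information_gain(trivial, padded))     # negative gain
```

The third case shows how a model carrying more arbitrary elements than the trivial model ends up with a negative gain.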

If the information gain is close to its upper bound $\ell(t)$, then the model is simple enough to be constructed from only a small part of the output observation set, and the remainder of this set is "explained" by the model. The information gain may be negative, of course. This indicates that the model contains more arbitrary "parameters" than is justified by the amount of information contained in the output observation set.

We are claiming, then, that the confidence that we may have in some model of a system increases as its information gain increases. We assume, of course, that all our knowledge about the system is contained in the observation sets, and in the definition of the programming language (Chapter 4 deals with the latter aspect). This implies that the confidence in a model of a system is bounded by the size of the trivial model of that system. This accords well with the intuitive notion that if we have only a few observations of a system, then we can never have much confidence in any model of it which

may be postulated. We embody our claim in the following axiom, on the basis of which a choice may be made between competing models.

Axiom (3.4.3)

If S is a system, and $E_1$ and $E_2$ are sets such that $E_1$-models of S and $E_2$-models of S are of interest, and p and q are models, with p being an $(F,E_1)$-model of S and q being an $(F,E_2)$-model of S, then the one of p and q which has the higher information gain is to be chosen as the better model of S.

This axiom implies that good models are small models. Good models will therefore tend to use the same (short) computational algorithm for as many computations as possible, since the specification of every new algorithm increases the size of the model. Thus the above axiom provides a specific link between the widely-held belief that simplicity (as measured by smallness) is desirable in scientific hypotheses, and the almost universal conviction that the more an observed regularity has been repeated, the more likely it is to recur.

The following theorem is a crucial feature of our characterisation of modelling. As before, $\ell(p)$ denotes the length of program p, measured by the number of terminal characters which appear in it.

Theorem (3.4.4)

There is, in general, no effective procedure for finding an (F,E)-model p of a system S, such that, for any other (F,E)-model q of S, $\ell(p) \le \ell(q)$.

Proof. Suppose that such an effective procedure exists. Consider the case $E = \{E_1\} = \{(\emptyset, Y)\}$ (where S=(U,Y)). Then we have an effective procedure for finding the shortest program which computes Y, using only the set {1}. Now suppose that the programming language F has only two terminal characters. Then there exists an effective procedure for finding the shortest binary program which computes Y, using {1}. But Y can be associated uniquely with a binary sequence: Y is a system, and can therefore be associated with its index in some fixed enumeration of systems. This index can be associated with a binary sequence by the bijection introduced in section 2.2.1. Since each of the above steps is effective, there exists an effective procedure for finding the shortest binary program which computes the binary sequence associated with Y, and hence there exists an effective procedure for finding its length, namely the Kolmogorov complexity $K_F(Y \mid 1)$. Suppose F is optimal. Then, by Church's Thesis, $K(Y \mid 1)$ is partial recursive, which contradicts theorem (2.2.5). This proves the theorem.

Theorem (3.4.4) does not rely on F having only two terminals or on E having the form indicated in the proof. These assumptions are made in order to derive a contradiction to theorem (2.2.5). However, as mentioned earlier, this theorem can be generalised to the case where the sequences considered have an arbitrary finite number of symbols, and to cover the uncomputability of conditional complexity. On the other hand, the theorem does rely on F being optimal. A sufficiently restricted programming language may allow the shortest model to be found, if it exists. However, most systems will not possess any models in such a language (the simplest example is a programming language which always computes the same thing, whatever program it may be given).

Theorem (3.4.4) implies that there is no algorithm for finding the model of a system which has the highest information gain. So, according to our axiom, there is no algorithm for finding the best model of a system. Consequently the modelling exercise cannot proceed according to some

"universal modelling algorithm", but must involve a process of nonalgorithmic (creative?) postulation of hypotheses, followed by the assessment of these hypotheses, according to our axiom.

Note that the result of Meyer (28), which is mentioned in section (2.2.4), implies that if a model is found, there is still no algorithmic procedure for finding the shortest implementation of that particular model.

3.5 Compatibility with Conventional Criteria

Most models of data which has not been artificially generated can be expected to contain at least one table look-up, since it is most unlikely that every aspect of such data can be explained. The size of a table look-up, as measured by the number of characters required to program it, can be regarded as a very general measure of prediction error. The bigger the table look-up is, the more of the data have not been explained by the model. It will usually be possible to explain more features of the observed system behaviour - that is, to reduce the size of any table look-ups in the model - only by using more elaborate algorithms in the rest of the model - that is, by increasing the size of the rest of the model. Thus the use of smallness of the program as a criterion of quality of a model leads to a trade-off between the complexity of that part of the model which would conventionally be regarded as the model (cf. fig. 2), and the degree of approximation provided by that part of the model to the observed data. The use of this criterion therefore provides a safeguard against "overfitting" the model to the data.

If the output observation set is large enough, relative to the size of that part of the model which is not a table look-up, then the size of the table look-up(s) will dominate the size of the model. In this case, if two such models are being compared, the use of the proposed criterion of quality leads to the selection of the model with the smaller table look-up(s). This corresponds to the conventional preference for small fitting errors, if the number of observations is large enough for the danger of overfitting to be dismissed.

The definition of the programming language used determines the details of the trade-off implicit in the use of the smallness criterion. The manner in which table look-up elements are coded, which constitutes part of this definition, is considered in Chapter 7.

A serious reservation must be made about the use of the proposed criterion. If sufficient a priori knowledge about the system is available, then this may indicate that a model which is not the smallest should be preferred. A typical example of this is the situation in conventional system identification where a priori knowledge indicates that a particular parametric model structure is appropriate. It may happen that a model with a more elaborate structure, when written as a program, has a smaller overall size; nevertheless, the a priori knowledge will prevent that model being chosen as the better.

Another example is parameter estimation of a linear dynamical process whose output is corrupted by noise. In this case a straightforward minimisation of the equation error usually leads to biased estimates (Eykhoff, (44)). So if two models are being compared whose table look-ups contain the equation errors, it is possible that the larger one will be preferred on probabilistic grounds. Once again, a priori knowledge (about the noise) is required if the smallness criterion is to be overridden. Furthermore, the smallness criterion could still be used to decide between the larger of these two models and a third model belonging to a different class.

As indicated in section 1.1, the proposed criterion is intended for use in situations where little a priori information is available, or in situations where it is too difficult to use such a priori knowledge for model assessment.

The smallness-of-model criterion leads to the same choice of model as do statistical considerations, for a very important class of system behaviours and models of them. If the system behaviour is a stationary random process with rational spectral density function, then it is known how to predict, at any time, its future behaviour, so as to minimise the mean-square prediction error. The method, due essentially to Wiener and Kolmogorov, is to make the prediction for any future time a suitable linear function of past observations of the behaviour (45), (46). If the observation intervals are equally spaced, and a prediction is being made at each instant of the system behaviour at the next observation instant, then the prediction errors are equal to the random, uncorrelated disturbances which are imagined to be acting on the system.

Suppose it is desired to build a concrete (F,E)-model of the system which will give useful one-step-ahead predictions. Any E can be chosen which allows the model to use previous observations to compute predictions (cf. example (3.3.4)). The model will have to generate terms corresponding to prediction errors by means of a table look-up. If the programming language used codes table look-up terms in such a way that length of code is nondecreasing with the magnitude of the term (cf. Chapter 7), then, for a sufficiently long sequence of observations, the model with smallest (in magnitude) prediction errors will be the smallest (in length) model.

But it is known that, for the system under consideration, the smallest mean square prediction error is obtained by the use of the Wiener-Kolmogorov theory. Furthermore, Sherman (47) has shown that, if the process is Gaussian, then the same linear predictor is obtained if the expectation of any even nondecreasing function of the prediction error is minimised. So, under these conditions, for a long enough sequence of observations, the "expected best model", judged according to axiom (3.4.3), is the Wiener-Kolmogorov model. The terms appearing in the table look-up of this model constitute a random, uncorrelated sequence. Theorem (2.2.12) therefore suggests that these terms could not be generated by any more efficient algorithm than a table look-up.
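The link between the size of a table look-up and the size of the prediction errors can be illustrated with a small sketch. The coding rule used here (one sign character plus the decimal digits of the magnitude) is an assumption chosen only because it is nondecreasing in magnitude; it is not the coding of Chapter 7. The observation sequence and the two predictors are likewise hypothetical.

```python
def code_length(term: int) -> int:
    """Characters needed for one table entry: a sign plus decimal digits.

    Assumed coding; its length never decreases as |term| grows.
    """
    return 1 + len(str(abs(term)))


def table_length(errors) -> int:
    """Total size of a table look-up holding the given prediction errors."""
    return sum(code_length(e) for e in errors)


def sum_squares(errors) -> int:
    return sum(e * e for e in errors)


# Hypothetical integer observations (not data from the text).
observations = [100, 103, 107, 112, 118, 125, 133, 142, 152, 163]

# Predictor A: the next value equals the previous observation.
errors_previous = [observations[k] - observations[k - 1]
                   for k in range(1, len(observations))]

# Predictor B: the next value is always the fixed constant 125.
errors_constant = [observations[k] - 125 for k in range(1, len(observations))]

print(sum_squares(errors_previous), table_length(errors_previous))
print(sum_squares(errors_constant), table_length(errors_constant))
```

On this short sequence the predictor with the smaller squared errors also needs the shorter table. The text's claim is about sufficiently long sequences, so a ten-point example should be read as an illustration rather than a demonstration.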

3.6 Prediction

If the best model that has been found up to some time is a concrete (F,E)-model p, and it is desired to find the system behaviour under some new conditions (possibly not yet observed), which can be represented by a block of "virtual observations", $D_{m+1}$, that is, the observations which would be observed if the new conditions obtained, then the model p, and the computer F, can be used to find the "prediction" $F(p, m+1, D_{m+1})$. This provides a means of computing the values of a possible input/output function of the system on elements of its domain which have not been previously observed. According to our axiom, these values are the best "predictions" available to us. We have put the word "prediction" in quotes, because the value $F(p, m+1, D_{m+1})$ need not represent a future value (for example, if the model runs backwards through the observation interval).

It is possible, of course, that the value $F(p, m+1, D_{m+1})$ is not defined. In this case, it may not be possible to use p for predicting system behaviour under the conditions $D_{m+1}$. However, for some models, the value $F(p, m+1, D_{m+1})$ may be undefined simply because p includes the generation of certain parameters by means of a table look-up, and the table does not contain an element which is to be used for the (m+1)th computation. In this case, prediction is still possible if such an element is supplied to the model. The problem is, what value should that element take? Our solution is to propose a second axiom, which may be thought of as an extension of the previous one.

Axiom for Prediction

If elements are to be supplied to table look-ups of a model, in order to allow that model to compute a prediction, then the best prediction will be obtained if these elements are chosen so as to minimise the resulting increase in size of the model.

The use of this axiom must be qualified, of course, by the value of the information gain. We can supply an element to a trivial model, thus enabling it to predict; but we have no confidence in that prediction, whatever the element may be.

A rough justification of the axiom runs as follows. The basic assumption of scientific prediction is that any regularities which have been observed in the behaviour of the system during a sequence of observations will continue to be present in the prediction interval. Hence we should choose the elements for the table look-up to be such that previously detected regularities are present in the computed prediction. But, if the model is such that, in order to compute an output which is consistent with such regularities, it requires a large amount of code to appear in a table look-up, then we can certainly obtain a better model by using the "fixed" part of the model (i.e. the part that is common to all the computations) to compute the regularities.

This is true because for a sufficiently large set of observations, the size of the model will be governed by the average length of code appearing as elements of the table look-up. Thus the axiom is reasonable if it is assumed that it is applied to the best available model.

The above argument can be illustrated by the following example. Suppose a system is defined by the observations:

    $S = (U,Y) = (\emptyset, (552,553,546,551,549,544,547,554,557,551))$.

If the programming language and computing facility F is taken to be Algol W, as implemented on the IBM 370/165 installation at Cambridge, and $E_i = (\emptyset, y_i)$, $i = 1,\dots,10$ (so that $E_3 = (\emptyset, 546)$, for example), then a trivial (F,E)-model of S is:

    BEGIN INTEGER I,J; INTEGER ARRAY Y(1::10);
    FOR J:=1 UNTIL 10 DO READ (Y(J));
    READ (I);
    WRITE (Y(I));
    END.
    552,553,546,551,549,544,547,554,557,551,

When presented with an integer i ($1 \le i \le 10$), this program computes $y_i$ by looking it up in the array Y. We know that this model is useless for prediction, because it is a trivial model. Nevertheless, we can make it compute a "prediction". We must first supply it with a new entry in its table look-up. To do this, we replace the integer 10 in line 2 by the integer 11, and add a new number

at the end of the program. When presented with the integer 11, this program will output the new number. What should this number be? According to our Axiom for Prediction, it should be one of the integers 0,...,9. The prediction of $y_{11}$ will then be that integer. Clearly this prediction is a very bad one. A prediction close to 550 would "obviously" be better. In other words, we can see that a prediction obtained by not obeying the Axiom for Prediction should be better than one obtained by obeying it. But why can we see this? Because we have detected a regularity in the system behaviour - namely, that it tends to remain close to 550. But in that case we can use our knowledge of this regularity to build a better model. One way of doing this is to observe that the mean of the behaviour remains close to 550, and therefore to build a model which is of the conventional "mean plus random error" type:

    BEGIN INTEGER I,J; INTEGER ARRAY E(1::10);
    FOR J:=1 UNTIL 10 DO READ (E(J));
    READ (I);
    WRITE (550+E(I));
    END.
    2,3,-4,1,-1,-6,-3,4,7,1,

This model computes the system behaviour Y by computing the observed regularity (550), and correcting it by a term obtained from a table look-up. Admittedly, this model is only slightly better than the trivial model (its information gain is 12 terminals), but it would rapidly become decisively superior if more observations became available, which remained close to 550.

In this case, if we apply our Axiom for Prediction, we obtain as the predicted next output an integer lying between 550 and 559. This time, the prediction is quite reasonable. Clearly, several similar models can be built for these data, and for each of them there is a considerable range of "best" predictions. It may be possible to reduce this range, for example by estimating the probability distribution of the table look-up terms.
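The two Algol W programs of this example, and the effect of the Axiom for Prediction on each, can be checked with a short sketch. It assumes that every non-whitespace character is one terminal, that a new table entry costs its decimal text, and that candidate entries are drawn from a hypothetical range of integers; the function names are illustrative only.

```python
TRIVIAL = """BEGIN INTEGER I,J; INTEGER ARRAY Y(1::10);
FOR J:=1 UNTIL 10 DO READ (Y(J));
READ (I); WRITE (Y(I));
END.
552,553,546,551,549,544,547,554,557,551,"""

MEAN_PLUS_ERROR = """BEGIN INTEGER I,J; INTEGER ARRAY E(1::10);
FOR J:=1 UNTIL 10 DO READ (E(J));
READ (I); WRITE (550+E(I));
END.
2,3,-4,1,-1,-6,-3,4,7,1,"""


def program_length(source: str) -> int:
    """Count non-whitespace characters (assumed to be the terminals)."""
    return sum(1 for ch in source if not ch.isspace())


def best_entries(candidates):
    """Entries whose text adds the fewest characters to the table.

    Changing the array bound from 10 to 11 costs the same for every
    candidate, so it does not affect the choice and is ignored here.
    """
    candidates = list(candidates)
    shortest = min(len(str(e)) for e in candidates)
    return [e for e in candidates if len(str(e)) == shortest]


gain = program_length(TRIVIAL) - program_length(MEAN_PLUS_ERROR)

# Hypothetical candidate range for the new table entry.
entries = best_entries(range(-99, 100))
trivial_predictions = entries                      # WRITE (Y(I))
mean_model_predictions = [550 + e for e in entries]  # WRITE (550+E(I))

print(gain)
print(trivial_predictions)
print(mean_model_predictions)
```

Under these assumptions the gain comes out as the 12 terminals quoted in the text, and the same rule that yields 0,...,9 for the trivial model yields 550,...,559 for the mean-plus-error model.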

3.7 An Example

3.7.1 Introduction

In this section an example will be presented, which will portray a particular modelling exercise in terms of the above characterisation. The example which was used is the modelling of the gas furnace data considered by Box and Jenkins (45). The data (which is given by Box and Jenkins as Series J) consists of 296 pairs of input-output observations. The input observations are of gas flow rate into a furnace, and the output observations are of the concentration of carbon dioxide in the outlet gases. The observations were made at equal intervals of nine seconds. Box and Jenkins obtain a model which consists of a deterministic transfer function relating the input gas flow rate to the output concentration of carbon

dioxide, and a model of the noise process which disturbs the deterministic relationship. The model they obtain is:

$$\hat{y}_t = -\frac{0.53 + 0.37B + 0.51B^2}{1 - 0.57B - 0.01B^2}\, u_{t-3} \qquad (3.6)$$

$$n_t = \frac{1}{1 - 0.53B + 0.63B^2}\, w_t \qquad (3.7)$$

$$\tilde{y}_t = \hat{y}_t + n_t \qquad (3.8)$$

Here $u_t$ and $\tilde{y}_t$ represent the input and output variables, respectively, after removal of their mean values, at sampling instant t. $\hat{y}_t$ represents the estimate of $\tilde{y}_t$ generated by the transfer function of eqn. (3.6), and $n_t$ is the error between $\tilde{y}_t$ and $\hat{y}_t$. Using conventional system identification terminology, $n_t$ is an "output error" ("error in variables" in the conventional terminology of econometrics (Johnston (48))). $n_t$ represents a stochastic disturbance acting on the output of the process at time t, and $w_t$ is a "discrete white-noise" process (i.e. a zero-mean, serially uncorrelated random sequence), which is considered to cause the disturbance $n_t$ according to relationship (3.7). B is the backward shift operator, defined by $B x_t = x_{t-1}$. The model can be given a diagrammatic representation, as in fig. 3, where $\bar{u}$, $\bar{y}$ denote the mean values of the input and output variables, respectively.
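Since the transfer function (3.6) is later used as a recursion inside a program, it may help to write it out as a difference equation. The short derivation below takes (3.6) with the negative sign that appears in the Algol W program of section 3.7.5, and uses only the definition $B x_t = x_{t-1}$.

```latex
\begin{align*}
(1 - 0.57B - 0.01B^2)\,\hat{y}_t
  &= -(0.53 + 0.37B + 0.51B^2)\,u_{t-3} \\
\hat{y}_t - 0.57\,\hat{y}_{t-1} - 0.01\,\hat{y}_{t-2}
  &= -0.53\,u_{t-3} - 0.37\,u_{t-4} - 0.51\,u_{t-5} \\
\hat{y}_t
  &= 0.57\,\hat{y}_{t-1} + 0.01\,\hat{y}_{t-2}
     - 0.53\,u_{t-3} - 0.37\,u_{t-4} - 0.51\,u_{t-5}
\end{align*}
```

The last line reaches back five samples in the input and two in the estimate, so five initial values are needed before the recursion can start.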

3.7.2 The System

In terms of definition (1.3.1), the system which we are considering is S=(U,Y) where

    $U = (u_1, \dots, u_{296})$,
    $Y = (y_1, \dots, y_{296})$,
    $\ell_i = m_i = 1$, for $i = 1, \dots, 296$,

and the observations $\{u_i, y_i\}$ are listed in Appendix C. As in the example of section 3.6, we shall take the programming language F to be Algol W, as implemented on the IBM 370/165 installation at Cambridge.

3.7.3 Model I - The Trivial Model

We must define the sets A,B,C,D,E, which occur in definition 3.3.1. For the trivial model, we can take these to be:

    $A = \{A_i : i = 1,\dots,296\}$,  $A_i = \emptyset$
    $B = \{B_i : i = 1,\dots,296\}$,  $B_i = \emptyset$
    $C = \{C_i : i = 1,\dots,296\}$,  $C_i = y_i$
    $D = \{D_i : i = 1,\dots,296\}$,  $D_i = (A_i,B_i) = \emptyset$
    $E = \{E_i : i = 1,\dots,296\}$,  $E_i = (D_i,C_i) = (\emptyset, y_i)$

A concrete (F,E)-model, which is a trivial model of the system S, is:

    BEGIN INTEGER I,J; REAL ARRAY Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (Y(J));
    READ (I);
    WRITE (Y(I));
    END.
    53.8 53.6 53.5 ... 57.0

The last line of the trivial model is the table look-up, which contains the output observations. The trivial model can be represented diagrammatically, as in fig. 4(a).

3.7.4 Model II - The Mean

Probably the first nontrivial model to be hypothesised for many systems is that the system behaviour has a constant mean value. This model is of the type which reproduces regularities only in the output observations, and does not exploit any information in the input observations. Consequently, the sets A,B,C,D,E may be taken to be the same as for the trivial model. The mean value of the output observations is 53.5. The following is an (F,E)-model of S which makes use of this fact:

    BEGIN INTEGER I,J; REAL ARRAY Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (Y(J));
    READ (I);
    WRITE (53.5 + Y(I));
    END.
    .3 .1 0 0 -.1 ... 3.8 3.5

The table look-up of this model is listed in the column headed $\tilde{y}_i$ in Appendix C. Fig. 4(b) shows a diagrammatic representation of this model. The dashed line represents the boundary of the model.

3.7.5 Model III - Deterministic Transfer Function

We now assume that the transfer function of equation (3.6) has been hypothesised as the relationship between the input and output of the system. However, we make the restriction that the model may not assume knowledge of past output observations (other than the initial conditions), but may assume knowledge of all the past and present input information. We have to choose new sets A,...,E:

    $A = \{A_i : i = 1,\dots,296\}$,  $A_i = (u_1,u_2,\dots,u_i)$ for $i \ge 6$,  $A_i = \emptyset$ for $i < 6$
    $B = \{B_i : i = 1,\dots,296\}$,  $B_i = \emptyset$
    $C = \{C_i : i = 1,\dots,296\}$,  $C_i = y_i$
    $D = \{D_i : i = 1,\dots,296\}$,  $D_i = (A_i,B_i) = (u_1,u_2,\dots,u_i)$
    $E = \{E_i : i = 1,\dots,296\}$,  $E_i = ((u_1,u_2,\dots,u_i), y_i)$.

An (F,E)-model, which uses the hypothesis that the system behaviour is governed by equation (3.6), is:

    BEGIN INTEGER I,J; REAL ARRAY N,U,Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (N(J));
    READ (I);
    IF I<6 THEN WRITE (N(I)) ELSE
    BEGIN
      FOR J:=1 UNTIL 5 DO
      BEGIN Y(J) := N(J)-53.5; READON (U(J));
      END;
      FOR J:=6 UNTIL I DO
      BEGIN READON (U(J));
        Y(J) := -.53*U(J-3)-.37*U(J-4)-.51*U(J-5)+.57*Y(J-1)+.01*Y(J-2);
      END;
      WRITE (Y(I)+53.4+N(I));
    END;
    END.
    53.8 53.6 53.5 53.5 53.4 -.2 -.4 ... 4.1 4.0

The table look-up of this model is listed in Appendix C in the column headed $n_i$. Note that the first five terms of the table are simply the output observations $y_1,\dots,y_5$. The reason for this is that equation (3.6) relates the value of $y_i$ to the values of $u_{i-3}$, $u_{i-4}$, $u_{i-5}$. Hence this equation cannot be used for generating the first five terms of the observed behaviour. So the first five terms are generated by table look-up. Fig. 4(c) shows a representation of this model. The sequence $\{n_i\}$ is the same as the sequence $\{n_t\}$ defined by equation (3.8).

Note that fig. 4(c) shows the subtraction of the mean of the input observations before these observations are submitted to the transfer function algorithm. In fact, the model achieves this more economically by having 53.4 instead of 53.5 in its output statement.

3.7.6 Model IV - Deterministic Transfer Function, Using Output Observations.

Now suppose that the same transfer function is hypothesised as governing the behaviour of the system, but that the model is now allowed to use all past output observations, as well as past and present input observations. Because of the nature of the transfer function, suitable sets A,...,E are:

    $A = \{A_i : i=1,\ldots,296\}$,  $A_i = \emptyset$ for $i < 6$,  $A_i = (u_{i-5},u_{i-4},u_{i-3})$ for $i \ge 6$,
    $B = \{B_i : i=1,\ldots,296\}$,  $B_i = \emptyset$ for $i < 6$,  $B_i = (y_{i-1},y_{i-2})$ for $i \ge 6$,
    $C = \{C_i : i=1,\ldots,296\}$,  $C_i = y_i$,
    $D = \{D_i : i=1,\ldots,296\}$,  $D_i = (A_i,B_i) = \emptyset$ for $i < 6$,
                                      $D_i = ((u_{i-5},u_{i-4},u_{i-3}),(y_{i-1},y_{i-2}))$ for $i \ge 6$,
    $E = \{E_i : i=1,\ldots,296\}$,  $E_i = (D_i,C_i) = (\emptyset,y_i)$ for $i < 6$,
                                      $E_i = (((u_{i-5},u_{i-4},u_{i-3}),(y_{i-1},y_{i-2})),y_i)$ for $i \ge 6$.

A suitable (F,E)-model is:

    BEGIN INTEGER I,J; REAL ARRAY E(1::296); REAL U,V,W,Y,Z;
    FOR J:=1 UNTIL 296 DO READON (E(J));
    READ (I);
    IF I<6 THEN WRITE (E(I)) ELSE
    BEGIN READON (U,V,W,Y,Z);
      WRITE (-.53*U-.37*V-.51*W+.57*Y+.01*Z+22.4+E(I));
    END;
    END.
    53.8 53.6 53.5 53.5 53.4 -.2 -.3 ... 1.7 1.6

The table look-up for this model is listed as column $e_i$ in Appendix C. The data items required by this model are: $U=u_{i-3}$, $V=u_{i-4}$, $W=u_{i-5}$, $Y=y_{i-1}$, $Z=y_{i-2}$. The term "22.4", which appears in the output statement, corrects for the non-zero means of both input and output observations. A diagrammatic representation of this model is shown in fig. 4(d).

Note that the elements $e_i$ in the table look-up are not the same as the terms $n_t$ which appear in equation (3.8). The reason for this is, of course, that the output of the transfer function part of the model is no longer $\hat{y}_i$, but is a new quantity $y_i^*$. $\hat{y}_i$ is given by

    $\hat{y}_i = -0.53u_{i-3} - 0.37u_{i-4} - 0.51u_{i-5} + 0.57\hat{y}_{i-1} + 0.01\hat{y}_{i-2}$    (3.9)

whereas $y_i^*$ is given by

    $y_i^* = -0.53u_{i-3} - 0.37u_{i-4} - 0.51u_{i-5} + 0.57y_{i-1} + 0.01y_{i-2}$    (3.10)

Since $y_i = \hat{y}_i + n_i$, we have

    $\hat{y}_i = y_i^* - 0.57n_{i-1} - 0.01n_{i-2}$    (3.11)

In general,

    $y_i^* = \hat{y}_i + (1 - A(B))n_i$    (3.12)

if the scalar term of $A(B)$ is 1, where $A(B)$ is defined as in fig. 3. Eqns (3.8) and (3.12), together with

    $e_i = y_i - y_i^*$    (3.13)

lead to

    $e_i = A(B)n_i$    (3.14)

which is a well-known result (44). $e_i$ is in fact the "equation error" for the model $y = \frac{B}{A}u$.

In view of these differences, model IV is not of the form which was assumed by Box and Jenkins, and for which the coefficients were estimated. This does not prevent us from using the model as a one-step-ahead predictor which uses the most recent output observations, measuring its information gain, and thus obtaining some assessment of its value. This underlines an important feature of our characterisation of modelling. Theorem (3.4.4) implies that, for the very general class of models which we consider (those of definition (3.3.1)), there exists a fundamental dichotomy between finding a model and assessing it. Thus the process by which a particular model was arrived at is quite irrelevant to its assessment by the use of information gain.
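The relation between the two error sequences in eqns (3.9)-(3.14) can be checked numerically. The following is a minimal sketch in Python rather than Algol W; the input and noise sequences are synthetic, and the function name and zero initial conditions are assumptions made only for this illustration.

```python
import random

def simulate(u, n):
    """Return (y, e): observations y = yhat + n and equation errors e = y - ystar."""
    N = len(u)
    yhat = [0.0] * N                      # transfer-function output, eq. (3.9)
    for i in range(5, N):
        yhat[i] = (-0.53 * u[i - 3] - 0.37 * u[i - 4] - 0.51 * u[i - 5]
                   + 0.57 * yhat[i - 1] + 0.01 * yhat[i - 2])
    y = [yh + ni for yh, ni in zip(yhat, n)]
    e = [0.0] * N
    for i in range(5, N):
        # one-step prediction from past *observed* outputs, eq. (3.10)
        ystar = (-0.53 * u[i - 3] - 0.37 * u[i - 4] - 0.51 * u[i - 5]
                 + 0.57 * y[i - 1] + 0.01 * y[i - 2])
        e[i] = y[i] - ystar               # eq. (3.13)
    return y, e

random.seed(0)                            # synthetic data, not the gas-furnace record
u = [random.uniform(-1.0, 1.0) for _ in range(50)]
n = [random.uniform(-0.5, 0.5) for _ in range(50)]
y, e = simulate(u, n)

# eq. (3.14) with A(B) = 1 - 0.57B - 0.01B^2
worst = max(abs(e[i] - (n[i] - 0.57 * n[i - 1] - 0.01 * n[i - 2]))
            for i in range(5, 50))
print(worst)
```

The printed discrepancy should be at rounding-error level, which is what eqn (3.14) predicts for any choice of input and noise.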

3.7.7 Model V - Stochastic Process

We now consider a refined version of model IV, which uses eqn (3.7) in an attempt to predict the terms $e_i$. From a statistical point of view this is a nonsense, because the coefficients appearing in eqn (3.7) were estimated for the process $n_i$, and eqn (3.14) shows that the processes $e_i$ and $n_i$ have quite different spectral characteristics. Once again, however, we are free to assess the value of the model for one-step-ahead prediction, regardless of how it was obtained.

In this case the model must be allowed access to past output observations, since it would otherwise be unable to exploit equation (3.7). However, the sets A,...,E must be slightly different from those defined for model IV:

    $A = \{A_i : i=1,\ldots,296\}$,  $A_i = \emptyset$ for $i < 8$,  $A_i = (u_{i-7},\ldots,u_{i-1})$ for $i \ge 8$,
    $B = \{B_i : i=1,\ldots,296\}$,  $B_i = \emptyset$ for $i < 8$,  $B_i = (y_{i-7},\ldots,y_{i-1})$ for $i \ge 8$,
    $C = \{C_i : i=1,\ldots,296\}$,  $C_i = y_i$,
    $D = \{D_i : i=1,\ldots,296\}$,  $D_i = (A_i,B_i) = \emptyset$ for $i < 8$,
                                      $D_i = ((u_{i-7},\ldots,u_{i-1}),(y_{i-7},\ldots,y_{i-1}))$ for $i \ge 8$,
    $E = \{E_i : i=1,\ldots,296\}$,  $E_i = (D_i,C_i) = (\emptyset,y_i)$ for $i < 8$,
                                      $E_i = (((u_{i-7},\ldots,u_{i-1}),(y_{i-7},\ldots,y_{i-1})),y_i)$ for $i \ge 8$.

A model which exploits equations (3.6) and (3.7) could be built for smaller sets $A_i$ and $B_i$ than those defined above. However, the model would then have to be slightly larger. Since real interest lies not in building an (F,E)-model for a particular set E, but rather for any nonanticipative E (cf. definition (3.3.5)), it is sensible to define the sets A,...,E in a way that allows the smallest (F,E)-model to be built, providing that E remains nonanticipative. In this case the (F,E)-model is:

    BEGIN INTEGER I,J; REAL ARRAY A,U,Y,Z(1::296);
    FOR J:=1 UNTIL 296 DO READON (A(J));
    READ (I);
    IF I<8 THEN WRITE (A(I)) ELSE
    BEGIN
      FOR J:=1 UNTIL 7 DO READON (U(I-J),Y(I-J));
      FOR J:=(I-2) UNTIL I DO
        Z(J):=-.53*U(J-3)-.37*U(J-4)-.51*U(J-5)+.57*Y(J-1)+.01*Y(J-2);
      WRITE (Z(I)+1.53*(Y(I-1)-Z(I-1))-.63*(Y(I-2)-Z(I-2))+A(I)+2.2);
    END;
    END.
    53.8 53.6 53.5 53.5 53.4 53.1 52.7 0 .1 .... .2

The table look-up for this model is listed as column $a_i$ in Appendix C. All the required adjustments to the means of the observations are accomplished by the single term "2.2" in the output statement. A representation of the model is shown in fig. 4(e).

The form of the model above shows clearly how the output observations are computed, but is slightly larger than necessary. A shorter version which performs the same computation is:

    BEGIN INTEGER I,J; REAL ARRAY A,U,Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (A(J));
    READ (I);
    IF I<8 THEN WRITE (A(I)) ELSE
    BEGIN FOR J:=1 UNTIL 7 DO READON (U(I-J),Y(I-J));
      WRITE (2.1*Y(I-1)-1.5*Y(I-2)+.34*Y(I-3)+.01*Y(I-4)
             -.53*U(I-3)+.44*U(I-4)-.28*U(I-5)+.55*U(I-6)-.32*U(I-7)
             +A(I)+2.2);
    END;
    END.
    53.8 53.6 ... .2

The table look-up for this version is of course the same as for the previous one.

3.7.8 Model VI - Box & Jenkins Model

Finally, we consider the model which uses equations (3.6), (3.7) and (3.8) in the manner intended by Box and Jenkins (45). The sets A,...,E can remain the same as for Model V. The model, which can be compared with the forecast function given on p.407 of (45), is:

    BEGIN INTEGER I,J; REAL ARRAY W,U,Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (W(J));
    READ (I);
    IF I<8 THEN WRITE (W(I)) ELSE
    BEGIN FOR J:=1 UNTIL 7 DO READON (U(I-J),Y(I-J));
      WRITE (2.1*Y(I-1)-1.5*Y(I-2)+.34*Y(I-3)+.01*Y(I-4)
             -.53*U(I-3)+.44*U(I-4)-.28*U(I-5)+.55*U(I-6)-.32*U(I-7)
             +W(I)-.57*W(I-1)-.01*W(I-2)+2.2);
    END;
    END.
    53.8 53.6 53.5 53.5 53.4 53.1 52.7 .1 .1 .... 4
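The coefficients 2.1, -1.5, .34, ... used in the shorter form of model V and in model VI can be checked against the longer form of model V by multiplying out the operators. The sketch below is in Python rather than Algol W; the polynomial representation (a list of coefficients indexed by lag) and the helper name poly_mul are assumptions made for this illustration. A and B are read off the transfer function recursion, and C off the terms 1.53 and -.63 in the longer program.

```python
def poly_mul(p, q):
    """Multiply two polynomials in the backward-shift operator (index = lag)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

A = [1.0, -0.57, -0.01]                    # acts on y in the transfer function
B = [0.0, 0.0, 0.0, -0.53, -0.37, -0.51]   # acts on u (lags 3, 4, 5)
C = [1.0, -1.53, 0.63]                     # error-prediction operator of model V

AC = poly_mul(A, C)
y_coeffs = [-c for c in AC[1:]]            # (1 - AC): coefficients of y at lags 1..4
u_coeffs = poly_mul(B, C)[3:]              # BC: coefficients of u at lags 3..7

printed_y = [2.1, -1.5, 0.34, 0.01]        # as they appear in the shorter program
printed_u = [-0.53, 0.44, -0.28, 0.55, -0.32]

for name, exact, printed in (("y", y_coeffs, printed_y), ("u", u_coeffs, printed_u)):
    for lag_offset, (c, p) in enumerate(zip(exact, printed)):
        print(name, lag_offset, round(c, 4), p)
```

Each exact product agrees with the rounded coefficient in the program text to within 0.01, so the two forms of model V differ only through rounding.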

The table look-up for model VI is shown in Appendix C as the column headed $w_i$. The elements of this column are estimates of the elements of the "white noise" sequence $\{w_t\}$ of equation (3.7). Fig. 4(f) shows the structure of this model, although the above program is a more efficient implementation than that shown in the figure (cf model V). The equivalence of this model to equations (3.6)-(3.8) is shown by the following manipulations, where the operators A,B,C,D are as defined in fig. 3:

    $y_i = y_i^* + e_i$
    $y_i^* = (1-A)y_i + Bu_i$
    $e_i = (1-C)e_i + ADw_i$
    $\quad = (1-C)(y_i - y_i^*) + ADw_i$
    $\quad = (1-C)(Ay_i - Bu_i) + ADw_i$
    $\therefore\; y_i = (1-AC)y_i + BCu_i + ADw_i$
    $\therefore\; ACy_i = BCu_i + ADw_i$
    $\therefore\; y_i = \frac{B}{A}u_i + \frac{D}{C}w_i$

Note that the operators $(1-A)$ and $(1-C)$ act only on past values of $y_i$ and $e_i$, respectively.

3.7.9 Assessment of the Models

The size of each of the above six models was measured as the number of terminal characters of Algol W that appear in it. Reserved words, such as BEGIN, were considered to be single terminals, as were standard procedure names, such as WRITE. This practice is justified in chapter 6. Unnecessary spaces were not counted. The elements of each table look-up were taken to be as shown in Appendix C, except that positive entries were considered to be preceded by "+". The reason for this is discussed in chapter 7.

The following table gives the results of the model assessment. In addition to the size and information gain of each model, the "information explained" by it is shown. This quantity is the ratio of the information gain to the size of the trivial model and resembles an efficiency measure, if the information gain is not negative.

    MODEL    SIZE    INFORMATION GAIN    INFORMATION EXPLAINED
    I        1532    0                   0
    II       1159    373                 24.4%
    III      1076    456                 29.8%
    IV        964    568                 37.0%
    V        1005    527                 34.4%
    VI       1013    519                 33.8%

    TABLE I - Results of Model Assessment.

It is interesting to consider what the situation would be if the same six models were postulated for systems which consisted of the initial segments of the data, $S_j=((u_1,\ldots,u_j),(y_1,\ldots,y_j))$, $0 \le j \le 296$. The model sizes and information gains of the six models for these systems are shown in figs. 5 and 6, respectively. From fig. 6 it is seen that after the first few observations become available, the best of the six models

is model II - namely, the postulation of a constant mean. This model remains the best until about 90 observations have been obtained, whereupon model IV becomes better. Thereafter model IV remains the best model, even after all 296 observations have been obtained.

The more elaborate model V has a lower information gain than model IV. Examination of Appendix C reveals that model V does actually have smaller prediction errors than model IV. Nevertheless, the increase in complexity from model IV to model V is not justified; this indicates that model V is "overfitted" to the data.

Model III has a lower information gain than model IV. This is not very surprising, since model III is predicting the whole output using only the input observations, whereas model IV is predicting one step ahead, at each interval t using the latest output observation. However, model III also has a higher information gain than model II. This indicates that for long-term prediction it is better to use model III than just predict the mean.

It is interesting to note that, although model VI is of the form intended by Box and Jenkins, while model V is not, there is little to choose between them on the basis of information gain. Model VI has a slightly lower information gain than model V. Furthermore, the prediction errors $w_i$ are, on the whole, slightly smaller than those of model IV, but the difference is too insignificant to justify the increased complexity of model VI over model IV.
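The quantities reported in Table I follow directly from the model sizes. The sketch below, in Python rather than Algol W, recomputes them from the sizes in the table; the dictionary and variable names are assumptions made for this illustration.

```python
# Model sizes in Algol W terminals, taken from Table I.
sizes = {"I": 1532, "II": 1159, "III": 1076, "IV": 964, "V": 1005, "VI": 1013}
trivial = sizes["I"]

# Information gain: reduction in size relative to the trivial model.
gain = {m: trivial - s for m, s in sizes.items()}

# Information explained: gain as a percentage of the trivial model's size.
explained = {m: 100.0 * g / trivial for m, g in gain.items()}

best = max(gain, key=gain.get)

for m in sizes:
    print(m, sizes[m], gain[m], round(explained[m], 2))
print("largest information gain:", best)
```

The recomputed gains match the table exactly. The recomputed percentages agree with the printed ones to within about 0.1 of a percentage point; the small differences may come from the rounding used in the original table.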

3.8 Summary

The characterisation of modelling which has been developed in this chapter can be summarised as follows:

(1) A system is defined by a set of observations.

(2) A model of a system is an algorithm for computing the output observation set by using specified subsets of the system observations.

(3) Those aspects of a system's behaviour which are not understood are computed by the model with the aid of a table look-up.

(4) The situation at the beginning of the modelling exercise is captured by the concept of the trivial model.

(5) The modelling exercise progresses in "steps", each step resulting from the postulation of hypotheses about the system. In general, the transition from one step to the next cannot proceed algorithmically.

(6) At each step, the progress of the modelling exercise is measured by the information gain of the current model.

4. INCORPORATION OF A PRIORI KNOWLEDGE

4.1 Choice of Programming Language.

The modeller has certain a priori beliefs about the system he is modelling. His choice of programming language should reflect these. It will be recalled from sec. 3.4 that assessing models on the basis of information gain is tantamount to comparing the number of "arbitrary elements" which make up a hypothesis about a behaviour with the "number of observations" of that behaviour. These arbitrary elements are always counted relative to some structure which is taken for granted. This structure is provided by the definition of the programming language used. In other words, the programming language embodies those arbitrary elements which will be common to all the models being assessed. It obviously makes sense to choose the language so that these common elements coincide with those assumptions that the modeller is willing to take for granted.

For example, suppose that the modeller believes that the gas-furnace data of sec. 3.7 is certainly produced by a model of the form

    $y_i = b_0u_i + b_1u_{i-1} + \ldots + b_mu_{i-m} - a_1y_{i-1} - \ldots - a_ny_{i-n} + e_i$    (4.1)

and that he is not prepared to seriously consider a model with any other structure. Then he can (informally) define the following programming language. Every program of the language is a list of integers and rationals which is given the interpretation:

    $n, m, a_1,\ldots,a_n, b_0,\ldots,b_m, e_1,\ldots,e_N$.

The data for such a program is a similar list, with the interpretation:

    $i, y_{i-1},\ldots,y_{i-n}, u_i,\ldots,u_{i-m}$.

Given such a program and such a set of data, the computation of $y_i$ in accordance with eqn (4.1) is invoked. If input observations $(u_1,\ldots,u_N)$ and output observations $(y_1,\ldots,y_N)$ of a system are obtained, then a certain (infinite) set of programs in this language will constitute models of the system $((u_1,\ldots,u_N),(y_1,\ldots,y_N))$. The terms $e_1,\ldots,e_N$ which appear in the program form a table look-up. A trivial model is obtained if $m=n=b_0=0$, and $e_1=y_1,\ldots,e_N=y_N$.

The programming language just described will be called Linear Model Language, or LML. Fig. 7 shows the structure of the computations it performs. LML is used in Appendix A to illustrate the formal definition of programming languages.
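The interpretation of an LML program can be made concrete with a small evaluator. The following is a minimal sketch in Python, not the formal definition given in Appendix A; the function name run_lml and the use of flat Python lists for program and data are assumptions made for this illustration, and the numbers in the second example are invented.

```python
def run_lml(program, data):
    """Evaluate y_i by eqn (4.1) from an LML program list and a data list.

    program: n, m, a_1..a_n, b_0..b_m, e_1..e_N
    data:    i, y_{i-1}..y_{i-n}, u_i..u_{i-m}
    """
    n, m = int(program[0]), int(program[1])
    a = program[2:2 + n]
    b = program[2 + n:2 + n + m + 1]
    e = program[2 + n + m + 1:]           # the table look-up
    i = int(data[0])
    y_past = data[1:1 + n]                # y_{i-1}, ..., y_{i-n}
    u = data[1 + n:1 + n + m + 1]         # u_i, ..., u_{i-m}
    return (sum(bk * uk for bk, uk in zip(b, u))
            - sum(ak * yk for ak, yk in zip(a, y_past))
            + e[i - 1])

# Trivial model: m = n = b_0 = 0 and the table look-up holds the outputs themselves.
trivial = [0, 0, 0, 53.8, 53.6, 53.5]
second_output = run_lml(trivial, [2, 0.0])       # the input value is ignored

# A made-up first-order program: n = 1, m = 1, a_1 = -0.5, b_0 = 2, b_1 = 1.
toy = [1, 1, -0.5, 2.0, 1.0, 0.0, 0.25]
toy_output = run_lml(toy, [2, 4.0, 1.0, 3.0])    # i = 2, y_1 = 4, u_2 = 1, u_1 = 3
print(second_output, toy_output)
```

Everything that the evaluator itself does (the form of eqn (4.1), the layout of the lists) is structure "taken for granted" by the language, and so is not counted in the size of any individual LML program.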

Assuming that the same representation of numbers is used in LML as in Algol, it is clear that a model written in LML will be smaller than the Algol implementation of the same algorithm. So model assessment using LML will indicate fewer "arbitrary elements" in each model than would assessment using Algol. In a sense, some of the arbitrary elements have been shifted from the definition of each program to the definition of the language. This is seen clearly if LML is considered to be an Algol procedure, rather than a separate language.

It will be seen later in this chapter that the choice of programming language can affect the results of model assessment. So, if the choice of language is considered to be the specification of the modeller's a priori knowledge, then model assessment on the basis of information gain is seen to depend on a priori knowledge. This is a feature of any method of model assessment, and is not specific to the method being advocated here.

The modeller will often be uncertain about the correctness of his a priori beliefs. Fortunately, he does not always have to choose between conflicting assumptions. He can choose between programming languages which imply a greater or lesser state of knowledge. For example, the choice of LML for model assessment implies much more specific knowledge about the system than does the choice of Algol. An intermediate state of knowledge may perhaps be represented by use of a simulation language.

Note that the choice of language is not just a choice between a special-purpose and a universal language. The modeller's a priori beliefs may coincide fairly well with the structure embodied in some subset of Algol, but may be quite different from that embodied in some language designed for manipulating strings. Nevertheless both languages may be universal, in the sense that each is capable of computing every partial recursive function.

An obvious restriction which must be placed on a language which is to be used for model assessment arises as follows. Suppose that the language being used includes a standard procedure which can be called by its single-terminal identifier, say A. Suppose further that this procedure computes the output observation set of the system S by means of a table look-up. Then a program consisting of little more than the single terminal A would be a model of S. This model would be a shorter, and hence a better, model of S than any other model written in that language, yet it could be constructed without any understanding of the system S. Clearly, such a situation must be outlawed.

Insisting on the inadmissibility of such a procedure can be regarded as another aspect of specifying the modeller's a priori knowledge about the system. The use of the procedure would imply that the modeller is willing to accept the system behaviour he is examining as an "exogenous variable" - an aspect of the world which is not problematic, and one which he does not wish to investigate. This acceptance would, of course, render the modelling exercise redundant.

4.2 Asymptotic Models

In this section it is shown that the choice of programming language does not affect model assessment if the observation sets are large enough. The characterisation of modelling developed in chapter 3 considered a system to be defined by a single pair of observation sets. We now wish to extend this, so as to describe the modelling of an increasing set of observations.

Definition (4.2.1)

(a) Let $U_1=(u_1,\ldots,u_i)$ and $U_2=(u_{i+1},\ldots,u_j)$ be observation sets. Then the observation set $U_1U_2=(u_1,\ldots,u_i,u_{i+1},\ldots,u_j)$ is an extension of $U_1$.

(b) Let $S_1=(U_1,Y_1)$ and $S_2=(U_2,Y_2)$ be systems. Then the system $S_1S_2=((U_1U_2),(Y_1Y_2))$ is an extension of $S_1$.

Definition (4.2.2)

An infinite sequence of systems $\mathcal{S}=(S^1,S^2,\ldots)$, where $S^j$ is an extension of $S^{j-1}$ for every $j>1$ (i.e. $S^j=S_1S_2\ldots S_j$), is an asymptotic system.

We wish to consider models of the $S^j$'s which differ only in their table look-ups. To capture the idea of a table look-up without restricting it unduly, we shall consider models to be pairs $(m,T)$. Each element of the pair is a part of a program, and the pair $(m,T)$ is regarded as the complete program. This can be formalised quite easily, if required: take a pairing function $\tau$ (cf proof of theorem (3.2.1)), and change definition (3.3.6), so that a concrete (F,E)-model becomes an ordered pair of integers $(m,T)$, such that $F(\tau(m,T),i,D_i)=C_i$. These integers can be associated with programs, as before. $m$ will be considered to be the part which is common to models of all the $S^j$'s, while $T^j$ will be regarded as a table look-up, which may be different for each $S^j$. When a translation of the program $(m,T)$ from one language to another is considered, it will be assumed that $T$ (or at least its length) remains unchanged. On the other hand, the translation of $m$ will be assumed to be different from $m$. In this way a distinction is drawn between $T$ and $m$, which corresponds to some aspects of the distinction between table look-up and other types of program.

In the following definition a particular programming language is assumed; $m$ and $T^j$ are fragments of programs in this language. The definition is based on definition (3.3.1), and the notations of definition (4.2.1) are generalised in an obvious manner.

Definition (4.2.3)

Let $A^j=\{A_k^j\}$ be a set of ordered subsets of $(U_1U_2\ldots U_j)$, let $B^j=\{B_l^j\}$ be a set of ordered subsets of $(Y_1Y_2\ldots Y_j)$, and let $C^j=\{C_i^j\}$ be a complete set of $m_j$ disjoint ordered subsets of $(Y_1Y_2\ldots Y_j)$. Let $D^j$ be a set of ordered pairs $D_i^j=(A_k^j,B_l^j)$ $(i=1,2,\ldots,m_j)$, and let $E^j$ be a set of ordered pairs $E_i^j=(D_i^j,C_i^j)$ $(i=1,2,\ldots,m_j)$. Finally, let $\mathcal{E}$ be the sequence $\mathcal{E}=(E^1,E^2,\ldots)$, and $\mathcal{T}$ be the sequence $\mathcal{T}=(T^1,T^2,\ldots)$. Then the pair $(m,\mathcal{T})$ is an asymptotic $\mathcal{E}$-model of the asymptotic system $\mathcal{S}=(S^1,S^2,\ldots)$ if and only if $(m,T^j)$ is an $E^j$-model of $S^j$, for every $j=1,2,\ldots$

The following definitions distinguish between two possible asymptotic behaviours of rival models. $I(m,T^j)$ denotes the information gain of the model $(m,T^j)$, and $E(m,T^j)$ denotes the information explained by model $(m,T^j)$, namely the ratio of $I(m,T^j)$ to the size of the trivial model of $S^j$. $(m_1,\mathcal{T}_1)$ and $(m_2,\mathcal{T}_2)$ denote asymptotic models of some asymptotic system $\mathcal{S}$, with $\mathcal{T}_1=(T_1^1,T_1^2,\ldots)$ and $\mathcal{T}_2=(T_2^1,T_2^2,\ldots)$. We use $\liminf_{j\to\infty} x_j$ to denote $\lim_{j\to\infty}\inf_{k>j} x_k$, and similarly for lim sup.

Definition (4.2.4)

$(m_1,\mathcal{T}_1)$ is asymptotically weakly better than $(m_2,\mathcal{T}_2)$ (denoted by $(m_1,\mathcal{T}_1) >_w (m_2,\mathcal{T}_2)$) if and only if

    $\liminf_{j\to\infty}\,\{I(m_1,T_1^j)-I(m_2,T_2^j)\} = +\infty$    (4.2)

Definition (4.2.5)

$(m_1,\mathcal{T}_1)$ is asymptotically strongly better than $(m_2,\mathcal{T}_2)$ (denoted by $(m_1,\mathcal{T}_1) >_s (m_2,\mathcal{T}_2)$) if and only if

    $\liminf_{j\to\infty}\,\{E(m_1,T_1^j)-E(m_2,T_2^j)\} > 0$    (4.3)

The ideas behind these definitions are the following. Let $t_j$ denote the trivial model of $S^j$, and $|t_j|$ denote its size. We henceforth make the natural assumption that

    $\lim_{j\to\infty}|t_j| = +\infty$    (4.4)

If $(m_1,\mathcal{T}_1)$ is asymptotically weakly better than $(m_2,\mathcal{T}_2)$ then the "amount of information" extracted from $S^j$ by $(m_1,T_1^j)$ is eventually greater than that extracted by $(m_2,T_2^j)$, and the difference between them is eventually increasing. But their "rates of information extraction", as measured by the information explained, may be converging towards each other. For example, if $|t_j|=kj$, $I(m_1,T_1^j)=pj^{1/2}$, $I(m_2,T_2^j)=qj^{1/2}$, with $p>q$, then $I(m_1,T_1^j)-I(m_2,T_2^j)=(p-q)j^{1/2}\to\infty$, while $E(m_1,T_1^j)-E(m_2,T_2^j)=\frac{p-q}{k}j^{-1/2}\to 0$.

If $(m_1,\mathcal{T}_1)$ is asymptotically strongly better than $(m_2,\mathcal{T}_2)$ then the "rate of information extraction" by $(m_1,\mathcal{T}_1)$ is eventually greater than that by $(m_2,\mathcal{T}_2)$. The "weak/strong" terminology is justified by the following theorem.

Theorem (4.2.6)

    $(m_1,\mathcal{T}_1) >_s (m_2,\mathcal{T}_2) \;\Rightarrow\; (m_1,\mathcal{T}_1) >_w (m_2,\mathcal{T}_2)$.

Proof

Suppose $\liminf_{j\to\infty}\{I(m_1,T_1^j)-I(m_2,T_2^j)\} < +\infty$. Then $\exists N$, such that for any integer $k$, $\exists i>k$, such that $I(m_1,T_1^i)-I(m_2,T_2^i) < N$. Since $E(m,T^i)=I(m,T^i)/|t_i|$ and $|t_i|\to\infty$, this implies that for any integer $k$, and for any $\varepsilon>0$, $\exists i>k$, such that $E(m_1,T_1^i)-E(m_2,T_2^i) < \varepsilon$. But this contradicts $\liminf_{j\to\infty}\{E(m_1,T_1^j)-E(m_2,T_2^j)\} > 0$. Hence

    $\liminf_{j\to\infty}\{E(m_1,T_1^j)-E(m_2,T_2^j)\} > 0 \;\Rightarrow\; \liminf_{j\to\infty}\{I(m_1,T_1^j)-I(m_2,T_2^j)\} = +\infty$.
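The example that precedes Theorem (4.2.6), in which one model is weakly but not strongly better than another, can be tabulated numerically. The sketch below is in Python; the particular values k = 10, p = 3, q = 1 are arbitrary choices made for this illustration and do not come from the text.

```python
import math

k, p, q = 10.0, 3.0, 1.0          # illustrative constants with p > q

def gain_difference(j):
    """I(m1,T1^j) - I(m2,T2^j) when the gains are p*sqrt(j) and q*sqrt(j)."""
    return (p - q) * math.sqrt(j)

def explained_difference(j):
    """E(m1,T1^j) - E(m2,T2^j) when the trivial model has size k*j."""
    return gain_difference(j) / (k * j)

js = [10 ** 2, 10 ** 4, 10 ** 6]
gain_diffs = [gain_difference(j) for j in js]
explained_diffs = [explained_difference(j) for j in js]

for j, g, e in zip(js, gain_diffs, explained_diffs):
    print(j, g, e)
```

The difference in information gain grows without bound while the difference in information explained shrinks towards zero, so condition (4.2) holds but condition (4.3) does not.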

We now consider the effect of writing models in different languages on their asymptotic performance. For a precise discussion of what it means for a program to be written in some particular language, see chapter 5.

Let $(m_1,\mathcal{T}_1)$ and $(m_2,\mathcal{T}_2)$ be asymptotic models of $\mathcal{S}$ written in a programming language, $\lambda$. Let $\pi$ be a programming language, such that programs $(p_1,T_1^j)$, $(p_2,T_2^j)$, $j=1,2,\ldots$, can be written in $\pi$, and such that these programs compute the same partial recursive functions as the programs $(m_1,T_1^j)$, $(m_2,T_2^j)$, $j=1,2,\ldots$, respectively. Using the notation of definition (3.3.6) we can write

    $F_\pi(\tau(p_1,T_1^j),\cdot,\cdot) = F_\lambda(\tau(m_1,T_1^j),\cdot,\cdot)$    (4.5)

where $\tau$ is an appropriate pairing function, and similarly for $p_2,m_2$. Consequently $(p_1,\mathcal{T}_1)$ and $(p_2,\mathcal{T}_2)$ are asymptotic models of $\mathcal{S}$ written in $\pi$.

Let $|p|$ denote the size of a program $p$; let $t_j^\lambda$ be the trivial model of $S^j$ written in $\lambda$, and let $t_j^\pi$ be the trivial model of $S^j$ written in $\pi$. We assume that

    $|t_j^\pi| = |t_j^\lambda| + k_1$    (4.6)

Theorem (4.2.7)

With the notations and assumptions as stated above,

(a) $(m_1,\mathcal{T}_1) >_w (m_2,\mathcal{T}_2) \;\Leftrightarrow\; (p_1,\mathcal{T}_1) >_w (p_2,\mathcal{T}_2)$

(b) $(m_1,\mathcal{T}_1) >_s (m_2,\mathcal{T}_2) \;\Leftrightarrow\; (p_1,\mathcal{T}_1) >_s (p_2,\mathcal{T}_2)$

Proof

(a) There exist integers $k_2,k_3$, such that $|p_1|=|m_1|+k_2$ and $|p_2|=|m_2|+k_3$. Hence

    $I(m_1,T_1^j)-I(m_2,T_2^j) = |m_2|+|T_2^j|-|m_1|-|T_1^j|$
    $\quad = |p_2|+|T_2^j|-|p_1|-|T_1^j|+k_2-k_3$
    $\quad = I(p_1,T_1^j)-I(p_2,T_2^j)+k_2-k_3$.

The result follows from definition (4.2.4).

(b) $\liminf_{j\to\infty}\{E(p_1,T_1^j)-E(p_2,T_2^j)\}$
    $\quad = \liminf_{j\to\infty}\frac{1}{|t_j^\pi|}\{I(p_1,T_1^j)-I(p_2,T_2^j)\}$
    $\quad = \liminf_{j\to\infty}\frac{1}{|t_j^\lambda|+k_1}\left[\{I(m_1,T_1^j)-I(m_2,T_2^j)\}-k_2+k_3\right]$
    $\quad = \liminf_{j\to\infty}\{E(m_1,T_1^j)-E(m_2,T_2^j)\}$, by eqn (4.4).

The result follows from definition (4.2.5).

Theorem (4.2.7) shows that the asymptotic relative

merits of two asymptotic models are not changed by a change of programming

language.

that choice of programming knowledge,

This result,

coupled with the view

language specifies

has the following interpretation:

our characterisation

of modelling,

a priori According to

the relative merits of

two rival models are independent of the modeller's beliefs,

if the observation sets are large enough.

a priori This is

a weak condition which should be satisfied by any reasonable procedure for model assessment. 4.3

Practical Effect of Chan~e of Lan~ua@e. The asymptotic results of theorem

(4.2.7) say nothing

about the situation for small observation sets. section it is demonstrated

In this

that a change of programming

112

language

can a f f e c t

Models

I,II,

(cf. s e c t i o n Model

the r e s u l t s

be a s s e s s e d

(ELML).

w h i c h was d e s c r i b e d computations

assessment

IV and V of the g a s - f u r n a c e

3.7) w i l l

Language

of m o d e l

This

in s e c t i o n

using

4.1,

observations

Extended

is a l a n g u a g e

in practice.

Linear

similar

but w h i c h

to LML,

performs

of the form

boUi+'''+bmUi-m-alYl-2-'''-anYi-n

Yi =

e i = doWi+...+dpWi_p-Clei_1-...-Cqei_ q Yi = Yi+ei +y Fig.

8 shows

the s t r u c t u r e

ELML p r o g r a m given

(4.7) of these

is a list of i n t e g e r s

computations.

Each

and r a t i o n a l s

which

is

the i n t e r p r e t a t i o n :

Y'm'n'p'q'a1'''''an'bo'''''bm'Cl'''''Cq'do'''''dp'Wi'''''w The d a t a

for such

a program

is a n o t h e r

list,

with

N.

the

interpretation: i,Ui_q_m,...,ui,Yi_q_n,-.-,Yi_ Since

the

value

of e i t h e r

outputs

algorithm

(assuming

an

(i-max(m+q,n+q))th

that p
III and VI of the g a s - f u r n a c e

structure

However,

u or y

requires

the p r o g r a m

w i if i~ m a x ( m + q , n + q ) .

Models the

(4.7)

I.

models

constitute Model

(4.7),

and so c a n n o t

I, II,

these m o d e l s

data do not have

be w r i t t e n

IV and V can.

The ELML

are:

I - Trivial O,O,O,O,O,O,i,+53.8,+53.6,...,+57.O

Model

in ELML.

II - Mean 53.5,O,O,O,O,O,i,+.3,+.i,...,+3.5.

.

program~which

113

Model

IV - D e t e r m i n i s t i c 22.4,5,2,O,O,.57,.01,0,O,O,-.53,-.37,-.51,1, +53.8,+53.6,...,+1.6.

Model V - Stochastic 2.2,5,2,0,2,.57,.01,0,0,O,-.53,-.37,-.51,.53,.63,1, +53.8,+53.6,...,+.2. The were shows

table

look-ups

for the A l g o l

of these m o d e l s

W versions

a comparison

are the same

of s e c t i o n

of the p e r f o r m a n c e

3.7.

as they

Table

II

of the models.
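The recurrences of the form (4.7) can be illustrated with a short Python sketch. This is not the ELML interpreter of the text: the function name `run_elml`, the helper `_lag`, and the convention that values before the start of a sequence are zero are all assumptions made for illustration (the text instead lets the program output $w_i$ directly for small i). Rationals are represented with `fractions.Fraction`.

```python
from fractions import Fraction


def _lag(seq, i, k):
    """Return seq[i - k], or 0 when that index lies outside seq (assumption)."""
    j = i - k
    return seq[j] if 0 <= j < len(seq) else Fraction(0)


def run_elml(y_bar, a, b, c, d, u, w):
    """Evaluate recurrences of the form (4.7) for i = 0 .. len(w) - 1.

    a = [a1..an], b = [b0..bm], c = [c1..cq], d = [d0..dp];
    u is the input sequence and w the sequence of residual terms.
    """
    y_hat, e, y = [], [], []
    for i in range(len(w)):
        # y_hat_i = b0*u_i + ... + bm*u_{i-m} - a1*y_hat_{i-1} - ... - an*y_hat_{i-n}
        yh = sum(b[k] * _lag(u, i, k) for k in range(len(b)))
        yh -= sum(a[k - 1] * _lag(y_hat, i, k) for k in range(1, len(a) + 1))
        # e_i = d0*w_i + ... + dp*w_{i-p} - c1*e_{i-1} - ... - cq*e_{i-q}
        ei = sum(d[k] * _lag(w, i, k) for k in range(len(d)))
        ei -= sum(c[k - 1] * _lag(e, i, k) for k in range(1, len(c) + 1))
        y_hat.append(yh)
        e.append(ei)
        # y_i = y_hat_i + e_i + y_bar
        y.append(yh + ei + y_bar)
    return y
```

With the leading parameters of Model II (mean 53.5, all orders zero, $b_0 = 0$, $d_0 = 1$) and the first two listed terms +.3 and +.1, the sketch returns 53.8 and 53.6, which agree with the first two values listed in Model I.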

MODEL    SIZE    INFORMATION GAIN    INFORMATION EXPLAINED
I        1494    0                   0
II       1119    375                 25.1%
IV       879     615                 41.2%
V        853     641                 42.9%

TABLE II - Results of Model Assessment Using ELML.

It is seen from the table that model V is now assessed as being better than model IV. However, assessment using Algol W (cf. section 3.7, table I) showed model IV to be better than model V. It is easy to see intuitively why this change in the assessment comes about. The language ELML has so much information inherent in its definition, that it requires a comparatively small increase in the number of arbitrary elements of a program to change from the algorithm (of the form (4.7)) appropriate to model IV to that appropriate to model V. Consequently a comparatively small improvement in the ability of this algorithm to explain the observations is sufficient to justify the increase. When the models are written in Algol W a greater improvement is required.
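The figures in Table II are consistent with a simple relationship, sketched below in Python: the information gain of a model equals the size of the trivial model minus the size of the model, and the information explained is that gain as a percentage of the trivial model's size. This relationship is read off from the tabulated numbers and is an assumption here, not a quotation of the formal definitions of chapter 3; the function names are hypothetical.

```python
def information_gain(size_trivial, size_model):
    """Gain of a model relative to the trivial model (assumed: difference of sizes)."""
    return size_trivial - size_model


def information_explained(size_trivial, size_model):
    """Gain expressed as a percentage of the trivial model's size (assumed)."""
    return 100.0 * information_gain(size_trivial, size_model) / size_trivial


# Sizes taken from Table II; model I is the trivial model.
sizes = {"I": 1494, "II": 1119, "IV": 879, "V": 853}

for name, size in sizes.items():
    gain = information_gain(sizes["I"], size)
    explained = information_explained(sizes["I"], size)
    print(f"{name:>3}  size={size:5d}  gain={gain:4d}  explained={explained:.1f}%")
```

Under this reading, the ranking of models IV and V depends only on their sizes, which is why a change of language that alters program lengths can reverse the assessment.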

4.4 Summary

The choice of the programming language which appears in the characterisation of modelling developed in chapter 3 can be regarded as a specification of the modeller's a priori knowledge about the system which he is investigating. The characterisation has been extended to deal with the common case, where increasing sets of observations are being examined. It has been shown that in this case the modeller's a priori beliefs become increasingly irrelevant to the assessment of any alternative models which he may be considering, as the observation sets grow.

However, it has been demonstrated that in practice, especially for small sets of observations, the results of such an assessment are conditional on the modeller's a priori beliefs.

5. FRAGMENTS OF PROGRAMMING LANGUAGES

5.1 Introduction

In chapter 4 the idea was developed that the definition of the programming language to be used for model assessment corresponds to a specification of the modeller's a priori knowledge of the system. But can specific meaning be attached to the phrase "definition of a programming language"? When two models are compared, the languages required to write each of them as programs are rarely exactly the same. Can their comparison on the basis of program lengths be meaningful then, since the a priori knowledge assumed for each is slightly different? This chapter and the next are concerned with these questions.

The expedient adopted earlier, of associating a programming language with a partial recursive function, is no longer satisfactory, since it gives no information about the algorithm used for computing the function. Appendix A reviews one method of defining programming languages formally, namely the so-called "Vienna Method". We henceforth assume familiarity with the contents of this Appendix. In this chapter, the Vienna method is used as a basis for a precise definition of "programming language". The notions of a program being "written in a language" and of "the function computed by a language" are then introduced. The concept of "fragment of a language" is made precise in section 5.4, and is used to define "equivalence of languages".

Languages are equivalent only if they are indistinguishable to the user. However, equivalent languages need not be equal, since their interpreting automata, for instance, may be different.

Most programs are written in infinitely many languages, many of which are fragments of others. Therefore the "family of languages in which a program is written" is introduced. Such a family corresponds, roughly, to the set of languages which is referred to by a name like "Algol" or "Fortran". If suitably restricted, a family of languages in which a program is written has a "smallest" element, which is called the "support" of the program. The main aim of this chapter is the formalisation of this concept of "the support of a program", since this is the concept which is required for model assessment.

The following example may help to clarify this chapter. Consider the program:

    begin integer i,j;
      if i=j then i:=j+1 else i:=j;
      i:=1;
    end.

This program is written in Algol, but it uses only a few of the features which Algol usually provides. It uses block structure, some integer arithmetic, assignment statements, and conditional expressions. It does not use procedures, for statements, or goto statements, for instance. Also it uses very few terminals. In our terminology this program is written in an Algol-family of languages (more precisely, it is written in an AlgolX-family of languages, where AlgolX is some completely specified Algol-like language). Languages in this family include all the versions of Algol found implemented in practice. The AlgolX-support of the program is, roughly speaking, the smallest Algol-like fragment which allows the above program to be written. It will include block structure, integer addition, assignment statements, conditional expressions, the identifiers i,j, and the integer 1. It will not include any further features. A constraint on the concrete syntax of the AlgolX-family ensures that the AlgolX-support of the program allows more than one program to be written in it. Thus the AlgolX-support may not have the trivial and useless concrete syntax:

    <program>::=begin integer i,j; if i=j then i:=j+1 else i:=j; i:=1; end

(However, a $\lambda$-support of the program exists for some language $\lambda$, which does have this concrete syntax).

The notion of "equivalence" which will be formalised is intended to coincide with the concept which appears in Ollongren (49), although the notion of "fragment" is slightly different.

The application of these ideas to model assessment will be presented in chapter 6.

5.2 Preliminaries

We shall use p to denote a program in the usual sense, namely a finite string of letters from a finite alphabet of terminals. When discussing a particular language with a specified (concrete) grammar G, we shall assume the existence of a parsing algorithm $\pi_G$, and write $p_G$ to denote the derivation tree $\pi_G(p)$ (see section A.3 and Ollongren (49) sections 2.3 and 3). We shall always assume that G is unambiguous, namely that for any p, there is at most one derivation of p in G. L(G) will denote the language generated by G.

Definition (5.2.1) Consider three context-free grammars (cf. sec. A.3) $G_i = (N_i, Z_i, P_i, S_i)$, i=1,2,3, where $N_i$ is a set of nonterminals, $Z_i$ is a set of terminals, $P_i$ is a set of productions, and $S_i$ is a start symbol.

(i) $G_1 \subseteq G_2 \iff$
    (a) $N_1 \subseteq N_2$
    (b) $Z_1 \subseteq Z_2$
    (c) $P_1 \subseteq P_2$
    (d) $S_1 = S_2$

(ii) $G_1 = G_2 \cap G_3 \iff$
    (a) $N_1 = N_2 \cap N_3$
    (b) $Z_1 = Z_2 \cap Z_3$
    (c) $P_1 = P_2 \cap P_3$
    (d) $S_1 = S_2 = S_3$.

(Note: we use $\subseteq$ to denote improper inclusion. Thus $G \subseteq G$, for all G).

Lemma (5.2.2) $G_1 \subseteq G_2$ & $G_2 \subseteq G_3 \Rightarrow G_1 \subseteq G_3$.

Proof: Immediate from transitivity of set inclusion.
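The component-wise inclusion and intersection of definition (5.2.1) can be sketched in Python by representing a grammar as a record of sets. The class name `Grammar`, the function names, and the encoding of a production as a pair (left-hand side, tuple of right-hand symbols) are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset
    terminals: frozenset
    productions: frozenset  # pairs (lhs, rhs), rhs being a tuple of symbols
    start: str


def is_subgrammar(g1, g2):
    """Improper inclusion G1 <= G2: component-wise inclusion, same start symbol."""
    return (g1.nonterminals <= g2.nonterminals
            and g1.terminals <= g2.terminals
            and g1.productions <= g2.productions
            and g1.start == g2.start)


def intersection(g2, g3):
    """G2 intersected with G3, defined only when the start symbols agree."""
    if g2.start != g3.start:
        raise ValueError("start symbols differ")
    return Grammar(g2.nonterminals & g3.nonterminals,
                   g2.terminals & g3.terminals,
                   g2.productions & g3.productions,
                   g2.start)


# A toy grammar S -> a | a S, and two extensions of it.
small = Grammar(frozenset({"S"}), frozenset({"a"}),
                frozenset({("S", ("a",)), ("S", ("a", "S"))}), "S")
with_b = Grammar(frozenset({"S"}), frozenset({"a", "b"}),
                 small.productions | {("S", ("b",))}, "S")
with_c = Grammar(frozenset({"S"}), frozenset({"a", "c"}),
                 small.productions | {("S", ("c",))}, "S")
```

In this toy case `intersection(with_b, with_c)` gives back `small`, and `small` is included in both extensions, in line with lemma (5.2.2) on the transitivity of inclusion.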

Lemma (5.2.3)

(i) $G_1 \subseteq G_2 \Rightarrow L(G_1) \subseteq L(G_2)$

(ii) $G_1 = G_2 \cap G_3 \Rightarrow L(G_1) \subseteq L(G_2) \cap L(G_3)$.

Proof: (i) $G_1 \subseteq G_2$ & $p \in L(G_1) \Rightarrow G_1 \subseteq G_2$ & $S_1 \Rightarrow_{G_1} p \Rightarrow S_2 \Rightarrow_{G_2} p$ (by def. (5.2.1)) $\Rightarrow p \in L(G_2)$.

(ii) From (i), $G_1 = G_2 \cap G_3 \Rightarrow L(G_1) \subseteq L(G_2)$ & $L(G_1) \subseteq L(G_3) \Rightarrow L(G_1) \subseteq L(G_2) \cap L(G_3)$.

(Note that the converse inclusion does not hold, because $S \Rightarrow_{G_2} p$ and $S \Rightarrow_{G_3} p$ may be different derivations, neither of which is possible in $G_1$).

Definition (5.2.4) $\Pi(G) = \{\pi_G(p) : p \in L(G)\}$ is the set of derivation trees generated by G.

Lemma (5.2.5)

(i) $G_1 \subseteq G_2 \Rightarrow \Pi(G_1) \subseteq \Pi(G_2)$

(ii) $G_1 = G_2 \cap G_3 \Rightarrow \Pi(G_1) \subseteq \Pi(G_2) \cap \Pi(G_3)$.

Proof: (i) From definition (5.2.1) it follows that every production in $G_1$ is also in $G_2$. Hence every derivation in $G_1$ is also in $G_2$. But to every derivation there corresponds a unique derivation tree (Ollongren (49) sec. 2.3). Hence every derivation tree in $\Pi(G_1)$ is also in $\Pi(G_2)$.

(ii) From (i), $G_1 = G_2 \cap G_3 \Rightarrow \Pi(G_1) \subseteq \Pi(G_2)$ & $\Pi(G_1) \subseteq \Pi(G_3) \Rightarrow \Pi(G_1) \subseteq \Pi(G_2) \cap \Pi(G_3)$.

The availability of a metalanguage for the definition

of programming languages is assumed, as is the existence of the set of tree-structured objects which is introduced in Appendix A. In the context of model assessment, the metalanguage is the same as the one that would be used (informally) by the modeller for specifying his a priori knowledge.

The following definition of an interpreting automaton differs slightly from that given by Ollongren (49). Explicit reference to the set of tree-structured objects has been dropped, since the same set is assumed throughout. Also, a set of error states E is introduced. Its purpose will be seen later.

Definition (5.2.6) An interpreting automaton is a 5-tuple $M = (\text{is-state}, \xi_0, \Lambda, F, E)$, where is-state is a predicate which determines a set of states, $\xi_0$ is the initial state, $\Lambda$ is the state-transition function, $F \subseteq$ is-state is a set of final states, and E is a (possibly empty) set of error states which has the property: $\xi \in E \Rightarrow \Lambda(\xi) \subseteq E$. $\Lambda(\xi)$ is a set of states. (Thus $\Lambda$ is not a function from states to states, in general). If $\xi \in F$, then $\Lambda(\xi)$ is not defined. The predicate is-state is not further defined, but it is assumed that every state has a control part, and that this determines the state-transition function in the manner described for LML in section A.6.2, and for more general languages in section A.6.

Definition (5.2.7) A computation of an interpreting automaton $M = (\text{is-state}, \xi_0, \Lambda, F, E)$ is a sequence $(\xi_0, \xi_1, \dots)$, such that is-state($\xi_i$) for i=0,1,..., and $\xi_i \notin F \Rightarrow \xi_{i+1} \in \Lambda(\xi_i)$. The computation terminates if $\xi_i \in F$ for some i.

5.3 Programming Languages

Definition (5.3.1) A programming language is a 5-tuple $\lambda = (G, \text{is-program}, T, M, op)$ where G is an unambiguous, context-free grammar which defines the concrete syntax; is-program is a predicate which defines the abstract syntax; $T: \Pi(G) \to$ is-program is a translator; M is an interpreting automaton whose state-transition function $\Lambda$ is effectively computable, and whose initial state $\xi_0(p_A)$ depends on an object $p_A = T(p_G) \in$ is-program; and op is an output function whose domain is in the set F of final states of M. Furthermore, for every $p_A \in$ is-program, if $(\xi_0(p_A), \xi_1(p_A), \dots, \xi_F(p_A))$ is a terminating computation, then every other computation $(\xi_0(p_A), \xi'_1(p_A), \dots)$ terminates (with $\xi'_F \in F$), and $op(\xi_F(p_A)) = op(\xi'_F(p_A))$.

In the above definition it is assumed that the predicate is-program specifies the abstract syntax not only of the "program", but also of the "data". This constitutes a departure from the practice followed in the Vienna method, and in the definition of LML in Appendix A.
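The notions of interpreting automaton, computation, and output function can be made concrete with a small Python sketch. Everything named here is an assumption for illustration: the function `compute`, the representation of F and E as predicates, the rule for choosing among alternative successor states (`choose`, defaulting to `min`), and the step limit, none of which come from the text. States are plain integers rather than tree-structured objects.

```python
def compute(initial, transition, is_final, is_error, choose=min, max_steps=1000):
    """Follow one computation (xi_0, xi_1, ...) and report how it ended.

    transition(state) returns the set of possible successor states.
    Returns (trace, status) with status in {"final", "error", "stuck", "limit"}.
    """
    state = initial
    trace = [state]
    for _ in range(max_steps):
        if is_error(state):
            return trace, "error"      # an error state was entered
        if is_final(state):
            return trace, "final"      # the computation terminates
        successors = transition(state)
        if not successors:
            return trace, "stuck"      # transition undefined on a non-final state
        state = choose(successors)     # pick one alternative (assumed rule)
        trace.append(state)
    return trace, "limit"


# Toy automaton: count down to 0; negative numbers play the role of error states.
def countdown(state):
    return {state - 1}


def op(final_state):
    """Hypothetical output function, defined on final states only."""
    return "done"
```

Starting from 3, the sketch follows the computation 3, 2, 1, 0 and stops in the final state 0, where `op` may be applied. Starting from a negative state, the computation is reported as having entered an error state.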

To see that this is no restriction, consider the following. If a predicate is-program has been defined which does not make provision for data, and the data is assumed to be located in some directory of the state (as is done in the case of LML), then a new predicate can be defined by:

    is-program_1 = (<s-program:is-program>, <s-data:is-data>)

where is-data is a predicate satisfying the abstract syntax of data sets. All that remains to be done is to modify the instruction associated with the initial state, so that the first action of the interpreting automaton is to read s-data(p) into the appropriate directory (or directories), and to make s-program(p) the next argument. $\xi_1$ of the new machine is then identical with $\xi_0$ of the old one.

The output function op is introduced in order to have available the notion of "result of a computation" without restricting the predicate is-state. Note that although the states appearing in alternative computations need not be the same, we impose the usual requirement that if the result of a computation is defined, then it is unique.

Definition (5.3.2) Let $\lambda = (G, \text{is-program}, T, M, op)$ be a programming language. p is written in $\lambda$ ($p \in_w \lambda$) if and only if $p \in L(G)$, $p_A = T(p_G)$ is defined (where $p_G = \pi_G(p)$), and for every $\xi_i$ appearing in the computation $(\xi_0(p_A), \xi_1(p_A), \dots)$, $\Lambda(\xi_i)$ is defined unless $\xi_i \in F$, and $\xi_i \notin E$.

The role of the set of error states E can now be seen. At the end of section A.6 it is remarked that specifying the concrete and abstract syntax of a language is not sufficient to define the valid programs in a language. Part of the definition must be accomplished by specification of the interpreting functions: whenever an invalid program is encountered, an error state is entered. This aspect of language definition is formalised in definition (5.3.2). By definition (5.2.6), once a computation enters an error state it remains in an error state.

Definition (5.3.3) $P_\lambda = \{p : p \in_w \lambda\}$ is the set of all programs written in $\lambda$.

Definition (5.3.4) The function computed by the language $\lambda$ is the function $\varphi_\lambda : P_\lambda \to$ range(op), such that $\varphi_\lambda(p) = op(\xi_F(p_A))$, where $p_A = T(\pi_G(p))$, $(\xi_0(p_A), \dots, \xi_F(p_A))$ is a computation and $\xi_F(p_A) \in F$, and $\varphi_\lambda(p)$ is undefined if no such $\xi_F$ exists.

(Definition (5.3.1) ensures that $\varphi_\lambda$ is a function).

The function $\varphi_\lambda$ is a partial effectively computable function. If the sets $P_\lambda$ and range(op) are suitably arithmetised (i.e. put into one-to-one correspondence with the nonnegative integers), then, by Church's thesis, $\varphi_\lambda$ can be regarded as a partial recursive function. Furthermore, if the predicate is-program allows a clear distinction between "program" and "data", and if each of these can be arithmetised

separately, then $\varphi_\lambda$ can be regarded as a partial recursive function of two arguments. If $\varphi_\lambda$, when so regarded, is universal (cf. Rogers (9)), then $\lambda$ is universal. All those entities conventionally considered to be programming languages are universal. However LML, which is included in definition (5.3.1), is not universal.

5.4 Fragments of Languages

In this section a subscript usually denotes the language to which reference is being made. For example, $G_\lambda$ denotes the grammar of $\lambda$.

Definition (5.4.1) Let $\lambda_1$ and $\lambda_2$ be programming languages. $\lambda_1$ is a fragment of $\lambda_2$ ($\lambda_1 <_F \lambda_2$) if and only if

(i) $G_{\lambda_1} \subseteq G_{\lambda_2}$

(ii) $p \in_w \lambda_1 \Rightarrow p \in_w \lambda_2$

(iii) $\forall p \in_w \lambda_1$, $\varphi_{\lambda_1}(p) = \varphi_{\lambda_2}(p)$, where it is understood that if one of $\varphi_{\lambda_1}(p)$, $\varphi_{\lambda_2}(p)$ is undefined, then so is the other,

(iv) $\forall p \in L(G_{\lambda_1})$, $p \in_w \lambda_2 \Rightarrow p \in_w \lambda_1$.

Roughly speaking, if $\lambda_1$ is a fragment of $\lambda_2$, then everything that can be done using $\lambda_1$ can also be done using $\lambda_2$. However, to make this definition useful, the languages concerned must be defined in a rather idiosyncratic manner. Part (iv) of the definition implies that if $\lambda_2$ contains standard procedures, for example, which are not available in $\lambda_1$, then they must be called by new terminal characters which do not appear in the grammar of $\lambda_1$. This is not the usual practice, but there is no reason why it should not be done. A given programming language (understood informally) does not have a unique formal definition. Definition (5.4.1) assumes that much more of the burden of the language definition has been transferred from the translator to the concrete syntax, than is convenient in practice. This point will be discussed further in section 6.2.

Theorem (5.4.2) $<_F$ is reflexive and transitive.

Proof: Reflexivity is obvious. Suppose $\lambda_1 <_F \lambda_2$ and $\lambda_2 <_F \lambda_3$. Then

(i) $G_{\lambda_1} \subseteq G_{\lambda_2}$ and $G_{\lambda_2} \subseteq G_{\lambda_3}$. Hence $G_{\lambda_1} \subseteq G_{\lambda_3}$ by lemma (5.2.2).

(ii) $p \in_w \lambda_1 \Rightarrow p \in_w \lambda_2$ and $p \in_w \lambda_2 \Rightarrow p \in_w \lambda_3$. Hence $p \in_w \lambda_1 \Rightarrow p \in_w \lambda_3$.

(iii) $\forall p \in_w \lambda_1$, $\varphi_{\lambda_1}(p) = \varphi_{\lambda_2}(p)$ and $\forall p \in_w \lambda_2$, $\varphi_{\lambda_2}(p) = \varphi_{\lambda_3}(p)$. So, $\forall p \in_w \lambda_1$, $\varphi_{\lambda_1}(p) = \varphi_{\lambda_3}(p)$, by def. (5.4.1) (ii).

(iv) $\forall p \in L(G_{\lambda_1})$, $p \in_w \lambda_2 \Rightarrow p \in_w \lambda_1$ (A)
and $\forall p \in L(G_{\lambda_2})$, $p \in_w \lambda_3 \Rightarrow p \in_w \lambda_2$. (B)
Suppose $p \in L(G_{\lambda_1})$ and $p \in_w \lambda_3$. Then $p \in L(G_{\lambda_2})$ by lemma (5.2.3) (i). So $p \in_w \lambda_2$, by (B). Hence $p \in_w \lambda_1$, by (A).
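The four conditions of definition (5.4.1) can be checked mechanically on finite toy data. The Python sketch below is only an illustration under strong simplifying assumptions: a language is reduced to a set of productions (standing in for its grammar), a finite set of strings standing in for L(G), and a table giving the computed function on the written programs. The class `ToyLanguage` and the function `is_fragment` are hypothetical names, not constructs from the text.

```python
from dataclasses import dataclass


@dataclass
class ToyLanguage:
    productions: frozenset  # stands in for the grammar G
    generated: frozenset    # finite stand-in for L(G)
    phi: dict               # written program -> computed value (None if undefined)


def is_fragment(l1, l2):
    """Check conditions (i)-(iv) of definition (5.4.1) on the toy data."""
    # (i) grammar inclusion (here: productions only)
    if not l1.productions <= l2.productions:
        return False
    # (ii) every program written in l1 is written in l2
    if not set(l1.phi) <= set(l2.phi):
        return False
    # (iii) the computed functions agree on programs written in l1
    if any(l1.phi[p] != l2.phi[p] for p in l1.phi):
        return False
    # (iv) a string generated by l1's grammar and written in l2 is written in l1
    return all(p in l1.phi for p in l1.generated if p in l2.phi)


small = ToyLanguage(frozenset({"S->a", "S->aS"}),
                    frozenset({"a", "aa"}),
                    {"a": 1, "aa": 2})
big = ToyLanguage(frozenset({"S->a", "S->aS", "S->b"}),
                  frozenset({"a", "aa", "b"}),
                  {"a": 1, "aa": 2, "b": 0})
# Same grammar as `small`, but "aa" is not accepted as a written program.
partial = ToyLanguage(small.productions, small.generated, {"a": 1})
```

Here `small` is a fragment of `big`, while `partial` is not: the string "aa" is generated by its grammar and written in `big`, so condition (iv) requires it to be written in `partial` as well. This is the sense in which the burden of the language definition is shifted to the concrete syntax.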

Definition (5.4.3) $\lambda_1$ is equivalent to $\lambda_2$ ($\lambda_1 \sim \lambda_2$) if and only if $\lambda_1 <_F \lambda_2$ and $\lambda_2 <_F \lambda_1$.

Theorem (5.4.4) $\sim$ is an equivalence relation.

Proof: Reflexivity and symmetry are obvious. Transitivity follows from theorem (5.4.2).

Theorem (5.4.5)

~ ~

2~

Ptl=P~2

(i)

(ii) q31 i = ~ 2 Proof : Lemma Proof:

Immediate (5.4.6)

from definitions

H
~
=G ~

(5.4.1)

and

(5.4.3).

~9

pewU~Pew~

(ii)~Pew~ , ~ 0 (p) = q01 (p) (iii)~peL(G u) , P £ w ~

Pew~.

G~=G~&9
=G 9

'.

(i) G = G (ii) pew~-~ p~w~ (iii) V P £ w ~ , Q 3 ( p ) = ~ ( p ) (iv)~peL(G) Hence ~ ~ ~ by defs Definition

(5.4.1)

, pew94=~pew~. and

(5.4.3).

(5.4.7)

[~]~={l~:~'~l}

is the equivalence

class of I m o d u l o ~.

127 Lemma

(5.4.8)

For any programming language I, there

exists at most a finite number of equivalence classes

[U]~,

such that H
~
G ~

G~=(N~,Z~,P~,S~)

But each of the sets N~,Z~,Pk,SI definition

is finite.

(5.2.1)) Hence, by

(5.2.1), there exist only finitely many distinct

G , such that G Definition

(cf. def.

~-Gk.

The result follows from lemma

(5.4.6).

(5.4.9)

Let S be a set of programming

languages.

Then the set

of common fra@ments of S is the set NFS ={I: ~ e S ~ Theorem

(5.4.10)

Proof:

~eNFS&~

I
~

(~ES~
~(9¢S ~E~FS Definition

pElFS. ~)&~u

~
~
~) by theorem

by definition

(5.4.2)

(5.4.9).

(5.4.11)

The @reatest common fra@ment of a set of programming languages S is the set r (s) ={X :I~NFS& (~cAFS ~ p < F I) }. Theorem (5.4.12) IaF(S) & ueS~l
Theorem

(5.4.13)

(i)

IEF(S)&~eF(S) ~

l~p

(ii)

~EF (S) &1~p ~ U E F

(S)

128

(i)

Proof:

ICF(S)&NCF(S)~AEF(S)&P£~FS ~H
Similarly, I
(5.4.11).

%, by definition

Hence ~%p.

IEF (S) &I~B ~ %¢~F S& (Ue~FS ~ 9
( 9 ~ F S ~ U
by theorem (5.4.10) and definition

(5.4.3) ,

by theorem (5.4.2), ~eF(S),

by definition

Lemma (5.4.14)

%eS&%eNFS ~

Proof:

%~S&~e~FS ~ ( ~ F S ~ < F I ) & I ~ F S

for(S)

~leF(S), Definition

(5.4.11).

by definition

(5.4.11).

(5.4.15)

If pew% , then the l-family of languages in which p is written is the set A l(p)={p:pewp& (l
pewl4=~ leak(p)

Obvious

Theorem (5.4.17)

PeAx(P)&U£AI(P)~v(P)=~(P).

Proof:

(~
and

~eAl(p)~

U
by definition

l
similarly.

(5.4.1)

129

Hence

l-lEA;t ( p ) ~ ~

Similarly

VEA~ ( p ) ~ I D U (p)=~QX (p) "

Therefore

~eA~(p)&ueA{p)~

( p ) = ~ (p).

~0p(p)=g~(p).

Note that if the convention p ~ w l ~ then theorem

(5.4.17) would not hold.

A~(p)=0 were dropped, Also, note that if

pewk then it is always possible to construct a ~, such that pew~, yet A k ( p ) ~ A (p)=~. Definition

(5.4.18)

The l-support of a program p is the set of languages ZI (p) =F (Ak (p)). By theorem

(5.4.13)

p are equivalent.

(i), all languages in the h-support of If p{w ~, then Z~ (p)=~.

We now come to the central result of this chapter. Theorem

(5.4.19)

(i)

V

l~,k"eAk(p) , 3~cAk(p)

such that

~
Al(p)#~u£Al(p)

, such t h a t V f

£ Ak(p),

~
language

in A k (p)). (iii) Furthermore, Proof:

9£Zl(p).

(i) Suppose t"EA X(p)

Then pewl'&(l
or l'
and l " e A t (p). & p£w l'' &(l
then k~
(5.4.2).

Put ~=l'.

130

If I"
Put p=l".

If I
G~,~GI,,

is-program T (pGI)

={T~(pG~

) :pG~ e~ (G) }

= Tl~(pGl ) , V p G I

eH (G)

Mp=MI~ op=opl~ Now pEL(GI,)

& peL(G~,,) & GI.~G 1 & GI,, ~ GI.

derivations S ~ . p

If the

and S ~ G ,,p were distinct then there would W

be two distinct derivations S G ~ P ,

by definition

(5.2.1).

But this would contradict the standing assumption that G1 is unambiguous.

Consequently

and GI,, are the same.

the derivations of p in GI.

Consequently the productions which

appear in this derivation are all in Gp. and P G ~ = ~ G ~ ( P ) ~ ( G

).

Also, p e w l ' ~ A p ( ~ computation

Hence p£L(G ),

Therefore TU (PGp)=TI~(PGp)~

is-program

i) is defined for every ~i in the

(to( p ) ,~i (P) .... ) (~i%F) , and Ap(~i)nE =~.

These last two properties follow becaue the interpreting automata of l ~ and p are the same. (5.3.2), pCwp

....................

Hence, by definition (i)

131

From the construction, (a)

G

(b)

P'ew~ ~

it is clear that:

~ G I. p~£w ~"

(c)

V p'~w n

: ~x,(p') = ~ (p~)

(d)

kf p'£L(G u) : p'ewX ~ p ' e w U .

Hence, by definition

(5.4.1),

But I~
(i) and

It remains

(3),

~
hence

~eA~(p)

. . . . . . . . . . . .

(e)

From the construction,

(f)

p~ew~

(5.2.3)

(5.4.1)

P ~ w ~'' & I " < F I ~ ° ~ " (f) , p ~ e w ~ 0

(h)

Let p'eL(G

I"
By lemma

[I']~ ~ A l ( p )

By

(p')=~l(p')

(5.4.1)

(iii).

similarly.

Then p'ew I, since

, by definition

(5.4.1)

(iv).

show that ~
(5) together prove (5.4.8),

(5)

(i).

there are at most finitely

, such that I'
shows that 3 9 £ A l(p) (iii)

by definition

) and p " e w I" .

(g) , (h) together

(4) and (ii)

many

(iv)

(p')=~l, , (p').

But u
(e) , (f),

But p~eL(GI,,)

(i), and I"
p~ewU&~
So, using

(4)

it is clear that G U ~ GI,,.

p~ew~ , since ~
Hence p ~e w I" , by definition (g)

(5.4.2).. (3)

to show that U
by (e) and lemma

(2),

~
(2)

such that ~

This, together with

I'EA l(p)

(ii), ~e~FA 1 (p) and 9e~ 1 (p).

9 F1 . Hence,

by lemma

(i),

132

(5.4.14) , 9oFf(p).

Corollary

(5.4.20)

Definition languages. Theorem

pew~Z~(p)

(5.4.21)

Let S and T be sets of programming

Then S~T~=~(VseS&VteT) :

(5 .4.22)

Proof:

~Al(p)

DeZ~(pI ) &~eZ~

Immediate from theorems

Definition

(5.4.23)

(pz)

s~t. &

P~==~(Pl)~E~(P2

(5.4.13)

)"

(i) and (5.4.4).

Let T(p) denote the set of terminals

which occur in p. Theorem

(5.4.24)

Proof:

~% (Pl)~X~ (Pz) ==~ T (pt)=Y (P~) •

Z~(pl)%7-1(p2 ) ~

(5.4.3).

GF I(p~)=Gz l(P2 )' by definition

Let G ZI(Pl ) =(N,A,P,S),

of terminals •

where A is the alphabet

Suppose T(pl)~T(p2) .

Then T(pl ) c A

T(pz)cA~nd either T(pl )#A or T(p2 )~A (or both). T(p )#A. 1

Consider the grammar G=(N,T(p

1

and

Suppose

), P~,S), where P"

is obtained from P by deleting those production rules ~÷8 for which 8 ~ A - T ( p ) .

Clearly,

G~G Z l(pl ).

If El(p l ) is

now replaced by the language obtained from El(pl) by replacing GE l(Pl) by G, then the resulting of Zl(pl) since

language will be a fragment

and will be in Al(pl).

But this is a contradiction,

El(pl) is a fragment of every language in Al(pl) ,

and is not equivalent

to the new language

(because T(p )#A). i

Hence T(pl)=A.

Similarly T(p2)=A.

Hence T(pl)=T(p2).
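The set T(p) of definition (5.4.23) is easy to compute for small programs, which gives a quick check of the necessary condition in theorem (5.4.24). The Python sketch below treats lexical tokens (keywords, identifiers, integers, and the symbols `:=`, `=`, `+`, `;`, `,`, `.`) as the terminals; this tokenisation, the pattern `TOKEN`, and the function name `terminals` are assumptions for illustration, since the terminal alphabet depends on how the concrete syntax is actually specified.

```python
import re

# One alternative per kind of token; ":=" is tried before "=" so it is kept whole.
TOKEN = re.compile(r":=|[A-Za-z][A-Za-z0-9]*|[0-9]+|[=+;,.]")


def terminals(program):
    """Return the set of (assumed) terminals occurring in the program text."""
    return set(TOKEN.findall(program))


# The short program of section 5.1 and a variant that rearranges the same symbols.
prog1 = "begin integer i,j; if i=j then i:=j+1 else i:=j; i:=1; end."
prog2 = "begin integer i,j; if i=j then i:=j else i:=j+1; j:=i; end."

same_terminals = terminals(prog1) == terminals(prog2)
```

Equality of the two sets is only a necessary condition for the supports to be equivalent; it does not by itself establish equivalence.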

5.5 Conclusion

Conditions (i) and (iv) of definition (5.4.1) ensure that it is possible for two different programs to have the same $\lambda$-support. Recall the short program given in section 5.1. Consider its AlgolW-support, where AlgolW is taken to be defined (informally) as in (50). Condition (i) of definition (5.4.1) implies that its concrete syntax specification

must include the production rules: <program>::=.

::=<statement>end

::=l<statement>; ::= beginl::=
head><declaration>

statement>I<simple

::=
clause><simple

statement> statemen%)

else <statement> etc.

The language generated by these production rules includes the program

    begin integer i,j;
      if i=j then i:=j else i:=j+1;
      j:=i;
    end.

Now, condition (iv) of definition (5.4.1) ensures that this program is written in the AlgolW-support of the first program, since it is obviously written in AlgolW. It is easy to see that the AlgolW-supports of the two programs are equivalent. However, a proof of this would require a full formal definition of AlgolW. In the above example, we have relied on an informal understanding of the semantics of AlgolW, and we were content not to prove the equivalence of the AlgolW-supports of the two programs. It is envisaged that this will be the typical method of using the concept of $\lambda$-support in connection with model assessment. The language $\lambda$ will not need to be specified formally, and theorem (5.4.24) provides an easily-checked necessary condition for the equivalence of $\lambda$-supports.

This chapter has shown that if programming languages are considered to be defined in a certain way (namely, using the Vienna method, and in accordance with the comments on definition (5.4.1)), then it is possible to speak precisely about the "smallest" language required to run a particular program.

The reason for showing this is the following. Suppose that the program is a model, and that the definition of the smallest language required to run it is taken as a specification of the modeller's a priori knowledge. Then it has been shown that this concept of "a priori knowledge" can be precisely defined, if required. However, we have succeeded in defining this concept only relative to a particular family of languages.

6. $\lambda$-COMPARABILITY

6.1 Introduction

Two rival models of a system, written in a language $\lambda$, will very rarely have the same $\lambda$-support. It may be argued, then, that the use of each of them implies a slightly different set of a priori assumptions. In this case, a comparison of the two models may not seem meaningful. On the other hand, one may consider that the choice of a particular language specifies the a priori assumptions, and that these are not changed if it later appears that certain features of the language are not needed. Rather than argue the merits of either viewpoint, we shall attempt to show that it does not matter much for model assessment, whichever position is adopted.

For the purposes of model assessment, the term "programming language" must be understood rather differently than is usual in computer science. Since the definition of Algol 60 it has become common to define "reference" languages, which are independent of particular hardware facilities. Consequently, certain aspects of a language (such as input/output in Algol) are considered to lie outside the province of its definition. For us, however, "programming language" means a computing facility, exactly as it appears to the programmer.

The most important difference between this view of a language and a "reference" language (for us) is that standard procedures, such as input/output procedures and mathematical procedures (e.g. sin, sqrt), must be considered to be included in the language definition.

6.2 $\lambda$-Comparable Models

Definition (6.2.1) Let $m_1$ and $m_2$ be two models of a system S. Then $m_1$ and $m_2$ are $\lambda$-comparable if and only if

$\Sigma_\lambda(m_1) \cap \Sigma_\lambda(m_2) \neq \emptyset$.

In this definition it is of course assumed that λ is a language which interprets $m_1$ and $m_2$ as models of S; in other words, in the terminology of definition (3.3.6), $m_1$ and $m_2$ are assumed to be $(\lambda,E_1)$ and $(\lambda,E_2)$ models of S, respectively, for some $E_1$ and $E_2$.

λ-comparability ensures that all the "facilities" of a language which are used by one model are also used by the other. Such "facilities" are apparently trivial details, such as whether procedures are allowed to take one or several arguments, the availability of goto statements or of arrays, how many characters are available, and the identity of all the terminal characters. More obviously significant features of a language can be regarded as "facilities". Perhaps the most significant feature which can be expected to affect model assessment is its complement of standard procedures - namely, those procedures which can be called in a program without being defined (declared).

Consequently, if λ-comparability is to be a useful concept, it must be ensured that λ-comparable models use the same standard procedures. This is so if standard procedure identifiers are considered to be terminals of the language. As was mentioned in chapter 5, this is not the usual practice, but we adopt it for the following reasons.

Firstly, there is no essential distinction, for our purposes, between a standard procedure call and an operation such as addition, which is usually called by its own terminal. Such distinction as there is arises for reasons of convenience of practical language implementation. Secondly, in section 6.3 it is demonstrated how standard procedures can be included in the definition of a language when their identifiers are treated as terminals. Thirdly, if standard procedure calls were built up from existing terminals (that is, if the syntax of standard procedure names were the same as the syntax of programmer-supplied procedure identifiers), then our concept of "fragment of a language", as defined in section 5.4, would not correspond to the intended intuitive notion, and λ-comparability would not ensure the use of the same standard procedures.

To see this, suppose that two models of some system are written in Algol. Suppose that one of them uses the standard procedures entier and sqrt, while the other uses sin as well as entier and sqrt. If entier, sqrt and sin are terminals of the language, then the models are certainly not Algol-comparable (by theorem (5.4.24)).

But if entier, sqrt and sin are simply procedure identifiers whose syntax is given by, for example:

<procedure identifier>::=<identifier>
<identifier>::=<letter>|<identifier><letter>

then the Algol-supports of the two models may be equivalent. This syntax, which is part of the syntax of the Algol-support of the first model, allows sin to be used as an identifier (since the productions <letter>::=i|n|s must also be part of the syntax), and furthermore it allows it to be used as a procedure identifier. Now, suppose that the Algol-support of the first model is a fragment of the Algol-support of the second model (a possibility which we wish to allow, intuitively). Then, according to definition (5.4.1) (iv), the sin call must have the same effect in both languages. So, the Algol-support of the first model must contain sin as a standard procedure. Clearly this contradicts the intended meaning of "Algol-support". Consequently, we insist that standard procedure identifiers be regarded as terminals.

If it is now stipulated that only λ-comparable models should be compared for model assessment, then we have the formal equivalent of the intuitive idea, that models should be compared only if they use the same facilities of a language. One reason for making this stipulation has already been referred to in section 6.1. It may be felt to be an "unfair" comparison if the models are not λ-comparable. An obvious example of this would be a comparison of a difference-equation model with a differential-equation model. If the differential-equation model were allowed to call a standard procedure for integration, would it be reasonable to compare the "number of arbitrary elements" embodied in it with the number embodied in the difference-equation model? The difference-equation model requires fewer a priori assumptions (if its λ-support is a fragment of the λ-support of the differential-equation model, where λ is the language in which both models are written).

There are, however, two ways of making models λ-comparable. Rather than adding an explicitly declared integration procedure to the differential-equation model, it is possible to add a "dummy" call of the standard procedure for integration to the difference-equation model. It is the flexibility offered by this possibility, of "padding" models with redundant statements, that reduces the significance of any insistence on λ-comparability. This will be demonstrated in section 6.3.

If λ-comparability is required, there still remains the choice of a suitable λ-support for the models which are to be compared. Returning to the above example, there is still a decision to be made - should both models call the standard procedure for integration, or neither? This decision, of course, is very significant for model assessment. But this is the decision discussed in chapter 4. In other words, it will be governed by the a priori assumptions that the modeller wishes to make.

6.3 Example: AlgolW-Comparable Gas Furnace Models

This section investigates how the assessment of the six models of the gas-furnace data (cf. chapter 3) is altered, if they are required to be AlgolW-comparable.

6.3.1 Standard Procedures

The definition of Algol W is assumed to be a formalised version of the specification given in (50). The six models use three standard procedures, namely the input/output procedures READ, READON and WRITE. In accordance with the discussion of section 6.2, we consider the syntax specification of (50) to be augmented by the productions:

<simple statement>::=<standard procedure statement>
<standard procedure statement>::=<standard procedure identifier>(<actual parameter list>)
<standard procedure identifier>::=READ|READON|WRITE

The abstract syntax, translator and interpreting automaton are considered to be modified accordingly.

6.3.2 AlgolW-Comparable Models

In this example the models are modified so as to be AlgolW-comparable by inserting redundant statements and expressions, rather than by avoiding certain constructions.

Referring to section 3.7, and comparing the models in order, we notice first that the syntax of the AlgolW-support of model II contains the productions

<simple t expression>::=<simple t expression>+<t term>|<t term>

whereas the syntax of the AlgolW-support of model I contains only the production

<simple t expression>::=<t term>

(For the significance of "t" see (50) or the introduction to Appendix B). This discrepancy can be removed by changing the WRITE statement of model I to:

WRITE(Y(I)+0);

Model III requires several productions which are not needed for models I or II. These are:

< l e t t e r > : : = NIU ::=. <simple

t expression>::=<simple

: : = < t t e r m > * < t

t expression>-
term>

factor>

::= ::=<simple

t expression>
operator>

<simple t e x p r e s s i o n > ::=< <statement>::=
statement>

<simple s t a t e m e n t > : : = < b l o c k > l < : : = < t

t a s s i g n m e n t statement> left part><

t expression>

: : = : = : : = < i f

clause><simple

statement>ELSE

<statement> : : = IF < l o g i c a l e x p r e s s i o n >

THEN

Most but not all of these are needed for model IV, but model IV itself needs two productions which are not needed by models I, II, or III:

<letter>::=E|V|W|Z
<actual parameter list>::=<actual parameter>|<actual parameter list>,<actual parameter>

The only new productions required by models V and VI are <letter>::=A, and <letter>::=W, respectively, but these can easily be removed by using different identifiers.

We give below the six models, modified so as to be AlgolW-comparable. The concrete syntax of their common AlgolW-support is given in Appendix B.

I

The Trivial Model

BEGIN INTEGER I,J,N,V,W,Z; REAL ARRAY E,U,Y(1::296);
  BEGIN FOR J:=1 UNTIL 296 DO READON (Y(J-0));
    READ (I);
    V:=0;
    IF I<1 THEN WRITE (0,V) ELSE WRITE (Y(I)*1+0);
  END
END.
53.8 53.6 53.5 ... 57.0

II The Mean

BEGIN INTEGER I,J,N,V,W,Z; REAL ARRAY E,U,Y(1::296);
  BEGIN FOR J:=1 UNTIL 296 DO READON (Y(J-0));
    READ (I);
    V:=0;
    IF I<1 THEN WRITE (0,V) ELSE WRITE (53.5+Y(I)*1);
  END
END.
.1 .3 0 0 -.1 ... 3.5

III Deterministic Transfer Function (Using Input Observations Only).

BEGIN INTEGER I,J,E,V,W,Z; REAL ARRAY N,U,Y(1::296);
  FOR J:=1 UNTIL 296 DO READON (N(J));
  READ (I);
  IF I<6 THEN WRITE (N(I),0) ELSE
  BEGIN
    FOR J:=1 UNTIL 5 DO
    BEGIN Y(J):=N(J)-53.5; READON (U(J));
    END;
    FOR J:=6 UNTIL I DO
    BEGIN READON (U(J));
      Y(J):=.57*Y(J-1)+.01*Y(J-2)-.53*U(J-3)-.37*U(J-4)-.51*U(J-5);
    END;
    WRITE (Y(I)+53.4+N(I));
  END;
END.
53.8 53.6 ... 4.1

IV Deterministic Transfer Function (Using Input and Output Observations)

BEGIN INTEGER I,J; REAL ARRAY E(1::296); REAL N,U,V,W,Y,Z;
  FOR J:=1 UNTIL 296 DO READON (E(J));
  READ (I);
  N:=0;
  IF I<6 THEN WRITE (E(I)) ELSE
  BEGIN READON (U,V,W,Y,Z);
    WRITE (22.4-.53*U-.37*V-.51*W+.57*Y+.01*Z+E(I));
  END;
END.
53.8 53.6 ... 1.6

V Stochastic Process Model

BEGIN INTEGER I,J; REAL ARRAY E,U,Y(1::296); REAL N,V,W,Z;
  FOR J:=1 UNTIL 296 DO READON (E(J));
  READ (I);
  IF I<8 THEN WRITE (E(I)) ELSE
  BEGIN FOR J:=1 UNTIL 7 DO READON (U(I-J),Y(I-J));
    WRITE (2.1*Y(I-1)-1.5*Y(I-2)+.34*Y(I-3)+.01*Y(I-4)-.53*U(I-3)+.44*U(I-4)
           -.28*U(I-5)+.55*U(I-6)-.32*U(I-7)+E(I)+2.2);
  END;
END.
53.6 53.8 ... 2

VI Box & Jenkins Model

BEGIN INTEGER I,J; REAL ARRAY E,U,Y(1::296); REAL N,V,W,Z;
  FOR J:=1 UNTIL 296 DO READON (E(J));
  READ (I);
  IF I<8 THEN WRITE (E(I)) ELSE
  BEGIN FOR J:=1 UNTIL 7 DO READON (U(I-J),Y(I-J));
    WRITE (2.1*Y(I-1)-1.5*Y(I-2)+.34*Y(I-3)+.01*Y(I-4)-.53*U(I-3)+.44*U(I-4)
           -.28*U(I-5)+.55*U(I-6)-.32*U(I-7)+E(I)-.57*E(I-1)-.01*E(I-2)+2.2);
  END;
END.
53.8 53.6 ... 4

It will be seen that the models have been "padded" so that they are not changed in essence, although the details of their compilation and execution may have been changed (for example, by the introduction of the extra block in models I and II). It is straightforward but tedious to check that each production listed in Appendix B is used in each of these models, and that every production required for the parsing of these models is listed in Appendix B.

6.3.3 Comparison of Assessments

Table III shows a comparison of the sizes of the models before and after the above modifications. Table IV compares their performances. The table look-ups remain unchanged (from those shown in Appendix C), since the syntax required for each of them is already common.

         Size, excluding table look-up   Size of table    Total Size
Model    Unmodified      Modified        look-up          Unmodified   Modified

I            52             93             1480             1532         1573
II           57             96             1102             1159         1198
III         199            209              877             1076         1086
IV          129            131              835              964          966
V           203            212              802             1005         1014
VI          225            234              788             1013         1022

Table III - Sizes of Models Before and After Modification
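The arithmetic behind Table III can be checked with a short sketch. It is written in Python rather than Algol W, and it assumes (as the tabulated figures suggest, though the definition belongs to an earlier chapter) that the information gain of a model is the amount by which its total size falls short of that of the trivial model I. The names `SIZES`, `total_sizes` and `information_gains` are illustrative only.

```python
# Figures copied from Table III. For each model:
# (size excluding table look-up, unmodified), (same, modified), (size of table look-up).
SIZES = {
    "I": (52, 93, 1480),
    "II": (57, 96, 1102),
    "III": (199, 209, 877),
    "IV": (129, 131, 835),
    "V": (203, 212, 802),
    "VI": (225, 234, 788),
}


def total_sizes(sizes):
    """Total size = fixed part + table look-up, before and after modification."""
    return {m: (u + t, p + t) for m, (u, p, t) in sizes.items()}


def information_gains(totals, trivial="I"):
    """Assumed definition: gain = total size of the trivial model minus that of the model."""
    base_u, base_p = totals[trivial]
    return {m: (base_u - u, base_p - p) for m, (u, p) in totals.items()}


TOTALS = total_sizes(SIZES)
GAINS = information_gains(TOTALS)

for model in SIZES:
    print(model, TOTALS[model], GAINS[model])
```

Under this assumption the differences reproduce the information gains listed in Table IV, for example 568 and 607 for model IV.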

Tables III and IV show that although the sizes of models I and II, excluding the table look-ups, nearly doubled as a result of the modifications, the information gains and the information explained by each model were very little affected. This can be expected to be true in many cases, for the following reason. The size of the trivial model will usually be dominated by the size of its table look-up, even for small observation sets. Any necessary "padding" of the model can usually be introduced very economically, so the overall size of the trivial model does not change much (from 1532 to 1573 in the present example). On the other hand, as the models become more sophisticated and their table look-ups smaller, they will need less "padding" since many more features of the language will already be in use (cf. table III). Thus, once again, the overall size will not change much.

         Information Gain           Information Explained
Model    Unmodified   Modified      Unmodified   Modified

I             0           0              0           0
II          373         375           24.4%       23.8%
III         456         487           29.8%       31.0%
IV          568         607           37.1%       38.6%
V           527         559           34.4%       35.4%
VI          519         551           33.8%       35.4%

Table IV - Performances of Models Before and After Modification

6.4 Conclusion

Although chapter 5 provides some precise concepts which can be associated with "a priori knowledge", it is not clear which concept is most appropriate. Should "a priori knowledge" be considered to be specified by the language λ in which one intends to write a model, or by the λ-support in which it turns out to be written? The first view corresponds to simply associating "a priori knowledge" with a certain partial recursive function without reference to the model, while the second view corresponds to associating it with a partial recursive function after examining the assumptions needed by the model.

For the example investigated, it has been shown that it does not matter, in practice, which view is adopted. There is reason to suppose that this is true in most cases. This is most useful, not only because it is not clear whether models should be λ-comparable before being compared, but also because checking λ-comparability of complicated models would be quite difficult (although much of it could be automated).

7. TABLE LOOK-UP CODINGS

7.1 Introduction

Model assessment will obviously be affected by the manner in which table look-ups in models are coded. Since a table look-up can be viewed as a generalised error (cf. chapter 3), the details of the coding will determine the implicit trade-off between complexity and approximation. In the examples of previous chapters, a particular coding has been assumed without comment. We shall show that this coding is a natural one to use.

7.2 Size-Capturing Codings

We assume a finite alphabet P. Let P* be the set of all finite sequences of elements of P, including the empty sequence Λ. Suppose that model assessment is to be performed using a language λ, and that the set of programs written in λ is a subset of P*. Suppose further that all the rival models being assessed are such that the elements of their table look-ups are rationals. By using a suitable scaling, all these elements can be regarded as integers (cf. sec. 3.2). In order to obtain a concrete λ-model, these integers must be coded into elements of P* which appear as fragments of programs written in λ.

Definition (7.2.1)

A coding is a total, effective, injective function $c : N \to P^*$, where N denotes the set of all integers (previously N has denoted the set of nonnegative integers). Let $N^+$ denote the set of nonnegative integers.

Definition (7.2.2)

With each coding c there can be associated a code-length function $L = c \circ \ell : N \to N^+$, where $\ell$ is the function which gives the length $\ell(p)$ of $p \in P^*$. Here $\circ$ denotes composition, so that $L(n) = \ell(c(n))$.

The terms which appear in a table look-up of a model represent the errors which the model makes in trying to compute the observed system behaviour. Our method of assessing models is based on the assumption that a short table look-up corresponds to small errors. This will only be so if the coding used in λ is suitable.

Definition (7.2.3)

A size-capturing coding is one for which the associated code-length L satisfies:

(i) $L(-n) = L(n)$, for all $n \in N$,
(ii) $|n_1| > |n_2| \Rightarrow L(n_1) \geq L(n_2)$.

Clearly, the use of a size-capturing coding leads to table look-ups having the desired property. The requirement that L should be an even function reflects the usual indifference to the sense of the error. The positive entries appearing in Appendix C were considered to be preceded by "+", in order to obtain a size-capturing coding.
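The two conditions of definition (7.2.3) can be tested mechanically over a finite range. The following Python sketch does so for decimal notation with and without a leading "+". It is only an illustration: the function names are hypothetical, zero is written "+0" as an arbitrary choice, and a finite check cannot prove the property for all integers.

```python
def decimal_unsigned(n):
    """Usual decimal notation: positive integers are left unsigned."""
    return str(n)


def decimal_signed(n):
    """Decimal notation with every non-negative integer preceded by '+'."""
    return ("-" if n < 0 else "+") + str(abs(n))


def is_size_capturing(code, bound=60):
    """Check conditions (i) and (ii) of definition (7.2.3) for all |n| <= bound."""
    def length(n):
        return len(code(n))

    ns = range(-bound, bound + 1)
    even = all(length(-n) == length(n) for n in ns)
    monotone = all(
        length(n1) >= length(n2)
        for n1 in ns for n2 in ns if abs(n1) > abs(n2)
    )
    return even and monotone


print(is_size_capturing(decimal_unsigned))  # False: "-5" is longer than "5"
print(is_size_capturing(decimal_signed))    # True within the tested range
```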

Size-capturing codings can be compared with the class L of loss-functions discussed by Deutsch (51). The use of a coding which was not size-capturing would be tantamount to declaring that certain large errors are preferable to certain small errors, or, perhaps, that in some cases positive errors are preferable to negative ones, and so on. This is perfectly permissible, of course, but we wish to restrict ourselves to the more usual situation. Notice also that the same effect could be achieved by using a size-capturing coding, but changing the measurement scales (i.e. by "transforming the

variables").

Let Q be a subset of P, containing r elements ($r \geq 2$), and $C_Q$ be the set of size-capturing codings whose ranges are subsets of Q*. For example, the ordinary decimal coding of the integers has as range a subset of $\{0,1,\ldots,9,+,-\}^*$.

Definition (7.2.4)

A smallest r-coding is a size-capturing coding $c_0 \in C_Q$, such that $L_0(n) = [\log_r |n|] + 1$, where $L_0 = c_0 \circ \ell$, $[x]$ denotes the greatest integer not exceeding x ($x \in R$), and $L_0(0) = 0$.

For example, if r=2 then the coding $c_0$ defined as follows is a smallest 2-coding: $c_0(0) = \Lambda$, $c_0(1) = 0$, $c_0(-1) = 1$, $c_0(2) = 00$, $c_0(-2) = 01$, $c_0(3) = 10$, etc. A corresponding coding can clearly be constructed for any r>2.

Theorem (7.2.5)

If $c_0$ is a smallest r-coding, and $L_0 = c_0 \circ \ell$, then for any $c \in C_Q$, $L = c \circ \ell$, $L(n) \geq L_0(n)$.

Proof: Since c is injective and total (definition (7.2.1)) all the elements of $S(n) = \{c(0), c(1), c(-1), c(2), \ldots, c(n)\}$ exist and are distinct. Suppose $n < 0$. Then $|S(n)| = 2|n| + 1$. Since c is assumed to be size-capturing, we have $|i| \leq |n| \Rightarrow L(i) \leq L(n)$, so that no element of S(n) is longer than L(n). Now the number of possible code words (i.e. images of c) of code-length j is $r^j$, since $c \in C_Q$. Hence

$$|S(n)| \leq \sum_{j=0}^{L(n)} r^j = \frac{r^{L(n)+1} - 1}{r - 1}.$$

Assume $L(n) < L_0(n) = [\log_r |n|] + 1$, i.e. $L(n) \leq [\log_r |n|]$. Then

$$|S(n)| \leq \frac{r^{[\log_r |n|]+1} - 1}{r - 1} \leq \frac{r|n| - 1}{r - 1} = \frac{r|n|}{r - 1} - \frac{1}{r - 1} < 2|n| + 1 = |S(n)|,$$

since $r \geq 2$. Hence the assumption $L(n) < L_0(n)$ leads to a contradiction. If $n > 0$ then the assumption $L(n) < L_0(n)$ leads to the same contradiction since $L(n) = L(-n)$ and $L_0(n) = L_0(-n)$. Hence $L(n) \geq L_0(n)$.

The ordinary decimal coding of the integers reserves the symbols +,- for denoting whether the integer is positive or negative. Consequently its code-length is given by $L(n) = [\log_{10} |n|] + 2$, if it is assumed that positive integers are always preceded by "+" (The usual convention of leaving positive integers unsigned leads to this coding not being size-capturing, since $L(n) \neq L(-n)$; furthermore the usual identification of 0 with 00, 000, etc, means that the usual notation is not a coding according to our definition (7.2.1)). Note that since we are interested only in comparing models, it does not much matter whether the code we use has length $[\log_{10} |n|] + 1$ or $[\log_{10} |n|] + 2$.
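The smallest 2-coding given as an example after definition (7.2.4) can be generated for any integer. The Python sketch below is one possible construction, not necessarily the author's: it assumes that the pattern of the listed values continues, with the integers taken in the order 0, 1, -1, 2, -2, ... and matched with the binary words in order of increasing length. The function names are illustrative.

```python
def smallest_2_coding(n):
    """Code word over {0, 1} for the integer n.

    Integers are taken in the order 0, 1, -1, 2, -2, ... and matched with the
    binary words '', '0', '1', '00', '01', '10', '11', '000', ...
    """
    if n == 0:
        k = 0
    elif n > 0:
        k = 2 * n - 1
    else:
        k = -2 * n
    # The k-th word (k = 0, 1, 2, ...) is the binary numeral of k + 1
    # with its leading 1 removed; bin() prefixes '0b', hence the slice at 3.
    return bin(k + 1)[3:]


def code_length(n):
    """Length of the code word, which should equal [log2 |n|] + 1 for n != 0."""
    return len(smallest_2_coding(n))


for n in (0, 1, -1, 2, -2, 3, -3, 4):
    print(n, repr(smallest_2_coding(n)), code_length(n))
```

For nonzero n the integer expression `abs(n).bit_length()` equals $[\log_2 |n|] + 1$, which gives a floating-point-free way of checking the code-length.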

7.3 Coding Determines Feature Selection

Suppose that the models being considered have the structure shown in fig. 9. Each model has one table look-up, whose contents are denoted by $E = \{e_i : i = 1,\ldots,N\}$. The "fixed" part of the model, namely that part which is not the table look-up, is denoted by m. m operates on the model input $D_i$ to give $m(D_i)$, which is summed with $e_i$ to give the model output $C_i$ (cf. sec. 3.3). Suppose further that $C_i = Y_i$, the ith system output observation, and that $e_i$, $Y_i$ are single integers (as opposed to vectors of integers). Let $\ell(m)$ be the length of the λ program implementing m, and let L(E) denote $\sum_{i=1}^{N} L(e_i)$, namely the length of the table look-up when coded into λ (assuming that λ uses a coding c for which $L = c \circ \ell$). Note that we have assumed that any separators used in the table look-up are included in m.

The suggested assessment procedure for rival models, namely choosing the shorter model, leads to the selection of the model with the shorter table look-up, if the number N of observations is large enough. Thus a change of the coding used for model assessment tends to change the "weighting" which is given to various features of the model behaviour. For example, if the coding used is such that $L(n) = n^2$ then the model which gives a better least-squares fit will tend to be selected. If $L(n) = |n|^x$ and x is large then the model which gives a better "minimax" fit will tend to be selected.

Suppose that the table look-up of one of the models has $|e_i| = a$, $i = 1,\ldots,N$, whereas another one has $e_i = 0$ for 99% of the entries, and $|e_i| = 3a$ for 1% of the entries. Then the use of a coding with code-length $L(n) = |n|$, for example, would tend to favour the second model, while a coding with $L(n) = |n|^5$ would tend to favour the first. On the other hand, the use of a coding with L(n) increasing sufficiently slowly may be indifferent to either table look-up. Ideally, the coding which should be used is one which best emphasises the features in which the modeller is interested. However, it is not clear whether a coding exists which is capable of reflecting the features which are important to the modeller, or even whether he can say precisely what they are.
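The effect of the code-length on which table look-up is shorter can be made concrete with a small Python sketch of the 99% / 1% example. The values a = 2 and N = 100 are arbitrary choices made for illustration, the "fixed" parts of the two models are ignored, and the power-law code-lengths are taken directly as lengths without constructing the codings themselves.

```python
def table_length(entries, code_length):
    """L(E): sum of the code-lengths of the entries of a table look-up."""
    return sum(code_length(e) for e in entries)


def power_length(x):
    """Code-length function L(n) = |n| ** x."""
    return lambda n: abs(n) ** x


a, N = 2, 100                          # illustrative values only
first = [a] * N                        # |e_i| = a for every entry
second = [0] * 99 + [3 * a]            # 99% zeros, 1% entries of size 3a

for x in (1, 2, 5):
    L = power_length(x)
    print(x, table_length(first, L), table_length(second, L))
```

With these numbers the exponents 1 and 2 give the shorter table look-up to the second model, while the exponent 5 gives it to the first.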

The existence of a smallest r-coding strongly suggests that this coding should be used for model assessment. After all, in searching for a good model, the modeller is trying to find the most concise way of expressing, or computing, the observations. It therefore seems natural that he should use the most concise coding.

The use of the smallest r-coding for model assessment leads to the most conservative complexity/approximation trade-off, in the sense that any other size-capturing coding will more quickly favour a complex model with small fitting errors. Furthermore, this procedure results in a "safe" assessment, in the sense that in certain circumstances the results agree with the results which would be obtained if any other size-capturing coding were being used. This will be shown in the next section.

7.4 Effect of Change of Coding.

Let λ' be a programming language which is identical to λ, except that it uses the coding c' rather than c. If the entries in a table look-up of a λ-model are coded via the coding c, and are transformed by applying the function $c^{-1} \circ c'$ to them, then the resulting program will be a λ'-model ($c^{-1}$ is a function, since c is injective, definition (7.2.1)). Furthermore, if the λ-model has "fixed" part m and table look-up E (cf. fig. 9), and its size is $\ell(m) + L(E)$, then the size of the λ'-model is $\ell(m) + L'(E)$, where $L = c \circ \ell$ and $L' = c' \circ \ell$.

Suppose that we have two rival models of a system, with "fixed" parts $m_1$, $m_2$ and table look-ups $E_1$, $E_2$, respectively. Let $e^1_i$, $e^2_i$ denote the entries of $E_1$, $E_2$. Suppose that the models are assessed using λ and λ'.

Theorem (7.4.1)

Let c, c' be size-capturing codings and $L = c \circ \ell$, $L' = c' \circ \ell$ be such that, for any i, j,

$$|L'(i) - L'(j)| \geq |L(i) - L(j)|.$$

If $\ell(m_1) + L(E_1) > \ell(m_2) + L(E_2)$, and a bijection $h : \{1,2,\ldots,N\} \to \{1,2,\ldots,N\}$ exists, such that $|e^2_i| \leq |e^1_{h(i)}|$ $(i = 1,\ldots,N)$, then $\ell(m_1) + L'(E_1) > \ell(m_2) + L'(E_2)$.

(The bijection h is used to rearrange $E_1$, so that to every entry in $h(E_1)$ there corresponds a smaller entry in $E_2$).

Proof:

$$|L'(e^1_{h(i)}) - L'(e^2_i)| \geq |L(e^1_{h(i)}) - L(e^2_i)|.$$

But c, c' are size-capturing and $|e^1_{h(i)}| \geq |e^2_i|$. Therefore

$$L'(e^1_{h(i)}) - L'(e^2_i) \geq L(e^1_{h(i)}) - L(e^2_i).$$

So, summing over i,

$$L'(E_1) - L'(E_2) \geq L(E_1) - L(E_2).$$

But $\ell(m_1) + L(E_1) > \ell(m_2) + L(E_2) \Rightarrow L(E_1) - L(E_2) > \ell(m_2) - \ell(m_1)$. Therefore $L'(E_1) - L'(E_2) > \ell(m_2) - \ell(m_1)$, hence $\ell(m_1) + L'(E_1) > \ell(m_2) + L'(E_2)$.

This theorem, together with the previous discussion, shows that under the stated condition, if model 2 is better than model 1 according to a size-capturing coding c, then it remains better according to any other size-capturing coding c', whose code-length everywhere increases at least as quickly as that of c. The condition stipulating the existence of h can be expected to be satisfied in many cases of practical interest.

Example (7.4.2)

(N=2) $E_1 = (8,15)$, $E_2 = (9,1)$. h(1)=2, h(2)=1.

$e^2_1 = 9 < 15 = e^1_{h(1)}$, $e^2_2 = 1 < 8 = e^1_{h(2)}$.

So the theorem can be applied, e.g.: $L(n) = [\log_{10} |n|] + 1$, $L'(n) = |n|$ gives

$$L'(E_1) - L'(E_2) = 13 > 1 = L(E_1) - L(E_2).$$

Example (7.4.3)

(N=2) $E_1 = (1,15)$, $E_2 = (8,9)$.

In this case a suitable h does not exist, and in fact we have

$$L'(E_1) - L'(E_2) = -1 < 1 = L(E_1) - L(E_2),$$

so the proof of the theorem would break down.
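Examples (7.4.2) and (7.4.3) can be reproduced with a brute-force search for the bijection h. The Python sketch below is an illustration under stated assumptions: indices are 0-based (so the tuple (1, 0) corresponds to h(1)=2, h(2)=1), the search over all permutations is only practical for small N, and the helper names are hypothetical.

```python
from itertools import permutations


def find_bijection(E1, E2):
    """Return h (0-based tuple) with |E2[i]| <= |E1[h[i]]| for all i, or None."""
    for h in permutations(range(len(E1))):
        if all(abs(E2[i]) <= abs(E1[h[i]]) for i in range(len(E2))):
            return h
    return None


def L_log10(n):
    """Code-length [log10 |n|] + 1, i.e. the number of decimal digits; 0 for n = 0."""
    return 0 if n == 0 else len(str(abs(n)))


def L_abs(n):
    """Code-length |n|."""
    return abs(n)


def table_length(E, L):
    return sum(L(e) for e in E)


def differences(E1, E2):
    """(L'(E1) - L'(E2), L(E1) - L(E2)) with L' = |n| and L = [log10 |n|] + 1."""
    return (table_length(E1, L_abs) - table_length(E2, L_abs),
            table_length(E1, L_log10) - table_length(E2, L_log10))


print(find_bijection((8, 15), (9, 1)), differences((8, 15), (9, 1)))   # (1, 0) (13, 1)
print(find_bijection((1, 15), (8, 9)), differences((1, 15), (8, 9)))   # None (-1, 1)
```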

Corollary (7.4.4)

If $c_0$ is a smallest r-coding with code-length $L_0$, and a bijection $h : \{1,\ldots,N\} \to \{1,\ldots,N\}$ exists such that $|e^2_i| \leq |e^1_{h(i)}|$ $(i = 1,\ldots,N)$, and $\ell(m_1) + L_0(E_1) > \ell(m_2) + L_0(E_2)$, then for any $c \in C_Q$, $L = c \circ \ell$, such that

$$L_0(|i|+1) - L_0(i) = 1 \Rightarrow L(|i|+1) - L(i) \geq 1,$$

we have $\ell(m_1) + L(E_1) > \ell(m_2) + L(E_2)$.

Proof: From definitions (7.2.3) and (7.2.4),

$$0 \leq L_0(|i|+1) - L_0(i) \leq 1, \quad i \neq 0.$$

From definition (7.2.3),

$$L(|i|+1) - L(i) \geq 0.$$

These two facts, together with the condition on L imposed in the statement of the corollary, imply that

$$L(|i|+1) - L(i) \geq L_0(|i|+1) - L_0(i).$$

Consequently, for any i, j, $|L(i) - L(j)| \geq |L_0(i) - L_0(j)|$. The corollary follows from theorem (7.4.1).

Corollary (7.4.4) states that, if model 2 is better than model 1 when a smallest r-coding is used, then it remains better when any other coding $c \in C_Q$ is used, providing only that L increases at those points at which $L_0$ increases, and that a suitable h can be found. The condition on L does exclude some codings which may conceivably be of interest, such as those for which $L(n) = \max(k, L_0(n))$.

7.5 Summary

A class of codings has been introduced which endows table look-ups of models with the desirable characteristic that their size increases as the magnitudes of their entries increase. A smallest coding exists in this class, for a given size of alphabet. This smallest coding appears to be a natural one to use for model assessment. An advantage of using it is that in many cases the result of model assessment will agree with the result that would have been obtained with most other codings. The smallest coding is reasonably closely approximated by the conventional r-ary codings of the integers (cf. Bobrow and Arbib (52)).

8. DISCUSSION AND CONCLUSION

8.1 Model Assessment

The method of assessing rival models which we have proposed has a number of drawbacks. Does it nevertheless have practical value? We believe that it does.

The most obvious drawback is that model assessment depends on the choice of programming language. As has been mentioned before, this choice is taken as a specification of a priori assumptions. Consequently, this method of model assessment is no worse than others in this respect - every method of modelling relies on some a priori assumptions holding. Admittedly, our method of specifying a priori assumptions is somewhat indirect, and the type of assumption which can be specified is rather restricted. For example, it is not possible to specify statistical a priori knowledge. On the other hand, choosing a programming language is one of the forms in which a priori knowledge is actually specified in many simulation studies. In fact, it has become common in discussions of simulation to associate the choice of language with the modeller's "world view" (e.g. Mihram (53)). Furthermore, choices of language can be made which correspond to tighter or looser specifications of a priori knowledge.

A second drawback is that, although a natural way of

coding table look-up elements has been demonstrated, such a coding exists for any positive integer r ≥ 2. Which r should be used? Fortunately, there is a smallest possible r (namely 2), and the logarithmic nature of the smallest code-length implies that the relative sizes of models will not change much if r ranges over the commonly used values (say r ≤ 16). However, the change may be significant; therefore the only meaningful procedure appears to be to try two or three low values of r, including r = 2, and see how much this influences the model assessment.

As an example, let us investigate the assessment of the Algol gas-furnace models using binary-coded table look-ups rather than decimal ones. We shall consider only models I, IV and VI. If the elements of the table look-ups of these three models, shown in Appendix C, are each multiplied by 10 to make them integral, and the conventional binary coding applied to the resulting integers (preceded by "+" or "-"), then the assessment shown in table V is obtained. We have assumed that the fixed parts of the models are unchanged, and that the compiler translates the table look-ups back to standard Algol representation.
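The coding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`to_base`, `code_entry`, `table_size`) and the sample entries are hypothetical, and counting one unit per character (sign included) is a simplification that need not coincide with the size measure used for table V.

```python
def to_base(n: int, r: int) -> str:
    """Conventional r-ary numeral of a non-negative integer n."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if not 2 <= r <= len(digits):
        raise ValueError("r must lie between 2 and 36")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, d = divmod(n, r)
        out.append(digits[d])
    return "".join(reversed(out))


def code_entry(x: float, r: int = 2) -> str:
    """Multiply a table entry by 10, then write it as a signed r-ary numeral."""
    n = round(x * 10)              # entries are assumed to have one decimal place
    sign = "-" if n < 0 else "+"
    return sign + to_base(abs(n), r)


def table_size(entries, r: int = 2) -> int:
    """Total number of characters needed to write all entries (assumed size measure)."""
    return sum(len(code_entry(x, r)) for x in entries)


# Hypothetical entries, not taken from Appendix C.
sample = [0.5, -1.2, 13.0]
for r in (2, 10, 16):
    print(r, [code_entry(x, r) for x in sample], table_size(sample, r))
```

Running the loop for two or three low values of r, as the text recommends, shows how the absolute sizes change with the alphabet while the ranking of tables by size tends to be preserved.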

Model   Size, excluding   Size of          Total   Information   Information
        table look-up     table look-up    size    gain          explained

I             52              3184         3236         0            0
IV           129              1153         1282      1954         60.4%
VI           225              1059         1284      1952         60.3%

Table V - Assessment of Algol Gas-Furnace Models I, IV and VI, Using Binary-Coded Table Look-Ups.
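The arithmetic behind table V can be checked with a short Python sketch. The relationships used here are inferred from the tabulated numbers rather than restated from the text: total size is taken as the sum of the two size columns, information gain as the total size of model I minus the total size of the model, and information explained as that gain expressed as a percentage of the total size of model I. The names `MODELS` and `assess` are hypothetical.

```python
# (size excluding table look-up, size of table look-up), copied from table V
MODELS = {
    "I": (52, 3184),
    "IV": (129, 1153),
    "VI": (225, 1059),
}


def assess(models, reference="I"):
    """Return {name: (total size, information gain, percentage explained)}.

    Assumption: the reference model (model I) has zero gain and every other
    model is measured against its total size.
    """
    totals = {name: fixed + table for name, (fixed, table) in models.items()}
    ref = totals[reference]
    return {
        name: (total, ref - total, round(100.0 * (ref - total) / ref, 1))
        for name, total in totals.items()
    }


for name, (total, gain, explained) in assess(MODELS).items():
    print(f"{name:>3}  total={total:5d}  gain={gain:5d}  explained={explained}%")
```

Under these assumptions the sketch reproduces the last three columns of the table, including the near tie between models IV and VI.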


The

coding

used

since we have L (n) = sequence,

since we

From is used

performances

VI.

are i n t e r e s t e d

of m o d e l s

c o u l d have

of m o d e l

good, than

in the

concluded

sense

that

for the other.

answer model

than m o d e l

no firm s u p p o r t primarily choose

The

two chief are

only n e c e s s a r y

because

However,

and

remarked

in this

that

to

there

is

interested

then one

could

information

used.

in C h a p t e r

the m o d e l s

assessment non-

being

similar.

It is

the m o d e l

i, this

is no g r e a t

of s i m u l a t i o n .

is n e c e s s a r y

the m o d e l s

model

to c o m p l e x

of o b t a i n i n g

of the p o s s i b i l i t y

thesis

trying

since

a higher

(2) that

a qualification

for one

elaborate

of the p r o p o s e d

a means

several

if one w e r e

procedure

to be s t r u c t u r a l l y

to have

- as was

restriction

assumed

models,

do not need

behaviour

"no",

(i) that it can be a p p l i e d

dynamical

compared

support

a more

F i n a ll y ,

advantages

to m o d e l

changed,

if one w e r e

VI.

was

case we have

are e q u a l l y

answer

2-coding

the

IV and VI

could

had

2-coding

Firstly,

using

decision

sizes.

reached.

is no m o r e

the m o d e l w h i c h

a smallest

procedures linear

one

in a c l e a r - c u t

gain w h e n

that m o d e l s

for m o d e l

as b e t t e r

perforance

"Is it w o r t h

IV?"

between

IV is s u p e r i o r

have b e e n

there

model

a smallest

In this

Alternatively,

the q u e s t i o n

is of no con-

no d i f f e r e n c e

that m o d e l

alternativeconclusionscould one

when

IV and VI.

the o r d e r i n g

this

2-coding,

in r e l a t i v e

that,

is v i r t u a l l y

firm i n d i c a t i o n

Had

a smallest

~ o g 21nl] +2, b u t

table V it is seen

, there

a rather

is not q u i t e

here.

of i n t e r e s t

W e have are

essentially deterministic causal processes, with stochastic effects entering in such a manner that they can be considered to be manifestations of the modeller's imperfect understanding of the system. The models which appear in control theory are of this type. But in operational research it is common practice to use models which are essentially stochastic in nature, such as the conventional models of queues. The characterisation of Chapter 3, and consequently the method of assessment based on it, is not immediately applicable to such models. (In some cases it may be possible to extend the characterisation to such models by requiring the models to compute not the observed behaviour itself, but some observed property of that behaviour - for example, a histogram).

Perhaps the best known type of complex model which can be assessed by our method is the "System Dynamics" type of model introduced by Forrester (6). Forrester does not in fact subject his models to tests against real data. He argues that complex industrial and socio-economic systems contain such strong cross-couplings and feedbacks that statistical validation is virtually impossible. Consequently he insists that the modeller should be able to justify all the detail of the model, and that elements of the model should correspond to "real elements" in the system. In other words, only those elements are permitted about which the modeller's a priori knowledge is so strong that he has no doubts about their inclusion. Furthermore, if the model behaves in an


unsatisfactory m u s t be

manner,

found w h i c h that

then

"causes

for the d i s c r e p a n c y

can be e x p l a i n e d

and d e f e n d e d

other

than

their

inclusion

corrects

since

"a s u f f i c i e n t l y

elaborate

formal

can be d e v i s e d data"

to fit a r b i t r a r i l y

restrictive. competing

to s t r i c t l y ,

curve-fitting

closely

structures

he s h o u l d

should

include

procedure

any e n s e m b l e

structure,

then F o r r e s t e r

of d e c i d i n g

or of c h o o s i n g The Dynamics"

of

which

includes

doubt,

modeller

will

is,

systems. uncertain

him w i t h

gain p r o v i d e s method

the m o d e l l e r

of course. structures

of m a k i n g

such

requires

some

He can take

a model

about which

he has

gain.

It does

b a s e d his m o d e l

at this

stage

on m a n y

other observations

knowledge of s i m i l a r

one of the c o m p e t i n g

and see w h e t h e r Obviously,

the

on the p a r t i c u l a r

ultimately,

incorporate

no

not m a t t e r

but on his a p r i o r i

or not.

any

a "System

to him,

structures,

is i m p r o v e d

one or o t h e r

available

He can now

he

structures.

its i n f o r m a t i o n

not have

hand,

in fact be included,

is - it can even be negative;

observations - that

not p r o v i d e

an o b j e c t i v e

those

of two

in some p a r t of the model,

it s h o u l d two

about which

and i n c l u d e s

use of it,

data,

only

and assess

this

with

To m a k e

observed

the

does

are v e r y

If, on the o t h e r

of i n f o r m a t i o n

modeller

decisions. relevant

between

concept

appear

guidelines,

whether

injunctions

is u n s u r e

neither.

the above

means

these

If the m o d e l l e r

disregards

what

system behaviour"

(6). If a d h e r e d

then

on g r o u n d s

the

information

he can also e s t a b l i s h

gain which


of these

structures

Note

that

Forrester's should

proviso,

that

about

ensures

adding

gain

the s y s t e m

information

gain.

consistency

with

new features

be a " c u r v e - f i t t i n g "

of i n f o r m a t i o n

explained

the g r e a t e r

this p r o c e d u r e

not m e r e l y

increase

gives

exercise.

indicates

behaviour

to a m o d e l An

that m o r e has been

than has b e e n

added

to

the model. The d e s i r a b i l i t y

of v a l i d a t i n g

set of o b s e r v a t i o n s

other

it is now g e n e r a l l y

recognised.

more

strongly:

to be m e a n i n g l e s s .

used

for the v a l i d a t i o n

been

stated

several

can be a c h i e v e d ment

Why,

then,

this

of a m o d e l

by u s i n g

the

for c o n s t r u c t i n g

is c o r r e c t

already,

can be said

if the

set of data.

is of c o u r s e

it is

criterion

since,

an a r b i t r a r y

as has

goodness-of-fit

N o w the

a procedure

set of o b s e r v a t i o n s

same

assess-

for m o d e l

is assumed.

is it not m e a n i n g l e s s ?

simpler

is that but

it.

as m e a s u r i n g ,

information

trades

the model,

to c o n s t r u c t

the

gain

y e t only one

goodness-of-fit,

set u s e d

for c o n s t r u c t i n g

is g o o d n e s s - o f - f i t ,

for any finite

The r e a s o n

The

This

times

of i n f o r m a t i o n

validation,

Indeed,

as the one u s e d

held

by the use of a

than the one used

the v a l i d a t i o n

set of o b s e r v a t i o n s

a model

Thus

the

for c o n s t r u c t i n g

"proportion"

An i n f o r m a t i o n observations

which

gain

have

fewer

the

been

to b u i l d

are r e q u i r e d

of the o b s e r v a t i o n

and c o m p a r i n g validates

indicates,

just assess

can be r e g a r d e d

"proportion"

successfully

used

gain

not

complexity.

observations

the model,

of zero

does

it off a g a i n s t

information

in a sense,

gain

this w i t h

the model.

roughly,

that all

the model,

and none

the


remain

for v a l i d a t i o n .

Suppose

that

to c o n s t r u c t on a s e c o n d behaviour clear

a model,

the

that

converse

constructed

that its

gain

system which

it is is,

of the m o d e l

is a c o n c a t e n a t i o n

It m a y be c o n j e c t u r e d

true:

sets,

Then

set of o b s e r v a t i o n s

that

then it is p o s s i b l e

using

such

that

one of the sets,

if a m o d e l

has

a high

to d i v i d e

the o b s e r v a t i o n

the m o d e l

can be

and s u c c e s s f u l l y

validated

the other. Of course,

he s u c c e e d s

the m o d e l l e r

in v a l i d a t i n g

if his m o d e l

modelling

the

(for example, acquire

To certain method

does

because then

have

a high

he w o u l d

features of m o d e l

which

means

assessment

be a u s e f u l

can

indicate

to him

The

features

which

can,

guide

information

have

to w a i t

arguments of m o d e l

appear if used

of his

arbitrary

safer

if

set of data, gain w h e n

is not p o s s i b l e too

long to

show that

information

validation. that

in spite

arbitrary,

insofar

efforts

are no more

of

the p r o p o s e d

intelligently

to the m o d e l l e r ,

the success remain

on a second

it is b e l i e v e d

at first

feel

But if this

the above

then,

inevitably

his m o d e l

an a l t e r n a t i v e

summarise,

care,

will

first set alone.

new data)

gain g i v e s

with

second

of the

set into two d i s j o i n t

even

the

is also

gain,

is then v a l i d a t e d

in the s e n s e

sets of o b s e r v a t i o n s ) .

information

using

this m o d e l

be the i n f o r m a t i o n

as a m o d e l

the

and that

is u s e d

fit to those o b s e r v a t i o n s .

larger

the g r e a t e r w i l l

of the two

set of o b s e r v a t i o n s

set of o b s e r v a t i o n s ,

is a good

that

(viewed

a certain

and as it

to date. prominent

than the arbitrary features of any model assessment procedure. The availability of such a guide significantly extends the range of techniques available to the modeller, for an important and large class of dynamical models.

8.2 Model Building

The characterisation of modelling which was presented in Chapters 3 and 4 is of course an idealisation and a simplification of the modelling process. Nevertheless, it exhibits some important characteristics of modelling. It differs from other attempted formalisations of modelling (e.g. (53), (54)) in that it questions some long-cherished beliefs: namely, that a "true model" necessarily exists of the system to be modelled, and that there is some inherent virtue in model complexity. These beliefs are a legacy from the modelling of engineering systems, and do not seem appropriate for socio-economic and other poorly-understood systems.

Most discussions of modelling assume at the outset that some "true model" of the system under investigation exists, and that the aim of the modelling exercise is to find a model which in some sense approximates the true one. We make no such assumption. Consider an asymptotic system 𝒮 = (S^1, S^2, ...) (cf. Chapter 4). Let x(n) be the Gödel number of the best model of the system S^n (judged according to the criterion of Chapter 3). Then it is possible that the sequence (x(1), x(2), ...) is random. In this case it does not seem meaningful to speak of the "true model" of 𝒮. However, an asymptotic model of 𝒮 can always be found, since


the s e q u e n c e

of trivial

When modelling detailed well

models

My m a k i n g

association

them more

in t h e s e

circumstances.

a model

performance

the o p p o s i t e are

using

with

useful

model

set.

Of course,

them directly

though

contradicting

usually

by large

pay

not very

conditions,

prominent,

a "bigger

For e x a m p l e ,

they m a k e

relationships hundred

are

the m e r i t s

of M e s a r o v i c

that e n o u g h justify

such

performance

data

ultimately

may

that

its

and

derived

not be able

of such o b s e r v a t i o n s .

the size which

complex

to

of u s e f u l m o d e l s have

models

b e e n made,

prove

useful

they m a y not be d i r e c t l y

sets.

Other

ideas; finds

treatments

however, the

of m o d e l l i n g

these

author

notions

slipping

are

back

into

of mind. and P e s t e l

as a p o i n t

statement:

(55)

in its

favour.

statement

and P e s t e l ' s

model.

it seems

should

causes

about

iOO,0OO to a few

doubts

It seems

the

to the

the v i e w p o i n t

grave

"world

even more

be so g o o d

about

From

refer

At one p o i n t

as c o m p a r e d

world models".

this

a large model;

repeatedly

"In our m o d e l

in the computer,

c o u l d be c o l l e c t e d

of the m o d e l

compare

of a c r e d i b l e

it is a m o d e l

though

in o t h e r w e l l - k n o w n

of our c h a r a c t e r i s a t i o n ,

but must

is i t s e l f

even

Mesarovic

stored

cannot

that very

frame

the f o l l o w i n g

if one

fact

to these

size of their m o d e l

Furthermore,

of o b s e r v a t i o n s

and one o f t e n

is better"

being

alone,

generally,

observation

lip-service

things

the m o d e l l e r

amount

the

other

by the size of the o b s e r v a t i o n

- in a sense,

total

an

In our c h a r a c t e r i -

the size

knowledge

it can be s t a t e d q u i t e

is l i m i t e d by the

large

limited

a priori even

then

better

and g o o d m o d e l s

is taken:

a priori knowledge

the

are

Thus

models

small models.

is in e f f e c t

certain

view

observations,

from o b s e r v a t i o n s ,

supported

detailed.

formed

good models

under

and more

systems to o b t a i n

detailed

equal,

without

to seek

model.

in w h i c h

within

complex,

precisely

Thus

relationships

it is r e a s o n a b l e

sation

exhibit

situations,

between

is n a t u r a l l y

build

is an a s y m p t o t i c

engineering

cause/effect

understood,

models

about

unlikely

system"

unlikely

as to j u s t i f y

to

that the the r e j e c t i o n


of a r e a s o n a b l y smaller.

(Although

made with the

same

care,

model

effective

since

the m o s t

procedure

can exist.

This

standard

guaranteed

of the n a i v e specific

important,

implies system

have

times

to be

sets w o u l d

expertise

achieve

not be

to compare

theories. principle

application

analysis"

it m a y

should

on w h a t

with

the

the

systems

as e c o n o m i c s theory,

of m o d e l l i n g

dispose

can r e p l a c e

concerned

of e c o n o m i c

can not be

though

This

such

of a s y s t e m

technique even

no

depend

and not on

and simulation.

Inference.

is c o n c e r n e d

with

the i n f e r e n c e

from o b s e r v a t i o n s .

briefly

of theories

It is t h e r e f o r e

the c h a r a c t e r i s a t i o n some p h i l o s o p h i c a l

interesting

of m o d e l l i n g studies

developed

of general

inference.

Both philosophers regarded

model,

in a field

trappings

thesis w i t h

scientific

in g e n e r a l

the m e c h a n i c a l

The b o u n d s

on the p r o g r e s s

contentious,

the best m o d e l

"systems

primarily

and h y p o t h e s e s

not

an "answer".

can

Modelling

that

a useful

modelling

Scientific

is that

of the d i s c i p l i n e

studied.

the t e c h n o l o g i c a l

albeit

identification

to p r o d u c e v i e w that

system being

in this

would

the o b s e r v a t i o n

for f i n d i n g

to p r o d u c e

be g u a r a n t e e d

8.3

is a t h o u s a n d

the c o m p a r i s o n s

of our c h a r a c t e r i s a t i o n

of some

which

for the two models).

Perhaps result

adequate

simplicity

of science

as a d e s i r a b l e

One of the b e s t k n o w n of O c k h a m ' s

Razor,

which

and s c i e n t i s t s characteristic expressions states

that

have

long

of s c i e n t i f i c

of this only

is the

those


entities

s h o u l d be i n t r o d u c e d

absolutely Philosophy" "Rule

necessary. (56) I:

Newton's

stem We

from

are

"Rules

than such their

Therefore

III:

and w h i c h

are

the

IV:

of our e m p h a s i s IV justify degree

to all b o d i e s

of our e x p e r i m e n t s ,

has b e e n made

qualities

philosophy inferred

m a d e more

accurate,

some rules

on high

we

of all

are

by w h i c h

such

I and II are

an i n f o r m a l

which

may be held

on the (57).

idea

that

However,

be

to exceptions".

with

gain

as o t h e r

they m a y e i t h e r

or liable

gain,

hypotheses

time

similarities

information

induction

or very n e a r l y

any c o n t r a r y till

to look

by g e n e r a l

as a c c u r a t e l y

occur,

by Bunge

are to

whatsoever.

notwithstanding

attack

admit neither

to b e l o n g

the use of i n f o r m a t i o n

A detailed

causes.

found

phenomena

of c o n f i d e n c e

the same

we m u s t ,

of degrees,

that m a y be imagined,

can be discerned:

and s u f f i c i e n t

effects

which

the u n i v e r s a l

from p h e n o m e n a

rules

true

remission

upon p r o p o s i t i o n s

In these

of n a t u r a l

nor

In e x p e r i m e n t a l

true,

assign

of bodies,

reach

be e s t e e m e d bodies

causes

to the same n a t u r a l

The q u a l i t i e s

within

in

appearances.

intensification

Rule

of R e a s o n i n g

as are b o t h

as far as p o s s i b l e , Rule

are

this p r i n c i p l e :

to e x p l a i n II:

a theory w h i c h

to admit no m o r e

things

Rule

into

our characterisation

while

counterpart

rules

as an i n d i c a t o r

III and of the

in a model. simplicity

is d e s i r a b l e

our c h a r a c t e r i s a t i o n

of modelling escapes the brunt of this attack. Bunge's main concern is that regarding simplicity as the sole criterion of scientific quality is too naive. Proper emphasis should also be placed on accuracy and depth: "The motto of science is not just Pauca but rather Plurima ex paucissimis - the most out of the least. In short, we wish economy and not merely parsimony."

Information gain does not just measure the size of a program. It measures the size of a program, given that the program is a model of the system. As was mentioned earlier, this leads to a trade-off between model complexity and approximation. Thus our characterisation accords with Bunge's motto. It should be added, however, that when modelling poorly understood systems, the plurima is limited by the nature of the available observation set; if this is small then useful models will necessarily have to be small also.

Bunge's second major criticism of a general exhortation of simplicity is that several different aspects of simplicity can be discerned, and that these are often mutually incompatible. An indiscriminating call for simplicity is therefore nonsensical. We easily evade this criticism, since we have been quite specific about how simplicity should be measured, and therefore about what type of simplicity we regard as important.

Our theory concerning models is in some respects a "realisation" of Popper's abstract theory of scientific method (38). The first point of agreement is the hypothetico-deductive nature of the theory. Theorem (3.4.4) shows that


the h y p o t h e s e s obtained

in some

algorithm" to us.

from w h i c h

our m o d e l s

routine

can exist.

Forcing

manner How

the m o d e l

it p o s s i b l e

to test

of the e x t e n t observations.

the theory

gain

not e x c l u d e

which

introduced

prima

facie

the e m p i r i c a l the o r i g i n a l is g r e a t e r Popper's

to save

auxiliary

content

the

that

whereas

an a l l e g e d

true

has

counterpart

If a m o d e l w h i c h possible cannot

model

prove

possible a smaller Our

has b e e n

found

of the s y s t e m

it.

model,

On

the o t h e r

then we

A

the so that of -

(58).

law w h i c h law w h i c h

is true is not

in our theory. (smallest)

investigation,

hand,

then we

if it is not the b e s t

can d e m o n s t r a t e

this by e x h i b i t i n g

model. assertion

to P o p p e r ' s

belief

that good m o d e l s that good

are small

theories

content,

all

hypothesis

system"

is the best

under

... by

auxiliary

a scientific

its

not e v e n

system - consisting

can n e v e r be v e r i f i e d , can be falsified,

from b e i n g

hypotheses,

that of the o r i g i n a l

contention

can be v i e w e d

hypotheses ....

auxiliary

theory plus

by the

model:

m a y be e v a d e d

of the

a measure

the e m p i r i c a l

all i m m u n i s a t i o n s ,

ad hoc

This m a k e s

the m o d e l

of the o v e r a l l

of t e s t a b l e

than

look-up

then m e a s u r e s

falsification

introduction

gives

is f a l s i f i e d

a table

of c o r r o b o r a t i o n " ,

"We m u s t

the o b s e r v a t i o n s

look-up

introduced

Information

is i r r e l e v a n t

- the size of the c o r r e c t i o n s

Alternatively,

falsified.

modelling

of the method.

by a table

as an ad hoc h y p o t h e s i s

or "degree

part

the t h e o r y

to w h i c h

are o b t a i n e d

to c o m p u t e

to the d e d u c t i v e

c a n n o t be

- no " u n i v e r s a l

they

corresponds

w h i c h m u s t be g e n e r a t e d

are b u i l t

corresponds

are simple.

Popper

associates "simplicity" with "paucity of parameters", whereas we have a rather more general concept of simplicity - paucity of terms in the model. It is because of this correspondence that we suggest that information gain can be identified with Popper's "degree of corroboration".

Kuhn (39) has suggested that science does not in fact use a uniform method, but that it can be divided into two distinct phases. In the usual phase - "normal science" - fundamental assumptions are not questioned, and routine work of a "problem-solving" character is pursued. However, from time to time a "scientific revolution" occurs - sufficiently many anomalies and shortcomings of the established "Weltanschauung" accumulate to force a revision of basic assumptions. It is tempting to associate a change of the programming language used for writing models with such a "scientific revolution". For, as has been argued in chapter 4, a change of programming language implies a change in a priori beliefs about the system being investigated. Such a change leads to a change in the ordering of models, when they are ordered in accordance with their information gains. The suggestion that a change in the ordering of models corresponds to a "scientific revolution" has previously been made by Gaines ((14) and cf. sec. 2.4.2).

Formal developments of logical probability and of scientific induction by Carnap and his school (36),(59) always assume a particular formal language, in which all scientific statements could in principle be made. This is taken to be a "logical basis". In our characterisation of modelling a corresponding assumption is made, namely the assumption of a particular programming language. A major criticism of the assumption of a formalised language has always been that it is evident that scientific statements are never expressed in such languages, and that it is not certain whether a formal language capable of expressing interesting scientific statements can exist. This contrasts with the status of the programming languages which we have to assume. Obviously, such languages are capable of expressing statements which are interesting to the modeller, and in many cases it is practical to do so. Furthermore, it has been shown in chapter 5 that such languages can be defined quite precisely.

A tenuous similarity can be pointed out between our characterisation and Carnap's "Continuum of Inductive Methods" (60). Carnap introduces a "confirmation function", which is to be interpreted as the logical probability that a particular event will occur. The value of this function depends on a term which can be interpreted as an a priori logical factor, and on an a posteriori empirical factor, which is a relative frequency. The relative weighting of these two factors is governed by a positive real parameter λ, which thus indexes the "continuum of inductive methods". The value of the parameter λ is usually taken to be chosen subjectively, and depends on how regular or "lawlike" the inductivist believes his "universe of discourse" to be.
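The weighting of the two factors can be illustrated with the form in which Carnap's continuum is usually quoted, c = (s_i + λ/k) / (s + λ). The text does not state this formula, so it is given here as an assumption: s is the number of observations, s_i the number of them showing the property in question, and k the number of mutually exclusive properties (so that 1/k plays the part of the a priori logical factor and s_i/s that of the relative frequency). The function name `confirmation` is hypothetical.

```python
def confirmation(s_i: int, s: int, k: int, lam: float) -> float:
    """Confirmation value in the lambda-continuum (assumed standard form).

    s_i : number of observed instances with the property
    s   : total number of observations
    k   : number of mutually exclusive properties in the family
    lam : positive real weighting parameter
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k < 1 or not 0 <= s_i <= s:
        raise ValueError("need k >= 1 and 0 <= s_i <= s")
    return (s_i + lam / k) / (s + lam)


# Three of ten observations show the property; two properties in the family.
for lam in (0.001, 2.0, 1000.0):
    print(lam, round(confirmation(3, 10, 2, lam), 4))
```

Small values of λ let the relative frequency dominate, while large values keep the result close to the a priori factor 1/k, which is one way of reading the remark that the choice of λ reflects how "lawlike" the universe of discourse is believed to be.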

Somewhat analogously, we can consider our characterisation of modelling to be an "enumeration of inductive methods", which is indexed by a Gödel number of the programming language which has been chosen. The choice of this index is also subjective, since it reflects a priori beliefs. However, we are not proposing that our method leads to a logical probability. Although it is easy to obtain a [0,1]-valued


function to make

from i n f o r m a t i o n it b e h a v e

like

is that no e f f e c t i v e exist. need

effectively

w a y of n o r m a l i s i n g

the f u n c t i o n

would

over

out

they

as follows:

ravens

are black".

suppose This

that

Any

of i n d u c t i o n a ~pothesis

supports This

the m o d e l arise,

logical

algorithm a very

by o b s e r v a t i o n s

notion from the

"all to the

on " c o n f i r m i n g

seems

to lead to the c o n c l u s i o n , thing w h i c h

hypothesis

that

instances"

is also not "all ravens

that a are

absurd. clearly

does

behaviour, However,

a model

depend

on " c o n f i r m i n g

a hypothesised

the i n f o r m a t i o n

Hempel's

paradox

can be v i e w e d

no m o d e l w i l l

one - s u c h

not

corresponds

(we can w r i t e

an

Thus we have

type of e n t i t y

classical

does

as the s t a t e m e n t

but not x:~y;).

of w h a t

regularity

gain Of

exist w h i c h

of that h y p o t h e s i s

of the form x:=y;

different

that

paradox

is b a s e d

in g e n e r a l

negation

Hempel's

is e q u i v a l e n t

b e c a u s e every time

although

as "benchmark"

which

in the s y s t e m

because

consideration

are n o n - r a v e n s " .

is p a t e n t l y

is increased.

of

We include

it is h y p o t h e s i s e d

hypothesis

the o r i g i n a l

of a h y p o t h e s i s , to the

is not

of i n d u c t i o n

inference.

of a n o n - b l a c k

in a sense,

is r e p e a t e d

and w o u l d

things

Our characterisation instances"

paradoxes

and Goodman.

"all n o n - b l a c k

every observation

black".

which

tend to be r e g a r d e d

hypothesis

raven

famous

of i n d u c t i v e

arises

supporting

to be additive,

difficulty can

that our c h a r a c t e r i s a t i o n

of H e m p e l

of a c c o u n t s

theory

have

a set of m o d e l s

the two

those

of these b e c a u s e tests

The e s s e n t i a l

we p o i n t

escapes

(61), n a m e l y

not s e e m p o s s i b l e

computable).

Finally, modelling

it does

a probability.

(The f u n c t i o n

to sum to u n i £ y

gain,

can be "supported"

entities

cannot

be merely logical statements, but must be algorithms.

The second paradox is Goodman's "grue" paradox, which also arises from consideration of "confirming instances". Every time a green emerald is observed, the hypothesis that "emeralds are green" is confirmed, and the hypothesis that "emeralds are blue" is falsified. However, the observation also confirms the hypothesis that "emeralds are grue" - namely, green until 1980 and blue thereafter, and falsifies the hypothesis that "emeralds are bleen" - blue until 1980 and green thereafter. Thus the observation would appear to tell us nothing at all about the appearance of emeralds in the future.

This paradox is evaded, rather than solved, by our characterisation. We assume that a particular programming language has been chosen. The definition of this language can be regarded as the definition of what basic predicates are to be used in our scientific statements. If basic predicates like "blue" and "green" are defined, then predicates like "grue" and "bleen" can be constructed from these - it is helpful to think of them as being defined by procedures. Now a theory like "emeralds are green" can be expressed as a model by using just the terminal characters of the language. However, a model corresponding to "emeralds are grue" would need to include the declaration of the procedure "grue"; consequently, its information gain would be lower, and this model would be rejected in favour of the first one. Of course, this makes no contribution to the philosophical problem of why the language chosen should define "blue" and "green" rather than "grue" and "bleen".
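The size argument above can be illustrated with a minimal sketch. It assumes Python as the chosen language, uses the character count of a model's source text as a crude stand-in for program size, and takes information gain to be the trivial model's size minus the model's size. The names `GREEN_MODEL`, `GRUE_MODEL`, `run` and `information_gain`, and the trivial size of 200, are illustrative assumptions and do not come from the text.

```python
# Two hypothetical models of emerald colour, written as Python source text.
# "green" and "blue" are basic predicates (string constants of the language);
# "grue" has to be declared as a procedure before it can be used.
GREEN_MODEL = 'colour = "green"'
GRUE_MODEL = (
    'def grue(year):\n'
    '    return "green" if year < 1980 else "blue"\n'
    'colour = grue(year)'
)


def run(model_source, year):
    """Execute a model and return the colour it computes for the given year."""
    env = {"year": year}
    exec(model_source, env)
    return env["colour"]


def information_gain(model_source, trivial_size):
    """Assumed proxy: size of the trivial model minus size of this model."""
    return trivial_size - len(model_source)


if __name__ == "__main__":
    trivial_size = 200  # assumed size of the trivial model, for illustration only
    # Both models agree with every observation made before 1980 ...
    print(run(GREEN_MODEL, 1975), run(GRUE_MODEL, 1975))
    # ... but the "grue" model pays for the procedure declaration.
    print(information_gain(GREEN_MODEL, trivial_size),
          information_gain(GRUE_MODEL, trivial_size))
```

Under these assumptions both models reproduce the pre-1980 observations, and the "green" model has the larger gain only because the language supplies "green" as a primitive, which is exactly the dependence on the chosen language noted above.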


8.4

Systems Science

It has long been perceived that control theory and, more generally, systems theory are principally concerned with the acquisition, transfer and use of information, rather than of energy. Models are often said to convey information. However, "information" in this sense is usually used in an intuitive, presystematic way. Information gain has been introduced in an attempt to formalise this idea of information conveyed by a model.

The established theories of information do not appear adequate for this purpose. Use of the statistical theory of information transmission would have restricted us to the consideration only of random processes which could be described statistically, and the ideas of "cause" and "effect" could have found no place in such a framework. On the other hand, Carnap and Bar-Hillel's theory of semantic information (59), (62) would have involved the use of uncomputable "logical probabilities", and would in any case not be capable of practical application to modelling.

The algorithmic information theory of Kolmogorov (19) almost provides what is required, since it defines information not in terms of probabilities, but in terms of algorithms - and models are clearly algorithms. Consequently, information gain has been developed from the ideas of algorithmic information theory. However, it must be emphasised that information gain is not the same entity as Kolmogorov's "quantity of information". The latter is an uncomputable quantity, whereas it has been deliberately ensured that information gain is computable. Furthermore, Kolmogorov's "quantity of information" is a function defined on pairs of signals, whereas information gain is a function defined on models (assuming the "signals" to be given).

How can the assertion that information gain measures the performance of a model be justified? We have employed four arguments in its favour. The first is argument by association: algorithmic information is a plausible formalisation of "information"; information gain is intimately related to algorithmic information; therefore information gain is a plausible measure of "information". The second is argument by weight of opinion: many of the workers who have attacked similar problems have been drawn to similar conclusions (e.g. Wrinch and Jeffreys, Solomonoff, etc. cf. chapter 2). The third argument we have used is the argument of consistency with what we would expect: there is no universal method for finding the best model; for large observation sets models are chosen on the basis of goodness of fit; for large observation sets a priori beliefs do not matter; for an important class of processes, the same model is chosen as best, as would be chosen by established theory. The fourth argument is that of "operationalism": the information gain of models which are of practical interest can be calculated. This has been demonstrated by an example.

A stronger type of argument than any of these would be the use of the concept of information gain to obtain new results in systems theory, or to explain known phenomena of systems modelling. Rissanen has already provided one result of this kind ((41) and cf. sec. 2.5). In effect, Rissanen has shown that if the information gain of the model of a Gauss-Markov process is maximised, then the resulting identification scheme is an extension of standard maximum-likelihood techniques.

A second possible application of information gain is to the explanation of what Young has termed the "Law of Large Systems" (63). This is the conjecture, based on experience, that complex, poorly understood systems can often be adequately represented by very simple models. One can see immediately how the explanation of this in terms of information gain would run. The possible information gain of a model of such a system is limited by the size of the available information sets. Suppose that models (understood informally) belonging to a certain class are fitted to these observation sets in order of increasing complexity. Let t be the size of the trivial model, and I_n be the information gain of the nth model so fitted when suitably formalised. Then in passing from the nth to the (n+1)th model, the greatest possible improvement in information gain is t - I_n. If t is small, it seems likely that diminishing returns would quickly set in, and no improvement in information gain would be possible after the first few models had been fitted. It would be interesting to establish the conditions under which this does indeed happen.
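The bound t - I_n can be checked in one line if information gain is taken to be the amount by which a model is smaller than the trivial model. The symbol $s_n$, for the size of the nth model, is introduced here only for illustration and is not the author's notation.

```latex
\begin{align*}
I_n &= t - s_n, \qquad s_n > 0, \\
I_{n+1} - I_n &= (t - s_{n+1}) - (t - s_n) = s_n - s_{n+1} < s_n = t - I_n .
\end{align*}
```

Under this reading, once an early model has a gain close to t, the headroom t - I_n left for every later model is small, which is the diminishing-returns effect described above.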


8.5

Conclusion

A characterisation of modelling has been introduced which is believed to incorporate certain salient features of the modelling of complex, poorly understood systems, such as those which are encountered in environmental, socio-economic and management studies, and in certain industrial control studies. This characterisation consists of three parts: a system, a model, and a criterion of model quality. A system is considered to be defined by a pair of observation sets - an input set and an output set. Each observation set is assumed to be a finite array of rational numbers.

A model is an algorithm which computes the output observation set. It may use certain elements of the observation sets to help it in this task, as specified by the modeller. The model is not allowed to approximate the system behaviour, but must compute it exactly. The trivial model is a model which represents the situation at the beginning of the modelling exercise, before any structure has been discerned in the system behaviour.

The criterion of model quality is the model's information gain. This is the amount by which the model is smaller than the trivial model, when both are written as a computer program. In conventional terms, this criterion leads to a trade-off between model complexity and model accuracy. This prevents "overfitting" of the model to the observations, and puts a premium on finding the greatest amount of regularity in the system behaviour.
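The three parts of the characterisation can be sketched in a few lines. The sketch assumes Python as the programming language, represents each model as source text that must assign the output list `y` from the input list `u`, and uses character count as a rough proxy for program size. The function names and the example system are assumptions made for illustration only.

```python
def trivial_model(outputs):
    """Source text of a model that is nothing but a look-up table of the outputs."""
    return "y = " + repr(list(outputs))


def is_exact(model_source, inputs, outputs):
    """A model must reproduce the output observation set exactly, not approximately."""
    env = {"u": list(inputs)}
    exec(model_source, env)
    return env["y"] == list(outputs)


def information_gain(model_source, outputs):
    """Amount by which the model is smaller than the trivial model (assumed proxy)."""
    return len(trivial_model(outputs)) - len(model_source)


if __name__ == "__main__":
    # A hypothetical system defined by a pair of observation sets.
    u = list(range(1, 21))
    y = [3 * x + 1 for x in u]
    model = "y = [3 * x + 1 for x in u]"
    print(is_exact(model, u, y), information_gain(model, y))
    print(is_exact(trivial_model(y), u, y), information_gain(trivial_model(y), y))
```

In this toy setting a model that captures the regularity is shorter than the table and so has positive gain, while the trivial model has zero gain by construction.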

The ranking depends

of m o d e l s

on the p r o g r a m m i n g

models.

However,

this

it can be a s s o c i a t e d about also

the

system.

affects

arbitrary, shown

according

A detailed "a p r i o r i

and

this

investigation

information

by a s s o c i a t i n g

required

to w r i t e of the

meaningfully consequence

if such

The w o r k Firstly,it logically

sound,

a formal

Secondly,

it g i v e s

the p r o g r e s s has

been

a model

that,

indicates

other

is not

coding

has been

one to use. the n o t i o n

of

language However,

models

an

can be

it is of no g r e a t

serves

analysis

a plausible

efforts. being

which

is

formalisation

by a model".

a simple

rather

two purposes.

of m o d e l l i n g

supplied

things

vacuousness

that

thesis

the m o d e l l e r

of his m o d e l l i n g

beliefs

are not m e t exactly.

gives

of the c o n c e p t of " i n f o r m a t i o n

because

can be p r e c i s e l y

the s m a l l e s t

indicated

in this

and w h i c h

that

under which

conditions

reported

provides

has

shown

as a program.

conditions

compared

smallest

by a model"

it w i t h

the m o d e l

this

is a n a t u r a l

has

assumed

defined,

examination

coding

the

the o b s e r v a t i o n s

Again,

a distinguished

for w r i t i n g

a priori

of coding

ranking.

criterion

arbitrary,

the m o d e l l e r ' s

The m a n n e r

because

chosen

is not e n t i r e l y

with

the m o d e l

to exist,

language

to this

technique Our

for

guiding

assessing

idea

equal,complexity

than

sophistication.

in

REFERENCES

1. Weber, R.L, "A Random Walk in Science", The Institute of Physics, (1973), p.92.
2. Astrom, K.J. and Eykhoff, P, "System Identification - A Survey", Automatica, 7, (1971), 123-162.
3. Akaike, H, "Autoregressive Model Fitting for Control", Annals of the Institute of Statistical Maths, 23, (1971), 163-180.
4. Chan, C-W, Harris, C.J, and Wellstead, P, "An Order-Testing Criterion for Mixed Autoregressive Moving Average Processes", Int. J. Control, 20, (1974), 817-834.
5. Akaike, H, "A New Look at the Statistical Model Identification", IEEE Trans. Auto. Control, AC-19, (1974), 716-723.
6. Forrester, J.W, "Industrial Dynamics", M.I.T. Press and Wiley, (1961).
7. Von Neumann, J, and Morgenstern, O, "The Theory of Games and Economic Behaviour", Princeton, (1944).
8. Fuller, A.T., "Analysis of Nonlinear Stochastic Systems by Means of the Fokker-Planck Equation", Int. J. Control, 9, (1969), 603-655.
9. Rogers, H, "Theory of Recursive Functions and Effective Computability", McGraw-Hill, (1967).
10. Box, G.E.P. and Jenkins, G.M, "Time Series Analysis, Forecasting and Control", Holden-Day, (1970).
11. Kalman, R.E, Falb, P.L, and Arbib, M.A, "Topics in Mathematical Systems Theory", McGraw-Hill, (1969).
12. Windeknecht, T.G, "General Dynamical Processes", Academic Press, (1971).
13. Zadeh, L.A, and Desoer, C.A, "Linear System Theory - The State Space Approach", McGraw-Hill, (1963).
14. Gaines, B.R, "System Identification, Approximation and Complexity", Int. J. Gen. Systems, 3, (1977), 145-174.
15. Blum, M, "On the Size of Machines", Info & Contr., 11, (1967), 257-265.
16. Blum, M, "A Machine-Independent Theory of the Complexity of Recursive Functions", J. ACM, 14, (1967), 322-336.
17. Löfgren, L, "Complexity of Descriptions of Systems", Research Report IV1 7601, Dept. of Automata and General Systems Sciences, Lund Institute of Technology, (January 1976).
18. Hartmanis, J, and Hopcroft, J.E, "An Overview of the Theory of Computational Complexity", J. ACM, 18, (1971), 444-475.
19. Kolmogorov, A.N, "Three Approaches to the Quantitative Definition of Information", Problems of Information Transmission, 1, No. 1, (1965), 1-7.
20. Kolmogorov, A.N, "Logical Basis for Information Theory and Probability Theory", IEEE Trans. Info. Theory, IT-14, (1968), 662-664.
21. Zvonkin, A.K., and Levin, L.A., "The Complexity of Finite Objects and the Development of the Concepts of Information and Randomness by Means of the Theory of Algorithms", Russian Mathematical Surveys, 25, no. 6, (1970), 83-124.
22. Solomonoff, R.J., "A Formal Theory of Inductive Inference", Information and Control, 7, (1964), 1-22, and 224-254.
23. Chaitin, G.J., "On the Length of Programs for Computing Finite Binary Sequences", J. ACM, 13, (1966), 547-569.
24. Martin-Löf, P., "The Definition of Random Sequences", Information and Control, 9, (1966), 602-619.
25. Church, A, "On the Concept of a Random Sequence", Bull. Amer. Math. Soc., 46, (1940), 130-135.
26. Gillies, D.A., "An Objective Theory of Probability", Methuen, (1973).
27. Schnorr, C.P, "Optimal Enumerations and Optimal Gödel Numberings", Math. Syst. Th., 8, (1975), 182-191.
28. Meyer, A.R, "Program Size in Restricted Programming Languages", Info & Contr., 21, (1972), 382-394.
29. Biermann, A, & Feldman, J.A, "A Survey of Grammatical Inference" in: Watanabe, S, (ed), "Frontiers of Pattern Recognition", Academic Press, (1972), 31-54.
30. Fu, K.S, & Booth, T.L, "Grammatical Inference: Introduction and Survey - Part I", IEEE Trans. Syst., Man & Cybernetics, SMC-5, (1975), 95-111.
31. Gold, M, "Language Identification in the Limit", Info & Contr., 10, (1967), 447-474.
32. Chomsky, N, "On Certain Formal Properties of Grammars", Info & Contr., 2, (1959), 137-167.
33. Fu, K.S, & Booth, T.L, "Grammatical Inference: Introduction and Survey - Part II", IEEE Trans. Syst., Man & Cybernetics, SMC-5, (1975), 409-423.
34. Feldman, J, "Some Decidability Results on Grammatical Inference and Complexity", Info & Contr., 20, (1972), 244-262.
35. Blum, L, and Blum, M, "Toward a Mathematical Theory of Inductive Inference", Info & Contr., 28, (1975), 125-155.
36. Carnap, R, "Logical Foundations of Probability", University of Chicago Press, (1950).
37. Wrinch, D, and Jeffreys, H, "On Certain Fundamental Principles of Scientific Inquiry", Philosophical Magazine, ser. 6, vol. 42, (1921), 369-390.
38. Popper, K.R., "The Logic of Scientific Discovery", Hutchinson, (1959).
39. Kuhn, T.S, "The Structure of Scientific Revolutions", University of Chicago Press, (1962).
40. Löfgren, L, "Relative Explanations of Systems" in Klir, G.J, "Trends in General Systems Theory", Wiley, (1972), 340-407.
41. Rissanen, J, "Parameter Estimation by Shortest Description of Data", Proceedings of JACC Conference, ASME, (1976), 593-597.
42. Rissanen, J, "Basis of Invariants and Canonical Forms for Linear Dynamic Systems", Automatica, 10, (1974), 175-182.
43. Chaitin, G.J, "A Theory of Program Size Formally Identical to Information Theory", J. ACM, 22, (1975), 329-340.
44. Eykhoff, P, "System Identification - Parameter and State Estimation", Wiley, (1974).
45. Box, G.E.P., and Jenkins, G.M, "Time Series Analysis, Forecasting and Control", Holden-Day, (1970).
46. Whittle, P, "Prediction and Regulation by Linear Least-Squares Methods", EUP, (1963).
47. Sherman, S, "Non-Mean-Square Error Criteria", IRE Trans. Info. Theory, IT-4, (1958), 125-126.
48. Johnston, J, "Econometric Methods", McGraw-Hill, (1963).
49. Ollongren, A, "Definition of Programming Languages by Interpreting Automata", Academic Press, (1974).
50. Challis, M.F, "Algol W Language Specification and Programmer's Guide", University of Cambridge Computing Service, 3rd edition (August 1975).
51. Deutsch, R, "Estimation Theory", Prentice-Hall, (1965).
52. Bobrow, L.S, and Arbib, M.A, "Discrete Mathematics: Applied Algebra for Computer and Information Science", W.B. Saunders, (1974).
53. Mihram, G.A, "Simulation: Statistical Foundations and Methodology", Academic Press, (1971).
54. Zeigler, B.P, "Theory of Modelling and Simulation", Wiley, (1976).
55. Mesarovic, M, and Pestel, E, "Mankind at the Turning Point", Hutchinson, (1975).
56. Cajori, F, "Sir Isaac Newton's Mathematical Principles of Natural Philosophy and his System of the World, vol. 2", University of California Press, (1962).
57. Bunge, M, "The Myth of Simplicity", Prentice-Hall, (1963).
58. Popper, K.R, "Unended Quest", Fontana/Collins, (1976).
59. Hintikka, J, and Suppes, P, "Information and Inference", Reidel, (1970).
60. Carnap, R, "The Continuum of Inductive Methods", University of Chicago Press, (1952).
61. Hesse, M, "The Structure of Scientific Inference", Macmillan, (1974).
62. Bar-Hillel, Y, "Language and Information", Addison-Wesley and The Jerusalem Academic Press, (1964).
63. Young, P.C, Shellswell, S.H, and Neethling, C.G., "A Recursive Approach to Time Series Analysis", Cambridge University Engineering Department, Technical Report CUED/B-Control/TR16, (1971).
64. De Bakker, J.W, "Semantics of Programming Languages" in: Tou, J.T. (ed), "Advances in Information Systems Science, vol. 2", Plenum Press, (1969), pp.173-227.
65. Lauer, P, "Formal Definition of Algol 60", IBM Laboratory, Vienna, Technical Report TR 25.088, (1968).
66. Zimmermann, K, "Outline of a Formal Definition of Fortran", IBM Laboratory, Vienna, Technical Report LR 25.3.053, (1969).
67. Neuhold, E.J, "The Formal Description of Programming Languages", IBM Systems Journal, 10, 2, (1971), pp.86-112.
68. Lucas, P., Lauer, P, Stigleitner, H, "Method and Notation for the Formal Definition of Programming Languages", IBM Laboratory, Vienna, Technical Report TR 25.087, (1968), (revised 1970).
69. Aho, A.V. & Ullman, J.D, "The Theory of Languages", Mathematical Systems Theory, 2, 2, pp.97-125, (1969).

APPENDIX A

Formal Semantics of Programming Languages.

A.1

Introduction

The formal definition of the semantics of a programming language is a notoriously difficult problem of computer science (64). Fortunately, a lot of progress has been made in recent years, so that several solutions now exist. In this appendix one of these solutions, the so-called Vienna Method, will be described. This method has been chosen because it is the best documented, and because it has been used for the definition of several practical programming languages (65), (66). There need therefore be no doubts about its power. A very complete and careful account of the Vienna Method is given by Ollongren (49), but more concise, and in some ways clearer, descriptions are available in (67) and (68). Only a brief introduction can be given here. This introduction will be illustrated by formally defining a special language, called "Linear Model Language", or LML. This language is chosen for exposition because it is very simple - too simple, in fact, to need most of the sophistication of the Vienna Method.

The Vienna Method involves the definition of a language in four stages. First, the concrete syntax of the language is defined. This specifies which strings of characters are valid programs in the language. The specification is invariably in Backus-Naur form; therefore the concrete syntax is a context-free grammar (69). The concrete syntax indicates how each string of characters which forms a program is to be parsed. On completion of parsing certain characters (such as semicolons and comment strings in Algol) become redundant, and can be discarded. The remaining entities which appear on the nodes of the parsing tree are now mapped into entities to which semantic roles will eventually be assigned. This mapping is specified by defining a translator; this is a function which maps the structured object which is a (parsed) concrete program into another structured object called an abstract program. The set of structured objects which are valid abstract programs of the language is called the abstract syntax of the language. The language definition is completed by defining an interpreting automaton. This is defined by specifying a set of structured states of the automaton, and a state transition function. It is the definition of this function which really determines how programs in the language are to be interpreted. A computation is viewed as a sequence of states of this automaton. The initial state is determined by the program and its data.

Tree-structured objects are all-pervasive in the Vienna Method. They are used to represent both the concrete and abstract syntax of a language, and to represent the states of the interpreting automaton. A special function, called the μ-function, is introduced to carry out "tree-surgery", and is used extensively.

Finally, it is assumed that a metalanguage is available which can be used for the definition of the above entities.

A.2

Linear Model Language

Suppose that we wish to investigate only single-input, single-output systems, whose input and output values are rational numbers, and whose input-output relation is a finite-order difference equation, together with an additive random disturbance of the output. Let $u_i$ be the ith input observation of such a system, $y_i$ its ith output observation, and $d_i$ the ith random disturbance. Let $a_j$, $b_j$ be coefficients of the difference equation, so that the following equation holds:

$$y_i = a_1 y_{i-1} + \dots + a_n y_{i-n} + b_0 u_i + \dots + b_m u_{i-m} + d_i .$$

We can (informally) define a programming language as follows. Every program of the language is a list of integers and rationals which is given the interpretation: $n, m, a_1, a_2, \dots, a_n, b_0, b_1, \dots, b_m, d_1, d_2, \dots, d_N$. The data for such a program is a similar list, with the interpretation $i, y_{i-1}, \dots, y_{i-n}, u_i, \dots, u_{i-m}$. Given such a program and such a set of data, the computation of $y_i$ in accordance with the above equation is invoked. If input observations $(u_{-m}, \dots, u_N)$ and output observations $(y_{-n}, \dots, y_N)$ of a system are obtained, then a certain (infinite) set of programs in this language will constitute models of the system defined by the observations $((u_1, \dots, u_N), (y_1, \dots, y_N))$. The terms $d_1, \dots, d_N$ which appear in the program are, in fact, a look-up table. Fig. 7 shows the structure of each such model.

The programming language described above will be called Linear Model Language, or LML. Note that LML is not a universal language, in the sense that not every algorithm can be implemented in it.
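The informal definition of LML can be turned into a small interpreter sketch. It assumes that a program is given as a comma-separated string terminated by a full stop (as in the example string 2,1,+.6,-3,0,-1.41,-5.2. quoted later in this appendix), and that the data are supplied as the index i, the past outputs and the recent inputs. The function names `parse_lml` and `run_lml` are hypothetical, and this is not the formal Vienna Method definition.

```python
from fractions import Fraction


def parse_lml(program):
    """Split an LML program string into (n, m, a, b, d).

    a = [a_1..a_n], b = [b_0..b_m], d = [d_1..d_N] (the look-up table).
    """
    text = program.strip()
    if text.endswith("."):
        text = text[:-1]  # drop the terminating full stop
    tokens = text.split(",")
    n, m = int(tokens[0]), int(tokens[1])
    numbers = [Fraction(tok) for tok in tokens[2:]]  # exact rational arithmetic
    a = numbers[:n]
    b = numbers[n:n + m + 1]
    d = numbers[n + m + 1:]
    return n, m, a, b, d


def run_lml(program, i, y_past, u_recent):
    """Compute y_i from the data i, [y_{i-1}..y_{i-n}], [u_i..u_{i-m}]."""
    n, m, a, b, d = parse_lml(program)
    autoregressive = sum(aj * yj for aj, yj in zip(a, y_past))
    input_terms = sum(bj * uj for bj, uj in zip(b, u_recent))
    return autoregressive + input_terms + d[i - 1]  # d is 1-indexed in the text


if __name__ == "__main__":
    # n=2, m=1, a=(+.6, -3), b=(0, -1.41), d_1=-5.2
    print(run_lml("2,1,+.6,-3,0,-1.41,-5.2.", 1, [1, 2], [10, 100]))
```

Exact rationals are used because the observation sets are assumed to be rational numbers and a model must reproduce the output exactly; floating-point arithmetic would blur that requirement.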

A trivial model of a system is obtained in LML if $m = n = b_0 = 0$.

A.3

Tree-Structured Objects and the μ-Function.

As mentioned earlier, tree-structured objects are much used in the Vienna Method. We assume here that such objects are familiar. The following example should clarify their nature, and will also introduce some terminology and notation. A typical tree-structured object (or simply "object", for short), is the following entity A:

A = [tree diagram: from the root, selector $s_1$ leads to $e_1$ and selector $s_2$ leads to a node from which $s_1$, $s_2$ and $s_3$ lead to $e_2$, $e_3$ and $B$ respectively]

$s_1, s_2, s_3$ are called simple selectors. The objects appearing at each node of the tree are themselves tree-structured, with the objects $e_1, e_2, e_3$, which appear at the terminal nodes of the tree, being regarded as degenerate trees. These objects are called elementary objects. The object $B$ is some tree-structured object which is not necessarily elementary.

Every object has a finite number of nodes, and each node, other than the unique root, is associated with a unique preceding node. A successor node $n_j$ of a preceding node $n_i$ is "selected" by a simple selector $s$, and this is denoted by $n_j = s(n_i)$. Strictly speaking, $n_i$ and $n_j$ here denote objects rather than nodes. Thus, in the example, we have $e_1 = s_1(A)$. The simple selectors associated with any node must be pairwise distinct. This makes it possible to select the object associated with any node by composing simple selectors. For example, in the above object we have $e_2 = s_1 \circ s_2(A)$, $e_3 = s_2 \circ s_2(A)$, $B = s_3 \circ s_2(A)$, where $\circ$ denotes composition of selectors. Entities of the form $s_i \circ \dots \circ s_j$ are called composite selectors. Note that reading a composite selector from left to right corresponds to "reading" an object from bottom to top.

The null object is associated with the empty tree, that is the tree with no nodes, and is denoted $\Omega$.

Let $\kappa$ denote a composite selector, and $e$ an elementary object. Then the characteristic set of an object $A$ is the set of all pairs $\langle \kappa : e \rangle$ such that $\kappa(A) = e$. An object can be defined by giving its characteristic set. For example, the above object $A$ is defined by:

$$A = \{\langle s_1 : e_1 \rangle, \langle s_1 \circ s_2 : e_2 \rangle, \langle s_2 \circ s_2 : e_3 \rangle, \dots\}.$$

The characteristic set of $B$ is not known in this case, so this definition cannot be completed. But suppose that $B$ were the object:

$$B = \{\langle s_1 : e_4 \rangle, \langle s_2 : e_5 \rangle\}$$

then we would have

$$A = \{\langle s_1 : e_1 \rangle, \langle s_1 \circ s_2 : e_2 \rangle, \langle s_2 \circ s_2 : e_3 \rangle, \langle s_1 \circ s_3 \circ s_2 : e_4 \rangle, \langle s_2 \circ s_3 \circ s_2 : e_5 \rangle\}.$$

We now introduce the μ-function, which is used to perform operations on objects. The μ-function takes two arguments, the first of which is an object $A$, and the second is a pair $\langle \kappa : B \rangle$, where $\kappa$ is a composite selector, and $B$ is an object. The range of $\mu$ is the set of all objects.

The value $\mu(A; \langle \kappa : B \rangle)$ is an object which is obtained from $A$ by replacing $\kappa(A)$ by $B$ in such a way that $\kappa(\mu(A; \langle \kappa : B \rangle)) = B$. This is most clearly shown by examples (taken from Lucas et al (68)):

Let A = [tree diagram]

Then
(i) $\mu(A; \langle s_3 : B \rangle)$ = [tree diagram]
(ii) $\mu(A; \langle s_1 \circ s_2 : B \rangle)$ = [tree diagram]
(iii) $\mu(A; \langle s_1 \circ s_1 \circ s_1 \circ s_2 : B \rangle)$ = [tree diagram]

In particular, if $B = \Omega$, we obtain:
(i) $\mu(A; \langle s_3 : \Omega \rangle) = A$
(ii) $\mu(A; \langle s_1 \circ s_2 : \Omega \rangle)$ = [tree diagram]
(iii) $\mu(A; \langle s_1 \circ s_1 \circ s_1 \circ s_2 : \Omega \rangle)$ = [tree diagram]

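Ollongren (49) and Lucas et al (68) define the μ-function more precisely than has been done here.

We now introduce the following notations:

(i) $\mu(A; \langle \kappa_1 : B_1 \rangle, \langle \kappa_2 : B_2 \rangle, \dots, \langle \kappa_n : B_n \rangle) \triangleq \mu(\mu(A; \langle \kappa_1 : B_1 \rangle); \langle \kappa_2 : B_2 \rangle, \dots, \langle \kappa_n : B_n \rangle)$, with $\mu(A;) \triangleq A$.

Example. Let $A = \{\langle s_1 : e_1 \rangle, \langle s_2 : e_2 \rangle\}$. Then

$$\mu(A; \langle s_3 : e_3 \rangle, \langle s_1 \circ s_2 : e_4 \rangle) = \{\langle s_1 : e_1 \rangle, \langle s_1 \circ s_2 : e_4 \rangle, \langle s_3 : e_3 \rangle\}.$$

Ollongren (49) gives conditions under which interchanging the arguments of the μ-function leaves the value unchanged.

(ii) $\mu_0(\langle \kappa_1 : B_1 \rangle, \dots, \langle \kappa_n : B_n \rangle) \triangleq \mu(\Omega; \langle \kappa_1 : B_1 \rangle, \dots, \langle \kappa_n : B_n \rangle)$

Thus $\mu_0$ is a function which "creates" objects.

Example. $\mu_0(\langle s_1 : e_1 \rangle, \langle s_1 \circ s_2 : B \rangle, \langle s_2 \circ s_2 : e_3 \rangle)$ = [tree diagram: from the root, selector $s_1$ leads to $e_1$ and selector $s_2$ leads to a node from which $s_1$ and $s_2$ lead to $B$ and $e_3$ respectively]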
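The selector and μ-function machinery can be sketched with nested dictionaries. The representation is an assumption made for illustration: an object is a `dict` from simple selectors to sub-objects, an elementary object is a string, the null object is `None`, and a composite selector is a tuple written in the same left-to-right order as in the text (so it is applied from right to left). The names `OMEGA`, `select`, `mu` and `mu0` are hypothetical, and the precise definitions in Ollongren (49) and Lucas et al (68) may differ in detail.

```python
OMEGA = None  # the null object (empty tree)


def select(obj, kappa):
    """Apply the composite selector kappa = (s_1, ..., s_k), i.e. s_1 o ... o s_k."""
    for s in reversed(kappa):  # the rightmost selector is applied first
        if not isinstance(obj, dict) or s not in obj:
            return OMEGA
        obj = obj[s]
    return obj


def _replace(obj, path, b):
    """Return a copy of obj in which the sub-object at root-to-leaf `path` is b."""
    if not path:
        return b
    children = dict(obj) if isinstance(obj, dict) else {}  # elementary objects are overwritten
    head, rest = path[0], path[1:]
    child = _replace(children.get(head, OMEGA), rest, b)
    if child is OMEGA:
        children.pop(head, None)  # attaching the null object deletes the branch
    else:
        children[head] = child
    return children if children else OMEGA


def mu(obj, *pairs):
    """mu(A; <k1:B1>, ..., <kn:Bn>), applying the pairs from left to right."""
    for kappa, b in pairs:
        obj = _replace(obj, tuple(reversed(kappa)), b)
    return obj


def mu0(*pairs):
    """mu0 creates an object by applying the pairs to the null object."""
    return mu(OMEGA, *pairs)


if __name__ == "__main__":
    A = {"s1": "e1", "s2": "e2"}
    print(mu(A, (("s3",), "e3"), (("s1", "s2"), "e4")))
    print(mu0((("s1",), "e1"), (("s1", "s2"), "B"), (("s2", "s2"), "e3")))
```

Each call returns a new object and leaves its argument untouched, which matches the functional reading of μ as a mapping from objects to objects.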

A.4

Concrete Syntax.

The concrete syntax of a programming language can be defined by using the Backus-Naur form of writing production rules. This is a shorthand method of defining a grammar.

Suppose that there exists a finite non-empty set $\Sigma$ of terminals. Typical elements of $\Sigma$ are: b, 2, *, begin, and so on. Let $\Sigma^*$ denote the set of all finite strings of elements of $\Sigma$. Also, suppose $N$ is a finite non-empty set of non-terminals such that $N \cap \Sigma = \emptyset$ and $N^*$ is the set of all finite strings of elements of $N$. Let $V = \Sigma \cup N$, and $V^* = (\Sigma \cup N)^*$. Let $V^+ = V^* - \{\Lambda\}$, where $\Lambda$ is the empty string. Then the set of production rules is the set

$$\Pi = \{(\alpha, \beta) : \alpha \in V^* \times N \times V^* \;\&\; \beta \in V^+\}.$$

Each pair $(\alpha, \beta) \in \Pi$ is written as $\alpha \to \beta$. In Backus-Naur form, the set of production rules $\{\alpha \to \beta_1, \alpha \to \beta_2, \dots, \alpha \to \beta_n\}$ is denoted by the single expression $\langle \alpha \rangle ::= \beta_1 | \beta_2 | \dots | \beta_n$. The brackets <> are used to denote non-terminals. Backus-Naur notation can be used to express production rules $(\alpha, \beta)$, providing that $\alpha \in N$. Such production rules are called context-free.

A grammar $G$ is a 4-tuple $G = (N, \Sigma, P, S)$, where $P$ is a finite non-empty subset of $\Pi$, and $S \in N$ is the start symbol. If each production of a grammar is context-free then the grammar is said to be context-free. A context-free grammar can be conveniently defined by a finite set of expressions in Backus-Naur form.

If there exist $\delta_1, \delta_2 \in V^*$ and $\alpha \to \beta \in P$ such that $\gamma_1 = \delta_1 \alpha \delta_2$ and $\gamma_2 = \delta_1 \beta \delta_2$ then $\gamma_1 \Rightarrow \gamma_2$. If $\gamma_i \in V^*$ and $\gamma_{i-1} \Rightarrow \gamma_i$ $(i = 1, 2, \dots, n)$, then $\gamma_0 \Rightarrow^* \gamma_n$ ($\gamma_n$ is derived from $\gamma_0$). The grammar $G$ is said to generate the language

$$L(G) = \{x : S \Rightarrow^* x \;\&\; x \in \Sigma^*\}.$$

Two grammars are equivalent if they generate the same language.

The Vienna Method defines two grammars for each language. One is the concrete syntax, the other the abstract syntax. Lucas et al (68) explain the distinction most clearly:

"An abstract syntax is one which only specifies the expressions of the language as to the structures significant for their subsequent interpretation and not as to how they are to be expressed for the purpose of communication either to oneself or to others. A concrete syntax specifies the expressions of the language as a set of character strings".

The c o n c r e t e syntax of LML is d e f i n e d as follows: <program>

::= ,

,

,

, .

< r a t i o n a l > : : = +J

-JO

::= J

::= J<

::: O J l J 2 J 3 J 4 J 5 J 6 J 7 J 8 J 9

The terminals of LML are:

0 1 2 3 4 5 6 7 8 9 . , + -

Positive rationals are required to be signed so that the terms in the table look-up will be coded in a size-capturing manner (cf. chapter 7). This requirement is extended to the other terms solely for simplicity of definition.

An example of a valid string in LML is:

2,1,+.6,-3,0,-1.41,-5.2.

This program can be parsed (using the syntax definition) to give the object (not shown in full):

[Diagram: tree of the parsed concrete program, with branches s_1, s_2, ... leading to the successive symbols of the string]

A.5 Abstract Syntax

We introduce the following notational conventions: if an object x satisfies a predicate P, we write is-P(x). The set of objects which satisfy P is denoted is-P^. That is, is-P^ = {x : is-P(x)}. The set is-P^ is defined by an expression of the form

is-P = (<s-P_1: is-P_1>, <s-P_2: is-P_2>, ..., <s-P_n: is-P_n>)

which indicates that for every x ∈ is-P^,

x = μ_0(<s-P_1: x_1>, <s-P_2: x_2>, ..., <s-P_n: x_n>),

where x_1 ∈ is-P_1^, x_2 ∈ is-P_2^, ..., x_n ∈ is-P_n^. If is-P = (<s-P_1: is-P_1>) then we write is-P = is-P_1. A predicate may also be defined by using the disjunction operator ∨, e.g.:

is-P = is-P_1 ∨ is-P_2,

which denotes that x ∈ is-P^ only if x ∈ is-P_1^ ∨ x ∈ is-P_2^. It is assumed that certain predicates are satisfied by subsets of the elementary objects.

Using this notation, the abstract syntax of LML is defined as follows:

is-program = (<s-n: is-integer>, <s-m: is-integer>, <s-rational-list: is-rational-list>)
is-rational-list = (<s-head: is-rational>, <s-tail: is-rational-list ∨ is-Ω>)

It is assumed that is-Ω^ = {Ω}, and that the predicates is-integer and is-rational are satisfied by (countably) infinite sets of elementary objects. Every LML "abstract program" satisfies the predicate is-program. For example, the abstract program corresponding to the concrete LML program introduced in section A.4 is the object

    s-n: 2
    s-m: 1
    s-rational-list:
        s-head: +.6
        s-tail:
            s-head: -3
            s-tail:
                s-head: 0
                s-tail:
                    s-head: -1.41
                    s-tail:
                        s-head: -5.2
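The predicates of the abstract syntax can be read as membership tests on tree-structured objects. The following sketch is only an illustration: it assumes that objects are represented as Python dictionaries keyed by selector names, that Ω is represented by None, and that rationals are Fraction values. The helper rational_list is hypothetical.

```python
from fractions import Fraction

OMEGA = None  # stands for the null object Ω

def is_integer(x):
    return isinstance(x, int) and not isinstance(x, bool)

def is_rational(x):
    return isinstance(x, Fraction)

def is_rational_list(x):
    """is-rational-list = (<s-head: is-rational>, <s-tail: is-rational-list ∨ is-Ω>)"""
    while True:
        if not (isinstance(x, dict) and set(x) == {"s-head", "s-tail"}):
            return False
        if not is_rational(x["s-head"]):
            return False
        if x["s-tail"] is OMEGA:
            return True
        x = x["s-tail"]

def is_program(x):
    """is-program = (<s-n: is-integer>, <s-m: is-integer>, <s-rational-list: ...>)"""
    return (isinstance(x, dict)
            and set(x) == {"s-n", "s-m", "s-rational-list"}
            and is_integer(x["s-n"]) and is_integer(x["s-m"])
            and is_rational_list(x["s-rational-list"]))

def rational_list(*values):
    """Hypothetical helper: build the nested s-head/s-tail object."""
    result = OMEGA
    for v in reversed(values):
        result = {"s-head": Fraction(v), "s-tail": result}
    return result

# The abstract program of the example 2,1,+.6,-3,0,-1.41,-5.2.
example = {"s-n": 2, "s-m": 1,
           "s-rational-list": rational_list("0.6", "-3", "0", "-1.41", "-5.2")}
```

Here is_program(example) checks the whole tree, descending through s-tail until Ω is reached.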


How

this

object

specified next

by

i9 o b t a i n e d

the

section.

from

translator, Note

which

is m e r e l y

object,

it is c h o s e n

Most the

discussions

abstract

defining

syntax

syntax, not be

If w e w e r e

to a d o p t

would

assumes not

an i n f i n i t e

to have

measure of

terminals

earlier which

must

(in o u r (i)

(2)

case,

of

view,

set

of

has

over

there

of t h e

size

3 is

are

allowed.

of the

string

over

"machines"

programs

is n o t

are:

a finite

an e f f e c t i v e

terminals

length

and the

a size measure

there

axiom

of p r o g r a m s ,

of axioms

size,

first

I t is e s s e n t i a l

pair

These

it

therefore

a useful

of any given

the

A.4).

since

discussed

at m o s t

any y, w h i c h

abstract

As

programs).

exists

as its realisations

(and is

the

that

languages.

above

a program.

introduced

exists

out

for our purposes,

in s e c t i o n

by

viewed

concrete

"terminals"

constitute~

value.

point

separate

to

for that

mnemonic

the

in t h e

any m e a n i n g

Method

then

is

the o b j e c t

label

can be

to b e

satisfactory

satisfied

for Clearly,

this

attach

alternative

in c h a p t e r

(15)

be

that

a maasure

which

Blum

a language

program

defined

for

arbitrary

the Vienna

as d e f i n e d

introduced

not

to h a v e

considered

not be

a "grar~mar"

for us

of

and

of it n e e d

syntax

of

an

be

"+.6"

(p) d o e s

that object.".6"

concrete

will

that writing

s-head o s-rational-list

although

the

number

procedure

of programs

for d e c i d i n g ,

are o f s i z e y.

satisfied

if i n f i n i t e

sets

Furthermore, we want programs to describe effective procedures for computing functions. Defining a language with infinitely many terminals would correspond to defining a Turing machine with infinitely many tape symbols. This would be a fundamental change in the notion of "computability".

To overcome these objections, it would be possible to define an abstract syntax for LML which specified a finite set of terminals. The object at each terminal node of an abstract program would then satisfy one of the predicates is-digit or is-sign, say, and these would be defined by

is-digit = is-0 ∨ is-1 ∨ ... ∨ is-9
is-sign = is-+ ∨ is--

and is-0^ = {0}, ..., is-9^ = {9}, is-+^ = {+}, is--^ = {-}. In this case the assembly of the digits into integers and rationals would have to be performed by the interpreting automaton, rather than by the translator.

A.6 The Translator

The translator is a function which maps the set of parsed concrete programs in a language into the set of abstract programs. To distinguish between concrete and abstract objects we introduce the conventions: is-<program>(x) means that x is a parsed concrete program, namely an object such as that shown in section A.4. More precisely, for LML we have, for some positive integer k:

is-<program> = (<s_1: is-<integer>>, <s_2: is-,>, ..., <s_{2k-1}: is-<rational>>, <s_{2k}: is-.>)

The predicates is-<integer>, is-, etc. are obtained from the concrete syntax in exactly the same way. Obviously, we have is-,^ = {,}, is-0^ = {0}, etc.

In the following definition, the statement if...then...else... is a statement in the metalanguage. It is assumed that is-<program>(p) and is-<rational>(x_i). The LML translator, trans-program, is defined as:

trans-program(p) = μ_0(<s-n: trans-integer(s_1(p))>,
                       <s-m: trans-integer(s_3(p))>,
                       <s-rational-list: makelist(s_5(p), s_7(p), ..., s_{2k-1}(p))>)

where

makelist(x_1, x_2, ..., x_n) = μ_0(<s-head: trans-rational(x_1)>,
                                   <s-tail: if x_2 = Ω & ... & x_n = Ω then Ω else makelist(x_2, ..., x_n)>)

and the functions

trans-rational: is-<rational>^ → is-rational^
trans-integer: is-<integer>^ → is-integer^

are not further defined. For our purposes these two functions are best thought of as the usual mappings onto the rational numbers. (In an actual implementation, it may be more useful to consider them as mappings into bit-patterns. In this case the sets is-rational^ and is-integer^ would be finite sets, due to the fixed word-length of practical computers).

Note that trans-program(p) ∈ is-program^, and makelist(x_1, ..., x_n) ∈ is-rational-list^.
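The translator can be sketched in a few lines. This is only an illustration under stated assumptions: the parsed concrete program is represented as a Python list [s_1, s_2, ..., s_2k] of symbol strings, objects are dictionaries keyed by selector names, and Ω is None. The function parse is a hypothetical stand-in for the parser, and the bodies given to trans_integer and trans_rational are one possible choice, since the text leaves these two functions undefined.

```python
from fractions import Fraction

OMEGA = None  # the null object Ω

def parse(text):
    """Hypothetical parser stand-in: return the symbols [s_1, ..., s_2k]."""
    if not text.endswith("."):
        raise ValueError("an LML program ends with a full stop")
    symbols = []
    for field in text[:-1].split(","):
        symbols += [field, ","]
    symbols[-1] = "."          # the last separator is the terminating full stop
    return symbols

def trans_integer(token):
    return int(token)

def trans_rational(token):
    """Map a signed decimal string such as '+.6' or '-1.41' to a rational."""
    sign = -1 if token.startswith("-") else 1
    whole, _, frac = token.lstrip("+-").partition(".")
    value = Fraction(int(whole or "0"))
    if frac:
        value += Fraction(int(frac), 10 ** len(frac))
    return sign * value

def makelist(items):
    """makelist(x_1, ..., x_n): head is x_1, tail is Ω when nothing is left."""
    head, rest = items[0], items[1:]
    return {"s-head": trans_rational(head),
            "s-tail": makelist(rest) if rest else OMEGA}

def trans_program(p):
    # s_1(p) is p[0], s_3(p) is p[2]; s_5, s_7, ..., s_{2k-1} are p[4::2]
    return {"s-n": trans_integer(p[0]),
            "s-m": trans_integer(p[2]),
            "s-rational-list": makelist(p[4::2])}

abstract = trans_program(parse("2,1,+.6,-3,0,-1.41,-5.2."))
```

The even-numbered selectors pick out only separators, which is why the sketch, like the definition, uses the odd-numbered ones alone.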

A.7 The Interpreting Automaton

Following Ollongren (49), we define an interpreting automaton to be a 5-tuple (O, is-state, ξ_0, Λ, F), where O is the set of tree-structured objects already introduced, and is-state is a predicate over O. Objects satisfying is-state are states of the automaton. ξ_0 ∈ is-state^ is the initial state of the automaton, and F is a set of final states. Λ is the state transition function; however, its range is not is-state^, but the power set of is-state^. Λ(ξ) is thus a set of states in general, although in our definition of LML, Λ(ξ) will always be a single state.

A.7.1 The State

The state of the interpreting automaton is structured. The structure depends on the language to be defined, and for the definition of LML can be rather simple. A language with block structure, variable identifiers of various types, procedures, conditional and goto statements, and so on, will need a rather more complicated set of states.

For the LML interpreting automaton, or LML machine, we define

is-state = (<s-c: is-c>, <s-dn: is-dn>, <s-counter: is-integer>).

is-dn is a predicate satisfied by a denotation directory, and is defined by

is-dn = (<s-data: is-data>, <s-y: is-rational>, <s-parno: is-integer ∨ is-Ω>)

where

is-data = (<s-i: is-integer>, <s-list: is-rational-list>).

The data for a program, namely the sequence i, y_{i-1}, y_{i-2}, ..., appears in the initial state as the object

    s-i: i
    s-list:
        s-head: y_{i-1}
        s-tail:
            s-head: y_{i-2}
            ...

We do not specify how this is achieved. Similarly, the result of the computation, y_i, is the object s-y∘s-dn(ξ_F), where ξ_F ∈ F, and we do not specify how it is output. The number m+n, which is required for the correct interpretation of the program, is stored in s-parno∘s-dn(ξ). (The term "denotation directory" is taken over from (49) and (68). For LML this directory is simpler than in (49) and (68), but it serves essentially the same purpose, namely storing intermediate and final results).

The most complex part of the state is the control, which is an object satisfying the predicate

is-c = (<s-in: is-in>, <s-al: is-obj-list>, <s-ri: is-dum ∨ is-Ω>, <r: is-c ∨ is-Ω>)

where the following abbreviations have been used: c: control, in: instruction, al: argument list, obj: object, ri: return information, dum: dummy. In this definition, is-in^ is a subset of the elementary objects, called the set of instructions, and is-dum^ is a subset of the elementary objects called the set of dummy names. r is a simple selector, different from s-in, s-al or s-ri. is-obj-list is a predicate satisfied by lists of objects, which we do not define further; Ollongren (49) gives an extensive discussion of lists.

An example of a control is the object:

    s-in: in_1
    s-al: a
    r:
        s-in: in_2
        s-al: x
        s-ri: a

This particular control may have the following effect: the instruction in_2 is performed, with x as its argument. The result of carrying out in_2 is assigned to the dummy name a. in_2 is then deleted, so that the control part of the next state is

    s-in: in_1
    s-al: in_2(x)

in_1 is now carried out, with in_2(x) as its argument. In this case, in_2 is said to be contracting. On the other hand, it may be that carrying out in_2 requires first carrying out some other instruction in_4 on x, and then an instruction in_3 on in_4(x). In this case in_2 is said to be expanding, and carrying it out results in the next state having the control:

    s-in: in_1
    s-al: a
    r:
        s-in: in_3
        s-al: b
        s-ri: a
        r:
            s-in: in_4
            s-al: x
            s-ri: b

If both in_4 and in_3 are contracting, the controls of the next two states will be:

    s-in: in_1
    s-al: a
    r:
        s-in: in_3
        s-al: in_4(x)
        s-ri: a

and

    s-in: in_1
    s-al: in_3(in_4(x)) = in_2(x)

If an instruction is expanding, then performing it leaves all components of the state unchanged except the control itself. However, if it is contracting, then its effect may be to change any of the components of the state (in our case, s-counter(ξ) and s-dn(ξ), as well as s-c(ξ)).

We need some definitions for later use.

The set of control selectors of a control C is the set of composite selectors

R(C) = {κ : κ = r∘r∘...∘r & κ(C) ≠ Ω}, if C ≠ Ω,

and is I (the identity selector) if C = Ω. The terminal control selector of a control C is the composite selector

T(C) = {τ : τ ∈ R(C) & r∘τ ∉ R(C)}.

If κ = r^n is a control selector of a non-empty control, where r^n = r∘r∘...∘r (n compositions), and n ≥ 1, then the mth preceding control selector of κ is

prec_m(κ) = {κ' : r^m∘κ' = κ}   (0 ≤ m ≤ n).

If κ is a control selector of a non-empty control C, then

prec-arg(C,κ) = {s-al∘prec_i(κ) : i ≥ 1 & s-al∘prec_i(κ)(C) = s-ri∘κ(C) ≠ Ω}

is the set of composite selectors which select those arguments of instructions associated with preceding control selectors of κ which are equal to the dummy name associated with κ. If κ is a control selector of a non-empty control C then the derived return information associated with κ is the set ri(C,κ) = prec-arg(C,κ). (This definition is included because these two sets differ in (49), but coincide for the relatively simple LML machine).

The initial state of the LML machine is

ξ_0 = μ_0(<s-c: μ_0(<s-in: int-prog>, <s-al: p>)>, <s-counter: 1>, <s-dn: μ_0(<s-data: d>, <s-y: 0>)>).

Here, p is the LML program, which satisfies the predicate is-program introduced in section A.5. The object d satisfies the predicate is-data defined earlier in this section, and int-prog is an instruction which will be defined later. The set of final states of the LML machine is the set

F = {ξ : is-state(ξ) & s-c(ξ) = Ω}.

A sequence ξ_0, ξ_1, ..., where ξ_{i+1} ∈ Λ(ξ_i), is a computation of the LML machine. If, for some i, ξ_i ∈ F then the computation terminates. (Every LML computation terminates).

A.7.2 The State Transition Function

With every instruction in ∈ is-in^ is associated an interpreting function φ_in. Let C be a non-empty control of a state ξ, and κ a control selector of C. Let s-in∘κ(C) = in, and let ARG = s-al∘κ(C) be the list of arguments of in. Then

φ_in(ARG,ξ,κ) = if P_1(ARG,ξ) then g_1
                else if P_2(ARG,ξ) then g_2
                ...
                else if P_m(ARG,ξ) then g_m,

where P_1, P_2, ..., P_m are predicates (m ≥ 1), and g_j has one of two forms:

(i) For the case of contracting control,

g_j = μ(μ(μ(ξ; <κ∘s-c: Ω>); {<τ∘s-c: ε_0^j(ARG)> : τ ∈ ri(C,κ)}); <s-counter: is-integer>, <s-dn: ε^j(ARG)>)

where ε_0^j and ε^j are objects. In this expression the innermost μ deletes the instruction in, its argument list and its return information, the middle μ returns the object ε_0^j(ARG) to preceding control selectors, and the outermost μ alters components of the state other than the control.

(ii) For the case of expanding control,

g_j = μ(ξ; <κ∘s-c: μ(ε^j(ARG); <s-ri: s-ri∘κ∘s-c(ξ)>)>),

where ε^j(ARG) satisfies the predicate is-c. In this case the inner μ associates the return information of κ(C) with the new control ε^j(ARG), and the outer μ replaces the control κ(C) with the new object thus created.

The Vienna Method uses a system of instruction schemata to define interpreting functions rather more concisely and in a more readable fashion than the above expressions. However, we shall not describe this feature, since it is feasible to define LML in the above manner.

It is now possible to define the state transition function:

Λ(ξ) = {η : η = φ_in(ARG,ξ,κ) & κ = T(s-c(ξ)) & in = s-in∘κ∘s-c(ξ) & ARG = s-al∘κ∘s-c(ξ)}.

From this definition it is apparent that the state transition is determined by always carrying out the instruction associated with the terminal control of the state, namely the instruction occurring at the "deepest" level of the control (cf. examples in section A.7.1). In general (although not for LML), T(s-c(ξ)) will be a set containing more than one control selector. Hence our earlier remark that Λ(ξ) will in general be a set of states, rather than a single state. In such a case, it does not matter which of the terminal instructions is performed first.
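The handling of contracting instructions can be illustrated by a small sketch. It rests on several assumptions that are not part of the definition: a control is a Python dictionary with the keys s-in, s-al, s-ri and r, Ω is None, an argument list is a tuple, the value returned by an instruction replaces its dummy name wherever that name occurs in a preceding argument list, and only the control component of the state is modelled. The names terminal_depth, contract and run, and the three toy instructions, are hypothetical.

```python
OMEGA = None  # the null object Ω

def terminal_depth(C):
    """Return n such that r^n is the terminal control selector of a non-empty C."""
    n = 0
    while C["r"] is not OMEGA:
        C, n = C["r"], n + 1
    return n

def contract(C, functions):
    """Perform the terminal instruction, assumed to be contracting.

    Returns the control of the next state and the value just computed.
    """
    n = terminal_depth(C)
    node = C
    for _ in range(n):
        node = node["r"]
    value = functions[node["s-in"]](*node["s-al"])
    dummy = node["s-ri"]

    def rebuild(c, depth):
        if depth == n:
            return OMEGA                      # the terminal node is deleted
        args = tuple(value if (dummy is not OMEGA and a == dummy) else a
                     for a in c["s-al"])      # pass the value back
        return {"s-in": c["s-in"], "s-al": args,
                "s-ri": c["s-ri"], "r": rebuild(c["r"], depth + 1)}

    return rebuild(C, 0), value

def run(C, functions):
    """Contract repeatedly until the control is Ω; return the last value."""
    value = OMEGA
    while C is not OMEGA:
        C, value = contract(C, functions)
    return value

# The expanded control of section A.7.1: in1 waits for a, in3 for b, in4 acts on x = 5.
functions = {"in1": lambda v: -v, "in3": lambda v: 2 * v, "in4": lambda v: v + 1}
control = {"s-in": "in1", "s-al": ("a",), "s-ri": OMEGA,
           "r": {"s-in": "in3", "s-al": ("b",), "s-ri": "a",
                 "r": {"s-in": "in4", "s-al": (5,), "s-ri": "b", "r": OMEGA}}}
```

Each call of contract corresponds to one state transition in which the deepest instruction is carried out and removed.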

It is the specification of the interpreting functions of an interpreting automaton which assigns meaning to an abstract program.

A.7.3 Interpreting Functions for LML

We now complete the definition of LML by defining a set of interpreting functions for it. The instructions to be defined are as follows:

Instruction      Type          Domain
int-prog         expanding     is-program^
int-mn           expanding     (is-integer^)²
set-mn           contracting   is-integer^
int-prog-list    expanding     is-rational-list^
updatey          contracting   is-rational^
product          contracting   (is-rational^)²
sum              contracting   (is-rational^)²

We assume that the binary arithmetic operators + and * are available. The remarks at the end of section A.6 apply to these.

(1) int-prog

φ_int-prog(p,ξ_0,I) = μ(ξ_0; <s-c: ε(p)>)

where ε(p) =

    s-in: int-prog-list
    s-al: s-rational-list(p)
    r:
        s-in: int-mn
        s-al: (s-m(p), s-n(p))

(2) int-mn

φ_int-mn((x,y),ξ,κ) = μ(ξ; <κ∘s-c: ε(x,y)>)

where ε(x,y) =

    s-in: set-mn
    s-al: v
    r:
        s-in: sum
        s-al: (x,y)
        s-ri: v

(3) set-mn

φ_set-mn(x,ξ,κ) = μ(μ(ξ; <κ∘s-c: Ω>); <s-dn: μ(s-dn(ξ); <s-parno: x>)>)

Note: set-mn puts the value m+n into s-parno∘s-dn(ξ).

(4) int-prog-list

φ_int-prog-list(x,ξ,κ) = if s-counter(ξ) < s-parno∘s-dn(ξ)+2
                         then μ(ξ; <κ∘s-c: ε_1(x)>)
                         else μ(ξ; <κ∘s-c: ε_2(x)>)

where ε_1(x) =

    s-in: int-prog-list
    s-al: s-tail(x)
    r:
        s-in: updatey
        s-al: v
        r:
            s-in: sum
            s-al: (u, s-y∘s-dn(ξ))
            s-ri: v
            r:
                s-in: product
                s-al: (s-head(x), s-head∘s-list∘s-data∘s-dn(ξ))
                s-ri: u

and ε_2(x) =

    s-in: updatey
    s-al: v
    r:
        s-in: sum
        s-al: (s-y∘s-dn(ξ), s-head∘(s-tail)^i(x))
        s-ri: v

where i = s-i∘s-data∘s-dn(ξ).

(5) updatey

φ_updatey(x,ξ,κ) = μ(μ(ξ; <κ∘s-c: Ω>); <s-dn: μ(s-dn(ξ); <s-y: x>, <s-list∘s-data: s-tail∘s-list∘s-data∘s-dn(ξ)>)>, <s-counter: s-counter(ξ)+1>)

Note: updatey puts an intermediate value into s-y∘s-dn(ξ), brings the next data item to the top of s-list∘s-data∘s-dn(ξ), and increases s-counter(ξ) by 1.

(6) product

φ_product((x,y),ξ,κ) = μ(μ(ξ; <κ∘s-c: Ω>); <τ∘s-c: x*y>)

where τ ∈ ri(s-c(ξ),κ).

(7) sum

φ_sum((x,y),ξ,κ) = μ(μ(ξ; <κ∘s-c: Ω>); <τ∘s-c: x+y>)

where τ ∈ ri(s-c(ξ),κ).
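Taken together, the seven instructions appear to accumulate a sum of products and then add one entry of the look-up table. The sketch below is one direct reading of definitions (1) to (7), not the definition itself: it replaces the control mechanism by an ordinary loop, represents lists as Python lists, and takes s-head∘(s-tail)^i literally as a zero-based index into the part of the rational list that remains after the loop. The function name run_lml is hypothetical.

```python
from fractions import Fraction

def run_lml(n, m, rationals, i, data):
    """Evaluate an abstract LML program directly.

    n, m       -- the integers s-n and s-m of the program
    rationals  -- the rational list of the program, as a Python list
    i, data    -- the data of the initial state (s-i and s-list)
    Returns the final values of s-y and s-counter.
    """
    parno = m + n                     # set-mn stores m+n in s-parno
    counter, y = 1, Fraction(0)       # initial state: s-counter = 1, s-y = 0
    x, data = list(rationals), list(data)
    while counter < parno + 2:        # first branch of int-prog-list
        u = x[0] * data[0]            # product
        y = u + y                     # sum; updatey stores the result in s-y
        x, data = x[1:], data[1:]     # s-tail of the program list and the data list
        counter += 1                  # updatey increases s-counter
    y = y + x[i]                      # second branch: s-head∘(s-tail)^i of the rest
    counter += 1                      # the final updatey
    return y, counter

# Hypothetical program with n = 1, m = 0: two coefficients, then three table entries.
result = run_lml(1, 0,
                 [Fraction(1, 2), Fraction(2), Fraction(10), Fraction(20), Fraction(30)],
                 1, [Fraction(4), Fraction(3)])
```

Under this reading the loop is executed m+n+1 times, so the final value of the counter is m+n+3.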

In order to clarify the above definitions, some of the steps in an LML computation are shown below. To save space, only the control and those parts of the state which have just changed are shown.

[Diagrams of successive states; the recoverable content is:
ξ_0: s-counter is 1, s-y is 0, s-data holds i and the list y_{i-1}, y_{i-2}, ...; the control is int-prog with the abstract program (s-n: n, s-m: m, s-rational-list: a_1, ...) as argument.
ξ_1: the control is int-prog-list (argument: the rational list a_1, ...), followed by int-mn with argument (m,n).
ξ_2: the control is int-prog-list, followed by set-mn, followed by sum with argument (m,n).
ξ_3: the control is int-prog-list, followed by set-mn with argument m+n.
ξ_4: s-parno∘s-dn is m+n; the control is int-prog-list.
If m+n > 0 then
ξ_5: the control is int-prog-list (argument: the tail of the list), followed by updatey, sum with argument (u, 0), and product with argument (a_1, y_{i-1}).
ξ_6: the control is int-prog-list, followed by updatey and sum with argument (a_1*y_{i-1}, 0).
ξ_7: the control is int-prog-list, followed by updatey with argument a_1*y_{i-1}.
ξ_8: s-counter is 2, s-y is a_1*y_{i-1}, the data list begins with y_{i-2}; the control is int-prog-list (argument: a_2, ...).]

A sequence like ξ_5, ξ_6, ξ_7, ξ_8 is now repeated until s-counter(ξ_i) = m+n+2, whereupon we get

[Diagrams of the last states; the recoverable content is:
ξ_{i+1}: s-counter is m+n+2; the control is updatey, followed by sum with argument (ŷ, d_i).
ξ_{i+2}: the control is updatey with argument ŷ+d_i.
ξ_{i+3}: s-counter is m+n+3, s-y is y_i, s-parno is m+n.]

ξ_{i+3} ∈ F, so the computation has terminated. Its result is available in s-y∘s-dn(ξ_{i+3}).

It should be remarked that the above definitions of LML are not quite complete. There are restrictions on an LML program which cannot be expressed in the earlier specifications of concrete and abstract syntax. These restrictions are, that the number of parameters and the number of data items be compatible with the values of m and n, and that the value of i should not exceed N, the length of the look-up table. These conditions can be expressed in the definitions of the LML instructions. This is simply done by entering an error state if any of these restrictions are violated. In languages like Algol, this technique can also be used to specify context-sensitive restrictions, which cannot be expressed in the context-free grammars (see (49)).

A.8 Summary

The Vienna method of defining programming languages has been described. This method includes the formal definition of the semantics of a language, and is sufficiently powerful to be used for the definition of practical programming languages. It has been used here for the definition of the simple and special-purpose Linear Model Language. This has been done both to illustrate the method, and in order to make familiar a rather broader notion of "programming language" than is usual.

The Vienna Method of language definition is used in chapter 5 to formalise the notion of a "fragment" of a language.

APPENDIX B

Syntax of the AlgolW-Support of the Gas-Furnace Models

This appendix contains the concrete syntax of the AlgolW-support of the five models of section 6.3.2. It is based on the AlgolW syntax specification given in (50). The numbers in brackets to the right of subheadings indicate the relevant sections of (50), in order to facilitate comparison. Standard procedure statements are new non-terminals which do not appear in (50) (cf. sec. 6.3.1). The symbol "t" may be replaced by either "real" or "integer", in accordance with the rules specified in sections 1.1, 1.5, 1.5.3, and 1.6.2 of (50).

1. Identifiers (1.2)

::=

::= ::= <standard procedure identifier>::=

READIREADONIWRITE

::= EIIIJINIUIVIWIYIZ ::= 0111213141516171819

(Note:each of these appears in

Numbers

every

look-up table).

list>:: = l
list>,

(1.3.1)

::=

::=
number>.
number> I

.


::=l
Declarations

number>

(1.4)

<declaration>::=<simple

variable declaration> I

3.1

Simple Variable Declarations

(1.4.1)

<simple variable declaration>::=<simple

type>

<simple type>:: = INTEGERIREAL 3.2

Arra[ Declarations

(1.4.2)

::=<simple

type>ARRAY
list>

() ::= ::=:: ::= ::= 4.

Expressions

(1.5)

::=<simple 4.1 Variables

t expression>

(1.5.1)

<simple t variable>::= I ::=<simple

t variable>

::=
array identifier>(<subscript

<subscript list>::=<subscript> <subscript>::=
expression>

list>)


4.2

Arithmetic Expressions (1.5.3)

<simple t expression>::=l<simple t expression>+ l<simple t expression>- ::=l* ::= ::=I 4.3

Lo@ical Expressions

(1.5,,..4)

::= ::=<simple t expression>
operator>

<simple t expression> ::= < 5.

Statements

(1.6)

<program>::=.

(Note we do not provide a

specification of the syntax of ). <statement>::=<simple

statement> I

I <simple statement>

::=l I <standard procedure statement>

5.1

Blocks

(1.6.1)

::=<statement>END ::=l<statement>; ::= BEGINI<declaration> 5.2

Assignment Statements

(1.6.2)

::=
left part>


::=
Standard

variable>:=

Procedure

<standard procedure

Statements

(cf. 1.6.3 and 1.6.8)

statement>::=<standard

procedure

(
list>::=

list>,
expression>I
designator>::=
5.4

designator

If Statements

::=
parameter>

subarray designator>

array identifier>(<subarray designator

<subarray

list>)

list>::=<subscript> (1.6.5) clause><simple

statement>

ELSE<statement> ::= 5.5

Iterative

IF
expression>THEN

Statements

(1.6.7)

statement>::=
::=

clause><statement>

FOR:=BO

value>::=
::=
list>)

parameter> I

::=
identifier>

expression>

expression>

value>UNTIL


230

constant behaviour

~sou RCE ~'~X°mp[exX~

S'NK

behavJour S 1=(O,X)

FIG

52 =(X,O)

1

I

)

const,ant beha~our

231

e.

Yi

.*,~rror

_JCONVENT]ONAL

ui

Yi Iii exact output observation

I approximate output obser vat[ on (model. output) Representation of Conventional Mode[ FIG 20

GENERATION OF CORRECTIONS

I

I ui

I I I

II

.J CONVENTIONAL MODEL

f/i

J

I

,Yi lexact output observation i(model output )

I

I I

I

I

Corresponding Modet Which Satisfies Definitions 3.3.1,3.3.6 FIG 2b

FIG 3. The Box-Jenkins Model of the Gas-Furnace Data. [Block diagram not recoverable. Legible labels: DETERMINISTIC PROCESS B(B)/A(B) with input u_t; NOISE PROCESS D(B)/C(B) with output n_t; model output y_t.]

A(B) = 1 - 0.57B - 0.01B^2
B(B) = -(0.53 + 0.37B + 0.51B^2)B^3
C(B) = 1 - 0.53B + 0.63B^2
D(B) = 1
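The deterministic block B(B)/A(B) of FIG 3 can be read as a difference equation in the backward shift operator B. The following is a minimal sketch of that reading, not the author's implementation: it assumes zero initial conditions, and the function name `apply_transfer_function` and its parameters are hypothetical. With A(B) and B(B) as given in the figure, A(B) y_t = B(B) u_t becomes y_t = 0.57 y_{t-1} + 0.01 y_{t-2} - 0.53 u_{t-3} - 0.37 u_{t-4} - 0.51 u_{t-5}.

```python
def apply_transfer_function(u, a=(0.57, 0.01), b=(-0.53, -0.37, -0.51), delay=3):
    """Apply y_t = sum_j a_j*y_{t-j} + sum_k b_k*u_{t-delay-k} to the sequence u.

    a and b are taken from A(B) and B(B) in FIG 3; zero initial
    conditions are an assumption made for this illustration.
    """
    y = []
    for t in range(len(u)):
        value = 0.0
        # Autoregressive part: the terms of A(B) moved to the right-hand side.
        for j, a_j in enumerate(a, start=1):
            if t - j >= 0:
                value += a_j * y[t - j]
        # Delayed input part: B(B) carries a pure delay of B^3.
        for k, b_k in enumerate(b):
            if t - delay - k >= 0:
                value += b_k * u[t - delay - k]
        y.append(value)
    return y


# Impulse and step inputs (hypothetical test signals, not the gas-furnace data).
impulse_response = apply_transfer_function([1.0] + [0.0] * 7)
step_response = apply_transfer_function([1.0] * 200)
```

The step response settles at the steady-state gain B(1)/A(1) = -1.41/0.42, which gives a quick consistency check on the coefficients.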

FIG 4(a). Model I - The Trivial Model. [Block diagram not recoverable. Legible labels: TABLE LOOK-UP (y_1, ..., y_296), output y_i.]

FIG 4(b). Model II - The Mean. [Block diagram not recoverable. Legible labels: constant 53.5, TABLE LOOK-UP, output y_i.]
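One possible reading of panels (a) and (b) is that each model combines a fixed part with a table look-up of stored values, so that the exact observation is recovered. The sketch below illustrates that reading only. The function names and the observation values are hypothetical, and the assumption that the table in Model II stores deviations from the constant 53.5 is an interpretation of the diagram rather than something the figure states.

```python
def build_constant_model(observations, constant):
    """Return (fixed_part, table): a constant plus one stored correction per observation."""
    table = [y - constant for y in observations]
    return constant, table


def model_output(fixed_part, table, i):
    """Output for observation index i (0-based): fixed part plus the looked-up correction."""
    return fixed_part + table[i]


# Hypothetical observations, not the gas-furnace series.
observations = [53.5, 54.0, 53.0, 52.5]

# Model II analogue: constant 53.5 plus a table of corrections.
fixed_part, table = build_constant_model(observations, constant=53.5)
reproduced = [model_output(fixed_part, table, i) for i in range(len(observations))]

# Model I analogue: fixed part 0, so the table holds the observations themselves.
trivial_fixed, trivial_table = build_constant_model(observations, constant=0.0)
```

In this sketch both models reproduce the observations; they differ only in what has to be stored in the table.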

FIG 4(c). Model III - Deterministic Transfer Function Using Input Observations Only. [Block diagram not recoverable. Legible labels: input (u_1, ..., u_i), constant -0.049, B(B)/A(B), INITIAL CONDITIONS, TABLE LOOK-UP (n_1, ..., n_296), outputs y_i for i >= 6 and for i < 6.]

FIG 4(d). Model IV - Deterministic Transfer Function Using Input and Output Observations. [Block diagram not recoverable. Legible labels: (u_{i-5}, u_{i-4}, u_{i-3}), constants -0.049 and 53.5, B(B)/A(B), (y_{i-2}, y_{i-1}), e_i, TABLE LOOK-UP (e_1, ..., e_296), output y_i for i >= 6.]

FIG 4(e). Model V - Stochastic Process Model. [Block diagram not recoverable. Legible labels: (u_{i-7}, ..., u_{i-1}), constants -0.049 and 53.5, B(B)/A(B), D(B), (y_{i-7}, ..., y_{i-1}), (y_{i-2}, y_{i-1}), TABLE LOOK-UP (a_1, ..., a_296), outputs y_i for i >= 8 and for i < 8.]

FIG 4(f). Model VI - Box & Jenkins Model. [Block diagram not recoverable. Legible labels: (u_{i-7}, ..., u_{i-1}), constants -0.049 and 53.5, B(B), A(B), D(B), (y_{i-7}, ..., y_{i-1}), (y_{i-2}, y_{i-1}), (w_{i-2}, w_{i-1}, w_i), e_i, TABLE LOOK-UP (w_1, ..., w_296), outputs y_i for i >= 8 and for i < 8.]

FIG 5. Sizes of models I-VI. [Plot not reproduced. Horizontal axis: observations (ticks at 100, 200, 300); vertical axis ticks at 500, 1000, 1500.]

FIG 6. Information gains of models I-VI. [Plot not reproduced. Horizontal axis: observations (ticks at 100, 200, 300); vertical axis: information, ticks from -200 to 600.]

FIG 7. Structure of computations performed by Linear Model Language (B denotes the backward shift operator). [Block diagram not recoverable. Legible labels: input (u_{i-m}, ..., u_i), polynomial b_m B^m + ... + b_0, TABLE LOOK-UP, output y_i.]

FIG 8. Structure of computations performed by Extended Linear Model Language. [Block diagram not recoverable. Legible labels: (y_{i-q-n}, ..., y_{i-1}), D(B)/C(B), (w_{i-p}, ..., w_i), TABLE LOOK-UP, and outputs y_i for i above and below a threshold of the form max(...) that is illegible in the scan.]

FIG 9. Assumed structure of models (in chapter 7). For definitions of D_i, C_i see def. (3.3.7). [Block diagram not recoverable. Legible labels: TABLE LOOK-UP, e_i, D_i, FIXED PART OF MODEL, C_i = y_i.]
