Lecture Notes in Computer Science Edited by G. Goos and J. Hartmanis
30 F. L. Bauer • J. B. Dennis • G. Goos • C. C. Gotlieb • R. M. Graham • M. Griffiths • H. J. Helms • K. W. Morton • P. C. Poole • D. Tsichritzis • W. M. Waite
Software Engineering An Advanced Course Reprint of the First Edition
Edited by F. L. Bauer
Springer-Verlag Berlin • Heidelberg • New York 1975
Editorial Board: P. Brinch Hansen • D. Gries • C. Moler • G. Seegmüller • N. Wirth
Prof. Dr. Dr. h. c. F. L. Bauer, Institut für Informatik der TU München, 8 München 2, Arcisstraße 21, BRD
Formerly published 1973 as Lecture Notes in Economics and Mathematical Systems, Vol. 81
ISBN 3-540-06185-1 1. Auflage Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-06185-1 1st edition Springer-Verlag New York Heidelberg Berlin
Library of Congress Cataloging in Publication Data
Advanced Course on Software Engineering, Munich, 1972. Software engineering. (Lecture notes in computer science ; 30) First published in 1973 under title: Advanced Course on Software Engineering. "The advanced course took place February 21-March 3, 1972, organized by the Mathematical Institute of the Technical University of Munich and the Leibnitz Computing Center of the Bavarian Academy of Sciences, in cooperation with the Ministry of Education and Science of the Federal Republic of Germany." Includes bibliographies and index. 1. Electronic digital computers--Programming--Congresses. 2. Programming languages (Electronic computers)--Congresses. I. Bauer, Friedrich Ludwig, 1924- II. Munich. Technische Universität. Mathematisches Institut. III. Akademie der Wissenschaften, Munich. Leibnitz Rechenzentrum. IV. Title. V. Series. QA76.6.A33 1972a 001.6'425 75-14409
AMS Subject Classifications (1970): 68A05 CR Subject Classifications (1974): 4.
ISBN 3-540-07168-7 Nachdruck der 1. Auflage Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-07168-7 1st edition, 2nd printing Springer-Verlag New York Heidelberg Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other
Contents

F. L. Bauer
PREFACE

CHAPTER 1: INTRODUCTION

K. W. Morton
WHAT THE SOFTWARE ENGINEER CAN DO FOR THE COMPUTER USER
1. Introduction
2. Program Duplication
3. User Images
4. Application Program Suites
5. Conclusion
6. References

J. B. Dennis
THE DESIGN AND CONSTRUCTION OF SOFTWARE SYSTEMS
1. Introduction
2. Terminology
  2.1. Computer Systems
  2.2. Software Systems
  2.3. Hierarchy
  2.4. System and Application Software
3. Description of Software Systems
4. Function, Correctness, Performance and Reliability
  4.1. Function
  4.2. Correctness
  4.3. Performance
  4.4. Reliability
5. Software Projects
6. Acknowledgement
7. References

CHAPTER 2: DESCRIPTIONAL TOOLS

G. Goos
HIERARCHIES
0. Introduction
1. Hierarchical Ordering as a Design Strategy
  1.1. Levels of Abstraction
  1.2. The Order of Design Decisions
2. Hierarchical Ordering and Languages
  2.1. Abstract Machines and the Production Process
  2.2. Hierarchies of Languages
3. Protection by Hierarchical Ordering
4. References

G. Goos
LANGUAGE CHARACTERISTICS: Programming Languages as a Tool in Writing System Software
0. Introduction
1. The Influence of Language Properties on Software Creation
  1.1. Language Constructs as Models for Program Behavior
  1.2. Influence on Programming Style and Program Documentation
  1.3. Machine Independence and Portability
  1.4. Portability Versus Efficiency
  1.5. Limitations of Programming Languages
2. Requirements for Structured Programming and Program Modularity
  2.1. Modularity
  2.2. Hierarchies, Nesting and Scope Rules
  2.3. Concurrent Processes
3. Data Structures in System Programming
  3.1. Simple Values
  3.2. Records
  3.3. Storage-Allocation for Records
4. System-Dependent Language Features and Portability
5. Some open Problems
6. References

M. Griffiths
LOW LEVEL LANGUAGES: SUMMARY OF A DISCUSSION SESSION
1. Introduction
2. Justification
3. Features
4. Machine Dependence
5. Efficiency
6. Style and Education
7. Conclusion
8. Acknowledgement
9. References

M. Griffiths
RELATIONSHIP BETWEEN DEFINITION AND IMPLEMENTATION OF A LANGUAGE
1. Introduction
  1.1. Requirements of Different People
  1.2. Design of Language for good Programming
  1.3. Design for Testing
2. Language Definition
  2.1. Syntax
  2.2. Static Semantics
  2.3. Dynamic Semantics
  2.4. Example taken from ALGOL 60
    2.4.1. Syntax
    2.4.2. Static Semantics
    2.4.3. Dynamic Semantics
    2.4.4. Comments on the Example
3. From Definition to Implementation
  3.1. Semantic Functions
  3.2. Implementation Languages
  3.3. Execution Model
  3.4. Final Comments on Implementation
4. A Look at some Definitions
  4.1. ALGOL 68
  4.2. Vienna Definitions
  4.3. Extensible Languages
5. Conclusion
6. Acknowledgements
7. References

J. B. Dennis
CONCURRENCY IN SOFTWARE SYSTEMS
1. Introduction
2. Petri Nets
3. Systems
4. Determinacy
5. Interconnected Systems
6. Interprocess Communication
7. References

CHAPTER 3: TECHNIQUES

J. B. Dennis
MODULARITY
Introduction
1. Concepts
  1.1. Definition of Modularity
  1.2. Modularity in Fortran
  1.3. Modularity in ALGOL 60
  1.4. Substitution
  1.5. References
2. Data Structures in Modular Programming
  2.1. Address Space and Modularity
  2.2. Representation of Program Modules
  2.3. Linguistic Levels for Modular Programming
    2.3.1. PL/I
    2.3.2. ALGOL 68
    2.3.3. LISP
    2.3.4. Discussion
  2.4. References
3. Modularity in Multics
  3.1. The Model
    3.1.1. The File System
    3.1.2. Processes and Address Spaces
    3.1.3. Making a Segment known to a Process
    3.1.4. Dynamic Linking
    3.1.5. Search Rules and the Working Directory
  3.2. Accomplishments
  3.3. Unresolved Issues
    3.3.1. Treatment of Reference Names
  3.4. References
4. A Base Linguistic Level for Modular Programming
  4.1. Objects
  4.2. Structure of a Base Language Interpreter
  4.3. State Transitions of the Interpreter
  4.4. Representation of Modular Programs
  4.5. Use of the Model
5. References

P. C. Poole, W. M. Waite
PORTABILITY AND ADAPTABILITY
1. Introduction
  1.1. The Basic Principles
  1.2. What we can expect to achieve
2. Portability Through High Level Language Coding
  2.1. The Need for Extensions
  2.2. Extension by Embedding
3. Portability through Abstract Machine Modelling
  3.1. Background
  3.2. Relating the Model to Existing Computers
  3.3. Relating the Model to the Problem
4. Realization of Abstract Machine Models
  4.1. Translator Characteristics
  4.2. Obtaining the Translator
5. A Case Study of some early Abstract Machines
  5.1. Machine and Language Design
  5.2. Porting and Adapting
  5.3. Review and Evaluation
6. Low Level Languages for Abstract Machines
  6.1. The Basic Hardware Model
  6.2. A Framework for Low Level Languages
  6.3. An Example of a Low Level Language
7. A Hierarchy of Abstract Machines
  7.1. Need for the Hierarchy
  7.2. A Standard Base for the Hierarchy
  7.3. A Case Study
8. References

P. C. Poole
DEBUGGING AND TESTING
1. Introduction
2. Planning for the Testing and Debugging Phases
  2.1. Documentation
  2.2. Debugging Code
  2.3. Generation of Debugging Code
  2.4. Modularity
  2.5. Parameterisation
3. Testing and Debugging Techniques
  3.1. Classical Debugging Techniques
  3.2. Online Debugging
  3.3. Testing Strategies and Techniques
4. References

D. Tsichritzis
RELIABILITY
1. Design and Construction of Reliable Software
  1.1. Introduction
  1.2. Influence of the Language
  1.3. Semantic Checking
  1.4. Programming Style
  1.5. Influence of Protection
  1.6. Program Correctness
    1.6.1. Informal Proof
    1.6.2. Formal Proof
  1.7. Design for Reliability
  1.8. Reliability during the Life Cycle of the Software
  1.9. Summary and Conclusions
2. Protection
  2.1. Introduction
  2.2. Domains and Objects
  2.3. Protection Walls and Monitors
  2.4. Identity Cards and Capabilities
  2.5. Policing
  2.6. Describing the Protection Status of a System
  2.7. Implementation
  2.8. A Capability Based File System
    2.8.1. Introduction
    2.8.2. Capability Format
    2.8.3. Packing Capabilities
    2.8.4. Kernel System Facilities
    2.8.5. Passing Capabilities
    2.8.6. Outline of the File System
    2.8.7. Facilities of the File System
    2.8.8. Organization of the File System
3. Security
  3.1. Introduction
  3.2. Information System Approach
    3.2.1. Integrity of Personnel
    3.2.2. Authentication of Users Identity
    3.2.3. Protection of Data Off Line and in Transmission
    3.2.4. Threat Monitoring
  3.3. Data Dependence and Data Transformations
    3.3.1. Data Transformations
    3.3.2. Data Dependent Access
    3.3.3. Program Certification
  3.4. Summary of Current Practices
4. References

CHAPTER 4: PRACTICAL ASPECTS

D. Tsichritzis
PROJECT MANAGEMENT
1. Introduction
2. Project Communication, Organization, and Control
3. Project Phase
  3.1. Proposal
  3.2. Survey Phase
  3.3. Design and Implementation Phase
4. Managing "Large" Projects
5. References

G. Goos
DOCUMENTATION
0. Introduction
1. The Needs for Documentation
  1.1. The User's Guide
  1.2. The Conceptual Description
  1.3. Design and Product Documentation
2. Special Problems
  2.1. Description of Data and Algorithms
  2.2. Crossreferencing between Documentation and Program
  2.3. Maintaining the Documentation

R. M. Graham
PERFORMANCE PREDICTION
1. Performance: Definition, Measurement, and Limitations
  1.1. What is Performance?
  1.2. Measurement of Performance
    1.2.1. Performance as a Function of Input
    1.2.2. Metrics
    1.2.3. Steady State, Transient, and Overload Behavior
  1.3. Limitations of Performance
    1.3.1. Inherent Limitations
    1.3.2. Economic Limitations
  1.4. Summary
2. System Modeling
  2.1. Types of Models
    2.1.1. Analytical Models
    2.1.2. Directed Graph Models
    2.1.3. Simulation Models
  2.2. Problems in Modeling
3. Use of Models in Performance Prediction
  3.1. Problems in using Models
  3.2. Prediction using an Analytical Model
  3.3. Prediction using a Directed Graph Model
4. Simulation
  4.1. Major Methods
  4.2. Specification of Job Properties
  4.3. Data Collection
  4.4. Simulation Languages
  4.5. An Example Simulation Model
5. Integrated Performance Prediction, Design, and Implementation
  5.1. The Problems with Non-Integrated Prediction
  5.2. Single Language Approach
  5.3. Interaction with the Designer-Implementer
  5.4. Aids to Project Management
6. References

C. C. Gotlieb
PERFORMANCE MEASUREMENT
1. Introduction
2. Figures of Merit
3. Kernels, Benchmarks and Synthetic Programs
4. Data Collection and Analysis
5. Hardware Monitors
  5.1. One Computer Monitoring Another
  5.2. Monitor Logic
  5.3. Examples of Currently Available Hardware Monitors
  5.4. Analysis of Output of Hardware Monitors
6. Software Monitors
  6.1. Monitoring from Job-Accounting Data
  6.2. Packaged Software Monitors
  6.3. Special Monitor and Trace Programs
  6.4. Estimating Monitor Statistics from the Observations
7. References

C. C. Gotlieb
PRICING MECHANISMS
1. The Rationale of Pricing
2. Determining Factors
3. Costs
4. The Factory Model
5. Pricing a Service
6. Software Requirements
7. Examples for Pricing Mechanisms
  7.1. Rate Schedule for the University of Toronto, 1 Jan 1972
  7.2. Disk Pack Rental
  7.3. (Off-Line) Disk Pack Storage
  7.4. Disk to Tape Backup
  7.5. Tape Rental
  7.6. Tape Storage
  7.7. Tape Cleaning and Testing
  7.8. Negotiated Contract Services
  7.9. Calcomp Plotting
  7.10. Card Processing
8. References

H. J. Helms
EVALUATION IN THE COMPUTING CENTER ENVIRONMENT
1. Introduction
2. The User and his Needs
3. Software and the Computing Center
4. Installation and Maintenance of a Piece of Software
5. Conclusion
6. References

APPENDIX

F. L. Bauer
SOFTWARE ENGINEERING
1. What is it?
  1.1. The Common Complaint
  1.2. The Aim
  1.3. The Paradox of Non-Hardware Engineering
  1.4. The Role of Education
2. Software Design and Production is an Industrial Engineering Field
  2.1. Large Projects
  2.2. Division into Manageable Parts
  2.3. Division into Distinct Stages of Development
  2.4. Computerized Surveillance
  2.5. Management
3. The Role of Structured Programming
  3.1. A Hierarchy of Conceptual Layers
  3.2. Communication between Layers
  3.3. Software Engineering Aspects
  3.4. Flexibility: Portability and Adaptability
  3.5. Some existing Examples
  3.6. The Trade-Offs
4. Concluding Remarks
Acknowledgements
References

SOFTWARE ENGINEERING
An Advanced Course

by
J. B. Dennis (Cambridge, Mass.)
G. Goos (Karlsruhe)
C. C. Gotlieb (Toronto)
R. M. Graham (Berkeley, Cal.)
M. Griffiths (Grenoble)
H. J. Helms (Copenhagen)
K. W. Morton (Reading, England)
P. C. Poole (Abingdon, England)
D. Tsichritzis (Toronto)
W. M. Waite (Boulder, Colo.)

edited by
F. L. Bauer (Munich)

The Advanced Course took place February 21 - March 3, 1972, organized by the Mathematical Institute of the Technical University of Munich and the Leibniz Computing Center of the Bavarian Academy of Sciences, in cooperation with the European Communities, sponsored by the Ministry of Education and Science of the Federal Republic of Germany.

PREFACE It
is
the
not
necessary
present
fully
prepared
presented use o f In
book,
at
the
in
ers,
way,
in
mean.
problems the
Soon
not
teaching
material
step
in
of
whether
the
there
essential
the
is to try
all
the
the
to
Science find
indeed
turn
up i n
the
claim
manufacturthat
whatever
the
they this
understood,
the of
of
Garmisch
be d o n e .
may
advertisements.
The r e p o r t s
much more.
in
show c o n c e r n
much b e t t e r
not
a
The s i t u a -
and s y s t e m a t i z e d .
of
is
difficult, it the
this
course
aspects of
of
the
software the
This
matethe
and Rome are
In order
was to
the
theme,
field.
to
have
book b r i n g s
of
a
penetrate
concern
was t h a t
be used i n notes
contribute
actually
in
debate
we t h i n k
engineering
its
it should
curricula.
whether
a topic
as a k i n d
of
of a theme
environment.
as many p e o p l e lecture
as much as we
Instead,
s o m e t h i n g y o u can m e n t i o n
could
and to
software
as much as p o s s i b l e
an a c a d e m i c
cover
We do n o t
engineering. ideas
and s h o u l d
out
my m a j o r
of
will
was w r o n g
software.
addressed,
o u t where
respect,
hand so t h a t
something of
engineering,
more has t o
planning
In
ture.
and
illustrates
and some o f
conferences
a need f o r point
engineering
publication
72,
has been used i n
many p e o p l e
but
in
at
then;
sponsored
students
extremely
that
material,
to y o u r this
71/Jan.
72,
software
systematization
Computer
Thus we w i l l software
care-
direction.
t h e moment o f to
are
available,
in
further
influence
Engineering:
experts,
Dec.
was m a i n l y
concentrated
this
Our i n t e n t i o n
is
of
Engineering'
engineering
engineering'
Committee
collection
since
of
the problems
still
NATO S c i e n c e
can a t
Software
Febr.-March
and s e r v i c i n g
software
principles
a useful first
in
of
a group
Garmisch,
demonstrate
changed
of
'software
But a l t h o u g h
to
production
provocation
obey t h e
is
course
of
in
'Software
order
design,
to which
already
rial
t h e word
has c o n s i d e r a b l y the
a definition effort
seminar
a EEC s p o n s o r e d
existing
tion
with
term.
provocative
about
start
a two-week
1967 and 1 9 6 8 ,
the
to
a consolidated
told
to
today
me, t o
a course.
despite
one s t i l l
their
digest
Therefore, somewhat
finds the
it
material
we e n v i s a g e d tentative
na-
In s e l e c t i n g
the p a r t i c i p a n t s
e v e r t h e y may l e a r n here i s the u n i v e r s i t i e s It
is
spread o u t ,
accidental
that
in
the
sharp c o n t r a s t
demand f o r live
with
It
affluent
will
the p r o d u c t ,
improve,
and t h i s
But I hope t h a t
I hope one day s o f t w a r e
tion
in
nomy i n
Science',
a rich
nation.
ends a l l
education help;
and then to be used. the ground f o r
crisis
Conference S t a f f thanks tute
On the o t h e r
may l e a d to s t r a n g u l a in parti-
I enjoyed
lecturers
for
to the
their
co-director,
from the M i n i s t r y me f o r
In
i n the subgroup
EEC, D r . R . G n a t z ,
s u p p o r t from M r . J . D e s f o s s e s of Education
of Germany s h o u l d be g r a t e f u l l y forgive
the a d v i c e and
encouraging support.
of group PREST of the the moral
and eco-
for (EEC)
and Science
of
acknowledged. The
not m e n t i o n i n g a l l
of them, my
to them go by the name of Mr. Hans Kuss o f the Mathematics I n s t i -
of
redactor
Munich,
support will
life.
depend on the computer t o d a y ,
I owe thanks
how
Thus, what we have to
our f u t u r e
to the German r e p r e s e n t a t i v e
connection
Republic
around,
and may thus do harm a l s o to s c i e n c e
in i n f o r m a t i c s
and the f i n a n c i a l the F e d e r a l
a defense turn
and to the
in t h i s
improvements.
partly
dictate
I am o b l i g e d
his
is
will
and f r i e n d s .
for
some hope
will
help
Prof. L.Bolliet,
is
situation
o f the Advanced Course,
of c o l l e a g u e s
there
hopes f o r
the t i m e b e i n g ,
In the p r e p a r a t i o n
particular,
They have not c o n s t r u c t e d
s i m p l e market c o n s i d e r a t i o n ,
some day t h i s
users t h a t
'Big
of the s o f t -
people are f o r c e d
he can do to make the customer s t a y
in m a s t e r i n g the s o f t w a r e
of scientific
cular
for
also preparing
hand, f a i l u r e
that
which
leads to
engineering considerations
machines are to be b u i l t is
but f o r
usually
software engineering,
work f o r
But the r o o t s
t h e y do not want.
The p o v e r -
them and have to make the b e s t out o f i t .
the m a n u f a c t u r e r does e v e r y t h i n g
stratagem.
community,
the chance of b u y i n g a new machine,
the s i t u a t i o n
Thus,
solution.
States.
have
on the c o n t i n e n t ,
US computer
comes from the f a c t
machines t h a t
Sometimes, w i t h
with
propagated in
Engineering'
o u t s i d e the U n i t e d
the most economical
them, t h e y s i m p l y r e c e i v e that
is
on ' S o f t w a r e
in Europe, a t l e a s t
to the
ware m i s e r y go deeper. to
to assure t h a t what-
in particular
efforts
on to a l a r g e e x t e n t
t y of the computer s i t u a t i o n is
some e f f o r t
and the major m a n u f a c t u r e r s .
not q u i t e
been c a r r i e d
we took
the T e c h n i c a l of t h i s
June 1972
University
Munich, who a l s o was the r e s p o n s i b l e
publication.
Friedrich
L.Bauer

CHAPTER 1.A
WHAT THE SOFTWARE ENGINEER CAN DO FOR THE COMPUTER USER
Prof. Dr. K. W. Morton
Culham Laboratory, Abingdon, Berkshire
Great Britain

1. INTRODUCTION

There can be little doubt that there is at present an air of disillusion in the computer community. Computers are not living up to their potential and, in particular, the promises of the so-called third generation systems have been largely unfulfilled. As a result users have become more conservative and critical and are less ready to invest in new equipment. The reason does not lie with the computer hardware which continues to show a remarkable capacity to advance by orders of magnitude. But how often do we find software becoming ten times cheaper, ten times more reliable, ten times more efficient? It is more likely that it is ten times more complex both to use and to maintain and these more desirable qualities have been sacrificed as the sophistication of concept has outstripped the capacity for practical implementation. In short, software in general shows all the signs of poor and inadequate engineering.

While computer science has flourished in the 1960's with the establishment of journals, degree courses in universities, etc., the software engineering aspects of the subject have struggled for support, what techniques exist have been poorly disseminated and there is very little software available in the hands of users which has been built on the best engineering principles. In fact, many people are still arguing about what is software engineering and how is it related to computer science. As a mathematician, I am struck by the similarity both of the controversy and the actual relationship with that existing between mathematics in general and applied mathematics: in my view, it is not the subject matter itself that forms the important distinction but rather the use made of it and the attitude adopted toward it. Computer science gave us Algol 60: it also gave us the prospect of time sharing. But when we sit down at a console to write an Algol program, it is software engineering which determines how easy it is to achieve this end or, alternatively, the frustrations that we have to go through.

In his address to IFIP Congress 71, reproduced in this volume, Professor Bauer has given an excellent introduction to the subject and the more important references. In this lecture I want to draw attention to just three problems which are of particular concern to the computer user at the moment and where an increased application of software engineering principles could be of immense benefit to him. They are

(i) program duplication - duplication in one's own programming because of ignorance of the work of others, change of computing systems or partial change of requirements, differing languages, and duplication of system software, which in the last analysis one has to pay for;

(ii) the poor design and implementation of user images and their irrational variation from system to system;

(iii) the management of large application program suites - getting them written, used and maintained.

2. PROGRAM DUPLICATION

The earliest response to this problem was the subroutine library. Every computer range, every programming language, every computer installation now has its subroutine library - but to a large extent they are all different. Some of the reasons for this higher level duplication are undoubtedly human but it is also astonishing how many technical barriers are placed in the way of users sharing subroutines more widely.

Routines implementing numerical algorithms are probably most widely distributed and most often form the basis of independent libraries. Indeed the last year or so has seen a great deal of progress in setting up machine and/or manufacturer independent libraries in this area. The appearance of the second volume of the Handbook of Automatic Computation [1] has been a great stimulus and the proceedings of the Mathematical Software Symposium [2] held at Purdue University in 1970 clearly show the increasing awareness of the benefits and problems of widely used mathematical software. In the United Kingdom, after several false starts stretching back over many years, a large number of numerical analysts and computing service people have now pooled their efforts in the NAG library project [3]. I am doubtful whether this would have materialised had it not been for the fact that the six universities originally involved had orders for the same computers (ICL 1906As) approved at about the same time. But now that it has started the project is being encouraged to cover IBM and CDC machines as well as other ICL machines.

As one of the best libraries available within current operating systems, the NAG library is a good illustration of the practical limitations imposed by these systems. For example:

(a) The library covers the needs of both Fortran and Algol programmers but to do so it has to contain duplicate routines - a waste of both development effort and storage space as well as preventing the exploitation of the most suitable language for each particular algorithm. Many of the problems of mixed language programming, especially between this pair of languages, have been overcome in other operating systems and it is highly desirable that this interface should be properly defined and engineered once and for all.

(b) Routines in Fortran have to be in the ANSI dialect. This again means that any extra features of the local Fortran dialect cannot be exploited and that a great deal of conversion work has to be carried out. It could well be possible that some of the techniques described in this course could provide automatic dialect conversion tools to avoid this limitation. Indeed it would seem that the proper engineering approach would be to insist that such conversion tools should be an integral part of any proposed extension to a language.

(c) To increase portability, other limitations are placed on the subsets of the languages that may be used - for example, no I/O statements are allowed, nor are COMMON variables in Fortran. These are important restrictions leading to poor programming practices and result largely from incompatibilities in run-time packages between machines and languages. A properly engineered solution is to base a library on a family of portable compilers with a shared run-time package.

(d)
Accuracy is generally given priority over efficiency: when such a decision entails severe penalties several versions of a routine are held. This requirement of "adaptability" is a common one and forms a major target of software engineering techniques - the name 'generic components' has been given to program modules which can be used to generate executable code meeting differing requirements.

In commercial data-processing, libraries of sub-routines are less common though the disadvantages of duplication are no less great. This is largely because the sort of restrictions described above would be so severe in practice as to be unacceptable. The problems they raise are more akin to those which appear in scientific programming when larger modules or whole programs from differing sources are combined. These include

(a) formats for input and output;
(b) data access and storage mechanisms;
(c) file security, private files and archiving;
(d) data storage layout and variable dimensioning of arrays;
(e) program segmentation and overlays;
(f) overlapped execution of independent tasks (parallelism).

At this level the problems of sharing program modules become very great and are hardly touched by the use of conventional programming languages except between people using the same installation. The choices entailed are highly machine dependent and this is reflected in the differing implementations of the language facilities needed. Nevertheless the difficulties met in practice are an order of magnitude greater than those which are logically defensible. The whole structure of high level programming involving linkage, supervisor calls, etc. has evolved in a rather undisciplined way and simplifying advances like paging and the one-level store have not won general acceptance. These difficulties merge almost imperceptibly into those of non-standard compilers, editors, utility programs and expensive, difficult-to-use operating systems.

I look to the software engineers to so reorganise computer systems and the way in which users construct large program suites that it becomes

(a) cheaper and easier to construct them;
(b) possible to share program modules more widely and over more levels;
(c) through application of these benefits to system software, easier and more efficient to use good system software.
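
The 'generic components' idea mentioned in item (d) of the NAG discussion above can be loosely illustrated in a modern language. The sketch below is only an analogy, not anything described in the lecture: the name `make_sqrt` and its `tolerance` parameter are hypothetical, and a closure factory stands in for a module that generates executable code. One source component yields a coarse, cheap routine or an accurate, more expensive one, instead of several hand-maintained versions.

```python
from typing import Callable


def make_sqrt(tolerance: float) -> Callable[[float], float]:
    """Generate a square-root routine specialised to a relative tolerance.

    Hypothetical 'generic component': one definition, many generated variants.
    """

    def sqrt(x: float) -> float:
        if x < 0:
            raise ValueError("x must be non-negative")
        if x == 0:
            return 0.0
        # Start at or above the true root so Newton's iteration decreases steadily.
        guess = x if x >= 1.0 else 1.0
        # Stop once guess**2 matches x to the requested relative tolerance.
        while abs(guess * guess - x) > tolerance * x:
            guess = 0.5 * (guess + x / guess)
        return guess

    return sqrt


# Two variants generated from the same component (values chosen for illustration).
fast_sqrt = make_sqrt(1e-3)       # fewer iterations, coarser result
accurate_sqrt = make_sqrt(1e-12)  # more iterations, tighter result
```

A caller chooses the variant that matches its accuracy and efficiency requirements; the library itself holds only the one generating definition.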

3. USER IMAGES

In today's pattern of computer usage, the user image of a computer system is only to a quite small extent formed by the high level languages that it supports. The user has to carry in his head great masses of information about system facilities, job control and command languages, installation procedures and how the job queues are organised at different times of the day, right down to the key conventions of the various consoles made available to him. And hardly any of these are as logical or well designed as the common high level languages.

To learn all this for just one system might be acceptable, especially if, as for second generation machines, it happens gradually as the system evolves. But when one has to replace the hardware, and sometimes even when one does not, one usually has to face a complete change of system right down to the last detail. Moreover, whatever the degree of portability for program packages that one may reasonably hope for in the near future, many users are going to want to access several different systems because of the packages which they alone support. Thus there is a crying need for drastic simplification, rationalisation and stabilisation of these user images.

On the hardware side a strong pattern is now emerging which indicates the way in which computer networking may eventually be reached. The user image begins at his terminal. This may be a teletype, VDU or some more sophisticated online terminal: it may be a card-reader/line printer terminal connected to either a local or a remote batch stream. When there are several users on the same site, the logical next step is to combine these into a so-called 'front-end' or intelligent terminal, consisting of a small computer controlling all user-oriented peripherals and handling all communications with any main frame computer which a user wishes to access. Thus there is a clear separation of function: the main frame computer provides the main processing and file storage capability which may be local or remote; on the other hand, the front-end or terminal computer handles the users' peripherals.
It seems to me that the front-end should therefore become more and more responsible for providing the user image. This can then become stabilised against differences between main frame computers and adapted to the local needs of the user community. Building such front-end systems is very much a job for the software engineers: there are no new techniques really required and the all important requirements are for reliability, good design and stability. Several groups are already working on these problems and we have a small team so engaged at Culham Laboratory [4].

Some of the tasks which it is envisaged may be handled by such a system include the following:

(a) User communication - controlling the consoles and other peripherals, determining which keys are used for which purpose and providing in-line text editing and format control of output; queuing requests and providing information about the state of accessible main-frame systems; checking user identity and access protocol; handling messages to and from other users; giving a first line information retrieval service.

(b) Main frame communication - controlling all information transfers; optimising use of communications lines; providing spooling facilities for I/O; providing for file transfers.

(c) Job control and console command language - providing a common core of language with translation to the main frame machine to be accessed; executing commands appropriate to itself (in many cases it will have its own filing system which will be accessed through the command language and which, by means of editors, syntax checkers etc., may be used for program preparation and job set-up); providing escape mechanisms into the JCL of particular main-frames when necessary; otherwise checking all input and providing prompts where appropriate.

(d) Scheduling - providing for local job queues and allocating priorities so that a maximum amount of local control is maintained; relaying as required up-to-date information to users on job status.

(e) Special device handling - this could range from handling fairly normal devices such as graph plotters and displays to acquiring data from special measuring devices.

(f) Utilities - providing many of the common utilities such as media conversion.
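Task (c) above, a common core of command language translated to the job control language of whichever main frame is accessed, can be illustrated by a minimal sketch. Everything named here is an assumption made for illustration: the main-frame names, the command names and the job control strings are invented and do not describe any real system.

```python
# Hypothetical per-main-frame templates. The system names and the job
# control strings are invented for illustration only.
TEMPLATES = {
    "mainframe_a": {"run": "EXEC {program}", "list": "CATALOG {user}"},
    "mainframe_b": {"run": "/RUN,{program}", "list": "/FILES,{user}"},
}


def translate(command, target, **args):
    """Translate one common-core command into the target's own job control text."""
    try:
        template = TEMPLATES[target][command]
    except KeyError:
        # Unknown command or unknown main frame: reject the input so that
        # the front-end can prompt the user instead of forwarding it.
        raise ValueError(f"no translation for {command!r} on {target!r}")
    return template.format(**args)


def escape(raw_job_control):
    """Escape mechanism: pass main-frame specific job control through unchanged."""
    return raw_job_control
```

With these assumed templates, the same user command yields different text for each main frame, so the user image stays fixed while the translation table absorbs the differences. The `escape` function stands for the escape mechanism into a particular main frame's JCL.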
Front-end systems such as this will vary from the very simple to the very complex and there are many different ways in which the interface between the main-frame and front-end tasks will develop. The lead in these developments is unlikely to be taken by major manufacturers since it cuts across them and is very user-oriented. Thus we are likely to be faced by a very confused situation which is no improvement on the present unless this work is very firmly based on sound software engineering principles.

4. APPLICATION PROGRAM SUITES
Managing and planning the production and maintenance of large applications programs raises similar problems to those met in systems software. The use of software engineering techniques is just as relevant and indeed historically the early support for their development came from this direction. Since most of the topics are dealt with at length in the main lectures, I will only highlight some of the most pertinent:

(a) Project management - training of staff in appropriate programming techniques; setting up standards; sub-dividing work into manageable parts; monitoring progress and quality.

(b) Product definition - specifying its function; defining user image; effects of host operating system.

(c) Documentation - selecting levels, methods and automatic aids; controlling quality; disseminating and updating.

(d) Design and implementation - this is a very large area but there is a particular problem with designing general purpose packages to operate in a multiprogramming environment where storage is at a premium - namely, how to combine generality and comprehensiveness with small size at run-time when applied to a simple particular case. This problem is of increasing importance and has design implications not only for the package but also for the operating system in which it runs.

(e) Problem-oriented languages - a recurrent theme of the course is the use of levels of language or hierarchies of abstract machines to provide a structure within which a programming problem may be solved. Most application programs use only two levels, one at the Fortran or Cobol level and one at assembly code, although sometimes a less formal flow chart level can be recognised. Techniques now exist for readily creating levels which are matched to the problem at hand and which can be automatically translated from one level to the next lower one. In my own field, several groups are using very high level languages in which one merely specifies the system of differential equations and the broad numerical methods to be used in their solution.

(f) Testing - generation of test data; use of test beds.

(g) Performance measurement - simulation; measurement tools; monitoring and optimisation.

(h) Maintenance and enhancement.

5. CONCLUSION
The help that the software engineer can provide the computer user falls into two parts: improvements to the computer systems that he has to use; and tools and techniques that he can make use of in his own work. In the former case greatest benefit will result if there is no sharp distinction drawn between software and hardware engineering.

6. REFERENCES
1  Wilkinson, J.H. & Reinsch, C. "Handbook for Automatic Computation, Vol. II Linear Algebra", Springer-Verlag, Berlin, 1971.
2  Rice, J.R. (Ed.), "Mathematical Software", Academic Press, New York, 1971.
3  Ford, B. "Developing a Numerical Algorithms Library", to appear in IMA Bulletin.
4  Poole, M.D., "Interim Report on A Stable User Image", Culham Laboratory Internal Report SEN 2/72.
CHAPTER 1.B

THE DESIGN AND CONSTRUCTION OF SOFTWARE SYSTEMS

Jack B. Dennis +
Massachusetts Institute of Technology
Cambridge, Massachusetts, USA

1. INTRODUCTION

Software Engineering is the application of principles, skills and art to the design and construction of programs and systems of programs. It is often asserted that software engineering is largely art and based very little on sound principle. Yet trends are visible and new ideas are developing that promise to substantially increase the role of theory and principle in the design and construction of software systems.

In this lecture, I wish to present a frame of reference for relating the material to be presented in this course. In addition, I shall try to assess the limitations of known principles for the practical needs of software engineering, and the prospects for broad future application of principle to the design and construction of software. This sketch will certainly be a very personal view of the field, for there is little published material that attempts to characterize software engineering.

The theme of this talk is that behind the absence of a satisfactory set of principles for the practice of software engineering lies the lack of adequate means for representing software and hardware system designs. Further development of the theoretical foundation for programming language semantics and system representation is required to overcome the limitations of contemporary software engineering.

+ The preparation of these notes was supported in part by the National Science Foundation under grant GJ-432 and in part by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Nonr-N00014-70-A-0362-0001.
2. TERMINOLOGY

In presenting a framework for discussing principles of software engineering we immediately encounter problems of terminology: What is "software"? What do we mean by "computer system"?

2.1. COMPUTER SYSTEMS

We shall use the term computer system to mean a combination of hardware and software components that provides a definite form of service to a group of "users". A particular computer installation appears as many different computer systems depending on the group of users considered. For example, in a general purpose computer installation that offers the ability to edit and interpret programs expressed in the language Basic [1], we can identify at least three distinct computer systems and corresponding user groups.

      system                                  user group

   1. the computer hardware                   operating system implementers
   2. hardware plus operating system          subsystem implementers
   3. hardware, operating system and          users of Basic
      Basic language subsystem

Any computer system defines a language in terms of which all software run on the computer system is expressed. I mean this in a very exact sense: A computer system provides representations for certain data types and information structures, and implements a set of primitive operations on these data types and structures. Let us consider the three cases mentioned above.

Suppose the computer system consists only of hardware (a processing unit and main memory, say). Then the data types correspond to the interpretations of memory words that are implicit in the built-in operations of the processor -- usually fixed and floating point representations for numerical quantities. In the absence of base registers in the processor, the information structures of this computer system are simply all possible contents of the main memory, selection of a desired component of a structure being accomplished through indexing or address computation. The effect of the interrupt feature of the computer must also be modelled in the language. The possibility of asynchronous interrupts makes the language defined by a hardware computer system nondeterministic; that is, there may be many successor states possible for a given state of the system.

When the central hardware is augmented by peripheral devices and an operating system, additional data types and classes of information structures are represented, new primitive operations are defined, and some features of the hardware are made inaccessible. One important addition is the availability of files as a representation for information structures -- data and programs. Separate address spaces are provided for each concurrent computation and a generalized means of referencing data items and programs is implemented. The absolute addressing mechanism of the hardware is often not available to the user. Similarly, the hardware facilities for process switching and interrupt processing are replaced by software primitives for interprocess communication, which are implemented by the scheduling modules of the operating system.

The operations and data structures of the language defined by hardware and operating system may be complex. For example, in this view, the action of a program linking loader must be considered as a primitive operation that transforms one information structure (representing a set of program modules generated by compilers) into a new information structure (a set of procedures linked together and assigned to the address space of a computation).

The inclusion of peripheral devices may alter the view the user has of the language of the computer system. In the absence of peripherals, the machine appears as a device into which one puts programs for execution. The language of the computer system is then the set of programs that can be represented in memory according to the computer system's instruction code. If users interact with a computer system from peripheral terminals, the system behaves as a device having a set of internal configurations and which responds to messages with answers depending on its extant configuration. The language of the system now appears to the user as a set of meaningful messages together with corresponding state transitions and conditioned responses.

Adding a software subsystem for the Basic programming language yields a third computer system. The language defined by it is a model for the commands and responses by which one interacts with the Basic subsystem from a user's terminal. In this language, the primitive data types and operations are those of Basic. Users have access to the language of the operating system only through use of the subsystem.

2.2. SOFTWARE SYSTEMS

The environment for a program consists of the computer system on which the program runs, together with any hardware components, other than those of the computer system, required by the program. By the term software system, we mean the software and hardware components that must be added to a specific computer system, called the host system, in order to realize some desired function.

We may illustrate in terms of the example cited above: For an operating system the host system may be the processing units and main memory hardware. The operating system is then a software system having many software modules and appropriate mass storage devices to hold files. This computer system may then serve as the host system for a software system that implements the language Basic. This software system consists of an editor, an interpreter, and a command processor. If the host system does not include a communications line controller, the implementer of Basic would find it necessary to add one to the host as part of the new system.

2.3. HIERARCHY
Hierarchical relationships occur in many forms in computer systems. Here, we will discuss just one form of hierarchy: the hierarchy of linguistic levels defined by successive layers of software. Each level of this hierarchy is a computer system characterized by the data types and primitive operations of its language. Each level is (or is potentially) the host system for the definition of new linguistic levels through the addition of further software systems.

Hierarchy is a tool of software engineering which, if properly used, permits the components of several levels of software to be designed and developed separately. Of course, separate development of system levels is only possible if the languages corresponding to the boundaries between layers have been precisely specified and agreed to. For success, the implementers of a software system should not find it necessary to alter any component of the host system. Such need would expose incompleteness or inefficiency of the host language for the objectives of the software system. This principle is often violated in practice, for example, when an inner layer of an operating system is modified so an accounting procedure may be implemented within an outer layer.

We distinguish three techniques often used to define a new linguistic level by a software system: extension, translation and interpretation. Often combinations of the three techniques are used.

1. Extension: In defining a new linguistic level by procedural extension, the software system added to the host system is simply a collection of procedures that express the primitive operations of the new level in terms of the primitive operations and types of the host system. New data types or classes of structure are implemented in this way and made available to users of the extended system in addition to the primitives of the host system. In using extension, the internal representations for procedures and data at both levels, host and new, are identical, syntactically and semantically.

2. Translation: Defining a new linguistic level by translation consists of writing a compiler to run on the host system that translates programs at the new linguistic level into programs in the language of the host system. The necessity of compilation as a step in running a program is characteristic of this technique. Representations of programs expressed in the language of the new level are not directly executed.

3. Interpretation: Defining a new linguistic level by interpretation consists of writing an interpreter for the language of the new level in terms of the data types and primitive operations of the host system. Programs at the new linguistic level are represented in directly executable form.
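The three techniques can be contrasted in a minimal sketch. It is an illustration under stated assumptions, not a method taken from these notes: the host system is taken to be Python itself, the new linguistic level is a toy postfix arithmetic notation invented for the purpose, and the names `dot`, `translate`, `run_translated` and `interpret` are hypothetical.

```python
import operator

# Primitive operations of the host that the new level is defined in terms of.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


# 1. Extension: a new primitive is simply a host procedure. It uses the
#    host's own data representations and leaves all host primitives available.
def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))


# 2. Translation: a compiler turns a postfix program into a program in the
#    host language; the postfix text itself is never executed.
def translate(program):
    stack = []
    for token in program.split():
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {token} {right})")
        else:
            stack.append(str(int(token)))  # accept integer literals only
    (expression,) = stack
    return expression


def run_translated(program):
    # Compilation is a separate, necessary step before the host can run it.
    code = compile(translate(program), "<translated>", "eval")
    return eval(code)


# 3. Interpretation: the postfix program is executed directly by a host
#    procedure that implements the primitives of the new level.
def interpret(program):
    stack = []
    for token in program.split():
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    (result,) = stack
    return result
```

In this sketch `translate("2 3 + 4 *")` produces host-language text that the host then executes, whereas `interpret` acts on the postfix text itself. That is the distinction drawn above between a compilation step and a directly executable representation.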
A software system may be designed so that all persons using the host system are required to do so at the linguistic level of the software system. An example is a computer run under a specific operating system which all users of the computer must use. Alternatively, several software systems may share the same host, as is the case that several programming language systems operate under the same executive control program.
Further, the definition of a new level may or may not deny the user access to part or all of the linguistic features of the host. The difference between use of extension and interpretation is that the primitives of the host system remain available at the new level in extension, whereas this is usually not the case when interpretation is used.

It would seem that the technique of procedural extension ought not be considered as defining a new linguistic level unless the added procedures are grouped in a way that defines hierarchical relations. That is, the new linguistic level is a collection of procedures, and users of the new level are prevented from using procedures outside the collection.

Interpretation is often done because of a need to utilize a fundamentally different form of data organization, or to obtain program monitoring and control features not possible at the host level. Examples are the use of interpretation in application packages and in the implementation of command languages of operating systems. In these cases, the language of the host is essentially incomplete for the objectives of the software system.

Translation and interpretation are fundamentally different in the following respect: Two compilers for different source languages, implemented for the same host system, produce compiled procedures in the language of the host. If standard procedure interfacing conventions of the host are honored by both compilers, then programs expressed in the two source languages may be operated together successfully. In contrast, if interpreters for two source languages are written in the language of the host, then communication between procedures expressed in the two languages will be difficult if not impossible, unless carefully coordinated planning is done by the implementers. Each interpreter will likely use entirely different representations for equivalent data types, hence each call on a procedure expressed in the other language would have to cause switching of interpreters and translation of data to be communicated.
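The cost of a call between two interpreted languages can be made concrete with a small sketch. The two representations below are assumptions chosen only for illustration: a hypothetical interpreter A holds a sequence as a Python tuple, while a hypothetical interpreter B holds the same sequence as nested (head, tail) pairs ending in None.

```python
def a_to_b(sequence):
    """Translate A's tuple representation into B's nested pairs."""
    cell = None
    for item in reversed(sequence):
        cell = (item, cell)
    return cell


def b_to_a(cell):
    """Translate B's nested pairs back into A's tuple representation."""
    items = []
    while cell is not None:
        head, cell = cell
        items.append(head)
    return tuple(items)


def b_total(cell):
    """Stands for a procedure expressed in language B: sums a B-style sequence."""
    total = 0
    while cell is not None:
        head, cell = cell
        total += head
    return total


def call_b_from_a(procedure, sequence):
    """A call from A on a procedure of B: translate the data, then switch."""
    return procedure(a_to_b(sequence))
```

Every call that crosses the boundary pays for `a_to_b`, and any sequence returned would need `b_to_a`. Two compilers that honor the host's interfacing conventions avoid this, because both produce procedures that share the host's representations.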
2.4. SYSTEM AND APPLICATION SOFTWARE

Traditionally "system program" refers to the layers of software that "belong" to a computer installation and are available to all clients of the installation; "application software" refers to the software brought to an installation by a client for performing his desired computation. This distinction between system and application software has lost meaning with the evolution of more sophisticated uses of computer systems. For example, one client of an installation may implement a new programming language and make it available to other clients of the installation. Or an installation may be devoted entirely to a particular application as in the case of real-time systems such as reservation and inventory systems.

Nevertheless, by using the concepts and terminology discussed above, we may list certain distinguishing characteristics that will serve to crudely classify software as system software or application software for the purposes of subsequent discussion.

system software: A collection of system programs usually forms a hierarchy of software systems having these properties:

1. The collection of programs is implemented under one authority.
2. The hierarchy of software systems defines a single linguistic level which applies to all users of the collection of programs.
3. Inner linguistic levels of the hierarchy are hidden from the user.
4. The outer linguistic level of the hierarchy is "complete" for the goals of the implementing authority.
5. The primary means of defining new linguistic levels is partial interpretation.

application software: An application program or software system usually has these properties:

1. The programs are expressed in terms of a "complete" linguistic level.
2. The programs define a new linguistic level by extension, translation, interpretation, or by some combination of these techniques.
3. The linguistic level defined by the program or software system is inadequate for defining further linguistic levels.
4. A variety of such programs or software systems are available to clients of an installation, and are often implemented under different authorities.
3. DESCRIPTION OF SOFTWARE SYSTEMS

The design and construction of a software system is, fundamentally, the creation of a complete and precise description of the system. The description of a software system is a collection of descriptions of its software and hardware components.

The complete and precise description of a software component is in reality a program expressed in a well-defined programming language. If this language is the language of the host system, or the translation of the program to the linguistic level defined by the host is strictly a clerical operation, then preparing the program completes the process of constructing the system component. Otherwise implementation of the component is incomplete until a correct representation is prepared at the linguistic level of the host system.

In the case of a hardware component, a description is adequate only if it permits the designer of the software system to determine exactly the relevant behavior of the component for all situations that may occur during operation of the software system. Statements of interfacing conventions are insufficient, for these do not describe the function performed by the hardware component. Usually, the description must take the form of a model of the internal operation of the component.

Besides descriptions of its hardware and software components, two further descriptions are required: an adequate description of the linguistic level of the host system, and a description of the linguistic level the software system is intended to realize. The semantics of the linguistic level of the host system must be known before the components of outer software layers can have exact representations. Of course, the objectives of the system must be known before final designs of all of its components can be specified.

4. FUNCTION, CORRECTNESS, PERFORMANCE AND RELIABILITY

The designer of a software system wishes to achieve certain goals. The goals are expressed in terms of four kinds of properties desired of the completed software system: function, correctness, performance, and reliability. Let us consider the state-of-the-art in each of these four aspects of software systems and the directions in which further development of principle is needed.
4.1. FUNCTION

The function of a software system is the correspondence desired of output with input. Input is all information absorbed by the software system from outside the host system; output is all information delivered outside the host system. Information held by a software system between interactions with the outside is covered by this view, since such information either is the result of processing information received as input, or should be considered part of the software system, its effect then being incorporated in the mapping of inputs to outputs.

In the case of application software, the function of a software system depends on what one takes as the host system. For example, the data base for an application may be internal if the host system provides a data management facility, or it may be external if the data base is on a set of tapes not part of the host system.

In the case of system programs, the function of a collection of system programs is to implement a specified linguistic level. A linguistic level is adequately defined only by a model of a class of system states, and a state-transition function which, together, give the equivalent of a formal interpreter for the level.

There is a rapidly growing body of formal knowledge applicable to many aspects of the representation of programs and systems. Some of this material is listed below:

1. Semantic models for programming languages.
      the lambda calculus [2]
      the contour model [3, 4]
      Vienna definition method [5, 6]
      program schemas [7, 8]

2. Concepts relating to interacting concurrent activities
      Petri nets [9]
      processes, semaphores, determinacy [10]
      modularity [11]

3. Fundamentals of classes of algorithms
      numerical methods
      symbolic algorithms (e.g. sorting, theorem proving)
      parsing methods

Although the theoretical foundation for programs and systems is fast developing, there is as yet no generally accepted representation scheme that has a precisely known semantics and is sufficiently general to meet the descriptive needs of software system designers. Areas in which the theoretical development has not yet provided an accepted synthesis of concepts are:

1. Representation of concurrent activities and their interaction.
2. The sharing of procedures and data among computations.
3. Representation of data structures which change in content and extent during computation.
4. The notions of ownership, protection, and monitoring.

The consequence of this state of affairs is that designers of computer systems adopt different sets of primitive data types and operations as the basis for the design of the inner layers of hardware and software. Then, in realizing a standardized linguistic level such as a FORTRAN programming system the system designer employs these primitives to implement the standardized aspects of the language. Nevertheless, the implementer is usually forced to implement extensions of the language so application programmers may make use of unstandardized linguistic features of the host. Since the primitives in terms of which these extensions are defined are different for different computer systems, the extensions are unlikely to be compatible, and portability of the application software is lost.

This discussion underscores the need for better understanding of the semantic issues listed above.
Suppose a computer system is developed as a hierarchy of several linguistic levels. Then the data types and primitive operations used at each linguistic level are restricted to those implemented at deeper levels. Often a single language (a system programming language) is advocated for representing software components at all levels within the system. In this case, either the language can include only the linguistic features implemented at the innermost level (the hardware), or restrictions must be placed on use of linguistic features depending on the level for which software is being written. Certain essential hardware features such as interrupt mechanisms, processor faults, and protection features are not usually incorporated as linguistic features of the system programming language, and recourse must be made to machine language procedures. In this way, the system programming language is extended to encompass the primitives and linguistic features that are required to implement its higher level features but are not directly encompassed by the system programming language.

Thus a system programming language provides primarily a syntactic structure permitting easy use of linguistic features common to all linguistic levels at which it is used. The degree to which a system programming language aids in simplifying the design and programming of system software depends critically on the generality of the set of linguistic features that are common to all software levels of the computer system.

4.2. CORRECTNESS
Correctness of a software system means correctness of its description with respect to the objective of the software system as specified by the semantic description of the linguistic level it defines. Regardless of the approach adopted to favor correctness of a software system, it is always the responsibility of the designer of the system or system component to convince himself of the correctness of some description of the system or component. One would like this description to be as simple as possible, for example, a simple relation of output to input.

Two approaches to the correctness of systems have been suggested:

1. Structured programming [12]: The use of a programming style that makes the correctness of a program self-evident to the author. Greater use of structured programming is limited by the need for linguistic features not found in established programming languages. Use of structured programming may be encouraged by use of languages that disallow troublesome linguistic features such as goto statements and side effects.

2. Proof of correctness [13]: To prove correctness of a software system or component, one establishes by logical deduction that some description of the system or component asserted to be correct by the designer is equivalent to the description of the system or component expressed at the host level. In the case that the host level description is the result of automatically translating the designer's description, proving the correctness of the translator suffices. In other cases mechanically generated proofs or man-machine proof generating systems are required for this approach to be effective, and the semantics of the host language must be correctly axiomatized for the proof generator. This approach is beginning to be used experimentally. Although it is questionable whether establishing correctness by proof will become a practical technique, the research is yielding useful knowledge for improving the design of programs and languages.
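The two approaches can be illustrated on a very small scale. The following Python sketch is an assumption of this edition, not taken from the lecture: the function name `divide` and the choice of integer division are hypothetical. It is written in a structured style (one entry, one exit, no jumps), and the "simple relation of output to input" is stated as assertions, so that the argument a proof would make is visible in the text of the program.

```python
def divide(dividend: int, divisor: int) -> tuple[int, int]:
    """Integer division by repeated subtraction, one entry and one exit."""
    # Precondition: the input side of the input/output relation.
    assert dividend >= 0 and divisor > 0

    quotient, remainder = 0, dividend
    # Loop invariant: dividend == quotient * divisor + remainder, remainder >= 0.
    while remainder >= divisor:
        remainder -= divisor
        quotient += 1
        # The invariant is re-established after every pass through the loop.
        assert dividend == quotient * divisor + remainder and remainder >= 0

    # Postcondition: the invariant together with the negated loop condition.
    assert dividend == quotient * divisor + remainder
    assert 0 <= remainder < divisor
    return quotient, remainder
```

The assertions only check the relation at run time for the inputs actually supplied; a proof of correctness in the sense of the text would establish the same relation by deduction for all admissible inputs.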
4.3. PERFORMANCE

Performance of a software system is the effectiveness with which resources of the host system are utilized toward meeting the objective of the software system. The demands on a contemporary software system usually cannot be modelled exactly, and statistical characterizations must be employed. The theoretical foundation for performance studies is Markov processes and queuing models, for these models of stochastic service systems are amenable to analysis. In software systems where the demands can be reasonably well determined by observation, for example, in real-time transaction systems, statistical analysis has provided valuable predictions of performance to system designers.

On the other hand, performance analysis has so far failed to provide adequate methods for predicting the performance of software systems where the applications are unknown. This state of affairs is due to two difficulties, both stemming from the lack of generally accepted representation schemes for software. One difficulty is the absence of a satisfactory model of resource usage for application programs to be implemented at the new linguistic level. For each design of a software system, a new model of program behavior represented at the new linguistic level has to be formulated and validated before it can be used to extrapolate performance data. These models have not been useful for predicting performance of a tentative system design. The other difficulty is that the software system itself is not represented in a generally accepted notation, and no standard techniques of performance analysis are available for direct application to the descriptions of software systems.

The main point of these remarks is that our ability to analyze and predict performance of software systems is limited by the inadequacies of available description schemes rather than by the inadequacy of statistical methods. After all, approximate answers to performance questions are often satisfactory, but there is no such thing as a satisfactory approximate description of function.

4.4. RELIABILITY
Reliability is the ability of a software system to perform its function correctly in spite of failures of computer system components. By failure of a component we mean a temporary or permanent change in its characteristics that alters its function. Software does not fail. What is often referred to as "software failure" is a matter of correctness. Nevertheless, one must recognize the high likelihood of incorrect software being present in a complex software system. The design of a system as a set of minimally interacting modules using principles of structured programming can limit effects of software bugs to the modules and data structures that depend on correctness of the module in error. The possibility of realizing practical systems constructed according to this principle depends on new fundamental knowledge of structured programming and modular systems.

If a software system has no hardware components, then component failures can only occur within the hardware components of the host computer system. In the ideal host system, failures of its hardware would not be observable at the linguistic level defined. Some current work [14] on fault-tolerant and self testing and repair computer architecture is directed toward realizing this ideal, but is still far from solving the problem in the context of general purpose computer systems. Most reported work on reliability is concerned with the detection of failures and does not attempt to cope with the loss of information that inevitably accompanies hardware failure. We need concepts of computer organization that will permit the construction of computer
systems in which single internal failures do not produce observable effects.

Since the ideal host system is not now available, some hardware failures will affect operation of software systems implemented at the linguistic level of the host. Then a description of the host system is not complete without a specification of the possible modes of failure, and the resulting effects observed at the host linguistic level. A software system to be implemented on such a host is not completely specified unless the action to be taken for each failure mode of the host is described.

At present we must be satisfied with software systems even if they occasionally fail with irrecoverable loss of information. For it is not known how to construct an infallible software system using a fallible system as host. Although there are systems (such as the American Airlines Sabre system and the Bell System's Electronic Switching System No. 1 ESS) that come close to providing complete protection against all single failures, the techniques used do not generalize easily to computer systems intended for general application.

5. SOFTWARE PROJECTS

Large projects for the design and construction of software systems are notorious for their delays in meeting specified objectives. A large project is one in which two or more levels of management are required, and hence the key personnel are not in continuous communication with one another. In a large project it is necessary to divide the work to be done into units for assignment to project teams. Any unit of work which in itself amounts to a large project must be further subdivided. The best division of work into units is that which minimizes the interaction between units. Two kinds of structure may be used as a basis for the subdivision of work: hierarchy and modularity.

Suppose a project team is assigned the construction of some module of a software system. The team's task is completely defined by a precise specification of:

1. The function of the module.
2. The linguistic level of the host system.
3. The performance required of the module.
4. The performance capability of the host.

In practice this information is at best only partially known by a project team at the time it is expected to begin work. It is often still incomplete at the time the team is expected to have a usable version of the module ready for integration with other system components. Clearly, the most crucial information required by a project team is a precise definition of the linguistic level of the host system: it is impossible for the team to produce a correct description of any part of the module unless the semantics of the host system are known.

Iteration of design is frequently found to be necessary in large software projects. Iteration occurs when it is discovered that decisions already made prevent realization of overall system objectives. The most serious design iteration is where more than one linguistic level is affected, as the description of all software modules in outer layers may be invalidated by a change to a host system. The need for iteration arises in several ways: Sometimes it is found that certain linguistic features needed to implement a software system are impossible to realize in terms of the primitive constructs of the host level. Then the semantics of the host level must be revised to meet the need. In other cases, it is found that the performance objectives of a software system cannot be achieved without altering the specification of host level function.

These observations bring out the importance of having a precise specification of the host system before beginning construction of components of a software system. For each additional layer included in a software system, either the project must be extended to allow time for the precise formulation of the new linguistic level, or work on several linguistic levels must overlap, raising the risk that design iteration will be required. The need to implement several linguistic levels within one project would be circumvented if a host computer system were available that realized a complete and satisfactory linguistic level for the objectives of the project.

These arguments reinforce the need for better understanding of fundamental linguistic constructs for building software systems and the development of corresponding principles of computer system architecture.
When this understanding has been gained, perhaps there will no longer be any need for large software projects.

6. ACKNOWLEDGEMENT

The author wishes to express his thanks to Professor Jerome Saltzer, whose incisive comments on an early draft have been valuable in the preparation of these notes.

7. REFERENCES

1. J. G. Kemeny and T. E. Kurtz, BASIC Programming. John Wiley and Sons, Inc., New York 1967.
2. P. J. Landin, A correspondence between ALGOL 60 and Church's lambda-notation. Part I: Comm. of the ACM, Vol. 8, No. 2 (February 1965), pp 89 - 101. Part II: Comm. of the ACM, Vol. 8, No. 3 (March 1965), pp 158 - 169.
3. J. B. Johnston, The contour model of block structured processes. Proceedings of a Symposium on Data Structures in Programming Languages, SIGPLAN Notices, Vol. 6, No. 2 (February 1971), pp 55 - 82.
4. D. M. Berry, Block structure: retention or deletion? Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, May 1971, pp 86 - 100.
5. P. Lucas and K. Walk, On the formal description of PL/I. Annual Review in Automatic Programming, Vol. 6, Part 3, Pergamon Press, 1969.
6. P. Lucas, P. Lauer, and H. Stigleitner, Method and Notation for the Formal Definition of Programming Languages. Technical Report TR 25.087, IBM Laboratory Vienna, June 1968.
7. M. S. Paterson, Decision problems in computational models. Proceedings of an ACM Conference on Proving Assertions About Programs, SIGPLAN Notices, Vol. 7, No. 1 (January 1972), pp 74 - 82.
8. A. P. Ershov, Survey paper on program schemata, presented at the IFIP Congress, Ljubljana, 1971.
9. A. Holt, F. Commoner, S. Even, and A. Pnueli, Marked directed graphs. J. of Computer and System Sciences, Vol. 5, No. (1971), pp 511 - 523.
10. E. W. Dijkstra, Co-operating sequential processes. Programming Languages, F. Genuys, Ed., Academic Press, New York 1968. (First published as Report EWD 123, Department of Mathematics, Technological University, Eindhoven, The Netherlands, 1965.)
11. S. S. Patil, Closure properties of interconnections of determinate systems. Record of the Project MAC Conference on Concurrent System and Parallel Computation, ACM, New York 1970, pp 107 - 116.
12. E. W. Dijkstra, A constructive approach to the problem of program correctness. BIT (Nordisk Tidskrift for Informations-behandling), Vol. 8, No. 3 (1968), pp 174 - 186.
13. Z. Manna and R. J. Waldinger, Toward automatic program synthesis. Comm. of the ACM, Vol. 14, No. 3 (March 1971), pp 151 - 165.
14. A. Avizienis, G. C. Gilley, F. P. Mathur, D. A. Rennels, J. A. Rohr, and D. K. Rubin, The STAR (Self-Testing and Repairing) computer: an investigation of the theory and practice of fault-tolerant computer design. IEEE Trans. on Computers, Vol. C-20, No. 11 (November 1971), pp 1312 - 1321.
CHAPTER 2.A.
HIERARCHIES

Gerhard Goos
University of Karlsruhe
Karlsruhe, Germany

0. INTRODUCTION

Large software systems are usually subdivided into many components; every component solves a subproblem into which the original problem can be split. The decomposition influences not only the properties of the final system; the implementation effort itself is influenced in several respects.

There are only very few ideas about the methodological question how the decomposition can be achieved best. The most elaborated one is the well-known engineering principle of building complex components from simpler ones, i.e. establishing a hierarchical order of components. This lecture is concerned with the application of this principle for the construction of software.

1. HIERARCHICAL ORDERING AS A DESIGN STRATEGY

The design of a software system starts from a description of the problem to be solved and the available host system (in the sense of [1]), i.e. machine and programming languages. The problem may be formally represented by an abstract machine which, from the input-data for a particular problem, produces some output-data solving this case of the problem. The gap between the host system and the problem is now bridged in two steps:

First a set of program components (procedures, coroutines, asynchronous processes) is defined by specifying their interfaces to the outside. Every component either solves a part of the original problem, implements some functions of the abstract machine, or it implements some functions needed in defining other components. Via their interfaces the components communicate with each other by various interconnections: procedure calls, use of common data, exchange-jumps (in case of coroutines), synchronization primitives. We call this first step the gross design of the system.
As a second step the internal behaviour of each component is defined. Since we know the interfaces of the component to the outside, the component can be separately considered and the same principles can be applied to the design of the component as to the system as a whole.
The ideas may be illustrated by considering the construction of a file system of an operating system. The file-system may be subdivided into four components:

- the basic I/O routines for the disc
- the storage allocation on disc for the files
- the handling of directories, protection-mechanisms etc.
- the implementation of access functions to files and directories, based on the I/O routines mentioned before.

The result of the gross design can be represented as a network of components (fig. 1.). Every arrow represents an asymmetric communication line between components, e.g. a possible procedure call. Symmetric communication lines, e.g. use of common data, are represented by two arrows. The network is a directed graph of arbitrary complexity. This complexity may cause trouble concerning the following objectives of software-design:

- The design should allow at every stage to convince oneself of the correctness of the designed program as far as it is already known. One should not use design techniques which increase the probability that one must go back revising large parts of earlier design decisions because of errors found too late. In practice such techniques very often imply that errors are never corrected.

- Programs are very often modified either during design, production or later to meet modified requirements or different resources. Therefore the original design should produce a program structure in which the components are as independent from each other as possible. At least an overview of all consequences of changing a particular design decision must be possible.
[Fig. 1]
- Design, production and maintenance of software by a larger group of people requires that the problem to be solved must be split into manageable tasks. This also requires that the interdependence of these subtasks is kept to a minimum. At the same time the dependencies between components must be clearly presented not only in principle but also in detail. Otherwise people will always have trouble with each other: Either they have to spend too much time in communicating, since one cannot do their work without getting all necessary informations about the environment in which it is to be used. Or they do not get the information at all since they do not know which information they need and where it can be found.

These objectives are hardly met when - as in figure 1 - the network shows interdependencies, e.g. cycles, between all the program parts which cause difficulties in overviewing the implications of design decisions and modifications. Obviously such difficulties will also make it impossible to overview the grounds on which further design decisions have to be based.

Hence the objective is to reduce the interrelations between program parts. Since cycles cannot be excluded completely - every set of cooperating processes in the sense of Dijkstra [2] constitutes such a cycle - all what we can do is to minimize the number of program parts belonging to a cycle. We arrive at a partially ordered set of layers (fig. 2.) each of which is either a single program component or a cycle of program components. Very often we are forced to build a tree (fig. 3.) or a linearly ordered set of layers (fig. 4.). The term layer was originally introduced by Dijkstra [3]. The principle of structuring a system into a partially ordered set of layers is called hierarchical ordering.

Hierarchical ordering is a successful technique for systems design because it forms a clearly conceived scheme for the gross design: the set of layers - in the latter case only - is set up from more elementary program components in a clear way. We get maximum insight in the completeness and correctness of the interfacing components. At the same time, hierarchical ordering splits the system into different tasks in such a way that we can hope everybody will get a clear picture of what the component contributes to the system as a whole and how a component is to interface to other components.
[Fig. 2: Partially ordered set of layers]

[Fig. 3: Tree-like structured program]

[Fig. 4]
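The step from an arbitrary network of components to a partially ordered set of layers, each layer being a single component or a cycle of components, can be sketched mechanically. The following Python fragment is an illustration only: the component names follow the file system example, but the arrows in `FILE_SYSTEM` (in particular the cycle between directory handling and storage allocation) are hypothetical, and the use of Tarjan's strongly-connected-components algorithm is a choice of this sketch, not a method stated in the lecture.

```python
def strongly_connected_components(uses):
    """Tarjan's algorithm: group components that lie on a common cycle.

    Groups are returned so that a group appears after every group it uses.
    """
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    groups = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for target in uses.get(node, ()):
            if target not in index:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index[target])
        if lowlink[node] == index[node]:
            # node is the root of a group: pop the whole group off the stack.
            members = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                members.add(member)
                if member == node:
                    break
            groups.append(frozenset(members))

    for node in uses:
        if node not in index:
            visit(node)
    return groups


def layers(uses):
    """Map every layer (single component or cycle) to its height in the order."""
    groups = strongly_connected_components(uses)
    layer_of = {name: group for group in groups for name in group}
    level = {}
    for group in groups:  # layers that are used come before their users
        below = {layer_of[t] for name in group for t in uses.get(name, ())}
        below.discard(group)
        level[group] = 1 + max((level[b] for b in below), default=-1)
    return level


# Hypothetical arrows "component -> components it calls" for the file system.
FILE_SYSTEM = {
    "access_functions": ["directories", "io_routines"],
    "directories": ["storage_allocation", "io_routines"],
    "storage_allocation": ["directories", "io_routines"],
    "io_routines": [],
}

if __name__ == "__main__":
    for layer, height in sorted(layers(FILE_SYSTEM).items(), key=lambda p: p[1]):
        print(height, sorted(layer))
```

With these assumed arrows the two mutually dependent components end up in one layer, the I/O routines form the lowest layer, and the access functions the highest; the size of the largest layer is a direct measure of how many program parts belong to a cycle.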
Hierarchical ordering is achieved by a systematic way of thinking about how the objectives of the system can be met. This is our next subject.

1.1. LEVELS OF ABSTRACTION

Let us assume we have to solve some numerical problem P_0 on some computer. Usually we shall solve P_0 by a program written in some high-level language, e.g. ALGOL. Provided we know the mathematical algorithms for solving our problem well enough this program is straightforward and causes no particular problems from the programmer's point of view. This, however, is a consequence of the fact that it is relatively well-known which programming tools are required for numerical applications and that ALGOL allows for expressing algorithms fairly easily by these tools.

In fact, we have not yet solved our problem by programming it in ALGOL. In addition we must supply an implementation of ALGOL on our computer. Hence our solution up to now was only a reduction of the original problem P_0 to another problem P_1, the implementation of ALGOL. P_1 is solved by (a compiler and) a run-time system for ALGOL incorporating - e.g. - storage-allocation, storage-access routines for addressing of multi-dimensional arrays, I/O and standard functions. On smaller machines this run-time system may be implemented directly. In a multi-programming environment we run many programs written in different high-level languages simultaneously. Thus we have to implement many run-time systems. Each of these reduces a problem of type P_1 to the problem P_2 of implementing a resource allocation scheme which distributes the resources available on the computer to the different users. P_2 is solved by the operating system.

This example shows the following significant properties:

- The original problem P_0 is solved not by one program but by a number of program layers: ALGOL program, run-time system, operating system.

- Each of these layers solves a problem P_i by means of a certain set of programming tools. The implementation of these tools constitutes problem P_{i+1}.
The tools f o r the f i n a l problem are the properties of the hardware. Except that every layer implements the tools for the ~regoing one, the layers are completely independent. At least conceptually, when w r i t i n g the
ALGOL program
we are not concerned with the d e t a i l s of how the elementary constructs of
are implemented. Conversely,
ALGOL
when w r i t i n g the operating system or the run-time system we are not concerned with the properties of grams for which we supply the t o o l s .
ALGOL pro-
(Exceptions from
this rule of independence may arise from e f f i c i e n c y considerations.) To be more general, the method which we have applied to t h i s example may be expressed as follows :
To s o l v e LLGOL ted.
a problem we choose an a p p r o p r i a t e
machine in
the example above, on which
The machine is
appropriate
if
it
which we have e x p r e s s e d the a l g o r i t h m se n o t i o n s bilities yields
abstract
must c o n t r i b u t e
o f the h o s t system. a sequence o f a b s t r a c t
~he problem is
implements for
to r e d u c i n g
the b a s i c
the problem.
the o r i g i n a l
Repetitive
machine,
the
implemennotions
Of c o u r s e ,
by the-
problem to the capa-
application
machines the l a s t
e.g.
of this
of which
principle
is
identical
to the g i v e n h o s t system. By every abstract machine of t h i s sequence we abstract from some det a i l s of the previous one and of the o r i g i n a l problem. I t constitutes a
level of abstraction
on the way from the o r i g i n a l problem to the
host system. Conversely : Every abstract machine abstracts from some properties of the host system using i t
f o r implementing some new tools
which are better suited for the intended application.
a
level of abstraction problem.
on t h e w a y
So i t
constitutes
from the h o s t system to the o r i g i n a l
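The sequence of abstract machines can be made concrete in a few lines. The following Python sketch is purely illustrative and rests on assumptions of this edition: the class names, the operations `allocate`, `read`, `write`, `new_array`, `get`, `put`, and the toy problem `trace` are all hypothetical stand-ins for the hardware, the operating system, the run-time system and the ALGOL program of the example. Each class holds a reference only to the machine directly below it and offers tools better suited to the layer above.

```python
class Hardware:
    """The host system: a flat, word-addressed store."""

    def __init__(self, size):
        self._words = [0] * size

    def size(self):
        return len(self._words)

    def load(self, address):
        return self._words[address]

    def store(self, address, value):
        self._words[address] = value


class OperatingSystem:
    """Abstracts from absolute addresses: the store is handed out as segments."""

    def __init__(self, hardware):
        self._hw = hardware
        self._segments = []  # (base, length) for every segment number
        self._next_free = 0

    def allocate(self, length):
        if self._next_free + length > self._hw.size():
            raise MemoryError("store exhausted")
        self._segments.append((self._next_free, length))
        self._next_free += length
        return len(self._segments) - 1

    def _address(self, segment, offset):
        base, length = self._segments[segment]
        if not 0 <= offset < length:
            raise IndexError("offset outside segment")
        return base + offset

    def read(self, segment, offset):
        return self._hw.load(self._address(segment, offset))

    def write(self, segment, offset, value):
        self._hw.store(self._address(segment, offset), value)


class RunTimeSystem:
    """Abstracts from segments: offers two-dimensional arrays."""

    def __init__(self, operating_system):
        self._os = operating_system

    def new_array(self, rows, columns):
        return (self._os.allocate(rows * columns), rows, columns)

    def get(self, array, i, j):
        segment, _rows, columns = array
        return self._os.read(segment, i * columns + j)

    def put(self, array, i, j, value):
        segment, _rows, columns = array
        self._os.write(segment, i * columns + j, value)


def trace(runtime, n):
    """The original problem, expressed only in the tools of the top machine."""
    a = runtime.new_array(n, n)
    for i in range(n):
        for j in range(n):
            runtime.put(a, i, j, i + j)
    return sum(runtime.get(a, i, i) for i in range(n))
```

A call such as `trace(RunTimeSystem(OperatingSystem(Hardware(64))), 3)` runs the top-level program on the full chain. Note that `trace` never mentions segments or addresses, and `RunTimeSystem` never touches `Hardware`; replacing one layer leaves the layers that are not its direct neighbours untouched.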
In introducing the term level of abstraction, E. W. Dijkstra used the bottom-up approach and stated the following properties of the abstract machines (which are now numbered A_0, A_1, ..., A_n, starting from the host system):
- The resources and the functions provided by A_i form the complete basis on which to build A_{i+1}. There is no way to use properties of A_{i-1} in building A_{i+1}. Hence, every A_i is a complete interface-description in the hierarchy.

- Resources of A_{i-1} used in defining new resources of A_i can no longer be present in A_i.

- The correctness of the solution of the final problem can be asserted by stepwise proving the correctness of the implementation of each abstract machine A_{i+1}.

The last assertion is obvious. Modularity is achieved by the first property. The second rule is trivial but it is mentioned here because it is often overlooked and violated in practice. However, there may be properties of A_i identical to some properties of A_{i-1}. But in using these for constructing A_{i+1} we have to consider them as properties of A_i, and we have to forget about whether a property of A_i is newly constructed in the layer yielding A_i or merely preserved from the previous level.

The bottom-up approach also shows the way in which we achieve tree-like hierarchical structures: Based on an abstract machine A_i many different machines A_{i+1} may be implemented sharing the resources provided by A_i. Top-down design shows one path in the tree only because it assumes that there is only one problem to be solved.

1.2. THE ORDER OF DESIGN DECISIONS

The last remark shows that top-down design is not always appropriate. In fact, if the problem to be solved can be split into various sub-problems which have to be solved simultaneously, we shall get layers implementing sharing of resources between different program components and the possibly necessary synchronization. These layers cannot be designed working downwards from a subproblem creating a sequence of abstractions. In this case bottom-up design is more appropriate.

Obviously this consideration is concerned with the order in time in which design decisions are taken. The conceptual ordering is a consequence of the ordering in time. In general working through the levels only once is insufficient. Instead we must iterate one or more times revising earlier decisions until we get the system balanced. Although such iterations show that these earlier decisions were based on wrong assumptions we must often start from unproven assumptions if we want to start at all. This raises the question how to get a stable and correct gross design as fast as possible.

Top-down design without iteration is useful for that purpose if the problems discussed in the beginning of the paragraph are not involved and if moreover the following conditions are met:

- The problem is described in a fairly constructive manner.

- It must be known in advance, e.g. by experience, that from the given description of the problem a solution can be derived which is efficiently implementable by available resources.

Conversely, for using bottom-up design the host system must be precisely known and experience must allow to derive the next levels such that we really approach the problem to be solved. In either case we must assure for each layer that we have not forgotten any major feature needed in the abstract machines below and above respectively.

The first of the conditions mentioned above is satisfied by most problems stated by a set of formal conditions to be satisfied by the solution. It is never fulfilled when the validity of the solution depends not only on correctness and efficiency but also on such terms as convenience to the user, range of applicability etc. as in the case of operating systems. Analogously bottom-up design should not be applied if the underlying hardware configurations may vary in a wide range or if the resulting software has to be portable.

Whether the second condition is satisfied or not depends mostly on the people involved. One must be able to make a useful choice amongst different alternatives for solving partial problems knowing in advance that the decision never has to be revised. Of course, one must use iteration when one starts by investigating whether the problem description is suitable for straightforward implementation. E.g., since language definitions usually do not allow for straightforward solutions, compilers are designed with iteration.
To avoid iterations, efficiency problems must be considered carefully. It must be noted that every abstract machine Ai is slower than all the underlying machines. The execution of some machine-instructions for that machine involves the call of some procedures of the layer below. So the latter works using a smaller grain of time. Unfortunately this remark may apply to operations occurring very frequently which perhaps could be implemented much more efficiently by circumventing the hierarchical order. Careful analysis should exhibit such critical operations in advance so that they can be placed in a layer as low as possible in order to speed them up.

To summarize, design without iteration looks like throwing a ball into a hole. Whether we succeed depends on the size of the hole and of the ball as well as on our knowledge about the position of the hole and our experience in throwing.

In general we cannot hope to succeed by top-down or bottom-up design only. There are too many problem areas which cannot be related together correctly in the first attempt. Another reason might be that the implications of introducing certain algorithms or data structures cannot be overviewed immediately. Apparently whether or not such arguments apply to a particular design depends on the previous experience of the designers.

In such cases we can start using any design strategy mentioned above. But after we have gone through once we have to go back revising earlier decisions or - in the worst case - starting over again from the beginning. Revisions are based on the insights we have got in designing other parts of the system or in developing details of the proposed gross design.

If there are subproblems whose solution seems to influence strongly the design of other parts of the system, we can also start the design somewhere in the middle of the system instead of proceeding top-down or bottom-up.
Operating systems are often designed in this way starting from decisions on memory allocation. Also simulation experiments may be a good starting point. Those problems are further discussed by S. Gill [4], Zurcher and Randell [5] and Randell [6].

As Dijkstra [3] points out, it is useful that the final design is thought to be achieved in the bottom-up manner regardless of how it was really achieved: At least during testing it is much better to consider the layers in sequence starting from the bottom than to provide a test environment for each layer. - In practice, such artificial environments are only useful if the interfaces are very simple.

2. HIERARCHICAL ORDERING AND LANGUAGES
Each level of abstraction in a hierarchically ordered system introduces a new programming language. The skeleton of this language is given by the catalogue of admissible operations on that level. Other concepts - data types, resources etc. - are introduced as the attributes of parameters of these operations. The set of operations may be viewed as the set of instructions of a computer, and it is this view which leads to the term abstract machine. Of course, to be a convenient basis for programming the language should have some flesh around this skeleton.

Considering levels of abstraction as programming languages introduces a set of criteria, e.g. programming convenience, range of applicability, adaptability, portability, and implementability. These criteria are particularly important in the development of application software. Our subject is not to apply them to a particular level of abstraction but to relate them to the hierarchy of levels.

2.1. ABSTRACT MACHINES AND THE PRODUCTION PROCESS

Hierarchical ordering was introduced as a means for structuring the design and thus the final product. However, it influences the production and maintenance phase also. Therefore we must observe some additional rules in designing the different abstract machines.

The first rule is very simple: system programmers are also programmers. Therefore convenient test facilities, appropriate means of storage administration, procedures converting between different data types etc. should belong to the lowest possible level, not only to the user oriented topmost level.

Secondly the production of portable software requires that there is an intermediate level which easily can be implemented on all available computers. This level is not necessarily the lowest one.
E.g., to implement a string manipulation system on a word-oriented computer, the lowest level should provide the basic string operations. If these facilities are already provided by a certain hardware then we can transfer the system to this computer implementing the second level only. The second level is thought to be the common base for all computers. Apparently, all machine dependent features are below that level.

Thirdly, adaptability to new applications can be best achieved if there is an intermediate level yielding all those functions which are substantial to the system. Thus adapting means changing some algorithms on top of this level only. From rules 2 and 3 it follows that we have two levels so that the layers between these levels must implement those and only those functions which can be engineered to maximum efficiency without hampering portability or adaptability.

The fourth rule says that all the functions already present should be made available as generally as possible. Counterexamples are found very often in operating system design. E.g., very often there exist more powerful text editing algorithms for implementing the control text which are accessible by the command language interpreter but not by normal user programs. Thus, for implementing text editors we have to implement these procedures again.

The last rule is concerned with efficiency. It is commonly observed that designers do not make the correct estimates about the actual critical paths of the system concerning efficiency in time or space. Hence it is required that the implementation of any abstract machine contains means to record the frequency of the basic operations and the required amount of space and time for executing these operations. Otherwise we will never know the critical path through the system.
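The last rule, that every abstract machine should be able to record how often its basic operations are executed and how much time they take, can be sketched as follows. This is only an illustrative sketch in Python, not a method taken from the lecture: the names `InstrumentedMachine`, `StringMachine` and `critical_operations` are hypothetical, and only elapsed time is recorded, not space.

```python
import time
from collections import defaultdict


class InstrumentedMachine:
    """Hypothetical base for an abstract machine that records, per basic
    operation, how often it was executed and how much time it took."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.seconds = defaultdict(float)

    def run(self, name, operation, *args):
        # Execute one basic operation and record its frequency and duration.
        start = time.perf_counter()
        try:
            return operation(*args)
        finally:
            self.calls[name] += 1
            self.seconds[name] += time.perf_counter() - start


class StringMachine(InstrumentedMachine):
    """Illustrative level offering two basic string operations."""

    def concat(self, a, b):
        return self.run("concat", lambda x, y: x + y, a, b)

    def length(self, a):
        return self.run("length", len, a)


def critical_operations(machine):
    # Operations ordered by frequency, most frequent first.
    return sorted(machine.calls, key=lambda n: machine.calls[n], reverse=True)
```

Operations that appear at the head of such a list are the candidates for being moved to a lower layer, as discussed for critical operations in the preceding section.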
2.2. HIERARCHIES OF LANGUAGES

As Dennis [1] points out, the programming language corresponding to an abstract machine Ai+1 may be obtained from the language corresponding to Ai by three different techniques: procedural extension, translation and interpretation. The primary concern of a new language in the hierarchy is the introduction of the new operations, data types and data structures corresponding to the new abstract machine.
lation
one can p r o t e c t
or i n t e r p r e t a t i o n
longer available
on l e v e l
On the o t h e r hand, two d i f f e r e n t it
is
there
levels
not u s e f u l
expressions,
is
have a d i f f e r e n t
rent
languages,
language,
etc.
of
available
But a l s o e.g.
A good example f o r roughs
[7,
8].
in
languages w i t h extension
a system programming is
allows by
for
set of
allows
system and thus
this
languages
the o p e r a t i o n s
disable
in
define
these operations
the c u r r e n t low f o r
sequential
process
ESPOL
a basis
procedural
extensions. Thus,
language'
of
languages.
or
a hierarchy E.g.,
statements 'program
By s p e c i f y i n g
a certain
technique
implementing
Waite
for
may be i l l u s t r a t e d and Poole
after
ALGOL
B
rough idea o n l y of the l e v e l gram A or B.
fact
ALGOL 60.
ESPOL
file-handling and
not p r o v i d e d
implemented
not a v a i l a b l e
in
enable
Both
ESPOL.
• In
interrupts
machine i n s t r u c t i o n s ;
from b e i n g l o g i c a l l y
Extended
from which
matrix-calculations
than b e f o r e . bly
is
of
by Bursystems,
denote system c a l l s
and
guages, not a h i e r a r c h y
for
provided operating
which p r o t e c t
interrupted
or a l -
such i n t e r r u p t s .
Intentionally, fines
is
and data types
denote the c o r r e s p o n d i n g
the o p e r a t i o n s
E x t e n d e d ALGOL
languages
for
by the o p e r a t i n g ESPOL
based on two d i f f e -
language and a h i g h - l e v e l
operations
The l a t t e r
state-
taken o v e r from one l e v e l
are both e x t e n s i o n s
some m a c h i n e - o r i e n t e d
E x t e n d e d ALGOL.
but e x p r e s s e d d i f -
language used in w r i t i n g
and Burroughs E x t e n d e d ALGOL
case-
preferable.
such an u n i f i e d the
to E.g.
loops,
the c o n t r o l
case o f a h i e r a r c h y
approach
ESPOL,
transno
structure.
on each l e v e l
by u s i n g p r o c e d u r a l
an u n i f i e d
by u s i n g
corresponding
control
ments o f the base language are a u t o m a t i c a l l y to the n e x t one.
least
misuse o f t o o l s
no reason why languages
to have a h i e r a r c h y
In f a c t ,
at
against
Ai+ I .
should
procedures
ferently.
In a d d i t i o n ,
correspon-
[9].
ALGOL
Each o f these
of
set
languages
we have added the usual to a n o t h e r
'program
is w r i t t e n
A in
of a b s t r a c t i o n
lan-
only
de-
ALGOL'
language
by the h i e r a r c h y
is
procedures
abstract
is w r i t t e n
of
not always
give
a very
instructions
se macro languages may be implemented by w r i t i n g
of p r o -
languages
the
implied.
This
of m a c r o - l a n g u a g e s
The s e t of p r i m i t i v e
machine
in assem-
used as the b a s i s
language out o f a h i e r a r c h y this
a
of languages may be d e v e l o p p e d by
corresponds as
are c a l l e d
used by
of one of t h e -
a procedure
for
each
instruction. But in-line coding by macro-substitution, a particular case of compiling, is also possible. On a suitable computer some instructions may even directly correspond to machine instructions while other instructions must be implemented by one of the other techniques.

3. PROTECTION BY HIERARCHICAL ORDERING

So far we have dealt with hierarchical ordering as an engineering aid in design and production of software. The main assumption was that operations and data structures present on a level Ai but not on level Ai+1 cannot be used by programs running on level Ai+1. However, this principle introduces hierarchical ordering as a tool in testing and debugging also: To protect against misuse of operations and data we have to place these operations and data in a different level.

There are three ways to apply this rule: The most efficient way is separate compilation of different layers. There is no price to be paid at run-time and it is guaranteed that different modules communicate only via clearly defined communication lines specified by entry and external declarations. However, the protection can be circumvented if the language allows for explicit address calculations or if the implementation does not check against indices exceeding the bounds. Moreover virtually no programming language allows for read-only access to data in certain parts of a program while writing is permitted elsewhere.

The second means is protection supported by hardware. By using different addressing schemes we can protect the data and operations provided by lower layers against any misuse by higher layers. Usually it is also possible to exclude higher layers from writing certain data which nevertheless can be read. This well-known method has two disadvantages: We can distinguish very few levels only because the hierarchy of addressing schemes is very limited. Moreover, we often waste memory space because the protection mechanism does not apply to logical records (one table or procedure) but to rather large physical records.

By some additional software the number of levels which can be distinguished by hardware-protection mechanisms can be increased. Suppose we have two processes P, Q running in slave mode and the corresponding address spaces are hardware-protected against each other. Usually this means that P and Q are running on the same abstract machine. However, we can construct a control procedure running in master mode which sends certain system-calls coming from Q back to P. In this way P may become a layer below Q and still is protected against Q.

All these protection mechanisms protect lower layers against disallowed access from higher layers. There is no method generally available to protect in the converse direction. Neither does there exist any hardware-protection against wrong programs running in absolute addressing mode nor does any method help against misuse of addresses which were passed to a procedure as actual parameter. Those mistakes can be detected only by careful debugging of the interfaces between the layers.
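The idea of excluding higher layers from writing data which they may nevertheless read can be illustrated in a modern language. The following is a hedged sketch only: the class `LowerLayer`, its page table and the function `higher_layer_tries_to_write` are invented for illustration, and Python offers protection merely by convention and by a read-only view, not by hardware.

```python
from types import MappingProxyType


class LowerLayer:
    """Hypothetical lower layer owning a table that higher layers may read
    but not write."""

    def __init__(self):
        # Writable only inside this layer (by convention in Python).
        self._table = {"free_pages": 8}
        # Read-only view handed to higher layers; it reflects later updates.
        self.table_view = MappingProxyType(self._table)

    def allocate_page(self):
        # The only sanctioned way to change the table.
        if self._table["free_pages"] == 0:
            raise RuntimeError("no free pages")
        self._table["free_pages"] -= 1


def higher_layer_tries_to_write(view):
    # A higher layer attempting a disallowed write through the view.
    try:
        view["free_pages"] = 100
    except TypeError:
        return False  # the write was rejected
    return True
```

The higher layer can still observe the table through `table_view`, which corresponds to data that "nevertheless can be read"; changes are possible only through the operations the lower layer chooses to export.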
4. REFERENCES

[1] Dennis, J.B.: The Design and Construction of Software. These Lecture Notes.
[2] Dijkstra, E.W.: Cooperating Sequential Processes. In: F. Genuys (ed.), Programming Languages. London-New York: Academic Press, 1968.
[3] Dijkstra, E.W.: The Structure of the 'T.H.E.' Multiprogramming System. Comm. ACM 11 (1968), 341-346.
[4] Gill, S.: Thoughts on the Sequence of Writing Software. In: P. Naur and B. Randell (ed.), Report on a Conference on Software Engineering. Brussels: NATO Science Committee, 1969.
[5] Zurcher, F.W. and Randell, B.: Iterative Multilevel Modelling, A Methodology for Computer System Design. In: Proceedings IFIP Congress. Groningen: North-Holland Publ. Comp. 1969.
[6] Randell, B.: Towards a Methodology of Computing System Design. In: P. Naur and B. Randell (ed.), Report on a Conference on Software Engineering. Brussels: NATO Science Committee, 1969.
[7] Burroughs B6700 ESPOL Language, Information Manual. Detroit: Burroughs Comp. # 5000094, 1970.
[8] Burroughs B6700 Extended ALGOL Language, Information Manual. Detroit: Burroughs Comp. # 5000128, 1971.
[9] Waite, W.M., Poole, P.C.: Portability and Adaptability. These Lecture Notes.
CHAPTER 2.B.
LANGUAGE CHARACTERISTICS
PROGRAMMING LANGUAGES AS A TOOL IN WRITING SYSTEM SOFTWARE
Gerhard Goos
University of Karlsruhe, Germany

0. INTRODUCTION
There are v a r i o u s guage.
aspects
in j u d g i n g
From the e n g i n e e r i n g
point
gramming languages i n f l u e n c e properties
o f the f i n a l
assembly
language can have a l l this
is
not t r u e
This
lecture
and p r o p e r t i e s idea which
programming
investigates
guages r e p l a c i n g
language
desired thinking
in
except portability.
for
writing
I.
THE
practice
in
for
The same remark a p p l i e s
the language in o r d e r to g e t some
taken from
language s h o u l d
system s o f t w a r e .
OF LANGUAGE
There is
a well-known
thinking
habits
habits
language.
We t h e r e f o r e
and not on a p p l i c a t i o n can be i n f l u e n c e d
languages
themselves
must d e v e l o p use a n o t h e r
it
approach
like
further
language.
it,
to
conversely
inventing
is
ON SOFTWARE
we s t u d y v a r i o u s
new n o t i o n s
PS 440 ~7L
language
language and the
The language m i r r o r s
they f i n d
,
CREATION
p e o p l e are f o r c e d If
After
ALGOL 68.
between a n a t u r a l
the l a n g u a g e .
lan-
concentrate
software.
use a h i g h - l e v e l
language.
This
FORTRAN, ALGOL 60 [IS
point
PROPERTIES
relationship
in
is
Our s t a r t i n g
of p e o p l e u s i n g the
of those c r e a t i n g
to e x p r e s s
have.
the d e s i g n of system programming
program p r o p e r t i e s
Our g e n e r a l
INFLUENCE
In
language i n f l u -
SIMULA 67 ~2~!, ALGOL 68 ~3~, PL 360 ~4-Z, ESPOL ~_5~, BLISS ~6], and PASCAL ~ .
in
between language p r o p e r t i e s
a good programming
t h e use of assembly
which
habits.
and the
a program w r i t t e n
properties
the r e l a t i o n s h i p
on system s o f t w a r e
constructs
how p r o -
of program c r e a t i o n
Theoretically
o f programs w r i t t e n
our d i s c u s s i o n discussing
of view we a r e i n t e r e s t e d
lan-
language.
characteristics
most i m p o r t a n t
of a programming
because t h e use of assembly
ences the programmer and h i s to e v e r y o t h e r
the q u a l i t y
the process
program.
practice
is
of K a r l s r u h e ,
this
and idioms
to t h i n k difficult
the and they
or t h e y must
The same applies to programming languages. Additionally they reflect the structure of present-day computers and what they are thought to be good for. Therefore a programming language influences its user at least with respect to the following:

- The conceptual understanding of computers
- The range of problems which can be attacked by computer
- The set of basic notions about how a problem can be solved by programming
- The style of programming (clarity, robustness, readability etc.)
- The meaning of "portability"
- The meaning of "efficiency"

It is the purpose of this paragraph to make these points more concrete.

1.1. LANGUAGE CONSTRUCTS AS MODELS FOR PROGRAM BEHAVIOR

Except for limitations of storage, every programming language, assembly language included, is theoretically equivalent to a Turing-machine: whatever a Turing-machine computes can be represented by a program written in this language. The question arises why in this context we automatically speak about the programming languages available and why not about equivalent formulations, e.g. Turing-machines, recursive functions
or Markov-algorithms. The a n s w e r i s : language. quired
It
use o f
To i m p l e m e n t
using
solve
68 does n o t no
Without influence
allow
or
for
not
mentioning
parallel
of
the
matching
about
-definable
syntax
analysis not
the
conclusion his
processing.
be s o l v e d are
available
these
examples
theoretical
in
this
that
it
to
this
show t h a t
model
rethe
use FOR-
language. is
useful
be e x e c u t e d
other
coroutines in
to
implementation
On t h e
using
facilities
Markov-algorithms;
allowed
if
assembly
we had used
when f o r c e d
are
idea
If
in
functions.
two a l g o r i t h m s
have t h i s
details
choice
pattern
had t h o u g h t
come to
processes,
the
a Turing-machine
much more c o m p l i c a t e d .
procedures
problem will
parallel
our
formulate
P by c o n s t r u c t i n g
he w i l l
the
to
use o f
top-down
recursive
a problem
SIMULA 67,
the
ALGOL 68 m i g h t
Probably
is
we p r o b a b l y
implement
TRAN b e c a u s e
but
easy
recursion
LISP had i m p l i e d
Nobody w i l l
lel.
very
by M a r k o v - a l g o r i t h m s
SNOBOL 4 i n s t e a d ,
body
is
hand,
Someto
in
paral-
o f ALGOL using
because t h e s e ,
language. programming
by w h i c h we w a n t
languages to
solve
a problem.

Analogous remarks can be made on data structures. Apparently the widespread use of FORTRAN and ALGOL 60 in the Sixties has severely hampered the development of string manipulation and nonnumerical applications. The use of languages like EULER [9] will imply linear lists as models for structuring data. Tree-like structures are presented when using languages such as SIMULA 67, ALGOL 68 or PASCAL.

These considerations show that the choice of a certain programming language very much influences the design of the algorithms and data structures solving a given problem. Thus the programming language does not only determine how to express programs; it also determines the scheme chosen for the problem solution. Of course, the latter statement is true only if the language was known and used in the design stage already.

1.2. INFLUENCE ON PROGRAMMING STYLE AND PROGRAM DOCUMENTATION

Programs can be tricky or straightforward. They can be subdivided into modules or they can be unstructured. They can look like an ad-hoc collection of statements or they can show a systematic treatment of the subject.

For a very long time tricky programming - the art of programming - was thought to be more efficient and therefore more economic. However, the financial calculation behind such reasoning is doubtful. Firstly the expenses for design and construction are considered negligible compared to the costs of actual program execution. Secondly maintenance and portability are usually neglected completely. Thirdly in most cases there is no evidence that tricky programming really leads to programs more efficient in time and space than others.

Analogous remarks apply to the two other alternatives. Of course, today it is very difficult to define the meaning of "tricky" or "structured programming" precisely and we shall not attempt to do that. For example, it depends on the circumstances whether the use of Jensen's device in ALGOL 60 is considered as tricky programming or not.
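For readers unfamiliar with Jensen's device: it exploits ALGOL 60 call-by-name parameters, so that an expression passed as an argument is re-evaluated each time the procedure changes the loop variable. Python has no call-by-name, so the sketch below imitates it with explicit closures; the names `sum_by_name`, `harmonic` and `sum_of_squares` are illustrative assumptions and not taken from the lecture.

```python
def sum_by_name(set_i, lo, hi, term):
    """Imitation of the ALGOL 60 'sum' procedure using Jensen's device.

    set_i assigns the by-name loop variable; term is a thunk that
    re-evaluates the by-name expression each time it is called."""
    total = 0
    for value in range(lo, hi + 1):
        set_i(value)      # assignment to the by-name parameter i
        total += term()   # re-evaluation of the by-name expression
    return total


def harmonic(n):
    # Corresponds to sum(i, 1, n, 1/i) in ALGOL 60.
    env = {"i": 0}

    def set_i(v):
        env["i"] = v

    return sum_by_name(set_i, 1, n, lambda: 1 / env["i"])


def sum_of_squares(n):
    # Corresponds to sum(i, 1, n, i*i) in ALGOL 60.
    env = {"i": 0}

    def set_i(v):
        env["i"] = v

    return sum_by_name(set_i, 1, n, lambda: env["i"] * env["i"])
```

Whether this counts as tricky is exactly the judgement the text leaves open: the caller sees a compact, general summation, while the reader must know that the expression depends on a variable changed behind the scenes.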
The properties of the programming language in use play an important rôle in guiding programmers to express themselves in a well-organized fashion. A few examples are as follows:
Most p r o g r a m m i n g data but
type this
to
a subfield
Therefore
When r e a d i n g the
programmer
usually
these
informations
tioned since
also
he i s
and i n
not to
be done by a c o m p i l e r The e x i s t e n c e
of
while or tion
do o c c u r
in
peated might
by t h e is
Another style
the
introduced program als
of 2.2
for
the
behind
loops is
This
no need t o
of
the
Moreover
bits
are
integer,
are
of
All
being
badly
the programmer
is
values men-
readable. misguided
clearly.
be c o n c e r n e d This
a set
etc.
range of
instead
structures
Lastly
it
on c o d i n g
clerical
task
should
number o f m i s t a k e s .
the
is
programmer
to
say c l e a r l y
he must d e s c r i b e
information
an easy j o b
"loop"
if
the
a "long-distance
the
where
construc-
must be r e c o n -
statement
jump"
not
characteristics of
global
t o be r e -
constitutes
to
a subdivision
subprogram
a loop
it
is
languages
of
examples
is
the
to
logical
first
but
make an i n c o r r e c t
that
data which
successors
global
of
a
glob-
we do n o t
find
find
many exam-
variables
deviating
(loops
clarity the
global defined
good p r o g r a m m i n g
algorithm
the
COMMON was
(see
sec-
problem).
and w i t h o u t
leads
all
clearly
we t h e r e f o r e
use o f
remove t h i s
programming
Labelled
I n ALGOL 60 and i t s these
straightforward of
influencing
variables.
and u n p e r m i t t e d
description is
not.
using
these
oneself
etc.).
up p r o g r a m s
data.
admissible
programs
should
Otherwise
suggest
every
an a t t e m p t
original
data
guides
This
or
In
uncontrolled
to e x p r e s s
these
text
explicitly.
jump and t h e
i n FORTRAN t o
such t h a t
Our c l a i m
it
program.
use o r m i s u s e
for
his
by
packed d a t a .
a word as a c o l l e c t i o n
the
a much s m a l l e r
language
may be a c c e s s e d
tion
be d e s c r i b e d
the
integers the
defi-
identification
a signed
program
the
difficult.
such a c o n s t r u c t . ples
whether
programmer
To see t h a t
example of
is
of
PASCAL [ 8 ] ,
from
do s t a t e m e n t
of
his
reader.
short.
be v e r y
the
is
and a
a while-construction condition
by a c o n d i t i o n a l
structed
must
accessing
Such p r o g r a m s
with
not
integer,
the
o f words
some g e n e r a l i z a t i o n
loops
from
to define
subfields
exception
data
any s u b f i e l d
case o f
why t h e
an i d e n t i f i e r
only,
on such d a t a
explicitly
designing
asked
of
operations
Thus a u t o m a t i c a l l y in
an open q u e s t i o n access
state
attach
no mnemnonic names f o r
an u n s i g n e d
implicitly
explicitly.
Moreover,
the
are
considers
He does n e v e r
to
word.(An
and u n p a c k i n g
there
Boolean variables,
allow
implemen~on
a program operating
must be d e r i v e d
is
the
must be decoded f r o m t h e
of bits. of
from
Instead,packing etc.
do n o t
of. a c o m p u t e r
can be f o u n d
nition). shifts data
languages
second
should
style too
be e x p r e s s e d
and c o r r e c t n e s s . step
in
means much f r o m
Speeding
programming.
p r o g r a m more e f f i c i e n t .
as
There
Logical clarity can be measured in terms of readability: It should be easy to reconstruct the conceptual algorithm from the written program. In that way good programming style does not only mean to formulate readable statements in the program and to add a sufficient number of comments. It also means to get the maximum contribution to the program documentation from the program text itself.

Besides this there is also an explicit relation between language characteristics and documentation, already demonstrated in the first example of this paragraph. Program documentation mostly requires also some coherent description of the algorithms and data structures supplied independently of the program. To get the maximum benefits of this additional information the description must be clearly related to the different parts of the program. To this purpose the data and program parts should be named consistently throughout the program and the documentation. This requires that it is possible to have identifiers at least for all data including parts of a word.

1.3. MACHINE INDEPENDENCE AND PORTABILITY

Machine independence refers to those properties of a program making it independent of the details of the computer structure such as the word length, addressing scheme, number and kind of registers etc. Portability requires in addition that the program is independent from special properties of the operating system, or, more generally, it requires that an appropriate environment for the program can be provided on most current computers. Both properties can be approximately achieved by using a high-level programming language. Programs written in such languages are machine dependent only with respect to the accuracy of real arithmetic, range of arithmetic values and the character-set. The machine dependency concerning character sets can be removed by using only a set of characters which are widely available (together ca. 48 characters consisting of the capital letters, the digits and a small number of other characters). Programs written in high-level languages are portable if the use of I/O is restricted to the use of sequential files (input-file, printer-file, scratch-file). There do not yet exist widely implemented standards for more sophisticated access-methods on files.

Considering system programming languages there are a number of unsolved
52
problems
concerning
abstract
machines
portability.
have proven b e i n g s u c c e s s f u l the b a s i c
operations
languages
for
d e x i n g is for
Using a language which
concerning
done in s t e p s o f d i f f e r e n t
implies
is
into
might
data d e s c r i p t i o n in
the a l g o r i t h m i c
interchange logical
being supplied
PORTABILITY
It
often
ficient
software.
- storage
section
might
together
for is
splitting
section.
the
The a l g o -
as p o s s i b l e .
adapting
it
not r e q u i r e d
are p h y s i c a l l y applications
The
to o t h e r that
the
split.
can be i n s e r t e d
require
with
is
as f a r
It
However, f u t u r e
that
portable
Whether t h i s
Inefficiency
allocation
data packing
indexing
De-
anywhere
involving
complete
physical
and the a l g o r i t h m i c
data and
section
the d a t a .
VERSUS EFFICIENCY
claimed
language used.
one.
of the data d e s c r i p t i o n
1.4.
is
a logical
in computer n e t w o r k s
splitting
the f o r m e r
theseproblems
to the data d e s c r i p t i o n
section.
(Step-
powers of 2 on
and an a l g o r i t h m i c
and the a l g o r i t h m i c
belonging
computers. 2 on the TR4,
by d i f f e r e n t
need some m o d i f i c a t i o n s is
in-
inefficiency.
t o d a y to s o l v e
The s p l i t t i n g
clarations
on d i f f e r e n t
should be machine i n d e p e n d e n t
data d e s c r i p t i o n computers.
The d e s c r i p t i o n
Additionally
Using a unique scheme f o r
a data d e s c r i p t i o n
section
size
multiplication
a serious
The b e s t we can a c h i e v e rithmic
similarly
There must by a
a computer word.
1 on the UNIVAC 1108, CDC 6400,
therefore
computers,
programs
However, adapted
structured
should be i n d e p e n d e n t o f the word l e n g t h .
integers
different
is
machine i n d e p e n d e n c e .
TR440, 4 on the IBM System 3 6 0 ) . arrays
software.
machines are s p e c i a l l y
way o f p a c k i n g and a c c e s s i n g data w i t h i n
size
portable
manner which
language such as PL360, PS440, PASCAL or BCPL [ I I ]
causes some problems of packing
have d e v e l o p p e d
an a s s e m b l y - l i k e
of t h e s e a b s t r a c t
to the problem at hand. to a h i g h - l e v e l
Poole and Waite [10]
to be programmed in
is
software true
may a r i s e
and data p a c k i n g
schemes not s u i t e d
automatically
or not l a r g e l y
means i n e f -
depends on the
from schemes not s u i t e d
to g i v e f a s t
to the problem
access on the computer
a t hand too c o m p l i c a t e d system inefficient The f i r s t
interfaces
to the e n v i r o n m e n t
code g e n e r a t e d f o r
two problems
PASCAL. The t h i r d
very heavily
(e.g.
the o p e r a t i n g
used l o o p s .
can be removed by using methods as p r e s e n t e d
problem is
a problem o f
the f u t u r e .
It
requires
in
further
standardization
o f the i n t e r f a c e .
by using v e r y s i m p l e i n t e r f a c e s we must a p p l y h i e r a r c h i c a l ticated
tools
two l a y e r s :
index-sequential this
The f o u r t h
system i n t e r f a c e . access to f i l e s
Of c o u r s e ,
being adapted.
statements
is
r e q u i r e d the
by i n - l i n e The c a l l
additional
software
First
in
known to be c r i t i c a l
E,g.,
access i s
for
form.
a facility
the s u b r o u t i n e - j u m p the at
direct index-
using the
language p r o p e r t i e s .
There i s
loops.
used are not
not
a closed subroutine
but
language should p r o v i d e .
sufficient
operating
because v e r y o f t e n is
too slow.
systems in a u s e f u l
the
I/0,
for
interrupt-handling,
system, e t c .
inserting
but
also
these i n s t r u c t i o n s
than 5 % of
language,
every special
in-
moving the CPU
These t a s k s
o n l y machine d e p e n d e n t ,
less
the program are r e -
purpose we need the p o s s i b i -
which our is
version
of machine-code a l s o s o l v e s a n o t h e r problem en-
starting
shows t h a t
program leads to
to a computer
On any g i v e n computer the
to s u p p l y a l a n g u a g e - c o n s t r u c t
used f o r
tuned
the performance of
To t h i s
for
in writing
around the processes in Experience
for
layer
and the p a r a m e t e r - t r a n s m i s s i o n
insertion least
impossible
struction
may p r o -
supplies
the l o w e r
machine-code not o n l y by c a l l i n g
coding,
countered
if
available.
the program i n a s t a n d a r d
inner
of c l o s e d c o d e - p r o c e d u r e s
Providing
of
implements t h i s
lower l a y e r
direct
should not be f i n e
we w r i t e
inefficient
in a more e f f i c i e n t
to i n s e r t
tions
The lower l a y e r
on a system which i t s e l f
problem r e q u i r e s
which perhaps i s
is
program c o n s i s t s
system i n t e r f a c e .
no reason why p o r t a b l e
written
sophis-
l a y e r s o l v e s the programming problem on the b a s i s
access we may remove or s i m p l i f y
more e l a b o r a t e
It
the r e q u i r e d
the program can run on e v e r y h o s t system p r o v i d i n g
sequential
lity
to a v o i d i t
seems to be i m p o s s i b l e
constructing
access-method assuming t h a t
access to f i l e s .
after
this
assuming the e x i s t e n c e of a s i m p l e r i n t e r f a c e .
vide for Hence,
ordering
Today we must t r y If
based on more s i m p l e ones, The f i n a l
The top
of an a p p r o p r i a t e interface
only.
and the i n s t r u c critical
in
directly
into
the system being e x p r e s s e d i n
time. the
a machine
dependent manner.
1.5. LIMITATIONS OF PROGRAMMING LANGUAGES

From the preceding paragraphs there evolved a set of criteria stating which language properties are useful or requested for writing good system programs. Of course, these criteria are partly contradictory. It depends on the problem at hand in which order the criteria get priority.

At least the last paragraph showed that there are a number of problems which should not be considered as being tasks to be solved by supplying appropriate language constructs. Moreover, it must be stressed that no language forces programmers to write "good" programs. Good design and proper engineering can be influenced by a programming language, but they cannot be enforced. Lastly, it should be noted that every language contains a certain freedom for misusing it. Therefore programming discipline is an absolute necessity.

2. REQUIREMENTS FOR STRUCTURED PROGRAMMING AND PROGRAM MODULARITY

We discuss some means for better structuring programs, whether they are present in existing programming languages or not. The significant role played by procedures and their proper use is assumed to be known and is not discussed.
2.1. MODULARITY

Modularity denotes the ability to combine arbitrary program modules into larger modules without knowledge of the construction of the modules. With respect to programming languages we are concerned with the following questions:

- Which syntactic units are suited to represent program modules
- How to express the interfaces used in combining modules
- Technical aspects of the process of combination.

A module must be described independently from other modules. Therefore all syntactic units are appropriate which can be compiled independently. Usually this means procedures or parts of the data description (BLOCK DATA in Fortran). Simula 67 supplies some additional facilities. Class definitions

    class A(B,C); integer B, C; begin <Declarations>; <Stat. A> end

serve to define procedures leaving their local address space as a record after they have been executed. Classes can be compiled separately. Moreover they can be used as prefixes of other classes or normal blocks:

    class D; begin <Decl. D>; <Stat. D1>; inner; <Stat. D2> end
    D class E; begin <Decl. E>; <Stat. E> end

means:

    class E; begin <Decl. D>; <Decl. E>; <Stat. D1>; <Stat. E>; <Stat. D2> end

A slight generalization of this scheme would allow to compile any number of ALGOL-blocks separately and to build programs as follows:

    P: begin <Decl.>; ....; A: inner; ....; B: inner; .... end
    A: begin .... end
    B: begin ....; C: inner; .... end
    C: begin .... end
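
The effect of prefixing with inner can be imitated in a modern language. The following is only a minimal sketch in Python, not Simula: the names D, E, trace and the method inner are illustrative assumptions, and an overridable method stands in for the inner placeholder.

```python
class D:
    """Prefix class: runs <Stat. D1>, then the inner part, then <Stat. D2>."""

    def __init__(self):
        self.trace = []            # stands in for <Decl. D>
        self.trace.append("D1")    # <Stat. D1>
        self.inner()               # position of `inner` in the body of D
        self.trace.append("D2")    # <Stat. D2>

    def inner(self):
        # An unprefixed use of D has an empty inner part.
        pass


class E(D):
    """Corresponds to `D class E`: E's statements run where D says inner."""

    def inner(self):
        self.trace.append("E")     # <Stat. E>


e = E()
d = D()
```

Under these assumptions e.trace lists the statements in the concatenated order D1, E, D2, which is the order given by the expansion of D class E in the text.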

Any of these lines could be compiled separately. So the general rule is that every parenthesized set of declarations and statements should be separately compileable whether it is a procedure or not. Additionally, every definition of a data structure can be compiled separately provided it is allocated in store separately.

The main difficulty in case of separate compilation consists of interfacing different blocks. There are no problems if the interface consists of the calling sequence of a procedure including parameter transmission and of supplying a result on return. Also statically allocated data are simple to handle. In case of other global parameters, e.g. variables allocated on some block level of the push-down or Boolean variables located somewhere in the middle of a computer word, the compiler either must be supplied with an algorithm for accessing these parameters or it can generate macro-calls only, which must be replaced by the correct access sequence at binding time. The first method requires that the modules supplying global parameters are translated in advance. The second method is more general; however, it may hamper code optimization severely. Lastly, the discussion shows that the simple notions of external data and entry points as used by most assembly languages are not sufficient.

The way we choose to supply information about global parameters determines the sequence of steps required in compiling and binding the modules together. Assume that in the example above modules A and B use global parameters from P and C uses parameters from B. If we want to have the access algorithms for these parameters known at compile-time we get the sequence:

    Compile P
    Compile A
    Compile B
    Compile C
    Binding of P,A,B,C

If the access algorithms are represented by macros during compilation the order is not significant:

    Compile P, Compile A, Compile B, Compile C (in any order)
    Binding of P,A,B,C

If, however, the binding should occur as an appendix to the compilation of the outermost block we get the sequence:

    Compile C
    Compile A
    Compile B, Bind B,C
    Compile P, Bind P,A,B
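
The first of these sequences is an ordering in which every module supplying global parameters is compiled before its users. As a hedged illustration only, such an order can be computed with a topological sort; the dictionary `uses` and the function `compile_order` are assumptions made for this sketch, and it relies on the standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# module -> set of modules whose global parameters it uses
# (A and B use parameters from P, C uses parameters from B)
uses = {"P": set(), "A": {"P"}, "B": {"P"}, "C": {"B"}}


def compile_order(uses):
    """Return one order in which every supplier precedes its users."""
    return list(TopologicalSorter(uses).static_order())


order = compile_order(uses)
```

Any order returned has P before A and B, and B before C; the relative order of A and B is left open, as in the text.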

2.2. HIERARCHIES, NESTING AND SCOPE RULES

The last example shows that hierarchical ordering can be achieved by means of nested blocks:

    P is the base layer
    A, B is the second layer
    C is the top layer

Of course, nesting of blocks may also serve purely syntactic purposes as in the ALGOL 60-construction

    begin integer n; read(n); begin array a[1:n];

Or it might be used to minimize storage requirements by deleting arrays as soon as possible from the push-down. Therefore it is useful to mark such blocks serving as new layers:

    level A begin ...... end

"level A" defines A to be a label. The availability of such a construction reminds the programmer that he should think on structuring his program hierarchically.

blocks
a well-known
Hierarchical
ordering
seems t h a t rests
the
by n e s t i n g
success
by such r o u g h
advices
clared
outer
in
the
block
consisting
a set
of
solved
is
functions
layer.
In a n e s t e d No o u t e r
sely:
global
programming
parameters
dures
it
against
for
all
A solution
that
is
to
This
these
block
no l a y e r this
the
operations,
standard
local
guaranteed
in
open f o r
any k i n d
of misuse.
serving
access
define from
as g l o b a l
No member o f
inner
the
of
in
the next
ALGOL-family
another Conver-
To e n f o r c e
layer.
to
program.
one way
blocks.
means by w h i c h
parameters
define can be
given
data
declared to
to
and l i b r a r y
the
data
necessary
tells
problem
Hierarchical
enclosing
is
de-
innermost
declarations
uses
rule
procedures
as t h e
advice
such t h a t
to
a set
data This of
provides
good can is
procea means
that. w o u l d be t o
blocks
can be t a k e n
ple
self-explanatory:
is
are
algorithms
calls".
an o u t e r
structure
on one same l e v e l .
doing
in
is
a main p r o g r a m
using
unpermitted
data
your
describing
can a c c e s s
practice
be p r o t e c t e d
level in
block
mostly beginners
(procedures)
requires
block
principle taught
procedure
declared
ordering
only:
necessary
next
present
It
as a p r o g r a m m i n g
then write
of
principle.
The p r i n c i p l e
"Distribute
operations
as b e i n g
Hierarchical
as:
blocks;
on t h e also
nesting
ordering.
mainly
new b a s i c
easily
ordering
for
of
upon h i e r a r c h i c a l
is
out
restrict from
the the
scope
scope
of
identifiers
arbitrarily.
except level
A A begin end
end
real
;
x;
that
The f o l l o w i n g
Scope o f begin
such
]
x:
inner exam-

2.3. CONCURRENT PROCESSES

ALGOL 68 allows for formulating collateral execution of expressions E1, ..., En by writing

    (E1, E2, ..., En)

It is requested that no expression Ei changes the value of any variable accessed by any other expression. But why should we do that in sequential programming? There are 3 reasons:

- The programmer can describe in which way his algorithms are independent. Thus he achieves a better insight into the algorithms.
- Compilers can optimize code by searching for common subexpressions in the different expressions.
- Computers having more than one arithmetic unit can compute the results of independent expressions in parallel.

The last two objectives can also be achieved by constructing more sophisticated compilers. Collateral execution contributes to clarity.

Collateral expressions can also be executed sequentially, one after another. This is not appropriate when the expressions are processes executed in parallel or quasi-parallel (time-shared) which contain critical sections in which they access and change common data. In these cases synchronization is required. It can be achieved by means of P- and V-operations (Dijkstra [12]) or by operations built on these basic operations (event-operations, message systems). To allow for parallel execution the compiler must be told in advance that sequential execution is not appropriate. ALGOL 68 indicates parallel execution by

    par (E1, E2, ..., En)

The compiler must generate code to allocate a new stack for every such process.

These constructs are based upon Dijkstra's notion

    parbegin E1, E2, ..., En parend

They assume that the number of parallel processes is known in advance and that they all start at the same time. This assumption is somewhat artificial when considering operating systems: User jobs are started whenever they arrive, and the number of jobs running in parallel is limited only by the length of some system tables and by storage requirements.

BLISS, acting on a more elementary level, provides for a special statement creating a process:

    create P (AP1, ..., APn) at <reference expression> length <expression> then <statement>

P is a pure procedure to be executed as a separate process. A stack is allocated starting at the address given by the reference expression and having the prescribed length. The last statement is executed after the process has finished. It frees the stack and indicates the successor-statement. This construction is not only useful for an independent process. Coroutines can be handled in the same way.
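
The role of P- and V-operations around a critical section can be sketched with present-day threads. This is an illustration under assumptions, not the notation of ALGOL 68 or BLISS: the helper `parbegin`, the function `process` and the shared variable `counter` are invented for the example, and Python's `threading.Semaphore` supplies the P-operation (`acquire`) and the V-operation (`release`).

```python
import threading

counter = 0                       # common data shared by all processes
mutex = threading.Semaphore(1)    # binary semaphore guarding the critical section


def process(increments):
    """One process: repeatedly enters the critical section and updates counter."""
    global counter
    for _ in range(increments):
        mutex.acquire()           # P-operation
        counter += 1              # critical section: access and change common data
        mutex.release()           # V-operation


def parbegin(*bodies):
    """Start all bodies together and wait for all of them (parbegin ... parend)."""
    threads = [threading.Thread(target=body) for body in bodies]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


parbegin(lambda: process(1000), lambda: process(1000), lambda: process(1000))
```

As in the parbegin notion, the number of processes is fixed in advance and all start together; a job scheduler that starts processes on arrival would need a different structure.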

3. DATA STRUCTURES IN SYSTEM PROGRAMMING

The term "data structure" refers to an object which can be manipulated by a program. Data structures are recursively built starting from more elementary ones. The most elementary data structures are simple values which lose their meaning if subdivided further. They are classified by the monadic and dyadic operations applicable to them. Each class of values is called a "data-type", i.e. integer, long integer, real, long real, Boolean, character, reference or address, etc.

More complex data structures can be described theoretically as follows (Lucas and Walk [13]): Given a set EO (elementary objects) of simple values and a set S of selectors we build the set of all objects O as follows:

(1) EO ⊂ O
(2) If x1, ..., xn ∈ O and s1, ..., sn ∈ S, si ≠ sj for i ≠ j, then
    x = (<s1:x1>, ..., <sn:xn>) ∈ O

<si:xi> is called a component of x consisting of an object xi and its selector or name si. By definition

    si(x) := <si:xi>

A special case is given when using integers as selectors:

    (<1:x1>, ..., <n:xn>) or <x1, ..., xn>

denotes the array of objects x1, ..., xn.

Obviously all such objects can be described as rooted trees. The branches are labelled by selectors. Subtrees describe components of the object. The final nodes of the tree are labelled by elementary objects:

[Figure: rooted tree with branch s1 leading to eo1 and branch s2 leading to a subtree with branches s3 to eo2 and s4 to eo3]

    (<s1:eo1>, <s2:(<s3:eo2>, <s4:eo3>)>)

It turns out that these objects are general enough to describe most data structures occurring in practice if one adds one additional rule:

- It is allowed to insert one same subtree at different nodes in a tree.

To apply this theoretical construction we must investigate how a selection s(x) can be implemented. There are four ways:

- x is an array and all selectors are integers. Thus selection is mapped on indexing.
- x is a structured value (record). Selection is represented as a prefix or postfix operation: s of x or x.s in the program. Internally selection is mapped on indexing.
- Every node of the tree is represented by a record. References are used as components of the record. They represent selectors by pointing to another record.
- Selection is represented by user-defined operations.

Nothing general can be said on the last case. The first alternative is well-known. The remaining two are discussed in 3.2.
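
The object model and the first three ways of selection can be tried out in a few lines. The sketch below is an assumption-laden illustration in Python: nested dictionaries stand in for composite objects, the names `obj`, `select`, `arr`, `shared` and `graph` are invented, and `select` returns the selected component object itself.

```python
# A composite object maps selectors to component objects, mirroring
# (<s1:eo1>, <s2:(<s3:eo2>, <s4:eo3>)>); elementary objects are plain values.
obj = {"s1": "eo1", "s2": {"s3": "eo2", "s4": "eo3"}}


def select(x, *selectors):
    """Apply selectors from the outside in: select(x, 's2', 's3') follows s2, then s3."""
    for s in selectors:
        x = x[s]
    return x


# Integer selectors give the array special case: selection is plain indexing.
arr = ["x1", "x2", "x3"]

# The additional rule: one same subtree inserted at two different nodes.
# Both components are references to the same record, not copies.
shared = {"s3": "eo2"}
graph = {"left": shared, "right": shared}
```

For example, `select(obj, "s2", "s4")` follows two branches of the tree, while `graph["left"]` and `graph["right"]` denote the same subtree reached through references.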

3.1. SIMPLE VALUES

In system programming the only simple values are the bits 0 and L. Since this is no useful unit for most applications it is necessary to define some larger unit as the basic one. Two different directions can be followed:

The operation-oriented approach mentioned before classifies objects according to the admissible operations regardless of the number of bits used to represent the object. Besides the data types mentioned before it is useful to have some means of defining new types describing a finite, ordered set of values:

    (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday)

or

    (0..9)    (the numbers 0,1,2,...,9)

The basic assumption of the operation-oriented approach is that the binary coding of values bears no relation to and has no influence on the algorithm handling them. This assumption must be true if the algorithm should be portable. There are, however, severe disadvantages of the operation-oriented approach:

- Storage cells can contain values of arbitrary type at different times. From the engineering point of view there is a practical need for having variables whose content is of conceptually different type at different times. Inspection of the union-mode in ALGOL 68 and analogous constructions in other languages show that it is difficult and in practice impossible to implement these constructs by one storage cell only. Hence we may waste space by using two cells instead of one although the contents of the cells may be significant at different times only.
- It is impossible to construct sophisticated storage allocation schemes. For that purpose we must be able to access storage as a linear array and to use the information found. At least it must be moved around. However, our approach does not allow to access the interior of records by explicit indexing. The mapping of selectors onto indices of records is left to the compiler. Our approach does prohibit the use of that freedom.

ESPOL, PASCAL and all high-level languages use the operation-oriented approach. (ESPOL allows for using words as basic units too). The other languages quoted in the beginning are word-oriented: They use the computer word (4 bytes on a byte-oriented machine) as the basic unit. Thus we get the same properties as within assembly languages including all the freedom. The disadvantages of this approach are:

- Software is not portable when the word size decreases significantly because most programmers try to fill the words with data to the utmost.
- There are no good testing and debugging facilities since the system cannot interpret the data types. Thus we have to live with octal and hexadecimal memory-dumps.
- There is no automatic control of correct use of references in dynamic storage allocation. Thus it is very easy to make such well-known mistakes as accessing data having died or accessing data which have been moved in the meantime.
- Programmers think in terms of words and groups of bits. They are never enforced to describe precisely the information represented by these bits.

Hence the method is useful for "good" and disciplined programmers only.

3.2. RECORDS

Records are classified according to the types and names of their components or according to their length (word-oriented). The mapping of selectors onto indices can be done in both cases by the compiler. In the word-oriented case the user can control it, or he may define it explicitly as in PS440: The declaration

    select s = [2]

allows for s of x yielding x[2].

In both cases it must be possible to pack more than one component into one word in order not to waste space. When using operation-defined types, finite types like (0...9) are useful for the compiler to determine the number of bits necessary for the component. The engineering problem presented by packing is the minimization of space and of access-time by one same construction: Packing can be controlled by the compiler, i.e., the compiler can rearrange the order of the components for better use of space or for using optimized instruction sequences in accessing the components. Or it can be controlled by the user, i.e., the order of components and the subdivision of words always remains as described in the program. Although machine-dependent, the latter method is recommended here. It has the advantage of being able to describe hardware- or system-dependent records within the frame of the language. E.g., the subdivision of instructions into operation-code, address-fields etc. or the building of parameter-records for a SVC-instruction does not cause problems. When adapting the program to another computer system the record-declaration must be rewritten, but the algorithms remain unchanged as far as data access is concerned.

Records containing references as components can be used to represent trees. If the tree is simplified to a linear chain we get a "list". In the usual list-processing languages the list-cells (the records) contain two additional components:

    value tag | reference field | value field

The value tag specifies the type of the present content of the value field. The value field can contain values of different types, i.e., references to other lists are allowed. Thus the pair (value tag, value field) is a union-variable in the sense of ALGOL 68.

It is interesting to compare the internal treatment of the reference field and the value field: The compiler as well as the run-time-organization know that the reference field contains a reference. Therefore the user must not handle the reference field explicitly. The system-routines and the generated code are optimized. However, the occurrence of a reference in the value field is mostly handled as an exception and never dealt with in an optimal fashion. The latter remark applies generally to all user-defined record-components of type reference. Thus representing a tree in a list-processing system yields better efficiency than the use of a sophisticated record-oriented system.
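
User-controlled packing of finite types into one word can be illustrated as follows. This is a sketch under assumptions, not a description of PS440 or PASCAL: the declaration `DECL`, the helpers `bits_needed`, `layout`, `store` and `load`, and the component names are hypothetical. Only the declaration would be rewritten for another machine; `store` and `load` stay unchanged.

```python
def bits_needed(n_values):
    """Number of bits required for a finite type with n_values distinct values."""
    return max(1, (n_values - 1).bit_length())


# Hypothetical record declaration: (selector, number of values), in user order.
DECL = [("day", 7), ("digit", 10), ("flag", 2)]


def layout(decl):
    """Map each selector to (shift, width) within one word, in declared order."""
    table, shift = {}, 0
    for name, n_values in decl:
        width = bits_needed(n_values)
        table[name] = (shift, width)
        shift += width
    return table


LAYOUT = layout(DECL)


def store(word, selector, value):
    """Return word with the selected component replaced by value."""
    shift, width = LAYOUT[selector]
    mask = (1 << width) - 1
    if not 0 <= value <= mask:
        raise ValueError("value does not fit the component")
    return (word & ~(mask << shift)) | (value << shift)


def load(word, selector):
    """Extract the selected component from word."""
    shift, width = LAYOUT[selector]
    return (word >> shift) & ((1 << width) - 1)


w = store(store(store(0, "day", 6), "digit", 9), "flag", 1)
```

With this declaration the three components occupy 3, 4 and 1 bits, so all of them share the low eight bits of a single word.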

3.3. STORAGE-ALLOCATION FOR RECORDS

ALGOL 68 provides one heap for storing all dynamically generated records (records in the stack cause no problems). This implies the existence of a garbage-collector. There must exist templates at run-time indicating the components of every record which contain references. Moreover, for references to components within a record it must be possible to find the beginning of the record. Such references do add nothing to the expressive power of the language. Instead, procedures such as

    proc store = (ref record x, fieldtype w): s of x := w;
    proc load = (ref record x) fieldtype: s of x;

should be used where record is some record-mode containing a component of type fieldtype selected by the selector s. All these problems are mostly induced by using one heap only. For efficiency garbage-collection should be avoided in system programming.

In most system-programs, the set of records can be logically subdivided into classes such that all records of a class show a common behavior concerning storage allocation: e.g. in compilers, symbol table entries are never removed; thus the symbol table may behave similar to a stack; other classes can be handled by a free-list-administration. If we assign to each such class a separate storage zone we can use for every zone the particular storage administration routines requested. Thus garbage-collection can be replaced by a more efficient storage administration scheme making use of the properties of the class. The different administration routines can be supplied from a library; they are specified by the declaration of the storage zone. Every declaration of a record-mode specifies the zone in which records of this mode are allocated.

The ideas explained here can be found in PASCAL. But N. Wirth went only half the way: All records of a class must have the same mode and the free list administration must be organized by the user himself. Moreover the length of the zones is fixed at compile-time. There is no means for expanding one zone by shortening others as used in many system programs.

A more general method can be developed by combining hierarchical ordering and ideas from SIMULA 67 and ESPOL: SIMULA 67 demonstrates how records can be enhanced by additional components by prefixing the declaration of the record-mode by another one. Burroughs' ESPOL contains queue-declarations including algorithms used to allocate space for new records in the queue, to remove records, to insert records at appropriate places of the queue etc. Hierarchical ordering suggests that the storage administration should be done one level below that level where it is used; information local to the storage-administration algorithms should be protected against misuse from the higher level. We therefore propose the following:

- Every zone is declared to be an array. The array-structure is accessible within the zone-declaration only.
- The zone-declaration contains algorithms for storage allocation, a garbage-collector if necessary, etc.
- The zone-declaration may specify a record-mode which is used as a prefix to all records allocated in the zone. The prefix specifies those components of the record containing information for the storage administration.

      struct prefix = (except user level bool marking bit, ...)

  By declaring except user level the components of the prefix are inaccessible from the higher levels.
- Every record-generator is prefixed by the appropriate zone-identifier. This automatically supplies the additional components of the record-mode mentioned in the zone-declaration. Storage allocation is implicitly done by calling the allocation-algorithm of the zone-declaration.
- Other algorithms of the zone-declaration can be called by explicit calls, e.g. REMOVE (reference to record, zone identifier).

This proposal is not yet implemented. Therefore no comment can be made how to extend the scheme such that it allows zones of varying length.
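
Since the proposal is stated as not implemented, the following can only be a loose sketch of how it might look, written in Python under several assumptions: the class `Zone`, its methods `allocate`, `component` and `remove`, the free-list strategy and the marking bit in the prefix are all invented for illustration.

```python
class Zone:
    """A zone: an array of cells together with its own allocation algorithms."""

    def __init__(self, size):
        self._cells = [None] * size                    # array structure, private to the zone
        self._free = list(range(size - 1, -1, -1))     # free list of cell indices

    def allocate(self, **components):
        """Record generator: adds the prefix and returns a reference (a cell index)."""
        if not self._free:
            raise MemoryError("zone exhausted")
        ref = self._free.pop()
        # The prefix (here a marking bit) is kept apart from the user components.
        self._cells[ref] = {"prefix": {"marked": False}, "user": dict(components)}
        return ref

    def component(self, ref, selector):
        """User-level access: only user components are visible, never the prefix."""
        return self._cells[ref]["user"][selector]

    def remove(self, ref):
        """Counterpart of REMOVE (reference to record, zone identifier)."""
        self._cells[ref] = None
        self._free.append(ref)


symbols = Zone(size=2)
r1 = symbols.allocate(name="x", level=1)
r2 = symbols.allocate(name="y", level=2)
symbols.remove(r1)
r3 = symbols.allocate(name="z", level=1)
```

The zone length is fixed when the zone is created, which matches the open question in the text: extending the scheme to zones of varying length is not covered by this sketch.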
66
4.
SYSTEM-DEPENDENT
Compilers tion.
LANGUA@E FEATURES
are machine dependent at l e a s t
They depend on the o p e r a t i n g
handling utility
facilities. routines
Operating handling
This (cf.
systems is
registers,
in
their
system w i t h
1.4 f o r
proposals
code g e n e r a t i n g common w i t h
how to weaken t h i s
on hardware p r o p e r t i e s ,
etc.).
scheme ( l i n e a r ,
The p e r f o r m a n c e
storage
paged,
of many p a r t s
of the speed r a t i o s
sec-
r e s p e c t to the f i l e have in
most
dependency).
of today are machine dependent on purpose:
based s o l e l y
analysis
PORTABILITY
dependency c o m p i l e r s
depends on the a d d r e s s i n g careful
AND
Interrupt
allocation
segmented u s i n g base of the system depends on
between d i f f e r e n t
components of
the computer system. In a l l
programs p e r f o r m a n c e
of the program d i r e c t l y for If
-
the c o m p i l e r
small
of the system programming
the assembler
These s t r i n g s
positioned
by the c o m p i l e r .
anywhere in
correctly
parts use
without
explicitly
in
as p a r t s
any change to
in the assembly program g e n e r a t e d
written
as p r o c e d u r e
T h i s method r e q u i r e s the l a n g u a g e .
that It
is
it
is
calls
possible
used in
to
PL 360 and
in ESPOL.
Other methods can be d e r i v e d The f i r s t
from t h e s e two by macro s u b s t i t u t i o n .
method has the advantage t h a t on s t a t i c
structions nucleus
use t r a n s l a t e s
method.
can be i n t r o d u c e d
the program.
access r e g i s t e r s partly
are t r a n s m i t t e d
PS440 uses t h i s
Machine i n s t r u c t i o n s
control
language in
assembly language we can use assembly language s t r i n g s
o f the program.
storage
allocation
to the a s s e m b l e r .
of an o p e r a t i n g
jumps and o t h e r
This
system.
instructions
the programmer
by w r i t i n g
feature
is
into
achieving
the same:
objects
It
and l a b e l s
explicitly
in
pseudo-in-
needed in w r i t i n g
hardware-defined
must be p o s s i b l e
can get e x p l i c i t
appropriate
There we have to s t o r e
case of the second method we have to s u p p l y tain
by r e c o d i n g
There are two methods in
doing t h a t :
into
-
might be i n c r e a s e d
in machine code.
certain
storage
an a d d i t i o n a l
to s p e c i f y
data,
cells.
In
feature
for
the address
the d e c l a r a t i o n
the
or l a b e l
of c e r defini-
tion. Both methods a l l o w f o r
statically
meters o f the m a c h i n e - i n s t r u c t i o n s . restriction.
allocated This
objects
appears
only
to be p a r a -
to be a v e r y u n d e s i r e d
Machine i n s t r u c t i o n s duce them i n t o
and h a r d w a r e - r e g i s t e r s
a language using o p e r a t i o n - o r i e n t e d
some i n c o n s i s t e n c y .
In p a r t i c u l a r
ency checks and p r o t e c t i o n grammer assumes t h a t rectly
it
is
not
needed in suffice
it
difficult
is
possible
mechanisms o f the
these checks
Using machine code i s is
are w o r d - o r i e n t e d .
for
To i n t r o -
data types
involves
to b y - p a s s a l l
compiler.
and p r o t e c t i o n
consist-
Since the p r o -
mechanisms work c o r -
him to d e t e c t when they f a i l .
n e c e s s a r y not o n l y f o r
increasing
performance.
It
any case where the e x p r e s s i v e power of the language does for
dealing with
hardware p r o b l e m s .
machine dependent data s t r u c t u r e s
On the o t h e r
and a l g o r i t h m s
machine i n d e p e n d e n t language c o n s t r u c t s . a c o m p i l e r may be f o r m u l a t e d w i t h o u t
E.g.,
can be e x p r e s s e d by
the
referring
hand many
code g e n e r a t i o n
of
to machine dependent
constructs. Thus p o r t a b i l i t y written
of a program r e q u i r e s
i n a language s u f f i c i e n t l y
not o n l y t h a t
machine i n d e p e n d e n t .
the c o n t e n t
of the program and the way i t
is
or n o t .
5.
portable
SOME
n o n - e x i s t e n c e o f a b a s i c model f o r are developed w i t h
difficult allowing
a certain
to adapt to o t h e r for
sequential
depends on
by u n s t r u c t u r e d
linear
system programming languages i s file-handling operating
operating
files
the i n t e r f a c e s
statically.
Basically
this
of o p e r a t i n g
On the one hand t h i s
stacks. is
problem i s
All
hand the programmer i s s e r v e d b e t t e r
when he gets
for
instead
write
such t o o l s
himself.
To use o p e r a t i o n - d e f i n e d
The AED Free Storage
data are the On the
standardized
of being f o r c e d
Package (Rc$~
[14])
to is
can be changed.
data types
requires
more e f f i c i e n t
to the problem of union-modes than are a v a i l a b l e c o n t e x t the i m p l e m e n t a t i o n of
other
as p o s s i b l e .
tools
complex data s t r u c t u r e s
a
systems.
done on purpose:
the use o f memory as f a r
a good example how the s i t u a t i o n
the
models
system i n mind and are
other
handling
All
r a n d o m - d e v i c e s are m o d e l l e d
Most system programming languages a l l o w f o r programmer should c o n t r o l
and I / 0 .
systems. Or t h e y are to s i m p l e
only while
address spaces.
problem of s t a n d a r d i z i n g
allocated
It
uses the language whether i t
OPEN PROBLEMS
The most s e r i o u s problem of t o d a y ' s either
the program i s
indexing
Most schemes i m p l y much more s h i f t s
is
not y e t
today.
solutions
In the same
solved satisfactorily.
or m u l t i p l i c a t i o n s
by powers of
two than any assembler programmer would ever write.

Concerning the overall situation of system programming languages I feel that we are still in the beginning of knowing what is really needed. Most languages are criticized either for being too high-level or for being too machine-oriented. But we learn only very slowly where good compromises between these two extremes must be looked for. It seems to be very difficult to distinguish between language constructs systems programmers need and those which they believe they need because they have used them for a long time.
6. REFERENCES

1. Naur, P. (ed): Revised report on the Algorithmic Language ALGOL 60. Num. Mathematik 4 (1963), 420-453.
2. Dahl, O.-J., Myhrhaug, B., Nygaard, K.: SIMULA 67, Common Base Language, revised edition. Oslo: Norwegian Computing Center, 1970.
3. v. Wijngaarden, A.: Report on the Algorithmic Language ALGOL 68. Num. Mathematik 14 (1969), 79-218.
4. Wirth, N.: PL 360, A Programming Language for the 360 Computer. J.ACM 15 (1968).
5. Burroughs B 6700 ESPOL Language, Information Manual. Burroughs Comp. 5000094, Detroit: 1970.
6. Wulf, W.A. et al.: BLISS Reference Manual. Computer Science Department Report, Carnegie-Mellon Univ. Pittsburgh (Penn.), 1969.
7. Goos, G., Lagally, K., Sapper, G.: PS440 - Eine niedere Programmiersprache -, Rechenzentrum der Technischen Hochschule München, Bericht 7002. München 1970.
8. Wirth, N.: The Programming Language PASCAL. Acta Informatica 1 (1971), 35-63.
9. Wirth, N. and Weber, H.: EULER: A Generalization of ALGOL, and its Formal Definition: Part II. Comm. ACM 9 (1966), 89-99.
10. Waite, W.M., Poole, P.: Portability and Adaptability. These Lecture Notes.
11. Richards, M.: BCPL Reference Manual. Technical Memorandum 69/1, Computing Laboratory, University of Cambridge, 1969.
12. Dijkstra, E.W.: Cooperating sequential processes. In: F. Genuys (ed.) Programming Languages. London - New York: Academic Press, 1968.
13. Lucas, P., Walk, K.: On the formal description of PL/I. In: Annual Reviews in Automatic Programming, vol. 6, Pergamon Press, 1970.
14. Ross, D.T.: The AED Free Storage Package. Comm. ACM 10 (1967), 481-491.
CHAPTER 2.C.
LOW LEVEL LANGUAGES
SUMMARY OF A DISCUSSION SESSION
edited by
M. Griffiths
University of Grenoble, France

1. INTRODUCTION

This chapter is a summary of a discussion on low-level languages, during which prepared contributions were presented by BAUER, GOOS, GRIFFITHS and RAIN, together with pertinent comments from GOBIN, GOTLIEB, NIEVERGELT, SONNENBERG, WAITE and WEGNER. The text will not differentiate the contributions of the different speakers.

Over the last five years we have experienced a proliferation of new languages, loosely called 'low-level languages'. The first, well-known example was PL 360 [1], which was rapidly followed by many others. The number of these languages would seem to indicate that programmers felt, and continue to feel, a need for tools which were not normally available. It is our aim to try and analyse this need, to discover whether it is real or psychological and to draw the relevant conclusions concerning the desirable features of such a language. Questions which occur, and which are nearly all still under discussion, include those of generality, machine-independence, degree of efficiency, programming style and education.

2. JUSTIFICATION

The immediate replies to the question 'why create a new low-level language?' are usually either to avoid using assembler or machine code, or because the available languages were inefficient and did not allow complete control of the computer. It is this second answer which merits some thought, since it implies a criticism of existing languages. Would it not therefore be sufficient to improve current, general-purpose languages, in including a selected number of supplementary features, and
in
i m p r o v i n g the e f f i c i e n c y
long t e r m , a short yet
of c e r t a i n
the answer to t h i s
term b a s i s ,
question
of the i m p l e m e n t a t i o n ?
rests
under d i s c u s s i o n ,
but on
seems to be t h a t
we do not
the consensus of o p i n i o n
know how to a l l o w the n e c e s s a r y f e a t u r e
within
the framework of e x i s t i n g
A fundamental Users of
question
low-level
languages are u s u a l l y
cult
clearer,
for
task
the g e n e r a l
control user,
about the c o m p i l e r to e x p l o i t
guage i s terns.
less important
This
sensation
proved a s s e r t i o n . to r e t h i n k
It
is
allocation
automatically,
seem d i f f i -
in a language
but even were the know too much
to d e f i n e ,
but can be
I m p l e m e n t e r s o f t e n wish to t h i n k
in terms of a l g o r i t h m s
then o t h e r
program or not in a g e n e r a l - p u r p o s e
than the c l e a r not d i f f i c u l t
exposition
of their
to pin down, but
has been s a i d t h a t
in terms of a h i g h e r
methods
successfully. i s more d i f f i c u l t
and l e s s their
Not o n l y does i t
over storage
r o u g h l y as programming s t y l e .
in terms of machines,
and are
Since t h e i r
the user would need to
it
Another i m p o r t a n t j u s t i f i c a t i o n
Whether t h e y can w r i t e
software,
in hand, t h e y are more e f f i c i e n t ,
does t h i s
language designed s a t i s f a c t o r i l y ,
described
producing
than a g e n e r a l method.
to a l l o w s u f f i c i e n t
which,
manner
storage allocation.
own s t o r a g e a l l o c a t i o n .
atre adapted to the p a r t i c u l a r and o f t e n
in a s a t i s f a c t o r y
In
languages and c o m p i l e r s .
seems to be concerned w i t h
capable of programming t h e i r
3.
parts
is
users. lan-
thought
pat-
a l s o an un-
the i m p l e m e n t e r would do b e t t e r
level.
FEATURES
The contents of low-level languages have varied from minimal additions to the assembler, which are considered as a departure point, to full-blown languages with a complete range of data and control structures. Examples of these extremes are [2] and [3]. The differences depend, of course, on the use to which the language is to be put, either as the lowest readable level in the language hierarchy, or as the highest level which still knows about the computer.

Some measure of agreement exists as to the minimum control structure necessary in such a language - loops, conditions, procedures, and so on, all implemented in the most efficient manner. For example loops tend to be of the form while ... do or to have constant and once-evaluated step functions. On the other hand, different points of view are presented concerning block structure, parameter passing, and even recursion,
although As f a r
few people defend t h e e l i m i n a t i o n
as data s t r u c t u r e
suggestions
pointer
concerned,
data s t r u c t u r e s ,
checking
is
conversion,
possibility.
v e r y open, s i n c e
declarations
to complex systems
and sometimes
last
the subject
v a r y from a language w i t h o u t
out any g i v e n types with
is
of t h i s
and hence w i t h -
allowing
indexing,
various
structures
and
manipulation.
A further defined
point
of d i s c u s s i o n
lies
in the l i n k s
and t h e machine or assembly
languages proposed a l l o w another.
The i n t e r f a c e
language.
between the language
Most,
if
not a l l ,
of the
t h e use of machine language in one form or
requires
thought,
especially
in a d d r e s s i n g ma-
chanisms. In view of the wide d i f f e r e n c e s we may perhaps t r y contain
to d i s t i n g u i s h
the v e r y s i m p l e ,
guages than d i f f e r e n t of such w o r k , products. are n o t , term
If
which
styles
oriented
low-level,
languages
first
group,
This
of t h e s e would
is
seems i m p o r t a n t
not a c r i t i c i s m
designers
to p o i n t
tools
of c e r t a i n
and t h a t
low-level
which a l l o w
tight
prefer
out t h a t
Languages such as those d e s c r i b e d
programming
lan-
t h e languages which remain
and many of t h e i r
It
projects,
which are l e s s
helped the f o r m u l a t i o n
can be h i g h - l e v e l ,
be problem o r i e n t e d . are s o p h i s t i c a t e d
this
The f i r s t
efforts,
of programming.
has c o n s i d e r a b l y
'machine-oriented'.
between d i f f e r e n t
two g r o u p s .
special-purpose
we e l i m i n a t e
in f a c t ,
which e x i s t
the
machine-
languages can in E 3 ] and [4~
control
of the
computer.
4. MACHINE DEPENDENCE

One problem which arises when we consider machine-oriented languages is whether these languages must be fully machine dependent. It would seem regrettable that we design a new language for each computer, but on the other hand, in order to have complete control, the language must know about the particular machine. In this context there are two different aims to consider. The first is to make algorithms readable, understandable by programmers working in other systems, and the second is to make programs portable. Neither aim can be achieved all the time in a machine-oriented language, which implies that programmers must consider more carefully the structure of their programs.

Opinions differ concerning the proportion of the language definition which takes account of the particular computer, but agreement is reached
on the fact that, even if the language is heavily machine dependent, particular programs should not be so. The major part of any program is machine independent, and it is up to the programmer to isolate those parts which are not. These machine dependent sections require particularly careful documentation.

5. EFFICIENCY
We c o n s i d e r
three types
chine t i m e ,
and memory space.
means to o b t a i n
all
a minimum by the this
three
portant
gains
kinds o f e f f i c i e n c y .
run-time.
true,
in
for
run-time using
language i s
Memory space i s
example in MULTICS, efficiency
written
basis
(see G r a h a m [ 5 ] ) . I n
allows
both
language,
local
b e s t o f both w o r l d s . necessarily
produce
and g l o b a l
utopic
the im-
optimisation any case, p r o -
in the same way as
compact code as in a s s e m b l e r , and
optimisation
The f a c t
may w e l l
is
u n a c c e p t a b l e . The m a c h i n e - o r i e n -
language should encourage programmer e f f i c i e n c y
any h i g h - l e v e l
a
kept a t
this
in P L / I ,
have come from g l o b a l
an a s s e m b l e r i s
ma-
often
has been s a i d t h a t
Recent work would suggest t h a t
and not from r e c o d i n g on a l o c a l grammer e f f i c i e n c y ted
those of programmer e f f o r t ,
The machine o r i e n t e d
use o f a s s e m b l e r programs, and i t
also minimises
not n e c e s s a r i l y
of e f f i c i e n c y ,
that
this
at
run-time.
In g e n e r a l
best of both w o r l d s
is
the
not
have an impact on g e n e r a l - p u r p o s e program-
ming l a n g u a g e s .
6.
STYLE
AND
EDUCATION
Language d e s i g n e r s have at l a s t design
influences
been an e f f o r t with
a lot
become c o n s c i o u s
the programming s t y l e
to e n f o r c e
of d i s c u s s i o n
chine o r i e n t e d
language d e s i g n e r s by t r y i n g
is
(or usually
to c o n t i n u e
irreductible
and t h e y have u s u a l l y
as f a r
for
often
example, goto have a p o i n t
to
use)
h i s own s t y l e .
comes from the f a c t as changes in t h e i r
that
their has
together
statements.
o f view which This d e c i s i o n ,
that
Mais which
system programmers
methods are concerned,
good programming s t y l e
and m e r e l y encouraged by good t o o l s .
language may w e l l
that
The r e s u l t
much more e x p e r i e n c e than the average programmer.
We are aware o f the f a c t cation,
users.
to d e s i g n a language which a l l o w s the user to
in any case a good t h i n g ,
are o f t e n
o f the f a c t
a more e l e g a n t programming s t y l e , concerning,
more f l e x i b l e , find
of their
be used in the e d u c a t i o n
is
c r e a t e d by edu-
Since the m a c h i n e - o r i e n t e d of young system programmers
it is of vital importance that these languages should allow clear, elegant and clean programming. Flexibility in the design should help, at least for the present time, since we are not yet able to define completely these features which lead to 'good' programming.

7. CONCLUSION

In general, the speakers assume that the arrival of machine-oriented languages, sometimes called low-level, responds to a need in software engineering. Thought and effort need to be put into their design and implementation, and we should consider the interface between these languages and the more classical general-purpose languages. Many points which have already received attention over the last fifteen years are being reconsidered in this new context. This effort is not wasted, but should not ignore what already exists.

8. ACKNOWLEDGEMENTS

We thank all the speakers at the session concerned, and hope that they will not consider that their thoughts have been misrepresented. The ideas in the paper are a consensus of opinion, and it should not therefore be supposed that any particular speaker agrees with all of the contents.

9. REFERENCES

[1] N. WIRTH, A Programming Language for the 360 Computer. JACM, Jan. 1968
[2] M. GRIFFITHS, M. PELTIER, A Macro-Generable Language for the 360 Computer. Computer Bulletin, Nov. 1969
[3] M. RAIN, MARY. SINTEF, Technical University of Norway, 1972
[4] G. GOOS, K. LAGALLY, G. SAPPER, PS 440, Eine niedere Programmiersprache. Technische Universität, München 1970
[5] R. GRAHAM, Notes from this school
CHAPTER 2.D.
RELATIONSHIP BETWEEN DEFINITION AND IMPLEMENTATION OF A LANGUAGE
M. Griffiths
University of Grenoble
France
1. INTRODUCTION
The non-specialist sometimes accuses both computer scientists and software engineers of spending all their time on discussions about languages, to the detriment of all the 'real' problems. Whilst this accusation is not without foundation, it must be clearly understood that language is central to the whole problem of software engineering. If we cannot supply powerful, well-defined, understandable languages with corresponding, economic implementations on existing computing equipment, then the programmer can hardly be expected to express himself in a way which will permit us to use the term which is the theme of this school.

In these lectures we will discuss the impact of language definition on the programs written in the language and on the methods used to execute these programs in the computer. Defining a language is a two-fold process: the future user must be allowed ways of saying what he will wish to say, which means that the language content must be suitable, and the way in which this content is expressed is also important, although this importance may be greater for the implementer than for the user. The ultimate aim of compiler writers is to find a way of automatically converting a language definition into an implementation. We are of course a long way from this ultimate aim, but this should not stop us from trying to define languages in a way that is reasonably close to the way in which they are implemented.
1.1. REQUIREMENTS OF DIFFERENT PEOPLE
Three main groups of people are possible readers of a language definition: the users, implementors, and theoreticians. Each of these groups has its own particular requirements, and these requirements can be seen to contain certain incompatibilities. In particular, it seems
that no existing language definition can be used satisfactorily by all three sections of the community.

The user of a programming language requires a document which will answer the question: 'How do I obtain such an effect?' His aim is to create programs, and he is not interested by the programs he will not write. No user exploits all the features of a language, and few are capable of so doing. Their approach is thus synthetic and limited, which means that user requirements are met by a description, with examples, which may well be restricted in order to encourage programming clarity. A standard example of this type of restriction is that of side-effects. Consider the ALGOL 60 program:
    begin integer procedure a;
             begin b := b+2; a := sqrt(b) end;
          integer x, b;
          b := 8; x := a+b
    end

Obviously, since a changes the value of b, the result of the assignment x := a+b depends on the order of evaluation of the operands of an expression. This order is in fact defined in ALGOL 60, but the users' manual may well say: 'do not use expressions in which one operand changes the value of another'. The manual may even suggest that procedures should only change the values of their parameters, but this is a more doubtful axiom under certain conditions. We may consider that keeping the user in ignorance of certain parts of the language is a bad thing, but it is certain that language designers would do well to consider to what degree it is possible to encourage, or even force, users to write clear programs. Modern languages, like ALGOL 68, have made some considerable improvements in this field. In the example program, ALGOL 68 does not define the order of evaluation of operands, and indeed says it is arbitrary. One property of a language definition is thus to be such as to encourage clear, clean programming. This can almost be rephrased to say that programs written by a user who exploits all the details of a language should be as easy to understand as those written in a subset.
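
The dependence on evaluation order can be made concrete with a small sketch. The following Python transcription of the ALGOL 60 program is an illustration only; the function `run` and its `order` argument are assumptions introduced here so that both operand orders can be tried side by side, and the rounding of the square root mimics the assignment of a real value to an integer procedure.

```python
import math

def run(order):
    """Evaluate x := a + b with the two operands taken in the given order."""
    state = {"b": 8}            # b := 8

    def a():
        # Side effect: the procedure changes the non-local variable b.
        state["b"] += 2
        return round(math.sqrt(state["b"]))

    def b():
        return state["b"]

    if order == "left-to-right":
        left = a()              # b becomes 10 before it is read
        right = b()
    else:
        right = b()             # b is read while it is still 8
        left = a()
    return left + right

left_first = run("left-to-right")
right_first = run("right-to-left")
print(left_first, right_first)
```

The two orders give different values for x, which is exactly why a manual may forbid such expressions, or a definition may declare the order arbitrary.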
For the implementer of a language the definition is not just a guide, but a bible. He requires a complete list of every type of phrase which can be written, together with a precise and complete definition of their effects. This analytic point of view is in opposition to the user's synthetic one, and the implementer does not worry whether it is reasonable or not to write certain programs, but simply to apply the law. As in real life, the law may often be an ass, but this is not an excuse to change it unilaterally. One of the measures of the degree to which the definition is unreasonable can be found in the amount of time the implementer spends in working on details which are exploited by few programmers who should not use such knowledge anyway. Side effects due to procedure calls were the first example of this phenomenon, which we may term the phenomenon of 'over-definition'. A corollary of this is that the implementer should not inform his users in what order he does evaluate expressions in ALGOL 68, even though this order may well be fixed in a particular compiler.

To say that an implementer requires a complete, precise definition is of course not sufficient. He is extremely interested in the form of the definition, and in particular in the degree to which it is possible to transform the definition into an implementation. Descriptions of existing languages are not usually in a form which facilitates this transformation, and it is our view that the language definition should be in terms which are much closer to those of the implementer than is presently the case. For example, some of the more modern languages are defined in terms of an abstract or hypothetical machine which acts like an interpreter. The idea is basically a good one, since it is to be hoped that the algorithms written for the abstract machine will correspond closely to the algorithms written for a real machine in the translating process. However two factors make this more difficult than is desirable. The first of these is the fact that the abstract machine usually corresponds to a full interpreter, whereas the translation process more usually used is in two distinct parts. The implementer has therefore the problem of separating between the operations which will be performed at compile time and those which are left to the execution. In the next chapter we will introduce the terms 'static semantics' and 'dynamic semantics' and attempt to indicate how a definition should be given in terms of these two concepts. The second difficulty comes from the level at which the abstract machine is defined. In as much as we discuss high-level, general-purpose languages, the abstract machine is not usually defined in such a way as to be implemented directly on any existing computer, which means that its use is strictly
conceptual. In making this criticism, we must of course remain aware of the opposite trap, which consists in defining a language in terms so close to those of a particular computer that the definition is not applicable to other circumstances.

For the theoretician, a language description must be a complete, formal, mathematically consistent set of statements which conform to the usual mathematical principles of a minimal number of axioms to which everything is formally related. In principle his point of view is nearer to that of the implementer than is the user's, since both are analytic. However, the method used to define axioms will often seem more closely related to the theory of automata than to the way in which computers really work. Their aims are to prove that algorithms can be expressed in the language, or are correct, or to discover the more efficient algorithm in theoretical terms.
The remarks in this section were mostly concerned with the definition of high-level, general-purpose languages. The conclusion we would draw applies also to other types of language, and is that the basic definition of a language should be in implementor's terms, that a user description should be drawn from it, as also may be deduced a satisfactory model. We will suggest in a later section that the definition should also be accompanied by a form of implementation which has been mapped onto a real machine. When all these forms of language definition and description become one document the problem will have been solved; until then we need a constant search for points of reasonable balance.

1.2. DESIGN OF LANGUAGE FOR GOOD PROGRAMMING

We may approach this topic by criticising the contents of current languages, and trying to find remedies. The main problems are concerned with a lack of obviousness, and the phenomena of over-ordering and over-definition, which lead to problems of dependence of different elements of a program. The example of side-effects seen above is a typical one and we should consider it very carefully. Side effects can be created by procedure calls, use of parameters (in particular of parameters by name) and with loops of various sorts. All three are in fact different forms of procedure.

Collateral evaluation is one of the ways of eliminating side effects, since if the user does not know what the order of evaluation is, then he cannot rely on it. In statements like

    x := a+b
the evaluation of a and b in either order normally gives the same result, as is also true in

    a := 1, b := 2.

Whichever assignment is obeyed first is irrelevant. However, what should happen if a and b are not mutually independent? ALGOL 68 says that the result is arbitrary, but we are tempted to go further and forbid mutual dependence in collaterals. Unfortunately this could lead to criticism from users, who may find it normal to write

    a := random, b := random

since whichever is done first leaves them indifferent. Similar problems arise in input/output statements, but we feel that the price paid would be worth-while.

A second aid towards a solution is to eliminate the problem at its source. In the case of loops, we may suggest to forbid interference with control parameters, variable, step size and limit, and in the case of name parameters their simple elimination, leaving parameters of type procedure, or by reference. These measures are already taken in modern languages. For procedures, we may suggest

    no goto to an external label
    no assignment to external variables.

The first suggestion requires no comment, being the centre of considerable discussion already, but the second has some disadvantages. If it is decided not to apply the strict rule, it would be possible at least to insist on a declaration listing non-locals which change values. This list could perhaps be given by the compiler. Not only would this lead to clearer programs, but the resulting information is typical of that required in documentation. Aid to documentation by the compiler is a subject which could give rise to further study.

Further design points which should help the programmer are concerned with over-ordering, for example in loops. To set a vector to zero in ALGOL 60, we write:

    for i := 1 step 1 until n do a[i] := 0

It is clearer, and less error-susceptible, to write: 'for all the elements of a do', or as in ALGOL 68

    for i := lwb a to upb a do a[i] := 0
where lwb and upb are the lower and upper bounds of their operands. This is an example of 'defensive programming', which is programming against errors.
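
The difference between the over-ordered loop and the bounds-driven loop can be sketched in a modern language. The Python fragment below is an illustration under stated assumptions: the function names `clear_with_explicit_bounds` and `clear_all` are hypothetical, and `len(a)` stands in for the lwb and upb operators.

```python
def clear_with_explicit_bounds(a, n):
    # Over-ordered, ALGOL 60 style: the caller must supply n correctly.
    for i in range(n):
        a[i] = 0

def clear_all(a):
    # The bounds are taken from the array itself, as with lwb and upb.
    for i in range(len(a)):
        a[i] = 0

v = [3, 1, 4, 1, 5]
clear_all(v)

w = [3, 1, 4, 1, 5]
try:
    clear_with_explicit_bounds(w, 6)   # n is wrong by one
    outcome = "no error"
except IndexError:
    outcome = "index error"
```

The second form leaves the programmer no bound to get wrong, which is the sense in which it is programming against errors.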
1.3. DESIGN FOR TESTING
The concept of defensive programming was seen above as a counter-measure to over-ordering. The user had to say too much. It has a further application in testing, where the user knows more about the program than does the compiler. The programmer should be able to write what he knows. As a typical example, he should indicate ranges of values at certain points, in particular after input statements, or at procedure entry for parameters. He may wish to write:

    read (x); note x > 0; ...

or

    real procedure sin (x); value integer x; note -1<x; ...

The note statement should not have any side effects. This rejoins some of the automatic proof mechanisms for programs. The tests implied by the note statement may be performed only in debug mode, but this depends on the implementation.
this depends on the implementation. During the discussion preference
for languages
expressions
following
side-effects
which did not define them.
question for the implementor writes
concerning
we stated a
An associated
is to know what he should do if a user
in which the operands
the rulesthis will produce
are not independent.
some arbitrary
result,
By but it
would be a kind action to indicate to the user that his program may well be badly written . Howeve R the process of discovering dependence
is a long one,
to which transitive probably
statements
of building
closures must be applied.
only be printed Further
since it consists
in the debugging
examples
mutual
large matrices
Thus the warning would
compiler.
of the price of finding
in a program can be found in the testing
all nonsensical that array indices
are between their declared 'bounds. This can slow the execution certain programs ting lists, ALGOL68,
by a factor.
In the same way,
in a language manipula-
testing the type of an atom can be very expensive,
such an atom would be a union of a number of modes,
use would be via a 'conforms to and becomes'
of
symbol.
In
and its
For example
:
mode atom = union (int, char) ;
atom a ; int i ; char p ;
a := 1 ;
i ::= a
The use of this construction implies that the compiler will not trust the user to know that a particular atom is an integer, but will produce code which tests for this at runtime, whether the compiler is in debug mode or not. Whether we can pay this price for security or not is still a matter for discussion.
A further example which was shown up by ALGOL68 is that of testing the scopes of references. Should pointers be allowed to point at values which have a shorter lifetime than the lifetime of the pointer? The ALGOL68 answer is no, but we could conceive of a language which allowed this, going wrong only when necessary, or printing a warning. However, these alternative solutions do not appear to be very satisfactory.
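The run-time test implied by the conformity construction shown earlier in this section can be sketched as follows. This is only an analogy in Python, where every value carries its type; the names `Atom` and `conform_to_int` are assumptions made for the illustration and do not come from ALGOL68.

```python
from typing import Optional, Union

Atom = Union[int, str]   # stands in for: mode atom = union (int, char)


def conform_to_int(a: Atom) -> Optional[int]:
    """Rough analogue of  i ::= a : succeed only if the atom currently holds an integer."""
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(a, int) and not isinstance(a, bool):
        return a
    return None
```

The test is executed on every use, whatever the mode of compilation, which is the cost discussed in the text.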
2.
LANGUAGE DEFINITION
In this section we consider the technical problem of defining a language, that is to say the form of the document which is the bible for the language concerned. We will look at this problem from the point of view of the implementer, with the theory that he is the person most affected by the strict definition. The requirements of the other two classes of readers will be considered separately.
For the purposes of definition it seems convenient to consider not only a separation between syntax and semantics, but also one between static and dynamic semantics. Static semantics is that part of the semantics which does not depend upon the execution of a program, like the relationship between the use of an identifier and its declaration. Dynamic semantics considers the actual manipulations of objects and of values which take place during the execution of a program. In a classical implementation, the compiler will normally perform that which is defined as static semantics, and will produce machine code which will in its turn obey the rules defined by the dynamic semantics.
2.1.
SYNTAX
If a language description is to be based on a syntax, and we will not in this section discuss any languages which are not, then the form of the grammar is important to the implementor. Since he wishes to make an analyser out of this grammar he requires it to conform to certain criteria, some of which depend on the method of analysis to be used. Since the criteria differ for different methods, it is reasonable to suppose that the definition will make a decision concerning the method of analysis. Since efficiency is to be hoped for, the method chosen will be one which can only treat a subset of the context-free languages, for example simple precedence [4] or left-factored (otherwise known as LL(1)) [5]. The choice of one of these methods implies the more important property of being non-ambiguous, since if the grammar is analysable by means of one of these restricted methods, standard tools can be applied to prove the conformity of the grammar. These tools will also ensure the absence of parasites, non-producing symbols and undefined symbols. Not using tools will often mean that there will be errors in the grammar.
This attitude may lead to criticism on several grounds. First of all one can suggest that it is a way of stifling progress, and this would have been true ten to fifteen years ago. At that time no techniques were available, and so we could not insist on using them. At present, it is merely good engineering practice to make products using proven techniques, leaving research workers in computer science departments to prove new ones. We do not consider a new language to have reached its final form until its definition and implementation have been proved in the laboratory, since weak points in the definition will always need to be changed as a result of implementation. Until this stage is reached, user programmers should not have access to new languages. The feedback is best obtained by carrying out implementation and design at
the same time; even if the implementation is sketchy, this idea is extremely important.
Grammars should also be shortened, if not minimised. For example it is unreasonable to read:
<boolean assignment> ::= <boolean variable> := <boolean expression>
<arithmetic assignment> ::= <arithmetic variable> := <arithmetic expression>
Not only are these rules ambiguous, since a := b is not known syntactically to be boolean or arithmetic, but also they confuse concepts which are syntactic with the semantic notion of type. The difference between boolean and arithmetic objects is static semantics, and putting this in the syntax helps nobody.
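Two of the grammar checks mentioned in this section, the detection of undefined symbols and of non-producing symbols, can be sketched in a few lines. The sketch below is in Python and is only an illustration; the function name `check_grammar` and the representation of rules as lists of alternatives are assumptions, not a description of any existing tool.

```python
def check_grammar(rules, terminals):
    """rules maps each nonterminal to a list of alternatives (lists of symbols).

    Returns (undefined symbols, non-producing nonterminals).
    """
    used = {s for alts in rules.values() for alt in alts for s in alt}
    undefined = {s for s in used if s not in rules and s not in terminals}

    # A nonterminal is producing if some alternative contains only terminals
    # and nonterminals already known to be producing (fixed-point iteration).
    producing = set()
    changed = True
    while changed:
        changed = False
        for nt, alts in rules.items():
            if nt in producing:
                continue
            if any(all(s in terminals or s in producing for s in alt) for alt in alts):
                producing.add(nt)
                changed = True
    return undefined, set(rules) - producing
```

A test for the LL(1) condition itself would follow the same fixed-point pattern, applied to the sets of starting symbols of each alternative.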
2.2.
STATIC SEMANTICS
Static semantics is that part of the definition which can normally be treated at compile-time, that is to say the static relationships between elements of the program. These concern mostly the association of uses of identifiers with their declarations and the corresponding type information, that is to say the classical problems of static scope. The relationships will be expressed in terms of the syntax tree produced automatically by the analyser. In some cases the tree will be transformed to a form which is more convenient, and then information will be attached to individual nodes of the tree. This information will serve later for the expression of the dynamic semantics. We will use the term 'property' of a node for the information which is attached to it. These concepts will be seen in the example in section 2.4.
2.3.
DYNAMIC SEMANTICS
The dynamic semantics describe an interpreter of the program which operates on the revised form of the syntactic tree which results from the static semantics. They correspond to the execution of the program, and are thus concerned with the data structures which correspond to each declaration, accessing functions, the control structure of the program, and other functions of that sort. Like the static semantics, the dynamic semantics will be illustrated by the example in the next section.
2.4.
EXAMPLE TAKEN FROM ALGOL60
The following example is included with the aim of clarifying the brief descriptions of the different concepts given above. It is simply illustrative and is both incomplete and untested by implementation. Further work would therefore be necessary before it could become useful in the practical sense.
2.4.1.
Syntax
The syntax which follows is of ALGOL60 without own, string or numerical labels. The method used is a simplification of the scheme used in [6]. The brackets { and } indicate that their contents may or may not be present, an asterisk indicates repetition, and a further pair of brackets groups its contents. The symbol → stands for the Backus-Naur [7] symbol ::= and the vertical bar has the same sense as in the Backus-Naur form. Rule numbering is for later reference. The rules are LL(1), except in a limited number of cases where their transformation would make comprehension less easy. An LL(1) equivalent of these rules is given later. The axiom of the grammar is 'Block'.
1. Block
÷ begln ~ D ; ~x S { ; S }x end
2. O
÷ Declarer ! Idlist Arrayd
I Procd
I Arrayd
I Procd ~ I
I
Switch Id := Desex { , Desex }:~ 3. Arrayd
÷ array Bounds { , Bounds }x
4. Procd
÷ procedure Id Formals S
5. Idllst
÷ Id { , Id }~"
6. Declarer ÷ real linteger 7,
Bounds
8, Desex
÷ Idlist
*Id
Ex then
(Desex))
10. Specs
[ Ex : Ex {
, Ex : E x } x ]
{[ Ex ] } Z if
9. Formals
I boolean.
{
( Id else
[ Ex ]
} i
Desex
÷ ; 1 (Idllst) ; Specs ÷ { value Idlist j } ~ Specifier !dl±st;)__'~
11, Specifier÷ Declarer{procedure arr a~ I label
I array}
I procedure
I switch
12. S
~ NLUS I NLCS I I d
: S
13. NLCS
÷ if Ex then {Id:} x NLUS {else S} I for Var := Forlist do S
14, NLUS
÷ begin S {~S} ~ end I Bl°ck I g°t° Desex I
! vat ==2' ~ 15, Exlist
Ex { , Ex }-"
I ~d {mx~ist~} I
I
16. Vat
+ Id {[Exlist]}
17. Forlist ÷ Porel {, Forel} '~ 18. Forel
÷ Ex {while Ex I step Ex until Ex]
19. Ex
÷ Exl
{~ Exl }
20. Exl
÷ EX2
{= EX2 }
21. EX2
÷ Ex3
{ o r Ex~ }::
Ex3
÷ Ex4
{and Ex~ }:"
22,
23, Ex4
÷ { n o t } ; " Ex5
{ R e l o p Ex5 }
24, Relop
+ > 1>_I
25. Ex5
÷ Ex6
26, EX6
÷ {+
27. Ex7
÷Ex8
28. Ex8
÷ Prim { + P r i m } : :
29. Prim
÷ SimplexIi__f_f Ex then Simplex else Simplex
< !<_I
= I /
{[_+ I -)_ EX7 }:= I - } Ex7
{1"
I / I +ZEx8 }':
30. simpl~× ÷ ~Ex~ I true I fals~ I Zntno I R~olno I I d {[Exlist]
I (Exlist)}
The lexical analyser is assumed to eliminate comments and layout characters, and to furnish underlined words, identifiers, numerical constants and symbols. The grammar could still be shortened, but this would be of strictly limited interest. The above grammar would be accepted by the LL(1) analyser-producing program made by Bordier [8], based on Foster's SID program [9], if rules 12 to 14 were rewritten as follows:
12. S
÷ Id {:S I $1} I $2 ] if Ex then $3 {else S} 1 for Ferlist do S
13. $1
÷ {[Exlfst]}
:= { V a t
14, S2
+ b e g i n {D~}:" S { ; S } x" end I g o t o Oesex
31. $3
+ $2 ] I d
{:$3
:=}:" Ex
[Exllst)
] S1}
This version is LL(1) if the lexical analyser distinguishes : from :=. The semantics will use the numbers of the rules of the original grammar, since the transformations applied can be shown to be correct. Thus we have a clear, unambiguous, compact grammar which conforms to the criteria developed above.
The syntax tree which is produced by the analyser has the obvious form for the constructions of the above grammar without asterisks. In the case of repetitions, these are done at the same level in the tree; for example the tree for a block has the form:
                     Block
                       |
  begin  D  ;  D  ;  ...  S  ;  S  ...  end
2.4.2.
Static semantics
Three essential actions are accomplished by this part of the definition:
- Declarative information is accumulated in declaration tables, which are the properties of nodes which are 'scope' nodes (in ALGOL60 blocks and procedure declarations).
- Uses of identifiers are identified with their relevant declarations.
- Some trivial transformations of the tree make it more suitable for the dynamic semantics.
These three actions are accomplished by means of functions which are brought into play at the different nodes of the syntax tree, in connection with the relevant syntax rules.
1) - A Block node has two properties. It is a scope node, and it possesses a declaration table. The entries in the declaration table are provided by the D and S nodes in the expansion of Block. Each entry corresponds to an identifier (Id). No identifier may correspond to two entries in the declaration table.
2) - The term 'nearest declaration table to a node N' is defined to mean the declaration table attached to the first scope node encountered in ascending the tree from node N. Each expansion of D puts one or more entries in its nearest declaration table. The contents of this entry are, for the different expansions:
- Declarer Idlist creates one entry for each Id of the Idlist, the contents of the entry specifying a type which is the expansion of Declarer given by rule 6.
- Declarer Arrayd creates the entries given by the property of the node Arrayd, to which are attached the type X array where X is the expansion of Declarer by rule 6.
- Declarer Procd creates the entry given by the property of the node Procd, to which is attached the type X proc where X is the expansion of Declarer by rule 6.
- Arrayd has the same effect as Declarer Arrayd has when Declarer expands to real.
- Procd creates the entry given by the property of its node together with the type proc.
- Switch ... creates an entry of type switch which corresponds to the identifier.
3) - The property of Arrayd is the union of the properties of the Bounds.
4) - The property of Procd passed to the D immediately above it is the identifier Id. In addition Procd is a scope node and has a declaration table. The entries in this declaration table are, firstly those provided by Formals, followed by those provided by S.
5) - The property of Idlist is the list of its identifiers. No identifier may occur twice in Idlist.
7) - The property of Bounds is the list of identifiers from Idlist. No identifier used in any of the expressions (Ex) must correspond to an entry in the nearest declaration table.
8) - The identifier of the first or of the second alternative (if it occurs) is identified. The process of identification proceeds as follows. The nearest declaration table is consulted. If the identifier corresponds to an entry in that declaration table, the identifier indicates that entry in that table, otherwise the nearest declaration table of the scope node is taken and the process repeated. If the identifier cannot be identified its use is illegal. The type of the entry indicated by the identifier must be switch if the identifier is followed by [Ex], otherwise it must be label. The Ex of [Ex] must be of type integer, or of type real, in which case it receives the property 'convert to integer'. The Ex after if must be of type boolean.
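The process of identification described in rule 8, together with the restriction of rule 1 that no identifier may have two entries in one table, can be sketched as follows. This is an illustrative Python sketch under our own assumptions; the class name `ScopeNode` and its methods are hypothetical and are not taken from the definition.

```python
class ScopeNode:
    """A scope node (block or procedure declaration) with its declaration table."""

    def __init__(self, parent=None):
        self.parent = parent   # next scope node met in ascending the tree, or None
        self.table = {}        # declaration table: identifier -> type

    def declare(self, ident, type_):
        # Rule 1: no identifier may correspond to two entries in the table.
        if ident in self.table:
            raise ValueError(f"{ident} declared twice in the same table")
        self.table[ident] = type_

    def identify(self, ident):
        """Consult the nearest table, then that of each enclosing scope node in turn."""
        scope = self
        while scope is not None:
            if ident in scope.table:
                return scope, scope.table[ident]
            scope = scope.parent
        raise LookupError(f"use of {ident} is illegal: no declaration found")
```

The scope node returned by `identify` is the information which the dynamic semantics later needs in order to find the data space of the identifier.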
9) - Formals provides entries in the nearest declaration table for each of the identifiers in (Idlist). The contents of these entries are provided by Specs.
10) - Each identifier in the Idlist following value must occur in one of the Idlists following specifier. The type value is added to the type of the corresponding entry in the declaration table. The identifiers in the Idlist following specifier must correspond one to one with the identifiers in the Idlist of Formals. The type given by the specifier is inserted in the corresponding entry in the declaration table.
11) - The type given by specifier is its expansion, with Declarer being replaced by its expansion from rule 6, together with the type parameter.
12) - The identifier creates an entry in the nearest declaration table with type label.
13) - The Ex after if must be of type boolean. The Id creates an entry in the nearest declaration table with type label.
14) - In the expansion Id {(Exlist)} the following rules must be observed. The Id is identified and must indicate an entry of type procedure. The Procd which created this entry is compared with the expansion under consideration as follows. If the Exlist of the expansion is absent, then the Formals of the Procd must have been accepted by the first expansion of rule 9. Otherwise the second expansion of rule 9 must have been applied. The number of Ex in the Exlist must be equal to the number of elements in the Idlist of the second expansion of rule 9. In addition, comparing the elements of the Exlist with the elements of the Idlist in left-to-right order, the type of each Ex must be the same as the type indicated by the identification of its corresponding Id without the words parameter or value.
In the expansion {Var :=}* Ex the type of each Var must be the same. The type of Ex is made compatible with this type. The phrase 'expression E is made compatible with type T' has the following meaning: If E has type T, no action is taken. If the type of E is different from T and one of the types is boolean then the program is incorrect, otherwise E takes the property 'convert to T'.
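The phrase 'expression E is made compatible with type T' translates almost directly into a small function. The sketch below is in Python; the function name `make_compatible` and the representation of types as strings are assumptions made for the illustration.

```python
def make_compatible(expr_type, target_type):
    """Return the property to attach to the expression node, or None if no action is taken."""
    if expr_type == target_type:
        return None                       # E already has type T
    if "boolean" in (expr_type, target_type):
        raise TypeError(f"program incorrect: {expr_type} used where {target_type} is required")
    return f"convert to {target_type}"    # only integer/real mixtures reach this point
```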
16) - The Id is identified. If the Exlist is present, the type indicated by Id must include array and the number of elements of Exlist must be equal to half the number of Ex in the Bounds which created the corresponding entry in the declaration table. Each Ex in the Exlist is made compatible with the type integer. The type of Var is that one of integer, real or boolean to be found in the type indicated by Id.
18) - The Ex after while must be of type boolean. The first Ex is made compatible with the type of Var of rule 13 which precedes the Forlist which expands to this Forel. If step is present, the type of this Var must not be boolean, and the Ex after step, together with the Ex after until are made compatible with this type.
19) - If ≡ is present, Ex takes the type boolean, and the two Ex1 must be of type boolean, otherwise Ex takes the type of the Ex1.
20) - As 19 with ⊃, Ex1 and Ex2 substituted for ≡, Ex and Ex1 respectively.
21) - If or is present, Ex2 takes the type boolean and all the Ex3 must be of type boolean, otherwise Ex2 takes the type of Ex3.
22) - As 21 with and, Ex3 and Ex4 substituted for or, Ex2 and Ex3 respectively.
23) - If not is present, Ex4 takes the type boolean and the Ex5 must be of type boolean. If Relop is present, Ex4 takes the type boolean and the two examples of Ex5 must not have type boolean. If neither not nor Relop is present, Ex4 takes the type of Ex5.
25) - If + or - is present, no example of Ex6 may be of type boolean and if all the Ex6 are of type integer Ex5 takes the type integer, otherwise Ex5 takes the type real. If + and - are absent Ex5 takes the type of the Ex6.
26) - If + - ~ / ÷ are all absent, Ex6 takes the type of Ex7. If any of them are present, no Ex7 may be of type boolean. If any Ex7 which neither immediately precedes nor immediately follows ÷ is real the Ex6 takes the type real, otherwise the type integer.
27) - If ÷ is present the Ex7 takes the type real, otherwise Ex7 takes the type of Prim.
28) - For the first alternative, Prim takes the type of Simplex. In the second alternative, the expression after if must be boolean. If either Simplex is of type boolean then both must be and Prim takes the type boolean, otherwise if both are integer the Prim takes the type integer, otherwise Prim takes the type real.
29) - Simplex takes the type of its applied alternative as follows:
- The type of (Ex) is the type of the Ex
- The types of true and false are boolean
- Intno is integer and Realno is real
- The Id is identified and the Prim is that one of integer, real or boolean which is indicated. If [ ] are present the relevant parts of rule 16 are applied. If ( ) are present the relevant parts of rule 14 are applied.
2.4.3
Dynamic semantics
The dynamic semantics have knowledge of the stack in which all information is stored. They are again given in terms of syntax rule numbers. The result of a program is the result of evaluating its syntax tree.
1) - The evaluation of a block is preceded by the stacking of
- An indication of the position in the stack of the preceding block or Procd on the stack.
- Data space corresponding to each entry in its declaration table which is of type real, integer or boolean.
- Data space corresponding to each entry in its declaration table whose type contains the word array. The definition of this part of the language will not be given here.
The evaluation of a block is the successive evaluation of each constituent S until either the end is reached or a goto statement terminates the block (see rules 8 and 14). The termination of a block removes from the stack those objects which were put on to it at the start of the block.
2) - The evaluation of switch ... is the evaluation of the n th Desex, where n is the index passed by the object which provoked the evaluation of the switch.
4) - The evaluation of a Procd is preceded by the stacking of the following:
- An indication of the position in the stack of the preceding Block or Procd on the stack. This will be referred to as the 'dynamic scope pointer'.
- An indication of the position in the stack of the block in the declaration table of which occurs the identifier of the Procd. If this block occurs more than once in the stack, the most recent occurrence is indicated. This will be referred to as the 'static scope pointer'.
- For each member of the Idlist of Formals (if it exists) whose corresponding type in the declaration table is real value parameter, integer value parameter or boolean value parameter, a data-space for a real, integer or boolean object, respectively. This data-space is filled with the value provided by the evaluation of the object which provoked the evaluation of Procd.
The evaluation of Procd is the evaluation of the S. The termination of the evaluation of Procd requires the removal from the stack of the objects which were stacked at the beginning of its evaluation.
8) - If the form of Desex was Id, the Id is found (see below). Those Blocks and Procd which have been stacked more recently than the one in which the Id was found are terminated. The evaluation proceeds with the statement following the occurrence of the Id which provoked the entry in the declaration table.
If the form of Desex was Id [Ex], the expression is evaluated, the Id is found and the more recent Blocks and Procds terminated. The value of the expression is passed as index to the switch corresponding to the entry corresponding to Id in the declaration table. This switch is evaluated.
If the form of the Desex was if ..., the Ex after if is evaluated. If its value is true then the part between then and else is evaluated as a Desex, otherwise the Desex after else is evaluated.
12) - The evaluation of S is the evaluation of the relevant alternative.
13) - If the form of the NLCS is if ..., the value of the Ex is evaluated. If this value is true then the NLUS is evaluated, otherwise the S (if it exists). The evaluation of for ... is not given.
14) - The evaluation of the different alternatives is as follows:
- begin ... each S is evaluated in turn.
- The Block is evaluated
- the Desex is evaluated
- the Variables are found, the expression is evaluated, and the value is put into the data-space corresponding to each Variable.
- the Id is found. If the Exlist exists, each Ex of the Exlist which corresponds to an entry in the declaration table of the found Id which is of a type which contains value is evaluated. The values are passed to the Procd corresponding to the found Id, which is evaluated.
16) - Array references are not described. Finding Var is equivalent to finding the Id.
19 - 26) - The evaluation of the different levels of expression will not be given.
29) - The evaluation of Simplex is the evaluation of the corresponding alternative, as follows:
- the Ex is evaluated
- the constants have their obvious value
- the Id is found, and the value is the contents of its corresponding data-space
- See the last alternative of rule 14. The value is the value of the Procd.
Many details have been omitted in this exposition of the dynamic semantics, for example for statements, array references and the value of functions. The aim of the example is to give ideas, and not a complete definition of ALGOL60.
The most important part of the dynamic semantics is the concept of 'finding' an identifier, which will be defined as follows: It is a property of the node of each identifier to indicate the block or procedure in whose declaration table the identifier occurs. Finding an identifier means therefore finding the data space in the stack corresponding to the entry in the declaration table. This data space is in the space taken in the stack at entry to the block or procedure. Consider the most recent block or procedure. If it is not the required block then consider the block or procedure indicated by this block or by the static pointer of the procedure and repeat the process, otherwise the data space is found.
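The process of 'finding' an identifier can be sketched as a walk along the stack. The following Python sketch is an illustration only; the class `Frame`, the function `find` and the representation of a data space as a dictionary are assumptions of ours, and a block entry is taken to indicate simply the entry which precedes it.

```python
class Frame:
    """One stack entry, created at entry to a block or procedure."""

    def __init__(self, scope, previous, static_pointer=None):
        self.scope = scope                    # the block or procedure concerned
        self.previous = previous              # preceding entry on the stack
        self.static_pointer = static_pointer  # given only for procedure entries
        self.data = {}                        # data space: identifier -> value


def find(top, declaring_scope):
    """Return the data space of the most recent entry for declaring_scope."""
    frame = top
    while frame is not None:
        if frame.scope == declaring_scope:
            return frame.data
        # A procedure entry indicates its static pointer, a block its predecessor.
        if frame.static_pointer is not None:
            frame = frame.static_pointer
        else:
            frame = frame.previous
    raise LookupError(f"no stack entry for {declaring_scope}")
```

For a procedure P declared in block B0 but called from an inner block B1, the walk goes from the entry for P directly to that of B0, so that a variable of B1 with the same name is never consulted.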
2.4.4
Comments on the example
The example given above could not serve as a definition of ALGOL60, since it is incomplete and untested. The aim of this course is not, however, to redefine ALGOL60, but to try to indicate the way in which we might define a language to be implementable. Thus we will continue to develop the example by making it resemble more and more the program (compiler) which is its equivalent.
In order to produce a compiler from the definition, it will be necessary to rewrite it in more machine-like terms, or at least in more symbolic terms. The next step is therefore to find a definition/implementation language which allows us to express the information we considered necessary in the definition. This language should be able to be implemented on different machines. It should be noted that this way of working is not a return to UNCOL [10] or to any system in which the 'universal' intermediate language is defined, since the definition language is particular to the high level language which is being defined. We also suggest that the implementation of the definition language be left open, since it is probable that the implementor will rewrite in any case.
3.
FROM DEFINITION TO IMPLEMENTATION
Since our ideal of an automatic transformation of a definition into an implementation is not yet feasible, the programmer must perform this transformation by hand. His method of working is to separate the subject matter into a number of logical steps, firstly in deciding what can be done at compile time and what must wait until run-time, and then in dividing the compiler into a number of passes. This division depends on the order of arrival of information in the source text. Problems which arise can be trivial but may often not be so. They are often due to using an object which is declared 'lower down the page'. This occurs with identifiers in many languages, and is the principal logical reason for multi-pass compilers. The division of work between different phases of the compiler is not helped by the type of language definition given above (apart from the separation of static and dynamic semantics, which splits compile time from run time). Thus we suggest that even the 'implementer-oriented' definition given in the preceding chapter is not good enough, and the language should include an "implementers' guide", which is essentially the design of an ideal compiler. This compiler could be specified, at least to some extent, by the use of syntax directed semantic routines, which are called semantic functions in what follows. We should consider carefully the language in which these semantic functions might be written.
3.1.
SEMANTIC FUNCTIONS
Established compiler-compiler methods allow the insertion in the grammar of function calls to obey hand-written semantic procedures during the analysis process. The classic example, taken from [9], is the following evaluation of an integer:
Grammar:
Integer ::= Digit f1 X
X ::= Digit f2 X | (empty)
Functions:
f1 : value ← digit
f2 : value ← value × 10 + digit
Similar calls are allowed in bottom-up methods at each reduction. All the rules of static semantics can be written in terms of these functions, with the addition of the set of procedures or macros which are frequently used.
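The integer example can be followed by hand with a small recursive-descent style analyser in which f1 and f2 are called at the points marked in the grammar. The Python sketch below is only an illustration; the function name `parse_integer` and the error handling are our own assumptions.

```python
DIGITS = "0123456789"


def parse_integer(text):
    """Analyse  Integer ::= Digit f1 X ;  X ::= Digit f2 X | (empty)."""
    value = None

    def f1(digit):
        nonlocal value
        value = digit               # f1 : value <- digit

    def f2(digit):
        nonlocal value
        value = value * 10 + digit  # f2 : value <- value * 10 + digit

    if not text or text[0] not in DIGITS:
        raise ValueError("an integer must start with a digit")
    f1(DIGITS.index(text[0]))
    for ch in text[1:]:             # the repetition expresses the recursion on X
        if ch not in DIGITS:
            raise ValueError(f"unexpected character {ch!r}")
        f2(DIGITS.index(ch))
    return value
```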
The difficult, and extremely important decision is the choice of the language in which these functions will be written. Before discussing possible ways in which this decision might be taken, we look at a typical part of the static semantics in order to get some idea of the facilities that the language might contain. We consider the insertion of the declaration of a list of real variables in the declaration table. The grammar rules concerned with this part of the example are:
2. D        → Declarer Idlist | ...
5. Idlist   → Id {, Id}*
6. Declarer → real | ...
Functions are inserted as follows:
2. D        → Declarer Idlist f1 | ...
5. Idlist   → Id f2 {, Id f3}*
6. Declarer → real f4 | ...
f1 : while listofids ≠ null do
     begin add (dectable (localblockno), (lastdeclarer, head (listofids))) ;
           listofids ← tail (listofids)
     end
f2 : listofids ← null ; add (listofids, idnumber)
f3 : add (listofids, idnumber)
f4 : lastdeclarer ← 'real'
Within these functions, 'add' adds the second of its parameters to the list which is the first parameter. 'Head' and 'tail' have their usual list-like sense. The functions are meant to be understood, rather than compiled.
The example shows that the implementation language for the static semantics must contain a certain number of facilities which are sometimes described as sophisticated, for example a minimum of list-processing and the treatment of aggregates of various sorts. Individual functions are likely to be short pieces of program, and hence the control structure of the language need not be complicated, but the data-structures are very language-dependent (their implementation will also be very machine-dependent).
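For readers who prefer to run the functions f1 to f4 rather than read them, they can be transcribed into Python as below. This is a sketch under stated assumptions: the class name `DeclarationBuilder` is hypothetical, identifiers are represented by their names instead of an idnumber, and the declaration table of a block is a simple list of pairs.

```python
class DeclarationBuilder:
    """Holds the variables used by the semantic functions f1 to f4."""

    def __init__(self):
        self.dectable = {}        # block number -> list of (declarer, identifier)
        self.localblockno = 0
        self.lastdeclarer = None
        self.listofids = []

    def f4(self):
        self.lastdeclarer = "real"

    def f2(self, idnumber):
        self.listofids = []               # listofids <- null
        self.listofids.append(idnumber)   # add (listofids, idnumber)

    def f3(self, idnumber):
        self.listofids.append(idnumber)

    def f1(self):
        table = self.dectable.setdefault(self.localblockno, [])
        while self.listofids:
            head = self.listofids[0]
            table.append((self.lastdeclarer, head))
            self.listofids = self.listofids[1:]   # listofids <- tail (listofids)
```

For the declaration real a, b, c the analyser would call f4, then f2 for a, f3 for b and for c, and finally f1 when the whole of rule 2 has been recognised.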
3.2.
IMPLEMENTATION LANGUAGES
We have suggested above that the syntactic definition of a language should have a particular analysis method in mind. In the same way it would seem reasonable that the definition of the semantics of the language be in terms of a particular implementation language. This has the double advantage of forcing the language designer to consider the implementer's terms, at the same time as approaching our aim of automatic transformation of definition into implementation. This means that consideration of specialised languages for the production of software is an important topic [22]. In addition this once again limits the implementer, who either obeys the rules, or takes on his own head any changes he may wish to make. This encouragement towards a certain logical standardisation will also help in portability, adaptability and maintenance, since different groups working in the same area should be able to apply each others' results.
3.3.
EXECUTION MODEL
The static semantics can be defined as they are programmed, since they represent essentially that part of the language which is treated at compile time. The dynamic semantics, in a normal compiler, are obeyed at execution of the program. They are thus expressed in terms of an interpreter which works on a double data structure, the first part of which is the representation of the program, the second being the data. This representation of the dynamic semantics can be expressed in natural language (as in ALGOL 68 [12]) or in some algorithmic or mathematical one (an effort is made with PL/I [13]). These two examples will be examined in a later section. An essential rule in defining the execution model by a formal interpreter, or by a natural language description of an interpreter, is that given a choice between two ways of describing some phenomenon, it should be automatic to choose the algorithm which most resembles what is likely to be implemented. This rule has not always been followed in definitions, since, for example, the 'copy rule' is never implemented in practice. A slightly different idea is to give not an interpreter for the dynamic semantics, but equivalences of each concept in some
lower form of language. This is the basis of the extensible language philosophy, and the language is simply defined as a level in a hierarchy. These concepts will also be examined in a future section. They are not yet well-enough established to come under the heading of software engineering. One obvious way for defining equivalences is to use some macro system, with as many layers as desired.
The difference between these two methods is not as great as it would appear from their external form, the interpreter essentially defining equivalences in a language which is strictly limited. However, much more static semantics are necessary in the definition which uses the interpreter, since in the case of a language defined by extension from a lower-level language, static semantics need only be defined at the lower level. In practice it should be noted that there may well be static semantics at each level. As an example of this, we may define a simple loop by its equivalence in terms of a test and goto, as follows:
while Expression do statement
is equivalent to:
l : if Expression then
    begin statement ;
          goto l
    end
The static semantics requiring the expression after while to be boolean are covered by those requiring an expression after if to be boolean. This would seem to be an advantage for the mechanism of extension, since the definition would seem to be more compact, but it is not yet clear how to implement this compact version with as much efficiency as the classical one.
A small point of terminology to conclude this section: a language which is defined by extension is not necessarily itself an extensible language, this term being reserved for languages which contain within them mechanisms which allow them to be extended.
3.4.
FINAL COMMENTS ON IMPLEMENTATION
In this section we have not described particular implementation methods, but have tried to see what influence the definition of the language may have on its implementation. It is clear that the influence is considerable, since it is easier, safer and more efficient to implement a language by directly following its definition. It is this idea which prompts us to suggest that the definition of a language should foresee
its implementation, and that the implementation should be strongly directed from the very start. There is here a parallel with the architecture of computer hardware, which is likely improve conceptually each time the hardware engineer works closel~enough to the software engineer and hence considers the use to which the machine may be put. The use to which a language definition is put is in the first instance the implementation, We consider, therefore, that a language definition should be in terms of an idealised implementation, or at least be accompanied by an "implementers' guide", which may be at two different levels. A language issued for general use should not be given to implementers as a challenge : "here it is, now it is your turn", since the two processes are mutually dependent and the feedback loop between them is of considerable importance,
4. A LOOK AT SOME DEFINITIONS
In this section we will look at some existing language definitions from the point of view which has been developed above. Since the definitions which we consider were not necessarily written for the same reasons as those we have exposed, criticisms of particular points in the definitions should not be taken as criticisms of the project concerned in the context in which the project was carried out.
4.1. ALGOL68
ALGOL68 is defined [12] in terms of the famous two-level grammar, together with a stylised English text describing the interpretation of a program on a hypothetical computer. It is not the intention of this course to present in detail this much-discussed document, but the concept of double grammar is important. For simplicity, we show an example from ALGOL60. The double grammar can express compatibility of type, for example in the ALGOL60 assignment

Assignment : TYPE variable, assignment symbol, TYPE expression.

The two occurrences of TYPE must, in a given expansion of 'assignment', give rise to the same expansion in the rule

TYPE : arithmetic type ; boolean type.

(Note that the colon corresponds in some sense to the ::= of Backus normal form, that the semi-colon separates alternatives and the comma is the concatenation operator.) This example does not seem very powerful, but it becomes so as soon as we allow recursion.
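The consistent-substitution idea behind the double grammar can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `METANOTIONS`, `HYPER_RULE`, `expand` and `derivable` are hypothetical, and the sketch handles a finite, non-recursive metanotion only, not the full ALGOL68 mechanism.

```python
# Hypothetical metanotion table: TYPE stands for one of these alternatives.
METANOTIONS = {"TYPE": ["arithmetic type", "boolean type"]}

# The assignment rule from the text, with TYPE occurring twice on the right.
HYPER_RULE = ("assignment",
              ["TYPE variable", "assignment symbol", "TYPE expression"])


def expand(hyper_rule, metanotions):
    """Return the ordinary rules obtained by consistent substitution.

    Every occurrence of a metanotion within one rule receives the same
    replacement, which is what enforces compatibility of type.
    """
    rules = [hyper_rule]
    for name, alternatives in metanotions.items():
        expanded = []
        for lhs, rhs in rules:
            if name not in lhs and not any(name in part for part in rhs):
                expanded.append((lhs, rhs))   # rule does not use this metanotion
                continue
            for alt in alternatives:
                expanded.append((lhs.replace(name, alt),
                                 [part.replace(name, alt) for part in rhs]))
        rules = expanded
    return rules


def derivable(parts, rules):
    """True if `parts` is the right-hand side of one of the expanded rules."""
    return any(rhs == parts for _, rhs in rules)


RULES = expand(HYPER_RULE, METANOTIONS)
```

With these assumptions, `RULES` holds one rule per type, so a boolean variable paired with an arithmetic expression is rejected by the syntax alone, without a separate static-semantics check.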
In the ALGOL68 grammar, each object carries its type along with it. The grammar recognises different identifiers from the rule

TAG : LETTER ; TAG LETTER ; TAG DIGIT.

Thus, in a rule which contains two occurrences of TAG,

A : ... TAG ... TAG ...

the two occurrences refer to the same identifier. The type of an object includes, amongst other information, the number of dimensions in an array, and the number and description of parameters of a procedure. The dimensionality of an array is found in

ROWSETY : ROWS ; EMPTY.
ROWS : row of ; ROWS row of.

Parameter descriptions are found in the rules

PROCEDURE : procedure PARAMETY MOID.
PARAMETY : with PARAMETERS ; EMPTY.
PARAMETERS : PARAMETER ; PARAMETERS and PARAMETER.
PARAMETER : MODE parameter.

Type conversion is also handled by this mechanism. Much of the information which is included in static semantics can therefore be handled directly by the syntax, and this is an extremely powerful definitional feature. The power of this definition method remains, however, a source of frustration for the implementor, who is unable to profit from it for two reasons. The first of these is that no satisfactory analyser has yet been built for two-level grammars in general, which means that the grammar cannot be used directly. The second reason stems from the fact that the most important immediate advantage of the method is the automatic treatment of types and type conversion. For this treatment to be operative, the uses of the identifiers of a program must have been associated with their relevant declarations in order to discover
their type. This in its turn implies a certain level of syntactic analysis, since the association process depends on the program structure. Thus the syntax must be applied in at least two independent phases (there are also further reasons why this is necessary). Computer science has not yet provided solutions to these problems, which may reside, if solutions there are, in a transformation of the two-level grammar into another form. Efforts towards this goal have been published, for example in [14]. In this sense ALGOL68 represents an open challenge and does not correspond to our aims of easy implementability. In practice it has proved to be the language for which the delay between definition and implementation has been the longest up till now. In defence of ALGOL68 it must be said that its aims were not ours.

For the dynamic semantics, and for that part of the static semantics not included in the grammar, ALGOL68 uses an interpreter written in English together with a limited number of extensions. The advantages and disadvantages of these techniques have already been discussed. The extensions are not defined in such a way as to be implemented as such, which limits their use to the definition.

In conclusion, ALGOL68, while being an impressive and important contribution to computer science, is as yet of limited application in software engineering, since the techniques used are more experimental than is desirable for a production engineer. The immediate advantages in software engineering come from some of the nicer touches in the language contents, for example the concept of collateralism. A more complete evaluation of ALGOL68 is to be found in [23].
4.2. VIENNA DEFINITIONS
The definition method developed at Vienna is used for the formal definition of PL/I [13] and other languages such as ALGOL60. A formal definition is composed of a concrete syntax, an abstract syntax, a translator and an interpreter. The concrete syntax is written in the form we have already seen [6], and concerns the physical representation of phrases of the language. The abstract syntax allows an abstract representation of the program based on predicates, with formal rules concerning relationships between different elements. The function 'translate' translates the abstract representation into an abstract program. The translator contains rules which perform the tests included above in static semantics, checking in particular for double declarations, and forming the declaration tables. As an example, taken from [15], consider the rule forbidding multiple declarations:

trans-decllist(p) =
    ¬(∃ i, j)[i ≠ j & s2∘si∘p(t) = s2∘sj∘p(t) ≠ Ω] →
        (translation of the declaration list p)
    T → error

The first line of the definition says that there do not exist two entries (i and j) in the declaration table for which the identifiers (discovered by selectors) are equal and non-empty. The second line defines the action to be taken when this condition is confirmed, that is to say the translation of the declaration list. If the condition is not confirmed, the error function is entered.
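The intent of this rule can be illustrated by a short Python sketch. It is not the Vienna notation itself: the function name `trans_decl_list`, the representation of a declaration as an (identifier, attributes) pair and the use of an exception for the error function are all assumptions made for the illustration.

```python
from itertools import combinations


def trans_decl_list(decls):
    """Translate a list of (identifier, attributes) pairs into a table.

    Raises ValueError when two distinct entries carry the same non-empty
    identifier, which corresponds to the 'T -> error' branch of the rule.
    """
    for (i, (name_i, _)), (j, (name_j, _)) in combinations(enumerate(decls), 2):
        if name_i and name_i == name_j:
            raise ValueError(
                f"multiple declaration of {name_i!r} (entries {i} and {j})")
    # Condition confirmed: build the declaration table.
    return {name: attrs for name, attrs in decls if name}
```

The test on pairs of entries mirrors the quantifier over i and j; a production compiler would normally detect the clash while inserting into a hash table instead of comparing all pairs.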
The translation of a concrete program, which is defined by the concrete syntax, is an abstract program, defined by the abstract syntax. The highest level rules of the PL/I abstract syntax have the following form:

is-program = ({<id : is-body> || is-id(id)})
is-body = (<s-block : is-block>, <s-param-list : is-id-list>)
is-block = (<s-decl-part : is-decl-part>, <s-st-list : is-st-list>)
is-decl-part = ({<id : is-decl> || is-id(id)})

is-program defines the form of an object which is a program. The right-hand side of the equation says that a program is a set of objects which are bodies (is-body) selected by identifiers. Each identifier is of type is-id (|| can be read as 'where'). An abstract program is thus a tree of the form

[Figure: tree rooted at is-program, in which identifiers select is-body nodes; each body branches to a block and, by s-param-list, to an is-id-list; the block branches by s-decl-part to is-decl-part, where identifiers select is-decl nodes, and by s-st-list to is-st-list.]
The interpreter is a formal set of transitions which are applied to give successive states of an abstract machine. A brief example, again taken from [15], is the evaluation of an expression:

int-expr(e) =
    is-bin(e) → int-bin-op(s-op(e), a, b) ;
        a : int-expr(s-rd1(e)),
        b : int-expr(s-rd2(e))
    is-unary(e) → int-un-op(s-op(e), a) ;
        a : int-expr(s-rd(e))
    is-var(e) → PASS : content(e, ξ)
    is-const(e) → PASS : value(e)

This evaluation of an integer expression reads: 'if e is a binary expression then the evaluation is the evaluation of the binary operator applied to a and b, where a and b are the two operands; otherwise, if e is a unary expression ...', etc.
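The case analysis of int-expr translates almost directly into a recursive evaluator. The following Python sketch is an assumption-laden illustration, not the Vienna interpreter: the classes `Const`, `Var`, `Unary` and `Binary`, the operator tables and the `store` dictionary standing in for the machine state are all hypothetical.

```python
import operator
from dataclasses import dataclass


@dataclass
class Const:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class Unary:
    op: str
    rd: object


@dataclass
class Binary:
    op: str
    rd1: object
    rd2: object


BIN_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
UN_OPS = {"-": operator.neg}


def int_expr(e, store):
    """Evaluate expression `e`; `store` maps variable names to values."""
    if isinstance(e, Binary):            # is-bin(e)
        a = int_expr(e.rd1, store)
        b = int_expr(e.rd2, store)
        return BIN_OPS[e.op](a, b)
    if isinstance(e, Unary):             # is-unary(e)
        return UN_OPS[e.op](int_expr(e.rd, store))
    if isinstance(e, Var):               # is-var(e): content of the variable
        return store[e.name]
    if isinstance(e, Const):             # is-const(e)
        return e.value
    raise TypeError(f"not an expression: {e!r}")
```

For instance, `int_expr(Binary("+", Var("x"), Unary("-", Const(3))), {"x": 10})` evaluates x + (-3) in a store where x is 10. The closeness between the formal rule and the code is the property the text argues a definition should have.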
This formal definition corresponds closely to the scheme of definition/implementation which is desirable in practice. The interpreter knows about stacks and data structures (although in a representation which may seem strange to an implementer), and contains all the necessary information for implementation. To our knowledge, no implementation based on the definition has been carried out, neither is one likely. Whether implementation is feasible remains doubtful. It is this project which shows up most clearly the gap between our formal knowledge and possibilities given by computer science research, and our possibilities of applying reasonable and clear techniques in software engineering practice.

4.3. EXTENSIBLE LANGUAGES
One approach to the problem of closing the gap between the formal definition, unusable (or unused) by the implementer, and the practical definition, which is very unsatisfying from the point of view of mathematical precision, may well turn out to be that of extensible languages. An extensible language consists of a relatively simple, but self-contained, base language, which itself contains extension mechanisms which allow the definition of new statement or data types. It is difficult to quote particular pieces of research without creating lists of projects, but two successive international symposia give some idea for those particularly interested [16], [17]. The example uses the notation of [16]. Consider the loop already used as an example. We may write its syntax as:

macro Statement0 → while Ex1 do Statement1
where type (Ex1) = boolean
means l1 : if Ex1 then begin Statement1 ; goto l1 end

This mechanism of syntactic macros allows conditions to be put on the parameters of the macro (where clause) and gives its expansion (means ...). Thus, given a base language, the new macros are automatically processed and the new compiler is produced automatically. This line of research, which follows compiler-compiler and macro systems, can be considered promising, but the results are not yet sufficiently efficient to be considered as standard engineering practice. We may consider the macro, where and means parts of a definition to correspond respectively to the syntax, static semantics and dynamic semantics of the extended language. The base language must be defined by some other means, probably traditional.
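The three parts of such a macro can be mimicked in a small Python sketch. Everything in it is an assumption made for illustration: the tuple encoding of statements, the names `expand_while`, `run` and `Goto`, and the tiny base language (assignment, labelled if, goto) are hypothetical and are not the notation of [16].

```python
import itertools


class Goto(Exception):
    """Signals a jump to a label in the (hypothetical) base language."""

    def __init__(self, label):
        super().__init__(label)
        self.label = label


_fresh = itertools.count(1)


def expand_while(cond, body):
    """Expand `while cond do body` into a base-language statement.

    `cond` is a (type, function) pair. The type test plays the role of
    the 'where' clause; the returned statement is the 'means' part.
    """
    cond_type, test = cond
    if cond_type != "boolean":
        raise TypeError("while: expression must be boolean")
    label = f"l{next(_fresh)}"
    return ("labelled_if", label, test, list(body) + [("goto", label)])


def run(statements, env):
    """Interpret the base language: assignment, labelled if, goto."""
    for stmt in statements:
        kind = stmt[0]
        if kind == "assign":
            _, name, expr = stmt
            env[name] = expr(env)
        elif kind == "goto":
            raise Goto(stmt[1])
        elif kind == "labelled_if":
            _, label, test, body = stmt
            again = True
            while again:              # host-language loop standing in for the jump
                again = False
                if test(env):
                    try:
                        run(body, env)
                    except Goto as jump:
                        if jump.label != label:
                            raise
                        again = True  # control returns to the label
    return env
```

A possible use: build `loop = expand_while(("boolean", lambda e: e["i"] < 3), [("assign", "s", lambda e: e["s"] + e["i"]), ("assign", "i", lambda e: e["i"] + 1)])` and call `run([loop], {"i": 0, "s": 0})`. Only the base interpreter knows how to execute anything; the extension contributes a type condition and a rewriting, which is the division of labour described above.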
5. CONCLUSION
We have tried to look at the impact of language definition on software engineering from three points of view. Firstly, in considering the user, we see that certain aspects of a language may encourage the programmer to write better programs. Secondly, we consider the needs of an implementor, who we consider should be more directly guided, and thirdly some attempts at formal definition. One conclusion which may be drawn from this overview is that there is an unexplored gap between the work of formal language definition and compiler implementation which is extremely unsatisfactory. We hope and suggest that computer science research should work towards bridging this gap. If the miracle happens, then the implementor will have less work to do, since the definition will imply the implementation method. In the last resort, implementors become redundant, since the transformation from definition to implementation becomes automatic. In the last section, a particular line of research which leads towards this goal was indicated. It must not be thought that no other research projects are of any interest. Indeed, one of the more recent tendencies has been away from the classical languages which we have considered, and towards new structures in different ways, for example APL [19] or ABSYS [20] and ABSET [21]. Our crystal ball is not tuned to foreseeing the language scene in 1985, nor is this part of software engineering.

One may also consider what is the domain of application of these reflections concerning languages and compilers. In our view they are not restricted only to general-purpose programming languages. It seems obvious that it is almost more important to apply sound techniques in the definition of special-purpose languages, since the limited application of these normally means that less effort is available for later clarification.
In addition to their application to other language situations, these techniques have much in common with any program, since it is always necessary to carry through the cycle of definition/implementation. Of course a program to be compiled represents for the moment the most complicated data structure usually treated by a computer (with the probable exception of natural language) and much more effort is put into this data definition. It is possible to find a parallel between syntax, static semantics and dynamic semantics on the one hand, and data structure, defensive testing of the data and the manipulations to be performed on the other.

In our efforts to eliminate the implementor, the definition and implementation of a language tend towards different forms of the same object. This object becomes less and less understandable to the user, who needs different documents to serve as description. We do not at the moment see any way of making the definition serve satisfactorily as this description.

It is perhaps to be regretted that this course presents more problems than solutions. However, we must be aware of the distance which separates our present techniques from those future ones which will, optimistically, be mathematically provable, efficient, and usable by programmers with little formal training outside their own fields of application.
6. ACKNOWLEDGEMENTS
The author wishes to thank MM. BOUSSARD, JORRAND and LOUIS for their helpful remarks during discussions.
7. REFERENCES

[1] - N. WIRTH, A Programming Language for the 360 Computers, JACM, Jan. 1968.
[2] - G. GOOS, K. LAGALLY, G. SAPPER, PS440, Eine niedere Programmiersprache, Technischen Universität, Munich, 1970.
[3] - N. WIRTH, The Programming Language PASCAL, Acta Informatica 1, 1971; and The Design of a PASCAL Compiler, Software Practice and Experience 1, 4, 1971.
[4] - R.W. FLOYD, Syntactic Analysis and Operator Precedence, JACM, July 1963.
[5] - D.E. KNUTH, Top-Down Syntax Analysis, Proceedings of a Summer School, Copenhagen, 1967.
[6] - K. ALBER, P. OLIVA, G. URSCHLER, Concrete Syntax of PL/I, IBM Vienna Laboratory, TR 25.084, 1968.
[7] - J.W. BACKUS et al., Report on the Algorithmic Language ALGOL60, CACM, Dec. 1960.
[8] - J. BORDIER, Méthodes pour la mise au point de grammaires LL(1), Thèse de Troisième Cycle, Grenoble, 1971.
[9] - J.M. FOSTER, A Syntax Improving Device, Computer Journal, May 1968.
[10] - T.B. STEEL, UNCOL: The Myth and the Fact, Ann. Rev. in Aut. Prog., 2, 1961.
[11] - P.C. POOLE, W.M. WAITE, This school.
[12] - A. VAN WIJNGAARDEN et al., Report on the Algorithmic Language ALGOL68, Mathematisch Centrum, Amsterdam, MR101, Oct. 1969.
[13] - P. LUCAS, K. WALK, On the Formal Definition of PL/I, Ann. Rev. of Aut. Prog. 6, Pergamon Press, 1971.
[14] - C.H.A. KOSTER, Affix Grammars, in J.E.L. Peck (Editor), ALGOL68 Implementation, 1971.
[15] - P. LUCAS, P. LAUER, H. STIGLEITNER, Method and Notation for the Formal Definition of Programming Languages, IBM Laboratory Vienna, TR 25.087, 1968.
[16] - Proceedings of the Extensible Languages Symposium, SIGPLAN Notices, Aug. 1969.
[17] - Proceedings of an Extensible Languages Symposium, SIGPLAN Notices, Dec. 1971.
[18] - S. SCHUMANN, Spécification des Langages de Programmation et de leurs Traducteurs au moyen de Macros Syntaxiques, Proc. AFCET, 1970.
[19] - K.E. IVERSON, A Programming Language, Wiley, New York, 1962.
[20] - J.M. FOSTER, E.W. ELCOCK, ABSYS1: An Incremental Compiler for Assertions, Machine Intelligence 4, Edinburgh University Press, 1969.
[21] - E.W. ELCOCK, J.M. FOSTER, P.M.D. GRAY, J.M. McGREGOR, A.M. MURRAY, ABSET, a Programming Language Based on Sets, Machine Intelligence 6, Edinburgh University Press, 1971.
[22] - M. GRIFFITHS, Low-Level Languages, Notes from this school.
[23] - J.C. BOUSSARD, J.J. DUBY (editors), Rapport d'évaluation ALGOL68, RIRO, Feb. 1971.
CHAPTER 2.E. CONCURRENCY IN SOFTWARE SYSTEMS +)
Jack B. Dennis
Massachusetts Institute of Technology
Cambridge, Massachusetts, USA
1. INTRODUCTION

A large program such as an operating system, a compiler, or a real-time control program is a precise representation of a system composed of many interacting parts or modules. Due to the size of these programs, it is essential that the parts be represented in such a way that the descriptions of the parts are independent of the pattern in which they are interconnected to form the whole system, and so the behavior of each part is unambiguous and correctly understood regardless of the situation in which it is used. For this to be possible, all interactions between system parts must be through explicit points of communication established by the designer of each part.

If two parts of a system are independently designed, then the timing of events within one part can only be constrained with respect to events in the other part as a result of interaction between the two parts. So long as no interaction takes place, events in two parts of a system may proceed concurrently and with no definite time relationship among them. Imposing a time relation on independent actions of separate parts of a system is a common source of overspecification. The result is a system that is more difficult to comprehend, troublesome to alter, and incorporates unnecessary delays that may reduce performance. This reasoning shows that the notions of concurrency and asynchronous operation are fundamental aspects of software systems.

In this lecture we consider a model for systems viewed as collections of concurrently operating subsystems that interact with one another through specific disciplines of communication. In many cases, we desire that such a system have a behavior that is reproducible in separate runs when presented with the same input data. This property of systems is known as determinacy. We shall present and illustrate an important result that if interactions between subsystems obey certain natural conditions, then determinacy of the subsystems guarantees determinacy of the whole system. We conclude by illustrating the application of this result to systems of concurrent processes that interact by means of semaphores using the primitives P and V of Dijkstra.

+) The preparation of these notes was supported in part by the National Science Foundation under grant GJ-432 and in part by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Nonr-N00014-70-A-0362-0001.

2. PETRI NETS

During the discussion we will illustrate concepts by reference to particular examples of systems. Since we have found it convenient to use the formalism of Petri nets to represent these examples of systems, we begin with a brief introduction to the notation and semantics of Petri nets.

A Petri net [1, 2, 3] is a directed graph with two types of nodes called places and transitions. Each arc must go from a place to a transition or from a transition to a place. In drawing a Petri net, places are represented by circles and transitions by bars as in the example shown in Figure 1. Places from which arcs are incident on a transition are called input places of the transition, and places on which arcs from a transition terminate are called its output places. Each place may hold zero, one, or more markers or tokens. An arrangement of markers in a net is called a marking of the net.

[Figure 1. A Petri net.]

A Petri net may assume any series of markings consistent with the following simulation rule:

1. For a Petri net and a marking, each transition which has at least one token in each of its input places is enabled.
2. Any enabled transition may be chosen to fire.
3. Firing a transition consists of removing one token from each of its input places and adding one token to each of its output places.

Figure 2 shows a sequence of markings for a Petri net resulting from the firing of transitions in the sequence a, c, e, f. Note that this firing sequence returns the net to its original marking.

A marking of a Petri net is said to be safe if no simulation of the net, starting from the given marking, yields a marking in which some place holds more than one token. In a net having a safe marking no transition is ever enabled when tokens are present in any of its output places. A marking of a Petri net is live if, for any marking reachable from the given marking, there is a firing sequence that will enable any transition of the net. Liveness of a marked Petri net requires that no part of the net ever reach a condition from which further activity of the part is impossible. The Petri net in Figs. 1 and 2 is both live and safe for all of the markings shown.

If simulation of a Petri net reaches a marking for which two transitions are enabled that share an input place, the two transitions are said to be in conflict over the token in the shared place. In Fig. 2a transitions a and b are in conflict at place 1. In this case, the conflict is resolved by making an arbitrary decision as to which transition is to receive the token in place 1; hence this sort of conflict is called a free choice.

[Figure 2. Simulation of a Petri net.]
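The simulation rule, the notion of conflict and the test for a safe marking can be sketched in Python as follows. The class `PetriNet`, its method names and the small net `NET` are hypothetical: the net is an invented state machine, not the net of Figures 1 and 2, whose structure cannot be read from the text. The safeness test explores reachable markings exhaustively and is bounded by `limit`, so it is only a sketch for small nets.

```python
from collections import Counter


class PetriNet:
    def __init__(self, transitions):
        # transitions: name -> (list of input places, list of output places)
        self.transitions = transitions

    def enabled(self, marking):
        """Rule 1: transitions with at least one token in every input place."""
        return [t for t, (ins, _) in self.transitions.items()
                if all(marking[p] >= 1 for p in ins)]

    def fire(self, marking, t):
        """Rule 3: remove one token per input place, add one per output place."""
        if t not in self.enabled(marking):
            raise ValueError(f"transition {t!r} is not enabled")
        ins, outs = self.transitions[t]
        new = Counter(marking)
        for p in ins:
            new[p] -= 1
        for p in outs:
            new[p] += 1
        return +new                      # unary plus drops empty places

    def conflicts(self, marking):
        """Pairs of enabled transitions sharing an input place."""
        en = self.enabled(marking)
        found = []
        for i, a in enumerate(en):
            for b in en[i + 1:]:
                shared = set(self.transitions[a][0]) & set(self.transitions[b][0])
                found.extend((a, b, p) for p in sorted(shared))
        return found

    def is_safe(self, marking, limit=10_000):
        """True if no reachable marking puts two tokens in one place."""
        seen, frontier = set(), [+Counter(marking)]
        while frontier:
            m = frontier.pop()
            key = frozenset(m.items())
            if key in seen:
                continue
            if len(seen) >= limit:
                raise RuntimeError("too many markings to explore")
            seen.add(key)
            if any(n > 1 for n in m.values()):
                return False
            frontier.extend(self.fire(m, t) for t in self.enabled(m))
        return True


# Hypothetical net: a and b compete for the token in p1.
NET = PetriNet({
    "a": (["p1"], ["p2"]),
    "b": (["p1"], ["p3"]),
    "c": (["p2"], ["p4"]),
    "d": (["p3"], ["p4"]),
    "e": (["p4"], ["p1"]),
})
M0 = Counter({"p1": 1})
```

Rule 2 (any enabled transition may be chosen) is left to the caller, who picks an element of `enabled(marking)`; choosing between a and b in `M0` is the free choice described above.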
3. SYSTEMS

In Figure 3 we show a system S with m inlets and n outlets. The inlets are points at which the system receives signals from other systems or from the environment E in which the systems operate. The outlets are points at which the system emits signals for reception by other systems or the environment. An alphabet of possible signals is associated with each inlet or outlet of the system.

[Figure 3. A system.]

Suppose system S begins operation from some internal configuration $C_0$ and makes transitions to successive configurations $C_1, C_2, \ldots, C_k, \ldots$. In some transitions symbols are absorbed at certain inlets; in other transitions symbols are delivered at outlets. Suppose the system has reached configuration $C_i$. During the activity from $C_0$ to $C_i$ some definite sequence of signals was absorbed by S at each inlet, and some definite sequence of signals was delivered at each outlet, as shown in Figure 4. The array of input sequences U is called an input of the system; the array of output sequences V is a corresponding output of the system. In this way, the behavior of a system is given by a binary relation $R_S$ containing each pair (U, V) such that S has some finite activity, starting from configuration $C_0$, during which it absorbs the input array U and emits the output array V.

[Figure 4. An input array U at the inlets and an output array V at the outlets.]

The domain of $R_S$ consists of all inputs that can be absorbed by S during some activity starting from $C_0$. The domain does not include all possible arrays of finite sequences because S may cease absorbing signals at some inlets either temporarily or permanently. The range of the relation $R_S$ contains each output S could emit for some input.

Let us consider some simple examples of systems to become familiar with the kinds of behavior that may occur. The system of Figure 5, shown in its initial configuration, transmits the pair of signals x, y for each signal received at its inlet.

[Figure 5.]

Corresponding to the input

[input array not recoverable from the source]

the output may be any one of the three possibilities:

[output arrays not recoverable from the source]

In this case, a bounded input can yield only bounded outputs. Figure 6 shows a system that can have unbounded outputs for certain bounded inputs.

[Figure 6.]

For the input

[input array not recoverable from the source]

the output may be any member of the infinite set of arrays

[output arrays not recoverable from the source]

The example in Figure 7 shows how the order in which input signals are absorbed at different inlets may vary for different runs of a system without affecting the sequences that are emitted at each outlet. Two input-output pairs are shown below:

[Figure 7 and its two input-output pairs; the arrays are not recoverable from the source.]
4. DETERMINACY

Next we introduce the ultimate output Y of a system S for some presented input X. We imagine that the symbol sequences of the array X are made available for absorption at the inlets of S. Some of these sequences may be infinite. Then an associated ultimate output of S is an output array Y emitted by advancing the activity of S as far as possible without absorbing input symbols beyond those in X. (In this activity, it may be that S does not absorb all of X.) To state this precisely, let us say that two (possibly infinite) sequences

$x_1, x_2, \ldots, x_k, \ldots$
$y_1, y_2, \ldots, y_k, \ldots$

are similar if $x_i = y_i$ for each index $i$ such that both $x_i$ and $y_i$ exist. Two arrays are similar if corresponding rows are similar. Then Y is an ultimate output of S for X if and only if

$$\bigl[\,(U, V) \in R_S,\quad U \text{ a prefix of } X,\quad V \text{ similar to } Y\,\bigr] \ \text{ implies }\ V \text{ is a prefix of } Y.$$

Some examples of ultimate outputs for the systems shown in Figures 5, 6 and 7 are given below:

[Table: figure, presented input X, ultimate output Y; the array entries are not recoverable from the source.]

In these systems, there is a unique ultimate output for each presented input. We call any system having this property a determinate system. Some examples of systems that are not determinate are shown in Figures 8, 9, and 10.

[Figures 8, 9 and 10.]

[Table: figure, presented input, ultimate outputs for these systems; the array entries are not recoverable from the source.]
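For finite sequences the definitions of prefix, similarity and ultimate output can be checked mechanically. The Python sketch below is an illustration under assumptions: arrays are tuples of strings (one string per row), the relation is given as an explicit finite set of pairs, and the two relations `R_PAIR` and `R_CHOICE` are invented examples, not transcriptions of the lost figures.

```python
def is_prefix(u, x):
    """u is a prefix of x (finite sequences)."""
    return len(u) <= len(x) and all(a == b for a, b in zip(u, x))


def similar(v, y):
    """v and y agree at every index where both have an element."""
    return all(a == b for a, b in zip(v, y))


def rows(pred, A, B):
    """Apply a predicate row by row to two arrays of equal height."""
    return len(A) == len(B) and all(pred(a, b) for a, b in zip(A, B))


def is_ultimate_output(R, X, Y):
    """Check the condition of the text over a finite relation R."""
    return all(rows(is_prefix, V, Y)
               for U, V in R
               if rows(is_prefix, U, X) and rows(similar, V, Y))


# Hypothetical one-inlet, one-outlet system emitting x, y per input signal.
R_PAIR = {(("",), ("",)), (("a",), ("",)), (("a",), ("x",)), (("a",), ("xy",))}

# Hypothetical system making a free choice between emitting x and emitting y.
R_CHOICE = {(("",), ("",)), (("a",), ("",)), (("a",), ("x",)), (("a",), ("y",))}
```

Under these assumptions `("xy",)` satisfies the condition for `R_PAIR` with presented input `("a",)` while `("x",)` does not, whereas for `R_CHOICE` both `("x",)` and `("y",)` satisfy it, so the second system has two ultimate outputs for one input and is not determinate.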
5. INTERCONNECTED SYSTEMS

In the examples of systems used above, arrival of a signal at an inlet occurs when a marker is put in one of the places of the inlet. We assume that no further input signals arrive until the system absorbs the signal by removing the marker from the place. An output signal is emitted when the system puts a marker in one of the places of an outlet. We assume the marker is immediately removed by the environment in which the system operates.

Suppose a finite collection of systems $\{S_i\}$ are assembled to form a larger system S by specifying associations of certain outlets and inlets, as illustrated by Figure 11.

[Figure 11. Systems S1, S2 and S3 assembled within an environment E.]

We may define the input-output relation $R_S$ for the composite system by employing the following convention regarding the inputs and outputs of the constituent systems: Suppose operation of S has reached a point where S has absorbed input U and emitted output V, and each subsystem $S_i$ has absorbed input $U_i$ and emitted output $V_i$. Then, if outlet p of $S_i$ is associated with inlet q of $S_j$, the qth row of $U_j$ must be a prefix of the pth row of $V_i$. If the pth inlet of $S_i$ is specified to be the qth inlet of S, then row q of X and row p of $X_i$ must be identical. If the pth outlet of $S_i$ is specified to be the qth outlet of S, then row p of $Y_i$ and row q of Y must be identical.

Using these conventions for defining the behavior of assembled systems, Patil [4] has established this important result:

Theorem: A system S formed by the assembly of systems $\{S_i\}$ is determinate if each system $S_i$ is determinate. That is, the class of determinate systems is closed under the operation of assembly.

If, in an assembly of systems, outlet p of $S_i$ is associated with inlet q of $S_j$, then more signals may have been emitted by outlet p than have been absorbed by inlet q. Thus, to apply the above result, we must connect outlet p to inlet q in such a way that signals emitted by p are fed to q in exactly the same order, and no signals are lost. Two ways of accomplishing this are:

1. Insert a FIFO queue of unbounded capacity between outlet p and inlet q to hold signals emitted by p but not yet absorbed by q.
2. Prevent $S_i$ from emitting a signal at outlet p until the previous signal emitted has been absorbed by $S_j$ at inlet q.

Suppose outlets are connected to inlets by means of unbounded queues. Then an event that emits a signal at an outlet enters the signal in the associated queue; an event that absorbs a signal at an inlet removes a signal from the queue, and can only occur if the queue is not empty. Under this communication discipline, the Theorem shows that interconnections of determinate systems are necessarily determinate.

To prevent a system from emitting signals before a previous signal has been absorbed, it is sufficient that an assembly of systems satisfy the following condition:

μ-condition: For each association of an outlet p of some $S_i$ with an inlet q of some $S_j$, the assembly S must contain a path from inlet q to outlet p by way of systems in $\{S_i\}$ and the environment of S such that each signal emitted at outlet p requires the prior absorption of a signal at inlet q.

Figure 12 is an example of an assembly of systems that satisfies the μ-condition. If it can be verified that an assembly S of systems satisfies the μ-condition, then the Theorem guarantees that S is determinate.

[Figure 12.]

There is an important scheme for interconnecting systems that guarantees that the μ-condition holds for the resulting system. The only kind of connection permitted between systems is a link that connects an output port of one system to an input port of another as shown in Figure 13. Each port consists of an inlet and an outlet.

[Figure 13. A link from an output port of system 1 to an input port of system 2.]

Systems are required to obey the discipline of emitting a signal at the outlet of a port only after receiving a signal at the associated inlet. In the initial configuration of a system, each output port is considered to have just received a (null) signal, and is prepared to emit a signal at the outlet of the port. Each input port is prepared to absorb a signal at the inlet, and will not emit a signal at the outlet until a signal arrives at the inlet. We call systems that communicate according to this discipline G-systems. Since any G-system satisfies the μ-condition automatically, and any interconnection of G-systems is also a G-system, the Theorem shows that the class of determinate G-systems is closed under interconnection. From Figure 14 we see that, since a FIFO queue is a determinate G-system, it is also true that determinate G-systems interconnected by queues yield determinate G-systems.

[Figure 14. A FIFO queue connecting an outlet of $S_i$ to an inlet of $S_j$.]
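The first connection discipline can be illustrated by a small Python experiment. It is a sketch under assumptions, not a proof of the Theorem: the two subsystems (one doubling each signal, one adding 1), the function name `run` and the random scheduler are all hypothetical. The point is only that, with an unbounded FIFO queue on the link, the emitted sequence does not depend on the order in which the subsystems are scheduled.

```python
import random
from collections import deque


def run(inputs, seed):
    """Run two determinate subsystems joined by an unbounded FIFO queue.

    S1 absorbs a signal from the source and emits its double on the link;
    S2 absorbs a signal from the link and emits it plus one. The scheduler
    picks arbitrarily (by `seed`) among the subsystems able to act.
    """
    rng = random.Random(seed)
    source = deque(inputs)   # signals presented at the inlet of S1
    link = deque()           # FIFO queue between outlet of S1 and inlet of S2
    output = []              # signals emitted at the outlet of S2

    def step_s1():
        link.append(source.popleft() * 2)

    def step_s2():
        output.append(link.popleft() + 1)

    while source or link:
        ready = []
        if source:
            ready.append(step_s1)
        if link:             # absorption is possible only if the queue is not empty
            ready.append(step_s2)
        rng.choice(ready)()
    return output
```

Calling `run([1, 2, 3], seed)` for different seeds exercises different interleavings of the two subsystems; the design choice worth noting is that each subsystem touches only its own inlet and outlet queues, so no interleaving can reorder or lose signals.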
6.
INTERPROCESS COMMUNICATION A sequential process may be represented by a Petri net.
ample is shown in Figure 15.
Since there is one site of control,
only one marker is ever present in the Petri net. called state machines.
An ex-
Such Petri nets are
The location of the marker corresponds to the
notion of "program counter" in a conventional computer.
(a)
block diagram
(b)
Petri net
I 1 !
3
Figure 15
4F ~ _ ~
126
The synchronizing primitives of Dijkstra [5], as used to control the interaction of pairs of processes, may be represented as in Figure 16. The number of markers in place s represents the value of the semaphore.
[Figure 16: Petri net representation of the semaphore operations P[s] and V[s].]

Suppose n sequential processes interact only in the two ways defined in Figure 17. Our development shows that such a system of processes is determinate.

[Figure 17: the two permitted forms of interaction: (a) FIFO queue, (b) G-link.]
7. REFERENCES

1. C. A. Petri, Communication With Automata. Supplement 1 to Technical Report RADC-TR-65-377, Vol. 1, Griffiss Air Force Base, New York 1966. [Originally published in German: Kommunikation mit Automaten, University of Bonn, 1962.]
2. A. W. Holt and F. Commoner, Events and conditions. Record of the Project MAC Conference on Concurrent Systems and Parallel Computation, ACM, New York 1970, pp 3-52.
3. A. W. Holt, F. Commoner, S. Even, and A. Pnueli, Marked directed graphs. J. of Computer and System Sciences, Vol. 5 (1971), pp 511-523.
4. S. S. Patil, Closure properties of interconnections of determinate systems. Record of the Project MAC Conference on Concurrent Systems and Parallel Computation, ACM, New York 1970, pp 107-116.
5. E. W. Dijkstra, Co-operating sequential processes. Programming Languages, F. Genuys, Ed., Academic Press, New York 1968. [First published as Report EWD 123, Department of Mathematics, Technological University, Eindhoven, The Netherlands, 1965.]
CHAPTER 3.A.
MODULARITY
Jack B. Dennis
Project MAC, Massachusetts Institute of Technology
Cambridge, Massachusetts, USA

1. INTRODUCTORY CONCEPTS
The word "modular" means "constructed with standardized units or dimensions for flexibility and variety in use." Applied to software engineering, modularity refers to the building of software systems by putting together parts called program modules. The dictionary meaning applies very well in, for example, the construction materials trade: In the United States floor tile comes in nine-inch squares (the modules) which may be conveniently adjoined to fill up any shape of floor area with just a bit of trimming at the boundary. A great variety of patterns may be produced by using modules of differing color and texture. In modular software, clearly the "standardized units or dimensions" should be standards such that software modules meeting the standards may be conveniently fitted together (without "trimming") to realize large software systems. The reference to "variety in use" should mean that the range of module types available should be sufficient for the construction of a usefully large class of programs.

In July 1968 a two-day symposium was held in Boston on the subject of Modular Programming [1]. The preprints of papers for this meeting probably form the only collection of material representing a significant range of viewpoints on the nature and purpose of modular programming. In this collection of papers various concepts of program modularity are described, ranging from vaguely defined principles to definitive formal concepts. Yet there is an important objective common to all. It stems from recognition of the high cost of producing correctly functioning software systems; it is to realize the benefits promised by the saying: "divide et impera".

+The preparation of these notes was supported in part by the National Science Foundation under grant GJ-432 and in part by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Nonr-N00014-70-A-0362-0001.
To many people in software practice, modular programming means the division of the whole of a program into parts so "the interactions between parts are minimized" or so "the parts have functional independence." Frequently, the assumption is made that in modular programming the program and its parts are designed at the same time and under the same authority. There is little appreciation that the objective of simplifying program construction by dividing the task into parts has definite implications regarding the structure of programs and the characteristics of computer systems.

Nevertheless, several thoughtful and precise notions were also expressed at the symposium. The designers of the Integrated Civil Engineering System (ICES) [2] emphasized the importance of being able to use together independently written program modules. Boebert [3] also recognized that the success of modular programming depends on characteristics of the linguistic level at which the modules are expressed. He points out that modularity should be regarded as a property of a computer system or linguistic level rather than a property possessed or not possessed by some program. E. W. Dijkstra's concern [4] with principles of "structured programming" is closely related.

Our goal in these lectures is to develop further understanding of these notions of modular programming, and to derive their implications for the design of programming languages and computer systems.
1.1. DEFINITION OF MODULARITY

We take the following statements to be the objectives of modular programming:

1. One must be able to convince himself of the correctness of a program module, independently of the context of its use in building larger units of software.

2. One must be able to conveniently put together program modules written under different authorities without knowledge of their inner workings.

These statements embody the concept of "context-independence" discussed by Boebert [3], and the concept of non-interference stated by Dijkstra [4].

We consider modularity to be a property of computer systems: A computer system has modularity if the linguistic level defined by the computer system meets these conditions: Associated with the linguistic level is a class of objects that are the units of program representation. These objects are program modules. The linguistic level must provide a means of combining program modules into larger program modules without requiring changes to any of the component modules. Further, the meaning of a program module must be independent of the context in which it is used.

In previous publications [5, 6] I have applied the term "programming generality" to computer systems that have this property of modularity.

Two relatively precise concepts regarding the form of a program module occur in the literature on modular programming. On one hand, a module is viewed as a procedure: At any point during the progress of a computation, one module (procedure) may initiate an activation of another procedure by specifying a set of input data. The new procedure activation is carried on, possibly making use of additional procedures, until it terminates, leaving a set of output data for use by the procedure from which it was activated. In this concept, a modular program is a collection of non-interfering procedures. Characteristic of programs constructed as combinations of procedures is the flow of control in a pattern described by a tree. The notion of procedure is a central feature of most modern programming languages, ALGOL 60 being the classical model [7, 8]. But, as we shall see, the procedure in its usual form does not meet our requirements for modular programming.
On the other hand, a module may be conceived as an entity that is joined to other modules by communication links. Each module receives data over its input links, transforms it in some way, and sends it on to other modules over its output links. In this picture, each module is continuously active, processing data so long as inputs are available. Concurrency of operation is an inherent part of this notion of modularity. The links connecting one module to another are thought of as channels through which data flow. First-in-first-out queues may be introduced in the links as a means of improving the efficiency of an implementation without altering the semantics of a modular program.
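The claim that queues in the links leave the semantics unchanged can be illustrated with a small Python sketch. It rests on simplifying assumptions: the function `run_pipeline` and the two stage functions are hypothetical, the modules form a simple chain, and the modules are run one after another rather than concurrently. The point shown is only that the stream of outputs is the same whether data pass directly from module to module or wait in a FIFO queue on each link.

```python
from collections import deque


def run_pipeline(inputs, stages, buffered):
    """Push inputs through stages joined by links; return the outputs.

    With buffered=False each item traverses every stage before the next
    item enters. With buffered=True every link is a FIFO queue, and a stage
    runs only after its whole input has been queued on its link.
    """
    if not buffered:
        out = []
        for item in inputs:
            for stage in stages:
                item = stage(item)
            out.append(item)
        return out
    link = deque(inputs)
    for stage in stages:
        next_link = deque()
        while link:
            next_link.append(stage(link.popleft()))  # first in, first out
        link = next_link
    return list(link)


double = lambda x: 2 * x
increment = lambda x: x + 1
data = [1, 2, 3]
direct = run_pipeline(data, [double, increment], buffered=False)
queued = run_pipeline(data, [double, increment], buffered=True)
print(direct, queued)   # prints: [3, 5, 7] [3, 5, 7]
```

Because each module sees the same sequence of inputs in the same order under either arrangement, buffering affects only when work is done, not what is computed.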
This form of modular programming is advocated [3, 9] for data processing applications where the links are implemented as "buffer files." The concept is closely related to Conway's coroutines [10] and Dijkstra's cooperating sequential processes [11]. The only programming languages having features suitable for this form of modular programming are certain simulation languages, in particular Simula 67 [12].

In these lectures, we study the limitations on modular programming found in the linguistic levels defined by certain computer systems. We consider the well-known programming languages, FORTRAN and ALGOL 60, to understand the issue of clashes of identifiers. We then consider the problems of handling dynamic data structures in modular programs and the problems of combining program modules expressed in different representations. Multics [13] is studied as a system in which sharing of procedures and data is possible with considerable generality. Finally, we consider the definition of a hypothetical linguistic level within which a very general form of modular programming is possible.

1.2. MODULARITY IN FORTRAN
Let us start by considering the forms of modular programming possible at the linguistic level defined by the ANSI FORTRAN language standard. We will not consider here the features of FORTRAN for input, output and transfer of data between storage levels, and we assume that subprograms in other languages are not permitted.

A FORTRAN program consists of a sequence of statements that make up a main program and a collection of separate sets of statements that represent function subprograms and subroutine subprograms. Since there is no provision in the FORTRAN standard for combining separately written FORTRAN programs, a complete FORTRAN program consisting of main program and subprograms cannot serve as a program module at the linguistic level defined by the standard.

The obvious choice as a unit for modular programming is the FORTRAN subprogram. We encounter one difficulty immediately: The only method of combining several subprograms is to collect them together with a main program, yielding an executable FORTRAN program. Alas, this is not a program module, and therefore cannot be further combined with other units to form larger modules. Thus FORTRAN fails by not permitting hierarchical structure in a modular program. Nevertheless, let us disregard this defect and look for other problems.

It will be useful to have in mind a picture of the computation states occurring during execution of a FORTRAN program. The structure of a state is shown in Fig. 1 as an object of the variety used by the IBM Vienna Group in their work on formal definition of programming languages. This object represents an execution state, and therefore the operation of putting several modules together to form a program has been performed. The 'text'-component of the state is an object having as its components the compiled form of each source language subprogram, including one subprogram identified as 'main', and the remaining subprograms identified by names chosen by their programmers. The 'private'-component of the state has, as its leaf nodes, data entities and other values that are accessed only during execution of the corresponding subprogram text (except, of course, when these values are passed as arguments to other subprograms). These values are values of FORTRAN variables and arrays not mentioned in COMMON statements of the source language subprogram, and additional variables generated by the compiler. The 'common'-component of the state contains several vectors of data items that are accessed during execution of statements in several subprograms. The computation state of a FORTRAN program has a fixed structure during execution of the program; only values at the leaf nodes are changed (two exceptions: adjustable arrays and extension of COMMON).

Limitations on the generality of modular programming in a linguistic level arise from points of interaction between program modules. For FORTRAN subprograms these points of interaction are: calling a function or subroutine; the naming of subprograms; and the use and naming of COMMON.

If two authors have chosen the same name for their independently written subprograms, a clash of names occurs when these subprograms are used together. Similarly, two authors may choose to use blank COMMON for different purposes, or may use the same names for labelled COMMON storage. These are violations of our definition of modularity since alteration of the representation of a module may be required before it can be correctly combined with other modules.
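The clashes just described can be made concrete with a small Python sketch of the check that would have to precede the combination of independently written subprograms. Everything in it is hypothetical: the function `find_clashes`, the author labels, and the subprogram and COMMON names are invented for illustration, and the sketch assumes that any name claimed by two different authors is an unintended clash (blank COMMON is written as the empty string).

```python
def find_clashes(subprograms):
    """Return the names claimed by more than one independently written unit.

    `subprograms` is a list of (author, subprogram_name, common_labels);
    the blank COMMON block is written as the empty string "".
    """
    claimed = {}
    for author, name, commons in subprograms:
        idents = [("subprogram", name)] + [("common", c) for c in commons]
        for key in idents:
            claimed.setdefault(key, set()).add(author)
    return sorted(key for key, authors in claimed.items() if len(authors) > 1)


units = [
    ("author A", "SOLVE", ["", "TABLE"]),
    ("author B", "SOLVE", [""]),
    ("author C", "PLOT", ["GRID"]),
]
clashes = find_clashes(units)
print(clashes)
# prints: [('common', ''), ('subprogram', 'SOLVE')]
```

Each reported clash means that the text of at least one subprogram must be altered before the units can be used together, which is exactly the violation of modularity noted above.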
These name clashes may be removed by changing the names of subprograms and choosing new labels for COMMON storage areas. Matters would be more difficult if a program module were to consist of several subprograms, possibly independently written, working together. The problems introduced by attempting to remove clashes through substitution are discussed below.

[Figure 1. State of a Fortran program: a tree whose 'text', 'private' and 'common' components hold, respectively, the statements of 'main' and of each subprogram name-i; their data items, temporaries and constants; and the data items of 'blank' and of each labelled block label-j.]
1.3. MODULARITY IN ALGOL 60

In ALGOL 60 the procedure is clearly the candidate for consideration as the form for program modules. Since procedures may be combined without modification to form larger procedures, a modular program in ALGOL 60 may be a hierarchy of modules having an arbitrary depth of nesting. The modules are represented as ALGOL 60 source text. Compiled ALGOL programs are not program modules of the ALGOL-defined linguistic level and cannot be combined.

The instances of the identifier y in the ALGOL procedure

    real procedure f(x); real x;
    begin
      f := x + y;
      y := y + 1;
    end

are nonlocal references and therefore y must be a local identifier in some enclosing procedure if the complete ALGOL program is to be meaningful. A person using procedure f as a module must know about all such external references occurring in f (including those arising within procedures enclosed by procedure f) since external references are a form of interaction of a procedure with external objects.

One may wish to use two ALGOL procedures, f and g, in the construction of a modular program where each procedure makes use of the identifier y to reference some external object. If both procedures are placed in the program as declarations within the same enclosing procedure, there is a clash of names. Thus the use of nonlocal references in an ALGOL 60 program module is a violation of our concept of modularity.

Several means are available to remove or avoid clashes of names between procedures in ALGOL 60 programs:

1. Substitute an alternate identifier for each appearance of y as an external reference in one of the procedures. For reasons to be discussed shortly, the use of substitution has significant disadvantages.

2. Enclose one of the procedures within an "interface procedure" that renames the external object by assignment:

    real procedure f1(x); real x;
    begin real y;
      real procedure f(x); real x;
      begin
        f := x + y;
        y := y + 1;
      end;
      y := y1;
      f1 := f(x);
      y1 := y
    end

This would be awkward to do for arrays, and impossible in ALGOL 60 if the external object is a procedure. Moreover the choice of identifier y1 depends on the text of the procedure that encloses f1.

3. Enclose one of the procedures in a procedure declaration in which y is a local identifier and formal parameter:

    real procedure f1(x, y); real x, y;
    begin
      real procedure f(x); real x;
      begin f := x + y; y := y + 1; end;
      f1 := f(x)
    end
This has the effect of substitution for y, but takes effect at procedure entry.

4. Organize the modular program that uses procedures f and g so that the scopes of y do not overlap, by placing the declarations of f and g within distinct procedures or blocks of the program.

The need for any of these schemes would be avoided if y were included as one of the formal parameters of procedures f and g.

The mechanism of non-local reference in ALGOL 60 was inspired by the evaluation rules of the lambda calculus, and reduces the number of required formal parameters in procedure application. At the interface between independently written program modules, the need to discover and resolve name conflicts makes external references from program modules an unattractive form of interaction. For this reason, we shall adopt as a principle of modular programming, that the only means of communicating data to and from a procedure module is by its formal parameters (and resulting value, if any). Note that this principle rules out "side effects" of the kind observable in ALGOL 60: Operation of a module can only affect information explicitly passed to it.

1.4. SUBSTITUTION

The names (identifiers) that occur in a representation of a program module can be divided into two groups: bound and free. By definition, if a name has a free occurrence in the module, it refers to some object bound to the name outside the module. Hence substitution of an alternate name for all instances of the name within the module without rebinding names outside will change the effect of the module. All names that identify primitive operations, constants, etc. of the linguistic level at which the module is expressed are free and have permanently fixed meaning.
Names that are bound in a program module may be uniformly replaced throughout the module without altering its meaning.

If name conflicts occur when two program modules are combined, it is because the same identifier occurs free in both modules, and with different intended meanings. We have seen how such conflicts can arise from function names, subprogram names, and labels for COMMON in FORTRAN, and from nonlocal identifiers in ALGOL 60. We have noted that name conflicts may be removed by substituting an alternate name for a free name at each appearance as an external reference within a program module. This substitution must be made before the modules to be combined have lost their separate identity, for example before an ALGOL program is compiled or before FORTRAN subprograms are linked.

There are several difficulties with name substitution as a means of resolving name conflicts. Firstly, performing the substitution may involve considerable information processing. A program module may itself be a combination of many simpler modules and the substituted name must be chosen so that no new conflicts are generated either inside or outside the program module.
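The distinction between bound and free names, and the two kinds of substitution, can be sketched in Python as a rough analogue of the ALGOL procedure f discussed earlier. The sketch is an illustration under stated assumptions: the helper `free_names` is hypothetical and relies on the Python-specific `__code__.co_names` attribute, which lists the global (and attribute) names a function body refers to; for these small functions that is exactly the set of free names.

```python
def f(x):
    return x + y        # x is bound (a formal parameter); y is free in f


def f_renamed(z):
    return z + y        # bound name x uniformly replaced by z


def free_names(func):
    """Names a function refers to but does not bind itself."""
    return set(func.__code__.co_names)


y = 10
same_meaning = all(f(v) == f_renamed(v) for v in range(5))
print(free_names(f), same_meaning)   # prints: {'y'} True


def f_substituted(x):
    return x + y2       # free name y replaced by y2 in a copy of f


y2 = 10                 # the new name must also be bound outside the module
print(f_substituted(1) == f(1))      # prints: True
```

Renaming the bound name needed no cooperation from the surroundings, whereas substituting for the free name required both a modified copy of the module and a matching binding outside it.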
The most important consequence of name substitution is that the possibility of sharing a representation of a program module among users of the module is foreclosed. A substitution required to remove a conflict cannot be made in a representation of a module already in use as part of another modular program. A copy of the module must be made first.

The importance of being able to share representations of program modules is gradually becoming recognized. In Multics [13, 14], the idea has been carried furthest: Every procedure written for operation in the system may be shared by all authorized users without the making of copies. We expect sharing to be increasingly important in future computer systems. Therefore, as a requirement of our concept of program modularity, we adopt the rule that free names occurring in a program module may refer only to fundamental entities of the linguistic level.

1.5. REFERENCES

1. T. O. Barnett, Modular programming: Proceedings of a National Symposium, Symposium Preprint. Information and Systems Press (out of business), Cambridge, Massachusetts 1968.
2. J. M. Sussman and R. V. Goodman, Implementing ICES module management under OS/360. Published in [1], pp 69 - 84.
3. W. E. Boebert, Toward a modular programming system. Published in [1], pp 95 - 111.
4. E. W. Dijkstra, A constructive approach to the problem of program correctness. BIT (Nordisk Tidskrift for Informations-behandling), Vol. 8, No. 3, 1968, pp 174 - 186.
5. J. B. Dennis, Future trends in time-sharing systems. Time-Sharing Innovation for Operations Research and Decision-Making, Washington Operations Research Council, Rockville, Maryland 1969, pp 229 - 235.
6. J. B. Dennis, Programming generality, parallelism and computer architecture. Information Processing 68, North-Holland Publishing Co., Amsterdam 1969, pp 484 - 492.
7. E. W. Dijkstra, Recursive programming. Numerische Mathematik, Vol. 2, 1960, pp 312 - 318.
8. P. Naur, et al, Report on the algorithmic language ALGOL 60. Comm. of the ACM, Vol. 3, No. 5 (May 1960), pp 299 - 314.
9. E. Morenoff and J. B. McLean, Program string structures and modular programming. Published in [1], pp 133 - 143.
10. M. E. Conway, Design of a separable transition-diagram compiler. Comm. of the ACM, Vol. 6, No. 7 (July 1963), pp 396 - 408.
11. E. W. Dijkstra, Co-operating sequential processes. Programming Languages, F. Genuys, Ed., Academic Press, New York 1968. First published as Report EWD 123, Department of Mathematics, Technological University, Eindhoven, The Netherlands, 1965.
12. O. J. Dahl and K. Nygaard, SIMULA - an ALGOL-based simulation language. Comm. of the ACM, Vol. 9, No. 9 (September 1966), pp 671 - 678.
13. F. J. Corbato, C. T. Clingen, and J. H. Saltzer, MULTICS -- The first seven years. AFIPS Conference Proceedings, Vol. 40, SJCC, 1972, pp 571 - 583.
14. R. C. Daley and J. B. Dennis, Virtual memory, processes, and sharing in MULTICS. Comm. of the ACM, Vol. 11, No. 5 (May 1968), pp 306 - 312.
2. DATA STRUCTURES IN MODULAR PROGRAMMING

The achievement of program modularity becomes increasingly difficult as the linguistic requirements for representing program modules move further from the linguistic level defined by the computer system on which the modules are to be run. In this lecture, we explore issues arising in the construction of program modules that require the ability to create, extend, and modify structured data. We conclude that, to achieve modularity, a computer system must define a linguistic level that provides a suitable base representation for structured data, a requirement not satisfactorily met by conventional computer systems or by implementations of contemporary programming languages.

2.1. ADDRESS SPACE AND MODULARITY
First we note that conventional computer memories and addressing schemes impose a limitation on modular programming. When a program is run on a contemporary computer system, all procedures and data involved in the computation must be assigned positions within the address space provided for the computation by the computer system. If more than a single object, whether procedure or data, is assigned to some area of the address space, the meanings of addresses must change during the computation. This violates our principles of modular programming because some program modules will require knowledge of the internal construction of others in order to determine which objects should occupy the shared areas of address space. Thus the finiteness of address space limits the size of modular programs. To support modular programming a computer system must provide an address space of size sufficient to hold all procedures and data structures required for the execution of any modular program. A more complete presentation of this argument may be found in
The addressing limitations of finite main memories have been reduced through the brute force expedient of using larger and larger main memories. Yet practical main memories are still small in comparison to the extent of data bases and program libraries we wish to use in constructing large modular programs. A more sophisticated approach to overcoming the finiteness of main memory is to arrange a computer system to provide a large virtual address space for each user. In effect, a process is given a large address space without tying up a corresponding amount of main memory. As it is currently implemented, the virtual memory idea also has limitations, for chunks of address space are reassigned from one physical storage device to another in relatively large units (512-word pages, for example) and it is difficult for the programmer of a module to map his data structures into the address space in such a way that related items will be moved together between physical storage levels.
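The difficulty of keeping related items together can be seen from a little arithmetic on page numbers, sketched here in Python. The 512-word page size is the figure quoted above; the function `pages_touched` and the two placements of items are hypothetical examples chosen only to show the contrast.

```python
PAGE_WORDS = 512  # page size quoted in the text


def pages_touched(addresses, page_words=PAGE_WORDS):
    """Set of page numbers that hold the given word addresses."""
    return {addr // page_words for addr in addresses}


# Eight related items placed contiguously fit in one page ...
clustered = [1000 + i for i in range(8)]
# ... while the same eight items scattered through the space use eight pages.
scattered = [i * 4096 for i in range(8)]

print(len(pages_touched(clustered)), len(pages_touched(scattered)))
# prints: 1 8
```

In the scattered case every reference to a related item may require a different page to be brought into main memory, although only eight words are actually of interest.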
2.2. REPRESENTATION OF PROGRAM MODULES

Other implications of modularity concern features of the linguistic level at which modules are represented for combination into larger units. We noted earlier that all identifiers occurring in a program module must be bound within the module unless they refer to primitive constructs of the linguistic level. Otherwise identifier clashes can occur when independently prepared modules are used together. From this premise it follows that any information to which a program module requires access to perform its function must be part of the module itself, or must be passed to the module by means of formal parameters of the calling statement. Any information created or modified by the module and intended for use outside must be passed to the caller through formal parameters.

Since the objective of modularity is that any program may be used as a program module, it must be possible to treat any entity to which reference may be made by a program module as an actual input or output parameter of the module. A program module that implements a certain algorithm should be applicable to any input data to which the algorithm applies. It is possible to design algorithms that work effectively for a wide range of inputs as, for example, a procedure for matrix inversion or one for constructing the parse of a sentence according to a formal grammar. The representation of such program modules requires linguistic primitives for building and altering data structures of extent not known until the time of execution.

In summary, we have three requirements to be met by a linguistic level intended as a foundation for modular programming:

1. Any data structure may occur as a component of another data structure.
2. Any data structure may be passed (by reference) to or from a program module as an actual parameter.

3. A program module may build data structures of arbitrary complexity.

The linguistic levels defined by conventionally organized computer systems have a linear address space as their fundamental notion of data structure, and indexing as their fundamental means of data access. Such a level is not an acceptable foundation for modular programming because the primitive constructs do not provide for altering one data structure without interfering with the representations of others. To enlarge one structure may require rearrangement of other structures in address space and cannot be done without knowledge of their scheme of representation.

There are three ways in which a satisfactory linguistic level for modular programming can be realized starting from a host level H defined by some computer system:

1. Use a "standard" programming language L with an available translator to level H and having an adequate class of data structures and primitive operations.

2. Extend a programming language L' that does not offer an adequate class of data structures, to realize a new linguistic level L that is adequate.

3. Design and implement a new language L by constructing either
   a. A translator from L to H.
   b. An interpreter of L that runs at level H.
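The earlier remark that enlarging one structure in a linear address space may force the rearrangement of others can be illustrated with a short Python sketch. It is a deliberately naive model under stated assumptions: the class `LinearSpace`, its methods, and the back-to-back layout are hypothetical, and a Python list stands in for the address space.

```python
class LinearSpace:
    """Structures stored back to back in one array M, found by base address."""

    def __init__(self):
        self.M = []
        self.base = {}     # structure name -> (base address, length)

    def allocate(self, name, values):
        self.base[name] = (len(self.M), len(values))
        self.M.extend(values)

    def read(self, name):
        start, length = self.base[name]
        return self.M[start:start + length]

    def enlarge(self, name, value):
        """Append one word to `name`, relocating every structure above it."""
        start, length = self.base[name]
        end = start + length
        self.M.insert(end, value)
        self.base[name] = (start, length + 1)
        for other, (b, n) in self.base.items():
            if other != name and b >= end:
                self.base[other] = (b + 1, n)   # needs the other's layout


space = LinearSpace()
space.allocate("A", [1, 2, 3])
space.allocate("B", [7, 8])
old_base_of_b = space.base["B"][0]
space.enlarge("A", 4)
print(old_base_of_b, space.base["B"][0], space.read("A"), space.read("B"))
# prints: 3 4 [1, 2, 3, 4] [7, 8]
```

The operation on A succeeds only because `enlarge` knows where B lives and how long it is; a module that owned A but knew nothing of B's representation could not perform it safely.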
Suppose the host level H is defined by a conventional computer which provides the user with a linear address space. Whichever of the above means is used to realize the desired linguistic level L, the data structures of L must be mapped into the linear address space of H in such a way that the primitive operations of L can be implemented effectively in terms of the primitives of H. The difference between means (1) or (3) above and means (2) is that in cases (1) and (3) the mapping of L into the linear address space of H is standard in the language L and is uniform over all program modules expressed in L; in (2) the mapping of structures of L into the linear address space of H is chosen independently by the designer of each program module and the same choice is unlikely to be made for any pair of modules.

To be more specific, suppose the designer of a program module is using the second approach. Let the language L' be a language (FORTRAN or ALGOL 60, for example) that does not provide adequate primitives for manipulating structured data. To implement the program module, the designer must extend L' by adding a memory. He does this by setting aside some portion M of the linear address space of H to hold representations of data structures of L as they are created and operated upon during operation of the program module. The memory may be viewed as a pair (M, C) where M is a one-dimensional array, and C is a collection of procedures that implement the primitive data structure operations of L. If L' is FORTRAN, the memory array M may be allocated within a block of COMMON storage and the procedures of C may be realized as a group of subprograms. If L' is ALGOL 60, the memory array and the procedures of C would be declared within the outermost block of the program module.

There are serious problems with an approach in which the memory is separately implemented in independent program modules. Suppose A and B are two such modules. Then:

1. Either the base linguistic level H includes an allocation mechanism for units of address space, or arbitrarily chosen areas of address space must be set aside as the memory arrays for modules A and B.

2. A structure created by module A cannot be directly accessed from
within module B, for the primitives of A are not used within B. Partitioning the address space into separate areas for each module requires that each area be large enough to hold any structure that could be created. The idea of segmentation [ l ] is a way of meeting this requirement. I f the host level H provides a f a c i l i t y for management of address space, then introducing a second layer of memory management mechanism aggravates the inefficiency of program execution. The problem o f communicating e x p r e s s e d in d i f f e r e n t Figure
data
structures
representations
between program modules
may be d i s c u s s e d
2. Modules A and B are e x p r e s s e d in d i f f e r e n t
L B o f a host l i n g u i s t i c o f data s t r u c t u r e
level
in terms o f
extensions
H. Sets SA and SB r e p r e s e n t
representations
L A and
the classes
in L A and L B . The maps fA and fB
143
(which may be r e l a t i o n s )
relate
sponding r e p r e s e n t a t i o n s
at
If
L A and L B are d i f f e r e n t ,
the l i n g u i s t i c
produced less,
levels
the host l e v e l
by module A cannot be d i r e c t l y
host l e v e l to t h e i r
these r o u t i n e s
and t h i s
then a data s t r u c t u r e
if
no data s t r u c t u r e s
from t h e i r
t and t - I
representation
i n L B and v i c e v e r s a . Of c o u r s e ,
is a violation
how the data s t r u c t u r e s
H.
we can p r e p a r e r o u t i n e s
H which c o n v e r t s t r u c t u r e s
representations
in L A and L B to c o r r e -
accessed by module B. N e v e r t h e -
modules A and B may be used t o g e t h e r
exchanged between them, or i f
write
representations
of modularity
since
are at the in L A
the need to
knowledge o f
o f LA and L B are r e p r e s e n t e d a t H i s r e q u i r e d ,
knowledge concerns
the i n t e r n a l
construction
o f modules A and
B.

We have discussed Figure 2 assuming modules A and B include the definitions of LA and LB as internal components. The same picture holds if modules A and B are expressed in "standard" languages LA and LB that define primitive operations on data structures by two different extensions of a host level H. If LA and LB are "standard" languages, then knowledge of the mappings fA and fB does not involve internal knowledge of modules A and B. Thus the construction of the conversion routines t and t^-1 depends on knowledge of the implementations of LA and LB rather than the workings of the modules. However, now these routines are subject to invalidation if the implementation of either LA or LB is changed.
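
The situation of Figure 2 can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class names MemoryA and MemoryB, the two storage layouts, and the function names t and t_inv are hypothetical and do not come from the text. Each class plays the role of a memory (M, C): a one-dimensional array M plus procedures that implement the structure primitives. The two modules store the same abstract object, a sequence of integers, in different ways, so the conversion routines must know both layouts.

```python
class MemoryA:
    """Module A (hypothetical): a sequence is a chain of two-word cells
    [value, next]; the address -1 marks the end of the chain."""
    NIL = -1

    def __init__(self, size):
        self.M = [0] * size   # the one-dimensional array M
        self.free = 0         # next unused word (no reclamation in this sketch)

    def build(self, values):
        """Store a sequence and return the address of its first cell."""
        head = self.NIL
        for v in reversed(values):
            addr = self.free
            self.M[addr] = v
            self.M[addr + 1] = head
            self.free += 2
            head = addr
        return head

    def read(self, addr):
        out = []
        while addr != self.NIL:
            out.append(self.M[addr])
            addr = self.M[addr + 1]
        return out


class MemoryB:
    """Module B (hypothetical): a sequence is a contiguous block
    [length, v0, v1, ...]."""

    def __init__(self, size):
        self.M = [0] * size
        self.free = 0

    def build(self, values):
        addr = self.free
        self.M[addr] = len(values)
        for k, v in enumerate(values):
            self.M[addr + 1 + k] = v
        self.free += 1 + len(values)
        return addr

    def read(self, addr):
        n = self.M[addr]
        return self.M[addr + 1: addr + 1 + n]


def t(mem_a, addr_a, mem_b):
    """Convert A's representation into B's; walks A's cells directly,
    so it depends on the internal layout of both memories."""
    values = []
    while addr_a != MemoryA.NIL:
        values.append(mem_a.M[addr_a])
        addr_a = mem_a.M[addr_a + 1]
    return mem_b.build(values)


def t_inv(mem_b, addr_b, mem_a):
    """Convert B's representation back into A's."""
    n = mem_b.M[addr_b]
    return mem_a.build(mem_b.M[addr_b + 1: addr_b + 1 + n])
```

Note that t and t_inv reach into the arrays of both memories. If either module changes its layout, both routines must be rewritten, which is the loss of modularity the text describes.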

Figure 2. Exchange of data structures between program modules.

If the host level H defines a linear address space, construction of the conversion routines can prove difficult. This is because level H lacks notions that would save the programmer from the need for complete knowledge of the data structures being transformed. A data structure represented in a linear address space is referenced by an address, but there is no uniform rule for locating all items in the address space that are parts of the data structure. Also there is no uniform convention regarding how individual data structures may be combined into a single object.

That two program modules are represented at the same linguistic level L does not ensure that consistent representations are used for objects dealt with by the algorithms of the modules. For example, there are many ways in which a directed graph may be represented by a vector of integers. If a community of users interested in sharing program modules that manipulate directed graphs can agree on a standard representation in L for directed graphs, then programs contributed by the community may be used together without difficulty. Otherwise conversion routines are required. Nevertheless, if a directed graph is to be passed as an argument or result of computation by a module, the scheme of representation of a directed graph in L must be given as part of the functional specification of the module. The necessary conversion routines can be written in L from the module specification. Without adequate data structure primitives in the common language L, the conversion routine would be difficult, if not impossible, to write.

2.3. LINGUISTIC LEVELS FOR MODULAR PROGRAMMING

We have argued that a satisfactory linguistic level for modular programming must provide adequate primitives for building and transforming data structures, and that the linguistic levels defined by computer systems of conventional organization are inadequate. Next we examine the linguistic levels defined by several well-known programming languages, particularly in regard to their suitability for modular programming. The two most familiar languages, FORTRAN and ALGOL 60, are inadequate by default, since arrays are the only sort of data structure provided, and their provisions for building and transforming arrays are inflexible: the bounds of arrays are fixed once storage has been allocated, and the dimensionality of arrays is fixed by the program text. The languages PL/I, ALGOL 68, and LISP are considered in the following paragraphs.

2.3.1. PL/I

In PL/I [2] the principal data types that may be used to represent and manipulate structured data are arrays, structures, based variables and pointers. Arrays in PL/I are subject to similar limitations as arrays in FORTRAN or ALGOL 60: an array identifier may only name arrays of the declared dimensionality; subscript bounds cannot be changed once storage for an array is allocated; all elements of an array must be of the same data type. These limitations are imposed so that a permanent assignment of array elements to a contiguous portion of address space is possible, and the efficient indexing access mechanism of present day computers may be used.

In PL/I structures, components are accessed by means of a sequence of symbolic names called selectors; the length of the selector sequence is the depth of the component in the structure. Components of a structure may be further structures, arrays, etc. So that each generation of a structure (all satisfying the same declaration) may be permanently assigned to a contiguous portion of address space, each component of a structure is restricted to a size stated in the structure declaration. Structures may occur as elements of arrays.

Structures as in PL/I do not meet the requirements of modular programming. It is not possible to make an arbitrary structure a component of another structure during a computation; the entire form of a structure must be specified before any of its components may be given a value. Furthermore, since the depth of a structure is implicit in the program text, there is no way of representing data structures of arbitrary extent as PL/I structures.

Use of PL/I pointer variables, the addr primitive, and variables, arrays and structures declared based permits the construction of arbitrarily complex address-linked structures. The only correct interpretation of PL/I pointer values is as locations within a linear address space. Pointer values may occur as elements of arrays and as components of structures as well as values of simple variables declared as pointer variables. A pointer value is created either by applying the primitive function addr to a name, or by explicitly allocating storage for a variable declared to be based, the pointer value being the origin of the allocated region of address space.

Although PL/I pointers provide a very general facility for building representations of data structures, the needs of modular programming are not met. A pointer value cannot be regarded as a reference to a data structure because PL/I has no notion of component; there is no convention for identifying the set of elements belonging to the structure referenced by a pointer value. There is no guarantee that an element pointed to has the data type intended by the programmer. Further, PL/I provides no built-in concept of one linked structure being a component of another. Deletion of elements from a linked structure must be done by explicit free statements; an element disconnected through reassignment of pointer values remains in existence until explicitly released. Each programmer is forced to adopt his own conventions regarding extent of data structures, and when storage may be reclaimed. Hence the use of PL/I linked structures for communication between independent program modules offers no advantage over a bare machine having a linear address space.

Unsuitability of data structure facilities is not the only problem PL/I presents for modular programming. Since PL/I refers to "external" procedures and data sets in the same manner as FORTRAN, and since procedures may have nonlocal identifiers, name clashes are possible whether one chooses the PL/I program or the PL/I procedure as the form of program module. In addition, the introduction of new language features such as tasking has not considered the requirements of modularity.

2.3.2. ALGOL 68

In an ALGOL 68 program [3, 4], each occurrence of an identifier has an associated mode that determines the set of values permitted for the named variable. The modes that provide representations for data structures are multiple values and structures. Multiple values are similar to PL/I arrays and, in themselves, do not provide an adequate foundation for modular programming.

A structure mode declaration in ALGOL 68 specifies that any value of the mode being declared is an object having a fixed number of component objects identified by field selectors,
each component being an object of specified mode. Through use of several mode declarations one may define a class of objects having graphs that are trees. Each node of such a tree has an associated mode and is the origin for a fixed number of arcs, each bearing a field selector as specified in the mode declaration. Since recursive mode declarations are permitted, the objects of a given mode may be of unbounded depth, as for example, the class of binary trees. Yet no ALGOL 68 structure mode permits values that range over all ALGOL 68 data structures. Thus there is no means for substituting an arbitrary structure for some component of an existing structure. Specifically, it is not possible to write an ALGOL 68 procedure that obtains a data structure from one program module and gives it to another module without knowing enough about the data structure to specify its mode. Also, a program module expressed in ALGOL 68 cannot build an arbitrary data structure, because a finite set of mode declarations is insufficient to describe the complete class of ALGOL 68 objects.

Since ALGOL 68 includes suitable conventions for delineating the extent of data structures, and has satisfactory provisions for building and accessing complex structures, the data structure primitives of ALGOL 68 are superior to those of PL/I as a foundation for modular programming. However, the requirement that the mode of every variable be explicit is an unfortunate limitation.

Other limitations of ALGOL 68 for modular programming stem from the design of the language primarily as a means for one programmer to write a complete program for a computation of interest to himself. A prime example is the concept of coercions by which conversion of values from one data type to another is implicit in many circumstances. A consequence of ALGOL 68 coercion is that a scan of an entire program may be necessary to fix the meaning of statements in a deeply nested procedure.

2.3.3. LISP

In Lisp [5, 6] data structures are represented as lists. A region of a linear address space (the memory) is reserved for cells from which lists are built to represent data structures. Each cell has two fields which may contain addresses (called pointers) of other cells in the memory.
A list is specified by the address of a cell and consists of all cells that can be reached by tracing pointers from the starting cell. Thus a list is essentially a rooted, directed graph in which each node is the origin of at most two arcs that define the left and right sublists for the corresponding cell. In most applications, lists containing directed cycles do not occur, and lists have the form of a binary tree with shared subtrees.

Lisp includes primitive operations for obtaining the left or right component sublist of any list, for testing whether two lists are equal or are the same list, and for building lists, making one list the new left or right sublist of an existing list. The leaf cells of lists are called atoms and have associated named values called properties. A property of an atom may be an elementary object such as a character string, an integer or a real number, or may be an arbitrary list. Lisp includes basic functions for performing operations on property values.

Since any list may occur as an actual parameter of a Lisp function application, and Lisp has primitives for building, dissecting and rearranging lists without disturbing the meaning of other lists sharing the memory, Lisp meets our fundamental requirements for modular programming with respect to data structures. It is easy to devise ways in which lists may be used to represent any of the commonly used data structures in programming practice.

The principal weakness of Lisp for modular programming arises from its inability to exploit indexing as an efficient access mechanism for arrays. For applications where an array is a natural representation for a data structure, a variety of different representations as lists have been designed to yield efficient operation for expected patterns of access. Because these representations are generally in conflict, conversion is often required to combine independently written Lisp functions, where conversion would not be required if the modules were expressed in a language offering arrays as a basic data type.

Lisp also shares with the other languages we have discussed the failing of having a global level of nomenclature. Programmer defined functions and constants are given names that are global in a Lisp program. There is no provision for ensuring freedom from name conflicts when independently written Lisp programs are combined.
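
The cell memory and list primitives described in this section can be sketched as follows. This is an illustrative model only, not the representation used by any Lisp system: the names Ref, ListMemory, cons, left, right, same and equal are assumptions chosen for the sketch, and atoms are modelled simply as Python values stored in a field. The sketch assumes lists without directed cycles, as in most applications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A pointer: the address of a cell in the memory (hypothetical name)."""
    addr: int


class ListMemory:
    """A region of cells, each with a left and a right field.
    A field holds either a Ref to another cell or an atom."""

    def __init__(self):
        self._left = []
        self._right = []

    def cons(self, left, right):
        """Build a new cell and return its address."""
        self._left.append(left)
        self._right.append(right)
        return Ref(len(self._left) - 1)

    def left(self, ref):
        return self._left[ref.addr]

    def right(self, ref):
        return self._right[ref.addr]

    def same(self, x, y):
        """True if x and y are the same list (same cell) or the same atom."""
        return x == y

    def equal(self, x, y):
        """True if x and y have the same form and the same atoms.
        Assumes no directed cycles, otherwise this would not terminate."""
        if isinstance(x, Ref) and isinstance(y, Ref):
            return x == y or (self.equal(self.left(x), self.left(y))
                              and self.equal(self.right(x), self.right(y)))
        return x == y
```

Because a list is named by the address of its first cell, one list can be made a sublist of several others without copying: building `x = m.cons('a', shared)` and `y = m.cons(shared, None)` gives two lists with a shared subtree, and neither construction disturbs the other.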

2.3.4. DISCUSSION

On one hand, Lisp was considered superior to PL/I and ALGOL 68 as a foundation for modular programming because PL/I and ALGOL 68 fail to provide an adequate foundation for representing and manipulating data structures. The limitations of PL/I and ALGOL 68 can be traced to the desire of the designers of these languages to make efficient implementations possible for conventional computers that implement a linear address space. Thus it is essential that arrays be included as a fundamental data type and that arrays be implemented using the indexing hardware of contemporary machines. On the other hand, Lisp has achieved a more satisfactory concept of data structure by giving up the array as a fundamental notion and ignoring the use of indexing. By making these concessions, the address space may be divided uniformly into list cells so that the allocation and deallocation of cells become trivial operations. In this way a powerful language for expressing computations on symbolic data has been realized.

Is there a way to combine the best aspects of these three languages? In the final section of these notes we explore the definition of a base linguistic level for modular programming using a concept of data structure that yields natural representations for a wide variety of data structures commonly applied in programming practice, including lists, arrays, and structures. Although this concept may prove impractical to implement for general use on computers of conventional organization, it should prove valuable as a standard of achievement, and as a guide for the design of computer systems intended to advance the prospects for modular programming.

2.4. REFERENCES

1. J. B. Dennis, Segmentation and the design of multiprogrammed computer systems. J. of the ACM, Vol. 12, No. 4 (October 1965), pp 589-602.
2. S. V. Pollack and T. D. Sterling, A Guide to PL/I. Holt, Rinehart and Winston, Inc., 1969.
3. A. van Wijngaarden, Ed., Report on the algorithmic language ALGOL 68. Numerische Mathematik, Vol. 14, No. 2 (1969), pp 79-218.
4. J. E. L. Peck, An ALGOL 68 Companion. Department of Computer Science, University of British Columbia, Vancouver, B.C., Canada, October 1971 (preliminary edition).
5. M.I.T. Computation Center, LISP 1.5 Programmer's Manual. Computation Center and Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass., August 1962.
6. E. C. Berkeley and D. G. Bobrow, Eds., The Programming Language LISP: Its Operation and Applications. Information International, Inc., Cambridge, Mass. 1964.

3. MODULARITY IN MULTICS

We have seen that most contemporary computer systems and programming languages do not support a very general form of modular programming. Yet one advanced computer system comes significantly closer to defining a linguistic level suitable for modular programming. A major objective of the development of Multics at Project MAC [1] has been to create an environment within which programs developed independently and expressed in different source languages may be combined with minimum difficulty. In this lecture we shall study how well this objective has been achieved.

First, we present a model for those aspects of Multics that must be understood to discuss modularity from the viewpoint of the Multics user. Then we discuss the achievements and limitations of Multics for modular programming in terms of the model. The model consists of a representation for the states of Multics processes as an augmented class of objects, and an informal discussion of certain state transitions that occur during execution of procedures by Multics processes. We do not attempt to model the mechanisms of Multics for protection, access control, and interprocess communication.

3.1. THE MODEL

3.1.1. THE FILE SYSTEM

The file system of Multics [2] retains the programs and data of all Multics users in the form of a hierarchical structure of directories and segments. We represent a directory by an object as in Figure 3. A directory has arbitrarily many components, each of which may be a directory entry, a segment entry, or a link; an example of each type is shown. The selectors for the entries of a directory are called entry names, and are character strings. Each entry name in a directory must be unique. A directory or segment entry has an 'attr' component that gives attributes such as access rights, date of last change, etc. The second component (directory or segment) is an object that represents either another directory, or a segment. A link is a pathname composed of a sequence of entry names. The Multics file system is an object that represents a particular directory called the root directory. Each item in the file system is specified by the unique sequence of entry names by which the item may be reached from the root of the directory tree. The sequence of entry names is a pathname of the directory or segment.

A segment in Multics is a linear address space of 2^18 addresses which may hold either data or one or more procedures. A segment is represented by an object having elementary components selected by the integers 0, 1, ...

In the root directory of the file system, the entry names are user names and the entries are user directories. A user is the owner of all directories and segments that are entries in his user directory, and is the owner of directories and segments that are entries in owned directories.

We will simplify the representation of the file system state by omitting attribute components and omitting the branches labelled 'directory' or 'segment'. This simplified form is illustrated in Figure 4. Entry names of links are distinguished by an asterisk. The link shown is to the item having pathname 'b.b.a'.
PROCESSES
When a M u l t i c s him.
AND
SPACES
user begins a c o n s o l e s e s s i o n ,
By t y p i n g commands a t the c o n s o l e ,
execute procedures. file
ADDRESS
system s t a t e .
sole session only record
is
in changes in the
N o r m a l l y a user process ceases to e x i s t
retained
and the changes to the f i l e
in M u l t i c s
For our purposes a s t a t e h a v i n g a component f o r distinct
is created for
the user causes the process to
The e x e c u t i o n of commands r e s u l t s
terminated,
cess in e x i s t e n c e .
a process
of the u s e r ' s
of M u l t i c s
the f i l e
In F i g u r e
when h i s con-
system are the
activity.
may be r e p r e s e n t e d as an o b j e c t
system, and one component f o r
5 we have i d e n t i f i e d
each p r o -
each process by a
user name.
The s t a t e
of process
i s an o b j e c t
h a v i n g components as f o l l o w s
6): i.
'memory'
process address space
2.
'stack'
s t a c k segment and p o i n t e r
(Figure
153
T
i
I ent-name-i
i, u,
i
I
ent-name- j
ent-name-k
I ~attr'
'se ~ent'
'attr'
[attributes
t attributed
'directory'
I I directory I I
k-
directory entry
I
I
I
I
0
i
2
n
666-°°6 ¥ - -
• ....
/
segment entry
Figure 3.
Model for the Multics file system.
~
i ink
I
154
3.
'k~t'
bown
4.
'link'
linkage
segment and p o i n t e r
5.
'w.dir'
working
directory
In f a c t ,
segment t a b l e
components o f the process s t a t e
the M u l t i c s
file
are implemented as segments in
system which are a c c e s s i b l e to system p r o c e d u r e s .
choose to model them as s e p a r a t e o b j e c t s function
from the u s e r ' s
for
ease in d i s c u s s i n g
state
i s the address space i m p l e -
mented by the hardware and s o f t w a r e of M u l t i c s ure 7.
It
integers
is a two-level
tree.
The s e l e c t o r s
process.
i s shown in F i g level
are
sejment numbers. Each segment number i d e n t i f i e s up to 218 words.
space are not d i s t i n c t
from segments of the f i l e
the f i l e
each M u l t i c s
a t the f i r s t
ment which may c o n t a i n lected
for
t h a t models the address space of a process
called
their
viewpoint.
The 'memory'-component o f a process The o b j e c t
by segment numbers a r e , system s t a t e .
a seg-
Since the segments of an address
in f a c t ,
system, the nodes se-
identical
with
The address spaces of M u l t i c s
segment nodes of
processes are
implemented by a complex arrangement of h a r d w a r e - a c c e s s e d t a b l e s core memory, a small
associative
(drum and d i s c )
to hold
core memory [ 3 ,
4].
called
memory, and a u x i l i a r y
pages of segments not a l l o c a t e d
A two-component address c o n s i s t i n g
number and a word n u m b e r , t h a t a process,is
specifies
storage devices space in the of a segment
a word in the address space of
of a process
state
consists
of a segment ( f o r
purposes not p a r t of the f i l e
system) and a p o i n t e r
a s s i g n e d by the programmer to
"automatic"
to the s t a c k p o i n t e r .
variable.
entry.
In t h i s
and r e t u r n
3.1.3.
the s t a c k p o i n t e r
way, a l l
Multics
our
Variables
s t o r a g e are accessed by ad-
On procedure e n t r y
the p o i n t e r
is advanced to the end o f the s t a c k area used by the c a l l i n g on p r o c e d u r e e x i t
in
a g e n e r a l i z e d address.
The ' s t a c k ' - c o m p o n e n t
dresses r e l a t i v e
We
is returned
procedures
that
to i t s
procedure;
value before
use the s t a n d a r d c a l l
c o n v e n t i o n s may be used r e c u r s i v e l y .
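
The stack discipline just described can be illustrated with a small sketch. The names StackSegment, enter, exit and the factorial example are hypothetical; the sketch only shows why advancing the pointer on entry and restoring it on exit lets a procedure that keeps its "automatic" variables at addresses relative to the pointer be used recursively.

```python
class StackSegment:
    """A stack segment plus a pointer to the base of the current stack area."""

    def __init__(self, size):
        self.words = [0] * size
        self.pointer = 0

    def enter(self, caller_area_size):
        """Procedure entry: advance the pointer to the end of the area
        used by the calling procedure; return the value to restore."""
        saved = self.pointer
        self.pointer += caller_area_size
        return saved

    def exit(self, saved):
        """Procedure exit: return the pointer to its value before entry."""
        self.pointer = saved


def factorial(stack, n):
    AREA = 1                                  # one automatic variable
    stack.words[stack.pointer + 0] = n        # stored relative to the pointer
    if n <= 1:
        return 1
    saved = stack.enter(AREA)                 # call: callee gets a fresh area
    sub = factorial(stack, n - 1)
    stack.exit(saved)                         # return: our own area is current again
    return stack.words[stack.pointer + 0] * sub
```

Each activation reads its own copy of n after the recursive call returns, because the callee's writes went to a different part of the stack segment.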

3.1.3. MAKING A SEGMENT KNOWN TO A PROCESS

Figure 4. Simplified model for the file system.

Figure 5. Model for a state of Multics.

Figure 6. Model of a Multics process.

Figure 7. Model for the address space of a process (each segment up to 2^18 words; approx. 2^12 segments).

The assignment of a segment from the Multics file system to the address space of a process is called making the segment known to the process. This action occurs when the process, in executing a procedure, encounters
a symbolic reference to a segment. The symbolic name used in the code of the procedure segment is called a reference name. The path name of the segment in the file system to which a reference name refers is found by a system procedure directed by a set of search rules in a manner to be discussed later. A segment known to a process has an associated segment number; segment numbers are assigned to segments sequentially as they become known to the process.

The associations between segment numbers, reference names and path names for all segments known to a process are held in a data structure called the known segment table which is the 'kst'-component of the process state. The known segment table is modelled as an object in Figure 8. For example, the figure shows that segment number i of this process has the path name 'x.y.a' and the reference names 'a' and 'b' have been used to refer to the segment during operation of the process. The 'n'-component of the known segment table is the highest integer in use as the segment number of a segment known to the process. It is given the initial value 0 when the process is created, and is incremented by 1 for each segment made known to the process.

Figure 8. Model for the known segment table.

An illustration of the state transition that occurs when a segment is made known to a process is shown in Figure 9. The value i of the 'n'-component of the known segment table is incremented and used as the selector for a new entry in the known segment table. The new entry contains the reference name 'a' used by the procedure in execution and the path name 'x.y.a' obtained by system routines directed by the search rules. Segment i+1 of the address space of the process is identified with the segment having pathname 'x.y.a' in the file system.

Figure 9. Making segment 'a' with pathname 'x.y.a' known to a process: (a) before; (b) after.

3.1.4. DYNAMIC LINKING

For a segment S to be made known to a process, reference to S by means of a reference name must occur from within some procedure segment P. Once segment S is known to the process, references to it should use the hardware-implemented addressing mechanism provided for generalized addresses. The Multics state transition that realizes this objective is called linking. Linking a site of reference in segment P to segment S cannot involve any change in the content of segment P, because procedure segments in Multics are shared among processes. The scheme used is to implement references to other segments from segment P by indirect addressing through items called links that make up a linkage section for segment P. The linkage sections for all procedure segments known to a process form the 'link' component of the process state. When a procedure segment is made known, its linkage section is added to the 'link' component, with each of its links set to cause transfer of control to a system routine. On the first reference through a link, the system routine determines whether the referenced segment is known to the process. If not, the segment is made known. The link is then changed to hold the generalized address of the item referenced. Details of this mechanism have been published [4].
S E A R C H R U L E S AND THE W O R K I N G D I R E C T O R Y
A Multics
user must s p e c i f y
an owned d i r e c t o r y
of the f i l e
system as
working directory f o r h i s process when he begins a c o m p u t a t i o n . The
working d i r e c t o r y
o f a process may be changed by a system command p r o -
cedure which may a l s o be c a l l e d the w o r k i n g
directory
is
the
The s e a r c h rules of M u l t i c s during rules
by the u s e r ' s
'wdir'-component specify
are s t a t e d
as a l i s t
search r u l e s
program.
The pathname of
of the process
state.
how r e f e r e n c e names encountered
p r o c e d u r e e x e c u t i o n are to be c o n v e r t e d
to be searched f o r
1.
segment i s
segment i s made known as d e s c r i b e d above. Then the
i s r e p l a c e d by the g e n e r a l i z e d address of the r e f e r e n c e d
The d e t a i l s
the
this
to a
reads the r e f e r e n c e name from the
procedure segment and d e t e r m i n e s whether the r e f e r e n c e d known.
'link'-com-
of data s t r u c t u r e s
into
pathnames. The search
in the sequence t h e y are
an e n t r y named by the g i v e n r e f e r e n c e name. The usual
specify
the f o l l o w i n g
o r d e r of s e a r c h :
known segments
2.
referencing
3.
working
directory
4.
system l i b r a r i e s
directory
The search begins by t e s t i n g entry
whether the segment i s r e p r e s e n t e d by an
in the known segment t a b l e .
This
i s done so t h a t
links
to seg-
ments a l r e a d y known to the process may be completed w i t h o u t
any d i r e c -
tory
If
searching,
which consumes s i g n i f i c a n t
processing
time.
the r e f -
erence i s not to a segment a l r e a d y known, a search i s made of the erencing
directory"
currently procedures
--
the d i r e c t o r y
in e x e c u t i o n was o b t a i n e d . that
This
search r u l e
form a subsystem are grouped t o g e t h e r
and g i v e s p r e f e r e n c e to such a r e l a t e d the same name in the u s e r ' s
working
"ref-
from which access to the procedure supposes t h a t in d i r e c t o r i e s ,
p r o c e d u r e o v e r a p r o c e d u r e of
directory.
161
A program e x p r e s s e d in FORTRAN references
its
directory,
and accesses l i b r a r y
or P L / I
~or e x e c u t i o n
by M u l t i c s
normally
user-owned p r o c e d u r e and data segments in the w o r k i n g procedures
in
the system l i b r a r i e s
di-
rectory.
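The known segment table, the usual search rules and link resolution described above can be sketched in a few lines of Python. This is only an illustrative model, not the Multics implementation: the names `Process`, `Entry`, `make_known`, `follow_link`, the dotted-path helper `directory_of`, the library directory name `lib`, and the representation of the file system as a set of path names are all assumptions made for the sketch. The directory test applied to known-segment entries follows the refinement these notes describe in Section 3.3.1.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One known-segment-table entry: reference names plus a path name."""
    refs: set
    path: str


@dataclass
class Process:
    wdir: str                                   # 'wdir'-component: working directory
    n: int = 0                                  # 'n'-component: highest segment number in use
    kst: dict = field(default_factory=dict)     # segment number -> Entry
    links: dict = field(default_factory=dict)   # (caller number, ref name) -> target number


def directory_of(path):
    """Directory part of a dotted path name, e.g. 'x.y.a' -> 'x.y'."""
    return path.rpartition(".")[0]


def make_known(proc, files, ref, referencing_dir, libraries=("lib",)):
    """Return the segment number for `ref`, making the segment known if needed."""
    # Rule 1: known segments, accepted only if found in the referencing directory.
    for number, entry in proc.kst.items():
        if ref in entry.refs and directory_of(entry.path) == referencing_dir:
            return number
    # Rules 2-4: referencing directory, working directory, system libraries.
    for directory in (referencing_dir, proc.wdir, *libraries):
        path = f"{directory}.{ref}"
        if path in files:
            break
    else:
        raise LookupError(f"no segment found for reference name {ref!r}")
    # The segment may already be known under another reference name.
    for number, entry in proc.kst.items():
        if entry.path == path:
            entry.refs.add(ref)
            return number
    proc.n += 1                                 # next segment number in sequence
    proc.kst[proc.n] = Entry({ref}, path)
    return proc.n


def follow_link(proc, files, caller, ref):
    """Resolve a link on first use, then reuse the recorded segment number."""
    key = (caller, ref)
    if key not in proc.links:                   # unresolved link: act as the system routine
        referencing_dir = directory_of(proc.kst[caller].path)
        proc.links[key] = make_known(proc, files, ref, referencing_dir)
    return proc.links[key]


# Hypothetical file system: procedure 'p' in directory x.y refers to 'a' and 'sqrt'.
files = {"x.y.p", "x.y.a", "u.w.a", "lib.sqrt"}
proc = Process(wdir="u.w")
p = make_known(proc, files, "p", "x.y")
a = follow_link(proc, files, p, "a")
s = follow_link(proc, files, p, "sqrt")
```

In this sketch the reference to 'a' from a procedure in directory x.y resolves to 'x.y.a' rather than to the entry of the same name in the working directory u.w, which is the preference the referencing directory rule is meant to give. A second call of `follow_link` for the same link returns the recorded segment number without any search.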
3.2. ACCOMPLISHMENTS

Multics has realized a number of significant advances in computer system design, and has made them available to a large community of users for the first time. These unique characteristics of Multics include some features of major importance for modular programming.

1. A large virtual address space (approximately 2^30 elements) is provided for each user.
2. All user information is accessed through his virtual address space. No separate access mechanism is provided for particular sorts of data such as files.
3. Any procedure activation can acquire an amount of working space limited only by the number of free segments in the user's address space.
4. Any procedure may be shared by many processes without the need of making copies.
5. Every procedure written in standard user languages (FORTRAN, PL/I and others) may be activated multiply through recursion or concurrency.
6. A common target representation is used by the compilers of two major source languages -- PL/I and FORTRAN.

These achievements are major contributions toward simplifying the design and implementation of large software systems. They were made possible by building the Multics software on a machine expressly organized for the realization of a large virtual memory and shared access to data and procedure segments [5].
3.3. UNRESOLVED ISSUES

The ease of modular programming in Multics is limited by certain design problems that remain unresolved issues. Multics shares one problem with all computer systems in which data structures must be mapped into a linear address space. As observed earlier in these notes, each author of a program module must adopt his own private conventions for introducing the concepts of "the extent of a data structure" and "component of a data structure", for no conventions are established by the Multics virtual machine nor by the standard user languages of Multics. This problem can be solved only through the adoption of a more suitable model for structured data as the basis for computer system design. A model having the essential attributes is discussed in the final section of these notes.
3.3.1. TREATMENT OF REFERENCE NAMES

Another problem for modular programming in Multics concerns the treatment of reference names. Basically, reference names are identifiers that occur free in the text of Multics procedures. Since reference names occur not only as identifiers of fixed elements of the Multics linguistic level, absence of name conflicts cannot be ensured when a user attempts to combine independently written procedures. The following discussion of the issue is based in part on a study by Clingen [6].

The set of search rules given earlier for determining the pathname of a segment specified by a reference name is an attempt to avoid the undesired consequences of name conflicts. To see how the set of search rules evolved to this form, we first consider the problems of modular programming with the search rules

1. working directory
2. system libraries

This combination of search rules is appropriate where a user has defined a collection of procedure and data segments and entered them in an owned directory. By making this directory the working directory of his process, all reference names designating members of the user's collection of segments will be associated with the correct segment, and so will all references to library procedures so long as their reference names are not duplicated in the working directory.

The possibility of clashes between reference names chosen by the user and reference names of library procedures is not the only difficulty with this choice of search rules. If two programming languages are implemented independently for use in Multics, the sets of reference names used to access the run-time libraries for the two implementations may include duplicate names with conflicting meanings. These names should identify entries in separate directories, but this is not provided for by the search rules. One could let the user specify one of several library directories in the second search rule, but this would not provide for programs that combine procedures expressed in the two source languages. Alternatively one could use a set of search rules such as

1. Working directory
2. Run time library A
3. Run time library B

but duplicated names would be misinterpreted if they were intended to reference segments in run time library B.

Another difficulty is that a mistake in use of a reference name may lead to successful search and linking to a strange procedure in a library directory, whereas one would prefer to have such mistakes produce an error response by the system.

In Multics, the natural form for a program module is a collection of procedure and data segments entered in a common private directory of the file system. If a user wishes to use two such modules together, some arrangement must be made so that reference names occurring in either module will be interpreted correctly. One scheme is to arrange that the working directory is always the directory containing the procedure in execution. This requires that the working directory be changed whenever control passes from procedures in one module to a procedure of the other module. Since changing the working directory is an expensive task, this solution is not attractive, especially if call and return transfers of control between modules occur frequently. Also, this arrangement requires different conventions for calls on procedures of other modules (the inclusion of a command to change the working directory). This requirement conflicts with the concept that one should be able to apply a program module simply by using its name in a call statement.

The difficulty of making the working directory concept work satisfactorily led to addition of the "referencing directory" search rule:

1. referencing directory
2. working directory
3. system libraries

The referencing directory search rule directs search for a reference name to the directory in which the procedure segment in execution was found. This is accomplished by using the segment number of the procedure in execution to locate its entry in the known segment table. The 'path'-component of the entry provides unambiguous identification of its directory. With this rule, calling any procedure of a program module automatically makes the directory of that module, in effect, the first directory to be searched for all reference names encountered during execution of procedures that are part of the module.

The "known segments" search rule was added to the set of search rules to reduce the time spent performing searches in directories of the file system, thereby improving system efficiency. An entry is located in the known segment table having the given reference name in its 'ref'-component. Then the 'path'-component of the entry is tested to verify that the entry is for a segment in the same directory as the segment in execution. If the test fails, the entry is rejected and search for other entries having the given reference name is continued. This search is performed in such a way that it has the same effect as the referencing directory search rule.

Thus the search rules of Multics implement the correct context for reference names occurring in procedures of program modules. Yet several difficulties remain:

1. Mistakes in use of reference names may lead to unsuspected linkage to library or system procedures.
2. Implementers of programming language subsystems must avoid name conflicts among their libraries.
3. No suitable means is provided for representing references among the data segments of a large data base. This is a problem because no mechanism has been implemented for creating links from uses of reference names in data segments.

In the final section of these notes, we present a conceptual basis for a computer system in which these issues of modular programming are resolved by providing the appropriate context for each use of a name.

3.4. REFERENCES

1. F. J. Corbato, C. T. Clingen, and J. H. Saltzer, MULTICS -- the first seven years. AFIPS Conference Proceedings, Vol. 40, SJCC, 1972, pp 571-583.
2. R. C. Daley and P. G. Neumann, A general-purpose file system for secondary storage. AFIPS Conference Proceedings, Vol. 27, Part I, FJCC, 1965, pp 213-229.
3. A. Bensoussan, C. T. Clingen, and R. C. Daley, The Multics virtual memory. Proceedings of the Second Symposium on Operating Systems Principles. ACM, October 1969, pp 30-42.
4. R. C. Daley and J. B. Dennis, Virtual memory, processes, and sharing in MULTICS. Comm. of the ACM, Vol. 11, No. 5 (May 1968), pp 306-312.
5. E. L. Glaser, J. F. Couleur and G. A. Oliver, System design of a computer for time sharing applications. AFIPS Conference Proceedings, Vol. 27, FJCC, 1965, pp 197-202.
6. C. T. Clingen, unpublished memorandum prepared for the NATO Conference on Software Engineering Techniques, Rome, 1969.
4. A BASE LINGUISTIC LEVEL FOR MODULAR PROGRAMMING

In this lecture, we present informally the semantic concepts of a linguistic level (a common base language) that could serve as a common representation for program modules expressed in a variety of source programming languages. The objective is to describe a linguistic level such that the issues of modular programming raised in the preceding presentations will have a satisfactory resolution. It is hoped that this material will serve as a guide or standard of capability for computer system designers so future computer systems will better serve as foundations for modular programming.

Our work toward the specification of a common base language [1] uses methods closely related to the formal methods developed at the IBM Vienna Laboratory [2, 3] and which derive from the ideas of McCarthy [4, 5] and Landin [6, 7].

4.1. OBJECTS

For the formal semantics of programming languages a general model is required for the data on which programs act. We regard data as consisting of elementary objects, and compound objects formed by combining elementary objects into data structures.

Elementary objects are data items whose structure in terms of simpler objects is not relevant to the description of algorithms. For the present discussion, the class E of elementary objects is

    E = Z ∪ R ∪ W

where

    Z = the class of integers
    R = a set of representations for real numbers
    W = the set of all strings on some alphabet

Data structures are often represented by directed graphs in which elementary objects are associated with nodes, and each arc is labelled by a member of a set S of selectors. In the class of objects used by the Vienna group, the graphs are restricted to be trees, and elementary objects are associated only with leaf nodes. We prefer a less restricted class so an object may have distinct component objects that share some third object as a common component. The reader will see that this possibility of sharing is essential to the formulation of the base language and interpreter presented here. Our class of objects is defined as follows:

Let E be a class of elementary objects, and let S be a class of selectors. An object is a directed acyclic graph having a single root node from which all other nodes may be reached over directed paths. Each arc is labelled with one selector in S, and an elementary object in E may be associated with each leaf node.

We use integers and strings as selectors:

    S = Z ∪ W

Figure 10 gives an example of an object. Leaf nodes having associated elementary objects are represented by circles with the element of E written inside; integers are represented by numerals, strings are enclosed in single quotes, and reals have decimal points. Other nodes are represented by solid dots, with a horizontal bar if there is more than one emanating arc.

The node of an object reached by traversing an arc emanating from its root node is itself the root node of an object called a component of the original object. The component object consists of all nodes and arcs that can be reached by directed paths from its root node.

4.2. STRUCTURE OF A BASE LANGUAGE INTERPRETER

Figure 11 shows how source languages would be defined in terms of a common base language. A single class of abstract programs constitutes the base language. Concrete programs in source languages (L1 and L2 in the figure) are defined by translators into the base language. The structure of abstract programs cannot reflect the peculiarities of any particular source language, but must provide a set of fundamental linguistic constructs in terms of which the features of these source languages may be realized. The translators themselves should be specified in terms of the base language, probably by means of a specialized source language.
Figure 10. An example of an object.

Figure 11. Language definition in terms of a common base language. (Concrete programs in L1 and L2 are mapped by translators into abstract programs in the base language, which the interpreter maps to states.)
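The class of objects defined in Section 4.1 can be illustrated with a small Python sketch. It is an assumed encoding chosen for illustration, not part of the notes: the class name `Node`, the attribute names `value` and `arcs`, and the helper functions `reachable` and `is_acyclic` are all hypothetical. Selectors are Python integers or strings, and an elementary object is a Python int, float or str held at a leaf.

```python
class Node:
    """A node of an object: arcs labelled by selectors, or an elementary value at a leaf."""

    def __init__(self, value=None):
        self.value = value      # elementary object (int, float or str) for a leaf, else None
        self.arcs = {}          # selector (int or str) -> Node

    def component(self, *selectors):
        """Follow a path of selectors from this node and return the node reached."""
        node = self
        for s in selectors:
            node = node.arcs[s]
        return node


def reachable(root):
    """All nodes that can be reached over directed paths from `root` (root included)."""
    seen, stack = {}, [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen[id(node)] = node
            stack.extend(node.arcs.values())
    return list(seen.values())


def is_acyclic(root):
    """True if no directed path from `root` returns to a node already on that path."""
    done, on_path = set(), set()

    def visit(node):
        if id(node) in on_path:
            return False        # a directed cycle: not an object in the sense of 4.1
        if id(node) in done:
            return True         # shared component already checked
        on_path.add(id(node))
        ok = all(visit(child) for child in node.arcs.values())
        on_path.discard(id(node))
        done.add(id(node))
        return ok

    return visit(root)


# Two distinct components 'a' and 'b' share a third object as a common component.
shared = Node()
shared.arcs["c"] = Node(1.5)
root = Node()
root.arcs["a"] = Node()
root.arcs["b"] = Node()
root.arcs["a"].arcs[0] = shared
root.arcs["b"].arcs[0] = shared
```

Here `root.component("a", 0)` and `root.component("b", 0)` are the same node, which is the kind of sharing that the tree-restricted objects of the Vienna group do not permit. The component object rooted at any node is simply what `reachable` returns for that node.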
The semantics of abstract programs of the base language are specified by an interpreter which is a nondeterministic state-transition system, as in the work of the Vienna group. Formally, abstract programs in the base language, and states of the interpreter, are elements of the class of objects defined above.

The structure of states of the interpreter for the base language is shown in Figure 12. Since we regard the interpreter for the base language as a complete specification for the functional operation of a computer system, a state of the interpreter represents the totality of programs, data, and control information present in the computer system.

In Figure 12 the universe is an object that represents all information present in the computer system when the system is idle -- that is, when no computation is in progress. The universe has data structures and procedure structures as constituent objects. Any object is a legitimate data structure; for example, a data structure may have components that are procedure structures. A procedure structure is an object that represents a procedure expressed in the base language. It has components which are instructions of the base language, data structures, or other procedure structures. So that multiple activations of procedures may be accommodated, a procedure structure remains unaltered during its interpretation.

The local structure of an interpreter state contains a local structure for each current activation of each base language procedure. Each local structure has as components the local structures of all procedure activations initiated within it. Thus the hierarchy of local structures represents the dynamic relationship of procedure activations. One may think of the root local structure as the nucleus of an operating system that initiates independent, concurrent computations on behalf of system users as they request activation of procedures from the system files (the universe).

The local structure of a procedure activation has a component object for each variable of the base language procedure. The selector of each component is its identifier in the instructions of the procedure. These objects may be elementary or compound objects and may be common with objects within the universe or within local structures of other procedure activations.

The control component of an interpreter state is an unordered set of sites of activity. A typical site of activity is represented in Figure 12 by an asterisk at an instruction of procedure P and an arrow to the local structure L for some activation of P. This is analogous to the "instruction pointer/environment pointer" combination that represents a site of activity in Johnston's contour model [8]. Since several activations of a procedure may exist concurrently, there may be two or more sites of activity involving the same instruction of some procedure, but designating different local structures. Also, within one activation of a procedure, several instructions may be active concurrently; thus asterisks on different instructions of a procedure may have arrows to the same local structure.

Each state transition of the interpreter executes one instruction for some procedure activation, at a site of activity selected arbitrarily from the control of the current state. Thus the interpreter is a nondeterministic transition system. In the state resulting from a transition, the chosen site of activity is replaced according to the sequencing rules of the base language.

4.3. STATE TRANSITIONS OF THE INTERPRETER

Next we show how typical instructions of a rudimentary base language would be implemented by state transitions of an interpreter. This will put the concepts expressed above into more concrete form. For illustration, we will use a representation for procedures that employs conventional instruction sequencing. The instructions of a procedure are objects selected by successive integers, 0 being the selector of the initial instruction.

The effect of representative instructions on the interpreter state is shown in Figures 13 through 19 in the form of before/after pictures of relevant state components. In these figures, P marks the root of the procedure structure containing the instruction under consideration as its i-component, and L(P) is the root of the local structure for the relevant activation of P.

The add instruction is typical of instructions that apply binary operations to elementary objects. The instruction

    add 'u', 'v', 'w'

is an object having as components the four elementary objects 'add', 'u', 'v', and 'w'. These are interpreted as an operation code and three "address fields" used as selectors for operands and result in the local structure L(P). The state transition is shown in Figure 13. Note that the site of activity advances sequentially to the i+1-component of P.

Figure 12. Structure of objects representing states of the base language interpreter.

Figure 13. Interpretation of an instruction specifying a binary operation: (a) before, (b) after.

Let us say that a procedure activation has direct access to a data structure if the data structure is the s-component of the local structure for some selector s. The instruction

    select 'p', 'n', 'q'

is used to gain direct access to the 'n'-component of a data structure to which direct access exists. This instruction makes the object that is the 'n'-component of the 'p'-component of L(P) also the 'q'-component of L(P), as shown by Figure 14.

Literal values are retrieved from the procedure structure by const instructions such as

    const 1.5, 'x'

which makes the elementary object 1.5 the 'x'-component of L(P). Select and const instructions may be used to build arbitrary data structures as illustrated in Figure 15. Note that execution of select 'p', 'n', 'x' implies creation of an 'n'-component of the object selected by 'p' if none already exists.

Figure 16 shows how the instruction

    link 'p', 'n', 'q'

establishes an arc between two objects to which direct access exists (the 'p'- and 'q'-components of L(P)). Execution of this instruction makes the 'q'-component of L(P) also the 'n'-component of the 'p'-component of L(P). The link instruction is the means for establishing sharing -- making one object a common component of two distinct objects.

The instruction

    delete 'p', 'n'

erases the arc labelled 'n' emanating from the root of the 'p'-component of L(P). Any nodes and arcs that are unrooted after the erasure cease to be part of the interpreter state, as shown in Figure 17.

Figure 14. Interpretation of a select instruction: (a) before, (b) after.

Figure 15. Structure building using select and const instructions.

Figure 16. Insertion of an arc by a link instruction: (a) before, (b) after.

Figure 17. The effect of executing a delete instruction: (a) before, (b) after.

Although we have not mentioned them in this brief summary, the base language will include appropriate instructions for implementing conditional and iteration statements, and for testing the presence and type of a component of an object.

Activation of a new procedure is accomplished by the instruction

    apply 'f', 'a'

where the 'f'-component of L(P) is the procedure structure F of the procedure to be activated, and the 'a'-component of L(P) is an object (an argument structure) that contains as components all data required by the procedure (e.g., actual parameter values) to perform its function. Execution of the apply instruction causes the state transition illustrated in Figure 18: a root node L(F) is created for the local structure of the new activation; the argument structure is made the A-component of L(F); a new site of activity is denoted by an asterisk on the 0-component of F and an arrow to L(F); and the original site of activity is advanced to the i+1-instruction of P and made dormant as indicated by the parentheses.

A procedure activation is terminated by the instruction

    return

which causes the state transition displayed in Figure 19. The root node L(F) is erased, deleting all parts of the local structure of F that are not linked to the argument structure; the site of activity at the return instruction disappears; and the dormant site of activity in the activating procedure is activated. Note that the entire effect of executing procedure F is conveyed to the activation of P by way of the argument structure.
Figure 18. Initiation of a procedure activation by an apply instruction.

Figure 19. Termination of a procedure activation by a return instruction.
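The state transitions of Section 4.3 can be tried out with a short Python sketch. It rests on several simplifying assumptions that are not in the notes: there is a single site of activity at a time, with dormant sites kept on a stack, so the nondeterminism and concurrency of the interpreter are not modelled; a procedure structure is a Python list of instruction tuples stored as the value of a node; the selector 'A' is used for the argument structure; and the names `Node` and `run` are hypothetical. Nodes left unrooted by delete or return are reclaimed by Python's garbage collector.

```python
class Node:
    """Node of an object: selector-labelled arcs, or an elementary value at a leaf."""

    def __init__(self, value=None):
        self.value = value
        self.arcs = {}


def run(procedure, argument):
    """Interpret `procedure` (a list of instruction tuples) on an argument structure.

    Simplification: one site of activity at a time; dormant sites are kept on a stack.
    All effects are conveyed through `argument`, which is modified in place.
    """
    local = Node()
    local.arcs["A"] = argument            # argument structure becomes the A-component
    site = (procedure, 0, local)          # (procedure structure, instruction index, L(P))
    dormant = []
    while True:
        proc, i, L = site
        op, *f = proc[i]
        site = (proc, i + 1, L)           # default sequencing: advance to i+1
        if op == "add":                   # add 'u', 'v', 'w'
            u, v, w = f
            L.arcs[w] = Node(L.arcs[u].value + L.arcs[v].value)
        elif op == "const":               # const value, 'x'
            value, x = f
            L.arcs[x] = Node(value)
        elif op == "select":              # select 'p', 'n', 'q'
            p, n, q = f
            parent = L.arcs[p]
            if n not in parent.arcs:      # create the n-component if none exists
                parent.arcs[n] = Node()
            L.arcs[q] = parent.arcs[n]
        elif op == "link":                # link 'p', 'n', 'q'
            p, n, q = f
            L.arcs[p].arcs[n] = L.arcs[q]
        elif op == "delete":              # delete 'p', 'n'
            p, n = f
            del L.arcs[p].arcs[n]
        elif op == "apply":               # apply 'f', 'a'
            fsel, a = f
            dormant.append(site)          # caller waits at its i+1 instruction
            new_local = Node()
            new_local.arcs["A"] = L.arcs[a]
            site = (L.arcs[fsel].value, 0, new_local)
        elif op == "return":
            if not dormant:
                return                    # outermost activation has finished
            site = dormant.pop()          # L(F) is dropped; the caller resumes
        else:
            raise ValueError(f"unknown instruction {op!r}")


# A procedure that adds its 'x' argument to itself and links the sum in as 'result'.
double = [
    ("select", "A", "x", "x"),
    ("add", "x", "x", "y"),
    ("link", "A", "result", "y"),
    ("return",),
]
main = [
    ("const", double, "f"),               # assumption: a procedure structure held as a value
    ("select", "A", "job", "a"),
    ("apply", "f", "a"),
    ("return",),
]
job = Node()
job.arcs["x"] = Node(21)
args = Node()
args.arcs["job"] = job
run(main, args)
```

After `run(main, args)` the only trace of the activation of `double` is the 'result' component linked into its argument structure; its local components 'x' and 'y' are no longer reachable except through that link.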
4.4. REPRESENTATION OF MODULAR PROGRAMS

With the foregoing introduction to base language concepts we may study how well the base language could serve the needs of modular programming. First we consider the adequacy of the base language for representing and transforming data structures.

The data types of many practical programming languages have natural representations as objects that are strictly trees (have no shared substructures). These include vectors, arrays, directories, symbol tables, and hierarchical data bases (files). Some data management systems employ representations that provide for sharing of substructures. Also, most data structures occurring in Lisp programs have the form of binary trees with shared subtrees. These structures are directly modelled as objects having shared component objects.

Some important languages, including PL/I, ALGOL 68, and Lisp, permit the programmer to build data structures containing directed cycles. Such structures do not have direct representations as objects of the base language. It is not yet clear to what extent use of cycles is an essential part of modelling real world semantic constructs in contrast to use of cycles as an implementation technique through which, for example, objects may be represented and efficiently manipulated as lists.

The primitive constructs of the base language provide a general facility for building and manipulating objects. Any object may be constructed by a base language procedure through repeated use of select and const instructions. Through use of link instructions, objects may be made shared components of several objects, and argument structures may be assembled from any finite set of arbitrary objects. In contrast to linguistic levels (such as defined by PL/I) closely tied to the concept of linear address space, passing an object to a base language procedure gives the procedure the ability to transform the object in any way without the possibility of affecting objects not passed to the procedure as part of the argument structure.

In the paragraphs below we show how the use of objects as the fundamental notion of data structure yields natural solutions to a number of issues of language implementation and modular programming.

Recursion:
Recursion occurs when a procedure makes application of i t -
178
self
in o r d e r to p e r f o r m
outlined
above,
there
procedure s t r u c t L ~ e vely.
However, as
initial
hown in F i g u r e 20,
recursive
In the base language i n t e r p r e t e r so i t
to make a
may be a p p l i e d
recursi-
the p r o c e d u r e P t h a t makes the
procedure F may i n c l u d e
the argument s t r u c t u r e
I m p l e m e n t a t i o n of f r e e
to access v a r i a b l e s
its
local
the procedure
for
its
structure
for
many programming
variables
call
of
and c r e a t e
language program
for
ted c o r r e c t l y .
into
details
In t h i s
are g i v e n
cedure v a l u e s to v a r i a b l e s .
structures
programs
and i n t e r p r e -
in [ I ] .
i m p l e m e n t a t i o n of p r o c e d u r e - v a l u e d v a r i a b l e s p r e s e n t e d by an o b j e c t
language,
requires
correct
use of the n o t i o n
In the base language a c l o s u r e may be r e -
having two components as shown in F i g u r e 22. The
i s the t e x t
occurrences
in the source
way, b l o c k - s t r u c t u r e d
In a b l o c k - s t r u c t u r e d
of the c l o s u r e o f a p r o c e d u r e .
contains
references
Some advanced languages p e r m i t a s s i g n m e n t o f p r o -
Procedure variables:
T-component
an
to which e x e c u t i o n of the p r o -
base language procedure
Further
Although
a procedure a p p l i c a t i o n
access because of n o n l o c a l
(see F i g u r e 2 1 ) .
and i s
in the base l a n g u a g e , we may i n -
h a v i n g as a component each o b j e c t
can be t r a n s l a t e d
references,
languages d e r i v e d from ALGOL 60.
r e f e r e n c e s are not p e r m i t t e d
cedure may r e q u i r e
in p r o c e d u r e s r e q u i r e s
by means of n o n l o c a l
c l u d e as p a r t of the argument s t r u c t u r e
that
cycles,
activations.
the a b i l i t y
object
introducing
way F may m ke F a component of
Block structure:
nonlocal
of a r e c u r s i v e
o f F as a ~ m p o n e n t o f
In t h i s
essential
function.
a component of i t s e l f
application
structure F.
its
i s no way, w i t h o u t
of the procedure and the E-component i s an o b j e c t
as components v a l u e s of the v a r i a b l e s
in the procedure t e x t .
A closure
that
have f r e e
s e r v e s as the v a l u e o f a
procedure v a r i a b l e . Context:
In the base language the c o r r e c t
names i s p r o v i d e d by o b j e c t s . tion
of a procedure
some s p e c i f i c activation tifier is
is
object.
Each i d e n t i f i e r
interpreted The o b j e c t
or some p a r t
i s the l o c a l
conflicts
are a v o i d e d ,
way a l l
during
for
itself,
execu-
the procedure if
the i d e n -
O t h e r w i s e the o b j e c t usual
sources of name
and m i s t a k e s in use of names lead to e r r o r
than unsuspected b i n d i n g s .
of
of a component of
structure
was chosen by the a u t h o r of the p r o c e d u r e . In t h i s
interpretation
encountered
as the s e l e c t o r
of the procedure s t r u c t u r e
p a r t of the argument s t r u c t u r e .
rather
context for
reports
179
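The closure representation described under "Procedure variables" can be sketched in Python. This is only an illustration under stated assumptions: base language objects are modelled as dictionaries whose keys play the role of selectors, and the names make_closure, apply_closure and text_of_g are hypothetical, not part of the base language.

```python
# Illustrative model only: a base language object is a dict of selectors.

def make_closure(text, free_values):
    """Build a closure object with a T-component (the procedure text)
    and an E-component (values of the free variables)."""
    return {"T": text, "E": dict(free_values)}


def apply_closure(closure, argument):
    """Apply the text to an argument structure that carries the
    E-component, so no nonlocal reference is ever needed."""
    argument_structure = {"arg": argument, "env": closure["E"]}
    return closure["T"](argument_structure)


def text_of_g(a):
    # x and y occur free in G; they are reached through the argument structure.
    return a["env"]["x"] * a["arg"] + a["env"]["y"]


g = make_closure(text_of_g, {"x": 3, "y": 4})
result = apply_closure(g, 5)  # computes 3 * 5 + 4
```

The object g is what would be stored as the value of a procedure variable; the procedure text alone would not be enough, because the bindings of x and y would be lost.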
Figure 20. Implementation of a recursive procedure in the base language.

Figure 21. Principle used to translate block-structured programs. (x and y are local to F and occur as nonlocal references in G.)
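The arrangement of Figure 20 can be imitated in Python as a rough sketch. The assumptions here are mine: procedure structures and argument structures are modelled as dictionaries, the selectors 'f' and 'a' follow the figure, and the factorial computation is chosen only as a familiar recursive function. The point is that F is never a component of itself; it always reaches itself through the argument structure.

```python
def apply(procedure_structure, argument_structure):
    """Create a fresh local structure and run the procedure text."""
    local_structure = {}
    return procedure_structure["text"](argument_structure, local_structure)


def text_of_f(arg, local):
    # F finds its own procedure structure under selector 'f' of the
    # argument structure and passes it on to each recursive activation.
    if arg["a"] <= 1:
        return 1
    local["rest"] = apply(arg["f"], {"f": arg["f"], "a": arg["a"] - 1})
    return arg["a"] * local["rest"]


F = {"text": text_of_f}  # procedure structure of F, containing no cycle


def text_of_p(arg, local):
    # P makes the initial application and supplies F as a component.
    return apply(F, {"f": F, "a": arg["a"]})


P = {"text": text_of_p}
factorial_of_5 = apply(P, {"a": 5})
```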
Run-time libraries: Access to library procedures of a particular programming language is readily handled in the base language. Each procedure structure resulting from translation of a program in source language A has as its 'lib'-component an object that represents the directory of run-time procedures of the implementation of language A, as illustrated in Figure 23. This directory is a shared component of all procedure structures produced by translation of programs in language A. Procedures expressed in a different source language B become procedure structures sharing a separate directory of run-time procedures.

4.5. USE OF THE MODEL

The base language is founded on objects as the underlying notion of memory instead of the linear address space. Hence, it may turn out that radically new concepts of computer architecture are required to bring the promised advantages into general practice [9]. Nevertheless, the base language concepts presented here should be valuable to computer system designers interested in producing computer systems and languages that better serve the needs of modular programming. These ideas may be applied in several ways: They may serve as a guide for the evolution of computer organization and of programming languages in directions favorable to modular programming, and they may help those proposing and evaluating advanced concepts of programming systems. Moreover, the linguistic level of the base language can serve as a standard of practical achievement to be equaled or exceeded by the designer of systems for modular programming. It should help designers better understand the true limitations of their systems and where design changes can correct defects that might otherwise plague users for many years after.
Figure 22. Base language representation for the closure of a procedure.

Figure 23. Providing separate libraries for two languages.
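The sharing shown in Figure 23 may be easier to see in a small Python sketch. It assumes, for illustration only, that procedure structures are dictionaries with a 'text' and a 'lib' component; the library contents and the helper names make_procedure and text_root are hypothetical.

```python
# Hypothetical run-time directories, one per source language.
library_a = {"sqrt": lambda v: v ** 0.5}
library_b = {"sqrt": lambda v: round(v ** 0.5)}


def make_procedure(text, library):
    """A procedure structure with its text and a shared 'lib' component."""
    return {"text": text, "lib": library}


def apply(procedure, argument):
    return procedure["text"](procedure, argument)


def text_root(procedure, argument):
    # Library procedures are reached through the 'lib' selector.
    return procedure["lib"]["sqrt"](argument)


P1 = make_procedure(text_root, library_a)  # translated from language A
P2 = make_procedure(text_root, library_a)  # also from language A
Q = make_procedure(text_root, library_b)   # translated from language B
```

P1 and P2 hold the very same directory object, while Q holds a separate one, so the same selector 'sqrt' resolves to a different library procedure for each language.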
5. REFERENCES

1. J. B. Dennis, On the design and implementation of a common base language. Proceedings of the Symposium on Computers and Automata, Vol. XXI, MRI Symposia Series. Polytechnic Press of the Polytechnic Institute of Brooklyn, Brooklyn, N.Y., 1971.
2. P. Lauer, Formal Definition of ALGOL 60. Technical Report TR 25.088, IBM Laboratory, Vienna, December 1968.
3. P. Lucas and K. Walk, On the formal description of PL/I. Annual Review in Automatic Programming, Vol. 6, Part 3, Pergamon Press 1969, pp 105-182.
4. J. McCarthy, Towards a mathematical science of computation. Information Processing 62, North-Holland, Amsterdam 1963, pp 21-28.
5. J. McCarthy, A formal description of a subset of ALGOL. Formal Language Description Languages for Computer Programming. North-Holland, Amsterdam 1966, pp 1-12.
6. P. J. Landin, The mechanical evaluation of expressions. The Computer Journal, Vol. 6, No. 4 (January 1964), pp 308-320.
7. P. J. Landin, Correspondence between ALGOL 60 and Church's lambda-notation (Parts I and II). Part I: Comm. of the ACM, Vol. 8, No. 2 (February 1965), pp 89-101. Part II: Comm. of the ACM, Vol. 8, No. 3 (March 1965), pp 158-165.
8. J. B. Johnston, The contour model of block structured processes. Proceedings of a Symposium on Data Structures in Programming Languages. SIGPLAN Notices Vol. 6, No. 2, ACM, February 1971, pp 55-82.
9. J. B. Dennis, Programming generality, parallelism and computer architecture. Information Processing 68, North-Holland, Amsterdam 1969, pp 484-492.
CHAPTER 3.B.

PORTABILITY AND ADAPTABILITY

P. C. POOLE
Culham Laboratory
Abingdon, Berkshire
GREAT BRITAIN

W. M. WAITE
Dept. of El. Engineering
University of Colorado
BOULDER, COLORADO, USA

1. INTRODUCTION

Portability is a measure of the ease with which a program can be transferred from one environment to another: If the effort required to move the program is much less than that required to implement it initially, then we say that it is highly portable. Adaptability is a measure of the ease with which a program can be altered to fit differing user images and system constraints. The major distinction between the two concepts is that adaptability is concerned with changes in the structure of the algorithm, whereas portability is concerned with changes in the environment.
An obvious reason for enhancing the portability of a program is to ease the transition to a new computer. An installation whose programs are highly portable is not tightly bound to a particular computer or manufacturer. Because of this, the installation has a more flexible position when bargaining for a new machine. Manufacturers whose software is portable can provide working programs more quickly when bringing out new hardware. Academic and research people can move to other installations easily and can exchange programs to avoid wasteful duplication.

We have often heard the argument that programs should not be portable because they can be improved if they are rewritten. We believe the question here is one of resource allocation: if a program is portable, one has the freedom to decide whether to allocate resources to improve it or to do another project. Even if a decision is made to rewrite, the portable version can be made available during the period of rewriting.
The main argument for enhancing adaptability is the need to satisfy a broad range of user requirements with a single program. Such requirements are often neither nested nor disjoint. It is necessary to delete particular portions of the program so that users are not burdened with facilities which they do not use and cannot afford. High adaptability also enhances portability, because it enables the implementor to delete features if necessary in order to meet memory and system constraints. There are other ways in which a program could be restructured in response to such requirements. In some cases it is difficult to classify these techniques as increasing adaptability or increasing portability. For example, we shall show how the translation rules can be varied on the basis of the frequency of execution of various parts of the program. Such techniques do not usually make it easier to move a program, but certainly make it easier to improve the program's performance. Is this increased portability or increased adaptability?

1.1. THE BASIC PRINCIPLES
Let us consider the normal approach to creating a program. First we examine the problem and determine an appropriate set of basic operations and data types. We then build an algorithm, which uses the operations to manipulate the data. The algorithm simply provides the flow of control, tying the basic operations together in a particular way. It says nothing about how the data types are represented, nor how the operations obtain their results. In other words, the algorithm is independent of any particular realization of the operations and data types.

We should point out here that the choice of algorithm for solving a particular problem would depend upon the particular operations and data types available. The relative efficiency of two different algorithms for solving the same problem may depend heavily upon the realization of their respective basic operations and data types.

An algorithm may be realized on any computer by realizing the basic operations and data types. Because of the considerations noted in the previous paragraph, a more efficient algorithm might be possible in such a case. Nevertheless, the original algorithm will work correctly. This is the basic principle used to enhance a program's portability.

To enhance the adaptability of a program, we must make it easy to alter the algorithm in a systematic way. The key is to avoid the necessity of recoding, a process which almost invariably produces errors. In later sections we shall show several techniques and examples of adaptable programs. Unfortunately, however, we cannot state a basic principle of adaptability at this time. Our techniques will only allow adaptation of the algorithm in ways foreseen by the designer.

1.2. WHAT CAN WE EXPECT TO ACHIEVE

The techniques which we will discuss in these lectures can be used to achieve dramatic increases in software portability. When a programmer sets out to transfer an algorithm to a new computer, he must first choose a representation of the basic operations and data types for his problem. Having done this, he must express the algorithm in terms of the representation. Our techniques eliminate the second step entirely.
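The separation between an algorithm and the realization of its basic operations can be illustrated with a short Python sketch. Everything named here (count_distinct, list_ops, set_ops) is a hypothetical example of mine rather than anything from the lectures: the algorithm is written once against three basic operations, and two different realizations of those operations are supplied.

```python
def count_distinct(items, ops):
    """The algorithm: flow of control only, tied to no realization."""
    collection = ops["empty"]()
    for item in items:
        collection = ops["insert"](collection, item)
    return ops["size"](collection)


# Realization 1: an unordered list, searched linearly on every insert.
list_ops = {
    "empty": lambda: [],
    "insert": lambda c, x: c if x in c else c + [x],
    "size": len,
}

# Realization 2: a hash-based set, as might suit a different machine.
set_ops = {
    "empty": lambda: frozenset(),
    "insert": lambda c, x: c | {x},
    "size": len,
}

data = ["a", "b", "a", "c", "b"]
via_list = count_distinct(data, list_ops)
via_set = count_distinct(data, set_ops)
```

Moving count_distinct to a new "machine" requires only a new table of operations; the algorithm itself is not recoded, although its efficiency depends on which realization is chosen.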
As an example of the savings, consider the implementation of a SNOBOL4 compiler/interpreter [1,2]. This program is expressed in terms of 131 macros, which realize the basic operations and data types required by the algorithm. The program consists of roughly 6000 lines of code, each of which is a call on one of these macros. Approximately one week is required to code the macros, and another four weeks to debug them. Each macro involves roughly 5 lines of assembly language. Making an order of magnitude calculation, we can assume that an average programmer is capable of producing 2500 lines of debugged code per year [3]. Hence the effort involved in implementing SNOBOL4 by reconstructing the algorithm in assembly code would be of the order of 12 man-years; roughly 2.5 man-years would be required if the implementor made heavy use of macros.
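The order of magnitude calculation can be retraced in a few lines of Python. The figures are taken from the paragraph above as I read them (6000 macro calls, about 5 assembly lines per macro, 2500 debugged lines per programmer-year); the variable names are my own, and the pairing of each figure with its role is an assumption.

```python
program_lines = 6000       # macro calls in the SNOBOL4 program
lines_per_macro = 5        # assembly lines per macro, roughly
lines_per_year = 2500      # debugged lines per programmer per year

# Recoding every macro call by hand in assembly language.
assembly_lines = program_lines * lines_per_macro
man_years_assembly = assembly_lines / lines_per_year

# Writing the 6000 lines once, with heavy use of macros.
man_years_with_macros = program_lines / lines_per_year
```

Under these assumptions the first estimate comes to 12 man-years and the second to 2.4, consistent with the rough 2.5 quoted in the text; both are far above the five weeks needed to code and debug the macros themselves.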
Another example, illustrating the ease with which the characteristics of a program can be altered, is the implementation of the MITEM text manipulator [4] on the ICL 4/70. Like SNOBOL4, the MITEM program is expressed in terms of macro calls. Approximately 4000 lines of code are involved. For the first version each macro was defined by a sequence of machine code instructions. Roughly two man-weeks of effort were required to complete this implementation. No additional effort is required to generate subsets of MITEM: The user simply specifies a level number and re-translates the program. Each line of code carries a key which causes the translator to ignore it if it is not relevant to the desired level.
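The subsetting mechanism can be pictured with a small Python sketch. The text does not say what form the keys take or how relevance is decided, so the following is an assumption made purely for illustration: each line carries an integer key, and a line is kept when its key does not exceed the requested level. The macro names in the sample are invented.

```python
def select_level(keyed_lines, level):
    """Return the text of every line relevant to the requested level.
    Assumed rule: a line is relevant when key <= level."""
    return [text for key, text in keyed_lines if key <= level]


# Hypothetical keyed source: (key, macro call).
source = [
    (1, "CALL INIT"),
    (1, "CALL READ"),
    (2, "CALL UNDO"),
    (3, "CALL TRACE"),
    (1, "CALL EXIT"),
]

subset = select_level(source, 2)
```

Re-translating with a different level number yields a different subset of the program without any recoding.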
The first version of MITEM did not satisfy the memory constraints for interactive programs on the 4/70. A further two man-weeks were spent coding an interpreter and altering the macros to produce a data structure for it. This second version used only 40 % of the memory required for the first version, but the execution time increased by a factor of 10. A third version was running after three more man-weeks. It was a hybrid, with critical parts of the program translated into executable code and the remainder interpreted. The interpreter was changed to pack code efficiently at the expense of slower interpretation. Total memory requirements were still 40 % of those for version 1, but the execution time increased by only 10 % over that of version 1.
2. PORTABILITY THROUGH HIGH LEVEL LANGUAGE CODING

The traditional method of increasing program portability is to use a high level language such as FORTRAN, ALGOL or COBOL. This is a perfectly valid approach, provided that certain conditions are satisfied:

- The basic operations and data types required by the problem are available in the chosen language.
- The chosen language has a standard definition, and this standard is widely implemented.
- Care is taken to avoid constructions which are accepted in the local dialect, but prohibited by the standard.

These conditions are satisfied by a large majority of the programs which solve scientific problems, and many which solve the standard business problems.

Since adaptability is a property of the coding of the algorithm rather than its realization, use of a high level language does not automatically make a program highly adaptable. Few high level languages have built-in mechanisms to select portions of the source text while ignoring others. This effect can be achieved, however, through the use of a separate text editor. A more important weakness of most translators for high level languages is their inability to vary the code which they generate. Improvements in overall performance could be achieved by using entirely different code generation strategies for different parts of the program.

2.1. THE NEED FOR EXTENSIONS

The first of the three conditions stated above is the most difficult to satisfy. Many problems have several basic operations and data types which are not available in the chosen language. These basic operations and data types can usually be realized by combinations of basic operations and data types available in the language, but the resulting program may be inadequate in several ways.

In Section 1.2. we indicated that a high level language must be realized on a computer. It may be that, although the language does not provide a particular data type, that data type can be easily realized on the given computer. For example, ANSI FORTRAN [5] provides neither a string data type nor the basic string operations. The IBM SYSTEM/360 computers, however, do provide these facilities. Character strings may be realized as integer arrays in ANSI FORTRAN, but then the translator will not take advantage of the more efficient realization possible on IBM SYSTEM/360. If we could extend the ANSI FORTRAN language to include a string data type, then the efficiency of the resulting program for IBM SYSTEM/360 could be improved.

There is another advantage which can be gained by extending the language: improved program documentation. It may not be immediately clear that a certain sequence of operations on integer arrays is, in fact, intended to move a string. If the same sequence is expressed as a move operation with string arguments, the significance is clear.

Several languages provide mechanisms which permit the user to define new operations and data types in terms of those which already exist [6,7]. Such mechanisms provide the improved documentation, but do not increase efficiency. Conceptually, they operate on the extended source text to produce normal text. (The practical implementation may not involve such a transformation explicitly.) An extension in terms of existing facilities has no implications for portability, unless the extension mechanisms are not part of the standard language.

A mechanism which permits the extension to be defined in terms of the target computer must involve the modification of the code generation procedures of the translator. Such mechanisms have been proposed and are available in some languages [8]. Extensions in terms of the target computer, however, reduce the portability of the program: The implementor must make an additional effort to specify the code generation procedures for the new target computer.

In later sections we shall discuss ways of adapting a translator so that a particular extension may be made either in terms of existing operations and data types, or in terms of the target machine. The former method preserves the portability of the program, while the latter permits increased efficiency at the cost of increased effort.

2.2. EXTENSION BY EMBEDDING

If a language permits separate translation of procedures, then it is possible to make an extension in terms of the target computer by providing procedures written in machine code. This technique is called embedding, and is frequently used to extend FORTRAN. The host language is the one being extended, and the machine code procedures are called primitives. Embedding avoids the need for modification of the host language translator, but may involve heavy time penalties for calls on the primitives.

As we indicated in Section 2.1., the reason for extending a language is to improve the efficiency and/or documentation of algorithms which solve certain classes of problems. These goals are totally independent of portability considerations. When creating an extension by embedding, the designer must usually make a definite decision about the balance he desires between portability and efficiency. The remainder of this section is devoted to a case study which illustrates the principles involved.

SLIP [9] is an extension to FORTRAN which provides list processing capability. One new data type, the SLIP cell (Figure 2.1.), was provided. The basic operations were embodied in the ten primitives of Table 2.1. A complete description of the primitives may be found in reference 3; we shall note their relevant properties as necessary for our discussion.

    ID (2 bits) | LNKL (Address) | LNKR (Address)

Figure 2.1. A SLIP CELL

1. Immediate operation:
       MADOV(A)
2. Direct operations
   2.1. Selectors:
       ID(CELL)   LNKL(CELL)   LNKR(CELL)
   2.2. Constructors:
       SETDIR(ID, LNKL, LNKR, CELL)   STRDIR(DATUM, CELL)
3. Indirect operations
   3.1. Selectors:
       CONT(A)   INHALT(A)
   3.2. Constructors:
       SETIND(ID, LNKL, LNKR, A)   STRIND(DATUM, A)

Table 2.1. The SLIP Primitives

When a language is extended by embedding, the new data types do not achieve any status as far as the language is concerned. In our example, supplying the primitives of Table 2.1. does not cause the FORTRAN compiler to recognize SLIP cells as valid data objects in their own right. The compiler still only knows about integers, reals, etc. Every variable known to the compiler must have one of these types. If the contents of a SLIP cell is to be placed into a named variable, we must be able to guarantee that the compiler has reserved sufficient space for that variable to hold the contents of a SLIP cell.

ANSI FORTRAN does not specify the relationship between addresses and either integers or real numbers. Hence there is no way to guarantee that a variable of a given type will be sufficiently large to hold the contents of a SLIP cell. For example, SLIP was originally implemented on the IBM 7090, a machine with a 36 bit word and 15 bit addresses, so that each variable, regardless of type, has enough room. The CDC 3200 also has 15 bit addresses; on this machine, however, the FORTRAN integer variable is only 24 bits long and the real variable occupies 48 bits. On SYSTEM/360, addresses are 24 bits long.
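The space problem can be made concrete with a little arithmetic, sketched here in Python. The cell layout is taken from Figure 2.1 (a 2 bit ID and two address fields). The address widths and the 7090 and CDC 3200 word sizes follow the text; the 32 bit integer for SYSTEM/360 is not stated in the text and is supplied from general knowledge as an assumption. The function and dictionary names are hypothetical.

```python
ID_BITS = 2


def cell_bits(address_bits):
    """Bits needed for the fields of Figure 2.1: ID plus two addresses."""
    return ID_BITS + 2 * address_bits


machines = {
    "IBM 7090": {"address_bits": 15, "integer_bits": 36},
    "CDC 3200": {"address_bits": 15, "integer_bits": 24},
    # integer_bits for SYSTEM/360 is an assumption, not from the text.
    "IBM SYSTEM/360": {"address_bits": 24, "integer_bits": 32},
}

# Does a FORTRAN integer variable have room for the fields of one cell?
fits = {
    name: cell_bits(m["address_bits"]) <= m["integer_bits"]
    for name, m in machines.items()
}
```

Only on the 7090 do the fields fit into an integer variable, which is why a program that stores cell contents in FORTRAN variables cannot be moved freely between these machines.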
At this point in the design, a decision on the importance of portability is required. If the program is to be portable, then the entire contents of a SLIP cell must never be stored in a FORTRAN variable. Let us examine the consequences of such a decision.

Since a SLIP cell is a structured object, we must have a primitive for each of the fields. These primitives are the selectors. The argument of each selector must be the address of a SLIP cell, and cannot be the contents of a cell. (We assume here that an integer variable is large enough to hold an address.) The direct operations of Table 2.1. take the contents of a cell as their arguments; avoiding them costs one memory reference, sacrificing efficiency to preserve portability.

The two constructors STRDIR and STRIND illustrate another potential portability hazard. These primitives store the contents of any FORTRAN variable in a SLIP cell. Since the compiler has control over the size of all variables whose types it knows, it has no difficulty in ensuring that each is large enough. However, neither the implementor nor the primitive knows the type of its first argument. Hence each must act as though the variable is the one which occupies the most space. Consider, however, a computer (such as the CDC 3200) in which an integer occupies one word and a real occupies two. Suppose further that the first argument of the primitive is an integer variable which is stored in the last word of the user's allocated space. If STRIND accesses two successive words, a bounds fault will occur and the user's program crashes! Different primitives are required, one for each data type, to avoid this problem and preserve portability. Note that these primitives might be realized by exactly the same machine code routine in a given implementation. The important point is that they can be distinguished if necessary.

A set of primitives which avoid these difficulties while preserving the portability of the program is shown in Table 2.2. All operations involving the contents of SLIP cells are indirect, and there are separate constructors for integer and real arguments. (We assume that complex, double, and logical values will not be stored in SLIP cells. If this assumption is false, the modification should be obvious.)

Table 2.2. contains one primitive, MEMORY, which has no analog in Table 2.1. MEMORY is an environment inquiry. It permits the program to discover, at initialization, the limits of the memory available for SLIP cells, and the size of a cell in address units. The argument NUM is provided by the user, and its exact interpretation depends upon the implementation: If the memory is in a COMMON block declared by the user, then NUM is the number of SLIP cells in the memory. If the memory should be requested from the system, or if the user will be given all memory not occupied by his program, then NUM is the minimum number of cells which he is prepared to accept. (MEMORY will terminate execution if there are fewer than NUM cells available.)

1. Environment inquiry:
       MEMORY(NUM, IBOT, ITOP, ISIZE)
2. Selectors:
       ID(A)   LNKL(A)   LNKR(A)   CONT(A)   INHALT(A)
3. Constructors:
       SETIND(ID, LNKL, LNKR, A)   STRINT(IDATUM, A)   STREAL(RDATUM, A)

Table 2.2. Primitives which preserve Portability

Efficiency considerations dictate that primitives should be realized in machine code if possible. We have already noted that portability suffers if this is done. It is certainly possible to provide a realization of the primitives in the host language. This will result in a more portable version, which can be used while the more efficient one is being constructed.
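A host language realization of some of the primitives of Table 2.2 might look like the following Python sketch, which stands in for FORTRAN here. It rests on several assumptions of mine: the cell memory is a single integer array (playing the role of a COMMON block), each cell occupies four array elements so that no field packing is needed, termination is modelled by raising an exception, and datum_of is a hypothetical helper for inspection rather than one of the SLIP primitives. CONT, INHALT and STREAL are omitted because the text does not describe their behaviour.

```python
ISIZE = 4                # array elements per cell in this model (assumption)
_store = [0] * 40        # stands in for a COMMON block holding 10 cells


def memory(num):
    """Environment inquiry: return (IBOT, ITOP, ISIZE).
    Refuses to continue if fewer than num cells are available."""
    available = len(_store) // ISIZE
    if available < num:
        raise RuntimeError("fewer than NUM cells available")
    return 0, len(_store) - ISIZE, ISIZE


def setind(ident, left, right, a):
    """Indirect constructor: set the ID, LNKL and LNKR fields of cell a."""
    _store[a], _store[a + 1], _store[a + 2] = ident, left, right


def strint(idatum, a):
    """Store an integer datum in cell a; reals would use a separate primitive."""
    _store[a + 3] = idatum


def ident_of(a):
    return _store[a]


def lnkl(a):
    return _store[a + 1]


def lnkr(a):
    return _store[a + 2]


def datum_of(a):
    # Hypothetical helper for this sketch, not a SLIP primitive.
    return _store[a + 3]


ibot, itop, isize = memory(10)
first, second = ibot, ibot + isize
setind(1, second, second, first)
strint(42, first)
```

Every operation takes the address of a cell, never its contents, so nothing in the calling program depends on how large an integer or an address happens to be on a given machine.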
3. PORTABILITY THROUGH ABSTRACT MACHINE MODELLING

In Section 1.1. we discussed the separation of a problem solution into a set of basic operations and data types, and an algorithm to manipulate them. Abstract machine modelling is simply a mechanistic interpretation of this separation: The basic operations and data types are used to define a fictitious computer which is ideally suited to the problem at hand, and the algorithm is then coded in some language for this computer. We call the fictitious computer an abstract machine model, and say that it models the requirements of the problem. To run the program on a real computer, we realize the abstract machine on that computer.

The conceptual distinctions between use of high level languages and use of abstract machine models are trivial, and could even be argued to be nonexistent. Practically, the differences lie in the problems to which the techniques are applied. Use of a high level language implies use of an existing language. An abstract machine model, on the other hand, could be used to construct a new high level language. When one selects a particular high level language to solve a problem, one is selecting the abstract machine model specified by the language designer. Extensions to the language are changes in this machine model, to make it more suitable for a given problem.

Selection of a language for expressing a problem solution is not based solely upon the abstract machine which underlies that language. There are other questions which relate to the translators available for that language:

- Are translators available for a sufficiently broad set of computers, or is a highly portable translator available?
- Are the translators adaptable (i.e. can the language be extended and/or can the code generation strategy be altered)?

It should be clearly recognized that these are properties of the translators, rather than the language or the underlying abstract machine.

Our primary concern is those problems for which no adequate language exists. Available languages may be considered inadequate because of their underlying abstract machines or because of negative answers to the questions in the preceding paragraph regarding their translators. In any event, the prospective user must become a designer. He must create an abstract machine model for his problem, devise a suitable language to use in programming this machine, and then provide a translator which is both highly portable and adaptable.

3.1. BACKGROUND

When high level languages first became popular, much thought was given to what was known as 'the m×n translator problem': If we wish to run programs written in any one of m languages on any one of n machines, then m×n translators (Figure 3.1a) are required. To reduce this number, it was proposed [10] that a single intermediate language be devised. This language was to be called UNCOL. It would then only be necessary to produce m translators written in UNCOL, and n translators written in machine code (Figure 3.1b). The total number of translators required was therefore m+n, a substantial savings if m and n are large.

[Figure 3.1 UNCOL. a) The m×n translator problem. b) A proposed solution: translators written in UNCOL and translators written in machine code.]
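
The arithmetic behind the m×n argument can be checked with a short sketch. The function name and the sample values of m and n are illustrative assumptions, not taken from the text; only the two formulas (m×n and m+n) come from the discussion above.

```python
def translators_needed(m: int, n: int) -> tuple[int, int]:
    """Return (direct, via_intermediate) translator counts
    for m languages and n machines."""
    direct = m * n            # one translator per (language, machine) pair
    via_intermediate = m + n  # m on the language side, n on the machine side
    return direct, via_intermediate


# Illustrative values only: the saving is nil for m = n = 2
# and grows quickly as m and n increase.
for m, n in [(2, 2), (5, 10), (20, 50)]:
    direct, shared = translators_needed(m, n)
    print(f"m={m:2d} n={n:2d}  direct={direct:4d}  via intermediate={shared:3d}")
```

Note that for very small m and n the intermediate language saves nothing, which is why the text qualifies the saving with "if m and n are large".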
One of the main reasons that this scheme was never put into practice was the difficulty of specifying UNCOL. This should not be surprising, since UNCOL must be based on an abstract machine suitable for all languages. One needs only to consider the operators and data types for ALGOL, SNOBOL and LISP to appreciate the problems involved. It seems obvious to us that a single abstract machine will probably never be adequate to support every problem, and hence the UNCOL model is too simplistic. We shall point out, however, that there are operations and data types common to most problems. There is no need for these to be redesigned for every abstract machine. Our approach is thus quite similar to that of UNCOL, but avoiding its major pitfall.

Another early attempt to provide enhanced portability in this way was SLANG [11]. This project was never fully described in the open literature, but apparently it used a common core set of operations and data types which were extended to meet the needs of a particular problem. The realization techniques were also similar to those we shall discuss.

The first step in producing a piece of software by abstract machine modelling is to design the abstract machine model. Three considerations must be kept in mind when designing this model:

- The relationship between the model and existing computers.
- The relationship between the model and the problem being solved.
- The tools available for the realization.

Overall efficiency depends primarily upon the first two considerations, while the third determines the complexity of the model.

Some care is needed in balancing the first two considerations, however. An extremely simple model results in a highly portable program since the model is relatively easy to realize. If the problem requires complex operations, these must be coded in terms of the simple model.
Often it turns out that certain sophisticated machines have hardware to realize these complex operations. Since the algorithm has been coded in terms of the simple operations, it must be changed to take advantage of the more sophisticated hardware. Conversely, if the algorithm is coded in terms of the complex operations, its portability suffers because of the difficulty of realizing those operations on simple machines.

The solution to this problem is to design a hierarchy of abstract machines. At the top of the hierarchy is an abstract machine which provides all of the complex operations required by the problem. This machine is then realized in terms of a simpler one. There may be several levels involved, with the lowest providing only very simple operations. Only the realization of the lowest machine must be carried out in terms of the real hardware; however, operations of any machine in the hierarchy may be so realized. We shall discuss this technique in more detail in Section 7.

3.2. RELATING THE MODEL TO EXISTING COMPUTERS

It is not sufficient to consider a single real computer when designing an abstract machine model. In order to maintain portability, the design must take into account the characteristics of a wide class of computers. The major features of interest are the register organization, memory organization, addressing mechanisms for data aggregates, and I/O facilities. We shall attempt to classify existing computers according to these features.

Let us review the major register/processor organizations which we are likely to encounter. This classification should not be considered exhaustive. For each category, we shall note several typical computers which belong to that category:

- No programmable registers. All instructions take their operands from memory and leave their results in memory. (IBM 1400 series, IBM 1620)
- A single arithmetic register. This register often has an extension, which does not have the full capabilities of the major register. (IBM 7040, 7090, CDC 3000 series, many minicomputers)
- Multiple arithmetic registers. Arithmetic instructions may take their operands from registers or memory; some registers may be related, but all have essentially the same capabilities. (IBM System/360)
- Register file. Operands for arithmetic instructions must be in registers. There are multiple registers, all of which have essentially the same capabilities. (CDC 6000, 7000)
- Stack. Operands for arithmetic instructions are found in fixed positions in the stack. (ICL KDF9, BURROUGHS 5000, 5500)

The major effect of the register organization is on the programmer's storage of intermediate results. Intermediate results must be explicitly stored if only a single arithmetic register is available; they are automatically preserved by stack hardware. Explicit storage is required in multiple register and register file machines only if the number of simultaneous results becomes too large. The register used for intermediate computation must be varied, however.

In view of these differences, it would be reasonable to design a model in which temporary storage need not be explicitly referenced by the programmer. This applies not only to temporaries which a compiler would generate in the course of translating an expression, but also to those normally provided by the programmer (e.g. the extra location used to interchange words during a sort).

There are three major kinds of memory organizations which we are likely to encounter:

- Linear address space. The memory consists of a series of locations, consecutively numbered. (IBM System/360, CDC 6000, 7000)
- Piecewise-linear address space. The memory either consists of a number of modules with independent, linear address spaces, or the addressing mechanism imposes such an organization. (Many minicomputers, BURROUGHS 5500)
- Memory Hierarchy. There are several independently-addressed memories of varying speeds, with data transfer between levels explicitly controlled by the programmer. (Some minicomputers, CDC 6000 extended core)

Any of these organizations may have either bytes or words as the smallest addressable unit.

Differences in memory organization usually appear as size limitations. In a piecewise-linear memory, for example, there is a large increase in the cost of an array reference if the size of the array exceeds the size of one module. This is because every index must be explicitly divided into two parts - a module address and the address of a location within the module. If the size of the array is less than the size of the module, then the module address can be supplied by the translator. (We consider a paged memory to provide a linear address space if the paging is transparent to the user.)

It is unreasonable to design a model in which the maximum size of an array is limited to some fixed value. Whatever value is chosen will be incorrect for the majority of computers. A better course is to simply make the programmer aware that large arrays will be expensive on some computers, and then generate the best code possible for each case. The language design should avoid any implicit relationship among separately declared arrays (such as exists in FORTRAN COMMON) which assumes a linear address space.

References to constants and simple variables can be completely specified by the translator. References to data aggregates, however, may be left partially unspecified until the program is executed. There are three common mechanisms for providing the information to complete the specification:

- Program modification. The actual address is pre-computed by the program and placed in an instruction which is then executed. (IBM 1400 series, IBM 1620)
- Indirect addressing. The actual address is pre-computed by the program and placed in some location. The instruction references that location and the hardware interprets its contents as an address. (IBM 1620)
- Index modification. The actual address is computed by the hardware at the time the reference is made. Part of the data required to compute the address is supplied by the referencing instruction, the remainder comes from a register specified by the referencing instruction. (IBM System/360, CDC 3000, 6000, 7000 series)

There are many variants of index modification, but the central point is the computation of the effective address by the hardware.
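
The address computations discussed above can be illustrated with a small simulation. This is only a sketch under stated assumptions: memory is modelled as a Python list, MODULE_SIZE is a hypothetical module size, and all function names are invented for illustration; none of them come from the text.

```python
MODULE_SIZE = 4096  # hypothetical size of one memory module, in words


def split_index(base, index, module_size=MODULE_SIZE):
    """Piecewise-linear memory: divide an element address into
    (module address, location within the module)."""
    return divmod(base + index, module_size)


def fits_in_one_module(base, length, module_size=MODULE_SIZE):
    """True if the whole array lies in a single module, so that the
    module address could be fixed once by the translator."""
    return base // module_size == (base + length - 1) // module_size


def indirect_reference(memory, pointer_cell):
    """Indirect addressing: the program has already stored the actual
    address in pointer_cell; the contents are interpreted as an address."""
    return memory[memory[pointer_cell]]


def indexed_reference(memory, base, index_register):
    """Index modification: the base comes from the instruction, the
    remainder from a register, and they are combined at reference time."""
    return memory[base + index_register]


memory = list(range(100, 120))
memory[0] = 7                      # cell 0 holds a pre-computed address
print(split_index(4000, 200))      # element crosses into the next module
print(fits_in_one_module(4000, 200))
print(indirect_reference(memory, 0), indexed_reference(memory, 5, 2))
```

Both reference functions reach the same cell here; the difference lies in who computes the address (the program beforehand, or the hardware at the time of the reference).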

Components of data aggregates are accessed frequently, and these accesses often occur inside loops. In fact, in most programs the only purpose of an inner loop is to sequence through some data aggregate and perform operations upon its components. Measurements show the timing of the inner loops usually controls the execution time of the program. Hence the way in which data aggregates are accessed will have a significant effect upon the execution of most programs.

If a model assumes a particular mechanism for accessing data aggregates, then the programmer can use particular coding techniques to improve the performance of the algorithm. Unfortunately, the techniques are different for the different mechanisms. This is a case in which the algorithm does depend on the model, and large penalties can accrue on certain computers from the use of an inappropriate model. The best escape from the dilemma seems to be to model the most probable mechanism (index modification) and try to avoid inefficiencies by direct encoding of higher-level constructs, as illustrated in Section 7.

A procedure call involves two distinct actions: parameter passing and status saving. The realization of the former is closely linked to the model for accessing data aggregates; we shall discuss the possible alternatives in Section 6.1. There are four common hardware mechanisms for realizing the latter:

- Relevant status is placed on a stack by the hardware when a subroutine jump instruction is executed. (ICL KDF9, BURROUGHS 5500)
- Relevant status is placed in a register by the hardware when a subroutine jump instruction is executed. (IBM 7040, 7090, System/360)
- Relevant status is placed in memory by the hardware when a subroutine jump instruction is executed. The memory location bears some fixed relationship to the target of the subroutine jump. (CDC 6000 and 7000 series, IBM 7040)
- A separate instruction is provided for saving the relevant status. (GE 645)
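
The practical difference between the first and third mechanisms shows up when a procedure is entered again before it has returned. The following is a minimal sketch, not a model of any particular machine: the class names are invented, and the 'relevant status' is reduced to a bare return address.

```python
class FixedSlotProcedure:
    """Third mechanism: status goes to one memory cell that bears a
    fixed relationship to the procedure (assumed: a single cell)."""

    def __init__(self):
        self.saved_return = None

    def call(self, return_address):
        self.saved_return = return_address  # overwrites any earlier status

    def ret(self):
        return self.saved_return


class StackProcedure:
    """First mechanism: status is pushed on a stack at every call."""

    def __init__(self):
        self.stack = []

    def call(self, return_address):
        self.stack.append(return_address)

    def ret(self):
        return self.stack.pop()


def nested_returns(procedure, return_addresses):
    """Enter the procedure once per address without returning in between,
    then return from every activation; list where each return goes."""
    for address in return_addresses:
        procedure.call(address)
    return [procedure.ret() for _ in return_addresses]


print(nested_returns(FixedSlotProcedure(), [100, 200]))  # outer return is lost
print(nested_returns(StackProcedure(), [100, 200]))      # both returns survive
```

With the fixed cell, the second entry destroys the first return address, so software must copy the status elsewhere if nested activations are to work.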

The makeup of the 'relevant status' depends entirely upon the computer. At the least, it is simply the return address. The actual realization of a procedure call depends not only upon the hardware, but also upon the operating system of the target computer. A standard procedure calling sequence is mandatory if the system is to have a common base language (see Dennis B.). As in the case of data aggregate addressing, the detailed coding of a procedure call would necessarily depend upon the model assumed. We must therefore use a high level model, simply stating that a procedure is to be called.

In some cases the procedure call mechanism provided by the hardware is such that there is no difference in cost between recursive and non-recursive calls. Unfortunately this is not true in general. For example, recursion using the third mechanism described above requires that the procedure retrieves the status from the memory and places it on a stack. (Alternatively, a procedure could be set up to simulate the first mechanism in software.) Most modular programs employ one or more short procedures which are used in inner loops to perform some simple task (such as obtaining a character from an input string). These procedures are never used recursively. Because of the frequent calls, the overhead required for recursion builds rapidly. In view of this, we strongly advocate two procedure call operations: recursive and nonrecursive.

Existing computers communicate with a wide range of peripheral devices. In spite of this diversity, efficient modelling of input/output operations is easier than efficient modelling of references to data aggregates because the time scale is much longer. It requires microseconds to initiate an I/O operation which requires milliseconds to complete. If the time required to initiate a transfer is a small fraction of the time required to complete it, then the overall speed of the program would not change even if the execution time of the initiation vanished completely. It is immaterial whether the program is I/O bound or compute bound, or whether the data transfers are overlapped or not. There is therefore no need to split the initiation of an I/O operation [...]. A satisfactory model is one which avoids gross inefficiencies and provides a simple user image.

An abstract machine model is connected to a number of abstract peripheral devices. Each of these devices has a model which describes certain characteristics and defines its behavior when it is presented with operation codes. The abstract machine has a single I/O instruction which it uses to communicate with all of its peripheral devices. Control information is passed in both directions, and information transfer may occur as a side effect. An I/O request is defined by specifying the following:

- The peripheral unit to be used (logical unit number).
- The operation to be performed (operation code).
- If the operation involves data transmission, the memory to be used.

A peripheral device returns a completion code which defines the results of the operation. Each model may require different codes, but the following three are common to all:

- The operation was completed normally.
- The operation was not completed because of an end condition on the peripheral device (e.g. endfile, disc full).
- The operation was not completed because it requested an action which is illegal on the peripheral device.

The existence of a completion code does not imply that the possibility of overlapped operations is being ignored.

There are two major device classes: sequential and random. A random device can be reset at any time to any arbitrary position; a sequential device can be reset only to some specified initial position. (It also may be possible to back space a sequential device.) Note that this classification depends only on the use made of the device in the abstract machine program, not on the particular realization of the device.

Suppose that a user is accessing a sequential device. At each request the position of the next record to be read or written is known. This means that standard double- or multiple-buffering techniques can be used in the realization if they are not provided by the operating system of the target machine. The maximum possible overlap can be obtained by this strategy, and hence there is never any need to use overlap in the abstract machine program for sequential devices.
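
The request and completion-code scheme described above might be sketched as follows. Everything named here is an assumption made for illustration: the operation names ("read", "write", "rewind"), the one-element list standing in for the memory area, the capacity limit, and the choice to report an unknown logical unit as an illegal action. Only the three request components and the three common completion codes come from the text.

```python
from enum import Enum


class Completion(Enum):
    NORMAL = "operation completed normally"
    END_CONDITION = "end condition on the device"
    ILLEGAL = "action illegal on the device"


class SequentialDevice:
    """Hypothetical sequential device: a list of records and a position."""

    def __init__(self, records, capacity):
        self.records = list(records)
        self.capacity = capacity
        self.position = 0

    def request(self, operation, memory=None):
        if operation == "rewind":               # reset to initial position
            self.position = 0
            return Completion.NORMAL
        if operation == "read" and memory is not None:
            if self.position >= len(self.records):
                return Completion.END_CONDITION  # e.g. endfile
            memory[0] = self.records[self.position]
            self.position += 1
            return Completion.NORMAL
        if operation == "write" and memory is not None:
            if self.position >= self.capacity:
                return Completion.END_CONDITION  # e.g. device full
            self.records[self.position:] = [memory[0]]
            self.position += 1
            return Completion.NORMAL
        return Completion.ILLEGAL               # anything else is refused


def io(devices, unit, operation, memory=None):
    """The single I/O instruction: logical unit number, operation code,
    and (if data is transmitted) the memory to be used."""
    device = devices.get(unit)
    if device is None:
        return Completion.ILLEGAL               # assumption: unknown unit
    return device.request(operation, memory)


tape = SequentialDevice(["first", "second"], capacity=2)
devices = {1: tape}
area = [None]
codes = [io(devices, 1, "read", area) for _ in range(3)]
print([code.name for code in codes], area[0])
```

The program sees only completion codes, so an illegal request is reported back to it rather than reaching the operating system of the target computer.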

The sequence of requests for information from a random access device is not well defined. Thus the overlap cannot be handled in the realization of the device, but must be built into the abstract machine program. One way of doing this is to double the number of permissible requests. The new requests simply 'advise' that the corresponding normal operation will be issued at some future time. A specified buffer may be filled (or emptied) at any time after the 'advice', but the transfer must be completed before the corresponding normal request returns. 'Advice' requests give exactly the same information as normal requests.

I/O operations can be divided into three categories:

- Read. Operations which transfer information from a device to memory.
- Write. Operations which transfer information from memory to a device.
- Control. Operations which do not transfer information between a device and memory.

There may be many operations in each category, not all of which are possible on a particular device.

Occasionally it is difficult to classify a given instruction. For example, consider the plotter operation 'draw a line from the current position to X,Y'. This should be a write operation - after all, a line is drawn. The operation 'move the pen from the current position to X,Y without drawing a line' should be a control operation. If, however, we examine the way most pen plotters actually work, the distinction is not as clear. Both operations use the plotter hardware command 'move pen'. When the line is drawn, this command is preceded by a 'pen down' command.

Such conflicts can usually be resolved by adhering to the conception of an abstract model. How the plotter goes about producing the line is irrelevant. The program is simply instructing the model of the plotter to produce a line (clearly a write operation) or reposition its writing mechanism (clearly a control operation).

The classification of I/O operations is useful because it reflects the general capabilities of devices and the structure of their realizations. Most devices with limited capabilities cannot perform any operations in a given class. If requests for such operations are presented to the operating system of the target computer, they often result in fatal errors and a complete loss of control by the abstract machine model. By
having the realization of the abstract machine's I/O request make a few simple checks, most of these catastrophes can be avoided.

3.3. RELATING THE MODEL TO THE PROBLEM

As we have noted, each problem requires a certain set of basic operations and data types. Some of these may represent a high level of abstraction, in the sense that a great deal of code is necessary to realize them on many computers. Particular computers may, however, provide hardware which realizes them directly. The character string data type, with its basic operations concatenation, selection and lexical comparison, provides an excellent example of this. A significant amount of code is required to realize concatenation and lexical comparison on word oriented computers which have no field selection hardware. (Remember that, in general, the operand strings will have different offsets within a machine word. Hence words must be shifted and spliced together during the operation.) On System/360 the same operations can be performed using only a few instructions.

We have already noted the importance of including high level operations and data types in the model. They permit the user to provide more information about what he expects to happen. The decision about how to achieve the desired results can then be deferred until we know what tools are available.

It would be possible to construct an abstract machine model which had one instruction: solve the problem. Such a model is not interesting because it does not in any way reduce the labor of realizing the solution.

One might therefore ask questions about the level of abstraction which is appropriate for a model. These questions are virtually impossible to answer in general. The answers depend upon the algorithm, the designer and the current state of the hardware art. They represent an engineering compromise. Unfortunately, it has been our experience that the proper abstract model only becomes clear after the algorithm has been coded for the abstract machine and then realized on several computers.

The design process begins with the selection of some operators and data types which seem appropriate. As the coding of the algorithm progresses, this original choice becomes 'obviously' less appropriate. Sometimes minor modifications are indicated, but too often a drastic revision is necessary. Most drastic revisions invalidate large portions of the code already written. The need for rewriting and respecifying results in coding times which are significantly longer than those required to implement equivalent software using an existing language which permits no
extensions.

When the designer begins by selecting operators and data types which he believes are suitable for the problem at hand, he is adrift in a vast sea of possibilities. There is no fixed point which he can use to guide himself in his design. These are the conditions under which the coding requires the greatest amount of time.

To provide a fixed point, we need to recognize a common core of operations and data types which are applicable to virtually all problems [12]. People with different backgrounds and experience may well specify somewhat different common sets of operations and data types. At the time of writing, we propose the following:

- Integers and integer arithmetic
- Tests for equality and relative magnitude
- Input/output of character information

The most common extensions to this list would be:

- Reals and real arithmetic
- Strings; concatenation, selection and lexical comparison
- Input/output of memory images

Neither list is to be considered gospel. We reserve the right to change our minds at any time.

In addition to a common set of operations and data types, one can specify a common set of 'organizational' features of the model. These are features which permit the programmer to order the execution of basic operations, specify that variables have certain types, form and reference data aggregates, and break the algorithm into intercommunicating parts. We would place the following constructs in this category:

- Labels and transfers of control
- Declarations
- Arrays and records
- Conditional and repetitive statements
- Procedures and blocks
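
To make the idea of a common core concrete, here is a toy interpreter restricted to part of it: integers and integer arithmetic, a test of relative magnitude, labels and transfers of control, and output. It is a sketch only; the instruction names and the tuple encoding are invented for illustration and do not correspond to any abstract machine defined in the text.

```python
def run(program, output):
    """Execute a list of instruction tuples on a toy abstract machine.
    Operands are either integer literals or variable names."""
    labels = {instr[1]: pc for pc, instr in enumerate(program)
              if instr[0] == "label"}
    variables = {}

    def value(operand):
        return operand if isinstance(operand, int) else variables[operand]

    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "set":                       # integer assignment
            variables[args[0]] = value(args[1])
        elif op == "add":                     # integer arithmetic
            variables[args[0]] = value(args[1]) + value(args[2])
        elif op == "jump":                    # transfer of control
            pc = labels[args[0]]
        elif op == "jump_if_less":            # test of relative magnitude
            if value(args[0]) < value(args[1]):
                pc = labels[args[2]]
        elif op == "output":                  # stands in for input/output
            output.append(value(args[0]))
        elif op != "label":
            raise ValueError(f"unknown operation: {op}")
    return variables


# Sum the integers 1 through 5 using only the operations above.
SUM_TO_FIVE = [
    ("set", "total", 0),
    ("set", "i", 1),
    ("label", "loop"),
    ("add", "total", "total", "i"),
    ("add", "i", "i", 1),
    ("jump_if_less", "i", 6, "loop"),
    ("output", "total"),
]

printed = []
final_state = run(SUM_TO_FIVE, printed)
print(printed)
```

An extension such as reals or strings would, in this picture, add operations to the interpreter without disturbing programs written against the core.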

4. REALIZATION OF ABSTRACT MACHINE MODELS

An abstract machine model is realized by defining its basic operations and data types in terms of the target computer. The algorithm, which was written in terms of the model, is then translated mechanically. Hence the major tool required for the realization is a translator.

4.1. CHARACTERISTICS OF THE TRANSLATOR

The most important characteristic of the translator is its portability. If a language is to be useful for constructing portable programs, then its translator must be widely available. If the translator is not portable, then two problems must be overcome:

- Lack of resources for constructing a translator for the desired target machine. (This may be due to an inadequate selling effort by the user.)
- The tendency to produce incompatible local dialects. (This may be due to misunderstandings on the part of the implementor or to a misplaced desire for efficiency.)

Our experience has been that these problems are virtually insurmountable for languages other than ANSI FORTRAN. This may be a poor comment on the state of the art, but it is something which we must recognize as being true at the moment. The user of any other language must be prepared to implement it himself from material in his possession, with little or no aid from the staff of the installation. Hence a portable translator is required.

A translator differs from most portable programs because the desired results depend strongly upon the target computer. Hence a portable translator must also be adaptable: it must be easy to alter the code generation algorithm to fit the target machine. This characteristic relates to the techniques used to achieve portability in the translator. We shall discuss these techniques in some detail in Section 4.2.

Translation may be divided into two subtasks: recognition of source language constructs and code generation. The recognition process depends only upon the source language, and hence could conceivably be different for every abstract machine. We have, however, already noted a common core of organizational features, basic operations and data types which seem applicable to most problems. A framework for language design can be based on this common core, and a uniform recognition algorithm built.
206
Such a framework w i l l
be presented in Section
6.
Our experience has been that the interface between the recognition and code generation tasks must also be adaptable, even if a common framework is used for designing abstract machine languages. The level at which constructs in the source language are recognized often depends upon how code for them is to be generated. Section 7.3. illustrates this point with examples from MITEM.
One important characteristic of a translator is its complexity. By increasing the complexity of the translator it is possible to make the source language more convenient for the user, to perform more complex optimization and to provide better diagnostics. At the same time (with the current state of the art) one makes the translator more difficult to adapt and less accessible to small computers. We have taken a deliberate decision, based upon our perception of today's needs and our own limitations, to concentrate on simple translators. As the methods for achieving adaptability in more complex translators become clear, they can be written in terms of abstract machine languages processed by the simpler translators.
A conventional compiler is obviously unsuited to our purposes. There are some compilers, such as that for BCPL [12], which are relatively portable and have code generators which can be adapted. The source language may or may not be extensible. It is generally difficult to change the linkage between the recognizer and code generator. Usually the code generator is coded into the compiler, which is written in its own source language. A thorough knowledge of the internal structure of the compiler is necessary to adapt it. Such translators are only marginally useful for our application.
Syntax-directed compilers [13] and translators produced by compiler generators [14] can be modified to accept different source languages.
Unfortunately, most recognition algorithms depend upon context-free grammars. This means that a particular construct is always parsed in exactly the same way. For example, the arguments of a procedure call may always be recognized before the entire call is recognized. When using a hierarchy of abstract machines, we would probably represent high level operations by procedure calls. If our target computer provided some, but not all, of these high level operations, it would be convenient to be able to recognize the procedure calls which were directly translatable as single units, while processing the others in the normal way.
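The idea of treating directly translatable calls as single units can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the set NATIVE_OPS, the output mnemonics (PUSH, CALL) and the flat, comma-separated argument syntax are all hypothetical and are not taken from any system described in these notes.

```python
# Hypothetical set of high level operations that the target computer
# happens to provide directly (an assumption made for this sketch).
NATIVE_OPS = {"ADD", "MOVE"}


def parse_call(text):
    """Split 'NAME(arg, arg, ...)' into a name and a list of flat arguments."""
    name, _, rest = text.partition("(")
    if not rest.endswith(")"):
        raise ValueError("not a procedure call: " + text)
    inner = rest[:-1]
    args = [a.strip() for a in inner.split(",")] if inner.strip() else []
    return name.strip(), args


def translate_call(text, native_ops=NATIVE_OPS):
    """Translate one call, as a single unit when the target provides it."""
    name, args = parse_call(text)
    if name in native_ops:
        # The whole call is recognized at once and becomes one instruction.
        return [name + " " + ",".join(args)]
    # Otherwise the arguments are processed first, then the call itself.
    code = ["PUSH " + a for a in args]
    code.append("CALL " + name)
    return code
```

Here translate_call("ADD(X,Y)") yields a single instruction, while a call to an operation outside NATIVE_OPS is expanded argument by argument in the usual way.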
This d i f f i c u l t y
is
a v o i d e d by systems which a l l o w the user to embed
'semantic actions' procedure trary
calls,
actions.
in the s y n t a x s p e c i f i c a t i o n possibly
with
to
'success'
a v a l u e in some s y s t e m s . The v a l u e i s or
'failure',
s i n g the r e c o g n i z e r to b a c k t r a c k . it
is
interpreted
position
of the
be taken w i t h track
input
a more g e n e r a l
(see r e f e r e n c e if
the
I
for
recognizer
c o m p i l e r designs facilities
The c o u p l i n g
well-defined,
source
a failure
is
are s l a n t e d
are p r i m i t i v e ,
is
of-
cau-
allowed,
examples).
current Care must
p e r m i t t e d to back-
toward the r e c o g n i t o n or are w r i t t e n
into
between r e c o g n i z e r and code g e n e r a t o r
is
usu-
and can be changed o n l y by changing the s y n t a x of the
language or making e x t e n s i v e m o d i f i c a t i o n s blow is
output
return
[16].
Code g e n e r a t i o n s
the c o m p i l e r .
final
string
semantic actions
Most s y n t a x - d i r e c t e d
ally
If
with
as an element which must be r e c o g n i z e d at the
o v e r them
phase.
These are s i m p l y
p a r a m e t e r s , which can p e r f o r m any a r b i -
They may r e t u r n
ten r e s t r i c t e d
[15].
the a p p a r e n t l a c k o f p o r t a b i l i t y
to the c o m p i l e r .
The
of these s y s t e m s , d e s p i t e
claims to the contrary.
At the current state of the art, the most suitable translators for our purpose seem to be those which perform both recognition and code generation interpretively. The translation rules are supplied with the program to be translated, and can easily be modified to meet particular requirements. In effect, these processors provide a language expressly suited to compiler writing. The important primitives (such as dictionary lookup, code conversion and lexical scanning) are built into the interpreter and can be called upon by simple constructs in the translation rules. When the user defines his translation rules he is, in effect, writing a compiler for his source language/machine pair. In some cases the system allows the user to 'freeze' a set of translation rules, compiling them into the system [17]. These rules can then be used as primitives for constructing other rules. If such a feature is provided, it is also useful to be able to excise some of the frozen rules [18].
Processors of this type are normally presented either as syntax-directed compilers [19, 20] or as general purpose macro processors [21-23]. The main variations seem to be in the recognizer. Programs like TMG [19] use a formal syntax, possibly with embedded semantic actions. Macro processors use either keywords or a general pattern matching scheme. Each technique has advantages and disadvantages which we shall not pursue here.
Our choice of STAGE2 [23] as our basic implementation tool was determined almost entirely by its portability. STAGE2 can be made available on a new computer with an effort ranging from one man-day
to two man-weeks. It has been implemented on 25 different computers ranging from the IBM 1130 and DEC PDP-11 to the CONTROL DATA 7600 and STAR.
The term 'macro processor' may be misinterpreted by some people to imply simple text replacement. In his classic paper [24], McIlroy pointed out that a macro processor should be capable of performing at translation time any action which could be performed at run time by a normal program. This says, in effect, that a macro processor is an interpreter for a general programming language. Since that language is to be used for a particular purpose, translation, it should be provided with high level operations which are useful in constructing translators.
STAGE2 interprets a low level language designed expressly for string manipulation. Since the portability of STAGE2 itself was the primary design criterion, the language has no frills. It might be considered almost equivalent to absolute machine code in structure. Some people have criticized STAGE2 on this basis. (Our experience has been that such criticism usually comes from those who are primarily oriented to programming in a high level language. Assembly language programmers seem to find STAGE2 quite acceptable.) The design can be defended only on the basis of intended use: STAGE2 is not to be considered a general translator for casual use. It is a basic tool for obtaining portable systems programs. Such a general translator could be written for a suitable abstract machine and realized using STAGE2.
Many of the examples in the following sections will use STAGE2 macros to illustrate methods of achieving portability and adaptability. Any detailed characteristics of the macro language which are required to understand those examples will be discussed at the time. We shall close this section with a brief overview of the processor to provide the necessary background. Each macro has a template and a code body. The template is a sequence of literal characters and parameter flags. A template is matched to a string by a left-to-right scan which compares the literal characters with those of the string. Each parameter flag can match any substring of the given string (including a null string) which is balanced with respect to parentheses. The match must account for all of the characters in the string. There may be several templates which match a given string.
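The matching rule just described can be sketched in Python. This is an illustrative sketch only, not the STAGE2 algorithm: the flag character FLAG, the function names and the shortest-substring-first search order are assumptions introduced here.

```python
FLAG = "'"  # hypothetical character standing for a parameter flag


def balanced(s):
    """True if s is balanced with respect to parentheses."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def match(template, string):
    """Return the list of parameter substrings, or None if there is no match.

    Literal characters must agree exactly; each flag may take any balanced
    substring (including the null string); the whole string must be used.
    """
    if not template:
        return [] if not string else None
    head, rest = template[0], template[1:]
    if head != FLAG:
        if string[:1] == head:
            return match(rest, string[1:])
        return None
    # Try candidate substrings for this flag, shortest first (an assumption).
    for end in range(len(string) + 1):
        candidate = string[:end]
        if balanced(candidate):
            tail = match(rest, string[end:])
            if tail is not None:
                return [candidate] + tail
    return None
```

Under this sketch, both the template `'='` and the template `'='+'` match the string `A=B+C`, so more than one template can apply to the same string.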
This ambiguity is resolved in a standard way which does not depend upon the order in which the macros were defined. When a template is matched to a string, the corresponding code body is
effectively
a procedure
parameters flags
split
supplied
reference
4.2.
In S e c t i o n
4 1.
z i n g an a b s t r a c t
of
we advocated a h i g h l y abstract machine.
machine f o r
and a r u n n i n g terms of
portable
which
it T
is is
N.
This d e f i n i t i o n
is
is
[26],
The b a s i c d i f f i c u l t y heard a g r e a t
deal
with
All
of
communication
paragraph. A
further Several
in terms of
difficulties
on
problem,
is
errors
iterations
No in
which
seem
code. is
to produce code f o r
of data i n t e r c h a n g e
formats,
N,
M.
i s one of communication. incompatible M
i s to
will
to N.
We have
: l a c k of
character
produce code mentioned in
be n e c e s s a r y b e f o r e a
is obtained.
On each i t e r a t i o n
the
must be surmounted.
No wonder a h a l f
boot-
a v o i d s the
iterative
implemented by hand on
man !
aspect o f the communica-
at the expense o f some a d d i t i o n a l
simple translator
and the
half bootstrapping, M
sometimes beyond the p a t i e n c e of m o r t a l [27]
T,
must be d e f i n e d
a g g r a v a t e d by the e r r o r s N
reali-
on computer
A
of these must be surmounted i f
Full bootstrapping tion
file
for
A. The new computer i s
known as
strategy
incompatible
The problem i s
the p r e v i o u s
i m p l e n e n t by
to the usual
about the d i f f i c u l t i e s
common p e r i p h e r a l s , etc.
this
of our
translator.
constructed
is running
which
insure
the f i r s t
used,
subject
to
But a l l
i s not p e r m i t t e d ,
coded by
even the most c a r e f u l l y
T
translator
already available
use the v e r s i o n of
is
in
recursion
which we wish to
One i m p l e m e n t a t i o n s t r a t e g y
strap
expressions).
as the b a s i c t o o l
Since i n f i n i t e
v e r s i o n of
creep i n t o
definition
which have
may be found
STAGE2
machine program was p o r t a b l e .
m a t t e r what i m p l e m e n t a t i o n s t r a t e g y
N.
strings
from strings
TRANSLATOR
Let us denote the t r a n s l a t o r
sets,
or
may be b u i l t
e v a l u a t e d as a r i t h m e t i c
we must have some o t h e r way of r e a l i z i n g
for
Strings
memory, or c o n s t r u c t e d
argument has been based upon a t r a n s l a t o r
to
memory l o c a t i o n ,
by the code body, p a r a m e t e r s t r i n g s ,
o f the f a c i l i t i e s
THE
a particular
abstract
to c o n s t r u c t
25
OBTAINING
that
in an i n t e r n a l
in some way ( e . g .
A complete d e s c r i p t i o n
Its
STAGE2.
may be matched a g a i n s t the s e t of t e m p l a stored
from the i n t e r n a l
been t r a n s f o r m e d
by
which matched the p a r a m e t e r
to a s e t o f break c h a r a c t e r s .
characters
extracted
string
to some d e v i c e ,
according
literal
are s u b s t r i n g s
The purpose o f the code body i s
A constructed
output
in the language i n t e r p r e t e d
by v a l u e )
of the t e m p l a t e .
strings. tes,
(called
N
hand c o d i n g .
A very
and then used to r e a l i z e
lize
A. The e f f o r t
involved
in hand coding
translator
[28]
statements.
The main d i s a d v a n t a g e l i e s
language of
A
the power of Certain
to an i n a c c e p t a b l e
: An i n p u t / o u t p u t
a univeral
sembly code f o r
the t a r g e t
output.
T's
terface
Since
T
There i s
a third
the p o i n t
p a r t o f the d e f i n i t i o n T
of
can e a s i l y
be
would n o r m a l l y produce as-
primary function
needed to
the c o m p l e x i t y
is to p r o v i d e the i n -
stream and the r e l o c a t a b l e
strategy,
which has c h a r a c t e r i s t i c s
The design of the a b s t r a c t
of specifying
an a b s o l u t e o b j e c t
object
code o f
A simulator
for
A
of both those men-
machine code.
A
is
now w r i t t e n program
is
carried
The r e s u l t
the memory of
numbers which form the a b s o l u t e o b j e c t
A
for T
is
(if
N.
to
a block
such h a r d -
The b l o c k o f
can be executed by
interpreter.
This s t r a t e g y
is
like
the f u l l
cation
problems o f the h a l f
coding
(the
interpreter).
need o n l y be done once; ber o f
bootstrap
bootstrap
in t h a t
The t r a n s l a t i o n it
it
a v o i d s the communi-
at the expense o f of
T
can then be used to
additional
hand
to a b s o l u t e o b j e c t realize
T
code
on any num-
computers.
The v e r s i o n o f for
either
and hence an a s s e m b l e r i s
numbers which could be loaded i n t o
ware e x i s t e d ) . this
may r e s t r i c t
necessary f o r
does most o f the work,
Its
FORTRAN
p l a c e d on the
computer.
mentioned above. of
simple
The s i m p l e t r a n s l a t o r
conventions.
between a c h a r a c t e r
the t a r g e t
computer i s
computer,
of the a s s e m b l e r i s m i n i m a l .
100
limitations
These l i m i t a t i o n s
package i s
requirement.
- one s u i t a b l e
degree.
on the t a r g e t
a r r a n g e d to use the same I / 0 process
in the
by the s i m p l e t r a n s l a t o r .
T
o f the methods
small
can be e x p r e s s e d by fewer than
basic software
A, and i s
is
N.
T
In t h a t
ing run on the munication
which
is
interpreted
respect this simulated
strategy
machine
A
is is
o n l y used to a half
to produce code f o r
problems are avoided because the s i m u l a t e d
the same p e r i p h e r a l s
and c h a r a c t e r
translate
bootstrap
s e t as
N,
A
-
T
T N.
is
be-
All
com-
has e x a c t l y
and i s at the same l o c a -
ton. Our e x p e r i e n c e has been t h a t simple translator be b u i l t
with
of
if
reference
A
meets the c o n s t r a i n t s
[28],
then an i n t e r p r e t e r
a p p r o x i m a t e l y the same amount o f e f f o r t .
pears t h a t
the most burdensome r e s t r i c t i o n s
translator
c o u l d be l i f t e d
ter.
Hence we conclude t h a t
without
the t h i r d
In f a c t ,
necessary f o r
increasing strategy
imposed by the for
could it
ap-
the s i m p l e
the cost o f the is
A
interpre-
the one to use.
5. A CASE STUDY OF SOME EARLY ABSTRACT MACHINES
In Section 3, we enunciated a number of principles for constructing portable and adaptable software. We must stress that we have not arrived at these principles just by processes of abstract thought; rather, they are based on the results of a number of experiments on abstract machine models which we have carried out over the last few years. It is now our intention to consider two of these early models in some detail, pointing out where they were successful, where they failed and how they have influenced our current thinking. By considering the design and implementation of some actual models, we hope to set the principles of abstract machine modelling in a more concrete framework and demonstrate how they can be used to produce working software. In this way, we will attempt to evaluate the principles against not just what might be achieved, but what has been achieved. In particular, we will consider:
(a) FLUB, a machine designed specifically for the task of constructing STAGE2;
(b) MITEM, a machine used to implement TEXED, a program for text manipulation.
5.1. MACHINE AND LANGUAGE DESIGN
In Section 3.1, we noted that in designing abstract machine models, we had to bear in mind not only the relationship of the model to the problem but also its relationship to the structure of real machines. In our early approaches to abstract machine modelling, we tended to emphasize the former at the expense of the latter. Our assumption was that if the model adequately reflected the characteristics of the problem, then encoding the algorithm in terms of the basic operations and data types of the abstract machine is equivalent to programming the problem for the actual computer once the abstract machine has been realized. Obviously, we kept a wary eye on the structure of real computers but the requirements of the problem tended to dominate the design process. Now in principle, it should be possible to construct the correct model just by considering the characteristics of the problem alone with little or no regard for the way real machines operate; in practice, this can make an efficient implementation of the abstract machine difficult or even impossible to obtain. In presenting these case studies, we will attempt to point out where our emphasis on the relationship of the model to the
Figure 5.1. Representation of a Tree (table with columns ADDRESS, FLG, VAL, PTR; rows annotated: root of the tree, end of CAT, continuation of COT, end of COT, beginning of DOT, end of DOT)
problem has c r e a t e d
difficulties.
STAGE2
deals with
MITEM,
o n l y the l a s t
three
o f these s t r u c t u r e s our e a r l y least tion
suited
dictated
strings
word.
FLUB
CAT,
indicator
bits;
is
(PTR)
Given such a s t r u c t u r e asked was whether i t gers.
Clearly,
ted as a l i n k e d
fields
the
word,
FLUB
for
of words w i t h
of the s t r i n g
the d e s i g n of for
a string
ting
operations
"0"
With
the r e p r e s e n t a t i o n
cated a f u l l FLG
of a s t r i n g
The
free
as r e q u i r e d
a word whose and whose
FLG
field
again s t o r e s
When we came to design we decided t h a t
convenient for
con=
indicator
TEXED
this
ad-
after
structure
i m p l e m e n t i n g such e d i -
Figure
5.2.
by s t o r i n g
COAT
field
VAL
for
field
PTR
space and a d j u s t i n g
illustrates the c h a r a c t e r
various
and s t r i n g s
how to r e p r e s e n t
was o b v i o u s l y
operations to hold
of storing
too s m a l l
operations
links
in the
t h e r e o n l y remained They could be a l l o -
at l e a s t storing
integers
VAL
field
an i n t e g e r
since
it
would be needed f o r string
lengths.
of a u s e f u l
the
Addition VAL
However i t
size since
it
indica-
and sub-
field
since
would s t i l l
be
was not expected
v e r y long s t r i n g s .
to hold a c h a r a c t e r
seemed s e n s i -
in one o f the t h r e e
was o n l y used f o r
would be r e q u i r e d .
the programs would be m a n i p u l a t i n g
the use of a
fixed,
integers.
word but again on the grounds o f economy, i t
was to be used f o r
too small that
be r e p r e s e n -
a d d r e s s i n g the word
of t r e e s
and no a r i t h m e t i c
traction it
could r e a d i l y
and d e l e t i o n .
b l e to examine the p o s s i b i l i t y tors
and i n t e -
fields.
the problem o f d e c i d i n g
fields.
was
o f each word c o n t a i -
can be changed t o
i n the n e x t a v a i l a b l e
pointer
strings
that
field
header.
as i n s e r t i o n s CAT
the next q u e s t i o n
field
would a l s o be q u i t e
how the s t r i n g
and the
VAL
had been c o m p l e t e d ,
FLUB
contains
(FLO)
PTR
c h a r a c t e r o f the s u b s t r i n g
a substring
addressed s t o r e .
field
the
of a substring.
denoting
the
and the
dresses the f i r s t bits
containing
should a l s o lead to economy in
a l s o be s e t up by s p e c i f y i n g
length
a tree
specifying
tains
the
The r e p r e s e n -
address.
a string
the n e x t c h a r a c t e r . S u b s t r i n g s
could
algorithm
machine should at
s t o r e s one c h a r a c t e r
(VAL)
in
d e t e r m i n i n g the composi-
: the f l a g
was noted t h a t
list
in
used to hold a l i n k
for
STAGE2
data s t r u c t u r e .
illustrates
was a l s o s u i t a b l e
It
ning a c h a r a c t e r containing
the a b s t r a c t
economy in data s t r u c t u r e s
basic operations.
STAGE2
3
the v a l u e f i e l d
field
one in the
this
5.1.
and i n t e g e r s ;
i s the most complex
s e t up in a s e q u e n t i a l l y
DOT
into
strings
a key f a c t o r
Figure
and
COT
Each word i s d i v i d e d pointer
that
to m a n i p u l a t i n g
o f a t r e e was t h e r e f o r e
of the
: trees,
Since the t r e e
and a v e r y fundamental
design s t r a t e g y
be w e l l
tation
data t y p e s
two are used.
Similarly,
o n l y i m p l i e d the a b i l i t y
to
Figure 5.2. The string COAT (table with columns ADDRESS, FLG, VAL, PTR)

FIELD   CONTENTS             OPERATIONS
FLG     indicator bits       assignment
                             test for equality
VAL     character            integer addition and subtraction
        length of a string   test for equality
PTR     address              integer arithmetic
        integer              test for equality
                             test for relative magnitude

Figure 5.3. Use of fields in the FLUB word
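The three-field word summarized in Figure 5.3, and the linked representation of a string in which each VAL holds a character and each PTR addresses the next word, can be sketched as follows. The class and method names, the use of address 0 as a null link and the simple "next free word" allocator are assumptions made for this sketch; they are not the FLUB or MITEM definitions.

```python
from dataclasses import dataclass


@dataclass
class Word:
    flg: int = 0  # indicator bits
    val: int = 0  # a character code or a small integer
    ptr: int = 0  # an address (link) or an integer


class Store:
    """A sequentially addressed store of three-field words."""

    def __init__(self, size):
        self.words = [Word() for _ in range(size)]
        self.free = 1  # address 0 is reserved as the null link (assumption)

    def new_word(self, flg, val, ptr):
        """Take the next available free word and fill in its fields."""
        addr = self.free
        if addr >= len(self.words):
            raise MemoryError("store exhausted")
        self.words[addr] = Word(flg, val, ptr)
        self.free += 1
        return addr

    def make_string(self, text):
        """Build a linked list of words, one character per VAL field."""
        head = prev = 0
        for ch in text:
            addr = self.new_word(0, ord(ch), 0)
            if prev:
                self.words[prev].ptr = addr
            else:
                head = addr
            prev = addr
        return head

    def insert_after(self, addr, ch):
        """Insert a character by storing it in free space and relinking."""
        new = self.new_word(0, ord(ch), self.words[addr].ptr)
        self.words[addr].ptr = new
        return new

    def read_string(self, head):
        """Follow the PTR links and collect the characters."""
        out = []
        addr = head
        while addr:
            out.append(chr(self.words[addr].val))
            addr = self.words[addr].ptr
        return "".join(out)
```

For example, after head = store.make_string("CAT"), the call store.insert_after(head, "O") changes the string to COAT without moving any existing word; only one pointer field is adjusted.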
store quite small integers. On the other hand, the PTR field was already large enough to contain an address, and the operations of addition, subtraction and a test for equality would be required to sequence through a tree of the type shown on Figure 5.1. Thus the decision was taken to store integers in the PTR field since only multiplication, division and a test for relative magnitude had to be added to those already required. This is to be contrasted with any decision to represent an integer by a full word which would have required a complete set of arithmetic and conditional operations. Hence the size of the pointer field determines the range of integers allowed in any implementation.
With the format of the basic data structure decided, the design of the abstract machine had reached the situation summarized in Figure 5.3. The main emphasis so far had been on the requirements of the problem. However, some thought had to be given to the way such a design might be mapped onto actual machines. The operations required were almost universally available and it was not expected that there would be any problem in realizing them. However, the data structures were another matter and the match between abstract machines and real computers was poor. It was unlikely that a real computer would be found whose words were partitioned into the three fields making up the FLUB word. Methods for implementing such a structure had to be considered in more detail.
There were two obvious ways in which the data structure could be mapped onto a real machine. Either the fields could be packed into one or more words of the target machine or each field could be allocated a full target computer word. The first approach conserves memory but results in large overheads for the packing and unpacking; the second enables efficient access to be made to each field but is expensive on space. The situation was resolved by providing FLUB with a small set of registers, on which almost all operations take place, and a mechanism for transferring information between the registers and the memory. The fields of the registers could be implemented with one target computer word per field for efficient execution. Since the number of registers was small (36 for reasons given later), the amount of space required for such an implementation was not prohibitive. The FLUB words could then be packed in the memory to conserve space. The fields would still have to be packed and unpacked for transfer to and from the memory, but the overheads would be small since memory would not be accessed by most operations. The memory-register transfer operations take two registers as operands. One either receives or transmits information while the PTR field of the other specifies the memory location. Hence all access to
memory i s stored
indirect.
in a
or a r e a l
This field
PTR
address.
required
some d e c i s i o n s
was to be i n t e r p r e t e d
The l a t t e r
about how the address
- as an a b s t r a c t
was chosen f o r
address
reasons of e f f i c i e n c y
but
a program had to be g i v e n access to the number of t a r g e t machine address units
per a b s t r a c t
example, on then
8
next
FLUS
word so t h a t if
System~360, word.
This
8
defining
the upper and l o w e r l i m i t s in t h r e e
Apart from i n p u t - o u t p u t , features,
for
does not
of the
this
field
require
fore
applicable
calls
and e x i t s
operation
a store
l a c k e d a number of e s s e n t i a l subroutines.
and to s p e c i f y
operation
The common
have been summarised in S e c t i o n
into
the r e t u r n
a register
a register
In
At t h i s
other differences nal
hardware
designing
point,
between
features
it
FLUB
required
is
explicitly
and may t h e r e f o r e
also appropriate
and
the
as an e x t e n s i o n o f
TEXED
implementation of
t h e r e would be s u f f i c i e n t
registers
than add e x t r a
we i n c o r p o r a t e d
ry s t o r e .
registers,
This could
procedure c a l l s . MITEM
We a l s o noted t h a t ,
running
the user and w a i t its
of f u r t h e r
commands could
flip-flop
INTERRUPT
cancel
and an e r r o r
him to respond;
o n l y course of a c t i o n
MODE an
for
a s t a c k to serve as a tempora-
under c e r t a i n its
courses o f a c t i o n .
interactively
is
circumstances,
e n v i r o n m e n t in o r d e r
For example, i f detected,
on the o t h e r
i s to t e r m i n a t e corrupt
Rather
the t r a n s m i s s i o n o f p a r a m e t e r s in
needed to be a b l e to i n t e r r o g a t e
se between a l t e r n a t i v e is
implement the program.
a l s o be used f o r
then
flip-flop
to choo-
the program
it
must i n f o r m
hand, in batch mode
p r o c e s s i n g s i n c e the e x e c u t i o n
the t e x t .
We t h e r e f o r e
added a
to a l l o w the program to make such a d e c i s i o n . a l l o w e d the o n - l i n e
a complex search p r o c e s s .
In
MITEM.
we had some doubt whether
FLUB,
to
on the
to c o n s i d e r the
These were m a i n l y a d d i t i o -
TEXED.
for
This
subroutine
TEXED,
take advantage o f w h a t e v e r hardware mechanisms are a v a i l a b l e machine.
address
on e x i t .
the program area and i s t h e r e -
to a wide c l a s s of computers. do not s p e c i f y
addresses
memory were p r o v i d e d
FLUS
was to t r a n s m i t
FLUB
of a r e g i s t e r
actual
fields.
PTR
the models s t i l l
The method chosen f o r PTR
together with
example, a method of h a n d l i n g
hardware mechanisms f o r
target
For
bytes,
problem o f address mapping has a l r e a d y been d i s -
The mapping f a c t o r
as p r e s e t q u a n t i t i e s
in the
addresses could be computed. word i s mapped onto
FLUB
must be added to any address to compute the address of the
cussed in Goos B.
3.2.
actual
1
This f e a t u r e
user to r e g a i n could
control
be d i f f i c u l t
i m p o s s i b l e to implement in some systems and hence we made i t one which could be adapted o u t .
BATCH-
Similarly and
or even
an o p t i o n a l
Before
going
useful
at this
on to c o n s i d e r point
we have d e s c r i b e d the basic nes. tively
the
easy t o
appetite
actually
operations
However, for
have p r o v e d
implement,
word s t r u c t u r e cepts,
there
There
each s t r i n g integers useful
is
types
treated
In
TEXED,
not
one o f
the
problem
memory o f a f i x e d necessary
is
lists
deletion,
it
structions
is
is
set
quite
use a
spend a g r e a t could
deal
improve
if
we d i s c a r e d
array during
its
of
characters.
the
process
time
field
FLG
the
position
copying
we would entirely string
between could
be
operations
is
not
storage
of
if
text
the
could
lose
then
from one b u f f e r
strings, if
very
little as an
be c a r r i e d to
we
Even on
a string
a
can
character
available.
in string than
editor
on some machines probably
in-
of characters.
i s much f a s t e r
and s t o r e d
strings and
hardware
of a character
the
is
really of
insertion
specified
facilities
and d e l e t i o n the
This
for
course
amount o f
instruction
Since
considerably
hardware
data
a small
process
TEST
searching
structure
Insertion of
the
AND
types
the
the manipulation
structure.
instructions,
list
the
for
locate
performance
such the
for
of
word s t r u c t u r e
FLUB
although
sequence o f b y t e s .
make use o f any s p e c i a l
machines w i t h o u t
to
on a l i s t
of
its
the
When
contain
would be r e t a i n e d .
needs o n l y
that
end.
of
of
space by h a v i n g
us f r o m making use o f s p e c i a l
TRANSLATE
up as a c o n t i g u o u s
the
blocks
and c o n d i t i o n a l
concepts
However,
on some machines
any programmed s e a r c h we c o u l d
arithmetic
by u s i n g
convenient
the
data
all
con-
The l e n g t h
We would
various
as w e l l .
STAGE2,
rarely
integers). the
MITEM
and t h e f a c t
SYSTEM~360,
we would
of
denote fields
a register,
In
contiguous
linkage. to
use t h e
data types
decision.
VAL
to
o f economy o f
save c o n s i d e r a b l e
transmit
immaterial.
available
For example on
and
rela-
adequately
decision
other
this
is
a voracious
quite
grounds
required
once i n
created
does p r e c l u d e
for
characters,
o f speed.
size
largely
as l i n k e d
is
set
the
m~chi-
it
be used e f f e c t i v e l y
way and occupy
and our economy o f
space b u t
for
therefore
Thus o n l y
be
none o f
on a c t u a l having
design
any e x p l i c i t
FLO
to
will
Although
cannot
the
on t h e
a predictable
operations
be r e q u i r e d ,
with
no need f o r
nodes,
realize
i n STAGE2
a tree
i n memory,
it
As e x p e c t e d ,
matter.
very often lies
and t h e memory, b u t uniformly.
string,
it
store
We m i g h t
system,
t h e program p e r f o r m s
reason
(tree
to
another
no v a l i d
in
I/0
practice.
does r e s u l t
known and no f l a g
distinct
registers
is
this
thus
information.
require
would
is
are s t o r e d
data
to
in
to j u s t i f y
"is r e a l l y
are h a n d l e d
storage.
three
required
the
difficult
while
The t r o u b l e
we a t t e m p t e d
strings
it
machines,
machines.
Although
performed
memory. Thus,
of
a moment and examine how t h e models
data structure
on medium and l a r g e on s m a l l
the design
to pause f o r
out
another.
Thus,
in p r a c t i c e ,
both a b s t r a c t
machines have r e v e a l e d design d e f i c i e n -
c i e s which we b e l i e v e were due to e m p h a s i z i n g the r e l a t i o n s h i p model to the problem and not p a y i n g s u f f i c i e n t ture
of real
lity,
machines.
in f u t u r e ,
As our goals are e f f i c i e n c y
we w i l l
cess on the f a c i l i t i e s Now l e t
us c o n s i d e r
these a b s t r a c t nally
on a c t u a l
From an e x t e r n a l
input
text
treated
The f i r s t
integers
Input
only with
Internally,
apart
the
characters
used f o r After
error
some e x p e r i e n c e in
the c h a r a c t e r b l e enough. fixed for
I/0
In i t s
I/0
file
stream,
edits
is
and t r a n s m i t t i n g
and t e x t
to be processed
be sent to two s t r e a m s ; the
further
processing;the
it
second was
we came to the conclusion
not wish to r e s t r i c t time,
and i t
that
to a
STAGE2
was being designed
TEXED
was e v i d e n t t h a t
process I/O
lines is
I/O
accepts
MITEM
them a c c o r d i n g
a much more
lines
of t e x t
to commands i s s u e d on a to the
reported channel
stream;
WRITE
on the
than was r e q u i r e d since
if it
considerations
the memory of the a b s t r a c t was r e q u i r e d
for
of efficiency
and the o p e r a t i o n machine.
reading
dictated
for
Thus
STAGE2.
from
must be placed
that,
from one at l e a s t ,
must be performed o u t s i d e
Hence some form of record
and w r i t i n g
success or
a line
memory. However, in a simple copying operation
the b a s i c u n i t
from
CONTROL
stream.
PRINT
operations were sufficient
stream had to be scanned or modified
TEXED
to a n o t h e r ,
rations
could
About t h i s
the m o d i f i e d
o f the e d i t i n g
READ
the l i n e
receiving
system would be needed.
Character-by-character in the
provided character-by-cha-
FLUB
STAGE2,
MITEM
we a l r e a d y needed one more the
by an e n d - o f - l i n e
in a machine readable form so that
we d i d
devices.
stream and o u t p u t s failure
are
symbol which was assigned the value
s i m p l e s t mode o f o p e r a t i o n ,
READ
Hence the
and o u t p u t
messages.
using
In p a r t i c u l a r ,
s e t of
Inter-
s y s t e m , a l t h o u g h easy to i m p l e m e n t , was not f l e x i -
the i m p l e m e n t a t i o n o f
complex
a
I/O
MI-
text.
were represented by non-negative
to the computer f o r
and d i a g n o s t i c
and
STAGE2
input
lines
o f the macro d e f i n i t i o n s
r e c e i v e d the g e n e r a t e d t e x t
could be r e - i n p u t
into
system f o r
of output
of characters.
field
VAL
was read from one stream and output first
lists
divided
from the e n d - o f - l i n e
consisting
v i e w , both
was one in which both
system designed f o r
operations
a character. -1.
I/O
as streams of c h a r a c t e r s
symbol. racter
view of the
in the design p r o -
an i n p u t - o u t p u t
to produce l i n e s
however, they manipulate linked
simplest
as p o r t a b i -
computers.
the problem of d e s i g n i n g of
of the
to the s t r u c -
as w e l l
have to p l a c e more w e i g h t available
machines.
process l i n e s
TEM
attention
lines.
I/O
ope-
For more complex
editing
operations,
it
was a l s o c l e a r
that
a number o f c o n t r o l
ons would be needed. For e x a m p l e , to move a b l o c k of t e x t tion
in a f i l e
to a n o t h e r ,
stream and copy i t
WRITE
to a
then be copied from the sition
was l o c a t e d .
If
we could
the f i l e
p i e d to the
WRITE
the o r i g i n a l
file
ration
stream in
would r e q u i r e
implied still
that
its
Subsequent l i n e s
the
endfile is
and c h a r a c t e r
of t e x t
could be co-
We could then r e c o n n e c t processing.
and r e w i n d . a file
This ope-
Notice
operations computers I/O
to the a b s t r a c t
p r e s e n t e d no g r e a t transmit
on the o t h e r
records
I/O
from a stream and
If
devices,
I/O
which
each c h a r a c t e r We a v o i d e d t h i s
actually
TEM
process
lines
included
a line
I/O
operations. line
(see F i g u r e
buffer
5.4.).
cord o p e r a t i o n s , and the
Since to The
field
racter
I/O
I/O
is
switch
and
STAGE2
from one
transmit
devices via
I/O
MI-
d e v i c e to
channel
line
9
be-
buffers
can be r e c o v e r e d v i a r e between a channel
the d e v i c e number.
channels and
to s p e c i f y
buffer
device.
d e v i c e must s p e c i f y field
information
o f channels as r e q u i r e d
the e x t e r n a l
the use of up to VAL
the a p p r o p r i a t e
both
loaded or unloaded by c h a r a c t e r
operations
affecting
which
to a number of
to or from the memory. We t h e r e -
The c u r r e n t
a peripheral
Character
a d e v i c e number
to s e l e c t that
MITEM
the channel
up
number.
of the same word was used to hold the completion code what happened to the operation
operations
an end o f
gram t h a t
which
out.
without
permitted
reflects
of a line
and the e x t e r n a l
we made use o f the
FLG
routine
which m e r e l y move i n f o r m a t i o n
buffer,
STAGE2
must s p e c i f y
These enabled the s w i t c h i n g
r e q u e s t to
32,
which that
line
I/O
buffer
The r e c o r d
to be c a r r i e d
MITEM
Any
operation
and do not
fore
systems o f most
devices.
can be d i r e c t e d
overhead by n o t i n g
transmission
tween the
I/O
Record
be implemented by r o u t i n e s
characters
another during
for
problems s i n c e the
can be used by the b u f f e r i n g
buffer.
machines
machines would be r e q u i r e d .
hand would have to
also
o p e r a t i o n s were needed, we
to and from p e r i p h e r a l
pack and unpack b u f f e r s .
it
reconnected.
next considered how such a system might be implemented on real and what interfaces
could
the new po-
information
deleted
new p o s i t i o n .
when i t
r e s p e c t to the
streams u n t i l
WRITE
we must be a b l e to d i s c o n n e c t
both r e c o r d
with
stream to c o n t i n u e
at l e a s t w r i t e line
it
s t r e a m , the b l o c k
READ
r e c o v e r the c u r r e n t
Given t h a t
the
containing
READ
to the
delete stream.
DELETE
to
READ
were now connected to the
first
functi-
from one p o s i -
line
the l i n e
a l s o s e t the
FLG
(see S e c t i o n
field,
either
symbol has been read on i n p u t , buffer
is
full
during
output
or to
to
3.2.).
Cha-
indicate
i n f o r m the p r o -
o f an o v e r l e n g t h
line.
In d e s i g n i n g t h i s
I/O
both
MITEMj
it
and
STAGE2
for
[29]
any a c t u a l
age would p e r m i t the of a b s t r a c t
software.
organisation
Again we have got
into
needs o f a p a r t i c u l a r that
rigid
buffers
the e f f o r t
keeping w i t h
case
I/O
really
a particu-
has not t u r n e d particularly
as a g e n e r a l
out to with
Even f o r
has been g r e a t l y
an e f f i c i e n t
[30,31]
the p r i n c i p l e s
simplified,
information
on a c t u a l
by t h i s
STAGE2,
in S e c t i o n
3.2.
is
In t h i s
machine and the e n v i r o n m e n t
only function
to and from the c h a n n e l s .
machines and e f f i c i e n t
ex-
For these r e a s o n s ,
has been designed which
outlined
and i t s
for
program.
i m p l e m e n t a t i o n on
than we a n t i c i p a t e d .
system
re-
system.
too much emphasis on the
MITEM.
required
the boundary between the a b s t r a c t
the f l o w o f
this
by p l a c i n g
to o b t a i n
some systems has proved larger
lize
the c o s t of o b t a i n i n g
in t h i s
are not
required
a new v e r s i o n of the version,
environment
to be spread over a number
to q u a l i f y
difficulties
I/O
the system imposes some unnecessary inefficiencies,
ample, the channel
more in
Since the
a structure,
of b u f f e r s ,
problem,
the r e q u i r e m e n t s of
we might be a b l e to use
the use o f a g e n e r a l i z e d pack-
In p r a c t i c e
be the case. The package has too
Further,
machines.
computer,
machines, t h e r e b y r e d u c i n g
gard to i t s
that
implementation effort
p i e c e of p o r t a b l e
we f i n d
to s a t i s f y
we a n t i c i p a t e d
a w i d e r range o f a b s t r a c t
must be recoded f o r
lar
package
now i s to c o n t r o l It
is
s i m p l e r to r e a -
v e r s i o n s are more r e a d i l y
obtai-
nable. In the f o r e g o i n g machines in re o f
real
sections,
relation machines.
by the t o o l s
used to
Now we must consider what limitations realize
was c r e a t e d f o r
FLUB,
to be used f o r available,
the models.
the purpose of
realising
the b o o t s t r a p manner to
characters.
sequence.
machines.
or f i x e d
operands were r e q u i r e d
: register
36
registers
bels
2
digits,
TO
control
to the
FLO
67
Since
to l a b e l field
of
67
if
register
A
FLUB
=
the
templates
FLUB
in a some-
o n l y be s i n g l e
and
0-9.
to
Two types of All
Hence
FLUB
program l a -
statement
B
FLG
B.
was not
STAGE2
realize
names and program l a b e l s .
FLG
the t o o l
STAGE2,
of characters.
named A-Z the
machine,
s t a t e m e n t s were r e s t r i c t e d
strings
e.g.
IF
recognized
FLUB
length
was p r o v i d e d w i t h consisted of
It
were imposed
abstract
but the parameters could
STAGE2
Hence the operands of
characters
transfers
implementing
other abstract
and i n i t i a t e
equal
The f i r s t
a much simpler macro processor was used to
what s i m i l a r single
we have discussed the design of the abstract
to the r e q u i r e m e n t s of the problem and the s t r u c t u -
field
of r e g i s t e r
The c o r r e s p o n d i n g
A
is
template is
[Figure 5.4. Input-Output System: diagram relating the files, the channel buffers and the line buffer]
TO
where the apostrophe Once
''
IF
' =
represents
was r e a l i z e d
FLUB
FLG
'
a single
on an a c t u a l
character parameter.
machine,
and the simple macro processor could be discarded. the first
abstract
MITEM
than we had f o r
operands
and p e r m i t t e d
programming
tions
on
FLUB
as i d e n t i f i e r s transfer
for
manifest
operations
ted the f a c t
that
also d i f f e r e d
the
identifiers
ter
register
parameters c o m p r i s i n g
list
of
apostrophe
is
an a r b i t r a r y
integers.
of f o r m a l
fields
Also, a procedure
parameters spe-
could a l s o
include
a list
and m a n i f e s t oR c h a r a c -
line
string
PORTING
The f i r s t
AND
in F i g u r e in
5.5.
of a parameter.
State-
The s i n g l e
FLUB.
When STAGE2 it
will
realizing
basic operations
code.
ADAPTING
been o u t l i n e d
in s e c t i o n
TEXED
the u n d e r l y i n g
and data t y p e s
data t y p e s of the t a r g e t
computer. 4.
described
The e s s e n t i a l s us c o n s i d e r
in the l a s t
the d e s i g n o f these m a c h i n e s , the f i r s t to be s t o r e d
soft-
machine by e x p r e s s i n g
in terms o f the b a s i c o p e r a t i o n s
Now l e t
how much space w i l l be r e q u i r e d information
abstract
can be mapped onto e x i s t i n g
Since the data s t r u c t u r e
as-
to each p a r a m e t e r , which can then be used to
stage in the process of implementing a piece of portable
ware i n v o l v e s
and
given
a g a i n s t one of these t e m p l a t e s ,
the process o f g e n e r a t i n g
5.2.
is
are those a v a i l a b l e
used to mark the p o s i t i o n
sign a c h a r a c t e r control
statements
TEXED
an a s t e r i s k
matches a source t e x t
the
into
l a b e l s was changed so
digit
and a l i s t
register
since they r e f l e c -
procedures w i t h
on a procedure
example,
constants.
A complete
FLUB
A call
2
to d e l i m i t
for
The r e g i s t e r / m e m o r y
FLUB
The f o r m a t f o r
of an i d e n t i f i e r
fields.
ments marked w i t h
its
constants.
could be used i n s t e a d of
heading c o n s i s t i n g of a c t u a l
the use of s t r i n g s ,
from those in
pseudo o p e r a t i o n s were i n t r o d u c e d cifying
We removed the r e s t r i c -
memory can be d i v i d e d
TEXED
STAGE2
a more c o n v e n i e n t language
STAGE2.
and c h a r a c t e r
number o f a r r a y s by d e c l a r a t i o n s . that
was t h e r e f o r e
TEXED
machine whose design was based on the use of
and we were a b l e to p r o v i d e o u r s e l v e s w i t h for
became a v a i l a b l e
STAGE2
of t h i s
process have
in more d e t a i l
how
machines. section
question
that
to r e p r e s e n t each f i e l d .
in each f i e l d
and
which,
for
i s fundamental
to
must be asked i s This FLUB,
depends on may be
[Figure 5.5. TEXED Machine Instructions
(Instructions preceded by an * are those also available on the FLUB machine)

The figure lists instruction templates, with an apostrophe marking each parameter position, under the following headings:

1. Data Transfer Operations: (a) register to register; (b) memory to register, including the stack operations
2. Integer Arithmetic: (a) VAL field arithmetic; (b) PTR field arithmetic
3. Control Operations: (a) unconditional; (b) conditional
4. Input/Output Operations: (a) character; (b) line
5. Declarations

Operation keywords appearing in the figure include GET, STO, PUSH, POP, SET, CALL, RETURN, STOP, TO ... IF, LOAD, MESSAGE, READ, WRITE, REWIND, BACKSPACE, FORWARD SKIP, CONSTANT, DECLARE, PROC, LOC, END and COMMENT.]
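The templates in Figure 5.5 use an apostrophe to mark parameter positions. In the simple macro processor used to realize FLUB, each apostrophe stands for exactly one character, so matching a statement against a template reduces to a position-by-position comparison. A minimal sketch in Python, covering that single-character form only; the function name `match_template` is hypothetical and is not taken from STAGE2 or from the original macro processor:

```python
from typing import List, Optional

PARAM_MARKER = "'"  # the apostrophe marks a one-character parameter position


def match_template(template: str, line: str) -> Optional[List[str]]:
    """Match a source line against a template with single-character parameters.

    Returns the parameter characters in order of appearance, or None when
    the line does not fit the template.
    """
    if len(template) != len(line):
        return None  # every parameter is exactly one character wide
    params: List[str] = []
    for expected, actual in zip(template, line):
        if expected == PARAM_MARKER:
            params.append(actual)   # capture the character at a parameter position
        elif expected != actual:
            return None             # fixed text must agree exactly
    return params


if __name__ == "__main__":
    # The FLUB statement quoted in the text: jump to label 67 if the FLG
    # fields of registers A and B are equal.
    print(match_template("TO '' IF FLG ' = '", "TO 67 IF FLG A = B"))
```

With the statement `TO 67 IF FLG A = B`, the sketch yields the two label digits and the two register names as separate parameters, which is why FLUB labels were restricted to two digits and register names to one character.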
summarized as f o l l o w s
:
FLG
-
o,
VAL
-
non-negative
1,
2,
3
PTR
-
or s t r i n g address
integers
lengths,
the space r e q u i r e d
us f i r s t
e n q u i r e what i s
FLUB
word.
6
would be s u f f i c i e n t
sisting
Only
for
( T h i s would r e q u i r e greater
the
field,
PTR
word d i r e c t l y dress
we r e s t r i c t
the user o f
onto an
64 18
bit
use of b i t s space;
With
machine.
10
bits
important,
to
i s not
20
when one i s d e c i d i n g required
point if
useful
the v a r i o u s
internal
the o n l y f a c t o r machine.
problems can be s o l v e d .
which d e t e r m i n e s how we
This could be c l a s s e d as o p t i m i -
s i d e o f the
how to l a y out the
coin,
optimization
the r e g i s t e r s ,
fields.
were a v a i l a b l e
for
speed,
to t h i s
during
to save packing
However, some c o n s i d e r a t i o n
choosing a r e p r e s e n t a t i o n
which
can a f f e c t
on the t a r g e t representation
machine.
will
operate
For example, t h e r e would
which o p t i m i z e d the space
to e x t r a c t
subject
the d e c i s i o n
to the c o n s t r a i n t
are represented by successive integers.
fied
with
a restricted incur
character
i s the
The i m p l e m e n t o r i s
0-9
not wish to
of
information
from
fields.
Another f a c t o r character
in
words
bits.
at e v e r y o p e r a t i o n .
no i n s t r u c t i o n s
1024
than the most t r i v i a l
We have a l r e a d y paid some a t t e n t i o n
the f i e l d s
FLUB
than the ad-
has shown us t h a t
other
words b e f o r e
4K
the o t h e r
for
to
(Program addresses could be
must be given to the way the memory access instruction be little
strings
allocated
o f addresses r a t h e r
the d e s i g n o f the machines by p r o v i d i n g or unpacking
set con-
symbols.
to a v o i d c o n s t r u c t i n g
STAGE2
insufficient
map the word of the a b s t r a c t is also
field,
VAL
and a few c o n t r o l
Unfortunately,experience
We need about
for
the
o u r s e l v e s to a c h a r a c t e r
characters).
s e t s the minimum l i m i t
zation
best be
However, l e t
needed to r e p r e s e n t
for
FLG;
digits
an i n d e x to a t a b l e
memory are
Economical
for
will
machine.
we might therefore expect to be able to map the
itself).
FLUB
problems. This
are r e q u i r e d if
than
handled by s t o r i n g of
o f the t a r g e t
the minimum number o f b i t s
2
by the
some o f these q u a n t i t i e s
of upper case c h a r a c t e r s ,
of l e n g t h
I
expression constructed
d e t e r m i n e d by the c h a r a c t e r i s t i c s a
characters
in memory or p r o g r a m , v a l u e o f an i n t e -
ger a r i t h m e t i c user. Clearly
representing
set
that
Although
requiring
c o n v e r s i o n overheads in the
representation free
only I/O
of a
to choose the
the c h a r a c t e r s he may be s a t i s -
6
bits,
he may
to t r a n s f o r m the
hardware
representation
machine with
characters
he may choose Having quire ted
to
fixed
the
field
was n o t
tion
simplifies of
the
entities.
need to fields
most
consider
struction third
which
field.
addition
the
means t h a t
adds one f i e l d
tion
t h e way t h e
to
However,
are
hardware
the
are
same r e g i s -
basic
sepa-
unit
of
instruc-
addition
of
of
PTR
an i n -
result
in
therefore
paying
organized
on
we no l o n g e r
is
can be w o r t h w h i l e
registers
the
the
fields
each
three
instances
and l e a v e
of
the
number o f
and t h e
simply
another
it
is the
we p o i n -
representa-
of
For e x a m p l e ,
fields
VAL
representation
choice.
register
en-
operate
in
as c o n s i s t i n g
This
Both
5.1.,
a uniform fields
next
word t o
instructions
than
of
0-255,
we must
computer
other
can be d e c r e a s e d .
oriented
range
In s e c t i o n
Most
of
the
structures
Further,
a register
rather
in
one t a r g e t
independently
A uniform
first
data
process.
instructions.
logical to
the
on a b y t e
field.
VAL
words).
operations.
the
Thus,
be i m p l e m e n t e d ,
consider
be i m p l e m e n t e d
as s e p a r a t e
the
allocating
The f i e l d in
for
one.
by i n t e g e r s
to
(108
a register
rate
to
bits
translation
We can t h e r e f o r e
information
of
prohibitive
ter.
tions
8
will
cost
internal
represented
assign
registers
that
one f i e l d
the
a representation
how t h e
out
to
the the
some a t t e n -
on t h e
target
machine, In s e c t i o n ons.
3.2.,we
register, stract
the
only
machine
multiple ters.
This
the
all
of
could
cost
of
practice.
the
16
and
mapped u n i f o r m l y use o n l y chine,
we w i l l
though
this
need in
is
core. hold 16
memory. sufficient,
FLUB
onto in
in
the
On a
bits This in
24
data for
bit
the
leaves
practice,
it
describe
arithmetic
for
register,
not
on a
in order very
assigned
fields
one m i g h t
the
in
was used t o
second word were
field is
regis-
We w i l l
The f i r s t
However,
bits
to
speed
have been a p p l i e d
machine,
PTR
6
hardware
The r e g i s t e r
structure,
ab-
section.
a single word.
the
execution
process. this
of
machines with
may be p o s s i b l e
increase
respectively.
the
it
for
directly
principles
two b y t e s
fields
file,
in
One w i t h
each
and t h e
VAL
into
general
Modular for
later
arithmetic
registers
However,
translation
more d e t a i l
bit
one word t o
any l o c a t i o n
registers
a considerable
how t h e s e
field
PTR FLG
in
in
hold
the
abstract
way.
organizati-
or a single
t o map t h e
a register
a more c o m p l i c a t e d
On a
is
or
required
the
action
registers
two words were to
of
a uniform
result
us now c o n s i d e r
register/processor registers
memory i n
one such a p p l i c a t i o n Let
various
no p r o g r a m m a b l e
course
into
arithmetic
map some o r at
categorized
For machines with
VAL
were
expect 32K
to
to
ma-
access
field.
convenient
Alsince
could
STAGE2
tained
not
we have i t
the
at
and
sight,
If
may n o t two w o r d s
in
the
we pack t h e
storing
three
an a d d r e s s ,
start
of
ros w h i c h rations the
which
contents
memory mapping address tents
of
a
an a c t u a l
size
on
obtain
bits
bits
24-bit again
here
is
adwe
a situa-
An i m p l e m e n -
are for
a full
using
the
Either actual
Notice the
This
of
the
an a b s t r a c t
of
the
opefrom
that
the
target
add
I
next
abstract
address
can
two z e -
address also
adthe
This in
number o f
need t o
for
that
t h e n ends
the it.
address
a sign.
word b o u n d a r y .
Although
we o n l y
available
by a r r a n g i n g
Each a d d r e s s
contains
the
the
for
each
of
packing
rations.The with
program
as
choice the
to
the
rather
is
available
be a s e r i o u s
to
store
one h a l f to
the
conword.
than
If
bits,
thereby
halving
to
incur
If
a d o u b l e word i s
i,
with 2
and u n p a c k i n g . CHARACTER
how t h e
the
and
optimize.
VAL
bytes,
or
LOAD
and
penalty, PTR
the if
also fields
t h e n we can a v o i d
Memory/registers
abstract
and w i l l
FLG,
4
In
we i n c r e a -
t h e n we w i l l
also
this
the pos-
program.
limitation.
more a d d r e s s word,
of
can be i n c r e a s e d . by
to
only
We may be p r e p a r e d
implementor
he w i s h e s
bits,
each a b s t r a c t
register
MOVE
of
to
field
PTR
respectively
be i m p l e m e n t e d
22
System/360
program
implement
to
proved
memory s p a c e .
overheads
the
4,
field
PTR
has n o t
of
represented
rests
to
before
is
field
use two w o r d s
speed o f
used t o
to
PTR
space
this
available the
23
be c h a n g e d .
word
the that
some a d d i t i o n a l
and may be d i s c a r d e d .
field
must
of
21
memory must now c a l c u l a t e PTR
the
address
have to
cost
bits which,
PTR
possible.
must be r e s e r v e d
of origin.
field
the
is
I0
one.
practice, se t h e
bit
corresponds
factor
PTR
By r e s t r i c t i n g sible
I
word
the
hold
3 fields.
System/360.
for
However,
the
the
we need
bits to
fields.
one w o r d ,
choice
units/abstract
Effectively,
into
an e f f e c t i v e
access
hold
conclude
and speed at
to
con-
CDC 3 2 0 0 ,
code.
no i n f o r m a t i o n
of
22
therefore
space
any l i n e the
as an e x a m p l e
be s u f f i c i e n t
three
if
like
character,
leaves
to
FLUB
carry
saves
since
may be i n c r e a s e d each
to
the
fields
dress
be done by s u i t a b l e
for
b e t w e e n space
generated
input
words
we can t a k e
We m i g h t
for
card
24-bit
This
appear
machine.
can be made w h i c h
complexity
use two
fields.
where a t r a d e - o f f
tation
to
machines,
VAL
used on t h i s
must a l l o c a t e tion
best
normal
Hence on a m a c h i n e
hardware representation
FLG
first
dress
it
process
characters.
32-bit
we use t h e
for
64
found
M o v i n g now to If
be u s e d t o
more t h a n
and
machine will
depend on w h i c h
transfers STORE
the
can now
MULTIPLE
ope-
be r e p r e s e n t e d characteristics
of
For machines w i t h ly
sufficient
a word s i z e
to
hold
pend on what i n s t r u c t i o n s In s e c t i o n
5.1.,
have e n c o u n t e r e d
as a c o l l e c t i o n in ANSI
of
of the
efficient
quests,
is
in
as r o u t i n e code.
re and i m p r o v e s Now l e t
to
efficiency
implications of
little
respect
value
use or t o o
large
n i q u e s we w i l l
mization.
used by h i g h
to
the
ke i n t o
improve
also
readily
able
to
analysis
It
both
reduce
control
the
size
As we w i l l the static the
to the
a program to
re-
calls.
This
fetch
could
as
or sto-
be i m p r o v e d increase
efficiency
Clearly
this
is
the
too
since
that those
of
carried
takes
the
place,
of the
out only
For i n s t a n c e , cost
the techniques
machine on t h e
real
between space and speed. various
into
programmer i s
program a t t h e
demonstrate,
opti-
processes
program and do n o t t a k e Further
it
the tech-
improve the efficiency are u s u a l l y
a
s l o w to
heading
to
has
is
software
it
refers
machines
be to
and dynamic c h a r a c t e r i s t i c s
satisfy
I/O
operations
will
find
under
of the
the abstract balance
this
may be argued
classified
of
a r e needed
abstract
portable
and t h e n
t h e way t h e o p t i m i z a t i o n
t h e mapping o f
adapt
producing
dynamic c h a r a c t e r i s t i c s .
utilization.
account
on any
all
I/O
where t h e t e r m
Such i m p r o v e m e n t s
a static
control
code,
compilers
that
realizing
term conventionally
language
its
of
Our o b j e c t i v e
t h e memory.
are best
of
can be i m p l e m e n t e d
space and speed.
one i s
logic
has been t h a t
as a s i m p l e
ways i n which
both
exists
the
package c o n s i d e r a b l y .
problem
a program
in
this
code.
of
to if
fit
choose to
CPU
her
level
any o f
he c a n n o t
to
of
the
port
describe
object
on t h e b a s i s unable
to
However,
generated account
to
implemented
subroutine
operation
machiWe do
A version
record
buffer
and c o n s i d e r
very necessary objective is
are via
a character
of the generated
with
is
on t h e
efficiency
computers
abstract
system quickly
the difficulties
line
fields.
situation.
Our e x p e r i e n c e
operations,
but
again
these
The s y s t e m i s
the
de-
t h e p r o b l e m s we
arguments.
implementing
usual-
will
the
as a document d e s c r i b i n g
t h e two machines we are s t u d y i n g .
the
here.
In t h e new v e r s i o n ,
operates
the
us r e t u r n
on a c t u a l for
calls
This
of
improve the
integer
both
One o f
even c h a r a c t e r - o n l y
line
detail
of bits
v e r y s l o w and a s s e m b l y code v e r s i o n s
operations.
imposes s e v e r e o v e r h e a d s . remain
one word i s
pack and unpack
system for to
FORTRAN c o m p i l e r .
has a
FORTRAN v e r s i o n
for
I/O
with
serves
package and a means f o r
machine which the
the
subroutines
bits,
made some m e n t i o n
any g r e a t
which
FORTRAN
to
have been t a k e n
to go i n t o
32
The a l l o c a t i o n
are a v a i l a b l e
implementing that
than
fields.
we have a l r e a d y
nes and t h e s t e p s not propose
larger
the three
constraints
of a higwe use t a -
of
a program
one.
We can
Thus we are imposed by t h e system
or hardware of the target machine. Earlier we defined adaptability as the property of a program which allows it to be altered to fit differing user images and system constraints, and suggested that it is largely concerned with changes in the structure of the algorithm. In the following examples, the transformation of the program will not involve changes in the sequence of instructions; rather, it will only affect the way the program is realized on the target machines. However, a key factor will be the ease with which we can change the code generation of the translator. It is for this reason that we choose to discuss the techniques under the heading of adaptability. We will describe three experiments which illustrate how the following objectives were achieved:

(a) minimization of CPU utilization
(b) minimization of core occupancy
(c) balancing space versus speed.

The first experiment was carried out for both FLUB and TEXED on the ICL KDF9. This is a 48-bit machine which has 15 registers called Q-stores and two stacks, one for arithmetic operations and the other for subroutine linkage. Each register is divided into three 16-bit fields and there are arithmetic and conditional instructions which operate on these fields. The match between the abstract machines and the real one is very good but fortuitous. As we have described, the design of FLUB and TEXED was based on the requirements of the problems and not on ease of mapping onto the KDF9.

The choice of a representation for the abstract word in terms of the real one was straightforward. The machine has 32K words of memory and 16 bits are therefore sufficient for the PTR field. Since information may be transferred between memory and the registers via the stack, the other fields were also allocated 16 bits each.

The choice of a representation for the abstract registers was not so obvious. We wished to make use of as many of the real registers as possible but the problem was to decide which of the 36 registers of the abstract machine should be mapped directly onto the hardware. We first made a naive realization in which each of the register fields was allocated one machine word. Once the program was running, we altered the macros so that extra code was generated to gather statistics. The resulting program was then run on a comprehensive set of data to measure which registers were used most frequently. This data was combined with the results of a static analysis of register usage and used to allocate Q-stores to the registers of the abstract machine.
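The allocation step just described can be sketched as follows. This is an illustration only: the text does not say how the dynamic measurements and the static analysis were combined, so the weighted sum used here, the function name `allocate_q_stores`, the register names in the example and the labels Q1 to Q15 are all assumptions.

```python
from typing import Dict, Mapping


def allocate_q_stores(
    dynamic_counts: Mapping[str, int],
    static_counts: Mapping[str, int],
    num_q_stores: int = 15,        # the text gives 15 Q-stores on the KDF9
    static_weight: float = 1.0,    # assumed weighting of the static analysis
) -> Dict[str, str]:
    """Choose which abstract registers are held in hardware registers.

    dynamic_counts: uses of each abstract register measured while running
                    the instrumented program on test data.
    static_counts:  occurrences of each register in the program text.
    Returns a mapping from abstract register name to an assumed Q-store label.
    Registers not in the result stay in memory, as in the naive realization.
    """
    scores: Dict[str, float] = {}
    for reg, count in dynamic_counts.items():
        scores[reg] = scores.get(reg, 0.0) + count
    for reg, count in static_counts.items():
        scores[reg] = scores.get(reg, 0.0) + static_weight * count

    # Highest score first; ties broken by register name for repeatability.
    ranked = sorted(scores, key=lambda reg: (-scores[reg], reg))
    return {reg: f"Q{i + 1}" for i, reg in enumerate(ranked[:num_q_stores])}


if __name__ == "__main__":
    measured = {"A": 100, "B": 5, "C": 50}   # made-up figures for illustration
    in_source = {"B": 10, "D": 1}
    print(allocate_q_stores(measured, in_source, num_q_stores=2))
```

In this toy run only two hardware registers are available, so A and C, the two most heavily used abstract registers, are the ones mapped onto hardware.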
The macros were then rewritten to generate code sequences which depended on which registers were involved in the operation. At the same time, other forms of optimization were introduced (use of immediate operands 0 and 1, etc.). Subsequent measurements showed that the optimized programs were about twice as fast as the unoptimized ones. It is important to realize that, because of the ease with which one can change the code generators in STAGE2, only a few man-days of effort were required
to produce the optimized versions. The degree o f "largely
optimization
due t o
However,
the
the question
also
arisen
ters,
there
on o t h e r On t h e
little
since
value
In t h e
machines. other
are a l r e a d y
subroutine
linkage
At t h e
"into a s m a l l
interactive
time,
a version
a s s e m b l y code w i t h
a batch
partition.
for
is
be
of
accumula-
was to
reduce
the size
similar
to
and we wanted to
continue
testing.
wherever possible.
fit
MITEM
We a l r e a d y 8
bytes
However,
it
of had
and o p t i would
on-
of core required by the program was to realize
TEXED
as an i n t e r p r e t e r .
The form o f
the structure
t h e code to
accesses
108
number o f
constants.
total
to
register
constants. gram,
address Although
there
since
are less
are s t r i c t l y
by a s i n g l e
with
program,
in
the
and i n d e x e d
less
than
are
more t h a n
than
this
number i n
no d i r e c t is
jumps
procedure
access
Since
operand o f
256 or
all
byte register
labels
an a r r a y
is
the
therefore fields
and
in the entire
pro-
Further, la-
can a l s o
be s p e c i f i e d
of
local
addresses
than
256
are l e s s
can be s t o r e d other
so t h a t
o u t o f any p r o c e d u r e ,
any l a b e l
a procedure
shows t h a t
I of
any one p r o c e d u r e .
there
addresses
constants
Only array
into
means t h a t
each p r o c e d u r e .
instructions
a global
used t o
I00
256.
there
t h e ab-
TEXED
t h e use o f an u n s p e c i f i e d
uses a b o u t
in
by a l - b y t e
remaining
is
local.This
res
and p e r m i t s only
of
o f t h e program i t s e l f .
an o p e r a n d
b y t e which
associated
the
fields MITEM
contains
MITEM
reflects
characteristics
number o f o p e r a n d s
required
bels
be i n t e r p r e t e d
taken
MI-
that
down t h e amount
machine and t h e
The approach
of
to c u t
stract
in
to
regis-
should
approach
one word mapped onto
mized to use i m m e d i a t e operands run
this
required
has
128
registers
an a r c h i t e c t u r e
c o r e was l i m i t e d partition
registers
etc.
the objective
a machine w i t h
machines.
with
ATLAS
abstract
System/360,
System/360.
ly
hardware
for
ICL 4/50
in
use o f
how t h e
was o f c o u r s e
and a b s t r a c t
For example on
as to
hand,
second e x p e r i m e n t ,
on an
the
experiment real
so many r e g i s t e r s
base r e g i s t e r ,
TEM
in this
between t h e
of optimizing
was no q u e s t i o n
implemented. tors,
achieved
similarity
call.
on a
procedu-
global
array
An e x a m i n a t i o n
operands
can a l s o
of
be r e -
p r e s e n t e d by a s i n g l e
byte.
With a u n i f o r m r e p r e s e n t a t i o n instructions
o f the r e g i s t e r
ber of operands in an i n s t r u c t i o n either
arrange f o r
use a f i x e d latter
operators
format.
course.
operation
In t h i s
TEXED
3
the
few man weeks of effort, I/O) 1K
which
is
40 %
b l e one.
p e n s i v e on first
obviously
interpretive
rest
in a p a r t i c u l a r
This
the macros
required
16K
bytes
only a
(excluding
itself.
We e s t i m a -
than the d i r e c t l y it
executa-
i n speed.
version
on how i m p o r t a n t
installation.
time may be b e t t e r
therefore
inner loop.
is about an order of magnitude more utilization
cPU
e x a m p l e , we were o p t i m i z i n g One i s
code.
of an
wastes some
c o m p l e t e , we then r e w r o t e
of a p i e c e o f p o r t a b l e
it
i s to have t h a t
An i m p l e m e n t a t i o n which
than no program at a l l .
two cases we have c o n s i d e r e d so f a r ce.
and speeds up i t s
program occupied
version
r e s p e c t to'
CPU
we could
we chose the
representation
to implement the i n t e r p r e t e r
to use a f u l l y
software will available
3,
b y t e s long and c o n s i s t s This
We have saved space and paid f o r
Any d e c i s i o n
and
number of operands or
o f the s i z e of the o p t i m i z e d assembly code v e r s i o n .
the i n t e r p r e t i v e
expensive with
4
0
28
Since the num-
again because o f the ease w i t h which we could
b y t e s were r e q u i r e d
te t h a t
we found t h a t
interpreter,
interpretive
The r e s u l t a n t
STAGE2.
is
operands.
interpreter
into
between
v e r s i o n of the
the i n t e r p r e t e r
With the d e s i g n of the to t r a n s l a t e
varies
to take a v a r i a b l e
Each i n s t r u c t i o n
code and up to
space but s i m p l i f i e s
adapt
fields,
were r e q u i r e d to implement the i n t e r p r e t e r .
program is ex-
However, the
r e p r e s e n t extreme p o s i t i o n s .
for
speed and in the second,
tempted to ask whether i t
is
possible
In the for
spa-
to produce a
v e r s i o n of the program in which these two r e s o u r c e s are balanced to suit
the needs of a p a r t i c u l a r
ted to answer in the t h i r d Once an i n t e r p r e t i v e relatively
set of t e s t
STOP
routine
refer
not to i n d i v i d u a l running
found t h a t small
data,
and the n e x t .
areas.
is
a
The n e c e s s a r y code to g a t h e r s t a t i s of the
the h i s t o g r a m .
and the
For c o n v e n i e n c e , t h i s
but to sequences of
When t h i s
interpreter
interpreter
These were the r o u t i n e s
in the l a s t of
can
instructions
o p e r a t i o n was c a r r i e d
described
the program was spending a major p o r t i o n
localized
then i t
a f r e q u e n c y h i s t o g r a m showing the
instructions
under the
available,
i s obeyed when the program i s p r o c e s -
the main loop
m o d i f i e d to o u t p u t
between one l a b e l MITEM
instruction
can be inserted i n t o
i s what we have a t t e m p -
experiment.
s i m p l e m a t t e r to o b t a i n
sing a standard
This
v e r s i o n o f the program i s
number o f t i m e s each tics
environment.
its
out f o r
example, we t i m e in
to s e t up the c u r r e n t
li-
ne as a l i s t was c l e a r
and t h e
that
if
interpretively pretive
one t o
these
while
Form,
the
increase
version
the
program.
stage
was t o
The f i r s t
in
a variable
number o f
tation
reduced
the
but
same t i m e ,
space
to
quired
by t h e
terpretive
the
Before of
code.
the
other
the
first
pletely
commences,
in
second in
is
program produced
required
space
needed f o r
now o n l y
I0
speed t o
suit
select
their of
powerful for
these ones
for
the
sion
about
final the
form
system
the
until
and t h e
re-
the
in-
macros w h i c h code as
in
but
to
define execu-
a switch
to
w h i c h matches mode o c c u r s
process
can r e a d i l y
the
is
com-
be a l t e r e d
that
routines
fully is,
code.
Optimized the
reduction
However,
of
balance
for the
code
using
u-
code.The
interpretive
the
assembly
expense
which
executable
in
the
version.
between
the
extra
program
more c o r e .
is
This Thus an
space
and
requirements. suggest
that
a piece
of
these
techniques
software
F o r e x a m p l e when c o n s t r u c t i n g with
of
directly
directly
as t h e
appropriate
tailoring
optimality. proceed
its
at
experiments
one c o u l d of
the
space
into
code has c o m p e n s a t e d
executable
own p a r t i c u l a r
tem,
remainder
than
frequently
size
normal
with
into
same s i z e
interpretive
be r e d u c e d
to
At
more
read
generation
MITEM
second e x p e r i m e n t ;
directly
slower
could
The r e s u l t s criteria
the
of
translated the
further. enable
the
encountered
program
took
interpre-
pairs.
version
still
the
only
executable
are
Reversion
form of
label
time
the
the
can e a s i l y
installation
is
in
for %
of
altered
interpretive
Thus t h e
with
amount o f
directly
pairs
is
for
the
be t r a n s l a t e d
mode i s
pairs.
and t h e
CPU
label
inter-
a hybrid
contained
the
the
operators
to
bytes,
We t h e n
when a l a b e l the
which
produce to
detected.
a hybrid
the
space
very
one o f
a new s e t
se up most o f
are
generation
place
label
We have p r o d u c e d resu!tant
30 %.
in
code s t i l l
1700
than
produce
overheads
although
to
code t o
to
so t h a t
instructions
It
rather
considerably
out
was i n c r e a s e d
was t h a t
which
The normal
set
interpretive
the
mode t a k e s
by r e a d i n g
figure
the
increased
program
program
pattern.
directly
p r o g r a m was l e f t
increased
operators
by
a particular
interpreter
translation
parameterized
version
the
of
interpretive
label
when t h e
up t h e
This
result
interpreter
this
areas
table
size
number o f
code was r e d u c e d
generated well.
rewrite
operands.
The n e t
the
We t h e r e f o r e
be saved by o p t i m i z i n g
used c o n s t a n t s .
for
be e x e c u t e d
of
speed
size.
the
the
a line
could
remainder
t h e n we c o u l d
a marginal of
search
routines
writing after
of it
to
be
meet d i f f e r i n g an o p e r a t i n g
a module and d e f e r
has been i n t e g r a t e d
appropriate
could
measurements
sys-
any d e c i with
made.
the
Infrequently used modules could operate interpretively to minimize core and backing store occupancy; critical sections could be optimized for speed to maintain throughput and response; other modules could exist in a hybrid form.

However, there still remains the problem of what to do with a program which has been compressed in size as far as possible, yet is still too large to satisfy some constraint on space. The only solution here appears to be to construct the program in the first place so that it is adaptable. By automatically changing the algorithm, it may then be possible to reduce its demands for space. Unfortunately, this is not something that can be done as an afterthought. It must be built into the design of the program in the first place. Again, we will use MITEM to illustrate an approach to the problem of constructing adaptable software.

A text
editor
is
on most o n l i n e sert
a fairly
systems. They v a r y from s i m p l e e d i t o r s
and d e l e t e
lines
complex p a t t e r n .
a position
In w r i t i n g it
few f a c i l i t i e s ,
m i g h t be too
ing adopted on the grounds b l e to
for
a number of o p t i o n s neration,
to
its
it
users.
is
STAGE2
quences'within
enough.
line
oriented,
to s e l e c t
contains
6
6
ding
I/0,
on
(which m a n i p u l a t e s up to
8
G
is
the s i z e of
MITEM
are r e q u i r e d .
what core i s
MITEM
6
4
available
is
hea-
3
32
times t h a t
up to the
and what f a c i l i t i e s
MITEM
streams and c h a n n e l s )
streams and or
when more e s o t e r i c
The c h o i c e
appropriate
programs v a r y from
one might e x p e c t to make M I T E M 2
use and o n l y use
he wants and
and d e c l a r a t i -
i n c l u d e d or i g n o r e d a c c o r d i n g
The r e s u l t a n t
operating
Before ge-
r e q u e s t e d are i g n o r e d .
statements with
may then be s e l e c t i v e l y
(a s i m p l e c o n t e x t e d i t o r MITEM
wi-
v e r s i o n s and text.
Routines
it
i s a l s o p o s s i b l e to change code se-
by p r e f i x i n g
declarations.
include
those which
the one body o f
accordingly.
it
Further,
might be i m p o s s i -
Our approach was to
the v e r s i o n and o p t i o n s
a routine
Such l i n e s
to the i n i t i a l
for
text
too
in the program not be-
flip-flop,
MITEM
a
in some s y s t e m s ; too
was not p o w e r f u l
system.
for
the problem c o n t a i n s
the user d e c l a r e s which v e r s i o n and what o p t i o n s
ons not r e q u i r e d Since
by s e a r c h i n g
we had to plan to
If
interrupt
incorporated within
then adapts the i n p u t
STAGE2
of text
result
and a l l o w an i n s t a l l a t i o n
shed to make a v a i l a b l e
ties
that
example the
number to p r o -
editor,
l a r g e to i n c l u d e
implement on a p a r t i c u l a r
many f a c i l i t i e s
cally,
text
on the o t h e r hand, might
some f a c i l i t i e s ,
ders.
in a f i l e
a portable
a wide range o f user r e q u i r e m e n t s .
many f a c i l i t i e s ,
which m e r e l y i n -
on the b a s i s o f an a s s o c i a t e d l i n e
grams which can l o c a t e satisfy
common module which one w~uld e x p e c t to f i n d
3
of
text
it
MITEM
I.
for
manipulation
installation
up to
channels).
available
I
ExcluTypi-
everyday facili-
and depends on
decides should be made
available
to the u s e r s .
5.3. REVIEW AND EVALUATION
We have d e s c r i b e d two a b s t r a c t and a d a p t a b l e s o f t w a r e .
models have been moved e a s i l y to meet a v a r i e t y plementations Typical
machine models used to produce p o r t a b l e
We have noted t h a t
from one machine to a n o t h e r and m o d i f i e d
of user requirements
have r e s u l t e d
the programs based on these
and system c o n s t r a i n t s .
in u s a b l e s o f t w a r e
of
reasonable efficiency.
i m p l e m e n t a t i o n t i m e s have been o f the o r d e r o f
regardless
o f the s o f t w a r e
of the t a r g e t
1-4
around.
least
would be r e q u i r e d
one o r d e r of magnitude more e f f o r t
its
problems. itself.
linked
with
algorithm
A major d i f f i c u l t y This
is
the a l g o r i t h m
grams on a number o f machines. and not t a k i n g
sufficient is
not p r e c l u d e
We b e l i e v e
relationship
account
that
by the c u r r e n t
for
have to accept some degree o f
then we w i l l it
is
ent than no s o f t w a r e
there
better
the
these d e f i c i e n c i e s
were
of r e a l
today's
machines.
machines,
our mo-
We hope t h a t
this
to have w o r k i n g
Although
l y make use o f the
inefficiency.
software,
albeit
o f machiHowever, ineffici-
at a l l .
Another major problem w i t h
the
approach
is
each machine i s
me to
implement a n o t h e r one. and
Although
that
it
leads to
a diversity
in the c o s t o f r e a l i z i n g
macros developed f o r
FLUB
nes, i t
after
are marked changes in the s t r u c t u r e
o f language and hence to an i n c r e a s e
TEXED
until
i m p l e m e n t i n g the p r o -
architecture.
nes,
machines.
ma-
intimately
us from moving the programs onto the n e x t g e n e r a t i o n if
stract
is
visible
o f the s t r u c t u r e
o f computers b u t , we s u g g e s t t h a t
i s not w i t h o u t
o f the model to the problem
to produce s o f t w a r e
d e l s must be i n f l u e n c e d
and r e l i a -
For each of our models, we have noted d e s i g n
which have become a p p a r e n t t h r o u g h
Since our o b j e c t i v e
at
to recode the
in the design o f the a b s t r a c t
and may not be c l e a r l y
caused by our emphasing the
will
lies
the approach
not an easy t a s k s i n c e the design
has been encoded.
deficiencies
We e s t i m a t e t h a t
a new machine and a c h i e v e comparable e f f i c i e n c y
However, as we have noted in our d i s c u s s i o n , chine
man weeks
computer but assuming reasonab-
l e access to the machine and a good t u r n programs f o r bility.
The im-
easy to
realize,
one a b s t r a c t
this
i s not
many ab-
we can r a r e -
machine when we co-
strictly
true
for
because of the close s i m i l a r i t y between the two machi-
becomes very apparent when one compares them with AIMS, a machine developed to implement an interpretive BASIC system [32]. Yet when
we e x a m i n e a number o f set
of
basic
lies
in
dies
this
flect
creating
and e n s u r e to
implement.
yet
In S e c t i o n
still
design
of
this
that
suggests of
problem.
special are
theme
there that
is
a common
t h e way ahead
m a c h i n e model Which
capable
programs
pursue
LEVEL
5.
bodies
this
the
features reason
LANGUAGES
we p r e s e n t e d
were d e s i g n e d
adequacy of
process
common s e t
discussed from
the
approach
being This
purpose
both
embo-
extended
to
re-
model
can s e r v e
abstract
machines
portable
and i n e x p e n s i v e
i n more d e t a i l
characteristics for
must,
in
the
subse-
ware.
We a r e
be s t r u c k
as t h a t
in
object
framework
types
this
the
in-
w h i c h em-
and o r g a n i z a t i o n a l
framework,
summarised
by Goos i n
to
in
his
we s h a l l
Section
discussion
3.1. of
upon c o n -
the
by Goos.
This by
belief
that
must the
of
portable
technique.
translator,
STAGE2,
be i g n o r e d .
such c o n s t r a i n t s
applications
be made a v a i l a b l e translator
should
and p o r t a b i l i t y .
intended
implementation
a more c o m p l e x
can be t r a n s l a t e d
with
code e f f i c i e n c y
bootstrap is
convenience
conjunction
upon t h e
STAGE2 could
step
disussed
age w h i c h
machines out
w h i c h was based p r i m a r i l y
programmer
committed
full
how
The n e x t
that
depends
firmly
be based on t h e we e x p l a i n e d
data
computers
B),
abstract
a uniform
To b u i l d
one t a k e n
be e v a l u a t e d
complexity,
to
real
the
early
We have p o i n t e d
operations,
(Goos
imply
however,
lance
need f o r
MACHINES
programmer.
mean t o
translator
of
basis.
3.2.
of
from
language
We do n o t
basic
Section
properties
ABSTRACT
two e x a m p l e s
and t h e
of
in
differs the
FOR
on an i n d i v i d u a l
venience
ter.
the
is
a particular
resulting
We s h a l l
LOW
which
It
which
we f i n d This
abstract
purpose
of
for
the
types.
sections.
6.
This
general
point
that
designs
and d a t a
characteristics
as a s t a r t i n g
quent
a
common s e t ,
the
different
operations
on t h e
soft-
software
must
target in
4.
compu-
a language
be e x p r e s s e d tool
The bathe
In S e c t i o n
say f o r
only
as
such
a langu-
guaranted
to
be
available.
6.1. THE BASIC HARDWARE MODEL

[Figure 6.1. The Basic Hardware Model. Labelled elements: MEMORY, PROCESSOR, ACC, STACK. Annotation: "An unspecified set of registers and toggles internal to the processor."]

Figure 6.1 shows the register/processor organisation on which we shall base our low level language framework. The picture is mainly intended to provide a concrete peg upon which to hang our concept of evaluation. We shall explain below how we can relate this model of evaluation
to the major r e g i s t e r / p r o c e s s o r
tion
3.2.
cessor is strued
In o r d e r to d i s c u s s capable of p e r f o r m i n g
The p r o c e s s o r accepts the o t h e r
subtraction), taken from
is dyadic,
(MBR).
It
(ACC) and
returns
any r e s u l t
(for
assumed to be in
operand f o r
to a s i n g l e
example, a and the
ACC
a monadic o p e r a t o r
is
is
after
the n e g a t i v e of i t s
In the t h i r d
the operand i s
MBR,
case the p l a c e d in
is executed.
A load i n s t r u c t i o n
r e p l a c e s the c o n t e n t s
would r e p l a c e the c o n t e n t s
operand.
of
two cases the operand i s
m o d i f y i n g t h e operand in some way.
instruction
of
always with
AaC
For example, a of
ACC
There i s one load i n s t r u c t i o n
with
which does
operand.
There i s one s t o r e
instruction.
or the top of the s t a c k . the s p e c i f i e d
The operand may
last
the o p e r a t o r
or the c o n t e n t s
Then the o p e r a t i o n
It
is
an operand.
is executed. to
the o p e r a t o r If
also provided.
an operand as above.
its
In the
popped up.
load o p e r a t i o n s
operand,
specifies
If
an operand.
of a memory l o c a t i o n ,
are t r a n s f e r r e d is
instruction.
does not s p e c i f y
and the o p e r a t i o n
ACC
load negative
contents
s h o u l d be c o n -
or, in fact,
i s not commutative
operand i s
o f the s t a c k .
and the s t a c k
not m o d i f y
register
The s i n g l e
the c o n t e n t s
MBR
of
specifies its
left
MBR.
then the i n s t r u c t i o n
p l a c e d in
A set of
This
one from the a c c u m u l a t o r
the o p e r a t i o n
corresponds
the top l o c a t i o n
ACC
If
the i n s t r u c t i o n
be a c o n s t a n t ,
contents
arithmetic.
the p r o -
ACC.
Each o p e r a t o r monadic,
assume t h a t
of other operations
two o p e r a n d s ,
then the
operand i n
summarized in Sec-
capable od integer arithmetic.
from the memory b u f f e r
to the a c c u m u l a t o r . right
integer
to mean that it is incapable
that it is necessarily
organisation
the model, we s h a l l
It
specifies
The c o n t e n t s
memory l o c a t i o n
or pushed onto the top o f the s t a c k .
an a c c e p t a b l y e f f i c i e n t
by a s t o r e
realization
possible
for
any o f the r e g i s t e r / p r o c e s s o r
Section
3.2.
M o r e o v e r , we a s s e r t t h a t
chine code to t a r g e t
a memory l o c a t i o n
of the a c c u m u l a t o r are s t o r e d
of the a c c u m u l a t o r are not a l t e r e d
We a s s e r t t h a t
either
The
operation.
of t h i s
organizations
model
is
d i s c u s s e d in
the c o n v e r s i o n of a b s t r a c t
computer i n s t r u c t i o n s
in
can be c a r r i e d
ma-
out by a simple processor.

Stack-organized computers fit the model almost exactly. The only mismatch is in the load and store operations: usually a load instruction on a stack machine pushes down the stack, saving the contents. In our model, the load operation destroys the previous contents of ACC. Conversely, the store operation of a stack machine usually pops up the stack. This means that the value is lost, whereas in our model the contents of ACC is preserved.

These differences provide no serious obstacles. Stack machines have instructions for duplicating the top element and erasing it. Our load operation could then be realized by erasing the top element before loading, and our store by duplicating before storing. We would not actually generate the duplicate and store until we had looked ahead to the next instruction. If it is a load, then both the duplicate and erase can be omitted.

The match
is
register.
also quite
good f o r
a computer w i t h
Here the o n l y problem i s
the s t a c k ,
a single
in memory. There i s no need to s i m u l a t e the d e t a i l e d stack pointer
at run t i m e ,
o f most of the changes. tion
with
its
mits,
specifiing
say, a
a non-commutative o p e r a t o r
subtract
register
The r e m a i n i n g t h r e e le arithmetic
All
single
which
register
simulates
of a computer w i t h instruction. instructions, file
is
instruction
in the s t a c k ) . operation.
similar
[33,34]
variants
If
the s i n g -
register
is
to s i m u l a -
or s t o r a g e
loca-
ahead to the n e x t s t o r e all
operation for
multiple
(which may
intervening
as
ACC,
registers
s e l e c t e d to s i m u l a t e is
and
ignored,
or ACC.
and a n o t h e r
too many t e m p o r a r y r e g i s t e r s
f r e e d by a c t u a l l y to i n d i c a t e
of
For example, in the case
whose operand i s the s t a c k
then one i s
There i s e v i d e n c e
is
the
can take p l a c e .
Simply t r a n s l a t e
Optimization
ACC.
of
loaded w i t h
some memory l o c a t i o n
: one r e g i s t e r
s e l e c t e d to s i m u l a t e
are r e q u i r e d ,
time.
we would look
specifies
tedious
computer p e r -
more complex o p t i m i z a t i o n
but a l l o w the
to v a r y w i t h
no r e g i s t e r s ,
a bit
The b a s i c t e c h n i q u e
using the operand o f the s t o r e
then omit the s t o r e
register
computer, AaC
That i n s t r u c t i o n
be a s i m u l a t e d e n t r y
A store
in connec-
the c o n t e n t s
are e s s e n t i a l l y
of them r e q u i r e
facilities.
is
and then the r e g i s t e r
b e f o r e the o p e r a t i o n
organizations
register.
to make b e s t use o f t h e i r
a register
keep t r a c k
below,
Unless the t a r g e t
operation,
f r o m memory
must be s t o r e d ,
the operand from the s t a c k
te the
can e a s i l y
say more about t h i s
operand i s the top o f the s t a c k .
arithmetic
tion
movement o f the
procedures.
An i n s t r u c t i o n if
s i n c e the t r a n s l a t o r
We s h a l l
arithmetic
which must be s i m u l a t e d
that
storing
contents.
complex o p t i m i z a t i o n
faci-
lities
do not pay t h e i r
a p p l y to the t h r e e
way in
cases d i s c u s s e d above.
code could answer t h i s d i c a t e what s o r t
improved code.
question,
We need not s p e c i f y
results
may w e l l
Only measurements on a c t u a l
and we f e e l
of o p t i m i s a t i o n
Similar that
we must at l e a s t
in-
is possible. the b a s i c model.
Diffe-
r e n t memory o r g a n i z a t i o n s
can be accommodated by the a d d i t i o n a l
regi-
ters
to the p r o c e s s o r .
and t o g g l e s
ters,
a memory o r g a n i z a t i o n
internal
t o g g l e s which d e t e r m i n e the l e v e l
Remember t h a t abstract
this
machines;
specifies
model i s it
is
not a p a r t i c u l a r
o n l y the core d i s c u s s e d
abstract
in S e c t i o n
which to d e s i g n
machine, s i n c e
chanism, f o r
the reasons d i s c u s s e d single
instruction.
This m o d i f i e r
The c o n t e n t s
sumed to
be a signed i n t e g e r )
instruction
in S e c t i o n
altered
the c o n t e n t s
by adding to
it
Instructions
i s a t t a c h e d to the normal
a memory l o c a t i o n
of the s p e c i f i e d
memory l o c a t i o n
address.
After
o f the memory l o c a t i o n
the signed i n t e g e r
two c a s e s , the c o m p u t a t i o n
in the p r o p e r
location.
index m o d i f i c a t i o n ,
memory l o c a t i o n index is carried
specified
specified
If
the t a r g e t
still
3.2. call.
we argued t h a t This
has
in a s t r a i g h t f o r w a r d 3.2.
In each o f
out and the r e s u l t
computer a c t u a l l y is
this
em-
loaded from the
modification
in the i n d e x r e g i s t e r .
machine i n s t r u c t i o n s
The normal o p t i m i z a t i o n s
computer has m u l t i p l e
In S e c t i o n
as-
used as an i n d e x
Since any a l t e r a t i o n
by the same i n s t r u c t i o n ,
to use any t a r g e t
a procedure
the t a r g e t
by the m o d i f i e r .
out w h i l e the v a l u e i s
possible
is carried
then an i n d e x r e g i s t e r
or decrement i n d e x r e g i s t e r s .
is
the o p e r a t i o n
way u s i n g any o f the mechanisms summarized in S e c t i o n placed
(which
from the m o d i f i e r .
Such a data a g g r e g a t e r e f e r e n c e can be r e a l i z e d the f i r s t
and a signed
i s added to the normal operand of the
to p r o v i d e the e f f e c t i v e
been c o m p l e t e d ,
3.2.
form than i n s t r u c t i o n s
A modifier
specifies
integer.
if
in the
as the data a g g r e g a t e access me-
items have a d i f f e r e n t
which r e f e r e n c e data a g g r e g a t e s .
fore
it
3.3.
a r e f e r e n c e to the top e n t r y
We use i n d e x m o d i f i c a t i o n
which r e f e r e n c e
ploys
be base r e g i s -
addresses, etc.
or a r e f e r e n c e to memory. References to memory may i n v o l v e data
aggregates.
is
These could
o f memory
s i m p l y a framework w i t h i n
An operand may be a c o n s t a n t , stack,
for
which
of the can be
It
is
there-
increment
can be a p p l i e d
index registers.
a high
level
i s done by p e r m i t t i n g
ments as an operand in any i n s t r u c t i o n
model should be used f o r a procedure w i t h
except store.
argu-
The v a l u e of the
operand i s the v a l u e l e f t i s not a l o a d ,
ACC
is
pushed i n t o
the p r o c e d u r e
ACC
in
by the p r o c e d u r e .
then a p r o c e d u r e operand i m p l i e s
the s t a c k b e f o r e the p r o c e d u r e
returns,
leaving
Our model
has one
I/0
ACC,
a v a l u e in
nues as though the operand had s p e c i f i e d instruction
[31].
first
This
specification
two are the o p e r a t i o n
interpretation
o f the l a s t
For data t r a n s m i s s i o n area by g i v i n g We d i d
is
consists
transfers
an i n s t r u c t i o n
Any number o f
the f u l l unit
requires
model o n l y
-
Transfer
the r e s t .
of c o n t r o l
of the f i r s t
in S e c t i o n
operations
this
if
the c o n d i t i o n
code
the c o n d i t i o n
code i s
s e t by some o p e r a t i o n s ,
the c o n d i t i o n
FRAMEWORK
In o r d e r
LOW
ware. We c a l l
It
fleshed
is
out
out
Janus,
true.
undisturbed
by
it.
LANGUAGES
in S e c t i o n
6.1.
machine f o r
is only a skele-
a particular
by s p e c i f y i n g
to the problem.
fleshed
framework
Our b a s i c
false.
and l e f t
a framework f o r
to the s k e l e t o n
to produce a p a r t i c u l a r
because i t
looks
problem,
a s e t of o p e r a t o r s
There i s
languages which c o r r e s p o n d s
too must be this
of
location.
assumptions about which o p e r a -
LEVEL
to d e s i g n an a b s t r a c t
low l e v e l
(C)
code or how t h e y a f f e c t
FOR
and data t y p e s a p p r o p r i a t e designing
A transfer
of the program c o u n t e r
if
s k e l e t o n must be
word.
unconditionally.
The b a s i c hardware model p r e s e n t e d ton.
3.2.
the t a r g e t
could be d e f i n d e d .
Transfer
A
and l a s t
:
Our model makes no e x p l i c i t
6,2.
the
memory
Transfer
affect
of
The
t h e y d e f i n e the p a r t i c i p a t i n g
-
code i s
specification integers.
request.
The c o n d i t i o n tions
three
conti-
of the s t a c k .
number, w h i l e
whose operand s p e c i f i e s
transfer
When
the o p e r a t i o n
of f i v e
The v a l u e of the operand r e p l a c e s the c o n t e n t s (P).
called.
depends upon the p a r t i c u l a r
the base address and i n d i c e s
not d i s c u s s
control
requests,
is
of
The v a l u e of the operand i s
code and l o g i c a l three
the o p e r a t o r
the c o n t e n t s
the top e n t r y
the memory address of an area which c o n t a i n s the o p e r a t i o n .
If
that
hard-
language.
both to the program-
mer and to the machine. Janus
is
line-oriented.
It
is
possible
to t r a n s l a t e
languages based
upon
Janus
by c o n s i d e r i n g
lines
are d i v i d e d
into
one l i n e
two c l a s s e s
- Declarations. declaration translator,
line
Particular for
of
a
ENDtJ .
information
which
for
the
i s not a d e c l a -
An e x e c u t a b l e
into
a sequence o f ma-
For example,
(e.g.
is
required.
s e r v e because we do not
may p r o v i d e d i f f e r e n t
Janus
DCL
We would not
conventions
might be o m i t t e d ,
used i n s t e a d .
INTEGE~
and
In g e n e r a l ,
know which words to r e -
know what data t y p e s w i l l
ber t h a t our hardware model does not s p e c i f y
be r e q u i r e d .
any data t y p e s ,
Remem-
and hence
must n o t .
Janus
In our hardware model, an i n s t r u c t i o n state ring
or
instructions.
type
DCL
Input
about the program s t r u c -
Any l i n e
declarations.
keywords d e n o t i n g however, the
it
characters
i s an e x e c u t a b l e l i n e .
languages based on
recognizing
at a t i m e .
and the t y p e s o f v a r i a b l e s .
i s to be t r a n s l a t e d
chine
text
DCLLJ
provides
informing
Executable lines. line
four
are e i t h e r
the o p e r a t o r s
ration
input
:
The f i r s t line
A declaration ture,
of
o f the machine in some way. in some w e l l
the r e s u l t .
defined
order,
Since the o r d e r
is
i s an
aotion
which changes the
We c o n c e i v e o f these a c t i o n s which
is
significant
significant,
it
as occu-
in d e t e r m i n i n g
must be r e f l e c t e d
in
Janus.
Let us consider an instruction which adds an integer from memory to the contents of the accumulator. Like most instructions, it specifies an operand. One way of writing such an instruction would be:

    ADD B

Here the operator is 'ADD' and the operand is the contents of the memory location called 'B'. The instruction could also have been written with a plus as the operator:

    + B

There is really no change of meaning here, but we must tread carefully. This construction does not have quite the same connotations as the conventional mathematical notation. There + serves to combine two values, whereas here it is an imperative command to operate upon one value with another.
Suppose now that we wish to combine several instructions in order to place the sum of two integers into a memory location. Such a sequence of instructions could be expressed by:

    LOAD  A              A
    ADD   B      or      + B
    STORE C              → C

Again, the instructions on the right are identical in meaning to those on the left. I am simply using different codes for the operations.

Placing the sum of two integers into a memory location is often an atomic action when one thinks in terms of an entire algorithm. Thus it would be more convenient to gather the three instructions of the previous paragraph into a single line of executable code:

    A + B → C
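The equivalence between the one-line form and the three-instruction sequence can be sketched as a tiny translator. This is only an illustration under stated assumptions: the function name `translate`, the `MNEMONICS` table and the ASCII spelling `->` for the store arrow are all hypothetical, and only the mnemonics LOAD, ADD and STORE come from the text.

```python
# Illustrative sketch only: the mnemonic table and the ASCII "->" spelling
# of the store arrow are assumptions, not part of Janus as published.
MNEMONICS = {"+": "ADD"}  # 'ADD' is the only arithmetic mnemonic named in the text


def translate(line):
    """Expand a one-line expression into one instruction per operator."""
    tokens = line.split()
    # The first operand has no operator in front of it, so it is loaded.
    instructions = ["LOAD " + tokens[0]]
    # The rest of the line is a sequence of (operator, operand) pairs,
    # taken strictly in the order in which they are written.
    for i in range(1, len(tokens), 2):
        operator, operand = tokens[i], tokens[i + 1]
        if operator == "->":
            instructions.append("STORE " + operand)
        else:
            instructions.append(MNEMONICS[operator] + " " + operand)
    return instructions


print(translate("A + B -> C"))  # ['LOAD A', 'ADD B', 'STORE C']
```

Because each operator is simply mapped to one instruction in the order written, the one-line form carries no more and no less information than the column of instructions.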
It is extremely important to recognize that nothing had been changed by this rewrite. Each operator performs an action, and the actions are taken in a well-defined sequence. We simply choose to write all three operations on one line because they happen to be related.

When the contents of ACC is stored into memory, it is not altered. Thus we may operate upon it further, storing the modified value in some other place:

    A + B → C + D → E

There is no limit to this process. The current expression is the Janus analog of ACC. Each dyadic operator takes the current expression as its left operand. There is no precedence of operators, and evaluation is strictly from left to right. These conventions arise directly from the development sketched in the previous paragraphs. A Janus expression is a means of expressing a sequence of instructions which performs some atomic action in the program.
It is not an expression in the classical sense of mathematics, where the operators represent ways of combining their operands.

It can be argued that if one is going to write sequences of symbols which look like normal arithmetic expressions, then one should make them act like such expressions by adhering to normal precedence conventions. There is merit in these arguments if the expressions are going to be used by mathematicians to solve mathematical problems. However, we are not in that position. We are specifying algorithms which are fundamentally non-numeric, and we must control the sequence of operations at a basic level. In any case, normal precedence conventions deal only with the common arithmetic and logical operators. We do not have well-established precedence rules for all operators which we may encounter in designing an abstract machine.

Operator precedence permits us to write expressions whose evaluation requires temporary storage, without forcing us to mention this storage explicitly. Remember that in section 3.2. we argued the need for avoiding explicit references to temporary storage. We must therefore provide an alternate mechanism, the deferred expression.

When an expression is deferred, the contents of ACC is pushed onto the stack. To indicate that the current expression is to be deferred, the user simply writes a left parenthesis, and begins the new current expression. The deferred expression is popped from the stack when the matching right parenthesis is encountered. Any number of expressions may be deferred.

Several examples of deferred expressions are given in Figure 6.2. If an operator precedes the left parenthesis, then the deferred expression is the left operand of the operator and the new current expression is its right operand. In Figure 6.2a., for example, the value of A*B is the left operand of + and the value of C*D is its right operand. Figure 6.2b. shows a situation where there is no operator before the left parenthesis. In this case, the value of the deferred expression is simply reloaded into ACC after the right parenthesis; the value of the expression within the parenthesis is lost. The contents of ACC do not change when an expression is deferred. Figure 6.2c. shows how the translator can be informed not to reload ACC after the expression has been deferred. (This sequence is the classic method of multiplying a number by 10 on a binary machine without multiplying.) LEFT is a shift operator whose right operand is the number of places ACC is to be shifted left.
    A * B + (C * D)

(a) Deferring one operand while evaluating another

    X (Y → X) → Y

(b) Exchanging two variables

    P LEFT 1 + (() LEFT 2)

(c) Avoiding an unwanted load instruction.

Figure 6.2. Deferred Expressions
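The evaluation rules described for deferred expressions can be tried out with a small sketch. It is only an illustration under stated assumptions, not the Janus translator: the token spelling (blank-separated tokens, `()` for the current contents of ACC), the function name `evaluate` and the sample values are all assumed here, and only the dyadic operators needed for Figure 6.2(a) and 6.2(c) are modelled.

```python
import operator

# Dyadic operators used in the sketch; LEFT is the shift operator of Figure 6.2(c).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "LEFT": operator.lshift,
}

def evaluate(tokens, env):
    """Evaluate tokens strictly left to right, with no operator precedence."""
    acc = None       # the accumulator, ACC
    pending = None   # dyadic operator still waiting for its right operand
    stack = []       # deferred expressions: (saved ACC, operator before the parenthesis)
    for tok in tokens:
        if tok in OPS:
            pending = tok
        elif tok == "(":
            stack.append((acc, pending))  # defer; ACC itself is left unchanged
            pending = None
        elif tok == ")":
            deferred, op = stack.pop()
            # With an operator before "(", the deferred value is its left operand.
            # With none, the deferred value is reloaded and the inner value is lost.
            acc = deferred if op is None else OPS[op](deferred, acc)
        else:
            if tok == "()":
                value = acc               # "()" names the current contents of ACC
            elif tok in env:
                value = env[tok]
            else:
                value = int(tok)
            acc = value if pending is None else OPS[pending](acc, value)
            pending = None
    return acc

env = {"A": 2, "B": 3, "C": 4, "D": 5, "P": 7}
print(evaluate("A * B + ( C * D )".split(), env))         # 26
print(evaluate("A + B * C".split(), env))                 # 20, i.e. (A + B) * C
print(evaluate("P LEFT 1 + ( () LEFT 2 )".split(), env))  # 70, i.e. 10 * P
```

The third call follows Figure 6.2(c): 2P is deferred, the same value is shifted on to 8P without being reloaded, and the two are added when the parenthesis closes.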
Our hardware model provides only a single index for a reference to a data aggregate. Hence only singly-dimensioned arrays are allowed in Janus. The syntax of an array reference mirrors the structure of the operand exactly :

    W (J + 17) D3

W is the name of the array in this example, J is the index variable and D3 indicates that the index is to be decremented by 3 following the reference.

Professor Goos has discussed the problem of array elements which occupy more than one storage location, and we have mentioned it in connection with the implementation of FLUB (see section 5.1.). This problem affects the translation of an array reference. Let us suppose, for example, that W is an array of FLUB words as realised on System/360. Each element of W thus occupies 8 storage locations. According to our abstract machine model, the actual instruction must be stated in terms of storage locations. From the programmer's point of view, however, the numbers 17 and 3 above are considered to be indices. That is to say, he is referencing the 17th element of W beyond that currently addressed by J. After the reference takes place, J should be modified to address the 3rd element of W before that currently addressed. The translator performs the adjustment of the numbers in the reference on the basis of information contained in the declaration of W. Thus it will produce a base address of W + 136 and an index increment of -24. J is assumed to contain the proper number at the time the reference takes place.

It is this last assumption which provides the major motivation for combining the index increment with the array reference. If the incrementing were done with a separate assignment, then the translator would have difficulty associating it with the proper array element length. There is another way of associating an index with a particular array : it could be declared as type reference to (say) FLUB word. This brings us to the whole question of data types and why they are necessary.

The main reason for assigning types to data is not the abstract machine, but the computer on which it is realized. There are radical differences, for example, between integer and real arithmetic on System/360. The Burroughs 5500, on the other hand, treats the two identically. Storage economies are also possible on many computers, as we pointed out in section 5.2. If data is typed, then the problems of type conversion (or coercion)
must be faced. Coercions may be implicit or explicit. We reject the former, unless it can be done at compile time. The reason is that the effect of a coercion on the state of the target computer depends strongly upon that computer. By forcing the user to request the coercion explicitly, we are making him aware of its (possibly) expensive consequences.

A procedure call can also be used as an operand. A procedure is represented by a lambda expression, which specifies a list of bound variables and a procedure body. The normal explanation of the calling process involves copying : a copy of the procedure body is substituted for the procedure call, with each occurrence of a bound variable replaced by (some transformation of) the corresponding argument. Although such copying is never done in practice, it is instructive to investigate the various transformations which are possible :

- Each bound variable is replaced by a new variable which does not appear elsewhere in the program. The procedure call is augmented by preceding it with a series of assignments which set the values of the new variables equal to the values of the arguments. (Algol call-by-value).

- As above, however, the procedure call is also augmented by following it with a series of assignments which reset the values of the arguments to the values of these new variables. These following assignments can only be made to arguments which are of type reference (such as subscripted variables). It is assumed that all argument expressions which yield a reference are evaluated before the procedure call. (Copy/restore).

- Each bound variable is replaced by a reference to the value of the corresponding argument. If an argument has no reference (for example if the argument were the number 3.5), then the call is augmented by preceding it with an assignment, making that argument the value of a new variable which does not appear elsewhere in the program. Similarly, if an argument expression yields a reference, then that expression is evaluated before the call. (Call-by-reference).
- Each bound variable is replaced by the corresponding argument. (ALGOL call-by-name).

Note that no pair of transformations yields identical results for all procedures. ALGOL call-by-value is obviously different from the rest because assignments to the bound variables do not affect the calling program. Figure 6.3. gives counter examples for the conjecture that other pairs are equivalent.

Janus does not specify the precise form of a procedure call, nor does it specify how the arguments are to be interpreted and passed to the procedure. The choice depends upon the needs of the abstract machine designer and the addressing facilities available on the target computer. We shall present one example in section 6.3.; further discussion can be found in the lectures by Goos, Griffiths and Dennis.

A deferred expression in Janus represents a change in storage allocation. This change might be made at compile time (when the stack is being simulated in memory) or at run time (when the stack is in hardware). If there are no run-time changes in storage allocation over a particular local region, then arbitrary transfers of control within that region can be permitted. For transfers between such regions, some action of the storage allocation mechanism is required. The main justification for complex control structures (conditionals, case statements, etc.) is simply this problem. By providing such structures we relieve the programmer of the need to specify transfers of control explicitly, and hence we relieve the compiler of the need to check their destinations.

Again, our interpretation of a conditional is drawn directly from our basic hardware model. The condition is not considered to be a Boolean expression, but simply a sequence of operations which sets C at some point. THEN (or THEF) is translated into an instruction which transfers if C is false. The value of the current expression is unchanged by the transfer.

When an expression is deferred, one temporary storage location is created. An Algol block allows the user to create as much temporary storage as he needs, simply by declaration. This storage must be freed when control leaves the block. Since it is possible to leave the block by a simple transfer, the cost of freeing storage is not made apparent to the user. The user is also not aware of the high penalties
for
ac-
      FUNCTION F(X,Y)
      X = X + 1
      Y = Y + 1
      F = X + Y
      RETURN
      END
      A = 1
      B = F(A,A)
      PRINT 100, B
  100 FORMAT (1H , I5)
      END

(a) Different results from copy/restore and call-by-reference.

    begin
      procedure SWAP(i,j); integer i,j;
        begin integer t; t:=i; i:=j; j:=t end;
      integer k; integer array A[1:2];
      k:=1; A[1]:=2; A[2]:=3;
      SWAP(k,A[k]);
      print(k); print(A[1]); print(A[2])
    end;

(b) Different results from call-by-reference and call-by-name.

Figure 6.3. Examples of Argument Substitution
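The two counter examples of Figure 6.3 can be traced with a short simulation. This is a sketch under assumptions rather than a rendering of any particular compiler: the `Cell` class and the function names are invented for the illustration, storage locations are modelled as mutable objects, and call-by-name is modelled by re-evaluating a small function each time the bound variable is used.

```python
class Cell:
    """A storage location holding one value."""
    def __init__(self, value):
        self.value = value

def f(x, y):
    """Body of FUNCTION F from Figure 6.3(a), written over storage cells."""
    x.value = x.value + 1
    y.value = y.value + 1
    return x.value + y.value

def call_f_by_reference(a):
    # Both bound variables refer to the same cell as the argument A.
    return f(a, a)

def call_f_copy_restore(a):
    # Each bound variable gets a fresh cell; values are copied back after the call.
    x, y = Cell(a.value), Cell(a.value)
    result = f(x, y)
    a.value = x.value
    a.value = y.value
    return result

def swap_by_reference():
    k, arr = Cell(1), {1: Cell(2), 2: Cell(3)}
    i, j = k, arr[k.value]       # references are fixed once, before the call
    t = i.value
    i.value = j.value
    j.value = t
    return k.value, arr[1].value, arr[2].value

def swap_by_name():
    k, arr = Cell(1), {1: Cell(2), 2: Cell(3)}
    i = lambda: k                # every use of i re-evaluates "k"
    j = lambda: arr[k.value]     # every use of j re-evaluates "A[k]"
    t = i().value
    i().value = j().value
    j().value = t
    return k.value, arr[1].value, arr[2].value

print(call_f_by_reference(Cell(1)))   # 6
print(call_f_copy_restore(Cell(1)))   # 4
print(swap_by_reference())            # (2, 1, 3)
print(swap_by_name())                 # (2, 2, 1)
```

Under call-by-reference the two increments in F both act on A, so B is 6; under copy/restore they act on separate copies, so B is 4. In SWAP, call-by-name re-evaluates A[k] after k has changed, so the final store goes to A[2] instead of A[1].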
cessing temporary arrays on some computers. We prefer to make these costs explicit by requiring the user to manage his own dynamic storage.

The preceding paragraph should not be construed to mean that an abstract machine design cannot include operations which allocate and reference local storage. Such a design would certainly be indicated for, say, a machine which was to run Algol programs.

Modularity is definitely an important property of any program. Janus has the concept of a procedure, and these procedures may contain local variables. The way in which storage is allocated for these local variables is left unspecified. A stack could be used, or space could be permanently associated with the procedure. It is also possible for the abstract machine designer to provide a declaration which allows the user to select the type of allocation when he writes the program.

A program written in a language based on Janus consists of a sequence of operators and operands. The recognition phase of the translator must extract these tokens, and identify operator-operand pairs. Each operator-operand pair is then considered to be an abstract machine instruction, which is passed to the code generation phase for realization. There must be rules by which the recognizer extracts tokens from the input string, and a dictionary which provides information about each token. The same recognizer can be used for any language which obeys the same extraction rules.

We have a set of recognition rules coded as macros for STAGE2 [35a]. These rules accept code in a free format, and produce macro calls in a rigid format (Figure 6.4.). They use a dictionary set up by other macros which the user provides. They are thus able to break down any language based upon Janus into a series of abstract machine instructions.

The code generator is constructed by writing definitions for all possible macros of the forms shown in Figure 6.4. Actually, this is not as burdensome as it might seem. For most machines, it is possible to translate operators and operands separately. Thus it is not necessary to provide a macro definition for each possible combination of operator and operand format.

Remember that the macro definitions are simply code generation routines which will be executed interpretively by STAGE2. They may be very
    .EC   op     constant
    .ES   op     simple variable
    .EA   op     data aggregate reference
    .EP   op     procedure call
    .ET   op     (operation involving the stack)
    .EN   op     (monadic operator)

(a) Executable instructions

    .DS   type   symbol declaration
    .DA   type   data aggregate declaration
    .DP   type   procedure declaration
    .DO   type   operator declaration
    .DM          (main program entry)
    .T    name   (end of procedure "name")
    .T           (end of text)
    .C    comment

(b) Declarations

Figure 6.4. Output of the Recognition Phase
simple, depending upon the circumstances. We have written a set of macros which produces absolute object code for one computer, thus simulating a complete assembler. It is easier to produce assembly code for the target computer and then process it normally, but sometimes this is not possible. (For example, if one wishes to provide portability via an interpreter as discussed in section 4.2.).

6.3. AN EXAMPLE OF A LOW LEVEL LANGUAGE

In section 6.2., we presented a framework for designing a low level language based on an underlying model of the hardware. We will now consider a specific example of such a language constructed in the Janus style. The objective in developing a language called LSD (Language for System Development) [36] was to provide systems programmers with some of the facilities of high level languages whilst, at the same time, encouraging the production of efficient programs. The programmer is made aware of the underlying machine and has access to a number of facilities which enable him to organize his program so that it can readily be optimized. The design was carried out upwards from the machine, not downwards from the programmer.

The main goals were object code efficiency and portability. The latter depends on the existence of a translator which is itself highly portable; the former implies that the code generation algorithm of the translator must be readily adaptable. The only translator readily available at the moment which seems to satisfy both of these requirements is STAGE2. This fact was taken into account during the design of LSD. Some features available in other high level languages which might prove difficult to translate or which might result in unacceptable inefficiencies were omitted even though one could have justified their inclusion on the grounds of programmer convenience. This does not mean that the latter factor was ignored. Rather, any decision to incorporate a particular facility was also influenced by considerations of object code efficiency, portability and STAGE2.

In assessing the merits or otherwise of this language, the reader must be careful not to form his judgement just by looking at the facilities it provides and comparing them with those available in other high level languages.

Another important factor which influenced the design of LSD was its
projected
use as a l a n g u a g e
nes used i n
data
Such p r o g r a m s ments ate
rather
would than
a convenient
grams w r i t t e n active
test
checked ferred
by s y s t e m s environmant
in in
the
the
this
to
for
checking
facilities
ges.
However,
as t h e
the
ready nostic
test
that
a tool
for
concluded compiler.
aid
[115],
of
SID
tant of
from
design
being
feature
remainded translator
We s h a l l guage also
to
how t h e y
tramslated
by
should
reader is
or
line
STAGE2.
oriented,
a declaration.
ne i n s t r u c t i o n s ; about
the
read
structure
of
overview
of
per
Diag-
abstract software
when v i e w e d
be e q u i p p e d
with
generates
syntax
as
should
of not
the
a syntax be c a p a b l e
to
ensuring
and a r e
a more
with
Thus an i m p o r -
as i n p u t
first
of
fit
into
the
complexity
the
For a c o m p l e t e reference
No
SID.
that the
this
con-
language
essential
if
the
latter the
the
main f e a t u r e s framework.
Janus
of
of
a language
description
of
the
This
that the
lan-
will
may be
language,
37.
where a l i n e
The f o r m e r the
we have a l store.
such a c o m p i l e r
characteristics
constraints
good
messaincrea-
portable
a language.
the
without
the
provide
such an e n v i r o n m e n t .
automatically of
trans-
STAGE2.
a brief
some measure
the LSD
by t h e s e
in
acceptable
language Some of
moving
limitations
produce
was t h a t
LSD
the
true.
for
bed s h o u l d to
description
used i s
indicating
test
LSD
the
diagnostic
use o f
pro-
Since
Further,
its
trans-
the
w h i c h was u n a c c e p t a b l y
particularly
the
for
in
has i t s
and t h e r e f o r e
now p r e s e n t
give
that
it
to
be
being
bed.
of
crepro-
an i n t e r -
then
raised
as one c h a r a c t e r
program
but
a program which
have been d e t e r m i n e d only
useful
via
implement
macros
set
a translator
We p r o p o s e d
criterion
was added
dition
be s t o r e d
a syntactical
one-tracked
of
to
o f macro d e f i n i t i o n s
expensive
produce
this could
test
decreases.
have t o
software,
conventional
the
experi-
needed
This
we c o u l d
a set
STAGE2
another,
producing
for a set
is
a very
to
We t h e r e f o r e
analyser
of
of
the
machine before
service.
STAGE2,
program
easily
is
STAGE2
one m a c h i n e
large
into
provide
and a c o m p r e h e n s i v e
speed
messages w o u l d
from
by
complexity
this
word and we c o u l d large.
to
A program
bed by c o n s t r u c t i n g
processing
noted
We p l a n n e d on t h e
machi-
and d e b u g g i n g
ICL
be p r o v i d e d
be t r a n s l a t e d
lator
ses,
the
testing
4/?0.
small
experiments.
running
We t h e r e f o r e
compiling
m a c h i n e and p u t
error
physicists
for of
programmers.
on t h e
should
control
for
environmant
small
software
online
by t h e
language.
blem o f w h a t c o m p i l e r was d e s i g n e d
implementing
and t h e
be w r i t t e n
bed r u n n i n g
out to
for
acquisition
are
is
translated
provide program
either
an e x e c u t a b l e into
information and i t s
data.
a sequence to
the
Central
statement of
machi-
translator to
the
design
of LSD is the form of an expression which differs in several respects from that found in many existing high level languages. An expression is constructed in the Janus style as described in section 6.2., and consists of a linear sequence of operands and operators which is evaluated strictly from left to right without any operator precedence. After each operation is completed the current value of the expression so far is available in the accumulator. A dyadic operator takes the current value in the accumulator as its left operand and combines this with the next operand to the right; a monadic operator references only the operand to its right.

There is no concept of an assignment command in LSD. The assignment operator is treated just like any other operator and causes the current value of the accumulator to be stored in the location specified by the next operand. Assignment is simply part of the expression and may be placed at any point in it. If an expression does not include an assignment operator, then the effect is simply to set the value of the expression in the accumulator.

The only situation which causes the normal left to right evaluation to be interrupted is the occurrence of parentheses. This is the deferred expression concept discussed in section 6.2. At the end of any executable statement, the accumulator contains a current value. This value is retained between statements and may be referenced at the start of the next statement by means of the symbols ().

A table illustrating the operators and operands available in LSD is given in Figure 6.5a. Figure 6.5b. gives a number of simple examples of LSD expressions. The usual conventions hold for identifiers and constants and we shall not consider them any further except to note in passing that character, string and manifest constants are permitted. The operators SRL and SLL shift the accumulator right or left a number of places specified by the right hand operand. The operators INDEX and ORDNL are used to optimise array accesses and will be described in more detail later.

Let us consider the problem of accessing data aggregates. As we have already noted, the way in which this is carried out can have considerable influence on program efficiency. In keeping with the Janus framework, LSD permits only one dimensional arrays. These are declared and dimensioned by the appropriate type statements which reserve a block of consecutive storage locations. An element of an array can be accessed by a subscripted array name where the subscript is any valid LSD expression which yields an integer value between 0 and the maximum al-
    OPERATORS                            OPERANDS

    +      plus                          constant
           minus                         identifiers
           times                         array elements
    /      divide                        pointer variables
           exponentiate                  function calls
    &      and                           address references
    |      inclusive or                  current value of accumulator
           assign
    SRL    shift right logically
    SLL    shift left logically
    INDEX  convert integer to index
    ORDNL  convert index to integer

Figure 6.5a. Operators and Operands in LSD
SRL 2 1 2 5 5 @ P A T T E R N
I~A ~B =@C+I~D ~E ()+A@B
Figure
6.5b.
LSD Expressions
lowed subscript of the array. We will now examine in more detail what happens when an array is declared or an array element is referenced.

When an array is declared in LSD, two actions take place :

(a) the appropriate number of consecutive storage locations are reserved;

(b) a global or local variable with the same name as the array is set up and loaded with the base address of the array, that is, the address of the first element.

The process is illustrated diagrammatically in Figure 6.6a. for an array A of 5 elements. The base address can always be referenced by writing the array name without a subscript. Notice that since the base address is a variable, the program can easily change it at run time. This facility is quite useful to handle certain problems and can be developed in a natural way to provide the general concept of pointer variables discussed later in this section.

Let us consider, in terms of the basic hardware model of section 6.1., what effect this facility has on the code required to effect an array access. Remember that the model permits a modifier to be attached to a normal instruction and assumes that the effective address is computed by adding the contents of the modifier to a base address, the operand. Since in this case the base address is itself a variable, the following operations would be required :

(a) Evaluate the subscript expression;

(b) Multiply the result by the memory mapping factor if it is something other than 1;

(c) Add in the base address;

(d) Store the result in a modifier and make the access.

This sequence could result in unacceptable inefficiencies, particularly if it occurs inside a loop. We could improve this situation if we restricted ourselves to subscript expressions of the form

    variable + constant
    Location   Contents

       10         50      (the variable A, holding the base address)

       50        A(0)
       51        A(1)
       52        A(2)
       53        A(3)
       54        A(4)

Figure 6.6(a). Structure of an Array
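The general array access, and the cheaper access through a pre-scaled index register, can be compared in a small model of storage. This is a sketch under assumptions, not LSD itself: the dictionary `memory`, the helper names and the memory mapping factor of 4 are invented for the illustration (Figure 6.6(a) shows a factor of 1), and `index` and `ordnl` only imitate the effect of the INDEX and ORDNL operators for one array type.

```python
MMF = 4        # assumed memory mapping factor: address units per array element
memory = {}    # address -> contents; stands in for target storage

def declare_array(var_addr, base, values):
    """Reserve consecutive element storage and load the base address into a variable."""
    for n, v in enumerate(values):
        memory[base + n * MMF] = v
    memory[var_addr] = base

def access(var_addr, subscript):
    """General case: the scaling and base addition are done at run time."""
    scaled = subscript * MMF              # multiply by the memory mapping factor
    modifier = memory[var_addr] + scaled  # add in the base address, keep in a modifier
    return memory[modifier]               # make the access

def index(n):
    """Imitates INDEX: convert an integer subscript to an index (address units)."""
    return n * MMF

def ordnl(i):
    """Imitates ORDNL: convert an index back to an integer subscript."""
    return i // MMF

def sum_three(base, j):
    """Three successive elements, with J held as an already scaled index register
    that is incremented by one element after each access."""
    total = 0
    for _ in range(3):
        total += memory[base + j]   # base address known at translate time
        j += index(1)
    return total, j

declare_array(var_addr=10, base=50, values=[5, 6, 7, 8, 9])
print(access(10, 3))             # 8
print(sum_three(50, index(0)))   # (18, 12)
print(ordnl(12))                 # 3
```

In `access` every reference pays for a multiplication and an addition; in `sum_three` the scaling has been done once, outside the loop, which is the saving the restricted subscript form is meant to allow.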
    PTR contains 10

    Location   Contents

       10
       11
       12         24
       13         36

       24       -101
       25

       36         58
       37         49

       49          0

       58         78
       59         79

Figure 6.6(b)
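The chain of references in the pointer diagram can be followed mechanically. The sketch below is an illustration only: the dictionary `memory` holds the location contents as they are read from the figure and the accompanying text, and the helper name `subscript` is an assumption, not LSD notation. Each subscript is added to the current pointer value and the word at the resulting address is fetched.

```python
# Contents of the storage locations in the pointer diagram, as read from the text.
memory = {12: 24, 13: 36, 24: -101, 36: 58, 37: 49, 49: 0, 58: 78, 59: 79}

PTR = 10   # PTR is a scalar variable containing 10

def subscript(pointer, *subs):
    """Follow X(s1)(s2)...: add each subscript to the current pointer value
    and fetch the word at that address."""
    value = pointer
    for s in subs:
        value = memory[value + s]
    return value

print(subscript(PTR, 2))         # 24, the contents of location 12
print(subscript(PTR, 2, 0))      # -101
print(subscript(PTR, 3, 0, 1))   # 79
print(subscript(PTR, 3, 1, 0))   # 0
```

Each additional subscript costs one more memory fetch, which is the "level of indirection" the text refers to.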
and at translate time, had available the base address of the array. Optimisation of the array reference would then be possible providing the program does not change the base address at run time. In terms of the instructions of the basic model, the modifier contains the current value of the variable and the normal operand address contains the base address of the array incremented or decremented by the constant.

We could gain even more if, once a variable is loaded into the modifier, it could be left there for, say, the duration of a loop and incremented or decremented in situ. LSD provides for this special situation by allowing the programmer to declare index registers. The declaration of type IREG allows one or more variables to be declared as index registers. Efficient access to an array element can then be achieved by restricting subscripts to the form

    index register + constant

providing the index register contains a value adjusted to allow for the memory mapping factor. This can be effected via the operator INDEX which multiplies the current value of the expression by the number of target computer address units per element of the array whose name is specified for the next operand. For example, the expression

    J INDEX A :> J

multiplies the integer J by the memory mapping factor for array A and stores the resultant index back in J. The array name is specified to allow for differences in the memory mapping factor for different array types. The reverse operation is provided by the operator ORDNL which converts an index back to an integer by dividing by the appropriate memory mapping factor.

LSD provides a mechanism for incrementing or decrementing the index register after the access has been performed. The notation follows that outlined for Janus. If a subscript is restricted to the form described above, then the programmer can append an I or a D to the array reference depending on whether he wants the index register incremented or decremented. This may be followed by a constant; if no constant is specified, then 1 is assumed. Since the array name is available at translate time, the specified increment or decrement can be adjusted for the memory mapping factor. Thus the following LSD statement :
    A(J)I+A(J)I+A(J)I:>SUM

would add three successive elements of an array. After each array access, the index register J would be incremented by the memory mapping factor required for A. Note, however, that the following would be a more efficient way of achieving the same effect :

    A(J)+A(J+1)+A(J+2)I3:>SUM

In this case, only one set of instructions is required to increment the index register.

Earlier we noted that the manner in which array references were handled in LSD could be developed quite naturally to provide the concept of a pointer variable, a necessary feature in any systems programming language. This facility is obtained quite simply by allowing scalar variables to be subscripted. In our discussion of array references, we noted that in the general case the effective address was calculated by adding the value of the subscript to the contents of the variable associated with the arrayname. If we extend this mechanism to permit scalars to be subscripted, it is easy to see that if X is a scalar variable, then X(0) references the word pointed to by X, X(1) the following word and so on. The use of pointers is illustrated in Figure 6.6(b). In the diagram, PTR is a scalar variable containing 10. Hence PTR(2) references location 12, which points to location 24.

An obvious extension to this facility is to allow a subscripted variable to be subscripted in its turn : if PTR(2) is a valid reference, then PTR(2) is now itself a pointer variable, and PTR(2)(0) accesses the value -101 in location 24. The process can be carried to any depth. Similarly, PTR(3) references location 13, which points to location 36; PTR(3)(0)(1) applies one more level of indirection through location 36 to location 58 and accesses the value 79 in location 59, and PTR(3)(1)(0) accesses the value 0 in location 49. Compound data structures of any complexity can then be constructed.

In its general form, a pointer variable reference consists of an identifier followed by any number of subscripts. The identifier may be a scalar variable or an arrayname; any valid LSD expression may be used as a subscript.

In order to create the structures accessed by pointer variables in the
manner just described, we must be able to specify and manipulate addresses. LSD therefore permits access to the addresses of variables, array elements, pointer variables, procedures, functions or labels. Angle brackets are used to denote the address. Thus

    <FRED>     is the address of the variable FRED
    <A(10)>    is the address of element 10 of array A.

Some care must be exercised by the programmer when using this facility since again there may be problems connected with the mapping of addresses. Thus the expression

    <A(3)>+1

does not necessarily produce the address of the array element A(4). To remedy this situation in LSD, we can use a manifest constant. Thus the expression

    <A(3)>+MMF

will produce the correct result if the value of the memory mapping factor for the particular machine is assigned to the manifest constant MMF before translation of the program is carried out.

Of the operands listed in Figure 6.5(a)., we have only function calls to consider. Instead, let us examine the whole question of how procedures and functions are implemented in LSD. As noted in section 6.2., the Janus framework does not explicitly define the form of a procedure call nor the manner in which arguments are to be transmitted to the procedure itself. The choice is left to the designer of the abstract machine.

An LSD program consists of a sequence of one or more procedures each of which must be preceded by a declaration of the form

    PROCEDURE identifier (formal input parameters) (formal output parameters)

Either or both of the formal parameters may be omitted. A procedure body consists of local type declarations (if any) followed by executable statements some or all of which may reference the formal parameters. The procedure must contain at least one RETURN statement to re-
259
turn
control
to the p o i n t
from which
the procedure body i s
indicated
Every
returns
procedure
LSD
cedure c a l l s nal
between procedures
as a d o c u m e n t a t i o n a i d , PROOEDURE
However, the o n l y a c t i o n
identifier
in a d e c l a r a t i o n
to t h a t
to any f o r m a l
input
also a pro-
hand, an a c t u a l location
red,
is
it
into
restricted
Again e i t h e r
or
may be o m i t t e d but the f o r m a t of the c a l l The a c t u a l
Actual
input
output
p a r a m e t e r s are pas-
parameter corresponding
p a r a m e t e r may be any v a l i d
of a storage
expression.
LSD
p a r a m e t e r can o n l y s p e c i f y
which
the c o r r e s p o n d i n g
to an i d e n t i f i e r ,
output
On
the address
value is
sto-
an a r r a y element or a
variable.
The mechanism used in a stack.
(Recursion
countered,
LSD
is
the c u r r e n t
for
p a s s i n g parameters
is
copy/restore
When a procedure
a l l o w e d in L S D ) .
v a l u e of the a c c u m u l a t o r
call
via
is
en-
is s t a c k e d and the ac-
parameters e v a l u a t e d .
In t u r n
these are p l a c e d on the s t a c k
re i s e x e c u t e d .
On e n t r y
"the s t a c k base a d j u s t e d reserved for routine.
output
All
to the c u r r e n t
the l i n k
retrieved
is
sequence i s s t a c k and meters. becomes
both
are e f f e c t i v e l y
instructions
within
in the r o u t i n e
The o u t p u t
s t o r e d i n the l o c a t i o n s
'This mechanism i s
quite
is
link
is
the
local
p r e s e r v e d and
work space of the
reference
pointer.
this
returned
space do
Before e x i t ,
A return
to the c a l l i n g
v a l u e s are e x t r a c t e d
addressed by the a c t u a l
hand operand f o r expression
jump to the procedu-
parameters and l o c a t i o n s
which
v a l u e of the s t a c k
The v a l u e o f the procedure the r i g h t
the
the i n p u t
and the s t a c k base r e s e t .
then e x e c u t e d .
o f the c u r r e n t
and a r e t u r n
to the p r o c e d u r e ,
so t h a t
so r e l a t i v e
tion
to a c t i v a t e
procedure.
o f the d e c l a r a t i o n .
sed to a p r o c e d u r e by v a l u e .
tual
statement is
CALL
How-
as a s y -
FUNCTION
required
the name of the c a l l e d
both o f the p a r a m e t e r l i s t s
pointer
does not a p p l y .
an operand of the form
is
must c o r r e s p o n d
i.e.
and a
and hence p r o The c o n v e n t i o -
(actual input parameters) (actual output parameters)
where i d e n t i f i e r
the o t h e r
the a c c u m u l a t o r
really
end of
statement.
in an e x p r e s s i o n .
and f u n c t i o n s
provided.
to w r i t e
The t e x t u a l
END
a programmer may use
nonym f o r cedure i s
was c a l l e d .
a v a l u e in
may be used as operands
distinction
ever,
it
by means o f an
from the
output
para-
in the a c c u m u l a t o r then
the pending o p e r a t o r
and the e v a l u a -
resumed.
a straightforward
one and r e l a t i v e l y
easy to
implement. gard
to
lier
for
global
However,
array
local
the
address
unless
two m o d i f i e r s lity
and i t
accessing ce o f
in
is
remaining
the
should
etc. sists
of
A variable
is
arrays
at
statement. mation
the
ve i n hand, of
produce
sically
with
switch.
The in
GOTO
the
are
fer
control
operand; given
GOINDTO
to
the
but
it
type;
into
one word of
provides
latter
allows
control
at
same t i m e
the
Janus
for of
local
to
to
infor-
may b e h a -
allocates
one
on t h e
other
the
fact
that
that
a number
target
concerned
loop of
simple
control
control
form of
to
an i n d i r e c ~ specified indirectly return
A relational
baand
the
allows
the
some
machine.
those
be t r a n s f e r r e d
framework.
supply
space and c o n -
location
preserving
which
executable
example,
transfer
the
first
macros,
are
on t h e in
declarati-
statement
the
of
con-
declaration
DATA
the
The f o r m e r
whilst
LSD
conditionals,
Variations contained
language,
in
translators
interest
control-jumps,
the
merely
cognisance
be packed
CHARACTER
line
storage
the
con-
into
mainly
compiler
others,
GOTOVIA.
a
is
than
statements
languages convention.
and a r r a y
before
We n o t e d
a declaration
a global
Different
code and t a k e
the
in
can a l l o c a t e
space
and
sour-
briefly
INTEGER,
variable
declaration
the
follow
e.q.
by means o f
of
statement
that
new t y p e s
of
mechanisms.
an a d d r e s s
specify
be i n
statements.
particular
a constant
address
Conditionals
type,
declaration
same p r o c e d u r e .
tement of
to
irrespective
of
we
Hence t h e
consider
A declaration
LSD
executable transfer
a
form of to
could
use a d i f f e r e n t
introduce
the
less
could
The r e m a i n i n g
to
Thus,
optimized
require
characters
label
access
is
such a f a c i -
model.
, we w i l l
DCL,
denote
so t h a t
ways.
each q u a n t i t y
types
of
translator
correct
different
word t o data
the
provide
specifies
could
by a l i s t
the
an i n s t r u c t i o n
executable
characters
be i n i t i a l i z e d
end o f
The f u n c t i o n
to
struct
the
the
Janus
may be p r e s e t
may o n l y
placed
language
provided.
followed
restricted
hardware
in
ear-
question
as p a r a m e t e r s ,
the
a procedure
the
used to is
allows
re-
described
in
our basic
framework
are
passed
even f o r
with
program.
of
by t h e
array
a loop
although
Janus
techniques
the
arrays
particularly
Few m a c h i n e s
and a few o f
that
a keyword
in
LSD
a programmer
ons. but
a
declaration
TYPE
for
hardware
with
features
the
To a l l o w
the
in
keywords
LSD,
actual
variance
6.2 in
or
an e l e m e n t
aggregates
be i n t r o d u c e d
structed
of
use t h e unless
as an o f f s e t .
declarations
section
In
at
data
ineffeciency
Of t h e only
the
some p r o b l e m s
references
arrays
as w e l l
of
raise
We c a n n o t
such
For
must compute a subscript
does
references.
optimising
one.
it
a sta-
traDsby t h e !to a
address.
operator
com-
p,ares
its
right
hand operand w i t h
and s e t s
a condition
propriate
operator
for
is
THEN)
operation
is encountered.
translated
AND.
is
OR
into
optimization
This
Thus
v a l u e of the a c c u m u l a t o r is
into
i n s p e c t e d when the ap-
: (which
a jump-if-false
translated
v a l u e o f the accumumlator i s Note t h a t
the c u r r e n t
code a c c o r d i n g l y .
is
a s h o r t h a n d form
instruction
a jump-if-true.
as i s
The c u r r e n t
not changed by these t e s t s
or t r a n s f e r s .
which depends on the s t a t e m e n t f o l l o w i n g
o p e r a t o r which i n s p e c t s
the c o n d i t i o n
code i s
the
possible.
the
Thus the
LED
statement IF
would be t r a n s T a t e d ZF
is
The
an o p t i o n a l
A
LT
into
: GOTO
the loop
is
performs
a controlled
a jump-if-true
to
iteration
B.
Note t h a t
the
v a l u e to a f i n a l
loop are d e l i m i t e d
Loops may be nested to any d e p t h . control
The f i r s t
the
indicates
number of t i m e s ;
incrementing a counter
amount each t i m e from a s t a r t i n g
pected t r a n s f e r s
label
has two f o r m s .
LED
to be executed a s p e c i f i e d
ments performed w i t h i n ment.
B
keyword.
s t a t e m e n t in
LOOP
2
The
to one of s e v e r a l
by a s p e c i f i e d
value.
by an
The s t a t e state-
ENDLOOP
SWITCH
labels
that
the second
s t a t e m e n t as e x -
depending on the value of the operand.
To date, LSD has been implemented via STAGE2 on two machines, the 4/70 and the Modular One. On the basis of these implementations, we estimate that about one man month of effort would be required to implement 95 % of the language on a new machine. This would include all the common and most frequently used facilities. The remaining 5 % would require another man month of effort. These estimates assume that the programmer making the implementation is experienced in the use of STAGE2.
We have attempted to obtain some estimate of the efficiency of LSD by comparing the size of a program written in LSD with the size of the same program written in assembly code. We have found that a typical LSD program is 5 % larger than the equivalent hand coded version. Currently we are using LSD to implement, amongst other things, the LSD compiler/test bed, an operating system for a small machine and a data acquisition system. Both systems programmers and scientific users alike are finding the language convenient and easy to use. We expect that it will prove to be a useful addition to our kit of Software Engineering tools.
7. A HIERARCHY OF ABSTRACT MACHINES
Mention has already been made in this course of the concept of a hierarchy. Dennis had referred to it in discussing the representation of programs and systems; Goos has considered in some detail the whole question of hierarchical ordering, illustrating its use as a design methodology and its relationship to language; Tsichritzis in his discussion on the problem of protection has shown how protection hierarchies can be used in the implementation of a system.
In this section of the course, we are concerned with the development of techniques for producing portable and adaptable software based on the concept of abstract machine modelling. We shall now show again how the concept of a hierarchy can be of great value in achieving these goals. In particular, we will consider how a hierarchy of abstract machines can be used to achieve the correct balance between the first two principles of abstract machine modelling enunciated in section 3.1. As we have already noted, if we emphasize the requirements of the problem with little or no regard to the structure of real machines, then we can easily create a model which, although providing a convenient language for describing the problem, can be very difficult to implement. On the other hand, a simple model, designed with the structure of real machines in mind, will be easy to move but may not adequately reflect the characteristic of the problem. Complex operations required to solve the problem will have to be coded in terms of the simple operations of the model. If the program is moved to a machine which incorporates such operations directly in the hardware, then no advantage can be taken of these features, unless the algorithm is broken. Thus efficiency suffers. Achieving the correct balance between portability and efficiency is very much a software engineering task and one in which we suggest that the concept of a hierarchy will be of great assistance.
7.1. NEED FOR THE HIERARCHY
The need for the hierarchy has been amply demonstrated by some of the results presented in earlier sections. Both FLUB and TEXED, designed with emphasis on the requirements of the problem, gave rise to some degree of inefficiency when implemented on real machines. Yet each was highly portable because of the simplicity of the model. We would expect a person not familiar in any way with the techniques to achieve
an i m p l e m e n t a t i o n design
process
real
machines,
the
implementor.
to h a n d l e
with
the
28
dy noted
that
a month t o siderably I)iler
parameter
simple
i m p l e m e n t most o f
to
an i m p l e m e n t o r a running
"like
assistance.
could
suited
create
would
this
construct
an a b s t r a c t
p r o b l e m and t h a t this
abstract
t h e whole
a very
to
exists
compiler
portable
program.
machine,
there
with
a few more h i g h
level
algorithm.
to map any one o f t h e s e process
gui-
any s p e -
that
we must ask
in
a language
any machine i n skills.
a
The so-
machines.
L e t us
a machine w i t h
could
readily
be i m p l e m e n t e d
the compiler
in
this
at a price.
Although
which onto
we c o u l d
could
ALGOL.
well
not
postulate machine be used
machine and we would if
eventually
STORE,
real
we
would be r e q u i r e d
However,
levels,
COM-
conveniently
effort
a real
LOAD,
t h e n we w i l l
is
We would
a second a b s t r a c t
portability. like
for
computer,
H o w e v e r , we can a l s o
on a number o f
language,
a compiler
i n a s s e m b l y code.
a number o f
instructions
is
machine which
has one i n s t r u c t i o n ,
considerable
instructions
down t h r o u g h
reach
it
exists
instructions Again,
be a l o n g way f r o m a t t a i n i n g
nue t h i s
obtain
the originator
machine on a r e a l
below t h e f i r s t
still
therefore
of abstract
that
which
require
a program w r i t t e n
that
t o encode t h e
not
special
we a r e f i r m -
be a b l e t o
can be i m p l e m e n t e d o n
a hierarchy
we w i s h
there
To r e a l i z e
have t o w r i t e
have produced
it
of
implies
back t o
con-
o f com-
be a c h i e v e d .
that
to s o l v i n g
P I L E ALGOL.
refer
The q u e s t i o n
that
Suppose t h e p r o g r a m w h i c h We can p o s t u l a t e
He s h o u l d
o f t i m e by someone w i t h o u t
seems to be to
see how t h i s
out,
and t h i s instructions
: how can we d i s t r i b u t e
period
pointed
about
take
our o b j e c t i v e
should
he have to
take
user will
a set of
and g u a r a n t e e
reasonable lution
keeping with
sequence.
requi-
the fundamentals
software
nor should
sto-
We have a l r e a -
FLUB.
portable
the bootstrap
will
STAGE2 w i l l
of a full-bootstrap
of
This
An i n e x p e r i e n c e d
in
have
mechanisms f o r
etc.
implement
of
as seen by
he w i l l
LSD,
m e r e l y by f o l l o w i n g
program f o r LSD
not
least
the of
a much more c o m p l e x s e t
user of
As we have a l r e a d y
concept
knowledge,
and answer i s
control
some knowledge
is
of a piece
version
des him t h r o u g h of the
This
the
to
LSD.
and r e q u i r e
implementation.
committed
cialized
required
at
in
certainly
an e x p e r i e n c e d
in
of the structure
as p r o v i d e
loop
of macros,
macros
construction.
as w e l l
passing,
set
we e x p e c t
longer
cook-book
ly
a program w r i t t e n
of data types
complicated
As our emphasis
t a k e more a c c o u n t
has become more c o m p l e x ,
To r e a l i z e
rage management, than
to
our model
a variety
re a f a i r l y
a few weeks o f e f f o r t .
has s h i f t e d
ADD
machines.
we c o n t i we m i g h t etc. If
which
we code
have a c h i e v e d p o r t a b i l i t y
move t h e program e a s i l y
-
f r o m one machine
to another, it is unlikely to be very efficient.
The hierarchy of abstract machines we have postulated possesses the following structure. Machines near the top of the hierarchy reflect the characteristics of the problem and provide convenient languages for encoding the algorithm. The resultant programs are efficient but not very portable. Conversely, the machines near the bottom are designed to take account of the structure of real computers. The languages are low level and produce programs which are highly portable but not very efficient. This suggests that we might be able to find an abstract machine somewhere between the two extremes which will provide a reasonable balance between portability, efficiency and convenience. Further, if we can connect the hierarchy together in such a way that an abstract machine at one level can be mapped onto the machine at the next level down in an automatic and machine independent manner, then we may not have to rely on the properties of one machine alone to control the quality of the software. A convenient problem-oriented language could be obtained from a machine high in the hierarchy; lower level machines would then be used to ensure that the software was both portable and efficient.
To illustrate how this can be achieved, consider the hierarchy of abstract machines A1, A2, ..., Ai, ..., An shown in Figure 7.1. Such a hierarchy is structured such that Ai+1 is defined in terms of Ai for all values of i such that 1 ≤ i < n. The machine at the top of the hierarchy, An, provides the language in which the problem is expressed. A1 at the bottom of the hierarchy provides the link to the outside world. It is a model whose structure is an intersection of those of real machines. It can easily be implemented on a variety of real computers. The definition of Ai (i ≠ 1) in terms of Ai-1 does not depend on the characteristics of actual machines, thus realizing A1 on a particular computer effectively implements each machine in the hierarchy. Thus any program written for An is portable. However, it is unlikely to be very efficient when realized via A1 since this is a fairly simple machine. Thus the part played by A1 (by virtue of its position in the hierarchy) is to guarantee high portability. As we have already noted, An provides the convenient language. It is the function of the intermediate machines to ensure that an efficient implementation can be obtained.
As we move up the hierarchy, each machine will tend to reflect more and more the characteristic of the problem to be solved embodied in the
[Figure 7.1. Hierarchy of Abstract Machines. Legend: Abstract Machine, Real Machine.]
higher the
level
target
operations.
In g e n e r a l ,
machine w i l l
ment t h e o p e r a t i o n s
require
of
At.
program s i n c e
account
ristics
of the
machine,
rities
in
Hence i f
data structures the hierarchy
t h e more e f f i c i e n t that
running
is
version
is
available
In a p p l y i n g
this
through
a number o f
logical
e l e m e n t s which
hierarchy
of abstract
the d e s i g n abstract
machine
into
another of
f r o m an e a r l i e r
be used to produce
the generated
as so o f t e n
is
way.
one.
t h e case i n
in
we propose
with
so t h a t
it
tements
required
will
into
towards
if
a set
which
are a b l e a series
mapped onto
the
of
is
and d a t a
types
in
the entire
at
rules
do not d i v e r g e
the art. a program
translated
into
LSO
a language
determined
above
by t h e needs o f
features
of
LSD
common o r g a n i z a t i o n a l transformation
beneath
LSD
will
We have a l r e a d y designed
machine i n s t r u c t i o n s .
machine o r
can
levels
automatically
coded as macros
down any l a n g u a g e s
at a gi-
lower
program can be t r a n s l a t e d
to t h e programmer.
in
a t any one l e v e l
changes
However,
Languages
the one
the design
incorporate
the
in
we can e f f e c t i v e l y
many o f t h e
providing
any p r o b l e m .
of abstract real
to
top-
to a l e v e l
to t r a n s l a t e
of
of
t o as
to be s o l v e d
and i m p l e m e n t a t i o n
solution
recognition
to break
rules
state
into
a
implementa-
Each l e v e l
program may be coded i n
required.
t h e machine than
exists
into
role
so t h a t
LSD
there le
its
to s o l v e
be a v a i l a b l e
tically
code.
t h e y can be r e f l e c t e d current
out,
the appropriate
a sense e q u i v a l e n t
l a n g u a g e may i n c l u d e
fulfills
the problem
level,
Whi-
carried
to t h e
one r e a c h e s
lower
however,
efficiency.
p r o b l e m o f how to d i s t r i b u t e
the
operations
This
until
etc.
machine.
approach
a description
design
The a c t u a l
special
the problem.
that
Notice,
being
Changes i n d e s i g n
the
now to our o r i g i n a l
LSD
is
the corresponding
Similarly,
Returning LSD,
at the
code so t h a t
such a h i e r a r c h y .
in
generating
ven l e v e l
an a u t o m a t i c
is
machines
commonly r e f e r r e d
can be expanded i n t o
therefore in
obtain.
one expands
simila-
the more work we e x p e n d ,
S i n c e we can c o n s t r u c t
process
in
and a b s t r a c t
use on t h e t a r g e t
of design
machines
sequence.
automate the
characte-
to achieve this
strategy,
method,
levels
of the particular
we w i l l
between t h i s
and t h e d e s i g n
imple-
a more e f -
implementation
for
a marked s i m i l a r i t y
of software
down.
the
to
on
in
correctly,
up p o r t a b i l i t y
to o p t i m i z e
required
result
hardware features,
the r e a l
organized
is
should
special
an i m p l e m e n t a t i o n
we have not g i v e n
tion
e.g.
these operations
than
this
can be t a k e n
between
is
le the exercise
There
more e f f o r t
However,
ficient
target
to r e a l i z e
in
for the
automal o o k more noted
that
STAGE2 Janus
These can e i t h e r instructions
sta-
rules
stybe
o f an even
simpler abstract machine at a lower level. Thus the problem of obtaining a program written in LSD reduces to the problem of realizing one of these simple abstract machines - a much less burdensome task.
We have noted what influence the use of a hierarchy has on portability; we must also examine briefly how it affects adaptability. The techniques discussed in section 5.2 could be applied at any of the levels, thereby allowing the implementor a great deal of freedom in choosing the final form of the program. For example, the way in which an operation at one level is expanded into those of the next level could be controlled by declarations made before translation starts and the algorithm modified accordingly. Changes in the behaviour of the algorithm are still something for which the programmer must plan in his original design. However, they do not necessarily have to be incorporated in the original text. As far as altering the way an algorithm is realized, the use of a hierarchy offers many possibilities.
7.2. A STANDARD BASE FOR THE HIERARCHY
From the discussion in the previous section, it is clear that the design of the machine A1 at the base of the hierarchy has little to do with the characteristics of the problem being solved. Its function is merely to ensure that the program is highly portable. This suggests that it may be possible to make A1 common to a number of hierarchies. In fact, there could be a number of machines above A1 which exist in various hierarchies, and one is led to suggest the possibility of setting up a tree structured hierarchy of abstract machines as shown in Figure 7.2. The machine at the base of this tree is called SELEAC (Standard Elementary Abstract Computer). Once the rules describing SELEAC in terms of the target machine have been developed, then any piece of software constructed for a higher level abstract machine could readily be made available.
We have already noted in section 3.1 the similarity between our techniques and the UNCOL proposal. The similarity is even more marked when one postulates the existence of an abstract machine into which all other machines can be translated in a machine independent way. However, remember that the function of SELEAC is to ensure that the software is portable; it makes no guarantees about efficiency. Further, the user is always free to by-pass any number of the lower levels and map a specific abstract machine directly onto the hardware to obtain a more efficient implementation. Thus
[Figure 7.2. Tree-structured Hierarchy of Abstract Machines, with SELEAC at the base. Legend: Abstract Machine, Real Machine.]
the inflexibility of the UNCOL proposals has been avoided.
In creating the design of the SELEAC machine, we must take account of the following factors:
(a) SELEAC must be simple enough so that it can be quickly and easily implemented on a wide variety of current machines. On the other hand it must not be so simple that the resulting implementations are unusable.
(b) SELEAC must be extendable. It must be possible to add new facilities to the machine if they cannot easily be expressed in terms of existing facilities. Thus the inclusion of a new high level abstract machine in the hierarchy can produce an extension in the design of the base machine if its particular needs cannot be satisfied.
Our starting point for the design of SELEAC is the basic hardware model discussed in section 3.2. We have indicated how it could be realized with an acceptable degree of efficiency on any of the register/processor organizations described in section 6.1. Remember, however, that the model left certain things unspecified since it is merely a framework within which to design abstract machines. For example, it said nothing about the way the memory is to be organized. SELEAC, being an abstract machine within this framework, must specify such details explicitly.
SELEAC is a single address machine with two registers: an accumulator and an index register. The integer arithmetic operations of addition, subtraction, multiplication and division can be performed on the accumulator; only addition and subtraction operations are available for the index register. The stack which formed part of the basic model has not been included specifically since it can be simulated in memory.
The SELEAC memory is divided into two main areas. The first is a static memory which may be addressed both directly or indirectly and which holds both instructions and data. Nothing is specified about the size of a word except to note that it must be large enough to hold the elementary data types used in the higher level machines. In general, we
would expect it to be 16 bits or larger. The second area is a dynamic data memory which can only be addressed indirectly and which could, in some implementations, be mapped onto backing store. The address for either memory is constructed from the operand address and the contents of the index register (if specified). Thus the index register fulfils the function of the modifier discussed in connection with the basic model and is used in addressing elements of a data aggregate. The selection between static and dynamic memory is made on the basis of the operation code. Only fetch and store to and from the accumulator are permitted for the dynamic memory. Instructions which access the static memory include memory/register transfers and the arithmetic operations.
Another factor which influenced the design of SELEAC was the possibility of using it to start the bootstrap sequence. As we noted in Section 4.2, STAGE2 could be distributed expressed in the object code of a simple abstract machine. All that would then be required to obtain a running version of the program would be an interpreter for this simple machine. Subsequently we could use this interpretive version to obtain a more efficient version by translating STAGE2 expressed in the language of a higher level abstract machine directly into the instructions of the target computer. If the simple machine used was SELEAC, then we would only need to construct this interpreter once for a given computer. Any item of portable software expressed in SELEAC could then be examined with minimum effort before any decision was taken to expend further effort to produce an optimized version. If SELEAC is to be capable of supporting STAGE2, then some attention must be paid to the problem of storing and manipulating characters. For this reason, we have added instructions to SELEAC which allow characters to be packed into or unpacked from the accumulator. The number of characters which can be stored in a word is only specified via a manifest constant.
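The role of the manifest constant in character packing can be sketched as follows. This is only an illustration under stated assumptions: the names `CHARS_PER_WORD`, `CHAR_BITS`, `pack_word`, `unpack_word` and `pack_text` are hypothetical, the 8-bit character width and left-to-right packing order are assumptions, and the actual SELEAC pack and unpack instructions are not reproduced here.

```python
# Manifest constant: fixed before translation for a particular target machine.
CHARS_PER_WORD = 2   # hypothetical value, e.g. for a 16-bit word
CHAR_BITS = 8        # assumed width of one character
CHAR_MASK = (1 << CHAR_BITS) - 1


def pack_word(chars):
    """Pack up to CHARS_PER_WORD characters into one integer word."""
    if len(chars) > CHARS_PER_WORD:
        raise ValueError("too many characters for one word")
    word = 0
    # Short groups are padded with blanks so every word is full.
    for ch in chars.ljust(CHARS_PER_WORD):
        word = (word << CHAR_BITS) | (ord(ch) & CHAR_MASK)
    return word


def unpack_word(word):
    """Recover the CHARS_PER_WORD characters held in one word."""
    chars = []
    for position in range(CHARS_PER_WORD):
        shift = CHAR_BITS * (CHARS_PER_WORD - 1 - position)
        chars.append(chr((word >> shift) & CHAR_MASK))
    return "".join(chars)


def pack_text(text):
    """Pack a string into a list of words, CHARS_PER_WORD characters each."""
    return [pack_word(text[i:i + CHARS_PER_WORD])
            for i in range(0, len(text), CHARS_PER_WORD)]
```

Because the routines depend on the word capacity only through `CHARS_PER_WORD`, moving to a machine with a different word size means changing that one constant, which is the point of specifying it as a manifest constant.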
Both conditional and unconditional transfers of control are provided. In keeping with the basic model, there is a test register which may be set by an instruction which compares its operand with the current contents of the accumulator. The comparison does not alter the contents of the accumulator. This register is then inspected by the conditional control operations to determine whether a branch is required or not. Operations to effect entry to or exit from subroutines are included in the instruction set. The exact mechanism is left unspecified and depends on the particular implementation. For example,
in an interpreter constructed for SELEAC, the return address is stored in the first location of the called routine. Input/output is handled by one instruction in SELEAC, which interfaces the machine to the standard I/O package. Its operand specifies the address of a list of parameters required by the operation.

The assembly code provided for the machine consists of declarative and executable statements. Each instruction is made up of a mnemonic operation code and a single operand. The general form of the operand is

     n(m)t

where n is an identifier which may stand for the name of an array, a label or a procedure. It is unique to the whole of the program. As noted later, a single valued variable is regarded as an array of one element. m is an integer arithmetic expression whose evaluation results in an integer which may be positive or negative. The expression may contain integers, manifest constants and character constants. t is the operand type, which is X if the contents of the index register are to be used in forming the final address and is null for just an ordinary address. This address is evaluated as follows :

(i)   determine the address associated with n. If n is null then this address is taken to be 0.
(ii)  evaluate m and add the resulting integer to the address of n.
(iii) if indexing is specified, add in the contents of the index register.

There are 24 operations specified for the basic SELEAC machine, 16 of which may be extended to indicate that the operand is to be interpreted as immediate, e.g.

     LDA   CELL(N)X  -  load contents of the (N+x)th element of the array CELL into the accumulator
     LDAI  CELL(N)X  -  load address of the (N+x)th element of the array CELL into the accumulator
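The three evaluation steps and the difference between LDA and LDAI can be illustrated in a modern language. The following Python fragment is only a sketch under stated assumptions: the class Machine, its symbol table, the fixed memory size and the method names are hypothetical and are not part of the SELEAC definition, and the expression m is taken to have been evaluated to an integer already.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical model of the state needed to evaluate an operand n(m)t."""
    memory: list = field(default_factory=lambda: [0] * 64)
    symbols: dict = field(default_factory=dict)   # identifier -> address (assumed)
    index_register: int = 0
    accumulator: int = 0

    def effective_address(self, n, m=0, indexed=False):
        # (i) address associated with n; a null n contributes 0
        address = self.symbols[n] if n else 0
        # (ii) add the integer obtained by evaluating m
        address += m
        # (iii) if indexing is specified (type X), add the index register
        if indexed:
            address += self.index_register
        return address

    def lda(self, n, m=0, indexed=False):
        """LDA: load the contents of the addressed word into the accumulator."""
        self.accumulator = self.memory[self.effective_address(n, m, indexed)]

    def ldai(self, n, m=0, indexed=False):
        """LDAI: immediate form, load the address itself into the accumulator."""
        self.accumulator = self.effective_address(n, m, indexed)


# Usage: CELL starts at address 10, N = 3, index register holds 2.
machine = Machine()
machine.symbols["CELL"] = 10
machine.index_register = 2
machine.memory[10 + 3 + 2] = 99

machine.lda("CELL", 3, indexed=True)    # accumulator now holds memory[15]
loaded_contents = machine.accumulator
machine.ldai("CELL", 3, indexed=True)   # accumulator now holds the address 15
loaded_address = machine.accumulator
```

Keeping the address calculation in one routine shared by both forms reflects the way the immediate variant is described: the same operand, interpreted as a value rather than as a location.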
All identifiers used in a SELEAC program must be declared. This includes labels, procedures and arrays. Only one dimensional arrays are provided and an array declaration specifies how much space is to be allocated. A variable is treated as an array of one element. Array elements may be initialized to the value of an integer expression which obeys the same rules as m described above. Each array is assumed to be a linear sequence of words but nothing is specified about the contiguity of different arrays.

Our intention in presenting this brief description of the structure of SELEAC is to give the reader some indication of our current thinking about the characteristics of an abstract machine which could serve as a standard base of the hierarchy. Obviously, what we have described is merely the core of this machine. It will need to be extended considerably if it is to be capable of supporting operations widely available on current machines, e.g. shifts, floating point arithmetic etc. The design is an ongoing process and the final form will depend on a series of experiments to determine the ease with which it can be mapped onto actual machines and its suitability as a target language into which higher level language may be translated.
7.3. A CASE STUDY

Finally, let us consider briefly a case study which involves the construction of a piece of portable software via a hierarchy of abstract machines. We have already tested out some of the ideas on our early machine models [32, 37] and these experiments have indicated that the approach is a feasible one. The program we will examine is a new version of the text editor MITEM which is being constructed in a hierarchy that includes LSD.

In Section 5.1., we noted that the basic data types required for MITEM were strings and integers. Instead of choosing a representation for a string as we did in TEXED, let us first enquire what characterizes a string and what operations on strings we might need to implement the MITEM algorithm.

A string is an ordered sequence of characters which may be completely specified in a machine by the address of the first character (BASE)
and the number of c h a r a c t e r s cess of e d i t i n g , We t h e r e f o r e cursor
need to m a i n t a i n
and t h i s If
(HEAD).
then in o r d e r to check quantitiy
the a c t u a l
these q u a n t i t i e s
for
r a y of f o u r
In
be t r a n s l a t e d
named by the f i r s t
providing
If
into
the
we w i l l
refer
Thus the de-
of a s t r i n g
register
and
the c o r r e s p o n d i n g
by
STAGE2
'
the c h a r a c t e r
control
If
Clearly
LT
It
is
named
transferred
the search i s the s t r i n g
the o p e r a t i o n
successful is
set
can r e a d i l y
statement
~
procedure
LSD
a
:
associated with
, ')
the search f a i l s .
code sequence c o n t a i n i n g
recognition
' CHARACTER
conditional
('
by the i n s t r u c t i -
The f o r m a t of the o p e r a -
its
parameter.
register
SEARCH
and p a r t l y
parameter f o r
character. LSD
ar-
CHARACTER.
the search f a i l s ,
g i v e n as the t h i r d
IF
cumulator if
' FOR
STRING
in the s t r i n g
string,
Thus the o p e r a t i o n
LSD.
to the r e q u i r e d
the data t y p e
REGISTER.
on some machines.
into
HEAD
and t o g e t h e r w i t h
to m a n i p u l a t e these data types are d e t e r -
translation
to p o i n t
line
in the c r e a t i o n
a view to s i m p l i f y i n g
in the second p a r a m e t e r . then
STRING
chosen w i t h
to the l a b e l
are s u f f i c i -
space to hold the maximum s i z e a l l o w e d
required
SEARCH
a maxi-
need to have
quantities
constitute
by the needs of the a l g o r i t h m
scans the s t r i n g
we w i l l
is
not r e q u i r e d ,
these would be r e p r e s e n t e d by an i n t e g e r
LSD,
ons we might e x p e c t to f i n d and i t s
there is
Since we may wish to r e f e r e n c e
elements and an a r r a y of t y p e
mined p a r t l y is
itself,
of the
character
that
at any i n s t a n t
machine model.
results
The b a s i c o p e r a t i o n s
tion
overflow,
v i a the data t y p e
position
allocation
These f o u r
of a s t r i n g
of s u f f i c i e n t
the s t r i n g ,
permits)
pro-
being scanned.
from the f i r s t
i n d e p e n d e n t l y of the c h a r a c t e r
of a s t r i n g
the r e s e r v a t i o n
possible
(LIMIT).
of c h a r a c t e r s
the a b s t r a c t
to them c o l l e c t i v e l y claration
for
the s t a t e
string
in
STRING
to the c u r r e n t
and dynamic s t o r a g e
available
ent to d e f i n e
a pointer
(as the a l g o r i t h m
any s t r i n g
During t h e
(LENGTH).
along the s t r i n g
can be held as an o f f s e t
we assume
mum s i z e f o r this
in the s t r i n g
moves a c u r s o r
MITEM
:
GOTO
'
returns
a - 1
could a l s o be t r a n s l a t e d
TRANSLATE
AND
TEST
in the acinto
instruction
an onif
such is a v a i l a b l e
on the t a r g e t
machine.
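The string description and the SEARCH operation discussed above can be sketched as follows. This is a minimal illustration, assuming that the four quantities BASE, LENGTH, HEAD and LIMIT are held together in one descriptor, that characters live in a separate store, and that SEARCH scans forward from the current cursor position. The names StringDescriptor, search and append are hypothetical; only the four quantity names and the -1 failure value come from the text.

```python
from dataclasses import dataclass


@dataclass
class StringDescriptor:
    base: int     # BASE: address of the first character
    length: int   # LENGTH: number of characters in the string
    head: int     # HEAD: cursor position, held as an offset from the first character
    limit: int    # LIMIT: maximum size, used to check for overflow


def search(store, s, ch):
    """Scan from the cursor for ch (assumed scan direction: forward).

    On success HEAD is moved to the character found and its offset is returned.
    On failure HEAD is left alone and -1 is returned.
    """
    for offset in range(s.head, s.length):
        if store[s.base + offset] == ch:
            s.head = offset
            return offset
    return -1


def append(store, s, ch):
    """Add one character, refusing to grow the string beyond LIMIT."""
    if s.length >= s.limit:
        raise OverflowError("string would exceed LIMIT")
    store[s.base + s.length] = ch
    s.length += 1


# Usage: an 11-character string held in a 16-character area of the store.
store = list("HELLO WORLD") + [""] * 5
line = StringDescriptor(base=0, length=11, head=0, limit=16)
found = search(store, line, "W")     # cursor moves to the "W"
missing = search(store, line, "Z")   # no "Z": cursor stays where it was
```

Because the descriptor is separate from the characters, a routine of this kind could be replaced by an inline machine instruction without changing the callers, which is the point of treating SEARCH as a special operation.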
The language in which the program is being written is a mixture of the special operations in the form described above and LSD. In order to distinguish between the procedures which implement the algorithm and those which provide the LSD equivalents of the special operations, the latter are declared as OPERATIONS. Thus, the operation SEARCH described above would be declared as

     OPERATION : SEARCH STRING ' FOR CHARACTER '

which could readily be translated into the equivalent LSD procedure declaration. By distinguishing operations in this way, we can easily arrange to replace the LSD procedure by a more efficient routine in assembly code or delete it entirely if the call itself is replaced by an inline code sequence.
The macros which p e r f o r m the t r a n s l a t i o n
language to
and are e f f e c t i v e l y
SELEAC
free
may be combined
LSD
to come in at any l e v e l
The development of t h i s yet,
we have no
or o t h e r w i s e , processor
L.
a pre-pass.
The i m p l e m e n t o r i s
[38]
new v e r s i o n o f
supports without
The language i s q u i t e to r e a l i z e
existing this
version of
figure,
from
L
machines. L
to
high about
3
and whose s t r u c t u r e ML/I
The form of
-LOWL.
(e.g.
360
ML/1
man-weeks.
g e n e r a t e d from
the o p t i m i z e d v e r s i o n . considerably
LOWL
of this
loss
an a b s t r a c t
bootstrap, In an e f f o r t [39]
LOWL
i s more o r i e n t e d is
in e f -
machine c a l -
towards
to c o n v e r t the source t e x t
LOWL
process
and e x p e r i e n c e in a number of man months o f e f f o r t
such t h a t
macro a s s e m b l e r ,
implement the program on a new machine. 2
for
level
c r e a t e d a machine c a l l e d
He then used
macro p r o c e s s o r s i s now about
underway and, as
in an u n a c c e p t a b l e
on a n o t h e r computer.
ML/1
therefore
the use o f a h i e r a r c h y
on a new machine v i a a h a l f
Brown f i r s t
s i m p l e r than real
L
to
out by Brown on the macro
the view t h a t
resulting
i m p l e m e n t a t i o n s has shown t h a t quired
is s t i l l
MITEM
i s a program c o n s t r u c t e d
ML/I
LSD
to c a r r y out o p t i m i z a t i o n .
measurements to assess the e f f i c i e n c y
improves p o r t a b i l i t y led
those which c o n v e r t
However, some work c a r r i e d
ML/1
ficiency.
from the s p e c i a l
with
to reduce which that of
of ML/1
can be used to
required
to do t h i s
However, measurements on the e f f i c i e n c y indicate
that
it
Thus the p o r t a b i l i t y
increased at virtually
no c o s t
is
only
is
any of the common
STAGE2)
The e f f o r t
are r e using an
5 %
s l o w e r than
of the program has been in e f f i c i e n c y .
of
8. REFERENCES

1  Griswold, R.E., Poage, J.F., Polonsky, I.P. The SNOBOL4 Programming Language. Prentice-Hall, Englewood Cliffs, N.J., 1969.
2  Griswold, R.E. The Macro Implementation of SNOBOL4. W.H. Freeman & Co., San Francisco, 1972.
3  Harr, J.A. The design and production of real-time software for electronic switching systems. Quoted in Software Engineering, Naur, P., Randell, B. (Eds.), NATO Science Comm., Jan. 1969, 27.
4  Poole, P.C., Waite, W.M. A Machine Independent Program for the Manipulation of Text. Tech. Rept. 69-4, Computing Center, University of Colorado, 1969.
5  American National Standards Institute. FORTRAN, X3.9-1966.
6  Galler, B.A., Perlis, A.J. A Proposal for Definition in ALGOL. CACM, 10 (April, 1967) 204-219.
7  van Wijngaarden, A. (Ed.), Mailloux, B.J., Peck, J.E.L., Koster, C.H.A. Report on the Algorithmic Language ALGOL 68. Numerische Mathematik, 14 (1969) 79-218.
8  Newey, M.C. An Efficient System for User Extendible Languages. Proc. AFIPS FJCC, 33 (1968) 1339-1347.
9  Weizenbaum, J. Symmetric List Processor. CACM, 6 (September, 1963) 524-544.
10  SHARE Ad-Hoc Committee on Universal Languages. The Problem of Programming Communication with Changing Machines : A Proposed Solution. CACM, 1 (1958) 12-15.
11  Sibley, R.A. The SLANG System. CACM, 4 (Jan., 1961) 75-84.
12  Richards, M. BCPL : A Tool for Compiler Writing and System Programming. Proc. AFIPS SJCC, 34 (1969) 557-566.
13  Irons, E.T. A Syntax Directed Compiler for ALGOL 60. CACM, 4 (1961) 51-55.
14  McKeeman, W.M., Horning, J.J., Wortman, D.B. A Compiler Generator. Prentice-Hall, Englewood Cliffs, N.J., 1970.
15  Foster, J.M. A Syntax Improving Program. Computer J., 11 (May, 1968) 31-34.
16  Waite, W.M. Implementing Software for Non-Numeric Applications. Prentice-Hall, Englewood Cliffs, N.J., 1973.
17  Irons, E.T. Experience with an Extensible Language. CACM, 13 (January, 1970) 31-40.
18  Yezerski, A. Extendible Contractible Translators. Ph.D. Thesis, University of New South Wales, Sydney, Australia, 1972.
19  McClure, R.M. TMG - A Syntax Directed Compiler. Proc. ACM 20th National Conference, 1965, 262-274.
20  Brooker, R.A., Morris, D. Some Proposals for the Realization of a Certain Assembly Program. Computer J. (1961) 220-224.
21  Waite, W.M. A Language Independent Macro Processor. CACM, 10 (July, 1967) 433-440.
22  Brown, P.J. The ML/I Macro Processor. CACM, 10 (October, 1967) 618-623.
23  Waite, W.M. The Mobile Programming System : STAGE2. CACM, 13 (July, 1970) 415-421.
24  McIlroy, M.D. Macro Instruction Extensions of Compiler Languages. CACM, 3 (April, 1960) 214-220.
25  Waite, W.M. The STAGE2 Macro Processor. Tech. Rept. 69-3-B, Computing Center, University of Colorado, 1969.
26  Halstead, M.H. Machine Independent Computer Programming. Spartan Books, Washington, D.C., 1962.
27  Waite, W.M. Building a Mobile Programming System. Computer J., 13 (February, 1970) 28-31.
28  Orgass, R.J., Waite, W.M. A Base for a Mobile Programming System. CACM, 12 (September, 1969) 507-510.
29  Poole, P.C., Waite, W.M. Input/Output for a Mobile Programming System. Software Engineering, Vol. 1, Tou, J.T. (Ed.), Academic Press (1970).
30  Waite, W.M. A New Input/Output Package for the Mobile Programming System. Department of Information Science, Monash University, Clayton, Victoria, Australia (1970).
31  Waite, W.M. Input/Output Conventions for Abstract Machines. Proc. Culham Symposium on Software Engineering (April 1971).
32  Newey, M.C., Poole, P.C., Waite, W.M. Abstract Machine Modelling to Produce Portable Software - a Review and Evaluation. Software, 2 (1972) 107-136.
33  Knuth, D.E. An Empirical Study of FORTRAN Programs. Software, 1 (1971) 105-133.
34  Wirth, N. The Design of a Pascal Compiler. Software, 1 (1971) 309-333.
35  See reference 31.
36  Randell, B., Russell, L.J. ALGOL 60 Implementation. Academic Press (1964).
37  Calderbank, V.J., Calderbank, M. LSD Manual. CLM-PDN 9/71, Culham Laboratory UKAEA, Abingdon, Berkshire (1971).
38  Poole, P.C. Hierarchical Abstract Machines. Proc. Culham Symposium on Software Engineering (April 1971).
39  See reference 22.
40  Brown, P.J. Levels of Language for Portable Software. CACM, (to be published).
CHAPTER 3.C.

DEBUGGING AND TESTING

P. C. Poole
Culham Laboratory, Abingdon, Berkshire
Great Britain

1. INTRODUCTION

When a programmer looks at the output from his latest run and finds that the computer has kindly sent him a message

     UNRECOVERABLE ERROR DUE TO UNKNOWN CAUSE

then, although he may not realise it at the time, he is about to move from the testing to the debugging phase of software production. Testing has demonstrated the presence of a "bug"; debugging must now be used to identify what caused it. Too often, the starting point for this process is the 50 or so pages of hexadecimal dumps that the computer appends to the above message to help clarify its lucid comment. However, there are alternatives and an important function of software engineering is to ensure that powerful tools and techniques for testing and debugging are readily available and widely used.

Now it seems to be the nature of things, at least within the current state of the art, that programmers make, and will continue to make mistakes. Whether these are due to faults in the design, misunderstandings of the design specifications or just plain coding errors, there is a tendency to refer to such mistakes as bugs, almost as if to attribute their existence to some external agency. It has been suggested (cynically, I hope) that it is only through such a transference of responsibility that programmers are able to preserve their own sanity - considering the rate at which they seem capable of generating such errors.
Now there is no doubt at all in my mind that the best way of reducing the number of bugs in a program is to prevent them ever occurring in the first place. Many of the topics discussed in this course have been aimed at just this objective - the use of high level languages, structured programming, portability, modularity etc. However, although such techniques can reduce the probability of bugs occurring, they cannot guarantee that it will be zero. The programmer must still face the task of demonstrating that the program he has constructed will operate correctly, at least to his own satisfaction. It may be possible at some time in the future to effect this by means of a formal proof. The question of the correctness of programs is currently evoking a great deal of interest in the computer field and a number of significant advances have been made. Noteworthy is the work of the Vienna group who were able to demonstrate an error in an IBM PL/I compiler by formal methods [1]. However such techniques are still a long way from being generally applicable to the wide range of software we are currently required to produce. As good software engineers, we must remain aware of such developments and make use of them when and where we can but, in the meantime, the task of proving the program by more conventional means still remains. We must therefore plan for a testing and a debugging phase in the software production sequence. If there is an underlying theme to these lectures, then it is the importance of planning well ahead of time for these phases. I shall return to consider this point in more detail in Section 2.

Testing and debugging must not be viewed in isolation from the other phases of software production. Once design and implementation have been completed, the programmer commences testing and attempts to demonstrate the presence of bugs by exercising the program with a suitable set of test data. It is important
to remember that testing can never show the absence of bugs, only their presence. Once an error has been detected, the programmer hopefully makes use of various debugging tools and techniques in order to determine why it occurred. In turn, these may lead him back either to the design or the implementation phase to correct the error in the light of what he has discovered. Once the correction has been made, the testing phase is re-entered. This cycling through the phases continues until the programmer is satisfied that the program is working correctly (which unfortunately too often means merely that there are no obvious errors).

The next phase in the life of the program depends on whether it is part of a larger system (in which case it will be integrated with other modules and the testing/debugging phase re-entered) or whether it is a product to be made available to a customer. Experience has shown that it is not good sense to allow a piece of software to move directly from the developer to the customer without carrying out further tests. These are performed in the validation and certification phases. The former is similar to testing except that the test data is prepared by persons other than the developer. Usually such people belong to the same organisation, i.e. the supplier of the software. This phase checks the program for completeness and to see that it meets its design specifications. Certification, on the other hand, is the responsibility of the customer and again involves testing the program, this time to see that it performs satisfactorily under local conditions. HELMS will have more to say about them in his lectures. I have mentioned these phases here since each involves testing in one form or another. Planning for the testing phase must take account of the existence of these subsequent phases and make special provision for them.

In presenting these lectures, I must make it very clear at the outset that I am not proposing some new and revolutionary approach to the processes of testing and debugging. Many of the techniques I will discuss should already be familiar to you,
at least I hope they are. Yet one cannot help noticing that programmers continue over and over again to make the same mistakes when producing software. In some cases, the techniques are not well documented and are part of the folklore of programming. They are often only learnt in the hard school of experience. The new programmer quickly discovers that writing programs is easy; it takes him some time to realise that engineering reliable software can be difficult. Only then is he prepared to pay more attention to the problems of testing and debugging and consider how they could be solved more effectively.

For these reasons, two main points I want to stress in these lectures are :

(a) the importance of planning for the testing and debugging phases;
(b) the use of testing and debugging aids.

In concentrating on these points, I am of course assuming that sensible decisions have already been made about such factors as language, structure, modularity etc.

2. PLANNING FOR THE TESTING AND DEBUGGING PHASES

Forward planning is of paramount importance if testing and debugging are to be carried out effectively and efficiently; lack of it will almost certainly guarantee that the phases are long and costly ones. The planning should form an integral part of the design process; it cannot sensibly be carried out after the software has been written. Let us examine what factors we should consider and what influence they might have on testing and debugging.
2.1 DOCUMENTATION

The need for and importance of high standards in documentation have already been stressed in this course by GOOS. Documentation has a fundamental role to play in ensuring that the testing and debugging phases can be successfully carried out. Lack of good documentation usually means that testing is not performed as thoroughly as it should be and debugging is that much more complicated.

It has been suggested that computer programs should be written and documented in such a way that they could be published [2]. An analogy is drawn between a computer program and a text book and it is proposed that many of the specific techniques used to make a book attractive and readable should be adopted in order to remove difficulties that people encounter when trying to understand programs. Particular attention should be paid to techniques for improving the readability of the program by people; it should not just be considered as input to a machine. Thus every variable used should be annotated, every subroutine should be accompanied by a description of what it is intended to do and what algorithm it uses. Any listing of the program should be paginated so that a table of contents can be set up to show where every subroutine or module occurs. There should also be an index which shows where every routine and variable is referenced. In creating such documentation, the programmer must take as his goal the need to ensure that someone other than himself can understand the program and how it should behave. The effort required to document a program to this standard will pay handsome dividends in the testing and debugging phases.

Other useful pieces of information which could be recorded in the commentary are any conditions which should be true
at particular points in the program. For example, if an integer input parameter to a procedure should lie within a given range, then this fact should be recorded in the comments associated with the procedure. Such information is useful for debugging and invaluable for maintenance and enhancement. Too often, when a working program is modified, it stops working because some implicit restriction has been broken by the modifications. If the condition is not specifically checked in the code, then the comment will at least warn the person making the change that a particular restriction should apply.

Documentation should also be used by the programmer to remind himself of what actions need to be carried out during testing. He is much more aware of the problems that may arise from using a particular algorithm and the special cases that need to be checked while he is writing a particular routine than he will be when the whole program is completed. If he includes such information as comments in the program, he is much more likely to remember to make the appropriate checks when testing is being carried out.

The whole question of documenting test procedures is one that requires considerable attention. It is just as important as the documentation of the program itself, particularly if the program is one which will be maintained and enhanced by other people. There is a tendency on the part of programmers to ignore this aspect of program development and to treat test data as something to be discarded once the program is thought to be working satisfactorily. The correct approach is one in which a well-documented set of test procedures and data is built up and retained throughout the life of the program. Whenever the program is modified, it should be exercised on the test data in an attempt to reveal any interference between the modifications and the original code. Any change made to the
program should be reflected in corresponding changes to the test procedures and data. Remember that such testing is no guarantee that interference has not occurred; it merely indicates that the test data can be processed correctly.

2.2 DEBUGGING CODE

The obvious step from the provision of comments which describe what the program should do is the inclusion of code to enable the programmer to inspect what it is actually doing. The idea of inserting code into a program specifically for the purposes of debugging has been around as long as people have been writing programs [3].

     "It is good to plan to include extra printing of the kind described here in all new programs when they are first drawn up rather than to wait until the program has been tried and found to fail" (Wilkes et al., 1951).

Unfortunately, in spite of such advice and the great wealth of subsequent experience which supports it, there are still far too many programs written today which do not include adequate provision for the testing and debugging phases. Too often the necessary code is only added as an afterthought in an attempt to meet the needs as they arise. It usually results in large quantities of output but little understanding of the problem. Undoubtedly, the most effective debugging aids are those built into the program from the very beginning. They can lead to substantial increases in the productivity of programmers.

Let us consider in general what type of information could be useful and how it might be obtained. Whether or not a programmer
has explicitly to include the code himself will depend on what debugging aids are available in the system. We will discuss these in more detail in Section 3.

The simplest action a programmer can take is to include a number of print statements which allow him to monitor the execution of the program. These might output intermediate results, current values of variables or just markers to indicate that a particular label has been passed or a subroutine entered. The statements produce output similar to that of the trace debugging aid described in Section 3.1. However, they can be made much more selective since the programmer knows which statements should be monitored and what variables examined.

Another function that can be performed by debugging code is to improve the usefulness of post mortem dumps. The evidence that can be obtained from a dump is often not sufficient to locate an error unless steps have been taken to ensure that more information has been placed in the memory than is actually required to solve the problem. For example, on entry to a subroutine one might arrange to store the following:

(a) name of the calling routine;
(b) location in the routine from which the call was made;
(c) value of any input parameters;
(d) value of any global variables altered by the routine;
(e) contents of registers (for assembly code programs);
(f) a count of the number of times the routine has been entered.

If this information is placed in a cyclic buffer, then it can be extracted from the dump for the last n entries to the routine where n is determined by the size of the buffer. Application of this technique need not be restricted to routine
entry points but may be used at any point in the program where the programmer wishes to preserve some current information before it is changed. The main advantage it has over simple print statements is that the amount of diagnostic output can be reduced. If the point of interest is inside a loop, then the technique provides the programmer with the last n values before the program is terminated rather than m values where m is the number of times control has passed through the point. In many circumstances, m could be very much larger than n. Its main disadvantage is that the programmer may have to learn how to interpret the dumps produced by the machine. We will return to consider this point in more detail in Section 3.

Earlier we noted the possibility of including comments describing various conditions and restrictions which should apply at points in the program. Such comments could be supplemented by code which checks that a condition actually holds true, e.g. that an input parameter to a routine does indeed lie within a specified range. The action taken by the program on detecting that the condition is false may simply be to initiate a dump; other possibilities include entering a diagnostic routine or even ignoring the failure (after printing a message) so that testing can continue. Normally one expects to remove such code once the program has passed through the testing and debugging phase. However, in some circumstances, we have found it advantageous to leave instances of such code in programs even when they have been put into service. This is particularly true for real time systems where the input cannot be reproduced exactly. Under such circumstances, it can be a very difficult task to determine the exact cause of a failure. When a fault occurs, it may initiate a long sequence of events which eventually produces the final
287
collapse.
The reconstruction of this sequence can be very difficult since much of the evidence may be of a transitory nature. In such situations, the use of what we have termed "guard code" [4] can pay handsome dividends. This code can be used to apply consistency checks, to test that restrictions have not been violated or to preserve useful information. Effectively one is using the principle of redundancy to improve the reliability of the software in the sense that an attempt is made to ensure that errors do not go undetected. We made very extensive use of the guard code technique during the development of the COTAN multi-access system on the ICL KDF9. It contributed significantly to the rate at which errors could be located
and corrected when the system was put into service.

2.3 GENERATION OF DEBUGGING CODE

The use of debugging code can sometimes lead to difficulties when the time comes to remove it from the source text. One often finds that a program which appears to be working satisfactorily suddenly starts to fail when the debugging code is taken out. This situation is more likely to occur when the code is deleted manually. An alternative approach is to make the debugging code a permanent feature of the source text and generate the actual machine instructions as and when required. An added advantage is that the debugging code can be reactivated at any time even after the program has been put into service. Unfortunately, high level language compilers rarely provide facilities which enable a programmer to control whether code is generated from a particular statement or not. In some implementations, he may be able to request that extra code be generated to help with debugging and we will consider this in more detail in Section 3. However, language designers do not, as yet, seem to have recognised the value of including constructions in a language which may either be treated as a comment or used to produce code depending on the setting of a compiler parameter.

For this reason, a macro processor can be a very useful tool to control the generation of debugging code. The technique is applicable both to assembly code and high level languages, particularly with the availability of such language-independent macro processors as STAGE2 [5] or ML/I [6]. The programmer is free to introduce extra statements into the language providing he defines a set of rules which allows the macro processor to translate these into a sequence of legal operations. This translation is usually carried out in a pre-pass before the source text is input to the compiler. By redefining the rules or parameterising the generation so that one of a number of translation rules can be selected, the programmer can easily arrange that a particular statement or class of statements is ignored. These extra statements therefore form part of the source text but do not result in the production of any code unless it has been specifically requested.
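
The pre-pass idea can be illustrated with a minimal sketch in Python. It is not STAGE2 or ML/I: the marker `*DEBUG ` and the functions `prepass` and `run` are hypothetical names chosen for the illustration. Marked statements remain in the source text permanently and produce code only when the `debug` parameter is set.

```python
def prepass(source_lines, debug=False, marker="*DEBUG "):
    """Translate marked statements into code if debug is set, else drop them."""
    out = []
    for line in source_lines:
        stripped = line.lstrip()
        if stripped.startswith(marker):
            if debug:
                indent = line[: len(line) - len(stripped)]
                out.append(indent + stripped[len(marker):])
            # With debug off the statement stays in the source text
            # but no code is generated from it.
        else:
            out.append(line)
    return out


# Source text containing one permanent debugging statement (hypothetical syntax).
source = [
    "total = 0",
    "for x in (1, 2, 3):",
    "    total += x",
    "    *DEBUG trace.append(total)",
]


def run(debug):
    """Run the translated program and return its result and any trace."""
    env = {"trace": []}
    exec("\n".join(prepass(source, debug)), env)
    return env["total"], env["trace"]
```

Calling `run(False)` executes the program with no debugging code at all, while `run(True)` reactivates it without any change to the source text.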
There is no need to restrict this approach to special statements created by the programmer. If the macro processor is powerful enough to recognise constructs which are legal in the language, then these could be used to direct the generation of debugging code. For example, one could use STAGE2 to detect a function declaration in FORTRAN and generate code to monitor entries to the routine; assignment to a particular variable could be used to produce code to preserve the current value. How far one can use this technique will depend on the facilities in the macro processor and the language in which the program is
written.

Earlier, it was pointed out that debugging code could be used to supplement comments describing some condition which should apply at a particular point in a program. By using a macro processor, it may be possible to actually generate the code from the comment. Thus instead of merely noting in a comment that the value of a variable should lie within a given range, one could introduce a statement of the form

    ASSERT RANGE OF ' IS ' TO '

where the first parameter is the name of the variable and the second and third parameters, the lower and upper limits of the range respectively. This statement contains as much information as the comment but could be turned into the appropriate code via a macro processor. Apart from the economy in writing, an added advantage is that the writer is encouraged to keep the comments up to date as changes are made to the program. The dual role played by the comments can help to ensure that they do not get out of step with the code as is frequently the case.

2.4 MODULARITY

One of the advantages
claimed for a modular approach to the construction of software is that the problems of testing and debugging are greatly simplified. Larger systems can be built up from a number of thoroughly tested modules and a high degree of reliability should be readily obtainable. However, as Dijkstra [7] has pointed out, if the individual modules have a probability p of being correct, then the probability that the whole program is correct cannot exceed p^N where N is the number of modules. For systems where N is large, p must be almost equal to 1 if the overall probability is to be significantly different from zero. Hence, although we make every effort to ensure that individual modules are as correct as possible, we must not overlook the need to test and debug the module aggregate.
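
To give a feel for how quickly the bound p^N collapses, the short Python sketch below evaluates it for a few values. The figures for p and N are assumptions chosen for illustration only, and `aggregate_bound` is a hypothetical helper name.

```python
def aggregate_bound(p, n):
    """Upper bound p**n on the probability that a program of n modules is correct."""
    return p ** n


# Illustrative values only: per-module probabilities and module counts.
for p in (0.99, 0.999, 0.9999):
    bounds = [round(aggregate_bound(p, n), 3) for n in (10, 100, 1000)]
    print(p, bounds)
```

Even with p = 0.99 the bound for 100 modules is already well below one half, which is the point of the argument above.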
In any large system, a module rarely exists as an entity in its own right. It interacts with other modules, passing information to and fro across an interface. Since modules which will ultimately coexist in a large system are often constructed in parallel by different programmers, the difficulty that arises is how to test a module in isolation from the other modules on which it depends. A common solution is to construct a test bed which simulates the behaviour of the module's external environment. The module can then be placed
in the test bed and suitably exercised.

A major drawback with this approach is the difficulty of ensuring that the test bed correctly simulates the actual environment in which the module will eventually operate. Usually the test bed is created by the person writing the module, and any misunderstanding he has about the other modules and their interfaces will be incorporated into it. When a number of modules tested in this way are combined, the result is often chaos even though each module is claimed by its originator to be working satisfactorily. Unfortunately, what has been tested is the interface between the module and its test bed, not necessarily its true interface to the actual system.
There are a number of ways one might improve this situation. The task of creating the test bed may be assigned to someone other than the programmer constructing the module. It may also be possible, in some cases, to construct a common test environment for a number of modules and the responsibility for doing this should be placed in the hands of the more experienced programmers. Test harnesses and test data generators can sometimes be very useful in reducing the probability that the test bed itself contains errors. At a minimum, the writers of individual modules should make extensive use of the guard code technique discussed earlier and be guided by "the principle of mutual suspicion". This states that a module must not make use of any data passed to it across an interface without first checking on its validity as defined by the specification of the interface. The checks applied should be even more rigorous than one might expect to use, say, at a subroutine interface within the module.
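
The principle of mutual suspicion can be sketched in Python as follows. The interface (`allocate_block` with parameters `size` and `priority`) and the limits used are hypothetical, invented only to show the shape of the guard code; rejecting the call and naming the offending item is one possible response to invalid data, not the only one.

```python
class CallRejected(Exception):
    """Raised by a called module to reject a call; names the item at fault."""

    def __init__(self, item, reason):
        super().__init__(f"{item}: {reason}")
        self.item = item


def check_range(name, value, low, high):
    """Guard code: verify that an integer item lies within its specified range."""
    if not isinstance(value, int) or not (low <= value <= high):
        raise CallRejected(name, f"expected integer in {low}..{high}, got {value!r}")


def allocate_block(size, priority):
    # Every item crossing the interface is checked against the (assumed)
    # interface specification before any use is made of it.
    check_range("size", size, 1, 4096)
    check_range("priority", priority, 0, 7)
    return {"size": size, "priority": priority}
```

A caller written in the same spirit would catch `CallRejected` and use its `item` attribute to decide how to isolate the cause of the error.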
Again, the question arises as to what action should be taken if an item of data is found to be invalid. In this case, a possible approach would be to organise inter-module communication in such a way that if one module calls another, it must be prepared to have the call rejected. Thus the called module on finding that the input is invalid can return control to the caller indicating which item is at fault. The writer of this module can then take whatever steps he wishes in an attempt to isolate the cause of the error. The extra code required to make these validity checks will be more than justified by the time saved in locating the errors. It can of course be progressively removed via the techniques discussed above as the reliability of the module aggregate increases.

Another problem raised by the concept of modularity is the choice of the test strategy. Obviously, it would not be a sound policy to wait until all the modules were ready and then attempt to test the whole system as a unit. A better approach is to combine a number of modules and test these thoroughly before going on to form larger aggregates. We will refer to this process as "incremental testing". The size of a test increment will vary from one situation to another but should be chosen to obtain a reasonable balance between increase in complexity and the cost of creating the test procedures.
The concept of incremental testing can also be applied to the construction of individual modules. These usually consist of a number of routines and again one should not wait until the whole module is formed before starting to test. For example, in a module consisting of a main routine which organises calls on a set of subroutines, it may be possible to commence testing as soon as the main routine and a few of the subroutines are ready. Dummy routines could be substituted for any routines not yet available and set up to return values which satisfy the needs of the main sequence. Thus the overall flow of control could be tested. Complex subroutines could first be subjected to a preliminary testing in isolation before being added to the remainder of the program. In this way, the section of the program already checked out can be used to provide a test environment for the new routine.
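
As a rough illustration of testing a main routine before its subroutines exist, the Python sketch below substitutes dummy routines which merely return values that satisfy the main sequence. All names (`main_routine`, `make_dummy_reader`, `dummy_process`) are hypothetical.

```python
def main_routine(read_record, process, write_result):
    """Main sequence: organises calls on a set of subroutines."""
    results = []
    while True:
        record = read_record()
        if record is None:          # end of input
            break
        results.append(process(record))
    write_result(results)
    return len(results)


def make_dummy_reader(records):
    """Dummy input routine: hands back canned records, then None."""
    iterator = iter(records)
    return lambda: next(iterator, None)


def dummy_process(record):
    """Dummy for a complex subroutine that is not yet available."""
    return record


# Exercise the overall flow of control using the dummy routines.
written = []
count = main_routine(make_dummy_reader([1, 2, 3]), dummy_process, written.append)
```

Once the real `process` routine has passed its own preliminary tests, it can replace `dummy_process` and the already checked main routine then serves as its test environment.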
There are no hard and fast rules about the best way to use incremental testing. Much will depend on the type of problem being solved, the structure of the program and the skill of the programmer. However, experience has shown that there are decided benefits from using such an approach - reduced costs and more reliable software. It must therefore be given adequate consideration when the testing phase is being planned.

2.5 PARAMETERISATION

In Section
3.3, we will discuss the technique of extreme case testing. Some attention must be paid during the construction of the software to ways of making such an approach feasible. In particular, the programmer must ensure that the code is well parameterised so that the appropriate test situation can easily be created.

To illustrate this process, suppose we are constructing a dynamic buffer control routine. An area of core will contain a number of buffers and these may be allocated to processes on request. If a process asks for a buffer and none is available, then the control routine will have to move the information from one of the incore buffers to backing store so that the buffer becomes free. Subsequently this information will have to be retrieved once the process to which it belongs is reactivated. Now, it is clear that the mechanisms for transferring information to and from the backing store will only come into operation once the level of activity has risen to the point where the demand for buffers is greater than the supply. In order to check these mechanisms, either we could create a test situation in which demand is high or we could restrict the supply. The latter is by far the easier approach providing the number of incore buffers is a parameter which can readily be altered. If this is the case, then the first test might be one in which there is only a single buffer. This number could be gradually increased as the 'level of confidence' in the reliability of the various parts of the package rises.

When the package is included in, say, a real time system and put into service, the number of buffers required may depend on such factors as the amount of core available and desired response time. The parameter can be set accordingly. If at some later stage in the life of the system, we suspect that an intermittent error is being caused by the package, then we could reduce the number of buffers available in an attempt to increase the frequency with which the error occurs and the probability of locating it. The ability to control the availability of this resource simply by changing one parameter has provided us with a powerful and convenient method for creating test situations.
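
The buffer example can be sketched in Python along the following lines. This is only an illustration under simplifying assumptions: the class `BufferPool`, its oldest-first eviction rule and the counters are hypothetical, and dictionaries stand in for core and backing store. The point is that a single parameter, `n_buffers`, decides whether the transfer mechanisms are exercised at all.

```python
class BufferPool:
    """Toy buffer control routine; n_buffers is the single tunable parameter."""

    def __init__(self, n_buffers):
        if n_buffers < 1:
            raise ValueError("need at least one buffer")
        self.n_buffers = n_buffers
        self.in_core = {}          # owner -> data, oldest entry first
        self.backing_store = {}
        self.swaps_out = 0
        self.swaps_in = 0

    def _make_room(self):
        # Move the oldest in-core buffer to backing store until one is free.
        while len(self.in_core) >= self.n_buffers:
            victim = next(iter(self.in_core))
            self.backing_store[victim] = self.in_core.pop(victim)
            self.swaps_out += 1

    def acquire(self, owner, data):
        self._make_room()
        self.in_core[owner] = data

    def reactivate(self, owner):
        # Retrieve the owner's information if it was moved to backing store.
        if owner in self.backing_store:
            self._make_room()
            self.in_core[owner] = self.backing_store.pop(owner)
            self.swaps_in += 1
        return self.in_core[owner]


def exercise(n_buffers):
    """Run the same workload against a pool of the given size."""
    pool = BufferPool(n_buffers)
    for owner in ("A", "B", "C"):
        pool.acquire(owner, owner.lower())
    values = [pool.reactivate(owner) for owner in ("A", "B", "C")]
    return values, pool.swaps_out, pool.swaps_in
```

With a single buffer the same workload forces every transfer path to run; with three buffers none of them is touched.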
Intelligent use of parameterisation is not as widespread, in my opinion, as it should be amongst programmers. One still finds far too many instances in current day systems where quantities which should be parameters have been built in as an integral part of the code. It then becomes a very difficult and expensive task to change them. All languages should include features which encourage programmers to make use of these techniques. For example, the manifest constant facility is a very useful feature for allowing a programmer to parameterise both declarations and executable statements. A macro processor can be a useful tool in this context if the language does not provide the appropriate facilities. A better understanding of the value of parameterisation could help programmers to improve the quality of current software, particularly in the areas of reliability and adaptability.

3. TESTING AND DEBUGGING TECHNIQUES
Once the testing and debugging phases have been entered, a programmer must be aware of the tools and techniques that are available to aid these processes. An important aspect of Software Engineering is the development of such tools and techniques since their use can greatly reduce the cost of producing software and contribute significantly to an increase in reliability.

Underlying any discussion on the use of testing and debugging aids is the problem of man-machine communication. On the one hand, the programmer must be able to state conveniently and succinctly what aspects of the behaviour of the program he wishes to examine and what information he requires to assist him in locating an error; on the other hand, the machine must be able to respond with clear and informative diagnostic messages.
A singularly bad feature of many current systems is the lack of attention that has been paid to these aspects of the man-machine interface. To use some debugging aid, the programmer often has to prepare his requests in a coded form which is difficult to comprehend and construct. Diagnostic messages from the machine are often unintelligible to the average programmer or even non-existent [8]. In many cases, the messages which are produced appear to be the relics of diagnostic output included by the system programmer for his own purposes. They are often very cryptic, being composed of numbers and symbols which, although perhaps of some value to the originator of the program, are meaningless to the user. It is clear that we will have to pay considerable attention in the future to obtaining a better dialogue between man and machine if we wish to improve the processes of testing and debugging.

3.1 CLASSICAL DEBUGGING TECHNIQUES

We will consider first some of the classical techniques
and examine ways in which they might be improved.

(a) Post-Mortem Dumps

A common sight around any computer room is that of a programmer bearing away a large pile of computer output. As often as not, it is a system dump. Nowadays most computer systems provide such a facility, activated either by a call from the program or in the event of some catastrophic failure. Dumps can be a very useful debugging aid but their value is often reduced by the way in which they are implemented. The output tends to be voluminous and use of octal or hexadecimal characters in the listing can make understanding the dump a difficult task, particularly for high level language programmers who often have little knowledge about the structure of the underlying machine. To increase the usefulness of this aid we need to be able to control the amount of information produced and make the dump listing more meaningful to the programmer.

At a minimum, the program should be able to notify the system of which areas are to be dumped and in what format the information should be produced. For example, a FORTRAN program could request that a certain array be dumped in the event of a failure and that the format required is integer or floating point. A further improvement would be to permit the program to select a particular area as a candidate for dumping and later cancel this request. Thus as the program executes, it could notify the system of the areas to be dumped and these can be accumulated in a dump request list. Subsequently, control may reach a point at which the programmer knows that the requests made so far would be of little value if a failure occurred in the next stage. Thus he could cancel all current requests and re-initialise the list. New items could then be added and when the program ultimately fails only those areas currently of interest will be dumped. If the final instruction clears the dump request list before the program terminates successfully, then no dumps will be produced unless the program ends prematurely. Thus the instructions controlling the state of the dump request list could advantageously be made a permanent part of the program so that diagnostic information is produced if some unforeseen set of circumstances causes the program to fail.

A major increase in the usefulness of a dump can be made if all the information presented to the user is expressed in terms of his source program and the language in which it is written. This implies that the dump routine must have access
to the symbol table. This presents no great problem if the program is being executed in a test mode in which case the symbol table could be made available in core. Some difficulties arise if we wish to retain the facility after the program has passed through the testing phase into production. In this situation, one would not wish to waste core to hold the symbol table and it may have to be retrieved from backing store.

An example of a system which outputs a dump in terms of the source language is one developed for ALGOL W [9]. The post-mortem analysis includes frequency information for statements obeyed and, if termination was abnormal, a dump of the active storage. A listing of the original source text is produced with the approximate location of the error clearly indicated. In the dump of active storage, variables local to each active procedure or block are displayed. In the case of arrays, the display is limited to 8 or fewer elements including the first and last so that array bounds are available. Any variable which has not been initialised is marked as such in the output. It is claimed that the system does not impose very large overheads. The basic compiler generates reasonably compact and efficient code for the System/360 and the debugging routines (which also include a tracing facility) increase the size by a factor between 1.2 and 2 depending on what options the user selects.
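
The dump request list described under (a) could be sketched in Python as shown below. The class `DumpRequests` and the function `run` are hypothetical; areas are represented by callables returning the values to dump, and the format is a Python format specification such as "d" (integer) or ".2f" (floating point).

```python
import sys


class DumpRequests:
    """A list of areas to be dumped, maintained by the running program."""

    def __init__(self):
        self._requests = {}            # name -> (getter, format spec)

    def request(self, name, getter, fmt="d"):
        self._requests[name] = (getter, fmt)

    def cancel(self, name):
        self._requests.pop(name, None)

    def clear(self):
        self._requests.clear()

    def dump(self, out=sys.stderr):
        # Only the areas currently on the list are output.
        for name, (getter, fmt) in self._requests.items():
            values = " ".join(format(v, fmt) for v in getter())
            print(f"{name}: {values}", file=out)


def run(data, out=sys.stderr):
    dumps = DumpRequests()
    ratios = []
    dumps.request("data", lambda: data, "d")
    dumps.request("ratios", lambda: ratios, ".2f")
    try:
        for x in data:
            ratios.append(100 / x)
        dumps.clear()                  # final instruction: successful end, no dump
    finally:
        dumps.dump(out)                # outputs nothing if the list was cleared
    return ratios
```

Because the requests are cleared only on the normal path, the dump appears solely when the program ends prematurely, and the controlling statements can stay in the program permanently.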
(b) Snapshots

This is similar to the dump in many respects except that output occurs as soon as the request is made and execution continues. The user specifies one or more snapshot points and the information he wants output. The instruction at each such point is preserved and replaced by a transfer of control to the snapshot routine. When the program is being executed and control reaches one of these points, a jump to the snapshot routine occurs and the required information is output. The original instruction is then obeyed and execution of the program is resumed. Obviously some care must be exercised about where the snapshots are located - particularly in cases where the program can modify itself. The advantage of the snapshot technique is that it obviates the need to include explicit print statements in the program; the disadvantage is that it is usually a facility provided at a low level and the snapshot points may have to be specified in terms of actual machine addresses. Further, the technique can result in voluminous output if the snapshot point is unwisely chosen, for example in the middle of a loop. In some circumstances, this may be just the point at which the information is required. To handle this situation, it should be possible to specify three parameters - l, m and n - when declaring a snapshot point. l is the number of times control is to pass the point before output commences; thereafter output occurs every mth time until the total number of passes is greater than or equal to n.
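
A minimal Python sketch of a snapshot point governed by the three parameters is given below. The class `SnapshotPoint` is hypothetical, and one reading of the rule is assumed: the first l passes produce nothing, output begins on pass l + 1 and recurs every m-th pass after that, and no output is produced beyond pass n.

```python
class SnapshotPoint:
    """Snapshot point with parameters l, m and n (interpretation assumed above)."""

    def __init__(self, l, m, n):
        self.l, self.m, self.n = l, m, n
        self.passes = 0
        self.records = []          # (pass number, copy of the values) pairs

    def hit(self, **values):
        self.passes += 1
        k = self.passes
        if self.l < k <= self.n and (k - self.l - 1) % self.m == 0:
            self.records.append((k, dict(values)))


# A snapshot point in the middle of a loop of 20 iterations.
snap = SnapshotPoint(l=3, m=2, n=10)
total = 0
for i in range(1, 21):
    total += i
    snap.hit(i=i, total=total)
```

Of the twenty passes through the point only a handful are recorded, which keeps the output manageable even inside a loop.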
(c) Trace

When a program is being executed in trace mode, each statement, within the area being traced, causes some information to be output. What the information is depends on the level at which the trace is being applied. At the machine code level, it might include the instruction being obeyed and the contents of the program counter, accumulator and index registers. In the ALGOL W debugging system, output consists of the source text line together with the values of all variables and function procedures required for expression evaluation; in the appropriate context, the display also contains any newly assigned values, the outcome of conditional tests, procedure calls and the correspondence between formal and actual parameters.

The problems which arise from using a trace are excessive output and greatly reduced execution speed. The former can be controlled to some extent by arranging that the system imposes an upper limit on the number of times any statement is traced and that the default option for this limit is low. The latter difficulty arises from the fact that tracing is usually done interpretively. Hence it must be possible to arrange that trace mode will only apply to specified areas of the program. Execution of the program in other regions can then proceed at normal speed. This can also help to reduce the amount of trace information produced.
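
Both controls, a limit on the number of times any statement is traced and restriction of the trace to a specified area, can be sketched with Python's standard `sys.settrace` hook. The function `trace_area` and the sample routine `summing` are hypothetical; the "area" is taken to be the body of a single function, and the limit of 3 is an arbitrary default.

```python
import sys


def trace_area(func, limit=3):
    """Call func() with line tracing confined to func's own code.

    Each source line is recorded at most `limit` times; frames belonging
    to other code are not line-traced.
    """
    code = func.__code__
    counts = {}
    log = []

    def local_tracer(frame, event, arg):
        if event == "line":
            line = frame.f_lineno - code.co_firstlineno
            counts[line] = counts.get(line, 0) + 1
            if counts[line] <= limit:          # cap the volume of output
                log.append((line, dict(frame.f_locals)))
        return local_tracer

    def global_tracer(frame, event, arg):
        # Invoked on entry to each new frame: trace only the chosen area.
        return local_tracer if frame.f_code is code else None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func()
    finally:
        sys.settrace(previous)
    return result, log


def summing():
    total = 0
    for i in range(10):
        total += i
    return total


result, log = trace_area(summing, limit=2)
for line, values in log:
    print(line, values)
```

Each log entry pairs a line offset within the traced function with the local variables at that moment, so the loop body contributes at most two entries rather than ten.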
(d) Traceback

The purpose of this debugging aid is to show how control reached a point in the program where an error occurred. It may be used in conjunction with a post-mortem dump or operate in a similar manner to a snapshot. Thus a traceback produced when a FORTRAN program fails might be a record of all the procedure calls and the associated actual parameters that brought control to the failure point. In a similar manner, the traceback feature in STAGE2 outputs the current line which caused the failure, followed by the call on the current macro, the call on the macro that generated that call and so on, back to the original input line. Unless the error is a fatal one, processing is then resumed. Thus the user is free to build diagnostics into a macro without causing premature termination.
(e) Debug Mode

Aids such as post-mortem dumps and tracebacks are often made available for high level languages via a debug option in the compiler. If debug mode is requested, then the compiler generates the extra code and storage required by the various facilities. At the same time, code to check other features of the language may be produced, e.g. that array bounds in ALGOL are not exceeded, that the address used in a FORTRAN assigned goto statement is valid. What checks can be made depends on the structure and facilities of the language. Obviously, there will be some overheads to pay for such run time checks. However, the extra code can be removed once the program has passed through the testing phase by recompiling the program with the debug option off.

(f) Documentation Aids

We have already discussed the importance of good documentation in the testing and debugging phases. Programs which assist users to create, maintain and understand the documentation are therefore useful tools to have available when software is being tested and debugged. The facilities offered by such programs are many and varied [10,11,12]; often they depend on particular features of a programming language. Examples of the actions we might expect to be able to carry out are:
(i) reformat the source text to produce a neater listing in which any underlying structure is displayed, e.g. in ALGOL 60, indentation could be used to make the block structure clearly visible;
(ii) paginate the listing and prepare a table of contents for all procedures, labels and declarations;
(iii) construct an index to all calls to procedures, transfers to labels and references to variables;
(iv) systematically change identifiers throughout a program or in specified procedures;
(v) draw flowcharts from the code or associated comments. It should be possible to vary the level at which the flowchart is constructed - from a detailed one for individual procedures to one which illustrates the global structure of the whole program.

3.2 ONLINE DEBUGGING

To a user of first generation machines,
the tasks of testing and debugging seemed much less difficult than they are today in modern batch processing systems. He could interact very closely with his program under test, controlling its behaviour from the console of the machine. By setting breakpoints, he could make the program stop on any desired instruction; he could then examine the contents of any register or memory location and alter its value if required; by means of the single shot key, he could cause instructions to be executed one by one and monitor the effect of individual operations; once detailed examination of a particular area was complete, he could cause the program to restart at any point and operate at full speed until the next breakpoint was encountered. Debugging was a much more rapid process than it is today, particularly if the turnaround of the system is poor. The only users who seem to have convenient debugging facilities are those with small machines, since again close interaction between man and machine is possible. Many of these small machines are equipped with powerful debugging systems which make the process even easier than it was before, e.g. DDT on the PDP series. However, with the advent of interactive time sharing systems, the possibility of improving the debugging facilities on large machines becomes a feasible proposition. Because of the resources available on such machines, it should be possible to construct an interactive test bed into which the program can be placed and exercised. The conversational facilities provided by such a system should enable a programmer to interact not only with his own program but with the test bed itself. He can then gain access to convenient mechanisms for monitoring and modifying the behaviour of the program under test. Since resources are not likely to be as limited as they are on small machines, the facilities provided could be such that all requests for information or changes could be directly related to the language in which the program was originally written. The system could then cater for the needs of high level language programmers who constitute the majority of computer users today.

There has been considerable debate in recent years on the virtues or otherwise of conversational debugging techniques [13]. There are those who feel it will tend to make programmers adopt a rather sloppy approach to debugging. Since access is made so easy, it is felt that programmers might substitute interaction for thinking and make only superficial attempts to correct errors instead of searching for the perhaps more fundamental reasons why the faults occurred. Such fears are not unreasonable. However, I feel that a programmer can be trained to make sensible use of conversational debugging facilities, provided he can choose conveniently between using the machine or desk checking the program. Thus, if he can only book a console for a limited period of time, then he may be tempted to make hurried and superficial corrections. If, on the other hand, the console is in his own office, he can choose the method most appropriate to the particular error. A reasonable approach would be for the programmer to carry out a short period of desk checking and then return to the console when he feels more information is needed to solve the problem. By achieving a correct balance between the two techniques, he
will be able to locate errors more quickly and efficiently.

In contrast to this, when using a batch system, the programmer must plan his next run carefully in an attempt to obtain the information he needs. To guard against the possibility of not getting sufficient evidence, he may err on the side of requesting too much and be faced with the task of sorting through a large quantity of output. In the conversational system, since he has immediate access, he need only ask for the next piece of information he thinks might help to locate the error.

However, there is a word of warning: we must not make it too easy for a programmer to alter a program which is perhaps part of a larger system and then immediately make this new version available to other users. By all means, let him make the correction as quickly as possible but then ensure that the appropriate validation procedures are applied to the modified code before incorporating it with the remainder of the system. It is all too easy to "correct" a module without fully appreciating what effect the change could have on other modules in the
system. Let us n o w consider what type of facilities to see in online discussion,
debugging
we will r e s t r i c t
high level languages code.
systems.
The techniques
rather
one might wish
For the purpose of the
ourselves
to systems which
support
than ones which handle only assembly
used in
both
types of systems
are v e r y
304
similar.
However,
we wish to concern ourselves w i t h systems
w h i c h could be of value to a wide class of programmers. Further,
we will consider
purpose high concentrate
level
general-purpose
languages.
on s p e c i a l i s e d
e q u i p p e d with powerful
rather than special-
M a n y online
systems which
areas have been c o n s t r u c t e d
debugging
our emphasis is on languages
aids
[~4,i5].
Again,
which are suitable
and however,
for a wide
Class of problems. In c o n s t r u c t i n g level (a)
language, Commands
an online debugging system
to set and reset b r e a k p o i n t s
the program.
W h e n the p r o g r a m
state is "frozen" The b r e a k p o i n t
reaches
This implies
relatively
An e x t e n s i o n a condition.
in the language.
convenience,
routine
request.
inserted.
is one that associates
However,
it should be possible
occurs only after the b r e a k p o i n t (b)
code since i n s t r u c t i o n s
this could be any c o n d i t i o n
a number n with a b r e a k p o i n t
This is
if the p r o g r a m is
and jumps to the m o n i t o r
to the simple b r e a k p o i n t
be legally e x p r e s s e d of p r o g r a m m e r
of the source text.
if the p r o g r a m u n d e r test is
d i r e c t l y in machine
In general,
entry
system must have access
there m a y be problems
have to be e x t r a c t e d
its current
to the online user.
to a label or p r o c e d u r e
that the debugging
simple to i m p l e m e n t
b e i n g executed
in
by specifying either a statement
to the symbol table and the structure
being interpreted;
on any statement
a breakpoint,
and control is returned
is r e q u e s t e d
number or a p o s i t i o n relative point.
for a high
one m i g h t expect to provide the following:
that can
on the grounds simply to supply
Exit to the u s e r then
has been passed n times.
The ability of the system to return control to the u s e r in
the event of an error in the program,
e.g.
attempting
the square root of a n e g a t i v e number,
exceeding
to take
the array
305
bounds
in an array reference.
code in a debug mode
similar
so that the e x e c u t i o n as possible.
The compiler
W h e n an error occurs,
information
n e e d e d to p i n p o i n t
information
should be c o u c h e d
e.g.
position
relative
More
esoteric
error d e t e c t i o n
attempting
array elements
as "protected"
in such a test
it
or a number
More
executing
Unfortunately,
and altering
is the culprit.
any item of data in
a simple
that the system must be able
complex o p e r a t i o n s m a y ,
a loop statement
for
to print the elements
it is u n l i k e l y that the original to couch all the e n q u i r i e s
u s e r m a y wish to make,
e.g.
if the language
he m a y wish to inspect
previous
some e x t e n s i o n s
is also true debugging
values
permits
of a v a r i a b l e
e.g.
that a
recursion, on the stack.
to the language will be necessary.
for other data which could be a c c u m u l a t e d
system,
to the
assignment
language will be sufficient
Hence
of
This is a par-
frozen and control r e t u r n e d
statement-whichimplies
involve
e.g.
is b e i n g altered
see which statement
for e x a m i n i n g
This
language
procedure.
Further,
them.
This m a y involve m e r e l y e x e c u t i n g
of an array.
line.
so that any error is r e p o r t e d
is made to access
to recall the c o m p i l e r . example,
is also possible
facility if a variable
and one c a n n o t
Facilities
or print
source
to be able to mark a v a r i a b l e
a p r o g r a m once it has been console.
faulty
in terms of the original
i f the p r o g r a m is being interpreted)
ticularly valuable
(c)
as c l o s e l y
by any r e l e v a n t
to a label in a p a r t i c u l a r
as soon as any attempt
illegally,
the
3.1
a clear d i a g n o s t i c m e s s a g e
accompanied
to use an u n i n i t i a l i s e d variable.
could be useful
in Section
of the program is m o n i t o r e d
should be output to the c o n s o l e
bed(particularly
should generate
to that d i s c u s s e d
This
by the
the last n sets of actual p a r a m e t e r s
supplied to a procedure.
(d) The ability to restart the program, not necessarily at the point at which execution was interrupted. It should also be possible to cause statements to be executed one-by-one, with control returning to the console, without having to set a breakpoint on every one. Trace information could be produced if requested when statements are executed in this manner. However, the ability to obey a section of the program in trace mode is probably not required, since it could produce too much output for the console to handle efficiently. Since rapid interaction between the programmer and the program is possible, the programmer should be encouraged to be selective about what information is to be returned to him.

(e) Facilities for modifying the program, i.e. for deleting, replacing and inserting lines in the program. Ideally one would like such changes to become effective immediately they are made. Whether this is possible or not depends on how the program is set up in the test bed. Thus in the FORTRAN debugging system [16] for the Berkeley time sharing system, the whole program is held in a compiled form and no facilities are provided for modifying the code. The only way a user can make any changes is to edit the symbolic text and recompile. On the other hand, in the FORTRAN system QUICKTRAN [17], statements are interpreted and the program being debugged may be freely modified. However, the price that must be paid for this convenience is of course efficiency, since interpretation will reduce execution speed considerably. As an answer to the difficulty, providing an optimising compiler capable of accepting the same symbolic text is available, one could use the online test bed for debugging the program and then recompile the program before it is put into production. This is not a complete answer since it may not be possible to carry out all the required testing if the program operates too slowly.

A better solution might be one in which the program in the test bed consists of a mixture of interpretive and directly executable code. In the lectures on portability and adaptability (WAITE, POOLE A), we described an experiment in which such a hybrid program was produced, and pointed out that the selection of the type of code for a particular area of the program was parameterised. Thus we could envisage a situation in which the first time a program is placed in the test bed, it is set up entirely for interpretation. Once the basic subroutines have been debugged, they could be compiled into directly executable code in a later test run. In this form, they could be called from an inner loop in, say, the main routine without degrading the overall speed of the program. Only those parts of the program held in the interpretive form would be available for modification. Finally, the whole of the program would exist in the compiled form and could be checked out in the test bed before being placed in production.

Another problem that arises when the user is provided with facilities for modifying the program in the test bed is that of updating the source text so that it corresponds to the actual code. If the test bed does not need to access the source text once it has been compiled, then any changes could be saved and edited into the original text once the test run has been completed. However, the user then has to work from the original listing which existed when the run commenced. A more convenient (and more expensive) solution is to make any changes to the source text as soon as they are supplied. A user can then produce an up-to-date listing of his source text at any time during a test run.
(f) Commands to request statistical information about the behaviour of the program, e.g. the number of times any statement has been obeyed, a list of all statements (or routines) which have not been executed, a list of variables which have not been initialised or referenced. This information can often provide insight into the performance of the program as well as indicating possible sources of error.

(g) Convenient methods for initiating any of the foregoing facilities. Particular attention must be paid in the construction of any test bed to ensure that the user has available a powerful and concise language for issuing his requests. A macro facility which enables him to combine frequently used sequences of commands is a very necessary feature of such a system.

Most of the facilities described above have been implemented in one way or another in a number of online debugging systems. The response from users has been one of enthusiasm even when the facilities are quite primitive. In a multi-access system developed at Culham [4], we provided an interpretive ALGOL compiler for use online and an optimising compiler to prepare jobs for production. Modifications in the source text were made via an interactive text editor. With these facilities, users found they could develop programs much more rapidly than was possible in the batch system alone. Undoubtedly, as the use of time sharing systems increases, online debugging will have a larger and larger role to play in the production of software.

Before
leaving this topic, I will describe briefly a system in which a somewhat different approach to online debugging is taken. One of the drawbacks to the above proposals is that the test bed is effectively tied to one high level language. In modern computer systems, there are usually a number of such languages available and the possibility of having a common test bed becomes an attractive one. EXDAMS [18] is an attempt to set up such a facility, since it provides a single environment in which the users can easily add new online debugging aids to the system without modifying the compilers, the test bed or their own programs. The approach taken is one in which the program to be debugged is run with an EXDAMS monitor routine which collects all the necessary information about the program's behaviour and stores it on a history tape. Subsequently, debugging routines extract the appropriate information from the tape and present it to the online user. The monitor routine which prepares the history tape is language dependent. However, the remainder of the debugging and monitoring programs, which process the tape, are independent of both the implementation of the source language and the source language itself. Any debugging aids can easily be added to the system by writing the necessary file search and formatting routines. Some efficiency has been sacrificed since scanning the history tape will involve a large amount of I/O, but it is claimed that the added flexibility far outweighs any loss.

Output is through a CRT and both static and motion picture displays can be produced. The former displays information that is invariant with execution time, e.g. values of variables at the time an error occurred. The latter displays data which varies with execution time, e.g. the last n values of a particular variable. The user can run the motion picture both forwards and backwards at variable speeds and stop at any desired point.

This type of application is just one example of the great potential that displays have in an online debugging system. Since output can be displayed at a much faster rate than on a teletype, a user can be provided with more rapid and convenient ways of scanning the information about his program. Further, it may be possible to use graphical presentations to aid the debugging process. If the flowchart of the program is displayed on the screen while the program is being executed, then the flow of control could be monitored visually. With such output devices connected into powerful online test beds, it is not unreasonable to hope that many of the debugging problems that beset us today may eventually disappear.

3.3 TESTING STRATEGIES AND TECHNIQUES

To this point, we have been mainly concerned with debugging aids and techniques. Now let us turn to consider what strategies and techniques we might use to improve testing procedures. Unfortunately the situation is nowhere near as clear-cut for this phase as it is for debugging. Whether or not a program is well tested will depend on many factors, including the adequacy of the planning during the design of the program, the skill of the programmer and his experience, which should warn him that insufficient testing means unreliable software.

It should not be difficult to prove to oneself that exhaustive testing, i.e. testing all possible cases, is an inefficient and time consuming exercise. Yet it seems to be a common enough approach. Many programmers apparently think that if they exercise the program on enough test cases, then all the bugs will be revealed. However, the fact that a program has worked on many cases does not prove that it will work in all cases. However, it is not the quantity but the quality of the testing that is the important factor. One must give considerable attention to the way in
which testing is carried out. In particular, if one chooses the right test cases, then, by induction, one has a fair degree of confidence in the correctness of the program. There are no general rules; much depends upon the problem and the algorithm being used. For example, if the program being tested is one to calculate the logarithm to base e, then at least one should test it with a negative number, zero, 1, e and the smallest and largest non-zero positive numbers that can be represented in the machine. Even though such choices seem fairly obvious, one still hears of cases where mathematical subroutines which have been used for years are suddenly found to be in error. It is interesting, if not alarming, to conjecture what faith may have been placed in results computed by routines which have been incorrectly constructed. Far too many users tend to treat computers as infallible, believing every answer they produce. An important aspect of testing is the possibility of applying a consistency check to the output supplied by the system to see if it makes sense in terms of the physics of the original problem.

In other cases, it is merely a matter of experience and knowing what sort of errors people are likely to make, e.g. forgetting to allow for the null string in some basic string routine [19]. Extreme case testing is also important for validation and certification. Test cases which can pay handsome dividends are those in which extreme conditions apply. Often these may be difficult to set up, and the value of parameterisation has already been discussed. For example, in a multi-access system designed to support, say, 20 consoles, one should arrange a test in which 20 users login simultaneously. The probability that this situation will occur in real life is very small and it might seem an unfair test. However, the load that it puts on the system might be sufficient to ensure that if queues are going to overflow, they will; if time dependent errors exist, then they will be revealed. It is far better to produce the errors in a controlled test situation than to have them occur at random when the system is in use, for then they become far more difficult to locate.

Testing aids do not seem to be as widespread or as numerous as debugging aids. Usually they involve systems for preparing test data in a systematic, convenient and reproducible manner. Module test packages allow a programmer to take a module or module aggregate in isolation from the remainder of the system, submit test data and monitor the execution of the test run. Most test packages allow the module to be executed a number of times during one run. Among the capabilities often included in such packages is that of file simulation in order to test modules which read input files and write or update files for output. Testing can then be undertaken without the need to set up or dump actual physical files, thereby reducing both programmer and computer time.

Test data generators have a useful role to play in the development of real time systems. For example, in testing a multi-access system, one might use a small computer to simulate the behaviour of a number of online users and create a stream of messages for input to the system. The form and content of this stream is controllable and reproducible, even down to the time delays between specific messages. Hence time- or load-dependent errors once revealed can be reproduced during the debugging process. We can contrast this with a situation in which one is attempting to test the system with a number of people at consoles. Since we have no control over the rate at which input is generated, it may not be possible to cause an error to repeat itself. Locating the error is then a much more difficult task.
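The controllable, reproducible message stream described above can be illustrated with a minimal sketch in Python. Everything named here (`Message`, `generate_stream`, the command names and the exponential delay model) is an assumption made for illustration and is not taken from any system mentioned in the text. The one property the sketch is meant to show is that a fixed seed yields the same users, the same messages and the same time delays on every run.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    time: float  # seconds since the start of the test run
    user: int    # index of the simulated online user
    text: str    # command sent to the system under test


def generate_stream(seed, users, count,
                    commands=("LOGIN", "EDIT", "RUN", "LIST", "LOGOUT"),
                    mean_delay=0.5):
    """Return `count` messages from `users` simulated consoles.

    A private random generator is used, so the same seed always gives the
    same stream, including the delays between messages.
    """
    rng = random.Random(seed)
    clock = 0.0
    stream = []
    for _ in range(count):
        # Exponentially distributed gap between successive messages (assumed model).
        clock += rng.expovariate(1.0 / mean_delay)
        stream.append(Message(round(clock, 3),
                              rng.randrange(users),
                              rng.choice(commands)))
    return stream


if __name__ == "__main__":
    for message in generate_stream(seed=42, users=20, count=5):
        print(message)
```

Replaying the stream with the seed that exposed a fault is what allows a time- or load-dependent error to be reproduced, which is not possible when the input comes from people at consoles.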
The testing of portable software raises some special problems, particularly where a full bootstrap is used to effect an implementation on a new target machine. In such a case, it is unlikely that the implementor will be in close contact with the originator, and therefore not only must the implementor be able to validate the new implementation, but also he must be given some assistance in debugging his implementation if the validation procedure fails. Since one of the aims in producing portable software is to make the implementation procedure a fairly mechanical one, it is also unlikely that he will have any detailed knowledge of the algorithm.

The solution we have adopted is to provide "engineering test programs" for the abstract machine [20]. The macros which define an abstract machine are equivalent to the hardware of a real computer. Just as the manufacturer must test for faulty components and wiring errors, so the implementor of an abstract machine must test for coding errors in the macros. To aid this task, the designer must provide a series of test programs. These must be designed very carefully to verify every macro, indicating the specific macro in which an error occurs and giving necessary details if possible, e.g.

    TO '' IF VAL ' = ' FAILS ON UNEQUAL, OPPOSITE SIGN

The test programs are similar in construction to conventional hardware tests, but rely on the normal I/O mechanisms to report any failures. The first test therefore simply reads and prints one line. (The I/O package itself is provided with a set of test data to assist the implementor to realise it on his machine.) Thereafter, each subsequent test reads a line describing a failure and then checks to see whether the failure has occurred. If so, the line is printed; otherwise the next test is begun. The designer must select a minimal set of operations, test these and then use them in the testing of other operations. Selection of the test sequence and overall strategy depends upon the organisation of the abstract machine.

Unfortunately the technique has a number of drawbacks:

(a) The test programs are difficult to write and expensive to produce - we are already using a second set of tests for FLUB and currently they contain 30% more lines of code than STAGE2 itself.

(b) It is by no means foolproof - we have recorded instances where the test programs reported that the macros were correct but STAGE2 subsequently failed to operate successfully.

The difficulty lies in the fact that the test programs cannot be exhaustive. While they can be made to check, say, various combinations of signs in an arithmetic operation, they cannot test all possible register combinations. The problem is even more severe once the implementor begins to adapt the program to his specific machine. Thus if the operation

    PTR A = B + C

is checked by the test program, we can be fairly safe in assuming that

    PTR X = Y + Z

will also function correctly if registers X, Y, Z are mapped in exactly the same way as A, B, C. However, if X, Y and Z are mapped, say, onto actual registers of the real machine, while A, B and C are mapped into core, then the test program will no longer check the latter operation. A possible solution to the problem might be to generate a set of test programs from the actual program itself. It might be possible to construct a set of machine independent macros which would analyse the program being implemented and create a test program which, when combined with a set of basic tests, would check only those operations actually used. Thus if the operation

    PTR A = B + C

does not occur in the program, then it is irrelevant whether it maps correctly or not.

In addition to providing test programs, the designer of a portable program must also supply some test data to validate a new implementation at least to the point when the implementor can use the program with a reasonable degree of confidence. Both STAGE2 and MITEM are supplied with such data for validation purposes. Again considerable effort is required to produce such test data since it must be carefully planned to exercise all parts of the program. For STAGE2, the test data serves simply to validate the implementation; for MITEM, since there are no complete test programs for TEXED (other than those which already exist for FLUB), we attempted to use the test data to provide debugging information as well. Thus as the editor processes the test data, it constructs files containing lines of text which themselves are continually validated. If at a certain point in the test sequence, we expect a line to contain a particular character string, we issue a command to locate that string and then organise the subsequent command so that the test sequence is only continued if the desired character string is successfully located; if it is not, then the program stops at this point. By comparing the output produced so far by this version with the correct output, the implementor may gain some idea of what is causing the error.
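The structure of the engineering test programs described above (an ordered sequence in which each test carries a line describing a failure, prints it only if the failure occurs, and may rely on operations already tested) can be sketched as follows. This is a hedged illustration in Python rather than in the macro language of any abstract machine; `ToyMachine`, its two operations, the deliberately faulty `add` and the message texts are all hypothetical.

```python
class ToyMachine:
    """Hypothetical abstract machine with two 'macros'.

    `add` is deliberately coded wrongly for operands of opposite sign,
    imitating a slip made by an implementor.
    """

    def equal(self, a, b):
        return a == b

    def add(self, a, b):
        if (a < 0) != (b < 0):
            return abs(a) + abs(b)  # deliberate bug for the demonstration
        return a + b


def run_engineering_tests(tests, report=print):
    """Run (failure_message, check) pairs in order.

    Each message describes a failure; it is reported only when its check
    returns False. The list of reported messages is returned.
    """
    failures = []
    for message, check in tests:
        if not check():
            report(message)
            failures.append(message)
    return failures


m = ToyMachine()

# The minimal operation (equal) is tested first and then used to test add.
TESTS = [
    ("EQUAL FAILS ON EQUAL VALUES", lambda: m.equal(3, 3)),
    ("EQUAL FAILS ON UNEQUAL, OPPOSITE SIGN", lambda: not m.equal(3, -3)),
    ("ADD FAILS ON SAME SIGN", lambda: m.equal(m.add(2, 3), 5)),
    ("ADD FAILS ON OPPOSITE SIGN", lambda: m.equal(m.add(2, -3), -1)),
]

if __name__ == "__main__":
    run_engineering_tests(TESTS)
```

As in the text, such a sequence is not exhaustive: it checks only the operand combinations the designer thought to include.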
4. REFERENCES

1. Henhapl, W. A proof of correctness of the reference mechanism to automatic variables in the F-Compiler. IBM Vienna Laboratory, LN 25.3.048.
2. Roberts, K.V. The publication of scientific FORTRAN programs, Computer Physics Communications, 1 (1969) 1-9.
3. Wilkes, M.V., Wheeler, D.J., Gill, S. The Preparation of programs for an electronic digital computer. Addison-Wesley (1951).
4. Poole, P.C. Developing a multi-access system online. Software, 1 (1971) 39-51.
5. Waite, W.M. The mobile programming system: STAGE2, CACM, 13 (1970) 415.
6. Brown, P.J. The ML/1 macro processor, CACM, 10 (1967) 618-623.
7. Dijkstra, E.W. Structured programming, Software Engineering Techniques, Report on NATO Conference in Rome (1970) 84-88.
8. Barron, D.W. Programming in Wonderland, The Computer Bulletin, 15 (1971) 153-153.
9. Satterthwaite, E. Debugging tools for high level languages, University of Newcastle upon Tyne Computing Laboratory Technical Report Series no. 29 (1971).
10. Conrow, K., Smith, R.G. NEATER2: A PL/1 source statement reformatter, CACM, 13 (1970) 669-675.
11. Mills, H.D. Syntax directed documentation of PL/360, CACM, 13 (1970) 216-223.
12. Scowen, R.S., Allen, D., Hillman, A.L., Shimell, M. SOAP - a program which documents and edits ALGOL 60 programs, Comp. J., 14 (1971) 133-136.
13. Software Engineering Techniques, Report on NATO Conference in Rome (1970) 23-24.
14. Engelman, C. Mathlab 68, IFIP Congress '68, B 91-95.
15. Sutherland, I.E. Sketchpad: a man machine graphical communication system, Proc. S.J.C.C., Spartan Press, Baltimore, Maryland (1963).
16. Carr, C.S. FORTRAN II Reference Manual, University of California, Berkeley, Document No. 30.50.50 (1966).
17. Dunn, T.M., Morrissey, J.H. Remote Computing - an experimental system, Proc. S.J.C.C. (1964).
18. Balzer, R.M. EXDAMS - extendable debugging and monitoring system, Proc. S.J.C.C. (1969) 567-580.
19. Pyle, I.C., McLatchie, R.F., Grandage, B. A second order bug with delayed effects, Software, 1 (1971) 231-235.
20. Newey, M.C., Poole, P.C., Waite, W.M. Abstract machine modelling to produce portable software - a review and evaluation, Software (to be published).
CHAPTER 3.D. RELIABILITY

D. Tsichritzis
University of Toronto
Department of Computer Science
Canada

1. DESIGN AND CONSTRUCTION OF RELIABLE SOFTWARE

1.1 INTRODUCTION
As computer users become more experienced, mature and disillusioned, the efficiency requirements are usually replaced by reliability requirements. Many reasons can be given.

a) As equipment becomes cheaper and faster the pressure is diminishing to drive it "hard".

b) Unreliable software is worthless no matter how efficient.

c) For some applications the cost of failure is much higher than the cost of the computer system, e.g. process control applications.

d) An inefficient system can be "tuned" with considerable success. An unreliable system is much harder to rescue.

e) Reliability might affect data which are very expensive to duplicate.

f) The result of inefficiency is obvious, one has to wait long. Unreliable software can have hidden errors, which can violate the system and users' data without much warning. The results of the error might be discovered much later. For instance if a defective airplane were designed due to a software bug, it will take many "crashes" to trace the bug.

We wholeheartedly agree with the remarks of E. Dijkstra "Testing can show the presence of errors and not their absence"
and B. Randell "Reliability is not an add-on feature". Reliable software is designed and implemented with care. It does not just happen because the programmers were good, or careful. There are some fortunate persons who can write small error-free programs, but the generalization is not valid. The idea behind software engineering is to develop tools which enable a person of average competence and intelligence to produce good work. (Real "hero" programmers are few, although many think they are.)

Reliability is often viewed only qualitatively in software. A system is considered reliable, or not. Only on rare occasions, such as in the Electronic Switching System [1], are reliability requirements quantified. It will be even more appropriate to talk about reliability for a price. There is an inherent extra cost both at the development stage and the running stage of reliable software. The field will be really mature when a level of reliability can be contracted for a piece of software according to specified cost. By incorporating many run-time interpretive checks one can also think of different versions of functionally the same software system according to desired reliability levels. We will outline several aspects of software design and production affecting reliability.

1.2 INFLUENCE OF THE LANGUAGE

Programmers do not make the same errors in different programming languages. Certain constructions introduce more frequent errors. Ichbiah calls them characteristic errors [2]. The reasons for characteristic errors might be complexity of the construction, poor definition, unnatural behaviour etc. We give two simple examples.

Example 1 [3]

    MISTAKE = MISTEAK + 1

The intention of the programmer was to increment the variable MISTEAK. In FORTRAN two separate locations will be assigned to the variables MISTEAK and MISTAKE and the error will be undetected. In ALGOL, where variables have to be declared, the error will be pointed out by the compiler.

Example 2 [2]

Consider the CASE construction in ALGOL W, where the actions A, B, C are taken according to the value of I = 1,2,3.

    CASE I OF BEGIN A; B; C; END

Frequent errors arise from the CASE construction, due to omission of cases, or rearrangement of the cases. Consider now the modified version of the CASE construction

    SPACE COLOR: (RED,ORANGE,GREEN)
    TYPE COLOR LIGHT;
    CASE LIGHT/COLOR OF
        GREEN:  A
        RED:    B
        ORANGE: C

In the modified version the numbers associated with the cases do not have to be known to the programmer. The order in which the cases are written can be changed easily.

Unfortunately there is very little known about the behaviour of programs, especially with respect to bugs. It would be nice to organize a collection agency for frequent characteristic errors. This type of information would be invaluable to language designers to ensure proper properties of languages. So far most of the emphasis in language design is concentrated on efficiency and programming style. Clarity of programs is a positive influence on reliability, but it is not enough. Even nice, well structured programs can have bugs [4].
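The modified CASE construction of Example 2 has a close counterpart in present-day languages. The following is a minimal sketch in Python, not ALGOL W; the names `Color`, `build_dispatch`, `ACTIONS` and `act` are illustrative assumptions. Cases are selected by name rather than by position, so rearranging them is harmless, and an omitted case is reported when the table is built rather than when the program happens to reach it.

```python
from enum import Enum


class Color(Enum):
    RED = "red"
    ORANGE = "orange"
    GREEN = "green"


def build_dispatch(table):
    """Return `table` after checking that every Color has an action."""
    missing = [color.name for color in Color if color not in table]
    if missing:
        raise ValueError("no action for: " + ", ".join(missing))
    return table


# The order of the entries is irrelevant because each case is named.
ACTIONS = build_dispatch({
    Color.GREEN: lambda: "A",
    Color.RED: lambda: "B",
    Color.ORANGE: lambda: "C",
})


def act(light):
    """Perform the action associated with the value of `light`."""
    return ACTIONS[light]()


if __name__ == "__main__":
    print(act(Color.GREEN))
```

The completeness check addresses the omission of cases mentioned in the example; it is a run-time check at table construction, whereas a language with declared enumerations could make the same check at compile time.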
1.3 SEMANTIC CHECKING

The compiler is in a position to do much checking, especially if the language has the proper characteristics. At least the compiler should catch most of the clerical errors, e.g. keypunching. We will give some simple examples.

Example 1 [3]

Suppose a FORTRAN program contains the following statements:

    Z = <simple arithmetic expression>
    (Some statements which make no reference to Z, do not have
     branching instructions and are not targets of GO TO's)
    Z = <arithmetic expression>

The above program is either wrong or ridiculous, since the value of Z is changed without using at all the previous value. A compiler should do some checking and point out such a discrepancy.

Example 2 [3]

        DO 100 I = 1,5
        DO 100 I = 1,10
    100 CONTINUE

The above code is syntactically correct, but it does not make any sense. The compiler should point out the discrepancy, because it is probable that an error is present.
The compiler cannot absolutely since some information is only available the language has many constraints to detect errors.
ensure reliability at run time.
there is more information
This goal is sometimes not compatible
with the philosophy of flexible, powerful and natural constructions.
When
language
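The check suggested in Example 1 can be sketched for straight-line code. The following Python fragment is only an illustration under simplifying assumptions: each statement is given as a pair (assigned variable, set of variables read), there are no branches or GO TO targets, and the name find_unused_assignments is hypothetical.

```python
def find_unused_assignments(statements):
    """Return (first_index, second_index, variable) for every variable that is
    assigned twice with no intervening use of the first value.

    statements: list of (target, used_variables) pairs describing
    straight-line code without branches.
    """
    unread = {}      # variable -> index of an assignment not yet read
    warnings = []
    for index, (target, used) in enumerate(statements):
        # The right-hand side is evaluated first, so its reads count as uses.
        for name in used:
            unread.pop(name, None)
        if target in unread:
            warnings.append((unread[target], index, target))
        unread[target] = index
    return warnings
```

For the statement sequence of Example 1 the function reports the pair of assignments to Z; a compiler would turn such a report into a warning rather than an error, since the code is still legal.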
1.4
PROGRAMMING STYLE
The style of programming can greatly influence reliability. We mention two cases.
P1) Semantic naming. Names in the program should have a direct relation with their intended function. In certain cases elegance should be sacrificed by using long names.
P2) Verification length. Humans can look at and grasp only a piece of program at a time. As a result paragraphing rules, control statements and clustering of actions are very important. This is one of the arguments against the use of GO TO statements.
The art of computer programming is receiving much attention lately starting with the pioneering work of Dijkstra (THE), Mills (IBM) etc. The goal is to establish a set of conceptual and operational principles for good programming practices. Before a program is written a structure of abstractions should be designed which leads in a natural and well organized way to the final program. This method of structured programming is outlined with an example by GRIFFITHS. Structured programming slows down programming speed but greatly reduces the testing and validation phases. The result is more productivity for programmers and better programs. Structured programming enables the programmer to obtain a global view of the problem and deep understanding of the relations between the different modules. As a result the program is easy to understand and more amenable to proof of its logical correctness. The structure which is superimposed on the program during the design phase provides good documentation of the program's behaviour. The context of the different variables and the effect of changes is easier to visualize. As a result the maintenance effort is greatly reduced.
In general structured programming is an elegant and organized way for writing programs. Recent results have also proved the method to be practical. It is a first step towards a programming methodology.
Some ad hoc techniques can also be used. They are based on redundancy and the incorporation of checking code. The objective is to detect and isolate errors at runtime.
T1) Cross-checking. The method used by accountants for adding columns and rows and expecting the partial sums to add up to the same number.
T2) Range checking. Certain variables in the program can take values within some range due to their semantic meaning.
T3) Consistency checking. If a program is run frequently, certain outputs have to be consistent e.g. a telephone bill cannot be $10 for months and then jump to $2,000 without cause for alarm.
T4) Unique names. Items in the system are provided with unique bit patterns, which serve as identification cards. Local mnemonic oriented names are used for reference, but the additional unique names serve to avoid ambiguities.
T5) Pointer keys. A bit pattern is associated with every pointer. The same bit pattern is associated with the item at which the pointer points. This way when the pointer should point at "apples", we can detect an error if by following the pointer we picked "oranges".
T6) Stage processing. Perform the computation in stages. After every major stage, check the integrity of the programs and data involved.
T7) Positive checking. If an action has n outcomes incorporate an extra one for unpredictable behaviour. At least there is a definite path for the errors, instead of taking any arbitrary one.
The above mentioned techniques were communicated to us by other persons, notably B. Randell.
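Some of the techniques listed above (T1, T2, T5 and T7) can be sketched as small run-time checks. The Python fragment below is an illustration only; IntegrityError, verify_totals, check_range, Item, KeyedPointer and handle_outcome are hypothetical names chosen for this sketch.

```python
class IntegrityError(Exception):
    """Raised when a run-time check detects an inconsistency."""


def verify_totals(table, row_totals, column_totals):
    # T1: recompute the partial sums and compare them with the stored ones.
    rows_ok = [sum(row) for row in table] == list(row_totals)
    columns_ok = [sum(col) for col in zip(*table)] == list(column_totals)
    grand_ok = sum(row_totals) == sum(column_totals)
    if not (rows_ok and columns_ok and grand_ok):
        raise IntegrityError("partial sums do not agree")


def check_range(name, value, low, high):
    # T2: a variable must stay within the range its meaning allows.
    if not low <= value <= high:
        raise IntegrityError(f"{name}={value} outside [{low}, {high}]")
    return value


class Item:
    def __init__(self, key, payload):
        self.key = key          # bit pattern stored with the item
        self.payload = payload


class KeyedPointer:
    def __init__(self, item):
        self.key = item.key     # T5: same bit pattern stored with the pointer
        self.target = item

    def follow(self):
        # Detect "apples" versus "oranges" before the item is used.
        if self.target.key != self.key:
            raise IntegrityError("pointer key does not match item key")
        return self.target.payload


def handle_outcome(outcome):
    # T7: two expected outcomes plus one definite path for anything else.
    if outcome == "done":
        return "continue"
    elif outcome == "busy":
        return "retry"
    else:
        return "report-error"
```

The checks cost a few comparisons each; in exchange an error is detected close to the place where it occurred.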
We welcome any additional techniques that programmers are using or have used to increase reliability. Our intention is to compile a list of programming techniques which can influence reliability.
1.5
INFLUENCE OF PROTECTION
A protected system is a controlled environment in which some well-defined boundaries and ground rules exist to restrict communication. Protection increases reliability and fault tolerance in a software system in two important ways.
a) Diagnostic. An error in the system usually results in an attempt to violate the protection mechanism.
b) Error isolation. The protection mechanism establishes firewalls. If an error occurs in one part of the system it is isolated in such a way that only the particular part is affected. As a result the system fails gracefully.
We will elaborate later on protection mechanisms and their implementation.
1.6
PROGRAM CORRECTNESS
A program representing an algorithm can conceivably be proved to be mathematically correct. Lately there has been much work in this topic following essentially two approaches. First to give a conventional manual mathematical proof that the program is doing what it is supposed to. Second to state the properties of the program in a formal manner using a mathematical model and express the correctness of the program with a formula of the predicate calculus which can then be verified by a mechanical theorem prover.
1.6.1
INFORMAL PROOF [3]
This approach dates back to Von Neumann but it has acquired significant importance through the work of Floyd [6], London [7], and Naur [5]. In principle, it is quite simple. Assume there are points p1,...,pn in the program, where the programmer provides assertions of invariant conditions among the variables of the program. There is at least one assertion about the input variables of the program and one assertion about the output variables in connection with the input expressing the intent of the program. The verification procedure takes the following steps.
Assume that pi is an assertion and pj is the following assertion in a control path of the program. Prove that the code between pi and pj is such that, if pi is true, pj is true. If this verification process is performed for all adjacent pairs of assertions, for all paths of control, then the program is correct assuming it halts. As a result the verification process shows only partial correctness, up to halting, and it should be supplemented by another halting proof. The interested reader can find a simple example of this technique in Elspas [3] or a more elaborate example in London [8].
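As a small illustration of the method of assertions (not the example given in the cited references), consider integer division by repeated subtraction. The sketch below is in Python; the function name divide and the labels p1, p2, p3 are assumptions of this illustration. The assertions are executable, so they also serve as checking code.

```python
def divide(dividend, divisor):
    # p1: assertion about the input variables.
    assert dividend >= 0 and divisor > 0

    quotient, remainder = 0, dividend
    while remainder >= divisor:
        # p2: invariant that holds every time control reaches this point.
        assert dividend == quotient * divisor + remainder and remainder >= 0
        quotient += 1
        remainder -= divisor

    # p3: assertion relating the output variables to the input.
    assert dividend == quotient * divisor + remainder
    assert 0 <= remainder < divisor
    return quotient, remainder
```

The paths to verify are p1 to p2, p2 to p2, p1 to p3 and p2 to p3. Halting needs the separate argument that remainder decreases by divisor > 0 on every iteration and is bounded below by zero.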
The following comments are pertinent:
1) The creative part of the proof is to develop the assertions which can be very hard for fair size programs.
2) The proofs of programs tend to be very long and tedious. The problem of verification of the proofs is apparent.
3) Some of the constructions in high level languages are very powerful and the verification conditions must be developed to identify the relationships between assertions [6,9].
4) A proof of this type does not guard against clerical errors e.g. keypunching, or safeguard a particular program from hardware or system errors. On the other hand, to view a whole software system as one program for verification purposes is unrealistic considering the present state of the art.
5) The effort of proving a program correct is of such magnitude that it precludes the application of the method in any large scale. If all the mathematicians of the world would join forces, they probably could not prove the correctness of a large size Operating System (assuming they can find a candidate).
The method of assertions is a very general and flexible method which can be used heuristically without any claims for proofs. For instance, in some languages the programmer can specify assertions which produce checking code when the compiler is in the debug mode and are inserted as comments when the compiler is in the regular mode [10]. An interactive program verification system could offer many advantages [11]. In such an environment a programmer makes assertions which can be verified or proved erroneous. As a result he gains a deep understanding of the program properties.
To conclude, proving correctness of programs manually is interesting, and it can be useful for small sensitive algorithms of a software system. Practical limitations though will always be present.
1.6.2
FORMAL PROOF [3]
Much work has been concentrated in expressing the correctness and general properties of programs within a formal model, specifically the predicate calculus. Floyd [6] showed that the partial correctness of programs can be reduced to proving corresponding theorems in the first order predicate calculus. Manna [12] demonstrated that total correctness, including halting, can be reduced to proving theorems in the first order predicate calculus. The hope is that after expressing the correctness problem in the formal framework automatic theorem proving techniques can be used to establish the correctness of the program. As a result the future of automatic program verification is closely related to the advance of theorem proving techniques. The current state of the art in theorem proving is far from being adequate to handle the long, well formed formulas associated with the correctness of programs [13]. The currently used resolution technique is a semi-decision procedure, but it is inefficient. Heuristic techniques might provide the answer [22].
In short formal proof techniques give some hope for the future, but unfortunately they cannot be considered available tools for current software engineering.
1.7
DESIGN FOR RELIABILITY
During the design phase decisions can be made and the program structured in a way that greatly enhances reliability. Good design increases reliability in a number of ways.
1) The program structure can allow easier and more complete testing.
2) The logical correctness of some mechanisms can be proved informally.
3) Problems of timing can be eliminated by ensuring proper synchronization and cooperation of the processes.
A very good example of design for reliability is the level approach [15]. Namely, the system is designed as a hierarchy of abstract machines. Each level of the hierarchy is tested exhaustively before the next outer level is implemented. This type of approach greatly reduces the program states for testing purposes. The general principle of levels of abstraction and their usefulness in the design of systems is discussed in GOOS C. A very nice concise explanation of their influence for testing and reliability can be found in Dijkstra's paper in the Rome NATO report [16].
Certain mechanisms or tools used for the design can be proved informally to be correct. A very good example is the use of synchronization primitives to provide mutual exclusion. This type of relation is frequent enough in a software system that an informal argument of its correctness should be given. A simple proof is included in Habermann [17] for the case of P's and V's on semaphores.
Finally there are certain timing considerations which can introduce problems. Typical examples are the determinacy and deadlock problems discussed in the concurrency section DENNIS C. The problem of deadlock and its detection and prevention has received special attention in the literature [18].
In general during the design phase, attention should be paid to the logical correctness of the design, together with its influence on the reliability and ease of maintenance of the final product. In order to ensure logical correctness, the design could be described formally using some model of computation, for example, as discussed in the concurrency section DENNIS C. The description can then be analyzed using known theorems about the properties of the model.
1.8
RELIABILITY DURING THE LIFE CYCLE OF THE SOFTWARE
We discussed so far the design and implementation of reliable software. In actual practice, if we follow any one or all the techniques outlined, we will still not have an error free software system. To ensure ultimately reliability, we have to admit our weakness as a fact and anticipate errors. Each module of the system and the users themselves should operate defensively. The following steps are appropriate.
1) The integrity of information should be preserved using frequent incremental and/or complete dumps.
2) Key information of the system should not only be safeguarded, but duplicated e.g. directories.
3) Hardware will malfunction. The software should be designed to cope with and not enhance the disaster.
4) Good restart facilities can minimize the effect of failures.
Another practical aspect is the maintenance of the produced software. Functional changes and known bugs force new releases of the system. A very interesting study by Belady and Lehman [19] investigates the dynamics of the maintenance function using a model. It is shown that with certain conditions the maintenance effort can increase exponentially with time. When this happens the software system is a candidate for retirement. More about the life cycle of software is discussed in HELMS.
1.9
SUMMARY AND CONCLUSIONS
We will summarize our discussion by giving a set of informal rules for increasing reliability in software.
1) Design a well structured system which will be easier to test.
2) Describe the design formally and argue about logical correctness.
3) Have a powerful protection mechanism.
4) Prove the correctness of certain key algorithms (if you can).
5) Choose an appropriate high level programming language for implementation.
6) Have the compiler do much checking.
7) Enforce a clean and structured programming style, usually through the language.
8) Incorporate many run-time checking techniques.
9) Make available many testing and debugging techniques (POOLE B).
10) Pay extreme attention to the testing, debugging and validation stages (POOLE B).
11) Accept the inevitability of errors and prepare for it.
12) Set up a good maintenance facility.
2.
PROTECTION
2.1
INTRODUCTION
Protection is a general term describing the mechanisms which protect items of a system from their environment. The mechanisms are not the same in different systems and they can also be different for parts of the same system. For instance in some systems, there is a different protection mechanism for the file system, another one for memory protection etc.
The goal of protection is stopping a malicious or erroneous user from harming other users. We could make the assumption that users are friendly and infallible, but it would be unrealistic. A user can do harm in three different ways.
a) By destroying his own virtual machine
b) By destroying another user's virtual machine e.g. by modifying data in an unauthorized way
c) By degrading the service that all other users are getting with the ultimate degradation being "crashing" the system.
A good protection mechanism should not allow (b) and (c), and it should provide some tools to the user to safeguard his own virtual machine against himself. Protection provides some extra benefits, for instance support for proprietary programs against unauthorized use. As a side benefit the protection mechanism can be expanded to provide some accurate accounting for measurement and billing purposes.
In a system all information resides on some storage device. Hence the system is protected if some mechanisms exist to divide the logical address space into parts henceforth called regions and allow information to flow between the regions in a controlled way. Each region will have to be protected from attempts by the rest of its environment to:
E1) obtain unauthorized information
E2) provide unwanted information
As a typical example consider a region defined by a particular file. We might want to protect the file from unauthorized read operations (E1) or from somebody trying to overwrite (E2). A more subtle case of (E2) exists if somebody provides information to the region which is superficially welcome but inherently dangerous. To uncover such a potentially dangerous situation requires much checking. A typical case is that of a user passing a seemingly innocent command to the system which causes it eventually to "crash". A protected system should be able to refuse such unwanted information. Two basic problems require solution.
P1) How we build walls around the regions with specific gates for communication purposes.
P2) How we police the communication.
2.2
DOMAINS AND OBJECTS
We will develop a set of concepts which can be used for the understanding and design of different protection mechanisms.
The notion of regions is too general to be successful in describing the protection status of a system. Two roles are easily distinguished. In a protected environment there are both passive elements (the victims) and active elements (the aggressors). We will call passive elements objects and active elements, domains [20,21]. Note that these roles might change in the course of time. For instance two processes may very well guard against each other, in which case both are objects and domains alternatively. Files are typical objects.
We will elaborate the notion of domain. Many words have attempted to capture it e.g. protection context, state, or sphere, capability list etc. [21,22]. The simple active element in a system is a procedure activated by an activation record. To indicate that some activated procedures can operate conceptually in parallel we give them separate status and call them processes [23]. Processes cooperate and synchronize each other through the use of basic primitive operations like P and V on semaphores, or by sending and receiving messages. The units of activity for concurrency purposes are the processes. In other words processes are candidates for CPU time allocation.
Other resources (e.g. memory) are allocated to activated procedures according to some general guidelines. We will call the unit for purposes of resource allocation in an Operating System a task. The concept of a task does not have to be the same as that of a process. In most implementations, however, it is convenient to allocate all resources to processes. In order to manage the facilities we need a header (task control block--internal process descriptor). If we distinguish between tasks and processes we need different kinds of headers. If all resources are allocated in conjunction with processes we only need one kind of header. This is consistent with treating CPU time allocation in terms of general resource allocation.
For protection purposes Lampson [21] introduces the notion of a domain, which is essentially the unit representing a set of access privileges to a set of objects. A process always operates inside a particular domain. Again the active element corresponding to a domain can be any activated procedure and/or task, not necessarily a process. It is expedient to identify domains with processes such that every process corresponds uniquely to a domain. What we lose is some flexibility. Namely, it is not allowed for some activated procedure to operate out of a particular process with a different protection status. To make this possible, if we want to have different protection status we have to make it a different process, although it is not allowed to operate concurrently.
To summarize, if we choose to consider processes as the only active unit they serve three distinct functions.
a) concurrency
b) resource allocation
c) protection
2.3
PROTECTION WALLS AND MONITORS
Since regions correspond to parts of logical address space the walls surrounding them must be present in the addressing mechanism. Addresses within the region are immediately executed, while addresses pointing outside the regions go through a specified mechanism which can be considered as a gate. It is very hard to discuss the mechanism for building walls around regions without making reference to specific addressing schemes.
Consider as an example a segmented address space. Segments correspond to regions and are protected. Every time a segment is retrieved from the segment table its protection status must be checked against the privileges of the process generating the command. In the same way, when a process wants to access a file, its request goes through the file system which checks the privileges of the process against the protection status of the file.
In general, every region which must be protected has a monitor that serves as a gate. Every time an access is generated from outside the region (somebody trying to cross the gate) it must go through the monitor in question. The monitor can be in hardware, or software, or a combination of both. For instance the physical keys in /360 machines together with the key in the PSW of the executing process form a hardware monitor protecting regions of 2K bytes in memory. A purely software monitor is the file system which sits between the files residing in secondary storage and main memory. The file system can be considered as the monitor safeguarding the information on secondary storage.
The monitor can be near the domain generating the access, or near the object it guards, or in between. The main requirement is that it sit somewhere in the linking path from the domain to the object and interrupt the access attempts. Note that many linking paths can share the same monitor as it is the case of the file system.
Another way of providing logical blocking walls is to keep the linking path secret. That is, the object is not protected but its location is not known except by the right processes. Any process can use an internal name to address the object but the linking path to the object will be only revealed to privileged processes. Such a mechanism does not usually provide adequate protection. The malicious user can search all memory trying to locate the object. This is not always easy. To begin with, in a virtual memory environment the very size of the address space makes it very difficult.
Another technique which can be used is encoding [24]. Namely the information is there, but it is encoded and in order to decode it one needs a special code key provided only to privileged processes. A middle of the road approach might be to link parts of the object in a linked list with encoded pointers. This implies that if the object is split in n parts the malicious process has to work n times as hard to locate the different parts.
2.4
IDENTITY CARDS AND CAPABILITIES
Before we can talk about any checking going on at the gates we must identify the different domains, objects and privileges using unique names. The names used internally are not necessarily unique. We will use unique numbers called magnums (magic numbers) to serve as identity cards in the system. A magnum identifies an item and will ensure a unique name with global validity for any item in the system. A bit pattern of 64 bits can provide by simple addition a different magnum every μsec for more than 10^5 years. If magnums are not provided by hardware we will assume that a high-privileged process issues them according to the needs of the system. In case we need numbers which are not absolutely unique, but relative to a particular application we will name them using the name of the application e.g. process magnums for magnums which identify uniquely processes.
Privileges of domains over objects will be identified with capabilities [21,22]. In order to be able to pass capabilities around we have to identify them using magnums. Capabilities will also have an indication if they are local, or can be copied, or passed around.
Capabilities are checked by the monitor which sits on the linking path between regions. Note that capabilities provide protection without any specific reference to the internal nature of the regions or the contents of the messages being passed. It is assumed that a capability is enough recommendation for the privileges of the domain. The monitor could ask for the identification card (magnum) of the process trying to access the object either for some checking purpose or for reference if something anomalous happens.
The way we have talked so far about capabilities they are strictly boolean and represent authorization for an access privilege. They are like "passes" which are used to traverse a linking path between regions. Another way to view capabilities is as "tickets" which are consumed when they are used. This introduces some elementary form of memory in the capabilities.
Recalling capabilities is an issue. Some feel that once a process has certain capabilities nobody should be able to take them away. We subscribe to the other view that the agent issuing the capabilities should certainly have the right of recall. The resource manager of a particular resource should, for instance, be able to invalidate all or part of the tickets used to request that resource. This can be done easily by changing the capability magnum, in which case the old ones automatically become invalid. The process can demand the right to be protected either by making a capability inviolate à la Lampson or registering a complaint for the action and demanding the new capability.
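The notions of magnums, passes, tickets and recall described above can be sketched together. The Python fragment below is a minimal illustration under stated assumptions: magnums are issued by a simple counter, a capability carries the magnum current at the time of issue, and the names MagnumIssuer, Capability and ResourceManager are hypothetical.

```python
from dataclasses import dataclass

MAGNUM_BITS = 64


class MagnumIssuer:
    """Issues unique numbers by simple addition."""

    def __init__(self):
        self._last = 0

    def issue(self):
        self._last += 1
        if self._last >= 2 ** MAGNUM_BITS:
            raise OverflowError("magnum space exhausted")
        return self._last


@dataclass
class Capability:
    magnum: int               # identifies the capability
    right: str                # the access privilege it authorizes
    consumable: bool = False  # False: a "pass"; True: a "ticket"
    spent: bool = False


class ResourceManager:
    def __init__(self, issuer):
        self._issuer = issuer
        self._current = issuer.issue()

    def grant(self, right, consumable=False):
        return Capability(self._current, right, consumable)

    def admit(self, capability, right):
        # The monitor checks the capability, not the internals of the caller.
        if capability.magnum != self._current or capability.right != right:
            return False
        if capability.consumable:
            if capability.spent:
                return False
            capability.spent = True   # a ticket is consumed when used
        return True

    def recall(self):
        # Changing the magnum invalidates every capability issued so far.
        self._current = self._issuer.issue()
```

At one magnum per microsecond a 64-bit counter lasts about 5.8 x 10^5 years (2^64 microseconds), so exhaustion of the counter is of no practical concern.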
An interesting problem arises if we want to limit the number of accesses. Suppose we want to give a process the privilege of accessing an object only n times. Such an ability is important to safeguard a data base from disclosing information through correlation or for accounting purposes. There are many distinct ways we can solve the problem.
a) With boolean capabilities ("passes") we can still give the pass but introduce some memory in the monitor which counts the accesses up to n. After we reach n we change the capability's magnum and the passes become invalid.
b) With the ticket system, we issue only n tickets. This ensures that only n accesses are possible.
c) (Lampson) It will be useful to introduce some memory in the capability itself. Every time it is used the count decreases and the capability becomes invalid when the count becomes zero. This type of capability is like a "ticket book" or "cash".
d) An indirect capability can be used as a pointer to a specific place where a count is kept. The monitor uses the pointer and the capability to decrease the count. This type of operation is like a "checking account".
2.5
POLICING
We have now the necessary items to police the communications between sections. We could have local police, that is, every object with its own monitor checking for capabilities. This has two main disadvantages.
1) It is very inefficient since the control guide rules are generally the same for some large class of objects. For instance, it would be highly inefficient if each file had its own mechanism to check capabilities.
2) Local police can be very restrictive. There must be a higher authority which can overrule its decision. For instance, if a file locks itself out completely there must be some way of remedying the situation.
There can be a central police policing all communications and checking all capabilities. This is conceptually acceptable but it becomes cumbersome for every access to go through the same central authority.
The best solution is to have policing in groups with monitors which serve as gates between a large class of domains and objects. A typical example is the file system which has its own ground rules about the use of capabilities to police all accesses from processes to files.
So far we made no reference to the data involved in such communications. It follows that a simple capability-oriented protection mechanism does not check for corrupt data which can cause harm to the receiver. It is obvious that the policing should include:
1) Some identity checking
2) Some data checking
The rules for data checking can be frozen, or better, they can be specified to the monitor and updated regularly. What we propose is to incorporate a data-checking process in the monitor which can change dynamically.
The treatment of violators is an interesting question. What do we do with intruders in most systems? We simply refuse unauthorized access. But if identifications are provided one could conceivably keep track of violations. If nothing else, somebody has to pay for the policing so it may as well be the attempting violators (yes, we are suggesting fines).
One of the most difficult problems is the communication between two mutually suspicious processes. Each one will try to restrict the accesses generated from the other and closely supervise the data which goes back and forth. What we propose in such a case is that the two processes come to an understanding on their communication requirements and establish a third independent process, which is the monitor and policeman. Note that this is a typical case of a contract. That is, once the processes agree about the supervision and set up the policeman they can't change it unless they appeal to higher authority. This appeal to higher authority is always present in the system, with the final "supreme" authority the operator himself. After all, he can stop the machine.
We talked about capabilities, magnums, monitors, policemen, etc. All these protection mechanisms have to be protected, especially if they are in software. Their protection can be ensured by having a protection mechanism safeguarding them, whose protection status is frozen either in software (a key has to be present which can only come from the operator) or in hardware. For instance, all the file protection mechanisms checking file manipulation capabilities are highly protected. Only one process, the file manager, can talk to them and must do so in a restrictive manner. This capability of the file manager is locked in his capability list and even he can't change it. Finally the capability list can be protected through a different hardware protection key.
A final problem is unauthorized use of information [26]. For instance, consider the communications between a proprietary tax package program and its input process. The proprietary package will certainly not want to reveal its secrets, but on the other hand we don't want the package to retain information about Mr. Ive's taxes. This type of problem cannot be solved by considering the processes as black boxes and looking only at their interaction and messages; we have to look inside them. It then becomes a problem of program certification and security.
2.6
DESCRIBING THE PROTECTION STATUS OF A SYSTEM It is advantageous to use the concepts outlined in the
previous paragraphs system.
to describe the protection status of the
At any point of time the protection status can be
represented by the set of privileges
that each domain has.
We
can view this information as a matrix with rows the domains and columns the objects of the system
[21].
Each entry of the
matrix will define the privileges of the domain corresponding
341
to the row over the object corresponding to the column of the matrix. These privileges will be capabilities expressed as access attributes like read, modify, etc. The matrix is very sparse, since a domain has on the average access to very few of the system's objects.

The matrix representation as described so far does not give a mechanism for changing the domain privileges according to well defined rules. It is implied that the matrix can change only if there is a central facility which manipulates the entries of the matrix. The disadvantages of such a centralized scheme are rigidity and the problem of storing the large sparse matrix.

If we want to allow the changing of the domain privileges in a flexible manner we have to consider the domains as objects. They need protection from unauthorized access. As a solution, the matrix should be augmented with more entries, namely columns corresponding to domains. We have the following scheme [21,22].

Consider an access matrix A(i,j) with domains as rows and with domains and objects as columns. The entries of the matrix are access attributes plus certain special attributes. The special attributes are "owner" and a copy flag which may be set or not for every access of a domain to an object. "Owner" can be an attribute of a domain d over another domain d' or over an object. For a domain d' it implies the case that d has complete control over d'.

We give now the following rules for changing the protection status of the system.

1) A domain d can remove access attributes from A(d',x) if it has "owner" access to d'.

2) A domain d can copy an access attribute to A(d',x), with or without the copy flag set, if it has access to x with the copy flag set for this particular attribute.

3) A domain d can add access attributes to A(d',x), with or without the copy flag set, if it has "owner" access to x.
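The rules for changing the protection status can be tried out on a small executable model. The following is only a sketch under stated assumptions: the class name AccessMatrix, the method names, and the dictionary used to hold the sparse matrix are illustrative and do not come from [21,22], and rule 1 is read here as requiring "owner" access to the domain d'.

```python
# Sketch of a sparse access matrix A(d, x) with a copy flag per attribute.
# All names are illustrative assumptions, not part of the scheme in [21,22].

class AccessMatrix:
    def __init__(self):
        # Sparse storage: (domain, column) -> {attribute: copy_flag}
        self.entries = {}

    def has(self, d, x, attr, need_copy_flag=False):
        """True if A(d, x) holds attr (and its copy flag, if required)."""
        flags = self.entries.get((d, x), {})
        return attr in flags and (flags[attr] or not need_copy_flag)

    def grant(self, d, x, attr, copy_flag=False):
        """Unchecked write, used only to set up an initial status."""
        self.entries.setdefault((d, x), {})[attr] = copy_flag

    def remove(self, d, d2, x, attr):
        """Rule 1 (as read here): d needs "owner" access to the domain d2."""
        if not self.has(d, d2, "owner"):
            raise PermissionError("rule 1: no owner access to the domain")
        self.entries.get((d2, x), {}).pop(attr, None)

    def copy(self, d, d2, x, attr, copy_flag=False):
        """Rule 2: d must hold attr on x with the copy flag set."""
        if not self.has(d, x, attr, need_copy_flag=True):
            raise PermissionError("rule 2: attribute not held with copy flag")
        self.grant(d2, x, attr, copy_flag)

    def add(self, d, d2, x, attr, copy_flag=False):
        """Rule 3: d needs "owner" access to the object x."""
        if not self.has(d, x, "owner"):
            raise PermissionError("rule 3: no owner access to the object")
        self.grant(d2, x, attr, copy_flag)


A = AccessMatrix()
A.grant("d1", "file", "owner")
A.grant("d1", "file", "read", copy_flag=True)
A.grant("d1", "d2", "owner")            # d1 owns the domain d2
A.add("d1", "d2", "file", "write")      # rule 3
A.copy("d1", "d2", "file", "read")      # rule 2, copy flag not passed on
```

Because d2 receives "read" without the copy flag, it cannot pass the attribute further; this is the kind of controlled propagation the copy flag is meant to give.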
The above rules do not give the opportunity for a recall of capabilities. In order to give a domain the ability for recall we introduce the following rule.

4) A domain d can remove access attributes from A(d',x) if d has "owner" access to x, provided d' does not have "protected" access to x.

Note that the scheme does not provide a mechanism for creating a new domain or deleting domains. Also no mention was made about the relation by which a process can change domain.

The idea of a protection matrix is very useful as an abstraction to describe and visualize the protection status of systems. In this context we can understand the different ways in which protection mechanisms are implemented.

2.7 IMPLEMENTATION

The protection status can be represented in many different ways in a computer. The simplest way is to provide a global table T of triplets [domain, object, access attributes] [21]. This table is searched whenever the value of A(d,x) is required. This representation is usually impractical for the following reasons.

1) Memory protection with respect to T is not usually provided by the hardware.

2) The table will be quite large. It will be unrealistic to store it all in core memory. A way must be found to keep only relevant portions in core memory.

3) Objects or domains may be grouped in certain ways which make T very wasteful in terms of storage, e.g. a public file.

4) It is usually necessary to obtain all objects to which a certain domain has access, e.g. all objects which a domain owns and pays for.
Another implementation is to group all the objects to which a domain has access in a list and attach it to the domain. This list is usually called a capability list. The entries [x,A(d,x)] are capabilities representing the access privileges of domain d on object x. Most hardware does not provide read-only arrays which can be used by the supervisor to store the capability lists. It can be simulated, though, by keeping all capability lists in the same storage area, highly protected by a separate storage key and accessed only by a highly privileged monitor. Linking paths to the objects can be stored in the same area together with the capabilities. To facilitate implementation the capability lists could also be structured in levels or using a tree hierarchy [27]. This enables a more efficient implementation at the expense of some flexibility.

A third implementation is to group all the domains having access to a particular object in a list and attach it to the object. The entries of the list are of the form [d,A(d,x)] and they give the access attributes for every domain which has access privileges to the object x. There is a procedure provided by the owner of the object which checks the domains attempting to access the object. This implementation is similar to the access control list such as used in Multics. The domains still need a way of connecting to the object. We could take the approach of providing the linking paths to everybody upon request by the system, without any checking, or leave it to the domains to keep information about the linking paths to the objects on which they have access.

A fourth implementation is using unique numbers (magnums) as locks and keys. In every domain there is a list of objects together with a unique number for every object which serves as an identification of the access method. In the object there is a list of unique numbers together with the associated meaning in terms of access privileges for each one.
A domain accessing the object transfers to the appropriate monitor safeguarding the object the unique number which corresponds to an access privilege. The monitor interprets the unique number according to the list in the object and grants or refuses the access depending on the access method requested.

This idea of giving capabilities special status by providing them with some unique identification is very important. The capabilities can be passed around without any reference to their meaning. When the owner of an object wants to recall or change certain capabilities, he can make the old ones obsolete by changing the list of access methods associated with the object.
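The lock-and-key idea can be made concrete with a short sketch. It is an illustration only: the class GuardedObject, its method names, and the choice of random 32-bit numbers as magnums are assumptions, not details given in the text.

```python
import secrets

class GuardedObject:
    """An object whose monitor maps unique numbers (magnums) to access methods."""

    def __init__(self, name):
        self.name = name
        self._meaning = {}          # magnum -> permitted access methods

    def issue(self, methods):
        """Create a new magnum standing for the given access methods."""
        magnum = secrets.randbits(32)
        while magnum in self._meaning:      # keep magnums unique per object
            magnum = secrets.randbits(32)
        self._meaning[magnum] = frozenset(methods)
        return magnum

    def access(self, magnum, method):
        """The monitor interprets the magnum and grants or refuses access."""
        return method in self._meaning.get(magnum, frozenset())

    def revoke(self, magnum):
        """The owner makes an old capability obsolete."""
        self._meaning.pop(magnum, None)


payroll = GuardedObject("payroll")
key = payroll.issue({"read"})
domain_keys = {"payroll": key}      # the list of objects kept in a domain
```

A domain holding key can pass it to another domain without either of them knowing what it means; only the monitor of the object interprets it, and after revoke the old number is refused.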
We will outline the use of capabilities by giving an example of a file system.

2.8 A CAPABILITY BASED FILE SYSTEM

2.8.1 INTRODUCTION

We will describe the design of a basic file system which uses the capability concepts. The design is associated with the SUE project in the University of Toronto [28]. The file system is intended for implementation in IBM/360 hardware and it has to take into account some of the hardware characteristics.

We will not use a separate notion of domain; processes will correspond to domains. Each process has a capability list of all its privileges as part of the internal process description, together with other information relevant to the operation of the process, e.g. its father and sons, mailboxes it owns for communication, etc. We assume an environment where processes form a tree structure and communicate using ports and mailboxes, very similar to RC 4000 [29].

Processes are ephemeral; they are created and deleted. On the other hand we need a permanent entity which contains information about a particular user account, its file information etc. We will call such an entity a sponsor [28]. Sponsors have descriptions residing permanently on
secondary storage. When a user logs in, a system process obtains information about his sponsor and initiates the first process corresponding to the user. When a user logs out, all of his generated processes are destroyed and information about their capabilities generated, accounting etc. is transcribed to the user's sponsor. Capabilities sit inactive as data in a user's sponsor when he is not active. When he becomes active the capabilities should be reactivated. For instance, when a user creates a new capability, e.g. by creating a new file, this mechanism ensures that the capability is retained after he logs out.

The file system will utilize capabilities to monitor the access of files. Capabilities will be used mainly for:

a) Protection for authorized access of the files

b) Accounting of the use of the file manipulating facilities

Capabilities are of two basic types:

1) Boolean capabilities, mainly used for authorization purposes.

2) Numeric capabilities, used mainly for accounting purposes. (We will sometimes refer to them as financial capabilities since their presence signifies ultimately ability to pay for an operation.)

In addition capabilities can be distinguished into many (up to 2^8) different types according to their use.

2.8.2 CAPABILITY FORMAT

A capability is a double word (64 bits) with the following fields.

    TYPE      Attribute or Numeric field      CAPAB. MAGNUM
    8 bits    24 bits                         32 bits
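The double-word layout can be written down as a pair of packing routines. This is a minimal sketch: the function names are hypothetical, and placing TYPE in the most significant bits simply follows the left-to-right order of the field diagram, which the text does not state explicitly.

```python
# Field widths of the 64-bit capability, as given in the format description.
TYPE_BITS, FIELD_BITS, MAGNUM_BITS = 8, 24, 32

def pack_capability(cap_type, field, magnum):
    """Pack TYPE, attribute/numeric field and magnum into one 64-bit word."""
    for value, bits in ((cap_type, TYPE_BITS), (field, FIELD_BITS),
                        (magnum, MAGNUM_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value does not fit in its width")
    return (cap_type << (FIELD_BITS + MAGNUM_BITS)) | (field << MAGNUM_BITS) | magnum

def unpack_capability(word):
    """Split a 64-bit word back into (TYPE, field, magnum)."""
    magnum = word & ((1 << MAGNUM_BITS) - 1)
    field = (word >> MAGNUM_BITS) & ((1 << FIELD_BITS) - 1)
    cap_type = word >> (FIELD_BITS + MAGNUM_BITS)
    return cap_type, field, magnum

word = pack_capability(2, 0x00A5F0, 123456)
```

With 8 bits for TYPE there are 2^8 = 256 distinguishable types, which matches the "up to 2^8" figure above.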
TYPE is a codeword signifying the use of the capability. For example TYPE = 0...10 might imply that the capability is an ownership capability for a file. The TYPE will also indicate if it is a boolean or numeric capability.

In case of a boolean capability the 24 bits are used in the following way. The first 8 bits are used for control of the passing of the capability. They are used to control COPY, PASS, SEND and to indicate if the attribute field can be changed. The next 16 bits are used for access attributes. This enables the same capability to be used for many purposes according to the "on" bits of the attribute field, e.g. in case of a file manipulation capability the READ attribute bit can be turned on, but WRITE turned off.

In case of a numeric capability the 24 bits are used to hold the numeric value and/or some encoding. A numeric capability can always be divided or decreased, but not copied. It can always be sent by anybody. Hence we don't need the 8 bits for control and can devote all the 24 bits for the numeric value.

The capability magnum is a number which is unique for capabilities. Namely two different capabilities in the system (created at different times) cannot have the same magnum. Holding several copies of a capability (same type and magnum) is acceptable. We also assume that processes and sponsors will have a 32 bit identification, the naming magnum. A naming magnum can be the same as a capability magnum, since they are used differently. The 32 bits are enough to generate one capability per second for at least ten years. The system creates capabilities according to the specifications in the create primitive.

2.8.3 PACKING CAPABILITIES

Capabilities will proliferate unless we use indirection. Conceptually it is useful to think of all these capabilities passed
around, each process having its own copy, and used as passes or tickets. We feel that a simple scheme like this is workable, but even after reducing the size of the capabilities to a double word we still can't afford the proliferation. Two encodings can be used at two different levels.

1) The attribute field of a boolean capability enables the encoding of up to 16 different access attributes in the same capability.

2) Instead of having a capability we have a pointer (which must be a capability in itself) pointing indirectly to the right capability. One use is for instance the use of boolean capabilities for the ability to decrease numeric capabilities.

We extend the notion of an indirect capability further to enable encoding of combinations of direct capabilities. For that purpose we have at least two different TYPES: "Indirect to process" and "Indirect to sponsor". This type of capability is boolean and it points to a particular sponsor or process. The magnum of the capability takes the place of the process or sponsor magnum.* The first 8 bits of the attribute field are used for control of passing purposes. The other 16 bits are divided in two fields of 8 bits. The first 8 bits indicate the relative number of an 8-size block within the capability list. The second 8 bits specify which capabilities are pointed at within the 8-size block.

    TYPE      CONTROL      A      MASK      MAGNUM pointing to process or sponsor

* Note that this implies that an indirect capability may have the same magnum as another direct capability, but it is permitted since the types are different.
For instance if A = 13, Mask = 01010000, it is implied that the capability points to the second and fourth capability of the 13th block in the capability list of the process or sponsor specified by the magnum.

This mechanism enables us to have indirection, to cut down a capability list by a factor of eight and to pick any consecutive block of capabilities we need. The mechanism can accommodate any capability list up to 2^8 x 8 entries.
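The block-and-mask addressing of an indirect capability can be checked with a few lines of code. The sketch below rests on two assumptions that the text leaves open: blocks are counted from 1 (following the wording "the 13th block"), and the leftmost mask bit selects the first capability of the block. The function name pointed_entries is hypothetical.

```python
BLOCK_SIZE = 8                      # capabilities per block
MAX_ENTRIES = 2**8 * BLOCK_SIZE     # largest list an 8-bit block number can cover

def pointed_entries(block_number, mask):
    """Return 0-based capability-list indices selected by (A, Mask).

    block_number: A, counted from 1 (assumption).
    mask: string of eight '0'/'1' characters, leftmost = first capability.
    """
    if len(mask) != BLOCK_SIZE or set(mask) - {"0", "1"}:
        raise ValueError("mask must be eight binary digits")
    if block_number < 1:
        raise ValueError("blocks are counted from 1 in this sketch")
    base = (block_number - 1) * BLOCK_SIZE
    return [base + i for i, bit in enumerate(mask) if bit == "1"]

selected = pointed_entries(13, "01010000")
```

For A = 13 and Mask = 01010000 the selection is the second and fourth entry of the block that starts at list index 96, and 2^8 blocks of 8 entries give the stated limit of 2048 entries.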
2.8.4 KERNEL SYSTEM FACILITIES

The following is a list of facilities needed to manipulate capabilities.

1) CREATE. Create a capability according to specifications. Capabilities of certain types can only be created by specific processes, e.g. file capabilities. The created capability is boolean or numeric.

2) DESTROY. We must have the ability to delete a capability from a capability list. This is especially useful when we want to save some of the capabilities in the sponsors, or when a capability becomes obsolete, e.g. owner's capability in the deletion of a file. The capability can be boolean or numeric. We make the assumption that a numeric capability does not get destroyed automatically when it becomes zero, but explicitly with a DESTROY command.

3) DECREASE. Applicable to both boolean and numeric capabilities but with very different effects. The numeric capability gets decreased (this command can also be indirect). The boolean capability loses some of the "on" bits on the attribute mask.

4) INSPECT. Copy a capability in core to inspect it. The capability gets copied, except for its magnum. The index of the capability in the capability list is an argument.

5) MOUNT. Some privileged processes should have the ability to mount a capability from their core to a capability list.
This is true for the process which picks up the authorization capabilities from the sponsor (sitting there as data) and initiates the first user process. The highest process with respect to the father-son relation which has the same sponsor will be called a patriarch of the lower processes.

6) COPY. Applicable only to boolean capabilities. The corresponding numeric capability facility is SPLIT. A check is made for the ability to copy the capability.

7) SPLIT. Applicable to numeric capabilities. The numeric value is divided into parts. New capabilities with the same magnum are generated and obtain as values the parts of the numeric capability.

8) SEND. Facility to send a capability from one capability list to another.

9) WAIT. Wait for the reception of a capability.

10) PACK. Create a new capability which points indirectly to a block of 8 capabilities in a capability list.

11) EXPAND. Substitute an indirect capability with the capabilities it points to. One bit in the passing control indicates the ability to expand. In some instances of numeric capabilities we might like the indirection to remain.

2.8.5 PASSING CAPABILITIES

One of the main problems we have to face is the passing of capabilities from the sponsor structure, where they sit inactive as data, to the logical resource, e.g. the file system, through the family of processes. We divide the problem into three distinct parts. We need one mechanism to transfer capabilities from the sponsor to the patriarch (process with a different sponsor than his father). We need to pass capabilities between processes. Finally we need to pass the pertinent capability to the module providing the logical resource, e.g. file system.
Capabilities are passed between processes as special purpose messages [28]. This provides a solution for Parts 2 and 3. We only have to be careful to pack capabilities and use indirection, so they don't proliferate in the capability lists of the processes.

For Part 1 we need a special process called the Sponsor manager providing the interface between sponsors and processes. When the patriarch is initiated, he gets in his core a list of capabilities he can use, together with some explanation of what they mean, e.g. capability #7 on the sponsor list has to do with ownership of file FILENAME. The patriarch can then choose the capabilities he will need and he requests the sponsor manager to mount them for him (recreate them as capabilities). The sponsor manager is one of the few processes having the ability to pick an image of a capability from its core and reactivate it as a capability.

The second problem we have is how to pass a capability from the process structure to the sponsor. When the file system creates a capability for the manipulation of a file it can do three different things.

a) Pass the capability only to the process Pn issuing the command.

b) Pass the capability to both Pn and its sponsor.

c) Pass the capability to both Pn and its patriarch.

As part of the command the process Pn can specify which option it needs. Note that if we take (b) or (c) as standard we don't need to do anything for these capabilities when Pn is deleted.

Suppose we take option (a). Then we have to have a mechanism to prevent the capabilities from being lost when Pn terminates. Pn might require different action for the capabilities it created when it terminates, namely:

a') Some capabilities to be transferred directly to the sponsor.

b') Some capabilities to be transferred to its father.

c') Some capabilities to be transferred to its patriarch.
When a patriarch terminates all of its capabilities are transferred to its sponsor.

We favour the solution (a) combined with options (a'),(b'),(c') specified by the type of capability created by the logical resource, in this case the file system. When Pn terminates, Pn passes an indication about the option (a'),(b'),(c'). Note that there are some bookkeeping operations with respect to capabilities.

2.8.6 OUTLINE OF THE FILE SYSTEM

The basic file system has as goals:

1) A complete set of instructions which can be used to implement more user-oriented commands.

2) A structural design which will allow the introduction of new processes to accommodate extra facilities, e.g. different sponsor directories and space allocation schemes.

3) Adequate facilities for the development of an Operating System nucleus and the implementation of an interactive service.

A rigid assumption is made: All files consist of an integer number of fixed size blocks.
2.8.7 FACILITIES OF THE FILE SYSTEM

The following commands can be executed directly by the file system.

1) REQUEST (VOLUME NAME, SPECS)
The process issuing the command (henceforth called the process) asks for the allocation of an extra volume named VOLUME NAME. SPECS include: a) Capability for requesting volume; b) Sponsor of the process issuing the command; c) Pointer to the internal process descriptor (i.p.d.) of the process.

2) RELEASE (VOLUME NAME, SPECS)
The process wants to release a volume it owns for other uses. SPECS include: a) Capability of ownership of this volume;
b) Sponsor; c) Pointer to i.p.d. of the process.

3) ATTACH (VOLUME NAME, SPECS)
The process wants to mount a volume it can use on a logical drive. SPECS include: a) Capability for using the volume VOLUME NAME; b) Capability of issuing ATTACH command (financial); c) Capability of ownership of a logical drive (optional); d) Sponsor; e) Pointer to the i.p.d. of the process.

4) DETACH (VOLUME NAME, SPECS)
The process wants to dismount the volume VOLUME NAME from the logical drive on which it was mounted and get back the logical drive if it owned it. SPECS include: a) Capability of detaching the particular volume; b) Sponsor; c) Pointer to i.p.d. of process.

5) CREATE (FILE NAME, SIZE, VOLUME NAME-optional, SPECS)
The process wants to create a file FILE NAME. If a VOLUME NAME is specified it wants the file to be in the particular volume. SPECS include: a) Capability for creating files (financial); b) Capability for using the specified volume (optional); c) Sponsor; d) Pointer to the i.p.d. of process.

6) DELETE (FILE NAME, SPECS)
The process wants to delete the file FILE NAME. SPECS include: a) Capability of ownership for the file; b) Sponsor; c) Pointer to the i.p.d. of the process.

7) OPEN (FILE NAME, SPECS)
The process wants to open (link to) a file FILE NAME. SPECS include: a) Capability for opening files (financial); b) Capability of opening of the file; c) Sponsor; d) Pointer to the i.p.d. of the process.

8) CLOSE (FILE NAME, SPECS)
The process wants to close and disconnect from the file. SPECS include: a) Capability of closing file; b) Sponsor; c) Pointer to the i.p.d. of the process.

9) READ (FILE NAME, NFILE, MPROCESS, COUNT, SPECS)
The process wants to read from FILE NAME a number of
blocks specified by COUNT, starting from the NFILE block of the file consecutively, and to deposit them in memory consecutively starting from the MPROCESS block. SPECS include: a) Capability of issuing read commands (financial); b) Capability for reading the file; c) Capability for writing on the MPROCESS to MPROCESS + COUNT - 1 blocks of memory; d) Sponsor; e) Pointer to the i.p.d. of process.

10) WRITE (FILE NAME, NFILE, MPROCESS, COUNT, SPECS)
The process wants to write a number of blocks specified by COUNT to the file FILE NAME, starting from the NFILE file block, from the main memory starting from the MPROCESS block of the process. SPECS include: a) Capability of issuing write commands (financial); b) Capability for writing the file; c) Capability for reading from the MPROCESS to MPROCESS + COUNT - 1 blocks of memory; d) Sponsor; e) Pointer to the i.p.d. of the process.

11) SWITCH (FILE NAME, SPECS)
The process wants to change the capability of use for a particular file. This will not affect the processes which have the file open but from now on nobody will have access to the file unless he produces the new capability. The same way the process can change the financial capability. SPECS include: a) Capability of ownership of the file; AND/OR b) Capability for creating files (financial); c) Sponsor; d) Pointer to the i.p.d. of process.

Each command will have many associated actions. Consider as an example the CREATE command which has the following results.

1) The file FILE NAME has been created. Here is the capability of ownership of the file.

2) Your financial capability is no good.

3) The name FILE NAME is in conflict. Please change it.

4) The capability you specify for the volume is no good.

5) VOLUME NAME cannot be found.

6) VOLUME is not mounted. Please arrange to be mounted.

7) Volume or file system is loaded. Your file cannot be created.
8) Really sorry, but something went wrong.
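The eight results of CREATE can be pictured as return codes of a single routine. The sketch below is illustrative only: the names CreateResult and create_file, the dictionaries standing in for the directory and the volume table, the boolean arguments standing in for capability checks, and the order in which the checks are made are all assumptions. Result 8 is listed but never produced by this toy version.

```python
from enum import Enum

class CreateResult(Enum):
    CREATED = 1                    # ownership capability would be returned
    BAD_FINANCIAL_CAPABILITY = 2
    NAME_CONFLICT = 3
    BAD_VOLUME_CAPABILITY = 4
    VOLUME_NOT_FOUND = 5
    VOLUME_NOT_MOUNTED = 6
    VOLUME_LOADED = 7              # no room left for the file
    INTERNAL_ERROR = 8

def create_file(directory, volumes, name, size, can_pay,
                volume=None, volume_cap_ok=True):
    """Toy CREATE: size is in fixed size blocks, volume is optional."""
    if not can_pay:
        return CreateResult.BAD_FINANCIAL_CAPABILITY
    if name in directory:
        return CreateResult.NAME_CONFLICT
    if volume is not None:
        if not volume_cap_ok:
            return CreateResult.BAD_VOLUME_CAPABILITY
        if volume not in volumes:
            return CreateResult.VOLUME_NOT_FOUND
        if not volumes[volume]["mounted"]:
            return CreateResult.VOLUME_NOT_MOUNTED
        if volumes[volume]["free_blocks"] < size:
            return CreateResult.VOLUME_LOADED
        volumes[volume]["free_blocks"] -= size
    directory[name] = size
    return CreateResult.CREATED

volumes = {"V1": {"mounted": True, "free_blocks": 10},
           "V2": {"mounted": False, "free_blocks": 10}}
directory = {}
first = create_file(directory, volumes, "A", 4, True, "V1")
```

Returning a distinct code for each outcome lets the calling process decide how to react, for example by choosing another name after a conflict or arranging for a volume to be mounted.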
2.8.8 ORGANIZATION OF THE FILE SYSTEM

Files have entries in a sponsor directory. A tree structured sponsor directory is implemented corresponding to the sponsor structure. The entries in the sponsor directory simply point to the appropriate entries in the Master directory. We will allow both sharing and the ability to use local names for files.

For every file there is a unique entry in the Master directory. The entry will be of the following form:

    Magnum identifying uniquely the file
    Capability of ownership
    Sponsor of the creating process
    Date of creation
    Linked list of opened files entries:
        Pointer to and magnum of the file X Manager
        Linked List of Active Users
        Capability of use
    Linking Information
    Volume Name and Pointer to Volume table
    Surface, track, record of the beginning
    Size of the file
    Allocator and address calculator pointers

For every successful execution of an open command in the file system a process is created to manage the read and write operations for the particular file. Pointers to the processes are kept in the Master directory to ensure that the processes do not change the file concurrently.
The file system manages two sets of volumes: regular volumes belonging to the file system, where it allocates space for regular files, and special volumes belonging to other owners in the system which are managed by the file system. A volume table is kept containing the current status of all volumes. The entries contain:

a) Volume unique identifier.

b) Logical drive and capability if the volume is mounted. Flagged if the volume is not mounted.

c) Count of currently open files.

d) Pointer and magnum of volume's allocator.

e) Capability of ownership.

f) Date of creation.

Note that the entries are of fixed size.

Attached is a diagram of all the processes related or belonging to the file system and their communication lines. We hope that their names provide a hint for their function.

We do not claim any particular functional or performance properties for the file system outlined in comparison with existing file systems. It was described mainly as an example for the use of the protection concepts discussed in this section.
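[Figure: FILE SYSTEM STRUCTURE. Diagram of the processes of the file system and their communication lines. Legible box labels: ULTIMUM ANCESTOR, DEVICE ALLOCATOR, VOLUME MANAGER, FILE MANAGER, FILE ACCOUNTANT, TABLE MONITOR, VOLUME X ALLOCATOR, FILE A MANAGER I, FILE B MANAGER II, CALCULATOR.]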
3. SECURITY

3.1 INTRODUCTION

Lately there has been much discussion on the control of access of privileged information stored in large scale data banks [24,30,31]. The issues can be separated into three major categories [24].

Information privacy "involves issues of law, ethics and judgement" [24] controlling the access of information by individuals.

Information confidentiality involves rules of access to data.

Information security involves the means and mechanisms to ensure that privacy decisions are enforceable.

We share as individuals the concern of citizens for information privacy, but our main concern as computer professionals involves questions of information security. We must warn the public about the difficulties involved with control of access privileges and develop the techniques for implementing confidentiality decisions in systems and data banks in a cost effective manner.

We propose the following distinction between protection and security. Protection deals with the control of information access within the operating system without consideration of the nature of information. That is, protection mechanisms use labels, locks, keys etc. to ensure that an information block can only be accessed by privileged active units in the system. Information security differs from protection mainly in two ways.

1) The whole information system is considered rather than the system operating on the computer. The active elements are people. For example user identification is assumed correct for protection purposes but it is of prime importance in security enforcement.
2) The nature and content of data play an important role for the access privileges. The security mechanism can base its decisions on the contents of a file, or perform data checking during transmission, or insist on certifying the properties of a program.

It is implied that a secure system must be protected but that is not enough. We discussed in the previous section protection mechanisms. In this section we will concentrate on approaches to achieve security in addition to protection.

The following example gives an indication of potential security requirements imposed by selective privacy of information [24]. Consider an employee personnel/payroll file for a large industrial concern. The file might include data structures named NAME, SALARY HISTORY, CURRENT SALARY, PERFORMANCE EVALUATION, DEPARTMENT, MEDICAL HISTORY, and SOCIAL SECURITY NUMBER. The following could be conceivably the privacy decisions concerning the access of the file by individuals.

R1. He has complete access of file.

R2. He has no access of file.

R3. He may see any portion of the file, but change none of its contents.

R4. He may see exactly one record and may change only some fields of the record.

R5. He may see only the NAME and MEDICAL HISTORY and alter only the MEDICAL HISTORY.

R6. He may alter only "financial" portions of each record but only at specific times of day from specific terminals.

R7. He may see and modify only "financial" records with CURRENT SALARY below $15,000.

R8. He may see "financial" information but only in an aggregate way and not individual records.

R9. He may see PERFORMANCE EVALUATION but only for certain DEPARTMENT.

The protection mechanisms which were discussed in the previous section can very easily handle R1, R2, R3, R4.
As a matter of fact they can conceivably handle all requirements but in a rather inelegant manner. In addition we might ask the following embarrassing questions:

Q1) How can we ensure the identification of the individual?

Q2) What measures are taken against wiretapping?

Q3) What control is exercised over the discs or tapes when they are off line?

Q4) What happens when the system is not working properly due to malfunction of hardware or software?

A good security environment should attempt to provide solutions to the above mentioned problems. It is almost impossible to design an "unbreakable" system and maybe needless in a commercial environment. It is important to make the security mechanism very difficult and very expensive to bypass.

3.2 INFORMATION SYSTEM APPROACH

We will take an information system approach to the problem of security. It should be kept in mind that a breach in security will happen in the weakest link of the protection chain, which is not necessarily associated with the most complex or technically advanced operation.

3.2.1 INTEGRITY OF PERSONNEL

One of the most direct methods to break a security system is through the trust of a privileged user. This problem is well known and methods are similar to manual data systems. Personnel are investigated, high penalties for breach of security are established, accidental disclosure is minimized by labelling of information. A common method in banks is the authorization in pairs. It takes two persons to perform a sensitive operation. Both persons have to agree for a malpractice. In the same way functions of individuals should be separated. For instance, an operator who is also a systems programmer can violate security easier. The presence of a security guard is not beneficial unless he understands the operation sufficiently.
As a general rule both privileged access and information about the operation of the system should be disclosed to as few persons as possible. The "need to know" serves as a criterion.

3.2.2 AUTHENTICATION OF USERS IDENTITY

Users are traditionally identified with passwords. This procedure is not always adequate [30]. Three more elaborate identification means are outlined.

A1) The user is provided with a unique number every time he logs out. He uses this as a password to log in. If another person impersonates him in the meantime by producing the password the user will at least be able to detect that his environment was violated.

A2) The user identifies himself at log-in. The system provides a pseudo-random number x. The user performs a simple transformation T(x) and sends the result back. The system has the transformation stored in a highly secure area and is able to check the identity of the user [30]. Note that x and T(x) provide very little information for identification of T, assuming the transmission line was tapped.

A3) Some point out that one-time passwords are not adequate against infiltrators who attach a terminal to a legitimate user's line [30]. They propose identifying the messages with unique numbers implemented by hardware in the terminal and possibly in the central processor.

The problem of user identification is also related to the security of the physical location of the terminals and transmission lines.
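The exchange described in A2 can be sketched in a few lines. Everything named here is an assumption made for illustration: the class LoginSystem, the helper make_transformation, and in particular the affine form of T, which was chosen only to keep the sketch short. An affine T could be recovered from a few observed pairs (x, T(x)), so a transformation used in practice would have to be much harder to infer.

```python
import secrets

MODULUS = 2**31 - 1

def make_transformation(a, b):
    """Return T(x) = (a*x + b) mod MODULUS, a deliberately simple example."""
    def T(x):
        return (a * x + b) % MODULUS
    return T

class LoginSystem:
    def __init__(self):
        self._stored = {}       # user -> T, kept in a highly protected area
        self._pending = {}      # user -> outstanding pseudo-random number x

    def register(self, user, T):
        self._stored[user] = T

    def challenge(self, user):
        """The system provides a pseudo-random number x."""
        x = secrets.randbelow(MODULUS)
        self._pending[user] = x
        return x

    def verify(self, user, response):
        """Check the reply against T(x); each challenge is usable once."""
        x = self._pending.pop(user, None)
        if x is None or user not in self._stored:
            return False
        return self._stored[user](x) == response

system = LoginSystem()
user_T = make_transformation(17, 42)    # known to the user and to the system
system.register("user", user_T)
x = system.challenge("user")
accepted = system.verify("user", user_T(x))
```

Since a fresh x is drawn for every log-in, a reply overheard on a tapped line is of no use for a later session, which is the advantage over a fixed password.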
3.2.3 PROTECTION OF DATA OFF LINE AND IN TRANSMISSION

Data in hard copy, removable discs, or tapes should be adequately protected off line. This type of security is common among manual systems, and is handled by security
guards,
vaults, etc.
Care should be taken for data which can
be considered useless, but can provide e.g. old tapes, core images.
It is
and tapes "dirty" after their use. by overwriting processing
a breach of security
customary
to leave core
They should be cleared
zeros to erase the information.
restriction
important
Another
for security is the mounting
of removable volumes on drives which must be authenticated before access. back-up
Special care should be taken for adequate
facilities
proliferation
for integrity purposes without undue
of copies.
Some of the data are so valuable
that it is not a question of security but of survival
for the
organization. During transmission through wiretapping,
there is the potential
especially
danger
if common carriers are used.
Sensitive data should be encrypted.
We will discuss in the
next section encrypting methods. 3..2.4
THREAT MONITORING
Any password protection scheme can be violated assuming that the infiltrator has enough resources and patience [30]. Thus, monitoring of the system must take place. Petersen and Turn [32] give a description of threat monitoring. "Threat monitoring concerns detection of attempted or actual penetrations of the system or files, either to provide real time response (e.g. invoking job cancellation, or starting tracing procedures) or to permit post facto analysis". Threat monitoring enables the system to respond to and enforce penalties for attempted violations. Periodic reports on file activities can serve for performance evaluation and tuning of the system in addition to an indication of misuse or tampering. Without any threat monitoring a malicious process can, in addition, slow down the system by attempting different unauthorized accesses. Audit logs of the operations enable detection of security violations off-line.
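The two uses of threat monitoring quoted above, real time response and post facto analysis, can be sketched together. This is a hedged illustration, not a description of any system surveyed here: the class `ThreatMonitor`, its threshold `max_violations`, and the response strings are all assumptions introduced for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ThreatMonitor:
    max_violations: int = 3                         # assumed threshold, not from the text
    audit_log: list = field(default_factory=list)   # (user, file, allowed) records
    violations: Counter = field(default_factory=Counter)
    cancelled: set = field(default_factory=set)

    def record(self, user: str, file: str, allowed: bool) -> str:
        """Log one access attempt and return the monitor's response."""
        self.audit_log.append((user, file, allowed))
        if user in self.cancelled:
            return "cancelled"
        if allowed:
            return "ok"
        self.violations[user] += 1
        if self.violations[user] >= self.max_violations:
            self.cancelled.add(user)    # real time response: job cancellation
            return "cancelled"
        return "denied"

    def report(self) -> dict:
        """Post facto analysis: number of denied attempts per user."""
        return dict(self.violations)


if __name__ == "__main__":
    monitor = ThreatMonitor(max_violations=2)
    print(monitor.record("u1", "PAYROLL", allowed=False))
    print(monitor.record("u1", "PAYROLL", allowed=False))
    print(monitor.report())
```

Every attempt is appended to the audit log, whether or not it was allowed, so the log can also feed the periodic reports on file activities.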
Audit logs are hard to interpret, but they provide an audit trail and evidence of breaches of security. As such, they serve as a deterrent to malpractices.

3.3 DATA DEPENDENCE AND DATA TRANSFORMATIONS

In the discussion of protection so far, data were protected without any reference to their content. A good security system sometimes has to resort to both data transformation and investigation of data or program properties to ensure careful monitoring of access.

3.3.1 DATA TRANSFORMATIONS

Reversible encodings
of sensitive data can be used to conceal the information. They can protect against transmission wiretapping and/or unauthorized access to data files, etc. They are especially useful if a highly secure system needs to be attached as a subsystem to a system with marginal security, e.g. ASAP on O.S./360 [24]. Substitution of character strings, transposition of characters, and addition of key characters are three common types of transformations which can be combined to increase the work factor associated with breaking the code. The work factor depends on the following criteria (among others) [30,33].

a) Length of the key. Keys require protection, transmission, storage, and often memorization. It seems that a short key is desirable; on the other hand, better protection can be obtained by long keys.

b) Transformation space. Sufficient transformations should be available to discourage "trial and error" approaches. Transformations are user dependent and possibly time dependent.

c) Complexity. The work factor is related to the complexity of the hardware, software and processing time involved in the security system.

d) Error sensitivity. Enough redundancy should be provided to make decoding possible in the presence of errors.
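The three transformation types named above (substitution, transposition, and addition of key characters) can be combined into one reversible encoding, as in the following sketch. It is an illustration under stated assumptions and not a secure cipher: the helper names, the seeded shuffle used to build the substitution table, the block-reversal transposition, and the default `seed` and `block` values are all choices made for this example.

```python
import random


def make_substitution(seed: int) -> tuple[bytes, bytes]:
    """Build a byte-for-byte substitution table and its inverse from a seed."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    inverse = bytearray(256)
    for plain, coded in enumerate(table):
        inverse[coded] = plain
    return bytes(table), bytes(inverse)


def transpose(data: bytes, block: int) -> bytes:
    """Reverse each block of `block` bytes; applying it twice restores the input."""
    return b"".join(data[i:i + block][::-1] for i in range(0, len(data), block))


def add_key(data: bytes, key: bytes, sign: int = 1) -> bytes:
    """Add (sign=1) or subtract (sign=-1) the repeated key, modulo 256."""
    return bytes((b + sign * key[i % len(key)]) % 256 for i, b in enumerate(data))


def encode(data: bytes, key: bytes, seed: int = 1972, block: int = 4) -> bytes:
    forward, _ = make_substitution(seed)
    return add_key(transpose(data.translate(forward), block), key)


def decode(data: bytes, key: bytes, seed: int = 1972, block: int = 4) -> bytes:
    _, inverse = make_substitution(seed)
    return transpose(add_key(data, key, sign=-1), block).translate(inverse)


if __name__ == "__main__":
    message = b"SALARY 15000"
    coded = encode(message, b"key")
    print(decode(coded, b"key") == message)
```

In terms of the criteria above, `key` corresponds to a), the choice of `seed` and `block` enlarges the transformation space of b), and the sketch provides no redundancy, so a single corrupted byte is not recoverable in the sense of d).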
Encryption by data transformation is discussed in [32,34]. Good theoretical studies are also available. Certain machine instructions can be used to make the transformation efficient.

3.3.2 DATA DEPENDENT ACCESS

It is sometimes
desirable to control the access of data not only on the file level, according to the user or process privileges, but at the record level according to the data accessed. This approach avoids duplication of data and proliferation of capabilities, especially in a shared information environment. Hoffman points out the inadequacies of a capability scheme when he discusses a particular case [30,35]. In the example of a personnel/payroll file as discussed at the beginning [24], and assuming we do not duplicate data, we cannot restrict a file user from seeing SALARY items above $15,000 in access R7 without taking into account the value of the data in the field SALARY.
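The SALARY example can be made concrete with a small sketch of a data dependent access decision. The names `salary_ceiling` and `read_file`, the record layout, and the notion of a "privileged" user set are hypothetical; only the $15,000 limit on the SALARY field comes from the text.

```python
from typing import Callable, Iterable, Iterator

Record = dict
AccessProcedure = Callable[[str, Record], bool]


def salary_ceiling(limit: int, privileged: set[str]) -> AccessProcedure:
    """Return a procedure that decides access from the value of the SALARY field."""
    def may_read(user: str, record: Record) -> bool:
        return user in privileged or record["SALARY"] <= limit
    return may_read


def read_file(user: str, records: Iterable[Record],
              procedure: AccessProcedure) -> Iterator[Record]:
    """Yield only the records the access procedure allows this user to see."""
    for record in records:
        if procedure(user, record):     # decision made interpretively, per record
            yield record


if __name__ == "__main__":
    payroll = [
        {"NAME": "A", "SALARY": 9000},
        {"NAME": "B", "SALARY": 22000},
    ]
    rule = salary_ceiling(15000, privileged={"manager"})
    print([r["NAME"] for r in read_file("clerk", payroll, rule)])
    print([r["NAME"] for r in read_file("manager", payroll, rule)])
```

The file is stored once; the decision is taken per record at access time, which is where the overhead of the interpretive approach comes from.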
Data dependent access decisions can be made interpretively by invoking a procedure in the monitor associated with a particular access as represented by a password. It should be pointed out that a procedure related to a capability provides a very general form of protection; for instance, it can provide memory, policy decisions etc. The interpretive nature of the operation implies a certain amount of overhead, and it should be used only when needed [24]. For an example of a security mechanism based on procedures, consult Hoffman [36].

3.3.3 PROGRAM CERTIFICATION

The security mechanism
could involve the verification of certain program properties, which falls under the general area of program certification. We mention a few instances:

1) Consider the case where we would like to prohibit a program from retaining statistical information about the data on which it is operating. There is no other way that a security mechanism can enforce such a condition, unless the program is shown to demonstrate the proper behaviour by analyzing its structure.

2) In order to decrease overhead, sometimes we would like the compiler to test the accesses at compile time as compared with the protection status of the system. In such a case we have to trust the compiler that it will not willingly or erroneously provide the wrong access.

3) In most cases the security mechanism is implemented with software. It is important that the procedures providing the security be both protected and certified.

3.4 SUMMARY OF CURRENT PRACTICES

Some security measures on existing systems are
outlined [31]. For a general overview a table of "Countermeasures to Threats to Information Privacy" is very interesting as appears in [38,32]. Cost can be an overriding criterion. For example Hoffman [30] gives an example of a dial-up user identification which was marketed without much success due to cost.
System: ATS -- text editor for /360

SIGNON
  Identification                       account number; changeable password
  Authentication                       none
  User options                         none
SIGNOFF
  Accounting                           duration of use (needed for billing anyway)
FILE USAGE
  Determination of file names          files in account listed on request
  User identification                  five character passwords for another account's files
  User options                         none
  Accounting                           date last stored
  Types of capabilities                read-only, delete-only (separate passwords)
  Determination of capabilities        available through passwords
PROTECTION
  Security scope                       file
  Cryptography                         none
  Separation from data                 kept in separate directory
  Integrity considerations             very poor -- directory may be erased
  Protection from concurrent systems   no user programs
  Back-up                              separate working storage
  Residual information protection      password of file made unmatchable after delete
VIOLATIONS
  Standard response                    messages only
  Non-standard responses               disconnected for error in signon
Comments                               information obtained from U of Toronto Computer Centre
System: CPS -- time sharing for /360

SIGNON
  Identification                       account number; fixed password
  Authentication                       none
  User options                         none
SIGNOFF
  Accounting                           duration of use (needed for billing anyway)
FILE USAGE
  Determination of file names          able to find all file names in system
  User identification                  six character password for file
  User options                         none
  Accounting                           date last stored; date last accessed
  Types of capabilities                read-only
  Determination of capabilities        available through password
PROTECTION
  Security scope                       file
  Cryptography                         none
  Separation from data                 password kept in file directory
  Integrity considerations             poor -- open to all OS trap doors; uses separate storage keys and designated disks
  Protection from concurrent systems   none
  Back-up                              user explicitly saves files
  Residual information protection      zeros used core
VIOLATIONS
  Standard response                    messages only
  Non-standard responses               two signon errors result in disconnection; some errors abend CPS
Comments                               information obtained from U of Toronto Computer Centre
System: APL-Plus -- time sharing for /360

SIGNON
  Identification                       account number; changeable password
  Authentication                       none
  User options                         none
SIGNOFF
  Accounting                           duration of use (needed for billing anyway)
FILE USAGE
  Determination of file names          files to which some access is allowed are listed on request
  User identification                  password (integer) for a file; unlockable seals for functions
  User options                         able to use integer password to allow own checking fcn.
  Accounting                           date and time last stored; who stored it; amount of storage
  Types of capabilities                read-only, append, read-write etc.
  Determination of capabilities        matrix of user numbers vs. capabilities stored with file
PROTECTION
  Security scope                       file
  Cryptography                         mnemonic passwords encrypted
  Separation from data                 access matrix separated from file; [illegible] are also separate
  Integrity considerations             good -- no remote job entry; there exist batch programs to examine and alter files
  Protection from concurrent systems   good -- uses DOS protect features
  Back-up                              every file update written directly onto disk
  Residual information protection      nothing on disk readable before writing
VIOLATIONS
  Standard response                    error in function gives msg. and halts; file error logged
  Non-standard responses               a particular error in array subscripting disconnects user and makes workspace open to system
Comments                               information obtained from I.P. Sharp and Assoc. Ltd.
System: PDP/10 time sharing

SIGNON
  Identification                       project number; programmer number; fixed password
  Authentication                       terminal number can be read but it's not used
  User options                         none
SIGNOFF
  Accounting                           duration of use (needed for billing anyway); core usage
FILE USAGE
  Determination of file names          able to find all file names in system
  User identification                  password
  User options                         none
  Accounting                           date last used; date and time created
  Types of capabilities                read-only, execute, append, write, etc.; by programmer number, project number, [remainder illegible]
  Determination of capabilities        list of user passwords stored with file (field added at U of W Ont)
PROTECTION
  Security scope                       file
  Cryptography                         none
  Separation from data                 none
  Integrity considerations             good -- nobody able to get into EXEC state
  Protection from concurrent systems   (no entry)
  Back-up                              user explicitly saves files
  Residual information protection      core zeroed; deleted file erased on disk
VIOLATIONS
  Standard response                    messages only
  Non-standard responses               none
Comments                               information obtained from University of Western Ontario
System: MTS -- time sharing for /360

SIGNON
  Identification                       account number; changeable password
  Authentication                       none
  User options                         none
SIGNOFF
  Accounting                           duration of use (needed for billing anyway); date of last signon; resource usage
FILE USAGE
  Determination of file names          files in account listed on request; others provided by user
  User identification                  none
  User options                         a program file can check the user identification
  Accounting                           date last used; size of file; location and date created
  Types of capabilities                read-only for all shared files (including own access [remainder illegible]
  Determination of capabilities        not applicable
PROTECTION
  Security scope                       file
  Cryptography                         none
  Separation from data                 kept in directory
  Integrity considerations             poor -- privileged users can read all files and find all passwords; not all usages [illegible]
  Protection from concurrent systems   none -- other jobs under UMMPS can access all files
  Back-up                              file editor uses explicit saves
  Residual information protection      nothing on disk readable before a write
VIOLATIONS
  Standard response                    messages only
  Non-standard responses               three signon errors result in disconnection; some return codes abend a terminal
Comments                               new version is currently being prepared
System: CP operating system for /360

SIGNON
  Identification                       account number; fixed password
  Authentication                       none
  User options                         changeable signon procedure
SIGNOFF
  Accounting                           duration of use (needed for billing anyway); user identification
FILE USAGE
  Determination of file names          files in account listed on request
  User identification                  none
  User options                         none
  Accounting                           date last stored
  Types of capabilities                all capabilities on own "machine"; read-only on other
  Determination of capabilities        not applicable
PROTECTION
  Security scope                       file
  Cryptography                         none
  Separation from data                 (no entry)
  Integrity considerations             good -- no user can access CP core (only "virtual")
  Protection from concurrent systems   not applicable
  Back-up                              user explicitly saves files
  Residual information protection      core zeroed at signon; virtual "mini-disks" have volume labels erased at sign-off
VIOLATIONS
  Standard response                    messages only; some errors will deadlock a "machine"
  Non-standard responses               three signon errors result in disconnection; some severe line errors cause CP abend
Comments                               information obtained from Brown U. Computer Center
~.
REFERENCES

1) NATO Report on SOFTWARE ENGINEERING, Garmisch, Oct. 1968. Available through NATO, Dr. H. Arnth-Jensen, Scientific Affairs Division, OTAN/NATO, 1110 Bruxelles, Belgium.
2) Ichbiah, J., Compagnie Internationale pour l'Informatique, private communications.
3) Elspas, B. et al "Software Reliability", IEEE COMPUTER, Vol. 1, No. 1, Jan. 1971.
4) Henderson, P. and Snowden, R. "An experiment in structured programming", Technical Report 18, Computer Laboratory, University of Newcastle upon Tyne, 1971.
5) Naur, P. "Proof of Algorithms by General Snapshots", BIT, Vol. 6, No. 4, pp. 310-316, 1966.
6) Floyd, R. W. "Assigning Meanings to Programs", Proceedings of a Symposium in Applied Mathematics, Mathematical Aspects of Computer Science, Vol. 19, pp. 19-32, J. T. Schwartz (Ed.), American Mathematical Society, Providence, Rhode Island, 1967.
7) London, R. L. "Computer Programs can be Proved Correct", Proceedings of the Fourth Systems Symposium - Formal Systems and Non-Numerical Problem Solving by Computers, Case-Western Reserve University, 1968.
8) London, R. L. "Certification of the Algorithm Treesort", Communications of the ACM, Vol. 13, No. 6, pp. 371-373, 1970.
9) Hoare, C. A. R. "An Axiomatic Basis for Computer Programming", Communications of the ACM, Vol. 12, No. 10, 1969.
10) Clark, B. and Horning, J. "The system language for project SUE", SIGPLAN Symposium for Languages for Systems Implementation, 1971.
11) Snowden, R. "PEARL: An interactive system for the preparation and validation of structured programs", Technical Report 28, Computer Laboratory, University of Newcastle upon Tyne, 1971.
12) Manna, Z. and Pnueli, A. "Formalization of Properties of Recursively Defined Functions", ACM Symposium on Theory of Computing, pp. 201-210, 1969.
13) Luckham, D. "The Resolution Principle in Theorem Proving", Machine Intelligence 1, Collins and Michie (Eds.), American Elsevier, Inc., New York 1967.
14) King, J. C. and Floyd, R. W. "Interpretation Oriented Theorem Prover Over Integers", Second Annual ACM Symposium on Theory of Computing, pp. 169-179, 1970.
15) Dijkstra, E. W. "The Structure of THE Multiprogramming System", CACM 11, 5, May 1968, 341-346.
16) NATO Report on SOFTWARE ENGINEERING, Rome, Oct. 1969. Available through NATO, Dr. H. Arnth-Jensen, Scientific Affairs Division, OTAN/NATO, 1110 Bruxelles, Belgium.
17) Habermann, N. "Synchronization of Communicating Processes", Third Symposium on Operating Systems Principles, Palo Alto, 1971.
18) Coffman, E. et al "System deadlocks", ACM Computing Surveys, Vol. 3, No. 2, June 1971.
19) Belady, L. and Lehman, M. "Programming System Dynamics", IBM Report RC 3546, September 1971.
20) Lampson, B. W. "Dynamic Protection Structures", Proc. AFIPS Conf. 35, 1969 FJCC.
21) Lampson, B. "Protection", Proceedings of the Fifth Annual Princeton Conference on Information Sciences and Systems, March 1971, pp. 437-443.
22) Graham, S. and Denning, P. "Protection: Principles and Practice", Tech. Report No. 101, Princeton University Electrical Engineering Dept., to appear in SJCC 1972.
23) Dijkstra, E. W. "Cooperating Sequential Processes", in Programming Languages (F. Genuys ed.), Academic Press 1968, pp. 43-112.
24) Conway, R. W. et al "Security and Privacy in Information Systems: A Functional Model", Tech. Report No. 133, Dept. of Operations Research, Cornell University, June 1971, to appear in CACM.
25) "COSINE Report on Operating Systems", available through Commission on Education, National Academy of Engineering, 2101 Constitution Avenue, Washington, D.C. 20418.
26) Schroeder, M. and Saltzer, J. "A Hardware Architecture for Implementing Protection Rings", Proceedings of the Third Symposium on Operating Systems Principles, October 18-20, pp. 42-54.
27) Graham, R. M. "Protection in an Information Processing Utility", CACM 11, 5, May 1968.
28) Project SUE internal documentation workbook. Computer Systems Research Group, University of Toronto.
29) Brinch Hansen, P. "The Nucleus of a Multiprogramming System", CACM 13, April 1970.
30) Hoffman, L. "Computers and Privacy", ACM Computing Surveys, Vol. 1, No. 2, June 1969.
31) Gotlieb, C. C. and Hume, P. "Systems Capacity for Data Security". Draft report for study 6 of Privacy and Computers Task Force.
32) Petersen, H. E. and Turn, R. "System implications of information privacy", Proc. AFIPS 1967 Spring Joint Comput. Conf., Vol. 30, Thompson Book Co., Washington, D.C., pp. 291-300.
33) Shannon, C. E. "Communication theory of secrecy systems", Bell Syst. Tech. J. 28, 4, Oct. 1949, 656-715.
34) Baran, P. "On distributed communications: IX. Security, secrecy and tamper-free considerations", Doc. RM-3765-PR, Rand Corp., Santa Monica, Calif., August 1964.
35) Hsiao, D. K. "A File System for a Problem Solving Facility", Ph.D. Diss. in Electrical Engineering, U. of Pennsylvania, Philadelphia, Pa., 1968.
36) Hoffman, L. "The formulary model for access control and privacy in computer systems", SLAC Rept. No. 117, Stanford University, May 1970.
37) Beardsley, C. "Is your computer insecure", IEEE Spectrum, January 1972.
CHAPTER 4.A.

PROJECT MANAGEMENT

D. Tsichritzis
Department of Computer Science, University of Toronto
Canada

1. INTRODUCTION

To say that software is usually not produced within the specified cost and time constraints while demonstrating the proper performance requirements is probably an understatement. Past experience has shown that delays and cost overruns are so often considerable that they are almost accepted as a fact of life. One could always multiply the estimates by a large safety factor and let the project participants fill up the time with irrelevant activities. This approach can hardly be considered software engineering. The reasons for the difficulties in managing software projects can be divided according to two schools of thought.

a) Poor management. Due to the gap between manager and computer science education, there are very few persons with adequate technical and managerial skills. As a result software projects are managed either by people with good technical but limited managerial talent, or by good managers who do not really understand and communicate with the software engineers.

b) Random activity. The production of software is not a deterministic activity. Product specifications are liable to be shifted. Planning tools are lacking. Personnel changes are frequent and painful. It must be accepted from the beginning that the final product will be very different than the originally proposed.

In general, in order to manage software projects one has to use very different techniques, which take into account the nature of the project. We will outline some facts of life present in software production which give nightmares to managers.
1) Programmers tend to be "devious". They consider themselves artists and they demonstrate their artistic style in many disastrous ways.

2) Documentation of the project is not usually up to date. As a result the progress of the project is very sensitive to the continuous availability of certain members. As a matter of fact a shrewd project member can make himself indispensable due to lack of documentation.

3) Talent and raw intelligence is limited in any occupation. In view of the limited availability and use of software tools, designers have to be very talented. The really precious resource in software projects is designers' time.

4) When a bridge engineer successfully designs a bridge, he is expected to go on and design another bridge. When a software engineer is successful in designing a software system, he usually moves to a more complicated and different system, or to a managerial position, or to an academic environment. As a result most project members are new to the activity and they obtain concurrently training and education.

5) Due to the informal description of the software product it is easy to hide and postpone problems at the early phases, right down to the implementation phase. Sometimes major decisions are left to inexperienced hands.

6) Performance requirements force rewriting and redesign of some product modules.

7) The marketing people can come right in the middle and shift the groundrules.

In general
the goal of project management should be:

To produce the desired product within the stated design goals, specifications and available resources.

In order to achieve the goal the manager has to maximize "happiness" among the project members. Realistically, the resources are provided to produce the system and not to make anybody "happy". Unfortunately, the production of software according to specifications and within stated constraints is a question which can only be answered right at the end. Until that time of "reckoning" we should concentrate on keeping the project progressing at full speed.

2. PROJECT COMMUNICATION, ORGANIZATION AND CONTROL

I/ Team communication.

The initial problem is to find a common dialect of natural language for communication purposes. Unfortunately the terms in the area of software do not always convey the same meaning to everybody. Our suggestion is to adopt an easily readable manual of an existing similar system as a preliminary term dictionary. We propose as an example the RC 4000 manual for an Operating System's Project [2]. During the first two or three meetings members read the manual, which provides the language spoken and written during the lifetime of the project.

If the team is small (5-10 persons, so they can get together around a table), communication should be more technical and very informal. One meeting per week of all the members is desirable. Frequent informal conversations between the members should be encouraged. Parnas suggests that too much communication might negatively affect modularity [13]. Persons could use informal information to bypass standard interfaces. This is a risk that one should take.

If the project team is large, one should realize that it is almost hopeless. We don't have any clever suggestions except to ask: Why is it so large? Couldn't it be smaller? There are the usual mechanisms of memos, but much of the talent in the project will be expended on communications.

Proper documentation is important. In the SUE project now underway in the University of Toronto we use a workbook [15]. Every project member carries a copy which is updated regularly (3 to 5 new write-ups per member per week). Working and technical papers, minutes of meetings, every major decision reached, everything which is judged important to the project goes in the workbook. One team member is responsible for the updating. Material is condensed periodically. Old versions go in a special section in the back. The workbook gives not only the current status of the project, but a complete history as well. We expect the workbook to form the basis of adequate documentation; it will also enable us to perform an anatomy of the project after its completion.
II/ Team organization.

According to "Conway's Law" 60] systems resemble the organizations which produce them. If we split the system into its different functions and give separate pieces to separate groups, the interfaces will be as good or bad as the interfaces between the groups. If the team is small there is adequate communication and project members get involved in the design of more than one function. This has three immediate advantages.

a) There is at least one person (preferably two) who has the complete design in his head. This person can't be the project manager, since no manager can both manage the project and keep all the design details under control. The "walking documentation" of the system should be freed from other time-consuming tasks, so he can devote most of his time as a sounding board for the ideas of all other designers.

b) Members take part in more than one design and they begin to view the system globally. They do not try to optimize locally with the obvious disastrous effects.

c) There is less need for formal lines of management. Democracy works in small enough numbers. The project manager should keep the final strings, but he should not impose his will, especially on technical matters. Project members should feel free to make decisions, which is very desirable.

If the team is large (more than 10 persons), then a management structure must be developed. As a result many of the man-hours go into management. In some cases it is inevitable, but before one hires many persons one should ask the obvious questions. Do we need that many? How many designers and how many managers are needed? A good programmer and/or designer does not make a good manager.
III/ Control

Checkpoints should be carefully followed and accompanied by checkpoint reports. Checkpoint reports should not be regarded as a burden for the project; they should provide some real benefits and feedback. The following comments are pertinent.

a) Checkpoint reports should be realistic. If the need arises for an "optimistic" report for political reasons, the "bare" report should be distributed internally. The optimistic tone can then be added before the report goes out.

b) Checkpoint reports should not be apologetic, or gloomy. If there are real problems then reasons and solutions should be offered. If no reasons for the delays exist, then the project is not well managed. If there are no solutions to the problems, then an attempt should be made to change the groundrules.

c) Projects are very seldom discontinued as a result of checkpoint reports. The implication may be that checkpoint reports try to cover up the real problems. If this is the case, then they should not qualify as checkpoint reports; they exist mainly for political reasons. Every checkpoint report should always go as far back as the original objectives, specifications and benefits of the proposed project. If the need arises for a modified proposal, then it should be written.

d) Checkpoint reports can get out of hand. A specific short allotment of time should be made for the report. After all, the purpose of the project is to design a system and not to write checkpoint reports.

3. PROJECT PHASES

An outline will be given of the different stages constituting a software project.
3.1. Proposal

We believe strongly that a detailed proposal should be the first step of any serious effort. This is true irrespective of the financing sources of the project. The proposal serves as much (maybe more) for internal documentation purposes as a presentation of the project's objectives to the outside world.

The proposal should include:

1. Project objectives. Different goals should be outlined, for instance generality, efficiency, reliability level, etc. In case of trade-offs, the relative weights and priorities among them should be discussed. System specifications should be given: desired functional facilities, internal organization, etc. A method of testing the product with respect to the stated objectives and acceptance criteria should be given.

2. Resources. Time and cost requirements should be clearly stated. A schedule should be prepared with frequent checkpoints. A team organization should be proposed, taking a realistic view of the state-of-the-art and the availability of outside persons. Personnel requirements within the organization should be outlined.

3. Benefits. Benefits for the organization and for the general community should be emphasized. This might seem superfluous at first, especially if the decision to initiate the project is reached for other, mostly marketing and political reasons. For psychological reasons though, people like to think that they are participating in something important and exciting.

The proposal should be widely circulated both among the project members and within the organization. Interested or important persons should be asked to comment on the contents until a specific date. It should be emphasized to everybody that no response is equivalent to approval. Many iterations might be necessary. In the case of outside financing the proposal serves the real purpose of obtaining the resources. The proposal should be written with the participation of all key project members. They should take personal interest in the project from the beginning. We take the stand that money is not enough to ensure the loyalty of the participants. They should take pride in what they are doing.
3.2. Survey phase

During this phase the literature is surveyed for appropriate information. A general technique is chosen. Use of past experience should be made whenever feasible. Finally, design tools for the production are imported or generated.

We will illustrate this phase for the particular case of the design and construction of an Operating System [15]. First, by a survey of the literature we identified several design approaches [16].

I/ Level-approach. The best example is the "THE" system produced by Dijkstra et al. [4]. Starting from the bare machine, the system is built in levels. Each level uses only the facilities provided by the lower levels (closer to the machine) as logical resources. The final level is the system itself. The advantages of such an approach are mainly elegance, nice structure and reliability.

II/ Top-down [14]. Starting with the outer layer (Job Control Language, or other formal description of the system), one obtains the system by gradual refinement towards the actual machine. In the beginning the system is simulated. As the simulation becomes more refined, by breaking it into modules and substituting real code, the simulation finally meets the hardware and it becomes the system.

III/ Nucleus-extension [2,3]. Start by isolating the basic facilities of the system. Good candidates are process generation, message communication, CPU allocation, capability manipulation, etc. Generate a small nucleus of a system which provides these facilities. The nucleus may or may not be structured. From the nucleus, different Operating Systems can be obtained by extension, preferably using a level approach.

IV/ Modules-Interfaces. The system is broken into modules, usually based on the different functions to be performed, e.g. file system, I/O system, etc. The modules are designed independently and then interfaced together. Problems may arise from incomplete specification of the interfaces.
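The level approach listed first above can be illustrated with a small sketch. It is not taken from the THE system or from the project described here; the three classes, their names and the page size are hypothetical and only show the rule that each level uses nothing but the facilities of the level directly below it.

```python
class BareMachine:
    """Level 0 (hypothetical): the only place where raw memory words are touched."""

    def __init__(self, words):
        self.memory = [0] * words

    def load(self, address):
        return self.memory[address]

    def store(self, address, value):
        self.memory[address] = value


class PagedStore:
    """Level 1: offers pages as a logical resource, built only on level 0."""

    def __init__(self, machine, page_size):
        self._below = machine
        self.page_size = page_size

    def read_page(self, page):
        start = page * self.page_size
        return [self._below.load(start + i) for i in range(self.page_size)]

    def write_page(self, page, values):
        if len(values) != self.page_size:
            raise ValueError("a page must be written as a whole")
        start = page * self.page_size
        for offset, value in enumerate(values):
            self._below.store(start + offset, value)


class MessageBuffer:
    """Level 2: one short message per page, built only on level 1."""

    def __init__(self, store):
        self._below = store

    def send(self, slot, text):
        # Truncate to one page and pad with zeros; level 0 is never called here.
        codes = [ord(ch) for ch in text][: self._below.page_size]
        padding = [0] * (self._below.page_size - len(codes))
        self._below.write_page(slot, codes + padding)

    def receive(self, slot):
        return "".join(chr(code) for code in self._below.read_page(slot) if code)


machine = BareMachine(words=32)
system = MessageBuffer(PagedStore(machine, page_size=8))
system.send(1, "ok")
```

The final level, `system`, is the only object a user of this toy system would see. Because `MessageBuffer` never refers to `BareMachine` directly, either lower level could be replaced without touching the level above it, which is the structural benefit the text attributes to this approach.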
We chose as a design approach a combination of the different techniques and methodologies. We adopted the nucleus approach (III) using levels (I). Starting from a kernel (small nucleus), the nucleus is obtained using the level approach. The system is obtained from the nucleus, again using a level approach for every system or subsystem attached. The outer facilities of the system are always kept in mind and simulation is used whenever appropriate (II).

Second, we surveyed system programming languages. We decided to design and implement our own by borrowing many ideas from other languages, notably PASCAL [15].

Third, we generated the facility by which we can run and test our programs under an existing system, in the particular case OS/360.

In view of the fact that designer's time is the critical resource in the development of a software system, it is surprising to see how few design tools are available. For description purposes flowcharts are not adequate, since they do not provide the ability to describe concurrent operations. Many description models are available, such as Petri nets, program schemata, etc. [8,9]. Unfortunately they are more appropriate for theoretical results than for description and analysis of systems. Models tend to be rather primitive to facilitate theory, and they grow to tremendous proportions when used in practice. Simulation is very helpful, but time-consuming. If the system is to be implemented in a high-level language, extensive simulation may be as difficult to implement as the system itself. Queuing-theoretical and probabilistic analysis tools can be very useful for insight, but when we move to multiple-resource and multiple-queue models, they usually become intractable.

There are efforts to produce an environment for Computer Aided Design of software systems [7,11]. This type of approach has tremendous implications in the development of systems. One such environment, AED, originally developed at MIT, is now a commercial venture.
3.3. Design and implementation phase

We will not attempt to describe this phase in any detail, since most of our colleagues already talked about it. It involves among others:

1) Conceptual design
2) Structural design
3) Detailed design
4) Programming
5) Testing and debugging
6) Validation

In some projects where a "turn-key" product is desirable, there is the final stage of actually implementing the system in the user's environment. The project is only complete and successful after the product is certified by the user.

We will mention two mistaken approaches which could prove disastrous in the design and implementation phase.

a) Deadline approach. The system has to be ready by a certain date, or at least it has to "look" ready from a terminal. Without proper design the programmers are rushed on in the implementation. The system "almost" works and it "almost" meets the deadline. From then on, it is a continuous battle against unfavourable odds to make the system work at any time. We do not imply that deadlines are not important; the designers should have enough foresight and courage to change them or change the project (or job).

b) Million monkey approach. Untrained, or at best not properly trained, persons are hired in numbers. They are given desks, pencils, paper, a machine and the specs, and they are expected, hopefully, to produce the system. It usually does not work. System designers are needed with particular skills, and sheer numbers cannot be a substitute. As a result, whatever talent is in the project is probably wasted on management of the vast numbers of staff.
4. MANAGING "LARGE" PROJECTS

A "large" software project is usually defined as one with more than 25 members, or with at least two levels of management. The problems of managing "large" projects are outlined in the two NATO reports [11,12]. We feel that the state of the art for the management of such projects has not advanced significantly since the time of the discussions in Garmisch and Rome. There is rather a tendency to eliminate the necessity of "large" software projects. Instead of hiring enormous numbers of people, a few persons are selected and they are given good tools to produce the same product. For instance, it seems that only three experienced persons are used to produce the initial software for the CDC Star computer. One is responsible for the microprogramming, the other for the compilers and the third for the Operating System. This is quite a departure from the 5,000 people used for OS/360.

It seems that the productivity and joint intellectual output increase as more people get involved, reach a plateau, and then start decreasing rapidly. The point of diminishing returns varies between ten and twenty-five members according to the managerial talent in the project. The inherent reason is that above 5-7 members a managerial structure is needed.

We will outline the historical cases as presented in the Rome NATO report.

a) In 1950 a small group of people with very limited equipment produced an aircraft surveillance system. Judging from their success, the SAGE system went underway with 1,000 people recruited, from undertakers to school teachers, according to aptitude tests. The result was one year delay and considerable cost overrun.

b) In 1961 a small well-knit group of people in MIT produced a very successful time sharing system, CTSS, on equipment which can at best be described as [...]. The MULTICS project was launched as a follow-up, and although it is a successful running system now, it overran its time and cost estimates by 100 %.

In short, to manage a "large" software project, "extraordinary" managers are needed which are "skilled, flexible, tolerant, informed, extremely tactful and unfortunately rare" [12]. Such people may exist, but to think that they are readily available for managing software projects is an illusion.

One can always argue that "large" projects are necessary because with much parallelism they can shorten the production time. We will counteract this argument by a quote due to an unknown software soldier: "You can't have a baby in one month by impregnating nine women".

5. REFERENCES

1. Belady, L. and Lehman, M. Programming System Dynamics. IBM Report RC 3546, September 1971.

2. Brinch Hansen, P. (ed.) RC 4000 Software Multiprogramming System. A/S Regnecentralen, April 1969, Falkoner, Copenhagen F, Denmark.

3. Brinch Hansen, P. "The Nucleus of a Multiprogramming System", CACM 13 (April 1970), 238.

4. Dijkstra, E. W. "The Structure of THE Multiprogramming System", CACM 11, 5 (May 1968), 341-346.

5. Dijkstra, E. W. "Structured programming" in the second NATO Report [12], p. 84-88.

6. Dijkstra, E. W. Notes on Structured Programming. Technological U. Eindhoven, The Netherlands, August 1969.

7. Glaser et al. Project LOGOS. Distributed material, Project LOGOS conference, Case Western Reserve University, October 1971.

8. Holt, A. and Commoner, F. "Systemics" in Proceedings of Project MAC Conference on Concurrent Systems and Parallel Computation, Woods Hole, Mass., June 1970.

9. Karp, R. and Miller, R. "Parallel Program Schemata", Journal of Computer and System Sciences 3, 2, May 1969.

10. Mealy, G. "The System Design Cycle", Second Symposium on Operating Systems Principles, October 1969, Princeton University.

11. NATO Report on SOFTWARE ENGINEERING, Garmisch, Oct. 1968. Available through NATO, Dr. H. Arnth-Jensen, Scientific Affairs Division, OTAN/NATO, 1110 Bruxelles, Belgium.

12. NATO Report on SOFTWARE ENGINEERING TECHNIQUES, Rome, Oct. 1969. Available through NATO, Dr. H. Arnth-Jensen, Scientific Affairs Division, OTAN/NATO, 1110 Bruxelles, Belgium.

13. Parnas, D. "Information Distribution Aspects of Design Methodology", IFIP Congress 1971, COMPUTER SOFTWARE, pp. 26-30.

14. Zurcher, F. and Randell, B. "Iterative Multi-Level Modelling. A Methodology for Computer System Design". IFIP Congress 68, Edinburgh, Scotland (Aug. 1968).

15. SUE project development workbook. Computer Systems Research Group, University of Toronto.

16. COSINE Report on Operating Systems, available through the Commission on Education, National Academy of Engineering, 2101 Constitution Ave., Washington, D.C.
CHAPTER 4.B.
DOCUMENTATION
Gerhard Goos
University of Karlsruhe, Germany

0. INTRODUCTION

Documentation is the information about a program available in writing. The creation of a program is accompanied by documenting the different phases of development. There exist therefore different documentations describing the state of the program at different stages. These different documentations are used by different sets of people and for different purposes.

Documentation must be available. Not any sheet of paper on the desk of some programmer can be considered part of the documentation. There must be some standards stating which information is considered part of which documentation. These standards prescribe also the form in which the information has to be presented; without such rules it will be very difficult to retrieve the information. Lastly they allow for checking the completeness of the documentation, i.e. they guarantee that all relevant aspects are covered.

Documentation must be valid. There is no interest in knowing which algorithms were thought yesterday to solve a problem if they are different from those which are implemented today. Guaranteeing the validity of the documentation is no easy undertaking if many people are involved. The best way to achieve validity is the use of an automated documentation system.

There are many thick handbooks defining different standards for documentation to every detail. It is not the topic of this lecture to give lengthy listings about which detail should be described how. We try to overview the different needs for documentation and to give an introduction into some crucial problems.
1. THE NEEDS FOR DOCUMENTATION

Documentation has to answer different questions:

(1) How to use a program?
(2) What is the state of the project?
(3) What are the overall specifications of the project?
(4) Which models are used to subdivide the program and to interface different modules?
(5) Which basic models are used for the modules?
(6) Flow of control and flow of data through the program.
(7) Detailed description of data.
(8) What is the meaning of the error-messages?

These questions must be answered by different documentations at different times.

We distinguish

(a) the user's guide (ad 1, 8)
(b) the conceptual description (ad 2, 3, 4, 5)
(c) the design documentation (ad 2, 3, 4, 5)
(d) the product documentation (ad 2, 6, 7, 8)

The user's guide is at least conceptually initiated first but finished last. It must be independent from the other descriptions.

The conceptual description is developed as the project proceeds. It serves as an introduction, overview and specification of the project. It contains references to the design and product documentation.

The design documentation describes the current state of the project during the design phase. It defines the input for the construction phase.

The product documentation describes the current state of the project during the construction and maintenance phase. The program itself is part of the product documentation. It is the basis for writing the user's guide.

All these documentations are subject to the following conditions:

- Any question about details must be answered using a very small amount of time. For instance there could be a tree-like ordering imposed, allowing for tracing easily from the root to every detail found at a terminal node of the tree.

- The answers must be complete. They must reference related details which the reader has not asked for because he did not know that they exist or that they are related.
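The two conditions above can be sketched with a small tree of documentation entries. This is only an illustration under assumptions: the class `DocNode`, the functions `lookup` and `answer`, and the sample entries are hypothetical and do not come from the lecture. A detail is reached by one step per level from the root, and each entry may name related details so that the reader receives them without having asked.

```python
from dataclasses import dataclass, field


@dataclass
class DocNode:
    """One entry of a tree-ordered documentation (hypothetical structure)."""

    title: str
    text: str = ""
    children: dict = field(default_factory=dict)
    related: list = field(default_factory=list)  # paths of related details

    def add(self, title, text="", related=()):
        node = DocNode(title, text, related=list(related))
        self.children[title] = node
        return node


def lookup(root, path):
    """Trace from the root to a detail, one step per level of the tree."""
    node = root
    for title in path:
        node = node.children[title]
    return node


def answer(root, path):
    """Return the requested detail together with the related details it references."""
    node = lookup(root, path)
    return node.text, [lookup(root, other).text for other in node.related]


root = DocNode("product documentation")
files = root.add("file system")
files.add("record format", "Records are 80 characters, fixed length.")
files.add(
    "open",
    "open() checks the record format first.",
    related=[("file system", "record format")],
)
```

Asking for `("file system", "open")` returns the text of that entry and, in addition, the record format description it depends on, which corresponds to the completeness condition.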
Many projects have died because these conditions were not fulfilled by the design and product documentation. Programming is a creative task; nobody can fulfil this if it takes him too long a time to inform himself about the basis of his work. If programmers cannot get a clear understanding how their work is related to the work of others they must fail, because they usually start from wrong assumptions about their environment.
1.1. THE USER'S GUIDE

The user's guide is subdivided into two or, if necessary, three parts:

- the introductory manual
- the reference manual
- the operator's guide (if there is any activity of an operator implied).

The introductory manual serves three goals:

It gives an informal introduction and overview about what problems can be attacked by the program and what the limitations are. It is that part of the user's guide which should be drafted before the design of the program starts, because it specifies the objectives of the program. But note that the final version of the introduction describes the objectives one has achieved, not those one was trying to achieve when one started! The introductory manual mostly forms the descriptional basis for advertising and selling the program.

Secondly the introductory manual describes the "standard use" of the program. It is very useful to separate this part of the introductory manual from all other parts and to make it as short as possible, because every user is expected to know this information by heart or to have it in his pocket. It eases the job of the user if he does not have to select this basic information from different parts of the introductory manual. A "cookbook" must be given describing the commonly used job control cards with only a small number of options; layout and order of input and output data are described mostly by examples. E.g., for an ALGOL 60-compiler the following information is given:

- Command for starting the compiler with listing of source text and standard core size.
- Command for starting the translated program using standard core size.
- Advice how to punch special characters ("(/" instead of "[" etc.)
- Rules for punching input data (separation of numbers by two blanks etc.)
- How to use the available test facilities.
- Explanation of commonly occurring error messages.

Nothing is said about language restrictions or extensions.

Thirdly the introductory manual describes all possible applications of the program, informally and in common terms. The main difference between this description and the corresponding description in the reference manual is that this description should be readable by every user of the program. Previous programming experience or formal training of the user can be requested only if the objectives of the program require it. In such cases, however, the informal description is superfluous and need not be supplied. E.g., an informal description of the interface between operating systems and assembly programs is not needed; the informal description of a compiler contains an informal description of the language as implemented. "Informal" means that sometimes a compromise must be made between readability and the rigidity required for describing all exceptional cases. The reference manual assumes that the reader has some familiarity with related publications and with the current state of the art how to solve a given problem. The informal description does not assume this. It therefore should provide the user with some background information motivating him for the best usage of the program and telling him which cases can be handled particularly efficiently.

The user's reference manual supplies complete information how to use the program.
The formal level of the description must be such that clarity and completeness of the description can be achieved. In addition to the description of commands, format and content of input, output and error messages the following information is needed:

- How to install the program (minimal configuration and other resources required, operating system required, permanent files required, how to read the delivered tape etc.)
- Time and space estimates as a function of input data.
- List of programs which might be useful for ameliorating the performance of the given one or for solving related problems.
- Possible range of applicability and properties of the program (availability, robustness, constraints for changes or extensions etc.)

The operator's guide is needed whenever operator's intervention is required by the program, either for mounting particular devices (tape files, disc files), for the preparation of the input stream (job control cards, punched cards, data), for handling the output stream (printer-output, punched cards, tape files, disc files etc.), or at the console. The operator's guide must describe the operating of the program in case of usual operation and the reacting to exceptional and abnormal conditions.

1.2. THE CONCEPTUAL DESCRIPTION

The first version of the conceptual description is the paper from which the design starts. The overview belonging to the user's guide specifies the objectives of the program from the user's point of view; the conceptual description describes them in technical terms. Especially it specifies the constraints to which the creation of the program is subjected. Whenever possible it must specify estimates (minimal configuration, increasing amount of input data, exceptional cases yet to be handled efficiently) and a lower bound for the requested performance.

During the design the conceptual description extends to a description of the basic concepts and models underlying the design. It describes the decomposition of the program into modules and the basic concepts and objectives of each module.

Ideally the conceptual description should be stable when the design is finished. In practice this happens very seldom. Too often there are changes afterwards, e.g. because during production or maintenance the objectives could not be met, or because the design was changed later, or because the objectives have changed due to market conditions.
1.3. DESIGN AND PRODUCT DOCUMENTATION

The design and the product documentation have many properties in common:

- They will never be produced if there is no authority who insists on that.

- They are the working documents by which the designers and the programmers respectively report on their work. They provide the means of communication in the design and production group.

- Both should be subdivided by the same scheme so that explicit cross-referencing is minimized, since every reader of the product documentation knows where to search in the design documentation and vice versa. The subdivision corresponds to the subdivision of the program into modules.

- The information for each module is hierarchically ordered. First we get the basic information how it works, including the interface description to the outside. Then there follow the details about the internal structure, data and algorithms.

It is very important that the classification is created very early in the design. It may be revised later. The early existence of the classification scheme of the documentation makes it possible to check that every design decision is really documented and does not remain in the air or on somebody's desk.

The design documentation contains the specification of the models underlying each module. This consists of the formal description how the module works as well as a description of the basic data structures. All functions provided and requested by the module are specified. Diagrams specifying the flow of data and control through the program as a whole and through each module are given. Lastly the design documentation should record the history of the design. One often comes back revising an earlier decision. To this end it is useful to preserve the results of earlier discussions for reuse.

The product documentation includes a detailed description of all module-interfaces, data, data formats and algorithms. This description is crossreferenced to the source program, which itself belongs to the product documentation. The source program contains comments which in most cases should allow for local understanding of the meaning of the program without going back to the description. Special care must be taken to describe all kinds of error-handling very extensively. It is this information from which the maintainers of the program have to start for searching errors and correcting them. Error messages are useless if it is impossible to trace them back to their source.

There is no program which is stable forever. Either the objectives, the base system or the goals to be achieved by certain algorithms change sometimes. To adapt the program it is crucial that the product documentation clearly distinguishes between the essentials of an algorithm or a data structure and the particular way by which this is done, and that it describes the program points where additional activities can be inserted. Very often non-adaptability is a consequence of a bad documentation. The people involved must make experiments for reconstructing the information instead of reading this in the documentation.

2. SPECIAL PROBLEMS

2.1. DESCRIPTION OF DATA AND ALGORITHMS

Data belong to different classes:

- Data on permanent files
- Auxiliary files (scratch files)
- Tables accessed by many program-modules
- Parameters of calls of program-modules
- Local tables of a program module
- "State variables"
- Auxiliary variables

To describe data on files the record-structure and the methods of accessing them must be specified. In case of permanent files it is described to which purpose the data are or can be used by other programs (e.g. updating, gathering statistics). If this may happen, the synchronization in case of multiple access is described.

The description of files, tables accessed by many modules and parameters are part of the interface-description between modules. It must be indicated to which extent the consistency and validity of these data is checked by the modules using them.

The distinction between state variables and auxiliary variables depends on the problem. A boolean variable can be a state variable, or it can be an auxiliary variable if it is used only for the purpose of abbreviation. In describing the meaning of the variable it is, however, important to say which class the variable belongs to. Auxiliary variables have meaning only in a restricted part of a module, and sometimes in different parts they are used for different purposes. This must be indicated clearly.

In all cases it is necessary to state not only the possible range of values but also the meaning of the values in terms of the problem to be solved.

The description of the algorithms may be given by flow charts. This, however, is not recommended when using a programming language which includes a rich set of control structures (nesting, loops, conditional expressions etc.). In this case it is more appropriate to describe the program-modules by "pseudo-programs" which preserve the structure of the original program but replace detailed constructions by explanations in plain English.
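A hedged illustration of such a pseudo-program follows. The routine `account_totals` and its record format are hypothetical and chosen only for the example. The comment block is the pseudo-program: it keeps the loop and the condition of the source but states the details in plain English, while the function below it is the program being described.

```python
# Pseudo-program (structure kept, details replaced by plain English):
#
#   for every record in the input:
#       if the record is a comment line:
#           skip it
#       otherwise:
#           add its amount to the running total of its account
#   return the totals, one per account


def account_totals(records):
    """Program with the same control structure as the pseudo-program above."""
    totals = {}
    for record in records:
        if record.startswith("#"):
            continue  # comment line: skipped
        account, amount = record.split()
        totals[account] = totals.get(account, 0) + int(amount)
    return totals
```

A reader can match each line of the pseudo-program to a construct of the source without a flow chart, because nesting and order are the same in both.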
CROSSREFERENCING
BETWEEN DOCUMENTATION
In many cases procedures the source program. finding
and l o c a l
PROGRAM
data can be d e s c r i b e d by comments
Otherwise a p p r o p r i a t e
the d e s c r i p t i o n
AND
means must be e s t a b l i s h e d
g i v e n a p i e c e of source t e x t
in for
and v i c e v e r s a .
Four problems are i n v o l v e d : The s e a r c h i n g problem. ing the c o r r e s p o n d i n g
There must be l i s t i n g s section
program and v i c e v e r s a . name, data name) or
which a l l o w f o r
o f the d e s c r i p t i o n
The program p a r t
to each p a r t
can be s p e c i f i e d
(module name, procedure name)
specifyof the
by (module
in the d o c u m e n t a t i o n .
393
The p r o g r a m taining table
of
the
p r o g r a m must be e s t a b l i s h e d
in
the
identifier
the
source
problem.
is
To each
and t h e
given
Every
program local
by t h e
name o f
is
description.
used many t i m e s the
component
component
which in
time
reasonable
a certain
reference
problem.
a certain
other
variable
the
data
in
are
of
naming
as g i v e n of
data
2.3.
data
THE
The d e s i g n
and p r o d u c t to
which
is
out
is
in
the
dynamically
least
language
every
the
description
finding
This
Every is
is
the
difficult
when
purposes.
A solu-
other
global its
context
in
identifier
and t h e n
is
it
starts
declaration:Compo-
means a u n i t
documentation
small
or
telling and w h i c h
t h e most
the
"
which
serious
that
to
in
enough
the
pro-
which
functions
outgoing
calls
versa.
case
such as
allows
are
the
for
the
in
use o f
naming
these
trouble
packed source
data
when
into
explicitly
"access
cross-
identifier
causes
algorithm
In t h i s However,
the
This
or when t h e y access
problem
every
documentation
comments
one instead
program
must
to
better
solution
is
the
describing
the
access
up t o
date,
to
names.
DOCUMENTATION
documentation
member o f of
modules
and v i c e
contains
at
accessible mation
value
by q u a l i f i e d
MAINTAINING
This
description>
a programming
packed
it
contains
be a g u a r a n t e e
by i d e n t i f i e r .
contain
by t h e
where
conventions:
name. or
this
by o t h e r
program
program
the
everywhere
or
the
allocated
word and t h e
points
modules.
T h e r e must
can be f o u n d
qoreover,
the
different
T h e r e must be l i s t i n g s
The unnamed d a t a p r o b l e m s . referencing.
the
information.
can be c a l l e d
which
for
following
has a u n i q u e
in
a module
etc.
in
A "component"
of
cona
program
for
to
- The i n t e r f a c e
all
occurring
nent&globname. gram f o r
finding
must e x i s t
a certain the
be s e a r c h e d
each p r o c e d u r e
by a comment Additionally specifying
identifier
corresponding
component to
documentation.
an a l g o r i t h m
by e s t a b l i s h i n g
either
for
means f o r
program
same i d e n t i f i e r
tion
text
the
occurs.
source
declaration the
in
be a u t o m a t i c
of
section
contents
-- The naming or
documentation
index
should
a given
the
the
numbers
there
specify to
of
line
should
a pointer
date.
the
must be c o m p l e t e ,
team and i t
should
not
contain
infor-
use
The first goal, accessibility, is most easily achieved by having a good table of contents, established in advance, which allows for controlling the documentation and guaranteeing its completeness.

To update a documentation, mechanical aids are recommended. A text-editing system in a time-sharing environment solves a lot of trouble: it allows each programmer to update the documentation (so he will be more inclined to do the job), it saves clerical work in writing and distributing material, and it guarantees that not only the one copy of the documentation in the archive is up to date.

To delete outdated information, appropriate rules must be set up to decide which information is no longer needed. At least during design it is better to hold information too long than to destroy it too early. A very trivial rule should be recalled in this context: never enter any document or make any change without adding the date.
CHAPTER 4.C.
PERFORMANCE PREDICTION
R. M. GRAHAM
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF CALIFORNIA
BERKELEY, CALIFORNIA, USA
In the following sections we will consider the problem of predicting the performance of a proposed, but not yet implemented, computer system. To limit the scope of our subject we will focus our discussion on operating systems, even though we consider the hardware to be an important part of any computer system. Most of our discussion is also valid for any complex software, especially other types of computer systems such as a management information system or an airline reservation system. We have elected to focus on operating systems in our discussion and examples since operating systems are probably better known to software engineers than any other type of software.

Our discussion begins by considering the meaning of performance, the problems encountered when trying to measure it, and the limitations which bound the attainment of specific performance goals. We then consider the modeling of systems, which is an essential part of any performance analysis, and how these models are used in predicting performance. Since simulation is the most powerful of all the tools for predicting performance, we explore this topic in more depth. Finally, we conclude our discussion by exploring the integration of performance prediction with the design and implementation of software systems.

Present address: Dept. of Computer Sciences, The City College, New York, New York, USA
1. PERFORMANCE: DEFINITION, MEASUREMENT, AND LIMITATIONS

A well formulated set of design goals for a proposed system will include some explicit performance goals which the final implemented system should satisfy. Even if the design goals do not contain any explicit performance goals, some minimal performance goals are certainly implicit; for example, an implementation which takes an infinite amount of time to perform its functions is generally unacceptable. The basic problem in performance prediction is: given the design goals and given the specification of a particular design, determine whether the performance of the proposed design satisfies the design goals.
Before we can discuss how this can be done we must have a clear understanding of what performance means, how it can be measured, and what limitations exist which may prevent the attainment of arbitrarily selected performance goals.

1.1. WHAT IS PERFORMANCE?

Performance, in the present context, is the effectiveness with which the resources of the host computer system are utilized toward meeting the objectives of the software system. That is, how well does it do the job it was designed to do? Thus, performance has to do with the minimization or maximization of certain parameters in a system. For example, "minimum response time" and "maximum throughput" are phrases often used in discussing the performance of a system.

Two levels of performance are important in any discussion of the subject: minimum acceptable performance and optimal performance. Minimum acceptable performance is specified by a set of constraints on performance which an implementation must satisfy to be an acceptable implementation. These constraints may be explicitly stated as part of the design goals. For example, it may be specified that for a certain class of jobs the throughput must be more than a specified number of jobs per hour, or that the response time for a certain class of requests must be less than a specified number of seconds. Another type of constraint which is often explicitly stated is the maximum amount of primary memory which the operating system may use at various times. If these constraints are not explicitly stated then they must be derived from the design goals. For example, if no constraint on response time is specified for an interactive system, then some reasonable constraint is assumed. Five seconds is probably reasonable for a trivial request while five minutes is quite unreasonable.

In general the system designer attempts to produce a system which achieves optimal performance. There are many system designs which satisfy the minimum acceptable performance constraints. Some subset of these designs, which is usually small, contains all those designs which are "best" in some sense. It is difficult to define "best" precisely and this problem is discussed in the next two sections. A "best" design is one in which the values of certain parameters are maximized or minimized. We say that the implementation of such a design is a system which achieves optimal performance. The design goals may explicitly call for optimal performance in addition to specifying the minimum acceptable performance. For example, the design goals for an interactive system may explicitly call for minimal response time while the design goals for a non-interactive system might specify maximum throughput.

The performance of any particular system must always be related to the purpose and goals of that system. No universal set of system parameters can be used to define performance, since the significance of parameters depends on the particular objectives of the system. In fact, it is possible that the performance of two quite similar systems, each designed for a different purpose, will be defined in terms of two nearly disjoint sets of system parameters. In a sense, the performance of a particular system is a characterization of how well that system does the job that it was designed to do. Thus, any discussion of the performance of a particular system must always be relative to the purpose for which the system was designed.
1.2. MEASUREMENT OF PERFORMANCE

In the preceding section we used words like "optimal" and "minimization" in discussing the meaning of performance. Use of these words implies that performance, or at least some aspect of it, is quantitatively measurable.
There are two significant problems in connection with the measurement of performance: selection of an adequate metric and the variability of performance as a function of the input to the system.

1.2.1. PERFORMANCE AS A FUNCTION OF INPUT

It is clear that performance is a function of the inputs to the system; that is, it is variable and depends on the characteristics of the jobs or requests which are submitted to the system.
For example, in an interactive system, if every request takes one hour of computing time and the system contains only one computational unit (processor), then the average response time cannot possibly be five seconds.
A computer system has at its disposal a finite amount of each of a number of different resources, such as processors, primary memory, secondary memory (disks, drums, ...), input-output devices (tape units, printers, card readers, communication lines, ...), and input-output channels (or processors).
The system uses these resources to process the jobs or requests submitted to the system. The characteristics of a job (or request) which are significant in performance measurement can be specified in terms of the sequence in which a job demands the use of various resources and the amount of these demands. It is usual to consider types of jobs rather than individual jobs.
A job type is a class of jobs all of which have similar sequences of resource demands. There are clearly an infinite number of different job types. Fortunately, many of the jobs encountered in real systems seem to belong to one of a relatively small number of different types. For example, one common job type consists of a sequence of more or less evenly spaced requests for input-output with a relatively small amount of computation between successive requests. The term input-output limited is often used to characterize this type of job.
Many data processing jobs are of this type, especially file maintenance. Another common job type is a short sequence of input requests followed by a long computation and terminated by a sequence of output requests. Many scientific computations are jobs of this type. Even though the number of different job types commonly encountered in real systems is relatively small, there are a number of open problems in this area. For example, it is not clear how much difference is acceptable between the resource demand sequences of two jobs of the same type. Also, the common job types have not all been identified and precisely characterized.

1.2.2. METRICS

Since the performance of a system must always be related to the purpose for which the system was designed, there is no single metric for the measurement of system performance which is clearly superior to all others. For example, in an airline reservation system the performance might be measured by the sum of the average response time and its standard deviation. Thus, for the performance of the system to be optimal both the average response time and its deviation must be small (but not necessarily minimal). On the other hand, in a multiprogramming batch system the performance might be adequately measured by the inverse of the fraction of time that the central processor is not in use. Thus, optimal performance is obtained when the percent of processor idle time is almost zero.

Performance requirements which are contained in the design goals are usually specified in terms of user oriented system parameters, such as throughput or response time. These parameters are usually not basic variables of the system and are thus often difficult to work with or measure.
If we view the system as mainly concerned with resource management then its basic variables are those things having to do with resource management. A number of different variables are relevant for each system resource, for example, the percent of the resource currently in use, the number of requests for the resource which are currently queued, and various statistical properties of the preceding two variables, such as the average and maximum queue length, the average and maximum time a request stays in a queue, and the average and maximum amount of the resource used.

Use of user oriented parameters in the design goals is natural and reasonable since they describe the performance goals in terms which are meaningful to the user.
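The two example metrics of this section, and one of the basic resource variables, can be written down directly. The following is a minimal sketch only: the function names are hypothetical, the population standard deviation is an assumption (the text does not say which estimator is meant), and the queue samples are assumed to be (duration, queue length) pairs recorded by some measurement tool.

```python
from statistics import mean, pstdev


def airline_metric(response_times):
    """Sum of the average response time and its standard deviation.

    Smaller is better. Population standard deviation is an assumption.
    """
    return mean(response_times) + pstdev(response_times)


def batch_metric(idle_fraction):
    """Inverse of the fraction of time the central processor is not in use.

    Larger is better; grows without bound as idle time approaches zero.
    """
    if idle_fraction <= 0:
        return float("inf")
    return 1.0 / idle_fraction


def queue_statistics(samples):
    """Time-weighted average and maximum queue length.

    `samples` is a list of (duration, queue_length) pairs (assumed format).
    """
    total_time = sum(duration for duration, _ in samples)
    average = sum(duration * length for duration, length in samples) / total_time
    maximum = max(length for _, length in samples)
    return average, maximum
```

The first two functions are user oriented parameters; the third is a basic system variable of the kind a designer would inspect when looking for the cause of a poor value of the first two.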
When considering the purpose of a system and its performance requirements, parameters like throughput and response time are meaningful, while the basic system variables like average queue length are practically useless. However, the opposite is true when the system designer is attempting to produce an optimal design. The system designer is concerned with the internal workings of the system. Thus, he is dealing with things such as queues and resource allocation algorithms. Knowledge of the average response time is of little use to the designer since his problem is to design allocation and scheduling algorithms which will achieve optimal performance.
In analyzing the performance of the system he is interested in the way that a job's resource demand sequence, interacting with his allocation algorithms, affects such variables as average and maximum queue length and the average percent of unused resources. Of course the system designer is attempting ultimately to minimize or maximize the parameters specified in the design goals; however, knowledge of their values often gives him no other information than whether or not he has succeeded. What he needs is some information about where the problem lies, that is, where the system design needs to be changed in order to improve its performance. When he makes a change in the design the values of the basic system variables are directly affected, while parameters like response time are only indirectly affected.
The values of these higher level parameters are functions of the values of the basic system variables. These functions are usually quite complex. We have now uncovered the fundamental problem in performance analysis, that of finding and expressing the relationship between parameters such as throughput and response time and the basic system variables. This process is called modeling and the expression of this relationship for a specific system is called a model of that system.
The problem of modeling systems is discussed in a later section.

The problem is further complicated by the fact that the basic system variables are not necessarily mutually independent. If these variables are not mutually independent, then changes in the system design which affect one variable may also cause unwanted effects in other basic variables. This makes it more difficult to figure out what changes in the system will cause the desired change of behavior.
Another complication arises if the basic variables are not all relevant, that is, some of the variables may be such that changes in their values have little or no effect on the performance of the system.

1.2.3. STEADY STATE, TRANSIENT, AND OVERLOAD BEHAVIOR

When measurement of performance is considered it is important to recognize the difference between the steady state behavior of a system and its behavior during transients or under overload conditions.
The system will normally spend most of its time in a steady state condition. We are most interested in the performance of the system when it is operating in this normal, steady state condition. When measuring a system's performance we must make certain that we are measuring the steady state performance. Generally, the performance of the system in the steady state is some reasonable function of the input; thus, the change in its performance due to a change in the input is easily predictable. This is frequently not true for transient or overload behavior.
For example, if the input changes from jobs of one type to jobs of a quite different type there may be a short period of quite erratic and unpredictable behavior until the system settles down to a steady state behavior which is a function of the new job type. An overload condition will usually result in similar unpredictable behavior, as for example, when an interactive system which can give good service to only N terminals is operated with 2N terminals.
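One simple way to make sure that a measurement reflects steady state rather than transient behavior is to discard the observations taken during an initial settling period. The sketch below assumes this approach; it is not a method given in the text, and both the function name and the choice of the `warmup` length are hypothetical and would have to be justified for a real system.

```python
def steady_state_mean(samples, warmup):
    """Mean of the observations after discarding the first `warmup` of them.

    `warmup` is the assumed number of observations taken while the system
    was still settling; choosing it is left to the person measuring.
    """
    steady = samples[warmup:]
    if not steady:
        raise ValueError("no samples left after discarding the warm-up period")
    return sum(steady) / len(steady)
```

For instance, with response times of 9, 7, 2, 2 and 2 seconds and a warm-up of two observations, only the last three values contribute to the result.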
Even though the steady state performance of a system is the system designer's major interest, it is important that transient and overload behavior be studied. There are two reasons for this. First, the ideal design is one in which the erratic effects of transient and overload behavior are minimal; study of performance under these conditions should help the designer in achieving this ideal design. Second, since the erratic behavior under transients and especially overload conditions can never be completely eliminated, it is important to know under what conditions such behavior occurs and what effects it has on the system performance. This knowledge is needed so that during normal operation of the system these conditions can be avoided or, when they occur, their occurrence recognized and corrective action initiated.

1.3. LIMITATIONS OF PERFORMANCE

It is important that we recognize that there are limitations on performance; that is, for any proposed system there exist performance criteria such that no possible design can satisfy them.
These limitations can be divided into two classes, inherent and economic. Inherent limitations are limits which are theoretically impossible to overcome. Economic limitations are limits which are possible to overcome in theory but not in practice, that is, the cost of overcoming the limit is unreasonable. The following discussion is intended to be suggestive of the types of limitations which exist and is not an exhaustive list.

1.3.1. INHERENT LIMITATIONS

Since all operational computing systems use physical devices, the laws of physics are an obvious inherent limitation.
These laws limit the speed of signal (information) transmission. This ultimately limits the speed at which a given task can be accomplished. The larger a computing system is, the more serious this problem becomes. For example, there is a lower bound on the time it takes to add together two numbers which is certainly not smaller than the time it takes to transmit the two numbers to the computer's adder unit and their sum back to memory (the working registers of a computer are a form of memory). Thus, no matter what technique is used for addition, the minimum time is limited by the laws of physics. A similar limitation exists in interactive systems where the minimum response time is limited by the transmission time of the request from the terminal to the computer and the reply from the computer to the terminal.

Another inherent limitation results from conflicting objectives. Whenever the performance goals include constraints on more than one variable it may be impossible to design a system which satisfies all of the constraints if the variables are not mutually independent. The oldest and most well known example of this conflict is the time-space trade-off.
For example, it is well known that the fastest method of computing a value for a common mathematical function, such as the cosine, is by use of a table of values where the argument of the function is used as an index. However, this speed is obtained at the expense of a large amount of memory space. On the other hand, a method for computing the cosine which uses a minimum of memory space is considerably slower. This difference is even more pronounced for more complicated functions. Thus, we can minimize memory space or computation time, but not both simultaneously. In principle the minimization of a set of variables can be accomplished by linear programming methods. However, in practice the designer is usually unable to formulate the appropriate mathematical relations.

Another limitation which we consider inherent is complexity. It is our experience that it is possible to conceive of systems which are too complex to build. It can be argued that this is not the case, that given enough time, money, and people anything can be built. We do not accept this as a valid argument. It is even more the case that a system may be so complex that, while it can be built, it never works the way it was intended. There is always something wrong with it and its reliability is so poor that it is useless.
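The time-space trade-off for the cosine, described earlier in this section, can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration: the table size of 4096 entries, nearest-entry lookup without interpolation, and a 12-term power series as the low-memory method. The names `cos_table` and `cos_series` are hypothetical.

```python
import math

TABLE_SIZE = 4096                      # assumed table resolution
STEP = 2.0 * math.pi / TABLE_SIZE      # spacing between tabulated arguments
COS_TABLE = [math.cos(i * STEP) for i in range(TABLE_SIZE)]  # costs memory


def cos_table(x):
    """Fast method: use the argument as an index into a precomputed table."""
    index = round((x % (2.0 * math.pi)) / STEP) % TABLE_SIZE
    return COS_TABLE[index]


def cos_series(x, terms=12):
    """Small-memory method: sum the power series 1 - x^2/2! + x^4/4! - ..."""
    x = math.remainder(x, 2.0 * math.pi)   # reduce the argument to [-pi, pi]
    term = 1.0
    total = 1.0
    for k in range(1, terms):
        # Each term follows from the previous one by one multiply and divide.
        term *= -x * x / ((2 * k - 1) * (2 * k))
        total += term
    return total
```

The table version does one index computation per call but holds 4096 stored values; the series version stores nothing but performs a loop of multiplications on every call. The table version is also less precise here because no interpolation is done.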
1.3.2. ECONOMIC LIMITATIONS

The other type of limitation is economic. This type of limitation is not inherent and can be overcome given enough money. It is, however, a genuine limitation since no project ever has an unlimited source of money. The optimal system design which can be achieved is limited, often rather severely, by a relatively fixed budget.
Economic limitations on performance are often overlooked in formulating the design goals and unrealistic performance objectives are often attempted.

One obvious economic limitation is the cost of high performance hardware. The current state of the art in hardware technology is such that hardware can be built which has considerably higher performance than the hardware which is currently marketed.
In addition, there is a wide range in performance among the currently available hardware and it is usually the case that the higher performance hardware is the more expensive hardware. Thus, up to a point, higher performance can be achieved by spending more money. This is not usually an acceptable method of achieving optimal performance due to the budget limitations of most projects.
Hence, a limited budget imposes an upper bound on the performance of the system.

Another way that a limited budget can impose an upper bound on the performance of the system is to limit the search for more optimal algorithms. It is well known that additional work by a designer usually leads to improvements in the efficiency of an algorithm or even a completely new algorithm with much superior performance. This is especially true with a complex operating system.
Thus, better performance can be obtained by having the designer spend more time on the design and implementation. However, the designer is paid and thus the size of the project's budget limits the total amount of time that the designer can devote to algorithm improvement.
Only rarely is the budget of sufficient size that all parts of the system can receive equal attention. Usually only those parts which are most critical in the system's performance can be considered for improvement. For this reason it is vitally important that the system designer understand the behavior of the system which he is designing so that he can locate those parts which have the most critical effect on the overall performance of the system. These parts of the system should be the first to receive his attention in an attempt to discover more efficient algorithms.

1.4. SUMMARY

In the preceding sections we gave an informal definition of performance and briefly discussed the problem of measuring performance.
We also pointed out that there are limitations on the optimality of performance which we can reasonably expect to achieve.
Interest in performance analysis, while increasing rapidly in the past few years, is not new since several papers on the subject were published in the open literature more than ten years ago. The number of published papers relevant to performance is large. Crooke, Minker, and Yeh [1] list over 400 papers. In spite of this mass of literature, satisfactory solutions do not exist for most of the problems in performance analysis, especially performance prediction.
There seems to be no documented case where performance prediction was used effectively and extensively in the design and implementation of a large, complex operating system.
The reasons for this are discussed in section 5.
The aspect of performance analysis which has received the most attention is the evaluation of the performance of existing systems. Lucas [2] gives a good summary of the major techniques currently in use for performance analysis. Much work has been done on improving the performance of existing systems. The improvement which is possible in most systems is so substantial that several commercial organizations exist whose sole business is performance evaluation and improvement [3].

2. SYSTEM MODELING

We now focus our attention on the central problem of performance analysis, system modeling. A model of a system expresses, in some form, the relations which exist between the basic variables of that system. The model of a system may be very simple or it may be quite complex, depending on the complexity of the system and how the model will be used.
If the system is very simple the model may be so simple that it exists only in the mind of the designer.
However, most systems are of sufficient complexity that any useful model will have to be expressed in some precise and formal way.
It is clear that a complete, detailed description of the system, such as the actual code, is a model of the system.
However, such a model is generally not useful since it contains a large amount of unnecessary information and does not clearly exhibit the relations between the basic variables. A model is an abstraction containing only the significant variables and relations. Hence, it is usually much simpler than the system which it models. How much simpler will depend heavily on the expected use of the model and the precision desired in the results of its use.

Any form of performance analysis, even the measurement of performance, is impossible without some kind of model.
Conceptually a model is a function, or set of functions, in which the system parameters used to characterize the system's performance are expressed in terms of the system's basic variables.
When predicting performance, a sequence of jobs or requests is
expressed as a sequence of values for the basic variables.
Using this sequence of values and the model we obtain a sequence of values for the performance parameters.
This sequence of parameter values then describes the performance of the system for the given sequence of jobs or requests.

If we wish to measure the performance of an existing system we need a model to help us interpret the quantities which we can actually record. The performance of a system is not a constant number; rather it is a function, or several functions, whose values depend on the input. In order to characterize the performance of a given system we have to express these functions, and this expression is a model of the system. We can directly record the values of only some of the system variables. The values of these variables for different input sequences are then used to derive the values for the coefficients of the functions in the model.

2.1. TYPES OF MODELS

There are many different kinds of models for systems, but basically only two different types: analytical and logical.
An analytical model is a set of mathematical equations which express the relations which exist between the basic system variables and the performance parameters.
These equations are then solved for the dependent variables, that is, for the performance parameters.
After solving these equations the system's performance is completely known since graphs of the performance parameters can be plotted from the resulting mathematical expressions.
In general, an analytical model does not reflect the structure of the system which it models, but only expresses the relations between its variables. On the other hand, a logical model mirrors closely the structure of the system being modeled.
A logical model usually cannot be solved in the sense that an analytical model can be, that is, closed form expressions for the performance parameters are not derivable from the model.
However, a logical model often includes mathematical equations which express some of the relations between variables.

In the following sections we will discuss analytical models and two different kinds of logical models, directed graph models and simulation models. In each case we will informally define the kind of model and give an example. In sections 3 and 4 we will see how these particular example models can be used for predicting performance. For that reason, the examples will not be analyzed in this section. The purpose here is to show how they model a system.

2.1.1. ANALYTICAL MODELS
An analytical model is a set of mathematical equations which express the relations which exist between the basic system variables and performance parameters.
As such, the number of different analytical models is large and the kinds of mathematical equations appearing in these models cover a wide range.
However, practically all analytical models share one common property: they are stochastic, that is, analytical models ordinarily contain stochastic variables.
The exact sequence of values taken by a stochastic variable is not known; however, its range of values and the probability with which it will take these values is known or assumed to be known. Stochastic variables are usually defined by functions which give the probability of the variable taking the values in its range. The stochastic nature of analytical models really reflects a basic property of system performance. We seldom, if ever, know the exact sequence of jobs or requests which will be presented to a system. Further, we do not know the exact characteristics of the jobs or requests. Ordinarily the most we know, or can estimate, is the probability distribution of such quantities as the job arrival time and its resource demands. Since this stochastic character is basic to operating systems we should expect that it will also show up in the logical models.

Much of the work on analytical models of systems has focused on the scheduling policy used in the system. Assuming that the performance of a system is not noticeably affected by anything other than the scheduling policy, quite simple analytical models can be constructed. The simplest such model is based on a first-come-first-serve scheduling policy. We assume that our system consists of a single processor and a single queue for that processor, as in Figure 2.1.
Time i s divided into units called quanta, each of which i s exactly
new job enters system
~ / 7 - ~ / 7 ~ > queue
" > processor
completed job leaves system
Figure 2.1 F i r s t - c o m e - f i r s t - s e r v e model
Q seconds long. At the end of each quantum a new job may enter the system, i f so i t i s put at the end of the queue. The processor i s always allocated to the job at the head of the queue. Once the processor has been allocated to a job, that job executes u n t i l
i t s execution is complete. The completed job
then leaves the system and the processor is allocated to the job now at the head of the queue. I f the queue i s empty the processor remains i d l e u n t i l a new job i s placed on the queue. Thus, each job which enters the system is queued u n t i l i t gets i t s turn at the processor. Once a job gets the processor i t executes to completion. This scheduling policy is frequently used in the simpler, nonmultiprogramming, batch systems. To construct an a n a l y t i c a l model we have to specify the time when each job enters the system and the job's execution time. The usual method of spec i f y i n g t h i s information is by p r o b a b i l i t y d i s t r i b u t i o n s f o r both job a r r i val and execution times rather than giving actual sequences of job a r r i v a l s and execution times. For example, we may assume that at the end of each quantum a new job a r r i v e s with p r o b a b i l i t y
$\lambda Q$. This gives a job arrival distribution which is a special case of the discrete Bernoulli or binomial distribution. We might also assume that a job's execution time is an exact multiple of Q, nQ, and is chosen independently from a geometric distribution,

$$ s_n = (1 - \sigma)\,\sigma^{n-1}, \qquad n = 1, 2, 3, \ldots, \qquad 0 \le \sigma < 1 $$

where $s_n$ is the probability that a job's execution time is exactly n quanta, i.e., nQ seconds. In section 3.2 we will explore the performance
of the first-come-first-serve model with these probability distributions.

A slightly more complicated model is based on the round-robin scheduling policy sometimes used in time-sharing systems. In this model a new job entering the system is put at the end of the queue and the processor is always allocated to the job at the head of the queue. However, when the processor is allocated to a job, the job executes for exactly one quantum, Q seconds. At the end of the quantum, if the job has completed its execution it leaves the system; otherwise it is returned to the end of the queue (see figure 2.2). The processor is then allocated to the job now at the head of the queue. Since a job's execution time is exactly nQ seconds, it will be put on the queue exactly n times before it has completed its execution. The same distributions for arrival and execution times may be assumed for this model as were assumed for the first-come-first-serve model.

Figure 2.2 Round-robin model
(new job enters system -> queue -> processor -> completed job leaves system; partially completed job returns to queue)

Kleinrock [4] has studied models based on these as well as other scheduling policies, particularly policies involving priorities.
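The two queueing models above can also be explored experimentally. The following is a minimal sketch, not the analysis of section 3.2: it draws Bernoulli arrivals and geometric execution times as described and measures the mean time a job spends in the system under each policy. The function names, the parameter values, and the rule that jobs arriving during a quantum join the queue ahead of the preempted job are assumptions made for illustration. Time is measured in quanta.

```python
import random
from collections import deque


def make_jobs(n_quanta, arrival_prob, sigma, rng):
    """Return a list of (arrival quantum, execution time in quanta)."""
    jobs = []
    for t in range(n_quanta):
        if rng.random() < arrival_prob:      # Bernoulli arrival at a quantum boundary
            n = 1
            while rng.random() < sigma:      # geometric: P(n) = (1 - sigma) * sigma**(n - 1)
                n += 1
            jobs.append((t, n))
    return jobs


def mean_response(jobs, round_robin):
    """Mean time in system (in quanta) under FCFS or round-robin scheduling."""
    pending = deque(jobs)    # jobs that have not arrived yet, in arrival order
    queue = deque()          # entries are [arrival time, remaining quanta]
    t, total, done = 0, 0, 0

    def admit():
        while pending and pending[0][0] <= t:
            arrival, n = pending.popleft()
            queue.append([arrival, n])

    while pending or queue:
        admit()
        if not queue:
            t = pending[0][0]                # processor idles until the next arrival
            continue
        job = queue.popleft()
        run = 1 if round_robin else job[1]   # one quantum, or run to completion
        t += run
        job[1] -= run
        admit()                              # assumption: new arrivals queue first
        if job[1] == 0:
            total += t - job[0]
            done += 1
        else:
            queue.append(job)                # partially completed job returns to queue
    return total / done if done else 0.0


rng = random.Random(1)
jobs = make_jobs(n_quanta=10000, arrival_prob=0.2, sigma=0.5, rng=rng)
print("FCFS       :", mean_response(jobs, round_robin=False))
print("round-robin:", mean_response(jobs, round_robin=True))
```

With these assumed parameters the mean execution time is 1/(1 - 0.5) = 2 quanta, so the processor is busy about 40 percent of the time and both queues stay stable.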
Estrin and Kleinrock [5] have surveyed the results of analyzing a number of different models. Analytical models have been used to model many different aspects of a system's operation, such as central processor scheduling, disk scheduling, memory partitioning, paging, and file organization. Since resource management usually requires the use of queues, many analytical models require the use of queueing theory in their analysis. Several interesting studies of analytical models appear in [6] and [7]. Good bibliographies appear in [8] and especially in [1].

2.1.2. DIRECTED GRAPH MODELS

One of the simplest models of a program is a directed graph, which is basically a flowchart of the program in which some of the detail has been suppressed and some additional information has been added. A directed graph is a set of nodes and directed arcs. Each arc in the graph originates at a node and terminates at a node, possibly the same node. More than one arc may originate or terminate at a single node. For example, figure 2.3 shows a directed graph consisting of five nodes (circles) and seven arcs (lines with arrowheads).
Figure 2.3 Directed graph
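A directed graph of this kind is conveniently held as a Boolean matrix. The sketch below is illustrative only: the arcs of figure 2.3 are not recoverable here, so it uses a hypothetical graph with five nodes and seven arcs (one of them a self-loop). It builds the Boolean adjacency matrix, finds the branch points, and computes reachability with Warshall's transitive closure.

```python
# Hypothetical five-node, seven-arc graph; NOT the graph of figure 2.3.
N_NODES = 5
ARCS = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 1), (3, 4), (4, 4)]


def adjacency(n, arcs):
    """Boolean matrix m with m[i][j] true when an arc runs from node i to node j."""
    m = [[False] * n for _ in range(n)]
    for origin, destination in arcs:
        m[origin][destination] = True
    return m


def branch_points(m):
    """Nodes with more than one arc originating at them."""
    return [i for i, row in enumerate(m) if sum(row) > 1]


def transitive_closure(m):
    """Warshall's algorithm: r[i][j] is true when some path leads from i to j."""
    n = len(m)
    r = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            if r[i][k]:
                for j in range(n):
                    r[i][j] = r[i][j] or r[k][j]
    return r


matrix = adjacency(N_NODES, ARCS)
reach = transitive_closure(matrix)
print("branch points:", branch_points(matrix))
print("nodes reachable from node 0:", [j for j in range(N_NODES) if reach[0][j]])
```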
In modeling a program with a directed graph the arcs represent the paths of possible control flow. Branch points are represented by nodes with more than one arc originating at the node. Computation or other processing may be associated with either the nodes or arcs depending on the particular model. Additional information may be associated with the nodes and arcs; for example, the probability that control exits from a branch point along a given arc is often associated with that arc. As an example consider the following program fragment,

    IF X<5 THEN W=X+2
           ELSE W=6-X;
    DO I=1 TO N;
       A(I)=B(I)*W;
    END;
Figure 2.4 shows its flowchart. To construct its graph model let us make the following assumptions. N is always equal to 9. X is less than 5 half of the time. The instructions for addition, subtraction, comparison, load, store, and conditional transfer each take one time unit. Multiplication takes three time units. The index I is kept in a register and thus the subscripting does not take any additional time.

Figure 2.4 Flowchart of program fragment

Using these assumptions we can make the following assignment of execution times to the boxes in the flowchart,

    flowchart box       execution time
    X < 5                     3
    W = X+2                   3
    W = 6-X                   3
    I = 1                     1
    I > N                     2
    A(I) = B(I)*W             5
    I = I+1                   1

The box X < 5 takes three time units, one to load X into a register, one to make the comparison with 5, and one to execute the conditional transfer. The boxes W = X+2 and W = 6-X take three time units, one to load one operand, one to do the addition or subtraction, and one to store the result. The box I = 1 takes only one time unit since all that is necessary is to load 1 into the register used for I. The comparison I > N only takes two time units because I is already in a register. The box A(I) = B(I)*W takes five time units, one to load B(I) (subscripting is free), three to multiply by W, and one to store the result. The box I = I+1 takes only one time unit for adding 1 since I is already in a register and the result is to be left in the register.

Using these execution times we can construct the directed graph model shown in figure 2.5.

Figure 2.5 Directed graph model

In this model all execution time is associated with the arcs of the graph. The nodes are junction, branch, or separation points. Each arc has two numbers associated with it: the probability that control will exit from that arc's origin node along the arc, which is written with a decimal point, and the execution time for the branch, which is written without a decimal point. Notice that the execution time for a decision box in the flowchart has been associated with the arc which terminates on the corresponding node in the graph. Thus, the execution time for the flowchart box X < 5 is associated with arc (1,2) while the branching in this flowchart box is represented by node 2, which is the origin for the two arcs (2,3) which correspond to the two flowchart boxes W = X+2 and W = 6-X.

The model in the preceding paragraph is adequate for very simple programs, but needs to be extended in order to model some of the more common program constructions.
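Before the model is extended, it may help to see how the simple model yields a number. The sketch below computes the expected execution time of the program fragment from a graph whose arcs carry (probability, time) pairs. The node numbering beyond nodes 1 to 3, the exit node 7, and the zero-time exit arc are assumptions, since figure 2.5 is not recoverable here; the times and the probabilities 0.5 and 0.9 follow from the assumptions stated in the text (N = 9 gives ten loop tests, nine of which continue).

```python
# Arcs as (origin, destination, probability, time). Node numbers 4-7 and the
# zero-time exit arc are assumptions made for this sketch.
ARCS = [
    (1, 2, 1.0, 3),   # X < 5 (load, compare, conditional transfer)
    (2, 3, 0.5, 3),   # W = X+2
    (2, 3, 0.5, 3),   # W = 6-X
    (3, 4, 1.0, 1),   # I = 1
    (4, 5, 1.0, 2),   # test I > N
    (5, 6, 0.9, 5),   # A(I) = B(I)*W, taken 9 times out of 10 tests
    (6, 4, 1.0, 1),   # I = I+1
    (5, 7, 0.1, 0),   # leave the loop
]


def expected_time(arcs, start, exit_node, sweeps=2000):
    """Expected time from `start` to `exit_node`.

    Solves T(n) = sum over arcs leaving n of p * (t + T(destination)),
    with T(exit_node) = 0, by repeated substitution.
    """
    nodes = {a[0] for a in arcs} | {a[1] for a in arcs}
    remaining = {n: 0.0 for n in nodes}
    for _ in range(sweeps):
        for n in nodes:
            if n != exit_node:
                remaining[n] = sum(p * (t + remaining[d])
                                   for o, d, p, t in arcs if o == n)
    return remaining[start]


print(expected_time(ARCS, start=1, exit_node=7))
```

The result agrees with a direct count: 3 + 3 + 1 time units before the loop, ten tests at 2 units each, and nine passes through the body and increment at 6 units each, for 81 time units in all.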
The first situation in which the model is inadequate is when N is not constant. If the variation in N is small compared to its size, the model will probably be valid if the mean value of N is used to calculate the branch probabilities. However, if the variance of N from its mean value is high, some modification of the model is required in order to obtain a valid model. One way of achieving this is to leave the variable N in the model, for example by labeling the arc for the loop body with the pair

    P  5

where

$$ P = \frac{N}{N+1} $$

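A short check of this label, offered as a sketch and assuming that each execution of the test I > N is treated as an independent branch which continues into the loop body with probability P: the number of body executions k then has probability $P^{k}(1-P)$, so

$$ E[k] = \sum_{k \ge 0} k\,P^{k}(1-P) = \frac{P}{1-P}, \qquad \text{and with } P = \frac{N}{N+1}: \quad \frac{P}{1-P} = \frac{N/(N+1)}{1/(N+1)} = N . $$

For N = 9 this gives P = 0.9. The mean number of iterations is therefore matched exactly, although under this assumption the modeled iteration count is geometrically distributed rather than fixed.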
A similar problem arises in connection with branch points in general. Another strategy for attacking the same problem is to associate a random variable with each arc and define its value as some particular probability distribution.

Another problem occurs when a computation box in the flowchart is a subroutine call.
Usually a subroutine does not have a fixed execution time; rather, the time is a function of its input arguments. Again two strategies are suggested. The actual function which determines the execution time can be associated with the appropriate arc. Alternately, the execution time can be defined by a random variable. Beizer [9] proposes a model in which the execution time is given by a mean value and its variance. In his model a subroutine or function call would be modeled by an arc such as,

$$ (\mu, \sigma^2) $$

where $\mu$ is the mean execution time and $\sigma^2$ is its variance.
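To illustrate how arcs labeled with a mean and a variance might be combined, here is a minimal sketch. It is not taken from Beizer [9]; it uses only the standard rules for independent random variables, and the function names `series` and `branch` are hypothetical. Arcs traversed one after another add their means and variances; at a branch point, where exactly one arc is taken, the result is the mean and variance of the mixture.

```python
def series(arcs):
    """Arcs traversed in sequence; arcs is a list of (mean, variance).

    Assumes the execution times are independent, so means and variances add.
    """
    return sum(m for m, _ in arcs), sum(v for _, v in arcs)


def branch(choices):
    """Exactly one arc is taken; choices is a list of (probability, mean, variance)."""
    mean = sum(p * m for p, m, _ in choices)
    second_moment = sum(p * (v + m * m) for p, m, v in choices)
    return mean, second_moment - mean * mean


# Two equally likely arcs with fixed times 3 and 5, followed by a call
# whose time has mean 10 and variance 4 (illustrative numbers only).
m, v = branch([(0.5, 3, 0), (0.5, 5, 0)])
print(series([(m, v), (10, 4)]))
```

Note that a branch between two arcs with fixed but different times already produces a nonzero variance, which is one reason the extended model is harder to analyze than the simple one.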
Both of the extensions to the simple graph model which are suggested in the preceding paragraphs make analysis of the model more difficult. However, these difficulties cannot be avoided if we wish our model to be valid enough that analysis will provide reliable information about the performance of the system. We will discuss these difficulties in a later section when we consider how our model can be used for performance prediction.

A directed graph is conveniently represented by a Boolean matrix. The properties of directed graphs and their manipulation in Boolean matrix form have been studied [10]. Directed graph models of programs are useful for many other purposes in addition to performance prediction. As a result, many variations of the basic model exist. For example, Lowe [11] defines a model which contains additional nodes, of a different type, corresponding to disjoint data sets and additional arcs which represent data references. Graph models of programs have long been used by compilers for optimization of object code [12,13]. More recently graph models have been used for automatic program segmentation [11] and performance measurement [14]. A simple graph model can easily be constructed directly from the source language program [14]. The construction of a complete, detailed model is straightforward when it is part of a compiler for the source language [13].

2.1.3. SIMULATION MODELS

The most important kind of model is a simulation model. It is the most general and flexible of all the different kinds of models. Practically any kind of information can be included in such a model. Further, such a model can be constructed at any level, that is, as much detail as desired can be included in the model. Furthermore, concurrency [see Dennis C] is easily modeled with simulation models, whereas it is difficult or impossible using analytical models and many graph models, although some graph models are specially designed for modeling concurrency [15].

There are a large number of different kinds of simulation models, just as there are a large number of different simulators. Since a simulator is required to interpret a simulation model, the form of model to be used is determined by the simulator. For example, one simulator uses a model which is similar to the directed graph model used as an example in the preceding section [16]. There are a number of simulators which require the model to be described in a special model description language. Some of these simulators are described in later sections where simulation and simulation models are discussed in considerable detail.

Logical models in general reflect fairly directly the structure of the system. There are several different ways to express this structure.
The directed graph model which was discussed earlier expresses structure by directly representing the branch points in the program. Another way of representing the structure is by modeling the flow of the entities with which the system deals, such as jobs and input-output requests. In a model of this type the structure of the system is less explicit than it was in the directed graph model. This entity flow type of model is most frequently used in simulation. In the remainder of this section we will describe a model of this type for a rather simple system. The model and its use for performance prediction will be discussed in detail in section 4, which deals with simulation.

The system which we will model is a non-interactive, multiprogramming system and is due to MacDougall [17].
The hardware in the system consists of a central processor, central memory, and a movable head disk. For this example we will not consider the effects of any peripheral devices such as the card reader. Jobs are entered into the system whenever they are submitted to the computation center. As soon as sufficient central memory space is available the job is loaded for execution (we ignore the loading time in this example). All of the loaded jobs compete with each other for use of the central processor. Whenever a job makes a disk input or output request it gives up the central processor. When a job finishes execution it gives up the central memory space which it has been allocated. Since there may be more than one job in the system at a time, it is possible that a job requests the use of a resource which is not currently available. Thus, a queue must be maintained for each resource. The resources the system has are central memory space, the central processor, and the disk. Whenever a job makes a request for one of these resources and the resource is already in use or, for central memory, there is not enough resource remaining to satisfy the request, the job is put on the appropriate queue. A job may be on only one queue at a time and does not execute when it is on a queue.

Briefly, the system functions as follows. When a job first enters the system a request is made for central memory space into which to load the job. If sufficient central memory space is not available the job is put on the central memory queue. Otherwise the job is loaded and a request for the central processor is made. If the central processor is not free the job is put on the central processor queue. Otherwise, the job begins execution. Whenever a job in execution makes a disk request several things happen. If the disk is free the requested disk input or output is started. Otherwise, the job is put on the disk queue. In either case the requesting job gives up the central processor. If the central processor queue is not empty the central processor is allocated to the job at the head of the queue. This job then resumes (or begins) execution. If the central processor queue is empty, the central processor is left idle until a request is made for its use. When a disk input or output request has been completed the job which made the request is ready to resume execution. A request is made for the central processor. If this request can be satisfied, the job resumes execution. Otherwise, it is put on the central processor queue. If, upon completion of a disk input or output request, the disk queue is not empty the input or output requested by the job at the head of the queue is started. When a job completes execution, the processor is allocated to the job at the head of the processor queue if the queue is not empty. The central memory space allocated to the terminating job is given up. If the central memory queue is not empty then central memory space is allocated to the job at the head of the queue if there is now sufficient space to satisfy its request.

Our model for this system consists of a characterization of the flow of a job through the system. The job is the single entity which appears in the model. The flow of a job through the system is expressed by the flow diagram in figure 2.6. Each job which enters the system follows a path through this diagram until its execution is completed, at which time it leaves the system. Although the diagram is not exactly a flowchart of the system it is very close to it. Thus, the model closely reflects the structure of the system.

To use the model we must specify the relevant properties of the jobs which enter the system. We do this by specifying distribution functions just as we did for our example analytical models. There are five relevant job characteristics: job interarrival time, central memory requirement, central processor time requirement, I-O interrequest time, and I-O record length. The job interarrival time is the interval between arrival of successive jobs.

Figure 2.6 Job flow in the system
(Flow diagram: request central memory, load job, request central processor, execute, request disk, process disk input or output, release disk, release central processor, release central memory; with the central memory, central processor, and disk queues.)

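The job flow described above can be sketched as a small event-driven simulation. This is an illustration under stated assumptions, not MacDougall's model [17] nor the simulator of section 4: the class and function names are hypothetical, every distribution is assumed exponential (memory requirement uniform), the mean values are arbitrary, and the memory queue is served strictly first-come-first-served, loading as many jobs from its head as fit.

```python
import heapq
import random
from collections import deque


class Job:
    def __init__(self, arrival, memory, cpu_time):
        self.arrival = arrival        # time the job enters the system
        self.memory = memory          # central memory requirement
        self.cpu_left = cpu_time      # remaining central processor time


class System:
    def __init__(self, memory_size, mean_io_gap, mean_disk_time, rng):
        self.free_memory = memory_size
        self.mean_io_gap = mean_io_gap          # mean I-O interrequest time
        self.mean_disk_time = mean_disk_time    # mean disk service time
        self.rng = rng
        self.now = 0.0
        self.events = []                        # heap of (time, sequence, kind, job)
        self.seq = 0
        self.cpu_busy = self.disk_busy = False
        self.memory_q, self.cpu_q, self.disk_q = deque(), deque(), deque()
        self.response_times = []

    def schedule(self, delay, kind, job):
        heapq.heappush(self.events, (self.now + delay, self.seq, kind, job))
        self.seq += 1

    def admit(self):
        # Assumption: load jobs from the head of the memory queue while they fit.
        while self.memory_q and self.memory_q[0].memory <= self.free_memory:
            job = self.memory_q.popleft()
            self.free_memory -= job.memory
            self.request_cpu(job)

    def request_cpu(self, job):
        if self.cpu_busy:
            self.cpu_q.append(job)
            return
        self.cpu_busy = True
        # Execute until the next disk request or until the job is finished.
        burst = min(job.cpu_left, self.rng.expovariate(1.0 / self.mean_io_gap))
        job.cpu_left -= burst
        self.schedule(burst, "cpu_done", job)

    def request_disk(self, job):
        if self.disk_busy:
            self.disk_q.append(job)
            return
        self.disk_busy = True
        self.schedule(self.rng.expovariate(1.0 / self.mean_disk_time), "disk_done", job)

    def handle(self, kind, job):
        if kind == "arrive":
            self.memory_q.append(job)
            self.admit()
        elif kind == "cpu_done":
            self.cpu_busy = False               # job gives up the central processor
            finished = job.cpu_left <= 0.0
            if not finished:
                self.request_disk(job)
            if self.cpu_q:
                self.request_cpu(self.cpu_q.popleft())
            if finished:
                self.free_memory += job.memory  # release central memory
                self.response_times.append(self.now - job.arrival)
                self.admit()
        elif kind == "disk_done":
            self.disk_busy = False              # release disk
            self.request_cpu(job)
            if self.disk_q:
                self.request_disk(self.disk_q.popleft())

    def run(self):
        while self.events:
            self.now, _, kind, job = heapq.heappop(self.events)
            self.handle(kind, job)


def simulate(n_jobs=200, memory_size=100, seed=1):
    rng = random.Random(seed)
    system = System(memory_size, mean_io_gap=1.0, mean_disk_time=0.5, rng=rng)
    t = 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(1.0 / 10.0)        # job interarrival time
        system.schedule(t, "arrive", Job(t, rng.randint(10, 40), rng.expovariate(1.0 / 5.0)))
    system.run()
    return system


result = simulate()
print("mean time in system:", sum(result.response_times) / len(result.response_times))
```

Each handler corresponds to one region of the flow diagram, which is the sense in which an entity flow model reflects the structure of the system. Changing the assumed distributions changes the class of input jobs without touching the model of the system itself.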
The I-O interrequest time is a distribution which specifies the length of time a job executes, whenever it gets the processor, until it makes a disk input or output request. The central processor time requirement and the I-O interrequest distribution determine the number of I-O requests which the job will make. The I-O record length is a distribution which specifies the amount of time that the disk will be busy servicing an input or output request.

The model of the system is completely specified by the flow diagram. In order to use it in simulation it must be expressed in the manner required by the particular simulator being used. The distributions for the five relevant job characteristics specify a particular class of input jobs. These must also be expressed as required by the simulator. We will examine one of the model specification languages which is used by a particular simulation system in section 4. In that section we will study simulation in more detail, including the use of the preceding example for performance prediction.

2.2. PROBLEMS IN MODELING

A number of problems always arise whenever one attempts to model a system.
The most significant problem is that of the validity of the model. A model of a system is an abstraction of the system in which many details of the system's structure have been omitted or, in the case of an analytical model, a set of equations which express all of the significant relations between the variables of the system. The model is basically a simplified version of the system. In the process of deriving the model from the system some significant relations may have been omitted from the model. If this happens the model is not valid, that is, the behavior of the model for a given input will not match the behavior of the real system within reasonable limits. It should be clear that an invalid model is relatively useless. The problem of validity is probably the most difficult and certainly the most serious problem in modeling, especially for performance prediction. When measuring performance the validity of the model can be tested by comparing the behavior of the model with the behavior of the real system. If they disagree beyond acceptable limits, the model is modified until its behavior agrees with the real system. In the case of performance prediction this is not possible. Since the designer is trying to predict the performance of a system design before he implements that design, there is no way to compare the model's behavior with the behavior of the unimplemented "real" system. We will return to the problem of validity in section 5.

One way of solving the problem of validity is to include more detail in the model. However, this leads to another problem, the inclusion of a large number of irrelevant variables and relations. This problem is not as serious as an invalid model; nonetheless, it may have serious consequences. A model which includes many irrelevant variables and relations often becomes unmanageable. Analysis of such a model becomes difficult and simulation is time consuming and inefficient. It is difficult for the designer to understand the behavior of the system because the significant relations get lost among the irrelevant ones. It is possible to have more than one model for the same system, each different model being used for a different purpose. The level of detail in these models would be different. The ideal model is one which has just enough detail for its purpose, and no irrelevant variables and relationships. The level of detail may also vary from part to part in the same model. For example, the model used to get a rough indication of the gross behavior of the system may be quite simple and include only a few variables and relations. On the other hand a model used to analyze the performance of a particular disk unit for a particular file storage allocation algorithm would have to be fairly detailed. Such a model would probably contain a moderate number of variables and relations in order to reflect such things as the sequence of positions of the disk's read-write head, the sizes of the files on the disk, and the distribution of the records on the disk.

There are several problems which are unique, or especially severe, with analytical models.
The most obvious problem is that the equations which express the relations between the system variables may be extremely difficult or impossible to solve, that is, the analyst is unable to derive any closed form expressions for the performance parameters. In this case the advantage of the analytical model over logical models is lost. Also, for a complex system, the relations between the system variables may not even be expressible as mathematical equations. Another difficulty with analytical models is that usually the level of detail in the model cannot be changed without constructing an entirely new model. This is generally not true for logical models. Since a logical model is a fairly direct reflection of the system's structure it is usually possible to change the level of detail of the model or any part of it by techniques analogous to the system design techniques which are based on hierarchical structure and levels of abstract machines [see Dennis A, Goos A, Waite, and Poole A].
Logical models also have the advantage that it is relatively straightforward to build a model of a system by combining models of its subsystems or component parts. This process of combination is usually difficult or impossible with analytical models. In fact, it seems to be practically impossible to model a complex system in any reasonable detail with an analytical model. Analytical models are most useful in modeling some part of the system. The information obtained from the study of such a model can then be used in a logical model of the whole system. It is usually possible to capture a great deal more detail with a logical model than with an analytical model. This is especially useful in the earlier stages of performance prediction when it is still unknown what variables and relations in the system are really significant. There is no sharp dividing line between analytical and logical models. For example, an analytical model can be used for simulation rather than deriving a closed form solution. Likewise some logical models yield a closed form solution, at least for certain aspects of performance. In an analogous fashion, no single modeling technique is always the most useful. Although simulation modeling is the most versatile, the other kinds of modeling are usually useful in a complete analysis of a system's performance, giving information which is difficult or impossible to obtain from simulation.

3. USE OF MODELS IN PERFORMANCE PREDICTION

In this section and the next we will explore the use of models in performance prediction using the three models described in the preceding section as examples. Each different type of model will require a different technique for its use and will yield different kinds of information. As we have previously mentioned, each different technique has its place in a complete analysis of performance. Before considering the different techniques and examples, we should be aware of some problems which we will encounter when using any kind of model to predict performance.
3.1. PROBLEMS IN USING MODELS

The major problems in using models to predict performance are validity of the model, characterization of job or request properties, and interpretation of the results. The problem of the validity of a model was discussed in section 2.2. The reader should not underestimate the significance and difficulty of this problem. The significance of the problem lies in the fact that predictions based on an invalid model are virtually useless and do not give the designer any reliable information on the performance of the system he has designed. Constructing a valid model is difficult, especially for a large, complex operating system. In order to make the model tractable, considerable abstraction will have to take place during construction of the model. Since the designer does not usually have a very good understanding of the behavior of a new, complex system in terms of its variables and the relations between them, it is easy for significant relations to get omitted from the model. Since the proposed system design has not yet been implemented the model cannot be validated by comparison with actual operation of the system.

Characterization of the properties of the jobs or requests which will be submitted to the system is also a significant and difficult problem.
As we have noted earlier, the performance of any system is a function of certain properties of the input to the system, namely its resource demands. When using a model to predict performance, the model is applied to the sequence of resource demands which represent the system's input. The result is a measure of the predicted performance of the system for the given input. Assuming that the model is valid, the result of applying it to input other than that which will be given to the system in actual use may be interesting but is not apt to be relevant to the desired performance of the system. What the designer wants to know is how the proposed system will perform for the kind of input it will receive when it is actually used. The system's behavior with other input may be interesting, since it might give the designer some insight into the sensitivity of the system to unexpected input; however, it is not the primary reason for performance prediction.

It may be quite difficult to find a valid and usable characterization of the system's input. The significant properties of the input are usually the sequence of jobs (or requests) in the input and the sequence of resource demands made by each of these jobs. In the first place, the designer may have only a vague knowledge of the types of jobs which will be submitted to the system. He may know what kinds of applications the system will be used for, e.g., payroll or heat transfer computations. However, this knowledge needs to be translated into typical sequences of resource demands before it can be used with the model to predict performance. In fact, the input must be modeled, that is, the significant resource demands must be abstracted from the anticipated real jobs. In this modeling of the input we have to cope with most of the problems which have been discussed in connection with modeling of the system. In fact, for some simulators, models of the jobs input to the system are expressed in exactly the same way as the model of the system itself [16,18].

In any system where the user is able to write his own programs the problem of modeling the input is especially severe, principally because the system designer does not know what programs the user will write. Even knowledge of the class of problems the user will be solving is often of little help since there are many different ways of writing a program to solve a particular problem. Even if the designer knows exactly all of the programs which will be input to the system, the number of different programs is so large that it is usually impossible to explore the system's behavior for all possible combinations of programs in the input. For this reason the input is usually characterized as a small number of different mixes of several typical jobs. A typical job is a sequence of resource demands which is similar to the resource demand sequences of some class of real jobs. A typical job is an abstraction from a class of real jobs.
It can sometimes be deduced from the sequence of computation and data manipulation required to carry out the function which the job performs. For example, a master file update job will have to sequence through the records in two files, the master file and the file containing the update information. The computation performed between input or output operations is minimal. However, most jobs are not so simple and may be impossible to analyze. The usual attack in this case is to record the operational characteristics of a large set of jobs from a given class when they are executing in some other system. From this data it is usually possible to derive a valid model (typical job) of this job class.

Just as models of systems range from simple to complex, so do models of job classes.
The simplest model of a job class consists of a set of distributions, one pair for each resource. One distribution in the pair gives the pattern (frequency) of requests for the resource while the other distribution in the pair gives the amount of resource demanded by each request. In addition, it is assumed that these distributions are all independent.
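The simplest job class model just described can be sketched as a small data structure. The class name `JobClass`, the resource names, and the exponential distributions below are assumptions chosen for illustration. Each resource carries a pair of samplers, one for the time between requests and one for the amount demanded, and the samplers are drawn independently.

```python
import random


class JobClass:
    """Simplest job class model: one pair of independent distributions per resource."""

    def __init__(self, demands):
        # demands: {resource name: (interrequest_sampler, amount_sampler)},
        # where each sampler is a function of a random.Random instance.
        self.demands = demands

    def sample(self, rng, horizon):
        """Return a time-ordered list of (time, resource, amount) up to `horizon`."""
        sequence = []
        for resource, (gap, amount) in self.demands.items():
            t = gap(rng)
            while t <= horizon:
                sequence.append((t, resource, amount(rng)))
                t += gap(rng)
        sequence.sort()
        return sequence


# A hypothetical "typical job": frequent short processor bursts, rarer disk transfers.
typical = JobClass({
    "processor": (lambda r: r.expovariate(1.0), lambda r: r.expovariate(2.0)),
    "disk": (lambda r: r.expovariate(0.2), lambda r: r.expovariate(4.0)),
})

for time, resource, amount in typical.sample(random.Random(0), horizon=10.0):
    print(f"{time:7.3f}  {resource:9s}  {amount:.3f}")
```

A more complex job class model would replace the independent samplers with functions of the demands already made, or switch samplers from one phase of the job to the next.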
More complex models of job classes may allow some resource demands to be expressed as functions of prior demands for the same or other resources; for example, the amount of memory requested and the frequency of requests for memory may be a function of the amount of memory already requested. Even though time can be considered as a resource, the dependence of resource requests on time is so important that we will consider it as a separate aspect. Most distributions are a function of time. However, there is another way in which the resource demand sequence may depend on time. The distribution which models the frequency of requests for a resource or the magnitude of request for that resource may be different from time to time. For example, a particular typical job may be modeled by a sequence of frequent requests for a short amount of execution followed by a sequence of less frequent requests for a longer amount of execution. A single distribution (at least one of the common, simple distributions) may not validly model the total sequence of requests for execution, whereas two different distributions might be quite adequate as a model.

The third major problem in using a model for performance prediction has to do with interpretation of the results.
If the results of performance prediction indicate that the performance is not acceptable, the designer must modify his design until the design exhibits acceptable performance. Even if the prediction results show acceptable performance, the designer may still need to modify the design in order to improve its performance since he may be trying to achieve an optimal design. In order to improve his design the designer needs to know what part of his design to modify to achieve performance improvement. This requires some interpretation of the result of applying the model of the system to a typical job mix. It is not sufficient to simply observe the values of the performance since this information only tells the designer how good or bad the performance is compared to the minimum acceptable performance. The inner workings of the model as it reacts to the input have to be observed. It is only by examining the values of the system variables which are internal to the model and considering the relations which exist between these variables that the designer can locate the bottlenecks in his design and thus learn where the design can be improved. For example, observing the average length of the resource queues and the average time spent by a job in these queues will reveal any mismanagement of resources.

It was mentioned earlier that the use of different kinds of models may require different techniques depending on the particular model.
There
are basically two classes of techniques for the use of models, closed form solution and experimental. analytical models.
Closed form solution is most commonly used for
The set of equations which constitute an analytical model
are solved f o r the performance parameters.
This solution, which is i t s e l f
a set of equations, can then be plotted or further analyzed.
Since the
equations which constitute a solution are almost always functions of several variables, the graph of these equations is a family, or families, of curves. These curves usually display quite v i v i d l y the complete behavior of the system. Since by d e f i n i t i o n a logical model does not y i e l d a closed form solution, some other technique is required, even though parts of the model may be solved for closed form expressions.
The basic way of using such a model is to conduct a set of experiments, that is, the model is applied to a set of differing inputs. Each application of the model constitutes an experiment. The results of each experiment are recorded and the set of results from all of the experiments is later analyzed. Usually this analysis includes plotting the values of some or all of the observed variables (the performance parameters and system variables), just as the results of experiments in the physical sciences are plotted to depict the relations between variables. If enough experiments are conducted, the designer may be able to discover simple mathematical equations which are good approximations to the true relations between the system variables and performance parameters.

Simulation always involves conducting a set of experiments. Thus, it is the most versatile of all the types of models and is useful at any level of detail and complexity. Actually, almost any model, including analytical models, can be used for simulation. However, while some logical models can be analyzed to some degree, most logical models are suitable only for use in some form of simulation, that is, to use them for performance prediction a set of experiments must be conducted. The use of simulation models will be discussed and illustrated in section 4. In the remainder of section 3 we will discuss the use of an analytical model and a logical model upon which some analysis can be performed.

3.2. PREDICTION USING AN ANALYTICAL MODEL

As an example of prediction using an analytical model we will explore the analytical models described in section 2.1.1. Recall that the first-come-first-serve model is a simple, single queue model without feedback, where the queue discipline used is first-come-first-serve, while the round-robin model is the same except for the addition of feedback and limitation of execution time for a job on the processor to a single quantum. Strictly speaking, the distributions which characterize the job arrival and execution times are not part of the model, but part of the input description. However, most studies of analytical models seem to include these distributions as part of the model. In our example we assume that jobs arrive according to a (discrete) Bernoulli distribution with probability $\lambda Q$, where $Q$ is the length of a quantum (in seconds). We also assume that a job's execution time is chosen independently from a geometric distribution,

$$s_n = (1-\sigma)\,\sigma^{n-1}, \qquad n = 1, 2, 3, \ldots, \qquad 0 \le \sigma < 1,$$
where $s_n$ is the probability that a job's execution time is exactly $n$ quanta ($nQ$ seconds).

Kleinrock [4] derives the following results for these two models. In both models the expected number of jobs in the system at any given time is,

$$E = \frac{\rho\sigma}{1-\rho}, \qquad \text{where } \rho = \frac{\lambda Q}{1-\sigma}.$$

Since $\lambda$ is the average number of jobs arriving per second, $1/(1-\sigma)$ is the average number of quanta of execution required per job, and $Q$ is the number of seconds in a quantum, then $\rho$ is just the average number of seconds of execution time demanded per second by all of the jobs in the system. Clearly $\rho < 1$, otherwise the system overloads and never gets caught up. In fact, $E \to \infty$ as $\rho \to 1$.
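As a quick numerical check of this input description, the following sketch confirms that the geometric probabilities $s_n$ sum to one, that the mean job length is $1/(1-\sigma)$ quanta, and evaluates the load $\rho = \lambda Q/(1-\sigma)$. The function names and the parameter values are our own, chosen only for illustration.

```python
def s(n: int, sigma: float) -> float:
    """Probability that a job needs exactly n quanta (geometric distribution)."""
    return (1.0 - sigma) * sigma ** (n - 1)


def load(lam: float, Q: float, sigma: float) -> float:
    """rho: seconds of execution demanded per second of real time."""
    return lam * Q / (1.0 - sigma)


# Illustrative values only (assumptions, not taken from the text).
sigma, lam, Q = 0.5, 0.1, 1.0

# Truncate the infinite sums; the neglected tail is of order sigma**200.
ns = range(1, 201)
total_probability = sum(s(n, sigma) for n in ns)
mean_quanta = sum(n * s(n, sigma) for n in ns)
rho = load(lam, Q, sigma)

print(total_probability, mean_quanta, rho)
```

With these values the mean job length should come out at two quanta and the load well below one, so the system is far from overload.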
For the first-come-first-serve (F) model the response time is given by,

$$R_F(n) = \frac{QE}{1-\sigma} + nQ \tag{3.1}$$

$R_F(n)$ is the total time that a job, which requires $n$ quanta of time for execution, spends in the system. Its execution time is $nQ$ seconds and it spends $QE/(1-\sigma)$ seconds in the queue. For the round-robin (R) model the response time is given by,

$$R_R(n) = \frac{nQ}{1-\rho} - \frac{\lambda Q^2}{1-\rho}\left[1 + \frac{\bigl(1-\sigma(\sigma+\lambda Q)\bigr)\bigl(1-(\sigma+\lambda Q)^{n-1}\bigr)}{(1-\sigma)^2(1-\rho)}\right]$$

Kleinrock has found that a good approximation to $R_R(n)$ is,

$$R_R(n) \approx nQE + nQ \tag{3.2}$$

Thus, in the round-robin model a job which requires $n$ quanta of execution spends $nQE$ seconds in the queue.

Let us look more closely at the response time. Notice first that both $R_F(n)$ and $R_R(n)$ are linear in $n$, since all of $Q$, $\sigma$, and $\lambda$ are constant. Rewriting equations 3.1 and 3.2, we have,

$$\frac{1}{Q}R_F(n) = n + \frac{E}{1-\sigma}$$

$$\frac{1}{Q}R_R(n) = (E+1)\,n$$

We drop the constant factor $1/Q$ which occurs in both relations and plot the response time for the two models as a function of $n$ in figure 3.1. In the graph the crossover point, $n_a$, for the two functions is obtained by equating them and solving for $n$,

$$n_a + \frac{E}{1-\sigma} = (E+1)\,n_a$$

$$n_a = \frac{1}{1-\sigma}$$

[Figure 3.1 Response time as a function of execution time: $R_F$ and $R_R$ plotted against $n$, crossing at $n = n_a$]
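The comparison of the two policies can be reproduced numerically. The sketch below assumes Kleinrock's expression $E = \rho\sigma/(1-\rho)$ with $\rho = \lambda Q/(1-\sigma)$, uses equation 3.1 for first-come-first-serve and the approximation 3.2 for round-robin, and picks an arbitrary arrival rate. All function names are ours.

```python
def expected_jobs(lam: float, Q: float, sigma: float) -> float:
    """E = rho*sigma/(1 - rho), with rho = lam*Q/(1 - sigma) (assumed form)."""
    rho = lam * Q / (1.0 - sigma)
    if not 0.0 <= rho < 1.0:
        raise ValueError("system is overloaded (rho >= 1)")
    return rho * sigma / (1.0 - rho)


def response_fcfs(n: float, lam: float, Q: float, sigma: float) -> float:
    """Equation (3.1): queueing time plus execution time."""
    E = expected_jobs(lam, Q, sigma)
    return Q * E / (1.0 - sigma) + n * Q


def response_rr_approx(n: float, lam: float, Q: float, sigma: float) -> float:
    """Equation (3.2): Kleinrock's approximation for round-robin."""
    E = expected_jobs(lam, Q, sigma)
    return n * Q * E + n * Q


def crossover(sigma: float) -> float:
    """n_a = 1/(1 - sigma), where the two response times coincide."""
    return 1.0 / (1.0 - sigma)


# Illustrative parameters (assumptions): sigma as in the text, lam arbitrary.
lam, Q, sigma = 0.5, 1.0, 0.1
n_a = crossover(sigma)
for n in (1, 2, 5):
    print(n, response_fcfs(n, lam, Q, sigma), response_rr_approx(n, lam, Q, sigma))
```

For jobs shorter than `n_a` quanta the round-robin figure is the smaller of the two; for longer jobs the order reverses.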
The crossover point is the place where the first-come-first-serve scheduling policy begins to give a shorter response time than the round-robin. In other words, if the execution time of a job is less than $n_a$ quanta then its response time is shorter if a round-robin scheduling policy is used. Another way of looking at this is to say that a round-robin scheduler gives better service to short jobs, which is desirable in most time-sharing systems. Consider the case where $\sigma = 0.1$, then $s_1 = (1-\sigma) = 0.9$, that is, the probability that the execution time of a job is one quantum long is 0.9. The crossover point is $n_a = 1.1$. Thus, those jobs whose execution time is one quantum (about 90% of the jobs) get better service when a round-robin scheduling policy is used.

We can also examine the behavior of these two models as the system approaches overload conditions, i.e., as $\rho \to 1$. We will look at the amount of time a job spends in the queue, which is its delay time. The delay time is the response time minus the execution time. In [4] Kleinrock plots,

$$kD_F(n) = k\,(R_F(n) - nQ)$$

$$kD_R(n) = k\,(R_R(n) - nQ)$$

where $k = (1-\sigma)/(\sigma Q)$, rather than the true delay time. He also uses the true formula for $R_R(n)$ rather than the approximation since the approximation is quite bad as $\sigma \to 0$. Under the normalization factor $k$, $kD_F(n)$ is a function only of $\rho$. However, $kD_R(n)$ remains a function of $n$ and $\sigma$ as well as $\rho$. In figure 3.2 $kD_R(n)$ is plotted for two values of $\sigma$. In each case we get a family of curves, one for each value of $n$, and several members of the family are shown. The curve for $kD_F(n)$, which is the same for all values of $n$ and $\sigma$, is plotted in each of the two graphs with small circles.
There are three significant aspects of the system's performance which can be seen from the graphs. The service deteriorates as the system approaches overload conditions. That is, the more efficiently the processor is used, the longer the delay time. It is also clear that the rate at which service deteriorates gets larger as the system approaches overload. Finally, if a round-robin scheduling policy is used, the service deteriorates at a faster rate for jobs with longer execution times. This deterioration is particularly severe for small values of $\sigma$, i.e., when the input to the system contains a large percentage of short jobs.

The preceding analysis has derived practically all there is to know about the two models. We have seen how the response time varies with the execution time of the job. The actual response time depends on the values of $\lambda$, $\sigma$, and $Q$; however, for given values of these parameters it varies linearly with respect to job execution time. We also saw how the service deteriorates as the system approaches overload conditions. Both of the models studied are extremely simple, yet they include several variables and the mathematics required to solve them is not trivial. When predicting the performance of a system of any complexity use of either of these models will not give a complete and accurate picture of the system's performance. This is not to say that these models are useless. If the system follows a first-come-first-serve or round-robin scheduling policy, then using the appropriate one of these two models will give some broad indication of the system's performance, an upper bound to the best possible performance.

[Figure 3.2 Delay time as a function of system load: two panels, one for each value of $\sigma$, plotting $kD_R(n)$ against $\rho$ for several values of $n$, with $kD_F$ marked by small circles]

These models are inadequate for precise performance prediction because they are too simple. Many significant system variables and relations have been omitted from these models. For example, any system, except the most trivial, will have more than the single queue which is included in the above models. We cannot expect that any model which omits all of these other queues will yield completely valid, detailed performance information. The movement of a job in and out of at least some of these queues (e.g., queues for input or output requests) will certainly have a noticeable effect on the job's response time. Multiple queue models have been formulated, but they are extremely difficult to solve.
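The deterioration of service near overload can be tabulated directly from the formulas of this section. The sketch below takes the response-time expressions as given above, with $E = \rho\sigma/(1-\rho)$ assumed, and chooses $\lambda$ so as to produce a requested load $\rho$. The function name and the sample values of $\sigma$, $n$, and $\rho$ are our own.

```python
def normalized_delays(n: int, rho: float, sigma: float, Q: float = 1.0):
    """Return (k*D_F, k*D_R(n)) for a given load rho (requires 0 < sigma < 1)."""
    lam = rho * (1.0 - sigma) / Q          # arrival rate that yields this load
    E = rho * sigma / (1.0 - rho)          # assumed expected number of jobs
    k = (1.0 - sigma) / (sigma * Q)        # Kleinrock's normalization factor
    alpha = sigma + lam * Q
    # First-come-first-serve response time, equation (3.1).
    r_f = Q * E / (1.0 - sigma) + n * Q
    # Round-robin response time, the full formula rather than (3.2).
    bracket = 1.0 + (1.0 - sigma * alpha) * (1.0 - alpha ** (n - 1)) / (
        (1.0 - sigma) ** 2 * (1.0 - rho)
    )
    r_r = n * Q / (1.0 - rho) - lam * Q ** 2 / (1.0 - rho) * bracket
    # Delay is response time minus execution time.
    return k * (r_f - n * Q), k * (r_r - n * Q)


for rho in (0.2, 0.6, 0.9):
    k_df, _ = normalized_delays(1, rho, sigma=0.1)
    k_dr = [round(normalized_delays(n, rho, sigma=0.1)[1], 2) for n in (1, 2, 5)]
    print(rho, round(k_df, 2), k_dr)
```

Each printed row gives the load, the first-come-first-serve value (which does not depend on $n$), and the round-robin values for $n = 1, 2, 5$; both grow without bound as the load approaches one.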
3.3. PREDICTION USING A DIRECTED GRAPH MODEL

In this section we will analyze the directed graph model described in section 2.1.2 (figure 2.5). Our strategy will be to successively apply elementary transformations to the graph in order to reduce it as much as possible. Each elementary transformation will reduce the complexity and/or the size of the graph. The reduced graph which results will be equivalent to the original graph. Since we are interested only in performance, this equivalence will be equivalence of execution, but not usually equivalence of structure.

Beizer [9] defines three elementary transformations: series, parallel, and loop. The series transformation is applicable to a pair of arcs in series, i.e., the terminal node of one arc is the origin node of the other arc. The pair of arcs and the node between them can be replaced by a single arc provided no other arcs terminate or originate at the interior node. Figure 3.3 illustrates this replacement. Recall that the two numbers attached to an arc $(i,k)$ are the probability, $P_{ik}$, that control leaves the origin node, $i$, along the arc and the execution time, $\mu_{ik}$, associated with that arc.

[Figure 3.3 Simple series transformation: arcs $(i,k)$ and $(k,j)$, labeled $P_{ik}, \mu_{ik}$ and $P_{kj}, \mu_{kj}$, can be replaced by a single arc $(i,j)$ labeled $P_{ij}, \mu_{ij}$]

In the series reduction illustrated above, arcs $(i,k)$ and $(k,j)$ and node $k$ are replaced by a new arc $(i,j)$. The probability and execution time for this new arc are,

$$P_{ij} = P_{ik}P_{kj}$$

$$\mu_{ij} = \mu_{ik} + \mu_{kj}$$

This transformation can be generalized to apply to any node which is not interior to a loop of length one, i.e., there is no arc for which that node is both its origin and terminal node. The general transformation is illustrated in figure 3.4. Each different combination of two arcs in series is replaced by a new arc and the interior node is eliminated. The probability and execution time for each of the new branches are computed in the same way as for the simple series transformation, that is,

$$P_{nr} = P_{nk}P_{kr}$$

$$\mu_{nr} = \mu_{nk} + \mu_{kr}$$

and similarly for each of the other new arcs.

[Figure 3.4 General series transformation]

The parallel transformation is applicable to a pair of arcs in parallel, that is a pair of arcs both of which have the same origin node and the same terminal node.
Figure 3.5 illustrates this transformation. The pair of parallel arcs is replaced by a single new arc. The probability and execution time for this new arc are,

$$P_{ik} = P'_{ik} + P''_{ik}$$

$$\mu_{ik} = \frac{P'_{ik}\mu'_{ik} + P''_{ik}\mu''_{ik}}{P'_{ik} + P''_{ik}}$$

[Figure 3.5 Parallel transformation]

If there are more than two parallel arcs between two nodes they can be reduced to a single arc by applying the parallel transformation repeatedly to one pair of arcs at a time.

The loop transformation removes an arc which is a loop of length one, that is, an arc which has the same node for both its origin and terminal nodes. This transformation is illustrated in figure 3.6. The arc which is a loop is eliminated and a new probability and execution time are assigned to each of the remaining arcs. These new values are,

$$P_{ik} = \frac{P'_{ik}}{1 - P_{ii}}$$

$$\mu_{ik} = \mu'_{ik} + \frac{P_{ii}\mu_{ii}}{1 - P_{ii}}$$

which must be calculated for each remaining arc which has node $i$ as its origin node.

[Figure 3.6 Loop transformation]

If a directed graph has a single entrance node and a single exit node, repeated applications of these elementary transformations will reduce the graph to a single arc and two nodes.
To illustrate this procedure we will use the graph model from section 2.1.2, which is shown again in figure 3.7(a). Figure 3.7 shows the reduction of this graph model by repeated application of the elementary transformations. The parallel transformation applied to the two arcs $(2,3)$ transforms the graph from (a) to (b). Two applications of the series transformation, first to the arcs $(1,2)$ and $(2,3)$ with interior node 2 and then to arcs $(1,3)$ and $(3,4)$ with interior node 3, transform the graph from (b) to (c). Another series transformation, on arcs $(5,6)$ and $(6,4)$ with interior node 6, transforms the graph from (c) to (d). The transformation from (d) to (e) is accomplished by a general series transformation. In this case node 5 is the interior node and there are three arcs involved, $(4,5)$, $(5,7)$, and $(5,4)$. The result of eliminating node 5 is two new arcs, $(4,7)$ and $(4,4)$ which is a loop. Application of the loop transformation eliminates this loop and transforms the graph from (e) to (f).

[Figure 3.7 Reduction of the graph model, panels (a) to (g)]

Referring to figure 3.6 we see that,

$$P_{ii} = P_{44} = 0.9, \qquad P'_{ik} = P'_{47} = 0.1, \qquad \mu_{ii} = \mu_{44}, \qquad \mu'_{ik} = \mu'_{47}$$

then,

$$P_{47} = P_{ik} = \frac{P'_{ik}}{1 - P_{ii}} = \frac{0.1}{1 - 0.9} = 1$$

$$\mu_{47} = \mu_{ik} = \mu'_{ik} + \frac{P_{ii}\mu_{ii}}{1 - P_{ii}} = 2 + \frac{7.2}{0.1} = 74$$
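The three elementary transformations are simple enough to state as code. The following is a minimal sketch in which an arc is a pair (probability, execution time); the type alias and function names are ours. The loop step of the example is then recomputed, taking $\mu_{44} = 8$ (implied by $P_{44}\mu_{44} = 7.2$) and $\mu'_{47} = 2$ from the calculation above.

```python
from typing import Tuple

Arc = Tuple[float, float]  # (probability, execution time)


def series(first: Arc, second: Arc) -> Arc:
    """Replace arcs (i,k) and (k,j) by a single arc (i,j)."""
    (p1, mu1), (p2, mu2) = first, second
    return (p1 * p2, mu1 + mu2)


def parallel(a: Arc, b: Arc) -> Arc:
    """Replace two arcs with the same origin and terminal node by one arc."""
    (p1, mu1), (p2, mu2) = a, b
    p = p1 + p2
    return (p, (p1 * mu1 + p2 * mu2) / p)


def loop(self_arc: Arc, exit_arc: Arc) -> Arc:
    """Remove a loop (i,i) and adjust one remaining arc (i,k) leaving node i."""
    (p_ii, mu_ii), (p_ik, mu_ik) = self_arc, exit_arc
    return (p_ik / (1.0 - p_ii), mu_ik + p_ii * mu_ii / (1.0 - p_ii))


# Loop (4,4) with P = 0.9, mu = 8; remaining arc (4,7) with P' = 0.1, mu' = 2.
p47, mu47 = loop((0.9, 8.0), (0.1, 2.0))
print(round(p47, 6), round(mu47, 6))  # 1.0 74.0
```

When a node has several outgoing arcs besides the loop, `loop` would be applied once per remaining arc, as the text requires.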
Finally, application of a series transformation to arcs $(1,4)$ and $(4,7)$ with interior node 4 reduces the graph to (g) which is a single arc and two nodes, the entrance and exit nodes. The final reduced graph indicates that the execution time of the program is 81 time units.

The elementary transformations which we have been using are also applicable to graph models with multiple entrance and exit nodes. The only restriction is that no entrance or exit node may be eliminated. A graph model with multiple entrance and exit nodes cannot be reduced to a single arc. For example, figure 3.8 shows the reduction of a graph model with two entrance nodes and two exit nodes. Each of the transformations used in this example is the series transformation except that from (d) to (e) which is the parallel transformation. The reduced graph has three arcs which represent all of the possible paths from entrance nodes to exit nodes. Each of the arcs indicates the execution time for that path and the probability that the path will be followed given that control enters at the corresponding entrance. If we know the probability of entering at each entrance we can tabulate all of the paths and assign to each path the probability that it will be followed through the program. For example, assume the probability of entering at entrance node 1 is $e_1 = 0.9$ and the probability of entering at entrance node 2 is $e_2 = 0.1$. The three paths in figure 3.8 are tabulated in figure 3.9. The probability for each path is the product of the probability assigned to the arc representing the path and the entrance probability assigned to that arc's origin node, that is, the probability of the path represented by arc $(i,j)$ is $P_{ij}e_i$.

[Figure 3.8 Reduction of a graph model with two entrance nodes and two exit nodes, panels (a) to (e)]

[Figure 3.9 Tabulation of the three paths of figure 3.8(e) with their probabilities and execution times]

We can also compute an average execution time for the entire program by taking a weighted sum of the execution times for all of the paths where the weights used are the path probabilities. In our example this sum is,

$$0.9(11) + 0.068(12.88) + 0.032(8) = 11.04$$
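The tabulation of paths and the weighted sum can be sketched as follows. The arc probabilities 1.0, 0.68, and 0.32 are not stated explicitly in the text; they are inferred here from the path probabilities in the sum above and the entrance probabilities $e_1 = 0.9$ and $e_2 = 0.1$, and the variable names are ours.

```python
# One row per path: (entrance probability e_i, arc probability P_ij, time).
# The arc probabilities are inferred values, see the note above.
paths = [
    (0.9, 1.00, 11.0),   # path entered at entrance node 1
    (0.1, 0.68, 12.88),  # first path entered at entrance node 2
    (0.1, 0.32, 8.0),    # second path entered at entrance node 2
]

# Probability of each path: P_ij * e_i.
path_probabilities = [e * p for e, p, _ in paths]

# Average execution time: path times weighted by the path probabilities.
average_time = sum(e * p * t for e, p, t in paths)

print([round(q, 3) for q in path_probabilities], round(average_time, 2))
```

Evaluated this way the weighted sum comes to about 11.03, close to the 11.04 quoted above.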
These figures are principally useful for getting a general idea of the magnitude of the average execution times for the paths and the program as a whole. When control enters the program it actually follows some particular path. The actual execution times for the paths in our example range from 8 to 15.

We mentioned in section 2.1.2 that in order to model some of the common program constructions, we needed to extend the graph model to include arcs whose execution time was not constant. Following Beizer [9], we propose representing the execution time by two numbers, the mean execution time and its variance $(\mu, \lambda)$. This extension is useful even in the simpler case illustrated by our last example. Even if the variance is zero for all of the arcs in the original graph, the result of a parallel transformation will not have zero variance if the execution times of the two arcs are not equal. The elementary transformations are easily extended to include the variance. The new variance for the series transformation is given by,

$$\lambda_{ij} = \lambda_{ik} + \lambda_{kj},$$

for the parallel transformation by,

$$\lambda_{ik} = \frac{P'_{ik}\lambda'_{ik} + P''_{ik}\lambda''_{ik}}{P'_{ik} + P''_{ik}} + \frac{\mu'^{\,2}_{ik}P'_{ik} + \mu''^{\,2}_{ik}P''_{ik}}{P'_{ik} + P''_{ik}} - \mu_{ik}^2,$$

and for the loop transformation by,

$$\lambda_{ik} = \lambda'_{ik} + \frac{\lambda_{ii}P_{ii}}{1 - P_{ii}} + \frac{\mu_{ii}^2 P_{ii}}{(1 - P_{ii})^2}.$$

By associating a variance with each arc, the reduced graph will indicate the variation in execution time for the various paths as well as their mean execution time. This helps give a more accurate picture of the program's behavior.

If we include the variance in our last example, the variances are all zero up until application of the parallel transformation to the partially reduced graph in figure 3.8(d). The variance for the new arc $(2,6)$ is,

$$\lambda_{26} = \frac{P'_{26}\lambda'_{26} + P''_{26}\lambda''_{26}}{P'_{26} + P''_{26}} + \frac{\mu'^{\,2}_{26}P'_{26} + \mu''^{\,2}_{26}P''_{26}}{P'_{26} + P''_{26}} - \mu_{26}^2$$

$$= \frac{0.2(0) + 0.48(0)}{0.2 + 0.48} + \frac{15^2(0.2) + 12^2(0.48)}{0.2 + 0.48} - 12.88^2 = 1.93$$
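The variance rules can be added to the transformations in the same style. In this sketch an arc is a triple (probability, mean time, variance); the names are ours, and the example recomputes the new arc $(2,6)$ from the values used above. Carried out without intermediate rounding, the variance comes to about 1.87; the 1.93 above results from squaring the mean after rounding it to 12.88.

```python
from typing import Tuple

VArc = Tuple[float, float, float]  # (probability, mean time, variance)


def series_v(a: VArc, b: VArc) -> VArc:
    """Series: probabilities multiply, means add, variances add."""
    return (a[0] * b[0], a[1] + b[1], a[2] + b[2])


def parallel_v(a: VArc, b: VArc) -> VArc:
    """Parallel: probability-weighted mean and the matching variance."""
    p = a[0] + b[0]
    mean = (a[0] * a[1] + b[0] * b[1]) / p
    var = ((a[0] * a[2] + b[0] * b[2]) / p
           + (a[0] * a[1] ** 2 + b[0] * b[1] ** 2) / p
           - mean ** 2)
    return (p, mean, var)


def loop_v(self_arc: VArc, exit_arc: VArc) -> VArc:
    """Loop: remove arc (i,i) and adjust one remaining arc (i,k)."""
    p_ii, mu_ii, var_ii = self_arc
    p_ik, mu_ik, var_ik = exit_arc
    q = 1.0 - p_ii
    return (p_ik / q,
            mu_ik + p_ii * mu_ii / q,
            var_ik + var_ii * p_ii / q + mu_ii ** 2 * p_ii / q ** 2)


# Parallel arcs of figure 3.8(d): (P, mu, variance) = (0.2, 15, 0), (0.48, 12, 0).
p26, mu26, var26 = parallel_v((0.2, 15.0, 0.0), (0.48, 12.0, 0.0))
print(round(p26, 2), round(mu26, 2), round(var26, 2))
```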
We can also apply the variance computations to the program paths and reduce the graph to a single arc if we assume a dummy entrance node which is the origin node of some new arcs, one to each entrance node in the original graph, and a dummy exit node which is the terminal node of some new arcs, one from each exit node in the original graph. Figure 3.10(a) shows the final graph of figure 3.8(e) modified in this way. In this graph the execution times are written as a pair of numbers $(\mu, \lambda)$. Two series transformations are applied to (a) to get (b) and one to (b) to get (c). Then the parallel transformation is applied to obtain (d). Notice that the variance actually decreases. This is because the arc which had the higher variance also had a very low probability and the means for the two branches are quite close together. One more series transformation followed by a parallel transformation reduce the graph to a single arc which has a mean execution time of 11.04 with a variance of 0.39.

[Figure 3.10 Reduction of multiple entrance and exit model with variances, panels (a) to (f)]

The modified graph model which we have just been discussing, which includes variances, is still not adequate for modeling some aspects of program behavior, especially loops and branches which depend on the arguments of the program. If this dependency can be expressed as a simple relation we may be able to find a mean value and variance for the execution time corresponding to the data dependent portion of the program. However, we may not be able to do this because the execution time does not follow a normal distribution closely enough for the mean and variance to be a valid representation. Also we may not be able to derive a numerical probability for all of the arcs. There are two basic directions in which to attack this problem. We can try to extend the basic model to allow more variety in the method of expressing the probability and execution time attached to an arc, either by allowing other distributions or symbolic expressions. In either case the analysis becomes more difficult and we soon experience great difficulty in analyzing the model, just as we did with the analytical model. The other direction is to go to some form of simulation. In this case, we can extend the model to include other distributions and symbolic expressions for expressing the branching probabilities and execution times. One extension of this model [16] is used with a combination of techniques. After doing as much analysis as possible, the partially reduced model is used for simulation. This extended model and the techniques used on it are described in more detail in section 5.
4. SIMULATION

Gordon [19] defines system simulation "as the technique of solving problems by following the changes over time of a dynamic model of a system." Basically, in simulation one does not attempt to solve the model analytically. Further, no specific attempt is made to isolate the relations between any particular variables; one just observes the way the variables of the model change with time. Relationships must be derived from these observations. Therefore, simulation is basically an experimental technique. In this section we will consider the methods and problems of simulation and explore the model described in section 2.1.3.

4.1. MAJOR METHODS

There are two major types of simulation: continuous and discrete. The model of a continuous system, where our interest is in smooth changes in time, is usually a set of differential equations. Continuous simulation is based on such a model. Analog computers are best suited for this type of simulation and are used extensively for this purpose. Digital computers can be used also, provided a small enough time interval is used to integrate the equations. If we are not interested in smooth time changes but in certain events, our model is essentially a set of logical conditions which are necessary for the event to occur. In this case simulation follows the changes in the system which result from a succession of events. This is discrete simulation. Computer operating systems are basically discrete systems so our discussion will be limited to discrete simulation.

To further clarify the definition of discrete simulation refer back to the simulation model described in section 2.1.3. There we described a model which represented the system by describing the flow of a job through the system. With respect to time only certain events were interesting, for example, putting a job on one of the queues, allocating the processor to a job, the entry of a job into the system, and so forth. What happens between these events (e.g., several seconds of uninterrupted execution) is uninteresting and, aside from the length of the time interval between two successive events, has no relevance to the performance of the system. Thus, our interest is focused on a succession of points in time which are separated by finite time intervals (which we allow to be of length zero).
There are three major computer based methods used for simulation: an analog computer, a simulation system, and a do-it-yourself specific program. As we mentioned earlier the principal use of analog computers is for continuous simulation. It is a relatively useless method for the simulation of computer operating systems, or any other discrete systems for that matter. Hence, this method will not be discussed further. A simulation system usually consists of a special modeling language, a translator or interpreter for that language, and a collection of support routines. The user describes his model in the modeling language. This description is then either interpreted directly to perform the simulation or translated into a program which performs the simulation when it is executed. In either case, the user is provided with a convenient way of specifying and changing the parameters in his model so that he can make a number of different simulation "experiments." The simulation system also provides him with data collection, analysis, and display facilities so that he can easily observe the changes in the variables of his model and derive the relations between them. Using the do-it-yourself specific program method the user writes a program to specifically simulate exactly his model. As a result he may have to program most of the functions supplied by a simulation system. However, if his model is quite simple, the resulting program may perform the simulation much faster than a simulation system would.

The technique for discrete simulation is essentially the same whichever of the latter two methods is used. A model of a system is concerned with one or more different classes of entities. In our example, job is one class of entity. Each class of entity has a number of attributes associated with it which represent various properties of entities in the class. For example, the attributes of a job are its execution time, its central memory requirement, and its I-O requests. An individual entity from a certain class has a set of values associated with it, one value for each attribute associated with the class. The model consists of the definition of the classes of entities and their attributes, a set of activities, and a set of events. An activity is a process which acts on one or more entities and changes the state of the system. For example, an activity may be an input or output operation or execution of a program by the central processor. The state of the system is a record of all the individual entities, with the values of their attributes, which currently exist in the system and the activities currently in progress along with an indication of which entities they are processing.
An event is a point in time at which a change in the system state occurs. An event has no duration. When an event occurs some activity takes place. Activities also cause events to occur. It is the execution of activities which actually causes the changes in the system state. Since simulation consists of following the changes in a model of a system, it is basically a program which follows a sequence of events. Except for the magnitude of its duration, the time between events is not significant and is ignored. While following a sequence of events the simulator keeps the system state updated.

Fundamental to simulation is the concept of time. The simulator must be aware of the passage of simulation time, which is the basis for time relationships in the model. Simulation time usually has no connection with the real time which it takes the simulator program to run. The usual method of recording the passage of simulation time is to maintain a simulation clock. The simulation clock can be updated by small, uniform intervals of time. This method is normally used for continuous simulation. On the other hand the method normally used in discrete simulation is to advance the simulation clock to the time at which the next event is due to occur. Thus, the clock is updated by varying length time intervals whose length corresponds to the simulation time between consecutive events. In a sense, the simulator is unaware of the time between events. Indeed it need not be aware of this time since nothing happens between events.

One of the major functions of an activity is to determine that some event will occur in the future and compute the time at which it will occur. A major function of the simulator is to accept this information and record an identification of the event and the time at which it will occur. This action is called scheduling an event. The most common way of recording the information about a future event is in an event list which is ordered by time of occurrence of the event. The first event to occur in the future is the first event on the list. The second event to occur in the future is second on the list and so forth.
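The simulation clock and the event list described above can be sketched in a few lines. This is only an illustration under our own assumptions: the class name `Simulator`, its methods, and the two sample jobs are hypothetical, and the event list is kept in time order with a binary heap rather than a literal sorted list.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete-event core: a clock plus a time-ordered event list."""

    def __init__(self):
        self.clock = 0.0                 # simulation time, not real time
        self._events = []                # heap ordered by time of occurrence
        self._seq = itertools.count()    # tie-breaker for simultaneous events
        self.log = []

    def schedule(self, delay, name, activity=None):
        """Record a future event and the time at which it will occur."""
        entry = (self.clock + delay, next(self._seq), name, activity)
        heapq.heappush(self._events, entry)

    def run(self):
        """Advance the clock from event to event until the list is empty."""
        while self._events:
            time, _, name, activity = heapq.heappop(self._events)
            self.clock = time            # jump directly to the next event
            self.log.append((time, name))
            if activity is not None:
                activity(self)           # an activity may schedule new events


def start_execution(sim):
    # Activity: job 1 begins 3 time units of execution and schedules its end.
    sim.schedule(3.0, "job 1 completes")


sim = Simulator()
sim.schedule(0.0, "job 1 arrives", start_execution)
sim.schedule(1.0, "job 2 arrives")
sim.run()
print(sim.log)
```

Note that the clock moves in steps of unequal length (0, 1, then 3), and nothing is computed for the intervals between events.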
4.2. SPECIFICATION OF JOB PROPERTIES

Many of the interesting properties (attributes) of a job are stochastic variables. The most common way of specifying the values of such a variable, $x$, is by a probability distribution. There are two types of distributions, discrete and continuous. A discrete distribution is a finite set of values $x_1, x_2, \ldots, x_n$ each with an associated probability, $p_1, p_2, \ldots, p_n$, where $p_i$ is the probability that the value of the stochastic variable $x$ will be equal to $x_i$. The condition,

$$\sum_{i=1}^{n} p_i = 1$$

is imposed on the probabilities, that is, the stochastic variable must have a value equal to one of the $x_i$. For a continuous distribution, the value of the variable $x$ is defined using a probability density function $f(x) \ge 0$. The probability that the value of $x$ falls in the range $x_1$ to $x_2$, where $x_1 \le x_2$, is given by the integral

$$\int_{x_1}^{x_2} f(x)\,dx$$

We can see from this that the probability of $x$ having one specific value is zero. We also require,

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

A related function, the cumulative distribution function,

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

is more often used in simulation. $F(x)$ is monotonic increasing and its value ranges from 0 to 1. The value of $F(x_0)$ is the probability that the value of $x$ is less than or equal to $x_0$. We can also derive a cumulative distribution function for a discrete distribution. We order the values $x_i$ and change their subscripts (and also the corresponding subscripts on the $p_i$) so that, $x_1 < x_2 < \cdots < x_n$. Then,

$$F(x_k) = \sum_{i=1}^{k} p_i$$

is the probability that the value of $x$ is less than or equal to $x_k$.
Actually, what we really need is the inverse of the cumulative distribution function. When simulating our system we need to generate a set of values for the attributes of each new job which enters the system. For each stochastic variable $x$ in the attributes we need to generate a sequence of random numbers which are drawn from the distribution corresponding to $x$. If this distribution is not uniform (all values equally likely) it may be difficult to generate the sequence directly. However, it is relatively easy to generate sequences of uniformly distributed random numbers, and most system libraries have at least one subroutine which does this. It is fairly easy to convert a sequence of random numbers which are uniformly distributed over the range from 0 to 1 to a sequence of random numbers which satisfy some other distribution by using the inverse of the cumulative distribution function for that distribution. Recalling that $0 \le F(x) \le 1$, generate a random number, $y_r$, uniformly distributed over $0 \le y_r \le 1$. Then let $y_r = F(x_r)$ and solve for $x_r$, i.e., $x_r = F^{-1}(y_r)$, as shown in figure 4.1.
[Plot of F(x) against x. The vertical axis runs from 0 to 1.0; a value y_r on the vertical axis is carried across to the curve and down to x_r on the horizontal axis.]
Figure 4.1 Graph of a cumulative distribution function
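As an illustration of the inverse method, the sketch below draws from a continuous distribution by evaluating an inverse cumulative distribution function, and from a discrete distribution by a table look-up on the cumulative probabilities. The function names and the example distributions are our own assumptions, chosen only to make the sketch runnable; Python's standard `random` module stands in for the library subroutine that supplies uniform random numbers.

```python
import bisect
import math
import random


def sample_continuous(inverse_cdf, rng):
    """Draw y_r uniformly from [0, 1) and return x_r = F^-1(y_r)."""
    y_r = rng.random()
    return inverse_cdf(y_r)


def make_discrete_table(values, probabilities):
    """Return the cumulative probabilities F(x_1), ..., F(x_n).

    The values are assumed to be sorted so that x_1 < x_2 < ... < x_n.
    """
    if len(values) != len(probabilities):
        raise ValueError("one probability is needed per value")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    cumulative = []
    running = 0.0
    for p in probabilities:
        running += p
        cumulative.append(running)
    cumulative[-1] = 1.0  # guard against rounding in the last entry
    return cumulative


def lookup(values, cumulative, y_r):
    """Find x_k such that F(x_{k-1}) < y_r <= F(x_k), with F(x_0) = 0."""
    return values[bisect.bisect_left(cumulative, y_r)]


def sample_discrete(values, cumulative, rng):
    """Table look-up with y_r restricted to the range 0 < y_r <= 1."""
    y_r = 1.0 - rng.random()  # maps [0, 1) onto (0, 1]
    return lookup(values, cumulative, y_r)
```

For example, the distribution with $F(x) = x^2$ on the range 0 to 1 has $F^{-1}(y) = \sqrt{y}$, so `sample_continuous(math.sqrt, random.Random(42))` draws one value from it.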
Of course this procedure requires that one be able to evaluate $F^{-1}$. This procedure also works for discrete distributions, but in this case it is basically a table look up. Again we generate a random number $y_r$ which is uniformly distributed, but we must restrict the range so that $0 < y_r \le 1$. Then we have to find a value $k$ such that $F(x_{k-1}) < y_r \le F(x_k)$, with the convention that $F(x_0) = 0$. The desired random number is then $x_k$. The sequence of $x$'s generated by either of these procedures is random and has the desired (non-uniform) distribution.

Another important characteristic of the jobs which are input to the system is their arrival pattern, which describes the statistical properties of the job arrivals at the system. The usual way of describing an arrival pattern is in terms of the inter-arrival time, which is the interval between successive arrivals. If the arrival pattern has no variability, the inter-arrival time is a constant. If the arrivals vary stochastically, the inter-arrival time will be defined by a probability distribution. It is common practice to define the arrival distribution $A_0(t)$ as the probability that an inter-arrival time is greater than $t$. Since the cumulative distribution function $F(t)$ is the probability that an inter-arrival time is less than $t$, we have $A_0(t) = 1 - F(t)$.
A common arrival pattern is one in which the arrivals are completely random. This means a job can arrive at any time, subject only to the restriction that the mean arrival rate $\lambda$ be some given value. With this arrival pattern the distribution of inter-arrival times is exponential. The probability density function of the inter-arrival time is

$$f(t) = \lambda e^{-\lambda t}, \quad t > 0$$

and the arrival distribution is

$$A_0(t) = e^{-\lambda t}$$

The number $\lambda$ is the mean number of arrivals per time unit. The actual number of arrivals in an interval of time $t$ is a stochastic variable. With an exponential distribution of inter-arrival times, the probability of $n$ arrivals occurring in an interval of time $t$ is

$$P(n) = \frac{(\lambda t)^n e^{-\lambda t}}{n!} \quad (n = 0, 1, 2, \ldots)$$

This distribution is discrete and is called the Poisson distribution. For this reason a random arrival pattern is usually called a Poisson arrival pattern. The cumulative distribution function of the exponential distribution is

$$F(t) = 1 - A_0(t) = 1 - e^{-\lambda t}$$

and its inverse is

$$\lambda t = -\log_e(1 - F(t))$$

The Poisson arrival pattern is one of the most commonly occurring arrival patterns.
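To make the use of the inverse concrete, the following sketch generates exponentially distributed inter-arrival times from uniform random numbers and evaluates the Poisson probability of $n$ arrivals. The function names are our own, and `rate` stands for the mean arrival rate $\lambda$; this is a minimal illustration, not a fragment of any simulation system.

```python
import math
import random


def interarrival_time(rate, rng):
    """Invert F(t) = 1 - exp(-rate * t), giving t = -ln(1 - y) / rate."""
    y = rng.random()  # uniform on [0, 1), so 1 - y is never zero
    return -math.log(1.0 - y) / rate


def poisson_probability(n, rate, t):
    """Probability of n arrivals in an interval of length t."""
    return (rate * t) ** n * math.exp(-rate * t) / math.factorial(n)


def count_arrivals(rate, interval, rng):
    """Count the arrivals that fall inside one interval of the given length."""
    clock = 0.0
    count = 0
    while True:
        clock += interarrival_time(rate, rng)
        if clock > interval:
            return count
        count += 1
```

With a large number of samples the mean of `interarrival_time(rate, rng)` should approach `1 / rate`, and the counts returned by `count_arrivals` should follow `poisson_probability`.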
We use the coefficient of variation $\sigma/T_a$ (where $\sigma$ is the standard deviation and $T_a$ is the mean value) to measure the degree to which data is dispersed about the mean. Since the standard deviation for an exponential distribution of mean value $T_a$ ($T_a = 1/\lambda$) is also $T_a$, the coefficient of variation is 1. If the coefficient of variation for the job mixes which will actually be submitted to the system is significantly less than or greater than 1, then an Erlang or hyper-exponential distribution [19], respectively, should be used.

While it may be possible to create a sequence of job arrivals before a simulation run is started, the usual procedure is to delay creation of the jobs until they are needed. The arrival of a job is an event. When the simulation clock reaches the time for this event to occur a new job (entity) is created. Using the inverse of the cumulative distribution for the inter-arrival times and a newly generated random number, the inter-arrival time for the next job to arrive is computed. The arrival of the next job is then scheduled to occur at a time equal to the current clock time plus the inter-arrival time for the next job. In addition to scheduling the arrival of the next job, the values of the attributes of the newly created job are computed and set.
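A minimal sketch of this arrival mechanism follows, assuming exponential inter-arrival times. The function `simulate_arrivals`, the job dictionary, and the single `memory` attribute (drawn uniformly from 1 to 64) are hypothetical choices made only for illustration. Only one arrival is pending at any time, since each job is created when its arrival event occurs.

```python
import heapq
import math
import random


def simulate_arrivals(rate, end_time, seed=1):
    """Create jobs one at a time, each arrival scheduling the next one."""
    rng = random.Random(seed)
    event_list = []  # entries are (time, event name)
    jobs = []        # entities created so far

    def schedule_next_arrival(now):
        # Inverse of the exponential cumulative distribution function.
        gap = -math.log(1.0 - rng.random()) / rate
        heapq.heappush(event_list, (now + gap, "arrival"))

    schedule_next_arrival(0.0)  # the first arrival starts the chain
    while event_list and event_list[0][0] <= end_time:
        clock, _ = heapq.heappop(event_list)
        # The arrival event: create the entity and set its attributes ...
        job = {
            "id": len(jobs) + 1,
            "arrival_time": clock,
            "memory": rng.randint(1, 64),  # hypothetical attribute
        }
        jobs.append(job)
        # ... and schedule the arrival of the next job.
        schedule_next_arrival(clock)
    return jobs
```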
Thus, the job arrival event creates a new entity, sets the values of its attributes, and schedules a future event.

4.3. DATA COLLECTION
The particular data collected and the analysis performed on this data depend upon the model and the purpose of the simulation. However, there are some data which are so common that most simulations will collect them. The same is true of certain basic analysis. A count of the number of times some event occurred, such as a request for disk I-O, or the number of entities in a particular class which were created, such as the number of jobs which enter the system, is one of the most common data collected. Summary statistics, such as extreme values, mean values, and standard deviations are also usually computed. Suppose we are interested in central memory usage. The maximum and minimum amount of central memory occupied is easily obtained by comparing each new value for memory use, $x_i$, against the current values of the maximum and minimum. To obtain the mean $M$ and standard deviation $S$ the simulator must accumulate both the sum of the different memory use values and the sum of the squares of these values, since $M$ and $S$ are defined by

$$M = \frac{1}{n} \sum_{i=1}^{n} x_i$$

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - M^2$$

The sums are accumulated during the simulation run and the remainder of the computation is done at the end of the simulation.
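The accumulation just described can be sketched as follows. The class name `Tally` and its methods are our own; the sketch simply keeps the count, the sum, the sum of squares, and the running extremes, and defers the computation of $M$ and $S$ until the values are asked for.

```python
import math


class Tally:
    """Accumulates summary statistics for one observed quantity."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0
        self.minimum = math.inf
        self.maximum = -math.inf

    def record(self, x):
        """Called during the run for each new value x_i."""
        self.n += 1
        self.total += x
        self.total_sq += x * x
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)

    def mean(self):
        """M = (1/n) * sum of x_i, computed at the end of the run."""
        return self.total / self.n

    def std_dev(self):
        """S, from S^2 = (1/n) * sum of x_i^2 - M^2."""
        m = self.mean()
        variance = self.total_sq / self.n - m * m
        return math.sqrt(max(variance, 0.0))  # clamp small negative rounding
```

No individual observation needs to be stored, which matters when a run produces a very large number of values.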
Another common datum collected is the fraction of time that some entity such as the central processor is in use.

Since queues usually play an important part in any system, data on the queue activity is usually collected. Some of the more important data are the variation in queue length, which may be expressed by the mean, standard deviation, maximum, and minimum, and similar statistics for the waiting time, which is the time a job spends in the queue. Often the time between certain events or the time it takes an entity to move from one part of the system to another is useful. Sometimes an event trace is desired. This is a record of every event and the state of the system after the occurrence of the event. Since this is usually a very large amount of data, a complete event trace is normally not desired, except in case of trouble in the simulator. However, a partial event trace may be quite useful. In a partial event trace only part of the system state is included in the output, or only selected events are traced.

Most simulation systems provide facilities for collecting all of the data mentioned above. In addition, they contain the most common analysis routines. Since the user may often wish to analyze the data in other ways, some systems allow the user to write analysis programs which can be incorporated into the simulation. Display of the simulation results is, in some ways, as important as the simulation itself. Thus, most simulation systems have facilities for printing the results in reasonably readable tables. A few systems have facilities for plotting graphs from the simulation results. A graph is often the ideal way of displaying simulation results, since the user is looking for relations which exist between the variables of the system.

4.4. SIMULATION LANGUAGES

In speaking of simulation languages we mean a language for describing a model and the other information necessary to simulate the system which is represented by the model. As such we would expect any simulation language to include features especially for describing entity classes and their attributes, activities, and events. This rules out languages like FORTRAN and PL/I which we do not consider to be simulation languages. We also expect a simulation language to include queues (or something equivalent), facilities for specifying a number of different probability distributions, and facilities for data collection and analysis.
There are two classes of simulation languages, general purpose and special purpose. A general purpose simulation language is designed to be used to simulate a wide range of dynamic systems, such as computer systems, telephone systems, economic systems, factory assembly lines, supermarkets, and ocean ports. For this reason the underlying simulator for a general purpose language can have no built-in knowledge about the system being simulated. On the other hand, a special purpose simulation language is designed to simulate a specific kind of system, such as a computer operating system. Thus, its underlying simulator can have built-in knowledge about the kind of system which will be simulated, such as knowledge of the operational characteristics of sequential and random access devices (e.g., tape and disk).

Four of the most popular general purpose simulation languages are GPSS, SIMSCRIPT, SIMULA, and CSL. Each of these languages presents a different view of system dynamics. Kiviat [20] has written a detailed analysis of simulation languages and compares the characteristics of these four languages.
In addition, he gives examples of the use of each language. We will not attempt to duplicate that analysis here. What we will do is to briefly sketch the highlights of GPSS and SIMSCRIPT to give the reader a feeling for the character of general purpose simulation languages.

GPSS is a block diagram language. The model of the system to be simulated is described as a block diagram. Blocks represent activities and the lines joining the blocks indicate the sequence in which the activities can be executed. Moving through the system to be simulated are entities, such as jobs. In GPSS these entities are called transactions. An event is defined as the movement of a transaction from one block to another. Input to the GPSS simulator is a description of each of the blocks in the model plus some control cards which may define functions (probability distributions, etc.) and tables as well as control the execution of the simulator.

In the model transactions are created by GENERATE blocks. Part of the description of this block is the definition of the inter-arrival time of the transactions generated by the block.
The inter-arrival time can be specified as a constant, a normal distribution, or some user defined function. Normally it does not take any (simulation) time to pass through a block, except for the ADVANCE block. This block is a delay and its description specifies the duration of the delay. When a transaction enters an ADVANCE block an event, which is the movement of the transaction to the next block, is scheduled to occur at a time in the future equal to the current time plus the delay specified by the ADVANCE block. The simulation consists of moving a transaction through one block after another until it reaches a TERMINATE block, which removes the transaction from the simulation, or until it is delayed by an ADVANCE block or encounters a block which cannot be entered at the current time. The simulator then considers the next scheduled event, moving the associated transaction through as many blocks as possible.

There are some blocks which cannot always be immediately entered, such as the SEIZE and ENTER blocks.
These blocks are used to control the use of permanent entities which GPSS calls facilities and storages. A facility is an entity that can be allocated to only one transaction at a time, such as the central processor. A storage is a partitionable entity, such as central memory. Portions of a storage may be allocated to several different transactions simultaneously, a different portion to each transaction. The portions need not be the same size. The SEIZE block applies to facilities and the ENTER block applies to storages. The RELEASE block releases a facility which has been allocated by a SEIZE block and a LEAVE block gives up some or all of the storage allocated by the ENTER block. A transaction is prevented from entering a SEIZE block if the requested facility is in use. Similarly a transaction is prevented from entering an ENTER block if the amount of storage available is less than the amount requested.

When a transaction is prevented from entering a block it is automatically queued; however, the simulator keeps no statistics on the activity in these queues. If the user wishes to collect such statistics he must explicitly queue and dequeue the transactions. This is done by the QUEUE and DEPART blocks. The QUEUE block identifies a queue and increments the length of that queue. The DEPART block identifies a queue whose length is decremented. These blocks do not affect queue activity; they simply allow statistics gathering.
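The entry rules for facilities and storages, and the bookkeeping role of QUEUE and DEPART, can be paraphrased in Python. This is a sketch of the behaviour described in the text only; the class and method names are hypothetical and it makes no claim about how an actual GPSS simulator is implemented.

```python
from collections import deque


class Facility:
    """Allocated to only one transaction at a time (SEIZE / RELEASE)."""

    def __init__(self):
        self.owner = None
        self.waiting = deque()

    def seize(self, transaction):
        if self.owner is None:
            self.owner = transaction
            return True
        self.waiting.append(transaction)  # blocked, so queued automatically
        return False

    def release(self):
        """Free the facility and pass it to the first waiting transaction."""
        self.owner = self.waiting.popleft() if self.waiting else None
        return self.owner


class Storage:
    """Partitionable entity with a fixed capacity (ENTER / LEAVE)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def enter(self, amount):
        if self.capacity - self.in_use < amount:
            return False  # not enough storage available: entry is refused
        self.in_use += amount
        return True

    def leave(self, amount):
        if amount > self.in_use:
            raise ValueError("cannot give up more storage than is allocated")
        self.in_use -= amount


class QueueStatistics:
    """Bookkeeping only, in the manner of the QUEUE and DEPART blocks."""

    def __init__(self):
        self.length = 0
        self.max_length = 0
        self.entries = 0

    def queue(self):
        self.length += 1
        self.entries += 1
        self.max_length = max(self.max_length, self.length)

    def depart(self):
        self.length -= 1
```

As in the text, `QueueStatistics` does not hold or move any transaction; it only records the activity.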
GPSS also has some blocks which allow the user to specify other than the standard queue discipline. Two other blocks, MARK and TABULATE, allow the user to record the time it takes for a transaction to move between two points in the model. The MARK block indicates the initial point. The TABULATE block records the amount of (simulation) time which has passed since the MARK block. This time is recorded in a table specified by the TABULATE block.

GPSS also contains blocks for branching, assigning values to variables, and maintaining lists. However, since our purpose is only to give the flavor of GPSS, not completely describe the language, we will not discuss any of these additional features. Figure 4.2 shows a sample GPSS block diagram. In this example the name of the block is written to the left of the block.
[Block diagram. Blocks in sequence: GENERATE, QUEUE, SEIZE, DEPART, MARK, ADVANCE, RELEASE, TABULATE, TERMINATE.]
Figure 4.2 Example GPSS block diagram
The GENERATE block generates transactions at the rate of one every 5 time units. The 0 indicates that there is no variation in the inter-arrival time. The sequence for a transaction is to seize facility number 1, process for a period of time, release the facility, and leave the system. The ADVANCE block specifies that the processing time has a mean value of 4 and varies uniformly from 4-3 to 4+3. In order to gather statistics on the activity in the queue for the facility we have bracketed the SEIZE block with a QUEUE and a DEPART block. The inclusion of the MARK and TABULATE blocks causes the actual processing time for each transaction to be recorded in table number 1.

When using an actual GPSS simulator each block will have to be described on cards for input to the simulator. For example the first three blocks would be written,

    GENERATE  5,0
    QUEUE     1
    SEIZE     1

In addition, table number 1 must be defined and various other control information specified. The length of a simulation run is defined by specifying the number of transactions to be processed. The TERMINATE block counts by 1 until its count reaches the number of transactions specified, at which time the simulation run ends.

SIMSCRIPT is a language which is similar in appearance to FORTRAN. It deals with entities and their attributes.
Activities are described by event routines which are closed subroutines. When an event occurs its corresponding event routine is executed. All events must be explicitly scheduled by executing the appropriate statements in some event routine. For this reason SIMSCRIPT is classed as an event based language. There is no automatic queuing in SIMSCRIPT. Queues are managed by the event routines using data structures of entities called sets.

SIMSCRIPT has statements for creating and destroying entities. One special class of entity is the event notice. This entity is used for scheduling events. Whenever an event is to be scheduled an event notice is created. Then the CAUSE command is executed to schedule the corresponding event for some specified time. There are statements for maintaining sets, assignment of values to variables, branching, and collecting statistics. In addition there are minimal facilities for generating random values from various distributions. It is a characteristic of SIMSCRIPT that the user has to program more of the action in the simulation than he does if he uses GPSS. This is the price that is paid for the advantage that SIMSCRIPT is a more flexible language than GPSS.

If we translate our previous GPSS example into SIMSCRIPT we will need to write four event routines: one to get started, one to generate the entities (corresponding to the GENERATE block), one to start processing (corresponding to the QUEUE, SEIZE, DEPART, and ADVANCE blocks), and one to finish processing (corresponding to the RELEASE and TERMINATE blocks). We will omit the MARK and TABULATE from our translation. To get the simulation started we need the following special event routine,

    EXOG EVENT START
    CREATE ARRV
    CAUSE ARRV AT TIME
    STORE 0 IN BUSY
    RETURN
    END

This event routine creates an event notice for the event ARRV and schedules it to occur at TIME. TIME is a system variable whose value is the current time. BUSY is a global variable indicating the central processor is free if its value is 0. The event routine ARRV generates an entity corresponding to a job which arrives at the system.

    ENDOG EVENT ARRV
    DESTROY ARRV
    CREATE JOB
    CREATE PROS
    STORE JOB IN J(PROS)
    CAUSE PROS AT TIME
    CREATE ARRV
    CAUSE ARRV AT TIME+5
    RETURN
    END

This event routine creates a job, creates an event notice for the event PROS, and schedules it to occur immediately. The event routine PROS will begin processing of the job. The STORE statement stores the identification of the job to be processed in the event notice.
The event routine ARRV must also destroy the event notice which activated it and create a new event notice for itself and schedule this event to occur at 5 time units in the future.

The event routine PROS controls allocation of the central processor and maintains a queue of jobs waiting for the processor.

        ENDOG EVENT PROS
        STORE J(PROS) IN JID
        DESTROY PROS
        IF PQ IS EMPTY, GO TO 3
        FILE JID IN PQ
        RETURN
    3   IF BUSY EQ 0, GO TO 2
        FILE JID IN PQ
        RETURN
    2   STORE 1 IN BUSY
        CREATE TERM
        STORE JID IN J(TERM)
        CAUSE TERM AT TIME+RANDI(1,7)
        RETURN
        END
The identification of the job must be extracted from the event notice which activated this event routine before it is destroyed. PQ is a set which is the queue for the processor. If it is not empty the new job is added to the queue by the FILE statement and this event routine is then finished. If the queue is empty a test is made to see if the processor is busy. If it is, the job is put on the queue. If the processor is not busy it is allocated to the job. In this case an event notice for termination of the job's execution is created. This event, TERM, is then scheduled for the time at which the job will complete execution. The execution time of the job is a random number uniformly distributed in the range 1 to 7, as computed by the function call RANDI(1,7).

The termination event routine is activated when a job completes execution and releases the central processor.
        ENDOG EVENT TERM
        DESTROY J(TERM)
        DESTROY TERM
        IF PQ IS EMPTY GO TO 2
        REMOVE FIRST JID FROM PQ
        CREATE TERM
        STORE JID IN J(TERM)
        CAUSE TERM AT TIME+RANDI(1,7)
        RETURN
    2   STORE 0 IN BUSY
        RETURN
        END
Both the terminating job and the event notice which activated this event routine are destroyed. If the queue is not empty, the first job on the queue is removed from the queue and the processor allocated to it.

In a complete SIMSCRIPT program the various variables, entities, and sets would be defined by declarations. Additional statements would be included for collecting data and generating reports. Cards to control the simulation run would also be needed. Some versions of SIMSCRIPT permit the inclusion of subroutines written in FORTRAN which may be called from the event routines. This feature makes it possible for the user to do things during simulation which would otherwise be difficult or impossible.
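For readers more at home in a present-day language, the four event routines can be paraphrased as a single Python function. This is our own sketch, not a translation rule: the event names ARRV, PROS, and TERM are kept from the SIMSCRIPT example, while the function name `simulate`, the heap-based event list, and the `jobs_to_finish` stopping rule are assumptions introduced here. Creation and destruction of event notices is implicit in pushing and popping the event list.

```python
import heapq
import itertools
import random
from collections import deque


def simulate(jobs_to_finish, seed=1):
    """Single processor, arrivals every 5 time units, service time 1 to 7."""
    rng = random.Random(seed)
    seq = itertools.count()  # tie-breaker for events at equal times
    events = []              # (time, sequence number, event name, job id)
    queue = deque()          # PQ: jobs waiting for the processor
    busy = False             # BUSY
    next_job = 0
    finished = []            # (job id, completion time)

    def cause(name, time, job=None):
        heapq.heappush(events, (time, next(seq), name, job))

    cause("ARRV", 0)         # the START routine
    while len(finished) < jobs_to_finish:
        clock, _, name, job = heapq.heappop(events)
        if name == "ARRV":
            next_job += 1
            cause("PROS", clock, next_job)  # process the new job immediately
            cause("ARRV", clock + 5)        # next arrival
        elif name == "PROS":
            if queue or busy:
                queue.append(job)           # FILE JID IN PQ
            else:
                busy = True
                cause("TERM", clock + rng.randint(1, 7), job)
        elif name == "TERM":
            finished.append((job, clock))
            if queue:
                cause("TERM", clock + rng.randint(1, 7), queue.popleft())
            else:
                busy = False
    return finished
```

Calling `simulate(50)` returns the completion time of each of the first 50 jobs. Because the queue is served first-in first-out by a single processor, jobs complete in the order in which they arrived.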
There are two special purpose languages which we will discuss briefly: CSS [18] and DES [16]. These are both languages which have been designed for use in simulating computer operating systems. However, their orientation is different. CSS is oriented toward the simulation of existing systems, while DES is oriented toward systems which have not yet been implemented. DES was actually designed to be used for implementing operating systems as well as simulating them. The other major difference between the two languages is that CSS is like assembly language while DES is like PL/I.

The simulators for both of these languages have built-in knowledge of computer hardware systems and the language contains statements and declarations which relate to hardware facilities. The user specifies a particular hardware configuration by declaring the values of various hardware parameters, such as central memory size and cycle time, data transfer rates for I-O devices, latency for rotational devices, select time for tape drives, head movement time for disk drives, and the number of devices and processors.
They have statements for specifying processing time which are similar to the ADVANCE block of GPSS. There are also statements for synchronizing asynchronous operations which are necessary to model I-O channel operation, interrupts, and concurrent processing (multi-tasking). A minimal computational ability is available in CSS, but DES, which is actually an extension of PL/I, has the full capability of PL/I for computation and decision making. The following example taken from [18] illustrates the CSS language.

    APPL    PROCESS   3000        similar to ADVANCE
            WRITE     (file A)    initiate I-O
            READ      (file B)
            PROCESS   5000        overlapped with I-O
            WAIT      SCHEDL      wait for I-O completion
            PROCESS   7500
            WRITE     (file C)
            WAIT      SCHEDL
            BRANCH    SCHEDL      end of program, go to scheduler

In addition to these statements there would be declarations defining the hardware configuration and other required information. The DES language will be discussed in section 5 so we will not include an example here.

4.5. AN EXAMPLE SIMULATION MODEL

In this section we will model the small system defined in section 2.1.3 using GPSS and discuss its use in predicting the performance of the modeled system. The reader should refer to the diagram in figure 2.6 which shows the flow of jobs through the system. We must translate this diagram into the GPSS language. This is a fairly straightforward task since a job will be a GPSS transaction and a GPSS program describes the flow of transactions through the modeled system. The body of the GPSS program for our example is,

            GENERATE  1,FN1,,,,2    job enters system
            ASSIGN    1,1,FN2       set memory length
            ASSIGN    2,1,FN3       set I-O record count
            QUEUE     1             memory queue
            ENTER     1,P1          allocate memory
            DEPART    1
    EXEC    QUEUE     2             processor queue
            SEIZE     1             allocate processor
            DEPART    2
            ADVANCE   1,FN4         execute
            RELEASE   1             release processor
            TEST G    P2,0,DONE     job completed?
            QUEUE     3             disk queue
            SEIZE     2             allocate disk
            DEPART    3
            ADVANCE   1,FN5         read or write disk
            RELEASE   2             release disk
            ASSIGN    2-,1          decrement I-O record count
            TRANSFER  ,EXEC
    DONE    LEAVE     1,P1          release memory
            TERMINATE               job exits from system
A number of new GPSS features have been introduced into this example and need a few words of explanation. When we initially defined our model we gave a job five attributes: inter-arrival time, central memory requirement, I-O inter-request time, execution time, and I-O record length. It turns out to be easier to work with the number of I-O requests instead of execution time, letting the execution time be the sum of the I-O inter-request times. The transaction which represents jobs needs only two attributes since the job inter-arrival time is specified in the GENERATE block while the I-O inter-request time and I-O record length are specified in ADVANCE blocks. In addition to specifying the inter-arrival time the GENERATE block specifies the number of attributes for the generated transaction. The attributes are referenced by number. The two ASSIGN blocks following the GENERATE set the values of the job's two attributes. References to the current transaction's attributes in blocks other than ASSIGN use the notation Pi for the i-th attribute, as in the ENTER block which allocates an amount of storage equal to the value of the first attribute.

Queues, facilities, and storages are all referenced by number. Our model has three queues: a central memory queue (1), central processor queue (2), and a disk queue (3); two facilities: central processor (1) and disk (2); and one storage: central memory (1). After completing a disk input or output, the I-O record count, the second attribute of the job, is decremented by 1 and the job is routed to the processor queue.
The TEST block determines if the job has completed by testing the I-O record count to see if it is greater than zero; if not, the job is routed to location DONE which releases memory and terminates the job.

The job inter-arrival time, memory length, I-O record count, I-O inter-request time, and I-O record length are each defined by a different function, F1, ..., F5. These functions must be defined by function definition cards. Functions in GPSS are defined in tabular form and are considered as inverses of cumulative probability distributions. Each time a function is referenced, a uniformly distributed random number is generated and used as an argument. When a function value is needed in a block it is referenced by k,FNn and the value actually used is the product of k and the function value. Hence, 1,FN1 is simply the value of function number 1.

In using this model it is very easy to vary the input job characteristics by simply changing the definitions of the functions.
Thus, we can make a number of simulation runs (experiments) and see how the system performs for different typical jobs. We can also easily see how different hardware configurations affect performance. Each GPSS storage must be defined by a definition card which specifies its capacity. Thus, we can easily change the central memory size. We can also observe the effect of multiprocessing by changing the central processor from a facility to a storage whose capacity is the number of processors. The corresponding SEIZE and RELEASE blocks would also have to be changed to ENTER and LEAVE blocks.

The GPSS program can also be modified so that the simulations can be done with mixes of different job types.
Jobs are given an additional attribute which is their job type. Then when other attributes are generated or the job passes through ADVANCE blocks this new attribute is used to select the appropriate function, for example,

    ADVANCE   1,FN*3

computes the delay time by using the function specified by the third attribute.

We have been assuming that the various job attributes, such as I-O inter-request time, are defined by the same distribution throughout the entire time the job is in the system. If this is not true our GPSS program gets more complicated. The attributes of a job must be expanded to include the specification of each of the different distribution functions involved, the sequence in which they are used, and the time interval or other conditions which cause the shift from one distribution to the next. As the job progresses through the system its progress will have to be monitored to detect when to change distributions. This modification to our GPSS program is quite complicated.

The observant reader will have noticed that our simple model does not take into account any system overhead. This of course must also be included before our simulation results can possibly be a valid prediction of the system's performance. In some ways this is very easy, in other ways it is very difficult. The overhead resulting from the loader can easily be modeled by including an ADVANCE block at the point where memory is allocated. The loading time will be a function of the program size. Therefore, the memory length for the job should be specified as two numbers, program length and total length. If we wish to record statistics on loader overhead the new ADVANCE block will be bracketed by MARK and TABULATE blocks.
This simple modification assumes that the loader overhead is some simple, known function of program size. This is usually not the case. The function may not be simple and it is usually not known for an unimplemented system. For this reason the loader itself may need to be modeled and included in the simulation. This model will have to model the library search which loaders usually perform. This involves disk input, assumptions about the organization of the system library, and so forth. In addition, further job attributes will be required to specify the job's use of library procedures. As all of this is incorporated into the model of the system it rapidly grows quite complicated.
It should be clear that simulation is an extremely flexible and powerful tool. However, simulation models for a complex system are likely to be complex themselves. Thus, they are difficult to construct. However, none of the other prediction techniques seems to be capable of providing the kind of detailed performance information which the designer needs. Clearly, since simulation seems to be necessary to do the job, better techniques for building simulation models are required.
The special purpose simulation languages discussed earlier are attempts to provide the required improvements in model building.
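The two modifications discussed above, delays selected by job type and a loader overhead that is measured separately, can be sketched outside GPSS as follows. This is only an illustration in Python under stated assumptions: the delay functions, the linear form of `loader_time`, and all names and numbers are hypothetical, and each job is advanced on its own clock without any contention for resources.

```python
import random

# Hypothetical delay functions, one per job type. They stand in for the GPSS
# functions that "ADVANCE 1,FN*3" selects through the third attribute.
DELAY_FUNCTIONS = {
    "compile": lambda rng: rng.expovariate(1 / 40.0),   # mean 40 time units
    "compute": lambda rng: rng.expovariate(1 / 200.0),  # mean 200 time units
}


def loader_time(program_length):
    """Assumed loader overhead: a fixed cost plus a cost per word loaded."""
    return 5.0 + 0.01 * program_length


def run_job(job, rng, loader_table):
    """Advance one job through loading and a single processing step."""
    clock = 0.0
    mark = clock                                    # MARK: note the start time
    clock += loader_time(job["program_length"])     # ADVANCE for the loader
    loader_table.append(clock - mark)               # TABULATE: record overhead
    clock += DELAY_FUNCTIONS[job["job_type"]](rng)  # delay chosen by job type
    return clock


def simulate(jobs, seed=1):
    """Return the finishing time of each job and the tabulated loader times."""
    rng = random.Random(seed)
    loader_table = []
    finish_times = [run_job(job, rng, loader_table) for job in jobs]
    return finish_times, loader_table


# Memory length is given as two numbers, program length and total length.
JOBS = [
    {"job_type": "compile", "program_length": 1000, "total_length": 1500},
    {"job_type": "compute", "program_length": 2000, "total_length": 6000},
]
```

Replacing `loader_time` by a model of the library search is exactly the step that the text warns makes the simulation grow complicated.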
5. INTEGRATED PERFORMANCE PREDICTION, DESIGN, AND IMPLEMENTATION
It is not unusual for a complex system to be designed and implemented only to find that its performance does not even meet the minimum acceptable performance. This is largely due to the lack of any attempt by the designers and implementers to evaluate (predict) the performance of the proposed design. The solution to this problem seems to be to make performance evaluation an integral and continuing part of both the design and implementation of the system.
5.1. THE PROBLEMS WITH NON-INTEGRATED PREDICTION
There are many problems involved in evaluating the performance of a system design. However, the two critical problems seem to be the validity of the evaluation and the provision of timely performance information. We have seen that all performance evaluation requires a model. This model must faithfully represent the system actually being implemented. If it does not, the evaluation is apt to be misleading. In fact, if the designer modifies his design in response to these results it may well lead to performance degradation rather than improvement.
Even if the evaluation is valid, it is of little use if it is not available until after the system has been implemented. In fact, the sooner the evaluation is available the more likely it is that costly redesign and reimplementation will be avoided. A number of factors contribute to the lack of timeliness. Except for simulation, current evaluation techniques make little or no use of a computer. Most analysis is done by hand. Thus, any deep analysis takes a long time and the results are too late. Since evaluation is not automatic, it almost always has only second priority and is continually postponed because of the pressure resulting from over-optimistic schedules and deadlines. No easily accessible, central repository exists which contains all of the knowledge about the proposed system, both the software components and the hardware. Obtaining the information needed for evaluation may be difficult, or even impossible, resulting in a considerable delay in producing the desired results. Even though simulation usually uses the computer, a model of the proposed system has to be coded and debugged in some language which is different from that being used to specify the design and implementation. The process of interpreting the written documentation, designing the model, coding it, and debugging it is a major project of long duration. By the time this project has been completed the proposed system design will either have changed significantly or already have been implemented.
Validity is an even more serious problem. Since use of existing evaluation techniques requires considerable time and effort it is usually not practical for the designer to do the evaluation. Thus, the design specifications must be interpreted by someone other than the designer. Any interpretation by someone other than the designer is open to question, principally because of a lack of precision and uniformity in the specification. Another factor which makes the validity of an evaluation questionable, especially simulation, is that in abstracting to a model it is very difficult, and frequently even impossible, to identify the significant variables. If any of these are omitted from the model the results will be invalid.
Since all existing evaluation techniques require a model which is separate from both the design specification and the implementation, changes in either may not get reflected in the model. Minor software or hardware changes may have an effect which, when propagated throughout the system design, significantly affects performance.
It is difficult to prevent the model being used in evaluation from drifting away from the system actually being implemented when this model's description is separate from the implementation description.
5.2. SINGLE LANGUAGE APPROACH
A system which integrates design specification, implementation, and evaluation has been proposed [16] and a pilot version has been implemented [21]. This system is called DES (Design and Evaluation System). The two most significant features of DES are a single high level language, which is used for both design specification and implementation, and a single data base containing all known information about the proposed system, both software and hardware. In a sense DES is a combined management information system, simulator, and compiler. The DES language is an extension of PL/I, the extensions making it into a special purpose simulation language.
The key idea in DES is to use a single language to describe the proposed system at all stages of its design and implementation. This evolving source language description of the proposed system is used as direct input to the analysis and simulation routines. The initial sketch of the proposed system's structure and data bases, which is the gross design specification, evolves into a final, detailed implementation specification which can be compiled into executable object code. As soon as any part of the object system is specified some evaluation information is available. As the design becomes more detailed this information becomes more precise. Thus, a fairly detailed and precise picture of the proposed system's performance is developed before it is completely implemented.
The central data base for the proposed system contains a description of both the hardware and the software. The hardware description includes the memory size, instruction and cycle times, standard configurations, and device descriptions. A device description specifies the properties of the device which influence its behavior, such as seek time, latency, transfer time, and number of access paths. The software description includes both procedure and data components. A procedure component description identifies its entry points and a description of the corresponding arguments (data type, structure, etc.), names of external data components and procedures which it references, its resource requirements, and so forth. A data component description includes information on its structure, the data type of its elements, the way or ways it will be accessed, its average and maximum size, and so forth.
As soon as any part of the proposed system design is known it is expressed in the DES language and entered into the central data base. Initially this information may be no more than component names and types (procedure, data, or hardware). As the design progresses the designer gradually fills in additional information until the central data base contains a complete description of all components in the proposed system. The evaluation routines in DES give performance information consistent with the degree of detail and completeness of the component specifications.
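The kind of record the central data base holds, and the way a description starts as little more than a name and is filled in later, can be pictured with the following Python sketch. It is not the DES implementation; the class names, field names and the `completeness` measure are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Device:
    name: str
    seek_time: Optional[float] = None       # all times in one common unit
    latency: Optional[float] = None
    transfer_time: Optional[float] = None
    access_paths: Optional[int] = None


@dataclass
class Procedure:
    name: str
    entry_points: list = field(default_factory=list)
    references: list = field(default_factory=list)  # external names it uses
    estimated_size: Optional[int] = None


@dataclass
class DataComponent:
    name: str
    structure: Optional[str] = None         # e.g. "queue" or "table"
    access: Optional[str] = None            # e.g. "fifo" or "key"
    average_size: Optional[int] = None
    maximum_size: Optional[int] = None


def completeness(component):
    """Fraction of descriptive fields (everything but the name) filled in."""
    values = [v for k, v in vars(component).items() if k != "name"]
    filled = [v for v in values if v not in (None, [], "")]
    return len(filled) / len(values)


class CentralDataBase:
    """Holds every known component description, keyed by component name."""

    def __init__(self):
        self.components = {}

    def enter(self, component):
        self.components[component.name] = component

    def users_of(self, name):
        """Names of the procedures whose descriptions reference `name`."""
        return sorted(c.name for c in self.components.values()
                      if isinstance(c, Procedure) and name in c.references)
```

A component entered with only its name has completeness 0, so an evaluation routine could use this value to decide how much weight its estimates deserve.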
Whenever a change is made in the specification of a component, DES automatically propagates this information throughout all components which are affected by the change and the persons responsible for these components are notified that there has been a change.
The DES language is an extension of the implementation language, in this case PL/I, with additional statements which allow the designer to express the design at whatever level of detail he desires. This allows the total system design to be captured in a processable format beginning with the initial design phase. The intent of these extensions is to make it possible for the designer to sketch his initial design in the extensions with little or no use of the standard PL/I statements or declarations. As the design progresses the designer fills in missing parameters in the statements of the extended language, inserts additional PL/I statements, and completes the data descriptions in the object system data base. Each iteration of a component's design is automatically combined with all others to ensure that the total system is consistent at all times. Variations in the level of detail between components and within a single component can be noted for project control, but do not prevent evaluation of parts or the whole at any time.
Three types of language elements are defined.
The first is a data structure description which allows declaration of generalized data structures such as queues and tables. For example, the statement,
    $dcls 1 d_free(queue,fifo);
declares a local data structure, d_free, which is a queue with fifo access characteristics. The description of the data items within an individual queue entry can be added when its detailed description is known. The statement,
    $dclg frlist(table,key);
directs DES to include in the source text a data declaration which is stored in the central data base. It further indicates that the declaration is that of a table to be accessed by a key.
The second type of language element is used to specify conceptual operations, such as create, find and insert, on the generalized data structures. The statement,
    $find frlist;
indicates a search of the structure frlist to locate an element. The statement,
    $insert d_free;
specifies the insertion of an element into the structure d_free.
The third type of language element is used to indicate the use of system resources such as input or output devices, memory, and central processor utilization. The statement,
    $read(disk);
indicates a read operation on a disk device. The statement,
    $process(1000);
indicates the use of the central processor for 1000 time units.
The following example shows how these language elements can be used to describe a basic system function:
    get_element: proc;
        $dcls 1 d_free(queue,fifo);
        $dclg frlist(table,key);
        $find d_free;
        $process(100);
        $insert frlist;
    end;
In this example an element on the queue d_free is located, an estimated amount of processing is performed, and an element is stored in the table frlist.
5.3. INTERACTION WITH THE DESIGNER-IMPLEMENTER
There are three major phases in the evaluation analysis performed by DES. The first phase analyzes each procedure component individually. Certain static information is output from this phase, such as the estimated size of the procedure, a list of external references, and a list of interface violations. However, the principal output is a directed graph model of the procedure. This model is similar to the one described in section 2.1.2. This model has been reduced as much as possible using the techniques discussed in section 3.3. In constructing this model execution times and other timing information are calculated from the hardware description which is contained in the central data base. These computations take into account the structure of data which is accessed as well as the operations performed on the data.
The second and third phases of evaluation demand interaction with the designer (who should also be the implementer).
The second phase consists of exercising a component model interactively with the designer to ascertain which of the variables remaining in the model are significant. This exercising may require some simulation of the component. In the course of this analysis the designer supplies additional information, such as the distribution of the values of the variables in the model and the probabilities of various branches. The result of this analysis is a further simplified model.
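Once the designer has supplied branch probabilities, a component model of this kind can be reduced to a single expected execution time. The Python sketch below shows one way to do so for a loop-free directed graph; the graph, the node times and the 10% failure probability are invented for illustration and do not come from DES.

```python
from functools import lru_cache

# Hypothetical model of one procedure:
#   node -> (execution time, [(successor, branch probability), ...])
MODEL = {
    "entry":   (0.0,   [("find", 1.0)]),
    "find":    (30.0,  [("process", 0.9), ("exit", 0.1)]),  # 10% find nothing
    "process": (100.0, [("insert", 1.0)]),
    "insert":  (20.0,  [("exit", 1.0)]),
    "exit":    (0.0,   []),
}


def expected_time(model, start="entry"):
    """Expected time from `start` to the end of an acyclic graph model."""

    @lru_cache(maxsize=None)
    def visit(node):
        time, branches = model[node]
        # Own time plus the probability-weighted time of each successor.
        return time + sum(p * visit(succ) for succ, p in branches)

    return visit(start)
```

Loops would first have to be replaced by a node carrying their expected total time, which is one of the reductions a graph model has to undergo before this calculation applies.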
The third phase of the evaluation is simulation of the entire system. The model of the system is the collection of component models produced by the first two phases of the evaluation. DES provides an easy way of specifying input job mixes for the simulation runs. Each typical job is programmed in the DES language using actual calls to the proposed system. These programs are then subjected to the same analysis that is applied to the system components. The result is a set of models, one for each typical job. These models can be combined with the models of the system components for simulation runs.
This results in a very flexible way of simulating the system's performance for differing job mixes.
5.4. AIDS TO PROJECT MANAGEMENT
Although not directly part of performance prediction, the DES approach provides a number of useful aids to project management. The existence of the central data base and the ability to express the early design in machine processable form certainly aids documentation. By controlling access to the central data base, unauthorized changes in the global data bases or interfaces of the proposed system can be prevented. Since the DES analysis routines and the compiler, which will ultimately produce object code for the implemented system, both refer to the central data base for component descriptions, constraints on the use of certain language features, hardware devices, and software components can be continuously enforced.
Utilizing the information in the central data base, periodic reports on the status of the project can be produced. Information in such a report includes,
-- a list of all procedures called and global data referenced by each procedure in the system
-- estimates of the memory and other resource requirements
-- indicators of progress, such as the frequency of component updates, the date of the last update, and the ratio of execution time specified by process statements to execution time resulting from other statements
-- a list of all recent changes to interfaces and the components affected
-- a list of all inconsistencies and other constraint violations
By itself, this information is inconclusive as to the state of system development. However, when the project manager combines this information with his own knowledge of the development effort within his department it can give him a much more accurate and complete picture of his project than has usually been the case in the past.
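Two of the report items listed in section 5.4, the progress ratio and the list of inconsistencies, lend themselves to a short illustration. The Python sketch below assumes a much simplified data base in which every component is a dictionary; the field names `process_time`, `other_time`, `references` and `last_update` are hypothetical and are not taken from DES.

```python
def status_report(components):
    """Build a per-component status summary.

    `components` maps a component name to a dict with the assumed fields
    'process_time' (time given by $process estimates), 'other_time' (time
    derived from all other statements), 'references' and 'last_update'.
    """
    report = {}
    for name, c in sorted(components.items()):
        other = c["other_time"]
        report[name] = {
            "last_update": c["last_update"],
            # Ratio of estimated to derived execution time; it should fall
            # as real statements replace the $process placeholders.
            "process_to_other": c["process_time"] / other if other else None,
            # References to components that nobody has entered yet.
            "unresolved": sorted(r for r in c["references"]
                                 if r not in components),
        }
    return report
```

A high ratio flags a component whose timing still rests mostly on estimates, and a non-empty `unresolved` list is one simple kind of inconsistency a report could show.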
6. REFERENCES
1. Crooke, S.; Minker, J.; Yeh, J.: Key Word in Context Index and Bibliography on Computer Systems Evaluation Techniques. Technical Report TR-146, Computer Science Center, University of Maryland, College Park, Maryland (January 1971).
2. Lucas, H.C. Jr.: Performance Evaluation and Monitoring. Computing Surveys 3, 79-91 (September 1971).
3. Hart, L.E.: The User's Guide to Evaluation Products. Datamation, 32-35 (December 15, 1970).
4. Kleinrock, L.: Time-Shared Systems: A Theoretical Treatment. J. ACM 14, 242-261 (April 1967).
5. Estrin, G.; Kleinrock, L.: Measures, Models and Measurements for Time-Shared Computer Utilities. Proc. ACM National Meeting 1967, 85-96.
6. Proceedings of the Third Symposium on Operating System Principles (held at Stanford University). ACM, New York (October 1971).
7. Proceedings of the SIGOPS Workshop on System Performance Evaluation (held at Harvard University). ACM, New York (April 1971).
8. McKinney, J.M.: A Survey of Analytical Time-Sharing Models. Computing Surveys 2, 105-116 (June 1969).
9. Beizer, B.: Analytical Techniques for the Statistical Evaluation of Program Running Time. Proc. FJCC 1970, 519-524.
10. Ramamoorthy, C.V.: Analysis of Graphs by Connectivity Considerations. J. ACM 13, 211-222 (April 1966).
11. Lowe, T.C.: Analysis of Boolean Program Models for Time-Shared, Paged Environments. C. ACM 12, 199-205 (April 1969).
12. Allen, F.E.: Control Flow Analysis. Proc. SIGPLAN Symp. Compiler Optimization (held at the University of Illinois), ACM, New York, 1-19 (July 1970).
13. Allen, F.E.: Program Optimization. Annual Review in Automatic Programming, Vol. 5, Pergamon, New York, 239-307 (1969).
14. Russel, E.C.; Estrin, G.: Measurement Based Automatic Analysis of FORTRAN Programs. Proc. SJCC 1969, 723-732.
15. Patil, S.S.: Coordination of Asynchronous Events. Project MAC Technical Report TR-72, MIT, Cambridge, Massachusetts (June 1970).
16. Graham, R.M.; Clancy, G.J. Jr.; Devaney, D.B.: A Software Design and Evaluation System. Proc. SIGOPS Workshop on System Performance Evaluation (held at Harvard University), ACM, New York, 200-213 (April 1971).
17. MacDougall, M.H.: Computer System Simulation: An Introduction. Computing Surveys 2, 191-209 (September 1970).
18. Seaman, P.H.; Soucy, R.C.: Simulating Operating Systems. IBM Systems Journal 8, 264-279.
19. Gordon, G.: System Simulation. Prentice-Hall (1969).
20. Kivat, P.J.: Simulation Languages. Appendix C of: Naylor, T.H.: Computer Simulation Experiments with Models of Economic Systems. John Wiley (1971).
21. Carlson, B.: Forthcoming MS Thesis, Department of Electrical Engineering, MIT.
CHAPTER 4.D.
PERFORMANCE MEASUREMENT
C. C. Gotlieb
Department of Computer Science, University of Toronto, Canada

1. INTRODUCTION
Performance measurements are needed when:
(1) installing a new computing system
(2) changing the configuration or "tuning" it to improve throughput
(3) comparing systems to determine technological improvements, economies of scale and cost/benefit ratios
The available techniques are to:
1) establish a figure of merit based on component ratings
2) run a set of "kernel", "benchmark" or synthetic problems
3) make observations and measurements by using
   (i) hardware instrumentation
   (ii) software monitors
4) model the system either
   (i) analytically, or
   (ii) by simulation.
Modeling and simulation are often the only tools available during the design and planning stages. They are also useful in identifying the important parameters (see Graham). The first three techniques are more often used in evaluating existing systems and alternative configurations. We concentrate on these.

2. FIGURES OF MERIT
The cost should be an overall measure of performance. In computing, the economic principle known as "economy of scale" (which states that large production units and processes are more efficient than small ones) finds its expression as Grosch's Law. According to this:
$$C = K\sqrt{E}$$
where C is the cost
      E is the effectiveness measured in speed, throughput, etc.
      K is a constant.
If we assume that $S \propto C$ and $CPU \propto C$, where S = storage capability and CPU = central processor speed, and further that $E \propto S \cdot CPU$, Grosch's Law follows, since then $E \propto C^2$. Simple as it is, it seems to be an observably confirmed relation (Solomon 1966). Generally we want some measure of effectiveness that is related to ability to process jobs, and in any case, when alternative systems are being considered it is usual to compare systems of equal cost in order to eliminate this factor.
One approach is to define machine features by associating, with each feature, a number of attributes and attaching a weight to each attribute. The figure of merit is calculated as a weighted sum of the features. The example in Table 1 is given by Sharpe (Sec. 9.4).

Table 1   Features for Evaluating a Computer System

Feature                          No. of attributes   Weight of attributes
Hardware                                38                  0.27
Supervisor                              18                  0.27
Data management                          8                  0.08
Language processors                     31                  0.16
Programming support                      4                  0.02
Conversion difficulty                    8                  0.12
Vendor reliability and support          16                  0.08
                                                            1.00

The objection to this is that the choice of weights (usually arrived at by a group of experts) is inevitably arbitrary and the method therefore has limited credibility. A somewhat more objective approach is to weight the various types of machine instructions and compute an overall weighted instruction time. The weights are determined by analyzing typical problems, and to allow for the difference between scientific and data processing applications different sets of factors are produced for each. Table 2 shows examples of sets of weights. (See also Solomon 1966).
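As a small worked illustration of the weighted instruction time, the following Python sketch applies the scientific weights, as they appear to read in Table 2, to a set of instruction times. The instruction times are invented for the example and describe no real machine; only the weights come from the table, and their assignment to rows is an assumption where the printed table is hard to read.

```python
# Scientific weights as they appear to read in Table 2 (they sum to 1.000).
SCIENTIFIC_MIX = {
    "floating add": 0.095,
    "multiply": 0.056,
    "divide": 0.020,
    "load/store": 0.285,
    "indexing": 0.225,
    "conditional branch": 0.132,
    "miscellaneous": 0.187,
}

# Hypothetical instruction times in microseconds, not taken from any machine.
TIMES_US = {
    "floating add": 2.0,
    "multiply": 4.0,
    "divide": 7.0,
    "load/store": 1.0,
    "indexing": 0.5,
    "conditional branch": 1.0,
    "miscellaneous": 1.5,
}


def weighted_time(weights, times):
    """Weighted mean of per-instruction times; the weights must sum to 1."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("weights sum to %r, expected 1" % total)
    return sum(w * times[kind] for kind, w in weights.items())


overall = weighted_time(SCIENTIFIC_MIX, TIMES_US)
```

The same function computes a feature-based figure of merit if the weights are those of Table 1 and the times are replaced by scores for each feature, which also makes plain how strongly the result depends on the chosen weights.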
Drummond (1966) suggests the maximum storage bus rate (MSBR) as a merit figure.
$$\text{MSBR} = \frac{\text{data length} \times \text{degree of interleave}}{\text{storage cycle time}}$$

Table 2   Instruction Mix Weights

Instruction Type                         Scientific Weight   Commercial Weight
Fixed add (subtract) and compare                                  .25
Floating add                                 .095                 0
Multiply                                     .056                 .01
Divide                                       .020                 0
Load/store                                   .285
Indexing                                     .225
Conditional branch                           .132
Miscellaneous                                .187                 .74
                                            1.000                1.00

Sources: Arbuckle 1966; K. E. Knight, A Study of Technological Innovation, Ph.D. Thesis, Carnegie Inst. of Technology, 1963.

Merit figures determined by calculating weighted means of the instruction times are too simple; they do not take into account such important factors as word length, I/O rates, channel speeds, overlapping, buffer stores etc. It is possible to devise much more complicated figures of merit in which these factors are included and Knight and others have done this (see Knight 1968 and Sharpe Ch. 9, Section D). For example Knight defines:
Computing power = Memory factor x Operations per second
$$\text{Memory factor} = \frac{[(L-7)\,N\,(WF)]^{P}}{K}$$
where K  = a constant
      L  = word length (in bits)
      N  = no. of words in high speed memory
      WF = 1 for a fixed word length memory, 2 for a variable length memory
      P  = 0.5 for scientific computation, 0.333 for commercial computation
$$\text{Operations per second} = \frac{10^{12}}{t_c + t_{I/O}}$$
where $t_c$ = time in μs for one million operations
      $t_{I/O}$ = non overlapped I/O time (in μs) for one million operations (determined from channel width, transfer rate, start, stop or rewind times etc.)
It is clear that even these more complex ratings do not include factors which are important in time-shared, multiprogrammed or highly parallel systems, and for this reason such formulae are not applicable now. They have been used however to study the effects of technological innovations over time both for computing systems as a whole, and for subsystem components. (See Knight 1968, Sharpe Ch. 9, Harman 1971 and Solomon 1966).

Table 3   Observations on Kernel Problems

       Job #   Step #   Core(KB)    370     370     360     360     360
                                    ERT*    CPU**   ERT     CPU     I/O
  1      1       1        100       212      40     278     100       0
  2      1       2         96        36       5      42      10     350
  3      2       1        130        34      30     115     113     134
  4      3       1        200        21       2      17       4       0
  5      3       2         96        24       3      27       6       0
  6      3       3        200        76      21      69      50     109
  7      4       1        100         6       4      18      16       0
  8      4       2         96        12       2      14       3       0
  9      4       3         76        59      58     294     293     120
 10      5       1        140        21      18      69      66     195

*  ERT is the Expected Run Time, computed by adding a fixed time cost for each I/O interrupt issued during the job step.
** in units of .01 minutes
From C.A. Ford, A report on CUC/UTCC Pricing Data, Computer Centre, University of Toronto, January 1972.

3. KERNELS, BENCHMARKS AND SYNTHETIC PROGRAMS
A kernel is a representative program which has been partially or completely coded and timed (Arbuckle 1966, Calengaert 1967, Lucas 1971). The programs may be short or extensive and the timing is often based on manufacturer-provided data or machine characteristics.
A widely quoted set of kernels is described by Auerbach in the EDP Reports (See System Performance Charts, and also Hillegass, 1966). The problems used are:
    Updating sequential files
    Updating files stored on a random access disk storage
    Sorting
    Matrix inversion
    Polynomial evaluation
To achieve useful comparisons the parameters of the problem are carefully specified (size and number of records, activity factor, etc.) and the machines are standardized (core size, number of channels, tape drives, etc.). On the other hand file arrangements and detailed coding methods are left flexible so that advantage can be taken of the special characteristics of each machine. The results are displayed in a series of charts, e.g. "Time to Process a Master File of 10,000 Records" vs Activity Factor, and vs "Average System Rental/Month".
It is necessary to accept the result of comparisons based on kernel calculations (or runs) with caution.
- there is no agreement about the relative importance of kernels, i.e. how frequently they arise or what weights should be attached to them
- the results are dependent on the quality of the programming as well as on the system
- important factors such as I/O considerations, overlapping operations and software overhead are usually omitted since these are difficult to predict and require actual operation within context.
In spite of these reservations, kernels can be very useful, especially when comparing configurations which are not too different. Table 3 shows excerpts from a comparison of an IBM 360/65 with a 370/165, based on 36 distinct computer jobs (54 job steps) run with the same operating system on the two machines. The study from which these results are taken was made to determine the ratio of the computer speeds and to compare the cost of running a job on the 165 with that of running it on the 65, using an agreed-upon pricing formula in each case. For the jobs run, the ratio 360 CPU time/370 CPU time was 3.67 and the 360 cost/370 cost was 980.65/639.20 or 1.5 as compared with the 1.4 which was intended.
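The arithmetic behind these two ratios is simple enough to reproduce. The Python sketch below uses only the ten job steps excerpted in Table 3, on the assumption that the table's columns read as 370 CPU and 360 CPU times in units of .01 minutes; since the study itself covered 54 job steps, the excerpt need not reproduce the reported 3.67 exactly.

```python
# CPU times (units of .01 minutes) for the ten job steps excerpted in Table 3.
# The assignment of columns to machines is an assumption about the table.
CPU_370 = [40, 5, 30, 2, 3, 21, 4, 2, 58, 18]
CPU_360 = [100, 10, 113, 4, 6, 50, 16, 3, 293, 66]


def total_ratio(slower, faster):
    """Ratio of total times, so that long job steps count for more."""
    return sum(slower) / sum(faster)


# Speed ratio over the excerpt; the text reports 3.67 over all 54 job steps.
cpu_ratio = total_ratio(CPU_360, CPU_370)

# Cost ratio from the two totals quoted in the text.
cost_ratio = 980.65 / 639.20
```

For the excerpt the CPU ratio comes out a little above 3.6, and the cost ratio is about 1.53, which the text rounds to 1.5.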
A benchmark is an existing program that is coded in a specific language and executed on the machine being evaluated (Lucas 1971). With a benchmark the complete software system is used, and it is possible to evaluate factors other than job time, e.g. compile and execute speeds, turn-around, diagnostics etc. This method of evaluation is widely used in competitive bidding situations, where government regulations require open and objective tests. It is also used by manufacturers in introducing new computers so that their customers can compare the new configurations with the old. For example, on the basis of benchmark runs, IBM quote a speed advantage of 2-5 for the 370/165 as compared with the 360/65.
There are still two serious reservations about the use of benchmarks in evaluating systems. It is very difficult to assess the relative importance of different problems, unless the computer will be mainly dedicated to one application. Even more important, however, the performance of a particular program or a particular system may be limited because of some bottleneck which is not obvious (lack of core storage, a channel or disk contention, etc.). Thus speed may be improved dramatically by some relatively minor change in the system, and the conditions under which benchmarks are run do not allow one to know whether this is possible. While one can be sure that performance comparisons provide information about the systems actually run, they do not necessarily reflect how the systems would compare in a well run environment where at least the obvious bottlenecks have been eliminated.
Synthetic programs are used to validate the operation of a system by exercising as many component functions as possible, or by subjecting it to extremum conditions. Such programs have long been used by hardware engineers during design and maintenance, and they are now commonly used for software as well. OS/360 for example includes a set of jobs which may be run after system generation, and most commercial software packages include similar tests.
Synthetic programs can be used to test any phase of system operation (see Lucas, Table II). They are in fact very much like kernel and benchmark programs, and also like software monitors. Their disadvantage is that they have to be specially written for the system on hand, often in assembly language. Their greatest value is in conjunction with hardware or software monitors.
All of the methods described here, along with modelling, come into use in computer selection. In a survey of 69 installations reported by Schneidewind (1967), the relative use of evaluation methods in computer selection was given as follows:
1.
Use of benchmark problems
2.
Published
3.
Use o f k e r n e l
4.
Computer s i m u l a t i o n
5.
Mathematical
Kernels,
hardware and s o f t w a r e
COLLECTION
mated j o b
times,
systems or adequate f o r For t h i s
it
is
lines
printed,
capa-
determining
how
n e c e s s a r y to t a k e a more
approach and go to d e t a i l e d
in
listing
quantities
Statistics
e l a p s e d times
the r u n - t i m e
measurement and ob-
which might
options
turn-around
for
job
selected, time,
be m o n i t o r e d
can be g a t h e r e d at t h r e e
steps,
called
compilation,
core used,
priorities
levels:
cards
in,
esti-
execution
read and punched,
selected,
cost,
diagnostics
in
the
system
I/0
activity,
overlapped the
in t h e m s e l v e s
- here we can measure the programs
etc., called
are not
AND ANALYSIS
system.
l~el
job
programs
system components.
no d i f f i c u l t y
in a computing user
7 %
can be i n c r e a s e d . engineering
4.
the
16 %
analysing
on i n d i v i d u a l
There is
52 %
and s y n t h e t i c
servation
DATA
64 %
modelling
b l e of q u a n t i t a t i v e l y effectiveness
reports
problems
benchmarks
analytical,
61%
level
here we measure r e s o u r c e
j o b and system q u e u e l e n g t h s
level
resource
user enquiries,
-
quantities.
operator
and c o m p l a i n t s ,
traffic
channel times,
and
various
actions
and f l o w s ,
and i n t e r v e n t i o n s ,
others
are c a l c u l a t e d
They are suggested from a n a l y t i c a l from o b s e r v a t i o n
movements and c o n s o l e
lights,
to be i m p o r t a n t .
service
c o s t and income s t a t i s t i c s .
are observed d i r e c t l y ;
models of the system, are l i k e l y
and s e r v i c e
here we measure j o b
allocation,
requests
Many of t h e s e q u a n t i t i e s or d e r i v e d
allocation,
activities
installation
utilisation,
-
of i n p u t
and from r e f l e c t i o n The d i f f i c u l t y
stations,
and s i m u l a t i o n disk-arm
on what p a r a m e t e r s
comes in choosing from this large list of possibilities, in deciding which tools to use, how frequently to collect data (continually, at intervals, upon request, under extreme conditions), how to display and store the data, and, most of all, in knowing what kind of analysis to do.
The two g e n e r a l their
c l a s s e s of m o n i t o r s ,
own a d v a n t a g e s .
do r e q u i r e
the s e r v i c e s
tors
or s e l e c t i v e l y
as d e s i r e d .
operation, cated
They i n t e r f e r e
and may r e q u i r e
to them.
Probes
-
data i s
components
the c o n t r o l
-
of the o t h e r
quantities
selection
with
resources
the
be a l l o and
simultaneously.
are common to both
types.
which are i n s e r t e d
unit
either
-
this
The
These i n -
at points
where
retained
HARDWARE
MONITORS
The e a r l i e s t
devices
is
were o u t g r o w t h s
ready f o r
s t a n d a r d 60 cps c l o c k because i t s
available
resolution
in t e n s - o f - m i c r o s e c o n d s ,
in a l l
in the
of a d a t a
may take in
buffer
output.
systems - o s c i l l i s c o p e s ,
At the extreme ends of s i m p l i c i t y
the p r o g r a m - a c c e s s i b l e hardware c l o c k
directly
the o u t p u t
of the equipment used by e n g i n e e r s
and development of computing
and c o u n t e r s .
the o u t p u t
software monitors it
or d i r e c -
by the system on the o c c u r -
processes the c o n t e n t s until
the a c t i v i t i e s
programmed p r o c e d u r e s ,
and r e c o r d s for
as needed
and s y n c h r o n i z e s
through
or a u t o m a t i c a l l y
displays
the form of a program which
in the design
or c o n v e r s i o n
the system which d i r e c t s components,
which the data i s
toring
or o t h e r
can be c a l c u l a t e d
integration
case of the hardware m o n i t o r ;
ters
at l e a s t ,
to
can be d i s p l a y e d more i m a g i n a t i v e l y
of a m o n i t o r
t i o n s a p p l i e d by the o p e r a t o r rence o f c e r t a i n e v e n t s
54
tape u n i t s
(such
accessible
- a d e v i c e or program which r e c e i v e s data from a s e t of
applying
an o u t p u t
to some e x t e n t ,
d e v i c e s or p r o g r a m - i n t e r r u p t s to be g a t h e r e d
an a n a l y z e r probes,
that
The o b s e r v a t i o n s
dependent and r e l a t e d essential clude:
S o f t w a r e moni-
and can be used to observe system f u n c t i o n s
as q u e u e l e n g t h s and program usage) which are not at a l l hardware m o n i t o r s .
but t h e y
They impose no system overhead and can
be used c o n t i n u o u s l y
are more v e r s a t i l e
are easy to a t t a c h ,
of a maintenance e n g i n e e r and are more l i m i t e d
in the ways t h e y can be used, therefore
hardware and s o f t w a r e each have
Hardware m o n i t o r s
and a f u l l
and c o m p l e x i t y are
s c a l e computer.
The
systems i s not adequate f o r
i s not high enough.
or even s m a l l e r u n i t s
me-
A clock
moni-
which counts
of t i m e is needed.
5.1. ONE COMPUTER MONITORING ANOTHER

There are many examples of one computer being used to monitor another. Table IV lists some cases reported in the literature.

Table IV - One Computer Monitoring Another

Primary Machine   Monitoring Machine     Environment               Reference
IBM 7090          IBM 7044                                         Conte 1964
UNIVAC 1108       UNIVAC 1108                                      MacGowan 1970
CDC 6600          Peripheral processor   Lawrence Radiation Lab.   Stevens 1968
Variable          SNUPER                 UCLA                      Estrin et al. 1967
GE 645            PDP-8                  MULTICS                   Saltzer and Gintell 1970

Clearly this technique permits the full power of the monitoring computer to be used for collecting, recording, reducing and analysing data. Although such dual systems allow the primary machine to be operated with minimal interference, a special interface (e.g. a channel-to-channel connection) must be designed. If the monitor is fast enough most of the data can be evaluated as soon as it is collected. If it is not, it is necessary to halt the monitored system until the processing catches up (at some cost in elapsed testing time), or else to provide buffers and intermediate storage. With a computer as monitor, data gathered from the test system can be fed back to the primary computer, providing two-way operations. The disadvantage of having a computer as monitor is, of course, the extra cost, which is prohibitive except under research, as opposed to operational, conditions.

5.2. MONITOR LOGIC

The basic monitor device is an event counter. This is essentially an "and" gate which allows a clock pulse through to a counter when a register which is being monitored records the event sought (Fig.1). The high impedance probe buffers (isolates) the monitor circuit from the system being monitored. With a more elaborate control unit it is possible
[Figure 1: Event counter]
[Figure 2: Overlapping events (any channel busy and CPU idle)]
[Figure 3: Regional execution (CPU executing code residing in region L+1 to U-1)]
[Figure 4: Monitor with pen-and-ink recorder output]
to r e c o g n i z e when c e r t a i n
instructions
lapping
events,
etc.
(Fig.2).
storage
protect
bits
it
certain
regions
of the s t o r e
is
By a t t a c h i n g
possible
reserved
for
if
measure the time part
of s t o r e .
for
the c o u n t e r circuit
EXAMPLES
OF
special
strip
CURRENTLY
computers,
have been marketed present-day
out of any
HARDWARE
1967),
The u s e f u l n e s s
design.
the M u l t i c s
(SUM)
instrumentation hardware m o n i t o r s availability,
briefly.
Manufactured
by Computer Syne-
to market hardware m o n i t o r s . The c o u n t i n g
Model rate
Any one of them can be d i s p l a y e d ,
r e c o r d e d on m a g n e t i c
probes and i n p u t
cables)
tape.
The whole system ( e x c e p t f o r
is mounted in a s i n g l e
Boole and Babbage Hardware M o n i t o r of:
Examples of such
the current
are d e s c r i b e d
independent counters.
1MHZ.
of t h e s e
have been b u i l t
systems where the c o n f i g u r a t i o n
To i n d i c a t e
Monitor
consisting
who used
1965) and the m o n i t o r
Recently self-contained
and t h e y are a l l
packaged d e v i c e s ,
points
MONITORS
complexities
in the i n i t i a l
commercially.
1 KHZ to
(Apple
or v a r i o u s
hardware m o n i t o r s
from
from s e v e r a l
by the m a n u f a c t u r e r s
IBM produced.
(Schulman
16 s i x - d e c i m a l
can be r e p l a c e d
on a meter or r e -
Examples are the Basic Counter
company was the f i r s t
SM-416 p r o v i d e s
(2)
to
and t h e r e s u l t s
the counter displayed
time-sharing
1970).
The System U t i l i z a t i o n This
all
devices
determined
and G i n t e l l ,
can be v a r i e d
possible
or loaned them to customers where t h e r e
analysis.
particularly
are TS/SPAR -
Inc.
is
instructors
The o u t p u t
were c o n s t r u c t e d
monitoring
(Saltzer
tics
AVAILABLE
by Bonner ( 1 9 6 9 ) ,
monitors
(I)
is
of s t o r e
on the same c h a r t .
configurations
was n o t c o m p l e t e l y
four
Instead
(BCU), the Machine Usage Recorder
was such t h a t into
it
sampled p e r i o d i c a l l y
recorder.
simultaneously
the m o n i t o r s
described
it
from the p a r t
has c o m p a r a t o r s ,
and the r e s u l t s
was some problem r e q u i r i n g Unit
is
subsequent a n a l y s i s .
can be p r e s e n t e d
them in
unit
spent by the CPU in e x e c u t i n g
corded on a p e n - a n d - i n k
Initially
In p a r t i c u l a r
executing
(Fig.3).
by an i n t e g r a t i n g
5.3.
is
in
system and thus measure system o v e r h e a d .
the m o n i t o r
As shown in F i g . l recorded
a decoder n e t w o r k to the
are b e i n g e x e c u t e d .
to r e c o r d when the computer the o p e r a t i n g
record over-
to r e c o g n i z e when i n s t r u c t i o n s
possible
Alternatively,
are e n c o u n t e r e d ,
Units
chassis.
- These are s e p a r a t e l y
Event M o n i t o r
- six
counters
- 104 t
106 c o u n t s / s e c
- removable
logic
plugboard Measurement Probe Measurement P r i n t e r M a g n e t i c Tape U n i t
- records - for
Trend R e c o r d e r - p l o t s
data d i g i t a l l y
System A c t i v i t y
Meter.
This
165 (IBM 370/65 F u n c t i o n a l A switch
allows
(I)
I/0
- I/0
(3)
I/0
and Compute (4)
(7)
Compute Problem
A counter (4)
or s t r i p
University
This
is
Characteristics
(e.g. Off
recorder
of T o r o n t o
It
recorder, the c o s t practical
a signal
a general to
the m o n i t o r
output ANALYSIS
OF O U T P U T
of how the r e s u l t s Analysis
spent
compared w i t h ber of
built
Compute T o t a l
each a d d r e s s ) . plugboard,
An i m p o r t a n t
1971)
Fig.
is
that
to the c o m p u t e r ,
operations.
of
An address
a 6-channel
feature
than $ 5 each)
attached
normal
(Milandre
at the U n i v e r s i t y
for
a logical
etc.
that
it
is
and use
5 shows a
recorder.
OF H A R D W A R E M O N I T O R S
o f hardware m o n i t o r s in
improving system
of core s t o r i n g
with
some i l l u s t r a t i o n s
system p e r f o r m a n c e . (Bonner 1969).
The CPU time
the message p r o c e s s i n g
t h e t i m e used e l s e w h e r e . it
(6)
(HARDMON I I )
enough ( l e s s
of a t e l e c o m m u n i c a t i o n
inquiries
out d e g r a d i n g (b)
comparator,
were u s e f u l
in the p o r t i o n
I/0
I)
(20 are r e q u i r e d
interrupting
of the s t r i p
(2)
can be a t t a c h e d .
l e a v e them p e r m a n e n t l y without
IBM 370/
to be s e l e c t e d :
Compute in S u p e r v i s o r
15 e q u a l s
small
We c o n c l u d e t h e d i s c u s s i o n
(a)
(5)
purpose c o u n t e r ,
of t h e probe is
component of the
tape
p 24).
Hardware M o n i t o r
has 108 probes
data or m a g n e t i c
between c h a n n e l s )
(PSW b i t
- compare c i r c u i t ,
5.4.
a standard
a s u b s e q u e n t development to a u n i t
Waterloo°
typical
is
analyzing
any one of seven f u n c t i o n s
overlap
event monitors
output
Data Summary Program - A program f o r (3)
from f o u r
storing
was found p o s s i b l e
By p l o t t i n g
this
system was
against
to reduce t h e p o l l i n g
the num-
rate with-
performance.
Distribution of access to a direct-storage device (Bonner 1969). A study of access to the modules in a 5-module disk storage device revealed that one module had excessive requests and seek time. Transferring a catalogue to another module improved performance.
(c) Balancing Channel Loading (Kohn 1971).
Bottlenecks due to excessive activity on one channel are common. The continuous surveillance permitted by a hardware monitor makes it easy to avoid this. This is probably the most frequent use of the hardware monitor.

(d) Direct-Storage Contention (U. of T. Computer Centre - T. Sellgren, oral communication).
Two direct storage devices were configured so that A had a dual path to the CPU, through its own channel and also through the channel of B. The activity on A pre-empted B's channel. The hardware monitor provided the clue which enabled the key routine to be transferred from a disk to a drum with a much faster channel capacity.

(e) Analysis of operator actions (U. of T. Computer Centre - T. Sellgren, oral communication).
Examination of the traces on the 370/165 revealed poor operating practices in mounting tapes, and assigning disk packs, and in failing to recognize program loops. The monitor output enabled better procedures to be specified.

Two features of the hardware monitor output are valuable:
(1) The continuous (metered) recording (as opposed to the sampled) allows events which happen in a short time (seconds) to be recognized. Important observations can be lost because of averaging.
(2) The simultaneous recording of several concurrent event streams makes it possible to recognize actions which cause trouble.

There is a need to develop techniques which will allow the monitored output to be used continuously by operators, and to develop standard analysis procedures to apply to the traces.
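
The two features of the recorder output, continuous recording and simultaneous event streams, can be illustrated with a small sketch. It is not taken from any of the monitors described; the traces, the one-tick-per-second resolution and the function names are assumptions made only for illustration. It shows the "any channel busy and CPU idle" overlap of two streams, and how a one-minute average hides a ten-second burst that the continuous trace still reveals.

```python
def overlap_fraction(a, b):
    """Fraction of ticks on which both 0/1 event streams are active."""
    if len(a) != len(b):
        raise ValueError("streams must cover the same ticks")
    both = sum(1 for x, y in zip(a, b) if x and y)
    return both / len(a)


def window_means(stream, width):
    """Average a 0/1 stream over consecutive windows of `width` ticks."""
    return [sum(stream[i:i + width]) / width
            for i in range(0, len(stream) - width + 1, width)]


def longest_run(stream):
    """Length of the longest unbroken run of 1s in the stream."""
    best = run = 0
    for x in stream:
        run = run + 1 if x else 0
        best = max(best, run)
    return best


# Hypothetical traces, one tick per second over one minute.
channel_busy = [0] * 20 + [1] * 10 + [0] * 30   # a 10-second burst of channel activity
cpu_idle = [0] * 25 + [1] * 35                  # CPU idle from second 25 onwards

print(overlap_fraction(channel_busy, cpu_idle))  # share of time: channel busy AND CPU idle
print(window_means(channel_busy, 60))            # the one-minute average flattens the burst
print(longest_run(channel_busy))                 # the continuous trace still shows its length
```

The averaged figure alone cannot distinguish one sustained burst from scattered short events, which is the sense in which observations are lost because of averaging.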
6,
SOFTWARE
MONITORS
Hardware m o n i t o r s is
constrained
operation
the d i f f i c u l t y
it
trol
- at a p p r o p r i a t e
(a)
rate
Standard tion
is
time,
(I)
for
that
e.g.
is
is
a softto the
a transfer
it
time
of a d i a g n o s t i c
(corresponding
there
data and s t o r e s
for
of con-
later
analysis.
must be low enough so
acceptable.
We can d i s t i n g u i s h
JOB-ACCOUNTING
information purpose is
true time,
up p r i c e
and in a d v i s i n g
if
billing
or to o b t a i n
informa-
problem. with
system d e s i g n
and d e v e l o p m e n t .
DATA
which
is
given
an e x t r e m e l y is
connect-time,
structures, users
o f the normal
and management.
conjunction
billing
core-residence
around,
in
FROM
particularl%:
in s e t t i n g
give
users
job-accounting
collected
ful
the program
inserted)
out of some s p e c i a l
MONITORIN@
This
essentially
packages which are run p e r i o d i c a l l y
arising
The normal is
collects
in
can o n l y be o b t a i n e d w i t h is
programs which g a t h e r data as p a r t
for
Programs w r i t t e n
6.!.
and e l i m i n a t e
types of m o n i t o r s :
System a c c o u n t i n g
(c)
inadequate
spots,
much more d e t a i l ,
and amount o f data c o l l e c t e d
job-accounting (b)
approach points
which
device contention, the t r o u b l e
used program modules or the w a i t i n g
the overhead due to the m o n i t o r
three
the system where the f l o w of work
n e c e s s a r y to o b t a i n
information
The g e n e r a l
to a r o u t i n e
that
often
where a hardware probe is
The sampling
in
To p i n p o i n t
of h e a v i l y
Some o f t h i s
ware m o n i t o r . routine
etc.
is
location
in queues,
to p l a c e s
because of b o t t l e n e c k s ,
attention,
the e x a c t
point
point
to users and which
rich
source of d a t a .
based on r e s o u r c e etc.).
usage (CPU
The i n f o r m a t i o n
in s c h e d u l i n g ,
in
how to reduce the c o s t s
is use-
predicting of t h e i r
turnwork.
some examples. Cumulative
distributions
- job execution -
-
job-step
times
core usage
times
of: These are u s e f u l for
priority
in s e t t i n g
limits
in multiprogrammed for
selecting
times
and c l a s s e s job
streams
and
benchmark problems
We
(2) Distribution of
- turnaround time
- time users take to call for their work
These will require time-stamps on the job card - they are useful in setting prices for priority work and in user-relations.

(3) Machine loading statistics
- daily, weekly and monthly averages and peaks
Necessary for scheduling, configuration planning, budgeting, determining the dependence of turn-around on load etc.

(4) I/O statistics
- connect-time in time-sharing etc.
- cards read and punched
- lines printed
Useful in designing benchmark problems, budgeting for supplies etc.

(5) Analysis of
- diagnostic messages
- program advice sought
- user refund requests
These help to bring to light deficiencies in distributed material, operating procedures and user understanding.

There should be standard programs to prepare most of this information, and also regular procedures for displaying it to users - either as charts or in newsletter distribution. It helps them in preparing job submissions and helps maintain good relations with the computing centre.

There are several
commercially
this
Biggs-Matthews
information.
available
and in Canada, Systems Dimensions (both
for
Limited
a very detailed
profile
(SDL) market ACCOUNTPAK
of the user j o b
scheme used by SDL is based on charges
component of the system - CPU t i m e , usage,
I/0
volumes e t c .
There are about t h i r t y
points
in the program s o f t w a r e .
ties
above,
records
channel
usage - time
every
allocated, activity
block
because the identifiable
residence,
program
In a d d i t i o n
are produced f o r :
program module usage tape and d i s k mounting
for
core and d i s k
appropriate listed
obtaining
programs,
IBM s y s t e m s ) .
ACCOUNTPAK t a k e s pricing
program packages f o r
have a set of t a b u l a t i o n
and b y t e t r a f f i c
channel
"hooks"
at
to the q u a n t i ~
The data are d i s p l a y e d
in
data r e c o r d e d
approaches
ware m o n i t o r s
described
head ( ~ 3 % )
is
tabular that
available
next,
such t h a t
it
form and as h i s t o g r a m s .
In d e t a i l
in the s p e c i a l - p u r p o s e
but the program e f f i c i e n c y
is
practical
the
soft-
and system o v e r -
to use the program as r e g u l a r
practice.
6.2. PACKAGED SOFTWARE MONITORS

Most of the quantities observable by means of hardware monitors can also be observed with software monitors, but at greater cost in time. To illustrate the possibilities two 'packaged' monitors will be described.

(1) Boole and Babbage Systems Measurement Software (SMS).
This is the first company to market software monitors. There are several distinct programs, available for IBM and Spectra computers.
- Problem Program Efficiency (PPE). This program, operating in the same partition as the problem program, samples every 1/60 sec. to record the percentage of time the CPU spends executing instructions out of specified core regions. It also records when a supervisor call (SVC) has been issued within the sample bounds, and data on I/O waits.
- Configuration Utilization Efficiency (CUE) collects data on hardware usage (channels, disk head movement, CPU etc.), supervisor calls, etc.
Both programs contain an extractor which collects the data and an analyser which analyzes it. The results are displayed in tables, and histograms.
- Data Set Optimizer (DSO) records disk head movements and suggests reorganization of the data sets to reduce average head movement time.
Tables V (a), (b) and (c) show representative outputs for each of the three programs.

(2) SUPERMON - An MVT Software Monitor, operating as a system task under OS/360 MVT, written at SLAC, Stanford University (SUPERMON, 1970). In addition to the types of measurements already mentioned it is possible to observe various aspects of core storage use, including the "high water mark" (the maximum used), the amount available for additional programs, and the fragmentation of unused storage. Table VI shows a sample output from SUPERMON, the Direct Access Device
Utilization Monitors
report,
and the summary r e p o r t
such as SUPERMON have been d e v e l o p e d f o r
many i n s t a l l a t i o n s Katonak
1971 f o r
valuable
(See Stevens 1968 f o r other
0S/360 m o n i t o r s
load the p r o c e s s o r These programs
utilization
should
them which
can j u s t
efficiently.
It
is
is
always w a i t i n g
almost
certain
that
for
into
Their
combinations
the p r o c e s s o r
As a g e n e r a l
one of the c o m b i n a t i o n s
(Cantrell
should
are not y e t
tune t h e system by r e l o c a t i n g or d e r i v i n g
frequently
be done in
that
ard t o o l s
of one or two of
set operational
important
software
(and h a r d w a r e ) engineering.
We c o n s i d e r
finally,
with
before
part
MONITOR AND
balancing
Although
the a n a l y s i s
just
procedures. statistics
gathering
monitors
memory s y s t e m s ,
analysis
sharing
use to
outputs,
and
of measurements
is a l r e a d y
enough expe-
be c o n s i d e r e d
instruction
and t r a c e
on t h e i r
d e s c r i b e d were f i r s t into
The g r e a t e s t and c a r r y i n g
standuse
efforts
programs
have gone i n t o
out a n a l y s i s
in
con-
Of course or s t a n d -
programs
on t i m e - s h a r i n g
experienced with
As i l l u s t r a t e d
monitor.
written
studies.
used as system a n a l y s e s
job-accounting
in view of the d i f f i c u l t i e s
system has i t s
programs,
and system d e s i g n
and d e v i c e management in t h e s e systems. time
channel
much remains
of m o n i t o r
should
to
of the computer c u r r i c u l u m .
t h e y were i n c o r p o r a t e d
ard o p e r a t i n g
there
Further,
investigations
for
major
times.
TRACE PROGRAMS
special
research
most of the m o n i t o r s
tual
jobs
strategy,
at most i n s t a l l a t i o n s
modules,
parameters,
in s o f t w a r e
SPECIAL
tools
occurring
be in the machine at a l l
used load c o m b i n a t i o n s .
become a r e g u l a r
junction
resource
multiprogramming
used r e g u l a r l y
the way of s y s t e m a t i z i n g
automatically
6.3,
there
should be d e t e r m i n e d .
we are a long way from b e i n g a b l e to have the r e s u l t s
should
One
to h e l p
and E l l i s o n ) .
Software monitors
rience
is
50 to 80 % of the
study.
one or more of the f r e q u e n t l y
service.
UNIVAC).
systems
account f o r
and at
Kohn 1971, and
In most i n s t a l l a t i o n s
deserve careful
be o b s e r v e d and a l l be f i t t e d
many computers
and MacGowan f o r
are ten or so programs which t y p i c a l l y computer use.
at the end of a run.
the CDC 6600,
way t h e y can be used in multiprogrammed
operators
exactly
issued
virmemory
in Table V I I
each
Table V (a)
Sample Outputs from Boole and Babbage Software Monitors
Problem Program Efficiency Report DISTRIBUTION
OF DSOW
WAIT
DATA SET NAME
PERCENT
0.0 0.0 0.0 22.73 2.37
TOTAL
25.10
MODULE
MAP
MODULE NAME
FIRST BYTE ADDRESS
COBLTEST IGG019CC IGGOIgAQ IGG019AA IGG019CF
(b)
LAST BYTE ADDRESS
001820 02BDA8 02BCI0 02BB90 02BAq8
PERCENT OF RUN TIME
002B38 02BE68 02BC88 02BRF8 02BB~8
MODULES WITH OVERLAYS
CHANNEL CHANNEL
X
61.55 2.83 3~.8q 0.78 0,00
SAMPLED 1 AND 1 AND
CONTROL CONTROL
UNIT UNIT
DEVICE
AMOUNT TIME
CHANNEL CHANNEL
CHANNEL 0 BUSY MULTIPLEXOR CHANNEL CHANNEL 1 BUSY CHANNEL 2 BUSY 03 13
IN
2 3
USE
SEC SEC
79120 5909,76 2298.24 802,08
SEC SEC SEC SEC
PERCENTAGE OF" T O T A L TIME BUSY
2.52 1.05
1:1
82.08 31.92 11,14
o[o
RATIO WAITING SAMPLE
1285.20
17.85
1231[92
17111
OF TASKS TO TOTAL INTERRUPTS (WHEN
2540 1403 2311 2311 2311
3751.20 6.48 3243,60 21,60 3610.08 1190,88
SEC SEC 5EC SEC SEC SEC
52.10 0.09 45.05 0.03 50,14 16.54
23"14 2314 2314
4710:24 2534.96 0.0 404.69
SEC SEC
65[42 35,18 0.0 5,62
DATA
PERCENTAGE OF TOTAL TIME
o~o
BUSY BUSY
AMOUNT OF TIME BUSY
2540
OF
181.44 75.60
BUSY BUSY
BUSY
DEVICE TYPE
(c)
MODULES FOR WHICH REPORTS ARE PROVIDED
Configuration Utilization Efficiency Report
EQUIPMENT
NO
OF ACTIVITY
JOELIB SYSOUT SYSIN UNBLKED BLKED
SEC
CPU
[
'~ RATIO OF TASKS WAtTING TO TOTAL SAMPLE INTERRUPTS WHEN DEVICE NOT BUSY
IN
WAIT
0.620 0.011 0.112 0,0 0.284 0.079
STATE) 0,100 0.004 0.070 0.0 0.020 0.001
o~87o
o~oIo
O.O
O.O
0.004
0.001
0,382
0.005
Data Set Optimizer Report SET
HEAD
DATA
SET
P B F I L E (01) P B F I L E (02) P P F I L E (01) P P F I L E (02) P S F I L E (01) P B F I L E (01)
MOVEMENT
PAIRS
ON VOLUME
BOOL7Z
NUMBER OF TRAVERSALS BETWEEN DATA SETS
HEAD
MOVEMENT PERCENTAGE TIME HEAD MOVEMENT
OF TIME
AVERAGE HEAD MOVEMENT TIME
108127
8758287
MS
49.00
81.01 MS
86920
5997480
MS
34.50
89.00
MS
19529
637817
MS
8.18
32.66
MS
238680
17370290
MS
I00.00
72.81 MS
Ta__ble VI
Sample MVT
(a) Address CO
Direct
S e r i a l No. TICDOI
Output
0S/360
Access
from
SUPERMON
Monitor
Device
Utilization
Use Count 1 1
Allocated 100.00%
Not R e a d y .00%
Cu B u s y .00%
i0
Seek .00%
Data Trans 11.97%
100.00%
.00%
.00%
O0%
4.23%
40.85~
25.35%
.O0%
.O0%
.OO%
140
TIC950
241
TIC108
0 -
0
:242
TIC035
0 -
0
.00%
00%
.OO%
.00%
.00%
143
TMD001
2 -
2
100.00%
00%
13.38%
4.23%
30.99%
:144
TIC019
- 12
100.00%
00%
1.41%
.00%
.00%
145
TIC103
0 -
0
.00%
00%
.00%
.00%
.00%
12
- 24
146
SPOOL1
1 -
1
100.00%
00%
1.41%
9.86%
4.23%
247
TIC070
4 -
4
100.00%
00%
4.93%
28.87%
13.38%
230
TIC954
1 -
1
100.00%
00%
.00%
.00%
.00%
1531
TIC106
0 -
1
61.27%
OO%
1.41%
.70~
2.11%
232
TIC008
0 -
0
.00%
00%
.00%
.00%
.00%
1 -
1
100.00%
00%
.00%
.00%
.00%
13 - 16
100.00%
00%
.00%
1.41%
8.45%
00%
.00%
.00%
.OO%
233
TIC069
234
TIC022
235
TIC014
1 -
2
100.00%
236
SPOOL2
1 -
1
100.00%
00%
.00%
2.82%
5.63%
237
TIC071
2 -
3
100.00%
00%
.00%
4.93%
11.97%
Table VI (cont'd) - Sample Output from SUPERMON, MVT OS/360 Monitor

(b) Monitoring Completed - Machine Activity at a Glance

DATE:                            72.007
ENDED:                           13.33.26
TIME MONITORED:                  2.00 MINUTES

PARAMETERS                       CYCLE RANGE
  CORE                           4
  MODULES                        3
  QUEUES                         2
  I/O DEVICES                    4
  CHANNELS                       1
CYCLE TIME                       0.20 SECONDS
CYCLES COMPLETED                 569 OUT OF 600

ACTIVITY
  ANY SELECTOR CHANNEL BUSY      84.18%
  I/O ACTIVITY INDEX             79,016
  I/O INTERRUPTS                 13,779     6,890 PER MINUTE
  DEVICES USED                   37
  RQE USE SINCE LAST IPL         61
  TOTAL SUPERVISOR CALLS         38,750    19,375 PER MINUTE
  EXCP                           12,453     6,227 PER MINUTE
  OPEN                           14             7 PER MINUTE

POSSIBLE BOTTLENECKS
  ENQ WAITS                      100.00%
  070K REGION AVAILABLE          100.00%
  AVERAGE CORE WASTED            117K
  TAPE CU WAITING                59.15%
  DISK CU WAITING                26.76%
  TAPE NOT READY                 .00 MINUTES
  DISK NOT READY                 .00 MINUTES
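
The per-minute figures in the summary report appear to be the raw counts divided by the monitored time, and the cycle total to be the monitored time divided by the cycle time. The following sketch checks that arithmetic. The function names are hypothetical, and rounding half up to a whole number is an assumption about how the report was produced, not something the report states.

```python
import math


def per_minute(count, minutes):
    """Rate per minute, rounded half up to a whole number (assumed rounding rule)."""
    return math.floor(count / minutes + 0.5)


def expected_cycles(minutes, cycle_seconds):
    """Number of sampling cycles that fit into the monitored time."""
    return round(minutes * 60 / cycle_seconds)


minutes_monitored = 2.00
counts = {
    "I/O INTERRUPTS": 13779,
    "TOTAL SUPERVISOR CALLS": 38750,
    "EXCP": 12453,
    "OPEN": 14,
}

rates = {name: per_minute(n, minutes_monitored) for name, n in counts.items()}
cycles = expected_cycles(minutes_monitored, 0.20)
completed_fraction = 569 / cycles   # share of the scheduled cycles actually completed

for name, rate in rates.items():
    print(name, rate, "PER MINUTE")
print("cycles scheduled:", cycles, "completed fraction:", completed_fraction)
```

Under these assumptions the computed rates agree with the printed ones, which gives some confidence in the reading of the columns.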
Table V I I
Software Monitors
for
Time-Sharing
Monitor
System
Systems
Reference Scherr,
CTSS
1967
Pinkerton,
~TS
1969
TSS/360
SIPE
360/67 CP-67
DUSETIMR
Bard,
(a s e t of programs)
Saltzer,
1970
MAPPER
Cantrell
and E l l i s o n
MULTICS GE Dartmouth
Deniston,
1969
Schulman,
1967
1971
System
GECOS
1968
SDC T i m e - s h a r i n g Totscheck
system The b a s i c display do a l l
components
the other
used f o r
things
diagnostic
Paging q u a n t i t i e s instructions I/0
is
interest
issued
the r e s o u r c e
spent
utilization,
in program segments,
we have a l r e a d y m e n t i o n e d .
tracing of
record
the time
and
A program of t h e t y p e
essential. include:
by users
and by the system to v i r t u a l
memory
devices.
counts
on pages read in
records
on pages t h a t
overwritten -
of the m o n i t o r s
memory maps, d e t e r m i n e
average r u n n i n g
performance The r e s u l t
in a c t i v e
queues and t h a t
are
pages.
time between page f a u l t s , idle
until
its
of a s s o c i a t i v e
obtained
system b e i n g
belong to users
by incoming
time a page is
and swapped out
space is
of the
memory h a r d w a r e .
from m o n i t o r s
investigated,
and average d u r a t i o n
revised.
but
it
are, is
on the w h o l e ,
possible
specific
to the
to make some g e n e r a l
ob-
servations. The most u s e f u l diagnostic while
trace
executing
programs
part
of a monitor
a defined
are i d e n t i f i e d
significant and E l l i s o n ) .
is
some v e r s i o n
program which i n d i c a t e s
program segment.
this
improvements,
in
itself
both f o r
o f the s t a n d a r d
how t h e CPU time
is
Once the h e a v i l y
almost
invariably
spent used
produces
u s e r and system programs
(Cantrell
• Monitors can be designed so that they impose a 1 to 5 % overhead due to their presence (trace monitors, running interpretively, will be more expensive). This is small enough to allow them to be used over very long periods.

In attempting to evaluate the worth of a hardware or software change, it is necessary to observe the system under heavy load conditions (Bard). This means that in a time-sharing system, for example, the frequency of sampling should be increased when many users are on. Alternatively, it may be useful to create a synthetic job which simulates the presence of user terminals (Saltzer and Gintell). To do this it will be necessary to have a profile of the load, and this can be found from monitor statistics.

A way to use monitors which has already proved useful, and promises to become even more so in time-sharing, is in connection with what has been called Trace Driven Modelling (Sherman, Basket & Brown 1971, and Katonak 1971). In this, a very detailed profile of a job stream is constructed, down to almost a microscopic level, e.g. CPU service time for each job, each I/O request, segment, etc. Statistical distributions are fitted to each feature: the type of distribution (Gaussian, Poisson, Uniform) as well as the averages and variances are filled in so as to correspond to the observations. Then a model is used to produce a job stream with all the statistical features observed, and the performance (as measured by throughput or CPU utilisation) is simulated for different strategies. The strategy considered might be a scheduling algorithm (round-robin, FIFO, shortest requested time first, etc.), placement of modules on drums vs disks etc. The observed performance of the system on the job stream is used to "calibrate" the model. In essence the technique is a combination of using monitor results, a set of kernel jobs, and simulation.

6.4. ESTIMATING STATISTICS FROM THE MONITOR OBSERVATIONS

The statistical techniques which are used to estimate system parameters from the monitor observations are usually very simple, but some consideration of them is important (Denning and Eisenstein 1971). In general a quantity such as a queue length, or a channel delay, is represented by a set of possible wave forms (x_1(t), ..., x_s(t), ...) called an ensemble or random process. Often what is wanted is an ensemble measurement, taken at fixed t for various s, but what is observed is a temporal measurement, taken at various times. If the system is ergodic temporal averages are
487
equal
to
ensemble
periodicities
in
The s i m p l e s t
averages. the
In e f f e c t
system's
statistic
this
means t h a t
there
must
(x),
(Xl...Xk)
be no
behaviour.
(representative)
of
given
is
the
average A I xk = ~
It
is
unbiased,
An u n b i a s e d
k ~ i=l
i.e.
xi
has e x p e c t e d
estimate
for
the
value
variance
equal
to
^2
is
x,
1
%
k-I Xk
is
calculated
iteratively
A
xo = 0
It
is
always
A
xk
better
to
true
mean.
k
2
~
(x i -
xk)
^2
given
by
i=l
by ^
=
the
+ ~1( x k
Xk_ 1
^ Xk_ I)
-
use a stochastic approximation,
A
xo
=
0
•~,
a 1
=
1
A
A
x k = Xk_ I + a k ( x k - X k _ l ) The s i m p l e s t 0_~
is
the
estimator w h e r e a k = o i ,
exponential
o< 4 1
Another
useful
estimator
is
given
by
S0(T ) = 0
A A Sk(T ) = Sk_ I ( T ) where
T
ments
are
determines
rors they
estimators
eventually the
provide
te
uses,
In
conclusion
e.g.
ware m o n i t o r s . papers
of
have t h e
fades
away,
complete
a "window"
in
out
we may n o t e
through
which
on t h e
subject
require
the
less
storage
estimate
which
some r e s o u r c e interest
especially
evidenced in
that
and c a l c u l a t i n g
a strong
including is
they
timely
carrying
This
advantage
sequence
a current,
mance e v a l u a t i o n , of
size
Xk-T) the measure-
observed.
Stochastic recording
the
1 + - (×k T
the
the
by t h e last
effect
the is
than
initial is
estimate
available
er-
needed f o r later,
for
and
immedia-
allocation. in
all
methods
use o f
hardware
appearance
three
of
years,
of
of
perfor-
and s o f t -
a large
number
and by s p e c i a l
con-
488
ferences devoted to the s u b j e c t Evaluation, April
April
(see ACM Workshop on System Performance
1971, and Computer M o n i t o r i n g
1972 at Brigham Young U n i v e r s i t y ) .
there are s t i l l
Workshop schedule f o r
There is general
some important open q u e s t i o n s ,
especially
agreement t h a t on methods of
analysis.
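The three recursive estimators of Section 6.4 can be sketched in a few lines of Python. This is only an illustrative sketch: the class names are hypothetical, samples before the first observation are assumed to be zero in the window estimator, and the exponential estimator here uses a constant gain from the first sample onward.

```python
from collections import deque


class RunningMean:
    """Iterative average: x_hat_k = x_hat_{k-1} + (x_k - x_hat_{k-1}) / k."""

    def __init__(self):
        self.k = 0
        self.value = 0.0  # x_hat_0 = 0

    def update(self, x):
        self.k += 1
        self.value += (x - self.value) / self.k
        return self.value


class ExponentialEstimator:
    """Stochastic approximation with constant gain a_k = alpha."""

    def __init__(self, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must lie in (0, 1]")
        self.alpha = alpha
        self.value = 0.0  # x_hat_0 = 0

    def update(self, x):
        self.value += self.alpha * (x - self.value)
        return self.value


class WindowEstimator:
    """S_k(T) = S_{k-1}(T) + (x_k - x_{k-T}) / T.

    Assumption: x_j = 0 for j <= 0, so the estimate is biased low
    until T samples have been seen.
    """

    def __init__(self, window_size):
        self.T = window_size
        self.window = deque([0.0] * window_size, maxlen=window_size)
        self.value = 0.0  # S_0(T) = 0

    def update(self, x):
        oldest = self.window[0]   # x_{k-T}
        self.window.append(x)     # maxlen drops the oldest sample
        self.value += (x - oldest) / self.T
        return self.value


samples = [2.0, 4.0, 6.0, 8.0]
mean, expo, win = RunningMean(), ExponentialEstimator(0.5), WindowEstimator(2)
for s in samples:
    mean.update(s)
    expo.update(s)
    win.update(s)
```

Only the window estimator needs to keep past samples (the last T of them); the other two keep a single number, which is the storage advantage noted in the text.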
7. REFERENCES

ACCOUNTPAK. A Proprietary Software Package of Systems Dimensions Ltd., Ottawa, Canada

Apple, C.T. The Program Monitor - A Device for Program Performance Measurement. Proc. ACM 20th National Conference, Aug. 1965, pp 66-75

Arbuckle, R.A. Computer Analysis and Thruput Evaluation. Computers and Automation, Vol. 15, No. 1, January 1966, pp 12-15

Bard, Y. Performance criteria and measurement for a time-sharing system. IBM Systems J. Vol. 10, No. 3, 1971, pp 193-231

Basson, Alan; Brundage, Robert. Performance Measurements on a Virtual Memory Computer System in a Batch-Processing Environment - Workshop, April 1971

Bemer, R.W.; Ellison, A.L. Software Instrumentation Systems for Optimum Performance. Proc. IFIP Congress 68, North Holland, pp 520-524

Boehm, B.W. Computer Systems Analysis Methodology - Studies in Measuring, Evaluating and Simulating Computer Systems. R-520 NASA, Rand Corp., Santa Monica, Sept. 1970

Bonner, A.J. Using System Monitor Output to Improve Performance. IBM Syst. Journal Vol. 8 (1969) No. 4, pp 290-298

Bordsen, Donald T. UNIVAC 1108 Hardware Instrumentation System - Workshop, April 1971

BUC Component Description and User's Guide. Form no. 7X22-6953, IBM Corp.

Calingaert, P. System Performance Evaluation: Survey and Appraisal. Comm. ACM Vol. 10, No. 1, January 1967, pp 12-18

Campbell, D.J.; Heffner, W.J. Measurement and Analysis of Large Operating Systems During Development. AFIPS Proc. 33 (FJCC 1968, Vol. 2), pp 903-914

Cantrell, H.N.; Ellison, A.L. Multiprogramming System Performance and Analysis. AFIPS Proc. 32 (SJCC, 1968), pp 213-21

Choosing a Computer 1971-72. Data Systems, Dec. 1971

Crooke, S.; Minker, J. Key Word in Context: Index and Bibliography, Computer System Evaluation Techniques. Technical Report 69-100, Dec. 1969, University of Maryland, Computer Science Dept.

Deniston, W.R. "SIPE: A TSS/360 Software Measurement Technique". Proc. ACM 24th National Conf. 1969, pp 229-245

Denning, Peter J.; Eisenstein, Bruce A. Statistical Methods in Performance Evaluation - Workshop, April 1971, pp 284-307

Estrin, G.; Hopkins, D.; Coggan, B.; Crocker, S.D. Snuper Computer: A Computer in Instrumentation Automation. AFIPS Proc. 30 (SJCC, 1967), pp 645-656

Freibergs, I.F. The Dynamic Behaviour of Programs. AFIPS Proc. 33 (FJCC 1968, Vol. 2), pp 1163-1167

Gotlieb, C.C. and MacEwen, G.H. System Evaluation Tools, in Software Engineering. NATO Scientific Affairs Division, 1969, pp 93-98

Hart, L.E. User's Guide to Evaluation Products. Datamation 16 (Dec. 1970) 17, p 32

Harman, A.J. The International Computer Industry. Harvard University Press, 1971

Hillegass, J.R. Standardized Benchmark Problems Measure Computer Performance. Computers and Automation Vol. 15, No. 1, Jan. 1966, pp 16-21

IBM System/370 Model 165 Functional Characteristics. GA22-6935-0, May 1971, p 24

Joslin, E.O. and Aiken, J.J. The Validity of Basing Computer Selections on Benchmark Results. Computers and Automation Vol. 15, No. 6, June 1966, pp 22-23

Katonak, P.R. Use of Performance Analysis Statistics in Computer System Simulation - Fifth Conference on Applications of Simulations. Association for Computing Machinery, December 1971, pp 317-325

Kohn, Carl E. Techniques and Results of Systems Monitoring. University of Waterloo, Computer Centre, 1971

Knight, K. Evaluating Computer Performance 1962-1967. Datamation, January 1968, pp 31-35

Lucas, H.C. Performance Evaluation and Monitoring. Computing Surveys, Vol. 3, No. 3, Sept. 1971, pp 79-91

MacGowan, J.M. UNIVAC 1108 Instrumentation, in Software Engineering Techniques. NATO Scientific Affairs Div. 1970, pp 106-110

Metzger, J. Monitoring Computing Systems. M.Sc. Thesis. University of Toronto, Dept. of Computer Science, December 1970

Milandre, G. Hardware II - University of Toronto, Hardware Monitor Project. Internal Report V, November 1971. University of Toronto Computer Centre

Minker, J.; Crooke, S. and Yeh, J. Analysis of Data Processing Systems. Technical Report 69-99. University of Maryland, Computer Science Centre, Dec. 1969

Pinkerton, T. Performance Monitoring in a Time-Sharing System. CACM Vol. 12, No. 11, Nov. 1969, pp 608-610

Saltzer, J.H.; Gintell, J.W. The Instrumentation of Multics. CACM 13, No. 8, Aug. 1970, pp 495-500

Scherr, A.L. An Analysis of Time-Shared Computer Systems. M.I.T. Press, Cambridge, 1967
Schneidewind, N.F. The Practice of Computer Selection. Datamation, February 1967, pp 22-25

Schulman, F.D. Hardware Measurement Device for IBM System/360 Time Sharing Evaluation. Proc. ACM 22nd National Conf. 1967, pp 103-109

Share-Session Report on "Hardware vs Software" Monitoring. Share XXXIV Proc. Vol. 1 (1970), pp 380-405

Sharpe, W.F. The Economics of Computers. Ch. 9: The Cost and Effectiveness of Computer Systems. Columbia University Press, 1969

Sherman, S.; Baskett, Forest III; Browne, J.C. Trace Driven Modeling and Analysis of CPU Scheduling in a Multi-Programming System - Workshop, April 1971, pp 173-199

Solomon, M.B. Jr. Economies of Scale and the IBM System/360. Comm. ACM Vol. 9, No. 6, June 1966, pp 435-440

Computer Comparison Charts - in Standard EDP Reports, Auerbach Corp., sec. 11, 00.101-115

Stanley, W.I.; Hertel, H.F. Statistics Gathering and Simulation for the Apollo Real Time Operating System. IBM Syst. J. Vol. 7 (1968), No. 2, pp 85-102

Stevens, D.G. System Evaluation on the Control Data 6600. Proc. IFIP Cong. 68, Aug. 1968, pp 542-547

SUPERMON. Systems Technical Memo No. 30, January 1970. COSMIC, Barrow Hall, University of Georgia, Athens, Georgia

Synetics Inc. System Performance - System Utilization Monitor: User's Manual. Form no. A/B-416, Sept. 1969

Warner, C.D. Monitoring: A Key to Cost Efficiency. Datamation, Jan. 1971, pp 40-49

Workshop on System Performance Evaluation, Harvard University, Cambridge, Mass., ACM, April 5-7, 1971

Wulf, W. Performance Monitors for Multiprogramming Systems. Proc. 2nd ACM Symp. on Op. Syst. Principles, Princeton, N.J. (Oct. 1969), pp 175-181
CHAPTER 4.E.
PRICING MECHANISMS

C.C. Gotlieb
Department of Computer Science
University of Toronto, Canada

Pricing serves an important role in allocating service resources and rationalizing planning. In the long run its alternatives turn out to be not as satisfactory. Price levels are determined by costs, but also by policy considerations. Different methods of setting levels are examined, along with some of the resulting implications and requirements.

1. THE RATIONALE OF PRICING

In a market situation, as prevails in a computer service bureau, where prices are charged for competitive services, prices are a means of recovering costs and making a profit. But even in the situation where a centralised facility provides services to internal departments, in a university, a government computing centre or a large company, there are equally strong reasons for applying prices for the services. Prices are a device for allotting scarce resources over time. They do this because they help rationalize demand, smooth loads (when used appropriately), encourage an efficient use of resources, and provide a basis of comparison with other service centers. They also provide a policy for the planning and acquisition of new facilities.

There are other methods of recovering costs and obtaining control, e.g. levying overhead charges, adopting an average cost (with budget allocation), or instituting priorities (Kanter and Moore, 1968). In the long run these do not work as well as pricing. Overhead charges simply fail to provide the proper incentives to either the user or the administration for sensible use of the facilities. Average costs do not protect against over investment, since a return to cost is guaranteed; they preclude the ability to provide service at marginal costs (sometimes desirable), and are unlikely to smooth use during peak periods. Priorities are in essence a surrogate for prices, but without the advantage of being less costly to administer.
2. DETERMINING FACTORS

The factors which determine price levels are:

Costs - these should be realistic. They are discussed in more detail in the next section.

Policy decisions - the first decision is to apply prices and transfer payments between divisions. Other important questions are: Is each identifiable service to be priced in relation to cost, or are certain services (and users) to be subsidized? Will prices be set by overall average costs, or will certain users be permitted to pay marginal costs? What will be the "convertibility" of the funds which users are given? Will they be good only for in-house computer services? for alternative computer services elsewhere? for other types of products? (Smidt, 1969)

The level of use which is considered necessary or desirable. High utilization implies greatest efficiency in one sense, but less flexibility and room for growth in another.

Complexity of equipment and services - the complexity increases significantly as we go from a single processor to time-sharing and multiprocessor facilities, and as the variety of services (tape and disk storage, special outputs, plots, keypunching etc.) is increased.
3. COSTS

In addition to their use in pricing, costs are important for cost effectiveness determination. We need a method of going from expenses to costs and distributing these to the different users, but this is difficult in a general purpose multiprogramming situation, and the policy decisions mentioned above are needed. The problem is a particular case of the general cost accounting problem in any production process. The cost components are not difficult to identify. These are:

Salaries      - management, operational, development, applications, fringe benefits (pension, health plan contributions etc.)
Equipment     - purchase or rental payments, maintenance, communication costs, office equipment
Supplies      - cards, paper, tapes, documentation
Software      - purchased, leased, developed in-house
Site          - space, preparation costs, utilities
Overhead      - use of purchasing and maintenance services, library, insurance
Miscellaneous - travel, advertising, user manuals, etc.

A major decision concerning costs is that of the method of amortizing the purchase costs of the equipment. In business it is usual to show depreciation for equipment, but this does not seem to have been common in computer financing - perhaps because the major asset is so often acquired with the aid of grants or special allowances, and it is difficult to determine what value should be imputed to it. There are arguments for always including an amortization cost (Report of Task Force on Computer Charging). Among other things this places the decision on purchase vs renting vs third-party financing, and the costing of leased equipment, on a more rational basis.

To analyze the purchase cost we must know the useful life of the equipment. A lower bound can be estimated from $R/C$, the ratio of the monthly rental cost to the purchase cost as determined by the manufacturer. If $L$ is the useful life in months, $r$ the annual rate of return on capital, and $M$ the maintenance part of the rental cost, then approximately

$$R - M = \frac{C}{L} + \frac{r}{12} \cdot \frac{C}{2} .$$

For example, if $C/R = 48$, $M = \ldots$ and $r = 10\%$, then $L \approx 66$ months.

In commercial service bureaus the computer is usually amortized in a time which is very short compared to that usually found in other equipment investment, e.g. 3 or 4 years compared with the 10-20 years common elsewhere. Ten years is clearly long for the life of a computer in view of the rapidity of technological change, but 3 years makes the cost of services very high.
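As a hedged illustration of how a useful-life estimate follows from the rental-to-purchase ratio, the sketch below solves the relation $R - M = C/L + (r/12)(C/2)$ for $L$. The function name and the parameter `maintenance_share` (the fraction $M/R$ of the rental that pays for maintenance) are assumptions introduced here; the value of $M$ used in the original example is not legible in the source.

```python
def useful_life_months(purchase_to_rental, annual_rate, maintenance_share=0.0):
    """Solve R - M = C/L + (r/12)(C/2) for L (in months).

    purchase_to_rental : C/R, purchase cost over monthly rental
    annual_rate        : r, annual rate of return on capital (0.10 for 10%)
    maintenance_share  : M/R, assumed fraction of rental that is maintenance
    """
    # Dividing the relation by C gives: (1 - M/R) / (C/R) = 1/L + r/24
    inverse_life = (1.0 - maintenance_share) / purchase_to_rental - annual_rate / 24.0
    if inverse_life <= 0.0:
        raise ValueError("rental too low to recover capital at this rate of return")
    return 1.0 / inverse_life


# With C/R = 48, r = 10% and no maintenance component the bound is 60 months.
life_no_maintenance = useful_life_months(48.0, 0.10)
```

Under this reading of the formula, any positive maintenance share lengthens the estimated life; a share of roughly 7% of the rental would give a figure close to the 66 months quoted in the text, but that share is an inference, not a value taken from the source.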
4. THE FACTORY MODEL

The recent trend, both in commercial installations and in universities, is to view the computer facility as a "factory" which delivers a number of products, i.e. various types of services, and to determine costs and prices for these (Nielsen 1968, U. of T. Computing Centre Reports 1971). A number of distinct services are identified, and cost accounting techniques are used to assign cost components to each of these services. For example, at the U. of T. in 1971 the following services and cost components were defined.

SERVICE                                             COST
Time sharing service (CPS, APL, ATS)             258,723
Batch service (OS on 360/65)                   1,309,610
High-speed batch service                         299,120
7094 service                                     300,000
Remote Job Entry Service                         239,176
Miscellaneous services (plotters, unit record)   136,071
                                               ---------
                                               2,542,700

There is inevitably some arbitrariness in identifying a service. Thus programming advising, which might have been called a service, was redistributed into the others. These costs were determined by a careful analysis of the annual budget. In some cases, e.g. salaries and supplies, it is easy to allocate the budget line items to the services. The most difficult to assign are the capital (amortization) costs, and this was arrived at by measurements on the core and c.p.u. usage for the three types of usage identified on the 360/65, namely OS batch, high speed batch (HSJS) and time-sharing. It was a policy decision that users of the remote job entry service would pay no more for their computing time than local users, and therefore the costs for this service were redistributed into OS, to be added to the input/output charges.
5. PRICING A SERVICE

Having determined the total price for each of the services, there is still a great deal of choice in arriving at a unit price, and at prices which are appropriate for the different services. We have:

- Single Price Scheme: This is the average cost mentioned earlier. It was adopted for HSJS. For such jobs there are strict upper bounds on core usage, c.p.u. time, cards read and lines printed. Much of the cost is in I/O, and it is more efficient not to record c.p.u. time. The load runs about 5000 jobs/day, leading to a cost of about 20 cents/job.

- Prime Shift Definition: A prime shift is defined and prices are set below this for less desirable times, e.g. at nights or on week-ends. This was common when users were allocated the whole computer (see Table 1, from Sharpe), but it is less useful in multiprogrammed systems.

Table 1
Typical Shift Rates

Shift     Period                            Price as a percentage    Approximate percentage of
                                            of prime-shift price     time sold at this price
Prime     Working days 8:00a.m.-6:00p.m.    100                      42
Second    Mon.-Fri. 6:00p.m.-midnight       85-90                    28
Third     After midnight                    60-90                    25
Weekend   Sat. and Sunday                   negotiable                5

From Sharpe - Economics of Computers, p. 504

- Multiple Input Queue: Queues with different priorities are set up, and prices are charged according to the priority, e.g.
    Rush (at double rate)
    ASAP (As Soon as Possible - at the standard rate)
    IOI (If Otherwise Idle - at 60% standard)
  It would be preferable to charge according to turnaround time, but this cannot be guaranteed.

- Resource Usage: This is the mechanism now used widely for time-sharing and multiprogramming installations. The resources for which charges are made include: CPU time, memory usage, terminal connect time, cards read or punched, lines printed, number of tapes or disks mounted etc.

- Market Scheme: Users bid for a "share" of the computer. There are many possible variations; for example the share may be allotted daily, and a share not used lapses. Priority may be given to those who have used least of their share to date. A variant of this scheme is presently used by the University of Waterloo in allocating resources.

Special problems arise in multiprogrammed systems. In general the cost will depend on the program mix, and users will find that different prices are charged for the same job run at different times. Also, the CPU time chargeable directly to users adds up to considerably less than 100% of CPU time, because it is difficult to keep the CPU fully occupied. A scheme developed by Douglas Aircraft, and used on the IBM 360/65 at the University of Toronto, is to calculate for each job an expected run time (ERT), which is found by adding a constant time (about 25 ms.) for each I/O interrupt issued by the user program, and adding this to the measured CPU usage.

The result is that in a multiprogrammed system there are often relatively complicated methods of computing job charges. The Appendix gives the formula used for computing run charges, and the rate structure presently in effect at the University of Toronto (along with prices for other services). A similar scheme is used in other universities and commercial installations.

6. SOFTWARE REQUIREMENTS
It is apparent that a pricing mechanism requires considerable backup in the form of software. Among the programs needed are:

a) A job authorization routine - this maintains credit balances and checks every job for sufficient funds before it is run. Preferably applied on-line.

b) Job accounting routine - this computes job charges, posts them to the accounts and displays them on the user output.

c) Billing routine - prepares statements and summary statistics about earnings.

d) Job analysis routine - this collects statistics about the number of jobs in each service so as to allow the effects of changing the pricing mechanism to be predicted.
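The division of labour between routines a) and b) can be sketched as follows. This is a hypothetical illustration, not a description of any actual accounting package: the class `AccountLedger`, its method names, and the in-memory dictionary of balances are all assumptions made for the example.

```python
class AccountLedger:
    """Hypothetical sketch of job authorization (a) and job accounting (b)."""

    def __init__(self):
        self.balances = {}   # account number -> remaining credit

    def deposit(self, account, amount):
        """Add allocated funds to an account."""
        self.balances[account] = self.balances.get(account, 0.0) + amount

    def authorize(self, account, estimated_charge):
        """Routine a): allow the job to run only if funds are sufficient."""
        return self.balances.get(account, 0.0) >= estimated_charge

    def post(self, account, charge):
        """Routine b): post the computed job charge and return the new balance."""
        self.balances[account] = self.balances.get(account, 0.0) - charge
        return self.balances[account]


ledger = AccountLedger()
ledger.deposit("A-100", 10.00)
first_ok = ledger.authorize("A-100", 9.76)     # enough credit for this job
remaining = ledger.post("A-100", 9.76)
second_ok = ledger.authorize("A-100", 1.00)    # balance is now too low
```

Checking against an estimated charge before the run, and posting the actual charge afterwards, is one simple way to keep the authorization step on-line as the text recommends.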
These programs each have components. It is doubtful if a set of comprehensive job accounting programs can be written in less than ten man-years. Commercial versions are available (SDL ACCOUNTPAK). The job statistics collected are very detailed and useful for performance measurement and system evaluation.

7. EXAMPLES FOR PRICING MECHANISMS

7.1. RATE SCHEDULE FOR THE UNIVERSITY OF TORONTO, 1 JAN 1972

A. SYSTEM/370 SERVICE - The General Purpose Job Stream
JOB CHARGE = SF × (($CPU × CPUTIME) + ($CORE × COREUSAGE) + UR + PDC)

where:
SF           = Service Factor of 2.00 for RUSH
                                 1.00 for ASAP
                                 0.60 for IOI
$CPU         = $ 8.50 per CPU minute
CPUTIME      = measured CPU time in minutes
$CORE        = $ 1.05 per hundred kilobyte minutes of core usage
COREUSAGE    = (RA/100) × (1 + RA/500) × ERT
RA           = Region Allocated (KB)
ERT          = Equivalent Run Time (in minutes)
             = CPUTIME + I/O WAITTIME
I/O WAITTIME = (.0245 sec. per I/O event)/60 min
UR           = Unit Record Service Charge
             = $ 0.80 per thousand cards read, plus
               $ 0.80 per thousand lines printed, plus
               $ 2.00 per thousand cards punched
PDC          = Peripheral Device Charge
             = $ 4.00 per job requiring disk, tape, or special printer set-up
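The job charge formula above can be traced with a short Python sketch. The function name, the argument names, and the choice to pass counts of I/O events and unit-record items directly are assumptions made for illustration; the rates and the formula are those listed in the schedule, with RA taken to be in kilobytes.

```python
SERVICE_FACTOR = {"RUSH": 2.00, "ASAP": 1.00, "IOI": 0.60}
CPU_RATE = 8.50            # dollars per CPU minute
CORE_RATE = 1.05           # dollars per hundred kilobyte-minutes of core
IO_WAIT_SECONDS = 0.0245   # charged wait per I/O event, in seconds


def job_charge(priority, cpu_minutes, region_kb, io_events,
               cards_read=0, lines_printed=0, cards_punched=0,
               needs_device_setup=False):
    """Sketch of JOB CHARGE = SF * ($CPU*CPUTIME + $CORE*COREUSAGE + UR + PDC)."""
    io_wait_minutes = io_events * IO_WAIT_SECONDS / 60.0
    ert = cpu_minutes + io_wait_minutes                  # Equivalent Run Time
    core_usage = (region_kb / 100.0) * (1.0 + region_kb / 500.0) * ert
    unit_record = (0.80 * cards_read / 1000.0
                   + 0.80 * lines_printed / 1000.0
                   + 2.00 * cards_punched / 1000.0)
    device_charge = 4.00 if needs_device_setup else 0.0
    return SERVICE_FACTOR[priority] * (CPU_RATE * cpu_minutes
                                       + CORE_RATE * core_usage
                                       + unit_record
                                       + device_charge)


# One CPU minute in a 100 KB region, no I/O and no unit-record work:
# COREUSAGE = 1 * 1.2 * 1 = 1.2, so the ASAP charge is 8.50 + 1.26 = 9.76.
asap = job_charge("ASAP", cpu_minutes=1.0, region_kb=100.0, io_events=0)
rush = job_charge("RUSH", cpu_minutes=1.0, region_kb=100.0, io_events=0)
```

Note that the factor (1 + RA/500) makes the core charge grow faster than linearly with the region size, so a job requesting a large region pays proportionally more per kilobyte.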
B. 7094 II/1401 SERVICE

Computation (7094)         = $ 96.00 per system hour
Unit Record Service (1401) = $ 0.80 per thousand cards read, plus
                             $ 0.80 per thousand lines printed, plus
                             $ 2.00 per thousand cards punched

C. INTERACTIVE TERMINAL SYSTEMS SERVICE

1. Conversational Programming System (CPS)
   $ 2.00 per CPU minute, plus
   $ 1.20 per core page per hour, plus
   $ 3.00 per connect hour

2. Administrative Terminal System (ATS)
   $ 3.60 per connect hour

3. APL Service
   $ 3.00 per CPU minute, plus
   $ 3.00 per connect hour

4. IBM 2741 Typewriter Terminal Rental
   $ 95.00 per month for a leased line, or
   $ 105.00 per month for a dial-up line
   (This rental is not payable in allocated or subsidy funds; it is a real dollar charge).

5. Disk Storage Space
   $ 0.30 per track per month for ATS permanent storage records, APL workspaces, and CPS load/save and file space

D. MISCELLANEOUS

1. SYSTEM/360 On-line Disk Storage
   $ 0.30 per track per month
   NOTE: 7294 bytes    = one track
         20 tracks     = one cylinder
         200 cylinders = one 2316 disk
   (The minimum is one month and the charge is payable in advance).

7.2. DISK PACK RENTAL (OFF-LINE)
   $ 25.00 per month
   (The minimum is one whole disk pack for one month and the charge is payable in advance).

7.3. DISK PACK STORAGE
   $ 25.00 initial charge, plus
   $ 10.00 annual renewal

7.4. DISK TO TAPE BACKUP
   $ 20.00 per cycle

7.5. TAPE RENTAL
   $ 1.00 per tape per month
   (The minimum is one month and the charge is payable in advance).

7.6. TAPE STORAGE
   $ 5.00 initial charge, plus
   $ 1.00 per tape per month

7.7. TAPE CLEANING AND TESTING
   cleaning = $ 1.50 per tape (double pass)
   testing  = $ 2.00 per tape

7.8. NEGOTIATED CONTRACT SERVICES
   Job turnaround handling = $ 10.00 per man hour
   Programming Assistance  = $ 12.00 per man hour
   Analytical Assistance   = $ 15.00 per man hour
   (These services are not payable in allocated or subsidy funds; they are real dollar charges).

7.9. CALCOMP PLOTTING
   $ 20.00 per plotter-hour

7.10. CARD PROCESSING
   Reproduction                    = $ 2.00 per thousand cards
   Interpretation                  = $ 2.50 per thousand cards
   Reproduction and Interpretation = $ 3.50 per thousand cards
   Labels                          = $ 5.00 per thousand cards
   Listing                         = $ 1.00 per thousand cards
   Keypunching                     = $ 5.00 per hour
   Keypunch Verifying              = $ 5.00 per hour
8. REFERENCES

ACCOUNTPAK. A Proprietary Software Package of Systems Dimensions Limited, Ottawa, Canada.

Diamond, D.S. and Selwyn, L. Considerations for Computer Utility pricing policies. Proc. ACM Nat. Conference, Brandon Systems Press, 1968, pp. 189-200.

Gill, S. and Samet, P.A. Charging for computer time in universities. Computer Bulletin, 13, No. 1 (Jan. 1969), pp. 14-16.

Hootman, J.T. The pricing dilemma. Datamation 15, 8 (Aug. 1969), pp. 61-66.

Leppik, J.J. "Proposal of Terms of Reference of the Institute of Computer Science". University of Toronto, November 1969.

Marchand, M. Priority pricing with application to timeshared computers. FJCC 1968, AFIPS, Part I, pp. 511-519.

Nielsen, N.R. Flexible pricing: An approach to the allocation of computer resources. FJCC 1968, AFIPS, Part I, pp. 521-531.

Report of the Task Force on Computer Charging. Computer Coordination Group, Council of Ontario Universities, June 1, 1970.

Sharpe, W.F. The Economics of Computers. Columbia University Press, 1969, Ch. 9 and 11.

Singer, N.M.; Kanter, H. and Moore, A. Prices and the allocation of computer time. FJCC 1968, AFIPS, Part I, pp. 493-498.

Smidt, S. The use of hard and soft money budgets and prices to limit demand for centralized computer facility. FJCC 1968, AFIPS, Part I, pp. 499-509.

University of Toronto Computing Centre - Internal Reports:
   Pricing Sub-Committee - June 1970
   A Paper on Pricing - C.A. Ford, February 1971
   A Cost Accounting Model - C.A. Ford, May 1971
CHAPTER 4.F
EVALUATION IN THE COMPUTING CENTER ENVIRONMENT

H. J. Helms
Technical University of Denmark
Northern Europe University Computing Center

1. INTRODUCTION
In the following we will consider some aspects of the utilization made of software. We are moving from the problems concerning the design and construction of programs and systems of programs into the environment of the users. We are no longer dealing with software engineering in itself, but rather with the applications of the products of the software engineers.

We shall move around in the computing center environments, and while we shall try to describe them, it must be admitted that it is by now difficult to give a precise definition. In former times this was rather easy. The computing center simply was the physical location of a computer, and the environment was the staff servicing the computer, as well as the users, most of whom were programmers themselves and, on many occasions, also operators.
The situation is no longer that simple. With the proliferation of terminals attached to distant computers, and even the development of computer networks, it is more difficult to provide a sharp definition of a computing center environment. We may still find it around the physical location of a computer, but it may as well be found around the physical location of a terminal connected to a remote computer. There are indeed examples of important computing environments using terminals and never giving considerable thought to the fact that the computer itself is located far away.

For the purpose of our discussion let us define the computing center environment as the community of people using the services of a given computing system.
A user is a member of this community, and we may mention as examples:

An airline ticket agent using a seat reservation system.
A typist using a text editing system.
A bank teller using an on-line accounting system.
A manager using a management information system.
A consulting engineer using standard engineering programs from a terminal in his office.
A chemist developing programs to solve his own research problems.
A student solving exercises for his informatics course.
A programmer developing programs for a customer.

While the above-mentioned examples of user categories are by far not exhaustive, they do lead to a recognition of various classes of users. Roughly we may describe them as non-specialists in computer usage and specialists in computer usage. We may also describe the users as falling into the categories non-programmers and programmers, but here reservations about skills and abilities may be made for the persons falling in the category programmers.

The users we shall consider in the following mainly fall in the latter of the two categories. We find them in computing center environments in, amongst others, computer firms and computing centers serving administration, business, hospitals, industry, libraries, research institutions and universities.

The largest variety of these categories of users is found in university computing center environments, which are also often characterized by a large variety of applications, a large variety of problems to be solved, a wide scope of need for computing facilities, as well as a broad spectrum of varying degrees of experience and skills in computer usage.
With the above broad definition of a user it is of course rather difficult to provide statistics on the number of users. There do exist many statistics, countrywide and worldwide, on the number of computers; as an example, in the Federal Republic of Germany the company Diebold Deutschland has published that in early 1971 there were the following approximate numbers of computers:

    60 large computers
 8,300 medium sized computers
13,500 small computers

of a total value of 11.6 x 10^9 DM. A large computer is defined as a machine whose purchase value exceeds 8 million DM.

It depends of course entirely on the application how many users a given machine or a given computing center has. At least on a European scale, a computing center in a large research institute may have some 1000 users, and a large university computing center will have 2000 or more.

At NEUCC, Technical University of Denmark, where we provide a university computing service on a regional basis, i.e. also to universities and research institutes outside our own university, we have around 1000 valid account numbers and a user population of 2000-3000. The computer system, actually an IBM 360/75, is largely terminal-oriented; besides a high-speed terminal there are at present 14 medium-speed terminals attached to the main machine, and the users have around 80 typewriter terminals, which connect with us on a dial-up basis.

During a typical month we find that some 40,000-45,000 jobs are passed through the machine; 20,000 of these are typical student jobs. Around half of the jobs come from the terminals, some of which are located far away, up to 200 km.

North American university computing centers may serve a community of 30,000-40,000 students and a faculty of some 3,000 members. Quite typically, some 20% of the students are in professional or graduate schools.

Our computing center environments are thus operating on a very large scale and draw their users from large populations.

2. THE USER AND HIS NEEDS
It is often claimed that the user has great difficulties in specifying his needs and does not know what he really wants in order to solve his problems. This is perhaps not surprising, but it is most dangerous for the user as well as for us if we do not try to perform a further analysis both of the user and of his problems, and thereby try to provide a specification in greater detail of his needs and requirements. It is surprising to find how seldom this is done in an intelligent and workable fashion, and how often decisions in reality are made in a nearly random way or as a result of a coincidence of circumstances.

There is often a large amount of goodwill involved in reaching the right decisions, also in letting the users exercise influence through an appropriate committee structure. Without underestimating the value of such committees, it must be admitted that the reasons for their existence are sometimes psychological. Anybody who lives in the environment will, by the way, know only too well that a complicated structure for mutual information, decision making on several levels etc. is, in a computing center environment as in many other organizational environments, by no means the only line of communication. Perhaps just as important, also when it comes to influence on decisions, are the many informal contacts. They may be sound, stimulating and inspiring, but by their very nature they may lead to decisions based on coincidences. A strong element of influence, direct or indirect, is also exercised by software firms and computer firms.

The start of any systematic measurement technique must be a very good set of accounting routines. They should provide records of the facilities used, such as total time, CPU time, core store used, and use of input/output facilities. It is surprising that accounting routines of this type are relatively rare when the computer system is delivered from the manufacturer. The machine may even lack an internal clock. It is for this reason that there is a large number of papers in the literature describing what was done at a particular installation to provide a reasonable accounting scheme for its utilization.

Accounting routines are used for keeping a record of the utilization of the computer, for charging the users, and for providing a basis for forecasts of further computer use, thereby aiding the budgeting plans and establishing the procurement policies. The data collected may also be used to establish a user profile, and here we find surprising similarities between university computing centers.
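As an illustration of what such an accounting routine records and how the records feed the charging of users, the following is a minimal sketch in Python. The record fields follow the facilities named above (total time, CPU time, core store, input/output); the field names, the tariff in RATES and the account numbers are hypothetical and are not taken from any installation described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One accounting record per job (field names are illustrative)."""
    account: str          # account number to be charged
    total_seconds: float  # total (elapsed) time
    cpu_seconds: float    # CPU time
    core_kwords: int      # core store used
    io_units: int         # use of input/output facilities


# Hypothetical tariff: price per unit of each charged resource.
RATES = {"cpu_seconds": 0.05, "core_kwords": 0.01, "io_units": 0.002}


def charge(job):
    """Charge for a single job under the hypothetical tariff."""
    return sum(rate * getattr(job, name) for name, rate in RATES.items())


def charges_by_account(jobs):
    """Total charge per account: the basis for billing and budgeting."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job.account] += charge(job)
    return dict(totals)
```

The same list of records can be reused for the other purposes mentioned in the text, for example summing cpu_seconds per month as a basis for forecasts.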
From the individual figures in the accounting schemes we can get the distribution of jobs by time and by number in particular time intervals. The general shapes of such distributions are very much alike. P.A. Samet [1], University College, London, Computer Centre, which is equipped with an IBM 360/65, reports that about 90% of the jobs ran for less than 5 minutes but took only 50% of the time. Almost 50% of the jobs ran for less than one minute. What is a job? In this distribution, batches of small jobs run under the WATFOR compiler are counted as one job, and each of these batches usually contains between 5 and 10 jobs. Each such batch typically takes 1 minute.

P.A. Samet [1] also reports that on the London University CDC 6600 machine, which handled more than 33,000 jobs in its first months of operation, 83% of the jobs took less than 30 seconds and 88% took less than 1 minute. These figures relate to the same university.

In 1968 at NEUCC we reported [2] from our IBM 7094 operations that 92% of the jobs ran for less than 6 minutes. They took 45% of the machine time. The similarity is striking. At present on the IBM 360/75 at NEUCC we find (not taking WATFIV and Algol W jobs into account) that 63% of the jobs take less than 1 minute of CPU time and use 12% of the total CPU time.

It is distributions like these which explain the interest of university computing centers in fast compilers like WATFOR and justify their concern for small overheads.

The distributions of the number of jobs and of the time used of course reflect the use of the computer facilities for both research and educational purposes. At NEUCC we found in 1968 from the IBM 7094 operations that the distribution of the machine time was

Education    19%
Research     80%
Other use     1%
At present on the 360/75 installation it is

             Account units   Normal jobs
Education         14%            31%
Research          85%            66%
Other use          1%             3%
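The job-length figures quoted earlier in this section (for instance "90% of the jobs ran for less than 5 minutes but took only 50% of the time") are two shares computed from the same accounting data. A minimal sketch of that computation follows; the function name and the sample run times are invented for illustration and are not measurements from any of the installations mentioned.

```python
def share_below(run_minutes, threshold):
    """Return (share of jobs, share of total time) for the jobs
    that ran for less than `threshold` minutes."""
    short = [t for t in run_minutes if t < threshold]
    job_share = len(short) / len(run_minutes)
    time_share = sum(short) / sum(run_minutes)
    return job_share, time_share


# Made-up sample: nine short jobs and one long job (minutes).
sample = [0.5, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 2.0, 3.5, 10.0]

jobs_5, time_5 = share_below(sample, 5)   # 0.9 of the jobs, 0.5 of the time
jobs_1, time_1 = share_below(sample, 1)   # 0.5 of the jobs, 0.125 of the time
```

Applying the same function with several thresholds gives the cumulative distribution by number and by time, which is the form in which the installations above are compared.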
Many accounting routines also allow us to obtain information about the utilization of the software modules available. It is on this basis that we at NEUCC estimate that 50% of the machine time is used on Fortran jobs, 20% on Algol jobs and 30% on other languages.

It is, however, necessary to provide even more detailed studies of the user profiles and the usage characteristics. We may estimate that there will be no major changes in the type of computing done in many environments over the next few years. The number of users may, however, increase, and it is thus important to know the major characteristics of the increasing population in order to anticipate the bottlenecks and to plan for the necessary expansion.

This is true for the computing center but is also true for the users. An instructor must be able to estimate the cost of his programming class. A leader of a research project should also be provided with applicable averages to estimate correctly his needs for computer resources in the development of production programs. His programmers go through cycles of planning, debugging, program modifications and reprogramming. It is important to know what this costs.

Earl Hunt et al. [3] have reported on an analysis of computer use in the university computing center at the University of Washington, Seattle, equipped with a CDC 6400 machine. A more detailed study of programming practices has been conducted by D. Knuth [4] as an empirical study of Fortran programs written and run by users at the Stanford University Computation Center and at the computer center of Lockheed Missiles and Space Corporation in Sunnyvale, California.
Static statistics provide a picture of how frequently certain constructions are used in practice. The conclusion is that compilers at a university computing center spend most of their time doing surprisingly simple things. Anybody who has tried to consult users at a university computing center can subscribe to this conclusion.

More detailed studies were performed by dynamic statistics. In the method of frequency counts or program profiles, counters are inserted at appropriate places in the program in order to determine the number of times each statement is actually performed. The frequency counts are highly revealing and indeed tell the programmers so much that they ought to be provided as a standard tool [4]. This could also be used to govern selective tracing and to locate untested portions of a program, i.e. it is a useful tool for debugging purposes.

The collection of debugging counts is called the profile of the program [4]. The programs often have a profile with a few sharp peaks. It was also found that less than 4% of a program generally accounts for more than half of its running time. There are few such studies, but if this is common it means that programmers can make substantial improvements to their own programs by being careful only at a few places. Moreover, optimizing compilers can be made to run faster, as they do not need to study the whole program with the same degree of concentration.

More detailed studies of programs written by a population of programmers provide even more information on the use of compilers and hence are useful both for the programmer and the compiler builder.

The frequency counts give an important dimension to programs and show programmers how to make their routines more useful and efficient with relatively little effort. A study [5] has shown that this method may lead to an eleven-fold increase in a particular compiler's speed.
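The idea of a program profile can be tried out with very little machinery. The sketch below is written in Python rather than Fortran and, instead of inserting counters into the source text by hand as described above, it uses the interpreter's tracing hook (sys.settrace) to count how often each line of one function is executed. The names profile and sum_of_squares are illustrative only; this is not the instrumentation used in the study cited as [4].

```python
import sys
from collections import Counter


def profile(func, *args):
    """Run func(*args) and count how many times each of its lines
    is executed. Keys are line offsets from the 'def' line."""
    counts = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                      # ignore all other functions
        if event == "line":
            counts[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)               # restore any earlier tracer
    return result, counts


def sum_of_squares(n):
    total = 0                 # offset 1: executed once
    for i in range(n):        # offset 2: loop control
        total += i * i        # offset 3: executed n times
    return total              # offset 4: executed once


result, counts = profile(sum_of_squares, 100)
```

Here the loop body at offset 3 is counted once per iteration, while the initialisation and the return are counted once each. Lines that never appear in counts were not executed at all, which is the "untested portions" use mentioned in the text.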
It might be a challenge to develop interactive systems which immediately tell the programmer which parts of his programs are the most costly. This should strongly motivate him to make the necessary changes.

The studies described are only too rare, and it may be hoped that many will be encouraged to continue them and to report their results. This should provide a solid base for feedback to the software engineers about the users' behaviour, both on a global basis when we study the operations of a computing center and on a more local basis when we study the behaviour of the programmers.

These methods can lead to a better economy in computer usage and undoubtedly make the users more motivated towards proper economy than the various administrative schemes devised in the computing center environments. Only to a limited extent do they tell us about new facilities needed, and they provide only a limited basis for a market analysis.

3. SOFTWARE AND THE COMPUTING CENTER
We may find computing centers with expensive facilities which are unable to define their objectives and purposes clearly and sharply. In particular this is too often the case with university computing centers. One of the reasons is that some university computing centers have not yet clearly recognized where they want to place themselves on the scale ranging from research laboratories to purely service facilities. Many make gradual moves back and forth, while others have gone through major organizational changes. In many cases the objective of such redefinitions has been to distinguish clearly the service functions from the academic functions. Several cases could be discussed, including an assessment of the advantages and disadvantages of the various schemes.

It is also important to recognize the distinction between a commercial service bureau and a university computing center. The organizational structure of the two types of centers may be rather similar, but while a commercial service bureau often provides a specialized service (a time-sharing service is a typical example), the university computing center mostly has the task of making a multitude of services and facilities available. Moreover, most service bureaux only try to provide services which are found to be economically profitable over longer or shorter periods of time. The university computing centers are often required to provide services independent of their profitability. Indeed, many such centers are by their very nature forced to provide non-profitable services. In this respect they may be compared with other public services like postal services or transportation services.

Another important difference is that most commercial computing services are operated in a highly competitive market, while a majority of the university computing centers enjoy a monopoly or a near monopoly. This increases the responsibility, and in itself it contains a danger of dissatisfaction amongst the users.
All these aspects also influence the software situation in university computing center environments. The multitude of services and facilities available is of course only possible with a similarly large amount of software available, including a vast number of application programs.

The cost components of the computing center are described by Gotlieb [6]. It is of particular interest that at most university computing centers the software budget, as it is shown directly on the accounts, is still rather marginal. This will of course change as the policy of computer companies of separate pricing for hardware and software develops. At NEUCC we currently spend as little as 2% of the total cost of operating the center on directly renting or purchasing software, and within a few years we estimate this figure to grow to more than 5%.

However, if we look into our staff expenses we may estimate that 60% of these are for staff members involved in developing, evaluating and maintaining software.

The major sources for software from outside the computer center environment are

-manufacturers
-software houses
-program libraries
-private communications.

The manufacturer normally also delivers the basic software like operating systems, compilers, assemblers, translators, etc. and, moreover, utilities and a variety of applications software. The availability of software is often both an important argument in the offer for sale of a computer and one of the elements in the choice made by the customer. It is, however, also found that computing centers use only a limited amount of the software offered and indeed even develop their own operating systems. For more specialized purposes we find important software developments performed in collaboration between the manufacturer and the customer. The policy of separate pricing of software is still new for many manufacturers, but one of its effects may be a shift from the manufacturer to other sources for software.

The software houses are characterized by either providing software for a customer on a special contract or developing software packages for sale or for lease. Software may also be developed for a manufacturer to enhance the software selection available for his particular machines. The whole range of applications software and basic software is available on the market, but most of the offers are for systems or rather big programs of more general usability such as Fortran compilers, linear programming systems, flowchart programs etc. Of particular interest are programs for accounting of the usage of a computer system, system measurement software, and simulation programs used in determining the optimal configuration for well-defined applications. The services of many software houses often go beyond making the products available to the clients and are often combined with consulting services.

Close to the software house concept is the university computing center or computer science department which develops software for research
purposes or for its own purposes and subsequently makes the products available to other interested installations. Large exchanges of software have been made in that way, mostly on an informal basis at no cost or at a nominal cost covering merely the expenses of replication, materials and shipping.

Besides ensuring the distribution of such university developed software through a program library, there is at present a trend that the distribution, maintenance and other services are ensured by a software house.

There are also several examples of university groups themselves forming and organizing software houses which are university-based. The idea is that a gap exists between research innovations in universities and research institutes and the state of the art in large industry. These companies are often centered around a particular piece of software like an operating system or a compiler.

A more conventional way of stimulating the contact between industry and the research environments at universities is by individual contracts, and large amounts of software are developed in this way.

Program libraries are a well-known and much used source for software, but the concept is broad and many of the libraries suffer from serious deficiencies.

The manufacturers often keep libraries where routines, programs or other larger packages are available for the customers. The items of the library are mostly classified according to the degree of service which the company guarantees for the programs. A low class of service is normally attached to the programs furnished to the library by the customers, and in many cases the contents of this section of the library are of varying quality or of virtually no value at all.

There are many of these general purpose libraries of computer programs, and they are normally best when they are organized as systems for handling abstracts and pertinent information on software items rather than for distributing the programs themselves.
Special purpose libraries concentrating on programs for use in a specific scientific discipline or a particular line of applications are normally of a limited size. It is for this reason that they are often able to offer a rather homogeneous quality and thus provide a highly useful service. In particular, such libraries are often a fine adjunct to the special libraries kept in the university computing centers.

Close to the library concept are the publications of algorithms in journals. They should be compared to normal publications and are often subject to the same degree of referee examination, which largely guarantees their quality. In [7] M. D. McIlroy suggests a factory for mass produced software components. Here he claims that the CACM algorithms, in a limited field, perhaps come closer to being a generally available product than do commercial products. However, such collections of algorithms also suffer from certain deficiencies. They are an ingathering of personal contributions and are often quite varying in style. Moreover, they fit into no plan, for the editor can only publish what the authors volunteer. It is further criticised that the algorithm sections of journals of learned societies cannot deal in large numbers of variants of the same algorithm. Variability which makes the algorithms more useful for a large number of users can only be provided by expensive run time parameters.

The review indicates that there are many types of formal sources of software. In the university computing center environment we find that besides these sources both the center and its users to a large extent also draw on more informal sources, and many pieces of software are obtained through private communications.

For the computing center it is important to keep an exact record of the software independent of its origin. This is done through the software inventory, which ought not only to list the software but also to contain a summary of the documentation available, the status of maintenance, the implementation characteristics and the degree of responsibility taken for the particular piece of software.

Many computing centers have found it feasible to combine the software inventory with the function of exercising central control over the quality of all software available in the center and provided to the users. This function provides the needle's eye between software under development or consideration and software for operational purposes offered by the computing center to the users on a regular basis.

With software stemming from many sources it is quite difficult to maintain an adequate standard of documentation. It is, however, a necessity that for every piece of software in the inventory there is documentation satisfying a set of requirements [8], [9]. There are four different categories of persons who need information about a piece of software.

-Users of the software. Based on the documentation they need to assess the suitability of the software for their problems, and they also need to see how the software may be used.
-Programmers. Based on the documentation they perform any corrections and further developments of the software.
-Systems staff at the computing center. Based on the documentation they perform the implementation on a particular computer.
-Operations staff at the computing center. Based on the documentation they ensure that the software runs on the computer.

Besides this documentation the computing center also needs a centralized service called the software advisor. This should not be confused with the ordinary programming consulting service, whose task is mainly to help users in debugging programs under development. The software advisor will

-assist users in defining their problems
-advise on available software, either within the environment or obtainable from elsewhere
-provide guidance on any new development of software necessary for solving the problem
-accumulate experiences.
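The software inventory and its role as a quality gate, as described earlier in this section, can be pictured as a small record structure. The following Python sketch is one possible reading: the fields mirror the items the inventory "ought to contain", and the gate requires documentation for each of the four categories of readers listed above. All class, field and status names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    UNDER_CONSIDERATION = "under consideration"
    UNDER_DEVELOPMENT = "under development"
    OPERATIONAL = "operational"


# The four categories of persons who need documentation.
AUDIENCES = ("users", "programmers", "systems staff", "operations staff")


@dataclass
class InventoryItem:
    name: str
    origin: str            # e.g. manufacturer, software house, library
    status: Status
    maintenance: str       # status of maintenance
    implementation: str    # implementation characteristics
    responsibility: str    # degree of responsibility taken by the center
    documentation: dict = field(default_factory=dict)  # audience -> summary

    def missing_documentation(self):
        """Audiences for which no documentation summary is recorded."""
        return [a for a in AUDIENCES if not self.documentation.get(a)]

    def may_be_offered(self):
        """Quality gate: operational and documented for every audience."""
        return (self.status is Status.OPERATIONAL
                and not self.missing_documentation())
```

A software advisor could query such a list by origin or status when advising on available software; the gate itself is deliberately strict and would in practice be a matter of local policy.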
The services of the software advisor are supported by suitable know-how on the software available in the software inventory.

In all considerations costs should play a proper role. Here we may distinguish between the open costs and the hidden costs. Open costs for software in the computing center are for

-developing software
-purchasing or renting software
-installing software
-documentation.

Those cost items will normally be recognized for each individual piece of software. The more hidden costs are for

-storing software
-replicating software
-servicing software
-maintaining know-how.

In particular the latter item is very important, and the ambitious computing center with a long inventory may find itself in a situation where it has far too many items in its inventory in comparison with its staff resources for servicing the software and for providing know-how and assistance on the software.

There is also the cost of using the software available. Are the software pieces reasonably efficient, and are users aware of the operational costs? It is also the duty of the software advisor to provide guidance to the users about these matters. The awareness of costs may provide a better basis for a decision to use the available standard program, to adapt an available standard program or to develop a new program to solve the specified problem. An encouragement for the recommended solution may be provided through the pricing scheme of the computing center for its software services.
4. INSTALLATION AND MAINTENANCE OF A PIECE OF SOFTWARE

In the following we shall follow a piece of software from the time the need has been established, through the installation phase, and into the phase where it is made available to the users on a regular service basis. The piece of software under consideration may form part of the basic software, like an operating system or a compiler, or it may form part of the applications software, like a package for linear programming or statistics.

From whom does the initial motivation to increase the inventory of software at the computing center come? This is perhaps not possible to answer in general, but we may list

-users
-software advisor
-systems programmers.

They are all concerned with problems to be solved and may recognize that existing facilities, including existing software, do not satisfy a new problem range.

At this stage the new piece of software should be documented in the form of a proposal. This should explain why the new software is desirable, provide proper specifications and also outline the likely costs concerned with the software, including the hidden costs. Each appropriate section of the computing center must review the proposal and comment on it based on its area of responsibility.

At this stage the proposal may give occasion for feedback to the software producer. It may be found that changes should be made, or indeed that another version of the software is likely to provide better service than the one originally proposed.

At the phase of decision there should be a document describing in some detail the product's operations and also its performance. Those are the specifications. Their level of detail should be deep enough that they really provide a clear set of expectations of the software.
It is assumed that the software producer provides proper testing of his product before he passes it on to his clients and that he satisfies himself that it is fit for release. This testing may be done entirely without collaboration with the client, or it may be combined with a field test. The latter procedure is to be encouraged, but only if it is clearly underlined that the responsibility still rests fully with the producer.

Once the product is provided to the client, he often accepts it at its face value or at most runs a demonstration to prove that the main features are working as expected. At a later stage he may discover the inconveniences, the errors, the omissions and in general that his expectations have not been fulfilled.

The consequences of this are only too well known: they lead to wasted time and effort, and they create a lack of confidence in any changes or improvements to existing software.

To prevent this the computing center must provide its own acceptance test, to be applied rigorously to any piece of software before it is put into operation and in turn made available to the computing center environment.

The aim is to ensure that we get the software we expected, which means that it fulfils the specifications drawn up at the stage of deciding the acquisition.

Hopefully this acceptance test will also provide an incentive for the producer to improve his own testing procedures and quality control before he releases software.

The test procedure should include

(i) Documentation
(ii) Availability
(iii) Verification of facilities
(iv) Performance assessment.

For each of the items there must be stated criteria of acceptance, and only when these are fulfilled is the software approved.
The procedure is not trivial and it may require considerable effort. In [10] Llewelyn and Wickens describe an acceptance scheme for software and find the cost for a typical currently available operating system to be 75 man-months, together with the use of 47 machine-hours. They find the total cost of the exercise to be approximately £25,000 spread over a period of a year.

The National Computing Centre, Manchester, has suggested a procedure for a formal verification and certification of a program with the following stages.

1. The identification of the type and purpose of a program, the configuration on which it is known to run, mode of use and language.
2. The identification of the level of documentation, technical support and level of use.
3. The carrying out of tests, either by an independent authority or jointly with a user group, to check that the program operates in accordance with the instructions given in the user manual and that the program actually does what the manual claims it will do.

A verification service of this kind is certainly a great improvement, but it would never make the acceptance test by the computing center completely superfluous.

Once the software is tested and accepted it will be installed on the machine, during which process a decision will also be made on the installation-dependent parameters. For those it may be important to have a prior estimate of the likely usage of the software, as the setting of the parameters may influence the performance during operation.

The software available to the users in the computing center environment should be properly introduced, to ensure on the one hand that they take advantage of the new facility and on the other hand that its usage is limited to those purposes for which it was intended. This is the task of the software advisor, who will provide mechanisms for the introduction and the training of the users on the new piece of software. This may take place in the form of courses and seminars and may also involve development of new documentation to supplement the users' manual.

Furthermore, methods are provided for ensuring the distribution of the software. It may be placed permanently on primary or secondary storage on the machine with direct access for the users, or it may be placed remotely on cards, tapes or discs. In the latter case there should be good facilities to secure replication and rapid distribution.

During the life-cycle of the software it is under constant evaluation with respect to

-performance
-quality
-usability.

These experiences should be collected in a continuous way, with an easy procedure for deciding on

-error correction
-changes of implementation parameters
-changes of facilities.

The procedure should also include a procedure to determine when a piece of software is to be removed from the inventory of the computing center.

Clearly the procedure includes a mechanism for feedback to the original software producer, either to encourage him to perform changes in his product or to provide inspiration for new products.

5. CONCLUSION

There has in recent years been much concern over software: its bad quality, delays in delivery, costs which exceed the estimates etc. We may not be able to improve the situation in a drastic way on a short term basis, although the search for basic principles in the concept of software engineering does give occasion for more optimism.

The users of software, however, must be aware that they also have a large responsibility for the improvement, and if a larger awareness of this aspect has been obtained through the present paper, one of the goals has been achieved.
[1] P.A. Samet: Measuring the efficiency of software, Proceedings SEAS XIV, Grenoble, France, 1969.
[2] H.J. Helms et al.: Experiences from operating NEUCC (in Danish), Forskning, December 1968.
[3] E. Hunt, G. Diehr, D. Garnatz: Who are the users? - An analysis of computer use in a university computer center, AFIPS Conference Proceedings Vol. 38, 1971. Spring Joint Computer Conference, 1971.
[4] D. Knuth: An empirical study of FORTRAN programs, Software Vol. 1, No. 2, 1971.
[5] S.C. Darden and S.B. Heller: Streamline your software development, Computer Decisions No. 2, 1970.
[6] C.C. Gotlieb: Pricing mechanisms, Advanced Course on Software Engineering, 1972.
[7] M.D. McIlroy: Mass produced software components, in P. Naur and B. Randell (eds.): Software Engineering, Report on a conference, October 1968.
[8] H.J. Helms (ed.): Guidance in Construction of Datamatic Systems (in Danish), Studentlitteratur, Lund, 1972.
[9] G. Goos: Documentation, Advanced Course on Software Engineering, 1972.
[10] A.I. Llewelyn and R.F. Wickens: The testing of computer software, in P. Naur and B. Randell (eds.): Software Engineering, Report on a conference, October 1968.
Appendix
SOFTWARE ENGINEERING
Friedrich L. Bauer
Technical University, Munich, Germany
"Our problems arise from demands, appetites and our exuberant optimism. They are magnified by the unevenly trained personnel with which we work".
Alan Perlis
This lecture was presented by F. L. Bauer on August 28, 1971 during the IFIP-Congress 1971 at Ljubljana, Yugoslavia, and was published in 1972 by the North-Holland Publishing Company, Amsterdam-London, in the "Proceedings of the IFIP Congress 71" edited by C. V. Freiman (pp. 530-538).
Software Engineering seems to be well understood today, if not the subject, at least the term. As a working definition: software engineering is that part of computer science, which is too difficult for the computer scientist.
1. WHAT IS IT?
1.1. The common complaint
When the word software engineering was introduced a few years ago, it was done in a provocative way. The use of the word was intended to signal a certain deficiency in the computer world, and "software engineering" by analogy pointed out a certain remedy. What have been the complaints? Typically, they were
- Existing software is produced by amateurs (regardless of whether it is done at universities, software houses or manufacturers)
- Existing software development is done by tinkering (at the universities) or by the human wave ('million monkey') approach at the manufacturer's
- Existing software is unreliable and needs permanent 'maintenance', the word maintenance being misused to denote fallacies which are expected from the very beginning by the producer
- Existing software is messy, lacks transparency, prevents improvement or building on (or at least requires too high a price to be paid for this).
Last, but not least, the common complaint is
- Existing software comes too late and at higher costs than expected, and does not fulfill the promises made for it.
Certainly, more points could be added to this list.
1.2. The aim
Clearly, nobody likes software having the characteristics mentioned above. But a negative definition of software engineering would not be the right answer. Positively, the aim may be stated:
To obtain economically software that is reliable and works efficiently on real machines.
Software engineering would then mean the establishment and use of sound engineering principles in order to reach that aim. Before considering the question what these principles are or might be we have to look at the existing situation again and to ask ourselves: What differences between the computer field and other fields of science and technology exist which give rise to the difficulties outlined above?
1.3. The paradox of non-hardware engineering
An answer lies in the paradox that is inherent in the combination of the word engineering and software. Engineers usually deal with material subjects, with hardware in the widest sense, from chariots to steam engines and airplanes, from jungle footbridges to the Verrazano Narrows Bridge, or, to use the word «ingénieur» in the meaning of the 17th century French builders of fortresses, from ramparts to Maginot lines. One may object to this that electricity is not a material, and indeed, electrical engineers seem to be somewhat more abstract, somewhat more noble than others, but in common with other engineers they deal with physical objects. And here, the difference comes up: software is not a physical object, it is non-material. It needs physical objects as carriers only, and it is altogether unspecific about the carrier.
Since the material is cheap - paper as a carrier is sufficient - and the tools are at hand - usually one's own head - to produce some software is a common puberty rite for beginners in the computer field. As CHEATHAM says in his lecture at this Congress, things can be sensed in normal engineering, thus they can be judged easily whether they are reasonable. The abstract nature of software disallows this. Indeed, software is an abstract web, comparable to mathematical tissue, but it is a process and in so far very different from most of usual mathematics, too.
The difficulties with software can already be observed in the problem it poses with respect to the German patent law. Is software patentable? According to the German patent law, software consists only of 'instructions to the human mind' and is therefore not patentable, despite the fact that it usually needs 'ingenuity' and that its protection may be important to the national economy.
So something is different about software, something which has the effect of prohibiting software engineering from being simply a copy of other engineering fields. My impression is that this difference has not been given proper recognition and attention in the past, and that many of the complaints are based on after-effects of this neglect. Of course, the mere fact that in the early days progress was strongly associated with the hardware engineer explains this somewhat, and the idea of software as an industrial product, to be purchased at regular prices in an open market, is even now not fully accepted. Something that is given away free might very well not attain more value than a gold plated car medal one obtains with gasoline.
Moreover, a hasty buildup in the computer industry has not provided the best climate for satisfactory development of good software. ED DAVID ([G], p. 73) said:
"In computing, the research, development, and production phases are often telescoped into one process. In the competitive rush to make available the latest techniques, such as on-line consoles served by time-shared computers, we strive to take great forward leaps across gulfs of unknown width and depth. In the cold light of day, we know that a step-by-step approach separating research and development from production is less risky and more likely to be successful. Experience indeed indicates that for software tasks similar to previous ones, estimates are accurate to within 10-30 % in many cases. This situation is familiar in all fields lacking a firm theoretical base. Thus, there are good reasons why software tasks that include novel concepts involve not only uncalculated but also uncalculable risks".
But the situation is improving and has even improved already to some extent. The economical importance of software is now fully recognized. Estimates that the software used with large machines often costs just as much as pure hardware costs are now viewed by manufacturers. This has had, of course, the effect that in the software field an extra inflationary tendency was introduced; but even if no world-wide recession cools the overheated market, the recession in the USA - insofar as it applies to computers - will already act as a regulator.
1.4. The role of education
But it seems that the core of the difficulties lies deeper, and the situation outlined above has only brought it to the open - fortunately, I may say. My observation is that the problem that is meant by the provocative use of the phrase 'software engineering' is in fact an educational one. Surprisingly enough, there seems to be agreement about this point from two extreme sides of the software gang: from the 'theorists', as they are sometimes called, and from the 'practitioners'.
Perhaps it is less surprising that the practitioners are uneasy. Computer Science, as exercised in the United States, is not only sometimes somewhat highbrow, it also has a tendency to neglect the practitioner's immediate needs. Rightfully so, if one thinks that the only orientation academic education has is towards a Ph.D., but this ideal picture does not hold. Attempts in Europe to define «informatique» in France, "Informatik" in Germany, in a way so as to strengthen the practical side of programming have still a way to go in order to prove their effectiveness.
What the practitioners want is the introduction of sound engineering techniques in Computer Science teaching. Said D'AGAPEYEFF ([G], p. 24): "We need a more substantial basis to be taught and monitored in practice on the structure of programs and the flow of their execution, on the shaping of modules and an environment of their testing, and on the simulation of run-time conditions".
In any case, the 'theorists' are even more upset (DIJKSTRA: "the massive dissemination of error-loaded software is frightening" ([G], p. 16)) and they propose real changes in programming habits. LUCAS, from the Vienna IBM Lab, reporting about a mechanical correctness proof, which by failing indicated an error, said ([R], p. 21): "The error was not found by the compiler writers. I am quite convinced that making this proof was cheaper than the discussion I have heard among highly-paid people on whether or not this allocation mechanism would cover the general case". And DIJKSTRA says "Testing shows the presence, not the absence of bugs" ([R], p. 21).
How the concept of structured programming which he advocates combines with engineering needs will be seen later. In its tendency to go from the general to the particular, to detail the description of a system step-by-step, it coincides with modern top-down teaching. In particular it helps the student to develop a sense for the conscious discipline that is needed in programming, and early in the education it supports the production of clean, gimmick-free, defensive programming.
In the course of such an education, it may be hoped that a code of good practice for professional programmers will develop.
2. SOFTWARE DESIGN AND PRODUCTION IS AN INDUSTRIAL ENGINEERING FIELD
'On the Division of Mental Labour'
Charles Babbage, chapter heading in his book 'On the Economy of Machinery and Manufactures'
2.1. Large projects
For the time being, we have to work under the existing conditions, and the work has to be done with programmers who are not likely to be re-educated. It is therefore all the more important to use organisational and managerial tools that are appropriate to the task, in particular to large projects - i.e. projects which essentially cannot be carried through by one man within the specified time. It also goes without saying that a code of good practice, as stipulated above, will be of utmost importance if the work has to be divided by groups. Communication within the group is the main problem; and whether the resulting work increases with the square root, or with the dual logarithm of the number of co-workers, or even decreases after some critical size, depends on the degree of commonality.
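The three growth patterns named above can be compared numerically. The following Python sketch is an illustration only and is not taken from the lecture: it assumes a toy model in which each of n co-workers contributes one unit of work and loses a fixed fraction `overhead` of it for every colleague he has to communicate with. The function names and the value 0.04 are hypothetical.

```python
import math

def net_output(n: int, overhead: float) -> float:
    """Useful work of a group of n people when each person loses
    `overhead` units of work per colleague to communication.
    (Toy model; the linear loss per colleague is an assumption.)"""
    per_person = max(0.0, 1.0 - overhead * (n - 1))
    return n * per_person

def critical_size(overhead: float, limit: int = 1000) -> int:
    """Group size (up to `limit`) that gives the largest net output."""
    return max(range(1, limit + 1), key=lambda n: net_output(n, overhead))

for n in (1, 2, 4, 8, 16, 32):
    print(n,
          round(math.sqrt(n), 2),         # growth with the square root
          round(math.log2(n), 2),         # growth with the dual logarithm
          round(net_output(n, 0.04), 2))  # falls off after a critical size

print("critical size:", critical_size(0.04))
```

In this model a smaller `overhead`, which one may read as a higher degree of commonality, moves the critical size upwards.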
2.2. Division into manageable parts
If software is to be designed and produced in an industrial process, the problem of division of labour is the main obstacle. Frequently, there are no natural boundaries to suggest a division into manageable parts. More important, in contrast to a normal industrial process which gains its efficiency from the economization of frequent repetition, the situation in software is different from day to day, from case to case. Moreover, as software is usually highly interwoven, breaking it into manageable parts frequently leads to a host of interface specifications. The solution can therefore not be sought in a mosaic-like sub-division (fig. 1).
Fig. 1.
Instead, a hierarchical structure is needed, in the simplest case a tree-structure (fig. 2) where no (or only few) connections exist between pieces at the same depth. The gain is to be found in stepwise detailization, which establishes the vertical interfaces in a natural way and keeps them to a minimum. The main difficulty rests, however, in finding the appropriate layers.
Fig. 2.
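Whether a proposed division has the tree-like shape of fig. 2 can be checked mechanically once the parts and their connections are written down. The Python sketch below is only an illustration: the part names, the `parent` table and the `connections` list are hypothetical, and depth is simply counted from the root.

```python
def depth(part, parent):
    """Number of steps from `part` up to the root of the hierarchy."""
    d = 0
    while parent[part] is not None:
        part = parent[part]
        d += 1
    return d

def horizontal_connections(parent, connections):
    """Return the connections that join two parts at the same depth.
    In a clean tree structure this list is empty (or at least short)."""
    return [(a, b) for a, b in connections
            if depth(a, parent) == depth(b, parent)]

# Hypothetical system: each part names the part it refines.
parent = {
    "system": None,
    "file handling": "system",
    "scheduling": "system",
    "disc driver": "file handling",
    "clock": "scheduling",
}
# Connections other than the vertical (parent/child) interfaces.
connections = [("file handling", "scheduling"), ("disc driver", "scheduling")]

print(horizontal_connections(parent, connections))
# prints [('file handling', 'scheduling')]
```

Only the first connection is reported, since it is the only one that joins two pieces at the same depth.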
As an example of such a structure, I would like to take the organization of the project BS Munich, an operating system for a Telefunken 2-processor configuration, being built by a working group at the Technical University, Munich (fig. 3). The example for the hierarchical structure supporting one arbitrary user process has been taken from routine material and has not been made up for our purpose; in particular it would be difficult to answer the anticipated question 'what do the lines mean?' - nevertheless, it illustrates the point.
Fig. 3.
2.3. Division into distinct stages of development
Also in contrast to the usual situation in engineering, the division into distinct stages of development is a problem. The need for thorough feedback from construction to design, from use to construction is usually given as a reason. But this is not new at all, it is in fact characteristic in industrial manufacturing. It may, however, be that there more feedback is needed from production because of the poor status of the design, and more from maintenance because of poor construction. Again, the haste in the build up might be held responsible, including the fact that in the computer field PETER's principle is not valid.
Nobody seems to reach the level of incompetence, because probably everybody is incompetent (D'AGAPEYEFF: "those who are incompetent find each other's company congenial"). Therefore, nobody will ever do something again as soon as he somehow understands it. The hope that time will cure these ills is insufficient. The inner complexity of large software projects needs a careful treatment of organizational hazards. Fortunately, the computer itself can help.
2.4. Computerized surveillance
The whole design, production and maintenance process has to be subjected to computerized surveillance. The points to be looked at are in particular:
- Automatic updating and quality control of documentation
- Selective dissemination of information to all project staff
- Surveillance of deadline plans
- Collection of data for simulation studies
- Collection of data for quality control
- Automatic production of manuals and maintenance material.
It is clear that a house well equipped with programs and an underlying philosophy for doing these things, can be regarded as a modern software plant. The tools are to a large extent at hand, although they are sometimes used to "nibble at the periphery", as someone from a leading manufacturer has stated. Many excellent remarks about the theme will be found in the Reports on the Software Engineering Conferences in Garmisch (October 1968, [G]) and Rome (October 1969, [R]). More modest, but probably earlier successful efforts are those described by LANDY and NEEDHAM [15].
2.5. Management
Needless to say that successful operation in an industrial engineering field requires the full repertoire of management artifices that is at hand. Yet, many project managers in software design and production have never heard of such things and even if they are aware of this deficiency, they have neither time nor opportunity to acquire the necessary knowledge. As soon as the software market enters into a competitive situation, this will change. Education should be particularly concerned about providing the elementary knowledge and the willingness to apply it. About management problems, the Garmisch [G] and Rome [R] reports contain many interesting details - it would go too far to mention here all the names.
3. THE ROLE OF STRUCTURED PROGRAMMING
3.1. A hierarchy of conceptual layers
The essential point, however, is to organize the software project in conceptual layers. This technique is known under different names. It is essentially what DIJKSTRA (1969) does in his "Notes on Structured Programming" ([3], see also [R], pp. 84-88). Stepwise abstraction is advocated; the writing of a program should start with the most abstract form. Doing the labour mentally, one does not have to introduce formalized language at different levels. But doing so, one arrives at the use of a sequence of languages, from the highest being the user's language, problem-oriented in the main, to the lowest, usually the machine language. In this form, the technique has been used somewhat widely since first described (to my knowledge) in the 1958 UNCOL Report [11], where three levels of languages were advocated, the one intermediate level being the 'Universal Computer Oriented Language' ([11], Appendix A). The essence of such a hierarchical structuring, however, was given in 1968 by ZURCHER and RANDELL [14]. They speak, like DIJKSTRA, of design "from the outside inwards", using different "levels of abstraction" and achieving "successively greater detail". The technique is also advocated by J. I. SCHWARTZ in a most interesting contribution at the Rome Conference [10].
The direction is here 'top-down', and interestingly it is the same as in modern top-down teaching of programming. There is, however, also the choice of adopting a bottom-up approach to the design, illustrated by POOLE and WAITE [7], who start from machine level, which is defined by a real machine, then introduce a sequence of abstract machines, each one being defined in terms of one or some of its predecessors. For the final structure neither the direction matters nor is there any fundamental difference between abstract machines and intermediate languages.
In the simplest case, we will have a linear ordering (fig. 4) of levels or layers. More generally, the ordering will be a partial ordering only. The levels as such disappear, we may speak of layers only and incomparable layers may exist (fig. 5).
Fig. 4.
Fig. 5.
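The difference between a linear and a merely partial ordering of layers can be made concrete. The Python sketch below is an illustration under stated assumptions: the layer names and the `based_on` table are hypothetical, and `graphlib.TopologicalSorter` from the Python standard library (version 3.9 or later) is used to confirm that the relation contains no closed loops.

```python
from graphlib import TopologicalSorter

def below(layer, based_on, seen=None):
    """All layers that `layer` rests on, directly or indirectly."""
    seen = set() if seen is None else seen
    for lower in based_on.get(layer, ()):
        if lower not in seen:
            seen.add(lower)
            below(lower, based_on, seen)
    return seen

def incomparable_pairs(based_on):
    """Pairs of layers of which neither rests on the other."""
    layers = sorted(set(based_on) | {l for ls in based_on.values() for l in ls})
    return [(a, b) for i, a in enumerate(layers) for b in layers[i + 1:]
            if a not in below(b, based_on) and b not in below(a, based_on)]

# Hypothetical layering: each layer lists the layers directly below it.
based_on = {
    "user language": ["file layer", "arithmetic layer"],
    "file layer": ["machine"],
    "arithmetic layer": ["machine"],
    "machine": [],
}

# static_order() raises graphlib.CycleError if the relation has a closed loop,
# i.e. if it is not a partial ordering; lower layers are listed first.
print(list(TopologicalSorter(based_on).static_order()))
print(incomparable_pairs(based_on))
```

With a linear ordering as in fig. 4 the list of incomparable pairs would be empty; here the two middle layers form such a pair.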
Since one man and/or one machine is not necessarily implied by the picture, we have the most general situation of fig. 6. Such a structural scheme means that everything in the meaning of a certain layer is based directly on the layers immediately below.
Fig. 6.
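The rule that each layer is based directly on the layers immediately below, together with the bottom-up sequence of abstract machines mentioned for POOLE and WAITE, can be sketched in a few lines of Python. This is a minimal illustration and not their system: the three classes and their operations are hypothetical, and the 'real machine' is assumed to offer nothing but addition.

```python
class RealMachine:
    """Lowest layer: stands in for the real machine; it can only add."""
    def add(self, a: int, b: int) -> int:
        return a + b

class MultiplyMachine:
    """Abstract machine defined solely in terms of the layer below it."""
    def __init__(self, lower: RealMachine):
        self.lower = lower

    def mul(self, a: int, b: int) -> int:
        result = 0
        for _ in range(b):              # b is assumed non-negative
            result = self.lower.add(result, a)
        return result

class PowerMachine:
    """Next abstract machine: uses only the operations of MultiplyMachine."""
    def __init__(self, lower: MultiplyMachine):
        self.lower = lower

    def power(self, base: int, exponent: int) -> int:
        result = 1
        for _ in range(exponent):       # exponent is assumed non-negative
            result = self.lower.mul(result, base)
        return result

top = PowerMachine(MultiplyMachine(RealMachine()))
print(top.power(2, 10))  # prints 1024
```

Because `PowerMachine` never reaches past the layer directly below it, the crude `MultiplyMachine` could later be exchanged for an efficient one without touching the top layer.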
3.2. Communication between layers
At any interface between layers, we may consider whatever means of intercommunication we find as a language, by which the concepts of the higher layer are expressed in terms of concepts of the lower layer. There is no logical reason, however, why the same language should not be used at different interfaces. In fact, the UNCOL idea meant that UNCOL would be used in every communication. We know today that under most practical circumstances more than one intermediate language is worthwhile. However, extensible languages (CHEATHAM) allow one to develop within one language different styles, appropriate for the respective layer. The use of the same language at two levels also allows one to make use of recursive descriptions. In these descriptions, we find - seemingly in contradiction to the partial ordering - closed loops of descriptional reference. Fig. 7 (A) shows such a situation - the arrow between two layers meaning: "In the description of the concepts of the one layer, use of the concepts of the other is made". Nevertheless, we should hope that the recursive description does not lead to a circular definition, that is, that we have a partially ordered conceptional structure like the one in fig. 7 (B).
Fig. 7.
Concepts and their descriptions are different things. This is important in the following respect: The language used at a higher conceptional interface does not have to be a 'higher level' language. Neither the degree of redundance to be used nor the syntactical complexity are necessarily correlated with the conceptual layers. But usually the more detailed, lower layer will use a less compact notation. It is also not necessary that the languages be formalized - in particular those used at higher layers will frequently not be completely formalized. Thus, we are not so much concerned with the language as such to be used, as with the style of use. Religious aspects in the use of some current programming languages are irrelevant.
Fig. 8.
Fig. 9.
An important matter, however, is the kind of communication between layers. In simple cases, it can be strictly operative or strictly descriptive ("communication of control" and "communication of information" in the sense of ZURCHER and RANDELL). It usually is a mixture, and sometimes does not show the pragmatic distinction between control and information at all. It may, in special cases, require a finite number of parameters of predetermined importance, quite similar to subroutine parameter sequences. Then one speaks of 'parameterized generality'.
3.3. Software engineering aspects
Apart from the obvious conceptual discipline and economization structured programming brings forth, it has special technical merits. A system of layered structure easily lends itself, as is well known, to bootstrapping techniques. This has been demonstrated already in the UNCOL report [11]*. For the simple portability problem - the transition from 705 ML to LARC ML, having a description of a translator from UNCOL to LARC ML, written in UNCOL, and using a 705 - in a first run a description of this translator, written in 705 ML, is obtained by using the existing UNCOL to 705 ML translator, and in a second run with the help of this translator, the wanted UNCOL to LARC ML translator, written in LARC ML, is obtained (fig. 8). Moreover, if a translator description of SOMEL into ML, written in SOMEL, concentrates all efforts on making the translator very efficient both in the compiling process and the run-time characteristics of the code produced, then bootstrapping with a crude translator of SOMEL into ML, written in ML, obtains in one run (which may take long time) a translator of SOMEL into a good ML, written in ML, which may now be applied again to the original description, resulting in an efficient translator from SOMEL into good ML irrespective of the crudeness of the bootstrap translator. 'Good' ML, obviously a subset of ML, is abbreviated GML in fig. 9, which shows that this frequently used bootstrapping process is technically identical with the one of fig. 8.
* The UNCOL project, although being 'spectacularly unsuccessful' and 'an exercise in group wishful thinking', as two leading scientists have stated, was nevertheless the first software engineering attempt.
Thus, using layered description, simulation can be greatly simplified, as ZURCHER and RANDELL [14] have pointed out in particular. They stress the evolutionary aspect of the software design labour. To begin with, inefficient realizations of lower layers may be used - highly interpretative schemes for example - which may be easily built, checked and changed. These will be replaced towards the end from above to below by final, efficient schemes.
During the design labour, or in construction, intermediate layers can be expressed fully by lower ones. This is the situation resembling the use of open subroutines, and will to some extent have advantages. Very often, however, it is worthwhile to keep the layered structure. DIJKSTRA has shown this in his design and construction (1967) of the T. H. E. multiprogramming system [2]. This offers great flexibility for later changes. More details, in particular about the formation of the layers by introduction of abstract machines, are given in a working paper in [G], pp. 181-185.
One more remark may be in order: Structured programming may even go down to include the microprogramming level.
3.4. Flexibility: portability and adaptability
The flexibility structured programming offers with respect to the changes that occur during the work is particularly evident in the two ends that have been so far regarded as fixed: the machine end and the user's end. The latter means that a changing situation with the user enforces changes, adaptations to new foreseen or unforeseen situations. The situation has been called adaptability [RP]. The former means changing machine characteristics, foreseen or, as is usually the case with a new machine, unforeseen ones. This situation has been called portability [RP].
The case of foreseen changes offers in fact nothing new, since then the problem can be considered as being taken care of from the beginning. (The word availability that has been used sometimes in this connection is misleading.) Portable software and adaptable software mean, however, that something has to be changed, depending on the unforeseen change. The hope is to keep this to a minimum, and as in the previous case, to achieve this by suitable structure so that perhaps only the immediate neighbouring units will have to be changed, or at least very few of them. In general, the effect of changes should rather be damped at more remote layers.
3.5. Some existing examples
There exist a number of examples for software which is sufficiently portable or adaptable so that its portability ratio or adaptability ratio, respectively, is less than 5 %, the ratio in question being the effort necessary to make changes, in relation to the total effort.
An early example is the ALCOR ILLINOIS compiler for ALGOL 60, which was built for an IBM 7090 and was transferred by DAVID GRIES to an IBM 7044 in two weeks [5]. Its portability* was achieved mainly through parameterization. More recently, MARTIN RICHARDS with his BCPL compiler has given several examples of successful portability, to a KDF 9 ([R], p. 29) and recently to a Telefunken TR 440. Very impressive are the experiments POOLE and WAITE made, using a 'mobile programming system' with the macro processor STAGE 2 as tool ([7, 12, 13]). STAGE 2 itself is highly portable and has been implemented on 20 different computers, requiring about one man-week of effort to obtain a running version in each case [8]. They have ported, among others, several compilers to a number of machines of quite different characteristics. D. T. ROSS with his system AED [9] claims portability, through a complex bootstrapping approach, too ([R], p. 29), and favours macro-expansion ([G], p. 150). There are many more interesting approaches scattered in the literature.
* The problem was thoroughly discussed by S. Warshall at the Rome Conference [16].
On the side of adaptability, examples have been given, too. Parametrizing 'generic software' has been used, e.g., for varying precision of calculation and arguments range in numerical approximation. Mc ILROY proposed to use 'software components' which allow software to be built mosaic-like from a multitude of mutually harmonized small pieces, to be ordered from a catalogue [6]. Such an ambitious goal is not likely to be attacked successfully in near future, but theoretically it falls fully within the 'structured programming' idea.
Keeping in mind that our definition of user and machine is relative, we obtain a number of further examples through macro generators which allow the specification of new macros, and more generally through extensible languages. In these examples, although the extra work is practically negligible, the possible changes are, however, also narrowly restricted.
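The two-run bootstrapping of section 3.3 (fig. 8) can be traced with a small model. The Python sketch below is an illustration only: a translator is reduced to three labels (the language it accepts, the language it produces, and the language its own text is written in), and the `Translator` type and the `run` function are hypothetical names introduced for this purpose.

```python
from typing import NamedTuple

class Translator(NamedTuple):
    source: str       # language it accepts
    target: str       # language it produces
    written_in: str   # language its own text is written in

def run(tool: Translator, host: str, text: Translator) -> Translator:
    """Run `tool` on machine `host` to translate the text of another translator."""
    if tool.written_in != host:
        raise ValueError(f"{tool} cannot run on a {host} machine")
    if text.written_in != tool.source:
        raise ValueError(f"{tool} cannot read {text.written_in}")
    # Translating a translator changes only the language it is written in.
    return text._replace(written_in=tool.target)

# Starting point: the description of the wanted translator, and the
# translator that already exists on the 705.
description = Translator("UNCOL", "LARC ML", written_in="UNCOL")
existing = Translator("UNCOL", "705 ML", written_in="705 ML")

first_run = run(existing, "705 ML", description)    # UNCOL -> LARC ML, in 705 ML
second_run = run(first_run, "705 ML", description)  # UNCOL -> LARC ML, in LARC ML
print(second_run)
```

Both runs take place on the 705; only the result of the second run can be executed on the LARC. The same two calls, with a crude translator in place of `existing`, describe the process of fig. 9.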
3.6. The trade-offs
Known successes in making software portable and/or adaptable have often accepted considerable inefficiency as the price to be paid for this. This has been the practical result, but it is not a logical necessity. Even with this present situation, the advantages of portable and adaptable software have overcome the accompanying inefficiencies in many cases. The values implied by this trade-off point to the urgent need for further research.
In this connection, it is important to develop system evaluation tools. A detailed survey has been given by GOTLIEB and MAC EWEN [4], and most recently some very interesting results came from ASLANIAN and BENNET [1].
4. CONCLUDING REMARKS
Software engineering has probably a long way to go before it can repay the costs that have to go into it. The discussion of structured programming as a software engineering approach has left a number of questions open: how to find the right layers, for example. All experts agree that this is the most important thing, and it seems to require so much intuition that it cannot be taught simply. But although no one would suggest that software engineering now can be left to a robot, it is important that - to use a phrase of LEIBNIZ - "excellent men should not lose hours like slaves in the labour which could be safely relegated to any one else if machines were used".
Progress in software engineering can be expected only if the available techniques are more widely used and applied to a variety of situations. Comparison can then show the advantages and disadvantages. Such a comparison between commercial manufacturers is hardly imaginable, and therefore a cooperative effort of governments has been proposed. The result of an international, non-commercial activity in the development of software engineering techniques could at the same time be some help for the user who finds it more and more difficult to obtain the software he needs in view of the growing complexity of the computer system. Such an enterprise should, however, be in contact with manufacturers and software houses in order to avoid a drift into the purely academic direction, and should in particular publish its final products for free use. In view of the long time the preparations took so far, however, it is doubtful whether such a thing would come at all in time.
In the four years since autumn 1967, when the phrase 'software engineering' was introduced to a wider public, many people - scientists, educators, managers, businessmen - became aware of the problem. Software houses commence to reorient themselves, tutorial meetings are held, like one by Infotech in London this year, and the scientific affairs divisions of governmental agencies support further development; for example an International Advanced Seminar on Software Engineering, under EEC auspices financed by the German Federal Ministry for Education and Science, is under preparation and will be held in Munich in February/March next year, hopefully providing the computing community with well-organized teaching material in some form. Last not least, the fact that IFIP has taken up this subject in its congress program is a most encouraging sign.
Some of the effects software engineering may have may not be liked universally. From a list DIJKSTRA compiled, I take:
It may be necessary to change our tools - which is expensive, to change our hardware - which is upsetting balance, to change the organizational set-ups in which our work has to be done - which is alarming for some supervisors. It may mean that we have to change our thinking habits - which a majority of the computer community may dislike.
Unemployment of unskilled programmers may very well be a result of software engineering. The gold-rush will not last forever. The computer, one of the greatest inventions of engineers, has to go the complete way of engineering to its end.

ACKNOWLEDGEMENTS

I have heard many views and learnt about the details at the Working Conferences sponsored by the NATO Science Committee, held in 1968 at Garmisch and in 1969 at Rome. For a systematic approach, I owe thanks for fruitful discussions to Dr. E. DAVID, formerly at Bell Teleph. Lab., and Dr. W. NORTON, Culham Laboratory, UKAEA, and to many of my academic colleagues. My particular thanks go to Prof. C. C. GOTLIEB for editorial help.

REFERENCES

[G] (Garmisch Report) P. NAUR and B. RANDELL (ed.): Software Engineering. Report on a Conference, Garmisch, October 1968.
[R] (Rome Report) J. N. BUXTON and B. RANDELL (ed.): Software Engineering Techniques. Report on a Conference, October 1969.
[RP] Recommendation of the Planning Board for an International Computer Science Institution. Working Document, Rome Conference on Software Engineering Techniques, October 1969.
[1] R. ASLANIAN and N. BENNET: Computer Oriented Operating System Design Using Evolutive Modelling and Evaluation. CII Working Document (May 1971) submitted to the Palo Alto October 1971 Symposium on Operating Systems Principles.
[2] E. W. DIJKSTRA: The Structure of the T.H.E. Multiprogramming System. ACM Symposium on Operating Systems Principles, 1967. See: Comm. ACM 11 (1968), 341-346.
[3] E. W. DIJKSTRA: Notes on Structured Programming. Report Nr. 241, Technische Hogeschool Eindhoven (1969).
[4] C. C. GOTLIEB and G. H. MacEWEN: System Evaluation Tools. In: [R], pp. 93-99.
[5] D. GRIES, M. PAUL and H. R. WIEHLE: Some Techniques Used in the ALCOR ILLINOIS 7090. Comm. ACM 8 (1965), 496-500.
[6] M. D. McILROY: Mass-produced Software Components. In: [G], 138-155.
[7] P. C. POOLE and W. M. WAITE: Machine Independent Software. Proc. ACM Second Symposium on Operating Systems Principles, Princeton, N. J., October 1969.
[8] P. C. POOLE and W. M. WAITE: The Design of Portable Abstract Machines. Culham Lab. Report CLM-P 258 (1971).
[9] D. T. ROSS: News About AED. Periodical Publication by SofTech, Waltham, Massachusetts.
[10] J. I. SCHWARTZ: Analysing Large-Scale System Development. In: [R], 122-137.
[11] J. STRONG, J. WEGSTEIN, A. TRITTER, J. OLSZTYN, O. MOCK, T. STEEL: The Problem of Programming Communication with Changing Machines. Comm. ACM 1, No. 8, 12-18, No. 9, 9-15 (1958).
[12] W. M. WAITE: Building a Mobile Programming System. Comp. J. 13, 28 (1970).
[13] W. M. WAITE: The Mobile Programming System: STAGE 2. Comm. ACM 13, 415 (1970).
[14] F. W. ZURCHER and B. RANDELL: Iterative Multi-Level Modelling. (Submitted Paper) IFIP Congress 1968.
[15] B. LANDY and R. M. NEEDHAM: Software Engineering Techniques used in the Development of the Cambridge Multi-Access System. Software Practice and Experience 1, 167-173 (1971).
[16] S. WARSHALL: Software portability and representational form. Paper, submitted to the Rome Conference.