ON THE DEVELOPMENT OF SYSTEMS OF MEN AND MACHINES
H. D. Mills, IBM Corporation and The Johns Hopkins University
With Appendix by R. C. Linger, IBM Corporation
ABSTRACT
We formulate the development of systems of men and machines as a programming problem for multiprocessing in which some processors are men and some are machines. In this way, users guides, training courses, etc., are determined from processing requirements, just as machine specifications, which are consistent with the objectives of the total operation. An Appendix illustrates this idea in miniature for a supermarket checkout operation.
Systems of Men and Machines
Our topic is the architecture, implementation, and operation of large systems of men and machines in some definite and coherent enterprise -- managing a business, operating an airline reservation system, running a government agency, getting men to the moon and back, etc. In such systems there are many kinds of men (and women) -- managers, clerks, specialists of various kinds, and machine tenders; there are also many kinds of machines -- computers, terminals, sensors, actuators, communications equipment, etc. Ordinarily computer programming is regarded as part of the machine side of the system, and programmers as part of the machine tenders.
We bring a different view -- that the architecture of the operations of the enterprise is programming, as well. Our thesis is that modern principles of programming -- forced on us out of necessity in dealing with machines of much logical capability but no common sense -- can play as vital a role in bringing systematic discipline and standards into all phases of large systems development.
For this point of view we escalate the concept of programming to that of providing comprehensive instructions for either man or machine activities. The operation of an enterprise becomes a multiprocessing operation of men and machines; then the architecture of the operations defines the configurations and types of men and machines in the system, and the programs which direct each type of man and machine. For example, a users guide becomes one part of a cooperating system of programs operating in separate processors (the user and a machine).
Of course, the characteristics of man and machine are quite different in such a system, just as machines are different among themselves. And yet their architectural properties can be treated in a uniform way. For example, in deciding on a particular machine requirement, various considerations of physical or logical capability will arise, as well as whether the requirements can be met with off-the-shelf equipment; these same considerations apply to a man requirement, in common terms -- e.g., how many letters can a postal clerk sort in an hour, and can this be done with minimal training, etc.
It is well understood that men and machines do well at quite different things. Machines are good at doing what they are told to do, very rapidly and accurately. Men are good at using common sense -- even disobeying instructions when they are obviously misconceived; at pattern recognition -- discovering information by no special or dependable process; and at invention -- creating a new idea for the enterprise to act on. One simple, but very important form of pattern recognition is the translation of human speech to and from machine readable text. This is routine for clerical activities that interface with persons outside the enterprise -- e.g., in an airline reservation system.
It may be asked how the concept of programming applies to men, who operate so differently and unpredictably, compared to machines. It applies by noting that programs are used in a local way by machines -- i.e., at this moment, under these conditions, do this next. A good deal of programming on the man side is already subsumed under general instructions and common sense -- if the telephone rings, answer it; if you want to execute a program, keypunch a job deck and submit it to the operator. We have no intention of explicitly programming human activities now done by general instructions and common sense; the typical level of human programming envisioned is that which is found in users guides, operating instructions, etc., associated with machine operations.
Principles of Programming for Men and Machines
The recent, twenty-five year, history of computer programming has seen an explosive and traumatic growth of human activity and frustration in attempting to realize the promise and potential of electronic and electromechanical devices for processing data in science, industry and government. Out of this history has come the stored program computer; programming languages, compilers, and libraries; and new technical foundations for programming computers, e.g., as propounded by McCarthy [4], Dijkstra [1], Hoare [3], and Wirth [10].
In this period the computer has been the center of attention. In the beginning, the numerical computing power was so great, compared to manual methods, and the availability so limited, that men readily adapted to this new tool -- from decimal to binary, from equations to algorithms. But in a short time, the remarkable possibilities for more general data processing (nonnumerical) were realized, and a new industry was born in just a few years. In the later part of this period (up to now) the large data processing systems appeared -- management information systems, airline reservation systems, space tracking systems, etc. Even then, although human factors were considered, these systems were conceived primarily as data processing systems, which responded to users. But in our proposed perspective, the users are as much a part of the multiprocessing enterprise as the machines and their programs.
Thus, whereas the computer has forced us to find more effective programming principles than we would otherwise have, it has also warped our sense of perspective. By this time, simple human factors questions are in better focus. There is enough data processing power available to invest part of it in creating more human-like interfaces than binary and machine code. But in extending programming to instructing men, with their entirely different characteristics, additional ideas and principles are needed, such as
1. Languages. The programming languages used in the architecture of systems of men and machines need be near natural languages. The difficulties of processing natural languages are well known and that is not proposed. Rather, what is proposed is a "naturalization" of processable programming languages which is close enough for use as a dialect of natural language by nonprogrammers of the enterprise. Sammet [8] and Zemanek [11] discuss both sides of this.
2. Procedures. The concept of procedure should be extended to include indefinite procedures where it is not possible or desirable to define them. In simple terms, "find values for these variables so that these equations hold" (Wilkes discusses this in [9]), or more complex, "make sure no one's feelings are hurt".
3. Interactions. The principal subject of the architecture is multiprocessing -- the conduct of the operation of the enterprise through programs distributed to men and machines. The creation of such programs in an orderly, systematic way will require a new and fundamental development beyond programming principles for synchronous operations. The idea of the Petri net [7] may be an embryonic step in such a development. Dennis illustrates Petri nets as multiprocessing control mechanisms, in [2].
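To make the Petri net idea concrete, the following is a minimal sketch, not taken from Petri [7] or Dennis [2], of a net in which a transition may fire only when every one of its input places holds a token. The place and transition names (a clerk, a register, waiting customers) are hypothetical and were chosen only to suggest a man and a machine acting as cooperating processors.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    """A tiny place/transition net: marking maps place name -> token count."""
    marking: dict
    transitions: dict = field(default_factory=dict)  # name -> (inputs, outputs)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (tuple(inputs), tuple(outputs))

    def enabled(self, name):
        # A transition is enabled when each input place holds enough tokens.
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= inputs.count(p) for p in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:          # consume one token per input arc
            self.marking[p] -= 1
        for p in outputs:         # produce one token per output arc
            self.marking[p] = self.marking.get(p, 0) + 1


def checkout_net():
    """Hypothetical example: one clerk and one register serve two customers."""
    net = PetriNet({"customer_waiting": 2, "clerk_idle": 1, "register_ready": 1})
    net.add_transition(
        "start_checkout",
        ["customer_waiting", "clerk_idle", "register_ready"],
        ["checking_out"],
    )
    net.add_transition(
        "finish_checkout",
        ["checking_out"],
        ["clerk_idle", "register_ready", "customer_served"],
    )
    return net
```

In this sketch the second customer cannot start checkout until the first has finished, because the clerk and register tokens are held by the checkout in progress; the synchronization is expressed by the net itself rather than by either processor's program.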
Architecture Principles
A system depends on its components. The architecture of a system specifies the types of men and machines required, as well as how they are to interact as a system operation, i.e., the selection and arrangement of the system components, as well as detailed instructions for their behavior. In the case of machines, the usual considerations apply -- define feasible requirements, either already embodied in existing machines, or possible with special development where justified; tradeoffs and comparisons with alternative approaches, even with manual approaches where feasible, etc. In the case of men, the system architect is frequently shortsighted. There is a reason; it is much more difficult to predict a human performance than a machine performance. Sometimes the human fails to live up to a requirement. But often the human will exceed a requirement in a totally unexpected way, by acquiring a skill not imagined possible beforehand. This is happening with computer programmers right today, who are beginning to program with a precision not believed possible five years ago. It happened with typing when touch typing was introduced early in this century.
In retrospect, it is easy to identify a pitfall in overestimating machine possibilities, leading to a common scenario in many large systems in recent history, which put a "man in the loop" at the last moment, with marginal operational results. In such a case, the operation was originally planned as completely automatic, depending on some key algorithm (often involving some form of pattern recognition); as a result of the planning, the data processing functions surrounding the algorithm were developed in parallel according to a general system design; then at the last moment, the performance of the key algorithm proves inadequate, and the man is brought into the loop, with two costs:
1. The human factors are bad (e.g., interacting with programs which require long, fixed argument lists); these can be fixed up.
2. There is a lost opportunity in not having a better trained man in the loop. The effort on algorithm development is frequently different than that required for insight development for a human executing an indefinite procedure, and the time is lost, anyway.
These operational experiences lead to the following principles of systems architecture:
1. Components. Regard men and machines as equal status components for system operations with equal requirements for development, state-of-art projections, and improvement, according to their own characteristics.
2. Evolution. Plan on the unexpected, by well-structured interfaces, that permit the replacement of components by improved versions which perform identical functions more effectively. Parnas [6] deals with this interchangeability by axiomatizing such interfaces.
3. Integrity. Value system integrity above all else, by requiring that the multiprocessing operation of men and machines be described and scrutinized according to the best principles of programming -- particularly with respect to methods of specification and validation of programs. Note especially the technique defined by Wirth [10] and Mills [5].
In an Appendix dealing with a miniature problem, R. C. Linger illustrates the idea of programming an operation through a set of abstract processes which only later are specialized to either men or machines, depending on their requirements. It is an easy transition from man as a user to man as a processor and yet it seems a critical one in providing system coherence and integrity with respect to a given operation.
Literature
[1] O.-J. Dahl, E. W. Dijkstra and C. A. R. Hoare, Structured Programming, Academic Press, London. 1972.
[2] J. B. Dennis, "Concurrency in Software Systems", Lecture Notes in Economics and Mathematical Systems, 81, Advanced Course on Software Engineering (Ed. F. L. Bauer), Springer-Verlag, Berlin, Heidelberg, New York. 1973.
[3] C. A. R. Hoare, "An axiomatic basis for computer programming", CACM 12. 1969. pp. 576-580, 583.
[4] J. McCarthy, "A basis for a mathematical theory of computation", Computer Programming and Formal Systems, (Eds. P. Braffort and D. Hirschberg), North-Holland Publishing Company. 1963. pp. 33-70.
[5] H. D. Mills, "The new math of computer programming", CACM 17, 1975. (To appear).
[6] D. L. Parnas, "A technique for software module specification with examples", CACM 15, 5. May 1972. pp. 330-336.
[7] C. A. Petri, Communications with Automata. Supplement 1 to Technical Report RADC-TR-65-377, Vol. 1, Griffiss Air Force Base, New York. 1966. (Originally published in German: Kommunikation mit Automaten, University of Bonn, 1962).
[8] J. E. Sammet, "The use of English as a programming language", CACM 9, 3. March 1966. pp. 228-230.
[9] M. Wilkes, "Constraint-type statements in programming languages", CACM 7. 1964. pp. 587-588.
[10] N. Wirth, Systematic Programming: An Introduction, Prentice-Hall, Englewood Cliffs, New Jersey. 1973.
[11] H. Zemanek, "Semiotics and programming languages", CACM 9, 3. March 1966. pp. 139-143.
Appendix
SUPERMARKET CHECKOUT AS A MAN-MACHINE MULTIPROCESSING ACTIVITY
R. C. Linger, IBM Corporation
Consider the problem of specifying a design for a supermarket checkout operation. Several possibilities come to mind; a man with an adding machine and cashbox, a man with a cash register, a man with an OCR device which reads encoded prices, etc. Whatever the final configuration, we can begin our design with a procedural description of the checkout process itself. In this view, the dynamics of the process are of central interest. That is, we do not begin with a static description of components which must somehow fit together in a presumed system, but rather begin with a process which can be shown to work, and derive required (and possibly alternate) component configurations of men and machines from it. Our tentative first refinement is:
    start checkout
      open checkout station at 9 a.m.
      do while before 5 p.m.
        checkout next customer, if any
      od
      close checkout station
    stop
Here, the actions of men and machine are abstracted; we cannot determine which does what, but can agree that the description seems reasonable so far. Our second refinement might be:
    start checkout
      establish man on duty at 9 a.m.
      power up machines at 9 a.m.
      do while before 5 p.m.
        accept next customer, if any
        total cost of all items
        inform customer of total
        accept payment
        present change, if any, to customer
        bag all items
      od
      balance cash total with sales total
      power down machines
    stop
Here the functions of man and machine begin to emerge; presumably they will cooperate to "total cost of all items" through manual keystroke entry or OCR input, and the man will "bag all items," without mechanical assistance. We identify "before 5 p.m." as a man predicate, in view of the human propensity for clockwatching, which hopefully yields to common sense to complete checkout of customers waiting in line at quitting time! The possibility of "exception processing" arises in this procedure, as when the machines break down, or a coffee break is desired. These situations are analogous to the interrupt handling facilities of modern computers.
An aspect of the activity distribution between man and machine can be explored through elaboration of the process to "total cost of all items:"
    start total cost of all items
      initialize subtotals to zero
      do while items remain
        if item type is produce
        then add price to produce subtotal
        else
          if item type is meat
          then add price to meat subtotal
          else add price to grocery subtotal
          fi
        fi
      od
      add subtotals to find total cost
    stop
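For readers who prefer an executable form, the abstract procedure "total cost of all items" can be transcribed almost line for line. The sketch below is an illustration only: the function name, the representation of an item as an (item type, price) pair, and the use of decimal prices are assumptions not present in the Appendix, and the sketch does not yet distinguish the man's part from the machine's.

```python
from decimal import Decimal


def total_cost_of_all_items(items):
    """Accumulate produce, meat and grocery subtotals, then total them.

    items is an iterable of (item_type, price) pairs; as in the
    procedure, anything that is neither produce nor meat is grocery.
    """
    # initialize subtotals to zero
    subtotals = {
        "produce": Decimal("0"),
        "meat": Decimal("0"),
        "grocery": Decimal("0"),
    }
    # do while items remain
    for item_type, price in items:
        if item_type == "produce":
            subtotals["produce"] += price
        elif item_type == "meat":
            subtotals["meat"] += price
        else:
            subtotals["grocery"] += price
    # add subtotals to find total cost
    total = sum(subtotals.values(), Decimal("0"))
    return subtotals, total
```

Decimal arithmetic is used so that prices entered in dollars and cents add exactly, which binary floating point would not guarantee.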
The cooperating give and take actions of man and machine appear as an abstract process here; "items remain" is likely a man predicate, while the "item type" expressions appear to be machine predicates set by man. The subtotal accumulations are best handled as machine functions.
We can establish concrete procedures for man and machine by defining an interface between them. In illustration, consider an electromechanical cash register with control keys as follows:
    0, 1, ..., 9    price digits
    .               decimal point
    I               initialize (for new customer)
    P               produce
    M               meat
    G               grocery
    T               total
For this interface, the man procedure becomes
    start total cost of all items -- man part
      push I
      do while items remain
        get next item
        push digit and decimal keys for price
        if item type is produce
        then push P
        else
          if item type is meat
          then push M
          else push G
          fi
        fi
      od
      push T
    stop
and the corresponding machine procedure is:
    start total cost of all items -- machine part (I key depressed)
      set produce subtotal to zero
      set meat subtotal to zero
      set grocery subtotal to zero
      do until T key
        wait for T|P|M|G key
        if not T key
        then
          read and clear price register
          if P key
          then add price to produce subtotal
          else
            if M key
            then add price to meat subtotal
            else add price to grocery subtotal
            fi
          fi
        fi
      od
      add produce, meat, grocery subtotals
      display result
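The division of labor across the key interface can be sketched in executable form. In the sketch below the class CashRegister plays the machine part and the generator man_keystrokes plays the man part; both names, the string representation of prices, and the decision to reject unknown keys are assumptions made for illustration, not part of the Appendix. Malformed prices (for example two decimal points) are not handled.

```python
from decimal import Decimal


class CashRegister:
    """Machine part: reacts to one key at a time."""

    def __init__(self):
        self.press("I")

    def press(self, key):
        if key == "I":
            # initialize for a new customer
            self.price_register = ""
            self.subtotals = {"P": Decimal("0"), "M": Decimal("0"), "G": Decimal("0")}
            self.display = None
        elif key.isdigit() or key == ".":
            # price digits and decimal point accumulate in the price register
            self.price_register += key
        elif key in ("P", "M", "G"):
            # read and clear price register, add price to the chosen subtotal
            price = Decimal(self.price_register or "0")
            self.price_register = ""
            self.subtotals[key] += price
        elif key == "T":
            # add produce, meat, grocery subtotals and display result
            self.display = sum(self.subtotals.values(), Decimal("0"))
        else:
            raise ValueError(f"unknown key {key!r}")
        return self.display


def man_keystrokes(items):
    """Man part: yield the keys pushed for (item_type, price_text) pairs."""
    yield "I"
    for item_type, price_text in items:
        yield from price_text  # push digit and decimal keys for price
        yield {"produce": "P", "meat": "M"}.get(item_type, "G")
    yield "T"
```

Feeding the keys from man_keystrokes to CashRegister.press one at a time reproduces the cooperation of the two procedures; the only coupling between them is the key alphabet, so either side could be replaced without changing the other.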
A different set of procedures derives from an alternate interface, as with a cash register using OCR input, and equipped with only I and T keys. In this case the man procedure simplifies to
    start total cost of all items -- man part
      push I
      do while items remain
        pass item label over OCR window
      od
      push T
and the matching machine procedure must extract both item type and price from encoded labels.
We observe that the man procedures above can evolve naturally into users guides and training courses, possibly containing instructions no machine would understand, as, "Don't be distracted while pushing price keys." The machine procedures give an explicit basis for electromechanical component design, or executable procedures in the case of programmable devices.
A New Look at the Program Development Process
P. Hiemann, IBM Boeblingen
This paper is a contribution to the discussion of defining a program development process frame with distinct validation points for analysing and measuring process results, and of providing methods, techniques and tools to support the program development process.
0. INTRODUCTION
In the 25-year history of professional system programming, system programmers have developed systems of increasing complexity in function and size. Their experience shows that it is increasingly difficult to meet specified system integrity/quality objectives without complying with disciplined programming methods based upon a firm definition of the program development process. System programmers have identified and developed a series of programming support techniques which are mandatory for producing and validating a high quality of specifications and code.
In the first section of this paper, a process frame with its validation points and some basic process measurements are described. The second section describes the methods supporting the program development process. The final section of the paper provides some proof regarding the quality of succeeding OS releases that have been improved by applying new process support techniques.
1. THE PROGRAMMING PROCESS AND SOME MEASURES
Most people think of programming as being composed of four activities (Figure 1):
Designing: Where it is determined what is needed and defined what the program will do and how it will work.
Coding: Where the program is written.
Testing: Where it is verified that the program will work and problems uncovered during testing are fixed.
Integrating: Where the pieces or modules are put together to create an application, a subsystem, or an entire system.
While testing and integration are closely related in that integration has to take place before larger units of code can be tested, it is resource consuming if one has to go back to design during coding or even testing.
The results of coding, testing and integration are operational programs of growing size and complexity (Figure 2):
- Program Unit relates to the smallest self-contained piece of code. It may be a module, a macro, or a subroutine.
- A function is composed of program units. A function may be a command, a parameter, or a recovery routine.
- A component is composed of functions. A component may be an access method, a compiler, or the supervisor.
- The components are integrated to form a system, as VS/2 Release 1, OS Release 21, and so on.
About two or three years ago, IBM started to analyse financial and statistical information relating to quality of systems. The customer was experiencing an APAR rate that was too high. Figure 3 shows the average number of all APARs for OS submitted per customer, in other words, all submitted APARs, divided by the number of customers. The area labeled Valid is the average number of valid APARs submitted per customer. A valid APAR is one that requires a fix either in code or in documentation. The difference between valid APARs and the total largely represents duplicates and user errors. The average number of APARs submitted per user has not changed significantly over the past four releases, nor has the average number of valid APARs.
Corresponding to the high error rates the analysis showed that 76 % of the cost for Release 18 was spent on maintenance and only 24 % was spent on development (Figure 4). Development includes cost related to designing and coding programs, to writing and printing publications, and to performing all testing and integration. Maintenance includes laboratory and FE cost to find problems and costs to fix the problems.
The analysis also indicated that the cost of finding a problem increased as the programming process progressed (Figure 5). The cost of finding a problem during coding is virtually nothing. As we progress from unit test through field usage, the cost increases at each testing stage. More and more people are involved with the code as it moves through the cycle, and more people and their work are affected when a problem occurs. The cost of finding and fixing a problem after the system was released (an APAR) was thirty times the cost of fixing it in unit test, the first set of tests that the code goes through in development.
The net result of the analysis was to establish an improved process that will help in:
- reducing errors
- finding problems as early as possible.
The base of the new process was a frame of seven distinct phases with distinct validation points at the end of each phase (Figure 6). Validation regards
- Quality of the results produced during one phase
- Progress in terms of goals and schedules
- Usefulness and Profitability of the system/component/function to be developed further.
The process should be continued to its next phase only if predetermined validation criteria are met.
Phase 0 is a planning phase to develop statements about the functions needed to provide new or improved data processing capability. These are requirements. People study, analyze, and perhaps survey users to determine what they require.
Phase I determines the requirements for a particular programming package. It addresses configuration, storage requirements, performance, etc.
Phase II is the external design phase. User interfaces (for example commands, parameters, output formats) and the interfaces with major system components are developed.
Phase III is the internal design phase. The internal logic of the program, the module structure as well as internal data areas and interfaces are developed. During this phase, a test plan that covers all functional variations is also developed.
Phases 0 to III are strictly sequential steps; however, the implementation and testing phases, Phase IV and V, interact with each other. Nevertheless, Phase IV is the main coding phase including unit and function testing.
Each phase has its own unique output or results:
- Documentation in the first 4 phases
- Code in Phase IV
- A system in Phase V
- Fixes in the maintenance Phase VI
Besides the planning documentation, the following documents relate directly to the program being developed:
Market requirements define a need, independent of any programming considerations.
Objectives state the requirements of a particular programming package.
External specifications define the purpose of the program, its user interfaces, and its interfaces with other parts of the system.
Internal specifications describe the logic and the method of operation of the program and its internal interfaces.
Code is also considered as documentation since the field engineer receives source code documented in the form of microfiche.
2. THE METHODS SUPPORTING THE PROGRAMMING PROCESS
According to the phase structure of programming projects, there are
- methods supporting the program development process
- methods and criteria to check the completeness of the results of a phase
In describing the methods, this paper distinguishes between
- Principles
- Tools and Techniques
- Management Aspects
2.1 Principles
2.1.1 Top-Down Design
Top-down design is used today for most new development projects. In top-down design, we take a functional viewpoint. We define the basic functions of the program and design them with a high level of detail. Then each of those basic functions is broken into more detailed subfunctions. We move down through levels of detail, creating a tree structure like the one shown in Figure 7. We continue this process until all subfunctions are defined to a consistent level of detail. When we are finished with the top-down design process, we will know about all of our interfaces, all of our logic decisions, and how the data is structured. Each box in the tree structure represents a function and, in most cases, a code segment, an inline block of code, or a subroutine that does not exceed a listing page.
Top-down design avoids simultaneous, inconsistent interface definition. Top-down design has reduced the complexity of the design and of the programs that result from the design. It also provides for orderly logic development. When design is done from the bottom up, decisions may be assumed to be in the upper-level logic, when in fact they should happen in the bottom levels.
2.1.2 Structured Programs (Figure 8)
To reflect the program structure that has been developed during design, we emphasize development of structured code. Structured code is composed of one-page segments. Control always enters a segment at its top and leaves at the bottom. All segments are referenced by name. GO TOs are not permitted. Each segment either returns to its caller or moves inline to the next segment.
Some advantages of structured code are:
- It is easy to read. A listing of a program can be read and understood very quickly. This is often impossible if the program is written using traditional coding methods.
- Structured code is easier to maintain and to extend.
- A programmer can learn structured coding in about a week of coursework and practice; and once he has learned it, he generally will not go back to traditional methods.
2.1.3 Elements of Structured Code (Figure 9)
Structured coding uses basic program elements to code any program. By using simple building blocks in structured code for sequential statements, IF-THEN-ELSE, and DO-WHILE, we can simplify programming, reducing complexity.
2.1.4 Continuous Integration (Figure 10)
As mentioned earlier, code development implies the integration of code pieces of growing size and complexity into the system. We use a process called Continuous Integration. Functions are integrated into the system as they become available according to a disciplined plan. "Disciplined" means that functions must be added in a logical sequence, and that resources must be devoted to the right functions at the right time. We start integration early, for large efforts as much as eighteen months before shipment. Drivers are built by continually adding functions. A driver is a subsystem used for subsequent testing. If the entire system followed top-down design and implementation, then one can take the integration plan directly from the top-down plan.
Here are some of the advantages of continuous integration:
- Code is entered into the system when it is complete. Small sets of modules are added at one time, instead of trying to put all the modules together at once, i.e. the integration process is less complex.
- A real system is always running, and it is easier to keep it running.
- Detailed planning becomes necessary. Programmers are forced to think through their dependencies so that their interfaces will become more apparent. It also forces people to become more realistic in their scheduling.
2.1.5 Tests & Test Criteria (Figure 11)
We distinguish between four types of tests according to the integration levels:
- Unit Test
- Function Test
- Component Test
- System Test
Unit testing is done by the developer after his code is completed. It tests the smallest self-contained piece of code, for example a module, a macro, or a subroutine. The minimum criterion for completion is that all the branches in the code are executed both ways.
Function test occurs after unit test is completed. In function test, we put units together to form discrete functions, for example a command, a recovery procedure, or a parameter. The criterion for completion is 100 % execution of all test cases, 90 % of which must be successful.
Component test takes place after function test is completed. Functions are put together to form a component. A component may be an access method, the supervisor, or a compiler. The criterion for completion is that 100 % of all test cases were executed successfully.
The final test is system test. The components are put together to form the system, for example OS 20.1 or VS/2 Release 1. Criteria for completion are all test cases executed successfully and all problems fixed. Whatever criteria are used by other development groups, it is important that predetermined criteria are used throughout the whole process (including specification validation) and that they are never compromised.
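The four completion criteria lend themselves to a mechanical check at each validation point. The following is a minimal sketch under stated assumptions: the record SuiteResults, its field names, and the function names are hypothetical, and "90 % of which must be successful" is read here as "at least 90 %".

```python
from dataclasses import dataclass


@dataclass
class SuiteResults:
    """Counts collected for one test level (hypothetical record layout)."""
    planned: int
    executed: int
    successful: int
    open_problems: int = 0


def unit_test_complete(branch_outcomes):
    """Every branch must have been executed both ways.

    branch_outcomes maps a branch label to the outcomes observed for it.
    """
    return all(set(seen) == {True, False} for seen in branch_outcomes.values())


def function_test_complete(r):
    """100 % of test cases executed, at least 90 % of them successful."""
    all_executed = r.planned > 0 and r.executed == r.planned
    # Integer comparison avoids rounding: successful / executed >= 9 / 10
    return all_executed and r.successful * 10 >= r.executed * 9


def component_test_complete(r):
    """100 % of test cases executed successfully."""
    return r.planned > 0 and r.executed == r.planned and r.successful == r.executed


def system_test_complete(r):
    """All test cases executed successfully and all problems fixed."""
    return component_test_complete(r) and r.open_problems == 0
```

Expressing each criterion as a predicate over recorded counts keeps the decision to advance to the next level independent of who is asking, which is the point of predetermining the criteria.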
2.2 Tools and Techniques
This section describes techniques and tools that support programmers to perform their tasks according to established principles.
2.2.1 HIPO (Figure 12)
We are supporting the design task by a technique called HIPO that in turn is supported by a tool that provides updating and print facilities. HIPO stands for Hierarchy, plus Input, Process, Output. Hierarchy means proceeding from very generalized to very detailed drawing, following top-down design practices. The diagrams refer to successively lower-level diagrams. Each diagram shows the inputs to a function, the processing steps within the function, and the outputs. The data and function involved in the process, therefore, become visible. HIPO is a replacement for flowcharts.
2.2.2 Cause / Effect Graph
Due to the absence of formal design languages, we describe program functions in prose. Unfortunately prose text has the characteristic of burying relationships, i.e. to make relationships less visible. A technique of analyzing external specifications is to draw a cause and effect graph. A cause and effect graph is a Boolean representation of the logical relationships documented in specifications, representing a valid expression. These graphs help to find errors early, such as omissions and logical inconsistencies. They provide a measure and a structure for a thorough review of a specification. Figure 13 shows an example of how to construct a graph for some specifications.
2.2.3 Test Case Design
Traditionally, a large effort is needed to develop test case buckets for a comprehensive test of all variations of program functions. The cause/effect graphs can be used to derive the design of such test case buckets. The programmer codes the graph and a tool produces the list of test cases needed for testing all causes and effects. Figure 14 shows what the tool would produce assuming the cause/effect graph shown in Figure 13: The test bucket requires 6 test cases for having all causes invoked and all related effects observed.
2.2.4 Integration & Build Support
The process of continuously integrating code and building drivers at different levels and possibly for a series of programming projects needs a complex tool to perform all steps in a controlled manner. Figure 15 shows the support functions and how they relate to each other.
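Returning briefly to the derivation of test cases from a cause/effect graph: the tool itself is not described in this paper, so the following is only a sketch of one way such a derivation could work. It assumes that each effect is given as a Boolean function of the causes and that the goal is a small set of cause combinations in which every cause is invoked and every effect observed at least once; the greedy selection rule, the function name, and the example causes and effects are all hypothetical.

```python
from itertools import product


def derive_test_cases(causes, effects):
    """Pick cause combinations until every cause and effect is covered.

    causes  -- list of cause names
    effects -- dict mapping effect name to a function of a
               {cause: bool} dict that returns True when the effect occurs
    """
    candidates = []
    for values in product([False, True], repeat=len(causes)):
        case = dict(zip(causes, values))
        covered = {("cause", c) for c in causes if case[c]}
        covered |= {("effect", e) for e, rule in effects.items() if rule(case)}
        candidates.append((case, covered))

    # Only goals that some combination can reach are pursued, so an
    # effect that can never occur does not block termination.
    goals = set().union(*(covered for _, covered in candidates))
    chosen = []
    while goals:
        case, covered = max(candidates, key=lambda cc: len(cc[1] & goals))
        chosen.append(case)
        goals -= covered
    return chosen


# Hypothetical specification: a command updates a file only when the
# command is valid, the file exists, and a write was requested.
example_causes = ["valid_command", "file_exists", "write_requested"]
example_effects = {
    "file_updated": lambda c: c["valid_command"] and c["file_exists"] and c["write_requested"],
    "error_message": lambda c: not c["valid_command"] or not c["file_exists"],
}
```

An effect that no combination of causes can produce never enters the goal set; comparing the goal set with the full list of effects would expose it, which is the kind of omission or inconsistency the graph is meant to reveal early.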
The programmer stores new code in a development library. He tests his code under a specific test system called a driver. Upon (successful) test completion his code is integrated and stored in a master library. At that point, this code becomes available to other development groups. In particular, test systems or drivers are built from the master library and are then used by all development groups. Finally, the system to be released is built from the master library. Errors found during testing are fixed in the development library and, after successful testing, the modified code is integrated into the master library. The whole integration and build process is fully controlled in that all actions are recorded and problems, as for example unresolved dependencies, are reported. Preferably, the described library system should be capable of supporting specification development as well.
2.2.5 VM/370 (Figure 16)
Testing needs a high amount of computing resources; testing system programs in particular requires different hardware configurations. VM/370 provides for the capability to have a variety of test systems running on the same installation. This makes the testing effort more flexible and economical.
2.3 Management Aspects
The previous sections dealt with the principles of the program development process and the tools supporting it. I want to add some comments about managing this process. All techniques and tools, phase plans, and controls will not work unless programmers are involved in the implementation of the process. The programmers must do the detailed planning upon which the manager can base his overall plans.
People are evaluated on the basis of the results they achieve, whereby results are stated in terms of quality, quantity, timeliness, and cost. It should become common understanding that recording and analyzing the above process measurements are part of a system programmer's professional responsibility.

2.3.1 Programming Teams
Programmers of different skills are best organized in teams. Generally one to three teams report to a first-level manager. Each first-level manager has a librarian to support his teams. The roles of the team members and the librarian are shown in Figure 17.

The team leader is responsible for preparing the functional specification and has technical responsibility for the project. He makes the technical decisions for all design, logic specifications, and all code. For most design, logic specifications, and code, his job is to review and analyse the results produced by other programmers. In this role, he supports the programmer rather than evaluates him. (Evaluation of end results and employee appraisals are done by the manager.) In addition, the team leader designs, writes logic specs, and codes the key elements of the product that his team is producing.

The co-team leader helps the team leader in all his duties and also codes key elements of the product. He will assume full team leadership responsibility if necessary. He is a technical peer of the team leader and his back up. The programmers in the team are responsible for lower-level design and code, their own detailed planning, and testing.
The librarian is an important part of the team. His job is to create, maintain, and own the library for the project. The project library includes both documentation and code. The librarian schedules and receives the runs from the computing center, and provides clerical services to the rest of the group. The librarian is a full-fledged team member.

There are many advantages in programming teams. Senior people are directly and actively involved in the project. In the past these people would design the function and sometimes leave the project. Keeping senior people involved provides continuity. Since they generally produce work of higher quality, they can educate the junior members of the team. The team approach provides for more detailed and realistic plans.

2.3.2 Formal Reviews
During the last two years, there have been several projects which applied very formal review techniques to specifications, code, test cases, and publications. One review technique has been called Walkthru, another Inspection, a word which stems from a common inspection practice in the world of hardware engineering. Both techniques have in common that a very detailed structured review is performed, attended by the persons most knowledgeable about, and most affected by, the system technology and development process. However, there is a difference between a Walkthru and an Inspection, in that only the latter emphasizes the recording and analysis of not only defects discovered but also development process measurements like number of defects by type.

The formal review is used by a developer as a resource to help him produce error-free products. The developer's attitude must be that other people are there to help, not to kill him with personal comments. The other participants (a moderator, the designer, other developer(s), a tester) must also realize that they are there to help. Managers have to reinforce these feelings consciously. Without this psychological atmosphere, formal reviews will not work. Formal reviews are not a management tool for evaluating programmers. The manager is not concerned about the number of problems found at this stage; just that they are found and corrected. The manager should use the number of problems as an indicator of existing or projected quality exposures.

Formal reviews are performed during phases II, III, and IV. Figure 18 shows the way a Phase II walkthru works. The participants study the external specification and related material like cause/effect graphs. They probe it for errors and develop a list of questions. In the meeting the material is thoroughly analyzed by all present. The issues are only recorded. The phase III walkthru concentrates on internal specifications and related material like HIPOs and Test Cases. The phase IV walkthru is a code walkthru.

The length and the number of participants may vary. A review of an entire spec could involve eight people, a couple of days notice, and an offsite meeting. If relatively few changes have been made to an existing spec, the walkthru may involve only three people: the developer, his team leader, and the independent reviewer.

We have analyzed the effect of walkthrus on costs (Figure 19). The results showed that it was 14 to 15 times cheaper to find a problem during a walkthru than it was in unit test. And unit test is the point in the testing cycle at which it is cheap to find and fix an error.
3. CLOSING
We believe we have proof that the analysis and improvement of the programming development process has resulted in improved quality, i.e. fewer APARs relative to the released code (Figure 20). We are observing even better results with newly developed code. Figure 21 shows the APAR rate for the TSO scheduler code, which is about one third the rate of code that consists of a conglomerate of old, changed and added code. The same methods that increase quality also have a favorable effect on costs accounted over the whole process from inception until shipment of a system/component.

We hope that more and more system programming professionals will contribute new ideas on how to improve the total process of developing systems. This will be needed if new systems of growing size and increasing functional capability require another magnitude in managing the development process, which involves users, system programmers, and hardware engineers.
REFERENCES
(1) W.B. Cammack and H.J. Rodgers, Jr., Improving the Programming Process, IBM Technical Report 00.2483 (Oct. 73)
(2) W.R. Elmendorf, Cause-Effect Graphs in Functional Testing, IBM Technical Report 00.2487 (Nov. 73)
(3) F.T. Baker, System Quality through Structured Programming, 1972 Fall Joint Computer Conference
(4) F.T. Baker, Chief Programmer Team Management of Production Programming, IBM Systems Journal, Vol. 11, No. 1, 1972
(5) H.D. Mills, Top-Down Programming in Large Systems, Courant Computer Sciences Symposium 1, New York Univ., June 1971
(6) P.W. Metzger, Managing a Programming Project, Prentice-Hall 1973
(7) IBM HIPO Audio Education Package SR20-9413
Fig. 1. [figure not reproduced]
Fig. 2. System build up.
Fig. 3. Experience: Average APARs Per User, OS. Total and valid APARs per user by release (18 to 21).
Fig. 4. [figure not reproduced]
Fig. 5. [figure not reproduced; legible labels: Unit Test Time, APARs]
Fig. 6. [figure not reproduced]
Fig. 7. Top-Down Design.
Fig. 8. Structured Programs.
Fig. 9. Structured Program Elements: Sequence, If Then Else, Do While.
Fig. 10, Fig. 11. Integration - As it Is. Functions A to G; Driver 1, Driver 2, Driver 3.
Fig. 12. HIPO: Hierarchy, Input, Process, Output.
Fig. 13. Draw the Graph. Nodes:
1. "OP1" is fixed binary
2. "OP2" is fixed decimal
3. "OPERATOR" is +
4. "OPERATOR" is
5. "OP1" is invalid
6. "OP2" is invalid
7. expression is valid
8. OPERATOR is valid
9. OPERATOR is invalid
Fig. 14. The Test Case Library Design. Invocable causes and observable effects for tests 1 to 6. Legend: I = Invoked, S = Suppressed, P = Present, A = Absent.
Fig. 15. Integration and Build: New Code, Error Correction, Integration, Source Code, System Build, Test.
Fig. 18. Phase II Walkthru: probe spec for mistakes and omissions; record problems; resolve problems.
Fig. 19. Find Programming Errors Early: cost of errors, Walkthru versus Unit Test.
Fig. 20. Results: Average APARs Against Base Declining. OS, valid APARs per Inst. by release (18 to 21).
Fig. 21. Results: New Code is Higher Quality. TSO Scheduler, valid APARs (normalized): Base Plus Changed versus New.
Organizing for Structured Programming
F. T. Baker,
IBM Federal Systems Division,
Gaithersburg, Maryland, USA
ABSTRACT
A new type of programming methodology, built around structured programming ideas, has been gaining widespread acceptance for production programming. This paper discusses how this methodology has been introduced into a large production programming organization. Finally it analyzes the advantages and disadvantages of each component of the methodology and recommends ways it can be introduced in a conventional programming environment.
INTRODUCTION
At this point in time, the ideas of structured programming have gained widespread acceptance, not only in academic circles, but also in organizations doing production programming. An issue [1] of Datamation, one of the leading business data processing oriented magazines in the U.S., featured several articles on the topic. The meetings of SHARE and GUIDE, two prominent computer user groups, have had an increasing number of sessions on subjects related to structured programming. The IBM Systems Science Institutes are offering courses and holding seminars, and several books on the topic are in print.

What is perhaps not so widely appreciated, however, is that the organizations, procedures and tools associated with the implementation of structured programming are critical to its success. This is particularly true in production programming environments, where program systems (rather than single programs) are developed, people come and go, and the attainment of reliable, maintainable software on time and within cost estimates is a prime management objective. In this environment, module level coding and debugging activities typically account for about 20% of the effort spent on software development [2]. Thus, narrow applications of structured programming ideas limited only to these activities have correspondingly limited effects. It is therefore desirable to adopt a broad, integrated approach incorporating the ideas into every aspect of the project from concept development to program maintenance to achieve as many quality improvements and cost savings as possible.
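The effect of the 20% figure can be made concrete with a small worked bound. The symbols are introduced here for illustration only: f is the fraction of total effort spent on module level coding and debugging, and r is a hypothetical fractional reduction of that effort achieved by a narrow application of the ideas.

```latex
\[
  \text{total saving} \;=\; f \cdot r \;\le\; f,
  \qquad
  f = 0.20,\ r = 0.5
  \;\Longrightarrow\;
  \text{total saving} = 0.20 \times 0.5 = 0.10 .
\]
```

Under this assumption even the limiting case r = 1 would save no more than 20% of the total effort, which is the arithmetic behind the case for a broader approach.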
BACKGROUND
The IBM Federal Systems Division (FSD) is an organization involved in production programming on a large scale. Although much of its software work is performed for federal, state and local governmental agencies, the division also contracts with private business enterprises for complex systems development work. Work scope ranges from less than a man-year of effort on small projects to thousands of man-years spent on the development and maintenance of large, evolutionary, long-term systems such as the Apollo/Skylab ground support software. Varying customer requirements cause the use of a wide variety of hardware, programming languages, software tools, documentation procedures, management techniques, etc. Problems range from software maintenance through pure applications programming using commercially available operating systems and program products to the concurrent development of central processors, peripherals, firmware, support software and applications software for avionics requirements. Thus, within this single organization can be found a wide range of software development efforts.

FSD has always been concerned with the development of improved software tools, techniques and management methods. Most recently, FSD has been active in the development of structured programming techniques [3]. This has led to organizations, procedures and tools for applying them to production programming projects, particularly with a new organization called a Chief Programmer Team [4]. The Team, a functional organization based on standard support tools and disciplined application of structured programming principles, had its first trial on a major software development effort in 1969-71 [5], [6]. In the three years since the completion of that experimental project, FSD has been incorporating structured programming techniques into most of its software development projects. Because of the scope and diversity of these projects, it was impossible to apply any single set of tools and procedures or any rigid type of organization to all or even to a majority of them. And because of the ongoing nature of many of these systems, it was necessary to introduce these techniques gradually over a period of many years. The approach which was adopted, the problems which have been encountered and the results which were achieved are the subject of this paper. It is believed that any software development organization can improve the quality and reduce the costs of its software projects in a similar way.
PLAN
To introduce the ideas into FSD work practices and to evaluate their use, a plan with four major components was implemented. First, a set of guidelines was established to define the terminology associated with the ideas with sufficient precision to permit the introduction and measurement of individual components of the overall methodology. These guidelines were published, and directives regarding their implementation were issued. Second, support tools and methodologies were developed, particularly for projects using commercial hardware and operating systems. For those projects where these were not employed, standards based on the developed tools enabled them to provide their own support. Third, documentation of the techniques and tools, and education in their use, were both carried out. These were done on a broad scale covering management techniques, programming methodologies and clerical procedures. Fourth, a measurement program was established to provide data for technology evaluation and improvement. This program included both broad measurements which were introduced immediately, and detailed measurements which required substantial development work and were introduced later. The next four sections cover the components of this plan and their implementation in detail.
GUIDELINES
A number of important considerations influenced the establishment of a set of guidelines for the application of structured programming technology within FSD. First and most important, they had to permit adaptation to the wide variety of project environments described above. This required that they be useful in program maintenance situations where unstructured program systems were already in being, as well as in those where completely new systems were to be developed. Second, they had to allow for the range of processors and operating systems in use. This necessitated the description of functions to be provided instead of specific tools to be used. Third, they had to allow for differences in organizations and methodology (e.g., specifications, documentation, configuration management) required or in use on various projects.

The guidelines resulting from these considerations are a hierarchical set of four components, graphically illustrated in Figure 1. Use of the component at any level presupposes use of those below it. Thus, by beginning at a level which a project's environment and status permit, and then progressing upward, projects can evolve gradually toward full use of the technology.
1. Development Support Libraries
The introductory level is the Development Support Library, which is a tool designed with two key principles in mind:
a. Keep current project status organized and visible at all times. In this way, any programmer, manager or user can find out the status or study an approach directly without depending on anyone else.
b. Make it possible for a trained secretary to do as much library maintenance as possible, thus separating clerical from intellectual activity.

A DSL is normally the primary responsibility of a Programming Librarian. Programmers interface with the computer primarily through the library and the Programming Librarian. This allows better control of computer activity and ensures that the library is always complete and current. Programmers are always working with up-to-date versions of programs and data, so that misunderstandings and inconsistencies are greatly reduced. A version and modification level are associated with all material in the library to permit change control and assist in configuration management. In general, the library system is the prime factor in increasing the visibility of a developing project and thus reducing risk and increasing reliability.

The guidelines provide that a Development Support Library is being used if the following conditions prevail:
a. A library system providing the functional equivalent of the PPL (see TOOLS below) is being used.
b. The library system is being used throughout the development process, not just to store debugged source or object code, for example.
c. Visibility of the current status of the entire project, as well as past history of source code activities and run executions, is provided by the external library.
d. Filing procedures are faithfully adhered to for all runs, whether or not setup was performed by a librarian.
e. The visibility of the code is such that the code itself serves as the prime reference for questions of data formats, program operation, etc.
f. Use of a trained librarian is recommended.
2. Structured Programming
In order to provide for use of structured programming techniques on maintenance as well as development projects, it was necessary to depart from the classical use of the terminology and adopt a narrower definition. In FSD, then, we distinguish between those practices used in system development (Top-Down Development) and those used in coding individual program modules (Structured Programming). Our use of the term "structured programming" in the guidelines thus refers primarily to coding standards governing control flow, and module organization and construction. They require three basic control flow figures and permit two optional ones, as shown in Figure 2. They refer to a Guide [7] (see DOCUMENTATION below) which contains general information and standards for structured programming, as well as detailed standards for use of various programming languages. They also require that code be reviewed by someone other than the developer. The detailed guidelines for structured programming are as follows:
a. The conventions established in the Structured Programming Guide are being followed. Exceptions to conventions are documented. If a language is being used for which conventions have not been published in the Guide, then use of a locally generated set of conventions consistent with the rules of structured programming is acceptable.
b. The code is being reviewed for functional integrity and for adherence to the structured programming conventions.
c. A Development Support Library is being used.
3. Top-Down Development
Top-down development refers to the process of concurrent design and development of program systems containing more than a single compilable unit. It requires development to proceed in a way which minimizes interface problems normally encountered during the integration process typical of "bottom-up development" by integrating and testing modules as soon as they are developed. Other advantages are that:
a. It permits a project to man up more gradually and should reduce the total manpower required.
b. Computer time requirements tend to be spread more evenly over the development period.
c. The user gets to work with major portions of the system much earlier and can identify gross errors before acceptance testing.
d. Most of the system has been in use long enough by the time it is delivered that both the user and the developer have confidence in its reliability.
e. The really critical interfaces between control and function code are the first ones to be coded and tested and are in operation the longest.
The term "top-down" may be somewhat misleading if taken too literally. What top-down development really implies in everyday production programming is that one builds the system in a way which ideally eliminates (or more practically, minimizes) writing any code whose testing is dependent on other code not yet written, or on data which is not yet available. This requires careful planning of the development sequence for a large system consisting of many programs and data sets, since some programs will have to be partially completed before other programs can be begun. In practice, it also recognizes that exigencies of customer requirements or schedule may force deviations from what would otherwise be an ideal development sequence. The guidelines for top-down development are as follows:
a. Code currently being developed depends only on code already operational, except in those portions where deviations from this procedure are justified by special circumstances.
b. The project schedule reflects a continuing integration, as part of the development process, leading directly to system test, as opposed to a development, followed by integration, followed by system test, cycle.
c. Structured Programming is being used.
d. A Development Support Library system is being used. (While ongoing projects may not be able to meet this criterion, an implementation of structured coding practice is acceptable in these cases.)
e. The managers of the effort have attended a structured programming orientation course (see EDUCATION below).

4. Chief Programmer Teams
A Chief Programmer Team (CPT) is a functional programming organization built around a nucleus of three experienced professionals doing well-defined parts of the programming development process using the techniques and tools described above. It is an organization uniquely oriented around them and is a logical outgrowth of their introduction and use. Described in detail in [4], [5], and [6], it has been used extensively in FSD on projects ranging up to approximately 100,000 lines of source code and is being experimented with on larger projects. The guidelines for CPT's are as follows:
a. A single person, the chief programmer, has complete technical responsibility for the effort. He will ordinarily be the manager of the other people.
b. There is a backup programmer prepared to assume the role of chief programmer.
c. Top-Down Development, Structured Programming and a Development Support Library are all being used.
d. Top level code segments and the critical control paths of lower level segments are being coded by the chief and backup programmers.
e. The chief and backup programmers are reviewing the code produced by other members of the team.
f. Other programmers are added to the team only to code specific well defined functions within a framework established by the chief and backup programmers.
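The hierarchical relationship among the four guideline components (each level presupposing those below it) can be sketched as a simple check. This is an illustrative sketch only; the function name and the representation of a project as a set of practice names are assumptions, and the sketch ignores the exception the guidelines grant to ongoing projects that cannot adopt a Development Support Library.

```python
# Guideline components, lowest level first, in the order the guidelines
# present them.
LEVELS = [
    "Development Support Library",
    "Structured Programming",
    "Top-Down Development",
    "Chief Programmer Team",
]


def highest_level_met(practices_in_use):
    """Return the highest component whose lower levels are all in use.

    A component counts only if every component below it is also in use;
    None means that not even the introductory level has been reached.
    """
    reached = None
    for level in LEVELS:
        if level not in practices_in_use:
            break
        reached = level
    return reached


# Hypothetical projects, for illustration only.
print(highest_level_met({"Development Support Library", "Structured Programming"}))
print(highest_level_met({"Structured Programming", "Top-Down Development"}))
```

The second call yields `None` because, in this strict reading, practices adopted without the library underneath them do not count toward a level.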
TOOLS
Tools are necessary in order to permit effective implementation of, and achieve maximum benefits from, the ideas of structured programming. Development Support Libraries, introduced above, are a recognized and required component of the methodology employed in FSD. Standards are necessary to ensure a consistent approach and to help realize benefits of improved project communications and manageability. Procedures are required for effective use of the tools and to permit functional breakup and improved overall efficiency in the programming process. Finally, other techniques of design, programming, testing and management can be helpful in a structured programming environment as well as in a conventional one.
1. Development Support Libraries
The need for and value of Development Support Library (DSL) support, both as a necessity for structured programming and as a vehicle for project communication and control, has been thoroughly covered in [4], [5], [6], [7], and [8]. Early work on DSL's centered on the provision of libraries for projects using IBM's System/360 Operating System and Disk Operating System. The OS/360 Programming Production Library (PPL) is typical of those we are using in batch programming development situations. It consists of internal (computer-readable) and external (human-readable) libraries, and office and machine procedures. Similar concepts and approaches apply to our other library systems, including those working in an online environment.

The PPL keeps all machineable data on a project - source code, object code, linkage editor language, job control language, test data, and so on - in a series of data sets which comprise the internal library (see Figure 3). Since all data is kept internally and is fully backed up, there is no need for programmers to generate or maintain their own personal copies. Corresponding to each type of data in the internal library there is a set of current status binders which comprise the external library (see Figure 4). These are filed centrally and used by all as a standard means of communication. There is also a set of archives of superseded status pages which are retained to assist in disaster recovery, and a set of run books containing run results. Together, these record the activities - current and historical - of an entire project and keep it completely organized.
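The relationship between the internal library, the external library and the archives can be sketched in a few lines. This is a toy model under stated assumptions, not the PPL itself: the class and method names, the in-memory dictionaries, and the `V1M0` style of labelling version and modification levels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """One item in the internal library, e.g. a source module."""
    name: str
    text: str
    version: int = 1
    modification: int = 0


@dataclass
class ProjectLibrary:
    internal: dict = field(default_factory=dict)  # current machine-readable data
    external: dict = field(default_factory=dict)  # current status pages (listings)
    archive: list = field(default_factory=list)   # superseded status pages

    def _file_listing(self, member):
        """File a new status page and retain the page it supersedes."""
        page = f"{member.name} V{member.version}M{member.modification}\n{member.text}"
        old = self.external.get(member.name)
        if old is not None:
            self.archive.append(old)
        self.external[member.name] = page

    def add(self, name, text):
        member = Member(name, text)
        self.internal[name] = member
        self._file_listing(member)
        return member

    def update(self, name, text):
        member = self.internal[name]
        member.text = text
        member.modification += 1
        self._file_listing(member)
        return member


# Hypothetical usage: a module is entered, then corrected once.
lib = ProjectLibrary()
lib.add("PAYROLL", "first draft")
lib.update("PAYROLL", "corrected draft")
print(lib.external["PAYROLL"].splitlines()[0])
print(len(lib.archive))
```

Every change goes through one filing routine, so the external page and the archive can never drift from the internal copy; that is the property the library procedures are meant to guarantee.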
The machine procedures, as the name implies, are cataloged procedures which perform internal library maintenance, backup, expansion and so on. Most of them are used by Programming Librarians by means of simple control cards they have been trained to prepare. A complete list is given in Table 1.

The office procedures are a set of "clerical algorithms" used by the Programming Librarian to invoke the machine procedures, to prepare the input and file the output. Once new code has been created and placed in the library initially, a programmer makes corrections to it by marking up pages in the external library and giving them to the Programming Librarian to make up control and data cards to cause the corresponding changes or additions to be made to the internal library. As a result, clerical effort and wasted time on the part of the programmers are significantly reduced. Figure 5 shows the work flow and the central role of the Programming Librarian in the process. Because programmers are served by the Librarian and the PPL, they are freed from interruptions and can work on more routines in parallel than they previously did.

The PPL machine and office procedures are documented for programmers in [7] and for librarians in [9].

Subsequent work on DSL's in FSD has extended the support to some of the non-System/360 equipment in use and also introduced interactive DSL's for use both by librarians and programmers. Furthermore, a study of general requirements for DSL's has been performed under contract to the U. S. Air Force and has been published in [10]. DSL's are now available for and in use on most programming projects in FSD.
2. Standards
To support structured programming in the various languages used, programming standards were required. These covered both the implementation of the control flow figures in each language and the conventions for formatting and indenting programs in that language.

There are four approaches which can be taken to provide the basic and optional control flow figures in a programming language, and each was used in certain situations in FSD.
a. The figures may be directly available as statements in the language. In the case of PL/I, all of the basic figures were of this variety. In COBOL, the IFTHENELSE (with slight restrictions) and the DOUNTIL (as a PERFORM) were present.
b. The figures may be easily simulated using a few standard statements. The CASE statement may be readily simulated in PL/I using an indexed GOTO and a LABEL array, with each case implemented via a DO-group ending in a GOTO to a common null statement following all cases.
c. A standard pre-processor may be used to augment the basic language statements to provide necessary features. The macro assembler has been used in FSD to add structuring features to System/360, System/370 and System/7 Assembler Languages.
d. A special pre-processor may be written to compile augmented language statements into standard ones, which may then be processed by the normal compiler. This was done for the FORTRAN language, which directly contains almost none of the needed features.
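Approach (b), simulating a missing figure with a few standard statements, can be illustrated by analogy. The sketch below is in Python rather than PL/I, so it is not the construction used in FSD: a dictionary lookup stands in for the LABEL array and indexed GOTO, and the single `return` plays the part of the common null statement on which all cases rejoin. The helper names and the transaction codes are hypothetical.

```python
def do_until(body, done):
    """DOUNTIL: run body at least once, then repeat until done() is true."""
    while True:
        body()
        if done():
            break


def case(selector, branches, otherwise):
    """CASE: select one branch by value; every branch rejoins at one exit."""
    handler = branches.get(selector, otherwise)
    return handler()


def classify(code):
    """Hypothetical transaction codes, for illustration only."""
    return case(
        code,
        {
            1: lambda: "add",
            2: lambda: "change",
            3: lambda: "delete",
        },
        otherwise=lambda: "reject",
    )


counter = []
do_until(lambda: counter.append(len(counter)), lambda: len(counter) >= 3)
print(classify(2), classify(9), counter)
```

Wrapping the simulation in one small helper keeps the single-entry, single-exit property visible at every point of use, which is the purpose of standardizing the figure rather than letting each programmer improvise it.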
The result of using these four approaches was a complete set of figures for PL/I, COBOL, FORTRAN and Assembler. Using these as a base, similar work was also done for several special-purpose languages used in FSD.

To assist in making programs readable and in standardizing communications and librarian procedures, it was desirable that programs in a given language should be organized, formatted and indented in the same way. (This was true of the Job Control and Link Editor Languages as well as of the procedural languages mentioned above.) Coding conventions were developed for each, covering the permitted control structures, segment formatting, naming, use of comments, labels, and indentation and formatting for all control flow and special (e.g., OPEN, CLOSE, DECLARE) statements.

3. Procedures
An essential aspect of the use of DSL's is the standardization of the procedures associated with them.
The machine procedures
used in setting up, maintaining and terminating the libraries were mentioned above in that connection.
However,
the office
procedures used by librarians in preparing runs, executing them and filing the results are also quite extensive.
These were
developed and documented [9] in a form readily usable by non-programming-oriented librarians.

4. Other
While the above constitute the bulk of the work originated by FSD, certain other techniques and procedures have been assimilated into the methodology in varying degrees. These include management techniques, HIPO diagrams and structured walk-throughs.
FSD has been a leader in the development of management techniques for programming projects.
A book [11] resulting from a management course and guide used in FSD has become a classic in the field. As top-down development and structured programming came into use, it became apparent that traditional management practices would have to be substantially revised (see IMPLEMENTATION EXPERIENCE below). An initial examination was done, and a report [12] was issued which has been very valuable in guiding managers into using the new methodology. This material is now being added to a revised edition of the FSC Programming Project Management Guide, from which the book [11] mentioned above was drawn.

A documentation technique called HIPO (Hierarchy plus Input-Process-Output) diagrams [8,13], developed elsewhere in IBM, has proved valuable in supporting top-down development. HIPO consists of a set of operational diagrams which graphically describe the functions of a program system from the general to the detail level. Not to be confused with flowcharts, which describe procedural flow, HIPO diagrams provide a convenient means of documenting the functions identified in the design phase of a top-down development effort. They also serve as a useful introduction to the code contained in a DSL and as a valuable maintenance tool following delivery of the program system.

Structured walk-throughs [8] were developed on the second CPT project as a formal means for design and code reviews during the development process. Using HIPO diagrams and eventually the code itself, the developer "walks through" his efforts for the reviewers. These latter may consist of the Chief or Backup Programmer (or lead programmer if a CPT is not being employed), other programmers and a representative from the group which will formally test the programs. Emphasis is on error avoidance and detection, not correction, and the attitude is open and non-defensive on the part of all participants (today's reviewer will be tomorrow's reviewee). The reviewers prepare for the walk-through by studying the diagrams or code before the meeting, and followup is the responsibility of the reviewee, who must notify the reviewers of corrective actions taken.
DOCUMENTATION AND EDUCATION

Once the fundamental tools and guidelines were established, it was necessary to begin disseminating them throughout FSD. Much experimental work had already been done in developing the tools and guidelines themselves, so that a cadre of people familiar with them was already in being.
Most of the documentation has been referred to above. The primary reference for programmers was the FSC Structured Programming Guide [7]. In addition to the standards for each language and for use of the PPL, it contained general information on the use of top-down development and structured programming, as well as the procedures for making exceptions to them when necessary. It also contained provisions for sections to be added locally when special-purpose languages or libraries were in use. Distributed throughout FSD, the Guide has been updated and is still the standard reference for programmers. The FSC Programming Librarian's Guide [9] serves a similar purpose for librarians and also has provisions for local sections where necessary. While the use of the macros for System/360 Assembler Language was included in the Programming Guide, additional documentation [14] was available on them if desired. Finally, management documentation in the form of [11] and [12] was also available.
It was recognized that providing documentation alone was not sufficient to permit most personnel to begin applying the techniques. Structured programming requires substantial changes in the patterns and procedures of programming, and a significant mental effort and amount of practice are needed to overcome old habits and instill new ones. A series of courses (one for each major language) was set up to train experienced FSD programmers in structured programming and DSL techniques. Lasting twenty-five hours, these courses provided instruction and, more importantly, practice problems which forced the programmers to begin the transition process. Once all programmers had been retrained, these courses were discontinued, and structured programming is now included as part of the basic programmer training courses given to newly hired personnel.
The same situation held true for managers as well as programmers. Because FSD wished to apply the methodology as rapidly as possible, it was desirable to acquaint managers with it and its potential immediately. Thus, one of the first actions taken was to give a half-day orientation course to all FSD managers. This permitted them to evaluate the depth to which they could begin to use it on current projects, and to begin to plan for its use on proposed projects. This was then followed up by a twelve-hour course for experienced programming managers, acquainting them with management and control techniques peculiar to top-down development and structured programming. (It was expected that most of these managers would also attend one of the structured programming courses described above to acquire the fundamentals.) Again, now that most programming managers have received this form of update, the material has been included in the normal programming management course given to all new programming managers.
MEASUREMENT
One of the problems of the production programming world is that it has not developed good measures of its activities.
Various past efforts, most notably the System Development Corporation studies [15], have attempted to develop measurement and prediction techniques for production programming projects.
The
general results have been that a number of variables must be accurately estimated to yield even rough cost and schedule predictions,
and that the biggest factors are the experience and
abilities of the programmers involved.
Nevertheless,
it was
felt in FSD that some measures of activity were needed, not so much for prediction as for evaluation of the degree to which the methodology was being applied and the problems which were experienced in its use.
To these ends, two types of measure-
ments were put into effect.
The first type of measurement, implemented immediately, was a monthly report required from each programming project. Each programming manager was required to state:
1. The total number of programmers on the project.
2. The number currently programming.
3. The number using structured programming.
4. The number of programming groups on the project.
5. The number of CPT's.
6. Whether a DSL was in use.
7. Whether top-down development was in use.
These figures were summarized monthly for various levels of FSD management and were a valuable tool in ensuring that the methodology was indeed being introduced.
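
As a minimal sketch of how the seven reported items might be recorded and rolled up for a level of management, the record and summary below mirror the list above. The field names, the `project` identifier and the particular summary figures are assumptions made for illustration; the source does not describe the actual report layout or how the summaries were computed.

```python
from dataclasses import dataclass


@dataclass
class MonthlyReport:
    project: str                  # hypothetical identifier, not one of the seven items
    programmers_total: int        # item 1
    programmers_programming: int  # item 2
    programmers_structured: int   # item 3
    programming_groups: int       # item 4
    cpt_count: int                # item 5
    dsl_in_use: bool              # item 6
    top_down_in_use: bool         # item 7


def summarize(reports):
    """Roll a month's project reports up into one summary for management."""
    active = sum(r.programmers_programming for r in reports)
    structured = sum(r.programmers_structured for r in reports)
    return {
        "projects": len(reports),
        "programmers_total": sum(r.programmers_total for r in reports),
        # Share of currently active programmers using structured programming.
        "structured_share": structured / active if active else 0.0,
        "cpt_count": sum(r.cpt_count for r in reports),
        "projects_with_dsl": sum(r.dsl_in_use for r in reports),
        "projects_top_down": sum(r.top_down_in_use for r in reports),
    }
```

A summary of this kind tracks adoption only; it says nothing about prediction, which matches the stated purpose of the monthly report.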
The second type of measurement was a much more comprehensive one. It required a great deal of research in its preparation, and eventually took the form of a questionnaire from which data was extracted to build a measurement data base. The questionnaire contains 105 questions organized into the following eight sections:
1. Identification of the project.
2. Description of the contractual environment.
3. Description of the personnel environment.
4. Description of the personnel themselves.
5. Description of the technical environment.
6. Definition of the size, type and quality of the programs produced.
7. Itemization of the financial, computer and manpower resources used in their development.
8. Definition of the schedule.
The questionnaire is administered at four points during the lifetime of every project. The first point is at the beginning, in which all questions are answered with estimates. The next administration is at the end of the design phase, when the initial estimates are updated as necessary. It is again filled out halfway through development, when actual figures begin to be known. And it is completed for the last time after the system has been tested and delivered, and all results are in. The four points provide for meaningful comparisons of estimates to actuals, and allow subsequent projects to draw useful guidance for their own planning.
The data base
permits reports to be prepared automatically and statistical comparisons to be made.
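
The comparison of estimates to actuals across the four administration points can be sketched as a relative-error calculation. This is an illustrative assumption, not the statistical method used with the FSD data base: the point names and the choice of relative error are hypothetical, and the value recorded at delivery is treated as the actual.

```python
# The four administration points, in order; the last one holds the actual figure.
POINTS = ("start", "end_of_design", "mid_development", "delivery")


def estimate_errors(answers):
    """Return the relative error of each earlier answer against the delivery value.

    answers maps each point name to the number recorded for one question,
    for example lines of source code or man-months.
    """
    actual = answers["delivery"]
    if actual == 0:
        raise ValueError("actual value is zero; relative error is undefined")
    return {point: (answers[point] - actual) / actual for point in POINTS[:-1]}


if __name__ == "__main__":
    man_months = {"start": 80, "end_of_design": 90,
                  "mid_development": 110, "delivery": 100}
    for point, error in estimate_errors(man_months).items():
        print(f"{point}: {error:+.0%}")
```

Collected over many projects, errors of this kind are what would let a later project judge how far to trust its own early estimates.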
IMPLEMENTATION EXPERIENCE
Each of the four components of the methodology which FSD has introduced has resulted in substantial benefits.
However,
experience has also revealed that their application is neither trivial nor trouble-free.
This section presents a qualitative
analysis of the experience to date, describing both the advantages and the problems.
1. Development Support Libraries
Most projects of any size have historically gravitated toward use of a program library system of some type.
This was certainly true in FSD, which had some highly developed systems already in place when the methodology was introduced. These were primarily used as mechanisms to control the code, so that differing versions of complex systems could be segregated. In some cases they provided program development services such as compilation, testing and so forth. However, none were being used primarily to achieve the goals of improved communications or work functionalization which are the primary benefits of a DSL. In fact, the general attitude toward the services they provided was that they were there to be used when and if the programmers wished. Most code in them was presumed private, with the usual exceptions of macro and subroutine libraries.
One of the most difficult problems in the introduction of the DSL approach was to convince ongoing projects that their present library systems fulfilled neither the requirements nor the intents of a DSL. A DSL is as much a management tool as a programmer convenience. A Programming Librarian's primary responsibility is to management, in the sense of supporting control of the project's assets of code and data -- analogous to a controller's responsibility to management of supporting control of financial assets. The project as a whole should be entirely dependent on the DSL for its operation, and this, more than any other criterion, is the determining factor in whether a library system meets the guidelines as a DSL.
When all functions are provided, and a project implements a DSL, then a high degree of visibility is available. Programmers use the source code as a basic means of communication and rely on it to answer questions on interfaces or suggest approaches to their problems. Managers use the code itself (or the summary features of more sophisticated DSL's) to determine the progress of the work. Users also benefit, even at an early stage of implementation, from the ready availability of the test data and the feasibility of beginning to use the developing system on an experimental basis.

The visibility in itself is valuable, even on a laissez-faire basis. But when it is coupled with well-managed code-reading procedures, it also provides quality improvements. The walk-throughs described above, or equivalent procedures, ensure that someone in addition to the developer reviews the code, verifying that the specifications have been addressed, checking the planned test coverage, assisting in standards compliance and last but not least, constructively criticizing the content. While the review procedure is obviously greatly facilitated by concomitant use of structured programming, it is possible without it and was included with the DSL guidelines to encourage its adoption.
The archives which are an integral part of a DSL provide an ability to refer to earlier versions of a routine - sometimes useful in tracing intent when a program is passed from hand to hand.
More importantly, they give a project the ability to recover from a disaster in which part of its resources are destroyed. (It is perhaps obvious but worth mentioning that this insurance will not be complete unless project management sees to it that the backup data sets are stored physically separate from the working versions.) There was an initial tendency in FSD to over-collect and to over-formalize the archiving process. It appears unnecessary to retain more than a few generations of object code, run results and so forth. The source code and test data generally warrant longer retention, but even here it rapidly becomes impractical to save all versions. In general, sufficient archives should be retained to provide complete recovery capability when used in conjunction with the backup data sets, plus enough additional to provide back references.
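
A retention rule of this kind can be sketched as a per-category limit on the number of generations kept. The categories come from the paragraph above, but the specific limits are hypothetical placeholders; the source says only "a few generations" for object code and run results and "longer retention" for source code and test data.

```python
# Hypothetical limits: short for derived material, longer for source and test data.
RETENTION = {"object_code": 3, "run_results": 3, "source": 10, "test_data": 10}


def prune_archive(generations, keep):
    """Keep only the newest `keep` generations; the list is ordered oldest first."""
    if keep < 1:
        raise ValueError("keep at least one generation")
    return generations[-keep:]


def prune_all(archive):
    """Apply the per-category retention limit to every category in the archive."""
    return {kind: prune_archive(gens, RETENTION[kind])
            for kind, gens in archive.items()}
```

Whatever limits are chosen, the newest retained generation together with the physically separate backup data sets must still be enough for complete recovery.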
The separation of function introduced by the DSL office procedures has two main benefits. The obvious one is of lowered cost through the use of clerical personnel instead of programmers for program maintenance, run setup and filing activities. A significant additional benefit comes about through the resulting more concentrated use of programmers. By reducing interruptions, librarians afford the programmers a work environment in which errors are less likely to occur. Furthermore, they permit programmers to work on more routines in parallel than typically is the case.
The last major benefit derived from a DSL rests in its support of a programming measurement activity. By automatically collecting statistics of the types described above, DSL's can enhance our ability to manage and improve the programming process. The early DSL's in FSD did not include measurement features, and the next generation is only beginning to come into use, so a full assessment of this support is not yet possible.
It was difficult to convince FSD projects in some cases that a well-qualified Programming Librarian could benefit a project as much as another programmer. In fact, there was an initial tendency to use junior programmers or programmer technicians to provide librarian support. This had two disadvantages and hence is not recommended. First, the use of programming-qualified personnel is not necessary because of the well-defined procedures inherent in the DSL's. Use of overqualified individuals
in some cases led to boredom and sloppy work with
a resulting loss of quality.
Second,
such personnel cannot
perform other necessary functions when needed.
One of the
advantages of using secretaries as librarians is that they can use both skills effectively over the lifetime of a typical project.
During design and documentation phases,
they can
provide typing and transcription services; while during coding and testing phases,
they can perform the needed librarian
work.
Two problems remain in defining completely the role of librarians.
First,
the increasing use of interactive systems
for program development is forcing an evolution of librarian skills toward terminal operation and test support rather than coding of changes and extensive filing.
The most effective
division of labor between programmer and librarian in such an environment remains to be determined.
It also appears possible to use librarians to assist in documentation, such as in preparation of HIPO diagrams. Second, FSD has a number of small projects in locations remote from the major office complexes and support facilities - frequently on customer premises. Here it is not always possible to use a librarian cost-effectively. In this situation, better definition of the programmer/librarian relationship in the interactive system development environment may permit development and librarian support to some extent from the central facility instead of requiring all personnel to be on-site.

2. Structured Programming
Recall that the FSD use of the term "structured programming" is a narrow one, adopted to permit ongoing projects to use some of the methodology. In this usage, it is more like what might be called "structured coding" and is limited to those techniques used in developing a single compilable unit of code (module). Combined with usage of a DSL, it provides enhanced readability, enforces modularity and thus encourages changeability and maintainability, simplifies testing, and permits improved manageability and accountability. These are all well-known properties of structured programming and need not be elaborated on here. An additional, unplanned for, benefit of structured programming is that it tends to encourage the property of "locality of reference", which improves performance in a virtual systems environment.

Reflecting on the advantages attributed to structured programming and the use of DSL's, one is struck by the fact that the techniques fundamentally are directed toward encouraging programming discipline. Historically, programming has been a very individualistic, undisciplined activity. Thus, introducing discipline in the form of practices which most programmers recognize as beneficial yields double rewards -- the advantages inherent in the methodology itself, plus those due to better standardization and control.

The introduction of structured programming was not easily achieved in FSD. The broad variety of projects, languages and support has already been mentioned, and the development of DSL's, the Guides [7,9] and the education program were necessary before widespread application of the methodology could take place. Furthermore, the ongoing nature of many of the systems meant that structured programming could take place only as modules were rewritten or replaced.

This gradual introduction created a problem of educational timing.
Practically,
it was most expedient to have programmers
attend the education courses between assignments.
The nature
of the courses was such that they introduced the techniques and provided some initial practice.
Yet they required sub-
stantial work experience using the techniques to be fully effective.
Structured programming requires the development of a
whole new set of personal patterns in programming.
Until old
habits are unlearned and replaced by new ones, it is difficult for programmers to fully appreciate the advantages of structured programming.
For best results, this work experience and
the overcoming of the natural reluctance to change habits should follow the training immediately.
This was not always feasible
and resulted in some loss of educational effectiveness.

A second problem arose because of the real-time nature of a significant fraction of FSD's programming business.
Here the
difficulty was one of demonstrating that structured programming was not detrimental to either execution speed or core utilization.
While it is difficult to verify the advantages quantita-
tively, a working consensus has arisen.
Simply stated, it is
that the added time and thought required to structure a program pay off in better core utilization and improved efficiency which generally are comparable to the effects achieved in unstructured programs by closer attention to detail. It is also useful to note that even in "critical" programs, a relatively small fraction of the code is really time- or core-sensitive, and this fraction may not in fact be predictable a priori.
Hence it is
probably a better strategy to use structured programming throughout to begin with.
Then, if performance bottlenecks do appear
and cannot be resolved otherwise,
at most small units of code must be hand-tailored to remedy the problems. In this way the visibility, manageability and maintainability advantages of structured programming are largely retained.

Perhaps the most difficult problem to overcome in applying structured programming is the purist syndrome, in which the goal is to write perfectly structured code in every situation. It must be emphasized that structured programming is not an end in itself, but is a means to achieving better, more reliable, more maintainable programs. In some cases (e.g., exiting from a loop when a search is complete, handling interrupt conditions), religious application of the figures allowed by the Guide may produce code which is less readable than that which might contain a GO TO (e.g., to the end of the loop block, or to return from the interrupt handler to a point other than the point of interrupt). Clearly the exceptions must be limited if discipline is to be maintained, but they must be permitted when desirable.
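
The search-loop case can be illustrated with a small sketch. Python has no GO TO, so `break` stands in here for a jump to the end of the loop block; both functions and their names are hypothetical and are not drawn from the Guide.

```python
def find_pure(items, target):
    """Strictly single-exit loop: a flag carries the 'found' state."""
    index = 0
    found = False
    while index < len(items) and not found:
        if items[index] == target:
            found = True
        else:
            index += 1
    return index if found else -1


def find_with_exit(items, target):
    """Same search, leaving the loop as soon as the item is found."""
    position = -1
    for index, item in enumerate(items):
        if item == target:
            position = index
            break  # stands in for a GO TO to the end of the loop block
    return position
```

Both versions compute the same result. The first needs an extra flag and a compound loop condition to stay within the basic figures, which is the kind of readability cost the paragraph above has in mind.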
Our approach in FSD has been to require management approval and documentation for each such deviation. This ensures that only cases which are clearly justified will be nominated as exceptions, since otherwise the requirements are prohibitive.

3. Top-Down Development
As defined above, top-down development is the sequencing of program system development to eliminate or avoid interface problems. This permits development and integration to be carried out in parallel and provides additional advantages such as early availability discussed under GUIDELINES.
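
One common way such sequencing is realized is to write and run the top-level code first, with lower-level units represented by stubs whose interfaces are already final. The sketch below assumes that practice; the stub technique and every name in it are illustrative assumptions, not taken from the guidelines referred to above.

```python
def read_orders():
    """Stub: returns fixed data until the real input unit is written."""
    return [("widget", 2), ("gadget", 1)]


def price_of(item):
    """Stub: a flat placeholder price; the interface (name in, number out) is final."""
    return 10.0


def format_report(total):
    """Already complete: formats the grand total for output."""
    return f"TOTAL {total:.2f}"


def main_program():
    """Top-level segment, written and integrated before the units it calls."""
    total = sum(price_of(item) * quantity for item, quantity in read_orders())
    return format_report(total)


if __name__ == "__main__":
    print(main_program())
```

Because the calling code exists and runs before the called code, each stub can later be replaced by its real implementation without changing an interface, and a working (if incomplete) system is available early.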
Top-down development is the most difficult of the four components to introduce, probably because it requires the heaviest involvement and changes of approach on the part of programming managers.
Top-down development has profound effects on traditional programming management methodology. While the guidelines sound simple, they require a great deal of careful planning and supervision to carry out thoroughly in practice, even on a small project. The implementation of top-down development, unlike structured programming and DSL's, thus is fundamentally a management and not a programming problem.
Let us distinguish at this point between what might be called "top-down programming"
and true top-down development.
While
they were originally used interchangeably and the guidelines do not distinguish between them, the two terms are valuable in delineating levels of scope and complexity as use of the methodology increases.
Top-down programming is primarily a single-program-oriented concept.
It applies to the development of a "program", typical-
ly consisting of one or a few load modules and a number of independently compilable units, which is developed by one or a few programmers.
At this level of complexity the problems are primarily ones of program design, and the approaches used are those of classical structured programming (here not the narrower FSD definition) such as "levels of abstraction" [16] and the use of Mills' Expansion Theorem [3]. Within this scope of development external problems and constraints are not as critical, and while management involvement is needed, it need not be so pervasive as in top-down development. Many of FSD's successful projects have been of this nature, and the experience gained on them has been most valuable.
Top-down development, on the other hand, is a multiple-program oriented idea. It applies to the development of a "program system", typically consisting of many load modules and perhaps a hundred or more independently compilable units, which is developed by one or more programming departments with five or more people in each. Now the problems expand to those of system architecture, and external problems and constraints become the major ones. The programs in the system are usually interdependent and have a large number of interfaces, perhaps directly but also frequently through shared data sets or communications lines. They may operate in more than one processor concurrently - for example, in a System/7 "front end" and a System/370 "host".
The complexity of such a system makes management involvement in its planning and development essential even when external constraints are minimal.
It involves all aspects of the project
from its inception to its termination.
For example,
a proposal
for a project to be implemented top-down should differ from one for a conventional implementation in the proposed manning levels and usage of computer time. Functions must be carefully analyzed
during the system design phase to ensure that the requirements of minimum code and data dependency are met, and a detailed implementation sequence must be planned in accordance with the overall proposed plan and schedule.
The design of the system very probably should differ significantly from what it would have been if a bottom-up approach were to be used. During implementation, progress must be monitored via the DSL to ensure that this sequence is being followed, and that schedules are being met. The early availability of parts of the system must be coordinated with the user if he intends to use these parts for experimentation or production. An entirely different type of test plan must be prepared, for incremental testing over the entire period. Rather than tracking individual components, the manager is more concerned with the progress of the system as a whole, which is a more complicated matter to assess. This is normally not determinable until the integration phase in a bottom-up development, when it suddenly becomes a critical item; in top-down work it is a continuing requirement, but one which enables the manager to identify problems earlier and to correct them while there is still time to do so.

In a typical system development environment such as those in FSD, however, external constraints are the rule rather than the exception. A user will have schedule requirements which must be met.
A particular data set must be designed to interface
with an existing system.
Special hardware may arrive late in
a development cycle and may vary from that desired.
These are
typical of situations not directly under the developers'
control
which have profound effects on the sequence in which the system is produced.
Now the manager's job becomes still more complex
in planning and controlling development.
Each of these external
constraints may force a deviation from what would otherwise be a classical,
no-dependency development sequence.
Provision may have to be made for testing, documentation and delivery of products at intermediate points in the overall cycle. This will typically change the schedule from the ideal one, and will probably increase the complexity of the management job. This is especially true on a very large project (several hundred thousand lines of source code or more), since any realistic schedule may well require that major subsystems be developed in parallel and integrated in a nearly conventional fashion (hopefully at an earlier point in time than the end of the project). This was carried out successfully on a project of 400,000 lines of source code, the largest known to the author to date.

When carried to its fullest extent, top-down development of a large system probably has greater effects on quality (and thus, indirectly, on productivity) than any other component of the methodology. Even when competent management is fully devoted to its implementation, there are two other problems which potentially can arise and must be planned for. These both relate to the overlapping nature of design, development and integration in a top-down environment.

The first of these concerns the nature of materials documenting the system to be delivered to and reviewed by the user. Typically, a user receives a program design document at the end of the design phase and must express his concurrence before development proceeds. This is impractical in top-down development because development must proceed in some areas before design is complete in others. To give a user a comparable opportunity, a detailed functional specification is desirable instead. This describes all external aspects of a system, as well as any processing algorithms of concern to a user, but does not address its internal design.
This type of specification is probably more readily assimilated by typical users, is more meaningful than a design document and should pose no problems in most situations. Where standardized procurement regulations (such as U.S. Government Armed Services Procurement Regulations) are in effect, then efforts must be made to seek exceptions. (As top-down development becomes more prevalent, then it is hoped that changes to such procedures will directly permit submission of this type of specification.)
The second problem is one of the most severe to be encountered in any of the components and is one of the most difficult to deal with. It has to do with the depth to which a design should be carried before implementation is begun. If a complete, detailed design of an entire system is done, and implementation of key code in all areas is carried out by the programmers who begin the project, then the work remaining for programmers added later is relatively trivial. In some environments this may be perfectly appropriate and perhaps even desirable; in others it may lead to dissatisfaction and poor morale on the part of the latecomers. It can be avoided by recognizing that design to the same depth in all areas of most systems is totally unnecessary. The initial system design work (the overworked term "architecture" still seems to be appropriate here) should concentrate on specifying all modules to be developed and all inter-module interfaces. Those modules which pose significant schedule, development or performance problems should be identified, and detailed design work and key code writing done only on these. This leaves scope for creativity and originality on the part of the newer programmers, subject obviously to review and concurrence through normal project design control procedures.
On some projects, the design of entire support subsystems with interfaces to a main subsystem only through standard, straightforward data sets has been left to late in the project. Note that while this may solve the problems of challenge and morale, it also poses a risk that the difficulty has been underestimated. Thus, here again management is confronted with a difficult decision where an incorrect assessment may be nearly impossible to recover from.
4. Chief Programmer Teams
The introduction of CPT's should be a natural outgrowth of top-down development. The use of a smaller group based on a nucleus of experienced people tends to reduce the communications and control problems encountered on a typical project. Use of the other three components of the methodology enhances these advantages through standardization and visibility.
In order for a CPT to function effectively, the Chief Programmer must be given the time, responsibility and authority to perform the technical direction of the project. In some environments this poses no problems; in FSD it is sometimes difficult to achieve because of other demands which may be levied upon the chief. In a contract programming environment he may be called upon to perform three distinct types of activities: technical management - the supervision of the development process itself, personnel management - the supervision of the people reporting to him, and contract management - the supervision of the relationships with the customer. This latter in particular can be a very time-consuming function and also is the simplest to secure assistance on. Hence many FSD CPT's have a program manager who has the primary customer interface responsibility in all non-technical matters. The Chief remains responsible for technical customer interface as well as the other two types of management; in most cases this makes the situation manageable, and if not then additional support can be provided where needed.
The Backup Programmer role is one that seems to cause people a great deal of difficulty in accepting, probably because there are overtones of "second-best" in the name. Perhaps the name could be improved, but the functions the Backup performs are essential and cannot be dispensed with. One of the primary rules of management is that every manager should identify and train his successor. This is no less true on a CPT and is a major reason for the existence of the Backup position. It is also highly desirable for the Chief to have a peer with whom he can freely and openly interact, especially in the critical stages of system design. The Backup is thus an essential check and balance on the Chief. Because of this, it is important that the Chief have the right of refusal on a proposed Backup; if he feels that an open relationship of mutual trust and respect cannot be achieved, then it is useless to proceed. The requirement that the Backup be a peer of the Chief also should not be waived, since it is always possible that a Backup will be called on to take over the project and must be fully qualified to do so.
One of the limits on a CPT is the scope of a project it can reasonably undertake. It is difficult for a single CPT to get much larger than eight people and still permit the Chief and Backup to exercise the essential amount of control and supervision. Thus, even at the higher-than-normal productivity rates achievable by CPT's it is difficult for a single Team to produce much more than perhaps 20,000 lines of code in its first year and 30-40,000 lines thereafter. Larger projects must therefore look to multiple CPT's, which can be implemented in two ways. First, as mentioned above under Top-Down Development, interfaces may be established and independent subsystems may be developed concurrently by several CPT's and then integrated. Second, a single CPT may be established to do architecture and nucleus development for the entire system. It then can spin off subordinate CPT's to complete the development of these subsystems. The latter approach is inherently more appealing, since it carries the precepts of top-down development through intact. It is also more difficult to implement; the experiment under way by the author ran into problems because equipment being developed concurrently ran into definition problems and prevented true top-down development.
It is difficult to identify problems unique to CPT's which differ from those of top-down development discussed above. Perhaps the most significant one is the claim frequently heard that, "We've had Chief Programmer Teams in place for years - there's nothing new there for us." While it is certainly true that many of the elements of CPT's are not new, the identification of the CPT as a particular form of functional organization using a disciplined, precise methodology suffices to make it unique. In particular, the emphasis on visibility and control through management code-reading, formal structured programming techniques and DSL's differentiates true CPT's from other forms of programming teams [17]. And it is this same set of features which makes the CPT approach so valuable in a production programming environment where close control is essential if cost and schedule targets are to be met.
MEASUREMENT RESULTS
It is not possible, because it would reveal valuable business information, to present significant amounts of quantitative information in this paper. At this time, the results of the measurement program do show substantial improvements in programming productivity where the new technology has been used. A graph has been prepared where each point represents an FSD project. The horizontal axis records the percentage of structured code in the delivered product, and the vertical axis records the productivity. (The latter includes all effort on the project, including analysis, design, testing, management, support and documentation as well as coding and debugging. It also is based only on delivered code, so that effort used to produce drivers, code written but replaced, etc., tends to reduce the measured productivity.) A weighted least squares fit to the points on the graph shows a better than 1.5 to 1 improvement in the coding rate from projects which use no structured programming to those employing it fully.
It is also possible, because the data has already been released elsewhere, to make one quantitative comparison between productivity rates experienced using various components of the technology on some of the programming support work which FSD has performed for the National Aeronautics and Space Administration's Apollo and Skylab projects. This comparison is especially significant because the only major change in approach was the degree to which the new methodology was used; the people, experience level, management and support were all substantially the same in each area.
Figure 6 shows the productivity rates and the components of the technology used. In the Apollo project, a rate of 1161 bytes of new code per man-month was experienced on the Ground Support Simulation work. (Again, all numbers are based on overall project effort.) This work used none of the components described in this paper. In the directly comparable effort on the Skylab project, a DSL, structured programming and top-down development were all employed, and a rate of 3756 bytes of new code per man-month was achieved; almost twice as much new code was produced with slightly more than half the effort. It is interesting also to remark that this was achieved on the planned schedule in spite of over 1100 formal changes made during the development of that product, along with cuts in both manpower and computer time. Finally, while the improvement may rest to some extent on the similar work done previously, this was not demonstrated in the parallel Mission Operations Control work. There productivity dropped from 1547 to 841 bytes per man-month on comparable work which in neither case used anything other than a DSL.
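The rates quoted above can be checked against the code volumes and effort totals of Figure 6. The following is only an illustrative sketch: the dictionary layout and the function name are assumptions made here, and the input figures are those printed in Figure 6, taking "millions of bytes" as exact multiples of 10^6.

```python
# Values transcribed from Figure 6: (bytes of new code, man-months to delivery).
# The data layout and names below are illustrative assumptions.
FIGURE_6 = {
    ("Apollo", "Mission Operations Control"): (5.8e6, 3748),
    ("Apollo", "Ground Support Simulation"): (2.1e6, 1809),
    ("Skylab", "Mission Operations Control"): (1.4e6, 1665),
    ("Skylab", "Ground Support Simulation"): (4.0e6, 1065),
}


def productivity(bytes_of_code, man_months):
    """Bytes of new code per man-month, rounded to the nearest byte."""
    return round(bytes_of_code / man_months)


# Productivity rate for every project area in the figure.
rates = {area: productivity(*totals) for area, totals in FIGURE_6.items()}

# Ground Support Simulation: Skylab compared with Apollo.
apollo_gss = FIGURE_6[("Apollo", "Ground Support Simulation")]
skylab_gss = FIGURE_6[("Skylab", "Ground Support Simulation")]
code_ratio = skylab_gss[0] / apollo_gss[0]    # relative amount of new code
effort_ratio = skylab_gss[1] / apollo_gss[1]  # relative total effort

for area, rate in rates.items():
    print(area, rate)
print(f"code ratio {code_ratio:.2f}, effort ratio {effort_ratio:.2f}")
```

Under these assumptions the code ratio comes out a little above 1.9 and the effort ratio a little below 0.6, which agrees with "almost twice as much new code" for "slightly more than half the effort".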
In addition to making quality measurements and determining productivity rates, the measurement activity has served a number of other useful purposes. First, it has built up a substantial data base of information about FSD projects. As new data is added, checks are made to ensure its validity, and questionable data is reviewed before being added. The result is an increasingly consistent and useful set of data. Second, it has enabled FSD to begin studies on the value of the components of the methodology. Third, and related, it also permits the study of other factors (e.g., environment, personnel) affecting project activity. Fourth, it is used to assist in reviewing ongoing projects, where the objective data it contains has proved quite valuable. And fifth, it is used in estimating for proposed projects, where it affords an opportunity to compare the new work against similar work done in the past, and to identify risks which may exist.
CONCLUSIONS
It should be clear at this point that FSD's experience has been a very positive one. Work remains to be done, particularly in the management of top-down development and the formalization and application of CPT's. Nevertheless, FSD is fully committed to application of the methodology and is continuing to require its use. In retrospect, the plan appears to have been a success and could serve as a model for other organizations interested in applying the ideas. The FSD experience shows that this is neither easy nor rapid. It takes substantial time and effort and, most important, commitments and support from management, to equip an organization to apply the methodology.
To summarize, it appears that once a base of tools, standards and education exists, it is most appropriate to begin with use of structured and top-down programming and DSL's. Where the people, knowhow and opportunity exist, then top-down development should be applied on a few large, complex projects to yield an experienced group of people and the required management techniques. It is likely that one or more of these may also present the opportunity to introduce a CPT. This is essentially the approach that FSD has taken, and it appears to be an excellent way to organize for structured programming.
REFERENCES
[1]  Datamation, Vol. 19, No. 12, December, 1973, pp. 50-63
[2]  B. W. Boehm, "Software and its Impact: A Quantitative Assessment", Datamation, Vol. 19, No. 5, May, 1973, p. 52
[3]  H. D. Mills, Mathematical Foundations for Structured Programming, Report No. FSC 72-6012, IBM Corporation, Gaithersburg, Maryland, USA, February, 1972
[4]  H. D. Mills, Chief Programmer Teams: Principles and Procedures, Report No. FSC 71-5108, IBM Corporation, Gaithersburg, Maryland, USA, June, 1971
[5]  F. T. Baker, "Chief Programmer Team Management of Production Programming", IBM Systems Journal, Vol. 11, No. 1, 1972, pp. 56-73
[6]  F. T. Baker, "System Quality Through Structured Programming", AFIPS Conference Proceedings, Vol. 41, Part I, 1972, pp. 339-343
[7]  Federal Systems Center Structured Programming Guide, Report No. FSC 72-5075, IBM Corporation, Gaithersburg, Maryland, USA, July, 1973 (revised)
[8]  Improved Technology for Application Development: Management Overview, IBM Corporation, Bethesda, Maryland, USA, August, 1973
[9]  Federal Systems Center Programming Librarian's Guide, Report No. FSC 72-5074, IBM Corporation, Gaithersburg, Maryland, USA, April, 1972
[10] F. M. Luppino and R. L. Smith, Programming Support Library (PSL) Functional Requirements: Final Report, IBM Corporation, Gaithersburg, Maryland, USA, prepared under Contract #F30602-74-C-0186 with the U. S. Air Force HQ Rome Air Development Center, Griffiss Air Force Base, New York, USA, July, 1974 (Release subject to approval of Contracting officer, Mr. Paul DeLorenzo)
[11] P. W. Metzger, Managing a Programming Project, Prentice-Hall, Englewood Cliffs, New Jersey, USA, 1973
[12] R. C. McHenry, Management Concepts for Top Down Structured Programming, IBM Corporation, Gaithersburg, Maryland, USA, November, 1972
[13] HIPO - Hierarchical Input - Process - Output Documentation Technique: Audio Education Package, Form No. SR20-9413, IBM Corporation (Available through any IBM Branch Office)
[14] M. M. Kessler, Assembly Language Structured Programming Macros, IBM Corporation, Gaithersburg, Maryland, USA, September, 1972
[15] G. F. Weinwurm et al, Research into the Management of Computer Programming: A Transitional Analysis of Cost Estimation Techniques, System Development Corporation, Santa Monica, California, USA, November, 1965 (available from the Clearinghouse for Federal Scientific and Technical Information as AD 631 259)
[16] E. W. Dijkstra, "The Structure of the THE Multiprogramming System", Communications of the ACM, Vol. 11, No. 5, May, 1968, pp. 341-346
[17] G. M. Weinberg, The Psychology of Computer Programming, Van Nostrand Reinhold, New York, New York, USA, 1971
[Diagram; legible labels: Teams, Structured Programming, Development Support Libraries]
Hierarchy of Techniques
Figure 1
[Flowchart diagrams]
BASIC FIGURES: SEQUENCE, IFTHENELSE, DOWHILE
ADDITIONAL FIGURES: DOUNTIL, CASE
Control Structures
Figure 2
JCL      Job control language
LEL      Linkage editor language
SOURCE   Source language (PL/I, Fortran, BAL, COBOL)
TEST     Project test data
OBJECT   Compiler output
LOAD     Linkage editor output
SYSIN    PPL control data
PPL Internal Library Data Sets
Figure 3
A set of current status notebooks:
  o  JCL      Job control language
  o  LEL      Linkage editor language
  o  SOURCE   Source language (PL/I, COBOL, FORTRAN, BAL)
  o  TEST     Project test data
  o  OBJECT   Compiler output
  o  LOAD     Linkage editor output
  o  RUN      Execution output
A set of archive notebooks:
  For each of the above, plus
  o  General - PPL housekeeping output
PPL External Library
Figure 4
[Diagram; legible labels: Programmers; Project Notebooks: Status, Archives, Run; Coding Sheets, Marked-up Notebooks, Run Requests; Programming Librarian; "Cook Book" Control Cards, PPL Office Procedures; Computer Input; Computer; Machine Procedures]
PPL Operations
Figure 5
Project  Area                         Technologies  Bytes of New Code  Total Effort to Delivery  Productivity
                                      Used          (Millions)         (Man-Months)              (Bytes per Man-Month)
Apollo   Mission Operations Control   DSL           5.8                3748                      1547
Apollo   Ground Support Simulation    None          2.1                1809                      1161
Skylab   Mission Operations Control   DSL           1.4                1665                       841
Skylab   Ground Support Simulation    DSL, SP, TDD  4.0                1065                      3756
Productivity Comparison
Figure 6
Operations: Initiating, Updating, Processing, Housekeeping, Terminating
Procedure   Function
PPLSTART*   Catalogs the project name and generates the SYSIN data set
PPLSETUP*   Sets up space for a single PPL section on a specified disk pack
PPLENTER    Changes or adds members to any specified section other than OBJECT and LOAD
PPLEDIT     Performs the same functions as PPLENTER and, in addition, provides for changing portions of statements and for shifting of statements either right or left
PPLDELET    Removes one member from any section
PPLINDEX    Provides a directory and VTOC listing for any specified section
PPLJCL      Copies specified member from a project's JCL section to the installation common procedure library PPL.PROCLIB
PPLJCLD     Deletes specified member from the project members in the installation common procedure library PPL.PROCLIB
PPLMOVE     Transfers one or more members from any section, except LOAD, to a corresponding section of another project
PPLCOPY     Creates a second copy of a member of any section, except LOAD, and gives the copy a new member name
PPLPRINT    Prints out all members of a section
PPLBALSN    Invokes Assembler F to perform a syntax check on members of SOURCE written in System/360 Assembler Language (does not produce object code)
PPLBAL      Invokes Assembler F to assemble members of SOURCE written in System/360 Assembler Language into members of OBJECT
PPLBALLE    Linkage edits members of OBJECT derived from System/360 Assembler Language into members of LOAD
PPLCBLSN    Invokes the ANSI COBOL compiler to perform a syntax check on members of SOURCE written in COBOL (does not produce object code)
PPLCBL      Invokes the ANSI COBOL compiler to compile members of SOURCE written in COBOL into members of OBJECT
PPLCBLLE    Linkage edits members of OBJECT derived from COBOL into members of LOAD
PPLFTNSN    Invokes the FORTRAN H compiler to perform a syntax check on members of SOURCE written in FORTRAN (does not produce object code)
PPLFTN      Invokes the FORTRAN H compiler to compile members of SOURCE written in FORTRAN into members of OBJECT
PPLFTNLE    Linkage edits members of OBJECT derived from FORTRAN into members of LOAD
PPLPLISN    Invokes the PL/I F compiler to perform a syntax check on members of SOURCE written in PL/I (does not produce object code)
PPLPLI      Invokes the PL/I F compiler to compile members of SOURCE written in PL/I into members of OBJECT
PPLPLILE    Linkage edits members of OBJECT derived from PL/I into members of LOAD
PPLCHKPT    Dumps all PPL sections of one project onto a specified tape
PPLALLCT*   Closes up gaps between remaining members of any section to make room for additional members, or may be used to increase the space allocated to a section
PPLSPACE    Closes up gaps between remaining members of any section to make room for additional members
PPLRESTR*   Restores sections of a project from a checkpoint tape created by PPLCHKPT
PPLCLEAN*   Deletes and uncatalogs any specified section of a project
PPLENDUP*   Deletes a project's name from the system index
* These procedures are normally used only by programmers.
PPL Machine Procedures
Table 1
The Reliability of Programming Systems
H. Gerstmann, H. Diel, and W. Witzel, IBM Germany, Boeblingen
1.
ABSTRACT
The reliability of a programming system is not only determined by the number of errors to be expected, but also by its behaviour in error situations. An error must be kept local to identify its origin and annul its effects at a tolerable expense. This paper discusses a uniform approach to the limitation of error propagation, the identification of the process in error, and the provision for error recovery.
2.
THE MODEL
The concepts of reliability are described for a model of a programming system which consists of three basic types of objects:
(1) procedures, which may be nested as in higher level languages
(2) a state space, represented by the variables declared within the procedures
(3) processes, which are the units of asynchronous operations.
The notions used are taken from reference [1]. The resources of the system, also called objects, are represented by variables. All variables constitute the state variable set
$R = \{x_1, x_2, \ldots, x_n\}$.
An assignment of values to all the variables in the state variable set defines a state of the system. The set of possible states is the state space. With each variable $x_i$ a type is associated which defines the set $V_i$ of values it may assume. In these terms the state space can be written as
$S = V_1 \times V_2 \times \cdots \times V_n$.
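The definition of the state space as a product of value sets can be made concrete with a small enumeration. This is a minimal sketch, assuming three hypothetical variables with small finite value sets; neither the variable names nor their values come from the paper.

```python
from itertools import product

# Hypothetical variables x1..x3, each with its value set V_i (assumed for illustration).
value_sets = {
    "x1": (0, 1),
    "x2": ("idle", "busy"),
    "x3": (True, False),
}


def state_space(value_sets):
    """Enumerate S = V1 x V2 x ... x Vn; each state assigns a value to every variable."""
    names = list(value_sets)
    return [dict(zip(names, values)) for values in product(*value_sets.values())]


states = state_space(value_sets)
print(len(states), "states; first:", states[0])
```

Each element of `states` is one assignment of values to all variables, that is, one state of the system in the sense used above.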
The set of processes {P1, P2, ..., Pn} is partially ordered by a precedence relation Pi < Pk, which can be illustrated in the form of a diagram (figure 1). All processes Pi such that Pi < Pk must have completed before Pk can be initiated.
Each process P uses a subset Rp of the set R of resources. These objects define the subspace of the system state space in which the actions of the process take place. Resources utilized by more than one process are shared resources. Input resources Rip to P are shared resources set by other processes which are referenced by P, output resources Rop those set by P and referenced by other processes. The input state Si of process P is defined by the state of each input resource at process initiation; correspondingly the state of the output resources at process termination describes its output state So. With each process a set of input states {Si} is associated for which a mapping Fp to output states {So} is defined (figure 2). From a functional point of view Fp is a partial function. No action is defined in case P is initiated in a state S ∉ {Si}. The next section is devoted to this exceptional situation.
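The initiation rule implied by the precedence relation can be sketched as follows. The relation below is a hypothetical four-process example, and the function names are assumptions introduced only for illustration.

```python
# Hypothetical precedence relation: the pair (Pi, Pk) stands for Pi < Pk.
PRECEDES = {("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")}


def predecessors(process, precedes=PRECEDES):
    """All Pi with Pi < process, following the relation transitively."""
    found = set()
    frontier = [process]
    while frontier:
        current = frontier.pop()
        for before, after in precedes:
            if after == current and before not in found:
                found.add(before)
                frontier.append(before)
    return found


def can_initiate(process, completed, precedes=PRECEDES):
    """A process may be initiated only when every predecessor has completed."""
    return predecessors(process, precedes) <= set(completed)


print(sorted(predecessors("P4")))
print(can_initiate("P4", {"P1", "P2"}))
```

In this example P4 cannot be initiated while P3 is still outstanding, even though P1 and P2 have completed.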
3.
EXCEPTION HANDLING
It has long been recognized by engineers that instructions perform partial functions. To cope with them, exceptions have been introduced; the ZERODIVIDE and OVERFLOW conditions are typical examples. Higher level languages either ignore this property or just support exceptions on the instruction level as in the case of PL/I. There is, however, no consistent treatment of exceptions at the level of procedures. The argument that such a feature is not needed goes as follows: Exceptions at the procedure level can be either programmed or reduced to hardware exceptions.
Although this is a true statement, it expresses a narrow attitude with respect to the purpose of a language. The semantic distinction between functional and exceptional actions should also be reflected in the syntax of the language. As indicated in figure 3 the functional action Fp expresses the function of the procedure as long as its arguments are in {Si}. If this is not the case the exceptional action Ep maps the invalid state into an exception description. To support this property in a programming language such as PL/I, extensions of the following kind are required:
(1) The values a scalar variable may assume can be constrained by appending a range to its data attribute.
(2) The values of structured variables (including arrays) can be constrained by imposing relations between its subcomponents.
(3) A built-in function RANGE which returns '1'B or '0'B depending on whether the argument lies in its range or not.
(4) A built-in function ON_ERROR_DESCR which returns an error description in the form of a structure as follows:
1 ERROR_DESCR
  2 ERROR_TYPE
    3 ERROR_MAIN_TYPE
    3 ERROR_SUB_TYPE
  2 STATEMENT_NO
  2 STATEMENT_LABEL
  2 AFFECTED_VARIABLES
    3 VARIABLE1
      ...
Figure 4 shows the use of these language constructs to define exceptional actions. The language elements introduced should not be considered as a proposal to extend PL/I. Their purpose is to indicate the direction in which extensions are needed to separate the functional part of a procedure from its exceptional part. A consistent solution cannot neglect type attributes as provided in PASCAL [2,3].
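The separation of a functional action Fp from an exceptional action Ep can be illustrated outside PL/I. The sketch below is written in Python purely for illustration and is not the notation of the paper: `in_range` plays the role of the RANGE built-in, `ErrorDescr` loosely mirrors the ERROR_DESCR structure, and the decorator `checked`, the procedure `scale` and its ranges are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorDescr:
    """Loosely mirrors ERROR_DESCR; the field selection is an assumption."""
    error_main_type: str
    error_sub_type: str
    statement_label: str
    affected_variables: list = field(default_factory=list)


def in_range(value, low, high):
    """Analogue of RANGE: True when the value lies within its declared range."""
    return low <= value <= high


def checked(ranges, label):
    """Attach range constraints to a procedure, separating Fp from Ep."""
    def decorate(fp):
        def procedure(**args):
            bad = [name for name, (low, high) in ranges.items()
                   if not in_range(args[name], low, high)]
            if bad:
                # Exceptional action Ep: map the invalid input state
                # into an exception description.
                return ErrorDescr("INPUT", "RANGE", label, bad)
            # Functional action Fp: defined only for admissible input states.
            return fp(**args)
        return procedure
    return decorate


@checked({"dividend": (0, 1000), "divisor": (1, 100)}, label="SCALE")
def scale(dividend, divisor):
    return dividend // divisor


print(scale(dividend=100, divisor=5))
print(scale(dividend=100, divisor=0))
```

The body of `scale` contains only the functional part; the admissible input states are declared once, next to the procedure, and every violation is turned into a description rather than an undefined action.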
4.
ERROR ISOLATION
The capability to isolate errors in a system does however not only depend on the realization of the partial function concept. Additional system properties are required to attribute an error unequivocally to a certain process: At each point of time only one process may update a shared resource. To this end the use of shared resources must be restricted.
Two different cases are to be considered:
Case 1
The resource is shared by processes which lie on a path through the system (figure 5). Since P1 < P2 it is always possible to allocate the resource R in such a way that
deallocate (R,P1) < allocate (R,P2).
During the execution of processes at any point of time R is allocated to at most one process.
Case 2
The resource is shared by processes which do not lie on a path through the system (figure 6).
In this situation the direct access to the resource is prevented by establishing an interface between the processes and the resource. Assuming that the processes either want to read or to update (read and write) the resource, they have to initiate separate atomic processes READ (R, Pi) or UPD (R, Pj) which are associated with R and obey the constraints indicated in figure 7. An empty circle represents any other process including read and update. Dependent on the intended use of the resource R the processes Pi are decomposed into subprocesses. According to above constraints, disregarding symmetry, this results in one of the three types of diagrams shown in figure 8. By means of this device case 2 is reduced to case 1. The process administering the resource R and the associated set of processes {READ(R, Pi)} ∪ {UPD(R, Pj)} in accordance with above constraints is called a resource manager. There is an interesting parallel to the concept of monitors introduced by Brinch Hansen [4].
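A resource manager in this sense can be approximated with a lock that makes every read and update atomic and records them as one serial path. The sketch below is a loose analogue under that assumption, not the construction of figures 7 and 8; the class and method names are hypothetical.

```python
import threading


class ResourceManager:
    """Serializes READ(R, Pi) and UPD(R, Pj) on a single shared resource R."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()
        self.history = []  # serial path of (operation, process) pairs

    def read(self, process):
        # Atomic READ: no update can interleave with it.
        with self._lock:
            self.history.append(("READ", process))
            return self._value

    def update(self, process, change):
        # Atomic UPD: read and write happen as one step, so at any
        # point of time at most one process updates the resource.
        with self._lock:
            self._value = change(self._value)
            self.history.append(("UPD", process))
            return self._value


manager = ResourceManager(0)
manager.update("P1", lambda value: value + 1)
print(manager.read("P2"), manager.history)
```

Because every access goes through the manager, the state changes of the resource form a single sequence, even when the calling processes themselves run in parallel.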
Due to the use of resource managers a unique path of serial processes can be associated with the state changes of each shared resource (figure 9). For the purpose of error isolation each process including those controlled by resource managers is requested to check its input states.
Whenever an error is detected by a process Pk it must have been caused by some process Pi < Pk on the path for the resource concerned. In any case, Pk will accuse its immediate predecessor Pk-1 of having made an error based on the following consideration: Either Pk-1 caused the error during its execution or made an error in accepting an erroneous input state.
As indicated in figure 10, going the path backwards in this way, the process originating the error can be identified. The process Pk may wrongly accuse Pk-1 of having supplied faulty input. To settle this case, it is necessary that obligatory specifications detailing the interfaces between processes have been established before the implementation.
The language features described for input checking were introduced in the previous section. Their use is now described. Constraints imposed on the state space are either process or system specific. Process specific constraints define the admissible input states of a process. Formal parameters are to be specified with ranges. Dependencies between global variables and/or formal parameters are checked as indicated. System specific constraints are properties of shared resources represented by global variables. To maintain their integrity ranges are appended. Since the sequence in which a shared resource will be used by the processes is undetermined, the ranges must express invariant properties, i.e., the conditions imposed on its state before and after process execution must be the same (figure 11).
The embedding of on-units in process hierarchies is the subject of the next section.
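The backward accusation along the serial path of a resource can be sketched as a simple walk. This is an illustrative reading of the argument above, assuming it is known for each process whether the input state it received satisfied its specification; the function name and data layout are hypothetical.

```python
def originating_process(path, input_valid, detector):
    """Walk the serial path backwards from the detecting process Pk.

    path        -- process names in serial order for one shared resource
    input_valid -- process name -> True if its received input state was admissible
    detector    -- the process Pk that detected the erroneous input state
    """
    k = path.index(detector)
    if input_valid[detector]:
        raise ValueError("the detecting process has a valid input state")
    while k > 0:
        accused = path[k - 1]
        if input_valid[accused]:
            # Valid input but faulty output: the error arose during execution.
            return accused
        # The accused accepted an erroneous input state, so the accusation
        # is passed on to its own immediate predecessor.
        k -= 1
    return None  # the erroneous state entered from outside this path


path = ["P1", "P2", "P3", "P4"]
input_valid = {"P1": True, "P2": True, "P3": False, "P4": False}
print(originating_process(path, input_valid, "P4"))
```

In the example P4 accuses P3, P3 in turn accuses P2, and P2 is identified as the origin because its own input state was admissible.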
5.
ERROR RECOVERY
The concepts developed for error isolation are not sufficient for the purpose of recovery. This can be shown by the following example: Process P1 uses resource R to provide input to process P2.
Case 1
The resource R is used by the serial processes P1 and P2 (figure 12). After providing input to P2 the process P1 terminates. P2 detects an input error. Recovery must comprise process P1 which is no longer in existence.
Case 2
The resource R is used by the parallel processes P1 and P2 (figure 13). After providing input to P2, process P1 continues to exist and discovers an error affecting R. Process P1 has to return to a previous state and recall the data supplied to P2. Therefore recovery must also include process P2. Although in this situation the resource manager is responsible for the input check and errors violating the constraints imposed on R are detected before P2 is initiated, the subprocess P12 may consider the values supplied to R as inconsistent according to the internal semantics of the program.
Thus the concept of input validation as described for the purpose of error isolation must be extended for the purpose of recovery. Two strategies which supplement each other are discussed:
- a discipline with respect to data communications (commitment discipline)
- a hierarchical structure of processes with respect to recovery.
The intent of the commitment discipline is to enforce that no data is committed outside a process unless either it is ensured that there will never be a need to recall the data or there is a mechanism available to do it. As described in section 2, a process makes use of input and output resources. For the purpose of commitment, values submitted to output resources are classified as:

uncommitted - Data the receiving process cannot rely on. It will not be recalled in case the sending process fails.
committed - Data the sending process commits as consistent to the receiving process. Consequently it will not be recalled by the sending process.
precommitted - Data that can exist in one of three states: 'OPEN', 'RECALLED' or 'COMMITTED'. The initial state is 'OPEN'. At recall or commitment the state is changed to 'RECALLED' or 'COMMITTED'.
This distinction is introduced in reference [5], however, applying different terminology and semantics.
Output is committed by the supplying process when it is considered consistent. The term 'consistent' remains undefined. It is up to the individual process to establish appropriate criteria. They should, however, at least guarantee valid output. For committed data, the commitment implies the release to other processes; for precommitted data, it implies a state transition from 'OPEN' to 'COMMITTED'. The data must be committed before termination of the highest level process to which the output variables are non-local.

Committed and uncommitted data do not introduce dependencies between processes which have to be considered for recovery. The situation is different for precommitted data. This notion makes it possible to extend the scope of in-process recovery, which relies on the fact that the process to be recovered is still in existence. Figure 14 shows the state diagram for precommitted data. The sending process sets the data in the initial state 'OPEN'. The associated resource manager guarantees that the data are not changed by any receiving process as long as they are in the state 'OPEN'. Only the sending process is entitled to change the state to 'COMMITTED' or 'RECALLED'. The receiving processes are not permitted to terminate before the state 'COMMITTED' is entered. In addition they must not commit output which depends on input not yet committed. This mechanism ensures that all dependent parallel processes are still in existence in case data have to be recalled.
It therefore allows in-process recovery to be applied to several processes. For back out, each of them can be reset to the initial state which was kept at process initiation. Figure 15 illustrates the commitment discipline. Process P1 precommits data to the resource R and sets its state to 'OPEN'. The data can be read but not updated until the end of P21. Before P2 is permitted to update R and/or terminate its execution it must wait for the commitment of R by P1.

The commitment discipline cannot be applied to serial processes. To guarantee the existence of a process that can perform the recovery, the system should be designed as a hierarchy of processes. In cases where this hierarchy cannot be predefined, measures for post-process recovery have to be introduced along the lines described in reference [6]. Figure 16 shows an idealized system meeting the above design constraints. The system is structured in three processes P1, P2 and P3. Each process Pi consists of subprocesses Pij. Recovery situations affecting only parallel processes such as P22 and P23 or P32 and P33 can be handled by means of the commitment discipline. In any other situation the process detecting the error has to escalate it to the next level in the process hierarchy. To achieve this, on-units providing for recovery must also be ordered hierarchically. Figure 17 indicates a way in which errors can be escalated to higher level on-units for recovery.
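The state transitions of precommitted data can be sketched as follows. This is an illustrative Python model, not the mechanism of the paper: the class and method names (PrecommittedData, CommitmentError, receiver_may_terminate) are hypothetical, processes are identified by plain strings, and refusing receiver updates in the 'RECALLED' state is an assumption the text does not spell out.

```python
class CommitmentError(Exception):
    """Raised when an operation violates the commitment discipline."""


class PrecommittedData:
    """Precommitted value guarded by a (hypothetical) resource manager."""

    def __init__(self, sender, value):
        self.sender = sender
        self.value = value
        self.state = "OPEN"  # initial state set by the sending process

    def _change_state(self, process, new_state):
        # Only the sending process may leave the state 'OPEN'.
        if process != self.sender:
            raise CommitmentError("only the sending process may change the state")
        if self.state != "OPEN":
            raise CommitmentError(f"state is already {self.state}")
        self.state = new_state

    def commit(self, process):
        self._change_state(process, "COMMITTED")

    def recall(self, process):
        self._change_state(process, "RECALLED")

    def read(self):
        # Receivers may read the data while it is still 'OPEN'.
        return self.value

    def update(self, process, value):
        # Receivers must wait for commitment before changing the data.
        if process != self.sender and self.state != "COMMITTED":
            raise CommitmentError("receivers must wait for commitment")
        self.value = value

    def receiver_may_terminate(self):
        # A receiving process may not terminate before 'COMMITTED' is entered.
        return self.state == "COMMITTED"
```

In the situation of figure 15, P1 would create the object, P2 could call read() immediately, and P2's update() and termination would be held back until P1 calls commit().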
6. DISCUSSION
The concepts of reliability described in the preceding sections require a system partitioned into modules. Following Parnas [7], a system is considered well structured if the interfaces between modules contain little information, where interfaces are the assumptions modules make about each other. To minimize the information being transferred, interfaces must be raised to a higher level of abstraction. In this context the features discussed in the paper offer tools to enforce abstractions. As abstractions represent design decisions independent of the program flow, the features can only partially be provided automatically by a compiler. A compiler handles one external procedure at a time, whereas module interfaces comprise more than one external procedure. Also, not every external procedure declaration constitutes an abstract interface, and an abstract interface may contain assumptions which cannot be expressed in terms of parameters.

The enforcement of interfaces requires additional effort at execution time. Sometimes performance reasons are put forward to reject an approach of this kind. There are at least two reasons which show that the argument is not stringent. PASCAL [8] has demonstrated that dynamic range checking can be implemented efficiently. Ranges, as proposed here, are more complex since they can be defined by any relation. On the other hand, they need not be checked at every reference but just at the interface. This leads to the second reason: it is the designer's responsibility to minimize the information passed across interfaces.
7. SUMMARY

The preceding sections presented an attempt to handle errors systematically. Error isolation and recovery were treated under one aspect. Error isolation led to the realization of the partial function concept and the provision of resource managers. Error recovery in addition necessitated the introduction of a commitment discipline in conjunction with a hierarchical structure of processes.
REFERENCES

1. J.J. Horning and B. Randell: Process Structuring, Computing Surveys, Vol. 5, No 1, March 1973
2. N. Wirth: The Programming Language Pascal, Acta Informatica 1, 35-63 (1971)
3. N. Habermann: Critical Comments on the Programming Language Pascal, Acta Informatica 3, 47-57 (1973)
4. P. Brinch Hansen: Concurrent Programming Concepts, Computing Surveys, Vol. 5, No 4, December 1973
5. Ch. T. Davies, Jr.: Recovery Semantics for a DB/DC System, 1973 Proceedings of the ACM
6. L.A. Bjork: Recovery Scenario for a DB/DC System, 1973 Proceedings of the ACM
7. D. L. Parnas: Software Engineering or Methods for the Multi-Person Construction of Multi-Version Programs, published in these proceedings
8. N. Wirth: The Design of a PASCAL Compiler, Software, Vol 1, 309-333 (1971)
Fig. 1. Precedence Relation between Processes
Fig. 2. Mapping of Input States to Output States defined by a Process
Fig. 3. Distinction between Functional and Exceptional Actions of a Process
Use of Ranges and Error Descriptions

Example 1:
   1 X (| X1 <10,X2 |)
     2 X1 INTEGER (| 0 ... 99 |)
     2 X2 INTEGER (| 0 ... 10 |)
   BLOCK ARRAY (10) CHAR (4) (| 'READ', 'WRTE', 'WAIT' |)

Example 2:
   P: PROC;
      DCL (X, Y) INTEGER (| 0 ... 10 |) EXTERNAL,
          1 E ... ;
      ...
      ON CONDITION (INPUT) BEGIN; ... ; E = ON_ERROR_DESCR; ... ; END;
      ...
      IF ¬ (RANGE (X) ∧ RANGE (Y) ∧ X '~ Y) THEN
         SIGNAL CONDITION (INPUT);
      ...
   END;

Fig. 4
Fig. 5. Shared Resource used by Sequential Processes
Fig. 6. Shared Resource used by Parallel Processes
Fig. 9. Unique Correspondence between State Changes of Resources and Sequential Processes
Fig. 10. Backtracking to locate Error (diagram labels: error origin; error detected by input check)
Fig. 11. Invariant Resource Constraints
Fig. 12. Scope of Recovery for Sequential Processes
Fig. 14. State Diagram for Precommitted Data
Fig. 15. Use of Precommitted Data
Fig. 16. Hierarchy of Processes
Escalation to Higher Level On-Units

   P1: PROC;
      P11: PROC; ... END P11;
      P12: PROC;
         ON CONDITION (INPUT12) BEGIN;
            E = ON_ERROR_DESCR;
            SIGNAL CONDITION (ESCALATE);
         END;
         IF input-error THEN SIGNAL CONDITION (INPUT12);
      END P12;
      ON CONDITION (ESCALATE) BEGIN; ... ; END;
      CALL P11;
      CALL P12;
   END P1;

Fig. 17
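The escalation scheme of Fig. 17 can be mimicked with ordinary exception handling. The following Python sketch is only an analogy under stated assumptions: exceptions stand in for PL/I conditions, the except clauses stand in for on-units, and the range 0 to 10, the trace list and all function names are hypothetical.

```python
class InputError(Exception):
    """Stand-in for CONDITION (INPUT12)."""


class Escalate(Exception):
    """Stand-in for CONDITION (ESCALATE); carries the error description."""


def p11(trace):
    trace.append("P11 ran")


def p12(value, trace):
    try:
        if not 0 <= value <= 10:  # assumed admissible range
            raise InputError(f"value {value} out of range")
        trace.append("P12 ran")
        return value
    except InputError as err:
        # Lower level on-unit: record the description, then escalate.
        trace.append(f"P12 on-unit: {err}")
        raise Escalate(str(err)) from err


def p1(value):
    trace = []
    try:
        p11(trace)
        p12(value, trace)
    except Escalate as esc:
        # Higher level on-unit: recovery is decided at the level of P1.
        trace.append(f"P1 on-unit: recovering from '{esc}'")
    return trace
```

Calling p1(5) leaves both on-units untouched, while p1(99) passes through the on-unit of P12 and then through that of P1, mirroring the ordering of on-units along the process hierarchy.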
ERROR ANALYSIS AND CAUSES OF ERRORS IN SYSTEM PROGRAMS

A. Endres

Abstract

The program errors uncovered during an internal test of the operating system DOS/VS form the basis for an investigation of error distributions in system programs. From a classification of the errors according to several points of view, conclusions are drawn about the possible causes of these errors. The insights gained are applied to a discussion of the most important methods for the prevention or detection of errors.
1. INTRODUCTION

For all those who set themselves the task of increasing the reliability of software products, it should be useful to know as much as possible about the kind of errors that are actually made in programs. Almost everyone who has once written a program that did not do at the first attempt what it was supposed to do - and that is, as is well known, the normal case - has developed a personal theory about what went wrong in that particular case and why. As a consequence, one has changed one's programming style the next time, i.e. one has avoided the tricks that brought no success, or devoted an increased measure of attention to the typical danger spots.

One would naturally like to generalize this learning process, which each individual programmer goes through individually. That is, one would like to see it take place in a larger group of programmers, if not in the entire profession. What this requires is that one concerns oneself with the errors that are made by a larger number of people across a larger class of programs.
The investigations in this direction published so far show, in my view, some considerable shortcomings. The work of Moulton and Muller [1] may be mentioned as an example. That investigation dealt with FORTRAN programs in a university environment (University of Michigan). A considerable number of programs (about 5,000) was analyzed, but the average program size was only 38 statements. What characterizes this kind of investigation even more, however, is the fact that the analysis of error types is restricted to a purely syntactic classification. The same holds for the evaluation by Rubey [2], which he made in a comparison of FORTRAN, COBOL, JOVIAL and PL/I. The conclusion one might reach on the basis of these investigations is that in FORTRAN one should beware of assignment and I/O statements.

That the problems are normally not to be sought in the syntax of a language is shown, for example, by the investigation of Boies and Gould [3]. There one already finds the conclusion that the share of syntactic errors lies clearly below 15%. The attempt to include in the consideration the whole complex from problem definition to coding in a given programming language is very well presented by Henderson and Snowdon [4]. Although only the history of a single error is described there, so to speak (albeit in a program previously proven correct), this kind of investigation is probably the most promising.

The following contribution is based on an investigation of errors in system programs. The special feature of system programs is that, compared to application programs, they exhibit a particularly high degree of parameterization, a broad spectrum of users and a long lifetime. The consequence is not only that particularly high quality demands are placed on them, but also that their structure often turns out to be particularly complex.

The aim of this contribution is both to clarify which well-founded statements can be made at all on the basis of such an error analysis, and to present the conclusions that can be derived from the specific data of this evaluation.
2. SUBJECT AND METHOD OF THE INVESTIGATION

The subject of the investigation was the errors discovered during internal tests in the components of the operating system DOS/VS (Rel. 28) developed at the IBM laboratory in Böblingen. For a better understanding, the following information about this project must be given first.

DOS is an operating system whose first version came onto the market in about 1966. Extensions or improvements of the system were released at intervals of, at first, three months and later six months. The extent of change in the individual versions (so-called releases) varied greatly. The version under discussion here contained probably the most far-reaching changes ever made to this system. The extensions developed for Release 28 at the Böblingen laboratory consisted essentially of the following subprojects, all of which related to the control program in the narrower sense:

a) support of the virtual storage concept,
b) extension of the system from 3 to 5 program areas (partitions), including variable priority assignment,
c) support of new card input and card output devices,
d) support of an optical display device as operator console,
e) several smaller extensions (cataloged procedures, timer per partition, adaptation for VSAM),
f) adaptation of the spooling subsystem "POWER" to the system changes listed above.

In parallel with the extensions mentioned, other additions to the control program were developed, above all at the Dutch laboratory, and new assemblers, compilers and data access methods were developed in several laboratories, among others in the USA.

The code representing the system is physically organized into macros and modules. Macros are those routines that are contained in the shipped system in assembler macro format; modules are present in translated form. (Since this distinction is not essential in what follows, the term module is used whenever module or macro is meant.)
About 500 modules were "touched" by the activities at the Böblingen laboratory. The average size of these modules was about 360 instructions per module if only the executable code is considered, and 480 instructions per module if comments are counted as well. Seen globally, the project had the following effect on the system:

                                        Modules    Instructions
                                                   old      new
a) written entirely new                   169       --      53K
b) containing old and new code            253       97K     33K
c) only comments changed                  100        7K      --
total                                     522      104K     86K
The 190K instructions quoted represent executable code. In addition there are about 60,000 lines of comments and the like. All modules and macros are written in the DOS macro assembler.

As Table 2-1 shows, the size of the individual modules varies greatly. The relative extent of the change per module is likewise very different. Table 2-2 shows this for the 253 modules that contain both old and new code. From both tables together one can conclude that the most typical project activity consisted of changing or adding a total of about 50 instructions in an existing module of about 200 instructions. What has been said so far should have made it clear that, in this situation, many of the methods praised for producing error-free programs (top-down design, structured programming, etc.) could hardly be applied.
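The module and instruction counts quoted above can be cross-checked with a few lines of arithmetic. The Python sketch below only re-adds the figures given in the text (169, 253 and 100 modules; 97K and 7K old instructions; 53K and 33K new instructions); the dictionary layout and variable names are illustrative.

```python
# Figures from the text: (modules, old instructions in K, new instructions in K).
categories = {
    "written entirely new": (169, 0, 53),
    "old and new code": (253, 97, 33),
    "only comments changed": (100, 7, 0),
}

modules = sum(m for m, _, _ in categories.values())
old_k = sum(old for _, old, _ in categories.values())
new_k = sum(new for _, _, new in categories.values())
total_k = old_k + new_k

print(modules, old_k, new_k, total_k)  # prints: 522 104 86 190
```

The sums reproduce the totals of 522 modules, 104K old and 86K new instructions, and the 190K instructions of executable code mentioned in the text.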
The material for the investigation consisted of the records of those errors that were found in the modules specified above during a formal test period of 5 months. The test phase in question covered only one section, albeit the most critical one, of the entire test process of the system. It had been preceded by the tests carried out by the group of programmers responsible for a module or a subproject. Each subproject had been tested decentrally to the point where it was "ready for integration", i.e. the new functions had been verified as completely as was possible without all directly interdependent components being at the same level of development. Conflicts possibly introduced by the subsequent integration process had in turn been resolved, so that the central test in question started on a "runnable" system.

The aim now was to test the finished system, so to speak, with all its components under as many functional demands and in as many different configurations as possible. For this purpose two kinds of test cases were used by two groups. One group, belonging to the development area, reran test cases that had already been run on an earlier version of the system in this heavily changed system composition (regression test). A second, independent group had developed new test cases, starting from the external description of the system, which were intended to simulate an acceptance of the system from the customer's perspective.

Before delivery of the system to the customers, further tests followed, for example a performance investigation, special tests for teleprocessing, and a test in the computing center of selected customers (beta test).
The material examined here therefore cannot give a picture of all the kinds of errors that are discovered in the course of a project, but only of a section of them. Typical errors such as tend to occur in the early stages of a project (completely missing routines), with beginners (syntax errors), or after a hectic phase of changes are likely to recede somewhat into the background. The same holds for errors that are normally found by methods other than the execution of test runs.

The irregularities of the system discovered (or suspected) by the two test groups were documented according to their external appearance, i.e. according to their effect on a particular test case. We call this information the problem. It was handed over to the original development group, which carried out an analysis and wrote an answer (on the same form, referred to below as the error report) (see Figure 2-3).
The answer first classified the problem into the following 6 groups:

a) machine error
b) operator or user error
c) suggestion for improvement
d) duplicate (of an already known program error)
e) documentation error
f) (not yet known) program error.
The distribution over the individual groups generally depends on the nature and also on the organization of a project. For example, the share of duplicates is the lower, the faster a correction for an already known problem is made available to the test group.
Although the other groups may well contain relevant information too, we shall concentrate in the following on group (f). These are the errors in the code that were accepted by the development group. In the case at hand, the entire data base comprised about 740 problems in total, of which 432 had been classified as program errors.
It should be noted here that, from the user's perspective, the division given above is not always readily acceptable. For the user, errors in groups (a), (d) and (e) are just as annoying as the actual program errors. We shall not examine them further because they do not originate in the programming activity per se.

For the sake of completeness it should also be noted that all errors mentioned had been corrected before delivery of the system.
3. POSSIBILITIES AND LIMITS OF AN ERROR ANALYSIS

The 432 error reports available for the evaluation contain the following information for each error:

a) Administrative data about the discovery of the problem (system version used, configuration, date of the test run, test case used, name of the tester, etc.).
b) Description of the problem.
c) Administrative data about the correction made (modules changed; date of the change; name of the programmer; system version into which the correction is to be integrated, etc.).
d) Code key for the cause of the error; subproject that caused it.
e) Description of the correction made.

As soon as one has such a comprehensive data base at hand, a whole series of questions come to mind that one would like to ask. The following questions seemed meaningful to me:

a) Where was the error made? What is the distribution of the errors by modules? Are there clusters, i.e. modules that were particularly affected? If so, what do these modules do? How are they structured?

b) When was the error made? Errors can be made in every phase of the development cycle, beginning with the definition of the external objectives of the project, during the detailed planning of the logical structure, during the original coding phase, during the correction of an error, etc.

c) Who made the error? This can refer to the responsibility of individual groups during the project cycle (design, implementation), but also to individual subprojects or even to individual programmers.

d) What was done wrong? Which programming subtask was not solved or was solved incorrectly? The resulting classification by error types can then form the basis for the following further questions:

e) Why was this particular error made? What caused the error? Closely connected with this (as will be shown later) is the question:

f) What could have been done to prevent the error? And finally:

g) If the error cannot be prevented, by what procedure can it be detected?
Of course this catalogue of questions can be extended and supplemented further. A possibly relevant question would be: Which kind of test case uncovered which kind of error? Any combination of the questions given above can also be imagined to be of interest, for example (b) and (d) together, which would then mean: When are which errors made?

It will be conceded that the catalogue of questions as it stands is already quite comprehensive. If we were able to find complete and valid answers to these questions, we would be relieved of many of our worries for the future.

The following remarks are intended to make clear which difficulties can arise in such an undertaking, and which restrictions we must therefore accept.

First there is the question of how one can determine at all what the error consisted of. To say it right away: in the evaluation described, the actual error was as a rule equated with the correction made. That this is not always quite right stems from the fact that sometimes the real error lies too deep, and the time required or the risk of introducing new errors is too great, to solve the real problem. The correction that was made may have removed only a consequence of the error or routed around the problem. To achieve correspondingly greater exactness in the evaluation, one would really have to compare the originally intended implementation with the one actually carried out, instead of looking at the corrections. In such cases, however, this is feasible neither in terms of effort nor in substance, since the originally intended implementation is no longer recognizable or may no longer be applicable.

From the same consideration it follows that the module in which the correction was applied need not be the module that caused the error. Often the change is simply made where space happens to be free.

It seems to me typical of error investigations on an operating system that the problem description very rarely has direct relevance for the kind and cause of the error. Apart from cases where, for example, a library maintenance program produces an output listing and the error can be recognized directly from it, the effect of an error in an operating system is normally much more indirect. Typical problem descriptions are:

- The system hangs in a loop.
- Device X could not be started, file Y could not be read.
- The system stops with an invalid operation code, or an invalid storage, disk or device address.
- Card reader K runs at only half speed, etc.
It is probably also not unambiguous what one regards as the unit when counting errors. As a consequence of one problem it may happen that one changes one or 20 constants, inserts one or 20 instructions, at one or at 5 places in a module, in one or in 5 modules, etc. Nor can one rule out that, together with one problem, other problems were solved at the same time which the programmer came across when going through the program again (or which he had secretly known about for a long time). In this evaluation the number of errors was set equal to the number of problems, where problems based on the same error (duplicates) had already been subtracted beforehand.

Taking into account the restrictions just explained, questions (a), (b), (c) and (d) can be answered with some reliability from the available material. For some questions, however, one must draw on additional information that does not follow directly from the error reports. If, for example, one wants to trace the question of who caused an error down to an individual programmer, one must draw on the information held in the system library about the development history of each module. It should be mentioned here that some questions contain a certain explosive potential in terms of personnel policy, and that one can therefore approach them - if at all - only in a very detached and objectively fair manner.

While in the following we leave question complexes (b) and (c) aside entirely and treat (a) only briefly, question complex (d) presumably offers the most interesting information. The categorization by error types developed subsequently serves as the basis for further considerations regarding questions (e), (f) and (g).
4. PRESENTATION OF SOME RESULTS AND CONCLUSIONS

The following sections present a selection of the information that resulted from the investigation mentioned. It should be pointed out that the tabular compilations often contain a degree of detail that cannot be fully addressed in the discussion immediately following. I nevertheless considered it useful to include this information, so that one can refer back to it if the summaries should not suffice for understanding.
4.1 Distribution of errors by module
This information is summarized in three tables. Table 4-1 shows the effect of an error in terms of the number of modules that had to be changed. That more than 85% of the errors could be corrected by changing a single module each is somewhat surprising. It contradicts the notion of the interdependence of the modules in an operating system and the so commonly observed error-proneness of interfaces between different modules.
Table 4-2 shows the reverse breakdown, namely the number of errors found per module. Errors that affected several modules, as shown in Table 4-1, were counted several times. The three front-runners with 28, 19 and 15 errors came as no surprise; they were three of the largest modules of the system (all more than 3,000 instructions). Fourth place in this ranking is held by a relatively small module that is, however, known to be very unstable. In general it should be noted that of the 422 changed or newly written modules only 202 contained errors (48%).
If one additionally sets aside the modules with one error each, it turns out that 78% of the errors (400) fall on 17% of the modules (90).
Table 4-3 contains various comparisons of error frequency. Each time a distinction is made between the modules that contain only new code and the modules that contain both old and new code. What stands out is the following: the proportion of modules with errors is the same (48%) for modules with new code only, for modules with mixed code, and for modules overall; the error-proneness is therefore the same. In the number of errors per module, the modules with only new code at first seem to fare worse than the modules with old code. The relation is reversed again, however, when one looks at the amount of newly written code. I doubt that these figures have very much relevance. They do seem to confirm, however, the feeling widespread among programmers that beyond a certain point it is better to rewrite a program than to try to save as much old code as possible.
4.2 Distribution by error type
The information presented in this section is the real core of the present evaluation. The figures of this section resulted from a subsequent analysis of all program corrections described. In order to make meaningful statements and to be able to present the material at all, the available data had to be classified and abstracted. This has the side benefit that the data also become understandable to someone who has no prior knowledge of this particular operating system. The disadvantage is, of course, a loss of precision and conciseness.
Because of the volume of material, the presentation is structured in two levels. Tables 6-1, 6-5 and 6-10 give a top-level classification, referred to here as Group A, Group B and Group C. The tables in between contain additional detail on the data in Groups A and B. For Group A a further explanation is given for only 3 of 6 subgroups, for Group B for 4 of 7. All figures are percentages, and all refer to the same base of 432 errors.
The division into the main groups A, B and C was made after the fact and can be motivated as follows:
a) The errors in Group A are problem-specific. They are errors in the understanding of the problem and in the choice of a suitable solution algorithm. One can also say that in many cases the wrong problem was solved, or that the chosen algorithm was not adequate or not sufficient for the given problem. The subgroupings in Group A are characteristic of problems in an operating system. In a different project, e.g. a compiler, other groupings would likely appear here.
b) The errors in Group B are method-specific. Here the errors lie in the more or less complete and correct implementation of an algorithm, the translation of a given algorithm into a programming language, and the like. With different kinds of problems the same error types would likely appear here, provided the same methods and tools are used for programming. If other methods are used, e.g. higher-level languages or prefabricated utility routines, other groupings would likely appear here.
c) Group C, finally, does not consist of programming errors in the narrower sense. They are indeed errors in the code that cannot be left in. Once they have been recognized, however, they can be corrected, at least in part, by people who are not programmers or who have no detailed knowledge of this project.
What do these numbers tell us? First, the overall distribution. Almost half (46%) of all errors lie in the area of problem understanding, problem communication, and knowledge of possible solutions and solution methods. The other half (38%) are matters where we can expect different, i.e. better, results with other methods. This division into two nearly equal halves also seems to be confirmed by other studies of the same kind. This fact is alarming or encouraging, depending on how high the expectations were regarding one-hundred-percent automation of software production. More precisely, only half of the errors can be brought under control with programming means (better programming languages, more comprehensive test aids). The other half must be tackled with better methods of problem definition (specification languages), a better understanding of the fundamental system concepts (training, education), and by making relevant algorithms available.
Now to the individual groups. What the figures for Group A express can be outlined as follows: the tasks in an operating system are extraordinarily diffuse. The dependence on machine architecture and configuration is great. The functional requirements of the system cannot be formulated very precisely. Things are frequently changed once their effect on the system has been seen.
The dynamic behavior of the system is the main problem. The parallelism of processes and events places great demands on the imagination. The tools available here are also known to be poor.
For Group B it is striking that typical problems of assembler programming play such a strong role. Other groupings, such as the problem of initialization, are well-known and universal phenomena, so that not their appearance on this list but only their percentage frequency is likely to be of interest. Overall, the wording chosen for this group gives somewhat the impression that trivial mix-ups and careless mistakes play a very large role. For subgroups B2 and B4 this impression is certainly justified; for subgroups B1 and B3 a much more complex and deeper-lying error is very often hidden behind the somewhat trivial-sounding classification.
Group C, finally, illustrates that there is no way around handling even the technically rather unattractive tasks connected with building large systems with pedantic care.
5. CAUSE AND PREVENTION OF ERRORS
The question of the cause of a program error, the question of why, has many layers. Since programming must be regarded as a human activity, one really cannot avoid casting a very wide net. Taking a reasonably comprehensive view, the following causes of error can be distinguished:
a) technological (representability of the problem, possible solutions, available methods and tools),
b) organizational (division of labor, available information, communication, resources employed),
c) historical (prior history of the project, of the program; situations and external influences),
d) group-dynamic (willingness to cooperate, distribution of roles within the project group),
e) individual (experience, aptitude and condition of the individual programmers),
g)
~!~_~c~!~.
It cannot be denied that the causes of human failures, and that is what program errors are, lie to a considerable extent in the psychological domain (that is, in the lower part of the scale given above). G. Weinberg [5], for example, has supplied many arguments for such a view in his well-known book.
In what follows I would like to take a strongly restricted view. By cause of an error I mean whatever would have had to be different for the error not to have occurred. Cause of error and prevention of error are then one and the same thing, only with opposite signs. If, in addition, I restrict the remaining discussion to the two upper groups, namely the technological and the organizational causes, then
a) Cause of error: the discrepancy between the difficulty of the task and the adequacy of the means applied, and
b) Prevention of error: all measures suited to reducing this discrepancy.
130 Da# s e ] b s t
bei
noch v i e l e
Freiheitsgrade
liegt die
dieser
a u f d e r Hand. beschriebene
sehr eing~schr~nkten
Dazu kommt,
Diskrepanz
Man kann e n t w e d e r
die
von F e h l e r n
A u f der B a s i s
dieser
jeder
dieser
in
Wie s i c h
6-i
Gruppen d i e
den T a b e l l e n pen der
Oberlegungen
7-I
die bis
um den b e t r e f f e n d e n
dies
fur
gefUhrt die
Fehler
deuten die
eines
diese
Betrachtungsweise
spricht
betonen
~ ist
ihr
konstruktiver
st~ndigen
und w i s s e n s c h a f t l i c h
erhalten;
es s i n d
man u.
jedoch
Vorschl~ge
a. b e i
Kosy
die
[6],
absoluten Dinge,
zur
VerhUtung
Elspas
[7]
Beispiel: in
der G e r ~ t e -
der HardwareverhUten,
i n dem
- um das d e u t l i c h
Es s i n d
beeinfluBt
so s p e z i -
verbessert.
aus d i e s e n O b e r l e g u n g e n e r g i b t ,
Mitteln
Als
Fehlers
Fehlertyp
der H a r d w a r e - D o k u m e n t e
und V e r b e s s e r u n g e n .
Grup-
gemacht.
der mangelnden K l a r h e i t
ausgesprochen
Mit
an, was g e t a n werden
zu v e r h U t e n .
dab d i e Ursache
GrUnde
sieben wichtigsten
Was f u r
motivierte
sind,
haben k~nnen.
Fehlerarten
so kann man d i e s e n
organisatorischen
Aus b e i d e n geeignet
und o r g a n i s a t o r i s c h e n
Klarheit
Ver~nderungen
ge-
uns noch e i n m a l
man d i e
was s i c h
um e i n e
ansehen und v e r s u c h e n ,
gleichzeitig
(Gruppe AL) i n lag,
kSnnen w i r 6-10)
Gesagten e r g i b t ,
kBnnte,
Dokumentation
ist. die
aufgetretenen
"Feh]erfaktoren"
behandlung
bis
ist
fizierten
Wenn man a k z e p t i e r t ,
angepa~t
technischen
7-7
aus dem v o r h e r
Situation
werden kann.
vermehren,
Forderungen,
zu diesem F e h ] e r
unserem F a l l
bestimmten
reduziert
zu v e r m e i d e n .
(Tabellen
gegenUberstellen,
einer
besser
resultieren
das A u f t r e t e n
Fehlerarten
vorhanden sind,
o d e r aber d i e Aufgabe so m o d i f i z i e r e n ,
den v o r h a n d e n e n M i t t e l n
Betrachtungsrichtungen
die
dab i n
a u f zwei A r t e n
aufgewandten Mittel
gebene A u f g a b e zu l ~ s e n , dab s i e
Betrachtungsweise
und U n s i c h e r h e i t e n
zu
Charakter.
Alles
sind Ansatzpunkte nicht
immer d i e
Erkenntnisse,
die mit
technischen
werden k~nnen. [8].
die wir und
~hnlich
von P r o g r a m m f e h l e r n und Lowry
fur voll-
findet
The error factors indicated in Tables 7-1 to 7-6 merely name the kind of technical or organizational measures that influence the error type in question. The lists make no claim to completeness. What this form of presentation does make clear is the fact that evidently different causes, and therefore different preventive measures, come into play for each error type. Put another way, there is no single cure-all. If one further considers the layers of possible causes of error that were left aside here, a very sobering overall picture emerges: the measures that must be taken to prevent errors have to be just as varied as the error types and just as many-layered as the causes of error. Any catalogue of such measures that we would be able to draw up remains fragmentary by its very nature. What this kind of study can contribute, however, is concreteness and a weighting.
6. THE DETECTION OF ERRORS
If one concludes that the possibilities for error prevention can only ever be piecemeal, the question arises whether anything can be derived from the preceding considerations about the possibility of detecting errors. It does indeed seem useful to set the classification into error types obtained in Section 4.2 against the methods applied today to detect errors in programs. In line with the division of the error types into Groups A and B (we will pass over Group C here), different methods come into consideration; at the least, the emphasis among the methods differs.
For Group A the following methods, as applied in practice, appear to be the most important:
a) Careful review of the functional (external) and the logical (internal) specifications by uninvolved third parties. As a rule these are people other than the staff directly involved in the project. In the case of an operating system these could be hardware developers, product planners, sales specialists, application programmers, and others.
b) Supplementary or overlapping description of the system by formal methods (e.g. VDL, Petri nets, Markov models).
c) Use of analytical or simulation models (which are normally built for performance analysis) to illustrate or understand the functional behavior of the system.
d) Inspection of the written program text by third parties. Candidates are other experienced programmers of the project who know the problem but not the chosen solution.
e) Test runs carried out by project staff or by independent third parties.
For Group B the typical catalogue of measures looks somewhat different. Here the emphasis clearly belongs on the methods of program verification in the narrower sense. These include:
a) Program inspection by third parties (see d) above).
b) Proving programs by the method of Floyd or Hoare.
c) Hand simulation of test runs, i.e. working through examples with pencil and paper.
d) Test runs carried out by the author of the program.
e) Test runs carried out by independent third parties.
It would certainly be presumptuous to try to derive exact figures on the relative effectiveness of these methods and measures from the data at hand. Based on experience and subjective judgment, one can at best assign a weighting to the individual methods. Even though a statement obtained in this way is speculative in character, it may still be suited to shedding light on the problems involved. In other words, it is not expected that great accuracy be attributed to the individual values; the hope is rather that they give some indication of how the relative effectiveness of the measures mentioned can be measured and expressed at all.
The presentation chosen for Figure 8-1 and Figure 8-2 shows, for each method, the estimated probability in percent of how well this method appears suited to detecting the error type in question. The statements are not absolute but only relative, i.e. nothing is said about whether the error can be found at all, but only, if it is found, which method is likely to bring this about. The row sum of the probabilities for each error type is therefore 100%. The question of the absolute probability of success would of course be of even greater interest. Since the degree of speculation would be considerably greater in that case, we will refrain from it.
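The relative weighting described above, in which the estimates for one error type add up to 100%, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_effectiveness` and the weights are hypothetical and are not the values of Figure 8-1 or Figure 8-2; only the method names follow the list for Group B.

```python
def relative_effectiveness(weights):
    """Scale subjective weights so that they sum to 100 percent."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {method: 100.0 * w / total for method, w in weights.items()}


# Hypothetical subjective weights for a single error type (illustration only).
weights_for_one_error_type = {
    "inspection by third parties": 3,
    "proof (Floyd/Hoare)": 1,
    "hand simulation": 2,
    "test runs by author": 3,
    "test runs by third parties": 1,
}

row = relative_effectiveness(weights_for_one_error_type)
for method, percent in row.items():
    print(f"{method:30s} {percent:5.1f}%")
```

Because each row is normalized on its own, such a table says nothing about how likely an error is to be found at all, which matches the restriction stated in the text.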
7. CONCLUDING REMARKS
The preceding discussion has shown, I hope, that the analysis of program errors is a difficult but thoroughly necessary and useful process. In this way we can arrive at statements that help us find countermeasures against certain recurring error types. The present study dealt with a very specific group of system programs. There is therefore still a large number of questions to which no answers could be given on the basis of the material at hand. These questions include:
a) What is the effect of using higher-level programming languages in systems programming? Which error types occur less often?
b) What general differences are there between control programs and compilers?
Some of the conclusions drawn from the data at hand are certainly anything but compelling. Perhaps precisely that will prompt one colleague or another to think further in the direction indicated and to supplement, justify or refute my proposals.
References
[1] Moulton, P. G. and Muller, M. E.: DITRAN - A compiler emphasizing diagnostics, CACM 10, 1 (Jan. 1967)
[2] Rubey, R. et al.: Comparative evaluation of PL/I, ESD-TR-68-150 (Apr. 1968)
[3] Boies, S. J. and Gould, J. D.: A behavioral analysis of programming - on the frequency of syntactical errors. IBM Research, Yorktown Heights, N.Y., RC-3907 (June 1972)
[4] Henderson, P. and Snowdon, R.: An experiment in structured programming, BIT 12 (1972), pp 38 - 53
[5] Weinberg, G.: The psychology of computer programming, Van Nostrand Reinhold, New York, 1971
[6] Kosy, D. W.: Approaches to improved program validation through programming language design, in W. G. Hetzel (ed): Program Test Methods, Prentice Hall, Englewood Cliffs, N. J., 1973
[7] Elspas, B. et al.: Software reliability, IEEE Computer 4,1 (January/February 1971)
[8] Lowry, E.: Proposed language extensions to aid coding and analysis of large programs, IBM Systems Development Division, Poughkeepsie, N.Y., TR 00.1934 (1969)
Instr./~lodu
nut
alter
nur
insge-
alter
+ neuer
neuer
saint
Code
Code
Code
O-
99
84
27
52
163
100-
199
5
63
38
106
200-
299
5
43
3O
78
300-
399
4
27
19
50
400-
499
23
6
29
500-
749
22
31
750-
999
12
16
1000-1249
!
1250-1499
13
14
5
11
1500-1999 2000-2499
5
z ModulnI I
Tabel!e
2-1:
3,5%
2
2500-2999 3000-4200
14,0%
11 6
I
81,5%
-
4
I
5
1,0%
I00
253
169
522
100,0%
Verteilung
der M o d u l g r 6 ~ e n
und M o d u l a r t e n
Tabelle
2-2:
>1000
999
99
10 -
-
9
I00
Verteilung
neon
r~Be
1 -
ge~nderte Instruktio
der
26
nach
~nderungsumfang
3
4
12 9
4
400
!-
Moduln
'
300
1250
[Diagram: test group, integration group and development group, connected by test case, problem and answer (yes/no).]
Figure 2-3: Information flow during a program test
Number of errors    Number of modules affected
      371                     1
       50                     2
        6                     3
        3                     4
        1                     5
        1                     8
      ---
      432
Table 4-1: Number of modules affected by an error
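As a quick cross-check of Table 4-1, the share of errors that were confined to a single module can be recomputed from the counts. The following is a minimal Python sketch; the dictionary merely restates the table, and the variable names are illustrative.

```python
# Table 4-1 restated: modules affected by an error -> number of such errors
errors_by_modules_affected = {1: 371, 2: 50, 3: 6, 4: 3, 5: 1, 8: 1}

total_errors = sum(errors_by_modules_affected.values())
single_module_share = errors_by_modules_affected[1] / total_errors

print(total_errors)                  # 432, the common base of the study
print(f"{single_module_share:.1%}")  # share of errors fixed in one module
```

The result is a little under 86%, in line with the statement that more than 85% of the errors were corrected by changing one module each.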
Number of modules    Number of errors per module
      112                     1
       36                     2
       15                     3
       11                     4
        8                     5
        2                     6
        4                     7
        5                     8
        3                     9
        2                    10
        1                    14
        1                    15
        1                    19
        1                    28
      ---
      202 modules            512 errors in total
Table 4-2: Number of errors per module
Module type          Modules     Modules        Percent of modules
                     in total    with errors    with errors
new code only          169          81               48
old and new code       253         121               48
total                  422         202               48

Module type          Modules     Errors         Errors per module
new code only          169         254               1.5
old and new code       253         258               1.0
total                  422         512               1.2

Module type          Amount of   Errors         Errors per
                     new code                   1K of code
new code only          53K         254               4.8
old and new code       33K         258               7.8
total                  86K         512               6.0

Table 4-3: Error frequency by module type
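The two derived columns of Table 4-3 (errors per module and errors per 1K of new code) follow directly from the counts. The sketch below recomputes them in Python. It rests on one assumption that is labeled in the code: the amount of new code for modules with new code only is taken as 53K, i.e. 86K minus 33K, since that cell is derived rather than quoted. The names `table_4_3` and `rates` are illustrative.

```python
# Table 4-3 restated: module type -> (modules, errors, new code in K)
# Assumption: 53K for "new code only" is derived as 86K - 33K.
table_4_3 = {
    "new code only": (169, 254, 53),
    "old and new code": (253, 258, 33),
    "total": (422, 512, 86),
}


def rates(modules, errors, new_code_k):
    """Return (errors per module, errors per 1K of new code), one decimal."""
    return round(errors / modules, 1), round(errors / new_code_k, 1)


for module_type, row in table_4_3.items():
    per_module, per_k = rates(*row)
    print(f"{module_type:18s} {per_module:4.1f} per module  {per_k:4.1f} per 1K")
```

The recomputed values show the reversal discussed in the text: modules with new code only have more errors per module but fewer errors per 1K of new code.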
Group A                                                        Percent
1. Machine configuration and architecture                         10
2. Dynamic behavior and communication between processes           17
3. Functions offered                                              12
4. Output listings and formats                                     3
5. Diagnostics                                                     3
6. Performance                                                     1
                                                                  46%
Table 6-1: Error types, Group A
143 A1
Maschinenkonfiguration
(a
Ger~tetyp oder - z u s a t z n i c h t b e r U c k s i c h t i g t ; g U l t i g e r E/A Befehl als ungUltig angesehen
(b
und A r c h i t e k t u r
1,5
E/A F e h l e r s i t u a t i o n oder Ger~testatus f a l s c h oder g a r n i c h t behandelt (vor allem bei dezentralen
E/A Routinen)
E/A Befehl falsch b e n u t z t , u n v o l l s t ~ n d i g oder f a l s c h s i m u l i e r t (bei Abbildung eines Ger~tes auf einem anderen) Fehlerstatistik unn~tigerweise
e)
f u r Ger~t n i c h t oder generiert
Externer Bedienungsmodus eines Ger~tes f a l s c h behandelt
0,5 10
Tabelle 6-2:
Fehlerarten
Gruppe A1
t 44 A2
Dynam_}s_£che..s,....V#rh.a.lten und Kommuni k a t i o n
zwi schen Prozessen
(a)
Dynamisch e i n g e t r e t e n e r Systemzustand n i c h t genau genug a b g e t e s t e t
(b)
Bei s e q u e n t i e l l e m Obergang auf anderen Proze~ ( v o r allem bei erzwungenem Abbruch) Zustand n i c h t a u f g e a r b e i t e t . Nachfolgeproze~ f i n d e t n i c h t e r w a r t e t e Parameter (z. B. R e g i s t e r i n h a l t ) vor
(c)
R e g i s t e r und K o n t r o l l b l ~ c k e ~ die mehrfach benutzt werden, wurden n i c h t g e r e t t e t . Unterbrechung z e r s t ~ r t I n f o r m a t i o n , die noch gebraucht wird
(d)
e)
Unterbrechungen waren zugelassen, die ausgeschlossen werden k~nnen. Andere Unterbrechungen waren ausgeschlossen, wurden i g n o r i e r t , obwohl f u r Systemablauf w i c h t i g Logisch e r f o r d e r l i c h e S c h r i t t e (z. B. Qffnen e i n e r D a t e i ) f e h l t e n . Falsche S e q u e n t i a l i s i e r u n g , falscher
f)
(g)
RUcksprung
B e t r i e b s m i t t e l v e r g a b e f a l s c h ; Verklemmung, Belegung n i c h t vorhandener, b e r e i t s b e l e g t e r Resourcen. Falsche Eigenschaft angenommen
1,5
Bei n i c h t g e n e r i e r t e r oder n i c h t d u r c h g e f U h r t e r Funktion damit zusammenh~ngende Folgefunktionen nicht weggeneriert
oder a b g e s c h a l t e t
0,5 17
Tabelle 6-3:
Fehlerarten
Gruppe A2
145 A3 (a)
Funktionen werden v e r v o l l s t ~ n d i g t , beabsichtigte
(b)
(c)
Arbeitsweise
da f u r
des Systems w e s e n t l i c h
Funktionen werden hinzugefUgt, v e r a l l g e m e i n e r t , obwohl n i c h t u r s p r U n g l i c h b e a b s i c h t i g t
1.5
Funktionen werden v e r v o l l s t ~ n d i g t E x t r e m f ~ l l e n , Ausnahmesituationen
1.5
bezUglich
(d)
Funktionen werden ge~ndert, um B e n u t z e r f r e u n d l i c h k e i t , S i c h e r h e i t , K o m p a t i b i l i t ~ t etc. zu verbessern
(e)
Extern veranlaBte ~nderungen ( z . B .
(f)
Grundannahme f u r weggelassene Angaben ( d e f a u l t s )
Produktstrategie)
2
ge~ndert (g)
(I)
Zus~tzliche eingefUgt
N a c h r i c h t f u r Operateur/Benutzer 1.5
Funktion wird e l i m i n i e r t ,
da n i c h t mehr b e n ~ t i g t
0.5 12
Tabelle 6-4:
Fehlerarten
Gruppe A3
Group B                                                        Percent
1. Initialization (of fields and areas)
2. Addressability (in the assembler sense)
3. References to names
4. Counting and calculating
5. Masks and comparisons
6. Estimation of range limits (for addresses and parameters)
7. Placement of instructions within a module, bad fixes            5
                                                                  38%
Table 6-5: Error types, Group B
147 BI
Initialisieru..n.~
(a)
Kontrollblock, Register, Schalter nicht gel~scht oder zurUckgesetzt bei Obergang yon Routine, ProzeB, Job etc. auf anderen
(b)
E/A Bereich,
Puffer
Benutzung n i c h t (c)
und dergleichen
vor
gel~scht
Felder als "Define ~torage" s t a t t als "Define Constant"
(o. I n i t i a l i s i e r u n g ) (m. I n i t i a l i s i e r u n g
bei Laden des Programms) e r k l ~ r t (d)
(e)
S t a t t ganzen Feldes, ganzer Tabelle T e i l davon gelSscht Initialisierung zum falschen oder mit falschem Wert
nur 0,5
Zeitpunkt 0,5 8
Tabelle 6 - 6 : F e h l e r a r t e n
Gruppe B1
148
B2
Adressierbarkeit
(a)
Zuordnen, Laden, Retten von A d r e ~ r e g i s t e r n vergessen (vor
(b)
Auswirkung von L~ngen~nderungen bei Nachrichten
c)
allem bei Anwachsen des Kodes) Konstanten,
auf F o l g e b e r e i c h e Ubersehen
Kode, B e r e i c h e , Speicherplatz
Nachrichten
Uberlagert,
mehrfach b e n u t z t
Verwechslung a b s o l u t e + v e r s c h i e b l i c h e , und v i r t u e l l e
da
Adressen (vor allem bei
reelle Zugriff
zu u n t e r e n K e r n s p e i c h e r b e r e i c h e n )
e)
Anpassung an Wortgrenzen
f)
ORG, LTORG e i n g e f U g t , Auftei!ung
einer
(Alignment)
fehlerhaft
ge~ndert
0,5
Phase wegen O b e r s c h r e i t e n s
des vorgegebenen S p e i c h e r p l a t z e s
0,5 7
T a b e l l e 6-7:
Fehlerarten
Gruppe B2
149
B3
Bezugnahme auf Namen
(a)
Feld hat andere als angenommene Bedeutung (z.B. P o i n t e r e n t h ~ I t n i c h t Adresse, sondern Adresse von Adresse)
(b)
Bezugnahme auf falsches Register oder falschen Feldnamen ( e v t l . wegen ~ h n l i c h k e i t der AbkUrzung)
(c)
Bezugnahme auf falschen r e l a t i v e n r i c h t i g gefundenem K o n t r o l l b l o c k . falschem Suchargument a d r e s s i e r t
(d)
Verwechslung d i v e r s e r Konstanten ( P a r t i t i o n N r . , SVC Nr.)
E i n t r a g in Tabelle mit 1,5
1,5 7
Tabelle 6-8: F e h l e r a r t e n
Gruppe B3
150
B4
Abz~hlen und Berechnen
(a)
Falsche Berechnung oder Abz~hlung von Feldoder Satzl~ngen, BereichsgrUBen ( n i c h t genannter Betrag der Abweichung)
(b)
wie (a)
(c)
Falsche r e l a t i v e
(d)
Falsches AbprUfen e i n e r Schleifenbedingung ( f r U h z e i t i g e r Abbruch, endlose S c h l e i f e )
(e)
Programmierter Z~hler (yon S~tzen, Z e i l e n e t c . ) liefert
(f)
(m)
1,5
(mit Abweichung um i Byte)
falsche
1
Adresse (Displacement)
Werte
Dezimal nach Hex. Umwandlung f e h l t Bin~r nach Dezimal Umwandlung f a l s c h Berechnung von P l a t t e n a d r e s s e n , Spurenzahl oder S p u r e n k a p a z i t ~ t
Ermittlung falsch
0,5 yon 1 8
Tabelle 6-9:
Fehlerarten
Gruppe B4
Group C                                                        Percent
1. Typing errors in messages and comments
2. Missing comments or flowcharts (standards)
3. Incompatible status of macros or modules (integration errors)
4. Not classifiable                                                2
                                                                  16%
Table 6-10: Error types, Group C
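The totals of the three main groups (Tables 6-1, 6-5 and 6-10) all refer to the same base of 432 errors, so they can be checked against each other and converted into approximate error counts. The following Python sketch does this; the counts are approximations only, because the published percentages are rounded, and the variable names are illustrative.

```python
TOTAL_ERRORS = 432  # common base of all percentages in Tables 6-1 to 6-10

# Group totals as given in Tables 6-1, 6-5 and 6-10
group_percent = {"A": 46, "B": 38, "C": 16}

# The three main groups should cover all errors.
assert sum(group_percent.values()) == 100

# Approximate counts only: the percentages in the tables are rounded.
approx_counts = {
    group: round(TOTAL_ERRORS * percent / 100)
    for group, percent in group_percent.items()
}
print(approx_counts)
```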
A1  Machine configuration and architecture
1. Number of different device types and additional features
2. Device-specific peculiarities and variations in error handling
3. Accessibility and clarity of the hardware documentation
4. Contact and communication with the hardware developers
5. Centralized or decentralized handling of the I/O devices in the system
6. Operating experience with the device
Table 7-1: Error factors, Group A1
A2  Dynamic behavior and communication between processes
(1) Representation of the process control information (clarity, protection)
(2) Structuring of the process hierarchy in the system
(3) Description of the interfaces and communication needs of all processes
    a) explicit parameters (order, meaning, format)
    b) shared data areas (implicit parameters)
(4) Standardized routines (macros) that support process control at a higher level, enforce a certain preparation of the system status, etc.
(5) Representability of the dynamic flows, interaction of processes, etc.
(6) Description of the resources, their properties, their state
(7) Centralized or decentralized handling of process control, resource allocation, etc.
Table 7-2: Error factors, Group A2

A3 Functions offered
1) Quality of the specifications
2) Experience with similar systems
3) Statistical information on user profile, data volumes, mode of operation
4) Clarification of, and concentration on, extreme cases and exceptional situations
5) Self-restraint toward one's own "nice ideas". Discipline toward external requests.
Table 7-3: Error factors, Group A3

B1 Initialization
(1) Forced initialization, or a warning message from the compiler when initialization is missing
(2) Automatic adaptation of operations to the field length (e.g., clear the entire field)
(3) Analysis of routines with respect to their effect on various control blocks, registers and data fields
(4) Representation of the explicit and implicit parameters, and of the permitted and expected value ranges
Table 7-4: Error factors, Group B1

B2 Addressability
(1) Extension of symbolic addressing
(2) Extensibility of the address space
(3) Delimitation of the address spaces per routine ("need to know")
Table 7-5: Error factors, Group B2

B3 Reference to names
(1) Syntax of names
(2) Possibility of qualifying names
(3) Representation of the role of a field, and specification of the routines with access rights
(4) Associative addressing of tables
Table 7-6: Error factors, Group B3

B4 Counting and computing
(1) Self-describing data and areas
(2) Powerful loop instructions coupled to the data description (e.g., iterate over all entries of a table)
(3) Auxiliary tables or readily available conversion routines for a) disk addresses, b) track capacities
(4) More regular address structure for all devices; symbolic addressing
Table 7-7: Error factors, Group B4

Table 8-1: Error detection, Group A
Rows (error type): A1 Machine configuration and architecture; A2 Dynamic behavior and communication; A3 Functions offered; A4 Output listings; A5 Diagnostics; A6 Performance behavior.
Columns (measure): formal description methods; checking of the specifications by third parties; simulation and model building; program inspection by third parties; execution of test runs.
[The numeric entries of this table are not recoverable from the extracted text.]

Table 8-2: Error detection, Group B
Rows (error type): B1 Initialization; B2 Addressability; B3 Reference to names; B4 Counting and computing; B5 Masks and comparisons; B6 Range limits; B7 Placement of code.
Columns (measure): Floyd/Hoare proof method; simulation; execution of test runs (by the author); execution of test runs (by third parties).
[The numeric entries of this table are not recoverable from the extracted text.]

APLGOL, a Structured Programming Language for APL
Harwood G. Kolsky
IBM Scientific Center, Palo Alto, California, USA
ABSTRACT
APLGOL is a language providing interstatement control structure for APL. It permits programs to be written using the power and conciseness of standard APL expressions in conjunction with structured programming concepts, so as to emphasize overall program control flow rather than the details of individual statements. The APLGOL System described consists of three parts: an Editor, an APLGOL-to-APL compiler and an APL-to-APLGOL reverse compiler. All three parts are themselves written in APL.

1. INTRODUCTION
The idea for APLGOL arose during the Fall of 1971 when John R. Walters of the IBM Palo Alto Scientific Center was teaching a class in computer science at Stanford University using APL instead of the usual ALGOL-W. The class observed that although an algorithm written in APL may be considerably shorter and more concise than the same algorithm written in PL/I or ALGOL, APL required explicit interstatement control to be written.
Although APL contains a great number of elegant operators, these operators are designed to manipulate scalar, vector, or array data. Only a single branch instruction (right arrow), which is roughly equivalent to a machine-oriented conditional branch, is available to handle whatever interstatement control is necessary. This is, of course, very general, but as a consequence the control flow within an APL program can be obscure and non-structured. It is the property of APL which has led to statements being made that APL is "hostile" to structured programming. The directness with which the APLGOL system has been implemented to co-exist with APL disproves this claim. We feel that APL, when augmented by APLGOL, is now certainly one of the best structured programming languages.
Robert Kelley, one of Walters' students and co-workers, wrote the first APLGOL compiler in APL, publishing his results in 1972 and early 1973 (Refs. 5,6). One of the major considerations in designing APLGOL has been an attempt to replace the branch arrow with more descriptive keyword-oriented structures to clarify interstatement control. The first version of APLGOL was to provide either ALGOL-like or PL/I-like control structure syntax for a common set of semantics.
During 1973 Kelley and Walters revised and augmented APLGOL by adding the concept of a reverse compiler to produce structured APLGOL from the compiled form of APL (Ref. 7). In the original version the character source text had to be maintained along with the object APL procedures. In the 1973 version the reverse compiler was used to recreate the character source text in a canonical form as required, so that only one form of the program was necessary for maintenance, listing, and execution.

The evolution of APLGOL syntax and semantics has resulted in the addition of a rich assortment of control structures including IF, WHILE, UNTIL, FOREVER, FOR, and CASE statements. Finally, the whole topic of labels has been examined, especially with respect to the control structures and structured programming techniques. The result has been to eliminate the GOTO statement and its attendant labels and branches in favor of LEAVE, ITERATE, and RESTART statements, which may access only specific points within the scope of the control structure nest. APLGOL has thus become a truly GOTO-free, label-free language. A version of this APLGOL written in APL was published earlier this year (Ref. 8). Specific details given in this paper refer to that system.
2. The APLGOL Language
APLGOL is a language for providing interstatement control for APL. The APLGOL Language is oriented to an APL interpreter as the target machine. In general, APL semantics are unaltered in APLGOL, and only minor changes occur in the syntax. In APLGOL the comment delimiter, for example, must appear at both ends of a comment. Additionally, the semicolon used in APL for catenation has been replaced by the union symbol, ∪, in APLGOL.

An APLGOL program contains statements and comments arranged to describe the program's execution. The set of tokens describing an APLGOL program may be either basic symbols or APL expressions. The lexicon for APLGOL is identical to the APL character set. For details concerning the APL character set and APL expressions, see the APL Language description (Ref. 1). The basic symbols for APLGOL are single characters and reserved words as follows:
; o c=u~
A B C D E F I L N O P R S U W X
DO IF OF
END FOR
CASE ELSE EXIT NULL STEP THEN
BEGIN LEAVE UNTIL WHILE
ASSERT REPEAT
FOREVER ITERATE RESTART SUBCASE
PROCEDURE

Comments are a sequence of zero or more characters enclosed with the comment delimiter character. Comments may appear anywhere in an APLGOL program but may not be imbedded in a basic word or APL character string; it is understood that they do not affect the execution of a program.

Elementary constructions are syntactic rules of the following form:

<LEFT SIDE> ::= <RIGHT SIDE TOKEN SEQUENCE>
where the left hand side is a single token which may replace the entire sequence of one or more tokens on the right. The brackets are often used with English words to describe approximately the syntactic or semantic nature of a token. By substituting a right hand side for each left hand side token in a right hand side, collecting the tokens, and then applying further substitution and collecting, etc., the proper token sequence for an APLGOL program can be obtained. The collection of syntactic rules applied, and the action of each elementary substitution, produces the resultant APL object program.
2.1 APLGOL Statements
APLGOL contains several types of statements, which are either basic statements, executed independently, or control statements, which govern the execution of other basic or control statements.

Basic Statements          Control Statements
APL Statement             IF Statement
EXIT Statement            BEGIN Block
Empty Statement           WHILE Statement Prefix
NULL Statement            FOR Statement Prefix
ASSERT Level Statement    FOREVER Statement Prefix
ASSERT Statement          REPEAT Block
LEAVE Statement           CASE Block
ITERATE Statement
RESTART Statement

2.1.1 Basic Statements
The most fundamental statement in APLGOL is the APL statement, written as an APL expression terminated by a semicolon. This statement may contain any valid combination of APL operations and operands to form a line of APL code, excluding the branch instruction (right arrow).

The EXIT statement causes an exit from the current procedural level to the next outer level. It may optionally contain an APL expression to be evaluated just prior to the exit.

The Empty statement in APLGOL permits a programmer to write partial programs which are correct both syntactically and semantically. Should the Empty statement be executed, the comment associated with the statement is printed, and the execution is halted. From that point the programmer may choose whatever course of action he wishes. Mainly, the Empty statement is useful when the programmer wishes to debug certain portions of his code while other parts have not yet been written.
The NULL statement expresses a null action, and is primarily useful in conjunction with CASE blocks.

The ASSERT statement is useful for developing programs. The integer specified in the first part of the statement is checked at compile time with a parameter set by the programmer. If the value of the integer is less than the parameter, the assertion test is not compiled into the program. The relational expression in the second part of the statement allows the programmer to make assertions about the correctness of his program. The assertion expression is evaluated dynamically during the program's execution. If the test fails, the program is halted following a message printing 'assertion fails.'

The assertion level parameter set by the programmer is valid for the current block and inner block levels, unless another assertion level is specified at an inner nesting level. When the current block is completed, the assertion level pertaining to the next outer block level is restored. If the programmer does not initially specify an assertion level, a value of 10 is assumed.
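
The scoping of the assertion level can be pictured with a small sketch. It is written in Python rather than APL, and the class and method names (AssertionLevels, enter_block, leave_block, set_level, is_compiled) are hypothetical names chosen for illustration; only the default of 10, the block inheritance, and the "less than the parameter" rule come from the text.

```python
DEFAULT_ASSERTION_LEVEL = 10  # value assumed when the programmer sets no level


class AssertionLevels:
    """Tracks the assertion level parameter across nested blocks (illustrative)."""

    def __init__(self):
        # One entry per open block; the outermost block starts at the default.
        self._levels = [DEFAULT_ASSERTION_LEVEL]

    def enter_block(self):
        # An inner block inherits the level of the enclosing block.
        self._levels.append(self._levels[-1])

    def leave_block(self):
        # Completing a block restores the level of the next outer block.
        self._levels.pop()

    def set_level(self, level):
        # Models an ASSERT Level statement: valid for this block and inner blocks.
        self._levels[-1] = level

    def is_compiled(self, assert_integer):
        # The test is left out when the integer is less than the parameter.
        return assert_integer >= self._levels[-1]
```

Raising the level inside a block switches off the cheaper assertions there, and leaving the block switches them back on.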
2.1.2 Control Statements
Several statements are used in APLGOL to control the execution of other statements. In general, they specify conditional, iterative, or selective statement execution.

The IF statement conditionally executes a subsequent statement.
Optionally, it may contain an ELSE clause to choose between two alternate statements for execution. The semicolon on the true part is omitted if an ELSE clause is present.

Since many of the statements in APLGOL, such as the IF, control the execution of a single statement, a BEGIN block is available to group several statements into a single statement unit. The BEGIN block uses a matched pair of BEGIN and END keywords.

The WHILE, FOR, and FOREVER statement prefixes specify how to execute a subsequent statement. A BEGIN block is often useful in conjunction with these prefixes to control the execution of an entire group of statements.

The WHILE statement permits iterative execution of a statement, with an attendant relational expression in the statement header. The relational expression is tested each time before executing the statement. In order for the test to fail, the values in the expression must be changed as a result of the statement's execution.

The FOR statement appears in APLGOL in two possible forms: the FOR...UNTIL...DO form and the more complex FOR...UNTIL...STEP...DO form. The first expression must contain an assignment to initialize the induction variable for the statement. The step for the iteration may be specified as shown in the more complex form. Otherwise, a step of one is implied. The test for the iteration is performed at the top of the block each time before executing the statement.

A FOREVER statement is used in APLGOL for unconditional continuous iteration of a statement. An escape is typically caused by an EXIT or a LEAVE, although an ITERATE or a RESTART of an outer statement will also terminate a FOREVER statement.
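
The FOR prefix semantics described above can be sketched as follows. The sketch is in Python, not APLGOL. The function name for_until is hypothetical, and two assumptions are made: the step is positive, and the limit is inclusive (iteration continues while the induction variable has not passed the limit).

```python
def for_until(start, limit, step=1):
    """Yield induction-variable values in the manner of FOR...UNTIL...STEP...DO."""
    value = start              # the first expression initializes the variable
    while value <= limit:      # the test is made at the top, before each execution
        yield value
        value += step          # add the step (one if omitted), then test again


for i in for_until(1, 7, 2):
    print(i)
```

Because the test is made before the first execution, a limit below the starting value yields no iterations at all.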
A REPEAT block is used for repetitive execution of the statements within a block. A REPEAT block is similar to the WHILE, except the condition is tested at the end of the block. Consequently, the statements contained in the REPEAT block are executed at least once. It also differs from the WHILE statement in its format.
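
The difference between the two tests can be sketched in Python. The helper names run_while and run_repeat are hypothetical, and the sketch assumes that a REPEAT block ends as soon as its closing condition holds; the text above states only that the test comes at the end of the block.

```python
def run_while(condition, body):
    """Test first, then execute: the body may run zero times."""
    executions = 0
    while condition():
        body()
        executions += 1
    return executions


def run_repeat(body, until):
    """Execute first, then test: the body always runs at least once."""
    executions = 0
    while True:
        body()
        executions += 1
        if until():  # assumed reading: stop once the closing condition holds
            break
    return executions
```

With a condition that is false from the start, run_while executes the body zero times, while run_repeat with a closing condition that already holds executes it once.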
A CASE block is used to select a particular statement in the block for execution. The expression in the CASE header produces an integer index value, specifying which subcase statement is to be executed. An integer in the second part of the CASE header specifies the maximum subcase index value, with 0 as the lowest value. This is true independent of the APL origin being used. Each subcase statement is preceded by an integer or list of integers, followed by a colon, to identify the particular subcase. Subcases may be written in arbitrary order, and several index values may refer to the same subcase. Additionally, some subcases may be designated null by omitting them, although a NULL statement may be used explicitly.

2.1.3 Leave, Iterate, and Restart Points
Each control statement in a structured procedure has Leave, Iterate, and Restart reference points defined for it. These points can be accessed by LEAVE, ITERATE, and RESTART statements which lie within the nested structure of the control statement. Each such statement may use a control list to specify a particular control statement in the nest. Consequently,
branches
are
structured
statement
statement
to
following
control
causes
list
contains
words
structure
in the nest w h i c h
The
to
CASE WHILE
statements.
figure
not
contain
the s t a t e m e n t
containing
the control
the on
in the control statement
control
statement
following
An
refers
to resume
~TERATE
statements
of
a
to
with
on line the
with
statement
were
line
outer
line
_REPEAT implies
list
REPEAT
the
be
first
The c o n t r o l
21 refers
to the
on line
30
third
LEAVE
and w o u l d
cause
executed.
another
if the c o n d i t i o n
of
outer WHILE
33,
The
i!,
The
statement
resume
executed. on
level.
statement.
LEAVE line
OR
examples
17, the control
control w o u l d
on line 27 if it were
in the block
some
statement.
on
structure
ITERATE,
nearest
this
WHILE
control
and R E S T A R T
a control LEAVE,
and then the n e a r e s t
the W H I L E
following
of the list:
~TERATE,
the LEAVE
resume
first
PROCEDURE
shows
the complete
statement
the
for the p r o c e d u r e
Should
6. Effectively,
LEAVE
statement control
would
the
immediate
designates
program
nest,
REPEAT.
of
the p a t t e r n
particular
2 by locating
the second LEAVE
this
list the
of the ~EAVE,
structure
executed,
if
of
statement.
most
SUBCASE
is adjusted
line
from the
_WHILE on line
the
in c o n j u n c t i o n
In the first example designates
list in
disciplines
resume w i t h
the LEAVE,
the example
list
to
control
satisfies
list serves
If the control
does
_RESTART,
the
combinations
designate
REPEAT F O R E V E R
same control
which
control
the s p e c i f i e d
reserved
IF FOR
preserve
programming.
The LEAVE
The
restricted
iteration
specified
over
the
in the UNTIL
clause is valid~
A RESTART resumes control at the entrance of the REPEAT block without testing the condition.

An ITERATE of a FOR statement begins another iteration if the induction variable has not passed the FOR statement limit after adding the proper step. A RESTART begins the entire FOR statement again, including the initialization of the induction variable.

RESTART and ITERATE both denote identical actions when used in conjunction with the IF, WHILE, CASE, SUBCASE, PROCEDURE, and FOREVER statements. They cause each such statement to be re-executed. (See the Figures for diagrams of the LEAVE, ITERATE, and RESTART points in each of the control structures.)
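
For a FOR statement, the distinction between ITERATE and RESTART can be sketched in Python. The function run_for and the convention that the body returns an action string are hypothetical devices for illustration; a positive step and an inclusive limit are assumed, as the text does not state otherwise.

```python
def run_for(start, limit, step, body):
    """Run body(i) under FOR semantics and return the values of i that were visited.

    body returns None (normal completion), "ITERATE", "RESTART" or "LEAVE".
    """
    visited = []
    i = start
    while i <= limit:
        visited.append(i)
        action = body(i)
        if action == "LEAVE":
            break              # resume after the FOR statement
        if action == "RESTART":
            i = start          # begin the whole FOR again, re-initializing i
            continue
        i += step              # ITERATE and normal completion: add the step, retest
    return visited


restarted = []

def body(i):
    if i == 2 and not restarted:
        restarted.append(True)
        return "RESTART"       # first time i reaches 2, start over
    if i == 3:
        return "LEAVE"
    return None

print(run_for(1, 5, 1, body))  # prints [1, 2, 1, 2, 3]
```

In this model an ITERATE simply ends the body early, so the step is added and the limit retested, whereas a RESTART discards the current value of the induction variable.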
3. The APLGOL System
The standard APL System operates either in a computational mode or a procedure-definition mode. In this latter mode an APL procedure may be created or modified as desired. When this has been accomplished, and the computational mode is entered, the string of characters representing the function is encoded (or compiled) into an internal form more suitable for the APL interpreter. To edit this procedure, the internal form is transformed back to the character form as the programmer changes again from computational mode into procedure-definition mode. As a result, a workspace contains only a single copy of a procedure.

In the APLGOL System a special APLGOL editor, similar to the CMS editor (Ref. 4), is available for the programmer to create and edit APLGOL procedures. The programmer may invoke one of a pair of compilers, either to translate APLGOL source programs into internal APL object programs, or to translate internal APL object programs back into APLGOL source programs for subsequent editing. (See Fig. 1)
To invoke the APLGOL editor the programmer types:

<NAME> ← EDIT <NAME>

where the name specifies a character array containing the representation of an APLGOL procedure. When editing is complete, the EDIT program returns the updated APLGOL program as a character array. The EDIT commands are listed below in section 3.1.

To invoke the compiler one types

APLGOL <NAME>

The APLGOL compiler produces an APL function whose name <APLFN> is given by the PROCEDURE <APLFN>; statement in the APLGOL source.

The APL interpreter operates only on the internal form, which may have been produced indistinguishably from standard APL or APLGOL. (It should be noted, however, that although it is possible to print and edit APLGOL programs using the standard APL editor, it is not possible to produce APLGOL programs from arbitrary APL programs. The Reverse APL to APLGOL compiler relies heavily on the canonical form of the APL program as it is compiled from APLGOL.)

To use the reverse compiler one types

REVERSE '<APLFN>'

The result is an APLGOL source program in expanded, indented form. It is left in a global character array variable named "OUT".
The APLGOL System was designed recognizing that the user requires different human factors capabilities when he is typing or editing a program than when he lists a program to display its structure. Consequently, the APLGOL compiler has been designed to accept text with combinations of abbreviated and completely spelled keywords and many source statements to the line, while the REVERSE compiler produces source programs with fully spelled keywords in statements, one statement per line, with two-space indenting for each layer of nesting. Thus, a procedure can be entered as:

SAMPL; ~ A~B I~B A+C; ~ J÷I ~ L(N-1 +I)÷2 L2+L3÷L4L-J-I; L4÷ -L-l+pY÷7 DYADF L; E; E E A÷D; 2:ppZ T X C[2]÷(I+pZ)÷N~ Z+N,C,N,P,Q,R; E

while it is produced by the reverse compiler and used for subsequent editing as:

PROCEDURE SAMPL; IF A~B THEN BEGIN A+C; FOR J÷I UNTIL L(N-I+I)~2 DO L2÷LS+L4L-J-I; BEGIN Lq÷-L-I+pY÷7 DYADF L; END; END ELSE A÷D; IF i:ppZ THEN EXIT C[2]÷(I+pZ)÷N; Z+N,C,N,P,Q,R; END PROCEDURE
3.1 The APLGOL Editor
A single context-oriented editor is used to enter new programs, edit old ones, and list programs. This editor has been patterned after the CMS editor in VM/370 (Ref. 4). Compared with typical APL editors, this editor is keyed to context and not to line numbers. Its functions include locating the next occurrence of a character pattern, changing one character pattern to another, inserting, deleting, replacing, or printing lines of text and, finally, moving the implied cursor up or down a few lines or to the top or bottom of the text. The text cursor has no special relation between APLGOL statements and physical lines; an arbitrary number of lines may be used to enter an APLGOL statement, or many statements can be entered on a line.

APLGOL source programs are actually APL character arrays. Keywords all begin with an underlined first letter for easy recognition, while other letters are not underlined in order to reduce keystrokes. Also, when programs are being entered, only the first underlined letter need be used. This is intended to facilitate the entering of programs and to reduce misspelling of keywords.
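
The context-oriented, line-pointer style of editing can be pictured with a minimal sketch. It is written in Python, not APL, and is not the EDIT function itself: the class LineEditor and its method names are hypothetical. The clamping of the pointer and the wrap-around of the search follow the command descriptions and the Note given with the command list in this section.

```python
class LineEditor:
    """Illustrative line-pointer editor over a list of text lines."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.current = 0  # zero-based index of the current line

    def _clamp(self):
        # Keep the pointer between the first and the last line of the text.
        last = max(len(self.lines) - 1, 0)
        self.current = min(max(self.current, 0), last)

    def next(self, n=1):
        self.current += n
        self._clamp()

    def up(self, n=1):
        self.current -= n
        self._clamp()

    def top(self):
        self.current = 0

    def bottom(self):
        self.current = max(len(self.lines) - 1, 0)

    def locate(self, text):
        # Search the lines after the current one; from the last line, start at the top.
        last = len(self.lines) - 1
        start = 0 if self.current >= last else self.current + 1
        for index in range(start, len(self.lines)):
            if text in self.lines[index]:
                self.current = index
                return True
        self.current = max(last, 0)  # not found: the pointer rests on the last line
        return False

    def change(self, old, new):
        # Replace the first occurrence of old on the current line, if it occurs.
        line = self.lines[self.current]
        if old == "":
            self.lines[self.current] = new + line  # null old text: insert at the start
        elif old in line:
            self.lines[self.current] = line.replace(old, new, 1)
```

Nothing in this model refers to line numbers; every command acts relative to the current line, which is the property the text emphasizes.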
To create an entirely new source program the editor is invoked by entering:

NEWPGM←EDIT (0,N)⍴''

where NEWPGM is the name for the new source text, and N specifies the line width. The editor is then in Command Mode and may accept commands for inserting, deleting, or changing lines of source text. To edit a text array, the edit procedure is invoked according to the following example:

ARRAY←EDIT ARRAY

where ARRAY is the name of both the old text and the newly edited text. Different names can be used for the old and new texts if desired. The programmer may terminate the editing either by typing "FILE", after which the APLGOL source text can be translated into internal APL, or by typing "QUIT" to abort the editing process without affecting the prior status of the procedure.

The EDITOR accepts the following commands:

INSERT .... or I ....
Insert the text .... after the current line. When the editor is in Command Mode, Insertion Mode is entered by typing "I" followed by a carriage return. Following this, successive lines of text may be inserted one after the other as they are to appear in the text. To leave Insertion Mode and return to Command Mode, a null line is entered by pressing the carriage return on a new line before typing any other character.

DELETE n or D n
Delete n lines beginning with the current line. If n is omitted, then 1 is assumed.

PRINT n or P n
Print n lines beginning with the current line. If n is omitted, 1 is assumed.

NEXT n or N n
Step forward n lines in the text. If n is omitted then assume 1.

UP n or U n
Step up n lines. If n is omitted then assume 1.

TOP or T
Position line pointer at first line in text.

BOTTOM or B
Position line pointer at last line in text.

LOCATE /..../ or L/..../
Search the lines following the current line for the line containing the text string .... The characters / and / are used to delimit the argument of locate; they are not part of the string being searched for. Any character may be used as a delimiter. If the pointer is on the last line of the function, the search will begin at the top of the function.

CHANGE /text1/text2/ or C /text1/text2/
Search the current line for text1, and replace it by text2 if it occurs. As noted above, the character / serves as a delimiter and may be replaced by any character. If text1 is null, text2 is inserted at the beginning of the line. If text2 is null, text1 is deleted.

REPLACE .... or R ....
Replace the current line by the text .... If no text is given, this is the same as DELETE, UP, INSERT combined.

FILE or F
Leave the editor and assign the newly edited text to the target array specified.

QUIT or Q
Leave the editor and make no change to the status of the function. If the function was defined prior to editing, its definition will be unaltered. Similarly, if it was undefined, it will still remain undefined.

Note: If the argument of DELETE, NEXT, PRINT, or LOCATE is such that the line pointer would move past the end of the function, the line pointer is set to point to the last line. If the UP command tries to point beyond the top line, the line pointer is set to the first line in the text.

3.2 The APLGOL Compiler
The APLGOL compiler is organized into three main sections: (1) a syntax scanner, (2) a lexical scanner, and (3) an APL object text generator. One of two procedures is used to initialize the appropriate tables for driving the syntax scanner before invoking the syntax scanner.
The syntax scanner is the controlling procedure, obtaining meta symbols from the lexical scanner, identifying the grammar rules, and then invoking the text generator as necessary to produce the appropriate APL object text.

On each invocation, the lexical scanner returns a single meta symbol number representing a reserved word, label, special character, or an APL expression. Source text characters are scanned for one of a very limited set of characters, which is then used to key a detailed search for a specific meta symbol; e.g., the first character in each reserved word is underlined to aid in distinguishing it as a particular meta symbol, both for the lexical scanner and for the reader. The terminal meta symbols listed in Appendix II are detected in the lexical scanner.

When a grammar rule is identified by the syntax scanner, the text generator is invoked with a number corresponding to that grammar rule to locate the applicable portion of the generator. Information accumulated in the compile stack and elsewhere is then used to generate labels, branches, or to produce a single APL text line from a buffered APL expression.
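
The keyed search performed by the lexical scanner can be sketched as follows, in Python rather than APL. Several things here are assumptions made for illustration: a leading "_" stands in for the underlined first letter, only a handful of reserved words are listed, and the meta symbol numbers are invented (the paper's own table is in its Appendix II and is not reproduced here).

```python
import re

RESERVED = ["BEGIN", "END", "IF", "THEN", "ELSE", "WHILE", "DO"]
# Hypothetical meta symbol numbers, one per reserved word, then ";" and expressions.
META = {word: number for number, word in enumerate(RESERVED, start=1)}
META[";"] = len(RESERVED) + 1
APL_EXPRESSION = len(RESERVED) + 2

# A marked keyword, a semicolon, or a run of anything else (an APL expression).
TOKEN = re.compile(r"\s*(?:_([A-Z]+)|(;)|([^_;]+))")


def scan(source):
    """Yield (meta_symbol_number, text) pairs, one per scanner invocation."""
    position = 0
    while position < len(source):
        match = TOKEN.match(source, position)
        if match is None:
            break  # an unrecognized character ends the scan in this sketch
        position = match.end()
        keyword, semicolon, expression = match.groups()
        if keyword is not None:
            yield META[keyword], keyword          # the "_" keyed the reserved-word search
        elif semicolon is not None:
            yield META[";"], ";"
        elif expression.strip():
            yield APL_EXPRESSION, expression.strip()


for number, text in scan("_IF A>B _THEN A<-C;"):
    print(number, text)
```

Only two characters ("_" and ";") trigger a detailed search here; everything else is passed along as an uninterpreted APL expression, which mirrors the division of labor described above.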
An output procedure is employed to form an array of object text, and is invoked from the text generator each time a new line is created. This object text array remains in the compiler workspace, so that extraneous branches and labels can be removed after all the object text has been produced.

The semantics of the control structures are indicated in the enclosed figures. In the APL object text produced by the simple one-pass compiler, many of the labels and branches may be unnecessary, particularly those statements which consist of a label and an absolute branch. To remove them, the compiler builds tables showing the labels and the references to them. After all the object code has been produced, references to a label that is followed by an absolute branch are changed to references to the target of that branch. Further, if multiple labels appear on a single line, all but one are removed, and the references are revised accordingly.

3.3 The APL to APLGOL Back-Compiler
It is customary in APL systems to retain a single internal form for each APL procedure and to translate to or from a character representation as necessary. In the original APLGOL implementation (Ref. 5) the APLGOL source was a separate entity from the APL object, and one could not guarantee that both forms were at the same maintenance level. The main advantage of a single form is that multiple forms of the program cannot get out of synchronization. This requires a pair of translators, one for each direction. Although the two translations are rather dissimilar, it has been possible to construct both; the translation from APL to APLGOL, however, is heavily conditioned by the patterns produced by the APLGOL to APL translation. Additionally, storage requirements are reduced by having only a single copy of a procedure.
The reverse translation from APL to APLGOL will produce a stylized canonical form, which may appear quite different from the original APLGOL program as entered. Rather than attempt to duplicate the original, the translator was designed to produce an indented format which displays the structure of the program graphically. In this form, keywords are fully spelled, and each control level is consistently indented, typically with one statement per line. This graphical representation of the program is very useful to depict the program structure. Because this form is available for listing an APLGOL program that has been successfully translated to APL, the APLGOL program can always be listed in a standard format, thereby promoting a consistent style.

In general the reverse compiler does not change the structure of the APLGOL program. It does, however, keep count of the number of APL statements present in each block. If the APL function is edited to have two or more statements where only one was in the original APLGOL source, the reverse compiler will add a BEGIN ... END pair around the statements during back compilation.
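The BEGIN ... END rule described above can be illustrated with a small sketch. It is not the APLGOL reverse compiler, which was written in APL and worked on the stored APL form; it only assumes that the body of a control construct is available as a list of statement strings. The function name `render_block` and the two-space indentation are hypothetical choices made for this illustration.

```python
def render_block(statements, indent=0):
    """Render the body of a control construct as listing lines.

    Assumption for this sketch: `statements` is a list of statement
    strings without trailing semicolons. A single statement is listed
    as is; two or more statements are wrapped in BEGIN ... END, which
    mirrors the rule described for back compilation.
    """
    pad = "  " * indent
    if len(statements) == 1:
        return [pad + statements[0] + ";"]
    lines = [pad + "BEGIN"]
    # One statement per line, indented one level inside the block.
    lines += [pad + "  " + s + ";" for s in statements]
    lines.append(pad + "END;")
    return lines


if __name__ == "__main__":
    # A body that grew from one statement to two after editing.
    for line in render_block(["A<-A+1", "B<-B-1"], indent=1):
        print(line)
```

Counting the statements at listing time, rather than remembering whether the source had a block, is what lets an edited function still list as a well-formed program.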
4. ACKNOWLEDGEMENTS

The author wishes to acknowledge the staff of the IBM Palo Alto Scientific Center for their inspired work in designing, implementing and testing the APLGOL system. The major original contributions were made by Robert A. Kelley, John R. Walters and Dan McNabb (see Refs. 5-7). Special thanks are due to H. Joseph Myers for the more recent modifications and for testing APLGOL under many conditions. The author is particularly grateful to John R. Walters for making available to us his most recent (Ref. 8) copyrighted APLGOL functions written in APL.
APPENDIX I: References

1. APL/360 User's Manual, GH20-0683, IBM Corporation.
2. Dijkstra, E. W., "The Humble Programmer", 1972 Turing Lecture, Communications of the ACM, Vol. 15, No. 10, October 1972.
3. Dijkstra, E. W., "Structured Programming" (with O. J. Dahl and C. A. R. Hoare), Academic Press, London, October 1972.
4. IBM Virtual Machine Facility/370: Edit Guide, GC20-1805, IBM Corporation.
5. Kelley, R. A., "APLGOL, a Structured Programming Language for APL", IBM Palo Alto Scientific Center Report No. 320-3299, August 1972.
6. Kelley, R. A., "APLGOL, an Experimental Structured Programming Language", IBM Journal of Research and Development, Vol. 17, No. 1, January 1973.
7. Kelley, R. A. and Walters, J. R., "APLGOL-2 A Structured Programming System for APL", IBM Palo Alto Scientific Center Report No. 320-3318, August 1973.
8. Kelley, R. A. and Walters, J. R., "APLGOL-2 A Structured Programming Language System for APL", Proceedings of APL-VI Conference, May 14-17, 1973.
9. Knuth, D. E., "A Review of Structured Programming", Stanford University Computer Science Department, STAN-CS-73-371, June 1973.
APPENDIX II: APLGOL SYNTAX
[I]
[2]
[3] [4]
< S T A T E M E N T LIST> ::= < S T A T E M E N T > ; I <STATEMENT LIST> <STATEMENT>
[5] [6]
< S T A T E M E N T > :::
[7] [8]
< C O M M E N T L I S T > ::= < C O M M E N T S T A T E M E N T > I
[9] [10] [11]
< S T A T E M E N T - A > ::= < E X P R E S S I O N > : <EXPRESSION> <EMPTY STATEMENT> NULL :
~12]
~13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23]
P R O C E D U R E _I_
::=
_I_
~:=
PROCEDURE
<STATEMENT
<EXPRESSION>
LIST>
LIST>
;
;
<STATEMENT-A>
EZZT [XIT <EXPRESSION> < B E G I N > < S T A T E M E N T L I S T > <END> <STATEMENT LIST> ~NTIL <EXPRESSION> < T R U E P A R T > < S T A T E M E N T > ~ 0 < S T A T E M E N T > < W H I L E H E A D > DO < S T A T E M E N T > DO <STATEMENT> < C A S E E X P R > B E G I N < S U B C A S E L I S T > <END C A S E >
[24]
[25]
::=
<STATEMENT>
[26]
::=
[273 [28]
: : = 1
[29]
[30]
[31]
[32]
<WHILE
[33]
::=
[34]
::=
~ASE
[35]
<END
[36]
< S U B C A S E L I S T > ::= < S U B C A S E > 1 1 <SUBCASE LIST> <SUBCASE> I <SUBCASE LIST>
[37] [38] [39]
<END>
HEAD>
::=
::=
<EXPRESSION>
<STEP> ~NTIL
THEN
<EXPRESSION> <EXPRESSION>
<EXPRESSION>
WHILE
<END>
<ELSE>
<EXPRESSION>
::=
::=
ASSERT
::=
HEAD>
CASE>
::=
<EXPRESSION> HEAD>
~F
<EXPRESSION>
<EXPRESSION> EASE
[40]
<SUBCASE>
[41]
<SUBCASE
HEAD~
[42]
<SUBCASE
HEAD-l>
[44]
[45]
I
::= <SUBCASE :::
<SUBCASE
::=
[5o]
[53] [54] [55]
::= ~ E P E A T WHILE ~ASE SUBCASE PROCEDURE ~OR FOREVER
<END>
LIST>
:::
END
:::
[65]
I
[66] [67]
[68]
<ELSE>
::=
REPEAT
I
[69]
: :: ELSE
I _E
[70] [71]
[72]
[73]
: :: / F I I_ : := FOR
iF
[75] [77]
<EXPRESSION> <EXPRESSION>
1
[62] [63]
[76]
HEAD-l>
BEGIN
:::
[51] [52]
[74]
<STATEMENT>
! Z I RESTART !
[49]
[64]
HEAD>
::: LEAVE I ~TERATE
[48]
[56] [57] [58] [59] [60] [61]
S YNTAX . . .
I <SUBCASE HEAD>
[43]
[46] [47]
.
: :: F O R E V E R
I _zZ
<STEP>
: := STEP 1 S
:
;
Figure 1: AN APLGOL SYSTEM WRITTEN IN APL
[Diagram, all within an APL WORKSPACE. Box labels: INPUT OF APLGOL SOURCE PROGRAM (FROM KEYBOARD); APLGOL EDITOR (EDIT); APLGOL SOURCE PROGRAM (CHARACTER ARRAY); APLGOL COMPILER (WRITTEN IN APL) (APLGOL); APL PROGRAM (SPECIAL FORM); USUAL APL EDITOR; CHANGES IN APL PROGRAM (KEYBOARD); APLGOL REVERSE COMPILER (WRITTEN IN APL) (REVERSE); APLGOL PROGRAM LISTING; EXECUTION OF APL PROGRAM; RESULTS OF EXECUTION; OUTPUT OF RESULTS.]
APLGOL STATEMENTS

BASIC STATEMENTS              CONTROL STATEMENTS
APL STATEMENT                 IF STATEMENT
EXIT STATEMENT                BEGIN BLOCK
EMPTY STATEMENT               WHILE STATEMENT PREFIX
NULL STATEMENT                FOR STATEMENT PREFIX
ASSERT STATEMENT              FOREVER STATEMENT PREFIX
ASSERT LEVEL STATEMENT        REPEAT STATEMENT PREFIX
                              REPEAT BLOCK
                              CASE BLOCK
                              LEAVE STATEMENT
                              ITERATE STATEMENT
                              RESTART STATEMENT

FIGURE 2
<APL STATEMENT> ::= <EXPRESSION> ;
EXAMPLES:
P←(2=+/[2]0=S∘.|S)/S←⍳N;
'TIME= ',T,' RATE= ',R,' DISTANCE= ',D;
I←I+S; J←J[B];

<EXIT STATEMENT> ::= EXIT ; | EXIT <EXPRESSION> ;
EXAMPLES:
IF A=5 THEN EXIT
ELSE EXIT P←P|S;

<EMPTY STATEMENT> ::= ⍝ <EXPRESSION> ;
EXAMPLE:
IF SSECTODATE<SSMAX THEN ⍝ SOCIAL SECURITY COMP GOES HERE ;

<NULL STATEMENT> ::= NULL ;

FIGURE 3

<ASSERT STATEMENT> ::= ASSERT <EXPRESSION> : <EXPRESSION> ;
EXAMPLE: ASSERT 10 : A ...

<ASSERT LEVEL STATEMENT> ::= ASSERT <EXPRESSION> ;
EXAMPLE: ASSERT 100;

FIGURE 4

<IF STATEMENT> ::= IF <EXPRESSION> THEN <STATEMENT>
                 | IF <EXPRESSION> THEN <STATEMENT> ELSE <STATEMENT>

<BEGIN BLOCK> ::= BEGIN <STATEMENT LIST> END

EXAMPLES OF BEGIN BLOCKS IN CONJUNCTION WITH THE IF STATEMENT ARE THE FOLLOWING:

IF A>5 THEN
BEGIN
  B←A|5;
  A←C←5;
END
ELSE
BEGIN
  A←A+1;
  C←B←B|C;
END;

IF A>5 THEN B←A|5;
ELSE
BEGIN
  A←A+1;
  C←B←B|C;
END;

IF A>5 THEN
BEGIN
  B←A|5;
  A←C←5;
END
ELSE A←A+1;

FIGURE 5
<WHILE STATEMENT> ::= WHILE <EXPRESSION> DO <STATEMENT>

<FOR STATEMENT> ::= FOR <EXPRESSION> UNTIL <EXPRESSION> DO <STATEMENT>
                  | FOR <EXPRESSION> UNTIL <EXPRESSION> STEP <EXPRESSION> DO <STATEMENT>

<FOREVER STATEMENT> ::= FOREVER DO <STATEMENT>
EXAMPLE:
FOREVER DO IF FLAGFUNCTION THEN LEAVE: FOREVER;

<REPEAT BLOCK> ::= REPEAT <STATEMENT LIST> UNTIL <EXPRESSION>

FIGURE 6

<CASE BLOCK> ::= CASE <EXPRESSION> OF <EXPRESSION> BEGIN <SUBCASE LIST> END CASE

<SUBCASE> ::= <EXPRESSION> : <STATEMENT>

THE FOLLOWING IS AN EXAMPLE OF THE CASE STATEMENT BLOCK:

CASE I|J OF 15
BEGIN
0:  FOR K←I|J UNTIL (⍴TABLE)[1] STEP J DO
    BEGIN
      TABLE[K;I]←FUN ARG;
      I←I+1;
      J←J-1;
    END;
2:  NULL;
1:  ⍝ CASE 1 EMPTY FOR NOW ;
10: 12: 14:
    BEGIN
      I←I-1;
      J←J-1;
      IF 10≥I|J THEN RESTART: CASE
      ELSE ITERATE: SUBCASE;
    END;
5:  EXIT I,J;
END CASE;

FIGURE 7
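<LEAVE STATEMENT>   ::= LEAVE : <construct> ;
<ITERATE STATEMENT> ::= ITERATE : <construct> ;
<RESTART STATEMENT> ::= RESTART : <construct> ;

Here <construct> stands for the keyword of an enclosing construct, as in LEAVE: FOREVER, RESTART: CASE and ITERATE: SUBCASE in the examples.

FIGURE 8

EXAMPLE:
[1]  PROCEDURE EX;
[2]  WHILE A>B DO
[3]  BEGIN
[4]  REPEAT
[5]  B←C[I];
[6]  WHILE I15; C←C,J,B;
...
[31]
[32] END;
[33] A←B;
[34] END PROCEDURE

FIGURE 9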
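For readers more familiar with current languages, the LEAVE and ITERATE statements shown above have rough counterparts in `break` and `continue`. The sketch below assumes that LEAVE exits the named construct and that ITERATE proceeds to its next iteration; that reading is an interpretation of the examples and flowchart labels, not a statement of the APLGOL definition. RESTART has no direct Python equivalent and is not modelled. The function name and the data are hypothetical.

```python
def first_even_square_over(limit, values):
    """Return the first square of an even value that exceeds `limit`.

    Written in the shape of FOREVER DO ... with explicit exits, to
    show where LEAVE and ITERATE would appear (assumed semantics).
    """
    source = iter(values)
    result = None
    while True:                      # FOREVER DO
        try:
            v = next(source)
        except StopIteration:
            break                    # LEAVE: FOREVER (input exhausted)
        if v % 2 != 0:
            continue                 # ITERATE: FOREVER (skip odd values)
        if v * v > limit:
            result = v * v
            break                    # LEAVE: FOREVER (answer found)
    return result


if __name__ == "__main__":
    print(first_even_square_over(50, [3, 4, 7, 8, 10]))
```

Python's `break` and `continue` always refer to the innermost loop, whereas the APLGOL statements name the construct they apply to, so nested cases such as ITERATE: SUBCASE inside a CASE block would need flags or helper functions in a translation.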
Figure 10: APLGOL flowcharts. 1. IF/THEN; 2. IF/THEN/ELSE. [Flowcharts with Iterate/Restart entries, True Part Statement, Else Statement, and Leave exits.]

Figure 11: APLGOL flowcharts. 3. PROCEDURE BLOCK; 4. FOREVER STATEMENT. [Flowcharts with Iterate/Restart entries and Leave exits.]

Figure 12: APLGOL flowcharts. 5. WHILE STATEMENT; 6. REPEAT BLOCK. [Flowcharts with Restart and Iterate entries, True/False branches, and Leave exits.]

Figure 13: APLGOL flowchart. 7. FOR STATEMENT. [Flowchart with Restart entry, Initialization, For Statement, and Leave exit.]

Figure 14: APLGOL flowchart. 8. CASE BLOCK AND SUBCASE. [Flowchart with Iterate/Restart for Case Block, Select Ith Case, Iterate/Restart for Subcase N, and Leave for Case and Subcases.]
SYSTEMS PROGRAMMING FROM THE POINT OF VIEW OF THE UNIVERSITY
(Systemprogrammierung aus der Sicht der Universität)

N. Wirth
Eidgenössische Technische Hochschule, Zürich, Switzerland

ABSTRACT

Systems programming makes an essential contribution to the practical education of computer scientists. To what extent can systems programming be taught successfully at universities? The author arrives at the view that education toward clear, constructive thinking should be given priority over the transmission of detailed technical knowledge, and that precisely in this respect computer science can play a significant role.

It is in keeping with the mission of the university that the content and nature of its teaching adapt in a sensible way to developments in the world around it. Thus, in the recent past, prompted by the rise of computer technology, departments and curricula in computer science came into being. Lively discussions were held about the nature, aims, and right to exist of this new field of study [3,4,5,10]. Can a machine justify a new academic discipline? In many places this question was sidestepped by replacing the word "computer" in the name of the subject, "for just where concepts are lacking, a word turns up at the right time". It seems to me that the nature of a field can best be derived from the demands that their future work places on its graduates. In this sense, the construction of programs of all kinds and of ever-increasing complexity has crystallized more and more as the core of computer science. The work of the practicing computer scientist is therefore above all constructive in nature; he is a computer engineer, a term that we may no longer restrict to the domain of hardware. In this regard, systems programming has attracted particular attention [~]. By systems programming we mean the conception and development of large computer systems such as language translators, system control programs, and complex application-oriented programs. The production of such products consists of several phases. The activity usually called programming is only one component of it, while the more comprehensive overall field is also referred to as "software engineering" [6].
Developments in industry are usually project-oriented. Until its completion a project passes through the following phases (see also [8]):

1. The feasibility study. In this phase a detailed survey of the current state is first carried out. On it rests the cost-benefit analysis, which sets the limits of the resources that may be deployed and of the performance improvement expected from the system to be designed. Also part of this phase is a study of the options for realization: can the problem best be solved by buying or renting existing systems or system components, by in-house development, or by commissioning a development? The result of this phase is a requirements document for the next step:

2. The project analysis. It consists of a deeper study of the project definition, during which the task is decomposed into subtasks. Here, then, we already find a preparation for the modularity of the end product. The project analysis results in a requirements specification that states in detail the demands on the projected system. It also fixes the form and extent of the documentation to be produced. This is not only documentation of the system program for the use of the systems programmer, but also the material needed for using the product, for training, and for marketing. The requirements specification also lays down the kind and extent of the acceptance tests that in the end decide the success or failure of the project. An important and critical result of the project analysis is the schedule for carrying out the development work. From the schedule the required staffing can be seen: which staff levels are needed at which times. Schedule and staffing plan in turn yield a more precise estimate of the costs incurred, which must necessarily stay within the means fixed in the outline concept.

3. The realization (implementation). The realization phase comprises the planning of the technical details, the programming proper, and the coding in an available programming language. Technical insights gained during programming usually lead to a revision of the detailed specification of the system, and unfortunately often also of the schedule and even of the staffing plan. Gross misestimates in this respect have become downright proverbial [1], and bear witness to a lack of long experience and training. The realization phase also includes the production of the documentation already mentioned, and even the training of personnel, partly for the implementation itself but above all for the later use and the acceptance tests of the system.

4. Use and maintenance. Strictly speaking, this phase no longer belongs to the development of a system. In practice, however, the transition to this phase is often hard to pin down. In it the system is put to the test in practical use. Blatant errors in the system must be eliminated. Experience shows that they occur so frequently that for this activity the euphemism "maintenance" has completely displaced the truthful term "correction". Yet this phase is concerned not only with correcting programming errors, but quite often also with correcting design errors and with adapting the system to changing and growing requirements.
From this sketch of a software engineering project, its similarity to projects in most other branches of engineering becomes obvious. What essentially distinguishes a software engineering project from those of other fields is only the content of the realization phase. It is therefore natural that teaching in computer science concentrates relatively strongly on this technical chapter.
Skeptics will object here that computer science teaching at universities has so far not adjusted to the needs of systems programming at all, and that programming courses show far too great a fondness for tinkering with small toy programs and solving artificial little problems that have no place in technical reality. I would nevertheless point out that practicing on simple, small programs is the only feasible way, in the short time available, to learn the basic concepts and to be introduced to working with computers. The skeptics' reproach is relevant all the same. It reminds teachers that the exercises they set must be carefully chosen abstractions of real problems. It is essential that computer science thereby keeps close contact with reality, and that it does not, like modern mathematics, detach itself from it entirely and concentrate on postulating its own artificial problems and creating abstractions and formalisms for their own sake. We teachers are also called upon to point out to students again and again that the carefully chosen exercises are to be understood as abstractions, and that an abstraction takes on the most varied real forms. But how difficult it is to get an example recognized as an example!
In my introductory course on programming I like to use the example of computing a table of prime numbers to demonstrate the lines of thought in problem solving and the stepwise construction of programs. But with all too many listeners, no interest can be aroused with the best will in the world, since for them the problem of prime numbers has long since been solved in the form of collections of tables!
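The prime table lends itself to a brief illustration of stepwise construction. The sketch below is not the program from the author's course, which the text does not reproduce; it is one possible outcome of refining "produce a table of the first n primes" into a loop over candidates and a separately developed primality test. The function names are hypothetical.

```python
def is_prime(n, known_primes):
    """Second refinement step: decide primality by trial division.

    Assumes `known_primes` holds, in increasing order, every prime
    smaller than n. Only primes p with p*p <= n need to be tried.
    """
    for p in known_primes:
        if p * p > n:
            break
        if n % p == 0:
            return False
    return True


def prime_table(count):
    """First refinement step: fill a table with the first `count` primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if is_prime(candidate, primes):
            primes.append(candidate)
        candidate += 1
    return primes


if __name__ == "__main__":
    print(prime_table(10))
```

The point of the exercise lies less in the table than in the order of decisions: the outer loop is settled first, and the test is refined only afterwards, reusing the table built so far.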
Without doubt, a sequence of advanced courses with emphasis on practical programming technique is indispensable for the prospective software engineer. These courses should convey specialist knowledge about various chapters of computer science: about data structuring and representation, about searching and sorting problems, about syntactic structures and their analysis, about compiler construction, and finally even about selected chapters of operating systems. Without deliberate cultivation of the practical exercises, however, these courses are almost worthless. Put somewhat pointedly, one could say that the theoretical topics serve merely as a spice to stimulate the practical work. Naur goes even further and declares: [the imparting of] experience deserves more weight than knowledge [7].
My experience has been that the specialist knowledge conveyed in the lecture part of such courses usually causes students no difficulty; the more mathematical its form, the more easily it is "registered". The programs handed in for the exercises, by contrast, are often beyond all criticism! What horrors of contorted constructions and misapplied recipes are patched together into programs and offered as "solutions"! After such inspections it is always obvious where the gaps in our teaching gape widest; unfortunately, no universal remedy presents itself to fill them. Most valuable would surely be personal supervision of every student, culminating in a thorough examination of every program produced and in its discussion and correction. That this would be infeasible even with unlimited means follows from the acute shortage of teaching staff with precisely the wealth of programming experience that is sought. I have therefore taken to writing complete programs myself, to the best of my knowledge and conscience, and distributing them as models, for the time being experimentally in a course on the foundations of compilers. These programs are then to be extended, for example by statements that provide a more skilful kind of error handling, or by sections that process sentence constructions newly added to the language. Alternatively, the models can be modified, for example by changing a compiler that generates code for computer A so that it produces code for a computer B. The advantage of using such models lies not only in the reduction of laborious coding, but above all in the fact that the student can hold on to a model with respect to structure and programming style, and is thus kept from wrong turns from which, for lack of time, he can no longer find his way back, even once he has become aware of the mistakes made. The teacher is left with the hope that the exemplary programming style of the model will in time be passed on by osmosis (see also [11]).
I particularly emphasize programming style here: the clean, appropriate, and clear organization of the program text. In it the clarity of the thoughts that accompanied the creation of the program reveals itself. It determines the program's readability and usefulness. It is the quintessence of that which cannot be taught in lectures and books, to which one tries in vain to apply numerical measures, yet whose presence or absence the experienced programmer recognizes without hesitation.
I regard it as indispensable for every prospective systems programmer, indeed for every computer science student, to carry out an independent piece of work in which, without models, as many phases of a software engineering project as possible are worked through. Where possible this should also be done in groups. Only in this way does the importance of clean, clear specifications become apparent to the participants. The idea of modularity and of the "interface" then does not remain merely an abstract property and problem of the program, but shows itself concretely in the division of the work and in communication between people.
I consider such software projects not only beneficial for the student, but also stimulating and necessary for a computer science institute, in order to stay in touch with the difficulties of practical implementation work. Such in-house products also offer ideal opportunities for experimenting with new ideas (which are thereby often discarded before they are published!). Building a system oneself gives enough familiarity with its internal structure and workings to remove any reluctance to modify and extend it. Experiments thus become possible from which one would otherwise shrink.
The demands of professional honesty do require, however, that these products are not conceived merely as experimental objects. It has, for example, almost become fashionable to build compilers at universities; all too often, important aspects are then almost completely ignored once the effort actually required has been recognized. Handling of errors in the input text, the problems of generating good code for real computers, and integration into existing operating systems are problem areas that are only too readily left aside. But one thereby denies oneself insight into central aspects of compiler construction and easily acquires a tendency to underestimate the effort needed to produce complete, that is, practically usable systems. The decision to build a practical software system at a university institute must therefore not be taken lightly. As in industry, it should be preceded by a project analysis. I know from my own experience that building a usable, high-quality compiler easily requires 5 to 10 times the effort of a corresponding so-called "study object". It soon exceeds the scope of university institutes; projects that extend over several years and require larger working groups have their rightful place in industry, since without the background of concrete economic realities they all too easily degenerate into sterile monsters.
So far we have tacitly assumed that systems programming is a genuine concern of university education. But is this an axiom? If we listen to the political demands of the recent past, then certainly not. They accuse the universities of having lowered themselves to servants of industry and of adapting education, in a purely utilitarian way, to the wishes of industry. As is well known, industry has not only a need for, but indeed a shortage of, well-trained systems programmers. Are we therefore dealing here with a prime example of the criticized phenomenon?
On the one hand we may state that an institute of technology or technical university without any orientation toward the actual conditions in industry would without doubt be an institution estranged from its purpose. On the other hand, industry may not expect that graduates have already received specialist training that permits their deployment without further detailed training. In fact, employers today care much more that the university imparts a schooling in analytical, constructive, and critical thinking than that it hands out particular specialist knowledge. Already in the computer industry, engineers with the ability to assess and tackle tasks correctly are preferred to computer scientists, even when the latter are rich in theoretical knowledge and firmly believe that their highest task is the design of a new programming language and the conception of a compiler.
These considerations perhaps give us reason, after all, to widen our field of view somewhat when taking stock of our position. They suggest that it is more important to give solid foundations to broad groups than to train many software specialists. One will inevitably contrast the notions of education (Bildung) and training (Ausbildung). And precisely if one takes it to be the principal task of a school to impart education, the capacity for independent and critical thought, one also recognizes the true task and opportunity of computer science, and of programming in particular. It lies not so much in the technique of using computers as in the schooling of thought. Here we find the computer less in the role of the object of study than in the role of the tool, or at least in a combination of these roles. As in no other constructive discipline, we are forced here to describe processes which, despite or because of their enormous complexity, must be thought through down to the last detail. We are obliged to express ourselves exactly, and must recognize that every lack of clarity, every lack of discipline, every evasion of the effort to find clear solutions is doomed to failure. We must force ourselves to refrain from "clever" solutions if we do not fully comprehend them [2]. The computer appears in the role of the intellectual challenge.

A difficult but central task is to show the difference between good and bad solutions, between hastily assembled and carefully thought-out ones, and to educate the programmer to strict self-criticism. Herein our discipline also has a unique opportunity. Herein we recognize the central role of the new ideas of "structured programming" and of "analytic program verification". In this sense they are fruitful, even if they have so far been worked out on small textbook examples and still partly fail on complex system programs. If the programmer is brought up to regard his products not merely as dead, purpose-bound computer instructions but as small "works of art" that convey a feeling of satisfaction through their elegance, then he has acquired something that serves more than his career. It is exactly this spice of professional integrity and of love for the craft, for perfection, that we so often miss in commercial software.
We thus acknowledge that basic concepts and discipline in construction, rather than detailed technical knowledge, should be in the foreground of the education of the systems programmer. We thereby even show some sympathy for the slogan "schooling in thinking before training in specialist knowledge". All the more disconcerting, then, must we find the almost uncritical, widespread adoption of programming languages, compilers, terminologies, and patterns of thought for teaching purposes. It is by now well known that many of these worldwide and often company-specific languages and terminologies are outdated, unsystematic, and often unsuitable. Nevertheless, academic circles lack either the courage or the ability to depart from the usual and to show new ways. Even where old tools run diametrically counter to new insights, desperate attempts are made to suppress the conflict instead of resolving it. Consider, for example, the quite seriously presented proposals for "structured programming with Fortran or Basic". In these conditions the dictate of industry reveals itself most alarmingly. It is a dictate to which industry itself is also subject and which in the end will not be to its benefit. It is a dictate that arose from the rapid development of computer applications, from the purely economic indispensability of certain standards, norms, and nomenclatures, and from the impossibility of bringing order and an overall view into the multitude of arising problems in the short time available.
Systems programming from the point of view of the university: education toward clear, systematic, constructive thinking; orientation toward concrete problems of practice; concentration on the essential, without forced conformity to methods and tools that run counter to these principles.
References

1. F.P. Brooks, jr., "The Mythical Man Month", Prentice-Hall, 1974.
2. E.W. Dijkstra, "The humble programmer", Comm. ACM, 15, 859-866 (Oct. 1972).
3. G.E. Forsythe, "A University's educational program in computer science", Tech. Rep. CS 39, Stanford University, May 1966.
4. --, "What to do until the computer scientist comes", Amer. Math. Monthly, 75, 454-462 (May 1968).
5. R.W. Hamming, "One man's view of Computer Science", J. ACM, 16, 3-12 (Jan. 1969).
6. P. Naur, B. Randell, Ed., "Software Engineering", Nato Science Committee Report, Jan. 1969.
7. P. Naur, "Datalogi Zero - a freshman course of computer science", Data 5/74, 49-51.
8. E. Schieferdrucker, "Planung von Systemprogrammen", in Systemprogrammierung, Oldenbourg, München 1972.
9. G. Seegmüller, "Systems programming as an emerging discipline", Information Processing 74, ~, 419-426 (North-Holland).
10. N. Wirth, "Die Computer-Wissenschaften und die heutige Verwendung der Computer", Universitas, 24, 371-384 (April 69).
11. --, "Program development by stepwise refinement", Comm. ACM, 14, 221-227 (April 1971).
System Programming Languages and Structured Programming
(Systemprogrammiersprachen und strukturiertes Programmieren)

Gerhard Goos, Universität Karlsruhe

Summary

System programming languages are high-level programming languages whose data structures and basic operations are tailored to the special needs of the systems programmer. After a survey of the features such languages currently offer, some problems still under development are discussed, in particular in the structuring of data and the description of interfaces.

1. INTRODUCTION
Complete generality of programming implies complete absence of structure.
P. Brinch Hansen

High-level programming languages were developed above all to relieve the programmer of the burden of working with a machine language or a machine-oriented assembly language. Instead, language elements are provided that are meant to ease the formulation of algorithms within the terminology and way of thinking of the respective application area; we therefore also speak of problem-oriented programming languages. Every such programming language hides a number of properties of the underlying machine and in return permits new, "higher" constructions. For example, conditional statements and loops make the use of jump instructions largely superfluous; the linear storage structure is hidden behind stack structures or a heap as in ALGOL 68. Some of the new properties are provided by the operating system, for example files in place of external storage media. If one disregards the way a property is implemented, here by software, whereas the properties of the base machine are realized by hardware or at least by microprogramming, then the programming language defines a new "abstract machine". These abstract machines, incidentally, have so far proved considerably more stable than the hardware: for the FORTRAN programmer the computer has changed far less drastically since the days of the IBM 650 than for the programmer in assembly language. All the more surprising is the carelessness still to be observed in the development of new programming languages, compared with the care and effort devoted to the development of new hardware.

Programming languages do not merely serve the technical need of communicating problem solutions to the computer. More important still, they largely shape the thinking habits of programmers. They do so already at a stage at which the programmer is not yet thinking of an explicit formulation in a language; possibilities that later cannot be expressed in the programming language, or only with difficulty, are eliminated from the thought process as early as possible in the design process.
According to J. Sammet [10], systems programming is the construction of programming systems, that is, systems which do not directly deliver solutions to application problems but form the basis for such solutions over wide areas of application. Put briefly, systems programming is the implementation of interfaces on which other programs build or, as we expressed it above, the implementation of abstract machines.

For a long time systems programming seemed conceivable only in assembly languages. The general opinion was that the properties of the hardware, whether the storage structure or special instructions, had to be exploited so heavily that only a programming language whose abstract machine coincided with the real hardware could be considered. It was also claimed that only in this way could "optimal" use of the limited main storage be guaranteed.

The emergence of systems programming languages, that is, of high-level programming languages for systems programming, results from the observation that the systems programmer does not in fact fully exploit the tool of assembly language. He writes conditional statements and loops, for example, as constantly recurring instruction sequences that could just as well be generated mechanically. In the form of subroutine libraries he designs composite operations which, taken together, characterize new data structures. Altogether he thereby obtains programming languages, which, however, he translates into assembly language by hand and which, moreover, are not standardized but are developed anew every day, for every problem, by countless programmers. The considerations on systematic program development, hierarchical program construction and similar principles, nowadays summarized under the term "structured programming", have made the common features of these ad hoc languages stand out ever more strongly; these features have then found expression in language constructs that no longer lie within the framework of assembly languages.

More serious still is the insight to be drawn from the layered construction of systems, as first demonstrated paradigmatically by Dijkstra [6]. Dijkstra showed that only in the lowest layer of his program structure does the systems programmer need a language tailored exactly to the hardware. Above that layer he moves away from the hardware even within operating systems, and all the more so in compilers. It is therefore justified to use languages whose control flow and data structures rest on an abstract machine that deviates from the hardware and which resemble high-level programming languages. If it is then also possible, for example by means of open subroutines and by exploiting special implementation properties, to make the programming of special hardware functions accessible, one is already very close to the goal of a programming language that can replace assembly languages and is, moreover, largely machine independent.

The development of systems programming languages began with the programming language NELIAC [7], an ALGOL 58 dialect whose compiler was written in its own language.
A very early systems programming language that is still significant today is ESPOL [2], developed by Burroughs from 1963 onward for the machines of the B5000 and later the B6000 series. ESPOL contains ALGOL 60 as a sublanguage and demonstrated early on, albeit without reflection, many development principles that are still of general interest today. The general development then began with PL360 [12]. Step by step, first the control flow of real machines, then linear storage addressing [14, 3] and finally the binary coding of data objects [3] were replaced by corresponding constructions of high-level programming languages (cf. Fig. 1). Current efforts aim to get a grip on the still underdeveloped area of data structures and to improve the measures for protecting data objects and operations against unauthorized access. The latter is connected with the more general task of capturing in a model the composition of programs from individual modules, with the goal of arriving at interface descriptions that the compiler can also check for semantic consistency, as far as that is possible at all with the means of a programming language.

2. SYSTEMS PROGRAMMING LANGUAGES TODAY

Before we turn to the possible future development of systems programming languages, it is useful to survey the stock of essential language properties on which this development can build. Given the introductory remarks it is not surprising that this is at the same time an inventory of desirable properties of high-level programming languages. The notation of the individual elements is of secondary importance; baroque excrescences, such as widely differing notations for the various forms of the repetition statement, are of course to be judged negatively, but they occur frequently.

2.1 Program structure and control flow

The widely accepted basis of program structure today is the division into blocks and procedures. This block structure serves goals that differ in substance and that give rise to a number of constraints on program construction (cf. Fig. 2). For example, a procedure that serves only as an abbreviation for a sequence of statements may have arbitrary side effects on its environment.
New basic operations may have side effects on the program layer in which they are defined, as the example of storage allocation and release procedures and the changes of storage level they cause shows; they must not, however, have side effects at the abstraction level of the caller of such operations. Independent program modules, finally, should cause changes to the outside only through explicitly defined parameters and a possible function result, in order to achieve maximal independence from the environment.

To control the sequential execution of a program we need means (Fig. 3) for expressing sequence, temporal independence (collaterality), selection among several alternatives, the enumerating and the iterating loop, the call of a procedure, and the exit from a compound statement (section, block, procedure) when the goal condition has been reached. Despite all arguments to the contrary, the jump instruction must in addition be retained for the systems programmer, so that a smooth transition from the level of hardware programming to higher language elements is possible. The constructions collected in Figure 3 can of course still be simplified; A denotes either a single statement or a "section" (a group) of several statements, which may be preceded by declarations.

2.2 Data modes
Simple data objects can be characterized by their size (number of bits). Together with the address, such a type-free characterization permits access to the object; for further operations, however, one must know by which coding rule the objects are to be interpreted. Quite apart from the fact that it is tiresome to have to remember constantly that 0, 1, 2, 3, 4, 5 stand, in this order, for the colours red, yellow, green, blue, violet, purple and can be coded in 3 bits, this throws the door wide open to misinterpretation. Characterizing simple objects by data types, which determine not only the size but also the coding and the permissible operations, is therefore gaining ground in systems programming languages as well (Fig. 4). One should distinguish carefully between the size specification, which also gives the processing width of operations, and subrange specifications, which state that an object can be stored in less space even though on access it is immediately extended to the processing width (e.g. register length).

The possibilities offered by data types do not, however, suffice for all tasks of systems programming. Difficulties arise above all with storage management algorithms, which for example map stacks or heaps onto linear storage and assign changing decoding rules, that is, different data types, to the same storage area. The most common way out at present is to introduce a data type word, which additionally opens up the possibilities of type-free languages. This of course reopens the Pandora's box that the introduction of data types had just closed. It therefore makes sense to introduce program modules that are syntactically marked as "unsafe" and to permit word-wise access only in these. Besides increasing reliability, such a procedure enforces greater clarity about the objects one is dealing with.
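The colour example and the idea of confining word-wise access to a few "unsafe" routines can be sketched in a modern language. The following Python fragment is only an illustration under assumptions: the names `Colour`, `pack` and `unpack` are invented here, and Python's `IntEnum` stands in for a data type that fixes coding and permissible values.

```python
from enum import IntEnum


class Colour(IntEnum):
    """Six colours coded 0..5, as in the text; the names are illustrative."""
    RED = 0
    YELLOW = 1
    GREEN = 2
    BLUE = 3
    VIOLET = 4
    PURPLE = 5


BITS_PER_COLOUR = 3  # six values fit into 3 bits


def pack(colours):
    """'Unsafe' routine: store typed colours in one untyped word."""
    word = 0
    for position, colour in enumerate(colours):
        word |= int(colour) << (position * BITS_PER_COLOUR)
    return word


def unpack(word, count):
    """'Unsafe' routine: reinterpret an untyped word as typed colours."""
    mask = (1 << BITS_PER_COLOUR) - 1
    return [Colour((word >> (i * BITS_PER_COLOUR)) & mask) for i in range(count)]


word = pack([Colour.GREEN, Colour.PURPLE])
print(unpack(word, 2))
```

Outside `pack` and `unpack` the program handles only `Colour` values; a bit pattern that is not a valid colour code (6 or 7) is rejected with a `ValueError` at the point where the word is reinterpreted, rather than being silently misread later.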
Among composite data objects we distinguish arrays of one or more dimensions, whose elements are selected by computable indices, and records, whose elements are addressed by fixed, predefined identifiers. In systems programming it is worthwhile to distinguish conceptually between arrays with and without a descriptor; the latter have a size that can be computed at compile time. PASCAL additionally offers sets as objects, as well as sequential structures of unlimited size, which are intended specifically as a model for data on backing store. Finally, in this language the elements of composite objects can be "packed" to save storage. Together with set objects, this facility is the essential prerequisite for not needing any notation for bitwise access to storage. For building more complicated structures, only the reference concept, that is, the imitation of addresses, is currently available in addition, so that we arrive at the summary in Fig. 5.

The development of descriptive aids for data structures is marked by a remarkable lack of logic. Under stepwise refinement of the program design, data structures arise in the very first steps. Nevertheless, for modelling with arrays, records and references, programming languages offer only concepts that generally reflect the physical adjacency of the elements or the pointer structure, that is, machine- and implementation-oriented concepts which should play no role at all at this level and ought to be eliminated. (The relational models and other descriptive aids developed for database design show a possible alternative.)
On the other hand, for the global organization of storage allocation no means other than stack and heap organization have been available so far. Everything else has to be simulated with the aid of arrays, although for this purpose that tool is in turn pitched too high. (The only more far-reaching attempt so far, the storage organization of the AED system (Ross [9]), has proved itself but has found no successors.)

In connection with linked structures of composite objects, for instance list structures, there remains the question of how to arrive at a uniform mode indication for records with components of differing mode (cf. Fig. 6). If one assumes that each record, or each record variable, permits only one fixed component mode, while different records of the same mode may have different component modes, one arrives at the method of record variants used by PASCAL. ALGOL 68 [11], with its union of modes, allows the component mode to change during the lifetime of the variable. In both cases the set of modes to which the components may belong is known a priori.

This is different in SIMULA [4]. Here a class A (comparable to a record mode) can be used as the prefix of the declaration of one or more classes B. This causes all components of A to be included in class B. Conversely, a reference to class A can then also refer to objects of class B. When class A is defined, the classes B need not be known even as to their number. We can use this technique, for example, to define first all those record components which, for a particular storage management strategy, are to be common to all records. This can happen in a lower program layer than the definition of the remaining record components, which takes place, say, in changing user programs. Neither in PASCAL nor in ALGOL 68 is such a procedure possible.
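Class prefixing as described for SIMULA corresponds closely to subclassing in later object-oriented languages. The sketch below, in Python, is an illustration under assumptions: the class names `Node`, `NumberElement` and `SublistElement` and the routine `mark_all` are invented, and the "storage management" is reduced to setting a mark.

```python
class Node:
    """Lower layer: components every list element needs for storage management."""

    def __init__(self):
        self.mark = False   # used by a (hypothetical) marking pass
        self.next = None    # link to the next Node


def mark_all(first):
    """Lower-layer routine: walks the list knowing only the common prefix."""
    count = 0
    node = first
    while node is not None:
        node.mark = True
        count += 1
        node = node.next
    return count


# Higher layer, written later: extends Node without the lower layer knowing.
class NumberElement(Node):
    def __init__(self, value):
        super().__init__()
        self.value = value


class SublistElement(Node):
    def __init__(self, sublist):
        super().__init__()
        self.sublist = sublist


inner = NumberElement(1.5)
outer = SublistElement(inner)
outer.next = NumberElement(2.5)
print(mark_all(outer))  # counts the elements reached via the common link
```

The lower layer is complete before any of the extending classes exist, which is the property the text finds missing in PASCAL variants and ALGOL 68 unions, where the set of alternatives must be fixed in advance.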
3. PROGRAM AND DATA MODULES
By (program) modules we mean pieces of program with the property that assembling such modules into larger units requires no knowledge of the inner workings of the individual modules, and that the correctness of the individual modules can be verified without knowledge of their embedding in the overall program. This characterization (after Dennis [5]) says that both the internal structure of the module and its external use should depend only on a precisely defined interface. The resulting larger units should again be viewable as modules; the notion of module should therefore be recursive.

In the usual programming languages, only procedures are admitted as a linguistic means of expressing program modules. In many cases these are not sufficient. More general modules are needed
- when module-specific data must be initialized before the first call, or at least preserved between two calls,
- when several procedures operate on common data and therefore form a more comprehensive module together with that data,
- when several procedures build on common auxiliary functions that must be counted as part of the program module.
Examples are collections of procedures that together realize the various access functions on a file or on some other data structure. I know of no examples of modular program construction in which the modules cannot be characterized in this form.

Such a general program module has (n+1) entries, all of which are procedure calls. The first call leads to module construction; it defines and initializes the local data of the module. The further calls activate the module or, more precisely, one of the procedures of the module in each case. The duration of these activations is to be distinguished from the lifetime of the module as a whole. The latter begins with module construction and spans all possible activations. A slight generalization of the classes of SIMULA [4] is suitable for the linguistic formulation (cf. Fig. 7). Figure 8 shows the definition of a stack for integers.

The study of further examples also shows that data modules, that is, descriptions of the implementation of a data structure, are among the most frequent applications of program modules. At the same time this opens up the possibility of handling storage organization with such program modules (the tool would then be a local array of words) and thus of freeing oneself from the restrictions we criticized in the last section.

In terms of implementation, modules fit readily into the usual stack-like storage organization, provided one agrees that the lifetime of a module ends at the end of the block in which the module was constructed. To avoid the often considerable organizational overhead of a procedure call, it is advisable to permit, optionally, open (inline) insertion of the externally accessible procedures. This measure proves its worth above all when the access functions on data structures are realized by such procedures.

If each module and the procedures it calls are assigned a stack segment of their own, independent sequential processes or coroutines can be realized. The latter can, incidentally, also be embedded in the normal stack organization, provided the sections of the coroutines are formulated as procedures, as can be seen from Figure 9.
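The last remark, that coroutines fit into an ordinary stack organization once their sections are written as procedures, can be illustrated as follows. This Python sketch is an assumption-laden illustration: the class `Producer`, its three sections and the wrap-around from the last section to the first are invented; only the idea of a procedure variable holding the next entry point is taken from the text.

```python
class Producer:
    """A coroutine whose sections are ordinary procedures.

    `entry` plays the role of a procedure variable: it always names the
    section to run on the next activation.
    """

    def __init__(self):
        self.log = []            # local data, kept between activations
        self.entry = self._p1    # initialization: start with section 1

    def resume(self):
        """Activate the coroutine: run the current section, then return."""
        self.entry()

    def _p1(self):
        self.log.append("section 1")
        self.entry = self._p2

    def _p2(self):
        self.log.append("section 2")
        self.entry = self._p3

    def _p3(self):
        self.log.append("section 3")
        self.entry = self._p1    # wrap around (an assumption of this sketch)


producer = Producer()
trace = []
for step in range(4):
    producer.resume()                    # first coroutine runs one section
    trace.append(f"caller step {step}")  # the caller acts as second coroutine
print(producer.log)
```

Each activation is a normal call that returns, so no separate stack segment is needed; the price is that a section cannot be suspended in the middle, only at its end.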
Another extension of the concept is formed by the monitors introduced by Brinch Hansen and Hoare (cf. [1]). Here at most one module call can be active at any time. If, in a system of processes, further calls of the module occur during this activation, they are placed in one or more module-specific queues and delayed until the current activation has ended.

4. INTERFACE DESCRIPTIONS

The preoccupation with small example programs, customary and necessary in teaching programming, has so far often blocked the view of those properties of programming languages whose necessity becomes visible only with very large programs and with work in a team. This holds not only for programming language developments in the academic sphere. A. Perlis is credited with the remark that he had never seen an ALGOL program run, but always only a program system in which the ALGOL program formed a submodule. Nevertheless programming languages cling to the fiction of the "main program". Separately compiled procedures can indeed be attached to it, but the mechanism is mostly single-level, as in FORTRAN, and in no way allows the tree-like relationships in assembling submodules into larger modules to be expressed.

For systems programming languages it seems appropriate to distinguish only (separately compilable) procedures and more general program modules. In addition there is a "structure program", which describes the assembly and initialization of the module system. Structure programs are distinguished by the fact that they show the gaps into which the other modules are to be inserted. Since the result as a whole is again to be a module, structure programs can be distinguished from other modules only conceptually, not syntactically (cf. Fig. 10).

As already indicated in Figure 8, each module defines the procedures, submodules, mode declarations and objects that are accessible from outside, that is, public. Conversely it must contain a description of all external quantities that are used in this module but not defined in it. Technically, this description of external quantities is a preloading of the compiler tables for the compilation of the module; it therefore also gives information about the mode of objects and so on, and is not exhausted by a mere listing of identifiers, as is usual in assembly languages.

While these techniques are already customary in a few languages, almost all languages lack the next step, namely supervision of exact adherence to the interface definition with the help of the compiler or the linker. For this it is necessary to compare the declarations of external quantities in module A with the original declarations (as public quantities) in modules B, C, D, ... A study in a software department [8] shows that such a check, which in particular also covers the number and modes of procedure parameters, allows one third of the time spent searching for run-time errors to be intercepted already at compile time. One way of reducing the not inconsiderable cost of mutually checking the interface descriptions is to repeat the entire description of the external and public quantities of the modules in the structure program or, if the structure program can be assumed to be compiled before the submodules, to move it into the structure program altogether. The consistency check then takes place in several steps. On the one hand the consistency of the individual interface descriptions within the structure program must be checked; on the other hand the agreement of the corresponding descriptions in the structure program and in the submodule must be verified.
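The check described here, comparing what one module declares as external with what another declares as public, can be sketched in a few lines. The Python fragment below is an illustration under assumptions: the table layout, the module names "A" and "B" and the signatures are invented, and a real compiler or linker would of course work on its own symbol tables.

```python
# Each module lists the names it makes public and the names it expects
# from outside, together with a signature (parameter modes, result mode).
# Module names and signatures are invented for this illustration.
MODULES = {
    "A": {
        "public": {},
        "external": {"push": (("int",), None), "depth": ((), "int")},
    },
    "B": {
        "public": {"push": (("int",), None), "depth": ((), "real")},
        "external": {},
    },
}


def check_interfaces(modules):
    """Return a list of mismatches between external and public declarations."""
    public = {}
    for name, module in modules.items():
        for ident, signature in module["public"].items():
            public[ident] = (name, signature)

    errors = []
    for name, module in modules.items():
        for ident, expected in module["external"].items():
            if ident not in public:
                errors.append(f"{name}: '{ident}' is not public in any module")
                continue
            owner, actual = public[ident]
            if actual != expected:
                errors.append(
                    f"{name}: '{ident}' declared as {expected}, "
                    f"but {owner} defines {actual}"
                )
    return errors


for message in check_interfaces(MODULES):
    print(message)
```

In this example module A expects `depth` to deliver an integer while module B defines it with a real result, so the mismatch is reported before anything runs. Moving both tables into the structure program, as the text suggests, would let the same comparison be made in one place.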
5. ACCESS RESTRICTIONS

The repetition of the interface description in the structure program mentioned above now also allows the writer of the structure program to intervene further in the possibilities of communication between different submodules. By listing or not listing a quantity as an external quantity in the interface description of a submodule B, it is determined to which submodules the scope of this quantity, defined as public in a submodule A, extends. Our earlier characterization of program modules makes it understandable why this determination comes from the structure program and not from submodule A. Such a mechanism permits precise control of the access rights of the individual modules and thus codifies in the program agreements among programmers that could otherwise be recorded only orally or in the program documentation.

A comparison with similar security measures in file systems shows, however, that there are possibilities for further differentiation whose transfer to programming languages appears useful. As a minimum one should distinguish between permission to read an object and permission to modify the object (assignment, possibly after reading). Further possibilities, whose realization we do not go into here, would be, for instance, permission to write into a linked list or a similar structure without changing the shape of the linked structure, or permission to enlarge a linked structure but not to shrink it, and so on.

We consider the problem using parameter passing for procedures. Technically this can be reduced to two cases: passing an address of the actual parameter, and passing a copy of the value of the object appearing as actual parameter. Usually these two possibilities are equated with permission to read the actual parameter and to make assignments to it (write permission), and with permission only to read the value (read permission), respectively. This view finds its extreme expression in the reference concept of ALGOL 68. It is overlooked that constants also have an address and that, particularly for composite objects, saving the copy by passing an address without write permission would represent considerable progress. More interesting still would be the possibility of passing the address of a variable without at the same time permitting writing. Future systems programming languages should, not only for procedure parameters but in all kinds of interface descriptions, carefully separate the distinction between address and value of an object from the distinction between write and read permission.
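The separation argued for here, address versus value on one axis and read versus write permission on the other, exists in some later languages and libraries. As a hedged illustration, the Python sketch below uses the standard `memoryview` type, whose `toreadonly()` method (available since Python 3.8) yields a view of the same storage without write permission; the function names `reader` and `writer` are invented.

```python
buffer = bytearray(b"\x00" * 4)   # a composite object owned by the caller


def reader(view):
    """Receives the 'address' of the buffer without write permission."""
    return sum(view)


def writer(view):
    """Receives the 'address' of the buffer with write permission."""
    view[0] = 17


writable = memoryview(buffer)          # address plus write permission, no copy
read_only = writable.toreadonly()      # same address, read permission only

writer(writable)
print(reader(read_only))               # sees the change: nothing was copied

try:
    writer(read_only)                  # writing through the read-only view
except TypeError as error:
    print("rejected:", error)
```

Both views refer to the same four bytes, so the reader avoids the cost of a copy, yet an attempted assignment through the read-only view is refused. Passing `bytes(buffer)` instead would correspond to the classical value parameter: read-only, but at the price of copying.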
References

[1] Brinch Hansen, P., "Operating System Principles". Prentice-Hall, Englewood Cliffs, N.J., 1973.
[2] Burroughs, "ESPOL Language, Information Manual". Burroughs Corp., Detroit, Form 5000094.
[3] Clark, B.L. and Horning, J.J., "The System Language for Project SUE". SIGPLAN Notices vol. 6 no. 9 (1971).
[4] Dahl, O.J., Myhrhaug, B. and Nygaard, K., "SIMULA 67, Common Base Language". Norwegian Computer Center, Oslo, 1967.
[5] Dennis, J.B., "Modularity". In: Bauer, F.L. (ed.), Advanced Course on Software Engineering. Lecture Notes in Economics and Mathematical Systems 81. Springer, Berlin-Heidelberg-New York 1973.
[6] Dijkstra, E.W., "The Structure of the "THE" Multiprogramming System". Comm. ACM 11, 341-346 (1968).
[7] Halstead, M.H., "Machine-Independent Programming". Spartan Books, Washington, D.C., 1962.
[8] Klunder, J., "Experiences with SPL". Working Paper, Conference on Machine-Oriented High Level Languages. Trondheim 1973.
[9] Ross, D.T., "The AED Free Storage Package". Comm. ACM 10, 481-492 (1967).
[10] Sammet, J.E., "A Brief Survey of Languages Used in Systems Implementation". SIGPLAN Notices vol. 6 no. 9 (1971).
[11] Wijngaarden, A. v. (ed.), "Report on the Algorithmic Language ALGOL 68". Num. Math. 14, 79-218 (1969).
[12] Wirth, N., "PL360, A Programming Language for the 360 Computers". Journal ACM 15, 37-74 (1968).
[13] Wirth, N., "The Programming Language PASCAL (Revised Report)". ETH Zürich, Berichte der Fachgruppe Computer-Wissenschaften Nr. 5, 1972.
Fig. 1: Detachment of systems programming languages (SPL) from the hardware

  Hardware property replaced                            Languages
  Control flow                                          PL360 and successors
  Storage organization (stack, array),
    coding of values                                    PASCAL and successors
  Data structures, access protection,
    interface description                               in development
Block structure

Blocks (BEGIN ... END) serve
- syntactic grouping,
- the designation of composite operations,
- the designation of new program layers that build on the facilities of the enclosing, underlying layer.

Blocks are equivalent to parameterless procedures.

Procedures

Procedures serve
- the grouping of frequently recurring statements (a question of efficiency),
- as new basic operations,
- as independent program modules (no side effects).

Fig. 2: Means of program construction
  Sequence:         A1 ; A2

  Collaterality:    A1 • A2

  Selection:        IF B THEN A1 ELSE A2 FI

                    CASE FORMEL OF
                      M1 : A1,
                      ...
                      MN : AN
                      OUT  A0
                    ENDCASE
                    (i.e. IF (FORMEL) = M1 THEN A1 ELSE ...)

  Enumeration:      FOR ZÄHLER LAUFLISTE, LAUFLISTE, ... DO A DONE
                    with a LAUFLISTE of the form FROM F1 BY F2 TO F3

  Iteration:        WHILE B DO A DONE
                    DO A DONE
                    DO A DONE UNTIL B

  Procedure call:   P
                    P(AP1, ..., APN)

  Exit:             M: BEGIN ... EXIT M WITH ERGEBNIS ... END

  Jump:             GOTO M

  (FORMEL: expression; ZÄHLER: counter; LAUFLISTE: range list; ERGEBNIS: result)

Fig. 3: Sequential control flow
  Types:        integers and floating-point numbers (REAL),
                with implicit upper bound or precision limit respectively

  Finite sets:  BOOL = (FALSE, TRUE)
                WOCHENTAG = (SONNTAG, MONTAG, ..., SAMSTAG)

  Subranges:    INT (0 : 13)
                WOCHENTAG (MONTAG : SAMSTAG)

Fig. 4: Data types
Composite objects

  Declaration                                              Selection
  ROW [1:100] INT A
  ARRAY [1:N, 1:M] BOOL B                                  B[J,K]
  STRUCT (REAL REALTEIL, IMAGINAERTEIL) C                  C.REALTEIL
  STRUCT (PACKED (INT(1:100) JAHR, INT(1:12) MONAT,
                  INT(1:31) TAG),
          STRUCT (PACKED (INT(0:23) STUNDE,
                          INT(0:59) MINUTE)) UHRZEIT) D    D.UHRZEIT.STUNDE
  SET WOCHENTAG ARBEITSTAGE                                IF MONTAG IN ARBEITSTAGE THEN ...
  FILE CHAR TEXT

  For building linked structures: REF DATENART,
  e.g. ... = STRUCT (REAL INHALT, REF ... NEXT)

Fig. 5: Data structures
Problem: representation of the elements of a list in which each element carries a mark (MARKE) and a pointer (ZEIGER) and holds either a number or a sublist.

  PASCAL:
    LISTELEMENT = RECORD
                    MARKE : BOOL;
                    ZEIGER : ↑LISTELEMENT;
                    CASE KENNZEICHEN : (ZAHL, UNTERLISTE) OF
                      ZAHL : (X : REAL);
                      UNTERLISTE : (X : ↑LISTELEMENT)
                  END

  ALGOL 68:
    LISTELEMENT = STRUCT (BOOL MARKE, REF LISTELEMENT ZEIGER,
                          UNION (REAL, REF LISTELEMENT) X)

  SIMULA:
    CLASS A; BEGIN BOOLEAN MARKE; REF(A) ZEIGER; END;
    A CLASS LISTELEMENT1; BEGIN REAL X; END;
    A CLASS LISTELEMENT2; BEGIN REF(A) X; END;
    REF(A) AA;
    AA :- NEW LISTELEMENT1;
    (AA QUA LISTELEMENT1).X := 5;
    INSPECT AA WHEN LISTELEMENT1 DO X := X+1

Fig. 6: Records with components of variable mode
  MODULE MODULBEZEICHNER (formal parameters for module construction):
  BEGIN
    declarations of
      local data
      local procedures and submodules
      externally accessible procedures and submodules;
    statements for initialization
  END

Fig. 7: General program modules
  MODULE STACK (INT MAXIMALE TIEFE, PROC ÜBERLAUF, PROC INT KELLER LEER):
  BEGIN
    ARRAY [1:MAXIMALE TIEFE] INT K;
    INT TIEFE;
    PUBLIC PROC PUSH (INT A):
      IF TIEFE = MAXIMALE TIEFE THEN ÜBERLAUF
      ELSE TIEFE := TIEFE+1; K[TIEFE] := A FI;
    PUBLIC PROC POP:
      IF TIEFE = 0 THEN KELLER LEER ELSE TIEFE := TIEFE-1 FI;
    PUBLIC PROC VAL INT:
      IF TIEFE = 0 THEN KELLER LEER ELSE K[TIEFE] FI;
    PUBLIC PROC DEPTH INT:
      TIEFE;
    INITIALISIERE: TIEFE := 0
  END

  STACK (10, ÜBERLAUF, KELLER LEER) S;
  S.PUSH (17)

Fig. 8: A module for stack organization
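For comparison, the stack module of Fig. 8 can be written as a class in a present-day language. The following Python sketch is a free rendering under assumptions, not a transcription: the English names, the zero-based array and the use of callables for the two error procedures are choices made here.

```python
class Stack:
    """A module with local data and several public procedures (cf. Fig. 8)."""

    def __init__(self, max_depth, on_overflow, on_empty):
        # Module construction: define and initialize the local data.
        self._max_depth = max_depth
        self._on_overflow = on_overflow   # called when the stack is full
        self._on_empty = on_empty         # called when the stack is empty
        self._items = [0] * max_depth     # local array K
        self._depth = 0                   # local variable TIEFE

    def push(self, value):
        if self._depth == self._max_depth:
            self._on_overflow()
        else:
            self._items[self._depth] = value
            self._depth += 1

    def pop(self):
        if self._depth == 0:
            self._on_empty()
        else:
            self._depth -= 1

    def val(self):
        if self._depth == 0:
            return self._on_empty()
        return self._items[self._depth - 1]

    def depth(self):
        return self._depth


events = []
s = Stack(10, lambda: events.append("overflow"),
          lambda: events.append("empty") or 0)
s.push(17)
print(s.val(), s.depth())
```

The constructor corresponds to the first of the (n+1) entries, module construction; `push`, `pop`, `val` and `depth` are the remaining entries. The leading underscore marks the local data by convention only, whereas the module notation of the paper would have the compiler enforce that only the PUBLIC procedures are visible.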
Temporal sequence: control alternates between KOROUTINE 1 and KOROUTINE 2.

  KOROUTINE 1:
  BEGIN
    PROC KOROUTINENEINGANG (* a procedure variable *);
    declaration of local data;
    PROC P1: BEGIN ... KOROUTINENEINGANG := P2 END;
    PROC P2: BEGIN ... KOROUTINENEINGANG := P3 END;
    INITIALISIERE: KOROUTINENEINGANG := P1
  END

Fig. 9: Realization of coroutines
[Diagram: the structure programs STRUKTPROG1 and STRUKTPROG2, each written as MODULE ... BEGIN ... END, contain declarations of the form MODULE STRUKTPROG2: INNER, MODULE TEILMODUL1: INNER and TEILMODUL2: INNER; arrows connect these declarations to the separately written modules STRUKTPROG2, TEILMODUL1 and TEILMODUL2.]

Fig. 10: Construction of programs from individual modules
Software Engineering or Methods for the Multi-Person Construction of Multi-Version Programs

Prof. Dr. D.L. Parnas
Technische Hochschule Darmstadt
Fachbereich Informatik, Betriebssysteme I
6100 Darmstadt, Steubenplatz 12
Abstract

This talk will describe some methods which have been used to produce a family of related software products using many relatively unskilled programmers.

The primary topics of the talk will be:
1. An interpretation of the word "structure" with regard to software;
2. Criteria to be used in decomposing software into modules;
3. Problems and techniques with regard to software module specifications.

The talk will be introductory in nature, emphasizing the desired properties of well engineered software systems and providing an overview of new methods which have proven useful in achieving those properties.
INTRODUCTION
The title of this paper is intended to suggest that software engineering is programming under at least one of the following two conditions: (1) more than one person is involved in the construction and/or use of the program, and (2) more than one version of the program will be produced.
By the above I intend to emphasize the fact that software engineering is not confined to the production of certain classes of programs (e.g. compilers, operating systems, file systems), but is present whenever we are not in the situation of writing a program exclusively for our own use (solo programming).

In producing software we find three problems which are not significant in the solo programming situation:
(1) How to divide the job of producing the software among subgroups.
(2) How to specify to the user and to the various subgroups the exact behaviour demanded of each.
(3) How to communicate information about the occurrence of errors between the various program parts and to the user.
It should be clear that these p r o b l e m s are not p r e s e n t in the "solo-programming"
s i t u a t i o n and that c l a s s i c a l p r o g r a m m i n g
t e x t b o o k s do not treat them.
The third p r o b l e m is p a r t i c u l a r l y
i n t r a c t a b l e as even the latest in p r o g r a m m i n g t e c h n i q u e s p r o c e e d on the a s s u m p t i o n that e v e r y t h i n g will go well.
In this paper we will b r i e f l y r e v i e w some t e c h n i q u e s w h i c h have been d e v e l o p e d for software e n g i n e e r i n g situations,
and
the r e s u l t s of p r e l i m i n a r y e x p e r i m e n t to e v a l u a t e the v a l u e of those techniques.
These results are not newt having been
p r e s e n t e d e a r l i e r in [I, 2, 3, 4, 5, 6]; they will be p r e s e n t e d again as c o n c i s e l y as possible.
The last section of this paper
w i l l p r e s e n t the result of m o r e recent a t t e m p t s to e v a l u a t e the t e c h n i q u e s by a p p l y i n g them to the d e s i g n of o p e r a t i n g systems.
What is a Well Structured Program?
Before we can proceed we must explore the use of the word "structure" when discussing programs. The word "structure" is used to refer to a partial description of a system. A structure description shows the system divided into a set of modules, and specifies some connections between the modules. Any given system admits many such descriptions.
Since structure descriptions are not unique, our usage of "module" does not allow a precise definition (parallel to that of "subroutine" in software and of "card" in hardware). The definitions of the latter words delineate a restricted class of objects in a way that the definition of "module" does not. Nevertheless, "module" is useful in the same manner that "unit" is in military or economic discussions. We shall continue to use "module" without a precise definition. It refers to portions of a system indicated in a description of that system. A precise definition is not only system dependent, it is also dependent upon the particular description under discussion.
The term "connection" is usually accepted more readily. Many assume that the "connections" are control transfer points, passed parameters, and shared data for software, wires or other physical connections for hardware. Such a definition of "connection" is a highly dangerous oversimplification which results in misleading structure descriptions. The connections between modules are the assumptions which the modules make about each other. In most systems we find that these connections are much more extensive than the calling sequences and control block formats usually shown in system structure descriptions.
The meaning of the above remark can be exhibited by considering two situations in which the structure of a system is terribly important: (1) making of changes in a system, and (2) proving system correctness. (I feel no need to argue the necessity of proving programs correct, or to support the necessity of making changes. I wish to use those hypothetical situations to exhibit the meaning of "connection".)
Correctness proofs for programs can become so complex that their own correctness is in question (e.g. [7], [8]). For large systems we must make use of the structure of the programs in producing the proofs. We must examine the programs comprising each module separately. For each module we will identify (1) the system properties that it is expected to guarantee, and (2) the properties which it expects of other modules. The correctness proof for each module will take (1) as the set of theorems to be proven and (2) as a set of axioms which may be used in proving that the programs do indeed guarantee the truths of the theorems. Eventually the theorems proven about each module will be used in proving the correctness of the whole system. The task of proving system correctness will be facilitated by this process if and only if the amount of information in the statement sets (1) and (2) is significantly less than the information in the complete description of the programs which implement the modules.
We now consider making a change in the completed system. We ask, "What changes can be made to one module without involving change to other modules?" We may make only those changes which do not violate the assumptions which other modules make about the module being changed. In other words, a single module may be changed only while the "connections" still "fit".
In both cases we have a strong argument for making the connections contain as little information as possible. Systems in which the connections between modules contain little information are labeled well structured.
Two Techniques for Controlling the Structure of Systems Programs
In studying the problem of producing well structured systems programs we have discovered that there are two basic functions which the designer must perform carefully in order to control the structure of his programs. The first of these is the division of the project into modules or work assignments (decomposition); the second function is precise specification of those modules.
Some informal experiments have revealed that there is a remarkable consistency in the way that programmers will divide systems into modules. For example, I have repeatedly asked programmers how they would go about dividing the project of producing a KWIC index program into work assignments. With very rare exceptions the programmers suggested a decomposition based on a flowchart description of the whole system. They are following the lessons that they received in their early training in programming. Programmers are almost invariably taught that the first step in producing a program is to write a "rough" flowchart and then proceed to detail each of the boxes in it. This is often an excellent strategy for a "solo" programming project, but, as demonstrated in [2], it is seldom a good procedure for dividing a project into work assignments.
The conclusion of the previous section was that a well structured program was one with minimal interconnections between its structure components or modules. Because information must be passed in large chunks (and using fixed conventions) between the various phases (boxes in the flowchart), the conventional approach results in modules which have quite strong interconnections. As was indicated in [2], systems which result from such a design are quite difficult to change in any significant way.
We can, however, forget our flowcharts and attempt instead to define our modules "around" assumptions which are likely to change. One then designs a module which "hides" or contains each one. Such modules have rather abstract interfaces which are relatively unlikely to change. If we succeed in defining interfaces which are sufficiently "solid" (i.e., they will not change), changes in the system can be confined to one module. The reader is referred to [2] for a deeper discussion of this point including a detailed example. Teaching experience described in [9] has shown that, at this point in time, such skills can only be taught by example providing simulated experience.
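The idea of defining a module "around" an assumption that is likely to change can be sketched in a few lines of Python. The sketch below is only an illustration loosely modelled on the KWIC index example mentioned above; the class and function names (ListLineStorage, PackedLineStorage, circular_shifts, build) are assumptions made here and are not taken from [2]. The assumption being hidden is the storage format of the lines, and the client sees only the three access functions.

```python
class ListLineStorage:
    """Hypothetical module: keeps every line as a list of words."""

    def __init__(self):
        self._lines = []

    def add_line(self, words):
        self._lines.append(list(words))

    def line_count(self):
        return len(self._lines)

    def word_count(self, line):
        return len(self._lines[line])

    def word(self, line, index):
        return self._lines[line][index]


class PackedLineStorage:
    """Same interface, different secret: one string plus (start, end) offsets."""

    def __init__(self):
        self._text = ""
        self._spans = []

    def add_line(self, words):
        spans = []
        for w in words:
            start = len(self._text)
            self._text += w
            spans.append((start, len(self._text)))
        self._spans.append(spans)

    def line_count(self):
        return len(self._spans)

    def word_count(self, line):
        return len(self._spans[line])

    def word(self, line, index):
        start, end = self._spans[line][index]
        return self._text[start:end]


def circular_shifts(storage):
    """Client code: relies only on line_count, word_count and word."""
    shifts = []
    for i in range(storage.line_count()):
        words = [storage.word(i, j) for j in range(storage.word_count(i))]
        for k in range(len(words)):
            shifts.append(" ".join(words[k:] + words[:k]))
    return sorted(shifts)


def build(storage_class, lines):
    """Fill a fresh storage module with the given lines."""
    storage = storage_class()
    for line in lines:
        storage.add_line(line)
    return storage
```

Because circular_shifts makes no assumption about how lines are stored, either storage class can be substituted without touching it, which is the sense in which a change is confined to one module.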
The second function of the designer is that of specification. There is good reason to believe [6] that the designer can obtain a system with the structure he suggested only if he has a way of precisely defining which assumptions the designers of one module are permitted to make about other modules with which they interface. In describing those assumptions (writing specifications for each module), the designer is walking a tightrope. If the programmer has too little information about the other modules, he will be unable to produce an efficiently working system. The need for sufficient information is obvious to almost everyone. The surprising side of the tightrope is that it is also wrong to provide too much information. If the excess information is used (i.e., if additional assumptions are made), the structure of the program will not be that intended by the designer.
The most common approach to software module specification is to reveal a rough description of the internal structure of the module. The next most common approach is to reveal a description of a "hypothetical" implementation of the module. Almost invariably we are told the structure of some real or imagined table (or procedure) which the user of the module does not have access to. Both of these approaches are fraught with danger. In the first, the system may become hard to change if the correctness of the interfacing programs depends upon the internal information which was revealed. In the second case the program may never work correctly, because the correctness depended on some of the "hypothetical" implementation which was never true. It is observable that it is extremely difficult to reveal some part of an implementation in order to be precise, and then instruct the reader precisely which parts of the revealed information he must not use.
We are now gaining some experience with a specification technique which takes as its goal the precise specification of externally visible aspects without suggesting internal constructions. The approach is to specify identities or relations between the externally visible aspects of the module rather than reveal the internal construction. Thus the specifications relate the externally visible functions to each other rather than to some real or imagined lower level machine. In syntax these specifications resemble programs, but the refusal to mention lower level or internal mechanisms distinguishes them from other forms of specifications or programs. The reader is referred to [3] for more discussion and detailed examples.
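To make the notion of "identities between externally visible functions" concrete, the following Python sketch checks a module against such identities without looking inside it. The stack module and the names push, pop, top, depth and satisfies_spec are hypothetical choices made for this illustration; the notation of [3] is not reproduced here.

```python
class Stack:
    """Hypothetical module under specification; the list is its secret."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        self._items.pop()

    def top(self):
        return self._items[-1]

    def depth(self):
        return len(self._items)


def satisfies_spec(make_stack, values):
    """Check identities that relate the visible functions to each other.

    Identities checked for every pushed value v:
      after push(v):      top() == v and depth() increased by one
      after push(v); pop(): depth() and top() are as before the push
    No statement is made about how the module stores its data.
    """
    s = make_stack()
    for v in values:
        d = s.depth()
        old_top = s.top() if d > 0 else None
        s.push(v)
        if s.top() != v or s.depth() != d + 1:
            return False
        s.pop()
        if s.depth() != d:
            return False
        if d > 0 and s.top() != old_top:
            return False
        s.push(v)  # keep v on the stack so later rounds start non-empty
    return True
```

Any implementation that passes satisfies_spec is acceptable to a user who relies only on these identities, whatever table or procedure it uses internally.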
Results
On the basis of the above considerations we have obtained quite satisfactory results in small scale experiments with undergraduate classes. For example, in the fall of 1971 we produced 192 quite distinct versions of a KWIC index program using 15 modules produced by 15 students. (We set out to produce 4^5 versions using 20 students, but five of the students produced modules which failed to meet specifications.) Of the 15 which passed the preliminary testing we could make 192 distinct combinations. We selected only 25 of these for testing (for economy reasons), but all of the 25 ran successfully. All students worked independently and had no advance knowledge of the combinations which would be tested. The actual testing was carried out by graduate students with no knowledge of the internal behaviour of any program module and who did not alter the internals (except to reduce excessive space requests). While we would prefer to have tested the techniques on a larger and more interesting project, we feel that the limited use suggests that it is both feasible and valuable to produce systems using the principles outlined above.
Error handling
The treatment of run time errors is made more difficult by the information hiding approach to structuring programs. When an error is detected, the information about what has gone wrong is expressed in terms of data structures and programs which are not known to the majority of the system. The information needed to understand the cause of the error and the procedure for corrective action is likely to be in other modules. If the information about the error is communicated in terms of the hidden program structures, the structure of the resulting system will be destroyed by the distribution of the additional assumptions.
In [10] an approach to solving this problem has been outlined. This approach was used, in a primitive form, in the experiment described above. Even in this form it had the advantage that, when an error was discovered, it took no detailed knowledge of any module to identify which module had caused the error. This was a great advantage in managing the project and a complete change from the author's experience in other multi-person projects. Identifying the module at fault is often the major problem in correcting an error.
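One way to picture error reporting that does not leak hidden structure is sketched below in Python. This is not the scheme of [10], whose details are not given here; it is a minimal illustration under the assumption that each module reports errors in terms of its own interface and labels them with its name. ModuleError, SymbolTable and reporting_module are hypothetical names.

```python
class ModuleError(Exception):
    """Error described in interface terms and tagged with the reporting module."""

    def __init__(self, module, event, detail):
        super().__init__(f"{module}: {event}: {detail}")
        self.module = module
        self.event = event
        self.detail = detail


class SymbolTable:
    """Hypothetical module; the dictionary is its hidden representation."""

    NAME = "SymbolTable"

    def __init__(self):
        self._slots = {}

    def define(self, name, value):
        self._slots[name] = value

    def lookup(self, name):
        try:
            return self._slots[name]
        except KeyError:
            # Translate the internal failure (a missing dictionary key)
            # into an event stated in terms of the interface.
            raise ModuleError(self.NAME, "undefined_symbol", name) from None


def reporting_module(action):
    """Run an action; return the name of the module that reported an error, if any."""
    try:
        action()
    except ModuleError as err:
        return err.module
    return None
```

A caller that catches ModuleError learns which module reported the problem and what interface-level event occurred, but nothing about dictionaries or keys, so no additional assumptions about the internals are distributed.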
Error recovery as discussed in [10] has not yet been attempted, so we can report no experimental results. Study of the scheme has shown that it leads to difficulties in maintenance of the call stacks, and an improved method is now under study.
More Recent Work
Since it is well known that any new method works well when applied by enthusiastic researchers to a small problem, we have decided to continue our study by applying the techniques discussed above to the production of a more realistic family of programs. In particular, we are now applying the technique to the specification of the modules which make up a family of operating systems, which we hope will be suitable for a broad set of machines in a broad range of applications. The first step was to be the implementation of a mechanism which allows all the machine dependent aspects of the various programs to run without using physical addresses. This work was reported by Price [11] and Parnas and Price [12]. As a result we have discovered some unexpected difficulties:
(1) The form of specification used on the small examples requires considerable change before it is practical for realistic problems. The specification of the Price module makes the module appear far more complex than it is. The information density is much too low. We are now studying new methods of specification so that the specifications for more realistic modules will be of realistic size.
(2) In producing the specification we found it necessary to violate one of our cardinal rules of specification. In principle, the specification should not refer to facts about information to be hidden within the module, facts which the user cannot detect. Price's specification is produced in terms of some "hidden" functions, functions which are not accessible to the user. We are unhappy about this procedure, but had to introduce them because each attempt to work without them made the specification longer. In some current work we are now developing extensions to our specification techniques, which make it possible to remove the hidden functions without increasing the length of the specification. The hidden functions can be replaced by sets which characterize the history of the module from an external view.
(3) Probably the most important insight to arise out of the Price work is that we have found fundamental limitations to the ideas mentioned above. In plain English, not everything can be hidden! An example centers about the treatment of I/O in our operating system family. Because our initial implementation was on the PDP/11, a machine without explicit I/O instructions, we found that no special consideration of I/O was needed in the design of the lower levels. Now that we are studying the implementation of these concepts on a /360-like machine, we find that "hiding" the existence of the I/O instructions will introduce inefficiency. While we see the existence of I/O instructions on our machine as a disadvantage, Popek [13], attempting to transfer ideas developed on a /360 to a PDP/11, sees the lack of I/O instructions as a disadvantage on the PDP/11. At the moment we have not found a system structure which is really suitable for both types of machines.
A similar situation arose because of the PDP/11/45's ability to use separate address maps for data and instructions. Had we taken advantage of this "feature" in our design, the resulting design would be impractical for other machines.
Conclusion
It seems clear that the information hiding principle espoused earlier in this paper is a valuable technique for software engineers, but we have definitely found limitations. The purpose of our further research is then
(1) to apply those techniques to produce information about good structures for frequently built items such as operating systems,
(2) to develop techniques for extending the usefulness of the concepts and understanding the real limits in practical use.
References
[1] Parnas, D.L., "Some Conclusions from an Experiment in Software Engineering", Proceedings of the 1972 FJCC.
[2] Parnas, D.L., "On the Criteria to be Used in Decomposing Systems into Modules", Communications of the ACM (Programming Techniques Department), Dec. 1972.
[3] Parnas, D.L., "A Technique for Software Module Specification with Examples", Communications of the ACM (Programming Techniques Department), May 1972.
[4] L. Robinson and D.L. Parnas, "A Program Holder Module", Technical Report, Carnegie-Mellon University, June 1973.
[5] Robinson, L., "Design and Implementation of a Multi-Level System Using Software Modules", Technical Report, Carnegie-Mellon University, June 1973.
[6] Parnas, D.L., "Information Distribution Aspects of Design Methodology", Proceedings of IFIP Congress 1971.
[7] Balzer, Robert M., "Studies Concerning Minimal Time Solutions to the Firing Squad Synchronization Problem", Ph.D. Thesis, Carnegie Institute of Technology, 1966.
[8] London, R., "Certification of Treesort 3", CACM, June 1970.
[9] Parnas, D.L., "A Course on Software Engineering Techniques", included in the Proceedings of the ACM SIGCSE, Second Technical Symposium, March 24-25, 1972.
[10] Parnas, D.L., "On the Response to Detected Errors in Hierarchically Structured Systems", Technical Report, Carnegie-Mellon University, 1972.
[11] Price, W.R., "Implications of a Virtual Memory Mechanism for Implementing Protection in a Family of Operating Systems", Technical Report (Ph.D. Thesis), Carnegie-Mellon University, June 1973.
[12] Parnas, D.L., Price, W.R., "The Design of the Virtual Memory Aspects of a Virtual Machine", Proceedings of the ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems, March 1973.
[13] Popek, G.J. and Kline, C., "Verifiable Secure Operating Systems Software", AFIPS Conference Proceedings, 1974 NCC, AFIPS Press, Montvale, N.J., U.S.A.
KNOWLEDGE AND REASONING IN PROGRAM SYNTHESIS
BY ZOHAR MANNA, Applied Mathematics Department, Weizmann Institute of Science, Rehovot, Israel
and RICHARD WALDINGER, Artificial Intelligence Center, Stanford Research Institute, Menlo Park, California, U.S.A.
ABSTRACT: Program synthesis is the construction of a computer program from given specifications. An automatic program synthesis system must combine reasoning and programming ability with a good deal of knowledge about the subject matter of the program. This ability and knowledge must be represented both procedurally (by programs) and structurally (by choice of representation).
We describe some of the reasoning and programming capabilities of a projected synthesis system. Special attention is paid to the introduction of conditional tests, loops, and instructions with side effects in the program being constructed. The ability to satisfy several interacting goals simultaneously proves to be important in many contexts. The modification of an already existing program to solve a somewhat different problem has been found to be a powerful approach.
We illustrate these concepts with hand simulations of the synthesis of a number of pattern-matching programs. Some of these techniques have already been implemented, others are in the course of implementation, while others seem equivalent to well-known unsolved problems in artificial intelligence.
I. INTRODUCTION
In this paper we describe some of the knowledge and the reasoning ability that a computer system must have in order to construct computer programs automatically. It is our hypothesis that such a system needs to embody a relatively small class of reasoning and programming tactics combined with a great deal of knowledge about the world. These tactics and this knowledge are expressed both procedurally (i.e., explicitly in the description of a problem solving process) and structurally (i.e., implicitly in the choice of representation). We consider the ability to reason as central to the program synthesis process, and most of this paper is concerned with the incorporation of common-sense reasoning techniques into a program synthesis system. However, symbolic reasoning alone will not suffice to produce the synthesis of complex programs; we therefore consider other techniques as well:
• The construction of "almost correct" programs that must be debugged (cf. Sussman [1973]).
• The modification of an existing program to perform a somewhat different task (cf. Balzer [1972]).
• The use of "visual" representations to reduce the need for deduction (cf. Bundy [1973]).
We regard program synthesis as a part of artificial intelligence. Many of the abilities we require of a program synthesizer, such as the ability to represent knowledge or to draw common-sense conclusions from facts, we would also expect from a natural language understanding system or a robot problem solver. These general problems have been under study by researchers for many years, and we do not expect that they will all be solved in the near future. However, we still prefer to address those problems rather than restrict ourselves to a more limited program synthesis system without those abilities.
Thus, although implementation of some of the techniques in this paper has already been completed, others require further development before a complete implementation will be possible. We imagine the knowledge and reasoning tactics of the system to be expressed in a PLANNER-type language (Hewitt [1972]); our own implementation is in the QLISP language (Reboh and Sacerdoti [1973]). Further details on the implementation are discussed
in Section V-A.
Part II of the paper gives the basic techniques of reasoning for program synthesis. They include the formation of conditional tests and loops, the satisfaction of several simultaneous goals, and the handling of instructions with side effects. Part III applies the techniques of Part II to synthesize a nontrivial "pattern-matcher" that determines if a given expression is an instance of a given pattern. We show how different choices made during the synthesis process result in different final programs. Part IV demonstrates the modification of programs. We take the pattern matcher we have constructed in Part III and adapt it to construct a more complex program: a "unification algorithm" that determines if two patterns have a common instance. In Part V we give some of the historical background of automatic program synthesis, and we compare this work with other recent efforts.
II. FUNDAMENTAL REASONING
In this section we will describe some of the reasoning and programming tactics that are basic to the operation of our proposed synthesizer.
These tactics are not specific to one particular
domain; they apply to any programming problem.
In this class of tactics, we include the formation of program branches and loops and the handling of statements with side effects.
A. Specification and Tactics Language
We must first say something about how programming problems are to be specified. In this discussion we consider only correct and exact specifications in an artificial language. Thus, we will not discuss input-output examples (cf. Green et al. [1974], Hardy [1974]), traces (cf. Biermann et al. [1973]), or natural language descriptions as methods for specifying programs; nor will we consider interactive specification of programs (cf. Balzer [1972]). Neither are we limiting ourselves to the first-order predicate calculus (cf. Kowalski [1974]). Instead, we try to introduce specification constructs that allow the natural and intuitive description of programming problems. We therefore include constructs such as
Find x such that P(x)
and the ellipsis notation, e.g.,
A[1], A[2], ..., A[n].
Furthermore, we introduce new constructs that are specific to certain subject domains. For instance, in the domain of sets we use
{x | P(x)}
for "the set of all x such that P(x)". As we introduce an example we will describe features of the language that apply to that example. Since the specification language is extendible, we can introduce new constructs at any time.
We use a separate language to express the system's knowledge and reasoning tactics. In the paper, these will be expressed in the form of rules written in English. In our implementation, the same rules are represented as programs in the QLISP programming language. When a problem or goal is presented to the system, the appropriate rules are summoned by "pattern-directed function invocation" (Hewitt [1972]). In other words, the form of the goal determines which rules are applied.
In the following two sections we will use a single example, the synthesis of the set-theoretic union program, to illustrate the formation both of conditionals and of loops. The problem here is to compute the union of two finite sets, where sets are represented as lists with no repeated elements.
Given two sets, s and t, we want to express
union(s t) = {x | x ∈ s or x ∈ t}
in a LISP-like language. We expect the output of the synthesized program to be a set itself. Thus
union((A B) (B C)) = (A B C).
We do not regard the expression {x | x ∈ s or x ∈ t} itself as a proper program: the operator { | ...} is a construct in our specification language but not in our LISP-like programming language. We assume that the programming language does have the following functions:
head(ℓ) = the first element of the list ℓ.
Thus head((A B C D)) = A.
tail(ℓ) = the list of all but the first element of the list ℓ.
Thus tail((A B C D)) = (B C D). *)
add(x s) = the set consisting of the element x and the elements of the set s.
Thus add(A (B C D)) = (A B C D)
whereas add(B (B C D)) = (B C D).
empty(s) is true if s is the empty list, false otherwise.
Our task is to transform the specifications into an equivalent algorithm in this programming language. We assume our system has some basic knowledge about sets, such as the following rules:
(1) x ∈ s is false if empty(s).
(2) x ∈ s is equivalent to (x = head(s) or x ∈ tail(s)) if ~empty(s).
(3) {x | x ∈ s} is equal to s.
(4) {x | x = a or Q(x)} is equal to add(a {x | Q(x)}).
We also assume that the system knows a considerable amount of propositional logic, which we will not mention explicitly. Before proceeding with our example we must discuss the formation of conditional expressions.
B. Formation of Conditional Expressions
In addition to the above constructs, we assume that our programming language contains conditional expressions of the form
(if p then q else r) = r if p is false, q otherwise.
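For readers who want to experiment, the assumed primitives and rules (3) and (4) can be mimicked in Python. This is only a sketch under the stated representation (sets as lists without repeated elements); Python stands in for the LISP-like target language, and the helper names cond and set_of are assumptions introduced here. Python's built-in membership test is used inside add purely for brevity.

```python
def head(l):
    """First element of the list l."""
    return l[0]


def tail(l):
    """List of all but the first element of l."""
    return l[1:]


def empty(s):
    """True if s is the empty list."""
    return len(s) == 0


def add(x, s):
    """Set consisting of x and the elements of s (no repeated elements)."""
    return s if x in s else [x] + s


def cond(p, q, r):
    """(if p then q else r): r if p is false, q otherwise."""
    return q if p else r


def set_of(pred, universe):
    """Stand-in for {x | pred(x)}, restricted to a finite list of candidates."""
    return [x for x in universe if pred(x)]


# The examples given in the text, with symbols written as strings.
print(head(["A", "B", "C", "D"]))
print(tail(["A", "B", "C", "D"]))
print(add("A", ["B", "C", "D"]))
print(add("B", ["B", "C", "D"]))
```

The set-former itself is not part of the target language; set_of exists here only so that rules (3) and (4) can be checked on small finite examples.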
The conditional expression is a technique for dealing with uncertainty. In constructing a program, we want to know if condition p is true or not, but in fact p may be true on some occasions and false on others, depending on the value of the argument. The human programmer faced with this problem is likely to resort to "hypothetical reasoning": he will assume p is false and write a program r that solves his problem in that case; then he will assume p is true and write a program q that works in that case; he will then put the two programs together into a single program
(if p then q else r).
Conceptually he has solved his problem by splitting his world into two worlds: the case in which p is true and the case in which p is false. In each of these worlds, uncertainty is reduced.
*) Since sets are represented as lists, head and tail may be applied to sets as well as lists. Their value then depends on our actual choice of representation.
Note that we must be careful that the condition p on which we are splitting the world is computable in our programming language; otherwise, the conditional expression we construct also will not be computable (cf. Luckham and Buchanan [1974]).
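As a small illustration of such a world split (not an example from the paper), rules (1) and (2) can be turned into a membership test by splitting on the computable condition empty(s). The sketch is in Python rather than the LISP-like target language, the function name member is an assumption, and the recursive call relies on the loop formation idea that the paper only introduces later.

```python
def head(l):
    return l[0]


def tail(l):
    return l[1:]


def empty(s):
    return len(s) == 0


def member(x, s):
    """Decide x ∈ s by splitting the world on empty(s).

    World 1 (empty(s) is true):  rule (1) says x ∈ s is false.
    World 2 (empty(s) is false): rule (2) says x ∈ s is equivalent to
                                 x = head(s) or x ∈ tail(s).
    The two partial programs are joined as (if p then q else r).
    """
    return False if empty(s) else (x == head(s) or member(x, tail(s)))
```

The split is legitimate only because empty is a primitive of the language; a condition that could not be computed would leave the resulting conditional uncomputable as well.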
We can now proceed with the synthesis of the union function. Our specifications were
union(s t) = {x | x ∈ s or x ∈ t}.
We begin to transform these specifications into an equivalent program in our language, using our rules. We examine the subexpression x ∈ s. Two of the rules, (1) and (2), apply to this subexpression. Rule (1) generates a subgoal, empty(s). We cannot prove s is empty (this depends on the input), and therefore this is an occasion for a hypothetical world split. (We know that empty(s) is a computable condition because empty is a primitive in our language.) In the case in which s is empty, the expression
{x | x ∈ s or x ∈ t}
therefore reduces to
{x | false or x ∈ t},
or, by propositional logic,
{x | x ∈ t}.
Now rule (3) reduces this to t, which is one of the inputs to our program and therefore is itself an acceptable program segment in our language.
In the other world, the case in which s is not empty, we cannot solve the problem without discussing the recursive loop formation
mechanism° have
the
However~
we know
at this
point
that the p r o g r a m will
form union(s
t) = if empty(s) then t
else where ruct
the else for the
clause
ease
in w h i c h
Before
we continue
mation
mechanism.
C.
Formation
The t e r m in this
that
with
[1971]).
both
only
the e l e m e n t s If in the
discuss
subgoal
that
a recursive
"shorter"
we
to solve the
such infinite
discuss
we
const-
the loop
call
for-
loop
where
attempt
to e x p a n d
input
technique
we left
£ (e.g.,
the elements
because
a subgoal
For instance, reverse(h),
reverse(A this
(B C) D)=
p r o g r a m we
of the
list
tail(£),
to satisfy
this
sub-
a reeursive
problem.
and
call when,
the p r o g r a m
call revers__~e
We must
lead to an infinite
can o c c u r here
Let us see how this
list
can i n t r o d u c e
than the o r i g i n a l
tinuing
goal.
however,
(cf. M a n n a
we generate
constructing
subsidiary cannot
loops
form a r e e u r s i v e
of c o n s t r u c t i n g
of r e v e r s i n g
In other words
(tail(£))
reeursive
we
of the
course
can use the p r o g r a m we are
goal.
segment
and recursion;
goal is to construct
(D (B C) A)). the
iteration
in form to our t o p - l e v e l
our t o p - l e v e l
generate
we will
on our problem,
that r e v e r s e s
we
example
Intuitively,
of w o r k i n g
is i d e n t i c a l
suppose
program
s is not empty.
this
includes
p a p e r we will
course
.... be w h a t e v e r
of Loops
"loop"
Waldinger in the
will
always
check
recursion.
the input
tail(£)
No is
~.
applies
to our union
off in the d i s c u s s i o n
example.
of conditionals,
Conwe
the e x p r e s s i o n
{xlx ~ s or x a t} in the
case
in w h i c h
subexpression
s is not empty.
x s s~ we
{xlx = head(s) Using
rule
(4)~ this add(head(s)
If we observe
can expand
or x s tail(s)
reduces
rule
or x e t}.
to
{xlx s tail(s)
that
{xlx s tail(s)
Applying
our e x p r e s s i o n
or x s t}
or x s t}).
to
(2) to the
243
is an instance of the t o p - l e v e l subgoal, we can reduce it to unionCtail<s) Again,
t).
this r e e u r s i v e call leads to no infinite loops,
tail(s)
is shorter than s.
since
Our c o m p l e t e d union p r o g r a m is now
unionCs t) = if empty(s) then t else add(head(s)
union(tail(s)
t)).
As p r e s e n t e d in this section, the loop f o r m a t i o n technique
can
only be applied if a subgoal is g e n e r a t e d that is a special case of the t o p - l e v e l goal.
We shall see in the next section how
this r e s t r i c t i o n can be relaxed.
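
The completed union program can be tried out with a minimal sketch in Python rather than the paper's LISP-like language. It assumes that sets are represented as duplicate-free Python lists and that the primitive add(x s) leaves s unchanged when x is already a member; both choices are illustrative assumptions, not part of the original text.

```python
def empty(s):
    """Primitive test: is the list-represented set s empty?"""
    return len(s) == 0

def head(s):
    return s[0]

def tail(s):
    return s[1:]

def add(x, s):
    """Assumed primitive: the set s with x added, kept free of duplicates."""
    return s if x in s else [x] + s

def union(s, t):
    # World split on the computable condition empty(s).
    if empty(s):
        return t
    # The recursive call receives tail(s), which is shorter than s,
    # so the recursion cannot continue forever.
    return add(head(s), union(tail(s), t))
```

For example, union([1, 2, 3], [3, 4]) yields a list holding exactly the elements 1, 2, 3 and 4.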
D. Generalization of Specifications

When proving a theorem by mathematical induction, it is often necessary to strengthen the theorem in order for the induction to "go through." Even though we have an apparently more difficult theorem to prove, the proof is facilitated because we have a stronger induction hypothesis. For example, in proving theorems about LISP programs, the theorem prover of Boyer and Moore [1973] often automatically generalizes the statement of the theorem in the course of a proof by induction.

A similar phenomenon occurs in the synthesis of a recursive program. It is often necessary to strengthen the specifications of a program in order for that program to be useful in recursive calls. We believe that this ability to strengthen specifications is an essential part of the synthesis process, as many of our examples will show.

For example, suppose we want to construct a program to reverse a list. A good recursive reverse program is

    reverse(ℓ) = rev(ℓ ())

where

    rev(ℓ m) = if empty(ℓ) then m
               else rev(tail(ℓ) head(ℓ)·m).

Here () is the empty list and x·ℓ is the list formed by inserting x before the first element of ℓ (e.g., A·(B C D) = (A B C D)). Note that rev(ℓ m) reverses the list ℓ and appends it onto the list m, e.g.,

    rev((A B C) (D E)) = (C B A D E).

This is a good way to compute reverse: it uses very primitive LISP functions and its recursion is such that it can be compiled without use of a stack. However, writing such a program entails writing the function rev, which is apparently more general and difficult to compute than reverse itself, since it must reverse its first argument as a subtask. The synthesis of this reverse function involves generalizing the original specifications of reverse into the specifications of rev.

The reverse function requires that the top-level goal be generalized in order to match the lower level goal. Another way for the specifications to be generalized is as follows. Suppose in the course of the synthesis of a function f(x), we generate a subgoal of the form P(f(a)), where f(a) is a particular recursive call. Instead of proving P(f(a)), it may be easier to rewrite the specifications for f(x) so as to satisfy P(f(x)) for all x. This step may require that we actually modify portions of the program f that have already been synthesized in order to satisfy the new specification P. The recursive call to the modified program will then be sure to satisfy P(f(a)). This process will be illustrated in more detail during the synthesis of the pattern-matcher in Part III.
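
As an illustration of the generalized specification, here is a minimal Python sketch of reverse and rev. Python lists stand in for the paper's lists and [l[0]] + m stands in for head(ℓ)·m; these representation choices are assumptions made for the example.

```python
def rev(l, m):
    """Reverse the list l and append the result onto the list m."""
    if not l:                        # empty(l)
        return m
    return rev(l[1:], [l[0]] + m)    # rev(tail(l) head(l)·m)

def reverse(l):
    """reverse is the special case of rev whose second argument is ()."""
    return rev(l, [])
```

The call rev(["A", "B", "C"], ["D", "E"]) reproduces the example in the text, giving ["C", "B", "A", "D", "E"].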
E. Conjunctive Goals

The problem of solving conjunctive goals is the problem of synthesizing a program that satisfies several constraints simultaneously. The general form for this problem is

    Find z such that P(z) and Q(z).

The conjunctive goals problem is difficult because, even if we have methods for solving the goals

    Find z such that P(z)

and

    Find z such that Q(z)

independently, the two solutions may not merge together nicely into a single solution. Moreover, there seems to be no way of solving the conjunctive goal problem in general; a method that works on one such problem may be irrelevant to another.

We will illustrate one instance of the conjunctive goals problem: the solution of two simultaneous linear equations. Although this problem is not itself a program synthesis problem, it could be rephrased as a synthesis problem. Moreover, the difficulties involved and the technique to be applied extend also to many real synthesis problems, such as the pattern-matcher synthesis of Part III.

Suppose our problem is the following:

    Find <z1, z2> such that
        2z1 = z2 + 1 and
        2z2 = z1 + 2.

Suppose further that although we can solve single linear equations with ease, we have no built-in package for solving sets of equations simultaneously. We may try first to find a solution to each equation separately. Solving the first equation, we might come up with

    <z1, z2> = <1, 1>,

whereas solving the second equation might give

    <z1, z2> = <2, 2>.

There is no way of combining these two solutions. Furthermore, it doesn't help matters to reverse the order in which we approach the two subgoals. What is necessary is to make the solution of the first goal as general as possible, so that some special case of the solution might satisfy the second goal as well. For instance, a "general" solution to the first equation might be

    <1 + w, 1 + 2w> for any w.

This solution is a generalization of our earlier solution <1, 1>. The problem is how to find a special case of the general solution that also solves the second equation. In other words, we must find a w such that

    2(1 + 2w) = (1 + w) + 2.

This strategy leads us to a solution.

Of course the method of generalization does not apply to all conjunctive goal problems. For instance, the synthesis of an integer square-root program has specifications

    Find z such that z is an integer and
        z² ≤ x and
        (z + 1)² > x,
    where x ≥ 0.

The natural approach of finding a general solution to one of the conjuncts and plugging it into the others is not practical in this case.
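
The generalization strategy for the two linear equations can be checked with a short Python sketch. The function names and the two-sample root computation are assumptions introduced for illustration; the only facts taken from the text are the two equations and the general solution <1 + w, 1 + 2w> of the first one.

```python
from fractions import Fraction

def general_solution(w):
    """Every pair <1 + w, 1 + 2w> satisfies the first equation 2*z1 = z2 + 1."""
    return (1 + w, 1 + 2 * w)

def second_residual(w):
    """Left side minus right side of 2*z2 = z1 + 2 at the general solution."""
    z1, z2 = general_solution(w)
    return 2 * z2 - (z1 + 2)

# The residual is linear in w, so two samples are enough to locate its root.
r0 = second_residual(Fraction(0))
r1 = second_residual(Fraction(1))
w = -r0 / (r1 - r0)

# The special case of the general solution that also solves the second equation.
z1, z2 = general_solution(w)
```

Exact rational arithmetic is used so that both equations can be verified without rounding error.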
F. Side Effects

Up to now we have been considering programs in a LISP-like language. These programs return a value but have no side effects. In the next two sections we will consider the synthesis of more general programs which change the state of the world. Programs that may modify the values of variables or alter the configuration of data structures are examples of this class. This sort of program is usually synthesized when a goal is proposed of the form

    Achieve P.

A program that satisfies this specification will have the effect of making P true.

To discuss this general case we will continue to use the concept of "world" that we introduced in our discussion of hypothetical reasoning. The concept of world is virtually identical to the general concept of state (McCarthy [1962]). Assertions may be true in one world and false in another. New worlds may be constructed by programs in three ways: world modification, splitting and joining.

• World Modification -- The execution of an instruction with side effects causes the creation of a new world. None of the assertions in the old world may be assumed to be true in the new world.

    [Diagram: WORLD1, followed by the instruction, followed by WORLD2.]

• World Splitting -- The execution of a conditional test P causes the creation of two new worlds.

    [Diagram: WORLD1, followed by the test P, which branches to WORLD2 and WORLD3.]

Any assertion in WORLD1 is also true in WORLD2 and WORLD3. Furthermore, P is true in WORLD2 and ~P is true in WORLD3.

• World Joining -- When two paths of a program join together, the corresponding worlds are joined too.

    [Diagram: WORLD1 and WORLD2 joining into WORLD3.]

Here, in order for an assertion to be true in WORLD3 it must be true in both WORLD1 and WORLD2.

When an instruction with side effects has modified the world, we need to be able to test if an assertion is true in the new world after the instruction is executed. We do this by "passing" the assertion back to the old world before the instruction is executed. When we "pass" an assertion back over an instruction, we actually construct a new assertion that must be true in the old world in order that the original assertion will be true in the new world (cf. Floyd [1967] and Hoare [1969]).

To illustrate the application of these constructs to the synthesis of a program with side effects, let us consider the program sort(x y) that sorts the values of two variables x and y. For simplicity we will use the statement interchange(x y) that exchanges the value of x and y, instead of primitive assignment statements. Our specifications will be simply

    Achieve x ≤ y.

Strictly speaking, we should include in the specifications the additional requirement that the set of values of x and y after the sort should be the same as before the sort. However, we will not consider such complex goals until the next section, and we can achieve the same effect by requiring that the interchange statement be the only instruction with side effects that appears in the program.

The first step in achieving a goal is to see if it is already true. (If a goal is a theorem, for instance, we do not need to construct a program to achieve it.) We cannot prove x ≤ y, but we can use it as a basis for a hypothetical world split. Our program thus far is

    [Flowchart: WORLD1, then the test x ≤ y, with the true branch leading to WORLD2 and the false branch to WORLD3.]

In WORLD2 our goal is already achieved; we may restrict our attention to WORLD3. In WORLD3 we know that ~(x ≤ y), i.e., x > y. To achieve x ≤ y, it suffices to establish x < y, which we can do by executing interchange(x y):

    [Flowchart: as above, with interchange(x y) leading from WORLD3 to WORLD4.]

We have achieved x ≤ y in both WORLD2 and WORLD4. If we join them together, we will have succeeded in modifying WORLD1 to make x ≤ y true. The final program is therefore:

    [Flowchart: as above, with WORLD2 and WORLD4 joined to give WORLD5.]

Often a goal to be achieved will involve the simultaneous satisfaction of more than one condition. As in the case of the conjunctive goals in the programs without side effects, the special interest of this problem lies in the interaction between the subgoals. The satisfaction of simultaneous goals will be the subject of the next section.
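
The idea of passing an assertion back over an instruction can be sketched in Python. The modeling choices here are assumptions made for illustration: a world is a dictionary from variable names to values, an instruction is a function from worlds to worlds, and an assertion is a Boolean function of a world. Under that model, the passed-back assertion is simply the original assertion evaluated on the world the instruction would produce.

```python
def interchange(a, b):
    """Return an instruction (world -> new world) that swaps variables a and b."""
    def run(world):
        new_world = dict(world)
        new_world[a], new_world[b] = world[b], world[a]
        return new_world
    return run

def pass_back(assertion, instruction):
    """The assertion that must hold in the old world so that
    `assertion` holds in the new world after `instruction`."""
    return lambda world: assertion(instruction(world))

def x_le_y(world):
    return world["x"] <= world["y"]

# Passing x <= y back over interchange(x y) gives an assertion equivalent to y <= x.
before_interchange = pass_back(x_le_y, interchange("x", "y"))
```

Since x > y holds in WORLD3, the passed-back assertion y ≤ x is true there, which is why interchange(x y) achieves the goal in that world.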
G. Simultaneous Goals

The problem of simultaneous goals is the problem of approaching a goal of form

    Achieve P and Q.

Sometimes P and Q will be independent conditions, so that we can achieve P and Q simply by achieving P and then achieving Q. For example, if our goal is

    Achieve x = 2 and y = 3,

the two goals x = 2 and y = 3 are completely independent. In this section, however, we will be concerned with the more complex case in which P and Q interact. In such a case we may make P false in the course of achieving Q.

Consider for example the problem of sorting three variables x, y, and z. We will assume that the only instruction we can use is the subroutine sort(u v), described in the previous section, which sorts two variables. Our goal is then

    Achieve x ≤ y and y ≤ z.

We know that the program sort(u v) will achieve a goal of form u ≤ v. If we apply the straightforward technique of achieving the conjunct x ≤ y first, and then the conjunct y ≤ z, we obtain the program

    sort(x y)
    sort(y z).

However, this program has a bug in that sorting y and z may disrupt the relation x ≤ y: if z is initially the smallest of the three, we make y less than x in interchanging y and z. Reversing the order in which the conjuncts are achieved is useless in this case.

There are a number of ways in which this problem may be resolved. One of them involves the notion of debugging (cf. Sussman [1973]). The approach is to debug the program

    sort(x y)
    sort(y z)

so that in executing sort(y z) we do not disturb the relation x ≤ y. To illustrate the process more clearly, we will expand the definition of sort in terms of interchange and present the entire program in flowchart notation.

    [Flowchart: WORLD1, then the test x ≤ y, with the true branch leading to WORLD2 and the false branch through WORLD3 and interchange(x y) to WORLD4; both join in WORLD5. Then the test y ≤ z, with the true branch leading to WORLD6 and the false branch through WORLD7 and interchange(y z) to WORLD8; both join in WORLD9.]

We want to modify this program so that x ≤ y in WORLD9. We now choose not to achieve x ≤ y directly in WORLD9, for fear of disturbing the protected relation y ≤ z. Instead we decide to pass the predicate x ≤ y back to some earlier point in the program where there are no protected relations. We can then safely attempt to achieve the modified predicate at that point.

There are two ways of passing the predicate x ≤ y back to WORLD5 -- through WORLD6 or through WORLD8 and WORLD7. In the first case the predicate is unmodified. Since we know x ≤ y is true in WORLD5, we do not have to worry about this case. In the second case, passing x ≤ y over the statement interchange(y z) gives the modified predicate x ≤ z in WORLD7. Therefore we must achieve

    x ≤ z if y > z in WORLD5.

We may attempt to achieve this relation directly, by inserting the instruction sort(x z) in the middle of the program, yielding the program

    sort(x y)
    sort(x z)
    sort(y z).

However, this action, though ultimately correct, risks disturbing the relation x ≤ y protected in WORLD5. A more prudent course is to pass the predicate back still further to WORLD1, where no relation is protected at all. Passing the relation back to WORLD1 gives the two predicates

    x ≤ z if y > z and x ≤ y in WORLD1

and

    y ≤ z if x > z and x > y in WORLD1.

We must achieve both of these relations. If we achieve the first relation first, we insert the instruction sort(x z) at the beginning of the program; then, because x ≤ z, the second relation is trivially satisfied. The program obtained is then

    sort(x z)
    sort(x y)
    sort(y z).

If, on the other hand, we try to achieve the second relation first, by inserting the instruction sort(y z) at the beginning of the program, the first relation will also be satisfied automatically, and the program will then be

    sort(y z)
    sort(x y)
    sort(y z).

The simultaneous goal problem is essential to all program synthesis: what we have said in this section only begins to explore the subject.

This concludes the presentation of our basic program synthesis techniques. In the next part we will show how the same techniques work together in the synthesis of some more complex examples.
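
The bug in the two-instruction program and the correctness of the repaired programs can be confirmed by brute force. The following Python sketch is an illustration under stated assumptions: sort(u v) is modeled as a conditional interchange on a dictionary of variable values, and a program is a list of variable pairs; neither representation comes from the text.

```python
from itertools import product

def sort2(state, u, v):
    """sort(u v): interchange u and v unless u <= v already holds."""
    if state[u] > state[v]:
        state[u], state[v] = state[v], state[u]

def run(program, values):
    """Execute a list of sort(u v) instructions on initial values of x, y, z."""
    state = dict(zip("xyz", values))
    for u, v in program:
        sort2(state, u, v)
    return (state["x"], state["y"], state["z"])

def achieves_goal(program):
    """True if x <= y and y <= z hold afterwards for every initial assignment tried."""
    return all(
        run(program, p) == tuple(sorted(p))
        for p in product((1, 2, 3), repeat=3)
    )

buggy = [("x", "y"), ("y", "z")]
middle_insert = [("x", "y"), ("x", "z"), ("y", "z")]
first_relation_first = [("x", "z"), ("x", "y"), ("y", "z")]
second_relation_first = [("y", "z"), ("x", "y"), ("y", "z")]
```

With z initially the smallest, for instance the values (2, 3, 1), the buggy program ends with y less than x, as the text describes.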
III. PROGRAM SYNTHESIS: THE PATTERN-MATCHER

We will present the synthesis of a simple pattern-matcher to show how the concepts discussed in the previous section can be applied to a nontrivial problem. Later, in Part IV, we shall show how we can construct an even more complex program, the unification algorithm of Robinson [1965], by modifying the program we are about to synthesize. We must first describe the data structures and primitive operations involved in the pattern-matching and unification problems.

A. Domain and Notations

The main objects in our domain are expressions and substitutions.

1. Expressions

Expressions are atoms or nested lists of atoms; e.g.,

    (A B (X C) D)

is an expression. An atom may be either a variable or a constant. (In our examples we will use A, B, C, ... for constants and U, V, W, ... for variables.) We have basic predicates atom, var and const to distinguish these objects:

    atom(ℓ) ≡ ℓ is an atom,
    var(ℓ) ≡ ℓ is a variable,

and

    const(ℓ) ≡ ℓ is a constant.

To decompose an expression, we will use the primitive functions head(ℓ) and tail(ℓ), defined when ℓ is not an atom.

    head(ℓ) is the first element of ℓ,
    tail(ℓ) is the list of all but the first element of ℓ.

Thus

    head(((A (X) B) C (D X))) = (A (X) B),
    tail(((A (X) B) C (D X))) = (C (D X)).

We will abbreviate head(ℓ) as ℓ1 and tail(ℓ) as ℓ2.

To construct expressions we have the concatenation function: if ℓ is any expression and m is a nonatomic expression, ℓ·m is the expression with ℓ inserted before the first element of m. For example

    (A (X) B)·(C (D X)) = ((A (X) B) C (D X)).

The predicate occursin(x ℓ) is true if x is an atom that occurs in expression ℓ at any level, e.g.,

    occursin(A (C (B (A) B) C)) is true

but

    occursin(X Y) is false.

Finally, we will introduce the predicate constexp(ℓ), which is true if ℓ is made up entirely of constants. Thus

    constexp((A (B) C (D E))) is true

but

    constexp(X) is false.

Note that constexp differs from const in that constexp may be true on nonatomic expressions.

2. Substitutions

A substitution replaces certain variables of an expression by other expressions. We will represent a substitution as a list of pairs. Thus

    (<X (A B)> <Y (C Y)>)

is a substitution.

The instantiation function inst(s ℓ) applies the substitution s to an expression ℓ. For example, if s is the substitution above and ℓ is

    (X (A Y) X)

then inst(s ℓ) is

    ((A B) (A (C Y)) (A B)).

Note that the substitution is applied by first replacing all occurrences of X simultaneously by (A B) and then all occurrences of Y simultaneously by (C Y). Thus, if the substitution s were

    (<X Y> <Y C>),

then inst(s ℓ) would be

    (C (A C) C).

The empty substitution Λ is represented by the empty list of pairs. Thus, for any expression ℓ,

    inst(Λ ℓ) = ℓ.

We regard two substitutions s1 and s2 as equal (written s1 = s2) if and only if

    inst(s1 ℓ) = inst(s2 ℓ)

for every expression ℓ. Thus

    (<X Y> <Y C>) and (<X C> <Y C>)

are regarded as the same substitution.

We can build up substitutions using the functions pair and ∘ (composition): If v is a variable and t an expression, pair(v t) is the substitution that replaces v by t; i.e.,

    pair(v t) = (<v t>).

If s1 and s2 are two substitutions, s1∘s2 is the substitution with the same effect as applying s1 followed by s2. Thus

    inst(s1∘s2 ℓ) = inst(s2 inst(s1 ℓ)).

For example, if

    s1 = (<X A> ) and s2 = ( <X D>)

then

    s1∘s2 = (<X A> ).

Note that for the empty substitution Λ,

    Λ∘s = s∘Λ = s

for any substitution s.

B. The Specifications

The problem of pattern-matching may be described as follows. We are given two expressions, pat and arg. Pat can be any expression, but arg is assumed to contain no variables; i.e., constexp(arg) is true. We want to find a substitution z that transforms pat into arg, such that

    inst(z pat) = arg.

We will call such a substitution a match. If no match exists, we want the program to return the distinguished constant NOMATCH. For example, if

    pat is (X A (Y B))

and

    arg is (C A (D B)),

we want the program to find the match

    (<X C> <Y D>).

On the other hand, if

    pat is (X A (X B))

and

    arg is (B A (D B)),

then no substitution will transform pat into arg, so the program will yield NOMATCH.

This version of the pattern-matcher is simpler than the pattern-matching algorithms usually implemented in programming languages because of the absence of "sequence" or "fragment" variables. Our variables must match exactly one expression, whereas a fragment variable may match any number of expressions. Because of the absence of fragment variables, a match, if it exists, will be unique. Thus if pat is (X Y Z) and arg is ((A B) C (A B)), X and Z must be bound to (A B), and Y must be bound to C. If pat is (X Y) and arg is (A B C), no match is possible at all. (If X and Y were fragment variables, four matches would be possible.)

In mathematical notation the specifications for our pattern-matcher are:

    Goal 1: match(pat arg) = Find z such that inst(z pat) = arg
                             else z = NOMATCH

where "Find z such that P(z) else Q(z)" means find a z such that P(z) if one exists; otherwise, find a z such that Q(z).

The above specifications do not completely capture our intentions; for instance, if pat is (X Y), and arg is (A B), then the substitution

    z = (<X A> <Y B> <Z C>)

will satisfy our specifications as well as

    z = (<X A> <Y B>).

We have neglected to include in our specifications that no substitutions should be made for variables that do not occur in pat. We will call a match that satisfies this additional condition a most general match.

An interesting characteristic of the synthesis we present is that even if the user forgets to require that the match found be most general, the system will be able to strengthen the specifications automatically to imply this condition, using the method outlined in Section II-D. Therefore we will begin the synthesis using the weaker specifications.
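
The domain operations of Section A and the matching condition inst(z pat) = arg can be exercised with a small Python sketch. Several representation choices are assumptions made for the example: atoms are strings, nonatomic expressions are tuples, the letters U through Z are taken to be variables, a substitution is a list of (variable, expression) pairs applied one pair at a time, and composition is implemented as list concatenation, which has the required effect under that sequential reading.

```python
VARIABLES = set("UVWXYZ")  # assumed convention: U..Z are variables, other atoms constants

def atom(e):
    return isinstance(e, str)

def var(e):
    return atom(e) and e in VARIABLES

def const(e):
    return atom(e) and not var(e)

def constexp(e):
    """True if e is made up entirely of constants."""
    return const(e) if atom(e) else all(constexp(x) for x in e)

def occursin(x, e):
    """True if the atom x occurs in e at any level."""
    return x == e if atom(e) else any(occursin(x, y) for y in e)

def replace(v, t, e):
    """Replace all occurrences of variable v in e by t simultaneously."""
    if atom(e):
        return t if e == v else e
    return tuple(replace(v, t, x) for x in e)

def inst(s, e):
    """Apply the pairs of substitution s one after another."""
    for v, t in s:
        e = replace(v, t, e)
    return e

def pair(v, t):
    return [(v, t)]

def compose(s1, s2):
    """s1 followed by s2; with sequential application this is concatenation."""
    return s1 + s2
```

With this representation, the match (<X C> <Y D>) from the text is the list [("X", "C"), ("Y", "D")], and applying it to the pattern (X A (Y B)) yields the argument (C A (D B)).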
C. The Synthesis: The Base Cases

Rather than listing all the knowledge we require in a special section at the beginning, we will mention a rule only when it is about to be used. Furthermore, if a rule seems excessively trivial we will omit it entirely.

The general strategy is to first work on

    Goal 2: Find z such that inst(z pat) = arg.

If this is found to be impossible (i.e., if it is proven that no such z exists), we will work on

    Goal 3: Find a z such that z = NOMATCH;

which is seen to be trivially satisfied by taking z to be NOMATCH. Thus, from now on we will be working primarily on Goal 2. However, in working on any goal we devote a portion of our time to showing that the goal is impossible to achieve. When we find cases in which Goal 2 is proven impossible, we will automatically return NOMATCH, which satisfies Goal 3.

We have in our knowledge base a number of rules concerning inst, including

    Rule 1: inst(s x) = x for any substitution s
            if constexp(x)

    Rule 2: inst(pair(v t) v) = t
            if var(v).

We assume that these rules are retrieved by pattern-directed function invocation on Goal 2. Rule 1 applies only in the case that constexp(pat) and pat = arg. We cannot prove either of these conditions; their truth or falsehood depends on the particular inputs to the program. We use these predicates as conditions for a hypothetical world-split. In the case that both of these conditions are true, Rule 1 tells us that any substitution is a satisfactory match. We will have occasion to tighten the specifications of our program later; as they stand now, we will simply return any substitution, written z ← any. The portion of the program we have constructed so far reads

    match(pat arg) =
        if constexp(pat)
        then if pat = arg
             then z ← any
             else ....

On the other hand, in the case constexp(pat) and pat ≠ arg, Rule 1 tells us that

    inst(z pat) ≠ arg

for any z. Hence we are led to try to satisfy Goal 3 and take z ← NOMATCH.

We now consider the case

    ~constexp(pat).

Rule 2 establishes the subgoal

    var(pat).

This is another occasion for a hypothetical world-split. When var(pat) is true, the program must return pair(pat arg); the program we have constructed so far is

    match(pat arg) =
        if constexp(pat)
        then if pat = arg
             then z ← any
             else z ← NOMATCH
        else if var(pat)
             then z ← pair(pat arg)
             else ....

Henceforth we assume ~var(pat). Recall that we have been assuming also that ~constexp(pat). To proceed we make use of the following additional knowledge about the function inst:

    Rule 3: inst(s x·y) = inst(s x)·inst(s y)
            for any substitution s.

This rule applies to our Goal 2 if pat = x·y for some expressions x and y. We have some additional knowledge about expressions in general:

    Rule 4: u = u1·u2
            if ~atom(u).

Recall that u1 is an abbreviation for head(u) and u2 is an abbreviation for tail(u).

    Rule 5: u ≠ v·w for any u, v, and w
            if atom(u).

Using Rule 4, we generate a subgoal

    ~atom(pat).

Since we have already assumed ~constexp(pat) and ~var(pat), we can actually prove ~atom(pat) using knowledge in the system. Therefore pat = pat1·pat2, and using Rule 3 our Goal 2 is then reduced to

    Goal 4: Find z such that inst(z pat1)·inst(z pat2) = arg.

We now make use of some general list-processing knowledge.

    Rule 6: To prove x·y = u·v, prove x = u and y = v.

Applying this rule, we generate a subgoal to show that

    arg = u·v

for some u and v. Applying Rule 4, we know this is true with u = arg1 and v = arg2 if

    ~atom(arg).

This is another occasion for a hypothetical world-split. Thus, by Rule 6, in the case that ~atom(arg), our subgoal reduces to

    Goal 5: Find z such that
                inst(z pat1) = arg1
            and
                inst(z pat2) = arg2.

We will postpone the treatment of this goal until after we have considered the other case, in which

    atom(arg)

holds. In this case Rule 5 tells us that

    inst(z pat1)·inst(z pat2) ≠ arg

for any z. Hence, our goal is unachievable in this case, and we can take

    z ← NOMATCH.

The program so far is

    match(pat arg) =
        if constexp(pat)
        then if pat = arg
             then z ← any
             else z ← NOMATCH
        else if var(pat)
             then z ← pair(pat arg)
             else if atom(arg)
                  then z ← NOMATCH
                  else ....

For the as yet untreated case neither pat nor arg is atomic. Henceforth using Rule 4 we assume that pat is pat1·pat2 and arg is arg1·arg2.
D. The Synthesis: The Inductive Case

We will describe the remainder of the synthesis in less detail, because the reader has already seen the style of reasoning we have been using. Recall that we had postponed our discussion of Goal 5 in order to consider the case in which arg is atomic. Now that we have completed our development of that case, we resume our work on

    Goal 5: Find z such that
                inst(z pat1) = arg1,
            and
                inst(z pat2) = arg2.

This is a conjunctive goal, and is treated analogously to the goal in the simultaneous linear equations example: each conjunct is treated separately. The system will attempt to use a recursive call to the pattern-matcher itself in solving each conjunct.

The interaction between the two conjuncts is part of the challenge of this synthesis. It is quite possible to satisfy each conjunct separately without being able to satisfy them both together. For example, if pat = (X X) and arg = (A B) then pat1 = X, pat2 = (X), arg1 = A and arg2 = (B). Thus z = (<X A>) satisfies the first conjunct, z = (<X B>) satisfies the second conjunct, but no substitution will satisfy both conjuncts because it cannot match X against both A and B. Some mechanism is needed to ensure that the expression assigned to a variable in solving the first conjunct is the same as the expression assigned to that variable in solving the second conjunct.

There are several ways to approach this difficulty. For instance, the programmer may satisfy the two conjuncts separately and then attempt to combine the two substitutions thereby derived into a single substitution. Or he may actually replace those variables in pat2 that also occur in pat1 by whatever expressions they have been matched against, before attempting to match pat2 against arg2. Or he may simply pass the substitution that satisfied the first conjunct as a third argument to the pattern-matcher in working on the second conjunct. The pattern-matcher must then check that the matches assigned to variables are consistent with the substitution given as the third argument.
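
To make the second of these methods concrete, here is a hedged Python sketch that combines the base cases of Section C with one possible inductive step: instantiate pat2 with the substitution found for pat1, match the result against arg2, and return the two substitutions applied in sequence. This is an illustration under assumptions, not the program derived in the paper: atoms are strings, the empty tuple is treated as an atom, U through Z are variables, "any" is taken to be the empty substitution, and the returned composition is list concatenation.

```python
NOMATCH = "NOMATCH"
VARIABLES = set("UVWXYZ")  # assumed convention for variable names

def atom(e):
    return isinstance(e, str) or e == ()   # assumption: () counts as an atom

def var(e):
    return isinstance(e, str) and e in VARIABLES

def constexp(e):
    return (not var(e)) if atom(e) else all(constexp(x) for x in e)

def replace(v, t, e):
    if atom(e):
        return t if e == v else e
    return tuple(replace(v, t, x) for x in e)

def inst(s, e):
    for v, t in s:
        e = replace(v, t, e)
    return e

def match(pat, arg):
    """Sketch of the matcher; arg is assumed to satisfy constexp(arg)."""
    if constexp(pat):
        return [] if pat == arg else NOMATCH   # "any" chosen as the empty substitution
    if var(pat):
        return [(pat, arg)]                    # pair(pat arg)
    if atom(arg):
        return NOMATCH
    z1 = match(pat[0], arg[0])                 # first conjunct: pat1 against arg1
    if z1 == NOMATCH:
        return NOMATCH
    # Second method: replace variables of pat2 already matched, then match against arg2.
    z2 = match(inst(z1, pat[1:]), arg[1:])
    if z2 == NOMATCH:
        return NOMATCH
    return z1 + z2                             # z1 followed by z2
```

On the examples of Section B this sketch returns [("X", "C"), ("Y", "D")] for the pattern (X A (Y B)) against (C A (D B)), and NOMATCH for the pattern (X X) against (A B).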
We will examine in this section how a system would discover the second of these methods and consider in a later section how the third method could be discovered. We will not consider the first method here because it is not easily adapted to the unification problem.

Our strategy for approaching the conjunctive goal is as follows. We will consider the first conjunct independently:

Goal 6: Find z such that inst(z pat1) = arg1.

If we find a z that satisfies this goal, we will substitute that z into the second conjunct, giving

Goal 7: Prove inst(z pat2) = arg2.

If we are successful in Goal 7, we are done; however, if we fail, we will try to generalize z. In other words, we will try to find a broader class of substitutions that satisfy Goal 6 and select one that also satisfies Goal 7. This is the method we introduced to solve conjunctive goals in Section II-E.

Applying this strategy, we begin work on Goal 6. We first use a rule that relates the construct Find z such that P(z) to the construct Find z such that P(z) else Q(z).

Rule 7: To find z such that P(z), find z1 such that P(z1) else Q(z1) for some predicate Q, and prove ~Q(z1).

This rule, applied to Goal 6, causes the generation of the subgoal

Goal 8: Find z1 such that inst(z1 pat1) = arg1 else Q(z1), and prove ~Q(z1).
This subgoal matches the top-level Goal 1, where Q(z1) is z1=NOMATCH. This suggests establishing a recursion at this point, taking z1 = match(pat1 arg1). Termination is easily shown, because both pat1 and arg1 are proper subexpressions of pat and arg, respectively. It remains to show, according to Rule 7, that z1≠NOMATCH. This causes another hypothetical world-split: in the case z1=match(pat1 arg1)=NOMATCH (i.e., no substitution will cause pat1 and arg1 to match), we can show that no substitution can cause pat and arg to match either, and hence can take z=NOMATCH. On the other hand, if match(pat1 arg1)≠NOMATCH, we try to show Goal 7, i.e., inst(z1 pat2) = arg2. However, we fail in this attempt; in fact we can find sample inputs pat and arg that provide a counter-example to Goal 7 (e.g., pat=(A X), arg=(A B), z1=Λ).
Thus we go back and try to generalize our solution to Goal 6. We already have a solution to Goal 6: we know inst(z1 pat1)=arg1. We also can deduce that constexp(arg1), because we have assumed constexp(arg). Hence Rule 1 tells us that inst(w arg1) = arg1 for any substitution w. Hence inst(w inst(z1 pat1)) = arg1, i.e., inst(z1°w pat1) = arg1 for any substitution w. Thus having one substitution z1 that satisfies Goal 6, we have an entire class of substitutions, of form z1°w, that also satisfy Goal 6. These substitutions may be considered to be "extensions" of z1; although z1 itself may not satisfy Goal 7, perhaps some extension of z1 will. The above reasoning is straightforward enough to justify, but further work is needed to motivate a machine to pursue it.

It remains now to find a single w so that z1°w satisfies Goal 7, i.e.,

Goal 9: Find w such that inst(z1°w pat2)=arg2,

or equivalently, Find w such that inst(w inst(z1 pat2))=arg2. Applying Rule 7, we establish a new goal

Goal 10: Find w such that inst(w inst(z1 pat2))=arg2 else Q(w), and prove ~Q(w).

This goal is an instance of our top-level goal, taking pat to be inst(z1 pat2), arg to be arg2, and Q(w) to be w=NOMATCH. Thus we attempt to insert the recursive call

z2 ← match(inst(z1 pat2) arg2)

into our program at this point and take w to be z2. However, we must first establish ~Q(z2), i.e., match(inst(z1 pat2) arg2) ≠ NOMATCH. We cannot prove this. We have counter-examples, e.g., if pat = (A A), arg = (A B) and z1 = Λ, then match(inst(Λ (A)) (B)) = NOMATCH. Therefore we split on this condition.

In the case match(inst(z1 pat2) arg2) ≠ NOMATCH, Goal 10 is satisfied. Thus w=z2 also satisfies Goal 9, and z=z1°z2 satisfies Goal 7. Our program so far is
match(pat arg) =
  if constexp(pat) then
    if pat = arg then z ← any
    else z ← NOMATCH
  else if var(pat) then z ← pair(pat arg)
  else if atom(arg) then z ← NOMATCH
  else z1 ← match(pat1 arg1)
    if z1 = NOMATCH then z ← NOMATCH
    else z2 ← match(inst(z1 pat2) arg2)
      if z2 = NOMATCH then ...
      else z ← z1°z2

E. The Synthesis: The Strengthening of the Specifications

We have gone this far through the synthesis using the weak specifications, i.e., without requiring that the match found be most general. In fact, the match found may or may not be most general depending on the value taken for the unspecified substitution "any" produced in the very first case.
The synthesis is nearly complete. However, we will be unable to complete it without strengthening the specifications and modifying the program accordingly. We now have only one case left to consider. This is the case in which z2 = NOMATCH, i.e., match(inst(z1 pat2) arg2) = NOMATCH. This means that no substitution w satisfies inst(w inst(z1 pat2)) = arg2, or, equivalently, inst(z1°w pat2) ≠ arg2 for every substitution w. This means that no substitution of form z1°w could possibly satisfy inst(z1°w pat) = arg.

We here have a choice: we can try to find a substitution s not of form z1°w that satisfies inst(s pat1) = arg1 and repeat the process; or we could try to show that only a substitution s of form z1°w could possibly satisfy inst(s pat1) = arg1, and therefore we can take z=NOMATCH. We have already extended the class of substitutions that satisfy the condition once; therefore we pursue the latter course. We try to show that the set of substitutions s=z1°w is the entire set of solutions to inst(s pat1) = arg1. In other words, we show that for any substitution s, if inst(s pat1) = arg1 then s = z1°w for some w. This condition is equivalent to saying that z1 is a most general match. We cannot prove this about z1 itself; however, since z1 is match(pat1 arg1) it suffices to add the condition to the specifications for match, as described in Section II-D. The strengthened specifications now read

Find z such that {inst(z pat) = arg and
  for all s [if inst(s pat) = arg then s = z°w for some w]}
else z = NOMATCH.
Once we have strengthened the specifications it is necessary to go through the entire program and see that the new, stronger specifications are satisfied, modifying the program if necessary. In this case no major modifications are necessary; however, the assignment z ← any that occurs in the case in which pat and arg are equal and constant is further specified to read z ← Λ. Our final program is therefore

match(pat arg) =
  if constexp(pat) then
    if pat = arg then z ← Λ
    else z ← NOMATCH
  else if var(pat) then z ← pair(pat arg)
  else if atom(arg) then z ← NOMATCH
  else z1 ← match(pat1 arg1)
    if z1 = NOMATCH then z ← NOMATCH
    else z2 ← match(inst(z1 pat2) arg2)
      if z2 = NOMATCH then z ← NOMATCH
      else z ← z1°z2.
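As an informal check on the final program, it can be transcribed into an executable form. The sketch below is one possible transcription and rests on assumptions that are not part of the text: lists are encoded as Python tuples, atoms as strings, the pattern variables are the names listed in VARIABLES, a substitution is a dict, the empty substitution Λ is the empty dict, and NOMATCH is None.

```python
# Assumed encoding (hypothetical): tuples are lists, strings are atoms.
VARIABLES = frozenset({"X", "Y", "Z"})
NOMATCH = None


def var(e):
    return isinstance(e, str) and e in VARIABLES


def atom(e):
    return not isinstance(e, tuple) or e == ()


def constexp(e):
    """True if e contains no variables."""
    if isinstance(e, tuple):
        return all(constexp(part) for part in e)
    return not var(e)


def inst(z, e):
    """Apply substitution z to expression e."""
    if isinstance(e, tuple):
        return tuple(inst(z, part) for part in e)
    return z.get(e, e) if var(e) else e


def compose(z1, w):
    """The substitution z1°w, so that inst(z1°w e) = inst(w inst(z1 e))."""
    result = {v: inst(w, t) for v, t in z1.items()}
    for v, t in w.items():
        result.setdefault(v, t)
    return result


def match(pat, arg):
    """Transcription of the final program; arg is assumed to be constant."""
    if constexp(pat):
        return {} if pat == arg else NOMATCH
    if var(pat):
        return {pat: arg}
    if atom(arg):
        return NOMATCH
    z1 = match(pat[0], arg[0])            # pat1 and arg1 are the heads
    if z1 is NOMATCH:
        return NOMATCH
    z2 = match(inst(z1, pat[1:]), arg[1:])  # pat2 and arg2 are the tails
    if z2 is NOMATCH:
        return NOMATCH
    return compose(z1, z2)


rejected = match(("X", "X"), ("A", "B"))
accepted = match(("X", "Y", "X"), ("A", ("B", "C"), "A"))
```

Under this encoding the example pat=(X X), arg=(A B) is rejected, because the second recursive call is made on inst(z1 pat2) = (A), which cannot match (B).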
F. Alternative Programs

The above pattern-matcher is only one of many pattern-matchers that can be derived to satisfy the same specifications. In pursuing the synthesis the system has made many choices; some of the alternative paths result in a failure to solve the problem altogether, whereas other paths result in different, possibly better programs. In this section we will indicate how a system might make a different decision in the above synthesis and, consequently, derive a different program.
In the above synthesis we derived a goal (Goal 9):

Find w such that inst(z1°w pat2) = arg2.

We chose at that point to transform the goal to the equivalent

Find w such that inst(w inst(z1 pat2)) = arg2,

and then apply Rule 7, which added an else-clause to the goal, yielding Goal 10. We then matched Goal 10 against our top-level goal, introducing the second recursive call into our program.

It would be equally plausible that the system should apply Rule 7 directly to Goal 9, giving a new goal

Goal 10': Find w such that inst(z1°w pat2) = arg2 else Q(w), and prove ~Q(w).

The system may now attempt to use a recursive call to satisfy Goal 10'. However, this goal is not a precise instance of our top-level goal; we are demanding not only that a match be found between pat2 and arg2, but also that the match be of form z1°w, where z1 is the output of the previous recursive call. We must therefore generalize our program to take three inputs: pat, arg, and alist. We insist that the match found be an extension of alist. In the case that alist is Λ, the new program will be identical to the old one. On the other hand, if alist=z1, pat=pat2 and arg=arg2, a recursive call to the new program suffices to satisfy Goal 10'. In mathematical notation, the stronger specifications now read

Find z such that [inst(z pat) = arg and z = alist°w for some w]
else z = NOMATCH.
We will call the generalized program match2(pat arg alist). The original program match will be written as a call to the more general auxiliary program:

match(pat arg) = match2(pat arg Λ).

The portion of the program already constructed in the synthesis of match must be systematically modified to satisfy the strengthened requirements of match2.

First, in the case that constexp(pat) where pat=arg, we originally took z ← any. However, we must now take z ← alist°any, because the substitution found must be an extension of alist.

In the case that var(pat) we originally took z ← pair(pat arg); however, here we must check to see whether pat has already been matched against something other than arg. The new program segment is

if inst(alist pat) = pat
  then z ← alist°pair(pat arg)
else if inst(alist pat) = arg
  then z ← alist
else z ← NOMATCH.

Of course we are omitting the details of synthesis; the actual derivation is somewhat more lengthy.
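The role of the third argument is easiest to see in executable form. The following sketch of match2 is a transcription under assumed conventions that do not come from the text: lists are Python tuples, atoms are strings, the variables are the names in VARIABLES, substitutions are dicts with Λ as the empty dict, and NOMATCH is None. The recursive cases follow the modifications described in this section.

```python
# Assumed encoding (hypothetical): tuples are lists, strings are atoms.
VARIABLES = frozenset({"X", "Y", "Z"})
NOMATCH = None


def var(e):
    return isinstance(e, str) and e in VARIABLES


def atom(e):
    return not isinstance(e, tuple) or e == ()


def constexp(e):
    if isinstance(e, tuple):
        return all(constexp(part) for part in e)
    return not var(e)


def inst(z, e):
    if isinstance(e, tuple):
        return tuple(inst(z, part) for part in e)
    return z.get(e, e) if var(e) else e


def compose(z1, w):
    """The substitution z1°w, so that inst(z1°w e) = inst(w inst(z1 e))."""
    result = {v: inst(w, t) for v, t in z1.items()}
    for v, t in w.items():
        result.setdefault(v, t)
    return result


def match2(pat, arg, alist):
    """Find an extension of alist that matches pat against the constant arg."""
    if constexp(pat):
        return alist if pat == arg else NOMATCH
    if var(pat):
        bound = inst(alist, pat)
        if bound == pat:                      # pat has not been matched yet
            return compose(alist, {pat: arg})
        if bound == arg:                      # pat was already matched to arg
            return alist
        return NOMATCH                        # pat was matched to something else
    if atom(arg):
        return NOMATCH
    z1 = match2(pat[0], arg[0], alist)
    if z1 is NOMATCH:
        return NOMATCH
    return match2(pat[1:], arg[1:], z1)


def match(pat, arg):
    return match2(pat, arg, {})


inconsistent = match(("X", "X"), ("A", "B"))
consistent = match(("X", "Y", "X"), ("A", "B", "A"))
```

In this version the pattern is never rewritten; consistency is checked only at the point where a variable is encountered, by looking it up in alist.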
The call z1 ← match(pat1 arg1) must be replaced by z1 ← match2(pat1 arg1 alist).

Having modified the portion of the program already constructed, the system continues the synthesis. A recursive call

z2 ← match2(pat2 arg2 z1)

satisfies Goal 9; the balance of the synthesis parallels the development of the first program, including the further strengthening of the specifications to ensure that the match found is the most general possible. The value of the second recursive call is shown to satisfy the strengthened top-level goal.

The program derived from this alternative synthesis is
match(pat arg) = match2(pat arg Λ)
where
match2(pat arg alist) =
  if constexp(pat) then
    if pat = arg then z ← alist
    else z ← NOMATCH
  else if var(pat) then
    if inst(alist pat) = pat then z ← alist°pair(pat arg)
    else if inst(alist pat) = arg then z ← alist
    else z ← NOMATCH
  else if atom(arg) then z ← NOMATCH
  else z1 ← match2(pat1 arg1 alist)
    if z1 = NOMATCH then z ← NOMATCH
    else z ← match2(pat2 arg2 z1).

IV. PROGRAM MODIFICATION: THE UNIFICATION ALGORITHM
In general, we cannot expect a system to synthesize an entire complex program from scratch, as in the pattern-matcher example. We would like the system to remember a large body of programs that have been synthesized before and the method by which they were constructed. When presented with a new problem, the system should check to see if it has solved a similar problem before. If so, it may be able to adapt the technique of the old program to make it solve the new problem.

There are several difficulties involved in this approach. First, we cannot expect the system to remember every detail of every synthesis in its history. Therefore, it must decide what to remember and what to forget. Second, the system must decide which problems are similar to the one being considered, and the concept of similarity is somewhat ill-defined. Third, having found a similar program, the system must somehow modify the old synthesis to solve the new problem. We will concentrate only on the last of these problems in this discussion. We will illustrate a technique for program modification as applied to the synthesis of a version of Robinson's unification algorithm.

A. The Specifications
Unification may be considered to be a generalization of pattern matching in which variables appear in both pat and arg. The problem is to find a single substitution (called a "unifier") that, when applied to both pat and arg, will yield identical expressions. For instance, if

pat = (X A) and arg = (B Y),

then a possible unifier of pat and arg is

(<X B> <Y A>).

The close analogy between pattern-matching and unification is clear. If we assume that the system remembers the pattern-matcher we constructed in Sections III-2 through III-5 and the goal structure involved in the synthesis, the solution to the unification problem is greatly facilitated.

The specifications for the unification algorithm, in mathematical notation, are

unify(pat arg) =
  Find z such that inst(z pat) = inst(z arg)
  else z = NOMATCH.

B. The Analogy with the Pattern-Matcher
For purposes of comparison we rewrite the match specifications:

match(pat arg) =
  Find z such that inst(z pat) = arg
  else z = NOMATCH.

In formulating the analogy, we identify unify with match, pat with pat, the arg in unify(pat arg) with arg, and inst(z arg) also with arg. In accordance with this analogy, we must systematically alter the goal structure of the pattern-matcher synthesis. For example, Goal 5 becomes modified to read

Find z such that inst(z pat1) = inst(z arg1) and inst(z pat2) = inst(z arg2).

In constructing the pattern-matcher, we had to break down the synthesis into various cases. We will try to maintain this case structure in formulating our new program. Much of the savings derived from modifying the pattern-matcher instead of constructing the unification algorithm from scratch arises because we do not have to deduce the case splitting all over again.

A difficult step in the pattern-matcher synthesis involved the strengthening of the specifications for the entire program. We added the condition that the match found was to be "most general." In formulating the unification synthesis, we will immediately strengthen the specifications in the analogous way. The strengthened specifications read

unify(pat arg) =
  Find z such that {inst(z pat) = inst(z arg) and
    for all s [if inst(s pat) = inst(s arg) then s = z°w for some w]}
  else z = NOMATCH.

Following Robinson [1965], we will refer to a unifier satisfying the new condition as a "most general unifier."

Note that this alteration process is purely syntactic; there is no reason to assume that the altered goal structure corresponds to a valid line of reasoning. For instance, simply because achieving Goal 2 in the pattern-matching program is useful in achieving Goal 1 does not necessarily imply that achieving Goal 2' in the unification algorithm will have any bearing on Goal 1'. The extent to which the reasoning carries over depends on the soundness of the analogy. If a portion of the goal structure proves to be valid, the corresponding segment of the program can still remain; otherwise, we must construct a new program segment.
C. The Modification

Let us examine the first two cases of the unification synthesis in full detail, so that we can see exactly how the modification process works.
In the pattern-matcher, we generated the subgoal (Goal 2)

Find z such that inst(z pat) = arg.

The corresponding unification subgoal is

Find z such that inst(z pat) = inst(z arg).

In the pattern-matcher we first considered the case constexp(pat) where pat=arg. In this case the corresponding program segment is z ← Λ. This segment also satisfies the modified goal in this case, because inst(Λ pat) = inst(Λ arg). The system must also check that Λ is a most general unifier, i.e.,

for any s [if inst(s pat) = inst(s arg) then s = Λ°w for some w].

This condition is easily satisfied, taking w=s. Thus, in this case, the program segment is correct without any modification.

The next case does require some modification. In the pattern-matcher, when constexp(pat) is true and pat≠arg, z is taken to be NOMATCH. However, in this case in the unification algorithm we must check that inst(s pat) ≠ inst(s arg), i.e., pat ≠ inst(s arg) for any s, in order to take z=NOMATCH. Since for unification arg may contain variables, this condition cannot be satisfied. We must therefore try to achieve the specifications in some other way. In this case (where constexp(pat)), the specifications of the unification algorithm reduce to

Find z such that {pat = inst(z arg) and
  for any s [if pat = inst(s arg) then s = z°w for some w]}
else z = NOMATCH.

These specifications are precisely the specifications of the pattern-matcher with pat and arg reversed; consequently, we can invoke match(arg pat) at this point in the program.

The balance of the modification can be carried out in the same manner. The derived unification algorithm is
unify(pat arg) =
  if constexp(pat) then
    if pat = arg then z ← Λ
    else z ← match(arg pat)
  else if var(pat) then
    if occursin(pat arg) then z ← NOMATCH
    else z ← pair(pat arg)
  else if atom(arg) then z ← unify(arg pat)
  else z1 ← unify(pat1 arg1)
    if z1 = NOMATCH then z ← NOMATCH
    else z2 ← unify(inst(z1 pat2) inst(z1 arg2))
      if z2 = NOMATCH then z ← NOMATCH
      else z ← z1°z2.

Recall that occursin(pat arg) means that pat occurs in arg as a subexpression.

The termination of this program is considerably more difficult to prove than was the termination of the pattern-matcher. However, the construction of the unification algorithm from the pattern-matcher is much easier than the initial synthesis of the pattern-matcher itself.

Note that the program we have constructed contains a redundant branch. The expression

if pat = arg then z ← Λ else z ← match(arg pat)

could be reduced to

z ← match(arg pat).

Such improvements would not be made until a later optimization phase.
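The derived algorithm can also be exercised directly. The sketch below transcribes unify and the match program it calls, under assumptions that are not taken from the text: lists are Python tuples, atoms are strings, the variables are the names in VARIABLES, substitutions are dicts with Λ as the empty dict, and NOMATCH is None. One further assumption is flagged in the code: occursin is read here as a proper occurrence, so that a variable is not considered to occur in itself.

```python
# Assumed encoding (hypothetical): tuples are lists, strings are atoms.
VARIABLES = frozenset({"X", "Y", "Z"})
NOMATCH = None


def var(e):
    return isinstance(e, str) and e in VARIABLES


def atom(e):
    return not isinstance(e, tuple) or e == ()


def constexp(e):
    if isinstance(e, tuple):
        return all(constexp(part) for part in e)
    return not var(e)


def inst(z, e):
    if isinstance(e, tuple):
        return tuple(inst(z, part) for part in e)
    return z.get(e, e) if var(e) else e


def compose(z1, w):
    """The substitution z1°w, so that inst(z1°w e) = inst(w inst(z1 e))."""
    result = {v: inst(w, t) for v, t in z1.items()}
    for v, t in w.items():
        result.setdefault(v, t)
    return result


def occursin(v, e):
    # Assumption: proper occurrence only, so occursin(X, X) is False.
    return isinstance(e, tuple) and any(p == v or occursin(v, p) for p in e)


def match(pat, arg):
    if constexp(pat):
        return {} if pat == arg else NOMATCH
    if var(pat):
        return {pat: arg}
    if atom(arg):
        return NOMATCH
    z1 = match(pat[0], arg[0])
    if z1 is NOMATCH:
        return NOMATCH
    z2 = match(inst(z1, pat[1:]), arg[1:])
    return NOMATCH if z2 is NOMATCH else compose(z1, z2)


def unify(pat, arg):
    if constexp(pat):
        return {} if pat == arg else match(arg, pat)
    if var(pat):
        return NOMATCH if occursin(pat, arg) else {pat: arg}
    if atom(arg):
        return unify(arg, pat)
    z1 = unify(pat[0], arg[0])
    if z1 is NOMATCH:
        return NOMATCH
    z2 = unify(inst(z1, pat[1:]), inst(z1, arg[1:]))
    return NOMATCH if z2 is NOMATCH else compose(z1, z2)


example = unify(("X", "A"), ("B", "Y"))
shared = unify(("X", "X"), ("Y", "A"))
cyclic = unify("X", ("A", "X"))
```

For pat = (X A) and arg = (B Y) this transcription binds both X and Y, and applying the result to either expression yields (B A).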
V. DISCUSSION

A. Implementation

Implementation of the techniques presented in this paper is underway. Some of them have already been implemented. Others will require further development before an implementation will be possible.

We imagine the rules, used to represent reasoning tactics, to be expressed as programs in a PLANNER-type language. Our own implementation is in QLISP (Reboh and Sacerdoti [1973]). Rules are summoned by pattern-directed function invocation.

Worlds have been implemented using the context mechanism of QLISP, which was introduced in QA4 (Rulifson et al. [1972]). The control structure necessary for the hypothetical worlds, which involve an actual splitting of the control path as well as the assertional data base, is expressed using the multiple environments (Bobrow and Wegbreit [1973]) of INTERLISP (Teitelman [1974]). The hypothetical world-splitting has been implemented, but we have yet to experiment with the various strategies for controlling it.

The existing system is capable of producing simple programs such as the union function, the program to sort two variables from Part II, or the loop-free segments of the pattern-matcher from Part III.

The generalization of specifications (Sections II-4 and III-5) is a difficult technique to apply without its going astray. We will develop heuristics to regulate it in the course of the implementation. Similarly, our approach to conjunctive goals (Section II-5) needs further explication.
B. Historical Context and Contemporary Research

Early work in program synthesis (e.g. Simon [1963], Green [1969], Waldinger and Lee [1969]) was limited by the problem-solving capabilities of the respective formalisms involved (the General Problem Solver in the case of Simon, resolution theorem proving in the case of the others). Our paper on loop formation (Manna and Waldinger [1971]) was set in a theorem-proving framework, and paid little attention to the implementation problems.
It is typical of contemporary program synthesis work not to attempt to restrict itself to a formalism; systems are more likely to write programs the way a human programmer would write them. For example, the recent work of Sussman [1973] is modelled after the debugging process. Rather than trying to produce a correct program at once, Sussman's system rashly goes ahead and writes incorrect programs which it then proceeds to debug. The work reported in Green et al. [1974] attempts to model a very experienced programmer. For example, if asked to produce a sort program, the system recalls a variety of sorting methods and asks the user which he would like best.

The work reported here emphasizes reasoning more heavily than the papers of Sussman and Green. For instance, in our synthesis of the pattern-matcher we assumed no knowledge about pattern-matching itself. Thus our system would be unlikely to ask the user what kind of pattern-matcher he would like. Of course we do assume extensive knowledge of lists, substitutions, and other aspects of the subject domain.

Although Sussman's debugging approach has influenced our treatment of program modification and the handling of simultaneous goals, we tend to rely more on logical methods than Sussman. Furthermore, Sussman deals only with programs that manipulate blocks on a table; therefore he has not been forced to deal with problems that are more crucial in conventional programming, such as the formation of conditionals and loops.
The work of Buchanan and Luckham [1974] (see also Luckham and Buchanan [1974]) is closest to ours in the problems it addresses. However, there are differences in detail between our approach and theirs: The Buchanan-Luckham specification language is first-order predicate calculus; ours allows a variety of other notations. Their method of forming conditionals involves an auxiliary stack; ours uses contexts and the Bobrow-Wegbreit control structures. In the Buchanan-Luckham system the loops in the program are iterative, and are specified in advance by the user as "iterative rules," whereas in our system the (recursive) loops are introduced by the system itself when it recognizes a relationship between the top-level goal and a subgoal. The treatment of programs with side effects is also quite different in the Buchanan-Luckham system, in which a model of the world is maintained and updated, and assertions are removed when they are found to contradict other assertions in the model. Our use of contexts allows the system to recall past states of the world and avoids the tricky problem of determining when a model is inconsistent. It should be added that the implementation of the Buchanan-Luckham system is considerably more advanced than ours.

C. Conclusions and Future Work
We hope we have managed to convey in this paper the promise of program synthesis, without giving the false impression that automatic synthesis is likely to be immediately practical. A computer system that can replace the human programmer will very likely be able to pass the rest of the Turing test as well.

Some of the approaches to program synthesis that we feel will be most fruitful in the future have been given little emphasis in this paper because they are not yet fully developed. For example, the technique of program modification, which occupied only one small part of the current paper, we feel to be central to future program synthesis work. The retention of previously constructed programs is a powerful way to acquire and store knowledge. Furthermore, program optimization and program debugging are just special cases of program modification.

Another technique that we believe will be valuable is the use of more visual or graphic representations that convey more of the properties of the object being discussed in a single structure. For example, we have found that the synthesis of the pattern-matcher could be made shorter and more intuitive by the introduction of the substitution notation of mathematical logic. If we represent an expression P as P(x1,...,xn), where x1,...,xn is the complete list of the variables that occur in P, then P(t1,...,tn) is the result of substituting variables xi by terms ti in P. We can then formulate the problem of pattern matching as follows:

Let pat = pat(x1,...,xn).
Find z such that
  if arg = pat(t1,...,tn) for some t1,...,tn
  then z = {<x1 t1>,...,<xn tn>}
  else z = NOMATCH.

Note that this specification includes implicitly the restriction that the match found be a most general match, because each of the variables xi actually occurs in pat. Therefore, the specifications do not need to be strengthened during the course of the synthesis.

We hope to experiment with visual representations in a variety of applications. Clearly, while the reasoning required is simplified by the use of pictorial notation, the handling of innovations such as the ellipsis notation in an implementation is correspondingly more complex.

ACKNOWLEDGEMENTS

We wish to thank Robert Boyer, Bertram Raphael, and Georgia Sutherland for giving detailed critical readings of the manuscript. We would also like to thank Nachum Dershowitz, Peter Deutsch, Richard Fikes, Akira Fusaoka, Cordell Green and his students, Irene Greif, Carl Hewitt, Shmuel Katz, David Luckham, Earl Sacerdoti, and Ben Wegbreit for conversations that aided in formulating the ideas in this paper. We would also like to thank Claire Collins and Hanna Zies for typing many versions of this manuscript.

This research was primarily sponsored by the National Science Foundation under grants GJ-36146 and GK-35493.
BIBLIOGRAPHY

1. Balzer, R. M. (September 1972), "Automatic Programming," Institute Technical Memo, University of Southern California/Information Sciences Institute.
2. Biermann, A. W., R. Baum, R. Krishnaswamy and F. E. Petry (October 1973), "Automatic Program Synthesis Reports," Computer and Information Sciences Technical Report TR-73-6, Ohio State University.
3. Bobrow, D. G. and B. Wegbreit (August 1973), "A Model for Control Structures for Artificial Intelligence Programming Languages," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence, 246-253, Stanford University, Stanford, California.
4. Boyer, R. S. and J S. Moore (1973), "Proving Theorems about LISP Functions," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence.
5. Buchanan, J. R. and D. C. Luckham (March 1974), "On Automating the Construction of Programs," Memo, Stanford Artificial Intelligence Project, Stanford, California.
6. Bundy, A. (August 1973), "Doing Arithmetic with Diagrams," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence, 130-138, Stanford University, Stanford, California.
7. Floyd, R. W. (1967), "Assigning Meanings to Programs," Proc. of a Symposium in Applied Mathematics, Vol. 19, (J. T. Schwartz, ed.), Am. Math. Soc., 19-32.
8. Green, C. C. (May 1969), "Application of Theorem Proving to Problem Solving," Proc. Intl. Joint Conf. on Artificial Intelligence, 219-239.
9. Green, C. C., R. Waldinger, R. Elschlager, D. Lenat, B. McCune, and D. Shaw (1974), "Progress Report on Program-Understanding Programs," Memo, Stanford Artificial Intelligence Project, Stanford, California.
10. Hardy, S. (December 1973), "Automatic Induction of LISP Functions," Essex University.
11. Hewitt, C. (1972), "Description and Theoretical Analysis (Using Schemata) of PLANNER: A Language for Proving Theorems and Manipulating Models in a Robot," AI Memo No. 251, MIT, Project MAC, April 1972.
12. Hoare, C. A. R. (October 1969), "An Axiomatic Basis for Computer Programming," C. ACM 12, 10, 576-580, 583.
13. Kowalski, R. (March 1974), "Logic for Problem Solving," Memo No. 75, Department of Computational Logic, University of Edinburgh, Edinburgh.
14. Luckham, D. and J. R. Buchanan (March 1974), "Automatic Generation of Programs Containing Conditional Statements," Memo, Stanford Artificial Intelligence Project, Stanford, California.
15. Manna, Z. and R. Waldinger (March 1971), "Toward Automatic Program Synthesis," Comm. ACM, Vol. 14, No. 3, pp. 151-165.
16. McCarthy, J. (1962), "Towards a Mathematical Science of Computation," Proc. IFIP Congress 62, North Holland, Amsterdam, 21-28.
17. Reboh, R. and E. Sacerdoti (August 1973), "A Preliminary QLISP Manual," Tech. Note 81, Artificial Intelligence Center, Stanford Research Institute, Menlo Park, California.
18. Robinson, J. A. (January 1965), "A Machine-Oriented Logic Based on the Resolution Principle," Jour. ACM, Vol. 12, No. 1, 23-41.
19. Rulifson, J. F., J. A. Derksen, and R. J. Waldinger (November 1972), "QA4: A Procedural Calculus for Intuitive Reasoning," Tech. Note 73, Artificial Intelligence Group, Stanford Research Institute, Menlo Park, California.
20. Simon, H. A. (October 1963), "Experiments with a Heuristic Compiler," Jour. ACM, Vol. 10, No. 4, 493-506.
21. Sussman, G. J. (August 1973), "A Computational Model of Skill Acquisition," Ph.D. Thesis, Artificial Intelligence Laboratory, M.I.T., Cambridge, Mass.
22. Teitelman, W. (1974), INTERLISP Reference Manual, Xerox, Palo Alto, California.
23. Waldinger, R. J., and R. C. T. Lee (May 1969), "PROW: A Step Toward Automatic Program Writing," Proc. Intl. Joint Conf. on Artificial Intelligence, 241-252.
A NEW APPROACH TO PROGRAM TESTING
James C. King, IBM T. J. Watson Research Center, Yorktown Heights, New York, USA
ABSTRACT: The current approach for testing a program is, in principle, quite primitive. Some small sample of the data that a program is expected to handle is presented to the program. If the program produces correct results for the sample, it is assumed to be correct. Much current work focuses on the question of how to choose this sample. We propose that a program can be more effectively tested by executing it "symbolically". Instead of supplying specific constants as input values to a program being tested, one supplies symbols. The normal computational definitions for the basic operations performed by a program can be expanded to accept symbolic inputs and produce symbolic formulae as output.
If the flow of control in the program is completely independent of its input parameters, then all output values can be symbolically computed as formulae over the symbolic inputs and examined for correctness.
When the control flow of the program is input dependent, a case analysis can be performed producing output formulae for each class of inputs determined by the control flow dependencies. Using these ideas, we have designed and implemented an interactive debugging/testing system called EFFIGY.
INTRODUCTION
As tools for realizing correct programs, program testing and program proving are the ends of a spectrum whose range is the number of times the program must be executed. To establish its correctness through testing, one must execute the program at least once for all possible unique inputs; usually an infinite number of times. To establish its correctness through a rigorous correctness proof, one need not execute the program at all; but he may be faced With a tedious, if not difficult, formal analysis. These two extreme points of the spectrum offer other contrasts as well. Correctness proofs usually ignore certain realities encountered in actual test runs, for example, machine dependent details like overflow and precision. (One notable effort to bring machine dependent issues into correctness proofs is the recent thesis by Sites [7]). On the other hand one may finish a proof of correctness, but seldom do we ever finish testing a program, Normal testing and correctness proofs also differ in the degree to which the user is required to supply a formal specification of "correct" program behavior. While a careful statement of correctness may be recommended for program testing, it is not required. A user
may choose an interesting input case and then decide a posteriori, in this specific case, if the output appears to be correct. In a formal proof of correctness one must have a careful program specification.
A testing tool is described in this paper which allows one to choose intermediate points on the spectrum between individual test runs and general correctness proofs. One can perform a single "symbolic execution" of the program that is equivalent to a large (usually infinite) number of normal test runs. Test run results can not only be checked by careful manual inspection but if a machine interpretable program specification is supplied with the program it can be used to automatically check the results. Furthermore, by varying the degree to which symbolic information is introduced into the symbolic execution one can move from normal execution (no symbolic data) to a symbolic execution which, in some cases, provides a proof of correctness.
SYMBOLIC EXECUTION
The notion of symbolically executing a program follows quite naturally from normal program execution. First assume that there is a given programming language and a normal definition of program execution for that language. This execution definition must be used for production executions but an alternative symbolic execution semantics for the language can be defined to great advantage for debugging and testing. The individual programs themselves are not to be altered for testing. The definition of the symbolic execution must be such that trivial cases involving no symbols should be equivalent to normal executions and any information learned in a symbolic execution should apply to the corresponding normal executions as well.
An execution of a procedure becomes symbolic by introducing symbols as input values in place of real data objects (e.g., in place of integers and floating point numbers). Here "inputs" is to be taken generally meaning any data external to the procedure, including that obtained through parameters, global variables, explicit READ statements, etc.
Choosing symbols to represent procedure inputs
should not be confused with the similar notion of using symbolic program variable names. A program variable may have many different specific values associated with it during a particular execution whereas a symbolic input symbol is used in the static mathematical sense to represent some unknown yet fixed value. Values of program variables may be symbols representing the non-specific procedure inputs.
Once a procedure has been initiated and given symbolic inputs, execution can proceed as in a normal execution except when the symbolic inputs are encountered. This occurs in two basic ways: computation of an expression involving procedure inputs, and conditional branching dependent on procedure inputs.
Computation of Expressions
The programming language has a set of basic computational operators such as addition (+), multiplication (*), etc. which are defined over data objects such as integers. Each operator must be extended to deal with symbolic data. For arithmetic data this can be done by making use of the usual relationship between arithmetic and algebra. The arithmetic computations specified by these operators can be "delayed" or generalized by the appropriate algebraic formula manipulations. For example, suppose the symbolic inputs α and β are supplied as argument values to a procedure with formal parameter variables A and B. Denote the value of a program variable X by v(X). Then initially, v(A) = α and v(B) = β. If the assignment statement C := A + 2*B were symbolically executed in this context, C would be assigned the symbolic formula (α + 2*β). The statement D := C - A, if executed next, would result in v(D) = 2*β.
Similar symbolic generalization can be done, at least in theory, for all computational operations in the programming language. In the most difficult case, one could at least record in some compact notation the sequence of computations which would have taken place had the arguments been non-symbolic. The success in doing this in practice depends upon how easily these recordings can be read and understood and how easily they can be subsequently manipulated and analyzed mechanically.
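The extension of the arithmetic operators to symbolic data can be sketched in a few lines. The following is only an illustration, not EFFIGY's formula manipulation package: the class name Sym and its linear-formula representation are assumptions made for this example, and only addition, subtraction and scaling by an integer constant are covered.

```python
class Sym:
    """A linear formula: integer coefficients over input symbols plus a constant.
    (Hypothetical helper for illustration; not part of EFFIGY.)"""

    def __init__(self, terms=None, const=0):
        # Drop zero coefficients so equal formulae have equal representations.
        self.terms = {s: c for s, c in (terms or {}).items() if c != 0}
        self.const = const

    @staticmethod
    def lift(x):
        """Treat an ordinary integer as a formula with no symbols."""
        return x if isinstance(x, Sym) else Sym(const=x)

    def __add__(self, other):
        other = Sym.lift(other)
        terms = dict(self.terms)
        for s, c in other.terms.items():
            terms[s] = terms.get(s, 0) + c
        return Sym(terms, self.const + other.const)

    __radd__ = __add__

    def __neg__(self):
        return Sym({s: -c for s, c in self.terms.items()}, -self.const)

    def __sub__(self, other):
        return self + (-Sym.lift(other))

    def __rsub__(self, other):
        return Sym.lift(other) - self

    def __mul__(self, k):
        if not isinstance(k, int):
            return NotImplemented  # the sketch only scales by integer constants
        return Sym({s: c * k for s, c in self.terms.items()}, self.const * k)

    __rmul__ = __mul__

    def __eq__(self, other):
        other = Sym.lift(other)
        return self.terms == other.terms and self.const == other.const

    def __repr__(self):
        parts = [s if c == 1 else f"{c}*{s}" for s, c in sorted(self.terms.items())]
        if self.const or not parts:
            parts.append(str(self.const))
        return " + ".join(parts)


# The example from the text: v(A) = alpha, v(B) = beta.
alpha, beta = Sym({"alpha": 1}), Sym({"beta": 1})
A, B = alpha, beta
C = A + 2 * B   # the "delayed" computation alpha + 2*beta
D = C - A       # simplifies to 2*beta
```

With no symbols involved the same operators reduce to ordinary arithmetic (Sym.lift(3) + 4 equals 7), which mirrors the requirement that trivial symbolic executions agree with normal ones.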
Conditional Branching
Consider the typical decision-making program statement, the IF statement, taking the form: IF B THEN S1 ELSE S2, where B is some Boolean valued expression in the language and S1 and S2 are other statements. Normally, either v(B) = true and statement S1 is executed or v(B) = false and statement S2 is executed. However, during a symbolic execution v(B) could be true, false or some symbolic formula over the input symbols. Consider the latter case. The predicates v(B) and ¬v(B) represent complementary constraints on the input symbols that determine alternative control flow paths through the procedure. For now, this case is called an "unresolved" execution of a conditional statement. The notion will be refined as the presentation develops. Since both alternative paths are possible the only complete approach is to explore both: the execution forks into two "parallel" executions, one assuming v(B), the other assuming ¬v(B).
Assume the execution has forked at an unresolved conditional statement and consider the further execution for the case where v(B) is assumed. The execution may arrive at another unresolved conditional statement execution with associated Boolean, say C. Expressions v(B) and v(C) are both over the procedure input symbols and it is possible that either v(B) ⊃ v(C) or v(B) ⊃ ¬v(C). Either implication being true would show that the assumption made at the first unresolved execution, namely v(B), is strong enough to resolve the subsequent test, namely to show that either v(C) or ¬v(C).
Because the assumptions made in the case analysis of one unresolved conditional statement execution may be effective in resolving subsequent unresolved statement executions they are preserved as part of the execution state, along with the variable values and the statement counter, and are called the "path condition" (denoted pc). At the beginning of a program execution the pc is set to true. The revised
rule for symbolically executing a conditional statement with associated Boolean expression B is to first form v(B) as before and then form the expressions:
pc ⊃ v(B)
pc ⊃ ¬v(B).
If pc is not identically false then at most one of the above expressions is true. If the first is true the assumptions already made about the procedure inputs are sufficient to completely resolve this test and the execution follows only the v(B) case. Similarly if the second expression is true it follows the ¬v(B) case.
Both of these cases are considered "resolved" or non-forking executions of the conditional
statement.
The remaining case when neither expression is true is truly an unresolved (forking) execution of the conditional statement. Even given the earlier constraints on the procedure inputs (pc), v(B) and ¬v(B) are both satisfiable by some non-symbolic procedure inputs. As discussed above, unresolved conditional statement executions fork into two parallel executions: one when v(B) is assumed, in which case the pc is revised to pc ∧ v(B), the other when ¬v(B) is assumed, in which case the pc becomes pc ∧ ¬v(B). Note that the forking is a property of a conditional statement execution, not of the statement itself. One execution of a particular statement may be resolved yet a later execution of the same statement may not.
The pc is the accumulator of conditions on the original procedure inputs which determine a unique control path through the program.
Each path, as forks are made, has its own pc.
No pc is ever identically false since the original pc is true and the only changes are of the form pc := pc ∧ q and those only in the case when pc ∧ q is satisfiable ((pc ∧ q) ≡ ¬(pc ⊃ ¬q), which is satisfiable if pc ⊃ ¬q is not a theorem). Each path caused by forking also has a unique pc since none are identically false and they all differ in some term, one containing a q the other a ¬q.
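The revised rule for conditional statements can be illustrated with a small sketch. Everything here is an assumption made for the example: predicates are modelled as Python functions over an assignment of integers to the input symbols, and the question "is pc ⊃ q a theorem?" is answered by exhaustive checking over a tiny finite domain, which merely stands in for the theorem prover used by a real system.

```python
from itertools import product

DOMAIN = range(-4, 5)  # assumption: a small finite domain replaces the theorem prover


def implies(pc, q, symbols):
    """True if every assignment over DOMAIN that satisfies pc also satisfies q."""
    for values in product(DOMAIN, repeat=len(symbols)):
        env = dict(zip(symbols, values))
        if pc(env) and not q(env):
            return False
    return True


def branch(pc, b, symbols):
    """Apply the revised rule to a test b under path condition pc.
    Returns the list of (branch label, new pc) successors."""
    if implies(pc, b, symbols):
        return [("true", pc)]                      # resolved: only the v(B) case
    if implies(pc, lambda e: not b(e), symbols):
        return [("false", pc)]                     # resolved: only the ¬v(B) case
    return [                                       # unresolved: fork
        ("true", lambda e: pc(e) and b(e)),        # pc := pc ∧ v(B)
        ("false", lambda e: pc(e) and not b(e)),   # pc := pc ∧ ¬v(B)
    ]


symbols = ["a"]
initial_pc = lambda e: True                        # the pc starts as true

# First test, a < 0, is unresolved under pc = true, so the execution forks.
first = branch(initial_pc, lambda e: e["a"] < 0, symbols)

# On the true path the pc is a < 0, which is strong enough to resolve a < 1.
second = branch(first[0][1], lambda e: e["a"] < 1, symbols)
```

The same test can therefore fork on one execution and be resolved on another, depending only on the pc in force when it is reached.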
SYMBOLIC EXECUTION TREE
One can characterize the symbolic execution of a procedure by an "execution tree".
Associate with
each program statement execution a node and with each transition between statements a directed arc connecting the associated statement nodes.
For each forking (unresolved) conditional statement the
associated execution node has more than one arc leaving it labeled by and corresponding to the path choices made in the statement.
In the previous discussion of IF statements there were two choices corresponding to v(B) and ¬v(B). The node associated with the first statement of the procedure would have no incoming arcs and the terminal statement of the procedure (RETURN or END statement) is represented by a node with no outgoing arcs.
Also associate the complete current execution state, i.e., variable values, statement counter, and pc with each node.
In particular, each terminal node will have a set of program variable values given as
formulae over the procedure input symbols, and a pc which is a set of constraints over the input
symbols characterizing the conditions under which those variable values would be computed. A user can examine these symbolic results for correctness as he would normal test output or substitute them into a formal output specification which should then simplify to true.
The execution tree for a program will be infinite whenever the program contains a loop for which the number of iterations is dependent, even indirectly, on some procedure inputs. It is this fact that prevents symbolic execution from directly providing a proof of correctness technique.
Symbolic execution is indeed an execution and, at least in the simplest form described here, provides an advanced testing methodology. Burstall [1] has independently developed the notion of symbolic execution and added the induction step needed to have a complete proof of correctness method. Deutsch [2], also independently, developed the notion of symbolic execution as an implementation technique for an interactive program prover based on Floyd's method [3]. In fact, one can see the basic elements of the notion of using symbolic execution as the basis for a correctness method in the earlier work of Good [4]. The author and his colleagues have been pursuing the idea of symbolic execution in its own right as a debugging/testing technique. A particular system we have built called EFFIGY is described briefly in the next section.
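Why an input dependent loop makes the execution tree infinite can be seen in a depth-bounded sketch. It is specialised to a single counting loop (Z = 0; while X > 0: X = X - 1; Z = Z + Y, with symbolic inputs a and b for X and Y); the function name, the bound, and the representation of the pc as integer bounds on a are all assumptions made for this illustration.

```python
def explore_loop(max_iterations):
    """Enumerate terminal nodes of the execution tree of
           Z = 0; while X > 0: X = X - 1; Z = Z + Y
    for symbolic inputs X = a, Y = b, giving up after max_iterations.
    After k iterations X = a - k and Z = k*b. The pc is a pair (lo, hi)
    meaning lo <= a <= hi, where None means "no bound"."""
    leaves = []
    lo = None                                  # pc on the looping path: a >= lo
    for k in range(max_iterations + 1):
        # The test X > 0 is a >= k + 1. Under the pc a >= k it is unresolved,
        # so the execution forks at every iteration.
        # False branch: pc ∧ (a <= k). The loop exits; this is a terminal node.
        leaves.append({"pc": (lo, k), "X": f"a-{k}", "Z": f"{k}*b"})
        # True branch: pc ∧ (a >= k + 1). The body runs and the test is met again.
        lo = k + 1
    return leaves, lo                          # lo bounds the unexplored subtree


leaves, frontier = explore_loop(2)
```

Each terminal node carries symbolic variable values and its own pc (a <= 0, a = 1, a = 2), and however large the bound is chosen an unexplored subtree with pc a >= frontier always remains.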
EFFIGY -- AN INTERACTIVE SYMBOLIC EXECUTOR
The author and his colleagues at IBM Research have been developing an interactive symbolic execution system for testing and debugging programs written in a simple PL/I style programming language. The language is restricted to integer valued variables and vectors (one dimensional arrays). The system has many interactive debugging features including:
execution tracing, break-points, and state saving/restoring.
Of course, it provides symbolic execution and uses a formula manipulation package and theorem prover developed previously by the author [5, 6].
The general facilities and capabilities available are all that is of real interest and these are perhaps simplest and most economically explained by a system demonstration. An APPENDIX is included which shows an actual script (annotated in italics) from such a demonstration. A method for exploring execution trees with their multitude of forks and parallel executions is up to the user. He is provided the ability to choose particular forks at unresolved conditional statement executions (via go true, go false, and assume) and also has the state save/restore ability so that he may return to unexplored alternatives later. We are currently experimenting with various "test path-managers" which would embody some heuristics for automating this process, exhaustively exploring all the "interesting" paths. As with previous testing methods the crucial issue is: if one cannot execute all cases, which ones should he do; which are the interesting ones?
We are also working on practical methods for dealing with more advanced programming language features such as pointer variables. While, as mentioned above, most such enhancements are straightforward "in theory", many offer fundamental problems in practice.
CONCLUSION
Interactive debugging/testing systems have shown themselves to be powerful, useful tools for program development. A symbolic execution capability added to such a system is a major improvement. The normal facilities are always available as a special case. In addition, the basic system components of a symbolic executor provide a convenient toolbox for other forms of program analysis, including program proving, test case generation, and program optimization. Since such a system does offer a natural growth from today's systems, an evolutionary approach for achieving the systems of tomorrow is available. Valuable user experience and support is also provided. While practical use of the EFFIGY system is still quite limited, considerable insight into and understanding of the general notion of symbolic execution has been gained during its construction.
ACKNOWLEDGMENTS
The colleagues at IBM Research collaborating with me in this work are: S. M. Chase, A. C. Chibib, J. A. Darringer, and S. L. Hantler. They have all contributed significantly to the ideas presented here and to the design and implementation of our EFFIGY system.
We also appreciate the support and
encouragement received from D. P. Rozenberg, P. C. Goldberg, and P. S. Dauber. The manuscript was typed by J. M. Hanisch.
REFERENCES
[1] Burstall, R. M. Program proving as hand simulation with a little induction, IFIP Congress 74 Proc., Aug. 1974, pp. 308-312.
[2] Deutsch, L. P. An interactive program verifier, Ph.D. dissertation, Dept. Comp. Sci., Univ. of Calif., Berkeley, CA, May 1973.
[3] Floyd, R. W. Assigning meanings to programs, Proc. Symp. Appl. Math., Amer. Math. Soc., vol. 19, pp. 19-32, 1967.
[4] Good, D. I. Toward a man-machine system for proving program correctness, Ph.D. dissertation, Comp. Sci. Dept., Univ. of Wisc., Madison, Wisc., June 1970.
[5] King, J. C. and Floyd, R. W. An interpretation oriented theorem prover over integers, Journal of Comp. and Sys. Sci., vol. 6, no. 4, August 1972, pp. 305-323.
[6] King, J. C. A program verifier, IFIP Congress 71 Proc., Aug. 1971, pp. 235-249.
[7] Sites, R. L. Proving that computer programs terminate cleanly, Ph.D. dissertation, Comp. Sci. Dept., Stanford Univ., Stanford, CA, May 1974.
APPENDIX
A script from an actual EFFIGY session is shown below. The user's inputs are in lowercase letters and the system responses are in uppercase letters. To prevent any possible confusion the symbol "~" is shown here to the left of the user inputs. Explanatory comments, in italic letters, have been added as a right hand column.
When EFFIGY is initially invoked it is in an "immediate" mode and will execute statements as they are typed.
Any statement executed in this context is considered part of a main initial procedure called
MAIN. The concept of the MAIN procedure and the concept of immediate execution are distinct since statements can also be executed in an immediate mode in the context of other procedures. MAIN is unique in that it has an immediate mode only and it is the only procedure privileged to execute the managerial system commands. Programs are made available to EFFIGY for stored program execution by declaring them, in MAIN, with the PROC statement similar to the way that internal procedures are declared in PL/I. However, EFFIGY does consider all procedures as EXTERNAL and they must be declared in MAIN.
Procedures are tested by a CALL from MAIN. Symbolic inputs can be supplied by enclosing a symbol string in double quotes, e.g., "a", "Dog". These symbolic constants can be used in most places instead of integer constants.
The system responses drop the quotes since the context always makes the
distinctions between different uses of identifiers quite clear.
Values always involve the input symbols
and never program variable names. Formulae are stored internal to EFFIGY in a "normalized" form and some of the expressions may appear quite different from what one might expect (e.g., A < B will be typed out as A-B¬>-1). The formulae are also kept in a simplified form (e.g., 2 * B = 4 is stored as B-2=0). EFFIGY runs on CMS under VM/370 on an IBM/370 model 168. The CMS filing system and context editor are used as an integral part of EFFIGY for creating, changing, and storing procedures and command files. The INPUT command directs EFFIGY to read its input from the designated file (files have two part names in CMS) instead of directly from the user's terminal. As procedures are entered into EFFIGY (by a PROC ... END declaration) the statements are sequentially numbered. These statement numbers are used to reference particular points in the procedure for inserting breakpoints, turning tracing on and off, etc.
~effigy                                           Invoke the EFFIGY system.
EFFIGY READY
~edit absolute effigy;                            Invoke the CMS file editor and type in a new
NEW FILE:                                         file called 'absolute effigy'.
~input
~absolute: proc(i,o);
dcl (i,o) integer;
if i<0 then o = -i
else o = i;
end;
                                                  (End of input signified by null line.)
~file                                             Save file permanently & go back to EFFIGY.
~input absolute effigy;                           Have EFFIGY read input from that file.
1: ABSOLUTE: PROC(I,O);
2: DCL (I,O) INTEGER;                             Statements are numbered by EFFIGY.
3: IF I<0 THEN O = -I
4: ELSE O = I;
5: END;                                           Last line of file -- back to terminal input.
~dcl z integer;                                   Declare a variable in MAIN.
~call absolute(55,z); display z;                  Try a numeric execution.
55                                                Result of display statement.
~call absolute(-66,z); display z;
66
~in absolute; turn all on all;                    All tracing on in proc. 'absolute'.
~in main;                                         Set back to MAIN.
~call absolute("a",z);                            Try a symbolic input "a".
1: ABSOLUTE: PROC(I,O);                           Each statement execution is traced by printing it.
2: DCL (I,O) INTEGER;
3: IF I < 0 THEN O = -I
((A¬>-1))                                         Evaluated result of I
TYPE GO TRUE OR GO FALSE
~before 5;
~savestate;
STATE 1 SAVED
~go true;
((A¬>-1)) TRUE BRANCH
O=-A
STOPPED BETWEEN 3 AND 5
~display variables, assumption;
IN ABSOLUTE
I=A
O=-A
((A¬>-1))                                         Current pc (assumption).
~restore 1;                                       Return to execution state 1,
STATE 1 RESTORED.
IN ABSOLUTE                                       and try else path.
~go false;
((A¬>-1))                                         v(I
FALSE BRANCH
4: ELSE O = I;
O=A                                               New value of O.
STOPPED BETWEEN 4 AND 5                           Before 5.
~display variables, assumption;
IN ABSOLUTE
I=A
O=A
((A¬<0))
~xgo;                                             Resume execution and delete breakpoint.
BACK FROM ABSOLUTE TO MAIN
~display z;
A
~in absolute; turn all off all;                   Turn all tracing off.
~in main;
~erase assumption;                                Reset pc to true.
~call absolute("a" - "b",z); go true; display z;
TYPE GO TRUE OR GO FALSE                          Go true above anticipates question.
-A+B
~edit absolute effigy;                            Invoke editor to change absolute.
~next                                             Edit command to look at line 1 of file.
~change /absolute/newabs/                         Change proc name.
~bottom                                           Go to end of file.
~up 1                                             Well not quite.
~input assert(o eq abs(i));                       Insert a correctness specification.
~file newabs                                      File away as newabs effigy. Go back to EFFIGY.
~input newabs effigy;                             Enter into EFFIGY.
1: NEWABS: PROC(I,O);
2: DCL (I,O) INTEGER;
3: IF I<0 THEN O = -I
4: ELSE O = I;
5: ASSERT(O EQ ABS(I));                           New statement.
6: END;
~erase assumption;
~call newabs("a",z); go true; display z, assumption;
TYPE GO TRUE OR GO FALSE                          Response was anticipated on previous line.
((abs(A)+A =0)) :: TRUE                           Result of executing assert (statement 5),
                                                  of form l :: r where... l is evaluated assertion
                                                  and r is result of pc ⊃ l.
-A                                                Result of display z and assumption for line
((A¬>-1))                                         typed earlier.
~erase assumption;
~call newabs("a",z); go false; display z, assumption;
TYPE GO TRUE OR GO FALSE                          Try only other case.
((abs(A)-A =0)) :: TRUE                           That also gets proved.
A                                                 Have correctness proof--both paths correct.
((A~<0)) ~erase assumption; Now read in procedure times. ~ i n p u t t i m e s effigy; I: T I M E S : P R O C ( X r Y , Z ) ; 2: D C L (X,Y,Z) INTEGER; 3: Z=0; 4: IF X<0 T H E N 4: DO; 5: C A L L A B S O L U T E ( X , X ) ; Times calls absolute. 6: Y=-Y; 7: END; 8: L: it multiplies by looping add. 8: IF X>0 T H E N 8: DO; 9: X:X-I; 10: Z=Z+Y; 11: GO TO L; 12: END; 13: END; (Try some numbers. ~call times(3,5,z); d i s p l a y z; 15 call t i m e s ( - 3 , 5 , z ) ; d i s p l a y z; -15 call t i m e s ( - 3 4 , " b " , z ) ; d i s p l a y z; A mixed case--determinate control flow.
-34"B ~n times; t u r n all call times ( "a", "b", 4: IF X < 0 T H E N ((Am>-1) ) T Y P E GO T R U E O R G O savestate ; STATE 2 SAVED go t r u e ; ( (i~>-I ) ) TRUE BRANCH 5 : CALL ABSOLUTE
on 4 5 6 8 9 10; b e f o r e 13; in main; z) ; The completely symbolic case. DO; FALSE
( X , X) ;
Executed a resolved I F in absolute.
Knows A < - I .
6: Y=-B 8:
Y = L:
- Y; IF
X > 0 THEN
DO;
((i~>-1)) TRUE
BRANCH
9: X = X - I; X=-A- I 10: Z = Z + Y; Z=-B 8: L: IF X > 0 T H E N DO; ((i~>-2)) TYPE GO TRUE OR GO FALSE ~go true; ((A~>-2)) TRUE BRANCH 9: X = X I; X=-A-2 10: Z = Z + Y; Z:-2*B 8: L: IF X > 0 T H E N DO; ((A~>-3)) TYPE GO TRUE OR GO FALSE }go false; ((i~>-3)) FALSE BRANCH STOPPED BETWEEN 8 AND 13 ~display variables, assumption; IN T I M E S X=-A-2 Y=-B Z=-2*B ( (i =-2) ) ~ restore 2; STATE 2 RESTORED. IN TIMES ~go false; ( (A~>- I ) ) FALSE BRANCH 8: L: IF X > 0 T H E N DO; ((A~4) ; ~go; ((A~ 0 T H E N DO; ((A-~<2)) TRUE BRANCH 9: X = X - I; X:A-2 10: Z = Z + Y; Z=2*B 8; L: I F X > 0 T H E N DO; ((A~<3)) TRUE BRANCH 9: X : X - I; X=A-3 10: Z = Z + Y;
Another resolved IF. A < - I so - / i > 0 .
Loop around.
Now go out to end o f proc.
Breakpoint at end o f proc.
Path choices determine A = - 2 .
Try another case.
A d d this assumption to the pc, Now retry the I F with new pc. New pc resolves it.
Assume carries us through this one too.
Z:3*B 8: L: IF X > 0 T H E N DO; ((A~<4)) TRUE BRANCH 9: X = X I; X=A-4 10: Z : Z + Y; Z=4*B 8: L: IF X > 0 T H E N DO; ((A~<5)) TRUE BRANCH 9: X = X - I; X=A- 5 10: Z : Z + Y; Z=5*B 8: L: IF X > 0 T H E N DO; ((A~<6)) TYPE GO TRUE OR GO FALSE Unresolved when X gets to A - 5 . ~go false; Leave loop ((i~<10)) FALSE BRANCH STOPPED BETWEEN 8 AND 13 ~display variables, assumption; IN TIMES X=A- 5 Y=B Z=5*B ( (A :5) ) restore 2; Go back and try another case. STATE 2 RESTORED. IN T I M E S ~assume ( "a" e q "b" g "b" e q 2) ; Indirectly assume A is 2. go ; Does that assume resolve the if? ( (Am>- I ) ) FALSE BRANCH Yes it does. 8: L: IF X > 0 T H E N DO; ((A~ 0 T H E N DO; ((i~<2)) TRUE BRANCH 9: X : X - I; X =A - 2 10: Z = Z + Y; Z=2*B 8: L: IF X > 0 T H E N DO; ((A~<3)) FALSE BRANCH STOPPED BETWEEN 8 AND 13 ~disp!ay variables, assumption; IN T I M E S X:A-2 Y=B Result still in symbolic terms. Z:2*B ( ( A - B = 0 & B =2)) Does it k n o w Z is really 4. ~assert(z e q 4) ; Yes. ( (B =2)) : : TRUE Go on out o f times. ~go; IN M A I N
Response m previous null line.
~ e d i t 'times' effigy; ~next ~input a s s u m e ( x eq "x0"
Edit times procedure, In editor.
& y eq "y0") ; Insert correctness specifications.
~bottom ~up I ~input assert(z ~file ~erase ~erase ~input 1: 2:
eq
"x0"
* "y0");
times; assumption; times effigy; TIMES:PROC(X,Y,Z); A S S U M E ( X EQ "X0"
Replace original procedure and go back m EFFIGY, Can't have two times routines. Input from "times" file.
& Y EQ "Y0") ; Used to name input values,
3: DCL (X,Y,Z) INTEGER; 4: Z=0; 5: IF X<0 T H E N 5: DO; 6: CALL ABSOLUTE(X,X); 7: Y=-Y; 8: END; 9: L: 9: IF X>0 T H E N 9: DO; 10: X=X-I; 11: Z=Z+Y; 12: GO TO L; 13: END; 14: A S S E R T ( Z EQ "X0" * "Y0"); Relate input values m output.
15: END; ~ i n times; t u r n all on 5 9 14; Selectively trace. ~ i n main; ~assume("a">4 & "a"<5); N o integer between 4 and 5. C O N T R A D I C T I N G A S S U M P T I O N . IGNORED. ~assume("a">4 & "a"<7) ; ttow about A ~ 5 or 6, ~call times("a","b",z); 5: IF X < 0 T H E N DO; (A~>- I ) ) FALSE BRANCH For 5 and 6 X > O. 9: L: IF X > 0 T H E N DO; (i~<1) ) TRUE BRANCH For 5 and 6 loop some too. 9: L: IF X > 0 T H E N DO; (i~<2) ) TRUE BRANCH 9: L: IF X > 0 T H E N DO; (A~<3)) TRUE BRANCH 9: L: IF X > 0 T H E N DO;
(A~<4)) TRUE BRANCH 9: L: IF X (A~<5) ) TRUE BRANCH 9: L" IF X (A~<6) ) T Y P E GO T R U E ~ g o true; (A~<6) ) TRUE BRANCH 9: L: IF X
> 0 THEN
DO;
> 0 T H E N DO; OR GO F A L S E
> 0 THEN
DO;
Now must decide 5 or 6. Pick 6,
((A~<7)) FALSE B R A N C H 14: A S S E R T ( Z EQ X0 * Y0); ((6*B-X0*Y0 =0)) :: TRUE ~ d i s p l a y assumption; ((h =6$A-X0 =0&B-Y0 =0)) ~ d i s p l a y variables; IN M A I N ABZOLUTE=PROC Z=6*B NEWABS=PROC TIMES=PROC ~quit
Known not > 6. Results check by assert--O.K. What is the pc? Relates the symbolic inputs to the names given to inputs by assume in proc. M A I N has variables and values too. Value "PROC" means it is a procedure.
Leave E F F I G Y system.
INTERPROCEDURAL ANALYSIS AND THE INFORMATION DERIVED BY IT
F. E. Allen
Computer Sciences Department
IBM T. J. Watson Research Center
Yorktown Heights, New York 10598 USA
ABSTRACT
Well structured programs are usually expressed as a system of functionally oriented procedures. By analyzing and transforming an entire system of procedures, linkages can be modified or eliminated and interprocedural data dependencies documented to the user. This paper presents some of the methods being developed to effect such interprocedural analysis and transformations.
1. INTRODUCTION
As part of the effort to improve programmer productivity and system reliability, a number of excellent guidelines have emerged for the programmer: "write in a high level language", "avoid goto's and external variables" ("use the parameter passing mechanism instead"), "write small, functionally oriented routines", "annotate and document the programs carefully", etc. Furthermore a number of languages and language constructs have been developed to support (and enforce) some of these techniques. While these and other developments in programming methodology have greatly increased the potential for improved programmer productivity and system reliability, there are some major problem areas requiring attention. In this paper we consider one aspect of the problem of developing, managing and maintaining the entire collection of procedures which will typically exist in a large system, particularly one which has been developed in a top-down style using many small, functionally oriented routines.
The context in which we will be considering this problem is that of a compiling system. We will be concerned with collections of procedures (and functions) written in a high level language. Both nested and external procedures are considered. Since compilers traditionally compile only one external procedure at a time, a quite radical departure from the traditional design is required; indeed the compiler should be viewed as one component of an entire system which interfaces with the user and manages his programs. The design of such a compiling system will not be discussed further in this paper. However, most of the ideas presented here have been, or are being implemented in an Experimental Compiling System (ECS) currently under development. Since this system is PL/I oriented, the methodology being developed is designed to accommodate the many features supported by that language and hence should be applicable to a number of other languages.
In this paper a method (actually a composite of methods) is presented which analyzes the collection of procedures which constitute all or part of a program. The analysis determines the possible control flow and data flow within each procedure and between the procedures. The next section, Section 2, lists some of this information and references the Appendix, which contains a program analyzed by ECS. Section 3 gives the algorithm used to develop the information. A brief discussion of possible uses of the information in developing, managing and transforming programs is given in Section 4. We conclude with a summary, acknowledgements and a bibliography.
2. INFORMATION DERIVED
As a result of performing the analysis to be outlined in the next section, a great deal of information is obtained about the possible data and control relationships in the program. Some of the information which is produced is:
a. the call graph showing the possible invocation relationships in the collection
b. a control flow graph for each procedure
c. the data flow within each procedure
d. the control flow between procedures
e. the data flow between procedures.
The example given in the appendix shows the type of information currently being produced by the Experimental Compiling System. The example has been chosen to illustrate the information available rather than the PL/I features supported by ECS. Using the example, we now give a more detailed discussion of some of the interprocedural information collected by the analysis.
2.1 The Call Graph
Given a collection of procedures, (P_1, P_2, ... P_n), referencing relationships between the procedures can be expressed by a directed graph C = (N,E) of nodes n_i ∈ N and edges e_i ∈ E in which
a. each node, n_i, represents a procedure, P_i, and
b. each edge (n_j, n_k) = e_i ∈ E, represents one or more references in procedure P_j to procedure P_k.
Such a graph C is termed a call graph.
Although methods [7,8] are currently being developed for analyzing programs which contain recursive procedures, in this paper we will restrict our attention to non-recursive procedures. The call graph in Figure 1 depicts the collection of procedures A, B, C, D, E. The main procedure, A, contains references to procedures B and C; procedure, B, references C, D, and E; and procedure, D, references C and E.
It should be noted that the call graph is not a control flow graph since returns are not shown. Figure A2 in the appendix shows the call graph produced by ECS for the partial program given there.
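A call graph of this kind, together with a check that the collection is non-recursive, can be sketched as follows. The procedure names P1 to P4 and the reference list are hypothetical and are not taken from Figure 1 or from ECS; the two function names are likewise assumptions made for the illustration.

```python
def build_call_graph(references):
    """references: iterable of (caller, callee) pairs, one per reference site.
    Returns (N, E). Several references from one procedure to another
    collapse into a single edge, as in the definition of the call graph."""
    edges = set(references)
    nodes = {p for edge in edges for p in edge}
    return nodes, edges


def is_recursive(nodes, edges):
    """True if some procedure can reach itself, directly or indirectly."""
    succ = {n: set() for n in nodes}
    for caller, callee in edges:
        succ[caller].add(callee)
    WHITE, GREY, BLACK = 0, 1, 2               # unvisited, on the DFS stack, finished
    colour = dict.fromkeys(nodes, WHITE)

    def visit(n):
        colour[n] = GREY
        for m in succ[n]:
            # Meeting a GREY node means an edge back into the current call chain.
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


# Hypothetical collection: P1 references P2 twice and P3 once, and so on.
refs = [("P1", "P2"), ("P1", "P3"), ("P1", "P2"), ("P2", "P4"), ("P3", "P4")]
N, E = build_call_graph(refs)
```

Because returns are not represented, the edges only record who may invoke whom; adding a reference from P4 back to P1 would make the collection recursive and is_recursive would report it.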
2.2 The Control Flow Graph
For each procedure the flow relationships are depicted by a "control flow graph". A control flow graph is a directed graph in which the nodes represent basic blocks and the edges represent control flow paths. A basic block is a linear sequence of program instructions having one entry point (the first instruction executed) and one exit point (the last instruction executed).
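One common way of obtaining such a graph from a linear instruction sequence is to mark block "leaders" and cut the sequence at them. The sketch below assumes a hypothetical instruction encoding, (kind, target) pairs, and treats calls as ordinary instructions; it is not a description of how ECS forms its blocks.

```python
def basic_blocks(instrs):
    """instrs: list of (kind, target) pairs, kind in {"op", "branch", "jump", "return"}.
    target is an instruction index for "branch" (conditional) and "jump"
    (unconditional), otherwise None. Returns (blocks, edges): blocks are lists
    of instruction indices, edges are pairs of block numbers."""
    if not instrs:
        return [], set()
    leaders = {0}                                  # the first instruction
    for i, (kind, target) in enumerate(instrs):
        if kind in ("branch", "jump"):
            leaders.add(target)                    # any transfer target
        if kind != "op" and i + 1 < len(instrs):
            leaders.add(i + 1)                     # whatever follows a transfer
    starts = sorted(leaders)
    blocks = [list(range(s, e)) for s, e in zip(starts, starts[1:] + [len(instrs)])]
    block_of = {b[0]: n for n, b in enumerate(blocks)}
    edges = set()
    for n, b in enumerate(blocks):
        last = b[-1]
        kind, target = instrs[last]
        if kind in ("branch", "jump"):
            edges.add((n, block_of[target]))       # taken transfer
        if kind in ("op", "branch") and last + 1 < len(instrs):
            edges.add((n, block_of[last + 1]))     # fall through
    return blocks, edges


# Hypothetical IF-THEN-ELSE: test, then-part, jump around the else-part, return.
program = [
    ("op", None),       # 0: assignment
    ("branch", 4),      # 1: if the test fails go to 4
    ("op", None),       # 2: then-part
    ("jump", 5),        # 3: skip the else-part
    ("op", None),       # 4: else-part
    ("return", None),   # 5
]
blocks, edges = basic_blocks(program)
```

Every block produced this way is entered only at its first instruction and left only at its last, which is the one-entry, one-exit property stated above.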
Figure A5 in the appendix shows the control flow graph for the outer procedure, EXAMPLE. The blocks are arbitrarily numbered and each block in the printout shows its number and the serial numbers of the source statements in the block. Block 1 is a dummy block and block 2 contains everything up through the IF test. Block 3 contains the first call to SUB and block 4 contains the branch around the ELSE clause which will be executed on return from the first call. Block 5 (for statement 8) has the second call and block 6 has the return statement.
2.3
The Data Flow Within Each Procedure
For each procedure two types of data flow information are obtained: "definition-use" relationships and "live" information.

Using the notation $X_i^d$ to denote the definition of data item X in block $b_i$ and $X_j^u$ to denote the use of X in block $b_j$, a definition-use relationship (or simply, a def-use relation) exists between them if the value created by the definition at $b_i$ can be used at $b_j$. With this notation (introduced in [3]), a def-use relation can be expressed by the pair $(X_i^d, X_j^u)$. Such a relationship can exist only if there is a path from $b_i$ to $b_j$ which does not contain a redefinition of the data item. Consider the example in Figure 2. The def-use pairs are (A1^d, A~) and (A~, A~).

It should be noted that the term "data item" rather than variable was used in defining the relationship. The same data item can have several aliases which must all be reconciled if the information is to be useful. These aliases can result from parameter-argument associations, the use of pointers or simply by overlaying storage.

In Figure A6 in the appendix the def-use relationships for the outer procedure, EXAMPLE, are shown. Here we see some of the effects of interprocedural analysis on local def-use information. Variables A, B, and C are all passed to SUB; A and B are used in SUB and C (via parameter Z) is modified. The def-use information in Figure A6 reflects this. A, for example, is shown as being defined at statement 3 in basic block 2 and used in blocks 3 and 5, the two basic blocks which contain the calls to SUB. On the other hand C is shown as being defined in statements 5, 7, and 8 but not used. (C is, in fact, "dead" and interprocedural optimization might eliminate it.)

The second form of data flow information is live information. Given a def-use relation $(X_i^d, X_j^u)$, then $X_i^d$ is live on the edges of any path from $b_i$ to $b_j$ which does not contain a redefinition of X. For example, A~ is live on edges 3 to 4, and 4 to 2 in Figure 2.
2.4
The Control Flow Between Procedures
Not only are the usual calling relationships in a system of procedures exposed, but certain non-nested control transfers (abnormal returns) are also found.
2.5
The Data Flow Between Procedures
When one procedure references another, certain data items are mutually accessible. These are data items which are passed as arguments, are defined as global variables, have the same scope, or are indirectly accessible through pointers, overlays, etc. At each call point the data items which are referenced and/or modified as a result of the call are identified.

The Experimental Compiling System has a listing annotator which automatically inserts comments into the source listing at certain points. These comments contain some of the interprocedural flow information. Figure A8 shows the partial result of such an annotation at the call point.
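The def-use and live definitions of Section 2.3 can be sketched as two small graph searches. The sketch below is in Python and is an illustration, not the method used in ECS. It assumes block-level `defs` and `uses` sets in which any use inside a block precedes a definition in the same block, and it takes the flow graph of SUB (edges 1-2, 2-3, 3-4, 3-5, 4-3) and the placement of the loop variable I from the appendix figures. All function names are hypothetical.

```python
def clear_reach(succ, defs, item, start):
    """Blocks reachable from `start` on paths that do not redefine `item`
    strictly between `start` and the block reached."""
    seen, stack = set(), list(succ[start])
    while stack:
        b = stack.pop()
        if b in seen:
            continue
        seen.add(b)
        if item not in defs.get(b, set()):   # a redefinition ends the path
            stack.extend(succ[b])
    return seen


def def_use_pairs(succ, defs, uses, item, i):
    """Pairs (i, j): the definition of `item` in block i can be used in j."""
    return {(i, j) for j in clear_reach(succ, defs, item, i)
            if item in uses.get(j, set())}


def live_edges(succ, defs, uses, item, i):
    """Edges on which the definition of `item` in block i is live."""
    reached = clear_reach(succ, defs, item, i)
    # Blocks at whose exit the value defined in block i is still intact.
    carries = {i} | {b for b in reached if item not in defs.get(b, set())}
    # Blocks at whose entry some use of `item` can still be reached.
    wanted = {b for b in succ if item in uses.get(b, set())}
    changed = True
    while changed:
        changed = False
        for b in succ:
            if (b not in wanted and item not in defs.get(b, set())
                    and any(s in wanted for s in succ[b])):
                wanted.add(b)
                changed = True
    return {(a, b) for a in carries for b in succ[a] if b in wanted}


# Flow graph of SUB; I is defined in blocks 2 and 4 and used in blocks 3 and 4.
SUB_SUCC = {1: [2], 2: [3], 3: [4, 5], 4: [3], 5: []}
I_DEFS = {2: {"I"}, 4: {"I"}}
I_USES = {3: {"I"}, 4: {"I"}}
```

Under these assumptions the definition of I in block 2 reaches uses in blocks 3 and 4 and is live on edges 2 to 3 and 3 to 4, which agrees with the corresponding rows of Figure A4.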
3.
ANALYSIS METHOD
Given a collection of procedures, P1, P2, ..., Pn, which constitute all or part of a program P, the problem which we want to consider in this section is how to derive the information listed in Section 2. We will draw heavily on the material in the literature, particularly on the paper Interprocedural Data Flow Analysis [1]. In order not to complicate the presentation, we will initially assume that the collection is complete, i.e., all of the procedures referenced are in the collection. It will later be evident that this requirement can be relaxed but will result in less accurate (but not incorrect) information being produced.

Before giving the analysis approach, a basic question needs to be resolved: in what order should the procedures be analyzed? The dilemma posed by this question can be illustrated by the procedures in Figure 3. If S is analyzed first we cannot determine what is defined and used by the CALL statement: G, A, and B may each be defined and/or used. We cannot, therefore, accurately deduce the data flow of S. If T is analyzed first we don't know whether or not X, Y and G are aliased in any way: X and Y might refer to the same actual argument which also might or might not be G. Hence the definition of X may also be defining Y and/or G. Again our data flow information might be inaccurate. T could be analyzed in its reference context in S. However, if there are many references to T this could be very costly.

In [1] this dilemma is resolved by choosing the "inverse invocation order". In that paper it was assumed that a "worst case" estimate was always made regarding certain interferences such as between X, Y, and G in Figure 3. In [9] the notion of an initial estimate is introduced which, if the estimate is based on an actual examination of the program, is more accurate than a "worst case" estimate. This approach is the one actually used in the Experimental Compiling System. The basic algorithm is now given.
Algorithm for Interprocedural Analysis

Step 1. Establish an initial estimate (actually an overestimate) on the control and data relationships in P.

Step 2. Establish an order for processing the procedures based upon the invocation relationships deduced as part of step 1 or, if the process is iterated, the more refined invocation relationships which can be determined from the information collected in Step 3.

Step 3. Establish the control and data relationships in P by processing the procedures in the order determined in Step 2 by using either the flow relationships already deduced for procedures appearing earlier in the processing order or the estimate on the control and data relationships.

Step 4. If desired, update the estimates with the information collected in step 3 and repeat steps 2, 3, and 4.
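The four steps can be read as a small driver loop. The following Python sketch only shows the shape of that loop; the callables `estimate`, `order`, `analyze` and `refine`, the `rounds` parameter and the toy call graph are all hypothetical stand-ins introduced for illustration and are not part of ECS.

```python
def interprocedural_analysis(procedures, estimate, order, analyze, refine, rounds=1):
    """Skeleton of steps 1-4; every callable is supplied by the caller."""
    est = estimate(procedures)                 # Step 1: initial (over)estimate
    results = {}
    for _ in range(rounds):
        results = {}
        for name in order(est):                # Step 2: processing order
            # Step 3: use results for procedures already processed,
            # otherwise fall back on the estimate.
            results[name] = analyze(name, results, est)
        est = refine(est, results)             # Step 4: update the estimates
    return results


# Toy stand-ins: a procedure's "summary" is the set of procedures it may invoke.
TOY_CALLS = {"EXAMPLE": {"SUB"}, "SUB": set()}


def toy_estimate(procs):
    return {p: set(callees) for p, callees in procs.items()}


def toy_order(est):
    """Callees before callers; assumes the call graph is cycle free."""
    done = []

    def visit(p):
        if p in done:
            return
        for callee in sorted(est[p]):
            visit(callee)
        done.append(p)

    for p in sorted(est):
        visit(p)
    return done


def toy_analyze(name, results, est):
    invoked = set(est[name])
    for callee in est[name]:
        invoked |= results.get(callee, est[callee])
    return invoked


def toy_refine(est, results):
    return est                                 # nothing to refine in the toy case


SUMMARY = interprocedural_analysis(TOY_CALLS, toy_estimate, toy_order,
                                   toy_analyze, toy_refine)
```

Passing the steps in as functions keeps the loop independent of how any one step is carried out, which is the only point the sketch is meant to make.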
A reason for the iterative refinement of the information may be illustrated by considering the following example. Suppose a procedure, S, contains a CALL EV where EV is an entry variable. By the initial estimate we may determine that EV can take on a number of procedure values, say P1, P2, and P3. However, having performed steps 2 and 3 on that assumption we may be able to deduce that EV can, in fact, only have the value P2, say, at that point in S. Redoing steps 2 and 3 with this new information leads to much more accurate information.
The steps in the process will now be elaborated.
3.1
(Step 1) Establish an initial estimate on the relationships in P.
Three types of information are determined by the analysis performed in this step:
a. the possible values of all pointer, label and entry variables
b. the aliasing relationships in P, including parameter-argument associations. In this way we determine, for example, that the parameters and the global variable in T in Figure 3 are all distinct.
c. the call graph.

The analysis method used in ECS for performing this step is described in [10]. It essentially scans each procedure, collecting up the information of interest into a binary matrix showing immediate relationships. It is in this form that the information is expanded to expose transitive relationships (e.g., the effects of calling a procedure which calls other procedures).
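The expansion of a binary matrix of immediate relationships into transitive ones can be illustrated with Warshall's algorithm. This is a standard technique chosen here for illustration; the text does not say which closure method [10] uses. The matrix below encodes the call graph of Figure A2 (SYSTEM calls EXAMPLE, EXAMPLE calls SUB); the names `NAMES`, `DIRECT` and `invoked_by` are assumptions of this sketch.

```python
NAMES = ["SYSTEM", "EXAMPLE", "SUB"]

# DIRECT[i][j] == 1 means procedure i immediately invokes procedure j.
DIRECT = [
    [0, 1, 0],   # SYSTEM  -> EXAMPLE
    [0, 0, 1],   # EXAMPLE -> SUB
    [0, 0, 0],   # SUB invokes nothing
]


def transitive_closure(matrix):
    """Warshall's algorithm on a square 0/1 matrix; returns a new matrix."""
    n = len(matrix)
    closed = [row[:] for row in matrix]
    for k in range(n):
        for i in range(n):
            if closed[i][k]:
                for j in range(n):
                    if closed[k][j]:
                        closed[i][j] = 1
    return closed


CLOSED = transitive_closure(DIRECT)


def invoked_by(name):
    """Procedures invoked directly or indirectly by `name`."""
    row = CLOSED[NAMES.index(name)]
    return [NAMES[j] for j, bit in enumerate(row) if bit]
```

The same closure applies to any other immediate relation held in this form, for example "immediately modifies" between procedures and data items, once it is composed with the call relation.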
3.2
(Step 2) Establish a processing order on the collection of procedures, P1, P2, ..., Pn.
From the initial estimate (Step 1) or revised estimate (Step 4) we know the possible invocation relationships in the collection and have a call graph. If the call graph is cycle free, we can readily determine an inverse invocation order [1] for a call graph C with nodes $n_1, n_2, \ldots, n_k$. Consider the precedence relation $\rightarrow$: $n_i \rightarrow n_j$ if node $n_i$ is an immediate predecessor of node $n_j$. The nodes of C can be given a linear order $(n_1, n_2, \ldots, n_k)$ which satisfies the constraint that if $n_i \rightarrow n_j$ then $i < j$ in the linear order. The inverse invocation order $K = (n_k, \ldots, n_2, n_1)$ is the inverse of the linear order. By processing the procedures in this order the control and data flow within a procedure will be determined before the procedures which call it are analyzed. One inverse invocation order for the example in Figure 1 is (C, E, D, B, A).
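An inverse invocation order for a cycle-free call graph can be produced by a depth-first traversal that emits a procedure only after all of its callees. The Python sketch below is one such construction, not necessarily the one in [1]. The call relation `CALLS` is an assumption: it is one set of edges consistent with the order (C, E, D, B, A) quoted above, since Figure 1 itself is not reproduced here.

```python
def inverse_invocation_order(calls):
    """Return the procedures so that every callee precedes its callers.
    `calls` maps each procedure to the set of procedures it invokes."""
    order, state = [], {}            # state: 1 = in progress, 2 = finished

    def visit(p):
        if state.get(p) == 2:
            return
        if state.get(p) == 1:
            raise ValueError("call graph has a cycle (recursion) at " + p)
        state[p] = 1
        for callee in sorted(calls.get(p, ())):
            visit(callee)
        state[p] = 2
        order.append(p)              # emitted only after all callees

    for p in sorted(calls):
        visit(p)
    return order


def is_inverse_invocation_order(order, calls):
    """Check the defining constraint: each callee appears before its caller."""
    position = {p: k for k, p in enumerate(order)}
    return all(position[callee] < position[caller]
               for caller, callees in calls.items() for callee in callees)


# Assumed edges, chosen to be consistent with the order given in the text.
CALLS = {"A": {"B", "C"}, "B": {"D", "E"}, "D": {"C", "E"},
         "C": set(), "E": set()}
ORDER = inverse_invocation_order(CALLS)
```

A call graph with a cycle makes `inverse_invocation_order` raise `ValueError`, which mirrors the cycle-free precondition stated above.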
3.3
(Step 3) Establish the control and data relationships in P by processing the procedures in the order determined in step 2.
A number of methods [3,5,6,7] exist for performing this analysis within each procedure. All of these methods presume that information about what data items are used and defined in each block is known and that the control flow can be determined. In the context of the interprocedural analysis algorithm, the effects of a call are known since the called procedure will have been previously analyzed. Thus a call can be treated like any other statement when establishing the local uses, defines and control flow changes.

Reference [1] discusses a means of establishing the information about the possible effects of external data items within a procedure. The approach is to treat each such item as if it were defined at the entry point and to determine from that what uses can be affected by it and whether or not such a definition could be preserved by the procedure. In this way we can deduce the effects of the procedure on data flow from outside.
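Treating a call "like any other statement" amounts to mapping a summary of the callee onto the arguments at the call point. The Python sketch below assumes a hypothetical summary record for SUB from the appendix (parameters N, X, Y, Z, of which X and Y are read and Z is assigned by `Z = X + Y`); the record layout and the function names are illustrative, not the ECS representation.

```python
# Hypothetical summary of SUB, read off the appendix listing:
# Z = X + Y uses X and Y and modifies Z; N is never referenced.
SUB_SUMMARY = {
    "params": ["N", "X", "Y", "Z"],
    "used": {"X", "Y"},
    "modified": {"Z"},
}


def call_effects(summary, arguments):
    """For each actual argument, report (name, used, modified) at the call."""
    effects = []
    for param, arg in zip(summary["params"], arguments):
        effects.append((arg,
                        param in summary["used"],
                        param in summary["modified"]))
    return effects


def describe(effects):
    """Render the effects in the style of the annotation in Figure A8."""
    lines = []
    for position, (arg, used, modified) in enumerate(effects, start=1):
        lines.append("ARGUMENT %d: %s, %s, %s" % (
            position, arg,
            "USED" if used else "NOT USED",
            "MODIFIED" if modified else "NOT MODIFIED"))
    return lines


FIRST_CALL = call_effects(SUB_SUMMARY, ["PARM", "A", "B", "C"])
SECOND_CALL = call_effects(SUB_SUMMARY, ["PARM", "B", "A", "C"])
```

For `CALL SUB(PARM,A,B,C)` this yields A and B as used and C as modified, which matches the local def-use information for EXAMPLE discussed in Section 2.3.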
In the Experimental Compiling System, the Interval analysis method [4] is used to establish the control flow relationships. Since this method iteratively combines subgraphs into blocks to form new graphs in which the nodes represent increasingly larger areas of the program, the control flow within a procedure can be hierarchically structured. Thus the data flow relationships between large areas of the procedure are given, then between the areas within each area, etc.
3.4
(Step 4) Update the estimates and iterate.
It is not at all clear at this time how valuable such an iteration would be. (ECS does not incorporate this feature yet.) It seems probable that one iteration would make substantial improvements but it is unlikely that more iterations could.
Before completing this section it is important to consider the obvious questions of algorithm cost and what to do if procedures are missing from the collection. First of all it should be observed that existing compilers which analyze one procedure at a time make "worst case" estimates for certain data item aliasing, particularly that associated with external variables, parameters and calls. When a procedure is missing we revert to that strategy for the effects of that procedure.
As stated earlier we view the compiler as part of a larger system which manages programs. We would not envision reanalyzing an entire system of procedures each time one was changed. An intelligent procedure library management system should be able to deduce which procedures need to be reanalyzed when one is changed. Furthermore the programmer should probably be given some control over which procedures are to be included in the collection of procedures for analysis.
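One simple policy a library management system could apply is sketched below in Python. It rests on an assumption not stated in the text: that the analysis of a procedure depends only on the procedures it invokes, directly or indirectly, so a change to one procedure invalidates it and all of its transitive callers. The call relation is the same hypothetical one used to illustrate the inverse invocation order, and `procedures_to_reanalyze` is an illustrative name.

```python
from collections import deque


def procedures_to_reanalyze(calls, changed):
    """The changed procedure plus every procedure that can reach it through
    the call graph. `calls` maps each procedure to the set it invokes."""
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    stale, queue = {changed}, deque([changed])
    while queue:
        p = queue.popleft()
        for caller in callers.get(p, ()):
            if caller not in stale:
                stale.add(caller)
                queue.append(caller)
    return stale


# Hypothetical call relation: A calls B and C; B calls D and E; D calls C and E.
LIBRARY_CALLS = {"A": {"B", "C"}, "B": {"D", "E"}, "D": {"C", "E"},
                 "C": set(), "E": set()}
```

Aliasing estimates can also flow from callers to callees, as the discussion of Figure 3 shows, so a real system would probably need to widen this set.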
4.
APPLICATIONS OF INTERPROCEDURAL ANALYSIS INFORMATION
The information collected by an interprocedural analyzer can be useful in a number of applications. In this section we will briefly sketch possible uses in three major areas:
a. documentation to the programmer
b. program management
c. program optimization
4.1
Documentation to the Programmer
As a result of performing the interprocedural analysis, a great deal of information is collected which is often not obvious or not available to the programmer. For example if the programmer is using procedures not created by him, he may not be fully aware of the effects of referencing these procedures. Furthermore the analysis will frequently expose relationships which he had not intended.

Three forms of documentation seem desirable:

a. error messages which draw the user's attention to definite or possible errors in the program. For example the analysis will find variables which are used before being defined, unreferenced arguments, mismatches between arguments and parameters, etc. These should probably be presented to the programmer even when not solicited.

b. annotated listings. The Experimental Compiling System has a listing annotator which automatically inserts comments into listings at certain points such as procedure definition and reference points. These comments sum up the effects of the procedure or of making the reference. The annotation inserted after the CALL T(A,B) in procedure S of Figure 3 would contain the following:
   whether A was used and/or modified
   whether B was used and/or modified
   whether G and any other external variable was used and/or modified
   other invocations resulting from this invocation and the effects of such invocations.
In other words any effect an invocation can have on the invoking procedure will be included in the comments inserted at that point. Figure A8 contains a partial example.

c. documentation, preferably via an interrogation system which permits selective probes on the information. The amount of information produced is so voluminous that an unselective presentation would be overwhelming. For example, all the uses of each definition in the program are known and all the definitions affecting each use. If a programmer is tracking such data flow it is easy to postulate a system which gives him the information as he requests it and which is, therefore, more meaningful to him than a vast dump of the information in which he has to track the flow.
4.2
Program Management
Whenever a change is being made to a procedure in a collection of procedures, we often need to know what the effects of such a change are on the other procedures in the collection, and frequently would like to change the collection itself. For example, suppose the programmer decides to eliminate the use of external variables from his system and use the argument-parameter mechanism. From the interprocedural analysis information he could determine what procedures reference the externals and how the procedures are linked, and then adjust the appropriate parameter lists. It is not hard to visualize a program management system which, on the basis of this information, in fact makes many of these changes automatically or with only a few assists from the user.
4.3
Program Optimization
A number of the well known program optimizations such as eliminating unused code, moving code out of loops, and eliminating redundant expressions can be applied in the context of procedure references as a result of the information collected. Consider the program fragment in Figure 4. If A and B are not changed in SUB then the expression A + B can be removed from the loop. In fact if SUB itself does not depend on anything in the loop and is reducible, i.e., the effect of a number of references to it can be obtained from a single reference, then it might be possible to remove the reference from the loop, and the improvement to the entire program can be large.

In addition to increasing the utility of the traditional program optimizations, another form of optimization can be made on the basis of the information collected: procedure integration. In this, multiple procedures can be combined into one procedure and the formal linkages eliminated.
5.
SUMMARY
An algorithm has been given which collects control and data flow information from the collection of procedures which constitute all or some part of a program. The information produced as a result of applying the algorithm was discussed with reference to a specific example in the appendix which illustrates the form of the information. This example shows the output of the Experimental Compiling System, which currently contains a partial implementation of the algorithm. The algorithm presented here does not handle recursive procedures but current indications are that this method can be extended to accommodate them.

The information produced includes:
a. a call graph showing procedure relationships
b. the control flow of a procedure in the form of a graph
c. all the uses of each definition in a procedure and, inversely, all the definitions affecting each use
d. the external effects of a procedure in terms of what external variables and arguments it uses and/or modifies, what procedures it invokes, what kind of return it makes, etc.

All of the information is collected at compile time and, therefore, represents potential rather than actual relationships.

A brief discussion of possible applications for this information for documenting, maintaining and optimizing programs was given.
6.
ACKNOWLEDGEMENTS
Ken Davies and Bob Tapscott have contributed more than the author to the material presented in this paper. The author wishes to thank them and the many others who have contributed to this work.
REFERENCES
[1] F. E. Allen, "Interprocedural Data Flow Analysis", Proceedings IFIP Conference 1974, North Holland Publishing Company, Amsterdam, 1974 (also as IBM Research Report RC4633, T. J. Watson Research Center, Yorktown Heights, N.Y., November, 1973).
[2] F. E. Allen, "A Basis for Program Optimization", Proceedings IFIP Conference 1971, North Holland Publishing Company, Amsterdam, 1971.
[3] F. E. Allen, "A Method for Determining Program Data Relationships", International Symposium on Theoretical Programming, Edited by Andrei Ershov and Valery A. Nepomniaschy, Lecture Notes in Computer Science, Vol. 5, Springer-Verlag, pp. 299-308, 1974.
[4] F. E. Allen, "Control Flow Analysis", Proceedings of a Symposium on Compiler Optimization, SIGPLAN Notices, July, 1970.
[5] Matthew S. Hecht and Jeffrey D. Ullman, "Analysis of a Simple Algorithm for Global Flow Problems", Conference Record of ACM Symposium on Principles of Programming Languages, Boston, Mass., October, 1973.
[6] K. Kennedy, "A Global Flow Analysis Algorithm", International Journal of Computer Math., Vol. 3, pp. 5-15, December, 1971.
[7] Gary A. Kildall, "A Unified Approach to Global Program Optimization", Conference Record of ACM Symposium on Principles of Programming Languages, Boston, Mass., pp. 194-206, October, 1973.
[8] Barry K. Rosen, "Data Flow Analysis for Recursive PL/I Programs" (In preparation).
[9] J. Schwartz, "Inter-Procedural Optimization", SETL Newsletter #134, Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, N.Y., N.Y., July 1, 1974.
[10] Thomas C. Spillman, "Exposing Side-Effects in a PL/I Optimizing Compiler", Proceedings of IFIP Congress 1971, North Holland Publishing Company, Amsterdam, 1971.
APPENDIX: An Example
The example given here illustrates some of the results of interprocedural analysis as currently available in the Experimental Compiling System. It will be noted that the example includes only nested procedures and does not have multiple external procedures: ECS has been designed to handle multiple procedures, both external and nested, but not all of the required components are currently available. Furthermore the example does not show a recursive procedure -- another feature which is currently unsupported.
PL/I CHECKOUT COMPILER SOURCE LISTING

STMT
   1   EXAMPLE: PROCEDURE(PARM);
          /* A MEANINGLESS PROGRAM TO ILLUSTRATE ANALYSIS RESULTS. */
   2      DECLARE PARM FIXED BINARY,
                  A    FIXED BINARY,
                  B    FIXED BINARY,
                  C    FIXED BINARY,
                  D    FIXED BINARY;
   3      A = 1;
   4      B = 2;
   5      C = 3;
   6      D = 4;
   7      IF PARM > 0
             THEN CALL SUB(PARM,A,B,C);
   8         ELSE CALL SUB(PARM,B,A,C);
   9      RETURN;
  10   SUB: PROCEDURE (N,X,Y,Z);
  11      DECLARE N FIXED BINARY,
                  I FIXED BINARY,
                  X FIXED BINARY,
                  Y FIXED BINARY,
                  Z FIXED BINARY,
                  G(10) FIXED BINARY EXTERNAL;
  12      DO I = 1 TO 10 BY 1;
  13         G(I) = G(I) + D;
  14      END;
  15      Z = X + Y;
  16      RETURN;
  17      END SUB;
  18   END EXAMPLE;

Figure A1
IDENTIFIER   ALIASES
Z            C,C
X            A,B
Y            B,A

CALL GRAPH

   1  (SYSTEM)
        |
        V
   2  EXAMPLE
        |
        V
   3  SUB

PROCESSING ORDER WILL BE: SUB EXAMPLE

Figure A2
*** ANALYSIS FOR SUB ON AUGUST 27, 1974 AT 10:32 AM 08.000 SECS. ***

FLOW GRAPH FOR SUB
(Diagram not reproduced. Blocks and their statement ranges: block 1, statements 0-0; block 2, statements 10-12; block 3, statements 12-12; block 4, statements 13-14; block 5, statements 15-16.)

Figure A3
DATA FLOW INFORMATION FOR PROCEDURE - SUB

DEFINITION - USE RELATIONSHIPS

IDENTIFIER   DEFINED AT STMT   BLOCK   USED IN BLOCKS
D                   0            1     4
Z                  15            5     IS NOT USED OUTSIDE THE BLOCK
G                  13            4     4
I                  12            2     3 4
                   14            4     3 4
X                   0            1     5
Y                   0            1     5

LIVE INFORMATION

IDENTIFIER   DEFINED AT STMT   BLOCK   LIVE ON EDGES FROM - TO
D                   0            1     1-2, 2-3, 3-4, 4-3
Z                  15            5     IS NOT LIVE
G                  13            4     3-4
I                  12            2     2-3, 3-4
                   14            4     3-4, 4-3
X, Y                0            1     1-2, 2-3, 3-5, 3-4, 4-3

Figure A4
*** ANALYSIS FOR EXAMPLE ON AUGUST 27, 1974 AT 10:32 AM 25.000 SECS. ***

FLOW GRAPH FOR EXAMPLE
(Diagram not reproduced. Blocks and their statement ranges: block 1, statements 0-0; block 2, statements 1-7; block 3, statements 7-7; block 4, statements 7-7; block 5, statements 8-8; block 6, statements 9-9.)

Figure A5
DATA FLOW INFORMATION FOR PROCEDURE - EXAMPLE

DEFINITION - USE RELATIONSHIPS

IDENTIFIER   DEFINED AT STMT   BLOCK   USED IN BLOCKS
A                   3            2     3 5
B                   4            2     3 5
C                   5            2     IS NOT USED OUTSIDE THE BLOCK
                    7            3     IS NOT USED OUTSIDE THE BLOCK
                    8            5     IS NOT USED OUTSIDE THE BLOCK
D                   6            2     3 5
PARM                0            1     2

LIVE INFORMATION

IDENTIFIER   DEFINED AT STMT   BLOCK   LIVE ON EDGES FROM - TO
A                   3            2     2-5, 2-3
B                   4            2     2-5, 2-3
C                   5            2     IS NOT LIVE
                    7            3     IS NOT LIVE
                    8            5     IS NOT LIVE
D                   6            2     2-5, 2-3
PARM                0            1     1-2

Figure A6
(Figures A7 and A8: annotated source listing produced by the ECS listing annotator; the layout is not reproduced. The recoverable content is as follows.)

Annotation at statement 7, IF PARM > 0 THEN CALL SUB(PARM,A,B,C);

ARGUMENT 1 : PARM, VARIABLE, NOT USED, NOT MODIFIED
ARGUMENT 2 : A, VARIABLE, USED, NOT MODIFIED
ARGUMENT 3 : B, VARIABLE, USED, NOT MODIFIED
ARGUMENT 4 : C, VARIABLE, NOT USED, MODIFIED

The annotation also lists the external variables used and modified, directly and indirectly, by the invoked procedure, and states whether indirect invocations may occur as a result of this invocation.

Annotation at SUB: PROCEDURE (N,X,Y,Z);

PROCEDURE SUB IS INVOKED AT:
STATEMENT 7 IN EXAMPLE
STATEMENT 8 IN EXAMPLE

The annotation also lists the parameters, local variables, variables local in containing blocks, and external variables used and modified by this procedure.

Figure A7
Figure A8
S: PROCEDURE;
   DECLARE G EXTERNAL;
   CALL T (A,B);
END S;

T: PROCEDURE (X,Y);
   DECLARE G EXTERNAL;
   X =
END T;

Figure 3

DO I = 1 TO 100;
   CALL SUB (A,B);
   X(I) = A + B;
END;

Figure 4
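The loop of Figure 4 illustrates the optimization argument of Section 4.3: whether A + B can be moved out of the loop depends on what SUB may modify. The Python sketch below makes that check explicit. The two summaries of SUB are hypothetical, introduced only to show both outcomes; the text does not say what the SUB of Figure 4 does, and the record layout and function names are assumptions of this sketch.

```python
def modified_in_loop(statement_defs, call_sites, summaries):
    """Data items that may change in the loop body: direct assignments plus
    whatever each called procedure may modify through its arguments."""
    modified = set()
    for defs in statement_defs:
        modified |= defs
    for callee, arguments in call_sites:
        summary = summaries[callee]
        for param, arg in zip(summary["params"], arguments):
            if param in summary["modified"]:
                modified.add(arg)
        modified |= summary["externals_modified"]
    return modified


def is_loop_invariant(operands, modified):
    """An expression is invariant if none of its operands may change."""
    return not (set(operands) & modified)


# Loop body of Figure 4: the DO statement defines I, the assignment defines X.
LOOP_DEFS = [{"I"}, {"X"}]
LOOP_CALLS = [("SUB", ["A", "B"])]

# Two hypothetical summaries of SUB(P, Q): one harmless, one that assigns Q.
HARMLESS = {"SUB": {"params": ["P", "Q"], "modified": set(),
                    "externals_modified": set()}}
ASSIGNS_Q = {"SUB": {"params": ["P", "Q"], "modified": {"Q"},
                     "externals_modified": set()}}
```

With the harmless summary, A + B is invariant and could be computed once before the loop; with the second summary B may change on every iteration and the expression must stay inside.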
NEW METHODS FOR THE OPTIMIZATION AND PARALLELIZATION OF PROGRAMS
G. Urschler,
IBM Laboratory Vienna

OVERVIEW
The aim of this talk is to show the usefulness of structuring (modularizing) programs for their automatic processing. This is done by presenting two novel methods for the optimization and parallelization of programs, each of which starts from a specific modularization of the programs to be processed.

It is shown how, for example, redundant expressions in flow diagrams can be eliminated completely, even in data-dependent situations, by means of a method based on symbolic interpretation. This method aims at program executions that are as short as possible, but does so at the cost of program size.

It is further shown that flow diagrams can be translated into a variable-free and likewise modular form in such a way that maximal parallelism results. This permits, for example, the parallel execution of mutually independent iterations or parts of iterations by means of a control mechanism that recognizes the parallelism dynamically.
1. INTRODUCTION
In what follows, "programs" are understood to be simple (unstructured) flow diagrams (cf. Floyd /3/). For simplicity it is assumed that every program has only one beginning and only one end, and that all decisions are two-valued (yes/no). Fig. 1 shows the basic structure of such a flow diagram in the familiar way. p, q, ... denote Boolean expressions, while A, B, C, ... stand for "elementary program blocks", by which sequences of assignments and input/output operations are understood.

Without further justification, the thesis is anticipated that both the optimization and the parallelization of programs should be carried out automatically, i.e. by means of suitable algorithms. Optimization consists in the elimination of redundancies while retaining the original control structure; parallelization consists in the simultaneous execution of mutually independent operations.

The basic idea of this talk is to show that carrying out these program transformations is made considerably easier if a suitable structure is first imposed on the given programs. This "imposing of a structure" consists in converting the given programs into a modular form built on procedure calls (such as some languages, e.g. VDL, see /5/, guarantee from the outset). The modularizations suited to optimization and to parallelization look different and are described in the next section.

In Chapter 3, starting from a modular program structure, a novel optimization technique is then presented, using the elimination of redundant expressions as an example. This optimization technique, a formalism for carrying out a symbolic interpretation, strives in the first place for minimal program executions, but does so at the cost of program size. In Chapter 4 it is finally shown that a suitable modularization of programs is an essential prerequisite for exploiting maximal parallelism.

2. MODULARIZATION
Berei%s
im
1960
Jahre
Verfahren
angegeben,
bedingte
Ausdr~cke
gewissem in
wurde
dass
es
Sinne zu strukturieren.
Abb.
gegeben.
zugeordne%, wird
und
~amen
in
Das Verfahren sei an Band
ein
wird
illustriert.
Name
~nd
|bezeichnet
2 angegeben Modul
jedem
mit a, ~, T
durchgefmhrt.
(ein
Programmteil)
Bekursionsgleich~ng
der in Form einer
des
Jls erstes
Entscheidung)
ein
wie folgt aussieh%.
ein in
Dies sei wie in Abb.
dieser
/6/)
goto-Progr amine
{jeder
(Ziel)punkt
{wgl.
und mithin diese Programme
I gegebenen Programmbeispiels
Verschmelz, ngs
Jedem
McCarthy
gestattete,
zu ~bersetzen
wird Jedem Verzweigungspunk% • ..)
yon
heschriehen
Ist der Modul unbedingt
{geht er
a~s einen Verschmelzumgspunk%
hervor), dann gib% die
an
dee Modul beginnt und mit welchem
mit welchem Prograsmblock
Soaul dee Block anschliessend =
AT
bedingt f~r
beschreibt
beispielsweise
wird.
den Modul @.
fie
diesen
Modul
vorangesef~t
2
Alternativen
der die
zu
Auswahl
der
wird. In selbsterklSrender
den Modu] a des Beispiels: der Abb.
a = (p)
~
~
dann
beschreiben, Alternativen
sind
denen der regel%,
Syntax ergibt sich f@r ¥].
2 erhalt man das folgende
wird k~ns%]ich als "Haupfmodul"
Gleichung
!st der Modul
{lieg% ibm ein Verzweig,ngspunkt z,grunde),
Boolsche Ausdruck,
Programm
fortgesetzt
Gleichung
eingefMhr£):
F@r
das
gesamte
Gleichungssystem
(~
326
=
~=A
7
7 = B ~
Das
Charakteris%ische an diesem Modularisierungsverfahren
dass in ibm aer geh~renden
immer ein ~odu! a~gelang%
eiDes
jeden
Diese der
Models
auch
Eigenschaf t
aufgepr~gten
das
zeigt
S truktur
auch his heute keine ~raktische
deft
angegebene
~u
ibm
dass wenn
Programmende
zumindestens an
wodurch
die diese
BedeQtung erlangt hat.
Tn Abschnitt 3 wird jedoch gezeig% werden, dass das
~die
maximal ist, was impliziert,
Ende a usgef~hrt ist
Zu
is~o
Unnat~rlichkeit Technik
"Bereich"
~n~eisungen)
ist,
Optimisierungsverfahren
sic
sich
f~r
in idealer Weise
eignet. Zu
einez
~at~r!ichen,
d.h.
die
~odu!arisierung
ko~mt
san,
"minimalisier%"
werden.
Dieses
Programmlogik wenn
die
Verfahren
5eschrieben.
Es beruht auf der sei% langem
(vgl°
/8/),
/~/,
Prograsm die
Modulbereiche
ist
in /11/ nMher
bekannten
Tatsache
dass es zu jedem verzweigungsp,nkt
ge~au einen ihm zugeordneten
entsprechenden
Programmpunkt
~l%ernativen
das
zusasmes!aufen.
Dieser "unmittelbare
ale
berei%s
(selbst
aufzeigenden,
erste
in eines gibt,
Mal
wieder
Nachdominator" gibt
a~sserhalb
liegende)
wo
damit
Grenze
des
~etreffeDde~ Modulbereiches an. Von
Abbo
2
!~sst
sich
beis~ielsweise
unmi%telbare Nachdcminator effektive
Besti~mu,g
Verfahre~ zur Auswahl eindeutige)
~achdosinatoren
(siehe
/9/).
beschrSnkt
Bereich
bleiben
zwei%es Sodularisierumgsverfahren, Bedeutung
f~r
die
Durch
mNssen
des -
dass i~ ~.
Parallelisierung
zeigen
Beispiel vo~ Abb. 2 ergib% sich folgendes b~zeichnet den lee[en P[cgrammblcck) :
stehen die
yon unmittelbaren
wcbei diese Ketten abet auf den beschreiben,
dass 7 der
yon ~ ist und A der yon 6.
yon
Kettenbildung
ablese,,
~r
die
bekannte
(zwangslMufig
Nachaominatoren Moduls ergibt
den sich
Abschni%t wird.
-
sic ein seine
~r
Gleichungssystem
das (u
327
~=
{Die
6A
Einf~hrung
werden, die
~B
der unbedi~gten
E]emenfarbl~cke
aufgegl~edert Dass
dutch
ill~strier%
beispielsweise
V-
Sie
2eigt,
sich
Meduls des
6
f~r
beispielsweise
5
en%weaer
anschliessend
6
der
interessier%, A
nU~
u~d
bis
a,
~ereits
die
vorkosmenden die
Ton
B
Wenn
Module,
entsprechende
auszuf~hren
una
ist, oder dass 5 abgeschlossen
7um
erreicht,
gew~nschten
eines gegebenen Prcgramses
der Grundanfcrderung
Moduls
herausgearbeite%.
werden kann. Dami% is% eine Pregramsstruktur schri%tweises
des
is%
2eigt
gefolgt
zu wiederholen
der Gleichung
E, gefolgt vo, einer
Damit
Prcgrammes
einen
Gleichung,
dass
des Blockes
bes%eht.
gegebenen
naher
Verst~ndnis
~nweisungen
dass jeder Prograsmahla~f
in einer AusfShrung
einer Ausf~hrung des
Grunds%ruktur man
wenn
die Programmlogik
eine n~here Betrachtung
yon den Daten)
yon
Ausf~hrung
sinnvoll,
~ und B in die entsprechenden
dieses Modularisierungsverfahren
(unabh~nging gefolgt
@ und 7 kennte gemacht
Beispiel nut
warden.)
2utage kcmmf, f~r
~odule
ist abet f~r das gegebene
gew~hrleistet
f~r eine sinn~olle
die
ein
Detail gehendes und mithin
Programmstrukt~r
Gen~ge
tut.
3.
SYMBOLISCHE
Jede
INtERPREtATION
Progralmausf~hrung
ausgef~rten Beispiel
w~rde
Programnausf~hr,ngen der wirklichen angegeben. Un%er~ume,
beschreibe~.
beis~ielsweise
Programmausf~hrung
u~endllcher
l~sst sich als ~olge der nacheinander
Programmteile
Dieser die
mif Baum genau
gegebene
Alle
mSglichen
il allgemeinen eine echte ~berlenge
Programmausf~hrungen Baum
das
die ~olge ~ p B q ~ B q A eine
beschreiben. (die
~r
~urzel hat
n~r
dutch
~
darstellen)
lassen s±ch als
darstellen, endlich die
wie
viele
Gleichungen
in
Ahb.3
verschiedene des
ersten
~cdulari~ierungsverfabrens Vorteil
~ieser
(~ote~tiel3) die
~nend!iche
schrittweise
bergs!citer Der
Baum aller
Expansion
den
de~
Baus
durchzugehen,
aass
einersei%s
die
Logik
abet
K~nverge~zkriterius
gegebenen
redundanteD
der
in
Gebrauch
dieses
Ab5.
4
das
redundant
ausgefnhrt
wurde. der
unendliche
Verzweigungen Redundan~en
dee
Ausdruck
er
ist
~u
diesem
nnd
Beschreihung
wenn
Verf~gbarkei%
wird dabei als
Zeitpunkt
in
alle Variahlen,
ben~tigt
Beispiel
is%,
wurden,
("get~tet") yon
Ibb.
seit
dies 5
abet ist
I
mit
Vers~ch
Interpretation
~as dabei passiert ist, dass als yen
durchgegangen (strichlierte
oben
nach
,ird
und
Linien)
(pu~ktierte
endliche~ Prograwmdarstellung
Linien)
die z~r diesem
speziellen
nnr falls vorher
der
einer
wurden.
Es zeigt, dass der A~sdruck
~n Abb.
Banm
yon
yon
yon ~in
unten dabei
in
g(~
in
Block
gemacht,
den
informal
zu
I.
Schritt
allen seinen
"offensichtliche"
beseitigt
besteht darin, im Resultatbaum
iaentifiziere~
genauere
es,
symbe!ische~
veranschaulichen.
nach endlich vielen
be~irkt.
machen.
wiedergegeben.
¥organg
Schri%t
ist sin entsprechendes
ist
wenn
das
lassen,
Damit dieses
yon
Ausdruckes
ist
die
unge~naert verkSrzen.
Eine
nicht ~ehr neu definiert
Anweisungen
zu
werden,
Die Grundidee bei der Eliminierung
zu
gespeichert
~uswertung
B
skizzier%.
/lq/.
verfngbar angesehen,
der
Ausf~hrung
zu
Gebrauch
¥erfahren an Hand der Elilination
~us~r~cken
Ausdr~cken
Block
sp~ter dadurch
endllch bleibt,
das
Ausdr~cken
~edundanten
Zeitpunkt
dutch
ist es, die
im Baum durchgef~hr£
bereitzus%ellen,
~ird
sich
Variable~
der
Programsausf~hrungen
den Abbruch dee Transfermation
Fclgenden
In
Interpretation
die AusfSh~ungszeit
Transfcrmat~ossverfahre~
finds%
dass
dabei Optimisierungsinformation
Mcdifizier~ngen
andererss~ts
Is
besonderer
es,
Modulen aus dem ~auptmodul
sam~eln und yon dieser Information
Schri~%e~
~in
ist
Programmausf~hrungen
yon
symbelischen
unendlichen
schrittweise ~ache~,
werden.
~erden kann.
Gru~dgedanke
dutch
beschrieben
Programmdarstellung
gleiche
werden.
Eer 2.
Unterb~ume
zu
und damit wieder zu einer
~u gelangen.
Us
diesen Vorgang formal darchzuf~hren,
Zustan dsbescbreib,ng
eingef~hrt,
Problemstellang,
aufzeigt
verf~ghar si~d.
Seinen Ausgang
leerm
~keine
Ein fachheit
~o. a.
halber ergib%
aaf
seine
-o bewirkt (f (b,c)
in
yen
b
gelSscht
interpretier%
liesse.
ist,
ergib t
sodass
sich
getan)
elisinimre~
u,d daflr die
~guivalent
sit
dis
and
vo~
Zustandsbeschreibung In
Zeicbe~
Ferner
die
b
jedoch
~u
dutch
,..A¥ = A $~.7 wobei
dass
g (a)
c := g(a) Schrit%
(b
"~.~ = ,2. A?
und
der durch
vernichfet
wird.
c in
-2 = [b -> g (a) ;
f(b,c)
is sind
yon c der
G,
kosmt
man
in f(c,c) nach
.2
enthaltene
lnalcg ergibf
sich:
b - c}
schliesslich
Ton A angedeuted ,=
enthaltenen
ersetzt is%. Zur der
neuerliohen
A zur~ck, da dutch die Neudefinition in
b
gSnzlich
and
sin~)
ges~ss
smmtlicbe
(wie
dass sp~tere Workommnisse in terpretieren
darin,
h
gehen
c := g (a)
b - c
wird,
und
in
dutch c := b ersetzen
lquival~zinformaticn,
vo.
bet.its
welter
Anueisung
.,. B6 = .=* 6,
{,2-~ ~ #2-A].
~terpretation
a -> f(b,c)
in die
wobei darch die Unterstreichang
Zustands~eschreib~ng
mieder
wieder
•a.A7 = A_,i.¥ dass
is%
yon A in B e z u g auf
speichern.
wiederus:
,a.8 =
,o.~
(mine
indem sie
und dater b -> g(a)
Information
d~r Bedeatu~g,
Vorkcmmnisse
der
Zustandsheschreibung,
Formal:
sich,
San kann jedoch einen
~olgenden
dies
...? = ,,.B8 Wenn B in Bezug auf $,
Ferner
wird,
ist
~ ,o.¥}
Information die
darauf wird diese Eintragung
,~ = { b -> g(a)}.
1
"o
GemMss
Ausdr @cke
[$o-~
wird).
der
verfngbar)
die
Opti aisier un gsver f a hr en =
Interpretation
Zastandsheschreibung aQfgencmmen.
verf~gbar
das
ange.andt
,..A T.
a
Neudefinition
,o.~.
wird auf einen Modal angewandt,
|it
is
indmm
Kapitel
Boole 'sche
$o.e
zunSchst die Aufnahme
ist
unmittelhar die
in
Meiter
Alternativen
gleichbedmu%end
yon
We rden
nicht sich
Zustandsbeschreibung
gege~ene~
Zustandsbeschreibang
wird, in Zeichen
R ekursionsg leich ,ngss ystes mit
der
nims% das Verfahrmn
Information enthaltende)
gleichbedeutmnd
einbmzogen,
bei
welch, Ausdr~cke in welchen Variablen
auf den ~au~tsodal ~ "angewand%" des
wird der ~egrlff miner
der,
Information
won a
wiederum
~.~
Das
= ~.~7
= A~..7
I~%erpre%a%icnsverfahren
endet, sobald ein Modu! in ~ezug
auf eine Zus%andsbeschreibu~g auf
die
er
Verfahrers der
A~zahl
yon
I,formafion, Da
ei~e
bereits
ergibt
interprefier%
Mcdule~
s%eht es frei, daf~r eine,
[A ~I I B 5,)
System
entspricht
obeD
vollst~ndig Variahle~
in
blieben.
zu eliminieren
sich notwendig, zu ersefzen
das
vereinfachf,
unver~ndert
~bb.
Um
Idee,
die
is
~u tragen, dass in einem
gebrachf
al!erdi~gs
die Namen
yon
Ausdr~cke ist es
wesent!ichen wurde),
an yon vom
um dem
Programm verschiedene
Variable bezeichnen
f~r die M~glichkeit
werden
!~ferpretatic~sprezesses.
f~r redundante
Namen fragen kBnnen und dass umgekehrt
Versfandnis
zU gehen,
mehr.
yon /I/ @bernommen
Nalen die gleiche
Um ein h~sseres Verfahre~s
zu
wiedergegebene
(wie in /14/ beschrieben)
(eise
gleichen
Namen
Variablen dutch nquivalenzklassen
Ussfand Rec~nung den
6
redundante
Algorifh~us
verschiedene
Modul wieder
ergibt sich das
als darin
value-numbering Variablen
der ~@glichen
neuen
Eli~inationsverfahren
inscfer~e
des
als Besultatprogramm:
keine Redundanzen
¥iedergegebene wurde
Variab!en f~r
Es eufh@If
Bezug
~nd!ichkeit
angewandt auf einen
~o = {P)
in
eingeht.
Wird dies f~r cbiges Beispiel gemach%,
Ausdr~cke
und
der
und aus der Endlichkeit
Zustandsbeschreibung
Flussdiagra~eo Das
ist
~ie ~ndlichkeit
aus
die in die Zustandsbeschreib~ng
fo!gende R e k u r s i c n s g l e i c h u n g S S T S % e m
Diesem
wurde.
sich dabei aufoma%isch
einen Mcdu! ergib%, w~h!en.
xu inferpretieren
im
ohne Das
Folgenden Ausfahrung ers%e
des angegebenen
noch des
Beispiel
kBnnen.
2
~eispiele
eigentlichen betrifft
die
Opti~isierung
des
fclgenden
{einem
Programmteiles
unver~ffentlichten ~rtikel yon J.C. Heatt~ entnommen) :
DO I = 1 TO N;
  IF K(I) > 0 THEN M(I) = L(J);
END;
Mit
herr~mmliche~
OFtimisierungsverfahren
sich der nach der ersten Ausf~hrung nicht eliminiereD, Seiteneffekten o~iges ~bh.
8
dam
ResultatFrogramm.
Der
dass
der
in
wurde.
urspr~Dg]ic~
einfache
f~r
duplizierter
Code
ist
Sch!eife
Beis~ie]
und
darin
sell der
kann.
Interyretationsverfahren
der
dieser
Einschl~ss
Zustandsbeschreibung e~alich viele Divergenzgefahr
(do
Bocle,sche gegeben)
der
Ausdruck
wie
das
2
Variable
2 wie
angegebene
yon
ist
ist
nachtr~gliche
er!aubt es, in gevisse~ F~!len ~berflHssige
is
Wahrheitswert in
Wahrheitswerte gibt
yon
Wahrheitswerten
Information
nut
und
i(J)
Entscheidung
entsprechende
es
die in
~inschluss
Propagierung
~ekannt.
Duplizierung
den jeder
dass
werden kann.
zeigen,
Nach
Graphen wiederus
ersichtlich,
durch
dutch
und
werden
symholiscbe
benannte
Aus Abb. 8
Optimisierungsverfahren erweiter%
sei allerdings
das
gerich%ete,
gewHnscht h~chstens einmal ausgefHhrt
Wahrheitswerten
an
erhaltene
nech eine Reduktion durchgef~hrt wurde
Schleifen aufgelRst w~rde
zweite
halber
Ansch!uss
unn~tig
zusammengelegt
Dos
L(J)
Abb. 7 zeigt
Cptimisierung
Vel!s%~ndigkeit
dabei
(wie aus der Syntaxthecrie in
kommen kBnnte.
dutch
Interpre%ationsverfahren beka~nt),
~usdruck
in geeigneter For~ als ~lussdiagramm kodiert.
zeigt
hinzugef~gt,
redundan%e
l~sst
da es sonst 7u dos Programs falsifizierenden
(~nsafe code morion)
Beispiel
(code motion)
ja die
und
wiederum
nut keine
verwendung doyen ~ntscheidungen
zu
erke~nen und damit beis~ielsweise Schleifen zu diagnostizieren, die selbst ge~bten Prograssierern sanchsal Ahb.
9 9ibt ein solches Prog~ammfragsent.
in Abb.
10 zeigt, dass
SchleifeDdurchlSufe
das
~och
Pregramm sinnvoll
des drit%en Sch!eifendurchlauf ~m
Fermalismus
der
f~r
verborgen die
ersten
heiden
ist, und dass es erst nach
unbeding% zu schleifen
symbolischen
bleiben.
Dos Resultatprogra~s
Interpretation
ist
beginnt. dieser
Zustand
durch
dam
Auf%re%en
eines
unbedingten,
rekursiven
~oduls leicht z~ erkennen.
4. MAXIMAL PARALLELIZATION
Parallelisierung aus
eigent~ich
l~sendes)
ist
~hnlich
dynawisches
Preblem.
Im
der Op%imisierung ein yon Natur (nut
letz%en
~rograms!aufzeit
zu
Abschnitt wurde gezeigt,
zur
dass
sich f~r die Optisisierung ein Tell dieser eines
z~r
~berse%zungs~ei%
Prograsma~lauf werden,
vorwegnehmen
dass
sich,
Paral]e!itMt
zur
Verfahres beruht auf
der
in
~od~larisierungstechnik beschriebe~. gezeigfeD f(1},
Es
wird
her ist
wird) klar~
voneinander erfolgen
Parallelismus
~apitel ist
illustriert
f(100)
2
an
l~sen
Hand
der
und
in
tas
zweiten
/13/
des
Operator
der
l~sst,
angegehenen
/10/
das
~rkennung
Dam Programm berechnet (wobei
gezeigt
~lussdiagramme, -
in
wird
n~her
~bb.
11
die ~usdr~cke f
unhestismt
und summiert diese Ausdr@cke auf. Von der Logik dams
die
uDabh@ngig
kant.
f~r
in
symbolischen
Folgenden
itpliziert
und
Flussdiagramms.
f(2},...,
ge]asse,
Is
vollst~ndig - was eine
La~fzeif
bereits
durchgef~hrten
l~sst.
zumindestens
Para!lelisierungs~rcble,
~ynasik
Zu
Rus~ertung ist
~eigen
~nd ist,
aller
sithin wie
dieser
parallel dieser
~usdr@cke zueinander
"dynasische"
(der im allgemeinen nicht ~on der Programsgr@sse
scnder~ vc~ der L@nge des
Programmablaufs
abh@ng%)
ermittelt
werden kant. Gem~ss ergehen
der sich
in Abschni%t 2 angegebenen die
fclgenden
~ekursio~sgleichunges symbo]ischer
Programs
~ B C a G ~
beschreihenden
Anweisungen
@else als A, B, C, ... angef~hrt,
angede~%e~): ~=
das
(die einzelnen
Modularisierungstechnik werden nur in
wie in
A~h.
11
Der
gresse
Vcrteil
Parallelisierung
dieser
ist,
KontrollahhMngigkei%en
-
Rnweisungsausf~hrung aufgezeigt
durch
das
yon
werden.
Programmdarstellung
dass
sind
sie
die
Moduls der
cder
eines
darin
liegt, dann h~n9% die Ausf~hrnng
~usf~hrung
Ahh~ngigkeit
des
wird
(siehe
/7/),
wobei
(d~rch
Anweisung
Modulaufrufes yon
aS.
den
Des leichteren im
in
~olgenden
Er~ei%erung
(Zuweisung
Yariablen
und
lesen, bzw°
Expansionsoperaticnen hinzukcmmen,
dieser
weiteren yon
Um
zu
die ihrer
ist
~ine
Verst~ndnisses
in graphischer
~orm
basierend auf des Konzep% eines Datenflussgraphen
Basisopera%ionen yon
Analyse
~Iternativen
herauszuarbeiten
unumg~nglich.
diese
durchgef~hrf,
Dated
-
im Eereich eines
enthaltenen
Anweisungsausf~hrung
ben~tig%en
Datenflussanalyse halber
betreffenden
eider
~usf~hrung
die
Bereich
eider
~ntscheidungen
Warm immez eine Anweisung
im
die die
Abh~ngigkeiten
vcrangehenden
~oduls liegt, d.h. entweder direkt in einer seiner vorkommt
f@r
genau
(dutch
dieses
Konzeptes
zu
den
Eingabe/Ausgaheoperationen), nach Variablen
schreihen,
Doppelkreis
gekennzeichnet)
noch
die bei Ausf~hrung einen Knoten in einen weiteren
eine
Al~ernative
gegebenen)
Datenflussgraphen
e~pandieren. Ein
Dat~nflussgraph
des
betreffenden
wird gezeichnet,
Medals
nach
der
Schreihcperationen
in
jede Variable
K~sfchenform
beschrieben
(in
Pfeilfcrm gezeichne%
fortlaufende
Kopien
Numerier~ng
nut die let~te dieser d.h.
vcneinander
angefer%ig%en
in darauf folge~den
Ausl~s,ng
der
Entscheidungsvariable ~infachpfeile pote~%ielle
den
~U
oder
V
einen vcn
ergib%
(die
durch
verden)
verwendet
sich
und
iesevorgang allein
werden
yon
der
masgehenden)
Doppelpfeil
angegehen.
Expansionsoperationen
~bergehen
dargeste]Ite Bild. DeE Nebenmodul
einmal
"Aktualit~tswert"
die ers% bei Ausfffhrung
Datenli~ien
Hau~%~odul
Kopien der
and
~uftre%en ein
sind
unterschieden
Expansion
dutch
Datenlinien,
i~ tats~chliche ~nr
h@chstens
Operationen
ist
Lese-
uerden, wohei abet
angegeben) an2ufertigen
dart. F~r Expansionsoperaticnen (f~r
die
werden dart, sodass bei wiederholtes
and @erselben Variablen
besitzt,
indem f~r eine Operation anderen
der
sind
~xpansion
k@nnen. dabei
a hat 2
das
in
~Iternativen,
Ahh.
12
deren
entspreche~de wirk]ich
Datenflussgraphen
kommen
kann,
tragen
ist
wiedergegebene i~
yon Eingabe Eine
s ben~%igt einer
gew~hrleisten
ist.
Betrachter
hestimmt),
bereits
Ausf~hrung
erkennen.
werden.
Dies
vorgegeb~nen eigenes Bei
Graphen
deren Ausf~hrung geschrieben
gel~schf
ausgewMhlt
/2/)
de[
Ausf~hru~gs~glichkeiten VerschwindeD
beschreibf°
des Netzes
ausser~alb
dee
Betrachtungen).
ergibt
erweisen.
im
~r
zu
dabei
der
dass
Anweisung
momentan
Die
als
Netz jedem
Graphen" erlaubte, endet
mit
~uswirkungen ~edien,
maximal
der
steht
mBgliche
Programmansf~hrung
yon
oder Nichtvorhandensein
Programs
sich ~
hestimmt
Zu
"precedence
externen
bei
das gegebene
~herzeugen,
ihre
bestehende
Die ~usf~hrung
dutch das Vorha~deDsein
vo, Date~]inien.
das
(die eigentlichen
Parallelit~t
sich
dieser
eingesetzt.
s~mt]iche
das Schreiben
Ausf~hrungen
kann
sohald
~er Weft
wird in
Programmausf~hrung,
se!bst
Wet% an die
werden soll, und eine Eopie des
sfe!!t das Ne%z den
dar,
einfach
ein darin
vorhanden
selhst
ausf~hrbar,
Expansionsoperation
Ausf~hr~Dgszeitpu~k%
selbst,
sind
wird der entsprechende
Datenf]ussgra~hen
der
(vg!.
erzeugt mit der
~xpansion
ist,
scbald ihre Eingabewerte
definiert ist.
Alternative
eDtsFrechenden
parallelen
beginnend
~. Sobald dutch
werden
~tscheidu,gsvariable
anste!]e
einen
ausschliesslich der
und die Operation
zu
werden.
~xpansicnsopera%icnen welche
sie
f~r
m~ssen ~uerst einsal
hergestellt
ausf~hrbar,
jedoch durch
2umindestens
Expansion,
Expa~sio~soperaticn
Dafenf]ussnetz
~usgabevariable
des
dutch
nut
|s := s)
Kcntrollmechanismus
Operationen
nicht
Mod~lal%ernative
(und f~r den sind
geschieh%
Basisoperationen
(hier
~nalyse zeigt ferner,
Kopieroperation
lassen~
den
Zuge
f@r jede
wird, deren Hereitste!l~ng
Datenflussgraphen
auch
zum
serge zu
tiefergehende
eigenen
menschliche~
~amit
~xpansion "interface"
~alle eines Aufrufes der zweiten
~ie Variable
siDd.
zeigt.
und ~usgabevaria~len
abet in /13/ beschriebene)
Einf~hrung
Die
13
bei einer
f~r ein gleichf~rmiges
(gleich A ~ a h l
~odula]~ernafive). dass
~bb.
jede dieser ~]%ernativen
dabei
m@ge sich der die
voneinander
Leser
wiederholten una~hmngig
However intuitive the graphical representation may be for the human observer, it is unsuitable for use by a machine. As the next step, the graphical representation is therefore linearized and brought into the form of a "single-assignment" language. This is done by writing each module alternative as the set (not list) of the operations occurring in it, separated from one another by semicolons, where module calls are provided with the corresponding parameter list, which shows the input and output variables. For the example given, this yields, in self-explanatory form:

V    = {s0 := 0; n0 := 1; p0 := n0 <= 100; α(p0)[s0, n0; s2, n2]; write s2}
α(1) = [s0, n0; s2, n2] {r0 := f(n0); s1 := s0 + r0; n1 := n0 + 1; p0 := n1 <= 100; α(p0)[s1, n1; s2, n2]}
α(2) = [s0, -; s2, -] {s2 := s0}
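To make the single-assignment form above concrete, here is a minimal Python sketch of an interpreter for it. It is an illustration under stated assumptions, not the machine described in the paper: the tuple encoding of operations, the `#k` suffix used to give each expanded module copy its own local names, and the placeholder `f(n) = n` are all hypothetical (the paper leaves `f` unspecified). An operation fires as soon as its input values exist, and a module call is expanded as soon as its decision value is known.

```python
import itertools


def f(n):
    return n  # placeholder only; the paper leaves f unspecified


# Main module V, deliberately listed in "wrong" textual order.
MAIN = [
    ("write", "s2"),
    ("call", "p0", ["s0", "n0"], ["s2", "n2"]),
    ("op", "p0", lambda n: n <= 100, ["n0"]),
    ("op", "n0", lambda: 1, []),
    ("op", "s0", lambda: 0, []),
]

# The two alternatives of module alpha: (formal inputs, formal outputs, body).
ALPHA = {
    1: (["s0", "n0"], ["s2", "n2"], [
        ("op", "r0", f, ["n0"]),
        ("op", "s1", lambda s, r: s + r, ["s0", "r0"]),
        ("op", "n1", lambda n: n + 1, ["n0"]),
        ("op", "p0", lambda n: n <= 100, ["n1"]),
        ("call", "p0", ["s1", "n1"], ["s2", "n2"]),
    ]),
    2: (["s0", "n0"], ["s2", "n2"], [
        ("op", "s2", lambda s: s, ["s0"]),
    ]),
}

_fresh = itertools.count(1)


def expand(alt, actual_ins, actual_outs):
    """Copy one alternative, binding formals to actuals and renaming locals."""
    formal_ins, formal_outs, body = ALPHA[alt]
    tag = next(_fresh)
    bound = dict(zip(formal_ins + formal_outs, actual_ins + actual_outs))

    def r(name):
        return bound.get(name, f"{name}#{tag}")

    copy = []
    for op in body:
        if op[0] == "op":
            copy.append(("op", r(op[1]), op[2], [r(s) for s in op[3]]))
        else:  # nested module call
            copy.append(("call", r(op[1]),
                         [r(x) for x in op[2]], [r(x) for x in op[3]]))
    return copy


def run(program):
    store, pending, written = {}, list(program), []
    while pending:
        for op in pending:
            if op[0] == "op" and all(s in store for s in op[3]):
                assert op[1] not in store  # each name is assigned once
                store[op[1]] = op[2](*(store[s] for s in op[3]))
            elif op[0] == "call" and op[1] in store:
                alt = 1 if store[op[1]] else 2  # 1 = true, 2 = false
                pending.extend(expand(alt, op[2], op[3]))
            elif op[0] == "write" and op[1] in store:
                written.append(store[op[1]])
            else:
                continue  # not ready yet, try the next operation
            pending.remove(op)
            break
        else:
            raise RuntimeError("no operation is ready")
    return written


if __name__ == "__main__":
    print(run(MAIN))
```

Because readiness alone decides what runs next, the scrambled order of `MAIN` does not matter. A real implementation could fire all ready operations simultaneously; this sketch fires one per step for simplicity.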
The order of the statements within a module alternative has no influence on the program logic, since the execution of statements is controlled locally by the arrival of the required input values. For module calls, parameter passing by name is assumed (passing by value would, however, do just as well).
Der
Nachteil
immer ncch
der
Dicht
Ausf~hrharkeit beschri~ben tieses
gen~gend einer
ist,
entsprechenden
obigen Pregrammrepr~sentation
in
ei~em
erfolgt
Transformationsschritt, Grundgedanke Opera%ie~ ahzurufe~, Operatio~en
,obei
die
Programms%r~ktur
dieser
Transfcrmation
ermittel%en
Datenfe]dern
echten
zwar
Werte
zu s~eichezn
nicht
ist,
implizi%
da
in
letzten
assignment
umgeschrieben ist
es,
indirek£
die
werden m~ssen.
einem
single
~orm in
wird.
Eer
die
dutch
die
in
bestimmten
und s~Ster bei Bedarf yon dot% wieder
scnder~ diese Werte direk% in den Eingi~estellen einzusetzen,
die
eindeutig
Kon%rollmechanismus
jedoch explizi% gemach%
ExF~izitmachen
ei~e variab!enfreie
maschinenorientiert
Opera%ion
Signale
ist, dass sie
we sie ben~tigt
yon
werden. Zu disem Zweck
ist es zun~chst ein~a] nofwendig zwischen die
SchreibeEinf~hrung
yon
Mehrfachz~weisungen jedoch
~och
deswegen
nc%wendig ~c
ers% in ~ i ~
~eziehung was dutch
"¥er%eileranweisungen"
{die
erfolgt.
ge~eig~
Z~s~%zlich ha% es sich
an
all
(die Kopie[anweisungen
eine
nicht
Eins-zu-~ins herzus£ellen,
dars%e!!en)
"Bufferan~eisu~gen" ei~z~f~hren,
eine
u~d Leseopera%ionen
nich%
%riviale
den
Stellen
der ~rt a := h sind)
Cpera%ion
sonst
nur
ausf~hrbar ~Rre, well die Wertzuweisungss%elle
sp~%eren
erfolgenden
Expansionsschritt
erzeug%
wird. Sobal@ is%,
diese ka~
Paar~ng won Lese- und Schreiboperationen
daz~
elisinieren
~bergegangen
und
dutch
Adressen
Anweisungen der Art a := f(b) diese
in ~ := f(b)
werden,
und
workomm%)
Variable, das
angib%,
a
vor,
z.B.
dann
gehen
@her, wobei ~ die ~dresse yon
Stelle im ~rogramm,
selbs%
seDder~ als pla%zhaltende
=
{soo:=O;
=(Po)
ist
dabei
wo a
~oo,
x:=l;
leerstelle anzusehen.
~o:=So+ro; n~=:=n11:
Eine Basisoperation
Po:=noo <100;
[~o,~a:~,-~];
;sa,
,
l:=no; ro:=f(noo) :
sla:=Slo;
a(Poo)
noa:=nol;
nlo,~,:=nox+1;Poo:=n~o-<100;
[s-~i,n~a:sa,n~]}
] {s~:=So}
wird ausgef@hrt~
soba!d ihre ~ingabes%ellen Adressen darste!len.
und der ~uweisung des erhaltenen
die
~dressen.
angegeSenen
vorhanden
Ihre
erfolgt dutch A~swer%ung des an der "rechten" Seite
s%ehenden Ausdruckes ausf~hrbar,
~r
~ors z.~.
write S,}
@erie u~d schald ihre Ausgabestellen Ausf~hru~g
sons%
nich% mehr als
(~ be~eich~et die ieeradresse) :
~{I) = [So,no;S2,na]
~(2) = [ss~
zu
Liegen
Beispielsprcgramm ergib% sich in v~riablenfreier
die ~o~gemde D a r s % e l l ~ g
erfolgt
Variahlen
ersetzen.
c := g(a)
und ~ := g(a)
(d.h. die ei,deu%ig bestimm%e Doch
zu
die
sobald der f ~ ist
Expansionsopera%ionen En%scheidu~g
¥erden
notwendige
Weft
und a!le @brige, Parame%er Adressen darstellen.
3ine Expansio~seperatic~ entsprecSende
die
Wertes an
wizd a~sgef@hrt, indem
Modulalternative
ausgewMhl%,
eine
zQnMchst
die
Kopie dawon
angefertigt
("Expansion")
"geladen"
wird,
~dressen
dy~a~isch
letzte
wcbei
Schritt
Herstellen dutch
und
die
diese
im
in Absclutadressen
in
der
geeignete
besteht
der
Adressliste,
die
Adressliste,
~ie vo~ ~cdul selbst her gbernommen
Dieser
AusfRhrungsmechaniswus
kurzen
Ausf~hrungsbeis~ie!es
heginn%
mit
der
kommt
sei
werden.
Dies
~wischen
Modulaufruf
und
im
"aktuellen" "formalen"
wurde. an Hand eines
Jede
V .
Der
geschieht
der
im Yolgenden
illustriert.
ExpansieDsoperation
relativen
schliesslich
Datenanschldsse.
Adress~bergabe vo~
schliesslich
vorhandenen
umgewandel%
~usf~hr~ng
der entsprechenden
~opie
~odul
Ausfdhrung
Diese
geht
bei
AusfBhrumg ~ber in: Soo:=O;
roe,no,:=1:
po:=noo<-100,
a(F) [Soo,no~;~,"~]; Ven
allen
~xekutio~
diese~
noa:=no,;
write s,.
O~erationen
ist nur nso, no~:=1 ausf~hrbar.
ergibt:
see:=0: po:=1~10o; no=:=1: , ( p ) [ ~ , ~ ; ~ , ~ ] : write sz Wiederum
ist
nut
@ahrheitswert
eine Operation ausf~hrbar
"wahrn
wird
d~rch
I
und
und f~hrt zu
"falsch"
dutch
(der 2
repr~sentier%): Soo:=O; Nun|ehr
ist
ergebende darin
setze~
~berga~e
die
Expansionsoperation
fcrmale Adressvek%or
jene
~ingabewerte zu
noa:=l; =(1)[Soo,noa;~,-~];
Stellen
im
zu liefern sind. und
ist
Mcdul,
~
write s. ausf@hrbar.
[So,no;sa,n2].
u,d die L e e r a d r e s s e ~
nach ~ .
•
sza:=slo:
Mist
daher an die
sind
Adresse
See
verlMuf%
die
Die Adresse s z kosmt nach
2usammen ergibt dies:
~oo,~.,:=no; ~.:=f{noo):
nzo,n~,:=no,+1;
• (Foe) [ s,z, n--~a;~,-~];
so, no
an die die naussen" er~e~gten
no nach noa. Fgr die Ausgabewerte
in umgekehrter Rich%u~q.
~er sich
Poo:=n,o-<100;
writes,
s~o:=So re: nx~:=nz~:
Nunsehr
stud
die
ersten drei Cperationen
und set~t sich die PrograsmausfShrung Je
eher
~nd
umso grosset Die zu]etzt
je
mehr Expansicnsoperationen
is% im al!gemeinen behandelte
Pregra~miersprache, Seres
Aufbau
der
werden
erkennen
l~sst.
In /12/
zu
nicht als verstehen,
Interpretation
finder
sich
eine
der Logik dieser Maschi,e.
(in
/10/ sins
Speicherzuweisung
nachgewiesenen) vcr
direk%e
dynamische
(dutch den leek-ahead ~inhaumSglichkeit Nach%ei!e
Progzamsstruk%ur allem
erw~hnen: so
lange,
Datenverarbeitung
genau
(kein
Adressberechnung,
Effekt der Expansion)
in ein multitasking sind anzuf~hren
zur ~bersetzungs-
als such
sind mannigfaltig.
erhaltenen
zu
(Dates existieren
werden),
Dat~nspeicher),
A!s
ausgef~hrt
Parallelitat.
bereits die Struktur der zu ihrer
Paralle!itHt ben~%ig%
fort.
Sprache fat selbstverst~ndlich
Die Ver%6ile tier erhaltenen Neben
die m~gliche
ausf~hrbar
Weise
sc~dern als Raschinensprache
hen~%ig%en Maschine Beschreihu,g
simultan
in analoger
zur
~ehlen eines physisch existenten
maximalen dynamische als
sie
interner
einfaches
Pages
und eine nat~r!iche
Environment.
der zus~tzliche Laufzeit
und
Interpretierers.
Cverhead sowohl vet
allem
das
APPENDIX A: FIGURES
[Abb. 1: Flow diagram]
[Abb. 2: Flow diagram with module names]
[Abb. 3: Tree of program executions]
[Abb. 4: Data-dependent redundancy]
[Abb. 5: Symbolic interpretation (informal)]
[Abb. 6: Result program]
[Abb. 7: Unsafe code motion]
[Abb. 8: Result program]
[Abb. 9: Program with an endless loop]
[Abb. 10: Result program]
[Abb. 11: Program with iteration parts executable in parallel]
APPENDIX B: REFERENCES
/1/ J. Cocke, J.T. Schwartz, "Programming Languages and Their Compilers", Courant Institute of Mathematical Sciences, New York University, 1969.
/2/ J.B. Dennis, "Computational Structures", Lecture Notes, 1970.
/3/ R.W. Floyd, "Assigning Meaning to Programs", Proc. of Symposia in Appl. Math., Vol. 19, 1967, pp. 19-32.
/4/ E.S. Lowry, C.W. Medlock, "Object Code Optimization", CACM, Vol. 12, 1969.
/5/ P. Lucas, K. Walk, "On the Formal Description of PL/I", Ann. Reviews in Automatic Progr., Vol. 6/3, 1969.
/6/ J. McCarthy, "Towards a Mathematical Science of Computation", Proc. of IFIP Congr., 1962, pp. 21-28.
/7/ R.E. Miller, J.D. Rutledge, "Generating a Data Flow Model of a Program", IBM Techn. Disclosure Bulletin, Vol. 8/11, 1966, pp. 1550-1553.
/8/ R.T. Prosser, "Application of Boolean Matrices to the Analysis of Flow Diagrams", Proc. of the FJCC, 1959, pp. 133-138.
/9/ R.E. Tarjan, "Finding Dominators in Directed Graphs", Proc. 7th Ann. Princeton Conf. in Inform. Sciences and Systems, 1973.
/10/ G. Urschler, "The Inherent Parallelism of Flow Diagrams", IBM Lab Vienna, TR 25.129, 1972.
/11/ G. Urschler, "Automated Programming", Paper available from author.
/12/ G. Urschler, "Patentschrift GE 972 502", IBM Österreich GmbH, Wien, 1973.
/13/ G. Urschler, "The Transformation of Flow Diagrams into Maximally Parallel Form", Proc. of Sagamore Computer Conf. on Parallel Processing, Syracuse University, N.Y., 1973.
/14/ G. Urschler, "Complete Redundant Expression Elimination in Flow Diagrams", IBM T.J. Watson Research Center, Yorktown Heights, N.Y., 1974.
AUTOMATIC PROGRAMMING
Patricia C. Goldberg Computer Sciences Department IBM Thomas J. Watson Research Center Yorktown Heights, New York 10598
ABSTRACT: Work in Automatic Programming focusses on the problem of how to make the computer more accessible to the non-programmer. It involves both the design of interfaces and the development of algorithms to support those interfaces. It is also concerned with how to write programs which can be manipulated: changed, subsetted, or composed. This paper outlines one facet of the work in Automatic Programming at the IBM Research Center, Yorktown Heights, New York.
I. Introduction
The last few years have seen the formation of a number of groups working in the general area of Automatic Programming [1,2,3]. Although this term is used to identify a variety of research activities, these generally have in common an interest in the computer as a tool for end users who have little interest in programming as a skill. This is in contrast to the common goal of computer science research, which focusses on the use of computers by computing professionals.
The aim of work in Automatic Programming is not to develop specific applications, but rather to develop application methodologies: that is, methods of specifying programs and systems of programs which will automate a given area. Because of the intended audience for these methodologies, much attention has been paid to both very high level programming interfaces and "non-programming" interfaces [2,4,5].
The group at the IBM Research Center is primarily focussing on the area of business applications. There are several reasons for this. First, there is a fair amount of structure in business applications which can be exploited. Further, many small businesses could employ computers to advantage were it not for the cost of programming. Finally, because there is also a great deal of individual variation in businesses, we must face directly the problem of handling these idiosyncrasies.
We are looking at many possible methods for specifying business programs. Some of these involve new programming interfaces in which the user directly specifies the application, while others concentrate on methods for determining what the user needs in the way of application programs and for composing the required programs out of program fragments. We are also looking at techniques for programming by example, which would permit the user to specify a procedure by giving examples of the input-output relationship. Some of our work in this area involves the design of appropriate interfaces, while other groups are concentrating on the development of efficient implementation techniques for realizing these interfaces.
In spite of the concern for methodologies for use in a programmerless environment, the Yorktown group is nonetheless very concerned with programs. In many respects the involvement with programs is intensified by this point of view, since programs must be considered as automatically constructible, manipulable objects. This leads to a somewhat different perspective on programs from that derived from a view of programs as manually constructed objects for one-situation use. In order to illustrate this point, and in order to show how some of the apparently disparate efforts in Automatic Programming fit together, the following sections elaborate one problem area and our solution to it.
II. The Problem
One approach to the problem of developing tailored applications in a programmerless environment is to write a large number of program fragments which, taken as a whole, represent all of the common ways of performing a given business function. Then, on the basis of information received from the ultimate user concerning his particular application needs, these program fragments are subsetted, modified, and composed to build the required application programs.
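As a rough illustration of this approach, the sketch below treats program fragments as small functions and builds an application by selecting and composing them according to the user's answers. Everything in it is a hypothetical assumption made for illustration (the fragment names, the single question `charges_sales_tax`, amounts held as integer cents); the paper does not prescribe this representation.

```python
# Each fragment takes an invoice record and returns an extended copy.

def add_gross(invoice):
    out = dict(invoice)
    out["gross"] = sum(price * quant for price, quant in invoice["items"])
    return out


def add_sales_tax(invoice):
    out = dict(invoice)
    out["tax"] = invoice["gross"] * 5 // 100  # 5 percent, in cents
    return out


def add_no_tax(invoice):
    out = dict(invoice)
    out["tax"] = 0
    return out


def add_amount(invoice):
    out = dict(invoice)
    out["amount"] = invoice["gross"] + invoice["tax"]
    return out


def build_application(answers):
    """Select a subset of the stored fragments and compose them."""
    steps = [add_gross]
    steps.append(add_sales_tax if answers["charges_sales_tax"] else add_no_tax)
    steps.append(add_amount)

    def application(invoice):
        for step in steps:
            invoice = step(invoice)
        return invoice

    return application


taxed = build_application({"charges_sales_tax": True})
untaxed = build_application({"charges_sales_tax": False})
```

Here two different applications are generated from the same stored fragments, which is the kind of subsetting and composition the paragraph above describes.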
Such a view of application development raises several interesting questions. How ought program fragments be expressed to facilitate the kind of manipulations required? Can such program fragments be written by an expert familiar with the application area rather than by someone who is an expert in data processing? How are programs truly peculiar to a given application to be introduced? Is it possible to combine the programs written for two separate areas to form one application package?
In the process of eliciting information from the end user concerning his particular requirements, what kind of user-computer dialogue is necessary and how do we relate the information gathered from the user to the program fragments? What models of the programs and computing processes, other than the programs themselves, are required in developing such a system? What other models of the business world are useful? How do we correlate these with the information received from the user?
Finally, how do we reconcile the requirements imposed on the programs from the standpoint of manipulability with the requirement that they make reasonable use of machine resources? How do we decide on appropriate data representations? What kind of analysis and optimization techniques are peculiarly required by programs so generated?
The construction of a feasible system for generating applications from prestored program fragments requires that each of these questions be faced directly; otherwise the resulting system will be nothing but a toy. Each of these facets of the problem is being addressed by the Yorktown Automatic Programming group. The bulk of the discussion below, however, is concerned with the specification of a programming interface in which to develop the program fragments. The work on the other issues raised above is briefly discussed in Section V.
III. The Programming Interface
Is a new programming interface required for this kind of work? The following example (one to which we shall return frequently) illustrates some of the problems with existing languages.
Consider the task of taking a set of incoming orders and producing the invoices for the goods ordered. Assume that it has been decided to produce a separate invoice for each order. In this case the program, in a conventional language such as PL/I, is rather trivial. That is, it consists of a looping construction which iterates over all the elements of the input file and for each element produces an output, which is essentially a copy of the input with additional calculations. This code is shown in figure 1. It is assumed in this example that the two files required, CUST_MAST and ITEM_MAST, are small enough to be represented as simple vectors in primary storage. Were this not the case, decisions concerning the appropriate I/O statements and access methods would have to be incorporated into the code -- further complicating the situation.
DO I = 1 TO NO_ORDERS;
  INVOICE.NAME  = CUST_MAST(ORDER(I).CUST#).NAME;
  INVOICE.ADDR  = CUST_MAST(ORDER(I).CUST#).ADDR;
  INVOICE.CUST# = ORDER(I).CUST#;
  INVOICE.GROSS = 0;
  DO J = 1 TO ORDER(I).NO_ITEMS;
    INVOICE.ITEMS(J).ITEM#  = ORDER(I).ITEMS(J).ITEM#;
    INVOICE.ITEMS(J).QUANT  = ORDER(I).ITEMS(J).QUANT;
    INVOICE.ITEMS(J).PRICE  = ITEM_MAST(INVOICE.ITEMS(J).ITEM#).PRICE;
    INVOICE.ITEMS(J).EXTEND = INVOICE.ITEMS(J).PRICE * INVOICE.ITEMS(J).QUANT;
    /* GROSS accumulates the extended price (price times quantity) */
    INVOICE.GROSS = INVOICE.GROSS + INVOICE.ITEMS(J).EXTEND;
    IF INVOICE.ADDR.STATE = 'NEW YORK'
      THEN INVOICE.TAX = .05 * INVOICE.GROSS;
      ELSE INVOICE.TAX = 0;
    INVOICE.AMOUNT = INVOICE.GROSS + INVOICE.TAX;
  END;
  CALL PRETTYPRINT(INVOICE);
END;
FIGURE 1. CODE TO PRODUCE INVOICES FROM ORDERS -- ONE PER ORDER
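For readers who do not know PL/I, the logic of figure 1 can be restated as the following Python sketch. It is only an illustration: the dictionary layout of the master files, the sample customers and prices, and the use of integer cents are assumptions introduced here, and the tax rule simply mirrors the 5 percent New York case of the figure.

```python
# Hypothetical master files held in primary storage (amounts in cents).
CUST_MAST = {
    17: {"name": "A. Smith", "addr": {"state": "NEW YORK"}},
    42: {"name": "B. Jones", "addr": {"state": "OHIO"}},
}
ITEM_MAST = {1: {"price": 250}, 2: {"price": 100}}


def make_invoice(order):
    """Produce one invoice as a copy of the order plus calculated fields."""
    customer = CUST_MAST[order["cust"]]
    items, gross = [], 0
    for line in order["items"]:
        price = ITEM_MAST[line["item"]]["price"]
        extend = price * line["quant"]
        items.append({"item": line["item"], "quant": line["quant"],
                      "price": price, "extend": extend})
        gross += extend
    tax = gross * 5 // 100 if customer["addr"]["state"] == "NEW YORK" else 0
    return {"name": customer["name"], "addr": customer["addr"],
            "cust": order["cust"], "items": items,
            "gross": gross, "tax": tax, "amount": gross + tax}


def invoices_per_order(orders):
    # One invoice for every incoming order, in input order.
    return [make_invoice(order) for order in orders]


sample_orders = [
    {"cust": 17, "items": [{"item": 1, "quant": 2}, {"item": 2, "quant": 1}]},
    {"cust": 42, "items": [{"item": 2, "quant": 3}]},
]
sample_invoices = invoices_per_order(sample_orders)
```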
The program in figure 2 is for the same problem, except that it produces only one invoice for each distinct customer who has placed an order during that period. This, of course, implies that several orders may be collected to produce one invoice. In spite of what appears to be a rather minor change in the specification, the PL/I program for this function looks very different from the original program. It still begins with a loop over the input documents, but in this case the loop is used only to sort the orders into piles of orders with a common customer number. There is necessarily a second loop which takes each pile and produces an invoice. This includes logic to ensure that, if the same item is contained on several orders for the same customer, only one mention of the item appears on the invoice, along with the appropriate adjustment to the quantity of that item that has been ordered.
    J = 0;
    DO I = 1 TO NO_ORDERS;
      DO K = 1 TO J;
        IF ORDER(I).CUST# = M_ORD(K).CUST# THEN
          CALL MERGE(ORDER(I), M_ORD(K));
          /* MERGE adds in new items to M_ORD and adjusts
             the quantity for previously ordered items */
      END;
      IF K = J + 1 THEN DO;
        J = J + 1;
        M_ORD(J).CUST# = ORDER(I).CUST#;
        DO L = 1 TO ORDER(I).NO_ITEMS;
          M_ORD(J).ITEMS(L).ITEM# = ORDER(I).ITEMS(L).ITEM#;
          M_ORD(J).ITEMS(L).QUANT = ORDER(I).ITEMS(L).QUANT;
        END;
      END;
    END;
    DO I = 1 TO J;
      /* Prepare Invoice - Similar to Fig. 1 */
      CALL PRETTYPRINT(INVOICE);
    END;
FIGURE 2. CODE TO PRODUCE INVOICES FROM ORDERS -- ONE PER CUSTOMER
Even though the specifications for these two programs differ only slightly, the programs vary in nontrivial ways, so that it would be difficult to see how to generate one from the other in a simple manner. It would probably be easier to write a new program than to attempt to modify the first. It can be argued that this is the result of a hidden "optimization" in the first program which causes calculations for the preparation and aggregating of the input to be merged with those for producing the output documents.
The function originally described above is clearly a batch processing function in the sense that it takes a predetermined set of orders and produces the set of invoices. Suppose now that we wanted to place essentially the same function in an on-line environment, in which case we would want the module that produces orders to send them individually, as they are created, to the module that produces invoices. Although in this case the change to the invoice producing module would not be catastrophic, the effect on the overall system structure and the order in which modules make calls on other modules, would be. Yet again, conceptually, from an applications point of view, the change is trivial.
There are other objections which we can make to writing these programs in PL/I or any other computer oriented language. For example, the model writer must distinguish between the same function when applied to large files and when applied to small files. This is largely because of the difference in accessing techniques for primary and secondary storage devices. Furthermore, the programmer must be concerned with the various formats of the data, specifically the difference between internal and external representations. In many languages these distinctions, as well as issues concerned with word length, etc., are spread throughout the code. All of these problems make it necessary to find in the same individual both enough application knowledge and data processing skill to write the required program fragments and the rules of composition.
From these comments it is possible to construct a list of requirements which the programming interface must at least meet if it is to prove suitable for this task. First, it must be a very high level interface in which data processing notions are suppressed and terms, concepts and constructs reasonable to the application expert are utilized. This requirement stems simply from the impossibility of finding people skilled in both areas. There are other, more technical, characteristics that the language should also have which influence how these high level notations are introduced. One would like to guarantee that the specifications for an application option map into some relatively local fragment of the program. This, of course, allows us to modify a program for one application option to handle a different option by making trivial local changes to the program. There is another technical reason for this requirement. It is highly desirable to program the fragments in such a way that application options that are apparently independent do not affect the same pieces of code. Put another way, dependencies that are not present in the application itself should not appear in the program representing the application. Finally, the language should place minimal fixed sequencing constraints on the program. Such a language makes it easier to insert and remove functions. It will also make it easier to support batch and data entry functions for the same programming module.
IV. The Business Definition Language

The Business Definition Language, BDL [5], has been defined with these criteria in mind. It is aimed at a particular set of business applications, namely, the paper handling, data manipulation functions. More specifically it is directed towards applications which are minimally involved with computation and maximally involved with data manipulation. It is almost certainly not the language that should be used in writing, for example, the linear programming algorithms used in warehouse utilization problems.
The language itself is highly structured and is designed to support a particular style of programming. It is intended that for any function there will be one obvious way to program it. That is, there should not be a number of possible computational paths to the solution. Furthermore, the one path should be obvious to someone skilled in the application area. In particular, the application programmer will not be concerned with issues of efficiency, nor will there be choices in the language which will allow him to substantially affect the efficiency of the algorithm. Rather, the language has been designed so that the structure of the programs will reflect the structure of the business application.
The view of business that is reflected in this programming language is that a business consists of operating entities, such as departments or sections, or clerks; and that these operating entities communicate with one another largely on forms or documents. The function of these entities, at least the parts that can be reasonably automated, is to transform their incoming forms to the kind of forms that they produce.
As a consequence of this view of business the BDL language contains as primary elements: steps, which are used to represent the operating entities; documents, which represent the paper that the steps consume and produce; paths, which represent the established communication links between the various operating entities.
The language makes heavy use of two dimensional programming techniques and is intended to be used with a sophisticated display device. The language is also heavily biased toward interactive applications, in recognition of the fact that many business activities can only be partially automated. This aspect of the language, however, will not be discussed.
BDL programs are developed in a structured way. Programming begins by drawing on a screen labelled boxes representing the principal operating entities within the application area to be automated. These boxes are connected by paths indicating the lines of communication among them. These paths are labelled with the kinds of documents that flow along them. The semantics associated with this part of BDL is that the step from which the path emanates is expected to produce a set (perhaps a singleton set) of documents at some instant of time. The steps receiving this set of documents will begin execution as soon as a set is available along each of their input paths. Thus at this level the language is a data flow language with steps producing all of their outputs simultaneously and executing whenever their input is ready. The only constraints on the order of execution are those implied by the data flow.
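The firing rule just described can be sketched in Python. This is only an illustrative model and not BDL itself: the names `Step` and `run`, the example steps, and the representation of a path as a dictionary entry holding one set of documents are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """Hypothetical model of a step: named input paths and one transformation."""
    name: str
    inputs: List[str]                                    # names of the input paths
    fire: Callable[[Dict[str, list]], Dict[str, list]]   # input sets -> output sets


def run(steps: List[Step], available: Dict[str, list]) -> List[str]:
    """Fire each step once, as soon as every one of its input paths holds a set.

    Returns the names of the steps in the order they fired. The order in which
    the steps are listed does not matter; only data availability does.
    """
    paths = dict(available)
    pending = list(steps)
    fired = []
    progress = True
    while pending and progress:
        progress = False
        for step in list(pending):
            if all(p in paths for p in step.inputs):
                outputs = step.fire({p: paths[p] for p in step.inputs})
                paths.update(outputs)      # all outputs become available together
                pending.remove(step)
                fired.append(step.name)
                progress = True
    return fired


# The steps are deliberately listed in the "wrong" order.
sales = Step("Sales Analysis", ["Salesman summary"], lambda d: {})
billing = Step("Billing", ["Order"],
               lambda d: {"Invoice": d["Order"], "Salesman summary": d["Order"]})
order_of_firing = run([sales, billing], {"Order": ["order-1", "order-2"]})
```

In this sketch Billing fires first because its only input is present, and Sales Analysis fires once the Salesman summary set exists, which is the sense in which the data flow alone fixes the order of execution.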
An example of this aspect of the BDL program is shown in figure 3. This example indicates an application consisting of three steps, namely, Billing, Inventory Control, and Sales Analysis. It involves several sorts of documents: Orders, Order Acknowledgements, Back Orders, Item Summaries, Invoices, and Salesman Summaries. Except for Orders, which originate outside of this step, all of these documents originate in the Billing department and are used to trigger work by the Inventory Control department and the Sales Analysis department, or, in the case of Order Acknowledgements and Invoices, to be sent outside of the system. It should be emphasized that a programmer writing in BDL will, via a display device, enter precisely the sort of diagram shown in figure 3. BDL is not a linear language. This information is part of the BDL program; it is not accompanying documentation.

The application expert then proceeds to specify in more detail how each of the steps executes. As figure 4 illustrates, this can be accomplished in part simply by elaborating the data flow to a further level of detail. In figure 4 the Billing step has been broken down to indicate that it contains three sub-steps: the step which produces the Order Acknowledgement, the step which produces the Invoice, and a special step called an accumulator. The accumulator step, like a number of special purpose steps which have been introduced into BDL, is a device which is useful in data flow programming and which, it is hoped, will reduce the programming burden. The accumulator step illustrated here takes in singleton sets of Orders until a signal is received. At that point it outputs the accumulated group of Orders.
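The behaviour of the accumulator step can be pictured with a small Python sketch. The class name `Accumulator` and its methods `receive` and `signal` are hypothetical names chosen for illustration; the source does not say how the signal is delivered, so it is modeled here as an explicit method call.

```python
class Accumulator:
    """Hypothetical accumulator step: collects documents until signalled."""

    def __init__(self):
        self._held = []

    def receive(self, order):
        """Take in one singleton set, modeled here as a single document."""
        self._held.append(order)

    def signal(self):
        """Emit the accumulated group as one set and start a new, empty group."""
        group, self._held = self._held, []
        return group


acc = Accumulator()
for order in ["order-1", "order-2", "order-3"]:
    acc.receive(order)
batch = acc.signal()          # the three orders leave together as one set
after_signal = acc.signal()   # nothing has arrived since, so the group is empty
```

A step of this kind lets the same downstream step serve both one-at-a-time entry and batch processing, since the downstream step only ever sees sets of documents.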
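[Figure 3: data flow diagram. Boxes for the Billing, Inventory Control, and Sales Analysis steps are joined by paths labelled with the documents that flow along them, such as Order and Item summary.]
FIGURE 3. AN INITIAL BDL PROGRAM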
[Figure 4: the Billing step expanded into sub-steps, including Produce Order Acknowledgement and Produce Invoice, with paths labelled Order, Backorder, Order Ack, Item summary, Invoice, and Salesman summary.]
FIGURE 4. FURTHER SPECIFICATION OF THE BILLING STEP
Another facet of BDL illustrated in figure 4 is the role of permanent storage, called files. In figure 4 two files, the Item Master and Customer Master, are shown. The Item Master can be accessed by both the Produce Order Acknowledgement step and the Produce Invoice step. The Customer Master file is accessed only by the Produce Invoice step. The perception of files encouraged in BDL is similar to the one which in a manual operation is associated with a folder in a file drawer: that is, it is a collection of documents all of the same sort which are accessed in total by a step when it executes. Thus from the point of view of a step, it is as if the step simply had another input path.
The programmer can continue with this process of elaboration as long as he finds it productive. At some point, however, the problem cannot reasonably be decomposed further in terms of data flow. The programmer must then indicate the details of the relation between the group of input documents and the group of output documents. To do so he uses another portion of BDL, specifically designed to show how the transformation from input to output documents is accomplished, called the transformation component.
The transformation component is a highly structured language with a tabular format. Figure 5 illustrates such a piece of code. The first column lists, in a structure-like format, the fields on the output document to be computed. A field may have sub-fields and a given field may be repeated indefinitely often. The latter possibility is indicated by appending "'s" to the field name in the listing. The second column is the causality/derivation. For individual (scalar) fields it indicates the computation by which the field value is derived. For indefinitely repeating groups of fields it indicates how to compute the cardinality of this group for a particular document and, in effect, selects the set of items that will be used in computing the subfields. In writing the causality/derivation, field names may be introduced which are not names of fields from the first column. Each such name is listed in the third column. The definition of the name is given in the fourth column.

Group/Field           Causality/Derivation          Name                      Definition

1 Invoice's           ONE PER Order                 Order's                   INPUT
2   Name              Customer name                 Customer name             IN Assoc. Customer master
                                                    Assoc. Customer master    Customer master WHERE Customer number = Cust#
                                                    Customer number           IN Customer master
                                                    Customer master's         INPUT
2   Address           Customer address              Customer address          IN Assoc. Customer master
                                                    Assoc. Customer master    Customer master WHERE Customer number = Cust#
                                                    Customer number           IN Customer master
                                                    Customer master's         INPUT
2   Cust#             Incoming Cust#                Incoming Cust#            IN Order
                                                    Order                     CAUSE OF Invoice
2   Item's            ONE PER Incoming Item         Incoming Item's           IN Order
                                                    Order                     CAUSE OF Invoice
3     Item#           Incoming Item#                Incoming Item#            IN Incoming Item
                                                    Incoming Item             CAUSE OF Item
3     Quantity        Incoming Quantity             Incoming Quantity         IN Incoming Item
                                                    Incoming Item             CAUSE OF Item
3     Price           Item price                    Item price                IN Assoc. Item master
                                                    Assoc. Item master        Item master WHERE Item number = Item#
                                                    Item number               IN Item master
                                                    Item master's             INPUT
3     Extended price  Quantity × Price
2   Gross             SUM(Extended price's)
2   Tax               0.05 × SUM(Quantity's × Price's)    STATE(Address) = "New York"
                      0                                   OTHERWISE
2   Amount            Gross + Tax

FIGURE 5. BDL PROGRAM TO PRODUCE INVOICES (ONE PER ORDER) FROM ORDERS

Figure 5 is the BDL program equivalent to the PL/I program of figure 1. That is, it is the program to calculate invoices assuming that one invoice is to be produced for each incoming order. Several aspects of the transformational component are illustrated by this example. First, the program centers around the computation of the output. This is in contrast to the PL/I program, which was driven by consideration of the input. Second, the form of the transformation component makes it very difficult to reuse computations. This is illustrated by considering the computation for Name and Address in figure 5. These calculations both require access to the Customer Master file. In most programming languages, assuming the file resides in secondary memory, the program would be written in such a way that the appropriate record was accessed once and the two fields extracted. In BDL this must be indicated by two separate computations. This, of course, permits us to change the definition of how one value is computed without requiring that the second definition be changed as well. This is one way in which BDL supports the notion of locality of definition.
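The output-driven style of figure 5 can be approximated in Python by giving every output field its own derivation. This is a sketch under stated assumptions, not a rendering of BDL semantics: the master files are modeled as dictionaries, and the sample data and all function and key names are hypothetical.

```python
# Hypothetical master files, modeled as dictionaries keyed by number.
CUSTOMER_MASTER = {7: {"name": "A. Smith", "address": {"state": "New York"}}}
ITEM_MASTER = {100: {"price": 2.0}, 200: {"price": 5.0}}


# Each output field has its own derivation; none shares intermediate state.
def name(order):
    return CUSTOMER_MASTER[order["cust"]]["name"]


def address(order):
    # A second, separate lookup of the same record, as in figure 5.
    return CUSTOMER_MASTER[order["cust"]]["address"]


def items(order):
    # ONE PER Incoming Item
    return [{"item": i["item"],
             "quantity": i["quantity"],
             "price": ITEM_MASTER[i["item"]]["price"],
             "extended": i["quantity"] * ITEM_MASTER[i["item"]]["price"]}
            for i in order["items"]]


def gross(order):
    return sum(line["extended"] for line in items(order))


def tax(order):
    return 0.05 * gross(order) if address(order)["state"] == "New York" else 0


def amount(order):
    return gross(order) + tax(order)


def invoices(orders):
    # ONE PER Order
    return [{"name": name(o), "address": address(o), "cust": o["cust"],
             "items": items(o), "gross": gross(o), "tax": tax(o),
             "amount": amount(o)} for o in orders]


orders = [{"cust": 7, "items": [{"item": 100, "quantity": 3},
                                {"item": 200, "quantity": 1}]}]
result = invoices(orders)
```

The repeated lookups and recomputations are deliberate here. They mirror the redundancy the text describes, and they mean that the definition of one field can be changed without touching any other.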
Figure 6, which corresponds to the program in figure 2, illustrates other characteristics of the transformational component. If a program is to be written in terms of the output rather than the input, then it is necessary to indicate what causes a given element in the output set to be produced. The causality for invoices is given as "ONE PER Like Order's". Like Orders are defined to be "Order's WITH COMMON Cust#". The effect of this set of definitions is to create a partition of the input set, where each element of the partition is the set of Orders with a common customer number. One Invoice is produced for each element in the partition. Similarly, items are produced "ONE PER Like Incoming Item's" where Like Incoming Item's are defined to be "Incoming Item's WITH COMMON Item#" and where Incoming Items are defined to be "ALL IN Like Order's". This piece of code first causes a set to be formed of all Items on all Like Orders (that is, all Orders with a given customer number). This set of items is then partitioned according to common item number and an item on the Invoice is produced for each element in the partition. Observe that the two programs in figures 5 and 6 vary only in those parts of the program which are concerned with calculating Invoices and Items. All other computations remain the same. This is another way in which BDL supports the notion of locality of reference.
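The partitioning described above can be sketched in Python. The helper `with_common` is a hypothetical stand-in for the WITH COMMON construct, and the dictionary keys and sample orders are assumptions made for the example.

```python
from collections import defaultdict


def with_common(documents, key):
    """Partition documents into groups that share a value of `key`.

    A stand-in for WITH COMMON; groups are returned in first-seen order.
    """
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[key]].append(doc)
    return list(groups.values())


def invoice_items(like_orders):
    # Incoming Item's: ALL IN Like Order's
    incoming = [item for order in like_orders for item in order["items"]]
    # ONE PER Like Incoming Item's, with Quantity = SUM(Incoming Quantity's)
    return [{"item": like[0]["item"],
             "quantity": sum(i["quantity"] for i in like)}
            for like in with_common(incoming, "item")]


def invoices(orders):
    # ONE PER Like Order's
    return [{"cust": like[0]["cust"], "items": invoice_items(like)}
            for like in with_common(orders, "cust")]


orders = [
    {"cust": 7, "items": [{"item": 100, "quantity": 3}]},
    {"cust": 9, "items": [{"item": 100, "quantity": 1}]},
    {"cust": 7, "items": [{"item": 100, "quantity": 2},
                          {"item": 200, "quantity": 1}]},
]
result = invoices(orders)
```

Customer 7 placed two orders, so the sketch yields two invoices for three orders, and item 100 appears once on that customer's invoice with the quantities summed.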
Group/Field           Causality/Derivation          Name                      Definition

1 Invoice's           ONE PER Like Order's          Like Order's              Order's WITH COMMON Cust#
                                                    Order's                   INPUT
2   Name              Customer name                 Customer name             IN Assoc. Customer master
                                                    Assoc. Customer master    Customer master WHERE Customer number = Cust#
                                                    Customer number           IN Customer master
                                                    Customer master's         INPUT
2   Address           Customer address              Customer address          IN Assoc. Customer master
                                                    Assoc. Customer master    Customer master WHERE Customer number = Cust#
                                                    Customer number           IN Customer master
                                                    Customer master's         INPUT
2   Cust#             Incoming Cust#                Incoming Cust#            IN Order
                                                    Order                     CAUSE OF Invoice
2   Item's            ONE PER Like incoming Item's  Like incoming Item's      Incoming Item's WITH COMMON Item#
                                                    Incoming Item's           ALL IN Like Order's
                                                    Like Order's              CAUSE OF Invoice
3     Item#           Incoming Item#                Incoming Item#            COMMON IN Like incoming Item
                                                    Like incoming Item's      CAUSE OF Item
3     Quantity        SUM(Incoming Quantity's)      Incoming Quantity         IN Like incoming Item
                                                    Like incoming Item's      CAUSE OF Item
3     Price           Item price                    Item price                IN Assoc. Item master
                                                    Assoc. Item master        Item master WHERE Item number = Item#
                                                    Item number               IN Item master
                                                    Item master's             INPUT
3     Extended price  Quantity × Price
2   Gross             SUM(Extended price's)
2   Tax               0.05 × SUM(Quantity's × Price's)    STATE(Address) = "New York"
                      0                                   OTHERWISE
2   Amount            Gross + Tax

FIGURE 6. BDL PROGRAM TO PRODUCE INVOICES FROM ORDERS (GROUPED BY CUSTOMER)
There are other components of BDL which are used to complete the program definition. One of these components has to do with the definition of forms, i.e., empty documents. In this component are isolated all of the data typing, data formatting and field sizing aspects of a program which, in more conventional languages, are sprinkled throughout the program. These definitions are accomplished by displaying or creating a form on a screen and systematically filling it out in various ways: indicating either the name, the field size, the formats, or the data types of the objects which it is to contain. This not only isolates these aspects of programming but also is a specific context in which an application expert can deal with these matters. In passing, it is interesting to note that BDL contains several unconventional data types having to do with money, dates, addresses, etc. It also has rather strict rules about the ways in which different data types can be combined and what the resulting data type is. These rules are used to do extensive type checking on the computations used in the transformational component.
Finally, there are ways for indicating that a transformation step involves human interaction. For all steps associated with an I / O device there are ways of indicating this as well.
In summary, then, BDL is a highly structured language in which the various components of programs in an application area are isolated and dealt with separately. This maximizes the possibility of changing one aspect of a program without concern for other aspects. This in turn gives us the manipulability and composability that we seek.
V. Other Associated Work

BDL has been designed to make it possible for an application expert to write a large set of program fragments which, taken as a whole, can be used to generate a large variety of tailored application programs. Two issues remain to be discussed. The first concerns the method by which information is extracted from the user concerning his application and how that information is used to generate the programs required for his application. We are currently working on two quite different techniques for accomplishing this purpose.
In the first approach we are attempting to judge the feasibility of using a predetermined set of questions, which are chosen by the application programmer who writes the program fragments. The application programmer is also given straightforward methods for showing the composition of program fragments corresponding to the various alternative answers to the question. The system is also being designed so that the application programmer can introduce his question set without regard to the order in which the eventual end user will answer the questions. Although it is feasible to build such a system, it is less clear whether it is possible for the application expert to generate such a question set and the associated actions on the program fragments and, further, whether such a system will satisfy the user's needs. We are currently building a prototype system of this sort and plan to run extensive human factor studies to determine how easy it is to use, both for the application expert and for the end user.
A second approach to the question of eliciting information from the end user is to try to build a system which engages in a more informal dialogue and which, as a result of an extended conversation with the user, determines his application needs and arranges for the appropriate program to be generated. Here the problem is not so much the specification of an appropriate interface as the development of a methodology to support it. As a consequence we are investigating the issue of how to model programs and how to model the business world. We are also formulating methodologies for inferring the information required for generating appropriate programs. Another important issue is how to exploit these models to direct the conversation with the user in an appropriate direction so that the necessary information can be gathered. We must, however, be prepared to accept unexpected information from the user and make use of it as best we can. Obviously, we must be able to uncover and resolve apparent inconsistencies and misunderstandings on the part of the user. Clearly, this is a very difficult task, and one in which we do not expect immediate results.
Another problem with which we must deal is the translation of the BDL programs to executable code. BDL as a language forces considerable redundancy in the specification of a system. This redundancy is an especially severe problem because many of the operators in BDL are aggregate operators. Because of its forms orientation, this redundancy also appears in the data structures used in a given program. As a consequence straightforward implementation of BDL, either as an interpreter or as a simple compiler, will result in unacceptable execution time. Hence we are looking at program analysis and optimization techniques which will transform a program written in BDL which takes maximum advantage of the application expert's skill into a program that makes better utilization of the machine resources. This analysis must take place on the whole complex of BDL modules. It cannot be limited to a single transformational step. Considerable emphasis is being placed on the collection of global information relating the various parts of the BDL program, and on transformations involving the aggregate operators and data structures.
VI. Acknowledgements

The work briefly outlined above corresponds only to one facet of the work in Automatic Programming at Yorktown Heights. Other points of view are also being vigorously pursued. These will be reported on at a later date. The work outlined includes contributions from a large number of members of the Automatic Programming group. Special acknowledgement should be made of the contributions of Gerry Howe, Irving Wladawsky, Vincent Kruskal and Mike Hammer (now at MIT) in the definition of BDL. Irving Wladawsky, Martin Mikelsons, Peter Sheridan and George Heidorn are responsible for the work on the Information Acquisition System. Fran Allen, Dave Lomet, and Bill Harrison are contributing the design of the analysis and optimization techniques.
References

[1] Balzer, Robert, Automatic Programming, Technical Memo, Information Sciences Institute, University of Southern California, September, 1972.
[2] Martin, William, et al., Automatic Programming Internal Memos, 1972, 1973.
[3] Automatic Programming Workshop, M.I.T., January, 1973.
[4] Hershey, E. A., et al., PSL/II Language Specifications, Version 1.0, ISDOS Working Paper No. 68, University of Michigan, Dept. of Industrial and Operations Engineering, Ann Arbor, Michigan (Feb. 1973).
[5] Hammer, M. M., Howe, W. G., Wladawsky, I., An Interactive Business Definition System, RC 4680, IBM T. J. Watson Research Center, Yorktown Heights, New York, January, 1974.
NONPROCEDURAL PROGRAMMING
B. M. Leavenworth Computer Sciences Department IBM Thomas J. Watson Research Center Yorktown Heights, New York
ABSTRACT
Nonprocedural programming involves the suppression of unnecessary detail from the statement of an algorithm. The conventional representation of an algorithm as a step by step sequential procedure often obscures the essential nature of the procedure. In many cases, algorithms are more transparent when stated recursively, combinatorially or nondeterministically. The paper discusses these three styles of programming and gives examples of their use. The elimination of certain low level features of traditional programming and their replacement by these and other techniques (associative referencing, aggregate operators and pattern matching) is advocated in order to raise the level of algorithm description.
Introduction
Nonprocedural programming has many of the goals of structured programming: constructing programs that are easier to understand, modify and debug. In addition, it involves the suppression of unnecessary detail from the statement of an algorithm. The conventional representation of an algorithm as a step by step sequential procedure often obscures the essential nature of the procedure. In many cases, algorithms are more transparent when stated recursively, combinatorially or nondeterministically.

We see the problem solving process as composed of three components:

(1) statement of the problem
(2) statement of the solution
(3) efficient implementation
We are mainly interested from a programming point of view in the second step. Step (3) can in principle be carried out by an optimizing compiler, whereas the transformation from step (1) to step (2) is a problem in Artificial Intelligence. In any case, the end user must always satisfy himself either that his statement of the problem (1) or solution (2) is correct. This paper is concerned with techniques which, by removing certain low level details, make this task of verification easier.

What is nonprocedural programming? There is no commonly accepted definition, but for purposes of this paper we will say that it involves specifying the desired outcome as a function of the inputs. A nonprocedural program is functional in the sense that it always produces the same output when presented with the same input; a nonprocedural program has no side effects. This condition can be guaranteed by eliminating assignment. For a survey of nonprocedural languages, see (Leavenworth and Sammet 1974).
Recursive Programming
There are many algorithms that are easier to state recursively than iteratively. Examples abound in areas such as sorting, tree walking, parsing, etc. Knuth's Chapter 2 on Information Structures (Knuth 1968) presents algorithms in a style which is suitable for efficient implementation. As an example, we choose one of his algorithms for traversing binary trees. A binary tree is a finite set of nodes that either is empty, or consists of a root together with two binary trees. Knuth represents the tree by the data structure:

[Diagram not recoverable: linked representation of a binary tree, in which each node carries a left link and a right link and Λ marks an empty link.]
The algorithm for "postorder" traversal can be stated simply:

    Traverse the left subtree
    Visit the root
    Traverse the right subtree

The detailed version of this algorithm is given below in his iterative style.
Algorithm T. Let T be a pointer to a binary tree and A be an auxiliary stack.

T1. [Initialize.] Set stack A empty, and set the link variable P <- T.
T2. [P = Λ?] If P = Λ, go to step T4.
T3. [Stack <= P.] (Now P points to a nonempty binary tree which is to be traversed.) Set A <= P, i.e., push the value of P onto stack A. Then set P <- LEFT(P) and return to step T2.
T4. [P <= Stack.] If stack A is empty, the algorithm terminates; otherwise set P <= A.
T5. [Visit P.] "Visit" NODE(P). Then set P <- RIGHT(P) and return to step T2.

In this case, visit means accumulate the "value" of the root in a buffer which is printed when the algorithm terminates.
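Algorithm T can be transcribed into Python roughly as follows. This is a sketch, not Knuth's own code: the `Node` class and the function name `traverse` are assumptions, `None` plays the role of Λ, and a Python list serves as the stack A. Note that the order the paper calls "postorder", following Knuth 1968, visits the root between the two subtrees, which is the order now usually called inorder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def traverse(t: Optional[Node]) -> List[str]:
    """Iterative traversal with an explicit stack, following steps T1 to T5."""
    buffer = []                  # "visit" accumulates node values here
    stack = []                   # T1: stack A is empty
    p = t                        # T1: P <- T
    while True:
        while p is not None:     # T2: is P empty?
            stack.append(p)      # T3: push P, then move to the left subtree
            p = p.left
        if not stack:            # T4: an empty stack means we are done
            return buffer
        p = stack.pop()          # T4: P <= A
        buffer.append(p.value)   # T5: visit NODE(P)
        p = p.right              # T5: continue with the right subtree


# A small tree: A has left child B (whose left child is D) and right child C.
tree = Node("A", Node("B", Node("D")), Node("C"))
visited = traverse(tree)
```

Even in a high level language the bookkeeping of the stack and the current pointer dominates the code, which is the point the surrounding text makes about iterative formulations.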
The above algorithm is reasonably close to a corresponding program in some high level language except that the stacking operations would be less clear in the program. It is representative of iterative algorithms with sequential updating of memory and transfer of control.

Since the treewalk is essentially a recursive procedure, we can describe the algorithm more naturally in a LISP-like functional language:
    postorder x = if null (left x) then () else postorder (left x)
                  || root x ||
                  if null (right x) then () else postorder (right x)

where '||' denotes an infix concatenation operator and 'left', 'root', and 'right' represent selectors of the three components of a node of the tree. The binary tree in this case can be constructed using either programmer-defined data types if such a facility exists (for example, SNOBOL4) or defining them by functional composition.
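For comparison, the recursive definition above carries over to Python almost word for word. The `Tree` type below is an assumption standing in for the programmer-defined data type the text mentions; tuples play the role of lists and `+` the role of the concatenation operator.

```python
from typing import NamedTuple, Optional, Tuple


class Tree(NamedTuple):
    """A node with the three components selected by left, root and right."""
    left: Optional["Tree"]
    root: str
    right: Optional["Tree"]


def postorder(x: Tree) -> Tuple[str, ...]:
    """Left subtree, then root, then right subtree, as in the definition above."""
    before = () if x.left is None else postorder(x.left)
    after = () if x.right is None else postorder(x.right)
    return before + (x.root,) + after


# The same shape of tree as before: A with left child B (left child D) and right child C.
tree = Tree(Tree(Tree(None, "D", None), "B", None), "A", Tree(None, "C", None))
order = postorder(tree)
```

No stack or pointer variable appears; the recursion carries the bookkeeping that the iterative version had to spell out.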
Recursive programming is supported by LISP and by the so called higher order languages (see next section).

Some further examples of recursive programming will be given in the next section.
Combinatory Programming
The idea of this type of programming is to manipulate and combine functions with the purpose of eliminating for the most part loops, conditionals and recursive calls (Burge 1972). By suppressing these "lower level" constructs, the programmer is freed from unnecessary detail and can exploit a powerful and concise style of programming.
In order to set the stage for examples of combinatory programming, consider the problem of adding up the elements of a list. The recursive algorithm is stated in English as follows:

To sum the elements of a list x, add the first element of x to the result of summing the remainder of x.

This is translated into a recursive program as follows:

    sum x = if null x then 0 else h x + sum (t x)

The boundary condition "if null x then 0" defines the identity element for addition and is always required for a recursive formulation. The above example is characteristic of recursive programming, but is "low level" in the sense that the recursive operation of the program involves data sequencing of the list.

The function just given is representative of a class of functions which can be defined using combinatory specification. Before explaining this technique, let us consider the slightly more complicated example of applying a given function f to each element of a list x. The recursive algorithm is:

To map a function f to each element of a list x, apply f to the first element of x and prefix this result to the result of mapping f to each element of the remainder of x.
The functional program is:

    map f = λx. if null x then () else f (h x) : map f (t x)

The interpretation is as follows: the application of map to f produces a new function which when applied to a list x produces the desired result. This new function might be called an "f mapper". That is, it encapsulates (binds) the characteristics of f into the new function. Syntactically, however, it is more convenient to write the map function as

    map f x = if null x then () else f (h x) : map f (t x)

even though map (and every other function) is always applied to a single argument.

Now, the two preceding dissimilar functions can be obtained as special cases of the following general list processing function:

    list a g f x = if null x then a else g (f (h x)) (list a g f (t x))

Functions of this type which produce other functions as special cases will be called generators.
~+' and
':' are
given
the
prefix
formulations
plus
x y = x + y
prefix
the p r e v i o u s
x y = x
functions
s u m = list map
: y
0 plus
f = list
can be d e f i n e d
i
() p r e f i x
f
in t e r m s
of
'list':
371
where
'i' represents the identity function
ix=x
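The generator 'list' and its two special cases can be sketched in Python. This is only an illustration under stated assumptions: Python lists stand in for the lists of the text, the functions are curried by hand, and the names list_gen, sum_list and map_list are hypothetical (chosen to avoid clashing with Python built-ins).

```python
def list_gen(a, g, f):
    """Sketch of 'list a g f': returns a function over lists."""
    def apply(x):
        if not x:                       # null x: boundary condition
            return a
        head, tail = x[0], x[1:]        # h x and t x
        return g(f(head))(apply(tail))  # g (f (h x)) (list a g f (t x))
    return apply

# Prefix (curried) formulations of the infix operators '+' and ':'.
plus = lambda x: lambda y: x + y
prefix = lambda x: lambda y: [x] + y
identity = lambda x: x                  # the function 'i'

sum_list = list_gen(0, plus, identity)  # sum = list 0 plus i

def map_list(f):                        # map f = list () prefix f
    return list_gen([], prefix, f)

print(sum_list([1, 2, 3]))
print(map_list(lambda n: n * n)([1, 2, 3]))
```

The only difference between the two derived functions is the triple of arguments handed to the generator, which is the point the text makes about combinatory specification.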
The standard set operations have been defined by Burge (Burge 1968) using combinatory functions. We will give them here, since they demonstrate the flavor of combinatory programming. In what follows, sets will be represented by lists with no duplicate elements.
exists p = list false or p
where
'or' is the logical function
or x y = if x then true else y
The 'exists' function applies the predicate p to each element of a list and returns true if at least one of the resulting values is true, and false otherwise.
    filter p = list () i (λx. if p x then prefix x else i)

The 'filter' function returns a subset of the argument set selected by the predicate p. Mathematically, the result is

    {x ∈ S | p(x)}

where S is the argument set.

    belongs l x = exists (equal x) l

where

    equal x y = x = y

The 'belongs' function is a predicate which is true if x is an element of l, and false otherwise.

    intsn = filter ∘ belongs

where ∘ is an infix representation of the prefix formulation:

    b f g x = f (g x)

Thus,

    f ∘ g = b f g

The 'intsn' function defines set intersection.

    diff x y = filter (not ∘ (belongs y)) x

where 'not' is the logical function

    not x = if x then false else true

The 'diff' function defines set difference.
    union x y = concat (diff x y) y

where 'concat' is defined by

    concat x y = list y prefix i x

The 'union' function defines set union.

The type of combinatory functions described here can be programmed in any of the "higher order" languages (Reynolds 1972) inspired by Landin (Landin 1966), and are supported by languages such as PAL (Evans 1968), McG (Burge 1968), GEDANKEN (Reynolds 1970) and QUEST (Fenner et al 1972).
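As an illustration of the set operations above, the following Python sketch derives each of them from a single generator. It is an assumption-laden rendering, not the notation of any of the cited languages: sets are Python lists without duplicates, the generator is repeated so that the block is self-contained, and names such as list_gen, filter_ and or_ are hypothetical.

```python
def list_gen(a, g, f):
    """Sketch of the generator 'list a g f'."""
    def apply(x):
        if not x:
            return a
        return g(f(x[0]))(apply(x[1:]))
    return apply

prefix = lambda x: lambda y: [x] + y
identity = lambda x: x
or_ = lambda x: lambda y: True if x else y

def exists(p):                       # exists p = list false or p
    return list_gen(False, or_, p)

def filter_(p):                      # filter p = list () i (λx. ...)
    return list_gen([], identity,
                    lambda x: prefix(x) if p(x) else identity)

def belongs(l):                      # belongs l x = exists (equal x) l
    return lambda x: exists(lambda y: x == y)(l)

def intsn(x):                        # intsn = filter ∘ belongs
    return filter_(belongs(x))

def diff(x, y):                      # keep the elements of x not in y
    return filter_(lambda e: not belongs(y)(e))(x)

def concat(x, y):                    # concat x y = list y prefix i x
    return list_gen(y, prefix, identity)(x)

def union(x, y):                     # union x y = concat (diff x y) y
    return concat(diff(x, y), y)

print(intsn([1, 2, 3])([2, 3, 4]))
print(union([1, 2, 3], [2, 3, 4]))
```

Note that Python evaluates the recursive call eagerly, so 'exists' here always traverses the whole list; the result is the same as in the text.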
Nondeterministic Programming

The programming of a wide class of combinatorial problems is made easier by using certain operators introduced by Floyd (Floyd 1967). These consist of:

(1) a multiple valued choice function called choice(n) whose values are the integers from 1 to n
(2) a success function, and
(3) a failure function

The choice function allows a program to be conceptually executed in parallel with each path using one of the values of the argument. The success and failure functions label termination points of the computation. However, only those termination points labelled as success are considered to be computations of the algorithm.
Since context-free languages are recognized by nondeterministic pushdown automata, we will use these nondeterministic primitives in specifying a context-free parser. We will modify the choice function slightly and allow the argument to be a list instead of an integer. Then each path will use one of the elements of the list.
The parsing algorithm which will be programmed uses the top down prediction method to parse strings generated by any context-free grammar without left recursive rules. There is initially an input string and a prediction string which consists of the distinguished symbol S of the grammar. The leftmost symbol of the prediction string is tested for the following cases:

(1) If a Terminal symbol, it is compared with the input symbol under scan. If there is a match, both symbols are deleted, otherwise the failure function is invoked.
(2) If a Non-terminal symbol, it is replaced (using the choice function) by all the right hand rules defining it.
(3) If a Rule number, it is deleted from the prediction and added to the buffer.

A simple program to realize this algorithm will now be shown.
    parse (input,pred,bufr) =
      if and (null pred,null input) then (print bufr; success) else
      if or (null pred,null input) then failure else
      if ruleno (h pred) then parse (input,t pred,h pred:bufr) else
      if term (h pred) then
        if h input = (h pred) then parse (t input,t pred,bufr)
        else failure
      else let x = choice (gmap (h pred));
           parse (input,x || (t pred),bufr)

If both the prediction string and input string are empty, the buffer is printed (side effect) and the parse is successful. After this test, if either the prediction string or input string is empty, the parse fails. If the top of the prediction is a rule number, it is added to the buffer. When the top of the prediction is a nonterminal, the choice function is called with a list of alternatives (right hand rules) as argument. 'gmap' is a function (relation) that maps a left part (nonterminal) of a grammar to a list of right hand rules. If the computation terminates, the buffer contains in reverse order the rules that were applied during the parse.
The above program, when defined in the environment of the grammar

    gmap = {<S, ([1aAS], [2a])>, <A, ([3SbA], [4ba])>}

representing the context-free grammar

    S -> aAS | a
    A -> SbA | ba

expressed as a binary relation 'gmap', where the range values are given as lists of character strings (nonterminals in upper case, terminals in lower case, and rule numbers denoted by integers), and applied to the arguments input = [aabbaa], pred = S:() and bufr = (), produces the string 13242.
Nondeterministic functions such as those described here have been added to FORTRAN (Cohen and Carton 1974). The approach followed is to transform programs written in the extended FORTRAN into standard (deterministic) FORTRAN following Floyd's work. Similar techniques can be applied to other high level languages.
Elimination of Low Level Detail
The programming techniques already introduced (recursive, combinatory, nondeterministic) do much to eliminate inessential detail in the programming process. Now we will briefly outline those low level features that can be eliminated and their nonprocedural equivalents in roughly the following form:

    low level feature => nonprocedural substitute

Some of the nonprocedural techniques have already been discussed. The others will appear in subsequent sections.
Explicit referencing and search => associative referencing

We would like to eliminate explicit access paths and referencing dependent on array subscripts, pointers and explicit searching.

Loops => associative referencing, aggregate operators, and combinatory programming

Elimination of loops raises the level of programming because it decreases the number of decisions the programmer has to make. We also include in this category most iterative and recursive constructions.

Explicit sequencing => recursive and combinatory programming

Explicit sequencing is intimately connected with procedural programming, side effects and the updating of memory. The presence of side effects greatly increases the opaqueness of programs and the difficulty of verification.

Explicit control and pattern matching => nondeterministic programming and pattern matching

Pattern matching and nondeterministic control are treated together since they are related in many ways. The suppression of control flow is a step in the direction of nonprocedurality and serves to hide details which are not relevant to the problem solution.
Associative Referencing
We use the term associative referencing to refer to the accessing of data based on some intrinsic property of the data. This method of referencing allows the programmer to suppress implementation oriented details so that the decision of how to represent objects in the machine is left to the compiler.

The relation 'gmap' in the previous example represented a mapping from nonterminals to right hand rules in a grammar which did not commit the compiler to any particular representation or access paths. Earley (Earley 1974) has described higher level data structures (tuples, sequences, sets, relations) and operations on these structures which provide this type of freedom from access path dependence.

Associative referencing is usually described syntactically using the standard set notation:

    {x ∈ S | p(x)}

That is, all the members of set S satisfying the property p(x). Underlying this syntax, however, is the application of a function such as 'filter' previously defined:

    filter p S
Aggregate Operators

The set operators previously defined by combinatory functions are examples of aggregate operators. We will briefly discuss four types of aggregate operators which perform the following kinds of mappings:

(1) aggregate -> scalar
(2) aggregate -> aggregate
(3) aggregate X aggregate -> scalar
(4) aggregate X aggregate -> aggregate

An example of the first type is the reduction operator of APL which is exemplified by the 'sum' function defined earlier. The 'map' function is an example of type (2). We will now define a function analogous to the 'list' function but which operates on two lists of equal length. It will then accommodate the two remaining types as special cases.

    lists a g f x y = if null x then a else g (f (h x) (h y)) (lists a g f (t x) (t y))

An example of the third type is an inner product function defined as follows:

    inner = lists 0 plus mult

where 'mult' is given by the prefix formulation

    mult x y = x * y

Finally, a distribution operator which applies the same function on pairwise elements of two lists to produce a result list, a la APL, can be defined

    dist f = lists () prefix f

It is well known that APL has excellent facilities for aggregate operations. However, the present approach is more powerful because any function can be the argument of a generator whereas the arguments allowed by APL are restricted to the built-in functions.
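The two-list generator and its special cases can be illustrated with a short Python sketch. As before this is only a hedged rendering: Python lists of equal length are assumed, the functions are curried by hand, and the name lists_gen is hypothetical.

```python
def lists_gen(a, g, f):
    """Sketch of 'lists a g f': a generator over two equal-length lists."""
    def apply(x, y):
        if not x:                                  # null x
            return a
        # g (f (h x) (h y)) (lists a g f (t x) (t y))
        return g(f(x[0])(y[0]))(apply(x[1:], y[1:]))
    return apply

plus = lambda x: lambda y: x + y
mult = lambda x: lambda y: x * y
prefix = lambda x: lambda y: [x] + y

inner = lists_gen(0, plus, mult)       # type (3): aggregate X aggregate -> scalar

def dist(f):                           # type (4): aggregate X aggregate -> aggregate
    return lists_gen([], prefix, f)

print(inner([1, 2, 3], [4, 5, 6]))
print(dist(plus)([1, 2], [10, 20]))
```

Because f is an ordinary argument, any user-defined function can be distributed over the two lists, which is the advantage over built-in operators claimed in the text.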
Pattern Matching

The string pattern matching facilities provided by SNOBOL4 (Griswold et al 1968) are representative of the type of operations we want in order to suppress low level detail. However, we would like pattern matching to be applicable to arbitrary data structures, not just strings.

The following highly recursive SNOBOL program which uses patterns and unevaluated expressions to recognize strings generated by the context-free grammar previously introduced, demonstrates the power of a generalized pattern matcher.
&ANCHOR A = *S S =
= I ~B ~ *A
'A' A *S
INPUT
I 'BA ~ I *S *S i ~A~
S RPOS(0)
END
In the above program, the starting symbol of the grammar is represented by the pattern variable S, the infix operator '|' represents the operation of alternation, and the unary operator '*' postpones evaluation of its operand. The first statement specifies "anchored mode", which means that the pattern must match the input string starting at the first character. Finally, the function 'RPOS(0)' succeeds only if the entire input string has been scanned.

Unfortunately, the programmer can't use this mechanism to produce a parse of the given string, because there is no way to distinguish between successes and failures of alternative paths. In addition to the information that the pattern matched or did not match the input string, it would be useful if SNOBOL would produce a derivation tree as the value of the match, which would indicate exactly which rules were used in the match.

An approach that incorporates a SNOBOL-like pattern matching facility into a higher order programming language called QUEST has been described by Tennent (Tennent 1973).
This approach allows the type of translation discussed above and hence is more powerful than SNOBOL.
Since space precludes an adequate discussion of pattern matching techniques, a discussion of their application in various artificial intelligence languages can be found in (Bobrow and Raphael 1973).
Summary
We have discussed three styles of programming subsumed by the notion of nonprocedural programming. These have been advocated as techniques applicable to raising the level of algorithm description. We have also discussed the elimination of certain low level features and detail conventionally used in programming, and their replacement by these and other techniques such as associative referencing, aggregate operators and pattern matching. Finally, we have suggested some programming languages and extensions which support this type of programming.
REFERENCES

D.G. Bobrow and B. Raphael, "New Programming Languages for AI Research", Tutorial presented at 3rd IJCAI, Stanford, California, August, 1974.

W.H. Burge, "McG - A Functional Programming System", Report RC 2189, IBM Research Division, Yorktown Heights, N.Y., August 1968.

W.H. Burge, "Combinatory Programming and Combinatorial Analysis", IBM Journal of Research and Development, Vol. 16, No. 5 (Sept. 1972).

J. Cohen and E. Carton, "Non-deterministic FORTRAN", Computer Journal, Vol. 17, No. 1 (May 1974).

J. Earley, "Relational Level Data Structures for Programming Languages", Acta Informatica Vol. 2 Fasc. 4, 1973.

A. Evans, "PAL - A Language designed for teaching Programming Linguistics", Proceedings ACM 23rd National Conference, 1968.

T.I. Fenner, M.A. Jenkins and R.D. Tennent, "QUEST : The Design of a Very High Level Pedagogic Programming Language", SIGPLAN Notices, Vol. 8, No. 2 (Feb. 1973).

R.W. Floyd, "Nondeterministic Algorithms", JACM Vol. 14 (Oct. 1967).

R.E. Griswold, J.F. Poage and I.P. Polonsky, The SNOBOL4 Programming Language, Prentice-Hall, Englewood Cliffs, New Jersey, 1968.

D.E. Knuth, "Fundamental Algorithms", The Art of Computer Programming, Vol. I, Addison-Wesley, Reading, Mass., 1968.

P.J. Landin, "The Next 700 Programming Languages", CACM Vol. 9, No. 3 (March 1966).

B.M. Leavenworth and J.E. Sammet, "An Overview of Nonprocedural Languages", Proceedings Symposium on Very High Level Languages, SIGPLAN Notices Vol. 9, No. 3 (April 1974).

J.C. Reynolds, "GEDANKEN : A Simple Typeless Language Based on the Principle of Completeness and the Reference Concept", CACM Vol. 13, No. 5 (May 1970).

J.C. Reynolds, "Definitional Interpreters for Higher-Order Programming Languages", Proceedings 27th ACM National Conference, 1972.

R.D. Tennent, "Mathematical Semantics and Design of Programming Languages", PhD. Thesis, University of Toronto, 1973.
Formal Definition in Program Development
C. B. Jones, IBM Laboratory Vienna.

ABSTRACT

The intent of the current paper is to show how a large problem like compiler development can be divided in a way which provides a structure for arguments of correctness. Although mechanically checked proofs are not envisaged, the use of formal notation is recommended so that the basis for correctness arguments exists.

The paper reviews three topics: the first two are relevant particularly to the development of compilers; the third is more general.

The subject of the first section is the style of language definition to be used as a basis for development. Beginning with a small language, possible ways of describing added features are discussed. The selection criterion for definition techniques is their usability in developing a specification of the compiling process: it is this development which is the subject of the second section.

The third section briefly reviews the process of "Formal Development" which has been described more fully elsewhere.

© IBM Österreich 1974
0. INTRODUCTION

This paper provides an overview of a number of pieces of work related to program correctness. Since the possibility of having completely mechanically checked proofs for large programs appears to be some way off, the approach taken is one of documenting a "justification" for human readers. The paper will attempt to show how a large problem can be decomposed into small enough steps that such justifications can be convincing.

The particular problem to be considered is that of developing a compiler. Of course, a compiler is a very special program, but to oversimplify for a moment one can consider that it is special only in requiring two extra stages of development to precede that which is applicable to any program.

Three major parts of the compiler development problem are discussed in sections 1 to 3 of this paper.

The first section discusses the definition of the language to be compiled. This definition of the semantics of the source language provides the overall correctness criteria for the compiler: whatever results can be deduced about a program written in the source language must also be true when running the object code produced by compiling that program. The discussion identifies important properties of a language definition to be used in the next stage.

Given a particular object machine, the next step is to develop a mapping from the source to the object language. How this is done is the subject of the second section of the paper. Examples are given of mapping the abstract state objects of the definition onto the store of a target machine (cf. refs. [9, 15]).

Any top-down development process must begin with an overall correctness criterion: that is, a specification of an input/output relation. The purpose of section 2 was to show how exactly this can be provided for the compiling problem. The process of developing such a specification by a step-wise process to a running program is discussed in section 3. It is argued that the use of data abstraction and appropriate choices of implicitly defined functions can provide the structure for justifications of large programs.

Since no specific length limit was given to the author, it must be confessed that his lack of time is the reason that all three sections are not written up fully. The ideas behind section 3 are well enough documented in other papers that an overview should suffice; the example provided is very small. This runs into the usual problem that in such cases it is easy to "see" the correctness, whereas it is precisely the inability of our "small head" to contain a large problem that gives rise to the need for a justification. Although the general direction to be followed in the work covered by section 2 is clear, only section 1 approaches the level of completeness the author would have liked to attain.
The process of "Formal Development" outlined in section 3 is applicable to any programming problem. As was observed above, it is an oversimplification to think that it is therefore sufficient to show how to tackle any other computing task. For a compiler, generating the input/output relation was a significant problem; in precisely the same way, generating the input/output relation for other tasks will be difficult. It is abnormal for initial specifications to be couched in terms of such relations, and, other than arguing that its production should be the first step, the current paper offers no help as to how it can be obtained.
The emphasis throughout is on the method of decomposing a large problem into small enough steps to provide a justification. Certain common requirements result from this, one of which is the necessity to use a formal notation: only then will it be possible to document justifications. It is not, however, the intention to argue for one particular notation. A further technique, which becomes almost a necessity if proofs are to be of an acceptable length, is the use of abstraction. Dijkstra has used the terms "Operational Abstraction" to cover implicitly defined operations and "Representational Abstraction" to refer to the postponement of unnecessary data properties. Both of these techniques will be used, but again no particular notation is recommended.
It is the intention in the current paper to concentrate on techniques rather than deep results. Transcending the choice of a particular mathematical discipline to underlie the work are the practical steps which must appear in any justification: the attempt is to throw some light on these.

1. LANGUAGE DEFINITION STYLE

This part of the paper suggests certain properties of a language definition which will facilitate its subsequent use in development of a translator design. Notice that a definition constructed along the proposed lines will not necessarily suit other purposes like proving programs correct in the defined language.
Although
easy
given
language
a
formulation
to express
with
(see summary)
feature
wanting
the required properties.
plan adopted below is %c c o n s i d e r and possible
way
it
language
formulations for their definitions.
find
a
features
~n attempt has
language
concepts.
In
is hoped that the reader can see the requirement
for the different complete
to
For this reason the
separate
been made tc d e l i b e r a t e l y separate the this
it is not always easy,
definition,
formulations
definition
in
of a language
the source of the complexity.
If
isolation,
whereas
in
a
it is often difficult to see the
current
approach
were
executed on all of the features cf the respectiwe languages one would then be able "cross
products"
to explain ref. of
their
[I]
and
respective
although the notation of the latter is
a
ref.
[4]
as
formulations. step
forward,
the For, both
d e f i n i t i o n s possess the properties discQssed. Rather
than
discussing
finding the a p p r o p r i a t e
notation, model
for
the a
emphasis
language
below is on
feature.
The
distinction
between
and
similarities
of
the
so-callsd
" o p e r a t i o n a l " and " m a t h e m a t i c a l " approaches are c o n s i d e r e d it
is
argued
that
which transcend
Most
bat
there are criteria for c h o o s i n g ~he model
the distinction.
of the so!uticns discussed are behind a number of current
language definitions.
Only the solution of the S 2 ~
problem is
novel.
In order to provide an overall context for the decisions made below it is worth pointing out the origin of the difficulties which led to a reconsideration of some aspects of the "VDL" (Vienna Definition Language) notation. During 1969/70 the author had the pleasure of co-operating with P. Lucas and his colleagues on attempts to document compiling algorithms based on the then current formal definition of PL/I (ref. [24]). Although the feasibility of correctness arguments was shown in refs. [14, 7, 9], a number of difficulties were encountered. With the exception of the control mechanism (see 1.4 below) these were certainly not caused by any shortcomings in the notation itself. The common origin of most problems was, in fact, the tendency to be "over specific" in its use. By this is meant that the use of the abstract interpreter to define the language sometimes indicated results not required by the language. Although it was certainly possible to deduce that such results had no effect on the final outcome, this proof frequently went far beyond the part of the language under consideration.

A simple example was the use in some VDL models of a never recurring "unique name generator" to obtain new locations when required. This certainly gave a sufficient model which gave the required outcome. However, if one wanted to prove correct a stack implementation, in which the locations of previously closed blocks could be re-used, one was more interested in necessary conditions of uniqueness. Now it was, of course, possible to prove that, since new locations were initialised to an undefined state, it is permissible to re-use discarded locations. However, one was paying with a very expensive proof for the saving of relatively few lines of definition.

The basic maxim to be followed will then be to avoid giving properties to the definition which are not required by the language.

Before coming to the language features proper, a brief section on notation is offered.

1.1 Notation

The
formulae
which
follow
their
use cf notation
also
made of simple
and c o n d i t i o n a l sketchi,g complete
~eanings discussion
concept
used
to a d v a n t a g e
of
a!so
the
importance
Given
syntax
Non-terminals
confines
introduced
a semantic
this
its
use
definition
alternatives;
paper
will
of but
show
its
deve!epment.
notatio~
now
used
tc shorten
as names
has been
and
changed
clarify
of
unit
sets
somewhat
descriptions. as follows
-
-
{LBc}
as
names
for defining W = XIZ
~or a more
is
of
the sets defined
in the r e s p e c t i v e
rule. Rules
to
in ref. [ 17], was
Not only
it can be read as set e q u a t i o n s
-
is
~!~"
itself
items.
of its syntactic of
Use
"iZ ~ h ~
[~].
to divorce
parts
in crdeI
like
non-standard
Syntaz,
mandatory
Objects
~_~_c_
section
1 of ref.
as regards
operations.
in the VDL definitions.
translator
[24]
the
the richness
a grammar
Elementary
for
logical constructs
This
see ~art
subsequent in
The abstract from ref.
for set and
Abstract
considered
a language f r o m
offer no d i f f i c u l t y
programming
statements.
The
still
should
alternatives ~
~=
of non-terminals
X u 2
-
~ules
for i n t r o d u c i n g X ::
YZ
Furthermore,
-
use
denotes
of
a
name w i t h o u t
for
its
Other
than
decompose
o~
= z
name of
which the
ends
class
in
"*"
defined
("-set") by
the
of a set -
deccmpositio~
by
made
selection
the c o n s t r u c t o r
it
is
possible
on the left of
to
a definition-
= x -_le__f y = s-n1(x) let
cases
s-n2(mk-X(y,z))
is-X(o)
by u s i n g
is a l s o
= y
suffix.
le_f m k - X ( y , z )
Use
-
s-nl (mk-X (y,z))
of o b j e c t s
membership
is-W (o)
[ y~Y ^ z~Z}
c a n be i n t r o d u c e d
s-n2:Z
(set)
-
{mk-X(y,z)
nonterwinal
a list
To t e s t
X =
selectors
X :: s - n l : Y
The
constructors
z = s-n2(x)
of this b i n d i n g
of
n a m e s in a c a s e s
construct-
w:
mk-X(y,z)
- > f(y,z)
is-X{w)
-
->
(!e_~t m k - X ( y , z )
= w:
f (y,z)) is...
mk...
At
the
the
manipulation
points
w h e r e i t is n e c e s s a r y of
functions,
use
to d i s c u s s is
made
more carefully of
the
Lamhda
notationf(x)
= ...x...
~owever,
these
(see ref.
[13])
let
x = e:
-
uses
will
o f t e n be
"sugared"
) )
g(x)
f = Xx . . . . x...
)
-
( x x . g ( x ) ) (e)
in L a n d i n ' s
style
Raps
are
used where %he graph of a function can be explicitly
co,structed -
[dl -> r~ J
explicit definition
[d->
implicit definition
r ] p(d,r)]
+
joining
are the c o ~ n t e r r a r t s of the set concepts.
To
come
now
to
the
problem
of
Semantic
le_~n_it_~o_~n°
d e f i n i t i o n s given beloN will be written in terms from
stated
se@~ntics VDL
domains
to
it is intended
style
models,
ranges.
[16],
in
c o m p o n e n t exists in the i n t e r p r e t i n g discussed
more
construct).
The
full~
below
re la ticn
mathematical
sema,tics
be reviewed
in c c D n e c t i o n
considered.
To begin
in
between
the d i s t i n c t i o n
from
the
which an ezplicit control machine connection functional
(this with
point the
semantics
is
set_e and
ref. [22] is of more i n t e r e s t
and will
with several of the l a n g u a g e
features
with i% is worth showing the definition of
a language which is itself f u n c t i o n a l and thus affords path tc functional
functions
In using the term f_u_n_ct_~onal
to e m p h a s i s e
ref.
of
The
an
easy
semantics.
1.2
Consider the language given by the following abstract syntax rules -

    B1  expr = inf-expr | var-ref | const
    B2  inf-expr :: expr op expr
    B3  var-ref :: id
    B4  const :: INTG

The class op is not further specified other than by the existence of a function -

    B5  apply-op: INTG op INTG -> INTG

Now, for a given set of denotations for the free identifiers (the term "environment" is avoided because it will be used below) -

    B6  DEN: id -> INTG

the denotation of an expression, which is also an integer, is given by

    B7  eval-expr(e,den) =
          cases e:
            mk-inf-expr(e1,op,e2) -> (let v1 = eval-expr(e1,den);
                                      let v2 = eval-expr(e2,den);
                                      result is (apply-op(v1,op,v2)))
            mk-var-ref(id) -> den(id)
            mk-const(n) -> n
        type: expr DEN -> INTG

This definition has the property that the denotation of a composite expression depends only on (and can therefore be constructed from) the denotations of its component expressions.

The fact that, in contrast to mathematics, one is forced to consider the value of a variable at a given point in time is perhaps the most distinctive feature of a programming language. The introduction of the concept of a dynamic assignment to a variable poses problems for the definition of semantics.
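The definition of eval-expr can be transcribed almost directly into an executable form. The following Python sketch is offered only as an illustration: the class names, the dictionary used for DEN and the particular operators in OPS are assumptions, since the paper leaves the class op unspecified beyond the existence of apply-op.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Union

@dataclass(frozen=True)
class Const:                 # B4  const :: INTG
    n: int

@dataclass(frozen=True)
class VarRef:                # B3  var-ref :: id
    id: str

@dataclass(frozen=True)
class InfExpr:               # B2  inf-expr :: expr op expr
    e1: "Expr"
    op: str
    e2: "Expr"

Expr = Union[Const, VarRef, InfExpr]
Den = Dict[str, int]         # B6  DEN: id -> INTG

# Assumed operator set; the paper only requires that apply-op exists.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def apply_op(v1: int, op: str, v2: int) -> int:
    return OPS[op](v1, v2)

def eval_expr(e: Expr, den: Den) -> int:
    """B7: the denotation of e is built only from those of its components."""
    if isinstance(e, InfExpr):
        v1 = eval_expr(e.e1, den)
        v2 = eval_expr(e.e2, den)
        return apply_op(v1, e.op, v2)
    if isinstance(e, VarRef):
        return den[e.id]
    if isinstance(e, Const):
        return e.n
    raise TypeError("not an expression")

example = InfExpr(VarRef("x"), "+", InfExpr(Const(2), "*", VarRef("y")))
print(eval_expr(example, {"x": 1, "y": 3}))
```

The function never inspects anything except the component expressions and the fixed den, which is the compositional property the definition is chosen to exhibit.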
t.3
&~ais~a!_~a~£~
Consider the language whose programs consist of a sequence of assignment statements (as-st) which can be described -

    C1  as-st*
    C2  as-st :: s-lhs:id s-rhs:expr

The effect of such a sequence of statements will be to transform some initial set of denotations for the variables, step by step, into their final denotations. Thus it is no longer sufficient to consider the DEN as an argument to the interpretation: the DEN required as the argument to the second (and subsequent) calls of the interpretation may have been changed by the interpretation of the first assignment. The function given below appears to ignore this problem by simply omitting all mention of the DEN. This is done because the intention is to offer a number of different explanations. It should be possible to see the intent of what is written if one accepts that assign "changes" the DEN for the given id, and assumes eval-expr uses the current DEN -

    C3  int-st-l(st-l,i) =
          if i ≤ l st-l
          then (let mk-as-st(lhs,rhs) = st-l[i];
                let v: eval-expr(rhs);
                assign(lhs,v);
                int-st-l(st-l,i+1))
          else I
        type: as-st* INTG =>

It is possible to consider three ways of reading such formulae. The first possibility is to read them as programs.
As
such,
each functicn
refers to one non-local course,
would ccrrespond variable
the same non-local
(i.e.
variable
eval-expr and all calls of int-st-l. trivially
defined
tc
modify
has its usual ordering are
written
in both cases to distinguish c~iven
this view and
the
DEN).
referenced
It
is,
of
by the modified
The sub-program
this variable.
implications.
with "=>", and
to a subrouti,e which
assign
is
The separator
"~"
Subroutine
type
clauses
their calls are marked with a ":", them from pure functions.
using the notation of ref. [ 12], the types
could be given in full by int-st-I
:: DEN:as-st~ INTG ->
eval-expr
:: DEN:expr
assign
:: DEN:id INTG ->
With
this simple, constructive,
discuss
one
of
definition.
the
dynamica~!y
functions. which te~t, there make
has
with
is no incentive
sequencing
language State" to
which
to
a
"Grand
State"
now
tc
all
part
of
the
state
this approach
is that it
the
discussion
The second
ref. [I ], would
in
as-st • INTG DEN -> DEN
eval-expr:
expr DEN -> INTG DEN
assign:
id INTG DEN -> DEN
not
The be
counter
eval-expr.
of alternative
views of the
interpretation
in each case a DEN. This,
int-st-l:
show
variable.
that the statement
possible
give-
and
would
functions as taking an extra argument
an extra result:
style
of program
Although in %his trivial language
further inspection,
int-st-l.
change
to do so, it would have been possible to counter
of taking
without
Returning regard
a
"Small
the possible exception
could net be affected h~, for erample,
function
of
by a side effect on this new non-local
disadvantage clear,
term
put into the state used by the defining
variables,
statement
the
in which only those things
are
are put into the state. the
properties
used
This is in contrast
all
view it is already possible to
desirable
McCarthy
describe a defi,itie, very
-> INTG
is
to
and yielding
which is the view of
398
(I,
fact
examplet
it
is
often
possible
since it can cause
nc
to
simplify;
changes,
in the above
eval-expr
need
not
return a DEN.)
But it is no longer possible to rely on the programming view of ";".
It is necessary to describe it as a combinator between functions.
The task of doing this is complicated by the various alternative
contexts and it is easier to show the result which would come from
using the combinator -
int-st-l(st-l,i,den) =
   if i ≤ l st-l
   then (let mk-as-st(lhs,rhs) = st-l[i];
         let (v,den1) = eval-expr(rhs,den);
         let den2 = assign(lhs,v,den1);
         result is (int-st-l(st-l,i+1,den2)))
   else den
type: as-st* INTG DEN -> DEN

Since the "lets" are now on pure functions, they are simply a sugared
form of lambda expressions.

The third view one could take of the function int-st-l is that of
ref. [22]. The comment was made on the functional language definition
that the denotation of its sub-components was all that was necessary
to determine the denotation of a unit. By regarding the denotation of
an assignment statement as a function it is again possible to enjoy
this property. (That this is so is more clearly shown if the abstract
syntax of a composite statement is given recursively.) The resulting
types would be -

int-st-l:  as-st* INTG -> (DEN -> DEN)
eval-expr: expr -> (DEN -> INTG DEN)
assign:    id -> (INTG -> (DEN -> DEN))
Again in this view it is necessary to define ";" as a combinator.
But now the fact that the units to be combined (after applying the
functions to the static components) are basically functions of the
type DEN -> DEN means that the very pleasing model of functional
composition is adequate.
The Oxford group (refs. [22, 23, 19]) have gone rather far in
designing combinators which would permit formulations like -

int-st-l(st-l,i) =
   λden.(if i ≤ l st-l
         then (let mk-as-st(lhs,rhs) = st-l[i];
               int-st-l(st-l,i+1) o COMB(eval-expr(rhs),assign(lhs)))
         else I)
type: as-st* INTG -> (DEN -> DEN)

where -

COMB(vf,uf) = λden.(λv,den1.(uf(v)(den1))(vf(den)))
(It should be made clear that if this had been written by a genuine
devotee of mathematical semantics, it would have looked very
different. It is the current author's view that excessive zeal in
shortening definitions makes them less rather than more readable,
cf. ref. [19], ref. [1]).

Since this is a function one can look at its result for a test
program! Consider -

p = (x := 1; y := x + 2)

Then after reduction -

int-st-l(p,1) = (λden.den + [y -> den(x)+2]) o (λden.den + [x -> 1])

which is the result expected. (This exercise is somewhat more
illuminating on larger examples.)
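The functional reading can be tried out mechanically. The following is a minimal Python sketch, not taken from the paper: DEN is assumed to be a dict from identifiers to integers, expressions are assumed to be integer literals, identifiers or ("+", e1, e2) tuples, and the names eval_expr, assign, comb and int_st_l merely mirror eval-expr, assign, COMB and int-st-l.

```python
from functools import reduce

# Assumption: DEN is a dict from identifiers to integers.

def eval_expr(rhs):
    """expr -> (DEN -> INTG DEN); rhs is an int, a name, or ('+', e1, e2)."""
    def run(den):
        if isinstance(rhs, int):
            return rhs, den
        if isinstance(rhs, str):
            return den[rhs], den
        _, e1, e2 = rhs
        v1, den1 = eval_expr(e1)(den)
        v2, den2 = eval_expr(e2)(den1)
        return v1 + v2, den2
    return run

def assign(lhs):
    """id -> (INTG -> (DEN -> DEN)); builds a new DEN, never mutates."""
    return lambda v: lambda den: {**den, lhs: v}

def comb(vf, uf):
    """COMB(vf,uf): run vf, then hand its value and DEN to uf."""
    def run(den):
        v, den1 = vf(den)
        return uf(v)(den1)
    return run

def int_st_l(st_l):
    """as-st* -> (DEN -> DEN): compose statement denotations left to right."""
    steps = [comb(eval_expr(rhs), assign(lhs)) for lhs, rhs in st_l]
    return lambda den: reduce(lambda d, step: step(d), steps, den)

p = [("x", 1), ("y", ("+", "x", 2))]   # x := 1; y := x + 2
print(int_st_l(p)({}))
```

The denotation of the whole list is obtained without ever inspecting a state, which is the property the functional view is after.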
(Reverting for a moment to the earlier discussion of grand versus
small state approaches, it is worth noting that it would have been
possible to make the (undesirable) step of putting the statement
counter in the state and still give a definition in terms of a
function to functions from states to states. It would not, however,
have been possible to provide such simple combinators. In particular,
the static role of the statement counter is required to provide the
required decomposition of the semantics.)
If the a p p r o p r i a t e c o m b i n a t o r d e f i n i t i o n s bow provided three questio~
The
ways of reading
were written,
the f o r m u l a
we have
int-st-l.
positicD
advanced
in the next part of this paper is that
the i n t e r p r e t i v e view is useful during the development translator
mapping
thinkiDg
in
problems be resolved
~rograms
terms
~a~ipulated
definitions
notation
meta-~anguage.
written
in
reasoning
this
style
will
be
the m a t h e m a t i c a l view
or any
is
it
so far it would be easy
variant of the "fRr". the question
reasonable
to
add
In
doing
arises to
as the
Would it, for instance, be acceptable to have a
while construct? of
that certain
Thus the notation will develop
this for any reasonably complex language much
however,
this leads to retention of
the amount of notation introduced
how
the
to when d e c i s i o n s are otherwise unclear.
to define ~'if then else"
to
Observe,
of c e m b i n a t o r s ~ e ~ _ ~ d
as if they were oFerational;
say he a~pealed
With
to functions.
moze carefully:
also this view of the formulae. the style above~
of
and it is cn!y then that one need take the
view of mapping source %hat
The
of which should be used must now be discussed.
lhe answer
about
must a l w a y s be sought in
a construct.
the
ease
Thus in ref. [~] both while and for constructs have been included,
but they are of a much more restricted kind than the FOR of PL/I
which was being defined.
This
topic
definition
1.q
The
brings
in its banishment
standards
valuable
to
situations them
committees. have
and
some
an
problem
hop,
its ability
attempt
for
of
it
has not
languages
appears to he
abnormal
to provide formal
defining ~ 9
its
is
has
the
terminated
sequencing
definitions
for
as
both
technigue
this
the
stack
so
be
order.)
that
below
of the interFreting stack:
in of
if
with
a
can be shorter
The subject of
the
connection
state with
is itself
operation
obeyed and the next step of operation
of
the but
arbitrary
the recta-language
LAMBDA in ref.
removes the top
this
exits,
was more general,
describing
(called
function
block
in ref. [ 2,] was to
into
a VDL definition
function
compound
in the ne,t section.
the control
Instead
as functions,
interpreting
the
whose semantics
discussed.
component
[In fact
upsets the
same phrase structure
for modelling ~oto employed
point is discussed
control
can
is discussed
~acbine.
evaluation
section
this is simpler than
that
problems
introduce a control
~irectly
In
the phrase structure
Although
property
phrase structures
of a unit solely in terms of the
sub-units.
chosen
block terminations
abstract
is that, other than the local
and initiated abnormally
definition
the
serious,
tc leave or enter
should be pEovided.
an
be
mechanism
ability
with
of
statement
this
from descriptions
To
to state the semantics
semantics
The
a semantic
may be one of the tools for comparison.
The
it
the challenge of giving
debate on the morality of the Horn construct
yet resulted by
to
Lan~ua~
Goto
long
us
of ~o~o.
described [24]).
by
A step
instruction
from
is elementary,
it is
is performed;
in the case of a "macro" operation, the appropriate operations are
put on the control stack so that the next step will encounter them.

So far this can be thought of as a way of describing functional
application. The purpose, however, of making the control component an
explicit part of the state was to make possible its explicit
manipulation. Thus one "obvious" way to model goto out of any phrase
structure was not to define the operations in line with the phrase
structure, but simply to delete from the control component any
operations which corresponded to parts of the program being jumped
over.
The effect of this was that, in general, it was not possible to
present arguments whose inductive structure followed the phrase
structure of the program. It was of course possible to present
proofs, but they had to be by induction over the sequence of states
generated by LAMBDA. One could argue that this is precisely the
undesirable effect of the goto, but in fact the generality of the
definition had, in this case, gone too far. It was one of the places
where being able to change the control in any instruction forced one
to show that, in precisely the places one did not require the power,
it was not used.
In fact the deletion of parts of the control was sometimes used only
for exits from blocks: the reason that it was not used for more local
phrase structures was the solution adopted for handling abnormal
entry into such phrase structures. Basically, some definitions
adopted a "current statement selector" which was simply changed to
point to the next statement. This made goto into and out of phrase
structures very easy to describe providing there was no special
epilogue action to be performed on exit from the phrase structure.
However, manipulating the pointer tended to cloud the normal action
by the necessity to describe finding the successor to an embedded
unit (see, for example, the treatment of END in ref. [25]). (This
can, in fact, be viewed as absorbing part of the LAMBDA function into
the definition).

The current author became convinced that setting up the normal action
and letting a goto "take the machine by surprise" was the wrong
model.
The proposal made was that any unit which could result in abnormal
termination should return an extra result, which was some "abnormal"
value, or a null value in the normal case. Any call of a function
which could result in an abnormal return must test for this
possibility and perform appropriate actions. (Together with
W. Henhapl, who provided the statement selector treatment, this was
written up in ref. [6]).

In order to define Algol 60, in which it is possible to goto into
both "if" and compound statements, it was necessary to address the
other part of the problem. The approach employed in ref. [1] is to
provide functions which run through the phrase structure without
executing but setting up all of the actions to follow. Since these
functions prompted the execution as to where to begin they became
called "cue-functions" (as in acting - das Stichwort).
Consider, for example, the following -

goto l; if p then l: s1 else s2; s3

Not only should this, rather odd, transfer of control get to s1
without evaluating p, it should also set up the events to be
performed thereafter so that s2 is skipped and s3 is next considered.
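The interplay between abnormal returns and cue-functions can be sketched in Python. This is an illustration under assumptions, not the paper's definition: a Python exception stands in for the abnormal return value, statements are tuples ("goto", lab), ("cpd", [(name, st), ...]) or a hypothetical primitive ("emit", text) that appends to an output list, and labels are assumed unique. The helper name trapped is invented here.

```python
class Exit(Exception):
    """Abnormal return carrying a label (stands in for a non-nil ABN)."""
    def __init__(self, lab):
        super().__init__(lab)
        self.lab = lab

# Statements: ("goto", lab) | ("cpd", [(name_or_None, st), ...]) | ("emit", text)

def is_contained(lab, ns_l):
    return any(nm == lab or (st[0] == "cpd" and is_contained(lab, st[1]))
               for nm, st in ns_l)

def int_st(st, out):
    if st[0] == "goto":
        raise Exit(st[1])
    if st[0] == "cpd":
        int_ns_l(st[1], 0, out)
    else:
        out.append(st[1])

def trapped(ns_l, body, out):
    """Run body(); if it exits to a label inside ns_l, cue to it and report True."""
    try:
        body()
    except Exit as e:
        if not is_contained(e.lab, ns_l):
            raise                       # not ours: pass the exit outwards
        cue_int_ns_l(ns_l, e.lab, out)
        return True
    return False

def int_ns_l(ns_l, i, out):
    if i < len(ns_l):
        if not trapped(ns_l, lambda: int_st(ns_l[i][1], out), out):
            int_ns_l(ns_l, i + 1, out)

def cue_int_ns_l(ns_l, lab, out):
    # Find the unique element that is, or contains, the labelled statement.
    i = next(k for k, n in enumerate(ns_l) if is_contained(lab, [n]))
    name, st = ns_l[i]
    if name == lab:
        int_ns_l(ns_l, i, out)
    elif not trapped(ns_l, lambda: cue_int_ns_l(st[1], lab, out), out):
        int_ns_l(ns_l, i + 1, out)

prog = ("cpd", [
    (None, ("emit", "a")),
    (None, ("goto", "L")),
    (None, ("emit", "skipped")),
    (None, ("cpd", [(None, ("emit", "x")), ("L", ("emit", "b"))])),
    (None, ("emit", "c")),
])
out = []
int_st(prog, out)
print(out)
```

The jump enters the nested compound statement at L without executing its first element, and the statement following the compound is still performed afterwards, which is the "setting up of events" the cue-function is responsible for.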
The completely functional definition of Algol given in ref. [1]
became tedious because of the many places where the effect of a goto
can cause a change of events and therefore the abnormal return value
must be tested. It was, however, clear that the most common action
was simply to refrain from executing the next operation and pass back
the abnormal value to the next level. In fact there are very few
places where it is necessary to describe any special action. Based on
this observation P. Lucas proposed adopting some abbreviation that
would leave one free to drop the "test and return" case by
convention, with a unit to trap the interpretation only where the
action was required. This is the one used in ref. [~] and below.

Normal returns are written omitting the nil value for ABN. Non-nil
values for ABN are returned explicitly by the exit statement. The
(implied) action on a non-nil ABN value being returned is to
terminate also the calling function abnormally with the same value
for ABN. An explicit action to be performed for a non-nil ABN value
is defined by means of a trap unit bracketed together with the
statement to which it applies. Completion of the trap unit completes
the containing function.
The development of this idea has been described in terms of an
interpreter partly because this is how it actually occurred and
partly because it is probably easier to first read the following
functions in this way. (In fact the separation of the, largely
similar, functions int-ns-l and cue-int-ns-l would probably not be
made if one were taking the more mathematical view: it is only for
the purely interpretive view that the functions given below are
written separately.)

The language considered is given by the following abstract syntax.
It is assumed that among the unlisted statement types is something
like the assignment of the previous section which would force
retention of the DEN component.
D1 st = goto-st | cpd-st | ...
D2 goto-st :: id
D3 cpd-st :: nmd-st*
D4 nmd-st :: s-nm:[id] s-body:st
id not further defined

The defining functions can now be given (the "ι" operator means
"the unique object satisfying") -

D5 int-st(st) =
   cases st:
   mk-goto-st(lab) -> exit(lab)
   mk-cpd-st(ns-l) -> int-ns-l(ns-l,1)
type: st =>

D6 int-ns-l(ns-l,i) =
   if i ≤ l ns-l
   then ((trap exit(lab) with
             if is-contained(lab,ns-l)
             then cue-int-ns-l(ns-l,lab)
             else exit(lab);
          int-st(s-body(ns-l[i])));
         int-ns-l(ns-l,i+1))
   else I
type: nmd-st* INTG =>

D7 cue-int-ns-l(ns-l,lab) =
   let i = (ιi)(is-contained(lab,<ns-l[i]>));
   if lab = s-nm(ns-l[i])
   then int-ns-l(ns-l,i)
   else ((trap exit(lab) with
             if is-contained(lab,ns-l)
             then cue-int-ns-l(ns-l,lab)
             else exit(lab);
          cue-int-ns-l(s-body(ns-l[i]),lab));
         int-ns-l(ns-l,i+1))
type: nmd-st* id =>

D8 is-contained(lab,ns-l) =
   (∃i)(s-nm(ns-l[i]) = lab) ∨
   (∃j)(is-cpd-st(s-body(ns-l[j])) ∧
        is-contained(lab,s-body(ns-l[j])))
type: id nmd-st* -> {true,false}

It was observed above that a deeper understanding of the semantics
is often obtained by viewing a meta-language construct in
mathematical terms. This view is now to be attempted of the above
constructs. First it is necessary to uncover what has been hidden in
the "=>" of the type clauses -

int-st:       st -> (DEN -> DEN ABN)
int-ns-l:     nmd-st* INTG -> (DEN -> DEN ABN)
cue-int-ns-l: nmd-st* id -> (DEN -> DEN ABN)

It is easy to define the denotation of two meta-language statements
separated by ";" in terms of their individual denotations.
st1;st2 is -

λden.(let (den1,abn1) = st1(den);
      if abn1 = nil then st2(den1) else (den1,abn1))
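The ";" combinator over functions of type DEN -> DEN ABN can be written down directly. The following Python sketch assumes that DEN is a dict, that nil is represented by None, and that labels are strings; the helper denotations assign and exit_ are illustrative and not part of the paper's definition.

```python
NIL = None  # assumed representation of the nil ABN value

def seq(st1, st2):
    """Denotation of st1;st2 built from the denotations of st1 and st2."""
    def run(den):
        den1, abn1 = st1(den)
        # The test on abn1 is dynamic: st2 is applied only after a normal return.
        return st2(den1) if abn1 is NIL else (den1, abn1)
    return run

def assign(name, value):
    """Illustrative normal statement: updates DEN and returns nil."""
    return lambda den: ({**den, name: value}, NIL)

def exit_(lab):
    """Illustrative exit: leaves DEN alone and returns a non-nil ABN."""
    return lambda den: (den, lab)

normal = seq(assign("x", 1), assign("y", 2))
jumped = seq(assign("x", 1), seq(exit_("l"), assign("y", 2)))
print(normal({}))
print(jumped({}))
```

In the second case the assignment to y is never applied and the label is passed outwards together with the DEN reached so far.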
This
gives
us the way of creating
-> ~ ABN}
from two functions
test
dynamic
is
~qt_o will, !t
is
a function
depend on the state.
to write a very straightforward
for the t r!2 e_/xi~ but, if this results in the
fact
functions applied make
which
to the whole text
for
which
one
or free
although
the
labels
(in the sense they may the set of labels the
labels.
by the following
example -
1
I:$2 /*no contained
Then
that
can do something is known: it is precisely
~ f p ~he._,.nn~
=
is
within the unit),
This point can be illustrated
S
dynamic
(of the current unit) would
to the trap are unknown
he either contained set of contained
equally
to ascribe a semantics of the required type
The key observation
will come
an
combinator
that the t_ra_~ exit body again uses defining
it impossible
to a unit.
{~
unavoidable because the occurence of the
in general,
is also possible
action,
whose type is
of similar type: the fact that t h e
labels
*/
-
int- ns-l(S,1 ) = !_e_% (den1 ,abnl)
= int-st(if
p _th_e.~ng_o_to 1 e_!Is_~e~Lqto =);
(abnl = all -> int-st(s2) abnl = 1
-> cue-int-ns-l(S,l)
T
-> exit (abnl))
Wow s i n c e cue-int-ns-l{S,l)
it
= int-st(S2)
can be seen how tc construct the denotation
of course, statements
a trivial case. introduces
But eve~
where
looping, an equation
fixed point can be sought.
the
of S. This was, graph
of
~o
will be given whose
It should be conceded at ~rife " d e f i N i t i o n s " combination°
The
above
which
do
not
permit
static
This is a cause for further consideration.
mechanis~ is not the one usually e m p l o y e d
mathematical semantics: been
this point that it is also p o s s i b l e to
using exits
accepted
{cf.
the mechanism
ref.
[23])
which
is
in giving
appears
that of
to
have
"Continuations".
Basically,
the denotation of a label is the function
~)
r e p r e s e n t s starting at that label and running to the
which
(say ~
->
end of the program!
This is c e r t a i n l y a more powerful concept:
that
more g e n e r a l languages,
it
can define
found out to his cost possible
when
continuations
while
tried
to
the current author show
that
to e l i m i n a t e c o n t i n u a t i o n s in ref. [21].
maxim is to be sparing
general.
he
to
model Algol 60 labels
was
~owever,
c~ power in the m e t a - l a n g u a g e {ref. [ 1 9 ~
It seems unfortunate, for instance,
it
and
the
using
may be too
that in -
p do
!_f q !_b_e_~so__t_o I: I: $2
ea_d the
The
label I cannot be "treated
actual
locally".
choice between c o n t i n u a t i o n s and the model offered
here must be made in the c o n t e x t of the definition.
Since
use
be
decided.
%.he language
the Oxford group has an interest in proving
c o m p i l e r s c o r r e c t it will await a larger can
of
The
experience
with
arguments on e~i~t is, sc far, e n c o u r a g i n g .
example basing
before
this
correctness
1.5 Block Structure Language

Both blocks and procedures permit their user to introduce a local
level of naming. Since the names defined within different (even
nested) blocks do not have to be distinct, the simple DEN of 1.3 will
not suffice as a state. Consider the case of a language in which no
recursion is allowed. It is necessary to "remember" the value of a
variable, say x, over the lifetime of a nested block in which another
variable x is declared. One way to overcome this problem would be to
make all identifiers statically distinct by, for example, qualifying
them with a unique block number.

The static renaming scheme would not, however, be adequate if
recursion were also allowed. It would then be necessary to keep
distinct, multiple instances of a variable which is declared in a
recursive block.

Before considering the passing of procedures as parameters, it is
appropriate to discuss call-by-reference since its solution
introduces a tool which makes the remaining problems both easier to
state and solve. Consider the following -

begin proc p(x): int x; x := a;
      p(a)

If the variable a is passed by reference, the parameter x will refer
to a. In an implementation, a reference to the parameter x and a
reference to the non-local a would result in reference to the same
location. The description of Algol 60 in the Algol Report was in this
part very operational. The model given was to copy in the argument in
all places where the formal parameter was referenced. In this way the
body of the above procedure would become -

a := a.
Some
care
because
was
discussion using
necessary
concrete of
when
abstract
describe
with
What
The storage
directly
with
has really
be i n s e r t e d ) .
for
class
[I]).
and
for
there
associate
of the class. which
to
least is
a
is to show the s h a r i n g
of objects
below
the
Eut even
tedious At
call-by-name}
component
bee~ done
(of.
somewhat
The idea
identifiers
rule" partly
should
ref.
the same m e m b e r
"copy
discussed
becomes
below
i, the e x a m p l e
the
being
(of.
by an e n v i r o n m e n t
(chosen
!ocations)~
it
mechanism.
some a u x i l i a r y
is m a i n t a i n e d
values
(see
equivalent,
by h a v i n g
LOCs
parentheses
copying
call-by-reference
identifiers
were
programs,
this
simpler~
in d e s c r i b i n g
strings
maps
This
identifiers
to be s u g g e s t i v e will
no
but i n s t e a d
is to d e c o m p o s e
into
of m a c h i n e
longer with
both
association
associate
locations.
DEN:  id ---------> VAL

into -

        ENV            STG
   id -------> LOC -------> VAL

But in doing so, the possibility is introduced to have -

id1 ----+
        +---> l ---> v
id2 ----+

so that any change via one of the identifiers is reflected in
references via the other. (The use of LOC is, in this model, no more
than the expression of an equivalence relation over identifiers).
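The effect of splitting DEN into ENV and STG can be seen in a few lines of Python. This is a sketch under assumptions: locations are small integers, both maps are dicts, and the helper names contents, assign and sharing_classes are chosen here for illustration.

```python
env = {"id1": 0, "id2": 0, "id3": 1}   # ENV: id -> LOC (id1 and id2 share a LOC)
stg = {0: 7, 1: 7}                     # STG: LOC -> VAL

def contents(name, env, stg):
    """Look an identifier up through its location."""
    return stg[env[name]]

def assign(name, value, env, stg):
    """Return a new STG in which the identifier's location holds value."""
    return {**stg, env[name]: value}

def sharing_classes(env):
    """The equivalence relation over identifiers induced by LOC."""
    classes = {}
    for name, loc in env.items():
        classes.setdefault(loc, set()).add(name)
    return {frozenset(names) for names in classes.values()}

stg2 = assign("id1", 42, env, stg)
print(contents("id2", env, stg2), contents("id3", env, stg2))
print(sharing_classes(env))
```

The change made via id1 is visible via id2, while id3, which held an equal value in a different location, is unaffected.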
Tt
is
now
necessary
environments mentioned dynamica!ly
handles
above~
consider
the block
~he
distinct,
to
and
locations
how
model
recursive
will
so the problem
a
block
be g e n e r a t e d
of entering
which
a
has
problems
so as to block
be and
destroying
a
denctafion
c e r t a i n l y been overcome. environment location the
is
case
of
base
bindings,
The
will
later
be
r e q u i r e d has
generated
mapping
the
new
local
to
a new
identifier
(notice such a copying of DEN would be incorrect). a
~eeper blocks the
which
All that happens is that a
block
which
can be known by and called from
(i.e. a procedure), it is necessary to
environment,
to
which
it
will
show
insert
how
its local
is to be found.
most
mcdel
economical
would
be to assume again that all
identifiers are distinct in which case it is possible that
In
any
to
show
valid calling e n v i r o n m e n t c o n t a i n s the required base
environment as a sub-part.
"most
existence
does nct cover the case where procedures can be passed
parameters!
ref.
This
is
variable
the
recent"
as
any
then, only
solution
recent"
of
TB this case,
precisely
references are possible.
[7]).
The
operational
because
then,
"most
discussion
see
will be to "remember"
what its base e n v i r o n m e n t should
be.
In
an
model one would make a procedure d e n o t a t i o n contain
the pair of procedure text and environment. semantics
other than
(For a fuller
general solution,
for any procedure
can be r e f e r r e d to. This
In
a
mathematical
style d e f i n i t i o n one wo~Id use these two entities to
create a function.
The language to be considered is -

E1 proc :: s-nm:id s-parms:id* proc-set s-dcls:id-set st
E2 st = call-st | as-st | ...
E3 call-st :: s-pn:id s-args:id*
E4 as-st :: s-lhs:id s-rhs:expr

Identifiers then correspond either to variables (only one type is
considered) or procedures: in the former case a LOC is the associated
object, in the latter a procedure denotation, as required.

E5 ENV : id -> (LOC | PROC-DEN)

A not yet initialised value for a variable is allowed, so -

E6 S : LOC -> VAL
E7 LOC = some infinite set
E8 VAL = INTG | ?

A functional type for procedure denotations is given -

E9 PROC-DEN : (LOC | PROC-DEN)* -> (S -> S)

(Notice that goto is not in the current language, so ABN is not
required).

In order to cover recursive procedures it is necessary that the
denotation of a procedure is available wherever it might (even
indirectly) call itself. Since denotations here are functional
objects, the definition of env' is "recursive". (The validity of such
definitions is discussed in ref. [22].) In an interpretive definition
the denotation would have been a pair of the text and the declaring
environment.

E10 eval-proc-dcl(proc,env) =
   let <id,parm-l,procs,dcls,st> = proc;
   let f(den-l) =
      (let env' = env +
             ([parm-l[i] -> den-l[i] | 1≤i≤l parm-l] ∪
              [id -> eval-dcl(id) | id ∈ dcls] ∪
              [s-nm(proc) -> eval-proc-dcl(proc,env') | proc ∈ procs]);
       int-st(st,env');
       for all id ∈ dcls do release(env'(id)));
   result is (f)
type: proc ENV -> PROC-DEN
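How a procedure denotation can be a function that carries its declaring environment is sketched below in Python. Everything concrete is an assumption made for illustration: S is a dict from integer locations to values, None stands for "?", expressions are literals, identifiers or ("+", e1, e2), and a ("seq", [...]) statement is added so that a body can hold more than one statement. The recursive env' is obtained by filling in a mutable dict before any call can take place.

```python
from itertools import count

_fresh = count()          # supply of new LOCs (E7: some infinite set)

def alloc(s):
    """Extend the domain of storage; None stands for the value '?'."""
    loc = next(_fresh)
    return loc, {**s, loc: None}

def release(loc, s):
    """Restrict the domain of storage."""
    return {l: v for l, v in s.items() if l != loc}

def eval_expr(e, env, s):
    # Hypothetical expression forms: int literal, identifier, ("+", e1, e2).
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return s[env[e]]
    return eval_expr(e[1], env, s) + eval_expr(e[2], env, s)

def int_st(st, env):
    """st ENV -> (S -> S)"""
    if st[0] == "as":
        _, lhs, rhs = st
        return lambda s: {**s, env[lhs]: eval_expr(rhs, env, s)}
    if st[0] == "call":
        _, pn, args = st
        return lambda s: env[pn]([env[a] for a in args])(s)
    def run(s):                       # ("seq", [st, ...]): added convenience
        for sub in st[1]:
            s = int_st(sub, env)(s)
        return s
    return run

def eval_proc_dcl(proc, env):
    """proc ENV -> PROC-DEN, with proc = (name, parms, procs, dcls, st)."""
    _, parms, procs, dcls, st = proc
    def f(den_l):
        def run(s):
            env1 = {**env, **dict(zip(parms, den_l))}
            for d in dcls:
                env1[d], s = alloc(s)
            for p in procs:
                # env1 is complete before any call happens, so each local
                # procedure sees itself and its siblings (the recursive env').
                env1[p[0]] = eval_proc_dcl(p, env1)
            s = int_st(st, env1)(s)
            for d in dcls:
                s = release(env1[d], s)
            return s
        return run
    return f

out, s0 = alloc({})
p = ("p", ["x"], [], [], ("as", "x", ("+", "x", 1)))
main = ("main", [], [p], ["a"],
        ("seq", [("as", "a", 1), ("call", "p", ["a"]),
                 ("as", "out", ("+", "a", 10))]))
s1 = eval_proc_dcl(main, {"out": out})([])(s0)
print(s1 == {out: 12})
```

The argument a is passed as a location, so the increment performed inside p is seen by the caller, and the local location is released on completion of the body.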
E11 eval-dcl(id) =
   let l: alloc();
   assign(l,?);
   result is (l)
type: id => LOC

E12 int-st(st,env) =
   cases st:
   mk-call-st(pn,arg-l) ->
      (let f = env(pn);
       let den-l = <env(arg-l[i]) | 1≤i≤l arg-l>;
       f(den-l))
   mk-as-st(lhs,rhs) ->
      (let v: eval-expr(rhs,env);
       assign(env(lhs),v))
   mk...
type: st ENV =>

The functions -

E13 alloc : => LOC
E14 release : LOC =>

extend and restrict, respectively, the domain of storage. While -

E15 assign : LOC VAL =>
E16 contents : LOC => INTG

modify and read, respectively, values of storage.

Based on the environment it is possible to clarify two important
points. First, it is interesting to note that this
is precisely the response of both constructive and mathematical
definitions ]a,guageo
to
the
This
problem
the above manipulation cf ~ref.
the environment,
as
the
well
in the state.
he modified the
defining
a
block
structure
environment
better
than,
is say
The VDL models used the grand state approach and
[2~3"?
co,tailed
of
leads to the second point: in what respects
as
all
"stacked"
versions,
This ~eans that, potentially,
by any function.
were
they can
It was then necessary to show that
i n t e r p r e t a t i o n of two successive statements in a statement
list was under the same environment. non-trivial call, had
because if the first statement was, for example,
the e n v i r o n m e n t to
call,
show
was indeed dumped
environments exactly
that by termination of the i n t e r p r e t a t i o n of the
as
arguments,
that two
the
argument.
~£~i~i~.
certain
the
other
calls
of
(This
was
of
hand, shows guite int-st
the
are
passed
subject of the,
~roefs of the first two lemmas in ref. [g].)
language now available
%hat
on
successive
sa~e
somewhat tedious,
of ~ £ n ~
a
then changed: the proof
the dumped e n v i r o n m e n t had been restored. The passing
explicitly
The
Moreover, such proofs were
is rich enough
to discuss the topic
In the definition given, it is assumed that certain conditions hold
for abstract programs which are not expressed by the syntax rules.
For example, the definition would simply be undefined for a program
which attempted to call a simple variable or which called a defined
procedure but with fewer arguments than parameters. (The attempt to
include such type rules in the abstract syntax of ref. [1] was an
unnecessary encumbrance.) It would, of course, be possible to write
appropriate checks into the defining functions. This, however, would
not show that the checks, in this language, are of a static nature.
That is, it is possible to define a predicate -

is-well-formed : proc -> {true,false}

which only yields true in the case that all such static context
conditions are fulfilled.
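A predicate of this kind can be sketched in Python. The following is only an illustration under assumptions: a proc is the tuple (name, parms, procs, dcls, st), statements are ("call", pn, args), ("as", lhs, rhs) or an added ("seq", [...]) form, parameters are assumed to stand for variables only, and the helper names check_st and check_expr are invented here.

```python
def check_expr(e, scope):
    """Every identifier used in an expression must be a variable."""
    if isinstance(e, int):
        return True
    if isinstance(e, str):
        return scope.get(e) == "var"
    return check_expr(e[1], scope) and check_expr(e[2], scope)

def check_st(st, scope):
    if st[0] == "call":
        _, pn, args = st
        return (type(scope.get(pn)) is int          # pn names a procedure
                and scope[pn] == len(args)          # one argument per parameter
                and all(scope.get(a) == "var" for a in args))
    if st[0] == "as":
        _, lhs, rhs = st
        return scope.get(lhs) == "var" and check_expr(rhs, scope)
    return all(check_st(sub, scope) for sub in st[1])   # ("seq", [...])

def is_well_formed(proc, scope=None):
    """proc -> bool, decided from the text alone (no storage, no execution)."""
    _, parms, procs, dcls, st = proc
    scope = dict(scope or {})
    scope.update({v: "var" for v in list(parms) + list(dcls)})
    scope.update({p[0]: len(p[1]) for p in procs})      # procedure -> arity
    return check_st(st, scope) and all(is_well_formed(p, scope) for p in procs)

p = ("p", ["x"], [], [], ("as", "x", 1))
ok = ("main", [], [p], ["a"], ("call", "p", ["a"]))
calls_variable = ("main", [], [p], ["a"], ("call", "a", []))
too_few_args = ("main", [], [p], ["a"], ("call", "p", []))
print(is_well_formed(ok), is_well_formed(calls_variable),
      is_well_formed(too_few_args))
```

Because the predicate never consults storage, it makes explicit that these two conditions are static in this language.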
This is not intended to take a position on to what extent type
questions in a language should be statically checkable. It is only to
argue that it is a useful property of a definition to explicitly show
what is static and what can only be checked dynamically. (An
associated point is that it permits freedom to an implementation for
programs which contain statically checkable errors in an unexecuted
part).

It would be possible to define both function procedures and blocks in
the style of this section.

1.6 Further Topics

This section will consider how some other, familiar, language
constructs could be tackled in the same spirit as above.

With regard to goto, it is straightforward to extend the last
definition to cover goto out of procedure calls. This, along with the
merging of the other features already defined, is done in the
Appendix. The problem is simple because only known, and therefore
most recently activated, instances of labels can be referenced. If
the language allows the passing of labels as parameters this property
no longer holds. It is now necessary to make each instance of a label
dynamically distinct and to do this requires some mechanism like the
activation identifiers of ref. [~].
The
i,troduction
variables)
into
consideration
a of
label
language
target variable definition
the
something
The
(or,
with
it
affected of
variables
is
forced
fact,
the
a label instance
to be not-greater PL/q d ~ s
in
entry
additional
which Do longer
the
lifetime
than the lifetime of the
not make this restriction to
of
add
a
validity
and
check via
like a set of currently active activations.
definition
creation model
brings
referencing
label being assigned. so
variables
Algol 68 avoids this by constraining
exists. any
of
of
call-by-value
of a special by the in
any
location
changes
A ]gol
60
via
is simple to include
which
will
be
the parameter.
description
aD imaginary block.
of
the
by the
only
one
This is a close
assignment
to
new
The fact that Pi/I makes the
choice
between
calling
call-by-value
side
ca~!-~y-~a~a
is
of Algo~
passing
procedures.
it
is
frequently
the
i~Dlementer
wider
cf
definition write
it
referenced
could
is
freedo~
the
First
-
that
in
definition
reasonable
is warning
to leave
This
is
permit
the
user
order.
in
references
such orders
is an open
a
the
to be made
function
In a l l o w i n g
for
an e q u i v a l e n t
to
in an expression contains
such freedom
powerful
mechanism
which g u a r a n t e e
be
designer
define
note
may
the
on
pmore
the
rely on any particular
of how tc formally
(a +
via
in a !a,guage
side-effects.
which
The
of order of evaluation.
variables
language
programs
[4].
handled
if the e x p r e s s i o n cause
call-by-reference
ref.
than o p t i m i s a t i o n s
any order even which
60
~cr instance,
access
and
in
desirable
some
freedom
result.
shown
in a
not
to
The question problem!
(b + c))
the d e f i n i t i o n
may
wish
to allow
not only
-
a b c, a c b, b c a, c b a
but also c a b etc.
It
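The difference between choosing an order at each branch and allowing any order of access can be enumerated. The Python sketch below is an illustration only: the expression (a + (b + c)) is assumed to be represented as nested pairs of variable names, and branch_orders is a name invented here.

```python
from itertools import permutations

def branch_orders(e):
    """Access orders obtained by choosing left-first or right-first at each branch."""
    if isinstance(e, str):
        return [[e]]
    left, right = e
    orders = []
    for lo in branch_orders(left):
        for ro in branch_orders(right):
            orders.append(lo + ro)   # left operand first
            orders.append(ro + lo)   # right operand first
    return orders

expr = ("a", ("b", "c"))             # (a + (b + c))
per_branch = {"".join(o) for o in branch_orders(expr)}
any_order = {"".join(o) for o in permutations("abc")}
print(sorted(per_branch))
print(sorted(any_order - per_branch))
```

The orders missing from the per-branch choice are exactly those in which the access to a falls between the accesses to b and c.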
is net,
at each The
therefores
branch:
response
of the
performed
were in any
any a v a i l a b l e
The
its
Droperty
of
such
arbitrary
this
was
VDL
to this
~achine But
in
ref.
LAMBDA
into
order
cculd
The
in fact a r e a s o n a b l y
control
operations
to
randomly
be be
chose
for execution.
by
very similar
building
the definition. occur
arbitrariness.
if they were to
function
[I] was in fact nature
the
was to sake the
branches
IAMBDA
of the ccnt£cl
functional
tree.
parallel the
one path or the other
to inherit problem
into a
on
crder and
leaf
definiticn
achieved
to choose
it is n e c e s s a r y of
component
executed
adequate
was in
Since
this
the only
expression
small impact.
and only relevant place
evaluation
The
definition
meta-language not
solved
provides
in
ref.
[4]
by introducing because
for the
the
shifts
the
problem
the "," separator.
definition
inheritance
of
of
a
to
the
The problem
combinator
sequencing
freedom
is
which is
not
provided. 9.
Bekić, among others, has pointed out that to combine two pieces of
"interfering" program it is necessary to know more than their
(extensional) functional meaning. His approach to this problem is
discussed in ref. [3]. While this approach may be generally required,
the current author would prefer to pursue the definition of good
cooperation more axiomatically: in the first place by defining
conditions that guarantee non-interference.
One
final
axiomatic great
area,
that of Storage
parts of a definition.
freedom
left
to
the
reasons,
In
some
implementor
mappings of the data structures efficiency
Models,
are
mappings.
worthwhi3e example, list
viewing
I. 7
in
an
ref.
array
there
is
to what storage
required.
In
PL/I,
somewhat
[q],
however,
storage
model
for
and
to
express
the
constrain
it
was
found
abstractly
value as a mapping
For a fuller discussion
(for
from a (hypera
additional
linearised constraints
see ref. [ 2].
S_~u_m_ma__r Z
difficulty
of defining a programming
from a purely functional occur.
A
functional
extensionally Three
as
of integers to values, rather than as
thereof)
separately.
The
Even
to state the basic
)rectangle
languages
the programmer can take alternative views
of an aggregate and thus the language does the
brings in the role of
definition
(of. section
alternative
definition
in
solutions
were program
to
which functions
I. 2) is not immediatel~
as an interpreting
all interpreting
language as distinct
language is that changes
di~ussed;
a
state
are read
applicable.
to consider the
(ref. [ 2 ~ D ; to
functions as ha~ing an additional
consider
argument
and
result which is the state (ref. [1]); to consider the defining
functions as producing functions from states to states (ref. [19]).

It is possible, by adopting carefully chosen notation, to write in a
style which can be read in more than one way. Except for the problem
of arbitrary order, combinators can be provided which permit
ref. [4] to be read in all three ways. The advantages of the
different ways of viewing a definition are returned to in the next
part of the paper.

Whatever one's chosen view of a semantic definition, there are more
important considerations which influence what is written. The central
guide-line proposed is that the definition should not possess
properties which are not inherent in the language being defined. This
is not to say that such definitions are wrong in the overall effect
they describe for a program, but rather that a proof is often
required that certain details of the model have no effect on the
final outcome. If the model can eliminate such details it will
facilitate the use envisaged below. Examples of where definitions can
be over-specific range from the use of the grand state approach to
trivial items like the use of lists where sets could suffice for
components of the state in ref. [25].

Before proceeding to the discussion of proving a compiler correct
with respect to a language definition, a few moments should be spent
on the question of the correctness of the language definition. As far
as "proving" that such a definition corresponds to a normal standards
definition, there is little hope. Most such descriptions are a
mixture of verbal properties (axioms) and partial models for the
language. Even if it were possible, which it is not, for the natural
language to be read precisely, such definitions would be shown to be
at best incomplete, at worst inconsistent. The direction followed by
ref. [25] is, of course, very encouraging in that it is a huge step
towards standardising via a document which could be considered to be
formal.
In spite of the difficulties possible
to consider
most trivial languages passing
level a definition
should be checked the
implicitly
defined
seeing
why
currently
task
under
cn
a
the source
first
be
is required. "formally"
question
of
the must
two
to
source to
be
to
to
be
entail
is
the creation of a
of
Secondly,
functions from states generated
environment
an understanding is
returned
of to
process to be discussed
resolved
is
is which of the reading
are
to
be
adopted.
the interpretive
This
which a
have
mapping
been shown in the source in the choice of code to of
source
to states has more to be
functions:
will also require
author,
view of the
Firstly, it is very unlikely that
are exactly those required
produced.
It is
(and even ref. [9]).
for taking
distinctions,
case
source
specification
worked out. There seem, to the current
arguments
the
definition,
the
The question of whether this documented
definition
as the basic one.
these
aid
some extent remain open until more examples
definition
be
would
is
this process can begin,
must be
have been fully to
has been
definition
definition
The step by step development
question
an
to the object language.
very similar to that in ref. [15]
styles
the proof
consideration
Based
machine
understanding
The
a
as
of
texts of a source language into texts of a
from
that before
object
below.
Much more
existence
this section discusses how to obtain a
mapping
obvious
believe
like
OF A TRANSLATOR SPECIFICATION
language.
a
by current
In ref. [4] an attempt
authors
the
under consideration.
overall
language,
and
it is
it
errors
to a function.
termination
objects.
program to translate machine
of
The question of what
2. DEVELOPMENT
the
of the size required
post and assertion comments
the
"consistent".
correctness,
consistency,
to be free of clerical
questions
made to insert pre,
for
like
the wrong number of arguments
subtle are
The
in establishing
some property
programs
developed
to than
manipulations of, for example, modelling in the object state.
the
The abstract state of the source definition was chosen to permit a
large range of implementations. The time has now come, with a
particular object machine in view, to be more specific. The designer's
task is to find, for a particular machine, concrete realisations of the
abstract state. This might be a multi-stage process, as in the case of
developing something like an ENV to a display model like those of
ref. [7]. Each stage of development provides a new, more concrete,
version of an object. This more concrete model will have properties not
possessed by the abstract object. For example, a list has an ordering
property not present in a set. For this reason, the style believed
appropriate to document the relation, and to express the correctness,
is from (more) concrete to abstract. Thus, if Z is a set and is
modelled by a list L, the set is retrieved by -
retr-s(L) = {L[i] | 1 ≤ i ≤ length L}
type: LIST -> SET
One would then prove that the new function models the old in the same
style as discussed in section 3 (this notion is like Milner's
"Simulation" in ref. [18]).
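The retrieve function from lists to sets can be illustrated with a small executable sketch. This is not part of the original definition: the Python names `retr_s`, `insert_abstract`, `insert_concrete` and `models` are hypothetical, and the "insert" operation is an assumed example chosen only to show how a concrete step is checked against the abstract one through the retrieve function.

```python
from typing import FrozenSet, List


def retr_s(l: List[int]) -> FrozenSet[int]:
    """Retrieve function LIST -> SET: forget ordering and duplicates."""
    return frozenset(l[i] for i in range(len(l)))


def insert_abstract(s: FrozenSet[int], x: int) -> FrozenSet[int]:
    """Assumed abstract operation on the set model."""
    return s | {x}


def insert_concrete(l: List[int], x: int) -> List[int]:
    """Assumed concrete operation on the list model."""
    return l if x in l else l + [x]


def models(l: List[int], x: int) -> bool:
    """The concrete step, seen through retr_s, equals the abstract step."""
    return retr_s(insert_concrete(l, x)) == insert_abstract(retr_s(l), x)
```

The direction of the mapping is the point of the sketch: many lists retrieve to the same set, so the relation is naturally a function from the concrete to the abstract, never the reverse.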
Another example of the decisions made during the development is the
removal of arbitrary ordering. The use of arbitrary ordering in the
interpreter was to provide the freedom permitted by the language for
the implementer. Other than those aspects which result in real parallel
hardware, the randomness is removed in favour of the choice that best
fits the compiler design.

The rationale of the development is to attempt always to find a model
and express its correctness by showing how, among other irrelevant
details, it computes the result of the previous stage. In this way it
is possible to avoid large equivalence proofs whose structure is hard
to see. In fact, for many stages, documenting the retr functions may in
itself be an adequate justification. The definition is thus providing,
not only the correctness criteria but also a basis for the correctness
argument.
~t
the
termination
multi-stage, state now
however,
possible
believed sense,
on
the
seeking
object
record
to produce exactly
machine.
on
The
as documenting more
It
which are
assumptions
attainable
a
state
In a about
goal
than
its full formal definition.
sense
operations
that
to read the interpreter
a
they depend
are inserted,
source language
will
on the text alone.
as
the
in section
that the subsequent d e v e l o p m e n t
of the source language! justify
this
mapping.
static
in
If the machine from
(abstract)
language.
serve
to be described
as a mapping
which are
this is now a function
to the object
function
development note
which functions
the same state transformation.
which may be a
should now be possible
Such
which may be
the machine o p e r a t i o n s
by expanding all of the case distinctions the
above,
are still written in a meta-language.
to
this can be considered
the object machine
It
the work detailed
there exists an interpreter
representable
transitions, is
of
specification 3. It
is
for the
important
to
requires no understanding
The meaning of the language was used to The
mapping itself is a s p e c i f i c a t i o n
purely in terms of strings.
It is now appropriate to consider an example. It might have been useful
for this section to model the block concept (cf. ref. [7]), but it will
provide a better link to the next section to use the example of
compiling expressions. Consider some of the language of section 1.3.
States are -
DEN: id -> INTG
Given -
apply-op : INTG op INTG -> INTG
Then the definition could be written -
eval-expr(e) =
  cases e:
    mk-inf-expr(e1,op,e2) -> (let v1: eval-expr(e1);
                              let v2: eval-expr(e2);
                              result is (apply-op(v1,op,v2)))
    mk-var-ref(id)        -> contents(id)
    mk-const(n)           -> n
type: expr => INTG

int-st-l(st-l,i) =
  if i ≤ length st-l then
    (let mk-as-st(lhs,rhs) = st-l[i];
     let v: eval-expr(rhs);
     assign(lhs,v);
     int-st-l(st-l,i+1))
  else I
type: as-st* INTG =>

Now, considering an actual machine, the first problem to note is the
way individual values are retained by the "let vi". On most machines
this would give rise to some use of temporaries, even though the
evaluation of the other sub-expression may itself give rise to many
such uses. So consider storage extended so that -
DENt : (id | T) -> INTG
where -
id ∩ T = {}
A simple algorithm can now be given which will act as an interpreter on
the new class of states (the author is aware that superfluous
assignments are made - but their removal lengthens the example without
adding any new concepts).
trans-expr(e,j) =
  cases e:
    mk-inf-expr(e1,op,e2) -> (trans-expr(e1,j);
                              trans-expr(e2,j+1);
                              assign(t[j],apply-op(contents(t[j]),op,contents(t[j+1]))))
    mk-var-ref(id)        -> assign(t[j],contents(id))
    mk-const(n)           -> assign(t[j],n)
type: expr INTG =>

trans-st-l(st-l,i) =
  if i ≤ length st-l then
    (let mk-as-st(lhs,rhs) = st-l[i];
     trans-expr(rhs,1);
     assign(lhs,contents(t[1]));
     trans-st-l(st-l,i+1))
  else I
type: as-st* INTG =>
To show that the trans-expr interpreter models eval-expr, it is
necessary to prove -
contents(t[j]) after trans-expr(e,j) = eval-expr(e)
If one appends the statement that for i < j, trans-expr(e,j) leaves
contents(t[i]) unchanged, it is easy to prove the combined property by
structural induction on the class of expressions. The further
extension suggested above to as-st lists is trivial.

The new interpreter has been shown to model the original defining
interpreter by showing how to retrieve the results of the latter from
the former.
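The property to be proved can be exercised on concrete cases with a short sketch. This is an illustration only, under stated assumptions: expressions are modelled by hypothetical Python dataclasses (`Const`, `VarRef`, `InfExpr`), the temporaries t by a dictionary indexed by integers, and `apply-op` by a small assumed table of arithmetic operators. A test of this kind does not replace the structural induction; it only shows what the induction establishes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass(frozen=True)
class Const:
    n: int


@dataclass(frozen=True)
class VarRef:
    id: str


@dataclass(frozen=True)
class InfExpr:
    e1: "Expr"
    op: str
    e2: "Expr"


Expr = Union[Const, VarRef, InfExpr]

# Assumed operator table standing in for apply-op.
OPS: Dict[str, Callable[[int, int], int]] = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def eval_expr(e: Expr, den: Dict[str, int]) -> int:
    """The defining interpreter: values are held by local 'let' names."""
    if isinstance(e, InfExpr):
        v1 = eval_expr(e.e1, den)
        v2 = eval_expr(e.e2, den)
        return OPS[e.op](v1, v2)
    if isinstance(e, VarRef):
        return den[e.id]
    return e.n


def trans_expr(e: Expr, j: int, den: Dict[str, int], t: Dict[int, int]) -> None:
    """The new interpreter: the result is left in temporary t[j]."""
    if isinstance(e, InfExpr):
        trans_expr(e.e1, j, den, t)
        trans_expr(e.e2, j + 1, den, t)
        t[j] = OPS[e.op](t[j], t[j + 1])
    elif isinstance(e, VarRef):
        t[j] = den[e.id]
    else:
        t[j] = e.n


def check(e: Expr, j: int, den: Dict[str, int]) -> bool:
    """Combined property: t[j] holds eval-expr(e), and t[i] for i < j is unchanged."""
    t = {i: -1 for i in range(j)}
    before = dict(t)
    trans_expr(e, j, den, t)
    return t[j] == eval_expr(e, den) and all(t[i] == before[i] for i in range(j))
```

The second conjunct in `check` is the appended statement about temporaries below j; without it the inductive step for the second operand would not go through, since evaluating e2 at j+1 must not disturb the value already left in t[j].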
The type of the function assign(t[j],apply-op(...)) is -
DENt -> DENt
Suppose then, the object machine is of 3 address type (cf. IBM 1401),
an appropriate instruction might be -
op = ADD -> ADD(t[j],t[j],t[j+1])
Inserting such operations, it is now possible to use the function
trans-expr as a mapping from source to object languages. It is
important to remember that this has been derived step by step from the
definition. (At the risk of labouring the point, it could be remarked
that had the statement counter been made part of a grand state, it
would now pose problems because there is no model for it in the object
state).
It
should be clear from the methods used so far that not only in the
source definition, but also in the subsequent development, it is likely
to lead to more work if unnecessary properties are introduced.

This, however, touches on one of the problems which has not really been
solved. In a large problem it is likely that the mapping cannot be
conveniently expressed in such a "single pass". If a multi-pass mapping
is described and its structure differs from that of the eventual
translator, the proof is likely to be much more difficult. More work is
needed in this area.
Before concluding this section it is worth considering what happens if
a defining model is chosen, or given, which is, in some areas of the
language, too concrete. That is, there are present some details which
do not appear in the planned model. Not only should one refrain from
throwing the definition away: one should also not embark on a complete
equivalence proof. It has been shown in ref. [10] how one can, for one
area of the language, introduce a more abstract notion than that of the
source definition as a development stage. One can then prove that
satisfying the new notion ensures overall equivalence. The subsequent
development can now be made from the more abstract definition. In this
respect it would be interesting to prove that the storage model of
ref. [25] was a model of that in ref. [3].
3. FORMAL DEVELOPMENT
Faced with a program specification and a code listing it is difficult
to ascertain whether the latter satisfies the former. The basic idea of
top-down documentation is subscribed to. In addition it is argued that
each step of a development can be documented precisely enough that its
correctness can be the subject of a proof. Thus, the intention of
Formal Development is to provide a framework in which the design can be
recorded step by step during a development.

It is important to distinguish the current proposal from the idea of
writing a program then constructing a proof that it fulfills its
specification. The possibility to use abstraction makes the
construction of a step-wise proof far more tractable. Furthermore, when
an attempt to construct a proof uncovers an error, the amount of work
to be redone is reduced.

It may also avoid misunderstanding if it is stated right away that
Formal Development is not proposed as a rule to order invention. The
backtracking and effect of inspiration conveyed by ref. [20] are much
more typical of program invention. But, just as one frequently
rearranges a proof when writing it up, it is worth documenting a design
so that a reader can see a coherent development of ideas.
Two main sorts of development the one normally connected specification
steps are discussed.
with top-down
of what is required,
a
which
may be either s t a t e m e n t s of some language or assumed
to
second
have
possible
perform
the
specifications
used:
one
the given
properties
be
achieved.
it
becomes which
can
are
The also
step comes from the wish to use By
such steps of development
using
to
an
the
arguments.
necessary be
combined
task.
reduce
and c o r r e c t n e s s
the
believes
relevant
drastically
representation
may
notation is being used it is now
M~I
of normal data objects.
to
development
formal
down
sort of development
abstractions only
some
write
sub-operations The
task
in either case their specifications
Since
possible
the
Given
operations recorded.
how
development.
one writes some sequence of
operations
sub-operations:
show
The first is
to
objects
which
algorithm, length
it is
of
both
At some point in the seek
an
efficient
described in the language being are considered
in section
3.2.
3.1 Operational Abstraction
Operations are considered to be transformations on states from some
class, say Σ. For some operation, say OP -
σ [OP] σ'
means that OP will produce a state σ' when "run in" a state σ (only
deterministic operations are considered in the current paper). The term
abstraction is applied because operations, which may not yet be
available (other than by construction) to be performed, can be
discussed. They are in fact discussed via an implicit specification.

The definition of an operation will be given via two predicates, one
which specifies the domain over which it must yield a result -
pre-OP : Σ -> {true,false}
pre-OP(σ) ⊃ (∃σ')(σ [OP] σ')
and the other of which specifies the required input/output relation for
the operation -
post-OP : Σ Σ -> {true,false}
pre-OP(σ) ∧ σ [OP] σ' ⊃ post-OP(σ,σ')
If both of these conditions hold the fact is recorded by -
pre-OP <OP> post-OP
These two predicates, then, record everything necessary about the
meaning of OP (there is of course nothing about performance etc.).
Suppose
a specification is given in this form, how does one proceed? The
problem is decomposed to sub-problems by choosing a set of (simpler)
operations, which, if one had them, could be combined in some stated
way to fulfil the specification. The assumptions on the sub-operations
will be recorded in exactly the same style. Eventually all of the
sub-operations will be statements in the language being used. Until
that time they provide specifications for further work.

The most trivial way of combining two operations is to separate them
with ";", showing that, in most languages, execution of the first is to
be followed immediately by execution of the second. The conditions
necessary to show that such a combination of the two operations -
pre-OP1 <OP1> post-OP1
pre-OP2 <OP2> post-OP2
will satisfy the stated requirement -
pre <OP1;OP2> post
are firstly that each operation will only be used over its domain -
pre(σ1) ⊃ pre-OP1(σ1)
pre(σ1) ∧ post-OP1(σ1,σ2) ⊃ pre-OP2(σ2)
and secondly that the overall input/output relation can be derived from
the combination -
pre(σ1) ∧ post-OP1(σ1,σ2) ∧ post-OP2(σ2,σ3) ⊃ post(σ1,σ3)
(The adequacy of these conditions is proved in ref. [12]).

For such a simple combination, three lemmas to prove appears excessive.
In many situations, however, special cases can be applied. For
instance, with total operations the first two lemmas are vacuously
true. Furthermore, it is certainly not being suggested that every use
of ";" in a program should be accompanied by formal proofs: but a check
list has been provided to which an appeal can be made in case of doubt.

The reader should observe that, providing the specification of an
operation which is passed on to the next stage of development states
all of the properties relied on in the current proof, there is a
complete split of the problem. Thus it is not necessary to later show
that the development of that operation does not disturb the
justifications.

More methods of combining operations are defined in the same style by
ref. [12]; section 9 of that paper also considers how the set could be
further extended.
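The three conditions for sequential combination can be checked mechanically when the state class is small. The following sketch is an illustration under stated assumptions, not a rendering of the method of ref. [12]: states are integers drawn from a hypothetical finite range `STATES`, predicates are ordinary Python functions, and the example operations (an increment followed by a negation) are invented for the purpose. Exhaustive checking over a finite range is only a stand-in for the proofs the text has in mind.

```python
from itertools import product
from typing import Callable

STATES = range(-3, 4)  # hypothetical finite state class

Pre = Callable[[int], bool]
Post = Callable[[int, int], bool]


def implies(a: bool, b: bool) -> bool:
    return (not a) or b


def sequence_ok(pre: Pre, post: Post,
                pre1: Pre, post1: Post,
                pre2: Pre, post2: Post) -> bool:
    """Check the three lemmas for OP1;OP2 against the overall pre/post."""
    # Lemma 1: OP1 is only used over its domain.
    lemma1 = all(implies(pre(s1), pre1(s1)) for s1 in STATES)
    # Lemma 2: OP2 is only used over its domain.
    lemma2 = all(implies(pre(s1) and post1(s1, s2), pre2(s2))
                 for s1, s2 in product(STATES, repeat=2))
    # Lemma 3: the overall input/output relation follows from the parts.
    lemma3 = all(implies(pre(s1) and post1(s1, s2) and post2(s2, s3),
                         post(s1, s3))
                 for s1, s2, s3 in product(STATES, repeat=3))
    return lemma1 and lemma2 and lemma3


# Assumed example: OP1 increments, OP2 negates a strictly positive state.
pre1: Pre = lambda s: True
post1: Post = lambda s, s2: s2 == s + 1
pre2: Pre = lambda s: s > 0
post2: Post = lambda s, s2: s2 == -s
post: Post = lambda s1, s3: s3 == -(s1 + 1)
```

With the overall pre-condition `s >= 0` all three lemmas hold. Weakening it to "always true" breaks the second lemma, because starting from -1 the increment produces 0, which lies outside the domain of the second operation.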
3.2 Data Abstraction
In many respects this might be the more important of the two forms of
abstraction being considered: it is certainly the one which is
under-employed. The idea of data abstraction has, in fact, already been
used in the earlier parts of the paper. The language definitions and
mappings have both used an abstraction of the program text (i.e. that
class of objects described by the abstract syntax). That this was
necessary can be seen by considering the alternative of redefining
those functions in terms of the concrete strings. If then, it is
difficult to even state the specification in terms of detailed data
representations, it will become impossible to write arguments for
correctness at such a level of detail.

The proposal is that development of an algorithm should only bring in
those properties of the data structure that have an effect on the
algorithm. That this permits a reasonable percentage of the development
can be seen in the example of section 3.4 below. Thus one is able to
postpone fixing the representation of a data object until adequate
reason (e.g. the performance of a sub-algorithm) can be ascertained.
The interface between operations might, then, be described in terms of
sets or maps for example: in the final code linked lists or hash tables
might be the chosen representations.

It is necessary to discuss what goes on in a development step which
refines a data representation. Essentially what one is doing is adding
properties to the data structure (e.g. the list has an ordering
property not present in the set). It is possible to retrieve all of the
data of the abstract level from the more concrete.
Suppose an operation on states of class D has been used, such that -
pred <OPd> postd
and one now wishes to show that -
pree <OPe> poste
is an adequate simulation. It is sufficient to find a relationship
between the two state classes -
retrd : E -> D
which shows that OPe works on a wide enough class of states -
pred(d) ∧ retrd(e) = d ⊃ pree(e)
and that (under retrd) the new operation produces states matching those
produced by the old operation -
pred(d) ∧ retrd(e) = d ∧ poste(e,e') ⊃ postd(d,retrd(e'))
(This notion differs from that in ref. [18] in that the operations are
not, necessarily, functions).
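The two simulation conditions can be tried out on a small example. The sketch below rests on assumptions that are not in the paper: the abstract class D is taken to be finite sets of integers, the concrete class E tuples over a three-element universe, the retrieve function simply forgets order and duplicates, and the operation is "insert the element X". All names (`retr`, `pre_d`, `post_e`, `simulates`, and so on) are hypothetical. Because the retrieve function is a function, quantifying over d with retrd(e) = d reduces to taking d = retr(e).

```python
from itertools import product
from typing import Callable, FrozenSet, Tuple

X = 1                      # the element being inserted (assumed example)
UNIVERSE = (0, 1, 2)
# Concrete states: all tuples over UNIVERSE of length 0..3.
E = [t for n in range(4) for t in product(UNIVERSE, repeat=n)]

Conc = Tuple[int, ...]
Abs = FrozenSet[int]


def retr(e: Conc) -> Abs:
    """Retrieve function E -> D."""
    return frozenset(e)


def pre_d(d: Abs) -> bool:
    return True


def post_d(d: Abs, d2: Abs) -> bool:
    return d2 == d | {X}


def pre_e(e: Conc) -> bool:
    return True


def post_e(e: Conc, e2: Conc) -> bool:
    """Concrete insert: append X unless already present."""
    return e2 == (e if X in e else e + (X,))


def post_e_bad(e: Conc, e2: Conc) -> bool:
    """A faulty concrete insert which discards the old contents."""
    return e2 == (X,)


def simulates(pre_c: Callable[[Conc], bool],
              post_c: Callable[[Conc, Conc], bool]) -> bool:
    """Check both simulation conditions by enumeration over E."""
    domain_ok = all((not pre_d(retr(e))) or pre_c(e) for e in E)
    result_ok = all((not (pre_d(retr(e)) and post_c(e, e2)))
                    or post_d(retr(e), retr(e2))
                    for e, e2 in product(E, repeat=2))
    return domain_ok and result_ok
```

The faulty variant fails the second condition: from the concrete state (0,) it produces (1,), whose retrieved set {1} is not the abstract result {0, 1}.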
3.3 Example of Expression Compilation
The input/output relation given for trans-expr in section 2 is defined
to operate on objects of type "expr". These were conveniently chosen to
be tree representations of the original linear form (presumably infix).
Without going all of the way to considering the parsing and tokenising
of an external string, consider a reverse-Polish text which might
result from such a first parse -
c-expr     ::= c-inf-expr | c-var-ref | c-const
c-inf-expr ::= c-expr c-expr c-op
c-var-ref  ::= ...
c-const    ::= ...
The relation of this to the class expr can be specified by a retrieve
function which uses a stack -
retr-expr(tl) =
  for i = 1 to length tl do
    (is-c-var-ref(tl[i]) -> push(retr-var(tl[i]))
     is-c-const(tl[i])   -> push(retr-const(tl[i]))
     is-c-op(tl[i])      -> (let e2: pop;
                             let e1: pop;
                             push(mk-inf-expr(e1,retr-op(tl[i]),e2))));
  result is (pop)
type: c-expr -> expr
Not only does this retrieve function give the correctness criterion for
the following translate function, the stacking is suggestive of a way
to track the temporaries.
Assuming an external variable b -
trans-c-expr(tl) =
  for i = 1 to length tl do
    (is-c-var-ref(tl[i]) -> (b := b+1;
                             assign(t[b],contents(retr-var(tl[i]))))
     is-c-const(tl[i])   -> (b := b+1;
                             assign(t[b],retr-const(tl[i])))
     is-c-op(tl[i])      -> (assign(t[b-1],apply-op(contents(t[b-1]),
                                                    retr-op(tl[i]),
                                                    contents(t[b])));
                             b := b-1))
type: c-expr =>
The correctness can now be proved if -
b := 0; trans-c-expr(tl)
is equivalent to
trans-expr(retr-expr(tl),1)
This result follows from a proof, by induction on the length (and
possible constructions of) tl, of the stronger statement -
b := k; trans-c-expr(tl)
leaves b = k+1 and creates the same as
trans-expr(retr-expr(tl),k+1)
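The stronger statement can be exercised on sample inputs with the following sketch. It is an illustration under assumptions of its own: reverse-Polish tokens are modelled as hypothetical pairs such as `("var", "x")`, `("const", 2)` and `("op", "+")`, tree expressions as tagged tuples, the temporaries as a dictionary, and the external variable b as a parameter and return value. Running examples is of course no substitute for the induction on the length of tl.

```python
from typing import Callable, Dict, List, Tuple

# Assumed operator table standing in for apply-op.
OPS: Dict[str, Callable[[int, int], int]] = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

Token = Tuple[str, object]


def retr_expr(tl: List[Token]) -> tuple:
    """Retrieve the tree form of a reverse-Polish token list, using a stack."""
    stack: List[tuple] = []
    for kind, val in tl:
        if kind in ("var", "const"):
            stack.append((kind, val))
        else:  # operator: the second operand is on top of the stack
            e2 = stack.pop()
            e1 = stack.pop()
            stack.append(("inf", e1, val, e2))
    return stack.pop()


def trans_expr(e: tuple, j: int, den: Dict[str, int], t: Dict[int, int]) -> None:
    """Tree translator of section 2: the result is left in t[j]."""
    if e[0] == "inf":
        _, e1, op, e2 = e
        trans_expr(e1, j, den, t)
        trans_expr(e2, j + 1, den, t)
        t[j] = OPS[op](t[j], t[j + 1])
    elif e[0] == "var":
        t[j] = den[e[1]]
    else:
        t[j] = e[1]


def trans_c_expr(tl: List[Token], b: int, den: Dict[str, int],
                 t: Dict[int, int]) -> int:
    """Reverse-Polish translator; b tracks the top temporary and is returned."""
    for kind, val in tl:
        if kind == "var":
            b += 1
            t[b] = den[val]
        elif kind == "const":
            b += 1
            t[b] = val
        else:
            t[b - 1] = OPS[val](t[b - 1], t[b])
            b -= 1
    return b
```

For a token list tl and a starting value k, the claim is that `trans_c_expr(tl, k, den, t1)` returns k+1 and leaves in `t1[k+1]` the value that `trans_expr(retr_expr(tl), k+1, den, t2)` leaves in `t2[k+1]`.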
The rather short treatment of Formal Development may leave the reader
unclear as to how a bigger example looks. Whilst the notation used in
ref. [12] is thought to have correctly made the step from development
via functions to development via operations, the example offered in
that report is unconvincing. This is mainly because the algorithm
considered was so oriented to arrays that the use of an abstract data
representation is somewhat artificial. The example of ref. [11] is more
interesting with respect to data abstraction and a short outline of a
rewrite of its development is now given -
Specification: find a (general, table driven recogniser) algorithm -
REC : grammar nt symb* -> {YES,NO}
Where the abstract form of a grammar is -
grammar : nt -> rhs-set
rhs = el*
el = symb | nt
The pre-condition defines that there are rules for each non-terminal
and that there is exactly one rule for the sentence symbol
non-terminal. The post-condition states that REC should yield "YES" if
and only if the string can be produced from the given grammar.
("Produceable" is defined).
Step 1: Splits the problem into an input stage which stores the
grammar; a main stage which creates "State Sets" which will contain
information on all possible top-down parses; an output stage which
yields YES or NO depending on a predicate of the state sets. The
storage for the grammar is still specified in terms of the (abstract)
map. This may be a disappointment to the programmer assigned the task
of constructing the input routine since he has little to work on yet.
But the cost of fixing this interface for his convenience is that the
far more time consuming parsing operation has not yet been developed
far enough to get his views on an efficient representation. The state
sets are also described abstractly (as a list of sets of tuples) since
the purpose of this stage is to show that certain upper and lower
bounds on the state sets are sufficient to make the final predicate
correct. Notice this extreme form of abstract definition. There is a
great deal of freedom in the given bounds and different algorithms
could be constructed to use this freedom. (In fact the specification
has been of considerable use in considering optimisations).

Step 2: introduces Earley's operations (Prediction, Completion,
Scanning) which generate new states. The state sets are defined as the
minimum sets satisfying a certain equation. Such sets are shown to fall
within the bounds stated in Step 1. Notice that not only are these
operations defined in terms of abstract data objects, they are also
implicitly defined.
Step
~:
not
Begins
to
restriction would,
algorithm
4:
Makes
is not
representing ~t
this
rhs)
the
Using
~t
algorithm
the restriction.
similar ordering step to the data structure
grammars.
what the common operations on the
Now is the time to give
REFEP
abstraction [S].
complete.
to the allowable grammars.
the c o n c r e t e
(In fact those used were quite complicated the
lists
point most of the algorithms as such, is designed and
it is clear are.
yet
have been possible to use a different
a
of
on lists it is necessary to introduce a
(no zero length
however,
by mapping state
object in a von Neumann machine,
at _this stage of development and avoid
Step
in
development
But notice that, since a list
a convenient data
chosen
this form of
consider representations
step to a concrete representation the
is
later.
sets onto state lists. is
This
to ref. [8 ] in which the a l g o r i t h m s are programmed
with operations on the a b s t r a c t data: is employed
defined.
option.)
Doing
data
(~L/I)
structures
data objects!
HASEE variables with
this prompts a macro style of data
like ref. [5] which is similar to that used in ref.
4. SUMMARY
The aim of the paper has been to show how a large problem, in this case
the development of a compiler, can be decomposed into small steps.
Providing each step is adequately documented, a complete design history
is thus obtained. One of the views expressed is that each stage of
development should be supported by a justification. This implies that
steps of design are recorded in a notation on which it is possible to
base a correctness argument. Such correctness arguments are sought with
a view to human readers rather than mechanical theorem checkers.

The key to making such an approach practical is the use of abstraction.
In each of the sections the value of stating minimum properties has
been shown. Although it is frequently more difficult to find an
appropriate abstraction than to provide a construction, the advantages
of the former make the effort worthwhile.

By employing both data and operational abstraction, it is possible to
avoid general equivalence proofs. The normal style of correctness
argument is based on showing how the (more) abstract model can be
retrieved from the (more) concrete realisation. In this way a view of
the design process is obtained where the successive stages of the
development are realisations of the same algorithm at ever greater
levels of detail.

Taking this view, the requirements of a design language to be able to
specify operations implicitly and to use very abstract data objects are
very different from those of programming languages. Although a
particular notation (that of ref. [4]) has been employed as the design
language, it should be emphasised that it is the method not a
particular notation which is being proposed here.
Acknowledgement
Most of the ideas contained in this paper were developed in
collaboration with members of the Hursley and Vienna IBM Laboratories.
The members and meetings of IFIP WG 2.3 have also been a great
stimulus.
References
[1] C.D. Allen, D.N. Chapman and C.B. Jones, "A Formal Definition of Algol 60", IBM Hursley Technical Report, TR 12.105, August 1972.
[2] H. Bekić, Presentation on "Semantics of Actions" given at Newcastle University, September 197~.
[3] H. Bekić and K. Walk, "Formalisation of Storage Properties", in "Symposium on Semantics of Algorithmic Languages", (Ed.) E. Engeler, Springer-Verlag Lecture Notes in Mathematics No. 188, October 1970.
[4] H. Bekić et al, "A Formal Definition of PL/I", to be printed as a Technical Report of IBM Laboratory Vienna.
[5] A. Hansal, "Software Devices for Processing Graphs Using PL/I compile-time Facilities", Inf. Proc. Letters, 1974.
[6] W. Henhapl and C.B. Jones, "On the Interpretation of Goto Statements in the VDL", IBM Vienna Note, LN 25.3.065, March 1970.
[7] W. Henhapl and C.B. Jones, "The Block Concept and Some Possible Implementations, with Proofs of Equivalence", IBM Vienna Technical Report, TR 25.104, April 1970.
[8] C.A.R. Hoare, "Proof of Correctness of Data Representations", Acta Informatica, Vol. 1, pp 271-281, 1972.
[9] C.B. Jones and P. Lucas, "Proving Correctness of Implementation Techniques", in "Symposium on Semantics of Algorithmic Languages", (Ed.) E. Engeler, Springer-Verlag Lecture Notes in Mathematics No. 188, October 1970.
[10] C.B. Jones, "Sufficient Properties for Implementation Correctness: Assignment Language", IBM Hursley Note, TN 9002, June 1971.
[11] C.B. Jones, "Formal Development of Correct Algorithms: An Example Based on Earley's Recogniser", presented at ACM SIGPLAN Conference, SIGPLAN Notices Vol. 7, No. 1, January 1972.
[12] C.B. Jones, "Formal Development of Programs", IBM Hursley Technical Report, TR 12.117, June 1973.
[13] P.J. Landin, "A Correspondence Between Algol 60 and Church's Lambda-Notation: Part I", Comm. of ACM, Vol. 8, No. 2, February 1965.
[14] P. Lucas, "Two Constructive Realisations of the Block Concept and their Equivalence", IBM Vienna Technical Report, TR 25.085, 1968.
[15] P. Lucas, "On Program Correctness and the Stepwise Development of Implementations", presented at Conference at Pisa University, 1972.
[16] P. Lucas and K. Walk, "On the Formal Description of PL/I", in Annual Review in Automatic Programming, Vol. 6, Part 3, Pergamon Press, 1969.
[17] J. McCarthy, "Towards a Mathematical Science of Computation", presented at IFIP Congress 1962.
[18] R. Milner, "An Algebraic Definition of Simulation Between Programs", Stanford University AIM-142, February 1971.
[19] P. Mosses, "The Mathematical Semantics of Algol 60", Oxford University Computing Laboratory, PRG-12, January 1974.
[20] P. Naur, "An Experiment on Program Development", BIT 12, pp 347-365, 1972.
[21] J.C. Reynolds, "Definitional Interpreters for Higher-Order Programming Languages", presented at 25th ACM National Conference, August 1972.
[22] D. Scott and C. Strachey, "Toward a Mathematical Semantics for Computer Languages", in "Proceedings of the Symposium on Computers and Automata", Microwave Research Institute Symposia Series Vol. 21, Polytechnic Institute of Brooklyn, 1971.
[23] C. Strachey, "Continuations: A Mathematical Semantics which can deal with Full Jumps", unpublished.
[24] K. Walk et al, "Abstract Syntax and Interpretation of PL/I", IBM Vienna Technical Report, TR 25.098, 1969.
[25] "PL/I BASIS/1", ECMA ANSI working document, February 197~.
Appendix
This appendix contains a definition of the language obtained by merging
the separate features of section 1. The definition is written in a
style which can be read (cf. the type clauses) as mathematical
semantics.
ABSTRACT SYNTAX
proc     :: s-nm:id s-parms:id* proc-set s-dcls:id-set cpd-st
st       =  as-st | goto-st | call-st | cpd-st
as-st    :: s-lhs:id s-rhs:expr
goto-st  :: id
call-st  :: s-pn:id s-args:id*
cpd-st   :: nmd-st*
nmd-st   :: s-nm:[id] s-body:st
expr     =  inf-expr | var-ref | const
inf-expr :: expr op expr
var-ref  :: id
const    :: INTG
id, op, INTG: not further specified

DOMAINS
ENV: id -> (LOC | PROC-DEN)    /* args dcl. by proc or parms */
S: LOC -> VAL
VAL = INTG
PROC-DEN: (LOC | PROC-DEN)* -> (S -> S ABN)
ABN = [id]
LOC = infinite set
FUNCTIONS
e v a ! - ~ r o c - d c l (proc) (env) = le_t < i d , p a r ~ - l , ~ r o c s , d c l s g m k - c p d - s t ( n s - l )
> = proc;
!e_t f~den:!)= {!_~! env'
: e~v + ([ par,-![i ]
den-l[i]
~ 1<_i_
[id
eval-dcl|id)
~ id~dcls]
u
[ s-nm (proc)
e val-proc-dcl (proc) (env') pro cE procs ] ) ;
(t~a~ ezit (lab} with (free (dcls,env') ; e/x!_t(lab) ) ; int-ns-l{ns-l,1)
(env')) ;
free{dcls) (envv)) ; result
type:
is(f)
proc->
(ENV->
PROC-D~N)
eval-d¢l (id) : !e_!t !: alloc () : assign(l) (I) ; result
type:
is(])
id ->
(s ->
s LOC)
free(dcls)(env) =
  for all id ∈ dcls do
    release(env(id))
type: id-set -> (ENV -> (S -> S))
int-st(st)(env) =
  cases st:
    mk-as-st(lhs,rhs)    -> (let v: eval-expr(rhs)(env);
                             assign(env(lhs))(v))
    mk-goto-st(lab)      -> exit(lab)
    mk-call-st(pn,arg-l) -> (let f = env(pn);
                             let den-l = <env(arg-l[i]) | 1 ≤ i ≤ length arg-l>;
                             f(den-l))
    mk-cpd-st(ns-l)      -> int-ns-l(ns-l,1)(env)
type: st -> (ENV -> (S -> S ABN))
int-ns-l(ns-l,i)(env) =
  if i ≤ length ns-l then
    ((trap exit(lab) with
        if is-contained(lab,ns-l) then cue-int-ns-l(ns-l,lab)(env)
        else exit(lab);
      int-st(s-body(ns-l[i]))(env));
     int-ns-l(ns-l,i+1)(env))
  else I
type: nmd-st* INTG -> (ENV -> (S -> S ABN))
cue-int-ns-l(ns-l,lab)(env) =
  let i = (ιi)(is-contained(lab,<ns-l[i]>));
  if lab = s-nm(ns-l[i]) then int-ns-l(ns-l,i)(env)
  else ((trap exit(lab) with
           if is-contained(lab,ns-l) then cue-int-ns-l(ns-l,lab)(env)
           else exit(lab);
         cue-int-ns-l(s-body(ns-l[i]),lab)(env));
        int-ns-l(ns-l,i+1)(env))
type: nmd-st* id -> (ENV -> (S -> S ABN))
eval-expr(e)(env) =
  cases e:
    mk-inf-expr(e1,op,e2) -> (let v1: eval-expr(e1)(env);
                              let v2: eval-expr(e2)(env);
                              result is (apply-op(v1,op,v2)))
    mk-var-ref(id)        -> contents(env(id))
    mk-const(n)           -> n
type: expr -> (ENV -> (S -> S INTG))

is-contained: id nmd-st* -> B
apply-op:     INTG op INTG -> INTG
alloc:        -> (S -> S LOC)
release:      LOC -> (S -> S)
assign:       LOC -> (VAL -> (S -> S))
contents:     LOC -> (S -> S INTG)    /* ? yields error */
PROGRAMMED STRUCTURES (PROGRAMMIERTE STRUKTUREN)
R. Gnatz, Technische Universität München

1. Introduction
Methods for constructing correct programs have in recent years increasingly become a central concern of computer science. Part of the effort concentrates on the problem of proving the correctness of programs. HOARE's remark [1], "...the cost of error in certain types of program may be almost incalculable - a lost spacecraft, a collapsed building, a crashed aeroplane, or a world war", gets under one's skin. DIJKSTRA's statement [2], "Program testing can be used to show the presence of bugs, but never to show their absence", is on everyone's lips today (and is occasionally used to tease colleagues who are caught testing for too long).

On the other hand, DIJKSTRA [3] points out that, although from the mathematical standpoint the formal correctness proof of a given program is the more attractive problem, in practice the construction of the program itself is of greater importance, that is, the question of how a suitable program can be found for given specifications. What matters here is that the program is to be constructed in such a way that its correctness is evident or can be demonstrated. DIJKSTRA [4] has taken this constructive aspect an essential step further. The emerging trend (cf. BAUER [5]) has the side effect that, given the need for formalization, the connections between algorithms or programs on the one hand and the algebraic structures imposed on the concrete objects of the algorithms on the other hand emerge more clearly and become more conscious.

It is of course no new insight that, when a concrete algorithm is transformed (for example for the purpose of optimization), computation rules, that is, properties of the concretely given objects, can or must be applied. At machine level the physically concrete objects are, for instance, certain magnetization states in the magnetic core memory of a computer. It is certainly not a trivial step of thought to free oneself from the above assumption that the objects on which the algorithms operate are concretely given, and to rely solely on their abstract properties, that is, to abstract from the particulars of the concrete objects.
1.1 The Development in Mathematics

The capacity for abstraction is primarily a faculty of the human mind. Abstraction in mathematics is as old as mathematics itself; the axiomatic treatment of mathematical problems, which presupposes abstraction, came into full bloom only at the beginning of this century, although its roots reach back to antiquity (EUCLID). REDEI [7] calls E. STEINITZ the founder of modern algebra, whose famous paper [8] appeared in 1910. STEINITZ anchored the fundamental isomorphism principle in algebra: isomorphic structures are not to be regarded as essentially different. Accordingly, the nature of the elements of the structure (or, as we say, of the objects) plays no role in algebraic investigations; only their properties matter, namely those stated by the structure axioms or deducible from them. The notion of isomorphism already occurs in GALOIS (1811-1832), but the principle was first formulated by STEINITZ.

It is historically interesting that the axiomatic way of thinking developed not least in a discipline as concrete in origin as geometry. The path there is marked by names such as EUCLID, who in his proofs relied not on "common sense" but on the rules of logic; GAUSS, who brought down the parallel axiom; KLEIN, who dealt with structural equivalences in geometry in his "Erlanger Programm"; and HILBERT, who brought the axiomatics of geometry to a conclusion in his "Grundlagen der Geometrie".

The axiomatic method (as opposed to the constructive one) has prevailed in mathematics not least because it is economical: a mathematical theorem that is proved solely from the axioms of an algebraic structure holds for all models of that structure.

We note that the axiomatic method in mathematics entails abstraction from the concrete model, and this also corresponds to the historical development.

1.2 The Development in Computer Science

As a younger science, computer science can draw on the results, the methods, and also the ways of thinking of mathematics. Nevertheless a corresponding development, compressed into only a few years, seems to arise in computer science with a certain necessity. Methods built on abstraction alone appear to be the precondition and breeding ground for the introduction of an axiomatic point of view. Abstraction as such thus appears very early in computer science: UNCOL [6] from 1958 is a first, though at the time not very successful, attempt to solve the portability problem by abstraction alone (The Three-Level Concept ("UNCOL")). The UNCOL report [6] dates the first discussions of this concept back to 1954. The need for abstraction was then, and still is today, determined by the variety of different computers and by the fact that they become obsolete within a relatively short time. In general one can say that the idea of the higher-level programming language, as realized for example in FORTRAN and ALGOL 60, is the result of a process of abstraction.
In the construction of the "THE Multiprogramming System", DIJKSTRA [11,3] demonstrates the power of the abstraction principle in combination with the proof of a correct logical design (synchronization!). A paper by the author [12], in which an abstract plotting device is defined, likewise rests on abstraction. The abstract plotting device is realized there by a suitable parameterization of the software. In the years from 1962 on (McCARTHY [10]) it becomes ever clearer, in view of the "software crisis", that because of the overwhelming combinatorial complexity the correctness of programs cannot be adequately guaranteed by testing alone. Abstraction reduces complexity (DIJKSTRA [11] and [2]) and thus helps a step further. In the end, however, only the formal proof of the correctness of a program remains. This insight gains acceptance only hesitantly (McCARTHY [10] (1962), NAUR [13] (1966), FLOYD [14] (1967), deBAKKER [15] (1968), BURSTALL [16] (1968)). The paper by HOARE [1] from 1969, already quoted at the outset, is programmatic. In its introduction he states: "Computer programming is an exact science in that all the properties of a program and all the consequences of executing it in any given environment can, in principle, be found out from the text of the program itself by means of purely deductive reasoning. Deductive reasoning involves the application of valid rules of inference to sets of valid axioms. It is therefore desirable and interesting to elucidate the axioms and rules of inference which underlie our reasoning about computer programs." With this the axiomatic method is anchored in computer science. The axiomatic definition of the semantics of a programming language (FLOYD [14], HOARE, WIRTH [17]) as a "contract" between language designer, compiler writer and user, as a basis for formal proofs of program properties, and as an incentive for machine-independent use of the language, is a consistent and natural development which, as already mentioned in connection with UNCOL, is conditioned by the diversity of computers and their tendency to become obsolete quickly.

1.3 Further Development

Abstraction and the axiomatic, deductive method can be used not only to define the product, that is, the structure of a system, and to make it clear, transparent and manageable (DIJKSTRA [11], PARNAS [18]); they can also be used to structure the production process of a program, that is, the activity of programming. DIJKSTRA states this in his paper [2]. The construction of a program should begin with the most abstract form, that is, with an abstract program built from abstract statements that operate on abstract data structures (objects). He observes the phenomenon of joint refinement: when the abstract program is refined, that is, translated to a lower level of abstraction, the refinement (concretization) of the abstract operations has to go hand in hand with the refinement of the abstract data structures. Other work in this direction is due to NAUR [19], ZURCHER, RANDELL [21] and WIRTH [20]. The papers of MEALY [22] and BALZER [23], both from 1967, should also be mentioned here. A consistent step in this context is the attempt to support the methodology of structured programming by suitably designed programming languages (for example LISKOV, ZILLES [24]). It should also be mentioned that the methodology of "top-down program development" can be complemented and supported by "top-down teaching", as attempted for instance in BAUER, GOOS [25].
2. Structured Programming and Programmed Structures
DIJKSTRA's abstract program [2] is built from abstract statements (operations) and operates on abstract objects. This means that one abstracts from the particulars of the objects and of the operations and relies only on their structural, algebraic properties. To this end we first make the notion of an (algebraic) structure precise.

2.1 (Algebraic) Structures and Their Models
In our presentation we follow the concepts as they are found in LORENZEN [26] (and also in GERICKE [27]): An (algebraic) structure is represented by an axiom system, that is, by a system of statements that are assumed to hold. All statements that can be logically deduced from the axiom system then hold for this structure. Viewed as formulas, the statements of an axiom system are built from prime formulas, prime terms and possibly prime constants, that is, from certain prime symbols.
For example, the axioms

(A1)  $\forall u,v,w:\ u\cdot(v\cdot w) = (u\cdot v)\cdot w$   (associativity)
(A2)  $\forall u,v:\ u\cdot v = v\cdot u$   (commutativity)
(A3)  $\forall u,w:\ \exists v:\ u\cdot v = w$   (existence of a solution)

represent the structure of a commutative group. Two different axiom systems are said to represent the same structure if the same statements are logically derivable from both. As is well known, the structure of a commutative group can also be given by the following axiom system:

(B1)  as (A1)
(B2)  as (A2)
(B3)  $\forall u:\ \exists e:\ u\cdot e = u$   (existence of the identity)
(B4)  $\forall u:\ \exists v:\ u\cdot v = e$   (existence of the inverse)

The proof that both systems give the same structure is carried out by logically deducing (A1)-(A3) from (B1)-(B4) and vice versa.
LORENZEN now introduces the notion of a Gebilde: a set together with a system of relations and functions (operations) defined on it is called a Gebilde (a concrete relational system). For instance, the set Z of the integers with the usual equality relation = and the usual addition + is a Gebilde (Z, =, +). If certain relations, functions and elements of a Gebilde are assigned one-to-one to the prime symbols of an axiom system, this assignment is called an interpretation of the axiom system. The object variables of the axioms are then to be interpreted as variables for elements of the set of the Gebilde. If under an interpretation all axioms of the system become true statements, the Gebilde is called a model of the axiom system, and the set of the Gebilde is called a carrier of the structure represented by the axiom system. The Gebilde (Z, =, +) mentioned above is, as is well known, a model of the axiom system given by (A1)-(A3); one also says that Z carries the structure of a commutative group or, more briefly, that Z is a commutative group. Under the interpretation the axioms (A1)-(A3) become the statements

(A1z)  $\forall u,v,w \in Z:\ u+(v+w) = (u+v)+w$
(A2z)  $\forall u,v \in Z:\ u+v = v+u$
(A3z)  $\forall u,w \in Z:\ \exists v \in Z:\ u+v = w$

All statements derivable from the axiom system also hold in the model of the axiom system, and indeed in every model. This is the reason why, as already indicated in 1.1, the axiomatic method is economical. An axiomatic point of view pays off when the same systems of statements have been found in applications of different kinds. One then speaks of structural equivalence (Strukturgleichheit).
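The notion of interpretation can be tried out mechanically. The following Python sketch is an illustration only, not part of the original paper: it spot-checks (A1)-(A3) on a finite sample of a candidate Gebilde. The function name `satisfies_a1_a3`, the sample ranges and the restriction of the search for a solution in (A3) to a finite set of witnesses are all assumptions made for this example; a finite check of this kind is evidence, not a proof that the Gebilde is a model.

```python
from itertools import product


def satisfies_a1_a3(sample, witnesses, op, eq):
    """Spot-check axioms (A1)-(A3) on a finite sample of the carrier set.

    `witnesses` is the finite set in which a solution v of u.v = w
    is searched for (A3). All names here are hypothetical.
    """
    a1 = all(eq(op(u, op(v, w)), op(op(u, v), w))
             for u, v, w in product(sample, repeat=3))
    a2 = all(eq(op(u, v), op(v, u))
             for u, v in product(sample, repeat=2))
    a3 = all(any(eq(op(u, v), w) for v in witnesses)
             for u, w in product(sample, repeat=2))
    return a1 and a2 and a3


same = lambda u, v: u == v

# (Z, =, +): the interpretation discussed in the text.
add_ok = satisfies_a1_a3(range(-5, 6), range(-10, 11), lambda u, v: u + v, same)

# (Z, =, -): subtraction is neither associative nor commutative.
sub_ok = satisfies_a1_a3(range(-5, 6), range(-10, 11), lambda u, v: u - v, same)

# Positive integers with +: (A3) fails, e.g. 3 + v = 1 has no positive solution.
nat_ok = satisfies_a1_a3(range(1, 6), range(1, 11), lambda u, v: u + v, same)

print(add_ok, sub_ok, nat_ok)
```

Only the first interpretation passes the check, which matches the statement that (Z, =, +) carries the structure of a commutative group.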
2.2 Abstract Programs
DIJKSTRA's abstract programs can be understood as programs that operate in a structure in the sense of 2.1. Abstract programs are written by merely assuming the existence of a model of a structure, without concern for the particulars of the model. Only the abstract (model-independent) properties, ultimately the structure axioms, can then enter into the construction of the program. This is what the title of this paper, "Programmierte Strukturen" (programmed structures), is meant to express. It does of course take some reflection to see that statements about an abstract program are statements that can be derived from the axioms of the structure and therefore hold for all models of the structure. The argument can be sketched briefly as follows. Consider the totality of terminating computations that the program can carry out for the various parameter constellations. The value of the result variables can then be described as a finite formula in the input parameters. These formulas are built from prime symbols of the structure. Statements about these formulas can thus be regarded as statements about the program. Suppose, for example, that a very simple program fragment

    x := a, y := b; if x=y then x := S(x,y) else y := T(x,y) fi

is given, where S and T are two operations of an (abstract) structure. If P(a,b) holds immediately before execution of the fragment, then immediately after its execution

(1)  $P(a,b) \land ((a=b \land x=S(a,b) \land y=b) \lor (a\neq b \land x=a \land y=T(a,b)))$

holds. This is certainly an admissible formula in our structure (provided P is admissible), since apart from the logical connectives and the equality relation it contains only the operations S and T. If one now knows that P(a,b) stands for $a \neq b$, one can deduce from the formula above

(2)  $a\neq b \land x=a \land y=T(a,b)$

This deduction can be carried out solely with the rules of a logical calculus; the axioms that represent our structure were not needed at all. If the operation T is commutative, that is, if the structure axiom $\forall u,v:\ T(u,v) = T(v,u)$ holds, then with the help of this axiom the statement

(3)  $a\neq b \land x=a \land y=T(b,a)$

can be deduced, which is a structure-dependent statement. It is not our intention here to justify the axiomatic treatment of program properties; for this see, for example, MANNA, PNUELI [28]. What is of interest here is rather the significance of the structure axioms for the activity of programming. For example, the commutativity of T permits the transformation of the program into the following form

    x := a, y := b; if x=y then x := S(x,y) else y := T(y,x) fi,

which has the statement

(1')  $P(a,b) \land ((a=b \land x=S(a,b) \land y=b) \lor (a\neq b \land x=a \land y=T(b,a)))$

as a consequence. Because of the commutativity of T, and only because of it, (1') is equivalent to (1) for every predicate P.

We want to make the connection between structure axioms and abstract programs clear with a further example. Let a semigroup M be given, that is, we assume that there exists a Gebilde (M, =, ·) for which the associative law (cf. A1)

(A)  $\forall x,y,z \in M:\ x\cdot(y\cdot z) = (x\cdot y)\cdot z$

holds.
For every element $a \in M$ one can consider the expression $a\cdot a\cdot\ldots\cdot a$ with n factors (abbreviated $a^n$), where n is a positive integer 1); this expression is called the n-th power of a. The n-th power can now be regarded as a function

    $p: M \times N \to M$

The semiring of the positive integers thereby becomes the operator domain of the semigroup M. On the basis of the associative law the computation rules

(P1)  $a^n \cdot a^m = a^{n+m}$
(P2)  $(a^n)^m = a^{n \times m}$

with $a \in M$ and $n,m \in N$ can be proved for powers.

1) The set N of the positive integers with the usual equality relation =, addition + and multiplication ×, that is, the Gebilde (N, =, +, ×), carries the structure of a commutative semiring with identity element with respect to multiplication. Thus the axioms (B1) and (B2) hold for addition, the axioms (B1), (B2) and (B3) hold for multiplication, and in addition the distributive law holds, that is,

(D)  $\forall a,b,c \in N:\ a\times(b+c) = (a\times b)+(a\times c)$

These computation rules can be exploited to compute the n-th power recursively:

(4)  $a^n = a$ if $n=1$, and $a^n = a\cdot a^{n-1}$ otherwise.

This can be rewritten directly as a procedure p1:

(4')  proc p1 = (M a, N n) M:
          if n=1 then a else a·p1(a,n-1) fi
Here only the computation rule (P1) was exploited; a more efficient possibility results if (P2) is exploited as well. For even n we have

    $a^n = (a^2)^{n/2}$

and for odd n>1

    $a^n = a\cdot(a^2)^{(n-1)/2}$

Thus we obtain

(5)  $a^n = a$ if $n=1$;  $a^n = (a^2)^{n/2}$ if n is even;  $a^n = a\cdot(a^2)^{(n-1)/2}$ otherwise,

or rewritten 1)

(5')  proc p2 = (M a, N n) M:
          if n=1 then a
          elsf even n then p2(qua(a), n/2)
          else a·p2(qua(a), (n-1)/2)
          fi,

with  proc qua = (M a) M: a·a .

1) Note that the transition from (4) to (4') and from (5) to (5') is really only of an orthographic nature.
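The two procedures can be tried out in any language that lets the semigroup operation be passed as a parameter. The following Python sketch is an illustration under that assumption, not the paper's notation: `op` plays the role of the operation · of the semigroup M, and the helper `counted`, which counts applications of the operation, is introduced only for this example.

```python
def p1(op, a, n):
    """Procedure (4'): uses only rule (P1); n is a positive integer."""
    if n == 1:
        return a
    return op(a, p1(op, a, n - 1))


def qua(op, a):
    """Square of a with respect to the semigroup operation."""
    return op(a, a)


def p2(op, a, n):
    """Procedure (5'): uses (P1) and (P2); n is a positive integer."""
    if n == 1:
        return a
    if n % 2 == 0:
        return p2(op, qua(op, a), n // 2)
    return op(a, p2(op, qua(op, a), (n - 1) // 2))


def counted(op):
    """Wrap an operation so that its applications are counted (hypothetical helper)."""
    calls = [0]

    def wrapped(x, y):
        calls[0] += 1
        return op(x, y)

    return wrapped, calls


concat = lambda x, y: x + y   # strings with concatenation: a non-commutative semigroup
times = lambda x, y: x * y    # positive integers with multiplication

op1, calls1 = counted(times)
op2, calls2 = counted(times)
r1 = p1(op1, 3, 13)
r2 = p2(op2, 3, 13)
print(r1 == r2, calls1[0], calls2[0])
```

Both procedures rely only on associativity, so the same text works for string concatenation as well as for integer multiplication; for n = 13 the second procedure needs markedly fewer applications of the operation than the first.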
This is a form of power computation that is frequently used, since the test whether n is even and the division by 2 are easy to realize. The computation rules (P1) and (P2), however, also leave room for other methods:

(6)  proc p3 = (M a, N n) M:
         if n = 1 then a
         elsf n = 2 then qua(a)
         elsf n mod 3 = 0 then cub(p3(a, n/3))
         elsf n mod 3 = 1 then a·cub(p3(a, (n-1)/3))
         else qua(a)·cub(p3(a, (n-2)/3))
         fi

with  proc cub = (M a) M: a·qua(a).

We do not want to enter here into a closer discussion of (6), or into the question of when (6) might be more advantageous than (5'). Naturally the three procedures p1, p2 and p3 are equivalent in the sense that they represent the same function.

If the semigroup M has an identity element, then in addition to axiom (A) we have the axiom

(E)  $\exists e \in M:\ \forall x \in M:\ x\cdot e = e\cdot x = x$

The identity element is uniquely determined: if besides e there is a second one, e', then $e\cdot e' = e$ follows, but also $e\cdot e' = e'$, hence $e = e'$. We can therefore use a definite description (Kennzeichnungsterm) to introduce a freely chosen name for this identity element:

    M eins = $\iota e \in M:\ \forall x \in M:\ x\cdot e = e\cdot x = x$.

If axiom (E) holds, the definition of the power can be extended to the set N∪{0} of the non-negative integers, which, as is well known, is done by setting $a^0$ = eins. Since every semigroup with identity element is a semigroup, one will moreover require that the definition of the power for the former structure be compatible with that for the latter. We thus obtain the procedure

(7)  proc q = (M a, N∪{0} n) M:
         if n = 0 then eins else p(a,n) fi,

where p can be identified with p1, p2 or p3, for example

    proc(M,N)M p = p2.

Falling back on an already existing procedure when constructing q can be very useful in the sense of a defensive programming style, that is, one that preserves the correctness of a program. This approach is also advisable with regard to ease of modification: should one later actually want to work with p3 instead of p2, the change can be made by replacing the identity declaration

    proc(M,N)M p = p2

by

    proc(M,N)M p = p3.

If one now adds to the axioms (A) and (E) the existence of the inverse,

(I)  $\forall x \in M:\ \exists y \in M:\ x\cdot y = y\cdot x = \mathit{eins}$,

one obtains the structure of a group. Since the inverse is uniquely determined because of

    $y = y\cdot(x\cdot y') = (y\cdot x)\cdot y' = y'$,

we can again write down formally, with the help of a definite description, the intention of the function that assigns to each element its inverse, and introduce a freely chosen name for this function. We proceed as in the introduction of the name eins:
    proc invers = (M x) M: $\iota y \in M:\ y\cdot x = x\cdot y = \mathit{eins}$

We call this declaration intentional, because it merely states the property of the function values but not a method for constructing them. Correspondingly, the declaration of the freely chosen name eins is also intentional, since it states a property of an object that determines it uniquely, but no method for selecting this object from the set M (which is in any case not concretely given).

The usual extension of the power for groups to the set Z of the integers, preserving compatibility, then immediately yields the procedure

(8)  proc r = (M a, Z n) M:
         if n < 0 then q(invers(a), -n) else q(a,n) fi

A modification of the methods for computing the power may result if further structural properties are known, for example if certain elements are idempotent or of finite order.

We note that the prime symbols that are also used in the axiom system of the associated structure can occur in abstract programs, for example the name M of a set (disguised as a "mode indication") or the operation symbol · . In addition there are intentional declarations for freely chosen names of objects, relations or functions.
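The procedures (6), (7) and (8) and the exchange of p by a single identity declaration can be mimicked as follows. This Python sketch is an illustration only: the function `make_q_r` and its parameter `p` stand in for the identity declaration `proc(M,N)M p = ...`, the negative branch of r is read here as q(invers(a), -n), which is an assumption, and the rational numbers from Python's `fractions` module serve merely as one convenient group to run the abstract text on.

```python
from fractions import Fraction


def qua(op, a):
    return op(a, a)


def cub(op, a):
    return op(a, qua(op, a))


def p2(op, a, n):
    """Binary power, procedure (5'); n >= 1."""
    if n == 1:
        return a
    if n % 2 == 0:
        return p2(op, qua(op, a), n // 2)
    return op(a, p2(op, qua(op, a), (n - 1) // 2))


def p3(op, a, n):
    """Ternary power, procedure (6); n >= 1."""
    if n == 1:
        return a
    if n == 2:
        return qua(op, a)
    if n % 3 == 0:
        return cub(op, p3(op, a, n // 3))
    if n % 3 == 1:
        return op(a, cub(op, p3(op, a, (n - 1) // 3)))
    return op(qua(op, a), cub(op, p3(op, a, (n - 2) // 3)))


def make_q_r(op, eins, invers, p):
    """Build q (7) and r (8) on top of an exchangeable power procedure p."""
    def q(a, n):
        return eins if n == 0 else p(op, a, n)

    def r(a, n):
        # Assumed reading of (8): invert first, then take the positive power.
        return q(invers(a), -n) if n < 0 else q(a, n)

    return q, r


mul = lambda x, y: x * y
inv = lambda x: 1 / x

q2, r2 = make_q_r(mul, Fraction(1), inv, p2)   # "p = p2"
q3, r3 = make_q_r(mul, Fraction(1), inv, p3)   # "p = p3": the only line that changes

print(r2(Fraction(2, 3), -3), r3(Fraction(2, 3), -3))
```

Because q and r refer to the power procedure only through `p`, switching from p2 to p3 touches one line, which is the point made in the text about ease of modification.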
2.3 Concrete Programs
The refinement of an abstract program, as it occurs in DIJKSTRA [2] or in WIRTH [20], basically means nothing other than the interpretation of a structure in the sense of 2.1, where the treatment of the intentional declarations of the abstract program will need special discussion. The interpretation of the structure is carried out by identifying the prime formulas, the prime terms and the prime constants of the axioms with the corresponding constructions of a (given) Gebilde. The interpretation of a structure is extended directly to the abstract programs belonging to it by adding the corresponding identity declarations to them. (From the algebraist's standpoint the identity declarations of ALGOL 68 are, for example, the adequate approach here, although it still needs generalization.) It then has to be shown that under the chosen interpretation the Gebilde is a model of the structure, that is, that the structure axioms become true statements for the Gebilde. If the Gebilde is a model of the structure, then all statements that can be made about the abstract program on the basis of the structure axioms hold for the program "refined" by the interpretation. The fact that this is true for every abstract program of the structure and for every model of the structure is the basis of the economy of effort mentioned several times above.
Let us consider again the computation of the power in a semigroup. The set N of the positive integers with the usual equality and addition forms a model of a semigroup. If N also denotes the mode of the positive integers, then we have the identity declaration

    mode M = N

and if equal and plus denote the equality relation and the addition for natural numbers, then we have the declarations

    op = = equal, · = plus

or, in more detail,

    op = = (N x,y) bool: x equal y;
    op · = (N x,y) N: x plus y
We now have to show that the addition of the natural numbers is associative. 1) To this end we represent the natural numbers by vertical strokes

    |, ||, |||, ...

so that the number of strokes agrees with the natural number represented. We define the addition of two natural numbers as the concatenation of the stroke sequences. Once it is proved that this concatenation is associative (and for brevity we assume this here), the associativity of addition is shown as well. Two remarks on this:

(a) The stroke sequences above, with concatenation as the operation, themselves form a model of the natural numbers that permits a constructive proof of the computation rules (LORENZEN [26]).

(b) The binary representation of numbers and its physical realization in computers, which builds on it, provide a model for a part of the natural numbers. Our considerations thus extend all the way down to the microprograms of computers (BAUER [5]).

Another model of a semigroup is formed by the objects of mode string with concatenation as the operation, or by the self-maps of a set with composition as the operation. The rational numbers with the usual multiplication as the operation form a model of a group. The interpretation of the abstract programs belonging to the group structure can be given by the following identity declarations:

    mode M = struct(int z, n); 2)
    op = = (M a,b) bool: z of a × n of b = n of a × z of b,
    op · = (M a,b) M: (z of a × z of b, n of a × n of b)

1) Besides associativity, the axioms that are implicitly assumed must of course also be verified for the model, namely the closure of the operation and the fact that equal is an equivalence relation.

2) Zero has to be excluded, which could be done by the declaration
    mode M = (struct(int z, n) a) bool: n of a ≠ 0
(cf. for example GNATZ [29, 30]).
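The identity declarations for the rational numbers translate almost literally into other languages. The following Python sketch is an illustration only: the class name `M` and the field names `z` and `n` follow the declarations above, while `equal`, `times` and `power` are names chosen for this example, and `power` is the binary power procedure p2 written out for this one model.

```python
from typing import NamedTuple


class M(NamedTuple):
    z: int  # numerator
    n: int  # denominator, assumed to be nonzero (cf. footnote 2)


def equal(a: M, b: M) -> bool:
    """Interpretation of the prime symbol =: cross-multiplication."""
    return a.z * b.n == a.n * b.z


def times(a: M, b: M) -> M:
    """Interpretation of the operation symbol: componentwise product."""
    return M(a.z * b.z, a.n * b.n)


def power(a: M, k: int) -> M:
    """Procedure p2 instantiated with `times`; k is a positive integer."""
    if k == 1:
        return a
    square = times(a, a)
    if k % 2 == 0:
        return power(square, k // 2)
    return times(a, power(square, (k - 1) // 2))


half, two_quarters = M(1, 2), M(2, 4)
print(equal(half, two_quarters), half == two_quarters)
print(power(M(2, 3), 5))
```

The sketch also shows why the equality relation belongs to the interpretation: the pairs (1, 2) and (2, 4) are different tuples, yet they denote the same element of the model, so all statements about the abstract program have to be read with respect to `equal`.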
For the power in groups we have to deal with the two intentional declarations for eins and invers. The simplest way is to replace the intentional declarations:

    M eins = (1, 1),
    proc invers = (M a) M: if z of a ≠ 0 then (n of a, z of a) fi

The proof that these interpretations are admissible would have to be carried out formally; here, however, we content ourselves with a reference to the relevant textbooks of algebra (for example REDEI [7]).

Another model of a group, not isomorphic to the preceding one, is formed for example by the permutations of the numbers 1 to 4:

    mode M = ([1:4] int a) bool:
        ( bool b := true;
          for i to 4 do
              b := 1≤a[i] ∧ a[i]≤4 ∧ b;
              for j from i+1 to 4 do b := a[i]≠a[j] ∧ b od
          od;
          b ),
    op = = (M a,b) bool:
        ( bool b := true;
          for i to 4 do b := a[i]=b[i] ∧ b od;
          b ),
    op · = (M a,b) M:
        ( M c;
          for i to 4 do c[i] := a[b[i]] od;
          c )

We replace the intentional declarations by

    M eins = (1, 2, 3, 4),
    proc invers = (M a) M:
        ( M c;
          for i to 4 do c[a[i]] := i od;
          c )
461
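The permutation model can be checked exhaustively, since it has only 24 elements. The following Python sketch mirrors the declarations above under the assumption that a permutation is stored as a 4-tuple of the numbers 1 to 4; the names `compose`, `EINS`, `invers` and `ALL` are illustrative choices.

```python
from itertools import permutations

# All 24 permutations of the numbers 1..4, each stored as a tuple.
ALL = list(permutations(range(1, 5)))

EINS = (1, 2, 3, 4)  # identity permutation


def compose(a, b):
    # c[i] := a[b[i]]  (1-based indices stored in 0-based tuples)
    return tuple(a[b[i] - 1] for i in range(4))


def invers(a):
    # c[a[i]] := i
    c = [0] * 4
    for i in range(1, 5):
        c[a[i - 1] - 1] = i
    return tuple(c)


def is_group() -> bool:
    # Exhaustive check of identity, inverse and associativity laws.
    unit = all(compose(a, EINS) == a == compose(EINS, a) for a in ALL)
    inv = all(compose(a, invers(a)) == EINS == compose(invers(a), a) for a in ALL)
    assoc = all(
        compose(compose(a, b), c) == compose(a, compose(b, c))
        for a in ALL for b in ALL for c in ALL
    )
    return unit and inv and assoc


def is_commutative() -> bool:
    return all(compose(a, b) == compose(b, a) for a in ALL for b in ALL)
```

That `is_commutative()` is false here, while multiplication of rationals commutes, is one simple way to see that the two models cannot be isomorphic.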
Here again one would have to prove that the structure is a group, or rather that the substitution of the intentional declarations is admissible. We thus note that the models of an algebraic structure need not be isomorphic. The classical example of this from algebra is the Euclidean algorithm for computing the greatest common divisor: it works for the integers, for the Gaussian integers, for polynomials over the integers, and so on. Here too, then, the same abstract program can be adapted to different models.
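The adaptation of one abstract program to different models can be sketched as follows. This is a Python illustration under stated assumptions: the abstract program `euclid` only needs a zero test and a remainder operation, Gaussian integers are represented as pairs (real part, imaginary part), and the quotient is rounded to the nearest Gaussian integer. All names are hypothetical.

```python
def euclid(a, b, is_zero, mod):
    """Abstract Euclidean algorithm: only a zero test and a remainder are needed."""
    while not is_zero(b):
        a, b = b, mod(a, b)
    return a


# Model 1: ordinary integers.
def int_is_zero(x):
    return x == 0


def int_mod(a, b):
    return a % b


# Model 2: Gaussian integers as pairs (re, im).
def gauss_is_zero(x):
    return x == (0, 0)


def gauss_mod(a, b):
    (ar, ai), (br, bi) = a, b
    norm = br * br + bi * bi
    # a * conj(b); dividing by norm gives the exact quotient a / b.
    re = ar * br + ai * bi
    im = ai * br - ar * bi
    # Round each component to the nearest integer.
    qr = (2 * re + norm) // (2 * norm)
    qi = (2 * im + norm) // (2 * norm)
    # Remainder r = a - q * b.
    return (ar - (qr * br - qi * bi), ai - (qr * bi + qi * br))


def gcd_int(a, b):
    return euclid(a, b, int_is_zero, int_mod)


def gcd_gauss(a, b):
    return euclid(a, b, gauss_is_zero, gauss_mod)
```

In the Gaussian model the result is determined only up to a unit factor (1, -1, i, -i), which is a property of the model and not of the abstract program.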
On the basis of the structure axioms, or of the computational laws derived from them, several logically equivalent versions of an abstract program can occasionally be developed, such as the routines p1, p2 and p3 for the power in semigroups. The question of which of the logically equivalent versions to choose for a concrete problem is as a rule of no interest at all to the mathematician, but all the more to the computer scientist because of the optimization possibilities connected with it. Occasionally this question cannot be decided at the abstract level at all and can only be settled once the concrete model is known. For example, it is expedient to realize the ordinary multiplication of natural numbers, when it has to be reduced to addition, by means of the routine p2. If, however, one wants to raise polynomials to a power, the use of the routine p3 may be considerably more expedient.

Occasionally it will happen that equivalent program transformations can only be carried out after the adaptation to a model, that is, on a lower level of abstraction. Thus for the rational numbers it may be expedient to replace the operations qua and cub by

    proc qua = (M a) M: (z of a ↑ 2, n of a ↑ 2)
    proc cub = (M a) M: (z of a ↑ 3, n of a ↑ 3)

In the multiplication of natural numbers, however, qua turns at the microprogram level into the left shift of a binary word.

3. Concluding Remark

The author is well aware that the procedures presented here are not new. He sees in the attempt to make the connections between the art of programming and the deductive methods of abstract algebra visible and more conscious a contribution to the methodology of programming or, to speak with DIJKSTRA [3], "a relevant step in the process of transforming the Art of Programming into the Science of Programming". This contribution should quite generally encourage the use of formal mathematical methods and ways of thinking in the activity of programming, such as the isomorphism principle or, as it is called in REDEI [7], the STEINITZ principle. The discovery of structural identities in the most diverse fields of application of electronic computers, with the aim of crystallizing out standard tasks and treating them in a generally valid way, has long been a declared, central concern of computer science.
References

[1] C. A. R. Hoare, "An Axiomatic Basis for Computer Programming", Comm. ACM, 12, 10, (1969) 576-581.
[2] E. W. Dijkstra, "Structured Programming", in: J. N. Buxton and B. Randell (ed.), "Software Engineering Techniques", Report on a conference at Rome, Oct. 27-31, 1969.
[3] E. W. Dijkstra, "A Constructive Approach to the Problem of Program Correctness", BIT 8 (1969) 174-186.
[4] E. W. Dijkstra, "A Simple Axiomatic Basis for Programming Language Constructs" (Report EWD 372). Lectures given at the Marktoberdorf Summer School 1973.
[5] F. L. Bauer, "A Philosophy of Programming", Lectures given at the Imperial College of Science and Technology, University of London (1973).
[6] I. Strong et al., "The Problem of Programming Communication with Changing Machines", Comm. ACM, 1, 8 (1958) 12-18.
[7] L. Redei, "Algebra I", Akademische Verlagsgesellschaft Geest u. Portig K.G., Leipzig (1959).
[8] E. Steinitz, "Algebraische Theorie der Körper", J. reine angew. Math., 137, (1910) 167-309.
[9] Iu. I. Ianov, "On the Equivalence and Transformation of Program Schemes", Doklady, AN USSR, 113, 1, (1957) 39-42 (translated into English by Morris D. Friedman).
[10] J. McCarthy, "Towards a mathematical theory of computation", Proc. IFIP Cong. 1962, North Holland Pub. Co., Amsterdam, (1963).
[11] E. W. Dijkstra, "The Structure of the 'THE'-Multiprogramming System", Comm. ACM, 11, 5 (1968) 341-346.
[12] R. Gnatz, "DII: A Plotter-Independent Intermediate Language for Graphical Output", Calcolo, Suppl. IX, (1972) 69-92.
[13] P. Naur, "Proof of algorithms by general snapshots", BIT 6 (1966) 310-316.
[14] R. W. Floyd, "Assigning Meanings to Programs", Proc. Amer. Math. Soc., Symposia in Applied Mathematics, 19, (1967) 19-32.
[15] J. W. deBakker, "Axiomatics of simple assignment statements", M.R. 94, Mathematisch Centrum, Amsterdam (1968).
[16] R. Burstall, "Proving properties of programs by structural induction", Experimental Programming Reports: Nr. 17 DMIP, Edinburgh, (1968).
[17] C. A. R. Hoare and N. Wirth, "An Axiomatic Definition of the Programming Language PASCAL", Acta Informatica 2, 4, (1973) 335-355.
[18] D. L. Parnas, "Information distribution aspects of design methodology", Proc. IFIP Cong. 71, Booklet TA-3, (1971) 26-30.
[19] P. Naur, "Programming by Action Clusters", BIT, 9, (1969) 250-258.
[20] N. Wirth, "Program Development by Stepwise Refinement", Comm. ACM, 14, 4, (1971) 221-227.
[21] F. W. Zurcher and B. Randell, "Iterative Multi-Level Modelling, a Methodology for Computer System Design", Proc. IFIP Cong. 68, North-Holland Pub. Co., Amsterdam, (1969) 867-871.
[22] G. Mealy, "Another look at data", Proc. AFIPS, 31, (1967) 525-534.
[23] R. M. Balzer, "Dataless programming", Proc. AFIPS, 31, (1967) 557-566.
[24] B. Liskov, S. Zilles, "Programming with Abstract Data Types", SIGPLAN Notices, Proc. Symposium on Very High Level Languages, 9, 4, (1974) 50-59.
[25] F. L. Bauer, G. Goos, "Informatik, eine einführende Übersicht", 2 volumes, Springer (1971).
[26] P. Lorenzen, "Metamathematik", BI-Hochschultaschenbücher, 25, Mannheim (1962).
[27] H. Gericke, "Theorie der Verbände", BI-Hochschultaschenbücher, 38/38a, Mannheim (1963).
[28] Z. Manna and A. Pnueli, "Axiomatic Approach to Total Correctness of Programs", Acta Informatica, 3, 3 (1974) 243-263.
[29] R. Gnatz, "On Basic Concepts of Higher Graphic Languages", in: F. Nake and A. Rosenfeld: "Graphic Languages", North-Holland Pub. Co., Amsterdam (1972) 302-320.
[30] R. Gnatz, "Sets and Predicates in Programming Languages", Lectures given at the Marktoberdorf Summer School (1973).
AXIOMATIZATION OF PROGRAMMING LANGUAGES AND ITS LIMITS

Günter Hotz

Summary: For realistic programming languages there is no finite axiom system from which the relevant theorems of a theory of the programming language can be derived. The portability of programs cannot be fully guaranteed by correctness proofs within the framework of an axiomatic theory.
Introduction: This colloquium covers practical and theoretical topics, and the group of participants is correspondingly heterogeneous. I may therefore be permitted to explain, by means of classical examples, the points of view that guide us in the attempt at axiomatization.

The first example of an axiomatized theory is Euclidean geometry. Geometric insights and their logical interrelation had attracted great interest centuries before Euclid, although no need for a systematic penetration of the accumulated knowledge can be demonstrated for those periods.

The axiomatic interest consists in deriving the accumulated insights logically from as few results of the theory as possible, results that are as evident as possible and free of contradiction. These results are now accepted as given. From the formal point of view they need nothing for their justification except their consistency. This foundation for a systematic penetration of knowledge is called an axiom system. The axiom system is called complete if all other insights of the theory can be derived from the axioms by means of formal rules, that is, without using an interpretation of the content of the axioms.

What do axioms of Euclidean geometry look like? In geometry one speaks about objects called points and segments, and about segments passing through points or points lying on segments, about lines intersecting and points lying on a line. The objects in question thus form sets P and S. The elements of P are the points; the elements of S are the segments or lines. That a point lies on a line, or that a line passes through a point, are two different names for the same state of affairs, which can be described by a relation

    $R_1 \subset P \times S$.

$(p, g) \in R_1$ simply means that p lies on g. That lines can intersect is expressed by a relation

    $R_2 \subset P \times S \times S$:

$(p, g_1, g_2) \in R_2$ means that $g_1$ and $g_2$ intersect in p. Correspondingly, "connecting" gives rise to a relation

    $R_3 \subset S \times P \times P$,

with the interpretation that $(g, p_1, p_2) \in R_3$ if and only if g passes through $p_1$ and $p_2$.

In the light of our intuitive interpretation the relations are not independent of one another. Rather, we have

    $(p, g_1, g_2) \in R_2 \Rightarrow (p, g_1) \in R_1$ and $(p, g_2) \in R_1$.

Further, for instance,

    $(g, p_1, p_2) \in R_3 \Rightarrow (g, p_2, p_1) \in R_3$

or

    $(g, p_1, p_2) \in R_3$ and $(g', p_1, p_2) \in R_3 \Rightarrow g = g'$.
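The three relations and the sample axioms can be checked mechanically on a small finite model. The following Python sketch uses a triangle with three points and three lines as the model; the restriction of the intersection and connection relations to distinct lines and distinct points is an assumption made here, as are all names.

```python
# A tiny model: the three vertices and the three sides of a triangle.
P = {"A", "B", "C"}
S = {"AB": {"A", "B"}, "BC": {"B", "C"}, "CA": {"C", "A"}}  # line -> its points

# Incidence: p lies on g.
R1 = {(p, g) for p in P for g in S if p in S[g]}
# Intersection: distinct lines g1 and g2 meet in p.
R2 = {(p, g1, g2) for p in P for g1 in S for g2 in S
      if g1 != g2 and p in S[g1] and p in S[g2]}
# Connection: g passes through the distinct points p1 and p2.
R3 = {(g, p1, p2) for g in S for p1 in P for p2 in P
      if p1 != p2 and p1 in S[g] and p2 in S[g]}


def intersection_implies_incidence(r1, r2):
    return all((p, g1) in r1 and (p, g2) in r1 for (p, g1, g2) in r2)


def connection_is_symmetric(r3):
    return all((g, p2, p1) in r3 for (g, p1, p2) in r3)


def connecting_line_is_unique(r3):
    return all(g == h
               for (g, p1, p2) in r3 for (h, q1, q2) in r3
               if (p1, p2) == (q1, q2))
```

A model in which all three checks succeed is exactly the kind of object the text uses to show that a set of axioms is free of contradiction.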
Such properties of relations, which we choose as the basis of our theory because they are "evident" or "desirable", are called axioms within the theory.

Axiom systems must be consistent and should be complete.

An axiom system is consistent if it is impossible to derive from it, by logical inference, both a proposition and the negation of that proposition. Consistency is shown by constructing a model.

An axiom system is called complete if every proposition of the theory that is true can be derived logically from the axioms.
From this it already follows for our problem that the claim to have defined a programming language axiomatically and completely must be regarded as a tautology unless the theory to be captured is circumscribed.

Our example interests us from yet another point of view: one distinguishes different kinds of geometries. As an example we mention projective geometry, which refers only to statements such as intersecting and connecting, without considering an ordering of the points on a line, that is, a pure incidence geometry $G_1$. By adding axioms of order one obtains a richer geometry $G_2$. One may now ask whether it would not have been more natural to build up geometry the other way round, namely by placing the axioms of order at the top. Purely formally this can be done, but hardly anyone will call a system that makes no statements about the intersection of lines and the connection of points a geometry. We note for later:

A theory that contains refinements in different directions possesses a natural hierarchy in its axioms.

We want to illustrate this by a second example, which will become important to us for yet another reason, namely the example of group theory.

In group theory we deal only with a set G of objects, one relation, namely multiplication, and a few axioms. A set G with an operation $T: G \times G \to G$ satisfying the group axioms is called a group.

As is well known, there are very different groups; for example, there are groups with 17 elements and groups with 31 elements. If one wants to reproduce a computation in the group with 17 elements inside the group with 31 elements, one runs into difficulties, although the group in which we want to reproduce the computation is larger. The reason, as is well known, is that there is only one homomorphism from the group with 17 elements into the group with 31 elements, the one that maps all elements to the identity element. The same holds for the "simulation of the computation" in the other direction.

From this example we keep in mind:

There are groups such that "computations" in one group yield the identity element as their result, while the "same" computations in other groups do not.

We further notice in this example that number-theoretic properties of the number of elements of a group play a prominent role in group theory, although the natural numbers are not introduced axiomatically in group theory. That is, results of number theory are used, but they are regarded as untypical or even disturbing in the area of interest.

This raises the question:

What are the results typical of a theory of programming languages, what is a theorem of the theory of programming languages, what is an axiom of the theory? Do the axioms of the natural numbers, for example, belong in a theory of programming languages?
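The statement about the groups with 17 and 31 elements can be verified by a short enumeration. The sketch below assumes the cyclic groups written additively as the integers modulo m and modulo n; a homomorphism is then determined by the image k of the element 1, and it is well defined exactly when m·k is 0 modulo n. The function name is illustrative.

```python
def homomorphisms(m: int, n: int) -> list[int]:
    """Images k of the generator 1 for which x -> k*x is a homomorphism
    from the integers mod m (under addition) to the integers mod n."""
    return [k for k in range(n) if (m * k) % n == 0]
```

For 17 and 31 the only admissible image is 0 in either direction, so every element is sent to the identity; for moduli with a common factor, such as 4 and 6, a non-trivial homomorphism exists.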
Before we explain a further important notion by means of groups, let us point out, in this example too, the hierarchical structure in axiom systems. One can, for example, pass from groups to abelian groups and, by adding further relations, to richer structures such as continuous groups, and so on.

Above we used the notion of a computation in groups. A computation here is a sequence of operations applied to a certain set of group elements. One can now try to carry out a computation not in concrete groups but symbolically, by admitting only the axioms of group theory for the transformation of expressions.

Example: if a and b are elements of any set, we form group-theoretic expressions such as

    $(a \cdot b)$, $(a \cdot b)^{-1}$

and compute with these expressions using the axioms. From the above expression one then obtains, for instance,

    $(a \cdot b)^{-1} \cdot (a \cdot b) = (b^{-1} \cdot a^{-1}) \cdot (a \cdot b) = ((b^{-1} \cdot a^{-1}) \cdot a) \cdot b = (b^{-1} \cdot (a^{-1} \cdot a)) \cdot b = b^{-1} \cdot b = 1$

Symbolic computation yields a group that is "generated" by the elements a and b; this group is the free group over {a, b}.

We now have enough material to illustrate the following discussion.

Two simple programs over groups.

Let $a \cdot b$ and $a^{-1}$ be group operations. We consider the program

    begin x1 := a; x2 := a; x3 := 1;
    m:  x1 := x1 · x1; x3 := x2 · x3;
        if x1 = 1 then goto l else goto m;
    l:  end

The program thus squares the content of x1 until 1 results. As often as it squares, it multiplies the content of x3 by a. After n passes through the loop we therefore have

    content(x1) = $a^{2^n}$ and content(x3) = $a^n$.

If we compute in a free group, the program never terminates.

If our group is the additive group of $\mathbb{Z}$, then after n cycles we have

    content(x1) = $2^n \cdot a$ and content(x3) = $n \cdot a$.

If we compute in the additive group $\mathbb{Z}_{15}$, then for a ≠ 0 the algorithm likewise never terminates (the identity element is here "0").

If we compute in the additive group $\mathbb{Z}_{2^m}$, the algorithm terminates after at most m steps. If we start the computation with a = 1 (here not the identity of the group), the algorithm terminates after exactly m steps with content(x3) = m.

We summarize:

A program with operations satisfying the group axioms may never terminate when computing in free groups, whereas when computing over finite groups it may or may not terminate. If the program terminates, the results of computations in different groups need have nothing to do with one another.
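The behaviour of the first program in the additive groups of integers modulo n can be observed by direct simulation. The following Python sketch is an illustration under stated assumptions: the group operation is addition modulo n, the identity is 0, and a bound `max_cycles` (introduced here) stands in for "never terminates", so a result of `None` only means that no termination was observed within the bound.

```python
def run(a: int, n: int, max_cycles: int = 1000):
    """Simulate the first program in the integers mod n under addition.

    Returns (number of cycles, content of x3) on termination,
    or None if the loop has not ended after max_cycles passes.
    """
    x1, x2, x3 = a % n, a % n, 0      # x3 := identity element
    for cycle in range(1, max_cycles + 1):
        x1 = (x1 + x1) % n            # x1 := x1 · x1
        x3 = (x2 + x3) % n            # x3 := x2 · x3
        if x1 == 0:                   # x1 = identity: goto l
            return cycle, x3
    return None
```

With n = 16 and a = 1 the loop ends after 4 cycles with x3 = 4, in line with the statement for groups of order 2^m; with n = 15 the content of x1 cycles through 2, 4, 8, 1 and never reaches 0.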
We now modify the example slightly:

    begin x1 := a; x2 := a; x3 := 1;
    m:  x1 := x1 · x1; x3 := x2 · x3;
        if x1 = 1 then goto m else goto l;
    l:  end

We have thus only exchanged the jump targets in the conditional statement. Now we have:

When computing in the free group the program always terminates. There are finite groups in which the program never terminates, and others in which it always terminates for a ≠ 0.
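The variant with exchanged jump targets can be simulated in the same way. The sketch below again assumes addition modulo n as the group operation and uses a hypothetical bound `max_cycles`; `None` means only that no termination was observed within that bound.

```python
def run_swapped(a: int, n: int, max_cycles: int = 1000):
    """Simulate the program with exchanged jump targets in the integers mod n.

    The loop is repeated while x1 equals the identity 0 and left otherwise.
    """
    x1, x2, x3 = a % n, a % n, 0
    for cycle in range(1, max_cycles + 1):
        x1 = (x1 + x1) % n            # x1 := x1 · x1
        x3 = (x2 + x3) % n            # x3 := x2 · x3
        if x1 != 0:                   # x1 ≠ identity: goto l
            return cycle, x3
    return None
```

In the group with two elements every doubling gives 0, so the loop is never left; in the group with 15 elements doubling a non-zero element never gives 0, so the program stops after the first pass.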
Axiomatization of Programming Languages and Portability of Programs.

The main impetus for the attempts to fix the semantics of programming languages axiomatically probably came from Floyd [1]. Chiefly through Hoare [2], Manna [5] and Wirth [3] this approach was advanced far theoretically, and in [3] it was put to a practical test in the definition of Pascal.

The justifications for this research may be summarized briefly.

1. A language that is formally defined only syntactically leaves too much latitude to the interpretation of the compiler writer. Programs will therefore deliver different results under different implementations of the language.
2. A language whose semantics is formally defined gives the user the possibility of proving the correctness of his program and thus, given axiomatically correct implementations of the language, of securing the portability of his program.
3. The formal definition of semantics by means of an abstract machine, as for example in the Vienna method [6], is too complicated to carry out correctness proofs on this basis.

We want to look at these justifications critically here.

The first reason can hardly be disputed by anyone. The necessity of replacing the verbal definition by a formal one is generally felt to be urgent.

The second justification is only partly valid, as our trivial examples from the preceding section show.

a) The two programs are syntactically correct for every implementation of the operations $a \cdot b$ and $a^{-1}$ that satisfies the group axioms. This says nothing about the termination of the programs, nor about their results in the case of termination.

b) The proof that a program computes a certain function, using the axioms alone, can guarantee correctness only for computations in the free group.
c) The implementation of programming languages on computers with different word lengths will, however, always lead to non-isomorphic models of the axiomatically defined algebra (group) and never to a realization of the free algebra.

Conclusion from a), b) and c). Without a precise investigation of the algebraic domains fixed in the definition of programming languages, justification 2 cannot be maintained in this general form. For programs whose correctness has been proved on an axiomatic basis, one will only be able to be sure that certain errors are excluded.

For the proof of "complete correctness" it always seems necessary to prove that the concrete machine "correctly" simulates the free machine (the machine that computes in the free algebras). If this proof has been given for one concrete machine, it is not self-evident that it is superfluous for the other machines.

Now to the third justification:

The definition of semantics by means of an abstract machine is indeed very laborious and can hardly be expected of a user of the language. First one must say that the above objection concerning the correctness on concrete machines of programs that are correct on the abstract machine applies here as well. If the axiomatic definition fixes a programming language to the same extent as the definition by means of an abstract machine, then the abstract machine represents the free machine sketched above. This raises the question of how laborious an equally complete axiomatic definition of a programming language becomes.

The extent of axiomatic definitions of programming languages:

We recall the scheme of axiomatic definitions described in the introduction. We had a set of sets of objects $P_1, P_2, \ldots$, a set of relations $R_1, R_2, \ldots$ and a set of axioms $A_1, A_2, \ldots$ that postulated relationships between the relations.

What are the objects in a theory of programming languages?

To answer this question the only possibility is to look in the literature to see which objects occur in definitions of programming languages and in statements about programming languages, in order then to single out the objects typical of programming languages.
In the paper by Floyd [1] some forms of axioms for programming languages are given that are generally accepted. One example is:

For all $P_1$, $P_2$, $Q_1$, $Q_2$ and C we have

    $P_1 \{C\} Q_1 \;\&\; P_2 \{C\} Q_2 \Rightarrow (P_1 \wedge P_2) \{C\} (Q_1 \wedge Q_2)$

Here the $P_i$ and $Q_i$ are predicate expressions and C is an operational expression.

The $P_i$, $Q_i$ and C are thus objects that can stand in relations $P\{C\}Q$, and these relations stand in the connection given above. These are infinitely many axioms in which arbitrarily complicated objects occur.
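The conjunction axiom quoted from Floyd can be checked by brute force in a small set-theoretic interpretation. The following Python sketch assumes three states, reads predicates as sets of states and the operational expression C as a relation between states, and reads $P\{C\}Q$ as partial correctness; these choices and all names are illustrative, not taken from the cited paper.

```python
from itertools import combinations, product

STATES = (0, 1, 2)


def subsets(xs):
    """All subsets of xs as frozensets; each one plays the role of a predicate."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def holds(P, C, Q):
    """P{C}Q: every C-step that starts in P ends in Q."""
    return all(t in Q for (s, t) in C if s in P)


def conjunction_rule_holds(C):
    """Check P1{C}Q1 & P2{C}Q2 => (P1 ∧ P2){C}(Q1 ∧ Q2) for all predicates."""
    preds = subsets(STATES)
    for P1, Q1, P2, Q2 in product(preds, repeat=4):
        if holds(P1, C, Q1) and holds(P2, C, Q2):
            if not holds(P1 & P2, C, Q1 & Q2):
                return False
    return True
```

The check succeeds for every relation C, since a step that starts in both $P_1$ and $P_2$ must end in both $Q_1$ and $Q_2$; the enumeration merely makes the infinitely many instances of the axiom scheme tangible on a finite example.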
These and the further axioms given by Floyd say precisely that programming languages are built up from parts C that are to be interpreted as relations in the set-theoretic sense. Little can be objected to this. These infinitely many axioms have a very simple form, and under natural assumptions about the predicate expressions they are enumerable in the sense of recursive function theory.

If programs could always be built up from primitive elements $C_i$ by concatenation $C_1; C_2; \ldots; C_k$, one would be finished. Unfortunately, the real problem only begins here. The construction of programs C from primitive structures $C_1, \ldots$ is very complicated, as is evident from the syntactic description of programming languages.
In [3] Hoare and Wirth attempted to capture axiomatically a substantial part of the semantics of the very clearly structured programming language Pascal. In doing so, nearly all of the basic syntactic notions of the language definition of Pascal do in fact appear (perhaps it is all of them; I have not checked). In my opinion it should be all of them, unless superfluous notions have crept in through the syntactic means of the language. These, however, amount to something on the order of 50 notions. This means that for a complete axiomatic description of a higher programming language we must reckon with an enormously large axiom system. Two questions arise here immediately:

Can this axiom system not be very systematic? After all, we saw this above in the example of an infinite axiom system.

What does the word completeness mean here?

The first question cannot be answered independently of the second.
Nor can this question be answered by starting only from existing programming languages.

More precisely: by starting from the definition of existing languages. It is conceivable that, by severely restricting the means of syntactic description through the almost exclusive use of context-free Chomsky languages, or of semi-Thue systems in general, we have created a Procrustean bed for ourselves.

What strikes one at first glance is the following: apart from Floyd's axioms and their extension by Manna [5], which concerns the treatment of "while", the proposed axiom systems possess no hierarchical structure.

Summarizing this section, we note that the axiomatic definition of the semantics of programming languages also becomes very laborious, and, as it seems, necessarily so.
The structuring of axiom systems:

If we look at the attempts at a thoroughly axiomatic construction of mathematics, we discover similarly extensive axiom systems there as well. Only, a mathematician practically never encounters them. He sets up a field of work axiomatically, but works on it with naive use of all the means available in mathematics.

In the theory of programming or of programming languages we so far lack the insight whether and how areas can be delimited that can lead independent lives side by side. For example, the answer to the following question is completely missing: what is an axiom of a theory of programming languages and what is not?

Do the axioms for the natural numbers have any more place in a theory of programming languages than in group theory?

An answer to the first of the two questions would be a step towards a hierarchical structuring:

One first develops a theory of programming languages, for example with free types. That is, the use of the type alphabet is not restricted by axioms. One considers the interpretations compatible with the general "type axioms". An example of the treatment of these types can be found for instance in [4].

One adds the distinguished type boolean.

Procedure techniques.

Finally, axiom systems for integer, real, and so on.

Somewhere in this hierarchical construction there are branches leading to Algol 68 or Pascal.
Such a structure would greatly relieve the definition of the individual languages. It would allow so much preparatory work to be done in general theories for the treatment of special problems that the great complexity of the languages would be felt as less oppressive. The investigations of program schemata [7] belong to the top level of the hierarchy.
Completeness:
An axiom system that is complete for a theory of programming languages, in the sense that it permits proving the equivalence of programs which compute the same function under every admissible interpretation, exists only for very general theories. If the language contains an axiom system for a type, e.g. integer, which restricts the interpretation of integer variables to the natural numbers, then there is not even an enumerable axiom system for this purpose. This follows easily from the well-known theorem that the Turing programs for a fixed given function are not enumerable.
Concluding summary:
The development of an axiomatic theory of programming languages appears necessary and, given a suitable structuring of the axiom systems, promising. Completeness will not be attainable, either proof-theoretically or for practical purposes. A supplementary consideration of the simulation, on concrete machines, of the free machines belonging to the axiomatic theories always appears necessary. An investigation of the notions integer and real turns out to be urgently desirable. This should eventually result in a standardization of machine arithmetic.
References:
[1] Floyd, R.W. (1967) "Assigning Meanings to Programs", in Proc. Symp. in Applied Math. 19, Mathematical Aspects of Computer Science (Schwartz, J.T., ed.), Amer. Math. Soc., pp. 19-32
[2] Hoare, C.A.R. (1969) "An Axiomatic Basis for Computer Programming", Comm. ACM 12, pp. 576-583
[3] Hoare, C.A.R. and Wirth, N. (1973) "An Axiomatic Definition of the Programming Language Pascal", Acta Informatica Vol. 2, pp. 335-357
[4] Hotz, G. (1972) "Grundlagen einer Theorie der Programmiersprachen II", Berichte des Fachbereiches für Angew. Math. + Informatik, pp. 1-51
[5] Manna, Z. and Pnueli, A. (1974) "Axiomatic Approach to Total Correctness of Programs", Acta Informatica Vol. 3, pp. 243-265
[6] Lucas, P. and Walk, K. (1969) "On the Formal Description of PL/I", Annual Review in Automatic Programming, Vol. 6, Part 3, Pergamon Press
[7] Symposium on Semantics of Algorithmic Languages, Lecture Notes in Mathematics, Vol. 188, Springer-Verlag 1971 (ed. E. Engeler)
Author's address: Prof. Dr. G. Hotz, Universität des Saarlandes, Fachbereich Angewandte Mathematik und Informatik, 66 Saarbrücken
FORMALIZATION: History, Present, and Future
H. Zemanek
IBM Laboratory, Vienna
In the Muir Woods near San Francisco, there is a cut through a redwood tree more than 900 years old. Such a cut is an excellent example for the relationship between formal and informal structures. Certainly the growth of a tree is a very natural process, and still the result comes very close to the formal notion of a circle. In this particular case the sequence of growing circles is related to the circular movement of our planet around the sun - two spherical objects, by the way, whose appearance suggests again the formal shape of a circle. Very generally, formal expression can frequently be achieved by only little correction or manipulation of the informal pattern. The common ground is usually given by clarity and economy, by simplicity and minimum effort: the informal circle is the consequence of a balanced process, and the formal circle is the clearest, shortest, and the simplest way to describe any closed, somehow round shape - every other description requires additional information.
The close connection of informal and formal is not only true for shapes, it is equally true for many other structures. And it is particularly true for language in general - our means of communication and description, our possibility to express our thinking. In the most natural forms of language there is a remarkable coexistence of informal and formal aspects. The letters, for instance, are suggested by the sounds which the body can produce in a very natural way. A natural word has not been created by a construction procedure or any other artificial principle; its history is informal, and still it is composed of sounds or letters in a highly digital manner. Misspelling evokes syntactical correction, even if the receiving individual has had very little training or education; a child can speak correct words which he does not understand.
Some sort of formality, some amount of formality, therefore appears to be natural and reasonable; the question is the proper balance between informal and formal: how far to push formality. And this question can be asked in two directions. Namely, how far should one go with formal conclusions and derivations from existing formal structures - and, secondly, how far should one go with formal methods into the basis of what we are doing. Obviously, the better the basis is formally defined, the safer we are later in all formal derivations. The ideal case is total axiomatization, i.e. the reduction of the structure to atomic elements and their logical interconnection.
But the work always starts in the middle: we are born into an environment which is very much like the redwood cut: there are suggested and even self-suggesting formal notions, but at the beginning of any investigation, such notions are not precise enough. Their further development frequently leads to a kind of separation, to tensions and fights, even of two hostile universes - the informal and the formal universe. In order to avoid frictions, it is necessary to understand the virtues and the limitations of formalization and how its embedding in the real, essentially informal world can be done without
harm. Such an understanding is better derived from a study of history than from a look at the present situation. So I will describe history first, then have a look at the present situation in general terms - many of you know more about formalization technology than I do - and finally I will try to draw a number of conclusions for the future development.
The History of Formalization
Formal thinking is as old as technology, and technology is as old as the human mind. The ancient expression tools, however, were crude and did not lead far. The formal methods of the Chaldeans or of the ancient Greeks, consequently, could hardly be made useful for the daily life of the average citizen. And ancient science not only remained of philosophical character, it always remained rather an art than a formal building. Anyway, the basic principles of our present formal science were laid down as early as that.
Two important ideas were those of atomism and axiomatization. Both deserve a closer look, because after more than 2000 years of scientific development, they still seem to be the best our mind can offer for an understanding and a control of nature, and they are basic for information processing. While the idea of deriving a whole field - like geometry - from a few basic sentences by logical rules, and the idea that whatever we see is composed of a small number of absolutely indivisible particles combined and interacting on the basis of logical laws, are extremely formal in essence, they were expressed by the Greek philosophers in natural language, just denoting variables with symbols and writing numbers formally (but not yet decimally).
It was not until 820 A.D. that mathematics began to be formalized. The man who did this important step was an Arab living in the city of Khiva (now Uzbekistan), then called Khorezm (which due to the difficulties in writing vowels in Arabic was spelled as Khwarizm and in this
form the city occurs as last part of the name): Abu Dshafar Muhammed ibn Musa al-Khwarizmi. He had to describe the legal partition problems occurring when an Arab with up to four wives of different legal standing died and left a big inheritance for them and many children. That was a huge mathematical problem, and it was for such problems that Muhammed invented algebra, in a book with the title "Kitâb al-jabr w'almuqâbalah". The term algebra comes from this title and the term algorithm comes from the author's (i.e. from the city's) name.
The book was translated into Latin in Europe around 1200, but it was not until the 16th century that algebra was really accepted as a formal method. In that century, there was the struggle between the Abacists and the Algorithmists, between the concrete calculation by means of calculi, little stones or coins, and the abstract calculation on paper and by more and more formal rules. The algorithmists won, because suddenly paper could be produced at a much lower price - which shows once more how much we depend on technology, even in such mental aspects.
Another important step of abstraction happened in the construction of buildings: the development of architecture. There also was a struggle between two schools, between the Italian and the British style of architecture, a subject which I intend to study in more detail. I will come back to the notion of architecture. Here, I will restrict myself to simple examples of architectural principles. One is the difference between English and French garden architecture: between the informal art of the English garden and the formal philosophy of the French park. The other is the difference between the Red Square in Moscow and the Place des Vosges in Paris: between the unsystematic combination of different buildings to a picturesque ensemble and the systematic design of a whole city square by one architect. If you add to these two pictures two other ones, say an arbitrarily selected set of blocks in Manhattan and a district like Teesside in England (an irregularly filled unit of a regularly planned city and a part of the country which was destroyed by unplanned, unsystematic and unnatural
erection of poor people's houses), you get an idea of how ordered and how wild technology may go. And it is not different in our own field.
A third field of abstraction is mechanical construction, where design is based on blue-prints. We may not realize, but a blue-print is a formal definition of the object to be produced, composed of ideal straight lines, circles and other elements; somewhere in the corner of the blue-print there may even be found an indication which material should be used to construct the object.
The field of formalization, as we can see, is much larger than just mathematics, but there we can see much better how the idea of formalization itself develops. Mathematics finds out a need of formal definition for itself; after having used formal structures for centuries, in the 19th century it is recognized that the use of formal structures is not a guarantee for a clean situation - formal definition of the basic logic and of the basic notions is required. This chapter of the history of mathematics starts with the name of G. BOOLE, and the next name to be mentioned is that of G. FREGE. RUSSELL and WHITEHEAD formalize in their "Principia Mathematica" an essential part of mathematics, but this shows also how many problems there are still open. HILBERT gave an account of this in his Programme, but soon after GÖDEL proved that the world was not quite as ideal or orderly as the mathematicians had thought and that even a field as numerical mathematics has undecidable spots.
Hardware is much better off than software in this respect, because digital hardware functions on strictly formal grounds. While it is true that in switching algebra the engineers reinvented the propositional calculus for their switching circuits, they really built computers - without knowing - on the universal foundation of mathematics, and in so doing they prepared the universality of the computer from the very beginning. Switching algebra, by the way, originated in Japan, where HANZAWA and NAKASIMA published their first articles in 1936, before SHANNON's paper of 1938.
The formal character of the switching circuits enabled the hardware engineers to turn to automatic design and subsequent automatic production with little change in the structure philosophy. Miniaturization amplifies the need of formal methods and strongly forced discipline. This is how hardware became a model case for software: while hardware is forced to extreme formal discipline, programmers too frequently work with informal (not to say: irrational) methods, producing highly informal texts; so they do create a situation where automatic design and automatic production become increasingly difficult. I will come back to these problems later.
Formalization outside Mathematics
Not very many fields were formalized before our computer times, but formalization was not at all restricted to mathematics only. A few examples shall illustrate this statement.
Few people realize that musical notation is a true formal language, and even fewer think of music in binary terms, although almost everybody can see that most popular songs, whether children's songs or hits, are based on a 16 bar structure. Songs like "Hänschen klein, ging allein", "Ein Männlein steht im Walde", "Weißt Du, wieviel Sternlein stehen", "Jack and Jill went up the hill", "Little Jack Horner" and "Pat a Cake, Bakersman", and hits like "Hüo ho, alter Schimmel, hüo ho" and "Ich fahr' mit der Post", show the 16 bar structure perfectly.
The binary character of music goes, however, much further into what I call abstract architecture of composition. I was really struck by the fact that the first movement of Beethoven's VIth symphony (The Pastorale) had exactly 512 bars. It turned out that this was the only movement of the nine symphonies to show a precise power of two, but many show sums of two powers of two or come close to round binary numbers. Beethoven and many other composers liked binary principles, building bigger blocks to a good part from blocks of 8 or 4 bars, and 64 bar blocks seem to be the most frequent building blocks in compositions. Of course, an artist never follows - rationally or intuitively - an abstract scheme without deviations or exceptions. But take for instance Händel's "Halleluja": take out the little cries "Halleluja" from this composition, and there remains a perfectly binary composition.
The explanation of the binary character of music comes from the symmetries in the composing architecture, each symmetry adding a bit. More results from a systematic look at music and composition can be expected.
The digital character of music could also be derived from the fact that digital automata have been used for more than 600 years to produce music automatically; from the chimes to the nickel-piano, there is a long history of formally performed music. Edison's phonograph opened a century of analog-stored music reproduction, but I am convinced that digital systems will take over again in the future.
Another example of a binary craft and art is weaving - a field where automation started in very early days. The punch-card control of the weaver's loom is only an improvement of gadgets used in weaving for thousands of years, to be found in any folklore. One of my colleagues recently discovered, in a local museum in Upper Austria, an intermediate step: a programming unit built in 1740, consisting of a closed-loop piece of linen on which wooden bars perform the steering of the loom - an invention which may go back before 1690.
The explanation of the binary character of weaving is that each crossing of two threads marks a point of a digital weaving pattern. Neither mathematical thinking, nor human thinking at all, nor the loom are bound to the time sequence as is the computer, which is essentially sequential; maybe computer programming could learn from weaving programming how to get rid of excessive sequentialization.
A third example of formalization is book-keeping, with a tradition of centuries. The book-keeper restricts himself to a formal description of a complex reality - precisely what the programmer does at the computer. He insists on having the same final sum in both the "CREDIT" and "DEBIT" column, but he leaves the verification of the semantic correctness of his books outside his books.
A similar reduction of life to simple data in extended forms, and subsequent formal processing, is the census and the population register - another origin of automatic computing. In 1889 Hermann HOLLERITH filed his first patents for the punch-card system designed for census processing, and the US census of 1890 was the first one to be elaborated automatically. The Austrian census of 1890 was also run on HOLLERITH machines, imported and improved by Otto SCHÄFFLER, who in 1896 got the first patent for programming - one of the first 'computer' patents. It took me three years to reconstruct the life of this Austrian pioneer.
I have mentioned the blue-print. There was one attempt already to formally define the blue-print. In 1907, a Spanish pioneer, Leonardo TORRES y QUEVEDO, submitted to a meeting of representatives of Academies of Sciences in Vienna a paper containing the proposal of a notation for the formal description of mechanical constructions. As an example, he formally defined
a small machine the description of which he had published a year before, a gadget for the automatic computation of the product of two complex numbers. The formal definition looks almost like APT, and the formal description in the proposed language consists of 31 equations, including even the box of the gadget. TORRES applies in his paper arguments in favour of formal definition which are as valid today as they were sixty years ago. (The Academies did not accept this paper.)
As a final example for formalization, I want to take science in general, including philosophy. Since 200 or 300 years, science and technology have worked on what could be called a formal model of the universe. The method is basically the ancient Greek concept of Atomism and Axiomatization. In the first two decades of this century this method had reached a peak of success. RUSSELL and WHITEHEAD had formally defined Mathematics. In Physics and Chemistry, atoms and laws of nature had become an established science; not only all kinds of material but also all kinds of energy have been proven to have atomic character. In Psychology, a direction called Associationism was most powerful around 1910. It teaches that all we sense, remember and think is based on atomic sensory inputs (elementary sensations) and consists of their combination and association.
The situation called for somebody to generalize the method which had been used to master reality in science to a philosophy of the universe which must equally have atomic and logical character. This man in fact appeared. It was the young Viennese engineer Ludwig WITTGENSTEIN, born in the year in which HOLLERITH filed his patents. At the end of the First World War, WITTGENSTEIN had finished a manuscript, entitled "Tractatus Logico-Philosophicus", which was nothing else but an algorithm for a formal definition of the universe in terms of logic atoms. These atoms he called elementary sentences (the Vienna Circle later used the term protocol sentences). Try
all possible combinations of elementary sentences, check by means of logical verification whether they are factually true or false, throw away the factually false ones and collect the true ones, and you will have a perfect and complete description of the universe. Outside of this general formal algorithm, says his philosophy, there is nothing more to consider. From this principle WITTGENSTEIN correctly concluded the end of philosophy: for if the algorithm is made to work, we have finally got a description of the universe, and there is nothing left to speak about. And the last sentence of the Tractatus consequently reads: What we cannot speak about, we must pass over in silence.
WITTGENSTEIN's derivations were not wrong, but some of his assumptions were. One could mention GÖDEL here, namely that not each logical decision process must come to a decision, and one could speak about the difficulties of verification. But much more basically, there are no atoms. We know this fact from physics and chemistry, where there are no small particles of which a perfect physical description could be established. The same fact is true for psychology, and it is true for language: one can take the smallest unit one can find (say: a gesture) and pack a full story into it. WITTGENSTEIN recognized this fact around 1933, and he started his philosophy II, in which for instance the meaning of a word depends on the language game within which it is used.
What is of importance for computer science is the fact that the digital computer perfectly realizes the world of the Tractatus. Whatever happens in a computer system is a combination and sequence of atoms of information - of bits - and about what cannot be said in terms of bits, the computer indeed keeps silent. It is we who talk about it. To make it clear in terms of WITTGENSTEIN II: the meaning of any program depends on the language game within which it is used. There will be for ever a gap between the formal universe of the programs and algorithms in our systems and the informal reality, between the computer and the life of
people and communities - a gap which remains to be bridged by the human being, outside the mechanical and formal tools, and outside the formal language.
The Present of Formalization
The computer has accelerated formalization in certain fields, in mathematics and outside mathematics, and it has made formalization a practical need - simply by being formal itself. The instruction set of the processing unit, the machine language, is an absolutely formal language - the best case that could occur: both syntax and semantics are defined by the electronics of the switching circuits. The computer was provided by the switching engineer who in turn was programmed by the mathematician who wanted the tool to be able to solve practical problems. The universality of the computer resulted from totally pragmatic motivations.
The way from the instruction set to the programming language is marked by steps which again were essentially practical. The idea of the flow diagram helped to pin down the general lines of a program - to any desired detail. What has to be mentioned next is ZUSE's "Plankalkül", a perfectly formal language going far beyond mathematics; ZUSE proved this by writing a program text for the game of chess in his formalism. It is a pity that he could not continue this work and that not more attention was paid to it: it could have accelerated the development of programming languages considerably. It was Rutishauser who triggered in Europe, with his "Rechenplanfertigung", the development of algorithmic languages independent of the American efforts to automatize formula translation. FORTRAN and ALGOL were the results.
At this point of the paper, before proceeding, I should insert a detailed chapter on the theory of languages. But since I can assume that my readers know this theory fairly well, I can restrict myself to a few keywords, just to lay out a framework for the continuation of my line of thoughts.
The theory of scientific language as developed by PEIRCE, MORRIS and the Vienna Circle distinguishes three levels of language investigation, to which I want to add a level 0 which historically has always been the first item in language description: the alphabet. Here, we see language starting from an atomic concept, which however does not help for its essential content. We find the only true language atom, the bit, which can be used to construct any alphabet: a code for each character. On the next level, the character appears as a molecule: a combination of bits, a different one for each character.
I will not number an intermediate level, the level of the word: the combination of characters yields the word - a unit of language as mysterious as the human mind, well supporting the particular meaning which the Bible attaches to the word Verbum. As easy as it is to code a word, as difficult is it to define, to catch formally, what a traditional word means. Only the constructed word offers itself to formal definition of its meaning.
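As a minimal sketch of level 0 and of the character as a "molecule" of bits, one may assume a fixed-width binary code; the essay does not prescribe any particular code, and the names build_code, encode_word and decode_word are purely illustrative. The fragment constructs a code for an arbitrary alphabet and codes a word as a combination of bits:

```python
from typing import Dict


def build_code(alphabet: str) -> Dict[str, str]:
    """Assign each character a distinct fixed-width bit pattern (assumed scheme)."""
    # Number of bits needed so that every character gets a different pattern.
    width = max(1, (len(alphabet) - 1).bit_length())
    return {ch: format(index, "0{}b".format(width)) for index, ch in enumerate(alphabet)}


def encode_word(word: str, code: Dict[str, str]) -> str:
    """Code a word as the concatenation of its characters' bit patterns."""
    return "".join(code[ch] for ch in word)


def decode_word(bits: str, code: Dict[str, str]) -> str:
    """Recover the characters; this says nothing about what the word means."""
    width = len(next(iter(code.values())))
    reverse = {pattern: ch for ch, pattern in code.items()}
    return "".join(reverse[bits[i:i + width]] for i in range(0, len(bits), width))


code = build_code("abcd")          # hypothetical four-letter alphabet
coded = encode_word("bad", code)   # a word as a sequence of bits
```

Coding and decoding are purely mechanical here, which is the easy part the text refers to; nothing in such a table captures the meaning of the word.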
The proper levels of language investigation are syntax, semantics and pragmatics. The purest is syntax; it describes the combination of characters or words absolutely independent of any meaning. The syntactical definition of a well-constructed language is a set of rules which makes it possible to define all well-formed sentences. The perfect syntax permits mechanical checking whether the sentence is well-formed, and the rejection of syntactical errors.
Semantics studies the meaning of a (well-formed) sentence. The ideal would be to verify the factual truth of each sentence. The Tractatus (and with the Tractatus the Vienna Circle and Logical Positivism) assumed that verification was possible (at least in principle), but time has shown that many difficulties inhibited any successful programme to do so. The Austrian-British philosopher Sir Karl POPPER concluded that verification is in principle not possible, since there is no final truth except the tautological one, and that we are obliged to try to falsify sentences to our best. Only those sentences which resist falsification are allowed to remain in our basket of temporary knowledge. Only in the constructed world of formal systems - logic and mathematics and the computer - we are on safe ground. There we have the chance of formally defining the meaning and to prove semantical correctness. The failure of Automatic Translation of natural languages is certainly connected to those difficulties.
The third level of language investigation is pragmatics, the theory of the use of language - everything that is not covered by syntax and semantics: what to achieve with the language, its user, the history of the language (if applicable), fashions and dialects. It is obvious that there is no hope to formalize pragmatics. This shows again that syntax and semantics are only the middle sections of a designed language, that the begin and the end of it is pragmatics; and secondly,
it makes clear once more that syntactic and semantic structures are essentially embedded in real life. We will never attain a total formalization of anything.
Let me shortly mention the distinction between natural and constructed languages, a distinction I have used already; natural and formal languages can be alive and dead. There are mixtures between natural and formal language. An example for this case is the medical language, of which certain parts, for instance the indication of the body temperature, like any other value of a physical measurement, can be subject of formal handling and, therefore, of computer processing. The many, sometimes very advanced mathematical structures of medical knowledge are as formal as any technical construction. But there are other parts, in a medical record, for instance, which are highly informal; it would be extremely hazardous to submit them to other automatic processing than storage and reproduction. I am sure that nobody would entrust himself today to a totally formalized and computerized medical treatment. This gives an idea of the heavy problems of medical information processing.
Finally, I have to mention the distinction between language, meta-language, meta-meta-language, and so on (meta(n)language). It is clear that a natural language can be used as meta-language of any level. The open question is whether we shall see one day a formal language which is recursive to the degree that it can be described in itself, so that we can do away with a natural language as the ultimate meta-language. So far, I have not yet seen any promising attempt to achieve this.
Now I will turn to the discussion of three levels of formalization which have gained importance around the computer; these three levels are
(1) formal notation,
(2) formal definition and
(3) correctness proofs.
The first level has been in use for a long time: it is the
introduction of a formal language which allows mechanical
derivations, so that it is possible to achieve general agreement on
the result: if the rules have been properly observed, the outcome
must be accepted by everybody. What can be challenged are the
assumptions or, under certain conditions, the used notions. These
notions will be challenged if it is possible to derive contradictions
or paradoxa in their use. Then, a next level must be introduced,
which is a formal definition of the notions, possible - I repeat -
only in constructed languages.

Constructed languages have the further advantage that knowledge can
be acquired in the mother language, so that in the formal field
communication becomes independent of the command of a foreign
language (of English, in the computer field) and of excessive
language finesse of an author (which usually is identical with
obscurantism anyway - not always, I admit).

The second level of formalization, the formal definition of the
applied formal language, is practically always achieved by means of
an abstract machine - the Turing machine is the best known example.
The abstract machine can assume a set of states and moves from one
state to the next on the basis of a well-defined transition function.
Since there is no reason to exchange all state parameters during one
transition - rather the transition will in most cases concern only a
small subpart of the structure - it is convenient to organize the
state as a complex system of substates and to introduce a full set of
transition functions, conditioned by state part (substate) and input
(or next elaborated part of text). A special control mechanism, which
can be a part of the abstract machine, ensures the proper sequencing
of the steps.
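As an illustration only, the abstract machine described above can be sketched in a few lines of Python. This is not the Vienna notation or any machine from the lecture: the substate names (`control`, `counts`, `log`), the input classes and the toy task (counting words in a token stream) are all assumptions chosen to keep the sketch small. It shows a state organized as substates, a set of transition functions selected by substate and input, and a control mechanism that sequences the steps.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class State:
    """Machine state organized as a system of substates (names are illustrative)."""
    control: str = "run"                                   # control substate
    counts: Dict[str, int] = field(default_factory=dict)   # storage substate
    log: List[str] = field(default_factory=list)           # output substate


Transition = Callable[[State, str], None]


def count_word(state: State, token: str) -> None:
    # Changes only the storage substate.
    state.counts[token] = state.counts.get(token, 0) + 1


def report(state: State, token: str) -> None:
    # Changes only the output substate.
    state.log.append(f"{len(state.counts)} distinct words so far")


def stop(state: State, token: str) -> None:
    # Changes only the control substate.
    state.control = "halted"


def classify(token: str) -> str:
    """Map the next elaborated part of text to an input class (hypothetical classes)."""
    if token == ".":
        return "period"
    if token == "END":
        return "end"
    return "word"


# The set of transition functions, conditioned by (control substate, input class).
TRANSITIONS: Dict[Tuple[str, str], Transition] = {
    ("run", "word"): count_word,
    ("run", "period"): report,
    ("run", "end"): stop,
}


def run(tokens: List[str]) -> State:
    """Control mechanism: ensures the proper sequencing of the steps."""
    state = State()
    for token in tokens:
        if state.control == "halted":
            break
        TRANSITIONS[(state.control, classify(token))](state, token)
    return state


final = run(["a", "b", "a", ".", "END", "ignored"])
print(final.control, final.counts, final.log)
```

Each transition function touches one substate only, which is the point of organizing the state as a complex system of substates: no step has to exchange all state parameters.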
The abstract machine can define the syntax and the semantics of a
formal language or a formal system, based on operational or axiomatic
principles. It is certainly known to all of you that the Vienna IBM
Laboratory has designed a notation, a method of definition and
finally the formal definition of the semantics of PL/I, so that I am
not just speaking of possibilities, but of real results (and a result
of non-trivial size).

But formal definition is not the end of the formalization game. The
next level is characterized by correctness proofs. Since no human
mind watches the nano-second steps of the processes running through
the computer, there is never an instance in which a wrong assumption
can be observed out of the process details. Of course, we flatter
ourselves that in our programs every possibility has been
preconceived and that, consequently, there is nothing to be observed
which is not already known. But are we always sure that this is
really so? Even in a purely human operation there is much trust in
offered solutions, and in many cases it is difficult to object to a
solution, because the whole has informal ingredients which remain
dark. (Sometimes the solution is ready, and the problem to which it
applies is unknown. There is some logic in such a situation: there
are so many more questions than answers that it would be economic to
work out answers independently of the questions...)

A correctness proof establishes a link between a formal question and
a formal answer; it makes sure that the answer indeed fits the
question, that the solution indeed resolves the problem. For this
purpose, however, formal notation and definition are a prerequisite,
but they are not sufficient. A formal environment for the solution
process is to be established. If the formal proof is achieved, the
user, then, enjoys the reliability of a formal system.
This highest level of formalization requires an extremely clean field
of operation, and definitely not every author has the energy required
for it. The tragedy of formality in computer science is that it is so
useful and supported by sound arguments, and yet is pursued on very
obscure paths.

On the other hand, informality sounds so attractive, and it is
apparently not easy to refute. Programming in natural language: why
should it not be possible? If it is possible for the computer
specialist to learn a formal language in order to program, would
natural language not be the solution for the average user? Natural
languages are so familiar, and even cultivated. And yet, this is a
naive error, committed by the most sophisticated specialists. Natural
language is in no case a help against a lack of intelligence. If the
problem exceeds the brains of the user, only the programming
specialist can help - if at all - because even the most intelligent
computer specialist can only program what has been made clear to him.
Formal languages are a help for fighting the clumsy informal style;
the programming language is a problem only in trivial applications.

Whoever has the ability to think clearly, to define clearly the flow
of data and material, and to express clearly what he knows and what
he needs, can learn a formal language (like a programming language).
Lack of this ability, however, will produce descriptions or
instructions in natural language which the most advanced computer
will not be able to follow; the computer will then, if at all, do the
right thing by mere (and small) chance.

The only other way out is what may be called "invisible programming":
the design of an automatic system in which the processes - simple or
complicated - are triggered by trivial actions (insertion of a
magnetic
card or a coin, deposit of a load), while the programs remain totally
within the boxes.

The computer is an intelligence amplifier, if there is an
intelligence at the input. The computer is probably an unintelligence
amplifier, if it is instructed by ignorance. It is a good goal, it is
a necessity to protect the computer user from unnecessary learning
(technology in general and computer technology in particular
overburden the user with learning, in large but more so in small
quantities); but there is some psychological economy in dealing with
automatic equipment from which one cannot deviate too far:
quasi-intelligent appearance of the computer can only lead to
deception.

Arguments for and against Formalization

A few keywords for and against formalization:

  FOR                      AGAINST
  clarity                  clarity
  economy                  learning
  security                 risk
  generality, elegance     costly, impractical
  freedom                  freedom

Clarity and unambiguity are the most important virtues of
formalization. The abstract notions of the formal system are
different from the concrete notions used in natural language, they
are free from connotations and tacit assumptions - it is possible to
express precisely no more and no less than what should be said. But
this is precisely a reason to be against formalization: too much
clarity binds all concerned to an amount of precision which is
frequently very bad or at least cumbersome. There are, of course,
formal methods to leave out parts which are to be specified later,
and no precision is lost - but such methods require a knowledge, a
command of formalization which is not easily available.
There is a more important argument, namely that informal expression
carries more information than formal, which is true. Think of the
medical example given earlier; medical descriptions, medical
knowledge can be analyzed and transformed into a formal system, but
there is no doubt that some information is lost in this process.
Clarity can indeed be a negative argument.

Economy, reflected by the shortness of expression, mechanical
simplification and repetition, is obviously a strong argument.
Repetition, in particular, yields the basis for savings and is the
precondition for automatization, for automatic programming, automatic
generation, automatic simulation and automatic documentation. The
counterargument against economy is to say yes, that is all true, but
it requires an amount of education on the producing and at the
applying side which destroys the advantages. This is in fact an
important point which keen formalizers often overlook.

A similar double argument can be presented for security and
reliability gained by formal treatment. Structures become
exception-free, mechanic evaluation, perfect syntax - everything is
very good; the interpretation goes outside the formal structures and
becomes independent of the personal command of any living language,
in which things otherwise have to be handled. The counterargument
again says that there is risk in the new principles; nobody can
guarantee that they really will be successful, so that the overall
gain is questionable.
Another fine property of formalization is generality and elegance;
the most general solution can be preconceived at an early stage of
development, there are no arbitrary entries to the set of concepts,
each of them is in its most general form, and whatever specific
problem one is working on, the solution can be applied to any other
field where it applies. Elegance is accepted by many people, not only
in compilation, and is often practical as well. The counterargument
here is that the general and, in fact, some very elegant things are
highly impractical and costly; in dollars, they are inelegant. If you
want to make big business, you rather prefer, for economical reasons,
to try it with a patent on a special button which is different from
those on the market than with a generalized button including the
common features of all others on the market.

Freedom, finally, is opened by formalization for the subsequent
steps, because the system is ideally transparent, all possibilities
can be seen without long trials. But the argument can be turned
around, stating that formalization removes the freedom of
reinterpreting the early description, the possibility to say: you
have misunderstood me, what I really meant to say was the
following... Of course, nobody will sell such a reinterpretation as a
design principle, but intuitively it might frequently be a power in
the fight between formal and informal.

All those arguments together lead to a basic principle of design
which carries the name of architecture - abstract architecture, I
should say, in order to distinguish it from what the term designates
now - namely the same sloppy way of glueing parts together, which was
formerly called logical design.
One can make an interesting exercise: looking up the Encyclopedia
Britannica and reading what building architects understand by the
term. It is defined as the art and technology of building, fulfilling
the practical and expressive needs of civilized people. The main
goals are

  suitability to use by human beings
  stability and permanence of its construction
  communication of experience and ideas through form.

Are not all of those thoughts perfectly applicable to what we are
doing? Or to what we should do? The chapters of the Encyclopedia
keyword are

  use - types, planning
  techniques - materials, methods
  expression - content, form

and "Economy prevents work without (or only potential) demand".

The notion and the term of computer architecture were introduced in
1962, in a book on the computer "STRETCH", and there the definition
given by F.P. BROOKS is quite clear and alright. It was applied in
the hardware design of the IBM System/360, but soon the term and the
idea disappeared from literature. BROOKS, who was one of the /360
fathers, has since then published on the notion of architecture, and
he points out what makes good architecture. He distinguishes three
steps in design: architecture, implementation and realization.
Architecture is the functional appearance of the system to the user
(what happens). Implementation is the logical structure which
performs the architecture - in detail (how it happens). And
realization is the hardware construct (where it happens).
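The distinction between architecture and implementation can be made concrete with a small Python sketch. It is an illustration under assumptions, not an example from BROOKS or from the lecture: the `Stack` interface and the two classes are hypothetical names. The interface plays the role of the architecture (what happens, as seen by the user), and the two classes are alternative implementations (how it happens). Realization, the hardware construct, is not modelled; here it would be whatever interpreter and machine actually run the code.

```python
from typing import List, Optional, Protocol, Tuple


class Stack(Protocol):
    """Architecture: the functional appearance to the user (what happens)."""

    def push(self, item: int) -> None: ...

    def pop(self) -> int: ...


class ListStack:
    """Implementation 1: the logical structure is a growable array."""

    def __init__(self) -> None:
        self._items: List[int] = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()


class LinkedStack:
    """Implementation 2: the logical structure is a chain of (value, rest) pairs."""

    def __init__(self) -> None:
        self._head: Optional[Tuple[int, object]] = None

    def push(self, item: int) -> None:
        self._head = (item, self._head)

    def pop(self) -> int:
        if self._head is None:
            raise IndexError("pop from empty stack")
        item, self._head = self._head
        return item


def user_program(stack: Stack) -> List[int]:
    """A user who relies on the architecture only, never on the implementation."""
    for n in (1, 2, 3):
        stack.push(n)
    return [stack.pop() for _ in range(3)]


results = [user_program(ListStack()), user_program(LinkedStack())]
print(results)
```

The user program cannot tell the two implementations apart, which is what it means for both to perform the same architecture.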
The key idea is consistency: if the architecture is consistent, a
partial knowledge of the system allows the prediction of the
remainder (this is why I like to define abstract architecture as
creative redundancy - repetition and symmetry are building principles
not only in technology but also in the arts and in music
particularly).

Good architecture is also characterized by orthogonality, propriety
and generality. Orthogonality means no unnecessary coupling of
concepts or functions. Propriety means functions which are proper to
the essential requirements. Generality means that each concept,
property or function which must be introduced will be introduced in
its most general form.

Good architecture will show openendedness and completeness;
openendedness means that there are no unnecessary restrictions and
there is ample space for later and further development; completeness
means that there is no arbitrary selection - if a part of a set, then
the full set. Further keywords are symmetry, transparency,
compatibility, reliability; good architecture is stimulating and
self-teaching.

Of course, there is the danger that a perfect architecture is
designed, but that finally the result shows poor performance - but
such a bad result will not occur, if the whole design was based on
the advantages of clean formal methods. System architecture requires,
in other words, thorough formalization; it cannot be left to "natural
evolvement"! And this remark opens an interesting relation to
"General Systems Theory" - into which I will not look any further,
however.
The Future of Formalization

Formalization is not a solitary, esoteric amusement of some extreme
specialists, a theory for the sake of theory. It will become in very
short time a production tool, an application tool and a management
tool. The computer is much too powerful a tool to produce, to
organize and to apply by intuition or by casual methods. Information
processing needs, if it should not go astray organizationally and
financially, precision in every layout, the more the bigger, the
faster and the cheaper the equipment becomes. This requires the tight
control which only thorough formalization offers to the powerful tool
and its infinite applications.

Formalization, therefore, will extend to the environment of the
proper computer; a formal definition of the total information
processing system within the total enterprise or institution will be
necessary, covering not only its mechanical equipment, but also its
human parts, its structure and its growth. The machine modeling or
any other formal tool used will in turn sharpen the problems of
formalization. Many questions will be raised: what is the relation
between a formal structure and the natural structure, between an
abstract machine model and an organism, between all those constructed
languages and the natural ones?

I have mentioned medical structures as examples of a complex
situation. Legal structures of complex nature are another instance.
Although legal language and legal thinking have a certain formal
character, the legal formalization is essentially different from the
logical and technical formalization; to put legal structures on the
computer is highly necessary, but extremely difficult because of the
gap between logical and legal philosophy of formalization. Many other
examples of problem areas for computer application could be given, in
which substantial formalization work is ahead of us.

Formalization will grow in importance and extension, and it will have
many feedbacks on the structures on which it is applied.
Formalization will influence the enterprises and institutions of the
country applying computerization; it will produce a tight network of
formalized processes and of products resulting from formalized
systems. Formalization will - forcefully pushed or happening as a
side-effect - influence the professions and develop the type of
engineer of abstract engineering, very different from the classic
mechanical engineer, but based on the same virtues - the art of
compromising between mathematical precision and intuition, between
needs and possibilities, between quality, cost and punctual delivery.

Formalization will influence society. The life of the increased
population of our planet can only be maintained by means of advanced
technology, into which a lot of formalized treatment will have to go;
good mechanisms, concrete as well as abstract ones, will be
absolutely needed. But man is more than a machine, and human
behaviour and illogic must be protected from totally submerging in a
formalized universe; society is more than a machine which runs
perfectly if serviced by keen social engineers. Also, there are other
fields which must be kept away from formalization and mechanical
derivations, and the professions of abstract engineering should warn
certain other professions not to fall into admiration and simulation
of engineering, where other principles of life and work should be
pursued.
Formalization will have a feedback on the human mind, which will
dispose of extremely powerful abstract mechanisms, which will make
people like the rules and hate the exceptions - but this would be the
contrary to the humanization of our world, where the exception is
more important than the rule. Liking and understanding the exception
will always remain the most human task for the human being, and in
all our work in the formalization, with the advance of formal methods
in information processing, we should never forget this side of our
human existence.
Lecture Notes in Computer Science
Edited by G. Goos and J. Hartmanis

23

Programming Methodology
4th Informatik Symposium, IBM Germany
Wildbad, September 25-27, 1974

Edited by Clemens E. Hackl

Springer-Verlag Berlin · Heidelberg · New York 1975

Editorial Board: P. Brinch Hansen · D. Gries · C. Moler · G. Seegmüller · N. Wirth

Prof. Dr. Clemens E. Hackl
IBM DEUTSCHLAND
UV Wissenschaft
7 Stuttgart 80
Pascalstr. 100
BRD
Library of Congress Cataloging in Publication Data

Informatik Symposium, 4th, Wildbad im Schwarzwald, Ger., 1974.
Programming methodology.

(Lecture notes in computer science ; 23)
"Sponsored by IBM Germany and the IBM World Trade Corporation."
Bibliography: p.
Includes index.
1. Programming (Electronic computers)--Congresses. I. Hackl, Clemens E., ed. II. IBM Deutschland. III. IBM World Trade Corporation. IV. Title. V. Series.
QA76.6.147 1974   001.6'42   74-34362
AMS Subject Classifications (1970): 00A10, 02G05, 02G10, 68A05, 68A10, 68A20, 68A30, 68A40
CR Subject Classifications (1974): 4.0, 4.12, 4.20, 4.22, 4.30, 4.6, 5.20, 5.23
ISBN 3-540-07131-8 Springer-Verlag Berlin · Heidelberg · New York
ISBN 0-387-07131-8 Springer-Verlag New York · Heidelberg · Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.

© by Springer-Verlag Berlin · Heidelberg 1975. Printed in Germany.
Offsetdruck: Julius Beltz, Hemsbach/Bergstr.
PREFACE

The papers in these proceedings were presented at the 4th Informatik-Symposium, which was held in Wildbad, Federal Republic of Germany, from September 25 - 27, 1974. The symposium was organized by the Scientific Relations Department of IBM Germany and sponsored by IBM Germany and the IBM World Trade Corporation.

The aim of the Informatik-Symposia is to strengthen and improve communication between universities and industry by covering a subject in the field of computer science from a university as well as from an industrial point of view. Following last year's subject of computer structures, the emphasis in this conference was placed on programming methodology, which has become a field of increasing activity during the last years. As in hardly any other segment of computer science, problems are related to research, development, education and application, with progress depending both on advances in theory and on increased practical experience. In organizing this symposium, an attempt was made to cover this broad spectrum of programming methodology.

At the beginning, problems and experiences of production programming in an industrial environment were presented. Aspects of the development of large systems, organizing for structured programming, investigations about the reliability of programming systems, and error analysis and error causes in production programming were covered. After the industrial aspects had been presented, system programming was considered from a university point of view. Problems in education, in language design for systems programming and general aspects of software engineering were addressed. In the following lectures the emphasis changed from product development to subjects in advanced development and research. New approaches for program testing, new concepts about reasoning in program synthesis and methods of interprocedural analysis were presented.
A subject of particular importance in advanced programming seems to be functional or nonprocedural programming. The strictly sequential character of a program is relegated to the background in favour of a precise description and formulation of the problem to be solved. A change from procedure-oriented programming to description-oriented programming could be a final goal. In concluding the symposium, contributions were presented which were related to the formal definition and representation of programs, the description of mathematical structures in programming languages and the axiomatic foundation of programming languages.

Finally, we would like to thank all the lecturers for their contributions and for the very valuable advice and assistance given during the preparation of the symposium.

Stuttgart, October 24, 1974

Gerhard Hübner                      Clemens E. Hackl
Manager Scientific Relations        Symposium Chairman
IBM Germany