Lecture Notes in Computer Science Edited by G. Goos and J. Hartmanis
141
Uwe Kastens Brigitte Hutt Erich Zimmermann
GAG: A Practical Compiler Generator
Springer-Verlag Berlin Heidelberg New York 1982
Editorial Board D. Barstow W. Brauer P. Brinch Hansen D. Gries D. Luckham C. Moler A. Pnueli G. Seegmüller J. Stoer N. Wirth
Authors
Uwe Kastens Brigitte Hutt Erich Zimmermann
Institut für Informatik II der Universität Karlsruhe
Postfach 6380, 7500 Karlsruhe 1

CR Subject Classifications (1981): D2.1, D3.1, D3.2, D3.4, F4.2

ISBN 3-540-11591-9 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-11591-9 Springer-Verlag New York Heidelberg Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to "Verwertungsgesellschaft Wort", Munich.

© by Springer-Verlag Berlin Heidelberg 1982
Printed in Germany
Printing and binding: Beltz Offsetdruck, Hemsbach/Bergstr. 2145/3140-543210
CONTENTS

Chapter 1: Introduction

Chapter 2: The Compiler Generator GAG
2.1 Introduction
2.2 Attributed Grammars
2.3 System Overview
2.4 The Generated Compiler
2.4.1 Parser Interface
2.4.2 The Structure Tree
2.4.3 Attribute Evaluation
2.4.4 Attribute Types
2.4.5 I/O of the Generated Compiler
2.5 Dependency Analysis
2.6 Attribute Optimization
2.7 Compiler Generation
2.8 Experience
2.8.1 The Processing of Attributed Grammars
2.8.2 Generated Compiler Front-ends

Chapter 3: ALADIN - A Language for Attributed Definitions
3.1 Introduction
3.1.1 Notation
3.1.2 Basic Symbols
3.2 General Structure
3.3 Constant Definitions
3.4 Type Definitions
3.4.1 Enumeration Types
3.4.2 Subrange Types
3.4.3 Union Types
3.4.4 Structure Types
3.4.5 Set Types
3.4.6 List Types
3.5 Symbol and Attribute Definitions
3.6 Productions
3.7 Semantic Rules
3.7.1 Attribute Rules
3.7.2 Attribute Names
3.7.3 Attribute Transfer
3.7.4 Context Conditions
3.8 Semantic Expressions
3.8.1 Formulas
3.8.2 Type and Symbol Tests
3.8.3 Remote Attribute Access
3.8.4 Let Clauses
3.8.5 Operands
3.9 Semantic Clauses
3.9.1 Type Conversion
3.9.2 Calls
3.9.3 Case Clauses
3.9.4 Conditional Clauses
3.9.5 Relational Clauses
3.10 Function Definitions
3.10.1 Generic Functions
3.11 Standard Definitions

Chapter 4: Development of an Attributed Grammar for a Pascal-Analyzer
4.1 Introduction
4.2 Development of the Attributed Grammar
4.2.1 Types
4.2.2 Scope Rules
4.3 Error Handling
4.4 Performance and Optimizations
4.5 Results of Attribute Evaluation

Chapter 5: Generating Efficient Compiler Front-ends
5.1 Introduction
5.2 Basic Generation Concepts
5.2.1 Types in the Generated Compiler
5.2.2 The Program Tree Structure
5.2.3 Sequence Control of Attribute Evaluators
5.3 Improvement of the Tree Representation
5.3.1 Tree Compactification
5.3.2 Tree Partitioning
5.3.3 Attribute Evaluation Without Tree
5.4 Attribute Optimization
5.5 General Optimization Techniques
5.5.1 Common Subexpressions
5.5.2 Recursion Elimination
5.5.3 Inline Code for Functions
5.6 AG Transformations
5.7 Application to Pascal
5.8 Results

Appendix A: Attributed Grammar for Pascal
Appendix B: Results of the Usage of GAG
References
Chapter 1: Introduction

Modular decomposition of the compilation problem for programming languages leads to the main compiler tasks: scanning, parsing, semantic analysis, optimization, and code generation. Attributed grammars (AGs) are a well-suited method for defining static context-dependent properties of programming languages, thus specifying the semantic analysis phase. For usual high-level languages this phase comprises application of scope rules, type checking, overloading resolution, etc. An implementation of semantic analysis can be systematically derived from such a specification: A tree representing the abstract program structure is augmented by attributes specified in the AG. Hence AGs are a widely accepted base for systematic compiler construction. This book shall demonstrate that the AG method is a suitable base for practical compiler generators.

The GAG-System is a compiler generator based on AGs. Its development was guided by three dominating aims: The system should be usable in practical compiler projects; it should be able to process AGs which are written as formal language definitions rather than as implementation oriented specifications; and the generated attribute evaluators should be efficient.

The GAG-System uses an attribute evaluation technique (OAG technique, see [Ka80] and Sect. 2.2) which is more powerful than the multi-pass techniques of comparable systems. In practice it turned out that this technique relieves the user of the system from considering attribute evaluation order during AG development. The evaluation order is computed automatically by the GAG-System - even for complex languages like Ada or PEARL. We developed an input language for the system which is suitable for writing AGs as self-contained definitions of static language properties on a high level of abstraction (see Chapter 3). AG development is supported by elaborate system facilities (see Chapter 2 and Appendix B).
The generated attribute evaluators can easily be embedded in a complete compiler environment using suitable interfaces and parameterized modules for compiler phases not specified in the AG (e.g. scanner, protocol generator). Efficiency of the generated attribute evaluator is achieved by many effective optimizations of space and runtime which are automatically applied by the GAG-System (see Chapter 5). Portability and maintainability of both the system and its products are achieved by systematic implementation techniques using Standard Pascal. Attributed grammars were introduced by Knuth [Kn68] as a method for defining semantics of programming languages based upon the following principles (see Sect. 2.2): The syntactic structure of sentences in the defined language is described by a context-free grammar. The derivation of a sentence can be represented by a structure tree in which every node stands for a symbol of the context-free grammar's vocabulary.
The AG associates a set of attributes with each symbol in the context-free grammar's vocabulary, and hence with each node in the structure tree of a sentence. The value of an attribute describes a context dependent property of the language element represented by the symbol, e.g. the type of an expression or the set of definitions valid in the particular context.

The AG associates attribute rules with each production of the context-free grammar specifying the computation of attribute values in terms of attributes of surrounding nodes. Application of a production to derive a symbol selects the attribute rules by which the attributes of the corresponding nodes will be computed.

The correspondence between attributes and values is static. An attribute of a node of the structure tree can only take on a single value. The attribute rules are static definitions; they should not be thought of as algorithms over variable attributes.

The AG describes a sublanguage of the language defined by the context-free grammar through context conditions that must be satisfied by the attribute values: A syntactically correct sentence belongs to the defined language if and only if the context conditions are satisfied for all attributes of the structure tree.

There are formal conditions on completeness and consistency for well-defined AGs which ensure that the value of each attribute is uniquely determined and effectively computable in any context.

The input language of the GAG-System (ALADIN, see Chapter 3) is based on these principles for AGs.
As a consequence of the static character of AGs ALADIN is a strictly applicative language without any control flow elements. Attribute evaluation order is computed automatically by the GAG-System. A powerful type concept allows definition of the attributes' meaning on a high level of abstraction. The strong typing rules of ALADIN and the well-definedness of the AG - both checked by the GAG-System - guarantee a high degree of consistency and completeness of the specification. The input language encourages the writing of comprehensible AGs by naming all defined entities and by powerful abbreviations.

The development of an AG for a programming language can be compared with software specification. Specification tools like AGs, GAG, and ALADIN must be used reasonably by application of suitable systematic methods and standard techniques. In Chapter 4 general development methods and standard description techniques are presented for common properties of high level programming languages, using an AG for Pascal as an example. The complete Pascal-AG is contained in Appendix A. Furthermore, we show how error recovery and code generation (or interfaces to it) can be specified.

In Chapter 5 it is shown that compilers automatically generated from AGs need not be inefficient. Many effective improvements of storage and runtime are applied automatically by the GAG-System. Measurements on the compiler front-end generated from the Pascal-AG demonstrate that space and runtime requirements necessary for practical usage can be achieved.
In several projects we have had very encouraging experience with the GAG-System:

In the PEARL project it helped to develop a consistent and complete formal definition of the static properties of the large and complex real-time language PEARL. The AG is part of the German standard document [DIN80].

In compiler projects for LIS and Ada the AG served as a specification for the compiler front-end. The use of the GAG-System enforced consideration of all consequences of the described language properties before implementation. Open questions could be answered and inconsistencies removed before they caused costly redesign of programs. Due to certain project requirements, the implementation was produced by systematic manual translation. In spite of that, the front-end generated by the GAG-System was used as a valuable tool for testing the specification.

In several smaller scientific projects for application languages the GAG-System reduced the costs for compiler development to the costs for the definition of the language by an AG.

Performance measurements on automatically generated Pascal analyzers demonstrate that runtime and space requirements close to those of conventional compilers can be achieved. The efficiency will be sufficient for many practical applications.

The subsequent chapters present different aspects of the GAG-System and its use:

Chapter 2 gives a system overview. It contains a short introduction to AGs (Sect. 2.2) intended for readers who are not familiar with this method. In Sect. 2.8 our experience with practical applications is summarized.

Chapter 3 is the reference manual for the input language ALADIN. The general concepts and the notation are introduced in Sect. 3.1. The rest of that chapter can be skipped on first reading because most of the later examples are self-explanatory.

In Chapter 4 we demonstrate how an AG for a programming language (Pascal) is systematically developed. It contains important know-how for AG development, and it shows how the specification tools are used adequately. AGs for similar languages may be directly derived from the Pascal-AG in Appendix A.

Chapter 5 describes the performance improvements applied by the GAG-System to the generated compilers, and their results.

Aspects for comparison of the GAG-System with other compiler generating systems are found mainly in Chapters 2 and 5.
Acknowledgements: We are indebted to Prof. G. Goos who initiated the project, established a fruitful working environment within his group, and who made valuable remarks on early drafts of this book. We thank Dr. J. Schauer, Dr. J. Röhrich, and the Ada Implementation Group Karlsruhe for many helpful discussions and for acting as pilot users. We thank Prof. W. M. Waite for the translation of the ALADIN definition. A significant part of the development of the GAG-System was supported by the Deutsche Forschungsgemeinschaft.
Chapter 2: The Compiler Generator GAG

2.1 Introduction
The GAG-System (Generator based on Attributed Grammars) generates a compiler for a language defined by an attributed grammar (AG). The underlying concepts were first outlined in [Ka76]. The central part of the system is the analysis of attribute dependencies and the generation of the attribute evaluation phase. Interfaces are defined for scanner and parser.

The system processes attributed grammars of type OAG, which is a large subclass of well-defined AGs containing all classes of pass-oriented AGs. For each AG individual visit-sequences are computed which control the attribute evaluation in the generated compiler. Thus the structure tree is traversed in an efficient (non pass-oriented) manner. This method was presented in [Ka80].

The input language ALADIN (cf. Chapter 3) defines a notation for AGs suitable for complete descriptions of the static properties of languages. It is based on typed attributes and includes a powerful and flexible type concept. Expressions and recursive functions over attribute types allow precise and complete definitions of attribute values.
2.2 Attributed Grammars

AGs were introduced by Knuth [Kn68] as a method for defining semantics of programming languages. This section gives a brief formal introduction to AGs.
An AG is a 5-tuple (G, A, Val, R, C) based on a reduced context-free grammar G = (N, T, P, S), given by the sets of nonterminals, terminals, and productions, and the start symbol. The AG associates a set A(X) of attributes with each symbol X in the vocabulary V = N ∪ T of G. We write X.a to indicate that attribute a is an element of A(X). The sets of attributes are disjoint for different symbols. An attribute a can take on any value of a domain Val(a). It represents a specific (context-sensitive) property of the symbol X.

Each node in the structure tree of a sentence in L(G) represents a symbol X in the vocabulary of G. For each attribute of A(X) an attribute value is associated with such a node. These values are defined by attribute rules R(p) associated with a production p in P: X0 ::= X1 ... Xn. Each attribute rule Xi.a := f(Xj.b, ..., Xk.c) defines an attribute Xi.a in terms of attributes Xj.b, ..., Xk.c of symbols in the same production.

In addition to attribute rules the AG associates context conditions C(p) with each production p. A context condition is a relation g(Xi.a, ..., Xj.b) over attributes of symbols occurring in p. A context condition of C(p) is fulfilled for a tree node derived by the production p if the relation holds for the corresponding attribute values. A sentence of L(G) is a sentence of L(AG) if and only if no context condition is violated. We subsume attribute rules and context conditions under the general term semantic rules.
The following requirements ensure that each attribute value of any tree node is defined by exactly one attribute rule: An attribute X.a is called synthesized if there is an attribute rule X.a := f(...) associated with a production of the form X ::= w. If such an attribute rule is associated with a production Y ::= uXv the attribute is called inherited. Let AS(X) and AI(X) be the sets of synthesized and inherited attributes of X. Then AS(X) and AI(X) must be disjoint and AS(X) ∪ AI(X) = A(X). For a production p: X0 ::= X1 ... Xn the set of defining occurrences of attributes is

AS(X0) ∪ AI(X1) ∪ AI(X2) ∪ ... ∪ AI(Xn).

The set of applied occurrences is

AI(X0) ∪ AS(X1) ∪ AS(X2) ∪ ... ∪ AS(Xn).

There is exactly one attribute rule for each defining occurrence in the set R(p) associated with p. R(p) contains no attribute rule for an applied occurrence.

An AG is well-defined if all attribute values of any structure tree are effectively computable. As a consequence the dependencies between attributes of a tree which are established by the attribute rules must be acyclic for the structure tree of any sentence in L(G). Unfortunately, exponential time is required to verify that an AG is well-defined [Ja75], and attribute evaluators which are applicable for any well-defined AG are rather inefficient. Hence subclasses of the class of well-defined AGs are considered for compiler construction. The partitionable AGs (in [Ka80]: "AGs which can be arranged orderly") form a large subclass of well-defined AGs including all classes based on pass-oriented attribute evaluation (e.g. n left-to-right depth-first passes [Bo76], or n alternating passes [JaW75]).
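The completeness requirement above can be illustrated with a minimal Python sketch. The grammar (a binary-number AG with attributes `value` and `scale`), the `Production` record, and the function names are assumptions made for this illustration only; they are not part of ALADIN or the GAG-System. Each attribute rule is represented only by the occurrence (i, a) it defines, since the check does not need the rule's right-hand side.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Production:
    name: str
    symbols: list  # [X0, X1, ..., Xn]; index 0 is the left-hand side
    rules: list    # occurrences (i, a), one per attribute rule Xi.a := f(...)


def classify(productions):
    """Derive AS(X) and AI(X) from the positions at which attributes are defined."""
    synthesized, inherited = {}, {}
    for p in productions:
        for i, a in p.rules:
            kind = synthesized if i == 0 else inherited
            kind.setdefault(p.symbols[i], set()).add(a)
    return synthesized, inherited


def occurrences(p, lhs_attrs, rhs_attrs):
    """Occurrences (i, a): attributes of X0 from lhs_attrs, of X1..Xn from rhs_attrs."""
    occ = {(0, a) for a in lhs_attrs.get(p.symbols[0], ())}
    for i, x in enumerate(p.symbols[1:], start=1):
        occ |= {(i, a) for a in rhs_attrs.get(x, ())}
    return occ


def check(productions):
    """Report violations of disjointness and of 'exactly one rule per defining occurrence'."""
    synthesized, inherited = classify(productions)
    errors = []
    for x in sorted(synthesized.keys() & inherited.keys()):
        for a in sorted(synthesized[x] & inherited[x]):
            errors.append(f"{x}.{a} is both synthesized and inherited")
    for p in productions:
        counts = Counter(p.rules)
        # Defining occurrences: AS(X0) ∪ AI(X1) ∪ ... ∪ AI(Xn)
        for occ in sorted(occurrences(p, synthesized, inherited)):
            if counts[occ] != 1:
                errors.append(f"{p.name}: {counts[occ]} rules for {occ}")
    return errors


# Hypothetical example grammar: Number ::= Digits, Digits ::= Digits Bit | Bit, Bit ::= '1' | '0'
GRAMMAR = [
    Production("p1", ["Number", "Digits"], [(0, "value"), (1, "scale")]),
    Production("p2", ["Digits", "Digits", "Bit"], [(0, "value"), (1, "scale"), (2, "scale")]),
    Production("p3", ["Digits", "Bit"], [(0, "value"), (1, "scale")]),
    Production("p4", ["Bit", "1"], [(0, "value")]),
    Production("p5", ["Bit", "0"], [(0, "value")]),
]

AS, AI = classify(GRAMMAR)
# Applied occurrences of p2: AI(X0) ∪ AS(X1) ∪ AS(X2)
applied_p2 = occurrences(GRAMMAR[1], AI, AS)
print(check(GRAMMAR))
```

Because an attribute defined at position 0 is synthesized by definition and one defined at a position i > 0 is inherited, a rule for an applied occurrence shows up in this sketch as a disjointness error. The sketch does not test well-definedness (absence of cycles).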
The definition of partitionable AGs is based on a partition of each A(X) into disjoint subsets Ak(X) such that the attribute dependencies allow an attribute X.a of a certain tree node to be evaluated before X.b of the same node if X.a is in Ai(X), X.b is in Aj(X), and i<j. A partitionable AG is called ordered (OAG) if the partitions are canonical in a certain sense. A complete formal definition and a decision algorithm for OAGs which requires polynomial time are given in [Ka80]. Such an algorithm is implemented in the GAG-System. It analyses attribute dependencies completely at compiler generation time and computes a non pass-oriented attribute evaluation order if the input AG is ordered.

A set of "visit-sequences" controls the attribute evaluation in the generated compiler for any structure tree of the defined language: The attribute evaluation phase in the generated compiler operates on a structure tree built by parsing the input program. Each inner node with its direct descendants represents an application of a production. The associated semantic rules are evaluated during a visit of such a node. A visit-sequence associated with a production defines the execution order for semantic rules and visits of surrounding nodes. Thus a visit-sequence consists of the following operations:
- evaluate an attribute rule
- check a context condition
- visit the i-th descendant
- leave to the ancestor node

Attribute evaluation at any node derived by a production p in any structure tree is controlled by the visit-sequence associated with p.
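The four operations can be sketched as a small interpreter in Python. Everything in this sketch is an assumption made for illustration: the toy language (statements `decl x` and `use x`, where a use may precede its declaration), the attribute names `decls` and `env`, and the hand-written visit-sequences, which the GAG-System would compute from the AG. Each node is visited twice: the first visit synthesizes `decls`, the second one receives the inherited `env` and checks the context condition.

```python
class Node:
    def __init__(self, prod, children=(), **attrs):
        self.prod, self.children, self.attrs = prod, list(children), dict(attrs)
        self.pos = 0  # resume point in the visit-sequence of self.prod


def run_visit(node, sequences, messages):
    """Execute the visit-sequence of node.prod from its resume point up to the next leave."""
    seq = sequences[node.prod]
    while True:
        op = seq[node.pos]
        node.pos += 1
        if op[0] == "evaluate":
            op[1](node)
        elif op[0] == "check":
            if not op[1](node):
                messages.append(op[2](node))
        elif op[0] == "visit":
            run_visit(node.children[op[1] - 1], sequences, messages)
        else:  # "leave": return to the ancestor node
            return


def set_attr(target, name, fn):
    """Build an evaluate operation for the rule X<target>.<name> := fn(node)."""
    def rule(node):
        holder = node if target == 0 else node.children[target - 1]
        assert name not in holder.attrs  # an attribute instance takes a single value
        holder.attrs[name] = fn(node)
    return ("evaluate", rule)


def child(n, i):
    return n.children[i - 1].attrs


SEQUENCES = {
    "program": [("visit", 1),
                set_attr(1, "env", lambda n: child(n, 1)["decls"]),
                ("visit", 1), ("leave",)],
    "seq": [("visit", 1), ("visit", 2),
            set_attr(0, "decls", lambda n: child(n, 1)["decls"] | child(n, 2)["decls"]),
            ("leave",),
            set_attr(1, "env", lambda n: n.attrs["env"]),
            set_attr(2, "env", lambda n: n.attrs["env"]),
            ("visit", 1), ("visit", 2), ("leave",)],
    "empty": [set_attr(0, "decls", lambda n: frozenset()), ("leave",), ("leave",)],
    "decl": [set_attr(0, "decls", lambda n: frozenset({n.attrs["name"]})),
             ("leave",), ("leave",)],
    "use": [set_attr(0, "decls", lambda n: frozenset()), ("leave",),
            ("check", lambda n: n.attrs["name"] in n.attrs["env"],
             lambda n: f"{n.attrs['name']} is not declared"),
            ("leave",)],
}


def stmts(*items):
    """Build the right-recursive statement list Stmts ::= Stmt Stmts | empty."""
    tree = Node("empty")
    for item in reversed(items):
        tree = Node("seq", [item, tree])
    return tree


program = Node("program", [stmts(Node("use", name="x"),
                                 Node("decl", name="x"),
                                 Node("use", name="y"))])
messages = []
run_visit(program, SEQUENCES, messages)
print(messages)
```

The use of `x` before its declaration is accepted because the environment is only distributed during the second visit, after all declarations have been collected; a single left-to-right pass could not express this.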
The computation of the visit-sequences is based on the partitioned attribute sets. The sets Ak(X) alternately contain only inherited (AIk(X)) or only synthesized (ASk(X)) attributes: AI1(X), AS1(X), AI2(X), AS2(X), .... The precondition for the n-th visit to any node labelled X is the evaluation of all attributes in AIn(X). The postcondition of the visit is the evaluation of all attributes in ASn(X). It holds after the corresponding leave-operation. Thus the attribute partitions are to be considered as context independent interfaces between visit-sequences applied to adjacent tree nodes. The partitions are computed such that no direct or indirect attribute dependency violates that partial evaluation order.

2.3 System Overview
The requirements a compiler generator must meet are manifold: Of course the main task of such a system is the generation of an attribute evaluator specified by an AG. As the system is intended for application in realistic compiler projects, it must support AG development by providing helpful information about the AG such as attribute dependencies, errors, and other characteristics. The generated attribute evaluator must be embedded in a compiler environment: As the structure tree is the central data structure for the attribute evaluation, the system generates a module for program tree construction; it provides interfaces to compiler modules not generated by the system (scanner, parser, and synthesis phase if not specified by the AG); it offers facilities for testing and for measurements of space and runtime to be integrated into the attribute evaluator. The user can easily control these facilities by specifications added to the AG.

The interface to the parser generator used here [De77] can be adapted to the requirements of comparable systems. Applications for usual high level programming languages are supported by standard solutions for scanner and protocol generator. An example for a protocol of a generated compiler is included in Appendix B, Sect. B.4. The error messages inserted in the source text are defined within the underlying AG (as can be seen in Appendix A).

The functions of the GAG-System are shown in Fig. 2.1.
The first phase of the GAG-System is a usual compiler task: The ALADIN-text is scanned and checked for syntactical correctness (by a generated LALR(1)-parser). The output of the parser, a sequence of symbols and reductions, is used to build up the structure tree. The semantic analysis is done by using the same techniques as applied in the generated compiler: the structure tree is traversed for attribute evaluation controlled by visit-sequences (VS). Since the scope rules of ALADIN are very simple, the main task of semantic analysis is type checking.
[Fig. 2.1: Functions of the GAG-System. Labels in the original diagram: AG; ALADIN Analysis (Scanning, Parsing, Analysis of Static Semantics); Dependency Analysis, Computation of VS; Attribute Optimization; Compiler Generation (Parser Interface, VS transformation, PASCAL definitions, PASCAL code); Parser Generator; Invariant parts; Compiler.]
The last function of the analysis part has no direct correspondence to general compiler tasks. First the shorthand notations of ALADIN describing attribute transport are expanded. For each shorthand notation a set of equivalent attribute rules is generated. New attributes are introduced for "long range" attribute transport. This expansion is a precondition for the following analysis of attribute dependencies.

The dependency analysis checks whether the AG is ordered (or arranged orderly by additional dependencies) and computes visit-sequences using the method described in [Ka80]. These phases are discussed in Sect. 2.5.
The attribute optimization phase reduces the amount of storage needed for attribute instances in the generated compiler. Lifetime analysis based on the visit-sequences determines which attributes can be implemented by variables or stacks instead of tree node components. Optimization is performed completely at generation time. Its results are reflected by the generation phase.

The main task of the generation phase is the generation of a Pascal program: the compiler specified by the AG. On the one hand the types and objects (attribute values) of ALADIN are mapped to types and declarations in Pascal, on the other hand the attribute computations, conditions, and functions are translated into Pascal statements, functions, and procedures. A further task of the generation phase is the transformation of the visit-sequences into an optimized table which controls the tree-traversal in the generated compiler. In order to use an externally generated parser the generation phase extracts a description of the syntax which serves as input to a parser generating system.

As mentioned above the development of an AG is supported by different kinds of system output: A protocol is generated showing which GAG-passes have been executed and merging the messages (information, warnings, errors) with the input text. A cross-reference listing and a listing of the dependency paths between selected attributes are available. An example of the detailed information produced by the system is shown in Appendix B. Sect. B.1 shows the header of the protocol resulting from a complete execution of GAG.

The implementation of the GAG-System reflects the following program qualities:

a) Modularity. The main phases of the GAG-System form separate programs communicating by files. This principle yields programs of manageable sizes for development and maintenance. Each phase is decomposed into modules according to its logical structure.

b) Portability. The system is implemented in Pascal as described in [BSI82]. Parts using implementation dependent constructs, e.g. for time measurements, are enclosed within special commands and may be omitted using the preprocessor PROPP [Jan82].

c) Fault tolerance. The system is implemented in a defensive programming style. Assumptions for interfaces between modules and phases are verified on both sides. Sophisticated error recovery is implemented in all phases. Hence the system can handle numerous input and system errors.

The decision for Pascal as implementation language was determined by its high portability. Other languages which give better support for the hierarchical development of large systems were dropped due to their lack of portability.
2.4 The Generated Compiler
According to the task of a compiler the generated attribute evaluator has the following structure:

+---------+--------+-------------------+----------------------+--------+
| Scanner | Parser | Tree-Construction | Attribute Evaluation | Output |
+---------+--------+-------------------+----------------------+--------+

In standard applications the structure tree is built up by scanning and parsing an input stream representing a program of the defined language.
Besides the messages arising from violation of context conditions the result of the attribute evaluation is the attributed structure tree. The tree or only some "final" attributes can be written onto a file.

2.4.1 Parser Interface
The attribute evaluation is performed on the structure tree which must be built up according to the context-free syntax contained in any AG. Thus a description of the syntax is produced augmented by actions which control the construction of the abstract program tree. With each rule a call of a tree construction procedure is associated. Its parameters specify the type of the tree node to be generated and the number of subtrees. Additionally a list of those terminal symbols is produced for which a tree node has to be generated (the symbols which are introduced by a terminal definition in the AG).

Any top-down or bottom-up parser performing the specified actions can be integrated into the generated compiler. We use a parser generating system based on the LALR(1)-method [De77].

2.4.2 The Structure Tree
The internal representation of a program is the structure tree. It consists of nodes for nonterminals and terminals and of special nodes for repetition and optional clauses. The attribute evaluation needs certain fields in the node records for nonterminals:
- a pointer to the subtrees,
- an indicator for the applied context-free rule,
- an indicator for the symbol on the lefthand side,
- attribute values of that symbol,
- a position referring to the source text.
The nodes for terminals only contain the symbol indicator and fields for attribute values.

The tree is built up by a set of predefined procedures for allocating and linking the nodes. These procedures are called by the (generated) parser.
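The node record and the tree construction procedures can be sketched in Python as follows. The names `TreeNode`, `TreeBuilder`, `terminal`, and `reduce`, the use of a node stack, and the example rule `Expr ::= Ident '+' Ident` are assumptions for illustration; the generated compilers are Pascal programs and their actual procedure interfaces are not given in this section. For brevity one record type serves both node kinds, with the rule and position fields left empty for terminals.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TreeNode:
    symbol: str                       # indicator for the symbol (left-hand side for nonterminals)
    rule: Optional[str] = None        # indicator for the applied rule; None for terminal nodes
    subtrees: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)
    position: Optional[tuple] = None  # (line, column) in the source text


class TreeBuilder:
    """Tree construction procedures as a bottom-up parser might call them (a sketch)."""

    def __init__(self):
        self.stack = []  # roots of the subtrees built so far

    def terminal(self, symbol, **attributes):
        """Allocate a node for a terminal that carries attribute values."""
        self.stack.append(TreeNode(symbol, attributes=attributes))

    def reduce(self, rule, symbol, subtree_count, position):
        """Allocate a nonterminal node and link the topmost subtree_count subtrees to it."""
        split = len(self.stack) - subtree_count
        subtrees = self.stack[split:]
        del self.stack[split:]
        self.stack.append(TreeNode(symbol, rule, subtrees, position=position))

    def root(self):
        assert len(self.stack) == 1, "parse incomplete"
        return self.stack[0]


# Actions a parser could perform for the input "a + b"; the '+' token gets no tree node.
builder = TreeBuilder()
builder.terminal("Ident", name="a")
builder.terminal("Ident", name="b")
builder.reduce("add", "Expr", 2, position=(1, 1))
tree = builder.root()
print(tree.rule, [t.attributes["name"] for t in tree.subtrees])
```

Computing the split index from the stack length, rather than slicing with a negative count, keeps a reduction with zero subtrees (an empty production) from consuming the whole stack.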
2.4.3 Attribute Evaluation

Attribute evaluation is implemented by a stack automaton (as described in Sect. 6.3 of [Ka80]). The transition table contains the encoded visit-sequences. The state of the automaton is a pair made up of the currently visited node in the structure tree and an element of the transition table. A visit of a descendant node is executed by pushing the current state and computing the new state. For an ancestor visit the new state is popped from the stack. Sequences of semantic rules to be executed successively (not containing a visit) are considered as a basic operation of the algorithm. By that means space for the transition table and runtime for control operations are reduced to a negligible size.

Context conditions check attribute values. If a condition fails, an error message is written and, in general, attribute evaluation continues. In Chapter 4 we show how semantic error recovery can be integrated in an AG. The violation of certain implicit conditions - such as the call of "HEAD" with an empty list - may result in an undefined value. Only in that case does attribute evaluation stop.
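A minimal Python sketch of such a stack automaton is given below. The table layout, the example productions `fork` and `leaf`, and the attributes `depth` and `total` are assumptions for illustration and do not reproduce the encoding used in [Ka80] or in the generated Pascal code. The example needs only one visit per node, so the entry point depends on the production alone; with several visits per node it would have to depend on the visit number as well.

```python
class Node:
    def __init__(self, prod, children=(), **attrs):
        self.prod, self.children, self.attrs = prod, list(children), dict(attrs)


# Blocks of semantic rules without an intervening visit form one basic operation.
def fork_down(n):  # child.depth := depth + 1 for both descendants
    for c in n.children:
        c.attrs["depth"] = n.attrs["depth"] + 1


def fork_up(n):    # total := total of the first + total of the second descendant
    n.attrs["total"] = sum(c.attrs["total"] for c in n.children)


def leaf_rules(n):  # total := value * depth
    n.attrs["total"] = n.attrs["value"] * n.attrs["depth"]


# Transition table: the visit-sequences of all productions, stored one after another.
TABLE = [
    ("rules", fork_down),   # 0: start of "fork"
    ("visit", 1),           # 1
    ("visit", 2),           # 2
    ("rules", fork_up),     # 3
    ("leave",),             # 4
    ("rules", leaf_rules),  # 5: start of "leaf"
    ("leave",),             # 6
]
ENTRY = {"fork": 0, "leaf": 5}


def evaluate(root):
    """Run the automaton; a state is the pair (visited node, index into TABLE)."""
    stack = []
    node, pc = root, ENTRY[root.prod]
    while True:
        op = TABLE[pc]
        if op[0] == "rules":
            op[1](node)
            pc += 1
        elif op[0] == "visit":
            stack.append((node, pc + 1))    # push the state to resume after the visit
            node = node.children[op[1] - 1]
            pc = ENTRY[node.prod]           # compute the new state
        else:                               # "leave": ancestor visit
            if not stack:
                return
            node, pc = stack.pop()


root = Node("fork", [Node("leaf", value=3),
                     Node("fork", [Node("leaf", value=1), Node("leaf", value=2)])],
            depth=0)
evaluate(root)
print(root.attrs["total"])
```

The inherited attribute `depth` of the root is supplied by the caller, standing in for the environment of the structure tree.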
2.4.4 Attribute Types

At first glance, the type concepts of ALADIN and Pascal are rather similar: basic types, subrange, set, and structured types. The main differences are: Instead of the Pascal variant concept ALADIN has UNION types (as in ALGOL68), which are better suited for an applicative language, and ALADIN contains a LIST concept for sequences of equally typed values with powerful list operations. On the one hand there are no pointer types in ALADIN; on the other hand (almost) any recursively defined type is allowed in ALADIN.

The basis of the type mapping of ALADIN into Pascal is the fact that attributes never change their values. So a mapping is chosen which avoids copies of large data structures. Simple types are mapped to their corresponding Pascal type (scalars, subranges, etc.). All types of complex data structures (STRUCT, UNION, LISTOF) are mapped to Pascal types using pointers:
STRUCT, UNION: pointer to a record
LISTOF: the anchor of the list (a record of two pointers to the first and last element) and the elements (a record with the element value and a pointer to the next element).
Thus identical copies of complex objects are implemented by pointer copies: the object itself is allocated only once. Several complex objects which logically share some components share them physically as well.
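The LISTOF representation and the sharing it permits can be sketched in Python, with object references standing in for Pascal pointers. The names `Element`, `ListAnchor`, `cons`, `head`, and `items` are assumptions for illustration; the list operations actually offered by ALADIN are defined in Chapter 3.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Element:
    value: Any
    next: Optional["Element"] = None  # pointer to the next element


@dataclass
class ListAnchor:
    first: Optional[Element] = None   # pointer to the first element
    last: Optional[Element] = None    # pointer to the last element


def cons(value, tail):
    """Value followed by tail. The elements of tail are shared, not copied."""
    elem = Element(value, tail.first)
    return ListAnchor(elem, tail.last if tail.last is not None else elem)


def head(lst):
    if lst.first is None:
        raise ValueError("HEAD of an empty list is undefined")
    return lst.first.value


def items(lst):
    """Collect the element values from first to last."""
    out, elem = [], lst.first
    while elem is not None:
        out.append(elem.value)
        if elem is lst.last:
            break
        elem = elem.next
    return out


base = cons(2, cons(3, ListAnchor()))
a = cons(1, base)   # [1, 2, 3]
b = cons(0, base)   # [0, 2, 3], sharing the elements of base with a
copy_of_a = a       # an identical copy is a pointer copy
print(items(a), items(b), a.first.next is b.first.next)
```

Sharing is safe here only because no operation modifies an existing element, which mirrors the observation that attributes never change their values.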
2.4.5 I/O of the Generated Compiler

In standard applications the input to the compiler is the textual form of the program to be translated. The output of the generated attribute evaluator may be the attributed structure tree or (only) some final attributes. Thus the system provides types and procedures for binary I/O which can be used for several purposes:

- The AG specifies a translation into a target (or intermediate) language and the interface is a binary coded file.

- The specification of a language consists of several AGs, each defining a pass of the compilation: The partially attributed program tree is written onto a file and read by the next pass to continue attribute evaluation.

- The specified language comprises facilities for separate compilation: The compiler needs a connection to a library file containing information of units which have already been compiled in the form of attribute values or program trees.
To fulfil all these requirements, the generated program is provided with definitions of intermediate representation for all attribute types and for the program tree, and with generated procedures to read and write these data. In addition procedures for textual output are generated which are helpful to produce a readable form of attribute-values for testing the AG i.e. for validation of the specification by appropriate input.
2.5 Dependency Analysis
The GAG-System analyses attribute dependencies completely at compiler generation time. If the input AG is ordered, "visit-sequences" are constructed which control attribute evaluation in the generated compiler (see Sect. 2.2).

The central data structure for the analysis of attribute dependencies is a collection of dependency graphs: for each symbol a graph over the associated attributes, and for each production a graph over the attribute occurrences of the symbols in the rule. The graphs are iteratively updated: The direct dependencies of the attribute rules are entered in the rule graphs. Any path between two attributes of one symbol occurrence is entered in the corresponding symbol graph and induced to all occurrences of that symbol in the rule graphs. Iteration terminates when all graphs remain invariant. The symbol graphs represent a superset of direct and indirect dependencies between two attributes of any symbol occurrence in any attributed structure tree (cf. [Ka80]).

For each symbol graph the partitions (as described above) are computed and represented by additional ordering dependencies. If an attribute can be assigned to more than one partition, the later evaluated partition is chosen. By that means the lifetime of attribute values is shortened ("lazy attribute evaluator" in [Po79], cf. Sect. 2.5). Appendix B, Sect. B.2 contains examples for dependency graphs.

The partially ordered rule graphs augmented by the ordering dependencies are linearized and converted to visit-sequences. Any freedom in the partial order is used for optimizing strategies:
- Attributes are evaluated immediately before they are needed as precondition for a visit, reducing the lifetime of their values.
- A descendant is visited if no descendant left to it can be visited, and the ancestor is visited if no descendant can be visited.

These strategies complete attribute evaluation for leftmost subtrees as early as possible and reduce lifetime overlaps of different instances of one attribute.

The algorithm described above can be parameterized in several ways in order to produce visit-sequences which fit different design criteria for the generated compiler: Attribute evaluation can start either
- after the structure tree is completely built, or
- interleaved with bottom-up tree construction, or
- interleaved with top-down tree construction.

Instead of the permissive visit-strategy for ordered AGs a pass-oriented strategy (n left-to-right or n alternating passes) may be chosen. In that case the attribute partitions are computed according to the rules for LAGs [Bo76] or AAGs [JAW75], respectively.

Visit-sequences for two rules of the Pascal-AG are given in Appendix B, Sect. B.2. They show the interdependencies between attributes of different symbols, indicated by the column named "precondition".

All AGs processed up to now were OAG or at least aOAG: The system establishes the OAG-property by automatically adding attribute dependencies for originally independent attributes (cf. "orderly arranged AGs" in [Ka80]). It is also possible to fix the evaluation order for independent attributes explicitly (by adding dependency specifications to the AG).

A precondition for dependency analysis is the expansion of ALADIN constructs which abbreviate attribute rules (TRANSFER, INCLUDING, and CONSTITUENT, cf. Chapter 3). They impose attribute dependencies which are made explicit by adding new attributes and attribute rules with an equivalent meaning: A TRANSFER rule is replaced by a set of attribute rules according to its definition. The INCLUDING and the CONSTITUENT construct define "remote attribute access".
They are expanded by creating a new "transport attribute" associated to each symbol which can occur in any structure tree between the application of the construct and the referred symbol instance. Suitable attribute rules are generated for these attributes. The attributes and attribute rules resulting from the expansion of INCLUDINGs are removed after the dependency analysis because a procedural implementation of the INCLUDING is used which searches the "included" attributes in the tree and therefore needs no "transport attributes". The remaining generated attributes and attribute rules are handled by the optimizer as well as all others. In general most of them will be eliminated. Similarly explicit and shortened transfer rules are eliminated by the optimizer if possible.

Characteristics of an attributed grammar illustrating its size, complexity, and the visit strategy necessary to evaluate it are summarized by the system at the end of the dependency analysis. Appendix B, Sect. B.2, contains the characteristics of the AG for Pascal.
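The iterative update of rule graphs and symbol graphs described at the beginning of this section can be sketched as follows. This is a simplified Python illustration, not the algorithm of [Ka80] in full: the computation of partitions and visit-sequences is left out, an edge (u, v) is assumed to mean "u is needed to compute v", and the names analyse, RULES and DIRECT together with the two toy productions are hypothetical.

```python
def transitive_closure(edges):
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure
               for (c, d) in closure if b == c} - closure
        if not new:
            return closure
        closure |= new

def analyse(rules, direct):
    """rules: production name -> list of symbols (index 0 = left-hand side).
    direct: production name -> set of edges between attribute occurrences,
    each occurrence written as (symbol position, attribute name)."""
    rule_graph = {r: set(direct.get(r, ())) for r in rules}
    symbol_graph = {s: set() for symbols in rules.values() for s in symbols}
    changed = True
    while changed:                 # iterate until all graphs stay invariant
        changed = False
        # Paths between attributes of one symbol occurrence go into the
        # corresponding symbol graph.
        for r, symbols in rules.items():
            for (i, a), (j, b) in transitive_closure(rule_graph[r]):
                if i == j and (a, b) not in symbol_graph[symbols[i]]:
                    symbol_graph[symbols[i]].add((a, b))
                    changed = True
        # Symbol graph edges are induced to all occurrences of the symbol.
        for r, symbols in rules.items():
            for i, s in enumerate(symbols):
                for a, b in symbol_graph[s]:
                    edge = ((i, a), (i, b))
                    if edge not in rule_graph[r]:
                        rule_graph[r].add(edge)
                        changed = True
    return symbol_graph, rule_graph

# Toy AG: p1: Z ::= A   and   p2: A ::= 'a'
RULES = {"p1": ["Z", "A"], "p2": ["A"]}
DIRECT = {
    "p1": {((1, "s1"), (1, "i2")), ((1, "s2"), (0, "s"))},
    "p2": {((0, "i1"), (0, "s1")), ((0, "i2"), (0, "s2"))},
}
SYMBOL_GRAPH, RULE_GRAPH = analyse(RULES, DIRECT)
```

In this toy grammar the dependency of A.i2 on A.s1, which is only visible in p1, is induced into p2, so the symbol graph of A finally orders all four attributes as i1, s1, i2, s2.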
2.6 Attribute Optimization
A straightforward implementation of each attribute allocates a component of the tree node record. The attribute optimizer of GAG finds out those attributes which can be implemented by global variables or stacks in order to reduce space for tree nodes. Together with an appropriate mapping of ALADIN types to Pascal types (see Sect. 2.4.4) we achieved a rather efficient storage layout.

The criteria for determination of global attributes are developed in detail in [As79] and [AKZ81]. They are presented here in Chapter 5. The basic idea is the concept of a lifetime: The lifetime of an attribute instance can be measured in terms of visit-sequence elements, i.e. semantic rules and visits. It starts with the assignment of a value to the instance and ends with any reachable "last" applied occurrence. The VS determines the point at which an occurrence is used. If such an applied occurrence does not exist in some rule context, the end of its lifetime in the associated VS is the point where this VS is reached.

If certain criteria on the lifetimes hold, an attribute can be implemented by a global variable or a global stack. Thus the sizes of the tree node variants are reduced, i.e. the size of the tree itself is reduced. Moreover, different attributes (of different symbols) may share the same variable or stack if similar conditions hold for all of their instances.

The result of this optimization is demonstrated for different AGs in Chapter 5.

2.7 Compiler Generation
The main task of the generation phase is the translation of the semantic rules of ALADIN into Pascal. This translation, from one high level language into another, includes some problems known from the code generation of usual compilers. As far as the expressions are concerned, Pascal must be seen as a language of lower level than ALADIN. Most of the ALADIN expressions must be broken into several statements using temporary variables for intermediate results. Many operators of ALADIN are translated into procedure- and function-calls in Pascal.

The complexity of the translation task can be explained by the comparison with usual code generation.
- The generated program must not violate the strong type rules of Pascal, an unnecessary condition since a complete type check is performed on the ALADIN level. On the other hand this requirement facilitated debugging of the translation phase: Most of the errors were found by compiling the generated program.

- The generated Pascal program should be readable. This is achieved by formatting the text, generating names (for variables and procedures etc.) which are similar to the corresponding ALADIN-names, and linearizing the expressions only as far as necessary.

- Intermediate results of the expression evaluation are stored in auxiliary variables. The management of these variables during the translation corresponds to the usual register-allocation technique. Yet the number of auxiliary variables is not limited, because the necessary number of declarations is generated.
The implementation of the translation phase uses the same technique as the semantic analysis. The tree is traversed for attribute evaluation and generation of output controlled by visit-sequences.
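The handling of intermediate results described in this section can be illustrated by a small Python sketch that breaks a nested expression into a sequence of assignment statements with auxiliary variables. It is a simplification: the expression format (nested tuples), the temporary names t1, t2, ... and the function name linearize are assumptions, and unlike GAG the sketch linearizes every subexpression rather than only as far as necessary.

```python
def linearize(expr):
    """Translate a nested binary expression into assignment statements.

    expr is either an operand name (str) or a tuple (operator, left, right).
    Returns the auxiliary variables to declare, the statements, and the
    name holding the final result.
    """
    statements = []
    counter = 0

    def walk(e):
        nonlocal counter
        if isinstance(e, str):
            return e
        op, left, right = e
        l = walk(left)
        r = walk(right)
        counter += 1                 # no limit: a declaration is generated
        name = f"t{counter}"
        statements.append(f"{name} := {l} {op} {r};")
        return name

    result = walk(expr)
    declarations = [f"t{i}" for i in range(1, counter + 1)]
    return declarations, statements, result

DECLS, STATEMENTS, RESULT = linearize(("+", "a", ("*", "b", "c")))
```

Because the list of declarations is produced together with the statements, the number of auxiliary variables never has to be bounded in advance, which is the difference to register allocation noted above.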
2.8 Experience

2.8.1 The Processing of Attributed Grammars
The GAG-System has been in use since 1980 ([KZ80]). Its application ranges from small "toy"-languages comparable to ASPLE [Marc76] to large and complex languages like Ada [Ada82], LIS [Sa80] and PEARL [DIN80]. The use of the GAG-System in real life projects has shown that it is a valuable tool for language definition, compiler specification, and compiler generation as well.

The AGs for Ada and PEARL can be considered as worst case tests for the GAG-System in several aspects: input length, number of language concepts and constructs, number of attributes and semantic rules (cf. Table 2.1). Early applications of the system resulted in improvements of the input language ALADIN and the implementation of the system as well, so that the applicability of the GAG-System and its input language now are proven.

The runtimes for the analysis phase were surprisingly low, especially those for dependency analysis which was considered to be critical in time. In [Ka80] it is shown that the complexity of that algorithm is bounded by the 4th power of the maximum number of attributes associated with a symbol, the 3rd power of the maximum length of a syntactic rule, and the length of the context-free grammar.

The runtime for the attribute optimization turned out to be more critical. The reasons are that for each attribute all occurrences must be considered to compute the lifetimes, and then the lifetimes of all attributes must be compared to find the overlays. The runtime increases with the number of times any attribute occurrence is referenced in the AG, i.e. with the complexity of the semantic rules. But for all applications it was less than the time needed for the whole analysis phase (incl. dependency analysis).

The time needed to generate the attribute evaluator and its tables increases linearly with the size of the input AG, so that it is the least critical phase.

Table 2.1 gives the total runtimes and those for dependency analysis and attribute optimization. The times of the rest of the analysis phase and of the whole generation phase are typically of equal size.
                                                PASCAL     ADA     LIS   PEARL

input lines                                       2997   14922   12615   13644
context-free rules                                 143     244     301     354
total number of attribute occurrences             1953    3567    5199    5156
number of input semantic rules                     276     718     651     758
class of AG *)                                     OAG     OAG    aOAG    aOAG
max number of visits to a symbol                     4       3       2       4
total (GAG) runtime (sec)                          124     435     385     409
runtime only for dependency analysis                20      32     113      63
runtime only for attribute optimization             15     124      74     140
lines of the generated parts of the compiler     10699   74249   52751   37277

*) OAG = ordered AG, aOAG = arranged orderly AG

Table 2.1
The storage needed to generate the evaluator on our SIEMENS 7.760 increased to 4 MB (Ada), because the whole structure tree is built up and decorated with attributes needed for the generation. But there is a "natural" way to split up the tree (each semantic rule, each function body can be generated without any regard to other ones), so that the storage can be reduced (if necessary) without a great loss of time efficiency.
2.8.2 Generated Compiler Front-ends
Apart from several smaller test grammars, compiler front-ends have been generated and executed for Pascal, Ada, and LIS. Table 2.1 shows the characteristics of the processed AGs and of the generated code. Each generated evaluator consists of the generated part and an invariant frame of about 3000 lines. For bigger AGs the ratio of output text to input text does not exceed a factor of 4, which measures the mapping from ALADIN to Pascal.

The code length of the generated program varies depending on the test facilities included. For first experiments it is useful to generate a compiler with exhaustive test facilities (which is an optional feature of GAG) for attribute values, tree structure, and the evaluator state. Thus the AG specification can be easily validated. Later on, one generates a pure evaluator without these tests, which will reduce the code to 70%. Since all checks of attribute values (whether they range within their domains) are done during analysis of the AG, runtime checks generated by the Pascal compiler are not needed. Thus the executable code may again be reduced to 70%, if code for these checks is omitted. The program which is finally generated then occupies only half of the code of the first test iteration. As an example, the code length of our generated Pascal front-end in its final version is 150 KByte, where the first version produced 319 KByte code.

Various experiments with the AG for Pascal proved the system to be able to produce a maintainable, safe, and rather efficient compiler front-end (for details see Chapters 4 and 5). Measurements of storage allocations of the generated program allowed us to improve the mapping of objects (attributes and nodes) (see Chapter 5). Our improvements were limited most of all by the storage allocation techniques of the Pascal compiler used to translate the generated program, and by its poor facilities for dynamic storage deallocation.
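The statement that the final program occupies about half of the code of the first test iteration follows from applying the two reductions to 70% in sequence. The short Python calculation below only restates the figures given above; the variable names are chosen for this illustration.

```python
without_test_facilities = 0.70   # pure evaluator without test facilities
without_runtime_checks = 0.70    # Pascal runtime checks omitted as well

# Both reductions together, relative to the first test version.
combined = without_test_facilities * without_runtime_checks

# Measured Pascal front-end: 150 KByte final version, 319 KByte first version.
measured = 150 / 319
```

The product of the two factors is 0.49, and the measured ratio for the Pascal front-end is about 0.47, so the two figures agree well.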
Chapter 3: ALADIN
A Language for Attributed Definitions
3.1 Introduction

ALADIN (A Language for Attributed DefiNitions) is based on the general principles of the AG method outlined in Chapter 1.
ALADIN is an applicative language (like LISP). Attributes describe static properties of language elements. Hence the concept of variables would violate the principle of attributed grammars and is not included in ALADIN. Since the attribute evaluation order can be computed automatically, there is no need for language elements in ALADIN which specify control flow. On the other hand ALADIN provides elaborated forms of structured expressions (e.g. case_clauses) and powerful operators (e.g. for operations on sequences).

ALADIN is a strongly typed language (like ALGOL68 or Pascal). The domain of each attribute is defined by its type. All expressions must be well-defined in the sense of the ALADIN type rules. A powerful type concept allows the construction of types suitable for the abstraction of the attributes' meaning. Based on elementary types for boolean, integer, string, and enumerated values, complex types can be formed by structuring, uniting and sequencing.

The applicative character of ALADIN has consequences on the type concept compared with usual procedural languages: ALADIN has no pointer types because they cannot be used without variables. Any recursively defined type is allowed if its domain contains finite values. Recursive computations can be defined with the help of functions.

ALADIN encourages the writing of readable specifications by naming all defined entities (e.g. symbols, attributes). In situations which occur frequently, the specifications of several attribute rules (each defining the identity of two attribute values) can be abbreviated.

The central part of an AG written in ALADIN is a set of rules, each consisting of a context-free syntax rule (in extended Backus-Naur-Form), some attribute rules, and context conditions. Example:

   RULE r123: statement ::= var_denot ':=' expr
   STATIC
      statement.at_labels := tp_labels ();
      CONDITION f_assignment_compatible (var_denot.at_type, f_reduce (expr.at_type))
      MESSAGE "INCOMPATIBLE TYPES IN ASSIGNMENT"
   END;

Furthermore, there are symbol definitions associating attributes with symbols.

   NONTERM statement : at_visible_labels : tp_labels INH,
                       at_labels         : tp_labels;

An AG is completed by the necessary constant, type, and function definitions.

   CONST c_canonical : 0;
   TYPE  tp_labels   : LISTOF INT;

   FUNCTION f_reduce (p_type : tp_type) tp_type :
      CASE p_type OF
         IS tp_subrange : THIS.s_host;
         IS tp_set      : IF THIS.s_base IS tp_subrange
                          THEN tp_set (c_canonical, THIS.s_packing, f_reduce (THIS.s_base))
                          ELSE p_type
                          FI;
         IS tp_proc     : IF EMPTY (THIS.s_params)
                          THEN f_reduce (THIS.s_result)
                          ELSE p_type
                          FI
         OUT p_type
      ESAC;
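For readers more familiar with procedural languages, the function f_reduce of the example can be paraphrased in Python. This is only a sketch of its meaning: the classes Subrange, SetType, Proc and Basic stand in for the alternatives of the union type tp_type, and the field name s_form for the first component of tp_set is an assumption, since the example does not show the type definitions.

```python
from dataclasses import dataclass

C_CANONICAL = 0  # corresponds to CONST c_canonical : 0

@dataclass(frozen=True)
class Basic:
    name: str

@dataclass(frozen=True)
class Subrange:
    s_host: object

@dataclass(frozen=True)
class SetType:
    s_form: int       # hypothetical name of the first component
    s_packing: bool
    s_base: object

@dataclass(frozen=True)
class Proc:
    s_params: tuple
    s_result: object

def f_reduce(p_type):
    # CASE p_type OF ... with one branch per united type
    if isinstance(p_type, Subrange):
        return p_type.s_host
    if isinstance(p_type, SetType):
        if isinstance(p_type.s_base, Subrange):
            return SetType(C_CANONICAL, p_type.s_packing,
                           f_reduce(p_type.s_base))
        return p_type
    if isinstance(p_type, Proc):
        if not p_type.s_params:          # EMPTY (THIS.s_params)
            return f_reduce(p_type.s_result)
        return p_type
    return p_type                        # OUT p_type

INT = Basic("INT")
```

The values are immutable (frozen), which mirrors the applicative character of ALADIN: f_reduce never modifies its argument but returns an existing or a newly constructed value.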
3.1.1 Notation

The description of ALADIN in the following sections is organized by language elements. Every section consists of syntactic definitions and specification of the context conditions and meaning of the language elements in verbal form. The productions will be given in BNF, extended by the following abbreviations:

   The form                    stands for       with the additional rule

   Y ::= u [ x ] v .           Y ::= u Z v .    Z ::= x / .
   Y ::= u s+ v .              Y ::= u Z v .    Z ::= Z s / s .
   Y ::= u s* v .              Y ::= u Z v .    Z ::= Z s / .
   Y ::= u ( x // t ) v .      Y ::= u Z v .    Z ::= Z t x / x .
   Y ::= u [ x // t ] v .      Y ::= u Z v .    Z ::= Z t x / x / .
In the above definitions x is a nonempty symbol sequence, s is a symbol or has the form (x), t is a terminal, Z is a nonterminal that does not occur elsewhere in the grammar, and u and v stand for arbitrary symbol sequences that may also contain the extensions explained here.

Nonterminals will be written as identifiers. Terminals are strings that are bounded by apostrophes in syntactic rules (e.g. 'RULE'). In an ALADIN sentence, terminals will be written without apostrophes (e.g. RULE). All identifiers appearing in a sentence must be distinct from ALADIN keywords.
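The five abbreviations of this section can be read as a rewriting into plain BNF with one additional rule each. The following Python sketch performs this rewriting on strings; the function name expand and the tags "opt", "plus", "star", "sep" and "optsep" are hypothetical names introduced only for this illustration.

```python
def expand(lhs, u, form, v, fresh="Z"):
    """Rewrite  lhs ::= u <form> v .  into two plain BNF rules.

    form is one of
      ("opt", x)        for [ x ]
      ("plus", s)       for s+
      ("star", s)       for s*
      ("sep", x, t)     for ( x // t )
      ("optsep", x, t)  for [ x // t ]
    fresh is a nonterminal that does not occur elsewhere in the grammar.
    """
    kind, *parts = form
    if kind == "opt":
        (x,) = parts
        extra = f"{fresh} ::= {x} / ."
    elif kind == "plus":
        (s,) = parts
        extra = f"{fresh} ::= {fresh} {s} / {s} ."
    elif kind == "star":
        (s,) = parts
        extra = f"{fresh} ::= {fresh} {s} / ."
    elif kind == "sep":
        x, t = parts
        extra = f"{fresh} ::= {fresh} {t} {x} / {x} ."
    elif kind == "optsep":
        x, t = parts
        extra = f"{fresh} ::= {fresh} {t} {x} / {x} / ."
    else:
        raise ValueError(kind)
    main = " ".join(p for p in (lhs, "::=", u, fresh, v, ".") if p)
    return [main, extra]

EXPANDED = expand("attributed_grammar", "", ("sep", "definition", "';'"), "",
                  fresh="definitions")
```

The example expands the rule attributed_grammar ::= ( definition // ';' ) of Sect. 3.2 into a left-recursive list of definitions separated by semicolons.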
3.1.2 Basic Symbols

identifier               ::= letter identifier_tail .

identifier_tail          ::= ['_'] letter_or_digit identifier_tail / .

letter_or_digit          ::= letter / digit .

integer                  ::= digit+ .

letter                   ::= 'a' / 'b' /.../ 'z' /
                             'A' / 'B' /.../ 'Z' .

digit                    ::= '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9' .

string                   ::= '"' single_character single_character+ '"' .

char                     ::= '"' single_character '"' / '""' .

symb                     ::= '''' single_literal_character* '''' .

single_character         ::= 'character_other_than_quote' / '""' .

single_literal_character ::= 'character_other_than_apostrophe' / '''''' .
Every basic symbol (keyword, identifier, integer, string) of an ALADIN sentence must be completely contained on one line. Two symbols, s1 and s2, following one another must be separated by one or more spaces, newlines, or comments, if s1 is a keyword or identifier and s2 is a keyword, identifier, or integer. Comments begin with % and end at the end of the line, or are delimited by (* and *); they are used only for documentation.

The ALADIN keywords are:

   'AND' 'CASE' 'CONDITION' 'CONST' 'CONSTITUENT' 'CONSTITUENTS'
   'DYNAMIC' 'ELSE' 'END' 'ESAC' 'FI' 'FUNCTION' 'IF' 'IN'
   'INCLUDING' 'INH' 'IS' 'KEY' 'LET' 'LISTOF' 'MESSAGE' 'MOD'
   'NONTERM' 'NOT' 'OF' 'OR' 'OUT' 'QUA' 'RULE' 'SETOF' 'STATIC'
   'STRUCT' 'SUBSET' 'SYNT' 'TERM' 'THEN' 'TRANSFER' 'TYPE' 'UNION'
   'WITH' 'XOR'

The following nonterminals will be used in the ALADIN grammar to clarify the various roles of identifiers:

   c_identifier    for constant identifiers
   tp_identifier   for type identifiers
   en_identifier   for enumeration identifiers
   s_identifier    for field identifiers
   at_identifier   for attribute identifiers
   r_identifier    for rule identifiers
   sx_identifier   for vocabulary symbol identifiers
   f_identifier    for function identifiers
   p_identifier    for parameter identifiers
   cs_identifier   for case selection identifiers
   l_identifier    for let formula identifiers
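The rules for identifiers and integers at the beginning of this section, together with the requirement that identifiers be distinct from keywords, can be checked with regular expressions. The Python sketch below assumes that the optional character inside an identifier is the underscore and that keywords are written in upper case exactly as listed; the names IDENTIFIER, INTEGER, KEYWORDS and is_identifier are chosen for this illustration.

```python
import re

# identifier ::= letter identifier_tail .
# identifier_tail ::= ['_'] letter_or_digit identifier_tail / .
IDENTIFIER = re.compile(r"[A-Za-z](?:_?[A-Za-z0-9])*")

# integer ::= digit+ .
INTEGER = re.compile(r"[0-9]+")

KEYWORDS = {
    "AND", "CASE", "CONDITION", "CONST", "CONSTITUENT", "CONSTITUENTS",
    "DYNAMIC", "ELSE", "END", "ESAC", "FI", "FUNCTION", "IF", "IN",
    "INCLUDING", "INH", "IS", "KEY", "LET", "LISTOF", "MESSAGE", "MOD",
    "NONTERM", "NOT", "OF", "OR", "OUT", "QUA", "RULE", "SETOF", "STATIC",
    "STRUCT", "SUBSET", "SYNT", "TERM", "THEN", "TRANSFER", "TYPE", "UNION",
    "WITH", "XOR",
}

def is_identifier(text):
    """True if text matches the identifier rule and is not a keyword."""
    return IDENTIFIER.fullmatch(text) is not None and text not in KEYWORDS

def is_integer(text):
    return INTEGER.fullmatch(text) is not None
```

Under this reading an underscore may only appear between two letters or digits, so names such as f_reduce are accepted while a trailing or doubled underscore is rejected.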
3.2 General Structure

attributed_grammar ::= ( definition // ';' ) .

definition ::= constant_definition / type_definition / symbol_definition / function_definition / rule .

Symbol definitions specify the correspondence between symbols and attributes. All attribute types appearing in a language definition are introduced by type definitions. Rules define the context-free syntax and the computation schemata for the attributes. Constant definitions introduce identifiers as abbreviations for expressions. Parameterized functions serve to formulate recursively computed expressions and to abstract computations for use in more than one context.

All sx_identifiers used in productions, all attribute types, constants and functions, must be specified by definitions. With the exception of constant definitions, the order of the definitions is meaningless.

An sx_identifier, tp_identifier, r_identifier, c_identifier, f_identifier or en_identifier is valid over the entire AG. They must all be pairwise distinct. They can, however, be the same as identifiers of the remaining classes because the latter do not have global scope. Local scopes are introduced by functions for p_identifiers, by case selections for cs_identifiers, by let formulas for l_identifiers, and by relational clauses. For nested scopes the usual hiding rules apply.

The set of defined attribute types must be admissible (see Sect. 3.4). The set of productions must define a reduced context-free grammar. Thus the grammar must not contain useless nonterminals. The set of semantic rules must satisfy the consistency and completeness conditions given in Sect. 3.7.1.

3.3 Constant Definitions

constant_definition ::= 'CONST' c_identifier ':' expression .
Constant definitions serve to introduce abbreviations for expressions. The expression must be evaluable without reference to rules or symbols, and the definition of a constant must precede its first use in constant definitions.

3.4 Type Definitions

type_definition ::= 'TYPE' tp_identifier ':' type_description .

type_description ::= enumeration_type / subrange_type / union_type / structure_type / set_type / list_type .

A type_definition introduces a new type. All defined types are regarded as distinct, even though their descriptions may be identical in structure or representation.
Type definitions must satisfy the rules given in Sections 3.4.1-3.4.6. Types may be defined in terms of other types. The rules in Sections 3.4.1-3.4.6 permit cyclic definition of types whose domains are not empty. Several elementary types are predefined (Sect. 3.11).

3.4.1 Enumeration Types

enumeration_type ::= '(' ( en_identifier // ',' ) ')' .

An enumeration_type defines a finite, ordered set. The order is specified by the sequence of the en_identifiers.
3.4.2 Subrange Types

subrange_type ::= '[' identifier ':' identifier ']' .

A subrange_type defines a subset of the domain of an elementary type (which must be either an enumeration_type or INT). Both identifiers are either en_identifiers of the same type or c_identifiers for values of the predefined type INT. The first identifier gives the lower bound and the second the upper bound of the domain. The lower bound must be less than or equal to the upper bound. Distinct subrange_types over the same elementary type may overlap.

3.4.3 Union Types

union_type ::= 'UNION' '(' ( tp_identifier // ',' ) ')' .
A union_type defines a domain of pairs: Every value consists of a value of one of the given united types combined with a type indicator. Type tests or special case clauses make reference to the type indicator.

We say that a union_type A "unites the type B" if B is a tp_identifier in the type_description of A or if a union_type C unites B and appears in the type_description of A. A union_type must not unite itself.

A union_type A must not ambiguously unite a type B. The following type definitions break this rule:

   TYPE tp_a: UNION(tp_b, tp_c);
   TYPE tp_c: UNION(tp_b, tp_d)

3.4.4 Structure Types

structure_type ::= 'STRUCT' '(' ( components // ',' ) ')' .

components ::= ( s_identifier // ',' ) ':' tp_identifier .
A structure_type defines a domain of n-tuples, in which n is the number of given s_identifiers. Every element of such an n-tuple is a value of the given type. The notation STRUCT(s_a, s_b: tp_t) is an abbreviation for STRUCT(s_a: tp_t, s_b: tp_t).

The s_identifiers define selector functions that may be applied to values of the structure_type. They must be unique for one structure_type. They may, however, overlap with s_identifiers of other structure_types.
We say that a type A "contains" a type B if
- A is a structure_type and
   - B is a component type of A, or
   - C is a component type of A that contains B, or
- A is a union_type and all united types contain B.

A type must not contain itself. This condition guarantees that values exist for every structure_type.

Examples of allowed types:
   TYPE tp_s: STRUCT(s_u: tp_u, s_f: tp_f);
   TYPE tp_u: UNION(tp_s, INT);
   TYPE tp_f: LISTOF tp_s.

Examples of illegal types:
   TYPE tp_s1: STRUCT(s_s1: tp_s1);
   TYPE tp_s2: STRUCT(s_u: tp_u);
   TYPE tp_s3: STRUCT(s_s2: tp_s2);
   TYPE tp_u: UNION(tp_s2, tp_s3)
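The "contains" relation defined above can be computed as a least fixed point. The Python sketch below is one possible reading of the definition, not the check implemented in GAG: type definitions are given as a dictionary, only the directly united types of a union are inspected, and the names contains_relation, self_containing, LEGAL and ILLEGAL are hypothetical.

```python
def contains_relation(types):
    """types: name -> (kind, [member types]) with kind "STRUCT", "UNION",
    or anything else for types that contain nothing."""
    contains = {name: set() for name in types}
    changed = True
    while changed:
        changed = False
        for name, (kind, *rest) in types.items():
            members = rest[0] if rest else []
            if kind == "STRUCT":
                # component types and everything they contain
                new = set(members)
                for m in members:
                    new |= contains.get(m, set())
            elif kind == "UNION" and members:
                # only what all united types contain
                new = set.intersection(
                    *(set(contains.get(m, set())) for m in members))
            else:
                new = set()
            if not new <= contains[name]:
                contains[name] |= new
                changed = True
    return contains

def self_containing(types):
    relation = contains_relation(types)
    return sorted(name for name in types if name in relation[name])

LEGAL = {
    "tp_s": ("STRUCT", ["tp_u", "tp_f"]),
    "tp_u": ("UNION", ["tp_s", "INT"]),
    "tp_f": ("LISTOF", ["tp_s"]),
    "INT": ("BASIC", []),
}
ILLEGAL = {
    "tp_s1": ("STRUCT", ["tp_s1"]),
    "tp_s2": ("STRUCT", ["tp_u"]),
    "tp_s3": ("STRUCT", ["tp_s2"]),
    "tp_u": ("UNION", ["tp_s2", "tp_s3"]),
}
```

For the allowed example no type contains itself, because the union tp_u offers the alternative INT and the list tp_f may be empty. For the illegal example the sketch reports tp_s1 and tp_u.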
3.4.5 Set Types

set_type ::= 'SETOF' tp_identifier .

The domain of a set_type is the powerset of the base type specified by the tp_identifier. The base type must be an enumeration_type or a subrange_type or CHAR, SYMB or BOOL. The normal set operations are defined over values of set_type.

3.4.6 List Types

list_type ::= 'LISTOF' tp_identifier [ 'KEY' s_identifier ] .
The domain of a list_type is the finite sequences of values from the base type specified by the tp_identifier (linear lists). The arrangement of the elements is specified by the computation of list valued operators (Sect. 3.9.1). Special standard functions are defined on values of list_type (e.g. HEAD, TAIL, see Sect. 3.11).

List_types must not be directly recursive. Example of a legal list_type:

   TYPE tp_l: LISTOF tp_s;
   TYPE tp_s: STRUCT(s_i: INT, s_l: tp_l)

If a key is specified for a list_type then the element type must be a structure_type having a component with the specified s_identifier. Keys only have meaning in conjunction with generic functions whose parameters are lists (Sect. 3.10.1).

3.5 Symbol and Attribute Definitions

symbol_definition ::= symbol_class ( sx_identifier // ',' ) ':' [ attribute_definition // ',' ] .

symbol_class ::= 'NONTERM' / 'TERM' .

attribute_definition ::= ( at_identifier // ',' ) ':' tp_identifier [ attribute_class ] /
                         at_identifier ':=' expression ':' tp_identifier .

attribute_class ::= 'INH' / 'SYNT' .
All nonterminals of the vocabulary of the context-free grammar on which the AG is based must be declared by symbol definitions, as must any terminal appearing in the semantic rules.

Attribute definitions attach attributes of the given types to symbols. If an attribute class (INH for inherited, SYNT for synthesized) is attached to an attribute definition, then it must agree with the attribute class derived from the semantic rules (see Sect. 3.7). If the value of an attribute is independent of the context of the symbol then it may be specified in the second form by means of an expression. If an attribute of a terminal is not inherited and not given by an expression then it is assumed that the attribute is defined outside the AG (for example by the representation of the symbol).

A symbol definition of the form
   NONTERM nt_x, nt_y: at_a: tp_t
is an abbreviation for
   NONTERM nt_x: at_a: tp_t;
   NONTERM nt_y: at_a: tp_t.

An attribute definition of the form at_a, at_b: tp_t INH is an abbreviation for at_a: tp_t INH, at_b: tp_t INH.

The at_identifiers of one symbol must be distinct.

Every nonterminal must appear on the left-hand side of at least one production.

3.6 Productions

rule ::= 'RULE' [ r_identifier ':' ] production
         [ subset_rules ] [ static_rules ] [ dynamic_rules ] 'END' .

production ::= sx_identifier '::=' syntactic_entity* .

syntactic_entity ::= syntactic_element / repetitive_clause / optional_clause .

syntactic_element ::= '(' syntactic_element+ ')' / sx_identifier / literal .

optional_clause ::= '[' syntactic_element+ ']' .

repetitive_clause ::= syntactic_element '*' /
                      syntactic_element '+' /
                      '(' syntactic_element+ '//' literal ')' /
                      '[' syntactic_element+ '//' literal ']' .

subset_rules  ::= 'SUBSET'  ( semantic_rule // ';' ) .
static_rules  ::= 'STATIC'  ( semantic_rule // ';' ) .
dynamic_rules ::= 'DYNAMIC' ( semantic_rule // ';' ) .
A rule can be named by an r_identifier. In special expressions (Sections 3.8.2 and 3.9.3) this identifier can be used to identify the rule.

The notation for productions is the extended BNF introduced in Sect. 3.1.1. In order to guarantee a unique relationship to the semantic rules, alternatives are not permitted and repetition clauses or optional clauses may not be nested. All terminals not specified by symbol definitions must be enclosed in apostrophes.

For documentation purposes the semantic rules may be classified as subset-, static-, and dynamic-rules.

3.7 Semantic Rules

semantic_rule ::= attribute_rule / context_condition .
Semantic rules specify the values of attributes for the symbols appearing in the corresponding productions (attribute specifications or attribute transfers), or they describe context conditions over these attributes.
3.7.1 Attribute Rules

attribute_rule ::= attribute_computation / attribute_transfer .

attribute_computation ::= attribute_name ':=' expression .
The rules for well-definedness of an AG classify attributes as either synthesized or inherited. For pragmatic reasons we introduce an additional class of attributes whose values are defined by expressions in the attribute definition (Sect. 3.5).

Suppose there exists a rule with production X ::= u that contains an attribute_computation for an attribute X.a. Attribute a is then a synthesized attribute of X. If there is a rule with a production Y ::= u X v containing an attribute_computation for an attribute X.a then a is an inherited attribute of X.

For every rule X0 ::= X1 ... Xn there must be exactly one attribute_computation for every synthesized attribute of X0 and for every inherited attribute of Xi (i=1,...,n). There may be no other attribute_computations.

3.7.2 Attribute Names

attribute_name ::= symbol_name '.' at_identifier .

symbol_name ::= sx_identifier [ '[' integer ']' ] .

An attribute name identifies an attribute of a symbol appearing in the corresponding production.

All symbol names occurring in an expression, other than remote attributes (Sect. 3.8.3), must identify symbols of the corresponding production.

If the same symbol appears several times in the production then these occurrences are distinguished by indexing the symbol name: X[1] denotes the first, X[i] the i-th occurrence of the symbol X in the production.
If the attribute name appears on the left side of an attribute computation, and the symbol name identifies a symbol within an optional clause of the production, the attribute computation will be applied only if this symbol is actually present in the structure tree.

If the attribute name appears on the left side of an attribute computation, and the symbol name identifies a symbol within a repetitive clause of the production, then two cases must be distinguished. Let tp_a be the type of the attribute and tp_e be the type of the expression on the right side of the attribute computation.

- If tp_e = tp_a, or if tp_e can be coerced to tp_a (see Sect. 3.9.1), then the value of the expression will be associated with the identified attribute of every node in the structure tree that is an instance of the clause.

- If tp_e = LISTOF tp_a then the attribute of each node will be associated with an element of the list. The sequence of associations is given by the correspondence between the nodes and the list elements. This specification implies the condition that the number of elements in the list equals the number of nodes generated by the repetitive clause.

If an attribute name appears within an expression then it denotes the value of the attribute it identifies. The value of an attribute whose symbol name identifies a symbol appearing in a repetitive clause is a list of the anonymous type LISTOF t, where t is the type of the attribute. The elements of the list are the attribute values of the nodes corresponding to the symbols in the actual context. Their sequence is the sequence of the nodes. The context of the attribute name must uniquely specify a list type to which this value may be coerced.

Attributes of symbols occurring within an optional clause may only be referred to within special conditional clauses (see Sect. 3.9.4).

3.7.3 Attribute Transfer

attribute_transfer ::= 'TRANSFER' [at_identifier // ','] [with_symbols] .
with_symbols ::= 'WITH' (symbol_name // ',') .

An attribute transfer is an abbreviation for several attribute computations, each of which defines the attribute value via attributes of the same name but different symbols. Consider a production Y::=w. The following conditions must hold for an attribute transfer of the form TRANSFER a1,...,am WITH X1,...,Xn (m,n>0):

- Every Xi (i=1,...,n) must appear in the string w.

- Every aj (j=1,...,m) must be an attribute of Y and of at least one of the symbols Xi.

- The attributes Y.aj and Xi.aj must have the same type. Y.aj and Xi.aj are both synthesized, or both inherited, or Y.aj is synthesized and Xi.aj is independent, or Y.aj is independent and Xi.aj is inherited.

The attribute transfer is then equivalent to a sequence of attribute definitions:

   Y.aj := Xi.aj   if and only if Y.aj is synthesized.
   Xi.aj := Y.aj   if and only if Xi.aj is inherited.

If the WITH part of an attribute transfer is absent then the enumeration of all symbols on the right-hand side is implied. If no attribute names are given then the enumeration of all attributes of Y that are also attributes of at least one symbol in the WITH part (or the entire right-hand side w) is implied.

3.7.4 Context Conditions

context_condition ::= 'CONDITION' expression ['MESSAGE' expression] .

A context condition specifies a restriction of the defined language: A sentence of the context-free language does not belong to the language defined by the AG if the logical expression given in some context condition yields false for the attribute values of the structure tree node. Such a situation can be described by an error report in the form of an expression of type string. A context condition can also be a part of an expression (see Sect. 3.9.5).

Some ALADIN elements imply a context condition: qualification, certain attribute computations for attributes of repeated symbols (Sect. 3.7.2), and the predefined functions HEAD, LAST, SELECT_BY_KEY. If the AG is intended to specify a practical compiler it is not advisable to rely on these implicit conditions: Attribute evaluation will stop in the case of violation because of undefined values.

3.8 Semantic Expressions

expression ::= formula / remote_attribute_access / let_clause .

Evaluation of an expression yields a value of a certain type. If necessary, it will be coerced to the type demanded by the context.

3.8.1 Formulas

formula ::= monadic_formula / formula operator monadic_formula / type_test / symbol_test .
monadic_formula ::= 'NOT' operand / '-' operand / operand .
operator ::= '+' / '-' / '*' / '/' / 'MOD' / 'DIV' / 'AND' / 'OR' / 'XOR' / 'IN' / '=' / '=/' / '<' / '<=' / '>' / '>=' .

Table 3.1: ALADIN Operators

Operator        Left      Right     Meaning                Result
                operand   operand                          type
=====================================================================
+               I         I         Addition               I
+               S(t)      S(t)      Set union              S(t)
+               L(t)      L(t)      List concatenation     L(t)
+               STRING    STRING    STRING concatenation   STRING
-               I         I         Subtraction            I
-               S(t)      S(t)      Set difference         S(t)
*               I         I         Multiplication         I
*               S(t)      S(t)      Set intersection       S(t)
/               I         I         Integer division       I
MOD             I         I         Remainder              I
AND, OR, XOR    B         B         Logical connectives    B
IN              t         S(t)      Set inclusion          B
=, =/           G         G         Equality test          B
<, <=, >, >=    SA        SA        Ordering test          B
<=, >=          S(t)      S(t)      Set inclusion          B
NOT                       B         Complement             B
-                         I         Negation               I
-                         S(t)      Set complement         S(t)

The abbreviations mean:

   I      Predefined type INT
   B      Predefined type BOOL
   S(t)   Set type with base type t
   L(t)   List type with element type t
   SA     Enumeration or subrange type
   G      Arbitrary type

Monadic operators have higher priority than dyadic operators in formulas. The priorities of all dyadic operators are the same. The meaning of an operator is determined by the types of its operands, as indicated in Table 3.1.

All arithmetic operators are defined over the type INT. Operands of type "subrange of INT" are implicitly widened to INT; the result always has type INT. For integer division, the magnitude of the result is the largest integer not larger than the quotient of the magnitudes of the operands. The sign of the result is the sign of the product of the operands. The result of a MOD b is r, such that a=m*b+r for some integer m and 0 <= r < (magnitude of b).

The values of the operands of relational operators (=, =/, <, >, <=, >=) will be balanced (see Sect. 3.9.1). Two records are equal when their components are equal. Two values of union type are equal when they are equal values of the same united type. Two lists are equal when they have the same length and their elements are equal and given in the same order.

3.8.2 Type and Symbol Tests

type_test ::= formula 'IS' tp_identifier .
symbol_test ::= symbol_identifier 'IS' r_identifier .

The result of a type test is TRUE when the value of the formula is of the specified type, otherwise it is FALSE.

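The rounding rules stated for integer division and MOD in Sect. 3.8.1 can be made concrete with a small sketch. It is written in Python rather than ALADIN, and the names aladin_div and aladin_mod are hypothetical; only the two rules given in the text are modelled. The quotient truncates toward zero while the remainder is never negative, so for a negative dividend the integer m in a=m*b+r need not be the result of the division.

```python
def aladin_div(a: int, b: int) -> int:
    """Integer division as described in Sect. 3.8.1 (hypothetical helper)."""
    # Magnitude: the largest integer not larger than |a| / |b|.
    magnitude = abs(a) // abs(b)
    # Sign: the sign of the product of the operands.
    return magnitude if (a < 0) == (b < 0) else -magnitude


def aladin_mod(a: int, b: int) -> int:
    """Remainder r with a = m*b + r and 0 <= r < |b| (hypothetical helper)."""
    # Python's % with a positive right operand already yields 0 <= r < |b|.
    return a % abs(b)


if __name__ == "__main__":
    for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
        print(a, b, aladin_div(a, b), aladin_mod(a, b))
```

For a = -7 and b = 2 the sketch yields the quotient -3 and the remainder 1, which satisfies -7 = (-4)*2 + 1.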
A symbol test is a predicate on the local syntactic structure of the sentence: Let p be a rule whose production has the form X::=w and q be a rule whose production has the form Y::=uXv. The symbol test X IS p in a semantic rule of q then has the value TRUE when X is derived according to p in the given context. Otherwise it has the value FALSE. The symbol test X IS q in a semantic rule of p has the value TRUE when X originated from an application of q. Otherwise it has the value FALSE.

Consider the production q: Y::=u[vXy]z. The symbol test X IS THERE in a semantic rule of q has the value TRUE if X is present in the given context, otherwise it has the value FALSE. THERE is a predefined r_identifier.

3.8.3 Remote Attribute Access

remote_attribute_access ::= 'INCLUDING' attribute_name /
                            'INCLUDING' '(' ( attribute_name // ',' ) ')' /
                            [symbol_name] 'CONSTITUENTS' attribute_name /
                            symbol_name 'CONSTITUENT' attribute_name .

A remote attribute access refers to attributes of symbols that do not (necessarily) appear in the associated production. It abbreviates explicit upward or downward transfer of identical attribute values over long distances in the structure tree. The meaning will be described in terms of the structure tree using a quasi ALADIN notation.

Assume that a node of the structure tree is a record defined by

   TYPE node      : STRUCT ( symbol:   symb_indicator,
                             subtrees: node_list,
                             ancestor: node,
                             ... );
   TYPE node_list : LISTOF node;   % representing the nodes for the right-hand side
                                   % of a production

Let a remote attribute access be part of a semantic rule associated to production r: X0 ::= X1...Xn. Then

   INCLUDING( Y1.a1, ..., Ym.am )

with all ai having type t is equal to the call including( X0.ancestor ) of

   FUNCTION including ( n: node ) t:
      CASE n.symbol OF
         Y1: n.a1;
         ...
         Ym: n.am
      OUT including (n.ancestor)
      ESAC;

The context-free grammar must guarantee that the value is defined in any case.

INCLUDING Y.a is an abbreviation for INCLUDING (Y.a).
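
The upward search performed by the function including can be illustrated with a minimal Python sketch. The Node class, its attrs dictionary, and the example symbols Program, Block and Stmt are assumptions made for illustration; only the fields symbol, subtrees and ancestor come from the record above. The LookupError stands in for the case that the text requires the context-free grammar to exclude.

```python
class Node:
    """Structure tree node; symbol, subtrees, ancestor follow the record in the text."""

    def __init__(self, symbol, attrs=None, subtrees=()):
        self.symbol = symbol
        self.attrs = dict(attrs or {})    # attribute name -> value (assumed representation)
        self.subtrees = list(subtrees)
        self.ancestor = None
        for child in self.subtrees:
            child.ancestor = self


def including(n, targets):
    """Return the requested attribute of the nearest node whose symbol is in targets.

    targets maps each symbol Yi to its attribute name ai.
    """
    while n is not None:
        if n.symbol in targets:
            return n.attrs[targets[n.symbol]]
        n = n.ancestor                    # the OUT branch: continue with the ancestor
    # Assumption: the grammar should make this unreachable, as the text demands.
    raise LookupError("no enclosing node carries a requested attribute")


# INCLUDING (Block.env, Program.env), evaluated in a rule whose left-hand side is Stmt.
stmt = Node("Stmt")
inner = Node("Block", {"env": "inner scope"}, [stmt])
root = Node("Program", {"env": "global scope"}, [inner])

if __name__ == "__main__":
    print(including(stmt.ancestor, {"Block": "env", "Program": "env"}))
```

The search starts at the ancestor of the left-hand side node, so a rule of a Block production that uses the same access obtains the attribute of the next enclosing Block or Program rather than its own.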

The remote attribute access Xi CONSTITUENTS Z.a, where ta is the type of Z.a and t is the anonymous type LISTOF ta, is equivalent to the call constituents(Xi) of

   FUNCTION constituents ( n: node ) t:
      IF n.symbol = X0        % the symbol on the left side
      THEN t()                % of r stops the propagation
      ELSE sub_constituents ( n.subtrees )
      FI
      + IF n.symbol = Z THEN t(n.a) ELSE t() FI;

   FUNCTION sub_constituents ( nl: node_list ) t:
      IF EMPTY(nl)
      THEN t()
      ELSE constituents(HEAD(nl)) + sub_constituents(TAIL(nl))
      FI

The remote attribute access CONSTITUENTS Z.a is equivalent to sub_constituents(X0.subtrees).

Xi CONSTITUENT Z.a is equivalent to HEAD(constituents(Xi)). In the case of CONSTITUENT the context-free grammar must guarantee that the node referred to is uniquely determined (the result of constituents(Xi) has exactly one element) for any structure tree.
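
The downward collection can be sketched in Python in the same way. The Node class, the explicit parameters x0, z and a, and the example symbols Block, Stmts and Decl are assumptions made for illustration; the two functions follow the quasi ALADIN definitions above, including the rule that a nested occurrence of the left-hand side symbol X0 stops the propagation.

```python
class Node:
    """Structure tree node (assumed representation, attributes kept in a dict)."""

    def __init__(self, symbol, attrs=None, subtrees=()):
        self.symbol = symbol
        self.attrs = dict(attrs or {})
        self.subtrees = list(subtrees)


def constituents(n, x0, z, a):
    """List of the values Z.a found at n and below n."""
    # A node labelled with the left-hand side symbol X0 of rule r stops the propagation.
    below = [] if n.symbol == x0 else sub_constituents(n.subtrees, x0, z, a)
    own = [n.attrs[a]] if n.symbol == z else []
    return below + own


def sub_constituents(nodes, x0, z, a):
    """Concatenation of constituents(...) over a list of subtrees, in node order."""
    result = []
    for child in nodes:
        result += constituents(child, x0, z, a)
    return result


def constituent(n, x0, z, a):
    """Xi CONSTITUENT Z.a; the single-element check is an assumption of this sketch."""
    values = constituents(n, x0, z, a)
    if len(values) != 1:
        raise LookupError("CONSTITUENT requires exactly one matching node")
    return values[0]


# A block with a declaration, a nested block, and a second declaration.
decl_x = Node("Decl", {"name": "x"})
inner_block = Node("Block", subtrees=[Node("Decl", {"name": "y"})])
decl_z = Node("Decl", {"name": "z"})
stmts = Node("Stmts", subtrees=[inner_block, decl_z])
block = Node("Block", subtrees=[decl_x, stmts])

if __name__ == "__main__":
    # CONSTITUENTS Decl.name in a rule for Block ::= ...
    print(sub_constituents(block.subtrees, "Block", "Decl", "name"))
```

The declaration inside the nested block is not collected, because the nested Block node has the left-hand side symbol of the rule.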

3.8.4 Let Clauses

let_clause ::= 'LET' l_identifier ':' formula 'IN' expression .

The value of the let clause is the value of expression in which all l_identifiers are replaced by the value of the formula. Formula must not "show" an 'IN'-operator: a formula "shows" an 'IN'-operator if it directly produces "formula 'IN' monadic_formula" or if it directly produces "formula operator monadic_formula" where (the derived) formula "shows" an 'IN'-operator.

3.8.5 Operands

operand ::= denotation / simple_operand .
denotation ::= integer / string / symb / char / en_identifier .
simple_operand ::= attribute_name / component_selection / qualification / semantic_clause / p_identifier / l_identifier / c_identifier / cs_identifier .
component_selection ::= simple_operand '.' s_identifier .

The type of the left operand of a component selection must be a structure type that contains the selected component. The value of the selection is the value of this component.

3.9 Semantic Clauses

semantic_clause ::= '(' expression ')' / type_clause / call / case_clause / conditional_clause / relational_clause .

3.9.1 Type Conversion

type_clause ::= tp_identifier '(' [ expression // ',' ] ')' .
qualification ::= simple_operand 'QUA' tp_identifier .

The type clause and the qualification are operations for explicit type conversion. A value of the specified type is constructed from the values of the expressions or simple operand.

A qualification of the form e QUA tp_t interprets the value of e as a value of type tp_t. Either the type of e is a union type uniting (directly or indirectly) tp_t, or tp_t is a subrange type whose base type is the type of e. For every qualification there is an implied context condition CONDITION e IS tp_t.

Suppose that tp_t(e) is a type clause, and tp_e is the type of e. The following cases are possible:

a) One of the following holds:
   tp_t unites tp_e directly or indirectly;
   tp_t is the base type of the subrange type tp_e;
   tp_t = STRING and tp_e = SYMB;
   tp_t = STRING and tp_e = CHAR;
   tp_t and tp_e are list types of the same base type and tp_e is anonymous (see Sect. 3.7.2);
   tp_t = tp_e.
   If tp_t is uniquely determined by the context, tp_t(e) can be replaced by e in any of these cases. The value of e will then be coerced implicitly to type tp_t.

b) tp_t and tp_e are set types with the same base types or list types with the same element types that may differ in the key specification.

c) tp_t is a list type and tp_e = LISTOF tp_t. The value of the type clause is a list of type tp_t, formed by concatenating the elements of e.

d) tp_t is a list type and tp_e is the element type of tp_t. The value of the type clause is a list whose only element has the value of e. Similarly, if tp_t has type SETOF tp_e then the value of the type clause is a singleton set whose element is the value of e.

If tp_t identifies a list or set type then a type clause of the form tp_t( e1, ..., en ) (n >= 1) is an abbreviation for tp_t( e1 ) + ... + tp_t( en ). The value of tp_t() is the empty list (or empty set) of type tp_t.

The result types of then- and else-expressions in conditional clauses, the cases and default in case clauses, and the operands of relational operators are specified by balancing: Let e1, ..., en (n >= 1) be such expressions, with types t1, ..., tn. They will be coerced to the type tz for which the shortest sequence of type conversions (excluding the identity conversion) suffices to convert t1, ..., tn to tz. The type tz must be uniquely determined.

3.9.2 Calls

call ::= f_identifier [ argument_list ] .
argument_list ::= '(' (argument // ',') ')' .
argument ::= expression .

The value of a call is the value described by the function body when the formal parameters are replaced by the argument values. The correspondence is specified by the sequence in which the parameters and arguments are described. The types of the parameters and arguments must agree.

3.9.3 Case Clauses

case_clause ::= 'CASE' [cs_identifier ':'] selection 'OF' (case // ';') ['OUT' default] 'ESAC' .
selection ::= expression / symbol_name .
case ::= ( case_label ':' )+ expression .
default ::= expression .
case_label ::= denotation / c_identifier / ['IS'] tp_identifier / ['IS'] r_identifier .

The value of a case clause is the value of the expression in one of the cases, or the value of the default expression. The type of the value is determined by balancing. The cs_identifier introduces an abbreviation for the value of the selection expression. Its scope extends over the cases and the default. If the cs_identifier is omitted then the cs_identifier THIS is implied.

The form of the case labels depends upon the type of the selection: If the selection is of enumeration type, subrange type, SYMB, CHAR, or INT then the case labels must denote values of the selection's type. The case whose label corresponds to the value of the selector specifies the value of the case clause.

If the selection is of union type then the case labels have the form IS tp_identifier. The tp_identifier identifies one of the (directly or indirectly) united types. If the value of the selection is a value of the given type then the value of the selected case expression is the value of the case clause. When a case has only one label IS tp_t, a qualification of the cs_identifier having the form cs_identifier QUA tp_t may be abbreviated as simply cs_identifier.

If the selection is a symbol name then the case labels are r_identifiers. Let X be the selection. If X appears on the right-hand side of the production belonging to the same rule as the case clause then all rules denoted by the case labels have the form X::=u. If X appears on the left-hand side of the production then the case labels denote rules of the form Y::=vXw. A case with case label r_identifier=p will be selected when a symbol test X IS p in the same context would have the value TRUE.

A default must be specified in a case clause if and only if the case labels do not exhaust the elements of the selection type, the types in the union given by the selection, or the rules having the required form.

3.9.4 Conditional Clauses

conditional_clause ::= 'IF' selection 'THEN' expression 'ELSE' expression 'FI' .

If the selection is an expression of type BOOL then the conditional clause has the usual meaning. The type of the result is determined by balancing.

If the selection is a symbol name, it must identify a symbol that occurs within an optional clause in the corresponding production. When the symbol is present in the actual context, the value of the conditional clause is the value of the then-expression; otherwise it is the value of the else-expression. Attributes of optional symbols (except those of the left-hand side symbol of an attribute rule) may only appear in then-expressions of conditional clauses of this form.

3.9.5 Relational Clauses

relational_clause ::= '(' expression context_condition ')' .

A relational clause binds an attribute relation to an expression. The value of a relational clause is the value of the (first) expression. In order to avoid repetition of complex expressions, the value of the (first) expression can be abbreviated within the relational clause by the predefined identifier IT.

3.10 Function Definitions

function_definition ::= 'FUNCTION' f_identifier [ parameters ] result_type ':' body .
parameters ::= '(' ( parameter_specification // ',' ) ')' .
parameter_specification ::= ( p_identifier // ',' ) ':' tp_identifier .
result_type ::= tp_identifier .
body ::= expression .

A function specifies the computation of a value of the result type. The value is obtained by evaluating the function body, after replacing all occurrences of the parameter identifiers by the corresponding argument values. Function bodies must not contain any symbol names or r_identifiers.

In a parameter specification, p_a,p_b: tp_t is an abbreviation for p_a: tp_t, p_b: tp_t.

3.10.1 Generic Functions

The identifier LIST can appear as the parameter type of one or more parameters. It is replaced by the type t of the argument, which must be a list type. All occurrences of LIST in the function definition are replaced by the same type t. If at least one parameter type is specified as LIST then the identifier LIST_ELEM_TYPE may also appear in the function definition; it will be replaced by the element type of t.

The identifier KEY_LIST can appear as the type of one or more parameters, and will be replaced by the type t of the argument, which must be a list type with a key specification. All occurrences of KEY_LIST in the function definition are replaced by the same type t. If at least one parameter type is specified as KEY_LIST then the identifiers LIST_ELEM_TYPE, LIST_KEY_TYPE and LIST_KEY_ID may appear in the function definition. LIST_ELEM_TYPE is replaced by the element type of t, LIST_KEY_TYPE by the type of the key component of the element type, and LIST_KEY_ID by the s_identifier that selects this component.

LIST and KEY_LIST must not both appear in the same function definition.

3.11 Standard Definitions

The following types are predefined:

   TYPE INT:     (* A "sufficiently large" range of integers *)
   TYPE BOOL:    (FALSE,TRUE);
   TYPE CHAR:    (* Characters *)
   TYPE STRING:  (* Sequences of characters *)
   TYPE SYMB:    (* Basic symbols of the language being defined *)

The following functions are predefined. The bodies given describe the effects of the functions but not their implementation:

   FUNCTION EMPTY ( p_l : LIST ) BOOL :
      p_l = LIST();

   FUNCTION HEAD ( p_l : LIST ) LIST_ELEM_TYPE :
      (* first element of p_l *)
      CONDITION NOT EMPTY(p_l)

   FUNCTION TAIL ( p_l : LIST ) LIST :
      IF EMPTY(p_l)
      THEN LIST()
      ELSE (* p_l without its first element *) FI;

   FUNCTION LENGTH(p_l: LIST) INT:
      IF EMPTY(p_l) THEN 0 ELSE LENGTH(TAIL(p_l))+1 FI;

In addition, LENGTH may be applied to a symbol name of a symbol which occurs in a repetition clause. The call then yields the number of repetitions in the particular context.

   FUNCTION FRONT(p_l: LIST) LIST:
      IF LENGTH(p_l)<=1
      THEN LIST()
      ELSE (* p_l without its last element *) FI;

   FUNCTION LAST(p_l: LIST) LIST_ELEM_TYPE:
      IF LENGTH((p_l CONDITION NOT EMPTY(IT)))=1
      THEN HEAD(p_l)
      ELSE LAST(TAIL(p_l)) FI;

   FUNCTION KEY_IN_LIST(p_k: LIST_KEY_TYPE, p_l: KEY_LIST) BOOL:
      IF EMPTY(p_l)
      THEN FALSE
      ELSE (HEAD(p_l).LIST_KEY_ID=p_k) OR KEY_IN_LIST(p_k,TAIL(p_l)) FI;

   FUNCTION UNIQUE_KEYS(p_l: KEY_LIST) BOOL:
      IF EMPTY(p_l)
      THEN TRUE
      ELSE UNIQUE_KEYS(TAIL(p_l)) AND
           NOT KEY_IN_LIST(HEAD(p_l).LIST_KEY_ID, TAIL(p_l))
      FI;

   FUNCTION ELEM_IN_LIST(p_e: LIST_ELEM_TYPE, p_l: LIST) BOOL:
      IF EMPTY(p_l)
      THEN FALSE
      ELSE (p_e=HEAD(p_l)) OR ELEM_IN_LIST(p_e,TAIL(p_l)) FI;

   FUNCTION UNIQUE_ELEMS(p_l: LIST) BOOL:
      IF EMPTY(p_l)
      THEN TRUE
      ELSE UNIQUE_ELEMS(TAIL(p_l)) AND
           NOT ELEM_IN_LIST(HEAD(p_l),TAIL(p_l))
      FI;

   FUNCTION LIST_INCLUSION(p_l_1,p_l_2: LIST) BOOL:
      IF EMPTY(p_l_1)
      THEN TRUE
      ELSE ELEM_IN_LIST(HEAD(p_l_1),p_l_2) AND
           LIST_INCLUSION(TAIL(p_l_1),p_l_2) FI;

   FUNCTION SELECT_BY_KEY(p_k: LIST_KEY_TYPE, p_l: KEY_LIST)
                                             LIST_ELEM_TYPE:
      IF EMPTY(p_l)
      THEN (* undefined CONDITION FALSE *)
      ELSE IF HEAD(p_l).LIST_KEY_ID=p_k
           THEN HEAD(p_l)
           ELSE SELECT_BY_KEY(p_k,TAIL(p_l)) FI
      FI;

   FUNCTION GENNUM INT:
      (* A unique, non-zero integer value for each call *)

   FUNCTION GENSYMB SYMB:
      (* A unique symbol different from any other value of type SYMB *)
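
The effect of some of the predefined list functions can be tried out with a minimal Python sketch. It is not how the GAG-System implements them: lists are Python lists, keyed elements are dictionaries, the extra parameter key_id is a hypothetical stand-in for LIST_KEY_ID, and a violated implicit condition is modelled by raising an exception.

```python
def head(p_l):
    if not p_l:
        raise ValueError("CONDITION NOT EMPTY(p_l) violated")
    return p_l[0]


def tail(p_l):
    return p_l[1:]        # the empty list stays empty, as in TAIL


def front(p_l):
    return p_l[:-1]       # lists of length <= 1 yield the empty list


def last(p_l):
    if not p_l:
        raise ValueError("CONDITION NOT EMPTY(IT) violated")
    return head(p_l) if len(p_l) == 1 else last(tail(p_l))


def key_in_list(p_k, p_l, key_id):
    if not p_l:
        return False
    return head(p_l)[key_id] == p_k or key_in_list(p_k, tail(p_l), key_id)


def unique_keys(p_l, key_id):
    if not p_l:
        return True
    return (unique_keys(tail(p_l), key_id)
            and not key_in_list(head(p_l)[key_id], tail(p_l), key_id))


def select_by_key(p_k, p_l, key_id):
    if not p_l:
        raise KeyError(p_k)   # the implied CONDITION FALSE
    if head(p_l)[key_id] == p_k:
        return head(p_l)
    return select_by_key(p_k, tail(p_l), key_id)


# A keyed list of definitions; the field names are made up for this example.
defs = [{"id": "x", "type": "int"}, {"id": "y", "type": "real"}]

if __name__ == "__main__":
    print(select_by_key("y", defs, "id"), unique_keys(defs, "id"))
```

The exceptions make visible why the text advises against relying on the implicit conditions of HEAD, LAST and SELECT_BY_KEY in a practical compiler: evaluation simply stops when they are violated.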

Chapter 4: Development of an Attributed Grammar for a Pascal-Analyzer

4.1 Introduction

Development of an AG for a real life programming language is a rather complex task comparable with software design. The knowledge of suitable systematic methods and standard techniques is a precondition for a successful application of the specification tools: AGs, the GAG-System, and its input language ALADIN.

From our experience with the development of AGs for several programming languages (PEARL [DIN80], Pascal, Lis [Sa80], and Ada [Ada82]) we derived general methods for systematic AG development and techniques applicable to the description of usual programming languages. They are demonstrated for an AG defining Pascal according to the draft ISO Standard [BSI82]. The complete AG is contained in Appendix A.

Certain important language properties like typing and scope rules are conceptually similar for most high level languages. So the development of the Pascal-AG described here can be used as a general guideline (see Sect. 4.2). In fact AGs for several application languages are directly derived from it [Nu82], [Ro82].

The development of an AG for a programming language can aim at different goals: definition of the language (e.g. for standardization), or specification of an analyzer or compiler. In the second case the AG will specify certain compiler properties (e.g. error handling, interface to synthesis) which are not required in a pure language definition. However, one should start with the design of a language description in order to facilitate validation and maintenance of the AG. Extending the AG by specification of compiler properties will usually not cause major changes, for the following reasons:

- An AG is based on a context-free grammar which is augmented by context dependent definitions. This distinction is widely accepted for both language definition and modularization of compilers (context-free parsing and analysis of static semantics).

- Well-defined AGs are understood as a definition method based on effectively computable descriptions. This characteristic is emphasized in the description language ALADIN. Therefore implementations can be derived automatically from the language definition.

- The GAG-System is based on a powerful attribute evaluation technique [Ka80] which frees the AG designer from specifying implementation techniques like pass organization. Our experience has shown that even complex languages can be defined without consideration of attribute evaluation order. In any case the AG turned out to be an OAG or was automatically transformed into an OAG by the GAG-System.

- The GAG-System applies effective optimizations in order to produce space and time efficient attribute evaluators. So efficiency aspects should not influence the design of the AG in an early phase. Our experience, especially with the Pascal-AG, has shown that efficiency can be obtained without changing the language defining character of the AG (see Sect. 4.4).

- A practical compiler must include error handling facilities. For any AG in ALADIN the GAG-System generates a compiler which fulfils the minimum requirements for error handling. It is possible to add the specification of more sophisticated mechanisms to the AG in a rather late design phase with moderate modification costs (see Sect. 4.3).

- If the AG specifies the front-end (analysis phase) of a compiler, certain target attributes must be specified describing the interface to the back-end (synthesis phase). These specifications can usually be added without redesigning the AG (see Sect. 4.5).

As a consequence, in any case one should design the AG as a language definition and defer specifications of compiler properties to a later development phase. Clarity and comprehensibility should be the first aim of the design.

4.2 Development of the Attributed Grammar

A systematic development of an AG requires both analysis of the static properties of the language and suitable use of the description tool ALADIN. The AG development should be guided by a classification of the static language properties:

1. structural definitions (usually expressed by a context-free grammar),

2. "global" concepts like typing and scope rules which affect the context dependent properties of most language elements, and

3. "local" context dependent rules and restrictions.

It is advisable to start with the (usually given) context-free syntax, which is the frame of the AG. Context-free productions and symbols are introduced (by RULE-, NONTERM-, and TERM-definitions). They will be augmented by attribute definitions and attribute rules in a later step. In general, there is no need to transform the original syntax into an abstract one, because the GAG-System uses an abstract syntax in any case. (Terminal symbols and chain productions which are not relevant to attribute evaluation are eliminated automatically.) Furthermore, a scanner and a parser will be provided almost automatically if the AG is based on the original syntax.

There may be situations where a given context-free syntax should be simplified: Certain language restrictions can be expressed either by context-free or by context dependent means (e.g. restrictions on type denotations). If the context-free specification does not cover the property completely (e.g. because type identifiers are involved), or if the syntax is rather complex because of such restrictions, a simplified less restrictive syntax together with context conditions is preferable.

The next development step is concerned with the global language properties. Attributes are attached to symbols describing these properties of the corresponding language elements: Type attributes are attached to symbols for type denotations and expressions; environment attributes are attached to symbols which introduce new scopes. The domains of these attributes are specified by TYPE-definitions. Together with some FUNCTION-definitions, they should be considered as an abstract data type which models the particular language property, e.g. block structured environments with an identification function modelling scope rules. A careful design of these domains will reduce the necessity for iterations in the development process. It will be discussed in more detail in Sect. 4.2.1, 4.2.2.

In the last step all productions are considered: attributes describing local properties are introduced, attribute rules are formulated which define how a certain property is determined in a given syntactic context, and context dependent restrictions are specified by CONDITIONs over attributes. The use of functions is advisable in order to postpone their specification until the end of the development of the AG.

Our experience has shown that the following general rules should be applied throughout the whole AG development: If the language is rather large it should be structured into several parts (e.g. declarations, statements, expressions) in order to apply the development steps to each part. If an informal language description exists one should keep close to it in the choice of abstractions and names. It will help to show that the AG really defines what is intended. Finally one should use the facilities of the GAG-System for validation of the specification: One can easily generate an analyzer and test whether it accepts input programs which are intended to be correct or erroneous.

4.2.1 Types

In a strongly typed language most static properties are concerned with types. Attributes describing types are attached to syntactic symbols for type denotations, expressions and their constituents. The domain of these attributes is an abstract representation of the language's types comprising their static properties. The structure of the domain resembles an abstract syntax for type denotations. Subsequently we discuss the domain of type attributes for the Pascal-AG (Fig. 4.1) and give examples for attribute rules. (For ease of understanding the identifiers in the AG are prefixed by abbreviations which characterize their use, e.g. tp_ for attribute types. The prefixes are explained in the heading of the AG in Appendix A.)

The root of the domain structure is tp_type uniting several primitive and structured domains. The domains for subrange, set, array, and file types are self-explanatory. For the predefined types (INTEGER, REAL, etc.) only their identity is relevant; they are represented by a single valued subdomain. The key component of some structured domains distinguishes types resulting from several (possibly equal) type denotations, as the Standard requires. For enumeration types only the code of the last value is specified. The enumerated values are treated as implicitly defined constants.

Record types are described by two kinds of information used for different purposes: on the one hand, the sequence of all field definitions (including tag fields and all variant fields) needed for analysis of selections and WITH-statements; on the other hand the pure variant structure is given by discriminator types and variant case labels only. This information is needed only for analysis of calls of the predefined functions NEW and DISPOSE.
The abstraction for procedure types does not follow the standard in two minor aspects: The concept of conformant array schemes is not modelled. Grouping of formal parameters does not contribute to type differences.

Special care has to be taken for the abstraction of pointer types: A domain like

    TYPE tp_pointer : STRUCT (s_referred : tp_type)

would cause recursively defined types to be described by infinitely large values, and the type attributes of type denotations would be circularly dependent. Hence for any pointer type denotation "@ td" an implicit type definition "t = td" is assumed where t is a new unique type identifier. The abstraction of the type denotation is "pointer to type identifier t". Its identification is deferred to applications of the type (e.g. in the contents operation).
Several subdomains are introduced which do not correspond to a Pascal type denotation: tp_void, tp_nil, tp_empty_set, tp_err are used in contexts where a specific type cannot be determined; tp_overloaded is used for some predefined routines (e.g. READ, WRITE, EOF, NEW) the calls of which are analysed by overloading resolution; tp_general_set follows the standard's requirement that the type of a set constructor can be packed or unpacked as well. The property of a type to be ordinal is not reflected by the domain structure. A function checks whether a type belongs to one of the ordinal subdomains.
    TYPE tp_type        : UNION (tp_real, tp_int, tp_bool, tp_char,
                                 tp_enum, tp_subrange, tp_array, tp_record,
                                 tp_file, tp_set, tp_general_set, tp_empty_set,
                                 tp_proc, tp_overloaded, tp_pointer,
                                 tp_nil, tp_void, tp_err);

    TYPE tp_subrange    : STRUCT (s_key        : INT,
                                  s_host       : tp_type,
                                  s_lwb, s_upb : INT);

    TYPE tp_enum        : STRUCT (s_key     : INT,
                                  s_max_val : INT);

    TYPE tp_pointer     : STRUCT (s_referred : SYMB);

    TYPE tp_array       : STRUCT (s_key     : INT,
                                  s_packing : tp_packing,
                                  s_index   : tp_type,
                                  s_elem    : tp_type);

    TYPE tp_set         : STRUCT (s_key     : INT,
                                  s_packing : tp_packing,
                                  s_base    : tp_type);

    TYPE tp_general_set : STRUCT (s_base : tp_type);

    TYPE tp_file        : STRUCT (s_key     : INT,
                                  s_packing : tp_packing,
                                  s_base    : tp_type);

    TYPE tp_proc        : STRUCT (s_params : tp_defs,
                                  s_result : tp_type);

    TYPE tp_record      : STRUCT (s_key        : INT,
                                  s_packing    : tp_packing,
                                  s_all_fields : tp_defs,
                                  s_choice     : tp_choice);

    TYPE tp_choice      : UNION (tp_empty, tp_union);

    TYPE tp_union       : STRUCT (s_discr    : tp_type,
                                  s_variants : tp_variants);

    TYPE tp_variants    : LISTOF tp_variant;

    TYPE tp_variant     : STRUCT (s_labels : INT_list,
                                  s_choice : tp_choice);

    TYPE tp_packing     : (sc_packed, sc_unpacked);

    TYPE tp_int         : (sc_int);

    % corresponding definitions for tp_real, tp_bool, tp_char,
    % tp_empty_set, tp_overloaded, tp_nil, tp_void, tp_err, tp_empty

Fig. 4.1
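The following Python sketch is an illustration only, not part of the Pascal-AG. It mirrors a few subdomains of Fig. 4.1 under the assumption that each STRUCT becomes an immutable record; the class names (TpInt, TpPointer, etc.) and the helper is_ordinal are hypothetical. It shows two points made above: a pointer type refers to its target by name, so recursive types stay finite values, and the ordinal property is decided by a function over subdomain membership.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TpInt:
    """Predefined type INTEGER: only its identity matters."""


@dataclass(frozen=True)
class TpReal:
    """Predefined type REAL."""


@dataclass(frozen=True)
class TpEnum:
    s_key: int       # distinguishes separate type denotations
    s_max_val: int   # code of the last enumerated value


@dataclass(frozen=True)
class TpSubrange:
    s_key: int
    s_host: "TpType"
    s_lwb: int
    s_upb: int


@dataclass(frozen=True)
class TpPointer:
    s_referred: str  # name of the implicit type identifier, not a type value


TpType = Union[TpInt, TpReal, TpEnum, TpSubrange, TpPointer]

# The ordinal property is not part of the domain structure; a function
# decides it by subdomain membership (assumed set of ordinal subdomains).
ORDINAL_SUBDOMAINS = (TpInt, TpEnum, TpSubrange)


def is_ordinal(t: TpType) -> bool:
    return isinstance(t, ORDINAL_SUBDOMAINS)
```

Two subranges with equal bounds but different keys compare unequal here, which corresponds to the role of the key component described above.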
The following examples show how the concrete syntax for type denotations is mapped to attribute values of the type domain.

    RULE r30 : type_denoter ::= constant '..' constant
    STATIC
      type_denoter.at_type :=
        tp_subrange (GENNUM,
                     constant[1].at_type,
                     constant[1].at_value,
                     constant[2].at_value);
      CONDITION f_is_ordinal (constant[1].at_type)
        MESSAGE "BOUND TYPES MUST BE ORDINAL";
      CONDITION constant[1].at_type = constant[2].at_type
        MESSAGE "BOUND TYPES MUST BE EQUAL";
      CONDITION constant[1].at_value <= constant[2].at_value
        MESSAGE "EMPTY RANGE NOT ALLOWED"
    END;

    RULE r39 : type_denoter ::= [ PACKED_sym ] 'SET' 'OF' type_denoter
    STATIC
      type_denoter[1].at_type :=
        tp_set (GENNUM,
                IF PACKED_sym IS THERE
                THEN sc_packed
                ELSE sc_unpacked
                FI,
                type_denoter[2].at_type);
      CONDITION f_is_ordinal (type_denoter[2].at_type)
        MESSAGE "MUST BE ORDINAL TYPE"
    END;
Type checking for expressions and variables is defined in a straightforward manner using an attribute for type abstractions attached to expressions and their constituents. It describes the type of the construct determined by its derivation, without considering type requirements imposed by the context it is embedded in. In the next example the type of a binary formula is determined by overloading resolution for the operator (described in the function f_dyadic_operator).
    RULE r70: simp_expr ::= simp_expr add_opr term
    STATIC
      simp_expr[1].at_type :=
        f_dyadic_operator (add_opr.at_op,
                           f_reduce (simp_expr[2].at_type),
                           f_reduce (term.at_type));
      simp_expr[1].at_kind := sc_dyn_val;
      simp_expr[1].at_value := c_a_value
    END;

The function f_reduce converts the types of parameterless procedures to their result type, subrange types to their base types, and set types to a "canonical" set type.

The attribute at_kind specifies whether an expression stands for a variable, a packed, unpacked, or a tag field, a procedure identifier, or a (compile time or runtime computed) value. Context conditions over that attribute are stated for actual parameters, for example.
The attribute at_value describing the expression's value - if compile time computable - is needed only for the analysis of calls of the functions NEW and DISPOSE: The static semantics require that all but the first actual parameter must be constants describing a path in the variant structure of the appropriate record type. This rather singular language property requires a rather high definition effort, since there is no other context where the value of an expression is relevant for static semantics - constant denotations are used in those cases. (Furthermore it would be more convenient for programming to postpone these checks to runtime.)

4.2.2 Scope Rules

The scope rules (visibility rules) of high level languages state how to associate a definition with an applied occurrence of an identifier. The concept is usually formalized by mapping the identifier and an environment containing all visible definitions to the valid definition. Thus an AG attaches environment attributes to all symbols representing language constructs which introduce a new scope (e.g. block). The scope rules determine how these attributes are to be computed and how the valid definition is found. The domain of environment attributes is a block structured sequence of abstract definitions, e.g. in our Pascal-AG:
    TYPE tp_environ : LISTOF tp_block;

    TYPE tp_block   : STRUCT (s_defs : tp_defs);

    TYPE tp_defs    : LISTOF tp_def KEY s_name;

    TYPE tp_def     : STRUCT (s_name  : SYMB,
                              s_level : INT,
                              s_value : INT,      % static value of constants
                              s_kind  : tp_kind,
                              s_type  : tp_type);
The attribute rules ensure that the environment always consists of properly nested blocks where each block precedes its enclosing block in the sequence. The hiding property (a local definition hides any equally named global definition) is stated by a recursively defined identification function:

    FUNCTION f_identify (name     : SYMB,
                         loc_defs : tp_defs,
                         glob_env : tp_environ,
                         err_kind : tp_kind) tp_def:
      IF EMPTY (loc_defs)
      THEN IF EMPTY (glob_env)
           THEN (tp_def (name, 0, c_a_value, err_kind, c_err)
                 CONDITION FALSE
                 MESSAGE "IDENTIFIER NOT DECLARED")
           ELSE f_identify (name, HEAD (glob_env).s_defs,
                            TAIL (glob_env), err_kind)
           FI
      ELSE IF HEAD (loc_defs).s_name = name
           THEN HEAD (loc_defs)
           ELSE f_identify (name, TAIL (loc_defs), glob_env, err_kind)
           FI
      FI;

The valid definition of an identifier (attribute at_def) in a block body is identified in the environment attribute of the smallest enclosing block or WITH-statement containing the identifier. The INCLUDING construct in rule r95 below is a shorthand denotation for "remote attribute access" with the above meaning. Its use saves explicit transport of identical attribute values through the identifier's context (cf. Chapter 3).

    RULE r95 : identifier ::= name
    STATIC
      identifier.at_def :=
        f_identify (name.at_name,
                    tp_defs (),
                    INCLUDING (block.at_env, WITH_clause.at_env),
                    sc_var);
    END;
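The recursion in f_identify can be mirrored in a few lines of Python. This is a sketch under assumptions, not the generated code: blocks are modelled as tuples of definitions, the environment as a tuple of blocks with the innermost block first, and the context condition is modelled by returning a message next to the definition. The names Def and identify are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Def(NamedTuple):
    s_name: str
    s_level: int
    s_value: int
    s_kind: str
    s_type: str


Block = Tuple[Def, ...]


def identify(name: str, loc_defs: Block, glob_env: Tuple[Block, ...],
             err_kind: str) -> Tuple[Def, Optional[str]]:
    """Return the valid definition of name and an optional error message."""
    if not loc_defs:
        if not glob_env:
            # No definition anywhere: supply an error value for the result.
            return Def(name, 0, 0, err_kind, "tp_err"), "IDENTIFIER NOT DECLARED"
        # Continue in the next enclosing block.
        return identify(name, glob_env[0], glob_env[1:], err_kind)
    if loc_defs[0].s_name == name:
        return loc_defs[0], None
    return identify(name, loc_defs[1:], glob_env, err_kind)
```

Called with an empty local list, as in rule r95, the search starts in the innermost block of the environment, so an inner definition hides an equally named outer one.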
In Pascal an applied occurrence of an identifier must not precede its definition. Thus any defining construct enlarges the (local part of the) environment valid for subsequent definitions and the block body. A pair of attributes attached to the symbols of the definition part defines the sequence of local definitions before that construct (at_defs_in) and after the defining point (at_defs_out). They "thread" the sequence of local definitions through the definition part of a block, as shown in rules r24 and r25. (In fact, it is threaded through each type denotation too, since implicit definitions may occur there.)

    RULE r24 : proc_decl_part ::=
    STATIC
      proc_decl_part.at_defs_out := proc_decl_part.at_defs_in
    END;

    RULE r25 : proc_decl_part ::= proc_decl_part proc_decl
    STATIC
      TRANSFER at_defs_in WITH proc_decl_part[2];
      TRANSFER at_defs_out WITH proc_decl;
      TRANSFER at_visible_labels;
      proc_decl.at_defs_in := proc_decl_part[2].at_defs_out
    END;
This description implies that the global environment defined for the block of a procedure declaration does not contain definitions of subsequently declared procedures (which are not yet visible at that point). On the other hand, Pascal requires that a block must not contain both an applied occurrence of a global identifier and a local definition for it. In the following Pascal program the call of R in the body of Q is erroneous.

    PROGRAM T;
      PROCEDURE R; BEGIN END;
      PROCEDURE P;
        PROCEDURE Q; BEGIN R END;
        PROCEDURE R; BEGIN END;
      BEGIN END;
    BEGIN
    END.
An elegant description is given by an additional attribute of blocks and WITH clauses (at_complete_env) containing all definitions of enclosing blocks. Identification using this attribute follows ALGOL-like scope rules: An applied occurrence of an identifier identifies the definition contained in the smallest enclosing block - regardless of whether it stands before or after the application. Pascal requires that both identifications yield the same definition, stated by an additional condition in rule r95:

    CONDITION identifier.at_def.s_level =
              f_identify (name.at_name,
                          tp_defs (),
                          INCLUDING (block.at_complete_env,
                                     WITH_clause.at_complete_env),
                          sc_var).s_level
      MESSAGE "IDENTIFIER USED BEFORE DEFINITION"

The identification of constant- and type-identifiers (not discussed here) uses the sequence of local definitions up to the identifier's occurrence (attribute at_defs_in of the smallest enclosing definition).

In an AG for a language with ALGOL-like scope rules the "threading technique" described above is not needed, because the order of definitions is not statically relevant. In that case the attributes at_complete_env are sufficient.
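The comparison of the two identifications can be tried out on the example program T. The Python sketch below is illustrative only; the nesting levels and the tuple layout of the two environments for the body of Q are assumptions made for this example, and identify_level is a hypothetical, simplified stand-in for f_identify that returns only the level.

```python
def identify_level(name, env):
    """env: blocks, innermost first; each block is a tuple of (name, level)."""
    for block in env:
        for def_name, level in block:
            if def_name == name:
                return level
    return None  # not declared at all


def used_before_definition(name, env, complete_env):
    # Pascal requires both identifications to yield the same definition.
    return identify_level(name, env) != identify_level(name, complete_env)


# Environments valid in the body of Q (assumed levels: T = 1, P = 2, Q = 3).
# env holds only definitions preceding Q's block; complete_env holds all
# definitions of the enclosing blocks.
env_q = ((), (("Q", 2),), (("R", 1), ("P", 1)))
complete_env_q = ((), (("Q", 2), ("R", 2)), (("R", 1), ("P", 1)))
```

For R the first identification finds the global procedure (level 1) and the second the later local one (level 2), so the condition is violated; for P both agree.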
4.3 Error Handling

In this section we discuss how semantic errors are treated in the generated analyzer. We show that the automatically supplied error handling facilities can be improved by certain specification techniques applied for the AG development.

A powerful means of syntax error recovery is automatically integrated by our parser generator [De77] using the method of [Ro78]. It guarantees that in any case attribute evaluation operates on a structure tree which is correct in the sense of the context-free grammar.

The term error handling comprises three tasks: recognition, information, and recovery (i.e. ensure consistency of internal data). The minimal requirements for an analyzer are:

- no erroneous input is accepted,
- no correct input is refused,
- no input causes the analyzer to stop before completion, except for exhaustion of resources (time or space).

In addition, the user of an analyzer applies quality criteria to the error handling facilities:

- How specific are the error messages?
- How many errors are not found?
- How many messages are caused by the same error (avalanche errors)?
Context conditions are stated in an AG by boolean relations over attribute values, e.g. in rule r95. Thus the generated analyzer recognizes erroneous input when such a condition is violated. It initiates an error message with a source position taken from the structure tree and a MESSAGE attached to that condition in the AG.

Since the description language ALADIN is strongly typed the internal data (attribute values etc.) are guaranteed to be consistent. So attribute evaluation continues after detection of input errors. There are only three situations which deviate from this principle: access to an element of an empty sequence (by HEAD or LAST); a value of a union type which does not belong to a certain subtype required by QUA; and non-terminating evaluation of recursive functions. A defensive definition style can (and should) avoid these situations - not only for erroneous input. With that assumption attribute evaluation is safe for any input and the minimum requirements for error handling are fulfilled automatically for any specification in ALADIN.

The quality of the error handling is improved by a careful design of the context conditions according to the following general rules:

a) Messages will be more specific if conjunctive conditions are split where possible:

    CONDITION A MESSAGE "A does not hold";
    CONDITION B MESSAGE "B does not hold"

instead of

    CONDITION A AND B MESSAGE "A or B does not hold",

cf. r30 above.
b) One should try to attach the condition to a rule which directly contains the erroneous program element. A very unspecific message will be produced by the following condition attached to the block:

    CONDITION LIST_INCLUSION (statement_seq.at_all_labels,
                              block.at_labels)
      MESSAGE "Some labels are not declared".

A specific error message will be produced by an equivalent condition attached to labelled statements:

    CONDITION ELEM_IN_LIST (statement.at_label,
                            INCLUDING block.at_labels)
      MESSAGE "Label is not declared".

By the way, this example demonstrates the general principle of turning the direction of information flow.
c) If a context condition is considered to be a precondition for an attribute rule the specification is simplified by conditional expressions which combine a context condition with an expression. This ALADIN element can be used in functions as well.

    RULE r71: simp_expr ::= add_opr term
    STATIC
      simp_expr.at_type :=
        (f_reduce (term.at_type)
         CONDITION f_is_arithmetic (IT)
         MESSAGE "OPERAND MUST BE ARITHMETIC");
    END;
d) In certain situations (e.g. if f_identify does not find any valid definition for an identifier) ALADIN's strong typing requires the specification of a value for the erroneous case. There are two strategies for the choice of such an error value:

(1) Take the most probable value of the attribute's domain. In this case the number of further errors possibly found in a "close context" is maximized.

(2) Add a special error value to the attribute domain, which will pass all further context conditions it may be involved in (e.g. an error type tp_err which is defined to be compatible with all types). In this case the number of avalanche errors will be minimized.

Experience and experiments are needed to find a convincing solution for error recovery designed by context conditions. Since changes in the AG are not very costly and the effects can be judged immediately, the design may be guided by experiments - an approach which is not acceptable for usual compiler design.
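The effect of the two strategies in rule d) can be seen in a small Python sketch. It is a toy model under assumptions, not taken from the Pascal-AG: types are plain strings, the only context condition is type compatibility of an assignment, and the names compatible and check_assignment are hypothetical.

```python
TP_ERR = "tp_err"  # special error value added to the type domain


def compatible(t1, t2):
    # tp_err is defined to be compatible with all types (strategy 2).
    return t1 == t2 or TP_ERR in (t1, t2)


def check_assignment(var_name, expr_type, declared, error_value):
    """Return the messages produced for 'var_name := <expr of expr_type>'."""
    messages = []
    var_type = declared.get(var_name)
    if var_type is None:
        messages.append("IDENTIFIER NOT DECLARED")
        var_type = error_value  # value required for the erroneous case
    if not compatible(var_type, expr_type):
        messages.append("TYPES NOT COMPATIBLE")
    return messages
```

With a guessed type such as "tp_int" as error value (strategy 1), an undeclared variable assigned a real expression yields two messages; with TP_ERR (strategy 2) only the first message appears, so the avalanche error is suppressed.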
4.4 Performance and Optimizations
The Pascal-AG described here was used as a test case in the development of the GAG-System towards a generator for efficient compilers. So we are able to demonstrate the effects of different performance improvements applied automatically by the system or caused by modifications of the AG as well.

The Pascal analyzer generated at the first attempt (without any automatic or manual improvements) needed a rather large amount of runtime and data space. But time and space were already manageable even for large inputs, because the carefully chosen implementation of structured values by pointers avoids copies of space consuming attribute values. The magnitude of the space for all environment attributes, for example, is comparable with that of a definition table in a conventional compiler: For each definition in an input program only one record of type tp_def (comprising the properties of the defined entity) is allocated plus one "link entry" (consisting of two pointers) chaining the definitions of a block. If such values were copied as in the HLP system [HLP78], a non optimizing implementation would be intractable for large input programs.

We improved the generation of analyzers by applying systematic optimizations such that the performance now is comparable to that of manually developed compilers. The comparison is fair only for those compilers which use the structure tree as random access data structure. This is not true for most one-pass Pascal-compilers. But one should consider that many of them do not follow the language definition exactly, e.g. do not recognize the scope rule error pointed out in the example program in Sect. 4.2.2.
Those performance improvements included in the GAG-System are described in Chapter 5. Here we concentrate on transformations applied to the AG to improve the performance of the generated analyzer. We want to describe some of the most efficient transformations in a way that facilitates their application to any AG. The final form of the AG for Pascal can be studied in Appendix A. An important aspect of our transformations is that the clarity of abstraction did not suffer. In some cases the transformations - aiming at performance improvement - even increased readability and comprehensibility of the AG.

Transformations applied to the attributed grammar aim most of all at reducing space for attribute values, but not at reducing tree storage. The space for the structure tree can only be influenced by specifying a more compact syntax, i.e. eliminating nodes for terminal or nonterminal symbols. There are usually very few ways to eliminate nodes. Due to the facilities of the GAG-System (outlined in Chapter 5) to provide a rather compact structure tree, it is more important to concentrate on transformations which improve the structure of types and the usage of attributes.

With the help of the performance measurements provided by GAG the AG designer is able to find out "critical types", i.e. pointer types of the analyzer with unexpectedly many allocations. In general most AG improvements can be deduced from such measurements. They indicate the points of the AG where a critical review should start. From the experience with Pascal we recommend for example the following transformations:

(a) Block structured environment attributes.

Environment attributes (symbol tables, definition tables) are usually specified as lists of definitions. In a block structured language the local environment of any block can be combined with its global environment by list concatenation. This method leads to a linear list of definitions. If the language contains a property requiring the construction of several different environment attributes (for Pascal it is the problem of redeclaration outlined in Sect. 4.2.2) a list of local definitions may be concatenated to several global lists. The consequence is the necessity of duplicating lists. If, instead, the environment attributes reflect the block structure of the program, as does type tp_environ for Pascal, a local sublist of definitions will not be concatenated to a global list at all. Instead, the several environment attributes contain the local lists as elements. Due to the pointer representation of complex types, a list may be an element of arbitrarily many lists without being duplicated.
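The difference between the two environment representations can be illustrated with immutable Python tuples. This is only an analogy under assumptions: Python tuples stand in for ALADIN lists, object identity stands in for the pointer representation, and the function names flat_env and block_env are hypothetical.

```python
def flat_env(local_defs, global_defs):
    """Linear environment: local definitions are concatenated to the global
    list, so every resulting list carries its own entries for them."""
    return local_defs + global_defs


def block_env(local_defs, global_env):
    """Block structured environment: the local list becomes ONE element of
    the new environment; the list itself is referenced, not duplicated."""
    return (local_defs,) + global_env


local_defs = ("Q", "R")                 # definitions local to one block
global_env_a = (("R", "P"),)            # e.g. definitions visible so far
global_env_b = (("R", "P", "S"),)       # e.g. all enclosing definitions

env_a = block_env(local_defs, global_env_a)
env_b = block_env(local_defs, global_env_b)
```

Both env_a and env_b contain the very same local list object as their first element, and each costs one additional entry regardless of how many local definitions there are.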
(b) Procedure identification.

From the performance measurements we learnt that a large amount of storage was used for list elements which had a very short lifetime. This was significant for the lists of types and kinds of actual parameters built up for procedure identification. We then decided to reverse the method of procedure identification: Instead of collecting information about actual parameters, the description of the formal parameters (which is part of the procedure declaration) is distributed to the actual ones, and the identification mechanism is split off into partial identifications performed for each actual parameter separately (see rules r62 up to r64 in Appendix A). As a first result, fewer lists are constructed. A second result is the distribution of error checks to those rules which derive the actual parameters: as outlined in Sect. 4.3, the error messages become more specific.

(c) Definition of ALADIN constants.

ALADIN allows constants of arbitrary types to be defined. It is a general recommendation to use such a facility. If a value of a complex type is constructed as a part of a semantic rule (or a function) it will be constructed as often as the semantic rule (or the function body) is evaluated, even if it would suffice to construct it once. Using a constant definition, the value is computed only once, and, moreover, the designer may associate a name to it which indicates the meaning of the value within one word: the readability increases. For Pascal we applied this method e.g. to constant values describing erroneous types (see function f_is_tp_record in Appendix A). Thus the amount of storage needed in the erroneous case decreased.
From our experience with attributed grammars we know that these transformations are applicable to many language specifications. We expect similar results as for Pascal, where these transformations (together with some minor restructuring of types) reduced the storage needed for attribute values by a factor of 10 - measured in experiments with programs of typical size and structure. Detailed measurement results are given in Chapter 5.
4.5 Results of Attribute Evaluation
Embedding an attribute evaluator into a complete compiler environment requires input and output interfaces to be specified. Obviously the input interface can be derived from the language properties (context-free grammar, symbol denotations) specified in the AG. An output interface is developed by consideration of the results of the attribute evaluation phase on the one hand and the requirements of subsequent compiler phases (e.g. code generation) on the other hand.

Primarily an AG specifies an analyzer deciding whether a given input string belongs to the defined language. A secondary result of semantic analysis is the attributed structure tree. It may be considered as the interface for a subsequently executed code generation phase. Usually only a set of certain "final attributes" attached to tree nodes is needed for the interface. Those attributes may be introduced for that purpose if not needed for the pure analyzer. Using final attributes it is even possible to describe a transformation into a target language with a structure different from the source language. Then the results of the extended analyzer are the values of the final attributes instead of the attributed structure tree. In any case the results are computed by attribute evaluation. In the following we present three approaches and compare the consequences for the AG, the analyzer, and the code generator.

(a) The AG is extended by a complete specification of the mapping into target machine code.

In this case an attribute "code" is attached to most of the symbols. It describes the instruction sequence to be generated for the corresponding subtree of the structure tree. The types of the code attributes would be target machine dependent, e.g.:

    TYPE instr_seq : LISTOF instr;
    TYPE instr     : UNION ( RR_instr, RS_instr, ... );
    TYPE RR_instr  : STRUCT ( opr: opcode, r1, r2: reg_no );
The attribute rules for the code attributes are "code-sequences" which concatenate the code produced for subtrees with certain inserted instructions. E.g. for a REPEAT-statement:

    RULE statement ::= 'REPEAT' statement_seq 'UNTIL' expr
    STATIC
      statement.start_label := new_label;
        (* generates a new symbolic label *)
      statement.code := gen_label (statement.start_label)
                        + statement_seq.code
                        + expr.code
                        + gen_jump_false (statement.start_label)
    END

The code attribute of the root of the structure tree is treated as (the only) "final attribute" containing the instruction sequence produced for the program. Such an AG describes the meaning of the source language in terms of the target language. The generated attribute evaluator is a complete compiler. This approach is acceptable if the mapping is rather simple. For high level source languages like Pascal and usual target languages the attribute rules describing the transformation will be very complex: The combination of many different cases must be considered (e.g. source language types, locations and addressing modes of the target language). In this case a modular approach should be preferred. Furthermore, an unacceptably high description effort is needed if some well known techniques for optimizing code generation are expressed by well-defined AGs.

(b) The AG is extended by the specification of a mapping into an intermediate language.

The intermediate language should be target machine independent and suitable for systematic code generation (e.g. a tree structured language with appropriate node types). The mapping is specified similarly as in (a) by code attributes and code sequences. Compared with technique (a) the transformation part of the AG is (nearly) machine independent and should be acceptably small. For the code generation phase of the compiler (operating on a sentence in the intermediate language) any implementation technique may be used. One can even specify it by a second AG for the intermediate language, which would be simpler than the extension needed in (a).

(c) The attributed structure tree is the interface for synthesis.

In this case no extension of the AG is needed. One can either integrate the complete code generation phase into the generated analyzer. It operates on the attributed structure tree and is implemented using any suitable technique. In this case complexity problems may arise from the lack of an intermediate language. Or the attributed structure tree can be written onto an intermediate file, read in and possibly transformed by the code generator.

For a language like Pascal we recommend approach (b), because the main task of code generation is separated from analysis, the transformation specification is rather simple, and the intermediate language can be chosen according to requirements of code generation.
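The "code-sequence" rule given for the REPEAT-statement under approach (a) can be imitated by list concatenation. The following Python sketch is a loose illustration under assumptions: instructions are modelled as tuples, and the instruction names LABEL and JUMP_FALSE as well as the label format are invented for this example; only the helper names new_label, gen_label and gen_jump_false follow the rule above.

```python
import itertools

_label_numbers = itertools.count(1)


def new_label():
    """Generate a new symbolic label."""
    return f"L{next(_label_numbers)}"


def gen_label(label):
    return [("LABEL", label)]


def gen_jump_false(label):
    return [("JUMP_FALSE", label)]


def repeat_code(statement_seq_code, expr_code):
    """Code attribute of REPEAT statement_seq UNTIL expr: the code of the
    subtrees is concatenated with the inserted label and jump."""
    start_label = new_label()
    return (gen_label(start_label)
            + statement_seq_code
            + expr_code
            + gen_jump_false(start_label))
```

A call such as repeat_code([("INC", "i")], [("TEST", "i")]) yields a four element sequence whose first and last entries refer to the same label.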
A preparative task for code generation is data allocation, i.e. the layout of data storage segments is determined. This problem can be easily solved by simple extensions of the original AG: The descriptor for a declared object is augmented by a component for its address relative to the enclosing activation record. Its value is computed by a mapping over the object's type and kind. The machine dependencies introduced in the AG can be closely localized. Thus the storage allocation phase is easily included into any of the approaches discussed here.

For experimental use the Pascal-AG presented here was augmented by data allocation and generation of intermediate code conceptually applying method (b). The output is processed by a code generator producing machine code for a Siemens 7.000 processor (same instruction set as the IBM 360/370 series), which was automatically generated by the CGSS-System [LJG82] using the method of [G178].
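Data allocation as described above amounts to computing one relative address per declared object. The Python sketch below is hypothetical throughout: the size table, the alignment rule and the function name allocate are assumptions for illustration and do not describe the Siemens target or the actual Pascal-AG extension; the object's kind is ignored for brevity.

```python
# Assumed storage sizes in bytes for a few predefined types.
SIZES = {"tp_int": 4, "tp_real": 8, "tp_bool": 1, "tp_char": 1}


def allocate(defs, max_alignment=4):
    """Assign each (name, type) an offset relative to the start of the
    enclosing activation record; return the layout and the total size."""
    offset = 0
    layout = []
    for name, tp in defs:
        size = SIZES[tp]
        align = min(size, max_alignment)
        # Round the offset up to the next multiple of the alignment.
        offset = (offset + align - 1) // align * align
        layout.append((name, tp, offset))
        offset += size
    return layout, offset
```

All machine dependent facts are confined to the size table and the alignment rule, which reflects the remark that such dependencies can be closely localized.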
Chapter 5: Generating Efficient Compiler Front-ends

5.1 Introduction

For several application areas efficiency of a generated attribute evaluator is not required: support of language design, validation of compiler specification, generation of compilers for experimental use. In these cases correctness of the generated products is a sufficient goal. Experience with compiler generating systems based on attributed grammars shows that straightforwardly generated attribute evaluators tend to use an enormous amount of data storage and runtime, which are not acceptable for practical usage. Here we present a collection of generation techniques improving the efficiency of the generated compilers. They are designed for and implemented in the GAG-System. Their effect is documented by results of performance measurements demonstrating that efficiency requirements for practically usable attribute evaluators can be met.
The input for the GAG-System is a strict definition in form of an AG without implementation oriented specifications such as evaluation order or representation of attributes. Thus efficiency of the generated evaluator is mainly determined by the generation techniques. Efficiency improvements are achieved by refinement of these techniques and by adding optimization facilities.

Compiler synthesis in the GAG-System has to solve four general problems, each of which is an object for improvement considerations:

- implementation of the attributed structure tree, which is the central data structure of the attribute evaluator,
- mapping of attribute types into the implementation language of the evaluator (Pascal),
- computation and implementation of an attribute evaluation order, and
- translation of expressions of semantic rules and functions.

Although the general goal - generating an efficient evaluator - is comparable with the goal of optimizing compilers, the specialized context of AGs requires specialized improvement techniques. In order to facilitate the understanding of the improvements, Sect. 5.2 introduces some basic generation concepts used in the GAG-System.

In general any attribute evaluator operates on an attributed structure tree kept in fast random-accessible storage (even pass-oriented evaluators). This requirement is an outstanding field for storage optimization. Optimization techniques suggested in [Po79] are not applicable in the GAG-System, because a non pass-oriented evaluation technique is applied. Improvements concerning the structure tree are discussed in Sect. 5.3.

A large amount of storage related to the structure tree is occupied by the objects implementing attributes. In Sect. 5.4 we present an optimization method for attribute storage based on analysis of attribute lifetime. In contrast to the methods described in [Ra79], [Po79], and [JP81], the analysis is completely performed at compiler generation time.

Furthermore, improvement methods well-known in the area of compiler optimization and program transformation are applied in the GAG environment (Sect. 5.5). Apart from the optimization techniques applied by the system, a global review of the AG may result in an essential improvement of the efficiency. In Sect. 5.6 we show how the attribute storage can thus be reduced.

Development and implementation of efficiency improving techniques should always be controlled by the effects on realistic examples. Performance data taken from analyzers for Pascal (see Chapter 4), LIS [Sa8Z], and Ada [Ada82] enabled us to concentrate on improvements yielding significant effects. The results in Sect. 5.7 (taken from our Pascal analyzer) demonstrate that efficiency requirements for practically usable attribute evaluators are achievable.
5.2 Basic Generation Concepts
Translation of an AG into a compiler is in general comparable with the synthesis phase of compilers for high level languages. However, most of the improvement methods discussed later are not comparable with usual compiler optimization. In order to understand GAG's generation and improvement techniques one should be aware of the reasons for that difference: First reason: The generated program is an attribute evaluator operating on the structure tree as central data structure. The sequence control (computed in the generation phase) is made up of tree walk instructions and evaluations of semantic rules (visit sequences). Thus data flow analysis for optimization is based on this special control structure: Storage reductions are achieved by lifetime analysis for attributes. Second reason: The source language ALADIN (see Chapter 3) is strongly applicative, i.e. there are no variables and no algorithmic control structures in the source. Hence no side effects or aliasing prohibit optimization. Furthermore the implementation is free to use pointers to objects instead of copies of object values. Third reason: The target language (implementation language of the generated compiler) is a strongly typed high level language: Pascal ([JW75], [BSI82]). Thus several compiler optimization methods are not applicable on this translation level (e.g. storage sharing of differently typed objects). In this section three basic concepts of the generated attribute evaluators are presented together with an outline of the corresponding generation techniques: The mapping of ALADIN types into Pascal types, the representation of the structure tree, and the sequence control of the evaluator. Some of the solutions presented here are improved versions of less efficient solutions, which are not mentioned explicitly.
5.2.1 Types in the Generated Compiler
The mapping of ALADIN types to Pascal types makes use of the invariability of ALADIN objects (i.e. attributes and constants) to minimize storage for the objects of the generated compiler front-end. It is possible to map logical equality of objects to physical equality: If a complex ALADIN object is built of n basic objects, no copies of these objects are necessary. Instead, the new object is built by composing pointers to the basic objects. This principle allows, for example, the threading of a definition table throughout the whole structure tree without using more space than one pointer for each nonterminal symbol with such an attribute. Of course, such a transport takes an enormous amount of space and time if value copies are implemented (which semantically is quite the same), as illustrated by experiments made with compilers generated by the HLP system [HLP78]. We give a brief survey of the type mapping:
- A simple (not structured) ALADIN type is mapped 1:1 to a Pascal type.
- A STRUCT or a UNION type becomes a pointer to a Pascal record (with variants in the case of UNION).
- A LIST is implemented by pointers to Pascal records linked to each other.
The optimization of the machine representation is a task of the compiler which is used to translate the generated Pascal program.
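The pointer principle of this section can be illustrated by a minimal sketch. It is written in Python rather than Pascal, and the names Def, DefList and extend are hypothetical; it only assumes that objects are never modified after construction, as is the case for ALADIN objects.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Def:
    """One entry of a (hypothetical) definition table."""
    name: str
    kind: str

@dataclass(frozen=True)
class DefList:
    """Immutable list cell: a reference to the head and to the shared tail."""
    head: Def
    tail: Optional["DefList"]

def extend(defs, entry):
    # A new table costs one new cell; the old table is shared, not copied.
    return DefList(entry, defs)

def length(defs):
    n = 0
    while defs is not None:
        n += 1
        defs = defs.tail
    return n

outer = extend(extend(None, Def("x", "var")), Def("p", "proc"))
inner = extend(outer, Def("y", "var"))

# Several tree nodes carry the same table at the cost of one reference each.
node_attrs = [outer, outer, inner]
```

Because the cells are immutable, `inner` may safely point into `outer`; logical equality of the common part coincides with physical identity.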
5.2.2 The Program Tree Structure
For the structure tree nodes we chose a record type with variants on two levels. The first level distinguishes syntactic lists, terminal nodes, and nonterminal nodes (where 'syntactic list' describes the ALADIN property of repeated symbols, in productions written as sx+ or sx*). The second level distinguishes the sets of attributes of terminal and nonterminal nodes. The nodes are linked to a tree by pointers:
   nonterminal node -> leftmost son
   any node         -> next right brother
where a 'brother' is either the next symbol in a production or the subsequent element in a syntactic list. An example illustrates this principle:
production:   p: sx0 ::= sx1 (sx2)+

tree:
   +--------------------+
   | RULE=p         sx0 |
   | SONS     BROTHER --+-->
   +--+-----------------+
      |
      V  1st son of sx0               2nd son of sx0
   +--------------------+          +---------------------+
   | RULE=?         sx1 |          | LISTRULE            |
   | SONS     BROTHER --+--------->| base=1  BROTHER=NIL |
   +--+-----------------+          | ELEMS               |
      |                            +--+------------------+
      V                               |
   sons of sx1                        V
                                   +--------------------+      +--------------------+
                                   | RULE=?         sx2 |      | RULE=?         sx2 |
                                   | SONS     BROTHER --+----->| SONS     BROTHER --+--> ... rest of list
                                   +--+-----------------+      +--+-----------------+
                                      |                           |
                                      V                           V
                                   sons of sx2                 sons of sx2
The maximum size of a node depends on the number of attributes. Many of them are implemented as global data, as we will discuss in Sect. 5.4. Thus we generated a Pascal analyzer with an average node size of 24 Bytes, and 1.8 nodes per input token. The runtime to access a descendant of a node is on average (i.e. access to 1st or 2nd descendant) not slower than with a structure using direct (e.g. indexed) access.
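The son/brother linkage can be sketched as follows. This is an illustration in Python, not the generated Pascal record; the class Node and the helpers link and descendant are hypothetical names, and the list node simply reuses the `sons` field for its elements (an assumption made to keep the sketch short).

```python
class Node:
    def __init__(self, symbol, rule=None):
        self.symbol = symbol
        self.rule = rule
        self.sons = None      # pointer to the leftmost son (or first list element)
        self.brother = None   # pointer to the next right brother

def link(parent, children):
    """Chain the children by brother pointers and hang them below parent."""
    prev = None
    for child in children:
        if prev is None:
            parent.sons = child
        else:
            prev.brother = child
        prev = child
    return parent

def descendant(node, n):
    """Return the n-th son (1-based) by following n-1 brother pointers."""
    child = node.sons
    for _ in range(n - 1):
        if child is None:
            return None
        child = child.brother
    return child

# Tree for production p: sx0 ::= sx1 (sx2)+ with two list elements.
sx_list = link(Node("list"), [Node("sx2"), Node("sx2")])
sx0 = link(Node("sx0", rule="p"), [Node("sx1"), sx_list])
```

Reaching the first or second descendant takes at most one brother step, which is consistent with the remark that access is on average not slower than indexed access.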
5.2.3 Sequence Control of Attribute Evaluators
A sequential attribute evaluation algorithm controls the walk through the structure tree and the evaluation of attributes during the visit of a tree node. The computation of each attribute value is specified by the attribute rules in the AG. The tree walk must ensure that before an attribute is evaluated the attributes it depends on are already evaluated. For efficient attribute evaluators the analysis of attribute dependencies and determination of the tree walk has to be performed at generation time. The problem can be solved in two different ways:
Pass-oriented evaluators walk through the tree according to a general strategy (e.g. left to right depth first). The strategy is a priori fixed, the number of such passes needed is determined by the attribute dependencies - provided such a number exists for an arbitrary input.
Non pass-oriented evaluators as suggested in [KW76] and [Ka80] are applicable for larger classes of AGs. Instead of a general strategy they are controlled by local tree walk rules ("visit-sequences") attached to the productions applied for the construction of the structure tree.
The local tree walk rules are computed at evaluator generation time; they are individually "shaped" according to the attribute dependencies in the corresponding context. A rough comparison of both solutions under efficiency aspects yields the following: On the one hand pass-oriented evaluators cause less runtime overhead for sequence control. On the other hand in most cases a non pass-oriented evaluator performs significantly fewer movements in the tree because the tree walk rules are better fitted to the attribute dependencies; often a complete pass can be saved by a local deviation from a global strategy. Measurements show that these effects should not be overestimated: The total amount of runtime for sequence control is not significant compared with the time needed for attribute computation. Storage optimization techniques for the structure tree are more easily applicable in the case of pass-oriented evaluators (cf. [Po79] and Sect. 5.3.3). One should be aware that even pass-oriented evaluators need the structure tree in random access storage (if no optimization is applied) because each inner node is visited at least twice. The dominating advantage of automatic generation of non pass-oriented evaluators is the greater freedom for AG development, disregarding the attribute evaluation order. This freedom proved to be useful especially for the development of rather large and complex languages like PEARL and Ada.
In the OAG method the local tree walk rules are sequences of the basic operations of the attribute evaluator associated with the context-free productions. The basic operations are:
- visit the n-th descendant node of the current node,
- leave the current node to the ancestor,
- evaluate an attribute of the current node or of one of its direct descendants, as specified in an attribute rule associated with the same production,
- check a context condition attached to the production.
Attribute evaluation (usually) starts at the root of the structure tree. The visit-sequence to be executed is determined by the production applied for the derivation of the currently visited node. An upward or downward move in the tree causes the evaluator to resume execution of the visit-sequence attached to the reached node. This kind of sequence control can be implemented either by a set of recursive procedures (the calls implement descendant visits), by coroutines, or by a table driven interpreter (cf. [Ka80]). In order to avoid the overhead of procedure or coroutine implementations we chose the table driven technique for the GAG-System. A straightforward implementation contains a CASE statement (one case for each evaluate- and check-operation) within the central interpreter loop. The visit sequences are stored in a vector, each element of which either addresses an evaluate or check case or encodes a visit or leave operation. Execution of a visit operation pushes the current node and the return position within the visit-sequence onto a stack; the new current node and the next operation are determined by the table entry and the production associated to the visited descendant node. A leave operation pops the new current node and the return position from that stack.
The following improvements are automatically performed in the GAG-System:
- Any sequence of evaluate and check operations (not containing a visit or leave) is combined to one basic operation of the evaluator. So the number of table entries and of evaluate and check cases is reduced, and runtime for interpreter control is decreased.
- The size of the visit-sequence table is reduced by overlaying any two sequences v1, v2 if the encoding of v2 is equal to a tail of v1. The effect of this technique heavily depends on the encoding of the table entries. Good results are achieved by labelling the cases for each production with consecutive numbers. The actual case label is computed by adding a base value for the production to the table entry. The effect is twofold: The range of the table entry values is small (even for large AGs two bytes are sufficient). The number of overlayed visit-sequences is increased.
As was mentioned above, the runtime needed for sequence control of the interpreter is not significant compared with attribute evaluation (execution of the evaluate-check cases). The same holds for the amount of storage needed for the visit-sequence table which is not significant compared with tree and attribute storage (about 1 KByte for the Pascal front-end).
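The table driven sequence control described in this section can be sketched as follows. The sketch is in Python and is not GAG's generated Pascal code: the toy grammar (root, pair, leaf), the attribute names, and the encoding of a visit entry as (descendant index, visit number) with a START map per production are assumptions made for illustration. Nodes keep their sons in a Python list for brevity.

```python
class Node:
    def __init__(self, rule, sons=(), value=0):
        self.rule, self.sons, self.value = rule, list(sons), value
        self.attrs = {}

# Evaluate cases (the branches of the CASE statement); each gets the current node.
def leaf_sum(n):
    n.attrs["sum"] = n.value
def leaf_rest(n):
    n.attrs["rest"] = n.attrs["total"] - n.value
def pair_sum(n):
    n.attrs["sum"] = sum(s.attrs["sum"] for s in n.sons)
def pair_down(n):
    for s in n.sons:
        s.attrs["total"] = n.attrs["total"]
def root_total(n):
    n.sons[0].attrs["total"] = n.sons[0].attrs["sum"]

CASES = [leaf_sum, leaf_rest, pair_sum, pair_down, root_total]
EVAL, VISIT, LEAVE = "eval", "visit", "leave"

# All visit-sequences stored in one vector.
TABLE = [
    (VISIT, 0, 1), (EVAL, 4), (VISIT, 0, 2), (LEAVE,),   # root, visit 1, at 0
    (VISIT, 0, 1), (VISIT, 1, 1), (EVAL, 2), (LEAVE,),   # pair, visit 1, at 4
    (EVAL, 3), (VISIT, 0, 2), (VISIT, 1, 2), (LEAVE,),   # pair, visit 2, at 8
    (EVAL, 0), (LEAVE,),                                 # leaf, visit 1, at 12
    (EVAL, 1), (LEAVE,),                                 # leaf, visit 2, at 14
]
START = {("root", 1): 0, ("pair", 1): 4, ("pair", 2): 8,
         ("leaf", 1): 12, ("leaf", 2): 14}

def evaluate(root):
    stack = []                                  # (node, return position) pairs
    node, pc = root, START[(root.rule, 1)]
    while True:
        op = TABLE[pc]
        if op[0] == EVAL:
            CASES[op[1]](node)
            pc += 1
        elif op[0] == VISIT:
            stack.append((node, pc + 1))
            node = node.sons[op[1]]
            pc = START[(node.rule, op[2])]
        else:                                   # LEAVE
            if not stack:
                return
            node, pc = stack.pop()

leaves = [Node("leaf", value=3), Node("leaf", value=4)]
tree = Node("root", [Node("pair", leaves)])
evaluate(tree)
```

In this toy grammar every inner node is visited twice: the first visit computes `sum` bottom-up, the second distributes `total` downward, so each leaf ends up with `rest = total - value`.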
5.3 Improvement of the Tree Representation
An important aspect of an efficient attribute evaluator is the size of the structure tree. The basis of a compact tree is a compact tree node, as we presented it in Sect. 5.2. The number of attributes within the tree node is reduced by the globalization algorithm of Sect. 5.4. The size of the attributes remaining as tree node components is limited:
- attributes of simple, STRUCT, or UNION types do not use more than one word in the node,
- attributes of list types use two words,
- attributes of set types use setwidth/wordlength words if implemented as bit vectors.
The next table shows the minimum size of the different node variants. We assume a word (W) of 4 Bytes (B), and an AG with no more than 255 rules and symbols, so that rule and symbol indicators do not use more than one Byte each. A pointer is assumed to use a word. Layout gaps are not considered. A node of minimal size is of course a node without attributes.
node variant      | used Bytes | full words
------------------+------------+-----------
syntactic list    |    14 B    |    4 W
AG-rule           |    21 B    |    6 W
terminal node     |     6 B    |    2 W
empty successor   |     6 B    |    2 W
The maximum length, of course, depends on the number of attributes. The analyzer generated for Pascal produces structure trees with about 43 Bytes per token and 1.6 nodes per token. That means an average node size of 26 Bytes per node, including attributes implemented by tree node components.
Subsequently we discuss concepts to reduce the size of structure trees. The tree compactification described in Sect. 5.3.1 is performed by the GAG-System. The methods of the sections 5.3.2 and 5.3.3 are not implemented because we concentrated on optimization techniques which do not save space at the cost of runtime. Nevertheless these methods are interesting whenever space becomes critical on the target machine of the generated compiler.
5.3.1 Tree Compactification
A significant part of usual AGs consists of semantic chain productions. A semantic chain production is a rule
   RULE r: X ::= s Y t STATIC TRANSFER END;
where the attribute sets of both X and Y are the same, and s and t are terminal strings.
Take Pascal as example: about 15% of the rules of the AG are of this kind. Thus it is useful to eliminate those productions automatically. The following algorithm is implemented: The nonterminal symbols of the input AG are mapped to classes of symbols with identical sets of attributes. The rules of the AG are transformed into rules with a representative symbol of each class substituted for the original symbols. Chain productions within one class are thus eliminated. For parser generation the original context-free syntax is extracted from the AG. It is augmented by procedure calls for tree construction, such that a tree according to the transformed syntax is built. The transformed AG consists of fewer rules, symbols, attributes, and attribute rules. For Pascal, e.g., the following characteristics have changed:

               original AG   transformed AG
symbols             68             50
rules              143            125
attributes         195            140
attr. rules        401            346

(attributes and attribute rules counted after expansion of TRANSFER and CONSTITUENT(S)).
Detailed information about the results of this transformation is contained in Appendix B, Sect. B.3. There it is evident that the transformation changes the expression syntax most of all. We validated this fact with the help of other AGs. The dynamic effect is significant: About 25% of the tree storage can be saved by this means. The design of the AG influences the result of this optimization: One should try to remove checks and computations from the context of syntactic chain productions.
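The class mapping and the removal of chain productions can be sketched as follows. This is a simplified Python illustration, not GAG's implementation: the function compactify, the rule tuples, and the flag transfer_only (marking rules whose only semantic part is STATIC TRANSFER) are hypothetical, and the small expression grammar is invented for the example.

```python
def compactify(attrs, rules):
    """attrs: nonterminal -> frozenset of attribute names.
    rules: (name, lhs, rhs, transfer_only) tuples; rhs symbols that are not
    keys of attrs are terminals. Returns (representative map, new rules)."""
    rep_of_class, rep = {}, {}
    for sym, attr_set in attrs.items():
        # The first symbol seen with a given attribute set represents its class.
        rep[sym] = rep_of_class.setdefault(attr_set, sym)

    transformed = []
    for name, lhs, rhs, transfer_only in rules:
        new_lhs = rep[lhs]
        new_rhs = [rep.get(s, s) for s in rhs]   # terminals stay unchanged
        nonterminals = [s for s in new_rhs if s in attrs]
        # X ::= s Y t with pure attribute transfer inside one class is dropped.
        is_chain = transfer_only and nonterminals == [new_lhs]
        if not is_chain:
            transformed.append((name, new_lhs, new_rhs))
    return rep, transformed

T, ENV = frozenset({"type"}), frozenset({"env"})
attrs = {"stmt": ENV, "expr": T, "term": T, "factor": T}
rules = [
    ("r1", "stmt",   ["id", ":=", "expr"],   False),
    ("r2", "expr",   ["expr", "+", "term"],  False),
    ("r3", "expr",   ["term"],               True),
    ("r4", "term",   ["factor"],             True),
    ("r5", "factor", ["(", "expr", ")"],     True),
    ("r6", "factor", ["id"],                 False),
]
rep, transformed = compactify(attrs, rules)
```

In this example three of the six rules disappear, all of them in the expression part of the grammar. The transformed syntax is only used for tree construction; parsing still follows the original context-free syntax.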
5.3.2 Tree Partitioning
It is possible to give up the principle of the complete program tree in random access storage as base for the attribute evaluation. If the parser of the generated front-end writes a linearized program tree onto an intermediate file, the attribute evaluator can operate on incomplete trees. Storage for certain subtrees can be released before a new subtree is built on demand of the evaluator. This is useful most of all for subtrees which are visited only once. The tree partitions can be computed by the generator using the information of the visit-sequences. Nevertheless the price for saving storage is the time used to write and (repeatedly!) read the intermediate file. Moreover, it is necessary to define an intermediate representation for attributes, if they have to be preserved on a file, too. We found that the storage requirements of the generated Pascal front-end (Chapter 4), where the symbol 'block' is visited just once, can be reduced enormously by evaluating subtrees of 'block' separately. Given a Pascal program which is a rather linear sequence of procedure definitions (a usual programming style for Pascal), the tree storage might be reduced by a factor of 10. If an AG is of type LAG(1), i.e. program trees can be analysed by one pass left-to-right, this concept allows the destruction of each node after leaving it in the upward direction. That means only the stack of ancestors of the currently visited node is necessary during attribute evaluation. Thus the "tree" needs no appreciable storage at all.
5.3.3 Attribute Evaluation Without Tree
Since the visit sequences together with the information of the currently visited rule control the attribute evaluation, it is possible to give up the concept of the program tree altogether (see also [Po79]). For an OAG the general concept is the following: Whenever the parser reduces a certain production, the actions of its visit sequence can be stored - in the order of their evaluation - on an intermediate file. If all visits are replaced by the actions they invoke for the given program tree, the parser produces a file of indicators of actions to be evaluated sequentially. Therefore no tree is needed for the attribute evaluation, only the control file is read and interpreted. The problems are threefold:
First, the replacement of visits in visit sequences cannot be done completely at the moment a production is reduced by the parser, i.e. either the control file must be read and rewritten several times during parsing, or it must be sorted at the end of the parsing phase.
Second, in general some attributes remain node components, i.e. their instances must be stored with a unique identification of the node instance they belong to ([Po79]: their 'location' must be determined).
Third, during attribute evaluation it is possible to get access to some structural information: existence of a certain optional successor, length of a particular syntactic list, rule indicator of a successor or of the ancestor node.
Therefore the intermediate control file has to provide for each action
- the indicator of the action itself,
- the unique indicator of the current node,
- any special indicators if syntactic lists are involved,
- the indicator of the rule of the current node,
- indicators and 'existence markers' of all successor nodes.
The attributes which are not globalized may be stored in any data structure where the unique node indicators provide the access. A test for a certain successor rule implies a search on the control file. The same effort is necessary for the management of syntactic lists. Some rather simplified experiments showed that the intermediate control file increases to a multiple of the tree size, and that it is difficult to find an appropriate data structure for the local attributes concerning space and access time. The concept of [Po79] includes the allocation of space for the attributes during parse time and stores addresses in the control file, which would again cause the file records to increase. Besides, writing the control file is - with all these merging problems - always slower than building the tree, and reading the control file will not be quicker than the tree walk, since there are no time consuming useless visits in an OAG evaluator. So one must see that this is an optimization of storage requirements at the cost of execution time. Therefore we cannot recommend the general replacement of the tree by a control sequence, but nevertheless it is possible. Given our compact tree representation or the easily implementable method of tree partitioning, the evaluator will not be so space consuming that the control sequence variant would be worthwhile.
5.4 Attribute Optimization
In general attributes are implemented by components of tree node records. The attribute optimizer of GAG finds out those attributes which can be implemented by global variables or stacks. Together with the efficient mapping of ALADIN types to Pascal types - outlined in Sect. 5.2.1 and Chapter 2 - we achieve acceptable storage efficiency. The criteria for determination of global attributes are developed in [As79].
The basic idea is lifetime analysis for attribute instances at compiler generation time: If it can be proved that for a certain attribute, say X.a, in any structure tree the lifetimes of all of its instances (the attributes of nodes representing X) are
(a) pairwise disjoint, or
(b) pairwise disjoint or properly included,
then X.a is implemented by a global variable in the case of (a) or a global stack in the case of (b). If neither (a) nor (b) can be proved, the standard implementation by a node component is chosen.
The visit-sequences describing the tree walk and the computation and use of attributes are a suitable base for lifetime analysis. For any occurrence of an attribute X.a in a visit-sequence the range from its computation to its last usage is considered. The point of "last usage" is determinable for each visit-sequence where symbol X (and thus attribute X.a) is known. If, in a particular rule context, no applied occurrence of X.a exists, the end of its lifetime in the associated visit-sequence (VS) is the point where this VS is reached. The criteria are:
- If no range contains the computation or use of a second occurrence of X.a or a visit which can reach such an occurrence, condition (a) holds (global variable).
- If no range contains the computation or use of a second occurrence of X.a (such that the lifetimes of these occurrences are not properly included) or an ancestor visit, condition (b) holds (global stack).
These conditions are sufficient and easily decidable for all attributes.
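Conditions (a) and (b) can be made concrete by a small sketch. Note the simplification: GAG proves the conditions at generation time from the visit-sequences, for all possible trees, whereas the Python sketch below only classifies one given set of instance lifetimes. The representation of a lifetime as a pair (start, end) of evaluation steps and the function names are assumptions for illustration.

```python
from itertools import combinations

def disjoint(a, b):
    """True if the two lifetimes do not overlap."""
    return a[1] < b[0] or b[1] < a[0]

def properly_included(a, b):
    """True if one lifetime lies strictly inside the other."""
    return (b[0] < a[0] and a[1] < b[1]) or (a[0] < b[0] and b[1] < a[1])

def classify(lifetimes):
    pairs = list(combinations(lifetimes, 2))
    if all(disjoint(a, b) for a, b in pairs):
        return "global variable"          # condition (a)
    if all(disjoint(a, b) or properly_included(a, b) for a, b in pairs):
        return "global stack"             # condition (b)
    return "node component"               # neither (a) nor (b)

sequential = [(1, 2), (3, 5), (6, 9)]
nested = [(1, 10), (2, 4), (5, 9), (6, 7)]
overlapping = [(1, 5), (3, 8)]
```

Nested lifetimes match stack discipline: the instance computed last is also the one whose lifetime ends first, so its value can be popped before the enclosing instance is used again.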
If an attribute is implemented by a stack, a PUSH operation is added to the evaluation of a new instance, and a POP operation is inserted at the end of its lifetime. Thus instances evaluated within a subtree are removed from the stack when the root of the subtree is reached again. Furthermore overlays are computed for different attributes selected for global implementation:
(c) Two attributes recognized as global variables may overlay each other if the lifetimes of all of their instances are pairwise disjoint.
(d) Two attributes recognized as global may be implemented by a common stack if the lifetimes of all of their instances are either pairwise disjoint or properly included.
The result is a set of "groups", each group representing a collection of attributes; and each group is implemented either by one variable or one stack. An optimal overlay structure can be determined by a graph colouring algorithm. We implemented a simple (heuristic) "first fit" algorithm yielding satisfactory results. Apart from the inserted stack operations the attribute optimizations additionally reduce code length and runtime: Transfer rules between attributes which share the same space are superfluous and therefore eliminated.
The following table illustrates the effect of the optimization phase. The basic data - number of attributes and semantic rules - are taken from the transformed versions of the AGs without semantic chain productions (see Sect. 5.3.1). Concerning our AG for Pascal, the optimization results are also given in Appendix B, Sect. B.3.
AGs                          Ada    PEARL   Pascal    LIS

no. of attributes            517     575     140      449

possible global variables    324     187      44      222
possible global stacks       108      31      37       77
node components               85     357      59      150

variable groups               59      60      11       32
stack groups                  24      23      18       23

push operations              299      64     128      162
pop operations               333      50     150      201

eliminated transfer rules    228      55      65      107
in % of all sem. rules        21     2.0       7      5.5
From the resulting numbers of groups, which will each be implemented as a global data structure, we see the degree of compression. For example Ada: 84% of all attributes may be globalized. The number of these global data is then reduced to 20%. The relative number of eliminated transfer rules seems to be surprisingly low. But one must consider that typically most of the attribute rules necessary for simple attribute transport are abbreviated by INCLUDING-expressions (see Chapter 3). In the generated evaluator the INCLUDINGs are simulated by a procedure reading the attribute via the node stack. Thus no additional transfer rules are generated (see Chapter 2), and the number of transfer rules in a typical ALADIN-AG is rather low. If one compares the transfer rules eliminated by the optimizer with the complete number of necessary transfers - i.e. comprising the implicit transfers of the INCLUDING operations - for Pascal more than 30% of all necessary transfers are eliminated. So we can point out four optimization results:
- the space needed for the tree is minimized by automatically introducing global attributes;
- the space needed for global attributes is minimized by computing overlay structures;
- transfer rules between global attributes are minimized;
- an "end of lifetime" for all attribute instances is computable in order to deallocate attribute space without a runtime check.
The last result becomes important when using a low level implementation, where all storage management is done by the generated compiler. GAG leaves storage management to the Pascal runtime system. Moreover, since the generated compiler uses pointer semantics (see Sect. 5.2.1), the deallocation time for a complex object cannot be determined at generation time, because it is in general composed of (dynamically) many values with different lifetimes. The end of the lifetime of an attribute instance is no longer identical to the end of the lifetime of its value.
Another variant of attribute optimization is to overlay different tree attributes of the same symbol, if their lifetimes are disjoint and their types (or even more general: their sizes) are identical. A typical example are attributes describing a-priori and a-posteriori types of expressions in many languages. Since the end of the lifetime is explicitly known even for tree attributes, it is easy to determine whether two attributes of one symbol may share one location. Only attributes of the same symbol occurrence are put together, thus there is no need to examine different attribute occurrences.
If both variants are to be applied, one must consider that the overlay of two attributes in one node prolongs the lifetime of the resulting super-attribute. Applying the globalization algorithm subsequently may yield worse results. On the other hand, if the globalization is performed first, few attributes remain tree node components, so that the algorithm for overlays within tree nodes does not find many attributes to combine. Hence the globalization is the more important optimization.
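The "first fit" grouping of global attributes mentioned in this section can be sketched as follows. This Python sketch assumes that the pairwise overlay test (condition (c) or (d)) has already been decided and is available as a predicate; the function first_fit, the attribute names a to d, and the conflict set are hypothetical.

```python
def first_fit(attributes, compatible):
    """Put each attribute into the first group whose members it may overlay."""
    groups = []
    for a in attributes:
        for g in groups:
            if all(compatible(a, b) for b in g):
                g.append(a)
                break
        else:
            groups.append([a])      # no existing group fits: open a new one
    return groups

# Pairs of attributes whose instance lifetimes were found to interfere.
conflicts = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}

def compatible(x, y):
    return frozenset((x, y)) not in conflicts

groups = first_fit(["a", "b", "c", "d"], compatible)
```

Each resulting group would be implemented by one variable or one stack. First fit is a heuristic: the number of groups depends on the order in which the attributes are processed and need not be the minimum that a graph colouring would find.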
5.5 General Optimization Techniques
Attribute rules and functions of an AG are written in an applicative high level language (ALADIN). Code improvement techniques well-known in the area of optimizing compilers and program transformation are applicable for the translation into Pascal. Due to the character of the source language, elimination of common subexpressions and transformation of recursions into loops are considered to be most effective. The following typical ALADIN function gives an example for both features.
   FUNCTION g (l: LIST, e: LIST_ELEM_TYPE) t:
      IF EMPTY(l) THEN c1
      ELSE IF (HEAD(l) = e) OR (HEAD(l) = c2)
           THEN f (HEAD(l))
           ELSE g (TAIL(l), e)
           FI
      FI
Experiments have demonstrated that drastic reductions of runtime and moderate reduction of code length can be achieved.
5.5.1 Common Subexpressions
The algorithm for common subexpression elimination is simpler than in usual compilers, because there are no variables in the source language. No data flow analysis is needed in order to determine whether the values of two occurrences of an expression are the same: Within the semantic rules associated to one syntax rule or within the body expression of one function all equally named identifiers have the same value. If a common subexpression in a rule context is pre-evaluated and preserved for its applied occurrences, it must be implemented like an attribute. That means its storage can be optimized by the same criteria as attribute storage. Especially: If the lifetime of a pre-evaluated subexpression does not contain a visit, it can be preserved in a variable. Within function bodies only parameters and constants are used. Common subexpressions may be preserved in local variables of the translated function. Reductions of code length and runtime most of all for function calls are the result.
5.5.2 Recursion Elimination
List structures and recursive functions frequently occur in applicative language specifications. If translated straightforwardly to Pascal, each recursive call costs significant runtime overhead. Many of the list manipulating functions are pure tail recursions of the kind described in function g above. They are easily transformed into a loop. Experiments show that transformation of only the most frequently called recursive function into a loop (in an AG defining a language like Pascal this is the identification function) reduces execution time of bigger programs drastically (see also Sect. 5.7).
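Both transformations can be illustrated with function g of Sect. 5.5. The sketch below is in Python instead of the Pascal code that GAG generates; lists are modelled as (head, tail) pairs with None as the empty list, and the constants C1, C2 and the function f are hypothetical stand-ins for c1, c2 and f.

```python
C1, C2 = -1, 0                # hypothetical values for the constants c1 and c2

def f(x):                     # hypothetical function f
    return x * 10

def cons(*items):
    """Build a linked list of (head, tail) pairs; None is the empty list."""
    l = None
    for x in reversed(items):
        l = (x, l)
    return l

def g_recursive(l, e):
    """Direct transcription of the ALADIN function g."""
    if l is None:                       # EMPTY(l)
        return C1
    if l[0] == e or l[0] == C2:         # HEAD(l) occurs several times
        return f(l[0])
    return g_recursive(l[1], e)         # tail call g(TAIL(l), e)

def g_loop(l, e):
    """Same function after both transformations."""
    while l is not None:
        head = l[0]                     # common subexpression, evaluated once
        if head == e or head == C2:
            return f(head)
        l = l[1]                        # replaces the recursive call
    return C1

sample = cons(5, 7, 9)
long_list = cons(*range(1, 50001))
```

The loop version needs no call stack, so it also handles `long_list`, where the recursive version would exceed Python's default recursion limit.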
5.5.3 Inline Code for Functions
ALADIN functions for list manipulation use - apart from the recursive structure - most of all predefined functions like HEAD, TAIL, EMPTY to handle the lists (see also the example above). Since these functions specify short, non-recursive operations, it is possible to generate them as inline code. Thus another reduction of execution time is achieved, which may reach 25%. Generating inline code for the ALADIN function HEAD is possible in a "safe" AG, i.e. in an AG where the implicit error condition "HEAD called for an empty list" (see Chapter 3) is not used as language restricting facility.
5.6 AG Transformations
Various tests with generated compiler front-ends, most of all generated from several versions of the attributed grammar for Pascal, showed that efficiency is not only a question of the generation tool. Using the knowledge of GAG's generation and optimization techniques, we found several strategies to write an attributed grammar for compiler generation. The essence of an AG for language specification is a clear model of the static semantics, with no regard to implementation details. But generating an efficient compiler means defining data structures - i.e. attribute types - which can be implemented and accessed efficiently. This is in general no contradiction to a good abstraction: In some cases the readability may even increase.
There are three areas which determine the efficiency of an automatically generated evaluator: The number of attributes, the structure (i.e. types) of the attributes, and the AG defined operations on them (i.e. functions). Of course these areas are not independent of each other.
In Chapter 4 we studied some important transformations applied to the AG for Pascal and applicable to many AGs. Here we want to summarize those transformations in a more abstract way: One can say that two aspects should guide a review of the attributed grammar to improve efficiency.
First aspect: For well chosen attributes it should be possible to explain their meaning in terms of language properties, e.g. "the type of an expression". Adding redundant information to an abstraction neither simplifies it nor specifies a more efficient implementation. The occurrence of "dummy values" for attributes or components thereof can be an indicator for bad abstractions. A (simplified) example: If there is a list of structures where one field always has the same value, the field is obviously unnecessary.
Second aspect: Complex operations on data structures like lists can often be simplified by changing the meaning of parts of the data structures. Example: Given the list of (function) declarations in Pascal. If a FORWARD declaration for a function F is resolved, there are two solutions to describe it:
(a) Rebuild the declaration list replacing the declaration with the remark 'FORWARD' by a complete function declaration for F, or
(b) add a second declaration descriptor for F with the remark 'resolved' and the meaning: the latest descriptor found in the list represents the complete function declaration.
The second solution prohibits expensive rebuildings of lists at the cost of one additional list element.
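Solution (b) can be sketched as follows. The Python sketch models the declaration list as immutable (descriptor, tail) pairs with the latest descriptor first; the helpers declare and lookup and the remark strings other than 'FORWARD' and 'resolved' are assumptions for illustration.

```python
def declare(decls, name, remark):
    """Return a new list with one more descriptor; the old list is shared."""
    return ((name, remark), decls)

def lookup(decls, name):
    """Return the remark of the latest descriptor for name, or None."""
    while decls is not None:
        (n, remark), decls = decls
        if n == name:
            return remark
    return None

before = declare(declare(None, "F", "FORWARD"), "G", "complete")
# Resolving F adds one element instead of rebuilding the list.
after = declare(before, "F", "resolved")
```

The list `before` is untouched and remains the tail of `after`, so attribute instances that still refer to the old list stay valid and no cell is copied.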
It is not unexpected to find the same situation for AGs (specifications) and compiler generators as for programs and compilers: The first design of a larger program does in general not yield the most efficient code. A review under efficiency aspects is necessary in most cases. The GAG-System offers valuable support for the review phase. A compiler may be generated which contains detailed measurements for storage allocations and function calls. During an execution of the compiler the results of the measurements are written to a file which is later on interpreted by a support module of GAG. Studying the protocol of the objects allocated during attribute evaluation, the designer of the AG may deduce inefficient usage of objects, e.g. for lists. An example for a protocol of runtime measurements can be found in Appendix B, Sect. B.5. It contains an analysis of the heap storage used for tree construction and attribute allocation, and it also reveals the runtime spent on function calls (resp. function body executions wherever a recursion could be transformed into a loop).
5.7 Application to Pascal
The effects of the presented improvement techniques are illustrated by results of measurements for generated Pascal analyzers. Table 5.1 shows the results of several optimization phases described below for two sample programs (modules of the GAG-System) of different sizes. It records the development of the GAG-System and, partially, of the AG for Pascal. All improvement methods discussed in this chapter except for common subexpression elimination and tree partitioning are included. The lines of the table reflect a sequence of single improvements such that line i comprises at least the features of line i-1. The total effect was a storage reduction for the structure tree by a factor of 2 and for the attributes by factors between 4 and 10, based on the non-optimized version, which is already rather compact due to the pointer implementation. All improvements mentioned here (apart from the AG transformations) are performed automatically by the GAG-System. The measurements for chain production elimination (d) and runtime reduction (g) are based on systematic manual changes of the generated code using a system version without these facilities. We verified that the total effects of the automatic improvements are even better.
                           Program 1                   Program 2
Input length (lines)          391                        2463
Input length (tokens)        1682                        9660

                        Space (KB)     Time        Space (KB)     Time
Optimization            Tree   Attr.   (sec)       Tree   Attr.   (sec)
----------------------------------------------------------------------
a) none                  202     55      8         1057    612     112
b) attribute lifetime    181     58      8          953    619     115
c) AG transformation     179     32      8          947    250     113
d) chain productions     121     27      9          630    182     130
e) AG transformation     118     15      9          606     67     130
f) tree structure         84     15      9          409     67     130
g) recursion              84     15      6          409     67      37

Table 5.1
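The overall reduction factors can be recomputed from the first and last lines of Table 5.1. The following Python sketch is only a worked calculation on the published figures; the names ROWS and reduction_factors are hypothetical.

```python
# Figures copied from Table 5.1: (tree KB, attribute KB, time sec)
ROWS = {
    "a": {"program1": (202, 55, 8), "program2": (1057, 612, 112)},
    "g": {"program1": (84, 15, 6), "program2": (409, 67, 37)},
}


def reduction_factors(program):
    """Ratio of line a) to line g) for tree space, attribute space and time."""
    before = ROWS["a"][program]
    after = ROWS["g"][program]
    return tuple(round(b / a, 1) for b, a in zip(before, after))


for program in ("program1", "program2"):
    tree, attr, time = reduction_factors(program)
    print(f"{program}: tree x{tree}, attributes x{attr}, time x{time}")
```

The tree factors come out slightly above 2 and the attribute factors at roughly 4 and 9, which is in line with the factors quoted in the text.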
Explanation of the improvements:

a) Data representation. In the first version only storage reduction by pointer representation for complex types was applied.
b) Lifetime analysis. As described in Sect. 5.4, the size of the tree nodes is reduced by elimination of components for attributes which are implemented by global variables or stacks. The 195 attributes of the Pascal-AG are implemented by 11 variables (e.g. for at_defs_in, at_defs_out), 18 stacks (e.g. for attributes of expressions), and 59 node components. Several attributes share one variable or stack. The effect of this optimization was not as large as one might have expected because all structured attributes are implemented by pointers. The space for the attributes in the structure tree was reduced, but for attribute values it increased slightly because of the global stacks. Since our Pascal implementation (like many others) does not provide an effective recovery of heap storage, space for attribute values referenced by pointers is not deallocated when their lifetime has ended. Therefore a careful design of attribute types and their usage is the precondition for reducing attribute storage.

c) Reduction of space for attribute values by introducing block structured environment attributes (see Chapter 4) and by restructuring the description of the "kind" of values.

d) Elimination of nodes for chain productions. Although the main goal of this method is reduction of the tree size, it was possible to redesign some attributes such that more chain productions were isolated and some attribute storage was saved.

e) Reduction of space for attribute values by redesigning procedure identification (see Chapter 4) and introducing additional constant definitions.
f) Space reduction for structural information in tree nodes. The GAG-System was modified such that each node of the tree contains references to its first son and its brother only instead of references to all sons. This space reduction did not affect runtime.

g) Runtime reductions. The runtime for attribute evaluation is reduced drastically by transforming tail recursion into loops. Further significant savings are achieved by inline code for calls of certain functions (e.g. HEAD, TAIL).

The execution time of the final version of the analyzer could still be improved by excluding the runtime checks provided by the Pascal compiler which is used to translate the analyzer. The final execution time of Program 2, e.g., was 28 seconds, of which 9 seconds were needed for parsing and tree construction.

From runtime analysis we learnt that the time needed for the tree walk is not significant compared with the time needed for computing attribute values (1% to 10% depending on the applied runtime improvements). Thus for runtime reductions one should not try to minimize the number of visits (or passes) if the complexity of attribute evaluation is increased. Furthermore, pass-oriented attribute evaluation instead of the more general OAG technique will not improve runtime: The effects of the simpler control operations are not significant and are in most cases at least compensated by the increased length of the tree walks.

It is clear that there are further effective and systematic optimizations which have not yet been considered, e.g.:

h) elimination of parts of the structure tree which are no longer needed for attribute evaluation,
i) reduction of the analyzer's program and code length by improving the generator. At the moment the code of the generated analyzer needs about 150 KB without test output operations and about 319 KB with all available test facilities.

Thus we expect that efficiency need not be a reason for preferring a manually implemented analyzer to a generated one. The generated attribute evaluator has an average performance of about 350 tokens per second. Using an automatically generated parser [De77] and a separately running scanner program, each with a performance of about 1000 tokens/sec, we gain a total performance of 250 tokens/sec for our generated front-end.

The decrease of 90% of the attribute storage is most of all a result of AG transformations. But one must consider that the attribute storage of 619 KB is already rather compact, because copies of complex data structures are implemented by pointer copies. Using value copies instead would increase storage requirements, e.g. for environment lists in typical examples by a factor of 4. That is why storage improvement techniques in compiler generating systems like HLP [HLP78] start from enormous space requirements.

Fig. 5.1 and 5.2 show the performance of the generated Pascal front-end: storage and time grow almost linearly with the size of the analysed program. (We analysed several programs of the GAG-System in order to ensure that they follow the new BSI standard [BSI82].) The storage measured in Fig. 5.2 comprises only heap allocations. The runtime stack of the program needs at least 10 KB, which are used for the node stack, the stack for the evaluator state, the visit-sequence tables, anchors of complex global attributes, simple global attributes, anchors of global attribute stacks, and auxiliary variables. Depending on the size of the analysed program the stack may increase to about 100 KB for parameter values and return addresses for calls of (recursive) functions.
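The transformation of tail recursion into loops, which accounts for the runtime reduction in line g) of Table 5.1 and also limits stack growth, can be illustrated as follows. This Python sketch is not the code generated by GAG; the function names are hypothetical and the list search merely stands in for a typical recursive function of an AG.

```python
def contains_recursive(items, el, i=0):
    """Tail-recursive form, as an applicative specification would state it."""
    if i == len(items):
        return False
    if items[i] == el:
        return True
    return contains_recursive(items, el, i + 1)  # tail call


def contains_loop(items, el):
    """The same function after turning the tail call into a loop."""
    i = 0
    while i != len(items):
        if items[i] == el:
            return True
        i += 1  # replaces the recursive call: update the parameter and repeat
    return False
```

The loop version needs no stack frame per list element, so its stack use does not depend on the length of the list.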
Some tests with programs of about 50 000 tokens - e.g. the generated analyzer itself - confirm the linearity of the used time and storage. Yet it seems to make little sense to include bigger programs into these measurements, because typical Pascal programs do not exceed the size of 20 000 tokens. Therefore these data are representative of the behaviour of the generated Pascal front-end. We consider this a remarkable result for an automatically generated compiler front-end.
Fig. 5.1: Time needed by the generated PASCAL front-end
(plot not reproduced: total time in s, axis up to 150, against input size in 1000 tokens, axis up to 30)

Fig. 5.2: Storage needed by the generated PASCAL front-end
(plot not reproduced: storage in MB, axis up to 1.5, against input size in 1000 tokens, axis up to 30; * : tree storage, + : attribute storage)
A valuation of the absolute performance figures for the Pascal analyzer should take into account that the results do not depend on language properties which allow one-pass analysis.
5.8 Results
The results demonstrate that it is possible to generate compiler front-ends from AGs which meet efficiency requirements for practical use. Effective improvement techniques are applied automatically by the GAG-System:
- efficient type mapping for attribute types,
- a compact tree node and a compact representation of the structure tree,
- overlay of attribute storage using global data,
- code optimization techniques which are well suited for a translation from the applicative level to a non-applicative language like Pascal.

Furthermore we pointed out that design decisions for a compiler specification by an AG can increase efficiency of the generated product without losing clarity of the description.

The performance of the compiler front-ends is comparable to that of manually written compilers, if the latter also use the structure tree as central data structure. Since most common Pascal compilers do not build up the tree, a comparison was not possible for our Pascal front-end. Instead we compared it to manually written compilers for different languages which build a tree, and the ratios of bytes per node and nodes per token were of comparable size. Our results are still valid if attribute rules are included which specify code generation or generation of an intermediate language, as described in Chapter 4.
Appendix A: Attributed Grammar for PASCAL
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% This attributed grammar is intended to comply with the
% requirements of level 0 of BSI 6192 with the exception:
% Two parameter lists are defined to be congruent if
% 6.6.3.6 (a) to (e) hold for each parameter, but grouping
% into formal parameter sections may be different.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Blocks and Definitions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
TYPE tp_environ   : LISTOF tp_block;

TYPE tp_block     : STRUCT (s_defs : tp_defs);

TYPE tp_defs      : LISTOF tp_def KEY s_name;

TYPE tp_def       : STRUCT (s_name  : SYMB,
                            s_level : INT,
                            s_value : INT,      % static values only
                            s_kind  : tp_kind,
                            s_type  : tp_type);

TYPE tp_kind      : (sc_static_val, sc_dyn_val, sc_var, sc_type,
                     sc_val_param, sc_var_param, sc_proc_param,
                     sc_decl_proc, sc_external_proc, sc_forward_proc,
                     sc_fit_forward_proc, sc_field, sc_packed_field,
                     sc_tag_field);

TYPE tp_symb_list : LISTOF SYMB;

TYPE tp_labels    : LISTOF INT;

NONTERM program :
  at_params       : tp_defs,
  at_loc_defs     : tp_defs,
  at_complete_env : tp_environ;

NONTERM prog_params :
  at_names : tp_symb_list,
  at_defs  : tp_defs;

NONTERM prog_param :
  at_name : SYMB,
  at_defs : tp_defs;
NONTERM block :
  at_level            : INT INH,
  at_glob_env         : tp_environ INH,
  at_nested_procs,
  at_params           : tp_defs INH,
  at_upto_type_defs,
  at_loc_defs         : tp_defs,
  at_env              : tp_environ,
  at_complete_env     : tp_environ,
  at_declared_labels,
  at_occurring_labels : tp_labels,
  at_visible_labels   : tp_labels INH;

NONTERM const_def_part, const_def_list, const_def,
        type_def_part, type_def_list, type_def,
        var_decl_part, var_decl_list, var_decl :
  at_defs_in  : tp_defs INH,
  at_defs_out : tp_defs;

NONTERM proc_decl_part :
  at_visible_labels : tp_labels INH,
  at_defs_in        : tp_defs INH,
  at_defs_out       : tp_defs;

NONTERM label_decl_part :
  at_labels : tp_labels;

NONTERM label_decl, label :
  at_value : INT;

NONTERM VAR_sym : ;
RULE r1:
% program ::= 'PROGRAM' name [ '(' prog_params ')' ] ';' block '.'
STATIC
  block.at_level := 1;
  program.at_params :=
    IF prog_params IS THERE
    THEN prog_params.at_defs
    ELSE tp_defs ()
    FI;
  block.at_params := program.at_params;
  block.at_glob_env := c_standard_env;
  block.at_nested_procs := tp_defs ();
  program.at_complete_env := c_standard_env;
  program.at_loc_defs := block.at_loc_defs;
  block.at_visible_labels := tp_labels ()
END;

RULE r2:
% prog_params ::= ( prog_param // ',' )
STATIC
  prog_params.at_names := prog_param.at_name;
  prog_params.at_defs := tp_defs (prog_param.at_defs)
END;

RULE r3:
% prog_param ::= name
STATIC
  prog_param.at_name := name.at_name;
  prog_param.at_defs :=
    IF name.at_name = 'INPUT'
    THEN tp_defs (c_input)
    ELSE IF name.at_name = 'OUTPUT'
         THEN tp_defs (c_output)
         ELSE tp_defs ()
         FI
    FI;

  CONDITION f_elem_unique (name.at_name,
                           INCLUDING prog_params.at_names)
  MESSAGE "DUPLICATE PROGRAM PARAMETER";

  CONDITION
    IF (name.at_name = 'INPUT') OR (name.at_name = 'OUTPUT')
    THEN TRUE
    ELSE f_identify (name.at_name,
                     INCLUDING program.at_loc_defs,
                     tp_environ (), sc_var, FALSE)
         .s_kind = sc_var
    FI
  MESSAGE "PROGRAM PARAMETER MUST BE A VARIABLE"
END;
RULE r4:
% block ::= label_decl_part
%           const_def_part
%           type_def_part
%           var_decl_part
%           proc_decl_part
%           'BEGIN' statement_seq 'END'
STATIC
  const_def_part.at_defs_in := block.at_params;
  type_def_part.at_defs_in := const_def_part.at_defs_out;
  block.at_upto_type_defs := type_def_part.at_defs_out;
  var_decl_part.at_defs_in := type_def_part.at_defs_out;
  proc_decl_part.at_defs_in :=
    (CONSTITUENTS pointer_type.at_type_def) +
    var_decl_part.at_defs_out;
  block.at_loc_defs := proc_decl_part.at_defs_out;

  block.at_env :=
    tp_environ (tp_block (block.at_loc_defs)) + block.at_glob_env;

  block.at_complete_env :=
    tp_environ (tp_block (block.at_loc_defs)) +
    (INCLUDING (block.at_complete_env, program.at_complete_env));

  block.at_declared_labels := label_decl_part.at_labels;
  block.at_occurring_labels :=
    statement_seq CONSTITUENTS label.at_value;

  % possible destinations of jumps from local procedures
  proc_decl_part.at_visible_labels :=
    statement_seq.at_labels + block.at_visible_labels;

  statement_seq.at_visible_labels :=
    proc_decl_part.at_visible_labels
END;
RULE r5:
% label_decl_part ::=
STATIC
  label_decl_part.at_labels := tp_labels ()
END;

RULE r6:
% label_decl_part ::= 'LABEL' ( label_decl // ',' ) ';'
STATIC
  label_decl_part.at_labels := label_decl.at_value
END;

RULE r7:
% label_decl ::= integer_number
STATIC
  TRANSFER at_value;

  CONDITION integer_number.at_value <= 9999
  MESSAGE "LABEL EXCEEDS 9999";

  CONDITION f_elem_unique (integer_number.at_value,
                           INCLUDING label_decl_part.at_labels)
  MESSAGE "DUPLICATE LABEL DECLARATION";

  CONDITION ELEM IN LIST (integer_number.at_value,
                          INCLUDING block.at_occurring_labels)
  MESSAGE "DECLARED LABEL DOES NOT OCCUR"
END;

RULE r8:
% const_def_part ::=
STATIC
  const_def_part.at_defs_out := const_def_part.at_defs_in
END;

RULE r9:
% const_def_part ::= 'CONST' const_def_list
STATIC
  TRANSFER
END;

RULE r10:
% const_def_list ::= const_def ';'
STATIC
  TRANSFER
END;

RULE r11:
% const_def_list ::= const_def_list const_def ';'
STATIC
  TRANSFER at_defs_in WITH const_def_list[2];
  TRANSFER at_defs_out WITH const_def;
  const_def.at_defs_in := const_def_list[2].at_defs_out
END;
RULE r12:
% const_def ::= name '=' constant
STATIC
  CONDITION NOT KEY IN LIST (name.at_name, const_def.at_defs_in)
  MESSAGE "IDENTIFIER DEFINED BEFORE";

  const_def.at_defs_out :=
    tp_defs (tp_def (name.at_name,
                     INCLUDING block.at_level,
                     constant.at_value,
                     sc_static_val,
                     constant.at_type))
    + const_def.at_defs_in
END;

RULE r13:
% type_def_part ::=
STATIC
  type_def_part.at_defs_out := type_def_part.at_defs_in
END;

RULE r14:
% type_def_part ::= 'TYPE' type_def_list
STATIC
  TRANSFER
END;

RULE r15:
% type_def_list ::= type_def ';'
STATIC
  TRANSFER
END;

RULE r16:
% type_def_list ::= type_def_list type_def ';'
STATIC
  TRANSFER at_defs_in WITH type_def_list[2];
  TRANSFER at_defs_out WITH type_def;
  type_def.at_defs_in := type_def_list[2].at_defs_out
END;
RULE r17:
% type_def ::= name '=' type_denoter
STATIC
  TRANSFER at_defs_in;

  CONDITION NOT KEY IN LIST (name.at_name, type_denoter.at_defs_out)
  MESSAGE "IDENTIFIER DEFINED BEFORE";

  type_def.at_defs_out :=
    tp_defs (tp_def (name.at_name,
                     INCLUDING block.at_level,
                     c_a_value,
                     sc_type,
                     type_denoter.at_type))
    + type_denoter.at_defs_out
END;

RULE r18:
% var_decl_part ::=
STATIC
  var_decl_part.at_defs_out := var_decl_part.at_defs_in
END;

RULE r19:
% var_decl_part ::= VAR_sym var_decl_list
STATIC
  TRANSFER
END;

RULE r20:
% VAR_sym ::= 'VAR'
END;

RULE r21:
% var_decl_list ::= var_decl ';'
STATIC
  TRANSFER
END;

RULE r22:
% var_decl_list ::= var_decl_list var_decl ';'
STATIC
  TRANSFER at_defs_in WITH var_decl_list[2];
  TRANSFER at_defs_out WITH var_decl;
  var_decl.at_defs_in := var_decl_list[2].at_defs_out
END;
RULE r23:
% var_decl ::= ( name // ',' ) ':' type_denoter
STATIC
  TRANSFER at_defs_in;

  CONDITION f_none_in_defs (name.at_name, type_denoter.at_defs_out)
  MESSAGE "IDENTIFIER DEFINED BEFORE";

  var_decl.at_defs_out :=
    f_make_defs (name.at_name,
                 INCLUDING block.at_level,
                 sc_var,
                 type_denoter.at_type)
    + type_denoter.at_defs_out
END;

RULE r24:
% proc_decl_part ::=
STATIC
  proc_decl_part.at_defs_out := proc_decl_part.at_defs_in
END;

RULE r25:
% proc_decl_part ::= proc_decl_part proc_decl
STATIC
  TRANSFER at_defs_in WITH proc_decl_part[2];
  TRANSFER at_defs_out WITH proc_decl;
  TRANSFER at_visible_labels;
  proc_decl.at_defs_in := proc_decl_part[2].at_defs_out
END;
% Standard Environment
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

CONST c_real         : tp_type (sc_real);
CONST c_int          : tp_type (sc_int);
CONST c_bool         : tp_type (sc_bool);
CONST c_char         : tp_type (sc_char);
CONST c_nil          : tp_type (sc_nil);
CONST c_void         : tp_type (sc_void);
CONST c_empty_set    : tp_type (sc_empty_set);
CONST c_overloaded   : tp_type (sc_overloaded);
CONST c_err          : tp_type (sc_err);
CONST c_empty_choice : tp_choice (sc_empty);

CONST c_a_value             : 0;    % dont care
CONST c_canonical           : 0;
CONST c_cardinality_of_char : 256;  % implementation-defined

CONST c_text       : tp_file (-1, sc_packed, c_char);
CONST c_err_record : tp_record (-2, sc_unpacked, tp_defs (), sc_empty);
CONST c_err_array  : tp_array (-3, sc_unpacked, c_err, c_err);
CONST c_err_proc   : tp_proc (tp_defs (), c_err);
CONST c_text_type  : tp_type (c_text);

CONST c_input  : tp_def ('INPUT', 0, c_a_value, sc_var, c_text_type);
CONST c_output : tp_def ('OUTPUT', 0, c_a_value, sc_var, c_text_type);

CONST c_a_real_param :
  tp_defs (tp_def ('X', 1, c_a_value, sc_val_param, c_real));

CONST c_an_int_param :
  tp_defs (tp_def ('X', 1, c_a_value, sc_val_param, c_int));

CONST c_an_overloaded_param :
  tp_defs (tp_def ('X', 1, c_a_value, sc_val_param, c_overloaded));

CONST c_proc_void_overloaded_param :
  tp_proc (c_an_overloaded_param, c_void);

CONST c_proc_bool_overloaded_param :
  tp_proc (c_an_overloaded_param, c_bool);

CONST c_proc_int_overloaded_param :
  tp_proc (c_an_overloaded_param, c_int);

CONST c_proc_overloaded :
  tp_proc (c_an_overloaded_param, c_overloaded);

CONST c_standard_env :
  tp_environ (tp_block (tp_defs (
    tp_def ('BOOLEAN', 0, c_a_value, sc_type, c_bool),
    tp_def ('INTEGER', 0, c_a_value, sc_type, c_int),
    tp_def ('CHAR',    0, c_a_value, sc_type, c_char),
    tp_def ('TEXT',    0, c_a_value, sc_type, c_text_type),
    tp_def ('REAL',    0, c_a_value, sc_type, c_real),
    tp_def ('FALSE',   0, 0,         sc_static_val, c_bool),
    tp_def ('TRUE',    0, 1,         sc_static_val, c_bool),
    tp_def ('MAXINT',  0, MAXINT,    sc_static_val, c_int),
    tp_def ('RESET',   0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('REWRITE', 0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('PUT',     0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('GET',     0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('EOF',     0, c_a_value, sc_decl_proc, c_proc_bool_overloaded_param),
    tp_def ('EOLN',    0, c_a_value, sc_decl_proc, c_proc_bool_overloaded_param),
    tp_def ('READ',    0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('WRITE',   0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('READLN',  0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('WRITELN', 0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('PAGE',    0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('NEW',     0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('DISPOSE', 0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('PACK',    0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('UNPACK',  0, c_a_value, sc_decl_proc, c_proc_void_overloaded_param),
    tp_def ('ORD',     0, c_a_value, sc_decl_proc, c_proc_int_overloaded_param),
    tp_def ('PRED',    0, c_a_value, sc_decl_proc, c_proc_overloaded),
    tp_def ('SUCC',    0, c_a_value, sc_decl_proc, c_proc_overloaded),
    tp_def ('ABS',     0, c_a_value, sc_decl_proc, c_proc_overloaded),
    tp_def ('SQR',     0, c_a_value, sc_decl_proc, c_proc_overloaded),
    tp_def ('SIN',     0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_real)),
    tp_def ('COS',     0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_real)),
    tp_def ('EXP',     0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_real)),
    tp_def ('LN',      0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_real)),
    tp_def ('SQRT',    0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_real)),
    tp_def ('ARCTAN',  0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_real)),
    tp_def ('TRUNC',   0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_int)),
    tp_def ('ROUND',   0, c_a_value, sc_decl_proc, tp_proc (c_a_real_param, c_int)),
    tp_def ('ODD',     0, c_a_value, sc_decl_proc, tp_proc (c_an_int_param, c_bool)),
    tp_def ('CHR',     0, c_a_value, sc_decl_proc, tp_proc (c_an_int_param, c_char))
  )));
FUNCTION f_identify (p_name       : SYMB,
                     p_defs       : tp_defs,
                     p_env        : tp_environ,
                     p_err_kind   : tp_kind,
                     p_no_message : BOOL) tp_def:
  IF EMPTY (p_defs)
  THEN IF EMPTY (p_env)
       THEN (tp_def (p_name, 0, c_a_value, p_err_kind, c_err)
             CONDITION p_no_message
             MESSAGE CASE p_err_kind OF
                       sc_type      : "TYPE IDENTIFIER NOT DECLARED";
                       sc_field     : "FIELD IDENTIFIER NOT DECLARED";
                       sc_decl_proc : "PROC/FUNCT IDENTIFIER NOT DECLARED"
                     OUT "IDENTIFIER NOT DECLARED"
                     ESAC )
       ELSE f_identify (p_name, HEAD (p_env).s_defs, TAIL (p_env),
                        p_err_kind, p_no_message)
       FI
  ELSE IF (HEAD (p_defs).s_name = p_name) AND
          (HEAD (p_defs).s_kind =/ sc_fit_forward_proc)
       THEN HEAD (p_defs)
       ELSE f_identify (p_name, TAIL (p_defs), p_env,
                        p_err_kind, p_no_message)
       FI
  FI;

FUNCTION f_none_in_defs (p_names : tp_symb_list, p_defs : tp_defs) BOOL:
  IF EMPTY (p_defs) OR EMPTY (p_names)
  THEN TRUE
  ELSE IF KEY IN LIST (HEAD (p_names), p_defs)
       THEN FALSE
       ELSE f_none_in_defs (TAIL (p_names), p_defs)
       FI
  FI;

FUNCTION f_defs_unique (p_names : tp_symb_list, p_defs : tp_defs) BOOL:
  IF EMPTY (p_names)
  THEN TRUE
  ELSE IF f_def_unique (HEAD (p_names), p_defs)
       THEN f_defs_unique (TAIL (p_names), p_defs)
       ELSE FALSE
       FI
  FI;

FUNCTION f_def_unique (p_name : SYMB, p_defs : tp_defs) BOOL:
  IF EMPTY (p_defs)
  THEN FALSE
  ELSE IF HEAD (p_defs).s_name = p_name
       THEN NOT KEY IN LIST (p_name, TAIL (p_defs))
       ELSE f_def_unique (p_name, TAIL (p_defs))
       FI
  FI;
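The search order of f_identify (first the given definition list, then the blocks of the environment from the innermost outwards) may be easier to follow in an imperative form. The following Python sketch is an illustration under assumptions, not a translation produced by GAG: Def and identify are hypothetical names, kinds and types are represented by plain strings, and the message part (p_no_message) is left out.

```python
from typing import List, NamedTuple


class Def(NamedTuple):
    name: str
    level: int
    value: int
    kind: str
    type: str


def identify(name: str, defs: List[Def], env: List[List[Def]], err_kind: str) -> Def:
    """Search defs first, then each block of env from innermost to outermost."""
    for scope in [defs] + env:
        for d in scope:
            # descriptors of kind sc_fit_forward_proc are skipped, as in f_identify
            if d.name == name and d.kind != "sc_fit_forward_proc":
                return d
    # not found: an error descriptor of the expected kind with the error type
    return Def(name, 0, 0, err_kind, "c_err")


local = [Def("x", 1, 0, "sc_var", "c_int")]
standard = [[Def("x", 0, 0, "sc_type", "c_real"),
             Def("INTEGER", 0, 0, "sc_type", "c_int")]]
```

With these sample lists, a lookup of "x" yields the local descriptor, because the local definitions hide those of the enclosing environment.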
FUNCTION f_elem_unique (el : LIST ELEM TYPE, l : LIST) BOOL :
  IF EMPTY (l)
  THEN FALSE
  ELSE IF HEAD (l) = el
       THEN NOT ELEM IN LIST (el, TAIL (l))
       ELSE f_elem_unique (el, TAIL (l))
       FI
  FI;

FUNCTION f_make_defs (p_names : tp_symb_list,
                      p_level : INT,
                      p_kind  : tp_kind,
                      p_type  : tp_type) tp_defs :
  IF EMPTY (p_names)
  THEN tp_defs ()
  ELSE tp_defs (tp_def (HEAD (p_names), p_level, c_a_value,
                        p_kind, p_type))
       + f_make_defs (TAIL (p_names), p_level, p_kind, p_type)
  FI;

% Types
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
TYPE tp_type : UNION (tp_real, tp_int, tp_bool, tp_char, tp_enum,
                      tp_subrange, tp_array, tp_record, tp_file,
                      tp_set, tp_general_set, tp_empty_set,
                      tp_proc, tp_overloaded, tp_pointer, tp_nil,
                      tp_void, tp_err);

TYPE tp_real       : (sc_real);
TYPE tp_int        : (sc_int);
TYPE tp_bool       : (sc_bool);
TYPE tp_char       : (sc_char);
TYPE tp_nil        : (sc_nil);
TYPE tp_void       : (sc_void);
TYPE tp_empty_set  : (sc_empty_set);
TYPE tp_overloaded : (sc_overloaded);
TYPE tp_err        : (sc_err);

TYPE tp_subrange : STRUCT (s_key  : INT,
                           s_host : tp_type,
                           s_lwb,
                           s_upb  : INT);

TYPE tp_enum : STRUCT (s_key     : INT,
                       s_max_val : INT);

TYPE tp_pointer : STRUCT (s_referred : SYMB);

TYPE tp_array : STRUCT (s_key     : INT,
                        s_packing : tp_packing,
                        s_index   : tp_type,
                        s_elem    : tp_type);

TYPE tp_set : STRUCT (s_key     : INT,
                      s_packing : tp_packing,
                      s_base    : tp_type);

TYPE tp_packing : (sc_packed, sc_unpacked);

TYPE tp_general_set : STRUCT (s_base : tp_type);

TYPE tp_file : STRUCT (s_key     : INT,
                       s_packing : tp_packing,
                       s_base    : tp_type);

TYPE tp_proc : STRUCT (s_params : tp_defs,
                       s_result : tp_type);

TYPE tp_record : STRUCT (s_key        : INT,
                         s_packing    : tp_packing,
                         s_all_fields : tp_defs,
                         s_choice     : tp_choice);

TYPE tp_choice : UNION (tp_empty, tp_union);

TYPE tp_empty : (sc_empty);

TYPE tp_union : STRUCT (s_discr    : tp_type,
                        s_variants : tp_variants);

TYPE tp_variants : LISTOF tp_variant;

TYPE tp_variant : STRUCT (s_labels : tp_labels,
                          s_choice : tp_choice);
NONTERM type_denoter :
  at_defs_in  : tp_defs INH,
  at_defs_out : tp_defs,
  at_type     : tp_type;

NONTERM type_identifier :
  at_def : tp_def;

NONTERM enumeration :
  at_const_type : tp_type INH,
  at_defs_in    : tp_defs INH,
  at_defs_out   : tp_defs,
  at_val        : INT;

NONTERM pointer_type :
  at_type_def : tp_def,
  at_type     : tp_type;

NONTERM subscr_tp_list :
  at_defs_in   : tp_defs INH,
  at_packing   : tp_packing INH,
  at_defs_out  : tp_defs,
  at_type      : tp_type,
  at_elem_type : tp_type INH;

NONTERM record_type :
  at_defs_in      : tp_defs INH,
  at_defs_out     : tp_defs,
  at_complete_env : tp_environ,
  at_level        : INT,
  at_packing      : tp_packing,
  at_type         : tp_type;

NONTERM field_list :
  at_defs_in  : tp_defs INH,
  at_defs_out,
  at_fields   : tp_defs,
  at_choice   : tp_choice;

NONTERM tag_field :
  at_tag_type : tp_type,
  at_fields   : tp_defs;

NONTERM record_section :
  at_defs_in  : tp_defs INH,
  at_defs_out,
  at_fields   : tp_defs;

NONTERM variant_part :
  at_defs_in  : tp_defs INH,
  at_defs_out,
  at_fields   : tp_defs,
  at_union    : tp_union;

NONTERM variants, variant :
  at_defs_in         : tp_defs INH,
  at_defs_out        : tp_defs,
  at_fields          : tp_defs,
  at_variants        : tp_variants,
  at_case_labels_in  : tp_labels INH,
  at_case_labels_out : tp_labels,
  at_label_type      : tp_type INH;

NONTERM PACKED_sym : ;

RULE r26:
% type_denoter ::= type_identifier
STATIC
  type_denoter.at_type := type_identifier.at_def.s_type;
  type_denoter.at_defs_out := type_denoter.at_defs_in
END;

RULE r27:
% type_denoter ::= record_type
STATIC
  TRANSFER
END;

RULE r28:
% type_denoter ::= pointer_type
STATIC
  TRANSFER at_type;
  type_denoter.at_defs_out := type_denoter.at_defs_in
END;

RULE r29:
% type_identifier ::= name
STATIC
  type_identifier.at_def :=
    f_identify (name.at_name,
                INCLUDING (type_denoter.at_defs_in,
                           variant_part.at_defs_in,
                           proc_decl.at_defs_in),
                INCLUDING block.at_glob_env,
                sc_type, FALSE);

  CONDITION type_identifier.at_def.s_kind = sc_type
  MESSAGE "MUST BE TYPE IDENTIFIER";

  CONDITION type_identifier.at_def.s_level =
    f_identify (name.at_name,
                tp_defs (),
                INCLUDING (block.at_complete_env,
                           form_parm_sect.at_complete_env,
                           proc_type.at_complete_env,
                           record_type.at_complete_env),
                sc_type, TRUE).s_level
  MESSAGE "IDENTIFIER APPLIED BEFORE DEFINITION"
END;

RULE r30:
% type_denoter ::= constant '..' constant
STATIC
  type_denoter.at_defs_out := type_denoter.at_defs_in;
  type_denoter.at_type :=
    tp_subrange (GENNUM,
                 constant[1].at_type,
                 constant[1].at_value,
                 constant[2].at_value);

  CONDITION f_is_ordinal (constant[1].at_type)
  MESSAGE "BOUND TYPES MUST BE ORDINAL";

  CONDITION (constant[1].at_type = constant[2].at_type) OR
            (constant[1].at_type IS tp_err) OR
            (constant[2].at_type IS tp_err)
  MESSAGE "BOUND TYPES MUST BE EQUAL";

  CONDITION (constant[1].at_value <= constant[2].at_value) OR
            (constant[1].at_type IS tp_err) OR
            (constant[2].at_type IS tp_err)
  MESSAGE "EMPTY RANGE NOT ALLOWED"
END;
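The three context conditions of rule r30 can be read as independent checks, where an erroneous bound type suppresses the follow-up messages. The Python sketch below only illustrates this reading. The function name check_subrange and the string representation of types are hypothetical, and the set ORDINAL is an assumption about f_is_ordinal, whose definition is not shown in this part of the grammar (in particular, treating the error type as ordinal is a guess).

```python
# Assumed reading of f_is_ordinal; "err" is included on the assumption
# that an erroneous type should not trigger a further message.
ORDINAL = {"int", "bool", "char", "enum", "subrange", "err"}


def check_subrange(lo_type, lo_value, hi_type, hi_value):
    """Return the messages of the context conditions of rule r30 that fail."""
    errors = []
    any_err = lo_type == "err" or hi_type == "err"
    if lo_type not in ORDINAL:
        errors.append("BOUND TYPES MUST BE ORDINAL")
    if not (lo_type == hi_type or any_err):
        errors.append("BOUND TYPES MUST BE EQUAL")
    if not (lo_value <= hi_value or any_err):
        errors.append("EMPTY RANGE NOT ALLOWED")
    return errors
```

For example, the bounds 5 and 1 of type "int" produce only the message about the empty range.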
RULE r31:
% type_denoter ::= '(' enumeration ')'
STATIC
  TRANSFER at_defs_in, at_defs_out;
  type_denoter.at_type := tp_enum (GENNUM, enumeration.at_val);
  enumeration.at_const_type := type_denoter.at_type
END;

RULE r32:
% enumeration ::= name
STATIC
  CONDITION NOT KEY IN LIST (name.at_name, enumeration.at_defs_in)
  MESSAGE "IDENTIFIER DECLARED BEFORE";

  enumeration.at_defs_out :=
    tp_defs (tp_def (name.at_name,
                     INCLUDING block.at_level,
                     0,
                     sc_static_val,
                     enumeration.at_const_type)) +
    enumeration.at_defs_in;

  enumeration.at_val := 0
END;

RULE r33:
% enumeration ::= enumeration ',' name
STATIC
  TRANSFER at_const_type, at_defs_in;

  CONDITION NOT KEY IN LIST (name.at_name, enumeration[2].at_defs_out)
  MESSAGE "IDENTIFIER DECLARED BEFORE";

  enumeration[1].at_defs_out :=
    tp_defs (tp_def (name.at_name,
                     INCLUDING block.at_level,
                     enumeration[1].at_val,
                     sc_static_val,
                     enumeration[1].at_const_type)) +
    enumeration[2].at_defs_out;

  enumeration[1].at_val := enumeration[2].at_val + 1
END;
RULE r34:
% pointer_type ::= '@' name
STATIC
  pointer_type.at_type := tp_pointer (GENSYMB);

  pointer_type.at_type_def :=
    tp_def (pointer_type.at_type QUA tp_pointer .s_referred,
            INCLUDING block.at_level,
            c_a_value,
            sc_type,
            (f_identify (name.at_name,
                         INCLUDING block.at_upto_type_defs,
                         INCLUDING block.at_glob_env,
                         sc_type, FALSE)
             CONDITION IT.s_kind = sc_type
             MESSAGE "MUST BE TYPE IDENTIFIER").s_type);

  CONDITION
    f_identify (name.at_name,
                INCLUDING block.at_upto_type_defs,
                INCLUDING block.at_glob_env,
                sc_type, TRUE).s_level
    =
    f_identify (name.at_name,
                tp_defs (),
                INCLUDING (block.at_complete_env,
                           record_type.at_complete_env),
                sc_type, TRUE).s_level
  MESSAGE "GLOBAL IDENTIFIER REDECLARED BELOW"
END;

RULE r35:
% type_denoter ::= [ PACKED_sym ] 'ARRAY' '[' subscr_tp_list ']'
%                  'OF' type_denoter
STATIC
  TRANSFER at_defs_in, at_type WITH subscr_tp_list;
  TRANSFER at_defs_out WITH type_denoter[2];

  type_denoter[2].at_defs_in := subscr_tp_list.at_defs_out;
  subscr_tp_list.at_elem_type := type_denoter[2].at_type;

  subscr_tp_list.at_packing :=
    IF PACKED_sym IS THERE
    THEN sc_packed
    ELSE sc_unpacked
    FI
END;

RULE r36:
% PACKED_sym ::= 'PACKED'
END;
RULE r37:
% subscr_tp_list ::= type_denoter
STATIC
  TRANSFER at_defs_in, at_defs_out;

  subscr_tp_list.at_type :=
    tp_array (GENNUM,
              subscr_tp_list.at_packing,
              type_denoter.at_type,
              subscr_tp_list.at_elem_type);

  CONDITION f_is_ordinal (type_denoter.at_type)
  MESSAGE "MUST BE ORDINAL TYPE"
END;

RULE r38:
% subscr_tp_list ::= type_denoter ',' subscr_tp_list
STATIC
  TRANSFER at_defs_in WITH type_denoter;
  TRANSFER at_defs_out, at_elem_type, at_packing
           WITH subscr_tp_list[2];

  subscr_tp_list[2].at_defs_in := type_denoter.at_defs_out;

  subscr_tp_list[1].at_type :=
    tp_array (GENNUM,
              subscr_tp_list[1].at_packing,
              type_denoter.at_type,
              subscr_tp_list[2].at_type);

  CONDITION f_is_ordinal (type_denoter.at_type)
  MESSAGE "MUST BE ORDINAL TYPE"
END;

RULE r39:
% type_denoter ::= [ PACKED_sym ] 'SET' 'OF' type_denoter
STATIC
  TRANSFER at_defs_in, at_defs_out;

  type_denoter[1].at_type :=
    tp_set (GENNUM,
            IF PACKED_sym IS THERE
            THEN sc_packed
            ELSE sc_unpacked
            FI,
            type_denoter[2].at_type);

  CONDITION f_is_ordinal (type_denoter[2].at_type)
  MESSAGE "MUST BE ORDINAL TYPE"
END;
RULE r40:
% type_denoter ::= [ PACKED_sym ] 'FILE' 'OF' type_denoter
STATIC
  TRANSFER at_defs_in, at_defs_out;

  type_denoter[1].at_type :=
    tp_file (GENNUM,
             IF PACKED_sym IS THERE
             THEN sc_packed
             ELSE sc_unpacked
             FI,
             type_denoter[2].at_type);

  CONDITION f_no_file_contained (type_denoter[2].at_type)
  MESSAGE "TYPE MUST NOT CONTAIN A FILE TYPE"
END;

RULE r41:
% record_type ::= [ PACKED_sym ] 'RECORD' field_list 'END'
STATIC
  TRANSFER at_defs_in, at_defs_out;

  record_type.at_level :=
    (INCLUDING (block.at_level, record_type.at_level)) + 1;

  record_type.at_packing :=
    IF PACKED_sym IS THERE
    THEN sc_packed
    ELSE sc_unpacked
    FI;

  record_type.at_type :=
    tp_record (GENNUM,
               record_type.at_packing,
               field_list.at_fields,
               field_list.at_choice);

  record_type.at_complete_env :=
    tp_environ (tp_block (field_list.at_fields)) +
    (INCLUDING (block.at_complete_env,
                record_type.at_complete_env))
END;

RULE r42:
% field_list ::= record_section
STATIC
  TRANSFER at_defs_in, at_defs_out, at_fields;
  field_list.at_choice := sc_empty
END;
RULE r43:
% field_list ::= variant_part
STATIC
   TRANSFER at_defs_in, at_defs_out, at_fields;
   field_list.at_choice := variant_part.at_union
END;

RULE r44:
% field_list ::= record_section ';' field_list
STATIC
   TRANSFER at_defs_in WITH record_section;
   TRANSFER at_defs_out WITH field_list[2];
   TRANSFER at_choice WITH field_list[2];
   field_list[2].at_defs_in := record_section.at_defs_out;
   field_list[1].at_fields :=
      record_section.at_fields + field_list[2].at_fields
END;

RULE r45:
% field_list ::=
STATIC
   field_list.at_defs_out := field_list.at_defs_in;
   field_list.at_choice := sc_empty;
   field_list.at_fields := tp_defs ()
END;

RULE r46:
% record_section ::= ( name // ',' ) ':' type_denoter
STATIC
   TRANSFER at_defs_in, at_defs_out;
   record_section.at_fields :=
      f_make_defs (name.at_name,
                   INCLUDING record_type.at_level,
                   IF (INCLUDING record_type.at_packing) = sc_packed
                   THEN sc_packed_field
                   ELSE sc_field
                   FI,
                   type_denoter.at_type);
   CONDITION f_defs_unique
                (name.at_name,
                 (INCLUDING record_type.at_type) QUA tp_record .s_all_fields)
      MESSAGE "DUPLICATE FIELD NAME"
END;
RULE r47:
% variant_part ::= 'CASE' tag_field 'OF' variants [ ';' ]
STATIC
   TRANSFER at_defs_in, at_defs_out WITH variants;
   variant_part.at_fields := tag_field.at_fields + variants.at_fields;
   variant_part.at_union :=
      tp_union (tag_field.at_tag_type, variants.at_variants);
   variants.at_label_type := tag_field.at_tag_type;
   variants.at_case_labels_in := tp_labels ();
   CONDITION
      CASE variants.at_label_type OF
      IS tp_int      : FALSE;
      IS tp_err      : TRUE;
      IS tp_bool     : 2 <= LENGTH (variants.at_case_labels_out);
      IS tp_enum     : THIS.s_max_val <= LENGTH (variants.at_case_labels_out);
      IS tp_subrange : ((THIS.s_upb - THIS.s_lwb) + 1)
                          <= LENGTH (variants.at_case_labels_out);
      IS tp_char     : c_cardinality_of_char
                          <= LENGTH (variants.at_case_labels_out)
      OUT FALSE
      ESAC
      MESSAGE "THE SET OF CASE LABELS IS NOT COMPLETE"
END;
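The completeness condition of rule r47 only compares the number of collected case labels with the number of values of the tag type. A minimal Python sketch of that counting argument follows. The class names, the function `labels_complete` and the value chosen for `CARDINALITY_OF_CHAR` are illustrative assumptions and not part of the GAG system; the sketch also assumes that duplicate labels are rejected elsewhere (rule r52 does this with UNIQUE_ELEMS), so that counting is sufficient.

```python
from dataclasses import dataclass


class IntType: pass       # stands in for tp_int
class ErrType: pass       # stands in for tp_err
class BoolType: pass      # stands in for tp_bool
class CharType: pass      # stands in for tp_char


@dataclass(frozen=True)
class EnumType:           # stands in for tp_enum
    max_val: int          # number of enumeration values (s_max_val)


@dataclass(frozen=True)
class SubrangeType:       # stands in for tp_subrange
    lwb: int
    upb: int


# Assumed value; the listing only names the constant c_cardinality_of_char.
CARDINALITY_OF_CHAR = 256


def labels_complete(label_type, case_labels):
    """Mirror of the CASE expression in the CONDITION of rule r47."""
    n = len(case_labels)
    if isinstance(label_type, IntType):
        return False      # integers can never be covered completely
    if isinstance(label_type, ErrType):
        return True       # suppress follow-up messages after an earlier error
    if isinstance(label_type, BoolType):
        return 2 <= n
    if isinstance(label_type, EnumType):
        return label_type.max_val <= n
    if isinstance(label_type, SubrangeType):
        return (label_type.upb - label_type.lwb) + 1 <= n
    if isinstance(label_type, CharType):
        return CARDINALITY_OF_CHAR <= n
    return False          # OUT branch


print(labels_complete(SubrangeType(1, 3), [1, 2, 3]))
```

For a subrange 1..3 the check succeeds with three distinct labels and fails with two, which is the situation reported as "THE SET OF CASE LABELS IS NOT COMPLETE".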
RULE r48:
% tag_field ::= name ':' type_identifier
STATIC
   tag_field.at_tag_type := type_identifier.at_def.s_type;
   tag_field.at_fields :=
      tp_defs (tp_def (name.at_name,
                       INCLUDING record_type.at_level,
                       c_a_value,
                       sc_tag_field,
                       tag_field.at_tag_type));
   CONDITION f_def_unique
                (name.at_name,
                 (INCLUDING record_type.at_type) QUA tp_record .s_all_fields)
      MESSAGE "DUPLICATE FIELD NAME";
   CONDITION f_is_ordinal (tag_field.at_tag_type)
      MESSAGE "MUST BE ORDINAL TYPE"
END;

RULE r49:
% tag_field ::= type_identifier
STATIC
   tag_field.at_tag_type := type_identifier.at_def.s_type;
   tag_field.at_fields := tp_defs ();
   CONDITION f_is_ordinal (tag_field.at_tag_type)
      MESSAGE "MUST BE ORDINAL TYPE"
END;

RULE r50:
% variants ::= variant
STATIC TRANSFER END;
RULE r51:
% variants ::= variants ';' variant
STATIC
   TRANSFER at_label_type;
   TRANSFER at_defs_in, at_case_labels_in WITH variants[2];
   TRANSFER at_defs_out, at_case_labels_out WITH variant;
   variant.at_defs_in := variants[2].at_defs_out;
   variant.at_case_labels_in := variants[2].at_case_labels_out;
   variants[1].at_fields := variants[2].at_fields + variant.at_fields;
   variants[1].at_variants := variants[2].at_variants + variant.at_variants
END;

RULE r52:
% variant ::= (case_label // ',') ':' '(' field_list ')'
STATIC
   TRANSFER at_defs_in;
   TRANSFER at_defs_out, at_fields WITH field_list;
   TRANSFER at_label_type, at_case_labels_in WITH case_label;
   variant.at_case_labels_out :=
      variant.at_case_labels_in +
      (case_label.at_value CONDITION UNIQUE_ELEMS (IT)
                           MESSAGE "DUPLICATE CASE LABEL");
   case_label.at_glob_env := INCLUDING block.at_glob_env;
   variant.at_variants :=
      tp_variants (tp_variant (case_label.at_value, field_list.at_choice));
   case_label.at_in_statement := FALSE;
   case_label.at_complete_env := INCLUDING record_type.at_complete_env;
END;
FUNCTION f_compatible_types (p_type_1, p_type_2 : tp_type) BOOL :
   (p_type_1 = p_type_2) OR (p_type_2 IS tp_err) OR
   CASE type_1 : p_type_1 OF
   IS tp_err : TRUE;
   IS tp_bool : IS tp_int : IS tp_char : IS tp_enum :
      CASE type_2 : p_type_2 OF
      IS tp_subrange : type_1 = type_2.s_host
      OUT FALSE
      ESAC;
   IS tp_subrange :
      CASE type_2 : p_type_2 OF
      IS tp_subrange : type_1.s_host = type_2.s_host
      OUT type_1.s_host = type_2
      ESAC;
   IS tp_set :
      CASE type_2 : p_type_2 OF
      IS tp_set :
         f_compatible_types (type_1.s_base, type_2.s_base) AND
         (type_1.s_packing = type_2.s_packing);
      IS tp_general_set :
         f_compatible_types (type_1.s_base, type_2.s_base);
      IS tp_empty_set : TRUE
      OUT FALSE
      ESAC;
   IS tp_general_set :
      CASE type_2 : p_type_2 OF
      IS tp_set :
         f_compatible_types (type_1.s_base, type_2.s_base);
      IS tp_general_set :
         f_compatible_types (type_1.s_base, type_2.s_base);
      IS tp_empty_set : TRUE
      OUT FALSE
      ESAC;
   IS tp_empty_set :
      CASE type_2 : p_type_2 OF
      IS tp_set : IS tp_general_set : IS tp_empty_set : TRUE
      OUT FALSE
      ESAC;
   IS tp_array :
      f_is_string (p_type_1) AND f_is_string (p_type_2) AND
      CASE index_1 : type_1.s_index OF
      IS tp_subrange :
         CASE type_2 : p_type_2 OF
         IS tp_array :
            CASE index_2 : type_2.s_index OF
            IS tp_subrange : index_1.s_upb = index_2.s_upb
            OUT FALSE
            ESAC
         OUT FALSE
         ESAC
      OUT FALSE
      ESAC;
   IS tp_pointer : (p_type_2 IS tp_nil);
   IS tp_nil : (p_type_2 IS tp_pointer)
   OUT FALSE
   ESAC;
FUNCTION f_assignment_compatible (p_type_1, p_type_2 : tp_type) BOOL :
   ((p_type_1 = p_type_2) AND f_no_file_contained (p_type_1)) OR
   f_compatible_types (p_type_1, p_type_2) OR
   ((p_type_1 = c_real) AND (p_type_2 = c_int));

FUNCTION f_equal_proc_types (p_type_1, p_type_2 : tp_proc) BOOL :
   ((p_type_1.s_result = p_type_2.s_result) OR
    (p_type_1.s_result = c_err) OR
    (p_type_2.s_result = c_err)) AND
   f_congruent_params (p_type_1.s_params, p_type_2.s_params);

FUNCTION f_congruent_params (p_1, p_2 : tp_defs) BOOL :
   IF EMPTY (p_1) OR EMPTY (p_2)
   THEN EMPTY (p_1) AND EMPTY (p_2)
   ELSE IF (HEAD (p_1).s_kind = HEAD (p_2).s_kind) AND
           CASE parm_1 : HEAD (p_1).s_type OF
           IS tp_err : TRUE;
           IS tp_proc :
              CASE parm_2 : HEAD (p_2).s_type OF
              IS tp_err : TRUE;
              IS tp_proc : f_equal_proc_types (parm_1, parm_2)
              OUT FALSE
              ESAC
           OUT HEAD (p_1).s_type = HEAD (p_2).s_type
           ESAC
        THEN f_congruent_params (TAIL (p_1), TAIL (p_2))
        ELSE FALSE
        FI
   FI;
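The mutual recursion between f_equal_proc_types and f_congruent_params can be hard to follow in the listing. The following Python sketch restates it under simplifying assumptions: parameter lists are tuples, the classes `Err`, `Simple`, `Proc` and `Param` are hypothetical stand-ins for tp_err, the non-procedure types, tp_proc and tp_def, and only the kind and type of a parameter are modelled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Err:                       # stands in for the error type
    pass


@dataclass(frozen=True)
class Simple:                    # any non-procedure type, compared by name
    name: str


@dataclass(frozen=True)
class Param:                     # the two components of tp_def used here
    kind: str                    # e.g. "val", "var", "proc"
    type: object


@dataclass(frozen=True)
class Proc:                      # stands in for tp_proc
    params: Tuple[Param, ...]
    result: object


def equal_proc_types(t1, t2):
    """Results agree (or one is erroneous) and parameter lists are congruent."""
    results_ok = (t1.result == t2.result
                  or isinstance(t1.result, Err)
                  or isinstance(t2.result, Err))
    return results_ok and congruent_params(t1.params, t2.params)


def congruent_params(p1, p2):
    """Walk both lists in step, as the HEAD/TAIL recursion in the listing does."""
    if not p1 or not p2:
        return not p1 and not p2          # both must end at the same time
    h1, h2 = p1[0], p2[0]
    if h1.kind != h2.kind:
        return False
    if isinstance(h1.type, Err):
        heads_ok = True
    elif isinstance(h1.type, Proc):
        if isinstance(h2.type, Err):
            heads_ok = True
        elif isinstance(h2.type, Proc):
            heads_ok = equal_proc_types(h1.type, h2.type)
        else:
            heads_ok = False
    else:
        heads_ok = h1.type == h2.type     # OUT branch: plain type equality
    return heads_ok and congruent_params(p1[1:], p2[1:])


INT = Simple("integer")
f = Proc((Param("val", INT),), INT)
g = Proc((Param("val", INT),), INT)
print(congruent_params((Param("proc", f),), (Param("proc", g),)))
```

Error types are treated as congruent with anything in the positions shown, which keeps one faulty declaration from producing a cascade of further messages.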
FUNCTION f_is_ordinal (p_type : tp_type) BOOL :
   CASE p_type OF
   IS tp_int : IS tp_bool : IS tp_char : IS tp_enum :
   IS tp_subrange : IS tp_err : TRUE
   OUT FALSE
   ESAC;

FUNCTION f_reduce (p_type : tp_type) tp_type :
   CASE p_type OF
   IS tp_subrange : THIS.s_host;
   IS tp_set :
      IF THIS.s_base IS tp_subrange
      THEN tp_set (c_canonical, THIS.s_packing, f_reduce (THIS.s_base))
      ELSE p_type
      FI;
   IS tp_proc :
      IF EMPTY (THIS.s_params)
      THEN f_reduce (THIS.s_result)
      ELSE p_type
      FI
   OUT p_type
   ESAC;

FUNCTION f_no_file_contained (p_type : tp_type) BOOL :
   CASE p_type OF
   IS tp_file : FALSE;
   IS tp_array : f_no_file_contained (THIS.s_elem);
   IS tp_record : f_no_file_fields (THIS.s_all_fields)
   OUT TRUE
   ESAC;

FUNCTION f_no_file_fields (p_defs : tp_defs) BOOL :
   IF EMPTY (p_defs)
   THEN TRUE
   ELSE IF f_no_file_contained (HEAD (p_defs).s_type)
        THEN f_no_file_fields (TAIL (p_defs))
        ELSE FALSE
        FI
   FI;

FUNCTION f_is_string (p_type : tp_type) BOOL :
   CASE p_type OF
   IS tp_err : TRUE;
   IS tp_array :
      (THIS.s_packing = sc_packed) AND
      (THIS.s_elem = c_char) AND
      CASE THIS.s_index OF
      IS tp_subrange : (THIS.s_lwb = 1) AND (THIS.s_upb > 1)
      OUT FALSE
      ESAC
   OUT FALSE
   ESAC;
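f_no_file_contained and f_no_file_fields together perform a structural recursion over a type: a type is acceptable if no file type occurs as an array element or record field at any depth. A Python sketch of the same traversal is given below. The classes are hypothetical stand-ins for the ALADIN types tp_file, tp_array and tp_record, and a record is reduced to the tuple of its field types, which is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Simple:                 # integer, char, ... (the OUT branch)
    name: str


@dataclass(frozen=True)
class FileType:               # stands in for tp_file
    base: object


@dataclass(frozen=True)
class ArrayType:              # stands in for tp_array; only s_elem is modelled
    elem: object


@dataclass(frozen=True)
class RecordType:             # stands in for tp_record
    all_fields: Tuple[object, ...]   # field types only


def no_file_contained(t):
    """True if no file type occurs anywhere inside t."""
    if isinstance(t, FileType):
        return False
    if isinstance(t, ArrayType):
        return no_file_contained(t.elem)
    if isinstance(t, RecordType):
        # Plays the role of f_no_file_fields: every field must be file-free.
        return all(no_file_contained(f) for f in t.all_fields)
    return True


INT = Simple("integer")
nested = ArrayType(RecordType((INT, FileType(INT))))
print(no_file_contained(nested))
```

The grammar uses this predicate in rule r40 (file of file is rejected), in f_assignment_compatible, and in rule r59 for value parameters.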
FUNCTION f_is_tp_record (p_type : tp_type) tp_record :
   CASE p_type OF
   IS tp_record : THIS;
   IS tp_err : c_err_record
   OUT (c_err_record CONDITION FALSE MESSAGE "MUST BE RECORD TYPE")
   ESAC;
% Procedures and Functions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

TYPE tp_param_class :
   (sc_read_param, sc_write_param, sc_new_param,
    sc_pack_second_param, sc_pack_third_param,
    sc_unpack_second_param, sc_unpack_third_param,
    sc_specified_param, sc_error_param);

NONTERM proc_decl :
   at_defs_in        : tp_defs INH,
   at_forward_exists : BOOL,
   at_visible_labels : tp_labels INH,
   at_def            : tp_def,
   at_defs_out       : tp_defs;

NONTERM proc_type :
   at_level          : INT INH,
   at_complete_env   : tp_environ INH,
   at_proc_type      : tp_proc;

NONTERM form_parms, form_parm_sect :
   at_level          : INT INH,
   at_complete_env   : tp_environ INH,
   at_params         : tp_defs;

NONTERM PROC_FUNCT :
   at_is_FUNCT       : BOOL;
RULE r53:
% proc_decl ::= PROC_FUNCT name proc_type ';' block ';'
STATIC
   TRANSFER at_visible_labels;
   proc_type.at_level := (INCLUDING block.at_level) + 1;
   proc_type.at_complete_env := block.at_complete_env;
   proc_decl.at_def :=
      tp_def (name.at_name,
              INCLUDING block.at_level,
              c_a_value,
              IF proc_decl.at_forward_exists
              THEN sc_fit_forward_proc
              ELSE sc_decl_proc
              FI,
              IF proc_decl.at_forward_exists
              THEN f_identify (name.at_name,
                               proc_decl.at_defs_in,
                               tp_environ (),
                               sc_decl_proc,
                               FALSE).s_type
              ELSE proc_type.at_proc_type
              FI);
   proc_decl.at_forward_exists :=
      f_forward_exists (name.at_name, proc_decl.at_defs_in);
   proc_decl.at_defs_out :=
      tp_defs (proc_decl.at_def) + proc_decl.at_defs_in;
   block.at_params := proc_decl.at_def.s_type QUA tp_proc .s_params;
   block.at_nested_procs :=
      tp_defs (proc_decl.at_def) + (INCLUDING block.at_nested_procs);
   block.at_glob_env :=
      tp_environ (tp_block
                    (tp_defs (proc_decl.at_def) + proc_decl.at_defs_in)) +
      (INCLUDING block.at_glob_env);
   block.at_level := (INCLUDING block.at_level) + 1;
   CONDITION
      IF proc_decl.at_forward_exists
      THEN f_def_unique (name.at_name, proc_decl.at_defs_in)
      ELSE NOT KEY_IN_LIST (name.at_name, proc_decl.at_defs_in)
      FI
      MESSAGE "IDENTIFIER DEFINED BEFORE";
   CONDITION
      ((proc_decl.at_def.s_type QUA tp_proc .s_result =/ c_void) =
       PROC_FUNCT.at_is_FUNCT)
      MESSAGE "PROCEDURE / FUNCTION SYMBOL IS WRONG";
   CONDITION
      IF proc_decl.at_forward_exists
      THEN EMPTY (proc_type.at_proc_type.s_params) AND
           (proc_type.at_proc_type.s_result = c_void)
      ELSE TRUE
      FI
      MESSAGE "NEITHER PARAMETERS NOR RESULT TYPE ALLOWED"
END;

RULE r54:
% proc_decl ::= PROC_FUNCT name proc_type ';' name ';'
STATIC
   proc_decl.at_forward_exists :=
      f_forward_exists (name[1].at_name, proc_decl.at_defs_in);
   CONDITION NOT proc_decl.at_forward_exists
      MESSAGE "DECLARED FORWARD BEFORE";
   CONDITION
      IF name[2].at_name /= 'FORWARD'
      THEN f_forward_fitted (name[1].at_name,
                             INCLUDING block.at_loc_defs)
      ELSE TRUE
      FI
      MESSAGE "FORWARD IS NOT RESOLVED";
   proc_type.at_level := (INCLUDING block.at_level) + 1;
   proc_type.at_complete_env :=
      tp_environ (tp_block (proc_type.at_proc_type.s_params)) +
      (INCLUDING block.at_complete_env);
   proc_decl.at_def :=
      tp_def (name[1].at_name,
              INCLUDING block.at_level,
              c_a_value,
              IF name[2].at_name = 'EXTERNAL'
              THEN sc_external_proc
              ELSE (sc_forward_proc
                    CONDITION name[2].at_name = 'FORWARD'
                    MESSAGE "DIRECTIVE NOT IMPLEMENTED")
              FI,
              proc_type.at_proc_type);
   proc_decl.at_defs_out :=
      tp_defs (proc_decl.at_def) + proc_decl.at_defs_in;
   CONDITION NOT KEY_IN_LIST (name[1].at_name, proc_decl.at_defs_in)
      MESSAGE "IDENTIFIER DEFINED BEFORE";
   CONDITION
      PROC_FUNCT.at_is_FUNCT = (proc_type.at_proc_type.s_result =/ c_void)
      MESSAGE "PROCEDURE / FUNCTION SYMBOL WRONG"
END;

RULE r55:
% proc_type ::= [ '(' form_parms ')' ] [ ':' type_identifier ]
STATIC
   TRANSFER at_level, at_complete_env;
   proc_type.at_proc_type :=
      tp_proc (IF form_parms
               THEN form_parms.at_params
               ELSE tp_defs ()
               FI,
               IF type_identifier
               THEN type_identifier.at_def.s_type
               ELSE c_void
               FI);
   CONDITION
      IF type_identifier
      THEN CASE type_identifier.at_def.s_type OF
           IS tp_real : IS tp_pointer : TRUE
           OUT f_is_ordinal (THIS)
           ESAC
      ELSE TRUE
      FI
      MESSAGE "WRONG RESULT TYPE"
END;
RULE r56:
% form_parms ::= form_parm_sect
STATIC
   TRANSFER at_level, at_complete_env;
   form_parms.at_params := form_parm_sect.at_params
END;

RULE r57:
% form_parms ::= form_parms ';' form_parm_sect
STATIC
   TRANSFER at_level, at_complete_env;
   form_parms[1].at_params :=
      form_parms[2].at_params + form_parm_sect.at_params
   % Parameter sections are straightened (not Standard).
END;

RULE r58:
% form_parm_sect ::= PROC_FUNCT name proc_type
STATIC
   form_parm_sect.at_params :=
      tp_defs (tp_def (name.at_name,
                       form_parm_sect.at_level,
                       c_a_value,
                       sc_proc_param,
                       proc_type.at_proc_type));
   proc_type.at_level := form_parm_sect.at_level + 1;
   proc_type.at_complete_env :=
      tp_environ (tp_block (proc_type.at_proc_type.s_params)) +
      form_parm_sect.at_complete_env;
   CONDITION f_def_unique
                (name.at_name,
                 (INCLUDING proc_type.at_proc_type).s_params)
      MESSAGE "DUPLICATE PARAMETER NAME";
   CONDITION
      PROC_FUNCT.at_is_FUNCT = (proc_type.at_proc_type.s_result =/ c_void)
      MESSAGE "PROCEDURE / FUNCTION SYMBOL WRONG"
END;

RULE r59:
% form_parm_sect ::= [ VAR_sym ] ( name // ',' ) ':' type_identifier
STATIC
   form_parm_sect.at_params :=
      f_make_defs (name.at_name,
                   form_parm_sect.at_level,
                   IF VAR_sym
                   THEN sc_var_param
                   ELSE sc_val_param
                   FI,
                   type_identifier.at_def.s_type);
   CONDITION f_defs_unique
                (name.at_name,
                 (INCLUDING proc_type.at_proc_type).s_params)
      MESSAGE "DUPLICATE PARAMETER NAME";
   CONDITION (VAR_sym IS THERE) OR
             f_no_file_contained (type_identifier.at_def.s_type)
      MESSAGE "FILES ARE NOT ALLOWED FOR VALUE PARAMETERS";
   CONDITION NOT KEY_IN_LIST
                    (type_identifier.at_def.s_name,
                     (INCLUDING proc_type.at_proc_type).s_params)
      MESSAGE "TYPE IDENTIFIER IS EQUAL TO A PARAMETER IDENTIFIER"
END;
RULE r60:
% PROC_FUNCT ::= 'PROCEDURE'
STATIC
   PROC_FUNCT.at_is_FUNCT := FALSE
END;

RULE r61:
% PROC_FUNCT ::= 'FUNCTION'
STATIC
   PROC_FUNCT.at_is_FUNCT := TRUE
END;

RULE r62:
% call_with_params ::= identifier '(' expr [ format ] act_params ')'
STATIC
   call_with_params.at_type :=
      CASE identifier.at_def.s_type OF
      IS tp_proc :
         IF THIS.s_result IS tp_overloaded
         THEN f_reduce (expr.at_type)
         ELSE THIS.s_result
         FI;
      IS tp_err : c_err
      OUT (c_err CONDITION FALSE
                 MESSAGE "MUST BE PROCEDURE OR FUNCTION")
      ESAC;
   call_with_params.at_kind := sc_dyn_val;
   call_with_params.at_value := c_a_value;

   CONDITION
      CASE identifier.at_def.s_type OF
      IS tp_proc :
         IF EMPTY (THIS.s_params)
         THEN FALSE
         ELSE IF NOT (HEAD (THIS.s_params).s_type IS tp_overloaded)
              THEN f_check_param (HEAD (THIS.s_params),
                                  expr.at_type, expr.at_kind)
              ELSE CASE identifier.at_def.s_name OF
                   'RESET': 'REWRITE': 'PUT': 'GET': 'EOF':
                      expr.at_type IS tp_file;
                   'PAGE': 'EOLN':
                      expr.at_type = c_text_type;
                   'READ': 'WRITE':
                      IF expr.at_type IS tp_file
                      THEN TRUE
                      ELSE f_check_io_param (c_text, act_params.at_class,
                                             expr.at_type, expr.at_kind)
                      FI;
                   'READLN': 'WRITELN':
                      IF expr.at_type IS tp_file
                      THEN expr.at_type = c_text_type
                      ELSE f_check_io_param (c_text, act_params.at_class,
                                             expr.at_type, expr.at_kind)
                      FI;
                   'NEW': 'DISPOSE':
                      (expr.at_type IS tp_pointer) AND
                      f_is_variable_access (expr.at_kind);
                   'PACK':
                      CASE expr.at_type OF
                      IS tp_array : THIS.s_packing = sc_unpacked;
                      IS tp_err : TRUE
                      OUT FALSE
                      ESAC;
                   'UNPACK':
                      CASE expr.at_type OF
                      IS tp_array : THIS.s_packing = sc_packed;
                      IS tp_err : TRUE
                      OUT FALSE
                      ESAC;
                   'ORD': 'PRED': 'SUCC':
                      f_is_ordinal (f_reduce (expr.at_type));
                   'ABS': 'SQR':
                      f_assignment_compatible
                         (c_real, f_reduce (expr.at_type))
                   OUT TRUE
                   ESAC
              FI
         FI
      OUT TRUE
      ESAC
      MESSAGE "FIRST PARAMETER IS WRONG";

   act_params.at_class :=
      IF identifier.at_def.s_type IS tp_proc
      THEN IF identifier.at_def.s_level = 0
           THEN CASE identifier.at_def.s_name OF
                'READ' : 'READLN' : sc_read_param;
                'WRITE' : 'WRITELN' : sc_write_param;
                'NEW' : 'DISPOSE' : sc_new_param;
                'PACK' : sc_pack_second_param;
                'UNPACK' : sc_unpack_second_param
                OUT sc_specified_param
                ESAC
           ELSE sc_specified_param
           FI
      ELSE sc_error_param
      FI;

   act_params.at_formals :=
      CASE identifier.at_def.s_type OF
      IS tp_proc :
         IF EMPTY (THIS.s_params)
         THEN tp_defs ()
         ELSE IF act_params.at_class = sc_specified_param
              THEN TAIL (THIS.s_params)
              ELSE THIS.s_params
                   % more parameters are allowed or required
              FI
         FI
      OUT tp_defs ()
      ESAC;

   act_params.at_choice :=
      IF act_params.at_class = sc_new_param
      THEN CASE expr.at_type OF
           IS tp_pointer :
              CASE f_identify (THIS.s_referred,
                               tp_defs (),
                               INCLUDING (block.at_env,
                                          WITH_clause.at_env),
                               sc_type,
                               FALSE).s_type OF
              IS tp_record : THIS.s_choice
              OUT c_empty_choice
              ESAC;
           IS tp_err : c_empty_choice
           OUT c_empty_choice
           ESAC
      ELSE c_empty_choice   % dont care
      FI;

   act_params.at_file :=
      CASE act_params.at_class OF
      sc_read_param: sc_write_param:
         CASE identifier.at_def.s_name OF
         'READ': 'WRITE':
            CASE expr.at_type OF
            IS tp_file : THIS
            OUT c_text
            ESAC;
         'READLN': 'WRITELN': c_text
         OUT c_text   % dont care
         ESAC
      OUT c_text   % dont care
      ESAC;

   act_params.at_array :=
      CASE act_params.at_class OF
      sc_pack_second_param: sc_unpack_second_param:
         CASE expr.at_type OF
         IS tp_array: THIS
         OUT c_err_array
         ESAC
      OUT c_err_array   % dont care
      ESAC;

   CONDITION
      IF NOT (expr.at_type IS tp_file)
      THEN CASE act_params.at_class OF
           sc_read_param:
              KEY_IN_LIST ('INPUT', INCLUDING program.at_params);
           sc_write_param:
              KEY_IN_LIST ('OUTPUT', INCLUDING program.at_params)
           OUT TRUE
           ESAC
      ELSE TRUE
      FI
      MESSAGE "STANDARD FILE MUST BE PROGRAM PARAMETER";

   CONDITION
      IF format
      THEN NOT (expr.at_type IS tp_file) AND
           (act_params.at_class = sc_write_param) AND
           (NOT format.at_frac OR (f_reduce (expr.at_type) = c_real))
      ELSE TRUE
      FI
      MESSAGE "WRONG FORMAT";
END;
RULE r63:
% act_params ::= ',' expr [ format ] act_params
STATIC
   TRANSFER at_file, at_array;

   CONDITION
      CASE act_params[1].at_class OF
      sc_error_param:
         TRUE;
      sc_specified_param:
         IF EMPTY (act_params[1].at_formals)
         THEN FALSE
         ELSE f_check_param (HEAD (act_params[1].at_formals),
                             expr.at_type, expr.at_kind)
         FI;
      sc_read_param: sc_write_param:
         f_check_io_param (act_params[1].at_file, THIS,
                           expr.at_type, expr.at_kind);
      sc_new_param:
         (expr.at_kind = sc_static_val) AND
         CASE act_params[1].at_choice OF
         IS tp_empty : FALSE;
         IS tp_union : f_compatible_types (THIS.s_discr, expr.at_type)
         ESAC;
      sc_pack_second_param:
         f_assignment_compatible (act_params[1].at_array.s_index,
                                  f_reduce (expr.at_type));
      sc_pack_third_param:
         f_is_variable_access (expr.at_kind) AND
         CASE expr.at_type OF
         IS tp_array:
            ((THIS.s_packing = sc_packed) AND
             (act_params[1].at_array.s_elem = THIS.s_elem)) OR
            (act_params[1].at_array = c_err_array)
         OUT FALSE
         ESAC;
      sc_unpack_third_param:
         f_assignment_compatible (act_params[1].at_array.s_index,
                                  f_reduce (expr.at_type));
      sc_unpack_second_param:
         f_is_variable_access (expr.at_kind) AND
         CASE expr.at_type OF
         IS tp_array:
            ((THIS.s_packing = sc_unpacked) AND
             (act_params[1].at_array.s_elem = THIS.s_elem)) OR
            (act_params[1].at_array = c_err_array)
         OUT FALSE
         ESAC
      ESAC
      MESSAGE "WRONG PARAMETER";

   act_params[2].at_class :=
      CASE act_params[1].at_class OF
      sc_pack_second_param: sc_pack_third_param;
      sc_pack_third_param: sc_specified_param;
      sc_unpack_second_param: sc_unpack_third_param;
      sc_unpack_third_param: sc_specified_param
      OUT THIS
      ESAC;

   act_params[2].at_formals :=
      IF EMPTY (act_params[1].at_formals)
      THEN tp_defs ()
      ELSE IF act_params[2].at_class = sc_specified_param
           THEN TAIL (act_params[1].at_formals)
           ELSE act_params[1].at_formals
                % more parameters are allowed or required
           FI
      FI;

   act_params[2].at_choice :=
      IF act_params[1].at_class = sc_new_param
      THEN CASE act_params[1].at_choice OF
           IS tp_empty : c_empty_choice;
           IS tp_union : f_find_variant_choice
                            (THIS.s_variants, expr.at_value)
           ESAC
      ELSE c_empty_choice
      FI;

   CONDITION
      IF format
      THEN (act_params[1].at_class = sc_write_param) AND
           (NOT format.at_frac OR (f_reduce (expr.at_type) = c_real))
      ELSE TRUE
      FI
      MESSAGE "WRONG FORMAT"
END;

RULE r64:
% act_params ::=
STATIC
   CONDITION (act_params.at_class =/ sc_specified_param) OR
             EMPTY (act_params.at_formals)
      MESSAGE "PARAMETERS MISSING"
END;

RULE r65:
% format ::= ':' expr [ ':' expr ]
STATIC
   format.at_frac := expr[2] IS THERE;
   CONDITION (f_reduce (expr[1].at_type) = c_int) AND
             IF expr[2]
             THEN f_reduce (expr[2].at_type) = c_int
             ELSE TRUE
             FI
      MESSAGE "FORMAT TYPE MUST BE INTEGER"
END;

FUNCTION f_forward_exists (p_name : SYMB, p_defs : tp_defs) BOOL :
   IF EMPTY (p_defs)
   THEN FALSE
   ELSE IF HEAD (p_defs).s_name = p_name
        THEN HEAD (p_defs).s_kind = sc_forward_proc
        ELSE f_forward_exists (p_name, TAIL (p_defs))
        FI
   FI;

FUNCTION f_forward_fitted (p_name : SYMB, p_defs : tp_defs) BOOL :
   IF EMPTY (p_defs)
   THEN FALSE
   ELSE IF (HEAD (p_defs).s_kind = sc_fit_forward_proc) AND
           (HEAD (p_defs).s_name = p_name)
        THEN TRUE
        ELSE f_forward_fitted (p_name, TAIL (p_defs))
        FI
   FI;

FUNCTION f_check_param (p_formal : tp_def,
                        p_actual : tp_type,
                        p_kind   : tp_kind) BOOL :
   CASE p_formal.s_kind OF
   sc_val_param :
      f_assignment_compatible (p_formal.s_type, f_reduce (p_actual));
   sc_var_param :
      (p_formal.s_type = p_actual) AND
      CASE p_kind OF
      sc_var : sc_val_param : sc_var_param : sc_field : TRUE
      OUT FALSE
      ESAC;
   sc_proc_param :
      CASE p_actual OF
      IS tp_proc : f_equal_proc_types (p_formal.s_type QUA tp_proc, THIS);
      IS tp_err : TRUE
      OUT FALSE
      ESAC
   OUT FALSE
   ESAC;

FUNCTION f_check_io_param (p_file   : tp_file,
                           p_class  : tp_param_class,
                           p_actual : tp_type,
                           p_kind   : tp_kind) BOOL :
   ((p_class = sc_write_param) OR f_is_variable_access (p_kind)) AND
   IF p_file = c_text
   THEN (c_int = f_reduce (p_actual)) OR
        (c_real = f_reduce (p_actual)) OR
        (c_char = f_reduce (p_actual)) OR
        (c_bool = f_reduce (p_actual)) OR
        (f_is_string (f_reduce (p_actual)) AND
         (p_class = sc_write_param))
   ELSE IF p_class = sc_write_param
        THEN f_assignment_compatible (p_file.s_base, f_reduce (p_actual))
        ELSE f_assignment_compatible (p_actual, p_file.s_base)
        FI
   FI;

FUNCTION f_find_variant_choice (p_vars : tp_variants,
                                p_val  : INT) tp_choice :
   IF EMPTY (p_vars)
   THEN c_empty_choice
   ELSE IF ELEM_IN_LIST (p_val, HEAD (p_vars).s_labels)
        THEN HEAD (p_vars).s_choice
        ELSE f_find_variant_choice (TAIL (p_vars), p_val)
        FI
   FI;
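f_find_variant_choice is a linear search through the variants of a record: it returns the choice of the first variant whose label list contains the given tag value, and the empty choice if none does. Rule r63 uses it to follow the tag values supplied in a call such as NEW (p, t1, t2). A small Python sketch of the search is shown below; the class `Variant`, the marker `EMPTY_CHOICE` and the sample data are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Variant:                   # stands in for tp_variant
    labels: Tuple[int, ...]      # case label values (s_labels)
    choice: object               # nested variant part, if any (s_choice)


EMPTY_CHOICE = None              # stands in for c_empty_choice


def find_variant_choice(variants, value):
    """Return the choice of the first variant labelled with value."""
    for variant in variants:
        if value in variant.labels:
            return variant.choice
    return EMPTY_CHOICE


variants = (
    Variant(labels=(1, 2), choice="inner union A"),
    Variant(labels=(3,), choice=EMPTY_CHOICE),
)
print(find_variant_choice(variants, 2))
```

The iterative loop replaces the HEAD/TAIL recursion of the ALADIN function; both visit the variants in declaration order and stop at the first match.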
% Expressions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

NONTERM expr, simp_expr, term, factor,
        variable, var_denot, call_with_params :
   at_type       : tp_type,
   at_kind       : tp_kind,
   at_value      : INT;

NONTERM identifier :
   at_def        : tp_def;

NONTERM constant_ident :
   at_def        : tp_def;

NONTERM constant, unsigned_const :
   at_type       : tp_type,
   at_value      : INT;

NONTERM act_params :
   at_class      : tp_param_class,
   at_formals    : tp_defs,
   at_choice     : tp_choice,
   at_array      : tp_array,
   at_file       : tp_file;

NONTERM format :
   at_frac       : BOOL;

NONTERM element, element_list :
   at_base       : tp_type;

NONTERM subscript_list, subscript :
   at_array_type : tp_array INH,
   at_elem_type  : tp_type;

TERM name :
   at_name       : SYMB;

TERM integer_number :
   at_value      : INT;

TERM real_number :
   at_value      : INT;   % a mapping

TERM character, string :
   at_value      : INT,
   at_length     : INT;   % mapping

NONTERM rel_opr, add_opr, mul_opr :
   at_op         : SYMB;
RULE r66:
% expr ::= simp_expr
STATIC TRANSFER END;

RULE r67:
% simp_expr ::= term
STATIC TRANSFER END;

RULE r68:
% term ::= factor
STATIC TRANSFER END;

RULE r69:
% expr ::= simp_expr rel_opr simp_expr
STATIC
   expr.at_type :=
      f_dyadic_operator (rel_opr.at_op,
                         f_reduce (simp_expr[1].at_type),
                         f_reduce (simp_expr[2].at_type));
   expr.at_kind := sc_dyn_val;
   expr.at_value := c_a_value
END;

RULE r70:
% simp_expr ::= simp_expr add_opr term
STATIC
   simp_expr[1].at_type :=
      f_dyadic_operator (add_opr.at_op,
                         f_reduce (simp_expr[2].at_type),
                         f_reduce (term.at_type));
   simp_expr[1].at_kind := sc_dyn_val;
   simp_expr[1].at_value := c_a_value
END;

RULE r71:
% simp_expr ::= add_opr term
STATIC
   CONDITION add_opr.at_op /= 'OR'
      MESSAGE "ONLY SIGN (+,-) ALLOWED";
   simp_expr.at_type :=
      (f_reduce (term.at_type)
       CONDITION f_is_arithmetic (IT)
       MESSAGE "OPERAND MUST BE ARITHMETIC");
   simp_expr.at_kind := sc_dyn_val;
   simp_expr.at_value := c_a_value
END;

RULE r72:
% term ::= term mul_opr factor
STATIC
   term[1].at_type :=
      f_dyadic_operator (mul_opr.at_op,
                         f_reduce (term[2].at_type),
                         f_reduce (factor.at_type));
   term[1].at_kind := sc_dyn_val;
   term[1].at_value := c_a_value
END;

RULE r73:
% factor ::= '(' expr ')'
STATIC
   TRANSFER at_type;
   factor.at_kind := sc_dyn_val;
   factor.at_value := c_a_value
END;

RULE r74:
% factor ::= call_with_params
STATIC TRANSFER END;
RULE r75:
% factor ::= 'NOT' factor
STATIC
   factor[1].at_type := c_bool;
   factor[1].at_kind := sc_dyn_val;
   factor[1].at_value := c_a_value;
   CONDITION f_reduce (factor[2].at_type) = c_bool
      MESSAGE "OPERAND MUST BE BOOLEAN"
END;

RULE r76:
% factor ::= var_denot
STATIC TRANSFER END;

RULE r77:
% factor ::= identifier
STATIC
   factor.at_type :=
      CASE id_type : identifier.at_def.s_type OF
      IS tp_proc :
         IF identifier.at_def.s_level = 0
         THEN IF id_type.s_result IS tp_overloaded
              THEN (c_err CONDITION FALSE
                          MESSAGE "PARAMETERS MISSING")
              ELSE CASE identifier.at_def.s_name OF
                   'EOF' : 'EOLN' :
                      (c_bool
                       CONDITION KEY_IN_LIST
                                    ('INPUT', INCLUDING program.at_params)
                       MESSAGE "STANDARD FILE MUST BE PROGRAM PARAMETER")
                   OUT (id_type.s_result
                        CONDITION EMPTY (id_type.s_params)
                        MESSAGE "PARAMETERS MISSING")
                   ESAC
              FI
         ELSE id_type
         FI
      OUT id_type
      ESAC;
   factor.at_kind :=
      IF (identifier.at_def.s_type IS tp_proc) AND
         (identifier.at_def.s_level = 0)
      THEN sc_dyn_val
      ELSE IF identifier.at_def.s_kind = sc_type
           THEN (sc_var CONDITION FALSE
                        MESSAGE "TYPE IDENTIFIER NOT ALLOWED")
           ELSE identifier.at_def.s_kind
           FI
      FI;
   factor.at_value := identifier.at_def.s_value
END;

RULE r78:
% factor ::= 'NIL'
STATIC
   factor.at_type := c_nil;
   factor.at_kind := sc_dyn_val;
   factor.at_value := c_a_value;
END;

RULE r79:
% factor ::= '[' [ element_list ] ']'
STATIC
   factor.at_type :=
      IF element_list
      THEN tp_general_set (element_list.at_base)
      ELSE c_empty_set
      FI;
   factor.at_kind := sc_dyn_val;
   factor.at_value := c_a_value
END;

RULE r80:
% factor ::= integer_number
STATIC
   TRANSFER at_value;
   factor.at_type := c_int;
   factor.at_kind := sc_static_val
END;
RULE r81:
% factor ::= real_number
STATIC
   TRANSFER at_value;
   factor.at_type := c_real;
   factor.at_kind := sc_static_val
END;

RULE r82:
% factor ::= character
STATIC
   TRANSFER at_value;
   factor.at_type := c_char;
   factor.at_kind := sc_static_val
END;

RULE r83:
% factor ::= string
STATIC
   TRANSFER at_value;
   factor.at_kind := sc_static_val;
   factor.at_type :=
      tp_array (GENNUM, sc_packed,
                tp_subrange (GENNUM, c_int, 1, string.at_length),
                c_char)
END;

RULE r84:
% variable ::= var_denot
STATIC TRANSFER END;

RULE r85:
% variable ::= identifier
STATIC
   variable.at_type := identifier.at_def.s_type;
   variable.at_kind := identifier.at_def.s_kind;
   variable.at_value := identifier.at_def.s_value
END;

RULE r86:
% var_denot ::= variable '.' name
STATIC
   CONDITION f_is_variable_access (variable.at_kind)
      MESSAGE "MUST BE VARIABLE";
   var_denot.at_type :=
      f_identify (name.at_name,
                  f_is_tp_record (variable.at_type).s_all_fields,
                  tp_environ (),
                  sc_field,
                  FALSE).s_type;
   var_denot.at_kind :=
      CASE variable.at_type OF
      IS tp_record :
         IF THIS.s_packing = sc_packed
         THEN sc_packed_field
         ELSE sc_field
         FI
      OUT sc_field
      ESAC;
   var_denot.at_value := c_a_value
END;

RULE r87:
% var_denot ::= variable '[' subscript_list ']'
STATIC
   CONDITION f_is_variable_access (variable.at_kind)
      MESSAGE "MUST BE VARIABLE";
   var_denot.at_type := subscript_list.at_elem_type;
   var_denot.at_kind :=
      CASE variable.at_type OF
      IS tp_array :
         IF THIS.s_packing = sc_packed
         THEN sc_packed_field
         ELSE sc_field
         FI
      OUT sc_field
      ESAC;
   var_denot.at_value := c_a_value;
   subscript_list.at_array_type :=
      CASE variable.at_type OF
      IS tp_array : THIS;
      IS tp_err : c_err_array
      OUT (c_err_array CONDITION FALSE
                       MESSAGE "VARIABLE MUST BE ARRAY")
      ESAC
END;
RULE r88:
% var_denot ::= variable '@'
STATIC
   CONDITION f_is_variable_access (variable.at_kind)
      MESSAGE "MUST BE VARIABLE";
   var_denot.at_type :=
      CASE variable.at_type OF
      IS tp_pointer :
         f_identify (THIS.s_referred,
                     tp_defs (),
                     INCLUDING (block.at_env, WITH_clause.at_env),
                     sc_type,
                     FALSE).s_type;
      IS tp_file : THIS.s_base
      OUT (c_err CONDITION FALSE
                 MESSAGE "MUST BE POINTER OR FILE TYPE")
      ESAC;
   var_denot.at_kind := sc_var;
   var_denot.at_value := c_a_value
END;

RULE r89:
% element_list ::= element
STATIC TRANSFER END;

RULE r90:
% element_list ::= element_list ',' element
STATIC
   TRANSFER at_base WITH element;
   CONDITION element_list[2].at_base = element.at_base
      MESSAGE "TYPES OF SET ELEMENTS ARE INCOMPATIBLE"
END;

RULE r91:
% element ::= expr [ '..' expr ]
STATIC
   element.at_base :=
      (f_reduce (expr[1].at_type)
       CONDITION f_is_ordinal (IT)
       MESSAGE "SET BASE TYPE MUST BE ORDINAL");
   CONDITION
      IF expr[2]
      THEN f_reduce (expr[2].at_type) = element.at_base
      ELSE TRUE
      FI
      MESSAGE "SET BOUND TYPES ARE INCOMPATIBLE"
END;
RULE r92:
% subscript_list ::= subscript
STATIC TRANSFER END;

RULE r93:
% subscript_list ::= subscript_list ',' subscript
STATIC
   TRANSFER at_array_type WITH subscript_list[2];
   subscript.at_array_type :=
      CASE subscript_list[2].at_elem_type OF
      IS tp_array : THIS;
      IS tp_err : c_err_array
      OUT (c_err_array CONDITION FALSE MESSAGE "TOO MANY INDICES")
      ESAC;
   TRANSFER at_elem_type WITH subscript
END;

RULE r94:
% subscript ::= expr
STATIC
   subscript.at_elem_type := subscript.at_array_type.s_elem;
   CONDITION f_assignment_compatible
                (subscript.at_array_type.s_index,
                 f_reduce (expr.at_type))
      MESSAGE "WRONG SUBSCRIPT TYPE"
END;

RULE r95:
% identifier ::= name
STATIC
   identifier.at_def :=
      f_identify (name.at_name,
                  tp_defs (),
                  INCLUDING (block.at_env, WITH_clause.at_env),
                  sc_var,
                  FALSE);
   CONDITION
      identifier.at_def.s_level =
      f_identify (name.at_name,
                  tp_defs (),
                  INCLUDING (block.at_complete_env,
                             WITH_clause.at_complete_env),
                  sc_var,
                  TRUE).s_level
      MESSAGE "IDENTIFIER USED BEFORE DEFINITION"
END;
RULE r96:
% constant_ident ::= name
STATIC
   constant_ident.at_def :=
      f_identify (name.at_name,
                  INCLUDING (const_def.at_defs_in,
                             type_denoter.at_defs_in,
                             case_label.at_defs_in),
                  INCLUDING (block.at_glob_env,
                             case_label.at_glob_env),
                  sc_static_val,
                  FALSE);
   CONDITION constant_ident.at_def.s_kind = sc_static_val
      MESSAGE "MUST BE CONSTANT IDENTIFIER";
   CONDITION
      constant_ident.at_def.s_level =
      f_identify (name.at_name,
                  tp_defs (),
                  INCLUDING (block.at_complete_env,
                             record_type.at_complete_env,
                             case_label.at_complete_env),
                  sc_static_val,
                  TRUE).s_level
      MESSAGE "IDENTIFIER USED BEFORE DEFINITION"
END;

RULE r97:
% constant ::= unsigned_const
STATIC TRANSFER END;

RULE r98:
% constant ::= add_opr unsigned_const
STATIC
   CONDITION add_opr.at_op /= 'OR'
      MESSAGE "ONLY SIGN (+,-) ALLOWED";
   TRANSFER at_type;
   constant.at_value :=
      IF add_opr.at_op = '+'
      THEN unsigned_const.at_value
      ELSE - unsigned_const.at_value
      FI;
   CONDITION f_is_arithmetic (unsigned_const.at_type)
      MESSAGE "OPERAND MUST BE ARITHMETIC"
END;
RULE r99:
% unsigned_const ::= constant_ident
STATIC
   unsigned_const.at_type := constant_ident.at_def.s_type;
   unsigned_const.at_value := constant_ident.at_def.s_value
END;

RULE r100:
% unsigned_const ::= integer_number
STATIC
   TRANSFER at_value;
   unsigned_const.at_type := c_int
END;

RULE r101:
% unsigned_const ::= real_number
STATIC
   TRANSFER at_value;
   unsigned_const.at_type := c_real
END;

RULE r102:
% unsigned_const ::= character
STATIC
   TRANSFER at_value;
   unsigned_const.at_type := c_char
END;

RULE r103:
% unsigned_const ::= string
STATIC
   TRANSFER at_value;
   unsigned_const.at_type :=
      tp_array (GENNUM, sc_packed,
                tp_subrange (GENNUM, c_int, 1, string.at_length),
                c_char)
END;
RULE r104: % rel_opr ::= '='    STATIC rel_opr.at_op := '='    END;
RULE r105: % rel_opr ::= '<>'   STATIC rel_opr.at_op := '<>'   END;
RULE r106: % rel_opr ::= '<='   STATIC rel_opr.at_op := '<='   END;
RULE r107: % rel_opr ::= '>='   STATIC rel_opr.at_op := '>='   END;
RULE r108: % rel_opr ::= '<'    STATIC rel_opr.at_op := '<'    END;
RULE r109: % rel_opr ::= '>'    STATIC rel_opr.at_op := '>'    END;
RULE r110: % rel_opr ::= 'IN'   STATIC rel_opr.at_op := 'IN'   END;
RULE r111: % add_opr ::= '+'    STATIC add_opr.at_op := '+'    END;
RULE r112: % add_opr ::= '-'    STATIC add_opr.at_op := '-'    END;
RULE r113: % add_opr ::= 'OR'   STATIC add_opr.at_op := 'OR'   END;
RULE r114: % mul_opr ::= '*'    STATIC mul_opr.at_op := '*'    END;
RULE r115: % mul_opr ::= '/'    STATIC mul_opr.at_op := '/'    END;
RULE r116: % mul_opr ::= 'DIV'  STATIC mul_opr.at_op := 'DIV'  END;
RULE r117: % mul_opr ::= 'MOD'  STATIC mul_opr.at_op := 'MOD'  END;
RULE r118: % mul_opr ::= 'AND'  STATIC mul_opr.at_op := 'AND'  END;

FUNCTION f_is_variable_access (p_kind : tp_kind) BOOL :
   CASE p_kind OF
   sc_var : sc_val_param : sc_var_param :
   sc_field : sc_packed_field : sc_tag_field : TRUE
   OUT FALSE
   ESAC;

FUNCTION f_is_arithmetic (p_type : tp_type) BOOL :
   f_compatible_types (p_type, c_int) OR (p_type = c_real);

FUNCTION f_dyadic_operator (p_op : SYMB, p_left, p_right : tp_type) tp_type :
   CASE p_op OF
   '=' : '<>' : '<=' : '>=' : '<' : '>' :
      (c_bool CONDITION
          (f_is_arithmetic (p_left) AND f_is_arithmetic (p_right)) OR
          (f_compatible_types (p_left, p_right) AND
           CASE p_left OF
           IS tp_bool : IS tp_char : IS tp_enum :
           IS tp_subrange : IS tp_err : TRUE;
           IS tp_set : IS tp_general_set : IS tp_empty_set :
              (p_op = '=') OR (p_op = '<>') OR
              (p_op = '<=') OR (p_op = '>=');
           IS tp_pointer : (p_op = '=') OR (p_op = '<>');
           IS tp_array : f_is_string (p_left) AND f_is_string (p_right)
           OUT FALSE
           ESAC)
       MESSAGE "INCOMPATIBLE TYPES FOR RELATION");
   'IN' :
      (c_bool CONDITION
          CASE p_right OF
          IS tp_set : f_compatible_types (p_left, THIS.s_base);
          IS tp_general_set : f_compatible_types (p_left, THIS.s_base);
          IS tp_empty_set : IS tp_err : f_is_ordinal (p_left)
          OUT FALSE
          ESAC
       MESSAGE "INCOMPATIBLE TYPES FOR IN OPERATOR");
   'OR' : 'AND' :
      (c_bool CONDITION
          ((p_left = c_bool) OR (p_left = c_err)) AND
          ((p_right = c_bool) OR (p_right = c_err))
       MESSAGE "INCOMPATIBLE TYPES FOR BOOLEAN OPERATOR");
   '+' : '-' : '*' :
      IF f_is_arithmetic (p_left) AND f_is_arithmetic (p_right)
      THEN IF f_compatible_types (p_left, c_int) AND
              f_compatible_types (p_right, c_int)
           THEN c_int
           ELSE c_real
           FI
      ELSE (CASE left : p_left OF
            IS tp_set : p_left;
            IS tp_general_set : IS tp_empty_set :
               CASE right : p_right OF
               IS tp_set : IS tp_general_set : p_right;
               IS tp_empty_set : p_left
               OUT (c_err CONDITION p_right IS tp_err
                    MESSAGE "OPERATION NEITHER ARITHMETIC NOR SET")
               ESAC
            OUT (c_err CONDITION p_left IS tp_err
                 MESSAGE "OPERATION NEITHER ARITHMETIC NOR SET")
            ESAC
            CONDITION f_compatible_types (p_left, p_right)
            MESSAGE "INCOMPATIBLE OPERAND TYPES")
      FI;
   'DIV' : 'MOD' :
      (c_int CONDITION f_compatible_types (p_left, c_int) AND
                       f_compatible_types (p_right, c_int)
       MESSAGE "OPERAND TYPES MUST BE INTEGER");
   '/' :
      (c_real CONDITION f_is_arithmetic (p_left) AND
                        f_is_arithmetic (p_right)
       MESSAGE "OPERAND TYPES MUST BE ARITHMETIC")
   OUT (c_err CONDITION FALSE)
   ESAC;
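The result-type selection of f_dyadic_operator can be illustrated outside ALADIN. The following Python sketch is not part of the GAG system; it is a simplified model under stated assumptions: types are plain strings, only scalar types are modelled (sets, pointers, subranges and strings are left out), and the error type "err" is assumed to be compatible with every type. The names `compatible`, `is_arithmetic` and `dyadic_operator` are hypothetical.

```python
# Simplified model of the result typing done by f_dyadic_operator.
# Assumptions (not from the book): types are strings, only scalar
# types exist, and "err" is compatible with every other type.

RELATIONS = {"=", "<>", "<=", ">=", "<", ">"}


def compatible(a, b):
    """Stand-in for f_compatible_types on scalar types."""
    return a == b or "err" in (a, b)


def is_arithmetic(t):
    """Mirrors f_is_arithmetic: compatible with int, or equal to real."""
    return compatible(t, "int") or t == "real"


def dyadic_operator(op, left, right):
    """Return (result_type, messages) for the expression `left op right`."""
    msgs = []
    arith = is_arithmetic(left) and is_arithmetic(right)
    if op in RELATIONS:
        if not (arith or compatible(left, right)):
            msgs.append("INCOMPATIBLE TYPES FOR RELATION")
        return "bool", msgs
    if op in ("OR", "AND"):
        if not (compatible(left, "bool") and compatible(right, "bool")):
            msgs.append("INCOMPATIBLE TYPES FOR BOOLEAN OPERATOR")
        return "bool", msgs
    if op in ("+", "-", "*"):
        if arith:
            both_int = compatible(left, "int") and compatible(right, "int")
            return ("int" if both_int else "real"), msgs
        # Neither arithmetic nor (in this reduced model) a set operand.
        if left != "err":
            msgs.append("OPERATION NEITHER ARITHMETIC NOR SET")
        if not compatible(left, right):
            msgs.append("INCOMPATIBLE OPERAND TYPES")
        return "err", msgs
    if op in ("DIV", "MOD"):
        if not (compatible(left, "int") and compatible(right, "int")):
            msgs.append("OPERAND TYPES MUST BE INTEGER")
        return "int", msgs
    if op == "/":
        if not arith:
            msgs.append("OPERAND TYPES MUST BE ARITHMETIC")
        return "real", msgs
    return "err", msgs
```

In this model, subtracting an integer from a character, as in the expression INLINE[I]-ORD('0') of the demonstration program in B.4, produces both the "NEITHER ARITHMETIC NOR SET" and the "INCOMPATIBLE OPERAND TYPES" message, because the two conditions are checked independently.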
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  Statements
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

NONTERM statement_seq, statement :
   at_visible_labels  : tp_labels INH,
   at_labels          : tp_labels;

NONTERM case_limb, case_limbs :
   at_visible_labels  : tp_labels INH,
   at_case_labels_in  : tp_labels INH,
   at_case_labels_out : tp_labels,
   at_label_type      : tp_type INH;

NONTERM WITH_clause :
   at_visible_labels  : tp_labels INH,
   at_env             : tp_environ INH,
   at_complete_env    : tp_environ INH;

NONTERM case_label :
   at_defs_in         : tp_defs INH,
   at_glob_env        : tp_environ INH,
   at_complete_env    : tp_environ INH,
   at_in_statement    : BOOL INH,
   at_label_type      : tp_type INH,
   at_case_labels_in  : tp_labels INH,
   at_value           : INT;

NONTERM TO_DOWNTO :
   at_is_TO           : BOOL;

RULE r119:
% statement_seq ::= statement
STATIC TRANSFER END;
RULE r120:
% statement_seq ::= statement_seq ';' statement
STATIC
   statement_seq[1].at_labels :=
      statement_seq[2].at_labels + statement.at_labels;
   statement_seq[2].at_visible_labels :=
      statement_seq[1].at_labels + statement_seq[1].at_visible_labels;
   statement.at_visible_labels :=
      statement_seq[1].at_labels + statement_seq[1].at_visible_labels
END;
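Rules r119 and r120 first collect the labels of a statement sequence bottom-up (at_labels) and then hand them down again (at_visible_labels). The following Python sketch models that two-direction flow; it is an illustration only, with a statement sequence represented as a hypothetical list of per-statement label lists, and with list concatenation standing in for the `+` on tp_labels.

```python
# Model of r119/r120: at_labels is synthesized, at_visible_labels inherited.
# A sequence is a list of label lists, one entry per statement (assumption).


def seq_labels(statements):
    """Synthesized at_labels of a sequence: labels of all its statements."""
    labels = []
    for own in statements:
        labels += own
    return labels


def visible_labels(statements, inherited):
    """Return at_visible_labels for each statement of the sequence.

    `inherited` is the at_visible_labels value of the sequence itself.
    """
    if len(statements) == 1:
        # r119 (TRANSFER): the single statement sees what the sequence sees.
        return [list(inherited)]
    # r120: both the shorter sequence and the last statement receive
    # statement_seq[1].at_labels + statement_seq[1].at_visible_labels.
    down = seq_labels(statements) + list(inherited)
    return visible_labels(statements[:-1], down) + [down]
```

Labels may appear more than once in the resulting lists; this is harmless here because rule r127 only tests membership with ELEM_IN_LIST.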
RULE r121:
% label ::= integer_number
STATIC
   TRANSFER at_value;
   CONDITION ELEM_IN_LIST (integer_number.at_value,
                           INCLUDING block.at_declared_labels)
   MESSAGE "LABEL IS NOT DECLARED";
   CONDITION f_elem_unique (integer_number.at_value,
                            INCLUDING block.at_occurring_labels)
   MESSAGE "LABEL OCCURS MORE THAN ONCE"
END;

RULE r122:
% statement ::= label ':' statement
STATIC
   statement[1].at_labels := tp_labels (label.at_value);
   CONDITION EMPTY (statement[2].at_labels)
   MESSAGE "MORE THAN ONE LABEL";
   statement[2].at_visible_labels :=
      statement[1].at_labels + statement[1].at_visible_labels
END;

RULE r123:
% statement ::= var_denot ':=' expr
STATIC
   statement.at_labels := tp_labels ();
   CONDITION f_assignment_compatible (var_denot.at_type,
                                      f_reduce (expr.at_type))
   MESSAGE "INCOMPATIBLE TYPES IN ASSIGNMENT"
END;
RULE r124:
% statement ::= identifier ':=' expr
STATIC
   statement.at_labels := tp_labels ();
   CONDITION f_assignment_compatible
      (CASE identifier.at_def.s_type OF
       IS tp_proc :
          (THIS.s_result CONDITION
              IF KEY_IN_LIST (identifier.at_def.s_name,
                              INCLUDING block.at_nested_procs)
              THEN SELECT_BY_KEY (identifier.at_def.s_name,
                                  INCLUDING block.at_nested_procs).s_level
                   = identifier.at_def.s_level
              ELSE FALSE
              FI
           MESSAGE "RESULT ASSIGNMENT NOT ALLOWED")
       OUT (THIS CONDITION f_is_variable_access (identifier.at_def.s_kind)
            MESSAGE "LEFTHAND SIDE MUST BE VARIABLE")
       ESAC,
       f_reduce (expr.at_type))
   MESSAGE "INCOMPATIBLE TYPES IN ASSIGNMENT"
END;

RULE r125:
% statement ::= call_with_params
STATIC
   statement.at_labels := tp_labels ();
   CONDITION (call_with_params.at_type = c_void) OR
             (call_with_params.at_type = c_err)
   MESSAGE "FUNCTION CALL NOT ALLOWED"
END;

RULE r126:
% statement ::= identifier
STATIC
   statement.at_labels := tp_labels ();
   CONDITION
      CASE identifier.at_def.s_type OF
      IS tp_err : TRUE;
      IS tp_proc :
         ((THIS.s_result = c_void) AND EMPTY (THIS.s_params)) OR
         CASE identifier.at_def.s_name OF
         'READLN' : 'WRITELN' : 'PAGE' : identifier.at_def.s_level = 0
         OUT FALSE
         ESAC
      OUT FALSE
      ESAC
   MESSAGE "MUST BE PROCEDURE WITHOUT PARAMETERS";
   CONDITION
      IF identifier.at_def.s_level = 0
      THEN CASE identifier.at_def.s_name OF
           'READLN' : KEY_IN_LIST ('INPUT', INCLUDING program.at_params);
           'WRITELN' : 'PAGE' :
              KEY_IN_LIST ('OUTPUT', INCLUDING program.at_params)
           OUT TRUE
           ESAC
      ELSE TRUE
      FI
   MESSAGE "IMPLICIT USE OF UNDEFINED STANDARD FILE";
END;

RULE r127:
% statement ::= 'GOTO' integer_number
STATIC
   statement.at_labels := tp_labels ();
   CONDITION ELEM_IN_LIST (integer_number.at_value,
                           statement.at_visible_labels)
   MESSAGE "LABEL IS NOT VISIBLE"
END;

RULE r128:
% statement ::=
STATIC
   statement.at_labels := tp_labels ()
END;
RULE r129:
% statement ::= 'BEGIN' statement_seq 'END'
STATIC
   TRANSFER at_visible_labels;
   statement.at_labels := tp_labels ();
END;

RULE r130:
% statement ::= 'CASE' expr 'OF' case_limbs [';'] 'END'
STATIC
   TRANSFER at_visible_labels WITH case_limbs;
   statement.at_labels := tp_labels ();
   case_limbs.at_label_type := f_reduce (expr.at_type);
   case_limbs.at_case_labels_in := tp_labels ()
END;

RULE r131:
% statement ::= 'WHILE' expr 'DO' statement
STATIC
   TRANSFER at_visible_labels;
   statement[1].at_labels := tp_labels ();
   CONDITION f_reduce (expr.at_type) = c_bool
   MESSAGE "CONDITION MUST BE BOOLEAN"
END;

RULE r132:
% statement ::= 'REPEAT' statement_seq 'UNTIL' expr
STATIC
   TRANSFER at_visible_labels;
   statement.at_labels := tp_labels ();
   CONDITION f_reduce (expr.at_type) = c_bool
   MESSAGE "CONDITION MUST BE BOOLEAN"
END;
RULE r133:
% statement ::= 'FOR' identifier ':=' expr TO_DOWNTO expr 'DO' statement
STATIC
   TRANSFER at_visible_labels;
   statement[1].at_labels := tp_labels ();
   CONDITION f_compatible_types (identifier.at_def.s_type,
                                 f_reduce (expr[1].at_type))
   MESSAGE "TYPE OF INITIAL VALUE IS NOT CORRECT";
   CONDITION f_compatible_types (identifier.at_def.s_type,
                                 f_reduce (expr[2].at_type))
   MESSAGE "TYPE OF LAST VALUE IS NOT CORRECT";
   CONDITION identifier.at_def.s_kind = sc_var
   MESSAGE "IDENTIFIER MUST BE VARIABLE";
   CONDITION f_is_ordinal (identifier.at_def.s_type)
   MESSAGE "TYPE OF IDENTIFIER MUST BE ORDINAL";
   CONDITION identifier.at_def.s_level = (INCLUDING block.at_level)
   MESSAGE "IDENTIFIER MUST BE LOCAL"
END;

RULE r134:
% statement ::= 'WITH' variable WITH_clause
STATIC
   CONDITION f_is_variable_access (variable.at_kind)
   MESSAGE "INSPECTIONS ALLOWED FOR VARIABLES ONLY";
   TRANSFER at_visible_labels;
   statement.at_labels := tp_labels ();
   WITH_clause.at_env :=
      tp_environ (tp_block (f_is_tp_record (variable.at_type).s_all_fields)) +
      (INCLUDING (block.at_env, WITH_clause.at_env));
   WITH_clause.at_complete_env :=
      tp_environ (tp_block (CASE variable.at_type OF
                            IS tp_record : THIS.s_all_fields
                            OUT tp_defs ()
                            ESAC)) +
      (INCLUDING (block.at_complete_env, WITH_clause.at_complete_env))
END;
RULE r135:
% statement ::= 'IF' expr 'THEN' statement [ 'ELSE' statement ]
STATIC
   TRANSFER at_visible_labels;
   statement[1].at_labels := tp_labels ();
   CONDITION f_reduce (expr.at_type) = c_bool
   MESSAGE "CONDITION MUST BE BOOLEAN"
END;

RULE r136:
% WITH_clause ::= ',' variable WITH_clause
STATIC
   CONDITION f_is_variable_access (variable.at_kind)
   MESSAGE "INSPECTION ALLOWED FOR VARIABLES ONLY";
   TRANSFER at_visible_labels;
   WITH_clause[2].at_env :=
      tp_environ (tp_block (f_is_tp_record (variable.at_type).s_all_fields)) +
      WITH_clause[1].at_env;
   WITH_clause[2].at_complete_env :=
      tp_environ (tp_block (CASE variable.at_type OF
                            IS tp_record : THIS.s_all_fields
                            OUT tp_defs ()
                            ESAC)) +
      WITH_clause[1].at_complete_env
END;

RULE r137:
% WITH_clause ::= 'DO' statement
STATIC
   TRANSFER at_visible_labels
END;
RULE r138:
% case_limbs ::= case_limbs ';' case_limb
STATIC
   TRANSFER at_label_type, at_visible_labels;
   TRANSFER at_case_labels_in WITH case_limbs[2];
   case_limb.at_case_labels_in := case_limbs[2].at_case_labels_out;
   TRANSFER at_case_labels_out WITH case_limb
END;

RULE r139:
% case_limbs ::= case_limb
STATIC TRANSFER END;

RULE r140:
% case_limb ::= ( case_label // ',' ) ':' statement
STATIC
   TRANSFER at_label_type, at_case_labels_in WITH case_label;
   TRANSFER at_visible_labels WITH statement;
   case_limb.at_case_labels_out :=
      case_limb.at_case_labels_in +
      (case_label.at_value CONDITION UNIQUE_ELEMS (IT)
                           MESSAGE "DUPLICATE CASE LABELS");
   case_label.at_defs_in := tp_defs ();
   case_label.at_glob_env :=
      INCLUDING (block.at_env, WITH_clause.at_env);
   case_label.at_complete_env :=
      INCLUDING (block.at_complete_env, WITH_clause.at_complete_env);
   case_label.at_in_statement := TRUE
END;
RULE r141:
% case_label ::= constant
STATIC
   case_label.at_value := constant.at_value;
   CONDITION NOT ELEM_IN_LIST (constant.at_value,
                               case_label.at_case_labels_in)
   MESSAGE "CASE LABEL OCCURS BEFORE";
   CONDITION IF case_label.at_in_statement
             THEN case_label.at_label_type = constant.at_type
             ELSE f_compatible_types (case_label.at_label_type,
                                      constant.at_type)
             FI
   MESSAGE "WRONG TYPE OF CASE LABEL"
END;

RULE r142:
% TO_DOWNTO ::= 'DOWNTO'
STATIC TO_DOWNTO.at_is_TO := FALSE END;

RULE r143:
% TO_DOWNTO ::= 'TO'
STATIC TO_DOWNTO.at_is_TO := TRUE END;
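The case-label rules r138, r140 and r141 thread a list of already seen labels through the case limbs (at_case_labels_in / at_case_labels_out). The following Python sketch shows the same idea in a reduced form; it is an illustration under assumptions, not the generated analyzer: a case statement is a hypothetical list of limbs, each limb a list of label values, and the type checks of r141 are omitted.

```python
# Model of the label threading in r138/r140/r141 (type checks omitted).


def check_case_labels(limbs):
    """Return the messages produced for a list of limbs (lists of labels)."""
    seen = []        # plays the role of at_case_labels_in / _out
    messages = []
    for limb in limbs:
        for value in limb:
            # r141: every label of the limb is compared with the labels
            # of the preceding limbs only (all get the same labels_in).
            if value in seen:
                messages.append("CASE LABEL OCCURS BEFORE")
        # r140: duplicates inside one limb are caught by UNIQUE_ELEMS.
        if len(set(limb)) != len(limb):
            messages.append("DUPLICATE CASE LABELS")
        seen = seen + limb
    return messages
```

The split into two checks follows the grammar: labels within one limb all receive the same inherited list, so they cannot detect each other through r141.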
Appendix B: Results of the Usage of GAG processing the AG of Appendix A

Overview:
B.1: Output (header) of the protocol pass
B.2: Selected output of the dependency analysis
B.3: Output protocols of optimizations
     1. Elimination of semantic chain productions
     2. Attribute optimization
B.4: Protocol of an execution of the generated analyzer
B.5: Output of the performance measurement tool for the generated analyzer (Input as for B.4)

The result protocols are produced automatically by the GAG-System. Detailed information about these facilities can be found in [GAG81]. The control input necessary to produce B.1 up to B.3 is the following:

$EXEC $ALL;
$ELIMINATE $WITH_INFORMATION;        % produces B.3, 1
$OPTIMIZE $WITH_INFORMATION;         % produces B.3, 2
$PROTOCOL $NOT $INPUTTEXT;           % produces B.1
$GRAPH $SYMBOL block $INDUCED;       % produces B.2
$GRAPH $SYMBOL block $PARTITIONED;
$VISIT $RULE r1, r4
B.1: Output (header) of the protocol pass

* GAG GENERATOR BASED ON ATTRIBUTED GRAMMARS
*
* VERSION  : 82-05-10
* EXECUTED : 05/13/82 09:44:48
*
* NUMBER OF MESSAGES:
*    INFORMATIONS  : 68 (I)
*    WARNINGS      :    (W)
*    ERRORS        :    (E)
*    SYSTEM ERRORS :    (S)
*    SYSTEM LIMITS :    (L)
*
* TIME FOR SCANNING                       :  6 SEC 799 MSEC
* TIME FOR SYNTAX ANALYSIS                :  8 SEC 994 MSEC
* TIME FOR DEFINITION TABLE CONSTRUCTION  :  2 SEC 263 MSEC
* TIME FOR SEMANTIC ANALYSIS              : 11 SEC 739 MSEC
* TIME FOR EXPANSION                      : 11 SEC 567 MSEC
* TIME FOR CHAIN RULE ELIMINATION         :  7 SEC 995 MSEC
* TIME FOR DEPENDENCY ANALYSIS            : 20 SEC 317 MSEC
* TIME FOR ATTRIBUTE OPTIMIZATION         : 14 SEC 904 MSEC
* TIME FOR TABLE TRANSFORMATION           :  7 SEC 977 MSEC
* TIME FOR GENERATION OF SYNTAX INTERFACE :  5 SEC  77 MSEC
* TIME FOR TRANSLATION OF DEFINITIONS     :  4 SEC 586 MSEC
* TIME FOR TRANSLATION OF ACTIONS         : 22 SEC 161 MSEC

MESSAGES:
NUMBER OF INPUT LINES: 2997
ROOT OF THE CONTEXT-FREE GRAMMAR: program
NUMBER OF ELIMINATED CHAIN RULES: 18
THE AG IS ORDERED, MAX. NUMBER OF SYMBOL VISITS = 4
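The phase times of the protocol header can be added up to obtain the total generation time. The following Python lines are only a worked calculation on the figures printed in B.1; the short phase names are abbreviations chosen here, and the two values printed as "ii" in the scan are read as 11, which is an assumption.

```python
# Phase times from B.1 as (phase, seconds, milliseconds).
PHASES = [
    ("SCANNING", 6, 799),
    ("SYNTAX ANALYSIS", 8, 994),
    ("DEFINITION TABLE CONSTRUCTION", 2, 263),
    ("SEMANTIC ANALYSIS", 11, 739),
    ("EXPANSION", 11, 567),
    ("CHAIN RULE ELIMINATION", 7, 995),
    ("DEPENDENCY ANALYSIS", 20, 317),
    ("ATTRIBUTE OPTIMIZATION", 14, 904),
    ("TABLE TRANSFORMATION", 7, 977),
    ("GENERATION OF SYNTAX INTERFACE", 5, 77),
    ("TRANSLATION OF DEFINITIONS", 4, 586),
    ("TRANSLATION OF ACTIONS", 22, 161),
]

# Total time in milliseconds and the most expensive phase.
total_ms = sum(sec * 1000 + ms for _, sec, ms in PHASES)
slowest = max(PHASES, key=lambda p: p[1] * 1000 + p[2])[0]

print(f"total: {total_ms / 1000:.3f} s, slowest phase: {slowest}")
```

With these readings the twelve phases sum to 124.379 seconds, of which the dependency analysis and the translation of actions together take roughly a third.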
B.2: Selected output of the dependency analysis

GAG DEPENDENCY ANALYSIS 82-05-11
EXECUTED 05/12/82 12:18:54

TRANSITIVE CLOSURE OF GRAPHS FOR INDUCED DEPENDENCIES

INDUCED GRAPH FOR SYMBOL block IN LINE 530000
ATNO CLASS PART ATTR                  DEPENDS ON
  0  INH    0   at_level              [ ]
  1  INH    0   at_glob_env           [ ]
  2  INH    0   at_nested_procs       [ ]
  3  INH    0   at_params             [ ]
  4  SYNT   0   at_upto_type_defs     [ 0 1 3 ]
  5  SYNT   0   at_loc_defs           [ 0 1 3 4 ]
  6  SYNT   0   at_env                [ 0 1 3 4 5 ]
  7  SYNT   0   at_complete_env       [ 0 1 3 4 5 12 ]
  8  SYNT   0   at_declared_labels    [ ]
  9  SYNT   0   at_occurring_labels   [ ]
 10  INH    0   at_visible_labels     [ ]
 11  INH    0   #580                  [ ]
 12  INH    0   #599                  [ ]

TRANSITIVE CLOSURE OF GRAPHS FOR PARTITION-DEPENDENCIES

PARTITIONED GRAPH FOR SYMBOL block IN LINE 530000
ATNO CLASS PART ATTR                  DEPENDS ON
  0  INH    2   at_level              [ ]
  1  INH    2   at_glob_env           [ ]
  2  INH    2   at_nested_procs       [ ]
  3  INH    2   at_params             [ ]
  4  SYNT   1   at_upto_type_defs     [ 0 1 2 3 10 11 12 ]
  5  SYNT   1   at_loc_defs           [ 0 1 2 3 4 10 11 12 ]
  6  SYNT   1   at_env                [ 0 1 2 3 4 5 10 11 12 ]
  7  SYNT   1   at_complete_env       [ 0 1 2 3 4 5 10 11 12 ]
  8  SYNT   1   at_declared_labels    [ 0 1 2 3 9 10 11 12 ]
  9  SYNT   1   at_occurring_labels   [ 0 1 2 3 10 11 12 ]
 10  INH    2   at_visible_labels     [ ]
 11  INH    2   #580                  [ ]
 12  INH    2   #599                  [ ]
ATTRIBUTE PARTITIONS

SYMBOL block
BEFORE 1. VISIT: at_level at_glob_env at_nested_procs at_params
                 at_visible_labels #580 #599
AFTER 1. VISIT:  at_upto_type_defs at_loc_defs at_env at_complete_env
                 at_declared_labels at_occurring_labels

VISIT-SEQUENCES
IT IS ASSUMED THAT ATTRIBUTE EVALUATION STARTS WHEN THE TREE IS COMPLETED.

VISIT-SEQUENCE FOR RULE r1 IN LINE 920000
program ::= name prog_params block
NO. KIND  SYMBNO VISITNO SYMBOL OR ATTRIBUT            [ PRECONDITION ]
  1 VISIT   2      1     prog_params                   [ ]
                         EVALUATES: at_defs
  2 EVAL    0            program.at_params             [ 1 ]
  3 EVAL    0            program.at_complete_env       [ ]
  4 EVAL    3            block.at_level                [ ]
  5 EVAL    3            block.at_glob_env             [ ]
  6 EVAL    3            block.at_nested_procs         [ ]
  7 EVAL    3            block.at_params               [ 2 ]
  8 EVAL    3            block.at_visible_labels       [ ]
  9 EVAL    3            block.#580                    [ 2 ]
 10 EVAL    3            block.#599                    [ 3 ]
 11 VISIT   3      1     block                         [ 4 5 6 7 8 9 10 ]
                         EVALUATES: at_upto_type_defs at_loc_defs at_env
                                    at_complete_env at_declared_labels
                                    at_occurring_labels
 12 EVAL    0            program.at_loc_defs           [ 11 ]
 13 EVAL    2            prog_params.#600              [ 12 ]
 14 VISIT   2      2     prog_params                   [ 1 13 ]
                         EVALUATES: at_names
 15 LEAVE   0      1     TO ANCESTOR                   [ ALL ELEMENTS ]
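A visit-sequence is a linear schedule: each operation lists the numbers of the operations that must have been carried out before it. The following Python sketch encodes the visit-sequence for rule r1 as read from the listing above and checks that every precondition refers to an earlier step. The tuple layout and the function name are assumptions made for this illustration; they are not GAG data structures.

```python
# Visit-sequence for rule r1: (number, kind, target, preconditions).
# None as precondition stands for "ALL ELEMENTS".
R1_SEQUENCE = [
    (1, "VISIT", "prog_params", []),
    (2, "EVAL", "program.at_params", [1]),
    (3, "EVAL", "program.at_complete_env", []),
    (4, "EVAL", "block.at_level", []),
    (5, "EVAL", "block.at_glob_env", []),
    (6, "EVAL", "block.at_nested_procs", []),
    (7, "EVAL", "block.at_params", [2]),
    (8, "EVAL", "block.at_visible_labels", []),
    (9, "EVAL", "block.#580", [2]),
    (10, "EVAL", "block.#599", [3]),
    (11, "VISIT", "block", [4, 5, 6, 7, 8, 9, 10]),
    (12, "EVAL", "program.at_loc_defs", [11]),
    (13, "EVAL", "prog_params.#600", [12]),
    (14, "VISIT", "prog_params", [1, 13]),
    (15, "LEAVE", "TO ANCESTOR", None),
]


def preconditions_respected(sequence):
    """True if every step only depends on steps executed before it."""
    done = set()
    for number, _kind, _target, pre in sequence:
        required = set(done) if pre is None else set(pre)
        if not required <= done:
            return False
        done.add(number)
    return True
```

The same check applied to a reordered list fails as soon as an operation appears before one of the operations it depends on.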
VISIT-SEQUENCE FOR RULE r4 IN LINE 1480000
block ::= label_decl_part const_def type_def var_decl_part proc_decl_part statement
NO. KIND  SYMBNO VISITNO SYMBOL OR ATTRIBUT                  [ PRECONDITION ]
  1 EVAL    2            const_def.at_defs_in                [ ]
  2 EVAL    2            const_def.#577                      [ ]
  3 EVAL    2            const_def.#596                      [ ]
  4 VISIT   2      1     const_def                           [ 1 2 3 ]
                         EVALUATES: at_defs_out
  5 EVAL    3            type_def.at_defs_in                 [ 4 ]
  6 EVAL    3            type_def.#577                       [ ]
  7 EVAL    3            type_def.#590                       [ ]
  8 EVAL    3            type_def.#594                       [ ]
  9 EVAL    3            type_def.#596                       [ ]
 10 VISIT   3      1     type_def                            [ 5 6 7 8 9 ]
                         EVALUATES: at_defs_out
 11 EVAL    0            block.at_upto_type_defs             [ 10 ]
 12 EVAL    3            type_def.#592                       [ 11 ]
 13 VISIT   3      2     type_def                            [ 10 12 ]
                         EVALUATES: #603
 14 EVAL    4            var_decl_part.at_defs_in            [ 10 ]
 15 EVAL    4            var_decl_part.#577                  [ ]
 16 EVAL    4            var_decl_part.#590                  [ ]
 17 EVAL    4            var_decl_part.#592                  [ 11 ]
 18 EVAL    4            var_decl_part.#594                  [ ]
 19 EVAL    4            var_decl_part.#596                  [ ]
 20 VISIT   4      1     var_decl_part                       [ 14 15 16 17 18 19 ]
                         EVALUATES: at_defs_out #603
 21 EVAL    5            proc_decl_part.at_defs_in           [ 13 20 ]
 22 EVAL    5            proc_decl_part.#594                 [ ]
 23 EVAL    5            proc_decl_part.#596                 [ ]
 24 VISIT   5      1     proc_decl_part                      [ 21 22 23 ]
                         EVALUATES: at_defs_out
 25 EVAL    0            block.at_loc_defs                   [ 24 ]
 26 EVAL    0            block.at_complete_env               [ 25 ]
 27 EVAL    2            const_def.#576                      [ 26 ]
 28 VISIT   2      2     const_def                           [ 4 27 ]
                         EVALUATES:
 29 EVAL    3            type_def.#576                       [ 26 ]
 30 EVAL    3            type_def.#591                       [ 26 ]
 31 EVAL    3            type_def.#593                       [ 26 ]
 32 VISIT   3      3     type_def                            [ 13 29 30 31 ]
                         EVALUATES:
 33 EVAL    4            var_decl_part.#576                  [ 26 ]
 34 EVAL    4            var_decl_part.#591                  [ 26 ]
 35 EVAL    4            var_decl_part.#593                  [ 26 ]
 36 VISIT   4      2     var_decl_part                       [ 20 33 34 35 ]
                         EVALUATES:
 37 VISIT   6      1     statement                           [ ]
                         EVALUATES: at_labels #602
 38 EVAL    0            block.at_occurring_labels           [ 37 ]
 39 EVAL    1            label_decl_part.#597                [ 38 ]
 40 VISIT   1      1     label_decl_part                     [ 39 ]
                         EVALUATES: at_labels
 41 EVAL    5            proc_decl_part.at_visible_labels    [ 37 ]
 42 EVAL    5            proc_decl_part.#582                 [ ]
 43 EVAL    5            proc_decl_part.#583                 [ 26 ]
 44 EVAL    5            proc_decl_part.#584                 [ 25 ]
 45 EVAL    5            proc_decl_part.#585                 [ ]
 46 EVAL    5            proc_decl_part.#599                 [ 26 ]
 47 VISIT   5      2     proc_decl_part                      [ 24 41 42 43 44 45 46 ]
                         EVALUATES:
 48 EVAL    0            block.at_env                        [ 25 ]
 49 EVAL    0            block.at_declared_labels            [ 40 ]
 50 EVAL    6            statement.at_visible_labels         [ 41 ]
 51 EVAL    6            statement.#575                      [ 49 ]
 52 EVAL    6            statement.#579                      [ 26 ]
 53 EVAL    6            statement.#580                      [ ]
 54 EVAL    6            statement.#581                      [ 48 ]
 55 EVAL    6            statement.#585                      [ ]
 56 EVAL    6            statement.#596                      [ ]
 57 EVAL    6            statement.#597                      [ 38 ]
 58 VISIT   6      2     statement                           [ 37 50 51 52 53 54 55 56 57 ]
                         EVALUATES:
 59 LEAVE   0      1     TO ANCESTOR                         [ ALL ELEMENTS ]
CHARACTERISTICS OF THE AG:

THE AG IS ORDERED, MAX. NUMBER OF SYMBOL VISITS = 4

NUMBER OF DEFINED SYMBOLS:            50
NUMBER OF CONTEXT-FREE RULES:        125
LENGTH OF CONTEXT-FREE GRAMMAR:      297
MAXIMAL LENGTH OF CF RULES:            7
MINIMAL LENGTH OF CF RULES:            1
AVERAGE LENGTH OF CF RULES:          2.4

ATTRIBUTES (TOTAL):                  320
ATTRIBUTES (ORIGINAL):               127
ATTRIBUTES (FOR CONSTITUENT):         13
ATTRIBUTES (FOR INCLUDING):          180

MAXIMAL ATTRIBUTES OF SYMBOLS:        20
E. G. FOR SYMBOL:                     variant
MINIMAL ATTRIBUTES OF SYMBOLS:         0
AVERAGE ATTRIBUTES OF SYMBOLS:       6.4

MAXIMAL ATTRIBUTES OF RULES:          62
MINIMAL ATTRIBUTES OF RULES:           0
AVERAGE ATTRIBUTES OF RULES:        15.6
ATTRIBUTE OCCURRENCES (TOTAL):      1953

CONDITIONS (IN RULES):                79
ATTRIBUTE RULES (TOTAL):             945
ATTRIBUTE RULES (ORIGINAL):          196
ATTRIBUTE RULES (FOR INCLUDING):     599
ATTRIBUTE RULES (FOR CONSTITUENT):    45
ATTRIBUTE RULES (FOR TRANSFER):      105

ATTRIBUTE RULES OF THE FORM X.A:=Y.B
ORIGINAL:                             34
EXPANDED:                            727

SYMBOLS VISITED (FROM ANCESTOR) 1 TIMES:
program, block, label_decl_part, VAR_sym, PACKED_sym, PROC_FUNCT, identifier, format, act_params, var_denot, element, subscript, rel_opr, add_opr, mul_opr, TO_DOWNTO,

SYMBOLS VISITED (FROM ANCESTOR) 2 TIMES:
prog_params, prog_param, const_def, var_decl_part, var_decl, proc_decl_part, label_decl, label, proc_decl, type_identifier, enumeration, tag_field, form_parm_sect, proc_type, case_label, WITH_clause, constant_ident, unsigned_const, statement, case_limb,

SYMBOLS VISITED (FROM ANCESTOR) 3 TIMES:
type_def, pointer_type, type_denoter, record_type, field_list, record_section, variant_part, variant,

SYMBOLS VISITED (FROM ANCESTOR) 4 TIMES:
subscr_tp_list,
B.3: Output protocols of optimizations

1. Elimination of semantic chain productions

*** GAG ELIMINATION OF SEMANTIC CHAIN PRODUCTIONS ***

NUMBER OF SEMANTIC CHAIN PRODUCTIONS: 18

TABLE OF ELIMINATED RULES
RULE   * POSITION (LINE, COLUMN)
*************************************
r9     * (  2270000, 6 )
r10    * (  2310000, 6 )
r14    * (  2670000, 6 )
r15    * (  2710000, 6 )
r21    * (  3160000, 6 )
r50    * ( 11150000, 6 )
r56    * ( 15010000, 6 )
r66    * ( 19920000, 6 )
r67    * ( 19960000, 6 )
r68    * ( 20000000, 6 )
r74    * ( 20680000, 6 )
r76    * ( 20850000, 6 )
r84    * ( 21960000, 6 )
r89    * ( 22850000, 6 )
r92    * ( 23130000, 6 )
r97    * ( 23910000, 6 )
r119   * ( 26420000, 6 )
r139   * ( 29400000, 6 )

NUMBER OF ELIMINATED SYMBOL DEFINITIONS:     18
NUMBER OF ELIMINATED ATTRIBUTE DEFINITIONS:
   ORIGINAL                                  49
   CONSTITUENT(S)                             6
   INCLUDING                                 86
   TOTAL:                                   141

SYMBOLS WITH NEW IDENTIFIERS
OLD IDENTIFIER      * NEW IDENTIFIER
const_def_part        const_def
const_def_list        const_def
type_def_part         type_def
type_def_list         type_def
var_decl_list         var_decl
statement_seq         statement
constant              unsigned_const
variants              variant
form_params           form_parm_sect
call_with_params      var_denot
expr                  var_denot
simp_expr             var_denot
term                  var_denot
factor                var_denot
variable              var_denot
element_list          element
subscript_list        subscript
case_limbs            case_limb

2. Attribute optimization

*** GAG OPTIMIZER RESULTS ***

********** RESULTS OF ATTRIBUTE OPTIMIZATION: **********
  44 ATTR. ARE GLOBAL VARIABLES
  37 ATTR. ARE GLOBAL STACKS
   1 OF THESE ATTRIBUTES NEVER USED
  53 ATTR. ARE NODE COMPONENTS FOR THE FOLLOWING REASONS:
     16 ATTR. HAVE TOO LONG LIFETIMES
      7 ATTR. BELONG TO SYMBOLS OCCURRING IN SYNTACTIC LISTS
     30 ATTR. ARE ACCESSED BY INCLUDING WHICH WILL NOT BE EXPANDED
MOREOVER THERE ARE
   0 INTERFACE (TO) ATTRIBUTES
   6 INTERFACE (FROM) ATTRIBUTES
   0 CONSTANT ATTRIBUTES
 180 ATTR. ARE IGNORED INCLUDING EXPANSIONS

******* RESULTS OF GROUPING: *******
GROUPING SEPARATES STACKS AND VARIABLES!
  11 ATTRIBUTE VARIABLES
  18 ATTRIBUTE STACKS

********* RESULTS OF CHANGES IN VISIT-SEQUENCES: *********
 128 PUSH OPERATIONS INSERTED
 150 POP OPERATIONS INSERTED
  65 IDENTICAL ASSIGNS ELIMINATED
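The figures of the two optimization protocols can be cross-checked against each other and against the attribute total of 320 reported in the grammar characteristics of B.2. The following Python lines are only an arithmetic check on the printed numbers; the dictionary keys are shorthand names chosen for this illustration.

```python
# Storage classes reported by the attribute optimizer (B.3, part 2).
storage_classes = {
    "global variables": 44,
    "global stacks": 37,
    "node components": 53,
    "interface (to)": 0,
    "interface (from)": 6,
    "constant": 0,
    "ignored INCLUDING expansions": 180,
}

# Reasons given for the 53 node components.
node_component_reasons = {
    "too long lifetimes": 16,
    "symbols in syntactic lists": 7,
    "INCLUDING not expanded": 30,
}

# Attribute definitions removed by chain production elimination (part 1).
eliminated = {"original": 49, "constituent": 6, "including": 86}

total_attributes = sum(storage_classes.values())
print(total_attributes, sum(node_component_reasons.values()),
      sum(eliminated.values()))
```

All three sums agree with the totals printed in the protocols (320 attributes, 53 node components, 141 eliminated attribute definitions), which suggests that every attribute is assigned to exactly one storage class.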
B.4: Protocol of an execution of the generated analyzer

************************************************************************
*  GAG
*  PROTOCOL OF THE GENERATED COMPILER
*
*  DATE : 82-05-13
*
*  NUMBER OF MESSAGES:
*     INFORMATIONS  :  0 (I)
*     ERRORS        : 13 (E)
*     SYSTEM ERRORS :  0 (S)
*     SYSTEM LIMITS :  0 (L)
************************************************************************
(* Demonstration program for the generated analyzer *)
PROGRAM SCANNER ( INFILE, TOKENFILE, ERRORFILE );
*** E : IDENTIFIER NOT DECLARED
(* ....... *)
LABEL 1;
*** E : DECLARED LABEL DOES NOT OCCUR
CONST ID = 1; NUM = 2; EQ = 3; ERR = 4;
      EOFCODE = 0; MAXSTRING = 1000;
TYPE  RANGE = EOFCODE .. ERR;
      TOKEN = RECORD
                CASE CODE: RANGE OF
*** E : THE SET OF CASE LABELS IS NOT COMPLETE
                  ID:  ( NAME: INTEGER );
                  NUM: ( VAL: INTEGER );
                  EQ, ERR: ()
              END;
      LINE = ARRAY [LINERANGE] OF CHAR;
*** E : TYPE IDENTIFIER NOT DECLARED
*** E : IDENTIFIER APPLIED BEFORE DEFINITION
      LINERANGE = 1..80;
VAR   STRINGTABLE: ARRAY [1..MAXSTRING] OF CHAR;
      LASTENTRY:   1..MAXSTRING;
      INLINE:      LINE;
      T:           TOKEN;
      I:           INTEGER;
      TOKENFILE:   FILE OF TOKEN;
      INFILE:      FILE OF LINE;

FUNCTION ENCODE ( FIRSTCOL: LINERANGE ): INTEGER;
(* VAR I: INTEGER; *)
BEGIN
  I := 1;
  WHILE I < LASTENTRY DO
    IF STRINGTABLE[I] = LINE[FIRSTCOL]
*** E : MUST BE VARIABLE
    THEN IF STRINGTABLE[I+1] = INLINE[FIRSTCOL+1]
         THEN ENCODE := I
         ELSE I := I + 1
    ELSE I := I + 1;
  IF I >= LASTENTRY
  THEN IF LASTENTRY = MAXSTRING
       THEN GOTO 1
*** E : LABEL IS NOT VISIBLE
       ELSE BEGIN
              STRINGTABLE[LASTENTRY]   := INLINE[FIRSTCOL];
              STRINGTABLE[LASTENTRY+1] := INLINE[FIRSTCOL+1];
              ENCODE := LASTENTRY;
              LASTENTRY := LASTENTRY + 2;
            END;
END; (*ENCODE*)

BEGIN (* PROGRAM BODY *)
  RESET(INFILE); REWRITE(TOKENFILE); LASTENTRY := 0;
  WHILE NOT EOF(INFILE) DO
  BEGIN
    INLINE := INFILE@; GET(INFILE);
    I := 1;
    WHILE I <= 80 DO
    BEGIN
      IF INLINE[I,J] = ' '
*** E : TOO MANY INDICES
*** E : IDENTIFIER NOT DECLARED
      THEN I := I + 1
      ELSE IF INLINE[I] IN ['A'..'Z', '0'..'9', '=']
      THEN CASE INLINE[I] OF
           'A','B','C','D','E','F','G','H','I',
           'J','K','L','M','N','O','P','Q','R',
           'S','T','U','V','W','X','Y','Z':
             BEGIN
               T.CODE := ID;
               T.NAME := ENCODE(I,ID);
*** E : WRONG PARAMETER
               I := I + 2;
               TOKENFILE@ := T; PUT(TOKENFILE);
             END;
           '0','1','2','3','4','5','6','7','8','9':
             BEGIN
               T.CODE := NUM;
               T.VALUE := ORD(INLINE[I]-ORD('0'));
*** E : FIELD IDENTIFIER NOT DECLARED
*** E : OPERATION NEITHER ARITHMETIC NOR SET
*** E : INCOMPATIBLE OPERAND TYPES
               I := I + 1;
               TOKENFILE@ := T; PUT(TOKENFILE);
             END;
           '=':
             BEGIN
               TOKENFILE@.CODE := EQ; PUT(TOKENFILE);
               I := I + 1
             END;
           END (* CASE *)
      ELSE BEGIN
             TOKENFILE@.CODE := ERR; PUT(TOKENFILE);
             I := I + 1
           END;
    END; (* WHILE I *)
  END; (* WHILE NOT EOF *)
  TOKENFILE@.CODE := EOFCODE; PUT(TOKENFILE);
END.
B.5: Output of the performance measurement tool for the generated analyzer (Input as for B.4)

DATA OF COMPI.TEST
EXECUTED: 05/17/82 11:32:58

HEAP AFTER TREE CONSTRUCTION:   39944 BYTES
HEAP FOR INITIALIZATION:         2680 BYTES
HEAP FOR ANALYSIS:               7008 BYTES
>>>>>>>> HEAP TOTAL:            49632 BYTES

COMPUTED HEAP STORAGE USED:
ATTRIBUTE STORAGE NEEDED IN BYTES:       15020
DURING CONDITIONS TEMPORARILY NEEDED:      312 BYTES

DISTRIBUTION:
LIST LINK RECORDS OF 8 BYTES:      139 USED,  11 TEMPORARILY
LIST ELEMENTS OF 16 BYTES:         139 USED,  11 TEMPORARILY
STACK PORTIONS OF 192 BYTES:        19 USED,   0 TEMPORARILY
UNION/STRUCT POINTER GENERALLY:    166 USED,   2 TEMPORARILY

THE SEVERAL ALLOCATIONS ARE:
TYPES:
   7 TIMES 2 WORDS FOR tp_block
  69 TIMES 6 WORDS FOR tp_def
     ADDITIONALLY TEMPORARY 2 TIMES
  51 TIMES 2 WORDS FOR tp_type
   6 TIMES 2 WORDS FOR tp_choice
   3 TIMES 4 WORDS FOR tp_file
   2 TIMES 6 WORDS FOR tp_record
   3 TIMES 4 WORDS FOR tp_array
  16 TIMES 4 WORDS FOR tp_proc
   0 TIMES 2 WORDS FOR tp_enum
   4 TIMES 4 WORDS FOR tp_subrange
   0 TIMES 4 WORDS FOR tp_set
   1 TIMES 2 WORDS FOR tp_general_set
   0 TIMES 2 WORDS FOR tp_pointer
   1 TIMES 4 WORDS FOR tp_union
   3 TIMES 4 WORDS FOR tp_variant

NODES:
MIN. NODE LENGTH FOR NTs IS 6W, FOR Ts 3W, AND FOR LIST NODES AS FOR EMPTY NODES 2W.
THE TOTAL LENGTH DEPENDS ON THE ATTRIBUTES.

   1 TIMES program,          24 BYTES FOR ATTRIBUTES
   1 TIMES prog_params,       8 BYTES FOR ATTRIBUTES
   3 TIMES prog_param,       12 BYTES FOR ATTRIBUTES
   2 TIMES block,            68 BYTES FOR ATTRIBUTES
  12 TIMES const_def,         8 BYTES FOR ATTRIBUTES
   8 TIMES type_def
   2 TIMES var_decl_part
  14 TIMES var_decl
   3 TIMES proc_decl_part,    8 BYTES FOR ATTRIBUTES
   2 TIMES label_decl_part,   8 BYTES FOR ATTRIBUTES
   1 TIMES label_decl,        4 BYTES FOR ATTRIBUTES
   0 TIMES label
   2 TIMES VAR_sym
 152 TIMES name,              4 BYTES FOR ATTRIBUTES
   0 TIMES pointer_type,      4 BYTES FOR ATTRIBUTES
  28 TIMES integer_number,    4 BYTES FOR ATTRIBUTES
  20 TIMES type_denoter,     12 BYTES FOR ATTRIBUTES
   1 TIMES proc_decl,        12 BYTES FOR ATTRIBUTES
  14 TIMES type_identifier,   4 BYTES FOR ATTRIBUTES
   0 TIMES enumeration,       4 BYTES FOR ATTRIBUTES
   2 TIMES subscr_tp_list
   1 TIMES record_type,      20 BYTES FOR ATTRIBUTES
   4 TIMES field_list,        8 BYTES FOR ATTRIBUTES
   2 TIMES record_section
   1 TIMES variant_part,      8 BYTES FOR ATTRIBUTES
   1 TIMES tag_field,         4 BYTES FOR ATTRIBUTES
   5 TIMES variant,           4 BYTES FOR ATTRIBUTES
   0 TIMES PACKED_sym
   1 TIMES form_parm_sect,    8 BYTES FOR ATTRIBUTES
   1 TIMES proc_type,        12 BYTES FOR ATTRIBUTES
  41 TIMES case_label,       44 BYTES FOR ATTRIBUTES
   1 TIMES PROC_FUNCT
  96 TIMES identifier
   0 TIMES format
  13 TIMES act_params,        4 BYTES FOR ATTRIBUTES
   0 TIMES WITH_clause,      16 BYTES FOR ATTRIBUTES
 153 TIMES var_denot,         4 BYTES FOR ATTRIBUTES
   8 TIMES constant_ident,    4 BYTES FOR ATTRIBUTES
  55 TIMES unsigned_const,    8 BYTES FOR ATTRIBUTES
   5 TIMES element,           4 BYTES FOR ATTRIBUTES
  14 TIMES subscript
  44 TIMES character,         4 BYTES FOR ATTRIBUTES
   0 TIMES real_number,       4 BYTES FOR ATTRIBUTES
   0 TIMES string,            8 BYTES FOR ATTRIBUTES
   8 TIMES rel_opr
  13 TIMES add_opr
   0 TIMES mul_opr
  90 TIMES statement,         8 BYTES FOR ATTRIBUTES
   5 TIMES case_limb
   0 TIMES TO_DOWNTO
TOTAL NUMBER OF (NT AND T) NODES:     830
ADDITIONAL NUMBER OF "LIST" NODES:     19
ADDITIONAL NUMBER OF EMPTY NODES:      21

THE PURE TREE NEEDS 17552 BYTES, PLUS 5332 BYTES FOR ATTRIBUTE ANCHORS
FOR 323 SEM. CHAIN PRODUCTIONS NO NODE WAS GENERATED

DISTRIBUTION OF RULES:
   1 TIMES r1       1 TIMES r2       3 TIMES r3       2 TIMES r4
   1 TIMES r5       1 TIMES r6       1 TIMES r7       1 TIMES r8
   5 TIMES r11      6 TIMES r12      1 TIMES r13      3 TIMES r16
   4 TIMES r17      2 TIMES r19      2 TIMES r20      6 TIMES r22
   8 TIMES r23      2 TIMES r24      1 TIMES r25     11 TIMES r26
   1 TIMES r27     14 TIMES r29      4 TIMES r30      2 TIMES r35
   2 TIMES r37      2 TIMES r40      1 TIMES r41      2 TIMES r42
   1 TIMES r43      1 TIMES r45      2 TIMES r46      1 TIMES r47
   1 TIMES r48      2 TIMES r51      3 TIMES r52      1 TIMES r53
   1 TIMES r55      1 TIMES r59      1 TIMES r61     12 TIMES r62
   1 TIMES r63     12 TIMES r64      8 TIMES r69     13 TIMES r70
   1 TIMES r75     48 TIMES r77      1 TIMES r79     16 TIMES r80
   7 TIMES r82     22 TIMES r85      7 TIMES r86     12 TIMES r87
   6 TIMES r88      2 TIMES r90      3 TIMES r91      1 TIMES r93
  13 TIMES r94     96 TIMES r95      8 TIMES r96      8 TIMES r99
  10 TIMES r100    37 TIMES r102     4 TIMES r104     1 TIMES r106
   1 TIMES r107     1 TIMES r108     1 TIMES r110    12 TIMES r111
   1 TIMES r112    32 TIMES r120    11 TIMES r123    14 TIMES r124
   8 TIMES r125     1 TIMES r127     7 TIMES r128     7 TIMES r129
   1 TIMES r130     3 TIMES r131     6 TIMES r135     2 TIMES r138
   3 TIMES r140    41 TIMES r141
   0 TIMES r18, r28, r31, r32, r33, r34, r36, r38, r39, r44,
           r49, r54, r57, r58, r60, r65, r71, r72, r73, r78,
           r81, r83, r98, r101, r103, r105, r109, r113, r114, r115,
           r116, r117, r118, r121, r122, r126, r132, r133, r134, r136,
           r137, r142, r143

NUMBER OF FUNCTION CALLS:
     7 TIMES f_elem_unique
  2647 TIMES f_identify
    16 TIMES f_none_in_defs
    22 TIMES f_make_defs
     6 TIMES f_defs_unique
     7 TIMES f_def_unique
    13 TIMES f_is_ordinal
     0 TIMES GENSYMB
    53 TIMES f_no_file_contained
   116 TIMES f_compatible_types
     2 TIMES f_is_string
    39 TIMES f_assignment_compatible
     0 TIMES f_equal_proc_types
     0 TIMES f_congruent_params
    99 TIMES f_reduce
    12 TIMES f_no_file_fields
     7 TIMES f_is_tp_record
    18 TIMES f_forward_exists
     0 TIMES f_forward_fitted
     1 TIMES f_check_param
     0 TIMES f_check_io_param
    37 TIMES f_is_variable_access
     0 TIMES f_find_variant_choice
    21 TIMES f_dyadic_operator
    40 TIMES f_is_arithmetic
References
Ada82
Uhl, J., Drossopoulou, S., Persch, G., Goos, G., Dausmann, M., Winterstein, G., Kirchgässner, W.: An Attribute Grammar for the Semantic Analysis of Ada, Lecture Notes in Computer Science, vol. 139, Springer-Verlag, 1982
As79
Asbrock, B.: Attribut-Implementierung und -Optimierung für Attributierte Grammatiken, Fak. f. Informatik, Universität Karlsruhe, Diplomarbeit, Juli 1979
AKZ81
Asbrock, B., Kastens, U., Zimmermann, E.: Generating an Efficient Compiler Front-end, Fak. f. Informatik, Universität Karlsruhe, Bericht 17/81
Bo76
Bochmann, G.V.: Semantic Evaluation from Left to Right, CACM 19, 2, 55-62, 1976
BSI82
BS 6192 : 1982, Specification for Computer programming language Pascal, British Standards Institution, 1982
De77
Dencker, P.: Ein neues LALR-System, Fak. f. Informatik, Universität Karlsruhe, Diplomarbeit, 1977
DIN80
DIN 66253 Teil 2 : Programmiersprache PEARL, Normentwurf, Beuth-Verlag, November 1980
GAG81
Asbrock, B., Kastens, U., Zimmermann, E.: User Manual for the GAG-System, Fak. f. Informatik, Universität Karlsruhe, Bericht 15/81, 1981
Gl78
Glanville, R.S.: A Machine Independent Algorithm for Code Generation and its Use in Retargetable Compilers; University of California, Berkeley, Ph.D. Thesis, Report UCB-CS-78-01, 1978
HLP78
Räihä, K., Saarinen, M., Soisalon-Soininen, E., Tienari, M.: The Compiler Writing System HLP, Department of Computer Science, University of Helsinki, Report A-1978-2
Ja75
Jazayeri, M., Ogden, W.F., Rounds, W.C.: The intrinsically exponential complexity of the circularity problem for attributed grammars. CACM 18, 697-706 (1975)
Jan82
Jansohn, H.-St., Landwehr, R., Zimmermann, E.: Description of the Program Preprocessor PROPP; Fak. für Informatik, Universität Karlsruhe, Interner Arbeitsbericht, 1982
JAW75
Jazayeri, M., Walter, K.G.: Alternating Semantic Evaluator, Proc. of ACM 1975 Ann. Conf., 230-234, 1975
JP81
Jazayeri, M., Pozefsky, D.: Space-Efficient Storage Management in an Attribute Grammar Evaluator, ACM Transactions on Programming Languages and Systems, Vol. 3, No. 4, October 1981, Pages 388-404
JW75
Jensen, K., Wirth, N.: Pascal User Manual and Report, Springer-Verlag, 1975
Ka76
Kastens, U.: Ein übersetzer-erzeugendes System auf der Basis attributierter Grammatiken, Fak. f. Informatik, Universität Karlsruhe, Bericht 10/76, 1976
Ka80
Kastens, U.: Ordered Attributed Grammars, in Acta Informatica 13, 229-256, 1980
Kn68
Knuth, D.E.: Semantics of context-free languages, Math. Syst. Theory 2, 2, 127-145, 1968 and Math. Syst. Theory 5, 1, 95-96, 1971 (correction)
KW76
Kennedy, K., Warren, S.K.: Automatic generation of efficient evaluators for attribute grammars; Conference record of the 3rd ACM Symp. on Principles of programming languages, 32-49, 1976
KZ80
Kastens, U., Zimmermann, E.: GAG - A Generator Based on Attributed Grammars, Fak. f. Informatik, Universität Karlsruhe, Bericht 14/80, 1980
LJG82
Landwehr, R., Jansohn, H.-St., Goos, G.: Experience with an Automatic Code Generator Generator, in Proceedings of the SIGPLAN Symposium on Compiler Construction, Boston, 1982
Marc76
Marcotty, M., Ledgard, H.F., Bochmann, G.V.: A Sampler of Formal Definitions; Computing Surveys, Vol. 8, No. 2, June 1976
Nu82
Nunnenmann, H.: Schema-Abbildung von TERM nach UDS; Fak. für Informatik, Universität Karlsruhe, Diplomarbeit, 1982
Po79
Pozefsky, D.P.: Building Efficient Pass-Oriented Attribute Grammar Evaluators, University of North Carolina, PhD Thesis, 1979
Ra79
Räihä, K.-J.: Dynamic allocation of space for attribute instances in multi-pass evaluators of attribute grammars. Proc. SIGPLAN Symp. Compiler Construction, Denver, Col., Aug. 6-10, 1979, SIGPLAN Notices (ACM), 14, 8 (Aug. 1979), 26-38.
Ro78
Röhrich, J.: Automatic Construction of Error Correcting Parsers, Fak. f. Informatik, Universität Karlsruhe, Bericht 8/1978
Ro82
Rosenstiel, W.: DSL - Eine Sprache zur Spezifikation der Funktion digitaler Systeme: Konzept und Implementierung; Fak. für Informatik, Universität Karlsruhe, Interner Bericht 9/82
Sa80
Schauer, J.: Eine attributierte Grammatik für LIS, Fak. f. Informatik, Universität Karlsruhe, Interner Arbeitsbericht, 1980