..., so (w.r.t. Q) (γ, E_alt ⇒ E) and (β, E ⇒ E_alt). Thus we prefer rule E. Binary unions always have degree 2, injections always have degree 3. Only ⊕-functions of degree 3 are formed, q.e.d. Conversely, we might generate part of the ⊕-equality by adding general permutative reductions, paying due attention to the SN problem thus arising.
312
D.T. van Daalen
2.7. A possible extension concerning ⊕-functions

We can, however, define an extension of the language by also admitting degree 2 ⊕-functions, i.e. glueing type-valued functions together into a single type-valued function. To this end we put: let α ∈ τ, β ∈ τ, and let φ ∈ α → τ, ψ ∈ β → τ. Then
(4) Plus rule 1: ⊢ φ ⊕ ψ (∈ (α ⊕ β) → τ).
(5) Plus rule 2: B ∈ Π(φ), C ∈ Π(ψ) ⊢ B ⊕ C (∈ Π(φ ⊕ ψ)).
The old plus can be considered as a special case of rule 5, by using

[x : α] γ ⊕ [x : β] γ ▷def [x : α ⊕ β] γ (E or E_alt).
We do not discuss this extension here, because it really complicates the normability problem (see [C.5, VIII.4.6]).

2.8. Elementary properties

As in V.2.7–V.2.9 we can infer some nice properties. First, concerning the degrees:

⊢ A ⇒ A degree correct,
A ▷ B ⇒ degree(A) = degree(B),
A ∈ B ⇒ degree(A) = degree(B) + 1.
Then, concerning contexts, renaming (see [C.5, V.2.9.2]) and weakening ([C.5, V.2.9.3]). Further, the simultaneous and the single substitution theorem ([C.5, V.2.9.4-5]), and correctness of categories ([C.5, V.2.10]): A ∈ B ⊢ B. Analogously to the abstr. and appl. properties in [C.5, V.2.10] and [C.5, V.2.14] (which mutatis mutandis hold as well in AUT-Π) we have properties like
⊢ (φ, A, B) ⇒ (A ∈ α, φ ∈ α → τ, B ∈ ⟨A⟩φ) etc.,
i.e. the "inversion of the correctness rules". An important additional property (to be proved in the next section) is uniqueness of types: A ∈ B, A ∈ C ⇒ B Q C, which in AUT-QE did not hold for A of degree 2, because of type inclusion.
Generalizing Automath by Means of a Lambda-Typed Lambda Calculus*

N.G. de Bruijn
SUMMARY The calculus ∆Λ developed in this paper is claimed to be able to embrace all the essential aspects of Automath, apart from the feature of type inclusion, which will not be considered in this paper. The calculus deals with a correctness notion for lambda-typed lambda formulas (which are presented in the form of what will be called lambda trees). To an Automath book there corresponds a single lambda tree, and the correctness of the book is equivalent to the correctness of the tree. The algorithmic definition of correctness of lambda trees corresponds to an efficient checking algorithm for Automath books.
1. INTRODUCTION 1.1. Automath and lambda calculus We are not going to explain Automath in this paper; for references and a few remarks we refer to Section 6.1. The basic common feature of the languages of the Automath family is lambda-typed lambda calculus. Nevertheless Automath has various aspects of a different nature, of which we mention the context administration and the mechanism of instantiation. Moreover there is the notion of degree, and the rules of the languages, in particular those regarding abstractors, are different for different degrees. But a large part of what can be said about Automath, in particular as far as language theory is concerned, can be said about the bare lambda-typed lambda calculus already. In [de Bruijn 71 (B.2)] it was described how a complete Automath book can be considered as a single lambda calculus formula, and that idea gave rise *Reprinted from: Kueker, D.W., Lopes-Escobar, E.G.K. and Smith, C.H., eds., Mathematical Logic and Theoretical Computer Science, p. 71-92, by courtesy of Marcel Dekker Inc., New York.
to work on language theory ([Nederpelt 73 (C.3)], [van Daalen 80]) about the lambda-typed lambda calculus system called Λ. This system of condensation of an Automath book into a single formula (AUT-SL: single line Automath book) had a disadvantage, however. In order to put the book into the lambda calculus framework it was necessary to first eliminate all definitional lines of the book. Considering the fact that the description of a mathematical subject may involve a large number of definitions, the exponential growth in length we get by eliminating them is prohibitive in practice: it can serve a theoretical purpose only. The kind of lambda-typed lambda calculus to be developed in the present paper may be better in this respect. It makes it possible to keep the full abbreviational power of Automath books within the framework of a lambda calculus. In this framework a number of features of Automath can be explained in a unifying way. Lines, contexts and instantiations all vanish from the scene. They find their natural expression in the lambda calculus, like in AUT-SL, but now without losing the relation with the original Automath book. In particular the way we actually check the correctness of an Automath book is directly related to an efficient way to check the correctness of a lambda formula. Therefore the checking algorithm described in this paper can be expected to become a basis of all checkers of Automath-like languages. The little differences between the various members of the Automath family lead to rather superficial modifications of that basic program. It can be expected that most of these modifications will be felt at the input stage only. The paper is restricted to the Automath languages without type inclusion. The feature of type inclusion (which is used in AUT-68 and AUT-QE) requires modifications in the correctness definition and the checking algorithm. We shall not discuss such modifications here.
1.2. Trees
The paper has another feature, not strongly related to the main theme. That feature is the predominant place given to the description in terms of trees rather than to the one in terms of character strings. Of course, this may be considered as just a matter of taste. Nevertheless it may have an advantage to have a coherent description in terms of trees, in particular for future reference. The author believes that if it ever comes to treating the theory of Automath in an Automath book, the trees may stand a better chance than the character strings.
Generalizing Automath (B.7)
2. LAMBDA TREES
2.1. What to take as fundamental, character strings or trees

Syntax is closely connected to trees. Formulas, and other syntactic structures, are given as strings of characters, but can be represented by means of trees. On the other hand, tree-shaped structures can be coded in the form of strings of characters. One might say that the trees and the character strings are two faces of one and the same subject. The trees are usually closer to the nature of things, the character strings are usually better for communication. Or, to put it in the superficial form of a slogan, the trees are what we mean, the strings are what we say. Discussing syntax we have to choose which one of the two points of view, trees or strings, is to be taken as the point of departure. Usually one seems to prefer the character strings, but we shall take the less traditional view to start from the trees. One can have various reasons for this preference, but here we mention the following two as relevant for the present paper: (i) It seems to be easier to talk about the various points of a tree than about the various "places" in a character string. (ii) The trees make it easier to discuss the matter of bound variables. We shall use the character strings as a kind of shorthand in cases where the trees become inconvenient. This shorthand is quite often easier to write, to print and to read, but the reader should know all the time that the trees are the mathematical structures we really intend to describe. In Section 2.7 we shall display the shorthand rules.

2.2. The infinite binary tree

We start from a set with two elements, l and r (mnemonic for "left" and "right"). W is to be the set of all words over {l, r}, including the empty word ε; in standard notation W = {l, r}*. This set W will be called the infinite binary tree. We consider the mappings

father : (W\{ε}) → W ,
leftson : W → W ,
rightson : W → W .
The father of a word is obtained by omitting its rightmost letter, the leftson is obtained by adding an l on the right, the rightson is obtained by adding an r on the right.
Examples: father(l) = ε, father(r) = ε, father(lrl) = lr; leftson(ε) = l, leftson(lrrl) = lrrll; rightson(ε) = r, rightson(rll) = rllr.
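In code these mappings are one-liners. A minimal sketch in Python, representing words over {l, r} as strings (an encoding of ours, not the paper's):

```python
def father(w):
    """Father (Section 2.2): omit the rightmost letter; undefined for the empty word."""
    if not w:
        raise ValueError("the empty word has no father")
    return w[:-1]

def leftson(w):
    """Add an 'l' on the right."""
    return w + "l"

def rightson(w):
    """Add an 'r' on the right."""
    return w + "r"
```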
In these examples we have followed the usual sloppy way to write words as concatenated sequences of letters, and to make no distinction between a one-letter word and the letter it consists of. We define the binary infix relation < by agreeing that u < v (with u ∈ W, v ∈ W) means that the word u is obtained from the word v by omitting one or more letters on the right. So lrr < lrrrl, and ε < u for all u ∈ W\{ε}. The relation is obviously transitive. As usual, u ≤ v means that either u < v or u = v. And v > u (v ≥ u) will mean the same thing as u < v (u ≤ v).
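Under the string encoding of the previous sketch, the relation < is simply a proper-prefix test. A small sketch (the function name `precedes` is ours):

```python
def precedes(u, v):
    """u < v (Section 2.2): u arises from v by omitting one or more
    letters on the right, i.e. u is a proper prefix of v."""
    return len(u) < len(v) and v.startswith(u)
```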
2.3. Binary trees

We shall consider all binary trees to be finite subtrees of the infinite binary tree. A binary tree is a finite subset V of W with the following properties:
(i) ε ∈ V,
(ii) for all u ∈ V with u ≠ ε we have father(u) ∈ V,
(iii) if u ∈ V then leftson(u) ∈ V if and only if rightson(u) ∈ V.

Elements of V are called points of the binary tree. If u ∈ V and leftson(u) ∉ V, rightson(u) ∉ V, then u is called an end-point of V. The set of all end-points is denoted Ve. The point ε is called the root of V.
There are two popular ways to draw two-dimensional pictures of a binary tree. The way we follow in this paper is to draw sons above their fathers. The other one has the fathers above their sons (such pictures can be called weeping willows). In both cases leftson(u) is drawn to the left of rightson(u), for all u. Readers who prefer to draw weeping willows instead of upright trees will not have any trouble, since for their benefit we shall avoid the use of terms like "up", "down", "above", "below" for describing vertical orientation. The inequality < is neutral in this respect.

2.4. Labels

We consider three different objects outside W. They will be called A, T and τ. Elements of the set W ∪ {A, T, τ} will be called labels. Points with label A or T will be called A-nodes and T-nodes, respectively.
If V is a binary tree then any mapping of V into the set of labels is called a labeling of V. If f is a labeling, and u ∈ V, then f(u) is called the label of u.

2.5. Definition of lambda trees
A lambda tree is a pair (V, lab), where V is a binary tree, and lab is a labeling of V that satisfies the following conditions (i), (ii), (iii):

(i) If u ∈ V\Ve then lab(u) ∈ {A, T}.
(ii) If u ∈ Ve then lab(u) ∈ V ∪ {τ}.
(iii) If u ∈ Ve and lab(u) ∈ V then lab(lab(u)) = T and rightson(lab(u)) ≤ u.
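Conditions 2.3 and 2.5 can be checked mechanically. A sketch in Python, with a labeled tree encoded as a dict from points (strings over 'l', 'r') to labels, and the strings 'A', 'T', 'tau' standing for the labels A, T, τ (the encoding and the function name are ours):

```python
def is_lambda_tree(lab):
    """Check that lab describes a lambda tree: conditions 2.3 (i)-(iii)
    on the set of points and 2.5 (i)-(iii) on the labeling."""
    V = set(lab)
    if "" not in V:                                   # 2.3 (i): the root is present
        return False
    for u in V:
        if u and u[:-1] not in V:                     # 2.3 (ii): fathers are present
            return False
        if (u + "l" in V) != (u + "r" in V):          # 2.3 (iii): sons come in pairs
            return False
    ends = {u for u in V if u + "l" not in V}         # the end-points V_e
    for u in V:
        if u not in ends:
            if lab[u] not in ("A", "T"):              # 2.5 (i)
                return False
        elif lab[u] != "tau":
            t = lab[u]
            if t not in V or lab[t] != "T":           # 2.5 (ii) and first half of (iii)
                return False
            if not u.startswith(t + "r"):             # 2.5 (iii): rightson(lab(u)) <= u
                return False
    return True
```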
2.6. An example

We give an example with 17 points. These points and their labels are specified as follows:

lab(ε) = T , lab(l) = τ , lab(r) = T , lab(rl) = T , lab(rll) = τ , lab(rlr) = A , lab(rlrl) = rl , lab(rlrr) = ε , lab(rr) = A , lab(rrl) = T , lab(rrll) = r , lab(rrlr) = τ , lab(rrr) = T , lab(rrrl) = τ , lab(rrrr) = T , lab(rrrrl) = rrr , lab(rrrrr) = r .

This lambda tree is pictured in Figure 1.
Figure 1, A lambda tree

The picture does not show the names of the points, but it does show their labels as far as they are A, T or τ. In the cases of points u with labels in V we have indicated lab(u) by means of a dotted arrow from u (which is always an
end-point, according to 2.5 (i)) to lab(u) (which is a point on the path from u to the root of the tree). Indeed the arrows always go to points with label T, and at such points the arrows always come from the right, according to 2.5 (iii).
2.7. Representation of a lambda tree as a character string

We begin by taking a set of identifiers to be called dummies. They are no elements of the set of labels. Next in some arbitrary way we attach a dummy to every point of the tree that has label T, and different points get different dummies. In the example of 2.6 we attach x1 to ε, x2 to r, x3 to rl, x4 to rrl, x5 to rrr, x6 to rrrr. We can now also attach dummies to the end-points as far as their label is not τ. To the end-point u (with label lab(u) ∈ V) we attach the same dummy as we attached to lab(u). The point lab(u) with its dummy is called the binding instance of the dummy, the point u with its dummy is called a bound instance. In Figure 2 the dummies are shown. The arrows could be omitted since their information is provided by the dummies: the arrows run from the bound instances of a dummy to its binding instance.
Figure 2, Tree with named dummies

We now produce the character string representation by the following algorithm that attaches character strings to all subtrees:

(i) A subtree consisting of a single point is represented by τ if its label is τ, and by its dummy if it is a bound instance of that dummy.
(ii) A subtree whose root is labeled by A, to which there is attached a left-hand subtree (with character string P) and a right-hand subtree (with character string Q), gets the character string (P) Q.
(iii) A subtree whose root is labeled by T, with dummy xi, say, and with P and Q as under (ii), gets the character string [xi : P] Q.
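The three rules translate into a short recursive procedure. A sketch under our dict encoding, run on the tree of Section 2.6 ('tau' stands for τ; the dummy naming x1, ..., x6 follows the text's assignment):

```python
# The lambda tree of Section 2.6: point -> label.
LAB = {
    "": "T", "l": "tau", "r": "T", "rl": "T", "rll": "tau",
    "rlr": "A", "rlrl": "rl", "rlrr": "", "rr": "A", "rrl": "T",
    "rrll": "r", "rrlr": "tau", "rrr": "T", "rrrl": "tau",
    "rrrr": "T", "rrrrl": "rrr", "rrrrr": "r",
}

def to_string(lab, dummies=None, u=""):
    """Character-string representation (rules (i)-(iii) of Section 2.7)."""
    if dummies is None:  # attach a dummy to every T-node
        dummies = {p: "x%d" % i
                   for i, p in enumerate(sorted(q for q in lab if lab[q] == "T"), 1)}
    if u + "l" not in lab:                 # rule (i): a single point
        return "tau" if lab[u] == "tau" else dummies[lab[u]]
    P = to_string(lab, dummies, u + "l")
    Q = to_string(lab, dummies, u + "r")
    if lab[u] == "A":                      # rule (ii): application
        return "(" + P + ") " + Q
    return "[" + dummies[u] + " : " + P + "] " + Q   # rule (iii): abstraction
```

Applied to LAB this reproduces the character string displayed below for Figure 2.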
Generalizing Automath (B.7)
319
If we apply the algorithm to the tree of Figure 2 we get

[x1 : τ] [x2 : [x3 : τ] (x3) x1] ([x4 : x2] τ) [x5 : τ] [x6 : x5] x2 .
The way back from character string to lambda tree is easy, and we omit its description. 2.8. Remarks
The following remarks might give some background to our definitions and notations. (i)
The notation [x : P] Q is the notation in Automath for the typed lambda abstraction. Here the binding dummy x is declared as being of type P. In untyped lambda calculus one might write [x] Q, but it is usually written with a lambda: λx.Q or λx Q.
(ii) In standard lambda calculus there is the construct called "application", usually written as a concatenation QP. The interpretation is that Q is a function, P a value of the argument, and that QP is the value of the function Q at the point P. The Automath notation puts the argument in front of the function: it has (P) Q instead of QP. The decision to put the "applicator" (P) in front of the function Q is in harmony with the convention to put abstractors (like the [x : P] above) in front of what they operate on. Older Automath publications had {P} Q instead of (P) Q.
(iii) The τ has about the same role that is played in Automath by 'type' and 'prop', the basic expressions of degree 1.
(iv) The labels A and T in the lambda tree are mnemonic for "application" and "typing".
(v) The typing nodes are at the same time lambda nodes. This is different from what we had in [de Bruijn 72b (C.2)], Section 13; there the lambda was a separate node in the right-hand subtree of the node with label T. Taking them together has the effect that the arrows in the lambda tree lead to nodes labeled T instead of λ, and that the provision has to be made that arrows leading to a T-node always arrive from the right (see 2.5 (iii)). In the character string representation this provision means that in the case of [x : P] Q the dummy x does not occur in P.
(vi) The tree of Figure 2 can be presented in namefree form by means of the reference depth system of [de Bruijn 72b (C.2)]. We explain it here: If
there is an arrow from an end-point u to lab(u) then the reference depth of u is the number of v with lab(v) = T, lab(u) ≤ v and rightson(v) ≤ u. We can replace the information contained in lab(u) by the reference depth of u. If that depth is 3, say, then we find lab(u) by proceeding from u to the root of the tree; the point we want is the third T-node we meet, provided that we only count T-nodes we approach from the right. For the tree of Figure 1 this is carried out in Figure 3.
Figure 3, Tree with reference depths

Comparing Figure 3 to Figure 2 we note that the three ones in Figure 3 lead to three different dummies (x3, x2, x5) in Figure 2, and that the two bound instances of x2 have the different reference depths 1 and 3. If we pass from the tree to the character string representation, we can omit the names of the dummies. We can write the namefree form of the example of Figure 2 as

[τ] [[τ] (1) 2] ([1] τ) [τ] [1] 3 .

This simple example demonstrates that the depth reference system was designed for other purposes than for easy reading.
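The depth rule of remark (vi) is a one-line count under our dict encoding. A sketch, run on the tree of Figure 1:

```python
# The lambda tree of Section 2.6 / Figure 1: point -> label.
LAB = {
    "": "T", "l": "tau", "r": "T", "rl": "T", "rll": "tau",
    "rlr": "A", "rlrl": "rl", "rlrr": "", "rr": "A", "rrl": "T",
    "rrll": "r", "rrlr": "tau", "rrr": "T", "rrrl": "tau",
    "rrrr": "T", "rrrrl": "rrr", "rrrrr": "r",
}

def ref_depth(lab, u):
    """Reference depth (2.8 (vi)) of a bound end-point u: the number of
    T-nodes v with lab(u) <= v and rightson(v) <= u, i.e. the T-nodes on
    the path from u to lab(u) that are approached from the right."""
    t = lab[u]
    return sum(1 for v in lab
               if lab[v] == "T" and v.startswith(t) and u.startswith(v + "r"))
```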
3. DEGREE AND TYPE 3.1. Introduction
To every lambda tree we shall assign a non-negative number, to be called its degree. And we shall even assign a degree to every end-point of a lambda tree.
If a lambda tree has degree > 1 we shall define its type, which is again a lambda tree. The degree of the new tree is 1 less than the one of the original one. As a preparation we need the notion of the lexicographical order in a binary tree. Moreover, for the definition of the type of a lambda tree we need the notion of implantation.
3.2. Lexicographical order

In a lambda tree the points are words consisting of l's and r's. We can order them as in a dictionary, starting with the empty word. For the tree of Figure 1 (Section 2.6) the dictionary is:

ε, l, r, rl, rll, rlr, rlrl, rlrr, rr, rrl, rrll, rrlr, rrr, rrrl, rrrr, rrrrl, rrrrr .
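This dictionary order is easy to realize in code, since Python's list comparison already puts a prefix before its extensions; a small sketch (the names are ours):

```python
def lex_key(w):
    """Dictionary order on words over {l, r}: l before r, a prefix
    preceding its extensions (Section 3.2)."""
    return [0 if c == "l" else 1 for c in w]

# A few points of the Figure 1 tree, to be sorted into dictionary order:
points = ["rr", "", "rlrl", "l", "rrrrr", "r", "rl", "rll"]
ordered = sorted(points, key=lex_key)
```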
A word u is said to be lexicographically lower than the word v if u comes before v in the dictionary. Note that if u < v (in the sense of Section 2.2) then u is lexicographically lower than v, but the converse need not be true.

3.3. Ascendants

Let (V, lab) be a lambda tree. If u is an end-point with lab(u) ≠ τ then we shall define the ascendant of u, to be denoted by asc(u); it will be again an end-point. We note that lab(u) is not an end-point (see 2.5 (iii)), and therefore there exist points of V which are lexicographically higher than lab(u) and lower than rightson(lab(u)). The lexicographically highest of these is an end-point of V, and it is this end-point that we take as the definition of asc(u). Let us take Figure 1 as an example. The end-points are, in lexicographic order: l, rll, rlrl, rlrr, rrll, rrlr, rrrl, rrrrl, rrrrr. Of these, l, rll, rrlr and rrrl have no ascendants, but asc(rlrl) = rll, asc(rlrr) = l, asc(rrll) = rlrr, asc(rrrrl) = rrrl, asc(rrrrr) = rlrr.

3.4. Degree of an end-point

If u is an end-point of (V, lab), and lab(u) ≠ τ, then asc(u) is lexicographically lower than u. This is obvious since
(i) asc(u) is lexicographically lower than rightson(lab(u)),
(ii) rightson(lab(u)) ≤ u by Section 2.5, so
(iii) rightson(lab(u)) is either equal to or lexicographically lower than u.
We can now define the degree deg(u) of the end-points one by one, proceeding through the lexicographically ordered sequence of end-points. We define
deg(u) = 1                       if lab(u) = τ ,
deg(u) = 1 + deg(asc(u))         if lab(u) ≠ τ .

This defines deg as a function: deg : Ve → {1, 2, 3, ...} .
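A sketch of asc and deg under our dict encoding of the Figure 1 tree (functools.lru_cache memoizes the recursion, which terminates because asc(u) is lexicographically lower than u):

```python
from functools import lru_cache

# The lambda tree of Section 2.6 / Figure 1: point -> label.
LAB = {
    "": "T", "l": "tau", "r": "T", "rl": "T", "rll": "tau",
    "rlr": "A", "rlrl": "rl", "rlrr": "", "rr": "A", "rrl": "T",
    "rrll": "r", "rrlr": "tau", "rrr": "T", "rrrl": "tau",
    "rrrr": "T", "rrrrl": "rrr", "rrrrr": "r",
}

def lex_key(w):
    """Dictionary order of Section 3.2: l before r, prefix first."""
    return [0 if c == "l" else 1 for c in w]

def asc(u):
    """Ascendant (3.3): the lexicographically highest point strictly
    between lab(u) and rightson(lab(u)); it is always an end-point."""
    v = LAB[u]                       # defined only when lab(u) is a point
    lo, hi = lex_key(v), lex_key(v + "r")
    return max((w for w in LAB if lo < lex_key(w) < hi), key=lex_key)

@lru_cache(maxsize=None)
def deg(u):
    """Degree of an end-point (3.4)."""
    return 1 if LAB[u] == "tau" else 1 + deg(asc(u))
```

The degree of the whole tree (Section 3.5) is then deg of its lexicographically highest point, here deg(rrrrr).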
In the example of Figure 1, we have l, rll, rrlr, rrrl of degree 1; rlrl, rlrr, rrrrl of degree 2; and rrll, rrrrr of degree 3.

3.5. Degree of a lambda tree
As the degree of a lambda tree (V, lab) we define deg(w), where w is the lexicographically highest point of V. This w is a word without l's. Note that a lambda tree can have end-points whose degree exceeds the degree of the tree. We get an example if in the tree of Figure 1 we replace the label of rrrrr by τ: then the tree has degree 1 but its point rrll has degree 3.

3.6. Implantation

Let (V, lab) be a lambda tree, and u be a point of V, not necessarily an end-point. And let S be a set of end-points of V. We assume that the following implantation condition holds: for every w ∈ S and for every v ∈ V with v ≥ u, lab(v) ∈ V, lab(v) < u we have rightson(lab(v)) ≤ w. In this situation we shall describe a new lambda tree obtained by implanting at every point of S a copy of the subtree whose root is u. This new tree (V′, lab′) will be denoted as (V′, lab′) = impl(V, lab, u, S). First we form the subtree at u, to be denoted as sub(u). It is the set of all words p ∈ {l, r}* such that the concatenation up belongs to V. Next we define V′ as

V′ = V ∪ {wp : w ∈ S, p ∈ sub(u)} ,

where wp is the concatenation of w and p. In order to define the labeling lab′ of V′ we divide the set sub(u) into two categories: sub1(u) and sub2(u). The first one, sub1(u), is the set of all p ∈ sub(u) for which both lab(up) ∈ V and lab(up) ≥ u. Such a p has the property that lab(up) = uq with some q ∈ sub(u). We may call these p's points with internal reference. All other points of sub(u) are put into sub2(u). This consists of the p's such that lab(up) is A, T or τ and of all p's for which both lab(up) ∈ V and lab(up) < u. The latter p's may be called points with external reference. We are now ready to describe the labeling lab′ of V′. For all v ∈ V\S we take lab′(v) = lab(v). The other points of V′ can be uniquely written as wp with w ∈ S, p ∈ sub(u). If p ∈ sub2(u) we simply take lab′(wp) = lab(up). If
p ∈ sub1(u), however, the label of the copied point is no longer the same as the original label, but the copy of the original label. To be precise, if q is such that lab(up) = uq, then we take lab′(wp) = wq. Note that if s ∈ S then s belongs to both V and V′, and that lab′(s) can be different from lab(s). It is not hard to show that (V′, lab′) is again a lambda tree. In Figure 4 we show a case of implantation. The lambda tree on the left is (V, lab), the one on the right is (V′, lab′). We have (V′, lab′) = impl(V, lab, rl, {rrr}).
Figure 4, Implantation
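Implantation is a copying pass over the dict encoding. A sketch, run on the Figure 1 tree; the call impl(LAB, "rl", {"rrrrr"}) is exactly the typing implantation of Section 3.8 for this tree (lexicographically highest point rrrrr, lab(rrrrr) = r, leftson(r) = rl):

```python
# The lambda tree of Section 2.6 / Figure 1: point -> label.
LAB = {
    "": "T", "l": "tau", "r": "T", "rl": "T", "rll": "tau",
    "rlr": "A", "rlrl": "rl", "rlrr": "", "rr": "A", "rrl": "T",
    "rrll": "r", "rrlr": "tau", "rrr": "T", "rrrl": "tau",
    "rrrr": "T", "rrrrl": "rrr", "rrrrr": "r",
}

def sub(lab, u):
    """sub(u): the words p with up in V (Section 3.6)."""
    return {w[len(u):] for w in lab if w.startswith(u)}

def impl(lab, u, S):
    """impl(V, lab, u, S): implant a copy of the subtree rooted at u at
    every end-point in S, relabeling internal references (Section 3.6)."""
    V = set(lab)
    new = dict(lab)
    for w in S:
        for p in sub(lab, u):
            t = lab[u + p]
            if t in V and t.startswith(u):      # internal reference: lab(up) = uq
                new[w + p] = w + t[len(u):]     # the copy refers to wq
            else:                               # external reference, or A/T/tau
                new[w + p] = t
    return new

TYP = impl(LAB, "rl", {"rrrrr"})   # the type of the Figure 1 tree
```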
3.7. Implantation and degree

We keep the notation of Section 3.6. Taking some fixed w ∈ S we can consider the wp (with p ∈ sub(u)) as copies of the corresponding up. We now claim that the degree of wp in (V′, lab′) is always equal to the degree of up in (V, lab). This can be proved by induction, letting p run through sub(u) in lexicographical order. If p is an external reference then the statement on the degrees is an easy consequence of the fact that the points of V\S have in (V, lab) the same degree as in (V′, lab′). If p is an internal reference, however, we remark that the ascendants of up in (V, lab) and of wp in (V′, lab′) are corresponding points again, so that they have equal degree by the induction hypothesis. Since deg(up) = deg(asc(up)) + 1 and deg(wp) = deg(asc(wp)) + 1, we also have deg(up) = deg(wp).
3.8. Type of a lambda tree

If a lambda tree (V, lab) has degree > 1 we shall define the type, which is again a lambda tree. Let w be the lexicographically highest point of V (see 3.5). If lab(w) = τ then (V, lab) has degree 1 and its type will not be defined. The only other possibility is that lab(w) ∈ V. Now lab(lab(w)) = T, whence lab(w) has a leftson. We now
define the type of (V, lab), to be denoted typ(V, lab), by implanting the subtree of leftson(lab(w)); here the set S consists of the single point w:

typ(V, lab) = impl(V, lab, leftson(lab(w)), {w}) .
An example of typing is already available in Figure 4: the tree on the right is the type of the one on the left. 3.9. Typing lowers the degree by 1
We shall show that if the degree of (V, lab) exceeds 1, then the degree of typ(V, lab) is one less than the degree of (V, lab). Let again w be as in Sections 3.5 and 3.7. Since deg(w) > 1, w has an ascendant: v = asc(w). Then deg(w) = deg(v) + 1. In the terminology of Section 3.7 we can now state that the lexicographically highest point in typ(V, lab) is the copy of v, and so, by the result of that section, its degree in typ(V, lab) equals the degree of v in (V, lab). So the degree of typ(V, lab), i.e., the degree of its lexicographically highest point, is one less than deg(w), which is the degree of (V, lab).
4. REDUCTIONS 4.1. Beta reduction
We shall not present beta reduction directly. It will be introduced as the result of a set of more primitive reductions: local beta reductions and AT-removals. The reason for this is that the delta reductions of Automath can be considered as local beta reductions, and not as ordinary beta reductions.

4.2. AT-pairs
Let (V, lab) be a lambda tree. An AT-pair is a pair (u, v) where u ∈ V, v ∈ V, lab(u) = A, lab(v) = T, v = rightson(u).

Example: in Figure 1, (rr, rrr) is an AT-pair.

4.3. AT-couples
We mention that whatever we do with AT-pairs can be generalized to AT-couples. We shall not actually use AT-couples, but we give the definition for the sake of completeness. Let n be a positive integer, let u1, u2, ..., un be points of V with ui = rightson(ui-1) for 1 < i ≤ n. Furthermore, whenever 1 ≤ m < n, the number of i with 1 ≤ i ≤ m and lab(ui) = T is less than the number of i with 1 ≤ i ≤ m and lab(ui) = A. And finally the number of i with 1 ≤ i ≤ n and lab(ui) = T is equal to the number of i with 1 ≤ i ≤ n and lab(ui) = A. Now (u1, un) is called an AT-couple. It is easy to see that
lab(u1) = A, lab(un) = T. The situation can be illustrated by replacing the sequence u1, ..., un by a sequence of opening and closing brackets: ui is replaced by an opening or a closing bracket according to lab(ui) = A or lab(ui) = T. The conditions mentioned above mean that the first and the last bracket form a matching pair of brackets, like in

[ [ ] [ [ ] ] ] .
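The bracket picture gives an immediate check; a sketch (the function name is ours), applied to the label sequence lab(u1), ..., lab(un) along a chain of rightsons:

```python
def is_at_couple(labels):
    """AT-couple condition of 4.3 on a label sequence of 'A's and 'T's:
    every proper prefix has more A's than T's (the balance stays
    positive), and the whole sequence has equally many of each."""
    bal = 0
    for i, t in enumerate(labels):
        bal += 1 if t == "A" else -1
        if bal <= 0 and i < len(labels) - 1:
            return False        # the first bracket was closed too early
    return bal == 0
```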
4.4. Local beta reduction

Let (V, lab) be a lambda tree, and let w be an end-point with lab(w) ≠ τ. We assume that the point lab(w) is the rightson of a point u with label A. So (u, lab(w)) is an AT-pair. We can now form the following implantation:

(V′, lab′) = impl(V, lab, leftson(u), {w}) .
The passage from (V, lab) to (V′, lab′) is called local beta reduction at w. We give an example in the language of character strings. Let (V, lab) correspond to

[w : τ] [x : ([z : w] z) [y : [p : τ] τ] (y) y] τ .

We apply local beta reduction to the second one of the two bound occurrences of y. It comes down to replacing that y by [z : w] z (but we have to refresh the dummy z):

[w : τ] [x : ([z : w] z) [y : [p : τ] τ] (y) [q : w] q] τ .

4.5. AT-removal
Let (u, v) be an AT-pair in the lambda tree (V, lab), and assume that there is no w ∈ V such that lab(w) = v. Then we can define a new lambda tree (V′, lab′) that arises by omitting this AT-pair and everything that grows on u and v on the left. A formal definition of V′ is the following one. We omit u and v from V, and furthermore all points which are ≥ ul and all points which are ≥ vl. Next every point of the form urrw is replaced by the corresponding uw. In the latter cases the labels are redefined: if in V we had lab(urrw) = urrz then we take lab′(uw) = uz; if, however, lab(urrw) is not ≥ urr we just take lab′(uw) = lab(urrw). We give an example of AT-removal in the language of character strings. In

((τ) [x : τ] [y : τ] y) τ

there are no bound instances of x, so the pair (τ) [x : τ] can be removed.
The result is ([y : τ] y) τ.
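The formal definition of 4.5 translates as follows; a sketch under our dict encoding, tested on the example just given (the string ((τ) [x : τ] [y : τ] y) τ encodes as the dict below, with the AT-pair at points l and lr):

```python
# ((tau) [x : tau] [y : tau] y) tau  as a labeled tree: point -> label
LAB = {"": "A", "l": "A", "ll": "tau", "lr": "T", "lrl": "tau",
       "lrr": "T", "lrrl": "tau", "lrrr": "lrr", "r": "tau"}

def at_removal(lab, u):
    """AT-removal (4.5) at the AT-pair (u, v), v = rightson(u); assumes
    no point of the tree has label v."""
    v = u + "r"
    new = {}
    for w, t in lab.items():
        if w in (u, v) or w.startswith(u + "l") or w.startswith(v + "l"):
            continue                                   # omitted points
        # points urrw move to uw; labels into the moved subtree move along
        nw = u + w[len(v) + 1:] if w.startswith(v + "r") else w
        nt = u + t[len(v) + 1:] if t.startswith(v + "r") else t
        new[nw] = nt
    return new
```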
4.6. Mini-reductions
We shall use the word mini-reduction for what is either a local beta reduction or an AT-removal.
4.7. Beta reduction Let (u,v) be an AT-pair in the lambda tree (V, lab). Then beta reduction of (V,lab) (with respect to (u,v)) is obtained in two steps: (i) We pass from (V, lab) to impl(V, lab, leftson(u), S) , where S is the set of all w E V with lab(w) = v. (ii) This new lambda tree still has the AT-pair (u,v). To this pair we apply AT-removal. Step (i) can also be described as a sequence of local beta reductions, applied one by one to the w with lab(w) = v. The order in which these w's are taken is irrelevant.
4.8. The Church-Rosser property

In the following, R is a relation on the set of all lambda trees. For example, the relation R can be the one of mini-reduction: if A and B are lambda trees then (A, B) ∈ R expresses that B is obtained from A by mini-reduction. If A and B are lambda trees, we say that B is an R-reduct of A if either B = A or there is a finite sequence A = A0, A1, ..., An = B such that for every i (0 ≤ i < n) we have (Ai, Ai+1) ∈ R. We say that lambda trees C and D are R-equivalent if there is an E which is an R-reduct of both C and D. Simple examples of this are (i) the case C = D, and (ii) the cases where D is an R-reduct of C. We note that this equivalence notion is obviously reflexive and symmetric. If it is also transitive, we say that R has the Church-Rosser property.
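R-reducts and R-equivalence are plain reachability notions and can be sketched for any finitely branching relation; the toy rewrite relation below (swap one occurrence of 'ba' into 'ab') is ours, purely for illustration:

```python
def reducts(R, a):
    """All R-reducts of a (Section 4.8): a itself plus everything
    reachable by finitely many R-steps. R maps a term to the set of
    its one-step reducts."""
    seen, frontier = {a}, [a]
    while frontier:
        x = frontier.pop()
        for y in R(x):
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

def equivalent(R, c, d):
    """C and D are R-equivalent if some E is an R-reduct of both."""
    return bool(reducts(R, c) & reducts(R, d))

def swap(s):
    """Toy relation: rewrite one 'ba' to 'ab'."""
    return {s[:i] + "ab" + s[i + 2:] for i in range(len(s) - 1)
            if s[i:i + 2] == "ba"}
```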
4.9. Church-Rosser for beta reductions

The famous Church-Rosser theorem states that in untyped lambda calculus the set of all beta reductions has the Church-Rosser property (see [Barendregt 81]). The fact that we have lambda trees with T-nodes does not make it much harder. The left-hand subtrees of the T-nodes do not play an important role in the beta reductions, but nevertheless reductions take place in these subtrees too, so they cannot be ignored. For a treatment that includes the case of lambda trees we refer to [de Bruijn 72b (C.2)].

4.10. Church-Rosser for mini-reductions

The Church-Rosser property for mini-reductions is a simple consequence of the one for beta reductions. Actually two lambda trees C and D are beta equivalent if and only if they are mini-equivalent. This follows from the transitivity of beta equivalence, combined with
(i) If A leads to B by a beta reduction then B is a mini-reduct of A. (This was already noted at the end of Section 4.7.)
(ii) If A leads to B by a mini-reduction then A and B are beta equivalent.
In order to show (ii) we note that if the mini-reduction is local beta reduction with the AT-pair (u, v), then beta reduction with respect to (u, v) can be applied both to A and B, and the results are identical. If the mini-reduction is AT-removal, it is just a case of beta reduction.

4.11. Equivalence

Now that we know that beta equivalence and mini-equivalence are the same, we just use the word equivalence for both.
5. CORRECTNESS

5.1. Introduction

The notion of correctness of a lambda tree is concerned with the type of the P's that occur in subtrees (P) Q. Roughly speaking, we require that either Q, or the type of Q, or the type of the type of Q, ..., is equivalent to something of the form [x : R] S, where R is equivalent to the type of P. The system of all correct lambda trees will be called delta-lambda (or ∆Λ). It is different from the older system Λ (see [Nederpelt 73 (C.3)], [van Daalen 80]) in the following respect. In Λ we always require for the correctness of (P) Q that
both P and Q are correct themselves. In ∆Λ we do not: P should be correct, but in formulating the requirements for Q we may make use of P. For example, in (P) [x : R] S the [x : R] S need not be correct. We may have to apply local beta reduction by means of the pair (P) [x : R], that transforms S into some S′ such that [x : R] S′ is correct. We actually need this feature if we want to interpret an Automath book as a correct lambda tree.

5.2. Subdivided lambda trees

In order to facilitate the formulation of correctness, we introduce a particular kind of lambda trees, where the points are colored red, white and blue. We consider a quadruple (V, lab, p, q), where (V, lab) is a lambda tree, and p, q are non-negative integers. Every u ∈ V is a word of r's and l's, and by nr(u) we denote the number of r's it starts with. So nr(ε) = 0, nr(rr) = 2, nr(rrlr) = 2, etc. The points u with nr(u) ≤ p are called red, those with p < nr(u) ≤ q white, those with nr(u) > q blue. The points ε, r, rr, rrr, ... are called main line points. We shall call (V, lab, p, q) a subdivided lambda tree if (i) and (ii) hold:
(i) The white main line points all have label A.
(ii) Among the red main line points there are no two consecutive labels A, and the last one in the red sequence ε, r, rr, ... has label T.
In other words, the sequence can be partitioned into groups of length 1 and 2, those of length 1 have label T, and those of length 2 consist of two consecutive points with labels A and T, respectively. It is an easy consequence of (i) and (ii) that the set of blue points is nonempty. Note that the conditions are automatically satisfied if p = q = 0. In other words, any lambda tree is a subdivided lambda tree if we color it all blue. In the language of character strings a subdivided lambda tree looks like RWB, where W is a (possibly empty) string (P1) ... (Pk) (where k = q − p), and R is a string with entries either of the form [x : Q] or of the form (P) [x : Q]. The red part R might be called a knowledge frame, the white part W a waiting list. In order to clearly indicate the subdivision we write the character string as

(R, W, B) .
5.3. The definition of correctness

Let Slam3 be the set of all subdivided lambda trees. It can be presented as a set of triples (R, W, B).
Generalizing Automath (B.7)
329
We shall define a subset Corr3 of Slam3. The elements of Corr3 are called the correct elements of Slam3. A lambda tree (V, lab) is called correct if (V, lab, 0, 0) ∈ Corr3. We note that (V, lab, 0, 0) equals (ε, ε, B) if the character string B represents (V, lab). As always, ε stands for the empty string, and we shall use the obvious notations for concatenation of character strings. We start by putting a set of triples (R, W, B) into Corr3, in rule (i); the other rules produce new triples on the basis of old ones.

(i) If (R, ε, τ) ∈ Slam3 then (R, ε, τ) ∈ Corr3.
(ii) If x is a dummy, if (R, W, x) ∈ Slam3, and if (R, W, typ x) ∈ Corr3, then (R, W, x) ∈ Corr3. We have not defined typ x separately in this paper (it would not be a lambda tree but part of a lambda tree). But we can define (R, W, typ x) as the subdivided lambda tree that represents (V', lab', p, q), where (V', lab') = typ(V, lab), and (V, lab, p, q) is represented by (R, W, x).

(iii) If (R, ε, K) ∈ Corr3 and (R, W (K), B) ∈ Corr3 then (R, W, (K) B) ∈ Corr3.

(iv) If (R, ε, U) ∈ Corr3 and (R [x : U], ε, B) ∈ Corr3, then (R, ε, [x : U] B) ∈ Corr3.

(v) If (R, ε, U) ∈ Corr3, (R (K) [x : U], W, B) ∈ Corr3, and if TP(R, K, U) holds, then (R, W (K), [x : U] B) ∈ Corr3. Here TP stands for "type property", and TP(R, K, U) is the statement that if (R, ε, K) and (R, ε, U) represent (V, lab, p, p) and (V', lab', p, p), respectively, then (V', lab') is equivalent to typ(V, lab).

We remark that the conditions about Slam3 in rules (i) and (ii) guarantee that indeed Corr3 is a subset of Slam3. It may seem strange that in rule (i) there is no correctness requirement on R. Therefore we cannot claim that the correctness of (R, W, B) implies the correctness of RWB. Nevertheless it can be shown that if we algorithmically check the correctness of a correct lambda tree (see Section 5.4), we will never enter into cases (R, W, B) where (ε, ε, RWB) is not correct, and the conditions on Slam3 in (i) and (ii) will always be satisfied.

5.4. Algorithmic correctness check
For every ( R ,W,B ) E Slam3 at most one of the rules (i)-(v) can be applied, and, apart from rule (i), these replace the question of the correctness by one or more uniquely defined other questions. If none of the rules can be applied
we conclude to incorrectness. Those "other questions" are all about correctness again, apart from the TP(R, K, U) arising in (v). This provides us with an algorithm for the task of the correctness check for a given lambda tree. We can think of the job as having been split into two parts:

(i) Preparing a type check list. This means that we do not answer the questions about the TP(R, K, U)'s with the various R, K, U turning up, but just put them on a list of jobs that still have to be done. The fact that all degrees are finite (see Section 3.4) guarantees that this job list is made in a finite number of steps.

(ii) Establishing truth or falsity of the various TP(R, K, U)'s.

The work under (i) can already lead to the conclusion that our lambda tree is incorrect. If we forget about syntactic errors that arise if we are presented with a structure that is not a lambda tree at all, this only happens in cases where we get to (R, W, τ) with W ≠ ε, where none of our rules apply.

5.5. Remarks about the type check list

The type check list can be prepared if we systematically apply the rules (i)-(v). In each one of the rules (iii), (iv), (v) there are two subgoals where something has to be shown to belong to Corr3. There are good reasons to tackle these subgoals in the order in which they are mentioned in the rules. This comes down to a lexicographical traversal of the lambda tree we have to investigate. This traversal can occasionally be interrupted by some application of rule (ii), which leads to an excursion in an extended tree.

The type check list prepared by the algorithm hinted at in 5.4 can lead to some duplication of work, by two causes:

(i)
The given lambda tree can have one and the same substructure at various places. This will actually occur quite often if we represent an Automath book as a lambda tree.
(ii) Application of rule (iv) of Section 5.3 leads us into asking questions about typ x that have already been answered before.

The duplications mentioned in (ii) can be avoided to a large extent: see Section 5.8. We mention a shortcut that reduces the work needed to prepare the type check list. It is obtained by splitting rule (ii) of Section 5.3 into (ii') and (ii''):

(ii') is as (ii), but with the restriction W ≠ ε, and
(ii'') if (R, ε, x) ∈ Slam3 (where x is a dummy), then (R, ε, x) ∈ Corr3.
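The recursive reading of rules (i)-(v), together with the job list of Section 5.4, can be sketched in code. This is a minimal sketch of my own: the tuple encoding of character strings, the names, and the treatment of typ x as a simple lookup in the knowledge frame R are all simplifying assumptions, not de Bruijn's definitions. As in step (i) of Section 5.4, TP questions are queued rather than decided.

```python
# Lambda trees as nested tuples:
#   ('tau',)            the end point tau
#   ('var', x)          a dummy x
#   ('app', K, B)       the string (K) B
#   ('abs', x, U, B)    the string [x : U] B
# R is the knowledge frame (entries ('abs', x, U) or ('pair', K, x, U)),
# W the waiting list; TP jobs are collected instead of being answered.

def typ(R, x):
    """Look up the type attached to dummy x in the knowledge frame."""
    for entry in reversed(R):
        if entry[-2] == x:            # entry ends with (..., x, U)
            return entry[-1]
    return None

def check(R, W, B, jobs):
    kind = B[0]
    if kind == 'tau':                                 # rule (i)
        return W == []
    if kind == 'var':                                 # rule (ii), simplified
        U = typ(R, B[1])
        return U is not None and check(R, W, U, jobs)
    if kind == 'app':                                 # rule (iii)
        _, K, B2 = B
        return check(R, [], K, jobs) and check(R, W + [K], B2, jobs)
    if kind == 'abs':
        _, x, U, B2 = B
        if not check(R, [], U, jobs):
            return False
        if W == []:                                   # rule (iv)
            return check(R + [('abs', x, U)], [], B2, jobs)
        K = W[-1]                                     # rule (v)
        jobs.append(('TP', K, U))                     # postponed type check
        return check(R + [('pair', K, x, U)], W[:-1], B2, jobs)
    return False

jobs = []
# (tau) [x : tau] x  -- a redex whose abstractor binds x of type tau
tree = ('app', ('tau',), ('abs', 'x', ('tau',), ('var', 'x')))
print(check([], [], tree, jobs), len(jobs))   # correct; one TP question queued
```

The failure case of Section 5.4 shows up exactly as described: a (R, W, τ) with W ≠ ε makes rule (i) fail, and no other rule applies.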
5.6. Remarks about the type checks

The type checks TP(R, K, U) were introduced in 5.3 (vi). Given R, K, U, we can consider the question to establish by means of an algorithm whether TP(R, K, U) is true or false. The question comes down to establishing whether the (V, lab) of 5.3 (vi) has a type (which is simply a matter of degree) and whether (V', lab') and typ(V, lab) have a common reduct. It is quite easy to design an algorithm that does a tree search of all reducts of (V', lab') and typ(V, lab). If they do have a common reduct, that fact will be established in a finite time. But will "finite" be reasonably small here? And what if they do not have a common reduct? Are we able to establish that negative fact in a finite time, or at least in a reasonable time? And what if the tree search does not terminate?

From a theoretical point of view we can say that our questions about the correctness of a given lambda tree are decidable. For the system Λ this was already shown by R. Nederpelt ([Nederpelt 73 (C.3)], [van Daalen 80]), for Δλ by L.S. van Benthem Jutting (oral communication). It is done in two steps:

(i) Between the notion of "lambda tree" and "correct lambda tree" there is a notion "norm-correct lambda tree". For any given lambda tree it can be established in a finite time whether it is or is not norm-correct. For the notion of norm-correctness we refer to Section 5.9. In [Nederpelt 73 (C.3)] the term "normable" was used instead of "norm-correct".

(ii) For every norm-correct lambda tree we have the strong normalization property: there exists a number N (depending on the tree) such that no sequence of reductions is longer than N.

As to (ii) we note that if we have reduced both (V', lab') and typ(V, lab) to a point where no further reductions are possible, then the question becomes trivial: in that case, having a common reduct just means being equal.
The strong normalization property guarantees that the question whether a given lambda tree is or is not correct is a decidable question.
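The "reduce to normal form and compare" idea can be illustrated in code. The sketch below is my own transplantation to ordinary untyped lambda terms in de Bruijn-index notation, not to de Bruijn's lambda trees, and it terminates only under the strong normalization assumption of 5.6 (ii):

```python
# Terms: ('var', n), ('lam', body), ('app', f, a), with de Bruijn indices.
# Under strong normalization we may simply reduce both terms to normal
# form and compare for syntactic equality.

def shift(t, d, cutoff=0):
    """Raise by d the indices of the free variables of t."""
    if t[0] == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if t[0] == 'lam':
        return ('lam', shift(t[1], d, cutoff + 1))
    return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, n, s):
    """Substitute s for variable n in t."""
    if t[0] == 'var':
        return s if t[1] == n else t
    if t[0] == 'lam':
        return ('lam', subst(t[1], n + 1, shift(s, 1)))
    return ('app', subst(t[1], n, s), subst(t[2], n, s))

def normalize(t):
    """Leftmost beta normalization; terminates on strongly normalizing terms."""
    if t[0] == 'lam':
        return ('lam', normalize(t[1]))
    if t[0] == 'app':
        f = normalize(t[1])
        if f[0] == 'lam':                       # contract the beta redex
            return normalize(shift(subst(f[1], 0, shift(t[2], 1)), -1))
        return ('app', f, normalize(t[2]))
    return t

def common_reduct(a, b):
    return normalize(a) == normalize(b)

I = ('lam', ('var', 0))                          # identity
t1 = ('app', I, ('lam', ('app', I, ('var', 0))))
print(common_reduct(t1, I))                      # both normalize to I → True
```

Without the strong normalization guarantee, normalize may loop forever, which is exactly the worry about non-terminating tree searches raised in 5.6.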
5.7. Practical standpoint

Apart from the cases of very small trees, the matter of decidability of correctness will not be of practical value: the number N mentioned in 5.6 (ii) will usually be prohibitively large. If a tree is incorrect, the finite time it takes to establish that fact may be hopelessly long. It is better to be more modest, and to try to design algorithms with efficient strategies, by means of which we can
show the correctness of the lambda trees we have to deal with in practice. If such algorithms are applied to an incorrect lambda tree, the fact that they have used an unreasonable amount of time without having reached a decision may be considered as an indication that the tree is possibly incorrect. Sometimes we can apply quite easy checks by means of which an incorrect tree can be rejected fast: it might fail to be norm-correct, or might be no lambda tree at all. Or we might run into cases where the type of some (V, lab) is required but where deg(V, lab) = 1.

5.8. Avoiding double work
We can rearrange the definition of correctness in such a way that it leads to an algorithm that gives just a single type check corresponding to each A-node in the lambda tree we have to check the correctness of. If we just follow the algorithm sketched in Section 5.4, the cases where we have to treat (R, W, typ x) will cause double work: what is involved in typ x has been dealt with earlier in the execution of the algorithm. The only thing that deserves to be checked is whether the A-nodes in W match with the λ-nodes that arise from typ x (possibly after one or more further applications of rule (ii)).

Let us divide the waiting list into two consecutive parts. The first part is still called "white", the second part is called "yellow". For the yellow part the work load will be lighter than for the white part. A formal definition of these four-colored lambda trees is easily obtained by slight modification of Section 5.2. We have to consider (V, lab, p, s, q); the points with p < nτ(u) ≤ s are white, those with s < nτ(u) ≤ q yellow. And the yellow main line points are required to have label A, just like the white ones. Let us denote the set of these four-colored lambda trees by Slam4. Its elements will be represented as (R, W, Y, B), just like those of Slam3 were represented by (R, W, B).

We now formulate a new definition of correctness of lambda trees, equivalent to the old one. The difference is that the new definition leads to an algorithm that avoids the duplication we hinted at. It involves both a subset Corr3 of Slam3 and a subset Corr4 of Slam4. The final goal is as before: (V, lab) is called correct if its character string P is such that (ε, ε, P) ∈ Corr3. As rules we take (i), (iii), (iv), (v) as in Section 5.3, but we add new rules (vi)-(xii), where (vii) replaces the discarded rule (ii):

(vi)
If (R, ε, ε, τ) ∈ Slam4 then (R, ε, ε, τ) ∈ Corr4.
(vii)
If x is a dummy, and (R, W, ε, x) ∈ Corr4 then (R, W, x) ∈ Corr3.
(viii) If x is a dummy, if (R, W, Y, x) ∈ Slam4, and if (R, W, Y, typ x) ∈ Corr4, then (R, W, Y, x) ∈ Corr4. The definition of (R, W, Y, typ x) is similar to the one of (R, W, typ x) in 5.3 (ii).

(ix)
If (R, W, Y (K), B) ∈ Corr4 then (R, W, Y, (K) B) ∈ Corr4.
(x)
If (R [x : U], ε, ε, B) ∈ Corr4 then (R, ε, ε, [x : U] B) ∈ Corr4.
(xi)
If (R (K) [x : U], W, ε, B) ∈ Corr4 and TP(R, K, U) holds, then (R, W (K), ε, [x : U] B) ∈ Corr4.
(xii) If (R (K) [x : U], W, Y, B) ∈ Corr4 then (R, W, Y (K), [x : U] B) ∈ Corr4.

At the end of Section 5.5 we mentioned the shortcut rule (ii'). There is a similar shortcut here: it can replace (vi) and (x):

(x') If (R, ε, ε, B) ∈ Slam4 then (R, ε, ε, B) ∈ Corr4.

However, the set of rules without shortcuts may be better for theoretical purposes.

5.9. Weaker notions of correctness
We can weaken the notion of correctness by weakening the requirement about TP(R, K, U) in rule 5.3 (v). If in rule 5.3 (v) we omit the requirement of TP(R, K, U) altogether, we get what we can call semicorrectness. For semicorrect lambda trees we can define a norm corresponding to Nederpelt's norm for the system Λ (see [Nederpelt 73 (C.3)]). A norm is a particular kind of lambda tree: it has no labels A and all end-point labels are τ. To every semicorrect lambda tree we attach such a norm. It can be defined algorithmically if we just follow the list of Section 5.8. First we define the norms of the (R, W, B)'s and (R, W, Y, B)'s:

(i) and (vi): as norms of (R, ε, τ) and (R, ε, ε, τ) we take the lambda tree consisting of just one node, labeled τ.

(iii): norm(R, W, (K) B) = norm(R, W (K), B).

(iv) and (x): as norm of (R, ε, [x : U] B) (or of (R, ε, ε, [x : U] B)) we take the lambda tree with root labeled τ, whose left-hand subtree is norm(R, ε, U) (or norm(R, ε, ε, U)), and whose right-hand subtree is norm(R [x : U], ε, B) (or norm(R [x : U], ε, ε, B)).

(v): norm(R, W (K), [x : U] B) = norm(R (K) [x : U], W, B).

(vii): norm(R, W, x) = norm(R, W, ε, x).

(viii): norm(R, W, Y, x) = norm(R, W, Y, typ x).

(ix): norm(R, W, Y, (K) B) = norm(R, W, Y (K), B).
(xi): norm(R, W (K), ε, [x : U] B) = norm(R (K) [x : U], W, ε, B).

(xii): norm(R, W, Y (K), [x : U] B) = norm(R (K) [x : U], W, Y, B).

The norms of (R, W, B) or (R, W, Y, B) are actually a kind of norm for WB or WYB; the role of R is only to provide the types of the dummies. Finally the norm of a semicorrect lambda tree (V, lab) is defined as the norm of the all-blue lambda tree (V, lab, 0, 0) (which has the form (ε, ε, B)).

We can use a similar algorithm for finding the degree of a lambda tree: we just say in cases (i) and (vi) that the degree is 1, and in cases (ii) and (viii) that the degree is to be increased by 1.

Next we can define the notion of norm-correct lambda trees. We get that notion by replacing in rule 5.3 (v) the condition that (V', lab') is equivalent to typ(V, lab) by the condition that (V', lab') has the same norm as (V, lab). This condition is weaker than TP(R, K, U), and therefore every correct lambda tree is also norm-correct. For norm-correct lambda trees we have the strong normalization property (see Section 5.6).

5.10. Norms for lambda trees which are not necessarily semicorrect

If (V, lab) is a lambda tree which is not semicorrect, that fact is established by the algorithm of Section 5.8 at some moment where we get to (R, W, τ) or (R, W, Y, τ) with W or Y non-empty. For such lambda trees we can nevertheless still define the norm, by the procedure of Section 5.9, if we just extend the action in cases (i) and (vi) by saying that (R, W, τ) and (R, W, Y, τ) have the single-noded tree (labeled by τ) as their norm, also in cases where W or Y are not empty.
6. AUTOMATH BOOKS AS LAMBDA TREES
6.1. Some characteristics of Automath
We shall not explain Automath in detail here: we assume that the reader knows it from other sources (like [de Bruijn 70a (A.2)], [de Bruijn 71 (B.2)], [de Bruijn 73b], [de Bruijn 80 (A.5)], [van Benthem Jutting 77], [van Daalen 80]). In particular, we shall not try to be very precise in defining particular brands of Automath. Nevertheless we indicate a few characteristics, in order to get to the kind of Automath that corresponds to Δλ. For a discussion that compares various forms of Automath in the light of such characteristics we refer to [de Bruijn 74a].

(i)
Automath books are written as sequences of lines: primitive lines, ordinary
lines (= definitional lines), and context lines that describe the contexts of the other lines.

(ii) We have the notion of typing, and that leads to the degrees. In standard Automath the only degrees are 1, 2 and 3, and it seems that for the description of mathematics no serious need for higher degrees ever turned up.

(iii) There are restrictions on abstraction. Contexts may be described as [x1 : A1] ... [xn : An] where the Ai may have degree 1 or 2, but in expressions (also in the Ai's of the contexts) we only admit abstractors [x : A] where A has degree 2.

(iv) In Automath we have instantiation: if the identifier p is the identifier of a line in a context of length n, then the "instantiations" p(E1, ..., En), where the Ei are expressions, can be admitted in other contexts.

(v)
In some of the Automath languages (like AUT-QE) we admit "quasi-expressions": expressions of degree 1 which are not just τ.
(vi) In some of the Automath languages we have type inclusion: if E : [x1 : A1] ... [xn : An] τ then we admit that E is substituted at places where a typing [x1 : A1] ... [xk : Ak] τ (with some k < n) is required.

6.2. Automath without type inclusion

We can take Automath with quasi-expressions but without type inclusion (AUT-QE-NTI). Both AUT-QE-NTI and AUT-68 are sublanguages of AUT-QE: we might say that in AUT-68 type inclusion is prescribed, in AUT-QE it is optional, in AUT-QE-NTI it is forbidden. In [de Bruijn 78c (B.4)] it was pointed out that AUT-QE-NTI can be used as a language for writing mathematics, somewhat lengthier than in AUT-QE. One might say that sacrificing type inclusion has to be paid for by means of a number of extra axioms. But there is a disadvantage to type inclusion: type inclusion makes language theory considerably harder. The rules in AUT-QE-NTI are simple; we just mention that whenever A : B in the context [x : U], and U has degree 2, then [x : U] A : [x : U] B in empty context.

6.3. AUT-LAMBDA
In AUT-QE-NTI we still had restrictions: (i) the degrees are 1, 2 or 3, and (ii) in expressions abstractors [x : U] are allowed only if U has degree 2.
If we give up these restrictions, we get what we can call liberal AUT-QE-NTI. In liberal AUT-QE-NTI the role of instantiation can be taken over completely by abstractors and applicators. In order to make this clear we take a simple example: f := A : B in context [x : U]. According to the liberal abstraction rules we can write a new line F := [x : U] A : [x : U] B in empty context. Next, the instantiation f(E) is equivalent to (E) F, so we can just replace the line f := A : B by the new one in empty context, and abolish the instantiation. Carrying on, we get books without instantiation, all written in empty context. Such books can be considered as having been written in a sublanguage of liberal AUT-QE-NTI; let us call it AUT-LAMBDA. In AUT-LAMBDA there are just two kinds of lines: (i) primitive lines
f := 'prim' : P, and (ii) definitional lines g := Q : R.
(Note: 'prim' was written as PN in other Automath publications.)

6.4. Turning AUT-LAMBDA into Δλ
We shall turn a book in AUT-LAMBDA into a lambda tree by a system that turns correct AUT-LAMBDA into correct lambda trees. It almost works in the opposite direction too, but it turns out that Δλ is a trifle stronger than AUT-LAMBDA. The difference lies in some cases of what was mentioned in Section 5.1, but the difference is so small and unimportant that it seems to be attractive to modify the definition of AUT-LAMBDA a tiny little bit, in order to make the correspondence complete.

The transition is simple. We turn the identifiers of a book in AUT-LAMBDA into dummies. To a line f := 'prim' : P we attach the abstractor [f : P], to a line g := Q : R we attach the applicator-abstractor pair (Q) [g : R]. We do this for each line of the book, and put the abstractors and applicator-abstractor pairs into a single string, and we close it off by τ. So to a book
    f := 'prim' : P
    g := Q : R
    k := V : W
    h := 'prim' : Z

there corresponds the string

    [f : P] (Q) [g : R] (V) [k : W] [h : Z] τ
and this corresponds to a lambda tree.
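The translation of Section 6.4 is mechanical enough to sketch in code. The helper below is hypothetical (the representation of a book as triples is my own, not the paper's):

```python
# A sketch of the Section 6.4 translation: an AUT-LAMBDA book becomes one
# character string, and hence a lambda tree.  A line is represented as
# (identifier, definition_or_None, category); 'prim' lines carry None.

def book_to_string(book):
    parts = []
    for ident, definition, category in book:
        if definition is None:                    # primitive line
            parts.append(f"[{ident} : {category}]")
        else:                                     # definitional line
            parts.append(f"({definition}) [{ident} : {category}]")
    parts.append("tau")                           # close off with tau
    return " ".join(parts)

book = [("f", None, "P"), ("g", "Q", "R"), ("k", "V", "W"), ("h", None, "Z")]
print(book_to_string(book))
# → [f : P] (Q) [g : R] (V) [k : W] [h : Z] tau
```

Running this on the example book above reproduces the displayed string.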
6.5. Checking algorithms

If we start from an AUT-LAMBDA book, transform it into a lambda tree as in Section 6.4, and apply the checking algorithm of Section 5.4, then we have the advantage that the AUT-LAMBDA book is checked line by line. So even if the book is incorrect as a whole, the first k lines can still be correct, and the algorithm can establish that fact. The same thing holds if we take the weaker correctness notions discussed in Section 5.9.

6.6. Type inclusion

If we want to add the feature of type inclusion to AUT-LAMBDA, the transition of a book to a lambda tree can no longer be made in the same way. Moreover we need essential changes in the notion of typing in Δλ.

6.7. A variation of Δλ
We mention a modification of the definition of correctness of a lambda tree, obtained by considering different kinds of A-nodes. Let us divide the set of all A-nodes of a lambda tree into two classes: strong ones and weak ones. We take it as a rule that whenever a part of a tree is copied (like in the definition of typing) the copies of weak nodes are weak again, and the copies of strong nodes are strong. For the weak A-nodes the rules are as in Section 5.3, but for the strong ones we modify rule 5.3 (iii) by not just requiring that (R, ε, K) and (R, W (K), B) are in Corr3, but also (R, ε, B).

In connection to what was said in Section 5.1 we might say that Λ corresponds to the case where all A-nodes are taken to be strong, and that Δλ is the case where all A-nodes are weak. The case mentioned in Section 6.4 lies between these two: if we want to close the gap between AUT-LAMBDA and Δλ we have to make all main line A-nodes weak and all others strong. If we replace weak A-nodes by strong ones, a correct lambda tree may turn into an incorrect one, but it can be expected to become correct again by reductions.
Lambda calculus extended with segments (B.8)
Chapter 1, Sections 1.1 and 1.2 (Introduction)
H. Balsters

1. INTRODUCTION

The λ-calculus is concerned with axiomatizing the mathematical concept of function and the rules governing the application of functions to values of their arguments. In the λ-calculus a function is seen as a rule for calculating values; this is a view which differs from the one held in set theory, where a function is taken to be a set of ordered pairs and is identified with its graph. In axiomatizing the concepts of function and application we define (i) a syntax, consisting of a set of grammar rules, and (ii) inference rules.

The λ-calculus to be described in this section, called λσ, is an extension of the ordinary type free λ-calculus (cf. [Barendregt 84a]) and was originally conceived by N.G. de Bruijn (cf. [de Bruijn 78a]). The main feature of λσ is the incorporation of a new class of terms called segments. These segments were originally devised in order to provide for certain abbreviational facilities in the mathematical language Automath. Automath is a typed λ-calculus in which it is possible to code mathematical texts in such a way that the correctness of each proof written in Automath can be verified mechanically (i.e. by a computer). There is much to say about the Automath system, much more than the topic of this thesis aims to cover. We shall mainly treat λσ as an interesting extension of the λ-calculus in its own right and not pay very much attention to connections with Automath. This thesis will be a rather technical treatise of the syntax and axiomatics of λσ-theory. For an introduction to the Automath project we refer to [de Bruijn 80 (A.5)] and [van Benthem Jutting 81 (B.1)]; the latter paper offers an excellent introduction to a fundamental Automath-language called AUT-68. For a detailed treatise of the language theory of the Automath-languages we refer to [van Daalen 80].

This introduction consists of three sub-sections.
In Section 1.1 we shall give an informal description of the λσ-system and pinpoint major differences with
ordinary type free λ-calculus (for a very complete and up-to-date description of type free λ-calculus we refer to [Barendregt 84a]). Section 1.2 contains an informal description of the λτσ-system (λσ extended with types). The types in λτσ are an extension of the types in Church's theory of simple types (cf. [Church 40]), the extension being that simple types are constructed for segments and segment variables.

1.1. An informal introduction to the λσ-system
In this section we shall give an informal description of a system called λσ. We shall offer some explanation for the motivation behind the system and show in which way λσ is an actual extension of ordinary type free λ-calculus. We start with a simple system called λV.
1.1.1. The system λV

The system λV is the well-known type free λ-calculus as described in [Barendregt 81], although there are some slight deviations in notation. In λV functional abstraction is denoted by λx(...) (i.e. the function that assigns (...) to the variable x, where x may occur in (...)), and functional application is denoted by δAB (i.e. the function B applied to its argument A, where A and B are λV-terms). Note that in λV arguments are written in front of functions, this in contrast with ordinary type free λ-calculus where application of a function B to its argument A is usually written as B(A). The syntax of λV is very simple and is given below.
Definition 1.1.1.

(1) λV-terms are words over the following alphabet:

    v1, v2, v3, ...    variables
    λ                  abstractor
    δ                  applicator

(2) The set of λV-terms is the smallest set X satisfying

    (i) x ∈ X, for every variable x
    (ii) A ∈ X ⟹ λx A ∈ X, for every variable x
    (iii) A, B ∈ X ⟹ δAB ∈ X.
As will be clear, λV-terms are written in prefix notation: each variable has arity 0, each abstractor λx has arity 1 and the applicator δ has arity 2. Each term can be represented as a rooted tree. As an example we consider the term

    δ z λx λy δ y x    (4')

which we write in tree form as

    δ — λx — λy — δ — x
    |             |
    z             y
                       (4'')

The correspondence between terms like (4') and trees like (4'') is one-to-one. It certainly helps to think of λV-terms as such trees, and in particular to see operations on terms as operations on their corresponding trees; especially when long terms are involved it is often useful to consider tree representation of terms.
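The prefix notation of Definition 1.1.1 can be made concrete with a small sketch. The tuple encoding and the function name below are assumptions of mine, not part of the text:

```python
# Lambda V terms as rooted trees, following Definition 1.1.1: variables
# have arity 0, an abstractor lambda_x arity 1, the applicator delta
# arity 2.  Terms are nested tuples; ('app', A, B) stands for delta A B.

def prefix(t):
    """Linearize a term tree into the prefix notation of the text."""
    kind = t[0]
    if kind == 'var':                      # arity 0
        return t[1]
    if kind == 'lam':                      # ('lam', x, body), arity 1
        return f"lambda_{t[1]} " + prefix(t[2])
    if kind == 'app':                      # arity 2
        return "delta " + prefix(t[1]) + " " + prefix(t[2])
    raise ValueError(kind)

# the example term (4'): delta z lambda_x lambda_y delta y x
t4 = ('app', ('var', 'z'),
      ('lam', 'x', ('lam', 'y', ('app', ('var', 'y'), ('var', 'x')))))
print(prefix(t4))   # → delta z lambda_x lambda_y delta y x
```

The one-to-one correspondence between terms and trees is visible here: the nested tuple is the tree, and prefix recovers the linear term.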
1.1.2. Beta reduction

In λ-calculus we have the fundamental notion of application. The application of a function B to an argument A is written as δAB. Apart from functional application we have the notion of functional abstraction. As said before, the intuitive meaning of λx(...) is "the function that assigns (...) to the variable x". This is illustrated in the following example (not a λV-term, by the way)

    δ 3 λx(2 * x + 1) = 2 * 3 + 1

i.e., we substitute the number 3 for the variable x in 2 * x + 1. A formula of the form δ A λx B is called a redex. Substitution of A for the free occurrences of x in B is denoted by Σx(A, B). The transition from δ A λx B to Σx(A, B) is called β-reduction. We now proceed by giving a more formal description of substitution.

We recall that an occurrence of a variable x in a term A is called bound in A if this occurrence of x lies in the scope of some abstractor λx in A; otherwise this occurrence of x is called free in A. Note that a variable can occur both free and bound in the same term; as an example consider the two occurrences of the variable x in the following term written in tree form
    δ — λx — δ — y
    |        |
    x        x
Definition 1.1.2. If A is a term and x is a variable and y is a variable with y ≠ x, then we define Σx(A, B) inductively for terms B by

(1) Σx(A, x) = A
    Σx(A, y) = y

(2) Σx(A, λx C) = λx C

(3) Σx(A, λy C) = λy Σx(A, C), if x does not occur free in C, or y does not occur free in A;
    = λz Σx(A, C'), otherwise, where C' is obtained by renaming of all free occurrences of y in C by some variable z which does not occur free in A, C

(4) Σx(A, δCD) = δ Σx(A, C) Σx(A, D).
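Clauses (1)-(4) translate directly into code. The sketch below uses the tuple representation ('var', x), ('lam', x, C), ('app', C, D), where ('app', C, D) stands for δCD; the fresh-name choice in clause (3) is an assumption of mine:

```python
# Sigma_x(A, B) of Definition 1.1.2: substitute A for the free
# occurrences of x in B, renaming bound variables (clause (3)) where a
# free y of A would otherwise be captured.

def free_vars(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'lam':
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    """A variable name not occurring in the given set (my own scheme)."""
    n = 0
    while f"v{n}" in avoid:
        n += 1
    return f"v{n}"

def subst(x, A, B):
    if B[0] == 'var':                              # clause (1)
        return A if B[1] == x else B
    if B[0] == 'app':                              # clause (4)
        return ('app', subst(x, A, B[1]), subst(x, A, B[2]))
    y, C = B[1], B[2]
    if y == x:                                     # clause (2)
        return B
    if x not in free_vars(C) or y not in free_vars(A):
        return ('lam', y, subst(x, A, C))          # clause (3), easy case
    z = fresh(free_vars(A) | free_vars(C))         # clause (3), alpha-renaming
    C1 = subst(y, ('var', z), C)
    return ('lam', z, subst(x, A, C1))

# beta-reducing the redex delta A lambda_x B means forming Sigma_x(A, B):
B = ('lam', 'y', ('app', ('var', 'x'), ('var', 'y')))   # lambda_y delta x y
A = ('var', 'y')                                        # argument with free y
print(subst('x', A, B))   # the bound y is renamed, so A's free y escapes capture
```

The final call exercises the "otherwise" branch of clause (3): without renaming, the free y of A would be bound by λy.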
Most of the four clauses in the definition given above are self-evident, with the possible exception of clause (3). Clause (3) is necessary in order to avoid that free occurrences of the variable y in A get bound by the λy of λy C after substitution, which would otherwise lead to inconsistencies. This renaming of bound variables is known as α-reduction. In our case it is said that λy C α-reduces to λz C'. Usually α-reduction is considered unessential. If α-reduction transforms a term A into A', then A and A' are considered to be equivalent in an informal way. This convention implies that the name of a bound variable is unessential; the "meaning" of a term is considered unaltered after performing an α-reduction on that term. Actually, in the definition of substitution given above, clause (3) does not introduce a proper term but rather an α-equivalence class of terms.

1.1.3. Name-free notation

Renaming of bound variables can sometimes be very cumbersome; proofs involving α-reduction are notoriously tedious. But apart from this we have our own intrinsic reasons to avoid α-reduction. Later on we shall introduce the full λσ-system, an extension of λV. The main feature of λσ is the incorporation of a new class of terms called segments. Segments are discussed in Section 1.1.4. Substitution of segments for their corresponding variables can give rise to a large number of α-reductions, especially when the formulas are long. There is, however, a very simple way to avoid α-reduction. In [de Bruijn 78b], N.G. de Bruijn introduced the concept of nameless dummies; he invented a λ-calculus notation that makes α-reduction superfluous. The idea is that we just write λ instead of λx, λy, ..., and every variable is replaced by a term of the form ξ(n), where n is some positive integer. Each ξ(n) is called a name-free variable and n is called a reference number. The reference number n of a name-free variable ξ(n) determines the λ that binds a specific occurrence of ξ(n) in some term. The procedure is as follows. If the name-free variable ξ(n) occurs in some term t, we first form the tree representation of t. We then descend from ξ(n) towards the root of the tree, and the n-th λ encountered is the λ that binds ξ(n). As an example consider the following name-carrying term in tree representation
    [name-carrying tree display garbled in the source]

The name-free equivalent of this term is

    [display garbled in the source]
Remark. If a reference number n is larger than the number of λ's lying on the path from an occurrence of ξ(n) to the root of the tree in which it occurs, then we can interpret that occurrence as being free.

The use of name-free notation has certain consequences for substitution of λξ-terms (λV-terms written in name-free form), which we now shortly describe. Substitution in a λξ-term t results in the replacement of free occurrences of a certain variable in t by some term u. We could also describe this situation in terms of trees by saying that certain end-points ξ(n) of the tree equivalent t̂ of t have been replaced by some tree û. Consider the following example of such a substitution in a λξ-tree. Let t be the λξ-term

    λ λ δ δ ξ(2) ξ(1) λ λ λ ξ(3)

Its tree t̂ contains a redex, namely

    δ δ ξ(2) ξ(1) λ λ λ ξ(3)

and we can therefore perform a β-reduction on t̂. By β-reducing t̂, the end-point ξ(3) is a candidate for substitution of the sub-tree

    δ ξ(2) ξ(1)

Should we, however, simply replace ξ(3) by this sub-tree, as would have been the case if t̂ had been written in name-carrying form, then this would result in the following tree t̂'

    λ λ λ λ δ ξ(2) ξ(1)

It is immediately clear that the variables ξ(1) and ξ(2) in t̂' refer to completely different λ's than in t̂. This inconsistency is due to the fact that

(i) ξ(1) and ξ(2) are external references in t̂ (i.e., references to λ's to the left of the subterm δ ξ(2) ξ(1));

(ii) after replacement, the variables ξ(1) and ξ(2) in t̂' have two extra λ's on their left.

There is, however, a simple way to resolve this inconsistency: by raising the reference numbers 1 and 2 in ξ(1) and ξ(2) by 2 in t̂', these variables refer to the same λ's that they originally referred to in t̂. This example demonstrates that certain measures have to be taken in order to ensure that external references remain intact when we substitute a λξ-term. In Section 2, where we give a formal definition of substitution of name-free terms, we shall introduce so-called reference mappings, which see to it that reference numbers are suitably updated in order to avoid inconsistencies as described above. We refrain from further discussion of these reference mappings here; they shall be described extensively, both informally and formally, in Section 2.

In the following sections of this chapter we shall first stick to name-carrying notation of formulas. The major reason for this is to point out that name-carrying notation can possibly be maintained in λσ-calculus (λV-calculus extended with segments and segment variables), but we also want to show how awkward things can get in λσ-calculus by employing name-carrying notation. In the case of λV-calculus the name-free notation might seem exaggerated in preciseness, and we can imagine reservations towards this notation as far as readability of formulas is concerned. In the case of λσ-calculus we shall try to
show that the name-free notation has advantages over name-carrying notation, both in preciseness and readability.

1.1.4. Segments and abbreviations

We may consider a variable as an abbreviation of a certain term if this variable can be replaced by that term by means of some suitable β-reduction. For example, consider the following term written in tree form
(tree diagram (5): a term containing a redex whose argument is λx x and whose λz binds the variable z)
By β-reducing (5) we obtain the term
(tree diagram (5′): the term (5) after β-reduction, with z replaced by λx x)
i.e. a term in which the variable z has been replaced by the term λx x and the redex has vanished. If we would have more occurrences of the variable z, each bound by the λz of the redex, then each of these occurrences serves as a kind of abbreviation of the term λx x. In λσ there are, however, still quite different things that we want to abbreviate. One such thing is a so-called δ-string like
(tree diagram (6): the δ-string δ A - δ B - δ C)
If it occurs more than once in a certain term, we may wish to abbreviate it. Yet (6) is not a term, in the sense of a λV-term, but only part of a term; it becomes a λV-term if we place an arbitrary λV-term behind it. Such parts of λV-terms are called segments. Another example of a segment is a so-called λ-string like
(tree diagram (7): the λ-string λx - λy - λz)
In Automath we have many cases where we would like to abbreviate segments. In this respect we mention an interesting Automath language, namely Nederpelt's language Λ (cf. [Nederpelt 73 (C.3)]). The original idea of introducing such a language as Λ stems from N.G. de Bruijn, who devised a language called AUT-SL (from Automath-Single Line) in which Automath texts can be represented as one single formula. The language Λ was devised as a fundamental and simple Automath language which is very well suited for language-theoretical investigation. In typical codings of Automath texts in Λ we encounter very many copies of certain δ-strings and λ-strings, copies which we would like to abbreviate. As a consequence, segments like δ-strings and λ-strings will be treated
H. Balsters
346
as separate independent entities in λσ. In λσ we shall even take a broader approach and allow for segments of a much more general form than δ-strings or λ-strings alone. In the following section we shall give examples of such segments of a more general form.

1.1.5. Segment variables and substitution

Segments are terms with a kind of open end on the extreme right. From now on we shall use the symbol ω to indicate the open end on the right. So
(tree diagram: the δ-string δ A - δ B - δ C - ω)
is a segment, as well as the λ-string

(tree diagram: λx - λy - λz - ω)
As said before, segments are not λV-terms; a segment becomes a λV-term if we replace the ω by an arbitrary λV-term. According to this scheme the following formulas can also be considered as segments:
(tree diagrams: two segments combining δ's and λ's, each ending in ω)
By replacing the ω in both of these formulas by some λV-term we obtain a λV-term (provided, of course, that A and B are λV-terms). In λσ we will go even one step further by allowing recursive nesting of segments, and as a consequence ω's can occur in other branches as well, like in
(tree diagrams: segments with ω's occurring in several branches)
All these occurrences of ω in the foregoing formulas act as a kind of “holes” which, once replaced by a λV-term, yield again a λV-term. All formulas having an ω on the extreme right are called segments in λσ. Along with segments we also add to our system a new kind of variables for which segments can be
substituted. These variables are represented by unary prefix symbols and are denoted, in name-carrying form, by σ, σ′, σ″, ... . An example of a λσ-term containing a segment and a segment variable is
(tree diagram (8): a redex whose argument is the segment λx - λy - λz - ω and whose λσ binds the segment variable σ, followed by x)
This term is in redex form, where the segment variable σ is bound by the λσ of the redex. Performing a β-reduction on this redex results in
(tree diagram (8′): λx - λy - λz - x)
i.e., the prefix symbol σ is replaced by the segment λx λy λz (where the ω has been dropped). In λσ, segment variables can serve as a means to abbreviate segments, just like variables in λV can serve as a means to abbreviate λV-terms. When using segment variables to abbreviate segments we must be careful, though. Consider for example the λσ-term (8). The variable x in that term refers to the abstractor λx hidden inside the segment variable σ, as seen in (8′) where x gets bound by λx after β-reduction of (8). This is an intended feature which we always have to take into account in λσ-calculus. If a segment variable σ occurs in some λσ-term t, then after replacement of σ by the segment s that σ abbreviates in t, it can happen, as most often will be the case, that certain variables occurring in t get captured by abstractors lying on the main branch of the tree representation of s. This is to say that each occurrence of a segment variable σ in a λσ-term t can contain abstractors (hidden inside σ) which will capture certain variables in t after performing a β-reduction in t resulting in the replacement of σ by the segment that σ abbreviates in t. We now wish to discuss a situation in which there are more occurrences of the same segment variable σ in some λσ-term. Consider the following λσ-term in tree representation
(tree diagram (9): a term containing a segment and two occurrences of the segment variable σ)
Performing a β-reduction on this term results in
(tree diagram (9′))
where both instances of σ have been replaced by the segment λx λy. The variables x and y in (9′) are bound by the last two abstractors λx and λy, as indicated by the arrows in (9″) shown below
(tree diagram (9″): (9′) with arrows from x and y to the last two abstractors λx and λy)
Suppose, however, that we would want x and y to be bound by other occurrences of the abstractors λx and λy, as indicated in

(tree diagram (9‴): (9′) with arrows from x and y to other occurrences of λx and λy)
In λσ we want to have the freedom to allow for such deviations in priority of binding power of λ's, which appear when we have more than one occurrence of some segment variable in a λσ-term. One way of doing this is by renaming the abstractors in (9′) in a suitable way. It is clear that the variables x and y are then bound by the first two abstractors λx and λy, just as we intended them to be bound in (9‴). This renaming, however, is done after substitution has taken place; i.e. the renaming takes place after β-reduction of (9) to (9′). What we would like is that it can be seen beforehand (i.e. before β-reduction takes place) how the abstractors inside segments shall be renamed. We would like to have a means of systematically indicating beforehand how this renaming of bound variables shall take place, instead of more or less arbitrarily renaming bound variables in segments after β-reduction. One way of doing this is by replacing the first, respectively the second, occurrence of σ in (9) by σ(x, y), respectively σ(x1, y1). These parameter lists (x, y) and (x1, y1) serve as instructions indicating that the abstractors λx and λy are to be renamed by λx and λy, respectively by λx1 and λy1, in the first, respectively the second, occurrence of σ in (9) (actually only in the second occurrence of σ does a real renaming take place). In general, if a segment has n (n ≥ 0) λ's lying on the main branch of its tree, say λx1, ..., λxn, and σ is a segment variable referring to that segment, then by adding a parameter list (y1, ..., yn) to σ we have an instruction indicating that the n abstractors λx1, ..., λxn are to be renamed by λy1, ..., λyn, in that order. Also the occurrences of the variables x1, ..., xn in the segment which were bound by λx1, ..., λxn are to be renamed by y1, ..., yn. We note that it is important that the parameter list added to a segment variable σ has as its length the number of λ's lying on the main branch of the segment s that σ refers to (this number is called the weight of s). By adding parameter lists to segment variables we have a means to bind occurrences of variables to a λ hidden inside a segment exactly as we desire. There is still one problem, though, that we have to resolve.
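The renaming prescribed by a parameter list can be sketched as follows (a minimal Python sketch; the representation of a segment's main-branch binders and body occurrences as lists of names is our own illustration, not notation from the text):

```python
def rename_segment(binders, body_vars, params):
    """Rename the main-branch binders of a segment according to a
    parameter list, as sigma(y1, ..., yn) prescribes.  binders are the
    names bound by the n lambdas on the main branch (left to right);
    body_vars are the variable occurrences in the segment's body."""
    if len(params) != len(binders):
        # the parameter list must have the weight of the segment as its length
        raise ValueError("parameter list length differs from segment weight")
    ren = dict(zip(binders, params))
    # rename the binders and every occurrence bound by one of them
    return params[:], [ren.get(v, v) for v in body_vars]
```

For the second occurrence of σ in (9), for instance, the instruction σ(x1, y1) renames the binders x, y of the abbreviated segment to x1, y1, together with the occurrences they bind.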
When performing a β-reduction inside a segment we are sometimes dealing with redices which, in
the substitutional process involved, have an effect on the ω on the extreme right of that segment. Consider, for example, the following segment
(tree diagram (10): a segment whose abstractors occur in the order λx, λz, λv, λw, containing the redex δ A λz, with end-point ω)
By β-reducing the redex δ A λz λv λw ω occurring in (10) we are faced with evaluating Σz(A, λv λw ω). By the clauses given in Definition 1.1.1 we know how to “shift” the Σz-operator past the two abstractors λv and λw, but then we arrive at the ω and have to decide how to evaluate Σz(A, ω). We could simply define Σz(A, ω) as ω, but then certain vital information would get lost; a situation which we now explain. Suppose that (10) occurs as a segment in some term t and that (10) is referred to by some segment variable σ(y1, y2, y3, y4) occurring in t. Suppose also that there is an occurrence of the variable y2 in t which refers to the abstractor λz hidden inside σ(y1, y2, y3, y4). By β-reducing (10) and defining Σz(A, ω) as ω, this occurrence of y2 is no longer a candidate for substitution of the term A (which would have been the case prior to this β-reduction of (10)), simply because the abstractor λz (or better: λy2) has vanished. In order to avoid inconsistencies and to keep this candidate-role of substitution for such occurrences of variables y2 intact, we shall define such substitutions of a term A at an end-point ω of a segment by
Σz(A, ω)  =  δ A - λz - ω .
In this way it remains possible to refer to the λz of the original redex in (10), and occurrences of variables which referred indirectly to that lambda by means of a reference to a lambda hidden inside some segment variable remain candidates for substitution of the term A. There is still a problem, though, because the order of the λ's in the reduced segment is different from the order in which they appeared in the original segment. In our example, β-reduction of (10) results in
λx - λv′ - λw′ - δ A - λz - ω        (10′)
where v and w have possibly been replaced by new variables v′ and w′, in case free occurrences of v or w in A would otherwise have been captured. The abstractors in (10) appear in the order λx, λz, λv, λw, and in (10′) the order is λx, λv′, λw′, λz. This difference has consequences when these segments are substituted for some occurrence of a variable σ(y1, y2, y3, y4). Consider, for example, the following two terms in which the segments (10), respectively (10′), occur
(tree diagram (11): a term in which the segment (10) is bound to the segment variable σ(y1, y2, y3, y4), followed by y2)
and
(tree diagram (11′): the same term with the segment (10) replaced by (10′))
These terms β-reduce to
λy1 - δ A′ - λy2 - λy3 - λy4 - y2        (12)
and

λy1 - λy2 - λy3 - δ A′ - λy4 - y2        (12′)
where A′ is obtained from A by renaming all free occurrences of x by y1. In (12) we see that A′ can be substituted for y2 by performing one more β-reduction; this is, however, not the case in (12′). So by changing the order of the λ's in some segment s by performing a β-reduction inside s we can get the situation that occurrences of variables that originally (i.e. prior to this β-reduction of s) referred to a certain λ hidden inside some parameter-listed segment variable afterwards refer to a completely different λ. There is a way, however, in which such inconsistencies can be resolved. By adding an extra parameter, called a segment mapping (or segmap for short), to an ω we can safely β-reduce a segment prior to substitution of that segment. A segmap is a permutation of some interval [1..n] of ℕ (n ≥ 0), and tells us how to restore the original order of the λ's occurring in a segment; i.e. by adding a segmap to the ω on the extreme right of a segment we can determine the order in which the abstractors occurred before β-reduction of the original segment. Instead of writing ω we now write ω(ψ), where ψ is some segmap. In our example we replace the ω on the extreme right of (11′) by ω(ψ), where ψ is a permutation of [1..4] defined by
ψ(1) = 1 ,  ψ(2) = 3 ,  ψ(3) = 4 ,  ψ(4) = 2 .
Let us denote this modification of (11′) by (11″). If we rearrange the order of the parameter list (y1, y2, y3, y4) in accordance with ψ (i.e. the first parameter remains first in the list, the second becomes the third, the third becomes the fourth and
, most importantly, the fourth parameter becomes the second in the list) then we obtain a new parameter list (y1, y3, y4, y2). By replacing the parameter list (y1, y2, y3, y4) in (11″) by this new parameter list (y1, y3, y4, y2) we obtain the following modified version of (11″)

(tree diagram: (11″) with the segment variable σ(y1, y3, y4, y2) and end-point ω(ψ))
which β-reduces to
λy1 - λy3 - λy4 - δ A′ - λy2 - y2        (12″)
and we see that all occurrences of variables in (12) and (12″) refer to the same λ's, just as we wanted. By adding parameter lists and segmaps we can take care of problems concerning references to λ's hidden inside segment variables in a suitable way. We shall now attempt to give a more formal description of substitution of a segment for a segment variable. We shall present this definition in name-carrying form, in order to show that name-carrying notation can be maintained in principle, but that employment of name-free notation provides for a more natural (and certainly more concise) means for dealing with substitution of segments for segment variables.
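The rearrangement of the parameter list by ψ in this example can be sketched as follows (a Python sketch under our own encoding of segmaps as dictionaries; the rearrangement rule is read off from the worked example above):

```python
def rearrange(params, segmap):
    """Rearrange a parameter list according to a segmap psi:
    position i of the new list receives parameter number psi(i)."""
    return [params[segmap[i] - 1] for i in range(1, len(params) + 1)]

# the segmap psi of the example: psi(1)=1, psi(2)=3, psi(3)=4, psi(4)=2
psi = {1: 1, 2: 3, 3: 4, 4: 2}
```

Here rearrange(['y1', 'y2', 'y3', 'y4'], psi) yields ['y1', 'y3', 'y4', 'y2'], the parameter list used in the modified version of (11″).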
Definition 1.1.3. Let Aω(ψ) be a segment with weight n (n ∈ ℕ ∪ {0}), let ψ be a permutation of [1..n] and let B be a term. Substitution of Aω(ψ) for σ(y1, ..., yn) in σ(y1, ..., yn)B is defined by

(i) (ii) (iii)

where id(n) denotes the identity map on [1..n], (y′1, ..., y′n) is the result of rearranging (y1, ..., yn) as indicated by ψ, and A′ is the result of suitable renaming of bound variables in A as indicated by (y′1, ..., y′n). This definition is still rather vague, since we have not defined Σ(y1,...,yn)(Aω(ψ), B), and also because such descriptions as “rearrangement of a parameter list as indicated by a segmap” and “suitable renaming of bound variables in a term as indicated by a parameter list” can hardly be considered as descriptions with formal status. The transition from (ii) to (iii) is also a bit strange,
since it is not clear from (ii) alone how the segmap ψ in (iii) suddenly turns up again. Apparently, this is not a very good definition, since it is too vague; but, as mentioned before, this definition was only intended as an attempt towards a formal definition. A precise formal definition of substitution for segment variables can of course be given, but such a definition would be rather involved. There is a more elegant and shorter way to define substitution for segment variables, namely by employing name-free notation for segments and segment variables. This notation is described in the following section.
1.1.6. Name-free notation for segments and segment variables

There is another way of dealing with references to λ's hidden inside segment variables than attaching parameter lists to segment variables, namely by employing name-free notation. What we shall do is the following. Segment variables are written in name-free form as σ(n, m), where n denotes the reference number of σ (which, like in ξ(n), determines the λ that some specific occurrence of σ(n, m) refers to) and m (m ≥ 0) denotes the number of λ's lying on the main branch of the tree representation of the segment that σ(n, m) intends to abbreviate (the number m is also called the weight of σ(n, m)). The number m in σ(n, m) is to play the role of a parameter list in name-carrying notation; i.e. m indicates that there are m λ's hidden inside σ(n, m). As an example of a term in name-free notation containing a segment and a segment variable, consider the following term written in tree form
(tree diagram: a term containing a segment with three λ's and the internal reference ξ(1), the segment variable σ(1, 3), and the variables ξ(5) and ξ(2))
In this term we see that σ(1, 3) abbreviates a segment with three λ's lying on the main branch of its tree; so when determining the λ that ξ(5) refers to we descend from ξ(5) towards the root of the tree, subtract 3 from 5, subsequently subtract 1 and see that ξ(5) refers to the first λ (from the left) of the tree. The variable ξ(2) refers to the second λ (from the right) hidden inside σ(1, 3); ξ(2) is thus bound by the second λ (from the right) of the segment
(tree diagram: the segment, with its three λ's, the applicator δ with argument ξ(1), and end-point ω).
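The resolution rule just described (while descending towards the root, an ordinary λ counts as one binder and a segment variable σ(·, m) as m hidden binders) can be sketched as follows; the encoding of the root path as a Python list is our own illustration:

```python
def resolve(n, path):
    """Resolve the reference number n of xi(n) against the root path,
    given innermost binder first: 'lam' counts as one lambda, a pair
    ('sigma', m) as the m lambdas hidden inside a segment variable.
    Returns the index on the path of the binding entry (0 = nearest),
    or None if the reference escapes the term (a free variable)."""
    for i, entry in enumerate(path):
        width = 1 if entry == 'lam' else entry[1]
        if n <= width:
            return i
        n -= width
    return None
```

For ξ(5) in the example, the path passes σ(1, 3) and one λ before reaching the outermost λ: resolve(5, [('sigma', 3), 'lam', 'lam']) returns 2, the first λ (from the left) of the tree, while resolve(2, [('sigma', 3)]) returns 0, a λ hidden inside the segment variable.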
By employing name-free notation we get a concise way of denoting segment variables and can do without attaching (potentially long) parameter lists to these variables. There is still one problem, though; a problem which we discussed earlier on in the name-carrying version of λσ-calculus, which dealt with the performance of certain β-reductions inside segments prior to substitution of those
segments for their respective segment variables. By performing a β-reduction inside a segment, the order in which certain λ's originally occurred in that segment can be disturbed and, as we have seen earlier, this can lead to problems when we substitute the reduced segment for certain occurrences of segment variables in a term in which that segment occurs. We solved those problems by adding segmaps to the ω's on the extreme right of the segments involved, and we shall do so again in the name-free version of λσ. We now shortly describe substitution of segments for segment variables, and we shall give this description in an informal manner in terms of trees. The tree representation of a segment has an ω(ψ), where ψ is some segmap, on the extreme right of its main branch. When we substitute a segment we remove the ω(ψ) and put the remaining tree fragment in the place of some occurrence of a segment variable in a λσ-tree. Segment variables occur in λσ-trees as unary nodes, and substitution of segments for segment variables thus gives rise to replacements at unary nodes inside a λσ-tree (which differs completely from λξ-substitutions, where we could only perform replacements at end-nodes of trees). When such a substitution is performed, we again, as in the case of λξ-substitutions, have to be careful and update external references in order to ensure that these references remain intact after substitution. But not only do we have to update external references when we substitute a segment for a corresponding segment variable, we also have to take into account the effect of the segmap ψ attached to the end-point ω of the segment involved, since such a segmap reallocates references to λ's lying on the main branch of the segment which we want to substitute. We now give an example to demonstrate both of these features. Consider the following example of a λσ-tree containing a segment and a segment variable
(tree diagram t̂: a λσ-tree containing a segment with end-point ω(ψ), the segment variable σ(3, 2), and the variables ξ(3) and ξ(1))
where ψ is the permutation of [1..2] defined by ψ(1) = 2 and ψ(2) = 1. This tree, which we shall refer to as t̂, contains a redex, namely
(tree diagram: the redex of t̂)
and we can therefore perform a β-reduction on t̂. By β-reducing t̂, the unary node σ(3, 2) is a candidate for substitution of the sub-tree

(tree diagram: the segment sub-tree of t̂, with end-point ω(ψ))
Should we simply replace σ(3, 2) by the tree fragment

(tree diagram: the segment with its ω(ψ) removed)

then this would result in the following tree t̂′

(tree diagram t̂′)
It is immediately clear that the variables ξ(1) and ξ(3) refer to different λ's than they originally referred to in t̂. The variable ξ(3) is an external reference in t̂ and, as in the case of λξ-substitutions, has to be suitably updated whenever the segment in which ξ(3) occurs is substituted for some segment variable. The variable ξ(1) in t̂ refers to one of the two λ's hidden inside σ(3, 2); it seems to refer to the first λ (from the right) lying on the main branch of the segment involved, but the segmap ψ reallocates this reference to the second λ (from the right). This means that correct β-reduction of t̂ would result in the following tree t̂″
(tree diagram t̂″: the correctly updated tree, in which ξ(3) has become ξ(5) and ξ(1) has become ξ(2))
In Section 2 we shall give a formal definition of substitution of λσ-terms. In this definition we shall use so-called reference mappings, which see to it that reference numbers are suitably updated, as in our example in the transition from t̂ to t̂″. These reference mappings (or refmaps for short) and their interaction with λσ-terms are described extensively in Section 2, and we refrain from further discussion of refmaps here. The employment of name-free notation and segmaps makes it possible to give a formal definition of substitution of segments for segment variables in a very concise way, as we shall see in Section 2. In previous examples describing how substitution of segments for segment variables can take place we have restricted ourselves to rather simple situations. Our formal treatment of such substitutions, however, will take much more involved situations into account. Our formal definition of substitution will take into consideration certain accumulative effects which can occur when segments contain references to other segments, or even λ's which bind segment variables.

1.2. An introduction to the typed system λτσ

In this section we shall give a description of the λσ-system extended with types for terms. The types in λτσ are a generalization of the types described in
Church's theory of simple types (cf. [Church 40]), the extension being that simple types are constructed for segments and that the description is given in name-free notation. The basic ideas for our description are taken from [de Bruijn 78a]. We shall start from a name-carrying calculus without segments, which, basically, is Church's system of simple types, called λτV. We then gradually move on to a system in which operations on types are made more explicit and in which the name-free notation is incorporated. Finally, we shall describe the full λτσ-system by offering, in name-free notation, a typing of segments. The definitions offered in this section will be followed by explanatory remarks.
Definition 1.2.1 (λτV).

(1) Type symbols (T)
The set of type symbols T is the smallest set X such that
(i) e, @ ∈ X ;
(ii) α, β ∈ X\{@} ⇒ (αβ) ∈ X .

(2) Primitive symbols
The set of primitive symbols consists of
(i) variables: xα, yα, zα, ...  (α ∈ T\{@}) ;
(ii) the symbols λ (abstractor) and δ (applicator).

(3) Terms (λτV)
The set of terms λτV is the smallest set X such that
(i) xα ∈ X, for every variable xα ;
(ii) t ∈ X ⇒ λ xα t ∈ X, for every variable xα ;
(iii) u, v ∈ X ⇒ δuv ∈ X .

(4) Types of terms
The function typ on λτV is defined inductively for terms t by
(i) typ(xα) = α ;
(ii) typ(λ xα u) = (αβ) if typ(u) = β ≠ @, and @ otherwise ;
(iii) typ(δuv) = β if typ(u) = α ≠ @ and typ(v) = (αβ), and @ otherwise .

(5) The set of correct terms (λτV)
λτV = { t ∈ λτV | typ(t) ≠ @ } .
Remarks.
(1) e is some ground type; @ is to be interpreted as the type of terms which are “incorrectly” typed.
(2) (αβ) is to be interpreted as the type of those terms which map terms of type α to terms of type β.
(3) If typ(t) = α then α is generally of the form (α1(α2(... (αn αn+1) ...))), where α1, ..., αn+1 are types. Speaking in terms of trees, this means that there are n abstractors λ xα1, ..., λ xαn lying on the main branch of the tree representation t̂ of t (and in that order) that cannot be removed by some β-reduction in t; i.e. for each abstractor λ xαi there is no matching δ (or rather: δ Ai) such that this δλ-pair can be removed by means of a suitable sequence of β-reductions.
Before giving the next definition we introduce some notation concerning sequences. For an elaborate treatment of sequences we refer to Section 2.1. At this stage it is only important to know that a sequence is seen as a function with some interval [1..n] of ℕ (n ≥ 0) as its domain, where n will be the length of the sequence in question.
Notation. Let Σ be some non-empty set (called an alphabet).
- Σ* denotes the set of sequences over Σ (including the empty sequence, denoted by ∅ (the empty set)).
- if c ∈ Σ then (c) denotes the sequence of length 1 consisting of the “symbol” c.
- if F, G ∈ Σ* then F & G denotes the concatenation of the sequences F and G; in particular, if F is a sequence of length n (n ≥ 0) then F = (F(1)) & (F(2)) & ... & (F(n)).
- if F ∈ Σ* then F̄ denotes the reversed sequence of F, i.e. if F = (F(1)) & (F(2)) & ... & (F(n)) then F̄ = (F(n)) & ... & (F(2)) & (F(1)).
In the following definition we offer an alternative version of λτV in which operations on types are made more explicit.
Definition 1.2.2 (λτγV).

(1) Types (Tγ)
The set of types Tγ is the smallest set X such that
(i) @ ∈ X ;
(ii) F ∈ (X\{@})* ⇒ γ(F) ∈ X .

(2) Primitive symbols
The set of primitive symbols consists of
(i) variables: xf, yf, zf, ...  (f ∈ Tγ\{@}) ;
(ii) the symbols λ (abstractor) and δ (applicator).

(3) Terms (λτγV)
The set of terms λτγV is the smallest set X such that
(i) xf ∈ X, for every variable xf ;
(ii) t ∈ X ⇒ λ xf t ∈ X, for every variable xf ;
(iii) u, v ∈ X ⇒ δuv ∈ X .

(4) Types of terms
The function γ-typ on λτγV is defined inductively for terms t by
(i) γ-typ(xf) = f ;
(ii) γ-typ(λ xf u) = γ((f) & G) if γ-typ(u) = γ(G) for some G ∈ (Tγ\{@})*, and @ otherwise ;
(iii) γ-typ(δuv) = γ(G) if γ-typ(u) = f and γ-typ(v) = γ((f) & G) for some f ∈ Tγ\{@} and G ∈ (Tγ\{@})*, and @ otherwise .

(5) The set of correct terms (λτγV)
λτγV = { t ∈ λτγV | γ-typ(t) ≠ @ } .
Remarks.
(1) We note that the symbol γ is of no particular interest in itself, and the reason for introducing it is basically historical in nature. In [de Bruijn 78a] types of λτγ-terms (i.e. non-segments) were called “green” types, whereas types of segments were called “red” types. The symbol γ has been chosen for the construction of the type of a λτγV-term purely for mnemonic reasons. In Definition 1.2.5 (λτσ) we shall construct types of segments, and these types will be of the form ρ(F, G, H). Here the symbol ρ is used in the construction of types of segments, again, purely for mnemonic reasons.
(2) γ(∅) is the analogue of the ground type e in Definition 1.2.1.
(3) γ((f) & G) is the type of those terms which map terms of type f to terms of type γ(G) (cf. clause (4)(ii) above).
(4) In terms of trees, if γ-typ(t) = γ((f1) & ... & (fn)), then this means that there are n abstractors λ xf1, ..., λ xfn lying on the main branch of the tree representation t̂ of t that cannot be removed by means of a suitable sequence of β-reductions in t (cf. comment (3) in the remarks on Definition 1.2.1).
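The γ-typ clauses translate mechanically into the same style of sketch (our own conventions: a γ-type γ(F) is modelled as the tuple F, so γ(∅) is the empty tuple, and None again plays the role of @):

```python
def gtyp(t):
    """gamma-typ of Definition 1.2.2 for terms ('var', f) | ('lam', f, u)
    | ('app', u, v); variable types f are themselves gamma-types (tuples)."""
    if t[0] == 'var':
        return t[1]
    if t[0] == 'lam':
        g = gtyp(t[2])
        # gamma-typ(lambda x_f u) = gamma((f) & G) if gamma-typ(u) = gamma(G)
        return (t[1],) + g if g is not None else None
    f = gtyp(t[1])
    g = gtyp(t[2])
    # gamma-typ(delta u v) = gamma(G) if gamma-typ(u) = f and
    # gamma-typ(v) = gamma((f) & G)
    if f is not None and g is not None and len(g) > 0 and g[0] == f:
        return g[1:]
    return None
```

With e = () for the ground type γ(∅), the identity ('lam', e, ('var', e)) receives the type (e,), i.e. γ((γ(∅))), and applying it to a ground argument gives back e.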
In the following definition we go one step further and introduce a new type-constructor π which takes two arguments, both sequences of types. We recall that γ(F) denotes the type of those terms with n abstractors lying on the main branch of their corresponding trees (we assume that F is a sequence (f1) & ... & (fn) of length n) that cannot be removed by suitable β-reductions. In the case of segments, however, we can also have terms with applicators lying on the main branch of their tree representations which cannot be removed by means of suitable β-reductions. When we write π(F, G), where F and G are sequences of types (f1) & ... & (fn) and (g1) & ... & (gm), respectively, then F denotes the sequence of n non-removable abstractors and G denotes the sequence of m non-removable applicators. We also introduce a product operation “*” between π-types and γ-types with which we can calculate types of terms. We note that terms in the system λτπγV, defined below, are never typed as π-types; π-types in λτπγV are only used as intermediate constructs for calculating the eventual type (a γ-type) of a term. When we calculate the type of a λτπγ-term t we first calculate the type of a beginning part of that term (such a beginning part is a segment and will thus have a π-type as its type); say that this results in the π-type π(F, G). Then we calculate the type of the remaining part of t (which is not a segment and thus has a γ-type as its result type); say that this remaining part of t has type γ(H). The product π(F, G) * γ(H) will result in the eventual type of t. With the interpretation of π(F, G) as the type of a beginning part of a term with F as the sequence of non-removable λ's and G as the sequence of non-removable δ's, Definition 1.2.3 should not be too hard to understand. After this definition we shall give an example of calculating the type of a λτπγV-term.
Definition 1.2.3 (λτπγV).

(1) Quasi-types (Tπ)
The set of quasi-types Tπ is defined as

Tπ = { π(F, G) | F, G ∈ (Tγ\{@})* } .

(2) Products of quasi-types and types (*)
Let F, G and H be elements of (Tγ\{@})*. The product of a quasi-type and a type is defined as follows.

(4) Types of terms
The function πγ-typ on λτπγV is defined inductively for terms t by
(i) πγ-typ(xf) = f ;
(ii) πγ-typ(λ xf u) = π((f), ∅) * πγ-typ(u) ;
(iii) πγ-typ(δuv) = π(∅, (πγ-typ(u))) * πγ-typ(v) .

(5) The set of correct terms (λτπγV)
A simple example of calculating the πγ-type of a λτπγV-term
Consider the following term t

λ xf δ xg λ xg xh

and assume that h = γ(H), where H is some element of (Tγ\{@})*. According to the rules given in Definition 1.2.3, the type of t is calculated as follows

πγ-typ(λ xf δ xg λ xg xh)
= π((f), ∅) * πγ-typ(δ xg λ xg xh)
= π((f), ∅) * π(∅, (g)) * πγ-typ(λ xg xh)
= π((f), ∅) * π(∅, (g)) * π((g), ∅) * πγ-typ(xh)
= π((f), ∅) * π(∅, (g)) * π((g), ∅) * γ(H)
= π((f), ∅) * π(∅, (g)) * γ((g) & H)
= π((f), ∅) * γ(H)
= γ((f) & H)
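The calculation can be mimicked in the same tuple encoding (a sketch; since the source's product clauses are not reproduced here, the two rules used below, cancelling a pending applicator type against the head of a γ-type and prefixing the remaining abstractor types, are reconstructed from this worked example only):

```python
def prod(p, h):
    """Product pi(F, G) * gamma(H), with the quasi-type pi(F, G) modelled
    as the pair (F, G) of tuples and gamma(H) as the tuple H."""
    lams, dels = p
    if h is None:
        return None
    # cancel pending applicator types against leading member types of gamma(H)
    while dels and h and dels[-1] == h[0]:
        dels, h = dels[:-1], h[1:]
    return lams + h if not dels else None

def pityp(t):
    """pi-gamma-typ, following clauses (4)(i)-(iii) of Definition 1.2.3."""
    if t[0] == 'var':
        return t[1]
    if t[0] == 'lam':
        return prod(((t[1],), ()), pityp(t[2]))        # pi((f), 0) * ...
    return prod(((), (pityp(t[1]),)), pityp(t[2]))     # pi(0, (pityp u)) * ...
```

For the example term λ xf δ xg λ xg xh, pityp returns (f,) + H, i.e. γ((f) & H), in agreement with the derivation above.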
and this result is indeed as expected: as mentioned earlier in comment (3) on Definition 1.2.2, γ((f) & H) is to be interpreted as the type of those terms which map terms of type f to terms of type γ(H), and clearly t is a term of that type. Also note that t β-reduces to the term λ xf xγ(H), which, as expected, also has type γ((f) & H). The systems λτV, λτγV and λτπγV are, though different in their respective descriptions, essentially equivalent in the sense that the expressive power of each of these systems is exactly the same. The reason for deviating from the notations and constructs employed in the original system λτV is that we eventually want to give a description of a typing mechanism for λτσ, a simple-typed version of the name-free system λσ. In λτσ we shall construct a completely new kind of types, called ρ-types, for segments. What will be shown is that the employment of π-types, γ-types and the *-operation provides for not only an exact but also a concise description of a typing mechanism for segments and segment variables written in name-free notation. We now proceed by defining a typed version λτξ of the name-free system λξ. Types in λτξ are elements of Tγ. In order to calculate a type of a name-free term in λτξ we introduce the concept of a ξ-context, denoted by τ, which is a sequence of elements of Tγ\{@}.
Definition 1.2.4 (λτξ).

(1) Terms (λτξ)
The set of terms λτξ is the smallest set X satisfying
(i) ξ(n) ∈ X, for every n ∈ ℕ ;
(ii) t ∈ X ⇒ λf t ∈ X, for every f ∈ Tγ\{@} ;
(iii) u, v ∈ X ⇒ δuv ∈ X .

(2) ξ-type contexts (τ)
A ξ-context τ is an element of (Tγ\{@})*. (Note that a type context τ is a function of the form τ : [1..length(τ)] → Tγ\{@}.)

(3) The typing function ξ-typ
Let τ be a ξ-type context. The function ξ-typ is defined inductively for λτξ-terms t by
(4) The set of correct terms
Let τ be a ξ-type context. The set of correct λτξ-terms with respect to τ is

{ t ∈ λτξ | ξ-typ(t, τ) ≠ @ } .

Remarks.
(1) In λτξ we just write λf, λg, λh, ... instead of λ xf, λ xg, λ xh, ... (the names of variables are dropped).
(2) The type of an occurrence of a variable ξ(n) in a λτξ-term t is found as follows. First we form the tree representation t̂ of t; then we descend from that occurrence of ξ(n) in t̂ towards the root of the tree, and the n-th lambda, say λf, is the lambda that binds this occurrence of ξ(n); the type f attached to this lambda is the type of ξ(n). (If the total number of λ's encountered on the root path of this occurrence of ξ(n) is less than n (implying that this occurrence of ξ(n) is free) then the type context will see to it that this occurrence of ξ(n) is suitably typed.)
(3) The correspondence between name-carrying terms in λτπγV and name-free terms in λτξ is as follows. If t is a λτπγV-term not containing free occurrences of variables then we have the following correspondence

πγ-typ(t) = ξ-typ(t̃, ∅)
where t̃ denotes the name-free equivalent of t. If t contains free occurrences of variables then we have the correspondence
where the ξ-context τ is such that it is of sufficient length and sees to it that all free occurrences of variables in t̃ are typed in the same way as they were typed in t. We now move on to the definition of the full λτσ-system by constructing types for segments. In order to do so we introduce a new kind of types, called ρ-types, for segments. A ρ-type has three parameters and is written as ρ(F, G, H), where F, G and H are sequences of γ- and, possibly, ρ-types. The extra parameter H has a purely administrative function; intuitively H is the sequence of all types attached to the λ's, including those hidden inside segment variables, lying on the main branch of the tree representation of the segment in question. The sequences F and G have the same meaning as before in the case
H . Balsters
of the quasi-type σ(F, G), namely the sequence of non-removable λ's and the sequence of non-removable δ's, respectively. We need such an extra parameter H in ρ(F, G, H) in order to determine the type of those variables which refer to a λ hidden in a segment variable, a situation which we now explain. Suppose that we have a λτσ-term t in which we have a segment sω(ψ) and an occurrence of a segment variable σ(n, m) which abbreviates sω(ψ) in t. From σ(n, m) we see that sω(ψ) has m (m ≥ 0) λ's lying on the main branch of its tree representation: these m λ's are hidden inside σ(n, m) and they can be referred to by variables in t occurring to the right of σ(n, m). In order to be able to type those variables which refer to one of the λ's hidden inside σ(n, m) we inspect the third parameter H of the type, say ρ(F, G, H), of sω(ψ). Suppose that the m λ's lying on the main branch of the tree representation of sω(ψ) occur in the order λ_h1, ..., λ_hm; then H shall be the sequence (hm) & (hm-1) & ... & (h1). If a variable in t refers to the i-th (0 ≤ i ≤ m) λ (from the right) hidden inside σ(n, m) then it will be typed by the i-th member h_i of H. Our definition of λτσ will also take into account the reallocational effects that segmaps ψ have on references to λ's lying on the main branch of the segments in question. We now give our definition, which at first sight might be a bit hard to understand. We shall give an example of calculating the type of a λτσ-term which should help clarify the rules stated in Definition 1.2.5. We note that the construct σ(F, G), given below, is the same construct σ(F, G) as in Definition 1.2.3: it is an intermediate construct used for evaluating the product of a number of types in order to evaluate the eventual type of a term (including segments), which is either a γ-type or a ρ-type (but never a σ-type).
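Before turning to the full definition, the variable-typing rule described in Remark (2) above can be read as a small lookup procedure. The following sketch is my own (the function name and encoding are not Balsters'): descending from an occurrence of ξ(n) to the root, one collects the types attached to the λ's passed; if fewer than n λ's are passed, the type context supplies the type.

```python
def type_of_occurrence(path_types, n, context):
    """Type of an occurrence of xi(n) in a name-free term.

    path_types: types attached to the lambdas passed when descending from
                the occurrence towards the root, innermost lambda first.
    context:    the type context tau, with context[0] playing tau(1).
    Returns None (playing the empty type) when the occurrence gets no type.
    """
    if n <= len(path_types):
        # bound occurrence: the n-th lambda on the root path binds xi(n)
        return path_types[n - 1]
    k = n - len(path_types)
    # free occurrence: the type context sees to it that xi(n) is typed
    return context[k - 1] if k <= len(context) else None

print(type_of_occurrence(["g", "f"], 2, []))      # bound by the outer lambda
print(type_of_occurrence(["g", "f"], 3, ["c1"]))  # free, typed by the context
```

For an occurrence lying below λ_g which in turn lies below λ_f, ξ(2) is bound by λ_f, while ξ(3) runs off the tree and is typed by the first member of the context.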
Definition 1.2.5 (λτσ).
(1) Types (T)
The set of types T is the smallest set X satisfying
(i) ∅ ∈ X;
(ii) ∀F ∈ (X\{∅})* : γ(F) ∈ X;
(iii) ∀F, G, H ∈ (X\{∅})* : ρ(F, G, H) ∈ X.
(Note that γ(∅) ∈ X and ρ(∅, ∅, ∅) ∈ X.)
(2) Quasi-types (Tq)
The set of quasi-types Tq is defined as

{σ(F, G) | F, G ∈ (T\{∅})*} .

(3) Products of quasi-types and types (*)
Let F, G, H, I and J be elements of (T\{∅})*. The product of a quasi-type and a type is defined as follows
σ(F, G) * γ(H) =
  γ(F & I) ,  if H = G & I for some I ∈ (T\{∅})* ;
  ∅ ,         otherwise .

σ(F, G) * ρ(H, I, J) =
  ρ(F & K, I, J) ,  if H = G & K for some K ∈ (T\{∅})* ;
  ρ(F, K & I, J) ,  if G = K & H for some K ∈ (T\{∅})* ;
  ∅ ,               otherwise .

(4) Terms (Λτσ)
The set of Λτσ-terms is the smallest set X satisfying
(i) ξ(n) ∈ X, for every n ∈ IN;
(ii) if ψ is a segmap then ω(ψ) ∈ X;
(iii) if u ∈ X and f ∈ T\{∅} then λ_f u ∈ X;
(iv) if u ∈ X then σ(n, m)u ∈ X, for every n ∈ IN and m ∈ IN ∪ {0};
(v) if u, v ∈ X then δuv ∈ X.
(5) Type contexts
A type context is an element of (T\{∅})* .
(6) The typing function (typ)
Let τ be a type context. The function typ is defined inductively for Λτσ-terms t by
(i) typ(ξ(n), τ) = τ(n), if n ∈ dom(τ) and τ(n) is a γ-type; ∅ otherwise.
(iii) typ(λ_f u, τ) = σ((f), ∅) * typ(u, (f) & τ);
(iv) typ(σ(n, m)u, τ) = σ(F, G) * typ(u, H & τ), if n ∈ dom(τ) and τ(n) is a ρ-type of the form ρ(F, G, H), where H is a sequence of length m; ∅ otherwise;
(v) typ(δuv, τ) = σ(∅, (typ(u, τ))) * typ(v, τ).
(7) The set of correct terms
Let τ be a type context. The set of correct Λτσ-terms with respect to τ is

{t ∈ Λτσ | typ(t, τ) ≠ ∅} .

We now give a further explanation of the rules stated in Definition 1.2.5, and we shall do so by means of a non-trivial example in which all of the features for calculating γ- and ρ-types are incorporated. In this example we shall employ the following notation conventions:
f1 * f2 * ... * fn-1 * fn = f1 * (f2 * ... * (fn-2 * (fn-1 * fn)) ...)   (association to the right)

(f1, f2, ..., fn) = (f1) & (f2) & ... & (fn) .
Consider the following term t, given in the original in tree form:

t = λ_f δ(λ_g δξ(2) λ_h λ_i δξ(1) ω(ψ))(λ_j σ(1,3) ξ(1)) ,

where f, g, h, i and j are certain elements of T\{∅} and ψ is a permutation of the interval [1..3] defined by ψ(1) = 2, ψ(2) = 3 and ψ(3) = 1. According to the rules given in Definition 1.2.5, the type of t with respect to the empty context ∅ is calculated, step by step, as follows:

typ(t, ∅) =
= σ((f), ∅) * typ(δ(λ_g δξ(2) λ_h λ_i δξ(1) ω(ψ))(λ_j σ(1,3) ξ(1)), (f)) =
= σ((f), ∅) * σ(∅, (typ(u, (f)))) * typ(λ_j σ(1,3) ξ(1), (f)) ,

where u is the segment λ_g δξ(2) λ_h λ_i δξ(1) ω(ψ). First we evaluate typ(u, (f)):

typ(u, (f)) =
= σ((g), ∅) * σ(∅, (typ(ξ(2), (g, f)))) * typ(λ_h λ_i δξ(1) ω(ψ), (g, f)) =
= σ((g), ∅) * σ(∅, (f)) * σ((h), ∅) * σ((i), ∅) * σ(∅, (i)) * ρ(∅, ∅, (h, g, i)) =

(note that the composition of the sequence (i, h, g, f) with the segmap ψ yields not only a permuted but also a reduced sequence (h, g, i) of (i, h, g, f))

= σ((g), ∅) * σ(∅, (f)) * σ((h), ∅) * σ((i), ∅) * ρ(∅, (i), (h, g, i)) =
= σ((g), ∅) * σ(∅, (f)) * σ((h), ∅) * ρ((i), (i), (h, g, i)) =
= σ((g), ∅) * σ(∅, (f)) * ρ((h, i), (i), (h, g, i)) =
= σ((g), ∅) * ρ((i), (i), (h, g, i)) =

(if f = h, otherwise the product is equal to ∅)

= ρ((g, i), (i), (h, g, i))
and this is indeed as expected: the segment u has two non-removable abstractors (λ_g and λ_i) lying on the main branch of its tree; it has one non-removable applicator with i as the type of its argument; and it has a total number of three abstractors lying on the main branch of its tree, which, due to the reallocational effect of the segmap ψ, are referred to in the order λ_h, λ_g and λ_i (from the right). Now that we have evaluated typ(u, (f)) we can proceed with calculating typ(t, ∅):
typ(t, ∅) =
= σ((f), ∅) * σ(∅, (ρ((g, i), (i), (h, g, i)))) * typ(λ_j σ(1,3) ξ(1), (f)) =
= σ((f), ∅) * σ(∅, (ρ((g, i), (i), (h, g, i)))) * σ((j), ∅) * typ(σ(1,3) ξ(1), (j, f)) =
= σ((f), ∅) * σ(∅, (ρ((g, i), (i), (h, g, i)))) * σ((j), ∅) * σ(F, G) * typ(ξ(1), (h1, h2, h3, j, f)) =

(where j = ρ(F, G, (h1, h2, h3)) for some F, G ∈ (T\{∅})* and h1, h2, h3 ∈ T\{∅} (cf. clause (6)(iv)), otherwise the product is equal to ∅)

= σ((f), ∅) * σ(∅, (ρ((g, i), (i), (h, g, i)))) * σ((j), ∅) * σ(F, G) * h1 =

(if h1 is a γ-type, otherwise the product is equal to ∅)

= σ((f), ∅) * σ(∅, (ρ((g, i), (i), (h, g, i)))) * σ((j), ∅) * γ(F & H1) =

(where h1 = γ(G & H1) for some H1 ∈ (T\{∅})* (cf. clause (3)), otherwise the product is equal to ∅)

= σ((f), ∅) * σ(∅, (ρ((g, i), (i), (h, g, i)))) * γ((j) & F & H1) =
= σ((f), ∅) * γ(F & H1) =

(if j = ρ(F, G, (h1, h2, h3)) = ρ((g, i), (i), (h, g, i)), i.e. if F = (g, i), G = (i), h1 = h, h2 = g, h3 = i, otherwise the product is equal to ∅)

= γ((f) & F & H1) =
= γ((f, g, i) & H1)   (by definition of j)

and this is indeed the expected result: t is a non-segment and therefore its type is a γ-type; if we assume that H = (i) & H1 = (i) & (h1, ..., hn) for certain h1, ..., hn ∈ T\{∅}, then the non-removable abstractors lying on the main branch of the tree representation of t occur in the order λ_f, λ_g, λ_i, λ_h1, ..., λ_hn, since the non-removable abstractors hidden in σ(1,3) are λ_g and λ_i, and the first type i in the sequence (i, h1, ..., hn) is removed because the type of the argument ξ(1) of the last applicator occurring in the segment
is equal to i (remember that the last variable ξ(1) occurring in t has type γ((i, h1, ..., hn)), which means that the first non-removable abstractor of the term that this occurrence of ξ(1) intends to abbreviate would be λ_i, and that this λ_i matches the δξ(1)-part in the segment u). Note also that t β-reduces to the following term
λ_f λ_g δξ(2) λ_h λ_i δξ(1) ξ(2) ,

where we have substituted the segment u for σ(1,3) (the reference number 1 in the last variable ξ(1) in t has been changed to 2 because of the reallocational
effect of the segmap ψ). This new term can be β-reduced once more, resulting in

λ_f λ_g λ_i δξ(1) ξ(3) ,
where we have substituted an updated version of the first occurrence of the variable ξ(2) for the second occurrence of ξ(2) (which was bound by the abstractor λ_h of the redex). The variable ξ(1) in this term has type i, and the variable ξ(3) has type f = h = γ((i, h1, ..., hn)); therefore the type of the whole term is equal to γ((f, g, i) & (h1, ..., hn)), which is the same type as we have calculated for t: an expected result. In general, one would expect the type of a term and its β-reduct to be the same. This property of equality of types for terms and their β-reducts with respect to a certain context is called the closure property. A proof of the closure property for λτσ is given in Chapter 4 of [Balsters 86]. We note that in Chapter 4 we shall also define the product of two quasi-types and furthermore show that this extended version of the *-operation is associative, i.e. (f * g) * h = f * (g * h) for all quasi-types f, g and all quasi-types and types h. Products of quasi-types and the associativity of the *-operation will prove to be useful for facilitating the calculations of types of λτσ-terms.
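The product rules of Definition 1.2.5(3) and the calculation of typ(u, (f)) above can be replayed mechanically. The following Python sketch uses an encoding of my own (tagged tuples for γ-, ρ- and σ-constructs, Python tuples for sequences, None for the empty type ∅, and H = G & K read as "G is an initial segment of H"); it illustrates the rules as reconstructed here, not Balsters' own implementation.

```python
def star(q, t):
    """Product sigma(F, G) * t of a quasi-type q and a type t.

    Encoding (mine, not Balsters'): gamma(F) = ("gamma", F),
    rho(F, G, H) = ("rho", F, G, H), sigma(F, G) = ("sigma", F, G).
    """
    if t is None:
        return None
    _, F, G = q
    if t[0] == "gamma":
        H = t[1]
        if H[:len(G)] == G:                     # H = G & I
            return ("gamma", F + H[len(G):])    # gamma(F & I)
        return None
    H, I, J = t[1], t[2], t[3]                  # t is a rho-type
    if H[:len(G)] == G:                         # H = G & K
        return ("rho", F + H[len(G):], I, J)    # rho(F & K, I, J)
    if len(G) >= len(H) and G[len(G) - len(H):] == H:   # G = K & H
        return ("rho", F, G[:len(G) - len(H)] + I, J)   # rho(F, K & I, J)
    return None

# Replaying typ(u, (f)) for the example segment, with f = h:
i = ("gamma", ())
g = ("gamma", (i,))
f = h = ("gamma", (i, i))

ty = ("rho", (), (), (h, g, i))       # typ(omega(psi), (i, h, g, f))
ty = star(("sigma", (), (i,)), ty)    # delta xi(1)
ty = star(("sigma", (i,), ()), ty)    # lambda_i
ty = star(("sigma", (h,), ()), ty)    # lambda_h
ty = star(("sigma", (), (f,)), ty)    # delta xi(2); needs f = h
ty = star(("sigma", (g,), ()), ty)    # lambda_g
print(ty == ("rho", (g, i), (i,), (h, g, i)))   # True
```

With f ≠ h the same chain yields None, matching the side condition "if f = h, otherwise the product is equal to ∅" in the example.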
[In Chapter 3 of [Balsters 86] a proof of the Church-Rosser property, regarding β-reduction, is given for the system λτσ.]
PART C Theory
A Normal Form Theorem in a λ-Calculus with Types
L.S. van Benthem Jutting
It has long been conjectured that every expression in Automath has a normal form. An unpublished proof of this has been given by L.E. Fleischhacker. Here a proof is presented that in a λ-calculus closely resembling Automath every correct expression has a normal form. The proof proceeds along the lines pointed out by Fleischhacker and uses a norm which is due to Nederpelt. The importance of this theorem is that it makes it possible for us to decide whether two expressions are "equal". In fact, together with the Church-Rosser theorem (see [Curry and Feys 58]) we may deduce that two expressions are "equal" iff they have the same normal form. This helps in proving that correctness of Automath expressions is decidable.
1. DEFINITION OF THE LANGUAGE

We will give here only a very loose definition. A strict definition may be found in [de Bruijn 70a (A.2)]. We discern constants a, b, c, ..., variables x, y, z, ..., the symbol type and various brackets as primitive symbols. For the sake of clarity we will use below also other constants like 1, s and N. Expressions are defined by:
a constant, a variable, the symbol type are expressions; if A and B are expressions, then (A) B and [x : A] B are expressions.

Intuitively expressions may be thought of as denoting objects: (A) B denotes the value of the function B for the argument A; [x : A] B denotes the function associating to every x in the domain A the value B (which may depend upon x). We will call x bound in [x : A] B. We shall discern 3-expressions, 2-expressions and 1-expressions. Intuitively 3-expressions denote "mathematical objects", e.g. the natural number one is denoted by the expression 1, the successor function in the natural numbers may be denoted by s, and the natural number two, being the successor of one, is then denoted by (1) s.
2-expressions denote "classes" to which mathematical objects belong, e.g. the set of natural numbers, denoted by N, or the set of all functions mapping N into N, denoted by [x : N] N. 1-expressions denote "superclasses" to which classes belong, e.g. the superclass of all classes, denoted by type, or the superclass of the classes of mappings of N into some other class, denoted by [x : N] type. Syntactically 1-expressions are those expressions which have type as their last symbol. Now every mathematical object belongs, in our conception, to exactly one class and every class to exactly one superclass. This induces a function γ, called type, mapping 3-expressions into 2-expressions and 2-expressions into 1-expressions. E.g. γ(1) = N, γ(s) = [x : N] N, γ(N) = type etc. It follows that we must discern between the natural number one, with γ(1) ≡ N, and the real number one, denoted by 1*, with γ(1*) = R. It will be clear now that an expression A is either a 1-expression, or A is a 2-expression and then γ(A) is a 1-expression, or A is a 3-expression and then γ(A) is a 2-expression and γ(γ(A)) a 1-expression. The type γ must be thought of as defined on a finite number of constants. It may be extended to a new constant a by defining γ(a) as a certain 2-expression or 1-expression which contains only constants defined before a. In this case a must be thought of as denoting a definite object of the class or superclass denoted by γ(a). We will say that a is a defined constant. On bound variables the type γ is defined, too: in [x : A] B the variable x, which might occur free in B, has type γ(x) ≡ A. Hence A must be a 2-expression or a 1-expression (otherwise the expression [x : A] B is incorrect). On composite expressions γ may be defined recursively. We now give a notation for substitution: the result of substituting the expression A for the variable x in the expression B is denoted by B[x := A]. A definition of substitution we will omit here.
The intuitive meaning of (A) B and [x : A] B leads us to a definition of reduction as follows:
(α) [x : A] B → [y : A] (B[x := y]) if y is not free in B.
(β) (A) [x : B] C → C[x := A].
(η) [x : A] (x) B → B if x is not free in B.
Intuitively the expressions to the right and to the left of → denote the same objects. We extend the relation → to a monotone quasi-order on all expressions, i.e. if A → C and B → D, then (A) B → (C) D and [x : A] B → [x : C] D. Now there are rules according to which it may be decided whether an expression is correct. One of these was mentioned above: in [x : A] B, A should be either a 2-expression or a 1-expression. The main ideas are:
A normal form theorem in a λ-calculus with types (C.1)
(a) A correct expression does not contain free variables or undefined constants.
(b) (A) B is only correct if B denotes a function and A belongs to the domain of that function (i.e. A is not a 1-expression and γ(A) is the domain of B).
(c) The constants should be defined in due order, and for every defined constant a, γ(a) should be correct with respect to the constants defined before.
2. THE NORMAL FORM THEOREM
We say that A is in normal form (in n.f.) if neither A nor any subexpression of A is β- or η-reducible. It follows that if A is in normal form, then

A ≡ [x1 : B1] [x2 : B2] ... [xn : Bn] (D1) ... (Dm) p ,

where n, m are non-negative integers, p denotes a constant, a variable or the symbol type, and B1, ..., Bn, D1, ..., Dm are in n.f. We say that A has a normal form if B in n.f. exists such that A →→ B.

We now introduce the norm τ on expressions as follows:

τ(type) = type ;
τ(a) = τ(γ(a)) for all defined constants a ;
τ(b) = 0 for all undefined constants b ;
τ(x) = τ(A) if x is bound by [x : A] ;
τ(y) = 0 if y is free ;
τ((A) B) = P if τ(A) ≠ 0 and τ(B) = [τ(A)] P for a certain symbol string P, and τ((A) B) = 0 otherwise ;
τ([x : A] B) = [τ(A)] τ(B) if τ(A) ≠ 0 and τ(B) ≠ 0, and τ([x : A] B) = 0 otherwise.
A strong point of this norm is that it is invariant under substitution and reduction:

Theorem 1. If τ(B) ≠ 0 and τ(A) = τ(x) ≠ 0, then τ(B[x := A]) = τ(B).

The proof is easy when substitution is well defined. □
Theorem 2. If τ(A) ≠ 0 and A → B, then τ(B) = τ(A).

We will prove this for β-reduction: Suppose A = (C) [x : D] E and B = E[x := C]. As τ(A) ≠ 0 we know that τ(C) ≠ 0 and τ([x : D] E) = [τ(C)] τ(A). Hence τ([x : D] E) ≠ 0 and it follows that τ([x : D] E) = [τ(D)] τ(E). It follows that τ(D) = τ(C) and τ(E) = τ(A). Moreover, τ(x) = τ(D) because x is bound by [x : D], hence τ(x) = τ(D) = τ(C). Therefore, by Theorem 1, τ(B) = τ(E[x := C]) = τ(E) = τ(A). □

The next theorem is the crucial part in our proof.
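The invariance stated in Theorem 2 can be checked on a small instance. The following sketch is my own encoding (the paper defines τ only on abstract symbol strings; here ("pi", P, Q) plays the role of the string [P]Q, 0 is the failure value, and the constant 1 is spelled "one"):

```python
def norm(e, gamma, env):
    """Nederpelt's norm tau on a toy encoding of expressions.

    e: "type", ("const", c), ("var", x), ("app", A, B) for (A)B,
       or ("abs", x, A, B) for [x:A]B.
    gamma: maps defined constants to their types (expressions).
    env: maps bound variables to the norm of their binding type.
    """
    if e == "type":
        return "type"
    if e[0] == "const":
        return norm(gamma[e[1]], gamma, env) if e[1] in gamma else 0
    if e[0] == "var":
        return env.get(e[1], 0)
    if e[0] == "app":                          # tau((A)B)
        ta = norm(e[1], gamma, env)
        tb = norm(e[2], gamma, env)
        if ta != 0 and tb != 0 and tb != "type" and tb[0] == "pi" and tb[1] == ta:
            return tb[2]
        return 0
    ta = norm(e[2], gamma, env)                # tau([x:A]B)
    tb = norm(e[3], gamma, env | {e[1]: ta})
    return ("pi", ta, tb) if ta != 0 and tb != 0 else 0

# Theorem 2 on the redex (1)[x:N](x)s --> (1)s, with gamma(1) = N,
# gamma(s) = [x:N]N and gamma(N) = type:
gamma = {"N": "type",
         "one": ("const", "N"),
         "s": ("abs", "x", ("const", "N"), ("const", "N"))}
A = ("app", ("const", "one"),
     ("abs", "x", ("const", "N"), ("app", ("var", "x"), ("const", "s"))))
B = ("app", ("const", "one"), ("const", "s"))
print(norm(A, gamma, {}) == norm(B, gamma, {}) == "type")  # True
```

Both the redex and its contractum get the norm type, as Theorem 2 predicts.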
Theorem 3. If A is in n.f. with τ(A) = τ(x) ≠ 0 and B is in n.f. with τ(B) ≠ 0, then C in n.f. exists such that B[x := A] →→ C.

The proof is complicated and proceeds by double induction:
(I) with respect to the length of τ(A),
(II) with respect to the length of B.
The difficulty lies in the case when B = (D) x, because then, by substituting A for x in B, an expression is obtained which is in general not in normal form. □

The next theorem is an easy consequence of Theorem 3.
Theorem 4. If τ(A) ≠ 0, then A has a normal form. □

We now state

Theorem 5. If A is correct, then τ(A) ≠ 0. □

From Theorems 4 and 5 now follows

Theorem 6. If A is correct, then A has a normal form. □
Lambda Calculus Notation with Nameless Dummies, a Tool for Automatic Formula Manipulation, with Application to the Church-Rosser Theorem*
N.G. de Bruijn
ABSTRACT
In ordinary lambda calculus the occurrences of a bound variable are made recognizable by the use of one and the same (otherwise irrelevant) name at all occurrences. This convention is known to cause considerable trouble in cases of substitution. In the present paper a different notational system is developed, where occurrences of variables are indicated by integers giving the "distance" to the binding λ instead of a name attached to that λ. The system is claimed to be efficient for automatic formula manipulation as well as for metalingual discussion. As an example the most essential part of a proof of the Church-Rosser theorem is presented in this namefree calculus.
1. INTRODUCTION
For what lambda calculus is about, we refer to [Barendregt 71], [Church 41] or [Curry and Feys 58], although no specific knowledge will be required for the reading of the present paper. Manipulations in the lambda calculus are often troublesome because of the need for re-naming bound variables. For example, if a free variable in an expression has to be replaced by a second expression, the danger arises that some free variable of the second expression bears the same name as a bound variable in the first one, with the effect that binding is introduced where it is not intended. Another case of re-naming arises if we want to establish the equivalence of two

*Reprinted from: Indagationes Math. 34, 5, p. 381-392, by courtesy of the Koninklijke Nederlandse Akademie van Wetenschappen, Amsterdam.
expressions in those situations where the only difference lies in the names of the bound variables (i.e. when the equivalence is so-called α-equivalence). In particular in machine-manipulated lambda calculus this re-naming activity involves a great deal of labour, both in machine time and in programming effort. It seems to be worth-while to try to get rid of the re-naming, or, rather, to get rid of names altogether. Consider the following three criteria for a good notation:
(i) easy to write and easy to read for the human reader;
(ii) easy to handle in metalingual discussion;
(iii) easy for the computer and for the computer programmer.
The system we shall develop here is claimed to be good for (ii) and good for (iii). It is not claimed to be very good for (i); this means that for computer work we shall want automatic translation from one of the usual systems to our present system at the input stage, and backwards at the output stage. An example showing that our method is adequate for (ii) can be found in Sections 10-12, which present the kernel of a proof for the Church-Rosser theorem. This proof is essentially the one that was given in [Barendregt 71], where it was attributed to P. Martin-Löf (1971). Later private information by Mr. H.P. Barendregt disclosed that the idea is due to W.W. Tait. For a survey of proofs of the Church-Rosser theorem see [Barendregt 71] p. 16-17. An elaborate treatment of the theorem can also be found in [Curry and Feys 58]. What is said about lambda calculus in this paper can be applied directly to other kinds of dummy-binding in mathematics. For example, if we have an expression like the product Π_{k=p}^{q} f(k, m), we can write it as Π(p, q, λk f(k, m)). For any new quantifier we wish to use (like Π here) we have to take a particular symbol that is treated as an element of the alphabet of constants (see Section 3). Application to Automath is explained in Section 13.
2. NOTATION IN METALINGUAL DISCUSSION
If we want to denote a string of symbols by a single ("metalingual") symbol, we have to be very careful, in particular if this procedure is repeated, e.g. if we form mixed strings of lingual and metalingual symbols, represent these by a new symbol, etc. We shall use parentheses ( ) for this purpose. If Φ denotes a string, then Φ is not the string itself. For the string itself we shall use (Φ). We shall say that Φ denotes the string and that (Φ) is the string. Let us consider some
Lambda calculus notation with nameless dummies (C.2)
examples, where the basic lingual symbols are all Latin letters as well as the hyphen. (These examples will show the use of ( ) in nested form, and therefore show that the simple device of using Greek letters on the metalingual level is definitely insufficient.) We shall use the symbol ρ for reversing the order of a string. That is, ρ(pqra) denotes arqp, whence (ρ(pqra)) = arqp. Now let Φ denote the word phi, and let Σ denote the word sigma. Then (Φ)(Σ) is the word phisigma, (Φ)-(Σ) = phi-sigma, (ρ((Σ))) = amgis, and (ρ((ρ((Φ)))(ρ((Σ))))) =
= (ρ(ihpamgis)) = sigmaphi = (Σ)(Φ). The ( )'s of this section are not to be confused with the similar symbols we use in Backus' normal form of a syntax (e.g. in Section 5). In typescript and in handwriting it is convenient to underline a formula instead of putting it in ( )'s. In print, however, underlining, and in particular multi-level underlining, is awkward.
3. NAME-CARRYING EXPRESSIONS
We explain the kind of lambda calculus expressions which we want to turn into namefree expressions. We have a set of "constants" (a, b, c, f, g, ...) and a set of "variables" (s, t, u, v, w, x, ...). And there is the symbol λ that can have any variable as a suffix. Moreover we admit application, of which the following is the interpretation. If Φ denotes a function, and Γ a value of the variable, then (Φ)((Γ)) is the value of the function at (Γ). We shall use a different notation instead: we add a symbol A to the list of constants, and we write A((Φ), (Γ)) instead of (Φ)((Γ)). This puts it on a par with another kind of expression we are going to admit, viz. things like f( , , ), where f is any constant. In the interpretations the latter kind of expression can be very close to what we have just called application, but that does not bother us at the moment. We shall not go into a formal definition of the syntax; the following example (that accidentally does not contain the symbol A at all) will be clear enough. We take the expression

λx a(λt b(x, t, f(λu a(u, t, z), λs w)), w, y) .   (3.1)
4. GETTING RID OF THE NAMES OF VARIABLES
In order to facilitate the discussion, we represent the expression as a planar tree which is easier to read than (3.1) itself.
If in (3.1) we change the names of the bound variables, e.g. x, t, u, s into p, u, s, x, we get an expression that is what is usually called α-equivalent to (3.1):

λp a(λu b(p, u, f(λs a(s, u, z), λx w)), w, y) .
We shall take the simplistic point of view that α-equivalent expressions are the same. Formula (3.1) contains bound variables x, t, u, s and free variables z, w, y. We shall keep a list of letters from which the free variables are to be taken. Let that list be, in this order, z, v, w, y; we draw the points λz, λv, λw, λy under the tree. The variables in the tree (Figure 1) are encircled (unless they occur as a suffix of λ).
Figure 1 (the tree representation of (3.1); each encircled variable carries its reference depth and level, e.g. 1,3 and 2,3 and 7,3 at the occurrences of u, t and z in a(u, t, z)).
For every encircled letter we evaluate two integers which are indicated in the figure, viz. the reference depth and the level. The reference depth of an encircled letter at a certain spot, x say, is the number of λ's we encounter when running down until we meet λx (this λx is counted as one of the encountered λ's). It is agreed that the λz, λv, λw, λy (which do not belong to the tree itself) can also be encountered on our way down, e.g. if we run down to λz we encounter λy, λw, λv, λz.
The level of an encircled variable at a certain spot counts the total number of λ's we encounter when running down the tree until we get to the root (if the root is a λ, like λx here, this one is also included in the count; the loose λz, λv, λw, λy are not counted this time). Let us now erase the variables and the integers indicating the levels; we keep the reference depth. No information is lost: the erased letters and numbers can be easily reconstructed. If we are not interested in the names of the bound variables (and honestly we should not be) we can erase the suffix in λx, λt, λu, λs. In those cases where we are interested in the names of the free variables we have to keep the ordered list z, v, w, y in order to be able to reconstruct our expression. Note that a point of the tree refers to a free variable if and only if the reference depth exceeds the level. Thus the information contained in our name-carrying expression can be presented as

λ a(λ b(2, 1, f(λ a(1, 2, 7), λ 5)), 3, 2)   (4.1)
with the free variable list z, v, w, y. This expression is called namefree. Note that (3.1) can be represented differently if we take a different free variable list. Any sequence of distinct variables may serve as a free variable list provided it contains z, w, y in any order. Conversely, every namefree expression can be decoded into a name-carrying one if we provide a free variable list that is long enough. This determines the name-carrying expression up to name-changing of the bound variables. Instead of providing a finite free variable list we can take an infinite one (with the effect that we need not bother whether the list is long enough). The reference depths refer to a count in the reference list from right to left, corresponding to the fact that λ's are written in front of the formula they act on. Therefore, such infinite variable lists have to be written as ..., x3, x2, x1 instead of the usual left-to-right notation of an infinite sequence.
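The translation just described can be mechanized. The following Python sketch uses an encoding of my own (not the paper's): it walks the tree keeping the list of λ's passed, innermost first, and counts into the reversed free variable list when a variable is free.

```python
def namefree(t, binders, free_list):
    """Convert a name-carrying expression to name-free (NF) form.

    t: ("var", x), ("lam", x, body), or ("fun", f, [args]) for f(a1, ..., ak).
    binders: names of the lambdas passed so far, innermost first.
    free_list: the free variable list in left-to-right order (e.g. z, v, w, y);
    it is consulted right to left, as described above.
    """
    if t[0] == "var":
        x = t[1]
        if x in binders:
            return binders.index(x) + 1          # reference depth
        revlist = list(reversed(free_list))
        return len(binders) + revlist.index(x) + 1
    if t[0] == "lam":
        return ("lam", namefree(t[2], [t[1]] + binders, free_list))
    return ("fun", t[1], [namefree(a, binders, free_list) for a in t[2]])

# The example expression lambda_x a(lambda_t b(x, t, f(lambda_u a(u, t, z),
# lambda_s w)), w, y):
expr = ("lam", "x", ("fun", "a", [
    ("lam", "t", ("fun", "b", [
        ("var", "x"), ("var", "t"),
        ("fun", "f", [("lam", "u", ("fun", "a",
                       [("var", "u"), ("var", "t"), ("var", "z")])),
                      ("lam", "s", ("var", "w"))])])),
    ("var", "w"), ("var", "y")]))

print(namefree(expr, [], ["z", "v", "w", "y"]))
```

In this encoding the result reads λ a(λ b(2, 1, f(λ a(1, 2, 7), λ 5)), 3, 2), the name-free form computed above.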
5. THE SYNTAX FOR NAMEFREE EXPRESSIONS
We present the syntax in Backus' normal form:

<constant> ::= a | b | c | d | ...
<NF expression> ::= <positive integer> | <constant> | <constant> ( <NF expression string> ) | λ <NF expression>
<NF expression string> ::= <NF expression> | <NF expression string> , <NF expression>
In the next sections we shall use, in an informal way, the notion "level" of an integer in an NF expression, in the sense of Section 4. (The "reference depth" of an integer is, of course, the integer itself.)
6. SUBSTITUTION
We shall define the effect of a substitution of a sequence of NF expressions into a single NF expression denoted by Ω. What we intend to describe is the following. Let ..., Σ3, Σ2, Σ1 denote the sequence (in right-to-left notation). (In practice only finitely many Σk's are relevant, whence we need not always give the full infinite sequence.) We attach a free variable list ..., x3, x2, x1 to Ω, and one and the same free variable list ..., y3, y2, y1 to every Σi. That determines name-carrying expressions to be denoted by Ω* and Σi*. Now replace any free xi in (Ω*) by the corresponding (Σi*). Thus we get an expression, to be denoted by Γ*, with possible free variables ..., y3, y2, y1. With respect to this free variable list ..., y3, y2, y1 this Γ* corresponds to the NF expression S(..., (Σ3), (Σ2), (Σ1); (Ω)) we shall define presently. The definition will be recursive with respect to the structure of (Ω); Ω may denote either an NF expression string or an NF expression. We follow the syntactic classification of Section 5.

(i) If (Ω) = (Ω1), (Ω2) then

(S(..., (Σ2), (Σ1); (Ω))) = (Γ1), (Γ2) ,

where Γi denotes (S(..., (Σ2), (Σ1); (Ωi))).

(ii) If (Ω) is a constant then

(S(..., (Σ2), (Σ1); (Ω))) = (Ω) .

(iii) If (Ω) = (y)((Ω1)) (where y denotes a constant and Ω1 an NF expression string) then

(S(..., (Σ2), (Σ1); (Ω))) = (y)((S(..., (Σ2), (Σ1); (Ω1)))) .

(iv) If (Ω) is the positive integer k then

(S(..., (Σ2), (Σ1); (Ω))) = (Σk) .

(v) If (Ω) = λ(Γ) then

(S(..., (Σ2), (Σ1); (Ω))) = λ(S(..., (Δ2), (Δ1), 1; (Γ))) ,

where Δi denotes (S(..., 4, 3, 2; (Σi))).   (6.2)

Note that (Δi) is obtained from (Σi) by adding 1 to every integer in (Σi) that refers to a free variable.
7. THE OPERATORS τh, AND A GLOBAL DESCRIPTION OF SUBSTITUTION

It will be convenient to use the separate notation τh((Σ)) in order to abbreviate

S(..., h+3, h+2, h+1; (Σ)) .

It means adding h (which is a positive integer) to every integer in (Σ) that refers to a free variable. The special case τ1((Σi)) occurs in (6.2). With the aid of this notation we can give a more global description of how (S(..., (Σ3), (Σ2), (Σ1); (Ω))) is obtained: start from Ω, and in each case where an integer t in (Ω) exceeds its level l, we replace that t by (τl((Σt-l))). In automatic formula manipulation it may be a good strategy to refrain from evaluating such τl((Σ))'s, but just to store them as pairs l, (Σ), and go into (full or partial) evaluation only if necessary. The following formulas may come in handy:

τk τl = τk+l ,
(τ0((Σ))) = (Σ) ,
(S(..., (Σ3), (Σ2), (Σ1); (τk((Ω))))) = (S(..., (Σk+3), (Σk+2), (Σk+1); (Ω))) .
The latter formula is a special case of the following result on composite substitution: If

(Ω) = (S(..., (Λ2), (Λ1); (Δ)))

then

S(..., (Σ2), (Σ1); (Ω)) = S(..., (Γ2), (Γ1); (Δ)) ,

where

(Γi) = (S(..., (Σ2), (Σ1); (Λi)))   (i = 1, 2, ...) .
8. BETA REDUCTION

If we have an applicational expression A((Φ), (Γ)) (cf. Section 3), then the interpretation is that (Φ) is a function, (Γ) a value of the variable, and A((Φ), (Γ)) is intended to represent the value of the function (Φ) at the point (Γ). If (Φ) happens to have the form λ(Ω), then the function value can actually be evaluated. Roughly speaking, it comes down to substituting (Γ) in (Ω) for all occurrences of the bound variable corresponding to the λ in front of (Ω). A precise definition in terms of NF expressions is easy to give: If Ω and Γ denote NF expressions, then A(λ(Ω), (Γ)) is an NF expression to which beta reduction can be applied. The effect of the beta reduction is the NF expression

(S(..., 3, 2, 1, (Γ); (Ω))) .   (8.1)

The usual beta reduction for name-carrying expressions is obtained if we use one and the same free variable list for all four expressions λ(Ω), (Γ), A(λ(Ω), (Γ)) and (8.1).
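Sections 6-8 admit a compact executable sketch. The encoding below is mine (not the paper's): NF expressions are positive integers, ("fun", c, [args]) for c(...), and ("lam", body); the infinite sequence ..., Σ2, Σ1 is modelled by a Python function sigma with sigma(k) = Σk.

```python
def subst(sigma, e):
    """The operator S(..., (Sigma2), (Sigma1); (e)) of Section 6."""
    if isinstance(e, int):                       # case (iv): integer k
        return sigma(e)
    if e[0] == "fun":                            # cases (ii) and (iii)
        return ("fun", e[1], [subst(sigma, a) for a in e[2]])
    # case (v): e = lam(body); 1 stays, k+1 refers to tau_1(Sigma_k)
    return ("lam", subst(lambda k: 1 if k == 1 else tau(1, sigma(k - 1)),
                         e[1]))

def tau(h, e):
    """The operator tau_h of Section 7: S(..., h+3, h+2, h+1; (e))."""
    return subst(lambda k: k + h, e)

def beta(body, arg):
    """Beta reduction (8.1) of A(lam(body), arg):
    S(..., 3, 2, 1, (arg); (body))."""
    return subst(lambda k: arg if k == 1 else k - 1, body)

# arg replaces the references to the erased lambda; others shift down:
print(beta(("fun", "f", [1, 2]), ("fun", "c", [])))
# tau_k tau_l = tau_{k+l}:
e = ("lam", ("fun", "f", [1, 2]))
print(tau(2, tau(3, e)) == tau(5, e))   # True
```

Note how the lam case of subst mirrors rule (v): the new sequence sends 1 to 1 and k+1 to τ1(Σk), exactly the (Δ2), (Δ1), 1 of (6.2).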
9. ETA REDUCTION

In terms of name-carrying expressions, η-reduction means the following. If Σ denotes a name-carrying expression that does not contain the variable x, then λx (Σ)(x) (or in our notation λx A((Σ), x)) has the same mathematical interpretation as (Σ) itself. The transfer from λx A((Σ), x) to (Σ) is called η-reduction. We shall define it for NF expressions: For any NF expression (Λ) we define as η-reduction the transition of

λ A((τ1((Λ))), 1) into (Λ) .   (9.1)

If we transform both expressions of (9.1) into name-carrying expressions by means of one and the same free variable list, the transition (9.1) becomes the η-reduction for name-carrying expressions.
10. MULTIPLE BETA REDUCTION

In Section 8 we considered beta reduction of an NF expression. It was reduction of the full expression and not the beta reduction of a subexpression (local beta reduction) which we shall consider presently. In order to be able to indicate where the β-reduction has to be carried out, we introduce a set of constants (applicational symbols) to be used instead of the single symbol A. By the same device we get the possibility of multiple local beta reduction: we indicate a subset of the set of applicational symbols and we carry out beta reduction for all symbols of that subset. Let U be a subset of the set of constants. An NF expression (Σ) is called U-correct if every element of U that occurs in (Σ) is always followed by a string in parentheses with the form (λ(Ω), (Γ)). In other words, each occurrence of each element of U is ready for local beta reduction. To be more precise, one can indicate how the syntax of Section 5 is to be changed in order to get the syntax of the U-correct NF expressions.
We shall now define the operator β_U on the set of U-correct NF expressions recursively:

(i) If (Σ) is a single constant or a positive integer, then β_U((Σ)) = (Σ).

(ii) If (Σ) = (γ)((Σ₁), ..., (Σₖ)), where (γ) is a constant not in U, then β_U((Σ)) = (γ)((β_U((Σ₁))), ..., (β_U((Σₖ)))).

(iii) If (Σ) = λ(Σ₁) then β_U((Σ)) = λ(β_U((Σ₁))).

(iv) If (Σ) = (γ)(λ(Ω), (Γ)) and (γ) ∈ U then (cf. (8.1)) β_U((Σ)) = S(..., 3, 2, 1, (β_U((Γ))); (β_U((Ω)))).
N.G. de Bruijn
Needless to say, the effect of β_U on an expression string (Σ₁), ..., (Σₖ) is to be defined by (β_U((Σ₁))), ..., (β_U((Σₖ))).
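A sketch of β_U in code (the encoding is mine: application nodes ("app", gamma, f, a) carry the name gamma of their applicational constant, U is a set of such names, and references and λ are ("var", n) and ("lam", body) as before):

```python
# Sketch of the operator beta_U of Section 10 on U-correct expressions.
# Encoding (mine): ("var", n), ("lam", body), ("app", gamma, f, a),
# where gamma is the name of the applicational constant at this node.

def shift(t, d, cutoff=0):
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] > cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", t[1], shift(t[2], d, cutoff), shift(t[3], d, cutoff))

def subst(t, s, j=1):
    if t[0] == "var":
        if t[1] == j:
            return shift(s, j - 1)
        return ("var", t[1] - 1) if t[1] > j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], s, j + 1))
    return ("app", t[1], subst(t[2], s, j), subst(t[3], s, j))

def beta_U(t, U):
    if t[0] == "var":                                    # case (i)
        return t
    if t[0] == "lam":                                    # case (iii)
        return ("lam", beta_U(t[1], U))
    gamma, f, a = t[1], t[2], t[3]
    if gamma in U:                                       # case (iv)
        assert f[0] == "lam"                             # guaranteed by U-correctness
        return subst(beta_U(f[1], U), beta_U(a, U))
    return ("app", gamma, beta_U(f, U), beta_U(a, U))    # case (ii)

# Two independently labelled redexes; beta_{u} and beta_{v} commute.
t = ("app", "u", ("lam", ("var", 1)),
     ("app", "v", ("lam", ("var", 1)), ("var", 2)))
assert beta_U(beta_U(t, {"u"}), {"v"}) == beta_U(beta_U(t, {"v"}), {"u"}) == ("var", 2)
```

The final assertion previews Theorem 11.2: reductions for different label sets commute.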
11. THEOREMS ON MULTIPLE BETA REDUCTION
Theorem 11.1. If (Ω), (Σ₁), (Σ₂), ... are U-correct, then

(β_U((S(..., (Σ₂), (Σ₁); (Ω))))) = (S(..., (β_U((Σ₂))), (β_U((Σ₁))); (β_U((Ω))))).
Proof. For easier reading we shall drop the signs ) and ( throughout this proof. The proof has to be read twice. The first time we deal with the proof of

β_U S(..., Σ₂, Σ₁; Ω) = S(..., β_U Σ₂, β_U Σ₁; β_U Ω)   (11.1)

in the case that the Σᵢ are integers. (This case is intuitively clear, but it takes little extra trouble to derive it formally.) In the second reading the result of the first reading can be used. We apply induction with respect to the structure of Ω, using the definition of substitution as given in Section 6. (Note that in the first reading the induction hypothesis is used only for cases belonging to the first reading.) The cases (i), (ii), (iv) of Section 6 are very simple, and so is case (iii) if the constant γ is not in U. We concentrate on the two remaining cases, viz. Ω = λΓ and Ω = γ(λΛ, Γ) with γ ∈ U.

If Ω = λΓ we apply (v) of Section 6 twice:

β_U S(..., Σ₂, Σ₁; Ω) = β_U λ S(..., Δ₂, Δ₁, 1; Γ)   (11.2)

S(..., β_U Σ₂, β_U Σ₁; β_U Ω) = λ S(..., Δ₂*, Δ₁*, 1; β_U Γ)   (11.3)

where Δᵢ is given by (6.2), and Δᵢ* = S(..., 3, 2; β_U Σᵢ). By Section 10 (iii) and by the induction hypothesis, the right-hand side of (11.2) equals

λ β_U S(..., Δ₂, Δ₁, 1; Γ) = λ S(..., β_U Δ₂, β_U Δ₁, 1; β_U Γ).   (11.4)

In the first reading of the proof the Σᵢ and Δᵢ are integers, whence β_U Σᵢ = Σᵢ, and therefore Δᵢ* = Δᵢ = β_U Δᵢ. So the right-hand sides of (11.3) and (11.4) are equal, hence the left-hand sides of (11.2) and (11.3) are equal. In the second reading of the proof we may use the theorem for the case that the Σᵢ are integers; hence

β_U Δᵢ = β_U S(..., 3, 2; Σᵢ) = S(..., 3, 2; β_U Σᵢ) = Δᵢ*,
and the right-hand sides of (11.2) and (11.3) are equal.

The second case we have to deal with is Ω = γ(λΛ, Γ) with γ ∈ U. We have to show

β_U S(..., Σ₂, Σ₁; γ(λΛ, Γ)) = S(..., β_U Σ₂, β_U Σ₁; β_U γ(λΛ, Γ)).   (11.5)
The right-hand side equals, by Section 10 (iv),

S(..., β_U Σ₂, β_U Σ₁; S(..., 3, 2, 1, β_U Γ; β_U Λ)).

By the formulas on composite substitution (Section 7) this is

S(..., β_U Σ₂, β_U Σ₁, S(..., β_U Σ₂, β_U Σ₁; β_U Γ); β_U Λ).   (11.6)

The left-hand side of (11.5) equals, according to 6 (iii),

β_U γ(S(..., Σ₂, Σ₁; λΛ), S(..., Σ₂, Σ₁; Γ)).   (11.7)
By 6 (v) we have

S(..., Σ₂, Σ₁; λΛ) = λΦ, where Φ = S(..., Δ₂, Δ₁, 1; Λ), Δᵢ = S(..., 3, 2; Σᵢ).
Applying 10 (iv) we can write for (11.7)

S(..., 2, 1, β_U S(..., Σ₂, Σ₁; Γ); β_U Φ).   (11.8)

By the induction hypothesis we have

β_U Φ = S(..., β_U Δ₂, β_U Δ₁, 1; β_U Λ),

and so we can apply the formula for composite substitution (Section 7) to (11.8); it becomes

S(..., Π₂, Π₁; β_U Λ)   (11.9)

where

Π₁ = S(..., 2, 1, β_U S(..., Σ₂, Σ₁; Γ); 1) = β_U S(..., Σ₂, Σ₁; Γ),

Πᵢ₊₁ = S(..., 2, 1, β_U S(..., Σ₂, Σ₁; Γ); β_U Δᵢ)   (i = 1, 2, ...).

We have to show that (11.9) equals (11.6). By the induction hypothesis we have Π₁ = S(..., β_U Σ₂, β_U Σ₁; β_U Γ), so it remains to show that Πᵢ₊₁ = β_U Σᵢ (i = 1, 2, ...). In the first reading of the proof the Σᵢ are positive integers. Therefore the Δᵢ are integers > 1; it follows that β_U Δᵢ = Δᵢ > 1, whence Πᵢ₊₁ = Δᵢ − 1 = Σᵢ = β_U Σᵢ. In the second reading of the proof we may use the result of the first reading:
β_U Δᵢ = β_U S(..., 3, 2; Σᵢ) = S(..., 3, 2; β_U Σᵢ),

and the formula for Πᵢ₊₁ now results in (cf. Section 7)

Πᵢ₊₁ = S(..., 3, 2, 1; β_U Σᵢ) = β_U Σᵢ.  □
Theorem 11.2. Let U and V be subsets of the set of constants, and let (Σ) be both U-correct and V-correct. Then (β_U((Σ))) is V-correct, (β_V((Σ))) is U-correct, and (β_U((β_V((Σ))))) = (β_V((β_U((Σ))))).
Proof. Again we omit the ('s and )'s. The V-correctness of β_U Σ is easily proved by recursion: use the definition of β_U of Section 10. In 10 (iv) we have to use Theorem 10.1. By the same recursion we shall prove β_U β_V Σ = β_V β_U Σ. The only case where the induction step is non-trivial is the case Σ = γ(λΩ, Γ) with γ ∈ U ∪ V. If γ ∈ U we have by 10 (iv)

β_V β_U Σ = β_V S(..., 3, 2, 1, β_U Γ; β_U Ω).

By Theorem 11.1 this equals

S(..., 3, 2, 1, β_V β_U Γ; β_V β_U Ω).   (11.10)

If γ ∉ U, γ ∈ V we find by 10 (ii), 10 (iii)

β_V β_U Σ = β_V γ(λ β_U Ω, β_U Γ),

and by 10 (iv) this equals (11.10). So γ ∈ U ∪ V implies that β_V β_U Σ equals (11.10). By the induction hypothesis (11.10) is symmetric, whence β_V β_U Σ = β_U β_V Σ.  □
12. THE CHURCH-ROSSER THEOREM FOR BETA REDUCTION

We consider an NF expression Σ with a single constant A that can be used for β-reduction. We label all A's in Σ so that they become all different. Next we take a subset U of the labelled A's, we apply β_U and then remove the labels. This gives an NF expression Σ′. We say that Σ′ is a multiple reduction of Σ, and we write Σ ≥_m Σ′. If U has only one element, and if that element has just one occurrence in Σ, the reduction is called single, and we write Σ ≥_s Σ′. If Σ₁ and Σ₂ satisfy either Σ₁ ≥_s Σ₂ or Σ₂ ≥_s Σ₁, we write Σ₁ ∼ Σ₂. The Church-Rosser theorem for beta reduction says: If Σ₁ ∼ Σ₂ ∼ ... ∼ Σₙ then there are Δ₁, ..., Δₖ and Π₁, ..., Πₕ with

Σ₁ ≥_s Δ₁ ≥_s ... ≥_s Δₖ,   Σₙ ≥_s Π₁ ≥_s ... ≥_s Πₕ,   Δₖ = Πₕ.
This can now be proved as follows. From Theorem 11.2 we easily obtain: if Σ₁ ≥_m Σ₂, Σ₁ ≥_m Σ₃ then there is a Σ₄ with Σ₂ ≥_m Σ₄, Σ₃ ≥_m Σ₄. Moreover it can be shown: If Σ ≥_m Δ then there is a sequence

Σ ≥_s Σ₁ ≥_s Σ₂ ≥_s ... ≥_s Σₘ = Δ.

(Actually, if every element of U occurs at most once in the U-correct expression Σ, then we can arrange the elements of U as u₁, ..., uₘ in such a way that

β_{uₘ} ... β_{u₁} Σ = β_U Σ.)

The Church-Rosser theorem now follows by a trivial reduction argument. The above proof can be easily adapted to lambda calculus with expressions as types (see Section 13).
13. NOTATION IN AUTOMATH
The mathematical language Automath (see [de Bruijn 70a (A.2)]) has lambda calculus with types, and these types are again expressions. That is, instead of λ_x we have things that can be visualized as λ_{x,(Φ)}(Ω), where Φ and Ω denote name-carrying expressions. We may think of x as a variable of the type (Φ). It is clear that we do not want x to have any binding influence on (Φ). In order to achieve this, we create a new lingual constant T (just like we added A to our set of constants in Section 3), and we write

T((Φ), λ_x(Ω))   (13.1)

instead of λ_{x,(Φ)}(Ω). Now (13.1) can be transformed into a namefree expression just like any other name-carrying expression. The actual notation in Automath is different. Instead of (13.1) Automath uses [x : (Φ)](Ω), and for the application A((Φ), (Γ)) Automath uses {(Γ)}(Φ).
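The move from λ_{x,(Φ)}(Ω) to T((Φ), λ_x(Ω)) can be sketched as follows (the encoding and helper names are mine): typed abstractions are first rewritten with the constant T, so that x does not bind inside its own type, after which the usual nameless translation applies.

```python
# Sketch (assumed encoding, mine): a typed abstraction [x : A] B is stored
# as ("abs", "x", A, B); it is rewritten to T(A, lam(B)) so that x has no
# binding influence on its type A, and then made nameless.

def to_nameless(t, env):
    k = t[0]
    if k == "var":
        return ("var", env.index(t[1]) + 1)          # 1-based reference
    if k == "abs":                                   # [x : A] B  ~  T(A, λ B)
        return ("T", to_nameless(t[2], env),
                      ("lam", to_nameless(t[3], [t[1]] + env)))
    if k == "app":
        return ("app", to_nameless(t[1], env), to_nameless(t[2], env))
    return t                                         # constants pass through

# [x : nat] x  becomes  T(nat, λ 1)
term = ("abs", "x", ("const", "nat"), ("var", "x"))
assert to_nameless(term, []) == ("T", ("const", "nat"), ("lam", ("var", 1)))
```

Note how only the body of the abstraction is translated under an extended environment; the type component is translated in the outer environment, exactly the point of introducing T.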
14. ALGORITHMS
An algorithm for turning an NF expression into a name-carrying one can be described on the basis of the recursive definition of substitution in Section 6. Let (Ω) be an NF expression. Take a free variable list ..., x₃, x₂, x₁ consisting of distinct letters which do not belong to our alphabet of constants. Now add these xᵢ to that alphabet, and evaluate

(S(..., (x₃), (x₂), (x₁); (Ω))).
This is a namefree expression; if we proclaim the xᵢ's to be variables again, it becomes an intermediate expression where the free variables have names but the bound variables are nameless. If we want to have names for the bound variables too, we have to modify S slightly. We take an infinite store of letters y₁, y₂, ... (different from the xᵢ's and different from the constants), and we take a modified form of (6.1). Any time we get to apply (6.1) we take a fresh y (i.e. one that has not been used before) and we replace the right-hand side of (6.1) by

λ_y (S(..., (Δ₃), (Δ₂), (Δ₁), y; (Ω))).
It is not very hard either to give algorithms that transform name-carrying expressions into namefree ones. This can be done if a free variable list is given (and then it has to be checked, during the execution of the algorithm, whether this list is adequate), but we can also write an algorithm that produces a free variable list itself. For the case of the first-mentioned possibility we give a brief description of the crucial steps. Let ..., z₃, z₂, z₁ be a free variable list, and let (Ω) be the name-carrying expression we want to transfer into the namefree expression (Ω*). If (Ω) equals one of the z's, then (Ω*) is an integer, viz. the index of that z. If (Ω) is a variable, but not one of the z's, the answer is "free variable list was wrong". If (Ω) = λ_y (Γ) then we transform (Γ) into the nameless expression (Γ*) by means of the free variable list ..., z₃, z₂, z₁, y, and we have (Ω*) = λ(Γ*). The other cases ((i) (Ω) an expression string (Σ₁), (Σ₂), ..., (ii) (Ω) a constant, (iii) (Ω) = (γ)((Σ₁), ...)) are very easy.
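The crucial steps just described can be sketched directly (the tuple encoding is mine: named expressions as ("var", name), ("lam", y, body), ("app", f, a), with other constants passed through):

```python
# Sketch of the Section 14 algorithm: name-carrying -> namefree, given a
# free variable list ..., z3, z2, z1 (stored innermost-first: z1 is zs[0]).

def namefree(omega, zs):
    kind = omega[0]
    if kind == "var":
        if omega[1] in zs:
            return ("index", zs.index(omega[1]) + 1)   # 1-based, as in the paper
        raise ValueError("free variable list was wrong")
    if kind == "lam":                                  # λ_y (Γ)
        _, y, body = omega
        return ("lam", namefree(body, [y] + zs))       # list becomes ..., z1, y
    if kind == "app":
        return ("app", namefree(omega[1], zs), namefree(omega[2], zs))
    return omega                                       # a constant

# λ_x λ_y (x)(y)  ->  λ λ (2)(1)
t = ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y"))))
assert namefree(t, []) == ("lam", ("lam", ("app", ("index", 2), ("index", 1))))
```

The adequacy check of the free variable list shows up as the raised error: a named variable that is neither bound nor in the list cannot be given a reference.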
Strong Normalization in a Typed Lambda Calculus with Lambda Structured Types

R.P. Nederpelt
CHAPTER I. INTRODUCTION AND SUMMARY

1. Lambda calculus

The lambda notation was originally introduced as a useful notation by Church in two papers developing a system of formal logic [Church 32]. He extended this notation in his calculus of lambda conversion (lambda calculus). This calculus was meant to describe a general class of functions which have the feature that they can be applied to functions of this same class. For historical comment see [Curry and Feys 58, Ch. 0, D and Ch. 3, S1] and [Barendregt 71, Ch. 1, 1.1]. In the latter reference the importance of lambda calculus for the development of recursive functions is mentioned. The calculus has also been brought into relation with the theory of ordinal numbers, predicate calculus and other theories. From the very beginning, lambda calculus was strongly linked to the theory of combinatory logic. We shall later mention some major results achieved concerning lambda calculus.

Right here we stress the contribution of lambda calculus to ordinary mathematics at a purely notational level. The mathematical custom to use the notation f(x), both for the function itself and for the value of this function at an undetermined argument x, obscures the mathematical notion "function". According to Curry and Feys "this defect is especially striking in theories which employ functional operations (functions which admit other functions as arguments)". For an example showing that the usual mathematical function notation is defective not only for understanding, but also in use, see [Curry and Feys 58, Ch. 3, A2].

We shall give an example of the lambda notation. Consider the function which assigns to x the value x + 2. This function is denoted in lambda notation as λx . x + 2. We can apply the function to an argument, say 3. The application of this function to the argument 3 is denoted as (λx . x + 2)3. The result of this application must clearly be 3 + 2.
This suggests that there exists an order between the terms (λx . x + 2)3 and 3 + 2 (the latter term is "closer to the outcome"). The transitive and reflexive relation corresponding to such an order is called a reduction. In the above case it is called a β-reduction, often denoted by ≥_β. Thus we have the relation (λx . x + 2)3 ≥_β 3 + 2. The reduction relation is also monotonous, i.e.: if term S reduces to term T, then λx . S reduces to λx . T, (U)S to (U)T and (S)U to (T)U. So from the relation (λx . x + 2)3 ≥_β 3 + 2 it follows, for example, that λy . ((λx . x + 2)3) ≥_β λy . (3 + 2). The relation compares two terms (viz. (λx . x + 2)3 and 3 + 2); the fact that these terms have the common value 5 in the usual interpretation plays no rôle here. If we do not take 3, but z as argument for the above function, then we obtain (λx . x + 2)z ≥_β z + 2. So lambda calculus makes a clear distinction between the function λx . x + 2 and the value of this function for an undetermined argument: z + 2.

We are used to the fact that the terms λx . x + 2 and λy . y + 2 denote the same function. The two terms are called α-equivalent, and the passage of the one into the other is called α-reduction, often denoted by ≥_α. In this way we also have the relation λx . x + 2 ≥_α λy . y + 2. It is quite a nuisance that this α-reduction, which is simply a renaming of variables, plays a rôle in the lambda notation. One can avoid this by considering α-equivalence classes instead of separate terms. Another nice and practical way out is given by de Bruijn [de Bruijn 72b (C.2)], who completely suppresses the use of names of variables by means of a notational system referring to the positions of a variable in a term. We wish to state that the desire to eliminate variables is one of the things giving rise to combinatory logic. The method used in combinatory logic to obtain this elimination is, however, different from de Bruijn's.

A third reduction, which is commonly used and strongly related to extensionality (see [Barendregt 71, Th. 1.1.17 and Th. 1.1.18]), is called η-reduction. This relation, commonly denoted by ≥_η, is based on the following rule: If x is not free in the term M, then λx . (M)x ≥_η M. An intuitive justification is that, for any argument X, the sides of the relation have comparable values: this value is (λx . (M)x)X for the left-hand side and (M)X for the right-hand side, and (λx . (M)x)X ≥_β (M)X.

A sequence of reductions obtained by successive application of reductions is called a reduction sequence. For each of the reduction relations explained above, the corresponding symmetric and transitive closure is called a conversion relation. One of the first important results in lambda calculus concerns the dependence between conversion and reduction. This is called the Church-Rosser theorem, which states: If
X converts to Y, then there is a Z such that X reduces to Z and Y reduces to Z (see [Curry and Feys 58, Ch. 4]). For interesting historical comments see [Barendregt 71, Th. 1.2.9 and remarks in 1.2.18 plus footnote]. In Appendix II of the latter reference the latest and nicest proof of the Church-Rosser theorem is given (1971 by W.W. Tait and P. Martin-Löf). For a precise description see [Schulte Mönting 73]. In this thesis we shall use the name "Church-Rosser property" for the following statement: If A reduces to B and to C, then there is a D such that B and C reduce to D. This property is equivalent to the Church-Rosser theorem.

2. Normalization and strong normalization

An important issue in lambda calculus is the question of the normalization of terms. This is a termination problem. For example, a β-reduction such as (λx . x)y ≥_β y cannot be continued in a non-trivial manner: there is no reduction for y, except those trivial on account of the reflexivity of α-, β- and η-reduction. In this case (λx . x)y is said to normalize into a normal form y. In lambda calculus, which allows all functions as arguments of functions, such a termination of the reduction is not guaranteed. See Church's nice example: ω₂ω₂ = (λx . xx)(λx . xx). [Note: One writes AB instead of (A)B, and ABC instead of ((A)B)C.] There is a non-trivial β-reduction, by applying the rule (λx . xx)A ≥_β (A)A with A = λx . xx. This produces ω₂ω₂ ≥_β ω₂ω₂. It is clear that the reduction of ω₂ω₂ by repeated use of the above non-trivial β-reduction will never come to an end. There are more and stranger examples of such terms, the reduction of which never terminates. For example: put ω₃ = λx . xxx. Then ω₃ω₃ ≥_β ω₃ω₃ω₃ ≥_β ... . Barendregt even constructed a universal generator with the property that it has a reduction sequence in which all terms of lambda calculus occur as subterms. A term in lambda calculus is called normalizable if there is some reduction sequence which terminates.
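The behaviour of these examples can be observed with a fuel-bounded leftmost β-reducer (a sketch; the de Bruijn-style tuple encoding and all names are mine, and running out of fuel is of course only evidence of non-normalizability, not a proof):

```python
# Minimal sketch: de Bruijn-indexed terms as nested tuples,
# ("var", n), ("lam", body), ("app", f, a), with 1-based indices.

def shift(t, d, c=0):
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] > c else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, c + 1))
    return ("app", shift(t[1], d, c), shift(t[2], d, c))

def subst(t, s, j=1):
    if t[0] == "var":
        if t[1] == j:
            return shift(s, j - 1)
        return ("var", t[1] - 1) if t[1] > j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], s, j + 1))
    return ("app", subst(t[1], s, j), subst(t[2], s, j))

def leftmost_step(t):
    """One leftmost-outermost beta step; None if t is in normal form."""
    if t[0] == "app":
        if t[1][0] == "lam":
            return subst(t[1][1], t[2])
        r = leftmost_step(t[1])
        if r is not None:
            return ("app", r, t[2])
        r = leftmost_step(t[2])
        return ("app", t[1], r) if r is not None else None
    if t[0] == "lam":
        r = leftmost_step(t[1])
        return ("lam", r) if r is not None else None
    return None

def normalize(t, fuel=1000):
    while fuel:
        r = leftmost_step(t)
        if r is None:
            return t
        t, fuel = r, fuel - 1
    return None          # fuel exhausted: possibly non-normalizable

omega2 = ("lam", ("app", ("var", 1), ("var", 1)))   # λx. xx
W = ("app", omega2, omega2)                         # ω2 ω2: never terminates
K_id = ("app", ("lam", ("lam", ("var", 1))), W)     # (λx. λy. y)(ω2 ω2)

assert normalize(W, fuel=100) is None               # every step reproduces W
assert normalize(K_id) == ("lam", ("var", 1))       # leftmost discards W first
```

The last two assertions exhibit the contrast between normalizable and strongly normalizable terms: the leftmost strategy normalizes (λx . λy . y)(ω₂ω₂) in one step, while reducing the argument first loops forever.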
A term is strongly normalizable if each of its reduction sequences terminates. The last term of a terminating sequence is called a normal form. It is obvious that strong normalization implies normalization. The reverse implication does not hold. For example: put again ω₂ = λx . xx; then (λx . (λy . y))(ω₂ω₂) reduces to λy . y if the function λx . (λy . y) is applied to the argument ω₂ω₂, but it reduces to itself if the function ω₂ is applied to the argument ω₂. Since λy . y is in normal form, (λx . (λy . y))(ω₂ω₂) is normalizable, but not strongly normalizable.

In this example we see a term that normalizes if one application of a function to an argument is assigned priority over another. There is a general theorem in lambda calculus (the standardization theorem, cf. [Curry and Feys 58, Ch.
4, E1]), which states that any normalizable term can be normalized by assigning priority to the "leftmost" application in the term.

The fact that some terms in lambda calculus have non-terminating reduction sequences is related to the feature that one can use all functions as arguments for functions. (Even the function itself can be used as an argument, see the above-mentioned example by Church. This is called self-application.) The same things can happen in programming languages and in the theory of partial recursive functions, where normalization- (or termination-) problems arise too. In lambda calculus the question of the normalizability of terms has been shown to be undecidable.

There are systems in which normalization implies strong normalization. For example, in a restricted lambda calculus (λI-calculus) this implication holds (the so-called second Church-Rosser theorem, see [Curry and Feys 58, Ch. 4, E]), but the proof is not trivial. Prawitz [Prawitz 65] proved normalization for derivations in natural deduction. He also proved strong normalization for these derivations in [Prawitz 71]. Note that in the latter proof he does not use his results from [Prawitz 65], but quite a different proof technique developed by Tait [Tait 67].

An interesting problem concerning normalization is the question of uniqueness of normal forms. If a term A has the property that every terminating reduction sequence leads to the same normal form (but for α-reduction), then A is said to have a unique normal form. We note that the Church-Rosser theorem implies the uniqueness of the normal form if this exists. In this thesis we shall show that, if in a system all terms are normalizable into a unique normal form, then each term is strongly normalizable. This will be proved for a certain lambda calculus called Λ; the method can, however, be applied to more systems, and we suggest this as a field of further investigation.

3. Normalization in systems of typed lambda calculus
In ordinary mathematics one, sometimes tacitly, assumes that each object has a certain type (in our example of a term in lambda notation, λx . x + 2, we assumed that x has a type (e.g. that of the natural numbers) in which addition is possible). In systems of typed lambda calculus one attaches a type to each term. In so doing and in restricting the formation of terms in accordance with the types (see the "applicability condition" explained in Section 1.4) one brings lambda calculus nearer to usual mathematical systems.

We note here that there is a strong correspondence between derivations in systems of natural deduction and terms in systems of typed lambda calculus, as well as between formulae in the one and types in the other: a derivation D
proving a formula F corresponds to a term D′ with type F′. This is called the "formulae-as-types" notion. The latter notion has recently been investigated by various authors in developing a theory of construction and in studying functional interpretations. The first indication in this direction was given in [Curry and Feys 58, p. 312-315]. We further mention Läuchli [Läuchli 70], de Bruijn, who developed and applied this notion with a large variety of types in his mathematical language Automath ([de Bruijn 70a (A.2)]), Howard [Howard 80], Prawitz [Prawitz 71] and Girard [Girard 71].

Normalization problems also arise in systems of typed lambda calculus. Sanchis [Sanchis 67] investigated a lambda calculus with types (essentially Gödel's theory of functionals of finite type) and found all terms in this calculus to be strongly normalizable. Martin-Löf [Martin-Löf 75a] admitted more general types and obtained normalization for his terms. His system is close to the requirements of common mathematics in the sense that usual mathematical notions such as the logical connectives and the recursion operator are incorporated.

In this thesis we shall regard a typed lambda calculus, in which the types themselves have lambda structure. Our typed lambda calculus, which we call Λ, has a large overlap with the mathematical language Automath [de Bruijn 70a (A.2)]. (See the following section for the relation between Automath and our system Λ.) In particular, a single-line version of Automath (AUT-SL, see [de Bruijn 71 (B.2)]) introduced by de Bruijn has led us to the investigations in this thesis. Preliminary work in the direction of AUT-SL can be found in our notes on Lambda Automath ([Nederpelt 71a] and [Nederpelt 71b]), in which some syntactical notions of Automath were unified. In AUT-SL this unification was extended considerably. de Bruijn defined AUT-SL by means of a recursive programme.
Our definition of system Λ (given in Chapter III) follows more orthodox recursive lines. Nevertheless, the resulting systems are the same. In these systems there is no syntactical distinction between terms and types. We therefore use the word expression rather than term or type. There is one basic constant in the system, called τ. To each expression which does not end in τ we shall assign a type in a natural manner. We say that expressions ending in τ have degree 1. Each other expression has some degree n > 1, while the degree of such an expression A is defined to be one more than the degree of the type of A. In this manner we have expressions of any finite degree at our disposal. In Automath and in Martin-Löf's system there is a restriction to the degrees permitted. Both systems have only terms and types of degree 1, 2 or 3.
Our system has in common with Automath that logical connectives, the recursion operator and a basic set of numbers (e.g. natural numbers) are not incorporated. The proofs of normalization results concerning these systems can be formalized in first order arithmetic. Yet it is possible to interpret into these systems mathematical theories containing, for instance, logical connectives and the recursion operator by introducing new primitive equality relations which extend the existing equality relations which correspond to conversion. We shall prove normalization and strong normalization for our system in Chapter III. As mentioned above, we shall introduce a method for deriving strong normalization from normalization together with the uniqueness of normal forms (see Section 1.6).

4. The relation to the mathematical language Automath

Automath (see [de Bruijn 68b] and [de Bruijn 70a (A.2)]) was designed by de Bruijn as a language for mathematics. It has the property that the interpretation of a text written in Automath is correct mathematics if the text is syntactically correct. Many such systems have been developed for logic. For mathematics, Russell and Whitehead's Principia Mathematica was the first successful attempt in the direction of formalization. There have since been many other attempts. However, in the majority of these systems important parts of the mathematical argumentation were not incorporated in the formal system, but were dealt with at a meta-level. For example, in systems based on axioms and inference rules a theorem is true if it can be inferred by successive application of a number of axioms and rules. But one hardly ever says exactly (in terms of the formal language) which axioms and rules were used, and in which order. Moreover, the use of an axiom scheme was usually not substantiated by a formalized indication of the substitution instance employed. Admittedly, there is a gap in the completeness of the formalization in Automath, too.
The gap is that, in the case of "definitionally equal" expressions, there is no indication of how this equality can be established on the basis of the language definition. It is left to algorithms to justify these definitional equalities. The existence of terminating algorithms for this purpose can be proved by means of normalization properties. The question of practical efficiency of such algorithms is, of course, a different one, and is not considered in this thesis. Two expressions in Automath are called definitionally equal if one expression can be transferred into the other by (1) conversions, and (2) the elimination of abbreviations.
A major problem for automatic checking in Automath is whether definitional equality of two expressions is decidable. The latter is clearly the case if each expression is effectively normalizable into a unique normal form. In this respect, see [Kreisel 72]. The main aim of this thesis is to prove the existence and uniqueness of normal forms for Λ. Since Λ does not use an abbreviation system as a syntactical element like Automath does, we may restrict ourselves to conversions. We note that the omission of abbreviations is no severe restriction, since abbreviations are relatively simple operations usually considered to be only notational devices without mathematical content.

The mere typing of lambda calculus expressions does not guarantee the property of normalization. We need more. Automath permits only a restricted class of expressions. In this class only those expressions E are included which obey the so-called applicability condition: for each part of E which has the form of a function F applied to an argument A it is required that (1) F has a domain D, and (2) the type of A is definitionally equal to D. These requirements are natural for a system which is so closely linked to ordinary mathematics. The following examples in lambda notation will make this clear. In the first place, it would be unnatural to supply an expression which is not a function with an argument: one can attach an argument to λx . x + 2, but it looks strange to provide the number 7 with an argument. Secondly, let us assume that x in λx . x + 2 is required to have the natural numbers as type. This defines the domain of the function. Then one may write the application (λx . x + 2)3, since 3 has the same type as x. But it would be quite unnatural to write the application (λx . x + 2)a, where a represents a vector in R³. In AUT-SL and in Λ, expressions have to obey the applicability condition, like in Automath.
This condition is sufficiently strong to guarantee normalizability (even a weaker condition suffices, see Section 1.6). We note that Automath has the property that assignment of a type to an expression of degree 3 is different to that for expressions of degree 2. Expressions of degree 3 have lambda structured types, whereas expressions of degree 2 all have the same type, viz. the expression denoted by the symbol type. (This symbol type is the Automath version of the symbol τ used in our system Λ.) As an illustration we give an example in lambda notation. Suppose that the term λx . x + 2 has Nat as type for x, and type as type for Nat. Then λx . x + 2 has degree 3. In the manner of Automath it has λx . Nat as type. The latter expression, having degree 2, has as type the expression type.
In AUT-SL and in Λ, however, the assignment of types to expressions of any degree ≥ 2 is treated in a uniform manner, comparable to the assignment of types to expressions of degree 3 in Automath. If the term in the above example (λx . x + 2) were treated in the Λ-way, its type would again be λx . Nat, but the type of λx . Nat would be λx . τ. We note that an extension of Automath, called AUT-QE ("Automath with quasi-expressions", see [de Bruijn 73b]), has more expressions of degree 1 than only type; it admits as expressions of degree 1 some of those admitted in AUT-SL and Λ. However, AUT-QE allows a choice to be made for some expressions of degree 2, between essentially different types. Again using the above example as an illustration: in the manner of AUT-QE one may choose either λx . type or type as type of λx . Nat. It is to be noted that the above-mentioned difference between Automath (or AUT-QE) and AUT-SL (or Λ) has the important consequence that neither Automath nor AUT-QE is a subsystem of AUT-SL (or Λ). The results for Λ obtained in this thesis are therefore not immediately transferable either to Automath or to AUT-QE. Normalization for a simpler form of AUT-QE, which does form a subsystem of AUT-SL, was proved by van Benthem Jutting [van Benthem Jutting 71a], using the norm introduced in this thesis (we shall call this norm ρ; cf. Section 1.6). The normalization theorem of this thesis is a generalization of that of [van Benthem Jutting 71a]. Strong normalization for a system resembling Automath was recently studied by R.C. de Vrijer on the basis of Tait's ideas exposed in [Tait 67], and for Automath and AUT-QE by D.T. van Daalen (private communications). The uniqueness of normal form has only been proved with respect to β-reduction. Uniqueness of normal form with respect to β-η-reduction is as yet an open question (see also Section 1.6).
5. Change of notational conventions

In the lambda notation as usually employed, the quantifiers (such as λx) are written to the left of the expressions they operate upon, whereas applications are written to the right. This corresponds to the mathematical notational tradition to write quantifiers (such as ∀, Σ_{r=1}, ...) to the left, and to write the argument of a function f to the right (as in f(a)). In ordinary mathematics these two kinds of operations have nothing in common, but in lambda calculus they are closely related by β- and η-reduction. In a sense, quantification (also called abstraction) and application are inverse operations. Sequences of such operations can be applied in various orders, and it is most convenient to write them all on the same side of an expression, thus
showing clearly in which order the expression has been formed from its constituents. In Automath applications and abstractions are all written to the left. Instead of writing abstractions in the form Ax, Automath writes [z : A ] , in which A stands for the type of the variable x. Applications are indicated by writing the expression in angle brackets; instead of the usual mathematical notation f(a) we write (a) f. For example: the term given in lambda notation as (Ax . x 2 ) 3 reads in Automath as: ( 3 ) [x : N u t ] p h ( x , 2 ) . (Here we assume that x has as type the natural numbers, abbreviated Nut; a minor difference is that Automath uses only prefix notation for operators.) Note that the pair ) [ indicates the possibility of P-reduction. Sometimes, but not always, the pair ] ( indicates the possibility of q-reduction. This notation for abstraction and application renders the use of parentheses ( ) entirely superfluous, since there can be no doubt as to the order in which abstractions and applications appear. The separation dot as used in Ax . x 2 disappears as well. Automath uses the parentheses ( ), but for a different purpose. In AUT-QE, AUT-SL and in the system A which we shall develop in this thesis, these slightly different notational conventions are also adopted.
6. Summary of the contents of this thesis
This thesis contains a chapter on the formal system ∆ (Chapter II) and a chapter on the formal system Λ (Chapter III). In the latter chapter we develop the main results of the thesis. System Λ forms part of system ∆, containing those expressions of ∆ which obey the applicability condition (explained in Section I.4). We shall now discuss the contents of Chapter II. There we define expressions inductively by: x and τ are expressions; [x : A] B and ⟨A⟩ B are expressions if A and B are so (x is a variable). In system ∆ we only include those expressions which are “distinctly bound”, i.e.
(1) which do not contain free variables, and
(2) which have distinct binding variables.
Our preference for bound (also called closed) expressions (expressions without free variables) is noticeable throughout this thesis. We give the following justification for this preference. We believe that in a typed lambda calculus the feature of typing can only be meaningful if every typable expression has an effectively computable type. Since free variables have no traceable type in our
R.P. Nederpelt
system, this implies that only bound expressions are admissible. If in this thesis we deviate from this agreement by considering expressions with free variables, this will be in cases in which it is clear from the context which types belong to these free variables. The consequence of the above agreement is that many expressions under discussion begin with an abstractor chain Q. (An abstractor chain is a string of abstractors; an abstractor has the form [x : A], A being an expression.) The fact that we require all binding variables in an expression in ∆ to be distinct has only practical reasons (cf. Section II.5). We stress that system ∆ is not a typed lambda calculus in the usual sense, since the types have no influence whatsoever on the formation of expressions. The types, which themselves have a lambda structure, will only be treated as formal expressions. It is not until Chapter III, dealing with the restricted system Λ, that the types will play the usual rôle in the formation of expressions. This is due to the applicability condition imposed upon expressions in Λ. We shall formulate the relations α-, β- and η-reduction inside ∆ and we shall prove a number of properties of these reductions in the system ∆ (in Sections II.4, II.5 and II.7, respectively). In Section II.6 we shall consider some reductions related to β-reduction. Our proof of strong normalization in Λ (Section III.3) is based on these reductions. The more important one of these reductions will be called β₁-reduction. We shall explain its characteristic property by reducing the term which we previously used as an example: (λx . x + 2) 3, or, in Automath notation: ⟨3⟩ [x : Nat] plus(x, 2) (cf. the previous section). As for β-reduction, we have the relation ⟨3⟩ [x : Nat] plus(x, 2) ≥β plus(3, 2). But with β₁-reduction, which we denote by ≥β₁, we have: ⟨3⟩ [x : Nat] plus(x, 2) ≥β₁ ⟨3⟩ [x : Nat] plus(3, 2). Here the part ⟨3⟩ [x : Nat] is left intact on the right-hand side.
(Actually β₁-reduction is more complicated; see Section II.6.) The following feature of β₁-reduction is worth noting: application of β-reduction sometimes enables one to eliminate a non-normalizable subterm (in this respect we recall the example (λx . (λy . y)) (ω ω) ≥β λy . y of Section I.2), but with β₁-reduction this is impossible. In Section II.6 we prove the Church-Rosser property for β₁-reductions, using a proof technique of Tait and Martin-Löf. This property implies the uniqueness of normal form for β₁-reductions. The Church-Rosser property for β-reduction can be proved in a similar manner. Unfortunately the Church-Rosser property for β-η-reductions does not hold in our system ∆. The trouble here arises from the typed character of our lambda calculus. We explain this in greater detail in Section II.7. (However, we conjecture the Church-Rosser property for β-η-reductions in Λ; see the end of the present section.)
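To make the contrast concrete, here is a minimal sketch (my own tuple encoding, not the thesis's formalism; α-capture issues are ignored) of a β-step, which erases the pair ⟨3⟩ [x : Nat], next to a β₁-step, which keeps it:

```python
# Terms: ("var", x), ("con", name, [args]), ("abs", x, T, B) for [x : T] B,
# and ("app", A, B) for <A> B.  (Toy encoding; alpha-capture is ignored.)

def subst(t, x, arg):
    """Replace the free occurrences of variable x in t by arg."""
    k = t[0]
    if k == "var":
        return arg if t[1] == x else t
    if k == "con":
        return ("con", t[1], [subst(a, x, arg) for a in t[2]])
    if k == "app":
        return ("app", subst(t[1], x, arg), subst(t[2], x, arg))
    y, T, B = t[1], t[2], t[3]                      # abstractor [y : T] B
    return ("abs", y, subst(T, x, arg), B if y == x else subst(B, x, arg))

def beta(redex):
    """<A> [x : T] B  reduces to  B[x := A]: the lambda-phrase pair vanishes."""
    _, A, (_, x, _T, B) = redex
    return subst(B, x, A)

def beta1(redex):
    """<A> [x : T] B  reduces to  <A> [x : T] B[x := A]: the pair is kept."""
    _, A, (_, x, T, B) = redex
    return ("app", A, ("abs", x, T, subst(B, x, A)))

three = ("con", "3", [])
redex = ("app", three,
         ("abs", "x", ("var", "Nat"),
          ("con", "plus", [("var", "x"), ("con", "2", [])])))
```

Here `beta(redex)` yields `plus(3, 2)`, while `beta1(redex)` yields `⟨3⟩ [x : Nat] plus(3, 2)`, whose body is exactly the β-result; because β₁ never discards the applicator, it can never throw away a non-normalizable argument.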
In Section II.7 we also prove a theorem concerning the “postponement of η-reductions” in a sequence of β- and η-reductions, by means of a method suggested by Barendregt. At the end of Section II.7 we define lambda equivalence for ∆: A and B are lambda equivalent if there is a C such that A and B reduce to C. This lambda equivalence is not necessarily transitive, since the Church-Rosser property for β-η-reductions does not hold in ∆. In Section II.8 we define a formal type-operator called Typ, which assigns a type to an expression not ending in τ. The action of this type-operator is syntactically simple and is in agreement with what we mentioned about the assignment of types in Section I.3. In Section II.8 we also define the degree-function Deg, which is in agreement with our description of degree as given in Section I.3. In our system ∆ we can apply the type-operator Typ a finite number of times. For each expression A in ∆ there is an n ≥ 0 such that Typⁿ A ends in τ, which implies that Typⁿ A has no type. (Here Typⁿ A is obtained by n applications of the type-operator.) This n is the degree of A minus one. We define Typ* A to be Typⁿ A for that particular n. We begin Chapter III with the definition of the formal system Λ (Section III.1). Among the theorems in Section III.1 there is one which states that the type of an expression in Λ again belongs to Λ. In Section III.2 we prove the normalization theorem for Λ. We use a norm μ, which is a partial function on ∆. The norm μ(A) for a certain A in ∆ is itself an expression in ∆.
The norm of A is defined if A obeys a weak form of the applicability condition, which amounts to the following: for each part of an expression E which has the form of a function F applied to an argument A:
(1) F has a domain D, and
(2) the norms of A and D are defined and equal (apart from α-reduction).
Applied to expressions for which the norm is defined (so-called μ-normable expressions), the norm μ has two powerful properties:
(1) if A reduces to B, the norms of A and B are (essentially) equal, and
(2) the norm of an expression is (essentially) the same as the norm of its type.
The norm μ(A) of a μ-normable expression A can be obtained by
(1) replacing non-binding variables by their types, repeating this process until no non-binding variable remains, and
(2) cancelling adjacent pairs ⟨C⟩ [x : D].
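The two steps can be sketched as follows (a toy encoding of my own, writing `("app", C, B)` for ⟨C⟩ B; it ignores α-matters and the definedness conditions, so it should only be fed well-normed inputs):

```python
TAU = ("tau",)

def expand(t, env):
    """Step (1): replace each non-binding variable by its (expanded) type."""
    k = t[0]
    if k == "tau":
        return t
    if k == "var":
        return env[t[1]]
    if k == "abs":                                  # [x : A] B
        A = expand(t[2], env)
        return ("abs", t[1], A, expand(t[3], {**env, t[1]: A}))
    return ("app", expand(t[1], env), expand(t[2], env))

def cancel(t):
    """Step (2): cancel adjacent pairs <C> [x : D], repeatedly."""
    if t[0] == "app":
        B = cancel(t[2])
        if B[0] == "abs":
            return B[3]                             # drop <C> and [x : D], keep the body
        return ("app", cancel(t[1]), B)
    if t[0] == "abs":
        return ("abs", t[1], cancel(t[2]), cancel(t[3]))
    return t

def norm(t):
    return cancel(expand(t, {}))

# <tau> [x : tau] x  reduces to  tau; both sides receive the same norm.
E = ("app", TAU, ("abs", "x", TAU, ("var", "x")))
```

On this example `norm(E)` and the norm of its β-reduct `tau` coincide, illustrating property (1) above.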
We show in Section III.2 that all expressions in Λ are μ-normable. We subsequently show that each μ-normable expression has a normal form for β-reductions. It follows in particular that Λ is normalizable for β-reductions. It now easily follows that Λ is also normalizable for β-η-reductions. Our proofs show that the normal form of A in Λ is effectively (viz. primitively recursively) computable. In Section III.3 we prove strong normalization for Λ. We use the β₁-reduction introduced in Section II.6. We show that expressions in Λ are normalizable for β₁-reductions, using the same methods as in the corresponding proof for β-reductions in Section III.2. By using the Church-Rosser property for β₁-reductions as proved in Section II.6 we obtain the uniqueness of normal form for β₁-reductions. The special features of β₁-reduction enable us to conclude strong normalization in Λ for β₁-reductions from the normalization and the uniqueness of normal form. Strong normalization in Λ for β-reductions is a consequence, as well as strong normalization for β-η-reductions. The uniqueness of normal form in Λ is proved for β-reductions but not for β-η-reductions. Nevertheless, the latter would be a consequence if we could prove the following conjecture:
Conjecture I. In Λ the Church-Rosser property holds for β-η-reductions. □
(The difficulties in proving this arise in the same place where the corresponding statement for ∆ turns out to be false; see Section II.7.)
As to Λ, there is an important conjecture on closure:

Conjecture II. If A is an expression in Λ and if A reduces to B, then B is an expression in Λ. □

In [Nederpelt 72b] we stated this as a theorem, but the proof turned out to be incorrect. The latter conjecture has no influence upon the results in this thesis; it is, however, of importance for the construction of an efficient checking algorithm for expressions in Λ.
CHAPTER II. THE FORMAL SYSTEM ∆

1. Alphabet and syntactical variables
We use the following symbols as our alphabet:
(i) an infinite set of (individual) variables: α, β, γ, α₁, β₁, γ₁, ...;
(ii) a single constant, called the base: τ;
(iii) the improper symbols: [ , ] , ⟨ , ⟩ , : .
As syntactical variables denoting certain well-structured symbol strings (possibly empty) we use small Latin letters a, b, c, ... and Latin capitals A, B, C, ... (primed or subscripted if required). In special definitions, called Notation Rules, we restrict the use of some syntactical variables (and their primed or subscripted variants). For example, we agree upon:

Notation Rule 1.1. As a syntactical variable for arbitrary strings of symbols from the alphabet we use the Latin capital S. Such a string can be empty. The empty string itself is denoted by ∅. □

Notation Rule 1.2. As syntactical variables for individual variables we use the small Latin letters x, y and z. (Instead of “individual variable” we often say “variable”.) □

Hence from now on each use of a syntactical variable S (or S₁, S′, etc.) denotes a string of symbols from the alphabet, and each use of a syntactical variable x (or y, x₁, etc.) denotes an individual variable. It is usual to build strings of symbols from the alphabet and syntactical variables, concatenated. For example, [x : α][y : α] x is such a string. We shall call this kind of string a mixed string. Equality of mixed strings will be expressed in the discussion language by the symbol ≡. For example, if we wish to express that the strings S and [x : α] β are the same, we write S ≡ [x : α] β. The symbol ≢ is the negation of ≡. There are said to be two “occurrences” of α in the string [α : β] α. We shall formalize this notion of occurrence. We define that S″ occurs in S after S′ if there is an S‴ such that S ≡ S′S″S‴. Hence, in the above example: α occurs in [α : β] α after [, and α occurs in [α : β] α after [α : β]. In this manner we can distinguish between occurrences.
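In string terms, the definition says that an occurrence of S″ in S is identified by the prefix S′ standing before it. A direct transcription (illustrative only, with ASCII letters in place of α, β):

```python
def occurrences(s, sub):
    """Each occurrence of sub in s, identified by the prefix before it."""
    n = len(sub)
    return [s[:i] for i in range(len(s) - n + 1) if s[i:i + n] == sub]

def occurs_after(s, prefix, sub):
    """sub occurs in s after prefix iff s == prefix + sub + rest for some rest."""
    return s.startswith(prefix + sub)

# The two occurrences of 'a' in '[a:b]a' are the one after '[' and
# the one after '[a:b]', exactly as in the example in the text.
```

Distinguishing occurrences by their prefix is what allows the later definitions (disjointness, the factor A|(S; B)) to speak about one particular occurrence.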
The following statement is clear: if S₁ occurs in S after S′ and S₂ occurs in S₁ after S″, then S₂ occurs in S after S′S″. Consider the mixed string [x : y] x, in which there are two occurrences of x. If x denotes α, then [x : y] x denotes [α : y] α: both occurrences of x are replaced by α. If, moreover, y denotes β, then [x : y] x denotes [α : β] α. It is, however, also possible that both x and y denote α. Then [x : y] x denotes [α : α] α (see also [Shoenfield 67, p. 7]). Syntactical variables are used in two hardly distinguishable rôles: as abbreviations (“We abbreviate [x : α] β as S”) and as variables (“Let S be a string of the form ...”). It is also good usage to state something in the nature of “Let A ≡ [x : B] C”, meaning: “Assume that A has the form [x : B] C for certain x, B and C” (in this manner one economizes in the use of the existential quantifier). We shall define many specific sets and relations in an inductive manner (see [Shoenfield 67, p. 4]). The proof technique linked with this kind of definition, which amounts to induction on the construction, is often called (somewhat confusingly) induction on the length of proof (or induction on theorems, see [Shoenfield 67, p. 5]). We shall call an application of one rule of an inductive definition a derivation step. If a relation is defined inductively by a number of rules, then the relation is also said to be generated by these rules. When speaking of a transitive (or reflexive, etc.) relation generated by a number of rules, one wishes to express that the rule of transitivity (or reflexivity, etc.) is to be added to that number of rules. If S denotes a certain symbol string, then the length of S is the number of symbols in that string. We denote the length of S by |S|. For example, if S ≡ [α : β] α, then |S| = 6.
2. Expressions
The expressions of our systems are inductively defined as follows (we use the word expression rather than the words term or type):

Definition 2.1.
(1) A variable is an expression.
(2) τ is an expression.
(3) If x is a variable and if A and B are expressions, then [x : A] B is an expression.
(4) If A and B are expressions, then ⟨A⟩ B is an expression. □
Note that this definition gives a unique construction of an expression.
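Definition 2.1 consists of four mutually exclusive clauses, which is what makes the construction of an expression unique. Encoded as a datatype (an illustrative encoding of my own, writing `<A> B` for the applicator-expression):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:            # clause (1): a variable
    name: str

@dataclass(frozen=True)
class Tau:            # clause (2): the base tau
    pass

@dataclass(frozen=True)
class Abs:            # clause (3): [x : A] B
    x: str
    A: object
    B: object

@dataclass(frozen=True)
class App:            # clause (4): <A> B
    A: object
    B: object

def show(e):
    if isinstance(e, Var): return e.name
    if isinstance(e, Tau): return "tau"
    if isinstance(e, Abs): return f"[{e.x} : {show(e.A)}] {show(e.B)}"
    if isinstance(e, App): return f"<{show(e.A)}> {show(e.B)}"
```

Every expression matches exactly one constructor, so a function like `show` (and every proof by induction on the construction) has exactly one applicable case.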
Notation Rule 2.2. As syntactical variables for expressions we use the Latin capitals A, B, C, ..., N. □

Definition 2.3. A symbol string of the form [x : C] is called an abstractor, a symbol string of the form ⟨D⟩ an applicator. A lambda phrase is either an abstractor or an applicator. A (possibly empty) string of abstractors (applicators, λ-phrases) is called an abstractor chain (an applicator chain, a lambda phrase chain). □

Notation Rule 2.4. As a syntactical variable for abstractor chains we use the Latin capital Q, for applicator chains the Latin capital R and for λ-phrase chains the Latin capital P. □

The number of entries in a string forming an abstractor chain Q (an applicator chain R, a lambda phrase chain P) is denoted by ‖Q‖ (‖R‖, ‖P‖ respectively). Hence ‖Q‖ = 0 if Q ≡ ∅, and ‖Q [x : C]‖ = ‖Q‖ + 1. An expression B can be a subexpression of an expression A, denoted B ⊂ A. This relation is inductively defined as follows:
Definition 2.5.
(1) A ⊂ A.
(2) If C ⊂ A or C ⊂ B, then C ⊂ [x : A] B and C ⊂ ⟨A⟩ B. □

Note: if B ⊂ A, then A ≡ S₁BS₂, i.e. a subexpression of an expression A is an expression which forms a connected part of A. Instead of B ⊂ A we sometimes say: A contains B. If B ⊂ A and B ≢ A, we call B a proper subexpression of A.
Theorem 2.6. If F ⊂ E and E ⊂ D, then F ⊂ D.
Proof. Induction on |D|. □
If B ⊂ A, then B occurs in A, but there may evidently be more occurrences of B in A. In the following we wish to be able to distinguish between such occurrences of B in A. We shall indicate the occurrence meant by saying “B ⊂ A after S” if B ⊂ A and B occurs in A after S.

Definition 2.7. Let B occur in A after S₁ and let C occur in A after S₂.
We call these occurrences disjoint if either S₂ ≡ S₁BS′ or S₁ ≡ S₂CS″. □
Theorem 2.8. Let B occur in A after S₁, let C occur in A after S₂, let B ⊂ A and C ⊂ A. Then
(1) B and C occur disjointly in A, or
(2) B ⊂ C, or
(3) C ⊂ B.
Proof. Induction on the length of proof of B ⊂ A. □
Let B ⊂ A after S. We shall inductively define the factor of A with respect to S and B (denoted A | (S; B)) in Definition 2.9. In this definition the occurrence of B meant is precisely described. However, it will often be clear from the context which occurrence of B is meant in case B ⊂ A. In that case the precise indication of this occurrence is superfluous, and instead of A | (S; B) we shall write A|B. Informally we can condense the inductive definition of A|B, under the condition that in each of the following rules the occurrences of B under discussion are “in corresponding places”:
(1) If A ≡ B, then A|B ≡ A.
(2) If A ≡ [x : C] D, then A|B ≡ C|B in case B ⊂ C, and A|B ≡ [x : C] (D|B) in case B ⊂ D.
(3) If A ≡ ⟨C⟩ D, then A|B ≡ C|B in case B ⊂ C, and A|B ≡ D|B in case B ⊂ D.
For a description of a characteristic property of A|B, which justifies its introduction, see the following section (after Th. 3.6). The formal inductive definition of A | (S; B) is the following:

Definition 2.9. Let B ⊂ A after S.
(1) If A ≡ B, then A | (S; B) ≡ A.
(2) Let A ≡ [x : C] D.
If B ⊂ C after S₁ and [x : S₁ ≡ S, then A | (S; B) ≡ C | (S₁; B).
If B ⊂ D after S₂ and [x : C] S₂ ≡ S, then A | (S; B) ≡ [x : C] (D | (S₂; B)).
(3) Let A ≡ ⟨C⟩ D.
If B ⊂ C after S₁ and ⟨S₁ ≡ S, then A | (S; B) ≡ C | (S₁; B).
If B ⊂ D after S₂ and ⟨C⟩ S₂ ≡ S, then A | (S; B) ≡ D | (S₂; B). □
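The informal clauses can be transcribed directly, if an occurrence is pinned down by an explicit path into the expression instead of by the prefix S (my own encoding: `("abs", x, C, D)` for [x : C] D, `("app", C, D)` for ⟨C⟩ D):

```python
def factor(A, path):
    """A | (S; B) with the occurrence of B given by a path of steps:
    'type'/'body' descend into [x : C] D, 'arg'/'body' into <C> D."""
    if not path:                       # rule (1): the occurrence is A itself
        return A
    step, rest = path[0], path[1:]
    if A[0] == "abs":                  # rule (2): keep [x : C] only for B inside D
        if step == "type":
            return factor(A[2], rest)
        return ("abs", A[1], A[2], factor(A[3], rest))
    if A[0] == "app":                  # rule (3): the applicator is never kept
        return factor(A[1] if step == "arg" else A[2], rest)
    raise ValueError("path does not match expression")

# In [x : tau] [y : x] y, the factor at the occurrence of x in the inner
# type position is [x : tau] x -- an abstractor chain Q placed in front of x.
A = ("abs", "x", ("tau",), ("abs", "y", ("var", "x"), ("var", "y")))
```

Note how only the abstractors on the way down survive: the result always has the shape Q B for some abstractor chain Q, which is the property exploited in Section II.3.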
Note: the parentheses ( ) in [x : C] (D | (S₂; B)) belong to the discussion language and are meant to fix the scope of |. Let B ⊂ A after S. It will be clear that B ⊂ A|B, or, a fortiori: A|B ends in B (here, of course, A|B is meant to be A | (S; B)). It is also evident that A|B ≡ QB and (QA)|B ≡ Q(A|B). We state the following theorems:

Theorem 2.10. If C ⊂ B ⊂ A, then (A|B) | C ≡ A|C.
Proof. Induction on |A|. □

Theorem 2.11. If B ⊂ A and A|B ≡ Q₁[x : C] Q₂B, then C ⊂ A.
Proof. Induction on |A|, using Th. 2.6. □

Theorem 2.12. If E ≡ Q₁[x : C] D and B ⊂ C after S, then E|B ≡ Q₁(C|B) (here E|B is E | (Q₁[x : S; B) and C|B is C | (S; B)).
Proof. Induction on ‖Q₁‖. □
Theorem 2.13.
(1) If [x : C] D ⊂ A after S₁ and B ⊂ D after S₂, then A|B ≡ Q₁[x : C] Q₂B (here A|B is A | (S₁[x : C] S₂; B)).
(2) If B ⊂ A after S and A|B ≡ Q₁[x : C] Q₂B, then there is a D such that [x : C] D ⊂ A after S₁, B ⊂ D after S₂ and S ≡ S₁[x : C] S₂ (here A|B is A | (S; B)).
Proof. In both parts of the theorem: induction on |A|. □
We conclude with an inductive definition of the function Tail, which maps expressions to expressions:

Definition 2.14.
(1) Tail(x) ≡ x.
(2) Tail(τ) ≡ τ.
(3) Tail([x : A] B) ≡ Tail(B).
(4) Tail(⟨A⟩ B) ≡ Tail(B). □
Note that Tail(A) can only be a variable or τ. An expression A can always be written (uniquely) as A ≡ P Tail(A) (in which P denotes a λ-phrase chain, see Notation Rule 2.4).
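Definition 2.14 can be run as a simple loop over the λ-phrase chain P (same toy tuple encoding as in the sketches above, my own):

```python
def tail(e):
    """Tail of an expression: strip abstractors [x : A] and applicators <A>."""
    while e[0] in ("abs", "app"):
        e = e[3] if e[0] == "abs" else e[2]   # descend into the body B
    return e                                   # a variable or tau

# Tail([x : tau] <y> z) = z
expr = ("abs", "x", ("tau",), ("app", ("var", "y"), ("var", "z")))
```

The loop terminates because each step strips exactly one λ-phrase, and what remains when none are left is the unique Tail.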
3. Bound expressions
An occurrence of a variable in an expression can be a free, a bound or a binding occurrence. We shall introduce these well-known notions in our system too. An occurrence of a variable in an expression is binding if and only if that occurrence immediately follows an opening bracket [. If D contains an occurrence of x (i.e.: x ⊂ D), then that occurrence of x is either bound (and there is a unique binding occurrence of x which binds that bound occurrence) or free. A formal description is given in the following inductive definition. In this definition we often encounter “corresponding” occurrences of x. For easy understanding we shall not use our formalism concerning occurrences (see Section II.1), but we shall introduce “a certain x” and refer to it as “that x”.

Definition 3.1.
(1) x is free in x.
(2) Let a certain x be free in A or B. Then that x is free in ⟨A⟩ B.
(3) Let a certain x be free in A. Then that x is free in [y : A] B (both if y ≡ x and if y ≢ x).
(4) Let a certain x be free in B. Then that x is free in [y : A] B if y ≢ x, but that x is bound in [x : A] B (by the binding x occurring in [x : A] B after [).
(5) Let a certain x in A be bound by a binding x in A, or let a certain x in B be bound by a binding x in B. Then that x is bound by the corresponding binding x in both ⟨A⟩ B and [y : A] B (also if y ≡ x). □

The binding x occurring in [x : A] B after [ binds precisely the free x’s in B (if any). We shall mainly be interested in expressions in which no variable is free, called bound expressions (in the literature also called closed expressions). In bound expressions the same binding variable can occur in different instances. This cannot, however, give rise to confusion as to the connection between a bound variable and the binding variable by which it is bound.
Yet, for practical reasons, we wish to avoid such expressions. We call a bound expression in which all binding variables are different a distinctly bound expression, and we restrict ourselves to the set of all distinctly bound expressions, which we call ∆. This is no essential restriction: every interesting theory concerning bound expressions can be restricted to distinctly bound expressions. Let x ⊂ D after S and let this x be bound in D. It follows from Def. 3.1 that we have D ≡ S₁[x : E] F S₂ such that [x : E] F ⊂ D, x ⊂ F after S₃, S₁[x : E] S₃ ≡ S and the x occurring in D after S₁[ binds the x occurring in D after S. We shall call [x : E] the binding abstractor of the bound x. From this and Th. 2.13 (1) it follows:

Theorem 3.2. If x ⊂ D ∈ ∆, then D|x ≡ Q₁[x : E] Q₂x and [x : E] is the binding abstractor of the bound x in D. □

It follows from Th. 2.13 (2):
Theorem 3.3. If for each x ⊂ D there are Q₁, A and Q₂ such that D|x ≡ Q₁[x : A] Q₂x, then D is a bound expression. □
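Definition 3.1 and the notion of a distinctly bound expression can be checked mechanically. A sketch under the same toy encoding (mine, not the thesis's formalism); note that, by clause (3), the binder [y : A] does not bind occurrences in its own type A:

```python
def free_vars(e):
    k = e[0]
    if k == "var":
        return {e[1]}
    if k == "tau":
        return set()
    if k == "app":
        return free_vars(e[1]) | free_vars(e[2])
    # [x : A] B: x stays free in A (clause 3), but is bound in B (clause 4)
    return free_vars(e[2]) | (free_vars(e[3]) - {e[1]})

def binding_vars(e):
    """All binding occurrences, with multiplicity, left to right."""
    if e[0] == "abs":
        return [e[1]] + binding_vars(e[2]) + binding_vars(e[3])
    if e[0] == "app":
        return binding_vars(e[1]) + binding_vars(e[2])
    return []

def distinctly_bound(e):
    bs = binding_vars(e)
    return not free_vars(e) and len(bs) == len(set(bs))

good = ("abs", "x", ("tau",), ("abs", "y", ("var", "x"), ("var", "y")))
bad  = ("abs", "x", ("tau",), ("abs", "x", ("tau",), ("var", "x")))
```

Here `good` encodes [x : τ] [y : x] y, which is distinctly bound, while `bad` encodes [x : τ] [x : τ] x, which is bound but not distinctly bound.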
The following theorem expresses in an intricate manner the obvious observation that, in case x ⊂ K ⊂ D ∈ ∆, the x is bound by a binding abstractor either outside or inside K.

Theorem 3.4. If x ⊂ K ⊂ D ∈ ∆, then either
(i) D|K ≡ Q₁[x : A] Q₂K, D|x ≡ Q₁[x : A] Q₂Q′x and K|x ≡ Q′x, or
(ii) K|x ≡ Q₁[x : A] Q₂x and D|x ≡ QQ₁[x : A] Q₂x.
In both cases [x : A] is the binding abstractor of x in D.
Proof. Let D|K ≡ QK; then D|x ≡ (by Th. 2.10) (D|K) | x ≡ (QK) | x ≡ Q(K|x) ≡ QQ′x. Hence, by Th. 3.2, either Q ≡ Q₁[x : A] Q₂, or Q′ ≡ Q₁[x : A] Q₂. □

Theorem 3.5. If Q ⟨A⟩ B ∈ ∆, then QA and QB ∈ ∆; if Q [x : A] B ∈ ∆, then QA ∈ ∆.
Proof. Apply Th. 3.3. □

Theorem 3.6. If A ∈ ∆ and B ⊂ A after S, then A|B ∈ ∆.
Proof. First assume that x ⊂ B. Then by Th. 2.10: (A|B) | x ≡ A|x ≡ Q₁[x : C] Q₂x. Next assume that A|B ≡ QB and x ⊂ Q. Then Q ≡ Q₁[y : D] Q₂ and x ⊂ D. By Th. 2.11: D ⊂ A. So (QB)|x ≡ Q₁(D|x) ≡ (Q₁D) | x ≡ (A|D) | x ≡ A|x ≡ Q₁[x : C] Q₂x. Apply Th. 3.3, and note that the binding variables in A|B must be distinct. □

The above theorem states an essential feature of the factor A|B. If A ∈ ∆ and B ⊂ A, then not necessarily B ∈ ∆. But A|B, which is QB for a certain abstractor chain Q, “closes” B in A by placing in front of B those abstractors which necessarily bind all free variables of B. This, together with our previously uttered wish to restrict our expressions as much as possible to ∆, justifies our introduction in Section II.2 of the factor A|B. We continue with three theorems, related to one another.
Theorem 3.7. If Q [x : A] B ∈ ∆ and not x ⊂ B, then QB ∈ ∆.
Proof. Observe the various places where a variable y (≢ x) can occur in QB. □

Theorem 3.8. If QB ∈ ∆ and Q [x : A] B ∈ ∆, then not x ⊂ B.
Proof. The assumption x ⊂ B leads to a contradiction. □

Theorem 3.9. If QA and QB ∈ ∆, if the binding variables in A and B are distinct and if x does not occur as a binding variable in QA or QB, then Q [x : A] B ∈ ∆.
Proof. Again observe the various places where a variable y (≢ x) can occur in Q [x : A] B. □

In general: if QB and QPD ∈ ∆ and x occurs as a binding variable in P, then not x ⊂ B. We say: P has no binding influence on B. The description of ∆ which we gave so far began with (general) expressions and selected the distinctly bound expressions among these. This method is not so practical for theoretical investigations. In the following two theorems we shall indicate how we can compose expressions in ∆ from expressions in ∆. Or, rather, we shall show how expressions in ∆ can be decomposed into smaller expressions, which also belong to ∆.
Theorem 3.10. (1) r E A . (2) If QA E A and if x does not occur in QA, then Q [ x : A]. E A and
Q [t : A]7 E A .
(3) If QA and Qy ∈ ∆, if x does not occur in QA and if x ≢ y, then Q [x : A] y ∈ ∆.
(4) If QA and QB ∈ ∆ and if the binding variables in A and B are distinct, then Q ⟨A⟩ B ∈ ∆.

Proof. It is trivial that Q [x : A] x, Q [x : A] τ, Q [x : A] y and Q ⟨A⟩ B respectively are again expressions. These are also clearly bound expressions, and moreover distinctly bound by the conditions given in the theorem. □

We may consider the four parts of the previous theorem as derivation rules.
Definition 3.11. We call K ∆-constructible if we can establish that K ∈ ∆ by a (finite) number of applications of the rules in Th. 3.10. □

The proof of the following theorem is technical. Yet it is interesting to see how we can establish ∆-constructibility. For better understanding, we shall express the main lines at the end of the present section.

Theorem 3.12. If K ∈ ∆, then K is ∆-constructible.

Proof. Induction on |K|. If |K| = 1, then K ≡ τ and K is ∆-constructible by rule (1). Assume that |K| > 1, and let all distinctly bound expressions K′ with |K′| < |K| be ∆-constructible (first induction hypothesis). Then K ≡ P₁P₂ ... Pₙ Tail(K) for some n ≥ 1, where each of the Pᵢ is a lambda phrase (i.e. either an abstractor or an applicator). We can now prove the lemma: “For all i with 1 ≤ i ≤ n + 1 it holds that K | (Pᵢ ... Pₙ Tail(K)) is ∆-constructible”, by induction on n + 1 − i.
(1) Let i = n + 1 (i.e. Pᵢ ... Pₙ ≡ ∅). If |K | Tail(K)| < |K|, then the first induction hypothesis leaves nothing to prove. If |K | Tail(K)| = |K|, then K | Tail(K) ≡ K ≡ [x₁ : E₁] ... [xₙ : Eₙ] Tail(K) for n ≥ 1.
(a) Assume that Tail(K) ≡ x. Then for exactly one s: xₛ ≡ x. Abbreviate [x₁ : E₁] ... [xₜ : Eₜ] as Qₜ for 0 ≤ t ≤ n. We distinguish the cases xₙ ≡ x and xₙ ≢ x. If xₙ ≡ x, then K is ∆-constructible by rule (2), since Qₙ₋₁Eₙ ∈ ∆ by Th. 3.5. If xₙ ≢ x, then K is ∆-constructible by rule (3), since Qₙ₋₁Eₙ ∈ ∆ and Qₙ₋₁x ∈ ∆ (the latter by Th. 3.7).
(b) Assume that Tail(K) ≡ τ. Then Qₙ₋₁Eₙ ∈ ∆ by Th. 3.5 and ∆-constructible by induction. By rule (2) we then find that K is ∆-constructible.
(2) Let 1 ≤ i ≤ n and assume that K | (Pⱼ ... Pₙ Tail(K)) is ∆-constructible if i < j ≤ n + 1 (second induction hypothesis). If |K | (Pᵢ ... Pₙ Tail(K))| < |K|, then again the first induction hypothesis leaves nothing to prove. So let |K | (Pᵢ ... Pₙ Tail(K))| = |K|. Then K ≡ K | (Pᵢ ... Pₙ Tail(K)) ≡ [x₁ : E₁] ... [xₜ : Eₜ] Pᵢ ... Pₙ Tail(K).
(a) Let Pᵢ ≡ [xₜ₊₁ : Eₜ₊₁]. Then we have that K | (Pᵢ ... Pₙ Tail(K)) ≡ K | (Pᵢ₊₁ ... Pₙ Tail(K)), which is ∆-constructible by the second induction hypothesis.
(b) Let Pᵢ ≡ ⟨F⟩. Then QₜF ∈ ∆ by Th. 3.5 and ∆-constructible by the second induction hypothesis, and the same holds for QₜPᵢ₊₁ ... Pₙ Tail(K). Hence, by rule (4): Qₜ⟨F⟩Pᵢ₊₁ ... Pₙ Tail(K) ≡ K is ∆-constructible.
From this lemma it follows that K | (P₁ ... Pₙ Tail(K)) ≡ K|K ≡ K is ∆-constructible. (In this proof we did not check the conditions concerning variables in rules (1) to (4). It is easy to see that these are fulfilled in the appropriate places.) □
(1) we first establish that K I Tail(K) is A-constructible: (a) if Tail(K) = z, then find the binding abstractor [z : A] in K which binds K, establish that K I A = QA is A-constructible, and apply
be the abstracrule (2) to obtain Q[z : A]. E A. Let Q{,Qh, ...,Qi tors occurring in K 1 Tail(K) “between” [z : A] and z. Insert these abstractors, starting with Q{ (from left to right), by inserting Q; in Q [z : A] Q{ ... Q:-,z between Q:-l and z (by rule (3)). In this manner we establish that K I Tail(K) is A-constructible. (b) If Tail(K) = 7 , then K I Tail(K) = 7 and we may use rule (1) immediately, or K I Tail(K) = Q [z : A] 7 . In the latter case: establish that QA is A-constructible, and apply rule (2) to establish that K I Tail(K) is also A-constructible.
(2) We established that K | Tail(K) ≡ [x₁ : A₁] ... [xₙ : Aₙ] Tail(K) is ∆-constructible. In K we find applicators ⟨B₁⟩, ..., ⟨Bₗ⟩ “between” the abstractors [xᵢ : Aᵢ]. Insert these Bᵢ, starting with Bₗ and ending with B₁ (from right to left), in the appropriate places, using rule (4). In this manner we establish that K is ∆-constructible. (Note the following: if we establish that K is ∆-constructible, then we use the ∆-constructibility of K|E for all E ⊂ K. We can prove this by induction on the length of proof of K ∈ ∆.)

4. Replacement, renovation and α-reduction
If we replace a certain variable x in all its occurrences (free, bound or binding) in an expression A by a variable y, then we denote the result of this replacement by ((x := y)) A. An inductive definition of simple replacement is the following (induction here is on the length of the expression):

Definition 4.1. For each pair x and y, ((x := y)) is a function from expressions to expressions.
(1) ((x := y)) x ≡ y; ((x := y)) z ≡ z if z ≢ x; ((x := y)) τ ≡ τ.
(2) ((x := y)) [z : A] B ≡ [((x := y)) z : ((x := y)) A] ((x := y)) B.
(3) ((x := y)) ⟨A⟩ B ≡ ⟨((x := y)) A⟩ ((x := y)) B. □
The simple replacement of certain variables by others will be used for making the binding variables in an expression distinct. This we shall call the renovation of the expression. Renovation is in fact nothing but a repeated renaming of variables. Renaming does not affect relevant properties of expressions (under a reasonable interpretation of “variable”; see also what we said concerning α-equivalent terms in Section I.1).
We have maintained names for variables for reasons of tradition and legibility. This is at the expense of the renovation selector (to be introduced in this section) and the so-called α-reduction (our wishes concerning bound expressions in the previous section had nothing to do with names for variables, but with our dislike of the occurrence of free variables; the additional wish to have distinctly bound expressions, however, does concern names for variables, and leads us to introduce renovation). We shall discuss the process of renovation. The mathematical meaning of renovation is not profound. If one dislikes a formalization of an intuitively clear concept, one can continue reading at Def. 4.4 (concerning α-reduction). There is, of course, one precaution which one must take in the renovation process: the relation between bound and binding variables should remain unaffected in a natural way. For example, in [x : y] [x : x] x the binding x occurring after [ binds the bound x occurring after [x : y] [x : ; the binding x occurring after [x : y] [ binds the bound x occurring at the end of the expression. In changing this expression into an expression with distinct binding variables, we might obtain: [x : y] [z : x] z for a certain z ≢ x. Such a variable z as introduced in the latter example by the renovation process plays a special rôle. It has to be chosen with care. At any rate it should be different from all binding variables in the expression under discussion, or, as we shall say: it has to be fresh with respect to that expression. We introduce the renovation selector Fr_V, operating on expressions. In using the renovation selector Fr_V with an expression A we have it preceded by a lambda phrase chain P, giving P Fr_V A. The subscript V denotes a finite set of variables, which can be empty. We shall not specify the variables belonging to V until the following section, where we use Fr_V in the formal definition of substitution.
The expression P Fr_V A can informally be described as being PA′, where
(1) A′ is obtained from A by renovation of A, and
(2) the fresh variables chosen during this renovation do not occur in P or V and are mutually distinct.
The following inductive definition gives a formalization of this concept:
Definition 4.2. Let V be a finite set of variables.
(1) P Fr_V x = Px; P Fr_V τ = Pτ.
(2) P Fr_V ([y : B] C) = P [z : B′] Fr_V (((y := z)) C), with PB′ = P Fr_V B, while z does not occur in PB′ and z ∉ V.
(3) P Fr_V ((B) C) = P (B′) Fr_V C, with PB′ = P Fr_V B. □
Strong normalization in a typed lambda calculus (C.3)
413
From the above definition we see that the renovation of an expression takes place from left to right. For instance, the renovation resulting in P Fr_V ((B) C) first requires the renovation resulting in P Fr_V B, and subsequently the one resulting in P (B′) Fr_V C. This implies that the fresh variables chosen in the renovation process are mutually distinct. Of course uncertainties remain as to the choice of fresh variables. An order in the set of all variables could turn the selector Fr_V into an operator. We shall, however, not push the formalization this far. In the following section we shall describe substitution by the aid of the renovation selector, and we shall in turn use substitution in describing the β-reduction. Our use of the renovation selector is meant to keep an expression distinctly bound after β-reduction. We shall use the renovation selector in typing an expression, as is described in Section II.8, with the same purpose. We usually begin renovation with V = ∅ (here, of course, ∅ denotes the empty set and not the empty string).
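Viewed operationally, Def. 4.2 is a left-to-right traversal that gives every abstractor a fresh binding variable. The following is a minimal sketch under assumed conventions that are ours, not the text's: expressions are nested tuples ('var', x), ('tau',), ('abs', x, B, C) for [x : B] C and ('app', B, C) for (B) C, and fresh variables are drawn from the scheme z0, z1, .... Note that this sketch renames every binder, which is what Def. 4.2 prescribes (the informal example above renamed only the clashing one).

```python
import itertools

def fresh(used):
    """Pick a variable not in `used` and reserve it (mutates `used`)."""
    for i in itertools.count():
        z = f"z{i}"
        if z not in used:
            used.add(z)
            return z

def rename_all(t, x, y):
    """((x := y)) t: replace ALL occurrences of x, bound and binding, by y."""
    if t[0] == "var":
        return ("var", y) if t[1] == x else t
    if t[0] == "tau":
        return t
    if t[0] == "abs":
        _, v, dom, body = t
        return ("abs", y if v == x else v,
                rename_all(dom, x, y), rename_all(body, x, y))
    return ("app", rename_all(t[1], x, y), rename_all(t[2], x, y))

def renovate(t, used):
    """Fr_V t: make all binding variables fresh and mutually distinct,
    working from left to right as in Def. 4.2."""
    if t[0] in ("var", "tau"):
        return t
    if t[0] == "abs":
        _, v, dom, body = t
        dom2 = renovate(dom, used)   # first renovate the domain B
        z = fresh(used)              # then choose a fresh z
        return ("abs", z, dom2, renovate(rename_all(body, v, z), used))
    return ("app", renovate(t[1], used), renovate(t[2], used))
```

Seeding `used` with the variables already present plays the role of the set V.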
Definition 4.3. P Fr A = P Fr_∅ A. □
In fact P Fr A is the concatenation of P and Fr_W A, where W is the set of variables occurring in P. Let P Fr_V A = PA′ be the result of a renovation. If we again write Fr_V A, in the same context, we do not require a new renovation but mean A′. If we wish another renovation in the same context, then we supply Fr_V with primes: P Fr′_V A can be such a new renovation. We shall now define the α-reduction relation. For an informal discussion of α-reduction see Section 1.1. We restrict α-reduction to expressions in Λ:
Definition 4.4. α-reduction, denoted by ≥_α, is the transitive relation generated by:
If A ∈ Λ and if y does not occur in A, then A ≥_α ((x := y)) A. □
The α-reduction is clearly an equivalence relation (reflexivity: take x to be a variable which does not occur in A; symmetry: note that x no longer occurs in ((x := y)) A). If two expressions are related by α-reduction (A ≥_α B), we speak of "the α-reduction A ≥_α B". This is clearly abuse of language, although it cannot give rise to confusion. A renaming of a single variable in a distinctly bound expression is called a single-step α-reduction, denoted as A ≥¹_α B (so A ≥¹_α B if and only if A ∈ Λ and B ≡ ((x := y)) A, where x occurs as a binding variable in A and y does not occur in A).
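A single-step α-reduction is then one guarded renaming. A sketch, re-declaring the hypothetical tuple encoding and helpers so the fragment stands alone (names are ours, not the text's):

```python
def rename_all(t, x, y):
    """((x := y)) t: replace all occurrences of x by y."""
    if t[0] == "var":
        return ("var", y) if t[1] == x else t
    if t[0] == "tau":
        return t
    if t[0] == "abs":
        _, v, dom, body = t
        return ("abs", y if v == x else v,
                rename_all(dom, x, y), rename_all(body, x, y))
    return ("app", rename_all(t[1], x, y), rename_all(t[2], x, y))

def variables(t):
    """All variables occurring in t (bound, binding or free)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "tau":
        return set()
    if t[0] == "abs":
        return {t[1]} | variables(t[2]) | variables(t[3])
    return variables(t[1]) | variables(t[2])

def binding_variables(t):
    if t[0] in ("var", "tau"):
        return set()
    if t[0] == "abs":
        return {t[1]} | binding_variables(t[2]) | binding_variables(t[3])
    return binding_variables(t[1]) | binding_variables(t[2])

def alpha_step(t, x, y):
    """A >=1_alpha ((x := y)) A: x must be a binding variable of t,
    and y must not occur in t at all (the side conditions of Def. 4.4)."""
    assert x in binding_variables(t) and y not in variables(t)
    return rename_all(t, x, y)
```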
The following theorems are trivial:
Theorem 4.5. If PA ∈ Λ, then P Fr_V A ∈ Λ; if A ∈ Λ and A ≥_α B, then B ∈ Λ and |A| = |B|. □
Theorem 4.6. Let PA, PB, PP′A and PP′B ∈ Λ. Then PA ≥_α PB if and only if PP′A ≥_α PP′B. □
5. Substitution and β-reduction
Substitution is an operation acting on expressions. We denote "the result of the substitution of A for x in B" by (x := A) B. One can use several definitions of substitution which are equivalent. We shall use the definitions given in Def. 5.1 and Def. 5.2. These definitions of substitution can informally be described as follows: P (x := A)_V B is the expression which we obtain from PB by replacing all free x's in B with renovations of A, in which the fresh variables are chosen in the following manner: they have to be mutually distinct for all renovations of A replacing the free x's, and they have to be distinct from the variables occurring in B, P or V (we do not replace the binding variables in B by fresh ones). Here V denotes a finite set of variables, which can be empty. This careful dealing with fresh variables is necessary to guarantee that an expression with distinct binding variables has again distinct binding variables after β-reduction (to be defined in this section); substitution is an essential part of β-reduction. The following part of this section, as far as Def. 5.5, will formalize the above notion of substitution. As with renovation, our formalization of substitution may be cumbersome to the reader. One may continue with Def. 5.5 without impairing understanding. An inductive definition of P (x := A)_V B is the following (induction is here on the length of B):
Definition 5.1.
(1) P (x := A)_V x = P Fr_V A; P (x := A)_V y = Py if y ≢ x; P (x := A)_V τ = Pτ.
(2) If y ≢ x, then P (x := A)_V [y : B] C = P [y : B′] (x := A)_V C, where PB′ = P (x := A)_W B, W being the union of V and the set of all variables occurring in C. If y ≡ x, then P (x := A)_V [y : B] C = P [y : B′] C, where B′ is obtained as above.
(3) P (x := A)_V (B) C = P (B′) (x := A)_V C, where B′ is obtained as above. □
From the above definition we see that substitution (as renovation) takes place from left to right in an expression: for instance, the substitution resulting in P (x := A)_V (B) C first requires the substitution resulting in P (x := A)_W B, and subsequently the substitution resulting in P (B′) (x := A)_V C. In part (2) of the definition we see the importance of the subscript used in (x := A): in executing the substitution resulting in P (x := A)_W B we must have at our disposal the set of variables occurring in C, in order to be able to choose fresh variables different from the variables in C. The set W contains the latter variables. The set V can be the empty set. When we begin a substitution, V is usually empty.
Definition 5.2. P (x := A) B ≡ P (x := A)_∅ B. □
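Def. 5.1 can also be read as an algorithm: walk B from left to right, replacing each free x by a renovation of A, with all fresh variables drawn from one shared pool so that the copies are mutually distinctly bound. A sketch, with the same hypothetical tuple encoding as before; seeding the pool with every variable in sight subsumes the W = V ∪ vars(C) bookkeeping of part (2):

```python
import itertools

def fresh(used):
    for i in itertools.count():
        z = f"z{i}"
        if z not in used:
            used.add(z)
            return z

def rename_all(t, x, y):                       # ((x := y)) t
    if t[0] == "var":
        return ("var", y) if t[1] == x else t
    if t[0] == "tau":
        return t
    if t[0] == "abs":
        _, v, d, b = t
        return ("abs", y if v == x else v,
                rename_all(d, x, y), rename_all(b, x, y))
    return ("app", rename_all(t[1], x, y), rename_all(t[2], x, y))

def renovate(t, used):                         # Fr_V t (Def. 4.2)
    if t[0] in ("var", "tau"):
        return t
    if t[0] == "abs":
        _, v, d, b = t
        d2 = renovate(d, used)
        z = fresh(used)
        return ("abs", z, d2, renovate(rename_all(b, v, z), used))
    return ("app", renovate(t[1], used), renovate(t[2], used))

def substitute(t, x, a, used):
    """(x := A) t: replace each free x by a fresh renovation of A.
    Binding variables of t itself are NOT renamed (Def. 5.1)."""
    if t[0] == "var":
        return renovate(a, used) if t[1] == x else t
    if t[0] == "tau":
        return t
    if t[0] == "abs":
        _, v, d, b = t
        d2 = substitute(d, x, a, used)
        if v == x:                             # x is rebound: stop here
            return ("abs", v, d2, b)
        return ("abs", v, d2, substitute(b, x, a, used))
    return ("app", substitute(t[1], x, a, used),
                   substitute(t[2], x, a, used))
```

Each occurrence of x receives its own copy of A with distinct binders, as the test below illustrates.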
In the above definitions there are two more or less unusual parts. Usually (x := A) x is defined as A; however, with a view to our wish to keep distinctly bound expressions distinctly bound after some substitution, we deviate from this. Next, one sometimes defines (x := A) [y : B] C, in either of the cases that x ≡ y or x ≢ y, as [z : (x := A) B] (x := A) ((y := z)) C, z being a fresh variable. The latter definition prevents so-called "confusion of variables" (cf. [Curry and Feys 58, Ch. 3D2]; see also the example below), but gives rise to an additional amount of simple substitutions of variables of the form (y := z), which we find cumbersome. In using Def. 5.1 and Def. 5.2 confusion of variables may occur if the use of the substitution operator is not restricted. For example, we have that: [y : A] (x := y) [y : τ] x = [y : A] [y : τ] y, where the final y in the latter expression is influenced by [y : τ], and not by [y : A] as it should be. In general: confusion of variables may arise as a consequence of the substitution resulting in P (x := A)_V B if a free variable y of A (with y ≢ x) occurs as a binding variable in B, and if there is a free x in B within the "scope" of that binding variable y. A sufficient condition for avoiding this is that the free variables of A do not occur as binding variables in B. We use substitution only in the relation β-reduction, defined later in this section. The above condition is there fulfilled. Hence confusion of variables cannot arise in our system. Note that (x := A) operates on free x's, ((x := A)) on all x's, and Fr on all binding variables in an expression. We also define the substitution operator for lambda phrase chains:
Definition 5.3. If (x := A) Pτ = P′τ, then (x := A) P = P′. □
One may interchange the substitution operator and the renovation selector under certain conditions:
Theorem 5.4. Let PA ∈ Λ and P [x : B] D ∈ Λ. Then P Fr (x := A) D = P (x := A) Fr D if no binding variable of D occurs in A.
Proof. Induction on |D|, using Th. 4.5 and the lemma: "((y := z)) (x := A) C = (x := A) ((y := z)) C if y ∉ A (but for renaming)". □
Substitution is used in the more interesting reduction in lambda calculus called β-reduction, which we denote by ≥_β. The interpretation linked with β-reduction is the application of a function to an argument (see also the informal description in Section 1.1). We shall restrict β-reduction to expressions belonging to Λ. This is unusual. One usually conceives of a reduction as a formal relation between expressions in which free variables may occur. It is only our preference for distinctly bound expressions which makes us choose the definitions given below. Note that our β-reduction is not essentially different from the usual concept. The use of the Q in Def. 5.5 is a little obscuring in this respect. We first define single-step β-reduction, denoted by ≥¹_β:
Definition 5.5. Single-step β-reduction is the relation generated by:
(1) If Q (A) [x : B] C ∈ Λ, then Q (A) [x : B] C ≥¹_β Q (x := A) C.
(2) Let Q (A) C and Q (A) D ∈ Λ. If QC ≥¹_β QD, then Q (A) C ≥¹_β Q (A) D.
(3) Let Q [x : A] C and Q [x : B] C ∈ Λ. If QA ≥¹_β QB, then Q [x : A] C ≥¹_β Q [x : B] C.
(4) Let Q (A) C and Q (B) C ∈ Λ. If QA ≥¹_β QB, then Q (A) C ≥¹_β Q (B) C. □
Note: one rule appears to be missing (viz.: Let Q [x : A] C and Q [x : A] D ∈ Λ. If QC ≥¹_β QD, then Q [x : A] C ≥¹_β Q [x : A] D). But this is a derived rule, see Cor. 5.14. Rules (2), (3) and (4) in the above definition are called the monotony rules of single-step β-reduction; we call rule (1) the rule of elementary β-reduction. If K and L are related by a single-step β-reduction, we obtain L from K by replacing a certain subexpression (A) [x : B] C in K by (x := A) C. This is, in terms of interpretation, a single functional application. If K ≥¹_β L, then we have a construction (a "proof") which establishes that relation according to Def. 5.5. Such a construction consists of one derivation step which is an elementary β-reduction (rule (1) of Def. 5.5) followed by a number of derivation steps which are monotony steps (rules (2) to (4)). Note that a single-step β-reduction is achieved from a number of derivation steps. We note that, since Q (A) [x : B] C ∈ Λ, no free variable of A can occur as binding variable in C. This is sufficient, as previously stated, to prevent "confusion of variables".
Definition 5.6. β-reduction is the reflexive and transitive closure of single-step β-reduction. (This means that
(1) if K ≥¹_β L, then K ≥_β L;
(2) K ≥_β K;
(3) if K ≥_β L and L ≥_β M, then K ≥_β M.) □
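For expressions in Λ, the condition noted above — no free variable of A occurs as a binding variable in C — makes plain textual substitution safe, so an elementary β-step can be sketched without the renovation machinery. The tuple encoding and helper names below are our own assumptions, and the leftmost-redex strategy is our choice, not the text's:

```python
def subst_free(t, x, a):
    """(x := A): replace free x's by A textually. Safe when no free variable
    of A is bound in t (the condition imposed for redexes in Lambda).
    Renovation of the copies (Def. 5.1) is omitted in this sketch."""
    if t[0] == "var":
        return a if t[1] == x else t
    if t[0] == "tau":
        return t
    if t[0] == "abs":
        _, v, d, b = t
        return ("abs", v, subst_free(d, x, a),
                b if v == x else subst_free(b, x, a))
    return ("app", subst_free(t[1], x, a), subst_free(t[2], x, a))

def beta_step(t):
    """One elementary beta-step at the leftmost redex (A)[x:B]C, or None."""
    if t[0] in ("var", "tau"):
        return None
    if t[0] == "app":
        a, c = t[1], t[2]
        if c[0] == "abs":                      # (A)[x:B]C  ->  (x := A)C
            _, x, _b, body = c
            return subst_free(body, x, a)
        r = beta_step(a)
        if r is not None:
            return ("app", r, c)
        r = beta_step(c)
        return None if r is None else ("app", a, r)
    _, v, d, b = t                             # abstractor [v:d]b
    r = beta_step(d)
    if r is not None:
        return ("abs", v, r, b)
    r = beta_step(b)
    return None if r is None else ("abs", v, d, r)
```

Iterating `beta_step` until it returns None is one way to search for a normal form, though nothing in this sketch guarantees termination.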
If A and B are related by a single-step β-reduction, we speak of "the single-step β-reduction A ≥¹_β B". As with α-reduction, this is abuse of language. Analogously we speak of "the β-reduction A ≥_β B". If A₀ ≥¹_β A₁, A₁ ≥¹_β A₂, ..., A_{n−1} ≥¹_β A_n, we also write A₀ ≥¹_β A₁ ≥¹_β ... ≥¹_β A_n, or A₀ ≥ⁿ_β A_n. We call this a composite single-step β-reduction, or an n-step β-reduction (a zero-step β-reduction has, of course, the form A ≥⁰_β A). If A₀ ≥_β A₁, ..., A_{n−1} ≥_β A_n, we also write A₀ ≥_β A₁ ≥_β ... ≥_β A_n. It follows from the definition of β-reduction that each β-reduction K ≥_β L can be presented as an n-step β-reduction K ≡ A₀ ≥¹_β ... ≥¹_β A_n ≡ L. This splitting is called a decomposition of a β-reduction. Each of the monotony rules has the form: "If reduction (i) holds, then reduction (ii) holds". It is usual to call reduction (ii) the direct consequence of reduction (i). For example: Q (A) C ≥_β Q (A) D is a direct consequence of QC ≥_β QD. We recall that "the length of proof of K ≥¹_β L" is the total number of derivation steps in the proof of K ≥¹_β L. A proof of K ≥¹_β L begins with an elementary β-reduction Q (A) [x : B] C ≥¹_β Q (x := A) C. In this case we say that Q (A) [x : B] C generates the single-step β-reduction K ≥¹_β L. The following theorem holds:
Theorem 5.7. Let K ∈ Λ. Then Q (A) [x : B] C generates a single-step β-reduction of the form K ≥¹_β L if and only if (A) [x : B] C ⊂ K and Q (A) [x : B] C ≡ K | (A) [x : B] C.
Proof. ⇒: Induction on the length of proof of K ≥¹_β L. ⇐: We state the following lemma: "Let K ≡ Q′M ∈ Λ, and let E ≡ (A) [x : B] C ⊂ M. Then K | E generates a single-step β-reduction K ≡ Q′M ≥¹_β Q′N". We prove this lemma by induction on |M|:
(1) Let M ≡ E. Then K | E ≡ K ≡ Q′ (A) [x : B] C ≥¹_β Q′ (x := A) C.
(2a) Let M ≡ [z : F] G and E ⊂ F. Call K′ ≡ Q′F ∈ Λ; then E ⊂ F, hence, by induction: K′ | E ≡ Q′ (F | E) generates a single-step β-reduction K′ ≡ Q′F ≥¹_β Q′F′. By applying monotony rule (3) it follows that K′ | E also generates Q′M ≡ Q′ [z : F] G ≥¹_β Q′ [z : F′] G. Moreover, K | E ≡ K′ | E.
(2b) Let M ≡ [z : F] G and E ⊂ G. Take K′ ≡ K ≡ Q′ [z : F] G ≡ Q″G; then E ⊂ G, hence, by induction: K′ | E generates a single-step β-reduction Q″G ≥¹_β Q″G′, which can be rewritten as Q′M ≥¹_β Q′ [z : F] G′.
(3a) Let M ≡ (F) G and E ⊂ F. The proof is analogous to the one in case (2a), using monotony rule (4) instead of (3).
(3b) Let M ≡ (F) G and E ⊂ G. Again the proof is analogous to the one in case (2a), now using monotony rule (2) instead of (3).
This proves the lemma. The "if-part" of the theorem follows immediately from the lemma. □
We state the "closure theorem for Λ with respect to single-step β-reduction":
Theorem 5.8. If K ∈ Λ and K ≥¹_β L, then L ∈ Λ.
Proof. Induction on the length of proof of K ≥¹_β L. In the proof we can use Th. 3.2. Note that our definition of substitution for a variable with the aid of the renovation selector is essential. □
Corollary 5.9. If K ∈ Λ and K ≥_β L, then L ∈ Λ. □
One may conceive of the β-reduction relation, not as a relation between expressions, but as a relation between α-equivalence classes (in the α-equivalence class of K we include all K′ such that K ≥_α K′). The following theorem gives a justification for this conception of β-reduction:
Theorem 5.10. Let A ∈ Λ, A ≥¹_β B and A ≥_α C. Then there is a reduction C ≥¹_β D such that B ≥_α D.
Proof. It is sufficient to assume that A ≥¹_α C. Apply induction on the length of proof of A ≥¹_β B. □
In the sequel we shall sometimes refer to the above conception of β-reduction by inserting the words "but for α-reduction" in a statement (for example: "A ≥_β B but for α-reduction" means that there are A′ and B′ such that A ≥_α A′ ≥_β B′ ≥_α B). However, we often omit the words "but for α-reduction".
We shall proceed with a number of theorems in which especially the rôle of the abstractor chain Q in a β-reduction is considered, Q occurring at the beginning of an expression. (The definition of ‖P‖ was given after Notation Rule 2.4.)
Theorem 5.11. If QE ∈ Λ, QE ≥¹_β Q′F and ‖Q‖ = ‖Q′‖, then Q ≡ Q′ or E ≡ F. In the latter case Q ≡ Q₁ [x : K] Q₂, Q′ ≡ Q₁ [x : L] Q₂ and Q₁K ≥¹_β Q₁L.
Proof. Induction on the length of proof of QE ≥¹_β Q′F. There are four possible cases for the last derivation step in the proof of QE ≥¹_β Q′F. In three of these cases the conclusion is: Q ≡ Q′. The fourth case is that the last derivation step in the proof of QE ≥¹_β Q′F is: "Q₁K ≥¹_β Q₁L, so QE ≡ Q₁ [x : K] M ≥¹_β Q₁ [x : L] M ≡ Q′F". Now if ‖Q‖ ≤ ‖Q₁‖, then Q ≡ Q′, and if ‖Q‖ > ‖Q₁‖, then E ≡ F, Q ≡ Q₁ [x : K] Q₂ and Q′ ≡ Q₁ [x : L] Q₂. □
Theorem 5.12. If QE ∈ Λ and QE ≥¹_β K, then K ≡ Q′F′ for certain Q′ and F′ with ‖Q′‖ = ‖Q‖.
Proof. The reduction QE ≥¹_β K must be the conclusion of one of the rules of Def. 5.5. It is easy to see that the statement of the theorem holds good in all these cases. □
If a reduction QC ≥_β Q′D is given, in which ‖Q‖ = ‖Q′‖, one can conceive of an accompanying "reduction" of C to D. The following theorem shows this.
Theorem 5.13. If QC, Q₀C and Q₀D ∈ Λ, QC ≥_β Q′D and ‖Q‖ = ‖Q′‖, then Q₀C ≥_β Q₀D.
Proof. First assume that QC ≥¹_β Q′D. Then by Th. 5.11, Q ≡ Q′ or C ≡ D. In the latter case it is trivial that Q₀C ≥_β Q₀D. So assume Q ≡ Q′. Then induction on the length of proof of QC ≥¹_β Q′D and Th. 5.12 yield Q₀C ≥¹_β Q₀D. The general theorem follows. □
The apparently missing monotony rule, announced immediately after Def. 5.5, is a consequence:
Corollary 5.14. Let QC, QD, Q [x : A] C and Q [x : A] D ∈ Λ. If QC ≥¹_β QD, then Q [x : A] C ≥¹_β Q [x : A] D. □
We defined β-reduction as being transitive and reflexive. We shall now show that β-reduction is also monotonous:
Theorem 5.15. The monotony rules hold for β-reduction, i.e.:
(1) If Q (A) C and Q (A) D ∈ Λ, and QC ≥_β QD, then Q (A) C ≥_β Q (A) D.
(2) If Q [x : A] C and Q [x : B] C ∈ Λ, and QA ≥_β QB, then Q [x : A] C ≥_β Q [x : B] C.
(3) If Q (A) C and Q (B) C ∈ Λ, and QA ≥_β QB, then Q (A) C ≥_β Q (B) C.
Proof. We shall prove rule (1). Since QC ≥_β QD, we know that QC ≥¹_β E₁ ≥¹_β E₂ ≥¹_β ... ≥¹_β E_n ≡ QD. From Th. 5.12, Th. 5.13 and induction on n it follows that there is also a reduction QC ≥¹_β QF₁ ≥¹_β QF₂ ≥¹_β ... ≥¹_β QF_n ≥¹_β QD. Repeated application of the corresponding monotony rule for single-step β-reduction gives: Q (A) C ≥¹_β Q (A) F₁ ≥¹_β ... ≥¹_β Q (A) D, hence Q (A) C ≥_β Q (A) D. The other two monotony rules for β-reduction can be proved analogously. □
One may extend Cor. 5.14 to β-reduction:
Theorem 5.16. Let QC, QD, Q [x : A] C and Q [x : A] D ∈ Λ. If QC ≥_β QD, then Q [x : A] C ≥_β Q [x : A] D. □
Theorem 5.17. If QC, PC and PD ∈ Λ, QC ≥_β Q′D and ‖Q‖ = ‖Q′‖, then PC ≥_β PD.
Proof. Let PC ≡ Q₀C; then Q₀C ≥_β Q₀D follows from Th. 5.13. The theorem is proved by repeated application of monotony rule (2) for β-reduction (see Th. 5.15). □
Note: the converse of this theorem does not hold ("if PC, QC and QD ∈ Λ and PC ≥_β PD, then QC ≥_β QD"). Example (Q ≡ [x : τ], P ≡ [x : τ] (x)): [x : τ] (x) [y : τ] (x) [z : τ] y ≥_β [x : τ] (x) [z : τ] x, but not: [x : τ] [y : τ] (x) [z : τ] y ≥_β [x : τ] [z : τ] x.
Theorem 5.18. If P (A) [x : B] C ∈ Λ, then P (A) [x : B] C ≥¹_β P (x := A) C.
Proof. This is a consequence of the following: (P (A) [x : B] C) | (A) [x : B] C ≡ Q (A) [x : B] C ≥¹_β Q (x := A) C. Apply Th. 5.17. □
Theorem 5.19. If QK, QM and Q′M ∈ Λ, QK ≥_β Q′L and ‖Q‖ = ‖Q′‖, then QM ≥_β Q′M.
Proof. Along the same lines as the proof of Th. 5.16. □
The following theorems are trivial consequences of the preceding.
Theorem 5.20. If Q [y : K] L ∈ Λ and Q [y : K] L ≥_β Q [y : K′] L′, then QK ≥_β QK′. □
Theorem 5.21. If QK ∈ Λ, QK ≥_β Q′K′ and ‖Q‖ = ‖Q′‖, then QK ≥_β QK′ ≥_β Q′K′ and QK ≥_β Q′K ≥_β Q′K′. □
We define the beta equivalence relation as follows:
Definition 5.22. Let A and B ∈ Λ. We call A beta equivalent to B (denoted: A ∼_β B) if there is an expression C such that A ≥_β C and B ≥_β C. □
It is clear that beta equivalence is reflexive and symmetric. The transitivity of beta equivalence will be proved in Th. 7.35, using Th. 6.43 (in the literature the transitive closure of beta equivalence is called beta conversion).
Theorem 5.23. Let QK and QL ∈ Λ. If QK ∼_β QL, there is an N such that QK ≥_β QN and QL ≥_β QN.
Proof. Since QK ∼_β QL: QK ≥_β A and QL ≥_β A. From Th. 5.12 it follows that A ≡ Q′N with ‖Q‖ = ‖Q′‖. Then from Th. 5.21: QK ≥_β QN and QL ≥_β QN. □
From this theorem, together with Th. 5.15, it easily follows that the monotony rules hold for beta equivalence (parts (a), (c) and (d) of the following theorem); part (b) follows from Th. 5.23 and Th. 5.16:
Theorem 5.24.
(a) If QC, QD, Q (A) C, Q (A) D ∈ Λ and QC ∼_β QD, then Q (A) C ∼_β Q (A) D.
(b) If QC, QD, Q [x : A] C, Q [x : A] D ∈ Λ and QC ∼_β QD, then Q [x : A] C ∼_β Q [x : A] D.
(c) If QA, QB, Q (A) C, Q (B) C ∈ Λ and QA ∼_β QB, then Q (A) C ∼_β Q (B) C.
(d) If QA, QB, Q [x : A] C, Q [x : B] C ∈ Λ and QA ∼_β QB, then Q [x : A] C ∼_β Q [x : B] C. □
6. Other β-reductions
In a β-reduction we eliminate a pair of the form (A) [x : B] occurring in an expression, obtaining "copies" of A (to be precise: expressions A′, A″, etc., which are renovations of A) instead of the non-binding x's in that expression. We sometimes wish to retain information concerning "past" β-reductions, as a kind of "scar" in an expression. The easiest way to do this is to maintain the pair (A) [x : B] in an expression after β-reduction. We shall formalize this kind of β-reduction, calling it β₁-reduction. Another β-reduction, called β₂-reduction, will be introduced especially to eliminate the "scars" (A) [x : B] as soon as they are no longer required. We shall show that a β-reduction can be decomposed into a β₁-reduction and a β₂-reduction. We describe in this section β₁- and β₂-reduction as a preparation for Section III.3.
The fact that we wish to keep the pair (A) [x : B] in an expression after β₁-reduction complicates matters, since we wish each sequence of β-reductions to have a corresponding sequence of β₁-reductions. For example, a sequence of β-reductions Q (A) (B) [x : τ] [y : x] y ≥¹_β Q (A) [y : Fr B] y ≥¹_β Q Fr A should have its counterpart in β₁-reductions: Q (A) (B) [x : τ] [y : x] y ≥¹_β₁ Q (A) (B) [x : τ] [y : Fr B] y ≥¹_β₁ Q (A) (B) [x : τ] [y : Fr B] Fr A. Note that in the latter single-step β₁-reduction we have to ignore the scar (B) [x : τ] located between (A) and [y : Fr B] (a property of such a scar (B) [x : τ] is that the x does not occur in the expression following it). However, β₁-reduction permits more. One may ignore the pair (B) [x : τ] in Q (A) (B) [x : τ] [y : x] y, in spite of the fact that x does occur in [y : x] y. This gives the single-step β₁-reduction: Q (A) (B) [x : τ] [y : x] y ≥¹_β₁ Q (A) (B) [x : τ] [y : x] Fr A, by applying [y : x] y to A. This is a real extension of the usual β-reduction concept. We shall call a string like (B) [x : τ] in the above example, which may be located between "function" and "argument" of a β-reduceable expression, a β-chain. Moreover, we shall agree that the relation Q (A) P̄ [y : C] D ≥¹_β₁ Q (A) P̄ [y : C] (y := A) D (in which P̄ is a β-chain) does only hold if y occurs in D. If we did not require this, one could continue the β₁-reduction with the latter expression, thus producing a non-terminating β₁-reduction sequence. The β₂-reduction relation, on the other hand, eliminates an applicator and an abstractor, as in Q (A) P̄ [ȳ : C] E ≥¹_β₂ QP̄E (in which P̄ is again a β-chain); the latter relation only holds, however, if y does not occur in E. We give an inductive definition of β-chain:
Definition 6.1.
(1) If P = ∅, then P is a β-chain.
(2) If P is a β-chain, then (B) P [x : C] is a β-chain.
(3) If P₁ and P₂ are β-chains, then P₁P₂ is a β-chain. □
Example: (A) (B) [x : C] (D) [y : E] [z : F] is a β-chain.
Notation Rule 6.2. We write P̄ to indicate that P is a β-chain. We write [x̄ : B] C to indicate in [x : B] C that x does not occur in C. □
A β-chain P̄ has the property that the number of applicators in P̄ is equal to the number of abstractors in P̄. Moreover, if P̄ ≡ P₁P₂, the number of applicators in P₁ is at least equal to the number of abstractors in P₁. We can also express this by means of a valuation v of lambda phrase chains, defined inductively by
(1) v(∅) = 0,
(2) v((A) P) = v(P) + 1,
(3) v([x : A] P) = v(P) − 1.
Then for a β-chain P̄ it holds that
(i) v(P̄) = 0, and
(ii) if P̄ ≡ P₁P₂, then v(P₁) ≥ 0.
These conditions are also sufficient. The following theorem concerning β-chains can be proved by the aid of the above-mentioned valuation properties:
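Conditions (i) and (ii) give a one-pass test for β-chains: scan the chain left to right, counting +1 for an applicator and −1 for an abstractor; the running count may never become negative and must end at 0. A sketch, encoding a chain simply as a sequence of tags (our own simplification):

```python
def is_beta_chain(chain):
    """chain: sequence of 'app' (an applicator (A)) and 'abs' (an
    abstractor [x:A]). Checks v(P) = 0 and v(P1) >= 0 for every prefix P1."""
    v = 0
    for tag in chain:
        v += 1 if tag == "app" else -1
        if v < 0:          # a prefix with more abstractors than applicators
            return False
    return v == 0
```

On the example above, (A) (B) [x : C] (D) [y : E] [z : F], the running counts are 1, 2, 1, 2, 1, 0, so the test succeeds.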
Theorem 6.3.
(1) If P̄ ≢ ∅, then P̄ ≡ P̄′ (B) [x : C] P̄″.
(2) P ≡ P₁P₂ is a β-chain if and only if P′ ≡ P₁P̄P₂ is a β-chain. □
Note that for each P̄ ≢ ∅ there is a unique decomposition P̄ ≡ P̄₁P̄₂ ... P̄_n with P̄_i ≢ ∅ and P̄_i ≢ P̄′_i P̄″_i for any β-chains P̄′_i ≢ ∅ and P̄″_i ≢ ∅. The following theorem shows that a β-chain has a compact structure:
Theorem 6.4. If (A) P̄ [x : B] C ⊂ P₁F, then either (A) P̄ [x : B] is a part of P₁ (i.e. P₁ ≡ s₁ (A) P̄ [x : B] s₂), or (A) P̄ [x : B] C ⊂ F. □
Proof. The essential part of the theorem is that the following cannot occur: P̄ ≡ P₂P₃, P₁ ≡ P₄ (A) P₂ and F ≡ P₃ [x : B] C ("(A) P̄ [x : B] C occurs partially in P₁, partially in F"). This can be proved by the aid of the valuation properties for β-chains. □
We continue with the definitions of single-step β₁- and β₂-reduction:
Definition 6.5. Single-step β₁-reduction is the relation generated by
(1) If Q (A) P̄ [x : B] C ∈ Λ and x ⊂ C, then Q (A) P̄ [x : B] C ≥¹_β₁ Q (A) P̄ [x : B] (x := A) C, and
(2) the monotony rules (see Def. 5.5 (2), (3) and (4), reading ≥¹_β₁ instead of ≥¹_β). □
[g : B]C E A, then Q ( A )P [z : B]C >b2 QPC, and 0
(2) the monotony rules.
Deflnition 6.8. P2-reduction is the reflexive and transitive closure of singlestep p2-reduction. 0 Note that in Def. 6.7 z may not occur in C. As in the case of @-reduction (see the previous section), we speak of elementary @I- or @2-reduction, n-step @I- or @2-reduction1and a single-step @I- or @2-reduction generated by Q ( A ) P [z : B]C or Q ( A )P [g : B]C respectively. L” or “... K L” is defined as for single-step The “length of proof of K @-reduction. Finally we define PI-equivalence ( K -pl L ) and @2-equivalence ( K L) analogously to @-equivalence (see the previous section).
>b2
>bl
>b, L , then L E A. Induction on the length of proof of K >., L or K >b, L respectively.
Theorem 6.9. If K E A and K Proof. Cf. the proof of Th. 5.8.
L
OT
K
0
Strong normalization in a typed lambda calculus ((3.3)
425
Definition 6.10. We shall write P >p, P’ etc. if P r >p, P’r etc. Theorem 6.11. If Q A E A and Q A with 11Q11 = 11&’11, and either (1) Q
>bl
>&, K
(for Q A
0
>b2 K ) , then K --= Q’A‘
>&, Q’ respectively) and A = A’, or and Q A >b, QA‘ (or Q A >b2 QA’ respectively).
Q’ (or Q
(2) Q I Q‘
Proof. Cf. the proofs of Th. 5.11 and Th. 5.12.
0
Theorem 6.12. The monotony rules hold for PI-reduction and for Pz-reduction.
Proof. Cf. the proof of Th. 5.15.
0
Theorem 6.13. If QC, Q P C and Q P D E A, and QC >pl Q D (or QC >p, QD), then QPC I p , Q P D (or QPC >pa Q P D respectively). 0
Proof. Cf. the proof of Th. 5.17.
The following two theorems deal with the relation between β-reduction on the one hand and β₁- and β₂-reduction on the other.
Theorem 6.14. If K ≥¹_β₁ L, then K ∼_β L.
Proof. Induction on the length of proof of K ≥¹_β₁ L.
I. K ≥¹_β₁ L is Q (A) P̄ [x : B] C ≥¹_β₁ Q (A) P̄ [x : B] (x := A) C. Note that, if P̄ ≢ ∅, then P̄ ≡ P̄′ (B) [z : C] P̄″, and P₁ ≡ P̄′P̄″ is again a β-chain by Th. 6.3. It follows that there is a β-reduction for K and for L. Continuation of this β-reduction process gives: K ≥_β Q (A) [x : B′] C′ ≥_β Q (x := A) C′, and L ≥_β Q (A) [x̄ : B′] (x := A) C′ ≥_β Q (x := A) C′. In the latter reduction one should note that the substitutions (y := D) introduced in the reduction L ≥_β Q (A) [x̄ : B′] (x := A) C′ do not influence A, since y ∉ A. Together with the statement that in this case (y := D) (x := A) E = (x := A) (y := D) E (but for renaming), this results in us obtaining the same C′ (but for renaming) in reducing L as we obtained in reducing K. It follows that K ∼_β L.
II. K ≥¹_β₁ L is Q (A) C ≥¹_β₁ Q (A) D as a direct consequence of QC ≥¹_β₁ QD. Then, by induction: QC ∼_β QD. Hence, by Th. 5.24: Q (A) C ∼_β Q (A) D.
III, IV. The two other monotony cases are proved similarly to II. □
Theorem 6.15. If K ∈ Λ and K ≥¹_β L, then K ≥¹_β₁ M ≥¹_β₂ L or K ≥¹_β₂ L.
Proof. Let K ≥¹_β L be generated by Q (A) [x : B] C ≥¹_β Q (x := A) C. If x ⊂ C, then Q (A) [x : B] C ≥¹_β₁ Q (A) [x̄ : B] (x := A) C ≥¹_β₂ Q (x := A) C.
If x ⊄ C, then K ≥¹_β L is a β₂-reduction. The remainder follows from the fact that the monotony rules for β-, β₁- and β₂-reduction are similar. □
We shall now prove a number of theorems leading to a theorem on the possibility of postponement of β₂-reductions until after β₁-reductions (Th. 6.19).
Theorem 6.16. If K ∈ Λ and K ≥¹_β₂ L ≥¹_β₁ M, then K ≥¹_β₁ L′ ≥ⁿ_β₂ M for a certain n ≥ 1.
M for a
>bl M be generated by the following elementary ,&-reduction: >bl Q ( A )P[z : B ](z := A)C, and K Ibz L by
Proof. Let L Q (A! P[z : B ] C
[i
Q’ (D) PI : El F 2bz Q’PlF. Then ( A )P [z : B]C c L and P1F c L. Now we can distinguish three cases: (1) ( A )P [z : B]C and P1F occur in L in disjoint places, (2) ( A )P [ z : B ] C c PlF, (3) P1F c ( A ) P [ z : B ] C and P1F f ( A )P [Z: B]C. (1) In this case it is clear that the theorem holds for n = 1. (2) We may distinguish (see also Th. 6.4):
(a)
= P 2 ( G )P3 or = P2[z : GI P3 and ( A )P [z : B]C theorem holds for n = 1.
(b)
PI = P2 ( A )P [z : B]P3. Idem.
c
G. The
(c) ( A ) P [z : B]C c F. Idem. These cases (a)-(c) are exhaustive if ( A )P [z : B]C c
F.
(3) (a) Let P1F c A . Now the theorem holds for n is the number of occurrences of z in C plus one.
= P2PlP3. The theorem holds for n = 1. Let P E P2 ( G )P3 or P = P ~ [:zGI P3 and @IFc G. Idem.
(b) Let P (c)
(d) Let P1F
c [z : B ]C. Idem.
These cases (a)-(d) are exhaustive if ( A ) P [z : B ]C.
F c ( A )P [z : B]C and
P1F
f
(In this proof we several times use the lemma: PlqP3, then PI ( G )&[z : H ] P3 is also a P-chain”, which is a conse“If P 0 quence of Th. 6.3.) Theorem 6.17. If K E A, K >pa L
M , then K
>bl
L‘ >pa M .
Proof. Induction on the number of steps of K >pz L, using Th. 6.16.
0
Theorem 6.18. If K ∈ Λ and K ≥_β₂ L ≥ᵖ_β₁ M, then K ≥ᵖ_β₁ L′ ≥_β₂ M.
Proof. Induction on p, using Th. 6.17. □
Theorem 6.19. If K₁ ≥¹ K₂ ≥¹ ... ≥¹ K_n by single-step β₁- and β₂-reductions, the total number of β₁-reductions being p, there is a reduction K₁ ≡ L ≥ᵖ_β₁ M ≥_β₂ N ≡ K_n.
Proof. Combine the successive single-step β₁-reductions in K₁ ≥¹ K₂ ≥¹ ... ≥¹ K_n, and do the same with the successive single-step β₂-reductions: we obtain K₁ ≡ L₁ ≥_β₁ M₁ ≥_β₂ L₂ ≥_β₁ M₂ ≥_β₂ ... ≥_β₂ L_l ≥_β₁ M_l ≥_β₂ K_n. Induction on l yields the proof. □
We shall now prove what we call the Church-Rosser property (CR) for β₁-reduction, which we formulate as follows: if K ≥_β₁ L and K ≥_β₁ M, there is an N such that L ≥_β₁ N and M ≥_β₁ N. We can express this Church-Rosser property in a diagram, as follows:
Figure 1.
From CR for β₁-reduction it easily follows that β₁-equivalence is transitive, hence indeed an equivalence relation (reflexivity and symmetry of ∼_β₁ are trivial). Hence we can also state that ∼_β₁ is the equivalence relation generated by ≥_β₁, which is an alternative formulation for the Church-Rosser theorem for β₁-reduction. In proving CR for β₁-reduction we shall use a technique introduced by W.W. Tait and P. Martin-Löf, given in [Barendregt 71, Appendix II]. We shall discuss the power of this technique in brief. In order to prove CR for a reduction it is natural to begin with single-step reductions "K ≥¹ L" and "K ≥¹ M". In a usual single-step (e.g. β-) reduction one can then find an N such that "L ≥ N" and "M ≥ N", but unfortunately
only one of these last two reductions is necessarily a single-step reduction, and one cannot say in advance which of the two. If one now begins with multiple-step reductions "K ≥ L" and "K ≥ M" and one tries, by the aid of the above, to find an N such that "L ≥ N" and "M ≥ N", then the termination of this attempt is not guaranteed. The following example, drawn in a diagram, suggests what might happen:
Figure 2.
Each rectangle in this diagram represents reductions; three sides of the rectangle are single-step reductions, one side is two-step. The diagram can, however, be continued indefinitely in the place where we draw the dotted lines. Now Tait and Martin-Löf defined a new "single"-step reduction (which we shall call single-step nested reduction to avoid confusion). The latter reduction has the property that with each pair of single-step nested reductions "K ≥* L" and "K ≥* M" there can be found an N such that "L ≥* N" and "M ≥* N", both last-mentioned reductions being single-step nested reductions as well. Moreover, each multiple-step reduction can be decomposed into single-step nested reductions and each single-step nested reduction is a composition of (ordinary) single-step reductions. If one now begins with multiple-step reductions "K ≥ L" and "K ≥ M", one can decompose these reductions into single-step nested reductions and apply the
Strong normalization in a typed lambda calculus (C.3)
above. Then one obtains, for example, a situation as is expressed in the following diagram:
Figure 3

In each of these rectangles the four sides represent single-step nested reductions. Moreover, the nested reductions "L ≥* L′ ≥* N" and "M ≥* M′ ≥* N" can be decomposed into (ordinary) single-step reductions, which combine into "L ≥ N" and "M ≥ N". Thus we obtain CR.
In the following we shall define a single-step nested β₁-reduction, which we shall call single-step γ-reduction. Our γ-reduction is a little more complex than the nested reduction of Tait and Martin-Löf, but it yields essentially the same results. The "nested" character of γ-reduction can be explained as follows. Let a β₁-reduction be generated by Q (A) P̄ [x : B] C, let QA ≥¹γ QA′ and Q P̄ [x : B] C ≥¹γ Q P̄′ [x : B′] C′. Then also Q (A) P̄ [x : B] C ≥¹γ Q (A′) P̄′ [x : B′] (x := A′) C′ (if the suggested β₁-reduction is preceded by single-step nested reductions "inside" A, P̄, B and C, the composite reduction is a single-step nested reduction). The reductions take place in a "nested" order. With the aid of γ-reductions we shall prove CR for β₁-reduction.
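The Tait–Martin-Löf technique described above can be made concrete outside the present typed system. The following sketch (ours, in Python, for the untyped lambda calculus with de Bruijn indices — not Nederpelt's Λ-notation) implements a maximal single-step nested reduction `par` that contracts every redex of a term simultaneously; the names `shift`, `subst`, `beta`, `par` and the tuple encoding are assumptions of this illustration, not the text's.

```python
def shift(t, d, c=0):
    """Add d to every free variable of t with index >= c."""
    tag = t[0]
    if tag == 'var':
        return ('var', t[1] + d) if t[1] >= c else t
    if tag == 'app':
        return ('app', shift(t[1], d, c), shift(t[2], d, c))
    return ('lam', shift(t[1], d, c + 1))

def subst(t, j, s):
    """Replace free variable j in t by s (s must already be shifted)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == j else t
    if tag == 'app':
        return ('app', subst(t[1], j, s), subst(t[2], j, s))
    return ('lam', subst(t[1], j + 1, shift(s, 1)))

def beta(body, arg):
    """Contract the redex (lam body) arg."""
    return shift(subst(body, 0, shift(arg, 1)), -1)

def par(t):
    """One 'nested' step: reduce inside all subterms, then fire every redex."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'lam':
        return ('lam', par(t[1]))
    f, a = t[1], t[2]
    if f[0] == 'lam':                 # nested order: inside first, then the redex
        return beta(par(f[1]), par(a))
    return ('app', par(f), par(a))
```

For K ≡ (λx. x x)((λy. y) z), the two ordinary one-step reducts L and M both parallel-reduce to z z in one nested step, so the diamond closes exactly as in Figure 3.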
Definition 6.20. Single-step γ-reduction, denoted by ≥¹γ, is the reflexive relation generated by
(1) If Q (A) P̄ [x : B] C ∈ Λ, x ⊂ C, QA ≥¹γ QA′ and Q P̄ [x : B] C ≥¹γ Q P̄′ [x : B′] C′, then Q (A) P̄ [x : B] C ≥¹γ Q (A′) P̄′ [x : B′] (x := A′) C′.
(2) If Q (A) C and Q (A′) C′ ∈ Λ, QA ≥¹γ QA′ and QC ≥¹γ QC′, then Q (A) C ≥¹γ Q (A′) C′.
(3) If Q [x : A] C and Q [x : A′] C′ ∈ Λ, QA ≥¹γ QA′ and Q [x : A] C ≥¹γ Q [x : A] C′, then Q [x : A] C ≥¹γ Q [x : A′] C′.
(4) If A ∈ Λ, A ≥¹γ B and B ≥α B′, then A ≥¹γ B′. □
We call (2) and (3) the monotony rules for single-step γ-reduction. We also define γ-equivalence (K ∼γ L) analogously to β-equivalence.

Definition 6.21. γ-reduction, denoted by ≥γ, is the transitive closure of single-step γ-reduction. □

We continue with some theorems concerning γ-reduction (it will be clear that Q ≥¹γ Q′ if and only if Qτ ≥¹γ Q′τ).
Theorem 6.22. If QA ∈ Λ and QA ≥¹γ K, then K ≡ Q′A′, where ‖Q‖ = ‖Q′‖, Q ≥¹γ Q′ and QA ≥¹γ QA′.

Proof. Induction on the length of proof of QA ≥¹γ K.
(1) If QA ≥¹γ K by reflexivity, the theorem is trivial.
(2) If QA ≥¹γ K is Q₀ (B) P̄ [x : C] D ≥¹γ Q₀ (B′) P̄′ [x : C′] (x := B′) D′ as a direct consequence of Q₀B ≥¹γ Q₀B′ and Q₀ P̄ [x : C] D ≥¹γ Q₀ P̄′ [x : C′] D′, then ‖Q‖ ≤ ‖Q₀‖, so Q₀ ≡ QQ″, and the theorem follows.
(3) If QA ≥¹γ K is Q₀ (B) C ≥¹γ Q₀ (B′) C′ as a direct consequence of Q₀B ≥¹γ Q₀B′ and Q₀C ≥¹γ Q₀C′, then again ‖Q‖ ≤ ‖Q₀‖ and the theorem follows.
(4) Let QA ≥¹γ K be Q₀ [x : B] C ≥¹γ Q₀ [x : B′] C′ as a direct consequence of Q₀B ≥¹γ Q₀B′ and Q₀ [x : B] C ≥¹γ Q₀ [x : B] C′.
(i) If ‖Q‖ ≤ ‖Q₀‖ or Q ≡ Q₀ [x : B], then the theorem follows.
(ii) If C ≡ [y₁ : B₁] ... [yₙ : Bₙ] A and Q ≡ Q₀ [x : B] [y₁ : B₁] ... [yₙ : Bₙ], then QA ≡ Q₀ [x : B] C ≡ Q₀ [x : B] [y₁ : B₁] ... [yₙ : Bₙ] A ≥¹γ Q₀ [x : B] C′ ≡ Q₀ [x : B] [y₁ : B₁′] ... [yₙ : Bₙ′] A′ (by induction) with QA ≥¹γ QA′ and Q ≥¹γ Q₀ [x : B] [y₁ : B₁′] ... [yₙ : Bₙ′]. It follows that Q ≥¹γ Q₀ [x : B′] [y₁ : B₁′] ... [yₙ : Bₙ′]. Also C′ ≡ [y₁ : B₁′] ... [yₙ : Bₙ′] A′, so Q₀ [x : B′] C′ ≡ Q₀ [x : B′] [y₁ : B₁′] ... [yₙ : Bₙ′] A′. Consequently the theorem holds if we take Q′ ≡ Q₀ [x : B′] [y₁ : B₁′] ... [yₙ : Bₙ′].
(5) Let QA ≥¹γ K be a direct consequence of QA ≥¹γ K′ and K′ ≥α K. Then by induction the theorem holds for QA ≥¹γ K′, and trivially also for QA ≥¹γ K. □
Theorem 6.23. The monotony rules hold for γ-reduction.

Proof. Cf. the proof of Th. 5.15. □
Theorem 6.24. If QC, PC and PD ∈ Λ, and QC ≥¹γ QD, then PC ≥¹γ PD.

Proof. Analogous to the proof of Th. 5.17. □
The following two theorems deal with the relation between β₁- and γ-reduction.
Theorem 6.25. If K ∈ Λ and K ≥¹β₁ L, then K ≥¹γ L.

Proof. Induction on the length of proof of K ≥¹β₁ L. The rule of elementary β₁-reduction is covered by Def. 6.20 (1) (take QA ≥¹γ QA′ to be QA ≥¹γ QA by reflexivity, etc.); the monotony rules for β₁-reduction are covered by the monotony rules for γ-reduction (again using the reflexivity of γ-reduction in appropriate places). □

Theorem 6.26. If K ∈ Λ and K ≥¹γ L, then K ≥β₁ L but for α-reduction.
Proof. Induction on the length of proof of K ≥¹γ L. For example: let K ≥¹γ L be Q (A) P̄ [x : B] C ≥¹γ Q (A′) P̄′ [x : B′] (x := A′) C′ as a direct consequence of x ⊂ C, QA ≥¹γ QA′ and Q P̄ [x : B] C ≥¹γ Q P̄′ [x : B′] C′. By induction the last two reductions can also be obtained by β₁-reductions and α-reductions, and Q (A) P̄ [x : B] C ≥β₁ Q (A) P̄′ [x : B′] C′ ≥β₁ Q (A′) P̄′ [x : B′] C′ ≥β₁ Q (A′) P̄′ [x : B′] (x := A′) C′ (but for α-reduction). In the last β₁-reduction we use the lemma: "If x ⊂ C, if x occurs as a binding variable in Q₁ and if Q₁C ≥β₁ Q₁C′, then x ⊂ C′". □

Theorem 6.27. If K ∈ Λ and K ≥γ L, then L ∈ Λ.

Proof. Follows from Th. 6.26 and Th. 6.9. □
We inductively define similarity of two lambda chains (not necessarily β-chains):

Definition 6.28.
(1) If P₁ ≡ P₂ ≡ ∅, then P₁ and P₂ are similar.
(2) If P₁ and P₂ are similar, then (A) P₁ and (B) P₂ are similar, and [x : A] P₁ and [x : B] P₂ are similar. □

The following theorems are a preparation for Th. 6.37, which expresses CR
for γ-reduction. In order to prove Cor. 6.31 and Th. 6.34 it is convenient to extend the notion of β-chain, as in Def. 6.29: a number of β-chains, connected by abstractors, will be called a β-chain complex. A β-chain complex may be empty.

Definition 6.29. Let P₁, P₂, ..., Pᵢ be (possibly empty) β-chains. Then a lambda chain P₁ [x₁ : A₁] P₂ [x₂ : A₂] ... Pᵢ₋₁ [xᵢ₋₁ : Aᵢ₋₁] Pᵢ is called a β-chain complex. □

We denote a β-chain complex P by P̄. The following statement can be proved by the aid of the valuations: if P̄ is a β-chain complex and P̄ ≡ P̄₁P̄₂, then P̄₂ is a β-chain complex.
Theorem 6.30. If Q P̄ A ∈ Λ and Q P̄ A ≥¹γ K, then K ≡ Q′ P̄′ A′, where ‖Q‖ = ‖Q′‖, Q ≥¹γ Q′, Q P̄ A ≥¹γ Q P̄′ A′ and P̄ and P̄′ are similar.

Proof. If P̄ ≡ Q₁ (including P̄ = ∅), then Th. 6.22 gives the proof. Let P̄ ≢ Q₁. We proceed with induction on the length of proof of Q P̄ A ≥¹γ K. Note that there must be at least one applicator in the lambda chain P̄, on account of our assumption P̄ ≢ Q₁.
(1) Assume that Q P̄ A ≥¹γ K by reflexivity. The proof is now trivial.
(2) Assume that Q P̄ A ≥¹γ K is QQ₁ (C) P̄₁ [y : D] E ≥¹γ QQ₁ (C′) P̄₁′ [y : D′] (y := C′) E′ as a direct consequence of QQ₁C ≥¹γ QQ₁C′ and QQ₁ P̄₁ [y : D] E ≥¹γ QQ₁ P̄₁′ [y : D′] E′. Now it must hold that E ≡ P̄₂ A, while Q₁ P̄₁ [y : D] P̄₂ is a β-chain complex. By induction: E′ ≡ P̄₂′ A′, and P̄₂ and P̄₂′ are similar. The remainder is easy.
(3) Assume that Q P̄ A ≥¹γ K is QQ₁ (C) D ≥¹γ QQ₁ (C′) D′ as a direct consequence of QQ₁C ≥¹γ QQ₁C′ and QQ₁D ≥¹γ QQ₁D′. Now P̄ ≡ Q₁ (C) P̄₂ and D ≡ P̄₂ A. Here P̄₂ is a β-chain complex, hence by induction D′ ≡ P̄₂′ A′, in which P̄₂ and P̄₂′ are similar. The remainder follows easily.
(4) (a) Assume that Q P̄ A ≥¹γ K is QQ₁ [y : C] D ≥¹γ QQ₁ [y : C′] D′ as a direct consequence of QQ₁C ≥¹γ QQ₁C′ and QQ₁ [y : C] D ≥¹γ QQ₁ [y : C] D′. Then P̄ ≡ Q₁ [y : C] P̄₂ and D ≡ P̄₂ A. The completion of the proof is similar to that in the last part of the previous case.
(b) Assume that Q P̄ A ≥¹γ K is Q₀ [y : C] Q₁ P̄₁ A ≥¹γ Q₀ [y : C′] D′ as a direct consequence of Q₀C ≥¹γ Q₀C′ and Q₀ [y : C] Q₁ P̄₁ A ≥¹γ Q₀ [y : C] D′. Then, by induction, D′ ≡ Q₁′ P̄′ A′ with ‖Q₁‖ = ‖Q₁′‖, while P̄ and P̄′ are similar. The remainder follows.
(5) Assume that Q P̄ A ≥¹γ K is a direct consequence of Q P̄ A ≥¹γ K′ and K′ ≥α K. The proof is again by induction. □
Corollary 6.31. If Q P̄ A ∈ Λ and Q P̄ A ≥γ K, then K ≡ Q′ P̄′ A′, where ‖Q‖ = ‖Q′‖, Q ≥γ Q′, Q P̄ A ≥γ Q P̄′ A′, while P̄ and P̄′ are similar. □

The following four theorems are lemmas for Th. 6.36. Th. 6.34 might have been called "the substitution lemma for γ-reduction".

Theorem 6.32.
(1) Let Q P̄ [x : B] C ∈ Λ and Q P̄ [x : B] C ≥¹γ K. Then K ≡ Q′ P̄′ [x : B′] C′ such that Q ≥¹γ Q′, Q P̄ [x : B] C ≥¹γ Q P̄′ [x : B′] C′, ‖Q‖ = ‖Q′‖, while P̄ and P̄′ are similar.
(2) Let Q (A) P̄ [x : B] C ∈ Λ and Q (A) P̄ [x : B] C ≥¹γ K. Then either
(i) K ≡ Q′ (A′) P̄′ [x : B′] C′, or
(ii) K ≡ Q′ (A′) P̄′ [x : B′] (x := A′) C′,
where in both cases Q ≥¹γ Q′, QA ≥¹γ QA′, Q P̄ [x : B] C ≥¹γ Q P̄′ [x : B′] C′ and ‖Q‖ = ‖Q′‖, while P̄ and P̄′ are similar.
(3) Let Q (A) B ∈ Λ and Q (A) B ≥¹γ K. Then either
(i) K ≡ Q′ (A′) B′, where Q ≥¹γ Q′, QA ≥¹γ QA′, QB ≥¹γ QB′ and ‖Q‖ = ‖Q′‖, or
(ii) Q (A) B ≡ Q (A) P̄ [x : C] D, x ⊂ D, K ≡ Q′ (A′) P̄′ [x : C′] (x := A′) D′, Q ≥¹γ Q′, ‖Q‖ = ‖Q′‖, QA ≥¹γ QA′ and Q P̄ [x : C] D ≥¹γ Q P̄′ [x : C′] D′.

Proof. See Cor. 6.31 and the possibilities for single-step γ-reduction. □
Theorem 6.33. If Q P̄ [x : B] C ∈ Λ, Q P̄ [x : B] C ≥γ Q′ P̄′ [x : B′] C′ and x ⊂ C, then x ⊂ C′.

Proof. In a subexpression we can only eliminate free variables by substitution, and substitution can only originate from a β₁-reduction. Note that a β₁-reduction yielding a substitution (x := A) cannot occur in the above. □
Theorem 6.34. Let QA and Q P̄ B ∈ Λ, let no binding variable of P̄ B occur in QA and let x not occur in P̄. Let Q P̄ B ≥¹γ Q P̄′ B′ (where ‖P̄‖ = ‖P̄′‖) and QA ≥¹γ QA′. Then Q P̄ (x := A) B ≥¹γ Q P̄′ (x := A′) B′.
Proof. First consider the case that P̄ ≡ Q₁, so P̄′ ≡ Q₁′. We prove the theorem by induction on |B|. If B ≡ y ≢ x or B ≡ τ, then the theorem is trivial. If B ≡ x, note that QQ₁A ≥¹γ QQ₁A′, so also QQ₁A ≥¹γ QQ₁′A′ by induction on the length of proof of QQ₁x ≥¹γ QQ₁′x and monotony rule (3) for single-step γ-reduction.
(1) Let B ≡ [y : E] F. Then B′ ≡ [y : E′] F′ by Th. 6.32 (1), QQ₁E ≥¹γ QQ₁E′ and QQ₁ [y : E] F ≥¹γ QQ₁ [y : E] F′. So also (by induction) QQ₁ (x := A) E ≥¹γ QQ₁ (x := A′) E′ and QQ₁ [y : E] (x := A) F ≥¹γ QQ₁ [y : E] (x := A′) F′. It easily follows from the latter reduction that QQ₁ [y : (x := A) E] (x := A) F ≥¹γ QQ₁ [y : (x := A) E] (x := A′) F′. It follows that QQ₁ (x := A) B ≥¹γ QQ₁′ (x := A′) B′.
(2) Let B ≡ (E) F. Now by Th. 6.32 (3) either B′ ≡ (E′) F′ with QQ₁E ≥¹γ QQ₁′E′ and QQ₁F ≥¹γ QQ₁′F′, or B ≡ (E) P̄₁ [y : G] H and QQ₁B ≡ QQ₁ (E) P̄₁ [y : G] H ≥¹γ QQ₁′ (E′) P̄₁′ [y : G′] (y := E′) H′ ≡ QQ₁′B′. In the first case the proof is similar to the proof in case (1). In the second case we can follow analogous lines, using the fact that (y := (x := A′) E′) (x := A′) H′ ≡ (x := A′) (y := E′) H′ but for renaming.
Now consider the case that P̄ ≢ Q₁, whence at least one applicator must occur in the chain P̄. We proceed with induction on the length of proof of Q P̄ B ≥¹γ Q P̄′ B′.
(1) Assume that Q P̄ B ≥¹γ Q P̄′ B′ by reflexivity. This case can be proved similarly to the case that P̄ ≡ Q₁.
(2) Assume that Q P̄ B ≥¹γ Q P̄′ B′ is QQ₁ (C) P̄₁ [y : D] E ≥¹γ QQ₁ (C′) P̄₁′ [y : D′] (y := C′) E′ as a direct consequence of QQ₁C ≥¹γ QQ₁C′ and QQ₁ P̄₁ [y : D] E ≥¹γ QQ₁ P̄₁′ [y : D′] E′. Now it must hold that E ≡ P̄₂ B and E′ ≡ P̄₂′ B″, where B′ ≡ (y := C′) B″. Since Q₁ P̄₁ [y : D] P̄₂ is also a β-chain complex, it follows by induction: QQ₁ P̄₁ [y : D] P̄₂ (x := A) B ≥¹γ QQ₁ P̄₁′ [y : D′] P̄₂′ (x := A′) B″, so Q P̄ (x := A) B ≡ QQ₁ (C) P̄₁ [y : D] P̄₂ (x := A) B ≥¹γ QQ₁ (C′) P̄₁′ [y : D′] (y := C′) (P̄₂′ (x := A′) B″) ≡ QQ₁ (C′) P̄₁′ [y : D′] ((y := C′) P̄₂′) (x := A′) (y := C′) B″ ≡ Q P̄′ (x := A′) B′ (here we changed (y := C′) (x := A′) B″ into (x := A′) (y := C′) B″, which is allowed by Th. 6.33 and by the conditions imposed upon the variables).
(3) Assume that Q P̄ B ≥¹γ Q P̄′ B′ is QQ₁ (C) D ≥¹γ QQ₁ (C′) D′ as a direct consequence of QQ₁C ≥¹γ QQ₁C′ and QQ₁D ≥¹γ QQ₁D′. Then P̄ ≡ Q₁ (C) P̄₂ and P̄′ ≡ Q₁ (C′) P̄₂′; D ≡ P̄₂ B and D′ ≡ P̄₂′ B′. By induction: QQ₁ P̄₂ (x := A) B ≥¹γ QQ₁ P̄₂′ (x := A′) B′, so also Q P̄ (x := A) B ≥¹γ Q P̄′ (x := A′) B′.
(4) Assume that Q P̄ B ≥¹γ Q P̄′ B′ is QQ₁ [y : C] D ≥¹γ QQ₁ [y : C′] D′ as a direct consequence of QQ₁C ≥¹γ QQ₁C′ and QQ₁ [y : C] D ≥¹γ QQ₁ [y : C] D′. Then P̄ ≡ Q₁ [y : C] P̄₂ and P̄′ ≡ Q₁ [y : C′] P̄₂′; D ≡ P̄₂ B and D′ ≡ P̄₂′ B′. The remainder of the proof is analogous to that in case (3).
(5) Assume that Q P̄ B ≥¹γ Q P̄′ B′ is a direct consequence of Q P̄ B ≥¹γ Q″ P̄″ B″ and Q″ P̄″ B″ ≥α Q P̄′ B′. Then Q P̄ (x := A) B ≥¹γ Q″ P̄″ (x := A′) B″ by induction; the remainder is easy. □

Theorem 6.35. If QA and Q′A′ ∈ Λ, Q ≥¹γ Q′ and QA ≥¹γ QA′, then QA ≥¹γ Q′A′.
Proof. Induction on ‖Q‖, using monotony rule (3) of single-step γ-reduction. □
Theorem 6.36. If K ∈ Λ, K ≥¹γ L and K ≥¹γ M, there is an N such that L ≥¹γ N and M ≥¹γ N.
Proof. Induction on the length of proof of K ≥¹γ L. We shall use Th. 6.24 and Th. 6.35 several times without saying so.
(1) Let K ≥¹γ L by reflexivity. Take N ≡ M.
(2) Let K ≥¹γ L be Q (A) P̄ [x : B] C ≥¹γ Q (A′) P̄′ [x : B′] (x := A′) C′ as a direct consequence of QA ≥¹γ QA′ and Q P̄ [x : B] C ≥¹γ Q P̄′ [x : B′] C′. Now by Th. 6.32 (2) M can have the form
(i) M ≡ Q″ (A″) P̄″ [x : B″] C″, or
(ii) M ≡ Q″ (A″) P̄″ [x : B″] (x := A″) C″,
where in both cases Q ≥¹γ Q″, QA ≥¹γ QA″, Q P̄ [x : B] C ≥¹γ Q P̄″ [x : B″] C″, ‖Q‖ = ‖Q″‖ and ‖P̄‖ = ‖P̄″‖. By induction and by Th. 6.22 there is an A‴ such that QA′ ≥¹γ QA‴ and QA″ ≥¹γ QA‴. Again by induction and by Th. 6.32 (1) there are P̄‴, B‴ and C‴ such that
Q P̄′ [x : B′] C′ ≥¹γ Q P̄‴ [x : B‴] C‴ and Q P̄″ [x : B″] C″ ≥¹γ Q P̄‴ [x : B‴] C‴. By Th. 6.34: Q P̄′ [x : B′] (x := A′) C′ ≥¹γ Q P̄‴ [x : B‴] (x := A‴) C‴, hence by monotony and Th. 6.35: L ≡ Q (A′) P̄′ [x : B′] (x := A′) C′ ≥¹γ Q″ (A‴) P̄‴ [x : B‴] (x := A‴) C‴. Call the latter expression N. It also holds that Q″ P̄″ [x : B″] C″ ≥¹γ Q″ P̄‴ [x : B‴] C‴, and x ⊂ C‴ by Th. 6.33. So Q″ (A″) P̄″ [x : B″] C″ ≥¹γ N ≡ Q″ (A‴) P̄‴ [x : B‴] (x := A‴) C‴ by an elementary γ-reduction. This completes this part of the proof in case (i). In case (ii) we first establish that Q P̄″ [x : B″] (x := A″) C″ ≥¹γ Q P̄‴ [x : B‴] (x := A‴) C‴ by Th. 6.34, yielding by monotony that M ≥¹γ N.
(3) Let K ≥¹γ L be Q (A) C ≥¹γ Q (A′) C′ as a direct consequence of QA ≥¹γ QA′ and QC ≥¹γ QC′. Now by Th. 6.32 (3) M can have the form
(i) M ≡ Q″ (A″) C″, with Q ≥¹γ Q″, ‖Q‖ = ‖Q″‖, QA ≥¹γ QA″ and QC ≥¹γ QC″, or
(ii) M ≡ Q″ (A″) P̄″ [x : D″] (x := A″) E″, where K ≡ Q (A) P̄ [x : D] E, x ⊂ E, QA ≥¹γ QA″, Q ≥¹γ Q″ and Q P̄ [x : D] E ≥¹γ Q P̄″ [x : D″] E″.
In case (i) we can find by induction A‴ and C‴ such that QA′ ≥¹γ Q″A‴, Q″A″ ≥¹γ Q″A‴, QC′ ≥¹γ Q″C‴ and Q″C″ ≥¹γ Q″C‴, and we can take N ≡ Q″ (A‴) C‴. In case (ii) we are in a position similar to case (i) of (2), with L and M permuted.
(4) Let K ≥¹γ L be Q [x : A] C ≥¹γ Q [x : A′] C′ as a direct consequence of QA ≥¹γ QA′ and Q [x : A] C ≥¹γ Q [x : A] C′. Then M ≡ Q″ [x : A″] C″ by Th. 6.32 (1), where Q ≥¹γ Q″, QA ≥¹γ QA″ and Q [x : A] C ≥¹γ Q [x : A] C″. Take N ≡ Q″ [x : A‴] C‴, where A‴ and C‴ are obtained as in (3), case (i).
(5) Let K ≥¹γ L be a direct consequence of K ≥¹γ L′ and L′ ≥α L. Then by induction we can find an N such that L′ ≥¹γ N and M ≥¹γ N, and also L ≥¹γ N. □
Theorem 6.37 (CR for γ-reduction). If K ∈ Λ, K ≥γ L and K ≥γ M, then L ∼γ M.

Proof. This is a consequence of the previous theorem. □
Theorem 6.38 (CR for β₁-reduction). If K ∈ Λ, K ≥β₁ L and K ≥β₁ M, then L ∼β₁ M but for α-reduction.
Proof. Decompose K ≥β₁ L and K ≥β₁ M, apply Th. 6.25, Th. 6.37 and Th. 6.26: we obtain an N′ such that L ≥β₁ N ≥α N′ and M ≥β₁ N″ ≥α N′. □
Theorem 6.39. If K ∈ Λ, K ≥¹β₁ L and K ≥¹β₂ M, then there is an N such that M ≥¹β₁ N and L ≥ⁿβ₂ N with n ≥ 1.

Proof. Let K ≥¹β₁ L be generated by Q (A) P̄ [x : B] C ≥¹β₁ Q (A) P̄ [x : B] (x := A) C, and K ≥¹β₂ M by Q′ (D) P̄₁ [y : E] F ≥¹β₂ Q′ P̄₁ F. If (D) P̄₁ [y : E] F ⊂ A, we need n β₂-reductions for the n A's in Q (A) P̄ [x : B] (x := A) C. If not, we need only one. The theorem easily follows. □
Theorem 6.40. If K ∈ Λ, K ≥β₁ L and K ≥β₂ M, there is an N such that M ≥β₁ N and L ≥β₂ N.

Proof. Apply Th. 6.39 repeatedly. This can be illustrated by the following diagram:

Figure 4

Here we assume that K ≥β₁ L can be decomposed into K ≥¹β₁ L′ ≥¹β₁ L, and K ≥β₂ M into K ≥¹β₂ M′ ≥¹β₂ M. In the diagram all edges (in the sense usual in graph theory) parallel to the edge from K to L′ represent single-step β₁-reductions, those in the direction of K ≥¹β₂ M′ represent single-step β₂-reductions. □
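The tiling argument of Figure 4 can be checked mechanically on an abstract rewrite system. The sketch below is ours (plain Python, not Nederpelt's syntax), and it simplifies Th. 6.39 to the case where one step closes against exactly one step; it verifies that single-step local commutation of two relations propagates, by tiling, to their multi-step closures on a grid like that of Figure 4.

```python
def steps(rel, x):
    """One-step successors of x under the relation rel (a dict of edge sets)."""
    return rel.get(x, set())

def reach(rel, x):
    """All y with x >=* y: reflexive-transitive closure, by worklist search."""
    seen, todo = {x}, [x]
    while todo:
        y = todo.pop()
        for z in steps(rel, y):
            if z not in seen:
                seen.add(z)
                todo.append(z)
    return seen

def commutes_locally(r1, r2, nodes):
    """Every peak b <-r1- a -r2-> c closes with one r2-step and one r1-step."""
    return all(steps(r2, b) & steps(r1, c)
               for a in nodes
               for b in steps(r1, a)
               for c in steps(r2, a))

def commutes_globally(r1, r2, nodes):
    """Multi-step commutation, the property obtained by tiling the squares."""
    return all(reach(r2, b) & reach(r1, c)
               for a in nodes
               for b in reach(r1, a)
               for c in reach(r2, a))

# The grid of Figure 4: r1 moves right, r2 moves down, on a 3 x 3 grid.
N = 3
nodes = [(i, j) for i in range(N) for j in range(N)]
r1 = {(i, j): {(i, j + 1)} for i in range(N) for j in range(N - 1)}
r2 = {(i, j): {(i + 1, j)} for i in range(N - 1) for j in range(N)}
```

Each local square closes at the opposite grid corner, and repeating this square by square yields the multi-step diagram, which is exactly how Th. 6.40 follows from Th. 6.39.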
Theorem 6.41. If K ∈ Λ, K ≥¹β₂ L and K ≥¹β₂ M, there is an N such that L ≥¹β₂ N or L ≡ N, and M ≥¹β₂ N or M ≡ N.

Proof. Let K ≥¹β₂ L be generated by Q (A) P̄ [x : B] C ≥¹β₂ Q P̄ C, and K ≥¹β₂ M by Q′ (D) P̄₁ [y : E] F ≥¹β₂ Q′ P̄₁ F. If (D) P̄₁ [y : E] F ⊂ A or ⊂ B, then M ≥¹β₂ L; if (A) P̄ [x : B] C ⊂ D or ⊂ E, then L ≥¹β₂ M. If (D) P̄₁ [y : E] F ≡ (A) P̄ [x : B] C, then L ≡ M. In all other cases there is clearly an N such that L ≥¹β₂ N and M ≥¹β₂ N. □
Theorem 6.42 (CR for β₂-reduction). If K ∈ Λ, K ≥β₂ L and K ≥β₂ M, then L ∼β₂ M.

Proof. Apply Th. 6.41 repeatedly. □

Theorem 6.43 (CR for β-reduction). If K ∈ Λ, K ≥β L and K ≥β M, then L ∼β M.

Proof. As that of Th. 6.38. □
Theorem 6.44. If K ∈ Λ, K ≥β L and K ≥β M, then there are N, N″ and N‴ such that L ≥β₁ N″ ≥β₂ N and M ≥β₁ N‴ ≥β₂ N.

Proof. Decompose K ≥β L and K ≥β M, according to Th. 6.19, into K ≥β₁ L′ ≥β₂ L and K ≥β₁ M′ ≥β₂ M respectively. The remainder of the proof is illustrated by the following diagram:

Figure 5

We find N′ from Th. 6.38, N″ and N‴ from Th. 6.40 and finally N from Th. 6.42. □

7. η-reduction, reduction and lambda equivalence
A third reduction in lambda calculus (apart from α- and β-reduction) is called η-reduction and denoted by ≥η. We shall incorporate it in our system.
We first define single-step η-reduction, denoted by ≥¹η:
Definition 7.1. Single-step η-reduction is the relation generated by:
(1) If Q [x : A] (x) B ∈ Λ and x ⊄ B, then Q [x : A] (x) B ≥¹η QB.
(2) Let Q (A) C and Q (A) D ∈ Λ. If QC ≥¹η QD, then Q (A) C ≥¹η Q (A) D.
(3) Let Q [x : A] C and Q [x : B] C ∈ Λ. If QA ≥¹η QB, then Q [x : A] C ≥¹η Q [x : B] C.
(4) Let Q (A) C and Q (B) C ∈ Λ. If QA ≥¹η QB, then Q (A) C ≥¹η Q (B) C. □
Rules (2), (3) and (4) are called the monotony rules of single-step η-reduction; they are similar to those of single-step β-reduction. Rule (1) is called the rule of elementary η-reduction.
Definition 7.2. η-reduction is the reflexive and transitive closure of single-step η-reduction. □

If A and B are related by a (single-step) η-reduction, we speak of "the (single-step) η-reduction A ≥η B". The notions n-step η-reduction and decomposition of an η-reduction are defined analogously to the corresponding notions for β-reduction. If the first derivation step of a single-step η-reduction has the form Q [x : A] (x) B ≥¹η QB, we say that Q [x : A] (x) B generates the single-step η-reduction.
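In classical notation the elementary step of Def. 7.1 is λx. B x ≥η B with x not free in B; Nederpelt writes the applicator in front of the body, [x : A] (x) B ≥η B. A minimal executable sketch, under the same assumptions as before (untyped terms, de Bruijn indices, our own function names — not the text's system):

```python
def shift(t, d, c=0):
    """Add d to every free variable of t with index >= c."""
    tag = t[0]
    if tag == 'var':
        return ('var', t[1] + d) if t[1] >= c else t
    if tag == 'app':
        return ('app', shift(t[1], d, c), shift(t[2], d, c))
    return ('lam', shift(t[1], d, c + 1))

def occurs(t, j):
    """Does variable j occur free in t?"""
    tag = t[0]
    if tag == 'var':
        return t[1] == j
    if tag == 'app':
        return occurs(t[1], j) or occurs(t[2], j)
    return occurs(t[1], j + 1)

def eta_nf(t):
    """Contract eta-redexes  lam.(B 0)  with 0 not free in B, bottom-up."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'app':
        return ('app', eta_nf(t[1]), eta_nf(t[2]))
    b = eta_nf(t[1])
    if b[0] == 'app' and b[2] == ('var', 0) and not occurs(b[1], 0):
        return shift(b[1], -1)   # drop the abstractor; reindex the body
    return ('lam', b)
```

The side condition x ⊄ B in rule (1) is what the `occurs` check enforces: λx. x x is not an η-redex and must be left untouched, while λx. λy. g x y collapses to g.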
Theorem 7.3. Let K ∈ Λ. Then Q [x : A] (x) B generates a single-step η-reduction of the form K ≥¹η L if and only if [x : A] (x) B ⊂ K and Q [x : A] (x) B ≡ K ↾ [x : A] (x) B.

Proof. Similar to the proof of Th. 5.7. □
Theorem 7.4. If K ∈ Λ and K ≥¹η L, then L ∈ Λ.

Proof. Induction on the length of proof of K ≥¹η L. The proof is similar to the proof of Th. 5.8. □
Theorem 7.5. If QE ∈ Λ, QE ≥¹η Q′F and ‖Q‖ = ‖Q′‖, then
(i) Q ≡ Q′ and E ≡ F, or
(ii) Q ≡ Q₁ [x : K] Q₂, Q′ ≡ Q₁ [x : L] Q₂ and Q₁K ≥¹η Q₁L, or
(iii) Q ≡ Q₀ [x : A], E ≡ (x) [y : B] E′, Q′ ≡ Q₀ [y : B], x ⊄ [y : B] E′ and F ≡ E′.

Proof. Induction on the length of proof of QE ≥¹η Q′F. The proof is comparable to the proof of Th. 5.11, except for the case in which QE ≥¹η Q′F is an elementary η-reduction. In this case we have to note the possibility that QE and Q′F are as in (iii). □

Theorem 7.6. If QE ∈ Λ and QE ≥¹η K, then K ≡ Q′F′ for certain Q′ and F′ with ‖Q′‖ ≥ ‖Q‖ − 1.

Proof. Similar to the proof of Th. 5.12. □

Theorem 7.7. If QE ∈ Λ and QE ≥¹η K ≥η QG, then K ≡ QF.
Proof. If Q ≡ Q₁ [x : A] and QE ≥¹η K is Q₁ [x : A] (x) B ≥¹η Q₁B, then the binding variable x of Q has disappeared, and we cannot regain it by η-reduction. Hence by Th. 7.6: K ≡ Q′F and ‖Q′‖ = ‖Q‖, and the case expressed in Th. 7.5 (iii) does not hold. In the derivation steps leading to K ≥η QG the final ones of the first ‖Q‖ abstractors cannot disappear by an elementary η-reduction, for the same reason as above. Assume that Q ≢ Q′. Then by Th. 7.5 (ii): Q ≡ Q₁ [x : K] Q₂, Q′ ≡ Q₁ [x : L] Q₂ and Q₁K ≥¹η Q₁L. It is clear that |L| < |K|. Since the length of an expression cannot increase by η-reduction, it follows that we cannot regain Q from Q′. Hence Q ≡ Q′. □
Proof. Similar to the proof of Th. 5.15, using Th. 7.5. Theorem 7.9. If Q E , P E and P F E A, and Q E 27 Q F , then P E
Proof. Analogous to the proof of Th. 5.17; use Th. 7.7.
0
21 P F . 0
The converse of this theorem holds too.
Given an η-reduction QK ≥η M, it need not follow that M ≡ Q′N′ with ‖Q‖ = ‖Q′‖, since the final abstractors of Q may have been cancelled in η-reductions. For example: let Q ≡ Q′ [x : A] and K ≡ (x) τ; then QK ≥¹η Q′τ, where ‖Q′‖ = ‖Q‖ − 1. This kind of η-reduction plays an important rôle in the following. We shall call them η!-reductions. We shall prove a number of theorems concerning η!-reductions. In Th. 7.14 we shall show that we can postpone η!-reductions until after other η-reductions. Cor. 7.17 will result from our discussions of η!-reductions.
Definition 7.10.
(1) K ≥¹η L is called a single η!-reduction (denoted by K ≥¹η! L) if K ≡ Q [x : A] (x) B and K generates K ≥¹η L (i.e. if K ≥¹η L is an elementary η-reduction). This reduction is called of order p if ‖Q [x : A]‖ = p.
(2) K ≥η L is called a k-fold η!-reduction (denoted K ≥(k)η! L) if there are Kᵢ ≡ Q [x₁ : A₁] ... [xᵢ : Aᵢ] (xᵢ) ... (x₁) B such that K ≡ Kₖ ≥(1)η! Kₖ₋₁ ≥(1)η! ... ≥(1)η! K₀ ≡ QB ≡ L and Kᵢ generates Kᵢ ≥(1)η! Kᵢ₋₁. This reduction K ≥(k)η! L is called of order p if Kₖ ≥(1)η! Kₖ₋₁ is of order p. □
Theorem 7.11. If K ∈ Λ and K ≥(k)η! L ≥¹η N, then either K ≥(k+1)η! N or there is a reduction K ≥¹η M ≥(k)η! N where K ≥¹η M is not an η!-reduction.

Proof. K ≡ Q [x₁ : A₁] ... [xₖ : Aₖ] (xₖ) ... (x₁) B ≥(k)η! QB ≡ L. Consider the possibilities for QB ≡ L ≥¹η N. □
Theorem 7.12. If K ∈ Λ and K ≥(k)η! L ≥¹η N, where K ≥(k)η! L is of order p, then either K ≥(k+1)η! N of order p, or there is a reduction K ≥¹η M ≥(k)η! N where M ≥(k)η! N is of order p and K ≥¹η M is not an η!-reduction.

Proof. Compare with the previous theorem. □
Theorem 7.13. Let K ∈ Λ and K ≥(k)η! L ≥η N, where K ≥(k)η! L is of order p. Then there is a reduction K ≥η L′ ≥(l)η! N, where a decomposition of K ≥η L′ contains no η!-reductions and L′ ≥(l)η! N is of order p.

Proof. Decompose L ≥η N into L ≡ E₁ ≥¹η ... ≥¹η Eᵣ ≡ N. We proceed with induction on r. If r = 1 there is nothing to prove. Let r > 1. Consider the reduction K ≥(k)η! L ≡ E₁ ≥¹η E₂. By the previous theorem we have either K ≥(k+1)η! E₂ or a reduction K ≥¹η L″ ≥(k)η! E₂ where K ≥¹η L″ is not an η!-reduction. Applying the induction hypothesis to K ≥(k+1)η! E₂ ≥η N or L″ ≥(k)η! E₂ ≥η N, we obtain K ≥η L′ ≥(l)η! N, where a decomposition of K ≥η L′ contains no η!-reductions. □
Theorem 7.14. If QK ∈ Λ and QK ≥η L, there is a reduction QK ≥η Q′K′ ≥(k)η! L where ‖Q‖ = ‖Q′‖, Q′K′ ≥(k)η! L is of order ‖Q‖, and where a decomposition of QK ≥η Q′K′ contains no η!-reductions of order ‖Q‖.

Proof. Decompose QK ≥η L into single-step η-reductions QK ≡ L₁ ≥¹η ... ≥¹η Lₛ ≡ L. Let i be the smallest integer such that Lᵢ ≥¹η Lᵢ₊₁ is an η!-reduction of order ‖Q‖. Apply the previous theorem to Lᵢ ≥(1)η! Lᵢ₊₁ ≥η Lₛ ≡ L. We obtain a reduction QK ≥η L′ ≥(k)η! L as desired. The fact that L′ ≡ Q′K′ with ‖Q′‖ = ‖Q‖ follows from Th. 7.5. □
>,,
Theorem 7.15. If Q K E A, Q K Q‘K‘, 11Q11 = 11Q‘11 = p and a decomposition of Q K Q’K’ contains no q!-reductions of order p , there is a reduction Q K 2oQK’ Q’K’ and a reduction Q K Q’K Q’K’.
>‘I
Proof. See Th. 7.5.
0
>,,
>,,
Theorem 7.16. If Q₁K ∈ Λ, Q₂L ∈ Λ, Q₁K ≥η M, Q₂L ≥η M, Q₁ ≡ [x₁ : A₁] ... [xₚ : Aₚ] and Q₂ ≡ [x₁ : B₁] ... [xₚ : Bₚ], there is an N such that Q₁K ≥η Q₁N and Q₂L ≥η Q₂N.

Proof. By the aid of Th. 7.14 and Th. 7.15 we can find reductions Q₁K ≥η Q₁K′ ≥η Q₁′K′ ≥(k)η! M and Q₂L ≥η Q₂L′ ≥η Q₂′L′ ≥(l)η! M, where Q₁′K′ ≥(k)η! M and Q₂′L′ ≥(l)η! M are of order p. Note that Q₁′ ≡ [x₁ : A₁′] ... [xₚ : Aₚ′] and Q₂′ ≡ [x₁ : B₁′] ... [xₚ : Bₚ′]. Now both Q₁′K′ and Q₂′L′ ∈ Λ, so k = l: assume k > l; then M ≡ [x₁ : A₁′] ... [xₚ₋ₖ : Aₚ₋ₖ′] M′ ≡ [x₁ : B₁′] ... [xₚ₋ₗ : Bₚ₋ₗ′] M″; it follows that [xₚ₋ₗ : Bₚ₋ₗ′] occurs in M′, hence also in K′; this contradicts the fact that Q₁′K′ ∈ Λ, since we would find two binding variables xₚ₋ₗ in Q₁′K′. It follows that K′ ≡ L′; we can take N ≡ K′ ≡ L′. □
>,,
Theorem 7.18. If Q D E A, Q D Q’ = [q: A;] ... [xp: A;], then QD
>,,
M and Q L
>’I Q’E, Q E
>,,
M , there is a 0
[q : A11 ...[x, : Ap] and
QE.
Proof. Resulting from Th. 7.14 we can find a reduction Q D
>‘I Q”D’ 2‘:’ Q‘E ‘I.
where Q”D’ 2s) Q’E is of order 11Q11. Now Q” = [ZI: Ayl ... [zp: A:], hence k = 0 (because &I’D’ E A; see the proof of Th. 7.16). Then also QD Q E by 0 Th. 7.15.
>,,
We shall now prove a theorem concerning the so-called "postponement of η-reductions" for Λ. What we want to prove is that every reduction K ≥ M which takes place by means of single-step β- and η-reductions in arbitrary order can be replaced by a reduction K ≥β L ≥η M, in which all β-reductions precede all η-reductions.
It is easy to show that each reduction A ≥¹η B ≥¹β C can be replaced by a reduction A ≥β B′ ≥ʳη C (where r ≥ 0). But this does not suffice to prove the theorem: it is not sure that this process of interchanging η's and β's terminates for a given reduction K ≥ M. In [Curry and Feys 58, Ch. 4, D2] a compound β-reduction is introduced for the purpose of proving the above-mentioned theorem. In our opinion there is an error in their proof (viz., the case that R is MₖN and L is some Mⱼyⱼ for j ≤ k is missing). Nevertheless, their idea can be extended in such a manner that the theorem on the postponement of η-reductions can be proved. We have carried this out by defining a "compound β-reduction" A ≥β̄ B with the property that each reduction A ≥η B ≥β̄ C can be replaced by a reduction A ≥β̄ B′ ≥η C. However, this compound β-reduction looks rather complicated. Barendregt suggested to us another way of proving the theorem (private communication). He proposed a "nested" η-reduction (which we call ϑ-reduction and denote by ≥¹ϑ) with the property that a reduction A ≥¹ϑ B ≥β C can be replaced by a reduction A ≥β B′ ≥¹ϑ C. The nested character of this ϑ-reduction is comparable to that of γ-reduction discussed in the previous section. We prefer the latter way of proving because it is easier to understand.
Definition 7.19. Single-step ϑ-reduction, denoted by ≥¹ϑ, is the reflexive relation generated by
(1) If Q [x : A] (x) B ∈ Λ, x ⊄ B and QB ≥¹ϑ QC, then Q [x : A] (x) B ≥¹ϑ QC.
(2) If Q (A) C ∈ Λ, QA ≥¹ϑ QA′ and QC ≥¹ϑ QC′, then Q (A) C ≥¹ϑ Q (A′) C′.
(3) If Q [x : A] C ∈ Λ, QA ≥¹ϑ QA′ and Q [x : A] C ≥¹ϑ Q [x : A] C′, then Q [x : A] C ≥¹ϑ Q [x : A′] C′. □
We call rule (1) in this definition the rule of elementary single-step ϑ-reduction, rules (2) and (3) the monotony rules for ϑ-reduction. The following two theorems deal with the relation between η- and ϑ-reduction.
Theorem 7.20. If K ∈ Λ and K ≥¹η L, then K ≥¹ϑ L.

Proof. Induction on the length of proof of K ≥¹η L. □

Theorem 7.21. If K ∈ Λ and K ≥¹ϑ L, then K ≥η L.
Proof. Induction on the length of proof of K ≥¹ϑ L. For example, if K ≥¹ϑ L is Q [x : A] (x) B ≥¹ϑ QC, as a direct consequence of QB ≥¹ϑ QC, then by induction QB ≥η QC, and Q [x : A] (x) B ≥¹η QB ≥η QC. □
We shall now prove a number of theorems which are lemmas for the theorem on the postponement of η-reductions (Th. 7.28).
Theorem 7.22. If K ∈ Λ and K ≥¹ϑ L, then L ∈ Λ.

Proof. Follows from Th. 7.21 and Th. 7.4. □
Theorem 7.23. If QE ∈ Λ and QE ≥¹ϑ Q [y : G] H, then QE ≡ Q [x₁ : A₁] (x₁) [x₂ : A₂] (x₂) ... [xₙ : Aₙ] (xₙ) [y : G′] H′, with QG′ ≥¹ϑ QG, Q [y : G′] H′ ≥¹ϑ Q [y : G′] H and xᵢ ⊄ [xᵢ₊₁ : Aᵢ₊₁] ... (xₙ) [y : G′] H′.

Proof. Induction on the length of proof of QE ≥¹ϑ Q [y : G] H. If the latter reduction results from reflexivity, the proof is completed.
(1) Let QE ≥¹ϑ Q [y : G] H be Q′ [x : A] (x) B ≥¹ϑ Q′C, as a direct consequence of Q′B ≥¹ϑ Q′C. If Q′ ≡ QQ″, induction yields the proof. If Q ≡ Q′ [x : A], then C begins with [x : A]. This implies that [x : A] occurs in B (since ϑ-reduction can only omit abstractors and applicators without influencing the remainder of the expression), which is impossible since QE ∈ Λ. So this latter case cannot apply.
(2) Let QE ≥¹ϑ Q [y : G] H be Q′ (A) C ≥¹ϑ Q′ (A′) C′. Then QE ≡ Q [y : G] F.
(3) Let QE ≥¹ϑ Q [y : G] H be Q′ [x : A] C ≥¹ϑ Q′ [x : A′] C′, as a direct consequence of Q′A ≥¹ϑ Q′A′ and Q′ [x : A] C ≥¹ϑ Q′ [x : A] C′. There are the following possibilities: (a) Q ≡ Q′, (b) Q ≡ Q′ [x : A] Q₁ and (c) Q′ ≡ QQ₁ with ‖Q₁‖ > 0. In all three cases the proof is easy. □
Theorem 7.24. Let QA and Q [x : B] C ∈ Λ, QA ≥¹ϑ QA′ and Q [x : B] C ≥¹ϑ Q [x : B] C′. Then Q (x := A) C ≥¹ϑ Q (x := A′) C′.

Proof. Induction on |C|. If C ≡ τ, C ≡ x or C ≡ y ≢ x, the proof is easy.
(1) Let C ≡ [y : E] F. There are two possible cases:
(a) Q [x : B] C ≥¹ϑ Q [x : B] C′ is Q [x : B] [y : E] F ≥¹ϑ Q [x : B] [y : E′] F′, as a direct consequence of Q [x : B] E ≥¹ϑ Q [x : B] E′ and Q [x : B] [y : E] F ≥¹ϑ Q [x : B] [y : E] F′. By induction: Q (x := A) E ≥¹ϑ Q (x := A′) E′, and Q [y : (x := A) E] (x := A) F ≥¹ϑ Q [y : (x := A) E] (x := A′) F′ (the latter because Q [y : (x := A) E] [x : B] F ≥¹ϑ Q [y : (x := A) E] [x : B] F′). Hence Q (x := A) C ≥¹ϑ Q (x := A′) C′.
(b) Q [x : B] C ≥¹ϑ Q [x : B] C′ is Q [x : B] [y : E] (y) G ≥¹ϑ Q [x : B] G′, as a direct consequence of Q [x : B] G ≥¹ϑ Q [x : B] G′. By induction Q (x := A) G ≥¹ϑ Q (x := A′) G′, so Q (x := A) [y : E] (y) G ≡ Q [y : (x := A) E] (y) (x := A) G ≥¹ϑ Q (x := A′) G′.
(2) Let C ≡ (E) F. Then Q [x : B] C ≥¹ϑ Q [x : B] C′ is Q [x : B] (E) F ≥¹ϑ Q [x : B] (E′) F′, as a direct consequence of Q [x : B] E ≥¹ϑ Q [x : B] E′ and Q [x : B] F ≥¹ϑ Q [x : B] F′. The theorem results from the induction. (Note that Q [x : B] C ≥¹ϑ Q [x : B] C′ cannot be Q [x : B] (x) G ≥¹ϑ QG′.) □
Theorem 7.25. Let A ∈ Λ and A ≥¹ϑ B ≥¹β C. Then A ≥β B′ ≥¹ϑ C.

Proof. Induction on the length of proof of A ≥¹ϑ B. If the last derivation step results from reflexivity, nothing remains to be proved.
(1) Let A ≥¹ϑ B be Q [x : D] (x) E ≥¹ϑ QE′ as a direct consequence of QE ≥¹ϑ QE′, and let B ≥¹β C be generated by Q′ (F) [y : G] H ≥¹β Q′ (y := F) H. The following cases may apply:
>b
(a) ( F ) [y : GI H C Q. There is clearly a reduction A >p B’ 2; C.
>b
(b) ( F ) [y : G]H C E’. Let B C be QE’ >b QE”’, then by induction there is a reduction Q E 2 0 QE” 2; QE”’, hence Q [z : D] (z) E 2 p Q [z : D ] (z) E” QE”‘.
>;
(2) Let A 2; B be Q ( D )E 2; Q (D‘) E‘ as a direct consequence of Q D 2; QD’ and Q E 2; QE’, and let B 2; C be generated by Q’ ( F ) [y : GI H Q’(y := F ) H . The following cases may apply:
>b
(a) ( F )[y : GI H C Q. Clearly A >p B‘ 2; C. (b) ( F )[y : G]H = (D’)E’. Then QE’ = Q [y : G]H , so Q E = Q [q: All ( 2 1 ) ... [zCn : A,] (zn)[y : G’]H’, with QG’ 2; QG, Q [y : G‘]H’ 2; Q[y : G’]H and zi [zi+l : Ai+l]... (z,) [y : G’]H’ by Th. 7.23. Then Q ( D )E 2 p Q(y := D)H’. By Th. 7.24: Q ( y := D)H’ 2; Q(y := F)H E C .
<
(c)
( F ) [y : G]H C D‘. Then C = Q (D’”)E’ with QD‘ 2; QD”‘, and by induction QD >p QD“ 2; QD”’, hence Q ( D ) E >p Q (D”)E 2; Q (D”’) E‘ G C.
(d) ( F ) [y : GI H c E’. Then C EZ Q (D’) E”’ with QE’ 2; QE”‘, and by induction Q E >p QE” 2; QE”’, hence Q ( D ) E 2 p Q ( D )El’ 2; Q (D’) E“’ = C.
446
R.P. Nederpelt
(3) Let A 2; B be Q [x : D] E 2; Q [x : D’] E’ as a direct consequence of QD 2; QD’ and Q [x : D] E 2; Q [ x : D] E’, and let again B 2b C be generated by Q’ ( F ) [ y : GI H 2b Q’(y := F ) H . The following cases may apply:
c Q. Clearly A 2 p B’ 2; C. ( F ) [ y : G]H c D‘. Then C = Q [z : D’”]E’
(a) ( F ) [ y : G]H (b)
with QD’ 2b QD”’. By induction: QD 2 p QD” 2; QD”‘, so Q [x : D] E 2 p Q [x : D”]E 2; Q [z : D”‘]E’ (where we require the lemma: Q [z : D] E 2; Q [z : D] E‘, then Q [z : D”]E 2; Q [z : D”]E’).
(c) ( F ) [ y : G]H c E’. Then C = Q [z : D’] E”’. Also: Q [z : D] E’ Q [z : D] E”’, so by induction Q [x : D] E >p Q [x : D] E” > K Q [x: D]E”‘. Hence Q [z : D] E >p Q [z : D] E” 2nQ [z : D‘] E”’ = C.
>b 0
Theorem 7.26. Let A ∈ Λ̄ and A ≥¹κ B ≥β C. Then A ≥β B′ ≥κ C.

Proof. Induction on the number of single-step β-reductions in B ≥β C, using the previous theorem. □

Theorem 7.27. Let A ∈ Λ̄ and let A ≥ C by means of a number of single-step κ- and β-reductions in arbitrary order. Then there is a reduction A ≥β B ≥κ C.

Proof. Induction on the number of single-step κ-reductions in A ≥ C. If this number is zero, the proof is completed. Else, let A ≥ C be A ≥ A′ ≥¹κ B ≥β C. Apply the previous theorem, obtaining A ≥ A′ ≥β B′ ≥κ C, and apply the induction to A ≥ A′ ≥β B′. □

Theorem 7.28. Let A ∈ Λ̄ and let A ≥ C by means of a number of single-step η- and β-reductions in arbitrary order. Then there is a reduction A ≥β B ≥η C.

Proof. Since each η-reduction can be considered as a κ-reduction (Th. 7.20), we can apply the previous theorem, obtaining A ≥β B ≥κ C. But B ≥κ C implies B ≥η C (Th. 7.21), so A ≥β B ≥η C. □
The remainder of this section will concern (general) reduction, defined as a sequence of single-step α-, β- and η-reductions.

Definition 7.29. Single-step reduction (denoted by ≥¹) is the relation obeying: A ≥¹ B if and only if A ≥¹α B, A ≥¹β B or A ≥¹η B. □

Definition 7.30. Reduction (or general reduction, denoted by ≥) is the reflexive and transitive closure of single-step reduction. □
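As an illustration of Defs. 7.29 and 7.30, the two contractions and their closure can be sketched in code. The following Python fragment is only a hedged sketch, not the system's official machinery: it uses a simplified term representation, writes App(A, B) for the applicator-first application (A)B, assumes all binder names are distinct (so substitution needs no renovation and α-steps can be ignored), and contracts leftmost-outermost.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tau: pass                                   # the constant tau
@dataclass(frozen=True)
class Var: name: str
@dataclass(frozen=True)
class Abs: var: str; typ: object; body: object    # abstractor [x : T] B
@dataclass(frozen=True)
class App: arg: object; fun: object               # applicator (A) B

def subst(e, x, a):
    """(x := a)e; binder names are assumed distinct, so no capture arises."""
    if isinstance(e, Var):
        return a if e.name == x else e
    if isinstance(e, Abs):
        return Abs(e.var, subst(e.typ, x, a), subst(e.body, x, a))
    if isinstance(e, App):
        return App(subst(e.arg, x, a), subst(e.fun, x, a))
    return e

def occurs(e, x):
    """Does the variable x occur in e?"""
    if isinstance(e, Var):
        return e.name == x
    if isinstance(e, Abs):
        return occurs(e.typ, x) or occurs(e.body, x)
    if isinstance(e, App):
        return occurs(e.arg, x) or occurs(e.fun, x)
    return False

def step(e):
    """One single-step beta or eta contraction; None if e admits neither."""
    if isinstance(e, App) and isinstance(e.fun, Abs):     # (A)[x:T]B -> (x:=A)B
        return subst(e.fun.body, e.fun.var, e.arg)
    if (isinstance(e, Abs) and isinstance(e.body, App)
            and e.body.arg == Var(e.var)
            and not occurs(e.body.fun, e.var)):           # [x:T](x)B -> B
        return e.body.fun
    if isinstance(e, Abs):                                # monotony rules
        s = step(e.typ)
        if s is not None:
            return Abs(e.var, s, e.body)
        s = step(e.body)
        if s is not None:
            return Abs(e.var, e.typ, s)
    if isinstance(e, App):
        s = step(e.arg)
        if s is not None:
            return App(s, e.fun)
        s = step(e.fun)
        if s is not None:
            return App(e.arg, s)
    return None

def normalize(e):
    """General reduction: iterate single steps as long as one applies."""
    while (n := step(e)) is not None:
        e = n
    return e
```

For example, (a)[x:τ][y:τ](y)x first β-reduces to [y:τ](y)a and then η-reduces to a. On legitimate expressions (Chapter III) the loop is guaranteed to terminate by the normalization theorem; on arbitrary expressions it need not.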
Strong normalization in a typed lambda calculus (C.3)
Theorem 7.31. The monotony rules hold for reduction.

Proof. Use Th. 5.15 and Th. 7.8. □
We shall prove a theorem (Th. 7.33) which expresses that the Q is in a certain sense irrelevant in a reduction QC ≥ QE: it can be replaced by any P such that PC and PE ∈ Λ̄. This corresponds with general usage in lambda calculus to define reduction for expressions which may contain free variables. Our choice to define reductions inside Λ̄ is apparently not in disagreement with that general usage.
Theorem 7.32. If QC ∈ Λ̄ and QC ≥ QE by means of β- and η-reductions, there is a reduction QC ≥β QD ≥η QE.

Proof. There is a reduction QC ≥β K ≥η QE by Th. 7.28. Now by Th. 5.12: K ≡ Q′D with ‖Q‖ = ‖Q′‖. If Q ≡ [x₁:A₁]...[xₚ:Aₚ], then Q′ ≡ [x₁:A₁′]...[xₚ:Aₚ′], so by Th. 7.18: Q′D ≥η Q′E. From Th. 5.21: QC ≥β QD, and from Th. 7.9: QD ≥η QE. □

Theorem 7.33. If QC, PC and PE ∈ Λ̄, and QC ≥ QE, then PC ≥ PE.

Proof. See Th. 7.32, Th. 5.17 and Th. 7.9. □
Reduction is a non-symmetric relation between expressions in Λ̄, which is reflexive and transitive. We shall define lambda equivalence. The definition of beta equivalence was given in Def. 5.22. In Th. 7.35 we shall prove that beta equivalence is indeed an equivalence relation.

Definition 7.34. Let A and B ∈ Λ̄. We call A lambda equivalent to B (denoted: A ~ B) if there is an expression C such that A ≥ C and B ≥ C. □

Theorem 7.35. Beta equivalence is reflexive, symmetric and transitive.

Proof. Reflexivity and symmetry are trivial. Transitivity follows from Th. 6.43 (CR for β-reduction): let A ~β B and B ~β C; then there are D and E such that A ≥β D, B ≥β D, B ≥β E and C ≥β E. Moreover, there is an F such that D ≥β F and E ≥β F (Th. 6.43), so A ≥β F and C ≥β F. Hence A ~β C. □

Unfortunately, there is no similar theorem for lambda equivalence. Of course lambda equivalence is symmetric and reflexive, but not necessarily transitive.
The reason for this is that CR does not hold for (general) reduction. For example, let K ≥ L and K ≥ M, let K ≡ Q[z:A](z)[y:B]C where z does not occur in [y:B]C, let K ≥ L be Q[z:A](z)[y:B]C ≥η Q[y:B]C and let K ≥ M be Q[z:A](z)[y:B]C ≥β Q[z:A](y:=z)C (≡α Q[y:A]C). Now we cannot in general find an N such that L ≥ N and M ≥ N, since we know nothing concerning a relation between A and B.
We note the following. We can embed ordinary lambda calculus into Λ̄, since there is a one-to-one correspondence between expressions from lambda calculus and those expressions in Λ̄ in which only abstractors of the form [x:τ] occur. If we restrict ourselves in Λ̄ to the latter expressions, the example above changes into K ≡ Q[z:τ](z)[y:τ]C, L ≡ Q[y:τ]C and M ≡ Q[y:τ]C. Now there is no problem as regards CR. Indeed, in lambda calculus the Church-Rosser property holds (see [Barendregt 71, Appendix III]). The following theorem expresses that lambda equivalence of QK and QL implies the existence of an N such that QK ≥ QN and QL ≥ QN or, otherwise stated: the abstractor chain Q can remain unaffected.
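The counterexample above can be replayed concretely. The sketch below is hedged: it uses a simplified term representation, the types A and B are stood in for by free names (and τ by the free name "tau"), and α-equivalence is checked by erasing binder names de Bruijn style. It computes the η-reduct and the β-reduct of K ≡ [z:A](z)[y:B]c and compares them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var: name: str
@dataclass(frozen=True)
class Abs: var: str; typ: object; body: object    # [x : T] B
@dataclass(frozen=True)
class App: arg: object; fun: object               # (A) B

def subst(e, x, a):
    """(x := a)e, assuming distinct binder names."""
    if isinstance(e, Var):
        return a if e.name == x else e
    if isinstance(e, Abs):
        return Abs(e.var, subst(e.typ, x, a), subst(e.body, x, a))
    if isinstance(e, App):
        return App(subst(e.arg, x, a), subst(e.fun, x, a))
    return e

def skeleton(e, env=()):
    """Alpha-invariant form: bound variables become indices, free names stay."""
    if isinstance(e, Var):
        return env.index(e.name) if e.name in env else e.name
    if isinstance(e, Abs):
        return ("abs", skeleton(e.typ, env), skeleton(e.body, (e.var,) + env))
    return ("app", skeleton(e.arg, env), skeleton(e.fun, env))

def reducts(a, b):
    """Eta- and beta-reduct of K = [z:a](z)[y:b]c (z does not occur in [y:b]c)."""
    K = Abs("z", Var(a), App(Var("z"), Abs("y", Var(b), Var("c"))))
    eta = K.body.fun                                   # [z:a](z)E >= eta E, E = [y:b]c
    beta = Abs("z", K.typ,
               subst(K.body.fun.body, "y", Var("z")))  # inner (z)[y:b]c >= beta (y:=z)c
    return eta, beta
```

Here reducts("A", "B") yields [y:B]c and [z:A]c, both already normal and distinct unless A and B are related, while reducts("tau", "tau") yields two α-equal expressions, matching the remark that CR is unproblematic when only abstractors of the form [x:τ] occur.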
Theorem 7.36. Let QK and QL ∈ Λ̄. If QK ~ QL, there exists an N such that QK ≥ QN and QL ≥ QN.

Proof. There must be an M: QK ≥ M and QL ≥ M. By postponement of η-reductions we obtain reductions QK ≥β M₁ ≥η M and QL ≥β M₂ ≥η M. Th. 5.12 implies that M₁ ≡ Q₁K′ and M₂ ≡ Q₂L′ with ‖Q‖ = ‖Q₁‖ = ‖Q₂‖. Then, according to Th. 5.21, we also have QK ≥β QK′ ≥α Q₁K′ ≥η M and QL ≥β QL′ ≥α Q₂L′ ≥η M. It is easy to show that Q₁ and Q₂ have the form as required in Th. 7.16, hence there is an N such that Q₁K′ ≥η Q₁N and Q₂L′ ≥η Q₂N. From Th. 7.9 it follows that QK′ ≥η QN and QL′ ≥η QN. So QK ≥ QN and QL ≥ QN. □
The monotony rules also hold for lambda equivalence:

Theorem 7.37.
(a) If QC, QD, Q(A)C, Q(A)D ∈ Λ̄ and QC ~ QD, then Q(A)C ~ Q(A)D.
(b) If QC, QD, Q[x:A]C, Q[x:A]D ∈ Λ̄ and QC ~ QD, then Q[x:A]C ~ Q[x:A]D.
(c) If QA, QB, Q(A)C, Q(B)C ∈ Λ̄ and QA ~ QB, then Q(A)C ~ Q(B)C.
(d) If QA, QB, Q[x:A]C, Q[x:B]C ∈ Λ̄ and QA ~ QB, then Q[x:A]C ~ Q[x:B]C.
Proof. See Th. 7.36 and Th. 7.33. □
Theorem 7.38. If QC, QD, PC and PD ∈ Λ̄ and QC ~ QD, then PC ~ PD.

Proof. See Th. 7.36 and Th. 7.33. □

8. Type and degree
The notions introduced in the preceding sections are from lambda calculus (as reduction, lambda equivalence) or applicable to lambda calculus (factors, bound expressions), since the types played no essential rôle. We shall now look into the typing of an expression in Λ̄. With every A ∈ Λ̄ for which Tail A ≢ τ we define a type, denoted as Typ A, as follows:

Definition 8.1. Let A ∈ Λ̄ and Tail A ≡ x, so A ≡ P₁[x:B]P₂x. Then Typ A ≡ P₁[x:B]P₂ FrB. □
Informally speaking, we may say that B is the type of x in the above expression. Note, however, that we allow Typ to operate only on expressions in Λ̄.

Theorem 8.2. If A ∈ Λ̄ and Tail A ≢ τ, then Typ A ∈ Λ̄.

Proof. Let A ≡ P₁[x:B]P₂x and let A|x ≡ Q₁[x:B]Q₂x. We prove that Typ A is a bound expression. All non-binding variables in P₁[x:B]P₂ are clearly also bound in Typ A. Consider a non-binding variable z ⊂ FrB. There is a corresponding y ⊂ B, and A|y ≡ Q₁Q₃y. So Typ A|z ≡ Q₁[x:B]Q₂Q₃′z, where Q₃′z is a renovation of Q₃y. Case 1: if y was bound in A|y by a binding variable in Q₃, z is bound in Typ A|z by the corresponding binding variable in Q₃′. Case 2: if y was bound in A|y by a binding variable in Q₁, z ≡ y is still bound by the same binding variable in Q₁, since all binding variables of [x:B]Q₂Q₃′ are different from y. So Typ A is bound. Clearly Typ A is also distinctly bound by the renovation of B. □

We define repeated application of Typ inductively as follows:
Definition 8.3. Let A ∈ Λ̄. Then Typ⁰ A ≡ A; if Typⁿ⁻¹ A is defined for n ≥ 1 and if Tail Typⁿ⁻¹ A ≢ τ, then Typⁿ A ≡ Typ(Typⁿ⁻¹ A). □

If A ∈ Λ̄ and Typⁿ A is defined, we call n permissible for A (n = 0 is always permissible for A ∈ Λ̄).
Theorem 8.4. If A ∈ Λ̄ and A ≥α B, then Typⁿ A ≥α Typⁿ B for all n permissible for A and B.

Proof. It is sufficient to prove: if A ≥¹α B and Tail A ≢ τ, then Typ A ≥α Typ B. The latter proof is easy. □
With each expression A in Λ̄ we define a degree, denoted Deg A:

Definition 8.5. (1) If A ∈ Λ̄ and Tail A ≡ τ, then Deg(A) = 1.
(2) If A ∈ Λ̄, Tail A ≡ x and A ≡ P₁[x:B]P₂x, then Deg(A) = Deg(P₁B) + 1. □

Induction on the length of A shows that Deg(A) is well-defined by Def. 8.5. Clearly Deg A = 1 if and only if Tail A ≡ τ. We shall now prove a number of theorems, leading to the theorem: if Tail A ≢ τ, then Deg A = Deg Typ A + 1 (Th. 8.12). We could have taken this property as a definition of Deg. In that case, however, the well-definedness of Deg would have been harder to prove.
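For a feel of how Typ and Deg interact, the applicator-free fragment (an abstractor chain ended by τ or a variable) is enough. The following Python sketch is illustrative only and not the paper's machinery: Chain is a hypothetical representation, binder names are assumed globally distinct, and the renovation Fr is therefore omitted when the tail variable is replaced by a copy of its type.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Chain:
    """[x1:A1]...[xn:An] s, where each type Ai is itself a Chain."""
    absts: Tuple          # tuple of (name, Chain) pairs
    tail: str             # a variable name or "tau"

def typ(a):
    """Typ A (in the spirit of Def. 8.1): replace the tail by its type."""
    assert a.tail != "tau", "Typ is undefined when Tail A is tau"
    for x, t in reversed(a.absts):            # innermost binding of the tail
        if x == a.tail:
            return Chain(a.absts + t.absts, t.tail)
    raise ValueError("tail variable is not bound")

def deg(a):
    """Deg A, computed via the property Deg A = Deg Typ A + 1 (Th. 8.12)."""
    n = 1
    while a.tail != "tau":
        a, n = typ(a), n + 1
    return n

def typstar(a):
    """Typ* A = Typ^(Deg A - 1) A; its tail is tau (Cor. 8.13)."""
    while a.tail != "tau":
        a = typ(a)
    return a
```

For A ≡ [x:τ][y:x]y one gets Typ A tailing in x, Typ² A tailing in τ, and hence Deg A = 3.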
Theorem 8.6. If PC ∈ Λ̄ and P(K)C ∈ Λ̄, then Deg PC = Deg P(K)C.

Proof. Induction on |PC|. □

Corollary 8.7. If A ∈ Λ̄ and A ≡ PC, then Deg A = Deg(A|C). □
Corollary 8.8. If A ∈ Λ̄, Tail A ≡ x and A|x ≡ Q₁[x:B]Q₂x, then Deg A = Deg Q₁B + 1. □
Theorem 8.9. If PC ∈ Λ̄ and P[x:K]C ∈ Λ̄, then Deg PC = Deg P[x:K]C.

Proof. By Th. 3.8: x does not occur in C. The rest of the proof follows from induction on |PC|. □
Theorem 8.10. If PC ∈ Λ̄ and PP′C ∈ Λ̄, then Deg PC = Deg PP′C.

Proof. Induction on ‖P′‖, using Th. 8.6 and Th. 8.9. □

Theorem 8.11. If A ∈ Λ̄ and A ≥α B, then Deg A = Deg B.

Proof. Take A ≥¹α B; induction on |A|. □

Theorem 8.12. If A ∈ Λ̄ and Tail A ≢ τ, then Deg Typ A = Deg A − 1.
Proof. Let Tail A ≡ x and A ≡ P₁[x:C]P₂x, so Typ A ≡ P₁[x:C]P₂ FrC. Then P₁ FrC ∈ Λ̄ and Deg P₁ FrC = Deg P₁C by Th. 4.5 and Th. 8.11. By Th. 8.10: Deg P₁ FrC = Deg Typ A. So Deg A = Deg P₁C + 1 = Deg Typ A + 1. □

Corollary 8.13. If A ∈ Λ̄, then Tail(Typ^(Deg A−1) A) ≡ τ. □
This optimal exponent of Typ with a certain A ∈ Λ̄ is of special importance. We shall introduce an abbreviation:

Definition 8.14. If A ∈ Λ̄, then Typ* A ≡ Typ^(Deg A−1) A. □

We stress that the asterisk replaces an exponent n dependent upon A. Moreover, note that Typ is a partial function on Λ̄, but Typ* is a total function on Λ̄. We proceed with a number of theorems on Typⁿ, Typ* and Deg:
Theorem 8.15. If A ∈ Λ̄, Deg A = 1 and A ≥ B, then Deg B = 1. □

Theorem 8.16. If PC ∈ Λ̄, then for permissible n: Typⁿ PC ≡ PC′; in particular Typ* PC ≡ PC″. □

Theorem 8.17. If PC ∈ Λ̄, PP′C ∈ Λ̄, and for a permissible n: Typⁿ PC ≡ PC′, then n is permissible for PP′C, and Typⁿ PP′C ≥α PP′C′.

Proof. It is sufficient to assume Tail PC ≢ τ and n = 1. Let Tail PC ≡ x and (PC)|x ≡ Q₁[x:B]Q₂x; then (PP′C)|x ≡ Q₁′[x:B]Q₂′x, and [x:B] appears in either P or C. The remainder follows. □

Theorem 8.18. If PC ∈ Λ̄, PP′C ∈ Λ̄ and for a permissible n: Typⁿ PP′C ≡ PP′C′, then n is permissible for PC and Typⁿ PC ≥α PC′.

Proof. Similar to the previous proof. □
CHAPTER III. THE FORMAL SYSTEM Λ

1. Legitimate expressions

The "meaning" of (A)B is the application of function B to argument A. So far this application was unrestricted: any expression could serve as an argument. Besides, it was of no interest whether B really was a function or not. In the formal system Λ, which we shall introduce in this chapter, we only admit the expressions of Λ̄ which obey the applicability condition. (For an informal introduction of the applicability condition: see Section I.4.) We call this kind of expressions legitimate expressions. Since Λ is a part of Λ̄, we again provide expressions with abstractor chains Q, as we did with expressions in Λ̄ (cf. the beginning of Section I.6). We begin with the definitions of function, domain and applicability with respect to an abstractor chain Q:

Definition 1.1. Let QB ∈ Λ̄. We call QB a Q-function if there are x, K and L such that Typ* QB ≥ Q[x:K]L. The expression QK is called a Q-domain of QB. □

Definition 1.2. The expression QB is called Q-applicable to QA if QB is a Q-function with Q-domain QK, Deg QA > 1 and Typ QA ≥ QK. In that case Q(A)B is a legitimate Q-application of QB to QA. □

The formal system Λ is inductively defined by:
Definition 1.3.
(1) τ ∈ Λ.
(2) If QA ∈ Λ and if x does not occur in QA, then Q[x:A]x ∈ Λ and Q[x:A]τ ∈ Λ.
(3) If QA and Qy ∈ Λ, if x does not occur in QA and if x ≢ y, then Q[x:A]y ∈ Λ.
(4) If QA and QB ∈ Λ, if the binding variables in A and B are distinct and if QB is Q-applicable to QA, then Q(A)B ∈ Λ. □

The only difference to the (second) definition of Λ̄ as given by Th. II.3.10 lies in the applicability condition in (4): QB must be Q-applicable to QA, i.e. Typ* QB ≥ Q[y:K]L and Typ QA ≥ QK. These reductions are defined for expressions in Λ̄ (cf. the following Th. 1.4 and Th. II.8.2). Note that the applicability condition does not state that the reductions mentioned concern expressions in Λ only. The applicability condition has the powerful consequence that all expressions in Λ normalize (cf. Section I.2), which we shall prove later in this chapter, whereas in the wider system Λ̄ normalization is not guaranteed.
Theorem 1.4. If A ∈ Λ, then A ∈ Λ̄.

Proof. Induction on the length of proof of A ∈ Λ. □

Restricting ourselves to α- and β-reductions, we can weaken the applicability condition in the sense that we replace ≥ by ~β:
Theorem 1.5. If QA and QB ∈ Λ, Q[y:K]L ∈ Λ̄, Typ* QB ~β Q[y:K]L, Typ QA ~β QK and if the binding variables in A and B are distinct, then Q(A)B ∈ Λ.

Proof. Let Typ* QB ≡ QB′ (Th. II.8.16). Since QB′ ~β Q[y:K]L, there is an M such that QB′ ≥β QM and Q[y:K]L ≥β QM (Th. II.5.12 and Th. II.5.16). From Th. II.5.16 and Th. II.5.20: QM ≡ Q[y:K′]L′ such that QK ≥β QK′. Let Typ QA ≡ QA′. Since QA′ ~β QK, there is a K″ such that QA′ ≥β QK″ and QK ≥β QK″. Hence (Church-Rosser theorem for β-reduction, Th. II.6.43) QK′ ~β QK″, so there is a K‴: QK′ ≥β QK‴ and QK″ ≥β QK‴. Also Q[y:K′]L′ ≥β Q[y:K‴]L′. Resuming: Typ* QB ≥β Q[y:K‴]L′ and Typ QA ≥ QK‴. So Q(A)B ∈ Λ. □

Note that the above theorem does not hold if we use lambda equivalence (~) instead of β-equivalence (~β). Let QA ∈ Λ and Typ QA ≥ QA′. Let QB ∈ Λ for some B. Then Typ* QB ≡ QB′ ~ Q[y:A′](y)B′ for some fresh y, since Q[y:A′](y)B′ ≥η QB′. If the above theorem were to hold with ~ instead of ~β, it would follow that Q(A)B ∈ Λ. Note that A and B are arbitrary. This can clearly not generally be the case. As a counterexample, take Q ≡ [x:τ], A ≡ B ≡ x. Then Q(A)B ≡ [x:τ](x)x, which does not belong to Λ.

We shall prove a number of theorems concerning Λ.
Theorem 1.6. If A ∈ Λ and A ≥α B, then B ∈ Λ. □

As with Λ̄, it holds for Λ that, given K ∈ Λ, only one of the derivation steps in Def. 1.3 can yield K ∈ Λ as a conclusion (unique Λ-constructibility).
Theorem 1.7. If Q(A)B ∈ Λ, then QA and QB ∈ Λ.

Proof. Follows from the unique Λ-constructibility. □
Theorem 1.8. If Q[x:A]B ∈ Λ, then QA ∈ Λ.

Proof. Induction on |B|, using the unique Λ-constructibility. Let B ≡ [y₁:B₁]...[yₖ:Bₖ]Ps, where P ≢ [z:E]P′, and s ≡ τ, s ≡ y ≢ x or s ≡ x.
case 1. P ≡ ∅, k = 0. Then QA ∈ Λ from rule (2) of Def. 1.3 for all possible s.
case 2. P ≡ ∅, k ≥ 1. Then Q[x:A][y₁:B₁]...[yₖ₋₁:Bₖ₋₁]Bₖ ∈ Λ from rule (2) or (3), so QA ∈ Λ by induction.
case 3. P ≡ (E)P′. Then Q[x:A][y₁:B₁]...[yₖ:Bₖ]E ∈ Λ by Th. 1.7, hence QA ∈ Λ by induction. □

Theorem 1.9. If QA ∈ Λ, then Qτ ∈ Λ.

Proof. Induction on |A|. If A ≡ τ, there is nothing to prove. If A ≡ x, then Q ≡ Q₁[y:B] or Q ≡ Q₁[x:B]. In both cases Q₁B ∈ Λ, so also Qτ ∈ Λ. If A ≡ (B)C or A ≡ [x:B]C, then QB ∈ Λ by Th. 1.7 or by Th. 1.8, so by induction Qτ ∈ Λ. □
c A , then AIB E A.
Proof. Induction on IAl. If A = r then the proof is trivial. Let A = (21: All ... [zk : Ak] Ps, where P f [ z : E ] PI. (1) If B
= [xj : Aj] ... [zk: Ah]Ps or B = Ps, then AIB 3 A E A.
( 2 1 : Ail ...[zi-i : Ai-I] (AilB) : All ... [ ~ i - 1 : Ai-11 Ai E A by ( [ x i : All ... [zi-i : Ai-l] Ai)lB and Th. 1.8, so by induction AIB E A.
( 2 ) If B C Ai, then AIB
(3) Let B C Ps, B f Ps. If P = 0 then B = s and AIB = A E A. So assume P = ( K )PI. Distinguish the cases B c K and B c PIS. In both cases we may conclude AIB E A by a similar reasoning as in (2). 0
Corollary 1.11. ZfA E A and x
c A, then Alz E A.
0
Theorem 1.12. If QA and QB ∈ Λ and Q[x:A]B ∈ Λ̄, then Q[x:A]B ∈ Λ.

Proof. Induction on |B|. Let B ≡ [y₁:B₁]...[yₖ:Bₖ]Ps, where P ≢ [z:E]P′.
case 1. P ≡ ∅, k = 0. Then Q[x:A]B ∈ Λ by Def. 1.3 (2) or (3).
case 2. P ≡ ∅, k ≥ 1. Call [y₁:B₁]...[yₖ₋₁:Bₖ₋₁] ≡ Q′.
(1) Assume s ≡ yₖ. Then QQ′Bₖ ∈ Λ by Th. 1.8, and Q[x:A]Q′Bₖ ∈ Λ̄ (Th. II.3.8 and Th. II.3.9), so by induction Q[x:A]Q′Bₖ ∈ Λ, hence Q[x:A]B ∈ Λ.
(2) Assume s ≢ yₖ. Then QQ′Bₖ and QQ′s ∈ Λ (by the unique Λ-constructibility), Q[x:A]Q′Bₖ and Q[x:A]Q′s ∈ Λ̄ (Th. II.3.8 and Th. II.3.9), so by induction Q[x:A]Q′Bₖ and Q[x:A]Q′s ∈ Λ. It follows that Q[x:A]B ∈ Λ.
case 3. P ≡ (E)P′. Call [y₁:B₁]...[yₖ:Bₖ] ≡ Q″. Then QQ″E and QQ″P′s ∈ Λ by Th. 1.7, Typ* QQ″P′s ≡ QQ″F′ ≥ QQ″[z:K]L and Typ QQ″E ≡ QQ″E′ ≥ QQ″K. It follows from Th. II.7.33, Th. II.8.9 and Th. II.8.17 that Typ* Q[x:A]Q″P′s ≡ Q[x:A]Q″F″ ≥ Q[x:A]Q″[z:K]L and Typ Q[x:A]Q″E ≡ Q[x:A]Q″E″ ≥ Q[x:A]Q″K. By Th. II.3.8 and Th. II.3.9, Q[x:A]Q″P′s and Q[x:A]Q″E ∈ Λ̄, so by induction they also belong to Λ, hence Q[x:A]B ∈ Λ. □

Theorem 1.13. If Q[x:A]B ∈ Λ and QB ∈ Λ̄, then QB ∈ Λ.

Proof. Induction on |B|. The proof is similar to the proof of Th. 1.12, with the use of Th. II.3.10 instead of Th. II.3.11. □

We shall use the following theorem as a lemma for the important Th. 1.15.

Theorem 1.14. Let PP′K, PL ∈ Λ, PP′L ∈ Λ̄ and Typ* PP′K ≥α Typ* PP′L. Then PP′L ∈ Λ.

Proof. Induction on ‖PP′‖. If PP′ ≡ ∅, the proof is trivial.
case 1. Assume P ≡ Q(E)P″. Then QP″P′K ∈ Λ, QE ∈ Λ, Typ* QP″P′K ≥ Q[y:M]N and Typ QE ≥ QM. Also: QP″L ∈ Λ and QP″P′L ∈ Λ̄. We now prove that Typ* QP″P′K ≥α Typ* QP″P′L. Let Typ* PP′K ≡ PP′K′ and Typ* PP′L ≡ PP′L′; then by hypothesis Q(E)P″P′K′ ≡ PP′K′ ≥α PP′L′ ≡ Q(E)P″P′L′, so also QP″P′K′ ≥α QP″P′L′ (Th. II.4.6). But Typ* QP″P′K ≡ QP″P′K′ and Typ* QP″P′L ≡ QP″P′L′ by Th. II.8.6 and Th. II.8.18. It follows by induction that QP″P′L ∈ Λ. Also Typ* QP″P′L ≥ Q[y:M]N, so Q(E)P″P′L ≡ PP′L ∈ Λ.
case 2. Assume P ≡ Q and P′ ≡ Q′(E)P″. Then QQ′P″K ∈ Λ, QQ′E ∈ Λ, Typ* QQ′P″K ≥ QQ′[y:M]N and Typ QQ′E ≥ QQ′M. Also: QQ′P″L ∈ Λ̄ and Typ* QQ′P″K ≥α Typ* QQ′P″L (which can be proved as in case 1), so by induction QQ′P″L ∈ Λ. Since Typ* QQ′P″L ≥ QQ′[y:M]N it follows that QQ′(E)P″L ≡ PP′L ∈ Λ.
case 3. Assume P ≡ Q and P′ ≡ Q′. If Q′ ≡ ∅ there is nothing to prove. Let Q′ ≡ [z₁:M₁]...[zₙ:Mₙ] for n ≥ 1. Since QL and QQ′L ∈ Λ̄, zᵢ cannot occur in QL (Th. II.3.8) or in Q[z₁:M₁]...[zᵢ₋₁:Mᵢ₋₁]Mᵢ. It follows from QQ′K ∈ Λ (Th. 1.8) that QM₁, Q[z₁:M₁]M₂, ..., Q[z₁:M₁]...[zₙ₋₁:Mₙ₋₁]Mₙ ∈ Λ. So also Q[z₁:M₁]L, Q[z₁:M₁][z₂:M₂]L, ..., QQ′L ∈ Λ by Th. II.3.11 and Th. 1.12. □

Theorem 1.15. If A ∈ Λ, then Typⁿ A ∈ Λ for all permissible n.

Proof. Let A ≡ P₁[x:B]P₂x; then Typ A ≡ P₁[x:B]P₂ FrB. Since A ∈ Λ: P₁B ∈ Λ (Th. 1.8), so P₁ FrB ∈ Λ (Th. 1.6). Also Typ A ∈ Λ̄ (Th. II.8.2) and Typ* A ≥α Typ*(Typ A). Now, applying Th. 1.14, we obtain Typ A ∈ Λ. The theorem follows directly. □

2. The normalization theorem

In this section we shall prove the normalization theorem: if A ∈ Λ, there is a B in normal form such that A ≥ B (B is said to be in normal form if there are no reductions B ≥¹β B′ or B ≥¹η B′). We do this by the aid of a norm ρ, which is a partial function from expressions in Λ̄ to expressions in Λ̄, and which has the following powerful properties with relation to Λ:

(1) If A ∈ Λ, then ρ(A) is defined.
(2) If A ∈ Λ and A ≥ B, then ρ(A) ≥α ρ(B).
(3) If A ∈ Λ and Deg A > 1, then ρ(A) ≥α ρ(Typ(A)).

Hence this norm is invariant (apart from α-reduction) with respect to reduction and typing. We first define ρ_A for every A ∈ Λ̄. This ρ_A is a partial function from subexpressions of A to expressions. It is rather in contradiction to our philosophy to define the norm with respect to subexpressions, which need not belong to Λ̄. We could have avoided this by giving a definition of the norm in the line of our second definition of Λ̄, only considering norms of expressions in Λ̄. This, however, would have impaired understanding of the following and would have led to laborious descriptions. On the other hand, in this section the context of a subexpression will always be clear, so that no confusion can arise. In the following inductive definition of ρ_A we do not explicitly indicate which occurrence of a subexpression in an expression is meant, since this will be clear from the context.
Definition 2.1. Let A ∈ Λ̄.
(1) If τ ⊂ A, then ρ_A(τ) ≡ τ.
(2) If x ⊂ A, A|x ≡ Q₁[x:B]Q₂x and if ρ_A(B) is defined, then ρ_A(x) ≡ ρ_A(B).
(3) If [x:B]C ⊂ A, and if both ρ_A(B) and ρ_A(C) are defined, then ρ_A([x:B]C) ≡ [x:ρ_A(B)]ρ_A(C).
(4) If (B)C ⊂ A, if both ρ_A(B) and ρ_A(C) are defined and ρ_A(C) ≡ [y:D]E where D ≥α ρ_A(B), then ρ_A((B)C) ≡ E. □
From this definition it can easily be seen that, if ρ_A A is defined for A ∈ Λ̄, then ρ_A A contains no bound variables. The following theorem is obvious:

Theorem 2.2. If A ∈ Λ̄, ρ(A) is defined and A ≥α B, then ρ(B) is defined and ρ(A) ≥α ρ(B). □

The binding variables in ρ_A A will be irrelevant to our purposes. We might as well do without them. Our reason for retaining them is personal taste: we find the property ρ_A(A) ∈ Λ̄ agreeable. In trying to calculate ρ_A(A) for a certain A ∈ Λ̄, we apply the four rules of Def. 2.1; the only event in which this calculation can break down prematurely (before ρ_A(A) is obtained) is when we encounter a subexpression (B)C ⊂ A for which the conditions stated in Def. 2.1 (4) are not fulfilled. These conditions may be considered as a weaker form of the applicability condition (cf. Section I.6, where this is explained in an informal manner): (1) C must have a norm with a functional character: ρ_A C ≡ [y:D]E, and (2) B must have a norm which behaves as an appropriate argument for the "function" ρ_A C: ρ_A B ≥α D. If these conditions are fulfilled, the norm of (B)C is defined as the result of the application of the "function" ρ_A C to the "argument" ρ_A B: ρ_A((B)C) ≡ E; if these conditions are not fulfilled, the norm of (B)C is not defined, and neither is the norm of A. Note that the norm of a bound variable is defined as the norm of its "type": if [x:B] is the binding abstractor of x, then ρ_A(x) ≡ ρ_A(B). The existence of ρ_A A for a certain A ∈ Λ̄ indicates that some weak functional condition is fulfilled. Surprisingly enough, the existence of ρ_A A already guarantees that there is a normal form for A. We shall prove this in Th. 2.17. We are especially interested in normalization properties of expressions in Λ. We note that expressions in Λ have, so to say, a much stronger functional character than is required for the existence of the norm of expressions. Th. 2.7, stating that ρ_A A exists for A ∈ Λ, is not hard to prove.
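The calculation just described is easy to mechanize. The Python sketch below is again an illustration under simplifying assumptions, not the official definition: binder names in norms are dropped (the text notes they are irrelevant), the environment carries ρ of each variable's type, and None plays the role of "undefined". Rule (4) is exactly where the computation can break down.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tau: pass
@dataclass(frozen=True)
class Var: name: str
@dataclass(frozen=True)
class Abs: var: str; typ: object; body: object    # [x : B] C
@dataclass(frozen=True)
class App: arg: object; fun: object               # (B) C

def norm(e, env=None):
    """rho-norm in the spirit of Def. 2.1; returns None when undefined."""
    env = env or {}
    if isinstance(e, Tau):
        return "tau"                               # rule (1)
    if isinstance(e, Var):
        return env.get(e.name)                     # rule (2): rho(x) = rho(B)
    if isinstance(e, Abs):                         # rule (3)
        d = norm(e.typ, env)
        c = norm(e.body, {**env, e.var: d}) if d is not None else None
        return None if d is None or c is None else ("abs", d, c)
    if isinstance(e, App):                         # rule (4): rho C = [y:D]E, D = rho B
        b, c = norm(e.arg, env), norm(e.fun, env)
        if b is None or not isinstance(c, tuple) or c[1] != b:
            return None                            # weak applicability test fails
        return c[2]
```

The counterexample [x:τ](x)x of Section 1 indeed has no norm, whereas β-reduction leaves the norm unchanged: (τ)[x:τ]x and its reduct τ both have norm τ.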
If in the following we speak of the norm of a subexpression B of a certain expression A, it will be clear which A we mean, even if we do not state this explicitly. In such cases we shall write ρ(B) instead of ρ_A(B). If A ∈ Λ̄, B ⊂ A and ρ_A(B) is defined, we call B ρ_A-normable. Here, too, we speak of "ρ-normable B" if it is clear which A (with B ⊂ A ∈ Λ̄) we mean. If Qτ ∈ Λ̄, if Qτ is ρ-normable and ρ(Qτ) ≡ Q′τ, we call Q ρ-normable, and we abbreviate ρ(Q) ≡ Q′.
Theorem 2.3. If A ∈ Λ̄, A is ρ-normable and B ⊂ A, then B is ρ-normable.

Proof. Induction on |A|, with the use of the definition of B ⊂ A (Def. II.2.5). □
Theorem 2.4. If QA ∈ Λ̄ and QA is ρ-normable, then Q and A are ρ-normable and ρ(QA) ≡ (ρQ)ρA; if QA ∈ Λ̄, and if Q and A are ρ-normable, then QA is ρ-normable and ρ(QA) ≡ (ρQ)ρA.

Proof. Induction on ‖Q‖. □
Theorem 2.5. If A ∈ Λ̄, if A is ρ-normable and A ≥ B, then B is ρ-normable and ρA ≥α ρB.

Proof. First assume that A ≥¹ B. We proceed by induction on the length of proof that A ≥¹ B. If A ≥¹α B then the proof is trivial; this case is expressed in Th. 2.2.
(I) (a) A ≥¹ B is Q(C)[x:D]E ≥¹β Q(x:=C)E. Since A is ρ-normable: ρ([x:D]E) ≡ [x:ρD]ρE and ρC ≥α ρD ≡ ρx (see Th. 2.3 and Th. 2.4). Moreover, ρA ≡ ρ(Q(C)[x:D]E) ≡ (ρQ)ρE. We now prove, for this x, C and E:

Lemma. If K ⊂ E, then (x:=C)K is ρ-normable and ρ(x:=C)K ≥α ρK.

Proof of the lemma. Induction on |K|.
(1) (a) If K ≡ x, then (x:=C)K ≡ FrC and ρ(x:=C)K ≡ ρ FrC ≥α ρC ≥α ρx ≡ ρK.
(b) If K ≡ y ≢ x or K ≡ τ, then (x:=C)K ≡ K and ρ(x:=C)K ≡ ρK.
(2) If K ≡ [y:F]G, then ρK ≡ [y:ρF]ρG. Note that y ≢ x. By induction: (x:=C)F and (x:=C)G are ρ-normable, ρ(x:=C)F ≥α ρF and ρ(x:=C)G ≥α ρG. So (x:=C)K is ρ-normable and ρ(x:=C)K ≡ [y:ρ(x:=C)F]ρ(x:=C)G ≥α [y:ρF]ρG ≡ ρK.
(3) If K ≡ (F)G, then ρG ≡ [y:L]H with L ≥α ρF, and ρK ≡ H. By induction: (x:=C)F and (x:=C)G are ρ-normable, ρ(x:=C)F ≥α ρF and ρ(x:=C)G ≥α ρG. It follows that ρ(x:=C)G ≡ [z:L′]H′ with L′ ≥α L ≥α ρF ≥α ρ(x:=C)F, so (x:=C)K is ρ-normable and ρ(x:=C)K ≡ H′ ≥α H ≡ ρK. □

It follows that B is ρ-normable (since E ⊂ E), and ρB ≡ (ρQ)ρ(x:=C)E ≥α (ρQ)ρE ≡ ρA.
(b) A ≥¹ B is Q[x:C](x)D ≥¹η QD. Since A is ρ-normable: QC is ρ-normable, ρx ≡ ρC, ρD ≡ [y:L]H and L ≥α ρx ≡ ρC, so ρA ≡ (ρQ)[x:ρC]H ≥α (ρQ)[y:L]H ≡ (ρQ)ρD ≡ ρQD ≡ ρB.
(II) A ≥¹ B is a direct consequence of a monotony rule. It depends on the monotony rule which of the following three cases applies:
(a) A ≥¹ B is Q(C)E ≥¹ Q(C)F as a direct consequence of QE ≥¹ QF. Since A is ρ-normable: QE is ρ-normable. By induction: QF is ρ-normable, and ρQE ≡ (ρQ)ρE ≥α (ρQ)ρF ≡ ρQF, so ρE ≥α ρF. Moreover, ρE ≡ [y:L]H and ρC ≥α L, so also ρF ≡ [y:L′]H′, where L′ ≥α L and H′ ≥α H. It follows that (C)F is ρ-normable, and ρ(C)F ≡ H′ ≥α H. So ρB ≥α (ρQ)H ≡ ρA.
(b) A ≥¹ B is Q[x:C]E ≥¹ Q[x:D]E as a direct consequence of QC ≥¹ QD. Then ρA ≡ (ρQ)[x:ρC]ρE. By induction: QD is ρ-normable and ρQC ≡ (ρQ)ρC ≥α (ρQ)ρD ≡ ρQD, so ρC ≥α ρD. Hence B is ρ-normable and ρA ≥α (ρQ)[x:ρD]ρE ≡ ρB.
(c) A ≥¹ B is Q(C)E ≥¹ Q(D)E as a direct consequence of QC ≥¹ QD. Then ρE ≡ [x:L]H and L ≥α ρC. By induction: QD is ρ-normable and ρQC ≡ (ρQ)ρC ≥α (ρQ)ρD ≡ ρQD, so ρC ≥α ρD. Hence B is ρ-normable and ρB ≡ (ρQ)H ≡ ρA.
Finally, if A ≥ B is a multiple-step reduction, decompose the reduction and apply the above. □
Theorem 2.6. If A ∈ Λ̄, Deg A > 1 and A is ρ-normable, then Typ A is ρ-normable and ρA ≥α ρ Typ A.

Proof. Let A ≡ P₁[x:B]P₂x; then Typ A ≡ P₁[x:B]P₂ FrB. It is not hard to show that ρ FrB ≥α ρB ≡ ρx. Let P₁[x:B]P₂ ≡ P₁P″. Next prove by induction on ‖P″‖ that P″ FrB is ρ-normable, and ρ(P″ FrB) ≥α ρ(P″x). □

Theorem 2.7. If A ∈ Λ, then A is ρ-normable (i.e. ρ is a total function on Λ).
Proof. Induction on the length of proof of A ∈ Λ.
(1) A ≡ τ: trivial.
(2) A ≡ Q[x:B]x or Q[x:B]τ ∈ Λ as a direct consequence of QB ∈ Λ. Then by induction QB is ρ-normable, hence Q is ρ-normable and ρB ≡ ρx. Hence A is ρ-normable.
(3) A ≡ Q[x:B]y ∈ Λ as a direct consequence of QB ∈ Λ and Qy ∈ Λ. By induction: QB and Qy are ρ-normable, hence Q, B and y are ρ-normable, so A is ρ-normable.
(4) A ≡ Q(B)C ∈ Λ as a direct consequence of QB ∈ Λ, QC ∈ Λ, and the Q-applicability of QC to QB. Then QB and QC are ρ-normable (induction), and so are Q, B and C. The Q-applicability implies that Typ* QC ≥ Q[x:K]L and Typ QB ≥ QK. From Th. 2.5 and Th. 2.6: Typ* QC and Q[x:K]L are ρ-normable, ρQC ≥α ρ Typ* QC ≥α (ρQ)[x:ρK]ρL (so ρC ≥α [x:ρK]ρL) and ρQB ≥α ρ Typ QB ≥α ρQK (so ρB ≥α ρK). Hence (B)C is ρ-normable and so is A. □
Instead of "ρ is total on Λ", we also say: Λ is ρ-normable. From Th. 2.5 we derive:

Theorem 2.8. If A, B ∈ Λ and A ~ B, then ρA ≥α ρB. □

We shall now prove the normalization theorem for ρ-normable expressions.
Definition 2.9. A ∈ Λ̄ is normal (or in normal form) if there are no reductions A ≥¹β B or A ≥¹η B; A is normalizable (or A has a normal form) if there exists a normal C such that A ≥ C (C is called a normal form of A). □

Definition 2.10. A ∈ Λ̄ is β-normal if there is no reduction A ≥¹β B; A is β-normalizable (or A has a β-normal form) if there exists a β-normal C such that A ≥β C (C is called a β-normal form of A). □

Hence A is normal if A admits of neither β- nor η-reductions; A is β-normal if A admits of no β-reductions (except trivial ones).
Theorem 2.11. If A ∈ Λ̄ is normal and A ≥ B, then B is normal. If A ∈ Λ̄ is β-normal and A ≥α B or A ≥η B, then B is β-normal.

Proof. The only non-trivial statement is that B is β-normal if A is β-normal and A ≥¹η B. It can, however, easily be seen that a single-step η-reduction of a β-normal expression cannot introduce the possibility of a single-step β-reduction. □

We restate the following well-known theorem:
Theorem 2.12. If A ∈ Λ̄ is β-normalizable, then A is normalizable.

Proof. As a result of Th. 2.11, η-reductions of A do not cancel the β-normal character. But the possible number of single-step η-reductions applicable to A is finite, since the expression becomes shorter with each step. □

Theorem 2.13. If A ∈ Λ̄ is β-normalizable, the β-normal form of A is unique but for α-reductions.

Proof. Let C and D be β-normal, A ≥β C and A ≥β D. Then, by the Church-Rosser theorem for β-reduction (Th. II.6.43), there is an E such that C ≥β E and D ≥β E. Hence C ≥α D. □

Theorem 2.14. Assume that Q(Aₖ)...(A₁)B is in Λ̄ and ρ-normable. Then |ρB| > Σᵢ₌₁ᵏ |ρAᵢ|.

Proof. Induction on k. If k = 0 the proof is trivial. Let k > 0. Then (A₁)B is ρ-normable, hence ρB ≡ [x:M]N and ρA₁ ≥α M, so |ρA₁| = |M|. Moreover, ρ(A₁)B ≡ N, hence |N| > Σᵢ₌₂ᵏ |ρAᵢ| by induction. It follows that |ρB| > |M| + |N| > Σᵢ₌₁ᵏ |ρAᵢ|. □

Definition 2.15. Assume that A is in Λ̄ and ρ-normable, A ≡ Q(Cₙ)...(C₁)F for some n ≥ 1 and F ≢ (M)N. Then σ(A) = Σᵢ₌₁ⁿ |ρCᵢ|. If A ≡ Qs (with s ≡ x or s ≡ τ), then σ(A) = 0. □
Theorem 2.16. Assume that A is in Λ̄ and ρ-normable, A ≡ Q(Cₙ)...(C₁)F, F ≢ (M)N, and let QCᵢ (for 1 ≤ i ≤ n) and QF be β-normal. Then A is β-normalizable.

Proof. Induction on σ(A).
(1) If σ(A) = 0, then n = 0 and A ≡ QF is in β-normal form.
(2) Let σ(A) > 0. Then n ≥ 1. We proceed by induction on |F|. If F ≡ y, then A is in β-normal form (F ≡ τ cannot occur since A is ρ-normable). So let |F| > 1. Then F ≡ [x:D]E.
(i) Assume that E ≡ (Hₘ)...(H₁)y where y ≢ x and m ≥ 0. Then A ≥ Q(Cₙ)...(C₂)((x:=C₁)Hₘ)...((x:=C₁)H₁)y. Now note that Q(C₁)[x:D]Hᵢ ∈ Λ̄, ρ-normable, and σ(Q(C₁)[x:D]Hᵢ) = |ρC₁| ≤ σ(A). Moreover, |Hᵢ| < |E|, so by induction Q(C₁)[x:D]Hᵢ is β-normalizable. Since QC₁, QD and QHᵢ are β-normal, the β-normalization of Q(C₁)[x:D]Hᵢ must commence with Q(C₁)[x:D]Hᵢ ≥¹β Q(x:=C₁)Hᵢ, so Q(x:=C₁)Hᵢ is β-normalizable; say Q(x:=C₁)Hᵢ ≥ QKᵢ in β-normal form. It follows that A ≥ Q(Cₙ)...(C₂)(Kₘ)...(K₁)y in β-normal form.
(ii) Assume that E ≡ (Hₘ)...(H₁)τ where m ≥ 0. Again, analogously to (i), A is β-normalizable. (Moreover, m must be 0 since A is ρ-normable.)
(iii) Assume that E ≡ (Hₘ)...(H₁)x where m ≥ 0. Then A ≥ A′ ≡ Q(Cₙ)...(C₂)(Kₘ)...(K₁) FrC₁, where we obtain the β-normal QKᵢ as in (i). If now FrC₁ is a variable or if FrC₁ begins with an applicator, we have obtained a β-normal form. If FrC₁ ≡ [y:M]N, then σ(A′) = Σᵢ₌₂ⁿ |ρCᵢ| + Σᵢ₌₁ᵐ |ρKᵢ| < σ(A), since Σᵢ₌₁ᵐ |ρKᵢ| < |ρ FrC₁| (by Th. 2.14 and Th. 2.5) = |ρC₁|. So by the induction hypothesis A′ is β-normalizable, hence A is β-normalizable too; or n = 1 and m = 0, whence A′ is β-normal.
(iv) Assume that E ≡ [y:H₁]H₂. Then A ≥ Q(Cₙ)...(C₂)[y:(x:=C₁)H₁](x:=C₁)H₂ ≥ Q(Cₙ)...(C₂)[y:K₁]K₂, where we again obtain the β-normal Q[y:K₁]K₂ as in (i). Since σ(Q(Cₙ)...(C₂)[y:K₁]K₂) = Σᵢ₌₂ⁿ |ρCᵢ| < σ(A), it follows by induction that A is β-normalizable, or n = 1 and Q[y:K₁]K₂ is β-normal. □
Theorem 2.17 (β-normalization theorem). If A ∈ Λ is μ-normable, then A is β-normalizable.
Proof. Induction on the length of proof of A ∈ Λ.
(1) A ≡ τ: trivial.
(2) A ≡ Q [x : B] x or A ≡ Q [x : B] τ ∈ Λ as a direct consequence of QB ∈ Λ. Then by induction QB is β-normalizable, so QB ≥β Q′B′ in β-normal form, and A ≥β Q′ [x : B′] x or A ≥β Q′ [x : B′] τ in β-normal form.
Strong normalization in a typed lambda calculus (C.3)
463
(3) A ≡ Q [x : B] y ∈ Λ as a direct consequence of QB ∈ Λ and Qy ∈ Λ. Then by induction QB ≥β Q′B′ in β-normal form, so A ≥β Q′ [x : B′] y in β-normal form.
(4) A ≡ Q (B) C ∈ Λ as a direct consequence of QB ∈ Λ and QC ∈ Λ. Then by the induction hypothesis: QB ≥β Q′B′ in β-normal form (with ||Q|| = ||Q′||) and QC ≥β Q″C′ in β-normal form (with ||Q|| = ||Q″||). From Th. 2.13 and induction on ||Q|| it follows that Q″C′ ≥α Q′C′. Also Q (B) C ≥β Q′ (B′) C′, which is in β-normal form if C′ ≢ [x : D] E. So all that is left to prove is that Q′ (B′) [x : D] E is β-normalizable. But this follows from Th. 2.16 and Th. 2.5. □
Theorem 2.18 (normalization theorem for Λ). If A ∈ Λ, then A is β-normalizable and normalizable.
Proof. Follows from Th. 2.17, Th. 2.12 and Th. 2.7. □
In fact we proved that A ∈ Λ is effectively normalizable, since all our proofs are constructive; this implies that the normal form of A ∈ Λ is effectively computable.
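To illustrate what "effectively computable" means here, the following sketch normalizes plain untyped λ-terms by leftmost-outermost (normal-order) reduction. It is an illustration only, not the typed system Λ of the paper, and the substitution is capture-naive (sufficient for the examples shown).

```python
# Illustrative only: a normal-order beta-normalizer for plain untyped
# lambda terms, not for the typed system Lambda of the paper.
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def subst(term, x, s):
    """Capture-naive substitution term[x := s]."""
    tag = term[0]
    if tag == 'var':
        return s if term[1] == x else term
    if tag == 'lam':
        _, y, body = term
        return term if y == x else ('lam', y, subst(body, x, s))
    _, f, a = term
    return ('app', subst(f, x, s), subst(a, x, s))

def step(term):
    """One leftmost-outermost beta-step, or None if term is beta-normal."""
    tag = term[0]
    if tag == 'app':
        f, a = term[1], term[2]
        if f[0] == 'lam':                      # outermost redex first
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return ('app', r, a)
        r = step(a)
        return None if r is None else ('app', f, r)
    if tag == 'lam':
        r = step(term[2])
        return None if r is None else ('lam', term[1], r)
    return None                                # a variable is normal

def normalize(term, fuel=1000):
    """Iterate single steps; the fuel bound stands in for a termination proof."""
    for _ in range(fuel):
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt
    raise RuntimeError('no normal form reached within the fuel bound')
```

For instance, with K ≡ λx.λy.x, normalizing (K a) b yields a.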
3. Strong normalization

In the previous section we have proved normalization for Λ. This guarantees that for every A ∈ Λ there is a reduction which leads to a normal form. However, we do not yet know whether an arbitrary sequence of single-step reductions, beginning with A, terminates (in a normal form). We shall prove this in this section. The property that an arbitrary sequence of single-step reductions, beginning with some A, terminates, will be called the property of strong normalization.
In the proof we shall use β₁-reduction and β₂-reduction, introduced in Section II.6. A feature of β₁-reduction is that "scars" of old β-reductions are retained. We shall first prove strong normalization for Λ as to β₁-reduction, and derive strong normalization for Λ as to β-reduction; finally, we shall incorporate η-reductions.
Definition 3.1. A ∈ Λ is β₁-normal (or in β₁-normal form) if there is no B such that A ≥¹β₁ B; A is β₁-normalizable (or A has a β₁-normal form) if there exists a β₁-normal C such that A ≥β₁ C (C is then called a β₁-normal form of A). The concepts β₂-normal, β₂-normal form and β₂-normalizable are defined analogously. □
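A rough illustration of the β₁/β₂ decomposition on plain untyped terms (a sketch under the reading of Section II.6, with capture-naive substitution): a β₁-step substitutes but leaves the applicator-abstractor pair in place as a "scar", and a β₂-step then removes such an ineffective pair; together they give an ordinary β-step.

```python
# Sketch: beta = beta1 followed by beta2, on plain lambda terms.
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def subst(term, x, s):
    """Capture-naive substitution term[x := s]."""
    tag = term[0]
    if tag == 'var':
        return s if term[1] == x else term
    if tag == 'lam':
        _, y, body = term
        return term if y == x else ('lam', y, subst(body, x, s))
    _, f, a = term
    return ('app', subst(f, x, s), subst(a, x, s))

def free_vars(term):
    tag = term[0]
    if tag == 'var':
        return {term[1]}
    if tag == 'lam':
        return free_vars(term[2]) - {term[1]}
    return free_vars(term[1]) | free_vars(term[2])

def beta1_root(term):
    """beta1-step at the root: substitute, but keep the pair as a scar."""
    _, f, a = term
    assert f[0] == 'lam'
    return ('app', ('lam', f[1], subst(f[2], f[1], a)), a)

def beta2_root(term):
    """beta2-step at the root: drop a pair whose variable no longer occurs."""
    _, f, a = term
    assert f[0] == 'lam' and f[1] not in free_vars(f[2])
    return f[2]
```

On (λx.x) a, the β₁-step gives (λx.a) a, a term with an ineffective pair; the β₂-step then yields a, the ordinary β-contractum. Note that a β₁-step never shortens a term (cf. Th. 3.12 below), which is what makes the measures Θ₁ and Θ₂ usable.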
Theorem 3.2. If A ∈ Λ, A is β₁-normal and A ≥α B, then B is β₁-normal. □
Theorem 3.3. If A ∈ Λ is β₁-normalizable, the β₁-normal form is unique but for α-reduction.
Proof. This follows from CR for β₁-reductions (Th. II.6.38). □
Theorem 3.4. If A ∈ Λ, A is μ-normable and A ≥β₁ B, then B is μ-normable and μA ≥α μB.
Proof. First assume that A ≥¹β₁ B. We proceed by induction on the length of proof of A ≥¹β₁ B.
(I) Let A ≡ Q (C) P [x : D] E ≥¹β₁ Q (C) P [x : D] (x := C) E ≡ B. Then P [x : D] E is μ-normable. It is easy to see (induction on ||P||) that μ([x : D] E) ≡ [x : μD] μE, so μC ≥α μD. As in the lemma μ(P [x : D] E) ≡ μ([x : D] E) occurring in the proof of Th. 2.5, we can prove that (x := C) E is μ-normable and that μE ≥α μ((x := C) E). It follows that B is μ-normable and μA ≥α μB.
(II) A ≥¹β₁ B is a direct consequence of a monotony rule. In all three cases the proof is identical to that given in part II of the proof of Th. 2.5.
Finally, if A ≥β₁ B is a multiple-step reduction, decompose the reduction into single-step β₁-reductions and apply the above. □
We shall now prove the β₁-normalization theorem. We do this in quite the same manner in which we proved the β-normalization theorem. In fact, if we had begun by proving the β₁-normalization theorem, the β-normalization theorem would have been a corollary. We have not chosen this order because in the following proof the main lines are obscured by the presence of a number of β-chains P̊_i. In contrast to this, the line of thought in the proof of the β-normalization theorem, given in the previous section, is much more lucid.

Definition 3.5. Let A ∈ Λ, A ≡ P₁ P B, and let P be such that, for each [x : C] for which P ≡ P₂ [x : C] P₃, it holds that x does not occur in P₃ B. Then we call P an ineffective β-chain, and write A ≡ P₁ P̊ B. □
Theorem 3.6. If A ∈ Λ is β₁-normal and B is a subexpression of A, then B has the form
B ≡ P̊₀ [x_1 : A_1] P̊₁ ... [x_n : A_n] P̊_n (B_1) P̊′₁ ... (B_l) P̊′_l s, with s ≡ x_i or s ≡ τ.
Proof. Induction on |B|. □
Theorem 3.7. Let (A_k) P̊_k (A_{k-1}) P̊_{k-1} ... (A_1) P̊₁ B belong to Λ and be μ-normable. Then |μB| > Σ_{i=1}^{k} |μA_i|.
Proof. Analogous to the proof of Th. 2.14. Note again that for μ-normable P̊C it holds: μ(P̊C) ≡ μC. □

Definition 3.8. Let A ∈ Λ and assume that A is μ-normable. Let A ≡ Q P̊_{n+1} (C_n) P̊_n ... (C_1) P̊₁ F, where F ≡ τ, F ≡ x, or F ≡ [y : M] N with y occurring in N. Then σ₁(A) = Σ_{i=1}^{n} |μC_i| if n ≥ 1, and σ₁(A) = 0 in case n = 0 and P̊₁ is nonempty. Moreover, σ₁(A) = 0 if A ≡ Q τ or A ≡ Q x. □
Theorem 3.9. Let A belong to Λ and be μ-normable, let A ≡ Q P̊_{n+1} (C_n) P̊_n ... (C_1) P̊₁ F, where F ≡ τ, F ≡ x, or F ≡ [x : D] E with x occurring in E. Let QC_i, QF and Q P̊_i τ be β₁-normal. Then A is β₁-normalizable.

Proof. Induction on σ₁(A). The proof is analogous to that of Th. 2.16; however, some modifications are required due to the P̊_i. We shall briefly comment on this.
As to (2) (i): E ≡ P̊′_k (H_m) ... P̊′₁ (H_1) P̊′₀ y, and
A ≥ Q P̊_{n+1} (C_n) P̊_n ... (C_2) P̊₂ (C_1) P̊₁ [x : D] P̊″_k (K_m) ... P̊″₁ (K_1) P̊″₀ y,
where the K_i are obtained as in the proof of Th. 2.16 and where the Q P̊″_i τ are the β₁-normal forms of Q (x := C_1) P̊′_i τ. These can be obtained since either
(1) x does not occur in P̊′_i, and then P̊″_i ≡ P̊′_i, or
(2) x occurs in P̊′_i, and then σ₁(Q (C_1) [x : D] P̊′_i τ) = |μC_1| ≤ σ₁(A) and |P̊′_i τ| < |F| (apply the induction on |F|).
Note that (C_1) P̊₁ [x : D] P̊″_k is an ineffective β-chain.
As to (2) (iii): C_1 can be P̊₀ [x : M] N. If x occurs in N, then the proof is similar to that of Th. 2.16. If x does not occur in N, we can take [x : M] as part of an ineffective β-chain (K_1) P̊″₀ P̊₀ [x : M], and look at the structure of N instead of that of C_1. This amounts to looking for the first "effective" abstractor in N. If there is such an abstractor and the obtained expression is not yet in β₁-normal form, induction is applicable as in Th. 2.16. If not, we have already obtained β₁-normal form.
As to (2) (iv): E ≡ P̊₀ [y : H_1] H_2. If y occurs in H_2 (so y occurs in K_2), the proof is obvious. If not, look at K_2 instead of E, as in the previous case. □
Theorem 3.10 (β₁-normalization theorem). If A ∈ Λ is μ-normable, then A is β₁-normalizable.
Proof. Induction on the length of proof of A ∈ Λ, analogously to the proof of Th. 2.17. As to case (4) of this proof, the only case worth mentioning is C′ ≡ P̊ [x : D] E. If x occurs in E, then Th. 3.9 yields the desired result; if x does not occur in E, we have already obtained β₁-normal form. □

Theorem 3.11 (β₁-normalization theorem for Λ). If A ∈ Λ, then A is β₁-normalizable.
Proof. Follows from Th. 2.7 and Th. 3.10. □
In fact we have proved that A ∈ Λ is effectively β₁-normalizable.

Theorem 3.12. Let A ∈ Λ and A ≥β₁ B. Then |A| ≤ |B|.
Proof. Induction on the length of proof of A ≥β₁ B. □
Definition 3.13. Let A ∈ Λ. We write β₁-nf A for the β₁-normal form of A which we obtain from the effective computation as suggested by Th. 3.10 and used in Th. 3.11. □
Note. This β₁-normal form is unique (Th. 3.3).

Definition 3.14. We call K ∈ Λ strongly β-normalizable if there is an upper bound for the length l of reduction sequences K ≡ K₁ ≥¹β K₂ ≥¹β ... ≥¹β K_l. Analogously we define the concepts strong β₁-, β₂- or η-normalizability of K. □
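Strong normalizability is genuinely stronger than normalizability. In the typed system Λ strong normalization will be proved outright (Th. 3.28), but in the untyped λ-calculus a term can be normalizable without being strongly normalizable. The classical counterexample is (λx.y) Ω with Ω ≡ (λx.(x)x)(λx.(x)x): contracting the outer redex yields the normal form y, yet contracting inside Ω forever gives an infinite reduction sequence. A sketch:

```python
# Untyped counterexample: normalizable but not strongly normalizable.
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def subst(term, x, s):
    """Capture-naive substitution term[x := s]."""
    tag = term[0]
    if tag == 'var':
        return s if term[1] == x else term
    if tag == 'lam':
        _, y, body = term
        return term if y == x else ('lam', y, subst(body, x, s))
    _, f, a = term
    return ('app', subst(f, x, s), subst(a, x, s))

def contract_root(term):
    """Contract a beta-redex sitting at the root of the term."""
    _, f, a = term
    assert f[0] == 'lam'
    return subst(f[2], f[1], a)

omega = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
Omega = ('app', omega, omega)                    # the looping term
term = ('app', ('lam', 'x', ('var', 'y')), Omega)

# One outer step reaches the normal form y ...
assert contract_root(term) == ('var', 'y')
# ... but the inner redex reproduces itself: an infinite reduction sequence.
assert contract_root(Omega) == Omega
```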
Theorem 3.15 (strong β₂-normalization theorem for Λ). If A ∈ Λ, then A is strongly β₂-normalizable.
Proof. Induction on |A|. □

Definition 3.16. Let A ∈ Λ, and let A ≡ A₁ ≥¹β₂ A₂ ≥¹β₂ ... ≥¹β₂ A_p be the longest possible sequence of single-step β₂-reductions beginning with A. Then Θ₂(A) = p. □
Note that p ≤ |A|.

Theorem 3.17. If A ∈ Λ and A ≥β₁ B, then Θ₂(β₁-nf A) = Θ₂(β₁-nf B).
Proof. Follows from Th. 3.3. □
Theorem 3.18. If A ≥¹β₁ B, then Θ₂(A) < Θ₂(B).
Proof. Induction on the length of proof of A ≥¹β₁ B. The only interesting case is A ≡ Q (C) P [x : D] E ≥¹β₁ Q (C) P [x : D] (x := C) E ≡ B, where, indeed, we have at least one single-step β₂-reduction more on the right hand side. The rest of the proof is easy. □
Corollary 3.19. If A ∈ Λ, then Θ₂(A) ≤ Θ₂(β₁-nf A). □
Theorem 3.20 (strong β₁-normalization theorem for Λ). If A ∈ Λ, then A is strongly β₁-normalizable.
Proof. Follows from Th. 3.17, Th. 3.18 and Cor. 3.19. □
Definition 3.21. Let A ∈ Λ, and let A ≡ A₁ ≥¹β₁ A₂ ≥¹β₁ ... ≥¹β₁ A_p be the longest possible sequence of single-step β₁-reductions beginning with A. Then Θ₁(A) = p. □
Theorem 3.22. Let A ∈ Λ, let A be in β₁-normal form and let A ≥β₂ B. Then B is also in β₁-normal form.
Proof. If B were not in β₁-normal form, then B ≥¹β₁ C for some C. In that case there would be a reduction A ≥¹β₁ B′ ≥ C according to Th. II.6.17. Contradiction. □
Theorem 3.23. If A ∈ Λ, then there is an upper bound for the length l of reduction sequences A ≡ A₁ ≥¹ A₂ ≥¹ ... ≥¹ A_l, where each A_i ≥¹ A_{i+1} is a single-step β₁- or β₂-reduction.
Proof. Induction on Θ₁(A). If Θ₁(A) = 0, then A is in β₁-normal form. If we can apply β₂-reductions on A such that A ≥β₂ B (in n ≥ 1 steps), then B is also in β₁-normal form (Th. 3.22). The number of possible single-step β₂-reductions applicable is finite (≤ Θ₂(A); cf. Th. 3.15).
So let Θ₁(A) = p > 0, and assume that the theorem holds for all K with Θ₁(K) < p. Let A ≥ D be a reduction sequence consisting of single-step β₁- and β₂-reductions. If no β₁-reductions occur in the reduction sequence, the length of the reduction sequence can be at most Θ₂(A). Else, let A ≥ D be A ≥β₂ B ≥¹β₁ C ≥ D. Then by Th. II.6.17 there is also a reduction sequence A ≥¹β₁ B′ ≥ C ≥ D. Each B″ such that A ≥¹β₁ B″ has by induction (since Θ₁(B″) < Θ₁(A)) an upper bound for the length of reduction sequences B″ ≥¹ ... ≥¹ E in which each single-step reduction is either a β₁- or a β₂-reduction. Let m be the maximum of these upper bounds. Then the length of the reduction sequence B′ ≥ C ≥ D, hence of C ≥ D, cannot be more than m. It follows that the length of any reduction sequence A ≥ D can be at most Θ₂(A) + m + 1. □
Theorem 3.24 (strong β-normalization theorem for Λ). If A ∈ Λ, then A is strongly β-normalizable.
Proof. Each β-reduction sequence of A can be decomposed into single-step β₁- and β₂-reductions by Th. II.6.15. So Th. 3.23 yields the desired result. □
Definition 3.25. Let A ∈ Λ, and let A ≡ A₁ ≥¹β A₂ ≥¹β ... ≥¹β A_p be the longest possible sequence of single-step β-reductions beginning with A. Then Θ(A) = p. □

Theorem 3.26 (strong η-normalization theorem for Λ). If A ∈ Λ, then A is strongly η-normalizable.
Proof. Induction on |A|. □
Definition 3.27. We call K ∈ Λ strongly normalizable if there is an upper bound for the length l of reduction sequences K ≡ K₁ ≥¹ K₂ ≥¹ ... ≥¹ K_l, where each reduction K_i ≥¹ K_{i+1} is a single-step β- or η-reduction. □

Theorem 3.28 (strong normalization theorem for Λ). If A ∈ Λ, then A is strongly normalizable.
Proof. Induction on Θ(A). The proof is similar to that of Th. 3.23. Use Th. 3.26 instead of Th. 3.22, and instead of Th. II.6.17 use the theorem: if K ∈ Λ and K ≥¹η L ≥¹β M, then K ≥¹β L′ ≥ M. The latter theorem is easy to prove, since each reduction A ≥¹η B ≥¹β C can be replaced either by a reduction A ≥β B′ ≥¹η C (with r ≥ 0 β-steps) or by a reduction A ≥¹β B′ ≥η C (see the discussion after Th. 7.18; see also Th. 7.25). □
Big Trees in a λ-Calculus with λ-Expressions as Types*

R.C. de Vrijer
0. Outline
The abstract term system AX studied in this paper is a close relative of the Automath family of languages. In the investigation of normalization and decidability properties of these languages, AX came up as a natural generalization of AUT-QE, the language currently in use for mechanical proof checking at the Automath project in Eindhoven. For introductory reference, see [van Daalen 73 (A.3)].
The introduction, Section 1, is an informal account of the system AX and its relation to other systems. The formal description of AX is given in Sections 2 and 3. In Section 4 the main results are stated, mostly without proof. Section 5 is devoted to proving that the big trees are well founded (BT).
1. Introduction

1.1. Heuristic description
Before describing the main results of the paper we make a few heuristic comments, especially on the generalized type structure involved. Here we use the "formulas-as-types" notion for interpreting mathematical statements and proofs, which originated independently in [de Bruijn 70a (A.2)] and [Howard 80] (the term comes from Howard). Further references are given in 1.4.
1.1.1. Type structure
To illustrate the transition from the type structure of traditional type theory, e.g. the typed λ-calculus exhibited in [Hindley et al. 72], to the types we have here, we consider constructive versions of propositional and predicate logic respectively. If we identify a proposition α with the type of its constructions (or

*Reprinted from: Böhm, C., ed., λ-Calculus and Computer Science Theory, p. 252-271, by courtesy of Springer-Verlag, Heidelberg.
proofs), then the implication α → β will be the type of constructions that map constructions of α to constructions of β. That is, α → β corresponds essentially to the Cartesian power β^α. In predicate logic a construction c of ∀x.P(x) will map any object t from the domain of quantification α to a construction of the proposition P(t). Hence the type of c(t) depends on the choice of t. The notion of power doesn't suffice any longer; we need that of Cartesian product: Π_{x∈α} P(x).
1.1.2. Abstraction and application, two interpretations
Automath exploits the formal similarity between two kinds of abstraction: functional abstraction to form the functionlike construction λx ∈ α.c(x), and the product construction Π_{x∈α} P(x). It is convenient to unify these principles in the notations [x : α] c(x) and [x : α] P(x), respectively. Observe that now functional application in the former case corresponds to specification of a coordinate axis in the latter. Also here we use the same notations: (t) [x : α] c(x) and (t) [x : α] P(x), which reduce to c(t) and P(t), respectively.
Now this uniform syntactical treatment of both kinds of abstraction, very convenient for our purposes, may cause some confusion in interpretation. For example, vis à vis the formula-type analogy it amounts to using the same notation for both the predicate, i.e. "propositional function", λx ∈ α.P(x) and its universal quantification
∀x ∈ α.P(x).

1.1.3. Supertypes, type inclusion
We further introduce the constant type as a "supertype" of types. Then, e.g., [x : α] type will be the supertype of all those types β such that, whenever t is an element of type α (notation: t E α), (t) β is a meaningful type. Hence, carrying on the example from 1.1.2, we have [x : α] P(x) E [x : α] type. Moreover, because of the possibility of interpreting [x : α] P(x) as a proposition ∀x ∈ α.P(x), we require that [x : α] P(x) E type. This motivates the facility in AX (and in AUT-QE) to pass here from [x : α] type to type, known as the principle of type inclusion: [x : α] type ⊂ type (cf. [van Daalen 73 (A.3)], [de Bruijn 70a (A.2)] and 3.5.2 below). In order to clarify this slightly ambiguous situation one could for the product construction introduce the Π's again, and obtain Π [x : α] P(x) E type for the product and [x : α] P(x) E Π [x : α] type for the type-valued function, respectively (cf. [Zucker 77 (A.4)]).
1.1.4. AX-theories Expressions are built up by using the principles of abstraction and application mentioned above, starting from variables, parameters and constants. A
Big trees in a A-calculus (C.4)
471
particular choice of the constants and their (super)type assignments will depend on the interpretation one has in mind. Such a choice is formally fixed by a base (cf. 3.1). Each base determines a specific AX-theory.
In informal mathematics new notions are always introduced in a context, possibly indicated by the presence of certain parameters and assumptions. This observation is reflected in AX by the fact that constants are allowed to depend on parameters. We now illustrate the treatment of constants in AX and the parameter mechanism involved.
Let C₁(α, p) be a type constant, to be interpreted as the proposition ∃x ∈ α.(x) p, where p is supposed to represent a predicate on the type α. Informally introducing C₁(α, p) one might stipulate:
(1) "Let P be a type, and Q be a predicate on P. Then we will consider C₁(P, Q) as a proposition."
In AX the (super)types of parameters are indicated by superscripts, and hence the corresponding axiom reads:
(2) C₁(P^type, Q^{[x : P] type}) E type.
The rule of existence introduction can now be formalized by adding another constant C₂(α, p) together with the axiom
(3) C₂(P^type, Q^{[x : P] type}) E [x : P] [y : (x) Q] C₁(P, Q).
When actually given α E type and p E [x : α] type, the statements C₁(α, p) E type and C₂(α, p) E [x : α] [y : (x) p] C₁(α, p) can now be obtained as instances of (2) and (3), respectively. Moreover, for objects t E α and s E (t) p, application and β-reduction yield: (s) (t) C₂(α, p) E C₁(α, p). For further explanation on the subject of interpretation we refer to the treatment of AUT-QE in [van Daalen 73 (A.3)] and to [van Benthem Jutting 77].
[Note. We can be somewhat more explicit on the relation between the formats of AX and AUT-QE. Axioms like (2) and (3), which in an AX-theory are given by a base, correspond to PN-lines in an AUT-QE book:
      *  P   := —   : type
  P   *  Q   := —   : [x : P] type
  Q   *  C₁  := PN  : type
  Q   *  C₂  := PN  : [x : P] [y : (x) Q] C₁(P, Q)
In this manner an AX-theory (or rather, its base) corresponds to an AUT-QE book in which all constants are introduced as primitive notions. Vice versa, each such AUT-QE book gives rise to an AX-theory. Defined constants are not considered in AX.]
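The instantiation step at the end of 1.1.4 can be written out in two β-steps (a sketch: it uses the application rule (I)(e) of 3.2.1 together with the conditional β-rule, under the hypotheses t E α and s E (t) p):

```latex
(t)\,C_2(\alpha,p) \;E\; (t)\,[x:\alpha]\,[y:(x)\,p]\,C_1(\alpha,p)
   \;\to\; [y:(t)\,p]\,C_1(\alpha,p),
\qquad\text{hence}\qquad
(s)\,(t)\,C_2(\alpha,p) \;E\; (s)\,[y:(t)\,p]\,C_1(\alpha,p)
   \;\to\; C_1(\alpha,p).
```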
1.2. Applicability
Usually, in type theory as in the Automath languages, term application is subjected to the applicability condition: (t) f is a term iff there are types α and β such that t E α and f E [x : α] β. Now in typed λ-calculus this condition is easy to formulate. The type structure and the assignments of types to terms are given in advance, i.e. all of the syntax precedes the generation of theorems. In our case, however, types depend on objects and the type assignments are themselves treated as theorems in AX. Hence here the applicability condition would make derivability interfere with term formation. A common way of dealing with this complication (cf. Automath, Martin-Löf, etc.) is to generate the terms (including the types) simultaneously with the theorems. By contrast we take the approach of allowing unrestricted application in AX, but instead now subjecting the rule of β-reduction
(4) (t) [x : α] c → c [x := t]
to the condition t E α. We can then formulate an applicability condition by referring to derivability in AX and so define the set of legitimate terms. The legitimate fragment AX - 1 of AX is the system one obtains by restricting AX to the language containing only legitimate terms. Hence AX - 1 may be considered as the part of AX that is significant for interpretation. (Though, of course, the illegitimate terms do have a computational interpretation in the term model.) The justification for the above sketched procedure lies in the following result:
(5) AX is a conservative extension of AX - 1.
This property may be regarded as a soundness criterion for the notion of legitimacy as defined above, and hence for AX: if the equality of two significant (read: legitimate) terms can be proved in AX, it can be done using only significant terms. The proof of (5) uses the result on "big trees" described below.

1.3. Decidability, big trees
We now turn to a second desirable property of the systems:
(6) AX and hence AX - 1 are decidable.
Decidability of the typed λ-calculus is an easy corollary of the strong normalization property (SN) and the Church-Rosser property (CR). Every term reduces effectively to its normal form (nf), and two terms are equal iff their nf's are identical. However, although both SN and CR go through for AX, they are not sufficient for the decidability, as we will now explain. In the discussion we make use of an effective function τ, which assigns canonically to every object a type such that t E τ(t). Then, since we have uniqueness of types (cf. 3.3.5):
(7) t E α ⇒ τ(t) = α
(where by CR, = is equivalent to having the same nf). So, in order to see if (4) holds, we must first determine if τ(t) and α have the same nf (by (7)). Then in the process of reducing these terms questions of the form (4) may arise again, and so on. To deal with this problem, we proceed as follows. Let →bt be the improper reduction relation generated by
(i) usual β-reduction,
(ii) applying τ,
(iii) taking proper subterms.
Call the tree of →bt-reduction sequences of a term Σ the "big tree" of Σ. Then we prove instead of SN the stronger property:
(BT)
big trees of terms in AX are well founded.
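The branches of a big tree can be pictured concretely. The sketch below does so for plain untyped λ-terms, using only generators (i) and (iii) of the →bt relation; the type-assigning function τ of (ii) is omitted, since it needs the full machinery of AX. For a strongly normalizing term the recursion terminates and computes the height of this pruned big tree.

```python
# Sketch: the "big tree" of a plain untyped term, pruned to generators
# (i) beta-reduction and (iii) proper subterms; tau is omitted.
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def subst(term, x, s):
    """Capture-naive substitution term[x := s]."""
    tag = term[0]
    if tag == 'var':
        return s if term[1] == x else term
    if tag == 'lam':
        _, y, body = term
        return term if y == x else ('lam', y, subst(body, x, s))
    _, f, a = term
    return ('app', subst(f, x, s), subst(a, x, s))

def reducts(t):
    """All one-step beta-reducts of t."""
    out = []
    if t[0] == 'app':
        f, a = t[1], t[2]
        if f[0] == 'lam':
            out.append(subst(f[2], f[1], a))
        out += [('app', r, a) for r in reducts(f)]
        out += [('app', f, r) for r in reducts(a)]
    elif t[0] == 'lam':
        out += [('lam', t[1], r) for r in reducts(t[2])]
    return out

def proper_subterms(t):
    if t[0] == 'app':
        return [t[1], t[2]] + proper_subterms(t[1]) + proper_subterms(t[2])
    if t[0] == 'lam':
        return [t[2]] + proper_subterms(t[2])
    return []

def big_tree_height(t):
    """Terminates exactly when this pruned big tree is well founded."""
    children = reducts(t) + proper_subterms(t)
    return 1 + max(map(big_tree_height, children)) if children else 0
```

For (λx.x) y the height is 2: one branch steps to the reduct y, another descends into the subterm λx.x and then into its body.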
Together with CR this result easily implies the decidability. Further, as mentioned above, it is also used in the proof of (5). In his thesis Nederpelt [Nederpelt 73 (C.3)] stated as a conjecture for his system Λ the closure property:
Legitimate terms reduce to legitimate terms.
It turns out that B T (for A) implies (8). Further it seems that BT can be proved for A by a method, similar to the one used here. (Note that by contrast (8) for AX is a simple consequence of the formulation of the system and its proof does not require BT.) We feel that, apart from the applications described, BT may have some interest on its own. 1.4. Historical remarks
The first proof of normalization of an Automath system was given in [van Benthem Jutting 71b (C.l)]. Nederpelt (Nederpelt 73 (C.3)] proved strong normalization for his system A. He made two conjectures: the above mentioned closure property for A and CR for the system with 77-reduction. The latter conjecture was proved in [van Daalen 80, (C.5)]. The result is assumed in this paper. Scott [Scott 701 suggested to use the ideas of de Bruijn [de Bruijn 70a (A.2)] for the formalization of an intuitionistic theory of constructions. At about the same time Howard [Howard 801 came up with similar ideas. The line is pursued
R.C. de Vrijer
474
in [Martin-Lof 75aJ. His theory of types is claimed t o be a natural framework for intuitionistic mathematics. The different accents in motivation - Automath more practical, Martin-Lof more philosophical - might be responsible for some of the differences in the investigated systems.
2. The language, expressions In this paragraph we specify the language of a AX-theory. ‘This language is affected by the choice of a base (cf. 3.1). A similarity type (defined below) codes the information, which is relevant for the formation of expressions. Hence for each similarity type s we define the language 12,. 2.1. Alphabet All formal symbols used are from the alphabet consisting of the symbols for variables parameters const ants binary relations
...
2,? 2, I,
P, Q,R, ... C1, Cz, C,, ... and type
=, +, ++, -+, E , (, ), (, ), ,.
and the auxiliary symbols [, 1, Variable symbols will be indexed by types to become (object-) variables, parameter symbols by types and supertypes to become object- and type-parameters, respectively. The set of variables is assumed to be such that whenever needed, we are able to choose uniquely a “new” variable of the desired type, not yet occurring in the context. The enumeration of the constant symbols is meant to show the order in which they can be introduced in a particular interpretation (cf. 1.1.4 and the notions of date and base). In Automath this would be the order in which they appear in a “book” (cf. [ v a n Daalen 73 (A.3)]). 2.2. Similarity type
A similarity type s is a triple (SO,Sl,o), where So and S1 are disjoint sets ’ to (0, l}”,the set of finite of natural numbers and o is a function from SOU 51 (possibly empty) sequences of zeros and ones. Here SOindicates the set of constant symbols used for object-constants, S1 the set of constant symbols used for type-constants and if i E SOU 4 ,then o(i) determines the positions of object- and type-parameters of Ci (cf. 2.3.1 (ii)).
2.3. Expressions The expressions fall into three sorts: objects, types and supertypes. These are simultaneously defined in 2.3.1. In the definition we use already the notion of closed expression, to be defined in 2.3.4. However, it is clear that the definitions could have been given simultaneously.
2.3.1. Definition. Given a similarity type s, the sets of variables, parameters, constants, objects, types and supertypes, building together the set E_s of expressions, are defined by simultaneous induction.
(i) If x is a variable symbol, P a parameter symbol, α a type, β a closed (cf. 2.3.4) type and β* a closed supertype, then x^α is a variable, P^β is an object-parameter and P^{β*} is a type-parameter.
(ii) Let o(i) = δ₁, ..., δ_n and let Σ₁, ..., Σ_n be expressions such that Σ_j is an object if δ_j = 0 and Σ_j is a type if δ_j = 1; then C_i(Σ₁, ..., Σ_n) is an object-constant if i ∈ S₀ and a type-constant if i ∈ S₁.
(iii) Variables, object-parameters and object-constants are atomic objects. Type-parameters and type-constants are atomic types, and type is the only atomic supertype.
(iv) If f and t are objects, α and β are types, α* is a supertype and x^α a variable, then (t) f and [x^α : α] t are objects, (t) β and [x^α : α] β are types, and (t) α* and [x^α : α] α* are supertypes. □
2.3.2. Conventions
As syntactical variables we use Σ, Γ, ... for expressions in general, f, g, t, s, ... for objects, α, β, ... for types and α*, β*, ... for supertypes. The symbols for variables, parameters and constants are used themselves as syntactical variables for their respective categories as well. As long as no confusion arises we will freely add and omit indexes. In particular the superscripts of variables and parameters are suppressed where possible; e.g. we write [x : α] x instead of [x^α : α] x^α.
Vectorial notation is introduced for sequences of expressions; e.g. ᾱ is short for the sequence α₁, ..., α_n, where the number n is either known or not essential. As = is a symbol of the language, we use ≡ for syntactic equality between expressions.
Now follow some more technical and notational definitions concerning expressions.
2.3.3. Complexity, length and date
According to Definition 2.3.1 each expression has a construction, easily seen to be unique, consisting of a finite number of applications of the rules (i) to (iv). The complexity c(Σ) of an expression Σ is the number of steps in its construction. By induction on c(Σ) we define two more measures on Σ: its length l(Σ) and date d(Σ).

l(Σ) = 1 if Σ is either a variable, a parameter or type;
l(C(Σ₁, ..., Σ_n)) = max(l(Σ₁), ..., l(Σ_n)) + 1;
l((t) Γ) = max(l(t), l(Γ)) + 1;
l([x : α] Γ) = max(l(α), l(Γ)) + 1.

d(type) = 0;
d(x^α) = d(α);
d(P^Γ) = d(Γ);
d(C_i(Σ₁, ..., Σ_n)) = max(i, d(Σ₁), ..., d(Σ_n));
d((t) Γ) = max(d(t), d(Γ));
d([x : α] Γ) = max(d(α), d(Γ)).

Notice that d(Σ) is the greatest natural number i such that C_i appears in the construction of Σ.

2.3.4. Free variables, parameters, special variables
By induction on l(Σ) we define the sets FV(Σ) of free variables and Par(Σ) of parameters of Σ.
FV(type) = ∅ ; Par(type) = ∅
FV(x) = {x} ; Par(x) = ∅
FV(P) = ∅ ; Par(P) = {P}
FV(C(Σ₁, ..., Σ_n)) = ∪_{i≤n} FV(Σ_i) ; Par(C(Σ₁, ..., Σ_n)) = ∪_{i≤n} Par(Σ_i)
FV((t) Γ) = FV(t) ∪ FV(Γ) ; Par((t) Γ) = Par(t) ∪ Par(Γ)
FV([x : α] Γ) = FV(α) ∪ (FV(Γ) \ {x}) ; Par([x : α] Γ) = Par(α) ∪ Par(Γ).

The set SV(Σ) of special variables of Σ is defined as
SV(Σ) = ∪_{x^α ∈ FV(Σ)} FV(α).

An expression Σ is called closed if FV(Σ) = ∅. For a sequence Σ₁, ..., Σ_n we introduce the notation F(Σ̄) = ∪_{i≤n} F(Σ_i), where F is FV, SV or Par.
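The measures l and d of 2.3.3 and the set FV of 2.3.4 translate directly into code. The sketch below uses an assumed tuple representation of expressions (variables carry their type as a superscript; by convention a constant with an empty argument list gets length 1):

```python
# Sketch of l (length), d (date) and FV over an assumed representation:
# ('type',) | ('var', x, alpha) | ('par', P, sigma)
# | ('const', i, [args]) | ('appl', t, body) | ('abst', x, alpha, body)

def length(e):
    tag = e[0]
    if tag in ('type', 'var', 'par'):
        return 1
    if tag == 'const':
        return 1 + max((length(a) for a in e[2]), default=0)
    if tag == 'appl':
        return 1 + max(length(e[1]), length(e[2]))
    return 1 + max(length(e[2]), length(e[3]))           # abst

def date(e):
    """Greatest index i such that C_i appears in the construction of e."""
    tag = e[0]
    if tag == 'type':
        return 0
    if tag in ('var', 'par'):
        return date(e[2])                                # d(x^a) = d(a)
    if tag == 'const':
        return max([e[1]] + [date(a) for a in e[2]])
    if tag == 'appl':
        return max(date(e[1]), date(e[2]))
    return max(date(e[2]), date(e[3]))                   # abst

def free_vars(e):
    tag = e[0]
    if tag in ('type', 'par'):
        return set()
    if tag == 'var':
        return {e[1]}
    if tag == 'const':
        return set().union(*[free_vars(a) for a in e[2]]) if e[2] else set()
    if tag == 'appl':
        return free_vars(e[1]) | free_vars(e[2])
    return free_vars(e[2]) | (free_vars(e[3]) - {e[1]})  # abst binds e[1]
```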
2.3.5. Proper subexpressions
The relation ⊐ (contains as a proper subexpression) between expressions is the smallest transitive relation such that C(Σ₁, ..., Σ_n) ⊐ Σ_i; (t) Γ ⊐ t; (t) Γ ⊐ Γ; [x : α] Γ ⊐ α and [x : α] Γ ⊐ Γ.

2.3.6. Simultaneous substitution
Let P₁, ..., P_m and x₁, ..., x_n be sequences of distinct parameters and variables. And let t₁, ..., t_n be a sequence of objects and Σ₁, ..., Σ_m a sequence of expressions, such that Σ_i is of the same sort (i.e. object or type) as P_i. Then the result Σ[P̄, x̄ := Σ̄, t̄] of simultaneous substitution of Σ̄, t̄ for P̄, x̄ is defined by induction on l(Σ). In the definition we abbreviate Γ[P̄, x̄ := Σ̄, t̄] by Γ′.
x_i′ ≡ t_i (1 ≤ i ≤ n) and x′ ≡ x if x ∉ {x₁, ..., x_n};
P_i′ ≡ Σ_i (1 ≤ i ≤ m) and P′ ≡ P if P ∉ {P₁, ..., P_m};
(C(Σ₁, ..., Σ_k))′ ≡ C(Σ₁′, ..., Σ_k′);
type′ ≡ type;
((t) Γ)′ ≡ (t′) Γ′;
([x : α] Γ)′ ≡ [y : α′] (Γ[x := y])′, where y is a new variable.

By Γ̄′ ≡ Γ̄[P̄, x̄ := Σ̄, t̄] we denote the sequence Γ₁′, ..., Γ_k′.
2.3.7. α-equivalence
The relation ≡α of α-equivalence between expressions is the smallest equivalence relation such that, if Σ ≡α Γ, s ≡α t and α ≡α β, then also x^α ≡α x^β, P^Σ ≡α P^Γ, C(Δ₁, ..., Σ, ..., Δ_n) ≡α C(Δ₁, ..., Γ, ..., Δ_n), (t) Σ ≡α (s) Γ, and, if y ∉ FV(Γ), then [x : α] Σ ≡α [y : α] Γ[x := y].
In the sequel we shall simply identify α-equivalent expressions. Formally one might pass to α-equivalence classes, considering an expression as merely a name denoting the class it belongs to, and show that the preceding definitions behave well with respect to ≡α. In some places names of bound variables will be tacitly assumed to be chosen such that no "clashes" arise.
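The identification of α-equivalent expressions can be made algorithmic. The sketch below checks α-equivalence for plain λ-terms (the type superscripts of the paper are omitted) by comparing binder positions instead of bound names, in the style of de Bruijn indices:

```python
# Sketch: alpha-equivalence for plain lambda terms, ignoring the type
# superscripts of the paper. Bound variables are compared by the depth
# of their binder; free variables must match by name.
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def alpha_eq(a, b, env_a=(), env_b=()):
    if a[0] != b[0]:
        return False
    if a[0] == 'var':
        if a[1] in env_a or b[1] in env_b:
            # both must be bound, by binders at the same depth
            return (a[1] in env_a and b[1] in env_b
                    and env_a.index(a[1]) == env_b.index(b[1]))
        return a[1] == b[1]                    # both free: same name
    if a[0] == 'lam':
        # push the binders; the innermost binder sits at index 0
        return alpha_eq(a[2], b[2], (a[1],) + env_a, (b[1],) + env_b)
    return (alpha_eq(a[1], b[1], env_a, env_b)
            and alpha_eq(a[2], b[2], env_a, env_b))
```

alpha_eq identifies λx.x with λy.y but distinguishes λx.x from λx.y.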
2.3.8. Lemma. Let {Q₁, ..., Q_m} ∩ Par(Σ̄, t̄) = {y₁, ..., y_n} ∩ FV(Σ̄, t̄) = ∅. Then
Σ[Q̄, ȳ := Γ̄, s̄][P̄, x̄ := Σ̄, t̄] ≡ Σ[P̄, x̄ := Σ̄, t̄][Q̄, ȳ := Γ̄[P̄, x̄ := Σ̄, t̄], s̄[P̄, x̄ := Σ̄, t̄]]. □
2.4. Formulas, language
Let s be a similarity type. Then the language L_s consists of the formulas: Σ = Γ (equals), Σ ↠ Γ (reduces to), Σ →⁺ Γ (properly reduces to), Σ → Γ (reduces in one step to) and Σ E Γ (has type or has supertype), where Σ and Γ are expressions in E_s. If R is a relation symbol, we write Σ₁, ..., Σ_n R Γ₁, ..., Γ_n (Σ̄ R Γ̄) for the sequence of formulas Σ₁ R Γ₁, ..., Σ_n R Γ_n.
3. AX-theories
3.1. Base
According to what has been said in 1.1.4, the set of axioms and rules of an AX-theory can be divided into two parts.
(i) A set characterizing the underlying system (the same for any AX-theory).
(ii) In addition, the assignments of types and supertypes to the relevant constants, determined by a base (defined below).
The situation may be compared to e.g. predicate logic, where one adds for each particular theory a set of mathematical axioms to the fixed framework of logical axioms and rules.
Now recall the example in 1.1.4. It involves an instance (1) of the general assumption scheme:
(*) "Let P₁ be a Σ₁, let P₂ be a Σ₂, ... and let P_n be a Σ_n. Then ... ."
In such a scheme it is assumed that the Σ_i's are well defined in the given context, which leads to the requirement that Σ_{i+1} should not contain free variables or parameters other than P₁, ..., P_i. This observation motivates the following definition.

3.1.1. Definition. A regular sequence of parameters (rsop) is a sequence P₁^{Σ₁}, ..., P_n^{Σ_n} of distinct parameters, where the Σ_i's are closed types or supertypes, and for 0 ≤ i < n we have Par(Σ_{i+1}) ⊆ {P₁, ..., P_i}. □
We now proceed to the definition of a base. Notice the requirements on dates, motivated by the remark made in 2.1.
Big trees in a λ-calculus (C.4)

3.1.2. Definition. A base Ω is a triple (s, p, τ'), where
(i) s is a similarity type (S0, S1, a).
(ii) p is an effective function from S0 ∪ S1 to rsop's, such that for all i ∈ S0 ∪ S1, if p(i) = P1^Σ1,...,Pn^Σn, then Ci(P1,...,Pn) ∈ E_s and max(d(Σ1),...,d(Σn)) < i.
(iii) τ' is an effective function from S0 to closed types in E_s and from S1 to closed supertypes in E_s, such that for i ∈ S0 ∪ S1, if p(i) = P1,...,Pn, then Par(τ'(i)) ⊆ {P1,...,Pn} and d(τ'(i)) < i. □
[Note. We can now fill in some more details of the correspondence between λΛ-theories, via a base Ω, and AUT-QE books (cf. the note to Section 1.1.4). This may also clarify the rather formal definitions. First, what is called here an rsop is just a context in AUT-QE. The constants Ci would in the AUT-QE book that corresponds to a base Ω be introduced in PN-lines, in the order of the indexes i. Then p(i) gives the context on which Ci is defined and τ'(i) determines the category (a type or a supertype). It could be remarked that Automath books are always finite, whereas for a base in λΛ there is no such restriction. However, for the language theory that makes no difference.]

3.2. Axioms and rules of λΛ[Ω]
3.2.1. Given a base Ω, the λΛ-theory λΛ[Ω] is formulated in the language L_s. The axioms and rules of λΛ[Ω] are the following.

(I) type assignment.
(a) x^α E α; P^Σ E Σ.
(b) Ci(Σ1,...,Σn) E τ'(i)[P1,...,Pn := Σ1,...,Σn], if i ∈ S0 ∪ S1 and p(i) = P1,...,Pn.
(c) α E type (type inclusion).
(d) Σ E Γ ⊢ [x : α]Σ E [x : α]Γ, provided x ∉ SV(Σ).
(e) Σ E Γ ⊢ (t)Σ E (t)Γ.
(II) one step reduction.
β-reduction: t E α ⊢ (t)[x : α]Σ → Σ[x := t].
η-reduction: f E β, β E [x : α]α* ⊢ [x : α](x)f → f, provided x ∉ FV(f);
β E [x : α]α* ⊢ [x : α](x)β → β, provided x ∉ FV(β).
monotonicity rules:
(a) Σ → Γ ⊢ C(Σ1,...,Σ,...,Σn) → C(Σ1,...,Γ,...,Σn).
(b) Σ → Γ ⊢ (t)Σ → (t)Γ; t → s ⊢ (t)Σ → (s)Σ.
(c) Σ → Γ ⊢ [x : α]Σ → [x : α]Γ, provided x ∉ SV(Σ).
(d) α → β ⊢ [x : α]Σ → [y : β]Σ[x := y], provided y ∉ FV(Σ).
(III) proper reduction, reduction and equality.
(a) Σ → Γ ⊢ Σ →⁺ Γ; Σ →⁺ Γ, Γ →⁺ Δ ⊢ Σ →⁺ Δ.
(b) Σ →⁺ Γ ⊢ Σ ↠ Γ; Σ ↠ Σ.
(c) Σ ↠ Γ ⊢ Σ = Γ; Σ = Γ ⊢ Γ = Σ; Σ = Γ, Γ = Δ ⊢ Σ = Δ.
(d) Σ = Γ, Δ E Σ ⊢ Δ E Γ; Σ = Γ, Σ E Δ ⊢ Γ E Δ.
3.2.2. Remarks.
(i) I(c) amounts to the principle of type inclusion (cf. 1.1.3 and 3.5).
(ii) The motivation of the restriction in I(d) is clear from the following example. Suppose one had [x : α]y^C(x) E [x : α]C(x). Then for arbitrary t E α by application and β-reduction y^C(x) E C(t), which is obviously not intended.
(iii) The restriction in II(c) excludes the possibility of both
(t)[x : α]((y^C(x))[z : C(x)]z) → (t)[x : α]y^C(x) → y^C(x) and
(t)[x : α]((y^C(x))[z : C(x)]z) → (y^C(x))[z : C(t)]z,
both in normal form, violating CR. □
In the sequel we assume an arbitrary base Ω to be fixed. By just stating a formula we mean that it is derivable in λΛ[Ω], for convenience further referred to as λΛ. Syntactical variables for expressions are supposed to range over E_s.
3.2.3. Lemma. The monotonicity rules II(a-d) hold also with → replaced by →⁺, ↠ or =. □
3.2.4. Now follows a rather technical definition, auxiliary to the important substitution lemma 3.2.5. Compare also Definition 3.1.1 (rsop's).
Definition. Given an expression Σ, a sequence P1^Σ1,...,Pm^Σm, x1^α1,...,xn^αn := Γ1,...,Γm, t1,...,tn is called a regular substitution sequence (rss) for Σ, if the following conditions are satisfied:
(i) Γi E Σi[P̄ := Γ̄] (1 ≤ i ≤ m).
(ii) ti E αi[P̄, x̄ := Γ̄, t̄] (1 ≤ i ≤ n).
(iii) If Q^Δ ∈ Par(Σ)\{P1,...,Pm}, then Par(Δ) ∩ {P1,...,Pm} = ∅.
(iv) If y^β ∈ FV(Σ)\{x1,...,xn}, then FV(β) ∩ {x1,...,xn} = ∅ and Par(β) ∩ {P1,...,Pm} = ∅.

It is easily verified that the conditions (iii) and (iv) are fulfilled if in particular:
- m = 0 and {x1,...,xn} ∩ SV(Σ) = ∅, or
- Par(Σ) ⊆ {P1,...,Pm} and FV(Σ) ⊆ {x1,...,xn},
and hence if Σ is closed.
3.2.5. Lemma. Let P̄, x̄ := Γ̄, t̄ be both an rss for Σ and for Γ, and let Σ R Γ, where R is →, →⁺, ↠, = or E. Then also Σ[P̄, x̄ := Γ̄, t̄] R Γ[P̄, x̄ := Γ̄, t̄].

Proof. Simultaneous induction on the length of deduction of Σ R Γ. □
3.3. Canonical type assignment, uniqueness of types
The assignment function τ' generates a function τ, which assigns canonically to each object a type and to each type a supertype, such that always Σ E τ(Σ).
3.3.1. Definition. τ(Σ) is defined by induction on l(Σ).
τ(x^α) ≡ α; τ(P^Γ) ≡ Γ;
τ(Ci(Σ1,...,Σn)) ≡ τ'(i)[P̄ := Σ̄] for i ∈ S0 ∪ S1, where p(i) = P1,...,Pn;
τ((t)Γ) ≡ (t)τ(Γ);
τ([x : α]Γ) ≡ [x : α]τ(Γ), where x is chosen such that x ∉ SV(Γ). □
3.3.2. Lemma. Σ E τ(Σ) holds for any object or type Σ.

Proof. Induction on l(Σ). □

3.3.3. Lemma. τ(C(Γ̄))[x̄ := t̄] ≡ τ(C(Γ̄[x̄ := t̄])).

Proof. Immediate by Lemma 2.3.9 and the definition of τ. □
3.3.4. Lemma. Let x̄ := t̄ be an rss for Σ, then τ(Σ[x̄ := t̄]) ≡ τ(Σ)[x̄ := t̄].

Proof. Induction on l(Σ). Use Lemma 3.3.3 in case Σ is a constant. □
3.3.5. Theorem (Uniqueness of types). t E α ⇔ α = τ(t).
Proof. One side is implied by Lemma 3.3.2. For the other side, prove by simultaneous induction on the length of deduction of t E α and t = s, respectively, the two statements t E α ⇒ α = τ(t) and t = s ⇒ τ(t) = τ(s). The proof makes use of the previous lemma. □

3.3.6. Remark. The analogous result for supertypes does not hold (cf. 3.5). However, in λΛ without Rule I(c) one would obtain Theorem 3.3.5 for supertypes as well. □

3.4. Legitimacy
In this section we define the set L of legitimate expressions. Then the legitimate fragment λΛ-l of λΛ is the theory obtained by restricting the axioms and rules of λΛ, to use only expressions from L.
3.4.1. Remark that L depends on the choice of Ω. We might call Ω a legitimate base if {Ci(p(i)) | i ∈ S0 ∪ S1} ∪ {τ'(i) | i ∈ S0 ∪ S1} ⊆ L. □

3.4.2. For the sake of the characterization of the legitimate expressions we now introduce a function τ*, assigning canonically to each expression a supertype.
Definition.
τ*(α*) ≡ α* for supertypes α*;
τ*(α) ≡ τ(α) for types α;
τ*(t) ≡ τ(τ(t)) for objects t. □

Remark. τ* may be compared to Typ* in [Nederpelt 73 (C.3)]. □
3.4.3. Definition. The set L of legitimate expressions is specified by defining, by induction on (d(Σ), c(Σ)) (i.e. ω·d(Σ) + c(Σ), cf. 5.2), what it means for an expression Σ to be legitimate.

x^α ∈ L iff α ∈ L; P^Σ ∈ L iff Σ ∈ L.
Ci(Σ̄) ∈ L iff Σ1,...,Σn, τ'(i) ∈ L and p(i) := Σ̄ is an rss for τ'(i).
(t)Γ ∈ L iff t, Γ ∈ L and for some α, α* we have t E α and τ*(Γ) = [x : α]α*.
[x : α]Γ ∈ L iff α, Γ ∈ L, provided x ∉ SV(Γ). □
3.4.4. Lemma. Let P̄, x̄ := Γ̄, t̄ be an rss for Σ1,...,Σn, respectively, and let ȳ := Σ̄ be an rss for the closed expression Σ. Then also ȳ := Σ̄[P̄, x̄ := Γ̄, t̄] is an rss for Σ.

Proof. Apply Lemma 3.2.5. □
3.4.5. Lemma. Let Σ, Γ1,...,Γm, t1,...,tn ∈ L and let P̄, x̄ := Γ̄, t̄ be an rss for Σ; then Σ[P̄, x̄ := Γ̄, t̄] ∈ L.

Proof. Induction on l(Σ). Use Lemmas 3.2.5 and 3.4.4. □

3.4.6. Theorem (Extended Closure). Let Σ ∈ L and let either Σ → Γ, or Σ ⊐ Γ, or τ(Σ) = Γ. Then also Γ ∈ L. (I.e. Σ ∈ L and Σ →bt Γ ⇒ Γ ∈ L.) □
3.5. Type inclusion, uniqueness of domains
The analogue of the uniqueness of types theorem for supertypes does not hold. E.g. we have both [x : α]β E [x : α]type and [x : α]β E type (cf. 1.1.3). However, one does obtain a weaker result, viz. uniqueness of domains:
α E [x : β]β* and α E [x : γ]γ* ⇒ β = γ.
This property is important as a justification for the above characterization of legitimate expressions. We state here without proof:
3.5.1. Theorem. α E [x : β]β* ⇒ for some supertype Δ*, τ(α) = [x : β]Δ*. □
In order to say something more on the structure of supertypes in λΛ-l, we define the relation ⊑ of type inclusion between supertypes in L.
3.5.2. Definition. First define the relation ⊂ between supertypes in L inductively by
(i) α* ⊂ type for any supertype α*.
(ii) If α* ⊂ β*, then also [x : α]α* ⊂ [x : α]β* and (t)α* ⊂ (t)β*.
Then ⊑ is the smallest transitive relation in L extending = and ⊂. □
3.5.3. Theorem. Let α, β, α*, β* ∈ L. Then
(i) α E α* and α E β* ⇒ α* ⊑ β* or β* ⊑ α*.
(ii) α E α* ⇒ τ(α) ⊑ α*. □
So τ assigns to a legitimate type its minimal legitimate supertype. Note that a supertype in L which is in normal form is always of the form [x1 : α1] ... [xk : αk]type.

4. Decidability and conservativity

4.1. Sequences, trees
We use σ, ρ, ... to range over, finite or infinite, sequences of expressions. We define lh(σ) to be the length of σ if σ is finite, lh(σ) = ∞ if σ is infinite. Σ will also stand for the sequence of length one, consisting of Σ only. If lh(σ) < ∞, then σ, ρ stands for the concatenation of σ and ρ. We define: σ < ρ (ρ extends σ) iff there exists a sequence τ, such that σ, τ = ρ.
4.1.1. Definition. A sequence Σ0, Σ1, ... is called a
(i) reduction sequence of Σ0 iff Σi → Σi+1,
(ii) rs-sequence of Σ0 iff either Σi → Σi+1 or Σi ⊐ Σi+1,
(iii) →bt-sequence of Σ0 iff either Σi → Σi+1 or Σi ⊐ Σi+1 or τ(Σi) = Σi+1. □
4.1.2. Definition. The finite reduction sequences of a term Σ form under the partial order < a tree, the reduction tree of Σ. Analogously we have the rs-tree and the →bt-tree of Σ. The latter is called the big tree of Σ. The set of →bt-sequences of Σ is denoted by S(Σ). Finally, B(Σ) = {Γ | Σ ↠bt Γ}. □

4.1.3. Definition. h(Σ) will be the height of the reduction tree of Σ: h(Σ) = max({lh(σ) | σ is a reduction sequence of Σ}). Analogously, b(Σ) = max({lh(σ) | σ ∈ S(Σ)}) is the height of the big tree of Σ. □
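When the one-step relation is finitely branching and terminating, the heights h and b of Definition 4.1.3 can be computed by a straightforward recursion; the big tree is obtained from the reduction tree simply by enlarging the successor relation. A toy Python sketch (the encoding is ours and only illustrates the definitions, not the calculus itself):

```python
# Height of the tree of finite sequences generated by a successor relation:
# the length of the longest sequence starting at x. 'succ' maps a node to the
# list of its one-step successors and is assumed terminating, so the
# recursion is well founded.

def height(x, succ):
    return 1 + max((height(y, succ) for y in succ(x)), default=0)

# The big tree admits extra kinds of steps (subexpression and tau-steps in
# the paper); its height is the same recursion over the enlarged relation.
def big_height(x, reduce_succ, extra_succ):
    return height(x, lambda y: reduce_succ(y) + extra_succ(y))
```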
4.2. Normal forms, strong normalization
An expression Σ is in normal form (nf) if there does not exist an expression Γ such that Σ → Γ.
An expression Σ is called strongly normalizable if h(Σ) < ∞, i.e., if the reduction tree of Σ is well founded.
4.3. Results
We now state the main results of the paper. The details of proofs are generally omitted. However, Section 5 will be devoted to sketching the proof of BT (Theorem 4.3.2).
4.3.1. Theorem (CR). If Σ = Γ, then there exists an expression Δ, such that Σ ↠ Δ and Γ ↠ Δ. □
A proof shall not be given here. Let it suffice to remark that in λΛ without the rule of η-reduction the property follows easily from the strong normalizability of λΛ. In the present situation, where η-reduction is included, the proof is more complicated. It was proved by van Daalen (cf. 1.4).
4.3.2. Theorem (BT). For every expression Σ, b(Σ) < ∞. I.e. big trees in λΛ are well founded. □
This result implies that every expression is strongly normalizable (SN). Moreover, by CR one obtains that for each Σ there exists a unique nf Γ, such that Σ = Γ. (In contrast to its use in "uniqueness of types", uniqueness is here to be understood with respect to =.) This unique expression will be denoted by nf(Σ).
4.3.3. Corollary. Given an expression Σ, its big tree can be effectively constructed.
Proof. Given the big trees of an object t and a type α, one can decide if t E α; viz. by merely checking if nf(τ(t)) ≡ nf(α). By this observation it is easy to devise an algorithm, which, when applied to an expression Σ, constructs the big tree of Σ, and which can be proved to be correct by induction on b(Σ). □

4.3.4. Corollary. λΛ is decidable. □
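The decision method behind Corollaries 4.3.3-4 rests on SN and CR: every expression has a unique normal form, so an equality is decided by normalizing both sides and comparing, and t E α reduces to comparing nf(τ(t)) with nf(α). The pattern is shown by the following Python sketch on a toy terminating and confluent rewrite system (integer sums); the encoding is ours and stands in for the calculus only by analogy:

```python
# Toy rewrite system: an expression is an int or ("add", e1, e2); the redex
# ("add", m, n) with both arguments ints contracts to m + n.

def step(e):
    # Return a one-step reduct of e, or None if e is in normal form.
    if isinstance(e, int):
        return None
    _, a, b = e
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    ra = step(a)
    if ra is not None:
        return ("add", ra, b)
    rb = step(b)
    return None if rb is None else ("add", a, rb)

def nf(e):
    while (r := step(e)) is not None:   # terminates by strong normalization
        e = r
    return e

def equal(e1, e2):
    # By CR the normal form is unique, so this decides convertibility.
    return nf(e1) == nf(e2)
```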
4.3.5. Let (Σ, Γ) ⊢ Δ R Δ' assert the existence of a deduction of Δ R Δ' in λΛ, in which occur only expressions from B(Σ) ∪ B(Γ).

Lemma (Transitivity). If Σ', Γ' ∈ B(Σ) ∪ B(Γ) and (Σ', Γ') ⊢ Δ = Δ', then (Σ, Γ) ⊢ Δ = Δ'. □
4.3.6. Definition. A new measure n(Γ) is defined by induction on b(Γ): n(Γ) is 1 plus the sum of n(Δ) over all Δ ∈ S'(Γ), where S'(Γ) = {Δ | Γ →bt Δ}. □
4.3.7. Theorem. Let Σ R Γ, where R is =, →, →⁺, ↠, or E. Then (Σ, Γ) ⊢ Σ R Γ.
Proof. Induction on n(Σ) + n(Γ). Let us restrict attention to equalities. If Σ and Γ are both in nf, then by CR, Σ ≡ Γ and we are done. So assume that Σ → Σ'. Then by the induction hypothesis and transitivity, (Σ, Γ) ⊢ Σ' = Γ.
Hence it is enough to show that (Σ, Γ) ⊢ Σ = Σ'. Now distinguish cases as to the last rule applied in a deduction of Σ → Σ'. We treat only one case.
Let Σ ≡ [x : α](x)f → f ≡ Σ' and τ*(f) = [x : α]α*. It must be shown that (Σ, Γ) ⊢ τ*(f) = [x : α]β* for some β* (cf. the rule of η-reduction and Theorem 3.5.1). By CR, τ*(f) and [x : α]α* have a common reduct [y : γ]γ*. Now n(α) + n(γ) < n(Σ) and n(τ*(f)) + n([y : γ]γ*) < n(Σ) imply that (Σ, Γ) ⊢ α = γ and (Σ, Γ) ⊢ τ*(f) = [y : γ]γ*, respectively, and consequently (Σ, Γ) ⊢ τ*(f) = [x : α]γ*. □
4.3.8. Corollary. λΛ is a conservative extension of λΛ-l.

Proof. By Theorem 4.3.7 and the Closure Theorem 3.4.6. □
5. Proof of the big tree theorem
The strategy of the proof of BT (Theorem 4.3.2) will be to define an extension λΛ-p of λΛ, by adding an extra rule of term formation for ordered pairs: if τ(Σ) = Γ, then [Σ, Γ] is an expression. A pair [Σ, Γ] may be considered as just a copy of Σ, the second component Γ being present only for bookkeeping reasons. The reduction relation is extended to include the projections [Σ, Γ] → Σ and [Σ, Γ] → Γ. Strong normalization of expressions in λΛ-p is proved by using a computability argument. Subsequently a map φ is defined, embedding λΛ in λΛ-p such that →bt-sequences in λΛ give rise to longer rs-sequences in λΛ-p. Termination of rs-sequences is an easy corollary of SN. Hence we may conclude that →bt-sequences in λΛ do terminate.
5.1. Introduction of λΛ-p
The base Ω, which was fixed under 3.2.2, is still assumed here. So λΛ-p will be in fact an extension of λΛ[Ω]. The definition of the set E-p of expressions of λΛ-p involves a "forget function" p from expressions of λΛ-p to expressions of λΛ, consistently deleting the second coordinates of pairs. (Hence p acts as the identity on expressions of λΛ.) The next two definitions should be taken as simultaneously defining the set E-p and the function p.
5.1.1. Definition. For the definition of E-p take clauses (i) to (iv) of the inductive definition of E (2.3.1) and add a fifth clause:
(v) If Σ and Γ are in E-p and τ(p(Σ)) = p(Γ) is deducible in λΛ, then [Σ, Γ] is an object if Σ is an object and a type if Σ is a type, respectively. □
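The forget function p of 5.1 simply keeps the first component of every pair and recurses through the remaining structure. A toy Python sketch on a hypothetical AST (the tags and node shapes are our own encoding, not the paper's):

```python
# Forget function: delete the bookkeeping second coordinates of pairs.
# Toy AST: a variable is a string; ("app", f, a) an application;
# ("abs", x, ty, body) an abstraction; ("pair", s, t) a pair [s, t].

def forget(e):
    if isinstance(e, str):
        return e                          # variable: unchanged
    tag = e[0]
    if tag == "pair":
        return forget(e[1])               # keep only the first component
    if tag == "app":
        return ("app", forget(e[1]), forget(e[2]))
    if tag == "abs":
        return ("abs", e[1], forget(e[2]), forget(e[3]))
    raise ValueError(tag)
```

On pair-free expressions forget acts as the identity, matching the remark that p is the identity on expressions of λΛ.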
5.1.3. The definitions, notations and conventions from Section 2.3 are generalized to E-p. In particular,
l([Σ, Γ]) = max(l(Σ), l(Γ)) + 1; d([Σ, Γ]) = max(d(Σ), d(Γ));
Par([Σ, Γ]) = Par(Σ) ∪ Par(Γ); FV([Σ, Γ]) = FV(Σ) ∪ FV(Γ);
[Σ, Γ] ⊐ Σ, [Σ, Γ] ⊐ Γ;
[Σ, Γ][P̄, x̄ := Δ̄, t̄] = [Σ[P̄, x̄ := Δ̄, t̄], Γ[P̄, x̄ := Δ̄, t̄]].
Substitution is only admitted if the substitution result is in E-p again, i.e., if the substitution does not violate the restriction in 5.1.1 (v). A sufficient condition for this requirement is given in 5.1.6 below.

5.1.4. The formulas of λΛ-p are defined as in 2.4.

5.1.5. The axioms and rules of λΛ-p are those of λΛ (cf. 3.2.1) and additionally

(II) projection:
(e) [Σ, Γ] → Σ; [Σ, Γ] → Γ. Σ → Δ ⊢ [Σ, Γ] → [Δ, Γ]; Γ → Δ ⊢ [Σ, Γ] → [Σ, Δ].

Remark that now, by projection, an expression may reduce to an expression of a different sort, i.e. an object to a type and a type to a supertype, respectively. For that reason a few obvious restrictions are to be made in some of the rules. In II(a) and III(d) we require Σ and Γ to be of the same sort. In II(b), t and s have to be both objects; in II(d), α and β have to be both types.

5.1.6. The definitions and results of Sections 3.2 and 3.3 are generalized to λΛ-p. Remark in particular that by Lemma 3.2.5 we obtain: If P̄, x̄ := Γ̄, t̄ is an rss for Σ in E-p, then Σ[P̄, x̄ := Γ̄, t̄] is in E-p again, and hence an admitted substitution. Add to Definition 3.3.1 the clause: τ([Σ, Γ]) ≡ τ(Σ).
5.2. Norms
The proof of SN for λΛ-p is essentially based on the method of proof originated in [Tait 67], and used e.g. in [Prawitz 71, Appendix A] for a system of natural deduction. The key notion of this method, computability (alternative terminologies: convertibility, validity, réductibilité), could be defined by induction on the length of type in [Tait 67] and on the length of the end formula of
a deduction in [Prawitz 71]. Here it is essential that the type of a term and the end formula of a deduction do not change under reduction of the term and the deduction, respectively. In our proof their task will be fulfilled by a norm on expressions γ(Σ). Auxiliary to its definition we first introduce the measure m(Σ).

Note. Pairs of natural numbers are supposed to be ordered lexicographically.

5.2.1. Definition. m(Σ) is defined by induction on (d(Σ), c(Σ)).
m(type) = 0; m(P^Γ) = m(Γ) + 1; m(x^α) = m(α) + 1;
m(Ci(Σ1,...,Σn)) = max(m(Σ1),...,m(Σn)) + m(τ'(i)) + 1;
m((t)Γ) = max(m(t), m(Γ)); m([x : α]Γ) = max(m(α), m(Γ)) and
m([Γ, Δ]) = max(m(Γ), m(Δ)). □
5.2.2. Lemma.
(i) If Σ is an atomic expression (not type), then m(τ(Σ)) < m(Σ).
(ii) For all objects and types Σ, m(τ(Σ)) ≤ m(Σ).
(iii) If Σ ⊒ Γ, then m(Γ) ≤ m(Σ). □
5.2.3. The norm γ(Σ) is going to be a, possibly empty, string of the brackets [ and ]. Let G, H, ... range over such strings. They are well ordered by <: G < H iff the number of brackets in G is less than the number of brackets in H. λ denotes the empty string.
Definition. γ(Σ) is defined by induction on (m(Σ), l(Σ)).
γ(type) = λ; γ(Σ) = γ(τ(Σ)) for other atomic Σ's;
γ([x : α]Γ) = [γ(α)]γ(Γ); γ((t)Γ) = H, where γ(Γ) = [G]H;
γ([Γ, Δ]) = γ(Γ).
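Since only the number of brackets in γ(Σ) matters for the well-ordering <, the norm can be pictured as a natural number. The following Python sketch does this for pure abstraction types over type (our simplification: constants, applications and pairs are ignored):

```python
# Bracket count of the norm gamma: gamma(type) is the empty string and
# gamma([x : a]b) = [gamma(a)] gamma(b), which contributes two brackets
# plus the brackets of a and of b. Toy encoding: "type" or ("abs", a, b).

def norm(ty):
    if ty == "type":
        return 0
    _, a, b = ty
    return 2 + norm(a) + norm(b)
```

The inequality used to justify Definition 5.3.1 is visible here: an argument of type α has norm γ(α), strictly below [γ(α)]γ(Γ), the norm of the abstraction consuming it.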
5.2.4. Lemma. If γ(ti) = γ(αi) (1 ≤ i ≤ n), then γ(Σ[x1^α1,...,xn^αn := t1,...,tn]) = γ(Σ). □
5.2.5. Lemma.
(i) If t E α, then γ(t) = γ(α).
(ii) If Σ = Γ, then γ(Σ) = γ(Γ).

Proof. Prove (i) and (ii) simultaneously by induction on the length of deduction in λΛ-p. Use Lemma 5.2.4. □
5.3. Computability
The notion of computability can now be defined by induction on γ(Σ).

5.3.1. Definition. An expression Σ is computable (comp) if both
(i) Σ is strongly normalizable;
(ii) whenever Σ ↠ [x : α]Γ and t E α and t is comp, then also Γ[x := t] is comp. □

The definition is correct. For if Σ ↠ [x : α]Γ and t E α, then γ(t) = γ(α) < [γ(α)]γ(Γ) = γ(Σ) and γ(Γ[x := t]) = γ(Γ) < γ(Σ).
5.3.2. Lemma.
(i) If Σ ↠ Γ and Σ is comp, then so is Γ.
(ii) Let Σ not have the form [x : α]Γ. Then Σ is comp iff all Σ1 such that Σ → Σ1 are comp.

Proof. Immediate by inspection of the definition. □

5.3.3. Lemma. If Σ1,...,Σn are comp, then so is C(Σ̄).

Proof. Induction on h(Σ1) + ... + h(Σn). □
5.3.4. Lemma. If both Σ and t are comp, then so is (t)Σ.

Proof. Induction on h(Σ) + h(t). Assuming that (t)Σ → Γ, prove that Γ is comp, and apply Lemma 5.3.2 (ii). Distinguish two cases:
(i) Either t → t1 and Γ = (t1)Σ, or Σ → Σ1 and Γ = (t)Σ1. Then Γ is comp by the induction hypothesis.
(ii) Σ = [x : α]Δ and Γ = Δ[x := t]. Then Γ is comp by clause (ii) of the Computability Definition 5.3.1. □
5.3.5. Lemma. If Σ and Γ are comp, then so is [Σ, Γ].

Proof. Again prove by induction on h(Σ) + h(Γ), that [Σ, Γ] → Δ implies that Δ is comp. □
5.3.6. Definition. Σ is called computable under substitution (cus) if for all comp expressions t1,...,tn and variables x1,...,xn such that x̄ := t̄ is an rss for Σ, Σ[x̄ := t̄] is comp. □

5.3.7. Theorem. All expressions Σ of E-p are cus.

Proof. Induction on l(Σ). Let x̄ := t̄ be an arbitrary rss for Σ, such that t1,...,tn are comp. Throughout the proof we abbreviate Δ' = Δ[x̄ := t̄]. The only case which is not immediate by the Lemmas 5.3.3-5 and the induction hypothesis is Σ = [x : α]Γ, Σ' = [x : α']Γ', where α' and Γ' are comp by the induction hypothesis. We check (i) and (ii) of Definition 5.3.1.

(i) Suppose σ = Σ0, Σ1, ... is a nonterminating reduction sequence of Σ'. Distinguish two cases:
(a) There exist finite reduction sequences σ0, [x : α1](x)f0 of Σ', and σ1, α1 of α', and σ2, (x)f0 of Γ', such that σ = σ0, [x : α1](x)f0, f1, ... . Then σ2, (x)f0, (x)f1, ... would be a nonterminating reduction sequence of Γ', contradicting the computability of Γ'.
(b) Case (a) does not apply, i.e., no outer η-reductions are performed in σ. Then σ would induce reduction sequences σ0 of α' and σ1 of Γ', such that either σ0 or σ1 or both are nonterminating, contradicting the fact that α' and Γ' are both comp.

(ii) Suppose Σ' ↠ [x : α1]Γ1. Again distinguish two cases:
(a) α' ↠ α2, Γ' ↠ (x)f, x ∉ FV(f), and so Σ' ↠ [x : α2](x)f → f and f ↠ [x : α1]Γ1. Let t E α1 be comp. Then Γ'[x := t] ↠ Γ1[x := t]. Further ((x)f)[x := t] = (t)f. Now x̄, x := t̄, t is an rss for Γ and Γ'[x := t] = Γ[x̄, x := t̄, t]. Hence by the induction hypothesis Γ'[x := t] is comp and by Lemma 5.3.2 (i) so is Γ1[x := t].
(b) Case (a) does not apply. Then α' ↠ α1 and Γ' ↠ Γ1[x^α1 := x^α']. Hence, if t E α1, also Γ'[x := t] ↠ Γ1[x := t] and repeating the argument in (a) we find that for comp t E α1, Γ1[x := t] is comp. □
5.3.8. Corollary. All expressions of E-p are strongly normalizable. □

5.3.9. Corollary. If Σ is an expression in E-p, then every rs-sequence of Σ terminates.

Proof. Induction on (h(Σ), l(Σ)), observing that if Σ ⊒ Γ, then h(Γ) ≤ h(Σ). □
5.4. Embedding λΛ in λΛ-p
We now define a map φ: E → E-p, such that to each →bt-sequence of an expression Σ in λΛ corresponds a longer rs-sequence of φ(Σ) in λΛ-p. Then Corollary 5.3.9 guarantees the well-foundedness of big trees in λΛ.
5.4.2. Lemma. If Σ ∈ E, then Σ = φ(Σ) (in λΛ-p).

Proof. By induction on l(Σ), check that φ(Σ) ↠ Σ. □

5.4.3. Corollary. If Σ E Γ in λΛ, then φ(Σ) E φ(Γ). □
5.4.4. Lemma. If t E α in λΛ, Σ ∈ E, then φ(Σ)[x^α := φ(t)] ↠ φ(Σ[x := t]).

Proof. Induction on (m(Σ), l(Σ)). We show only three cases.
(i) φ(x)[x := φ(t)] = [x, φ(α)][x := φ(t)] = [φ(t), φ(α)[x := φ(t)]] → φ(t).
(ii) φ(C(Γ̄))[x := φ(t)] = [C(φ(Γ̄))[x := φ(t)], φ(τ(C(Γ̄)))[x := φ(t)]] ↠ [C(φ(Γ̄[x := t])), φ(τ(C(Γ̄))[x := t])] = φ(C(Γ̄)[x := t]). Here we applied the induction hypothesis on Γ1,...,Γn and τ(C(Γ̄)) and we used Lemma 3.3.3.
(iii) φ([y : β]Γ)[x := φ(t)] = [z : φ(β)[x := φ(t)]]φ(Γ)[x := φ(t)][y := z] ↠ [z : φ(β[x := t])]φ(Γ[x := t])[y := z] = φ([y : β]Γ[x := t]). (Apply the induction hypothesis on β and Γ.) □

5.4.5. Lemma. If Σ → Γ in λΛ, then φ(Σ) →⁺ φ(Γ).

Proof. Induction on the length of deduction of Σ → Γ. We show only one case. Let Σ ≡ (t)[x : α]Δ → Δ[x := t] ≡ Γ and t E α (β-reduction). Then
φ(Σ) = (φ(t))[y : φ(α)]φ(Δ)[x := y] → φ(Δ)[x := φ(t)] ↠ φ(Γ) by Lemmas 5.4.3 and 5.4.4. □
5.4.6. Lemma. If Σ ∈ E, then φ(Σ) →⁺ φ(τ(Σ)) (Σ either object or type).

Proof. Induction on l(Σ). Two examples are:
(i) φ(x^α) = [x^α, φ(α)] → φ(α) = φ(τ(x^α));
(ii) φ((t)Γ) ≡ (φ(t))φ(Γ) →⁺ (φ(t))φ(τ(Γ)) = φ(τ((t)Γ)), by the induction hypothesis for Γ. □

5.4.7. Lemma. If Σ ⊐ Γ in λΛ, then φ(Σ) ⊐ φ(Γ) in λΛ-p. □
5.4.8. Corollary. If Σ0,...,Σn is a →bt-sequence in λΛ, then there exists an rs-sequence from φ(Σ0) to φ(Σn) in λΛ-p of equal or greater length.

Proof. Induction on n, using the Lemmas 5.4.5-7. □
5.4.9. Theorem. If Σ ∈ E, then every →bt-sequence of Σ terminates.

Proof. Immediate from the Corollaries 5.3.9 and 5.4.8. □
The Language Theory of Automath
Parts of Chapters II, IV, V - VIII
D.T. van Daalen
II. MISCELLANEA

[Notation and terminology
A. Expressions
Apart from the usual abstraction and application expressions [x : A]B and (A)B, we discuss primitive and defined constant-expressions p(Ā) and d(Ā) - where Ā stands for a string of expressions A1,...,Ak -, pairs (P, A, B) and projections A(1) and A(2), injections i1(A, β) and i2(B, α) and plus-expressions A ⊕ B. The P in (P, A, B) and the β, α in i1(A, β) and i2(B, α) are called type-labels and sometimes omitted. See [van Daalen 73 (A.3)] for the constant expressions and [B.6] for the others.

B. Elementary reductions
Apart from the usual β- and η-reduction we discuss
δ-reduction: d(Ā) >δ D[x̄/Ā] (for defined constants d with definition scheme d(x̄) := D)
π-reduction: (A, B)(1) >π A, (A, B)(2) >π B
σ-reduction: (A(1), A(2)) >σ A
+-reduction: (i1(A))(B ⊕ C) >+ (A)B and (i2(A))(B ⊕ C) >+ (A)C
ε-reduction: ([x : A](i1(x))B) ⊕ ([x : C](i2(x))B) >ε B if x ∉ FV(B).
β-, π- and +-reductions are called introduction-elimination (I.E.) reductions. η-, σ- and ε-reductions are called the corresponding extensional (ext) reductions.
C. Notations -
expressions, modulo a-reduction is a subexpression of B AcB A sub B :A is a proper subexpression of B :one-step reduction - contract one redex per step >1 :syntactic identity of :A
D.T. van Daalen
494
51 > 2 A 1B
-
one-step-reduction - several, disjoint, redices m a y be contracted in one step : generic form of one-step reduction, e.g. >1 or 31 : more-step reduction, the transitive and reflexive closure of > 1 :A and B are confluent, i.e. have a c o m m o n reduct :definitional equality, the equivalence relation generated by >1 . : disjoint
For each of the above reduction related symbols, subscripts m a y indicate what types of elementary reductions are included, e.g. >l,p or 'PJ. D. Properties :Normalization
N SN CR CRi
property
: Strong normalization property : Church-Rosser property B 5 A 2 C =+ B 1 C :Weak Church-Rosser B I C B 1C ij -PP : ij-postponement A 2ij B =+ 3 c A Li C 2j weak ij-pp :weak ij-postponement A rij B
*
3 c ,A~ 2i
+
c 2j
B
D 5i B .
Properties N, SN, CR and CR1 are also prefixed to indicate what types of elementary reductions are included, e.g. /3-N and Pq-CR.]
11.8. An informal analysis of CRI 8.1. In presence of SN, the weak CR-property CR1 is sufficient for CR. Anyhow, for the heuristics of a CR-proof an analysis of CR1 is indispensable. Let i and j indicate kinds of elementary reduction, such as p, q etc. Let C be a n expression, with an i-redex R c C and a j-redex S C C. By contracting R t o R' (resp. S to S') we get C >l,i r (resp. C > l , j A). We want to find out whether r and A have a common reduct C' and if so, by what kind of and by how many contractions, C' can be reached from r and A. In the informal discussion below all possible cases are systematically treated, according to the relative positions of the redices R and S. The first point is of course, that either (a) R and S are disjoint, (b) R = S , (c) R sub S or (d) S sub R. In case (a), the contractions just commute: 8.2.
C=
r z ...R'...s... > l , j ...R' ...S'...
...R...s... C'E
>l,i
C,
As for case (b), if we assume that
(*)
for each definitional constant only one defining axiom is given,
The language theory of Automath, Chapter 11,Section 8 ((2.5)
495
then all elementary reductions are mutually exclusive. 1.e. if R i-contracts to R‘ and R j-contracts to S’ then i and j refer to the same kind of reduction and R’ = S‘. So, under assumption (*), which is indeed fulfilled in the Automath system of abbreviations, in case (b) for a common reduct we can take C‘ zi I’(=
A). Case (c) is discussed in Sec. 8.4 and further. Case (d) can of course be reduced to case (c) by interchanging i and j, R and S. 8.3. About expression variables in schemes for reduction The elementary reductions are formulated in schematic form, i.e. with metavariables for expressions in them. For instance, in the scheme of ,&reduction “ ( A )[x : B]C elementary reduces to C [A]” (in Sec. 3.2.1), the meta-variables A , B , C are the expression variables of the scheme. For each of the schemes, all of its expression variables occur (of course!) at least once in the left-hand side (redex). Let X be an expression variable of a scheme for reductions. We distinguish three cases: (i)
X disappears in the contractum (such as B above).
(ii) X occurs just once in the contractum, possibly there is substituted in X (such as C above). (iii) X is possibly multiplied by substitution (such as A above). For all kinds of reductions, except CT and E , the expression variables occur precisely once in the redex. To these two exceptional cases we refer as the twin reductions (because of the twin occurrences of the meta-variable, e.g. of X in (X(1),X ( 2 ) ) ) . 8.4. Case (c). Let R sub S, S j-contracts to S’. Distinguish the following
cases: (cl) R
cX
for some instance X of a meta-variable of the j-redex.
(c2) not (cl), so R forms an essential part of S (such as [x : B]C in ( A ) [x: B]C ) . Now, unless j refers to a twin reduction and R c X for some instance X of a twin occurrence, in case (cl) the j-redex is not spoilt by the i-contraction. For common reduct C‘ we take the result of simply contracting the modified (by the internal i-contraction) j-redex in r. From A we can reach C’ by icontracting nothing (if X disappears, i.e. case (i), Sec. 8.3), i-contracting one possibly modified (by substitution) occurrence of R (if X occurs once, i.e. case (ii), Sec. 8.3) or i-contracting possibly more disjoint occurrences of R (if X
D.T. van Daalen
496
multiplies, case (iii), Sec. 8.3). So C is disjoint one-step i-reduction). Examples:
>l,i
r
>1j
C’ Z1,i A
<1j
C (where
51,i
(1) j is /3, X “occurs once”, use substitution property I, Sec. 3.8:
c = s = ( A )[ Z : B]R
>l,i
r = ( A )[Z: B]R’
>l,p
C ’ z R ‘ [ A ]
c = s = ( R ) : B]...Z...Z... >l,i r = (R‘)iZ : B]...Z...x... C’ = ...R’ ...R‘ ... <1,j A S‘ = ...R...R... <1,p C .
>l,p
In contrast with this, if j refers to a twin case and R c X for some “twin variable” X , then the j-redex is spoilt by the i-contraction indeed - but can be restored by i-contracting the other twin as well. So, since twin variables occur just once in the contractum (case (ii), Sec. 8.3), for some I”, C’, C >l,i r >l,i I?’ > 1 j C’
=
8.5. Case (c2). R is an essential part of S. Notice that there are two possibilities:
(1) j is a n 1.E.-reduction, i is the corresponding ext-reduction. (2) i is a n 1.E.-reduction, j is an ext-reduction. Case (c21). Here are three cases, r] v. p, 0 v. 7r and E v. +. In the first two cases there is no problem, even if the type-labels are present: l? = A, so we can take C’ = I? too.
(4c <1,7 ( A )[ Z : BI (4c >1,p ( A )c ( Z $ FV(C)) (Q,A, B)(p)< l , u (f‘, (Q,A, B ) ( I )(Q, , A, B)(2))(p) >I,* (Q, A, B)(p) 1
(0.1
(p=1orp=2).
+
The case of E v. is more complicated. First, there is an additional P-reduction needed. Secondly, there are problems with the type-labels. (E+)
R 3 ([x: B1J( i i ( 0~1, ) ) C) CB ( [ z: B2] ( i 2 ( 0~2, ) ) C ) ,
S (ip(A,0 3 ) ) R ,
R’= C ,
S’= ( A )[ Z : BPI (iP(5,D p ) )C , ( p = 1 or p = 2, z $ FV(C)) ,
(ip(A,&))C
<1,E
S >I,+ 5’’
>1,p
(ip(A,Dp[A]))C .
The language theory of Automath, Chapter 11, Section 8 (C.5)
497
So, in this case, r I,+ A >1,p A‘ with I? = A‘ but for the type labels. Hence, without type-labels, C’ = I? = A’ can serve as a common reduct. But with type-labels type-restrictions have to be imposed in order to guarantee that D, [A] and D3 are definitionally equal (and may have a common reduct).
+
8.6. Case (c22) covers p v. 77, 7~ v. u, v. E and p v. E . In the first two cases CR1 holds but for the type-labels. In the third case additional q-contractions are needed (compare with 8.5, E v. +), but in the fourth case CR1 (so CR) simply does not hold at all.
So here, I? = A but for the type-labels. Regarding R v. u,the situation compares with the twincase in 8.4: an additional .rr-reduction is needed.
=
So, r’ I , ~A with I” A but for the type-labels. In order t o keep CR in this case, we must at least require that P and Q are definitionally v. E . Since E is a twin-reduction, an additional equal. Then we come to +-contraction is needed, and two additional 77-steps. But to our relief there are no problems with type-labels.
+
Finally, we give a counterexample for ( p e ) , even without type-labels.
€I+ [z]iz(z). Then The best we can get from R’, @ R2 is R’, @ R’, E [z]il(z) S‘ < S 2 R’, @ R‘,, both are normal but S‘ f R’, @ R’, contradicting CR.
8.7. We resume our results in a table, writing i for >1,i and ī for ≥1,i.
[Table 8.7: for each of the cases below, the table indicates with what kinds of reductions one can complete (i.e. reach a common reduct) when one starts with i...j; entries marked (*) or (**) are qualified by the notes below.]

case
a. redices disjoint
b. redices equal
c. i-redex sub j-redex
c1. i-redex non-essential part
c11. j not twin case
c12. j twin case
c2. i-redex essential part
c21. i-redex in intro form
c22. i-redex in eli-form
d. just like c, with i and j interchanged.
Notes to 8.7 and 8.8: (*) Provided there is one defining axiom for each defined constant. (**) But for the type-labels.
8.8. Alternatively, we can arrange our results in a table, according to the kinds of reduction i, j. We write i⁰ for >⁰1,i, the reflexive closure of >1,i. In the first column below one finds the values of (i, j). In the second column is indicated by what kind of reductions one can complete (i.e. can reach a common reduct) if one starts with i...j.

[Table 8.8: each pair (i, j) is listed with its completion pattern, e.g. j⁰...i⁰; entries marked (*) and (**) are qualified by the notes to 8.7 and 8.8.]
II.9. An informal analysis of postponement
9.1. A discussion, similar to the analysis of CR1 in the preceding section, can be devoted to the question of postponement. Let Γ contain a j-redex R; by contracting R to R' one gets Σ. Let Σ contain an i-redex S; by contracting S one arrives at Δ.
Essential for ij-postponement is that the j-contraction does not create the i-redex S. Of course, for most of the cases for i, j, essentially new i-redices are indeed created by the j-contractions. E.g., how a β-redex is created by a π-contraction:
(A)((Q, [x : B] C, D)(1)) >1,π (A)[x : B] C ,

or by a +-contraction: (i1(A))([x : B] C ⊕ D) >1,+ (A)[x : B] C .
Below we just consider the possibility of ij-PP where i is an I.E.- and j is an ext-reduction, and the possibility of weak δj-PP in general.

9.2. Ext-postponement

9.2.1. Let i refer to an I.E.-reduction and let j refer to an ext-reduction. The
schemes for ext-reduction have a single expression variable as contractum. So R' is an instance of such an expression variable. If (a) R' and S are disjoint (in Σ), or (b) S ⊂ R' and (b1) the expression variable of which R' is an instance occurs once in the j-redex (so, in fact, j must be η-reduction), then the i- and the j-contraction can be interchanged. Example of (b1):

[x : A] (x)B >1,η B >1,i B' <1,η [x : A] (x)B'

(x ∉ FV(B'), because x ∉ FV(B)). If (b2) j refers to a twin reduction (i.e. σ or ε) then two disjoint i-contractions are needed. E.g.

(B(1), B(2)) >1,σ B >1,i B' <1,i <1,i (B'(1), B'(2)) .
9.2.2. If (c) R' ⊂ S [the case R' ≡ S falls under (b)] and (c1) R' is part of an instance of an expression variable of the i-redex, then one can start with >1,i and finish with some disjoint j-contractions; compare case (c1) of the CR1-analysis. Example:

(R)[x : B]...x...x... >1,j (R')[x : B]...x...x... >1,β ...R'...R'... ,
(R)[x : B]...x...x... >1,β ...R...R... ≥1,j ...R'...R'... .
9.2.3. Otherwise, (c2) R' is an essential part of S. Since i is an I.E.-reduction, R' is in introduction form, i.e. inj, abstr, or pair, or it is a plus-expression. Now we assume that (*) such type restrictions are fulfilled, that (1) the result of a σ-contraction is never an inj-, an abstr- or a plus-expression, (2) the result of an ε- or an η-contraction is never an inj-expression or a pair. Then (c2) can only be realized as follows (for brevity we omit type-labels):
(p = 1 or p = 2). E.g. ε creates β:

(A)(([x](i1(x))[y]C) ⊕ ([x](i2(x))[y]C)) >1,ε (A)[y]C >1,β C[A] .
Indeed, in all but the last case, the i-redex is not essentially new: π,π (i.e. >1,π >1,π) can be simulated by π,π̄; η,β by β,β̄; ε,+ by +,β̄ and η,+ by β,+̄. But βε-PP (so (β+)-ε-PP) is false.
9.2.4. We resume the results of this section in a table:
[Table 9.2.4: for each pair i, j and each of the cases (a), (b1), (b2), (c1), (c2), the table shows by what sequence of contractions i...j can be simulated, e.g. η,β by β,β̄; entries marked (*) assume certain type restrictions.]
(*) Assuming certain type restrictions.
9.3. Weak δ-advancement

9.3.1. Since the presence of δ-redices is only dependent on the presence of defined constants, apparently no essentially new δ-redices are created by the other reductions. However, we can only hope for weak δ-advancement (i.e. weak δj-PP for all kinds of reductions j, distinct from δ) in view of the βδ-example:
(d(B⃗))[x : A]...x...x... >1,β ...d(B⃗)...d(B⃗)... >1,δ ...D[B⃗]...d(B⃗)...
where d(y⃗) := D is the defining axiom of d. If we start with >1,δ here, then possibly too many δ-redices are contracted. Actually, the situation compares very well with the situation with the twin reductions w.r.t. CR1.
9.3.2. Let Γ, Σ, Δ, R, R', S be as in 9.1. R is an arbitrary non-δ-redex, S is a δ-redex d(B⃗) (defining axiom as above, say). If (a) R' and S are disjoint in Σ then the contractions can be interchanged:

Γ ≡ ...R...S... >1,j ...R'...S... >1,δ ...R'...S'... ,
Γ ≡ ...R...S... >1,δ ...R...S'... >1,j ...R'...S'... .
If (b) R' ⊂ S, then R' ⊂ Bi for some i, so we can simulate >1,j >1,δ by >1,δ ≥1,j. Example:

d(R) >1,j d(R') >1,δ ...R'...R'... ≤1,j ...R...R... <1,δ d(R) .
If (c) S ⊂ R' then (c1) S is part of an instance of an expression variable of the j-reduction scheme, or (c2) j is β, S ⊂ C[A] (≡ R', where R ≡ (A)[x : B]C). Case (c1) is just like case (a). In case (c2) there are two possibilities:

(c21) d(B⃗) ⊂ C, so d(B⃗)[A] ≡ d(B⃗0) (for some B⃗0), or
(c22) d(B⃗) is part of one of the substituted occurrences of A.
9.3.3. The contractions can again be interchanged in case (c21). Example:

(A)[x : B] d(F⃗) >1,β d(F⃗[A]) >1,δ ...F⃗[A]...F⃗[A]... <1,β (A)[x : B]...F⃗...F⃗... <1,δ (A)[x : B] d(F⃗) .
Case (c22): As in the example above, for some Δ', Σ',

Γ >1,β Σ >1,δ Δ ≥1,δ Δ' <1,β Σ' <1,δ Γ .
9.3.4. Resuming:
II.10. Multiple substitution

10.1. Let D be a set of expressions, Σ an expression, x a variable. Then Γ is a multiple substitution result of Σ with D for x if Γ can be produced from Σ by substituting some A ∈ D for each free occurrence of x in Σ (possibly different A's for different occurrences of x).
The set of such multiple substitution results is denoted Σ[[x/D]] (here, locally, abbreviated to Σ*) or just Σ[[D]], and can be defined inductively, along the lines of ordinary substitution, as follows:
(i)a. A ∈ D ⇒ A ∈ x* ,
(i)b. x ≢ y ⇒ y ∈ y* ,
(ii) |f⃗| = 0 ⇒ f ∈ f* ,
(iii)a. x ≢ y, (∀A ∈ D: y ∉ FV(A)), Γ ∈ Σ* ⇒ λy.Γ ∈ (λy.Σ)* (if necessary rename y) ,
(iii)b. Γi ∈ Σi* for i = 1, ..., |f⃗| ⇒ f(Γ⃗) ∈ (f(Σ⃗))* .
By induction on the length of Σ it can be shown that Σ* is decidable if D is decidable; e.g., if D is finite then Σ* is finite.
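The inductive clauses above can be transcribed directly; the following is a minimal sketch for a toy term language, where the tuple encoding and the name msubst are illustrative assumptions, not notation from the text.

```python
# Multiple substitution Sigma[[x/D]] of 10.1 for a toy term language:
# variables ('var', name), abstractions ('lam', y, body), and constant
# applications ('con', f, (args...)).
from itertools import product

def msubst(sigma, x, D):
    """Return the set of multiple substitution results of sigma with D for x."""
    tag = sigma[0]
    if tag == 'var':
        # (i)a and (i)b: any A in D may replace x; other variables stay put.
        return set(D) if sigma[1] == x else {sigma}
    if tag == 'lam':
        # (iii)a: assumes the bound variable is not x and is free in no A in D.
        y, body = sigma[1], sigma[2]
        return {('lam', y, b) for b in msubst(body, x, D)}
    if tag == 'con':
        f, args = sigma[1], sigma[2]
        if not args:
            return {sigma}  # (ii)
        # (iii)b: choose a result independently for each argument.
        choices = [msubst(a, x, D) for a in args]
        return {('con', f, tuple(c)) for c in product(*choices)}
    raise ValueError('unknown tag: %r' % tag)

# Two free occurrences of x and two candidates in D give 2 * 2 = 4 results,
# illustrating the finiteness remark for finite D.
x, a, b = ('var', 'x'), ('var', 'a'), ('var', 'b')
results = msubst(('con', 'pair', (x, x)), 'x', {a, b})
```

Note that each occurrence of x chooses its replacement independently, which is exactly what distinguishes Σ[[x/D]] from ordinary substitution.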
10.2. Multiple substitution satisfies much the same properties as ordinary substitution. E.g.: if x ≢ y and ∀A ∈ D: y ∉ FV(A), then

Γ*[[y/Σ*]] = (Γ[y/Σ])* , or in full,

Γ[[x/D]] [[y/Σ[[x/D]]]] = Γ[y/Σ] [[x/D]] .
Here = is ordinary set equality. The proof is by induction on Γ, as for ordinary substitution. So, just like with ordinary substitution,

Σ ≥ Γ, Δ ∈ Σ* ⇒ Δ ≥ Δ' , for some Δ' ∈ Γ* .
10.3. If D is a set and A ∈ D ⇒ Σ ≥ A, then, for all Γ' in Γ[[x/D]],

Γ[x/Σ] ≥ Γ' .

So, if ρ(Σ) denotes the set of reducts of Σ, then

Γ' ∈ Γ[[x/ρ(Σ)]] ⇒ Γ[x/Σ] ≥ Γ' .
The concept of multiple substitution will typically be used in the proof of normalisation-like properties (see e.g. Ch. IV and VII, also in this Volume).
II.11. Reduction under substitution in λβ-calculus; Barendregt's lemma

11.1. Introduction. The variable x and the expression Γ will be fixed throughout Section 11. For all reduction relations we have

Σ ≥ Σ', Γ ≥ Γ' ⇒ Σ[x/Γ] ≥ Σ'[x/Γ'] .
Now we consider the converse question: if Σ[x/Γ] ≥ Δ, what can be said about Δ in terms of reducts of Σ and Γ? We concentrate on free λ-calculus here; reduction is just β-reduction. Expressions are variables, application expressions AB and λ-expressions λy.A. We write A B⃗ for (...(A B1) B2 ...) Bk. We write Σ → Δ (relative to x and Γ) if Δ can be produced from Σ by replacing certain occurrences Δ1 ≡ xM⃗1, Δ2 ≡ xM⃗2, ..., Δk ≡ xM⃗k (k ≥ 0) of subexpressions of Σ by reducts Δ1', Δ2', ..., Δk' of Δ1[x/Γ], Δ2[x/Γ], ..., Δk[x/Γ] respectively, not leaving any free occurrences of x unreplaced. Here we prove the reduction-under-substitution lemma:

(*) if Σ[x/Γ] ≥ Δ then, for some Σ', Σ ≥ Σ' → Δ .
Barendregt proved a restricted form of a similar fact for weak combinatory logic, using some underlining technique (see H.P. Barendregt, The undefinability of Church's δ, unpubl. 1972). His proof was extended to λ-calculus by de Boer, a student of de Vrijer [de Boer 75]. Our proof here will be different: we show that the set of Δ such that, for some Σ', Σ ≥ Σ' → Δ, is closed under reduction. Since Σ → Σ[x/Γ], this proves (*). A corollary of (*) is the square brackets lemma, in Sec. 11.5, which is applied in our proof of β-SN in Sec. IV.2.4.4. However, direct proofs of the square brackets lemma are also possible; first there is Lévy's proof [Lévy 75, p. 134] using the standardization theorem and secondly, there is a proof using SN (IV, Sec. 2.4.3). Further interesting applications of property (*) are the non-definability results in [Barendregt et al. 76], [Mitschke 76]. [(*) is included as an exercise in [Barendregt 81].]

11.2. The definition of →

11.2.1. Abbreviate [x/Γ] by * here. Informally speaking, Σ → Δ means
Σ ≡ ...Δ1...Δ2...Δk... ≡ ...xM⃗1...xM⃗2...xM⃗k... , and
Δ ≡ ...Δ1'...Δ2'...Δk'... , with Δi* ≥ Δi' for i = 1, ..., k .
11.2.2. Formally, we can define → inductively, as follows:

(1a) (xM⃗)* ≥ Δ ⇒ xM⃗ → Δ   (M⃗ possibly empty)
(1b) x ≢ y ⇒ y → y
(2) y ≢ x, y ∉ FV(Γ), Σ → Δ ⇒ λy.Σ → λy.Δ (if necessary rename y)
(3) Σ1 → Δ1, Σ2 → Δ2 ⇒ Σ1Σ2 → Δ1Δ2 .
11.3. The following properties of → are easily proved from 11.2.2.

(1) Σ → Σ*
(2) Σ → Δ ⇒ Σ* ≥ Δ
(3a) Σ → y ⇒ (1) Σ ≡ y, y ≢ x, or (2) Σ ≡ xM⃗
(3b) Σ → λy.Δ1 ⇒ (1) Σ ≡ λy.Σ1, Σ1 → Δ1, or (2) Σ ≡ xM⃗
(3c) Σ → Δ1Δ2 ⇒ (1) Σ ≡ Σ1Σ2, Σ1 → Δ1, Σ2 → Δ2, or (2) Σ ≡ xM⃗ .
11.4.1. Substitution lemma for →. If y ∉ FV(Γ), y ≢ x, then

Σ1 → Δ1, Σ2 → Δ2 ⇒ Σ1[y/Σ2] → Δ1[y/Δ2] .

Proof. By induction on the definition of Σ1 → Δ1. E.g. let Σ1 ≡ xM⃗, (xM⃗)* ≥ Δ1. Then Σ1[y/Σ2] ≡ xM⃗[y/Σ2] and

(xM⃗[y/Σ2])* ≡ (xM⃗)*[y/Σ2*] ≥ Δ1[y/Σ2*] ≥ Δ1[y/Δ2]

because of 11.3 (2) above. So Σ1[y/Σ2] → Δ1[y/Δ2]. Or, let Σ1 ≡ Σ11Σ12, Δ1 ≡ Δ11Δ12, Σ11 → Δ11, Σ12 → Δ12. Then Σ11[y/Σ2] → Δ11[y/Δ2] and Σ12[y/Σ2] → Δ12[y/Δ2] (by ind. hyp.), so Σ1[y/Σ2] → Δ1[y/Δ2].
□
11.4.2. Reduction lemma for →. Σ → Δ >1 Δ' ⇒ Σ ≥ Σ' → Δ', for some Σ'.

Proof. By induction on the length of Δ. Or, informally, as follows: Δ must contain a redex, Δ ≡ ...(λy.Δ1)Δ2..., Δ' ≡ ...Δ1[y/Δ2]... . Now there are three cases:

(1) Σ ≡ ...(λy.Σ1)Σ2..., Σ1 → Δ1, Σ2 → Δ2, or
(2) Σ ≡ ...(xM⃗)Σ2..., (xM⃗)* ≥ λy.Δ1, Σ2 → Δ2, or
(3) Σ ≡ ...(xM⃗)..., (xM⃗)* ≥ (...(λy.Δ1)Δ2...) .

We must indicate an appropriate Σ': in case (1) take Σ' ≡ ...Σ1[y/Σ2]... and use 11.4.1; in cases (2) and (3) simply take Σ' ≡ Σ. So, in fact, even Σ ≥β Σ' → Δ'. □

11.4.3. Theorem. Σ → Δ ≥ Δ' ⇒ Σ ≥ Σ' → Δ', for some Σ'.

Proof. By induction on Δ ≥ Δ', using 11.4.2. □
11.5. Corollaries

11.5.1. Reduction-under-substitution lemma (i.e. Property (*), Sec. 11.1):

Σ* ≥ Δ ⇒ Σ ≥ Σ' → Δ . □

11.5.2. Barendregt's lemma ([de Boer 75]): (if x ∉ FV(Σ))

ΣΓ ≥ Δ ⇒ Σx ≥ Σ' → Δ .

Proof. (Σx)* ≡ ΣΓ ≥ Δ, so Σx ≥ Σ' → Δ. □

11.5.3. Square brackets lemma. If Σ* ≥ λy.Δ then either

(1) Σ ≥ λy.Δ0, Δ0* ≥ Δ, or
(2) Σ ≥ xM⃗, (xM⃗)* ≥ λy.Δ . □
Note about terminology: the name square brackets lemma comes from the square brackets which represent abstr in Automath notation. Here the name λ-lemma would be more appropriate. Slightly more general than 11.5.3 is:
11.5.4. "Outer shape lemma". If Σ* ≥ Δ then Σ ≥ Δ0, Δ0* ≥ Δ, with either

(1) the latter reduction (Δ0* ≥ Δ) non-main, or
(2) Δ0 ≡ xM⃗ . □

[Main reduction: when at one of the reduction steps, the expression itself acts as the redex to be contracted.]
11.5.5. Note: The corollaries 11.5.1, 11.5.2 and 11.5.4 do not extend to βη-reduction, but the square brackets lemma does (by η-postponement).
IV. STRONG NORMALIZATION FOR FIRST ORDER PURE TYPED λ-CALCULUS WITH APPLICATION TO AUT-QE

IV.2. Normalization and strong normalization for normable expressions

2.1.1. Here we consider a system M of normable expressions, in which the first order pure typed λ-calculus systems, such as the systems of correct Automath expressions, can be embedded. Each normable expression Σ has a norm μ(Σ). Norms are defined inductively:

(i) τ is a norm.
(ii) if ν1, ν2 are norms then [ν1]ν2 is a norm.

The length l(ν) of a norm ν can be defined to be the number of τ's in ν. Equality of norms is denoted by ≡.
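The two clauses and the length function transcribe directly; a small sketch with an illustrative tuple encoding (TAU for τ, ('arr', v1, v2) for [ν1]ν2), which is an assumption of this example rather than notation from the text:

```python
# Norms of 2.1.1 as nested tuples: TAU is the base norm tau, and
# ('arr', v1, v2) stands for [v1]v2.
TAU = 'tau'

def is_norm(v):
    """Clause (i): tau is a norm; clause (ii): [v1]v2 is a norm."""
    return v == TAU or (isinstance(v, tuple) and len(v) == 3
                        and v[0] == 'arr' and is_norm(v[1]) and is_norm(v[2]))

def length(v):
    """l(v): the number of tau's occurring in v."""
    if v == TAU:
        return 1
    return length(v[1]) + length(v[2])

# [tau]([tau]tau) contains three tau's; both components are strictly shorter,
# which is the stratification exploited by the order <_mu in 2.1.3.
v = ('arr', TAU, ('arr', TAU, TAU))
```

The strict decrease of length on both components is what makes <μ well-founded and suitable for the inductions below.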
2.1.2. The expressions in M are formed from variables, by abstr and appl (and possibly other constants). Abstraction expressions are denoted [x : A]B and application expressions (A)B. By writing μ(Δ) we implicitly intend that Δ ∈ M. Here follow the relevant properties of M and the norm μ:

(1) M is closed under taking subexpressions, i.e. Σ ∈ M, Γ ⊂ Σ ⇒ Γ ∈ M.
(2) Σ ≡ [x : A]B ∈ M ⇒ μ(Σ) ≡ [μ(A)]μ(B), μ(x) ≡ μ(A).
(3) Σ ≡ (A)B ∈ M ⇒ μ(B) ≡ [μ(A)]μ(Σ).
(4) M is closed (and μ is preserved) under substitution: μ(x) ≡ μ(A), B ∈ M ⇒ μ(B[x/A]) ≡ μ(B).
(5) M is closed (and μ is preserved) under reduction: Σ ∈ M, Σ ≥ Γ ⇒ μ(Γ) ≡ μ(Σ).
2.1.3. The norm μ induces a well-founded order <μ on the normable expressions, as follows:

Σ <μ Γ :⇔ l(μ(Σ)) < l(μ(Γ)) .

Then, by the properties above, <μ induces an actual stratification, according to functional complexity: both argument and value of a function precede the function w.r.t. <μ, i.e. if (A)B ∈ M then A <μ B and (A)B <μ B.

Induction on <μ is just called induction on μ. Below we only deal with (strong) normalization for β-reduction. [Using βη-postponement, see II.9,] we can extend to the βη-case. In Sec. IV.4.6 we extend with δ-reductions.
2.2. Normalization for β-reduction: first proof

2.2.1. Heuristics. Assume that Σ ∈ M, Σ is not normal, Σ >1,β Σ'. So

Σ ≡ ...(A)[x : B]C... , Σ' ≡ ...C[A]... .

The redices in Σ' are of several kinds (compare with II.9):

(1) "old" redices, already present in Σ (and there disjoint with (A)[x : B]C).
(2) "modified" redices, i.e. redices R[x/A] ⊂ C[x/A] in Σ' where R ⊂ C in Σ.
(3) "multiplied" redices, i.e. redices inside substituted occurrences of A in C[A].
(4) "newly created" redices (D1[A])[y : D2]D3, where A ≡ [y : D2]D3 and (D1)x ⊂ C or x ≡ C.
(5) "newly created" redices (D1)[y : D2[A]]D3[A], where C ≡ [y : D2]D3.

If A is normal, no redices are multiplied. If μ(A) ≡ τ, no redices of type (4) are created.

2.2.2. First proof of β-normalization (Prawitz 1965). (See [Prawitz 65].) This proof is quite similar to the first proof of δ-N in [van Daalen 80, III.4.3]. Define the order of a redex (A)[x : B]C to be l(μ([x : B]C)). Let Σ ∈ M, let m(Σ) be the maximal order of redices in Σ, and let #(m, Σ) be the number of occurrences of redices of order m in Σ. Our normalization procedure runs as follows: if Σ is not normal then contract an innermost redex (A)[x : B]C of maximal order. And so on. That this procedure terminates follows by induction on (I) m(Σ), (II) #(m, Σ). For, one redex of order m(Σ) disappears and, since we chose an innermost redex of order m(Σ), all the redices of type (2)-(5) above are of order less than m(Σ). Further, the "old" redices were already present in Σ, with the same order. So, either m(Σ) (if #(m, Σ) = 1) or #(m, Σ) (otherwise) properly decreases under the indicated contraction.
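The termination argument just given can be condensed into a single lexicographic measure; the following LaTeX sketch merely restates the induction of 2.2.2, with its notation:

```latex
% Contracting an innermost beta-redex of maximal order in \Sigma strictly
% decreases the pair (m(\Sigma), \#(m(\Sigma),\Sigma)) lexicographically:
\[
  \bigl(m(\Sigma'),\, \#(m(\Sigma'),\Sigma')\bigr)
  \;<_{\mathrm{lex}}\;
  \bigl(m(\Sigma),\, \#(m(\Sigma),\Sigma)\bigr)
  \qquad\text{when }\Sigma >_{1,\beta} \Sigma'\text{ as prescribed.}
\]
% The lexicographic order on pairs of natural numbers is well-founded,
% so the procedure terminates.
```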
2.3. Second proof of β-normalization (Lévy, Jutting). (See [Lévy 74], [van Benthem Jutting 71a].)
2.3.1. Substitution lemma for β-N (Jutting). B ∈ M, μ(A) ≡ μ(x), A normal, B normal ⇒ B[x/A] β-normalizes.
Proof. By induction on (I) μ(A), (II) length of B. If B[A] is not normal, then B contains subexpressions of the form (B1)x and A ≡ [y : A1]A2. Our normalization procedure for B[A] runs as follows: for each of these (B1)x take the maximal subexpression (Bk)...(B1)x ending in it. By ind. hyp. (II), ((Bk−1)...(B1)x)[A] normalizes, to C, say. If C ≡ [y : C1]C2 then, by ind. hyp. (I) applied to Bk[A], C2[y/Bk[A]] normalizes. By normalization of all these maximal subexpressions of the form ((B⃗)x)[A], B[A] can be normalized. □

2.3.2. Corollary. Σ ∈ M ⇒ Σ β-N.

Proof. By induction on the length of Σ. □
2.3.3. The reduction procedure intended above corresponds to the following definition of normal form:

nf([x : A1]A2) ≡ [x : nf(A1)]nf(A2) ,
nf((A1)A2) ≡ if nf(A2) ≡ [y : B1]B2 then nf(B2[y/nf(A1)]) else (nf(A1))nf(A2) .

Lévy speaks about "intérieur d'abord"-reductions. In fact, Lévy's proof of normalization by this procedure does not use a substitution lemma, but instead employs an induction up to ω^ω (for an explanation, see Sec. 2.6.1).
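The nf-clauses can be transcribed for plain λ-terms; a sketch in Python with an illustrative tuple encoding, where standard application app(f, a) plays the role of (a)f, so both parts are normalized before an outer redex is contracted. Substitution here is naive (no renaming), so the examples keep bound names distinct; on non-normalizing terms the recursion need not terminate.

```python
# "Interieur d'abord" normalization in the style of 2.3.3, for untyped terms
# encoded as ('var', x) | ('lam', x, body) | ('app', f, a). Names and the
# encoding are assumptions of this example, not notation from the text.

def subst(t, x, s):
    """Naive substitution t[x/s]; no renaming of bound variables."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def nf(t):
    """Normal form, normalizing subterms before contracting an outer redex."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'lam':
        return ('lam', t[1], nf(t[2]))
    f, a = nf(t[1]), nf(t[2])          # inner parts first
    if f[0] == 'lam':                  # the clause nf(B2[y/nf(A1)])
        return nf(subst(f[2], f[1], a))
    return ('app', f, a)

# (lambda x. lambda y. x) applied to free variables a and b normalizes to a.
K = ('lam', 'x', ('lam', 'y', ('var', 'x')))
t = ('app', ('app', K, ('var', 'a')), ('var', 'b'))
```

On normable (e.g. simply typed) terms this recursion terminates, which is exactly the content of 2.3.1 and 2.3.2.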
2.4. Strong β-normalization (β-SN): first proof

2.4.1. Heuristics for SN. We can formulate the following conditions for β-SN. Σ is β-SN if

(i) the direct subexpressions of Σ are SN, and
(ii) Σ ≡ (A)B, B ≥ [y : C]D ⇒ D[A] SN

(because all first main reducts of Σ are reducts of some D[A]). So, if we have the substitution theorem for β-SN: if B ∈ M, μ(x) ≡ μ(A), then

A SN, B SN ⇒ B[A] SN ,

then we can prove β-SN by mere induction on the length of [normable] expressions.
2.4.2. Heuristics for the substitution theorem. Now let B ∈ M, μ(x) ≡ μ(A), B SN and A SN. Abbreviate Σ[x/A] by Σ*. The question is how to prove the SN-conditions for B*. The crucial case is when B ≡ (B1)B2. The SN-conditions require:

(i) B1* SN, B2* SN, and
(ii) B2* ≥ [y : C]D ⇒ D[B1*] SN.

In the case that the outside square brackets of [y : C]D do not originate from the substitution * but show up as well in reduction sequences of B2, i.e. if

B2 ≥ [y : C0]D0 , C0* ≥ C , D0* ≥ D ,

then

B ≥ (B1)[y : C0]D0 >1 D0[B1] , (D0[B1])* ≡ D0*[B1*] ≥ D[B1*] ,
>I
Do[&]
,
D: [B;]2 D [B;]
which suggests to use induction on the reduction tree of B ( B is SN) in order to establish that D [B;]is SN. Otherwise, the square-brackets lemma (Sec. 11.11, and Sec. 4.3 below) must provide the necessary information. 2.4.3. Alternative proof of the square-brackets lemma The proof in 11.11 works for free X - p-calculus. Here we give a n alternative proof for SN expressions. Abbreviate C [ z / A ]by C*. Square brackets lemma. Let B* 2~ [ y : C ]D and let B be SN. Then either (a)
B 2 [ y : CO]DO,C$ 2 C , D: 2 D , or
(ii) B 2
(?) 2, (?* ) A 2 [ y : C]D.
Proof. By induction on (I) O(B) (the length of the reduction tree of B ) , (11) length of B. Distinguish the cases:
= 2. Then (ii) holds. B = [y : B I ]Bz. Since we have no 17-reduction, (i) holds.
(1) B (2)
(B1)B2. Then B; 2 [ z : E ] F , F [B;]2 [y : C ]D. By ind. hyp. (11) (3) B applied t o Bz, either
(a) Bz 2 [ z : Eo]Fo, F; 2 F or (b) B2 2 (C?) 2, (G*)A 2 [ z : E] F .
In case (a), B 2 (BI)[ z : Eo]FO (Fo [Bl])'
>I
FO[ B I ]and
F; [Bf]2 F [B;]2 [ y : C]D .
The language theory of Automath, Chapter IV, Section 2 (C.5)
511
Hence, ind. hyp. (I) applies to FO[Bl]which gives the desired result for 8. In case (b),
so B satisfies (ii). (4) In the remaining case, B* does not reduce to [y : C] D.
0
2.4.4. First proof of P-SN 2.4.4.1. In agreement with 2.4.1, we start with the substitution theorem
for P-SN:
B E M , p ( z ) = p ( A ) , A S N , B SN
+
B[z/A] SN.
Proof. By a triple induction on (I) p(A), (11) 29(B),(111) length of B. [29(B) is the height of the reduction tree of B ; also 2 9 ~etc.] Abbreviate E[z/A] by C*. We prove the SN-condition for B*. The crucial case is when B E (B1)B2. The direct subexpressions Bf, B; of B' are SN by ind. hyp. (111) (or possibly (11)). So, let B,* 2 [y : C] D. We must prove that D [Bf]is SN. By the squarebrackets lemma applied to B2 (which is SN, hence we can use the alternative proof without any circularities) we have two cases:
DO, (i) B2 2 [Y : CO] (ii) B2 2
2 D , or
(F)z, ( ( F )z)* 2 [y : C]D.
In case (i), B 2 (B1)[y : CO]DO > I DO[ B I ]and (Do [Bl])'= Dt; [Bf]2 D [Bf], so the ind. hyp. (11) applies to DO[ B l ]whence , D [Bf]is SN. In case (ii), we know that D is SN (since Bd is SN and B,* 2 [y : C] D ) . Further, by the properties of p , p ( B ; ) = p ( B 1 ) E p(y), B1 < p z so BT < p A. Hence, 0 we can apply ind. hyp. (I) to Bf and get that D [Bf]is SN. 2.4.4.2. Corollary. C E M
+
C P-SN (as indicated in 2.4.1).
0
2.5. Second proof of β-SN. (Note: Actually, both the second and the third proof of β-SN are incorrect: the substitution theorem is not sufficient here, we rather need a replacement theorem. Since the idea of the proof can be maintained, and since the error will be repaired in VII.4.5, we have not altered the present text.)
2.5.1. Heuristics for the substitution theorem for SN. Let A and B
be SN. As in II.5.3.3, B* (where * stands for [x/A]) is SN if all its reduction sequences contain an SN expression. Let B* >1 C > ... be a reduction sequence of B*. First, if the redex is an old or a modified redex (terminology as in 2.2.1) then the contraction and the substitution commute: for some C0, B >1 C0, C0* ≡ C. So, if we use induction on ϑ(B), we can conclude that C is SN. In fact, the proofs in 2.4.3 and 2.4.4.1 use the similar fact that, in some cases, substitution and main reduction commute. Secondly, reduction sequences of B* can start with contractions inside substituted A's (or inside reducts of such A's). There is only a finite number of such contractions, since A is SN. Finally, if the first redex contracted is a new redex then we have to use properties of the norm. Our alternative proof of the substitution theorem, below, is based on the above ideas and avoids the square brackets lemma.

2.5.2. An additional assumption on M and μ is needed, viz. that M is closed and μ is preserved under "correctly normed" replacement: if Σ ∈ M, and Σ' is formed from Σ by replacing an occurrence of Γ ⊂ Σ with some Γ' ∈ M such that μ(Γ') ≡ μ(Γ), then μ(Σ') ≡ μ(Σ).

2.5.3. Second proof of the substitution theorem for β-SN. Let B ∈ M, μ(x) ≡ μ(A), A and B SN. We must prove that B* is SN. Again, we use a
triple induction on (I) μ(A), (II) ϑ(B), (III) length of B. Let B* ≡ B0 >1 B1 >1 ... >1 Bk >1 Bk+1 > ... (k ≥ 0) be a reduction sequence of B* and let the step from Bk to Bk+1 be the first reduction step not taking place inside (a reduct of) some substituted A. So B* ≡ ...A...A..., Bk ≡ ...A'...A''... ≡ ...(C)[y : D]E..., Bk+1 ≡ ...E[y/C]..., where A ≥ A', A ≥ A''. If ρ(A) is the set of reducts of A (which is finite), then B*, B1, ..., Bk belong to the multiple substitution result B[[x/ρ(A)]] (Sec. II.10) and Bk+1 is the first reduct not in that set. Clearly, k ≤ ϑ(A) · #(x, B), i.e. the length of the reduction tree of A times the number of free occurrences of x in B. We show that Bk+1 is SN. Put R ≡ (C)[y : D]E, and distinguish the following cases:

(i) (C0)[y : D0]E0 ≡ R0 ⊂ B and R ∈ R0[[x/ρ(A)]]. Then the contraction of R commutes with the multiple substitution, viz.
The ind. hyp. (II) applies to B', and (B')* ≥ Bk+1, so Bk+1 is SN.
(ii) (C0)x ≡ R0 ⊂ B, C ∈ C0[[x/ρ(A)]], A ≥ [y : D]E. We apply ind. hyp. (I) twice (in contrast with 2.4.4.1). First, C0 <μ x so C <μ A. By ind. hyp. (III) C0* (so C) is SN, and A is SN, so E is SN. Hence, by ind. hyp. (I), E[y/C] is SN. Secondly, E <μ A so E[C] <μ A. Now take a fresh variable z, with μ(z) ≡ μ(E). Form the expression B' from B by replacing the specific occurrence of R0 by z, and form B'' from Bk by replacing R with z. [The problem with this "proof" has to do with possible free variables of R0, which are bound in B. Their presence makes that not B ≡ B'[z/R0]. Instead we would need literal replacement of z with R0, as used in VII.4.5.] By our assumption 2.5.2, the norms of B and its subexpressions are not affected by this replacement. Clearly, since B ≡ B'[z/R0], B' is SN and ϑ(B') ≤ ϑ(B). Further, B' is shorter than B. So ind. hyp. (III) or ind. hyp. (II) can be applied, giving that (B')* is SN. And by ind. hyp. (I) (this is the second application) (B')*[z/E[y/C]] is SN. Resuming, in case (ii) we have:
B ≡ ...x...(C0)x... , B' ≡ ...x...z... , (B')* ≡ ...A...z... ,
B* ≡ ...A...(C0*)A... , Bk ≡ ...A'...(C)[y : D]E... , B'' ≡ ...A'...z... .
so (B')* ≥ B'', and (B')*[z/E[y/C]] ≡ ...A...E[y/C]... ≥ ...A'...E[y/C]... ≡ B''[z/E[y/C]], whence Bk+1 is SN, q.e.d.
2.5.4. Corollary. Σ ∈ M ⇒ Σ β-SN (as indicated in 2.4.1). □
2.6. Third proof of β-SN.

2.6.1. This new proof is a mere variant of the previous one. However, instead of an iterated substitution a simultaneous substitution is employed. Consequently, we start with a simultaneous substitution theorem for β-SN. The induction used is essentially induction up to ω^ω, instead of the previously used inductions on ω, ω² and ω³.
Explanation: The threefold ω-induction (as used in the above proofs) can be considered as a single transfinite (up to ω³) induction on triples (m, n, k) of natural numbers, ordered lexicographically, i.e. according to their corresponding ordinal ω²·m + ω·n + k. Similarly, the present proof uses a single transfinite induction on finite sequences (mk, mk−1, ..., m0), where mk ≠ 0, for arbitrary k, ordered (I) according
to their length k, (II) lexicographically, i.e. according to their corresponding ordinal

ω^k · mk + ω^(k−1) · mk−1 + ... + ω · m1 + m0 .
2.6.2. Simultaneous substitution theorem for SN. Let B ∈ M, μ(xi) ≡ μ(Ai) ≡ νi for i = 1, ..., k, and let A⃗ and B be SN. Then B[x⃗/A⃗] is SN.
Proof. Abbreviate Σ[x⃗/A⃗] by Σ*. Let ni denote the number of occurrences of xi (for i = 1, ..., k) in the whole reduction tree of B. Define αj to be

∑_{i : l(νi) = j} ni · ϑ(Ai) .
Let m be the maximum of those l(νi) with ni ≠ 0. We use induction on (I) (αm, ..., α0) (ordered as above), (II) ϑ(B), (III) length of B. Let B* >1 C. We shall prove that C is SN. The cases are:

(1) If the redex contracted is old or modified, proceed as in the proof 2.5.3, case (i).
then take a fresh variable z with p ( z ) = p ( x i ) , form an expression B' from B by replacing the specific occurrence of xi by z , and consider the new substitution [?,z/A,A:]. Clearly, C = B' [Z, z / A , A:],and C is SN by ind. hyp. (I). Notice that, in fact, only c q v , ) is affected, viz. decreased by at least 1. (3) If the redex contracted is new (compare proof 2.5.3), case (ii)) then B = ...xi ... (Do)xi ... ~j ..., B* E ... Ai ... (D:) Ai ... Aj ..., Ai z [y : El F and C E ...A; ...F [D:] ... Aj ... . Now form B' by replacing (DO)zi with a new p ( ( D o ) x i ) , and consider the new substitution variable z, p ( z ) [Z, z / i , F [D:]]. Since B B' [ z / ( D o )xi],the replacement removes at least one occurrence of xi from the reduction tree, whereas possibly only occurrences of z (which has shorter norm) are added. So the component cyj with I(ui) = j properly decreases when going from B to B'. Further, as in 5.3.2 case (ii), F [D:] is SN so C E B' [Z, z / & F [D:]]is SN, by ind. hyp. (I). 0
=
2.6.3. Corollary. Substitution theorem for SN (take k = 1 above).
0
The language theory of Automath, Chapter IV, Section 4.6 (C.5) 2.6.4. Corollary. C E M
3
C P - SN.
515 17
IV.4. The normability of AUT-QE
[In [van Daalen 80, 1V.4.1...4.53 (not included here), it is proved that the correct expressions of AUT-QE+, and hence of A U T - Q E and AUT-68, are weakly norm correct (w.n.c.), and hence weakly normable. These weak notions have been introduced to cope with type inclusion: by type inclusion substitution with 2-expressions may alter the norm of expressions. However, the norms of degree 3 subexpressions are unaffected and all the 0-N and 0-SN proofs remain valid.] 4.6. Extension to PvS-SN By the previous section we know that the correct expressions of AUT-QE-+ (and hence of AUT-QE and AUT-68) are p-SN. The definitional axioms are just like in Chapter 111, so we also have 6-SN. Now we extend to P$-SN. 4.6.1. Lemma. If p ( A ) = p ( x ) , B normable, then
A P6-SN, B P6-SN =+ B [ x / A ]P6-SN . Proof. By induction on (I) p(A), (11) 29pb(B), (111) length (B). Combine a single-substitution version of the second proof of 6-SN [van Daalen 8 0 , 111.5.41 0 with the first proof of 0-SN (1Vq2.4.4).
<
[.'/A]
4.6.2. Lemma. Let k be correctness, as in 4.5.2. I f = 3E6, is weakly norm correct and weakly degree correct, Ai is PS-SN ( i = 1,..., 131) then
=+ B [$/A]PS-SN .
[In fact, i n IV.4.1 ...4. 5, the notions of weak norm correctness and weak normability have been introduced in the framework of the weakly degree-correct expressions. I n the weakly degree-correct expressions, the degree restrictions only serve to rule out application with degree 2 expressions - f o r which weaker norm restrictions are imposed than for expressions of degree 3 and higher.] Proof. By induction on k B (as in 4.5.3). Abbreviate [ Z / 4 by *. Some cases are:
= (C) D. By ind. hyp. C*, D* are PS-SN. Now let D* 2 [y : E] F. By 4.5.3, B*, C* and D* have a norm. By 4.4.3, F has a norm and p ( C * ) = p ( y ) . So, by 4.6.1, F [ C * ]is p6-SN, q.e.d.
(1) B
(2) B = d(C?). By ind. hyp. the Cj* are P6-SN. The C?* form a w.n.c. and w.d.c. substitution. So by applying the ind. hyp. to d e f ( d ) , with the new 0 substitution, def (d) [C?] is PG-SN, q.e.d.
D.T. van Daalen
516
4.6.3. Theorem (PS-SN): F A
+
A PG-SN.
Proof. Take the identical substitution 4.6.4.
Corollary. I- A
+
[?/$I
above.
A PvS-SN (by (P6)-postponement, see 11.9).
0
The language theory of Automath, Chapter V (C.5)
517
V. THE E-DEFINITION AND THE CLOSURE PROPERTY FOR PURE REGULAR AUTOMATH LANGUAGES Section 2 of this chapter introduces the E-definition, closely related to the definition (of AUT-QE) in [van Daalen 73 (A.3)], as a framework for defining Automath languages. Section 3 proves the closure property (correct expressions remain correct under reduction) for several versions of the pure (i.e. only p-, 9-and &reduction), regular (i.e. only expressions of degree 1, 2 and 3) languages AUT-68 and AUT-
QE. Section 4 proves, using closure and CR [ C R stands for Church-Rosser property] (thus anticipating the PVR-result of Chapter VI), the equivalence of the E-definition with an algorithmic definition, such as Nederpelt’s definition of A. This gives the decidability of the various systems, and further allows certain simplifications in the E-definition.
V.l. Introduction 1.1. E-definition versus algorithmic definition We distinguish some fundamentally different methods of defining the correct expressions, with typing and equality relation (w.r.t. book and context) [see, e.g. [van Daalen 73 (A.3)]],of an Automath language, or of any other system with generalized type-structure. First, the E-definition, below, introduces E-formulas A E B (expressing the typing relation: A has type B) and Q-formulas A B (for expressing equality: A definitionally equal to B ) . Correctness of expressions (notation: I- A ) and both kinds of formulas is given by a simultaneous inductive definition, without giving a clue how the correctness might be effectively verified. Essentially the same definition method is used in [van Daalen 73 (A.3)], and in [Martin-Lof 75a). Secondly, there is the algorithmic definition which characterizes the correct expressions by giving a verification algorithm for correctness. In this case Q can be defined in terms of reduction ( A B :($ A 1 B ) and E can be defined in terms of Q and the typing function t y p ( A E B :@ typ(A) Q B , forgetting type-inclusion for the moment). The main example of an algorithmic definition is Nederpelt’s definition of A in [Nederpelt 73 (C.3)]. In the third place, we mention de Vrijer’s definition method of AX in [ d e Vrijer 75 (C.d)]. He starts with the simultaneous introduction of the correct Eand @formulas, and after that defines correctness of expressions in terms of E , Q and typ.
D.T. van Daalen
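The contrast between the two definition styles can be made concrete. The following is a minimal sketch (all names hypothetical, and for a simply typed toy calculus rather than an actual Automath language) of an algorithmic definition in the above sense: Q is decided by reduction, here by full β-normalization, which terminates because the toy calculus is strongly normalizing, and E is decided as typ(A) Q B.

```python
# Terms: ('var', x), ('lam', x, ty, body), ('app', f, a);
# types: ('base', b), ('arrow', dom, cod).
# Substitution is naive (no capture avoidance), adequate for the small
# closed examples this sketch is meant for.

def subst(t, x, r):
    if t[0] == 'var':
        return r if t[1] == x else t
    if t[0] == 'lam':
        _, y, ty, body = t
        return t if y == x else ('lam', y, ty, subst(body, x, r))
    _, f, a = t
    return ('app', subst(f, x, r), subst(a, x, r))

def normalize(t):
    if t[0] == 'var':
        return t
    if t[0] == 'lam':
        _, y, ty, body = t
        return ('lam', y, ty, normalize(body))
    _, f, a = t
    f, a = normalize(f), normalize(a)
    if f[0] == 'lam':                      # outside beta-step
        return normalize(subst(f[3], f[1], a))
    return ('app', f, a)

def Q(a, b):
    """Definitional equality decided by reduction: A Q B iff A and B
    have a common reduct (here: the same normal form)."""
    return normalize(a) == normalize(b)

def typ(t, ctx):
    if t[0] == 'var':
        return ctx[t[1]]
    if t[0] == 'lam':
        _, y, ty, body = t
        return ('arrow', ty, typ(body, {**ctx, y: ty}))
    _, f, a = t
    fty = typ(f, ctx)
    assert fty[0] == 'arrow' and fty[1] == typ(a, ctx)
    return fty[2]

def E(a, b, ctx):
    """The typing relation decided algorithmically: A E B iff typ(A) Q B.
    In this toy calculus types contain no redices, so Q on types is =."""
    return typ(a, ctx) == b
```

In this setting decidability is exactly termination of `normalize`, which is the point made about the algorithmic definition in 1.2 below.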
1.2. Some general points on the language theory A priori it is not clear that the various definition methods generate the same structure (of correct expressions, with typing and equality). So one might think that the language theory has two aims, viz. (1) proving the equivalence of the various formulations, and (2) proving that the generated structures satisfy some specific desirable properties (Sec. 1.3).
However, these aims can hardly be separated: properties are first proved for one formulation, then the equivalence is established, and finally the properties are transferred to the other formulation via the equivalence. A simple example of this situation: for the system given by the algorithmic definition, decidability is just a matter of termination of the algorithm, i.e. normalization (as Nederpelt points out in [Nederpelt 73 (C.3)]). So, by the results in Chapter IV, if a system can be proved to be equivalent to the "algorithmic one", it is decidable. As a second illustration, we sketch roughly how the development below is organized. [Terminology: > means 1-step reduction, ≥ is the transitive closure of >; A ↓ B iff there is a C such that A ≥ C ≤ B; ↓* is the transitive closure of ↓; further, for relations R like >, ≥, ↓, ↓*, R⁺ is the restriction of R to the correct expressions.] We work with three systems: I and II are given by an E-definition and III is the algorithmic definition. The three systems essentially just differ as regards their Q-rules. In system I, Q is defined to be the equivalence relation generated by >⁺ (but realize that Q and ⊢ are introduced simultaneously). This is the restricted "technical" version of the E-definition, which we present in Section 2 and take as the starting point for the development in Section 3. In system II, Q is (↓⁺)*, i.e. the transitive closure of ↓⁺. This is the liberal form of the E-definition, which we think is most suitable for practical purposes, as a reference manual, say. In system III, the algorithmic definition, which we give in Section 4, Q is defined to be just ↓⁺. We say that a system satisfies CL if its correct expressions remain correct under reduction, and that it satisfies CR if its correct expressions are CR. Clearly, both I and III are contained in II, since II has the most liberal rules for Q.
Further, if I satisfies CL then I and II are equivalent, as is proved by induction on the definition of correctness in system II. Also by induction on II-correctness it is proved that II and III are equivalent, if III satisfies CR. Now, in Section 3 we prove that I satisfies CL, and in Chapter VI we prove (roughly) CL ⇒ βηδ-CR (for the βδ-case we know CR already). This gives CR for II, so CR for III, so it shows that all three systems are equivalent, and satisfy CL and CR.
The language theory of Automath, Chapter V (C.5)
An approach, alternative to the one sketched above, is given in Chapter VII. There the algorithmic definition serves as a starting point and CL and CR are proved simultaneously, using induction on so-called big trees.
1.3. What are the desirable properties?

As desirable properties for the structures of correct expressions generated, we mention:

(i) substitutivity: correctness of expressions and formulas is preserved under substitution with correct expressions of the right types.

(ii) closure (CL) and preservation of types (PT): correctness of expressions and formulas is preserved under reduction.

(iii) the Church-Rosser property CR, and the weak Church-Rosser theorem: A Q B ⇒ A ↓ B.

(iv) (strong) normalization (S)N and decidability.

(v) properties for Q which show that Q behaves as an equality, such as:
- the left-hand equality rule LQ: A E B, A Q C ⇒ C E B (the right-hand equality rule is included in the definition);
- monotonicity rules: A Q B, C Q D ⇒ (A) C Q (B) D, etc.

(vi) uniqueness properties:
- uniqueness of types UT: A E B, A E C ⇒ B Q C;
- uniqueness of domains UD: [x : A]B Q [x : C]D ⇒ A Q C (and B Q D);
- extended uniqueness of domains EUD: [x : A]B E [x : C]D ⇒ A Q C (and B E D).

Of course, in the presence of type-inclusion (in AUT-QE), only restricted forms of uniqueness of types and of property LQ (see Sec. 1.7) are valid. It depends on the choice of a definition method and on the language defined which of the above properties are basic and which can be derived from these basic ones. Anyhow, SN, βδ-CR and ηδ-CR we know already. The discussion below starts with substitutivity (Sec. 2.9) and ends with βη-CR (Chapter VI) and decidability (Sec. 4, as sketched in 1.2). In between, (ii), (v) and (vi), which turn out to be connected, are considered more or less simultaneously. In fact, first PT, LQ and UD and the property of

(vii) sound applicability SA: (A) [x : B]C correct ⇒ A E B

are proved simultaneously, by a careful induction on degree. Then follows one-step closure CL1, by induction on correctness, and finally CL, by induction on ≥.
1.4. Some points on closure
Apart from the specific role which closure plays in our discussion, it is of course important as a technical property. Compare, e.g., IV.2: the point of the generalization from the correct expressions to the normable expressions lies precisely in the fact that the normable system is "large enough" to prove closure for it in a relatively easy fashion (in contrast with closure for the correct expressions), and small enough to prove (strong) normalization for it, with the help of closure. The normalization properties and CR are nicely preserved under certain forms of taking subsystems. So it is sufficient to prove these properties for some "large" systems: normalization for the normable expressions, βδ- and ηδ-CR for all the expressions, and βηδ-CR under fairly general conditions in Chapter VI. The closure property, however, poses a separate problem for each particular language, because correctness is defined in terms of reduction. Further, we must stick to a particular definition, since in the proof of closure we often apply induction on the definition of correctness. Only after closure has been proved do some important derived rules follow, and can equivalence with the alternative definitions be established. Nevertheless, we try to give a uniform treatment of the various languages here, by splitting up the closure proof into the parts common to all the languages (e.g. substitutivity, CL1 ⇒ CL, etc.) and the part specific to each particular language, i.e. the proof of SA, UD, PT and LQ. The specific part is given quite elaborately for the "worst case", βη-AUT-QE (and its extensions), in Sec. 3.2 and 3.3, and just sketched for the simpler languages, such as βδ-AUT-QE, βη-AUT-68 etc. (Sec. 3.4). In fact, for the simpler languages the specific part simply vanishes, in which case the whole closure proof boils down to the simple closure proofs in [Girard 72] and [Martin-Löf 75a].

1.5. Summary
Section 2 starts with a list of inductive clauses for establishing correctness of expressions and of E- and Q-formulas, relative to a correct book and context. E-definitions for particular languages are specified by indicating (1) a reduction relation (β-reduction, with or without δ and η), (2) possible degree restrictions, (3) a particular set of rules from the list.
In order to avoid confusion we restrict ourselves here to the regular languages (i.e. degrees only 1, 2 and 3), from β-AUT-68 to βηδ-AUT-QE+. Then we prove some simple properties (renaming of contexts, substitutivity, correctness of categories) and give a short discussion of some of the rules. Section 3 deals with the actual proof of closure and the connected properties (i.e. (ii), (v), (vi) and (vii) above) for the whole range of regular languages, as far as these properties are valid (in view of type-inclusion). First, heuristic considerations (Sec. 3.1) point out how the properties are connected, and how the proof might be organized in the more complicated cases (such as βη-AUT-QE). Secondly, the proof is actually carried out for βη-AUT-QE (Sec. 3.2). After that, via an unessential extension result, all the properties are transferred to βηδ-AUT-QE+ (Sec. 3.3). Finally, it is shown that for all the simpler languages (βη-AUT-68, βδ-AUT-QE(+), etc.) easier proofs can be given, which use the more liberal E-definition II (see 1.2) instead of I as a starting point (Sec. 3.4). We claim that the restriction to degrees 1, 2 and 3 in the closure proof of βη-AUT-QE is not essential, and that this proof can easily be adapted for Λ(+), using the results on norm-degree-correctness in VII.2.2. Section 4 contains the details of the equivalence proof sketched in 1.2 above. First it is shown how, essentially, the verification of correctness can be reduced to the verification of equality. typ-functions for the various languages are discussed. Then we present the algorithmic system (like system III above) and an "intermediate" system (like system II). However, the situation is more complicated than sketched above, because the equivalence proofs in 4.3.2 and 4.3.3 are also used for proving the so-called strengthening rule superfluous (see below). Finally some remarks on the actual verification are made (Sec. 4.4).
1.6. Complication 1: the strengthening rule

Of course, if an expression or a formula is correct relative to a book and a context, its constants are in the book and its free variables are in the context. The strengthening rule is connected with the converse question. In systems such as I, II above, which have rules for the transitivity of Q, it is a priori not clear that a correct equality A Q B can be established via expressions containing only variables and constants occurring in A or in B. So it might be possible that a proof of correctness of A, or of A E B, needs correctness of expressions containing variables and constants outside A (and B). Now for the sake of proving η one-step closure we have included a postulate, the strengthening rule, in our definition, which allows one to drop "redundant" variables from the context. This appears to be a nasty rule because it might spoil the nice order on the correct expressions induced by the definition of correctness. See, e.g., Sec. 2.10.3 and 2.14.1.
The proof that the rule is superfluous runs roughly as follows: let ⊢I, ⊢II and ⊢III stand for the correctness predicate in system I (as in 1.2, with strengthening rule), system II (as in 1.2, without strengthening rule), and the algorithmic system III (without strengthening rule), respectively. As in 1.2, ⊢III ⇒ ⊢II (Sec. 4.3.2). By CL for system I (Sec. 3), we have ⊢II ⇒ ⊢I. Since in the algorithmic definition strengthening is provable (as in [Nederpelt 73 (C.3)]), by CR (for I, so for II, so for III, in Chapter VI) we can conclude ⊢I ⇒ ⊢III, which closes the circle (Sec. 4.3.3).
1.7. Complication 2: definitional 2-constants in the presence of type-inclusion

The rule of type-inclusion in AUT-QE allows us to infer A E τ from A E [x : α]τ. This shows how uniqueness of types gets lost in AUT-QE (but only for 2-expressions A). For the restricted form which we can prove instead we refer to Sec. 3.2.6.1. A peculiarity, due to the combination of definitional 2-constants and type-inclusion, is that rule LQ is violated too in AUT-QE. Example: if α E τ, A E [x : α]τ (relative to the empty context, say), then the scheme

d := A E τ (also with empty context)

is correct in AUT-QE. Now d Q A, A E [x : α]τ but not d E [x : α]τ. So, in AUT-QE, definitional 2-constants are not only used as abbreviations but also for cutting down the type of the expressions abbreviated. As a consequence of this, definitional 2-constants in AUT-QE can lead to unessential extensions which are not definitional extensions (Sec. 3.3.2). One might wonder why we do not take a more liberal variant of AUT-QE, which allows d E [x : α]τ as well. In fact, we mention such a variant AUT-QE* somewhere for technical reasons (Sec. 3.3.11), but we do not think that this way of ignoring the typ of a definitional constant is suitable for practical purposes. Part of our motivation runs as follows: we do not want it for definitional 3-constants, where the definition part can stand for a long proof, and the typ represents a short theorem (1.5.2 in [A.6]). So, we do not like it for 2-constants either, for the sake of uniformity.
V.2. On the E-definition

2.1. The book-and-context part of the E-definition

2.1.1. The correctness of books, contexts and expressions is defined simultaneously with the correctness of E-formulas A E B and Q-formulas A Q B. [See e.g. [van Daalen 73 (A.3)].] The symbol ⊢ stands for correctness; the notation for the correctness of contexts (w.r.t. B), of expressions, and of E- and Q-formulas (w.r.t. B and ξ) is respectively

B; ξ ⊢, B; ξ ⊢ A, B; ξ ⊢ A E B and B; ξ ⊢ A Q B.

The symbols E and Q are assumed to bind tighter than ⊢.
2.1.2. For brevity we sometimes write "B; ξ ⊢ A E/Q B" instead of "B; ξ ⊢ A E B respectively B; ξ ⊢ A Q B", and "B; ξ ⊢ A (E/Q B)" instead of "B; ξ ⊢ A respectively B; ξ ⊢ A E/Q B". We recall:

(1) (inhabitable degree condition) an expression α can only act as the typ of a constant in a scheme, or as the typ of a variable in a context, if its degree is 1 or 2;
(2) (compatibility of def and typ) in a scheme ξ * d(x̄) := A E Γ it is required that B; ξ ⊢ A E Γ, where B is the preceding book.

2.2. Some notational conventions

2.2.1. We often assume implicitly a fixed correct book B and a fixed context ξ, correct w.r.t. B. I.e., if B; ξ, η ⊢ then we write

η ⊢ A (E/Q B) for B; ξ, η ⊢ A (E/Q B), and just A E/Q B for B; ξ ⊢ A E/Q B

(so for formulas we omit the ⊢-symbol in this case).

2.2.2. At some places in the definition the degree of expressions is explicitly displayed as a superscript:

⊢ⁱ A (E/Q B) :⇔ ⊢ A (E/Q B) and degree(A) = i.

2.2.3. Formulas like A₁ E A₂ Q A₃ E A₄ are used as abbreviations for A₁ E A₂, A₂ Q A₃ and A₃ E A₄, etc.
2.3. The expression-and-formula part of the definition: expressions

The rules for the correctness of expressions and formulas fall apart into six groups, labeled I to VI. We start with group I (correctness of 1-expressions) and group II (correctness of non-1-expressions).
I. Correctness of 1-expressions.
I.1. τ-rule: ⊢¹ τ.
I.2. Abstraction rule: ⊢² α, x E α ⊢¹ A ⇒ ⊢¹ [x : α]A.
I.3. Application rule: A E α, ⊢¹ B Q [x : α]C ⇒ ⊢¹ (A) B.
I.4. Instantiation rule: if the scheme of d is in B, with context ȳ E ᾱ, and d is a 1-constant, then B̄ E ᾱ[ȳ/B̄] ⇒ ⊢¹ d(B̄).

Notice that the degree of A is indeed 1 if ⊢¹ A is derived by the above rules.

II. Correctness of non-1-expressions.
II. A E B ⇒ ⊢ A.
2.4. The expression-and-formula part: E-formulas

The rules of group III, below, in combination with rule II, also serve as the formation rules for the non-1-expressions. Group IV contains the type modification rules.

III. Formation of non-1-expressions.
III.1. Copy rule: ξ = ..., x E α, ... ⇒ x E α.
III.2. Abstraction rules: if ⊢² α then
III.2.A. x E α ⊢ B E τ ⇒ ⊢ [x : α]B E τ.
III.2.Bⁱ. x E α ⊢ⁱ⁺¹ B E C ⇒ [x : α]B E [x : α]C.
Of the latter there are two versions, III.2.B¹ and III.2.B².
III.3. Application rules: if A E α then
III.3.A. B E [x : α]C ⇒ (A) B E C[x/A].
III.3.B. B E C E [x : α]D ⇒ (A) B E (A) C.
III.4. Instantiation rule: if the scheme of c is in B, with context ȳ E ᾱ, then

B̄ E ᾱ[ȳ/B̄] ⇒ c(B̄) E typ(c)[ȳ/B̄].

Note: Below we shall prove A E B ⇒ ⊢ B (correctness of categories), which is not explicitly required here.
IV. Type modification rules.
IV.1. Type conversion: B E C, C Q D ⇒ B E D.
IV.2. Type-inclusion: B E [x̄ : ᾱ][ū : β̄]τ ⇒ B E [x̄ : ᾱ]τ (where [x̄ : ᾱ] stands for [x₁ : α₁] ... [xₖ : αₖ]).

2.5. The expression-and-formula part: Q-formulas

The rules for the correctness of Q-formulas form group V.
V. Correctness of Q-formulas.
V.1. Reflexivity: ⊢ A ⇒ A Q A.
V.2. Q-propagation: A Q B, ⊢ C, (B > C or C > B) ⇒ A Q C.
Note: This is indeed the most restricted version of Q; see Sec. 1.2.

2.6. The strengthening rule

This is a technical rule, which we use in the proof of η-CL, but which afterwards, i.e. after having proved CL and (with the help of CL) CR, we prove superfluous, as in Sec. 1.6. It is called the strengthening rule because it permits the removal of assumptions from the context. We say that η is a subcontext of ξ, for short η sub ξ, if the sequence of E-formulas of η is a subsequence of the sequence of E-formulas of ξ. So,

η sub ξ ⇒ η sub (ξ, x E α) and (η, x E α) sub (ξ, x E α).

VI. The strengthening rules (VI.1 for expressions, VI.2 for E-formulas). If B; ξ ⊢ A (E/Q B), ξ₀ sub ξ, B; ξ₀ ⊢, and the variables free in A (and B) occur in ξ₀, then B; ξ₀ ⊢ A (E/Q B).
2.7.2. The degree specifications for the regular languages AUT-68, AUT-QE and AUT-QE+ are:
(1) degrees admitted 1, 2 and 3, inhabitable degrees 1 and 2, domain degree 2 and argument degree 3;
(2) value and function degrees as in the following scheme:

                  AUT-68   AUT-QE   AUT-QE+
function degree     3       2,3      1,2,3
value degree       2,3     1,2,3     1,2,3
Languages where all value degrees are also function degrees are said to be +-languages: AUT-QE+ (and AUT-68+, AUT-QE*, to be defined later). Consequently AUT-68 and AUT-QE are non-+-languages.
2.7.3. No matter what rules are chosen, by induction on ⊢ (i.e. on the definition of correctness) it follows that

A E B ⇒ A not of degree 1.

So no application expressions (C) D with degree(C) = 1, and no instantiation expressions c(C̄) where some Cⱼ has degree 1, are formed, and the rules III.4 and III.3.A do not give rise to substitution with 1-expressions (in the categories). Hence, also by induction on ⊢,

(1) A Q B ⇒ (degree(A) = 1 ⇔ degree(B) = 1),
(2) A E B ⇒ (degree(A) = 2 ⇔ degree(B) = 1).
2.7.4. This shows, together with the explicit degree restriction in the rules I.2 and III.2, that the expressions formed and the substitutions involved are weakly degree correct (cf. [van Daalen 80, Ch. IV.4.4.2]). The inhabitable degree restriction guarantees that only expressions of degrees 1, 2 and 3 are formed. So, the specifications of 2.7.2 (1) are fulfilled and

A E B ⇒ degree(A) = degree(B) + 1,
A Q B ⇒ degree(A) = degree(B),

and all the substitutions generated by the rules are degree correct: if Ā is substituted for x̄ then, for all i, degree(Aᵢ) = degree(xᵢ).
2.8. Specification of the languages

2.8.1. The rules

The difference between the definitions of the various regular languages only concerns the rules of abstraction, application and type-inclusion. All the other rules, and also III.2.B² (for abstraction expressions of degree 3) and III.3.A (application), are present in each of the definitions. For the rest the situation is as follows:

                  AUT-68    AUT-QE         AUT-QE+
abstraction       III.2.A   III.2.B¹, I.2  III.2.B¹, I.2
application       -         III.3.B        III.3.B, I.3
type-inclusion    -         IV.2           IV.2

Note: Below it will turn out that (1) III.2.A is a derived rule of AUT-QE and AUT-QE+; (2) III.3.B, I.3 and IV.2 (type-inclusion) are (trivially) derived rules of AUT-68.
So, after all, in AUT-68 all the rules except III.2.B¹ and I.2 are valid; AUT-QE and AUT-QE+ have additionally III.2.B¹ and I.2 and, besides, AUT-QE+ has I.3.

2.8.2. The reduction relation

For definiteness we agree that > in the Q-rule V.2 stands for disjoint one-step reduction >₁ (i.e. contract several disjoint redices in one go). So it satisfies the monotonicity conditions, e.g.

A > A', B > B' ⇒ (A) B > (A') B',

with the important consequence that

Ā > Ā' ⇒ B[Ā] > B[Ā'].
In any case the reduction relation includes β-reduction, but we leave open the presence of η- and δ-reduction. Of course, if no definitional constants are in the book then there is no δ-reduction. We assume that AUT-68 has no definitional 1-constants (because, modulo the elimination of abbreviations, the only 1-expression in AUT-68 is τ). The rules of strengthening will only be present in languages with η-reduction.

2.9. The substitution theorem

2.9.1. For the E-definition (in contrast with the algorithmic definition) it is
easy to show substitutivity: correctness of expressions and formulas is preserved under correct substitutions, i.e. substitution with correct expressions of the right types. For technical reasons we start with a weak form of substitution, compare α-reduction.

2.9.2. Theorem (renaming of contexts): If ξ = x̄ E ᾱ, the variables x̄' are mutually distinct, and A' = A[x̄/x̄'], B' = B[x̄/x̄'], ᾱ' = ᾱ[x̄/x̄'], then (with ξ' := x̄' E ᾱ')

ξ ⊢ A (E/Q B) ⇒ ξ' ⊢ A' (E/Q B')
and the correctness proofs of both sides of the implication sign are equally long. Proof: induction on ⊢. □

2.9.3. An easy corollary of this is the weakening theorem, the converse of strengthening: if ξ₀ sub ξ then

ξ ⊢, ξ₀ ⊢ A (E/Q B) ⇒ ξ ⊢ A (E/Q B).

Proof: induction on ⊢. □
As a corollary of this we can prove that, in a derivation of correctness, the application of strengthening can be postponed to the end of the derivation.
2.9.4. Now we come to the simultaneous substitution theorem: if η = ȳ E β̄ and Ā E β̄[ȳ/Ā], then

η ⊢ C (E/Q D) ⇒ ⊢ C[ȳ/Ā] (E/Q D[ȳ/Ā]).
Proof. By induction on η ⊢ C (E/Q D). We treat just some of the cases, distinguished according to the last rule applied in the derivation. Abbreviate C[ȳ/Ā] to C', etc.

Last rule is III.2.Bⁱ: Assume η ⊢² C₁ and η, x E C₁ ⊢ⁱ⁺¹ C₂ E D₂. By the ind. hyp. and by 2.7.4, ⊢² C₁'. By the copy rule x E C₁' ⊢ x E C₁' (if necessary, i.e. if x occurs in ξ, rename the implicit context to ξ'). Now, by weakening, we can apply the ind. hyp. with the extended substitution [ȳ, x/Ā, x] to η, x E C₁ ⊢ⁱ⁺¹ C₂ E D₂. This gives x E C₁' ⊢ⁱ⁺¹ C₂' E D₂' and, by III.2.Bⁱ, ⊢ [x : C₁']C₂' E [x : C₁']D₂', q.e.d. Possibly one must first rename ξ' back to ξ again.

Last rule is V.2: Assume η ⊢ C₁ Q C₂, η ⊢ C₃, C₂ > C₃. By the ind. hyp. ⊢ C₁' Q C₂' and ⊢ C₃'. Since C₂' > C₃', ⊢ C₁' Q C₃', q.e.d. □
2.9.5. Corollary (single substitution theorem):

A E α, x E α ⊢ B (E/Q C) ⇒ ⊢ B[x/A] (E/Q C[x/A]). □
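As an illustration of what this substitution machinery must take care of, here is a sketch (hypothetical toy syntax, not the definition above) of the single substitution B[x/A] of 2.9.5, using the renaming of 2.9.2 to avoid capturing free variables of A:

```python
import itertools

# Terms: ('var', x), ('abs', x, dom, body) for [x : dom]body,
# and ('app', a, f) for (a) f.
_fresh = itertools.count()

def free_vars(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'abs':
        _, x, dom, body = t
        return free_vars(dom) | (free_vars(body) - {x})
    _, a, f = t
    return free_vars(a) | free_vars(f)

def subst(t, x, a):
    """Compute t[x/a]; a bound variable is renamed first when it would
    capture a free variable of a (the renaming of 2.9.2)."""
    if t[0] == 'var':
        return a if t[1] == x else t
    if t[0] == 'app':
        _, b, f = t
        return ('app', subst(b, x, a), subst(f, x, a))
    _, y, dom, body = t
    dom = subst(dom, x, a)          # the domain is not in the scope of y
    if y == x:                      # x is shadowed inside the body
        return ('abs', y, dom, body)
    if y in free_vars(a):           # rename y to a fresh variable first
        y2 = y + '_' + str(next(_fresh))
        body, y = subst(body, y, ('var', y2)), y2
    return ('abs', y, dom, subst(body, x, a))
```

For example, substituting y for x in [y : o]x must rename the bound y, so that the substituted y stays free in the result.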
2.10. Some easy properties

2.10.1. On abstraction

In addition to the remark in 2.3, after rule I.4, we can say that the last inference in a proof of ⊢¹ A must be rule VI.1 or one of the rules I. In particular, if ξ ⊢¹ [x : α]A, this can only follow from ξ₀, x E α ⊢¹ A for some ξ₀ with ξ₀ sub ξ (since sub is transitive). So application of VI.1 gives

ξ ⊢ [x : α]A ⇒ ξ, x E α ⊢ A.

2.10.2. Correctness of categories

In the rules of the definition having A E B as their consequence, it is not explicitly required that ⊢ B. For the copy rule this correctness of categories follows from weakening, for III.2.A from the τ-rule, for III.3.A from the single substitution theorem (use induction on ⊢), for III.4 from the simultaneous substitution theorem, etc. So, we have correctness of categories:

A E B ⇒ ⊢ B.
2.10.3. Abstraction again

Assume that ξ₀, x E α ⊢ⁱ A, with A of value degree and degree(α) = 2. If i = 1 then from I.2 we infer ξ₀ ⊢ [x : α]A. If i > 1 then, as above, we can retrace some ξ₁, x E α, ξ₂ ⊢ⁱ A E B, where the transition from ξ₁, x E α, ξ₂ ⊢ A to ξ₀, x E α ⊢ A follows from applications of strengthening. By the weakening theorem we can extend the context to ξ₁, x E α, ξ₂, x' E α, with some new x'. By the substitution theorem we can infer ξ₁, x E α, ξ₂, x' E α ⊢ A[x/x'] E B[x/x']. In case we can apply III.2.B (this depends on the language under consideration) we get ξ₁, x E α, ξ₂ ⊢ [x : α]A E [x : α]B. Otherwise the language is AUT-68, i = 2, B = τ, and application of III.2.A gives ξ₁, x E α, ξ₂ ⊢ [x : α]A E τ. Anyhow, rule II and iterated use of strengthening give ξ₀ ⊢ [x : α]A. Resuming:

(degree(α) = 2, A of value degree, x E α ⊢ A) ⇔ ⊢ [x : α]A.

Note: The results in 2.9 and 2.10 are also valid, and simpler to prove, if η-reduction (and strengthening) is not present.
2.11. On the Q-rules

2.11.1. Clearly Q is the equivalence relation generated by >⁺, i.e. by the restriction of > to the correct expressions. So A Q B means precisely that ⊢ A, ⊢ B, and there are correct C₁, ..., Cₖ such that in the chain

A, C₁, ..., Cₖ, B

each expression is related to its successor by > or by < (where possibly, in view of strengthening, the Cᵢ in between are correct w.r.t. extended contexts).
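On a finite stock of expressions this characterization is directly computable. The sketch below (hypothetical names) joins the endpoints of every one-step reduction whose both sides are correct; the resulting partition is exactly the Q of system I:

```python
def q_classes(correct, steps):
    """Equivalence classes of the relation generated by >+, i.e. by the
    one-step reduction `steps` restricted to the `correct` expressions."""
    parent = {t: t for t in correct}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    for a, b in steps:                       # each pair (a, b) records a > b
        if a in parent and b in parent:      # keep only steps between correct terms
            parent[find(a)] = find(b)
    return {t: find(t) for t in correct}
```

So A Q B holds iff A and B land in the same class; a step through an incorrect expression contributes nothing, which is why the zigzag must pass through correct Cᵢ only.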
2.11.2. An alternative rule of Q-propagation is

V.2'. A Q B, ⊢ C, B ↓ C ⇒ A Q C.

If the language definition has this rule, Q becomes (↓⁺)* (Sec. 1.2), i.e. the transitive closure of the restriction of ↓ to the correct expressions. So, no matter what other rules there are in the definition of correctness,

V.2' ⇒ V.2, and CL, V.2 ⇒ V.2'.
2.11.3. An even stronger rule for Q, also including reflexivity, is

V.2''. ⊢ A, ⊢ B, A = B ⇒ A Q B [where = is the (unrestricted) equivalence relation generated by >].

Assuming the (full) CR-theorem, i.e. CR for all, not just the correct, expressions (which is the case if η-reduction is not present), we get

(V.1, V.2') ⇔ V.2''.
2.12. On type-conversion

2.12.1. The Q-formulas (and the Q-rules, see below) can be avoided completely by reformulating IV.1, the type-conversion rule, to

IV.1'. A E B, ⊢ C, (B > C or C > B) ⇒ A E C.

And, corresponding to V.2' rather than to V.2,

IV.1''. A E B, ⊢ C, B ↓ C ⇒ A E C.

As in 2.11.2, IV.1'' ⇒ IV.1' and CL, IV.1' ⇒ IV.1''. Corresponding to V.2'' is the alternative rule

IV.1'''. A E B, B = C, ⊢ C ⇒ A E C [where = is the (unrestricted) equivalence relation generated by >].
2.12.2. The system with Q-formulas, Q-rules V.1 and V.2, and rule IV.1 is indeed a conservative extension of the system without Q but with the corresponding type-conversion rule instead. First we have

IV.1, V.1, V.2 ⇒ IV.1', respectively IV.1, V.1, V.2' ⇒ IV.1'', respectively IV.1, V.1, V.2'' ⇒ IV.1''',

so the Q-system is an extension of the Q-less one. Secondly, the expressions and E-formulas correct in a Q-system are also correct in the corresponding Q-less system.
2.12.3. Notice that in the presence of η, rule IV.1''' (so rule V.2'' too!) is inconsistent, in the sense that it gives rise to anomalies such as self-application. This fact is connected with the failure of the Church-Rosser property for βη-reduction on arbitrary (not necessarily correct) expressions.

Example: If α E τ then ⊢ [x : α]α and ⊢ [y : [x : α]α]α. Further [x : α]α = (by β) [x : α](x)[y : [x : α]α]α = (by η) [y : [x : α]α]α. So, if f E [x : α]α then (f) f E α.
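The two conversion steps of this example can be checked mechanically. The sketch below (toy syntax, hypothetical helper names) performs the outside β-step and the outside η-step on the middle term and confirms that they yield the two distinct abstraction expressions:

```python
# Terms: ('const', c), ('var', x), ('abs', x, dom, body) for [x : dom]body,
# and ('app', a, f) for (a) f.

def free_vars(t):
    if t[0] == 'const':
        return set()
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'abs':
        _, x, dom, body = t
        return free_vars(dom) | (free_vars(body) - {x})
    _, a, f = t
    return free_vars(a) | free_vars(f)

def subst(t, x, a):          # naive substitution, enough for this closed example
    if t[0] == 'const':
        return t
    if t[0] == 'var':
        return a if t[1] == x else t
    if t[0] == 'abs':
        _, y, dom, body = t
        return ('abs', y, subst(dom, x, a), body if y == x else subst(body, x, a))
    _, b, f = t
    return ('app', subst(b, x, a), subst(f, x, a))

def beta(t):
    """Outside beta-step on (A)[x:d]B, i.e. ('app', A, ('abs', x, d, B))."""
    _, a, lam = t
    _, x, _, body = lam
    return subst(body, x, a)

def eta(t):
    """Outside eta-step on [x:d](x)F with x not free in F."""
    _, x, _, body = t
    _, arg, f = body
    assert arg == ('var', x) and x not in free_vars(f)
    return f

alpha = ('const', 'alpha')
inner = ('abs', 'y', ('abs', 'x', alpha, alpha), alpha)    # [y:[x:a]a]a
mid = ('abs', 'x', alpha, ('app', ('var', 'x'), inner))    # [x:a](x)[y:[x:a]a]a

left = ('abs', 'x', alpha, beta(('app', ('var', 'x'), inner)))  # [x:a]a
right = eta(mid)                                                # [y:[x:a]a]a
```

So the middle term converts (by one β-expansion and one η-reduction) two expressions of different shape, which is the anomaly the text points at.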
2.13. On type-inclusion

2.13.1. Iterated use of the rule of type-inclusion gives

A E [x̄ : ᾱ][ȳ : β̄]τ ⇒ A E [x̄ : ᾱ]τ, so in particular A E [x̄ : ᾱ]τ ⇒ A E τ.

This shows that AUT-68 is a sublanguage of AUT-QE: all the correct books, contexts, expressions and formulas of AUT-68 are also correct in AUT-QE.

Proof. Rule III.2.A, not in the definition of AUT-QE, can be derived from III.2.B¹ and IV.2. For, let x E α ⊢ B E τ. Then ⊢ [x : α]B E [x : α]τ, so ⊢ [x : α]B E τ, q.e.d. □
2.13.2. Conversely, rule IV.2 is (vacuously) a derived rule of AUT-68, because all the correct AUT-68 1-expressions δ-reduce to τ.

2.14. The form of derivations

2.14.1. We called the rules III the formation rules of non-1-expressions. This is because, in a proof of ⊢ⁱ⁺¹ A, we can retrace some ξ ⊢ A E B and ξ₁ ⊢ A E C, such that
(i) the last rule applied in proving ξ₁ ⊢ A E C is the formation rule of A, i.e. one of the rules III,
(ii) the transition from ξ₁ ⊢ A E C to ξ ⊢ A E B is by iterated use of VI.2 and type conversion,
(iii) the transition from ξ ⊢ A E B to ξ₀ ⊢ A is by using VI.2, II, and VI.1.

So, in case there is no type-inclusion applied, e.g. if i > 1, we have (use weakening) ξ₁ ⊢ B Q C. Below we introduce a symbol covering the relation between B and C in case type-inclusion is involved.
2.14.2. The new relation ⊂ can be defined as follows:

(i) ⊢ [x : α]A, x E α ⊢ A ⊂ B ⇒ [x : α]A ⊂ [x : α]B;
(ii) A Q B ⇒ A ⊂ B;
(iii) ⊂ is transitive;
(iv) ⊢¹ α ⇒ α ⊂ τ.
Clearly, ⊂ is a reflexive and transitive relation on the correct expressions, including Q and type-inclusion, which on the non-1-expressions coincides with Q (use 2.10.3). The type modification rules can now be contracted to one rule:

IV. A E B, B ⊂ C ⇒ A E C.

And, for ξ₁, B and C as in 2.14.1, we now have ξ₁ ⊢ C ⊂ B.

2.14.3. So, in a proof of ⊢ [x : α]B E D we can retrace x E α ⊢ B E C with [x : α]C ⊂ D. Similarly, in a proof of (A) B E D we can retrace either (i) B E [x : α]C with C[A] ⊂ D, A E α, or (ii) B E C E [x : α]E with (A) C ⊂ D, A E α. And, in a proof of c(C̄) E D we can retrace some

c(C̄) E typ(c)[C̄] ⊂ D.
2.14.4. Above, we used already

⊢ [x : α]A, x E α ⊢ A Q B ⇒ [x : α]A Q [x : α]B.

The other monotonicity rule

α Q β, ⊢ [x : α]A ⇒ [x : α]A Q [x : β]A

follows by induction on Q, using the substitution theorem. However, we do not know yet

A Q B, C Q D ⇒ (A) C Q (B) D,

and consequently it is a priori not clear that (uniqueness of types for 3-expressions)

⊢³ A E α, A E β ⇒ α Q β.

This (and its weaker counterpart for 2-expressions) will not be proved before the next section (3.2.4, 3.2.6).

2.15. On the application rules

2.15.1. In AUT-68, where no 1-abstraction expressions are formed, the rule III.3.B is vacuously a derived rule, viz. there are no B with B E C E [x : α]D.
Since, in AUT-68 and AUT-QE+,

⊢² [x : α]C ⇒ [x : α]C E [x : α]D (for some D),

we can restrict the rule III.3.A,

A E α, B E [x : α]C ⇒ (A) B E C[A],

to the case where degree(C) = 1.
2.15.2. As an alternative to III.3.B (and to III.3.A if I.3 is present) we mention

III.3.B'. ⊢ (A) C, B E C ⇒ (A) B E (A) C.

The following equivalences hold:

(I.3, III.3.A, III.3.B) ⇔ (I.3, III.3.B'), (III.3.A, III.3.B) ⇔ (III.3.A, III.3.B').

Proof. E.g. that III.3.A is a derived rule in the presence of I.3 and III.3.B'. Let A E α, B E [x : α]C. By I.3 (and III.3.B', if degree(C) = 2), ⊢ (A) [x : α]C. By the single substitution theorem ⊢ C[A]. So by III.3.B' and type-conversion (A) B E C[A]. □
2.15.3. Notice that in the presence of η-reduction rule III.3.A by itself is sufficient, because

η, III.3.A ⇒ III.3.B.

Proof. Assume A E α, B E C E [x : α]D. Then x E α ⊢ x E α, so by III.3.A, x E α ⊢ (x) C E D, and by abstraction ⊢ [x : α](x) C E [x : α]D. By II and type-conversion B E [x : α](x) C (x ∉ FV(C)), so by III.3.A (A) B E (A) C, q.e.d. □
2.16. An E-definition for Λ and Λ+

2.16.1. In order to adapt the E-definition to Λ and Λ+ we must first drop the inhabitable degree condition, and the restriction to α of degree 2 in the abstraction rules I.2 and III.2. The rule of type-inclusion and rule III.2.A must be skipped, but III.2.Bⁱ is permitted for all i. A suitable combination of application rules is I.3 and III.3.B' for Λ+, and III.3.A and III.3.B' for Λ. An alternative for III.3.B' is an extended form of III.3.B:

A E α, B E C₁ E ... E Cₖ E [x : α]D ⇒ (A) B E (A) C₁.
2.16.2. Degree considerations for Λ and Λ+ are indeed more involved than those in 2.7. Of course we can show weak degree correctness, as in 2.7, but we must know more in order to establish degree correctness. See Ch. VII, Sec. 2.2. The various properties proved above, such as substitutivity, correctness of categories, etc., simply go through for the E-versions of Λ and Λ+.
V.3. The actual closure proof

3.1. Heuristics

3.1.1. The first idea which comes to mind about proving closure CL,

CL. ⊢ A, A ≥ B ⇒ ⊢ B,

is simply to prove one-step closure CL1,

CL1. ⊢ A, A > B ⇒ ⊢ B,
by induction on ⊢ A and then use induction on ≥. Among the possible ways of one-step reduction we distinguish the main or "outside" reductions

(β) (A) [x : B]C > C[A],
(η) [x : α](x) A > A, x ∉ FV(A),
(δ) d(Ā) > def(d)[Ā],

and the "inside" reductions, which follow by the monotonicity rules:

(appl) A > A', B > B' ⇒ (A) B > (A') B',
(abstr) α > α', A > A' ⇒ [x : α]A > [x : α']A',
(const) Ā > Ā' ⇒ c(Ā) > c(Ā').
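For β alone, this outside/inside case split can be sketched in Python as an enumeration of single-redex one-step reducts (toy syntax, hypothetical names; the disjoint relation >₁ of 2.8.2 would contract several of these redices at once):

```python
# Terms: ('var', x), ('const', c), ('abs', x, dom, body), ('app', a, f) for (a) f.

def subst(t, x, a):                 # naive substitution, enough for the demo
    if t[0] == 'const':
        return t
    if t[0] == 'var':
        return a if t[1] == x else t
    if t[0] == 'abs':
        _, y, dom, body = t
        return ('abs', y, subst(dom, x, a), body if y == x else subst(body, x, a))
    _, b, f = t
    return ('app', subst(b, x, a), subst(f, x, a))

def one_step(t):
    """All single-redex beta one-step reducts: the outside contraction,
    if any, plus the inside steps given by (appl) and (abstr)."""
    out = []
    if t[0] == 'app':
        _, a, f = t
        if f[0] == 'abs':                              # outside: (A)[x:B]C > C[A]
            out.append(subst(f[3], f[1], a))
        out += [('app', a2, f) for a2 in one_step(a)]  # (appl), argument side
        out += [('app', a, f2) for f2 in one_step(f)]  # (appl), function side
    elif t[0] == 'abs':
        _, x, dom, body = t
        out += [('abs', x, d2, body) for d2 in one_step(dom)]   # (abstr)
        out += [('abs', x, dom, b2) for b2 in one_step(body)]
    return out
```

The closure proof follows exactly this case structure: one case per clause of `one_step`, i.e. one per outside reduction and one per monotonicity rule.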
So we assume that > stands for disjoint one-step reduction. Now consider, e.g., the appl-case, where the correctness of (A) B follows from A E α, B E [x : α]C. Here the induction hypothesis, CL1 applied to A and to B, just tells us that ⊢ A' and ⊢ B' (where A > A', B > B'), which is of course not enough to conclude ⊢ (A') B'. This suggests that we need preservation of types PT,

PT. A E α, ⊢ B, A ≥ B ⇒ B E α,

or at least one-step preservation of types PT1,

PT1. A E α, ⊢ B, A > B ⇒ B E α,

additionally. Similarly with the const-case of one-step reduction.

3.1.2. So the next idea is to combine CL and PT to
CLPT    ⊢ A (E α), A ≥ B ⟹ ⊢ B (E α)

(as the conjunction of the version with and the version without parentheses) and to use the same induction. I.e., first prove
The language theory of Automath, Chapter V (C.5)
CLPT1    ⊢ A (E α), A > B ⟹ ⊢ B (E α)
by induction on correctness and then use induction on ≥. This works fine with all the inside reductions. E.g., consider once more the appl-case: A E α, B E [x : α]C, A > A′, B > B′. Now the induction hypothesis gives us A′ E α, B′ E [x : α]C and (A′)B′ E C[A′]. Since > is disjoint one-step reduction, C[A] > C[A′] so C[A] Q C[A′], so (A′)B′ E C[A], q.e.d. The other cases of inside reductions are treated similarly, using some facts from the previous sections. Then the outside reductions: δ and η do not cause major difficulties either. For δ use the simultaneous substitution theorem and the compatibility of def and typ, for η use the strengthening rule. But there is a problem with β-outside reduction. For, in order to conclude ⊢ C[A] from ⊢ (A)[x : B]C, we seem to need soundness of applicability, SA
SA    ⊢ (A)[x : B]C ⟹ A E B
which would allow us to use the single substitution theorem.
3.1.3. Let us try to find out about SA. So consider the assumptions which can lead to the correctness of (A)[x : B]C. E.g. A E α, [x : B]C Q [x : α]D (resp. [x : B]C E [x : α]D). Then SA amounts to uniqueness of domains, UD
UD    [x : B]C Q [x : α]D ⟹ B Q α

resp. extended uniqueness of domains, EUD

EUD   [x : B]C E [x : α]D ⟹ B Q α
or: A E α, [x : B]C E D E [x : α]E (these are the assumptions of rule III.3.B). As in 2.14.3, for some F, [x : B]C E [x : B]F ⊆ D (and in fact [x : B]F Q D). So, in this case SA seems to require the left-hand equality rule LQ
LQ    A E α, A Q B ⟹ B E α

which would give [x : B]F E [x : α]E and, by EUD, A E B. However, LQ ⟹ PT. So it appears that we cannot do SA separately beforehand (i.e. not if III.3.B is present) and then proceed with CLPT as sketched above.
3.1.4. In order to simplify matters, we first forget about type-inclusion. Then we may hope to be able to prove uniqueness of types, UT

UT    A E α, A E β ⟹ α Q β.
If we assume UT then UD ⟹ EUD and, besides, LQ and PT turn out to be equivalent. This may suggest us to incorporate the proof of SA in the proof of CLPT. But we do not have UT yet. If we try to prove UT by induction on the length of A, we come again into trouble with rule III.3.B. For, let A₁ E α, A₂ E B E [x : α]D, A₂ E C E [x : α]E. The ind. hyp. just gives us B Q C here, but we need more, viz. something like
(*)    ⊢ (A)B, B Q C ⟹ (A)B Q (A)C
(this is one half of the third monotonicity formula of Sec. 2.14.4). Since a proof of (*) requires LQ in turn, UT cannot be isolated either. We might try to combine SA, UT and CLPT, i.e. to prove the necessary instances of SA and UT in the course of the proof of CLPT1. A proof along these lines is indeed possible even if type-inclusion is present, but it has a complicated structure and it cannot easily be extended to languages with higher function degrees, such as Λ and Λ+.
3.1.5. Thus we prefer the alternative approach sketched below, which essentially runs as follows: first prove PT1, UT and LQ by induction on degree, then prove SA and UD, and afterwards prove CL as indicated in 3.1.1. To this end we distinguish degree-i-versions of the various properties
PT1ⁱ    ⊢ⁱ A E α, A > B, ⊢ B ⟹ B E α
LQⁱ     ⊢ⁱ A E α, A Q B ⟹ B E α
UTⁱ     ⊢ⁱ A E α, A E β ⟹ α Q β
(*ⁱ)    ⊢ⁱ B Q C, ⊢ (A)B ⟹ (A)B Q (A)C
UDⁱ     ⊢ⁱ [x : α]A Q [x : β]B ⟹ α Q β
SAⁱ     ⊢ⁱ (A)[x : B]C ⟹ A E B.
First notice that: PT1ⁱ, UTⁱ ⟹ LQⁱ, and that: LQⁱ ⟹ (*ⁱ) and LQⁱ ⟹ UTⁱ⁺¹.
We assume that the language under consideration is a non-+-language (see Sec. 2.7). Then it is relatively easy to show UDᵏ and UTᵏ⁺¹ (ignoring type-inclusion), where k is the lowest value degree. Now let us try to prove PT1ⁱ⁺¹ by
induction on correctness, where we assume PT1ʲ, LQʲ and UTʲ⁺¹ for j ≤ i. An instructive example is the appl-case of inside reduction: A > A′, B > B′, ⊢ⁱ⁺¹ (A)B, ⊢ⁱ⁺¹ (A′)B′. It is no restriction to assume that both (A)B and (A′)B′ originate from the extended application rule of 2.16.1: A E α, A′ E α′, B E C₁ E ... E Cₗ E [x : α]D, B′ E C₁′ E ... E Cₗ′′ E [x : α′]D′ with degree(D) = degree(D′) = k and l = l′. Then by the ind. hyp. we have B′ E C₁, so by UTⁱ⁺¹ C₁ Q C₁′ and by LQⁱ C₁′ E C₂. Then follows C₂ Q C₂′ and C₂′ E C₃ etc. Finally we have [x : α]D Q [x : α′]D′ and by UDᵏ α Q α′, so A′ E α. Hence (A′)B′ E (A′)C₁ Q (A)C₁, so (A′)B′ E (A)C₁, q.e.d. From PT1ⁱ⁺¹ and UTⁱ⁺¹ we get LQⁱ⁺¹, (*ⁱ⁺¹) and UTⁱ⁺². So by induction, we get PT1, LQ, (*) and UT.
3.1.6. It is clear that SAⁱ⁺¹ can be distilled from the proof of PT1ⁱ⁺¹, but it can alternatively be given as follows. First, we have LQⁱ⁺¹, UDⁱ ⟹ UDⁱ⁺¹, so we have UD. Now let ⊢ⁱ⁺¹ (A)[x : B]C. Then (see Sec. 2.15.2) either A E α, [x : B]C E [x : α]D, or [x : B]C E E, ⊢ (A)E. Further [x : B]C E [x : B]F. So by UT and UD we have α Q B, or by (*) we have ⊢ (A)[x : B]F. So from LQ, UD and UT we get SAⁱ ⟹ SAⁱ⁺¹ and by induction SA.
3.2. Closure for βη-AUT-QE
3.2.1. For definiteness we present a rather detailed version of our closure proof here for βη-AUT-QE, i.e. AUT-QE without definitional constants and without δ-reduction. So the admitted degrees are 1, 2 and 3, the value degrees are 1, 2 and 3, the domain degree is 2 and the argument degree is 3. The function degrees are just 2 and 3, so βη-AUT-QE is a non-+-language. So the reasoning of Sec. 3.1.5 is valid, but for additional problems due to the presence of type-inclusion (viz. that UT is not true and that not immediately (PT1 ⟹ LQ) and (UD ⟹ EUD)). These problems are overcome by the introduction of a "canonical type" in Sec. 3.2.4 below. This canonical type also plays a role in the β-case of PT1. Later we include definitional constants and δ-reduction, and application expressions of degree 1, thus extending our result to βηδ-AUT-QE+ (in Section 3.3). A closure proof of βη-AUT-68 can easily be imitated from the proof below and is in fact somewhat easier because there is no type-inclusion.
3.2.2. We specify a set of rules (in shorthand, omitting contexts) for βη-AUT-QE, which according to the properties in 2.10-2.15 are equivalent to the rules indicated previously.

⊢ τ
..., x E α, ... ⊢ x (E α)
x E α ⊢ A (E B) ⟹ ⊢ [x : α]A (E [x : α]B)
A E α, ⊢² B E [x : α]C ⟹ ⊢ (A)B (E C[A])
A E α, B E C E [x : α]D ⟹ ⊢ (A)B (E (A)C)
Ā E ᾱ[Ā], x̄ E ᾱ * p(x̄) E β is a scheme ⟹ ⊢ p(Ā) (E β[Ā])
A E B ⊆ C ⟹ A E C
⊢ A, A > B or B > A, ⊢ B ⟹ A Q B    (where > is ≥₁, i.e. disjoint one-step βη-reduction)
A Q B Q C ⟹ A Q C
A Q B ⟹ A ⊆ B
⊢¹ A ⟹ A ⊆ τ
x E α ⊢ A ⊆ B ⟹ [x : α]A ⊆ [x : α]B
A ⊆ B ⊆ C ⟹ A ⊆ C
strengthening.
3.2.3. On 1-expressions and type-inclusion
3.2.3.1. Since there are no 1-application expressions and no definitional constants, all 1-expressions are of the form [x̄ : ᾱ]τ, with x̄ possibly empty. And, if ⊢¹ [x : α]A, ⊢¹ [x : β]B, [x : α]A > [x : β]B, then α > β, A > B, so α Q β and x E α ⊢ A Q B. So, by induction on Q, we can show

UD¹    ⊢¹ [x : α]A Q [x : β]B ⟹ α Q β    (and x E α ⊢ A Q B).

Then, by induction on ⊆, we get

⊢¹ [x : α]A ⊆ [x : β]B ⟹ α Q β    (and x E α ⊢ A ⊆ B).
3.2.3.2. We introduced UTi, uniqueness of types for expressions of degree i (i > 11,
The language theory of Automath, Chapter V (C.5)
UT~
P A E B A,
*
EC
539
BQC.
For i = 3 this will be proved below, but for i = 2 it is simply false in view of type-inclusion. Now we define
B 0 C : e BcCorCcB. Below we shall prove that the new symbol covers the relationship between B and C whenever A E B and A E C. Clearly on the non-1-expressions 0 is just Q. We have k1[z:a]A0[z:P]B
aQP,
(zEakAOB).
Further 0 satisfies a strengthening rule, and is substitutive: AEQ, z E a k B O C
+
B[A]OC[A].
3.2.3.3. We also want to show
FIBoC
@
forsomeA, A c B a n d A c C .
Proof. + is trivial. So let B I1 A C C. Then A = [t: 71 [y’ : [Z : Z]T, h’= [.?: 711 [y’: C = [.?: 721. (or similar with B and interchanged), with 0 “7 Q T~ Q 72”, Y E 7 F Q f i l , l . SO B c c (or c c B ) .
B~]T,
c
B
3.2.4. The canonical type 3.2.4.1. It is possible, for each A with ki+’ A to indicate a n
QO
such that
is a minimal representative - w.r.t. IT - of the categories of A, i.e. A E QO and: (A E Q + QO C a)
(1)
a0
(2)
FV(a0) c FV(A)
.
We call this (YO the c a n t y p of A (with respect t o a context). The definition of c a n t y p is like the definition of t y p in [van Daalen 80, Sec. IV.3.21 [ t y p is like c a n t y p but with rule (iv) for B of all degrees, and without rule (v)], but slightly modified in order t o stay in the correct fragment, as follows:
= tYP(.)
(i)
CantYPb)
(ii)
CantYP(P(4) = tYP(P”1
(iii)
cantyp([z : o ] B )= [z : a ] c a n t y p ( B ) - w.r.t. to extended context
(iv)
cantyp((A) B ) = (A) c a n t y p ( B ) if degree ( B )= 3
(v)
cantyp((A) B ) = C [ A ]if degree ( B )= 2 and c a n t y p ( B ) = [z : Q]C.
-
540
D.T. va,n Daalen
Clearly, typ(A) 2 cantyp(A) so property (2) above is immediate. Now we prove a lemma corresponding to property (1). Lemma. If LQ' and I-'+' A E a then A E cantyp(A) C a. 3.2.4.2.
Proof. By induction on the length of A. The more interesting cases are (i)
A = [z : crl]Al, z E a1 I- A1 E 0 2 , [z : al]a2 c a. By the ind. hyp., z E 01 I- A1 E cantyp(A1) C C Y ~ ,SO [Z : a l ] A E [Z : 011 c a t y p ( A 1 ) cantyp(A) c [z : a1Jcr2I I a,q.e.d.
(ii) A = (Al) A2, A1 E a1, I-2 A2 E [z : al]C, C[A1] C a. By the ind. hyp., A2 E cantyp(A2) c [z : al]C so cantyp(A2) s [z : ai]C'. Hence cantyp(A) is indeed defined, a1 0 a;, z E a1 I- C' C C, so (Al) A2 E C'[Al] C a, q.e.d. (iii) A = (Al) Az, A1 E 01, F3A2 E B E [z : al]C, (Al) B 0 a. By the ind. hyp. A2 E cantyp(A2) Q B. By Lq* we can use property (*') of Sec. 3.1.5 and 0 get cantyp(A) 9 ( A l )B 9 a , q.e.d. 3.2.4.3. Corollary.
(a)
I-'A E B, A E C =+ B O C (this is, for A of degree 2, the desired property of 0 ) .
(ii) k 2 [x : a ] A E [x : P]B =+ a Q p,
2 E
cr I- A E B (this includes EUD2).
(iii) S A ~ . Proof. (i) LQ1 is vacuously fulfilled, so B 7 cantyp(A) c C, so by 3.2.3.3 0 B 0 C. (ii) and (iii) are immediate. 3.2.5.1. Now that we have introduced cantyp we can use it in the proof of PT. We define the property of preservation of cantyp.
PCT'
!-'A,
A 2 A'
,
I- A' =+ cantyp(A)
cantyp(A')
.
Similarly PCT';; PCT is the conjunction of all the PCTi. We first prove some lemmas for PCT2. 3.2.5.2. Lemma (substitution lemma for cantyp): Let B* stand for B[x/A]. Then z E a , $ E /? I-2 C, k3 A E a cantyp(C)* cantyp(C*) where the cantyp's are taken w.r.t. (z E a,$E $) and ($E p't) resp.
=
Proof. Induction on C. Note that C f z, because degree(z) = 3. Some cases are:
The language theory of Automath, Chapter V (C.5)
54 1
= [z : C1]C2, cantyp(C)* = [z : C;]cantyp(Cz)* (w.r.t. z E a , y ’ p, ~ z E c 1 ) E (by ind. hyp.) [ z : cr]cantyp(Cz) (w.r.t. a E p*,z E c;) -a
(i) C
G
cantyp(C*), q.e.d. (ii) C
=
(Cl)C2, cantyp(C)’
=
D[Cl]*= D*[Cr] where cantyp(C2) = cantyp(C,’), so cantyp(C*)
=
[ z : 710 and, by ind. hyp., [ z : 7*]D*
=
D*[C;] as well, q.e.d.
0
3.2.5.3. Corollary. z E a k2 C, k3A E a
*
cantyp(C)[A] = cantyp(C[A]). 0
3.2.5.4. Corollary (p-PCT;): CantYP(C[Al).
k2 (A) [z : B] C
+-
cantyp((A) [z : B] C)
Proof. By SA2 we have A E B, so even cantyp( (A) [z : B]C) G cantyp(C)[A] = 0 c U t Y P(C [A1). 3.2.5.5. Lemma (Q-PCT:): k2[z : a](.) A , z $! FV(A) =+ cantyp([z : a](.) A)
cantyp(A)
.
Proof. Let cantyp(A) = [y : p]D and let k2 [z : a](.) A be based upon z E a’, A E [y : a’]D’. By 3.2.4.2 [y : p]D c [y : &’ID‘ and z $2 FV([y : BID), so a Q a‘ Q p and cantyp(A) = [z : p]D[y/z] Q [z : a]D[y/z] = cantyp([z : a](z)A). 0 3.2.5.6. Theorem. PCT:.
Proof. Let I-2 A, I- A’, A > A’. For a main reduction use 3.2.5.4 or 3.2.5.5. For inside reductions use induction on the length of A. Some cases are:
=
[z: A’,]A;, A1 > A;, A2 > A;. By ind. hyp. (i) A G [z : AlIA2, A’ cantyp([z : A1]A2) Q c m t y p ( [ z : AlIA’,) = [z : Al) cantyp(A;) Q [z : A:] cantyp(Ak), by the substitution property 3.2.5.3.
= (Al) A2, A’ = (A;)
A;, A1 > A‘,, A2 > A;. Since (Al) A2 is correct, A1 E a1,A2 E cantyp(A2) = [z : PIC C [z : a l ] D . So a1 Q p. Similarly A’, E a’,, A; E cantyp(A;) = [z : p’]C’ C [z : a’,]D’. So a{ Q p’. By the 0 ind. hyp. [z : PIC [z : p’]C’, so C[Al] Q C’[Al] Q C’[A’,], q.e.d.
(ii) A
D.T. van Daalen
542
3.2.5.7. Corollary. (i)
PT?,
(ii) Lq2, 0
(iii) U D ~ .
3.2.6.1. By Lq2 we can apply 3.2.4.2 to expressions of degree 3 now. We get: (i)
F3A E a
+
A E cantyp(A) Q
0.
(ii) UT3 : k 3 A E a , A E p + a O p (i.e. a Q of 0 for A of degree 3).
p) (this is the announced property
(iii) SA3 (e.g. as in 3.1.6). Notice that by UT3 the properties PCT3 and PT3 are equivalent.
3.2.6.2. We introduce CLPTZ: FiA(Ea) , A 2 A’
+ FiA’(Ea)
and similarly CLPT!. Here follow some lemmas for CLPT;.
3.2.6.3. Lemma (PLCLPTB): k3(A) [z : B]C E D
+
C[A] E D.
Proof. Let A E a, [z : B] C E F E [z : a ] G , (A) F Q D , and let z E B I- c E H. [z : B]H F . By SA3 we have A E B and by (*’) (A) [z : B] H Q (A) F . By 0 the substitution theorem for correctness C[A] E H[A] Q D.
3.2.6.4. Lemma (vLCLPT7): F3[z : a](.) A E B , z @ FV(A)
+
A E B.
Proof. cantyp([z : 0](z) A) G [z : a](.) cantyp(A) Q cantyp(A) (by 77-re0 duction), by strengthening I- A, so by 3.2.6.1 A E B.
3.2.6.5. Now we are ready for CLPT. Theorem (CLPT1) : I- A ( E ~ )A, > A’
+ I- A ‘ ( E ~ ) .
Proof. If A > A’ is a main reduction use SA, strengthening, PT2 and the preceding two lemmas. Otherwise use induction on the length of A. (i)
A = [z : al]A1, A’ = [z : a:]Ai, a1 > a:, A1 > A;, 2 E a1 I- A 1 ( ~ a 2 ) , ([z: a1102 c a ) . By ind. hyp. I- a: and z E a; I- A ~ ( E ~ Y ~ ) . So I- [z : a’,]A’, (E[z : ai]a2 Q [x : al]a2 C a ) - read this twice, one time with and one time without the symbols in parentheses -.
The language theory of Automath, Chapter V (C.5)
543
(ii) A = (Al)Az, A’ = (A;)Ai, A1 > A;, AZ > A;, Al E a1, A2 E [z : a l ] C , C[A] c a. By ind. hyp. A: E 011, A; E [x : al]C. So A’ E C[A{]Q C[Al].
(iii) As in (ii), but A2 E B E [z : al]C, ( A l )B C a. By ind. hyp. A’, E q , Ah E B , SO A’ E (A:) B Q ( A * )B. (iv) A = p(B1, ...,B k ) , A’ = p(Bi ,...,B i ) , 2 > I?’, B1 E P I , Bz E Pz[B1],..., Bk E Pk[Bl,...,Bk-11, P [ B ]c a, where y ’ E p’ * p(y’) E P is a scheme. By so ind. hyp. B; E P i , Bi E P2[B1]9 P2[B:],...,Bk E P k [g1Q Pk[g’], 0 p ( B i , ...,Bk) E P [ B { ..., , B;] Q P [ g ] .
3.2.6.6. Corollary. (i)
CLPT,
(ii) LQ, (iii) UD.
0
3.2.6.7. Corollary (Rule V . 2 , See. 2.11): F A , F B , A J. B
+
A 9 B.
0
3.3. Extension to Pr&AUT-QE+ 3.3.1. Now we consider P$-AUT-QE+, i.e. Pq-AUT-QE extended with 1application expressions, with definitional constants and with definitional reduction. The additional rules are 1.3
A E ~ F, I B Q [ z : a ] C +I-’(A)B
(vi’)
A E a[A],2 E 3 * d ( 2 )
:= d($) := D(E E ) is a scheme =+
k d(A)(EE [ A ] )
(cf. Sec. 3.2.2 and Sec. 2.3 respectively). If we try to repeat the previously given proof, we first come in trouble because not all the compound 1-expressions are abstraction expressions anymore. This makes the proof of UD1 from Sec. 3.2.3 fail, though the property itself remains valid. Furthermore there is the problem with definitional 2-constants and typeinclusion (mentioned in Sec. 1.7), which makes Lq2 fail. Below we give an indirect proof instead which runs as follows: first we show (Secs. 3.3.3-3.3.8) that the indicated extension is a so-called unessential extension. Then we use this fact to transfer the desired properties from Pq-AUT-QE t o the new system (Sec. 3.3.9). Finally (in Sec. 3.3.11) we briefly discuss an even larger system than AUT-QE+, which we call AUT-QE*.
D.T. van Daalen
544
3.3.2. Some terminology Consider two systems of correct expressions with typing and equality relation, (k, E, Q ) and (k+,E+, Q+) respectively. (F+, E+, Q+) is an extension of (k,E,Q) if t- =% k+, E + E+ and Q =% Q+, i.e.:
B -I resp. B;E
I- resp.
B;[ I- A
(E/Q
B)
+
B k+ resp. B; [ k+ resp. B;E k+ (E+/Q+ B ) . We further just write t-+ A E/Q B instead of k+ A E+/Q+ B. The “new” system k+ is said to be conservative over the “old” system k if all new facts about old objects are old facts, i.e. if UEO
F A , I- B , k + A E/Q B + t - A E/Q B .
An extension is unessential if no “essentially new” objects are formed, i.e. if all new objects are equal t o old ones. This means that the new system can be translated into the old one by a mapping-, working on expressions, books and contexts, such that
+k+AqA-
t-+A
UE2
B I-+ resp. B;[ I-+ resp. B;[ I-+ A + B- k resp. B-; [- I- resp. B-;[- I- A-
UE3
B;[k+ A
E/Q
B
+
and k A
+
UE1
B-;[- F A -
A=A-
E/Q
B- .
Clearly unessential extensions are conservative. Property U E 3 means that new formulas inply their old counterparts. Unessential extensions also satisfying UE3/, the converse of UE3, UE3’
+ + A , F+B, t - A - E / Q B -
+~+AE/QB
are called definitional extensions. In a definitional extension new formulas are equivalent to old ones. All unessential extensions satisfy the Q-part of UE3’, but for the E-part we need property LQ for the larger system (at least if the smaller system satisfies LQ). For that matter, if the +-system satisfies LQ, we have UE1,UEZ
UE3’
and: UEO,UEl,UE2
+ UE3 ,
3.3.3. The translation Of course, we take Pg-AUT-QE for our smaller system I- and we take Pqb-AUT-QE+ as the extension k+. We are going to prove that k+ is a n unessential (but not a definitional) extension.
545
The language theory of Automath, Chapter V (C.5)
For an expression A we intend its translation A- to be the normal form w.r.t. a certain reduction relation In order to make A- well-defined and in view of UE1, UE2 we require
>-.
(0) 2- normalizes and satisfies CR. (1) 2- just affects the new elements of expressions (1-application parts and definitional constants) and removes them.
(2) 2- is part of the reduction relation of the new system and satisfies CLPT.
<
For contexts 5 E r 3 the context E- is simply 5 E 6- (where the meaning of b- is clear). Similarly schemes for primitive constants * p ( 2 ) E /3 are translated into E- * p ( 5 ) E p-. But schemes for definitional constants have to be omitted in the translation. Before fixing 2- we define ij-reduction 2i, i-reduction of degree j (where i is p, 7, 6 or a combination of these). This is the reduction relation generated from elementary ij-reduction, defined as follows: A elementary iJ-reduces to A' if A elementary i-reduces to A' and degree(A) = j. The corresponding one-step reduction is denoted >:. Notice that for degreecorrect A the degree of A' above is j as well (cf. Sec. 2.7). Now, in view of requirement (1) above, we define 2- to be the reduction relation generated from 2; and 2 6 .
<
3.3.4. Notice that pl-reductions cannot be inside reductions. Strong normalization for p' is easy to prove even without using normability. From [van Daalen 80, Ch. 1111 we recall 6-SN and 6-CR. We can show that P'-CR holds, and that p' commutes with all other reductions (such as p2, 6, $) except 77'. (See 11.8.) So 2- commutes with all kinds of reduction but $, and we have >--SN and >--CR (whence requirement (0)above). Clearly >--normal forms d o not contain defined constants anymore; a simremoves the 1-application parts as well. ple normability argument shows that
>-
3.3.5. A further property we want 2.- to satisfy is CLPT. Since 6-CLPT1 follows from the simultaneous substitution theorem (cf. 2.9.4) we just want to know SA1 -I:
( A ) [z : B] C
+ I-+
AEB
or, equivalently, U D ~ [z : B]
c Q [z : a]D + k+ B Q a .
Here turn up the problems with 1-expressions, announced in 3.3.1. To overcome these we seemingly modify our system:
D.T. van Daalen
546 (1) We exclude ql-reduction. (2) We change our 1-application rule into
1.3’
A E a , F I B red- [ z : a ] C
+ k!+(A)B
where red- is 2- restricted to the correct expressions, i.e. generated by
t-+A, F + A ’ , (A
>b
A’orA
>6
A’) + F + A r e d - A ‘ .
Clearly 1.3 1.3’, so the modification is a restriction. However, after having proved >--CLPT (whence UE1, see Sec. 3.3.6), UE2 and UE3 (Sec. 3.3.7) for the modified version, we shall be able to show that both 1.3 and $-equality: t-+ A, A >: A’, F+A’ + F + A A’ are derived rules. Hence the two versions of F+ are equivalent, and we have the desired properties for the original +-system.
3.3.6.1. For the modified system the property SA’ is clear, so we have the theorem (2- -CLPT): F + A (Ea),A 2- A’ + F+A’ (Ea).
>b
Proof. Since we know b-CLPT, and is just [i.e. identity] on the non-lexpressions we only need to consider A of degree 1. Use, e.g., a double induction, viz.
(1) on O-(A) - i.e. the length of the >--reduction tree of A, (2) on length(A). The only interesting case is when A = (Al) AS, A1 E a , A2 red- [z : a]C. If A1 2- A’, then A1 >6 A‘, so by 6 - CLPT A’, E a. If A2 Ah then by the ind. hyp. and by -CR A’, red-[z : a‘]C’, [z : a ] C red-[z : A’IC’. So A: E a’ and t-+ (A’,) A;. If A2 = [z : As]A4 then A1 E A3 (this is SA1) and t-+Aq[Al]. Since a reduction A 2 A’ starts with an inside or with an outside reduction, we are finished by the first ind. hypothesis.
>
>-
3.3.6.2. Corollary (UEI): F + A
+ F+A 9 A-.
0
3.3.7. Theorem (UE2 and UE3): Consider the system without 77’ and with rule I . 9 . Then
B t-+, resp. B ; [ t-+, resp. B;[F+ A ( E / Q B )
+
B- F, resp. B-; [- I-, resp. B-; [- F A - ( E / QB - ) . Proof. By induction on F+, using >--CLPT.
The interesting rules are :
The language theory of Automath, Chapter V ((2.5)
(i)
547
Appl. rule 1.3‘: let E+ A E a , E+ B red- [z : a]C. By ind. hyp. t- A- E a-. Clearly B- = [z : a-]C- and by ind. hyp. E B - , so 2 E a- I- C - , so I- ((A) B ) - = C-[A-], q.e.d.
(ii) Instantiation rule (vi’): let B contain a scheme y’E fi * d(y’) := D (possibly followed by *d(y’) E C). Let B1 be the book preceding this scheme. By ind. hyp. B;;y’Ep’- I- D-(EC-). Now if B ; < E dEfi[g],then by ind. hyp. B-;,$- E B’- E (p’[l?])[k], so B-;
fi-
(iii) Q-rule: let I-+A B, I-+C, B > C. By ind. hyp. E A- B-, E C.Since 2- commutes with all other reductions, except possibly $ which we have forbidden, we find B- 2 C- so by CL for pq-AUT-QE I- B- C- and I- A- C - , q.e.d. The case that C > B instead is completely similar. 0
3.3.8.1. Now we prove that 1.3 is a derived rule in the modified system. So as[z : a-]C-, whence B- must sume I-+A E a , I - i B [z : a]C.By 3.3.7 E’Bp and k+ a p. Further, by 3.3.6.1, I-+B red- Bbe [z : P]Bl with I- aand by 1.3’ E+ (A) B, q.e.d. 3.3.8.2. Similarly, $-equality is a derived rule. Let E+ A, I-+A’, A >: A’. We can assume that degree(A) = 1. By induction on length(A) we prove that k+ A A’. The interesting case is when A = [z : 0111 (z) A’, z $! FV(A’). As in 3.3.8.1, z E 011 I-+ A’ r e d - [ z : a2]Al with z $! FV(a2). By SA1 z E a1 I-+ a1 q a2 and by strengthening I-+a1 9 012. So E+ A q [z : (~1]A1 q [z : az]Al q A‘, q.e.d. 3.3.8.3. Hence the system with 1.3 and 7’-equality is equivalent to the system with 1.3’ and without ql-equality. So we have SA1, >_--CLPT, UE1, UE2 and UE3 for the original system of ,L?r&AUT-QE+ now. 3.3.9. The proof of CLPT 3.3.9.1. As in 3.2.6.5, we can prove CLPTl from outside-CLPT1, by induction on correctness. Clearly 6-CLPT (and a fortiori 6-outside-CLPT1) is included in >--CLPT, so we just need p- and q-outside-CLPT1. In the next section we infer PT3 and SA from our UE-result, which leaves us to prove the p2- and q2-case of outside-PT1 only. These two cases are dealt with in 3.3.9.3.
3.3.9.2. Consider the properties mentioned in 3.1.5. In this section we distinguish the two versions of a property (viz. for the smaller and the larger system)
D.T. van Daalen
548
by providing the latter with a UTg
+ below. It is clear that +
UTft and UTft,PTi
whence UT,:
LQft
.
PT; and LQ$
The property UD is also preserved in passing to the larger system, and in fact, as in 3.2.3.1, k + [ ~a :] A Q [z:P]B
+ k + a Q P,(xE a F + A Q B ) .
By LQ: we have (*;). SA: we knew already. Now we show SA!+ for i # 1: let (A) [I : B]C. Since i # 1, ((A) [z : B]C)- = (A-) [z : B-]C-, so by UE2, I-' (A-) [z : B-]C- and by SA, F A- E B-. Hence by LQ; again, we have SA!+ for i # 1 as well. 3.3.9.3. In Sec. 3.2.5 we used c a n t y p in proving P- and q-outside-PT?. The same procedure applies in the +-system, but with t y p [see 3.2.41 instead of c a n t y p now. In particular we have
ii')
typ(d(/i))
E typ(d)[Af
for defined constants of degree 2 and 3 now, and (i.1
tYP((A) B ) = (A) tYP(B)
for both B of degree 2 and 3.
As in 3.2.4.2 we get I-:A
Ea
+ F + A E typ(A) C a
and, as in 3.2.5.2, k: A E a , (z E
IY
I-' C)
+ typ(C[A]) G typ(C)[A] .
So, as in 3.2.5.4 and 3.2.5.5, we get I-?(A) [z : B1
c*
whence P-outside-PT:,,,
and
k i [z : a I ( 4 A ,
whence q-outside-PT:,,
tyP((A) [z : B1 c)Q tyP(C[Al)
@ FV(A)
* ~ Y P ( [ Z: a I ( 4 A) 9 typ(A)
.
3.3.10.1. In 3.3.9.2 we have carefully avoided the properties which do not hold in the larger system, in particular LQ2 and (*'). For a counterexample
The language theory of Automath, Chapter V (C.5)
549
let d ( z ) be defined by z E 7 * d ( z ) := [y : 212, with typ(d) = 7. If a E 7 , then d ( a ) Q [y : a]a E [y : a]7,but certainly not d ( a ) E [y : a]7,so not LQ2. If, furthermore, A E a , then I- (A) [y : a]a but not I- (A)d(a), whence not (*'). Consequently, the +-system is not a definitional extension of the old system.
3.3.10.2. Besides, if we stick t o our counterexample, z E d ( a ) I- t E [y : a]a,so z E d ( a ) k (A) z E a, but not z E d ( a ) I- (A) d ( a ) (= t y p ( ( A ) z ) ) . This shows that t y p applied t o 3-expressions can lead us out of the correct expressions (in contrast with the situation in the smaller system), and that not:
F3A
+
A E typ(A)
.
3.3.10.3. In the next section we restore (*) and LQ2 by a further extension of the 1anguage.But first we give a theorem stating some very weak versions of LQ2 to hold in P$-AUT-QE+ instead of LQ'. Recall the symbol 0 from Sec. 3.2.3 and the result (Sec. 3.2.4.3, 3.2.6.1) for Pq-AUT-QE:
I-AEB, I-AEC =sI-BDC. Theorem. Let
~ + A E B I,- + C E D , I - A q C . Then
I - + A E D or I - + C E B . Proof. By UE we get I- A- E B-, I- C- E D - , I- A- Q C-. By LQ for Pq-AUTQE we get I-C- E B- SO I- B- 0 D-, sok+B Q B- 0 D- Q D , i.e. F + B o D , 0 i.e. B c D or D c B, q.e.d. 3.3.11.1. The aforementioned anomalies can partially be removed by properly extending PoG-AUT-QE+ to a language P@-AUT-QE*. In this new system we first replace the application rules by [z : a]C, A E a =+ I- (A) B
(1)
B
(2)
B E C, k (A) C =+ I- (A) B E (A) C
Rule (1) is simply 1.3 (Sec. 2.3) without the restriction to degree 1. Rule (2) is III.3.B' (Sec. 2.15). So, indeed, AUT-QE* extends AUT-QE+.
3.3.11.2. By this modification we gain the property k3A =s I- typ(A)
, so it is a proper extension .
Furthermore, by 0-reduction we get
D.T. van Daalen
550
B E [z : a]C
+
B
[z : a ] ( z B )
, which yields property (*)
for the new system. Our counterexample, however, shows that there are still problems: LQ2 does not hold, so we do not yet have a definitional extension of AUT-QE. Besides, now the new 2-expressions (e.g. ( A )d(a) in the example, which is correct now) do not have a correct t y p l and not even an E-formula.
3.3.11.3. The following theorem shows that the difference between AUT-QE+ and AUT-QE* just lies in the particular role of the definitional 2-constants, and that AUT-QE* is an unessential extension of AUT-QE+ (though it is no definitional extension). Theorem. Let t-* stand for correctness in AUT-QE*, and let A’ be the b2normal form of A . Then I-* A(E/QB ) t-+ A’(E/QB‘)(so I- A-(E/Q El-)).
*
0
Proof. Induction on I-*.
3.3.11.4. A drastic way of combining 2-constants with type-inclusion and still preserve LQ, is to add LQ explicitly to the language definition, or at least something like k2A, C E B , A r Z C
*
AEB.
Adding this rule to P@-AUT-QE+ produces the smallest definitional extension of AUT-QE which includes P@-AUT-QE+, and it gives us AUT-QE* plus all the missing E-formulas. An alternative way of defining this new system (we still call it AUT-QE*) is by ignoring the type-assignment part of definitional 2-schemes, and by defining the t y p of a definitional 2-constant to be the t y p of its definiens (compare the way norms have to be introduced for AUT-QE, [van Daalen 80, Ch. IV.4.41). From the latter definition of this new system it will be clear that our desirable properties (except UT2, of course) can be proved for it by the same methods as used in the closure proof of AUT-QE+.
3.3.12.1. Up till now we have, for definiteness, just compared Pv-AUT-QE with P@AUT-QE+ (and P@-AUT-QE*), i.e. we made the extension in one step and added the definitional constants and the 1-appl-expressions simultaneously. One can as well, of course, consider intermediate languages like Pq-AUT-QE+ and Pqb- AUT-QE. Then one notices that the problems with (*), LQ2 and t y p are exclusively due to the 6 (in particular b 2 ) and not to the in P$-AUT-QE+. Thus Pq-AUTQE+ satisfies LQ and (*), and is a neat definitional extension of PO-AUT-QE, whereas P$-AUT-QE has all the unpleasant features of PT&AUT-QE+. In
+
The language theory of Automath, Chapter V (C.5)
551
fact, PgG-AUT-QE+ is a definitional extension of PgG-AUT-QE, and PgG-AUTQE can only be made into a definitional extension of Pg-AUT-QE (call this new system from now on AUT-QE‘) by adding a rule like in Sec. 3.3.11.4.
3.3.12.2. If one takes AUT-68 instead and adds an application rule: AECI,
[ z : a l c Q B E+ ~ (A)BET
(compare 3.3.11.1, rule (1)) one gets the corresponding +-language (i.e. smallest value degree = smallest function degree), AUT-68+. These systems are easier to handle than AUT-QE: both AUT-68 and AUT-68+ satisfy UT, LQ and (*), even in the presence of definitional constants, and AUT-68+ is a definitional extension of AUT-68. Without definitional constants, AUT-68+ is already contained in AUT-QE, but PgG-AUT-68+ is not contained in Pg6-AUT-QE. It is contained, though, in the system AUT-QE’ of 3.3.12.1. Closure for AUT-68+ can, e.g., be proved by the methods of the next section (see 3.4.5).
3.4. Some easier closure proofs (for simpler languages) 3.4.1. There are various ways of proving closure for simpler languages, such as Pq-AUT-68 or PG-AUT-QE. First, one can take the closure proof of the previous sections and adapt it to the language under consideration. Since g-reduction, type-inclusion and liberal degree specification (in particular for function degree) are responsible for many technical details in the proof, the simpler languages allow some obvious simplifications. E.g. if a language lacks q-reduction we can clearly skip the g-closure part and, besides, we can freely use CR. Or, if a language has more restricted function degrees (AUT-68 vs. AUT-QE, non-+-languages vs. +-languages), we have to push SA, LQ, UD etc. through less degree levels. And, if a language lacks type-inclusion (AUT-68 and Nederpelt’s A), we simply have PT + LQ, and do not need to introduce something like cantyp for this purpose. A second approach is suggested by the fact that our language definition contains some technicalities which are only introduced to make the closure proof (i.e. this kind of closure proof, for a complicated language like Pg-AUT-QE) possible. In particular, I intend the use of the restricted Q-rule V.2 instead of the more liberal V.2’, i.e. the use of the restricted system type I, instead of the liberal system type I1 (see Sec. 1.2). Recall that after having proved closure for I, I and I1 can be proved t o be equivalent, and that, after all, we are more interested in system I1 than in system I. Now it turns out that, for the simpler languages, the modifications in the language definition (and the detour via system I) are superfluous, and that we can give a direct closure proof for a type I1 language definition.
D.T. van Daalen
552
Such direct closure proofs are presented below for all the regular languages which either lack η-reduction or have just function degree 3: β(δ)-AUT-68(+), β(δ)-AUT-QE(+) and βη-AUT-68. A mere sketch is given for βη(δ)-AUT-68+ (for the definition of AUT-68+ see Sec. 3.3.12).
3.4.2. So we give these languages by an E-definition with Q-rule
V.2'    A Q B, B ↓ C, ⊢ C  ⟹  A Q C
which a priori is stronger than V.2 but later turns out to be equivalent. The properties in Secs. 2.9, 2.10, such as the substitution theorem, correctness of categories, and the property: α of domain degree, A of value degree, x E α ⊢ A ⟺ ⊢ [x : α]A, simply go through. As in Sec. 3.1, we essentially just need SA for proving closure. So below we confine ourselves to SA and, in connection with this, UD for the various languages. We start with the η-less languages.
3.4.3.1. Theorem. UD for η-less languages.
Proof. Let [x : α]B Q [x : β]C. Then by CR, [x : α]B ↓ [x : β]C, so α ↓ β and B ↓ C, whence α Q β and x E α ⊢ B Q C. □
3.4.3.2. Corollary. SA¹ for β(δ)-AUT-QE+, SA² for β(δ)-AUT-68+.
Proof. Let A E α, [x : B]C Q [x : α]D. Then B Q α, so A E B. □
3.4.3.3. Let ⊆ be defined as in Sec. 2.14. We need a lemma.
Lemma. ⊢ F ⊆ G, G ≥ [x⃗ : α⃗]C ⟹ F ≥ [x⃗ : β⃗]C with |α⃗| = |β⃗| and α⃗ ↓ β⃗ (i.e. α₁ ↓ β₁, α₂ ↓ β₂, etc.).
Proof. Induction on ⊆. □
3.4.3.4. Corollary. SA² for β(δ)-AUT-QE(+), SA³ for β(δ)-AUT-68(+).
Proof. Let A E α, [x : B]C E [x : α]D. Then [x : B]C E [x : B]F ⊆ [x : α]D. So by the previous lemma B Q α and A E B. □
Now in order to get SA³ for β-AUT-QE(+) we need a lemma again. Notice that the proof of this lemma fails when there are definitional constants.
3.4.3.5. Lemma. ⊢ A E B, A ≥ [x⃗ : α⃗]C, B ≥ [x⃗ : β⃗]D with |α⃗| = |β⃗| ⟹ α⃗ ↓ β⃗.
Proof. Induction on the length of A. The interesting cases are:
(1) A ≡ [x₁ : α₁]A₁, A₁ ≥ [x⃗₂ : α⃗₂]C, ⊢ A₁ E B₁, [x₁ : α₁]B₁ ⊆ B ≥ [x₁ : β₁][x⃗₂ : β⃗₂]D, |α⃗₂| = |β⃗₂|. By 3.4.3.3, α₁ ↓ β₁ and B₁ ≥ [x⃗₂ : β⃗₂']B₁'
The language theory of Automath, Chapter V (C.5)
with β⃗₂' ↓ β⃗₂. By the ind. hyp. (α₁, α⃗₂) ↓ (β₁, β⃗₂), q.e.d.
(2) A ≡ (A₁)A₂, A₁ E τ, A₂ E [z : τ]B₁, so B₁[A₁] ⊆ B ≥ [x⃗ : β⃗]D. By 3.4.3.3 again, B₁[A₁] ≥ [x⃗ : β⃗'']D₁ with β⃗ ↓ β⃗''. Because B₁ has degree 1 and A₁ has degree 3, B₁ ≥ [x⃗ : β⃗₀]D₀ with β⃗₀[A₁] ≥ β⃗''. Similarly, since A₂ has degree 2, if (A₁)A₂ ≥ [x⃗ : α⃗]C then A₂ ≥ [z : τ'][x⃗ : α⃗₀]C₀ with α⃗₀[A₁] ↓ α⃗, C₀[A₁] ≥ C. By the ind. hyp. α⃗₀ ↓ β⃗₀, so α⃗ ↓ α⃗₀[A₁] ↓ β⃗₀[A₁] ≥ β⃗'' and by CR α⃗ ↓ β⃗, q.e.d. □
3.4.3.6. Corollary. SA³ for β-AUT-QE(+).
Proof. Let A E α, [x : B]C E D E [x : α]F. Then [x : B]C E [x : B]G Q D, whence D ≥ [x : B']G' with B ↓ B'. By the lemma B ↓ α, so B Q α and A E B. □
3.4.3.7. So we have SA for β(δ)-AUT-68(+) and β-AUT-QE(+). In order to tackle the βδ-case of AUT-QE we first prove δ-CLPT, which gives us an unessential extension result. Then we can either extend SA directly, or first extend lemma 3.4.3.5 to βδ-AUT-QE+ and proceed as before.
3.4.4.1. Now consider βη-AUT-68. We cannot use CR anymore.
Theorem. UD² for βη-AUT-68.
Proof. All 2-expressions are of the form [x⃗ : α⃗]τ or [x⃗ : α⃗]p(C⃗). So if ⊢² A ≥ [x : β]B, then A ≡ [x : α]A₁ with α ≥ β. By ind. on Q we can prove: if ⊢² A Q [x : β]B then A ≡ [x : α]A₁ with α Q β. This gives UD². □
3.4.4.2. Corollary. SA for βη-AUT-68.
Proof. Immediate. □
3.4.4.3. The same proof works as well for βηδ-AUT-68, as follows.
Lemma. ⊢² A ≥δ [x : α]A₁, ⊢² B, A ↓ B ⟹ B ≥δ [x : β]B₁, α ↓ β.
Proof. Since ≥δ commutes with ≥, [x : α]A₁ ≥ [x : α']A₁' ≤δ E ≤ B. By δ-advancement (Sec. II.9.3), B ≥δ C ≥ [x : α'']A₁'' ≤δ [x : α']A₁'. Here the reduction C ≥ [x : α'']A₁'' does not contain δ-reductions, so C ≡ [x : β]B₁ with β ≥ α'' ≤ α' ≤ α, q.e.d. □
3.4.4.4. By the simultaneous substitution theorem we have δ-CLPT again. Then by induction on Q we can prove:
⊢² F Q [x : β]B ⟹ F ≥δ [x : α]A, α Q β.
This gives us UD², whence SA, as before.
3.4.5. It is possible to extend these results (for βη(δ)-AUT-68) to the corresponding +-languages βη(δ)-AUT-68+, but it is rather complicated. We can use a mixture of the methods in 3.4.4.3 and 3.4.4.4 and the methods in Sec. 3.3. Thus we start with leaving η²-reduction out of consideration, and restricting the appl-rule of degree 2 to: A E α, ⊢ B ≥ [x : β]D, α ≥ β ⟹ ⊢ (A)B. Later on these two restrictions prove to be immaterial. For the restricted system SA² is immediate and β²-closure is guaranteed. Then we need δ-β²-advancement and the fact that δβ²-reduction commutes with ≥, and get
⊢² F Q [x : β]B ⟹ F ≥δβ² [x : α]A, α Q β.
This yields UD² and SA³, and we are finished.
V.4. The equivalence of the E-definition with the algorithmic definition 4.1. Introduction
4.1.1. Since in the E-definition the correctness of expressions and formulas (relative to a correct book and a correct context) was given by an ordinary inductive definition, the correctness relation is a priori just recursively enumerable and not necessarily recursive, i.e. effectively decidable. In this section V.4, though, we prove the decidability and discuss some related topics.
4.1.2. First we give some introductory considerations leading to a sketch of a decision procedure (Secs. 4.1.3-4.1.6). The whole verification process is, in principle, reduced to the verification of Q-formulas, for which the decidability follows from the normalization property N and the Church-Rosser property. We can use normalization freely because we proved N for a very large system in IV.4.5, but βη-CR we do not know yet. Therefore we assume throughout V.4 property CR for the correct expressions; for the proof we refer to Ch. VI. Then (see 4.2.2) we present the actual algorithmic definition, to be adapted for the various languages by a suitable choice of a reduction relation, of a typing function cantyp and of a domain function dom for the computation of domains (Secs. 4.2.3, 4.2.4). The equivalence proof in Sec. 4.3 is organized as sketched in Secs. 1.2 and 1.6, with the following effects:
(1) The strengthening rule can be skipped from the E-definition. (2) The E-systems are decidable. (3) The algorithmic system satisfies the nice properties of the E-system: closure, etc.
The final sections concern the verification of Automath languages in practice. This is a matter completely different from the theoretical decision procedure discussed before. In particular, some remarks are made on suitable reduction strategies for deciding Q-formulas.
4.1.3. Deciding Q and ⊆
No matter whether a system has Q-rule V.2 or Q-rule V.2', there holds
A Q B ⟺ ⊢ A, ⊢ B, A ↓ B.
Proof. ⟹: By induction on Q, using CR. ⟸: This is precisely rule V.2', so either it holds by definition or it follows from CL. □
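Operationally (and outside van Daalen's text), this criterion amounts to: normalize both sides and compare. The sketch below does this for plain untyped λ-terms with de Bruijn indices; all names (Var, Lam, App, normalize, q_equal) are our own illustration, termination is assumed via normability (property N), and by CR the syntactic equality of normal forms stands in for the Q-test.

```python
# Sketch only: deciding a Q-formula by comparing beta-normal forms.
# Assumes the inputs are normable (property N), so `normalize` terminates;
# by CR the normal form is unique, hence syntactic comparison decides Q.
from dataclasses import dataclass

class Term: pass

@dataclass(frozen=True)
class Var(Term):
    k: int                      # de Bruijn index

@dataclass(frozen=True)
class Lam(Term):
    body: Term                  # [x : -]body; domains omitted in this toy

@dataclass(frozen=True)
class App(Term):
    fun: Term
    arg: Term

def shift(t, d, cutoff=0):
    """Shift free indices >= cutoff by d."""
    if isinstance(t, Var):
        return Var(t.k + d) if t.k >= cutoff else t
    if isinstance(t, Lam):
        return Lam(shift(t.body, d, cutoff + 1))
    return App(shift(t.fun, d, cutoff), shift(t.arg, d, cutoff))

def subst(t, s, k=0):
    """Substitute s for index k in t (capture-avoiding)."""
    if isinstance(t, Var):
        if t.k == k:
            return shift(s, k)
        return Var(t.k - 1) if t.k > k else t
    if isinstance(t, Lam):
        return Lam(subst(t.body, s, k + 1))
    return App(subst(t.fun, s, k), subst(t.arg, s, k))

def normalize(t):
    """Normal-order beta-normalization."""
    if isinstance(t, Var):
        return t
    if isinstance(t, Lam):
        return Lam(normalize(t.body))
    f = normalize(t.fun)
    if isinstance(f, Lam):
        return normalize(subst(f.body, t.arg))
    return App(f, normalize(t.arg))

def q_equal(a, b):
    """A Q B for (correct, normable) A and B."""
    return normalize(a) == normalize(b)
```

The alternative strategy mentioned at the end of 4.1.6 — searching both reduction trees for a common reduct instead of computing full normal forms — would replace `normalize` by an interleaved search justified by SN.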
So, by N (as in II.5.4), for correct A and B, A Q B is decidable.
In β(η)-AUT-QE all 1-expressions are of the form [x⃗ : α⃗]τ. We have
A ⊆ τ ⟺ ⊢¹ A, and (Sec. 3.2.3.1)
⊢¹ A ⊆ [x : β]B₁ ⟺ A ≡ [x : α]A₁, α Q β and x E α ⊢ A₁ ⊆ B₁.
So, for correct 1-expressions A and B, A ⊆ B is decidable (use induction on the length of B). Since on non-1-expressions ⊆ is just Q, this is true for A and B of other degrees as well.
Let ⊢ stand for correctness in β(η)-AUT-QE, ⊢⁺ for some larger system, like βη-AUT-QE+ or βη-AUT-QE*, and let ⁻ denote the β¹δ-normal form. By UE (Secs. 3.3.2, 3.3.3) we have
⊢⁺ A ⊆ B ⟺ ⊢⁺ A, ⊢⁺ B, ⊢ A⁻ ⊆ B⁻.
So, in the larger systems, too, A ⊆ B is decidable, for correct A and B.
4.1.4. Deciding E-formulas
In principle, E-formulas A E B, for correct A and B, are going to be decided by the equivalence
A E B ⟺ typ(A) ⊆ B
which reduces the E-formula to a ⊆-formula. However, there is some trouble with typ. First, typ can lead us out of the correct expressions of the language we consider. There are two ways to solve this problem: first, one can introduce for each language a specified modified type-function cantyp (for: canonical type) which does not suffer from this defect. Then we get what we want (as in 3.2.4 for AUT-QE):
A E B ⟺ ⊢ A, ⊢ B, cantyp(A) ⊆ B.
Alternatively, one can use the fact that the new, possibly incorrect expressions created by typ in general are correct in some larger system (e.g. the corresponding +-system). Then one can decide the E-formula in the larger system:
A E B ⟺ ⊢ A, ⊢ B, ⊢⁺ typ(A) ⊆ B
where ⊢⁺ stands for correctness in the larger system. If we make sure that ⊢⁺ cantyp(A) Q typ(A) then, by conservativity, the two approaches are clearly equivalent.
A second difficulty with typ occurs exclusively in AUT-QE' and AUT-QE*. These languages have the rule: ⊢² B, ⊢ C E D, B ≥δ² C ⟹ ⊢ B E D, and for the new category D of B the property typ(B) ⊆ D (even if typ(B) is correct) is not necessarily true anymore. This problem can be solved by taking a type-function which first eliminates all the δ²-constants. For a δ²-constant d we then have cantyp(d(A⃗)) ≡ cantyp(δ²-nf(d(A⃗))).
4.1.5. Deciding correctness of expressions
All correct expressions relative to a correct B and a correct ξ have to be B;ξ-expressions, i.e. the constants have to be in B and the free variables have to be in ξ. The verification of compound expressions can roughly be described as: verify the subexpressions, plus their possible type- and degree-restrictions. E.g. for abstr-expressions use the equivalence
⊢ [x : α]A ⟺ ⊢ α, α of domain degree, x E α ⊢ A, A of value degree.
For the subexpressions in c(B⃗) there are type-restrictions prescribed in the scheme of c, viz. if the context of the scheme is y⃗ E β⃗ then
⊢ c(B⃗) ⟹ B⃗ E β⃗[B⃗] (i.e. B₁ E β₁, B₂ E β₂[B₁] etc.).
To verify the right-hand side, first verify ⊢ B₁. Since ⊢ β₁ (it occurs in B), we can decide B₁ E β₁ as indicated above. Then check ⊢ B₂. Since B₁ E β₁ and y₁ E β₁ ⊢ β₂, we know ⊢ β₂[B₁], so we can tackle the next E-formula, etc.
4.1.6. Verification of application expressions
Now we discuss the type-restriction implied in the correctness of (A)B. We restrict ourselves to AUT-68 and AUT-QE here. Define α to be a domain of B if
(i) B E [x : α]C for some C, or
(ii) B E C E [x : α]D for some C, D.
Then, in view of the formation rules for appl-expressions, we have the equivalence:
⊢ (A)B ⟺ ⊢ A, ⊢ B, B has a domain α, A E α.
The arbitrariness w.r.t. the domain can be somewhat reduced by another property, the uniqueness of domains, viz. if α₁ and α₂ are domains of B then α₁ Q α₂ (which will be proved below, 4.2.4.2). This allows us to modify the equivalence:
⊢ (A)B ⟺ ⊢ A, ⊢ B, B has a domain, and ∀α (B has a domain α ⟹ A E α),
i.e. we need just one domain to check the type-restriction. If one fixes a particular procedure for the computation of some domain of an expression, one can define a domain function dom (specific for each language). E.g. for AUT-68 one might inductively define
δ²-nf(cantyp(B)) ≡ [x : α]C ⟹ dom(B) := α.
Now define an extended reduction relation → [compare τ-reduction in Chapter VII], as follows:
(i) A ≥ B ⟹ A → B.
(ii) A → typ(A).
(iii) → is transitive.
Then, an alternative way to compute a domain of an expression B is to perform a more or less specified search through the →-reduction tree of B until one possibly encounters an abstraction expression, say [x : α]C; if so, this α is some domain of B. Certain restrictions (specific for each language) have to be imposed upon the search in order to guarantee that not too many expressions get a domain in this way.
Just like property N (at least δ²-N) is crucial in the definition of dom above, the well-foundedness (i.e. property SN) of → is needed for the termination of the second procedure. This will indeed be proved below (4.4.11). As a whole, the situation with the two possible ways of finding a domain can very well be compared with the two ways of deciding a Q-formula: either one can compare normal forms (use N), or one can search for a common reduct in the respective reduction trees (use SN).
4.2. The algorithmic definition
4.2.1. Now we give, guided by the considerations in the preceding sections, the algorithmic definition of correctness. Apart from the compatibility condition of def and typ (see below), the book-and-context part of the definition is as usual [see e.g. [van Daalen 73 (A.3)]] and will be omitted. So we just define the correctness of expressions and formulas (new notations ⊢ₐ, Eₐ, Qₐ and ⊆ₐ, with the subscript a for "algorithmic") in terms of reduction, dom and cantyp (Sec. 4.2.2). Later we discuss the choice of cantyp and dom for the various regular languages (4.2.3, 4.2.4).
4.2.2.1. Let B;ξ ⊢ₐ. The conventions for omitting B and ξ in B;ξ ⊢ₐ A are as in V.2.1. Degrees are indicated as superscripts and defined as usual. The compatibility condition reads: def(d) Eₐ typ(d).
4.2.2.2. Formula part of the definition
Let A and B be B;ξ-expressions (so not necessarily correct). We define:
(i) A Qₐ B :⟺ A ↓ B.
(ii) A ⊆ₐ B, if degree(B) = 1 :⟺ β¹δ¹-nf(A) ≡ [x⃗ : α⃗][y⃗ : γ⃗]τ, β¹δ¹-nf(B) ≡ [x⃗ : β⃗]τ, α⃗ Qₐ β⃗ (with the straightforward extension of Qₐ to strings).
(iii) A ⊆ₐ B, if degree(B) ≠ 1 :⟺ A Qₐ B.
(iv) A Eₐ B :⟺ cantyp(A) ⊆ₐ B, with a straightforward extension to strings A⃗ Eₐ B⃗.
4.2.2.3. Expression part of the definition
(i) ⊢ₐ τ.
(ii) ⊢ₐ x :⟺ x occurs in ξ.
(iii) ⊢ₐ c(B₁, ..., Bₘ) :⟺ ⊢ₐ B₁, ..., ⊢ₐ Bₘ, c occurs in B and, if the scheme of c has context y⃗ E β⃗, then B⃗ Eₐ β⃗[B⃗].
(iv) ⊢ₐ [x : α]A :⟺ ξ ⊢ₐ α, α of domain degree, and ξ, x E α ⊢ₐ A, A of value degree.
(v) ⊢ₐ (A)B :⟺ ⊢ₐ A, ⊢ₐ B, B has function degree, A Eₐ dom(B).
4.2.3. The choice of cantyp
4.2.3.1. For our purposes (see 4.1.4) we require that, for correct A, cantyp(A) is correct as well, is a category of A, i.e. A E cantyp(A), and is minimal with respect to ⊆: A E B ⟹ cantyp(A) ⊆ B.
This still leaves us a lot of freedom for our choice of cantyp: e.g., as long as different definitions of cantyp yield definitionally equal results, they are equally good to us. In some languages typ itself meets the requirements mentioned above, viz. βη-AUT-QE+ and Nederpelt's Λ. In most languages, however, typ causes some problems, e.g. there are correct expressions with incorrect typ; then we choose cantyp to be some suitable modification of typ. Below we give a survey of the difficulties with typ, and how these can be solved by cantyp.
4.2.3.2. We start with the languages where the trouble with typ is due to mere degree restrictions.
(1) βη-AUT-68: If ⊢² [x : α]B then its typ is not correct in AUT-68, but is a typical AUT-QE-expression. Then cantyp of this expression has to be τ. Further, typ((A)B), where degree(B) = 3, is incorrect in AUT-68 but correct in AUT-68+ (so, see 3.3.12.2, in AUT-QE). In cantyp((A)B) we have to remove the applicator (A), so we can define cantyp((A)B) ≡ C[A], where cantyp(B) ≡ [x : α]C. This is the same idea as in 3.2.4, but now for B of degree 3.
(2) βη-AUT-QE and βη-AUT-68+: Application of typ to (A)B of degree 2 yields AUT-QE+ expressions. For AUT-68+ the cantyp of these expressions has to be τ. For AUT-QE we remove (A) from cantyp, by β-reduction as in 3.2.4 (and in (1)).
4.2.3.3. Now we add definitional constants. This gives rise to the interference of δ²-constants and type-inclusion, discussed before in 3.3.10-3.3.12.
(3) βηδ-AUT-68: Consider the example of 3.3.10, which is also correct in AUT-68. There occurs an (A)B of degree 3 such that typ((A)B) does not belong to AUT-68 (of course not, as in (1)), does not even belong to AUT-QE and AUT-QE+, but does belong to AUT-68' (3.3.12.1) and AUT-QE' (3.3.11). Again, we must remove the applicator in cantyp, but we cannot be certain anymore that cantyp(B) is an abstr-expression. Therefore we define cantyp((A)B) ≡ C[A], where δ²-nf(cantyp(B)) ≡ [x : α]C.
(4) βηδ-AUT-QE(+): The same expression typ((A)B) of (3) is again incorrect here. Now the applicator is allowed in cantyp, but we need the δ²-reduction in order to remove the effect of the type-inclusion: cantyp((A)B) ≡ (A)(δ²-nf(cantyp(B))).
(5) βηδ-AUT-68+: This language has 2-expressions (A)B (see 3.3.11.2), the typ of which is incorrect in all the languages, and even not normable, e.g. (A)τ. The cantyp of such (A)B must be τ.
(6) βηδ-AUT-QE' and βηδ-AUT-QE*: Here we have the same (A)B of degree 2 as in AUT-68+. Besides, the typ of a degree-2 definitional const-expression (even if typ is correct) need not be a minimal category anymore. Therefore we define cantyp(d(A⃗)) :≡ cantyp(δ²-nf(d(A⃗))). Then for the cantyp of (A)B of degree 2 we can simply take (A)cantyp(B) in AUT-QE', whereas in AUT-QE* we must take C[A], where δ²-nf(cantyp(B)) ≡ [x : α]C.
4.2.3.4. Resuming: we have three types of difficulties, viz.
(i) In AUT-68(+) the only 1-expression is τ, so the typ of 2-expressions can be incorrect. Remedy: define cantyp to be τ.
(ii) In non-+-languages (AUT-68, AUT-QE and AUT-QE') the typ of (A)B of minimal function degree (say: i) is incorrect. Remedy: create an abstr-expression by taking the (βδ)^(i-1)-normal form of cantyp(B) and remove (A) by another β^(i-1)-reduction.
(iii) In languages with δ²-constants and type-inclusion, typ produces incorrect appl-2-expressions (AUT-QE(+)) or appl-1-expressions (AUT-QE' and AUT-QE*). Besides, in AUT-QE' and AUT-QE* the typ of a δ²-const-expression is not necessarily a minimal category. Remedy: remove the δ²-constants after (AUT-QE(+)) or before (AUT-QE' and AUT-QE*) taking cantyp.
4.2.3.5. In view of the arbitrariness of cantyp (4.2.3.1) we need only three different definitions of cantyp: one for the AUT-68 family, one for the restricted AUT-QE languages and AUT-QE+, and one for the liberal AUT-QE branch (AUT-QE' and AUT-QE*). Since the above list of difficulties is exhaustive, for the rest (e.g. for variables and const-expressions) the definitions of cantyp differ only as regards the following clauses:
(1) for AUT-68 and AUT-68+:
(i) degree(B) = 2 ⟹ cantyp(B) := τ.
(ii) degree(B) = 3, β²δ²-nf(cantyp(B)) ≡ [x : α]C ⟹ cantyp((A)B) := C[A].
(2) for AUT-QE and AUT-QE+:
(i) degree(B) = 2, β¹δ¹-nf(cantyp(B)) ≡ [x : α]C ⟹ cantyp((A)B) := C[A].
(ii) degree(B) = 3 ⟹ cantyp((A)B) := (A)(δ²-nf(cantyp(B))).
(3) for AUT-QE' and AUT-QE*:
(i) degree(d) = 2 ⟹ cantyp(d(A⃗)) := cantyp(δ²-nf(d(A⃗))).
(ii) degree(B) = 2, β¹δ¹-nf(cantyp(B)) ≡ [x : α]C ⟹ cantyp((A)B) := C[A].
4.2.3.6. That the proposed definitions of cantyp actually satisfy the requirements of 4.2.3.1 can be proved directly for the E-system, using the results (CLPT, LQ, UE etc.) from Section 3, but will become clear as well in the course of the equivalence proof, below.
4.2.4. The choice of dom
4.2.4.1. We start with a recapitulation of the appl-rules for the various languages. First, the appl-rules of AUT-68 ((1) A E α, B E [x : α]C ⟹ ⊢ (A)B) and of AUT-QE ((2) A E α, B E C E [x : α]D ⟹ ⊢ (A)B) are simply valid in all the languages (though rule (2) is vacuously so in AUT-68(+)). Then there is, additionally, rule 3ⁱ): A E α, ⊢ⁱ B Q [x : α]C ⟹ ⊢ (A)B; this rule with i = minimal value degree is necessary for defining the +-languages AUT-68+ (i = 2), AUT-QE+ and AUT-QE* (i = 1). For languages satisfying LQⁱ, where i is not the minimal value degree, rule 3ⁱ) is a derived rule: indeed, for such i we have [x : α]C E [x : α]D, so by LQⁱ, B E [x : α]D. Hence rule 3³) is anyhow valid; rule 3²) is valid in the AUT-QE languages without δ²-constants, and further in AUT-68+, AUT-QE' and AUT-QE*; and rule 3¹) is valid in AUT-68(+) (vacuously), AUT-QE+ and AUT-QE*. Alternatively formulated: rule 3ⁱ) is always valid, except for rule 3²) in AUT-68 and AUT-QE(+) with δ²-constants, and rule 3¹) in AUT-QE and AUT-QE'.
4.2.4.2. So, for certain languages we must extend the definition of domain from 4.1.6 with the clause: (iii) B Q [x : α]C ⟹ α is a domain of B. The set of domains of an expression is clearly closed under Q:
α₁ a domain of B, α₁ Q α₂ ⟹ α₂ a domain of B.
The converse of this is the announced uniqueness property, which we prove here for the enlarged notion of domain:
α₁ and α₂ both domains of B ⟹ α₁ Q α₂.
Proof. From 3.2.3.2, 3.2.4.3, 3.2.5.7 we recall the properties of βη-AUT-QE:
⊢² [x : α₁]C E/Q [x : α₂]D ⟹ α₁ Q α₂ (this includes UD²).
Now let ⊢³ [x : α₁]C E [x : α₂]D. Then also ⊢³ [x : α₁]C E [x : α₁]F. By UT² we get [x : α₂]D Q [x : α₁]F and by UD²: α₁ Q α₂. So we have EUD³ as well. Further, let ⊢³ [x : α₁]C Q [x : α₂]D. Then also ⊢³ [x : α₁]C E [x : α₁]F and by LQ³ [x : α₂]D E [x : α₁]F. So by EUD³: α₁ Q α₂. This amounts to UD³.
These results can all be extended to the extensions of βη-AUT-QE by translation (e.g. β¹δ-reduction) into βη-AUT-QE, as follows: let ⊢⁺ [x : α₁]C E/Q [x : α₂]D, where ⊢⁺ stands for correctness in the larger system. By UE, ⊢ [x : α₁⁻]C⁻ E/Q [x : α₂⁻]D⁻, correct in βη-AUT-QE, so by one of our (E)UD results: α₁ Q α₁⁻ Q α₂⁻ Q α₂. Of course, in AUT-68(+) these (E)UD results are also valid.
Now we treat the various possibilities for α₁ and α₂ to be a domain of B.
(1) [x : α₁]C Q B Q [x : α₂]D. Use UD.
(2) [x : α₁]C Q B E [x : α₂]D. If necessary, translate (e.g. by δ²-reduction) into a language satisfying LQ: [x : α₁⁻]C⁻ Q B⁻ E [x : α₂⁻]D⁻. Then by LQ we get [x : α₁⁻]C⁻ E [x : α₂⁻]D⁻, and we can use EUD.
(3) [x : α₁]C Q B E D E [x : α₂]F. Use LQ: [x : α₁]C E D E [x : α₂]F. But also [x : α₁]C E [x : α₁]G, and by UT³: [x : α₁]G Q D, so we arrive in case (2) again.
(4) B E [x : α₁]C, B E [x : α₂]D. Then [x : α₁]C Q [x : α₂]D, so α₁ Q α₂.
(5) B E [x : α₁]C, B E D E [x : α₂]F. By UT³: [x : α₁]C Q D, so we are again in case (2).
(6) B E C E [x : α₁]D, B E F E [x : α₂]G. By UT³ we get C Q F. Translate into a language satisfying LQ. This gives C⁻ Q F⁻ E [x : α₂⁻]G⁻ and by LQ C⁻ E [x : α₂⁻]G⁻. It also gives C⁻ E [x : α₁⁻]D⁻, and case (4) applies. □
4.2.4.3. It would be nice if the notion of domain of an expression were preserved under Q: B Q C, α a domain of B ⟹ α a domain of C. This is indeed true for languages satisfying LQ, but not for the others, viz. βηδ-AUT-QE and βηδ-AUT-QE+. By CLPT, there holds
B ≥ C, α a domain of B ⟹ α a domain of C,
i.e. the notion of domain is preserved under ≥. But the converse direction (C ≥ B, in particular with δ²-reduction) fails in βηδ-AUT-QE(+). For all the languages we have
B Q C, α a domain of B ⟹ α a domain of C⁻,
where C⁻ is the δ²-normal form of C.
Proof. By the translation ⁻ we arrive in a language satisfying LQ, so from B⁻ Q C⁻, α a domain of B⁻, we get the desired result. □
As a corollary of this, we get:
B Q C, α a domain of B, C has a domain ⟹ α a domain of C.
4.2.4.4. In view of the above remarks we still have a lot of freedom in defining
a domain function dom which picks some expression from the set of domains. Dom is going to be defined in terms of cantyp and, just like cantyp, in terms of δ²-reduction and (βδ)ⁱ-reduction, where i is the minimal value degree. I.e. by application of cantyp and these reductions we arrive at an expression which we call the domain normal form, dnf. If the dnf is an abstr-expression then we read off the domain dom from it:
dnf(B) ≡ [x : α]C ⟹ dom(B) := α.
Otherwise, dom is simply not defined. The rules for computing dnf are, for the non-+-languages:
(1) AUT-68: dnf(B) := β²δ²-nf(cantyp(B)).
(2) AUT-QE('):
(i) degree(B) = 3 ⟹ dnf(B) := β¹δ¹-nf(cantyp(δ²-nf(cantyp(B)))),
(ii) degree(B) = 2 ⟹ dnf(B) := β¹δ¹-nf(cantyp(B)).
The β² of AUT-68 and the β¹ of AUT-QE(') were only added in order to cover the corresponding +-languages too. Now we can deal with the +-languages by simply adding a rule for B of minimal value degree:
degree(B) = i, i the minimal value degree ⟹ dnf(B) := (βδ)ⁱ-nf(B).
This rule gives us AUT-68+ from AUT-68, AUT-QE+ from AUT-QE and AUT-QE* from AUT-QE'.
4.2.4.5. That dom(B), as defined above, gives us a domain if B has one, and gives us nothing otherwise, can be proved directly, but will also become clear in the course of the equivalence proof.
4.3. The equivalence proof
4.3.1. As announced before, the equivalence of the algorithmic definition with the E-definition will also prove the superfluity of the strengthening rule. To this end we use, along with the algorithmic definition system 111, two distinct versions of the E-definition, system I and system 11. Here, system I is the system of Sec. 2: it has the strengthening rule and it has Q-rule V.2. System 11, however, lacks the strengthening rule and has Q-rule V.2’ instead.
D.T. van Daalen
564
By CL for system I, we have: str., V.2 e (str.,V.2') 3 V.2', so system I1 is clearly included in system I. Below we denote correctness in I, I1 and I11 respectively by I-, I-0 and F a ; hence the inclusion of I1 in I becomes: hl I-. Now the equivalence of the three systems is shown by additionally proving I-, =+ i-0 (Sec. 4.3.2) and I- =%I-, (Sec. 4.3.3).
4.3.2. The I-, + I-0-part 4.3.2.1. We first formulate the theorem, which we want to prove. Theorem. If B Fa resp. B;( I-, resp. B;( I-: A resp. B;( I-:,' A then B i-0 resp. B;( I-h A resp. B;( I-;' A E cantyp(A). So the theorem implies that cantyp is well-defined on the non-1-expressions of the algorithmic definition. The proof of the theorem is by induction on I-, and depends of course on dom and cantyp, i.e. on the language we consider. However, large parts of the proof can be done for all or some of the languages together. 4.3.2.2. Some properties (1) I-oA, FOB,A Q , B +-I-oA Q B. Proof. This is simply rule V.2'. (2) I-OA
* I-oP'S1-nf(A)
0
A.
Proof. By the simultaneous subst. theorem 6-CLPT holds. Further SA' can be proved as in 3.3.6.1-3.3.8.2, or holds vacuously so P'-CL. By PS-CR and 0 06-N the P'Sl-nf is well-defined. (3) Let I-oA, FOB,A C~ B. Then I-oA c B.
Proof. For A of degree 1, by (2) i-0 A q PIG1-nf(A) = [Z : Z]Al C [i? : ~517-1'.I : PIT P'Sl-nf ( B ) Q B so I-0 A c B. If degree(B) # 1 this is (1) again. 0 4
(4) I-oA E cantyp(A), cantyp(A) Ca B , FOB
* I-oA E B.
Proof. Apply (3).
0
(5) The I-0-system satisfies CR.
Proof.
I-0
+ t-
and we assumed CR for I-.
(6) Strengtheningfor 9: E I-0 A q B , 51 sub E , (1 I-o A , El I-o B Proof. By ind. on 9 we get A 1 B so (1 I-0 A 9 B.
0 (1
I-o A 9 B. 0
4.3.2.3. Proof of the theorem, part 1
We only need to give the induction step for those clauses in the definition of ⊢ₐ which differ from the corresponding clauses in the definition of ⊢₀. We start with the easy cases.
(1) The compatibility condition. Let * d(x⃗) := A * d(x⃗) E B be a correct scheme according to the algorithmic definition, i.e. ξ ⊢ₐ A, ξ ⊢ₐ B and A Eₐ B. By the ind. hyp. ξ ⊢₀ A E cantyp(A), ξ ⊢₀ B, so by (4) above ξ ⊢₀ A E B, q.e.d.
(2) Expressions (easy cases).
(i) τ: trivial.
(ii) variables: let ξ ⊢ₐ; then by the ind. hyp. ξ ⊢₀, so for x in ξ, x E typ(x) ≡ cantyp(x).
(iii) const-expressions, except δ²-const-expressions in AUT-QE' and AUT-QE*: let the scheme of c be in B with context y⃗ E β⃗, let ⊢ₐ B₁, ..., ⊢ₐ Bₘ and B⃗ Eₐ β⃗[B⃗]. By the ind. hyp. ⊢₀ B₁ E cantyp(B₁), ⊢₀ B₂ E cantyp(B₂) etc. Further y⃗ E β⃗ ⊢ₐ, so y⃗ E β⃗ ⊢₀, so ⊢₀ β₁, y₁ E β₁ ⊢₀ β₂ etc. So ⊢₀ B₁ E β₁ and by the subst. theorem ⊢₀ β₂[B₁], so ⊢₀ B₂ E β₂[B₁] etc., up to ⊢₀ Bₘ E βₘ[B⃗]. The conclusion is ⊢₀ c(B⃗) (E typ(c)[B⃗] ≡ cantyp(c(B⃗))).
(iv) abstr-expressions: let ξ ⊢ₐ α, α of domain degree, and ξ, x E α ⊢ₐ A, A of value degree. By the ind. hyp. ξ ⊢₀ α and ξ, x E α ⊢₀ A (E cantyp(A)). For A of degree 2 in AUT-68(+) this is ξ, x E α ⊢₀ A E τ, which yields ξ ⊢₀ [x : α]A E τ ≡ cantyp([x : α]A). Otherwise, we get ξ ⊢₀ [x : α]A (E [x : α]cantyp(A) ≡ cantyp([x : α]A)).
4.3.2.4. Some more properties
Before discussing the remaining clauses we prove some more properties of ⊢₀. First something about ⊆. Of course, the β¹δ¹-nf's of 1-expressions are of the form [x⃗ : α⃗]τ. As in 3.3.6-3.3.8 (leave η¹ out of consideration, restrict the appl-1-rule) we can prove, even without using CR:
⊢₀¹ A Q B ⟹ β¹δ¹-nf(A) ≡ [x⃗ : α⃗]τ, β¹δ¹-nf(B) ≡ [x⃗ : β⃗]τ, ⊢₀ α⃗ Q β⃗,
and, by induction on ⊆:
⊢₀¹ A ⊆ B ⟹ β¹δ¹-nf(A) ≡ [x⃗ : α⃗][y⃗ : γ⃗]τ, β¹δ¹-nf(B) ≡ [x⃗ : β⃗]τ, ⊢₀ α⃗ Q β⃗,
so we get:
⊢₀¹ A ⊆ [x : β]B₁ ⟹ β¹δ¹-nf(A) ≡ [x : α]A₁, ⊢₀ α Q β, x E α ⊢ A₁ ⊆ B₁.
Now we prove a lemma.
Lemma. ⊢₀ A (E B) ⟹ ⊢₀ A E cantyp(A) (⊆ B).
Proof. E.g. in AUT-68(+) there is nothing to prove. Anyhow, the cases A ≡ τ, A a variable or A an easy const-expression (i.e. not a δ²-const-expression in AUT-QE' or AUT-QE*) are immediate. For the rest we proceed by induction on
(1) the length of the δ²-reduction tree of A,
(2) the length of A.
Abstraction expressions are easy. If A is a δ²-const-expression in AUT-QE' or AUT-QE*, by δ-CLPT and the first ind. hyp. ⊢₀ δ²-nf(A) E cantyp(δ²-nf(A)) ≡ cantyp(A) (⊆ B). Then by the extra type-modification rule of these languages we get ⊢₀ A E cantyp(A) (⊆ B), q.e.d.
Now let A ≡ (A₁)A₂. We have ⊢₀ A₁ E α, ⊢₀ A₂ E cantyp(A₂) ⊆ [x : α]C. So β¹δ¹-nf(cantyp(A₂)) ≡ [x : α₁]C₁ with α₁ Q α, x E α₁ ⊢ C₁ ⊆ C. We want ⊢₀ A E cantyp(A) ≡ C₁[A₁] (⊆ B). If the formula A E B in the assumption comes directly from C[A₁] ⊆ B, we get C₁[A₁] ⊆ C[A₁] ⊆ B, q.e.d. Otherwise A ≥δ² D, ⊢₀ D E B (i.e. the extra rule of AUT-QE' and AUT-QE* has been used). Then D ≡ (D₁)D₂ with A₁ ≥δ² D₁, A₂ ≥δ² D₂, so ⊢₀ D₁ E α, and ⊢₀ D₂ E cantyp(D₂) Q β¹δ¹-nf(cantyp(D₂)) ≡ [x : α₂]C₂ ⊆ [x : α₁]C₁ (apply one of the ind. hypotheses to D₂), and by the first ind. hyp. ⊢₀ D E cantyp(D) ≡ C₂[D₁] Q C₂[A₁] ⊆ C₁[A₁]. So, by the type-modification rule, ⊢₀ A E C₁[A₁], q.e.d. □
4.3.2.5. Proof of the theorem, part 2
Now we prove the induction step for the two remaining cases.
(1) δ²-const-expressions in AUT-QE' or AUT-QE*. As in 4.3.2.3 (iii) we can get ⊢₀ d(B⃗) from ⊢ₐ d(B⃗). Then by the lemma ⊢₀ d(B⃗) E cantyp(d(B⃗)).
(2) Appl-expressions. Let ⊢ₐ A, ⊢ₐ B, B of function degree, A Eₐ dom(B). By the ind. hyp. ⊢₀ A E cantyp(A) ↓ dom(B), ⊢₀ B (E cantyp(B)). For the computation of cantyp and dom in the various languages see 4.2.3.5 and 4.2.4.4 respectively.
(i) AUT-68(+), ⊢₀³ B: β²δ²-nf(cantyp(B)) ≡ [x : α]C, dom(B) ≡ α. By δ-CLPT ⊢₀ B E [x : α]C and ⊢₀ α, so ⊢₀ A E α and ⊢₀ (A)B E C[A] ≡ cantyp((A)B).
(ii) AUT-68+, ⊢₀² B: β²δ²-nf(B) ≡ [x : α]C. We have SA² (see e.g. 3.4.5), so β²-CL, so ⊢₀ B Q [x : α]C and ⊢₀ (A)B E τ ≡ cantyp((A)B).
(iii) AUT-QE(+), ⊢₀³ B: β¹δ¹-nf(cantyp(δ²-nf(cantyp(B)))) ≡ [x : α]C, dom(B) ≡ α. By δ-CL and the lemma in 4.3.2.4, ⊢₀ B E δ²-nf(cantyp(B)) E [x : α]C, so ⊢₀ (A)B E (A)(δ²-nf(cantyp(B))) ≡ cantyp((A)B).
(iv) AUT-QE' and AUT-QE*, ⊢₀³ B: As (iii), but from ⊢₀ δ²-nf(cantyp(B)) E [x : α]C we infer now ⊢₀ cantyp(B) E [x : α]C, so ⊢₀ (A)B E (A)cantyp(B) ≡ cantyp((A)B).
(v) AUT-QE, ⊢₀² B: Like (i), but decrease the degrees by 1.
(vi) AUT-QE+ and AUT-QE*, ⊢₀¹ B: Like (ii), but decrease the degrees by 1.
This finishes the proof of the theorem in 4.3.2.1. □
4.3.3. The ⊢ ⟹ ⊢ₐ part
4.3.3.1. We formulate our theorem.
Theorem. If B ⊢ resp. B;ξ ⊢ resp. B;ξ ⊢ A, then B ⊢ₐ resp. B;ξ ⊢ₐ resp. B;ξ ⊢ₐ A. Further, if B;ξ ⊢ A E B then A Eₐ B.
The proof will be by induction on ⊢. We just discuss AUT-QE, because with AUT-68 everything is completely similar or somewhat easier.
4.3.3.2. First, we need some properties:
(1) Strengthening holds in the ⊢ₐ-system.
Proof. Notice that the definition of cantyp only refers to the relevant parts of the context, i.e. to assumptions concerning actually occurring free variables, and that the other notions in the definition of correctness do not refer to the context at all. Hence, strengthening can be proved by a simple induction on ⊢ₐ. □
(2) On PCT² (preservation of cantyp): In 3.2.5 we proved βη-outside-PCT² for βη-AUT-QE. However, δ-outside-PCT² is wrong, so for AUT-QE(+) with δ²-constants we can only get restricted PCT²:
If ⊢² A, A ≥ B not using δ²-reduction, then cantyp(A) Q cantyp(B).
In order to prove this, start with ⊢² A E α ⟹ ⊢ A E cantyp(A) ⊆ α (e.g. as in 4.3.2.4). Then, as in 3.2.5, one can prove: ⊢² A, A ≥ B not by δ²-reduction ⟹ cantyp(A) Q cantyp(B). Restricted PCT² gives us restricted LQ² for AUT-QE(+):
If ⊢² A, B E C, A Q B without using δ²-reduction, then A E C.
(3) However, in AUT-QE' and AUT-QE*, full PCT² is still valid and hence LQ² holds (this was already implicitly claimed in 3.3.11.4).
Proof. In AUT-QE′ and AUT-QE* we have δ2-nf(cantyp(A)) ≡ cantyp(δ2-nf(A)). So, let A ≥ B. Then δ2-nf(A) ≥ δ2-nf(B) without using δ2-reduction, so by restricted PCT2 we have cantyp(δ2-nf(A)) Q cantyp(δ2-nf(B)). □
D.T. van Daalen    568
(4) By CR we have: ⊢ A Q B ⇒ A Qₐ B. As in 4.3.2.4 we have, for ⊢₁ A, B: β1δ1-nf(A) ≡ [x⃗ : α⃗][y⃗ : γ⃗]τ, β1δ1-nf(B) ≡ [x⃗ : α⃗′]τ with α⃗ Q α⃗′. So ⊢ B ⊂ C ⇒ B ⊂ₐ C.
4.3.3.3. Proof of the theorem
Note that the ⊢ A E B ⇒ A Eₐ B part of the theorem, for A of degree 2, follows from ⊢₂ A E B ⇒ ⊢ A E cantyp(A) ⊂ B (by 4.3.3.2 (2) and 4.3.3.2 (4)). The proof is by induction on ⊢. We first discuss some of the clauses for the formation of expressions:
(i)
abstr-expressions: let ⊢₂ α, x E α ⊢ A₁ (E B₁). By the ind. hyp. ⊢ₐ α, x E α ⊢ₐ A₁ (A₁ Eₐ B₁, i.e. cantyp(A₁) ⊂ₐ B₁), so ⊢ₐ [x : α]A₁ (cantyp([x : α]A₁) ≡ [x : α]cantyp(A₁) ⊂ [x : α]B₁, so [x : α]A₁ Eₐ [x : α]B₁), q.e.d.
(ii) const-expressions: let y⃗ E β⃗ be the context of the scheme of c, ⊢ B⃗ E β⃗[B⃗]. By the ind. hyp. ⊢ₐ B⃗, B⃗ Eₐ β⃗[B⃗], so ⊢ₐ c(B⃗). If c is not a δ2-constant in AUT-QE′ or AUT-QE*, then cantyp(c(B⃗)) ≡ typ(c)[B⃗], so certainly cantyp(c(B⃗)) ⊂ₐ typ(c)[B⃗], q.e.d. Otherwise use the remark above.
(iii) 2-appl-expressions: let ⊢₃ A E α, ⊢ B E [x : α]C. By the ind. hyp. ⊢ₐ A, ⊢ₐ B, cantyp(A) ↓ α, cantyp(B) ↓ [x : α]C. So β1δ2-nf(cantyp(B)) ≡ [x : α′]C′, dom(B) ≡ α′ ↓ α. By CR cantyp(A) ↓ dom(B), so ⊢ₐ (A)B. Further, by the remark above, (A)B Eₐ C[A], q.e.d.
(iv) 3-appl-expressions: let ⊢₃ A E α, ⊢ B E C E [x : α]D. By the ind. hyp. ⊢ₐ A, cantyp(A) ↓ α, ⊢ₐ B, cantyp(B) ↓ C. By δ2-CLPT, ⊢ δ2-nf(C) E [x : α]D. By the ⊢ₐ ⇒ ⊢₀ part, ⊢₀ B E cantyp(B), so ⊢ B E cantyp(B), so ⊢ cantyp(B), so ⊢ δ2-nf(cantyp(B)). Further, δ2-nf(cantyp(B)) ↓ δ2-nf(C) without using δ2-reduction, so by restricted LQ, ⊢ δ2-nf(cantyp(B)) E [x : α]D and cantyp(δ2-nf(cantyp(B))) ⊂ₐ [x : α]D. I.e. β1δ1-nf(cantyp(δ2-nf(cantyp(B)))) ≡ [x : α′]D′, α ↓ α′ ≡ dom(B). Hence ⊢ₐ (A)B. Further, (A)cantyp(B) ↓ (A)C and (A)(δ2-nf(cantyp(B))) ↓ (A)C, so anyhow cantyp((A)B) ↓ (A)C, q.e.d.
Finally we discuss the type modification rules and the strengthening rule.
(v) Type modification: let ⊢ A E B, B ⊂ C. By the ind. hyp. ⊢ₐ A, A Eₐ B, i.e. cantyp(A) ⊂ₐ B, and by 4.3.3.2 (4) B ⊂ₐ C. Use CR to get A Eₐ C, q.e.d.
(vi) Strengthening: use 4.3.3.2 (1).
This finishes the proof of the theorem ⊢ ⇒ ⊢ₐ, and the proof of the equivalence of the three systems ⊢, ⊢₀, ⊢ₐ. So we do not distinguish between ⊢, ⊢₀ and ⊢ₐ any more, and have
⊢ A (E α) ⇒ ⊢ A (E cantyp(A) ⊂ α)
and
⊢ (A)B ⇒ cantyp(A) ↓ dom(B). □
4.4. The actual verification
4.4.1. Before discussing the actual verification we make some concluding remarks on the formal decidability of the Automath languages. First, on the well-definedness of the decision algorithm suggested by the definition of ⊢ₐ in Sec. 4.2, in particular the well-definedness of cantyp and dom. Cantyp and dom are partial functions, so by well-definedness we understand:
(1) It is decidable whether an expression has a cantyp (or a dom).
(2) If it has one, it is effectively computable.
All this is already implicitly included in the equivalence proof. E.g. the ⊢ₐ ⇒ ⊢₀ part states that cantyp on the correct non-1-expressions delivers a correct expression again. In the course of the decision process, cantyp and dom are required of correct expressions only. E.g. before settling cantyp(A) ⊂ B (in the verification of A E B) we first check ⊢ A, and before settling A E dom(B) (in the verification of (A)B) we first check ⊢ B. The definitions of cantyp and dom just require computation of degrees, and computation of βiδi-normal forms where i is the minimal value degree. Notice that SN in this case, and in fact for all i < 3, can even be proved without using normability.
4.4.2. Our second remark concerns normability. Below we make sure that the normability result of Sec. IV.4.4, as we claimed already several times, actually covers the regular languages, viz. by proving that the system of Sec. IV.4.4 contains our most liberal language AUT-QE*. Let us abbreviate the system of Sec. IV.4.4 by system IV. [This is a system like AUT-QE*, i.e. with application expressions of degree 1, but extended further such that expressions of all degrees are permitted.]
Theorem. System IV contains AUT-QE*.
Proof. This system avoids Q-formulas as indicated in 2.12. For the rest it is like our system ⊢₀, with type-modification rule V.2′ (Sec. 2.11) and without strengthening, but of course with much weaker degree restrictions. The expression formation rules are the familiar rules of AUT-68 and AUT-QE, except perhaps for the appl-rules, which are most similar to the rules in 3.3.11 for the
first version of AUT-QE′. We only consider the 1-appl-expressions. Let (in AUT-QE*) ⊢ A E α, ⊢₁ B Q [x : α]C. By β1δ-reduction we get B ≥ [x : α′]C′ with α Q α′. The substitution theorem and SA′ (and hence β1δ-CL) are as usual valid in system IV, so using induction on AUT-QE*-correctness we get (in system IV) A E α′, ⊢ B ≥ [x : α′]C′, so ⊢ (A)B, q.e.d. □
4.4.3. From our axiomatic introduction in Sec. II.1.3 the actual nature of expressions does not become very clear, viz. that they are just some well-structured symbol-strings. In view of this fact, a verification process for the correctness of expressions must be able to perform the following task: given a correct book and a correct context (mere symbol-strings as well), each symbol-string must, in a finite amount of time, either be recognized as a correct expression (relative to book and context) or be rejected. The verification of such a string can be analyzed in several stages, e.g.:
(1) the bracket structure has to be correct,
(2) the free variables have to occur in the context and the constants have to occur in the book (after this stage the constants in the string can be assigned an arity, and variables and constants get a degree and possibly a typ and a def),
(3) the arity of each constant has to fit the arity of the argument string going with it (only after this stage can we speak of expressions in the sense of Sec. II.1),
(4) degree restrictions (and possibly norm restrictions) must be satisfied,
(5) the type restrictions have to be fulfilled (i.e. of the argument A in (A)B and of the argument string B⃗ in c(B⃗)).
Here it is just stage (1) which represents the context-free part of the verification. The stages (2)-(4) are literally context-dependent, but still trivially recursive. After passing stage (3) an expression is pretyped. From our point of view, stage (5) is the interesting part of the verification. The actually running verification program for Automath languages at Eindhoven University has indeed been organized along these lines (see [Zandleven 73 (E.1)], [van Benthem Jutting 77]). There is a first pass with a "syntax-checker" covering stages (1) and (2). This pass is optional, since there is a next pass with a "translator" covering stages (1)-(4) (but without checking norm-restrictions). And finally there is the "processor", operating on the result of the translator, which covers stage (5).
4.4.4. First we discuss the verification of definitional equalities A ↓ B. We do not want to compute normal forms, but rather design a strategy which, after a few reduction steps in A or B, either results in a common reduct of A and B (if this exists), or enables one to conclude that it does not exist. When confronted with certain A and B during the decision process, we have to answer the following questions:
(1) Shall we do an outside reduction?
(2) If so, on which of the expressions?
The form (or: shape) of A and B (i.e. whether they are abstr- or appl-expressions etc.) plays a crucial role here. E.g. A and B may both be in immune form, i.e. admit no outside reduction at all. In that case either we can immediately decide our definitional equality (if A and B are of different shape, or if A and B are atomic), or we have to split up (or: decompose) the equality into the equalities of the corresponding subexpressions of A and B. But if A and B have different form, not both immune, then an outside reduction is required. The basic construction aim for a decision strategy is of course to minimize, in most of the cases, the total number of reduction steps required for a conclusion: A is equal to B or not. There is of course uncertainty about what happens "in most of the cases", but the intuitive (and possibly questionable) ideas on this subject, underlying the algorithm in the next sections, can be summarized as follows: generally, the definitional equalities arising in the course of the verification and offered to the decision process are true, and a common reduct can be reached in relatively few steps.
4.4.5.
We define new, restricted relations >ₕ, ≥ₕ (h for head reduction) and >ₕ′, ≥ₕ′, which precisely cover:
(1) outside reduction steps,
(2) the reduction steps needed in order to make new outside steps possible.
The relations are given by a simultaneous inductive definition:
(i) B ≥ₕ′ [x : α]C ⇒ (A)B >ₕ′ C[A].
(ii) d(C⃗) >ₕ′ def(d)[C⃗].
(iii) A ≥ₕ (x)D, D ≥ C, x ∉ FV(C) ⇒ [x : α]A >ₕ C.
(iv) A >ₕ′ B ⇒ A >ₕ B.
(v) ≥ₕ′ (resp. ≥ₕ) is the reflexive and transitive closure of >ₕ′ (resp. >ₕ).
I.e. >ₕ′ and ≥ₕ′ are just η-less versions of >ₕ and ≥ₕ. Clearly A >ₕ B ⇒ A ≥ B, and if A >ₕ B (or A >ₕ′ B) then B is a so-called first
main reduct of A.
Remark: This reduction does correspond to the head reduction common in the literature [Barendregt 84a], i.e. to the "first half" of the so-called normal reduction [Curry and Feys 58]. A reduction A ≥ₕ B consists of mere simple head contractions, i.e. (A₁)...(Aₖ)B > (A₁)...(Aₖ)C where B > C is an elementary βδ-reduction, and even only such of these that their reduct eventually becomes a new simple head redex. The unrestricted reduction D ≥ C in clause (iii) is put there on purpose: it is of course possible that internal contractions are needed in order to remove free variables from an expression. The main property of ≥ₕ (or ≥ₕ′, depending on whether η-reduction is present) is: if A ≥ B then A ≥ₕ C ≥ B, where the reduction from C to B consists solely of internal reductions. So if A ≥ B and A, B have different shapes, then A >ₕ A′ ≥ B.
4.4.6. The intuition formulated in 4.4.4 leads us to the idea that a sensible decision process for definitional equalities must search for a common reduct (i.e. an affirmative answer) rather than normalize by means of ≥ₕ (in order to get a negative answer), and that during the reduction process the definitional constants must be saved, i.e. left intact, as much as possible. The strategy presented below (corresponding to what is actually implemented in Eindhoven [Zandleven 73 (E.1)]) can indeed be characterized by the following principles:
(1) decomposition is preferred above main reduction,
(2) β-reduction is preferred above δ-reduction (which is preferred above η-reduction),
(3) reduction of a "younger" definitional constant is preferred above reduction of the "older" one.
For example, if it is to be decided whether (A)B ↓ (C)D, the process first tries decomposition: B ↓ D and A ↓ C. If this succeeds, i.e. B ≥ F ≤ D and A ≥ G ≤ C, then we have a common reduct (G)F. Only after this has failed is an outside reduction attempted on one of the expressions: e.g. (A)B >ₕ E, i.e. B ≥ [x : α]F, E ≡ F[A], and the new question to be decided is E ↓ (C)D. Was no outside reduction possible, then the other expression is tackled: (C)D >ₕ E is tried, possibly resulting in a new question (A)B ↓ E. And, when confronted with the question (A)B ↓ d(C⃗), the process tries to main reduce the appl-expression rather than the other one.
4.4.7. The inductive definition of
>ₕ and ≥ₕ can be read as a recursive algorithm for deciding questions of the form A >ₕ B, ∃B (A >ₕ B), ∃α ∃B (A ≥ₕ [x : α]B) etc. We give our algorithm for deciding ↓ also in the form of an inductive definition. Here are the rules:
(0) Exchange: B ↓ A ⇒: A ↓ B.
(i) Variable, τ: A ≥ₕ x ⇔: A ↓ x, and A ≥ₕ τ ⇔: A ↓ τ.
(ii) Prim: (A ≥ₕ p(C⃗), C⃗ ↓ B⃗) ⇔: A ↓ p(B⃗).
(iii) Appl-appl, decompose: B ↓ D, A ↓ C ⇒: (A)B ↓ (C)D.
(iv) Appl, β-red: (A)B >ₕ C ⇒ (C ↓ D ⇔: (A)B ↓ D).
(v) Def-def, decompose: B⃗ ↓ C⃗ ⇒: d(B⃗) ↓ d(C⃗).
(vi) Def, δ-red: d(B⃗) >ₕ C ⇒ (C ↓ D ⇔: d(B⃗) ↓ D).
(vii) Abstr-abstr, decompose: α ↓ β, A ↓ B ⇔: [x : α]A ↓ [x : β]B.
(viii) Abstr, η-red: [x : α]A >ₕ B ⇒ (B ↓ C ⇔: [x : α]A ↓ C).
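The priority scheme of these clauses can be sketched as a recursive procedure. The Python below is only a loose illustration on ad-hoc tuple terms, not the implemented verifier: δ-constants, η-reduction and the τ/prim clauses are left out, so only the β-fragment of clauses (i), (iii), (iv) and (vii), together with the exchange idea of clause (0), is modelled; the `fuel` counter plays the role of the step-count warning of 4.4.9 (1), and `subst` is capture-naive (enough for closed toy examples).

```python
def subst(term, x, arg):
    """term[x := arg] on tuples ('var',v) | ('app',A,B) for (A)B | ('abs',x,dom,body)."""
    tag = term[0]
    if tag == 'var':
        return arg if term[1] == x else term
    if tag == 'app':
        return ('app', subst(term[1], x, arg), subst(term[2], x, arg))
    _, y, dom, body = term
    if y == x:                          # x is shadowed: substitute in the domain only
        return ('abs', y, subst(dom, x, arg), body)
    return ('abs', y, subst(dom, x, arg), subst(body, x, arg))

def head_step(term):
    """One beta head step (A)B > C[A], reducing B until an abstraction appears."""
    if term[0] != 'app':
        return None
    arg, f = term[1], term[2]
    while f is not None and f[0] != 'abs':
        f = head_step(f)
    return None if f is None else subst(f[3], f[1], arg)

def equal(a, b, fuel=1000):
    """Decide the question a | b, decomposition before outside reduction."""
    if fuel <= 0:
        raise RuntimeError('too many reduction steps')   # warning device, 4.4.9 (1)
    if a == b:
        return True
    if a[0] == 'var' and b[0] == 'var':                  # clause (i): terminal
        return False
    if a[0] == 'abs' and b[0] == 'abs':                  # clause (vii): terminal (UD)
        return equal(a[2], b[2], fuel - 1) and equal(a[3], b[3], fuel - 1)
    if a[0] == 'app' and b[0] == 'app' \
            and equal(a[1], b[1], fuel - 1) and equal(a[2], b[2], fuel - 1):
        return True                                      # clause (iii): decompose first
    ra = head_step(a)                                    # clause (iv): beta head step
    if ra is not None:
        return equal(ra, b, fuel - 1)
    rb = head_step(b)                                    # clause (0): exchange, try b
    if rb is not None:
        return equal(a, rb, fuel - 1)
    return False
```

Note how the order of the tests realizes principle (1) of 4.4.6: for a pair of appl-expressions, decomposition is attempted before any head step is taken.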
The notation B⃗ ↓ C⃗ is used in the ordinary sense, i.e. B₁ ↓ C₁, B₂ ↓ C₂ etc. The clauses (i)-(viii) are given in their order of priority; they have to be tried successively until a clause applies. Clause (0) must only be applied, and of course only once:
(1) if none of the rules (i)-(viii) applies,
(2) if by the exchange a rule of higher priority among (i)-(viii) can be made to apply,
(3) in case the question d(B⃗) ↓ e(C⃗) is presented, where e is a "younger" definitional constant than d.
The clauses containing a bi-implication ((i), (ii), (vii)) are terminal: if application of one of these rules does not lead to an affirmative answer, a negative conclusion about the presented definitional equality can be drawn. This is in contrast with the other clauses, e.g. clause (iii): if not (A ↓ C), so not (A ↓ C and B ↓ D), then it is of course very well possible that rule (iv) produces a common reduct of (A)B and (C)D. Further, a negative conclusion can be drawn if after exchanging still no clause applies at all. If η-reduction is not allowed, then one has to read >ₕ′ and ≥ₕ′ instead of >ₕ and ≥ₕ, and rule (viii) has to be skipped.
4.4.8. It should be clear that the algorithm above, on the correct expressions, indeed corresponds with ↓. The only interesting point is the bi-implication in clause (vii), which makes that clause (viii) never has to be applied to a pair of abstr-expressions. This is justified by our property UD (for correct expressions only) from the previous sections. We also have to show the termination of the algorithm (this shows the decidability of ↓ once more). First, the questions concerning >ₕ and ≥ₕ (e.g. whether A ≥ₕ [x : B₁]B₂ for certain B₁, B₂) are decidable on behalf of SN. Secondly, the procedure sketched above (for deciding A ↓ B) is easily shown to terminate by induction on
(1) ϑ(A) + ϑ(B),
(2) l(A) + l(B),
where ϑ stands for length of reduction tree and l stands for length of expression. Clearly the η-rule (viii) is equivalent to:
A ↓ (x)B, B ≥ C, x ∉ FV(C) ⇒ [x : α]A ↓ B.
By a careful implementation of the handling of bound variables (this falls outside the scope of my thesis) it can be guaranteed that, whenever during actual verification an equality [x : α]A ↓ B is offered to the decision procedure, B does not contain free occurrences of the same variable x. This enables us to modify (viii) into the simpler rule (viii′): A ↓ (x)B ⇒ [x : α]A ↓ B, which completely avoids the nasty internal reductions in the course of an outside η-reduction. The termination of the algorithm is still guaranteed with this new rule; we can even use the same induction as before, because it can be shown that rule (viii′) will never be applied with a B such that B ≥ₕ [y : β]C.
4.4.9. In accordance with our views on the actual verification process, it may be
sensible to provide the decision procedure with a device which gives a warning in the following cases:
(1) If the decision process requires too much time, or rather: too many reduction steps.
(2) If a question d(B⃗) ↓ d(C⃗) or (A)D ↓ (F)G is posed, and not (B⃗ ↓ C⃗), resp. not (D ↓ G and A ↓ F), has been concluded.
The warnings in case (2) can be partly motivated by the idea that most defined constants in an Automath book are "λI-constants" (see [van Daalen 80, III.5.5.3, III.6.3]) and that most functions in an Automath book are λI-functions, where D is a λI-function if: D ≥ [x : α]F ⇒ x ∈ FV(F). The following example shows, however, that this motivation is not quite satisfactory: D ≡ G ≡ [x : α](V)x, A ≡ [y : β]p(y, V), F ≡ [y : β]p(y, y).
4.4.10. Now we discuss the verification of E-formulas. Since the definitions of cantyp in 4.2.3, with their computation of normal forms, are very unpractical, we prefer the alternative approach sketched in 4.1.4. Besides, the latter approach avoids the different definitions of cantyp and is, by uniformity, easier to implement for several languages simultaneously.
As our "universe", the large language which we use to decide our E-formulas, we take AUT-QE*. Let ⊢ denote correctness in AUT-68, AUT-68+ or AUT-QE(+), and let ⊢* stand for correctness in AUT-QE*. One easily proves by induction on A, using LQ, CLPT etc. for ⊢*, the important properties:
(1) ⊢ A ⇒ ⊢* typ(A), unless A is a 2-expression in AUT-68(+), and
(2) ⊢ A E B ⇔ ⊢ A, ⊢ B, ⊢* typ(A) ⊂ B, except, trivially, the degree 2 case of AUT-68(+): ⊢₂ A E B ⇔ ⊢₂ A, ⊢ B, typ(A) Q B.
This justifies the equivalence mentioned in 4.1.4.
The ↓-procedure of Sec. 4.4.7 can be adapted in order to decide ↓ and ⊂ simultaneously, by making some obvious modifications, e.g.:
- Clause (0) becomes: B ↓ / ⊂ / ⊃ A ⇔: A ↓ / ⊃ / ⊂ B (where "B ↓ / ⊂ / ⊃ A" reads "B ↓ A resp. B ⊂ A resp. B ⊃ A", etc.).
- To clause (i) there is added: degree(A) = 1 ⇒ A ⊂ τ.
- Clause (vii) becomes: α ↓ β, A ↓ / ⊂ / ⊃ B ⇔: [x : α]A ↓ / ⊂ / ⊃ [x : β]B,
etc. We do not bother to give a practical algorithm for deciding E in AUT-QE′ and AUT-QE*, because we think that these languages serve a merely theoretical purpose.
4.4.11. Rather than computing domains via the domain normal forms (dnf's) of Sec. 4.2.4.4, we use the alternative approach of 4.1.6 of searching through the →-reduction tree of an expression. Recall that → is generated by (1) ordinary reduction, (2) taking typ.
We promised the following theorem.
Theorem. → is well-founded on the correct expressions.
Proof. As long as we stay inside the correct expressions we can use a double induction, viz. (1) on degree, (2) on ϑ (= length of reduction tree). For, reduction preserves degree and decreases ϑ, and taking typ decreases degree. We must be a bit careful with applying typ to a degree 2 AUT-QE*
expression (such as, e.g., can originate by taking typ of a degree 3 AUT-QE expression), because an incorrect and even non-normable 1-expression might arise. A typical example is (A)τ. However, this does no harm to the well-foundedness, because β1-SN can be proved, without using norms at all, for all degree correct expressions. □
Also, we have another uniqueness result (compare 4.2.4.2).
Theorem. A correct, A → [x : α]C, A → [x : β]D ⇒ α ↓ β.
Proof. For 3-expressions A we even have a kind of CR-result: A ≥ A′ ⇒ typ(A) ↓ typ(A′). Now let degree(A) = 2, and let A ≥ A′. In AUT-68(+) and AUT-QE(+) this gives ⊢* typ(A) ↓ typ(A′), but in AUT-QE* this is not generally true, because typ(A) and typ(A′) need not be correct. Luckily, such incorrect 1-expressions (see the proof of the previous theorem) never reduce to an abstr-expression. So by UD we still get the desired result. □
The internal η-reductions included in → are of course useless during domain computation, where one only wants to reach an abstr-expression. So in an algorithm for domain computation we rather employ a restriction of →, which we name →ₕ and which is generated by head reduction ≥ₕ′ and taking typ. In general, unrestricted search through the →ₕ-reduction tree can be permitted, provided the degree restrictions are respected. However, the 2-expressions of AUT-QE and AUT-QE+ form an exception. Here the search for an abstr-expression has to start with taking typ. Otherwise too many expressions would get a domain, which would give rise to typical AUT-QE* appl-expressions. Besides, unrestricted search can be very unpractical. E.g. in AUT-68(+) one never needs to inspect 1-expressions: if the 2-expressions in the →ₕ-reduction tree fail to produce a domain, going to the 1-expression by taking typ will not help. In general it is no good strategy to start the domain computation with reduction, unless we are obliged to because the expression under consideration is already of minimal value degree.
So, a simple and probably rather practical strategy for AUT-68(+) and AUT-QE(+) may run as follows. Let A be the expression we start with. Take typ until one arrives at an expression of minimal value degree. Then reduce (with ≥ₕ) until one possibly finds a domain. If this does not succeed, A can still have a domain if it is a 3-expression of AUT-QE(+); otherwise A has no domain.
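The simple strategy just described can be sketched as follows. This is a toy illustration only, not the actual program: `typ` and `degree` are caller-supplied functions standing in for the typing and degree computations of the language, `min_degree` for the minimal value degree, only β head steps are performed, and the exceptional AUT-QE(+) continuation of 4.4.12 is omitted.

```python
def subst(term, x, arg):
    """Capture-naive substitution on toy tuple terms (see the sketch in 4.4.7)."""
    tag = term[0]
    if tag == 'var':
        return arg if term[1] == x else term
    if tag == 'app':
        return ('app', subst(term[1], x, arg), subst(term[2], x, arg))
    _, y, dom, body = term
    if y == x:
        return ('abs', y, subst(dom, x, arg), body)
    return ('abs', y, subst(dom, x, arg), subst(body, x, arg))

def head_step(term):
    """One beta head step, or None if the term is immune."""
    if term[0] != 'app':
        return None
    arg, f = term[1], term[2]
    while f is not None and f[0] != 'abs':
        f = head_step(f)
    return None if f is None else subst(f[3], f[1], arg)

def domain(term, typ, degree, min_degree=2):
    # stage 1: climb with typ down to the minimal value degree
    while degree(term) > min_degree:
        term = typ(term)
    # stage 2: head-reduce at this degree, looking for an abstr-expression
    while term is not None:
        if term[0] == 'abs':
            return term[2]              # the domain alpha of [x:alpha]C
        term = head_step(term)
    return None                         # no domain found at this degree

# hypothetical example: a variable f of degree 3 whose typ is [x:alpha]tau
f = ('var', 'f')
typ = {f: ('abs', 'x', ('var', 'alpha'), ('var', 'tau'))}.get
deg = lambda t: 3 if t == f else 2
print(domain(f, typ, deg))   # -> ('var', 'alpha')
```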
4.4.12. In the indicated case, unrestricted search of the →ₕ-reduction tree of typ(A) is required, to be executed as follows: one-step reduce (typ(A) >ₕ B), then take typ, then reduce (with ≥ₕ′). If this does not yield a domain, one-step reduce B once more, etc. The well-foundedness of → guarantees the termination of this procedure.
The language theory of Automath, Chapter VI (C.5)    577
VI. THE βη-CHURCH-ROSSER PROBLEM OF GENERALIZED TYPED λ-CALCULUS

VI.1. Introduction
1.1. The problem with βη-CR in Automath-like languages was first pointed out in [Nederpelt 73 (C.3)]. Let x ∉ FV(β); then
[x : α]C <β [x : α](x)[x : β]C >η [x : β]C,
and the question is whether [x : α]C and [x : β]C have a common reduct, i.e. whether βη-CR1 holds. In untyped λ-calculus this case of CR1 is particularly trivial, because without the type-labels there just remains
λx.C <β λx.(λx.C)x >η λx.C,
and for the common reduct we can simply take λx.C itself. If [x : α](x)[x : β]C is not necessarily correct, a common reduct need not exist, for α and β can be arbitrary expressions. Nederpelt already conjectured that for correct expressions βη-CR (so βη-CR1) does hold. This we shall prove below, making free use of the results of the previous chapter, in particular Sec. 3. So, if ⊢ [x : α](x)[x : β]C, then by SA we know α Q β, so [x : α]C Q [x : β]C; but we know nothing about a common reduct. It is possible that certain versions of the algorithmic definition allow a proof of βη-CR1. But then it is not so easy to infer CR, because we do not yet know CL for the algorithmic system. An alternative to the approach below is presented in the next chapter. There CR and CL are proved simultaneously for an algorithmic system, by induction on so-called big trees.
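The critical pair of 1.1 can be exhibited concretely. The fragment below is only a demonstration on ad-hoc tuple terms, with ('var', v), ('app', A, B) for (A)B and ('abs', x, dom, body) for [x : dom]body; it shows that the η-reduct and the β-reduct of [x : α](x)[x : β]C coincide except in their domains, which is exactly the information the type-labels add over the untyped case.

```python
alpha, beta, C = ('var', 'alpha'), ('var', 'beta'), ('var', 'c')  # x not free in C

# T = [x:alpha](x)[x:beta]C
T = ('abs', 'x', alpha, ('app', ('var', 'x'), ('abs', 'x', beta, C)))

# outside eta-step: [x:alpha](x)[x:beta]C > [x:beta]C
eta_reduct = T[3][2]

# beta-step on the inner redex, under the binder: since x is not free in C,
# (x)[x:beta]C > C, so the whole expression reduces to [x:alpha]C
beta_reduct = ('abs', 'x', alpha, C)

print(eta_reduct)    # ('abs', 'x', ('var', 'beta'), ('var', 'c'))
print(beta_reduct)   # ('abs', 'x', ('var', 'alpha'), ('var', 'c'))
# The bodies coincide; only the domains alpha and beta remain to be reconciled.
assert eta_reduct[3] == beta_reduct[3] and eta_reduct[2] != beta_reduct[2]
```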
1.2. Below we concentrate on βη-reduction and leave δ-reduction out of consideration. It is easy to extend our result to βηδ-CR, since δ commutes with βη-reduction:
B ≤δ A ≥βη C ⇒ B ≥βη D ≤δ C,
and, of course, δ-CR holds. We start (in Sec. 2) with a partial solution of the βη-problem, for η-reduction of degree 2, which works for regular languages only. Then (Sec. 3) we prove full βη-CR.
VI.2. A first result concerning βη-CR for regular languages
2.1. We prove the Church-Rosser property for regular languages with a reduction relation ≥ generated by β-reduction and η2-reduction, i.e. η-reduction of degree 2: degree(A) = 2, x ∉ FV(A) ⇒ [x : α](x)A >η2 A.
The motivation for studying this restricted βη-reduction lies in the fact that the actual verification of mathematics in AUT-QE (in particular, of Jutting's Landau translation, see [van Benthem Jutting 77]) required just this specific type of η-reduction. I.e. the Automath texts offered to the verification program appeared to be correct βδη2-AUT-QE.
2.2. Heuristics
The idea is to proceed in two stages. First we consider a seemingly weaker form of η2-reduction, which is tailor-made to avoid the critical βη-case mentioned in the introduction. For this restricted βη2-reduction we prove CR. Afterwards (Sec. 2.5) it is shown that full βη2-equality is equivalent to the restricted form. This can be compared with the situation in Sec. V.3.3.8, where η-equality turned out to be provable. How to define the restricted form of η-reduction? I.e. under which conditions do we permit the reduction of [x : α](x)A to A? Clearly, we require:
(1) x ∉ FV(A).
Further, we require that A is not of the form [y : β]C, to avoid the critical case. But this is not enough. Consider, e.g., [x : α](x)F, where F ≥ [y : F₁]F₂, x ∉ FV(F). So we require:
(2) ¬(A ≥ [y : β]C),
i.e. A does not reduce to an expression of the form [y : β]C. Thirdly, we want to preserve the substitution lemma
B > B′ ⇒ B[D] > B′[D],
at least for D of degree 3, so we further require:
(3) degree(A) = 2.
This shows why the method works for regular languages only. Condition (2) can now be weakened to
(2′) ¬(A ≥β2 [y : β]C),
or, in the presence of δ-reduction, to: ¬(A ≥β2δ [y : β]C).
(1)
> is the disjoint one-step reduction generated by the elementary reductions:
The language theory of Automath, Chapter VI (C.5) (i) (A) [z : B] C > C[A]. (ii) z 6 FV(A), A 2; [y : PIC, degree(A) = 2 (2) 2 is the transitive closure of
+
579
[z : a] (z) A
> A.
>.
2.4. The proof of CR for the restricted reduction
2.4.1. Substitution lemma I.
(i) A > A′ ⇒ B[A] > B[A′].
(ii) A ≥ A′ ⇒ B[A] ≥ B[A′].
Proof. As usual, by induction on B and ≥, respectively. □
2.4.2. Weak βi-βj-postponement: if i ≠ 3 and A is degree correct, then
A ≥ B ⇒ A ≥βi C ≥βj D ≤βj B.
Proof. If a βj-contraction produces an essentially new βi-redex, then i = 3 or i = j. If i = j there is nothing to prove, so unless i = 3 we have: A >βj >βi B ⇒ A >βi ≥βj C ≤βj B. So, using β-SN, βi-CR and the fact that βi and βj commute, we get the desired property, as in [van Daalen 80, II.7.4]. □
2.4.3. Something about
β2 (for degree correct expressions).
(i) degree(B) = 2, B ≥ [y : C]D ⇒ B ≥β2 [y : C′]D′.
(ii) If degree(B) = 2 and degree(A) = degree(x) = 3, then
B[x/A] ≥β2 [y : C]D ⇒ B ≥β2 [y : C′]D′.
Proof.
(i) Let B ≥ [y : C]D, degree(B) = 2. By βη-postponement and weak β2-β3-postponement we get B ≥β2 F ≥β3 G ≤β3 H ≥η [y : C]D. Then H, G, F are abstraction expressions, q.e.d.
(ii) Use the square brackets lemma (II.11.5, IV.2.4) and the previous property. □
2.4.4. Substitution lemma 11. Ifdegree(A) = degree(z) = 3 and A, B are degree correct then (i) B > B’ (ii) B 2 B’
+ +
B[z/A]
> B’[z/A].
B[z/A] 2 B’[z/A].
580
D.T. vm Daalen
Proof. (i) By induction on B. The crucial case is when B = [y : B1] (y) B2, y # FV(Bz), B2 2; [y : C]D, degree(B) = degree(&) = 2. Of course, y # FV(Bz[A]), degree(&[A]) = 2 and, by 2.4.3 (ii) &[A] 2; [y : C]D. So B[A] E [y : B1[A]] (y) Bz[Az] > &[A] q.e.d. (ii) By induction on 2.
0
2.4.5. Theorem (CR1 for the restricted reduction): if A is degree correct, then
A > B, A > C ⇒ B ↓ C.
Proof. Let A > B, A > C. By induction on A we define a common reduct D of B and C. The crucial cases are:
(i) A ≡ (A₁)[x : A₂]A₃, B ≡ A₃[A₁] (by β-red.), C ≡ (A₁′)[x : A₂′]A₃′ (by monotonicity). Take D ≡ A₃′[A₁′] and use the substitution lemmas.
(ii) A ≡ (A₁)[x : A₂](x)A₃, B ≡ (A₁′)A₃ (by η-red. and monotonicity), C ≡ (A₁)A₃ (by β-red.). Simply take D ≡ B.
(iii) A ≡ [x : A₁](x)A₂, B ≡ A₂ (by η-red.), C ≡ [x : A₁′](x)A₂′ (by monotonicity). Clearly degree(A₂′) = degree(A₂) = 2 and x ∉ FV(A₂′). If A₂′ ≥β2 [y : C₁]C₂ then A₂ ≥ [y : C₁]C₂, so by 2.4.3 (i) A₂ ≥β2 [y : C₁′]C₂′, contradicting the side condition of the η-step A > B. Hence ¬(A₂′ ≥β2 [y : C₁]C₂), so D ≡ A₂′ can serve as the common reduct. □
2.4.6. Corollary. If A is degree correct and normable, then CR(A).
Proof. By induction on the reduction tree of A. □
2.5. The extension to full βη2-reduction
2.5.1. From now on we label the notions referring to the restricted reduction with a subscript o. Thus we write >ₒ, ≥ₒ and ↓ₒ, and by ⊢ₒ we denote correctness in AUT-QE(+) with an equality relation Qₒ generated, e.g., by
⊢ₒ A, ⊢ₒ B, A >ₒ B or B >ₒ A ⇒ A Qₒ B.
By 2.4.6 we have
A Qₒ B ⇒ A ↓ₒ B.
On the other hand, the notations without a subscript have to be interpreted in terms of "full" βη2-reduction. Thus, we write ⊢ for correctness in AUT-QE(+) with equality Q, generated by
⊢ A, ⊢ B, A > B or B > A ⇒ A Q B.
581
+>
2.5.2. First we go through some theory of the o-language (i.e. with ⊢ₒ and Qₒ). The theorems about renaming of contexts and weakening (see V.2.9) are still valid. We have a restricted substitution theorem: if ξ = (ξ₀, ξ₁), ξ₁ ≡ y⃗ E β⃗, all yᵢ in y⃗ have degree 3, and ⊢ₒ B⃗ E β⃗[B⃗], then
ξ ⊢ₒ C (E/Qₒ D) ⇒ ξ₀ ⊢ₒ C[B⃗] (E/Qₒ D[B⃗]).
So we have the single substitution theorem: if degree(y) = 3, then
⊢ₒ B E β, y E β ⊢ₒ C (E/Qₒ D) ⇒ ⊢ₒ C[B] (E/Qₒ D[B]).
Hence, from SAi we can infer βi-CLPT, as usual. Now SA2 works precisely as in the previous chapter (V.3.2.4), so we may assume β2-CL.
2.5.3. The proof that ⊢ ⇒ ⊢ₒ and Q ⇒ Qₒ goes by induction on ⊢. The only interesting case is when ⊢₂ [x : α](x)A, x ∉ FV(A), A ≥β2 [x : A₁]A₂. Then η2-reduction is possible, but restricted reduction is not. So from ⊢ A one gets ⊢ [x : α](x)A Q A, and we would like to show that ⊢ₒ [x : α](x)A Qₒ A holds as well. By the ind. hyp. ⊢ₒ [x : α](x)A and ⊢ₒ A, and by β2-CL A Qₒ [x : A₁]A₂ and [x : α](x)A Qₒ [x : α](x)[x : A₁]A₂ Qₒ [x : α]A₂. By SA2 α Qₒ A₁, so by the substitution theorem [x : α]A₂ Qₒ [x : A₁]A₂, whence [x : α](x)A Qₒ A.
2.5.4. So the o-language is equivalent with the βη2-language, for which the properties CL, PT, SA etc. can be proved as in the previous chapter. Now let A Q B. By the equivalence A Qₒ B, and by CR A ↓ₒ B, so a fortiori we have CR for full βη2-reduction. Extension to the corresponding δ-language is possible as in Sec. V.3.3.

VI.3. A proof of CR for full βη-reduction from closure and strong normalization
3.1. The assumptions
3.1.1. In contrast with the proof in the previous section, the sequel does not presuppose regularity of the language. So, after having proved CL for, e.g., Nederpelt's Λ, the present proof applies to this language. We assume that correctness of expressions and equality formulas is defined relative to a correct book B and a context ξ. The book is fixed throughout this section and omitted in the notation. Below we introduce an extended reduction relation and a correspondingly extended equality. Since we want to employ our usual notations ≥, Q for these
new relations, we write ≥ₒ and Qₒ for the ordinary βη-reduction and the corresponding equality relation, generated e.g. by
ξ ⊢ A, ξ ⊢ B, A ≥ₒ C ≤ₒ B ⇒ ξ ⊢ A Qₒ B.
We use our ordinary shorthand notation, writing ξ ⊢ A for B, ξ ⊢ A, and A Qₒ B for ξ ⊢ A Qₒ B, etc.
= (170,771)
(2) Soundness of equality w.r.t. abstraction,
(3) W.r.t. application,
(a consequence of La, see below). (4) And w.r.t. substitution
(also a consequence of LQ). ( 5 ) Closure: k A, A 2, B =+ I- B .
( 6 ) SA, so (this concerns directly the critical pg-case)
[x : a](z) [y : PIC
+- x E a
a Q, p
.
(7) Strong normalization (with respect to 2,): l- A =+ SN(A). Remark: the properties (3) and (4) depend on La. As we know (see V.3.3.10) LQ fails in AUT-QE(+) with &reduction, but CR for these languages can be proved in two ways:
(1) From CR for AUT-QE*.
(2) By first proving CR for a δ-less version, and then extending the result by using UE [i.e. an unessential extension result].
3.2.1. Heuristics
We saw that in the critical case of βη-reduction the two direct reducts of [x : α](x)[x : β]C are syntactically equal (≡) but for the domains α and β, which are just definitionally equal (Qₒ). Below we define the relation ≈ which precisely covers this kind of syntactic similarity, intermediate between ≡ and Qₒ. It would be straightforward to try and prove a modified CR-property
B ≤ₒ A ≥ₒ C ⇒ B ≥ₒ D ≈ D′ ≤ₒ C
by proving ≈-postponement, i.e.
A ≈ B ≥ₒ C ⇒ A ≥ₒ B′ ≈ C.
However, there is a problem with the latter property if A ≡ [x : α](x)A₁, B ≡ [x : α](x)C, x ∉ FV(C), A₁ ≈ C. For it is possible that x ∈ FV(A₁). So we take a different approach. We define an extended reduction relation > which is disjoint βη-one-step reduction, enriched by the clause
A ≈ B ⇒ A > B (elementary ≈-reduction).
This means that internal contractions in the domains are ignored for the bookkeeping of reduction steps. For the new reduction relation we can simply prove CR1. Further, there holds a certain version of >-SN, which gives us CR.
3.2.2. Structure of the proof

We point out the difference with the approach in Sec. VI.2. There we first restricted our reduction relation, proved CR for the restricted reduction and then extended the result to the original reduction. On the other hand, here we start with proving CR for the extended reduction relation ≥, and afterwards we still must prove CR for ≥₀. In fact we first prove modified uniqueness of >-normal form, i.e. uniqueness with respect to ≈: A Q B, A and B >-normal ⟹ A ≈ B. And then, using the equivalence of Q₀ and Q, uniqueness of >₀-normal form. So we have ≥₀-CR. For a comparison of ≥₀- and ≥-normalisation see Sec. 3.7.1 below.

3.3. Definition of the extended reduction relation

3.3.1. By simultaneous inductive definition we introduce the syntactic similarity ≈, the extended reduction relation ≥, with one-step reduction >, and the extended definitional equality Q, between correct expressions, as follows.
(I) Elementary reductions

(1) (A)[x : B]C > C[A] (β-reduction)
(2) [x : B](x)C > C if x ∉ FV(C) (η-reduction)
(3) A ≈ B ⟹ A > B (≈-reduction)

(II) Monotonicity rules

(1) A > A', B > B' ⟹ (A)B > (A')B'
(2) x ∈ α ⊢ A > A' ⟹ [x : α]A > [x : α]A'
(3) A₁ > A₁', ..., Aₖ > Aₖ' ⟹ c(A₁, ..., Aₖ) > c(A₁', ..., Aₖ')

(III) (1) ≥ is the transitive closure of >
(2) Q is the equivalence generated by >

(IV) (1) A ≈ A
(2) α Q α', x ∈ α ⊢ B ≈ B' ⟹ [x : α]B ≈ [x : α']B'
(3) A ≈ A', B ≈ B' ⟹ (A)B ≈ (A')B'
(4) A₁ ≈ A₁', ..., Aₖ ≈ Aₖ' ⟹ c(A₁, ..., Aₖ) ≈ c(A₁', ..., Aₖ')
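To make rule group (IV) concrete, here is a small illustrative sketch (not part of the original text): expressions are encoded as nested Python tuples ('var', x), ('abs', x, dom, body) for [x : dom]body and ('app', arg, fun) for (arg)fun — all hypothetical choices — and the parameter defeq is a stand-in for the definitional equality Q. Bound-variable renaming is ignored by comparing binder names directly, a simplification of the conventions used here.

```python
def similar(a, b, defeq):
    """Syntactic similarity: identical shape everywhere, except that
    domains of abstractors need only be definitionally equal (IV.2)."""
    if a == b:                       # covers rule IV.1
        return True
    if a[0] != b[0]:
        return False
    if a[0] == 'abs':                # rule IV.2
        _, x, dom_a, body_a = a
        _, y, dom_b, body_b = b
        return x == y and defeq(dom_a, dom_b) and similar(body_a, body_b, defeq)
    if a[0] == 'app':                # rule IV.3
        _, arg_a, fun_a = a
        _, arg_b, fun_b = b
        return similar(arg_a, arg_b, defeq) and similar(fun_a, fun_b, defeq)
    return False                     # distinct variables/constants
```

On the critical pair of 3.2.1, the two reducts [x : α]C and [x : β]C come out similar exactly when defeq(α, β) holds.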
3.3.2. Some remarks concerning the definition

3.3.2.1. It is not necessary to define the above notions simultaneously. For in view of 3.4.3 below, we might as well have taken instead of IV.2:

(IV.2') α Q₀ α', x ∈ α ⊢ B ≈ B' ⟹ [x : α]B ≈ [x : α']B'.

3.3.2.2. Except for the rules I.3 and II.2, the rules of I and II are the ordinary rules for disjoint one-step βη-reduction. Rule I.3 can be considered a strong form of the reflexivity rule A > A. Rule II.2 is one half of the usual monotonicity rule for abstr. expressions. The other half can be derived using IV.1, IV.2 and I.3: if α > α' then α Q α', further A ≈ A, so [x : α]A ≈ [x : α']A, so [x : α]A > [x : α']A.

3.3.2.3. If we had defined > to be the corresponding "nested" one-step reduction we might have been able to prove the diamond property for >. Then we could have avoided the appeal to SN when deriving CR from CR1. [Nested one-step reduction ≥₁ is like disjoint one-step reduction, but additionally redices may be contracted in one step when they occur inside each other; typical clause: A ≥₁ A', C ≥₁ C' ⟹ (A)[x : B]C ≥₁ C'[A'].]
3.4. Some easy properties

3.4.1. By simultaneous induction on Definition 3.3.1, using the soundness of Q₀ w.r.t. expression formation, we get:

if A > A' or A ≥ A' or A Q A' or A ≈ A' then A Q₀ A'.

3.4.2. From 3.3.2.2 it is clear that ≥ satisfies all the monotonicity rules and that

A >₀ B ⟹ A > B,  A ≥₀ B ⟹ A ≥ B,  and  A Q₀ B ⟹ A Q B.
3.4.3. So combining this we have Q₀ ⟺ Q. As a corollary we have the monotonicity rules 3.1.2 (2)-(4) now also for Q. The monotonicity of ≈ is immediate. Further, ≈ is an equivalence relation.

3.5. On ≈-reduction and normalisation

3.5.1. In certain λ-calculus systems (see, e.g. [Curry and Feys 58]) renaming of bound variables is not ignored (as we do here) but formalised in the form of α-reduction:
Then (see our definition of substitution, Sec. II.2.4) it is possible that α-reductions are needed before some β-reduction can be carried out. In such systems, a suitable definition of proper reduction sequence is: a sequence C₁ > C₂ > ... in which only a finite number of α-reductions occur. I.e. a reduction sequence is proper if from a certain Cₙ on, only α-reductions are applied. Similarly, C is normal if only α-reductions of C are possible.
3.5.2. Here we treat the ≈-reductions analogously, as extended α-reduction, and call them improper reductions. Proper reduction sequences are reduction sequences in which only a finite number of such improper reductions occur. An expression is now SN if all its proper reduction sequences terminate, and normal if only improper reductions are possible. So

A normal, A > A' ⟹ A ≈ A'.
3.5.3. In 3.5.1 we mentioned the possibility that α-reductions create new β-redices. For ≈-reductions this is not the case. Let >β (resp. >η) denote the disjoint one-step reduction generated by the rules (I.1) (resp. (I.2)) and (II) of 3.3.1. So, e.g., A >β A' if some β-redices not lying inside a "domain" are contracted. Then we have, indeed, β≈-postponement:

A ≈ B >β C ⟹ A >β B' ≈ C.

However, η≈-postponement fails, because ≈-reductions can create new η-redices (see 3.2.1). Fortunately we have ≈η-postponement instead:

A >η B ≈ C ⟹ A ≈ B' >η C.

3.5.4. Now we can prove SN (in the sense of 3.5.2). Let a proper reduction sequence C₁ > C₂ > ... be given. If no β-step turns up then the sequence terminates, because from some Cₙ on only η-steps are applied, which decrease the length of the expression. Otherwise, for some n, by ≈η-PP

C₁ ≥η Γ >β Cₙ₊₁,

and by βη-PP and β≈-PP

C₁ >β Γ' ≈ Γ'' ≥η Cₙ₊₁.

By SN with respect to ≥β, θβ(C) is defined for correct C, and θβ(C₁) > θβ(Γ'). So by induction on θβ we can prove SN.
3.6. CR for ≥

3.6.1. Substitution lemma I. If ⊢ B[A] and ⊢ B[A'] then

(i) A > A' ⟹ B[A] > B[A'].
(ii) A ≥ A' ⟹ B[A] ≥ B[A'].
(iii) A Q A' ⟹ B[A] Q B[A'].
(iv) A ≈ A' ⟹ B[A] ≈ B[A'].

Proof. All parts can be proved separately by ind. on B, using the monotonicity rules for >, ≥, Q and ≈. □
3.6.2. Substitution lemma II. If ⊢ B[A] and ⊢ B'[A] then

(i) B > B' ⟹ B[A] > B'[A].
(ii) B ≥ B' ⟹ B[A] ≥ B'[A].
(iii) B Q B' ⟹ B[A] Q B'[A].
(iv) B ≈ B' ⟹ B[A] ≈ B'[A].

Proof. By simultaneous induction on the definition of >, ≥, Q and ≈. □

3.6.3. Main lemma (CR1): If A correct, B ≤ A ≥ C then B ↓ C.
Proof. By ind. on A. If A ≈ B then for the common reduct D we can take D ≡ C. Similarly if A ≈ C. In case A ≡ (A₁)A₂, B ≡ (B₁)B₂, C ≡ (C₁)C₂, B₁ ≤ A₁ ≥ C₁, B₂ ≤ A₂ ≥ C₂, then by the ind. hyp. and by monotonicity of ≥ we find a common reduct (D₁)D₂ with B₁ ≥ D₁ ≤ C₁, B₂ ≥ D₂ ≤ C₂. Similarly if A ≡ c(A₁, ..., Aₖ). Further distinguish:

(i) A ≡ (A₁)[x : A₂]A₃, B ≡ (B₁)[x : B₂]B₃, C ≡ A₃[A₁], A₁ > B₁, A₂ Q B₂, A₃ > B₃. By the substitution lemmas above B ≥ B₃[B₁] ≤ A₃[A₁], so take D ≡ B₃[B₁].

(ii) A ≡ (A₁)[x : A₂](x)A₃, B ≡ (B₁)A₃ (by η-red.), C ≡ (A₁)A₃ (by β-red.), x ∉ FV(A₃), A₁ > B₁. Then C > B, and take D ≡ B.

(iii) A ≡ [x : A₁]A₂, B ≡ [x : B₁]B₂, C ≡ [x : C₁]C₂, A₁ Q B₁, A₁ Q C₁, B₂ ≤ A₂ ≥ C₂. By ind. hyp. B₂ ≥ D₂ ≤ C₂, so take e.g. D ≡ [x : B₁]D₂.

(iv) A ≡ [x : A₁](x)A₂, B ≡ [x : B₁](x)B₂, C ≡ A₂ (by η-red.), x ∉ FV(A₂), A₁ Q B₁, A₂ > B₂. It is easy to see that A₂ ≥βη D₂ ≈ B₂. Clearly x ∉ FV(D₂), so B ≥ [x : B₁](x)D₂ > D₂ ≤ A₂ ≡ C. So take D ≡ D₂.

(v) A ≡ [x : A₁](x)[x : A₂]A₃, B ≡ [x : A₁]A₃, C ≡ [x : A₂]A₃, x ∉ FV(A₂). This is the critical case. By assumption (6) from 3.1.2, A₁ Q A₂, so we can take D ≡ B ≈ C. □
3.6.4. Theorem (CR): If A correct then CR(A).

Proof. By SN we can define θ(A), the maximal number of proper reduction steps in reduction sequences of A. Use induction on θ(A). Let B ≤ A ≥ C. The cases A ≈ B and A ≈ C are trivial. Otherwise, for certain proper reducts B₁ and C₁, A > B₁ ≥ B, A > C₁ ≥ C. First apply 3.6.3 to get B₁ ≥ D₁ ≤ C₁. Then apply the ind. hyp. to B₁, C₁ and D₁. □

3.6.5. Corollaries. I. A Q B ⟹ A ↓ B.

II. Similarity of normal forms:

A Q B, A and B normal ⟹ A ≈ B.
3.7. CR for ≥₀

3.7.1. Call an expression o-normal if it is normal with respect to >₀, i.e. if it does not contain β- or η-redices. So, if A is o-normal then no reduction steps A >β B or A >η B are possible. But it might be possible, as long as we do not have CR, that after some ≈-reductions new η-redices are created. So a priori we do not know whether A is normal. But, if A is o-normal and A does not have abstraction form and A ≥ B, then this reduction is an internal, and not a main reduction. E.g.:

A ≡ (A₁)A₂ ⟹ B ≡ (B₁)B₂, A₁ ≥ B₁, A₂ ≥ B₂.

3.7.2. Theorem (uniqueness of o-normal form): Let A and B be o-normal. Then A Q₀ B ⟹ A ≡ B.
Proof. By induction on the sum of the lengths of A and B. Let A Q₀ B, so A Q B, so A ≥ C ≤ B. Distinguish the following cases:

(1) Both A and B are abstr-expressions, [x : A₁]A₂ resp. [x : B₁]B₂. By prop. 3.1.2(2), A₁ Q₀ B₁ and x ∈ A₁ ⊢ A₂ Q₀ B₂. By the ind. hyp. A₁ ≡ B₁, A₂ ≡ B₂, so A ≡ B.

(2) Neither A nor B is an abstr-expression. Then A and B and C have the same form. E.g. if A ≡ (A₁)A₂, then C ≡ (C₁)C₂, so B ≡ (B₁)B₂ with A₁ ≥ C₁ ≤ B₁ and A₂ ≥ C₂ ≤ B₂. So A₁ Q B₁, A₂ Q B₂, hence A₁ Q₀ B₁, A₂ Q₀ B₂, and by the ind. hyp. A₁ ≡ B₁, A₂ ≡ B₂.

(3) A has abstr. form and B has not. Then A ≡ [x : A₁]A₂, A₂ ≥ (x)D₂, x ∉ FV(D₂), A₁ Q D₁, and A ≥ [x : D₁](x)D₂ > D₂ ≥ C ≤ B. By CL, x ∈ D₁ ⊢ (x)D₂ and by 3.1.2(3), x ∈ D₁ ⊢ (x)D₂ Q (x)B. So x ∈ A₁ ⊢ A₂ Q (x)B, and both A₂ and (x)B are o-normal. By the ind. hyp. A₂ ≡ (x)B. Clearly x ∉ FV(B), so A is not o-normal, contradiction. So this case does not occur. □
3.7.3. Corollary (CR):

(i) A correct, A ≥₀ B, A ≥₀ C ⟹ B ≥₀ D ≤₀ C.
(ii) A Q₀ B ⟹ A ≥₀ C ≤₀ B.
3.7.4. Now we can conclude:

A o-normal ⟹ A normal.

For, if A is o-normal and A ≈ B >η C (i.e. A is not normal), then A ≡ ...[x : A₁](x)A₂..., x ∈ FV(A₂), B ≡ ...[x : B₁](x)B₂..., x ∉ FV(B₂), A₁ Q B₁, x ∈ A₁ ⊢ A₂ Q B₂. By CR, B₂ ≥₀ A₂, so FV(A₂) ⊆ FV(B₂), impossible.
VII. THE ALGORITHMIC DEFINITION AND THE THEORY OF NEDERPELT'S Λ: THE BIG TREE THEOREM, CLOSURE AND CHURCH-ROSSER

VII.1. Introduction and summary

1.1. The history of Λ

A further unification of the concepts underlying AUT-68 and AUT-QE led Nederpelt and de Bruijn [Nederpelt 71a], [Nederpelt 71b], [de Bruijn 71 (B.2)], after the construction of an intermediate version Λ-AUT, to the introduction of the language Λ or, as de Bruijn names it, AUT-SL, for single line Automath. First Nederpelt noticed that, via a suitable translation, instantiation, i.e. substitution in constant-expressions c(x₁, ..., xₙ), could be replaced by application, and that, by this translation, δ-reduction reduced to β-reduction. We used this fact for one of our proofs of δ-SN in [van Daalen 80, III.5.4]. However, in order to cover substitution with 2-expressions, as is allowed in Automath languages, the restriction to argument degree 3 and domain degree 2 had to be dropped. This would, in combination with type-inclusion, have given a higher order system, so to avoid normability and normalization problems one had to skip type-inclusion. Then a further streamlining of the definition was attained by dropping the restriction as to inhabitable degree as well, thus allowing expressions of any degree. By the aforementioned translation and the relaxation of the degree restrictions it became possible to dispense completely with constants and schemes: constants could be translated into variables, schemes could be turned into assumptions, and a book could be transformed into a context. Besides, quantification over all free variables was allowed now, so all assumptions x ∈ α from a context could be converted into abstractors [x : α]. Thus, a statement B; ...
The language theory of Automath, Chapter VII (C.5)
1.2. The present treatment

The discussion in the previous chapters — starting from the E-definition (V.2), first proving closure (V.3) and βη-Church-Rosser (VI), and finally proving the equivalence with the algorithmic definition (V.4) — though concentrating on the so-called regular languages AUT-QE and AUT-68, applies to Nederpelt's language as well, which shows that his conjectures were justified. Here we choose an altogether different approach. Below we start with the algorithmic definition of correctness (VII.2). We follow Nederpelt but for his single-line presentation: we fit the system into the book-and-context framework of the previous chapters. Whereas the definition of the constant-less part of the language (Sec. 2.1) can simply take place in the pretyped expressions [in the pretyped expressions there is a typing function, but the typing does not restrict the term formation], it turns out that adding constant-expressions (Sec. 2.2) requires the introduction of degree-norm correct expressions (Sec. 2.2.4). Then both Nederpelt's conjectures are proved directly from the algorithmic definition, using the so-called big tree theorem (BT). This theorem states that, on the correct expressions (and, in fact, on the much larger domain of normable expressions) the partial order ⪰ generated by sub (i.e. taking proper subexpressions), by > and by taking typ is well-founded. So BT is an SN-result for an extended reduction relation and, hence, implies ordinary SN. The big tree theorem was first formulated and proved in [de Vrijer 75 (C.4)] for the regular language AX.

Section 3 below contains the closure proof of Λ without constants, serving as a motivation for BT. Section 4 contains two different proofs of BT, and in Sec. 5 we prove closure and CR for the constant-less part of Λη. In Sec. 6 we give some equivalence proofs: of the systems with and without (definitional) constants, and of the single-line version with the book-and-context presentation. As a result we get the various nice properties for all these systems.
VII.2. The definition of Λ and Λη

2.1. The part without constant expressions

2.1.1. Both Λ and Λη are systems of admissible expressions in the sense of IV.3. [In a system of admissible expressions term formation is in one way or another restricted; examples are the normable expressions, the degree correct expressions, and the correct expressions. See [van Daalen 73 (A.3)].] The correctness of books and contexts is standard, so we just present the part of the definition concerning the correctness of expressions. A simplification compared with e.g. AUT-QE is that no degree restrictions are imposed. If in the definition below > (resp. ≥, resp. ↓) is interpreted in terms of βη-reduction then we get Λη, otherwise just Λ.
The function typ and the degrees are as usual, see e.g. [Nederpelt 73 (C.3)] and [van Daalen 73 (A.3)]. Throughout Sec. 2.1 we follow Nederpelt and do not admit constant-expressions. Later on (Secs. 2.2, 2.3) we show how the language can be extended with the formation of constant-expressions.
2.1.2. By taking typ of a non-constant-expression A the degree is decreased by one, so by successively taking typ one arrives at a 1-expression. This 1-expression is called typ*(A). So:

typ*(A) := A if degree(A) = 1,
typ*(A) := typ*(typ(A)) otherwise.

Now let B be correct and let ξ be correct w.r.t. B. We use the conventional shorthand: ⊢ A instead of B; ξ, η ⊢ A, typ instead of ξ-typ, etc. Of course, as long as we do not form constant-expressions, the presence of the book B is completely irrelevant. Now correctness of non-constant-expressions is defined as follows:

(i) ⊢ τ.
(ii) ⊢ x if x is among the variables in ξ.
(iii) ⊢ [x : α]B if ⊢ α and x ∈ α ⊢ B.
(iv) ⊢ (A)B if ⊢ A, ⊢ B, typ(A) ≥ α, typ*(B) ≥ [x : α]C for some α, C.
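As an illustration only (this sketch, including the tuple encoding and the toy typing function below, is our own assumption, not part of the text), the recursion for typ* and the application condition of clause (iv) can be rendered in Python. The reduction "≥" is crudely replaced by syntactic equality, which is adequate only for expressions already in normal form:

```python
TAU = ('tau',)

def typ_star(expr, typ, degree):
    """typ*: iterate typ until a 1-expression is reached (2.1.2)."""
    while degree(expr) != 1:
        expr = typ(expr)
    return expr

def application_ok(a, b, typ, degree):
    """Application condition for (A)B: typ*(B) must reduce to an
    abstraction [x : alpha]C with typ(A) reducing to alpha; here
    'reduces to' is approximated by syntactic equality."""
    t = typ_star(b, typ, degree)
    if t[0] != 'abs':
        return False
    _, _x, dom, _body = t
    return dom == typ(a)

# Toy environment: every variable has typ tau (degree 2); the typ of an
# abstraction is the abstraction over the typ of its body.
def toy_typ(e):
    if e == TAU:
        raise ValueError('tau (degree 1) has no typ')
    if e[0] == 'var':
        return TAU
    if e[0] == 'abs':
        return ('abs', e[1], e[2], toy_typ(e[3]))
    raise ValueError(e)

def toy_degree(e):
    if e == TAU:
        return 1
    if e[0] == 'var':
        return 2
    if e[0] == 'abs':
        return toy_degree(e[3])
    raise ValueError(e)
```

With B = [y : τ]y one gets typ*(B) = [y : τ]τ, so (x)B satisfies the condition because typ(x) = τ matches the domain.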
2.1.3. So correct expressions are pretyped expressions satisfying the so-called application condition: in appl. expressions (A)B the expression B has a domain (to compute from typ*(B)) corresponding with the typ of A. In the next section, where we also introduce constant-expressions, an additional condition concerning instantiation will be imposed. There are various alternative, equivalent formulations of the application condition possible. E.g. one can replace "typ(A) ≥ α" by "typ(A) ↓ α". In Λ (i.e. without η-reduction) we have CR, so it is even sufficient to require typ(A) Q α and typ*(B) Q [x : α]C, in other words: typ*(B) Q [x : typ(A)]C, where Q is full definitional equality (see V.2.11). Or, anticipating certain results of Sec. 6.2.6, we might restrict the computation of the domain of B by requiring typ*(B) ≥β [x : α]C (compare V.3.3).

2.1.4. Since norms are preserved under taking typ and under reduction (see e.g. [van Daalen 80, IV.3.4]) the correct expressions are strictly normable [strictly normable: the norm restrictions are fully respected; contrary to languages with type inclusion, where in some cases (substitution with 2-expressions) the norm
restrictions may be violated (see also VIII.5.4.1)]. This can be shown by induction on the definition of ⊢. E.g. that (A)B is strictly normable if it is correct: by ind. hyp. A and B are normable, so ρ(A) = ρ(typ(A)) ≡ ρ(α) and ρ(B) = ρ(typ*(B)) = ρ([x : α]C) = [ρ(α)]ρ(C), so (A)B is normable, with ρ((A)B) = ρ(C).

Hence the correct expressions are SN and the system is decidable.
2.2. Introducing constant-expressions; degree-norm correctness

2.2.1. We allowed the presence of a book containing schemes for the constants. Now we can simply introduce constant-expressions by adding the instantiation rule:

(v) If y⃗ ∈ β⃗ * c(y⃗) ∈ γ is a scheme of B, k = |y⃗|, ⊢ B₁, ..., ⊢ Bₖ and typ(B₁) ↓ β₁[B⃗], ..., typ(Bₖ) ↓ βₖ[B⃗], then ⊢ c(B⃗).

That is, in a constant-expression c(B⃗), the arguments Bᵢ have to satisfy the instantiation condition typ(Bᵢ) ↓ βᵢ[B⃗].

However, we have to make sure that typ* is still well-defined, particularly that taking typ still decreases the degree by one. E.g. typ(c(B⃗)) (= typ(c)[B⃗] = γ[B⃗]) and typ(c) (= γ) must have the same degree.

2.2.2. Call a substitution [y⃗/B⃗] degree correct if degree(yᵢ) = degree(Bᵢ) for i = 1, ..., k. Degree correct substitutions preserve the degree: if γ is a y⃗-expression and [y⃗/B⃗] is degree correct then γ[B⃗] and γ have the same degree. So, if we would add the requirement of degree correct substitution to the instantiation condition, then we might be satisfied. But this is not what we want: we rather would like to show that the instantiation condition implies the degree correctness of the substitution involved. This amounts to showing that degrees are preserved under reduction as well. To this end we introduce the concept of degree-norm correctness.

2.2.3. Degree-norms are defined by:
(i) positive integers are degree-norms;
(ii) if ν₁, ν₂ are degree-norms then [ν₁]ν₂ is a degree-norm.

So, just like ordinary norms (IV.2) are built up from τ and square brackets, degree-norms are constructed from 1, 2, ... and square brackets. For degree-norms ν we define the degree-norm ν + 1 as follows:

(i) if ν is an integer then ν + 1 is as usual;
(ii) if ν = [ν₁]ν₂ then ν + 1 := [ν₁](ν₂ + 1), so ([[2]3]2) + 1 = [[2]3]3.
2.2.4. Now we define degree-norm correctness of books, contexts (w.r.t. a book) and expressions (w.r.t. book and context). It is implicitly intended that an expression is degree-norm correct (dnc) iff its degree-norm (dn), w.r.t. book and context, is defined. The definition of the latter runs as follows:

(i) dn(τ) := 1.
(ii) dn(x) := dn(typ(x)) + 1.
(iii) dn([x : α]B) := [dn(α) + 1]dn(B).
(iv) dn((A)B) := ν, if dn(B) ≡ [dn(A)]ν.
(v) dn(c(B⃗)) := dn(typ(c)) + 1, if dn(Bᵢ) = dn(yᵢ) for i = 1, ..., k, where y⃗ ∈ β⃗ * c(y⃗) ∈ γ is the scheme of c.

Here the notational conventions are just like those w.r.t. ordinary norms: we write dn instead of ξ-dn; clause (iii), e.g., would in full read like this:

(iii) ξ-dn([x : α]B) := [(ξ-dn(α)) + 1]((ξ, x ∈ α)-dn(B)).

Further, a context is dnc if all its type parts are so, and a book is dnc if all the contexts and typs of it are dnc.

2.2.5. A degree-norm ν can be translated into an ordinary norm ν* by replacing all occurrences of numbers by τ. Notice that (ν + 1)* ≡ ν*, so dn(A)* ≡ ρ(A). This shows that dnc-ness implies strict normability. Further, degree(A) can also be constructed from dn(A), for dn(A) ends precisely in the degree of A. We call a substitution [y⃗/B⃗] dnc if dn(Bᵢ) = dn(yᵢ), for i = 1, ..., k. Clearly dnc substitutions are degree correct. Degree-norm correctness is preserved under dnc substitutions: if y⃗ ∈ β⃗ ⊢ γ, k = |y⃗|, ⊢ B₁, ..., ⊢ Bₖ, γ dnc and [y⃗/B⃗] dnc, then

dn(γ) = dn(γ[B⃗]).
Proof. By induction on the definition of dn(γ). □

This gives us the following corollaries:

(1) C dnc, degree(C) ≠ 1 ⟹ typ(C) dnc, dn(typ(C)) + 1 = dn(C).
(2) C dnc, C ≥ D ⟹ D dnc, dn(D) = dn(C).
(3) C dnc, degree(C) ≠ 1 ⟹ degree(typ(C)) + 1 = degree(C).
(4) C dnc, C ≥ D ⟹ degree(D) = degree(C).

So typ* is total on the dnc expressions and, since dnc-ness is clearly decidable, typ* is well-defined on all the expressions, in the sense of V.4.4.1.
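The arithmetic of 2.2.3-2.2.5 can be animated in a few lines. In the hypothetical encoding below a degree-norm is an int or a pair (nu1, nu2) standing for [nu1]nu2; plus_one adds 1 at the end, star translates to an ordinary norm (numbers become 'tau'), and end_degree reads off the final number. This is an illustrative sketch only, not the official definition:

```python
def plus_one(nu):
    """nu + 1 of 2.2.3: add 1 at the very end of the degree-norm."""
    if isinstance(nu, int):
        return nu + 1
    nu1, nu2 = nu
    return (nu1, plus_one(nu2))

def star(nu):
    """Translate a degree-norm to an ordinary norm: numbers become 'tau'."""
    if isinstance(nu, int):
        return 'tau'
    nu1, nu2 = nu
    return (star(nu1), star(nu2))

def end_degree(nu):
    """dn(A) ends precisely in the degree of A (2.2.5)."""
    while not isinstance(nu, int):
        nu = nu[1]
    return nu
```

For ν = [[2]3]2, i.e. ((2, 3), 2), plus_one gives [[2]3]3, star is unchanged by plus_one (the claim (ν + 1)* ≡ ν*), and end_degree returns the degree.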
2.2.6. Now we are able to show that correctness implies degree-norm correctness.

Proof. By induction on ⊢. E.g. let ⊢ A, ⊢ B, typ(A) ≥ α, typ*(B) ≥ [x : α]C. By ind. hyp. A and B are dnc (so typ*(B) is indeed defined), so typ(A), α, typ(B), typ(typ(B)), ..., typ*(B) and [x : α]C are dnc as well. Now dn(typ*(B)) = dn([x : α]C) = [dn(α) + 1]dn(C) = [dn(typ(A)) + 1]dn(C) = [dn(A)]dn(C), while dn(typ*(B)) and dn(B) just differ as to their "end number", so dn(B) = [dn(A)]ν for some ν. Hence (A)B is dnc.

Or, let y⃗ ∈ β⃗ * c(y⃗) ∈ γ be a scheme, let ⊢ B₁, ..., ⊢ Bₖ (with k = |y⃗|) and let the Bᵢ satisfy the instantiation condition typ(Bᵢ) ↓ βᵢ[B⃗]. By ind. hyp. the Bᵢ and the βᵢ are dnc. Now dn(B₁) = dn(typ(B₁)) + 1 = dn(β₁) + 1 = dn(y₁), so [y₁/B₁] is a dnc substitution. So dn(B₂) = dn(typ(B₂)) + 1 = dn(β₂[B₁]) + 1 = dn(y₂). So [y₁, y₂/B₁, B₂] is dnc, etc. Hence c(B⃗) is dnc. □

So typ* is also total on the correct expressions, and correctness is well-defined. Further, the above proof shows that the system with constants is strictly normable as well, so (using SN) it is decidable.
2.3. Introducing definitional constants

2.3.1. After the formulation of instantiation and application condition, it will also be clear how the compatibility condition of def and typ for the formation of definitional constant schemes has to read:

typ(def(d)) ↓ typ(d), for definitional constants d.

2.3.2. The scheme of a definitional constant d is defined to be dnc if dn(def(d)) = dn(typ(d)) + 1, and for the corresponding d(B⃗) we define

dn(d(B⃗)) := dn(typ(d)) + 1, provided [y⃗/B⃗] is dnc, where y⃗ ∈ β⃗ is the context of the scheme.

So, still dn(d(B⃗)) ≡ dn(typ(d)) + 1 = dn(typ(d)[B⃗]) + 1 = dn(typ(d(B⃗))) + 1, and degree-norms remain preserved under reduction: dn(d(B⃗)) = dn(typ(d)) + 1 = dn(def(d)) = dn(def(d)[B⃗]). And, by induction on correctness, we can prove that correctness implies degree-norm correctness. E.g. let the scheme of d be correct, then ⊢ def(d), so def(d) dnc, and dn(def(d)) ≡ dn(typ(def(d))) + 1, and ⊢ typ(d), so typ(d) dnc, dn(typ(d)) = dn(typ(def(d))) and dn(def(d)) = dn(typ(d)) + 1, q.e.d.
VII.3. The closure proof for Λ

3.1. What to prove

The decidability of the Automath language is one of the major aims of the language theory. By using an algorithmic definition we got the decidability of Λ and Λη, both with and without constants, directly from normalization (see 2.1.4 and 2.2.6). So one might wonder what else there is to prove. First there are both Nederpelt's conjectures, the Church-Rosser property (CR) for Λη, and the closure property (CL). We define

CR(A) : B ≤ A ≥ C ⟹ B ↓ C,
CL(A) : ⊢ A, A ≥ B ⟹ ⊢ B.

A main lemma for β-CL (and δ-CL) is the substitutivity of correctness: substitution with correct expressions of the right types preserves correctness. Formally:

x ∈ α ⊢ B, ⊢ A, typ(A) ↓ α ⟹ ⊢ B[x/A].

Other properties which play an important role in the proof of CL are sound applicability (SA), preservation of typ (PT), of typ* (P*T) and of domain (PD). We write

SA(A) : A ≡ (B)[x : C]D ⟹ typ(B) ↓ C,
PT(A) : A ≥ B ⟹ typ(A) ↓ typ(B) (degree(A) ≠ 1, degree(B) ≠ 1),
P*T(A) : A ≥ B ⟹ typ*(A) ↓ typ*(B),
PD(A) : A ≡ [x : B]C, A ≥ [x : D]E ⟹ B ↓ D.
The properties PT1, CL1, P*T1 and PD1 are the respective one-step variants of PT, CL, P*T and PD. The above properties are not mere technicalities from the closure proof, but are also meaningful from the point of view of interpretation. E.g. SA is characteristic for the fact that the AUT-languages do not allow "proper inclusion" of types, and PT (resp. P*T) expresses the nice behaviour of typ (resp. typ*) w.r.t. definitional equivalence. Further, these properties serve to establish the correspondence between the present, algorithmic systems and the E-systems, and between the versions with and without constants (see 6.2, 6.3).
3.2. Some simple facts

3.2.1. Throughout this Section VII.3 we just discuss Λ [i.e. without η] without constants. So we may assume CR, and PD(A) (for all A) and SA(A) (for correct A) are immediate. By induction on ⊢ A one also proves easily that ⊢ A implies ⊢ typ(A) (so ⊢ typ(typ(A)), ..., ⊢ typ*(A)). This is not easy any more for a system with constants. This property is called correctness of types.
3.2.2. As with the E-systems (see V.3.1), we prove CL from CL1 by ind. on ≥. For the β-outside case of CL1 we need substitutivity and SA. Previously substitutivity (i.e. the substitution theorem, V.2.9) was easy and SA was rather involved, but here SA is easy and substitutivity is quite complicated. First some properties of substitution, which are valid already for pretyped expressions. Let A be a ξ-expression, let B be a (ξ, x ∈ α, η)-expression. Let C* denote C[x/A]. Then

(1) typ(A) ↓ typ(x) ⟹ typ(B*) ↓ typ(B)*,

i.e., written out in full,

ξ-typ(A) ↓ ξ-typ(x) ⟹ (ξ, η*)-typ(B*) ↓ ((ξ, x ∈ α, η)-typ(B))*,

(2) typ*(A) ↓ typ*(x) ⟹ typ*(B*) ↓ typ*(B)*.
Both facts are proved by ind. on the length of B. Notice that (1) and (2) are valid for each right monotonic, reflexive relation instead of ↓, so e.g. for ≥.
3.2.3. The problem with substitutivity is that the condition typ(A) ↓ α is clearly not sufficient. We would also like to know something about typ*. In fact we have the following theorem (modified substitutivity of correctness, for short SC):

Let x ∈ α, η ⊢ B, let ⊢ A, typ(A) ↓ typ(x) and typ*(A) ↓ typ*(x). Let C* denote C[x/A] again. Then η* ⊢ B*.

Proof. By induction on ⊢ B. E.g. the application case. Let ⊢ B₁, ⊢ B₂, typ(B₁) ≥ β, typ*(B₂) ≥ [y : β]C. By ind. hyp. ⊢ B₁* and ⊢ B₂*. By (1), (2) and CR typ(B₁*) ↓ β* and typ*(B₂*) ↓ [y : β*]C*. So by CR again typ(B₁*) ≥ γ, typ*(B₂*) ≥ [y : γ]D for some γ, D. So ⊢ (B₁*)B₂*. □

3.2.4. Corollary.

x ∈ α ⊢ B, ⊢ A, typ(A) ↓ typ(x), typ*(A) ↓ typ*(x) ⟹ ⊢ B[A].

Another consequence of (1) is PT1(A) for correct A, i.e.

⊢ A, A >₁ B ⟹ typ(A) ↓ typ(B).

Proof. Assume for definiteness that >₁ is disjoint one-step reduction [see the introductory comment to II.8]. The proof is by induction on the length of A. For example:
3.3. Heuristic considerations
3.3.1. At first sight SA, PT1 and correctness of types seem to give a good starting position for proving CL. In a way this is true: we only have to find the right induction and the right induction hypothesis. Let us first try to prove CL1(A) by induction on the length of A, or rather by induction on the relation "being a subexpression of", for short: by induction on subexpressions. We interpret CL1 in terms of disjoint one-step reduction. For the appl. case of inside reduction the ind. hyp. is not strong enough; we additionally need P*T1. So instead we try to prove CL1 and P*T1 together, again by induction on subexpressions. Now everything is all right with the inside reductions, but with outside β1 we still come in trouble: A ≡ (A₁)[x : α]A₂, SA gives typ(A₁) ↓ α, but in view of the previous section we also want typ*(A₁) ↓ typ*(α).
3.3.2. So let us see under what conditions we might prove this typ*-requirement. First notice: if we knew CL already, then we could use PT1 to prove PT (for correct expressions), e.g. by induction on ≥. The induction step runs as follows: let ⊢ A, A ≥ B ≥ C. By CL we get ⊢ B and by ind. hyp. typ(A) ↓ typ(B) ↓ typ(C), whence by CR: typ(A) ↓ typ(C), q.e.d. An alternative proof of PT(A) from CL works by induction on the reduction tree of A (by virtue of SN(A)), for short: by induction on reducts. Viz. let ⊢ A, A ≥ C. If A ≡ C then typ(A) ≡ typ(C). Otherwise for some B, A >₁ B ≥ C. By PT1 typ(A) ↓ typ(B), by CL ⊢ B and by ind. hyp. typ(B) ↓ typ(C), so by CR typ(A) ↓ typ(C).

3.3.3. Further from PT we can prove P*T, or rather:
⊢ A, ⊢ B, A ↓ B ⟹ typ*(A) ↓ typ*(B),

by induction on degree(A) + degree(B), as follows. If degree(A) = 1, then degree(B) = 1 too, so typ*(A) ≡ A ↓ B ≡ typ*(B). Otherwise degree(B) ≠ 1 either, so we can apply PT to A and B. By CR we get typ(A) ↓ typ(B), by correctness of types ⊢ typ(A), ⊢ typ(B), so by the ind. hyp. typ*(A) ↓ typ*(B), q.e.d.

An alternative proof of P*T from CL and PT is by induction on →, the order generated (as in V.4.4.11) by

(1) "being a proper reduct of",
(2) "being the typ of".

So the induction on → includes the induction on reducts mentioned before. That → is indeed well-founded will become clear in the sequel. The proof looks like this. Let ⊢ A, let A ≥ B. By CL ⊢ B and by PT typ(A) ≥ F ≤ typ(B). By correctness of types ⊢ typ(A), ⊢ typ(B), and by the ind. hyp. typ*(A) ↓ typ*(F) ↓ typ*(B), and by CR typ*(A) ↓ typ*(B).
3.3.4. In Section 3.2.2 we announced to prove CL from CL1 by induction on ≥. However, this can be interpreted in two ways:

(1) to prove ⊢ A, A ≥ B ⟹ ⊢ B by induction on A ≥ B, i.e. on the number of reduction steps between A and B;
(2) to prove CL(A) by induction on the reduction tree of A, i.e. by induction on reducts.

Both inductions work, but the second one has an advantage: we just need CL1(A), but can freely use CL(B) in the course of the proof, for each proper reduct B of A!
3.3.5. Now it becomes plausible to try and prove CL(A) directly by an induction on ⪰, the order generated by → (3.3.3) and by sub. In this way we combine the induction on subexpressions (3.3.1, for the "inside" cases of CL1), on reducts (3.3.2, to prove PT), and on → (3.3.3, to prove P*T). In order to make the induction work we need the well-foundedness of ⪰ on the correct expressions, i.e. the so-called big tree theorem BT. Section 3.4 contains the proof of CL as sketched above, assuming BT. Section 4 is devoted to the proof of BT.

3.4. The actual closure proof

3.4.1. Definition of →. → is the reflexive and transitive relation generated by

(1) A → typ(A),
(2) A ≥ B ⟹ A → B. □
3.4.2. Definition of ⪰. ⪰ is the reflexive and transitive relation generated by

(1) B sub A ⟹ A ⪰ B,
(2) A → B ⟹ A ⪰ B. □
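The two definitions above determine, for each expression, a set of immediate ⪰-successors: its proper subexpressions, its one-step reducts, and its typ. The sketch below is our own illustration; the three callbacks are assumptions standing in for sub, one-step reduction and typ of the actual system. It walks this successor structure and counts nodes, which terminates precisely when the big tree is finite, i.e. under BT:

```python
def big_tree_size(expr, subexprs, reducts, typ_of, limit=10_000):
    """Count nodes of the big tree of `expr`: successors are the proper
    subexpressions, the one-step reducts, and typ (when defined).
    Recursion terminates exactly when the big tree is finite; `limit`
    merely guards this sketch against runaway recursion."""
    if limit <= 0:
        raise RecursionError('big tree larger than limit')
    total = 1
    successors = list(subexprs(expr)) + list(reducts(expr))
    t = typ_of(expr)
    if t is not None:            # degree-1 expressions have no typ
        successors.append(t)
    for s in successors:
        total += big_tree_size(s, subexprs, reducts, typ_of, limit - 1)
    return total
```

In a degenerate toy instance where an "expression" n has no subexpressions or reducts and typ(n) = n - 1 down to 1, the big tree of 3 has the three nodes 3 → 2 → 1.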
3.4.3. The big tree of an expression A is the reduction tree of A w.r.t. the extended reduction relation ⪰. Throughout 3.4 we assume the big tree theorem BT, which states that ⪰ is well-founded on the correct expressions (and, hence, that their big trees are finite).

3.4.4. Lemma. Let ⊢ A, CL(A). Then PT(A) (degree(A) ≠ 1).

Proof. As in 3.3.2, e.g. by ind. on reducts, using PT1 and CR. □
3.4.5.1. Define: CL⁺(A) :⟺ (A → B ⟹ ⊢ B).

3.4.5.2. So CL⁺(A) ⟹ CL(A).
3.4.6. Lemma. Let ⊢ A, CL⁺(A). Then P*T(A).

Proof. By BT we can use induction on →. Let A ≥ B. If degree(A) = 1 then degree(B) = 1 too and there is nothing to prove. Otherwise degree(B) ≠ 1 either, so by the previous lemma PT(A), i.e. typ(A) ≥ F ≤ typ(B). By CL⁺ and correctness of types ⊢ typ(A), ⊢ typ(B), and by the ind. hyp. typ*(A) ↓ typ*(F) ↓ typ*(B). Now use CR. □
+
CL(A).
Proof. By BT we can use induction on ⪰. Let ⊢ A, A ≥ B. If A ≡ B then there is nothing to prove. Otherwise A >₁ C ≥ B with C a proper reduct of A. We want ⊢ C. The interesting cases are:
(1) A ≡ (A₁)A₂, C ≡ (C₁)C₂, ⊢ A₁, typ(A₁) ↓ α, ⊢ A₂, typ*(A₂) ≥ [x : α]β, A₁ ≥ C₁, A₂ ≥ C₂. By ind. hyp. ⊢ C₁, ⊢ C₂. By PT₁ typ(A₁) ↓ typ(C₁), so by CR typ(C₁) ↓ α. Now by the ind. hyp. we can assume CL⁺(A₂), so P*T(A₂) and typ*(A₂) ↓ typ*(C₂), and by CR typ*(C₂) ↓ [x : α]β, q.e.d.
(2) A ≡ (A₁)[x : α]A₂, ⊢ A₁, ⊢ [x : α]A₂, typ(A₁) ↓ α. By ind. hyp. we can assume CL⁺(A₁), CL⁺(α), so typ*(A₁) ↓ typ*(α), and by substitutivity (3.2.4) ⊢ A₂[A₁] ≡ C, q.e.d. □

VII.4. Proof of the Big Tree Theorem

4.1. Introduction
For the definition of the extended reduction relations → and ⪰ we refer to Sec. 3.4. Both definitions make use of typ, so → and ⪰ are only defined on pretyped expressions, i.e. expressions with a context. Notice: taking subexpressions often requires extension of the context.
The language theory of Automath, Chapter VII (C.5)
The big tree of an expression A is its reduction tree w.r.t. ⪰, i.e. the branches of the tree are the proper ⪰-reduction sequences of A. We define
BT(A) :⇔ A has no infinite proper ⪰-reduction sequences.
The big tree is finitary, so:
BT(A) ⇔ the big tree of A is finite.
In this Section VII.4 we prove the big tree theorem BT:
(BT) A normable ⇒ BT(A).
So BT states that on the normable expressions ⪰ is well-founded, i.e. that ⪰-SN holds. de Vrijer [de Vrijer 75 (C.4)] introduced ⪰ and big trees, and proved BT for a system of normable expressions containing his language AX. Below we give two different proofs of BT. The first (Sec. 4.5) is modelled after the second proof of β-SN (IV.2.5); the second one (Sec. 4.6) uses an idea from de Vrijer's proof (the "bookkeeping pairs") but further follows the first β-SN proof (IV.2.4.4). Actually both proofs deal with a modification ≥βτ of ⪰ which is somewhat easier to handle and gives rise to even bigger trees (Sec. 4.4.2). For simplicity we start with a system without constants, and take just β-reduction for the ordinary reduction involved in → and ⪰. Later (5.2, 6.2, 6.3) BT will be extended to cover the remaining cases.
4.2. Heuristics 1
After de Vrijer we also call → and ⪰ rt-reduction and rst-reduction respectively, with r for ordinary reduction, s for subexpression, t for type. Similarly we speak about r-reduction (i.e. ordinary ≥), s-reduction (A s-reduces to its subexpressions), t-reduction (A t-reduces to typ(A)) etc., and their combinations. The meaning of rs-SN, st-SN etc. and of ϑrs (the length of the rs-reduction tree of an rs-SN expression) etc. will be clear. We want BT, i.e. rst-SN for the normable expressions. Let us summarize what SN-results we know already:
(1) r-SN. This is ordinary β-SN as proved in IV.2.4 for the normable expressions.
(2) s-SN and t-SN. s-reduction decreases the length of expressions, t-reduction decreases the degree of (pretyped) expressions.
(3) rt-SN. This was proved for correct expressions in V.4.4.11. The same induction ((1) on degree, (2) on ϑr) applies to all degree-norm correct expressions: taking typ decreases the degree, r-reduction preserves the degree.
D.T. va.n Daalen
602
(4) rs-SN. Provable for the normable expressions by induction on (1) ϑr, (2) length of expression. In fact the induction used in the proof of the square brackets lemma SQBR (IV.2.4.3), and in several β-SN proofs as a subordinate induction (IV.2.4.4, IV.2.5.3), is just induction on the rs-reduction tree.
(5) st-SN. Can be proved by induction on the definition of pretyped [i.e. "typable"] expressions.
Clearly these inductions fail for full rst-SN: s-reduction can increase the degree, r-reduction generally increases the length of expression, and taking typ can increase both the length of expression and the length of the r-reduction tree. Besides, on the normable expressions r-reduction does not preserve the degree.
4.3.1. Norm properties
From IV.2.1 we recall some properties of the norm μ and of the normable expressions. We write A <μ B for: μ(A) is shorter than μ(B).
(1) (A)B normable ⇒ (A)B <μ B and A <μ B.
(2) A normable ⇒ μ(typ(A)) ≤ μ(A).
(3) μ(x) = μ(A), B normable ⇒ μ(B[x/A]) ≡ μ(B).
(4) A ≥ B, A normable ⇒ μ(B) = μ(A).
(5) B sub A, A normable ⇒ B normable.
Properties (2), (4), (5) make that the normable expressions are closed under ⪰ and that → preserves the norm.
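As an illustration of these properties, here is a minimal sketch of a Nederpelt-style norm computation (encoding and names are mine, not the paper's; norms are "t" for τ and ("f", ν₁, ν₂) for [ν₁]ν₂):

```python
# Toy terms: ("tau",), ("var", x), ("app", A, B) for (A)B,
# ("abs", x, a, B) for [x:a]B.  nctx maps a variable to its norm mu(x).

def norm(t, nctx):
    """mu(t), or None when t is not normable (a sketch, not the full theory)."""
    if t[0] == "tau":
        return "t"
    if t[0] == "var":
        return nctx[t[1]]
    if t[0] == "abs":
        x, a, b = t[1], t[2], t[3]
        na = norm(a, nctx)
        nb = norm(b, {**nctx, x: na})
        return None if na is None or nb is None else ("f", na, nb)
    # application (A)B: mu(B) must be [mu(A)]v, and then mu((A)B) = v
    na, nb = norm(t[1], nctx), norm(t[2], nctx)
    if na is None or nb is None or nb == "t" or nb[1] != na:
        return None
    return nb[2]

def norm_size(v):
    """length of a norm, for the ordering <mu of 4.3.1."""
    return 1 if v == "t" else 1 + norm_size(v[1]) + norm_size(v[2])
```

For example, [x:τ]x has norm [τ]τ; the application (τ)[x:τ]x has the strictly shorter norm τ (property (1)), and its β-reduct τ has the same norm (property (4)).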
4.3.2. BT-conditions
Similarly to the SN-conditions in IV.2.4.1 we can formulate necessary and sufficient BT-conditions:
(1) BT(x) ⇔ BT(typ(x)).
(2) BT([y : B₁]B₂) ⇔ BT(B₁), BT(B₂).
(3) BT((B₁)B₂) ⇔ BT(B₁), BT(B₂) and (B₂ → [y : β]C ⇒ BT(C[B₁])).
Proof. We just give the ⇐-part of (3). Let BT(B₁), BT(B₂) and B₂ → [y : β]C ⇒ BT(C[B₁]). B₂ is rst-SN, so rt-SN, so we can use ϑrt(B₂). B₁ is rst-SN, so r-SN, so we can use ϑr(B₁). Using induction on ϑr(B₁) + ϑrt(B₂) we prove that all one-step rst-reducts of (B₁)B₂ are BT. Distinguish:
(i) D sub (B₁)B₂, so D sub B₁ or D sub B₂, so BT(D).
(ii) B₂ >₁,β D or D ≡ typ(B₂). We have BT(B₁), BT(D) and D → [y : β]C ⇒ BT(C[B₁]). Apply the ind. hyp. to (B₁)D; this gives BT((B₁)D).
(iii) B₁ >₁,β D. Apply the ind. hyp. to (D)B₂.
(iv) B₂ ≡ [y : β]C. Then by assumption BT(C[B₁]). □
4.3.3. Heuristics 2
If BT(B₂), B₂ → [y : β]C then clearly BT(C). So BT-condition (3) above suggests as a main step in proving BT the substitution theorem for BT: BT(A), μ(x) = μ(A), BT(B) ⇒ BT(B[x/A]). Indeed, if we knew this theorem, we could simply proceed by induction on pretyped expressions and get BT. The similarity with the situation around β-SN suggests us to use SQBR (IV.2.4.3), for → instead of ≥: if B* → [y : β]C then either
(1) B → [y : β₀]C₀ with β₀* → β, C₀* → C, or
(2) B → (B̄)x, ((B̄)x)* → [y : β]C, where * stands for [x/A].
However, the following counterexample shows that this lemma is wrong: take B ≡ (B₁)[z : γ][y : β](z)x, A ≡ [u : μ]..u..u. Then B* → [y : β*[B₁*]]..B₁*..y.., but B → [y : β[B₁]](B₁)z, and ((B₁)z)* → ..B₁*..B₁*.. .
4.4. βτ-reduction
4.4.1. One point which makes SQBR break down for → is that not:
B → C ⇒ B[x/A] → C[x/A].
Example: B ≡ x, C ≡ typ(x), and the only connection between x and A concerns their norms (not their typ's). The other substitution property, A → A′ ⇒ B[A] → B[A′], does not hold either, due to the lack of monotonicity clauses in the definition of →. Example: A → typ(A) but not ...A... → ...typ(A)... .
4.4.2. Now we introduce βτ-reduction by adding these monotonicity rules to the definition of →. What we get is a reduction in the usual sense, in that a one-step reduction consists of replacing a subexpression (redex) by another expression (contractum). The redices are here of two kinds:
(1) β-redices, which contract as usual.
(2) τ-redices: variables x, which contract according to x >τ typ(x).
We use the same terminology as before [see the comment to II.8]: ≥βτ, >₁,βτ, >βτ etc., τ-SN, βτ-SN, ϑβτ etc. Now ≥βτ satisfies the second substitution property (above) indeed, but the first one is still not valid (same counterexample). Just like → and ⪰, ≥βτ is only defined for pretyped expressions. Formally, we ought to speak about "≥βτ w.r.t. context ξ", and the monotonicity rule for abstraction expressions then would read:
if B₁ ≥βτ C₁ w.r.t. ξ and B₂ ≥βτ C₂ w.r.t. (ξ, y ∈ B₁), then [y : B₁]B₂ ≥βτ [y : C₁]C₂ w.r.t. ξ.
4.4.3. We are going to prove PT-SN and then conclude BT from the Theorem. PT-SN(A) =+ BT(A).
Proof. Let @SN(A). Using induction on (1) d p T ( A ) ,( 2 ) length of A we proof 0 that all one-step rst-reducts of A are BT. So A itself is BT. 4.4.4. PT-SN conditions These are quite similar to the BT conditions. The only non-trivial modification concerns the application case.
Proof. As in 4.2.3 but now we use induction on dp,(81)
+6pT(&).
0
Just like st-SN (see 4.2 (5)) we can prove TLet C contain subexpressions A = [z : a] ..2 .., I? = [y : P l y . . . Then A >r A’ = [z : a] ..a.., r >r [y : P ] . . P . . and we want a common 7-reduct of ... A’.. I? ... and ... A ..I” ... . As in 11.8.2 we consider all the possible cases. Generally the reductions simply commute: ... A’ ..r ... > T ... A’ .. I? ... < r ... A ..I?‘ ... . In case the specific z occurs in P or the specific y occurs in a then two .r-steps are needed, e.g. [y : ..z . . ] .. y .. > r [y : ..a . . ] .. y .. >T [y : .. ..I..(.. ..) .. < s < s [y : ..2 . . ] ..(..z..) ... Anyhow the weak diamond property holds for > r , so by T-SN we get T-CR, and uniqueness of T-normal form.
4.4.5. Something on SN. Further we verify T-CR:
4.4.6. This gives an easy way of reaching a PT-normal form: first r-normalize then &normalize. Notice: the norm properties guarantee that Lor preserves the norm of normable expressions. 2 p and r Tdo not commute, but we still can get PT-CR for the normable expressions, as follows. For norms v we define a p-r-normal expression v*:
(1)
T*
7,
The language theory of Automath, Chapter VII (C.5)
605
E [x : v;]v;. (2) ([v1]v*)*
Now we can prove
A normable
+
A >pT (p(A))*
by ind. on the definition of p. This gives Pr-CR and uniqueness of Or-normal form. The procedure above assures the existence, so for normable A we can speak of PT-nf ( A ) . In fact v* is Nederpelt’s original representation of the norm v.
4.5. First proof of Or-SN; a correction to IV.2.5.3 4.5.1. In view of 4.4.4 it seems reasonable to concentrate on the substitution theorem for Pr-SN: A Pr-SN, B Pr-SN, p ( x ) = p ( A ) + B[A]Pr-SN.Just like with --)I SQBR fails for >p7, so we rather let us inspire by the second proof of 0-SN (IV.2.5.3). In fact we also take the occasion to indicate (and repair) a flaw in that proof, concerning the distinction between replacement and substitution. 4.5.2. When defining substitution we have assumed the concept of literary replacement t o be understood. Substitution amounts to replacement with precautions, viz. that no clash of variables takes place, and substitution can also be considered a special case of replacement. Now let us see what went wrong in IV.2.5.3 (and also in IV.2.6.2). Essentially we wanted to replace a specific subexpression A in C by another expression A’, thus producing C’. We had the idea that this replacement of A with A‘ could be performed via substitution for a new “fresh” variable y, such that COF .. y .., C = CO[ y / A ]C’ , = CO[y/A‘].However, this is wrong: possible bound variables of C, which become free in A, can never get the appropriate bindings in CO[ y / A ] . What we need here is literary replacement (LR) of y with A and A’ resp. We is the result of literary replacing all free introduce a new notation: B[z/A]LR occurrences of x in B by A. 4.5.3. Below we follow the general idea of IV.2.5.3, but instead of using a substitution theorem for SN, we use the - stronger! - replacement theorem - as we ought to have done there (and in IV.2.6.2) too. The easiest way is to use “multiple” replacement, i.e. replacement with a set of expressions. Notation: B [ [ z / a l ~ where ~ , a is a set of expressions, is the set of expressions which result from B by (literary) replacing all free x in B by an expression A E a, but possibly different A’s for different occurrences of x (compare multiple substitution, in 11.10).
606
D.T. van Daalen
4.5.4. The monotonicity of >pr makes the replacement property work:
A 2 p r A’
*
B[AILRLPTB[A’]LR
provided A has been put in the appropriate extended context. We make this slightly more explicit. Let A be an occurrence of a subexpression in C. The context of A in C can be defined by induction on the length of C. Intuitively speaking, it consists of all the assumptions z E Q, which one encounters (in the form of abstractors [z : a ] )when scanning C from “left to right” until one arrives at A. The crucial clause in the definition is of course: if E is the context of A in C2 then (z E E l , [ ) is the context of A in [z : El]&. Now the context of A in the replacement property must provide all free variables of A with the same typing as they get when A is inserted in B. E.g. we can take (E, 7 0 ) where E is the context of B and 70 is the intersection (in the sense of context inclusion sub, of V.2.6) of all the 7’s which are the context of a free occurrence of z in B. We define p(A) to be the set of or-reducts of A. Then, again if A has been put in the right context,
*
C E B [ [ E / ~ ( A ) ~ L BR [[z/P(A)]]LR Ipr C
.
The other replacement property B 2 p r C + B* >pT C*,where * stands for [x/A]LR is still not generally valid, but we have a restricted version. Lemma. I f A >pT t y p ( z ) and B I p , C then B* Lp, C*.
4.5.5.
Proof. Ind. on >or. E.g. if B > I , ~C, B = ...z ...z ..., C then B* = ... A ... A ... >pr ...t y p ( x )...A ... 5 C’. Corollary. B* Pr-SN, A >pT t y p ( z )
= ...t y p ( z ) ... x ...,
* B or-SN.
Proof. Use ind. on (1) %r(B*), (2) length of B*. E.g. inspect the Pr-SN conditions.
0
4.5.6. Now we are ready for the Pr-SN proof.
Replacement theorem for Br-SN. Let * denote [ z / p ( A ) l ~ Let ~ .B normable, p(z) = p ( A ) ,A, B PT-SN. Then C E B*
3
C PT-SN
provided A has the right context. Proof. By induction on (I) p(A), (11) 2YpT(B),(111) the “capacity” of the transition from B to C, i.e. the sum of the 19pT’s of the reducts of A inserted in B. Now consider a single reduction step C >l,pr D. We distinguish:
The language theory of Automath, Chapter Vll (C.5)
607
(1) this reduction step concerns an old redex, i.e. a redex already present in B , (2) this step concerns a new redex. The latter are of two kinds: (2a) multiplied redices, i.e. redices inside an inserted reduct of A, (2b) newly composed redices. All T-redices fall under case (1)or (2a) and the P-redices are classified as before, so the only possibility of case (2b) is as follows: B = ... z... ( B 1 ) z..., C = ...A1 ... (C i ) [ y : y ] E..., D E ...A1 ...E[C1]..., where C1 E B f , A >pT A1, A 207 19: TIE. In case (1)and (2a) the replacement and the reduction commute, i.e. B > DO, D E 0:. To be precise, let (Cl) [y : r]Cz be an “old” redex, i.e. (B1)[y : P]B2 c B , C1 E Bf, c2 E B;. Then D = ... c2[C1]... E (...Bz[B1]...) [ ~ / ~ ( A [ & ] ) ] L R , and not simply D E D:. Then we get PT-SN(D) by ind. hyp. I1 (case (1))or I11 (case (2a)). Now we tackle case 2b): create a new variable z and form BO by replacFor . simplicity we put ing the intended (B1)z by z. so B = B o [ z / ( B 1 ) z ] ~ R typ(z) PT-nf ((B1)z), so p ( z ) = p((B1)z) and PT-SN(BO)- by 4.5.5. Then we form Bh E BZ by replacing the remaining free z’s of BOwith the appropriate reducts of A, i.e. the same as used in the formation of C, and finally replace the z of Bh by E[C1]. This gives us D = Bh[z/E[C1]]LR back. Informally: BO = ...z ... E ..., Bh = ... A1 ... z ..., D ...A1 ... E[C1].... Either by ind. hyp. I1 or 111 we get P.r-SN(C1). Further PT-SN(A)soPr-SN([y : y ] E )so PT-SN(E). By normability B1
= p ( A ) , A,
Corollary 2. B normable + B PT-SN (see
B PT-SN
+
B[A] 0
4.4.4).
0
Corollary 3. B normable + BT(B) (as in 4.4.3).
0
4.6. Second proof of PT-SN 4.6.1. Bookkeeping pairs, r-expansion and n-reduction. [This x-reduction is
similar, but not equal, to the 7r-reduction in Chs. II and VIII.] 4.6.1.1. Assume that A l TB , i.e. B results from A by successively replacing variables z by their type typ(z). Alternatively we can work backwards from
D.T. van Daalen
608
7-nf ( A ) ,by successively replacing newly created subexpressions by the original variable. In general it is of course not possible to retrace which subexpressions are newly created, and from which variable they stem, unless we store this information somewhere inside the expression! Following [de Vrijer 75 (C.4)] we use a new pairing operation [...,...I for this kind of bookkeeping. Definitions:
(1) If A , B are expressions then [ A , B1 is an expression. ( 2 ) If A , B are [-expressions then [ A ,B ] is a &expression.
( 3 ) If A , B are normable, p ( A ) = p ( B ) then p( [ A , B1) G p ( A ) . For the rest the definitions of pretyped and normable expressions are unaltered. The notions of subexpressions and substitution are extended in a straightforward way. As a new monotonicity rule, for each kind of reduction, we can have, e.g. A > A’, B > B’ =s [ A ,B ] > [A‘,B’]. 4.6.1.2. Now the alternative way of producing B from A (above) can be described as follows:
(1) First provide all variables x successively with a copy of their type, i.e. replace x by [ x ,t y p ( x ) l and so on.
(2) Then for some of these pairs simply restore the lefthand part, and for the rest pick the righthand part. In the process (1)the 7-expansion of A , T-exp(A), is constructed, i.e. each x of A is replaced by [ x ,T - e x p ( t y p ( z ) ) l . The process ( 2 ) we describe in terms of a projection reduction (n-reduction l T ) . Definitions: (1) The T-exp of pretyped expressions is defined inductively:
(i)
T-exp(x) = [ x ,T - e x p ( t y p ( x ) ) l .
(ii) 7-exp((A) B ) = (7-exp(A)).r-exp(B). (iii) T-exp([z : a ] B )= [ x : . r - e x p ( a ) ] ~ - e x p ( B ) . (iv) .r-exp( [ A , B1)= [T-exp(A),.r-exp(B)l. (2) (i) One step n-reduction >I,= is generated from n-contraction: [ A ,B] >I,* A, [ A , B ] >I,= B by the monotonicity rules. (ii) n-reduction
>T
is the transitive and reflexive closure of
>I,=.
The language theory of Automath, Chapter VII (C.5)
609
4.6.1.3. Remark: Formally we should have defined the 7-expansion of expressions w.r.t. their context, notation [-T-exp(B). The abstr. case of the definition then becomes:
[-T-exp([z : a ] B ) =
=
[z : ( [ - f - e x p ( a ) ) (([, ] z E a)-7-exp(B))
4.6.1.4. The point of this alternative approach of
A
>,
B
r-exp(A) 2, B
=S
.
>,,
making use of
(see 6.2.2)
>,
>,
is that is definitely easier to handle than >, roughly because does not depend on the context, and that 20,-reductions of a n expression can be simulated by &,-reductions of its r-expansion. Our proof below consists of two parts: first we show that Pa-SN implies 07SN, then we prove the SQBR lemma for 20, and Pa-SN.
4.6.2. Pa-SN implies Pr-SN. > I , ~B
4.6.2.1. Lemma. A Proof. Ind. on
*
~ + x p ( A )2, T-exp(B) (in fact 5'-1,,).
>I,~:
(i) r-contraction, A = z, B G t y p ( z ) . Then .r-exp(A) >I,, T-exp(typ(z)) = r-exp(B).
= [z,r-exp(typ(z))l
(ii) Monotonicity, e.g. A = [z : A l ] z , B [z : Bl]z,A1 > I , ~B1: By ind. hyp. r-exp(A1) 2, r-exp(BI), so r-exp(A) = [z : r-exp(A1)] [z, r-exp(A1)l [z : ~ - e x p ( B 1 ) [z, ] T-exp(B1)l = T-exp( B ) . 0
>,
4.6.2.2. Corollary 1. A 2, B Corollary 2. A
>T
B
=S
*
r-exp(A)
r-exp(A) 2, .r-exp(B). >T
0
B (because .r-exp(B) 2, B).
0
4.6.2.3. Lemma. Let A be a (-expression, let B be a (E, z E a, q)-expression. Let I and I' stand for [z/A]and [ x / ~ - e x p ( A )resp. ] Then
7-exp(B)11
>,
T-exp(B')
with r-exp(B') taken w.r.t.
E , #.
Proof. Ind. on the definition of .r-exp(B):
(i)
-r-exp(x)" T-exp(xI).
= [ x ,.r-exp(a)]'I = [r-exp(A),.r-exp(a)l
>, T-exp(A)
=
D.T. van Daalen
610
4.6.2.4. Corollary. Let A be a <-expression, B is a (<,xE a)-expression. 0 Then r-exp(B )[ x / r-exp(A)] 2, .r-exp(B [ x / A ] ) . 4.6.2.5. Corollary. A 5 1 , p B Proof. Ind. on
+
r-exp(A) 51,p 2, r-exp(B).
51,~:
(i) P-contraction, A = ( A l )[ x : Az]A3, B A3[A1],r-exp(A) 51,p r-exp(A3) [x/r-exp(Al)]Zn r-exp(A3[A1])= T-exp(B), by 4.6.2.4. (ii) Monotonicity, e.g. A = [Al,A21, B [BI,& I , A1 s 1 , p B1, A2 5 1 , p B2. BY ind. hyp. r-exp(A) G [r-exp(Al),r-exp(Az)l 5 1 , p 2, [r-exp(Bl),r-exp(Bz)l = r-exp(B). 0
4.6.2.6. Theorem. r-exp(A) Pr-SN
+
A Pr-SN.
Proof. Let r-exp(A) be Prr-SN, use ind. on 19pv(r-exp(A)).If A >1,p B then r-exp(A) 3 1 , p In r-exp(B) (by 4.6.2.5), so by ind. hyp. Pr-SN(B). 0 Similarly, if A > I , ~ B then Pr-SN(B). So A is Pr-SN. 4.6.3. The proof of ,&-SN 4.6.3.1. The normable expressions are closed (and norms are preserved) under Zp T. Further & satisfies both substitution properties (see 4.4.1).Notice that 2, does not satisfy CR but that ,B and rr commute (use nested one step reduction 5 1 ,[compared ~ to disjoint 1-step reduction 51, nested 1-step reduction 51 has the extra clause: A > A', B > B' + ( [ A ,B] > A' and [A,B] > B ' ) ] )and that weak rrP-postponement holds: A >pr B + A >,>p C 5, B. 4.6.3.2. PPSN conditions These are again quite similar to the P-SN conditions. The interesting clauses are: (1) A Prr-SN, B Prr-SN
+
[x : A ] B and [A,B1 Prr-SN.
( 2 ) A Prr-SN, B PT-SN and ( B Pn-SN.
[ x : a]D
+
D[A]Prr-SN)
+
(A)B
So, again, we want the substitution theorem for Prr-SN. 4.6.3.3. Square brackets lemma for 2 p n . Let B be Prr-SN. Let * stand for [x/A].Let B* >pT [y : PIC. Then either
The language theory of Automath, Chapter VII ((7.5)
(1)
611
B I p , [Y:PoICOwith& I p T P, C$ >p, C , or
(2) B
>pn
Z)* 2 [Y PIC. (Bk) (B1)2, ((2)
Proof. AS in IV.2.4.3, by induction on (I) lilp,(B), (11) the length of B. The new case is [Bl,Bz],B* G [Bi,B,*l. Then either B,* >p, [ y : PIC 0 or B,* &, [ y : PIC, and we can apply ind. hyp. I to B1 or B2. Remark: An alternative proof is provided by Barendregt’s lemma, which is still valid for >p, (see 11.11.3.5).
4.6.3.4. Substitution theorem for Pr-SN. Let B be normable, p(x) = p ( A ) , A and B are ,hr-SN. Let * stand for [ x / A ] . Then B* Pr-SN. Proof. As in IV.2.4.4, by ind. on (1) P ( A ) , (11) %r(B), (111) length of B. [ B l ,B21, B* The new case concerns B Pr-SN by ind. hyp. I1 so B* is Pr-SN.
E
[Bi, €I,*]. Both Bf and B; are
4.6.3.5. Corollary. B normable =$ B Pr-SN.
0
0
4.6.3.6. Notice that the r-expansion of normable A is again normable, so A normable T-exp(A) normable. Corollary. A normable 3 A Pr-SN ( b y 6.2.6). 0
*
Corollary. BT.
0
VII.5. Closure and Church-Rosser for A v 5.1. Introduction 5.1.1. Here we consider the constant-less part of Aq, defined as in Sec. 2.12, but with 2 standing for Pq-reduction. It is easy t o derive a strengthening rule (Sec. V.1.6) for such an algorithmic system, so q-CL does not cause major difficulties. The problems with closure for Aq, as compared to A, are rather due t o the fact that CL and CR appear to be heavily interwoven. Namely, a proof of CL (see, e.g., VII.3) seems to make quite essential use of CR, while in turn we seem to need CL in the course of the CR-proof - because Pq-CR holds for correct expressions only. The solution is of course to prove CR and CL (and a number of other properties) simultaneously, by induction on big trees. In Sec. 5.2, below we prove indeed that BT extends to the present situation.
612
D.T. van Daalen
5.1.2. We introduce some notation that enables us to make the structure of the proof more explicit. Here 5 is as in VII.3.4. Definition. If P is a property of expressions then P* and Po' are given by (1) P*(A) :-+ A 5 B
=$
P(B).
(2) Po(A) :-+ (A properly &-reduces to B )
+
0
P(B).
Using this notation, we can express our induction step by
F A , CR;(A),
CLg(A)
+
CR(A), CL(A)
for which, of course, it is sufficient to prove
t- A , CRg(A), CL;(A)
+ C R ~ ( A ) ,C L ~ ( A.)
The properties SA, PD, PT and P*T from 3.1 play again a role in the proof, and further property SC, substitutivity of correctness, here defined by S C ( B ) :-+ .( E a B, t- A,tYP(A) 1tYP(.),tYP*(A) 1tYP*(.) I- B[AI).
*
5.1.3. Now the proof below is organized as follows. First we present some preliminary facts, among which Pq-BT (Sec. 5.2), strengthening and Q-PT (Sec. 5.3). Section 5.4 contains the actual closure proof. First we assume t- A, CR;(A), CL;(A), and prove SA(A) and PD(A) (in Sec. 5.4.1), PTl(A), SC(A) and CR1(A) (in Sec. 5.4.2-5.4.4) respectively by a separate induction on big trees, and by simple induction on length. Then we complete the proof by proving PT(A), P*T(A) and CL1(A) simultaneously, by induction on the big tree of A again. 5.2. Extension of BT t o the Pq-case 5.2.1. A postponement result Let &,, and 2p7,, be the straightforward extensions of Z7 and >p7, as defined in 4.4.2. Mere verification shows that
A pretyped, A
>1,,,>1,~ B 3
A
%,, B
>I,~
whence 777-postponement:
A pretyped, A 2,,7 B
A
&>,,
B.
Combining this with Pq-pp [P~ppostponernent] we get
A pretyped, A >-p7,, B
+
A 2p7>,, B
.
5.2.2. Pqr-SN and Pq-BT In 4.6.3 we proved Pr-SN, which - [induction o n 79p7], as in [ v a n Daalen 80, 11.7.3.51 - together with (Pr)-q-pp and 7-SN gives us Pqr-SN, for normable expressions. Then Pq-BT follows, as in 4.4.3.
The language theory of Automath, Chapter VII (C.5)
613
5.3. Some simple facts 5.3.1. Strengthening If B is a (t,x E a,G E B)-expression, but x $! FV(B) and x $? FV(B), then B is a (t,G E B)-expression as well, and the t y p (if degree(B) # 1) and typ' of B w.r.t. both contexts are syntactically equal (E). So, by induction-on the definition of correctness, we get strengthening: if x E a,G E PI- ( B ) ,x fZ FV(B) (and x fZ FV(B)) then G E @I-( B ) - read this twice, with and without the parts concerning B -. As a corollary we have: x E a I- A, x $! FV(A) + I- A, whence q-outsideCL1: I- [x : a](z)A, 2 $? FV(A) I- A.
*
5.3.2. q-PT and q-P*T For pretyped A there holds A
>9
B
* typ(A) tYP*(A)
>9 >9
typ(B) (if degree(A) # 1) , tYP*(B)
Proof. Induction on the length of A. So, induction on 29 gives A 29 B
* tYP(A) 2 9 tYP(B)
(if degree(A) # 1) 1
tYP*(A) 2s tYP*(B) and, a fortiori, we have y P T and q-P*T A
27
B
* tYP(A) 1tYP(B)
(if degree(A) # 1) ,
5.3.3. From 3.2.1 we recall the property of correctness of types I- A =+ I- typ(A)
and the substitution properties from 3.2.2
5.3.4. Property. Let degree(A) = 1, p(A) = [ul]... [ v ~ ] E Then . A 2 [x1 : a11 ... [xk : ak]C. Proof. Induction on the length of A. E.g. let A = (Al)A2, then p(A2) = [ ~ ( A I )[UI] ] ... [a]&, so by ind. hyp. A2 2 [x : P] [xl : a11 ... [xk : a k ] C and 0 A 2 [zl : a:] ... [xk : a;]C', q.e.d.
D.T. van Daalen
614 Corollary. Degree(A) = 1, p(A)
= [v1]v2 *
Corollary. I-' A, A E [z : a]C,A 2 F
+
A 2 [z : a]C.
0
F 2 [z : p]D.
Proof. If A correct, then A normable, so F normable, with
Corollary. I-l A, A E [z : a]C, A
1 F + F 2 [z : p]D.
5.4. The actual closure proof 5.4.1. Lemma. Let I- A, Cg(A), CL;(A). Then PD(A) and SA(A).
Proof. By induction on the big tree of A. (PD). Let A = [z : A1]A2, A 2 [z : B11B2. If A1 2 B1, A2 2 B2, then certainly A1 1 B1. Otherwise A2 2 (z)[z : The latter expression is correct, satisfies CR* and CL*, so we can use SA and get A1 1 B1, q.e.d. (SA). Let A = ( A l ) [ z : A2]A3. Then I- Al, typ(A1) 2 cp, I- [z : AzlA3, typ*([z : Az]A3) = [z : A2]typ*(A3) >_ [z : cp]C. By correctness of types I- [z : Az]typ*(A3), which also satisfies CR* and CL* so we can apply PD and get 0 A2 1 'p, whence typ(A1) I AS, q.e.d. 5.4.2. Lemma. Let I- A, CG(A), CL;(A). Then P T ~ ( A ) .
Proof. Induction on length(A). q-PT1 we know already (Sec. 5.3.2). For Pkmtside-PTl let A (Al) [z : A2]A3. By 5.4.1 typ(A1) 1 A2 and by the substitution property 5.3.3.1) typ(A) = (Al) [z : AzItyp(A3) > 0 typ(A3)[Al] 1typ(A3[A1]), q.e.d. The other cases are immediate.
=
5.4.3. Lemma. Let z E a, G E F B , C q ( B ) , CL:(B), I- A, typ(A) typ*(A) 1typ*(a). We write * for [z/A]. Then ( S C ( B ) ) a E p'* I- B*.
1 a,
Proof. Induction on length(B). The crucial case is: B = (BI) B2, typ(B1) Icp, typ'(B2) 2 [u : 'p]$. By ind. hyp. I- B1, I- B2. We do not know CR or CL for the substitution results, so we use a trick. Distinguish: (1)
B1
does not end in z, then typ(B1)
= typ(B1)"
2
'p*.
(2) Otherwise, let B1 = ...z ...z and form C1 from B1 by just replacing the final z, C1 = ... z...typ(A). Then C1 1 typ(B1) and by CR, C1 1 9. So t y p ( B f ) 3 Cf 1 v*.
Anyhow, in both cases t y p ( B f ) 2 Further distinguish:
'p'*,
with
'p'
1 'p.
The language theory of Automath, Chapter V I I (C.5)
(1) B2 does not end in z, then typ*(Bz)
615
= typ*(Bz)*2 [u : cp*]$*.
...z ... z, C2 (2) Otherwise form Cz from Bz by replacing its final z, B2 ... z... typ*(A) 1 typ*(Bz). Then, by CR (typ*(Bz)), C2 1 [u : cp]$ and, by 5.3.4 C2 2 [u : cp”]$” with, by PD, cp 1 $,”. Now typ*(B,”)= C,” 2 [u : cp’f*] $”* . So in both cases typ*(B,*)2 [u : cp”*]$”*, with cp 1 cp”. Now use CR(cp), this gives cp’ 1 p f f ,whence cpf* 1 cpf’* and typ(B,*)-1 cpIf*. So I- (€3;) B,”,q.e.d. 0 5.4.4. Lemma. Let I- A, CR;(A), CL;(A). Then CRl(A).
Proof. Again by induction on length. The crucial case is the critical &-case: A [X : All (x)[Z : AzIA3, z @ FV(A2). By 5.4.1 SA((Z) [Z: AzIA3) SO A1 1 Az, 0 [z : A1]A3 1 [z : AzIA3, q.e.d. 5.4.5. Lemma. Let I- A, CR;(A), CL:(A).
Then C L ~ ( A )PT(A) , and P*T(A).
Proof. Induction on the big tree of A. (1) (CL1). Let A > B, we must prove t- B. The 7-outside case we know already. Consider, e.g.: A = (Al) [Z: Az]As, B A3[A1]. By 5.4.1 typ(A1) 1 A2. By P*T - ind. hypothesis - we get typ*(Al) 1 typ*(z) as well, so by 5.4.3 we are done. This is P-outside CL1. Or consider: A E (Al) Az, A1 > B1, A2 > B2, B = (B1)B2, typ(A1) 2 cp, typ*(Az) 2 [u : cp]$. By (e.g.) the ind. hyp. we get I- B1, I- B2, typ(A1) 1 typ(B1) and typ”(A2) 1 typ*(Bz). Now use CR, this gives tYP(B1) 1 cp and tYP*(B2) 11. : cpl$. So, by 5.3.4, typ*(B2) 2 [u : cpf]$’I and by 5.4.1 cp 1 cp‘. Finally CR(cp) yields typ(B1) 1 cp’, so I- (B1)B2, q.e.d. The remaining case of CL1 is trivial.
=
(2) (PT). PT1 we know already. Now let A > I B 2 C. By CL1 I- B and by ind. hYP. PT(B), so by CR(tYP(B))l tYP(A) 5. tYP(C), q.e.d. (3) (P*T). Let degree(A) = 1. Then by PT, if A 2 B , typ(A) 2 F 5 typ(B). By C L ~ ( A(this ) implies CL(A))I- B, so by correctness of types, I- typ(A) and k typ(B). Now apply the ind. hyp.: typ*(A) 1 t y p * ( F ) 1 typ*(B) and 0 use CR: typ*(A) 1 typ*(B), q.e.d. 5.4.6. Theorem.
If k A, then CR(A),CL(A).
Proof. By induction on the big tree of A. The ind. hyp. reads CR;(A), CL;(A), and the preceding lemmas produce CR1(A) and CL1(A). As we noticed before, 0 this yields CR(A) and CL(A).
616
D.T. vaa Daalen
5.4.7. Corollary. If k A, then SA(A), PD(A), PT(A), P*T(A) and SC(A). 0
Note: The separate inductions on big trees in 5.4.1, 5.4.5 and 5.4.6 can of course be compressed into a single induction on big trees. 5.4.8.
VII.6. Various equivalence results 6.1. Introduction In V11.2 we introduced A(r]) with and without (definitional) constants. The results in VII.3-5 are derived for the constant-less system. In this section we extend these results in an indirect way to the remaining systems, by showing that, in a certain sense, they can be embedded in the constant-less version. Sec. 6.2 is devoted to primitive constants only. First we give a translation which eliminates the constant-expressions. Then we explain the relations between (a) the system with constants, (b) its image under the translation, and (c) the constant-less system. Afterwards we easily extend our nice properties (CL, CR, BT) to the system with constants. Sec. 6.3 covers the additional extension with definitional constants. In 6.4 we prove another equivalence: between Nederpelt’s single line presentation with abstractor strings Q and our presentation, with contexts E. In this case too, the correspondence is close enough to show that Nederpelt’s original system satisfies the required properties.
6.2. Eliminating primitive constants 6.2.1. The translation ‘ For the system with constants (for short: c-system) we use the notations A ( V ) ~and Fc. Now we define a translation of the c-system into the system without constants. The translation (notation ‘) is characterized by: (1) it transforms constants p into variables p’, (2) it converts constant-expressions p(A1, ...,A h ) into appl. expressions
(4)... (4)P’,
d*
(3) it eliminates schemes y’ E p($ E y one by one from the book by including an additional assumption p’ E [y’ : $17’ in the context,
The language theory of Automath, Chapter VII (C.5)
617
(4) it commutes with the other formation rules (for expressions, strings and contexts). Thus a statement B ;
6.2.2. Why the indirect approach? Below we use the properties of the constant-less system in our proof of the desired correspondence. Afterwards we can extend these properties to the c-system. The point is that the constant-less system is definitely easier to handle. In particular: the fact that the ty p of a constant-expression is constructed by substitution is a complicating factor, because correctness of types is not immediate any more. E.g. by using this indirect approach we would have been able to introduce constants without using degree-norms [as we did in VII.2.31. 6.2.3. The nature of the correspondence For terminology about extensions we refer to V.3.3.2. However, because we study an algorithmic system now, we replace A E B by typ(A) 1 B and A 9 B by A 1 B. Clearly the c-system is an extension of the system without constants. Because typ and remain the same, it is a conservative extension, too. Of course, it is not an unessential one: primitive constant-expressions do not main reduce at all, so they can never be definitionally equivalent to an expression without const ants. Contrarily, the translation ’ maps expression (and contexts), correct w.r.t. B in the c-system, properly [intended is: the translation is not surjective] into the expressions (and contexts), correct w.r.t. B’:expressions (A)p’ that do not have enough arguments in front, i.e. where 1 21 is smaller than the arity of p have no counterpart in the c-system. For the image of the c-system (w.r.t. a fixed book B) under ’, we introduce the notation I-. 1.e.
B'; ξ' ⊢⁻ A'  (resp. B'; ξ' ⊢⁻ A' ∈ B')   iff   B; ξ ⊢c A  (resp. B; ξ ⊢c A ∈ B) .
Then below it will appear that the expressions (and contexts) correct w.r.t. B' in the constant-less system form a conservative extension of the system ⊢⁻. In the presence of η-reduction, it will be definitional (so unessential) too; see Sec. 6.2.9.
6.2.4. Facts about ’ Notice that ’ is a purely “syntactical” matter, which has nothing to do with correctness: pretyped-ness is sufficient.
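Because ' is purely syntactical, it can be sketched in a few lines of code. The following is a toy illustration under my own encoding of terms as tuples (the names and the encoding are not from the text): constants p become variables p', and p(A₁, ..., Aₖ) becomes the application expression (Aₖ') ... (A₁') p'.

```python
def translate(term):
    """Translation ' of 6.2.1 on a toy term syntax (illustrative encoding):
    ('var', x)            variable
    ('con', p, [A1..Ak])  constant-expression p(A1, ..., Ak)
    ('app', A, B)         appl. expression (A) B -- argument A, function B
    ('abs', x, T, B)      abstraction [x : T] B
    """
    tag = term[0]
    if tag == 'var':
        return term
    if tag == 'con':                       # p(A1,...,Ak) -> (A'k)...(A'1) p'
        _, p, args = term
        body = ('var', p + "'")            # the fresh variable p'
        for a in args:                     # innermost applicator is (A'1)
            body = ('app', translate(a), body)
        return body
    if tag == 'app':
        return ('app', translate(term[1]), translate(term[2]))
    if tag == 'abs':
        return ('abs', term[1], translate(term[2]), translate(term[3]))
    raise ValueError(tag)
```

For instance, translate maps p(a, b) to the spine (b)(a)p', with the first argument innermost, matching the order used in (2) of 6.2.1.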
D.T. van Daalen
As a map from statements B; ξ ⊢ A to statements B'; ξ' ⊢ A' the translation is not one-one, but as a map from B-expressions and B-contexts into B'-expressions and B'-contexts it is one-one indeed. For the (partial) inverse we use the notation ⁰:

(A')⁰ := A .
Clearly, (A[B̅])' = A'[B̅'], so A ≥ B ⟹ A' ≥ B', so A ↓ B ⟹ A' ↓ B'. Further typ(A') ↓ typ(A)' — there are only head-βⁱ contractions involved, where degree(A) = i + 1 (for the definition of head-reduction see V.4.4.5, for i-reduction see V.3.3.3). And typ(A') = φ' for some φ. If there is no η-reduction then we have

(1) A' ≥ B ⟹ A ≥ B⁰ and B⁰' ≡ B, so
(2) A' ≥ B' ⟹ A ≥ B, and
(3) A' ↓ B' ⟹ A ↓ B.
6.2.5. ' and η-reduction With η-reduction, (1) above does not hold any more: ([x : α]p(Aₖ, ..., A₁, x))' ≡ [x : α'](x)(A₁') ... (Aₖ')p' may reduce to (A₁') ... (Aₖ')p'.

Lemma. A' ≥η B' ⟹ A ≥η B.

Proof. Ind. on the length of A. E.g. let A ≡ [x : α]C, so A' ≡ [x : α']C'. If B' ≡ [x : β']D' with α' ≥η β', C' ≥η D', use the ind. hyp. Otherwise C' ≥η (x)B'. The latter expression is ((x)B)' so by ind. hyp. C ≥η (x)B and A ≥η B, q.e.d. □
Now let A' ≥ B'; then by βη-pp: A' ≥β C ≥η B'. This C ≡ C⁰', so C⁰ ≥η B by the lemma, and A ≥ B. This is property (2) above. Property (3) can be proved in the same fashion.

6.2.6. Something about typ* Lemma. ⊢ B' ⟹ (⊢ typ*(B)', typ*(B)' ↓ typ*(B')).
Proof. The translation ' preserves the degree, of course. We use induction on degree(B). The degree 1 case is immediate. Otherwise typ*(B') = typ*(typ(B')) and typ*(B)' = typ*(typ(B))'. By correctness of types ⊢ typ(B'), reducing to typ(B)', and by P*T typ*(B') ↓ typ*(typ(B)'). By CL ⊢ typ(B)' so by ind. hyp. ⊢ typ*(typ(B))' (= typ*(B)'), q.e.d., and typ*(B)' ↓ typ*(typ(B)'). By correctness of types ⊢ typ*(typ(B)') so by CR typ*(B)' ↓ typ*(B'), q.e.d. □

Now that we know CL, CR, PD and SA for Λ(η) we can extend property 5.3.4
to: ⊢ A, ⊢ [x : α]C, A ↓ [x : α]C ⟹ A ↓β [x : β]D, α ↓ β. So, as an alternative application condition, equivalent to the one used in the original application rule:

⊢ A, ⊢ B, typ(A) ≥ α, typ*(B) ≥ [x : α]C ⟹ ⊢ (A)B

we can as well use, e.g.

typ(A) ↓ α, typ*(B) ↓β [x : α]C

or

typ(A) ↓ α, typ*(B) ↓ [x : α]C, ⊢ [x : α]C .
6.2.7. The proof of the correspondence

Theorem. B; ξ ⊢c A ⟺ B'; ξ' ⊢ A'.

Proof. ⟹. By induction on correctness. The formation of the context B' is allowed, due to the liberal degree conventions of Λ(η). Consider, e.g. the appl. rule: let ⊢c A, ⊢c B, typ(A) ↓ α, typ*(B) ↓ [x : α]C. By ind. hyp. ⊢ A', ⊢ B'; further typ(A') ↓ typ(A)' ↓ α' and by the lemma in 6.2.6 ⊢ typ*(B)', typ*(B') ↓ typ*(B)' ↓ [x : α']C'. By CR, typ*(B') ↓ [x : α']C'. By CL, ⊢ [x : α']C', so, by the alternative appl. rule, ⊢ (A')B'. Or consider the instantiation rule: ⊢c B₁, ..., ⊢c Bₖ, y̅ ∈ β̅ * p(y̅) ∈ γ is a scheme in B, |y̅| = k and typ(Bᵢ) ↓ βᵢ[B̅] for i = 1, ..., k. The translated scheme reads p' ∈ [y̅ : β̅']γ'. By ind. hyp. ⊢ B₁', ..., ⊢ Bₖ'. Now typ(B₁') ↓ typ(B₁)' ↓ β₁', typ*(p') = [y₁ : β₁'] ... γ', so ⊢ (B₁')p'. Further typ(B₂') ↓ typ(B₂)' ↓ β₂[B₁]' = β₂'[B₁'] and typ*((B₁')p') = (B₁')typ*(p') ≥ [y₂ : β₂'[B₁']] ... γ', so ⊢ (B₂')(B₁')p'. Etc., up to ⊢ (Bₖ') ... (B₁')p' = p(B̅)', q.e.d.

⟸. Also by induction on correctness. E.g. consider an appl. expression. Either it is ((A)B)' or it is p(B̅)'. First case: if ⊢ (A')B' then ⊢ A' (so ⊢c A), ⊢ B' (so ⊢c B), typ(A)' ↓ typ(A') ↓ α (so typ(A)' ↓ α), typ*(B)' ↓ typ*(B') ↓ [x : α]C (so typ*(B)' ↓ [x : α]C). Hence typ*(B)' ≥ [x : β]D ≡ [x : β₀']D₀' = ([x : β₀]D₀)' with α ↓ β. By CR typ(A)' ↓ β₀', so typ(A) ↓ β₀, and typ*(B) ↓ [x : β₀]D₀, so ⊢c (A)B. Second case: ⊢ (Bₖ') ... (B₁')p', so ⊢c Bₖ, ..., ⊢c B₁. Let y̅ ∈ β̅ * p(y̅) ∈ γ be the scheme of p. Now typ(B₁') ↓ β₁', typ*(p') = [y₁ : β₁'] ... γ' ≥ [y₁ : β₁] ... γ, so typ(B₁)' ↓ β₁', typ(B₁) ↓ β₁. Further typ(B₂') ↓ β₂', and [y₂ : β₂'[B₁']] ... γ ≤ (B₁')typ*(p') = typ*((B₁')p') ↓ [y₂ : β₂] ... γ, so typ(B₂) ↓ β₂[B₁]. Etc., up to typ(Bₖ) ↓ βₖ[B̅] and ⊢c p(B̅), q.e.d. □

6.2.8. The required properties Theorem. The strictly normable constant-expressions [see the comment to 2.1.4] satisfy BT.

Proof. Strictly normable c-expressions transform into strictly normable expressions without constants under the translation '.
And all BT-sequences of c-expressions A transform into subsequences of BT-sequences of A':
(1) typ(A') ↓ typ(A)',
(2) A >₁ B ⟹ A' >₁ B',
(3) A ⊂ B ⟹ A' ⊂ B'.

So by BT for the constant-less version we are done. □
Theorem. Λ(η)c satisfies CR.

Proof. Let ⊢c A, A ≥ B, A ≥ C. By the ⟹-part of the correspondence ⊢ A' and by CR for Λ(η): B' ↓ C', so B ↓ C, q.e.d. □

Theorem. Λ(η)c satisfies CL.

Proof. Let ⊢c A, A ≥ B. Then ⊢ A', A' ≥ B', so by CL ⊢ B'. So ⊢c B. □

Theorem. Λ(η)c satisfies SA, PD, PT, P*T, SC etc.

Proof. Either from CL and CR, or using the correspondence. □
6.2.9. An unessential extension result Now we explain the connection between the ⊢⁻-system and the ordinary ⊢-system of Λ(η) without constants. Recall

⊢⁻ A' ⟺ ⊢c A, i.e. ⊢⁻ A ⟺ ⊢c A⁰ .

The first half of the correspondence result shows ⊢⁻ ⟹ ⊢, i.e. a simple extension result. Now we define a translation ⁻ from the larger into the smaller system, as follows: if y̅ ∈ α̅ * p(y̅) ∈ γ is a scheme in B, |y̅| = k, i < k, then ((Aᵢ) ... (A₁)p')⁻ := [xᵢ₊₁ : αᵢ₊₁[A̅⁻]] ... [xₖ : αₖ[A̅⁻]](xₖ) ... (xᵢ₊₁)(Aᵢ⁻) ... (A₁⁻)p', i.e. we η-expand until p' gets enough arguments in front. For the rest ⁻ acts as identity. Clearly A⁻ ≥η A, A⁻ ≡ (A⁻⁰)'. Viz. ((Aᵢ) ... (A₁)p')⁻⁰ ≡ [xᵢ₊₁ : αᵢ₊₁[A̅]] ... [xₖ : αₖ[A̅]]p(A₁, ..., Aᵢ, xᵢ₊₁, ..., xₖ). The translation ⁻ is a bit intricate, because ((A)B)⁻ is not necessarily (A⁻)B⁻. In general (A⁻)B⁻ ≥β ((A)B)⁻ and B⁻[A⁻] ≥β (B[A])⁻. Further typ(A⁻) ↓η typ(A)⁻, and also typ(A⁻) ↓β typ(A)⁻. Without proof we state that A ↓ B ⟹ A⁻ ↓ B⁻, and that typ*(A⁻) ↓ typ*(A)⁻. From these facts it can be proved that ⊢ A ⟹ ⊢ A⁻, so by the second part of the correspondence ⊢ A ⟹ ⊢⁻ A⁻. In the presence of η-reduction, this is a typical unessential extension result.
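The η-expansion step of the translation ⁻ can be sketched in code. This is an illustrative miniature only: type-labels are omitted, the tuple encoding is mine, and an under-applied spine (Aᵢ)...(A₁)p' receives fresh variables until p' has arity-many arguments in front.

```python
def eta_expand(head, args, arity):
    """Sketch of the translation - of 6.2.9, with type labels omitted:
    ((Ai)...(A1)p')-  :=  [x(i+1)] ... [xk] (xk)...(x(i+1)) (Ai)...(A1) p'
    where k = arity and i = len(args) < k.
    Encoding: ('app', A, B) is (A)B, ('abs', x, B) is [x]B."""
    i = len(args)
    new_vars = ['x%d' % j for j in range(i + 1, arity + 1)]
    body = ('var', head)
    for a in args:                   # (Ai)...(A1) p' : innermost is (A1)
        body = ('app', a, body)
    for x in new_vars:               # apply the new variables on the outside
        body = ('app', ('var', x), body)
    for x in reversed(new_vars):     # abstract them again, [x(i+1)] outermost
        body = ('abs', x, body)
    return body
```

Note that η-contracting all the added abstractor/applicator pairs gives back the original spine, which is the content of A⁻ ≥η A.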
6.3. The case of definitional constants 6.3.1. We have three main possibilities to incorporate definitional constants in our theory. The first one studies the new system (we call it Λ(η)d, with correctness predicate ⊢d, and we also speak about the d-system etc.) independently, as a separate subject; the second one considers it as an extension of Λ(η)c, and the third one embeds it into Λ(η), by extending the translation ' from the previous sections in order to cover definitional constants. Here we actually use the second method, and just mention some points on the third one. But we start by proving the big tree theorem for Λ(η)d, for reasons of completeness and as an indispensable prerequisite for the separate study of the system (method one above).

6.3.2. The big tree theorem for Λ(η)d In 6.2.8 we proved BT for Λ(η)c by means of the embedding ' into Λ(η). It is indeed possible to extend ' to the case of definitional constants, but (see 6.3.3) the translation does not reflect the type-structure sufficiently, which makes this method fail here. So instead we revise the BT-proof of 5.2 (for Λ(η)) and adapt it to the Λ(η)d case, which is relatively easy. First we mention the BT-condition (see 4.3.2):
(5) BT(p(A̅)) ⟸ BT(A₁), ..., BT(Aₖ), BT(typ(p)[A̅]).
(6) BT(d(A̅)) ⟸ BT(A₁), ..., BT(Aₖ), BT(typ(d)[A̅]), BT(def(d)[A̅]).
The βδ-SN conditions are quite analogous, and, as in 4.4.3, we have:

Theorem. βδ-SN(A) ⟹ BT(A). □
This suggests that, in this case as well, the substitution property of βδ-SN is crucial. We choose to adapt the first BT-proof (Sec. 4.5), so we need the replacement theorem (see 4.5.6) instead: Let * denote [x/p(A̅)]LR, let B be normable, μ(x) = μ(p(A̅)), A̅, B βδ-SN. Then:

C ∈ B* ⟹ C βδ-SN.

Proof. As in 4.5.6. We consider a single reduction step C >₁,βδ D. For all β-steps and all δ-steps concerning variables (not constants), βδ-SN(D) can be proved as in 4.5.6. The remaining steps, i.e. δ-steps of constants, can only fall into the categories (1) and (2a), so we get βδ-SN(D) by ind. hyp. II or ind. hyp. III. □

So we have a list of corollaries:

(1) B normable, μ(x) = μ(A), A, B βδ-SN ⟹ B[A] βδ-SN.
(2) B normable, μ(xᵢ) = μ(Aᵢ), Aᵢ βδ-SN (i = 1, ..., k) and B βδ-SN ⟹ B[A̅] βδ-SN.
Proof. The simultaneous substitution can be simulated by iterated single substitution. □

(3) B normable ⟹ B βδ-SN.

Proof. Induction on pretyped expressions. For the new cases use the previous corollary. □

(4) B normable ⟹ B βηδ-SN.

Proof. βη-pp extends to the present case (see 5.2.1); δη-pp we knew already (see II.7.4). This gives (βδ)-η-pp and, by η-SN, βηδ-SN. □

(5) B normable ⟹ βηδ-BT(B). □
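The proof idea of corollary (2) — simultaneous substitution simulated by iterated single substitution — can be sketched as follows. This is a naive illustration under my own term encoding; it ignores variable capture and is sound only under the assumption, made here, that no substituted term Aᵢ mentions any of the variables xⱼ (as when instantiating the parameters of a scheme).

```python
def subst(term, x, a):
    """Single substitution term[x := a] on a toy syntax (no capture check).
    Encoding: ('var', x), ('app', A, B), ('abs', x, B)."""
    tag = term[0]
    if tag == 'var':
        return a if term[1] == x else term
    if tag == 'app':
        return ('app', subst(term[1], x, a), subst(term[2], x, a))
    if tag == 'abs':
        y, body = term[1], term[2]
        return term if y == x else ('abs', y, subst(body, x, a))
    raise ValueError(tag)

def subst_sim(term, pairs):
    """term[x1/A1, ..., xk/Ak], simulated by iterated single substitution;
    correct here because we assume no Ai contains any xj."""
    for x, a in pairs:
        term = subst(term, x, a)
    return term
```

If some Aᵢ did contain an xⱼ, iterated substitution would substitute into it again, so the stated side condition (or a renaming step) is genuinely needed.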
6.3.3. The translation into Λ(η) Here we show how the translation ' can be extended to the d-case. Viz. an expression d(A̅) transforms into (Aₖ') ... (A₁')[x̅ : α̅']D', where x̅ ∈ α̅ * d(x̅) := D * d(x̅) ∈ γ is the scheme of d. This translation behaves nicely w.r.t. reduction: A ≥ B ⟹ A' ≥ B'. But of course it is possible that an expression A' β-reduces to an expression which is not some B'. This is in contrast with the situation with primitive constants, where this could only occur by η-reduction. The best we can get is: A' >₁,β B ⟹ B ≥β C', A ≥₁,βδ C. So, e.g. by ind. on θβ(A'), we get A' ≥β B ⟹ B ≥β C', A ≥ C. For the rest the translation seems to be not too useful, because properties like A' ↓ B' ⟹ A ↓ B (at least where η-reduction is allowed) and typ(A') ↓ typ(A)' are only valid in the correct fragment. Note that typ(A') ≥ typ(A)' is simply wrong here.

6.3.4. Some properties of Λ(η)c Translation of Λ(η)d into Λ(η)c just requires the elimination of abbreviations, which can be done by δ-normalization. In the next sections we show that this actually constitutes a translation, i.e. that it preserves correctness. Here we first give some properties of Λ(η)c which we need in the - rather complicated - proof below. The single substitution result (of Λ(η), and of Λ(η)c too)

⊢ A, typ(A) ↓ α, (x ∈ α, η ⊢ B) ⟹ η[A] ⊢ B[A]

can, by induction on |A̅|, be extended to a simultaneous substitution result

⊢ A̅, typ(Aᵢ) ↓ αᵢ[A̅] for i = 1, ..., |A̅|, (x̅ ∈ α̅ ⊢ B) ⟹ ⊢ B[A̅].

The properties of Sec. 3.2.2 concerning the typ of substitution results can be generalized to (1) the simultaneous substitution case, (2) successive application of typ, resulting in:
typʲ(Aᵢ) ↓ typʲ(xᵢ)[A̅] for i = 1, ..., |A̅| ⟹ typʲ(B)[A̅] ↓ typʲ(B[A̅]) ,

for all relevant j, where typʲ stands for j successive applications of typ. This holds for Λ(η), but also for Λ(η)c and Λ(η)d. Notice that in case B does not end in one of the xᵢ we even have typʲ(B[A̅]) = typʲ(B)[A̅].
6.3.5. The translation into Λ(η)c Our notation for the translation is ⁻. For expressions, ⁻ amounts just to taking the δ-normal form. It is clear how ⁻ acts on strings and contexts. It is intended that the book B⁻ is formed from B by δ-normalizing and by skipping the abbreviational schemes. The translation is of course not 1-1. We recall that B[A̅]⁻ = B⁻[A̅⁻], that d(A̅)⁻ ≡ def(d)⁻[A̅⁻], and that δ-reduction commutes with βη-reduction. The latter implies

A ≥ B ⟹ A⁻ ≥ B⁻  and  A ↓ B ⟹ A⁻ ↓ B⁻.
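The translation ⁻ as δ-normalization can be sketched as follows. This is an illustrative encoding of mine: `defs` maps each definitional constant d to its parameter list and definiens, and — as in a correct book — a scheme may only refer to constants introduced earlier, which is why the recursive unfolding terminates.

```python
def inst(term, env):
    """Simultaneous replacement of free variables according to env
    (no capture handling; the schemes assumed here are closed)."""
    tag = term[0]
    if tag == 'var':
        return env.get(term[1], term)
    if tag == 'app':
        return ('app', inst(term[1], env), inst(term[2], env))
    if tag == 'abs':
        env2 = {k: v for k, v in env.items() if k != term[1]}
        return ('abs', term[1], inst(term[2], env2))
    if tag == 'con':
        return ('con', term[1], [inst(a, env) for a in term[2]])
    raise ValueError(tag)

def delta_nf(term, defs):
    """d(A1,...,Ak) unfolds to def(d)[A1,...,Ak], recursively."""
    tag = term[0]
    if tag == 'var':
        return term
    if tag == 'app':
        return ('app', delta_nf(term[1], defs), delta_nf(term[2], defs))
    if tag == 'abs':
        return ('abs', term[1], delta_nf(term[2], defs))
    if tag == 'con':
        params, body = defs[term[1]]
        env = dict(zip(params, [delta_nf(a, defs) for a in term[2]]))
        return delta_nf(inst(body, env), defs)
    raise ValueError(tag)
```

The identity B[A̅]⁻ = B⁻[A̅⁻] recalled above corresponds to normalizing the arguments first and then instantiating, as the 'con' case does.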
6.3.6. The translation preserves correctness Theorem. B; ξ ⊢d A ⟹ B⁻; ξ⁻ ⊢c typⁱ(A)⁻ and typⁱ(A)⁻ ↓ typⁱ(A⁻), for i = 0, ..., degree(A) − 1 (this includes ⊢c A⁻ itself).
Proof. By induction on ⊢d. Crucial cases are:
(1) The application case: A ≡ (A₁)A₂, ⊢d A₁, ⊢d A₂, typ(A₁) ≥ α, typ*(A₂) ≥ [x : α]C. By the ind. hyp. ⊢c A₁⁻, ⊢c typ(A₁)⁻, typ(A₁)⁻ ↓ typ(A₁⁻), ⊢c typⁱ(A₂)⁻, typⁱ(A₂)⁻ ↓ typⁱ(A₂⁻). Clearly typ(A₁)⁻ ≥ α⁻ so by CR typ(A₁⁻) ↓ α⁻. Similarly, typ*(A₂)⁻ ≥ [x : α⁻]C⁻, and typ*(typⁱ(A₂)⁻) ↓ typ*(typⁱ(A₂⁻)) = typ*(A₂⁻) ↓ typ*(A₂)⁻ (by P*T), so by CR, typ*(typⁱ(A₂)⁻) ↓ [x : α⁻]C⁻. Hence ⊢c typⁱ((A₁)A₂)⁻ (= (A₁⁻)typⁱ(A₂)⁻). See 6.2.6 for the alternative appl. condition. The property typⁱ((A₁)A₂)⁻ ↓ typⁱ(((A₁)A₂)⁻) is trivial.
(2) The definitional constant case: A ≡ d(B̅), ⊢d Bⱼ, typ(Bⱼ) ↓ βⱼ[B̅] for j = 1, ..., |B̅|, where y̅ ∈ β̅ * d(y̅) := D * d(y̅) ∈ γ is the scheme of d. By ind. hyp. ⊢c Bⱼ⁻ and typ(Bⱼ⁻) ↓ typ(Bⱼ)⁻ ↓ βⱼ[B̅]⁻. Also by ind. hyp. y̅ ∈ β̅⁻ ⊢c D⁻, y̅ ∈ β̅⁻ ⊢c γ⁻, y̅ ∈ β̅⁻ ⊢c typ(D)⁻ and typ(D)⁻ ↓ typ(D⁻). So, by the simultaneous subst. property, ⊢c D⁻[B̅⁻] (≡ A⁻), ⊢c γ⁻[B̅⁻] (= typ(A)⁻). We know that γ ↓ typ(D), so γ⁻ ↓ typ(D)⁻, so by CR typ(D⁻) ↓ γ⁻, whence typ(D⁻)[B̅⁻] ↓ γ⁻[B̅⁻] and, again by CR, typ(A⁻) ↓ typ(A)⁻. Now there is left to prove, for i = 2, ..., degree(A) − 1:
(a) ⊢c typⁱ(A)⁻ (= typⁱ⁻¹(γ[B̅])⁻), and
(b) typⁱ(A)⁻ ↓ typⁱ(A⁻), i.e. typⁱ⁻¹(γ[B̅])⁻ ↓ typⁱ(D⁻[B̅⁻]).

The ind. hyp. gives us ⊢c typⁱ⁻¹(γ)⁻, ⊢c typⁱ(D)⁻, typⁱ⁻¹(γ)⁻ ↓ typⁱ⁻¹(γ⁻), typⁱ(D)⁻ ↓ typⁱ(D⁻) for these i, and ⊢c typⁱ(Bⱼ)⁻ (↓ typⁱ(Bⱼ⁻)), for i = 0, ..., degree(Bⱼ) − 1, for j = 1, ..., |B̅|. Now (b) is simple: typⁱ⁻¹(γ[B̅]) ↓ typⁱ⁻¹(γ)[B̅], so typⁱ⁻¹(γ[B̅])⁻ ↓ typⁱ⁻¹(γ)⁻[B̅⁻] ↓ typⁱ⁻¹(γ⁻)[B̅⁻] ↓ typⁱ(D⁻)[B̅⁻] ↓ typⁱ(D⁻[B̅⁻]). Here we use PT and the substitution property of types. By CR we get (b). Property (a) we formulate in the form of a lemma.
Lemma. Let y̅ ∈ β̅ ⊢d γ, ⊢d Bⱼ, for j = 1, ..., |B̅|, with γ and B̅ as above. Then ⊢c typⁱ(γ[B̅])⁻, for i = 0, ..., degree(γ) − 1.

Proof. If γ does not end in some of the yⱼ then typⁱ(γ[B̅])⁻ = typⁱ(γ)⁻[B̅⁻], which is correct by the simultaneous subst. property. This also covers the case i = 0 (which we knew already). For the rest we use induction on the length of γ. The case γ ≡ yⱼ is true by assumption. Further consider the application case: γ ≡ (γ₁)γ₂, ⊢d γ₁, ⊢d γ₂, typ(γ₁) ≥ φ, typ*(γ₂) ≥ [z : φ]E. By ind. hyp. ⊢c γ₁[B̅]⁻, ⊢c typ(γ₁[B̅])⁻, ⊢c typⁱ(γ₂[B̅])⁻ for all i. We have typ(γ₁[B̅]⁻) ↓ typ(γ₁)⁻[B̅⁻] ≥ φ⁻[B̅⁻], so by CR typ(γ₁[B̅]⁻) ↓ φ[B̅]⁻. Similarly typⁱ(γ₂[B̅])⁻ ↓ typⁱ(γ₂)⁻[B̅⁻]. So, by CR and P*T, typ*(typⁱ(γ₂[B̅])⁻) ↓ typ*(γ₂)⁻[B̅⁻] ≥ [z : φ[B̅]⁻]E[B̅]⁻. Again by CR, typ*(typⁱ(γ₂[B̅])⁻) ↓ [z : φ[B̅]⁻]E[B̅]⁻, whence ⊢c typⁱ((γ₁[B̅])γ₂[B̅])⁻, q.e.d. □ The abstr. case is straightforward. This finishes the proof of the lemma.
This finishes the definitional constant case of the theorem. Now the remaining cases of the theorem are straightforward. This finishes the proof of the theorem. □
6.3.7. Is Λ(η)d a definitional extension of Λ(η)c? The above corollary amounts to the unessential extension properties UE2 and UE3 (see V.3.2.2). Of course, we also have ⊢c A ⟹ A ≡ A⁻, and it is tempting to conclude the other half of UE1:

B; ξ ⊢d A ⟹ B; ξ ⊢d A⁻

from the corollary. This is, however, not immediate as yet: we can conclude

B; ξ ⊢d A ⟹ B⁻; ξ⁻ ⊢c A⁻

and we know

(B; ξ)-typ(A⁻) ≥ (B⁻; ξ⁻)-typ(A⁻),

but we hardly know anything about

(B; ξ)-typ*(A⁻) .

Instead, we first prove the substitution theorem for Λ(η)d; this gives correctness of types, as well as δ-CL. The latter implies UE1, which completes our definitional extension result.

6.3.8. Some nice properties of Λ(η)d The corollary in 6.3.6 gives us already some nice results.

Theorem. Λ(η)d satisfies (1) CR, (2) SA and (3) PD.
Proof. (1) Let ⊢d A, B ≤ A ≥ C. Then ⊢c A⁻, B⁻ ≤ A⁻ ≥ C⁻. By CR B⁻ ↓ C⁻, so B ↓ C.

(2) Let ⊢d (A)[x : B]C. Then ⊢c (A⁻)[x : B⁻]C⁻, so typ(A⁻) ↓ B⁻. Further typ(A)⁻ ↓ typ(A⁻), and by CR, typ(A) ↓ B.

(3) Let ⊢d [x : α]A, [x : α]A ≥ [x : β]B. Then ⊢c [x : α⁻]A⁻, [x : α⁻]A⁻ ≥ [x : β⁻]B⁻. By PD α⁻ ↓ β⁻, so α ↓ β. □
Remark. We can also prove some form of PT and P*T. Let ⊢d A, ⊢d B, A ≥ B. Then typ(A) ↓ typ(B) and typ*(A) ↓ typ*(B).

Proof. ⊢c A⁻, ⊢c B⁻, A⁻ ≥ B⁻, so typ(A)⁻ ↓ typ(A⁻) ↓ typ(B⁻) ↓ typ(B)⁻ and by CR typ(A) ↓ typ(B). Similarly for typ*. □
6.3.9. Lemma. Let ⊢d Bᵢ, i = 1, ..., k. Let y̅ ∈ β̅ * c(y̅) ∈ γ be the scheme of c, with |y̅| = k. Let ⊢c c(B̅⁻). Then ⊢d c(B̅).
Theorem. Let ξ ≡ x̅ ∈ α̅ ⊢d B. Let * stand for [x̅/A̅]. Let ⊢d Aᵢ and typ(Aᵢ) ↓ αᵢ* for i = 1, ..., |x̅|. Then ⊢d B*.

Proof. We use induction on ⊢d B. So, by ind. hyp. ⊢d αᵢ* for i = 1, ..., |x̅|. Now typ(A₁) ↓ α₁. So typ(A₁⁻) ↓ typ(A₁)⁻ ↓ α₁⁻ and by CR typ(A₁⁻) ↓ α₁⁻. Similarly typ(A₂⁻) ↓ α₂*⁻ = α₂[A̅⁻], etc., and for all i typ(Aᵢ⁻) ↓ αᵢ[A̅⁻]. Now consider, e.g., the application case: x̅ ∈ α̅ ⊢d (B₁)B₂. By 6.3.6, x̅ ∈ α̅⁻ ⊢c (B₁⁻)B₂⁻ and by the subst. theorem in Λ(η)c, ⊢c (B₁⁻[A̅⁻])B₂⁻[A̅⁻] (≡ (B₁*⁻)B₂*⁻). By ind. hyp. ⊢d B₁*, ⊢d B₂*, so by the first lemma, ⊢d (B₁*)B₂*. Similarly use the second lemma for the constant-expression case. The other cases are immediate. □

6.3.10. The remaining nice properties for Λ(η)d Corollaries of the preceding theorem are
(1) correctness of types,
(2) δ-outside-CL1,
(3) β-outside-CL1 (use SA).
Lemma. Λ(η)d satisfies CL1.

Proof. The η-outside case is mere strengthening. We can use the lemmas in 6.3.9 for the inside cases. Let ⊢d (B₁)B₂, B₁ ≥ C₁, B₂ ≥ C₂. By ind. hyp. ⊢d C₁, ⊢d C₂. By 6.3.6 ⊢c (B₁⁻)B₂⁻, and B₁⁻ ≥ C₁⁻, B₂⁻ ≥ C₂⁻, so ⊢c (C₁⁻)C₂⁻, so ⊢d (C₁)C₂. Similarly for const. expressions. □

Theorem. Λ(η)d satisfies CL.

Proof. As usual, by ind. on ≥. □
6.4. Nederpelt's original formulation 6.4.1. Nederpelt's original definition of Λ [Nederpelt 73 (C.3)] used single-line presentation. I.e. instead of defining correctness of expressions relative to a context, he defined correctness of expressions having an abstractor string [x̅ : α̅] (notation Q) in front. For definiteness we give his rules. We write ⊢N for correctness in his system. But for certain provisions making sure that no confusion of variables occurs, the rules read:
6.4.2. Apart from the use of abstractor strings instead of contexts, there are two other points that make the two approaches not completely parallel. The first point concerns abstraction; our abstraction rule has no counterpart in Nederpelt's system. Nederpelt rather follows a combinatory (in the sense of combinatory logic) way of building expressions. In the language of combinatory logic, rule (2) above is the rule for I_α, the identity in α, and rule (3) is the rule for K_αγ, the constant function on α with outcome γ. Alternatively, rule (3) might be called a rule of weakening (see V.2.9.3).

6.4.3. The second point that requires attention is that an abstractor string can get involved in a reduction (notably an η-step), whereas contexts are of course immune to reduction. First some notation. We write |Q| for the number of abstractors in Q. We write Q ≥ Q' if Q = [x̅ : α̅], Q' = [x̅ : α̅'] and α̅ ≥ α̅' in the obvious sense. Now we have the following lemma: QA ≥ Q'A', |Q| = |Q'| ⟹ A ≥ A'. Proof: If there are no η-steps involving the border line between Q and A, then clearly Q ≥ Q', A ≥ A'. Otherwise Q = Q₁[z : α], α ≥ α', Q₁ ≥ Q₁', A ≥ (z)B with z ∉ FV(B) and Q₁'B ≥ Q₁''[z : β]A'. I.e. QA = Q₁[z : α]A ≥ Q₁'[z : α'](z)B >η Q₁'B ≥ Q₁''[z : β]A'. Now we can, e.g., use ind. on θ(QA) and conclude that B ≥ [z : β]A'. But then A ≥ A', q.e.d.

6.4.4. The equivalence proof Now we are ready for the equivalence proof.

Theorem. Let Q ≡ [x̅ : α̅], ξ ≡ x̅ ∈ α̅. Then ⊢N QA ⟺ ξ ⊢ A.

Proof. First the ⟸-part; we use induction on ⊢. E.g. consider our variable rule: from x̅ ∈ α̅ ⊢ we conclude x̅ ∈ α̅ ⊢ xᵢ. If xᵢ is the most "recent" variable then we must use rule (2). Viz. x̅ ∈ α̅ ⊢ is itself a result from x₁ ∈ α₁, ..., xᵢ₋₁ ∈ αᵢ₋₁ ⊢ αᵢ. By ind. hyp. we get ⊢N [x₁ : α₁] ... [xᵢ₋₁ : αᵢ₋₁]αᵢ. Otherwise we must insert the abstractors in between [xᵢ : αᵢ] and the end of Q by successive applications of rule (3).
Now consider the ⟹-part. The crucial case is the application clause. So let ⊢N QA, ⊢N QB, typ(QA) ≥ Qα, typ*(QB) ≥ Q[x : α]C. By ind. hyp. ⊢ A, ⊢ B. Now typ(QA) ≡ Q typ(A) ≥ Qα, so by the lemma typ(A) ≥ α. Similarly typ*(B) ≥ [x : α]C. So we conclude ⊢ (A)B, q.e.d. □

6.4.5. The nice properties for Nederpelt's system One of the consequences of the theorem is:
⊢N A ⟹ ⊢ A, so the N-system can be considered a part of our system. This gives CR and CL for ⊢N immediately. From this one can get the other properties SA, PD, PT etc. as usual.

6.4.6. Alternative way of embedding Λd into ΛN Resuming the results of the preceding sections: we have constructed an embedding of Λ(η)d (via Λ(η)c and Λ) into ΛN. Here we introduce an alternative way (due to Nederpelt [Nederpelt 71a]) of embedding Λ(η)d directly into ΛN. Our notation for the translation is, again, '. Let a statement B; ξ ⊢d A be given. Primitive schemes y̅ ∈ β̅ * p(y̅) ∈ γ are, as is to be expected, turned into abstractors [p' : [y̅ : β̅']γ']. The context ξ is of course transformed into an abstractor string ξ' ≡ Q. Essential is the translation of definitional constant schemes. A scheme y̅ ∈ β̅ * d(y̅) := D * d(y̅) ∈ γ is translated into an expression "segment" ([y̅ : β̅']D')[d' : [y̅ : β̅']γ']. All constant expressions c(A̅) are now translated into (Aₖ') ... (A₁')c'. So B; ξ ⊢d A is translated into a single expression B'ξ'A', where B' is a string of abstractors and applicators, and ξ' consists solely of abstractors. For expressions the translation is quite similar to the translation ' in 6.2.1. In particular we have (as in 6.2.4) typ(A') ≥β typ(A)'. However, w.r.t. δ-reduction the correspondence is not too close: it is not possible to eliminate occurrences of d' one at a time. So in order to establish A ↓ B ⟹ A' ↓ B' we need a partial δ-normal form again. Anyhow, it is indeed possible to prove B; ξ ⊢d A ⟹ ⊢N B'ξ'A'.
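The passage from our context presentation to a single-line presentation is mechanical; here is a minimal sketch (my own tuple encoding, with ('abs', x, T, B) for [x : T]B):

```python
def single_line(context, term):
    """6.4.1: a statement  x1 : a1, ..., xn : an |- A  becomes the single
    expression  [x1 : a1] ... [xn : an] A  with an abstractor string in front."""
    for x, a in reversed(context):
        term = ('abs', x, a, term)
    return term
```

Iterating from the innermost assumption outward puts x₁ : α₁ as the outermost abstractor, as required.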
The language theory of Automath, Chapter VIII, Section 3 (C.5)
629
VIII. SOME RESULTS ON AUT-Π

VIII.3. A short proof of closure for AUT-Π

3.1. Proving closure for AUT-Π is not very different from proving it for AUT-QE. So we just sketch how to modify the proof in V.3.2. We start with a version without the extensions mentioned in 2.4 and 2.7, but we include all reductions (also δ¹-reduction).

3.2. For the terminology see V.3.1. Let ≥ denote disjoint one-step reduction. [See the comment to II.8.] By the properties in [van Daalen 80, II.7.4.3] [or, alternatively, by weak δ-advancement and induction on dδ(A)] we have

A ≥ B ⟹ δ-nf(A) ≥ δ-nf(B) .

By the substitution theorem we have δ-CLPT. The δ-nf's of 1-expressions are of the form Π([x : α]A) or τ. Reductions of these expressions can only be internal, so by induction on ≥ we get (including what might be called UD¹ here):

⊢¹ Π([x : α]A) q Π([x : β]B) ⟹ α q β and (x ∈ α ⊢ A q B) .
3.3. From this follows SA² (whence β-outside-CL₁²) and β-outside-PT₁². Viz. let A ∈ α, ⊢² [x : B]C ∈ Π([x : α]D), with conclusion ⊢ (A)[x : B]C. Then, for some E, x ∈ B ⊢ C ∈ E and ⊢ Π([x : B]E) q Π([x : α]D). So α q B and x ∈ B ⊢ E q D, whence A ∈ B (i.e. SA²) and x ∈ B ⊢ C ∈ D. So C[A] ∈ D[A] (i.e. β-outside-CLPT₁²).

The proofs of UT² and the inside cases of PT₁² are by ind. on ⊢.
3.4. The strengthening rule gives η-outside-CL₁. Here follows a proof of η-outside-PT₁² different from the proof in V.3.2.5. Viz. let ⊢² [x : α](x)A ∈ γ, x ∉ FV(A). Then [x : α](x)A ∈ Π([x : α]C[y/x]) q γ, where x ∈ α ⊢ A ∈ Π([y : α']C), α' q α. So, as well, x ∈ α ⊢ A ∈ Π([y : α]C). By weakening x ∈ α, y ∈ α ⊢ A ∈ Π([y : α]C) and x ∈ α, y ∈ α ⊢ (y)A ∈ C, so x ∈ α ⊢ [y : α](y)A ∈ Π([y : α]C). Again by weakening x ∈ α ⊢ [x : α](x)A ∈ γ, so by UT² x ∈ α ⊢ γ q Π([y : α]C). Hence x ∈ α ⊢ A ∈ γ and by strengthening A ∈ γ, q.e.d.

3.5. This completes the proof of PT₁². Then PT² and LQ² follow by ind. on ≥ and q respectively. Now we come to CLPT³. For properties like SA³ we need
3.6. To this end we study β²-reduction and, in particular, β²-head-reduction, for short βₕ² (for the definitions see V.3.3.3 and V.4.4.5). We know already β²-outside-CLPT₁ (this is β-outside-CLPT₁²). From this follows β²-CLPT₁ by ind. on ⊢, and β²-CLPT by ind. on ≥. Now we use the fact that 3 is the only argument degree and that, hence, β²-reduction does not create new β²-redices. Compare V.3.3.4, VI.2.4. As a consequence, β²-SN is quite easily provable (for degree-correct expressions) even without using norms: namely, if A β²-SN, B β²-SN, then A[B] β²-SN, by ind. on (1) θβ²(B), (2) length(B). So, as usual, β²-SN by ind. on length (see IV.2.4.1). A fortiori, βₕ²-SN. Besides, βₕ² satisfies CR, so we can speak about βₕ²-nf's. E.g.,

degree(B) = 2, βₕ²-nf(B) = [x : α]C ⟹ βₕ²-nf((A)B) = C[A] .

Clearly βₕ² and δ commute, so βₕ²δ-CR, and βₕ²δ-nf's are defined too.
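Head reduction as used here just keeps contracting the redex at the head; the following is a naive illustrative sketch (my own encoding, with degrees and the δ-steps omitted, so termination is simply assumed for the inputs shown):

```python
def subst(term, x, a):
    """term[x := a], naive (no capture check).
    Encoding: ('var', x), ('app', A, B) for (A)B, ('abs', x, T, B) for [x:T]B."""
    tag = term[0]
    if tag == 'var':
        return a if term[1] == x else term
    if tag == 'app':
        return ('app', subst(term[1], x, a), subst(term[2], x, a))
    if tag == 'abs':
        y, t, body = term[1], term[2], term[3]
        return term if y == x else ('abs', y, subst(t, x, a), subst(body, x, a))
    raise ValueError(tag)

def beta_head_nf(term):
    """Contract head redexes  (A)[x : T]C  >  C[x := A]  until none is left."""
    while term[0] == 'app' and term[2][0] == 'abs':
        arg, fun = term[1], term[2]
        term = subst(fun[3], fun[1], arg)
    return term
```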
3.7. Theorem. ⊢² A q Σ(φ) ⟹ βₕ²δ-nf(A) = Σ(ψ), φ q ψ.

Sketch of proof. Ind. on q. For the induction step we need the following property: ⊢² A, βₕ²δ-nf(A) = Σ(φ), A ≥ C or C ≥ A, ⊢ C ⟹ βₕ²δ-nf(C) = Σ(ψ), φ q ψ. If C ≥ A it is easy: (βₕ²)-i-pp holds here for all kinds of reduction i (see II.9), so βₕ²δ-nf(C) = Σ(ψ), ψ ≥ φ. Otherwise, A ≥ C. Now βₕ²δ commutes with all other kinds of reduction, except η¹ (see II.8). And it even commutes with the latter, except for "outside" domains, where we define the latter to be the αᵢ, βⱼ, etc. in (A̅)[x̅ : α̅](B̅)[y̅ : β̅] ..., with (A̅) possibly empty. But there are no "outside" domains left in Σ(φ). So, in any case, βₕ²δ-nf(C) = Σ(ψ), φ ≥ ψ. In fact, if A ≥βₕ²δ C then φ = ψ. By βₕ²δ-CL we know that both Σ(φ) and Σ(ψ) are correct, so from (φ ≥ ψ or ψ ≥ φ) we can conclude φ q ψ. This proves the wanted property. □

Corollary. Σ(φ) q Σ(ψ) ⟹ φ q ψ. □
3.8. Both the theorem and the corollary can be proved in precisely the same manner for Π and ⊕, yielding the properties in 3.5. Remark: The theorem above is a kind of minimal result for the desired properties. E.g., we can, alternatively, prove a kind of weak CR²-result as in VI.2.4, or prove a similar but stronger theorem in the spirit of V.3.3, V.3.4.
3.9. Now we are able to prove the outside cases of CLPT³. E.g. for +-reduction. Let ⊢ (i₁(A, β))(F ⊕ G) ∈ γ. Then i₁(A, β) ∈ δ, F ⊕ G ∈ Π(φ), φ ∈ δ → τ, (i₁(A, β))φ q γ. And A ∈ α, α ⊕ β q δ, F ∈ α' → γ', G ∈ β' → γ', (α' ⊕ β') → γ' q Π(φ). So [x : α' ⊕ β']γ' q φ, and [x : α' ⊕ β']γ' ∈ δ → τ. So (α' ⊕ β') q δ q (α ⊕ β), whence α q α', β q β'. So (A)F ∈ γ'. Further γ' q (i₁(A, β))[x : α' ⊕ β']γ' q (i₁(A, β))φ q γ, whence (A)F ∈ γ too. Similarly for the other variant of +.

3.10. Then follows full CLPT₁ by ind. on ⊢ and CLPT by ind. on ≥. Besides, we have of course UT and LQ. And we can freely make the language definition somewhat more liberal, as follows. First we can change the q-propagation rule into

A q B, B ↓ C, ⊢ C ⟹ A q C .

Secondly we can add the appl. rule, with i ≥ 1:

A ∈ α, ⊢ⁱ⁺¹ B q [x : α]C ⟹ ⊢ (A)B

and drop the degree restriction in the appl. rule 1 (i.e. rule 1.4).
3.11. Now we shall say something about proving CL for AUT-Π with the extension of Sec. 2.4. Just adding abstr. expressions of degree 1 does not matter at all; we still can get UD¹ without any difficulty. Making the language into a +-language (i.e. adding appl-1-expressions too) causes some trouble with the domains in case η-reduction is present. This can, however, be circumvented as in V.3.3: first leave η¹ out, then prove CL and add η¹ again.

3.12. Finally the extension of Sec. 2.7, i.e. where ⊕-2-expressions are present. If there is also ε²-reduction the situation is essentially more complicated, because β and ε interfere nastily. But without ε² the proofs of 3.3-3.8 just need some modification: (β+)²-SN can be proved as easily as β²-SN, +²-CLPT is not difficult either. Then Theorem 3.7 can be proved for (β+)ₕ²δ-nf's instead.

3.13. Requirements for the pp-results in II.9 were:

(1) The result of outside-σ-reduction is never a ⊕-, an inj- or an abstr-expression.
(2) The result of outside η or ε is never an inj-expression or a pair.
Now we can easily verify them for AUT-Π using the results of this section. First let (φ, A(1), A(2)) >σ A. I.e. degree(A) = 3, A ∈ Σ(φ). If A were an abstr-term, then A ∈ Π(ψ) for some ψ. UT states that Π(ψ) q Σ(φ). Theorem 3.7 states that Π(ψ) ≥ Σ(χ) for some χ. This is impossible. Similarly for inj- or ⊕-expressions. Or let [x : α](x)A >η A. By PT A ∈ Π(φ) for some φ. If A were an inj-expression, then degree(A) = 3, A ∈ (β ⊕ γ) for some β, γ. By UT Π(φ) q (β ⊕ γ). Use the suitable variant of Theorem 3.7 again (Sec. 3.8); this gives a contradiction.
VIII.4. A first SN-result for an extended system

4.1. Introduction The word "extended" in the title of this section refers to the presence of other formation rules than just abstr and appl (and possibly instantiation) and other reduction rules than just β and η (and possibly δ). In the case of AUT-Π we are concerned with the additional presence of:

(1) pairs and projections, with reductions π and σ,
(2) injections and ⊕-terms, with reductions + and ε.
In IV.2.4 we gave some versions of a "simple" (as compared to a proof using computability) proof of β-SN. Then we extended it to βη using βη-pp. Afterwards we included δ as well. Here we stick to the separation of δ from the other reduction rules. Below we first show (4.6) that addition (1) mentioned above does not cause any trouble: the first version of the "simple" proof of β-SN immediately covers the βπ-case. And afterwards, we can include δ and η by a postponement result again. However, the second addition essentially complicates matters. The presence of + makes the first β-SN proof fail here, because the important induction on functional complexity (norm) goes wrong (see Sec. 5.1.2). We add new, so-called permutative reductions (Sec. 4.3.1, (III)) in order to save the idea of the proof (5.1.3). These permutative reductions, in turn, complicate the SN-condition, and a way to keep them manageable consists of adding (in 5.1.5) still another kind of reduction, viz. improper reductions (Sec. 4.3.1, (IV)). Our second β-SN proof of Ch. IV can fairly easily be adapted for the present situation, however. We just have to add improper reductions to make the proof work (see Sec. 5.2). For completeness we also include a proof based on the computability method (Sec. 5.3). However, these three proofs just cover the situation with βπ+-reduction and can, by ext-pp, be extended to βπδη. Alas, we have not been able to handle ε too. We cannot use pp anymore, so we have to include ε from the start of the proof on. And none of our methods can cope with this situation. The problems with ⊕ (or ∨) are well known from proof theory. E.g. Prawitz in [Prawitz 65] first proves normalization for classical propositional logic, where he avoids the problem with ∨ by defining ∨ in terms of "negative" connectives.
4.2. The system AUT-Π₀

4.2.1. For brevity and clarity we study a system of terms with the same "connectives" and reductions as AUT-Π (so the essential problems with SN become clear) but with a simplified type-structure. It can be compared with the normable expressions of Ch. IV. Later (Sec. 5.4) we extend our results to AUT-Π.

4.2.2. Reduced type structure The reduced types or norms (syntactical variables α, β, γ, ν) are inductively given by:

(1) τ is a norm.
(2) if α and β are norms, then so are α ⊗ β, α → β and α ⊕ β.

Note: If we write [α]β instead of α → β it is clear that the norms of Ch. IV form a subset of the present norm system. We write α → β with the purpose to show that our norms form a simple type structure over a single fixed type, τ. This is also true of the norms in Ch. IV. Hence normability results (as in Ch. IV, or as given earlier in [van Benthem Jutting 71b (C.1)], [Nederpelt 73 (C.3)] for certain Automath variants) can alternatively be proved as follows: the generalized systems under consideration are not essentially richer than simple, non-generalized type theory, in the sense that they do provide the same set of terms of free λ-calculus with a type as does a simple, non-generalized system. Compare [Ben-Yelles 81].
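As a tiny illustration of the note, the norms can be encoded as nested tuples over the single base norm τ, with a printer that uses the bracket notation [α]β of Ch. IV (the encoding is mine, for illustration only):

```python
# Norms of 4.2.2: ('tau',) is the base norm; ('tensor', a, b), ('arrow', a, b)
# and ('plus', a, b) stand for a (x) b, a -> b and a (+) b.

TAU = ('tau',)

def is_norm(n):
    """Check clauses (1) and (2) of 4.2.2."""
    if n == TAU:
        return True
    return (isinstance(n, tuple) and len(n) == 3
            and n[0] in ('tensor', 'arrow', 'plus')
            and is_norm(n[1]) and is_norm(n[2]))

def show(n):
    """Print a norm, writing [a]b for a -> b as in Ch. IV."""
    if n == TAU:
        return 'tau'
    a, b = show(n[1]), show(n[2])
    return {'arrow': '[%s]%s' % (a, b),
            'tensor': '(%s (x) %s)' % (a, b),
            'plus': '(%s (+) %s)' % (a, b)}[n[0]]
```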
4.2.3. Terms of AUT-Π₀. All terms (syntactical variables A, B, C, ...) have a norm. The norm of A is denoted μ(A). We also write A ∈ α for μ(A) ≡ α. Terms are constructed according to:
(i) variables x, y, z, ... of any norm,
(ii) x ∈ α, A ∈ α, B ∈ β ⇒ [x : A]B ∈ α → β,
(iii) C ∈ α ⊗ β, A ∈ α, B ∈ β ⇒ (C, A, B) ∈ α ⊗ β,
(iv) A ∈ α, B ∈ β ⇒ i₁(A, B) ∈ α ⊕ β, i₂(A, B) ∈ β ⊕ α,
(v) B ∈ α → β, A ∈ α ⇒ (A)B ∈ β,
(vi) B ∈ α ⊗ β ⇒ B(1) ∈ α, B(2) ∈ β,
(vii′) [x : A]C ∈ α → γ, [y : B]D ∈ β → γ ⇒ ([x : A]C ⊕ [y : B]D) ∈ (α ⊕ β) → γ.
These terms can be compared with the 3-expressions of AUT-Π. However, there are no constants, no instantiation (and no δ), the type structure is simpler, and there are only ⊕-terms of the form [x : A]C ⊕ [y : B]D. Below we also consider a variant AUT-Π₁ which has general ⊕-terms. Instead of rule (vii′) it has rule (vii):
(vii) B ∈ α → γ, C ∈ β → γ ⇒ B ⊕ C ∈ (α ⊕ β) → γ.
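Rules (i)-(vii) amount to a small type discipline over the norms, which can be sketched as a norm-computing checker. The tuple encoding of terms is an assumption of this illustration, not the text's official syntax; type labels are carried along but not themselves checked.

```python
def norm(t):
    """Compute the norm of a term following rules (i)-(vii).
       Terms: ('var', x, a), ('abs', x, a, B) for [x:...]B, ('app', A, B) for (A)B,
       ('pair', A, B), ('inj', j, A, b), ('proj', j, B), ('plus', B, C)."""
    tag = t[0]
    if tag == 'var':                       # (i)
        return t[2]
    if tag == 'abs':                       # (ii)  [x:...]B in a -> b
        return ('->', t[2], norm(t[3]))
    if tag == 'pair':                      # (iii) pair in a (x) b, type label omitted
        return ('tens', norm(t[1]), norm(t[2]))
    if tag == 'inj':                       # (iv)  i1(A) in a (+) b, i2(A) in b (+) a
        _, j, a_t, other = t
        a = norm(a_t)
        return ('plus', a, other) if j == 1 else ('plus', other, a)
    if tag == 'app':                       # (v)   (A)B in b when B in a -> b, A in a
        fb = norm(t[2])
        assert fb[0] == '->' and fb[1] == norm(t[1]), 'ill-normed application'
        return fb[2]
    if tag == 'proj':                      # (vi)  B(1) in a, B(2) in b
        fb = norm(t[2])
        assert fb[0] == 'tens'
        return fb[t[1]]
    if tag == 'plus':                      # (vii) B (+) C in (a (+) b) -> g
        nb, nc = norm(t[1]), norm(t[2])
        assert nb[0] == '->' == nc[0] and nb[2] == nc[2]
        return ('->', ('plus', nb[1], nc[1]), nb[2])
    raise ValueError(tag)

ident = ('abs', 'x', 'tau', ('var', 'x', 'tau'))
assert norm(ident) == ('->', 'tau', 'tau')
assert norm(('app', ('var', 'y', 'tau'), ident)) == 'tau'
```

The assertion in the (vii) branch enforces exactly the restriction discussed in 4.6.2: both summands must share the target norm γ.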
Below, we often omit type-labels in [x : A]B, i₁(A, B), i₂(A, B) and (C, A, B), just writing [x]B, i₁(A), i₂(A) and (A, B).

4.3. The reduction rules

4.3.1. We consider four groups of reduction rules.
(I) The introduction-elimination rules (IE-reductions) β, π and ⊕′ (see 2.6). Rule ⊕′ is particularly appropriate for AUT-Π₀, i.e. in connection with rule (vii′). For AUT-Π₁ we rather use rule ⊕.

(II) The ext-reductions η, σ and ε. Here we use the simple unrestricted version of σ: (C, A(1), A(2)) >σ A.
(III) Permutative reductions (p-reductions).

(→) (A)(B)([x]C ⊕ [y]D) > (B)([x](A)C ⊕ [y](A)D).
(⊗) ((A)([x]C ⊕ [y]D))(1) > (A)([x]C(1) ⊕ [y]D(1)), and similarly for the (2)-projection.
(⊕) D ≡ E ⊕ F ⇒ ((A)([x]B ⊕ [y]C))D > (A)([x](B)D ⊕ [y](C)D).

The general pattern of these rules looks like

O((A)([x]B ⊕ [y]C)) > (A)([x]O(B) ⊕ [y]O(C)),

where O is an operation on expressions, given in one of the following ways: O(B) ≡ (A)B, O(B) ≡ (B)(E ⊕ F), O(B) ≡ B(1) or O(B) ≡ B(2). The norms of these B's are respectively α → β, α ⊕ β and α ⊗ β. That is why the rules are coded (→), (⊕) and (⊗). In case the argument of O allows outside (i.e. ⊕-) reduction, the p-step does not produce a new equality: O((i₁(A))([x]B ⊕ [y]C)) > O(B[A]) ≡ O(B)[A] < (i₁(A))([x]O(B) ⊕ [y]O(C)). Below (6.2), it turns out that, generally, p-equality is generated by σηε-reduction.
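The (→)-rule can be sketched as a rewrite on tuple-encoded, label-free terms. The encoding ('app', A, B) for (A)B is our own, and the sketch assumes (as renaming of bound variables guarantees) that x and y do not occur free in A.

```python
def perm_arrow(t):
    """One (->)-permutative step at the root:
       (A)((B)([x]C (+) [y]D))  >  (B)([x](A)C (+) [y](A)D).
       Terms: ('var', v), ('abs', x, body), ('app', arg, fun), ('oplus', l, r).
       Assumes the bound variables do not occur in A (rename first otherwise)."""
    if t[0] != 'app':
        return None
    _, A, inner = t
    if inner[0] != 'app':
        return None
    _, B, s = inner
    if s[0] != 'oplus':
        return None
    _, abs_c, abs_d = s
    # push the outer application (A)_ under each binder
    push = lambda ab: ('abs', ab[1], ('app', A, ab[2]))
    return ('app', B, ('oplus', push(abs_c), push(abs_d)))

t = ('app', ('var', 'a'),
     ('app', ('var', 'b'),
      ('oplus', ('abs', 'x', ('var', 'x')), ('abs', 'y', ('var', 'y')))))
assert perm_arrow(t) == ('app', ('var', 'b'),
                         ('oplus', ('abs', 'x', ('app', ('var', 'a'), ('var', 'x'))),
                                   ('abs', 'y', ('app', ('var', 'a'), ('var', 'y')))))
```

The (⊗)- and (⊕)-rules follow the same pattern with O(B) ≡ B(j) and O(B) ≡ (B)(E ⊕ F) in place of O(B) ≡ (A)B.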
The above mentioned rules are the standard ones from proof theory. There it is formulated like this: if the conclusion of an ∨-elimination rule forms the major premise of an elimination rule, then the latter rule can be pushed upward through the ∨-elimination rule. E.g. our (→)-rule can be compared with the following proof theoretic reduction:

[derivation figure: from B : α ∨ β and derivations of γ → δ from the assumptions [α] and [β], ∨E yields γ → δ, which →E applies to A : γ to give δ; this is permuted to the derivation in which →E is performed inside both branches and ∨E concludes δ directly]

Both here and in proof theory the p-reductions are primarily introduced for technical reasons. However, as Pottinger [Pottinger 77] points out, there is some intuitive justification for them too. Part of it, that in some cases they do not extend the equality relation, is stated above. It has been suggested to allow other permutative reductions as well ([Pottinger 77], [Leivant 75]). However, in [Zucker 74] it has been shown that this spoils SN.

(IV) Improper reductions (im-reductions).
(im) (A)([x : B]C ⊕ [y : D]E) > C, (A)([x : B]C ⊕ [y : D]E) > E.
Notice that the set of free variables of an expression can be enlarged by performing an im-reduction. If an inside im-reduction takes place inside the scope of some bound variable, the latter variable has to be renamed in order to avoid any confusion. These reductions can be compared with Leivant's [Leivant 75] semi-proper reductions. They degenerate to what Prawitz calls immediate simplifications when x ∉ FV(C), resp. y ∉ FV(E).
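The remark on free variables can be made concrete. In a small sketch with tuple-encoded, label-free terms (our own encoding), the left im-step (A)([x]C ⊕ [y]E) > C frees the occurrences of x in C:

```python
def free_vars(t):
    """Free variables of a label-free term."""
    tag = t[0]
    if tag == 'var':
        return {t[1]}
    if tag == 'abs':                      # ('abs', x, body)
        return free_vars(t[2]) - {t[1]}
    if tag in ('app', 'oplus'):           # ('app', arg, fun), ('oplus', l, r)
        return free_vars(t[1]) | free_vars(t[2])
    raise ValueError(tag)

def im_left(t):
    """(A)([x]C (+) [y]E) >im C  (the left improper reduction)."""
    _, _, s = t
    _, abs_c, _ = s
    return abs_c[2]                       # the body C, with x now free

t = ('app', ('var', 'a'),
     ('oplus', ('abs', 'x', ('var', 'x')), ('abs', 'y', ('var', 'z'))))
assert free_vars(t) == {'a', 'z'}
assert free_vars(im_left(t)) == {'x'}    # x has become free
```

This is exactly why an inside im-step under a binder may force renaming: the freed x could otherwise be captured.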
4.3.2. One-step and many-step reduction. One-step reduction >₁ is, as usual, generated from the main or outside reductions given above by the monotonicity rules. Many-step reduction ≥ then follows by reflexivity and transitivity.

4.3.3. The usual substitution properties are valid, e.g.,

B >₁ B′ ⇒ B[A] >₁ B′[A] and A >₁ A′ ⇒ B[A] ≥ B[A′] etc.
4.4. Closure for AUT-Π₀

4.4.1. First notice that AUT-Π₀ is certainly not closed under η, because of the restrictive rule (vii′). So the proof below is intended for the η-less case.

4.4.2. Due to the simple type structure it is quite easy to show that norms are preserved under substitution and reduction, and hence that AUT-Π₀ is closed under reduction.

4.4.3. Substitution lemma for the norms. x ∈ α, A ∈ α, B ∈ β ⇒ B[x/A] ∈ β (and B[x/A] a term).

Proof. Ind. on length of B. □
4.4.4. Reduction lemma for norms. A ∈ α, A >₁ A′ ⇒ A′ ∈ α (this includes CL1).

Proof. Ind. on the definition of >. For β and ⊕′ use the substitution lemma. E.g. ⊕′: let A ≡ (i₁(A₁))([x]A₂ ⊕ [y]A₃), A ∈ α, A′ ≡ A₂[A₁]. Then, for some α₁, α₂: A₁ ∈ α₁, ([x]A₂ ⊕ [y]A₃) ∈ (α₁ ⊕ α₂) → α, so [x]A₂ ∈ α₁ → α, x ∈ α₁, A₂ ∈ α. So A₂[A₁] ∈ α, q.e.d. Or a permutative reduction: A ≡ ((A₁)([x]A₂ ⊕ [y]A₃))(1), A ∈ α, A′ ≡ (A₁)([x]A₂(1) ⊕ [y]A₃(1)). Then for some β, α₁, α₂: (A₁)([x]A₂ ⊕ [y]A₃) ∈ α ⊗ β, A₁ ∈ α₁ ⊕ α₂, x ∈ α₁, y ∈ α₂, A₂ ∈ α ⊗ β, A₃ ∈ α ⊗ β, so A₂(1) ∈ α, A₃(1) ∈ α and A′ ∈ α. □

4.4.5. Theorem (Closure). A ∈ α, A ≥ A′ (without η) ⇒ A′ ∈ α.

Proof. Ind. on ≥. □
4.5. The system AUT-Π₁

4.5.1. Instead of rule (vii′) it has the rule

(vii) B ∈ α → γ, C ∈ β → γ ⇒ B ⊕ C ∈ (α ⊕ β) → γ,

and it has ⊕ instead of ⊕′. Of course (vii′) is a special case of (vii), so indeed AUT-Π₁ contains AUT-Π₀. We can define a translation φ from AUT-Π₁ to AUT-Π₀ such that φ(A) ≥η A, which shows that AUT-Π₁ is not a very essential extension of AUT-Π₀. The translation is given by ind. on length. The only nontrivial clause is φ(C₁ ⊕ C₂) ≡ [x : Mα](x)φ(C₁) ⊕ [y : Mβ](y)φ(C₂), where C₁ ⊕ C₂ ∈ (α ⊕ β) → γ and Mα, Mβ are suitable fixed expressions of norms α, β, and x, y are chosen of norms α, β such that x ∉ FV(C₁), y ∉ FV(C₂), respectively. On variables, φ acts like the identity. For the rest, φ just commutes with the formation rules. Clearly, φ leaves the norm invariant and is indeed a translation into AUT-Π₀.
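The clause for φ can be sketched as follows. The sketch is untyped, so the type labels Mα, Mβ are omitted, and fresh names stand in for the variables x, y with x ∉ FV(C₁), y ∉ FV(C₂); the tuple encoding is our own.

```python
import itertools

fresh = (f'v{i}' for i in itertools.count())  # supply of fresh variable names

def phi(t):
    """Translate a general (+)-term into the restricted AUT-Pi_0 form
       [x](x)phi(C1) (+) [y](y)phi(C2); all other constructors commute.
       Terms: ('var', v), ('abs', x, body), ('app', arg, fun), ('oplus', l, r)."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'oplus':
        _, c1, c2 = t
        x, y = next(fresh), next(fresh)   # fresh, hence not free in c1, c2
        return ('oplus',
                ('abs', x, ('app', ('var', x), phi(c1))),
                ('abs', y, ('app', ('var', y), phi(c2))))
    if tag == 'abs':
        return ('abs', t[1], phi(t[2]))
    if tag == 'app':
        return ('app', phi(t[1]), phi(t[2]))
    raise ValueError(tag)

r = phi(('oplus', ('var', 'f'), ('var', 'g')))
x = r[1][1]
assert r[1] == ('abs', x, ('app', ('var', x), ('var', 'f')))
```

An η-step on each summand of the result recovers the original term, which is the sense in which φ(A) ≥η A.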
4.5.3. In the sequel we prove SN for some versions (i.e. with and without p-red. etc.) of AUT-Π₀. By the above properties we can easily extend the p- and im-less case to AUT-Π₁: AUT-Π₀ SN (with ⊕′) ⇒ AUT-Π₁ SN (with ⊕).

Proof. Let A be an AUT-Π₁ term. Use ind. on θ(φ(A)). □

But from SN with ⊕ follows SN with ⊕ and ⊕′, because each ⊕′-step can be simulated by a ⊕-step and a β-step, so θβ⊕ decreases under ⊕′-reduction. And, because AUT-Π₁ contains AUT-Π₀, we also get SN for AUT-Π₀ with ⊕ and ⊕′.
4.5.4. The postponement requirements. For AUT-Π₀- and AUT-Π₁-expressions it is quite straightforward to show the requirements (1), (2) of 3.13. E.g. let (A(1), A(2)) > A. Then A ∈ α ⊗ β. So A is not an inj-term, a ⊕-term, or an abstr-term. Etc.

4.6. The first order character of the systems

4.6.1. In [van Daalen 80, IV.1.5] we emphasized the importance of the property

μ((A₁)B) = μ((A₂)B), in particular μ((A₁)[x]B₂) = μ(B₂),

i.e. the functional complexity of (A)B does not depend on the argument A. Alternatively stated: it is of course possible that the different values of B have different types, but apparently there is a strong uniformity in these types, for the functional complexity of all the values is the same. In fact, we defined a system to be first-order if this property was present.
4.6.2. Generally, the introduction of ⊕-types and ⊕-terms might spoil this uniformity: we might be able to define functions completely differently on both parts of their domain. So, by "general" ⊕-functions the first-order property above gets lost. However, in AUT-Π₀, AUT-Π₁ and in AUT-Π the domain of ⊕-functions is explicitly restricted in such a way that the first-order property can be maintained, viz. by requiring:

(1) in AUT-Π₀ that μ(B) = μ(C) when forming [x]B ⊕ [y]C,
(2) in AUT-Π₁ that B ∈ α → γ, C ∈ β → γ when forming B ⊕ C,
(3) in AUT-Π that B ∈ α → γ, C ∈ β → γ when forming B ⊕ C.

As a consequence we still have μ((A₁)B) = μ((A₂)B), and in particular

μ((A)([x]B ⊕ [y]C)) = μ(B) = μ(C).
4.6.3. Now it will be clear that the generalized ⊕-rules of 2.7 would spoil the first-order character. Example: let A ∈ τ, B ∈ τ, C ∈ τ, D ∈ τ; then [x : A]C ∈ A → τ, [x : B]D ∈ B → τ. So [x : A]C ⊕ [x : B]D ∈ (A ⊕ B) → τ. So, if E ∈ A → C, F ∈ B → D, then (E ⊕ F) ∈ Π([x : A]C ⊕ [x : B]D). Clearly the functional complexity of (i₁(G))(E ⊕ F) for G ∈ A and (i₂(H))(E ⊕ F) for H ∈ B can be completely different, viz. that of C and D respectively.

4.6.4. It is possible that a notion of norm (i.e. simplified type) can be defined which is manageable and measures the functional complexity of these general ⊕-terms, but the present norm (and the corresponding SN proof) is certainly not suitable for this situation.

4.6.5. Remark: Strictly speaking, the suggested connection between the typing relation in AUT-Π and the norms in AUT-Π₀ has not yet been accounted for. The preceding statements have to be understood on an intuitive, heuristic level.

4.7. A proof of βπησ-SN

4.7.1. Here we show that the first β-SN proof of Ch. IV straightforwardly carries over to the case of βπ-SN. As our domain of expressions we take, e.g.,
the terms of AUT-Π₁.

4.7.2. SN-conditions for βπ. For non-main-reducing expressions (also called immune forms or IFs) it is sufficient for SN if all their proper subexpressions are SN. Incidentally this is also true for projection expressions (because main π-reduction amounts to picking a certain subexpression). So we have: A SN ⇒ A(1) SN, and the funny property: A(1) SN ⇔ A(2) SN.
We recall the SN condition for appl expressions in this case:

(A)B SN ⇔ A SN, B SN and (B ≥ [x]C ⇒ C[A] SN).

4.7.3. Heuristics: the dead end set of β. So, the substitution theorem for SN is again sufficient for proving SN (see IV.2.4). The crucial case of the substitution theorem for β-SN was where A is SN, B ≡ (B₁)B₂ is SN, B₂[A] ≥ [y]C, but for no C₀, B₂ ≥ [y]C₀. I.e. the reduction to square brackets form depends essentially on the substitutions. Then we used the square brackets lemma: B₂ ≥ (F̄)x, ((F̄)x)[A] ≥ [y]C.

We define the set Ex of these expressions (F̄)x symbolically by a recursion equation Ex = x + (U)Ex, where U stands for the set of all expressions, and it is of course understood that all expressions in Ex are again in AUT-Π₁. The expressions (F̄)x in B₂ can be considered as dead ends when one tries to copy the contractions leading from B₂[A] to [y]C, i.e. when one tries to come "as close as possible" to an abstr expression. We do not bother to make the concept of dead end more precise, or more general, but just give this informal explanation for naming Ex the dead end set w.r.t. x, β-reduction, and abstr expressions.
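Membership in Ex = x + (U)Ex just says that the expression is an application spine ending in the variable x, which a short sketch (tuple encoding assumed as before) makes explicit:

```python
def in_Ex(t, x):
    """Membership in Ex = x + (U)Ex: an application spine (F1)...(Fn)x
       whose head is the variable x.  ('app', arg, fun) encodes (arg)fun."""
    while t[0] == 'app':                  # peel off arguments
        t = t[2]
    return t == ('var', x)

assert in_Ex(('var', 'x'), 'x')
assert in_Ex(('app', ('var', 'a'), ('app', ('var', 'b'), ('var', 'x'))), 'x')
assert not in_Ex(('abs', 'y', ('var', 'x')), 'x')
```

The recursion equation unfolds into exactly this loop: either the term is x itself, or it is (A)B with B again in Ex.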
4.7.4. The dead end set of βπ. When one tries to copy a βπ-reduction sequence of B[A] in B one need not end up with an expression in Ex, but can, e.g., also end in x(1). The following theorem states that F defined by

F = x + F(1) + F(2) + (U)F

is the dead end set w.r.t. x, βπ and immune forms (IFs). Let ≥ stand for ≥βπ, and let * stand for [x/A].

Theorem. If B SN, B* ≥ C, C ∈ IF then B ≥ C₀, C₀* ≥ C with either
(i) C₀* non-main reduces to C, or
(ii) C₀ ∈ F.

Proof. Just like the square brackets lemma (second proof, IV.2.4.3), by ind. on (1) θ(B), (2) l(B). Let B* main-reduce to C (otherwise take B ≡ C₀). Then B ≡ x (and take C₀ ≡ B ∈ F), or B ≡ D(1), B ≡ D(2) or B ≡ (D₁)D₂. E.g. let B ≡ D(1). Then D* ≥ (D₁, D₂), D₁ ≥ C. Apply ind. hyp. (2) to D. In case (i), D ≥ (E₁, E₂), E₁* ≥ D₁, E₂* ≥ D₂, so B ≥ E₁, E₁* ≥ C. Then apply ind. hyp. (1) to E₁. In case (ii), D ≥ E₀, E₀ ∈ F, E₀* ≥ (D₁, D₂), and B ≥ E₀(1), E₀(1) ∈ F, E₀(1)* ≡ E₀*(1) ≥ C, so case (ii) holds for B too. □
Remarks: (1) Similarly we can prove a more general outer-shape lemma (see II.11.5.4) for βπ, where the condition "C ∈ IF" simply has been dropped.
(2) It is probable that such "standardization-like" theorems can also be proved without using SN (as in II.11).
4.7.5. Heuristics: the norms of dead ends. The point of the β-SN proof is:

B ∈ Ex ⇒ l(μ(B)) ≤ l(μ(x))

(where l is the length of the norm). So, if B[A] ≥ [y]C, then l(μ(y)) < l(μ(x)), and we can use ind. on norms in the crucial case of the substitution theorem. We are lucky that the same method works for βπ-reduction too. Namely:

4.7.6. The substitution theorem for βπ-SN.

Theorem. A βπ-SN, B βπ-SN ⇒ B[x/A] βπ-SN.

Proof. Ind. on (1) μ(A), (2) θβπ(B), (3) l(B). Let ≥ be ≥βπ. If B ≡ x then B[A] ≡ A, so SN. If B ∈ IF or B ≡ C(1) or B ≡ C(2) use ind. hyp. (3). If B ≡ (B₁)B₂ proceed as for β-SN, using the norm properties of the dead end set F. □

4.7.7. βπ-SN and βπησ-SN. An immediate corollary of the substitution theorem for βπ-SN is βπ-SN itself. Now we can extend this to βπησ-SN (as in II.7.2.5) using (βπ)-(ησ)-pp, a case of ext-pp (see II.9.2). The requirement for pp is indeed fulfilled (see 4.5.4).

VIII.5. Three proofs of βπ⊕-SN, with application to AUT-Π

5.1. A proof of βπ⊕-SN using p- and im-reductions

5.1.1. Here we show how the preceding SN-proof (based on the first version of the simple β-SN proof in Ch. IV) has to be modified in order to cope with ⊕ (or ⊕′). First we shall see how the norm considerations of that proof do not go through.
5.1.2. The dead end set for βπ⊕. Let ≥ be ≥βπ⊕. The following theorem states that the set G defined by

G = x + G(1) + G(2) + (U)G + (G)(U ⊕ U)

is the dead end set w.r.t. x, βπ⊕ and IFs. Let * stand for [x/A].

Theorem. Let B be SN, B* ≥ C, C ∈ IF, then B ≥ C₀ with either
(1) C₀* non-main reduces to C, or
(2) C₀* ≥ C, C₀ ∈ G.

Proof. As in 4.7.4, by ind. on (i) θ(B), (ii) l(B). □

Similarly, we can prove the corresponding outer shape lemma. The problem is now that the norm of the expressions in G is not related to the norm of x. E.g. consider the typical ⊕-dead end (x)(B ⊕ C).
5.1.3. Improving the dead end set by p-reduction. We restrict our domain of consideration to AUT-Π₀. Instead of rule ⊕ we choose rule ⊕′. Besides, we add permutative reductions. Then a great deal of the "bad guys" among the dead ends, i.e. those whose norm is not related to that of x, can be main reduced by a p-reduction. This will (in the next section) result in an improved dead end set H defined by

H = F + (F)(U ⊕ U)

with F as in 4.7.4.
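Membership in F and in the improved set H can be sketched in the same style as for Ex; the typical ⊕-dead end (x)(B ⊕ C) is in H but not in F:

```python
def in_F(t, x):
    """F = x + F(1) + F(2) + (U)F.
       Terms: ('var', v), ('proj', j, body), ('app', arg, fun), ('oplus', l, r)."""
    if t == ('var', x):
        return True
    if t[0] == 'proj':                    # F(1), F(2)
        return in_F(t[2], x)
    if t[0] == 'app':                     # (U)F: the function part is in F
        return in_F(t[2], x)
    return False

def in_H(t, x):
    """H = F + (F)(U (+) U): additionally, a member of F supplied
       as argument to a (+)-term."""
    if in_F(t, x):
        return True
    return t[0] == 'app' and t[2][0] == 'oplus' and in_F(t[1], x)

# the typical (+)-dead end (x)(B (+) C):
t = ('app', ('var', 'x'), ('oplus', ('var', 'B'), ('var', 'C')))
assert in_H(t, 'x') and not in_F(t, 'x')
```

The extra clause of H is precisely the shape that a p-main step followed by a ⊕′-main step can unblock once the head variable is substituted by an injection.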
5.1.4. Let ≥ be βπ⊕′p-reduction. The direct reducts of a p-main step are of the form (A)([x]O(B) ⊕ [y]O(C)) (see 4.3.1 for the definition of O), so they are never in one of the immune forms (abstr, inj, pair, plus).

Lemma. p-main reduction steps in a reduction to IF can be circumvented.

Proof. The last p-main step in a reduction to IF must be followed by a ⊕′-main step. However, this combination can be replaced by a single internal ⊕′-step. □

Corollaries.
(1) (B)([x]C₁ ⊕ [x]C₂) ≥ D, D ∈ IF ⇒ B ≥ iⱼ(A), Cⱼ[A] ≥ D (j = 1, 2).
(2) (B)C ≥ D, D ∈ IF ⇒ either (i) C ≥ [y]E, E[B] ≥ D, or (ii) B ≥ iⱼ(A), C ≥ ([x]C₁ ⊕ [x]C₂), Cⱼ[A] ≥ D (j = 1, 2).
(3) B(j) ≥ D, D ∈ IF ⇒ B ≥ (C₁, C₂), Cⱼ ≥ D (j = 1, 2).
Proof. Each of these reductions to IF can be replaced by one without p-main steps. □

Part of the two corollaries can be summarized (with O as in 4.3.1) by: if O(B) ≥ D, D ∈ IF, then

B ≥ C, C ∈ IF, O(C) ≥ D.

This gives another lemma.

Lemma. If O((B)([x]C₁ ⊕ [x]C₂)) ≥ D, D ∈ IF, then (B)([x]O(C₁) ⊕ [x]O(C₂)) ≥ D.

Proof. (B)([x]C₁ ⊕ [x]C₂) ≥ E, E ∈ IF, O(E) ≥ D. So B ≥ iⱼ(A), Cⱼ[A] ≥ E. But then (B)([x]O(C₁) ⊕ [x]O(C₂)) ≥ O(Cⱼ[A]) ≥ O(E) ≥ D, q.e.d. □

This proof amounts to: if an expression allows both p- and IE-main reduction then we can insert p-main followed by ⊕′-main before performing the IE-main step. Now we prove the theorem about the improved dead end set H. Let * stand for [x/A].

Theorem. If B SN, B* ≥ C, C ∈ IF, then B ≥ C₀, C₀* ≥ C with
(1) C₀* non-main reduces to C, or
(2) C₀ ∈ H.
Proof. As in 4.7.4, by ind. on (i) θ(B), (ii) l(B). Here θ refers to the current reduction βπ⊕′p. Let B* main reduce to C, B ≢ x. If the first main step can be mimicked in B use ind. hyp. (i). Otherwise, by ind. hyp. (ii), B ≥ O(D), D ∈ H, O(D)* ≥ C. If D ∈ F, then O(D) ∈ H and we are done. Otherwise D ≡ (D₃)([y]D₁ ⊕ [y]D₂), D₃ ∈ F. Then B properly reduces to E ≡ (D₃)([y]O(D₁) ⊕ [y]O(D₂)), E ∈ H, and by the previous lemma E* ≥ C, q.e.d. □

5.1.5. Improving the SN-conditions by im-reduction. The crucial SN-condition for βπ⊕′ (in AUT-Π₀) is: If
(1) A SN, B SN,
(2) B ≥ [x]C ⇒ C[A] SN, and, for j = 1, 2,
(3) B ≥ [x]C₁ ⊕ [x]C₂, A ≥ iⱼ(D) ⇒ Cⱼ[D] SN,
then (A)B SN. Now the p-reductions have improved our dead end set, but the problem is that
they make the SN-conditions quite complicated. E.g. in order to prove that (A)(B)([x]C₁ ⊕ [x]C₂) is SN we need that (A)C₁ is SN; in particular, if C₁ ≥ [y]E we need that E[A] is SN, etc. I.e. the SN-condition of (A)B ceases to be easily expressible in terms of direct subexpressions of reducts of A and B. In order to solve this problem we add im-reduction. But first we show that the dead end set is not changed by this addition.
5.1.6. The dead end set of βπ⊕′p,im. Luckily the dead end set remains H. Let ≥ stand for ≥βπ⊕′p,im. The first lemma of 5.1.4 can be maintained. For let a p-main step be followed by an im-main step. Then we can skip the p-main step and just apply the im-step internally. The next corollaries need an obvious modification, in particular: If (B)([x]C₁ ⊕ [x]C₂) ≥ D, D ∈ IF, then either
(1) B ≥ iⱼ(A), Cⱼ[A] ≥ D (for j = 1 or j = 2), or
(2) Cⱼ ≥ D (for j = 1 or j = 2).
And the property thereafter becomes: If O(B) ≥ D, D ∈ IF, then either
(1) B ≥ C, C ∈ IF, O(C) ≥ D, or
(2) O(B) ≥ (B′)([x]C₁ ⊕ [x]C₂), Cⱼ ≥ D (for j = 1 or 2).
But the second lemma of 5.1.4 remains unchanged. Namely, if an expression allows p-main reduction but also im-main reduction, then we can insert p-main followed by im-main before performing the im-main step.
So the theorem of 5.1.4, that the dead end set is still H, carries over too.

5.1.7. The new SN-conditions. The point of the im-reduction is that the SN-conditions for βπ⊕′p,im are identical with those for βπ⊕′ (see 5.1.5). First we give the SN-conditions of (B)([x]C₁ ⊕ [x]C₂). These are
(1) B SN, C₁ SN and C₂ SN, and
(2) B ≥ iⱼ(A) ⇒ Cⱼ[A] SN (for j = 1 and 2).
Proof. Let the above conditions be fulfilled. Use ind. on (1) θ(B), (2) l(B). The interesting case is when the first main step in a reduction is a p-step. So let B ≥ (B₃)([y]B₁ ⊕ [y]B₂); to prove that (B₃)([y](B₁)C ⊕ [y](B₂)C) is SN, with C ≡ [x]C₁ ⊕ [x]C₂. By ind. hyp. (1) or (2) we just need that B₃ is SN (trivial), that (Bⱼ)C is SN for j = 1, 2 and that (Bⱼ[D])C is SN, where B₃ ≥ iⱼ(D). Since B properly reduces to both Bⱼ and Bⱼ[D] (in case B₃ ≥ iⱼ(D)) we can use ind. hyp. (1) and get what we want. □

Theorem. The SN-conditions for βπ⊕′p,im are identical with those of βπ⊕′ (see 5.1.5).

Proof. Let (A)B fulfil the SN-conditions (1), (2), (3) of 5.1.5. We use ind. on θ(B). The interesting case is when the first main step is p. The case that B ≥ [x]B₁ ⊕ [x]B₂ has been done before, so let B ≥ (B₃)([x]B₁ ⊕ [x]B₂); to prove that (B₃)([x](A)B₁ ⊕ [x](A)B₂) is SN, i.e. that B₃ is SN, that (A)B₁ and (A)B₂ are SN and that (A)B₁[D], (A)B₂[D] are SN whenever B₃ ≥ iⱼ(D) (j = 1 or 2). Now B properly reduces to both Bⱼ and Bⱼ[D] (if B₃ ≥ iⱼ(D)) so we use the ind. hyp. and get what we want. □

In other words: we just need that the direct subexpressions and the IE-main reducts (not all the main reducts) are SN for proving that an expression is SN.

5.1.8. The substitution theorem for SN. Notation: we write μ(A) < (resp. ≤) μ(B) to abbreviate l(μ(A)) < (resp. ≤) l(μ(B)).

Theorem. B SN, A SN, μ(x) = μ(A) ⇒ B[x/A] SN.

Proof. Ind. on (I) μ(A), (II) θ(B), (III) l(B). The crucial case is when B ≡ (B₁)B₂ and B[A] IE-main reduces. If this first main step can be mimicked in B use the second ind. hyp. Otherwise we end up with (B₁′)C or (C)B₂′ with C ∈ H, and B₁ ≥ B₁′ or B₂ ≥ B₂′ ≡ [y]D₁ ⊕ [y]D₂, respectively. If C ∈ F, then μ(B₁′) < μ(C) ≤ μ(x), so a first main reduction of ((B₁′)C)[A] involves a substitution [z/E] with μ(z) ≤ μ(B₁′) < μ(x). And a first main-IE reduction step of ((C)B₂′)[A] must be a ⊕′-step, so it involves a substitution [z/E] with C[A] ≥ iⱼ(E). So in that case too μ(z) = μ(E) < μ(C) ≤ μ(x). Anyhow, if C ∈ F, we can use ind. hyp. (I). Otherwise C ≡ (C₃)([y]C₁ ⊕ [y]C₂),
with C₃ ∈ F. Then a p-step is possible and can be inserted before doing the main IE-step. This p-step can be mimicked in the reduction of B, so we can use ind. hyp. (II). □

5.1.9. SN for AUT-Π₀ and AUT-Π₁. Like before, an immediate corollary is βπ⊕′p,im-SN for AUT-Π₀, so βπ⊕′-SN for AUT-Π₀, whence βπ⊕-SN for AUT-Π₁. Then by pp we can extend the AUT-Π₁ result to βπ⊕ησ-SN. (Not for ε.)
5.1.10. An alternative method. Actually im-reduction can be avoided in this proof. Namely, the effect of p-reductions on the SN-conditions can be expressed by means of certain inductively defined sets. We define a set of expressions B! by

B! = B + (U)([x](B!) ⊕ U) + (U)(U ⊕ [x](B!)),

i.e. B! contains all those expressions that im-reduce to B. Then the SN-conditions for βπ⊕′ become: If
(1) B SN, C SN,
(2) B ≥ B′ ∈ A!, C ≥ C′ ∈ ([y]D)! ⇒ D[A] SN, and
(3) B ≥ B′ ∈ (iⱼ(A))!, C ≥ C′ ∈ ([x]C₁ ⊕ [x]C₂)! ⇒ Cⱼ[A] SN (j = 1, 2),
then (B)C SN.

5.2. A second proof of βπ⊕′-SN, using im-reduction

5.2.1. This proof is based on the second instead of the first β-SN proof of Ch. IV (Sec. IV.2.5, see also VII.4.5). There we did not use the square brackets lemma, and no dead end set, so we can do without p-reduction. Our language is again AUT-Π₀, and ≥ stands for ≥βπ⊕′,im.

5.2.2. Replacement theorem for SN. As explained in VII.4.5, the kernel of this type of proof is a replacement theorem, rather than a substitution theorem, for SN.

Theorem. If B SN, A SN, μ(x) ≤ μ(A), then B[x/A]LR SN.

Proof. By ind. on
(I) μ(A), (II) θ(B), (III) l(B).
We write * for [x/A]LR. Consider a reduction sequence B* >₁ ... >₁ F >₁ G, where the contraction leading from F to G is the first contraction not taking place inside some reduct of one of the inserted occurrences of A. Realize first that the number of those inside-A contractions is finite, because A is SN. Now we prove that G is SN. Distinguish two possibilities:

(a) The step F >₁ G does not essentially depend on the inserted A's and can be mimicked in B, i.e. B >₁ G₀, G₀* ≥ G. In this case we use ind. hyp. (II).

(b) Otherwise some reduct of some inserted A plays a crucial role in the redex contracted. If F > G is a π-step, then, e.g., B ≡ ...x...x(1)..., B* ≡ ...A...A(1)..., F ≡ ...A′...(C₁, C₂)(1)..., G ≡ ...A′...C₁.... Now form B₀ ≡ ...x...y... from B by replacing x(1) by a fresh y, with μ(y) = α₁ (where μ(x) ≡ α₁ ⊗ α₂). And B ≡ B₀[y/x(1)], so B₀ is SN, θ(B₀) ≤ θ(B), l(B₀) < l(B). So by ind. hyp. (II) or (III), B₀* is SN and B₀* ≥ G₀ ≡ ...A′...y... with G ≡ G₀[y/C₁]LR. Here G₀ is SN, C₁ is SN, μ(y) = μ(C₁), l(μ(y)) < l(μ(x)), so G is SN by ind. hyp. (I). If F > G is a β-step argue as in IV.2.5.3 or VII.4.5.6. If F > G is a ⊕′-step, the redex contracted is, e.g., (i₁(D))([y]C₁ ⊕ [y]C₂), reducing to C₁[D]. Now distinguish:

(b1) a reduct of an inserted A is crucial in i₁(D),
(b2) a reduct of an inserted A is crucial in ([y]C₁ ⊕ [y]C₂).

First case (b1). Then B ≡ ...x...(x)C₀..., C₀* ≥ [y]C₁ ⊕ [y]C₂, A ≥ i₁(D). By a norm argument the ⊕-term must be present in B already, so C₀ ≡ [y]E₁ ⊕ [y]E₂, E₁* ≥ C₁, E₂* ≥ C₂. Now form B₀ ≡ ...x...E₁.... This is an im-reduct of B, so it is SN, and by ind. hyp. (II) B₀* is SN, reducing to G₀ ≡ ...A′...C₁..., where G ≡ ...A′...C₁[D].... Clearly G₀ is SN, D is SN and l(μ(D)) < l(μ(x)), so G ≡ G₀[y/D]LR is SN by ind. hyp. (I). In case (b2), argue as in the π-case. Finally, let the redex contracted in F be an im-redex in which A plays a crucial role, i.e. B ≡ ...x...(C₀)x..., A ≥ [y]D₁ ⊕ [y]D₂, C₀* ≥ C, F ≡ ...A′...(C)([y]D₁ ⊕ [y]D₂)..., G ≡ ...A′...D₁.... Form B₀ ≡ ...x...y..., B ≡ B₀[y/(C₀)x]LR; so either by ind. hyp. (II) or (III) B₀* is SN, reducing to G₀ ≡ ...A′...y.... Clearly D₁ is SN, l(μ(D₁)) < l(μ(x)), so by ind. hyp. (I) G ≡ G₀[y/D₁]LR is SN. □
5.2.3. An immediate corollary of this replacement theorem is the ordinary substitution theorem. From this, as before, follows βπ⊕′im-SN for AUT-Π₀. So we get βπ⊕-SN for AUT-Π₁.
5.3. A proof of βπ⊕ησ-SN by computability

5.3.1. In this proof we do not include ησ by a pp-result afterwards, but consider these ext-reductions from the beginning of the proof on. We must consider AUT-Π₁ because AUT-Π₀ is not closed under η. Our definition of computability has been strongly inspired by de Vrijer's definition in [de Vrijer 75 (C.4)]. De Vrijer's definition is phrased in such a manner that the important properties:
(1) computability implies SN,
(2) computability is preserved under reduction,
follow almost immediately. Then, as usual, we prove by ind. on length that expressions are computable under substitution. Notice that we do not include ε.

5.3.2. The definition of computability. We write Cα for the set of computable terms of norm α. The set Cα is defined by induction on the length of α, as follows. Let B ∈ α. Then B ∈ Cα if B SN and the following requirements are fulfilled:
Notice that each clause in the definition of Cα only depends on Cβ's with β shorter than α.
5.3.3. We write C for the set of all computable expressions, the union of all the Cα's. By definition: A ∈ C ⇒ A SN. Each condition in the definition of computability of B has the form: B ≥ C ⇒ P(C), with P some condition on C. So computability is preserved under reduction.
5.3.4. Now we try to express the computability of an expression in terms of the computability of its subexpressions. First a lemma.

Lemma.
(1) [x]C ≥ [x]D ⇒ C ≥ D.
(2) (C, D) ≥ (E, F) ⇒ C ≥ E, D ≥ F.
(3) iⱼ(C) ≥ iⱼ(D) ⇒ C ≥ D (j = 1, 2).
(4) C ⊕ D ≥ E ⊕ F ⇒ C ≥ E, D ≥ F.

Proof. Without main reduction it is trivial. Otherwise it is η or σ. E.g. if (C, D) ≥ (E, F), then C ≥ (E, F)(1) ≥ E, D ≥ (E, F)(2) ≥ F, q.e.d. By the way, property (4) even holds in the presence of ε. □

Lemma (computability conditions).
(0) Variables are in C.
(1) A S N , C E C , D E C =+ ( A , C , D ) E C . (2) A SN, C E C
*
il(C,A ) E C , i2(C,A ) E C .
(3) C E C , D E C =s C @ D E C .
(4)
cEc *
C(1) E
c, C(2) E c .
+
(B)CEC.
(5) B E C , C E C
Proof. (0) is clear. ( l ) , (2), (3) by the previous lemma. (4) as follows: Let C E C , then C SN so C(j) SN. If C(j) 2 [y]D,then C 2 (C1,Cz) with Cj 2 [y]D. Each of the Cj is in C , so [y]Dsatisfies the required condition. Similar if C(j) 2 (01, D z ) ,C(j)2 i l ( D ) etc. Proof of (5): Let B , C E C so B, C SN. Induction on ,u(B). We first check the SN conditions. Let C 2 [y]Dlthen D[B]E C so SN. Or let B 2 i j ( D ) ,C 1 C1@C2, to prove that ( D ) C j is SN. Well, both Cj’s are in C , D E C and we can use the ind. hyp. to prove that ( D ) C j E C (so SN). Further, if ( B ) C 2 [ y ] E(or reduces t o ( E ,F ) etc.), this is only possible after a main step, so either via some D [ B ]with C 2 [y]Dor some ( D )Cj where B 2 i j ( D ) , C 1 C1 @ C2. Those expressions were in C so [ y ] E(and ( E ,F ) etc.) satisfy the required conditions. 0
5.3.5. Computability under substitution. For expressions [y]C such simple computability conditions cannot be given. We define an even stronger notion than computability.

Definition. B is said to be computable under substitution (cus) if

A₁, ..., Aₙ ∈ C, μ(xᵢ) = μ(Aᵢ) for i = 1, ..., n ⇒ B[x̄/Ā] ∈ C.

Some easy properties are
(1) B cus ⇒ B ∈ C (e.g. take n = 0), and
(2) B cus, B ≥ C ⇒ C ∈ C.
Then a lemma.

Lemma. Let μ(C) ≡ α₁ → α₂, and let F ∈ Cα₁ ⇒ (F)C ∈ Cα₂. Then C ∈ C.

Proof. Clearly C is SN. We use ind. on l(α₁). If C ≥ [y]D, F ∈ Cα₁, we must prove D[F] ∈ Cα₂. This holds because (F)C ≥ D[F]. If C ≥ D ⊕ E we must prove that D, E ∈ C. For i₁(F) ∈ Cα₁, (i₁(F))C ∈ C, so (F)D ∈ C. Now use the ind. hyp. Similarly for E. □

5.3.6. Lemma. B cus, C cus ⇒ [y : B]C cus.
Proof. Let C cus, B cus, Ā ∈ C of the right norms. Abbreviate [x̄/Ā] by *. We must prove that [y : B*]C* ∈ C. Well, B* ∈ C, C* ∈ C, so [y : B*]C* is SN. If [y : B*]C* ≥ [y : D]E, F ∈ C of the right norm, then we need that E[F] ∈ C. Because C is cus, C[x̄, y/Ā, F] ∈ C, which expression reduces to E[F], q.e.d. In particular, if C* ≥ (y)(E₁ ⊕ E₂), y ∉ FV(E₁ ⊕ E₂), we have that (F)(E₁ ⊕ E₂) ∈ C, so by the previous lemma E₁ ⊕ E₂ ∈ C, E₁ ∈ C, E₂ ∈ C, q.e.d. □

Theorem. All AUT-Π₁ expressions are cus.

Proof. Variables are cus by definition. Further use induction on length. For the abstr case use the previous lemma. For all the other cases use the lemma in 5.3.4. E.g. to prove that (B)C is cus: let * be as in the previous lemma. By ind. hyp. B* ∈ C, C* ∈ C, so (B*)C* ∈ C. □

Corollaries. (1) All AUT-Π₁ expressions are computable.
(2) All AUT-Π₁ expressions are βπ⊕ησ-SN. □
5.4. Strong normalization for AUT-Π

5.4.1. The normability of AUT-Π. In order to extend our results from AUT-Π₁ to AUT-Π we must first extend our definition of norm (see 4.2.3), and implicitly of normability, as follows:

μ(τ) ≡ τ,
μ(A) ≡ α → β ⇒ μ(Π(A)) ≡ α → β,
μ(A) ≡ α → β ⇒ μ(Σ(A)) ≡ α ⊗ β,
A, B of degree 2 ⇒ μ(A ⊕ B) ≡ μ(A) ⊕ μ(B).

And we must say what the norms of the variables are:

μ(x) := μ(typ(x)).
Our definition of normability here is modelled after the normability definition of AUT-QE (weak normability), in particular as far as the handling of 2-variables is concerned. For details see IV.4.4-IV.4.5. First we define norm inclusion ⊂:

(1) α a norm ⇒ α ⊂ τ,
(2) α ⊂ β ⇒ (γ → α) ⊂ (γ → β).

Then we say that A fits in B (notation A fin B) if

degree(A) = 3 ⇒ μ(A) = μ(B),
degree(A) = 2 ⇒ μ(A) ⊂ μ(B).

Now we define the norm of constant expressions:

Ā fin C̄[Ā] ⇒ μ(c(Ā)) := μ(typ(c)[Ā]),
Ā fin C̄[Ā] ⇒ μ(d(Ā)) := μ(typ(d)[Ā]),

where x̄ ∈ C̄ is the context of the scheme in which c (resp. d) was introduced. We want to show that correct expressions are normable, and of course that whenever A ∈ B, A fits in B. In view of the instantiation rule and the fact that norms can change under substitution (for 2-variables) we prove, as in Ch. IV.4.5, a kind of normability under substitution.

Theorem. If Ā fin B̄[Ā], x̄ ∈ B̄ ⊢ C ∈ D, then C[Ā] fin D[Ā] (note that "fitting in" implies the normability of the expressions involved).

Proof. Ind. on correctness. □

Corollary. ⊢ C ∈ D ⇒ C fin D (so C, D normable). □
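Norm inclusion and the fitting relation are directly executable. A small sketch, with the same tuple encoding of norms and the degree passed in as a parameter (both our own illustrative choices):

```python
def included(a, b):
    """Norm inclusion: a is included in tau for every norm a, and
       a included in b implies (g -> a) included in (g -> b)."""
    if b == 'tau':
        return True
    return (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == '->' and b[0] == '->'
            and a[1] == b[1] and included(a[2], b[2]))

def fits_in(degree, mu_a, mu_b):
    """A fin B: equality of norms at degree 3, inclusion at degree 2."""
    return mu_a == mu_b if degree == 3 else included(mu_a, mu_b)

# tau -> (tau -> tau) is included in tau -> tau (drop the final argument)
assert included(('->', 'tau', ('->', 'tau', 'tau')), ('->', 'tau', 'tau'))
assert fits_in(2, ('->', 'tau', 'tau'), 'tau')
```

This reflects the AUT-QE-style type inclusion: a degree-2 expression may fit in a shorter telescope of its own norm.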
5.4.2. Note: By the above defined concept of normability lots of expressions become normable which are certainly not correct in AUT-Π. E.g. (A)(Π([x : B]C)), with μ(A) ≡ μ(B), and (Σ(B))(1), with μ(B) ≡ β₁ → β₂. This is a consequence of the fact that AUT-Π is handled just like AUT-QE: Π's are (as regards norms) ignored, and Σ's are in some sense identified with pairs.

5.4.3. Extending the SN-result to AUT-Π. Clearly the presence of non-reducing constants such as Σ, Π (for 2-expressions), and τ does not harm the SN-results of the previous sections. We just have to add δ-reduction. The substitution (resp. replacement) theorem for SN can easily be extended because δ-contractions in B[x/A](LR) either take place inside A or can be mimicked in B already. Then we can proceed as in IV.4.6 or directly prove B normable ⇒ B SN, by ind. on
The language theory of Automath, Chapter VIII, Section 6 (C.5)
(1) date(B). [For a definitional constant, date(d) = date(def(d)) + 1. The date of an expression is the maximum of the dates of the definitional constants that occur in it. So induction on date can be considered "induction on definitions".]
(2) l(B). [The length of B.]
The new case is when B ≡ d(C̄). The Ci's are SN by ind. hyp. (2). Further we want that def(d)[C̄] is SN. Well, def(d) is SN by ind. hyp. (1), and def(d)[C̄] = def(d)[C1]...[Cn]. So by iterated use of the substitution theorem we are done. Later we can add η, ηπ by pp.
Alternatively we can extend the SN proof by computability to the present case, viz. by leaving the definition of computability unmodified and proving computability under substitution by induction on (1) date, (2) length. In particular, let A1, ..., An ∈ C be of the right norms, let * stand for [x̄/Ā], and let B1*, ..., Bn* ∈ C. Then we must prove that d(B̄)* ∈ C. The Bi*'s are SN. By ind. hyp. (1) def(d) is computable under substitution, so def(d)(B̄*) ∈ C, so SN. Further, if d(B̄*) ≥ [y]E (or (E)F etc.) then this reduction passes through def(d)[B̄*] (which was in C). So, finally, we have βπσηδ-SN for AUT-Π.
VIII.6. Some additional remarks on AUT-Π

6.1. The connection between AUT-QE and the abstr part of AUT-Π
Here the abstr part of AUT-Π is the part generated by the general rules (2.2.1, 2.2.2) and the specific rules group I (2.3). If it were not for the role of Π and the rule of product formation, this part of AUT-Π would be identical to AUT-QE. In the introduction to this chapter we mentioned already that the rule of type-inclusion is somewhat stronger than the rule of product formation. This means that the obvious translation out of AUT-Π, viz. just skipping the Π's, produces correct AUT-QE, but not all of AUT-QE. Namely, without Π, the rule of product formation becomes

(I)  φ E [x : α] τ ⇒ φ E τ,

which is just a specific instance of the type-inclusion rule

(II)  φ E [ȳ : β̄][x : α] τ ⇒ φ E [ȳ : β̄] τ.
Let us see whether sensible use of (I) can yield something like (II). So let ⊢ φ E [ȳ : β̄][x : α] τ. Then ȳ E β̄ ⊢ (ȳ) φ E [x : α] τ (where (ȳ) consists of the (yi)'s in the reversed order). So by (I) ȳ E β̄ ⊢ (ȳ) φ E τ, and by iterated use of the abstr rule we get ⊢ φ+ E [ȳ : β̄] τ, with φ+ ≡ [ȳ : β̄] (ȳ) φ. Clearly
φ+ ≥η φ,
which indicates that AUT-QE is not a very essential extension of the image of AUT-Π under the translation. Compare [de Bruijn 77], [de Bruijn 78c (B.4)].

6.2. The CR problem caused by ε
In Ch. II we gave a counterexample for βε-CR. Namely, [x]x and [y]i1(y) ⊕ [y]i2(y) are distinct βε-equal normal forms (just two different ways to write the identity on a ⊕-type). This suggests to save CR by adding ε alt (see 2.6):

[x] B[i1(x)] ⊕ [x] B[i2(x)] > [x] B.
However, ε alt and β interfere in a nasty way: [x](... (x) F ...) ⊕ [x](... (x) G ...) <β [x](... (i1(x)) (F ⊕ G) ...) ⊕ [x](... (i2(x)) (F ⊕ G) ...) >ε alt [x](... (x) (F ⊕ G) ...), so this does not help. In principle, CR is not too important for our purpose; we rather need a good decision procedure for definitional equality. Just like (in V.4) we suggested to implement η-equality by the rule

(x) F Q G ⇒ F Q [x] G,
we conjecture here that we would generate full equality (including ε) by adding

(i1(x)) F Q (x) G, (i2(x)) F Q (x) H ⇒ F Q G ⊕ H.

But in order to guarantee the well-foundedness of such an algorithm, we of course need some kind of strong normalization result which applies to the present situation. The general pattern of the counterexample to ε alt-CR reads

[x] O((x) F) ⊕ [x] O((x) G) Q [x] O((x) (F ⊕ G)),

where O is a very general operation on expressions, e.g. O(E) ≡ (D) E in (D)(A)([x]B ⊕ [x]C). This shows that extensional equality generates the equality induced by permutative reductions (Sec. 4.3):

O((A)([x]B ⊕ [x]C)) Q (A)([x] O((x) [x]B) ⊕ [x] O((x) [x]C)) Q (A)([x] O(B) ⊕ [x] O(C)).
6.3. The SN-problem caused by ε
We strongly believe that SN holds for the full AUT-Π reduction (including ε), and that there are just some technical problems which prevent the proofs of
the preceding section from applying to that situation. We briefly sketch why each of the three proofs fails in the presence of ε. The problem with the first proof (5.1) is that the dead end set for, e.g., βε-reduction is not so easy to describe. E.g. [y]((i1(y)) x) F ⊕ [y](i2(y)) F is a typical dead end for βε. Of course βη- or βσ-dead ends are not manageable either, but ησ can be included afterwards, using pp. Then the second proof (5.2). An ε-redex [y](i1(y)) F ⊕ [y](i2(y)) F can be created by substitution [x/A] in two different ways:

(1) from x ⊕ [y](i2(y)) F, with A ≡ [y](i1(y)) F (and similarly with the right hand part);

(2) from [y](i1(y)) F1 ⊕ [y](i2(y)) F2, F1[A] ≡ F, F2[A] ≡ F.

In case (1) we are suggested to replace x ⊕ [y](i1(y)) F by a single variable z, and to introduce a new substitution [z/F]. However, l(ρ(z)) > l(ρ(x)), which does not fit in the proof at all. But we can remove this case by just considering AUT-Π0. Case (2) does not pose a problem: the substitution plus reduction can be simulated by reduction plus substitution, starting from [y](i1(y)) F0 ⊕ [y](i2(y)) F0, where both F1 and F2 can be constructed from F0 by substituting A for some of the free x's. Besides, the second proof is based on replacement. This means that the ε-redex above can also be created from, e.g.,

(3) [y](A) F ⊕ [y](i2(y)) F, with A ≡ i1(y), or its right-hand analogue.

These two expressions do not reduce, unless we switch to a generalized form of ε (which does not solve the problem, though; see below). Finally the computability method (5.3) fails because the property F ∈ C, G ∈ C ⇒ F ⊕ G ∈ C is not so easy anymore. For, let F ≥ [x](i1(x)) [y]D, G ≥ [x](i2(x)) [y]D. Then we just know that A ∈ C ⇒ D[i1(A)] ∈ C, D[i2(A)] ∈ C, but we want that D[A] ∈ C for general A ∈ C. We have tried to adapt the second SN-proof to this situation, viz. by restricting to AUT-Π0, and by introducing a liberal version of ε alt, named ε′:
[y] F[i1(y)] ⊕ G > [y] F,   G ⊕ [y] F[i2(y)] > [y] F.
This can be considered a kind of improper reduction, in the sense that it identifies expressions which in the intuitive interpretation correspond to different objects. A typical way of creating a new ε′-redex is, e.g., from [y]x ⊕ G by the replacement [x/i1(y)], reducing to [y]y. One can indeed mimic this by first reducing to [y]z, and then applying a new replacement, viz. [z/y]. But the norm of this new z is longer than that of the old one.
The Language Theory of Λ∞, a Typed λ-Calculus where Terms are Types
L.S. van Benthem Jutting

1. INTRODUCTION
In the present paper we present the theory of a system of typed λ-calculus Λ∞, which is essentially the system introduced in [Nederpelt 73 (C.3)]. Its characteristic feature is that any term of the system can serve as a type. The main difference between the two systems is that our system only allows for β-reduction, while Nederpelt's system has η-reduction as well. The importance of Λ∞ lies in the fact that it may be considered as basic to the Automath languages. Therefore its theory can also be seen as basic to the theory of Automath [de Bruijn 80 (A.5)], [van Daalen 80]. In our notation we will follow the habits of Automath, that is: for terms u and v, types α and variables x we will denote

λx:α u by [x : α] u  and  u(v) by (v) u.
The system consisting of such terms will be called Λ. The system Λ∞ is the subset of Λ to which a term (u)v belongs only if v is a function, and if the domain of v and the type of u have a common (β-)reduct. Our main theorems will be:
(1) Church-Rosser for Λ. This will be proved along the lines of well-known proofs by Tait and Martin-Löf [Martin-Löf 75a]. (2) Strong normalization for a subsystem of "normable terms" in Λ. Our proof will be along the lines of proofs in [Gandy 80] and in [de Vrijer 87c] for strong normalization in simply typed λ-calculus.
(3) Closure of Λ∞ under (β-)reduction. For this we have a new direct proof, though the theorem has been proved previously in [van Daalen 80], see [C.5].
Moreover, we prove that the terms of Λ∞ are "normable" in the sense intended above; therefore those terms strongly normalize. This, together with correctness of types, implies that Λ∞ is decidable. In our presentation we will use "nameless variables" as suggested in [de Bruijn 72b (C.2)]. That is, our variables will not be "letters from an alphabet" but "references to a binding λ", or rather, because of our notational habits, "references to a binding square brackets pair". In order to grasp the use of nameless variables one should note that terms can be interpreted as trees. Consider e.g. the term:
[x : α] (x) [y : β] (y) x

The corresponding tree is
Figure 1 In this tree the bindings may be indicated by arrows, omitting the names of the variables:
Figure 2
The language theory of Λ∞ (C.6)
And here, again, the arrows may be replaced by numbers, indicating the depth of the binding node to which the arrow points as seen from the node where the arrow starts (only binding nodes, indicated by "○", are counted!):
Figure 3

This last tree can again be represented in a linear form:

[α] (1) [β] (1) 2
Note that the same variable x in the first term (or tree) is represented in the "nameless" term (or tree) once by 1 and once by 2, whereas the same reference 1 in the "nameless" representation once denotes x and once y. Both the name-carrying and the nameless linear representation can be considered as formalizations of the underlying intuitive notion of "tree with arrows". The presentation with nameless variables makes the notion of α-conversion superfluous (and even meaningless). Thereby the definition of operations where a "clash of variables" might arise (e.g. substitution) becomes more definite, and the proofs more formal. The drawbacks of this presentation might be a loss of "readability" of the formulas, and the need for a number of technical lemmas for updating the references involved in certain formula manipulations. In our presentation frequent use will be made of inductive definitions (e.g. the definitions of term, of substitution, of reduction and of Λ∞). Subsequently, proofs are given by induction with respect to these definitions. This should always be understood in the sense of "induction with respect to the number of applications of a clause in the definition", or, in other words, "induction with respect to the derivation tree". This concept is not formalized here.
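The translation from name-carrying to nameless terms described above can be written out as a short program. This is an illustrative sketch, not part of the paper; the tuple encoding and the function name are mine.

```python
# Ad hoc encoding: ("const", c) for constants such as the types a and b;
# ("var", x) for a named variable; ("app", u, v) for (u) v;
# a named abstraction [x : t] b is ("abs", x, t, b);
# a nameless abstraction [t] b is ("abs", t, b); references are ("ref", n).

def nameless(term, binders=()):
    """Replace each variable by the depth of its binding node, counted
    from the variable's position (1 = innermost enclosing binder)."""
    kind = term[0]
    if kind == "var":
        # the innermost binder carrying this name wins (handles shadowing)
        pos = max(i for i, n in enumerate(binders) if n == term[1])
        return ("ref", len(binders) - pos)
    if kind == "app":
        return ("app", nameless(term[1], binders), nameless(term[2], binders))
    if kind == "abs":
        _, name, ty, body = term
        return ("abs", nameless(ty, binders), nameless(body, binders + (name,)))
    return term  # constants are untouched

# The example term [x : a] (x) [y : b] (y) x from the text:
t = ("abs", "x", ("const", "a"),
     ("app", ("var", "x"),
      ("abs", "y", ("const", "b"),
       ("app", ("var", "y"), ("var", "x")))))
```

Applied to t, this yields the linear form [a] (1) [b] (1) 2: the two occurrences of x become 1 and 2, while the reference 1 denotes x in one place and y in the other, exactly as observed above.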
2. PRELIMINARIES AND NOTATIONS

In our theory we will use some notions of intuitive set theory. N will denote the set of natural numbers {0, 1, 2, 3, ...}, N+ the set of positive natural numbers {1, 2, 3, ...}, and N∞ = N ∪ {∞}, the set N extended with infinity. The predecessor function is extended to N∞ by defining ∞ − 1 := ∞. For n ∈ N we define Nn := {k ∈ N+ | k ≤ n}, so N0 = ∅, the empty set. Let A and B be sets. Then A × B denotes the Cartesian product of A and B, that is the set of pairs [a, b] where a ∈ A and b ∈ B; and A → B denotes the set of functions with domain A and values in B. If f ∈ A → B and a ∈ A then (a) f will denote the value of f at a; and if for a ∈ A we have b(a) ∈ B then [a ∈ A] b(a) will denote the corresponding function, that is the set {[a, b(a)] ∈ A × B | a ∈ A}. As a consequence of our notation for the values of a function, our notation for the composition of functions will be a little unusual: if f and g are functions with domains A and B respectively, then

f ∘ g = [x ∈ C] ((x) f) g,  where C = {x ∈ A | (x) f ∈ B}.

So (x)(f ∘ g) = ((x) f) g for x ∈ C. If A is a collection of sets then ∪A denotes the union of A. If A is any set and n ∈ N then A(n) denotes Nn → A, i.e. the set of finite sequences of elements of A with length n. In particular A(0) = {∅}, where ∅ denotes the empty sequence. A* will denote ∪ {A(n) | n ∈ N}, that is the set of all finite sequences of elements of A. If s ∈ A* then L(s) is the length of s; and if s1 ∈ A* and s2 ∈ A* then s1&s2 denotes the concatenation of s1 and s2. In particular, ∅&s = s for s ∈ A*. If a ∈ A we will often confuse a with {[1, a]}, that is the element of A(1) with value a. In particular, if a ∈ A and s ∈ A*, then a&s ∈ A*, and moreover:

(1)(a&s) = a, and (n + 1)(a&s) = (n) s for n ≤ L(s).

Where no confusion is expected we will often omit the symbol "&". For the updating of references we will use the following functions and operations on functions:
For m ∈ N:

φ_m = [n ∈ N+] (n + m).

For m ∈ N:

ϑ_m = [n ∈ N+] τ(m, n),  where  τ(m, n) = n + 1 if n ≤ m;  1 if n = m + 1;  n if n > m + 1.

For m ∈ N and ψ ∈ N+ → N+:

ψ^{(m)} = [n ∈ N+] σ(ψ, m, n),  where  σ(ψ, m, n) = n if n ≤ m;  (n − m) ψ + m if n > m.
It follows that φ_0 = ϑ_0 = [n ∈ N+] n, the identity on N+, and that for ψ ∈ N+ → N+ we have ψ^{(0)} = ψ. Note that φ_m and ϑ_m are injective, and that if ψ is injective then so is ψ^{(m)}. Simple computation shows that the following lemmas hold:
Lemma 2.3. If k ∈ N and ψ1, ψ2 ∈ N+ → N+ then ψ1^{(k)} ∘ ψ2^{(k)} = (ψ1 ∘ ψ2)^{(k)}. □
Lemma 2.4. If k, m ∈ N and n ∈ N+ then

(n) φ_m^{(k)} = n if n ≤ k;  n + m if n > k. □
Lemma 2.5. If k, l, m ∈ N then
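The updating functions reconstructed above can be sketched as executable code. This is an illustrative sketch of my own (function names are mine); it lets Lemma 2.4 be checked pointwise.

```python
def phi(m):
    """phi_m = [n in N+](n + m): shift every reference up by m."""
    return lambda n: n + m

def theta(m):
    """theta_m: n <= m goes to n + 1, m + 1 goes to 1, larger n are fixed."""
    return lambda n: n + 1 if n <= m else (1 if n == m + 1 else n)

def lift(psi, m):
    """psi^(m): references <= m (bound by the last m binders) are kept,
    the references beyond those binders are updated by psi."""
    return lambda n: n if n <= m else psi(n - m) + m
```

With these definitions phi(0) and theta(0) are the identity on N+, and lift(phi(m), k) maps n to n for n <= k and to n + m for n > k, as Lemma 2.4 states.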
3. TERMS, TRANSFORMATION AND SUBSTITUTION

We define the set of terms Λ inductively as follows:

Definition 3.1.
(1) τ ∈ Λ
(2) if n ∈ N+ then n ∈ Λ
(3) if u, v ∈ Λ then (u) v ∈ Λ
(4) if u, v ∈ Λ then [u] v ∈ Λ  □
0
Clearly if u E A then $21 E A. Moreover qU=T
iff u = ~ ,
@=m
iff u = 21 and
- = (vl)v2 $u
iff u = (ul)u2, $ul = v l and $2~2= v2
- = [vl] v2 $u
iff u = [ul]212, $u1= v l and $(l)u2 = v2
( T I ) $=
m
, ~
It follows that for injective $, $JU = $v implies u = v. -
Lemma 3.1. Zf$1,@2 E N + -+ N + , u $1 $2u = $2 0$1u
--
~
*
E
A then
, .
The language theory of Am (C.6)
66 1
Proof. By induction on u.
0
For u, v E A, k E N + we define substitution of u in v at k , denoted by as follows:
xi v
Definition 3.3.
xi
Clearly, again, if u,u E A then v E A. Now we have the following technical lemmas: L e m m a 3.2.
xi v = x r
LPk-lu
29k-1~. __
Proof. By induction on v. L e m m a 3.3.
11 ~7
-
v=
0
tltu
@v.
Proof. By induction on v. L e m m a 3.4. If m
then
0
xi (Pmv = xi-mv . -
(Pm
Proof. By Lemma 3.2 and Lemma 2.1. L e m m a 3.5. If m
+ 1 2 k > 1 then xi &v
-
=(P;-~V.
Proof. By induction on v. Corollary 3.5. Zf m 2 k then
xi -~ pmv = vm-lv.
These lemmas are used to prove the following theorem: T h e o r e m 3.1. Substitution theorem.
Z f m 2 k thenx:
xi w = x kE L + l x1+1W '
Proof. By induction on w.
L.S. van Benthem Jutting
662
The relevant case is when w = 12. If n = k then
and on the other hand
If n = rn
+ 1 then
c:+1
= Ck
x;-k+l'
'w
vnu = 'p.,-?u
-
For other values of n the proof is straightforward.
by Lemma 3.5
. 0
4. REDUCTION
We define on A the relation -+,called one step reduction. Definition 4.1. (1)
(4['wl v
+
CY
2,.
If u --* v then
The relation
-,,on A is the reflexive and transitive closure of +, defined by
Definition 4.2. (1) u+u.
(2) If u + v and v
+
w then u + w
The language theory of A,
on
(C.6)
663
It is easily seen that the relation ++is transitive and monotonic. By induction $v, the following technical lemma is proved: 21 -+ v, respectively $u + -
Lemma 4.1. If u + v then for any $ we have - implies u + v . if 4 is injective then $u - + +v
$21 .
+
$v ; 0
Another technical lemma: Lemma 4.2. If $u -
+v
then for some w we have v = $w - and u + w.
Proof. By induction on $u -
+ v.
0
Finally it is easily shown that if [ u l ] u 2 u 2 * v2.
+
v then v = [vl]v2, u1
+
v l and
5. THE CHURCH-ROSSER THEOREM
We define on A the relation
> called
nested one step reduction.
Definition 5.1. (1) u > u . If u 2 u1 and v 3 v l then (2)
(414v 3 EY1v l
(3) (u) v 3
(211)
vl
(4) [u]v 3 [ U l ] v l .
5
2 denotes the transitive (and - of course - reflexive) closure of easy inductive argument it is seen that u 2 v iff u --H v. The following technical lemma is proved by induction on u 3 v.
>. By
Lemma 5.1. If u 3 v then for any $ $u 3 +v. -
an
5
Now we are able to prove two lemmas on substitution. Lemma 5.2. If u 3 u l then
xt v 3 xi1V.
Proof. By induction on v it is proved that
xt v 3 Ct1v for any k.
0
L.S. van Benthem Jutting
664
Lemma 5.3. Substitution lemma for >. Zf u 3 u1 and v 2 v l then v> vl.
xi
xi1
Proof. By induction on v 3 v l it is proved that Lemma 5.2 and Theorem 3.1 are used.
xi v 3 xi1v l for any k.
Using these lemmas we can prove the diamond property for
0
>.
Lemma 5.4. Diamond lemma for >. If u 2 u l and u 3 u2 then there exists a term v such that u l
> v and u2 3 v.
Proof. By induction on u 3 u1 and u 2 u2,using Lemma 5.3.
0
As a corollary we have:
Theorem 5.1. Church-Rosser theorem for +. If u -M u1 and u + u2 then there exists a term v such that u1 +,v and u2 + v.
0
6. NORMS, NORMING FUNCTIONALS AND MONOTONIC
FUNCTIONALS A term u E A is called normal if u -n v implies u = v. A reduction sequence of u is a finite or infinite sequence uo, ul,212, ... such that uo = u and un-l -+ un for n E lN+. We say that u strongly normalizes if all reduction sequences of u are finite. This is the case, by Konig’s lemma, iff there is a uniform upperbound to the lengths of the reduction sequences of u. We will prove strong normalization for a subset of A, the set of normable terns. Our proof extends proofs in [Gandy 801 and [de Vrijer 87c] for strong normalization in simple type theory. It is based mainly on de Vrijer’s “quick proof”; we refer also to that proof for comments. We define the set F of norms recursively as follows:
DeAnit ion 6.1. (1) lN E F
(2) if a , @E F then a
-@
:= ( a
-+
0)x EV
E F.
It is clear that, for a,@€F , a =par a n p = 0 . The elements of UF will be called norming function&.
0
For any norming
The language theory of A,
(C.6)
665
functional f the norm to which f belongs is denoted by the projection operators:
fT.
Moreover, we define
if f = n, n E N then f' = n , if f = [g,nl, [g,nl E a
- p then f' = g and f'
=n.
Let f be a norming functional, m a natural number. We define the norming functional f m as follows:
+
Definition 6.2.
+ m = n + m.
(1) If f E N , f = n then f
(2) If f E a
- P, f = [g,nl then f + m
Thus for f E a we have f
= [ [ hE a]((h)g
+ m),n + ml.
0
+ m E a and
(f + m)' = 'f + m , ( h )(f + m)' = ( h )f'+ m if a = p
-
y and h E
0.
Note that + extends addition on the natural numbers. For a E F and n E IV we define the norming functional c; E a.
Definition 6.3. (1) :c
(2)
=n
4-7 = "h E PI
C;[.+n,
n1'
Thus c;*
=n ,
( h )(@-7)'
= c;.+~
if h E
,
Note that c;
+ m = c+;,
.
Now let a be a norm. We define a subset ao of a and a relation simultaneous inductive definition.
Definition 6.4. (1) N O = N ;for f,g E
N o , f < g iff 'f < g*
< on ao by a
L.S. van Benthem Jutting
666
We define G := {aola E F } ; the elements of G will be called monotonic functionals. Note that < on N o is the order on the naturals. The following facts are easily proved: If f , g , h E a', f < g and g < h then f < h. Iff,gEoo,mEMthenf+mEaoandiff
< n then c g < c:.
7. STRONG NORMALIZATION We will assign to certain terms u E A a functional in U F , which will be called the norming functional of u. In order to define it we need a sequence 9 E ( u F ) * ;9 may be thought of as an administration of the functionals assigned to the free variables of u. f n ( u , 9 ) will denote the norming functional of u. It may be the case that f n ( u , 9 ) is undefined. This will be denoted by fn(u, 9)= 0.Terms u for which fn(u, 9)# 0 for some 9 E (uG)* will be called nomable.
Definition 7.1. (1) fn(7,O) = 0 .
(fn(v, 9)) fn(w, 9)'
if fn(v, 9)#
,
fn(w, 9)# 0 and dom(fn(w, 9)')= fn(v, 0)t ;
l o
otherwise.
The language theory of A,
((2.6)
667
+ + fn(v, @)* + 1,fn(w,@)* -t
[ [ hE a]fn(w, h&@) h* (4) fn(b1 w,@) =
+ fn(w,ct&@)*l
if fn(w, @) # 0 , fn(v, @)t= a and fn(w,h&@) # 0 for h E a ; otherwise.
0
It will be clear from Lemma 7.5, which will be proved presently, that for normable terms u the number fn(u, @)* is a n upperbound for the lenghts of the reduction sequences of u. Note that if fn((u) [w]v,9)# 0 then fn(u, @)t= fn(w, @)t. Our first lemma expresses that it only depends on the norms of the functionals in @ whether fn(u, @) is defined and, if so, what is the value of fn(u, 0)t.
Lemma 7.1. Zf @1,@2 E ( U F ) * , L(@l) = L(@2) = n and (k)@lt= (k)@2t 5 n then either
for k
fn(u, 91) = fn(u, @2)= 0
,
OT
fn(u, @l)t= fn(u, @2)t
. 0
Proof. By induction on u. The following technical lemma is also proved by induction on u.
Lemma 7.2. If@ E ( U F ) ' , II, E N + + IV+ and II, o @ E ( U F ) * then fn($u, - @)= fn(u,II, o 0).
0
(Note that @ as well as II, is a function, hence II, o @ is a function.) The following important lemma expresses that an upperbound for the lengths of the reduction sequences of C'; w can be calculated from fn(u, @) and fn(w, fn(u, a)&@).
Lemma 7.3. Substitution lemma. Zf fn(u, @)# 0 then fn(Cy v, @) = fn(v, fn(u, a)&@). Proof. By induction on v. The main case is: v = [vl] v2.
+ + fn(C;" v l , @)* + 1,
fn(Cy v,@)= [[hE a]fn(C; w2, h&@) h* fn(C;" w l , @)*
+ fn(C;
v2,
.;&@)*I
L.S. van Benthem Jutting
668
where a = fn(Cy v l , 9)f, while by the induction hypothesis fn(Cy v l , 9)f = fn(v1, fn(u, @)&@)I . Moreover, we have by the induction hypothesis for h E a: fn(C; v2, h&9) = f n ( C y 5 v 2 , h&9) = fn(cplu, - h&Q)&h&@) . = fn(191v2,
Therefore f n ( C t v2, h&9) = fn(v2,&
o
(fn(u, 9)&h&9)) =
= fn(v2, h&fn(u, a)&@) .
It follows that fn(Zy v,9) = fn(v, fn(u, 9)&9).
0
In order to formulate the next lemma we need the concept of a free uariable. Therefore we define for u E A and k E N + the proposition free(u, k), expressing (in the language of Section 1) that the term u contains a reference (or an arrow) to the k-th binding node below u.
Definition 7.2. (1) not free(.r, k) (2) free(n,k) iff n = k (3) free( (v)w, k) iff free(v, k) or free(w, k) (4) free([v] w, k) iff free(v, k) or free(w, k
+ 1).
0
Lemma 7.4. Monotonicity lemma. -
If 0 E (uG)* then fn(u, 9)E (UG) u ( 0 ) .
- If 91, 9 2 E (UG)*, L(@l) = L(92) = n, we have ( 1 ) 91 = ( 1 ) 9 2 ,
(k)91 < (k)9 2 and for 1 5 n, 1 # k
then fn(ul91) < fn(ul9 2 ) OT fn(u, 91)= fn(ul9 2 ) = 0 zffree(u, k) and fn(u, 91) = fn(u, 9 2 ) zf not free(u, k).
Proof. By induction on u. The main case is, again, = [ul]212. Suppose fn(u, 9) # 0. Then by the induction hypothesis fn(u1, 9) E UG. Let a denote fn(u1, 9)f. Then also by the
The language theory of A,
(C.6)
669
induction hypothesis for every g E a we have fn(u2,g&@) E UG. Now let g, h be elements of a such that g < h. Then either fn(u2,g&@) < fn(u2, h&@) or fn(u2,g&@) = fn(u2,h&@), hence fn(u2,g&@) g* + fn(ul,@)* + 1 < fn(u2, h&@) h* fn(u1, @)* 1. It follows that fn(u, @) E UG. Now assume that free(u, k ) . Then for g E a we have:
+ +
+
+
+ + fn(u1, @I)*+ 1
(9)fn(ul @I)'= fn(u2,g&@1) g* and
+ + fn(u1, @2)*+ 1
(9)fn(u, @2)' = fn(u2, g&@2) g* and therefore (9) fn(u, 01)' < (9)fn(u, @2)' . Moreover ,
+
fn(u, @l)* = fn(u1, @ l ) * fn(u2, cg&@l)* and
+
fn(u, @2)*= fn(u1, @2)* fn(u2, cg&@2)* and therefore fn(u, @I)* < fn(u, @2)* . Hence if free(u,k) then f n ( u , @ l ) < fn(u,(P2). It is easily seen that if not free(u, k ) then fn(u, @ l ) = fn(u, 32). 0
Lemma 7.5. Reduction lemma. If Φ ∈ (∪G)* and fn(u, Φ) ≠ ∅ then u → v implies fn(v, Φ) < fn(u, Φ).
Proof. By induction on u → v. The case u = (u1) [u3] u2, v = Σ_1^{u1} u2 is covered by Lemma 7.3. □
As a corollary we have

Theorem 7.1. Strong normalization. If u is normable then u strongly normalizes. If Φ ∈ (∪G)* and fn(u, Φ) ≠ ∅ then fn(u, Φ)* is an upper bound for the lengths of the reduction sequences of u. □
8. CONTEXTS AND TYPES
In Sections 8 and 9 we will define the system Λ∞. In order to do so we must be able to calculate the type of an expression u ∈ Λ. For assigning a type
L.S. van Benthem Jutting
670
to u we need a sequence U E A*. Such a sequence is called a context. It can be considered as administrating the types of the free variables in u.The type of u may be undefined which, again, will be denoted by the symbol “0”.
Definition 8.1. (1) tYP(.,
U )=
In order to express the properties of the typing operator typ, we must extend the transformation operation, the substitution operation and the reduction relation to contexts. As far as transformation is concerned we restrict ourselves (k). to the functions (pm
Definition 8.2. Let U be a context, L ( U ) = n. Then v (4 m U
E A* with L ( ( p g ) U )= n ~
is defined by
The following lemmas are easily seen to hold:
0
Lemma 8.2. If L ( U 1 ) = k then ( p g ’ ( U l & U 2 ) = (cpg)Ul)&U2. ~
~
We prove a technical lemma by induction on u:
0
The language theory of A,
((2.6)
671
Lemma 8.3. If L ( U 0 ) = k , L ( U 1 ) = m and U = UO&Ul&U2 then either typ(&u,
@U)
tYP(&U,
@U) = q&typ(u, UO&U2) .
= typ(u, UO&U2) = 0
or 0
This gives as a consequence:
Corollary 8.3. If L ( U 1 ) = m, then either typ(q,u, - U l & U 2 ) = typ(u, U 2 ) = 0
or tYP(cp,U, U1&U2) = 'PmtYP(U,U 2 ) .
0
Now in order t o investigate the relation between substitution and typing we define substitution in contexts:
Definition 8.3. Let U be a context, L ( U ) = n, and 1 5 k 5 n. Then U E A* with L ( C g U ) = n - 1 is defined by
xi
(1)
C; u =
xg-l ( 1 ) U (1
if 1
+ 1) U
if k 5 1 < n
.
0
We have the following easy lemmas on substitution in contexts:
0
Lemma 8.5. If L ( U 1 ) = k then
( U l & U 2 )=
(xiU 1 )& U 2 .
0
The next lemma describes the relation between substitution and typing:
Lemma 8.6. Substitution lemma for typ. If tYP((Pku, - u ) -n w and q k (k) u -n w then either
xi U ) = tYP(U1 U ) = 0
tYP(x; v,
or typ(x;t v,
xi U )
-H
w0 and
typ(v, U ) -n w0 for some w0 E A
.
L.S. van Benthem Jutting
672
Proof. By induction on u. The main case is u = b. Because k 5 L ( U ) we have U = Ul&U2, where L(U1) = k .
fore typ(C1 u , C;t U ) = typ(cpk-lu, Hence, by Corollary 8.3: t y p ( Z i o, lary 8.3: ~~
cp1 tYP(CZ
21,
There-
( x i Ul)&U2)
where L(Ci U1) = k - 1. V ) = p k - 1 typ(u, V2), so, again by Corol~
C i U ) = 'PktYP(U,U 2 ) = tYP(B'L1, U )
+
w
*
On the other hand
C i tYP(U, U ) = xi y& (k)u = 'Pk-lM u by Corollary 3.5. This gives us
( k )u
5 C i tYP(U, U ) =
+
'w
.
By Lemma 4.2 it follows that w = cp1 w0 and that t y p ( C i u , Zi U ) -M w0 and
Corollary 8.6. If typ(u, V)
-M
xi typ(u, U )
w and u l
-M
-M
w0
0
w then either
typ(CY u , V) = typ(u, ul&V) = 0
or typ(CY u , V ) -M w0 and
zy typ(v, ol&V)
-M
w0 for some w0 E A
Proof. Take k = 1 and U = ul&V in Lemma 8.6.
.
0
Finally, in order t o describe a relation between typing and reduction we define the concept of reduction on contexts.
Deflnition 8.4. Let u and u be terms, U and V contexts. (1) if u --* u then u&U
(2) if U
+V
-.*
then u&U
u&U
.--$
0
u&V.
We have the following lemma:
Lemma 8.7. If U k 5 72 such that
(k)U
-.*
+
V then L ( U ) = L ( V ) = n
> 0 and there is just one
( I c ) V and ( 1 ) U = (1) V for 15 R , 1 # k
.
The language theory of Aw (C.6)
Proof. By induction on U
4
673
V.
0
Moreover, we have
Lemma 8.8. Zf U tYP(’u, U )
+
+
V then either
tYP(% V )
or tYP(.u,
w = tYP(% V ).
Proof. By induction on u.
Corollary 8.8. Zfv tYP(% V&U)
-+
+
w then either tYP(U, w
w
or typ(u, v&U) = typ(u, w&U) .
0
The relation +I between contexts is the reflexive and transitive closure of -+. If u -W v and U -,, V then clearly u&U +I v&V.
9. THE SYSTEM Λ∞
We will define by simultaneous induction the set Γ∞ ⊂ Λ*, which is the set of correct contexts, and the set Λ∞ ⊂ Λ × Λ* (it will turn out that even Λ∞ ⊂ Λ × Γ∞). If [u, U] ∈ Λ∞ then u will be called a correct term on context U. Here correctness should be understood as follows: if (u)v is correct on context U then v "is a function" and moreover typ(u, U) and "the domain of v" have a common reduct. In fact, we have not formalized what it means for v to "be a function" and, if it is, what "the domain of v" is. The requirements described above appear, however, in clause 4 of our definition and, implicitly, also in clause 6. Together with Γ∞ and Λ∞ we will define the sets Γi and Λi for i ∈ N. They are introduced only for the purpose of induction in the proof of Lemma 10.3. If [u, U] ∈ Λi then u will be called i-correct. The systems are connected with the notion of degree in [de Bruijn 72b (C.2)] and [de Bruijn 80 (A.5)] in the sense that any i-correct term will have degree at most i. (The converse, however, does not hold.) In the following discussion it is always assumed that i ∈ N∞. For i = ∞ the definitions and lemmas contain the theory of Λ∞.
L.S. van Benthem Jutting
674
Deflnition 9.1. (0)
ro= ho = 0 .
If i > 0 then ( 1 ) 0 E ri ( 2 ) if [u, Ul E A, then u&U E ri (3) if U E
ri then
17,Ul E Ai
(4) if typ((u)v,U) = 0 , [u,UlE hi, [ v I U 1 E [ v l ]v 2 then [ ( u )v , U l E hi
Ail
typ(u,U)
-w
v l and
-w
(5) if typ([u] v, U ) = 0 and [D,u&U] E hi then [ [ u ]v ,U l E Ai (6) if [typ(u, U ) , U ] E
Ai-1
then [u,UlE A i .
0
Clearly if [ulUl E A, then U E ri and if U1&U2 E ri then U2 E ri. It is also clear (by induction on i) that A i c Ai+l for i E AJand it is easy to check that Am = U {Ai I i E nV}. We have the following technical lemma:
Lemma 9.1. If L ( U 0 ) = k, L(U1) = m, U = UO&Ul&U2 and U1&U2 E then
[g~, &U] hi E
if
[u, VO&U21 E A,
ri
.
Proof. B induction, respectively on [u,UO&U21 E Ai and on [&u, &U1 E A i , where frequent use is made of Lemma 8.3.
0
The lemma has some nice corollaries:
Corollary 9.1.1. Weakening and strengthening lemma. If L ( U 1 ) = m, U1&U2 E ri, then [ ( P ~ UU, l & U 2 ] E hi afl [u, U21 E Ai. Corollary 9.1.2. If U E ri, k 5 L ( U ) then [( ~ k(k)U,Ul E Ai. Corollary 9.1.3.
[n,Ul E A, zf
UE
rooand n 5 L ( U ) .
0 0
The next lemma partially expresses our assertions about correctness of terms.
The language theory of Am (C.6)
675
Lemma 9.2. Soundness of application. Zf [ ( u ) [w]v,Ul E Ai then typ(u, U ) --* w0 and w -* w0 for some w0 E A. Proof. By induction on [ ( u ) [w]v,Vl E Ai.
0
Types of correct terms are, in a sense, preserved under reduction.
Lemma 9.3. Preservation of types. If [u, V l E A,, u v then either -+
tYP(U, V ) = tYP(V, U ) = 0
or typ(u, V ) -* w and typ(v, U ) --* w for some w E A
Proof. By induction on u v. We will consider the case u = ( u l ) [u3] u2, v = C;”’ u2. By the previous lemma typ(u1, V ) -* w0 and 213 -* w0. Now typ(u, U ) = (ul)[u3]typ(u2, u3&U) -+ typ(u2, u3&U) and typ(v, U ) = typ(Cy’ u2, U ) . Apply Corollary 8.6. -+
x;”’
0
The following lemmas are easy to prove. The first contains the converse of clause 6 in Definition 9.1.
Lemma 9.4. Correctness of types. If typ(u, V ) # then [u, Vl E hi if [typ(u, V ) ,U l E hi-I. The second tells us that if an application of a function to an argument is correct, then both the function and the argument are correct.
Lemma 9.5. Correctness of functions and arguments. If [ ( u ) v,Vl E hi then [ u , V l E Ai and [ v , U l E Ai.
0
We prove two lemmas which are, in a sense, converses of Lemma 9.5.
Lemma 9.6. If [ ( u ) v l , U ] E v2, vl E A ~ .
r(.)
Ai,
[v2,U] E hi, v l
-*
w and v2
+
w then
Proof. By induction on [ ( u )v l , V l E A i . We consider the case of clause 4: typ((u) v l , U ) = 0 , [u, Vl E A i , 1.1, Vl E A i , typ(u, V ) -* w0 and v l -* [wO]wl. We know that typ(v1, V ) = 0,hence, by Lemma 9.3, typ(w, V ) = 0 and also typ(v2, U ) = 0.Therefore typ((u) v2, V ) = 0 . Moreover, by the Church-Rosser theorem we have, for some w2 E A:
L.S. van Benthem Jutting
676
w + w 2 and [wO]wl-nw2, hence w2 = [wO*]wl* for some wO* and Therefore typ(u,U) [ ( u )v2, U l E hi.
--H
w0
-*
wO* and 712
++
w
-*
wl*
.
[wO*]wl*, so, by clause 4, 0
Lemma 9.7. If ⌈(u1)v, U⌉ ∈ Λi, ⌈u2, U⌉ ∈ Λi and u1 ↠ u2 then ⌈(u2)v, U⌉ ∈ Λi.

Proof. By induction on ⌈(u1)v, U⌉ ∈ Λi. We consider again the case of clause 4: typ((u1)v, U) = 0, ⌈u1, U⌉ ∈ Λi, ⌈v, U⌉ ∈ Λi, typ(u1, U) ↠ w1 and v ↠ [w1]w2. First we have typ(v, U) = 0, hence typ((u2)v, U) = 0. By Lemma 9.3 we have for some w0: typ(u1, U) ↠ w0 and typ(u2, U) ↠ w0. Hence, by the Church-Rosser theorem: w0 ↠ v1 and w1 ↠ v1 for some v1. Therefore typ(u2, U) ↠ w0 ↠ v1 and v ↠ [w1]w2 ↠ [v1]w2, so, by clause 4, ⌈(u2)v, U⌉ ∈ Λi. □
Finally we state a lemma on correct abstraction:
Lemma 9.8. ⌈[u]v, U⌉ ∈ Λi iff ⌈v, u&U⌉ ∈ Λi.

Proof. By induction, respectively on ⌈[u]v, U⌉ ∈ Λi and on ⌈v, u&U⌉ ∈ Λi. □
10. CLOSURE FOR Λ∞

For the proof that Λ∞ is closed under reduction we need Lemma 10.2, which tells us that correctness is preserved under correct substitution. In order to prove this lemma we give a slightly different definition of Λi, which we will prove to be equivalent to the first definition. Induction on this alternative definition will be used in the proof of Lemma 10.2. We define for i ∈ ℕ the sets Ci and Li by a simultaneous inductive definition as follows:
Definition 10.1.
(0) C0 = L0 = ∅.
If i > 0 then
(1) ∅ ∈ Ci
(2) if ⌈u, U⌉ ∈ Li then u&U ∈ Ci
(3) if U ∈ Ci then ⌈τ, U⌉ ∈ Li
(4) if typ((u)v, U) = 0, ⌈u, U⌉ ∈ Li, ⌈v, U⌉ ∈ Li, typ(u, U) ↠ w1 and v ↠ [w1]w2 (for some w1, w2) then ⌈(u)v, U⌉ ∈ Li
The language theory of Λ∞ (C.6)
(5) if typ([u]v, U) = 0 and ⌈v, u&U⌉ ∈ Li then ⌈[u]v, U⌉ ∈ Li
(6.1) if ⌈typ(n, U), U⌉ ∈ Li-1 then ⌈n, U⌉ ∈ Li (for a variable n)
(6.2) if ⌈typ((u1)u2, U), U⌉ ∈ Li-1 and ⌈u2, U⌉ ∈ Li then ⌈(u1)u2, U⌉ ∈ Li
(6.3) if ⌈typ([u1]u2, U), U⌉ ∈ Li-1 and ⌈u2, u1&U⌉ ∈ Li then ⌈[u1]u2, U⌉ ∈ Li
The clauses 0 to 5 are the same as the corresponding clauses of Definition 9.1, but clause 6 of that definition has been split up into the three clauses 6.1-6.3. We easily verify that Li-1 ⊂ Li and that L∞ = ∪{Li | i ∈ ℕ}. In order to show that Ci = Γi and Li = Λi we first prove the following lemma:
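The shape of this simultaneous inductive definition can be made concrete in a toy setting. The following sketch is our own drastic simplification (only tau and variables survive, and the clauses are invented truncations of the real ones); it is meant solely to show how the Ci and Li recurse into each other and why the index i drops when passing from a term to its type:

```python
# ctx_ok plays the role of the sets C_i of correct contexts,
# pair_ok that of the sets L_i of correct (term, context) pairs.

def ctx_ok(ctx, i):
    # clauses 0-2: C_0 is empty; for i > 0 the empty context is correct,
    # and u&U is correct when the pair (u, U) is.
    if i == 0:
        return False
    if not ctx:
        return True
    return pair_ok(ctx[0], ctx[1:], i)

def pair_ok(u, ctx, i):
    # clause 3: (tau, U) is correct when U is;
    # variable clause: a variable is correct when the pair formed by its
    # type is correct one level lower -- this is where i drops.
    if i == 0:
        return False
    if u == "tau":
        return ctx_ok(ctx, i)
    if isinstance(u, int):                 # a de Bruijn index
        return (1 <= u <= len(ctx) and ctx_ok(ctx, i)
                and pair_ok(ctx[u - 1], ctx[u:], i - 1))
    return False

# The strictness of the stratification L_{i-1} != L_i is visible already:
assert pair_ok(1, ["tau"], 2) and not pair_ok(1, ["tau"], 1)
```

Each extra level of typing (a variable whose type is itself a variable, and so on) costs one unit of the index i, exactly as in the chains Li-1 ⊂ Li above.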
Lemma 10.1. If ⌈typ(u, U), U⌉ ∈ Li-1 then ⌈u, U⌉ ∈ Li.

Proof. By induction on ⌈typ(u, U), U⌉ ∈ Li-1. We consider the case of clause 4: typ(u, U) = (u1)v, typ((u1)v, U) = 0, ⌈u1, U⌉ ∈ Li-1, ⌈v, U⌉ ∈ Li-1, typ(u1, U) ↠ w1, v ↠ [w1]w2. Now either u = n or u = (u1)u2 and typ(u2, U) = v. If u = n then ⌈u, U⌉ ∈ Li by clause 6.1. If u = (u1)u2 and typ(u2, U) = v then we have by the induction hypothesis ⌈u2, U⌉ ∈ Li, and therefore ⌈u, U⌉ ∈ Li by clause 6.2.
As another case we consider clause 6.3: typ(u, U) = [u1]v, ⌈typ([u1]v, U), U⌉ ∈ Li-2 and ⌈v, u1&U⌉ ∈ Li-1. Again we either have u = n or u = [u1]u2 and typ(u2, u1&U) = v. If u = n then again clause 6.1 applies. And if u = [u1]u2 and typ(u2, u1&U) = v then by the induction hypothesis ⌈u2, u1&U⌉ ∈ Li, and therefore ⌈u, U⌉ ∈ Li by clause 6.3. □
Corollary 10.1. Ci = Γi and Li = Λi.

Proof. Li ⊆ Λi is trivial; Λi ⊆ Li is proved by using Lemma 10.1. □
Now we are able to prove the following important substitution lemma.
Lemma 10.2. Substitution lemma for Li. If ⌈v, U⌉ ∈ Li, ⌈u, Σ_k U⌉ ∈ Li, typ(u, Σ_k U) ↠ w and Σ_k ϕ_k (k)U ↠ w (for some w) then ⌈Σ_k v, Σ_k U⌉ ∈ Li, where Σ_k denotes the substitution Σ_k^u of u for the variable k.

Proof. By induction on ⌈v, U⌉ ∈ Li, freely using Corollary 10.1.
We consider some of the clauses:
Clause 3: v = τ. We have to prove that Σ_k U ∈ Ci. If k = 1 this is clear by Lemma 8.4. If k > 1 then U = w&V and Σ_k U = (Σ_k w)&(Σ_k V), also by Lemma 8.4. Now we have ⌈w, V⌉ ∈ Li, hence by the induction hypothesis ⌈Σ_k w, Σ_k V⌉ ∈ Li, and therefore Σ_k U ∈ Ci by clause 2.
Clause 4: v = (v1)v2. We know that typ(v, U) = 0, ⌈v1, U⌉ ∈ Li, ⌈v2, U⌉ ∈ Li, typ(v1, U) ↠ w1 and v2 ↠ [w1]w2.
By Lemma 8.6 we have, for some w0,
Σ_k typ(v1, U) ↠ w0 and typ(Σ_k v1, Σ_k U) ↠ w0. (i)
The induction hypothesis gives us:
⌈Σ_k v1, Σ_k U⌉ ∈ Li and ⌈Σ_k v2, Σ_k U⌉ ∈ Li. (ii)
Also by Lemma 8.6 we see that typ(Σ_k v, Σ_k U) = 0. Now by Lemma 5.3 it follows that Σ_k typ(v1, U) ↠ Σ_k w1, hence by the Church-Rosser theorem
w0 ↠ w and Σ_k w1 ↠ w for some w.
Therefore we have:
typ(Σ_k v1, Σ_k U) ↠ w0 ↠ w and Σ_k v2 ↠ Σ_k [w1]w2 = [Σ_k w1] Σ_{k+1} w2 ↠ [w] Σ_{k+1} w2. (iii)
From (i), (ii) and (iii) we conclude by clause 4 that ⌈Σ_k v, Σ_k U⌉ ∈ Li.
Clause 6.1: v = n. We know that ⌈typ(v, U), U⌉ = ⌈ϕ_n (n)U, U⌉ ∈ Li-1. We discern two cases: n = k and n ≠ k.
Suppose n = k. As L(U) ≥ k we may put U = U1&U2 with L(U1) = k. Then Σ_k U = (Σ_k U1)&U2 by Lemma 8.5 and L(Σ_k U1) = k - 1. Moreover, it can be shown, just as under clause 3, that Σ_k U ∈ Ci. Hence by Corollary 9.1.1 we have ⌈u, U2⌉ ∈ Li and by the same corollary also ⌈Σ_k v, Σ_k U⌉ = ⌈ϕ_{k-1} u, Σ_k U⌉ ∈ Li.
Now suppose n ≠ k. Then Σ_k v equals n (if n < k) or n - 1 (if n > k). Using Lemma 3.4 (for n < k) or Corollary 3.5 (for n > k) we see that typ(Σ_k v, Σ_k U) = Σ_k ϕ_n (n)U. By the induction hypothesis we have ⌈Σ_k ϕ_n (n)U, Σ_k U⌉ ∈ Li-1, and therefore by clause 6.1 ⌈Σ_k v, Σ_k U⌉ ∈ Li.
Clause 6.2: v = (v1)v2. We know that
⌈(v1) typ(v2, U), U⌉ ∈ Li-1 (*)
and that ⌈v2, U⌉ ∈ Li. By the induction hypothesis it follows that
⌈Σ_k (v1) typ(v2, U), Σ_k U⌉ ∈ Li-1 and ⌈Σ_k v2, Σ_k U⌉ ∈ Li. (i)
By Lemma 8.6 it is known that for some w0 ∈ Λ
Σ_k typ(v2, U) ↠ w0 and typ(Σ_k v2, Σ_k U) ↠ w0. (ii)
And by Lemma 9.4 we conclude that
⌈typ(Σ_k v2, Σ_k U), Σ_k U⌉ ∈ Li-1. (iii)
From (i), (ii) and (iii) it follows by Lemma 9.6 that
⌈typ(Σ_k v, Σ_k U), Σ_k U⌉ ∈ Li-1,
and this gives us by clause 6.2:
⌈Σ_k v, Σ_k U⌉ ∈ Li.
We leave the other clauses to the reader. □
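The substitution operators Σ_k manipulated throughout this proof are, in modern terms, de Bruijn substitutions. As a rough, self-contained sketch (our own formulation, with variables encoded as ints, abstraction and application as tagged tuples, and "tau" as the constant; not the paper's exact operators):

```python
def shift(u, d, c=1):
    """Add d to all de Bruijn indices >= cutoff c."""
    if isinstance(u, int):
        return u + d if u >= c else u
    if isinstance(u, tuple):
        tag, a, b = u                     # ("abs", dom, body) / ("app", arg, fun)
        if tag == "abs":
            return (tag, shift(a, d, c), shift(b, d, c + 1))
        return (tag, shift(a, d, c), shift(b, d, c))
    return u                              # "tau"

def subst(u, k, s):
    """Replace index k in u by s (suitably shifted), lowering higher indices."""
    if isinstance(u, int):
        if u == k:
            return shift(s, k - 1)        # s is re-scoped under k-1 binders
        return u - 1 if u > k else u
    if isinstance(u, tuple):
        tag, a, b = u
        if tag == "abs":
            return ("abs", subst(a, k, s), subst(b, k + 1, s))
        return ("app", subst(a, k, s), subst(b, k, s))
    return u

# Beta reduction (s)[dom]body -> subst(body, 1, s); going under a binder
# bumps k, as in the Sigma_{k+1} occurring in the proof of clause 4.
assert subst(1, 1, "tau") == "tau"
assert subst(("abs", "tau", 2), 1, 5) == ("abs", "tau", 6)
```

The bookkeeping in clauses 3 and 6.1 above (splitting the context at position k, shifting by ϕ) corresponds to the cutoff argument c and the shift in the variable case here.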
Corollary 10.2. If ⌈v, v1&V⌉ ∈ Li, ⌈u, V⌉ ∈ Li, typ(u, V) ↠ w and v1 ↠ w then ⌈Σ_1^u v, V⌉ ∈ Li.

Proof. Take k = 1 and U = v1&V in Lemma 10.2. □
Our next lemma implies that for i ∈ ℕ the set Λi is closed under reduction. In order to word it we use the relation ↠ between contexts, which has been defined in Section 8. In order to prove the lemma we assign to every context U the number M(U), which is the sum of the lengths of the terms in U: if L(U) = n then M(U) = L((1)U) + L((2)U) + ... + L((n)U).
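The measure M can be sketched directly (our own encoding of terms, matching the sketch used earlier: ints and "tau" as leaves, tagged tuples as binary nodes; the paper's exact length function L may count differently):

```python
def length(u):
    """Number of nodes of a term: leaves count 1, binary nodes 1 + subterms."""
    if isinstance(u, tuple):
        _, a, b = u
        return 1 + length(a) + length(b)
    return 1

def M(ctx):
    """Sum of the lengths of the terms in a context."""
    return sum(length(t) for t in ctx)

# Pushing a binder's domain into the context strictly decreases M, which is
# what drives the induction in the proof below.
assert M([("abs", "tau", 1), "tau"]) == 4
```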
Lemma 10.3. If i ∈ ℕ, u&U ↠ v&V and ⌈u, U⌉ ∈ Λi then ⌈v, V⌉ ∈ Λi.
Proof. By induction on i. If i = 0 then Λ0 = ∅, so the lemma holds. Suppose i > 0. We prove the following:

Proposition. If u&U → v&V and ⌈u, U⌉ ∈ Λi then ⌈v, V⌉ ∈ Λi.
Proof. By induction on M(u&U). If M(u&U) = 1 then u&U → v&V is impossible, so the proposition holds. Now suppose M(u&U) > 1. As u&U → v&V we have either u → v and U = V, or u = v and U → V.
Suppose u → v and U = V. We inspect the clauses for u → v.
(1) u = (u1)[u3]u2, v = Σ_1^{u1} u2.
By Lemma 9.2 we have typ(u1, U) ↠ w and u3 ↠ w for some w, and by Lemma 9.5 we have ⌈[u3]u2, U⌉ ∈ Λi, so ⌈u2, u3&U⌉ ∈ Λi by Lemma 9.8. Apply Corollary 10.2.
(2) u = (u1)u2, u1 → v1, v = (v1)u2.
By Lemma 9.5 we have ⌈u1, U⌉ ∈ Λi. Moreover u1&U → v1&U and M(u1&U) < M(u&U). Therefore by our induction hypothesis we have ⌈v1, U⌉ ∈ Λi and hence ⌈v, U⌉ ∈ Λi by Lemma 9.7.
(3) u = (u1)u2, u2 → v2, v = (u1)v2.
⌈v, U⌉ ∈ Λi by a similar argument, where Lemma 9.6 is used instead of Lemma 9.7.
(4) u = [u1]u2, u1 → v1, v = [v1]u2.
By Lemma 9.8 we have ⌈u2, u1&U⌉ ∈ Λi. Moreover u2&u1&U → u2&v1&U and M(u2&u1&U) < M(u&U); in fact M(u&U) = M(u2&u1&U) + 2. Therefore our induction hypothesis gives us ⌈u2, v1&U⌉ ∈ Λi and it follows that ⌈v, U⌉ ∈ Λi by Lemma 9.8.
(5) u = [u1]u2, u2 → v2, v = [u1]v2.
⌈v, U⌉ ∈ Λi by a similar argument as under (4).

Now suppose u = v and U → V. We inspect the clauses for ⌈u, U⌉ ∈ Λi.
(3) u = τ.
We have to prove that V ∈ Γi. As U → V it is impossible that U = ∅, so we may put U = u1&U1 and V = v1&V1. As U ∈ Γi we have ⌈u1, U1⌉ ∈ Λi and also M(U) < M(u&U). Therefore we have by our induction hypothesis ⌈v1, V1⌉ ∈ Λi, hence V ∈ Γi.
(4) u = (u1)u2, typ(u, U) = 0, ⌈u1, U⌉ ∈ Λi, ⌈u2, U⌉ ∈ Λi, typ(u1, U) ↠ v1 and u2 ↠ [v1]v2.
By Lemma 8.7 we know typ(u, V) = 0. Moreover, we have u1&U → u1&V and M(u1&U) < M(u&U), so by our induction hypothesis ⌈u1, V⌉ ∈ Λi, and by a similar argument we see that ⌈u2, V⌉ ∈ Λi. Also by Lemma 8.7 it is seen that typ(u1, U) ↠ typ(u1, V), so by the Church-Rosser theorem we have:
v1 ↠ w and typ(u1, V) ↠ w for some w.
It follows that u2 ↠ [v1]v2 ↠ [w]v2, hence ⌈u, V⌉ ∈ Λi by clause 4.
(5) u = [u1]u2, typ(u, U) = 0, ⌈u2, u1&U⌉ ∈ Λi.
We know that u2&u1&U → u2&u1&V and that M(u2&u1&U) < M(u&U). It follows that ⌈u2, u1&V⌉ ∈ Λi, hence ⌈u, V⌉ ∈ Λi by Lemma 9.8.
(6) ⌈typ(u, U), U⌉ ∈ Λi-1.
By Lemma 8.7 we have typ(u, U) ↠ typ(u, V), hence typ(u, U)&U ↠ typ(u, V)&V. Now by our induction hypothesis on i it follows that ⌈typ(u, V), V⌉ ∈ Λi-1 and therefore ⌈u, V⌉ ∈ Λi by clause 6. □
So our proposition is proved, and it follows immediately that u&U ↠ v&V and ⌈u, U⌉ ∈ Λi imply ⌈v, V⌉ ∈ Λi. This proves our lemma. □

As a consequence we have:

Corollary 10.3. Closure for Λi. If i ∈ ℕ, ⌈u, U⌉ ∈ Λi and u ↠ v then ⌈v, U⌉ ∈ Λi. □
Theorem 10.1. Closure for Λ∞. If ⌈u, U⌉ ∈ Λ∞ and u ↠ v then ⌈v, U⌉ ∈ Λ∞. □
11. NORMABILITY FOR Λ∞

In this section we will prove that ⌈u, U⌉ ∈ Λ∞ implies that u is normable. It then follows from Theorem 7.1 that u strongly normalizes. In order to prove that u is normable we will assign to certain sequences U ∈ Λ* a sequence s(U) of norms. If the assignment is not possible then we will write, as before, s(U) = 0.
Definition 11.1.
(1) s(∅) = ∅.
(2) s(u&U) = α&s(U) if s(U) ≠ 0, fn(u, s(U)) ≠ 0 and fn(u, s(U))↑ = α; otherwise s(u&U) = 0.
Proof. By induction on U.
0
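The definition just given has a simple recursive shape. A sketch (our own rendering, with the norm assignment fn of Section 7 left abstract and None standing for the paper's failure value 0):

```python
def s(ctx, fn):
    """Map a context to the sequence of norms of its entries,
    propagating failure as None."""
    if not ctx:
        return []
    rest = s(ctx[1:], fn)
    if rest is None:
        return None
    a = fn(ctx[0], rest)
    return None if a is None else [a] + rest

# With a dummy norm function that fails on the entry "bad":
fn = lambda u, seq: None if u == "bad" else len(seq)
assert s(["a", "b"], fn) == [1, 0]
assert s(["a", "bad"], fn) is None
```

Note that each entry is normed against the norms of the entries after it, mirroring the fact that a context entry may only refer to the part of the context beyond it.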
Our second lemma gives a relation between norms and typing.
Lemma 11.2. If U ∈ Λ*, s(U) ≠ 0 and typ(u, U) ≠ 0 then either
fn(typ(u, U), s(U)) = fn(u, s(U)) = 0
or
fn(typ(u, U), s(U))↑ = fn(u, s(U))↑.

Proof. By induction on u. We consider the case that u = [u1]u2. Then typ(u, U) = [u1] typ(u2, u1&U) and typ(u2, u1&U) ≠ 0. If fn(u1, s(U)) = 0 then fn(typ(u, U), s(U)) = fn(u, s(U)) = 0. Now assume that fn(u1, s(U)) ≠ 0 and put fn(u1, s(U))↑ = α. Then it follows that s(u1&U) = α&s(U) ≠ 0. If fn(typ(u2, u1&U), s(u1&U)) = 0 then also fn(u2, s(u1&U)) = 0 by the induction hypothesis, and therefore fn(typ(u, U), s(U)) = fn(u, s(U)) = 0. So let us assume fn(typ(u2, u1&U), s(u1&U)) ≠ 0. Putting fn(typ(u2, u1&U), s(u1&U))↑ = β we have by the induction hypothesis
fn(u2, s(u1&U))↑ = β and also fn(u2, g&s(U))↑ = β for g ∈ α. Hence fn(typ(u, U), s(U))↑ = fn(u, s(U))↑ = α → β. □
Lemma 11.3. If ⌈u, U⌉ ∈ Λi then s(U) ≠ 0 and fn(u, s(U)) ≠ 0.
Proof. By induction on ⌈u, U⌉ ∈ Λi. We consider clause 3: u = τ. We only have to show that s(U) ≠ 0. If U = ∅ then s(U) = ∅, and if U = v&V then we have ⌈v, V⌉ ∈ Λi, so by the induction hypothesis s(V) ≠ 0 and fn(v, s(V)) ≠ 0 and therefore s(U) ≠ 0.
We will also consider clause 4: u = (u1)u2. We have typ(u, U) = 0, ⌈u1, U⌉ ∈ Λi, ⌈u2, U⌉ ∈ Λi, typ(u1, U) ↠ v1 and u2 ↠ [v1]v2. By the induction hypothesis fn(u1, s(U)) ≠ 0 and fn(u2, s(U)) ≠ 0. Putting fn(u1, s(U))↑ = α we have fn(typ(u1, U), s(U))↑ = α by Lemma 11.2 and fn(v1, s(U))↑ = α by Lemma 7.5. Also by Lemma 7.5 we have fn(u2, s(U))↑ = fn([v1]v2, s(U))↑ = α → β for some β, hence fn(u, s(U)) ≠ 0.
We leave the other cases to the reader. □

As a consequence we have
Theorem 11.1. Strong normalization for Λ∞. If ⌈u, U⌉ ∈ Λ∞ then u strongly normalizes. □
ACKNOWLEDGEMENT
I want to express my gratitude to R. Nederpelt for his encouragement and his careful reading of the original text, where he suggested some improvements and detected a serious error.
PART D Text Examples
Example of a Text written in Automath

N.G. de Bruijn
[Editor's comments. This early text is written in the first full-fledged version of an Automath language, later to become known as AUT-68. It covers some elementary logic and the notions of set, powerset and set inclusion. An introduction to the language AUT-68 can be found in this Volume ([van Benthem Jutting 81 (B.1)]).
First a few remarks on features that are particular to this text and on the way it has been reproduced here.
(1) Early 1968 de Bruijn still used the term sort instead of type. We have
not changed this terminology.
(2) In the original text one finds vertical lines as indicators of the scope of the variables, as in Natural Deduction in the style of Fitch [Fitch 52]. These lines are redundant, although they enhance readability. They have been deleted in this reproduction.
(3) In some places, especially at the opening of each new section, you will find a few lines that have been placed between brackets. These lines are superfluous in the sense that deleting them would affect neither the correctness nor the meaning of the text. They redefine a context that could have been just picked up from the preceding sections. De Bruijn included these lines as reminders, saving the reader the trouble of searching the text for the proper identifiers. Since they definitely contribute to the readability, we have reproduced both the lines and the brackets.
(4) The division of the text in sections with descriptive headers is from the original. So are the comments between the lines.
(5) The text has never been checked on a computer. A few obvious mistakes have been corrected.
We continue with a few short comments on the handling of logic in terms of bool and TRUE. Once this mechanism is understood, it is not difficult to read the plain Automath text. Consult on this subject also the résumés of [D.1] and [A.2] in the Introduction. Section 12 of [van Benthem Jutting 81 (B.1)], entitled 'Logic', contains a more recent text fragment developing some logic. It may be instructive to compare the two texts.
1.1-1.3. The primitive type bool of propositions (here called 'booleans') is introduced, and for each boolean x the connected assertion type TRUE(x). The idea is that a boolean will be true if its assertion type is inhabited.
1.4. CONTR is defined as the type of functions that attach to each boolean v an assertion of v. Such a function could of course be taken as an assertion of a contradiction, and in a pure propositions-as-types setting it would be natural to view the type CONTR itself as the proposition that in a canonical way represents falsity. Here the corresponding proposition (boolean) can be obtained via the nonempty-construction that now follows.
1.9-1.13. To each type ksi corresponds a boolean nonempty(ksi). Its assertion type TRUE(nonempty(ksi)) is inhabited if ksi is. This construction may look a bit artificial: why is not ksi itself taken as the assertion type? The answer is that we already have the uniform construction of the TRUE-types as the assertion types of booleans, and there is no way to make ksi and TRUE(nonempty(ksi)) definitionally equal.
It is noteworthy that in the present text de Bruijn does not always take the trouble to explicitly construct the TRUE-type, or even the bool. A typical example is Section 2. Both the equality axioms and the reasoning on equality are entirely in terms of the types IS(ksi,x,y). The corresponding boolean equal is defined in line 2.3, but never used; the assertion type does not occur at all.
As a matter of fact, the IS-types here already have taken the role of the propositions. A similar observation could be made e.g. on implication (IMPL, defined in line 4.7). In other places the booleans are used in an essential way, though. In particular the type [u : ksi] bool plays an important role: first, in Section 7 on quantification, as the type of predicates on ksi, and then again in Section 13, as the type of sets over ksi.
A final remark seems in order on the kind of logic that is at issue. De Bruijn has always emphasized that Automath is neutral with respect to the logical principles that one wants to accept or reject. This view is reflected in the present text by the manner in which he handles non-constructive principles. Two such principles (or rather, corresponding types) are defined as PARADISE I and II (lines 7.15 and 1.19). Metaphorically speaking, a type ksi has an inhabited PARADISE II if the double-negation law holds for ksi. However, there is no axiom (PN) stating that PARADISE II(ksi) is inhabited for each ksi. In particular, this is not assumed for the TRUE-types, so that we obtain intuitionistic and not
classical logic for the inhabitants of bool. (The names bool and TRUE may be felt to be a bit misleading here.) In Section 12, line 12.1, the type EXCLTHIRD is defined. An inhabitant of EXCLTHIRD would yield that all TRUE-types are in PARADISE II. Then in line 12.7 a non-constructive notion of truth, called VALID, is defined as truth on the assumption of EXCLTHIRD. The upshot is that intuitionistic and classical logic live happily together in the guises of TRUE and VALID.
R.C. de Vrijer]
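The TRUE/VALID split described above can be rendered compactly in a modern system. The following Lean sketch is our own paraphrase, not de Bruijn's text: EXCLTHIRD packages the double-negation law, and VALID b is truth under that extra hypothesis, so classical reasoning lives inside an intuitionistic system without any added axiom.

```lean
def EXCLTHIRD : Prop := ∀ p : Prop, ¬¬p → p
def VALID (b : Prop) : Prop := EXCLTHIRD → b

-- Constructive truth trivially yields non-constructive truth.
theorem valid_of_true {b : Prop} (h : b) : VALID b :=
  fun _ => h

-- Under the excluded-third hypothesis, double negations can be stripped,
-- mirroring the use of PARADISE II in Section 12.
theorem valid_of_dn {b : Prop} (h : ¬¬b) : VALID b :=
  fun excl => excl b h
```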
1. BOOLEANS 1.1 0 1.2 0 1.3 x
@ bool
1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18
:= PN
: sort
@ X
.-.- _ _ - - -
@ TRUE
:= PN
: sort
0 0 a b 0 ksi ksi a ksi a
@ CONTR
:= [v:bool]TRUE(v)
: sort
ksi ksi x u x
Q EMPTY
@ a @ b @ then 1 @ ksi @ nonempty @ a Q then 2 @ a @ then 3
.-...-
_____ _____
:= (b)a
..-
_____
:
TRUE(b)
: : : : :
sort
:= P N := [u:ksi]CONTR
: sort
:= PN
.-.- _ _ _ - _ := PN
..- _ _ _ _ _
@ U
Q then 4 @ then 5
:= (x)u := [t:EMPTY (ksi)]
__--_
___-_
then 4 (t) 1.19 ksi
: CONTR : bool
bool ksi TRUE(nonempty) TRUE(nonempty) : ksi
.-..-
@ X
: bool
: ksi : EMPTY(ksi) : CONTR :
EMPTY(EMPTY( ksi))
@ PARADISE I1 := [t:EMPTY(EMPTY(
ksi))]ksi
: sort
2. EQUALITY
0 ksi
Q (ksi Q (x
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9
Q y x y Q IS Q equal y x Q reflexive y Q ass 1 ass 1 Q symm ass 1 Q z z Q ass 2 ass 2 @ transitive
2.10 2.11 2.12 2.13 2.14 2.15
ksi theta x1 P 1 x2 ass 3
: sort) :
ksi)
.- _ _ _ _ _ := P N
:= nonempty(1S) := PN
.-.-
_____
:= PN _____
...-
_____
:= PN
Q theta
: sort
Q x1
:
@ P 1 Q x2 Q ass 3 @ then 6
:
ksi [t:ksi]theta : ksi : IS(ksi,xl,x2) : IS(theta, (xl)P 1,
Q P2
: [x:ksi]theta
Q ass 4
:
(X2)Pl) 2.16 P 1 2.17 P 2
2.18 ass 4 Q then 7 2.19 P 2
0 ass 4a
2.20 ass 4a Q then 7a
:= PN
IS([x:ksi]theta, Pl,P2) : [x:ksi]IS(theta,
.-.-
: [x:ksi]IS(theta,
(X)Pl,(X)P2) _____
:= PN
:
( 4 P1,W 2 ) IS([x:ksi]theta, Pl,P2)
2a. Ifelse 2a.1 2a.2 2a.3 2a.4 2a.5 2a.6 2a.7 2a.8
0 ksi x a a ass 4b a ass 4c
@ @ @ @ @ @ @ @
(ksi (x a ifelse ass 4b then 7b ass 4c then 7c
: sort) : ksi) :
bool
: ksi : TRUE(a) : IS(ksi,ifelse,x) : IS(ksi,ifelse,x) : TRUE(a)
2b. Equality for two sorts 2b.l 0 2b.2 ksi 2b.3 eta 2b.4 a 2b.5 b 2b.6 b 2b.7 b 2b.8 ass4d 2b.9 eta 2 b . 1 zeta ~ 2b.lla 2b.12 b 2b.13 c 2b.14 ass 4e 2b.15 ass 4f
@ (ksi @ eta @ a @ b @ ISS @ equal1 @ ass 4d @ symmm @ zeta @ a @ b @ c @ ass 4e @ ass 4f @ transitivv
: sort) : sort : ksi :
eta
: sort
bool : ISS(ksi,eta,a,b) : ISS(eta,ksi,b,a) :
: sort :
ksi
: eta : zeta
ISS(ksi,eta,a,b) ISS(eta,zeta,b,c) : ISS(ksi,zeta,a,c)
:
:
C o m m e n t : T h e PN’s in 2.2, 2.6, 2.9 can n o w be replaced respectively b y ISS(ksi, ksi,x,y), symmm(ksi,ksi,x,y); transitivv(ksi,ksi,ksi,x,y,z). 2‘.
2‘.1 2‘.2 2‘.3 2c.4
Embedding 0 ksi eta p
@ @ @ @
(ksi (eta p EMBED
...-
_____
: sort)
_____ _____
: sort)
:=
[t:eta]ISS(ksi,eta,(t)p,t)
: sort
1-
:
[x:eta]ksi
692 2'.5 p 2".6 w
@ w @ image
.- _ _ _ _ -
:
EMBED
:= [x:ksi]nonempty (
EXISTS(eta,[b:eta] equal( (b)P9x11
: [x:ksi]bool
[Note: EXISTS is still to be defined i n line 7.2. So this Section 2' should, as a matter of fact, be placed after Section 7.1
3. PAIRSORT 0 ksi
@ (ksi @ (theta
3.1 3.2 3.3 3.4
theta theta x y
@ @ @ @
3.5 3.6 3.7 3.8
theta u u u
@ u 0 first @ second @ then 8
pairsort x y pair
.-.- _ _ _ _ _ .- _ _ _ _ :=
PN
: sort) : sort)
sort
:= P N
: : : :
ksi theta pairsort
.-.-
:
pairsort
.- _ _ _ _ _ .- _ _ _ _ _ ___--
:= P N := P N := P N
: IS(pairsort,u,
:= P N := P N
: IS(ksi,x,first(pair)) : IS(theta,y,
: ksi :
theta pair(first,second))
3.9 y 3.10 y
@ then 9 @ then 10
second( pair))
4. BOOLEQUAL, IMPLICATION 4.1 4.2 4.3
0 a b
@ a @ b @ c
.- _ _ _ _ _ .- _ _ _ _ _ .- _ _ _ _ _
4.4 4.5
c b
@ then 11 @ d
:= P N
.-
_____
bool bool : pairsort( [t:TRUE (a)]TRUE(b),[s: TRUE(b)]TRUE(a)) : IS(bool,a,b) : IS(bool,a,b)
:
:
@ then 12
4.6
d
4.7 4.8 4.9 4.10
b @ b @ ass 1 @ ass 2 @
IMPL ass 1 ass2 modpon
:= PN
693
:
pairsort([t:TRUE (a)]TRUE(b),[s: TRUE(b)]TRUE(a))
:= [u:TRUE(a)]TRUE(b) : s o r t .- _ _ _ _ _ : TRUE(a) .- - _ _ _ _ : IMPL := (ass 1)ass 2 : TRUE(b)
..-
5. SOME LOGICAL CONSTANTS
5.3
0
5.4
0
@ contradiction := nonempty(C0NTR) @ OBVIOUSLY := IMPL(contradiction, contradiction) @ trivial := nonempty( OBVIOUSLY) @ now 1 := [u:TRUE(
5.5
0
@ now 2
:= then 2
5.6
0
@ now 3
:= [u:CONTR]u
: TRUE(trivia1) : EMPTY(C0NTR)
..-
:
5.1 5.2
0 0
contradiction)]^ (OBVIOUSLY,now 1)
: bool : sort : bool :
OBVIOUSLY
6. NON, A N D 6.1 6.2 6.3
0 b b
@ b @ NON @ non
6.4 6.5
b c
6.6
_____
bool
:= EMPTY(TRUE(b)) := nonempty(N0N)
: sort : bool
@ C
..-
: bool
@ AND
:= pairsort (TRUE(b),
c
@ and
:= nonempty(AND)
:
6.7 6.8
c if
@ if @ then 12a
.-.-
: AND
6.9
if
@ then 12b
_____
TRUE(c))
_____
: sort
bool
:= first (TRUE(b),
TRUE(c) ,if)
: TRUE(b)
:= second(TRUE( b) ,
TRUE(c),if)
: TRUE(c)
6.10 b 6.11 if
@I
6.12 b 6.13 if
Q if @ then 12d
6.14 b 6.15 if
Q if
if @I then 12c
@I
then 12c
.-
_____
:= then 3(NON(b),if)
.-.-
_____ := then 2 (NON(b),if)
.- - _ _ _ _
: TRUE(non(b)) : NON(b) : NON(b) : TRUE(non(b))
:= then 5 (TRUE(b) ,if)
: TRUE(b) : EMPTY(NON(b))
.- _ _ _ _ _ .-
: sort)
7. EXISTS, ALL
(ksi
0
@I
7.1 7.2 7.3 7.4 7.5
ksi P P v ass 1
P @I EXISTS @I v @I ass 1 @I then 13
7.6 7.7 7.8
P ass 2 ass 2
@I
7.9 7.10 7.11 7.12
P P
@I
@I
@I @I
ass 2 then 13a then 13b
ALL @ (v v @ ass 3 ass 3 @I specialize
.-.-
_____ := PN
.- _ _ _ _ _ .- _ _ _ _ _ := P N
..-
_____
:= PN := PN
:= [u:ksi]TRUE((u)P) .- _ _ _ _ _
.- _ _ _ _ _ :=
(v)ass 3
: : : :
[u:ksi]bool sort
ksi TRUE((v)P) : EXISTS : EXISTS : ksi :
TRUE( (then 13a)P)
: sort : ksi) : ALL :
TRUE((v)P)
7.13 P 7.14 P
Q NONEXIST := [u:ksi]NON((u)P) : sort @I WEAKEXIST := EMPTY(N0NEXIST) : s o r t
7.15 ksi
@ PARADISE I := [Q:[u:ksi]bool] [t:WEAKEXIST( Q)] EXISTS(Q) @I a .-.- _ _ _ _ _ .- _ _ _ _ _ @ b @ then 14
7.16 P 7.17 a 7.18 b
: : : :
sort
PARADISE1 WEAKEXIST (P) EXISTSfP’I
8. CONSTANT FUNCTIONS 0 ksi
8.1 8.2
8.3 8.4 8.5 8.6
@ ksi @ theta
theta @ pi pi @ CONSTANT
g a b c
@ 0 @ @
a b c then 15
.- _ _ _ _ ...- _ _ _ - -
: sort : sort
.- _ _ _ - .-
: [t:ksi]theta
:=
[s:ksi][t:ksi]IS(theta, (t)Pi,(S)Pi)
..- _ _ _ _ -
..-.-
_____
____-
:= (a)(b)c
: sort : : : :
ksi ksi CONSTANT IS(theta,(a)pi, (b)Pi)
9. CONDITIONAL BRACING
9.1 9.2
0
@ (ksi
ksi
@ P @ h
P
9.3 9.4 9.5 9.6
h x x h
@ X
9.7 9.8 9.9
x a a
@ a @ then 17 0 then 18
9.10 a
@ then 19
@ sigma @ then 16 @ Q
.- _ _ _ - -
: [t:ksi]bool : [t:ksi][s:TRUE(
..-
(t)P)]boo1 _____
:= TRUE( (x)P) := EXISTS(sigma, (x)h) := [t:ksi]nonemtpy(
then 16(t))
.- _ _ _ - -
1-
:= then 3(then 16,a) := then 13a(sigma, (x)h ,
then 17)
@ a @ b @ then 20
: ksi : sort : sort : [t:ksi]bool : TRUE((x)Q) : then 16 : TRUE((x)P)
:= then 13b(sigma,(x)h,
then 17) 9.11 x 9.12 a 9.13 b
: sort)
..- - _ _ _ _ .-.- _ _ _ - _ := then 13(TRUE((x)P),
: TRUE(
(then 18)(x)h) : TRUE((x)P) : TRUE((a)(x)h)
9.14 b 9.15 b
Q then 21 @ then 22
:= then 2(then 16, then 20) : TRUE((x)Q)
:= then 2(then 16, then 13
(sigma,(x)h,a,b))
:
TRUE((x)Q)
_____
: sort) : [t:ksi]bool)
10. DIRECT BRACING
0 ksi 10.1 P 10.2 Q
Q (ksi Q (P Q Q @ R
.-
.- _ _ _ _ _
.- - _ _ _ _ := [t:ksi] and ((t)P,(t)Q)
: [t:ksi]bool : [t:ksi]bool
11. NAMECHANGING @ NAME @ dash
:= PN := PN
Q ksi Q classin
:= pairsort ([t:ksi]bool,
11.5 ksi 11.6 c
@ C
.-
@ predicof
:= first([t:ksi]bool,
11.7 ksi 11.8 d
@ d @ classof
11.1 11.2 11.3 11.4
0 0 0 ksi
.- - - _ _ _
..-
NAME) _____
NAME,c) _____
11.9 0 11.10 0
@ NAME 2 @ dot
11.11 0 11.12 ksi
.- _ _ _ _ _ @ (ksi @ PREDICATE := pairsort([t:ksi]bool, NAME 2) .-.- _ _ _ _ _ 8 C
:= PN := PN
@ predicup
:= first([t:ksi]bool,
@ d @ predicdown
:= _ _ _ _ _
NAME 2,c) 11.15 ksi 11.16 d
: sort : classin : t:ksi]bool : [t:ksi]bool
:= pair([t:ksi]bool,
NAME,d,dash)
11.13 ksi 11.14 c
: sort
: NAME : sort
: classin(ksi) : sort :
NAME 2
: sort)
: sort : PREDICATE : [t:ksi]bool : [t:ksi]bool
:= pair([t:ksi]bool,
NAME S,d,dot)
: PREDICATE
11.17 0 11.18 ksi 11.19 theta 11.20 P 11.21 a
@ (ksi @ (theta
@ P @ a @ then 25
.- _ _ _ _ _ .- _ - - _ _ .- _ _ _ _ _ .- _ _ _ _ _ 1-
697
: : : :
sort) sort)
:= [s:ksi]((s)P)a
[t:ksi]theta EMPTY(theta) : EMPTY(ksi) : EMPTY(ksi)
@ b
.- _ - - _ _
@ then 26
:= then 25(TRUE(
11.24 ksi
@C
11.25 c
@ then 27
nonempty(ksi)) ,ksi,[s: TRUE(nonempty(ksi))] then 3(s),b) : EMPTY(TRUE( nonempty (ksi))) .- _ _ _ _ _ : EMPTY(TRUE( nonempty (ksi))) := then 25(ksi,TRUE( nonempty(ksi)) ,[s:ksi] then 2(s),c) : EMPTY(ksi)
11.26 0 11.27 b 11.28 x
@ X
..-.-
@ then 28
:= then 3(EMPTY(
11.22 ksi 11.23 b
@ b
_____
_____
: boo1 : TRUE(non( non( b)))
TRUE(non(b)),x)) 11.29 x
@ then 29
:= then 27(NON(b),
11.30 x
@ then 30
:= then 29
: EMPTY(NON(b)) : EMPTY(EMPTY(
11.31 b
Q Y
.-
: EMPTY (EMPTY(
11.32 y 11.33 y
@ then 31 @ then 32
:= y := then 26(NON(b),
11.34 y 11.35 b 11.36 z
@ then 33 @ Z
:= then 2(NON(non(b)) ..- _ _ _ _ _
@ then 34
:= then 5(TRUE(b) ,z)
11.37 z
@ then 35
:= then 33(TRUE(b),
then 28)
_____
TRUE@))) TRUE(b))1
then 31)
then 34)
: EMPTY(NON(b)) : : : :
NON(non(b)) TRUE(non(non( b))) TRUE(b) EMPTY(EMPTY( TRUE@)1)
: TRUE(non(non( b)))
12. EXCLUDED THIRD 12.1 0
@ EXCLTHIRD := [t:bool]PARADISE 11(
12.2 12.3 12.4 12.5 12.6 12.7
@ excl @I a @ if @ then 36 63 (b @ VALID
.- _ - _ _ _ .-.- _ - - _ _ .- _--__
: EXCLTHIRD : bool : EMPTY(NON(a))
:= (a)(if)excl
:
if @ then 37
.- ---__
:
:= [s:EXCLTHIRD]if
: VALID(b)
TRUE(t)) 0 excl b if 0 b
12.8 b 12.9 if
@I
.- _ - - _ _ :=
[s:EXCLTHIRD] TRUE(b)
: sort
TRUE(a)
: bool) : sort
TRUE(b)
Comment: VALID is the notion of truth in non-intuitionistic logic.
12.10 b 12.11 p 12.13 q 12.14 q 12.15 p
.-.-
@ P
@ q @ then 38 @I then 39 @I then 40
_-___
.- ---__ := (q)P := then 35(then 38)
@ P @ q @ then @I then @ then @ then
40a 41 42 42a
: :
TRUE(b) TRUE(non(non(b)))
:= [xEXCLTHIRD]
then 39(s) 12.16 b 12.17 p 12.18 q 12.19 q 12.20 q 12.21 p
: VALID(b) : EXCLTHIRD
: VALID(non(non(b)))
.- ---__ .-.- _ _ _ _ _
:
(q)P then 30(then 40a) := then 36(q,b,then 41) := [s:EXCLTHIRD] then 42(s)
:
:= :=
VALID(non(non(b)))
: EXCLTHIRD
TRUE(non(non(b)) )
: EMPTY(NON(b)) : TRUE(b) : VALID(b)
13. SETS
13.1 13.2 13.3 13.4 13.5 13.6 13.7
0 ksi ksi x s s ksi
(ksi @ set @I x
@I
s EST1 @ esti @ s @I @I
.- --_-_ .:= [x:ksi]bool
..-
_-___
--___
:= TRUE((x)s) := (x)s
.-.-
_-___
: sort)
: sort : ksi : set : sort : bool :
set
13.8 s 13.9 t
@ t @ INCL
..-
13.10 t 13.11 ksi 13.12 s 13.13 ksi 13.14 ksi 13.15 x 13.16 ksi 13.17 ksi 13.18 x 13.19 ass
@ incl
:= nonempty(1NCL) := _ _ _ _ _
@ then 43 @ emptyset QX @ assume @ then 44
:= now 2 := [x:ksi]contradiction := _ _ _ _ _
13.20 ass
Q then 45
:= then 3(CONTR,
13.21 x
@ then 46
:= [t:ESTI(x,emptyset)]
_____
(s @ powerset Q universe @ X
:= [x:set]incl(x,s) := [x:ksi]trivial
..-
_____
14. TRANSITIVITY
14.1 ksi 14.2 14.3 14.4 14.5 14.6 14.7 14.8
s t r
Q (s
@ t @ r
@ ass 1 ass 1 @ ass 2
ass 2 @ x x @ ass 3 ass 3 Q then48
14.10 ass 3 @ then 50
: : :
:
:= then 46
ESTI(x,universe) set ksi ESTI(x,emptyset) TRUE( contradiction)
: CONTR
EMPTY(ESTI( x,emptyset)) : NON(esti( x,emptyset)) :
OF SET-INCLUSION :=
...-.....-
__------_____ _____ ____-
____-
:= _ _ - - -
: set) : set : set : INCL(s,t) : INCL(t,r) : ksi :
TRUE((x)s)
:= then 3(IMPL( (x)s,(x)t) ,
(x)ass 1) 14.9 ass 3 @ then 49 @ refl 14.9” s
:
:= assume
then 45 @ then 47
: sort : boo1 : set) : set(set(ksi)) : set : ksi
..- -_-_-
then 44)
13.22 x
: set
:= ALL( ksi,[u:ksilnonempty
(IMPL((uh(u)t))) @
699
:= (ass 3)then 48 := [u:ksi]
: IMPL((x)s,(x)t) : TRUE((x)t)
then 2(IMPL((u)s, ( 4 s ) J Y :TRUE((u)s)ly) : INCL(s,s) := then 49(t,r,r,ass 2, refl(r),x,then 49) : TRUE((x)r)
N.G. de Bruijn
700 14.11 x
@ then 51
:= [p:TRUE((x)s)]
14.12 x
@ then 52
:= then 2(IMPL( (x)s, (x) r) ,
then 50(p)
: IMPL((x)s,(x)r)
then 51) 14.13 ass 2 @ then 53
:= [t:ksi]then 52(t)
15. INCLUSION INDUCED IN POWERSET 15.1 15.2 15.3 15.4 15.5
ksi s t ass 4 u
@ @ @ @
set) set) INCL(s,t) set : TRUE((u) powerset (s)) : TRUE(incl(u,s)) : : : :
(s
(t ass 4 u @ a
15.6 a 15.7 a
@ then 54 @ then 55
15.8 a
@ then 56
15.9 a
Q then 57
15.10 a
62 then 58
..- a := then then := then then := then then := then
15.11 u
@ then 59
:= [a:TRUE((u)powerset(
3(INCL(u,s), 54) 53(u,s,t, 55,ass 4) 2(INCL(u,t), 56) 57
: INCL(u,s) : INCL(u,t) : TRUE(incl(u,t)) : TRUE( (u)
powerset( t)) s))]then 58
: IMPL(
(u)powerset (s), (u) powerset (t)) 15.12 u
@ then 60
:= IMPL((u)powerset(s),
15.13 u
@ then 61
:= then 2(then 60,
(u)powerset (t)) then 59)
: sort : TRUE(nonempty(
then 60)) 15.14 ass 4 @ then 62
:= [t:set]then 61(t)
: INCL(set,
powerset (s), power set (t ) )
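Sections 13-15 develop sets over ksi as boolean-valued predicates, with incl and powerset, and then prove transitivity of inclusion (then 53) and monotonicity of powerset (then 62). In a modern prover the same development is a few lines; the following Lean sketch is our own rendering, with Prop in place of de Bruijn's bool/TRUE machinery:

```lean
def Set' (ksi : Type) := ksi → Prop
def INCL {ksi : Type} (s t : Set' ksi) : Prop := ∀ x, s x → t x

-- then 53: transitivity of set inclusion.
theorem incl_trans {ksi : Type} {s t r : Set' ksi}
    (h1 : INCL s t) (h2 : INCL t r) : INCL s r :=
  fun x hx => h2 x (h1 x hx)

def powerset {ksi : Type} (s : Set' ksi) : Set' (Set' ksi) :=
  fun u => INCL u s

-- then 62: inclusion induced in the powerset.
theorem powerset_mono {ksi : Type} {s t : Set' ksi} (h : INCL s t) :
    INCL (powerset s) (powerset t) :=
  fun u hu => incl_trans hu h
```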
Checking Landau's “Grundlagen” in the Automath System
Parts of Chapters 0, 1 and 2 (Introduction, Preparation, Translation)
L.S. van Benthem Jutting
CHAPTER 0. INTRODUCTION

0.2. The book translated

At an early stage of the Automath project the need was felt to translate an existing mathematical text into an Automath language: first, in order to acquire experience in the use of such a language, and secondly, to investigate to what extent mathematics could be represented in Automath in a natural way. As a text to be translated, the book “Grundlagen der Analysis” by E. Landau [Landau 30] was chosen. This book seemed a good choice for a number of reasons: it does not presuppose any mathematical theory, and it is written clearly, with much detail and with a rather constant degree of precision. For a short description of the contents of Landau's book see 2.0.
0.3. The language of the translation

The language into which Landau's book has been translated is AUT-QE. A detailed description and a formal definition of this language is given in [van Daalen 73 (A.3)]. I will use the notations introduced there whenever necessary. Where in the following text a concept introduced in [van Daalen 73 (A.3)] is used for the first time, it will be displayed in italics, with a reference to the section in [van Daalen 73 (A.3)] where it occurs.
The language of the translation differs from the definition in [van Daalen 73 (A.3)] in one respect, viz. the division of the text into paragraphs [van Daalen 73 (A.3), 2.16]. By this device the strict rule that all constants [van Daalen 73 (A.3), 2.6, 5.4.1] in an AUT-QE book [van Daalen 73 (A.3), 2.13.1, 5.4.4] should be different is weakened to the more liberal rule that all constants in one paragraph have to differ. Now, in a line [van Daalen 73 (A.3), 2.13, 5.4.4], reference to constants defined in the paragraph containing that line is as usual, while reference to constants defined in other paragraphs is possible by a suitable
reference system. For a more detailed description of the system of paragraphing, see Appendix 2 [not in this Volume]. In contravention of the rules for the shape and use of names in AUT-QE, we will in examples in the following text not restrict ourselves to alpha-numeric symbols, and occasionally we use infix symbols. (Of course, in the actual translation of Landau's book, these deviations from proper AUT-QE do not occur.)
CHAPTER 1. PREPARATION

In this chapter the logic which Landau presupposes is analysed and its representation in AUT-QE is described.

1.0. The presupposed logic
In his “Vorwort für den Lernenden” Landau states: “Ich setze logisches Denken und die deutsche Sprache als bekannt voraus”. Clearly, in the translation AUT-QE should be substituted for “die deutsche Sprache”. The proper interpretation of “logisches Denken” must be inferred from Landau's use of logic in his text. This appears to be a kind of informal second (or higher) order predicate logic with equality. In the following some characteristics of Landau's logic will be discussed, and illustrated by quotations from his text.
(i) Variables have well defined ranges which are not too different from types [van Daalen 73 (A.3), 2.2] in AUT-QE. Cf.:
- On the first page of “Kapitel 1”: “Kleine lateinische Buchstaben bedeuten in diesem Buch, wenn nichts anderes gesagt wird, durchweg natürliche Zahlen”.
- In “Kapitel 2, §5”: “Grosze lateinische Buchstaben bedeuten durchweg, wenn nichts anderes gesagt wird, rationale Zahlen”.
(ii)
Predicates have restricted domains, which again can be interpreted as types in AUT-QE. Cf.:
- “Satz 9: Sind x und y gegeben, so liegt genau einer der Fälle vor: (1) x = y. (2) Es gibt ein u mit x = y + u ...” etc. It is clear that u (being a lower case letter) is a natural number, or u ∈ nat.
- “Definition 28: Eine Menge von rationalen Zahlen heiszt ein Schnitt, wenn ...”.
Here it is apparent that being a “Schnitt” is a predicate on the type of sets of rational numbers.

(iii) When, for a predicate P, it has been shown that a unique x exists for which P holds, then “the x such that P” is an object. Cf.:
- “Satz 4, zugleich Definition 1: Auf genau eine Art läszt sich jedem Zahlenpaar x, y eine natürliche Zahl, x + y genannt, so zuordnen, dasz ... x + y heiszt die Summe von x und y”.
- “Satz 101: Ist X > Y, so hat Y + U = X genau eine Lösung U. Definition 23: Dies U heiszt X − Y”.
(iv) The theory of equivalence classes modulo a given equivalence relation, whereby such classes are considered as new objects, is presupposed by Landau. Cf.:
- The text preceding “Satz 40”: “Auf Grund der Sätze 37 bis 39 zerfallen alle Brüche in Klassen, so dasz x1/x2 ∼ y1/y2 dann und nur dann, wenn x1/x2 und y1/y2 derselben Klasse angehören”.
- “Definition 16: Unter einer rationalen Zahl versteht man die Menge aller einem festen Bruch äquivalenten Brüche (also eine Klasse im Sinne des §1)”.
(v) The concepts “function” and “bijective function” are vaguely described. Cf.:
- “Satz 4” (see (iii) above).
- “Satz 274: Ist x < y, so können die m ≤ x nicht auf die n ≤ y eineindeutig bezogen werden”.
- “Satz 275: Es sei x fest, f(n) für n ≤ x definiert. Dann gibt es genau ein für n ≤ x definiertes g_x(n) mit folgenden Eigenschaften ...”, followed by the “explanation”: “Unter definiert verstehe ich: als komplexe Zahl definiert”. This explanation might be interpreted to indicate the typing of the functions f and g.

(vi)
Landau defines and uses partial functions. Cf.:
- “Definition 14: Das beim Beweise des Satzes 67 konstruierte spezielle u1/u2 heiszt x1/x2 − y1/y2 ...”. Here the construction, and therefore the definition, only applies if x1/x2 > y1/y2.
- “Definition 56: Das Y des Satzes 204 heiszt Ξ/H”. This definition depends upon H ≠ 0.
- “Definition 71”, where Landau states explicitly: “Nicht definiert ist x^n also lediglich für x = 0, n ≤ 0”.
- “Satz 155: Beweis: II) Aus X > Y folgt X = (X − Y) + Y”.
- “Satz 240: Ist y ≠ 0, so ist (x/y) · y = x”.
- “Satz 291: Es sei n > 0 oder x1 ≠ 0, x2 ≠ 0. Dann ist (x1 · x2)^n = x1^n · x2^n”.

In these last three examples we see “generalized implications”: the terms occurring in the consequent are meaningful only if the antecedent is taken to be true. A similar situation will be encountered in (vii).

(vii) Definitions by cases, sometimes of a complicated nature, are used. Cf.:
- “Definition 52:

      Ξ + H = −(|Ξ| + |H|)   wenn Ξ < 0, H < 0;
              |Ξ| − |H|      wenn Ξ > 0, H < 0, |Ξ| > |H|;
              0              wenn Ξ > 0, H < 0, |Ξ| = |H|;
              −(|H| − |Ξ|)   wenn Ξ > 0, H < 0, |Ξ| < |H|;
              H + Ξ          wenn Ξ < 0, H > 0;
              H              wenn Ξ = 0;
              Ξ              wenn H = 0.”

- “Definition 71:

      x^n = ∏_{i=1}^{n} x    wenn n > 0;
            1                wenn x ≠ 0, n = 0;
            1 / x^{|n|}      wenn x ≠ 0, n < 0.”
Notice that in these two definitions, in some of the cases the definiens is not defined when the corresponding condition does not hold (“generalized definition by cases”), and also that, in some cases, there is in the definiens a reference to the definiendum.

(viii) In his text Landau only occasionally mentions predicates and relations; usually he refers to sets. Cf.:
- “Axiom 5: Es sei M eine Menge natürlicher Zahlen mit den Eigenschaften: (1) 1 gehört zu M. (2) Wenn x zu M gehört, so gehört x′ zu M. Dann umfaszt M alle natürlichen Zahlen”.
- “Satz 2: x′ ≠ x. Beweis: M sei die Menge der x, für die dies gilt ...”.
However, in the text preceding “Definition 26”:
- “Da =, >, <, Summe und Produkt den alten Begriffen entsprechen ...”.
(ix) Landau considers (ordered) pairs of objects. In Chapter 2 the components of such pairs remain clearly visible in their names: he does not refer to “the pair x with components x1 and x2”, but only to “the pair x1, x2”. Nevertheless it is clear from his words that he considers such a pair as one object. Cf.:
- “Definition 7: Unter einem Bruch x1/x2 versteht man das Paar der natürlichen Zahlen x1, x2 (in dieser Reihenfolge)”.
- “Definition 8: x1/x2 ∼ y1/y2, wenn x1 y2 = y1 x2”.
In Chapter 5, however, variables for pairs are used. Cf.:
- “Definition 57: Eine komplexe Zahl ist ein Paar reeller Zahlen Ξ1, Ξ2 (in bestimmter Reihenfolge). Wir bezeichnen die komplexe Zahl mit [Ξ1, Ξ2]”.
This definition is immediately followed by
- “Kleine deutsche Buchstaben bedeuten durchweg komplexe Zahlen”.
The two notations are linked in the following way:
- “Definition 60: Ist x = [Ξ1, Ξ2], y = [H1, H2], so ist x + y = [Ξ1 + H1, Ξ2 + H2]”.

(x)
Finally it should be pointed out that some of Landau’s proofs and remarks tend to a kind of intuitive reasoning which is not easily represented in a formal system. A first example of this is the treatment of equality in “Kapitel 1, §1”.
- “Ist x gegeben und y gegeben, so sind entweder x und y dieselbe Zahl; das kann man auch x = y schreiben; oder x und y nicht dieselbe Zahl; das kann man auch x ≠ y schreiben. Hiernach gilt aus rein logischen Gründen: (1) x = x für jedes x. (2) Aus x = y folgt y = x. (3) Aus x = y, y = z folgt x = z”.
Here it seems that Landau derives the properties of equality from reflection on the properties of a mathematical structure. They are not theorems or axioms but intuitively true statements. Substitutivity of equal objects, though used frequently in the proofs of subsequent theorems, is never mentioned. Other examples of proofs with intuitive components may be found where Landau, at a glance, takes in a complex logical situation. Cf.:
- “Satz 16: Aus x ≤ y, y < z oder x < y, y ≤ z folgt x < z. Beweis: Mit dem Gleichheitszeichen in der Voraussetzung klar; sonst durch Satz 15 erledigt”.
- “Satz 20: Aus x + z > y + z bzw. x + z = y + z bzw. x + z < y + z folgt x > y bzw. x = y bzw. x < y. Beweis: Folgt aus Satz 19, da die drei Fälle beide Male sich ausschlieszen und alle Möglichkeiten erschöpfen”.
A somewhat different example, which involves what might be called “metalogic”, is the text preceding “Definition 26”, where it is indicated how a number of theorems might be proved, without actually proving them. I will return to this in 2.1 (viii).
1.2. The representation of logic in AUT-QE

The logic considered by Landau to be “logisches Denken”, as described in the previous section, has been formalized in the first part of the AUT-QE book, called “preliminaries”, which, unlike the other parts, does not correspond to an actual chapter of Landau’s book. A possible way of coding logic in AUT-QE has been described in [van Daalen 73 (A.3), 3, 4]. In addition to this description we stress a few points on the interpretation of AUT-QE lines [van Daalen 73 (A.3), 2.13, 5.4.4]. Adopting the terminology introduced in [Zucker 77 (A.4)] we shall call expressions of the form [x1 : α1] ... [xk : αk] type (with k ≥ 0) (i.e. t-expressions of degree 1) 1t-expressions, and expressions of the form [x1 : α1] ... [xk : αk] prop (again with k ≥ 0) 1p-expressions. Expressions having 1t- and 1p-expressions as their types will be called 2t-expressions and 2p-expressions, respectively. Finally, 3t- and 3p-expressions have 2t- and 2p-expressions as their types. Now a 2t-expression will be used to denote a type (or “class”). If its type is an abstraction expression [van Daalen 73 (A.3), 2.8, 5.4.2] then it denotes
a type of functions. A 2p-expression denotes a proposition or a predicate. A 3t-expression denotes an object (of a certain type) and a 3p-expression a proof (of a certain proposition). The interpretation of an AUT-QE line having a certain shape (EB-line, PN-line or abbreviation line [van Daalen 73 (A.3), 2.13, 5.4.4]) will depend on its category part [van Daalen 73 (A.3), 2.13.1] being a 1t-, 1p-, 2t- or 2p-expression. So we arrive at the following refinement of the scheme in [van Daalen 73 (A.3), 4.5].
Shape of the line, by category part:

EB-line:
  1t-expression: introduces a type variable
  1p-expression: introduces a proposition or predicate variable
  2t-expression: introduces an object variable (of the stated type)
  2p-expression: introduces the stated proposition as an assumption

PN-line:
  1t-expression: introduces a primitive type constant
  1p-expression: introduces a primitive proposition or predicate constant
  2t-expression: introduces a primitive object (of the stated type)
  2p-expression: introduces the stated proposition as an axiom

Abbreviation line:
  1t-expression: defines a type in terms of known concepts
  1p-expression: defines a proposition or predicate in terms of known concepts
  2t-expression: defines an object (of the stated type) in terms of known concepts
  2p-expression: proves the stated proposition as a theorem
In the above scheme it is apparent that, if the category part of a line is a 1p-expression, then the interpretation of that line is an assertion. But also if the category part is a 2t-expression α, the interpretation has an assertional aspect; the line does not only introduce a new name for an object (as a variable, or a primitive or defined constant) but also asserts that this object has the type α.
1.3. Account of the PN-lines

Here I will give a survey of the primitive concepts and axioms (PN-lines) occurring in the preliminary AUT-QE text. A mechanically produced list of
these axioms appears as Appendix 3 [see [D.5], in this Volume]. In this list the PN-lines appear numbered. References in parentheses below will refer to these numbers.

(i) Axioms for contradiction. Contradiction is postulated as a primitive proposition (1), the double negation law as an axiom (2).
(ii) Axioms for equality. Given a type S, equality is introduced as a primitive relation on S (3), with axioms for reflexivity (4) and for substitutivity (5) (i.e. if x = y, and if P is a predicate on S which holds at x, then P holds at y). Moreover, there is an axiom stating extensionality for functions (8). The notion of equality so introduced is called book-equality (cf. [van Daalen 73 (A.3), 3.6]) in contrast to definitional equality of expressions ([van Daalen 73 (A.3), 2.12, 5.5.6]).

(iii) Axioms for individuals. Given a type S, a predicate P on S, and a proof that P holds at a unique x ∈ S, the object ind (for individual) is a primitive object (6), to be interpreted as “the x for which P holds”. An axiom states that ind satisfies P
(7).

(iv) Axioms for subtypes. Given a type S and a predicate P on S, the type OT (for own-type, i.e. the subtype of S associated with P) is a primitive type (9). For u ∈ OT we have a primitive object in(u) ∈ S (10), and an axiom stating that the function [u : OT] in(u) is injective (12). Moreover, there are axioms to the effect that the images under this function are just those x ∈ S for which P holds ((11) and (13)).

(v) Axioms for products (of types). Given types S and T, the type pairtype (the type of pairs (x, y) with x ∈ S and y ∈ T) is introduced as a primitive type (14). For p ∈ pairtype we have the projections first(p) ∈ S and second(p) ∈ T as primitive objects ((16) and (17)), and conversely, for x ∈ S and y ∈ T we have pair(x, y) as a primitive object in pairtype (15). Next there are three axioms stating that pair(first(p), second(p)) = p, first(pair(x, y)) = x and second(pair(x, y)) = y (where = refers to book-equality as introduced in (ii)) ((18), (19) and (20)).

(Note: If a type U containing just two objects is available, and if S is a type, the type of pairs (x, y) with x ∈ S and y ∈ S may be defined alternatively as the function type [x : U] S. In the translation this was done at
the end of Chapter 1, where we took for U the subtype of the naturals ≤ 2. Therefore the pairing axioms as described above were not used in the actual translation.)

(vi) Axioms for sets. Given a type S, the type set (the type of sets of objects in S) is introduced as a primitive type (21), and the element relation as a primitive relation (22). Given a predicate P on S, there is a primitive object setof(P) ∈ set (denoting the set of x ∈ S satisfying P) (23), and there are two axioms to the effect that P holds at x iff x is an element of setof(P) ((24) and (25)). These can be viewed as comprehension axioms for S. (As sets contain only objects of one type, such axioms will not give rise to Russell-type paradoxes.) Finally extensionality for sets is stated as an axiom (26). The axioms for sets permit “higher-order” reasoning in AUT-QE, since quantification over the type set is possible.

1.4. Development of concepts and theorems in Landau’s logic
Here we give a sketch of the development of the logic in [Landau 30] from the axioms described in the previous section. Starting from the axioms for contradiction, the development of classical first order predicate calculus is straightforward. In this development more than usual attention has been paid to mutual exclusion: ¬(A ∧ B), and trichotomy: (A ∨ B ∨ C) ∧ (¬(A ∧ B) ∧ ¬(B ∧ C) ∧ ¬(C ∧ A)), because these concepts are used frequently by Landau in discussing linear order. The properties of equality, e.g. symmetry, transitivity, and substitutivity for functions (i.e. if x = y and f is a function on S then f(x) = f(y)), follow from the axioms for equality. The development of the theory of equivalence classes (cf. 1.0 (iv)) requires the axioms for subtypes and for sets. It turns out here, when translating mathematics into AUT-QE, that Landau goes quite far in considering concepts and statements about those concepts to belong to “logisches Denken”. We had to choose how to describe partial functions in AUT-QE. As an example let us consider the function f on the type rl of the reals, defined for all x ∈ rl for which x ≠ 0, and mapping x to 1/x. There are (at least) four reasonable ways to represent f:

(i) The range of f may be taken to be rl*, the “extended type” of reals, containing, apart from the reals, an object und representing “undefined”. In this case (0)f will be (book-equal to) und, and rl may be defined as OT(rl*, [x : rl*] (x ≠ und)).
(ii) An arbitrary fixed object in rl, 0 say, may replace und. Then (0)f will be taken to be 0.

(iii) f may be considered as a function on OT(rl, [x : rl] x ≠ 0), the subtype of the nonzero reals.

(iv) f may be represented as a function of two variables: an object x ∈ rl and a proof p ∈ x ≠ 0, so

    f ∈ [x : rl] [p : x ≠ 0] rl .

(Then, given an x such that x ≠ 0, i.e. given an x and a proof p that x ≠ 0, we can use (p)(x)f to represent 1/x.)

It is clear that the representations (i) and (ii) have much in common. The representations (iii) and (iv) are also related: in fact, we may construct, by the axioms for subtypes, for given x ∈ rl and p ∈ x ≠ 0 an object out(x, p) ∈ OT(rl, [x : rl] x ≠ 0). Then, if

    f1 ∈ [x : OT(rl, [x : rl] x ≠ 0)] rl ,

then

    [x : rl] [p : x ≠ 0] (out(x, p)) f1 ∈ [x : rl] [p : x ≠ 0] rl .

On the other hand, if

    f2 ∈ [x : rl] [p : x ≠ 0] rl ,

then

    [x : OT(rl, [x : rl] x ≠ 0)] (OTAx(x)) (in(x)) f2 ∈ [x : OT(rl, [x : rl] x ≠ 0)] rl

(for brevity some obvious subexpressions in the formula above have been omitted). After a careful examination of Landau’s language, I have decided that the fourth representation is closest to his intention, and have therefore adopted it. However, this leads to the following difficulty: Let, in our example, x ∈ rl and y ∈ rl be given, such that x = y, and suppose we have proofs p ∈ (x ≠ 0) and q ∈ (y ≠ 0). Now it is not a priori clear in AUT-QE (though it is clear to Landau) that the corresponding values (p)(x)f and (q)(y)f will be equal. That is: it is not guaranteed in the language that the function values for equal arguments will be independent of the proofs p and q. This property of partial functions, which is called irrelevance of proofs, can be proved for all functions which Landau introduces. When discussing arbitrary
partial functions, however, irrelevance of proofs had to be assumed in some places (cf. gite below). For a further discussion we refer to 4.0.1. As a consequence of the chosen representation of partial functions, terms may depend on proofs, and therefore certain propositions are meaningful only if others are true. This gives rise to generalized implications (cf. 1.0 (vi)) and generalized conjunctions, such as:

    “x > 0 ⇒ 1/x > 0”  and  “x > 0 ∧ √x ≠ 2” .

Logical connectives of this kind have been formalized in a separate paragraph of the preliminary AUT-QE text. The definition-by-cases operator ite (short for if-then-else, cf. 1.0 (vii)) can be defined on the basis of the axioms for individuals. As we have seen (1.0 (vii)), Landau admits partial functions in such definitions. For these cases a “generalized” version of the definition-by-cases operator gite (for generalized if-then-else) is required, which has been defined only for partial functions satisfying the irrelevance of proofs condition. All set theoretical concepts used by Landau (cf. 1.0 (viii)) may be defined starting from our axioms for sets. The passages in Landau’s text which use more or less intuitive reasoning (cf. 1.0 (x)) could not very well be translated. In the relevant places straightforward logical proofs were given, which follow Landau’s line of thought as closely as possible.
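For the modern reader, the four representations of a partial function such as 1/x, and the passage between (iii) and (iv), can be sketched in Lean 4, a descendant of the Automath tradition. This is an illustrative analogy only, not AUT-QE; the names rl, zero, toProofArg and toSubtypeArg are ours, and Lean's Subtype plays the rôle of OT:

```lean
-- Abstract stand-ins for the type rl of reals and a distinguished zero
-- (hypothetical names; no arithmetic is needed to make the typing point).
variable (rl : Type) (zero : rl)

-- (i)  extended range: `Option rl` plays the rôle of rl*, with `none` as und
def extendedRange := rl → Option rl

-- (ii) a fixed default object of rl replaces und
def defaultValue := rl → rl

-- (iii) restrict the domain to the subtype of nonzero elements, as OT does
def subtypeDomain := {x : rl // x ≠ zero} → rl

-- (iv) take a proof of x ≠ zero as a second argument, as in the translation
def proofArgument := (x : rl) → x ≠ zero → rl

-- (iv) from (iii): out(x, p) corresponds to subtype introduction ⟨x, p⟩
def toProofArg (f : {x : rl // x ≠ zero} → rl) : (x : rl) → x ≠ zero → rl :=
  fun x p => f ⟨x, p⟩

-- (iii) from (iv): in(x) is the projection x.val, OTAx the proof x.property
def toSubtypeArg (f : (x : rl) → x ≠ zero → rl) : {x : rl // x ≠ zero} → rl :=
  fun x => f x.val x.property
```

Note that in Lean irrelevance of proofs holds definitionally for propositions, whereas in AUT-QE, as described above, it had to be proved for Landau's functions and assumed for arbitrary ones.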
CHAPTER 2. TRANSLATION

In this chapter we discuss the actual translation of Landau’s book, the difficulties encountered and the way they were overcome (or evaded). First, in Section 2.0, we give an abstract of Landau’s book; then, in Section 2.1, a general survey is given of the various reasons to deviate occasionally from Landau’s text. In the following sections we describe the translation of the Chapters 1 to 5 of Landau’s book.

2.0. An abstract of Landau’s book
(i) “Kapitel 1. Natürliche Zahlen”. Peano’s axioms for the natural numbers 1, 2, 3, ... are stated. “+” is defined as the unique operation satisfying x + 1 = x′ and x + y′ = (x + y)′. Properties of “+” (associativity, commutativity) are derived. Order is defined by x > y := ∃u (x = y + u). It is proved to be a linear order relation and its connections with “+” are derived. “Satz 27” states that it is a well-ordering. “·” (multiplication) is defined as the unique operation satisfying x · 1 = x and x · y′ = x · y + x. Properties of “·” (commutativity, associativity) are derived, and also its connections with “+” (distributivity) and with order.

(ii) “Kapitel 2. Brüche”. Fractions (i.e. positive fractions) are defined as pairs of natural numbers. Equivalence of fractions is defined, and proved to be an equivalence relation. Order is defined, it is shown to be preserved by equivalence, and to be an order relation. Properties are derived (e.g. it is shown that neither maximal nor minimal fractions exist, and that the set of fractions is dense in itself). Addition and multiplication are defined, and proved to be consistent with equivalence. Their basic properties and interconnections are derived, and their connections with order are shown. Also subtraction and division are defined. Rationals (i.e. positive rational numbers) are defined as equivalence classes of fractions. Order, addition and multiplication are carried over to the rationals, and their various properties are proved. Finally the natural numbers are embedded, and the order in the rationals is shown to be archimedean.
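Landau's recursive characterizations of "+" and "·" can be sketched in Lean 4 on a from-1 copy of the naturals (the names PNat, add, mul and gtP are ours, not Landau's or the translation's). The equations below are exactly x + 1 = x′, x + y′ = (x + y)′ and x · 1 = x, x · y′ = x · y + x; what "Satz 4" adds is the uniqueness of such an operation, which the structural recursor here gives for free:

```lean
-- Naturals starting at 1, with `succ` for Landau's dash (hypothetical names).
inductive PNat where
  | one  : PNat
  | succ : PNat → PNat

-- x + 1 = x'  and  x + y' = (x + y)'
def add (x : PNat) : PNat → PNat
  | .one    => .succ x
  | .succ y => .succ (add x y)

-- x · 1 = x  and  x · y' = x · y + x
def mul (x : PNat) : PNat → PNat
  | .one    => x
  | .succ y => add (mul x y) x

-- x > y :=  there is a u with x = y + u
def gtP (x y : PNat) : Prop := ∃ u, x = add y u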
(iii) “Kapitel 3. Schnitte”. Cuts in the positive rationals are defined. For these cuts, order, addition (with subtraction), and multiplication (with division) are defined, and again the various properties and interconnections of these concepts are proved. The rationals are embedded, and the set of rationals is proved to be dense in the set of cuts. Finally the existence of irrational numbers is proved, by introducing √2 as an example.

(iv) “Kapitel 4. Reelle Zahlen”. The cuts are now identified with the positive real numbers, and to these the real number 0 and the negative reals are adjoined, in such a way that to every positive real there corresponds a unique negative real. The absolute value of a real number is defined. Order is defined, its properties are derived, and the predicates “rational” and “integral” (“ganz”) are defined on the reals. Now addition and multiplication are defined, and their properties and their
connections with each other, with absolute value and with order are derived. In particular the minus operator (associating to each real its additive inverse) is discussed, as well as subtraction and division. Finally, in the “Dedekindsche Hauptsatz”, Dedekind-completeness of the order in the reals is proved.

(v)
“Kapitel 5. Komplexe Zahlen”. Complex numbers are defined as pairs of reals. Addition, multiplication, subtraction and division, their properties and interconnections are discussed. To each complex number is associated its conjugate, and also (following the definition of the square root of a nonnegative real) its modulus (as a real number). The connections of these two concepts with each other and with the previously introduced operations are derived. For an associative and commutative operator * (which may be interpreted as either “+” or “·”), and for an n-tuple of complex numbers f(1), ..., f(n), Landau denotes

    f(1) * f(2) * ... * f(n)  by  ⊗_{i=1}^{n} f(i) .

This concept is defined as the value at n of the unique function g (with domain {1, 2, ..., n}) for which g(1) = f(1) and g(i′) = g(i) * f(i′) for i < n. The properties of ⊗ are proved; in particular, for a permutation s of {1, 2, ..., n} it is proved that

    ⊗_{i=1}^{n} f(i) = ⊗_{i=1}^{n} f(s(i)) .

The definition of ⊗ is extended to n-tuples f(y), f(y + 1), ..., f(y + n − 1) (where y is an integer), and its properties are carried over to this situation. Σ is defined as the specialization of ⊗ to the operation +, and Π as its specialization to · (multiplication). Some properties of Σ and Π are proved. For a complex number x and an integer n, with x ≠ 0 or n > 0, x^n is defined, and its properties and connections with previously defined concepts are discussed. Finally the reals are embedded in the set of complex numbers, the number i is defined, it is proved that i² = −1, and that each complex number may be uniquely represented as a + bi with a, b real.
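The iteration construct g(1) = f(1), g(i′) = g(i) * f(i′) is a left-nested fold, which can be sketched in Lean 4 (names iterOp, bigSum, bigProd are ours, and we index from 0 where Landau indexes from 1). Associativity and commutativity of * are only needed for theorems such as the permutation property, not for the definition itself:

```lean
-- g(0) = f(0), g(n+1) = g(n) * f(n+1): left-nested iteration of `op` over f.
def iterOp {α : Type} (op : α → α → α) (f : Nat → α) : Nat → α
  | 0     => f 0
  | n + 1 => op (iterOp op f n) (f (n + 1))

-- Σ and Π arise as specializations of the one general construct:
def bigSum  (f : Nat → Nat) : Nat → Nat := iterOp (· + ·) f
def bigProd (f : Nat → Nat) : Nat → Nat := iterOp (· * ·) f

#eval bigSum (fun i => i) 3        -- 0 + 1 + 2 + 3 = 6
#eval bigProd (fun i => i + 1) 3   -- 1 * 2 * 3 * 4 = 24
```

Defining the general ⊗ once and specializing, rather than developing "+" and "·" separately, is precisely the design choice the translation made (see 2.1 (vii)).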
2.1. Deviations from Landau’s text

In our translation, deviations from Landau’s text appear occasionally. They may be classified as follows:
(i)
In some cases a direct translation of Landau’s proofs seems a bit too complicated. We list three reasons for this.
(a) Sometimes it is due to the structure of AUT-QE, which does not quite agree with the proof Landau gives. E.g. in the proof of “Satz 6” Landau applies, for fixed y, induction with respect to x. As x ∈ nat, y ∈ nat is a common context in the translation, it is easier there to apply, for fixed x, induction with respect to y.

(b) Sometimes the reason is that Landau uses a unifying argument. E.g. in the proof of the “Dedekindsche Hauptsatz” there are, at a certain stage, two real numbers Ξ and H, such that Ξ > 0 and Ξ > H. Here Landau needs a rational number z such that Ξ > z > H. Now it has been proved in “Satz 159” that between any two positive reals there is a rational. If H > 0 this may be applied immediately. If H ≤ 0, Landau defines H1 = Ξ/(1 + 1) and again applies “Satz 159”, this time with H1. This argument, however, is complicated, because, to apply “Satz 159”, first 0 < H1 < Ξ has to be proved (which Landau fails to do). And it is superfluous, because every z in the cut Ξ will meet the conditions in this case.

(c) In one instance (the proof of “Satz 27”) Landau has given a complex proof, which may be simplified.

In all these cases I have, in the translation, given a proof which follows Landau’s line of reasoning. However, in some cases I have also given shorter alternative proofs. This means that the deviations are optional in these cases.

(ii)
Some of Landau’s “Sätze” really consist of two or three theorems. E.g. “Satz 16: Aus x ≤ y, y < z oder x < y, y ≤ z folgt x < z”. In such cases the theorem has been split up: “Satz 16a: Aus x ≤ y, y < z folgt x < z”, “Satz 16b: Aus x < y, y ≤ z folgt x < z”.
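The split can be mirrored directly; a Lean 4 sketch on Nat for concreteness (the theorem names echo the translation's, the library lemmas are Lean's own):

```lean
-- "Satz 16a: Aus x ≤ y, y < z folgt x < z"
theorem satz16a {x y z : Nat} (h1 : x ≤ y) (h2 : y < z) : x < z :=
  Nat.lt_of_le_of_lt h1 h2

-- "Satz 16b: Aus x < y, y ≤ z folgt x < z"
theorem satz16b {x y z : Nat} (h1 : x < y) (h2 : y ≤ z) : x < z :=
  Nat.lt_of_lt_of_le h1 h2
```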
(iii)
Very frequently Landau uses without notice a number of more or less trivial corollaries of a theorem he has proved. E.g. besides “Satz 93: (X + Y) + Z = X + (Y + Z)” he uses “X + (Y + Z) = (X + Y) + Z” without quoting “Satz 79”. Sometimes such a practice is explicitly announced, e.g. in the “Vorbemerkung” to “Satz 15”, where it is stated that, with any property derived for <, the corresponding property for > shall be used. In all such cases the corollaries have been formulated and proved after the theorems.
(iv)
Following the translation of the definition of a concept, we often added the specialization to this concept of certain general properties. E.g. after the introduction of +, substitutivity of equality was applied: “If x = y then x + z = y + z and z + x = z + y. If x = y and z = u then x + z = y + u”. This was done in order to make later applications easier.
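In a Lean 4 sketch (again on Nat), such specializations are one-step instances of substitutivity, which is why proving them once pays off in every later application:

```lean
-- "If x = y then x + z = y + z and z + x = z + y":
example (x y z : Nat) (h : x = y) : x + z = y + z := congrArg (· + z) h
example (x y z : Nat) (h : x = y) : z + x = z + y := congrArg (z + ·) h

-- "If x = y and z = u then x + z = y + u":
example (x y z u : Nat) (h1 : x = y) (h2 : z = u) : x + z = y + u := by
  rw [h1, h2]
```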
(v)
In a few proofs of the last three chapters minor changes were made. E.g. in the proof of “Satz 145”, where Landau states: “Aus ξ > η folgt nach Satz 140 bei passendem u: ξ = η + u”, but where, by “Definition 35”, u can be defined explicitly by u := ξ − η. This has been done in the translation, thus avoiding the superfluous existence elimination. Another deviation occurs in the proof of “Satz 284”, where Landau writes a certain chain of equalities. As in the proof the equality

    ((u + 1) − y) + ((x + 1) − (u + 1)) = (x + 1) − y

was needed, the following chain of equations was preferred in the translation:

    ((u + 1) − y) + ((x + 1) − (u + 1)) =
    = ((x + 1) − (u + 1)) + ((u + 1) − y) =
    = (((x + 1) − (u + 1)) + (u + 1)) − y = (x + 1) − y .
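The equality that motivated the change can be checked mechanically; a Lean 4 sketch on Nat's truncated subtraction, with the side conditions under which all the differences involved are genuine:

```lean
-- ((u+1) − y) + ((x+1) − (u+1)) = (x+1) − y, given y ≤ u+1 ≤ x+1.
example (x y u : Nat) (h1 : y ≤ u + 1) (h2 : u + 1 ≤ x + 1) :
    ((u + 1) - y) + ((x + 1) - (u + 1)) = (x + 1) - y := by
  omega
```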
(vi)
As we have seen in 1.0 (viii) Landau formulates Peano’s fifth axiom in terms of sets, and, when applying it, always represents a predicate as a set. In the translation this extra step has been avoided. The induction axiom is indeed introduced for sets, but then immediately a lemma, called induction, which applies to predicates is proved. This lemma has been used systematically in all proofs by induction. Also “Satz 27: In jeder nicht leeren Menge natürlicher Zahlen gibt es eine kleinste” has been reworded and proved in terms of predicates and not of “Mengen”.
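The move from the set-based axiom to the predicate lemma is a one-liner once a set of naturals is read via its membership predicate; a Lean 4 sketch (inductionSet stands in for the translation's set-based axiom, and all names here are illustrative, not the translation's identifiers):

```lean
-- A "Menge natürlicher Zahlen" identified with its membership predicate.
def NatSet := Nat → Prop

-- The induction axiom as stated for sets (guarded by 1 ≤ x, since
-- Landau's naturals start at 1); `axiom` mirrors a PN-line.
axiom inductionSet (M : NatSet) :
  M 1 → (∀ x, M x → M (x + 1)) → ∀ x, 1 ≤ x → M x

-- The derived lemma for predicates: apply the axiom to P itself.
theorem inductionPred (P : Nat → Prop)
    (h1 : P 1) (hs : ∀ x, P x → P (x + 1)) : ∀ x, 1 ≤ x → P x :=
  inductionSet P h1 hs
```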
(vii)
“Intuitive arguments” of Landau were translated in various ways. E.g. “Satz 20: Aus x + z > y + z bzw. x + z = y + z bzw. x + z < y + z folgt x > y bzw. x = y bzw. x < y.
Beweis: Folgt aus Satz 19, da die drei Fälle beide Male sich ausschlieszen und alle Möglichkeiten erschöpfen” (where “Satz 19” asserts the inverse implications). Considering the fact that Landau regards this proof as belonging to “logisches Denken”, I have proved in the preliminaries three “logical” theorems to the effect that: If A ∨ B ∨ C, ¬(D ∧ E), ¬(E ∧ F), ¬(F ∧ D) and A ⇒ D, B ⇒ E, C ⇒ F, then D ⇒ A, E ⇒ B and F ⇒ C. These theorems were used in the translation. A second example: “Satz 17: Aus x ≤ y, y ≤ z folgt x ≤ z. Beweis: Mit zwei Gleichheitszeichen in der Voraussetzung klar; sonst durch Satz 16 erledigt” (“Satz 16” is quoted above under (ii)). Here the AUT-QE text, when translated back into German, might read: “Beweis: Es sei x = y. Dann ist, wenn y = z, auch x = z, also x ≤ z. Wenn aber y < z, so ist x < z nach Satz 16a, also ebenfalls x ≤ z. Nehme jetzt an x < y. Dann folgt aus Satz 16b x < z, also auch in diesem Fall x ≤ z. Deshalb ist jedenfalls x ≤ z”. Another argument which is difficult to translate faithfully occurs in “Kapitel 5, §8”, where sums and products are introduced. Landau uses here a symbol which he intends to represent either “+” or “·”, and in this way simultaneously defines “Σ” and “Π”. In our translation we defined iteration for arbitrary commutative and associative operators, and consequently our concept and the relevant theorems are essentially stronger than Landau’s. This generality is much easier to describe in AUT-QE than a theory which applies only to “+” and “·”.
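The first of the three "logical" theorems can be sketched in Lean 4 (the hypothesis ¬(E ∧ F) is not needed for this particular converse, and the theorem name is ours):

```lean
-- From A ∨ B ∨ C, mutual exclusion of the conclusions, and the forward
-- implications of "Satz 19", the converse D ⇒ A follows; E ⇒ B and
-- F ⇒ C are proved symmetrically.
theorem converse_of_trichotomy {A B C D E F : Prop}
    (tri : A ∨ B ∨ C)
    (hDE : ¬(D ∧ E)) (hFD : ¬(F ∧ D))
    (hBE : B → E) (hCF : C → F) :
    D → A :=
  fun hD =>
    tri.elim id fun bc =>
      bc.elim
        (fun hB => absurd ⟨hD, hBE hB⟩ hDE)
        (fun hC => absurd ⟨hCF hC, hD⟩ hFD)
```

Note that the proof is purely intuitionistic: no excluded middle is needed, only the stated exclusions.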
(viii) Landau uses metatheorems whenever he embeds one structure into another, to show that the properties proved for the old structure “carry over” to the new. As an example I cite his treatment in Chapter 2 of the embedding of the natural numbers into the (positive) rationals.
- “... folgt x > y bzw. x = y bzw. x < y”.
- “Definition 25: Eine rationale Zahl heiszt ganz, wenn unter den Brüchen, deren Gesamtheit sie ist, ein Bruch x/1 vorkommt”.
- “Dies x ist nach Satz 111 eindeutig bestimmt, und umgekehrt entspricht jedem x genau eine ganze Zahl”.
- “Satz 113: Die ganzen Zahlen genügen den fünf Axiomen der natürlichen Zahlen, wenn die Klasse von 1/1 an Stelle von 1 genommen wird, und als Nachfolger der Klasse von x/1 die Klasse von x′/1 angesehen wird”.

Landau adds the following comment: “Da =, >, <, Summe und Produkt (nach Satz 111 und 112) den alten Begriffen entsprechen, haben die ganzen Zahlen alle Eigenschaften, die wir in Kapitel 1 für die natürlichen Zahlen bewiesen haben”. It was difficult to translate this text. The translation requires first a careful analysis of the interpretation of Peano’s axioms in Chapter 1. There are two possibilities: In the first interpretation, the axioms describe fundamental properties of the given system of naturals (nat, 1, suc), which cannot be proved from more primitive properties, and from which all other properties of the system can be derived. In this conception there is an intention to characterize the structure by the axioms. In the second interpretation, the axioms are simply assumptions underlying a certain theory. The theorems of the theory are valid in any structure in which these assumptions hold. In this view, no claim is made that the axioms characterize the system. The difference between these two conceptions can be illustrated by comparing the rôle of the axioms in Euclid’s geometry to the rôle of the axioms for groups in group theory. The interpretation of “Satz 113” and Landau’s comment varies according to the interpretation of the Peano axioms. In the first interpretation the “ganzen rationalen Zahlen” form a structure (nat*, 1*, suc*) which “happens to” have the same fundamental properties as the original structure (nat, 1, suc). Hence, by a suitable metatheorem, we see that the reasoning of Chapter 1 may be repeated for this new structure, extending it to (nat*, 1*, suc*, +*, ·*, <*) and proving the various properties of this extended system. In the second interpretation “Satz 113” just proves that the structure (nat*, 1*, suc*) satisfies the assumptions. After this the theory of Chapter 1 can be applied immediately.
However, there is a further problem (under either interpretation): addition on nat* defined according to the method of Chapter 1 is not (definitionally) the same thing as the restriction (to nat*) of the addition on the rationals, and these two functions must still be proved to be (extensionally) equal. Similar remarks can be made about multiplication and order.
It follows that the relevant text cannot be rendered directly in AUT-QE under either interpretation of Peano’s axioms. There is, therefore, no technical reason to prefer one of these interpretations to the other. Landau’s ideas on the rôle of the axioms are not quite clear from his text. We cite some of his statements:
- In his “Vorwort für den Kenner” he mentions certain laws on the reals which can be “als Axiome postuliert”.
- He thinks it right that the student should learn “auf welchen als Axiomen angenommenen Grundtatsachen sich lückenlos die Analysis aufbaut”.
- Moreover: “In dieser (Vorlesung) gelange ich, von den Peanoschen Axiomen der natürlichen Zahlen ausgehend, bis zur Theorie der reellen Zahlen”.
- In Chapter 1: “Wir nehmen als gegeben an: Eine Menge, d.h. Gesamtheit, von Dingen, natürliche Zahlen genannt, mit den nachher aufzuzählenden Eigenschaften, Axiome genannt”.
- “Von der Menge der natürlichen Zahlen nehmen wir nun an, dasz sie die Eigenschaften hat ...”.
- A relevant passage is also “Satz 113” quoted above.
- Landau never mentions “a system of naturals”, like in group theory one would discuss “a group”, but always “die natürlichen Zahlen”.
Most of the sentences quoted above point to the second interpretation; some of them, however, could be interpreted better or equally well in the first way. Now, as neither technical reasons nor Landau’s text indicated definitely how Peano’s axioms should be interpreted, I decided to interpret them as postulates (PN-lines) rather than assumptions (EB-lines) because it suited my own conception of the naturals. Moreover, this interpretation reduces the context and thereby simplifies verification. The meta-reasoning sketched above has been treated as follows. After the proof of “Satz 113” the proofs of “Satz 1” and “Satz 4” (where addition is introduced) were copied for the “ganzen Zahlen”. However, addition on the “ganzen Zahlen” has been defined as the restriction of addition on the rationals. Then a number of theorems from “Kapitel 1” was proved using “Satz 112”. Order and multiplication were treated in a similar way. These texts have been inserted as a matter of prestige, because we claimed that we were able to say everything Landau says. The insertions were never used, however (cf. (ix) below).
Checking Landau’s “Grundlagen”, Translation (D.2)
In “Kapitel 3, §5” and “Kapitel 5, §10” similar arguments occur, when the rationals are embedded in the reals, and the reals in the complex numbers. These arguments were “translated” just by constructing the relevant isomorphisms. This suffices for all applications.
(ix)
A consequence of the difficulties described in (viii) is a divergence between the translation and Landau’s book with respect to the use of natural numbers in the Chapters 3, 4 and 5. After his comment (following “Satz 113”) that the “ganzen Zahlen” have the same properties as the “natürlichen Zahlen” Landau continues: “Daher werfen wir die natürlichen Zahlen weg, ersetzen sie durch die entsprechenden ganzen Zahlen, und haben fortan (da auch die Brüche überflüssig werden) in bezug auf das Bisherige nur von rationalen Zahlen zu reden”. In the translation I have not followed this course, because, as pointed out, it would have been a cumbersome task to prove the properties of the “natürliche Zahlen” for the “ganze Zahlen”, and also because it would have been inevitable to repeat this procedure with every further extension of the number system. Therefore I have stuck to the “natürliche Zahlen” throughout the translation. Another important deviation from Landau’s text was caused by “Definition 43: Wir erschaffen eine neue, von den positiven Zahlen verschiedene Zahl 0. Wir erschaffen ferner Zahlen, die von den positiven und 0 verschieden sind, negative genannt, derart, dasz wir jedem ξ (d.h. jeder positiven Zahl) eine negative Zahl zuordnen, die wir −ξ nennen”. I doubt whether this creative act may be called a “definition”. Landau considers it a part of “logisches Denken” to form, given sets (or types) α and β, the Cartesian product α × β, as is clear from Chapter 2. It might also be considered “logical” to form the disjoint union α ⊕ β. But Landau does not mention this, he just “creates” 0 and the negative numbers from nothing. Moreover, I do not see a formal difference between the assertion “1 ist eine natürliche Zahl” (which Landau calls an axiom) and the assertion “0 ist eine reelle Zahl” (which he calls a definition). Neither do I see a formal difference between “x′ ≠ 1” and “−ξ ≠ 0”. In my opinion the limits of “logisches Denken” are exceeded here.
In agreement with this criticism I have translated this “definition” by introducing a number of primitive concepts and axioms (PN-lines). The type of real numbers rl is a primitive type. To any cut ξ real numbers p(ξ) and n(ξ) are associated. 0 is a primitive real number. Next there are axioms to the effect that the functions [x : cut] p(x) and [x : cut] n(x) are
injective. Now x E rl has the property pos (or neg) if it is in the range of the first (or the second) of these functions. Then there are axioms stating that, for x E rl, pos(x), neg(x) and x = 0 are mutually exclusive, and that each x E rl has one of these properties. (In fact Landau does not state the latter axiom explicitly.) Starting from these axioms “Kapitel 4” was translated. However, as I thought it unsatisfactory to develop the theory of real and complex numbers using more than Peano’s axioms alone, I have added an alternative AUT-QE version of Chapter 4, called Chapter 4a, where the real numbers are defined as equivalence classes of pairs of cuts, and where all theorems of Landau’s “Kapitel 4” are proved for these alternative reals. The AUT-QE translation of Chapter 5 has been checked relative to the AUT-QE book consisting of the Chapters 1, 2, 3 and 4a.
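The disjoint-union construction preferred here to Landau's act of "creation" can be sketched computationally. This is a hypothetical toy model (the names p, n, pos, neg echo the primitive concepts above, but the encoding as tagged tuples is ours): with tags, the two injections are injective and the three properties are mutually exclusive by construction rather than by axiom.

```python
# Hypothetical sketch: build "rl" as the disjoint union  cut ⊕ {0} ⊕ cut,
# instead of creating 0 and the negatives from nothing.  Cuts are modelled
# here simply as integers; the tag plays the role of the injection.

def p(xi):                 # the positive real associated to the cut xi
    return ("pos", xi)

def n(xi):                 # the negative real associated to the cut xi
    return ("neg", xi)

ZERO = ("zero", None)      # the one new element "0"

def pos(x): return x[0] == "pos"
def neg(x): return x[0] == "neg"

# injectivity of [x : cut] p(x): equal images force equal cuts
assert p(3) != p(4) and p(3) == p(3)

# trichotomy: each element has exactly one of the three properties
for x in (p(3), n(7), ZERO):
    assert [pos(x), neg(x), x == ZERO].count(True) == 1
```

In this encoding the axioms of the PN-line translation become provable facts about the representation, which is exactly the point of the criticism.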
Checking Landau’s “Grundlagen” in the Automath System
Chapter 4 (Conclusions)
L.S. van Benthem Jutting
CHAPTER 4. CONCLUSIONS
In this chapter we discuss some possibilities to represent logic in Automath. We indicate some desirable extensions of AUT-68 and AUT-QE and we discuss some aspects (positive as well as negative) of our translation.
4.0. Formalization of logic in Automath
In this section we shall describe various possibilities to represent systems of natural deduction in AUT-68 ([van Daalen 73 (A.3), 2]), in AUT-QE and in some closely related languages. First we discuss two main decisions which have to be made when choosing between these possibilities. Then we indicate explicitly two possibilities to represent logic.
4.0.0. First order v. higher order
In most Automath languages there are certain restrictions on abstraction. E.g. in AUT-68 as well as in AUT-QE correct abstraction expressions have the form [x : α] A where α is a 2-expression (and hence x, having type α, is a 3-variable, i.e. a variable which is a 3-expression). Such restrictions allow a faithful representation of first order logic (in the sense of excluding higher order formulas and inferences). In AUT-68 as well as in AUT-QE this can be done by representing propositions and predicates as 2-expressions (as described in [van Daalen 73 (A.3), 3]). Then proposition variables and (in AUT-QE) predicate variables will be 2-variables and abstraction (or quantification) with respect to such variables is impossible in the language.
If, in such a setting, we want to discern between proposition variables and predicate variables then it is necessary to have abstraction expressions of degree 1 in the language, i.e. to use AUT-QE (and not AUT-68). In order to represent higher order logic we should require the possibility of abstraction with respect to proposition and predicate variables. Therefore, if we
stick to the abstraction restrictions of AUT-68 or AUT-QE, we should represent propositions and predicates by 3-expressions. We may proceed in two ways: (i) we can associate to each proposition a (primitive) type (which we will call the assertion type of the proposition). Objects of this type will be considered as proofs of the proposition. In other words: we consider the proposition as asserted iff its assertion type contains some object. This possibility will be elaborated in 4.0.2. (ii) we can extend the language to a new language, called AUT-4, by admitting 4-expressions (having 3-expressions as their types, cf. [van Daalen 73 (A.3), 2.3]). Then a proposition (represented by a 3-expression) might be considered as asserted if it contains something (some 4-expression). Thus propositions act as their own assertion types, and the representation of logic is just as described in [van Daalen 73 (A.3), 3.2], but for a shift with respect to degrees. 4.0.1. Relevance of proofs vs. irrelevance of proofs In all representations of logic in Automath languages which have been developed so far, proofs (i.e. names of proofs) appear in the language ([van Daalen 73 (A.3), 3], [de Bruijn 73b], [C.4]). In this respect these representations reflect a constructive conception of logic, in which proofs and objects are treated similarly. In a classical conception of logic, proofs are discussed in the metalanguage only. As a consequence it is impossible in such a conception to discern (in the language) between different proofs of one proposition. This point of view can be roughly represented in Automath by proclaiming, for any given proposition a, all proofs of a to be equal. This deprives these proofs of their identity; their names should be considered only as references to the place in the book where the proposition is asserted. This possibility has been first suggested by de Bruijn.
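De Bruijn's suggestion, proclaiming all proofs of one proposition equal, can be mimicked in a toy model (a hypothetical Python sketch, not an Automath mechanism): equality of proof terms compares only the proposition proved, so a proof name is a mere reference.

```python
from dataclasses import dataclass, field

@dataclass
class Proof:
    prop: str                        # the proposition this term proves
    term: str = field(compare=False) # excluded from equality: a mere reference

# two different derivations of the same proposition count as equal proofs
assert Proof("imp(A,A)", "by_rule_I") == Proof("imp(A,A)", "by_lemma_7")
# proofs of different propositions remain distinct
assert Proof("A", "p") != Proof("B", "q")
```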
If, in a representation of logic in Automath, such an attitude is adopted, we shall say that this representation satisfies irrelevance of proofs (cf. [Zucker 77 (A.4)], and also 1.4). How this irrelevance of proofs is implemented (i.e. in which sense proofs are considered “equal”) will depend both on the language and on the way logic is represented in it (cf. 4.0.3 (i) and (ii)). 4.0.2. A representation of logic in AUT-68 A higher order system of natural deduction can be formalized in AUT-68 as follows. A type of propositions is introduced as a primitive type:
* PROP := PN E type
and to each proposition A its assertion type ⊢(A) is associated:

* A := - E PROP
A * ⊢ := PN E type
(In earlier publications on AUT-68, bool and TRUE were used instead of PROP and ⊢.) If S is a type, an object P E [x : S] PROP has to be interpreted as a predicate. Objects of type [x : S] ⊢((x) P) must then be interpreted as proving that P holds for every x E S. So we want to introduce the proposition ∀(S, P) which has the property that its assertion type contains elements iff the type [x : S] ⊢((x) P) contains elements. This is expressed in the following lines:
* S := - E type
S * P := - E [x : S] PROP
P * ∀ := PN E PROP
P * a := - E S
a * u := - E ⊢(∀(S, P))
u * ∀e := PN E ⊢((a) P)
P * u := - E [x : S] ⊢((x) P)
u * ∀i := PN E ⊢(∀(S, P))
Starting from these primitive concepts and axioms, higher order logic can be developed. An indication of how this can be done is given in Appendix 6 [not in this Volume, but see also [B.f]], where the first three theorems from Landau’s book are derived on the basis of the logic so developed. This logic represents a constructive system of natural deduction. Axioms could be added for extensional equality of functions and extensional equality of propositions (i.e. if a ↔ b then a = b). Classical logic could be represented this way by adding axioms for irrelevance of proofs:
* A := - E PROP
A * u := - E ⊢(A)
u * v := - E ⊢(A)
v * irr.pr. := PN E IS(⊢(A), u, v)
and for the double negation law:
A * u := - E ⊢(¬(¬(A)))
u * d.n.l. := PN E ⊢(A)
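Under the propositions-as-types reading used in this section, an object of type [x : S] ⊢((x) P) is a function assigning to each element a proof of the instance, and that is exactly what the ∀ axioms glue to ∀(S, P). A minimal computational sketch (a hypothetical model, not Automath itself; propositions and proofs are plain Python values):

```python
def forall_i(proofs_for_each):
    # ∀i: from an object of type [x : S] ⊢((x) P), i.e. a function giving
    # a proof of (x) P for every x E S, obtain a proof of ∀(S, P)
    return proofs_for_each

def forall_e(forall_proof, a):
    # ∀e: instantiate a proof of ∀(S, P) at a E S, yielding a proof of (a) P
    return forall_proof(a)

# predicate P(x): "x + x is even"; a "proof" is just a witness record here
proof_of_all = forall_i(lambda x: ("even", x + x))
assert forall_e(proof_of_all, 3) == ("even", 6)
```

The assertion type of ∀(S, P) is inhabited precisely when the function type is, which is the property the two PN-lines above postulate.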
4.0.3. A representation of logic in AUT-QE How logic can be represented in AUT-QE is described in [van Daalen 73 (A.3), 3]. This system, a first order system of natural deduction, has been used in our translation. An indication of the development of logic in it can be found in the
excerpted text in Appendix 7, which covers the proofs of the first three theorems of Landau’s book and the logic used in these proofs. [Appendix 7 is not in this Volume. However, this excerpt is contained in the excerpt for “Satz 27”, [D.5] in this Volume.] The system is a bit ambivalent, because it is classical (containing the double negation law as an axiom) but does not satisfy irrelevance of proofs. There are two obvious ways to implement irrelevance of proofs: (i) by adding an axiom:

* A := - E prop
A * S := - E type
S * t := - E [x : A] S
t * u := - E A
u * v := - E A
v * irr.pr. := PN E IS(S, (u) t, (v) t)
That is: if to every proof of A an object of type S is associated, then this object is independent of the nature of the proof. It has been indicated by J. Zucker that this axiom implies irrelevance of proofs in partial functions as mentioned in 1.4:
* S := - E type
S * T := - E type
T * P := - E [x : S] prop
P * f := - E [x : S] [y : (x) P] T
f * a := - E S
a * b := - E S
b * u := - E IS(S, a, b)
u * v := - E (a) P
v * w := - E (b) P
w * Q := [x : S] [y : (x) P] IS(T, (v)(a) f, (y)(x) f) E [x : S] prop
w * l1 := [y : (a) P] irr.pr.((a) P, T, (a) f, v, y) E (a) Q
w * l2 := ISP(S, Q, a, b, u, l1) E (b) Q
w * l3 := (w) l2 E IS(T, (v)(a) f, (w)(b) f)
(ii) by extending, in the language, the relation of definitional equality, in such a way that two 3p-expressions (cf. 1.2) are definitionally equal iff their types are definitionally equal. This has been done in the language AUT-II (cf. [Zucker 77 (A.4)]), but could be done in a variant of AUT-QE as well. If we want to formalize intuitionistic logic in AUT-QE we should have the absurdity rule (i.e. contradiction implies any proposition) instead of the double
negation law. The logical connectives (apart from implication) and the existential quantifier could be added as primitive constants, and their elimination- and introduction rules as axioms.
4.1. The language In this section we discuss some features of Automath languages, and the value of these features for the formalization of mathematics.
4.1.0. AUT-SYNT Consider the following AUT-QE text, representing the introduction rule for conjunction: a b u v
* a * b * u * v * andi
:=
__ __
E Prop E Prop
..-
__ __
Eb
:=
.....
.-.-
:=
Ea E and(a,b )
(where the dots indicate some proof which is irrelevant for the present discussion). We will call the variables a, b, u, v the parameters of andi. If we want to apply this rule for propositions A and B, we need two proofs p and q of the propositions, thus getting the proof andi(A, B, p, q) E and(A, B). Suppose we are given the proof p, then we can compute mechanically its type (cf. [van Daalen 73 (A.3), 6.4.2.3]) which is (definitionally equal to) the proposition A it proves. A similar observation holds for q and B. Hence we could say that the expression andi(A, B, p, q) contains redundant information. If the “mechanical type” function CAT ([van Daalen 73 (A.3), 6.4.2.3]) were incorporated in the language, we could write, instead of the expression above, andi(CAT(p), CAT(q), p, q), which only contains p and q. We will call the parameters u and v (for which p and q are substituted) the essential parameters of andi, while a and b (for which the redundant expressions A and B are substituted) are called redundant parameters. There are many other examples of expressions with redundant parameters. It is worthwhile to extend the language in such a way that redundant parameters can be avoided, because the expressions which have to be substituted for them might be long. A system of extensions of this kind has been proposed by I. Zandleven. It is called AUT-SYNT since it admits syntactic variables for expressions. Thus we have the languages AUT-68-SYNT, AUT-QE-SYNT etc. For a description of AUT-SYNT we refer to Appendix 9 [[B.5] in this Volume]; a text in AUT-68-SYNT may be found in Appendix 8 [not in this Volume]. Our experiences with translating Landau’s book have been a stimulus for developing AUT-SYNT, and have indicated the way this could be done. As no
verifying program for SYNT languages was available until after the translation was finished, the SYNT-facility could not be used in the translation. This may be considered unfortunate, because the presence of this facility would have simplified both the writing and the reading of our text.
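The effect of a CAT-like facility on redundant parameters can be sketched in a toy model (a hypothetical miniature, not Zandleven's actual design): if proof terms carry the proposition they prove, then andi needs only its essential parameters and synthesizes the redundant ones.

```python
from dataclasses import dataclass

@dataclass
class Proof:
    prop: str   # the proposition this term proves
    term: str   # the (name of the) proof itself

def cat(p):
    # the "mechanical type" function: recover the proposition a proof proves
    return p.prop

def andi(p, q):
    # only the essential parameters p, q are supplied; the redundant
    # parameters a and b are synthesized as CAT(p) and CAT(q)
    a, b = cat(p), cat(q)
    return Proof(f"and({a},{b})", f"andi({a},{b},{p.term},{q.term})")

r = andi(Proof("A", "p"), Proof("B", "q"))
assert r.prop == "and(A,B)" and r.term == "andi(A,B,p,q)"
```

The writer supplies only p and q; the full expression with all four parameters is reconstructed mechanically, which is the saving AUT-SYNT aims at.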
4.1.1. η-reduction in Automath
In AUT-68 and AUT-QE one of the possible ways to establish definitional equality is by η-reduction ([van Daalen 73 (A.3), 6.2.2]): If x is not free in A then [x : α] (x) A →η A. As can be seen in the list in Chapter 3 [[E.t] in this Volume], η-reduction was applied only twice during the verification of our translation. We give the lines which required these η-reductions, together with their relevant contexts. The following lines from the text on propositional logic are presupposed:
* con := PN E prop
* a := - E prop
a * not := [x : a] con E prop
a * u := - E not(not(a))
u * et := PN E a
a * u := - E con
u * cone := et(a, [x : not(a)] u) E a
The first line where η-reduction is required occurs in the text on predicate logic. In this text the following lines appear:
* S := - E type
S * P := - E [x : S] prop
P * all := P E prop
P * non := [x : S] not((x) P) E [x : S] prop
P * u := - E not(all(S, P))
u * v := - E non(non(P))
v * s := - E S
s * t1 := et((s) P, (s) v) E (s) P
v * t2 := ([x : S] t1(x)) u E con
In order to verify that the middle part of this last line is a correct expression, it should be established that CAT([x : S] t1(x)) and DOM(u) are definitionally equal (cf. [van Daalen 73 (A.3), 6.2.4.6]). We have

CAT([x : S] t1(x)) = [x : S] CAT(t1(x)) = [x : S] ((s) P)[s := x] = [x : S] (x) P ,
DOM(u) = DOM(not(all(S, P))) = DOM([x : all(S, P)] con) = all(S, P) = P .

The question is to establish

[x : S] (x) P = P .

This obviously requires η-reduction. The second case in which η-reduction is used occurs in the text on generalized implication:
* a := - E prop
a * b := - E [x : a] prop
b * imp := b E prop
b * u := - E not(a)
u * th2 := [x : a] cone((x) b, (x) u) E imp(a, b)
Here, in order to verify the last line, it is asked whether the category of the middle part definitionally equals the category part, i.e. whether

CAT([x : a] cone((x) b, (x) u)) = imp(a, b) .

Now

CAT([x : a] cone((x) b, (x) u)) = [x : a] CAT(cone((x) b, (x) u)) = [x : a] (x) b ,

imp(a, b) = b ,

and therefore η-reduction must be used for establishing

[x : a] (x) b = b .
It has been observed by v. Daalen that η-reduction might have been avoided in both cases by a slight modification of the definitions: for all (in the first case) and for imp (in the second case). In fact all might have been defined by
P * all := [x : S] (x) P E prop

and imp by

b * imp := [x : a] (x) b E prop
This would have made no difference to the rest of the book, apart from the fact that in some places an extra β-reduction would have been necessary. In fact, if a predicate P is defined explicitly (as opposed to being a predicate variable or a primitive predicate constant) then P = [y : S] m(y), say, and we have, without η-reduction
[x : S] (x) P = [x : S] (x) [y : S] m(y) = [x : S] (m(y))[y := x] = [x : S] m(x) = P .
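The two situations can be replayed on a toy term rewriter (a hypothetical sketch; application is written fun(arg) here rather than Automath's (arg) fun order): when P is a mere variable only the η-step applies, while an explicitly defined P collapses by a β-step alone, as in v. Daalen's observation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Abs:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Abs):
        return free_vars(t.body) - {t.var}
    return free_vars(t.fun) | free_vars(t.arg)

def subst(t, name, s):
    # naive substitution (assumes bound names are distinct from free ones)
    if isinstance(t, Var):
        return s if t.name == name else t
    if isinstance(t, Abs):
        return t if t.var == name else Abs(t.var, subst(t.body, name, s))
    return App(subst(t.fun, name, s), subst(t.arg, name, s))

def eta_step(t):
    # [x] (x) F  →η  F,  provided x is not free in F
    if (isinstance(t, Abs) and isinstance(t.body, App)
            and t.body.arg == Var(t.var)
            and t.var not in free_vars(t.body.fun)):
        return t.body.fun
    return t

def beta_step(t):
    # ((A) [x] B)  →β  B[x := A]
    if isinstance(t, App) and isinstance(t.fun, Abs):
        return subst(t.fun.body, t.fun.var, t.arg)
    return t

# With P a mere variable, only the η-step applies:
P = Var("P")
assert eta_step(Abs("x", App(P, Var("x")))) == P

# With P defined explicitly as [y] m(y), a β-step already suffices:
P_def = Abs("y", App(Var("m"), Var("y")))
inner = beta_step(App(P_def, Var("x")))          # ([y] m(y)) at x  →β  m(x)
assert Abs("x", inner) == Abs("x", App(Var("m"), Var("x")))
```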
We conclude therefore that η-reduction does not add considerably to the expressive power of Automath.

4.1.2. prop v. type
In the stage of exploration of the possibilities to represent logic in AUT-QE, initially a variant of this language was used which did not contain the 1-expression prop. It was therefore impossible to prescribe whether types had to be interpreted as assertion types (containing proofs) or “ordinary” types (containing “ordinary” objects). Contradiction was represented as a primitive type, negation and the double negation law were formalized in terms of this type as follows:
* con := PN E type
* a := - E type
a * not := [x : a] con E type
a * u := - E not(not(a))
u * d.n.l. := PN E a
If in this text a is interpreted as an “ordinary” type, nat say, then expressions of type not(a) (or [x : a] con) could be interpreted as proofs that a is empty (in fact, if we have p E not(a), then for an object x E a we have (x) p to prove contradiction). Hence expressions of type not(not(a)) have to be interpreted as proofs that a is (in a weak sense) nonempty. Given such a proof q we have an object d.n.l.(a, q) E a. Or, in other words: d.n.l. acts as a Hilbert operator, selecting an object from any non-empty type. In particular this induces a form of the axiom of choice. As we did not want the double negation law to have such far-reaching consequences, we extended the language by admitting prop as a basic 1-expression. Thus we obtained the language AUT-QE (as defined in [van Daalen 73 (A.3), 5]), in which it is possible to distinguish between assertion types and ordinary types. The distinction of prop and type not only unlinked the double negation law from the axiom of choice, but also made it possible to implement irrelevance of proofs (cf. 4.0.1, 1.4). This opportunity was not seized in the logic underlying our translation (though this would have been natural). For an explanation we refer to 4.2.1. We may conclude that the distinction between proofs and “ordinary” objects is an essential feature when representing classical logic in Automath. For representing constructive logic the version with only type keeps its value.
4.1.3. Strings and telescopes
In Chapter 2 of his book Landau uses pairs (x1, x2) of natural numbers. He considers such a pair as a single object and yet he describes it by two variables. A faithful translation of this practice could have been given if the concept of a string of expressions had been present in our language. Another use strings of expressions might have is as arguments of partial functions (as described in 1.4). In fact such functions are applied to pairs (a, p) where a is an object of a certain type S, and p a proof that a satisfies some predicate P on S (which describes the range of the function). As a further example we consider the concept of a group, which might be considered as a string (S, op, iv, e, p) where S E type, op E [x : S] [y : S] S, iv E [x : S] S, e E S and p E groupaxioms(S, op, iv, e). We usually want the types of the expressions of such a string to satisfy certain conditions. In the case of the argument (a, p) of a partial function we want a E S, p E (a) P. In other words we want the argument (a, p) to be consistent with the “abstractor part” of the function: x E S; y E (x) P. In the case of groups we want a group (S, op, iv, e, p) to be consistent with

x E type; y E [s : x] [t : x] x; z E [s : x] x; u E x; v E groupaxioms(x, y, z, u).

There is a strong analogy with the case where expressions A1, ..., An are required to be suitable candidates for substitution for the variables x1, ..., xn of a certain context x1 E α1, x2 E α2, ..., xn E αn (cf. [van Daalen 73 (A.3), 2.5]). To describe such conditions on strings we introduce the following terminology. A finite sequence of E-formulas x1 E α1, ..., xn E αn is called a telescope. The string of expressions (a1, ..., an) is said to fit into the telescope x1 E α1, ..., xn E αn if a1 E α1, a2 E α2[x1 := a1], ..., an E αn[x1, ..., xn−1 := a1, ..., an−1]. Extension of the language with constants and variables for strings and defined constants for telescopes has been proposed by de Bruijn. This is especially helpful when formalizing abstract structures such as groups, vector spaces or categories, and has been applied on a large scale by J. Zucker (cf. [Zucker 77 (A.4)]).

4.2. Comments on the Translation
In this section we first give a chronological survey of the different representations of logic which have been tried, and we state the motives for finally choosing AUT-QE as a language for our translation. Furthermore we mention some aspects which are (in our opinion) shortcomings of the translation and we add some positive conclusions which can be drawn from our work.
4.2.0. Choice of the language
In our first attempts to translate Landau’s “Grundlagen” in Automath, we used the language AUT-68. The representation of logic was similar to the one described in 4.0.2 and presented in Appendix 6 [not in this Volume]. Elimination and introduction of ∀ were effected by the axioms ∀e (with parameters S E type, P E [x : S] PROP, a E S, u E ⊢(∀(S, P))) and ∀i (with parameters S E type, P E [x : S] PROP, u E [x : S] ⊢((x) P)). These axioms were used frequently in developing logic, because the logical connectives and the existential quantifier were defined in terms of ∀. On the basis of this logic Chapter 1 of Landau’s book was translated in AUT-68. At that stage of our work we started trying to represent logic in AUT-QE, initially using a variant of that language which did not contain prop. In AUT-QE the axioms ∀i and ∀e were superfluous: if P E [x : S] type (i.e. P represents a predicate on S) then objects of type P can be interpreted as proofs of ∀(S, P). Conversely, given such an object u E P and an object a E S we have (a) u E (a) P (i.e. (a) u proves that P holds at a). As a consequence the text on logic in AUT-QE was considerably shorter than the earlier text in AUT-68. (It was not observed at that time that this was caused essentially by the redundant parameters S and P of both constants ∀e and ∀i.) So AUT-QE seemed to be a much better language, and therefore a fresh start was made with the translation of Landau’s book into that language. In 4.1.2 we have reported that in this system (AUT-QE without prop) the double negation law induces a Hilbert operator. This led us to add prop as a basic 1-expression to our language, thus extending it to proper AUT-QE. At the time we finally fixed the language we did not appreciate the fundamental importance of incorporating a form of irrelevance of proofs. This was due mainly to two reasons:
(i) Partial functions are not frequently used in the first three chapters of Landau’s book, and for those partial functions which are defined there, irrelevance of proofs could be derived. Therefore no need was felt for an axiom.
(ii) As Landau, being a classical mathematician, does not discuss proofs at all, we thought we should try to follow this practice. Consequently we did not want to have an axiom declaring proofs equal.
4.2.1. Shortcomings of the translation
Here I list those features of the translation which I would change if I were to redo the work.
(i) In my opinion the SYNT-facility should be present in any Automath language. It will bring texts in Automath closer to mathematical practice.
The middle parts of many lines in the present Landau translation are unnecessarily complex and tedious (both to the reader and to the writer), because this facility is absent in the language I used.
(ii) I regret that I have not implemented irrelevance of proofs as an axiom. As I see it now, for representing classical reasoning a language should be chosen which even contains irrelevance of proofs by definitional equality (cf. 4.0.3).
(iii) Some of the names I have used lack expressive power. This is partly due to the fact that AUT-QE admits only alphanumeric identifiers, but mainly to my excessive preference for short names.
(iv) I am not content with the translation of Chapter 5.8. This text is overloaded with irrelevant embedding and lifting functions which hamper a clear understanding of the argument. I think it is better to define Σ_{i=1}^n f(i) and Π_{i=1}^n f(i) for functions f defined for all natural numbers (and not just on an initial part of the naturals), although this procedure deviates slightly from Landau’s intentions.
4.2.2. Final remarks
The main positive comment we can make on the translation is that it has been successfully finished (in spite of some inconveniences in the language). An aspect which has not been mentioned so far is the ratio between the length of pieces of AUT-QE text and the length of the corresponding German texts. Our claim at the outset was that this ratio can be kept constant. We give a few data. As pieces of text we have chosen the chapters of Landau’s book, and as a measure of the lengths the number of stored AUT-QE expressions (storing expressions requires storing all subexpressions too) and (rough estimates of) the number of German words (where “x” and “+” were counted as words). We give the following list:
                            Chapter 1  Chapter 2  Chapter 3  Chapter 4  Chapter 5
nr. of expressions              12200      25800      30300      35000      60500
nr. of words                     3200       4900       5300       5500      11000
nr. of expressions / words        3.8        5.3        5.7        6.4        5.5
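The stated ratios can be recomputed from the two rows of data:

```python
expressions = {1: 12200, 2: 25800, 3: 30300, 4: 35000, 5: 60500}
words = {1: 3200, 2: 4900, 3: 5300, 4: 5500, 5: 11000}

# ratio of AUT-QE expressions to German words, rounded to one decimal
ratios = {ch: round(expressions[ch] / words[ch], 1) for ch in expressions}
assert ratios == {1: 3.8, 2: 5.3, 3: 5.7, 4: 6.4, 5: 5.5}
```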
The high ratio in Chapter 4 might be attributed to the complicated definitions by cases in this chapter, while the low ratio in Chapter 1 is possibly caused by the absence of calculations. Another notable aspect of the work is the comparatively small place taken by the preliminaries. It appears that a formal treatment of the logic underlying
mathematics (if we disregard metalogic) is much easier than a formal treatment of mathematics itself. It has not been the purpose of this enterprise to construct a formal system which suits my own fancy and to develop in this system the theory of naturals, reals and complex numbers. I have rather tried to represent in a language which was essentially given beforehand, a wide variety of concepts and ideas as expressed in a book like Landau’s. The success of this undertaking is due to the flexibility of Automath languages, and to the close connection which can be made between these languages and intuitive human reasoning.
A Text Fragment from Zucker’s “Real Analysis”
L.S. van Benthem Jutting and R.C. de Vrijer
The text “Real Analysis” was written by J. Zucker, partly in cooperation with A. Kornaat, in 1975-1976. It contains a formalization of the theory of real numbers, functions, continuity, differentiation, ending with the definition of the exponential function as a power series and the proof that this function is its own derivative. In [Zucker 77 (A.4)] its author reported on the development of this text. It is written in AUT-II, a variant of Automath developed by Zucker, which contains explicit product types, sum types and disjoint sums of types. A verifying program for AUT-II was never finished; hence Zucker’s text has never been checked on a computer. We present here a short fragment of Zucker’s text. The aim is to give an impression of how mathematics, also of a more advanced level, can be written, and has actually been written, in a flexible Automath language like AUT-II. With this goal in mind, we have not selected a piece from the very beginning of the Automath book. Instead, we exhibit a fragment occurring at a point where already a good deal of the subject “Real Analysis” has been developed, viz., from Chapter 10 (entitled “Partial functions of a real variable”): Section 7 (“Differentiation”) and part of Section 8 (“Rules for differentiation”). The consequence of this policy is that the text is not readable without some explanation, since we just drop in in the middle of the story, so to say. Therefore, we start with an informal introduction, written with the aim of providing the background that is needed for an understanding of the formal text. It is not intended as a general introduction to AUT-II. For more information on the background and specific features of this Automath language one should consult [Zucker 77 (A.4)] or [B.6]. It may be noted that the format of the text roughly follows common Automath practice.
In particular, the notations, the way of dealing with contexts, etc., are much as in the description of AUT-68 in [van Benthem Jutting 81 (B.1)]. This article is organized as follows. First we will give an informal exposition of the language and of some syntactical conventions (Section 1), and comment on the particular way in which Zucker deals with the syntax (Section 2). Then we give a short account of some relevant parts of the text Real Analysis that
precede the fragment (Section 3), on the basis of which it is possible to give a synopsis of the fragment itself (Section 4). We conclude our introductory remarks with an overview of some more identifiers that are used but not defined in the fragment (Section 5). Then Section 6 contains the actual text fragment.
1. DESCRIPTION OF THE SYNTAX
The text consists of lines. A book is a sequence of lines. There are three main kinds of lines: (i) AUT-II lines, (ii) paragraph lines, (iii) skip lines. Paragraphs give a global structure to the text, facilitating internal referencing. They are indicated by the paragraph lines. Skip lines are for comments and for formatting purposes. The AUT-II lines contain the real AUT-II text. They can again be subdivided into three kinds:
(a) context lines, (b) defining lines, (c) primitive notion lines. We now briefly discuss each of the kinds of lines mentioned. (i)(a) A context line consists of two parts: a context indicator (which is optional) and a context extension. In order to explain their roles, we discuss contexts. As usual in Automath, a context consists of a sequence [x1 : A1] ... [xn : An], where x1, ..., xn are distinct “variables” and A1, ..., An are expressions (see below). A context indicator is either a variable followed by the symbol @ or just the symbol @. It denotes the context on which the line should be interpreted. If a context [x1 : A1] [x2 : A2] [x3 : A3] [x4 : A4] has been introduced earlier, the context indicator “x3 @” indicates that the present line must be interpreted on context [x1 : A1] [x2 : A2] [x3 : A3], and the context indicator “x2 @” that the context will be [x1 : A1] [x2 : A2]. If the context indicator is just @, the line should be interpreted on an empty context. If it is absent, the context of the line is the context of the previous AUT-II line. The context extension in a context line is a nonempty sequence [y1 : B1] ... [ym : Bm], where again y1, ..., ym are distinct variables and B1, ..., Bm are expressions. We now describe the effect of a context line. Suppose that somewhere in our book appears the context [x1 : A1] [x2 : A2] [x3 : A3] [x4 : A4], and that the
A text fragment from Zucker’s “Real Analysis” (D.4)
context of the previous AUT-II line is [z1 : C1] [z2 : C2]. Then the context that is created by the line

[y1 : B1] [y2 : B2] [y3 : B3]

will be

[z1 : C1] [z2 : C2] [y1 : B1] [y2 : B2] [y3 : B3] ;

the context that is created by the line

@ [y1 : B1] [y2 : B2] [y3 : B3]

will be

[y1 : B1] [y2 : B2] [y3 : B3] ;

the context that is created by the line

x1 @ [y1 : B1] [y2 : B2] [y3 : B3]

will be

[x1 : A1] [y1 : B1] [y2 : B2] [y3 : B3] ;

and the context that is created by the line

x4 @ [y1 : B1] [y2 : B2]

will be

[x1 : A1] [x2 : A2] [x3 : A3] [x4 : A4] [y1 : B1] [y2 : B2] .
(i)(b) A defining line also consists of two parts: a context indicator again, and a definition. The context indicator works exactly the same way as it does in a context line. In particular the context indicator is optional. The definition consists of a definiendum, a definiens and a type. Definiendum and definiens are separated by the symbol :=, and definiens and type by the symbol “:”. So a defining line looks like this:

x @ definiendum := definiens : type
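The way a context indicator selects the context of a line can be sketched in Python. This is our own model, not part of AUT-II: the function `resolve` and the encoding of a context as a list of (variable, expression) pairs are assumptions made for illustration only.

```python
def resolve(indicator, previous, earlier_contexts):
    """Return the context selected by an AUT-II context indicator.

    indicator:        None (absent), "@" (empty context), or "x @".
    previous:         context of the previous AUT-II line.
    earlier_contexts: contexts introduced earlier in the book.
    """
    if indicator is None:            # absent: keep the previous context
        return list(previous)
    if indicator == "@":             # bare @: start from the empty context
        return []
    var = indicator.split("@")[0].strip()
    for ctx in earlier_contexts:     # "x @": the earlier context up to x
        names = [v for v, _ in ctx]
        if var in names:
            return ctx[: names.index(var) + 1]
    raise KeyError(var)

# The example from the text: an earlier context [x1:A1][x2:A2][x3:A3][x4:A4],
# previous context [z1:C1][z2:C2], extension [y1:B1][y2:B2][y3:B3].
earlier = [("x1", "A1"), ("x2", "A2"), ("x3", "A3"), ("x4", "A4")]
prev = [("z1", "C1"), ("z2", "C2")]
ext = [("y1", "B1"), ("y2", "B2"), ("y3", "B3")]

ctx = resolve("x3 @", prev, [earlier]) + ext   # [x1:A1][x2:A2][x3:A3] + extension
```

Each of the four example lines in the text corresponds to one call of `resolve` followed by appending the extension.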
The definiendum is an identifier or a fix symbol. Identifiers are sequences of letters, digits and the symbol “.”. Fix symbols are some special symbols like &, =, >, +, etc., and any sequence of symbols in quotes (e.g. ‘and’). They are a special kind of identifiers, which can be used prefix, postfix or infix. Their specific use is indicated at the place where they are defined. In the fragment which is shown here, no fix symbols are defined, and hence we omit the relevant syntax.
Fix symbols which have been defined in the previous book appear frequently in the text, though. The definiens and the type are expressions, separated by the colon “:”. The type in a defining line is optional, as it is possible to deduce it from the definiens (modulo convertibility). However, in most cases it is explicitly given in the text. We now briefly describe six important shapes of expressions.

Lambda expressions, written as [x : A] b. The square brackets are used for lambda abstraction. In more traditional notation one would write λx : A . b.

Application expressions, written as (a) f, meaning the function f applied to the argument a. In Automath tradition, the argument is written before the function.

Π-expressions (for Cartesian products), written as Π(f), where f denotes a type- (or prop-)valued function. Typically, [x : A] b : Π([x : A] B) holds if on the context extended with [x : A] we have b : B.

Σ-expressions, written as Σ(f), where f denotes a type- (or prop-)valued function. A Σ-expression denotes a type of pairs (a, b), where the type of b may depend on a. Typically, we have (a, b) : Σ([x : A] B(x)) if a : A and b : B(a). Zucker writes pair(a, b) for (a, b), and proj1 and proj2 for the left and right projections. So proj1(pair(a, b)) definitionally equals a and proj2(pair(a, b)) equals b.

Head expressions. If an identifier id has been defined earlier in context [x1 : A1] ... [xn : An], then id(a1, ..., an) is an expression. Below we make a remark about omitting arguments from such expressions.

Fix expressions. If fx is a prefix symbol then fx a will be an expression; if fx is an infix symbol then a fx b will be an expression; and similarly for postfix symbols. Ordinary brackets are used for parsing in this case.

This is not yet a complete list. E.g., we did not mention the construction of disjoint sums of types.
It may be observed that explicit occurrences of the Π- and the Σ-construction are not encountered in the fragment. However, you will see for example Rl → Rl, an infix expression that is definitionally equal to the type Π([x : Rl] Rl), and ∀(P), defined as Π(P), where P is a predicate, i.e. P : Π([x : A] prop) for some type A. As will become apparent later in this introduction, Π- and Σ-expressions, and also disjoint-sum types, do prominently figure in the background of the fragment. (On the use of products and disjoint sums in logic see [Zucker 77 (A.4)].)
(i)(c) A primitive notion line is like a defining line. The difference is that the definiens is not an expression, but PN, for primitive notion. In a primitive notion line a new identifier is declared and its type is given:

x @ identifier := PN : type
Primitive notion lines do not occur in the present fragment.
Abbreviating expressions. 1. If the identifier id has been defined earlier on the context [x1 : A1] ... [xm : Am] [xm+1 : Am+1] ... [xn : An] and if [x1 : A1] ... [xm : Am] is the initial part of the context of the line under consideration, then id(am+1, ..., an) denotes id(x1, ..., xm, am+1, ..., an). So, in particular, just id on the context [x1 : A1] ... [xn : An] denotes the full expression id(x1, ..., xn).
2. For writing lambda expressions there is in some cases another abbreviation device. If in the book a defining line

id := a : A

occurs, with context [x1 : A1] ... [xm : Am] [xm+1 : Am+1] ... [xm+k : Am+k], then [k×] id denotes the expression [xm+1 : Am+1] ... [xm+k : Am+k] id(xm+1, ..., xm+k), that is, the expression obtained by “k times abstracting id”. Cf. [de Bruijn 72a].
3. Zucker exploits the facilities provided by AUT-SYNT. For a description of the AUT-SYNT mechanism see [B.5]. In Section 5 below we will point out some examples. (ii) Paragraph lines mark the paragraph structure of the text. The text is divided into a nested structure of paragraphs, which is largely independent of the context structure. The purpose of the paragraph structure is to provide the possibility of reusing identifiers. Every line of the text, be it an AUT-II line, a paragraph line or a skip line, belongs to a paragraph, the “active paragraph” of that line. Three kinds of paragraph lines are used to indicate the paragraph structure of the text: paragraph opening lines, paragraph reopening lines and paragraph closing lines. They determine the active paragraph of the lines which follow.
Paragraph opening lines have the form +P where P is an identifier, a so called paragraph name. After such a line the active paragraph has name P , until another paragraph line appears. The active paragraph, Q say, of the paragraph opening line itself is called the “mother” of P , and P is called a “daughter” of Q.
Paragraph reopening lines will not be discussed here, since they do not occur in the present text fragment. Paragraph closing lines have the form

−P

where P must be the name of the active paragraph. The lines following this paragraph closing line have as active paragraph the mother of P. The reflexive and transitive closure of the relation mother is called “ancestor”, the reflexive and transitive closure of daughter is “descendant”. Inside paragraphs the definienda must be distinct. Reference to definienda in ancestor paragraphs is (essentially) direct, reference to definienda in non-ancestor paragraphs is by “paragraph indicators”. These are characterized by two enclosing double quotes. We do not describe the referencing technique formally, but give a typical example. Consider the following text fragment:
+P
@ two := 1 + 1 : N
−P
  2 := two“.P” : N

Suppose the first line in the text fragment has active paragraph Q. The paragraph opening line changes the active paragraph to P. The paragraph Q is the mother of P, and hence is an ancestor of P. The identifier two is a definiendum in P. The paragraph closing line changes the active paragraph back to Q. The next line contains in its definiens the identifier two. This identifier is defined in paragraph P, which is not an ancestor of Q. By adjoining to this identifier a paragraph indication we obtain two“.P”, to be read as “the two from paragraph P”. More complicated paragraph indications are possible, but do not occur in the text fragment from “Real Analysis” which is shown. (iii) Skip lines serve for structuring the text by lay-out devices, and for communicating informal reminders, intuitions and considerations to human readers.
They are skipped in machine verification. There are two kinds of skip lines: comment lines and empty lines.
Comment lines have one of the three forms

‘comment’ (arbitrary text)
‘remark’ (arbitrary text)
‘heading’ (arbitrary text)

Comment lines with ‘heading’ are used for dividing the text into chapters and sections. The present fragment occurs in Chapter 10, titled “Partial functions of a real variable”. It contains Section 7 and part of Section 8 from this chapter. Comment lines with ‘comment’ and ‘remark’ refer to the AUT-II lines next to them, ‘comment’ referring forward and ‘remark’ backward. In Zucker’s original manuscript empty lines are indicated as follows: %. They are to be read as instructions to the typist for organizing the lay-out of the AUT-II text. Here these instructions have simply been followed.
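The paragraph mechanism of (ii), including the two“.P” style of reference, can be modelled in Python. This is a sketch of ours, not Zucker's implementation; the class `Book` and its method names are invented for illustration.

```python
class Book:
    """A minimal model of paragraph opening/closing and identifier lookup."""

    def __init__(self):
        self.active = ["ROOT"]      # stack: active paragraph and its ancestors
        self.defs = {}              # (paragraph, name) -> definiens

    def open(self, p):
        self.active.append(p)       # p becomes the active paragraph

    def close(self, p):
        assert self.active[-1] == p # may only close the active paragraph
        self.active.pop()

    def define(self, name, definiens):
        key = (self.active[-1], name)
        assert key not in self.defs # definienda distinct inside a paragraph
        self.defs[key] = definiens

    def lookup(self, name, paragraph=None):
        if paragraph is not None:   # explicit indicator: name".P"
            return self.defs[(paragraph, name)]
        for p in reversed(self.active):  # otherwise search the ancestors
            if (p, name) in self.defs:
                return self.defs[(p, name)]
        raise KeyError(name)

# The example from the text: two is defined inside P, then referenced
# outside P via the paragraph indicator two".P".
b = Book()
b.open("P")
b.define("two", "1 + 1")                 # two := 1 + 1 : N   (inside P)
b.close("P")
b.define("2", b.lookup("two", "P"))      # 2 := two".P" : N   (outside P)
```

Plain `lookup("two")` outside P fails, because P is not an ancestor of the active paragraph; only the indicated reference succeeds.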
2. SYNTACTIC USAGE

Now we discuss, guided by a few specific examples, the manner in which Zucker makes use of the syntax. We start out with the use of paragraphs. The paragraph structure follows to some extent the chapter structure of the text, in such a way that chapters which depend on earlier chapters correspond to paragraphs which are descendants of other paragraphs. Chapter 10, which contains the text fragment presented here, and which has as its subject matter partial functions of a real variable, starts, after some comment, by a paragraph opening line
+PF

This paragraph PF (obviously for “partial functions”) has among its ancestors the following paragraphs:

SYNT containing syntactic constructors derived from the basic AUT-SYNT constructors CAT, DOM, etc.
B containing basic concepts: logic, equality, sets, etc.
L on linear orders.
N on natural numbers.
CL on complete linear orders.
G on (abelian) groups.
F on fields.
ER on extended reals, i.e. Rl ∪ {∞, −∞}.
R on reals.
M on metric spaces.
The list above shows the structure of the text “Real Analysis” up to Chapter 10. In fact Zucker’s “main” paragraphs often coincide with his chapters. Another use that Zucker makes of the paragraph mechanism is storing proofs of theorems in paragraphs that are created especially for this purpose, as these proofs will (most probably) not be referred to, and only the theorems are of interest. A typical example is the paragraph 73 (the third paragraph inside Section 7) in our text fragment. Before the opening line of this paragraph, the context is extended to
[h : pfn] [d : Rl] [u : ¬(d) do(h)] ,

to be interpreted as: let h be a partial function from R to R, let d be a real number and suppose d does not belong to the domain of h. The goal is to prove that d does not belong to the domain of the derivative of h. Now first the paragraph 73 is opened and inside this paragraph the theorem is stated:

th := ¬(d) do(der.fn)

The type of th is omitted as this type is clearly prop. Note that, according to the convention on omitting variables, der.fn should be interpreted as der.fn(h). Then we see a number of lines which together form the proof of the theorem. The last of these lines has the definiendum pf, obviously meaning proof, and as its definiens an object of type th, that is, in the “proofs as objects” interpretation, a proof of the theorem. Then paragraph 73 is closed and outside the paragraph the theorem is restated, now with a suggestive name. In words it would run as follows: if we have a partial function, a real number and a proof that the real number does not belong to the domain of the function, then we may conclude that the number does not belong to the domain of the derivative. The theorem (or rather its proof) now gets the illustrative name not.do.so.not.do.der.fn, suggesting “a not in the domain of f, so a not in the domain of the derived function of f”. The definiens comes from inside the paragraph, and is indicated by a paragraph indication:
not.do.so.not.do.der.fn := pf“.73” : ¬(d) do(der.fn)
After this line the contents of paragraph 73 can be forgotten. A shorter way of proceeding is demonstrated in paragraph 71 (the first paragraph of Section 7). Again, before opening the paragraph a relevant context is built:

[h : pfn] [d : Rl] [e : Rl] [u : (e) der(d, h)] ,
to be interpreted: let h be a partial function, d and e be reals, and suppose that e is the derivative of h at d. Now paragraph 71 is opened, but instead of stating a theorem, only a proof is given, in three lines with definienda l1, l2 and l3 respectively, l2 proving (in the proofs as objects interpretation) that def(derw(d, h)) and l3 that e = jw(derw(d, h)). Then the paragraph is closed and two theorems are extracted, by using the lines of l2 and l3 inside the paragraph. The first is

do.der.fn := l2“.71” : (d) do(der.fn)
The name do.der.fn is mnemonic for “(concerning the) domain of the derived function”. Apparently (d) do(der.fn) definitionally equals def(derw(d, h)). Note again that der.fn should be interpreted as der.fn(h). The second theorem is

va.der.fn := sy(l3“.71”) : (der.fn ; d) = e

Here sy stands for symmetry of equality. Now apparently der.fn ; d is definitionally equal to jw(derw(d, h)).
3. CONTENTS OF THE PRECEDING TEXT

The text fragment is not self-contained. There are, by way of the identifiers that are used, many references to the preceding text. In order to make it accessible nevertheless, we will briefly comment on some of the relevant parts of the AUT-II book preceding Section 7 of Chapter 10. For the description of the underlying logic and of the treatment of the reals we refer to [Zucker 77 (A.4)] in this Volume. Here we will focus on the way partial functions are formalized, on neighbourhoods and nearness predicates and on limits. Occasionally we cite from Zucker’s text and especially from his comments and remarks.
To keep this section readable, we have not tried to include all identifiers that might trouble the reader. Some have been covered already in the previous section. In Section 5, after the synopsis, we list and discuss some other key identifiers from the preceding text that occur in the fragment but have not yet found a place in this introduction.
Partial functions. The type Rl of real numbers is extended by considering the disjoint sum Rw of Rl with the type which has as its only element the object w (to be interpreted as “the undefined object”). The natural injection of Rl into Rw is iw, so if a : Rl then iw(a) : Rw. The image of w in Rw is w_. If b : Rw then def(b) is a proposition which is provably equivalent to b ≠ w_. As a matter of fact, def(w_) is definitionally equal to ⊥ (contradiction), and def(b) is definitionally equal to T (truth) for b other than w_.

Then jw is the mapping from Rw to Rl such that jw(w_) is definitionally equal to 0, and jw(iw(a)) is definitionally equal to a (for any a : Rl). On the other hand, if b : Rw and u : def(b) then it can be proved that iw(jw(b)) = b (but here we do not have definitional equality!). The type pfn is defined as Rl → Rw. If f : pfn then the domain of f is the predicate do on Rl, defined as follows:

@ [f : pfn]
do := [x : Rl] def((x) f) : pred(Rl)
The type pred(Rl), denoting the predicates on the reals, is defined as the product type Π([x : Rl] prop); see [Zucker 77 (A.4)]. If moreover a : Rl then the value of f at a, considered as a real, is denoted by f ; a (where ; is used as an infix symbol).
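The machinery of Rw, iw, jw, def and partial functions can be sketched in Python. The encoding is ours (w is modelled as `None`, iw(a) as a tagged tuple); it illustrates the stated definitional equalities, not the AUT-II formalization itself.

```python
W = None                         # the undefined object w

def iw(a):
    """Injection Rl -> Rw."""
    return ("iw", a)

def defined(b):
    """def(b): b differs from w."""
    return b is not W

def jw(b):
    """jw(w) = 0 and jw(iw(a)) = a, as stated in the text."""
    return 0 if b is W else b[1]

def do(f, x):
    """(x) do(f): the pfn f is defined at x."""
    return defined(f(x))

def value(f, x):
    """f ; x, the value of f at x considered as a real."""
    return jw(f(x))

# A pfn (Rl -> Rw): the reciprocal, undefined at 0.
recip = lambda x: W if x == 0 else iw(1 / x)
```

Note that `jw(W) == 0` mirrors the convention that jw maps w_ to 0, so `value` is total even where the partial function is undefined, just as f ; a is.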
Note that our remarks above, concerning definitional equality of (d) do(der.fn) and def(derw(d, h)), and of der.fn ; d and jw(derw(d, h)) respectively, can now be derived by inspection of the definition of der.fn.
Nearness. The subject of limits is prepared in a section on “nearness”. Zucker begins by taking a point a : Rl and defining the concepts of “neighbourhood” and “punctured neighbourhood” with centre a and (positive) radius p as predicates on the reals:
@ [a : Rl] [p : Rp]
nb := [x : Rl] (mo(x + −a) ≤ proj1(p)) : pred(Rl)
nd := [x : Rl] ((x) nb ∧ x ≠ a) : pred(Rl)
Here Rp is the type of positive reals, that is, the Σ-type Σ([p : Rl] (p > 0)). So proj1(pair(p, u)) definitionally equals p. And mo obviously denotes the modulus. Finally, “−” is a prefix symbol denoting the additive inverse, so x + −a means x − a in usual notation. Let P : pred(Rl). We want to define the propositions “P holds near a” (i.e. in some punctured neighbourhood of a), “P holds at and near a” (i.e. in some neighbourhood of a) and also, for uniformity of treatment, “P holds at a” (i.e. (a) P).

a @ [P : pred(Rl)]
at := (a) P : prop
near.pred := [x : Rp] ∀([y : Rl] ((y) nd(x) → (y) P)) : pred(Rp)
near := ∃(near.pred) : prop
at.near := ∃([x : Rp] ∀([y : Rl] ((y) nb(x) → (y) P))) : prop

We introduce a parameter for a partial function f:

a @ [f : pfn]

Now we define some more propositions:

def.at := at(do(f))
def.near := near(do(f))
def.at.near := at.near(do(f))
These say that f is defined at a, or near a, or at and near a (respectively). Apart from the definitions above, this section contains other definitions and theorems concerning the defined concepts. For our text, and for the concept of limit, the following definition is important.
f @ [b : Rl] [p : Rp]
near.nb := near([x : Rl] ((x) do(f) ∧ (f ; x) nb(b, p))) : prop

The proposition near.nb expresses that near a the values of f are in the p-neighbourhood of b.
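The nearness predicates lend themselves to a small computational sketch. The genuine definitions quantify over all positive radii and all reals; the Python version below (our own crude finite approximation, with hypothetical names `nb`, `nd`, `near`) checks only a finite list of radii and sample points.

```python
def nb(a, p):
    """Neighbourhood with centre a and radius p > 0, as a predicate."""
    return lambda x: abs(x - a) <= p

def nd(a, p):
    """Punctured neighbourhood: the neighbourhood minus the centre."""
    return lambda x: nb(a, p)(x) and x != a

def near(a, P, radii=(1.0, 0.1, 0.01, 0.001)):
    """'P holds near a': some punctured neighbourhood of a lies inside P.

    The existential over Rp and the universal over Rl are approximated by
    a finite list of radii and a finite set of sample points.
    """
    def holds_on(p):
        samples = [a + t * p for t in (-1.0, -0.5, -0.1, 0.1, 0.5, 1.0)]
        return all(P(y) for y in samples)
    return any(holds_on(p) for p in radii)

positive = lambda x: x > 0
```

Positivity holds near 1 (a punctured neighbourhood of radius 0.1 suffices) but not near 0, where every punctured neighbourhood contains negative points.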
Limits. Then follows a section entitled “Limits”. The context [a : Rl] [f : pfn] of Section 3 is kept here. We first define the predicate of “being a limit, at point a, of the partial function f”:
f @
lim := [b : Rl] ∀([ε : Rp] near.nb(b, ε)) : pred(Rl)
Note that the predicate lim can be satisfied only if a punctured neighbourhood of a lies within the domain of f. There are proofs that the limit is unique:

unq.lim := ... : unq(lim)
(where unq is the unicity quantifier). Then limw is defined, being the (unique) limit if the predicate lim holds for some real number, and w_ otherwise:

limw := ... : Rw

satisfying

[u : ∃(lim)]
limw1 := ... : (jw(limw)) lim

and

[u : ∀([x : Rl] ¬((x) lim))]
limw2 := ... : limw = w_
Moreover, going back to the context [a : Rl] [f : pfn] [b : Rl], we have

b @ [u : (b) lim]
lim.so.limw := ... : iw(b) = limw

Finally

f @ [u : def(limw)]
limw.def.so.lim := ... : (jw(limw)) lim
It should by now not be too difficult to understand other properties of limits occurring in the fragment from their names and use.
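The behaviour of limw (the limit when it exists, w otherwise) can be caricatured numerically. This is a finite-precision sketch of ours, not the AUT-II definition: `limw` here accepts the limit when samples of f approaching a from both sides settle down, encoding iw(b) as a tagged tuple and w as `None`.

```python
import math

def limw(f, a, eps=1e-6):
    """Numerical stand-in for limw: ("iw", b) when the samples of f near a
    agree to within eps, and None (playing the role of w) otherwise."""
    hs = [10.0 ** -k for k in range(3, 8)]
    vals = [f(a + h) for h in hs] + [f(a - h) for h in hs]
    b = vals[-1]
    if all(abs(v - b) < eps for v in vals):
        return ("iw", b)
    return None   # w: the limit does not exist

# sin(x)/x has limit 1 at 0; the sign function has no limit at 0.
sinc = lambda x: math.sin(x) / x
sign = lambda x: 1.0 if x > 0 else -1.0
```

The two test cases mirror limw1 and limw2: where the limit predicate is satisfied the result is an iw-value, and where it never holds the result is w.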
4. SYNOPSIS OF THE TEXT FRAGMENT

The text fragment starts with Section 7 of Chapter 10, “Differentiation”, containing the definition and some elementary properties of differentiation. It starts out by defining, in context [a : Rl] [f : pfn] [d : Rl] [u : d ≠ a], the difference quotient dq. Then dq.fn : pfn is defined in context [a : Rl] [f : pfn] as the partial function that assigns to any x for which it is defined the difference quotient dq(a, f, x), or rather iw(dq(a, f, x)), its injection in Rw. (The definition uses def.pfn.g. For a description of this see Section 5 below.) Then the predicate of being the derivative of f at a is defined as the limit of dq.fn at a:
f @
der := lim(dq.fn) : pred(Rl)
On the context [h : pfn] the derivative of h can now be given as a partial function:

h @
der.fn := [x : Rl] derw(x, h) : pfn
where derw(x, h) stands for limw(x, dq.fn(x, h)). These definitions are followed by a few immediate consequences, mainly on the domain of the derivative in relation to the domain of the original function (paragraphs 71-73). Subsequently the context [a : Rl] [f : pfn] is extended with the assumption that f is differentiable at a, with b as the value of the derivative:

f @ [b : Rl] [b.der.a.f : (b) der]
Under this assumption, it is proved in paragraph 74 that f is defined at and near a, and in paragraph 75 that f must be continuous at a. This ends Section 7. Section 8 is devoted to finding the derivatives of certain functions and giving rules for differentiation. We outline the contents of the part of Section 8 that is included in the selection. The derivative of a constant function is computed. In particular, first, in paragraph 81, it is established that the derivative of a constant function at a is 0:
a @ [c : Rl]
th := (0) der(con.fn(c)) : prop
Subsequently, this result is used in paragraph 82 to compute the derivative of the constant function as a function. th
:= der.fn(con.fn(c) = con .fun(0))
: prop
The same pattern is followed for the case of the identity function (paragraphs 83, 84). Finally, paragraph 85 is devoted to computing the derivative of the sum of two partial functions as the sum of the derivatives. That is, on the extended context

b.der.a.f @ [g : pfn] [c : Rl] [c.der.a.g : (c) der(g)]

the following theorem is proved:

th := (b + c) der(sum.fn(f, g)) : prop
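The synopsis above can be mirrored numerically: a difference quotient, a derived function that is undefined where the difference quotients have no limit, and the sum rule of paragraph 85. This Python sketch is our own finite-precision illustration; `dq` and `der_fn` are hypothetical stand-ins for dq and der.fn, with `None` playing the role of w.

```python
def dq(f, a, d):
    """The difference quotient of f between a and d (requires d != a)."""
    return (f(d) - f(a)) / (d - a)

def der_fn(f, a, eps=1e-2):
    """Numerical stand-in for der.fn: the value of the derivative at a when
    the difference quotients settle down, None (i.e. w) otherwise."""
    hs = [10.0 ** -k for k in range(3, 7)]
    vals = [dq(f, a, a + h) for h in hs] + [dq(f, a, a - h) for h in hs]
    b = sum(vals) / len(vals)
    if all(abs(v - b) < eps for v in vals):
        return b
    return None

square = lambda x: x * x
# sum rule (paragraph 85): the derivative of f + g is der f + der g
f_plus_g = lambda x: square(x) + 3.0 * x
```

At a point of non-differentiability, such as the absolute value at 0, the difference quotients from the two sides disagree and `der_fn` returns the undefined object, just as der.fn ; d is undefined there.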
5. OTHER PRECEDING NOTIONS

We conclude this introduction by listing some more notions from the preceding text, mostly in the order in which the corresponding identifiers appear in the fragment. The identifiers given here, in addition to the ones discussed in the previous sections, still do not form a complete list. We expect that the definitions of the missing identifiers can be guessed from their use and from Zucker’s practice of choosing suggestive names.

qu
stands for “quotient”. It requires three arguments: two real numbers and a proof that the second is nonzero.

dif.nz
proves that the difference of two distinct reals is nonzero. In fact this is an example of the use of AUT-SYNT. Zucker derives first in context [a : Rl] [b : Rl] [u : a ≠ b]

pf121 := ... : (a + −b) ≠ 0

Then he defines in context [z1 : synt]

dif.nz := pf121(LASTELT(PREPART(TAIL(≠, CAT(z1)))), LASTELT(TAIL(≠, CAT(z1))), z1)

Now if v : p ≠ q, then dif.nz(v) : (p + −q) ≠ 0.

def.pfn.g
In context [Pr : pred(Rl)] [F : Π([x : Rl] Π([u : (x) Pr] Rl))] we have def.pfn.g : pfn. The partial function def.pfn.g has the same value as F at points where Pr holds and is undefined otherwise. Note that, literally speaking, F is a function of two arguments: a real x and a proof of (x) Pr. However, since in AUT-II we have irrelevance of proofs (see [Zucker 77 (A.4)]), the value of F depends only on the first argument.

do.pfn.g
This identifier and the following substantiate the just mentioned properties of def.pfn.g. First, if the context above is extended with [d : Rl] [u : (d) Pr], then d is in the domain: do.pfn.g : (d) do(def.pfn.g(Pr, F)).
va.pfn.g
And, secondly, the value is as said: va.pfn.g : (def.pfn.g ; d) = (u) (d) F.
not.do.pfn.g
If, on the other hand, the context is extended by [d : Rl] [u : ¬(d) Pr], then not.do.pfn.g : ¬(d) do(def.pfn.g(Pr, F)).
su
is for “substitutivity”: if P is a predicate, a = b and (a) P holds, then so does (b) P. In the definition of su, again, AUT-SYNT is used. First, in context [α : type] [P : Π([x : α] prop)] [a : α] [b : α] [u : a = b] [v : (a) P], we have the axiom

ax34 := PN : (b) P

Then, in context [z1 : synt] [z2 : synt] [z3 : synt], we have

su := ax34(DOM(CAT(z1)), z1, LASTELT(PREPART(TAIL(=, CAT(z2)))), LASTELT(TAIL(=, CAT(z2))), z2, z3)

Now if P : Π([x : α] prop), u : c = d and v : (c) P (where α : type, c : α and d : α), then su(P, u, v) : (d) P.
defd
is the “abstracted version of def”, i.e. the predicate [x : Rw] def(x).
pf.T
is a proof of T (denoting truth and defined by T := ⊥ → ⊥). In fact pf.T := [x : ⊥] x.
ap
is for “application of a function to equal arguments”: if f is a function and a = b, then (a) f = (b) f.
jwab
is the “abstracted version of jw”, i.e. the function [x : Rw] jw(x).

sy
is for symmetry of equality.
tr3
is for transitivity of 3 equalities: if a = b, b = c and c = d, then a = d. (Note that both sy and tr3 use AUT-SYNT.)
never
is for “always not”. Temporal adverbs are used as identifiers for quantifiers. If P is a predicate on Rl, say, then always(P) is Π([x : Rl] (x) P), and never(P) is Π([x : Rl] ¬((x) P)).
‘imp’
is implication between predicates. In fact, if P and Q are predicates on Rl, say, then

P ‘imp’ Q := [x : Rl] ((x) P → (x) Q) ,

and imp(P, Q) is the universal quantification of P ‘imp’ Q, that is

imp(P, Q) := Π([x : Rl] ((x) P → (x) Q)) .

(Zucker uses similar notations for other connectives, e.g.

P ‘and’ Q := [x : Rl] ((x) P ∧ (x) Q) ,
and(P, Q) := Π([x : Rl] ((x) P ∧ (x) Q)) .)
2.neg
is the double negation law.

ex.i
is for “introduction of the existential quantifier” (by producing a witness).

ex.e
is for “elimination of the existential quantifier”: if ∃([x : Rl] (x) P) and ∀([x : Rl] ((x) P → c)), then c.
hf
stands for “half”, the real number 1/2 (with type Rl).

cont
is continuity of f at a, defined in context [a : Rl] [f : pfn] by

cont := (a) do(f) ∧ (f ; a) lim(a, f) .

fn.pl.con
stands for “function plus constant”. If f : pfn and a : Rl then fn.pl.con(f, a) is the function with the same domain as f, and with value (f ; x) + a for x in that domain. Formally, if u : (b) do(f) then

do.fn.pl.con(f, a, b, u) : (b) do(fn.pl.con(f, a))

and

va.fn.pl.con(f, a, b, u) : (fn.pl.con(f, a) ; b) = ((f ; b) + a) .
ii.eq2
if f : pfn and g : pfn, then ii.eq2(f, g) is the predicate expressing pointwise definedness and equality. That is: (x) ii.eq2(f, g) means that x is in the domain of both f and g, and that f and g are equal at x.
ap2
is for “application of a binary function to equal arguments”: if f is a binary function, a = c and b = d, then (b) (a) f = (d) (c) f.

ti
is the product function [x : Rl] [y : Rl] (x × y).
div.ti
(for “division” followed by “times”) proves that, for q ≠ 0, qu(p, q, v) × q = p. Here, again, a notable use is made of AUT-SYNT. First, on context [a : Rl] [b : Rl] [u : b ≠ 0], Zucker derives

pf27 := ... : qu(a, b, u) × b = a .

Then, on context [z1 : synt] [z2 : synt],

div.ti := pf27(z1, LASTELT(PREPART(TAIL(≠, CAT(z2)))), z2) .

It follows that if p : Rl and v : q ≠ 0, then

div.ti(p, v) : qu(p, q, v) × q = p .
triple
is defined by triple(a, b, c) := pair(pair(a, b), c).
elsewhere
if a : Rl and P is a predicate on Rl, then

elsewhere(a, P) := ∀([x : Rl] ((x ≠ a) → (x) P)) .

near.so.near.mod
proves that, for predicates P and Q, if elsewhere(P ‘imp’ Q) and near(P), then near(Q). Note that both “elsewhere” and “near” refer to a : Rl, which is not mentioned explicitly here, but should be taken from the context.
near.eq2
near.eq2(f, g) definitionally equals near(ii.eq2(f, g)). So near.eq2(f, g) means that f and g are both defined near a, and that f and g are pointwise equal near a.
ti.0
stands for “times zero”, and proves a × 0 = 0.

0.pl
similarly, stands for “zero plus”, and proves 0 + a = a.

do2
stands for “the intersection of 2 domains”, i.e. do2(f, g) := do(f) ‘and’ do(g).

co
stands for “convergence of equality”, i.e. if a = c and b = c then a = b.
6. THE TEXT FRAGMENT

‘heading’ 7. DIFFERENTIATION
‘comment’ We return to the context “[a:Rl][f:pfn]”. To begin with, we define the “difference quotient condition” (a predicate on the reals),

‘comment’ and (in an extended context) the “difference quotient”:

[d:Rl][u:d≠a]
dq := qu((f;d)+−(f;a), d+−a, dif.nz(u)) : Rl
‘comment’ Then we define the “difference quotient function, at a, of f” (by the second of the two general methods given in Sec. 2).

f @
dq.term := [x:Rl][y:(x)dq.cond] dq(x,proj1(y))
dq.fn := def.pfn.g(dq.cond,dq.term) : pfn

d @ [u:d≠a][v:(a)do(f)][w:(d)do(f)]
do.dq.fn := do.pfn.g(dq.cond,dq.term,d,pair(u,pair(v,w))) : (d)do(dq.fn)
va.dq.fn := va.pfn.g(dq.cond,dq.term,d,pair(u,pair(v,w))) : (dq.fn;d)=dq(u)
d @ [u:¬(d)dq.cond]
not.do.dq.fn := not.do.pfn.g(dq.cond,dq.term,d,u) : ¬(d)do(dq.fn)
‘comment’ Now we can define a “derivative, at a, of f” (as a predicate on Rl) to be a limit, at a, of the difference quotient function of f at a.

f @
der := lim(dq.fn) : pred(Rl)
‘comment’ The derivative is unique,

unq.der := unq.lim(dq.fn) : unq(der)
‘comment’ and so we can define it as an object of type Rw,

derw := limw(dq.fn) : Rw
‘comment’ satisfying:

[u:∃(der)]
derw1 := limw1(dq.fn,u) : (jw(derw))der
f @ [u:never(der)]
derw2 := limw2(dq.fn,u) : derw=w_

f @ [b:Rl][u:(b)der]
der.so.derw := lim.so.limw(dq.fn,b,u) : iw(b)=derw

f @ [u:def(derw)]
derw.def.so.der := limw.def.so.lim(dq.fn,u) : (jw(derw))der
‘comment’ Now we can define the “derived function” of a partial function h as the partial function which has as its value the derivative of h, at any point where this exists, and which is undefined otherwise.
@ [h:pfn]
h @
der.fn := [x:Rl] derw(x,h) : pfn

[d:Rl][e:Rl][u:(e)der(d,h)]
+71
l1 := der.so.derw(d,h,e,u) : iw(e)=derw(d,h)
l2 := su(defd,l1,pf.T) : def(derw(d,h))
l3 := ap(jwab,l1) : e=jw(derw(d,h))
-71
do.der.fn := l2“.71” : (d)do(der.fn)
va.der.fn := sy(l3“.71”) : (der.fn;d)=e
d @ [u:never(der(d,h))]
+72
l1 := derw2(d,h,u) : derw(d,h)=w_
l2 := w.so.not.def(derw(d,h),l1) : ¬def(derw(d,h))
-72
not.do.der.fn := l2“.72” : ¬(d)do(der.fn)
‘comment’ A corollary of the last result is:

d @ [u:¬(d)do(h)]
+73
th := ¬(d)do(der.fn)

[v:∃(der(d,h))]
l1 := derw1(d,h,v) : (jw(derw(d,h)))der(d,h)
l3 := ... : (d)do(h)

u @
pf := not.do.der.fn([1×]l3) : th
-73
not.do.so.not.do.der.fn := pf“.73” : ¬(d)do(der.fn)
‘comment’ We return again to the context “[a:Rl][f:pfn]”, and assume now that f is differentiable at a; in fact, we assume that we have explicitly given a derivative, at a, of f.

f @ [b:Rl][b.der.a.f:(b)der]
‘remark’ This context (or an extension of it) will be used in most of the rest of this chapter.

‘comment’ The first interesting thing we can say (in this context) is that f is defined at and near a.

+74
th1 := def.at
th2 := def.near
th3 := def.at.near
l1 := lim.so.def.near(dq.fn,b,b.der.a.f) : def.near(dq.fn)

[δ:Rp][u:imp(nd(δ),do(dq.fn))][d:Rl][v:(d)nd(δ)]
l2 := (v)(d)u : (d)do(dq.fn)

[w:¬(d)dq.cond]
l3 := (l2)not.do.dq.fn(d,w)

v @
l4 := 2.neg([1×]l3) : (d)dq.cond
l5 := proj1(proj2(l4)) : (a)do(f)
l6 := proj2(proj2(l4)) : (d)do(f)

u @
l7 := [2×]l6 : imp(nd(δ),do(f))
l8 := ex.i(near.pred(do(f)),δ,l7) : th2
d1 := a+(proj1(δ)×hf)
l9 := in.nd(δ)
l10 := l5(d1,l9)

b.der.a.f @
pf1 := ex.e(l1,[2×]l10) : th1
pf2 := ex.e(l1,[2×]l8) : th2
pf3 := at.and.near.so.at.near(do(f),pf1,pf2) : th3
-74
der.so.def. at := pfl“.74”
: def.at
der.so.def. near := pf2“.74”
: def.near
def.so.def at.near := pf3“.74”
‘comment’ Further, f is continuous at a.
+75 b.der.a.f @
th := cont dif.fn := fn.pl.con(id.fn,-a) prod := prod.fn(dq.fn,diff.fn) new.fn := fn.pl.con(prod,f;a)
:
def.at.near
‘remark’ Then (for (x)do(new.fn)): (new.fn;x)=(((dq.fn;x)x(x+-a))+(f;a)). [editor’s comment: So (new.fn;x)=(f;x). This is proved in l17 below.] p1 := do(f) p2 := ii.eq2(f,new.fn) := der.so.def.at 12 := der.so.def.near
11
: (a)do(f) : near(p1)
[d:Rl][u:d#a] [v:(d)p 11
l3 := do,id.fn(d) l4 := do.fn.pl.con(id.fn,-a,d,l3) 15 := va.fn.pl.con(id.fn,-a,d,l3) 16 := do,dq.fn(d,u,ll ,v) 17 := va.dq.fn(d,u,ll,v) 18 := do.prod.fn(dq.fn,dif.fn,d, 16514) 19 := va.prod.fn(dq.fn,dif.fn,d, 16 914 )
(d)do(id.fn) (d)do(dif.fn) (dif.fn;d)=(d+-a) (d)do(dq.fn) : (dq.fn;d)=dq(d,u)
: : : :
:
: (prod;d)=((dq.fn;d)x
:
(dif.fn;d)) (d)do(new.fn) (new.fn;d)= ((prod;d)+(f;a) ((dq.fn;d)x (dif.fn;d))= (dq(d4)x (d+-a)) (dq(d,u)x(d+-a))= ( (f;d)+- (f;a)) (prod;d)=((f;d)+- (Ca)) ((prod;d)+(f;a))= ( ( ( W + - (f;4 )+(f;a) 1 (((f;d)+-(f;a))+(f;a))= (f;d) (new.fn;d)=(f;d) (d)P2
:
elsewhere(pl‘imp’p2)
do.fn.pl.con(prod,f;a,d,ls) va.fn.pl.con(prod,f;a,d,lg)
:
111 :=
112 :=
ap2(ti,17,15)
:
113 :=
div.ti((f;d)+-(f;a),dif.nz(u))
:
114 :=
tr3(19,112,113) ap(pl.Ri(f;a),ll4)
:
ll0 :=
115 :=
:
:
116 := plmi.pl(f;d,f;a)
:
tr3(111,115,116) 11s := triple(v,llo,sy(ll7))
:
117 :=
119 := [3x]11g b.der.a.f @I 120 := near.so.near.mod(pl,p2, 119912)
(d)do(prod)
: near.eq2(f,new.fn)
127
1im.id.fn lim.fn.pl.con(id.fn,a,l21 ,-a) ~u(lim(dif.fn),pl.mi(a),l22) lim.prod.fn(dq.fn,b, b.der.a.fldif.fn,0,123) := su(lim (prod),ti.a(b) ,124) := lirn.fn.pl.c0n(prod,0,12~,f;a) := su(lim(new.fn),0.pl(f;a),126)
128
:= near.eq2.so.same.lim(fl
:= := 123 := 124 := 121
122
125 126
new.fn,l2o,f;a,l27)
: (a)lim(id.fn) : (a+-a)lim(dif.fn) :
(O)lim(dif.fn)
(b x O)lim(prod) (O)lim(prod) : (O+(f;a))lim(new.fn) : (f;a)lim(new.fn) : :
:
(f;a)lim(f)
:
th
-75
der.so.cont := pf“.75”
: cont
‘heading’ 8. RULES FOR DIFFERENTIATION ‘comment’ We compute the derivative, at the point a, of certain partial functions. (In some cases, we will also give an expression for the derived function of the given partial function.) First, the constant function. a @ [c:Rl] +81
th := (O)der(con.fn(c)) con := con.fn(c) dqf := dq.fn(con) 11 :=
do.con.fn(c,a)
: (a)do(con)
[d:R1][u:d#a] do.con.fn(c,d) := do.dq.fn(con,d,u,ll,12)
12 := 13
(d)do(con) : (d)do(dqf) :
iv.dif := iv(d+-a,dif.nz(u))
17 := tr3(14,15,16)
(dqf;d)= ((c+-c) xiv.dif) : ((c+-c)xiv.dif)= (0x iv.dif) : (Oxiv.dif)=O : (dqf;d)=O
la := pair(13,17)
: (d) ii .eq (dqf,0)
14 := 15
:= ap(ti.Ri(iv.dif),pl.mi(c))
I6 :=
a
@
va.dq.fn(con,d,u,ll,12)
19 :=
@.ti(iv.dif)
[2X]la
pf := elsewhere.con.so.lim(dqf, 0,191
:
: elsewhere.eq(dqf,O)
:
th
-81
der.con.fn := pf“.81”
: (O)der(con.fn(c))
‘comment’ As a corollary we have the derived function of a constant function. @ [c:R1] +82 th := der.fn(con.fn(c))=con.fn(0) con := con.fn(c) derc := der.fn(con)
[d:Rl] (O)der(con) (d)do(derc) : (derc;d)=O : (d)do(con.fn(0)) : (d)ii.eq2(derc, con.fn( 0)) :
12 :=
:
13 14 15
c @
der.con.fn(d,c) da.der.fn(con,d,O,ll) := va.der.fn(con,d,O,ll) := do.con.fn(0,d) := triple(l2,14,13)
11 :=
pf := eq.tot.fn(derc,con.fn(O),[l~]l~): th
-82 der.fn.con.fn := pf “.82”
: der.fn(con.fn(c))=
con.fn(0)
‘comment’ Next, the derivative of the identity function at a, +83
a @
th := (l)der(id.fn) dqf := dq.fn(id.fn) 11 :=
do.id.fn(a)
:
(a)do(id.fn)
[d:Rl][u:d#a]
a @
12 := do.id.fn(d) l3 := do.dq.fn(id.fn,d,u,l1,12) l4 := va.dq.fn(id.fn,d,u,ll,lZ) 15 := ti.iv(dif.nz(u)) 16 := pair(l3,tr(l4,15))
: (d)do(id.fn)
(d)do(dqf) (dqfid)=dq(id.fn,d,u) : dq(id.fn,d,u)=l : (d)ii .eq(dqf, 1)
pf := elsewhere.con.so,lim(dqf, 1 ,x 1~1 ~ )
: th
: :
-83
der.id.fn := pf“.83”
‘comment’ and, again, the derived function.
:
(l)der(id.fn)
+84
@
th := der.fn(id.fn)=con.fn(l) der.fn := der.fn(id.fn)
a @
@
der.id.fn do.der.fn(id.fn,a,l,ll) l3 := va.der.fn(id.fn,a,l,ll) 14 := do.con.fn( 1,a) l5 := triple(12,14,13) 11 :=
:
12 :=
: : : :
pf := eq.tot.fn(der.fn,con.fn(l), [1X 115)
(l)der(id.fn) (a)do(der.fn) (der.fn;a)=l (a)do(con.fn(l)) (a)ii.eq2(der.fn, con.fn( 1))
: th
-84
der.fn.id.fn := pf“.84”
: der.fn(id.fn)=con.fn(l)
’comment’ The derivative of the sum function of two partial functions. (Note: we return to the context “[a:Rl][f:pfn][b:Rl][b.der.a.f:(b)der(f)]” and extend it.) b.der.a.f @ [g:pfn] [c:Rl][c.der.a.g: (c)der(g)] +85
th := (b+c)der(sum.fn(f,g))
‘remark’ The idea in this proof (and other proofs in this section) is to define a partial function which is defined and equal to the difference quotient function under consideration near a, and for which the limit at a can be computed by the rules for limits in Sec. 6. sum dqf dqg dqs sdq
:= := := := :=
sum.fn(f,g) dq.fn(f) dq.fn(g) dq.fn(sum) sum.fn(dqf,dqg)
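The fact proved here — that near a the difference quotient of the sum equals the sum of the difference quotients, so its limit at a is b+c — can be illustrated numerically. The Python sketch below uses our own names and is not the Automath proof:

```python
def dq(fn, a, d):
    """Difference quotient (fn(d) - fn(a)) / (d - a), assuming d != a."""
    return (fn(d) - fn(a)) / (d - a)

f = lambda x: x * x   # derivative at a = 3 is b = 6
g = lambda x: 2 * x   # derivative at a = 3 is c = 2
a, d = 3.0, 3.0 + 1e-7

s = lambda x: f(x) + g(x)
# Near a, the quotient of the sum is the sum of the quotients ...
assert abs(dq(s, a, d) - (dq(f, a, d) + dq(g, a, d))) < 1e-6
# ... so its limit at a is b + c = 8.
assert abs(dq(s, a, d) - 8.0) < 1e-4
```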
‘remark’ We want to prove “near.eq2(dqs,sdq)”, since the limit of the partial function sdq at a can be computed. p1 := do2(f,g) p2 := ii.eq2(dqs,sdq) 11 := 12 :=
13 := 14 :=
der.so.def.at(f,b,b.der.a.f) der.so.def.at(g,c,c.der.a.g) do.sum.fn(f,g,a,ll,l2) va.sum.fn(flg,a,ll,12)
: (a)do(f) : (a)do(g) : (a)do(sum) : (sum;a)=(f;a)+(g;a)
l5 := der.so.def.near(f,b,b.der.a.f) : near(do(f)) 16 := der.so.def.near(g,c,c.der .a.g) : near( do(g) ) 17 := near.both(do(f),do(g),l5,16) : near(p1) [d:Rl][u:d#a][v:(d)pl] projl(v) : (d)do(f) proj2(v) : (d)do(g) : (d)do(sum) 110 := do.sum.fn(f,g,d,l8,19) 111 := va.sum.fn(f,g,d,l&) : (sum;d)=( (f;d) +(g;d) ) : (d)do(dqf) 112 := do.dq.fn(f,d,u,ll,le) 113 := va.dq.fn(fld,u,ll,18) : (dqf;d)=dq(f,d,u) 114 := do.dq.fn(g,d,u712,19) : (d)do(dqg) lI5 := va.dq.fn(g,d,u,l2,19) : (dqg;d)=dq(g,d,u) 116 := do.dq.fn(sum,d,u,l3,110) : (d)do(dqs) 117 := va.dq.fn(sum,d,u,l3,110) : (dqs;d)=dq( sum,d,u) 118 := do.sum.fn(dqfldqg,d,112,1~~) : (d)do(sdq) 119 := va.sum.fn(dqf,dqg,d,l12,114) : (sdq;d)= ((dqf;d)+ (dqg;d)) 18 :=
19 :=
dif.f := ((f;d)+-(f;a)) dif.g := ((g;d)+-(g;a)) dif.sum := (sum;d)+- (sum;a) 120
:= ap2(dif,lll,l4)
: dif.sum=(((f;d)+(g;d))+-
((f;a)+(g;a)))
dif.sum.is.sum.difs(f;d,g;d, f;a,g;a) := tr(120~121)
121 := 122
: dif.sum=(dif.f+dif.g)
q l := dif.nz(u) iv.dif := iv(d+-a,ql)
: (d+-a)#O
123 :=
ap(ti.Ri(iv.dif),lzz)
: dq(sum,d,u)=
124 :=
Ri.dist(dif.f,dif.g,iv.dif)
: ((dif.f+dif.g) xiv.dif)=
((dif.f+dif.g) xiv.dif) (dq(f,d,u)+dq(g,d,u))
c.der.a.g @
129 :=
[3x]lzs
: elsewhere(pl‘imp’p2)
130 :=
near.so.near.mod( p l ,p2, 129’17)
:
near.eql(dqs,sdq)
lim.sum.fn(dqf,b,b.der.a.f, dqg,c,c.der.a.g)
:
(b+c)lim(sdq)
:
th
131 :=
pf := near.eq2.so.same.lim(dqs, ~dq,l3o,b+~,l3i)
-85 der.sum.fn := pf“.85”
: (b+c)der(sum.fn(f,g))
Checking Landau’s “Grundlagen” in the Automath System
Appendices 3 and 4 (The PN-lines; Excerpt for “Satz 27”)

L.S. van Benthem Jutting

APPENDIX 3. THE PN-LINES FROM THE PRELIMINARIES
+L *A A*B B * IMP 1 * CON A * NOT A * WEL A* W 2 W*ET B*EC B * AND *SIGMA SIGMA * P P * ALL P * NON P * SOME
._ .._
-----
:= [X,A]B
..-
PN
:= IMP(C0N) :=NOT(NOT(A))
.-._ ._ .-
---
PN
:=IMP(A,NOT(B))
:= NOT(EC(A,B))
.._ ._
---
---
:= P := [X,SIGMA)NOT((X)P) := NOT(NON(P))
;PROP ;PROP ;PROP ; PROP ;PROP ;PROP ; WEL(A) ;A ;PROP ;PROP ;TYPE ; [X,SIGMA]PROP ;PROP ;[X,SIGMA]PROP ;PROP
+E SIGMA * S S*T 3 T * IS 4 S * REFIS P*S S*T TtSP SP*I 5 I * ISP P * AMONE
P * ONE
.._
-----
.._ ._ ..-.-
PN PN
---
:= :=
.-__ :=
__--..-
..-
PN := [X,SIGMA][Y,SIGMA][U,(X)P][V, (Y)PIIS(X,Y) := AND(AMONE(SIGMA,P), SOME(SIGMA,P))
.._ - - P*O1 .6 Ol*IND Pn 7 Ol*ONEAX .- PN .- - - SIGMA t TAU := TAU t F F * INJECTIVE:= ALL((X,SIGMA]ALL([Y,SIGMA]
..._
___
SIGMA SIGMA PROP IS(S,S) SIGMA SIGMA (S)P
; IS(S,T) ; (T)P ; PROP ;PROP ; ONE(SIGMA,P) ; SIGMA ; (1ND)P ; TYPE
; [X,SIGMA]TAU
IMP(IS(TAU,(X)F,(Y)F),IS(X,Y)) ))
; PROP
8 9 10
11 12
13
14
15 16 17 18 19 20
F*TO TO * IMAGE TAU * F F*G G*I I * FISI P*OT P*O1 OlrIN 01 * INP P * OTAXl P*S S*SP S P t OTAX2 TAU * PAIRTYPE TAU t S S+T T t PAIR TAU * P1 P1 *FIRST P1 *SECOND P 1 * PAIRISl
___
:= ; TAU := SOME([X,SIGMA]IS(TAU,TO,(X)F)) ; PR O P .._ - - ; [X,SIGMA]TAU ._ .- - - ; [X,SIGMA]TAU
._ ._ ._ .._ ..._ .-..-.._ .-
:=
._ ._ .-.-
:=
--PN PN --PN PN PN
___ --PN PN
[X,SIGMA]IS(TAU,(X)F,(X)G) IS( [X,SIGMA]TAU,F,G) TY PE OT SIGMA (WP
INJECTIVE(OT,SIGMA,[X,OT]
IW)) ;SIGMA ; (S)P ; IMAGE(OT,SIGMA,[X, OTlIN(X),S) ; TY PE ;SIGMA ; TAU ; PAIRTYPE ; PAIRTYPE ;SIGMA ; TAU
....._ .._ .-
___
:= := :=
PN PN PN
T * FIRSTISl := T * SECONDISl:=
PN PN
; IS(SIGMA,FIRST(PAIR),S) ; IS(TAU,SECOND(PAIR),T)
._ .._
PN
;TYPE ;SIGMA ; SET ; PR O P ; SET ; SIGMA ; (S)P ;ESTI(S,SETOF(P)) ;ESTI(S,SETOF(P)) ; (S)P ;SET ;SET
. . ~
PN
___
; IS(PAIRTYPE,PAIR(FIFlST,
SECOND),Pl)
-E +*E +ST 21 SI G M A r SET SIGMA * S
s*so
22 23
SO*ESTI P * SETOF P*S S*SP 24 SP*ESTI I S*E 25 E * ESTIE SIGMA t SO SO li TO TO * INCL
:=
._ ...:=
.._ ..-
._ ._ .._ ._
---
___
PN PN
___
--PN
---
PN
-----
:= ALL( [X,SICMA]IMP(ESTI(X,SO),
ESTI(X,TO)))
26 -ST -E -L
TOtI I*J J * ISSETI
:=
.._ ..-
.-.
--PN
;PROP ; INCL(S0,TO) ; INCL(T0,SO) ; IS(SET,SO,TO)
APPENDIX 4. EXCERPT FOR “SATZ 27”

+L *A A*B B * IMP B*A1 Al*I ItMP B*C C*I I +J J + TRIMP + CON A + NOT A * WEL A*A1 A 1 * WELI A +W W+ET AtCl C 1 + CONE
:= :=
---
---
:= [X,A]B :=
._
___
---
:= (A1)I :=
.._
:=
--_
---
___
:= [X,A]((X)I)J
.-
PN
:= I M P ( C 0 N ) := NOT(NOT(A))
._ ._
---
:= [X,NOT(A)](Al)X
._
.._
---
PN
---
:= ET([X,NOT(A)]Cl)
;PROP ;PROP ;P R O P
;A ; IMP(A,B) ;B ;PROP ;IMP(A,B) ; IMP(B,C) ; IMP(A,C) ;P R O P ;P R O P ;P R O P ;A ;WEL(A) ; WEL(A) ;A ; CON ;A
+IMP B+I I +J J: T H 1 BtN N + TH2 BIN N*I 11: T H 3 B*A1 AltN N t TH4 BIN N + TH5 N + TH6 -IMP
._ :=
---
___
; IMP(A,B) ; IMP(NOT(A),B)
:=ET(B,[X,NOT(B)]((TRIMP(CON,I,
._
X))J)X)
;B
._
XI)
; IMP(A,B) ; NOT(B) ; IMP(A,B) ; NOT(A)
--; NOT(A) := TRIMP(CON,B,N,(X,CON]CONE(B, :=
---
---
:= TRIMP(CON,I,N) := _ - -
;A
___ := [X,IMP(A,B)](Al)TH3(N,X) := _ - -
; NOT(B) ;NOT(IMP(A,B)) ; NOT(IMP(A,B))
:= ET([X,NOT(A)](THZ(X))N) := [X,B]([Y,A]X)N
; NOT(B)
:=
;A
BtEC
:=IMP(A,NOT(B))
;PROP
._ ._
; EC(A,B)
+EC BtI I * TH1 BtI I t TH2 -EC BtE EtA1 A1 t E C E l E*B1 B1 t ECEZ B t AND BtA1 A1 t B1 B1 t AND1 B+A1 A1 t A N D E l A1 t ANDEP
._ ._
-----
;A ; NOT(B) ;B :=TH3"-IMP"(NOT(B),WELI(B,Bl),E) ; NOT(A) := NOT(EC(A,B)) ;P R O P := .-;A
:= (A1)E :=
___
._
---
;B
:= TH4"-IMP"(NOT(B),Al,WELI(B,Bl)) ; AND(A,B)
._ ._
---
:= TH5"-IMP"(NOT(B),Al) := ET(B,THG"-IMP"(NOT(B),Al))
; AND(A,B) ;A ;B
+AND BtN N*A1 A1 * T H 3
._
--.
:=
.._
:= E C E l ( E T ( E C , N ) , A l )
; NOT(AND) ;A ; NOT(B)
:= IMP(NOT(A),B)
;PROP
-AND B+OR B*A1 A1 * OR11 B*B1 B 1 * OR12
.-._
--;A :=TH2"-IMP"(NOT(A),B,WELI(Al)) ; OR(A,B) ._ - - ;B := [X,NOT(A)]Bl
; OR(A,B)
+OR B*I I t TH2
._
--; IMP(NOT(B),A) := (X,NOT]ET(B,TH3"L-IMP"(NOT(B), AW)) ; OR(A,B)
-OR Bt Ot Nt Ot Nt
O N ORE2 N ORE1
._ ._
-----
:= (N)O
._
; OR(A,B) ; NOT(A)
;B
--; NOT(B) := ET(TH3"-IMP"(NOT(A),B,N,O)) ; A
+*OR BtN N*M M * TH3
._
---
___
:=
; NOT(A) ; NOT(B)
:=TH4“L-IMP”(NOT(A),B,N,M)
; NOT(OR(A,B))
._ ._ ._
; OR(A,B) ; IMP(A,C)
-OR c * o O*I I*J J * ORAPP
-----
.._ ; IMP(B,C) := TH1“-IMP”(C,I,TRIMP(NOT,B,C, QJ)) ;C
:=
+*OR
O*I I I; TH7 O*I I IT H 8 C*D D*O O*I I*J J * TH9
._ __ - - -
; IMP(A,C)
:= TRIMP(NOT(C),NOT,B,[X,NOT(C)]
._ ._
TH3“L-IMP”(A,C,X,I) ,0)
---
:=TRIMP(NOT(A),B,C,O,I)
._ ._ ._ ._
---------
:=THT(A,D,C,TH8( A,B,D,O,J),I)
; OR(C,B)
; IMP(B,C) ; OR(A,C)
;P R O P ; OR(A,B)
; IMP(A,C) ; IMP(B,D) i OR(C,D)
-OR
* SIGMA SIGMA * P P * ALL
._
---
.-._
---
;T Y P E ; [X,SIGMA]PROP ;PROP
---
; SIGMA
:= P
+ALL P*S S*N N * TH1
._
:= - - _ := [X,ALL(SIGMA,P)]((S)X)N
; NOT((S)P)
; NOT(ALL(SIGMA,P))
-ALL
P * NON P * SOME P*S S*SP S P * SOME1
:= [X,SIGMA]NOT((X)P) := NOT(NON(P))
.-._ :=
---
---
:= TH1“-ALL”(NON(P),S,WELI((S)P,
SP))
; [X,SIGMA]PROP ;PROP
; SIGMA ; (S)P ; SOME(SIGMA,P)
+SOME P*N N * TH5 -SOME
:= _ _ . := WELI(NON(P),N)
; NON(P) ; NOT(SOME(SIGMA,P))
767
P*S s * x X*I
:=
._ ._ ._ ._
___
; SOME(SIGMA,P)
---
; PR O P
---
; [Y,SIGMA]IMP((Y)P,X)
---
; NOT(X)
---
;SIGMA ; NOT((T)P)
+*SOME I*N N*T T*T5 NtT6
.-._ ._
:=TH3'Z-IMP''((T)P,X,N,(T)I) := MP(SOME(SIGMA,P),CON,S,
TH5([Y,SIGMA]T5(Y)))
; CON
-SOME I * SOMEAPP := ET(X,[Y,NOT(X)]T6"-SOME'(Y)) ;X +*SOME P*Q Q*S
S*I
I * TH6
-SOME C * AND3 C*A1 A1 * AND3El A1 * AND3E2 A1 * AND3E3 C*A1 A1 * B1 B 1* C1 C 1* AND31
:= AND(A,AND(B,C))
._
---
; PR O P
; ANDB(A,B,C)
:= ANDEl(AND(B,C),Al) ;A := ANDEl(B,C,ANDEz(AND(B,C),AI)) ;B := ANDEz(B,C,ANDEz(AND(B,C),Al)); C
._ ._ ._ ._
-------
;A ;B ;C
:= ANDI(A,AND(B,C),Al,
ANDI(B,C,Bl,Cl))
; AND3(A,B,C)
+AND3 C*A1 A1 * T H l
._
--AND3E3( Al) ,AND3E1(Al))
-AND3
AND3(A,B,C)
:= AND3I(B,C,A,AND3EZ(Al),
ANDB(B,C,A)
C t EC3 CtE
:=AND3(EC,EC(B,C),EC(C,A))
E t TH1 E * TH 3 E t TH4
:= ANDBEl(EC,EC(B,C),EC(C,A),E) :=AND3E3(EC,EC(B,C),EC(C,A),E) := THl“L-ANDB”(EC,EC(B,C),EC(C,
._
---
+EC3
‘413) -EC3 EtA1 A1 t EC3E12 A1 t EC3E13 EtB1 B1 t EC3E23 B1 t EC3E21
___
:= := ECEl(THl“-EC3”,Al) := ECE2(C,A,TH3“-EC3” ,Al) := _ - := EC3E12(B,C,A,TH4“-EC3”,Bl) := EC3E13(B,C,A,TH4“-EC3”,Bl)
+*EC3 CtE EtF FtG G * TH6
:= := :=
___ ___ _-.
:= ANDBI(EC,EC(B,C),EC(C,A),E,F, G)
-EC3 +E SIGMA t S StT T * IS S * REFIS PtS StT TtSP SPtI I * ISP SIGMA * S StT TtI I t SYMIS T*U UtI I*J J t TRIS
._ ._ ..._ .-
:=
._ ._ :=
.-.._ ._ ._ :=
---
--PN PN __.
-----
___ PN ----___
:= ISP( [X,SIGMA]IS(X,S),S,T,
._ :=
._ ._
REFIS(S),I)
---
___
---
:=ISP( [X,SIGMA]IS(X,U),T,S,J,
SYMIS( I)) ._ _ - UtI ._ --ItJ := TRIS(S,U,T,I,SYMIS(T,U,J)) J * TRIS2 := TtN N t SYMNOTIS := TH3“L-IMP”(IS(T,S),IS(S,T),N, [X,IS(T,S)]SYMIS(T,S,X))
___
+NOTIS U*N N*I I t TH 3 N*I I * TH4
._ .-._
-----
; NOT(IS(S,T))
IS(T,U)
:= ISP( [X,SIGMA]NOT(IS(S,X)),T,U,
._ ,_
NJ)
---
:= THB(SYMIS(U,T,I))
NOT(IS(S,U)) IS(U,T) NOT(IS(S,U))
-NOTIS UtV VrI I*J JtK K * TR3IS v * w W*I I*J J*K K*L L * TRIIS P * AMONE P
* ONE
PtA1 A1 * S S t ONE1 P*O1 01 t IND 01 t ONEAX SIGMA t TAU TAU t F FtS S*T T*I I t ISF TAU * F F*G G*I I*S S t FISE
GtI I t FISI -E +*E +ST
:=
._ :=
.._
___ --__.
---
:=TRIS(S,U,V,TRIS(I,J),K)
._ ._ ._ .._ ._
----------:= TRIS(S,V,W,TR3IS(I,J,K),L)
SIGMA IS(S,T) IS(T,U) IS(U,V) IS(S,V) SIGMA IS(S,T) IS(T,U) IS(U,V) IS(V,W) IS(S,W)
:= [X,SIGMA][Y,SIGMA][U,(X)P][V,
(Y)PIIS(X,Y)
PROP
:= AND(AMONE(SIGMA,P),
._ .._ .-
SOME(SIGMA,P)) ---
---
PROP AMONE(SIGMA,P) SOME(SIGMA,P)
:= ANDI(AMONE(SIGMA,P),
._ ._ ._ .._ .._ := := :=
._ ._
SOME(SIGMA,P),Al ,S)
---
PN PN
---
___ ___
..-
---
:= ISP(SIGMA,[X,SIGMA]IS(TAU,(S)
ONE(SIGMA,P) ONE(SIGMA,P) SIGMA (1ND)P TY PE [X,SIGMA]TAU SIGMA SIGMA IS(S,T)
F,(X)F),S,T,REFIS(TAU,(S)F),I) IS(TAU,(S)F,(T)F) --; [X,SIGMA]TAU --; (X,SIGMA]TAU --; IS([X,SIGMA]TAU,F,G) --: SIGMA := ISP( [X,SIGMA]TAU,[Y,[X,SIGMA] TAU]IS(TAU,(S)F,(S)Y),F,G,
._ ._ ._ ._ ._ ._ ._
.._ ._ .-
REFIS(TAU,(S)F),I)
--PN
; IS(TAU,(S)F,(S)G) ; [X,SIGMA]IS(TAU,(X)F,(X)G) ; IS([X,SIGMA]TAU,F,G)
Checking Landau’s “Grundlagen”, Excerpt for “Satz 27” (D.5) SIGMA * SET SIGMA * S StSO SO t EST1 P * SETOF P*S S*SP S P * EST11 S*E E * ESTIE
77 1
;TYPE ;SIGMA ; SET ; PROP ; SET ;SIGMA ; (S)P ; ESTI(S,SETOF(P)) ; ESTI(S,SETOF(P)) ; (S)P
+EQ +LANDAU +N
* NAT *X X*Y Y * IS Y * NIS x*s S * IN *P P * SOME P * ALL *1 1; SUC *X X*Y Y*I I * AX2 * AX3 * AX4
*S S I CONDl S * COND2
* AX5
._ .._ ._
PN
; TYPE
-----
; NAT
:= IS“E”(NAT,X,Y) := NOT(IS(X,Y))
.._
---
:= ESTI(NAT,X,S)
._
---
:= SOME“L”(NAT,P)
:= ALL“L”(NAT,P) ._ .- PN ..- P N
._
---
:=
..-
._
---
:= ISF(NAT,NAT,SUC,X,Y,I)
._ .._ .-
PN PN
._ ._
---
:= IN(1,S) := ALL( [X,NAT]IMP(IN(X,S),IN((X)
._ .-
SUC,S))) PN
; NAT ; PROP ; PROP ; SET(NAT) ; PROP ; [X,NAT]PROP ; PROP ; PROP ; NAT ; [X,NAT]NAT ; NAT ; NAT ; IS(X,Y) ;I s ( ( x ) s u c , ( Y ) s u c ) ; [X,NAT]NIS((X)SUC,l) ; [X,NAT][Y,NAT][U,IS((X)SUC,
(Y)SUC)I~S(X,Y) ; SET(NAT) ; PROP ; PROP ; [S,SET(NAT)][U,CONDl(S)]
[V,COND2(S)][X,NAT]IN(X,S)
; [X,NAT]PROP
*P PtlP 1P * XSP XSP * x
+I1 x*s XtT1 X*Y Y+YES YES t T2 YES * T3 XtT4
__-
; SET(NAT) ; CONDl(S) ; NAT
---
; IN(Y,S)
:= SETOF(NAT,P) := ESTII(NAT,P,l,lP) :=
._ ._
:= ESTIE(NAT,P,Y,YES)
(Tl)(S)AX5 -I1
:. (Y)P I I
:= ESTII(NAT,P,(Y)SUC,(T2)(Y)XSP) ; IN((Y)SUC,S) := (X) ([Y,NAT][U,IN(Y,S)]T3(Y,U)) ; IN(X,S)
X * INDUCTION := ESTIE(NAT,P,X,T4“-11”) *X := := X*Y := Y*N
___ ___ ___
; (X)P ; NAT ; NAT ; NIS(X,Y)
+21
___
N*I IrTl
:=
N * SATZl
:= TH3“L-IMP” (IS( (X)SUC,(Y)SUC),
:= (I)(Y)(X)AX4
;I s ( ( x ) s u c , ( Y ) s u c ) ; IS(X,Y)
-21
IS(X,Y),N,[U,IS((X)SUC,(Y)SUC)l T1“-21”(U))
; NIS((X)SUC,(Y)SUC)
+23 X * PROPl
* T1
:= OR(IS(X,l),SOME([U,NAT]IS(X,
W)SUC)))
; PR O P
:=ORIl(IS(l,l),SOME([U,NAT]IS(l,
XtT2
; PR O Pl ( 1 ) (U)SUC)),REFIS(NAT,l)) := SOMEI(NAT,[U,NAT]IS((X)SUC,(U) SUC),X,REFIS(NAT,(X)SUC)) ; SOME([U,NAT]IS((X)SUC,(U)SUC)
X*T3
:= ORIZ(IS( (X)SUC,l),SOME([U,NAT]
XcT4
IS( (X)SUC,(U)SUC)),T2) ; PROPl((X)SUC) := INDUCTION([Y,NAT]PROPl(Y),Tl, [Y,NATI[U,PROPl(Y)]T3(Y),X); PROPl(X)
)
-23 X*N N * SAT23
:= ... ; NIS(X,l) := OREZ(IS(X,l),SOME([U,NAT]IS(X, (U)SUC)),T4“-23” ,N) ; SOME([U,NAT]IS(X,(U)SUC))
Y*Z
._ ._
---
: NAT
X*F F * PROPl
:=
___
; [Y,NAT]NAT
+24
F * PROP2 X*A A*B BtPA PA * PB PBtY Y * PROP3 P B * T1 PB * T 2 P B * T3
:= ALL((Y,NAT]IS(((Y)SUC)F,((Y)F)
S W )
:= AND(IS((l)F,(X)SUC),PROPl)
:=
___
:=
..-
:=
.--
:= :=
.--
___
:= IS((Y)A,(Y)B) := ANDEl (IS((1)A ,( X) SUC) ,PROP1(A)
; PR O P ; PR O P ; [Y,NAT]NAT ; [Y,NAT]NAT ; PROPZ(A) ; PROPZ(B) ; NAT ; PR O P
,PA) ; IS((1)A,(X)SUC) := ANDEl(IS((l)B,(X)SUC),PROPl(B) ; IS((1)B,(X)SUC) ,PB) := TRISZ(NAT,(l)A,(l)B,(X)SUC,Tl, ; PROP3( 1) T2)
:=
P*T6
:=ANDE2(IS((l)B,(X)SUC),PROPl(B)
PtT7 PtT8 P*T9
:= (Y)T5
PB * T11 XtAA
X * PROP4 t t
T12 T13
* T14 X*P P*F FtPF PFtG PFtY Y * T15 PF * T16 PF t T17 Y * T18 Y * T19 Y t T20 Y * T21 PF * T22 PF * T23 PF t T24 P t T25
XtBB -24
___
Y*P PtT4 P*T5
Y * T10
773
:= AX2((Y)A,(Y)B,P) :=ANDE2(IS( (l)A,(X)SUC),PROPI(A)
,PA) J’B)
:= (Y)T6
:= TRBIS(NAT,((Y)SUC)A,( (Y)A)SUC,
((Y)B)SUC,( (Y)SUC)B,TI,TQ, SYM1S“E”(NAT,( (Y)SUC) B,((Y)B) SUC,T8)) ; PROP3((Y)SUC) := INDUCTION([Z,NAT]PROPB(Z),T3, [Z,NAT][U,PROPS(Z)]T9(Z,U),Y) ; PROPB(Y) := FISI(NAT,NAT,A,B,[Y,NAT]TlO(Y)) ; IS“E’’( [Y,NAT]NAT,A,B) := [Z,[Y,NAT]NAT][U,[Y,NAT]NAT] [V,PROP2(Z)][W,PROP2(U)]Tll(Z, U,V,W) ; AMONE([Y,NAT]NAT,[Z,[Y,NAT] NAT]PROP2(Z)) := S0ME“L” ([Y,NAT]NAT,[Z,[Y ,NAT] ; PROP NAT]PROP2(Z)) := [X,NAT]REFIS(NAT,((X)SUC)SUC); PROP1(1,SUC) :=ANDI(IS((l)SUC,(l)SUC), PROPl(l,SUC),REFIS(NAT,( 1)SUC) ,TW ; PROP2(1,SUC) := SOMEI([Y,NAT]NAT,[Z,[Y,NAT] NAT]PROP2(1,Z),SUC,T13) PROP4(1) := _ _ _ PROP4(X) := _ _ _ [Y,NAT]NAT ._ - - PROP2(F) [Y,NAT]NAT := [Y,NAT]((Y)F)SUC := --NAT :=REFIS(NAT,(Y)G) IS((Y)G,((Y)F)SUC) := ANDEl(IS((l)F,(X)SUC),PROPl(F) ,PF) ; IS((1)F,(X)SUC) := TRIS(NAT,(l)G,((l)F)SUC,((X) SUC)SUC,TlB(l),AXS((l)F,X) ( SUC,Tl6)) ; IS( (l)G,( (X)SUC)SUC) := ANDE2(IS((l)F,(X)SUC),PROPl(F) ,PF) ;PROPl(F) := (Y)T18 ; IS(((Y)SUC)F,((Y)F)SUC) := TRIS2(NAT,((Y)SUC)F,(Y)G,((Y) F)SUC,T19,T15) := TRIS(NAT,((Y)SUC)G,(((Y)SUC)F) SUC,((Y)G)SUC,T15((Y)SUC), AX2( ((Y)SUC)F,(Y)G,TBO)) := [Y,NAT]T21(Y) := ANDI(IS((l)G,((X)SUC)SUC), PROPl((X)SUC,G),T17,T22) :=SOMEI((Y,NAT1NAT,[Z,(Y,NAT] NAT]PROP2((X)SUC,.Z);G,T23)’ ; PROP4((X)SUC) :=SOMEAPP([Y,NAT]NAT,[Z,[Y,NAT] NAT]PROP2(Z),P,PROP4( (X)SUC), [Z,[Y,NAT]NAT][U,PROP2(Z)] T24(Z,U)) ; PROP4((X)SUC) := INDUCTION([Y,NAT]PROPI(Y),T14, [Y,NAT][U,PROP4(Y)]T25(Y,U),X) ; PROP4(X)
X * SATZ4
:=ONEI([Y,NAT]NAT,[Z,[Y,NAT]NAT]
PROP2“-24”(Z),AA“-24”,BB“-24”); ONE“E”([Y,NAT]NAT,[Z,[Y,NAT] NAT]AND(IS((l)Z,(X)SUC), ALL([Y,NAT]IS(((Y)SUC)Z,((Y)
Z)=JC))))
X * PLUS
:= IND([Y,NAT]NAT,[Z,[Y,NAT]NAT]
Y*PL
:= (Y)PLUS
X t T26
:= ONEAX([Y,NAT]NAT,[Z,v,NAT]
PROP2“-24”(Z),SATZ4)
NAT]PROP2(Z),SATZ4)
; [Y,NAT]NAT ; NAT
; PROP2(PLUS)
-24 X * SATZ4A
:= ANDEl(IS((l)PLUS,(X)SUC),
X * T27
:= ANDE2(IS((l)PLUS,(X)SUC),
PROP1“-24”(PLUS),T26“-24”)
; IS(PL(X,l),(X)SUC)
+*24 PROPl(PLUS),T26)
; PROPl(PLUS)
-24 Y t SATZ4B
:= (Y)T27“-24”
; rs(PL(X,(Y)suC),(PL(x,Y))suc)
:=Tll(l,PLUS(l),SUC,T26(1),T13)
; IS”E”([Y,NAT]NAT,PLUS(l),SUC)
+*24
* T28 -24 X * SATZ4C
:= FISE(NAT,NAT,PLUS(l),SUC,
T28“-24”,X)
; IS(PL(1,X),(X)SUC)
+*24 X t T29
:= T1l((X)SUC,PLUS((X)SUC),
[Y,NAT] ((Y)PLUS)SUC,T26((X) SUC),T23(BB,PLUS,T26))
; IS“E”([Y,NAT]NAT,PLUS((X)SUC),
[Y,NAT] ((Y)PLUS)SUC)
-24 Y * SATZ4D X * SATZ4E
Y t SATZ4F X * SATZ4G
:= FISE(NAT,NAT,PLUS((X)SUC),
[Z,NAT] ((Z)PLUS)SUC,T29“-24”, Y) := SYMIS(NAT,PL(X,l),(X)SUC, SATZ4A) :=SYMIS(NAT,PL(X,(Y)SUC),(PL (X,Y))SUC,SATZ4B) :=SYMIS(NAT,PL(l,X),(X)SUC, SATZ4C)
; IS(PL( (x)suc,Y),(PL(x,Y))suc) ; IS((X)SUC,PL(X,1)) ; Is((PL(x,Y))suc,PL(x,(Y)suc)) ; IS((X)SUC,PL(l,X))
Z*I I * ISPLl I * ISPL2
f25 Zt
PROPl
Y*Tl
:= rs(PL(PL(X,Y),z),PL(x,PL(Y,z)) ) ; PROP
:= TR31S(NAT,PL(PL(X,Y),l),(PL
~
ZtP P*T2 P+T3
~
,
~
~
~
~
~
~
-25 Zt
SAT25
:= INDUCTION([U,NAT]PROPl
"-25"
(U),T1"-25" ,[U,NAT][V,PROP1"-25" (U)]T3"-25"(U,V) ,Z) ; IS(PL(PL(X,Y) ,Z) ,PL(X,PL(Y,Z)
Z 1: ASSPLl
:= SAT25
))
; IS(PL(PL(X,Y),Z),PL(X,PL(Y,Z)
1) +26 Y c PROPl Y*Tl YtT2 Y*T3 YtP PrT4 PtT5 PtT6
:= IS(PL(X,Y),PL(Y,X))
:= SATZIA(Y)
:= SATZ4C(Y) := TRIS2(NAT,PL(l,Y),PL(Y,l),(Y)
; PROP ; IS(PL(Y,l),(Y)SUC) ; IS(PL(l,Y),(Y)SUC)
; PROPl(1,Y) SUC,T2,T1) --; PROPl(X,Y) := TRIS(NAT,(PL(X,Y))SUC,(PL(Y,X) )SUC,PL(Y,(X)SUC),AX2(PL(X,Y), PL(Y,X),P),SATZ4F(Y,X)) ; Is((PL(x,Y))suc,PL(Y,(x)suc)) := SATZ4D ; IS(PL((x)suc,Y),(PL(x,Y))suc) :=TRIS(NAT,PL((X)SUC,Y),(PL(X,Y) )SUC,PL(Y, (X) SUC),T5,T4) ; PROPl((X)SUC,Y)
._
- 26
Y I SAT26
Y t COMPL
,
~
PL(X,PL(Y,l)),SATZ4A(PL(X,Y)), SATZQF,ISPLP( (Y)SUC,PL(Y,l),X. SATZ4E(Y))) : PROPl(1) := - _ _ ; PROPl(2) := AXZ(PL(PL(X,Y),Z),PL(X,PL(Y,Z)) P) ; IS((PL(PL(X,Y),Z))SUC,(PL (X,PL(Y,Z)))SUC) := TR4IS(NAT,PL(PL(X,Y),(Z)SUC), (PL(PL(X,Y),Z))SUC,(PL(X,PL ~ Y , Z ~ ~ ~ S U C , P L ~ X , ~ ~ ~ ~ ~ , ~ ~ ~ ~ ~ ~ ~ , PL(X,PL(Y,(Z)SUC)),SATZ4B(PL( X,Y),Z) ,TZ,SATZIF(X,PL(Y ,Z)), ISPL2((PL(Y,Z))SUC,PL(Y,(Z) SUC),X,SATZIF(Y,Z))) ; PROPI((2)SUC)
:= INDUCTION([Z,NAT]PROP1"-26"
(Z,Y),T3"-26 ,[Z,NAT] [U,PROP1"-26b (Z,Y)]T6"-26"(Z, Y,U),X) :=SAT26
+27 Y * PROP1 XtTl X*TZ Y*P P*T3 P*T4
:=NIS(Y,PL(X,Y)) ; PROP := SYMNOTIS(NAT,(X)SUC,l,(X)AX3); NIS(l,(X)SUC) := TH4"E-NOTIS"(NAT,l,(X)SUC, PL(X,l),Tl,SATZIA) ; PROPl(1) := ; PROPl(Y) NIS( (Y)SUC,(PL(X,Y))SUC) := SATZl(Y,PL(X,Y),P)
-__
:= TH4"E-N0TIS1'(NAT,(Y)SUC,(PL
(X,Y))SUC,PL(X,(Y)SUC),T3, SATZ4B)
PROPl((Y)SUC)
-27 Y * SATZ7
:= INDUCTION([Z,NAT]PROPl"-27"
(Z),T2"-27",[Z,NAT][U,PROPl"-27" ; NIS(Y,PL(X,Y)) (Z)]T4"-27"(Z,U) ,Y) ; PROP Z* DIFFPROP := IS(X,PL(Y,Z))
+29
Y*I Y * I1 Y * 111 Y * ONEl ONEl * U U*Tl UcT2 ONEl * T3 Y*T4 ONEl * T5 Y *T6 Y t TWOl TWOl * THREEl THREEl I U U*DU DU*V V*DV DV * T6A
:= IS(X,Y)
; PROP
.._ ._
;I
; PROP := SOME([U,NAT]DIFFPROP(X,Y,U)) ; PROP := SOME([V,NAT]DIFFPROP(Y,X,V))
----:= TRIS(NAT,PL(U,X),PL(X,U), PL(Y,U),COMPL(U,X), ISPLl(U,ONEl)) := TH3"E-NOTIS"(NAT,X,PL(U,X), PL(Y,U) ,SATZ7(U,X),Tl) := TH5"L-SOME"(NAT,[U,NAT] DIFFPROP(U),[U,NAT]T2(U)) := THl"L-EC"(I,II,[Z,I]T3(Z)) := T3(Y,X,SYMIS(NAT,X,Y,ONEl)) := TH2"L-EC"(III,I,[Z,I]T5(Z)) ._ - - -
DU * T8 THREEl * T9 TWOl * T10
; IS(PL(U,X),PL(Y,U)) ; NIS(X,PL(Y,U))
;NOT(II) ; EC(I.11) ; NOT(II1) ; EC(II1,I) ; I1 ._ - - ; 111 ._ ._ - - ; NAT ._ - - ; DIFFPROP(X,Y,U) ._ ._ - - ; NAT := ___ ; DIFFPROP(Y,X,U) :=TR4IS(NAT,X,PL(Y,U),PL(PL(X,V)
,
DV * T7
: NAT
~
~
,
~
~
~
~
,
~
X),DU,ISPLl(Y,PL(X,V),U,DV), ASSPLl(X,V,U),COMPL(X,PL(V,U)) ) ; IS(X,PL(PL(V>U) 3)) := MP(IS(X,PL(PL(V,U),X)),CON, TSA,SATZ7(PL(V,U),X)) ; CON := SOMEAPP(NAT,[V,NAT]DIFFPROP (Y,X,V),THREEl,CON,[V,NAT] [DV,DIFFPROP(Y,X,V)]T7(V,DV)) ; CON := SOMEAPP(NAT,[U,NAT]DIFFPROP (U),TWOl,CON,[U,NAT] [DU,DIFFPROP(U)]T8(U,DU)) ; CON := [Z,III]TS(Z) ;NOT(III)
~
~
~
Y*T11 Y*A
:= TH1 “L-EC”(II,III,[Z,II]TlO(Z)) := TH6”L-EC3”(I,II,III,T4,T11 ,T6)
; EC(I1,III) ; EC3(1,11,111)
Y * SATZ9B
:= A“-29”
; EC3(IS(X,Y),SOME([U,NAT]
-29
DIFFPROP(X,Y.U)),SOME(
[V,NAT]DIFFPROP(Y,X,V)))
Y * MORE Y * LESS Y * SATZlOB
:= SOME([U,NAT]DIFFPROP(X,Y,U)) ; PR O P := SOME([V,NAT]DIFFPROP(Y,X,V)) ; PR O P := SATZ9B ; EC3(IS(X,Y),MORE(X,Y),
Y*M M * SATZll Y * MOREIS Y * LESSIS Y*M M t SATZl3
:=
_-_
:= M := OR(MORE,IS(X,Y)) :=OR(LESS,IS(X,Y))
._ ._
---
LESS(X,Y)) ; MORE(X,Y) ; LESS(Y,X) ; PR O P ; PR O P ; MOREIS(X,Y)
:= TH9“L-OR”(MORE,IS(X,Y),
LESS(Y,X),IS(Y,X),M,[Z,MORE] Z*I I*M M * ISMOREl
._
SATZll(Z),[Z,IS(X,Y)] SYMIS(NAT,X,Y ,Z))
; LESSIS(Y,X)
--; IS(X,Y) := - _ ; MORE(X,Z) := ISP(NAT,[U,NAT]MORE(U,Z),X,Y,
; MORE(Y,Z) MA ._ I*M ._ - - ; MOREIS(X,Z) M * ISMOREIS1 := ISP(NAT,[U,NAT]MOREIS(U,Z),X, ; MOREIS(Y,Z) Y,M,I) I*M := -__ ; MOREIS(Z,X) M 8 ISMOREIS2 :=ISP(NAT,[U,NAT]MOREIS(Z,U),X, ; MOREIS(Z,Y) Y,M,I) ._ - - Y*I ; IS(X,Y) I 1: MOREIS12 :=ORI2(MORE(X,Y),IS(X,Y),I) ; MOREIS(X,Y) Y*M := ; MORE(X,Y) M * MOREISIl := ORIl(MORE(X,Y),IS(X,Y),M) ; MOREIS(X,Y) z*u ._ - - ; NAT ._ - - U*I ; IS(X,Y) ._ I*J ._ - - ; IS(Z,U) ._ J*M --; MOREIS(X,Z) M * ISMOREIS12:= ISMOREIS2(Z,U,Y,J, ISMOREISl(X,Y,Z,I,M)) ; MOREIS(Y,U) .._ - - YtM ; MORE(X,Y) M * SATZlOG := TH3“L-OR”(LESS(X,Y),IS(X.Y),
___
.-
__
EC3E23(IS(X,Y),MORE(X,Y), LESS(X,Y),SATZlOB,M),
EC3E21(IS(X,Y),MORE(X,Y), LESS(X,Y),SATZlOB,M)) Y * SATZ18
:= SOMEI(NAT,[U,NAT]
Z*M
.-
; NOT(LESSIS(X,Y))
DIFFPROP(PL(X,Y),X,U),Y,
._
REFIS(NAT,PL( X,Y)))
---
; MORE(PL(X,Y),X) ; MORE(X,Y)
+319
___
MtU UtDU DU t T 1
:=
DU t T 2
:= TRJIS( NAT,PL(X,Z),PL(PL(U,Y),
._ ._
; NAT
--;DIFFPROP(U) := TRIS(NAT,X,PL(Y,U),PL(U,Y),DU, COMPL(Y ,U))
Z
DU t T3
~
,
; IS(X,PL(U,Y))
~
~
~
~
,ISPLl (X,PL(U,Y),Z,Tl), ASSPLl(U,Y ,Z) ,COMPL(U,PL(Y ,Z)) ) := SOMEI(NAT,[V,NAT]
, ;~
~
~
~
~
~
~
DIFFPROP(PL(X,Z),PL(Y,Z),V),U, T2)
; MORE(PL(X,Z),PL(Y,Z))
-319 M t SATZ19A
:= SOMEAPP(NAT,[U,NAT]DIFFPROP
(U),M,MORE(PL(X,Z),PL(Y,z)), [U,NAT][V,DIFFPROP(U)] T3“-319”(U,V)) ZtM
:=
___
; MORE(PL(X,Z),PL(Y,Z)) ; MOREIS(X,Y)
MtN N * T4
:=
___
; MORE(X,Y)
+*319
M*I I*T5
:= MOREISIl(PL(X,Z),PL(Y,Z),
.-
SATZ19A(N)) ...
; MOREIS(PL(X,Z) ,PL(Y ,Z)) ; IS(X,Y)
:= MOREISI2(PL(X,Z),PL(Y,Z),
ISPLl(X,Y,Z,I))
; MOREIS(PL(X,Z) ,PL(Y ,Z))
-319 M * SATZ19L
M * SATZ19M
:= ORAPP(MORE(X,Y),IS(X,Y),
MOREIS(PL(X,Z),PL(Y,Z)),M,
[U,MORE(X,Y)]T4”-319” (U),[U,IS (X,Y)]T5“-319”(U))
; MOREIS(PL( X,Z) ,PL(Y ,Z))
:= ISMOREISl2(PL(X,Z),PL(Z,X),
PL(Y,Z),PL(Z,Y),COMPL(X,Z), COMPL(Y,Z),SATZlSL)
; MOREIS(PL(Z,X),PL(Z,Y))
+324 X*N NtU UtI ItTl
:= :=
..-
___
._ ._
---
:=TRIS(NAT,X,(U)SUC,PL( l,U),I,
SATZ4G(U)) I*T2
:= ISMOREl(PL(l,U),X,l,
SYMIS(NAT,X,PL( 1,U),Tl), SATZ18( 1,U)) N*T3
;M O R E ( X , l )
:= SOMEAPP(NAT,(U,NAT]IS(X,(U)
SUC),SATZ3( X,N),MORE(X,l),
[U,NAT][V,IS(X,(U)SUC)]TZ(U,V) )
-324
; MORE(X,l)
,
~
~
~
~
~
X t SATZ24
:= TH2“L-OR”(MORE(X,l),IS(X,l),
X * SATZ24A YtM
:= SATZ13(X,l,SATZ24) :=
[U,NIS(X, l)]T3“-324”(U))
___
; MOREIS(X,l) ; LESSIS(1,X) ; MORE(Y,X)
+325
MtU U*DU DU t T1 DU * T2
___
:=
.._
---
; NAT ; DIFFPROP(Y,X,U)
:= SATZlSM(U,l,X,SATZ24(U)) ; MOREIS(PL(X,U),PL(X,l)) := ISMOREISl(PL(X,U),Y,PL(X,l), SYMIS(NAT,Y,PL(X,U),DU),Tl) ; MOREIS(Y,PL(X,l))
-325 M * SATZ25
:= SOMEAPP(NAT,[U,NAT]DIFFPROP
(Y,X,U),M,MOREIS(Y,PL(X,l)),
[U,NAT][V,DIFFPROP(Y ,X,U)] T2“-325”(U,V))
; MOREIS(Y,PL(X,l))
Y*L L * SATZ25B
:= - - ; LESS(Y,X) := SATZl3(X,PL(Y,l),SATZ25(Y,X,L)
*P P*N
.-._
NrM M t LBPROP
._
NtLB N * MIN PtS
:= ALL([X,NAT]LBPROP“-327”(X)) ; PROP := AND(LB,(N)P) ; PROP
1
._
---
---
; LESSIS(PL(Y,l),X) ; [X,NAT]PROP
: NAT
+327
---
:= IMP((M)P,LESSIS(N,M))
; NAT ; PROP
-327
:=
__-
:=
___
; SOME(P)
+*327
S*N NIT1 ScT2 StL L*Y Y +YP YP * T3 YP t T4 YP * T5 YP * T6 YP t T7 LtT8
:= [X,(N)P]SATZSIA(N)
:= [X,NAT]Tl(X)
; NAT ; LBPROP(1,N)
;LB(1) ; [X,NAT]LB(X) ; NAT : (Y)P := SATZ18(Y,1) ; MORE(PL(Y,l),Y) : NOT(LESSIS(PL(Y,l),Y)) := SATZlOG(PL(Y,l),Y,T3) := TH4“L-IMP”( (Y)P,LESSIS(PL(Y,l) ; NOT(LBPROP(PL(Y,l),Y)) ,Y),YP,W := TH1“L-ALL”(NAT,[X,NAT] ; NOT(LB(PL(Y,l))) LBPROP(PL(Y,l),X),Y,T5) := MP(LB(PL(Y,l)),CON,(PL(Y,l))L, : CON T6) := SOMEAPP(NAT,P,S,CON,[X,NAT] : CON [Y,(X)PlT7(X,Y))
._
:= :=
---
___ ___
StN
._
--.
; NON(NAT,[X,NAT]AND(LB(X),
NtM MtL LtT9
:=
___
; NAT
---
;W M ) ; NOT(AND(LB(M) ,NOT(LB(PL
L t T10 L t T11 N t T12 S t T13 StM MIA At T14 At T15 A * NMP NMP * N NtNP NP t T16 NP * T17
._
:= (M)N
:= ET(LB(PL(M,l)),
NP * T19 NMP * T20 NMP * T21 A * T22 A t T23
(M.1)))))
TH3"L-AND"(LB(M), ; LB(PL(MJ)) NOT(LB(PL(M,l))),Tg,L)) := ISP(NAT,[X,NAT]LB(X),PL(M,l), ; LB((M)SUC) (M)SUC,TlO,SATZ4A(M)) := [X,NAT]INDUCTION([Y,NAT]LB(Y), T2,[Y,NATl[Z,LB(Y)]Tll(Y,Z),X); (X,NAT]LB(X) := [X,NON(NAT,[X,NAT]AND(LB(X), NOT(LB(PL(X,1)))))]T8(T12(X)) ; SOME([X,NAT]AND(LB(X), NOT(LB(PL(X,1))))) ._ ; NAT ._ - - := .-. ; AND(LB(M),NOT(LB(PL(M,l)))) := ANDEi(LB(M),NOT(LB(PL(M,l))), ;L B W A) := ANDEz(LB(M),NOT(LB(PL(M,l))), ; NOT(LB(PL(M,l))) A) := --; NOT((M)P) ._ ._ - - ; NAT ._ - - ; (N)P ; LESSIS(M,N) := MP((N)P,LESSIS(M,N),NP,(N)T14) :=TH3"L-IMP"(IS(M,N),(M)P,NMP, [X,IS(M,N)]ISP(NAT,P,N,M,NP,
NP t T18
NOT(LB(PL(X,1)))))
SYMIS(NAT,M,N,X))) := OREl(LESS(M,N),IS(M,N),T16, T17) := SATZ25B(N,M,T18) := [X,NAT][Y,(X)P]T19(X,Y) := MP(LB(PL(M,l)),CON,T20,TlS) :=ET((M)P,[X,NOT((M)P)]T21(X)) := ANDI(LB( M),(M)P,T14,T22)
; NOT(IS(M,N))
LESS(M,N) LESSIS(PL(M,l),N) LB(PL(M,1)) CON (M)P MIN(M)
-327 S * SATZ27
:= TH6"L-SOME"(NAT,[X,NAT]
AND(LB(X),NOT(LB(PL(X,l)))), [X,NAT]MIN(X),T13"-327",
[X,NAT][Y,AND(LB(X), NOT(LB(PL( X,l))))]T23"-327"(X, Y)) -N -LANDAU -EQ -ST -E -L
; SOME([X,NAT]MIN(P,X))
PART E Verification
A Verifying Program for Automath

I. Zandleven¹
0. SUMMARY
This paper describes the Automath verifier which is being operated [in the beginning of the seventies] at the Technological University at Eindhoven. The description is given in terms of a number of procedures, written in an ALGOL-like language. The contents are:

1. General remarks.
2. The description language.
3. The translator.
4. Some basic notions and procedures.
5. Substitution.
6. Reductions.
7. CAT and DOM.
8. Definitional equality.
9. Correctness of expressions.
10. Correctness of lines.
11. A paragraph system.
12. Final remarks.
For the theoretical background we refer to the papers of Prof. de Bruijn, D. van Daalen and R. Nederpelt: [de Bruijn 70a (A.2)], [de Bruijn 73b], [van Daalen 73 (A.3)] and [Nederpelt 73 (C.3)].
1. GENERAL REMARKS
1.1. The aim of this paper is to give a rough description of how the AUT-68 and AUT-QE verifier is constructed and how it works. Most of the procedures are much simplified for the sake of clarity and so as not to bother the reader with topics like memory organization, error messages etc.

¹The author is employed in the Automath project and is supported by the Netherlands Organization for the Advancement of Pure Science (Z.W.O.).
1.2. The whole verifier is embedded in a conversational system (operating via a terminal) in order to control the amount of work the program might do in certain cases (mostly when an error in the Automath text has been made). The parts of the procedure texts whose execution is (partly) controlled by human intervention are placed between the brackets ?( and )?. Furthermore the user has the opportunity to debug the text on-line.

1.3. Notations

1.3.1. Expressions are denoted by A, B, ..., A1, A2, ... etc.

1.3.2. Syntactical identity is denoted by ≡.
1.3.3. Bound variables in abstraction expressions are denoted by x,y, ...; thus e.g. [z : A] B. 1.3.4. Expression strings are denoted by C, I?, ... . 1.3.5. An expression, occurring in an expression string C is denoted by C with a subscript; thus C = (El, ...,C,), where Ci are the expressions occurring in C ( i = 1,...,n). 1.3.6. Each non-empty string C can be divided into two parts: C+ := the last expression of C C- := the rest of C (which may be empty)
.
Example. If C = A, B, C ( D ,E ) ,F ( G ,H ) then C+ = F ( G ,H ) , C- = A , B , C ( D ,E ) .
1.3.7. The composition of a string is denoted by the parenthesis (( and )) = ((C+, C-)).
e.g.: C
1.3.8. An indicator string [van Daalen 73 ( A . 3 ) , 2.131 is denoted by I , and a context [van Daalen 7 3 (A.3), 2.21 by 0. 1.3.9. Sometimes, in theoretical discussions, the notation of D. van Daalen is used [van Daalen 73 ( A . 3 ) , 5.31.
2. THE DESCRIPTION LANGUAGE
2.1. The language used for the description of the verifying procedures is based upon ALGOL '60.
2.2. Several types (in the sense of ALGOL '60) are added, e.g. expression, definedname, etc.
2.3. A construction case ... of begin ... end is added, to avoid repeated if ... then ... else constructions. The values of the case selector are placed before the entries, as labels.

Examples. The statement

if color = red then paint (river valley)
else if color = white then paint (Christmas)
else if color = blue then paint (moon)
else paint (nothing) ,
may now be written as: case color of begin red: paint (river valley); white: paint (Christmas); blue: paint (moon); otherwise: paint (nothing); end;
Another possibility is: paint (case color of begin red: river valley; white: Christmas; blue: moon; otherwise: nothing; end);
So the case-construction may be used for both statement selection and assignment selection.
2.4. Some non-ALGOL symbols are used, e.g. ⊢, ∅, ..., and sometimes procedure identifiers are defined as infix, e.g. d OLDER THAN b, which would be written OLDERTHAN(d, b) in correct ALGOL.
2.5. Each procedure, whose identifier is written in capitals or non-ALGOL symbols, is explained.
2.6. No use is made of the parameter device: value. If an actual parameter has to be evaluated, this is done once only at the beginning of the procedure. All further calls are calls by reference to a program variable.
3. THE TRANSLATOR
Before Automath texts are presented to the verifier, they are passed through a translator. One may consider this translator as a preprocessor, checking the context-free part of the Automath syntax (parentheses, commas etc.), coding the identifier-paragraph identification (see 11), completing the expressions written in shorthand, etc.
4. SOME BASIC NOTIONS AND PROCEDURES
4.1. Shapes
Most of the procedures must be able to distinguish the different characteristic forms in which expressions appear. For this purpose we introduce the notion shape, which represents the outermost characteristic form of an expression. E.g. the expression

((A) B) C([x : D] E)

has the “application shape”, symbolically denoted by an application expression such as (P) Q or (E₁) E₂.
4.1.1. The shapes, and their symbolism, which are used, are:

shape               symbolism
type                type
prop                prop
variable            variable
bound variable      boundvar
constant shape      d(Σ)
application shape   (A) B
abstraction shape   [x : A] B
4.1.2. When using this symbolism for the shapes, we will permit ourselves to use the sub-elements of it as expressions on which to operate (without explicit declaration of and assignment to the program variables). So we may write, for example: if shape(E) = [x : A] B then domain := A else ... .
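As an illustration only (this tuple encoding is our own assumption, not the verifier's internal coding), a shape discriminator in Python can simply inspect an outermost tag:

```python
# Toy encoding of expressions (an assumption for illustration):
#   ("type",) / ("prop",)          1-expressions
#   ("var", name)                  free variable
#   ("boundvar", n)                bound variable, coded by number
#   ("const", d, args)             constant shape d(Sigma)
#   ("appl", A, B)                 application shape (A) B
#   ("abstr", x, A, B)             abstraction shape [x : A] B

def shape(E):
    # the outermost characteristic form of an expression
    return E[0]

# ((A) B) C(...) has the application shape, whatever its sub-expressions are:
expr = ("appl", ("appl", ("var", "A"), ("var", "B")), ("const", "C", []))
```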
4.2. Primitive procedures
Often, during the verification process of a book B we need the indicator string, the middle expression or the category expression of a certain line of B. Each line in the book B is uniquely indicated by the name introduced in the identifier part of that line (possibly with a paragraph reference, see 11). These names will belong to the ALGOL-type definedname. Because an indicator string may be considered as a string of expressions, we may introduce the
4.2.1. expressionstring procedure INDSTR(d); definedname d;
comment INDSTR becomes the indicator string of the line in which d is defined;
For the middle and category expression procedures:
4.2.2. expression procedure MIDDLE(d); definedname d;
comment MIDDLE becomes the middle expression of the line in which d is defined. Of course this procedure is only allowed for those d which represent an abbreviation;
4.2.3. expression procedure CATEGORY(d); definedname d;
comment CATEGORY becomes the category expression of the line in which d is defined (both for EB lines, PN lines and abbreviations);
The bodies of these procedures cannot be explained without going into details of memory organization, a subject which is beyond the scope of this note.
4.2.4. Another primitive procedure, OLDER THAN, will be explained in 8.3.
5. SUBSTITUTION
5.1. We have introduced two different shapes (and codings) for variables to be able to distinguish properly between all the variables occurring in an expression. By “shape = variable” we code the variables which occur in indicator strings (these variables are sometimes called parameters). By “shape = boundvar” we code the variables which occur in abstractors. Furthermore, in one Automath book, all binding variables (i.e. variables occurring as x in [x : A] ...) get different code-numbers. So the substitution becomes a simple replacement operation. Now there is only one possible way to get a so-called clash of variables, namely in the following example. Suppose we have an expression like

[x : A] (..., (B(x)) [y : C] [x : A] D(y), ...) .

If we want to reduce the expression between the dots (by β-reduction), we will obtain the expression

[x : A] D(B(x))

and we see that the x in D(B(x)) is bound by the wrong abstractor now. It is claimed by the author that by this coding system no clash (conflict, confusion) of variables arises during the verification process of Automath.

5.2. Substitution for free variables
At first we define a procedure SUBST, which will replace free variables (shape = variable) by expressions, as follows:
Let v be a string of free variables (mutually distinct), let Γ be an equally long string of expressions,
let E be an expression. The procedure SUBST constructs a new expression by replacing in E all vᵢ by the corresponding Γᵢ. The procedure (function) identifier SUBST will become the resulting expression: [v₁, ..., vₙ/Γ₁, ..., Γₙ] E (see [van Daalen 73 (A.3)] for this notation). We call this procedure by e.g. SUBST(v, Γ, E).
The string analogue of SUBST(v, Γ, E), STRINGSUBST(v, Γ, Σ), means: replace, in all Σⱼ, all vᵢ by the corresponding Γᵢ.
5.2.1. Procedure text
5.2.1.1. expression procedure SUBST(v, Γ, E); expression E; expressionstring v, Γ;
comment shape(vᵢ) must be variable;
SUBST := case shape(E) of
begin
variable : if ∃i₀(vᵢ₀ ≡ E) then Γᵢ₀ else E;
d(Σ) : d(STRINGSUBST(v, Γ, Σ));
(A) B : (SUBST(v, Γ, A)) SUBST(v, Γ, B);
[x : A] B : [x : SUBST(v, Γ, A)] SUBST(v, Γ, B);
otherwise : E;
end;

5.2.1.2. expressionstring procedure STRINGSUBST(v, Γ, Σ); expressionstring v, Γ, Σ;
comment shape(vᵢ) must be variable; STRINGSUBST is the string analogue of SUBST;
STRINGSUBST := if Σ ≡ ∅ then ∅
else ((STRINGSUBST(v, Γ, Σ⁻), SUBST(v, Γ, Σ⁺)));

5.3. Substitution for bound variables (shape = boundvar)
This is like the substitution for free variables (apart from the fact that only one boundvar at a time is substituted for). Therefore we will only give the procedure heading.
5.3.1. expression procedure BOUNDSUBST(x, A, E); boundvar x; expression A, E;
comment A is either an expression or another boundvar to substitute for x in E;
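A minimal Python sketch of SUBST may make the "simple replacement" point concrete. The tuple encoding is our own assumption (the real procedure operates on the verifier's coded expressions), and no capture check is needed precisely because binding variables carry globally unique code-numbers:

```python
def subst(v, gamma, E):
    """Replace each free variable v[i] in E by the expression gamma[i].
    Toy encoding (assumed): ("var", name), ("boundvar", n),
    ("const", d, args), ("appl", A, B), ("abstr", x, A, B)."""
    tag = E[0]
    if tag == "var":
        for vi, gi in zip(v, gamma):
            if vi == E:
                return gi
        return E
    if tag == "const":
        return ("const", E[1], [subst(v, gamma, s) for s in E[2]])
    if tag == "appl":
        return ("appl", subst(v, gamma, E[1]), subst(v, gamma, E[2]))
    if tag == "abstr":
        # bound variables are coded apart from free ones: no clash can arise
        return ("abstr", E[1], subst(v, gamma, E[2]), subst(v, gamma, E[3]))
    return E
```

The string analogue is simply a map of subst over all expressions of the string, as in STRINGSUBST.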
6. REDUCTIONS

The reductions involved in the verification of correctness of ≐-formulas (cf. 8) are α-, β-, η- and δ-reduction. See also [van Daalen 73 (A.3), 2.12 and 6.2].
6.1. α-reduction
To perform an α-reduction one can easily use the procedure BOUNDSUBST. For an expression [x : A] B, where x is to be replaced by y (say), we have simply to construct

[y : A] BOUNDSUBST(x, y, B)

(y must be new of course).
6.2. β-reduction
The β-reductor is written in the form of a procedure with two parameters, A and B. A typical use of this procedure is e.g.

if E₁ >β E₂ then A := E₂ else ... ,

where >β represents a boolean procedure.
If a β-reduction is applicable to A (so A ≡ (A₁) [x : A₂] A₃) then B becomes [x/A₁] A₃, and the procedure identifier gets the value true. If A has the form (A₁) A₂, where A₂ does not have an abstraction shape, so that no direct β-reduction is possible, then the procedure tries to reduce A₂ with β- and/or δ-reduction so as to obtain the form [x : A₃] A₄. At that point the actual β-reduction can be carried out.

6.2.1. Procedure text
6.2.1.1. boolean procedure A >β B;
expression A, B; comment if A is reducible by β-reduction, then B becomes the β-reduct of A;
begin if shape(A) = (P) Q then
  begin boolean possible;
  possible := shape(Q) = [x : R] T;
  if not possible then
    begin boolean continue; continue := true;
    while continue do
    ?( begin case shape(Q) of
       begin
       (R) S : continue := Q >β U;
       d(Σ) : continue := Q >δ U;
       otherwise : continue := false;
       end;
       if continue then
         begin Q := U; possible := shape(Q) = [x : R] T;
         continue := not possible;
         end;
       end )?;
    end;
  if possible then
    begin B := BOUNDSUBST(x, P, T); >β := true;
    end
  else >β := false;
  end
else >β := false;
end;

6.3. η-reduction
The whole procedure runs under control of the boolean “eta reduction allowed”, which may be set or reset by the user. When reset (eta reduction allowed = false), the verificator can only use α-, β- and δ-reduction. Interestingly enough, in the Automath texts checked so far, η-reduction has almost never been used.
The η-reductor is written in the same form as the β-reductor: A >η B. We have for A the following cases.
(i) A ≡ [x : P] (Q) R.
(a) If Q ≢ x then the procedure first tries to reduce (Q) R.
(b) If Q ≡ x, but x occurs in R, then the procedure first tries to remove x in R by reducing R.
(c) If Q ≡ x and x does not occur in R, then the η-reduct (B) becomes R and >η gets the value true.
(ii) A ≡ [x : P] Q, Q ≡ d(Σ) or Q ≡ [y : R] S. Now the procedure first tries to reduce Q, and afterwards tests if an η-reduction is possible.
In either case if no η-reduction is possible, the procedure identifier >η gets the value false.
There appear two procedures in >η which must still be explained. Firstly there is the procedure ≐, to declare as boolean procedure E₁ ≐ E₂; where E₁ and E₂ are expressions. This procedure investigates whether E₁ and E₂ are definitionally equal, and is described in 8. Secondly there is the procedure OCCURS IN, which searches an expression for occurrences of a specific bound variable. This procedure is defined as follows.

6.3.1. Procedure text for OCCURS IN
6.3.1.1. boolean procedure x OCCURS IN E; boundvar x; expression E;
OCCURS IN := case shape(E) of
begin
boundvar : x ≡ E;
d(Σ) : ∃i x OCCURS IN Σᵢ;
(A) B : x OCCURS IN A or x OCCURS IN B;
[y : A] B : x OCCURS IN A or x OCCURS IN B;
otherwise : false;
end;

6.3.2.1. Procedure text for the η-reductor
6.3.2.1.1. boolean procedure A >η B;
expression A, B; comment if A is reducible by η-reduction then B becomes the η-reduct of A;
if eta reduction allowed then
begin if shape(A) = [x : P] Q then
  case shape(Q) of
  begin
  (R) T : if x ≐ R then
        if not x OCCURS IN T then
          begin >η := true; B := T end
        else ?( if T >δ T₁ and not x OCCURS IN T₁ then
          begin >η := true; B := T₁ end
        else if Q >β Q₁ then >η := [x : P] Q₁ >η B
        else >η := false )?
      else if Q >β Q₁ then >η := [x : P] Q₁ >η B
      else >η := false;
  d(Σ) : if Q >δ Q₁ then >η := [x : P] Q₁ >η B else >η := false;
  [y : R] T : if Q >η Q₁ then >η := [x : P] Q₁ >η B else >η := false;
  otherwise : >η := false;
  end
else >η := false
end
else >η := false;
6.3.2.2. The part between ?( and )? has not yet been implemented. Although such cases are easily constructed (e.g. [x : X] (x) f(x, y), where f(x, y) >δ y), in practice this has never occurred up to now.

6.4. δ-reduction
The δ-reductor is written in the same way as the β- and η-reductor, and tries to perform a single δ-reduction on the presented expression. If the presented expression has shape d(Σ), the procedure takes the middle expression of the line where d is defined (= MIDDLE(d)) and replaces the free variables in it (i.e. the elements of INDSTR(d)) by the expressions of Σ.
6.4.1. Procedure text
6.4.1.1. boolean procedure A >δ B;
expression A, B; comment if A is reducible by δ-reduction then B becomes the δ-reduct of A;
begin if shape(A) = d(Σ) then
  if d represents an abbreviation then
    begin >δ := true; B := SUBST(INDSTR(d), Σ, MIDDLE(d));
    end
  else >δ := false
else >δ := false;
end;
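To make the one-step reductors concrete, here is a hedged Python sketch of a single β-step and a single δ-step over a toy tuple encoding of our own (the book is a plain dict mapping constant names to their indicator strings and middle expressions; none of these names come from the actual implementation):

```python
def boundsubst(x, A, E):
    # replace bound variable number x by A in E (toy tuple encoding)
    tag = E[0]
    if tag == "boundvar":
        return A if E[1] == x else E
    if tag == "const":
        return ("const", E[1], [boundsubst(x, A, s) for s in E[2]])
    if tag == "appl":
        return ("appl", boundsubst(x, A, E[1]), boundsubst(x, A, E[2]))
    if tag == "abstr":
        return ("abstr", E[1], boundsubst(x, A, E[2]), boundsubst(x, A, E[3]))
    return E

def subst_one(p, a, E):
    # replace the free variable p by a (helper for delta_step)
    tag = E[0]
    if tag == "var":
        return a if E == p else E
    if tag == "const":
        return ("const", E[1], [subst_one(p, a, s) for s in E[2]])
    if tag == "appl":
        return ("appl", subst_one(p, a, E[1]), subst_one(p, a, E[2]))
    if tag == "abstr":
        return ("abstr", E[1], subst_one(p, a, E[2]), subst_one(p, a, E[3]))
    return E

def beta_step(E):
    # one beta-step: (A1) [x : A2] A3  becomes  A3 with A1 put in for x
    if E[0] == "appl" and E[2][0] == "abstr":
        A1, (_, x, _A2, A3) = E[1], E[2]
        return True, boundsubst(x, A1, A3)
    return False, E

def delta_step(E, book):
    # one delta-step: unfold the abbreviation d(args) using the book,
    # book: name -> (indicator string, middle expression)
    if E[0] == "const" and E[1] in book:
        params, middle = book[E[1]]
        for p, a in zip(params, E[2]):
            middle = subst_one(p, a, middle)
        return True, middle
    return False, E
```

Each function returns a pair (applied?, reduct), mirroring the boolean procedure identifiers >β and >δ.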
7. CAT AND DOM

As pointed out in [van Daalen 73 (A.3), 6.4], we need two functions, CAT and DOM, to compute mechanically the category (type) and the domain of an expression respectively.
7.1. The “mechanical type” function CAT is defined by induction on the length of the expressions as follows. Let B be a correct book and ξ a correct context.
(i) If ξ = x₁ ∈ α₁, ..., xₙ ∈ αₙ then CAT(xᵢ) := αᵢ.
(ii) If d is an abbreviation constant, defined in a line of B by d := A ∈ B, with indicator string I, then CAT(d(Σ)) := [I/Σ] B.
(iii) CAT((A) B) := if CAT(B) ≡ [x : P] Q then [x/A] Q else (A) CAT(B).
(iv) CAT([x : A] B) := [x : A] CAT(B).
CAT is not defined for variables with shape = boundvar (see 5.1), because in the verification process there is no need for it (9.5). Further CAT is not defined for 1-expressions, of course. It is easy to see that, if the argument for CAT is a correct expression, the outcome will again be correct.
7.2. The procedure text of CAT reflects the given definition completely.
7.2.1. expression procedure CAT(E); expression E;
CAT := case shape(E) of
begin
variable : CATEGORY(E);
d(Σ) : SUBST(INDSTR(d), Σ, CATEGORY(d));
(A) B : if shape(CAT(B)) = [x : P] Q then BOUNDSUBST(x, A, Q) else (A) CAT(B);
[x : A] B : [x : A] CAT(B);
otherwise : undefined;
end;

7.3. The “mechanical domain” function DOM
This procedure has to yield (where possible), for a given expression A, an expression α, such that ⊢ A ∈ [x : α] β or ⊢ A ≐ [x : α] β. For expressions A of the form [x : B] C, the computing of the domain is trivial: DOM(A) = B. If A is a variable, we may compute the domain of the category of A. More difficult is the case where A has the shape d(Σ) or the shape (B) C.
If we try to reduce A, we may end up with a PN (e.g.: d(Σ) ≥δ f(Γ), f := PN). On the other hand, if we take the category of A by computing CAT(A), we may obtain type or [x₁ : α₁] ... [xₙ : αₙ] type. To deal with this problem we use the following strategy. At first CAT(A) is computed, and presented to DOM. (N.B. This is a recursive call, so possibly CAT(CAT(A)) is computed.) If DOM(CAT(A)) does not yield a domain at all, then a δ- or β-reduction on A is carried out (if possible), and the reduct is again presented to DOM. Since only 1-, 2- and 3-expressions are investigated, the whole process can be given by the following tree figure:
Figure 1

7.3.1. Procedure text
7.3.1.1. expression procedure DOM(A); expression A;
case shape(A) of
begin
[x : B] C : DOM := B;
variable : DOM := DOM(CATEGORY(A));
d(Σ), (B) C : begin D := DOM(CAT(A));
    if undefined(D) then
      if A >δ A₁ then DOM := DOM(A₁)
      else if A >β A₁ then DOM := DOM(A₁)
      else DOM := undefined
    else DOM := D
    end;
otherwise : DOM := undefined;
end;
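A hedged sketch of this strategy in Python (our own toy model: cat_of, delta_step and beta_step are passed in as functions so the sketch stays self-contained, and None plays the role of undefined):

```python
def dom(A, cat_of, delta_step, beta_step):
    """Mechanical domain DOM (sketch): the alpha with A in [x : alpha] beta,
    or None when undefined. cat_of computes CAT; the *_step functions
    return (applied?, reduct), as one-step reductors."""
    tag = A[0]
    if tag == "abstr":
        return A[2]                            # [x : B] C  gives  B
    if tag == "var":
        return dom(cat_of(A), cat_of, delta_step, beta_step)
    if tag in ("const", "appl"):
        # first try the category (possibly computing CAT(CAT(A)) recursively)
        D = dom(cat_of(A), cat_of, delta_step, beta_step)
        if D is not None:
            return D
        applied, A1 = delta_step(A)            # then try a delta-step on A
        if applied:
            return dom(A1, cat_of, delta_step, beta_step)
        applied, A1 = beta_step(A)             # then a beta-step on A
        if applied:
            return dom(A1, cat_of, delta_step, beta_step)
    return None
```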
8. DEFINITIONAL EQUALITY

To verify the correctness of a given ≐-formula we will use the Church-Rosser theorem: if A ≐ B then A ≥ C ≤ B for some C (see also [van Daalen 73 (A.3), 6.3.1]).
This definition is the guide for the procedure ≐ which we will introduce here.

8.1. Description of ≐
The type of the procedure is boolean, and the identifier will be written in infix notation, viz. A ≐ B (in the same way as for >β, >δ, etc.).
Roughly speaking, in order to check A ≐ B, the procedure tries to reduce A and/or B until either the two expressions are identical or the decision A ≢ B can be made. It is not always necessary for both complete expressions to be present during the whole reduction process. If, for example, A ≡ d(Σ₁, Σ₂, Σ₃) and B ≡ d(Σ₁, Σ₄, Σ₃) then the procedure needs only parts of both expressions, namely Σ₂ and Σ₄, and will check Σ₂ ≐ Σ₄. So, in general, the procedure uses recursive calls, applied to sub-expressions, following the monotonicity rules described in [van Daalen 73 (A.3), 5.5.6].
Recursive calls are also used for the reduction sequences. Firstly the procedure tries, if necessary, to reduce one of the expressions A and B. Which is reduced is a matter of strategy. If one of the two expressions is reduced, one could continue the equality-check by using an iterative or a recursive method. A recursive method is chosen in order to make the algorithm more readable.

Example. If A ≡ d(Σ) and B ≡ (P) Q then the procedure first tries to reduce B by β-reduction. If this succeeds, and the outcome is B₁, then the definitional equality of A and B follows from that of A and B₁. Otherwise the procedure tries to reduce A to A₁ (say) and checks A₁ ≐ B. If this also fails, then the procedure identifier ≐ gets the value false.
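The recursive scheme can be illustrated by a tiny Python equality checker for nullary constants only (our own drastic simplification of the real procedure: the book maps each constant to its definition-order index and its unfolding, None meaning a primitive notion):

```python
def delta(E, book):
    # unfold a constant; book: name -> (age index, unfolding or None)
    if E[0] == "const" and book[E[1]][1] is not None:
        return True, book[E[1]][1]
    return False, E

def defeq(E1, E2, book):
    """Definitional equality by joint reduction (nullary-constant fragment)."""
    if E1 == E2:
        return True
    if E1[0] == "const" and E2[0] == "const":
        d, b = E1[1], E2[1]
        if d == b and all(defeq(a, c, book) for a, c in zip(E1[2], E2[2])):
            return True      # monotonicity: equal argument strings suffice
        # otherwise reduce the younger constant first (cf. OLDER THAN, 8.3)
        younger = E1 if book[d][0] > book[b][0] else E2
        other = E2 if younger is E1 else E1
        applied, red = delta(younger, book)
        return applied and defeq(red, other, book)
    if E1[0] == "const":
        applied, red = delta(E1, book)
        return applied and defeq(red, E2, book)
    if E2[0] == "const":
        applied, red = delta(E2, book)
        return applied and defeq(E1, red, book)
    return False
```

On the book x := PN, p := x, q := p, the check defeq(p, q) first unfolds the younger constant q to p, after which the two sides are identical.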
8.2. Type inclusion
If we want to verify A ∈ B, we check ⊢ A and ⊢ B, compute CAT(A) and check CAT(A) ≐ B; so CAT(A) is the first parameter and B is the second parameter of the procedure call.
In order to accept type inclusion as well, we add a slight extension to ≐, namely:

[x : A] type ≐ type

will be accepted as correct, but not

type ≐ [x : A] type .
The same holds for prop. So the procedure is no longer symmetrical for 1-expressions. (Notice that calls are sometimes made with reversed order of the arguments of ≐, but as one can see in the procedure text these cases can never refer to 1-expressions.) Now the definition of ≐ is exactly the same as that given in [van Daalen 73 (A.3), 6].
8.3. OLDER THAN
The procedure ≐ needs, in one special case, namely d(Σ) ≐ b(Γ) and d ≢ b, the boolean procedure OLDER THAN, to decide which of d and b must be reduced. It seems a good strategy to start off by reducing the younger of the two, i.e. the constant which was defined the more recently, for in this way we have a chance of reducing it to the other.
8.3.1. boolean procedure d OLDER THAN b; definedname d, b;
comment OLDER THAN := the line in which b is defined appears later in the book than the line in which d is defined;
8.4. Procedure text of ≐
8.4.1. boolean procedure E₁ ≐ E₂; expression E₁, E₂;
≐ := case (shape(E₁), shape(E₂)) of
begin
(type, type) : true;
(type, otherwise) : false;
(prop, prop) : true;
(prop, otherwise) : false;
(variable, variable) : E₁ ≡ E₂;
(variable, d(Σ)) : if E₂ >δ E₂₂ then E₁ ≐ E₂₂ else false;
(variable, (A) B) : if E₂ >β E₂₂ then E₁ ≐ E₂₂ else false;
(variable, [x : A] B) : if E₂ >η E₂₂ then E₁ ≐ E₂₂ else false;
(variable, otherwise) : false;
(boundvar, boundvar) : E₁ ≡ E₂;
(boundvar, otherwise) : consider (variable, shape(E₂));
(d(Σ), b(Γ)) : if d ≡ b then
      if Σ ≐ˢ Γ then true
      else ?( if E₁ >δ E₁₁ then E₁₁ ≐ E₂ else false )?
    else if d OLDER THAN b then
      if E₂ >δ E₂₂ then E₁ ≐ E₂₂ else false
    else if E₁ >δ E₁₁ then E₁₁ ≐ E₂ else false;
(d(Σ), (A) B) : if E₂ >β E₂₂ then E₁ ≐ E₂₂ else
      if E₁ >δ E₁₁ then E₁₁ ≐ E₂ else false;
(d(Σ), [x : A] B) : if E₂ >η E₂₂ then E₁ ≐ E₂₂ else
      if E₁ >δ E₁₁ then E₁₁ ≐ E₂ else false;
(d(Σ), otherwise) : consider reverse (i.e. (shape(E₂), shape(E₁)));
((A) B, (C) D) : if A ≐ C and B ≐ D then true
      else ?( if E₁ >β E₁₁ then E₁₁ ≐ E₂ else
              if E₂ >β E₂₂ then E₁ ≐ E₂₂ else false )?;
((A) B, [x : C] D) : if E₁ >β E₁₁ then E₁₁ ≐ E₂ else
      if E₂ >η E₂₂ then E₁ ≐ E₂₂ else false;
((A) B, otherwise) : consider reverse;
([x : A] B, [y : C] D) : if A ≐ C then B ≐ BOUNDSUBST(y, x, D) else false;
([x : A] B, otherwise) : consider reverse;
end;
8.4.2. boolean procedure Σ₁ ≐ˢ Σ₂; expressionstring Σ₁, Σ₂;
comment ≐ˢ is the string analogue of ≐;
≐ˢ := if Σ₁ ≡ ∅ then Σ₂ ≡ ∅
else Σ₁⁻ ≐ˢ Σ₂⁻ and Σ₁⁺ ≐ Σ₂⁺;
9. CORRECTNESS OF EXPRESSIONS (⊢)
9.1. Correctness of an expression is checked by the boolean procedure “⊢”, operating on an expression (say E) and the indicator string (say I) belonging to E. A procedure call is written like I ⊢ E. Mentioning I is necessary, on account of the free variables in E which must all appear in I. Two non-trivial cases arise:
(1) If shape(E) = (A) B, then the “applicability” (let us say) of B to A has to be checked. This is done by looking at: CAT(A) ≐ DOM(B) (see also [van Daalen 73 (A.3), 6.4.2.3]).
(2) If shape(E) = d(Σ) then:
firstly : all Σᵢ must be correct,
secondly : all Σᵢ must have the correct categories.
In case 2 there is a difficulty. Let us consider the following book:
∅ * α := EB ; type
α * a := EB ; α
a * f := PN ; type
∅ * P := EB ; type
P * b := EB ; P
b * g := f(P, b) ; type

Now: (P, b) ⊢ f(P, b); nevertheless the string of types expected by f is not definitionally equal to the string of given types: the expected categories are (type, α), the given ones (type, P).
We may conclude that after checking the definitional equality of the first two categories, we have to replace, in the category string of (α, a), the variable α by P. This replacement (substitution) is, in a more general way, done by the procedure CORRECTCATS (see also [van Daalen 73 (A.3), 2.5 and 5.4.6]).
9.2. boolean procedure CORRECTCATS(Σ, I); expressionstring Σ, I;
CORRECTCATS := if Σ ≡ ∅ then I ≡ ∅
else CORRECTCATS(Σ⁻, I⁻) and CAT(Σ⁺) ≐ SUBST(I⁻, Σ⁻, CAT(I⁺));
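In Python the same right-to-left recursion might be sketched as follows (toy model of our own; cat_of and defeq stand for the procedures of sections 7 and 8 and are passed in as functions):

```python
def replace(E, target, new):
    # naive sub-expression replacement over a toy tuple encoding
    if E == target:
        return new
    if E[0] == "const":
        return ("const", E[1], [replace(s, target, new) for s in E[2]])
    if E[0] == "appl":
        return ("appl", replace(E[1], target, new), replace(E[2], target, new))
    return E

def correct_cats(sigma, cats, cat_of, defeq):
    """sigma: the argument expressions of d(Sigma); cats: the indicator
    string as (parameter, expected category) pairs. Mirrors CORRECTCATS:
    check the initial segment first, then compare the category of the last
    argument with the expected category in which the earlier parameters
    have been replaced by the earlier arguments."""
    if not sigma:
        return not cats
    if len(sigma) != len(cats):
        return False
    if not correct_cats(sigma[:-1], cats[:-1], cat_of, defeq):
        return False
    expected = cats[-1][1]
    for (param, _), arg in zip(cats[:-1], sigma[:-1]):
        expected = replace(expected, param, arg)   # SUBST(I-, Sigma-, CAT(I+))
    return defeq(cat_of(sigma[-1]), expected)
```

On the book of 9.1 this accepts f(P, b): the category of b is P, and the expected category α is replaced by P before the comparison.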
9.3. boolean procedure I ⊢ E; expressionstring I; expression E;
⊢ := case shape(E) of
begin
type : true;
prop : true;
variable : ∃i(Iᵢ ≡ E);
d(Σ) : I ⊢ˢ Σ and CORRECTCATS(Σ, INDSTR(d));
(A) B : I ⊢ A and I ⊢ B and CAT(A) ≐ DOM(B);
[x : A] B : I ⊢ A and ((I, x)) ⊢ B; (see 9.5)
otherwise : false;
end;
9.4. boolean procedure I ⊢ˢ Σ; expressionstring I, Σ;
comment ⊢ˢ is the string analogue of ⊢;
⊢ˢ := if Σ ≡ ∅ then true else I ⊢ˢ Σ⁻ and I ⊢ Σ⁺;
9.5. A comment on I ⊢ [x : P] Q
In this case, the checker, after checking I ⊢ P, adds a “waste-line” to the book, of the form:

I * waste := EB ; P .

If we denote this new book by B′, then the checker checks the statement B′; ((I, waste)) ⊢ [x/waste] Q. For this reason the correctness of a bound variable will never be asked for, and its CAT or DOM will never be computed. Only in ≐ can the shape boundvar occur.
10. THE CORRECTNESS OF LINES

The checking for correctness of an Automath line is now easy to describe in terms of already defined procedures:
10.1. boolean procedure CORRECT(LINE); Automath line value LINE;
CORRECT := case form of the line is of
begin
I * N := EB ; E : I ⊢ E;
I * N := PN ; E₁ : I ⊢ E₁;
I * N := E₁ ; E₂ : I ⊢ E₁ and I ⊢ E₂ and CAT(E₁) ≐ E₂;
otherwise : false;
end;
11. A PARAGRAPH SYSTEM

As already mentioned in [van Daalen 73 (A.3), Section 2.16], the syntactical definition of AUT-68 (and AUT-QE) forces us to write mutually exclusive names (identifiers) for both variables and constants. This, of course, is very annoying to the writer of Automath. Therefore we have introduced a paragraph system. Each Automath text may be divided into sections, called paragraphs. A paragraph starts with:

+ paragraph name.

and ends with:

- paragraph name.
In a paragraph one may write Automath lines and other paragraphs (sub-paragraphs). Finally the whole book is contained in one big paragraph, so all paragraphs occur nested. Behind the identifier of a given constant one may write a so-called paragraph reference, to indicate in which paragraph this identifier has been defined. An identifier b with paragraph reference to (say) paragraph Pₙ is written in the form: b"P₁ - P₂ - ... - Pₙ", where P₂ is a sub-paragraph of P₁, P₃ is a sub-paragraph of P₂, ..., and Pₙ is the paragraph in which b is actually defined. An identifier, not followed by a paragraph reference, refers to a constant or variable defined in the same paragraph, or, if not found there, in the paragraph which contains that one, and so on.
Example. (a := ... denotes a definition of a; ... (a) ... denotes a reference to a.)

line nr   book                        reference to line nr
          + A.
1         p := ...
          + B.
2         ... (p"A") ...              1
3         ... (p"B") ...              no good reference (p has not been defined in B)
          + C.
4         p := ...
5         ... (p) ...                 4
6         ... (p"A - B - C") ...      4
          - C.
7         ... (p"A") ...              1
          - B.
8         ... (p) ...                 1
          - A.
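The lookup rules can be modelled roughly as follows (our own minimal model, not the translator's actual coding: the book is a list of (paragraph path, name, line number) triples, and an unreferenced name is searched for in the current paragraph and then outward through the enclosing paragraphs):

```python
def resolve(name, current_path, book, reference=None):
    """Line number denoted by `name`, optionally with an explicit
    paragraph reference such as ("A", "B", "C").
    book: list of (paragraph path, name, line nr) triples, in book order."""
    if reference is not None:
        hits = [ln for path, n, ln in book
                if n == name and path == tuple(reference)]
        return hits[-1] if hits else None
    # no reference: search the current paragraph, then the enclosing ones
    for depth in range(len(current_path), -1, -1):
        scope = tuple(current_path[:depth])
        hits = [ln for path, n, ln in book if n == name and path == scope]
        if hits:
            return hits[-1]
    return None

# the two definitions of p from the example: line 1 in A, line 4 in A-B-C
book = [(("A",), "p", 1), (("A", "B", "C"), "p", 4)]
```

On this book, p"B" yields no line (as in line 3 of the example), while a bare p written inside A resolves to line 1.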
12. FINAL REMARKS

12.1. We repeat that the procedures given here form only an outline of the actual verifier. Many more parameters are passed through the procedures to avoid duplication, to control critical passages, to permit communication with the user and so on.
12.2. With regard to efficiency, improvements may be possible. For example, parts of the strategy, implemented in ≐, are more or less arbitrary, although suggested by reflexion and practical work. Experience and research may lead to better strategies. Also the use of the features of [de Bruijn 72b (C.2)] may lead to a more efficient verifier.
12.3. We are pleased to say, in any event, that the verifier has been working satisfactorily up to now.
12.4. An example of a text checked with the described verifier is found in [van Benthem Jutting 73].
Checking Landau’s “Grundlagen” in the Automath System
Parts of Chapter 3 (Verification)

L.S. van Benthem Jutting
CHAPTER 3. VERIFICATION
In this chapter the verification of the AUT-QE text is described. Some features of the program and the possibility of excerpting are discussed.

3.0. Verification of the text
The verification of the AUT-QE translation of Landau’s book was executed on the Burroughs B6700 computer at the Eindhoven University of Technology. The last page of the book was checked in September 1975. The whole book was checked in a final run on October 18, 1973.
The verifying program was conceived by N.G. de Bruijn and implemented by I. Zandleven. For a description of this program we refer to [Zandleven 73 (E.1)]. Zandleven also provided the program with input and output facilities, and extended it with a conversational mode for on-line checking and correcting of texts. The verification took place in three stages:
(i) First the AUT-QE text was fed into the system on a teleprinter. At this stage the main syntactical structure of the text was analyzed. It was checked, for example, that the format of the lines was as it should be, that the bracketing of the expressions was correct, and that no unknown identifiers occurred.
(ii) Secondly the AUT-QE text was coded. At this stage the correct use of the context structure, the validity of variables, the correct use of the shorthand facility [van Daalen 73 (A.3), 2.15] and of the paragraph reference system (cf. Appendix 2 [not in this Volume]) were checked.
(iii) Finally the text was checked with respect to all clauses of the language definition. At this stage the degree [van Daalen 73 (A.3), 2.3] and types of expressions were calculated, and the correctness of application expressions and constant expressions was checked. Vital for this is the verification of the definitional equality of certain types (cf. [van Daalen 73 (A.3), 2.10], [Zandleven 73 (E.1)]).

Runs of the stages (ii) and (iii) generally claimed much of the computer's (virtual) memory capacity (over 600 K bytes were needed for the program together with the coded text). In order to avoid congestion in the multi-programming system it was therefore necessary to have the program executed at night (and off-line). As Automath texts are checked relative to correct books, a mechanical provisional debugging device for off-line checking was implemented, by which lines which were found incorrect could be tentatively repaired. E.g., when the middle part [van Daalen 73 (A.3), 2.13.1] of a line was found incorrect, the debugging device changed it temporarily into PN, thus turning an abbreviation line into a PN-line. The line so “corrected” was then again checked, and, if it was found correct, the lines following could then be checked relative to the “corrected” book. By this device it was not necessary to stop the checking immediately after the first error had been found.

Another feature of the verifying program was added because of the fact that proving expressions to be incorrect (especially proving expressions to be not definitionally equal) is often more difficult and more time-consuming than proving correctness. Therefore during off-line runs a parameter in the program (viz. the number of decision points, to be explained in 3.1) has been limited, and lines were considered provisionally incorrect when this limit was exceeded.

When the later chapters were checked, we reduced the demands on the computer's memory capacity by abridging the book relative to which the text was checked, in the following way: In the chapters which had already been found correct, the proofs of theorems and lemmas were omitted, and the final lines of these proofs (where the theorems and lemmas are asserted) were changed into PN-lines. Each time a chapter was completely checked (relative to the book so abridged) it was abridged in its turn. Texts which are correct relative to the abridged book will be correct with respect to the unabridged book too. On the other hand, as in classical mathematics there is no reference to proofs but only to assertions, it is unlikely that texts which are correct relative to the unabridged book will be rejected relative to the abridged book. In actual fact this did not occur.

When a chapter, after several off-line runs of the program, was found to be “nearly correct”, the final verification of that chapter took place on-line. In such an on-line run the remaining errors could be immediately corrected. Moreover, correct lines could be verified, which had been provisionally rejected because the number of decision points during verification in off-line runs had exceeded the chosen limit. The verification of such complicated lines could be shortened
by directing (in conversational mode) t h e strategy for establishing definitional equality. After all chapters were verified in this way, the integral AUT-QE text (coniplete and unabridged) was checked during a final on-line run, which took 2 hours (real time). Of this time '12 minutes were spent on verification (not including t h e time needed for coding). In a table we list some d a t a on this final run, concerning verification time, number of performed reductions and memory occupied. hapter hapter
                              prel.    ch.1    ch.2    ch.3    ch.4   ch.4a    ch.5    total
                              text                                                    text
  verification time (sec.)   107.3   143.1   301.2   342.4   405.7   813.1   406.9   2519.7
  alpha-reductions             631     752    1077    1455    1644    3393    1533    10485
  beta-reductions              564     832     460     466     414    2749     529     6014
  delta-reductions             596    1111    1318    1873    2724    9290    3151    20063
  eta-reductions                 -       -       -       -       -       -       -        2
  nr. of lines                 886    1068    1603    2181    2779    2690    2226    13433
  nr. of expressions         12155    9388   25792   30327   42067   60450   34959   215138
Since one coded expression occupies 30 bytes (mainly used for references to subexpressions), the total memory required for the coded book is about 6500 K bytes (= 52000 K bits).

3.1. Controlling the strategy of the program
In order to establish definitional equality of two expressions, the verification system tries to find another expression to which both reduce. The choice of efficient reduction steps for this purpose is a matter of strategy ([van Daalen 73 (A.3), 6.1.1]). The programmed strategy is described in [Zandleven 73 (E.1)]. Under this strategy it is possible that intermediate results are obtained which strongly suggest a negative answer to the question of definitional equality, without definitely settling it. Suppose, for example, that a(p) = a(q) has to be established. The program's strategy is to ascertain that the constants a and a are identical and to verify whether p = q. If this is not the case, there is a strong suggestion that a(p) and a(q) are not definitionally equal either, but this is yet uncertain. For example, they are definitionally equal relative to the book
  *  n := PN   E type
  *  x := PN   E n
  *  y := x    E n
  *  p := y    E n
  *  q := x    E n
808
L.S. van Benthem Jutting
It is a matter of strategy how to proceed in such cases. We may either apply delta-reduction (in which case the issue will be eventually settled) or we may try to continue the verification process without using a(p) = a(q). Such a situation is called a decision point. In on-line runs the verification may be controlled here by the human operator. (Actually, in the situation sketched above, information will be supplied, and the question will appear whether delta-reduction should be tried.) In off-line runs delta-reduction will be applied in order to get a definite answer to the question, and it will be checked that the total number of decision points passed during the checking of a line does not exceed the chosen limit (cf. 3.0).
An Implementation of Substitution in a λ-Calculus with Dependent Types

L.S. van Benthem Jutting

1. INTRODUCTION

In this paper we describe an implementation of substitution which makes use of de Bruijn indices [de Bruijn 72b (C.2)] and of structure sharing (cf. [Wadsworth 71], [Boyer & Moore 72], [Curien 86]). This implementation was designed by I. Zandleven about 1972. He used it in the verifying program for Automath, which has been running at Eindhoven for over 10 years. As it was never properly described we have thought it worth while to give a formal description and to prove that the implementation really satisfies its specifications.
The implementation of substitution which we describe does not really carry out substitutions, but implements them by considering pairs consisting of an environment and an expression. The environment in such a pair gives a meaning to the free variables which occur in the expression. Substitution for a free variable (or for more free variables) is implemented by changing the environment. Such an implementation is of interest in situations where (as in Automath) the issue is not to normalize an expression, but to decide whether two expressions are equal, i.e. whether they have a common reduct. In such a situation time as well as space is saved because there is no copying involved in substitution.
We start our description by briefly explaining the structure of the system. We also explain the mechanism of relative addressing, known by now as the system of de Bruijn indices. Our system (and also Zandleven's implementation) makes essential use of this mechanism. We think, however, that similar implementations using absolute addressing might be possible.
Then, in Section 2, we give a formal definition of our system of name-free λ-calculus. We formally define single and multiple substitution, stating theorems about commuting these operations. In Section 3 we discuss the implementation of substitution by structure sharing.
We define environments, operations upon environments and an interpretation function, mapping pairs (A, E), where A is an expression and E an environment, to expressions. We prove that our implementation is sound, i.e. that
the interpretation of (A, E) is really the result of the substitution we meant to implement in constructing E. We carry out this program first for multiple substitution, then for single substitution and finally for a combination of both. Then, in Section 4, we treat the implementation of typing. We define a typing operator in the original system, which associates a type to certain expressions. We give theorems about the connections between typing and substitution. After that we describe in our implementation a corresponding operator which produces a type for the expression denoted by the pair (A, E) (if this expression has a type) and we prove this function to behave as described. We omit proofs, all proofs being by simple induction on the structure of expressions, possibly making use of earlier theorems. Finally we make some concluding remarks in Section 5. In that section we discuss briefly the possibilities of the system and the environment in which it has been used.
1.1. Automath
First we give a short description of the main aspects of Automath which are relevant to our discussion. The Automath system is a proof checking system, which can be used for checking large mathematical proofs. As such it can be used also in checking correctness proofs for designs of computer programs. For an introduction into Automath we refer to [de Bruijn 80 (A.5)] and [van Daalen 73 (A.3)].
Automath is a typed λ-calculus with dependent types. We assume the basic notions of λ-calculus, such as substitution and β-reduction, to be well known. Incorporated in the system is the notion of definition: if A is an expression and if the free variables of A are among x1, ..., xn (where n ≥ 0) then we can define an n-ary constant a such that a(x1, ..., xn) stands for A. Now if B1, ..., Bn are expressions then a(B1, ..., Bn) is an expression, which should be interpreted as the expression obtained from A by simultaneous substitution of Bi for xi (where 1 ≤ i ≤ n). The operation of eliminating a definition, replacing a defined constant by its definiens, is called δ-reduction. In our system it gives rise to the implementation of simultaneous substitution.
It is convenient for our description to forget about the arity of constants by postulating them to have infinite arity. We will therefore consider in the following sections expressions of the form a(B̄) where B̄ is an infinite sequence of expressions.
As we have said, Automath is a typed λ-calculus: terms are typed λ-terms, variables are typed. When the variable x has type A this is denoted by x : A. If A, B and C are expressions, x is a variable, and if for x typed by A the type
An implementation of substitution (E.3)
811
of B is C, then the function λx : A.B has type Πx : A.C (where B and C may depend upon x). We adopt the practice (which is common use in Automath) to denote both expressions λx : A.B and Πx : A.C in the same way: by [x : A]B and [x : A]C. In Section 4 we come back to this point. The value of the function A for the argument B will be denoted by A{B}. (This conflicts with the habit in Automath to put the argument before the function.) Thus we get the following description of Automath expressions:
We assume two disjoint infinite sets C and V to be given:
  C = {a, b, c, ...} is the set of constants,
  V = {x, y, z, ...} is the set of variables.
Now
  if x is a variable then x is an expression,
  if a is a constant and B̄ is a sequence of expressions then a(B̄) is an expression,
  if A and B are expressions, and x is a variable, then [x : A]B is an expression,
  if A and B are expressions then A{B} is an expression.
On these expressions we can define single and multiple substitution. Also α-conversion and β-reduction can be defined, and for constants which have definitions we can define δ-reduction (i.e. application of a definition).
1.2. de Bruijn indices
When defining substitution formally the main problem is to avoid clash of variables. This is usually done by introducing "fresh" variables where needed. In our presentation we avoid clash of variables by using relative addressing (nameless variables or "de Bruijn indices" [de Bruijn 72b (C.2)]; see also [Berkling & Fehr 82]). This concept is important both in Zandleven's actual implementation of the Automath verifier and in our formal treatment of substitution and its implementation by environments. We will now give an informal description of this representation of variables. As an example we choose the expression

  [u : x{z}][v : [w : u]w] z{u} .     (1)

This expression should be considered relative to a "context", that is a sequence of distinct variables, containing all its free variables. In this case the context could be

  x, y, z, ... .     (2)
We represent the expression (1) on its context (2) by a planar tree.

Figure 1

In this tree the nodes labelled ab and ap represent abstraction and application, respectively. The bold unary dots represent the places where variables are bound, and the letters labelling the leaves are the (free and bound) variables. Note that every letter which labels a leaf should be bound by a bold dot, its binder, which is situated either on its path to the root of the tree or in the context. Now we replace every letter at a leaf by a number, which indicates the number of bold (binding) dots which lie on the path to its binder. Or, in other words, we replace every occurrence of a variable by the number of scopes in which it is contained and which are strictly contained in its own scope. Doing so we get the tree of Figure 2. It will be clear that, if we identify α-equivalent expressions, the letters labelling the binders are now irrelevant. Forgetting about these we get the "nameless" expression

  [0{2}][[0]0] 4{1} .

Note that in this representation of (1) the number 0 denotes the different variables x, u and w, while the variable u is represented by 0 as well as by 1. Let us analyze which numbers denote the same free variable in a tree. Suppose n is a number labelling a node in a (nameless) tree and let m be the number of bold dots on the path from this node to the root. It is clear that n denotes
Figure 2
a free variable iff n ≥ m. We will call the number n − m the depth of n in the tree. And we see that two labels denote the same free variable if they have the same non-negative depth. In the example this is the case with the labels 2 and 4 (which both denote z; in fact there are two dots on the path from 2 to z, viz. the dots x and y, and four dots on the path from 4 to z, viz. v, u, x and y). Both labels 2 and 4 have depth 2. It is easy to make algorithms which translate ordinary name-carrying expressions into nameless ones and back (given a certain context and forgetting about α-equivalence).
2. NAME-FREE λ-CALCULUS

2.1. The language
The structure of the expressions and the use of de Bruijn indices have been explained in the previous section. We will now give a formal definition of the language, which is essentially the Automath language defined in Section 1, but uses de Bruijn indices instead of variables. In the following N denotes the set of natural numbers {0, 1, 2, ...}.
Definition. We assume an infinite set C = {a, b, c, ...} of constants to be given. Now the set L = {A, B, C, ...} of expressions is defined by
  if x ∈ N then x is an expression,
  if a ∈ C and B̄ is a sequence of expressions then a(B̄) is an expression,
  if A and B are expressions then ([A]B) is an expression,
  if A and B are expressions then A{B} is an expression.
We will omit parentheses around [A]B when this does not present parsing problems. The natural numbers occurring in an expression will be called references. It has been explained in the introduction how references can be interpreted as nameless (free or bound) variables, and that references having the same non-negative depth can be interpreted as denoting the same free variable. There is, however, no need for defining formally the concept of depth; it will be used only in informal comments.

2.2. Operations

2.2.1. Updating
When we substitute an expression A for the free variables in B with depth n we should add n to the references representing free variables of A in order to preserve their original depth in the result of substitution. The corresponding updating operator is denoted by u^n_0. When we are updating the expression A in this way we do not want the references denoting bound variables to be updated. If, for example, A is the expression [C]D then references in D with depth 0 should remain unchanged, and other references should be updated. Therefore we need operators u^n_k (for every k ∈ N) which increase by n the references in A with depth k or more. This gives u^n_0 as a special case.
Definition.
  u^n_k x = x + n   if x ≥ k
          = x       if x < k
  u^n_k [A]B = [u^n_k A] u^n_{k+1} B
  u^n_k A{B} = u^n_k A {u^n_k B}
  u^n_k a(Ā) = a(u^n_k Ā) .
Remarks. In the last clause of this definition we have used u^n_k Ā, which is meant to denote the sequence of expressions obtained from Ā by applying u^n_k
to each item. It is possible, of course, to define this concept formally, but this would give rise to an extra clause in our definition. For brevity the present notation has been chosen in all definitions. The second clause of the definition is justified by the fact that references with depth k + 1 or more in B correspond to references in [A]B with depth k or more.
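The clauses of the updating operator can be rendered directly as a recursive function. The tuple encoding below is our own illustrative choice, not the paper's notation, and the finite argument lists of constants stand in for the paper's infinite sequences; a sketch:

```python
# Encoding (ours): ('ref', x)           a reference (de Bruijn index)
#                  ('abs', A, B)        [A]B
#                  ('app', A, B)        A{B}
#                  ('const', a, [B...]) a(B0, B1, ...)

def u(n, k, e):
    """u^n_k: increase by n every reference in e with depth k or more."""
    tag = e[0]
    if tag == 'ref':
        return ('ref', e[1] + n) if e[1] >= k else e
    if tag == 'abs':          # second clause: B sits under one more binder
        return ('abs', u(n, k, e[1]), u(n, k + 1, e[2]))
    if tag == 'app':
        return ('app', u(n, k, e[1]), u(n, k, e[2]))
    return ('const', e[1], [u(n, k, b) for b in e[2]])

# Lifting [0] 0{1} by 2 leaves the bound 0 alone and sends the free 1 to 3.
A = ('abs', ('ref', 0), ('app', ('ref', 0), ('ref', 1)))
print(u(2, 0, A))   # ('abs', ('ref', 2), ('app', ('ref', 0), ('ref', 3)))
```

The example illustrates the remark above: inside the body of [A]B the cut-off k is raised by one, so bound references are never touched.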
2.2.2. Single substitution
We now define the operation s^k_C of single substitution. s^k_C A denotes the result of substituting C for the references in A with depth k. The variable denoted by the depth k in A is supposed to disappear from the context. Therefore the references with greater depth are decreased by 1.
Definition.
  s^k_C x = x         if x < k
          = u^k_0 C   if x = k
          = x − 1     if x > k
  s^k_C [A]B = [s^k_C A] s^{k+1}_C B
  s^k_C A{B} = s^k_C A {s^k_C B}
  s^k_C a(Ā) = a(s^k_C Ā) .
Remark. The system defined here, with the operations u^n_k and s^k_C, is similar to the λ-calculus treated in [Curien 86].

2.2.3. Multiple substitution
We define the operation d^k_C̄ of multiple substitution. That we use the symbol "d" to denote this operation suggests that its main use is in applying definitions (or, in Automath jargon, in δ-reduction). d_C̄ A denotes the result of simultaneously substituting the expressions C0, C1, C2, ... for the references in A with depth 0, 1, 2, ... respectively. We do not want the references denoting bound variables in A to be changed. Therefore we need for k ∈ N the operation d^k_C̄ denoting simultaneous substitution for those references in A which have depth k or more.
Definition.
  d^k_C̄ x = x                if x < k
          = u^k_0 C_{x−k}    if x ≥ k
  d^k_C̄ [A]B = [d^k_C̄ A] d^{k+1}_C̄ B
  d^k_C̄ A{B} = d^k_C̄ A {d^k_C̄ B}
  d^k_C̄ a(Ā) = a(d^k_C̄ Ā) .
2.3. Theorems The following theorems treat the properties and relations of the operations defined above. Proofs proceed by easy induction on the structure of the expressions. For some theorems we give an intuitive justification.
Theorem 2.1.
  u^n_k u^m_l A = u^{m+n}_l A           if l ≤ k ≤ l + m ,
  u^n_k u^m_l A = u^m_l u^n_{k−m} A     if k > l + m .

Theorem 2.2.
  s^k_C u^m_l A = u^{m−1}_l A           if l ≤ k < l + m ,
  s^k_C u^m_l A = u^m_l s^{k−m}_C A     if k ≥ l + m .

Theorem 2.3.
  s^k_C s^l_D A = s^l_{s^{k−l}_C D} s^{k+1}_C A     if k ≥ l .

Remark. Theorem 2.3 corresponds to a well-known result on ordinary substitution: S^x_C S^y_D A = S^y_{S^x_C D} S^x_C A provided x ≢ y and x is not free in C.
Theorem 2.4.
  u^m_l d^k_C̄ A = d^{k+m}_C̄ u^m_l A       if k ≥ l ,
  u^m_l d^k_C̄ A = d^k_{u^m_{l−k} C̄} A     if k ≤ l .

Remark. Theorem 2.4 can be understood as follows. Suppose k ≥ l. Then, if we apply u^m_l to an expression A, the references in A with depth ≥ k are increased by m (together with all other references in A with depth ≥ l) and become references with depth ≥ k + m. When we subsequently substitute the expressions C̄ for these references, they are updated with u^{k+m}_0, as may be seen in the definition in 2.2.3.
If, on the other hand, we first substitute C̄ for the references with depth ≥ k, then the substitution will occur at the corresponding places in A and the expressions C̄ are updated with u^k_0. When we afterwards apply u^m_l, all references with depth ≥ l (which now includes all outside references in C̄) are increased by m, which gives the same result.
Now suppose k ≤ l. If we substitute C̄ for the references in A with depth ≥ k then there are no references left originating from A itself with depth ≥ l. Moreover, all expressions in C̄ are now updated by u^k_0. If we subsequently apply u^m_l, this will affect precisely those references in the expressions from C̄ which had depth ≥ l − k before substitution. Therefore we can reverse the order, beginning with updating C̄ and doing the substitution afterwards.
Theorem 2.5a.

Theorem 2.5b.  (if k > l)

Theorem 2.6.
  d^k_C̄ d^k_D̄ A = d^k_{d^0_C̄ D̄} A .

Remark. Theorem 2.6 corresponds to a well-known result on multiple substitution: S^x̄_C̄ S^ȳ_D̄ A = S^ȳ_{S^x̄_C̄ D̄} A provided all free variables in A which are among x̄ are also among ȳ.
3. THE IMPLEMENTATION OF SUBSTITUTION
In this section we describe the implementation of substitution by structure sharing. This implementation is used in the Automath verifying program, and has been designed by I. Zandleven. In order to code substitution without really carrying it out (i.e. without constructing new expressions by copying existing expressions) we use environments.
Expressions can be interpreted with respect to such environments, and the interpretation will be an expression, which we will prove to be the intended result of substitution. By coding substitution in this way the amount of memory space required and the time for carrying out substitutions will decrease, while the time needed for comparison of expressions might increase. We will give in this section mathematical definitions of environments and of the interpretation function and state a theorem that these definitions meet the requirements formulated above. In order to make our description clear we will treat first multiple substitution, then single substitution, and finally the combination of both.
3.1. Multiple substitution
In this section we define environments for multiple substitution, called d-environments. The name suggests that these environments are used when δ-reduction (i.e. application of definitions) is performed. We will also define two operations on these environments.

Definition. A d-environment Δ is a finite sequence of functions Δ1, Δ2, ..., Δk, where Δi : N → L for 1 ≤ i ≤ k and k ∈ N. The number k will be called the length of Δ.

Let us first explain informally what we intend to picture by such a d-environment. Suppose Δ is a d-environment of length 1. If we interpret an expression A with respect to Δ we intend that the free references in A with depth i shall be interpreted to denote the value of the function Δ1 at i. We could picture this in a diagram as follows:

Figure 3

If we want to interpret an expression A on a longer d-environment Δ, of length k say, then we think of the free references of A as pointing into Δk, but an expression Δk(i) should now be interpreted with respect to the d-environment
Δ1, Δ2, ..., Δk−1. This situation is illustrated in the following diagram.

Figure 4

Note that a d-environment might be an empty sequence. The interpretation of an expression A on the empty d-environment is A itself.
3.1.1. Multiple substitution
The operation δ_C̄ (where C̄ is a sequence of expressions) is called multiple substitution (because it codes multiple substitution). Its effect is to extend a d-environment with the sequence C̄.

Definition. Let Δ = Δ1, ..., Δk be a d-environment. Then δ_C̄ Δ = Δ1, ..., Δk, Δk+1 where Δk+1(i) = C_i for i ∈ N.
3.1.2. Cutting
The operation γ cuts the last segment from a (non-empty) d-environment.

Definition. Let Δ = Δ1, ..., Δk be a d-environment, and k ≥ 1. Then γΔ = Δ1, ..., Δk−1.

3.1.3. Interpretation
We now define the interpretation ("on depth n") of an expression A with respect to a d-environment Δ. The definition formalizes our intentions as explained above. The parameter n for the depth is needed for interpreting the bound variables in A. It is intended that the interpretation of A is the result
of a certain multiple substitution for all free variables of A. Because nothing should be substituted for the bound references in A, these references should be unchanged by interpretation. In the definition recursion is used on the length of Δ and on A.

Definition. Let Δ = Δ1, ..., Δk be a d-environment. If k = 0 then I(A, Δ, n) = A. If k ≥ 1 then
  I(x, Δ, n) = x                             if x < n
  I(x, Δ, n) = u^n_0 I(Δk(x − n), γΔ, 0)     if x ≥ n
  I([A]B, Δ, n) = [I(A, Δ, n)] I(B, Δ, n + 1)
  I(A{B}, Δ, n) = I(A, Δ, n){I(B, Δ, n)}
  I(a(Ā), Δ, n) = a(I(Ā, Δ, n)) .

We derive two theorems concerning this interpretation. Theorem 3.2 shows us that the interpretation which we have defined has the property we wanted, i.e. that d-environments indeed code multiple substitution.
Theorem 3.1. If l ≤ n then
  I(u^m_l A, Δ, n + m) = u^m_l I(A, Δ, n) .

Theorem 3.2.
  I(A, δ_C̄ Δ, n) = d^n_{C̄*} A   where C*_i = I(C_i, Δ, 0) .
3.2. Single substitution
Now we define another kind of environments, called s-environments, which code single substitution (as is suggested by the "s" in their name).

Definition. An s-environment is a partial function Σ : N → (L × N) with a finite domain.

Let us explain the intended interpretation of an expression A with respect to an s-environment Σ. A reference in A with non-negative depth k (which represents a free variable in A) is intended to denote a variable (i.e. a reference) if k ∉ dom(Σ). If k ∈ dom(Σ) then this tells us that an expression has been
substituted for k. If Σ(k) is the pair [C, n] then C has been substituted, or rather the interpretation of C with respect to the prepart of the original environment which starts n places before the place k of C. We illustrate this in another diagram.

Figure 5
The need for such a construction can be seen when we consider the expression

  ([A][B]C){D}{E}

where A, B, C, D and E are expressions. We want to consider the interpretation of this expression with respect to a certain s-environment Σ, and to apply β-reduction. Let us draw a diagram of this initial situation.

Figure 6

So we consider the β-redex ([A][B]C){D} on the same environment. It will be clear that its reduct should be represented by the expression [B]C interpreted on the environment Σ extended by D (where D should also be interpreted on Σ). The situation is pictured in Figure 7. Its interpretation is the result of substitution of D (or, rather, the interpretation of D) for the free references of depth 0 in [B]C. Now we can reduce again by applying this expression to E, i.e. by substituting E for the references with depth 0 in C. But E should be interpreted with respect to the original environment Σ, and we can indicate this by putting E into the environment with index 1, which will bring about that references in E will be considered as pointing beyond D. The situation is sketched again in the diagram of Figure 8.
Figure 7

Figure 8

Both D and E are now interpreted with respect to the original environment Σ, while references with depth 0 from C point to E and references with depth 1 point to D.
Now let us look at the intended interpretation of a reference which does not point into the domain of Σ. We have seen that references are shifted when a substitution is made for a smaller reference (cf. Definition 2.2.2, the clause concerning s^k_C x where x > k). This indicates that the interpretation of a reference with depth x should be obtained from x by subtracting the number of elements of dom(Σ) which are below x. Therefore we give the following definition.
Definition. x mod Σ := x − #{y ∈ dom(Σ) | y < x} .

Then the intended interpretation of a reference with depth x not pointing into dom(Σ) will be x mod Σ. On s-environments we define three operations.

3.2.1. Substitution
The operation σ^k_{C,n} (called substitution because it codes (single) substitution) extends the domain of an s-environment to k, and associates to k as value
the pair [C, n]. (As has been explained above, "n" indicates that C should be interpreted with respect to a shorter s-environment.)

Definition. If k ∉ dom(Σ) then
  (σ^k_{C,n} Σ)(k) = [C, n]
  (σ^k_{C,n} Σ)(i) = Σ(i)    if i ∈ dom(Σ) .
3.2.2. Extension
The operation ε is called extension because it codes extension of a context with another free variable. Its effect is that the domain of the s-environment is shifted.

Definition.
  (εΣ)(i + 1) = Σ(i)    if i ∈ dom(Σ) .

3.2.3. Cutting
The operation γn, called cutting, codes the removing of variables from the context. Its effect is that the domain of the s-environment is shifted, and possibly becomes smaller.

Definition.
  (γn Σ)(i) = Σ(i + n)    if i + n ∈ dom(Σ) .
3.2.4. Interpretation
With the help of the operations on s-environments defined above we can describe the interpretation of an expression with respect to an s-environment Σ. As we have seen, the interpretation of a reference with depth x which does not point into dom(Σ) will be x mod Σ, which is x minus the number of elements of dom(Σ) which are smaller than x. First we state some properties of x mod Σ. Clearly the function λx.(x mod Σ) is increasing. Moreover we have:

Theorem 3.3. If x ∉ dom(Σ) then
  x mod Σ = (x + 1) mod Σ − 1 .
The following three theorems show us what the relation is between the three operations defined on Σ and the value of x mod Σ.

Theorem 3.4.
  x mod σ^k_{C,n} Σ = x mod Σ        if x ≤ k ,
  x mod σ^k_{C,n} Σ = x mod Σ − 1    if x > k .

Theorem 3.5.
  x mod εΣ = (x − 1) mod Σ + 1 .

Theorem 3.6.
  x mod γk Σ = (x + k) mod Σ − k mod Σ .
Now we will define the interpretation of an expression A with respect to an s-environment Σ. We use recursion on dom(Σ) and A.

Definition.
  I(x, Σ) = u^{(x+k+1) mod Σ}_0 I(B, γ_{x+k+1} Σ)    if Σ(x) = [B, k]
  I(x, Σ) = x mod Σ                                  if x ∉ dom(Σ)
  I([A]B, Σ) = [I(A, Σ)] I(B, εΣ)
  I(A{B}, Σ) = I(A, Σ){I(B, Σ)}
  I(a(Ā), Σ) = a(I(Ā, Σ)) .
We derive two theorems concerning this interpretation. Theorem 3.8 shows us that the interpretation which we have defined has the property we wanted, i.e. that s-environments indeed code single substitution.

Theorem 3.7.
  I(u^n_k A, ε^k Σ) = u^{n mod Σ}_k I(A, ε^k γn Σ) .

Corollary 3.7.
  I(u^n_0 A, Σ) = u^{n mod Σ}_0 I(A, γn Σ) .
Theorem 3.8. If k ∉ dom(Σ) then
  I(A, σ^k_{C,n} Σ) = s^{k mod Σ}_{C*} I(A, Σ)
where
  C* = u^{n mod γ_{k+1} Σ}_0 I(C, γ_{k+n+1} Σ) .
3.3. Combination of d-environments and s-environments
Now we will combine the techniques described in the previous sections. For doing this we define another kind of environments, which we will call c-environments. Here "c" indicates that these environments combine the possibilities of d-environments and s-environments.

Definition. A c-environment is a partial function Γ on N with a finite domain, and with values in (N → L) ∪ (L × N).

The values in N → L denote segments of a d-environment, the values in L × N denote values of an s-environment. Let us first sketch the intended interpretation of an expression A with respect to such a combined environment Γ. As an example we present the following diagram.
Figure 9

As the diagram suggests, references from A are considered to point either into the first d-environment segment of Γ or before that segment. In order to
picture the intended environments for C, D and E in the situation sketched above we look at the following diagram.
Figure 10

As may be seen in the diagram, the environments for C and D are obtained by counting back 1 place and 6 places, respectively, just as in the case of an s-environment. The expressions in a d-environment segment, such as E, have as their environment that part of Γ which lies to the left of the segment.
For a c-environment Γ we define the place of its first d-environment segment.

Definition.
  a(Γ) = min { l ∈ dom(Γ) | Γ(l) ∈ N → L } .

Remarks. The minimum is considered to be infinite if the set is empty. For the c-environment in the example above a(Γ) = 4.
On c-environments we (re)define the four operations: multiple and single substitution, extension and cutting.

3.3.1. Multiple substitution
The c-environment gets a d-environment segment C̄ at k.

Definition. If l > k for all l ∈ dom(Γ) then
  (δ^k_C̄ Γ)(k) = C̄
  (δ^k_C̄ Γ)(i) = Γ(i)    if i ∈ dom(Γ) .
3.3.2. Single substitution
The domain of the c-environment is extended to k, and its value at k is a pair [C, n]. As earlier the "n" indicates that C should be interpreted with respect to a shorter c-environment.

Definition. If k ∉ dom(Γ) then
  (σ^k_{C,n} Γ)(k) = [C, n]
  (σ^k_{C,n} Γ)(i) = Γ(i)    if i ∈ dom(Γ) .

3.3.3. Extension
The c-environment is extended (as was previously done with s-environments). The domain of the environment is shifted.

Definition.
  (εΓ)(i + 1) = Γ(i)    if i ∈ dom(Γ) .
3.3.4. Cutting
From the c-environment the last n entries are cut.

Definition.
  (γn Γ)(i) = Γ(i + n)    if i + n ∈ dom(Γ) .

3.3.5. Interpretation
The interpretation of a reference with depth x representing a free variable with respect to a c-environment Γ is comparable to its interpretation on an s-environment Σ. Therefore we define again for x ∈ N the possible interpretation x mod Γ as follows.

Definition.
  x mod Γ := x − #{y ∈ dom(Γ) | y < x} .
As before, it is clear that the function λx.(x mod Γ) is increasing. Also the analogous theorems hold.

Theorem 3.9. If x ∉ dom(Γ) then
  x mod Γ = (x + 1) mod Γ − 1 .

Theorem 3.10.
  x mod δ^k_C̄ Γ = x mod Γ        if x ≤ k ,
  x mod δ^k_C̄ Γ = x mod Γ − 1    if x > k .

Theorem 3.11.
  x mod σ^k_{C,n} Γ = x mod Γ        if x ≤ k ,
  x mod σ^k_{C,n} Γ = x mod Γ − 1    if x > k .

Theorem 3.12.
  x mod γk Γ = (x + k) mod Γ − k mod Γ .

Theorem 3.13.
  x mod εΓ = (x − 1) mod Γ + 1 .
Now we will define the interpretation of an expression A with respect to a c-environment Γ. We use induction on dom(Γ) and A.

Definition.
  I(x, Γ) = u^{(x+k+1) mod Γ}_0 I(B, γ_{x+k+1} Γ)                    if x < a(Γ) and Γ(x) = [B, k]
  I(x, Γ) = x mod Γ                                                  if x < a(Γ) and x ∉ dom(Γ)
  I(x, Γ) = u^{a(Γ) mod Γ}_0 I(Γ(a(Γ))(x − a(Γ)), γ_{a(Γ)+1} Γ)      if x ≥ a(Γ)
  I([A]B, Γ) = [I(A, Γ)] I(B, εΓ)
  I(A{B}, Γ) = I(A, Γ){I(B, Γ)}
  I(a(Ā), Γ) = a(I(Ā, Γ)) .

The following theorems hold. Theorems 3.15 and 3.16 show that the implementation is sound, coding single as well as multiple substitution.
Theorem 3.14. If n < a(Γ) then
  I(u^n_k A, ε^k Γ) = u^{n mod Γ}_k I(A, ε^k γn Γ) .

Corollary 3.14. If n < a(Γ) then
  I(u^n_0 A, Γ) = u^{n mod Γ}_0 I(A, γn Γ) .

Theorem 3.15. If l > k for all l ∈ dom(Γ) then
  I(A, δ^k_C̄ Γ) = d^k_{C̄*} A   where C*_i = I(C_i, γ_{k+1} Γ) .

Theorem 3.16. If k ∉ dom(Γ) then
  I(A, σ^k_{C,n} Γ) = s^{k mod Γ}_{C*} I(A, Γ)
where
  C* = u^{n mod γ_{k+1} Γ}_0 I(C, γ_{k+n+1} Γ) .
4. TYPING

In this section we discuss the way in which certain expressions are typed. First we describe the typing of Automath expressions with named variables. Then we will define the typing of expressions in the name-free λ-calculus L defined in Section 2. We will also state some theorems giving the relation between typing, updating and substitution. Finally we will indicate how to find a type for an expression which should be interpreted on a c-environment. And we will state a theorem to the effect that the type found in this way is really the type of the interpreted expression.
We start by discussing types in Automath. Consider the expressions A, B and C and the variable x. We have remarked before that, if for x typed by A the type of B is C, then the function λx : A.B has type Πx : A.C (where B and C may depend upon x). When we apply this function to an argument D (of type A) we get a value in S^x_D C, that is the type C where D is substituted for x. We could also describe this type by saying that it is the co-ordinate indexed by D of the product Πx : A.C. We use, as before, the notation F{D} for the value of the function F at the argument D, and we introduce here locally the notation P⟨D⟩ for that co-ordinate of the product P which is indexed by D. Then we conclude that if F has type P then F{D} has type P⟨D⟩.
Moreover (Xz : A.B) { D } @reduces to Ss B , and also (Hz : A.C) ( D ) preduces to S g C . We see that the product constructor II together with coordinate selection ( ) have the same behaviour with respect to substitution and P-reduction as the functional abstractor X together with application { }. This is the reason for identifying them in our exposition (as is common use in Automath): Hz : A.B and Xz : A.B are both denoted by [z : A]B , and A { B } as well as A ( B )will be denoted by A { B } . Then, if the type of A is C , the type of A { B } will be C { B } . Now, as we are interested here in syntactical issues only, we will not treat the concept of correctness of expressions. We will, however, introduce a typing operator 7, such that for correct expressions A which have a type, it holds that A is of type 7 ( A ) . Let us first consider the proper Automath case, where all constants have a finite arity. We assume that for certain constants a their types are given; i.e. we presuppose a function 0 which associates to certain expressions a(.') a type (i.e. an expression). Here .'is a finite sequence of distinct variables and its length is the arity of a. This sequence should contain all free variables in the type @(a(Z)).We assume also that the notion of simultaneous substitution has been properly defined. If Z is a sequence of distinct variables and l? a sequence of expressions, and if these sequences have the same length, then S; A will denote the result of simultaneous substitution of the expressions for the variables .' in A. The typing operator 7 (which is a partial operator because not all constants have a type) is now defined as follows:
Let A be an expression, and let φ be a function associating a type (an expression) to every free variable of A. Now
if A is a variable x then τ(A, φ) = φ(x),
if A is a(B⃗) and a(x⃗) is in dom(Θ) then τ(A, φ) = S_x⃗^B⃗ Θ(a(x⃗)),
if A is [x : B]C then τ(A, φ) = [x : B] τ(C, φ*) (where φ* is obtained from φ by changing its value at x to B),
if A is B{C} then τ(A, φ) = τ(B, φ){C}.
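The four clauses above can be sketched in Python. This is our own illustration, not code from the paper; the representation (nested tuples) and the names `tau`, `subst`, `phi`, `theta` are ours.

```python
# A sketch of the typing operator tau for named-variable expressions.
# Expressions are nested tuples:
#   ('var', x)           a variable x
#   ('const', a, args)   a constant a with a list of arguments
#   ('abs', x, B, C)     [x : B]C
#   ('app', B, C)        B{C}
# theta maps a constant name to (parameter list, stored type); subst performs
# simultaneous substitution of expressions for named variables.

def subst(expr, env):
    """Simultaneous substitution: env maps variable names to expressions."""
    tag = expr[0]
    if tag == 'var':
        return env.get(expr[1], expr)
    if tag == 'const':
        return ('const', expr[1], [subst(e, env) for e in expr[2]])
    if tag == 'abs':
        _, x, B, C = expr
        inner = {y: e for y, e in env.items() if y != x}  # x is rebound here
        return ('abs', x, subst(B, env), subst(C, inner))
    return ('app', subst(expr[1], env), subst(expr[2], env))

def tau(expr, phi, theta):
    """Type of expr relative to phi (variable -> type) and theta (constants)."""
    tag = expr[0]
    if tag == 'var':                      # tau(x, phi) = phi(x)
        return phi[expr[1]]
    if tag == 'const':                    # substitute the arguments into
        params, ctype = theta[expr[1]]    # the stored type of the constant
        return subst(ctype, dict(zip(params, expr[2])))
    if tag == 'abs':                      # tau([x:B]C) = [x:B] tau(C, phi*)
        _, x, B, C = expr
        phi_star = dict(phi)
        phi_star[x] = B
        return ('abs', x, B, tau(C, phi_star, theta))
    return ('app', tau(expr[1], phi, theta), expr[2])  # tau(B{C}) = tau(B){C}
```

For instance, with `nat = ('const', 'nat', [])`, the identity `[x : nat]x` gets the type `('abs', 'x', nat, nat)`, i.e. `[x : nat]nat`.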
Remark. This typing operator corresponds to the one described in [Nederpelt 73 (C.3)], as far as abstraction and application are concerned. The description of various typing operators for defined expressions a(B⃗) can be found in [van Daalen 80].
An implementation of substitution (E.3)
4.1. Typing in name-free λ-calculus; contexts
Now we discuss typing in the name-free λ-calculus L of Section 2. Expressions will be expressions in L. In particular constants are considered to have infinite arity and sequences of expressions are infinite. For typing such expressions we need types for constants and variables, just as in the case described above. Therefore we assume two functions to be given: a possibly partial function Θ : L → L and a total function φ : N → L.
The latter, which gives us the types of variables, will be called a context. The expressions of such a context, i.e. the values φ(i), are expressions which may contain free variables (i.e. references). These references should be considered as being typed relative to the prepart of φ. More precisely: the type of a reference with depth k in φ(i) must be taken to be φ(i + k + 1).

We define two operations upon contexts.
4.1.1. Extension
For a context φ we denote by ε_A φ the extension of φ by an expression A. The consequence is that references with depth 0 in the new context will have type A, while references with depth i > 0 will have the type which was originally the type of a reference with depth i − 1.
4.1.2. Cutting
If φ is a context then γ_n φ denotes the result of cutting from φ a segment of length n. Hence γ_1 is the left inverse of ε_A.

Definition. (γ_n φ)(i) = φ(i + n).
4.1.3. The typing operator
We now assume the (partial) function Θ : L → L, which gives the types of constants, to be given and fixed. We define for an expression A and a context φ the type of A relative to φ, to be denoted by t(A, φ).
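A hedged Python reconstruction of the typing operator t (the representation and names are ours, and the clauses are inferred from the remark below and from the named-variable operator τ, not copied from the paper):

```python
# Name-free expressions: ('var', n) a reference of depth n, ('abs', B, C)
# for [B]C, ('app', B, C) for B{C}.  'shift' plays the role of the update.

def shift(expr, k, cutoff=0):
    """Add k to every reference of depth >= cutoff (an update operator)."""
    tag = expr[0]
    if tag == 'var':
        return ('var', expr[1] + k) if expr[1] >= cutoff else expr
    if tag == 'abs':
        return ('abs', shift(expr[1], k, cutoff), shift(expr[2], k, cutoff + 1))
    return ('app', shift(expr[1], k, cutoff), shift(expr[2], k, cutoff))

def t(expr, phi):
    """Type of a name-free expression relative to the context phi."""
    tag = expr[0]
    if tag == 'var':
        # phi(n) is typed relative to the prepart of phi, so update it by n+1
        return shift(phi(expr[1]), expr[1] + 1)
    if tag == 'abs':
        # type the body under the extended context
        B, C = expr[1], expr[2]
        return ('abs', B, t(C, lambda i: B if i == 0 else phi(i - 1)))
    return ('app', t(expr[1], phi), expr[2])   # t(B{C}, phi) = t(B, phi){C}
```

For example, in a context where every reference 0 points one cell up, the term [⟨ref 0⟩]⟨ref 0⟩ gets the type [⟨ref 0⟩]⟨ref 1⟩: the bound variable's type, once seen from inside the abstraction, must be shifted past the binder.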
Remark. The update in the first clause of this definition ensures that the type which is defined can be considered relative to the same context as the expression which is typed. The expression φ(i) points to a prepart of φ preceding i, as has been indicated above.

4.2. Theorems
The following theorems hold for the relation between typing, updating and substitution. The theorems have been formulated so as to make the proofs (by induction) straightforward. The interesting properties are contained in the corollaries.
Theorem 4.1.
Corollary 4.1. t(u_n A, φ) = u_n t(A, γ_n φ).
Theorem 4.2. If t(C, φ) = D then
Corollary 4.2. If t(C, φ) = D then t(s_C A, φ) = s_C t(A, ε_D φ).
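The commutation stated in Corollary 4.2 can be checked on a small example. The sketch below is self-contained and uses our own hedged reconstructions of updating, single substitution and the typing operator; it is an illustration of the property, not the paper's code.

```python
# Expressions: ('var', n), ('abs', B, C) for [B]C, ('app', B, C) for B{C}.

def shift(e, k, c=0):
    """Add k to every reference of depth >= c (an update operator)."""
    if e[0] == 'var':
        return ('var', e[1] + k) if e[1] >= c else e
    if e[0] == 'abs':
        return ('abs', shift(e[1], k, c), shift(e[2], k, c + 1))
    return ('app', shift(e[1], k, c), shift(e[2], k, c))

def subst(e, C, k=0):
    """Single substitution s_C: replace depth k by C, lower deeper references."""
    if e[0] == 'var':
        if e[1] == k:
            return shift(C, k)
        return ('var', e[1] - 1) if e[1] > k else e
    if e[0] == 'abs':
        return ('abs', subst(e[1], C, k), subst(e[2], C, k + 1))
    return ('app', subst(e[1], C, k), subst(e[2], C, k))

def extend(A, phi):
    """epsilon_A phi."""
    return lambda i: A if i == 0 else phi(i - 1)

def t(e, phi):
    """Type of a name-free expression relative to the context phi."""
    if e[0] == 'var':
        return shift(phi(e[1]), e[1] + 1)
    if e[0] == 'abs':
        return ('abs', e[1], t(e[2], extend(e[1], phi)))
    return ('app', t(e[1], phi), e[2])

phi = lambda i: ('var', 0)           # a hypothetical context
C = ('var', 3)
D = t(C, phi)                        # the type of C in phi
A = ('abs', ('var', 0), ('var', 0))  # a term typed under epsilon_D phi

lhs = t(subst(A, C), phi)            # type after substituting
rhs = subst(t(A, extend(D, phi)), C) # substitute into the type
assert lhs == rhs                    # typing commutes with s_C
```

Both sides evaluate to `('abs', ('var', 3), ('var', 4))` here, as the corollary predicts.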
Corollary 4.3. If t(c_i, φ) = d_c⃗ u_{i+1} φ(i) for i ∈ N, then t(d_c⃗ A, φ) = d_c⃗ t(A, φ).
Remark. For typing to commute with multiple substitution, it is necessary that this substitution is, in some sense, correct. More precisely: it is needed that the type of the expression which is substituted for a reference of depth k is equal to the type of that reference. We have seen, however, that the type of a reference may itself contain free references. Now the substitution operator indicates substitution for all free variables, and therefore the correctness requirement regards the types of the variables with the substitution carried out. The situation resembles the requirement of "fitting of strings into contexts" in Automath languages (cf. [van Daalen 80] and [van Daalen 73 (A.3)]).

4.3. Typing of expressions in a c-environment
In order to describe typing in a c-environment we need, just as above, a (possibly partial) function Θ : L → L describing the typing of constants. This function is implicit in the following. We also need a description of the types of free variables (i.e. references). In our representation we will not give these types explicitly by a function φ : N → L, as we did above, but we will code the types in such a way that they can be retrieved using the c-environment.

Let us consider an expression A on a c-environment Γ. We picture the situation in the following diagram.
Figure 11

We denote the complement of dom(Γ) by F_Γ. As we have explained in Section 3, references from A to F_Γ denote free references. In the diagram we have indicated that a reference to 1 from A denotes 0, and a reference to 3 denotes 1. References from A to 5 are possible only via a reference into dom(Γ), e.g. by a reference into Γ(4); in this case it will denote a reference 2, etc. Generally a reference to a number x in F_Γ will denote a free reference, and we have written down all these interpretations next to the dots which represent the elements of F_Γ. Looking at these interpretations, we see that they could be considered as an order-preserving map from F_Γ onto N. We will denote this map by φ_Γ and its inverse by ψ_Γ. Clearly ψ_Γ is an order-preserving enumeration of F_Γ.

Now, as references to F_Γ should be interpreted as free variables, we choose to code the types of those free variables as a function with domain F_Γ. We will put the type of the free reference with depth i on the place ψ_Γ(i). But we will not put there the type itself of the free reference, but an expression which codes this type, and which should be interpreted itself relative to the environment Γ. This leads us to the following description.

We suppose a function Ψ : F_Γ → L to be given. We will interpret this function as coding the context φ which gives the types of the variables. That is: we will define the "interpretation" U(Ψ, Γ) of Ψ with respect to Γ, and this interpretation will be φ.
Definition. Let i ∈ N; then

U(Ψ, Γ)(i) = I(Ψ(j), γ_{j+1} Γ), where j = ψ_Γ(i).
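The maps φ_Γ and ψ_Γ are easy to compute once dom(Γ) is known. A small sketch (our own, with dom(Γ) modelled as a finite set of naturals):

```python
# psi enumerates the complement of dom_gamma in increasing order;
# phi is its inverse, giving the position of a free place in that enumeration.

def psi(dom_gamma, i):
    """The i-th element (in increasing order) of the complement of dom_gamma."""
    n = 0
    while True:
        if n not in dom_gamma:
            if i == 0:
                return n
            i -= 1
        n += 1

def phi(dom_gamma, x):
    """Inverse of psi: the position of x among the complement's elements."""
    assert x not in dom_gamma
    return sum(1 for n in range(x) if n not in dom_gamma)
```

With the (hypothetical) domain {0, 2, 4}, the free places 1, 3, 5 are enumerated as 0, 1, 2, matching the interpretations written next to the dots in the diagram above.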
It turns out to be necessary to define what the influence on Ψ is when the environment Γ is cut or extended. This is done in the following definitions. In the case of extension the type of the extra variable should of course be given.
Definition. (ε_A Ψ)(0) = A, and (ε_A Ψ)(i + 1) = Ψ(i) if i ∈ dom(Ψ).

Definition. (γ_n Ψ)(i) = Ψ(i + n) if i + n ∈ dom(Ψ).
For these operations the following theorems hold.
Now we define the type T(A, Γ, Ψ) of the expression A with respect to the c-environment Γ and the function Ψ (which codes the types of the free references in A).
4.4. The theorem on typing
We finish this description by stating the following theorem.
Theorem 4.6. Let Γ be a c-environment and Ψ a function on F_Γ as described above; then we have:
5. CONCLUSIONS
The main conclusion which can be drawn from the description we have given is that the implementation of substitution we described is sound. It codes substitution without ever copying a given expression and as a consequence it is cheap both in execution time and in memory space, as far as pure substitution is concerned. In comparing expressions it might be slower than systems which use copying. We may add that it has been implemented and tried out extensively in the Automath checker, which has been used in Eindhoven since 1974.

The system can be used to implement β-reduction, η-reduction and so-called δ-reduction (i.e. application of definitions). Various strategies for deciding convertibility of typed expressions can be implemented using the system.

Our description and also the implementation have made essential use of the concept of "de Bruijn indices", that is of relative addressing of variables. We think, however, that similar implementations using absolute addressing might be possible.

A remarkable feature of the system is the difference which is made between single and multiple substitution, as single substitution could be considered as a special case of multiple substitution. The main reason for the distinction is that, when multiple substitution is applied, it is never required to consider the terms which are substituted on a "shorter" environment, as in the case described in Section 3.2. Therefore we would burden multiple substitution with superfluous administration if we tried to incorporate single substitution as a special case.

Looking back on the description it can be remarked that it uses the natural numbers for coding various tree-like structures. The advantage of this choice is that formal proofs of our theorems are possible; the disadvantage is that the intuitive background of our definitions is not as clear as might be possible in another presentation. This disadvantage is most obvious in Section 4. We have tried to compensate for it by giving informal explanations for our definitions.
Finally we situate this implementation into Landin's SECD machine (for a reference see e.g. [Glaser et al. 84]) in order to make clear what its status is. In fact our system is mainly a possible implementation of the E-part of that machine. The rest of the Automath checker has not been discussed here and is rather different from Landin's machine, and also from the reduction machine for BRL (see [Berkling & Fehr 82]), CATAR (see [Curien 86]) and other functional programming languages. The reason is that the aims of these languages differ somewhat from those of the Automath checker: the latter aims at deciding convertibility of pairs of typed λ-expressions, preferably without normalizing them, while Landin's machine and similar λ-calculus machines aim at normalization. Nevertheless our implementation of substitution could be used also for normalizing expressions.
ACKNOWLEDGEMENTS

The incentive for writing this paper came from G. Huet and T. Coquand, who asked me (in 1986) how substitution was implemented in the Automath verifier. Of course the paper could not have been written without I. Zandleven who invented (and implemented) the system. I want to thank Rob Wieringa for many valuable comments which helped me in presenting my ideas, for patiently listening to many expositions and for invaluable technical support in preparing the text. Paul Gorissen and Frans Kruseman Aretz spent much effort in reading the manuscript and suggested major improvements in the presentation.
PART F
Related Topics
Set Theory with Type Restrictions
N.G. de Bruijn
1. It has been stated and it has been believed throughout this century that set theory is the basis of all mathematics. Usually (but not always) people think of the Cantor set theory, with some formalization like the one of Zermelo-Fraenkel. It describes a universe of things called sets, and everything discussed in mathematics is somehow interpreted in this universe.

2. It seems, however, that there is a revolt. Some people have begun to dislike the doctrine "everything is a set", just at the moment that educational modernizers have pushed set theory even into kindergarten. It is not unthinkable that this educational innovation is one of the things that make others try to get rid of set theory's grip for absolute power. Mostowski is reputed to have claimed a counterexample by declaring "I am not a set". At the present state of science it seems to be impossible to find out whether this statement is true or false. Anyway, there is no safe ground for saying that everything is a set. Let us try to be more modest and say: "very many things can be coded as sets". For example, Beethoven's 9th symphony can be coded as a set. But the coding is quite arbitrary, and we are not sure that nothing gets lost in the coding. To quote a more mathematical example: Gauss' construction of the regular 17-gon may be interpretable as a set, but again such an interpretation is quite arbitrary and does not seem to be illuminating. An expression like "the intersection of the set of even integers with the set of all constructions of the 17-gon" makes sense only after the codings have been stated.

Sets have become a very important part of our language. Until 1950 many rigorous texts on mathematical analysis were written with little or no use of the language and notation of sets. This has changed considerably, but quite often the change is very superficial. It is superficial as long as it is nothing but a translation from predicates to sets.
One of the reasons for this translation may be that there is a vague opinion that a set is a mathematical object and a predicate is not. Accordingly, it is felt that someone who makes assumptions and proves theorems about predicates is a logician and not a mathematician.
Nevertheless, there still remains a tremendous use for sets in mathematics. Sets are here to stay, and we have to ask what kind of set theory we should adhere to. The question which set theory is the true set theory is not a true question, of course. It is all a matter of taste: relevant things are whether a theory is beautiful, economic, powerful, easy to manipulate, natural, easy to explain, etc. The fact that the Cantor-Zermelo-Fraenkel theory is interesting, correct, rich and deep, does not imply that it is necessarily the tool that should be available for every mathematician's use. It has some disadvantages too. One is that it makes the foundation of mathematics rather hard for the non-specialist. We have the sad situation that late in the 20th century the average ordinary mathematician has rather vague ideas about the foundation of his science. Another unpopular feature in Cantor set theory is the admission of x ∈ x, which seems to be rather far away from possible interpretations.

3. The natural, intuitive way to think of a set is to collect things that belong to a class or type given beforehand. In this way one can try to get theories that stay quite close to their interpretations, that exclude x ∈ x and are yet rich enough for everyday mathematics. Some of these theories may exclude large parts of the interesting, funny paradise of Cantor's set theory which has been explored by so many expert mathematicians. For a survey of various type theories we refer to [Fraenkel et al. 58].

4. In this paper we shall try to make a plea for a kind of type theory where the use of types is very similar to the rôle of types in cases where the objects to be discussed are not sets.

Let us first note that natural languages are confusing when dealing with types. The word "is" is used for too many things. We say "5 is a number", "5 is the sum of 1 and 4", "5 is the sum of two squares". It is only in the first sentence that "is" can refer to a type. We shall use the symbol ε for this: 5 ε number. We shall call such a formula a typing.

5. We think of a type theory where the type of an object is unique. If A ε B then B is completely determined by A. This seems to drift us away from the idea that B is something like a set and that A is a member of B, and we have to be careful not to confuse the typing symbol ε with the membership symbol ∈, although there is a conceptual similarity.
We of course run into circumstances where we want to say that our number 5 is also a complex number: 5 ε complex number. We have to make the distinction between the real number 5 and the complex number 5 in order to maintain that in A ε B the B is completely determined by A. It is a bit awkward; we have to talk a great deal about identification and embedding (but in the Cantor-Zermelo-Fraenkel theory this is not any better). Yet it should be done; let us not forget that most mathematicians would hesitate to identify the real number 5 with the 2 × 2 matrix (5 0; 0 5), and the latter situation is really not very much different from the one with the complex numbers.
6. Let us first explain some other cases where ε plays a rôle. If B is a theorem and if A is a proof for B, then we can write A ε B. The theorem can have several proofs, but a proof proves just one theorem. Another example: let B be a statement of constructibility of some geometrical figure that can be constructed by means of ruler and compass, and let A be a description of one of the constructions. We again write A ε B. There can be several different A's to a given B, but if the construction A is given, there is no doubt about what it constructs. A third example: let A be a computer program and let B be a description of what the execution of the program achieves (i.e., A describes the syntax and B the semantics).

In all these cases the A's and B's may depend on several variables (of certain types), and the results A ε B may be transformed into other results A' ε B' by means of substitution. Moreover, in all these cases there is a possibility to introduce a name for a thing in B if we do not actually have that thing. There are two ways for this.

(i) The thing can be introduced as something primitive and fixed. For example Peano's first axiom says that there is a natural number to be denoted by 1, and nothing is assumed about it at that stage. An example with a different interpretation is that B is a proposition and we say that its truth is assumed, i.e. that B is an axiom. From then on, B plays the same rôle as a theorem: we act as if we have a proof, i.e. we have something ε B that we do not wish to describe. Let us now look at the case of geometrical constructions. We want to express that the possibility to connect the distinct points P and Q by a straight line is a primitive construction, i.e. a construction that cannot be described in terms of simpler constructions. That is, we act as if we have a fixed thing ε B, where B is a statement of constructibility.

(ii) The thing can be introduced as a variable. Its validity is restricted to a piece of text (a "block") that is opened by the introduction of the variable; that is why we call it a block opener. The variable is introduced by stating its type: if its name is x, we write something to the effect of "let x ε B". If B is a type like "number" or "point", then this phrase "let x ε B" sounds quite familiar. If B is a statement, however, we may interpret the phrase "let x ε B" as "assume that B is true". That is, we act as if we have a proof for B. (This is not the same thing as the introduction of B as an axiom: "let x ε B" does not reach beyond the block opened by x, and secondly, we can substitute A for x if we later get any proof A for B.) There is a slightly unfamiliar feature: most mathematicians have not got used to giving names to proofs, and here we give names even to would-be proofs.
7. The parallels between the various interpretations of typings are very strong indeed. The mechanism of substitution is the same for the various interpretations, and actually the various interpretations are happily intermingled. Everything that is said in mathematics is said in a certain context. That context consists of a string of variables (block openers), each one having been introduced as a thing of a certain type. The type of the second variable may depend on the first variable, etc. In such a string some of the variables have to be interpreted as conventional mathematical objects (like numbers, points), others as would-be proofs for assumptions. The linguistic treatment makes no difference as to the interpretations.

8. The above characteristics are the common root of the mathematical languages of the Automath family [de Bruijn 70a (A.2)]. The definitions of these
languages hardly contain anything on logic or on the foundation of mathematics. Notions like "truth", "theorem", "proof", "set", "definition", "and", "implies", "inference rule" are either things that can be explained by means of the language (like any other piece of mathematical material) or else they are only meant to emphasize pieces of text to a reader who likes to have a feeling for motivation. To mention an example: a definition, an abbreviation and a theorem have the same linguistic form. It would not be necessary to distinguish between these three, if it were not for the fact that "readability" has something to do with the relation to conventional modes of expressing mathematics.

The languages of the Automath family have the property that books written in these languages can be checked for syntactic correctness by means of a computer. We emphasize that syntactic correctness guarantees that the interpretations of the text are correct mathematics. Note that various uses of the typing symbol ε can occur in one and the same piece of text, and therefore we can pursue a kind of unification of mathematical theories.

It is not the right place to go into a complete exposition of these languages, but one thing should be made clear; just as they admit to introduce objects of a given type, and to build new objects by means of old ones, it is equally possible to introduce new types (by way of variables or of primitive notions) and to build new types in terms of old. For this purpose we create the extra symbol type and we write things like "number ε type", "let B ε type", etc.

9. Having such type languages available as relatively simple tools, we are induced to base mathematics on a type theory where types can be constructed as abundantly as other mathematical objects, i.e., where types may depend on parameters, are defined under certain assumptions only, and where types can be introduced as variables or as primitive notions.
10. There are various ways to do set theory in such a system. One possibility is that we take a primitive type called SET, and from then on, we write A ε SET for every A which we want to consider as a set. We can write the complete Cantor-Zermelo-Fraenkel theory this way. The relations A ∈ B, A ⊆ B are relations that have a meaning whenever A ε SET and B ε SET. There is not the slightest danger to confuse ε and ∈. The ∈ is a relational symbol just like any other; it does not occur in the language definition.

There is a second, entirely different way, that implements set theory with types, in the sense of the "5 ε number" mentioned before. Now the symbol ∈ means something like ε. If B is a type, and if P is a predicate on B, we form the set S of all A with A ε B for which P(A) is true. So sets in B correspond to predicates on B. We write S ⊆ B, and we define ∈ by saying that A ∈ S means P(A). Quite often we like to consider S as a new temporary universe, i.e. we wish to have A ∈ S in the form of a formula with an ε. To that end we create a type called OWNTYPE(B, P) and a one-to-one mapping of that type onto S. Some of this work can be simplified by special notation we shall not develop here; such notation can be used both for ordinary and for automated reading.

11. In order to work with the predicates mentioned in the previous section, we want some kind of typed lambda calculus. It is roughly this. If B is a type, and if for every x ε B we have a formula of the form A(x) ε C(x), then we want to write

[x : B]A(x) ε [x : B]C(x).
The left-hand side is the function that sends x into A(x), defined for all x ε B; the conventional notation in non-typed lambda calculus is λ_x A(x). The right-hand side is slightly unconventional; in the case that C(x) does not depend on x one may think of the class of all mappings that send B into C.

This kind of lambda calculus is part of the language definition, independently of the mathematical axioms we are going to write in our books. So there is a primitive idea of mapping available before sets are discussed. In particular, predicates are such mappings, so if sets are introduced by means of predicates, they already require the lambda calculus. Later, one can show that the concept of a mapping as a subset in a Cartesian product is equivalent to the notion of mapping provided by the lambda calculus.
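The two ways of introducing sets sketched in Sections 10 and 11 have close analogues in modern type-theoretic proof assistants. The following Lean fragment is our own illustration (the names are ours, and Lean's notation replaces the [x : B] of the text): a function typed by a dependent product, and OWNTYPE(B, P) rendered as Lean's subtype.

```lean
-- Section 11: if for every x ε B we have A(x) ε C(x), then
-- [x : B]A(x) ε [x : B]C(x).  In Lean the product is (x : B) → C x.
def A (x : Nat) : Fin (x + 1) := 0      -- A(x) ε C(x) for each x
example : (x : Nat) → Fin (x + 1) := A  -- the abstraction inhabits the product

-- Section 10: a set S in B given by a predicate P on B ("A ∈ S means P(A)").
-- OWNTYPE(B, P), with its injection into B, corresponds to Lean's subtype.
def P (n : Nat) : Prop := n % 2 = 0
def OwnType : Type := { n : Nat // P n }
def four : OwnType := ⟨4, rfl⟩          -- an element of B plus a proof of P
```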
12. Cantor produced his paradise by means of linguistic constructions. (This created considerable controversies in his time, since he did not specify his language.) Now let us see what we get by linguistic constructions in our typed set theory. Assume we introduce (by means of an axiom) the type N of all natural numbers (and we take a set of axioms like Peano's). Then we have, whether we want it or not, subscribed to N^N, to N^(N^N), etc., since the lambda calculus prescribes that we accept the type of all mappings of N into N, etc. However, it seems (we use the phrase "it seems" since no formal proof has been given thus far) that we cannot form something of the strength of the union

N ∪ N^N ∪ N^(N^N) ∪ ...    (1)

The reason is not that we would not allow ourselves to form the union of a countable number of types. That will be provided, anyway, by an axiom we would not like to live without. The reason is that we are unable to index the sets of the sequence (1) in our language. The indexing we want is N_1 = N, N_2 = N^(N_1), N_3 = N^(N_2), ..., and this is in terms of our metalanguage, since it requires a discussion of something like the length of a formula. This is a little detail Cantor never made any trouble about.

The fact that the union (1) is "inaccessible" does not mean that bigger types are forbidden. After all, we can just start saying "let B be any type (i.e. B ε type)" and we can make assumptions about B that cannot be satisfied by the types N, N^N, ... . The world where we have N, N^N, ..., but where (1) is "inaccessible", is a world most mathematicians will doubtlessly find big enough to live in. For those who want to have a bigger world, where they cannot be troubled by people asking for interpretations, there is a simple way out: they just take a type SET and provide it with Zermelo-Fraenkel axioms. If they want to have the picture complete, they will not find it hard to embed the types N, N^N, ... into a small portion of their paradise.
13. Having discarded the idea that every mathematical object is a set, we should be careful not to fall into the next trap. We might like to say that a mathematical object is either some B with B ε type or an A with A ε B (where B ε type). However, the situation can be more complex than this. Let us consider the notion "group" that occurs in the sentence "let G be a group". What we want to say is something like this: assume we have a type A, that we have in A a set B, that in B we have a multiplication rule, that the multiplication is associative, etc. The object we want to handle can be denoted by a string of identifiers x_1, ..., x_k, where x_1 ε A_1, x_2 ε A_2, ..., x_k ε A_k, but where A_2 may depend on x_1, A_3 on x_1 and x_2, etc. It is not as if the string A_1, ..., A_k were something type-like, and x_1, ..., x_k were something chosen in it. Accordingly, we cannot write "let G be a group" as a single typing "G ε group". We can of course create, by means of a set of axioms, a new type "group", but that is a poor remedy: we cannot afford to adopt axioms for every new notion we like to introduce.
14. In Section 10 we compared two different ways to talk about sets by means of typings. The choice between the two has a more general aspect; viz. the question whether we shall or shall not aim at minimal use of typings. The word "minimal" refers to the number of different uses of the typing symbol. In order to say what we mean, we describe a kind of minimal system that seems to be in the spirit of basing mathematics on Zermelo-Fraenkel set theory. In the first place we use typings ... ε SET (as in Section 10). Secondly we create a type called BOOL, and we use "A ε BOOL" in order to express that A is a proposition. Finally, for every X with X ε BOOL we create a type called TRUE(X), and we use the typing P ε TRUE(X) for expressing that P is a proof for the truth of X. In this minimal system, the use of typings of the form ... ε type is restricted to the above-mentioned three instances right at the beginning of the book of mathematics.

The author thinks that talking mathematics in such a minimal system is not always the natural thing to do. There is much to be said for a more liberal use of typings, where typings of the form ... ε type are used throughout the book. Let us consider the geometrical constructions mentioned in Section 6. It seems natural to use A ε B for saying that A describes a construction and that B says what has been constructed. Let us say that we have created, for every point P, a type CONSTR(P). Hence the statement A ε B has the form A ε CONSTR(P). If we want to phrase this in our minimal system, we get something as follows. The point is a set (P ε SET), and so is some coded form A* of A (A* ε SET). We form a proposition q(P, A*) (so q(P, A*) ε BOOL) that says that A* is a construction for P. Finally we need a proof S for this proposition, whence we write

S ε TRUE(q(P, A*))
for what was A ε CONSTR(P) in the liberal system. In the latter case it is not necessary to provide a proof corresponding to S, since the type of A can be determined by a simple algorithm. This example shows two advantages of liberal use of typings: one is that many unnatural codings can be suppressed, the other one is that a higher degree of automation can be achieved. Yet there are many other advantages, of which we mention two: (i) we are neither forced nor forbidden to introduce the types SET, BOOL and TRUE(X); (ii) there is a possibility that one and the same piece of text gives rise to various pieces of standard mathematics, just by the use of different interpretations.
Formalization of Constructivity in Automath
N.G. de Bruijn
1. INTRODUCTION

There are various systems in which a large part of mathematical activity is formalized. The general effect of the activity of putting mathematics into such a system is what one might call the unification of mathematics: different parts of mathematics which used to be cultivated separately get united, and methods available in one part get an influence in other parts. Very typical for twentieth century mathematics is the unifying force of the concepts of set theory. And today one might say that the language of mathematics is the one of the theory of sets combined with predicate logic, even though one might disagree about the exact foundation one should give to these two.

Not everyone thinks of set theory and logic as being parts of a single formal system. Set theory deals with objects, and logic deals with proofs, and these two are usually considered as of a different nature. Nevertheless, there are possibilities to treat these two different things in a common system in a way that handles analogous situations analogously indeed. A system that goes very far in treating objects and proofs alike is the Automath system (see [de Bruijn 80 (A.5)]).

In Automath there are expressions on three different levels, called degrees. Each expression of degree 3 has a "type" that is of degree 2, and each expression of degree 2 has a type of degree 1. Expressions of degree 1 do not have a type. There are two basic expressions of degree 1, viz. type and prop. The word type should not be confused with the word type used more or less colloquially when saying that each expression of degree 2 or 3 has a type. We denote typing by a colon. If A has B as its type, we write A : B. So we can have

A : B : type    (1)

and also

C : D : prop.    (2)
The interpretation of (1) is that A is the name of an object (like the number 3), and that B is the name of the class from which that object is taken (it might be a symbol for the set of integers). The interpretation of (2) is that C is a name for a proof, and that D somehow represents the statement that is proved by C.

The main profit we have from this way of describing proofs and objects is the matter of substitutivity. If we have described an object depending on a number of parameters, that description can be used under different circumstances by means of substitution: we replace the formal parameters by explicit expressions. The same technique is applicable to theorems: a theorem is intended for many applications, and such applications can be effectuated by substitution. The conditions of the theorem are modified by these substitutions too. If we study the matter more closely, we see that some of the parameters are object-like, and others are proof-like. The substitution machinery is the same for both. All this is effectively implemented in the Automath system.
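The typings (1) and (2) have direct counterparts in modern systems descended from Automath. As a hedged illustration (ours, not Automath syntax), in Lean:

```lean
-- (1): an object A typed by B, where B is itself of degree 1 (a Type)
def A : Nat := 3
#check A           -- A : Nat, and Nat : Type

-- (2): a proof C typed by a proposition D, where D : Prop
theorem C : 2 + 3 = 5 := rfl
#check C           -- C : 2 + 3 = 5, and (2 + 3 = 5) : Prop
```

Here, as in the text, the object-level and the proof-level typings are handled by one and the same mechanism.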
2. ORIENTATION ON GEOMETRICAL CONSTRUCTIONS

On the fringe of mathematics there are mathematical activities which seem to be of a kind that does not fit into the pattern of objects and proofs. One such thing is the matter of geometrical constructions, a subject that goes back to Greek mathematics. A construction is neither an object nor a proof, but constructions are discussed along with geometrical objects, and along with proofs that show that the constructions construct indeed what is claimed to be constructed. Since these geometrical constructions can also admit substitution for formal parameters, there is a case for creating facilities which handle a new kind of things along with objects and proofs. So we can think of a system that handles objects, proofs and geometrical constructions in more or less the same way. If we think of geometrical constructions, there is a peculiarity that may not arise easily with other kinds of constructions: it is the matter of observability. Let us study a particular example in order to stress this point. Let there be given four points A, B, C and D in the plane. We assume that A, B and C are not on a line. Let M be the centre of the circle through A, B and C. We wish to construct the point P that is defined as follows. P is obtained from D by multiplication, with M as the multiplication centre, and multiplication factor 1, 2 or 3. The factor is 1 if D lies inside the circle, 2 if D lies on the circle, and 3 if D lies outside the circle. If we want to carry out the construction of P, we have to know whether we are allowed to observe what the position of D with respect to the circle is. In particular this problem comes up for the practical question what should happen if there is insufficient precision for concluding whether D is inside or outside. If we think of a construction with actual physical means like paper, pencil,
Formalization of constructivity in Automath (F.2)
851
ruler and compass, then the case of D lying exactly on the circle is, of course, undecidable. The above construction problem may seem to be very artificial, but yet its main characteristic turns up in very many geometrical constructions: it is the fact that, at some point of the construction, the result of some observation will decide the further course of the construction. An example where this will happen is the case of geometrical constructions that have to be carried out inside a given finite part of the plane. The naive approach to observability may be formulated as the slogan “truth is observable” (see Section 4). Other possibilities will be sketched in Sections 8-10. A further thing one might like to formalize is selectability: one wants to be able to select an object from a set of objects one has constructed. For example, a construction of the intersection of two circles may produce two points, and we may wish to be able to “take one of them”. In this case such a selection principle is not indispensable: one might describe the effect of the construction of the intersection as giving a labelled “first point” and a labelled “second point”. But there is a stronger reason for implementing a selection principle: so often we have to “take an arbitrary point” at some stage of a construction. It should be noted that in such cases the final result of the entire construction does not depend on the particular point that was taken. In Section 5 we come back to this, in particular to the matter of the difference between “giving” and “taking” arbitrary points. A description of all these features is possible in Automath. We have various options for doing it. The way we present this matter is necessarily arbitrary. It is certainly not the intention of this note to give a particular basis for geometrical construction theory. The only thing that will be attempted is to provide a framework into which such a basis might be placed.
If we formalize a thing like constructability we of course dislike to do it in the style of classical logic. We do not want to consider constructability of a point as a proposition in the ordinary sense. We do not want to admit arguments where we get a contradiction from the assumption that the point P is not constructable, and then conclude the constructability of P. Therefore we want to put constructability (and the same thing might apply to observability and selectability) in a framework of positive logic, where we have no negation at all. In fact we can be even more restrictive, and refrain from introducing the ordinary logical connectives (like ∧, ∨, →) for this logic. The only thing we want to do is to register statements about constructability, observability and selectability (possibly provided with a number of parameters), and to keep them available for later use.
We can provide facilities for such a positive logic in Automath by adding a new expression of degree 1, to be called pprop (the first p stands for “positive”). For this pprop we shall not proclaim any logical axioms, and we shall not introduce the notion of negation. Moreover, we do not feel the need to have abstraction in the world of pprop. That is, if u : pprop we shall not take abstractors [x : u] like we would have in cases with prop or type. Accordingly, in this pprop world we shall not consider application either. That means: we take pprop entirely in the style of PAL (see [de Bruijn 80 (A.5)]). There is a case for doing something similar in the world of type. Let us create a new expression of degree 1, to be called ctype (the ‘c’ stands for ‘construction’, since we intend to use it in the world of constructions). The difference between ctype and type is similar to the difference between pprop and prop. In ctype we intend to be free from all the assumptions that might have been made about type. In particular we shall not necessarily implement set-theoretical notions. And we shall not even introduce the notion of equality. That is, if a : C : ctype and b : C : ctype, then we will not introduce the equality of a and b as a proposition. Moreover, we shall treat ctype entirely in the style of PAL: no application and no abstraction. For a description of Automath versions where various sets of rules apply to various expressions of degree 1, we refer to [de Bruijn 74a]. It has to be admitted that geometry is not the easiest example for the study of constructions. It is not so much the fact that the geometrical universes, like planes and spaces, are uncountable. Nor is the most troublesome thing that in the geometrical plane there is no fixed origin and no fixed direction. The real source of trouble is that there are so many situations where we have to except some of the cases.
If we want to say that points p and q have just one connecting line we have to exclude the case p = q. Such things cause a steady flow of exceptions, which even has distorted the meaning of the word “arbitrary”. In past centuries the word “arbitrary” often had the meaning: “arbitrary, but avoiding some obvious exceptions”, and these exceptions were usually unspecified. If one took an arbitrary point and an arbitrary line then the point should not be so arbitrary as to lie accidentally on the line! A full description of all these exceptions has the tendency to make geometrical construction theory unattractive. Yet there is still another source of irritation: so often we have to split into cases (two circles may have 0, 1 or 2 points of intersection), and these situations might pile up to an entangled mess. Nevertheless we may be grateful to geometry for having confronted us with the notion of constructability. What we have learned from geometry might be applied to other areas. Computer science might be one of them. Observability, as a formal element in geometrical construction theory, was considered by D. Kijne [Kijne 62]. That paper also attempts a formal treatment
of selectability (with selection from finite sets only), and considers “giving arbitrary points” by means of a kind of algebraical adjunction operation.
3. THE BASIS OF FORMAL GEOMETRY
Before we discuss a formal basis for geometrical constructions we have to say what “formal geometry”, or more generally, formal mathematics is. Here we are not concerned about the contents of formal geometry, but just about the spirit in which it is written. We may assume that it is written in an Automath book, using the full power of typed lambda calculus. And that it is written in a setting of logic and set theory, the details of which are still open to discussion. One might or might not take the rules of classical logic (e.g. in the form of the double negation law), and we might differ in taking or not taking a thing like the axiom of choice. Such distinctions hardly influence the spirit in which geometry is presented. They might influence the content, i.e. the set of all provable geometrical statements (but it should be remarked that there are areas of mathematics which are much more susceptible to foundational differences than classical geometry seems to be). Just to give an idea of the spirit, we give a small piece of Hilbert’s axiomatization of geometry. Hilbert starts with: there are things we call points and there are things we call lines (in Hilbert’s system the notion of a line is not presented as a special kind of point set). In Automath we say this by creating primitive types “line” and “point”. These types are undefined, just introduced as primitive notions (PN’s). As a primitive we also have the notion “incidence” of a point and a line. Next we can express axioms like: if two points are different, then there is exactly one line incident to both points. Something should be said about “different”. We take it that our geometry text is written in a mathematics book in which for any two objects a, b of type A there is a proposition that expresses equality of a and b, and that for any proposition we can form the negation. In this way the fact that a and b are different can be expressed in Automath by means of NOT(IS(A, a, b)).
But in order to keep this paper readable we shall just write a # b instead of this. We now give a piece of Automath text that can be considered as the start of a Hilbert-style geometry book (we display our Automath texts in a flag-and-flagpole format: the block openers are written on flags, and the poles indicate their range of validity).
point := PN : type
line := PN : type

[p : point]
  [m : line]
    incident := PN : prop
  [q : point]
    [pr : p # q]
      conn := PN : line
      ax1 := PN : incident(p, conn)
      ax2 := PN : incident(q, conn)
      [m : line]
        [i : incident(p, m)]
          [j : incident(q, m)]
            ax3 := PN : m = conn

So if p, q are points, and m is a line, then incident(p, m) is a proposition; if pr is a proof of p # q then conn(p, q, pr) is the connecting line of p and q. In axioms 1 and 2 we have expressed that this line is incident to p and q; in axiom 3 it is stated that if a line m is incident to both p and q then m is equal to the connecting line. Although the above fragment is still a meagre piece of geometry it is hoped that it shows the spirit of a formalization. We shall refer to such a presentation of geometry as G.
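The same primitive notions and axioms can be sketched in a present-day proof checker. The Lean 4 fragment below is offered only as an illustration of the Automath text above, with Lean's ≠ and = standing in for the book's # and equality; it is not the original formalization.

```lean
axiom point : Type
axiom line  : Type
axiom incident : point → line → Prop
-- for a proof pr of p ≠ q, `conn p q pr` is the connecting line
axiom conn : (p q : point) → p ≠ q → line
axiom ax1 : ∀ (p q : point) (pr : p ≠ q), incident p (conn p q pr)
axiom ax2 : ∀ (p q : point) (pr : p ≠ q), incident q (conn p q pr)
axiom ax3 : ∀ (p q : point) (pr : p ≠ q) (m : line),
    incident p m → incident q m → m = conn p q pr
```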
4. A NAIVE APPROACH TO OBSERVABILITY
What we shall call the naive approach is expressed by the slogan “Truth is Observable”. Let us explain what this means by mentioning two cases. In the first case we use knowledge obtained from geometrical theory G in order to prove that some object we have to construct is already in our possession. We do not bother whether that proof is “constructive” or not: truth is just truth. One might find this a poor example, since within the scope of usual geometrical theories and usual constructions it seems that “non-constructive” proofs can
always be replaced by very constructive ones, but it is easy to imagine fields where the situation is different. In the second case we have a construction that started from a point that was chosen arbitrarily. At some stage of the construction we have a point P and a circle c, and subsequently our course of action depends on whether P lies inside c, outside c or on c. The naive point of view says that on the basis of the theory in G we have exactly one of the three alternatives. We can observe which one of the three occurs, and we act accordingly. In Sections 6 and 7 we offer two different implementations for the naive point of view.
5. TAKING ARBITRARY OBJECTS

Before going on, we have to make it clear that there are two entirely different situations where in traditional geometry it was said that an arbitrary object (like a point) was taken. Let us call these situations D and S (these letters abbreviate “data” and “selection”). If we think of a problem where a teacher requires a pupil to construct something, then D is the case where the data have been chosen arbitrarily by the teacher. On the other hand, S is the case where the pupil, in the course of the construction, selects some point arbitrarily. Quite often the final result does not depend on the particular point that was chosen, but there may be other cases. It may happen that the final result itself has a kind of arbitrariness. An example: given points A, B and C, not on a line, construct a point inside the triangle formed by A, B and C. In the opinion of the pupil, the points taken in situation D are not called “arbitrary”: they are called “given” or possibly “arbitrarily given”. The pupil has no freedom in case D. In the S-case, however, the pupil is completely free, and the teacher has no say in the matter. In a formal presentation like in Automath the difference between D and S is very pronounced. D is effectuated by means of the introduction of a new variable, S is implemented by means of a primitive notion (PN). We shall show this in detail in Sections 6 and 7. There is something about the PN-implementation of the S-situation that might be felt as strange. If we describe a construction by such a PN, then we select exactly the same point if we are requested to do the construction a second time. If the second time we would insist on selecting a point that is actually different from the one chosen the first time, then we have to do this on the basis of some new selection principle, of course.
But if we just want to take a point again, without any restriction as to its being different from or equal to the first one, our PN provides us with the same point we had before. This means that we
get more information than we intended to have. Nevertheless, such information cannot possibly do any harm. What shall we do about this weirdness of the PN-implementation? Shall we invent unpopular remedies in order to cure a completely harmless disease? Let us not prescribe a definite attitude in this, and admit that there are several ways to live with the situation. Either we leave the harmless disease for what it is, or we take one of the remedies. Let us mention two remedies. The first one is to take a notion of time t, and adhere a value of t to every construction step. The arbitrarily selected points will depend on t. If we have to repeat the construction some other day, t has a different value, so nothing is known about the selected point in comparison to the one selected the previous day. As a second remedy we suggest to implement arbitrary selection not by an axiom but by some axiom scheme. The scheme proclaims the right to create as many copies of the axiom as one might wish, each time with a different identifier. We leave it at these scanty remarks. The author’s opinion is that unless we invent a much simpler cure, we’d better learn to live with the harmless disease.
6. FIRST IMPLEMENTATION OF THE NAIVE POINT OF VIEW

We have to express in some way or other that some of our mathematical objects have been constructed. This can be thought of as a property of those objects, but for reasons sketched in Section 2 we prefer to take this property as a pprop rather than as a prop. We shall create, for every type X and for every z of type X, the expression have(X, z) with have(X, z) : pprop. In particular we can abbreviate have(point, z) to havep(z) and have(line, z) to havel(z). (Since we use “have” for points and lines only, one might think of taking just “havep” and “havel” as primitives, without taking “have” for general types.) We now give some Automath text. It is supposed to be added to a book that contains geometrical theory G (see Section 3) already. First we introduce “have”, and abbreviations “havep” and “havel”.
[X : type]
  [z : X]
    have := PN : pprop

[u : point]
  havep := have(point, u) : pprop

[v : line]
  havel := have(line, v) : pprop

Next we display how we take an arbitrary object in the sense of the D-situation of Section 5 (“given objects”). In order to talk about a given point we need two block openers, expressing (i) that u is a point, and (ii) that havep(u) holds; inside that context the point u can be considered as given. We shall now express: if u and v are given points and if u # v then we can construct the line connecting u and v. According to our naive point of view the condition that u and v are different is simply expressed in the terminology of G.
[u : point]
  [ass11 : havep(u)]
    [v : point]
      [ass12 : havep(v)]
        [ass13 : u # v]
          ax11 := PN : havel(conn(u, v, ass13))
Next we describe a case of “taking arbitrary points” in the S-situation of Section 5. We express that if m is a given line then we are able to take a point not on m (we use the identifier “ap” to suggest “arbitrary point”).
[m : line]
  [ass14 : havel(m)]
    ap := PN : point
    ax12 := PN : NOT(incident(ap, m))
    ax13 := PN : havep(ap)
These pieces of text display the form in which the basic constructions are introduced. If we want to describe a more complicated construction, we mention the relevant objects one by one, in the order of the construction, and each time we express that we “have” them. We give a (still very simple) example.
[p : point]
  [ass14 : havep(p)]
    [q : point]
      [ass15 : havep(q)]
        [ass16 : p # q]
          L1 := conn(p, q, ass16) : line
          H1 := ax11(p, ass14, q, ass15, ass16) : havel(L1)
          P1 := ap(L1, H1) : point
          Ni1 := ax12(L1, H1) : NOT(incident(P1, L1))
          H2 := ax13(L1, H1) : havep(P1)

Here L1 is an abbreviation for the line connecting p and q; H1 can be used as a reference for the fact that we actually have that line. P1 is the result of the construction, Ni1 assures us that P1 does not lie on L1, and H2 assures us that we actually have P1. Altogether the text lines with identifiers P1, Ni1, H2 represent the “derived construction” expressing that if p and q are given different points then we can take a point P1 such that p, q, P1 are not on one line. This derived construction can be applied later without referring to how it came about. It can be considered as a kind of “subroutine”. The example of a derived construction we gave here is ridiculously simple, of course. Yet the pattern is the same as in more complicated cases. It shows the old idea of subroutines, which existed in constructive geometry many centuries before it came up in computer programming.
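The derived construction can also be written down as a definition that packages the primitive steps. The Lean 4 sketch below is an assumed rendering of the displays of this section (not the Automath original); the axiom signatures follow the text, with ≠ for # and ¬ for NOT.

```lean
axiom point : Type
axiom line  : Type
axiom incident : point → line → Prop
axiom conn  : (p q : point) → p ≠ q → line
axiom havep : point → Prop
axiom havel : line  → Prop
-- ax11: connecting two given, different points yields a line we "have"
axiom ax11 : ∀ (u : point), havep u → ∀ (v : point), havep v →
             ∀ (h : u ≠ v), havel (conn u v h)
-- ap/ax12/ax13: from a line we "have" we may take a point off it
axiom ap   : (m : line) → havel m → point
axiom ax12 : ∀ (m : line) (h : havel m), ¬ (incident (ap m h) m)
axiom ax13 : ∀ (m : line) (h : havel m), havep (ap m h)

-- the derived construction ("subroutine"): a point off the connecting line
def P1 (p : point) (h1 : havep p) (q : point) (h2 : havep q)
    (h3 : p ≠ q) : point :=
  ap (conn p q h3) (ax11 p h1 q h2 h3)
```

Later applications can then use P1 without referring to how it came about, exactly as the text describes for the Automath subroutine.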
7. SECOND IMPLEMENTATION OF THE NAIVE POINT OF VIEW In the second implementation we take a construction plane which we conceive as being different from the geometrical plane. We might think of the original geometrical plane as abstract, and the construction plane as concrete, consisting
of a piece of paper we can draw on. But, of course, our construction plane is still abstract: it is a mathematical model of a concrete plane. The objects in the construction plane will be called cpoints and clines. In the back of our minds we think of a one-to-one mapping between the two planes: every cpoint has a point as its companion, and every cline has a line as its companion. Yet we shall not express all of this in our mathematical formalism. We shall just talk about a mapping (to be called semp) of cpoints to points and a mapping (to be called seml) of clines to lines. The reason for this reticence lies in the interpretation. If p1 is a point, and if we are able to name a cpoint cp1 that is mapped to p1 in our mapping, then for us this means that we “have” p1. We do not want to say that every point in the geometrical plane is a point we “have” just by being able to express that point mathematically. Therefore we do not want to be able to express the inverse mapping. Related to this reticence is the fact that we do not want to be able to discuss equality of two cpoints. Such equality has to be discussed for the companion points in the geometrical plane. And we do not want to admit as mathematical objects things like “the set of all cpoints” with some prescribed property. We achieve these restrictions by putting “cpoint” and “cline” into ctype, which is a world without equality, without set theory, without quantification. As a consequence we do not have constructability questions in our theory. A statement: “the point P is not constructable with ruler and compass” will not be a proposition in our Automath book. If we would be able to quantify over the construction plane we would be able to express that “there is no cpoint that is mapped onto P” and that would express the non-existence of the construction. Constructability questions belong to the meta-theory.
They express that something “cannot be obtained on the basis of the PN’s displayed thus far”, and we cannot say such things in Automath itself. What we call our second implementation starts with the introduction of cpoint, cline and the mappings semp and seml. The latter abbreviations suggest the word “semantics”: we might say that the geometrical plane forms the semantics of the construction plane. If P is a cpoint then semp(P) is its semantics. Off we go:

cpoint := PN : ctype
cline := PN : ctype

[cp : cpoint]
  semp := PN : point

[cl : cline]
  seml := PN : line
In order to take an arbitrary point in the construction plane, a single block opener “x : cpoint” plays the role of the pair “u : point”, “ass11 : havep(u)” of the first implementation. We show this with the fundamental construction that connects two points:

[x : cpoint]
  [y : cpoint]
    [ass21 : semp(x) # semp(y)]
      cconn := PN : cline
      ax21 := PN : seml(cconn) = conn(semp(x), semp(y), ass21)

The fact that cconn is the line we are looking for, is expressed (in ax21) by means of equality in G. If we have to take an arbitrary point in the S-situation we again get one PN less than in the corresponding case of Section 6. In order to express that we can take a point outside a line, we write:

[cm : cline]
  acp := PN : cpoint
  ax22 := PN : NOT(incident(semp(acp), seml(cm)))
We also show the text corresponding to the one with P1, Ni1, H2 in Section 6:

[p : cpoint]
  [q : cpoint]
    [ass22 : semp(p) # semp(q)]
      CL1 := cconn(p, q, ass22) : cline
      CP1 := acp(CL1) : cpoint
      Ni2 := ax22(CL1) : NOT(incident(semp(CP1), seml(CL1)))
      Ni3 := ... : NOT(incident(semp(CP1), conn(semp(p), semp(q), ass22)))
We have not displayed the proof Ni3. It will depend on applying general axioms about equality, and will make use of Ni2 and ax21. Passages like the one from Ni2 to Ni3 might be superfluous in many cases, since it is practical to keep the discussion as long as possible in the construction plane. To that end we might copy notions from G to the construction plane. The simplest example is
[x : cpoint]
  [y : cline]
    cincident := incident(semp(x), seml(y)) : prop
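In the same illustrative Lean 4 rendering used earlier (an assumed stand-in, not the Automath book), the reticence of the second implementation shows up naturally: semp and seml are one-way functions into the geometrical plane, with no inverse and no equality on cpoints, and copied notions such as cincident are ordinary definitions.

```lean
axiom point  : Type
axiom line   : Type
axiom incident : point → line → Prop
-- the construction plane: separate types, mapped one way into G
axiom cpoint : Type
axiom cline  : Type
axiom semp : cpoint → point   -- the "semantics" of a cpoint
axiom seml : cline  → line

-- a notion copied from G to the construction plane
def cincident (x : cpoint) (y : cline) : Prop :=
  incident (semp x) (seml y)
```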
8. RESTRICTED OBSERVABILITY

In Sections 4, 6, 7 we described the naive point of view, where every truth in the geometrical theory is considered to be “observable”. Observability has its meaning in the process of taking decisions about the course of our constructions. Let us describe two different motives for restricting observability. One is practical, the other one is fundamentalistic. We shall discuss these in Sections 9 and 10, respectively.
9. PRACTICAL RESTRICTIONS ON OBSERVABILITY

The practical point of view is connected to questions of precision. This can be compared to the matter of rounding off errors in numerical analysis. If in a construction two points turn out to be so close together that our construction precision does not guarantee that they are different, then we cannot claim to be able to connect them by a line. And even if the points are different, the line will be ill-defined. Although these practical matters give rise to quite complicated considerations, we cannot say that they are necessarily essentially different from what we did in Sections 6 and 7. One can still go on the basis that truth is observable: the question is just a matter of which propositions we consider the truth of. Instead of claiming the possibility to connect two points p, q if p # q in the geometrical world G, we take a thing like d(p, q) > 1 (distance exceeds unity) as our criterion. Nevertheless we can make things a little livelier than this. Let us start from what we developed in the beginning of Section 7: just the four PN’s that were called cpoint, cline, semp and seml. We now introduce a primitive notion
“obsdif” (“observationally different”) in the construction plane:

[p : cpoint]
  [q : cpoint]
    obsdif := PN : prop

And now instead of introducing the cconn, ax21, etc. of Section 7, we go on like this:

[x : cpoint]
  [y : cpoint]
    [ass31 : obsdif(x, y)]
      cconn1 := PN : cline
      ax31 := PN : semp(x) # semp(y)
      ax32 := PN : seml(cconn1) = conn(semp(x), semp(y), ax31)

Knowledge about obsdif can come from different sources. In the first place we can axiomatize things like: if d(semp(x), semp(y)) > 1 then x and y are observationally different. A second source arises if we axiomatize in the construction plane, in some situations, that if cpoints u and w are observationally different, then the cpoints x and y, derived from u and w in one way or other, are observationally different. A very simple case of this is an axiom stating that obsdif(x, y) implies obsdif(y, x). It will be clear that this subject will become very complicated without being very rewarding. Therefore it seems definitely unattractive.
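A Lean 4 sketch of this restricted version, again purely illustrative: cconn1 is only applicable once observational difference has been established, and ax31 converts that into the geometrical difference needed by conn.

```lean
axiom point  : Type
axiom line   : Type
axiom cpoint : Type
axiom cline  : Type
axiom semp : cpoint → point
axiom seml : cline  → line
axiom conn : (p q : point) → p ≠ q → line

axiom obsdif : cpoint → cpoint → Prop
axiom cconn1 : (x y : cpoint) → obsdif x y → cline
axiom ax31 : ∀ (x y : cpoint), obsdif x y → semp x ≠ semp y
axiom ax32 : ∀ (x y : cpoint) (h : obsdif x y),
    seml (cconn1 x y h) = conn (semp x) (semp y) (ax31 x y h)
-- one of the simple closure rules mentioned in the text: symmetry
axiom obsdif_symm : ∀ x y : cpoint, obsdif x y → obsdif y x
```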
10. FUNDAMENTALISTIC RESTRICTIONS ON OBSERVABILITY

In Section 9 we still had the uncritical acceptance of all truth that can be obtained in the geometrical world. There is a clear reason for restriction. If we have to use geometrical propositions for taking decisions in the world of constructions, it is reasonable to require that we also have a “constructive” way for actually deciding whether such propositions hold or do not hold. We can implement such restrictions in Automath by selecting some “constructive” basis for logic and mathematics, like intuitionistic mathematics, and building our geometry G according to these principles. We might even mix
a constructive kind of mathematics with the ordinary kind, using pprop and ctype for the constructive kind. In particular it seems to be reasonable to take the “obsdif” we had in Section 9 as a pprop rather than as a prop. The latter remark suggests that it might be simpler to shift life entirely to the construction plane, and to forget G altogether. But this is not what we usually want. Let us imagine that we want to describe the theory of Mascheroni constructions (constructions with compass but without ruler). The subject matter concerns both circles and straight lines, the constructions deal with circles only. This difference can be implemented by discussing both circles and straight lines in G, but just “cpoints” and “ccircles” in the construction plane.
11. COMPARISON WITH COMPUTER PROGRAM SEMANTICS

It is very natural to compare the field of geometrical constructions with the one of computer programming. In both cases there is a number of actions that produce one or more objects, and in both cases it is very essential that it is proved that these objects satisfy the problem specification that was given beforehand. In a computer program we usually think of a “state space”; the input is an element of that state space and the output is again such an element. In the case of geometrical constructions one would say that the input is (vaguely speaking) the given figure, and the output is the required figure. Let us admit different spaces for input space and output space, and try to describe at least the specification of a geometrical construction in terms of input and output. As an example we take the following (trivial) construction problem. Given two different points P, Q and a line m. Construct a line q that intersects m, passes through P but not through Q. Let us talk in the style of Section 7, and let us moreover decide to introduce a name R for a cpoint of intersection of q and m (otherwise we would need existential quantification). An element of the input space is a triple (P, Q, m) where P : cpoint, Q : cpoint, m : cline, and where we have semp(P) # semp(Q). An element of the output space is a pair (q, R) where q : cline and R : cpoint. The problem specification is given by the conditions that seml(q) is incident with semp(P) and semp(R) but not with semp(Q). This kind of problem specification is entirely in the style of what is called “relational semantics” in computing science. If we deal with geometrical constructions, the role of “subroutines” is more or less the same as in computer programming. In particular we can say that descriptive geometry consists of a large body of subroutines. In computer programs we can have loops. Sometimes pieces of a program
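The input space, output space and relational specification of this example can be made explicit. The Lean 4 structures below are an assumed formalization (the structure and field names are invented for illustration); the third conjunct records that R is a cpoint of intersection of q and m, as introduced in the problem statement.

```lean
axiom point  : Type
axiom line   : Type
axiom incident : point → line → Prop
axiom cpoint : Type
axiom cline  : Type
axiom semp : cpoint → point
axiom seml : cline  → line

structure Input where
  P : cpoint
  Q : cpoint
  m : cline
  diff : semp P ≠ semp Q   -- the two given points are different

structure Output where
  q : cline
  R : cpoint

-- relational semantics: which outputs are acceptable for a given input
def Spec (i : Input) (o : Output) : Prop :=
  incident (semp i.P) (seml o.q) ∧
  incident (semp o.R) (seml o.q) ∧
  incident (semp o.R) (seml i.m) ∧   -- R is the intersection with m
  ¬ (incident (semp i.Q) (seml o.q))
```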
have to be repeated until some condition is satisfied. The geometrical constructions we discussed in the previous sections have no such loops. This reveals essential restrictions on the class of constructions we can describe in the various systems that were suggested in these sections. An example of a construction with a loop is the following one. Let A, B, C be given points on a given line, B between A and C. It is required to construct a point D on that line, such that C is between B and D, and such that the length of the line segment BD is an integral multiple of the length of the segment AB. This construction requires a loop. Our treatment of geometrical constructions in Sections 3-10 might be called “operational” or anyway “functional”. All the time uniquely determined outputs are obtained step by step, and in the slightly more sophisticated case of the use of subroutines the only thing we actually do is taking sequences of steps together and considering them as a single step. The reason is that the treatment is based on what we shall call the interior approach. In the interior approach we talk in terms of the constructed objects. The constructed objects are treated in the same style as ordinary mathematical objects and (but this is a typical Automath feature) proofs. In our Automath book we discuss the objects, but the action of construction is felt as subject matter of some metalanguage. An entirely different way to deal with constructions is that we consider constructions as objects, seemingly more abstract than the ordinary objects, but nevertheless on the same linguistic level. Let us call this the exterior approach. (The name is suggested by the fact that if we work in the interior approach then the metalinguistic discussion of construction is felt as being something at the outside.) With the exterior approach we can get rid of the limitations of our “functional style” of construction description.
Anyway we can remove the last differences there might be between geometrical construction and computer programming. We might try to start the exterior treatment with the introduction of a primitive notion “construction” like:

construction := PN : ctype

but it has to be more complicated than this. The notion of construction has to depend on the input space and the output space as parameters, and this is not so easy to describe.
The Mathematical Vernacular, A Language for Mathematics with Typed Sets

N.G. de Bruijn
1. INTRODUCTION

1.1. The body of this paper is from an unpublished manuscript (“Formalizing the Mathematical Vernacular”) that was started in 1980, had a more or less finished form in the summer of 1981, and a revision in July 1982. [The Sections 1 to 17 were published for the first time in [de Bruijn 87a (F.3)]. At that occasion the (very essential) Sections 12-17 were revised in order to adapt them to typed set theory, and the Introduction was extended. For this 1994 version the old Sections 18-22 have been revised in order to let them match the revised Sections 12-17.]

1.2. The word “vernacular” means the native language of the people, in contrast to the official, or the literary language (in older days in contrast to the Latin of the church). In combination with the word “mathematical”, the vernacular is taken to mean the very precise mixture of words and formulas used by mathematicians in their better moments, whereas the “official” mathematical language is taken to be some formal system that uses formulas only. We shall use MV as abbreviation for “mathematical vernacular”. This MV obeys rules of grammar which are sometimes different from those of the “natural” languages, and, on the other hand, by no means contained in current formal systems.

1.3. It is quite conceivable that MV, or variations of it, can have an impact on computing science. A thing that comes at once into mind, is the use of MV as an intermediate language in “expert systems”. Another possible use might be formal or informal specification language for computer programs.

1.4. Many people like to think that what really matters in mathematics is a formal system (usually embodying predicate calculus and Zermelo-Fraenkel
set theory), and that everything else is loose informal talk about that system. Yet the current formal systems do not adequately describe how people actually think, and, moreover, do not quite match the goals we have in mathematical education. Therefore it is attractive to try to put a substantial part of the mathematical vernacular into the formal system. One can even try to discard the formal system altogether, making the vernacular so precise that its linguistic rules are sufficiently sound as a basis for mathematics. An attempt to this effect will be made in this paper. We shall try to do more than just define what the formalized vernacular is: much of our effort (certainly in Sections 2, 3, 4) will go into showing its relation to standard mathematical practice.

1.5. Putting some kind of order in such a complex set of habits as the mathematical vernacular really is, will necessarily involve a number of quite arbitrary decisions. The first question is whether one should feel free to start afresh, rather than adopting all pieces of organization that have become more or less customary in the description of mathematics. We have not chosen a system that is based on what many people seem to have learned to be the only reasonable basis of mathematics, viz. classical logic and Zermelo-Fraenkel set theory, with the doctrine that "everything is a set". Instead, we shall develop a system of typed set theory, and we postpone the decision to take or not to take the line of classical logic to a rather late stage.
1.6. The idea to develop MV arose from the wish to have an intermediate stage between ordinary mathematical presentation on the one hand, and fully coded presentation in Automath-like systems on the other hand. One can think of the MV texts being written by a mathematician who fully understands the subject, and the translation into Automath by someone who just knows the languages that are involved. [For general information on Automath the following paper may be adequate: [de Bruijn 80 (A.5)].] Experience with teaching MV was acquired in a course.
1.7. Even a superficial inspection of mathematical literature shows that it is very hard to get anywhere as long as we take the term "mathematical vernacular" so wide as to contain all language mathematicians use for convincing one another. We shall try to isolate a fragment of the language and polish it up so as to turn it into a basis for mathematics. It is this fragment that is called MV.
The rules of MV do not just explain how mathematical sentences have to be formed, but also how they have to be manipulated in order to build new correct material. In particular they will help us to disclose the rules of the game of axioms, definitions, theorems and proofs.
1.8. Roughly speaking, the MV part of a piece of mathematics will be the rigorous part. In order to make a bit clear at this stage what this MV part is, we mention a few things that we do not want to belong to it. Without being very systematic, we mention:

(i) Argumentation in the form of references to previous material, and indications of the kind of reasoning. Typical of what we mean here is: "Replacing x by p in Theorem 25 we find ...".

(ii) Indications for reconstructing pieces of texts that have been omitted. Example: "The second part of the proof can be given by interchanging the roles of x and y".

(iii) References to the syntactical form of presented material, like "the left-hand side of this equation".

(iv) Interpretation in terms of notions that belong to an entirely different area, like the use of geometrical terminology for discussing the graph of a function, in a case where the rigorous part of the text has no geometry at all.

(v) Remarks about the relation between the human writer or reader and the text. Examples: "It is easy to see that ...", "It may help the reader to draw a figure of this situation".

(vi) Commands, like "Show that ...".

(vii) Surveys of what is to be expected in later parts of the text.

(viii) Historical remarks.

It would not be hard to extend this list of non-MV items. Quite often non-MV components and MV-components occur in one and the same sentence. Example: "Obviously we have f(x) > 1 for all x, but that does not help us to prove the lemma". The only MV-part here is "f(x) > 1 for all x".
1.9. In a system where we expect to have our mathematics checked by a machine it will certainly be worth while to take both the MV-part and the argumentations as essential parts of the formal language, as has been done in Automath. But even if that is considered as a sound basis for mathematical communication,
it is questionable whether it can ever replace that communication. It has the disadvantage that it makes sense only for texts that have been elaborated in every silly little detail. For communication this is rather inconvenient. We wish to write in a style in which we omit what we think is trivial. What things can be considered to be trivial depends on the experience the reader is expected to have. Therefore we shall define correctness of MV in such a way that proofs where pieces of the derivation are omitted, can be considered as still correct. A text would become incorrect if we omit definitions of notions that are used in later parts of the book. A proof written in MV may be restricted to showing a sequence of resting points only. The derivation from point to point may be suppressed, or at least be treated quite informally. This seems to come close to the current ideal of mathematical presentation: impeccable statements, connected by suggestive remarks.
1.10. In contrast to what one might expect at first sight, the grammar of the mathematical vernacular is not harder, but very much easier than the one of natural language. We can get away with only three grammatical categories (the sentence, the substantive and the name), because mathematicians can take a point of view that is very different from the one of linguists. The main thing is that mathematical language allows mathematical notions to be defined; it can even define words and sentences. In choosing these new words and sentences we have almost absolute freedom, just like in mathematical notation. We hardly need linguistic rules for the formation of new words and new sentences. It usually pleases us to form them in accordance with natural language traditions, but it is neither necessary nor adequate to set linguistic rules for them.

1.11. The language definition of MV will be presented in two rounds. In the first round we express the general framework of organization of mathematical texts. It is about books and lines, introduction of variables, assumptions, definitions, axioms and theorems. All this is condensed in the rules BR1-BR9 in Sections 9 and 10. In the second round we get the rules about validity. These cover Sections 11-17. These two rounds describe a language for mathematics. It would go too far to call them the foundation of mathematics. The language of mathematics allows us to write mathematical books, and in these books we can axiomatize the rest of what we call the foundation of mathematics. Part of that axiomatic basis might be considered as foundation of mathematics as a whole, but other sets of axioms just serve for particular mathematical theories. The dividing line between the two is traditional, not essential. Part of the axiomatic basis in the book may be of logical nature, and that part will certainly be considered to
belong to the foundation. Most of the validity rules of the second round have been put in that second round since they cannot be expressed in the books. In other words, they cannot be expressed in MV itself. But a very large part of what is called the foundation of mathematics can just be written in the books, more or less to our own taste. As examples we mention here: falsehood, negation, conjunction, disjunction, the law of the excluded middle, existential quantification, the empty set, the axioms for the system of natural numbers, the axiom of choice. One might try to reduce the second round to an absolute minimum and to put as much as possible in the MV books. We have not gone that far, in most cases because it seems to be nicer to keep things together that belong together. In the case of Section 15 (rules for Cartesian products) the reason to keep it in the second round may seem peculiar. It is just because of the fact that if we want to refer to elements (a, b) of the Cartesian product of A and B, we would hate to have to mention A and B as parameters all the time. We would have to, if that section were shifted to the book.
1.12. Let us try to compare MV and Automath. In the first place it must be said that MV has been inspired by the structure of Automath as well as by the tradition of writing in Automath. In that tradition elementhood, i.e. the fact that an object belongs to a set, is expressed by the typing mechanism available in Automath. So in order to say that p is an element of the set S, this is coded as p : S, so S is the type of p. This is in accordance with the tradition in standard mathematical language. If we say that p is a demisemitriangle, one does not think of the set or the class of all demisemitriangles in the first place, but rather thinks of "demisemitriangle" as the type of p. It says what kind of thing p is. In order to keep this situation alive, MV does not take sets as the primitive vehicles for describing elementhood, but substantives (in the above example demisemitriangle is a substantive). It is important to see the difference between substantives and names. Grammatically they play different roles. If we say that a + b is an integer, then "integer" is a substantive and a + b plays the role of the name of an object. Coming to a situation like a ∈ b ∈ c ∈ d, the Automath style does not allow to write this as a chain of typings like a : b : c : d. If b is a set, then let us write b1 for the substantive "element of b". The chain becomes a : b1, b : c1, c : d1. An important difference between Automath and MV is that in Automath typings are unique (up to definitional equivalence), and in MV they are not. MV is adapted to the tradition of ordinary mathematical language in which 5 is a real number and at the same time the same 5 is an integer. One does
not feel a conflict since "integer" is just a special kind of "real number". In Automath it is always a bit troublesome to express that an object belongs to a subtype: The fact that 5 is a positive real number is described in Automath by two consecutive typings. The first one says that 5 is a real number, the second one says that some particular expression u is a proof for the statement that 5 is positive. This is often felt as a burden. A consequence of the way we treat typing by means of substantives in MV is that a typing like "5 : real number" has the nature of a proposition. This is one of the rules of MV (see T1 in Section 12), but is not done in Automath. Another difference between Automath and MV, already mentioned in 1.9, is that Automath has exact proof references inside the formal text, whereas MV either does not have them at all or has them informally in the margin. This provides a serious (but quite clear) task for those who implement MV into Automath. There is another trouble with the implementation. In MV we have quite strong equality rules, more or less corresponding to the standard feeling that "between two equal things there is no difference at all, they are just the same". In Automath it may cause quite some work to show the equality of two expressions whose only difference is that, somewhere inside, the first one has p and the second one has q, and where p is equal to q. One has to bring the equality from the inside to the outside, and that may cause a lot of Automath text. Fortunately the writing of that text can be automated. In our version of MV we have a strong set of equality axioms (in particular EQ10a-EQ10c of Section 13.2) which make all this much easier.

1.13. One might think of direct machine verification of books written in MV, but this will be by no means so "trivial" as in Automath. Checking books in MV may require quite some amount of artificial intelligence.
In the first place MV allows us to omit parts of proofs, at least as long as no definitions are suppressed (see Section 1.9). But even if the steps in an MV book are ridiculously small, a checker may have a hard time, since in MV proof indications are not given in the formal text itself. To make a book in MV better readable, one can provide the text with proof references in the form of hints, so to speak in the margin. In order to make automatic checking of MV books feasible, one has to invent some system to pass those informal hints to the artificially intelligent machine.
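The typing conventions of Section 1.12 — derived substantives like "element of b", and non-unique typings — can be pictured in a small sketch. The representation below is my own illustration, not a syntax the paper prescribes; `el(b)` stands for the derived substantive written b1 in the text.

```python
# Toy model of the MV typings of Section 1.12 (representation is my own
# illustration; MV fixes no such concrete syntax).

def el(set_name):
    """The derived substantive 'element of <set_name>' (b1 in the text)."""
    return "element of " + set_name

# In MV, typings need not be unique: one object may carry several substantives.
typings = {}                      # name -> set of substantives

def declare(name, substantive):
    typings.setdefault(name, set()).add(substantive)

# The chain a ∈ b ∈ c ∈ d becomes three separate typings:
declare("a", el("b"))
declare("b", el("c"))
declare("c", el("d"))

# Non-uniqueness, as for the number 5 in the text:
declare("5", "integer")
declare("5", "real number")

print(typings["5"] == {"integer", "real number"})   # True
```

In an Automath-style model, by contrast, `typings` would map each name to a single type (unique up to definitional equivalence), which is exactly the difference the section describes.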
1.14. The formation rules of MV allow us to form sub-substantives to a given substantive. The relation is denoted by <, like "square < rectangle" in geometry. Once we have the substantive "rectangle", we can form smaller ones, but our rules do not allow us to form bigger ones. The effect is that for every object
in an MV book one can find a "largest" substantive it is typed by. Let us call that one the archetype of the object. Likewise, this largest substantive can also be called the archetype of all the substantives it contains in the sense of <. These archetypes can be in the back of our minds, but they are never mentioned explicitly in the MV book. Moreover, archetypes are nowhere mentioned in the language rules. One advantage of this system of "anonymous archetypes" is that we are never obliged to state the archetypes as a kind of parameters (actually this is what we have to do in Automath). Another advantage is that the MV text we produce can also be appreciated by readers who have bigger archetypes in mind. For example, a book on real numbers where complex numbers are never mentioned, can be used by anyone who started from the complex numbers, and wants to see the reals as special cases. In other words, our MV books can always be embedded into books with bigger archetypes. Since all objects, all substantives and all sets have an archetype, we can refer to our MV as a kind of typed set theory.
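The "largest substantive" of Section 1.14 can be pictured as the root of the forest formed by the sub-substantive relation <. A minimal sketch, with invented example data:

```python
# Sketch of the "archetype" of Section 1.14: the sub-substantive relation <
# only ever produces smaller substantives, so following it upward from any
# substantive ends in a maximal one, its archetype.  (Example data invented.)

parent = {                         # child < parent
    "square": "rectangle",
    "rectangle": "quadrilateral",
    "isosceles triangle": "triangle",
}

def archetype(substantive):
    """Follow < upward until no bigger substantive exists."""
    while substantive in parent:
        substantive = parent[substantive]
    return substantive

print(archetype("square"))        # quadrilateral
print(archetype("triangle"))      # triangle (already maximal)
```

Embedding a book into one with bigger archetypes then amounts to extending `parent` above the current roots, which leaves every `archetype` computation inside the old book consistent with the new, larger one.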
1.15. Our effort in describing a large part of the language in terms of both substantives and sets, instead of sets only, gives some duplication in the language rules that might be considered as superfluous. We of course would like to try to eliminate one of the two, and deal with sets only, or with substantives only. Both can be done, of course, but neither of the two seems to give anything that looks more satisfactory than what we have in our MV. It has some advantage to describe both: it liberates us from the nasty decision to discard either of them. Substantives (like point, number, function) seem to be the things we handle in our natural language, and sets are things we have learned, more or less artificially, to use instead of substantives. Before the advent of New Math (or should we say, before the rise and fall of New Math) talking by means of substantives was general, and set language was only introduced if strictly necessary. This is roughly what we have done in our presentation of the MV rules. Talking and thinking in terms of substantives is so strongly traditional that one might even call it "natural".
1.16. In MV it is not true that "every object is a set". If we introduce a substantive as a variable or as a primitive, then the objects which are typed by that substantive cannot be considered as sets. Elements of a Cartesian product (see Section 15) are not sets.
1.17. Some of the decisions we have to take about MV involve questions about what to put in the language and what in the metalanguage. In particular we have a kind of meta-typing (the “high typing”, see Section 3.6) in the language,
whereas most other systems would have such things in the metalanguage. The high typing is used for saying that something is a substantive or that something is a statement. We note here that "statement" is a synonym for "sentence". We use "statement" in MV, but there would be no harm in replacing it by "proposition". Linguists would probably dislike the use of the word "sentence" for phrases which are not full sentences. The distinction between high typing and low typing corresponds in Automath to the distinction between typing by means of expressions of degree 1 and typing by means of expressions of degree 2.
1.18. It is customary to make the distinction between sets and classes. Roughly speaking, sets are classes over which we allow quantification. Usually we think of the classes which are not sets as those which are just too big to be sets, like the class of all sets. In our MV we allow quantification over every substantive, and substantives directly correspond to sets. Classes over which we cannot quantify are not discussed in MV itself, neither by means of low typing nor by means of high typing. We can discuss them in the metalanguage: the class of all statements, the class of all substantives. In Automath the class of all proofs for a given proposition is treated as a type. There is nothing of that kind in MV.

1.19. Let us devote some attention to the role played by adjectives. An adjective belongs to a substantive, and serves a double purpose: (i) to form a new substantive, and (ii) to form a new sentence. Example: Having the substantive "triangle", we can form the adjective "isosceles". With this one we can form the new substantive "isosceles triangle" as well as the new sentence "... is isosceles". It may be just because of this double usage that mathematicians like to express things by means of adjectives. Many definitions in mathematics are in the form of the introduction of a new adjective. A warning must be given: an adjective belongs to a substantive, but not automatically to the archetype of that substantive.
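The double purpose of an adjective in Section 1.19 can be sketched as a pair of constructors. The names and representation here are illustrative only; MV itself prescribes no such mechanism:

```python
# The double purpose of an adjective (Section 1.19), sketched as a pair of
# constructors.  Names and representation are illustrative, not part of MV.

def make_adjective(word, substantive):
    """An adjective attached to a substantive yields
    (i) a new substantive and (ii) a sentence scheme."""
    new_substantive = word + " " + substantive        # (i)  "isosceles triangle"
    def sentence(name):                               # (ii) "... is isosceles"
        return name + " is " + word
    return new_substantive, sentence

subst, is_isosceles = make_adjective("isosceles", "triangle")
print(subst)                         # isosceles triangle
print(is_isosceles("triangle ABC"))  # triangle ABC is isosceles
```

Note that `make_adjective` takes the base substantive as a parameter: this mirrors the warning above that an adjective belongs to a particular substantive, not automatically to its archetype.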
2. SOME TERMINOLOGY
2.1. Before we start explaining what MV really is, we say something about the terminology to be used in this paper. We of course distinguish between the language MV and texts written in the language. Instead of “texts” we shall speak of “books”. The language MV has to be defined by stating its rules of grammar, or, as we might say, its language definition.
A certain amount of usage of MV in an MV-book will depend on constructions and notions that are introduced in the language definition. This kind of usage will be called primary MV. All the rest of MV that is used somewhere in the book depends on terms and phrases that were chosen previously in that book. Such book material will be called secondary MV. We quote some examples. A phrase like "for all x" will conveniently belong to the language definition, but "log x", "normal subgroup", will be things we prefer to define in some book. There are things on the borderline for which the language design has to choose between introduction in grammar or in book, and that choice might be a matter of taste. Examples of things about which one might hesitate are: use of the equality sign, elementary notions about logic, and the treatment of sets and functions. A mathematics book contains what we called (in Section 1.8) an MV-fragment, and all the rest is non-MV. A part of the non-MV fragment consists of material that speaks about the MV-fragment. One might think of words and phrases like "left-hand side", "equation", "notation", "symbol", "formula", "substitution", "unknowns", "integration", "algebraic", and of the material mentioned in Section 1.8, (i) and (ii). We shall not study this kind of language systematically, but just vaguely refer to it as metalanguage. It should be noted that the borderline between language and metalanguage has shifted over the centuries. There was a time when "set", "function" were definitely metalanguage, and right now there are things on the threshold between language and metalanguage which might shift into the language in the next few decades. As such, one might suspect terms like "proposition", "condition", "proof", "algorithm". Quite often a word occurs in two different languages, with related but different meanings.
This has its reasons: it is always hard to find new words for new notions, and we like to have words with some suggestive power rather than new words that do not mean anything to us. For example, the word "algebra" is used in metalanguage to indicate a branch of mathematics, and in the language itself to denote a special kind of ring. And if we say "this system of three equations has two solutions", then "three" and "two" may be on different sides of the border.

2.2. In this paper we shall develop some new terminology. Some of it will be incorporated in the primary part of the language MV, some of it will be meta-MV. We have to be explicit about this, especially since we shall shift things into MV that are commonly considered metalanguage. In particular words like "substantive", "statement" will become (primary) MV.
2.3. Let us give a survey of the abbreviations for the various languages to
be referred to in this paper. Quite often a sentence does not belong to any of these, simply because the sentence is intended to relate these languages to each other. But for separate words in this kind of discussion it may be possible to state more precisely to what language they belong. In many cases we shall indicate this, and we use the abbreviations OMV, gL, MV, pMV, sMV, mMV, smMV, imMV.

OMV stands for "ordinary mathematical vernacular". This is the language today's mathematicians actually use when they want to be precise but not absolutely formal.
gL stands for "general language". This refers to words and phrases outside mathematics, in our case usually as a kind of meta-OMV. Since we do not suggest an exact definition of OMV, the borderline between OMV and gL will be vague.

MV stands for the "stylized" form of OMV, and is to be described in this paper. This MV is a language that can be defined completely (see Sections 3-17).

pMV stands for primary MV, as explained in Section 2.1. Words and other constructs of pMV appear for the first time in the language definition. Authors of MV books do not have the right to deviate from pMV.

sMV stands for secondary MV, as explained in Section 2.1. Words, symbols and phrases of sMV are chosen by the author of an MV-book.

mMV stands for meta-MV. We shall consider two kinds, smMV and imMV.

smMV stands for syntax-oriented meta-MV. It is the language we use for expressing the rules of MV.

imMV stands for interpretation-oriented meta-MV. It is the language we use for discussion of the relation between MV and its popular variant OMV. We use imMV for explaining how pieces of existing mathematics can be expressed in MV.

Many of the words we introduce in pMV, sMV, smMV will be borrowed from gL, and the usage will be related to their usage in gL. In order to give an impression what distinctions can be made, we give a list of terms which all mean something like "group of words or symbols that express something", and in each case we indicate the possible languages. In several cases the set of possible languages might be extended, and in some cases it is not very
clear what that list should be. What is important for us here is the question which cases are pMV, sMV, imMV, smMV. The matter of what belongs to gL is more vague, of course.

word .......... gL
sentence ...... gL
proposition ... OMV, sMV
statement ..... pMV
substantive ... pMV
formula ....... OMV, smMV
expression .... gL, smMV
phrase ........ gL
theorem ....... OMV, smMV, imMV
assertion ..... OMV, gL, smMV, imMV
assumption .... OMV, gL, smMV, imMV
condition ..... OMV, gL, smMV, imMV
definition .... OMV, gL, smMV, imMV
predicate ..... OMV, smMV, imMV
clause ........ smMV
name .......... gL, smMV, imMV
3. INGREDIENTS OF MV
3.1. We shall point out some of the characteristics of MV, with special emphasis on those aspects which might be considered novelties. We shall comment on the following points: context indication (Section 3.2), mixing natural language and formulas (Section 3.3), grammatical categories (Sections 3.4 and 3.5), typing (Section 3.6).

3.2. The context indication system, as described in Section 4, is little more than a systematic description of what all mathematicians are aware of when they are talking or writing mathematics. It is certainly worth while to give it a predominant place in the description of what mathematics is. In particular it gives insight into what "variables" are. Moreover it opens the way to natural deduction as a basis for mathematical reasoning.

3.3. Mixing natural language and formulas is a very typical aspect of a mathematician's lingo (both OMV and MV). In most cases formulas become part of a sentence as if they were just words or sequences of words, in complete accordance with the grammar of the natural language. When we say "if a ∥ b then
p ∈ V", then "a ∥ b" and "p ∈ V" play grammatically the role of sub-sentences, just like in "if it rains then we get wet". And when we say "b satisfies ..." then b is the subject of the sentence just as if b were a person. But there are also cases where the mathematician's lingo does not follow the rules of the natural language. We mention "for all integers x we have ...", where the x does not play a role that any ordinary word can play in a natural language sentence. And we mention "for all x we have ...", where x does seem to play such a role, but the wrong one. For these little irritations we offer as explanation that our natural languages do not admit anything corresponding to bound variables.

3.4. In natural languages one analyses the structure of a sentence by attaching
grammatical categories to words or word groups. Such categories can be "sentence", "noun", "verb", etc. In our discussion of MV we shall restrict ourselves to a rather small number of categories. We shall only use the following: statement, substantive, name.
There might be a case for the adjective (cf. Section 1.19) as a fourth category, but we ignore it in our presentation of MV. The reason why we do not need to go into the finer shades of grammatical analysis lies in the fact that in the MV-book we can introduce words, symbols and other kinds of phrases by means of definitions (see Section 7.2). As far as the defined things are words or phrases, we usually choose them according to what sounds right in ordinary language, and that is why they seem to ask for linguistic analysis. But such an analysis is unnecessary. The only thing we do with the new words and phrases is to repeat them in other circumstances. As an example we quote a definition: "We say that the vectors p and q are locally independent in the sense of Prlwtzkowsky if ...". Later we just repeat this phrase, with p and q replaced by other names. The fact that the words "in the sense of" have been taken just in this order, does not play a role we consider to be essential for MV. It plays a role in readability, memorizability and possibly in parsability (cf. Section 23). It is like choosing notation: as a function symbol for the hyperbolic cosine we might select "cos hyp" or "cosh" or "csh" but not "ggrrr", since that would not be very suggestive, and certainly not "gg?(rg[?" since that would be asking for trouble with parsing.

3.5. Let us say a few informal things (expressed in gL) on the categories "statement", "substantive", "name" and "adjective". A statement is a group of words or formulas that might play the role of a complete sentence, although it can occur as just a part of some other phrase (the word "phrase" is used here to indicate any sequence of words or formulas that somehow is considered in its entirety at some moment). Example: In the
phrase "if a > b then p is divisible by 5" the parts "a > b" and "p is divisible by 5" are statements, and the whole phrase is a statement. A substantive is a generic term for a class. Examples: "circle", "positive integer with exactly three divisors", "point". A generic term for a class is not the same as a name for that class. The difference is small: it is only the way we use them. If C is the class of all circles, then the phrases "P is a circle" and "P is an element of C" are intended to mean the same thing. A warning: sometimes a phrase has the grammatical form of a substantive without playing that role in a mathematical text. In the phrase "P is the orthocenter of triangle ABC", the word "orthocenter" is not to be considered as a generic name for a class. One should not think that it had first been explained what an orthocenter is, and that later it was proved that a triangle has just one orthocenter, so that finally we can speak of "the orthocenter". No: the phrase "the orthocenter of triangle ABC" can be used by virtue of a previous definition in the book, where it was introduced as a name with the same status as a name like Oc(A, B, C) would have had. Therefore there is no question of parsing it into separate components like "the", "orthocenter", etc. A name is a phrase we consider as a sufficient indication of an object. Without going into the question whether we have or do not have objects in mathematics, we note that our linguistic handling of mathematics seems to treat mathematical names as if they were names of objects. Examples of names are "the center of the unit circle", "the point M", "M", "a + b". As to adjectives we mention that adjectives are always attached directly or indirectly to a substantive. Once we know what a triangle is we can say what it means that a triangle is "isosceles". It can be used in two ways: (i) in statements like "triangle ABC is isosceles", and (ii) in order to form the new substantive "isosceles triangle". This humble role of adjectives does not seem to suffice as a reason for taking them as building blocks in our rudimentary grammar of MV. Nevertheless there is a reason to take them seriously: mathematicians seem to like them so much. They seem to like definitions where a new notion is presented in the form of a new adjective. We shall say more about adjectives in Section 22.14.
3.6. In our version of MV we use typing on two levels: low typing and high typing. Low typing is used to express that some "object" is of a certain "kind", like "p is an integer". In MV we have a preference for writing a colon instead of "is a", so we write "p : integer". This colon is the notation for a kind of relation between "p" (which is grammatically a name) and "integer" (which is grammatically a substantive). In the metalanguage smMV we say that "p : integer" is a (low)
typing. High typing is a thing that in most other systems would be put into the metalanguage rather than into the language itself. We denote it by a double colon. On the right we have either "statement" or "substantive". Examples of high typings are "integer :: substantive" and "x > y :: statement". We can as well say right here that low typings "p : q" will occur only in cases where the high typing "q :: substantive" has been established already. In this connection we mention that one might say in the metalanguage smMV that p is a name, or that p is the name of some q, but we do not express this in MV itself. We mention a question that often turns up among mathematicians: is 3 a number or is 3 the name of a number? We can agree to both alternatives, depending on the language we use. In MV we say "3 : number", but in smMV we say that 3 is a name, and more precisely that 3 is the name of a number. For a moment we consider the word "object". There is the old philosophical question whether mathematical objects exist. Those who believe in the existence are called platonists. One might suspect that all mathematicians are platonists, even those who fiercely deny it. The matter is clear for those who consider it as their job to provide useful communication language for mathematicians: platonism is not right or wrong, platonism is irrelevant. At least it is irrelevant for matters of truth and falsehood of mathematical statements. It may be relevant for mathematical taste, but that is a personal matter anyway. The most important thing to say about platonism is possibly that platonism is dangerous. It may seduce mathematicians into thinking that they can get away with incomplete definitions of objects since these objects exist anyway. And it might give the false suggestion that slightly different definitions of a mathematical object are not harmful since after all they refer to one and the same platonic object.
Another danger of the idea of platonic existence is that many people find it hard to understand the meaning of existence in mathematics. The statement in OMV that "there exists a positive number whose square equals 9" has nothing to do with the platonic existence of the number 3.

We shall give a kind of linguistic interpretation to the word "object". We take it as a word used in smMV. If S is a substantive, and if we have in MV that p : S, then we might say in smMV that "p is the name of an object". Continuing in smMV, one might ask "what kind of object?". The answer to this will be in smMV that p is the name of an S, and in MV itself that p : S (which expresses that p is an S). We of course have not expressed here what the word "object" means, but only how the word is used.
3.7. In the next section we will use the word clause (smMV). It will get its exact description in the language definition (from Section 6 onward), but we may as well say right here that a clause is either a typing or a statement. This
The mathematical vernacular (F.3)
879
cannot serve as a definition of the word "clause", however. We even note that "typing" and "statement" belong to different languages. In order to give a preliminary idea, we say here that a clause will be either a high typing,
    A :: substantive
or
    P :: statement

or something of the form

    P                        (3.7.1)

in situations where

    P :: statement           (3.7.2)
had already been recognized as a valid clause. The interpretation of (3.7.2) is "P is a well-formed statement", and the one of (3.7.1) is "P is true". Note that in (3.7.1) and (3.7.2) P itself can be a low typing like "a : A", where "A :: substantive" has already been recognized as a valid clause. There will be cases where we establish that "a : A" is a well-formed statement, and there will be cases where we establish that "a : A" is true. High typings will be different from low typings in the sense that they cannot be considered as statements. There will be no valid clauses of the form
    (A :: substantive) :: statement
    (P :: statement) :: statement
4. STRUCTURE OF MV BOOKS
4.1. In this section we give a first outline of what a book is. The following terms will all be smMV: "book", "line", "older", "younger", "context", "context item", "declarational", "assumptional", "body of a line", "context length", "empty context", "flag", "flagstaff", "flagstaff form", "flagless form", "block", "block opener", "nested blocks", "sub-block".

A book is a finite partially ordered set of lines. The order relation is called "older than" ("line p is older than line q" and "line q is younger than line p" are synonymous). A line consists of two parts: a context and a body. A context is a finite sequence of context items. There are two kinds of context items: declarational items and assumptional items. The sequence of context
items may be empty, and in that case we speak of the empty context. In general, the number of items in the sequence is called the length of the context; the empty context has length zero. This is all we say here about context items; for further information we refer to Section 6. We refer to Section 7 for a description of the body of a line; for the time being we do not need such a description.

4.2. We sketch the interpretation of the words introduced in Section 4.1 in terms of gL. A book is to be interpreted as any connected piece of mathematics that starts from scratch. Lines are primitive building blocks of books. One aspect of lines is that if we omit the last line of a book then it is still a book, but if we omit just a part of that line then it is no longer a book. Usually we think of a book as a linearly ordered set (i.e., a sequence of lines, and we were thinking that way when using the words "last line" in the previous sentence), where the first line is "the oldest" and the last line is "the youngest", but we need not go so far as to prescribe this linearity. The meaning of old and young is that younger material may make use of older material, but not the other way round. Since every finite partially ordered set can be put into a linear order that is consistent with the partial order, we see that the generalization from linear to partial is a very superficial one. Nevertheless, the presentation in a non-linear form may make a book easier to understand. In particular, if two pieces A and B are logically independent of each other, then this independence would be muffled if, only for the sake of typography, we would proclaim A to be older than B. Saying that a book remains a book if we omit the last line (or in the case of non-linear order, if we omit a line that is not older than any other line in the book), means that it remains a book in the sense of syntactic structure; it need not be an interesting book.
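The remark that every finite partially ordered set can be put into a consistent linear order can be made concrete with a small sketch (hypothetical Python, not part of MV; the representation of lines and of the "older than" relation is our own assumption):

```python
def linearize(lines, older_than):
    """Topologically sort a finite set of lines so that every line
    comes after all lines that are older than it.

    `older_than` maps each line to the set of lines older than it."""
    ordered, placed = [], set()

    def visit(line):
        if line in placed:
            return
        for elder in older_than.get(line, set()):
            visit(elder)           # place all older material first
        placed.add(line)
        ordered.append(line)

    for line in sorted(lines):     # deterministic traversal order
        visit(line)
    return ordered

# Two logically independent pieces A and B both build on one "axiom"
# line; the partial order forces neither A before B nor B before A.
book = {"axiom", "A", "B"}
older = {"A": {"axiom"}, "B": {"axiom"}}
order = linearize(book, older)
```

Any linear order produced this way respects "older than", which is exactly why the generalization from linear to partial books is superficial from a logical point of view.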
An assumptional context item is to be interpreted as an assumption, like "assume p > q". A declarational context item is to be interpreted as the introduction of a variable of a specified type, like "let y be a real number". A context is to be interpreted as a sequence of such items, arranged in the order in which they were introduced. As an example (in OMV) we give a context of length 4: "let n be a positive integer, let S be a subset of the set of real numbers, assume that S has n elements, let s be an element of S". The body of a line is interpreted as a piece of true information we provide in the considered context. As an example (in OMV) we give, with the above context of length 4, "if n > 1 then S contains an element different from s".

4.3. In this section we present examples of the structure of a book. Throughout
Sub-sections 4.3, 4.4, 4.5 we think of a linearly ordered book. The examples are
abbreviated, in the sense that context items are replaced by symbols I1, I2, I3, ..., line bodies by b1, b2, b3, .... Contexts are represented as sequences of items separated by commas, and we write an asterisk between context and body of the line. Now a book can look like this:
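The example figure is not reproduced in this copy; the following hypothetical stand-in (a Python sketch with invented items I1, I2, I3 and bodies b1, ..., b5) shows the kind of flagless book the notation describes, each line printed as its context, an asterisk, and its body:

```python
# A hypothetical flagless book: each line is (context, body),
# a context being a tuple of context-item symbols.
book = [
    ((),             "b1"),
    (("I1", "I2"),   "b2"),
    (("I1",),        "b3"),
    (("I3",),        "b4"),
    (("I1", "I2"),   "b5"),
]

def render(line):
    """Print a line in the notation of Section 4.3:
    items separated by commas, an asterisk, then the body."""
    context, body = line
    if context:
        return ", ".join(context) + " * " + body
    return "* " + body

for line in book:
    print(render(line))
```

Running this prints, for instance, `I1, I2 * b2` for the second line and `* b1` for the line with empty context.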
The contexts in this example look a bit untidy. In a mathematics text the contexts usually do not change from line to line, but are constant over a larger piece of text. And if the context changes, it is either by adding a few context items on the right or by deleting one or more from the right. So contexts grow and shrink on the right in the course of a mathematical discussion. Assumptions that were once introduced are no longer valid from a certain point onwards, and the same thing holds for variables: a variable is born, is alive during some time, and then dies. In OMV it is customary to announce birth of assumptions and of variables, but it is left to the reader to guess (possibly on the basis of the typographical layout, possibly on the basis of "understanding the author's intentions") at what point in the text they are dismissed. For the sake of further discussion we give a typical example:
4.4. The information contained in a book is completely preserved if we write
it in what we call its flagstaff form. In contrast to this, the form presented in Section 4.3 is called the flagless form. In the flagstaff form, the context items are written on flags. The staff of a flag is vertical, and marks the set of lines where the flag's item is a part of the context. The following example, where the second example of Section 4.3 has been put into flagstaff form, speaks for itself.
Needless to say, the way back from flagstaff form to flagless form is immediate. For every one of the bodies b1, ..., b10 we get the context if we assemble the items on the flags carried by the flagstaffs we see on the left of that body. Later we shall use rectangular flags for assumptions and pointed flags for declarations, in order to make a clear distinction between those two kinds of context items. We did not do it here, since the relation between flagless form and flagstaff form is independent of such a distinction. In our formal presentation of MV we handle the flagless form; in examples we may switch to the flags (see Section 18).

4.5. Sometimes we use the word block (smMV) to denote the material to the right of a flagstaff, including the flag itself. So every flag determines a block, and the item on the flag is called a block opener (smMV). As an example we quote that
are blocks of the book of Section 4.4. The block openers are I4 and I6, respectively. The blocks are always nested, that is to say that if two blocks are not disjoint then one of the two is a sub-block of the other one.
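The way flags, staffs and blocks arise from consecutive contexts can be sketched as follows (hypothetical Python; MV itself does not prescribe any such algorithm). A flag is raised when an item extends the previous context on the right, and lowered when the item disappears; since contexts grow and shrink on the right, the resulting blocks nest:

```python
def blocks(contexts):
    """For a flagless book given by the context of each line,
    return the blocks as (item, first_line, last_line) triples."""
    result, open_flags = [], []        # open_flags: (item, first_line)
    for i, ctx in enumerate(list(contexts) + [()]):   # sentinel closes all
        # lower flags that are no longer a prefix of the current context
        while open_flags and (len(open_flags) > len(ctx)
                              or any(open_flags[j][0] != ctx[j]
                                     for j in range(len(open_flags)))):
            item, first = open_flags.pop()
            result.append((item, first, i - 1))
        # raise a flag for each item newly added on the right
        for item in ctx[len(open_flags):]:
            open_flags.append((item, i))
    return result

# contexts of a small book: I2 lives on line 2 only, I3 on line 4,
# while I1's staff spans lines 1 through 4
example = [(), ("I1",), ("I1", "I2"), ("I1",), ("I1", "I3")]
```

Because flags are lowered in last-raised-first order, two blocks are either disjoint or one contains the other, matching the nesting property stated above.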
5. IDENTIFIERS
5.1. In this section the following terms of smMV will be introduced: "identifier", "fresh identifier", "constant", "parametrized constant", "modified parametrized constant", "variable", "dummy", "variables of a context". Note that a word like "variable" is smMV, but that the variables themselves are sMV. Similar things hold for the other notions.

5.2. An identifier is a symbol or a string of symbols to be considered as an atomic piece of text. We might say that an identifier is a symbol, but since we need a very large number of symbols we use strings of symbols instead, taken from a relatively small collection. It is a matter of parsing how to isolate these identifiers in a given piece of text. We shall not go into these parsing questions since they are not very essential here: if we had an unlimited amount of useful identifiers the matter would not have arisen at all. We refer to Section 23 for further remarks. Examples of identifiers in OMV are "x", "2", "the complex number field", "parallelogram". We note that if we describe "the complex number field" as a string of symbols, then we have to consider the empty spaces between the separate words as symbols too. These are produced by key strokes on a typewriter just like the letters, but they do not leave a visible imprint on the paper. Therefore it is better to replace the empty spaces by a visible character that is not used otherwise. One can take the underlining symbol for this, and write "the_complex_number_field". In our examples we shall not do this, however. It is one of the aspects in which the paper remains informal.

5.3. An identifier is called fresh at some specified place of the book if it has
not appeared yet at older places of that book. In MV and in OMV we often need fresh identifiers, but in practice this is taken with a grain of salt. Since the number of short identifiers is rather small, we are inclined to use some of them repeatedly, in different circumstances, with different meanings. We shall not pay attention to this matter and act as if there were an unrestricted amount of easily recognizable symbols.
5.4. In an MV book there are various kinds of identifiers. First there are the pMV symbols that occur in the definition of MV, like

    "substantive", "statement", ":", "::", ":="
Another class of identifiers is the class of variables (the word "variable" is smMV, the variables in the book will be sMV). A variable is an identifier that occurs for the first time in an MV book in a declarational context item (see Section 6.3). Other identifiers are bound variables (also called dummies), for which we refer to Section 20. Finally we have identifiers that are called constants. They are the identifiers whose first occurrence in an MV book is on the left of a symbol ":=". The interpretation is that a constant is the name given to a defined object like "2", "e" (e is the basis of natural logarithms).

5.5. Related to the constants are the parametrized constants, which are not identifiers in the proper sense. A parametrized constant is a kind of finite sequence of symbols in which there occur variables at various places. The notion is relative with respect to a context. A context has a number of variables, i.e., the variables introduced in the declarational items of that context. These variables will be referred to as the "variables of the context". It is essential that each one of the context variables occurs at least once in the parametrized constant. The constants of Section 5.4 can be considered as parametrized constants for the case that there are no declarational items in the context. If x and y are the variables of the context, then the following things may be parametrized constants:
    "f(x, y)",  "x + y",  "the distance from x to y".
A parametrized constant is called fresh (smMV) somewhere in the book if it has not appeared at older places in that book, not even with different variables. Parametrized constants can be used later in the book by repeating them, with the variables replaced by other expressions. We do not say here what kind of expression we have in mind, but just mention as examples
    "f(a + b, 3)",  "(a + b) + 4",  "the distance from P to the center of c".
In smMV such modified repetitions will be called modified parametrized constants. Clearly, these modified constants will generate new parsing problems, but again we lightheartedly neglect these.
The condition that in a parametrized constant all variables of the context occur, is usually taken with a grain of salt. We return to this in Section 21.6.

5.6. Many of the parametrized constants in our examples will have the form b(x1, ..., xn), where x1, ..., xn are all the variables of the context, in the order in which they are introduced in the declarational context items. In these cases we often take the liberty to write just b instead of b(x1, ..., xn) on the left-hand side of the definitional line (and sometimes at other places where it is obvious what the abbreviation b stands for).
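Replacing the variables of a parametrized constant by other expressions, as in the modified parametrized constants above, can be sketched like this (hypothetical Python; the template representation with explicit variable slots is our own assumption, not MV notation):

```python
def instantiate(template, substitution):
    """Replace each variable slot in a parametrized-constant template
    by the expression substituted for it.  A template is a sequence
    whose elements are either literal strings or ("var", name) slots."""
    parts = []
    for piece in template:
        if isinstance(piece, tuple) and piece[0] == "var":
            parts.append(substitution[piece[1]])   # a modified occurrence
        else:
            parts.append(piece)
    return "".join(parts)

# "the distance from x to y", with x and y the variables of the context
dist = ["the distance from ", ("var", "x"), " to ", ("var", "y")]
```

With the substitution x := P, y := the center of c, this reproduces the last of the examples quoted above: "the distance from P to the center of c".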
6. STRUCTURE OF CONTEXT ITEMS
6.1. A context item is a pair, consisting of a clause and a label. For a first orientation on what a clause is, we refer to Section 3.7. The label is either "(asm)" or "(dcl)". See Section 6.4 for the reason why these labels are used. The phrases "context item", "clause" and "label" are smMV; both "(asm)" and "(dcl)" are pMV.

6.2. An assumptional context item has the form "P (asm)". As a first orientation we say that this P is a clause, but that not every clause will be admitted: "P (asm)" will only be allowed in cases where the high typing "P :: statement" can be established in the book, at least in the context formed by the sequence of context items preceding this item "P (asm)". For details we refer to Section 9.
6.3. Declarational context items have one of the following forms:

    x : P (dcl)
    x :: substantive (dcl)
    x :: statement (dcl)

where x is a fresh identifier and P is some expression. As said in Section 5.4, x is called the variable of the context item. Not every P will be admitted here, but only those P for which the high typing "P :: substantive" can be established in the book, in the context formed by the sequence of context items preceding this item "x : P (dcl)". For details we refer to Section 9.

6.4. It is essential that context items are explicitly labeled as being either declarational or assumptional. In the flagstaff form this can be done by using pointed flags for declarations and rectangular flags for assumptions (see Section 18). The reason for the use of labels that distinguish between the two kinds of
items, is the fact that the form of the context item does not always reveal to which one of the two categories it belongs. The following example of a context of length 2 in OMV shows what we mean: “Let p be a quadrilateral, assume that p is a rectangle” (of course no one would say this in one breath, but quite often the various items of a single context are pages apart). The labels “let be” and “assume that” are no luxury, for if we would say “ p is a quadrilateral, p is a rectangle” then it would not have been made clear that in the first item p is introduced and that in the second item p is a thing we already know about.
7. STRUCTURE OF LINE BODIES
7.1. There are four kinds of line bodies:

    (i)   definitional line bodies    (Sections 7.2 and 7.6)
    (ii)  primitive line bodies       (Sections 7.3 and 7.10)
    (iii) assertional line bodies     (Sections 7.4 and 7.11)
    (iv)  axiomatic line bodies       (Sections 7.5 and 7.12)
(all these terms are smMV).

7.2. The interpretation in OMV of lines of the type (i) is that they represent definitions. That word has to be taken in a wide sense, and contains much more than what a text in OMV would label as "Definition". Whenever we select a new symbol to represent a longer expression, usually for the sake of brevity, we essentially have a definition. We consider three kinds of definitions, according to the syntactic category of the things to be defined. There are "name definitions", where a new name is introduced for an "object". Next there are "substantive definitions", where a new substantive is introduced, and finally "statement definitions", where a new phrase is coined to represent a statement. As examples of the three kinds of definitions we quote
"the orthocenter of triangle t is ...", "A rhombus is ...", "We say that the sequence s converges to the real number c if ...".
In MV these three categories will be represented by (7.6.1), (7.6.2), (7.6.3), respectively. For further examples and comments see Sections 7.7-7.9. Many definitions in OMV have the form of the introduction of a new adjective. We shall not put these into MV since they can be circumvented (cf. Sections 3.5 and 22.14).
7.3. The interpretation (in OMV) of lines with bodies of the type (ii) is that they introduce primitive notions. Such lines are rare in mathematics, and have the same status as axioms. Together with the axioms they may form the basis of a theory. As an example we quote from Hilbert's axioms for plane geometry, which state "there are things we call points and things we call lines", where the words "point" and "line" are introduced as new substantives but, in contrast to the substantive definitions of Section 7.6, without explanation in terms of known things. In MV the introductions of these primitives get the form (7.10.2). An example of a primitive of the form (7.10.3) is that, after points and lines have been mentioned, the notion "point A lies on line q" is introduced without explanation. Finally we give an example of what is expressed in (7.10.1). One of Peano's axioms is: "there is a special natural number which we shall denote by the symbol 1". Here the new object is introduced without definition. Instead of defining it we just say of what kind it is.

7.4. The interpretation of lines with bodies of the type (iii) is that assertions are made that follow from previous material. Some of these are called "theorems", others "lemmas", but most of them (in particular the assertions inside proofs) do not get such a stately name. And it is certainly not common practice to apply words like "theorem", "lemma" to cases with high typings like "A :: substantive", "P :: statement", which are likewise admitted here (see Section 7.11). When saying that theorems and lemmas follow from previous material, we have to interpret the habit in OMV to print a proof after the announcement of the theorem instead of before. If we wish to have a similar announcement in MV, we might give a name to the thing stated in the theorem, claiming that it is a statement P, like in (3.7.2). The proof will end with the assertional line body P (see (3.7.1)).
The interpretation of the first line in OMV is "P is a well-formed proposition", and of the last one "P is true".
7.5. Lines with a body of the type (iv) are to be interpreted as axioms. They can be applied in the same way as theorems, but in the case of axioms we do not require that their content follows from previous material.
7.6. In MV, a definitional line body has one of the forms

    P := Q : R                (7.6.1)
    P := Q :: substantive     (7.6.2)
    P := Q :: statement       (7.6.3)
where P stands for a parametrized constant. Note that “substantive” and
"statement" are pMV, as well as the symbols ":=", ":", "::", but that Q is definitely not the pMV symbol "PN".
7.7. We remark that it will be a consequence of our later rules that (7.6.1) appears only in situations where "R :: substantive" is valid. The interpretation of (7.6.1) is that the definition provides a new (possibly short) name P for an object of which the full description is Q. Example (in OMV): "Let S(n) denote the real number exp(1) + ... + exp(n)". In this example S(n) plays the role of P, "exp(1) + ... + exp(n)" the role of Q, and "real number" the role of R. In MV we write it as "S(n) := ... : real number" (we do not attempt right now to write exp(1) + ... + exp(n) in official MV).
7.8. The interpretation of (7.6.2) is that it provides a (usually short) expression P for a (usually longer) description Q that represents a substantive. Example: the role of Q can be played by the substantive "positive integer with exactly two divisors" and the role of P by the new substantive "prime number".

7.9. The interpretation of (7.6.3) is that the definition provides a new (usually short) expression for a (usually longer) statement. Example: "We say that p is orthogonal to q if the inner product of p and q is zero". Here "p is orthogonal to q" plays the role of P, and "the inner product of p and q is zero" plays the role of Q.

7.10. In MV a primitive line body has one of the forms

    P := PN : R                (7.10.1)
    P := PN :: substantive     (7.10.2)
    P := PN :: statement       (7.10.3)
We note that ":=", "PN", ":", "::", "substantive" and "statement" are all pMV ("PN" has been chosen as a mnemonic for the OMV-term "primitive notion"). P stands for a parametrized constant. In the case of (7.10.1) it will be required that "R :: substantive" is valid in the context in which (7.10.1) is written.
7.11. An assertional line body in MV is nothing but a single clause (cf. Section 3.7).
7.12. An axiomatic line body has the form

    c [Axiom]                  (7.12.1)
where c is a clause, and the symbol “[Axiom]” is a pMV term. By virtue of language rules still to be formulated, there are two differences between axiomatic and assertional line bodies. In the first place, the assertional line body
has to "follow" from the previous part of the book, and secondly, in the axiomatic case the clause c has to be restricted to cases for which the high typing "c :: statement" can be established in the book. The latter is similar to the restriction made on assumptional context items (Section 6.2).
7.13. We introduce the notion clause of a line body (smMV). In the cases of Sections 7.10-7.12 the body has just one clause. In an assertional line body that clause is the line body itself. In lines with axiomatic line body "c [Axiom]" the clause of the line is just that c. In lines with bodies (7.10.1), (7.10.2), (7.10.3) the clause of the body is "P : R", "P :: substantive", "P :: statement", respectively. A definitional line body has two clauses. The old clauses of the lines of Section 7.6 are "Q : R", "Q :: substantive" and "Q :: statement", respectively. The new clauses of these lines are "P : R", "P :: substantive" and "P :: statement", respectively.
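The assignment of clauses to line bodies in Section 7.13 can be sketched as follows (hypothetical Python; the tuple encoding of bodies is our own assumption, not MV notation):

```python
def clauses(body):
    """Return the clauses of a line body, as strings.

    Bodies are modeled as tuples:
      ("def",  P, Q, tail)  for  P := Q <tail>   (two clauses: old and new)
      ("prim", P, tail)     for  P := PN <tail>  (one clause)
      ("assert", c)         for an assertional body   (one clause)
      ("axiom", c)          for  c [Axiom]            (one clause)
    where <tail> is ": R", ":: substantive" or ":: statement"."""
    kind = body[0]
    if kind == "def":
        _, p, q, tail = body
        return [q + " " + tail, p + " " + tail]   # old clause, new clause
    if kind == "prim":
        _, p, tail = body
        return [p + " " + tail]
    return [body[1]]            # assertional or axiomatic: the clause is c
```

For the substantive definition of Section 7.8, for instance, the old clause is "positive integer with exactly two divisors :: substantive" and the new clause is "prime number :: substantive".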
8. GENERAL REMARKS ON RULES OF MV

8.1. In Sections 4, 6 and 7 we have explained the structure of books, contexts and lines. The question is now: what contexts and what lines are allowed? It will not be trivial to state a complete set of rules for this. A part of these rules will be felt as rules for language manipulation; these rules will be explained in Sections 9 and 10. Another part (Sections 12-17) will be more like a piece of the foundation of mathematics. However, the rules of MV will not contain all of what is usually called the foundation of mathematics. Once we have reached a certain level, the language is strong enough to allow us to write the rest of the foundation of mathematics in an MV book. It is attractive to put as little as possible in the language definition and as much as possible in the books, but we shall not aim at extremes in this respect.

The state of affairs can be compared to the way a ship is built. The ship is constructed ashore only until the stage that it is just able to float. Then it is launched, and after that, the construction goes on. The reason for this is, of course, that a ship cannot be launched if it is too heavy. In the case of MV the reason is different. The MV ship can be used by many different customers in different ways. After MV is launched, every customer can finish the construction according to his own wishes. After the launching of a ship, two things happen: (i) the construction is completed, and (ii) the ship will be sailing the seas. Here our analogy is less satisfactory. The action of the ship's construction in the water near the shipyard is very different from the action of sailing the seas. In the case of MV these
two actions are alike. To (i) there corresponds the writing of the fundamental portions of the book, and what corresponds to (ii) is writing a (possibly long) book or set of books based on that fundamental chapter. But all the time the action consists of writing books and nothing else.

8.2. As said before, our MV will be modelled after OMV, i.e. the way mathematicians write and speak today, but we cannot just copy OMV. There is no consensus in OMV about how things should be said. We are not in a position to derive all rules of MV by observation of OMV. We have to invent new rules, and that may mean making arbitrary choices. We have to give definitive shape to things which are not properly revealed in OMV. In particular this refers to the fact that this paper tries to interpret OMV as a typed language. One might argue that such an interpretation is not really called for, and that it is about as arbitrary as interpreting OMV as a non-typed language. The most-favored method of coping with life without types is to maintain that "everything is a set". One might try to arrange a typed language in such a way that this set-loving point of view can be obtained by just creating a single type, viz. the type "set". Yet we have not taken the trouble to keep this possibility open in our presentation of MV. Conversely, one might try to code typed material in terms of a non-typed language, but this seems to be very unattractive.
8.3. We first say something about the notion of validity (smMV). The word valid (smMV) means: having been built according to the grammar of MV (and that grammar has still to be disclosed). The rules of that grammar will be production rules, in the sense that they all describe ways to extend a valid book by adding a new line. In the course of the description of how the new line is to be built, we have certain resting-points where certain phrases are discussed as being acceptable ingredients of the line to be added. Important resting-points are clauses (see Section 3.7 and Section 7.13). In a given context there is a set of such clauses which are called valid clauses. The production rules explain how our knowledge about that set can be extended, describing how by means of a number of elements of that set a new one can be constructed.

8.4. Validity is expressed with respect to a book.
As already said in Section 4, a book is a partially ordered set of lines. For any given line we can consider the set of all lines which are older than the given line; this is again a partially ordered set of lines and therefore a book. We shall refer to the given line as “the new line” and to that book as “the set of old lines”. Whether a new line will be called valid, depends on the set of old lines, and not on what happens in other lines. The same remark applies to parts of
new lines, like clauses and contexts. Only for the identifiers, and more generally for the parametrized constants, we have a condition that goes beyond the set of old lines: we have to stipulate that they are all different throughout the whole book. We usually think of a book as having been written line by line, where older lines precede newer lines in time. If this is the case, then the condition for the parametrized constants is that at each moment the parametrized constant introduced in a line (on the left of a sign :=) is different from all the parametrized constants used before. We still have to say how the context for the new line has to be built, and what clauses are valid in that context. This will be said in Sections 9 and 10.

8.5. From Section 9 onwards we shall give a formal definition of the notion of an MV book in flagless form. Except for the syntactic matters referred to in Section 8.6, we shall not make use of what was said in the preceding sections. Those sections were intended to give interpretations, and to help the reader to get an insight into the complex set of definitions we shall display in the next sections.

First a few things about the terminology. The smMV-terms "MV book" and "valid book" are synonymous. We shall not define notions "context" and "clause" as such, but we shall define "valid context with respect to a set of lines", "valid clause with respect to a valid context and a set of lines" (in both cases that set is referred to as the set of "old lines", and in the second case it will be required that the valid context is a valid context with respect to that same set of old lines), "valid book", "line", "body of a line", "clause of the body of a line", "context of a line". The pMV symbols to be used are
    ":", "::", ":=", "substantive", "statement", "PN", "[Axiom]"
and "(dcl)", "(asm)", "*". (The symbols of the second row do not appear in the flagstaff form: their role is taken over by the pointed and rectangular flags and flagstaffs.)

A few general things can be said here about the format of things. A book is a finite (possibly empty) partially ordered set of lines. A line is a pair consisting of a (valid) context and a line body. A (valid) context is a finite (possibly empty) sequence of context items.

8.6. We mention some things that should have been formally discussed in the next sections, but are nevertheless treated very superficially. They are of a syntactical nature. We mention:
(i) substitution,
(ii) variables of a context,
(iii) fresh identifiers and fresh parametrized constants,
(iv) parsing.

We take it that Sections 5 and 23 are sufficiently clear as an indication of how these notions are to be formalized. A complete formalized treatment of them would not quite fit into the general style of this paper.
9. VALID CONTEXTS AND VALID CLAUSES
9.1. Everything that has been said thus far is to be considered as introduction, providing an orientation about what we are going to describe. It also served to build up a feeling for the interpretation. From now on, however, we shall attempt a more complete and more formal description. Many things that have been referred to earlier in vague terms, will now get a more serious treatment. The rules are about books, lines and validity. They will get their content by means of rules BR1-BR9 (BR stands for "basic rule"). We need not say beforehand what these notions mean. These rules BR1 to BR9 are hardly of a logical or a mathematical nature. Or, rather, they describe how to handle logic and mathematics. In order to get to logic and mathematics themselves we have to add a number of rules in Sections 12-17 that describe more ways to produce valid clauses. As to the production of valid contexts and books no rules will be issued beyond these BR1-BR9.

9.2. The symbols c, C, I1, In, P, A, x, x1, xk, X1, Xk that are used in this section for explaining language rules are meta-variables. They are used in smMV in order to denote expressions occurring in an MV book. In the rules BR1-BR7 there is a set S of lines ("the set of old lines"), and "valid" stands for "valid with respect to S".
9.3. BR1. If an old line has context C, and if c is a clause of the body of that line, then C is a valid context, and c is a valid clause in that context. 9.4. BR2. The empty context is valid. 9.5.
BR3. If I1, ..., In is a valid context (if n = 0 we mean the empty context), and if
The mathematical vernacular (F.3)
P :: statement is a valid clause in that context, then I1, ..., In, In+1 is a valid context, where In+1 stands for “P (asm)”. (As already explained in Section 6.3, the additional “(asm)” serves to label it as an assumptional context item; it is not superfluous, since P may have the form of a typing.)
9.6. BR4. If I1, ..., In is a valid context (if n = 0 it is the empty context), and if x is a fresh identifier, then the following contexts of length n + 1

    I1, ..., In, x :: substantive (dcl)
    I1, ..., In, x :: statement (dcl)

are valid contexts. If, moreover, A :: substantive is a valid clause in the context I1, ..., In, then

    I1, ..., In, x : A (dcl)

is a valid context.
9.7. BR5. If I1, ..., In is a valid context, and if one of these n items is x : A (dcl), then x : A is a valid clause in that context. Similarly, if one of the n items is x :: statement (dcl), then x :: statement is a valid clause in the context. If one of the items is P (asm), then P is a valid clause in the context.

9.8. BR6. Let C and C0 be valid contexts, let x1, ..., xk be the variables of the context C0 (this notion was explained informally in Section 5.5), and let c be a valid clause in the context C0. Let X1, ..., Xk be expressions with the property that if we replace x1, ..., xk by X1, ..., Xk, then all context items of C0, with the labels “(dcl)” and “(asm)” deleted, become clauses which are valid in the context C. Then the clause we get if we replace x’s by X’s in c becomes a clause that is valid in the context C.
9.9. BR7. If I1, ..., In is a valid context, and if k < n, then I1, ..., Ik is a valid context. If c is a valid clause in the latter context, then c is a valid clause in the context I1, ..., In.
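The context-handling rules above lend themselves to a small executable model. The following Python sketch is our own illustration, not part of MV: the tuple encodings and function names are invented here. Contexts are lists of items, BR3/BR4 extend them, and BR5 reads the valid clauses back off a context.

```python
# A minimal sketch (our own formalization, not MV itself) of contexts
# in the style of BR2-BR5: a context is a list of items, where a
# declaration is ("dcl", x, typ) and an assumption is ("asm", clause).

def extend_with_assumption(ctx, P):
    """BR3: append the assumptional item 'P (asm)'."""
    return ctx + [("asm", P)]

def extend_with_declaration(ctx, x, typ):
    """BR4: append 'x : typ (dcl)'; x must be a fresh identifier."""
    assert all(item[1] != x for item in ctx if item[0] == "dcl"), \
        "BR4 requires a fresh identifier"
    return ctx + [("dcl", x, typ)]

def valid_clauses(ctx):
    """BR5: every context item yields a clause valid in the context."""
    return [(item[1], ":", item[2]) if item[0] == "dcl" else item[1]
            for item in ctx]

ctx = []                                        # BR2: the empty context
ctx = extend_with_declaration(ctx, "x", "nat")  # x : nat (dcl)
ctx = extend_with_assumption(ctx, ("x", "=", "1"))
print(valid_clauses(ctx))   # [('x', ':', 'nat'), ('x', '=', '1')]
```

BR6 (instantiation of a context) and BR7 (weakening) would be equally mechanical on this representation; we omit them to keep the sketch short.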
10. VALID BOOKS

10.1. The notion of a valid book is obtained by saying that the empty book is valid and by explaining how a valid book can be extended.
10.2. BR8. The empty book is valid.
10.3. BR9. Consider a valid book, and take any set of lines as set of old lines. Let C be a valid context with respect to this set. The following list indicates what line bodies can be used to form, together with the context C, a line that produces a valid book again if it is added to the book, making the new line younger than all the old lines. The line bodies are on the right. In the cases (iii), (iv), (v), (vi), (vii), (viii) we require, as an extra condition, that the clause on the left is valid in the context C with respect to the set of old lines.

    (i)                          P := PN :: statement
    (ii)                         P := PN :: substantive
    (iii)   Q :: statement       P := Q :: statement
    (iv)    Q :: substantive     P := Q :: substantive
    (v)     c :: statement       c [ Axiom ]
    (vi)    c                    c
    (vii)   R :: substantive     P := PN : R
    (viii)  Q : R                P := Q : R

In all cases P stands for some fresh parametrized constant, containing the variables of the context and no others. As the “clause of the line body” we take, in the cases (i) to (viii), respectively,

    (i)     P :: statement
    (ii)    P :: substantive
    (iii)   both P :: statement and Q :: statement
    (iv)    both P :: substantive and Q :: substantive
    (v)     c
    (vi)    c
    (vii)   P : R
    (viii)  both P : R and Q : R.
10.4. We have a comment on case (ii) of BR9. Some people may say it is not customary to use, or to admit the use of, lines like this in any arbitrary context. They might like to admit them in the empty context only. Essentially this comes down to starting a mathematics book with the creation of a number of types, and then off we go. This restricted use of case (ii) has the advantage that it becomes much easier to describe the collection of all types that can occur in a book. Nevertheless we keep this rule (ii) as it stands, i.e. we allow it in any context. We leave it to the user of the language to make or not to make the more restricted use of the rule. It is as with roads: one can build a road that technically admits speeds of 200 mph, the legal authorities may prescribe a speed limit of 100, and the individual user may restrict himself to a maximum of 60. We note that if a substantive is introduced as primitive by means of a line of the type (ii), then this substantive is an archetype (see Section 12.1). And if the line has a context, like the line

    x : A (dcl)  *  P(x) := PN :: substantive,

then the only way to make special instances P(u) and P(v) comparable is to require u = v.
11. COMMON STRUCTURE OF FURTHER RULES

11.1. All further rules are about the validity of clauses, where “validity” is taken with respect to a set of old lines and with respect to a context (which in its turn is assumed to be valid with respect to that set of old lines). In the simplest case such a rule will be of the form

.............................................................................   (11.1)
        P
        ----
        Q
.............................................................................

and will express the following: if P is a valid clause, then Q is a valid clause too. A variation on the scheme (11.1) is

.............................................................................
        P1      P2
        ----
        Q1      Q2      Q3
.............................................................................
which means to express the rule that if P1 and P2 are both valid, then Q1, Q2, and Q3 are valid.
11.2. Some of the rules will be slightly more intricate in the sense that they deal with context extension. This can happen in entries on either side. We take a case where it happens on the left only:

.............................................................................
        P1
        J  *  P2
        P3
        ----
        Q
.............................................................................

(the * is an smMV symbol here). The meaning of this is as follows. We are dealing with a set of old lines (which is not going to be changed in this rule), and a context C. Assume that P1 is a valid clause in the context C, that P2 is a valid clause in the extended context C, J (if C = I1, ..., In, then C, J represents the context I1, ..., In, J), and finally that P3 is a valid clause in the context C. Then Q is a valid clause in the context C. The validity of J as a context item will not be open to doubt in the cases we present. This validity will always follow from the assumptions. In rule T6 there is a case where the role of J will be taken over by two context items (separated by a comma) instead of a single one. As remarked in Section 11.1, a rule like the one above is intended to hold in any context. If I is such a context, this means that the rule also includes the following one:

.............................................................................
        I  *  P1
        I, J  *  P2
        I  *  P3
        ----
        I  *  Q
.............................................................................

11.3. In all our rules, the phrases that were represented above by P, P1, P2, P3, Q, Q1, Q2, Q3, J will be expressions in terms of one or more meta-variables. Actually all symbols that have not been introduced explicitly as pMV are to be considered as meta-variables in this kind of rules. For example, in rule T8* the letters A and B are meta-variables. In applications of that rule, they may be replaced by any pair of expressions.
11.4. Except for rule EQ11, all rules to be presented in the next sections have the form sketched in Sub-sections 11.1 and 11.2.
11.5. We sometimes use the term derived rule (smMV). Derived rules are rules whose validity follows from earlier rules. In other words, what such a rule proclaims to be valid can be shown to be valid already because of the other rules. We shall use a rule number with asterisk if we claim that the rule is a derived rule. The remaining rules are called fundamental rules, although we do not claim our set of fundamental rules to be minimal. In some cases one might be able to write such derived rules as theorems in an MV book, and then it is a matter of taste whether we present them as language rules or as theorems. There are derived rules whose derivation requires induction over the length of the book. As an example we take the observation that a : A can appear in the book only when A :: substantive. Another example is the observation that if A and B are substantives, and (A = B) :: statement, then there is a substantive C such that both A << C and B << C. We shall not actually use this kind of derived rules, but rather consider them as part of the metatheory of MV books. Therefore we shall refer to such rules as metatheory rules.

11.6. In some rules we deal with substitution (smMV). We use [[ / ]] as smMV notation. If x is an identifier, and P and Q are expressions, then [[x/P]] Q denotes the expression we get if every occurrence of x in Q is replaced by the expression P. Since P may also contain x, there may arise new occurrences of x, but these new ones are not to be replaced by P, of course. Example: [[x/g(x, y)]] f(x, y) stands for f(g(x, y), y).
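Because [[x/P]] Q is purely textual, it is easy to model. The sketch below is our own illustration (the tuple term representation is invented here, not part of MV): terms are identifiers or applications, and the inserted copies of P are never re-scanned, exactly as required above.

```python
# A small sketch (ours, not the paper's) of the smMV substitution
# [[x/P]] Q of Section 11.6.  Terms are either identifiers (strings)
# or applications, written as (head, (arg1, arg2, ...)).

def subst(x, P, Q):
    """Replace every occurrence of identifier x in Q by the term P.

    New occurrences of x arising inside the inserted copies of P are
    not replaced again, because we never recurse into P itself.
    """
    if isinstance(Q, str):
        return P if Q == x else Q
    head, args = Q
    return (head, tuple(subst(x, P, a) for a in args))

# The paper's example: [[x/g(x, y)]] f(x, y) stands for f(g(x, y), y).
result = subst("x", ("g", ("x", "y")), ("f", ("x", "y")))
print(result)   # ('f', (('g', ('x', 'y')), 'y'))
```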
12. RULES ABOUT TYPING

12.1. We start with an informal introduction on the relation “<<”, a relation that can hold between substantives. If A and B are substantives then “A << B” is to be interpreted as “every A is a B”. Example: “square << rectangle”, “rectangle << rectangle”. In smMV we say that “A << B” is a clause, and that it intends to express that “A is a sub-substantive of B”. The clause “A << B” can appear in books also in cases where A << B is not true. For example, although “rectangle << square” is not true, it can still be considered as a well-formed statement. Here we speak smMV, but in MV it would be expressed as

    rectangle << square :: statement.

Our rules will have the effect that A << B can only appear in our books if A and B have a common ancestor E, i.e., a substantive E such that A << E and B << E are both true. If there is no such common ancestor, then A << B will not be a statement. Our production rules for valid clauses will never produce
such a thing as “rectangle << complex number :: statement”. One might derive in the metalanguage that “A << B :: statement” is an equivalence relation on the set of all substantives, and that every equivalence class is completely characterized by an “archetype” E, i.e., a substantive E with the property that all A of the class are sub-substantives of E. Similarly, if an “object” x is typed by x : A, then x has an archetype E. These archetypes do not appear in our rules, and neither in our MV books. For a comment on why the rules of MV were designed without explicit archetypes we refer to Section 1.14. Since the formula A << B will not be a statement for arbitrary substantives A and B, we will be unable to state it as an assumption in an MV book. We cannot say in such a book: “let A and B be substantives and assume that A << B”. The fact that we do say such things in some of our rules (like in T4) is quite a different matter. These rules are not written in an MV book. In T4 it means the assumption “A << B is a valid clause”, and this is said in smMV, not in MV. In none of our rules there is a conclusion drawn from the mere fact that A << B :: statement. We refrain from such rules in the philosophy that in such cases we will always have some substantive C with A << C and B << C. Formulas a : A will only appear in our books in situations where A is a substantive. Likewise, A << B will only appear when both A and B are substantives. Therefore we need not add assumptions like A :: substantive on the left in the rules T1-T6. As to the use of the substantive binder S in rule T6 we refer to Section 20.4.
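The common-ancestor condition can be pictured concretely. In the following Python sketch — our own model, with invented names, not part of MV — the true <<-facts of a book form a directed graph; closing them under reflexivity (T7*) and transitivity (T12*) and intersecting ancestor sets decides whether A << B is a statement in the sense of rule T2.

```python
# A small sketch (our own model, not from the paper) of the "common
# ancestor" condition of Section 12.1: A << B is a statement exactly
# when some substantive E satisfies A << E and B << E.

def ancestors(edges, a):
    """All E with a << E, given the recorded facts plus T7*/T12*."""
    seen, todo = {a}, [a]
    while todo:
        x = todo.pop()
        for (lo, hi) in edges:
            if lo == x and hi not in seen:
                seen.add(hi)
                todo.append(hi)
    return seen

def is_statement(edges, a, b):
    """Is 'a << b :: statement' derivable (rule T2)?"""
    return bool(ancestors(edges, a) & ancestors(edges, b))

facts = [("square", "rectangle"), ("rectangle", "quadrilateral")]
print(is_statement(facts, "square", "quadrilateral"))      # True
print(is_statement(facts, "rectangle", "complex number"))  # False
```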
12.2. Rules T1-T13. (See Section 11 for notational conventions.)

.............................................................................
T1      a : A
        ----
        a : A :: statement
.............................................................................
T2      A << C      B << C
        ----
        A << B :: statement
.............................................................................
T3      A << C      B << C      a : B :: statement
        ----
        a : A :: statement
.............................................................................
T4      A << B
        ----
        x : A (dcl)  *  x : B
.............................................................................
T5      x : A (dcl)  *  x : B
        ----
        A << B
.............................................................................

We mention some derived rules. Indications for derivations are: T7*: from BR5 and T5; T8*: from T7* and T2; T9* and T10*: from T3 and T7*; T11*: from T4 and BR5; T12*: from T4, BR7, T11* and T5; T13*: from T6 and T11*.
.............................................................................
T7*     A :: substantive
        ----
        A << A
.............................................................................
T8*     A << B
        ----
        A << B :: statement
.............................................................................
T9*     A << B      a : A :: statement
        ----
        a : B :: statement
.............................................................................
T10*    A << B      a : B :: statement
        ----
        a : A :: statement
.............................................................................
T11*    A << B      a : A
        ----
        a : B
.............................................................................
T12*    A << B      B << C
        ----
        A << C
.............................................................................
T13*    x : A (dcl)  *  P :: statement      y : Sx:A P
        ----
        y : A
        [[x/y]] P
.............................................................................
......................................................................... 13. RULES ABOUT EQUALITY
13.1. We shall consider equality between names of objects, between statements and between substantives. The effect of our rules will be that objects will be comparable by equality (being “comparable by equality” means that their equality is a statement) only if they have the same archetype (cf. Section 12.1), and substantives will be comparable by equality only if they are sub-substantives of a common archetype. Any two statements will always be comparable by equality. As a metatheory rule (which we shall never explicitly use) we mention that if p = q appears in a book, then p and q are either both objects typed by a substantive, or both substantives, or both statements. In the (quite strong) rules EQ10a-10c the symbols p and q stand for phrases that possibly show one or more occurrences of the identifier t.

13.2. We first present the fundamental rules of the form displayed in Section 11.
.............................................................................
EQ1     a : A :: statement      b : A :: statement
        ----
        a = b :: statement
.............................................................................
EQ2     a : A
        ----
        a = a
.............................................................................
EQ3     a : A      a = b
        ----
        b : A
.............................................................................
EQ4     A << C      B << C
        ----
        A = B :: statement
.............................................................................
EQ5     A :: substantive      B :: substantive      A = B
        ----
        A << B
.............................................................................
EQ6     A << B      B << A
        ----
        A = B
.............................................................................
EQ7     P :: statement      Q :: statement
        ----
        P = Q :: statement
.............................................................................
EQ8     P :: statement      Q :: statement      P (asm) * Q      Q (asm) * P
        ----
        P = Q
.............................................................................
EQ9     P :: statement      Q :: statement      P = Q
        ----
        P (asm)  *  Q
        Q (asm)  *  P
.............................................................................
EQ10a   u : A      w : A      u = w      t : A (dcl)  *  p = q
        ----
        [[t/u]] p = [[t/w]] q

        (for notation of substitution see Section 11.6)
.............................................................................
EQ10b   As EQ10a, but with “:: substantive” instead of “: A”.
.............................................................................
EQ10c   As EQ10a, but with “:: statement” instead of “: A”.
.............................................................................
13.3. The following rule EQ11 is not of the general form described in Section 11.

.............................................................................
EQ11    If the set of old lines contains a line of one of the forms

            C  *  P := Q : R
            C  *  P := Q :: substantive
            C  *  P := Q :: statement

        (cf. the first three cases of Section 10.3), then P = Q is a valid
        clause in the context C.
.............................................................................
13.4. We mention the following derived rules, mainly on reflexivity, symmetry and transitivity of equality. Hints for derivation are:

EQ12*: from EQ5 and T11*;
EQ13*: from T9* and EQ1;
EQ14*: from T7* and EQ6;
EQ15*: from EQ5 and EQ6;
EQ16*: from EQ5, T12* and EQ6;
EQ17*: from BR5 and EQ8;
EQ18*: from EQ9 and EQ8;
EQ19*: from EQ9 and BR6;
EQ20*: from EQ9, BR6 and EQ8;
EQ21*: from EQ1, EQ17*, EQ10a, EQ2 and EQ18*;
EQ22*: from EQ1, EQ17*, EQ10a;
EQ23*: from EQ6, T6, T13*, BR6, EQ19*, T5 and EQ6;
EQ24*: from T6, EQ12* and EQ8;
EQ25*: from T5 and EQ6.
.............................................................................
EQ12*   A :: substantive      B :: substantive      a : A      A = B
        ----
        a : B
.............................................................................
EQ13*   a : A :: statement      b : B :: statement      A << B
        ----
        a = b :: statement
.............................................................................
EQ14*   A :: substantive
        ----
        A = A
.............................................................................
EQ15*   A :: substantive      B :: substantive      A = B
        ----
        B = A
.............................................................................
EQ16*   A :: substantive      B :: substantive      C :: substantive      A = B      B = C
        ----
        A = C
.............................................................................
EQ17*   P :: statement
        ----
        P = P
.............................................................................
EQ18*   P :: statement      Q :: statement      P = Q
        ----
        Q = P
.............................................................................
EQ19*   P :: statement      Q :: statement      P = Q      P
        ----
        Q
.............................................................................
EQ20*   P :: statement      Q :: statement      R :: statement      P = Q      Q = R
        ----
        P = R
.............................................................................
EQ21*   a : A      a = b
        ----
        b = a
.............................................................................
EQ22*   a : A      a = b      b = c
        ----
        a = c
.............................................................................
EQ23*   A :: substantive      x : A (dcl) * P :: statement      x : A (dcl) * Q :: statement      x : A (dcl) * P = Q
        ----
        Sx:A P = Sx:A Q
.............................................................................
EQ24*   A :: substantive      x : A (dcl) * P :: statement      x : A (dcl) * Q :: statement      Sx:A P = Sx:A Q
        ----
        x : A (dcl)  *  P = Q
.............................................................................
EQ25*   A :: substantive      B :: substantive      x : A (dcl) * x : B      x : B (dcl) * x : A
        ----
        A = B
.............................................................................
14. RULES ABOUT SETS

14.1. In this section we shall provide rules that take care of the translation of the language of substantives into the language of sets. This translation is not very essential, and whether we prefer sets over substantives is partly a matter of fashion. But one thing is really important for us: we want to be able to speak of the collection of all subsets of a set, and to quantify over that collection. The symbols “-set”, “↑”, “↓” are pMV. We use “A-set” as a substantive formed from the substantive A (like “pointset” is derived from “point”). The notation “A↑” can be pronounced as “the set of all A’s”, and if T is an A-set, then T↓ can be pronounced as the substantive “element of T”.
14.2. Fundamental rules.
.............................................................................
S1      A :: substantive
        ----
        A-set :: substantive
        A↑ : A-set
.............................................................................
S2      A :: substantive      T : A-set
        ----
        T↓ :: substantive
        T↓ << A
        (T↓)↑ = T
.............................................................................
S3      A :: substantive
        ----
        A = (A↑)↓
.............................................................................
S4      A :: substantive      B :: substantive      A << B
        ----
        A-set << B-set
.............................................................................

14.3. In order to get to the ordinary notations about sets we have to introduce some typographical abbreviations (for this notion we refer to Section 21):

    a ∈ T           stands for    a : T↓,
    T1 ⊂ T2         stands for    T1↓ << T2↓,
    {x ∈ T | P(x)}  stands for    (Sx:A P(x))↑, where A = T↓.

14.4. We mention a number of derived rules. Hints for derivation are:

S5*: from S1, S4 and T11*;
S6*: from S4, T2, EQ4, S1, T1, T3;
S7*: from S2, T11*, BR5, EQ20*, EQ24*;
S8*: from BR5, S1, EQ14*, EQ10b, S1, EQ2, EQ10b;
S9*: from S4, S3, S2, EQ14*, EQ10a, S2, EQ15*, EQ16*;
S10*: from S1, EQ14*, EQ10a;
S11*: from S2, S8*, EQ23*, EQ24*.
.............................................................................
S5*     A :: substantive      B :: substantive      A << B
        ----
        A↑ : B-set
.............................................................................
S6*     A :: substantive      B :: substantive      C :: substantive      A << C      B << C
        ----
        A-set << B-set :: statement
        A-set = B-set :: statement
        A↑ : B-set :: statement
.............................................................................
S7*     A :: substantive      T1 : A-set      T2 : A-set      T1 = T2
        ----
        x : A (dcl)  *  (x ∈ T1) = (x ∈ T2)
.............................................................................
S8*     A :: substantive      B :: substantive      A = B
        ----
        A-set = B-set
        A↑ = B↑
.............................................................................
S9*     A :: substantive      B :: substantive      C :: substantive      A << C      B << C      A↑ = B↑
        ----
        A = B
.............................................................................
S10*    A :: substantive      T1 : A-set      T2 : A-set
        ----
        T1 = T2 :: statement
.............................................................................
S11*    A :: substantive      T1 : A-set      T2 : A-set      T1 = T2
        ----
        T1↓ = T2↓
.............................................................................
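To see what the ↑/↓ machinery and the abbreviations of Section 14.3 amount to, here is a deliberately naive Python model. It is our own illustration, not MV: both arrows become (essentially) the identity on a set of instances, so the model illustrates the laws rather than proving anything about MV.

```python
# A toy model (ours, not the paper's) of Section 14: a substantive is
# modelled as the Python set of its instances; A↑ ("the set of all
# A's") and T↓ ("element of T") then both amount to that same set,
# and the abbreviations of 14.3 turn into ordinary set operations.

def up(A):            # A↑ : A-set            (rule S1)
    return frozenset(A)

def down(T):          # T↓ :: substantive     (rule S2)
    return frozenset(T)

def member(a, T):     # a ∈ T  stands for  a : T↓
    return a in down(T)

def comprehension(T, P):
    # {x ∈ T | P(x)}  stands for  (Sx:A P(x))↑  with A = T↓
    return frozenset(x for x in down(T) if P(x))

nat = frozenset(range(10))        # a stand-in substantive
evens = comprehension(up(nat), lambda x: x % 2 == 0)
print(down(up(nat)) == nat)       # S3: A = (A↑)↓       → True
print(member(4, evens))           # → True
print(evens <= up(nat))           # ⊂ as T1↓ << T2↓     → True
```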
15. RULES ABOUT PAIRS

15.1. If we have two substantives, “point” and “line”, say, we want to speak of pairs, the first component of which is a point, the second one a line. Such pairs are called “point-line-pairs”. The symbols “-pair”, “proj1”, “proj2”, “the pair” are pMV.

15.2. Fundamental rules about pairs.
.............................................................................
P1      A :: substantive      B :: substantive
        ----
        A-B-pair :: substantive
.............................................................................
P2      a : A      b : B
        ----
        the pair (a, b) : A-B-pair
.............................................................................
P3      A :: substantive      B :: substantive      u : A-B-pair
        ----
        proj1(u) : A
        proj2(u) : B
        the pair (proj1(u), proj2(u)) = u
.............................................................................
P4      a : A      b : B
        ----
        proj1(the pair (a, b)) = a
        proj2(the pair (a, b)) = b
.............................................................................

15.3. We mention two derived rules. Hints for derivation are: P5*: from P3, T11*, P2, EQ22*, T5; P6*: from P5*, T2, EQ4.

.............................................................................
P5*     A << C      B << D
        ----
        A-B-pair << C-D-pair
.............................................................................
P6*     A << E      C << E      B << F      D << F
        ----
        A-B-pair << C-D-pair :: statement
        A-B-pair = C-D-pair :: statement
.............................................................................
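Rules P2-P4 describe exactly the surjective-pairing laws familiar from programming. A sketch in Python — our own rendering, with names mirroring the pMV symbols:

```python
# A sketch (our reading of Section 15, not MV itself): "the pair" is
# the constructor, proj1/proj2 the projections; P3 and P4 then state
# the usual surjective-pairing equations.

def the_pair(a, b):
    return (a, b)

def proj1(u):
    return u[0]

def proj2(u):
    return u[1]

u = the_pair("point P", "line l")
print(proj1(u))                            # P4: proj1(the pair (a, b)) = a
print(the_pair(proj1(u), proj2(u)) == u)   # P3: u is rebuilt from its parts
```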
16. RULES ABOUT FUNCTIONS

16.1. If A and B are substantives we shall introduce a new substantive “mapping of A’s to B’s”. We write this in MV as “A → B”; the symbol → is pMV. The fact that the same arrow is used for implication (Section 17) will give no confusion. Actually the rules for the two are alike (compare F2 and F3 with L2 and L3). For the value of the function f at the point p we shall write in MV “val(f, p)”, instead of the usual f(p). The notations val and λ are pMV.

16.2. Fundamental rules about functions.
.............................................................................
F1      A :: substantive      B :: substantive
        ----
        A → B :: substantive
.............................................................................
F2      A :: substantive      B :: substantive      f : A → B      p : A
        ----
        val(f, p) : B
.............................................................................
F3      A :: substantive      B :: substantive      x : A (dcl)  *  F : B
        ----
        λx:A F : A → B
.............................................................................
F4      A :: substantive      B :: substantive      x : A (dcl)  *  F : B      y : A
        ----
        val(λx:A F, y) = [[x/y]] F
.............................................................................
F5      A :: substantive      B :: substantive      f : A → B      g : A → B      x : A (dcl)  *  val(f, x) = val(g, x)
        ----
        f = g
.............................................................................
16.3. Here are two derived rules. Hints for derivation are: F6*: from F2, F4, F5; F7*: from T11*, EQ12*, F3, F6*, EQ22*, T5.

.............................................................................
F6*     A :: substantive      B :: substantive      f : A → B
        ----
        λx:A val(f, x) = f
.............................................................................
F7*     A :: substantive      B :: substantive      C :: substantive      D :: substantive      A = C      B = D
        ----
        A → B << C → D
.............................................................................

16.4. Many mathematicians would prefer to express the notion of function by means of a graph in a Cartesian product, which has the advantage of reducing the number of basic rules. On the other hand the function concept seems to be such a natural one, and the way we think of functions is usually so far from Cartesian products, that it is attractive to describe the function concept independently.
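Read through a programmer's eyes, F2-F4 are application, abstraction and beta reduction, and F5/F6* are extensionality. A Python sketch — our own; `lam`, `val` and the domain check are invented for the illustration:

```python
# A sketch (not from the paper) of Section 16: λx:A F becomes a
# Python function carrying its domain A, val(f, p) is application,
# and F4's equation val(λx:A F, y) = [[x/y]] F is beta reduction.

def lam(A, F):
    """λx:A F, keeping the domain A so rule F2's 'p : A' is checked."""
    def f(p):
        assert p in A, "val(f, p) requires p : A (rule F2)"
        return F(p)
    return f

def val(f, p):
    return f(p)

nat = range(10)                      # a stand-in for a substantive A
double = lam(nat, lambda x: x + x)   # λx:nat (x + x)
print(val(double, 3))                # F4: [[x/3]] (x + x) = 6
```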
17. RULES ABOUT LOGIC

17.1. The only things to be presented in this section are the rules for implication and for universal and existential quantification. Treatment of negation, conjunction and disjunction can be postponed to the MV book. For this possibility we refer to Section 18.1. The symbol “→” is pMV; the same arrow was used in Section 16 for the notation of mappings.
17.2. We first present the fundamental rules L1, L2, L3.

.............................................................................
L1      P :: statement      Q :: statement
        ----
        P → Q :: statement
.............................................................................
L2      P :: statement      Q :: statement      P → Q
        ----
        P (asm)  *  Q
.............................................................................
L3      P :: statement      Q :: statement      P (asm)  *  Q
        ----
        P → Q
.............................................................................
L4*: from BR5, L3, L5*: from L2, EQ8, L6*: from L4*, EQlOb.
....
.....................................................................
L4*
P :: statement
I . .
P+P
. . . . . . . . ................................................................. P :: statement Q :: statement P‘Q Q’P
L5*
... L6*
..
P=Q
.................................................................. P :: statement Q :: statement
P‘Q Q“P
P=Q
............................................................................. 17.4. We finally present derived rules on universal quantification. Let P be an expression (possibly containing the identifier z). Then we take “ V z : ~P” as P”. typographical abbreviation (see Section 21) for “A = For this new quantifier the following rules can be derived.
........................................................................ L10*
A :: substantive z : A(dc1) * P :: statement
V z : ~P :: statement
.............................................................................
The mathematical vernacular (F.3)
L11’
A :: substantive z : A(dc1) * P :: statement 2 : A(dc1) * P
911
V x :P ~
............................................................................. L12’
A :: substantive z : A(dc1)
*
P :: statement
vx:Ap
“x/aIl p
a : A
17.5. The reader might have expected a treatment of existential quantification too. This can easily be postponed to the MV book. It can be built upon axioms for a statement exist(A), where A is a substantive. It seems to be nicer to postpone that to the book, since it is of the same nature as the axioms for disjunction in propositional calculus.

18. EXAMPLE OF AN MV BOOK

18.1. Having completed our presentation of the rules of MV, we can now start writing books. In the beginning of an MV book we still have to write a number of fundamentals in the form of primitive notions and axioms. These might have been taken as language rules too, but we would rather leave it to the user of the language to have it his own way. Moreover, the language definition is simplified if we shift to the book whatever we can. Nevertheless there are several things in our language rules that have a form that would enable us to write them in the book. As examples we mention EQ2, EQ3, S3, F1, F2, F5, L1, L2, L3 as far as “fundamental” rules are concerned. In the case of S1, F1, L1 we had serious reasons for not shifting them to the book: they were needed for the formulation of further rules that had to stay in the language definition because of their form. For the others there is no other reason than the wish to keep related material together. The fact that some of them play a role in the derivation of derived rules (like EQ2 is used in the derivation of EQ21*) is not a serious reason. The derived rules do not belong to the definition and theory of MV, and might as well be postponed until after a piece of the book has been written. In Section 19.4 we show that some book material might also have been put in the form of rules of the type of Section 17.
18.2. In the following MV book with pointed and rectangular flags (cf. Section 6.4) we have numbers (1), (2), ... on the left. These do not belong to the book, but serve as labels for our comments in Section 19.
a :: statement> :: statement a then b := a
>
a-b b then a
+
-
b :: statement
b
El and b := PN :: statement or b := PN :: statement
4
a or b [ Axiom ]
P
a and b [ Axiom ] b and a
El
a or b [ Axiom ]
1a [ Axiom ] b [ Axiom ] b and a aorb I I c :: statement
>
a -+ ( b or a ) b -+ ( b or a ) b or a contradiction := PN :: statement no := contradiction :: statement a :: statement>
I
The mathematical vernacular (F.3) not(a) := a + no :: statement no a [ Axiom ] ( ( a -+ no) + a) + a [ Axiom ] -+
(a
-
no)
-+
not(not(a))
a
+a
'al
11-
r
not(6) a ot(a) + a d 6 = not(a b = not(a) I a or not(a)
not(b)) b
-+
-+
1 :: substantive> xist(A) := PN :: statement exist(A)
[ Axiom ]
exist(A)] :: statement
not (Vz:Ano)
not(tl,,Ano)I not exist A
I
exist(A) no Vx:Ano no 2t (not(exist(A))) tist(A) st(A) = not(V,,A no) re is exactly one A := exist(S,,AVb,A (b = a ) ) :: statement iere is exactly one A I ie A := P N : A b =the A
A) Z}A :=fx:A
(x = a ) : A-set
e=
X:> {a}A
{ a , b}A :=fz:A (x = a or x = b) : A-set a E {a,
~IA
Ea,
b E {a,b)A
b ) and ~ not (c = a )
I
A :: substantive B :: substantive
IA
union(A, B , f ) :=fb:B 3,:Ab E
Vd(f, U )
: B-Set
:: substantive>
D
P :: statement >
    selected(A, a, b, P) := Sx:A (((x = a) and P) or ((x = b) and not(P))) :: substantive
    there is exactly one selected(A, a, b, P)
    selection(A, a, b, P) := the selected(A, a, b, P) : A
    if P then (selection(A, a, b, P) = a)
    if not(P) then (selection(A, a, b, P) = b)
natural number := PN :: substantive
nat := natural number :: substantive
N := nat↑ : nat-set
1 := PN : nat
suc(n) := successor of n : nat
not(suc(n) = 1) [ Axiom ]
propagate(S) := ∀n:N ((n ∈ S) → (suc(n) ∈ S)) :: statement
if start(S) and propagate(S) then S = N [ Axiom ]
m divides n := ∃k:nat (k · m = n) :: statement
divisor of n := Sk:nat (k divides n) :: substantive
(divisor of n) << nat
prime number := Sp:nat (not(p = 1) and ∀k:divisor of p (k = 1 or k = p)) :: substantive
prime number << nat
not(p · q : prime number) :: statement
point := PN ::substantive line := PN :: substantive
r
A := P N :: statement
B : oint
A and B := ( b goes thr. A ) and ( b goes thr. B )
pzz>
one
(Sc:line
(c goes thr. A and B)) [ Axiom
1
N.G. de Bruijn
916
(147) (148) (149)
P
b : line A , B, C on 6 := ( b goes thr. A and B ) and (6 goes thr. C) :: statement
3A:point 3B:point 3C:point
not(%:line ( A , B ,
c on d ) ) [ Axiom ]
19. COMMENTS ON THE EXAMPLE OF AN MV BOOK

19.1. In Section 18.2 the text from (1) to (59) represents a piece of propositional logic. It is not complete in the sense that it contains everything one might ever need, but it is sufficiently representative for showing the following things.

(i) Some logic can indeed be developed in the book, so we do not need to put all of it in the language rules.

(ii) The logical statements we derive in the book can be applied later as inference rules, in the same way as mathematical theorems are applied. So there is hardly a borderline between logic and mathematics. A much more prominent borderline is the one between language definition and book material.

(iii) The book starts with minimal propositional logic (lines (1) to (32)): just the rules for introduction and elimination of implication, conjunction and disjunction, without any negation. Next there is the introduction of contradiction and negation (lines (33) to (36)) and the "falsum rule" (37) as a logical axiom. This part of the book ((1) to (37)) might be used as a basis for intuitionistic mathematics. It is not very likely, however, that our system would satisfy all intuitionists. They might dislike some of the things we have preferred to put in the language rules, like the possibility to introduce arbitrary substantives, and the rules S1-S4 (which make it possible to talk about the set of all subsets of a set). The extra axiom (38) takes us into classical logic, with the double negation rule (44) and the rule of the excluded third (59) as results. From there on we are in classical logic. It has to be admitted that if classical logic had been our only goal, the number of primitives and axioms might have been reduced considerably. But reducing the number of axioms, important as it might be for metatheory, is completely irrelevant from the practical point of view: later in the book old axioms and primitives are applied in exactly the same way as old theorems and old defined notions.

(iv) The spirit of the treatment is natural deduction. There is no trace of treating logic by means of truth values. Contrary to popular opinion, it
is hard, if possible at all, to explain logical reasoning by means of truth values, unless one cheats by using a priori knowledge of logical reasoning in order to explain what such reasoning is. But nothing is lost by discarding truth tables. Everything that can be done with truth tables, can be done in natural deduction, usually faster, and usually closer to our actual way of thinking.
19.2. We now comment on some of the details of the MV book of Section 18.2. The book starts with implication. Line (3) introduces an alternative notation for the →. Lines (4) to (8) are meant as a little exercise with the introduction of implication, according to rule L3. Note that we have (6) by the fact that we had "b" in an old line in a smaller context (BR5 and BR7). Now (7) follows by L3. Similarly the little theorem (8) follows from (7).

Lines (9) to (11) apply rule L2 for the elimination of implication, the so-called "modus ponens" rule. Actually it makes the rule available as a book theorem: any further case of modus ponens can be seen as an application of this theorem (11). A similar remark holds for many mathematical theorems. We need not always transform results obtained inside a block into results with a shorter context by means of L3 and L11*. We can just leave them quietly in their context, ready for the application of the powerful rule BR6.

Lines (12) and (13) introduce the conjunction and disjunction as primitives. The introduction rule for the conjunction is given in (17) and (18); note that (18) need not be labeled as an axiom since it can be obtained from (17) by substitution: b for a and a for b. The elimination rule for the conjunction is given by (22) and (23), and here both have to be axioms. The introduction rule for the disjunction is expressed by the two axioms (15) and (20). Elimination of disjunction is achieved by the more complex rule (29). It is the basis for "proof by cases": if we want to prove c and we know that a or b, then it suffices to derive c from a and from b separately.

In (33) the primitive notion "contradiction" is introduced. Thus far there was no question of "falsehood", since all validity rules in the language definition are formulated in a positive way. In an MV book there is no reason to say that a thing is true: a man is not more honest because he says that he is honest.
But we do say sometimes that a thing is false, by saying that it implies a contradiction. This is expressed in (36). In (34) we just abbreviated "contradiction" to "no". This has the same function as the typographical abbreviations (Section 21), but we prefer to restrict the use of typographical abbreviations to cases that cannot be expressed in the book. The same remark applies to the notation ¬a instead of not(a): it can be introduced in a definitional line in the MV book if we wish.
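The negation machinery of (33)-(36) can be mirrored directly in a proof assistant. Here is a minimal sketch in Lean 4; the names `no` and `not'` are my renderings of the book's "no" and "not", not MV syntax.

```lean
-- "no" as an opaque statement (the book's primitive "contradiction"),
-- negation defined as implication into it, as in line (35).
opaque no : Prop

def not' (a : Prop) : Prop := a → no

-- (36) read in Lean terms: saying "a is false" amounts to deriving a → no.
example (a : Prop) (h : a → no) : not' a := h
```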
Classical logic is obtained by adding the double negation law (45): it says that if the negation of a is not true, then a is true. In the present text it is a theorem instead of an axiom, derivable from the two axioms (37) and (38). The first one is the intuitionistic "falsum rule", the second one is a special case of Peirce's law ((a → b) → a) → a. We have taken (38) as an axiom since it gives rise to interesting exercises in natural deduction. Let us assume that we did not have the rules (12) to (32) in our book. Then we can still say that the rule (38) (holding for all statements a) has the effect of a disjunction, viz. the disjunction of b and not(b) (for all b). Line (56) shows that a can be proved by cases, just as if "b or not(b)" had been available.

In (45) we have the double negation law: the negation of the negation of a statement implies that statement itself. If we would have taken the double negation rule (45) as an axiom, we might have derived both (37) and (38) as theorems. In a certain sense (37) and (38) form an orthogonal decomposition of (45). This is to be interpreted with the following notion of orthogonality: statements p and q are called orthogonal if (p → q) → q and (q → p) → p. In a way that means that q and p do not give any information about each other: if we can prove q under the assumption p then we can prove q all by itself, and if we can prove p under the assumption q then we can prove p all by itself. And indeed, without using the book from (1) to (32) it can be shown directly after line (35) that no → a and ((a → no) → a) → a are orthogonal.

A few lines about the derivation of (45). By modus ponens we have

    not(not(a)) (ass), not(a) (ass)  ⊢  no,

so the falsum rule leads to

    not(not(a)) (ass), not(a) (ass)  ⊢  a.

Therefore

    not(not(a)) (ass)  ⊢  (a → no) → a,

so applying (38) with modus ponens we get

    not(not(a)) (ass)  ⊢  a.
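This derivation can be checked mechanically. Here is a sketch in Lean 4, with (37) and (38) taken as hypotheses rather than axioms; the names are mine, not MV syntax.

```lean
-- Derivation of the double negation law (45) from the falsum rule (37) and
-- the Peirce instance (38), following the text step by step.
theorem doubleNeg (no : Prop)
    (falsum : ∀ a : Prop, no → a)                 -- (37)
    (peirce : ∀ a : Prop, ((a → no) → a) → a)     -- (38)
    (a : Prop) : ((a → no) → no) → a := by
  intro nna
  -- discharging not(a) gives (a → no) → a; then (38) plus modus ponens give a
  exact peirce a (fun na => falsum a (nna na))
```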
Line (59) is the so-called rule of the excluded third. Abbreviating "a or not(a)" to c, we get it from (56), with a replaced by c and b by a. Notice that both a and not(a) lead to c, because of (15) and (20). The basic rules BR1-BR7 and the equality rules EQ1-EQ25 soon become second nature to us, and therefore we shall hardly notice their application any more.
19.3. At this point we note that it is hard to give a satisfactory description of the word "proof" as an smMV term. Looking ahead from a line (i) to a line (j) (where i < j) one might say that the text between (i) and (j) is a proof of (j). Others might only count the material between (i) and (j) as far as it is actually used for the derivation of (j). More important is the question whether explanations of the type given in Section 19.2 belong to the proof. If the steps in the MV book are so small that each line requires just a single application of a single rule, then most people would call it a very detailed proof, even if it is not mentioned in the text what rules were applied. We can omit lines here and there such that most people will be able to find intermediate steps themselves by mental exercise. In the case of (24) most people will be able to find the missing link in a split second. In the case of (57) there is a sequence of missing links, and we will feel the need for scrap paper, even if we are experienced in this kind of natural deduction. We can go quite far in this respect. It was pointed out already in Section 1.9 that the rules of MV allow us to omit intermediate steps, even to such an extent that readers may find that the essence of the proof is lacking. This is caused by the fact that validity in MV is defined recursively. Something can be valid because of the existence of a sequence of intermediate steps; it is not required that these steps have actually been written down in the book.

19.4. Some of the logical parts of the text of Section 18.2 give us the same rights as if we had added a number of further rules in Section 17. We shall display them in that form here, labeled with two asterisks instead of one, since they are not derived from the fundamental rules of Sections 9-17 but from material of the particular book of Section 18.
.............................................................................
L13**    P :: statement    Q :: statement    P    Q
         ⊢  P and Q
.............................................................................
L14**    P :: statement    Q :: statement    P and Q
         ⊢  P        ⊢  Q
.............................................................................
L15**    P :: statement    Q :: statement    P
         ⊢  P or Q
.............................................................................
L16**    P :: statement    Q :: statement    Q
         ⊢  P or Q
.............................................................................
L19**    P :: statement    Q :: statement    R :: statement    P → R    Q → R    P or Q
         ⊢  R
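In Lean 4 these five derived rules correspond exactly to the introduction and elimination principles of And and Or; a sketch (the rule names are mine):

```lean
-- The derived rules L13**-L19** as ordinary Lean lemmas.
theorem rule_L13 (P Q : Prop) (hp : P) (hq : Q) : P ∧ Q := ⟨hp, hq⟩
theorem rule_L14a (P Q : Prop) (h : P ∧ Q) : P := h.1
theorem rule_L14b (P Q : Prop) (h : P ∧ Q) : Q := h.2
theorem rule_L15 (P Q : Prop) (hp : P) : P ∨ Q := Or.inl hp
theorem rule_L16 (P Q : Prop) (hq : Q) : P ∨ Q := Or.inr hq
-- L19** is "proof by cases" on a disjunction.
theorem rule_L19 (P Q R : Prop) (hpr : P → R) (hqr : Q → R) (h : P ∨ Q) : R :=
  h.elim hpr hqr
```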
.............................................................................

19.5. In (60)-(97) we show a few things about sets, sufficiently representative for showing how one should go on. Lines (60)-(67) introduce the notion of existence on a negation-free basis. Once we have classical logic, we can do more. In (64)-(80) we show the equivalence of "exist(A)" and "not(∀_{x:A} no)", which is the basis for expressing existential quantifiers in terms of universal ones, and vice versa. We mention that (69) is obtained by application of (67) with c replaced by no; with this value of c assumption (66) is valid because of (68). Next, (70) rests on L3, (69) and, of course, (36). One gets (74) from (63), replacing a by x, and (75) from L2, using (72) and (74). Then (76) follows from L11*, using (75), next (77) from L2 with (71), (76), and (78) from L3 and (77). Finally we get (79) from (78) and (45), and (80) from (70), (79) by virtue of L5*.

In (83) we introduce (as a primitive notion) the definite article "the" in front of a substantive, if that substantive has the uniqueness property assumed in (82). In (85) we assert that if the uniqueness property holds then every A equals "the A". A detailed proof of (85) can be given as follows. Abbreviate K(A) := S_{a:A} ∀_{b:A}(b = a). Then (82) says exist(K(A)). We want to apply (67), replacing A by K(A) and c by "b = the A". To that end we have to satisfy what the condition (66) amounts to in this case, i.e., ∀_{a:K(A)}(b = the A). In order to prove the latter statement, we extend the context by means of a : K(A) (decl). In the extended context we have to show b = the A. In this context we have a : K(A), and therefore ∀_{d:A}(d = a). Using L12* we get both b = a and the A = a, so by EQ21* and EQ22* we infer b = the A. Note that this derivation uses no classical logic. It is entirely negation-free.

In (87) we define the singleton {a}_A as an A-set. In usual untyped set theory the subscript A is superfluous, but here it is not.
Nevertheless, having to write the subscript is a formal duty only, for if the same a also satisfies a : B then {a}_A = {a}_B. We know (by metatheory) that the typings a : A and a : B can hold simultaneously only if A and B are sub-substantives of a common K (which might be the archetype of a), and therefore (90) helps us out. Note that in (87) and (92) the typographical abbreviation τ_{x:A} is used (see (20.7.2)). In the context of (87) we can define the empty set too:

    A :: substantive (decl)  ⊢  emptyset_A := τ_{x:A} no : A-set .
Note that we do not have a universal empty set: every archetype has one of its own. In (90) we wanted to express that if A and C are substantives with C

as the thing denoted by {a_1, . . . , a_n}. In this case n is a variable natural number in the metalanguage smMV. The two ways (i) and (ii) can be connected in the MV book for every single value of n, but not for all n simultaneously. We note that the length of the derivation (i.e. the number of applications of language rules) is more or less proportional to n. In (101) we define the union of an indexed collection of sets.
Lines (102)-(110) present the “if-then-else” selector, which can be used as a basis for definition of functions by cases. We omit indications of the proofs. Note that (108) uses the “the” of (83).
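The "if-then-else" selector just mentioned can be sketched in Lean 4; `selection` and the two equations are my renderings (using classical logic to decide P), not de Bruijn's own primitives:

```lean
-- Definition by cases: selection a b P is a when P holds and b when it fails,
-- mirroring lines (102)-(110).
open Classical in
noncomputable def selection {A : Type} (a b : A) (P : Prop) : A :=
  if P then a else b

theorem selection_pos {A : Type} (a b : A) (P : Prop) (h : P) :
    selection a b P = a := by
  unfold selection; exact if_pos h

theorem selection_neg {A : Type} (a b : A) (P : Prop) (h : ¬P) :
    selection a b P = b := by
  unfold selection; exact if_neg h
```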
19.6. The text from (111) to (135) deals with the natural number system. The Peano axioms are (112), (113), (114), (116), (118), (124). In (127) we have presented the notion of the product of two natural numbers by means of a PN. This can of course be avoided (the product can be defined), but the text would become lengthy, and it is our present purpose to get rapidly to divisibility. In (129) we define a substantive "divisor of n", in (130) it is noted that it is a sub-substantive of "nat", and (131) gives the definition of "prime number". In (135) we form the statement that the product of two primes is not a prime. Note that this contains an example of a typing (p · q : prime number) playing the role of a statement, which is allowed by virtue of (132) since p · q : nat. It would be wrong to claim "not (p · q : prime number)" as a theorem here: a proof would require more information about products than what is expressed in (127).

19.7. The text from (136) to (149) is based on the beginning of Hilbert's axiomatization of geometry. Hilbert starts by saying "We conceive three different systems of things: the things of the first system are called "points", those of the second system are called "lines", those of the third system "planes"." Hilbert does not make any use of his "systems" as systems: what he actually does is just handling the words "point", "line", "plane" as new substantives. Therefore we interpreted his words in MV by taking them as PN's. As to (143) one might hesitate. Does this really require a mathematical definition or is it just one of the linguistic transformations we want to admit anyway? We refer to Section 22 for such matters.
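The divisibility fragment of 19.6 translates readily; a sketch in Lean 4, with the book's "nat" read as Lean's Nat and the names `divides` and `prime` mine:

```lean
-- "m divides n" and "prime number" as in lines (128)-(131).
def divides (m n : Nat) : Prop := ∃ k : Nat, k * m = n

def prime (p : Nat) : Prop :=
  ¬(p = 1) ∧ ∀ k : Nat, divides k p → k = 1 ∨ k = p

-- A divisibility statement is proved by exhibiting the witness k.
example : divides 3 12 := ⟨4, rfl⟩
```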
20. BINDERS

20.1. We have not paid much attention to quantification by means of bound variables and quantifiers. Formal treatment of quantification in lambda calculus is well known, of course. One of the hard things in quantification is the treatment of the names of bound variables, which have to be refreshed occasionally. In this section we do not go into these standard matters of non-typed lambda calculus. Instead, we shall indicate a number of points in which our present proposal of MV differs from more usual ways to treat quantification.

20.2. First we mention that our MV is a typed language, and that, accordingly, the bound variables in the quantifications run over a certain range. The range
can be indicated by a substantive A, using a typing "x : A", or by a set S, and then we use x ∈ S. By 14.3 we can easily pass from one to the other, and so we restrict our discussions to the case "x : A".
20.3. In OMV the "value" of a result of quantification is either a statement or an object. Examples of the first kind are

    ∀_{n:natural number} P(n),        ∃_{n:natural number} P(n),

where P(n) is a statement. Examples of the second kind are

    Σ_{n=1}^{∞} f(n),        ⋃_{n∈S} V(n),        {x ∈ S | P(x)},
where f(n) is a number, V(n) is a set, P(x) a statement.

20.4. In MV, where substantives are taken seriously, we can also admit quantification where the value of the result is a substantive. One of the possibilities is given by the binder S, introduced in rule T6, Section 12.2. Its meaning is shown in the following example. We consider "quadrilateral with the property that its diagonals are perpendicular to each other" as a new substantive. It can be written by means of a quantifier as

    S_{x:quadrilateral}(the diagonals of x are perpendicular) .
20.5. The following example demonstrates a second case where quantification leads to a new substantive. We have the name "square of p" if p is a prime number. We want to despecify the variable p, and get to the substantive "square of a prime", or "prime square". For this we can use the binder "despo" and write

    despo_{p:prime}(the square of p)

(despo is short for "despecified object"). This binder can be considered as a typographical abbreviation (see Section 21.2).
20.6. Many quantifiers can be expressed once we have the functional binder (Church's λ). If A and B are substantives, if F(...x...) is an expression containing x at one or more places, with the property that x : A implies F(...x...) : B, then λ_{x:A} F(...x...) is the function that attaches, for each x, the value F(...x...) to x. Example:
(20.6.1)        λ_{n:positive integer}((n² + n)⁻²) .

Some other quantifiers can be expressed in terms of this one. For example, the sum

(20.6.2)        Σ_{n:positive integer} (n² + n)⁻²

might be written as

(20.6.3)        sum(λ_{n:positive integer}((n² + n)⁻²)) .
This means that the quantification (20.6.2) can be obtained by application of the unary operator "sum" to the function (20.6.1). In contrast to (20.6.2), all the binding in (20.6.3) is in the function, and nothing of it in the operation.

20.7. The binder S of rule T6, Section 12.2, plays a central role similar to that of Church's λ. In Section 20.6 we had the unary operator "sum", acting on an expression quantified by a λ; in the present case we take as examples the unary operators "exist" (from Section 18.2, formula (61)) and "↑" (from Section 14.2). The operator "exist" maps substantives into statements, the "↑" maps substantives into names. If P is an expression containing x, such that for all x : A we have P :: statement, then we can form a new statement ∃_{x:A} P(x) by agreeing that

(20.7.1)        ∃_{x:A} P(x) = exist(S_{x:A} P(x))

and the new name

(20.7.2)        τ_{x:A} P(x) = (S_{x:A} P(x))↑ .

This (20.7.2) is usually written in OMV as {x : A | P(x)}. We get the standard rules for introduction and elimination of the quantifiers ∃_{x:A} and τ_{x:A} from T6 (Section 12.2) in combination with the rules for the unary operator "exist" (Section 18.2, (60)-(67)) and the rules for the unary operator ↑ (Section 14.2).

At this point it should be noted that we have not provided facilities in MV to deal explicitly with predicates. If A is a substantive then we cannot express in MV that something is a predicate over A. Instead, we have to use the rules of Section 14. A predicate is usually seen as a mapping from objects to statements. As an example, we consider the property of a natural number to be > 5. One might suggest λ_{x:nat}(x > 5) that sends any natural number x into the statement x > 5. It seems attractive, but we have not incorporated this into MV. It would not quite fit into our system to attach a type to such a thing, and therefore it would not help us to create arbitrary predicates.

20.8.
The set-forming rules of Section 14 help us out. Instead of discussing a predicate in MV, we discuss the set of all objects satisfying that predicate. So instead of that λ_{x:nat}(x > 5) we talk about the set τ_{x:A} P(x) (cf. (20.7.2)). And instead of taking arbitrary predicates we take an arbitrary A-set, as in formula (88) of Section 18.2.
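The mechanism of Sections 20.6-20.8 — a binder-free unary operator applied to a λ-abstraction — can be sketched in Lean 4 as follows; `sum` (a finite version over 1..N) and `exist` are illustrative stand-ins of mine, not de Bruijn's operators:

```lean
-- A binder-free operator applied to a function: all binding is in the λ.
def sum (N : Nat) (f : Nat → Nat) : Nat :=
  (List.range N).foldl (fun acc n => acc + f (n + 1)) 0

-- (20.6.3)-style usage: the quantification is the operator applied to a λ.
#eval sum 3 (fun n => n * n + n)   -- (1²+1) + (2²+2) + (3²+3) = 20

-- (20.7.1)-style: an existential quantifier as a unary operator on predicates.
def exist {A : Type} (P : A → Prop) : Prop := ∃ x, P x

example : exist (fun n : Nat => n * n = 9) := ⟨3, rfl⟩
```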
20.9. MV does not allow the introduction of new quantifiers in the book. The reason is that the language is not equipped with means for saying "an expression containing x". We have only two basic quantifiers in the language definition, viz. the substantive binder S and the functional binder λ. All other binders have to be expressed in terms of these two in the way indicated in Section 20.6. If we insist on using notations like (20.6.2), (20.7.1), (20.7.2), we have to treat them as typographical abbreviations, to be discussed in Section 21.
21. TYPOGRAPHICAL ABBREVIATIONS

21.1. Of course we like to use OMV notations in MV as much as possible. We can do this in an informal way by using what we shall call typographical abbreviations. We can agree that if we write (20.6.2) in an MV book, this is just an informal abbreviation for (20.6.3). The agreement to use that abbreviation cannot be made in the book itself, but has to be written in the margin somehow. A similar remark applies to (20.7.1) and (20.7.2).

21.2. As a further example we take the "despo" of Section 20.5. If A and B are substantives, and if P(...x...) is an expression containing x such that for all x : A we have P(...x...) : B, then despo_{x:A} P(...x...) can be considered as typographical abbreviation of
21.3. The words "typographical abbreviations" indicate unofficial abbreviations, usually (but not always) invented for the sake of typography. When reading a text that uses such abbreviations we first have to translate them into what they stand for, and only after that translation we are assumed to be able to understand the text as an MV book. Typographical abbreviations are superficial, when compared to the abbreviations we introduce in the MV book itself by means of definitional lines (see
Section 7.6). Definition and theory of the language have to take these definitional lines as essential parts of the language, but never deal with typographical abbreviations. As an example of a typographical abbreviation outside the world of quantifiers we quote the notation {1, . . . , n} for the set of integers from 1 to n. We refer to the discussion in Section 19.5.
21.4. When studying mathematical notation, we discover many other cases of abbreviations that we would prefer to consider as informal. Some examples are:

(i) We write a = b < c = d instead of the conjunction of a = b, b < c, c = d.
(ii) We write a(1), a(2), . . . to denote an infinite sequence.

(iii) We have baroque notations for 17th and 18th century mathematics, on integrals, derivatives, differential equations, etc.

(iv) We have many unwritten conventions by which we omit things that, strictly speaking, would be necessary for parsing. These may be local conventions in a certain area of mathematics. In trigonometry one interprets sin x cos y as the product (sin x)(cos y); the alternative sin(x cos y) occurs in the theory of Bessel functions but not in trigonometry.
21.5. Many formulas in OMV are written in a form that does not fit into a single line. A simple example is the old notation for quotients by means of a horizontal bar. If we think of MV having to be processed by a computer it seems that such notations have to be avoided, but it is not a matter of principle. Formats which do not have the form of a string of characters might be admitted in MV just as well. As a simple example we note that sometimes the value that a function takes at the point n is denoted by using the letter n as a subscript, like in c_n. This should not be confused with the habit of using c_1, c_2, . . . as new identifiers. The difference between the two is of the same type as the difference between the two ways to look at a_1, . . . , a_n, discussed in Section 19.5.

21.6. There will be many cases where one is easily tempted to interpret the rigid rules of MV with a little grain of salt. At least, as long as the texts are intended to be read by human beings only. If we present them to machines, we have to be much more careful. It was already mentioned in Section 5.3 that we often cheat with the rules that require identifiers to be fresh. But it is not necessary to be so rigid in
cases of variables introduced by means of declarations. It suffices to have them different from all previous variables of that same context and different from all previously introduced constants (either by definition or by PN). But the rule that all introduced constants should have different identifiers will often be felt as a burden: in mathematics constants occurring in distinct subjects will often have the same name. When writing for human readers, there does not seem to be any harm. When writing for machines, we should do something like the paragraph system that was used in Automath, where identifiers are always interpreted in the sense given to them in the local paragraph. An old constant with the same name can only be referred to if we add some kind of paragraph indication.

Another case for grains of salt was already mentioned in Section 5.5. The condition that all context variables occur in the name of a defined notion is quite different from habits in OMV. In Automath there is a systematic way to weaken this rule: in parentheses expressions like F(A_1, . . . , A_n) we may just omit the first k entries A_1, . . . , A_k if they are identical to the first k variables of the context. In particular that has the effect that in a definitional line in a context with variables x_1, . . . , x_n the parametrized constant on the left of the sign ":=" can be written as a single identifier (cf. Section 5.6). But it should be noticed that in MV and in OMV not all parametrized constants have the form of such parentheses expressions. That makes it harder to formulate what liberties can be taken in the matter of omission of a number of context variables.
22. GETTING CLOSER TO NATURAL LANGUAGE

22.1. In the examples of Section 18.2 we inserted a small piece of what one might call natural language. On a small scale it showed how mathematics can be described in words and sentences, not just in symbols and formulas. If we want to insert more natural language, or even if we want our MV book to look like an ordinary mathematics book, we can do this on three levels:

(i) the primary MV level (pMV),

(ii) the secondary MV level (sMV),

(iii) the level of typographical abbreviations.

22.2. As to primary MV we can discuss how some of the basic notations on typing, on context and on quantification are to be expressed in terms of words. In several cases it is not quite clear how to do this: there may arise ambiguities,
in particular just those which the MV notations intended to avoid. Right now we do not try to suggest how to solve all these difficulties. We hardly go beyond a first orientation.

22.3. The typing "p : prime number" can of course be pronounced "p is a prime number". A definitional line body "p := Q : prime number" can be pronounced as "denote the prime number Q by p" (in this case Q is an expression and p is an identifier). The declaration "p : prime number" (the case of a pointed flag) is "let p be a prime". The assumption "p : prime number" can be "assume that p is a prime number" (the case of a rectangular flag). The case of "assume..." is essentially different from "let...": in the case of "let..." the p is a new variable, in the case of "assume..." it is a variable or a constant that was introduced earlier in the book (cf. the assumption (89) in Section 18.2). Unfortunately there is not a very strong feeling in English OMV that "let..." is to be restricted to the case of declaration. One might consider replacing the "let..." by something that cannot be misinterpreted, like "take any prime number p".

In OMV there is no clear way to tell where the flagstaffs end. Quite often it is suggested by the typographical layout, mainly by the subdivision of the text into sentences, paragraphs, sub-sections, sections and chapters. There is certainly a need for explicit rules for this. Right now there are just some unwritten conventions. One might express a rule like this: If an assumption is a part of a sentence, then it does not reach beyond that sentence. If it is the first sentence of a paragraph, and not a paragraph of its own, then it does not reach beyond its paragraph. Similarly, if a sequence of assumptions forms the first sentence of a paragraph, and not a whole paragraph, then these assumptions are intended just for this paragraph. The rules for declarations are the same as those for assumptions. As an example we quote
(22.3.1)        If x is a real number then sin x < 2 .
It is considered to be bad manners to refer to x in the next sentence. Actually one may doubt what (22.3.1) means. It can be (i) a block, opened by the declaration "x : real number", (ii) the universal statement ∀_{x:real number} sin x < 2.
Fortunately the two are equivalent by virtue of the language rules of MV. But sometimes (22.3.1) means an implication: just imagine that the sentence (22.3.1) is preceded by “let x be any complex solution of the equation 4 cos x - 1 = 0”. In that case we can consider (22.3.1) as an implication, but also as a block (starting with the assumption “let x be a real number”). Fortunately again, the two possibilities are equivalent because of the language rules of MV.
22.4. Expressing quantification in natural language is reasonably established: "for all points P ...", "for every point P ...", "there exists a point P such that ...", are sufficiently clear, also if the quantifier "for all points P" is shifted to the end of the sentence: a machine should be able to translate them into ∀'s and ∃'s. Nevertheless there is something wrong from the linguistic point of view: the name P does not play a role any ordinary word could ever play in a sentence. Writing "for every point, P say" does not make it any better. Natural language simply does not have anything corresponding to dummy variables! We (and the linguists) just have to learn to live with the strange P in "for every point P".

22.5. In our natural languages it is often possible to express quantification without the use of a dummy. The sentence "all dogs sleep" is equivalent to "for every dog P, P sleeps". This can be done because of the fact that P is the subject of the sentence "P sleeps". In other cases it can be done by means of pronouns. "There is a dog whose master trims its hair every day" will (although it is not very elegant English) mean: "there is a dog d such that the master of d trims d's hair every day". Correct interpretation of such sentences may depend on subtleties (see what happens if "d's hair" is replaced by "his hair"), in particular in cases of more than one quantification in a single sentence.

Special care should be taken with the words we use when applying the rule for elimination of the existence quantifier, mentioned at the end of Section 20.7. One usually says: "We know the existence of an x : A such that P(x). Take such an x". Then one starts deriving a statement Q (which does not contain x). That is, one derives
*
Q
and then the existence of an z : A such that P ( z ) guarantees that Q holds outside the block too. It would be nice to have a standard way of saying these things in a world where it is not customary to indicate the end of a block. A suggestion: open the argument with “We may assume that we have an z with the property P(z)”. The word “assume” stresses the fact that the life of z is short! 22.6. The situation with the substantive binder (see T6 in Section 12) is similar to the situation with logical quantifiers. The example in Section 20.4, viz. “quadrilateral with the property that its diagonals are orthogonal to each other” again shows that names of dummies can sometimes be avoided by means of a pronoun (in this case “its”). Another example is “prime number dividing nr’l where the gerund “dividing n” is derived from the statement “ p divides n”.
N.G. de Bruijn

22.7. It is not hard to get good translations for our formalism describing sets. If A is a substantive, we can pronounce “A↑” as “the set of all A’s”. If S is an A-set then the substantive “S↓” can be pronounced as “element of S”. So s : S↓ is to be pronounced as “s is an element of S”, and this amounts to the same thing as “s ∈ S”.

22.8. Now coming to secondary MV, we of course depend on the special book we prefer to write. If it contains the material of Section 18.2, we first get to the discussion of natural language for negation. If P is a statement, the statement not(P) can be written as “it is not true that P”, and actually such a thing could be written as a book definition like this:
P :: statement (dcl) * it is not true that P := not(P) :: statement .
Nothing much is gained by this, of course (and some people might object that “not true” has too much of a metalinguistic flavor), but it seems to be the only construction that works the same way in all possible cases. In many sentences the negation can be worked into the statement P, usually by putting “does not” in front of the main verb, but that is impossible if P has the form of a quantified statement or of an implication. 22.9. Getting deeper into an MV book, we are no longer dealing with fundamentals, and this may imply that we are mainly sticking to the kind of sentences we have invented ourselves in definitions, especially since it is so easy for us to introduce new terminology in the book. We discuss the example of line (140) in Section 18.2. In natural language we like to have some synonyms available for a phrase like “b goes through A”, and there is no objection against codifying some of these in the M V book. We might insert directly after (140) a definition like
A lies on b := b goes through A :: statement .
And we may introduce a new substantive “point of b” as “point lying on b” (the substantive binder with suppressed dummy, cf. Section 22.5). And from now on we have another way to say that “b goes through A” by means of the typing statement “A is a point of b”.
22.10. Natural language can have productive rules for getting synonyms. The rules we discuss here are connected with possession, with conjunction and with disjunction. Possession (to have or not to have) seems to be overwhelmingly important for human beings, and therefore they have made facilities for expressing it in many different ways, so as to fit in every linguistic situation. In the name “the derivative of g” the “of” suggests that g possesses something, and we rapidly accept that “f is the derivative of g” and “g has f as its derivative” are synonymous. Therefore we feel that in the MV book it is sufficient to define “the derivative of g” and that we can take the other constructions as implicitly defined by it. The phrases for describing possession of course depend on the question whether somebody can possess more than one thing of the kind mentioned, and also on the question whether he is or is not the sole proprietor. For example: “A is a point of b” turns into “b has A as one of its points”, and the existence statement “exist(point of b)” is synonymous to “b has at least one point”. In conjunctive statements synonyms can lead to shorter sentences. The statement “A lies on b and B lies on b” is synonymous to “A and B lie on b”, and similarly “A lies on b and A lies on c” is synonymous to “A lies on b and c”. Still the matter is tricky: “a implies c and b implies c” cannot be contracted to “a and b imply c”. There are similar contractions for disjunction. “A lies on b or on c” is synonymous to “A lies on b or A lies on c”. In general one will go as far as one can with such contractions as long as there is no danger of ambiguities. Try “P is the only point of S and P is the only point of T”, “P is a point of the limit set of S and P is a point of the limit set of T”. Quite often synonym production rules are not safe enough for mathematics. “I hear John and Mary” can be considered to be synonymous to “I hear Mary and John”. But in “the Cartesian product of X and Y” we cannot just interchange the X and the Y. And a rule that “not for all points P we have Q” is synonymous to “there is a point P such that not(Q)” is a thing we want to be able to discuss as a logical rule; we do not want to be forced to accept it for the sake of linguistics.
22.11. From Section 22.10 we may conclude that it is not easy at all to decide about synonym production rules. Forbidding them will make our language inelegant, admitting them makes it unsafe. Obviously the best thing we can do is this: as long as we do not fully understand the synonym production rules, we refuse to consider them as a part of “official” MV. We can put them on the list of the “grains of salt”, if we wish. In many cases we can take the effects of synonym production rules seriously without proclaiming them as rules. An example of this was presented in Section 18, line (143), where a synonym was provided by a book definition. Many things can be developed that way, although it may become monotonous in the long run.

22.12. It requires quite some mathematical experience to understand sentences
involving indefinite articles (“a” and “an”). Quite often such sentences are ambiguous, and their interpretation may depend on whether they are labeled as “definition” or as “theorem”. Example:

(22.12.1) a rhombus is a quadrilateral with property P

has three interpretations, viz.

(22.12.2) rhombus << quadrilateral with property P
(22.12.3) rhombus := quadrilateral with property P
(22.12.4) rhombus = quadrilateral with property P .
If (22.12.1) is labeled with “theorem” or “lemma”, we choose (22.12.2), but if the label is “definition”, we choose (22.12.3). If instead of (22.12.1) we had had “rhombuses are quadrilaterals with property P”, with a theorem-like label, we might have hesitated between (22.12.2) and (22.12.4). If we really want to express (22.12.4) we might prefer “A quadrilateral is a rhombus if and only if it has property P”. We may test our abilities for understanding mathematical sentences by trying cases where some of the words have been replaced by symbols that conceal the meaning of the words. As an example we consider

Definition. An A of a B is a C whose D intersects that B.

This we rapidly interpret as

x : B (dcl) * A of x := S_{y:C} ((the D of y) intersects x) :: substantive .
A second example is

Definition. We say that the A of a B hibernates if it is skew to that B.

This we interpret as

x : B (dcl) * the A of x hibernates := the A of x is skew to x :: statement ,

at least if the terms “the A of x” and “is skew to” have been defined in the book.
The mathematical vernacular (F.3)
There is an objection against the choice of the phrase “the A of x hibernates” in the above definition. It asks for trouble with parsing. Let us be more specific by taking as an example: “We say that the orthocenter of triangle d hibernates if it lies inside d”. Now take two triangles d and e with common orthocenter P. We might argue: if P lies inside d then P hibernates, and therefore P lies inside e. This is obviously false! The thing is that there never has been a definition explaining what it means that a point hibernates. Nevertheless this is suggested by the phrase “We say that the orthocenter of d hibernates”, since “the orthocenter of d” is the name of a point. Therefore it is in vain to appeal to EQ10a (Section 13) in order to say that in this phrase “the orthocenter of d” may be replaced by “the orthocenter of e”. We cannot say that
t : point (dcl) * t hibernates :: statement .
A way to avoid this inconvenience is to define “the orthocenter of triangle d hibernates with respect to d”. Or, still simpler, we define hibernation of a point with respect to a triangle, and then apply it to the orthocenter. We note that the definite article “the” is used in two different ways. In a case like “the orthocenter of d” it originates from a line in the book where in a context “d : triangle (dcl)” we have defined the name “the orthocenter of d”. But a case like “the positive root of f” has to be parsed as a substantive (“positive root of f”) preceded by the “the” of Section 18.2 (83), which requires a proof of the uniqueness statement (82). In the case of “the orthocenter of d” there has been no previous introduction of a substantive “orthocenter of d”, and therefore it cannot be parsed as a substantive preceded by a definite article.

22.13. A line in an MV book can be labeled “theorem” or “lemma” if the line body is a statement (case (vi) of BR9 (Section 10)). That is, if it is considered “important” enough for such a stately name. Otherwise it can just be considered as a stepping stone in a proof, or as a minor remark. Lines with a body of the form (iii), (iv) or (vii) of BR9 can be labeled “definition”, although very often (in particular in case (iii)) we prefer to call them abbreviations. These cases (iii), (iv), (vii) can be called “statement definition”, “substantive definition” and “name definition”, respectively, and each case has its own phraseology in OMV. Very many definitions in OMV are definitions of adjectives; we shall discuss the use of adjectives in Section 22.14.

22.14. In our grammar of MV we have not discussed adjectives thus far, but they are easily incorporated. We should always bear in mind that an adjective is to be defined with respect to a substantive. Let us take the substantive
“triangle” as an example. If

x : triangle (dcl) * P :: statement
is valid, then P (which may be an expression containing x) expresses a property of x. In natural language such a property may be expressed by an adjective. Let us choose the word “blue” for it. Then we can consider the new substantive “blue triangle” and the new statement “x is blue”. This statement can be considered as an abbreviation for “x is a blue triangle”. The substantive can be defined as

(22.14.1) blue triangle := S_{x:triangle} P :: substantive .

The statement “x is blue” can be introduced by

(22.14.2) x : triangle (dcl) * x is blue := (x : blue triangle) :: statement .
We can agree that we express both (22.14.1) and (22.14.2) by the single line

Definition. A triangle x is called blue when P.

A nice way to write this with a new binder “Adj” is as follows:

(22.14.3) blue := Adj_{x:triangle} P .
Working with adjectives has some other nice features. One is that if an adjective like “yellow” is defined with respect to the substantive A, and if B << A, then we can speak of yellow B’s too. And if we have defined both “yellow” and “round” on the substantive A, then we can use the new substantive “yellow round A”, and that is synonymous with “round yellow A”. Misunderstandings can arise if the adjective “round” was not defined with respect to the substantive A but with respect to the substantive “yellow A”.
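The intersection reading of adjectives can be illustrated by a small Python sketch. All names and the example properties below are hypothetical, not from the MV book: substantives are modeled as predicates over a universe, and an adjective defined with respect to a substantive simply refines it.

```python
# Hypothetical sketch: substantives as predicates, adjectives as
# predicates defined with respect to a substantive. "yellow" and
# "round" below are invented example properties.

A = lambda x: isinstance(x, int)        # the substantive A

yellow = lambda x: x % 2 == 0           # "yellow", defined w.r.t. A
round_ = lambda x: x % 3 == 0           # "round",  defined w.r.t. A

def refine(subst, adj):
    """The substantive 'adj subst': subst-things that are also adj."""
    return lambda x: subst(x) and adj(x)

yellow_round_A = refine(refine(A, round_), yellow)
round_yellow_A = refine(refine(A, yellow), round_)

# Both orders pick out the same things, as the text claims:
sample = range(-20, 20)
assert [x for x in sample if yellow_round_A(x)] == \
       [x for x in sample if round_yellow_A(x)]
```

Defining “round” only on the substantive “yellow A” would instead give a predicate whose domain already presupposes yellowness, which is exactly the source of the misunderstandings mentioned above.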
23. REMARKS ON PARSING

The situation about parsing is like this. What we really want to say in a sentence or a formula has the structure of a tree (to be more precise, of a planted planar tree). Such a tree is a finite directed graph where (i) every point has just one incoming edge (except for a single point, the root, which has none), (ii) at each point a linear order of the outgoing edges is given (the order from left to right), (iii) every point can be reached by means of a finite path that starts at the root. Finally we mention that to the points of the tree there may be attached letters or words, i.e., identifiers.
The difficulty is that we want to put such tree-shaped information in a linear form, in order to be expressed in speech or writing, and that we want this linearized form to reveal the original tree structure. Mathematicians have solved this problem centuries ago, coding their trees in linearized form with the aid of sets of nested pairs of parentheses. In natural languages, however, this has never been done. It is quite probable that parsing trouble had its influence on our natural languages before writing was invented, in particular in the shaping of inflexions and conjugations. Quite some effort in learning languages, and in studying their structure, is connected with parsing and with the constructs people invented for the benefit of parsability. In spite of the fact that parsing is an immense problem in the study of natural languages, we can be very casual about it when discussing MV. As long as we are able to create a language it is no serious problem at all. We can just be generous with the use of parentheses or any other means for describing the tree structure in a linear format. Admittedly, we do not want to go all the way: we want our MV to look like natural language as much as possible. This can be achieved to a large extent by sensible choice of the terms and phrases we introduce in our books (in the form of sMV material). Combined with the use of just a few parentheses, this can reduce parsing trouble to a bearable extent. Making a serious study of this would, for the moment, be a waste of time, since the trouble can so easily be eliminated by adding enough parentheses.
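The parenthesis coding of a tree can be sketched in a few lines of Python. The `Node` class and the labels are inventions of this sketch, not MV notation; the point is only that nested parentheses reveal the original tree structure.

```python
# A planted planar tree linearized with nested parentheses, the way
# mathematicians code their trees. Labels here are illustrative only.

class Node:
    def __init__(self, label, children=()):
        self.label = label                 # identifier attached to the point
        self.children = list(children)     # outgoing edges, left to right

def linearize(node):
    """Code the tree in linear form with nested parentheses."""
    if not node.children:
        return node.label
    inner = ", ".join(linearize(c) for c in node.children)
    return f"{node.label}({inner})"

# "the derivative of g, applied to x" as a planted planar tree:
tree = Node("apply", [Node("derivative", [Node("g")]), Node("x")])
print(linearize(tree))   # -> apply(derivative(g), x)
```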
Relational Semantics in an Integrated System

R.M.A. Wieringa

0. ABSTRACT
This paper contains the description of a system for handling semantics of computer programs. The methodology used for the description of semantics is relational semantics: (possibly incomplete) information about programs is represented by binary relations. For the description we use the language Automath, in which logic, mathematics, syntax and semantics are integrated. Moreover, the correctness of texts written in Automath can be checked mechanically by a computer. We consider an ALGOL 60-like programming language. The axiomatic basis of it is kept small, but it is large enough to make the definition of many ALGOL constructs possible. In the basis are included assignment, binary selection, concatenation, block structures and recursive parameterless procedures. For these basic constructs semantics is presented, and some examples are given how new program constructs can be described in terms of these basic ones.
1. INTRODUCTION

We shall present a formalism for the description of syntax and semantics of programs in an ALGOL 60-like programming language (i.e. a block-structured programming language with variables of various kinds, assignment to these variables, binary selection, recursion, etc.). An essential point is that program correctness proofs have to be subjected to an automatic verification system. So we have to deal with

(a) the organisation of the variables, the so-called state space,
(b) the description of the syntax of the language: what kind of programs do we consider,
(c) the description of the semantics of the programs: what information do we state about the programs.
The method we shall use to describe semantics will be relational semantics, with strong emphasis on dealing with incomplete information about the relation between initial and final state. A system for verification of the correctness of programs has to be able to cope with mathematical theories (e.g. number theory) and to keep track of the mathematical interpretation of values of state space variables. In practice, the verification of the correctness of a program appears to be long and tedious, since it consists of very many elementary steps. We feel the need for a mechanical verification. So altogether, we need a language in which various formal systems (e.g. semantics, logic, mathematics) are integrated, and the correctness of what is written in the language should be decidable by a computer. Automath ([de Bruijn 70a (A.2)]) is such a wide-scope language. In an Automath book we can express all primitives we need about logic, mathematics, programming language, semantics, and on the basis of these primitives we can define particular programs and derive truths about their semantics. We use the following notation for some of the essentials of Automath. Typing is denoted by colons (P : Q means P has type Q). Abstraction is written as [x : A] B, denoting the function with domain A and values B (this B may contain x). Application is written as (A) B (i.e. the value of the function B at the point A). We use Q : type for saying that Q is a type, and R : prop for saying that R represents a proposition (if S : R then S is a proof of that proposition). The semantical framework described here is essentially based on various proposals by N.G. de Bruijn [de Bruijn 73d], [de Bruijn 75b]. In the present form it is used by the author of this paper for the development of an operational system intended to be useful for proving correctness of big programs.
2. THE STATE SPACE

Since programs act on variables, we have to pay some attention to these variables and their possible values; in other words, to the state space. Roughly speaking a state is a set of variables, each of a certain type (think e.g. of the types integer, boolean etc. in ALGOL 60) and having a value corresponding to that type. So we introduce the notion datatype, and several datatypes, like

datatype : type
bool : datatype
int : datatype .
Relational semantics in an integrated system (F.4)
For each datatype dt the type of the corresponding values will be denoted by

elts(dt) : type .
Since our programming language has an ALGOL-like block structure, we put our variables on stacks: one for each datatype. For simplicity we do not assume the stacks to have a bottom. The places in each stack are indexed by 0, 1, 2, ...; the 0 refers to the top of the stack. In the stack corresponding to dt the values have type elts(dt). Each pair (dt, i) of a datatype and an index now identifies a program variable: we do not talk about names of variables. So we define (written in Automath)

State := [dt : datatype] [i : nat] elts(dt) : type

(where nat is the type of the naturals). For a visual interpretation see Fig. 1.

Figure 1. A state space

There are several operations on states. Let us fix a state σ. By value(σ, dt, i) we denote the value in σ of the variable (dt, i); it has type elts(dt). Furthermore there are some operations transforming states into states:

(a) adapt(σ, dt, i, v) is the state that is obtained from σ by replacing the value of (dt, i) by a new value v;

(b) extend(σ, dt, v) is the state we get when in σ we push an element with value v on the stack corresponding to dt. So, when σ′ = extend(σ, dt, v) we have value(σ′, dt, 0) = v, value(σ′, dt, i + 1) = value(σ, dt, i), value(σ′, t, i) = value(σ, t, i) when t ≠ dt;

(c) restrict(σ, dt) is the state we get when in σ we remove the top element of the stack corresponding to dt. So when σ′ = restrict(σ, dt) we have value(σ′, dt, i) = value(σ, dt, i + 1), value(σ′, t, i) = value(σ, t, i) when t ≠ dt.
Having defined these operations on states, we can prove properties about them, e.g.

value(extend(σ, dt, v), dt, i + 1) = value(σ, dt, i)
restrict(extend(σ, dt, v), dt) = σ

and write these in our Automath book. In order to deal with nontermination, abortion because of things like “divide by zero”, indexing outside array bounds etc., we add an extra datatype ref (standing for refuser). The variables belonging to ref are quasi-variables, i.e. they do not appear in a program itself but only in its semantics. There are two values connected to refusers: ON and OFF, ON meaning “there is something wrong”. The datatype ref plays an exceptional role in our discussions. In most cases we shall stipulate that datatypes are ≠ ref.
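Assuming we model a state as a dictionary of stacks with index 0 on top (a representation chosen for this sketch, not prescribed by the paper), the operations and the two properties above can be written as:

```python
# Assumed representation: a state is a dict mapping datatype names to
# stacks (Python lists), index 0 = top of the stack.

def value(state, dt, i):
    return state[dt][i]

def adapt(state, dt, i, v):
    new = {t: list(s) for t, s in state.items()}
    new[dt][i] = v
    return new

def extend(state, dt, v):
    new = {t: list(s) for t, s in state.items()}
    new[dt] = [v] + new[dt]          # push v on top (index 0)
    return new

def restrict(state, dt):
    new = {t: list(s) for t, s in state.items()}
    new[dt] = new[dt][1:]            # remove the top element
    return new

sigma = {"bool": [True, False], "int": [7, 3, 5]}

# The two properties stated in the text:
assert value(extend(sigma, "int", 9), "int", 1) == value(sigma, "int", 0)
assert restrict(extend(sigma, "int", 9), "int") == sigma
```

The copy-on-write style (each operation returns a fresh dictionary) mirrors the fact that adapt, extend and restrict are functions from states to states rather than in-place updates.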
3. SYNTAX

What kind of programs do we consider? It is our intention to have a rich class of programs with a set of primitives that is as small as possible. Therefore we do not consider expressions of complex shape in our primitive programs. The following programs are primitive (the word “Program” will be used as the type of all programs).
(1) If dt : datatype, u : dt ≠ ref (i.e. u proves dt ≠ ref), i : nat, v : elts(dt), we have

Const_ass(dt, u, i, v) : Program

corresponding to “x := v” in ALGOL (where x corresponds to (dt, i) and v is a constant of type dt).
(2) If dt : datatype, u : dt ≠ ref, i1 : nat, i2 : nat, we have

Var_ass(dt, u, i1, i2) : Program

corresponding to “x := y” in ALGOL, y being a variable.
(3) If b : nat, π1 : Program, π2 : Program, we have

Binselect(b, π1, π2) : Program

corresponding to “if b then π1 else π2”.
(4) If π1 : Program, π2 : Program, we have

Concat(π1, π2) : Program

corresponding to “π1; π2”.

(5) If dt : datatype, u : dt ≠ ref, π : Program, we have

Block(dt, u, π) : Program

corresponding to “begin dt x; π end” (where dt is one of the types in ALGOL).

(6) If dt : datatype, u : dt ≠ ref, π : Program, we have

Injection(dt, u, π) : Program

In ALGOL there is no construction corresponding to this. It intends the following: Program π acts on a state space. When we want to use π in a situation where that state space has been extended with a variable of datatype dt, π has to act on that extended state space. For formal reasons this program has to get a new name.

(7) If φ : Program → Program (i.e. φ is a function from programs to programs) we have

Recurs(φ) : Program

more or less corresponding to “procedure p; (p) φ”. The idea behind this approach is the following: ALGOL uses in recursive procedures a kind of circular definition: in the specification of procedure p, p itself may appear: p := (p) φ. The essential part of the procedure is φ, the program-to-program function. By the formula p := Recurs(φ) we turn φ into a program. The above list of primitive programs is a reasonable basis for a programming language. We do not state it to be complete; if desirable we can add further primitives later, e.g. primitives about array assignment, and operations on records (as in PASCAL). And users of the system, handling special algorithms requiring special datatypes, can add primitive notions for private use. By means of the seven primitive program constructs given above we can build other program constructs. Once they have been written in our Automath book they are available for later use, just like the primitive ones. We give some examples.
(8) To the boolean assignment “b1 := b2 ∨ b3” in ALGOL (where b1, b2 and b3 are variables) corresponds the statement “if b2 then b1 := true else b1 := b3”. It is written in Automath as follows: if b1 : nat, b2 : nat, b3 : nat, we define

Bool_or_ass(b1, b2, b3) := Binselect(b2, Const_ass(bool, boolnotref, b1, T), Var_ass(bool, boolnotref, b1, b3)) : Program

(where boolnotref states bool ≠ ref, and T : elts(bool) denotes the value true).
(9) The empty statement in ALGOL can be mimicked as follows:

Dummy := Var_ass(bool, boolnotref, 0, 0) : Program

so “b := b” in ALGOL, where b corresponds to (bool, 0).
(10) To the statement “if b1 ∨ b2 then π” corresponds the block “begin boolean b; b := b1 ∨ b2; if b then π else end”. If b1 : nat, b2 : nat, π : Program, we describe it by

Or_cond(b1, b2, π) := Block(bool, boolnotref, Concat(Bool_or_ass(0, b1 + 1, b2 + 1), Binselect(0, Injection(bool, boolnotref, π), Dummy))) : Program

Notice the effect of the introduction of a new boolean. It transforms b1, b2 and π into b1 + 1, b2 + 1 resp. Injection(bool, boolnotref, π).
(11) To the while statement “while b do π” corresponds the recursive procedure “procedure p; if b then begin π; p end else”. If b : nat, π : Program, we describe it by

While(b, π) := Recurs([π1 : Program] Binselect(b, Concat(π, π1), Dummy)) : Program .

We did not yet discuss integers and assignments like “a := b + c”. We can define the integers as sequences of bits 0 and 1 and write programs for addition, multiplication etc. It is a long way to go, but whatever we produce is available forever.
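The seven primitive constructs and the derived examples (8), (9) and (11) can be sketched as a Python AST. The constructor names follow the text; representing the proofs u of dt ≠ ref as plain strings, and φ as a Python function, are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Const_ass:  dt: str; u: str; i: int; v: object   # (1) x := constant

@dataclass
class Var_ass:    dt: str; u: str; i1: int; i2: int    # (2) x := y

@dataclass
class Binselect:  b: int; pi1: object; pi2: object     # (3) if b then pi1 else pi2

@dataclass
class Concat:     pi1: object; pi2: object             # (4) pi1; pi2

@dataclass
class Block:      dt: str; u: str; pi: object          # (5) begin dt x; pi end

@dataclass
class Injection:  dt: str; u: str; pi: object          # (6) lift pi to a larger state space

@dataclass
class Recurs:     phi: Callable                        # (7) phi : Program -> Program

boolnotref = "bool_ne_ref"   # stands in for the proof u of bool != ref
T = True                     # the value true in elts(bool)

def Bool_or_ass(b1, b2, b3):                 # (8): b1 := b2 or b3
    return Binselect(b2,
                     Const_ass("bool", boolnotref, b1, T),
                     Var_ass("bool", boolnotref, b1, b3))

Dummy = Var_ass("bool", boolnotref, 0, 0)    # (9): the empty statement

def While(b, pi):                            # (11): while b do pi
    return Recurs(lambda pi1: Binselect(b, Concat(pi, pi1), Dummy))

# One unfolding of the While body reproduces "if b then (pi; p) else":
body = While(0, Dummy).phi(Dummy)
assert isinstance(body, Binselect) and isinstance(body.pi1, Concat)
```

The point of the sketch is only structural: programs are finite terms built from seven constructors, and Recurs carries a function on such terms rather than a circularly defined term.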
4. SEMANTICS
Semantics as we describe it is closely related to the methodology of denotational semantics, with one of its central ideas the presentation of the meaning of a program as a function from states to states (cf. [Scott & Strachey 71]). We take a different point of view: we do not consider functions from states to states but binary relations over the state space. This is called relational semantics (cf. [Hitchcock & Park 73]). When discussing semantics of a program in a particular situation it is, fortunately, often sufficient to deal with incomplete information. Some parts of the program may have semantic properties which are partly irrelevant for the properties of the program as a whole. Such incomplete information has the form of a binary relation, and can be treated in our system. As an extra advantage we mention that we do not have the slightest trouble with non-deterministic programs. We connect relations to programs by stating that a relation ρ presents information about a program π. In our Automath book we take this notion to be primitive, but we can give the following interpretation from an executional point of view: for every pair σ1 : State, σ2 : State where σ1 and σ2 are initial and final state of some execution of π, the relation ρ holds. Because of the possible incompleteness of the information, the converse (i.e. when ρ holds for σ1 and σ2, π can transform σ1 into σ2) need not be true. In the jargon of Automath, a relation is a function that assigns to every σ1 : State and σ2 : State a proposition. So the type of all relations is

Reln := [σ1 : State] [σ2 : State] prop .

So given ρ : Reln, σ1 : State, σ2 : State, “ρ holds for σ1 and σ2” is expressed by (σ2) (σ1) ρ. Further we write, given π : Program, ρ : Reln, the primitive notion

info(π, ρ) : prop .

The interpretation of info(π, ρ) is the proposition “ρ presents information about π”.
The basic properties embodied in this interpretation are given by the following axioms (where π : Program, ρ1 : Reln, ρ2 : Reln):

info(π, ρ1 ‘and’ ρ2) ‘eqv’ (info(π, ρ1) ‘and’ info(π, ρ2))

(ρ1 ‘imp’ ρ2) ‘imp’ (info(π, ρ1) ‘imp’ info(π, ρ2))

(We use ‘and’, ‘imp’, ‘eqv’ for the connectives ∧, →, ≡ of ordinary propositional calculus.)
The relations ρ we claim by axiom to present information about the seven primitive programs in Section 3 all have a standard form, viz.

[σ1 : State] [σ2 : State] if Some_ref_on(σ1) then σ1 = σ2 else P(σ1, σ2)

where P(σ1, σ2) is a proposition, and

Some_ref_on(σ) := ∃r : nat (value(σ, ref, r) = ON) .

The motivation for this is the following: once a refuser is in ON position (because of things like nontermination, abortion), we do not want to “execute” the rest of the program anymore; in other words: this rest is equivalent to a skip, for which we present the information σ1 = σ2.
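A minimal Python sketch of this standard form (the dictionary representation of states and the sample proposition P are assumptions of this sketch, not the Automath text):

```python
# States as dicts of stacks; refusers live on the "ref" stack.

ON, OFF = "ON", "OFF"

def some_ref_on(state):
    # Some_ref_on(sigma): some refuser variable has the value ON
    return any(v == ON for v in state.get("ref", []))

def standard_form(P):
    """Wrap the essential proposition P(s1, s2) into the standard form:
    if a refuser is ON initially, the program behaves as a skip."""
    def rel(s1, s2):
        if some_ref_on(s1):
            return s1 == s2
        return P(s1, s2)
    return rel

# Essential part of a Const_ass-style program: s2 = adapt(s1, int, 0, 42)
P = lambda s1, s2: s2["int"][0] == 42 and s2["int"][1:] == s1["int"][1:]
rho = standard_form(P)

ok  = {"ref": [OFF], "int": [0, 5]}
bad = {"ref": [ON],  "int": [0, 5]}
assert rho(ok, {"ref": [OFF], "int": [42, 5]})
assert rho(bad, bad) and not rho(bad, {"ref": [ON], "int": [42, 5]})
```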
So to each primitive program π we have a relation ρ in standard form and an axiom stating info(π, ρ). In this paper we do not give the relations in standard form, but only the essential part, i.e. the proposition P(σ1, σ2) in the else part.

(1) To Const_ass(dt, u, i, v) is connected the proposition (playing the role of P(σ1, σ2))

σ2 = adapt(σ1, dt, i, v) .

(2) To Var_ass(dt, u, i1, i2) is connected

σ2 = adapt(σ1, dt, i1, value(σ1, dt, i2)) .

(3) Given ρ1 : Reln, ρ2 : Reln, info(π1, ρ1), info(π2, ρ2), to Binselect(b, π1, π2) is connected

if value(σ1, bool, b) = T then (σ2) (σ1) ρ1 else (σ2) (σ1) ρ2 .

(4) Given ρ1 : Reln, ρ2 : Reln, info(π1, ρ1), info(π2, ρ2), to Concat(π1, π2) is connected

∃σ : State ((σ) (σ1) ρ1 ‘and’ (σ2) (σ) ρ2) .

(5) Given ρ : Reln, info(π, ρ), to Block(dt, u, π) is connected

∃v1 : elts(dt) ∃v2 : elts(dt) ((extend(σ2, dt, v2)) (extend(σ1, dt, v1)) ρ) .
Since σ1 and σ2 are states belonging to the state space outside the block and ρ is a relation between states inside the block, we have to extend σ1 and σ2 with appropriate values when connecting them with ρ. They are extended with v1 and v2, representing the initial and final value of the variable local in the block.
(6) Given ρ : Reln, info(π, ρ), to Injection(dt, u, π) is connected

(restrict(σ2, dt)) (restrict(σ1, dt)) ρ ‘and’ value(σ2, dt, 0) = value(σ1, dt, 0) .

Now ρ acts on a “smaller” state space than the one σ1 and σ2 belong to, so we have to restrict σ1 and σ2. The second part of the ‘and’ states that the value of the added variable does not change.
(7) In order to describe information on the recursive program Recurs(φ), we have to consider a sequence of relations with special properties. Given Seq : nat → Reln, with

∀σ1 : State ∀σ2 : State ((σ2) (σ1) (0) Seq ‘eqv’ if Some_ref_on(σ1) then σ2 = σ1 else value(σ2, ref, nonterm) = ON)

∀k : nat ∀π : Program (info(π, (k) Seq) ‘imp’ info((π) φ, (k + 1) Seq))

to Recurs(φ) is connected

∀n : nat ∃k : nat (k ‘gtr’ n ‘and’ (σ2) (σ1) (k) Seq) .

The interpretation is as follows: We start from a program π0 to which we connect the proposition value(σ2, ref, nonterm) = ON. (π0 can be considered as a non-terminating program.) We now build the programs (π0) φ0 := π0, (π0) φ1 := (π0) φ, (π0) φ2 := ((π0) φ) φ, .... For every k, (k) Seq is a relation that presents information on (π0) φk, by induction: (0) Seq presents information about π0, and for any k and π we have info(π, (k) Seq) ‘imp’ info((π) φ, (k + 1) Seq). The information presented on Recurs(φ) is now the least upper bound of the sequence Seq:

[σ1 : State] [σ2 : State] ∀n : nat ∃k : nat (k ‘gtr’ n ‘and’ (σ2) (σ1) (k) Seq) .
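The approximation idea behind Seq can be illustrated on the while statement of Section 3, here for the toy loop “while x > 0 do x := x - 1” with a single integer as the whole state. (0) Seq carries no termination information, and each step applies one unfolding of the loop body; the Python encoding is an illustration of this sketch, not the Automath development.

```python
def seq(k):
    """(k) Seq as a Python predicate on (initial, final) toy states."""
    if k == 0:
        return lambda s1, s2: False          # stage 0: no terminating run known
    prev = seq(k - 1)
    # one unfolding of: if x > 0 then (x := x - 1; p) else skip
    return lambda s1, s2: (prev(s1 - 1, s2) if s1 > 0 else s2 == s1)

# For initial state s1 the approximants stabilize once k > s1, and the
# stabilized information is "the final state is 0":
assert not seq(3)(5, 0)     # 3 unfoldings are not yet enough
assert seq(6)(5, 0)         # 6 unfoldings pin the final state down
assert all(seq(6)(5, s2) == (s2 == 0) for s2 in range(-2, 8))
```

The least upper bound over k then gives, for every initial state, exactly the information that the loop terminates with final state 0, matching the ∀n ∃k formulation above in spirit.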
Starting from our semantics of the seven primitive programs, we can define relations for higher-level constructs and prove that these relations present information. Especially the while statement deserves some attention, and the
programs that effect the arithmetic operations, such as “a := b + c”. Once all such standard programs have been written in our book, we gradually can start to write more complex programs and to present information about them. This set-up is completely parallel to the situation in mathematics, where we start from very simple primitives, and gradually learn to say everything we want. Much of the work we have to do when writing programs and proving semantics about them is more or less standard. All the time we deal with complex expressions in terms of the operations on states (as given in Section 2). Those can be simplified by application of the rules we have mentioned at the end of Section 2, applying elementary logic and elimination of if-then-else constructs. At this moment we feel the need for a (limited) automatic simplifier. Given a complex expression in terms of extend, adapt, restrict etc., such a simplifier is supposed to deliver a simpler equivalent form of this expression (and, written in Automath, a proof of this equivalence). Occasionally, some human interaction might be helpful.
Computer Program Semantics in Space and Time

N.G. de Bruijn
1. INTRODUCTION

This note can be considered as an addition to [de Bruijn 73d] (see also [Wieringa 80 (F.4)]). We aim at a new treatment of recursive procedures, i.e., a new version of Section 11 of [de Bruijn 73d]. This new treatment also affects the other sections, but there the alterations that have to be made are quite obvious. In [de Bruijn 73d] we used a state space Ω that was extended to a space Ω+ by adding a single element ∞. This element ∞ played a role in the semantics only: it is not something that can be referred to in the programs. Its semantic role is to indicate non-termination. In the present note we follow a system that handles some further information, i.e., something corresponding to runtime. Where the system of [de Bruijn 73d] only distinguished between runtime being finite or infinite, the present system might be able to say exactly what the runtime is in cases where it is finite. For practical applications it might be interesting to develop runtime administration for terminating programs, but this was not the main motivation for this study. The reason was rather of a theoretical nature. The semantic treatment of [de Bruijn 73d] (Sect. 11) turned out to be hard to combine with fixed point semantics. Mr. R. Wieringa, who implemented a large part of [de Bruijn 73d] in Automath, had to introduce a new notion of order between predicates and had to impose slightly awkward monotonicity restrictions in order to establish a correspondence between the recursion semantics of [de Bruijn 73d] and fixed point theory. The system proposed in this note is much easier in this respect. Yet, if we weaken the runtime information by distinguishing between finite and infinite runtime only, the semantics can be expected to be the one of [de Bruijn 73d]. Our present system takes care of runtime by describing a relation between the moment t where the execution of a program starts, and the moment t′ where the execution ends.
Semantical information about a program will have the form of a predicate on (Ω × T) × (Ω × T) (whereas in [de Bruijn 73d] it had the form of a predicate on Ω⁺ × Ω⁺). It is quite reasonable to think of predicates in which
the relation between t and t' is expressed in a form like

t + f(w, w') ≤ t' ≤ t + g(w, w') ,
and it is also quite reasonable to choose the semantics of primitive program constructs in accordance with this form. Nevertheless we shall not set it as a rule that our predicates should necessarily have this form. What we do require, however, is that t ≤ t' is somehow enforced, and we shall realize this by restricting our predicates to the set of all those quadruples (w, t, w', t') for which t ≤ t'.
The interpretation of t and t' is obvious. In a case where t < ∞, t' = ∞ we of course say that the program does not terminate. In cases where t = ∞, t' = ∞ we can say that the program execution never started, since some other program that had to be executed first took infinitely long. In order to be able to say that the sum of the lengths of infinitely many time intervals is infinite, we restrict ourselves to time moments which are either integers or the symbol ∞. Just like in [de Bruijn 73d], where programs never explicitly referred to the element ∞, in our present system the programs will usually not refer to t in any way, except for the program "delay". In this note we shall discuss the relation between t and t' only as far as it is relevant for the treatment of recursion. In fact we take the point of view that no program takes time, except for the "overhead" of a procedure call. This overhead is connected with the fake program we call "delay", which takes time without doing anything else: its semantics may be described by (w = w') ∧ (t' = t + 1). One might say that every case of non-termination is already caused by an infinity of executions of "delay", in spite of the fact that other program components might try to make it worse. Another point of view in which this note differs from [de Bruijn 73d] is a simplification: we have given up the "relativistic" attitude (see Section 3). Part of the philosophy underlying the note [de Bruijn 73d] was that semantical discussion can be kept on a mathematical level without entering into the syntax of a programming language. The correctness of the code in which the program is presented to a computer is a matter we can reduce to the question of the correctness of a compiler. This is not essentially different from the problem of the correctness of the translation of programs in higher order languages. The philosophy that semantics can be built up without entering into syntax, is elaborated in Section 8 of this paper.
Computer program semantics in space and time (F.5)
2. PREDICATES ON THE SET A

We start from a set Ω (which is called "state space"). And we introduce the "time space" T, defined as

T = Z ∪ {∞}

(where Z is the set of integers and ∞ is a new element). In T we have addition and order. The ordinary addition of Z is extended by ∞ + t = t + ∞ = ∞ for all t ∈ T, and the ordinary order of Z is extended by agreeing that k < ∞ for all k ∈ Z. The set A is defined as

A = {(w, t, w', t') ∈ Ω × T × Ω × T : t ≤ t'} .
Given a point (w, t, w', t'), we may refer to the pair (w, t) as "initial", and to the pair (w', t') as "final". We write Pred(A) for the collection of all predicates on A. Particular predicates are

(i) "TRUE", which is identically true on A,

(ii) "FALSE", which is identically false on A,

(iii) "NONTERM" (for non-termination), which has as its value the proposition t' = ∞,

(iv) a predicate which we shall denote by J, given by

J(w, t, w', t') = (t = t') ∧ ((t' < ∞) → (w = w')) .    (1)
It is inconvenient to admit all arbitrary predicates on A, for in cases where t' = ∞ the value of w' has no sensible interpretation, and for our semantics it would be awkward to make the distinction between different final pairs (w', ∞). (Of course it neither makes sense to distinguish between different initial pairs (w, ∞), but there it causes no trouble for our semantic system.) Somehow we want to consider all the (w', ∞)'s as equal, but we do not want to lose the Cartesian product structure of our space. Therefore we shall restrict predicates to being "constant at infinity". We say that a predicate P ∈ Pred(A) is constant at infinity if P(w, t, w', ∞) = P(w, t, w'', ∞) for all w, w', w'' ∈ Ω and all t ∈ T. The set of all P ∈ Pred(A) which are constant at infinity will be called Pred*(A). For every P ∈ Pred(A) we construct a P* ∈ Pred*(A) as follows:
P*(w, t, w', t') = P(w, t, w', t')   if t' < ∞ ,

and

P*(w, t, w', ∞) = ∃ρ∈Ω P(w, t, ρ, ∞)

for all w, w' ∈ Ω, t ∈ T. If P is itself in Pred*(A) already, we obviously have P* = P. This applies in particular to the examples TRUE, FALSE, NONTERM and J, mentioned above. We shall write P ⊂ Q for predicate implication (rather than using the dubious notation P → Q), so P ⊂ Q means

∀(w, t, w', t') ( P(w, t, w', t') → Q(w, t, w', t') ) .
We use the sign = for equivalence of predicates (and we shall often just call it equality), so P = Q means (P ⊂ Q) ∧ (Q ⊂ P). The notation P ⊂ Q helps us to remember that the set of points satisfying P is a subset of the set of points satisfying Q. Quite often we interpret a predicate as an amount of information, and then we have to keep in mind that P ⊂ Q does not mean that Q gives more information than P. It is just the other way around: P gives all the information presented by Q, but possibly more. With the above notation P*, Q* we obviously have

(P ⊂ Q) → (P* ⊂ Q*)
for all P, Q ∈ Pred(A). Quite often we want to describe a predicate by means of some expression E containing w, t, w', t'. We shall use the notation XE in order to denote the predicate P for which P(w, t, w', t') = E for all w, t, w', t'. And if P is obtained from E this way, we write X*E for the P* corresponding to P. As examples we mention

X*((w = w') ∧ (t' = ∞)) = X*(t' = ∞) = X(t' = ∞) = NONTERM ,

X*((w = w') ∧ (t = t')) = X(((w = w') ∧ (t = t')) ∨ (t = t' = ∞)) = X((t = t') ∧ ((t' < ∞) → (w = w'))) = J .
If P and Q are in Pred(A) we define the "boolean matrix product", or "boolean convolution", denoted by P * Q, as follows:

(P * Q)(w, t, w', t') = ∃σ∈Ω ∃s∈T, t ≤ s ≤ t' ( P(w, t, σ, s) ∧ Q(σ, s, w', t') ) .

We note that this convolution is associative:

(P * Q) * R = P * (Q * R) .

It is not hard to show that for all P ∈ Pred(A)

P * J = P* ,    (2)

and therefore (P * Q)* = P * Q * J = P * Q*. In particular, if Q is constant at infinity, then P * Q is constant at infinity. As a special case of (2) we mention

J * J = J* = J .
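These identities can be checked mechanically on a finite model. The sketch below is an illustration only (the two-point state space, the cut-off time slice, and the sample predicate P are arbitrary choices, not part of the text): it represents predicates as Python functions on quadruples, with float('inf') standing for ∞, and verifies P * J = P* (eq. (2)) and J * J = J on all points of the model.

```python
from itertools import product

INF = float('inf')          # stands for the element "oo" of T
OMEGA = [0, 1]              # a tiny state space (arbitrary choice)
TIME = [0, 1, 2, 3, INF]    # a finite slice of T = Z u {oo}

# A: all quadruples (w, t, w2, t2) with t <= t2
A = [q for q in product(OMEGA, TIME, OMEGA, TIME) if q[1] <= q[3]]

def J(w, t, w2, t2):
    return t == t2 and (not t2 < INF or w == w2)        # eq. (1)

def star(P):
    """P |-> P*: make P 'constant at infinity'."""
    def Ps(w, t, w2, t2):
        if t2 < INF:
            return P(w, t, w2, t2)
        return any(P(w, t, r, INF) for r in OMEGA)
    return Ps

def conv(P, Q):
    """Boolean convolution P * Q over an intermediate pair (sw, st)."""
    def PQ(w, t, w2, t2):
        return any(P(w, t, sw, st) and Q(sw, st, w2, t2)
                   for sw in OMEGA for st in TIME if t <= st <= t2)
    return PQ

# a sample predicate: flip the state, consume one time unit
P = lambda w, t, w2, t2: w2 == (w + 1) % 2 and t2 == t + 1

assert all(conv(P, J)(*q) == star(P)(*q) for q in A)    # P * J = P*, eq. (2)
assert all(conv(J, J)(*q) == J(*q) for q in A)          # J * J = J* = J
```

Note that the quantification in conv only ranges over intermediate pairs with t ≤ st ≤ t2, mirroring the restriction of all predicates to the set A.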
3. SEMANTIC INFORMATION
We assume that we have a set called Prog(Ω). The elements of this set are called programs (or "programs on Ω"). And we assume to have a mapping "Totinf" (which stands for "total information"), mapping Prog(Ω) into Pred*(A). The interpretation is that if π is a program, then Totinf(π) is a predicate on A that presents all the semantic information about π. That is to say, in terms of operational interpretation: there is an execution of π leading from initial state (w, t) to final state (w', t') if and only if the quadruple (w, t, w', t') satisfies the predicate Totinf(π). Quite often it happens that for a given program π it is hard to find Totinf(π) and, moreover, this total information is not always important in all its details. We can usually be quite happy with something that is simpler and weaker. That means that we work with some R ∈ Pred*(A) such that

Totinf(π) ⊂ R .    (1)
In most cases we start from the other end. We have some predicate R as our goal, and we want to find a program π such that (1) holds. Then we say that R is a program specification, and that π is a program that satisfies the specification. In [de Bruijn 73d] we tried to avoid the introduction of a thing like Totinf. The idea behind this is that it might be useful to have a semantic system in which people with different ideas about the total information are still able to communicate about things they do agree on, and also that it leaves some freedom to those who implement a programming language. This idea of [de Bruijn 73d], if extended to our present system, means that we work with a predicate W on Prog(Ω) × Pred*(A), with the interpretation that W(π, P) expresses that P(w, t, w', t') is true if (but not necessarily "only if") there is an execution for which (w, t, w', t') presents the initial and final state. The connection with Totinf is

if Totinf(π) ⊂ P then W(π, P)
for all π ∈ Prog(Ω), P ∈ Pred*(A). In the present note we shall not follow this line of [de Bruijn 73d]. The advantage of avoiding Totinf(π) might not compensate the disadvantage that the basic properties of W(π, P) are harder to formulate than those of Totinf(π). There is an analogy in topology, where there is a possibility ("point-free topology") to restrict the discussions to "open sets" as basic objects, without bothering whether these objects are sets (of "points") indeed. The price that has to be paid for this "relativistic" point of view is a complication of the axiomatic structure, a structure that can only be understood with the non-relativistic structure in mind. The interpretation of Totinf(π) makes it hard to attach a meaning to cases where w, t are such that there is not a single pair w', t' such that Totinf(π)(w, t, w', t') is true. Nevertheless we do not exclude these cases by means of a general condition on Totinf(π), partly since it still might turn out to come in handy for special kinds of abortion (cf. Section 10). On the basis of Totinf we can define a transitive relation between programs. If both π and σ are in Prog(Ω) we write

π ≤ σ if and only if Totinf(π) ⊂ Totinf(σ) .

If both π ≤ σ and σ ≤ π we say that π and σ are semantically equivalent.
4. SEMANTICS OF PRIMITIVE PROGRAMS

In this section we consider the programs "skip", "delay", "adlibitum", "nonterm" and a class of programs called "assignments". We ignore other primitive programs like x := x + y, etc. The programs "skip", "delay", "adlibitum" and "nonterm" will hardly ever occur in actual programs, but are just added to the collection of all programs in order to smooth the semantic discussion. The program "skip" does nothing, that is to say that it leaves both w and t untouched, at least as long as t < ∞. Its semantics is

Totinf(skip) = X*((w = w') ∧ (t = t')) .

The program "delay" is just like "skip" in as far as w = w', but it "consumes a unit of time":

Totinf(delay) = X*((w = w') ∧ (t' = t + 1)) .

The program "adlibitum" is a non-deterministic program that instructs the computer to do as it pleases, possibly even to use an infinite amount of time.
(The term "adlibitum" is used in music with the same meaning, although one would not usually allow the performer to play infinitely long.) The semantics is "no information at all", "anything may happen", and is expressed formally by Totinf(adlibitum) = TRUE. The program "nonterm" instructs the computer to go on for ever, and requires nothing about w'. Its semantics is

Totinf(nonterm) = X(t' = ∞) = X*(t' = ∞) = NONTERM .

The trouble with a formal treatment of assignments is this: they contain expressions in some syntactic form, intended to represent elements of Ω, but the language in which they are formulated (the computer programming language) should not be confused with the (so much richer) mathematical language in which we discuss the semantics. Let us say that somehow we have defined a class of expressions (in the programming language) and that to each E of that class we have assigned a mapping g of Ω into Ω. Then something like "x := E" is a program; we note that E may contain a symbol x referring to the initial value w, but no symbols referring to w', t or t'. We shall not say precisely how g is obtained from E; one usually has the "naive" interpretation that the value g(w) is obtained if we just replace x by w in E. But whatever the relation between E and g might be, the semantics is

Totinf(x := E) = X*((w' = g(w)) ∧ (t = t')) .

We have chosen the (unrealistic) point of view that the execution of x := E does not take time. If one wants to attach some time consumption to this program, it is easily administered by adding (in concatenation) a number of executions of "delay".
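The Totinf's of this section can be written down directly in a small finite model. The sketch below is an illustration only (the finite Ω, the clipped increment, and all names are modelling choices, not part of the text); float('inf') stands for ∞, and star is the P ↦ P* operation of Section 2.

```python
INF = float('inf')       # stands for the element "oo" of T
OMEGA = [0, 1, 2]        # arbitrary finite state space, for illustration only

def star(P):
    """The operation P |-> P* of Section 2: identify all final pairs (w', oo)."""
    def Ps(w, t, w2, t2):
        if t2 < INF:
            return P(w, t, w2, t2)
        return any(P(w, t, r, INF) for r in OMEGA)
    return Ps

SKIP      = star(lambda w, t, w2, t2: w == w2 and t == t2)
DELAY     = star(lambda w, t, w2, t2: w == w2 and t2 == t + 1)
NONTERM   = lambda w, t, w2, t2: t2 == INF
ADLIBITUM = lambda w, t, w2, t2: True          # "no information at all"

def assign(g):
    """Totinf(x := E), where g : OMEGA -> OMEGA is the meaning of E."""
    return star(lambda w, t, w2, t2: w2 == g(w) and t == t2)

INCR = assign(lambda w: min(w + 1, 2))   # a sample assignment, clipped to OMEGA

assert SKIP(1, 0, 1, 0) and not SKIP(1, 0, 2, 0)
assert DELAY(1, 3, 1, 4) and not DELAY(1, 3, 1, 3)
assert INCR(0, 5, 1, 5) and not INCR(0, 5, 2, 5)
assert SKIP(0, INF, 2, INF)   # at t' = oo the final state is irrelevant
```

The last assertion shows why the star is applied: at t' = ∞ all final states are identified, so skip, delay and assignments carry no information about w' there.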
5. LOWER PRIMITIVE PROGRAM CONSTRUCTS
One of the simplest lower primitive program constructs is π or σ, if both π and σ are in Prog(Ω). The interpretation is that for every input the computer is free to choose which one of π and σ is to be executed. The semantics is described, if P = Totinf(π), Q = Totinf(σ), R = Totinf(π or σ), by

R(w, t, w', t') = P(w, t, w', t') ∨ Q(w, t, w', t')

for all (w, t, w', t') ∈ A. Since both P and Q are constant at infinity, the same thing holds for R.
Next we consider the well-known construct "π ; σ", called the concatenation of the programs π and σ. The semantics is given by means of the convolution

Totinf(π ; σ) = Totinf(π) * Totinf(σ) .

We note that Totinf(π ; σ) is constant at infinity, since the second factor is constant at infinity. We also note that because of the associativity of the convolution the programs π ; (σ ; τ) and (π ; σ) ; τ are semantically equivalent. With the construct "if E then π else σ" we have a situation similar to the one with the assignment statement we considered in Section 4. Somehow we have defined a class of expressions in the programming language, and to each E of that class we have associated a predicate B on Ω. The expression E may contain a symbol x referring to the input value w, but no symbols referring to w', t or t'. (The usual "naive" point of view is that the program text contains B itself, and that it reads "if B(x) then π else σ".) The intuitive (operational) meaning of "if E then π else σ" is that if B(w) is true then π is to be executed, if B(w) is false then σ is to be executed. The formal semantics is described as follows. Let P, Q, R be the Totinf's of π, σ and "if E then π else σ". Then we have

R(w, t, w', t') = (B(w) ∧ P(w, t, w', t')) ∨ (¬B(w) ∧ Q(w, t, w', t'))

and we note that the right-hand side is equivalent to

(B(w) → P(w, t, w', t')) ∧ (¬B(w) → Q(w, t, w', t')) .

We have to check that R is constant at infinity. This follows directly from the fact that both P and Q are constant at infinity. In the lower primitive program constructs of this section we have not administered any run time for the "overhead" of the constructs: the only time consumption is in the execution of the sub-programs π, σ, ... . It would not be very hard to alter this by adding a number of executions of the program "delay".
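In a finite model the three constructs of this section become one-line combinators on predicates, with the convolution of Section 2 giving concatenation. The sketch below is illustrative only (Ω, the time slice, and all names are my choices, not the paper's).

```python
INF = float('inf')
OMEGA = [0, 1]
TIME = [0, 1, 2, 3, 4, INF]   # a finite slice of T, for illustration

def star(P):
    def Ps(w, t, w2, t2):
        if t2 < INF:
            return P(w, t, w2, t2)
        return any(P(w, t, r, INF) for r in OMEGA)
    return Ps

def conv(P, Q):
    """Boolean convolution P * Q of Section 2."""
    def PQ(w, t, w2, t2):
        return any(P(w, t, sw, st) and Q(sw, st, w2, t2)
                   for sw in OMEGA for st in TIME if t <= st <= t2)
    return PQ

def or_(P, Q):       # Totinf(pi or sigma)
    return lambda w, t, w2, t2: P(w, t, w2, t2) or Q(w, t, w2, t2)

def seq(P, Q):       # Totinf(pi ; sigma) = Totinf(pi) * Totinf(sigma)
    return conv(P, Q)

def ifte(B, P, Q):   # Totinf(if E then pi else sigma)
    return lambda w, t, w2, t2: P(w, t, w2, t2) if B(w) else Q(w, t, w2, t2)

SKIP  = star(lambda w, t, w2, t2: w == w2 and t == t2)
DELAY = star(lambda w, t, w2, t2: w == w2 and t2 == t + 1)

TWO = seq(DELAY, DELAY)               # "delay ; delay" takes exactly two units
assert TWO(0, 1, 0, 3) and not TWO(0, 1, 0, 2)

COND = ifte(lambda w: w == 0, SKIP, DELAY)
assert COND(0, 2, 0, 2) and COND(1, 2, 1, 3) and not COND(1, 2, 1, 2)

EITHER = or_(SKIP, DELAY)             # "skip or delay"
assert EITHER(1, 0, 1, 0) and EITHER(1, 0, 1, 1) and not EITHER(1, 0, 1, 2)
```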
6. HIGHER PRIMITIVE PROGRAM CONSTRUCTS
Let φ be a program-to-program function, i.e., a mapping of Prog(Ω) into itself. As an example we start from some expression E to which there corresponds a predicate B (like in Section 5), and for any π ∈ Prog(Ω) we define φ(π) by

φ(π) = if E then π else skip .
For any program-to-program function φ we have a program to be called RECURS(φ). We may think of a program p that can be described in ALGOL 60 by the procedure declaration and procedure body

procedure p ; φ(p) .

In order to describe the semantics of RECURS(φ) we first define the program-to-program function ψ by

ψ(π) = φ((delay ; π)) .    (1)

Next we introduce the functions ψ^k (k = 0, 1, ...) by iteration: ψ^0(π) = π, ψ^(k+1)(π) = ψ(ψ^k(π)). We apply the ψ^k to the primitive program "adlibitum". Abbreviating

R_k = Totinf(ψ^k(adlibitum))    (2)

we now define the semantics of RECURS(φ) by

Totinf(RECURS(φ)) = R    (3)

where

R(w, t, w', t') = ∀k∈N R_k(w, t, w', t')    (4)

for all w, w' ∈ Ω and t, t' ∈ T. N is the set {1, 2, 3, ...}, but it would do no harm to include k = 0 as well, since R_0 = Totinf(adlibitum) = TRUE. We have to check that R ∈ Pred*(A), i.e., that R is constant at infinity. This is trivial from (4), since each R_k is constant at infinity. In Section 9 we shall comment on the definition (3), and discuss its relation to fixed point semantics. We postpone the discussion since we want to show first that (3) is good enough for a practical case: the "while" statement.
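The definition (1)-(4) can be exercised on a concrete loop. The sketch below is illustrative only (the countdown loop, the finite Ω, and the cut-off of five iterates are my choices, not the paper's): it builds R_k = Totinf(ψ^k(adlibitum)) for the loop "while w > 0 do w := w - 1" and intersects the R_k as in (4). With one "delay" per call, the loop from state w must end in state 0 at time t + w, and R pins down exactly that behaviour.

```python
INF = float('inf')
OMEGA = [0, 1, 2]              # arbitrary finite state space, for illustration
TIME = [0, 1, 2, 3, 4, INF]   # a finite slice of T

def star(P):
    def Ps(w, t, w2, t2):
        if t2 < INF:
            return P(w, t, w2, t2)
        return any(P(w, t, r, INF) for r in OMEGA)
    return Ps

def conv(P, Q):                # boolean convolution of Section 2
    def PQ(w, t, w2, t2):
        return any(P(w, t, sw, st) and Q(sw, st, w2, t2)
                   for sw in OMEGA for st in TIME if t <= st <= t2)
    return PQ

def ifte(B, P, Q):
    return lambda w, t, w2, t2: P(w, t, w2, t2) if B(w) else Q(w, t, w2, t2)

SKIP      = star(lambda w, t, w2, t2: w == w2 and t == t2)
DELAY     = star(lambda w, t, w2, t2: w == w2 and t2 == t + 1)
ADLIBITUM = lambda w, t, w2, t2: True

# the loop "while w > 0 do w := w - 1", one "delay" per procedure call
B   = lambda w: w > 0
TAU = star(lambda w, t, w2, t2: w2 == w - 1 and t == t2)

def psi(P):   # Totinf(psi(pi)) for phi(pi) = if E then (tau ; pi) else skip
    return ifte(B, conv(TAU, conv(DELAY, P)), SKIP)

R_list, Rk = [], ADLIBITUM
for _ in range(5):             # R_1 .. R_5 = Totinf(psi^k(adlibitum))
    Rk = psi(Rk)
    R_list.append(Rk)

R = lambda w, t, w2, t2: all(r(w, t, w2, t2) for r in R_list)   # eq. (4)

assert R(2, 0, 0, 2)        # from w = 2: two iterations, two delays
assert not R(2, 0, 0, 3)    # no other finite running time ...
assert not R(2, 0, 1, 2)    # ... and no other final state
assert R(0, 3, 0, 3)        # B false at once: behaves like skip
```

The early iterates R_1, R_2 are weak (they still contain adlibitum's TRUE), which is why the intersection over all k is taken rather than any single R_k.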
7. THE WHILE STATEMENT

We consider the program that is usually written as

while E do τ ,    (1)

where τ is a program, and E is an expression that plays the same role as in Section 5 in the construct "if E then π else σ". We form a program-to-program function φ by

φ(π) = if E then (τ ; π) else skip    (2)
and we claim that the semantics of RECURS(φ) is able to explain the semantics usually attached to (1). We shall do this both for partial correctness and for total correctness. In both cases B is the predicate on Ω that corresponds to E.
Theorem 1 ("Partial correctness"). Let C be a predicate on Ω and abbreviate

D = X((t = t') ∧ ((B(w) ∧ C(w)) → C(w'))) ,    (3)

F = X((C(w) ∧ (t' < ∞)) → (C(w') ∧ ¬B(w'))) .    (4)

Assume that the semantics of τ satisfies

Totinf(τ) ⊂ D* .    (5)

Then the one of RECURS(φ) satisfies

Totinf(RECURS(φ)) ⊂ F .    (6)
Proof. As in Section 6 we abbreviate

R_k = Totinf(ψ^k(adlibitum)) ,

and we note that

ψ^(k+1)(adlibitum) = if E then (τ ; delay ; ψ^k(adlibitum)) else skip .

By the rules of Section 5 we get R_(k+1) ⊂ W_(k+1) where

W_(k+1) = X((B(w) → H_k(w, t, w', t')) ∧ (¬B(w) → S(w, t, w', t'))) ,    (7)

H_k = X(∃σ,σ'∈Ω ∃s,s'∈T (D*(w, t, σ, s) ∧ (s' = s + 1) ∧ (σ' = σ) ∧ R_k(σ', s', w', t'))) ,    (8)

S = Totinf(skip) = X*((w = w') ∧ (t = t')) .

By (3) we have

D*(w, t, w', t') → (t = t')

and therefore (8) can be simplified to

H_k = X(∃σ∈Ω (D*(w, t, σ, t) ∧ R_k(σ, t + 1, w', t'))) .    (9)

We define a sequence of predicates P_k ∈ Pred(A) by

P_k = X((C(w) ∧ (t' + 1 ≤ k + t) ∧ (t' < ∞)) → (C(w') ∧ ¬B(w'))) .    (10)
We remark that for all k

P_k = P_k* ,    (11)

i.e., that P_k is constant at infinity; we even have

P_k(w, t, w', ∞)    (12)

for all k, w, t, w' because of the subexpression t' < ∞ on the left in (10). We shall show that

R_k ⊂ P_k   (k = 0, 1, 2, ...) .    (13)

Once this has been shown we rapidly get to (6): by Section 6 Totinf(RECURS(φ)) ⊂ R_k for all k, and for all (w, t, w', t') ∈ A we have

(∀k P_k(w, t, w', t')) → F(w, t, w', t') .    (14)

If t' = ∞ this is trivial since the right-hand side is true, if t' < ∞ we can (given w, t, w', t') take k such that t' + 1 ≤ k + t, whence P_k(w, t, w', t') → F(w, t, w', t'). We shall prove (13) by induction. First

R_0 ⊂ P_0    (15)

is trivial: P_0 = TRUE since t' + 1 ≤ t is false on A. Next we take a fixed k ≥ 0, we assume R_k ⊂ P_k and we shall show R_(k+1) ⊂ P_(k+1). It suffices to show

W_(k+1) ⊂ P_(k+1) .    (16)

To this end we fix (w, t, w', t') ∈ A, we assume

R_k ⊂ P_k and W_(k+1)(w, t, w', t')    (17)

and our goal is P_(k+1)(w, t, w', t'). We split in two cases: B(w) and ¬B(w), so according to (17) and (7) we can reach our goal by proving

(B(w) ∧ H_k(w, t, w', t')) → P_(k+1)(w, t, w', t')    (18)

and

(¬B(w) ∧ S(w, t, w', t')) → P_(k+1)(w, t, w', t') .    (19)

In order to show (18) we assume its left-hand side. By (9) and R_k ⊂ P_k we conclude that σ exists such that

B(w) ∧ D*(w, t, σ, t) ∧ P_k(σ, t + 1, w', t') .    (20)

If t' < ∞ we can just replace D* by D, and from (20), (3) and (10) we derive

(C(w) ∧ (t' + 1 ≤ k + t + 1)) → (C(w') ∧ ¬B(w'))    (21)

and that means P_(k+1)(w, t, w', t'). If t' = ∞ we get P_(k+1)(w, t, w', t') from (12). Next we show (19). We assume S(w, t, w', t') ∧ ¬B(w), and have to prove P_(k+1)(w, t, w', t'). If t' = ∞ this is trivial by (12), so we take t' < ∞. We assume C(w) ∧ (t' + 1 ≤ k + 1 + t) ∧ (t' < ∞) and want to show C(w') ∧ ¬B(w'). From S(w, t, w', t') and t' < ∞ we get w = w', so by C(w) and ¬B(w) we have C(w') ∧ ¬B(w'). This finishes the proof of Theorem 1. □
Theorem 2 ("Total correctness"). Let C be a predicate on Ω, and let Q be a mapping of Ω into the set of integers ≥ 0. We abbreviate

D = X(((B(w) ∧ C(w)) → (C(w') ∧ (Q(w') < Q(w)))) ∧ (t = t')) ,    (22)

F = X((C(w) ∧ (t < ∞)) → (C(w') ∧ (t' ≤ t + Q(w)) ∧ ¬B(w'))) .    (23)

Assume that the semantics of τ satisfies

Totinf(τ) ⊂ D* .    (24)

Then the one of RECURS(φ) satisfies

Totinf(RECURS(φ)) ⊂ F .    (25)

Proof. With the new D and F we follow the proof of Theorem 1. We also take new P_k's:

P_k = X((C(w) ∧ (Q(w) < k) ∧ (t < ∞)) → ((t' + 1 ≤ t + k) ∧ C(w') ∧ ¬B(w'))) .    (26)

Since we have altered D, F and P_k, we have to supply new proofs for the details (11), (14), (15), (18), (19) of the proof of Theorem 1. We note that the simplification of (8) to (9) is again valid. We have (11) since

P_k(w, t, w', ∞) ⇔ ¬(C(w) ∧ (Q(w) < k)) ∨ (t = ∞)

and the right-hand side does not depend on w'. In order to show (14) we take any (w, t, w', t') ∈ A, we assume all P_k(w, t, w', t') and C(w) ∧ (t < ∞). Taking k = Q(w) + 1 we infer from P_k(w, t, w', t') that t' + 1 ≤ t + k and C(w') ∧ ¬B(w'), and therefore (t' ≤ t + Q(w)) ∧ C(w') ∧ ¬B(w'). Thus we have proved F(w, t, w', t'). We now show (15): since Q(w) < 0 is false for all w we have from (26)

P_0 = TRUE .

Next we take any (w, t, w', t') ∈ A and any k ∈ {0, 1, ...} and we shall prove (18). We do this by showing that (20) leads to P_(k+1)(w, t, w', t'). Assume (20). Moreover, assume C(w) ∧ (Q(w) < k + 1) ∧ (t < ∞), and try to get (t' ≤ t + k) ∧ C(w') ∧ ¬B(w'). The D*(w, t, σ, t) of (20) equals D(w, t, σ, t) since t < ∞, and therefore implies in the present context C(σ) ∧ (Q(σ) < Q(w)). Hence C(σ) ∧ (Q(σ) < k). The P_k(σ, t + 1, w', t') of (20) now gives (t' ≤ t + k) ∧ C(w') ∧ ¬B(w') as we wanted. This finishes the proof of (18). We finally turn to (19). We assume (with some fixed (w, t, w', t') ∈ A and fixed k) that S(w, t, w', t') ∧ ¬B(w), and moreover C(w) ∧ (Q(w) < k + 1) ∧ (t < ∞). If t' < ∞ then S(w, t, w', t') says that w = w' and t = t', and we get (t' + 1 ≤ t + k + 1) ∧ C(w') ∧ ¬B(w'). This proves P_(k+1)(w, t, w', t'). The case t' = ∞ does not occur, since S(w, t, w', t') would still say t = t', which conflicts with t < ∞. This finishes the proof of (19), and completes the proof of Theorem 2. □
Sometimes we can definitely conclude to non-termination of a while-statement. Rather than stating general theorems we present a single example. The proof will again follow the pattern of the proof of Theorem 1. In this example we take for Ω the set of all integers. The predicate B corresponding to the E in "while E do τ" is given by B(w) = (w > 0). And τ is a program for which we assume Totinf(τ) ⊂ D*, where

D = X((t = t') ∧ (w' = w + 1)) .    (27)

(In ALGOL one can think of the while-statement "while x > 0 do x := x + 1".) We shall derive the semantical statement Totinf(RECURS(φ)) ⊂ F, where

F = X((w > 0) → (t' = ∞)) .    (28)

This corresponds to the intuitively obvious statement that if the initial value of x is positive, then "while x > 0 do x := x + 1" is non-terminating. Again we follow the proof of Theorem 1. We take

P_k = X((w > 0) → (t + k ≤ t' ≤ ∞)) .    (29)

Since D*(w, t, w', t') → (t = t') we can again simplify (8) to (9). Now again it suffices to check (11), (14), (15), (18) and (19). We have (11) since w' does not occur in (29). And (14) is trivial by (29) and (28). Next (15) is obvious since P_0 = TRUE (note that t ≤ t' ≤ ∞ in all points of A). Now we turn to (18). We assume the left-hand side, i.e., H_k(w, t, w', t') and w > 0. Turning to (9) we note that D*(w, t, σ, t) implies (t = ∞) ∨ (σ = w + 1). So by the induction assumption R_k ⊂ P_k we derive from H_k(w, t, w', t') that σ exists such that

((t = ∞) ∨ (σ = w + 1)) ∧ ((σ > 0) → (t + 1 + k ≤ t' ≤ ∞)) .

If w > 0 we deduce (t = ∞) ∨ (σ > 0), so (t = ∞) ∨ (t + 1 + k ≤ t' ≤ ∞), whence t + 1 + k ≤ t' ≤ ∞. Therefore we have proved (18). Finally (19) is trivial since already ¬B(w) implies P_(k+1)(w, t, w', t') by (29).
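The non-termination argument can also be watched numerically. In the sketch below (illustrative only; Ω is truncated to 0..7 so that the few unrollings shown stay inside the model, and all names are my choices), each R_k for the loop "while w > 0 do w := w + 1" started at w = 1 rejects every finishing time t' < t + k, in line with P_k of (29); since Totinf(RECURS(φ)) lies below every R_k, only t' = ∞ survives.

```python
INF = float('inf')
OMEGA = list(range(8))            # 0..7: big enough for four unrollings from w = 1
TIME = list(range(6)) + [INF]

def star(P):
    def Ps(w, t, w2, t2):
        if t2 < INF:
            return P(w, t, w2, t2)
        return any(P(w, t, r, INF) for r in OMEGA)
    return Ps

def conv(P, Q):
    def PQ(w, t, w2, t2):
        return any(P(w, t, sw, st) and Q(sw, st, w2, t2)
                   for sw in OMEGA for st in TIME if t <= st <= t2)
    return PQ

def ifte(B, P, Q):
    return lambda w, t, w2, t2: P(w, t, w2, t2) if B(w) else Q(w, t, w2, t2)

SKIP      = star(lambda w, t, w2, t2: w == w2 and t == t2)
DELAY     = star(lambda w, t, w2, t2: w == w2 and t2 == t + 1)
ADLIBITUM = lambda w, t, w2, t2: True

B   = lambda w: w > 0                  # the divergent loop of (27)
TAU = star(lambda w, t, w2, t2: w2 == w + 1 and t == t2)

def psi(P):
    return ifte(B, conv(TAU, conv(DELAY, P)), SKIP)

R_list, Rk = [], ADLIBITUM
for _ in range(4):                     # R_1 .. R_4
    Rk = psi(Rk)
    R_list.append(Rk)

# from w = 1 every R_k forces t' >= t + k, as in P_k of (29):
for k, r in enumerate(R_list, start=1):
    assert not any(r(1, 0, w2, t2) for w2 in OMEGA for t2 in range(k))
    assert r(1, 0, 0, k)               # ... while t' = k is still allowed by R_k
    assert r(1, 0, 0, INF)             # and t' = oo satisfies every R_k
```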
8. A CONTEMPLATION ON SYNTAX AND SEMANTICS

There is a world of semantics and a world of syntax. We use the word "world" in order to avoid having to be very precise. It means something like "area of attention". Let us call these worlds SEM and SYN. In SEM we talk about sets, relations and mappings in the usual mathematical sense. These mathematical "objects" are discussed in ordinary mathematical language. In SYN we talk about strings of characters, and in particular about special strings which are called "programs". Again we use ordinary mathematical language to discuss these linguistic objects. So both for SEM and for SYN we use mathematical language, but the "objects" are different. Gradually we discover possibilities to link objects in SYN to objects in SEM, but there is always trouble with the metalanguages of SEM and SYN, in particular in those cases where we use one and the same word (like "variable") in different meanings in the two metalanguages. In spite of the formidable amount of knowledge about formal languages it must be said that SYN is a poor man's world, an underdeveloped country. SYN cannot really live without SEM, but SEM can certainly live very comfortably without SYN, just by developing a bit of extra metalanguage. Let us compare the situation of computer programs with a subject that came up about two thousand years earlier: geometrical constructions with ruler and compass. In this geometrical case SEM is the world of geometrical objects and logical discussions about those objects. Since the whole of mathematics is available to SEM, it includes sets and mappings. As an example we mention that there is a mapping that attaches to each pair (P, r), where P is a point and r a line segment, the set circle(P, r), which is the set of all points in our plane which have the distance r to P. Let us call the act of getting the set circle(P, r) from P and r a construction.
In the metalanguage we now describe sequences of such constructions, which lead from a set of objects we are assumed to "have" at the start, to the objects which somehow interest us. We say that this sequence constructs these interesting objects. Parallel to this sequence we have a sequence of actions in our physical world on physical paper with physical ruler, compass and pencil, and for this sequence of physical actions the sequence in SEM is a mathematical model. But we have to emphasize that the world of SEM is bigger than this. We might
study constructions for which no physical realization is available. Now where is SYN in this case? Let us hire people to carry out the geometrical constructions we invented. Assume these people are unable to understand our metalanguage. We have to instruct them very precisely what to do at each step. To that end we invent a system for coding instructions, and these coded instructions are the programs of SYN. (One might say that LOGO is a kind of programming language for at least some geometrical constructions.) The question whether a sequence of commands in SYN corresponds exactly to the sequence we had in SEM's metalanguage, is independent of the question whether we actually execute or can execute these commands physically. Let us now get to computer programs. The historical order seems to be somewhat different from the old geometrical case. Most of it started with programs (which had to be very precisely defined) plus a somewhat informal notion of state space and a possibly even more informal notion of time. At the moment the need for "program correctness proofs" was felt, SYN was much better developed than SEM. It is quite natural that this resulted in various ways to treat program correctness which were mostly SYN-centered. It became SYN with a tiny bit of SEM (like state space and predicates), or SYN with a lot of SEM (like fixed point theory). Even the term "program correctness" itself bears the traces of this. The term suggests syntactical correctness, but means something different: it means that a program is correct with respect to some semantical specification. In the geometrical case one of course feels that the matter of correctness of a geometrical construction (like the question of whether our construction for a regular pentagon really leads to a pentagon that is regular) is a matter of SEM only, and that the question has nothing to do with the way we have coded the construction in SYN.
Yet we can of course raise the question whether the execution of a given coded description of a construction leads to a proper pentagon. But it would be a clear case for "separation of concerns" to split this question into (i) whether the construction is correct, and (ii) whether the coding is correct. There does not seem to be a good reason for tying things to SYN. In ordinary mathematical language we can define anything we need, like sets and mappings, without ever bothering about the kind of notation we use. There is a strong notion of equality in mathematics, with the effect that one and the same object can be described in various syntactically different ways. This is true for "objects" as well as for "actions". So let us try to keep the matter of program semantics away from SYN. In SEM we can express in ordinary mathematical language everything we want for
program correctness, and in a formal checking system (like Automath) we can speak of integrated semantics. In integrated semantics we can describe logic, mathematics, programs and program semantics all in one and the same system. And a compiler would be able to read these mathematically defined programs and to translate them into machine language without ever using the computer languages we usually think of. The total effect of integrated semantics will be simplification. It might also be a satisfactory framework in which other semantical systems can be placed and compared. In SYN-free semantics one can consider programs which are not representable syntactically at all. It might be possible to characterize the representable programs in the set of all programs by means of properties like monotonicity (see Section 9), and show that things like fixed point theory can be developed on the basis of such properties. But why go into all that trouble? In Section 6 we showed an example where an important ingredient of practical programming was treated semantically without any reference to such properties, and it seems likely that we can go quite a distance in this style. In SYN-free semantics it seems to be attractive to identify the notion of a "program" with the notion of the semantic information of that program. Yet there is something to say for the idea of creating a separate set (the set of programs) and to map it into the set of relations by a mapping "Totinf", as we did in Section 3. This policy anyway leaves various possibilities open. In particular we keep the possibility to add equality and equivalence assumptions in the set of programs, and such assumptions might be adjustable to later mappings of the programs in SYN into this set of programs. So we do not require that every element of Prog(Ω) is representable in SYN, and we do not require that two elements of Prog(Ω) are equal if they have the same semantics.
Note that sometimes the notion of equality in SYN might be stronger than the one in Prog(Ω). For example, the repeated concatenations

x := x + 1 ; (x := x + 2 ; x := x - 1)
(x := x + 1 ; x := x + 2) ; x := x - 1

might be considered as equal in SYN, and their counterparts in Prog(Ω) are different but semantically equivalent. On the other hand the programs

x := x + 1 ; x := x + 2
x := x + 3

will be considered to be different in SYN, as well as different in Prog(Ω), but yet semantically equivalent.
For the time being we just leave it open what Prog(Ω) is. Followers of SYNful semantics might like to identify it with the set of all their programs, and their antagonists might like to identify it with the set of all relations, like Pred(A).
9. COMMENTS ON THE SEMANTICS OF RECURSION

In Section 6 we defined the semantics of RECURS (for any program-to-program function φ) by means of formulas (1), (2), (3), (4). In this section we give some arguments for this choice, and we compare it with other possibilities. In the process of evaluating and comparing we shall appeal to more or less intuitive ideas on the structure and execution of computer programs. It should be stipulated that those ideas are certainly not substantiated in all respects by the treatment of program semantics as explained thus far in this paper.

Let us first ignore the ψ of Section 6, and just work with the iterates of φ itself. If p is a program then

φ(p), φ²(p), φ³(p), ...
are programs. In order to facilitate the discussion we assume for a moment that all these programs are deterministic. We take any initial value w, and we ask what happens in the execution. Our intuition says: either the recursive program is non-terminating, or there is a k such that in the execution of φᵏ(p) the p is not executed at all. This also means that the executions of φᵏ(p), φᵏ⁺¹(p), ... are all equal, as far as the initial value w is concerned, and that they are all equal to the executions of φᵏ(v), φᵏ⁺¹(v), ... for any other program v. In the case of non-deterministic programs these things are harder to explain, but the idea remains the same. In the preparation of note [de Bruijn 73d] this idea led to a particular choice of p: p is a program with

Totinf(p) = X(t' = ∞)
which means that every execution of p is non-terminating. Consequently: if any execution of φᵏ(p) actually executes p, then that execution of φᵏ(p) is non-terminating too. So we get the semantics we expect, if we say that (w, t, w', t') satisfies the predicate Totinf(φᵏ(p)) for all large k (or at least for infinitely many k). In the case of termination the p in φᵏ(p) is not executed at all if k is large. In the case of non-termination the p is executed in all these φᵏ(p), and that takes care of the truth of Totinf(φᵏ(p)) at (w, t, w', ∞) for some w'. The objection one might make is the lack of monotonicity in the sequence. Let us discuss monotonicity first in general terms. If φ is a program-to-program
N.G. de Bruijn
function we can "expect" φ to be monotonic in the sense that

π ≤ σ  ⟹  φ(π) ≤ φ(σ)

(with ≤ defined as in Section 3) for all programs π, σ: if we know more about π than about σ, then we know more about φ(π) than about φ(σ). Unfortunately we cannot apply this in the case of the sequence p, φ(p), φ²(p), ... since there is no guarantee that either p ≤ φ(p) or φ(p) ≤ p for the "non-termination" program p. We are so much better off with adlibitum: adlibitum ≥ φ(adlibitum), and if φ is monotonic this leads to

adlibitum ≥ φ(adlibitum) ≥ φ²(adlibitum) ≥ ... .
Unfortunately we do not get the proper semantics this way. The simplest example is the one where φ is the identity: φ(π) = π for all programs π. Then the limit of the sequence with entries Totinf(φᵏ(adlibitum)) is just Totinf(adlibitum). This means that the sequence provides no semantic information at all: it just says that anything may happen, and not that RECURS(φ) is definitely non-terminating. This objection has been overcome in Section 6 of this paper by taking ψ instead of φ. The extra executions of "delay" have the effect that for this particular φ

Totinf(ψᵏ(adlibitum)) = X(t + k ≤ t' ≤ ∞).
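The descent of this sequence and its limit can be checked in a small sketch, purely illustrative: we fix t = 0, restrict t' to a finite window of times plus ∞, and represent each Totinf(ψᵏ(adlibitum)) by its set of possible values of t'.

```python
import math

INF = math.inf
TIMES = list(range(10)) + [INF]          # finite fragment of the time domain plus ∞

def P(k):
    # possible values of t' under X(t + k ≤ t' ≤ ∞), with t fixed at 0
    return {t1 for t1 in TIMES if t1 >= k}

# The sequence is monotonically decreasing ...
assert all(P(k) >= P(k + 1) for k in range(10))

# ... and its limit (the intersection) retains only t' = ∞, i.e. X(t' = ∞).
limit = set.intersection(*(P(k) for k in range(11)))
assert limit == {INF}

# Each single P(k) is already safe (over-)information about the limit.
assert all(P(k) >= limit for k in range(11))
```

The last assertion is the practical advantage of monotonicity discussed below: any one term of the sequence can be used as information about the limit.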
Taking limits for k → ∞ we get X(t' = ∞), and that is what we wanted. Here we use the following definition of the limit of a monotonically decreasing sequence

π₁ ≥ π₂ ≥ π₃ ≥ ... .    (1)

We say that

πₖ → π  (k → ∞)    (2)

if for all (w, t, w', t')

limₖ Pₖ(w, t, w', t') = P(w, t, w', t'),

where Pₖ = Totinf(πₖ), P = Totinf(π). We note that Totinf(π) is uniquely determined by the sequence π₁, π₂, ..., and moreover that

πₖ ≥ π    (3)
for all k. In [de Bruijn 73d] we did not have monotonicity of the sequence of programs, and we had to have "lim sup" instead of "lim". The fact that we have monotonicity in the present semantics has the obvious advantage that the lim of a
monotonic sequence is nicer to deal with than the lim sup of a non-monotonic sequence. If we have (1) and (2), then for any arbitrary k we can use Totinf(πₖ) as information about π, since (3) expresses Totinf(πₖ) ⊇ Totinf(π). This is much simpler than with lim sup, where we can obtain information about the lim sup only if we have information about Totinf(πₖ) for infinitely many k.

Let us now discuss the idea of RECURS(φ) being a fixed point. Intuitive ideas about execution suggest the "fixed point statement"

Totinf(ψ(RECURS(φ))) = Totinf(RECURS(φ)),    (4)
but it is not easy to actually prove this without making restrictive assumptions about the class of constructs we take φ from. Just monotonicity will not do. Monotonicity does suffice for the weaker result

ψ(RECURS(φ)) ≤ RECURS(φ).    (5)

This follows if we apply ψ to both sides of the inequality (cf. (3))

RECURS(φ) ≤ ψᵏ(adlibitum)

and take limits for k → ∞. The equality (4) is easy if we assume that ψ is not just monotonic but also continuous. We take the latter notion in the sense that

lim ψ(πₖ) = ψ(lim πₖ)    (6)
for any sequence satisfying (1). However, it may be quite hard to establish continuity, even in this weak sense, for all ψ's arising from a given set of program constructs. In particular we have to bear in mind that we may wish to apply RECURS to functions φ which in their definition contain applications of RECURS already. If we take our set of program constructs a bit too wide, it is easy to kill continuity even for very simple program functions. We shall present an example that might give an idea of the kind of restrictions we might have to build in.

Let Ω be the set of integers, and imagine that for every k we have a program πₖ with

Totinf(πₖ) = X*(((w' ≥ k) ∧ (t = t')) ∨ (t' = ∞)).

We let σ be described similarly by

Totinf(σ) = X*((w' = 0) ∧ (t = t')).

We define the program-to-program function ϑ by means of concatenation with σ:

ϑ(π) = (π; σ)
for all π. We have π₁ ≥ π₂ ≥ ... . If we call the limit π, it follows from the monotonicity of ϑ that ϑ(π₁) ≥ ϑ(π₂) ≥ ... . Let us put ρ = lim ϑ(πₖ). We hope that ρ and ϑ(π) are semantically equivalent, but unfortunately this is not the case:

Totinf(ρ) = X*(((w' = 0) ∧ (t = t')) ∨ (t' = ∞)),
Totinf(ϑ(π)) = X*(t' = ∞).
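The failure of continuity here can be checked schematically. Since the state space is all the integers it cannot be enumerated, but every predicate involved has the shape (terminating part) ∨ (t' = ∞), and the argument only turns on whether the terminating part is non-empty; the sketch below (an illustration, not the paper's formalism) tracks exactly that one bit.

```python
def pi_terminates(k):
    # π_k has terminating runs: some w' ≥ k always exists among the integers.
    return True

def theta_terminates(p_terminates):
    # ϑ(π) = (π; σ): σ runs only after a terminating run of π, so ϑ(π) has a
    # terminating run (then with w' = 0) exactly when π has one.
    return p_terminates

# The limit of the π_k: a terminating quadruple would need w' ≥ k for every k,
# which is impossible, so the limit has no terminating part ...
lim_pi_terminates = False
# ... hence ϑ(lim π_k) allows only t' = ∞:
assert theta_terminates(lim_pi_terminates) is False
# ... but every ϑ(π_k) keeps the run (w' = 0, t' = t), so the limit of the
# ϑ(π_k) keeps it too: lim ϑ(π_k) ≠ ϑ(lim π_k), and ϑ is not continuous.
assert all(theta_terminates(pi_terminates(k)) for k in range(1000))
```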
So a general proof of (4), which means (6) with πₖ = ψᵏ(adlibitum), has to depend on more knowledge about the sequence than just monotonicity.

Another point that has to be raised is the maximality. Let us assume that Φ is a "predicate transformer" which is such that Totinf(ψ(π)) = Φ(Totinf(π)) for all π. Then if P = Totinf(RECURS(φ)) we can read (4) as

Φ(P) = P.

Now let Q be any other predicate with Φ(Q) = Q. Then just assuming monotonicity we can show that Q ⊂ P, in other words: P is the maximal fixed point of Φ. In order to show Q ⊂ P we remark that

TRUE ⊃ Φ(TRUE) ⊃ Φ²(TRUE) ⊃ ...

and that the limit of the sequence Φᵏ(TRUE) is P. Comparing this with the sequence Q, Φ(Q), Φ²(Q), ... (of which all entries equal Q) we infer from Q ⊂ TRUE, by monotonicity of Φ, that Q ⊂ Φᵏ(TRUE), and therefore Q ⊂ P.

The question arises whether it is really worth the trouble of finding satisfactory restrictions on ψ that guarantee (4). After all, we have shown in Section 7 that we can get to quite practical statements on actual programs without ever going into notions like continuity and fixed points. We did not even have to mention monotonicity! It might be easier to prove (4) under restrictive assumptions like finiteness of state space, or exclusion of non-determinism. But such restrictions do not seem to be attractive for the practical discussion of actual programs. Anyway there is quite a distance between the definition of recursion semantics by means of lim(Totinf(ψᵏ(adlibitum))) and any definition based on the idea of a maximal fixed point. It is a matter of opinion which one of the two ideas is preferable as the definition of the semantics of recursion.
The two kinds of semantics get closer together if we take a more liberal interpretation of non-termination. In this more liberal version a semantical statement that says (with w, t given) that t' = ∞ is a possible effect, has to be interpreted as saying that there is no upper bound to the values of the t' of the possible executions (with initial values w, t). After all, if we are interested in having our programs terminated, a statement that a program might run for a million years does not give us much more comfort than a statement that it might go on for ever. Therefore, it is quite reasonable to identify "unpredictably long" with "infinitely long".

This "liberal version" is related to the following definition. If P ∈ Pred(A) then instead of the P* of Section 2 we define P⁺ by: P⁺(w, t, w', t') = P(w, t, w', t') if t' < ∞, and

P⁺(w, t, w', ∞) = ∀u ∈ T\{∞} ∃v ∈ T, v > u, ∃ω ∈ Ω : P(w, t, ω, v).

If the set of all P⁺ with P ∈ Pred(A) is denoted by Pred⁺(A), we have

Pred⁺(A) ⊂ Pred*(A) ⊂ Pred(A).
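A sketch of the closure P ↦ P⁺ can be written down with predicates as Python functions and time running over the naturals plus ∞. The "for every u there is a v > u" is checked up to a search bound, which is sound for the particular P below (whose finite runtimes are either unbounded or a single value); everything here is an assumed illustration, not the paper's formal system.

```python
import math

INF = math.inf

def plus(P, omega, bound=200):
    def Pplus(w, t, w1, t1):
        if t1 != INF:
            return P(w, t, w1, t1)      # finite quadruples are untouched
        # t' = ∞ is declared possible iff the finite runtimes from (w, t)
        # are unbounded (here: exceed every u below the search bound)
        return all(any(P(w, t, v_w, v) for v_w in omega
                                       for v in range(u + 1, bound))
                   for u in range(bound - 1))
    return Pplus

# From w = 0 every t' ≥ t is a possible runtime (unbounded); from w = 1 only
# t' = t. The closure adds t' = ∞ in the first case but not in the second.
P = lambda w, t, w1, t1: t1 != INF and w1 == w and \
        ((w == 0 and t1 >= t) or (w == 1 and t1 == t))
Pp = plus(P, omega=[0, 1])
assert Pp(0, 0, 0, INF) and not Pp(1, 0, 1, INF)
assert Pp(1, 0, 1, 0)
```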
In deterministic cases there is no difference between P* and P⁺. To be more precise, if w, t are such that there is at most one pair w', t' with t' < ∞ and P(w, t, w', t'), then P⁺(w, t, w', t') = P*(w, t, w', t') for all w', t'. In general, if P describes the semantics of a program in the original version (where P(w, t, w', ∞) means that the program can actually run for ever), then P⁺ is the liberal semantics (where P⁺(w, t, w', ∞) means that there is no upper bound to the runtime). If we take the liberal semantics, then our prospects for proving the fixed point property for recursion become much more promising.

The difference between P* and P⁺, and its being related to having non-determinism and infinite state space, can be connected to König's well-known infinity lemma. In order to make the notation sufficiently clear for further discussion, we explain it in a few words. Let (S, r, f) be a rooted tree. That is, S is a set (the set of "points"), r is a special element of S (called the "root") and f is a mapping of S\{r} into S (f(x) is called the "father" of x). It is assumed that for every x (x ≠ r) there is an integer n such that the n-th iterate fⁿ maps x into r. This n is uniquely determined, and is called the "level" of x. The level of r is zero. If x ∈ S, the set of all y ∈ S\{r} with f(y) = x is called the "offspring" of x, and denoted O(x).

If x ∈ S we denote by IP(x) the proposition that there is an infinite path starting from x, that is a sequence x₀, x₁, x₂, ... with x₀ = x and f(xₙ₊₁) = xₙ for n = 0, 1, 2, ... . And by UL(x) (UL abbreviates "unbounded level") we denote
the proposition that for every natural number m there is a path x₀, ..., xₘ, again with x₀ = x, f(xₙ₊₁) = xₙ for n = 0, ..., m-1. We note that for all x

IP(x) ⟹ UL(x).

Finally we formulate the "finite offspring condition". It says that for every x ∈ S the offspring O(x) is a finite set.

König's lemma expresses: if the finite offspring condition holds, and UL(r) is true, then we have IP(r).

Coming back to semantics, we shall try to explain that IP(r) can be compared with infinite runtime, and UL(r) with unpredictably long runtime, in both cases with initial value r. And the finite offspring condition corresponds to a condition that says that the number of possible outputs is finite, for every given input. This condition is certainly guaranteed if the state space is finite, but also if the program is deterministic.

We describe a typical case of a tree where UL(r) holds but IP(r) does not. Let us call it (S₀, r₀, f₀). The points of S₀ are the pairs (i, j) with integers i, j satisfying either 0 < i ≤ j or i = j = 0. The point (0, 0) is taken as the root. For all other points we define the "father" by
f₀(i, j) = (i - 1, j)   if (0 < i ≤ j) ∧ (i ≠ 1),
f₀(i, j) = (0, 0)       if 1 = i ≤ j.
The tree is depicted in Figure 1. In that figure the arrows run from points to their fathers.
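A finite sketch of this tree makes the phenomenon tangible. Only the branches with j ≤ jmax can be materialised, but the point of the example is already visible: levels are unbounded, yet every individual path is finite.

```python
# Points (i, j) with 0 < i ≤ j plus the root (0, 0); the father of (i, j) is
# (i-1, j) for i > 1 and (0, 0) for i = 1, so offspring runs the other way.
def offspring(p, jmax=50):
    i, j = p
    if p == (0, 0):
        return [(1, j) for j in range(1, jmax + 1)]   # one child per branch length
    return [(i + 1, j)] if i < j else []              # (j, j) is an end-point

def longest_path(p):
    return 1 + max((longest_path(c) for c in offspring(p)), default=0)

# The longest path grows without bound as jmax grows, so UL(r0) holds; still,
# each individual path is finite, so IP(r0) fails. König's lemma does not
# apply, because the root's offspring is infinite in the real tree.
assert longest_path((0, 0)) == 51
```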
Figure 1

Coming back to the general tree (S, r, f), we describe programs for which S is the state space. As a primitive program we take the program "step". Its semantics is described by

Totinf(step) = (t' = t) ∧ (w' ∈ O(w)).
Note that "step" is a non-deterministic program, at least if there exist points w where O(w) has more than one element. If O(w) is empty (such an w is called an end-point), then there is no possible output w' to the input w. If this is considered unacceptable, one might take any arbitrary value of w' as output, like w' = w. That means that in the definition of Totinf(step) we replace w' ∈ O(w) by

w' ∈ O(w) ∨ ((O(w) = ∅) ∧ (w' = w)).

But actually the case of empty O(w) is unimportant since the program "step" will not be used there. Let us now discuss the program (see the beginning of Section 7)

while ε do step,

where ε corresponds to the predicate B given by

B(w) = (O(w) ≠ ∅).

In the notation of Formula (2) of Section 7 this program denotes RECURS(φ), where φ is given by

φ(π) = if ε then (step; π) else skip.
The "intuitive", or, if one prefers, "operational" semantics of this program is the following one. Let w (the input) be any point of the tree. If w' is an end-point such that fᵏ(w') = w for some k ≥ 0 then w' is a possible output (and t' = t + k). If IP(w) holds then there is a non-terminating execution. If IP(w) is false but UL(w) still holds then there exist unpredictably long executions. If UL(w) does not hold then all executions starting with w terminate, and there is an upper bound to their runtime.

Let us now investigate Totinf(RECURS(φ)) as defined by (4) in Section 6. As far as terminating executions are concerned, it produces the same results as the intuitive semantics. For values of w where IP(w) holds, it proclaims the possibility that t' = ∞, as it should. But in points where UL(w) holds but IP(w) does not, the semantics of Section 6 still says that t' = ∞ is possible, and the intuitive semantics says it is not. The difference between these two kinds of semantics vanishes (at least for this program) if we identify "unpredictably long runtime" and "non-termination".

In the tree program described here, it is also easy to illustrate that ψ(RECURS(φ)) and RECURS(φ) can have different semantics in the system of Section 6. We take the special tree (S₀, r₀, f₀). We have UL(r₀), but UL(x) is false for all x with f₀(x) = r₀. From this it can be derived that at (r₀, t, w', ∞) (where w' is irrelevant) RECURS(φ) is true but ψ(RECURS(φ)) is not.
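An operational sketch of "while ε do step" on a small assumed tree shows the terminating behaviour: from input w we repeatedly move, non-deterministically, to a child until an end-point is reached. Each iteration is charged one time unit in this sketch (in the paper the time cost comes from the delays in ψ, while "step" itself has t' = t), so an output at level difference k appears at t' = t + k.

```python
offspring = {0: [1, 2], 1: [3], 2: [], 3: []}   # hypothetical example tree

def executions(w, t):
    if not offspring[w]:                         # O(w) = ∅: ε is false, loop exits
        return {(w, t)}
    return set().union(*(executions(v, t + 1) for v in offspring[w]))

# Possible outcomes from the root at time 0: the two end-points below it.
assert executions(0, 0) == {(2, 1), (3, 2)}
```

On a finite tree like this one every execution terminates; the divergence between the two semantics only appears on infinite trees such as (S₀, r₀, f₀).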
Things look much better in the liberal Pred⁺-semantics. We briefly discuss the changes that have to be made. First, in Section 2 we have to introduce Pred⁺(A) instead of Pred*(A). We have to give up the idea of a J such that always P * J = P⁺, like in Formula (2) of Section 2, and it is not generally true that (P * Q)⁺ = P * Q⁺. (A counterexample: Ω = ℕ,

P(w, t, w', t') = (t' = t),  Q(w, t, w', t') = ((w' = w) ∧ (t' = t + w)).)

In Section 3 we have to introduce Totinf⁺ as a mapping of Prog(Ω) into Pred⁺(A). The changes in Section 4 are trivial. In Section 5 the semantics of the concatenation has to be described by

Totinf(π; σ) = (Totinf(π) * Totinf(σ))⁺.

In Section 6 the semantics of RECURS(φ) can be given by

Totinf(RECURS(φ)) = R⁺,

although the definition of R in (4) will guarantee that there is no difference between R and R⁺ as soon as we have monotonicity. We note that if P₀ ⊇ P₁ ⊇ P₂ ⊇ ... with Pₙ = Pₙ⁺ for all n, and if P = limₙ Pₙ, then P = P⁺. In Theorems 1 and 2 of Section 7 the new semantics makes no difference, since they deal with cases with bounded runtime. In the non-termination example at the end of Section 7 there is no difference either, since the program is deterministic. From the tree program discussed earlier in this section it can be seen that ψ(RECURS(φ)) and RECURS(φ) need not have the same Pred*-semantics in non-deterministic cases with infinite state space. If we turn to Pred⁺-semantics, however, it seems that we only need monotonicity properties in order to show that

Totinf⁺(ψ(RECURS(φ))) = Totinf⁺(RECURS(φ)),

which means that RECURS(φ) can be considered as the maximal fixed point.
10. A COMMENT ON ABORTION
We have not yet described the notion of abortion in our semantical system. Here we first discuss an attempt to treat abortion in a way that seems to be promising at first, and we shall explain why it is not satisfactory. The attempt is this one. If π is a program, and w, t are initial state and time such that there do not exist any w', t' (not even with t' = ∞) such that Totinf(π)(w, t, w', t') holds, then we might try to interpret this as abortion. This means: with the initial w, t the execution of π will have been interrupted at
some point. The semantical system does not disclose at exactly which point further execution is refused by the machine, simply because it never discusses executional details. This point of view seems to be very promising for Dijkstra's guarded command statement

if ε₁ → α₁ □ ... □ εₖ → αₖ fi    (1)

where each εᵢ corresponds to a predicate Bᵢ (like in our discussion of the "if then else" in Section 4). The semantics is: "select at random an i such that Bᵢ(w) holds, and then execute αᵢ; if there is no such i then abort". If abortion is interpreted in the style "no possible w', t'" we indicated above, then the total information of the program (1) is given by

(B₁(w) ∧ P₁(w, t, w', t')) ∨ ... ∨ (Bₖ(w) ∧ Pₖ(w, t, w', t')),    (2)
where Pᵢ = Totinf(αᵢ) (i = 1, ..., k). The simplicity of (2) seems to be a positive point both for Dijkstra's semantics and for the "no possible w', t'" interpretation of abortion.

Unfortunately we have to admit that the "no possible w', t'" interpretation of abortion conflicts with the idea of non-deterministic programs. We show this with a concatenation "a; u" where a is a non-deterministic program and u is a program that sometimes aborts. Let b and c be two different elements of Ω, and let

Totinf(a)(w, t, w', t') = ((w' = b) ∨ (w' = c)) ∧ (t = t'),
Totinf(u)(w, t, w', t') = (w ≠ b) ∧ (w = w') ∧ (t = t').

So u leads to abortion with the initial state b, but is harmless with all other initial states. By our semantic rule on concatenation we have

Totinf(a; u)(w, t, w', t') = (w' = c) ∧ (t' = t)    (3)

so here there is no abortion for any w, t. Therefore (3) does not describe the situation adequately, since we expect that the semantics of "a; u" is: whatever w, t is, we have either abortion or (w' = c) ∧ (t' = t).

A more satisfactory way to incorporate abortion into a semantical system is by means of a special boolean variable. Let us call it ab (for "abortion"). At the start of a program we add the assignment "ab := false" (interpretation: no abortion thus far), and every sub-program u of the program is replaced by "if ¬ab then u else skip". If in the final state we have ab = true then we interpret this by saying that the program execution has been aborted. In addition to this the sub-programs may contain assignments "ab := true" in those cases where we actually want abortion to take place. We might want to
do this in the guarded command statement. Another example is overflow: if we do not wish to handle numbers exceeding m, we might transform "y := 1/p" into "if |1/p| ≤ m then y := 1/p else ab := true". Let us use the word "refuser" for such an extra boolean variable like ab. The essential thing for a refuser r is that sub-programs u are to be remodelled into "if ¬r then u else skip". Refusers can be used for the semantic discussion of forward goto's as well.
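The refuser idea can be sketched as follows; the names and the overflow policy here are assumed for illustration, not taken from the paper. The state carries a boolean ab, every sub-program u is remodelled into "if not ab then u else skip", and an aborting step sets ab instead of leaving the semantics without any possible (w', t').

```python
def wrap(u):
    # "if not ab then u else skip": once ab holds, wrapped sub-programs skip.
    return lambda state: state if state['ab'] else u(state)

def div_step(state):
    # "y := 1/p" guarded by a hypothetical overflow bound m: outside the
    # bound (or for p = 0) we record the abortion in ab rather than get stuck.
    m, p = 100, state['p']
    if p != 0 and abs(1 / p) <= m:
        return {**state, 'y': 1 / p}
    return {**state, 'ab': True}

run = wrap(div_step)
ok = run({'p': 2, 'y': 0, 'ab': False})
bad = run({'p': 0, 'y': 0, 'ab': False})
assert ok['y'] == 0.5 and not ok['ab']
assert bad['ab'] and bad['y'] == 0
assert run(bad) == bad        # the aborted state is simply carried along
```

In this style an aborted execution still produces a final state, so the relational semantics of concatenation needs no special case for abortion.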
Bibliography
References
[A.1] de Bruijn, N.G., Verification of mathematical proofs by a computer, A preparatory study for a project AUTOMATH (formerly unpublished, 1967).
[A.2] de Bruijn, N.G., The mathematical language AUTOMATH, its usage and some of its extensions, in: Laudet, M., Lacombe, D. and Schuetzenberger, M., eds., Symposium on Automatic Demonstration, IRIA, Versailles, 1968 (Berlin, Springer Verlag, 1970), 29-61. (Lecture Notes in Math., 125).
[A.3] van Daalen, D.T., A description of Automath and some aspects of its language theory, in: Braffort, P., ed., Proceedings of the Symposium APLASM (Orsay, 1973), Vol. I. Also in [van Benthem Jutting 77], 48-77.
[A.4] Zucker, J., Formalization of classical mathematics in Automath, in: Colloque International de Logique, Clermont-Ferrand, France, 1975 (Paris, CNRS, 1977), 135-145. (Colloques Internationaux du Centre National de la Recherche Scientifique, 249).
[A.5] de Bruijn, N.G., A survey of the project AUTOMATH, in: Seldin, J.P. and Hindley, J.R., eds., To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980), 579-606.
[A.6] van Daalen, D.T., The language theory of Automath, Ph.D. thesis (Eindhoven University of Technology, 1980), Chapter 1, Sections 1-5 (Introduction).
[A.7] de Bruijn, N.G., Reflections on Automath (Eindhoven University of Technology, 1990).
[A.8] Nederpelt, R.P., Type systems - basic ideas and applications, in: van de Goor, A.J., ed., Proceedings of CSN '90, Computing Science in the Netherlands (Amsterdam, Stichting Mathematisch Centrum, 1990), 367-383.
[B.1] van Benthem Jutting, L.S., Description of AUT-68 (Eindhoven University of Technology, 1981). (Memorandum 1981-12, Dept. of Math.).
[B.2] de Bruijn, N.G., AUT-SL, a single line version of AUTOMATH (Eindhoven University of Technology, 1971). (Notitie 71-22, Dept. of Math.).
[B.3] de Bruijn, N.G., Some extensions of AUTOMATH: the AUT-4 family (Eindhoven University of Technology, 1974). (Internal Report, Dept. of Math.).
[B.4] de Bruijn, N.G., AUT-QE without type inclusion (Eindhoven University of Technology, 1978). (Memorandum 1978-04, Dept. of Math.).
[B.5] van Benthem Jutting, L.S., Checking Landau's "Grundlagen" in the AUTOMATH system, Ph.D. thesis (Eindhoven University of Technology, 1977), Appendix 9 (AUT-SYNT).
[B.6] van Daalen, D.T., The language theory of Automath, Ph.D. thesis (Eindhoven University of Technology, 1980), Chapter VIII, Sections 1 and 2 (AUT-II).
[B.7] de Bruijn, N.G., Generalizing Automath by means of a lambda-typed lambda calculus, in: Kueker, D.W., Lopez-Escobar, E.G.K. and Smith, C.H., eds., Mathematical Logic and Theoretical Computer Science, Proceedings of the Maryland 1984/85 Special Year in Mathematical Logic and Theoretical Computer Science (New York, Marcel Dekker, 1987), 71-92. (Lecture Notes in Pure and Appl. Math., 106).
[B.8] Balsters, H., Lambda calculus extended with segments, Ph.D. thesis (Eindhoven University of Technology, 1986), Chapter 1, Sections 1.1 and 1.2 (Introduction).
[C.1] van Benthem Jutting, L.S., A normal form theorem in a λ-calculus with types, in: Mitt. d. Gesellsch. f. Math. u. Datenverarb. 17, Tagung üb. form. Sprachen u. Programmiersprachen, Oberwolfach, Germany (1971), 27-32.
[C.2] de Bruijn, N.G., Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem, Indagationes Math. 34, 5 (1972), 381-392.
[C.3] Nederpelt, R.P., Strong normalisation in a typed lambda calculus with lambda structured types, Ph.D. thesis (Eindhoven University of Technology, 1973).
[C.4] de Vrijer, R.C., Big trees in a λ-calculus with λ-expressions as types, in: Böhm, C., ed., λ-Calculus and Computer Science Theory (Berlin, Springer Verlag, 1975), 252-271. (Lecture Notes in Comp. Sc., 37). Also: de Vrijer, R.C., Surjective pairing and strong normalization, Ph.D. thesis (University of Amsterdam, 1987), Chapter 5.
[C.5] van Daalen, D.T., The language theory of Automath, Ph.D. thesis (Eindhoven University of Technology, 1980), Parts of Chapters II, IV, V - VIII.
[C.6] van Benthem Jutting, L.S., The language theory of Λ, a typed lambda calculus where terms are types (Eindhoven University of Technology, 1985). (Memorandum 1985-02, Dept. of Math. and Comp. Sc.).
[D.1] de Bruijn, N.G., Example of a text written in Automath (formerly unpublished, 1968).
[D.2] van Benthem Jutting, L.S., Checking Landau's "Grundlagen" in the AUTOMATH system, Ph.D. thesis (Eindhoven University of Technology, 1977), Parts of Chapters 0, 1 and 2 (Introduction, Preparation, Translation).
[D.3] van Benthem Jutting, L.S., Checking Landau's "Grundlagen" in the AUTOMATH system, Ph.D. thesis (Eindhoven University of Technology, 1977), Chapter 4 (Conclusions).
[D.4] van Benthem Jutting, L.S. and de Vrijer, R.C., A text fragment from Zucker's "Real Analysis" (1994).
[D.5] van Benthem Jutting, L.S., Checking Landau's "Grundlagen" in the AUTOMATH system, Ph.D. thesis (Eindhoven University of Technology, 1977), Appendices 3 and 4 (The PN-lines from the preliminaries; Excerpt for "Satz 27").
[E.1] Zandleven, I., A verifying program for Automath, in: Braffort, P., ed., Proceedings of the Symposium APLASM (Orsay, 1973), Vol. I.
[E.2] van Benthem Jutting, L.S., Checking Landau's "Grundlagen" in the AUTOMATH system, Ph.D. thesis (Eindhoven University of Technology, 1977), Parts of Chapter 3 (Verification).
[E.3] van Benthem Jutting, L.S., An implementation of substitution in a λ-calculus with dependent types (Philips Research Laboratories Eindhoven, Eindhoven University of Technology, 1988).
[F.1] de Bruijn, N.G., Set theory with type restrictions, in: Hajnal, A., Rado, R. and Sós, V.T., eds., Infinite and Finite Sets, I, International Colloquium, Keszthely, Hungary, 1973 (1975), 205-214. (Colloquia Math. Soc. János Bolyai, 10).
[F.2] de Bruijn, N.G., Formalization of constructivity in Automath, in: de Doelder, P.J., de Graaf, J. and van Lint, J.H., eds., Papers dedicated to J.J.
Seidel (Eindhoven University of Technology, 1984), 76-101. (EUT-Report 84-WSK-03).
[F.3] de Bruijn, N.G., The Mathematical Vernacular, a language for mathematics with typed sets, in: Dybjer, P. et al., eds., Proceedings of the Workshop on Programming Languages, Marstrand, Sweden 1987. Plus: Formalizing the Mathematical Vernacular (formerly unpublished, 1982), Examples of an MV Book.
[F.4] Wieringa, R.M.A., Relational semantics in an integrated system (Eindhoven University of Technology, 1980). (Internal Report, Dept. of Math.).
[F.5] de Bruijn, N.G., Computer program semantics in space and time (Eindhoven University of Technology, 1983). (Internal Report, Dept. of Math. and Comp. Sc.).
[Abadi et al. 91] Abadi, M., Cardelli, L., Curien, P.-L. and Lévy, J.-J., Explicit substitutions, Functional Programming 1, 4 (1991), 375-416.
[Andrews 71] Andrews, P., Resolution in type theory, Journ. of Symb. Logic 36 (1971), 414-432.
[Backhouse et al. 89] Backhouse, R., Chisholm, P. and Malcolm, G., Do-it-yourself type theory, Formal Aspects of Computing 1 (1989), 19-84.
[Balsters 86] Balsters, H., Lambda calculus extended with segments, Ph.D. thesis (Eindhoven University of Technology, 1986). See also [B.8].
[Barendregt 71] Barendregt, H.P., Some extensional models for combinatory logics and λ-calculi, Ph.D. thesis (Utrecht University, 1971).
[Barendregt 74] Barendregt, H.P., Pairing without conventional restraints, Zeitschr. f. math. Logik u. Grundl. d. Math. 20 (1974), 289-306.
[Barendregt 77] Barendregt, H.P., The type free λ-calculus, in: Barwise, J., ed., Handbook of Mathematical Logic (North Holland Publishing Co., Amsterdam, 1977), 1091-1132. (Studies in Logic and the Foundations of Math., Vol. 90).
[Barendregt 81] Barendregt, H.P., The Lambda Calculus: Its Syntax and Semantics (North Holland Publishing Co., Amsterdam, 1981).
[Barendregt 84a] Barendregt, H.P., The Lambda Calculus: Its Syntax and Semantics, Revised edition (North Holland Publishing Co., Amsterdam, 1984).
[Barendregt 84b] Barendregt, H.P., Introduction to lambda calculus, Nieuw Archief voor Wiskunde 4, 2 (1984), 337-372.
[Barendregt 91] Barendregt, H.P., Introduction to generalized type systems, Journal of Functional Programming 1, 2 (1991), 125-154.
[Barendregt 92] Barendregt, H.P., Lambda calculi with types, in: Abramsky, S., Gabbay, D.M. and Maibaum, T., eds., Handbook of Logic in Computer Science (Oxford, Clarendon Press, 1992), Vol. 2, 117-309.
[Barendregt et al. 76] Barendregt, H.P., Bergstra, J., Klop, J.W. and Volken, H., Representability in lambda algebras, Indagationes Math. 38 (1976), 377-387.
[Barendregt and Hemerik 90] Barendregt, H.P. and Hemerik, C., Types in lambda calculi and programming languages, in: Jones, N., ed., European Symposium on Programming (Berlin, Springer Verlag, 1990), 1-36. (Lecture Notes in Comp. Sci., 432).
[Barendsen 89] Barendsen, E., Representation of logic, data types and recursive functions in typed lambda calculus, Master's thesis (University of Nijmegen, 1989).
[van Benthem Jutting 71a] van Benthem Jutting, L.S., On normal forms in AUTOMATH (Eindhoven University of Technology, 1971). (Notitie 71-24, Dept. of Math.).
[van Benthem Jutting 71b (C.1)] van Benthem Jutting, L.S., A normal form theorem in a λ-calculus with types, in: Mitt. d. Gesellsch. f. Math. u. Datenverarb. 17, Tagung üb. form. Sprachen u. Programmiersprachen, Oberwolfach, Germany (1971), 27-32.
[van Benthem Jutting 73] van Benthem Jutting, L.S., The development of a text in AUT-QE, in: Braffort, P., ed., Proceedings of the Symposium APLASM (Orsay, 1973), Vol. I.
[van Benthem Jutting 76] van Benthem Jutting, L.S., A translation of Landau's "Grundlagen" in AUTOMATH, Vol. 1-5 (Eindhoven University of Technology, 1976).
[van Benthem Jutting 77] van Benthem Jutting, L.S., Checking Landau's "Grundlagen" in the Automath system, Ph.D. thesis (Eindhoven University of Technology, 1977). Published as Mathematical Centre Tracts nr.
83 (Amsterdam, Mathematisch Centrum, 1979). See also [B.5], [D.2], [D.3], [D.5] and [E.2].
[van Benthem Jutting 81 (B.1)] van Benthem Jutting, L.S., Description of AUT-68 (Eindhoven University of Technology, 1981). (Memorandum 1981-12, Dept. of Math.).
[van Benthem Jutting 85 (C.6)] van Benthem Jutting, L.S., The language theory of Λ, a typed lambda calculus where terms are types (Eindhoven University of Technology, 1985). (Memorandum 1985-02, Dept. of Math. and Comp. Sc.).
[van Benthem Jutting 88 (E.3)] van Benthem Jutting, L.S., An implementation of substitution in a λ-calculus with dependent types (Philips Research Laboratories Eindhoven, Eindhoven University of Technology, 1988).
[van Benthem Jutting and Wieringa 79] van Benthem Jutting, L.S. and Wieringa, R.M.A., Representatie van expressies in het verificatieprogramma VERA 1979 (Eindhoven University of Technology, 1980). (Memorandum 1979-15, Dept. of Math.).
[van Benthem Jutting and de Vrijer 94 (D.4)] van Benthem Jutting, L.S. and de Vrijer, R.C., A text fragment from Zucker's "Real Analysis" (1994).
[Ben-Yelles 81] G-stratification is equivalent to F-stratification, Zeitschr. f. Math. Logik u. Grundl. d. Math. 27 (1981), 141-150.
[Berkling and Fehr 82] Berkling, K.J. and Fehr, E., A modification of the λ-calculus as a base for functional programming languages, in: Nielsen, M. and Schmidt, E.M., eds., Automata, Languages and Programming, 9th International Colloquium, Aarhus (Berlin, Springer Verlag, 1982), 35-47. (Lecture Notes in Computer Science, 140).
[Bishop 67] Bishop, E., Foundations of Constructive Analysis (New York, McGraw-Hill, 1967).
[de Boer 75] de Boer, S., De ondefinieerbaarheid van Church' δ-functie in de λ-calculus en Barendregt's lemma, Master's thesis (Eindhoven University of Technology, 1975).
[Boyer and Moore 72] Boyer, R.S. and Moore, J.S., The sharing of structure in theorem-proving programs, Machine Intelligence 7 (Edinburgh, Edinburgh University Press, 1972), 101-113.
[Boyer and Moore 88] Boyer, R.S. and Moore, J.S., A Computational Logic Handbook (Boston, Academic Press, 1988).
[de Bruijn 67 (A.1)] de Bruijn, N.G., Verification of mathematical proofs by a computer, A preparatory study for a project AUTOMATH (formerly unpublished, 1967).
[de Bruijn 68a (D.1)] de Bruijn, N.G., Example of a text written in Automath (formerly unpublished, 1968).
[de Bruijn 68b] de Bruijn, N.G., AUTOMATH, a language for mathematics (Eindhoven University of Technology, 1968). (T.H.-Report 68-WSK-05).
[de Bruijn 70a (A.2)] de Bruijn, N.G., The mathematical language AUTOMATH, its usage and some of its extensions, in: Laudet, M., Lacombe, D. and Schuetzenberger, M., eds., Symposium on Automatic Demonstration, IRIA, Versailles, 1968 (Berlin, Springer Verlag, 1970), 29-61. (Lecture Notes in Math., 125).
[de Bruijn 70b] de Bruijn, N.G., On the use of bound variables in AUTOMATH (Technological University, Eindhoven, 1970). (Notitie 70-9, Dept. of Math.).
[de Bruijn 71 (B.2)] de Bruijn, N.G., AUT-SL, a single line version of AUTOMATH (Eindhoven University of Technology, 1971). (Notitie 71-22, Dept. of Math.).
[de Bruijn 72a] de Bruijn, N.G., Some abbreviations in the input language for AUTOMATH (Eindhoven University of Technology, 1972). (Notitie 72-15, Dept. of Math.).
[de Bruijn 72b (C.2)] de Bruijn, N.G., Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem, Indagationes Math. 34 (1972), 381-392.
[de Bruijn 73a] de Bruijn, N.G., A theory of generalized functions, with applications to Wigner distribution and Weyl correspondence, Nieuw Archief voor Wiskunde 3, 21 (1973), 205-280.
[de Bruijn 73b] de Bruijn, N.G., AUTOMATH, a language for mathematics, A series of lectures at the Séminaire de mathématiques supérieures, Université de Montréal, 1971, Lecture Notes by B. Fawcett (Les Presses de l'Université de Montréal, 1973).
[de Bruijn 73c] de Bruijn, N.G., The AUTOMATH Mathematics Checking Project, in: Braffort, P., ed., Proceedings of the Symposium APLASM (Orsay, 1973), Vol. I. Reprinted in: Kopania, J., ed., Studies in Logic, Grammar and Rhetoric, Humanities, Vol. II (Bialystok, 1983). (Papers of Warsaw University, 40).
[de Bruijn 73d] de Bruijn, N.G., A system for handling syntax and semantics of computer programs in terms of the mathematical language AUTOMATH (Eindhoven, unpublished, 1973).
[de Bruijn 74a] de Bruijn, N.G., A framework for the description of a number of members of the AUTOMATH family (Eindhoven University of Technology, 1974). (Memorandum 1974-08, Dept. of Math.).
[de Bruijn 74b (B.3)] de Bruijn, N.G., Some extensions of AUTOMATH: the AUT-4 family (Eindhoven University of Technology, 1974). (Internal Report, Dept. of Math.).
[de Bruijn 75a (F.1)] de Bruijn, N.G., Set theory with type restrictions, in: Hajnal, A., Rado, R. and Sós, V.T., eds., Infinite and Finite Sets, I, International Colloquium, Keszthely, Hungary, 1973 (1975), 205-214. (Colloquia Math. Soc. János Bolyai, 10).
[de Bruijn 75b] de Bruijn, N.G., The use of the language AUTOMATH for syntax and semantics of programming languages (Eindhoven, unpublished, 1975).
[de Bruijn 76] de Bruijn, N.G., Modifications of the 1968 version of AUTOMATH (Eindhoven University of Technology, 1976). (Memorandum 1976-14, Dept. of Math.).
[de Bruijn 77] de Bruijn, N.G., Some auxiliary operators in AUT-Pi (Eindhoven University of Technology, 1977). (Memorandum 1977-15, Dept. of Math.).
[de Bruijn 78a] de Bruijn, N.G., A namefree lambda calculus with facilities for internal definition of expressions and segments (Eindhoven University of Technology, 1978). (T.H.-Report 78-WSK-03).
[de Bruijn 78b] de Bruijn, N.G., Lambda calculus with namefree formulas involving symbols that represent reference transforming mappings, Indagationes Math. 40 (1978), 348-356.
[de Bruijn 78c (B.4)] de Bruijn, N.G., AUT-QE without type inclusion (Eindhoven University of Technology, 1978). (Memorandum 1978-04, Dept. of Math.).
[de Bruijn 78d] de Bruijn, N.G., A note on weak diamond properties (Eindhoven University of Technology, 1978). (Memorandum 1978-08, Dept. of Math.).
[de Bruijn 79a] de Bruijn, N.G., Wees contextbewust in WOT, Euclides 55 (1979/1980), 7-12.
[de Bruijn 79b] de Bruijn, N.G., Grammatica van WOT, Euclides 55 (1979/1980), 66-72.
[de Bruijn 79c] de Bruijn, N.G., Van alles en nog wat over gebonden variabelen in wiskundige taal, Euclides 55 (1979/1980), 262-268.
[de Bruijn 80 (A.5)] de Bruijn, N.G., A survey of the project AUTOMATH, in: Seldin, J.P. and Hindley, J.R., eds., To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980), 579-606.
[de Bruijn 83 (F.5)] de Bruijn, N.G., Computer program semantics in space and time (Eindhoven University of Technology, 1983). (Internal Report, Dept. of Math. and Comp. Sc.).
[de Bruijn 84 (F.2)] de Bruijn, N.G., Formalization of constructivity in Automath, in: de Doelder, P.J., de Graaf, J. and van Lint, J.H., eds., Papers dedicated to J.J. Seidel (Eindhoven University of Technology, 1984), 76-101. (EUT-Report 84-WSK-03).
[de Bruijn 86] de Bruijn, N.G., Checking mathematics with the aid of a computer, in: Howson, A.G. and Kahane, J.-P., eds., The Influence of Computers and Informatics on Mathematics and its Teaching (Cambridge, Cambridge University Press, 1986), 61-68.
[de Bruijn 87a (F.3)] de Bruijn, N.G., The Mathematical Vernacular, a language for mathematics with typed sets, in: Dybjer, P. et al., eds., Proceedings of the Workshop on Programming Languages, Marstrand, Sweden, 1987. Plus: Formalizing the Mathematical Vernacular (formerly unpublished, 1982), Examples of an MV Book.
[de Bruijn 87b (B.7)] de Bruijn, N.G., Generalizing Automath by means of a lambda-typed lambda calculus, in: Kueker, D.W., Lopez-Escobar, E.G.K. and Smith, C.H., eds., Mathematical Logic and Theoretical Computer Science, Proceedings of the Maryland 1984/85 Special Year in Mathematical Logic and Theoretical Computer Science (New York, Marcel Dekker, 1987), 71-92. (Lecture Notes in Pure and Appl. Math., 106).
[de Bruijn 89] de Bruijn, N.G., Machinale verificatie van redeneringen, in: Lemmens, P.W.H., ed., Bewijzen in de Wiskunde (Amsterdam, Centrum voor Wiskunde en Informatica, 1989), 61-79.
[de Bruijn 90a] de Bruijn, N.G., The use of justification systems for integrated semantics, in: Martin-Löf, P. and Mints, G., eds., Colog-88 (Berlin, Springer Verlag, 1990), 9-24. (Lecture Notes in Comp. Sc., 417).
[ d e Bruijn 906 (A.7’1 de Bruijn, N.G., Reflections on Automath (Eindhoven University of Technology, 1990). [de Bruijn 91a] de Bruijn, N.G., Telescopic mappings in typed lambda calculus, Information and Computation 91 (1991), 189-204. [de Bruijn 91b] de Bruijn, N.G., A plea for weaker frameworks, in: Huet, G. and Plotkin, G., eds., Logical Frameworks, Proceedings of the BRA workshop, Sophia Antipolis, 1990 (Cambridge, Cambridge University Press, 1991), 40-67. [de Bruijn 91c] de Bruijn, N.G., Checking mathematics with computer assistance, Notices American Mathematical Society, 8 , 1 (1991), 8-15. [de Bruijn 921 de Bruijn, N.G., On the role of types i n mathematics (to be published, 1992). [de Bruijn 931 de Bruijn, N.G., Algorithmic definition of lambda-typed lambda calculus, in: Huet, G. and Plotkin, G., eds., Logical Environments (Cambridge, Cambridge University Press, 1993), 131-146. [Bulnes-Rozas 791 Bulnes-Rozas, J.P., GOAL: A goal oriented commend language for interactive proof construction, Ph.D. thesis, Stanford A.I. Lab., Stanford (1979). (Memo AIM-328). [Cardelli and Wegner 851 Cardelli, L. and Wegner, P., On understanding types, data abstraction, and polymorphism, Computing Surveys 17,4 (1985), 471522. [Church 321 Church, A., A set of postulates for the foundation of logic, Ann. of Math. 33 (1932), 346-366 and 34 (1933), 839-864. [Church 361 Church, A., An unsolvable problem of elementary number theory, Amer. Journal of Math. 58 (1936), 345-363. [Church 401 Church, A,, A formulation of the simple theory of types, Journ. of Symb. Logic 5 (1940), 56-68. [Church 411 Church, A., The Calculi of Lambda Conversion (Princeton University Press, 1941). (Annals of Math. Studies, 6). [Constable et al. 861 Constable, R.L.et al., Implementing Mathematics with the NuPRL Proof Development System (Englewood Cliffs, Prentice-Hall, 1986). [Coppo and Dezani 781 Coppo, M. and Dezani-Ciancaglini, M., New type assignment for A-terms, Archiv. Math. Logik 19 (1978), 139-156.
[Coppo et al. 81] Coppo, M., Dezani-Ciancaglini, M. and Venneri, B., Functional characters of solvable terms, Zeitschr. f. Math. Logik u. Grundl. d. Math. 27 (1981), 45-58.
[Coquand 85] Coquand, Th., Une théorie des constructions, Thèse de troisième cycle (Paris, Université Paris VII, 1985).
[Coquand 86] Coquand, Th., An analysis of Girard's paradox, Proceedings of the first Symposium on Logic in Computer Science, Cambridge Mass. (Washington DC, IEEE Computer Society), 227-236.
[Coquand 90] Coquand, Th., Metamathematical investigations of a calculus of constructions, in: Odifreddi, P.G., ed., Logic and Computer Science (London, Academic Press, 1990), 91-122. (APIC series, 31).
[Coquand and Huet 85] Coquand, Th. and Huet, G., Constructions: a higher order proof system for mechanizing mathematics, in: Buchberger, B., ed., Computer Algebra, Proceedings of the European conference EUROCAL '85, Linz (1985), 151-184. (Lecture Notes in Comp. Sc., 203).
[Coquand and Huet 88] Coquand, T. and Huet, G., The calculus of constructions, Information and Computation 76 (1988), 95-120.
[Curien 86] Curien, P.-L., Categorical Combinators, Sequential Algorithms and Functional Programming (London, Pitman, 1986).
[Curry and Feys 58] Curry, H.B. and Feys, R., Combinatory Logic (Amsterdam, North Holland Publishing Co., 1958), Vol. 1.
[Curry et al. 72] Curry, H.B., Hindley, J.R. and Seldin, J.P., Combinatory Logic (Amsterdam, North Holland Publishing Co., 1972), Vol. 2.
[van Daalen 70] van Daalen, D.T., Verzamelingstheorie, de axioma's van Zermelo-Fraenkel (Eindhoven University of Technology, 1970). (Internal Report, Dept. of Math.).
[van Daalen 73 (A.3’11 van Daalen, D.T., A description of Automath and some aspects of its language theory, in: Braffort, P., ed., Proceedings of the Symposium A P L A S M (Orsay, 1973), Vol. I. Also in [van Benthem Jutting 771, 48-77. [van Daalen 801 van Daalen, D.T., The language theory of Automath, Ph.D. thesis (Eindhoven University of Technology, 1980). See also [A.6], (B.6) and [ C. 51.
[van Dalen et al. 78] van Dalen, D., Doets, H.C. and de Swart, H., Sets: Naive, Axiomatic and Applied (Oxford, Pergamon Press, 1978).
[Dowek et al. 91] Dowek, G., Felty, A., Herbelin, H., Huet, G., Paulin-Mohring, Ch. and Werner, B., The Coq proof assistant version 5.6, user's guide (Rocquencourt, INRIA - Lyon, CNRS ENS, 1991).
[Fitch 52] Fitch, F.B., Symbolic Logic, an Introduction (New York, The Ronald Press Co., 1952).
[Fraenkel et al. 58] Fraenkel, A.A., Bar-Hillel, Y. and Lévy, A., Foundations of Set Theory (Amsterdam, North Holland Publishing Co., 1958).
[Frege 1879] Frege, G., Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens (Halle, Verlag von Louis Nebert, 1879).
[Frege 1893] Frege, G., Grundgesetze der Arithmetik, begriffsschriftlich abgeleitet (Jena, H. Pohle, 1893, 1903).
[Gandy 80] Gandy, R.O., Proofs of Strong Normalization, in: Seldin, J.P. and Hindley, J.R., eds., To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980), 475-477.
[Gentzen 35] Gentzen, G., Untersuchungen über das logische Schliessen, Math. Zeitschr. 39 (1935), 176-210, 405-431.
[Gentzen 36] Gentzen, G., Die Widerspruchsfreiheit der reinen Zahlentheorie, Math. Annalen 112 (1936), 493-565.
[Geuvers 88] Geuvers, J.H., The interpretation of logic in type systems, Master's thesis (University of Nijmegen, 1988).
[Geuvers 93] Geuvers, J.H., Logics and type systems, Ph.D. thesis (Catholic University of Nijmegen, 1993).
[Geuvers and Nederhof 91] Geuvers, J.H. and Nederhof, M.J., A modular proof of strong normalization for the Calculus of Constructions, Journal of Functional Programming 1, 2 (1991), 155-189.
[Girard 71] Girard, J.-Y., Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types, in: Fenstad, J.E., ed., Proceedings of the 2nd Scandinavian Logic Symp. (Amsterdam, North-Holland Publishing Co., 1971), 63-92.
[Girard 72] Girard, J.-Y., Interprétation fonctionnelle et élimination des coupures dans l'arithmétique d'ordre supérieur, Thèse d'État (Paris, Université Paris VII, 1972).
[Girard et al. 89] Girard, J.-Y., Lafont, Y. and Taylor, P., Proofs and Types (Cambridge, Cambridge University Press, 1989).
[Glaser et al. 84] Glaser, H., Hankin, C. and Till, D., Principles of Functional Programming (Englewood Cliffs, Prentice-Hall, 1984).
[Gordon and Melham 93] Gordon, M.J.C. and Melham, T.F., Introduction to HOL, A theorem proving environment for higher order logic (Cambridge, Cambridge University Press, 1993).
[Gordon et al. 79] Gordon, M., Milner, R. and Wadsworth, C., Edinburgh LCF, A mechanised Logic of Computation (Berlin, Springer Verlag, 1979). (Lecture Notes in Comp. Sc., 78).
[Hall 67] Hall Jr., M., Combinatorial Theory (Chichester, Wiley, 1967). (Blaisdell book in pure and applied mathematics).
[Harper et al. 86] Harper, R., MacQueen, D. and Milner, R., Standard ML (Edinburgh University, 1986). (Report ECS-LFCS-86-2).
[Harper et al. 87] Harper, R., Honsell, F. and Plotkin, G., A framework for defining logics, in: Proceedings of the second Symposium on Logic in Computer Science, Ithaca, N.Y. (Washington DC, Computer Society of the IEEE, 1987), 194-204.
[Hitchcock and Park 73] Hitchcock, P. and Park, D., Induction rules and termination proofs, in: Nivat, M., ed., Automata, Languages and Programming (Amsterdam, North-Holland Publishing Co., 1973), 225-252.
[Hilbert and Ackermann 1928] Hilbert, D. and Ackermann, W., Grundzüge der theoretischen Logik (Berlin, Springer Verlag, 1928).
[Hindley 69] Hindley, J.R., The principal type scheme of an object in combinatory logic, Transactions of the American Math. Soc. 146 (1969), 29-60.
[Hindley 79] Hindley, J.R., Combinatory reductions and lambda reductions compared, Zeitschr. f. Math. Logik u. Grundl. d. Math. 23 (1979), 169-180.
[Hindley et al. 72] Hindley, J.R., Lercher, B.
and Seldin, J.P., Introduction to Combinatory Logic (Cambridge, Cambridge University Press, 1972).
[Hindley and Seldin 86] Hindley, J.R. and Seldin, J.P.,
Introduction to Combinators and λ-Calculus (Cambridge, Cambridge University Press, 1986).
[Howard 80] Howard, W.A., The formulae-as-types notion of construction, in: Seldin, J.P. and Hindley, J.R., eds., To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980), 479-490. (Manuscript 1969).
[Jervell 71] Jervell, H.R., A normal form in first order arithmetic, in: Fenstad, J.E., ed., Proceedings of the 2nd Scandinavian Logic Symp. (Amsterdam, North-Holland Publishing Co., 1971), 93-108.
[Kamareddine and Nederpelt 93] Kamareddine, F. and Nederpelt, R., On stepwise explicit substitution, Int. Journal of Found. of Comp. Sc. 4, 3 (1993), 197-240.
[Kijne 62] Kijne, D., Construction geometries and construction fields, in: Algebraical and Topological Foundations of Geometry, Proceedings of a Colloquium held in Utrecht, 1959 (Oxford, Pergamon Press, 1962).
[Kleene 52] Kleene, S.C., Introduction to Metamathematics (New York, Van Nostrand, 1952).
[Klop 80] Klop, J.W., Combinatory reduction systems, Ph.D. thesis (Utrecht University, 1980).
[Klop 92] Klop, J.W., Term rewriting systems, in: Abramsky, S., Gabbay, D.M. and Maibaum, T., eds., Handbook of Logic in Computer Science (Oxford, Clarendon Press, 1992), Vol. 2, 1-116.
[Kneale and Kneale 62] Kneale, W. and Kneale, M., The Development of Logic (Oxford, Clarendon Press, 1962).
[Kreisel 72] Kreisel, G., Five notes on the application of proof theory to computer science (Stanford University, 1972). (Techn. report no. 182).
[Landau 30] Landau, E., Grundlagen der Analysis (First ed.: Leipzig, 1930; Third ed.: New York, Chelsea Publ. Comp., 1960).
[Landin 64] Landin, P.J., The mechanical evaluation of expressions, Computer Journal 6, 4 (1964), 308-320.
[Läuchli 70] Läuchli, H., An abstract notion of realizability for which intuitionistic predicate calculus is complete, in: Kino, A., Myhill, J. and Vesley,
R.E., eds., Intuitionism and Proof Theory, Proc. Summer Conference at Buffalo, 1968 (Amsterdam, North-Holland Publishing Co., 1970).
[Leivant 75] Leivant, D., Strong normalization for arithmetic (variations on a theme of Prawitz), in: Proof Theory Symposium Kiel 1974 (Berlin, Springer Verlag, 1975), 182-197. (Lecture Notes in Math., 500).
[Lévy 74] Lévy, J.J., Réductions sûres dans le lambda-calcul, Thèse 3e cycle (Paris, 1974).
[Lévy 75] Lévy, J.J., An algebraic interpretation of the λβK-calculus and a labelled λ-calculus, in: Böhm, C., ed., λ-Calculus and Computer Science Theory (Berlin, Springer Verlag, 1975), 147-165. (Lecture Notes in Comp. Sc., 37).
[Luo 89] Luo, Z., ECC: An extended Calculus of Constructions, in: Proceedings of the fourth Annual Symposium on Logic in Computer Science, Asilomar, Cal. (Washington DC, IEEE Computer Society Press, 1989), 386-395.
[Magnusson and Nordström 94] Magnusson, L. and Nordström, B., The ALF proof editor and its proof engine, in: Barendregt, H. and Nipkow, T., eds., Types for Proofs and Programs (Berlin, Springer Verlag, 1994), 238-262. (Lecture Notes in Comp. Sc., 806).
[Mann 73] Mann, C.R., The connections between proof theory and category theory, Ph.D. thesis (Oxford, 1973).
[Martin-Löf 71a] Martin-Löf, P., A theory of types (1971). (Manuscript).
[Martin-Löf 71b] Martin-Löf, P., Hauptsatz for the theory of species, in: Fenstad, J.E., ed., Proceedings of the 2nd Scandinavian Logic Symp. (Amsterdam, North-Holland Publishing Co., 1971), 217-233.
[Martin-Löf 75a] Martin-Löf, P., An intuitionistic theory of types: predicative part, in: Rose, H.E. and Shepherdson, J.C., eds., Logic Colloquium '73 (Amsterdam, North-Holland Publishing Co., 1975), 73-118.
[Martin-Löf 75b] Martin-Löf, P., About models for intuitionistic type theory and the notion of definitional equality, in: Kanger, S., ed., Proceedings of the third Scandinavian Logic Symp. (Amsterdam, North Holland Publishing Co., 1975), 81-109.
[Martin-Löf 82] Martin-Löf, P., Constructive mathematics and computer programming, in: Logic, Methodology and Philosophy of Science, VI, 1979 (Amsterdam, North-Holland Publishing Co., 1982), 153-175.
[Martin-Löf 84] Martin-Löf, P., Intuitionistic Type Theory, Studies in Proof Theory (Napoli, Bibliopolis, 1984).
[Mitchell and Plotkin 85] Mitchell, J.C. and Plotkin, G.D., Abstract types have existential type, in: Proceedings of the 12th Annual Symposium on Principles of Programming Languages (New York, ACM, 1985), 37-51.
[Mitschke 76] Mitschke, G., λ-Kalkül, δ-Konversion und axiomatische Rekursionstheorie, Habilit. Schr. (Darmstadt, 1976).
[Mohring 86] Mohring, Ch., Algorithm development in the calculus of constructions, in: Proceedings of the first Symposium on Logic in Computer Science, Cambridge, Mass. (Washington DC, IEEE Computer Society, 1986), 84-91.
[Nederpelt 71a] Nederpelt, R.P., Lambda-Automath (Eindhoven University of Technology, 1971). (Notitie 71-17, Dept. of Math.).
[Nederpelt 71b] Nederpelt, R.P., Lambda-Automath II (Eindhoven University of Technology, 1971). (Notitie 71-25, Dept. of Math.).
[Nederpelt 72a] Nederpelt, R.P., Strong normalisation in a λ-calculus with λ-expressions as types (Eindhoven University of Technology, 1972). (Notitie 72-18, Dept. of Math.).
[Nederpelt 72b] Nederpelt, R.P., The closure theorem in λ-typed λ-calculus (Eindhoven University of Technology, 1972). (Notitie 72-22, Dept. of Math.).
[Nederpelt 73 (C.3)] Nederpelt, R.P., Strong normalisation in a typed lambda calculus with lambda structured types, Ph.D. thesis (Eindhoven University of Technology, 1973).
[Nederpelt 77] Nederpelt, R.P., Presentation of natural deduction, Recueil des Travaux de l'Institut Math., Nouvelle Série, tome 2, 10, Symp. Set Theory, Foundations of Math. (Beograd, 1977), 115-126.
[Nederpelt 80] Nederpelt, R.P., An approach to theorem proving on the basis of a typed lambda-calculus, in: 5th Conference on Automated Deduction (Berlin, Springer Verlag, 1980), 182-194. (Lecture Notes in Comp. Sc., 87).
[Nederpelt 87] Nederpelt, R.P., De Taal van de Wiskunde (Almere, Versluys, 1987).
[Nederpelt 90 (A.8)] Nederpelt, R.P., Type systems - basic ideas and applications, in: van de Goor, A.J., ed., Proceedings of CSN '90, Computing Science in the Netherlands (Amsterdam, Stichting Mathematisch Centrum, 1990), 367-383.
[Nederpelt 92] Nederpelt, R.P., The fine-structure of lambda calculus (Eindhoven University of Technology, 1992). (Computing Science Notes, 92/07).
[Newman 42] Newman, M.H.A., On theories with a combinatorial definition of "equivalence", Ann. of Math. 2, 43 (1942), 223-243.
[Nordström et al. 90] Nordström, B., Petersson, K. and Smith, J.M., Programming in Martin-Löf's Type Theory (Oxford, Oxford University Press, 1990).
[Osswald 73] Osswald, H., Ein syntaktischer Beweis für die Zulässigkeit der Schnittregel im Kalkül von Schütte für die intuitionistische Typenlogik, Manusc. Math. 8 (1973), 243-249.
[Paulin-Mohring 89] Paulin-Mohring, Ch., Extraction des programmes dans le calcul des constructions, Thèse (Paris, Université Paris VII, 1989).
[Paulson 87] Paulson, L.C., Logic and Computation (Cambridge, Cambridge University Press, 1987). (Cambridge Tracts in Theor. Comp. Sc., 2).
[Penning 77] Penning, P., Automath-bewijzen voor tautologieën (Eindhoven, unpublished, 1977).
[Peremans 94] Peremans, W., Ups and Downs of Type theory (Eindhoven University of Technology, 1994). (Computing Science Notes, 94/14).
[Plotkin 80] Plotkin, G., Lambda-definability in the full type hierarchy, in: Seldin, J.P. and Hindley, J.R., eds., To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980), 363-373.
[Pohlers 73] Pohlers, W., Ein starker Normalisationssatz für die intuitionistischen Typen, Manusc. Math. 8 (1973), 371-387.
[Pottinger 77] Pottinger, G., Letter to Prawitz (unpublished, 1977).
[Pottinger 79] Pottinger, G., On analysing relevance constructively, Studia Logica 38 (1979), 171-185.
[Pottinger 80] Pottinger, G., A type assignment to the strongly normalizable terms, in: Seldin, J.P. and Hindley, J.R., eds., To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980), 561-577.
[Prawitz 65] Prawitz, D., Natural Deduction, a Proof-Theoretical Study (Stockholm, Almqvist and Wiksell, 1965).
[Prawitz 71] Prawitz, D., Ideas and results in proof theory, in: Fenstad, J.E., ed., Proceedings of the 2nd Scandinavian Logic Symp. (Amsterdam, North-Holland Publishing Co., 1971), 235-307.
[Reynolds 74] Reynolds, J.C., Towards a theory of type structure, in: Robinet, B., ed., Proceedings of the Colloque sur la Programmation (Berlin, Springer Verlag, 1974), 408-425. (Lecture Notes in Comp. Sc., 19).
[Reynolds 85] Reynolds, J.C., Three approaches to type structure, in: Ehrig, H. et al., eds., Mathematical Foundations of Software Development (Berlin, Springer Verlag, 1985), 97-138. (Lecture Notes in Comp. Sc., 185).
[Sanchis 67] Sanchis, L.E., Functionals defined by recursion, Notre Dame Journal of Formal Logic 8 (1967), 161-174.
[Schulte Monting 73] Schulte Monting, H., Yet another proof of the Church-Rosser theorem (unpublished, 1973).
[Scott 70] Scott, D., Constructive validity, in: Laudet, M., Lacombe, D. and Schuetzenberger, M., eds., Symposium on Automatic Demonstration, IRIA, Versailles, 1968 (Berlin, Springer Verlag, 1970), 237-275. (Lecture Notes in Math., 125).
[Scott 73] Scott, D.S., Models for various type-free calculi, in: Suppes, P. et al., eds., Logic, Methodology and Philosophy of Science, IV, 1971 (Amsterdam, North Holland Publishing Co., 1973), 157-187.
[Scott and Strachey 71] Scott, D. and Strachey, C., Towards a mathematical semantics for computer languages (Oxford, Oxford University, 1971).
[Seldin 75] Seldin, J.P., Review of 'Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem', Journ. of Symb. Logic 40 (1975), 470.
[Seldin 76] Seldin, J.P., A theory of generalized functionality I (unpublished, 1976). See also: Seldin, J.P., Progress report on generalized functionality, Ann. Math. Logic 17 (1979), 29-59.
[Seldin and Hindley 80] Seldin, J.P. and Hindley, J.R., eds., To H.B.
Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism (New York/London, Academic Press, 1980).
[Shoenfield 67] Shoenfield, J.R., Mathematical Logic (Reading MA, Addison Wesley, 1967).
[Smorynski 77] Smorynski, C., The incompleteness theorems, in: Barwise, J., ed., Handbook of Mathematical Logic (Amsterdam, North-Holland Publishing Co., 1977), 821-865. (Studies in Logic and the Foundations of Math., Vol. 90).
[Staples 75] Staples, J., Church-Rosser theorems for replacement systems, in: Crossley, J.N., ed., Algebra and Logic (Berlin, Springer Verlag, 1975), 291-306. (Lecture Notes in Math., 450).
[Staples 77] Staples, J., A lambda calculus with naive substitution (Brisbane, unpublished, 1977).
[Stenlund 72] Stenlund, S., Combinators, λ-terms and Proof Theory (Dordrecht, Reidel, 1972).
[Tait 67] Tait, W.W., Intensional interpretations of functionals of finite type I, Journ. of Symb. Logic 32 (1967), 198-212.
[Troelstra 73] Troelstra, A.S., ed., Metamathematical Investigation of Intuitionistic Arithmetic and Analysis (Berlin, Springer Verlag, 1973). (Lecture Notes in Math., 344).
[Troelstra 77] Troelstra, A.S., Aspects of constructive mathematics, in: Barwise, J., ed., Handbook of Mathematical Logic (Amsterdam, North Holland Publishing Co., 1977), 973-1052. (Studies in Logic and the Foundations of Math., Vol. 90).
[Troelstra and van Dalen 88] Troelstra, A.S. and van Dalen, D., Constructivism in Mathematics (Amsterdam, North-Holland Publishing Co., 1988), Vol. 1.
[Trybulec 90] Trybulec, A., Introduction, Formalized Mathematics 1, 1 (1990), 7-8.
[Turner 79] Turner, D.A., Another algorithm for bracket abstraction, Journ. of Symb. Logic 44 (1979), 267-270.
[Udding 80] Udding, J.T., A Theory of Real Numbers and its Presentation in Automath, Vol. 1-3, Master's thesis (Eindhoven University of Technology, 1980).
[de Vrijer 75 (C.4)] de Vrijer, R.C., Big trees in a λ-calculus with λ-expressions as types, in: Böhm, C., ed., λ-Calculus and Computer Science Theory (Berlin, Springer Verlag, 1975), 202-221. (Lecture Notes in Comp. Sc., 37). Also: de Vrijer, R.C., Surjective pairing and strong normalization, Ph.D. thesis (University of Amsterdam, 1987), Chapter 5.
[de Vrijer 87a] de Vrijer, R.C., Surjective pairing and strong normalization: two themes in lambda calculus, Ph.D. thesis (University of Amsterdam, 1987). See also [de Vrijer 75 (C.4)].
[de Vrijer 87b] de Vrijer, R.C., "Stelling" to his Ph.D. thesis. See [de Vrijer 87a].
[de Vrijer 87c] de Vrijer, R.C., Exactly estimating functionals and strong normalization, Proc. of the Koninklijke Nederlandse Akademie van Wetenschappen, Series A 90, 4 (1987), 479-493.
[Wadsworth 71] Wadsworth, C.P., Semantics and pragmatics of the lambda-calculus, Ph.D. thesis (Oxford, 1971).
[Weyhrauch 77] Weyhrauch, R.W., A users manual for FOL (Stanford, Computer Science Dept., 1977). (Artificial Intelligence Project M235).
[Whitehead and Russell 1910] Whitehead, A.N. and Russell, B., Principia Mathematica (Cambridge, Cambridge University Press, 1910-1913), Vol. 1-3.
[Wieringa 76] Wieringa, R.M.A., Binaire optelling en vermenigvuldiging in AUT-QE (Eindhoven, unpublished, 1976).
[Wieringa 78] Wieringa, R.M.A., Een notatie-systeem voor lambda-calculus met definities, Master's thesis (Eindhoven University of Technology, 1978).
[Wieringa 80 (F.4)] Wieringa, R.M.A., Relational semantics in an integrated system (Eindhoven University of Technology, 1980). (Internal Report, Dept. of Math.).
[Zandleven 73 (E.1)] Zandleven, I., A verifying program for Automath, in: Braffort, P., ed., Proceedings of the Symposium APLASM (Orsay, 1973), Vol. I.
[Zucker 74] Zucker, J., Cut-elimination and normalization, Annals of Math. Logic 7 (1974), 1-112.
[Zucker 77 (A.4)] Zucker, J., Formalization of classical mathematics in Automath, in: Colloque International de Logique, Clermont-Ferrand, France, 1975 (Paris, CNRS, 1977), 135-145. (Colloques Internationaux du Centre National de la Recherche Scientifique, 249).
Indexes
Index of Names

Abadi et al. [1991], 35, 50
Ackermann, W., 4
Aristotle, 3
Augustsson, L., 12
Balsters, H., 8
- [1986], 35, 45, 170, 306, 368
Barendregt, H.P., 11, 229, 230, 235, 239, 243, 248, 376, 391, 399, 443, 503, 505, 611
- [1971], 375, 376, 389-391, 427, 448
- [1981], 327, 503
- [1984a], 339, 340, 572
- [1992], 5, 11, 26, 27, 34, 36, 37, 46
- & Hemerik [1990], 5, 26
- et al. [1976], 503
von Beethoven, L., 841
Ben-Yelles [1981], 187, 633
van Benthem Jutting, L.S., 6, 73, 156, 169, 170, 216, 222, 247, 252, 303, 306, 331, 396, 508, 509, 578
- [1971a], 155, 396, 508
- [1971b], 473, 633
- [1973], 101, 127, 160, 804
- [1976], 157, 159
- [1977], 7, 44, 141, 147, 157, 159, 163, 169, 170, 175, 188-190, 197, 198, 303, 306, 334, 471, 570, 578
- [1981], 339, 687, 688, 733
- [1988], 177
- & Wieringa [1979], 170, 177
Berkling & Fehr [1982], 811, 837
Bernays, P., 192
Beth, E.W., 26, 203
Bishop [1967], 130
de Boer, S., 503
- [1975], 503, 505
Bourbaki, N., 206
Boyer & Moore [1972], 809
- [1988], 12
Braun, W.C.P., 252
van Bree, L.G.F.C., 73
Brouwer, L.E.J., 26, 204, 236
de Bruijn, N.G., 6, 111, 127, 128, 133, 139, 163, 164, 169, 175-177, 187, 194, 216, 230, 231, 235, 238, 240, 242, 252, 306, 339, 345, 390, 393, 394, 473, 590, 687, 688, 722, 729, 783, 805, 809, 811, 813, 836, 838
- [1968b], 8, 15, 74, 141, 150, 394
- [1970a], 8, 128, 141, 150, 169, 283, 285, 286, 334, 371, 387, 393, 394, 469, 470, 783, 844, 938
- [1970b], 277
- [1971], 147, 313, 334, 393, 590
- [1972a], 158, 737
- [1972b], 116, 156, 175, 177, 192, 319, 327, 390, 656, 673, 804, 809, 811
- [1973b], 141, 283, 285, 286, 334, 396, 722, 783
- [1973c], 127, 141
- [1973d], 53, 938, 947, 948, 951, 952, 963, 964
- [1974a], 289, 334, 852
- [1974b], 148, 159, 187
- [1975a], 144, 192
- [1975b], 53, 160, 938
- [1976], 159
- [1977], 159, 289, 306, 652
- [1978a], 33, 156, 160, 177, 339, 355
- [1978b], 177, 342
- [1978c], 159, 306, 335, 652
- [1980], 163, 169, 171, 189, 197, 334, 339, 655, 673, 810, 849, 852, 866
- [1987a], 865
- [1991a], 11, 21, 44
- [1991b], 30, 32
Bulnes, J.P., 166
Bulnes-Rozas [1979], 166, 170
Cantor, G., 203-205, 221, 229, 841, 842, 845, 846, 848
Church, A., 5, 34, 36, 37, 41, 75, 168, 205, 229, 230, 232, 235, 246, 389, 391, 392, 503
- [1932], 5, 389
- [1936], 5
- [1940], 5, 13, 135, 136, 340, 355
- [1941], 375
Constable et al. [1986], 9, 11
Coppo & Dezani [1978], 188
Coppo et al. [1981], 188
Coquand, Th., 12, 235, 837
- [1985], 10
- [1986], 8
- [1990], 10
- & Huet [1985], 10
- & Huet [1988], 10
Curien [1986], 809, 815, 837
Curry, H.B., 5, 128, 175, 195, 230, 232, 235, 248, 250
- & Feys [1958], 8, 150, 168, 175, 195, 371, 375, 376, 389, 391-393, 415, 443, 572, 585
- et al. [1972], 195
van Daalen, D.T., 7, 252, 264, 396, 485, 783
- [1970], 144
- [1973], 127, 141, 155, 163, 172, 177, 289, 301, 303, 469-471, 474, 493, 517, 522, 523, 558, 591, 592, 701, 783, 787, 810, 833
- [1980], 7, 23, 31, 38, 155, 303, 314, 327, 331, 334, 339, 473, 508, 515, 526, 539, 545, 550, 574, 579, 590, 592, 612, 629, 637, 655, 830, 833
van Dantzig, D., 203
Dedekind, R., 713, 714
Dijkstra, E.W., 207
Dowek et al. [1991], 11
Erdős, P., 26, 204, 212
Euclid, 219, 717
Euler, L., 219
Fitch, F.B., 687
- [1952], 52, 687
Fleischhacker, L., 371
Fraenkel, A.A., 50, 220, 226, 841-843, 845-847, 865, 866, 912
- et al. [1958], 842
Frege, G., 3, 228
- [1879], 3
- [1893], 3
Freudenthal, H., 205, 206
Gandy [1980], 41, 655, 664
Geuvers [1993], 11
- & Nederhof [1991], 11
Girard, J.-Y., 8, 111, 188, 200, 230, 235, 393
- [1971], 9, 188, 393
- [1972], 8, 9, 111, 123, 150, 168, 188, 520
Glaser et al. [1984], 837
Gödel, K., 9, 192, 229, 393
Gordon, M.J.C., 12
- & Melham [1993], 12
- et al. [1979], 9, 166
Gorissen, P., 837
Harper et al. [1987], 10
Helmink, L., 11
Heyting, A., 26, 204, 205, 211, 216, 218, 236
Hilbert, D., 4, 52, 93, 94, 101, 142, 152, 163, 203, 205, 216, 217, 219, 728, 730, 733, 853, 865, 887, 911, 922, 936
- & Ackermann [1928], 4
Hindley [1979], 177
- et al. [1972], 469
Hitchcock & Park [1973], 943
Howard, W.A., 8, 111, 128, 235, 248, 393, 469, 473
- [1980], 8, 111, 150, 168, 393, 469, 473
Huet, G., 235, 837
Kamareddine & Nederpelt [1993], 50
Kijne [1962], 852
Klop [1980], 36
- [1992], 36
Kneale & Kneale [1962], 3
Kolmogorov, A.N., 236
Kornaat, A., 7, 138, 156, 158, 170, 252, 303, 306, 733
Kramers, H.A., 203
Kreisel [1972], 395
Kronecker, L., 217, 218
Kruseman Aretz, F.E.J., 837
Läuchli, H., 161, 393
- [1970], 161, 393
Lévy, J.J., 503, 508, 509
- [1974], 508
- [1975], 503
Landau, E., 73, 144, 157, 159, 160, 169, 222, 223, 252, 303, 306, 578, 701-706, 709-711, 713, 715, 723-725, 729-732, 805
- [1930], 44, 45, 157, 159, 701, 709
Landin, P.J., 243, 837
von Leibniz, G.W., 3, 142
Leivant, D., 635
- [1975], 633, 635
Luo, Z., 11
- [1989], 11
Magnusson & Nordström [1994], 12
Marcelis, J.G., 252
Martin-Löf, P., 8, 111, 128, 188, 194, 200, 230, 391, 393, 398, 427-429, 472, 474, 655
- [1971a], 8, 168, 194
- [1975a], 9, 25, 111, 123, 128, 130, 134, 150, 188, 192, 393, 474, 517, 520, 655
- [1984], 9
Milner, R., 9, 166
Mitschke [1976], 503
Mostowski, A., 841
Nederpelt, R.P., 6, 28, 123, 165, 174, 175, 183, 186, 187, 193, 252, 264, 278, 281, 333, 371, 473, 517, 518, 551, 559, 577, 581, 591, 592, 596, 605, 616, 626-628, 655, 683, 783
- [1971a], 275, 393, 590, 628
- [1971b], 28, 393, 590
- [1972b], 400
- [1973], 123, 124, 147, 155, 163, 175, 177, 281, 314, 327, 331, 333, 345, 473, 482, 517, 518, 522, 577, 592, 626, 633, 655, 783, 830
- [1977], 144
- [1987], 23, 52
von Neumann, J., 192
Nordström, B., 12
- et al. [1990], 9
Paulin-Mohring, Ch., 11
- [1989], 11
Paulson [1987], 10
Peano, G., 52, 137, 141, 218, 221, 228, 241, 246, 249, 711, 715, 717, 718, 720, 843, 846, 848, 887, 912, 922
Penning, P., 252
- [1977], 170
Plotkin [1980], 191
Pollack, R., 11
Pottinger, G., 635
- [1977], 305, 635
- [1979], 305
- [1980], 189
Prawitz, D., 111, 392, 393, 508, 632, 635
- [1965], 305, 392, 509, 632
- [1971], 111, 123, 150, 168, 392, 393, 487, 488
Pythagoras, 3
Reynolds, J.C., 9
- [1974], 9
Rosser, J.B., 38, 42
Russell, B., 4, 229
Sanchis, L.E., 393
- [1967], 393
Schuh, F., 202, 203
Schulte Mönting [1973], 391
Scott, D., 9, 26, 100, 161, 167, 169, 188, 193, 194, 216, 473
- [1970], 25, 168, 169, 188, 194, 200, 473
- [1973], 9, 128, 134
- & Strachey [1971], 943
Seidel, J.J., 6
Seldin, J.P., 177, 188, 195, 200, 230
- [1975], 177
- [1976], 25, 188, 195
- & Hindley [1980], 22
Shakespeare, W., 217
Smorynski [1977], 5
Staples, J., 175
- [1977], 175
Tait, W.W., 118, 376, 391, 392, 396, 398, 427-429, 655
- [1967], 392, 396, 487
Tarski, A., 5
Thales, 3
Trybulec [1990], 12
Turing, A.M., 229
Turner [1979], 49, 177
Udding, J.T., 160, 252
- [1980], 160
de Vrijer, R.C., 7, 175, 185, 252, 396, 503, 517, 601, 647, 689
- [1975], 155, 163, 173, 175, 185, 517, 591, 601, 608, 647
- [1987a], 37, 174
- [1987c], 41, 655, 664
Wadsworth [1971], 177, 809
Weyhrauch, R.W., 166
- [1977], 166
Whitehead, A.N., 4, 229
- & Russell [1910], 4
Wieringa, R.M.A., 8, 157, 160, 170, 177, 252, 837, 947
- [1976], 171
- [1978], 160
- [1980], 947
Wittgenstein, L., 151
Zandleven, F., 7, 121, 126, 156, 158, 169, 170, 176, 252, 725, 805, 809, 811, 817, 837
- [1973], 101, 115, 121, 127, 156, 169, 570, 572, 805-807
Zermelo, E., 51, 144, 220, 226, 841-843, 865, 866, 912
Zucker, J., 7, 147, 158, 160, 169, 170, 224, 252, 303, 306, 724, 729, 733, 736, 737, 739-742, 746, 748, 749
- [1974], 635
- [1977], 158, 163, 169-171, 177, 188, 189, 197, 289, 303, 305, 306, 470, 706, 722, 724, 729, 733, 736, 741, 742, 746
Index of Notations

+, 309, 493
α, 116
+', 311
β, 108
⊕, 304
βη-CR, 577
PTVU-SN, 638
βτ, 603
bool, 688
bool, 16, 41, 723
bool, 53
BOUNDSUBST, 790
BT, 38
BT, 600, 601
507
>, 494, 518
>1, 493
>α, 790
>δ, 794
>η, 792
≥1, 494
≥, 494, 518
A sub B, 493
A ↓ B, 494, 519
A ⊂ B, 493
[ti : V], 149
E[z := p], 148
E, 52
∞, 53
P I r 11, 486
+, 599
→btr, 473
:, 599
∼, 583
≡, 493
N, 518
=, 494
cantyp, 539, 558
CAT, 49, 125, 795
CAT, 299, 739
CAT, 31
CATEGORY, 788
CL, 519, 534
CL1, 534
CL', 612
CL&, 612
CLPT, 534
CLPT1, 535
CORRECT, 802
CORRECTCATS, 801
CR, 494
CR', 612
CR;, 612
CR1, 494
cus, 648
≡, 798
⊕', 800
T, 747
⊢, 723, 801
⊢N, 626
⊢*, 801
:, 52
::, 52
Deg, 399, 450
δ, 108, 493
dn, 594
dnc, 594
DOM, 48, 795, 796
DOM, 739
DOM, 31
dom, 561
E, 116
EB, 109
ε, 304, 493
εalt, 311
η, 108
EUD, 519, 536
IF, 638
ij-pp, 494
INDSTR, 787
int, 53
IS, 41
Lq, 519, 536
LR, 605
MIDDLE, 787
F, 39
N, 494
nonempty, 688
OCCURS IN, 792
OFF, 940
OLDER THAN, 798
ω, 346
%(x), 489
ON, 940
P'T, 596
PD, 596
Π, 32, 306, 736
π, 304, 493
PN, 109
PROP, 723
prop, 19, 22, 43, 152, 196, 728
PT, 519, 534, 536, 596
PT1, 534
Q, 116
&-, 518
r, 601
RECURS, 955
ρ, 38, 399, 457
rst, 601
rt, 601
s, 601
sA, 519, 596
sc, 597
Σ, 11, 736
σ, 304, 493
SN, 494
STRINGSUBST, 789
SUBST, 789
SYNT, 739
synt, 31, 299
T, 53
t, 601
τ, 34, 260, 603
τ', 482
Θ, 381, 468
Θ1, 467
Θ2, 466
Totinf, 53
TRUE, 688
TRUE, 16, 41, 53, 87, 723
Typ, 399, 450
Typn, 449
Typ', 399, 452, 482
typ, 184, 186
typ', 593
type, 16, 19, 22, 43, 75, 148, 181, 258, 395, 728
UD, 519, 536
UT, 519, 536
weak ij-pp, 494
Index of Subjects

+-language, 185, 309, 526
+-reduction, 40, 493
+'-reduction, 311
rst-reduction, 601
rt-reduction, 601
r-reduction, 601
s-reduction, 601
t-reduction, 601
1-expression, 258, 371
2-expression, 258, 371
3-expression, 258, 371
4-expression, 722
1p-expression, 706
1t-expression, 706
2p-expression, 706
2t-expression, 706
3p-expression, 706
3t-expression, 706
∀-elimination, 46
⊕-function, 304
⊕-type, 304
ξ-context, 360
ij-postponement, 494
abbreviating expression, 737
abbreviation, 105, 179, 241
abbreviation system, 76, 147
abbreviation-line, 707
abortion, 940, 970
abstract algebra, 139
abstract data type, 28, 245
abstract linear order, 138
abstraction, 28, 148, 178, 231, 396, 470
- expression, 129, 178, 254
function -, 231
functional -, 90
- index, 294
lambda -, 18
abstractor, 398, 403
abstractor chain, 398, 403
abstractor string, 616, 626
absurdity, 236
acceptable, 84
adequacy, 10
adjective, 52, 872, 876, 877
ALF, 12
ALGOL 60, 206, 279, 937
algorithm, 387
algorithmic correctness check, 329
algorithmic definition, 172, 517, 554, 558, 563, 591
all-quantifier, 93
α-conversion, 116
modulo -, 175
- equality, 107
- equivalence, 390
- reduction, 47, 413, 790
alphabet, 401
ancestor, 738
anonymous archetype, 871
Another Logical Framework, 12
applicability condition, 37, 39, 395, 452, 472
application, 28, 148, 178, 231, 377, 396
- condition, 234, 592
- expression, 129, 178, 255, 736
function -, 231
legitimate Q-, 452
- restriction, 19, 107
- rule, 532
self -, 392, 530
applicator, 403
applicator chain, 403
arbitrary, 52, 204, 852, 855
archetype, 205, 871, 898
argument degree, 186, 525
arithmetical typed λ-calculus, 168
artificial intelligence, 870
ascendant, 32, 321
assertion, 875
assertion category, 87
assertion type, 688, 722
assertional line body, 886
assignment, 53, 54
assumption, 153, 240, 844, 875
assumptional context item, 880, 885
assumptional item, 879
AT-couple, 324
AT-pair, 32, 324
AT-removal, 33, 37, 325
attachment, 157
AUT-Δλ, 30, 32
AUT-λ, 36, 37
AUT-Π, 31, 40, 44, 127, 147, 158, 289, 294, 303, 724, 733
AUT-Π line, 734
AUT-Π0, 633
AUT-Π1, 634, 636
AUT-4, 187, 284
AUT-68, 41, 147, 251, 687, 721, 722
AUT-LAMBDA, 335
AUT-QE, 127, 147, 183, 230, 276, 289, 303, 396, 469, 471, 701, 721
AUT-QE-NTI, 289, 294, 335
AUT-SL, 28, 32, 147, 154, 186, 275, 314, 393
AUT-SYNT, 30, 43, 158, 299, 725, 737
AUT-synt, 169
Automath, 6, 230, 393, 394, 869
Automath book, 127
Automath project, 169
Automath verifier, 47, 49
automatic expression simplifier, 946
automatic formula manipulation, 375
automatic theorem proving, 23, 98, 157, 221, 240
axiom, 88, 153, 234, 241, 843, 888
axiom of choice, 94, 152, 205, 216
axiom of reducibility, 4
axiomatic line body, 886
Barendregt's cube, 235
Barendregt's lemma, 505
base, 401, 478, 479
Begriffsschrift, 3
β-chain complex, 432
- conversion, 421
- equality, 107, 108
- equivalent, 421
- normal, 460
- normal form, 460
- normalizable, 460
- reduction, 47, 49, 122, 242, 257, 288, 326, 382, 386, 390, 417, 789
n-step -, 417
β1-
- equivalence, 424
- nf, 466
- normal, 463
- normal form, 463, 466
- normalizable, 463
- normalization theorem, 466
- reduction, 36, 398, 422, 424
n-step -, 425
β2-
- equivalence, 424
- normal, 463
- normal form, 463
- normalizable, 463
- reduction, 36, 422, 424
n-step -, 424
βη-Church-Rosser, 577
βτ-reduction, 603
big tree, 37, 473, 484, 519, 577, 600
big tree theorem, 38, 175, 485, 486, 591, 600, 601
binary predicate, 132
binary sum, 40
binary tree, 316
binary union, 311
binder, 922
binding abstractor, 407
binding influence, 408
binding instance, 318
binding occurrence, 406
binding variable, 176, 233
block, 15, 75, 843, 882
- opener, 76, 145, 882
- mechanism, 14
- structure, 53, 939
body of a line, 880
Bolzano-Weierstrass theorem, 138
book, 28, 29, 74, 108, 144, 253, 474, 872, 879
Automath -, 127
correct -, 172
correct PAL -, 84
empty -, 894
- equality, 16, 19, 22, 41, 112, 154, 159, 219
MV -, 879, 891, 911
nested -, 79
valid -, 891, 894
zero abstraction index -, 294
bookkeeping pairs, 601, 607
bool, 87
bool-style, 153, 213
boolean convolution, 950
borderline between language and metalanguage, 873
bound expression, 406
bound instance, 318
bound occurrence, 406
bound variable, 76, 89, 175, 375, 884
Boyer-Moore Theorem Prover, 12
Brouwer-Heyting-Kolmogorov interpretation, 236
de Bruijn index, 49, 809, 811
but for α-reduction, 418
calculated type, 31
Calculus of Constructions, 9, 10, 235
calculus of lambda conversion, 389
calculus ratiocinator, 3
Cambridge LCF, 9
CAML, 11
canonical type, 28, 39, 48, 539, 555
Cantor's paradise, 205, 846
capacity, 606
Cartesian product, 131, 244, 736
case-construction, 245
category, 15, 48, 75, 76, 128
- of all propositions, 131
- of all types, 130
CC, 10
chain, 403
Characteristica Universalis, 3
chastity belt system, 223
checker, 156
checking, 142, 251, 337
Church's thesis, 5
Church-numeral, 246
Church-Rosser, 326
βη-, 577
weak -, 494
- for Λη, 596
- property, 36, 123, 391, 472, 494
- for β1-reduction, 427
- theorem, 35, 155, 171, 371, 386, 390, 485, 655, 663
Church-typing, 232
clash of variables, 605, 788
class, 372, 872
classical ∃-rule, 133
classical logic, 133, 152, 236, 866
classical mathematics, 133
classical real analysis, 136
clause, 875, 878, 885
clause of a line body, 889
closed expression, 406
closure, 11, 40, 418, 519, 655, 676
- for AUT-Π, 629
- for βη-AUT-QE, 537
- for βη-AUT-QE+, 543
- for Λ, 596
- proofs for simpler languages, 551
- property, 37, 123, 173, 473, 517
- theorem, 155, 483
combinatorics, 138
combinatory logic, 177
comment, 739
comment line, 739
common reduct, 173
compatibility condition, 180
compatibility of def and typ, 523, 595
compatible, 231
complete linear order, 139
complete ordered field, 139
completion, 82
complex, 432
composite β-reduction, 417
compound notion, 76
comprehension, 136
computability, 5, 489, 647
computability under substitution, 648
computable functions, 229
computer program semantics, 843, 863
computer programming, 858
concatenation, 54, 954
condition, 875
confluent, 494
confusion of variables, 415
conjunction, 132
conservative extension, 544
constant, 60, 230, 377, 471, 883
- at infinity, 949
- function, 756
- symbol, 474
constructibility, 51
construction, 470
construction irrelevance, 288
constructive existence, 202
constructive reasoning, 167
constructive validity, 25
constructivist, 87
constructivity, 50
Constructor, 11
content, 253
context, 15, 28, 77, 103, 104, 144, 222, 233, 253, 879
ξ-, 360
assumptional - item, 880, 885
declarational - item, 880, 885
empty -, 128, 880
- extension, 734
- indicator, 75, 734
- item, 879, 885
- line, 128, 265, 734
- mechanism, 14
renaming of -s, 581
valid -, 891, 892
variable of a -, 883, 884
continuity, 44, 748
continuous, 965
continuum problem, 221
contractum, 179
conversion, 28, 390
α-, 116
modulo -, 175
β-, 421
calculus of lambda -, 389
- rule, 234
type -, 308, 530
convertibility, 179
conveyor belt, 209
Coq, 11
correct, 372
- book, 172
- PAL book, 84
- expression, 172
- formula, 172
- term, 34
correctness, 172, 263, 327, 328
algorithmic - check, 329
degree -, 307, 312
degree norm -, 593
mathematical -, 166
- in Λ, 592
- in AUT-QE, 119
- of a Q-formula, 524
- of an E-formula, 524
- of a category, 524, 528
- of an expression, 523, 800
- of a line, 802
partial -, 54, 956
program -, 53
semi -, 333
substitutivity of -, 597
total -, 54, 958
- of strings, 801
- of types, 597
CR for β1-reduction, 427, 436
CR for γ-reduction, 436
CR for full βη-reduction, 581
criteria for a good notation, 376
cube of typed lambda calculi, 11
Curry-Howard isomorphism, 235
Curry-typing, 232
d-system, 620
data bank, 164
data type, 10, 53, 938
daughter, 738
dead end set, 639
decidability, 37, 102, 173, 265, 472, 518, 554, 569
feasible -, 18, 25, 167, 173
formal -, 25, 173
decidable, 485
deciding Q-formulas, 124
decision procedure, 172, 555
declarational context item, 880, 885
declarational item, 879
decomposition, 417, 439, 571
defined constant, 493
defined constant-expression, 493
definiendum, 735
definiens, 735
defining line, 734
definite article, 920
definition, 14, 76, 88, 202, 225, 241, 735, 875, 876
algorithmic -, 172, 517, 554, 558, 563, 591
E-, 39, 172, 517, 554
inductive -, 402
language -, 872
- line, 128
- scheme, 493
unfolding a -, 242
definitional
- constants in Λ, 620
- equality, 19, 22, 28, 83, 90, 108, 133, 154, 219, 257, 281, 394, 494, 796
- extension, 185, 522, 544
- line, 146
- line body, 886
- reduction, 47
degree, 16, 29, 32, 103, 130, 148, 181, 283, 307, 320-322, 324, 393, 450
argument -, 186, 525
- correctness, 307, 312
domain -, 186, 525
function -, 185, 525
higher -, 148
- norm, 593
- norm correct, 591
- norm correctness, 593
value -, 185, 525
δ-equality, 107
- reduction, 39, 50, 154, 219, 256, 493, 793, 794
- string, 345
- lambda, 327
Δ, 36, 40, 397
- constructible, 409
Δλ, 33, 313, 327
denotational semantics, 943
dependent product, 234
derivation rule, 154
derivation step, 402
derivative, 137, 744, 751, 753, 756
derived rule, 897
descendant, 738
despecify, 923
despo, 923, 925
Dialectica interpretation, 9
diamond property, 123
didactics, 251
difference quotient, 744, 750
differentiable, 745
differentiation, 44, 733, 744, 750, 756
direct consequence, 417
disjoint one-step i-reduction, 496
disjoint one-step reduction, 527
disjoint reduction, 494
disjoint sum, 132, 244, 736
disjoint union, 32, 304
disjunction, 28, 305
distinction between replacement and substitution, 605
distinctly bound, 397, 407
domain, 126, 232
- degree, 186, 525
- function, 561
mechanical -, 795
- of a term, 48
preservation of -s, 596
Q-, 453
uniqueness of -s, 126, 174, 519, 535
extended -, 519, 535
double negation axiom, 236
double negation law, 216, 728, 748
dummy, 231, 318, 883, 884
dummy-binding, 376
EB-line, 707
ECC, 11
Edinburgh LCF, 9
E-definition, 39, 172, 517, 554
- for Λ, 533
E-formula, 116, 119, 177, 517, 524
elementary β-reduction, 416
elementary β1-reduction, 424
elementary β2-reduction, 424
elementary η-reduction, 439
elementary κ-reduction, 443
elementary mathematics, 143
elimination of primitive constants, 616
elimination rule, 234
embedding, 41
empty block opener, 109
empty book, 894
empty context, 128, 880
empty line, 739
end-point, 316
environment, 49, 817
ε-reduction, 40, 304, 493
εalt-reduction, 311
equality, 46, 135, 293, 708, 900
α-, 107
β-, 107, 108
book -, 16, 19, 22, 41, 112, 154, 159, 219
definitional -, 19, 22, 28, 83, 90, 108, 133, 154, 219, 257, 281, 394, 494, 796
extended -, 583
δ-, 107
η-, 107, 108
intensional -, 192
left hand - rule, 178, 536
- of booleans, 41
- of elements, 41
- of proofs, 306
- on types, 46
right hand - rule, 178
equivalence, 327
equivalence proof, 554, 563
equivalent, 172
η-equality, 107, 108
- reduction, 39, 43, 47, 48, 288, 382, 390, 726, 791, 792
n-step -, 440
η!-reduction, 441
k-fold -, 441
ηd-system, 620
excerpt, 45, 48
excluded middle, 205
excluded third, 45, 216, 285
existence, 93
existential quantification, 132, 748
existential quantifier, 237
existential type, 246
expert system, 865
explicit substitution, 50
explicit typing, 232
exponential function, 138
expression, 28, 59, 254, 402, 736, 875
1-, 258, 371
2-, 258, 371
3-, 258, 371
4-, 722
1p-, 706
1t-, 706
2p-, 706
2t-, 706
3p-, 706
3t-, 706
abbreviating -, 737
abstraction -, 129, 178, 254
application -, 129, 178, 255, 736
bound -, 406
closed -, 406
correct -, 172
defined constant -, 493
fix -, 736
head -, 254, 736
inhabitable -, 307
lambda -, 736
legal -, 233
legitimate -, 452, 482
name carrying -, 380
NF -, 172, 379
normable -, 507
p-, 20, 130
Π-, 736
plus -, 493
primitive constant -, 493
pseudo -, 233
quasi -, 114, 149, 286
saturated -, 59
Σ-, 736
t-, 20, 130
ext-postponement, 499
ext-reduction, 493, 496
extended definitional equality, 583
extended reduction, 583
extended system, 24
extended typed λ-calculus, 168, 303
extended uniqueness of domains, 519, 535
extension, 172, 544
definitional -, 185, 522, 544
unessential -, 185
extensional reduction, 175, 493
extensionality, 47, 136, 192
extensions of Automath, 98
exterior approach, 864
external reference, 322
factor, 404
failure of &-SN, 652
feasibility, 21, 24
feasible decidability, 18, 25, 167, 173
field, 139
finite product, 32
finite sum, 32
first-order language, 158
first-order system, 114, 637
fix expression, 736
fix symbol, 735
fixed point, 965
fixed point semantics, 54
flag, 891
flagless form, 881
flagstaff, 891
flagstaff form, 52, 881, 891
FOL, 166
formal decidability, 25, 173
formula, 875
correct -, 172
E-, 177
Q-, 177
formulae-as-types, 8, 199, 393, 469
fourth degree identification, 283
fragment
p-, 196
t-, 189
free occurrence, 406
free variable, 76, 118
fresh, 884
fresh identifier, 883
fresh variable, 412
function, 17, 231, 389, 908
- abstraction, 231
- application, 231
- degree, 185, 525
- like, 113
- type, 17
type-valued -, 131
- value, 17, 389
functional abstraction, 90
functional binder, 923
functional interpretation, 111
functional interpretation of logic, 19
fundamental rule, 897
γ-equivalence, 430
γ-type, 357
Gödel's general recursive functions, 5
general language rules, 177
general reduction, 446
generalized
- Cartesian product, 239
- conjunction, 132, 711
- functionality, 25
- if-then-else, 711
- logic, 171
- implication, 131, 197, 704, 711
- product, 17, 40, 234
- sum, 32, 40
- typed λ-calculus, 40, 168
generate, 402, 417, 424, 439
generic form, 494
geometrical construction, 50, 54, 226, 843, 850, 960
gL, 874
GOAL, 167
grains of salt, 927, 931
grammatical category, 876
graph reduction, 177
Grundlagen, 157
Hall-König theorem, 138
hardware verification, 12
head expression, 254, 736
head reduction, 571
heading, 739
Heine-Borel theorem, 138
high typing, 871, 877
higher degree, 148
higher mathematics, 143
higher order language, 158, 187
Higher Order Logic, 12
higher order system, 200
higher order typing, 50
Hilbert operator, 93, 728
Hilbert's ε-operator, 152
Hilbert's axiomatization of geometry, 853, 922
Hilbert's axioms, 887
Hilbert's program, 5
Hilbert's selection operator, 205
HOL, 12
identifier, 15, 58, 75, 76, 735, 883
identity
- function, 758
syntactic -, 175
IE-reduction, 493, 496, 634
if-then-else, 41, 54, 711
illative combinatory logic, 195
immune form, 571, 638
imMV, 874
implantation, 32, 322
implementation, 809
- of substitution, 809, 817
implication, 41, 45, 91, 132, 152, 153, 205, 211, 236
generalized -, 197
implicit typing, 232
impredicativity, 8
improper reduction, 585, 632
improper symbol, 401
improved dead end set, 641
incomplete information, 53, 943
incompleteness theorem, 12
increasing reduction, 36
indicator, 15, 75-77
- string, 79
individual, 45, 136, 708
- variable, 401
induction, 46
- on ρ, 508
- on reducts, 598
- on subexpressions, 598
- on the length of proof, 402
inductive definition, 402
Inductive Definitions, 11
ineffective β-chain, 464
inference rule, 87
infinite binary tree, 315
inhabitability condition, 307
inhabitable, 186, 309
- degree condition, 523
- expression, 307
inhabitant, 17
inhabited, 235
injection, 32, 40, 304, 493
instantiate, 129
instantiation, 149, 180, 193, 222, 308
- condition, 593
integers, 137
integrated, 938
integration, 227
intensional equality, 192
interactive program, 47
interior approach, 864
internal reference, 322
internalization, 218
interpretation, 143
interpretation-oriented meta-MV, 874
introduction of a variable, 240
introduction rule, 234
introduction-elimination reduction, 493
introduction-elimination rule, 634
intuitionism, 205, 236
intuitionistic ∃-rule, 133
intuitionistic logic, 236
intuitionistic reasoning, 41
intuitionistic type theory, 8, 25, 230
inverse mapping theorem, 138
irrelevance of objects, 134
irrelevance of proofs, 20, 29, 43, 132, 133, 169, 710, 722, 724
justification, 240
κ-reduction, 443
König's lemma, 968
kind, 236
knowledge frame, 328
label, 316, 885
labeling, 317
lambda abstraction, 18
Lambda Automath, 393
lambda equivalence, 399, 447
lambda expression, 736
lambda notation, 389
lambda phrase, 403
lambda phrase chain, 403
lambda term, 231
lambda tree, 32, 317
Λ, 40, 186, 314, 393, 397, 452, 473, 590, 591
Λ∞, 41, 655, 673
Index of Subjects
Aq, 591 Aqc, 616 Avdr 620 Ac, 616 Ad, 620 A-Automath, 275 A-Calculus, 5, 229 arithmetical typed -, 168 cube of typed -, 11 extended typed -, 168,303 generalized typed -, 39, 168 A-typed -, 313 name-free -, 156,813 polymorphic -, 9, 193 polymorphic predicate -, 10,167 pure typed -, 168 second order -, 230, 235 typed -, 232,845 untyped -, 230 A-definability, 5 A-definable functions, 5 A-typed lambda calculus, 313 A-SEMIPAL, 146 AV, 340 A+, 34 A+-Church, 34 AX, 38, 175, 185,469, 601 AX-1, 185,472 AX-p, 486 AX-theory, 471,478 Xu, 33,339, 340 ATU, 340, 354 A(-term, 343 language, 143 -, 185 non -, 525 closure proofs for simpler -s, 551 - definition, 872 first order -, 158 general - rules, 177
+
Index of Subjects higher order -, 158, 187 meta -, 4, 143, 208, 873 object -, 4 programming - semantics, 23, 160 regular -, 181, 525 specification -, 865 superimposed -, 97 - theory, 155, 171, 172 type in a programming -, 244 large category, 130, 134 lazy evaluation, 49 LCF, 9, 167 left hand equality, 519 - rule, 178, 535 left projection, 736 legal expression, 233 legitimate expression, 452, 482 legitimate Q-application, 462 legitimate term, 38, 39, 472 LEGO, 11 length, 402 length of proof, 424 let-construction, 28, 244 level of a variable, 379 lexicographical order, 321 library, 251 life without types, 890 limit, 743 line, 28, 108, 128, 253, 734, 879 linear order, 139 list, 243 literary replacement, 513, 605 local P-reduction, 242, 244, 325, 382 local reduction, 32 logic, 268, 688, 706, 721, 909 classical -, 236 combinatory -, 177 functional interpretation of -, 19 generalized -, 171
1015 illative combinatory -, 195 intuitionistic -, 236 minimal predicate -, 166 predicate -, 236 propositional -, 236 Logic for Computable Functions, 9 logical constant, 241 Logical Framework, 10 logical paradox, 4 loops, 863 loss factor, 23, 44, 160 low typing, 877 machine verification, 18 macro-operation, 60 main line, 328 main reduct, 572 main reduction, 506 many-step reduction, 122 Mascheroni constructions, 863 mathematical correctness, 166 Mathematical Vernacular, 23, 27, 52, 161, 211, 224, 865 mathematics produced in Automath, 159 mechanical domain, 795 mechanical type, 31, 125 memory, 48 meta language, 4, 143, 208, 873 meta-MV, 874 meta-typing, 871 metalingual discussion, 376 metric space, 138 micro-operation, 61 mimicking, 159 mini-reduction, 33, 242, 326 minimal logic, 132 minimal predicate logic, 166 mixed string, 401 MIZAR, 12 ML, 9, 167, 230
1016 mMV, 874 mock typing, 150, 289 modified parametrized constant, 883 modulo a-conversion, 175 modulo a-reduction, 493 Modus Ponens, 45,91,214, 236,917 monotonic functional, 666 monotonicity, 390 monotonous, 231 monotony rule, 416, 425, 430, 439, 443,447 more-step reduction, 494 mother, 738 multi-step proof, 152 multiple P-reduction, 382, 384 multiple substitution, 39, 49, 501, 512, 815 MV, 52, 865, 874 - book, 879, 891, 911 name, 52, 253, 266, 376, 875-877 -carrying expression, 380 - clash, 33, 35, 47, 156 - of the proof, 47 named variable, 35 name-free, 35, 379 - A-Calculus, 156, 813 - notation, 34, 231, 342 - variable, 25, 343 nameless dummy, 33, 343, 375 nameless variable, 40, 49, 50, 656, 811 natural deduction, 14, 238, 721, 875, 916 natural language, 927 natural number, 46, 137, 241 nearness, 742 Nederpelt’s norm, 278 negation, 28 nested book, 79 nested one step reduction, 663
Index of Subjects nested reduction, 428 new clause of a line, 889 NF expression, 379 non-+-language, 525 non-termination, 53, 54, 940, 947 nonempty, 41 norm, 34, 37,371, 373,456, 507, 664 -correct, 331, 334 normability, 40 normable, 41, 331, 601 - expression, 507 - term, 664, 666 normal, 460 - reduction, 572 - term, 664 normal form, 82, 155, 173, 277, 281, 371, 391, 456, 460 uniqueness of -s, 173 - theorem, 155, 373 normalizable, 460 normalization, 123, 173, 391 - for 0-reduction, 508 - property, 494 strong -, 29, 36,37, 39-41, 123, 173, 391, 463, 472, 649, 655, 664 - theorem, 463 weak -, 34, 39 norming functional, 41, 664 notation rule, 401 Kuprl, 9, 230 object, 128, 130, 878 - language, 4 - name, 76 observability, 51, 850, 854, 861 occurrence, 401 old, 880 old clause of a line, 889 OMV, 874 one-step 0-reduction, 231
Index of Subjects one-step preservation of types, 534 one-step reduction, 121,493 order, 441 order-completeness, 137 ordered field, 139 osmosis, 202, 223 outer shape lemma, 506 p-expression, 20, 130 p-fragment, 196 p-part, 25 p-reduction, 634 pair, 31, 131,244,304,493, 736,907 pair rule, 310 pairing, 40 PAL, 15, 74,79, 146, 148,852 PAL-FT, 146, 147 paradox, 4,208 paragraph, 701 - closing line, 738 - indicator, 738 - line, 734, 737 - marker, 45 - opening line, 738 - reopening line, 738 - system, 44,48,110,147,156, 802 parameter, 128 parametrized constant, 883 parsing, 876,883,926,934 Part P,25 t-, 25, 191 partial correctness, 54,956 partial function, 137, 709, 724, 733, 742,744 Peano’s axioms, 137,241, 846,887 permissible, 449 permutative reduction, 632,634 phrase, 875 ?r-
1017
n-
-reduction, 40, 304, 493, 607 -type, 358
-expression, 736 -operator, 289 -type, 40 platonism, 203, 217,220,878 plus-expression , 493 pMV, 874 PN, 253 PN-line, 16,43, 146, 707 pointed flag, 882,885 polymorphic A-calculus, 9,193 polymorphic predicate A-calculus, 10, 167 polymorphism, 28, 243 positive logic, 851 postponed substitution, 176 postponement, 39,498 - of 1.)-reduction,399,442 power series, 44, 138 powertype, 21, 136 PPA, 10, 167 predicate, 41,93, 132,135, 196,875 predicate logic, 236 predicativity, 9 preservation - of domain, 596 - of types, 174,310, 519,834 - of C u t Y p , 540 - of typ, 596 - of typ*, 596 primary MV, 873 primitive constant-expression, 493 primitive line body, 886 primitive notion, 17, 76, 104, 109, 241 primitive notion line, 128, 734 primitive program, 952 primitive program construct, 953 principal type scheme, 230,232
1018 procedure, 54 processing, 143 processor, 73, 96, 570 product, 28, 45 - formation, 305 - type, 17, 233, 306, 708 program, 279, 951 - abortion, 54 - correctness, 53 - specification, 951 -to-program function, 954 programming language semantics, 23, 160 programming variable, 937 projection, 32, 304, 493 - function, 244 - of pairs, 40 - rule, 310 proof, 88, 128, 131, 242, 251, 919 - by cases, 966 - checking, 166, 167 - class, 211 - irrelevance, 22, 159 - object, 235 - type, 152, 153 proofs-as-objects, 5, 211 prop-inclusion, 114 prop-style, 153, 213 prop-type, 235 propagation, 525 proper identifier, 76 proper reduction sequence, 122, 585 proper subexpression, 403, 493 proposition, 87, 131, 153, 240, 872, 875 propositional logic, 236 propositional variable, 153 propositions-as-types, 5, 16, 18, 22, 27, 41, 111, 150, 168, 196, 198, 211, 235 pseud+expression, 233
Index of Subjects PTS, 11, 27, 233 PTS-rules, 37 pure system, 24, 517 Pure Type System, 11, 27, 233 pure typed A-calculus, 168 Q-applicable, 452 Q-domain, 452 Q-function, 452 @formula, 116, 120, 177, 517, 524 &propagation, 308, 529 quadruple, 53 quantification, 59. 106, 185, 396, 923 quantifier, 135, 225 quasi-expression, 114, 149, 286 quasi-type, 358 quasi-variable, 940 ramified theory of types, 4 rationals, 137 real analysis, 21, 733 realizer, 196 reals, 137 reasoning, 85, 150 rectangular flag, 882, 885 recursion, 21, 28, 53, 246, 963 recursive algorithm, 572 recursive procedure, 942 redex, 179 reduction, 28, 155, 390, 446 +-, 40, 493 +’-, 311 Q-, 47, 413, 790 single-step -, 414 modulo -, 493 p-, 47, 49, 122, 242, 257, 288, 326, 382, 386, 390, 417, 789 composite -, 417 elementary -, 416 local -, 325 multiple -, 382, 384 n-step -, 417
Index of Subjects one-step -, 231 single-step -, 416 PI--, 36, 398, 422, 424 elementary -, 424 n-step -, 424 CR for -, 427, 436 single-step -, 424 P2-, 37, 423, 425 elementary -, 424 n-step -, 424 single-step -, 424 PT-, 603 CR for ,&-, 581 definitional -, 47 delta -, 219 6--, 39, 50, 154, 219, 256, 493, 793, 794 disjoint -, 496 disjoint one-step i-, 496 disjoint one-step -, 527 E-, 40, 304, 493 Ea1t-r 311 7-, 39, 43, 47, 48, 288, 382, 390, 726, 791, 792 elementary - 439 n-step - 439 restricted - 578 single-step -, 439 ?!-, 441 k-fold - 441 extensional -, 175, 493 ext -, 493, 496 extended -, 583
Y-
-, 436 single-step -, 429 graph - , 177 head -, 571 IE-, 493, 496, 634 improper -, 585, 632 increasing -, 36 CR for
1019 introduction-elimination -, 493 6-, 443 elementary -, 443 single-step -, 443 local -, 32 main-, 506 many-step -, 122 mini-, 33, 242, 326 more-step -, 494 - of order p , 441 nested one-step -, 663 nested -, 428 single-step -, 428 normal -, 572 P-, 634 permutative -, 632, 634 A-, 40, 304, 493, 607 r-, 601 rst-, 601 rt-, 601 9-, 601 - sequence, 122, 173, 390 proper -, 122, 585 U-, 40, 304, 493 single-step -, 446 - step, 173 - strategy, 125, 173, 570 t-, 601 twin -, 495 - under substitution, 39, 503, 505 redundancy, 167 reference, 814 - depth, 176, 378 - mapping, 34, 344, 354 - number, 33, 35, 343 - transforming mapping, 177 reflexivity, 136 refmap, 354 refuser, 940, 972 regular language, 181, 525
1020 regular system, 517 relation, 132 relational semantics, 53, 938 relative addressing, 49 remark, 739 renaming, 48, 375 - of contexts, 581 renovation, 411 reopening of a paragraph, 46 replacement, 411 - property, 606 - theorem for P ~ T - S N , 621 - theorem for PT-SN, 606 restricted q-reduction, 578 pnormable, 399, 458, 460 pA-normable, 458 ptype, 360 right hand equality rule, 178 right projection, 736 root, 316 rule of the excluded third, 41 rules of grammar, 872 runtime, 947 saturated expression, 59 scar, 422 scheme, 193 second order lambda calculus, 230, 235 secondary MV, 873 segmap, 350 segment, 23, 33, 241, 339, 345 - calculus, 44 - mapping, 350 semantically equivalent, 952 semantics, 171, 859, 960 -of computer programs, 51, 54 - of programs, 937 semicorrectness, 333 SEMIPAL! 15, 74, 81, 146 sentence, 52, 872, 875
Index of Subjects sequential composition, 53, 54 set, 41, 45, 136, 240, 709, 871, 904 - extensionality, 136 - theory, 50, 138, 841 - type, 236 shape, 571, 786 shorthand facility, 110 C-expression, 736 o-reduction, 40, 304, 493 C-type, 11, 40, 246 similarity, 431 simple replacement, 411 simple theory of types, 5 simple type, 34 simple type theory, 4 simultaneous substitution, 118 simultaneous substitution theorem, 5 14 single line, 29, 32, 40, 590 single line Automath, 590 single substitution, 49 single-step a-reduction, 413 single-step P-reduction, 416 single-step PI-reduction, 424 single-step ,&-reduction, 424 single-step q-reduction, 439 single-step ?-reduction, 429 single-step &-reduction, 443 single-step nested reduction, 428 single-step proof, 152 single-step reduction, 446 skip line, 734, 738 smMV, 874 sMV, 874 sort, 234 sound applicability, 519, 596 soundness of applicability, 535 specification, 53 - language, 865 square brackets lemma, 503,505,510 square brackets lemma for >or, 610
stack, 245, 939
standardization, 391
- theorem, 503
Stanford LCF, 9
state space, 937
state transformation, 939
statement, 233, 872, 875, 876
strategy, 156
strengthening, 308, 613
- rule, 521, 525
string, 44, 158
strong β-normalizability, 466
strong β-normalization, 509
strong β-normalization theorem, 468
strong β₁-normalizability, 466
strong β₁-normalization theorem, 467
strong β₂-normalizability, 466
strong β₂-normalization theorem, 466
strong η-normalizability, 466
strong η-normalization theorem, 468
strong existence, 134
strong normal form theorem, 155
strong normalizability, 468
strong normalization, 29, 36, 37, 39-41, 123, 173, 391, 463, 472, 649, 655, 664
- property, 494
- theorem, 468
strongly normalize, 123
structure of line bodies, 886
structure sharing, 49, 809, 817
subdivided lambda tree, 328
subexpression, 403, 493
subroutine, 858, 863
substantive, 52, 210, 870, 871, 875-877
- sub-, 870, 897
substitution, 14, 28, 47, 49, 51, 59, 89, 118, 129, 380, 414, 809, 815
- lemma, 41
- lemma for β-SN, 509
  postponed -, 176
- theorem for β-SN, 509
- theorem for βτ-SN, 611
substitutivity, 136, 747
- of correctness, 597
subtype, 41, 45, 708
successor, 137, 241, 246
sugaring, 226
sum, 28
- rule, 310
- type, 304
superimposed language, 97
supertype, 114, 236, 303, 470
suppression mechanism, 167
surjectivity of pairing, 40
syllogism, 3
symmetry, 747
syntactic identity, 175, 493
syntactic similarity, 583
syntactic variable, 401
syntax, 54, 960
syntax checker, 571
syntax-oriented meta-MV, 874
system F, 9, 230, 235
t-expression, 20, 130
t-fragment, 189
t-part, 25, 191
tail, 405
τ-expansion, 607
τ-redex, 603
taxonomy of type systems, 229
teaching, 223, 224
telescope, 11, 21, 44, 139, 158, 224, 729
telescopic mapping, 11
term, 390
termination of verification algorithm, 573
text, 872
theorem, 88, 241, 875
theorem proving, 167
theory of real numbers, 733
Theory of Types, 8
time space, 53
to fit in, 650
total correctness, 54, 958
transitivity, 747
translation, 711
translator, 570, 786
tree, 77, 314
tree of knowledge, 78
truth, 747
truth value, 916
twin case, 496
twin occurrence, 495
twin reduction, 495
type, 15, 32, 76, 128, 130, 210, 232, 258, 395, 449, 708, 735
⊕-, 304
abstract data -, 28, 245
anonymous arche-, 871
arche-, 205, 871, 898
assertion -, 688, 722
- assignment, 179, 478
calculated -, 31
canonical -, 28, 39, 48, 539, 555
category of all -s, 130
- check list, 330
- conversion, 308, 530
correctness of -s, 597
data -, 10, 53, 938
equality on -s, 46
existential -, 246
extended -d λ-calculus, 168, 303
function -, 17
γ-, 357
- in a programming language, 244
- inclusion, 19, 25, 30, 32, 39, 114, 150, 184, 285, 289, 305, 314, 335, 470, 479, 483, 531, 538, 651
intuitionistic - theory, 8, 25, 230
- label, 493, 497
-d lambda calculus, 313, 845
lambda -d lambda calculus, 313
mechanical -, 31, 125
- of a lambda tree, 320, 323
- of pairs, 131
Π-, 40, 358
power-, 21, 136
preservation of -s, 174, 310, 519, 534
  one-step -, 534
principal - scheme, 230, 232
product -, 17, 233, 306, 708
proof -, 152, 153
prop-, 235
quasi-, 358
ramified theory of -s, 4
- reduction, 285
- restriction, 50
ρ-, 360
-d set theory, 871
set-, 236
Σ-, 11, 40, 246
simple theory of -s, 5
simple -, 34
simple - theory, 4
- structure, 469
sub-, 41, 45, 708
sum-, 304
super-, 114, 236, 303, 470
- system, 229
τ-,
taxonomy of - systems, 229
- theory, 5, 229
ultimate -, 19
uniqueness of -s, 11, 19, 102, 174, 178, 306, 312, 472, 519, 535, 869
- valued function, 131, 303
typing, 232, 449, 897
Church-, 232
Curry-, 232
explicit -, 232
- function, 554
high -, 871, 877
higher order -, 50
implicit -, 232
low -, 877
meta -, 871
mock -, 150, 289
- operation, 50
- operator, 260
ultimate -, 19
typographical abbreviation, 224, 905, 910, 921, 925
ultimate type, 19
understanding, 143
unessential extension, 185, 521, 522, 544
unfolding a definition, 242
unification of mathematics, 849
uniqueness
- of domains, 126, 174, 519, 535
  extended -, 519, 535
- of normal forms, 123, 173, 392
- of types, 11, 19, 102, 174, 178, 306, 312, 472, 519, 535, 869
- quantifier, 136
universal generator, 391
universal quantification, 132, 290, 748
universal quantifier, 237
unstability, 174
update function, 34
updating, 814
val, 908
valid, 60, 81, 890
- book, 891, 894
- clause, 891, 892
- context, 891, 892
- inference, 3
validity, 109
valuation, 423
value degree, 185, 525
variable, 230, 377, 402, 843, 875, 883
binding -, 176, 233
bound -, 76, 89, 175, 375, 884
clash of -s, 605, 788
confusion of -s, 415
free -, 26, 118
fresh -, 412
individual -, 401
introduction of a -, 240
level of a -, 379
name-free -, 25, 343
named -, 35
nameless -, 40, 49, 50, 656, 811
- of a context, 884
programming -, 937
propositional -, 153
quasi -, 940
segment -, 33, 346
syntactic -, 401
verification, 23, 155, 556, 569, 805
- algorithm, 572
- of E-formulas, 574
- program, 47-49, 570
verifying program, 170, 783, 805, 809
vernacular, 865
vicious circle, 4
waiting list, 328
weak η-postponement, 494
weak δ-advancement, 500
weak Church-Rosser, 494
weak disjunction, 134
weak existence, 20, 134
weak functional behaviour, 34
weak normalization, 34, 39
weakening, 527, 581
weight, 348, 352
where-construction, 28, 244
while-statement, 54, 945, 955
Wiener Kreis, 203
word, 875
WOT, 161, 211
young, 880
Zermelo-Fraenkel axioms, 51, 846
Zermelo-Fraenkel set theory, 50, 866
zero abstraction index book, 294
Zorn's lemma, 138