Wolfram Pohlers
Proof Theory: The First Step into Impredicativity
Wolfram Pohlers
Universität Münster
Inst. Mathematische Logik und Grundlagenforschung
Einsteinstr. 62
48149 Münster
Germany
ISBN 978-3-540-69318-5
e-ISBN 978-3-540-69319-2
Library of Congress Control Number: 2008930149
Mathematics Subject Classification (2000): 03F03, 03F05, 03F15, 03F25, 03F30, 03F35
© 2009 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
Cover design: WMXDesign GmbH, Heidelberg
Printed on acid-free paper
9 8 7 6 5 4 3 2 1
springer.com
To Renate
Preface
The kernel of this book consists of a series of lectures on infinitary proof theory which I gave during my time at the Westfälische Wilhelms-Universität in Münster. It was planned as a successor to Springer Lecture Notes in Mathematics 1407. However, when preparing it, I decided to also include material which has not been treated in SLN 1407. Since the appearance of SLN 1407 many innovations in the area of ordinal analysis have taken place. To mention just those which are addressed in this book: Buchholz simplified local predicativity by the invention of operator controlled derivations (cf. Chapter 9, Chapter 11); Weiermann detected applications of methods of impredicative proof theory to the characterization of the provably recursive functions of predicative theories (cf. Chapter 10); Beckmann improved Gentzen’s boundedness theorem (which appears as Stage Theorem (Theorem 6.6.1) in this book) to Theorem 6.6.9, a theorem which is very satisfying in itself although its real importance lies in the ordinal analysis of systems weaker than those treated here.
Besides these innovations I also decided to include the analysis of the theory ($\Pi_2$-REF) as an example of a subtheory of set theory whose ordinal analysis only requires a first step into impredicativity. The ordinal analysis of the theory ($\Pi^0_1$-FXP)$_0$ of nonmonotone $\Pi^0_1$-definable inductive definitions in Chapter 13 is an application of the analysis of ($\Pi_2$-REF).
I have also put more emphasis on the development of the recursion theoretic background. This takes place in Chapters 5 and 6. Chapters 2 and 4 serve as a recapitulation of basic facts in general logic. These chapters are indeed very basic, and I am aware that this is boring for the more experienced reader. The book also contains some redundancies. They are intended to avoid the necessity of constantly paging back during reading (which is of course unavoidable in most cases). Chapter 3 on ordinals is misplaced since it is not seriously needed before Chapter 6.
However, I found no better place to put it. In contrast to SLN 1407, where I tried to develop the theory of ordinals on the basis of an (in fact incomplete) axiom system, I decided now to develop the theory on the basis of a not fully axiomatized naive set theory (as is common in everyday mathematics). According to well-established custom, ordinals are regarded in the set-theoretical sense. Those who regret that as a loss of constructivity are advised to stick to the notation systems developed from the set-theoretical study of the ordinals. Notations of ordinals may be viewed as syntactically defined objects, as discussed in Section 3.4.3.
As a warm-up and a basis for the things to come, Chapter 7 reviews and discusses Gentzen’s original results. Chapter 8 is more or less copied from SLN 1407 and treats the boundaries of predicativity in the narrow sense discussed in Section 8.3. In Chapter 9 we apply Buchholz’ technique of operator controlled derivations to obtain an upper bound for the ordinal $\kappa^{\mathrm{ID}_1}$, which coincides with the proof-theoretical ordinal of ID$_1$. Weiermann observed that the technique of operator controlled derivations is also applicable to characterize the provably recursive functions of arithmetic. In Chapter 10 we therefore present and discuss an adaptation of Weiermann’s theory of subrecursive functions to a study of the provably recursive functions of arithmetic. Chapter 11 starts with an introduction to set theory, this time a more detailed one, presents the axiom system KPω for Kripke–Platek set theory with infinity and the axiom system ($\Pi_2$-REF), and computes their proof-theoretical ordinals. As an application of the ordinal analysis of ($\Pi_2$-REF), Chapter 13 computes the proof-theoretical ordinal of nonmonotone arithmetical inductive definitions which are $\Pi^0_1$-definable. This analysis is basically built on an observation made by Robin Gandy a couple of decades ago and never published by him.
Chapter 12 was only inserted after the rest of the book was finished. Gerhard Jäger in his Habilitationsschrift pointed out that the naive view of the impredicativity of the systems ID$_1$ and KPω is not completely correct. Although the impredicativity of these systems seems to be manifested in the closure axiom ID$^1_1$ or the $\Delta_0$-collection axiom in KPω, respectively, their impredicativity arises only in conjunction with foundation.
It seemed important to me to include this result. Since Jäger’s work is published in [52], the original plan was just to include a few exercises in which Jäger’s results could be presented. However, due to the different framework in the present book, it turned into a whole chapter in which most of the results are nevertheless still stated as exercises with extended hints. Here I am especially indebted to Jan Carl Stegert, who checked the exercises there and suggested many improvements.
Chapter 1 displays my personal view of the development of proof theory. Since I am not a historian I cannot guarantee to have checked all sources in a professional way. According to my personal preference and the topic of this book I have put emphasis on ordinal analysis. Of course I am aware of the existence and importance of other parts of proof theory, but it was not my aim to get close to a complete history of proof theory.
I want to thank my assistants Christoph Heinatsch and Christoph Duchhardt for proofreading most parts of the book. Besides correcting a variety of errors they also suggested many improvements. Jan Carl Stegert checked the exercises not only in Chapter 12 but also in the other parts. I also thank Joachim Columbus and Andreas Schlüter, who proofread SLN 1407. The parts taken from there still rest on their corrections.
Of course I discussed the topics of the book with many of my colleagues. I am especially indebted to Andreas Weiermann, who not only developed the theory presented in Chapter 10 (although he is in no way responsible for any errors which may have sneaked into my adaptation of his theory) but also gave many suggestions in discussions. The counterexample in Exercise 6.7.6 is due to Arnold Beckmann. However, he not only contributed this counterexample but also assisted me on many other occasions during the preparation of this book. Another colleague who suffered frequently from my inquiries is Wilfried Buchholz, to whom we owe the theory of operator controlled derivations. Many thanks also to the students in our proof theory seminar. Their interest and thorough study helped to improve the book.
Last but not least I want to thank my wife for her patience during the long time it took to finish this work. This book is dedicated to her.
Münster, February 2008
Wolfram Pohlers
Contents
1 Historical Background
2 Primitive Recursive Functions and Relations
  2.1 Primitive Recursive Functions
  2.2 Primitive Recursive Relations
3 Ordinals
  3.1 Heuristics
  3.2 Some Basic Facts about Ordinals
  3.3 Fundamentals of Ordinal Arithmetic
    3.3.1 A Notation System for the Ordinals below $\varepsilon_0$
  3.4 The Veblen Hierarchy
    3.4.1 Preliminaries
    3.4.2 The Veblen Hierarchy
    3.4.3 A Notation System for the Ordinals below $\Gamma_0$
4 Pure Logic
  4.1 Heuristics
  4.2 First-Order and Second-Order Logics
  4.3 The TAIT-Calculus
  4.4 Trees and the Completeness Theorem
  4.5 GENTZEN’s Hauptsatz for Pure First-Order Logic
  4.6 Second-Order Logic
5 Truth Complexity for $\Pi^1_1$-Sentences
  5.1 The Language $\mathcal{L}(\mathrm{NT})$ of Arithmetic
  5.2 The TAIT-Language for Second-Order Arithmetic
  5.3 Truth Complexities for Arithmetical Sentences
  5.4 Truth Complexity for $\Pi^1_1$-Sentences
6 Inductive Definitions
  6.1 Motivation
  6.2 Inductive Definitions as Monotone Operators
  6.3 The Stages of an Inductive Definition
  6.4 Arithmetically Definable Inductive Definitions
  6.5 Inductive Definitions, Well-Orderings and Well-Founded Trees
  6.6 Inductive Definitions and Truth Complexities
  6.7 The $\Pi^1_1$-Ordinal of a Theory
7 The Ordinal Analysis for PA
  7.1 The Theory PA
  7.2 The Theory NT
  7.3 The Upper Bound
  7.4 The Lower Bound
  7.5 The Use of Gentzen’s Consistency Proof for Hilbert’s Programme
    7.5.1 On the Consistency of Formal and Semi-Formal Systems
    7.5.2 The Consistency of NT
    7.5.3 Kreisel’s Counterexample
    7.5.4 Gentzen’s Consistency Proof in the Light of Hilbert’s Programme
8 Autonomous Ordinals and the Limits of Predicativity
  8.1 The Language $\mathcal{L}_\kappa$
  8.2 Semantics for $\mathcal{L}_\kappa$
  8.3 Autonomous Ordinals
  8.4 The Upper Bound for Autonomous Ordinals
  8.5 The Lower Bound for Autonomous Ordinals
9 Ordinal Analysis of the Theory for Inductive Definitions
  9.1 The Theory ID$_1$
  9.2 The Language $\mathcal{L}_\infty(\mathrm{NT})$
  9.3 The Semi-Formal System for $\mathcal{L}_\infty(\mathrm{NT})$
    9.3.1 Semantical Cut-Elimination
    9.3.2 Operator Controlled Derivations
  9.4 The Collapsing Theorem for ID$_1$
  9.5 The Upper Bound
  9.6 The Lower Bound
    9.6.1 Coding Ordinals in $\mathcal{L}(\mathrm{NT})$
    9.6.2 The Well-Ordering Proof
  9.7 Alternative Interpretations for $\Omega$
10 Provably Recursive Functions of NT
  10.1 Provably Recursive Functions of a Theory
  10.2 Operator Controlled Derivations
  10.3 Iterating Operators
  10.4 Cut Elimination for Operator Controlled Derivations
  10.5 The Embedding of NT
  10.6 Discussion
11 Ordinal Analysis for Kripke–Platek Set Theory with Infinity
  11.1 Naive Set Theory
  11.2 The Language of Set Theory
  11.3 Constructible Sets
  11.4 Kripke–Platek Set Theory
  11.5 ID$_1$ as a Subtheory of KPω
  11.6 Variations of KPω and Axiom $\beta$
  11.7 The $\Sigma$-Ordinal of KPω
  11.8 The Theory ($\Pi_2$-REF)
  11.9 An Infinitary Verification Calculus for the Constructible Hierarchy
  11.10 A Semiformal System for Ramified Set Theory
  11.11 The Collapsing Theorem for Ramified Set Theory
  11.12 Ordinal Analysis for Kripke–Platek Set Theory
12 Predicativity Revisited
  12.1 Admissible Extension
  12.2 M-Logic
  12.3 Extending Semi-formal Systems
  12.4 Asymmetric Interpretations
  12.5 Reduction of $T^+$ to $T$
  12.6 The Theories KPn$^0$ and KPn
  12.7 The Theories KPl$^0$ and KPi$^0$
13 Nonmonotone Inductive Definitions
  13.1 Nonmonotone Inductive Definitions
  13.2 Prewellorderings
  13.3 The Theory ($\Pi^0_1$-FXP)$_0$
  13.4 ID$_1$ as Sub-Theory of ($\Pi^0_1$-FXP)$_0$
  13.5 The Upper Bound for $\|(\Pi^0_1\text{-FXP})_0\|$
14 Epilogue
References
Index
Chapter 1
Historical Background
The history of “Proof Theory” begins with the foundational crisis of mathematics in the first decades of the twentieth century. At the turn of the century, in reaction to the explosion of mathematical knowledge, endeavors began to provide the growing body of mathematics with a firm foundation. Some of the notions used then seemed to be problematic. This was especially true for those notions that embodied “infinities”. On the one hand, there was the notion of “infinitesimals”, which dealt with “infinity in the small”. The elimination of infinitesimals by limit processes meant great progress in giving existing mathematics a firm foundation. On the other hand, there was the notion of “infinity in the large”. Investigations on the uniqueness of the representation of functions by trigonometric series forced Georg Cantor to develop a completely new theory of the infinity in the large. One of the central points in Cantor’s theory was the possibility of collecting even infinitely many objects into a new object. These entities, which Cantor called “Mengen” (“sets” in English), were the mathematical subjects of Cantor’s research. Therefore, he called his new theory “Mengenlehre”, which is translated as “set theory”.
The possibility of forming sets unrestrictedly, however, led immediately to contradictions. One example is Russell’s paradox about the set of all sets which are not members of themselves. For $R := \{x \mid x \notin x\}$ we obtain $R \in R \Leftrightarrow R \notin R$.
These paradoxes, and probably also the seemingly paradoxical fact that the axiom of choice offered the possibility to well-order all sets, created a feeling of uncertainty among the mathematical community. Hermann Weyl, in his article “Über die neue Grundlagenkrise der Mathematik” [112], pointed out that circular definitions, which caused the paradoxes of set theory, are also used in analysis (cf. [111]). He introduced the term “new foundational crisis” into the discussion.
In his book “Das Kontinuum” [110] he had already proposed a development in mathematics that avoided “circular” definitions.
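The equivalence stated for Russell’s set above can be unfolded into its two directions. The following is only a sketch in LaTeX notation, assuming unrestricted set formation so that $R$ exists; the annotations are explanatory and not taken from the text.

```latex
\begin{align*}
R &:= \{x \mid x \notin x\} \\
R \in R &\;\Longrightarrow\; R \notin R
  && \text{(every member of $R$ satisfies $x \notin x$)} \\
R \notin R &\;\Longrightarrow\; R \in R
  && \text{($R$ then satisfies the defining condition of $R$)} \\
\text{hence}\quad R \in R &\;\Longleftrightarrow\; R \notin R .
\end{align*}
```

Both directions use nothing beyond the defining condition of $R$, which is why the contradiction is attributed to unrestricted set formation itself.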
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, © Springer-Verlag Berlin Heidelberg 2009
Even more significant was the criticism by Brouwer, who doubted even the logical basis of mathematics. His point of attack was the law of excluded middle (or tertium non datur), which permits proving the existence of objects without explicitly constructing them. Brouwer suggested developing mathematics on intuitive principles which exclude the law of excluded middle. Their formalization, mostly due to Heyting, is now known as intuitionistic logic.
Both approaches, Brouwer’s as well as Weyl’s, meant, however, rigid restrictions on mathematics. Hilbert, then one of the most influential mathematicians, was not willing to accept any restriction which could mutilate existing mathematics. Therefore he suggested a programme to save mathematics in its existing form. He had started to think about the foundations of mathematics very early. Problem number 2 in his famous list of unsolved mathematical problems, presented in 1900 at Paris, was to find axioms for the theory of reals and to show their consistency. Between 1917 and 1922, he was frequently lecturing on foundational problems (cf. [93]). His ongoing interest in foundational problems was certainly one of the reasons for Hilbert to engage Paul Bernays as assistant. Bernays became one of his most important collaborators in developing his programme, which is nowadays called Hilbert’s programme.
The first paper in which the term Beweistheorie, i.e., proof theory, is explicitly mentioned is [41], titled “Neubegründung der Mathematik”. In this paper, Hilbert responds to Weyl’s and Brouwer’s criticism and sketches his programme for the foundation of analysis. He advocates the axiomatic method (Axiomatische Methode), which means that mathematical objects and their interrelations should be defined solely by axioms. The remaining task is to show that these axioms are not contradictory. To perform this task a new mathematical discipline is needed: proof theory. The objects of research in this new discipline are mathematical proofs.
He sketches how this new discipline could work. He points out that a proof is a finite figure which we may think of as a tree whose root is the proved formula, whose leaves are axioms, and whose nodes are locally correct with respect to rules (Schlußschemata).¹ Therefore, the objects of proof theory are finite objects, and it should be possible to show by completely finite reasoning that the root of a proof tree can never be a contradiction.² In this way, Hilbert aimed to obtain a justification of the infinite by finite reasoning.³ This point of view is known as Hilbert’s finitist standpoint (Finiter Standpunkt). This standpoint was elaborated further in a talk “Die logischen Grundlagen der Mathematik” given in front of the Deutsche Naturforscher-Gesellschaft and printed in [42].
In [41], Hilbert summarized his programme into two steps. The first step is to formalize all existing mathematics.⁴ The second step is to develop and use proof theory to show that this formal system cannot produce contradictions.⁵
There have been early successes in performing Hilbert’s programme. Consistency proofs for weak subsystems of the axioms for number theory have been given by Wilhelm Ackermann [1], John von Neumann [65], Herbrand [39, 40], and Gerhard Gentzen [28, 29]. But all attempts to extend these proofs to stronger axiom systems which also include an axiom for mathematical induction failed. That this was not an accident followed from Gödel’s paper [37], in which he proved his famous incompleteness theorems. His first incompleteness theorem shows the impossibility of a (first-order) axiom system which proves all true arithmetical sentences while, even worse, his second incompleteness theorem states that a formal theory cannot prove its own consistency. Gödel’s theorems rest on the observation that metamathematical reasoning can be “goedelized”, i.e., can be coded by natural numbers. This is especially true for all finite reasoning. As a consequence, any finitist consistency proof should already be formalizable within the axioms of number theory (even a weaker axiom system which provides a coding machinery and a bit of mathematical induction suffices). So, according to Gödel’s second incompleteness theorem, no finitist consistency proof for the axioms of number theory, let alone for stronger systems, is possible.
In 1935, Bernays in his article on “Hilberts Untersuchungen über die Grundlagen der Mathematik” [9] gives an account of the state of proof theory and also discusses the influence of Gödel’s incompleteness theorems on Hilbert’s proof theory.
¹ In this paper he has only a single rule, modus ponens.
² Here Hilbert apparently anticipates the possibility of a calculus for reasoning which produces all the logical consequences of an axiom system, a fact which was proved only in 1930 by K. Gödel in [36].
³ In a talk “Über das Unendliche” given at Münster in June 1925 (printed in [42]) he states literally “daß das Operieren mit dem Unendlichen nur durch das Endliche gesichert werden kann”, i.e., “that operating with infinities can only be secured by finite means.”
⁴ “Alles, was bisher die eigentliche Mathematik ausmacht, wird nunmehr streng formalisiert, so daß die eigentliche Mathematik oder die Mathematik im engeren Sinne zu einem Bestande an beweisbaren Formeln wird. Die Formeln dieses Bestandes unterscheiden sich von den gewöhnlichen Formeln der Mathematik nur dadurch, daß außer den mathematischen Zeichen noch das Zeichen →, das Allzeichen und die Zeichen für Aussagen darin vorkommen. Dieser Umstand entspricht einer seit langem von mir vertretenen Überzeugung, daß wegen der engen Verknüpfung und Untrennbarkeit arithmetischer und logischer Wahrheiten ein simultaner Aufbau der Arithmetik und formalen Logik notwendig ist.” I.e., “Everything which hitherto has been part of actual mathematics will be rigidly formalized such that actual mathematics becomes a stock of provable formulas. The formulas in this stock are distinguished from the usual formulas of mathematics in so far that they will, beyond the mathematical symbols, also contain the symbol →, the symbol for generalization and symbols for propositions. This fact reflects my long stated conviction that the strong interrelation and inseparability of arithmetical and logical truth enforce a simultaneous development of arithmetic and formal logic.”
⁵ “Zu dieser eigentlichen Mathematik kommt gewissermaßen eine neue Mathematik, eine Metamathematik, hinzu, die zur Sicherung jener dient, indem sie vor dem Terror der unnötigen Verbote sowie der Not der Paradoxien schützt. In dieser Metamathematik kommt – im Gegensatz zu den rein formalen Schlußweisen der eigentlichen Mathematik – das inhaltliche Schließen zur Anwendung, und zwar zum Nachweis der Widerspruchsfreiheit der Axiome.” I.e., “To the real mathematics comes, so to speak, a new mathematics, a metamathematics, which is needed to secure the former by protecting it from the terror of unnecessary prohibitions as well as from the disaster of paradoxes. In order to show the consistency of the axioms this metamathematics needs, in contrast to the purely formal reasoning of real mathematics, contentual reasoning.”
While this paper was in print, Gerhard Gentzen [30] succeeded in giving a consistency proof for an axiom system for number theory which included the full scheme of mathematical induction. His proof, however, did not contradict Gödel’s second incompleteness theorem. It contained parts which could not be formalized within number theory. Nevertheless, the extensions needed seemed to be so tiny that Bernays (loc. cit.) argued that Gentzen’s proof meets the basic requirements of Hilbert’s finitist standpoint.⁶,⁷
In the beginning, Gentzen’s first proof was not completely understood. Therefore he withdrew Sects. 14–16 of [30] before printing and replaced them by a version in which he used Cantor’s new transfinite numbers to prove the termination of his reduction procedure on proof figures.⁸ So he succeeded in giving a consistency proof in which the “nonfinitist” means are concentrated in a single principle, an induction along a primitive-recursively definable well-ordering of the natural numbers of transfinite order type $\varepsilon_0$. The remaining parts of his proof were completely finitist.
According to Gödel’s second incompleteness theorem, induction along the well-ordering used in Gentzen’s proof cannot be proved from the axioms of number theory. Conversely, Gentzen in [31] showed that induction along any proper segment of this well-ordering is provable from these axioms. The order type $\varepsilon_0$ therefore provides a measure for the “amount of transfiniteness” of the axioms of number theory. Therefore, it is tempting to regard the order type of the shortest primitive recursive well-ordering which is needed in a consistency proof for a theory $T$ as characteristic for $T$.
That this idea is flawed was later detected by Georg Kreisel, who indicated in [62] that for any reasonable axiom system Ax there is a primitive recursive well-ordering $\prec$ of order type $\omega$ which proves the consistency of Ax by induction along $\prec$, using otherwise only finitist means.⁹
In [31], however, Gentzen showed more. He proved, without using Gödel’s second incompleteness theorem, that any primitive recursive well-ordering whose well-foundedness is provable from the axioms of number theory must have order type less than $\varepsilon_0$. From the previously mentioned result of [31], it follows that conversely for any order type $\alpha < \varepsilon_0$ there is a primitive recursive well-ordering of order type $\alpha$ whose well-foundedness can be proved from the axioms for number theory. This implies that $\varepsilon_0$ is the supremum of the order types of primitive recursive well-orderings whose well-foundedness can be proved from the axioms of number theory. Following Gentzen’s paper [31], we today define the proof-theoretical ordinal $\|T\|$ of a theory $T$ to be the supremum of the order types of primitive recursive well-orderings on the natural numbers whose well-foundedness can be derived from the axioms in $T$. This is a mathematically well-defined ordinal. For reasons that are explained later this ordinal is now also called the $\Pi^1_1$-ordinal of the theory $T$. But there is also the notion of a $\Pi^0_2$-ordinal of a theory¹⁰ and, most recently, also the notion of the $\Pi^0_1$-ordinal of a theory, which probably is closest to the original intention.
By an ordinal analysis of a theory $T$, we commonly understand the computation of its proof-theoretical ordinal, i.e., $\Pi^1_1$-ordinal. But ordinal analysis includes much more. A good account of the status of ordinal analysis is given by Michael Rathjen in [80].
Ordinal analyses for systems stronger than pure number theory were given later by Solomon Feferman in [23] and Kurt Schütte in [87] and [88], who independently characterized the limits of predicativity. Previous work in that direction was done by Paul Lorenzen [63].
An important paper for the further development of impredicative proof theory was Gaisi Takeuti’s 1953 paper [101], which showed that the generalization of Gentzen’s Hauptsatz for first-order logic to simple type theory entails the consistency of full second-order number theory. Takeuti conjectured that Gentzen’s Hauptsatz also holds for simple type theory. In 1960, Schütte presented in [86] a semantical equivalent to Takeuti’s conjecture. In 1966, William Tait proved in [97] Takeuti’s conjecture for second-order logic using Schütte’s semantical equivalent. Around 1967, Moto-o Takahashi in [100] and Dag Prawitz in [75] independently proved Schütte’s semantical equivalent, which entails Takeuti’s conjecture for full simple type theory.
⁶ “. . . ist von G. Gentzen der Nachweis für die Widerspruchsfreiheit des vollen zahlentheoretischen Formalismus erbracht worden, durch eine Methode, die den grundsätzlichen Anforderungen des finiten Standpunktes durchaus entspricht.” I.e., “. . . G. Gentzen has given a consistency proof for the formalism of full number theory by a method which basically meets the fundamental requirements of the finitist standpoint.”
⁷ In my personal opinion this is, even for number theory, not correct. See [73] §16 and Sect. 7.5 in this book.
⁸ The galley proofs of the original version of his proof, which used a form of the fan theorem, have been preserved. The original version is reprinted in [32]; an English translation is in the collected papers [96].
⁹ The well-ordering in Kreisel’s counterexample is, however, artificially defined (cf. Sect. 7.5.3). For all known “natural well-orderings” the least order type which is needed to prove the consistency of the axioms for arithmetic is $\varepsilon_0$. The problem is that we do not have a definition of “natural well-orderings” and, at least at the moment, have little hope of finding one.
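The definition of the proof-theoretical ordinal described above, together with Gentzen’s result for number theory, can be written compactly. The following is a sketch only: the symbols $\operatorname{otyp}$ (order type), $\mathrm{WF}$ (a formal statement of well-foundedness) and the label NT for the axioms of number theory are notational assumptions made here for illustration, not necessarily the notation of the text.

```latex
\[
  \|T\| \;:=\; \sup\bigl\{\, \operatorname{otyp}(\prec) \;:\;
     \prec \text{ is a primitive recursive well-ordering on } \mathbb{N}
     \text{ and } T \vdash \mathrm{WF}(\prec) \,\bigr\}
\]
\[
  \|\mathrm{NT}\| \;=\; \varepsilon_0 ,
  \qquad
  \varepsilon_0 \;=\; \sup\{\omega,\ \omega^{\omega},\ \omega^{\omega^{\omega}},\ \dots\}
  \;=\; \min\{\alpha : \omega^{\alpha} = \alpha\}
\]
```

The equation $\|\mathrm{NT}\| = \varepsilon_0$ combines the two results of [31] mentioned above: provable well-orderings have order type below $\varepsilon_0$, and every order type below $\varepsilon_0$ is attained by a provable one.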
In 1971 Girard showed in [33] that simple type theory not only allows cut elimination but also a terminating normalization procedure.¹¹ Disappointingly, all these results gave neither ordinal analyses nor essential proof-theoretic reductions, although Girard's results had a high impact on structural proof theory and theoretical computer science. The first results for impredicative subsystems of second-order number theory which also gave ordinal information were obtained by Takeuti in [102],¹² in which he proved the consistency of the system of $\Pi^1_1$-comprehension using transfinite induction on ordinal diagrams. This gave an upper bound for the $\Pi^1_1$-ordinal of
10 Cf. Sect. 10.
11 He generalized Gödel's system T of functionals of finite types to a system F of functionals of types possibly containing variables for types. Via the Curry-Howard isomorphism this system corresponds to a natural deduction system for finite types. Then he proved strong normalization for the system F by a generalization of Shoenfield's computability predicates for the system T. This generalization, however, was far from obvious since normalization for the system F implies cut elimination for simple type theory, which in turn by Takeuti's result entails the consistency of full second (and even higher) order number theory. Therefore Girard's computability predicates cannot be formalizable in full second-order number theory, and it is difficult to invent such predicates. Girard introduced the notion of "candidats de réductibilité", which was perfectly adapted to the situation.
12 Although Takeuti's emphasis was not on ordinal information but on consistency proofs which were as close as possible to Hilbert's programme.
this system, which was later extended by Takeuti and Yasugi in [105] to the system for $\Delta^1_2$-comprehension. Another way to attack the consistency problem was initiated around 1958 by Gödel [38]. He tried to establish the consistency of the axiom system for pure number theory by translating the provable sentences $F$ of pure number theory into sentences $F^*$ of the form $(\exists\alpha)(\forall\beta)\tilde{F}[\alpha,\beta]$, where $\alpha$ and $\beta$ are strings of variables for functionals of higher types and $\tilde{F}[\alpha,\beta]$ is a quantifier free formula. Then he showed that for every provable sentence $F$ of pure number theory there is a string of terms $f$ in his system $T$ of functionals of finite types such that $\tilde{F}[f,g]$ holds for all $g$ of appropriate types, which means that the quantifier free formula $\tilde{F}[f,g]$ becomes provable in $T$. All that is needed in this consistency proof is the fact that all the functionals in $T$ are computable. Gödel apparently took that for granted. A strict proof, however, again needs a transfinite induction along a well-ordering of order type $\varepsilon_0$, as shown by Howard [46]. There are simpler proofs for the computability of the functionals in $T$ using computability predicates; they probably were completely plain to Gödel and were later spelled out by Tait [98].¹³ These computability predicates are locally formalizable in number theory, which means that for every class of terms of restricted type complexity we can define their computability predicate within pure number theory. According to Gödel's second incompleteness theorem, however, the global computability predicate for all terms, which is needed for the consistency proof, is not formalizable.¹⁴ No ordinal information is gained from the proof via computability predicates. Gödel's paper can be viewed as a paradigmatic example of reductive proof theory. The theory $T$ is quantifier free, based solely on the defining equations for the functionals. No law of excluded middle is needed to show the interpretation theorem.
So we have a reduction to an intuitionistic quantifier free theory (but in higher types) which intuitively seems to be much less obscure than the theory PA of pure number theory with the full scheme of mathematical induction. But remember that from a strict mathematical viewpoint the consistency of the theory $T$ is of the same complexity as the consistency of the theory PA. So we have only a reduction, not a real consistency proof. Building on this way of attack, Clifford Spector in 1962 [95] developed a consistency proof for full second-order number theory by a functional interpretation using functionals defined by bar recursion on all finite types. Of course this proof again did not give any ordinal information. But Kreisel wanted to know whether systems for iterated inductive definitions could serve to model bar recursion of finite types. In 1963 [61], he introduced systems for generalized inductive definitions. Although it turned out that even iterated inductive definitions are much weaker than
13 Cf. also the proof given in Shoenfield's book [92].
14 The situation is comparable to the situation when we prove the consistency of number theory directly by truth predicates. The local versions, i.e., truth predicates for formulas of bounded complexity, are formalizable within number theory. However, the global version, which is needed for the consistency proof, cannot be formalized in number theory. Again a consistency proof via truth predicates gives no ordinal information.
full second-order number theory, the focus turned to systems for iterated generalized inductive definitions. This topic flourished at the 1968 conference on Intuitionism and Proof Theory in Buffalo. The proceedings [60] contain three important papers in this area. Harvey Friedman in [25] showed that the second-order theory with the $\Sigma^1_2$ axiom of choice can be interpreted in the system $(\Pi^1_1\text{-CA})_{<\varepsilon_0}$ of less than $\varepsilon_0$-fold iterated $\Pi^1_1$-comprehensions, and Feferman in [24] showed that less than $\nu$-fold iterated $\Pi^1_1$-comprehensions could be interpreted in systems $\mathrm{ID}_{<\nu}$ for less than $\nu$-fold iterated inductive definitions. Tait in [99] used cut elimination for an infinitary propositional logic whose formula complexity is measured by constructive number classes to obtain a consistency proof for second-order number theory with the scheme of $\Sigma^1_2$ dependent choice. No ordinal information could be gained from this proof, although it already carried the germs of ideas which later made an ordinal analysis for systems of iterated generalized inductive definitions possible. By a complicated passage through formal theories for choice sequences, it was known that the theory $\mathrm{ID}_1$ for one generalized inductive definition based on classical logic is reducible to the theory $\mathrm{ID}^i_1(O)$, which axiomatizes the first constructive number class based on intuitionistic logic. In $\mathrm{ID}^i_1(O)$, only the existence of the accessible part of a computably enumerable ordering is postulated. Accessible parts have a clear constructive meaning. Reduction to theories for accessibility predicates which are based on intuitionistic logic is therefore one of the aims of reductive proof theory. However, Zucker showed [115] that there are definitive obstacles to a straightforward reduction of the theories $\mathrm{ID}_\nu$ for $\nu > 1$ to intuitionistic accessibility theories. Such reductions were later obtained via ordinal analyses. The first ordinal analysis for the theory $\mathrm{ID}^i_1$ was obtained by William Howard [47] in 1972.
Via the known proof-theoretical reductions this also entailed an ordinal analysis for $\mathrm{ID}_1$. Ordinal analyses for theories for finitely iterated inductive definitions were later obtained by Pohlers in [68] and for transfinitely iterated inductive definitions in [69], using Takeuti's reduction procedure for $\Pi^1_1$-comprehension. Later, more perspicuous methods were established by Buchholz in [15], using his $\Omega_\nu$-rules, and Pohlers in [71, 70, 72], using the "method of local predicativity". An account of the history of that development by Feferman can be found in [15]. To get more perspicuous ordinal analyses also for the subtheories of second-order number theory, it was an obvious attempt to try to transfer the methods which were successful in the ordinal analyses of theories for iterated inductive definitions. Generalizing Buchholz' $\Omega_\nu$-rules to second-order number theory worked quite well and led to [56, 16] and the monograph [17], in which the results of Takeuti for $\Pi^1_1$-comprehension [102] and of Takeuti and Yasugi for the theory of $\Delta^1_2$-comprehension [104] could be reobtained by much more perspicuous techniques. But at that time it was not at all clear how this technique could be pushed essentially further. The obvious generalization of local predicativity to subtheories of second-order number theory meant extending the ramified analytic hierarchy, the familiar tool in predicative proof theory, to stages beyond $\omega_1^{CK}$, thus leading to proof-theoretic
ordinals bigger than $\Gamma_0$, the limiting ordinal of predicativity.¹⁵ This approach, however, needed sets which were no longer sets of natural numbers and therefore lay outside the ramified analytic hierarchy. This required a coding machinery which turned out to be unmanageable. The natural remedy was to step outside of the ramified analytic hierarchy, which can be viewed as Gödel's constructible universe intersected with the powerset of the natural numbers, and to work directly in the constructible hierarchy. The pioneering work in this direction was done by Jäger, starting with his Diploma thesis, continuing with his Dissertation [48] and finally in [55] and his monograph [52].¹⁶ One of the highlights of this approach was the analysis of the theory KPi by Jäger and Pohlers in [55], which corresponds to $\Delta^1_2$-comprehension plus bar induction on the side of subsystems of second-order number theory. By showing the well-foundedness of the notation system needed in the analysis of KPi within Feferman's theory $T_0$ for explicit mathematics, Jäger established the missing direction $(\Delta^1_2\text{-CA})+(\mathrm{BI}) \le T_0$ in the proof-theoretic equivalence of $T_0$ with $(\Delta^1_2\text{-CA})+(\mathrm{BI})$. The other direction had already been established by work of Feferman. The strongest theory analyzed so far is parameter free $\Pi^1_2$-comprehension by Rathjen [81], which on the side of set theory corresponds to the theory of $\Sigma_1$ separation. He arrived there by successively analyzing the theory KPM axiomatizing a recursively Mahlo universe [77], the theory of $\Pi_3$-reflection [78] and [79], and the theory of stability [82]. This analysis still works within the constructible hierarchy but uses methods which by far exceed the method of local predicativity presented in this volume.¹⁷ This book concentrates on the first step into impredicativity.
After a short and therefore rather incomplete review of predicative proof theory, we are going to analyze theories on the level of a noniterated inductive definition whose ordinal strength is measured by the Bachmann-Howard ordinal. Stronger theories will need a second and further steps into impredicativity, which are supposed to become the topic of another monograph.
15 Cf. Chap. 8.
16 Part of the results presented there are in Chap. 12.
17 In this connection one should also mention the work of Toshiyasu Arai who, mainly building on Takeuti's approach, also got results for strong axiom systems (e.g. [3]).
Chapter 2
Primitive Recursive Functions and Relations
To fix notations and to give a rough overview of some of the recursion theoretic background which we are going to use in this book we start with a chapter on primitive recursive functions and relations. Of course, we cannot be exhaustive here. For further studies we recommend a textbook on computability theory. An old, but still very good source is Rogers’ book [84].
2.1 Primitive Recursive Functions
The natural numbers can be viewed as generated by a counting process. We start counting with zero. So every natural number is either zero or the immediate successor of another natural number. This counting process is the basis for the definition of the class of primitive recursive functions, a subclass of the functions on the natural numbers which are effectively computable. We obtain the class of primitive recursive functions by first introducing a class of formal terms, the primitive recursive function terms, and then defining their evaluation on the natural numbers.
2.1.1 Definition The primitive recursive function terms are inductively defined by the following clauses.
• The symbol $S$ (for the successor function) is a unary primitive recursive function term.
• The symbol $C^n_k$ (for the function with constant value $k$) is an $n$-ary primitive recursive function term.
• For $1 \le k \le n$ the symbol $P^n_k$ (for the projection on the $k$th component) is an $n$-ary primitive recursive function term.
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, c Springer-Verlag Berlin Heidelberg 2009
• If $h_1, \ldots, h_m$ are $n$-ary primitive recursive function terms and $g$ is an $m$-ary primitive recursive function term then $\mathrm{Sub}(g, h_1, \ldots, h_m)$ is an $n$-ary primitive recursive function term (substitution of functions).
• If $g$ is an $n$-ary and $h$ an $(n+2)$-ary primitive recursive function term then $\mathrm{Rec}(g, h)$ is an $(n+1)$-ary primitive recursive function term (primitive recursion).
2.1.2 Definition For an $n$-ary primitive recursive function term $f$ and an $n$-tuple $z_1, \ldots, z_n$ of natural numbers we define the evaluation $\mathrm{ev}(f, z_1, \ldots, z_n)$ inductively by the following clauses:
• $\mathrm{ev}(S, z_1) = z$ if $z$ is the successor of $z_1$.
• $\mathrm{ev}(C^n_k, z_1, \ldots, z_n) = z$ if $z = k$.
• $\mathrm{ev}(P^n_k, z_1, \ldots, z_n) = z$ if $z = z_k$.
• $\mathrm{ev}(\mathrm{Sub}(g, h_1, \ldots, h_m), z_1, \ldots, z_n) = z$ holds if there are natural numbers $u_1, \ldots, u_m$ such that $\mathrm{ev}(h_i, z_1, \ldots, z_n) = u_i$ for $i = 1, \ldots, m$ and $\mathrm{ev}(g, u_1, \ldots, u_m) = z$.
• $\mathrm{ev}(\mathrm{Rec}(g, h), k, z_1, \ldots, z_n) = z$ holds if there are natural numbers $u_0, \ldots, u_k$ such that $u_k = z$, $\mathrm{ev}(g, z_1, \ldots, z_n) = u_0$ and $\mathrm{ev}(h, i, u_i, z_1, \ldots, z_n) = u_{i+1}$ for $i = 0, \ldots, k-1$.
2.1.3 Definition A function $F\colon \mathbb{N}^n \to \mathbb{N}$ is primitive recursive if and only if there is an $n$-ary primitive recursive function term $f$ such that for every $n$-tuple $z_1, \ldots, z_n$ of natural numbers we have $\mathrm{ev}(f, z_1, \ldots, z_n) = F(z_1, \ldots, z_n)$.
Let $\mathcal{F}$ be a class of number theoretic functions. By $\mathcal{F}^n$ we denote the $n$-ary functions in $\mathcal{F}$. We say that $\mathcal{F}$ is closed under substitution if $H_1, \ldots, H_m \in \mathcal{F}^n$ and $G \in \mathcal{F}^m$ imply that the function $F$ defined by $F(z_1, \ldots, z_n) := G(H_1(z_1, \ldots, z_n), \ldots, H_m(z_1, \ldots, z_n))$ also belongs to $\mathcal{F}$. We say that $\mathcal{F}$ is closed under primitive recursion if for $G \in \mathcal{F}^n$ and $H \in \mathcal{F}^{n+2}$ the function $F$, which is uniquely defined by the recursion equations $F(0, z_1, \ldots, z_n) = G(z_1, \ldots, z_n)$ and $F(i+1, z_1, \ldots, z_n) = H(i, F(i, z_1, \ldots, z_n), z_1, \ldots, z_n)$, also belongs to $\mathcal{F}$.
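Definitions 2.1.1 and 2.1.2 can be transcribed almost literally into a program. The following Python sketch is only an illustration, not part of the formal development: the class names `S`, `C`, `P`, `Sub`, `Rec` mirror the term symbols, while the representation by dataclasses, the function name `ev` taking a Python argument list, and the sample terms `add` and `mul` are assumptions made here.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class S:
    """The unary successor symbol S."""


@dataclass(frozen=True)
class C:
    """The symbol C^n_k: n-ary, constant value k."""
    n: int
    k: int


@dataclass(frozen=True)
class P:
    """The symbol P^n_k: n-ary projection on the k-th component, 1 <= k <= n."""
    n: int
    k: int


@dataclass(frozen=True)
class Sub:
    """Sub(g, h_1, ..., h_m): substitution of the h_i into g."""
    g: "Term"
    hs: Tuple["Term", ...]


@dataclass(frozen=True)
class Rec:
    """Rec(g, h): primitive recursion with base term g and step term h."""
    g: "Term"
    h: "Term"


Term = Union[S, C, P, Sub, Rec]


def ev(f, *z):
    """Evaluate the term f on the natural numbers z, clause by clause."""
    if isinstance(f, S):
        (z1,) = z
        return z1 + 1
    if isinstance(f, C):
        assert len(z) == f.n
        return f.k
    if isinstance(f, P):
        assert len(z) == f.n and 1 <= f.k <= f.n
        return z[f.k - 1]
    if isinstance(f, Sub):
        u = [ev(h, *z) for h in f.hs]
        return ev(f.g, *u)
    if isinstance(f, Rec):
        k, rest = z[0], z[1:]
        u = ev(f.g, *rest)              # u_0
        for i in range(k):              # u_{i+1} = ev(h, i, u_i, z_1, ..., z_n)
            u = ev(f.h, i, u, *rest)
        return u                        # u_k
    raise TypeError(f"not a primitive recursive function term: {f!r}")


# Hypothetical sample terms: addition and multiplication, recursing on the first argument.
add = Rec(P(1, 1), Sub(S(), (P(3, 2),)))
mul = Rec(C(1, 0), Sub(add, (P(3, 2), P(3, 3))))
```

With these sample terms, `ev(add, 3, 4)` evaluates to 7 and `ev(mul, 3, 4)` to 12. The loop in the `Rec` clause computes the witnesses $u_0, \ldots, u_k$ required by the definition one after the other.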
2.1.4 Theorem The primitive recursive functions form the smallest class of functions which contains the basic functions "successor", "constant functions" and "projections" and is closed under substitution and primitive recursion.
Proof Let PRF denote the class of primitive recursive functions. By definition PRF contains the basic functions. If $H_1, \ldots, H_m$ are $n$-ary primitive recursive functions and $G$ is an $m$-ary primitive recursive function then there are primitive recursive function terms $h_1, \ldots, h_m$ and $g$ such that $H_i(z_1, \ldots, z_n) = \mathrm{ev}(h_i, z_1, \ldots, z_n)$ holds true for $i = 1, \ldots, m$ and $G(u_1, \ldots, u_m) = \mathrm{ev}(g, u_1, \ldots, u_m)$. Then $\mathrm{Sub}(g, h_1, \ldots, h_m)$ is an $n$-ary primitive recursive function term satisfying $\mathrm{ev}(\mathrm{Sub}(g, h_1, \ldots, h_m), z_1, \ldots, z_n) = G(H_1(z_1, \ldots, z_n), \ldots, H_m(z_1, \ldots, z_n))$. This shows that PRF is closed under substitution. Now assume that $G$ and $H$ are $n$-ary and $(n+2)$-ary primitive recursive functions and $F$ is defined by the recursion equations $F(0, z_1, \ldots, z_n) = G(z_1, \ldots, z_n)$ and $F(i+1, z_1, \ldots, z_n) = H(i, F(i, z_1, \ldots, z_n), z_1, \ldots, z_n)$. There are primitive recursive function terms $g$ and $h$ such that $\mathrm{ev}(g, z_1, \ldots, z_n) = G(z_1, \ldots, z_n)$ and $\mathrm{ev}(h, z_1, \ldots, z_{n+2}) = H(z_1, \ldots, z_{n+2})$. Then $\mathrm{Rec}(g, h)$ is an $(n+1)$-ary primitive recursive function term. We show $\mathrm{ev}(\mathrm{Rec}(g, h), i, z_1, \ldots, z_n) = F(i, z_1, \ldots, z_n)$ by induction on $i$. The base case follows from $\mathrm{ev}(\mathrm{Rec}(g, h), 0, z_1, \ldots, z_n) = u_0 = \mathrm{ev}(g, z_1, \ldots, z_n) = G(z_1, \ldots, z_n) = F(0, z_1, \ldots, z_n)$. For the induction step put $u_i := \mathrm{ev}(\mathrm{Rec}(g, h), i, z_1, \ldots, z_n)$, so that $u_i = F(i, z_1, \ldots, z_n)$ by the induction hypothesis. Then $\mathrm{ev}(\mathrm{Rec}(g, h), i+1, z_1, \ldots, z_n) = \mathrm{ev}(h, i, u_i, z_1, \ldots, z_n) = H(i, F(i, z_1, \ldots, z_n), z_1, \ldots, z_n) = F(i+1, z_1, \ldots, z_n)$. It remains to show that PRF is the least such class. Assume that $\mathcal{F}$ is a class of number theoretic functions having the required closure properties.
By induction on the definition of the primitive recursive function term $f$, we show that its evaluation belongs to $\mathcal{F}$. This is obvious if $f$ is one of the terms $S$, $C^n_k$ or $P^n_k$. For a composed term $f = \mathrm{Sub}(g, h_1, \ldots, h_m)$ the evaluations of $g$ and $h_j$ belong to $\mathcal{F}$ by the induction hypothesis. Since $\mathcal{F}$ is closed under substitution, the evaluation of $f$ belongs to $\mathcal{F}$, too. If $F$ is the evaluation of the composed term $\mathrm{Rec}(g, h)$, then $F$ satisfies the recursion equations $F(0, z_1, \ldots, z_n) = G(z_1, \ldots, z_n)$ and $F(i+1, z_1, \ldots, z_n) = H(i, F(i, z_1, \ldots, z_n), z_1, \ldots, z_n)$ where $G$ and $H$ are the evaluations of the terms $g$ and $h$. By the induction hypothesis, $G$ and $H$ belong to $\mathcal{F}$. Since $\mathcal{F}$ is closed under primitive recursion, $F$ belongs to $\mathcal{F}$.
We prove some basic closure properties of the primitive recursive functions which we will need later. For more detailed studies we recommend any textbook on recursion theory. The classical reference is [84]. First we observe that many of the familiar number theoretic functions are primitive recursive (cf. the exercises). Among them is bounded summation satisfying the recursion equations
$$\sum_{i=0}^{0} F(i) = F(0) \quad\text{and}\quad \sum_{i=0}^{x+1} F(i) = F(x+1) + \sum_{i=0}^{x} F(i)$$
where $F$ is a primitive recursive function. Likewise, we check that the bounded product $\prod_{i=0}^{x} F(i)$ (as a function of its upper bound $x$) is primitive recursive, too. Important primitive recursive functions are the case distinction functions $\mathrm{sg}$ and $\overline{\mathrm{sg}}$ defined by the recursion equations $\mathrm{sg}(0) := 0$ and $\mathrm{sg}(n+1) := 1$, and $\overline{\mathrm{sg}}(0) := 1$ and $\overline{\mathrm{sg}}(n+1) := 0$. The predecessor function $\mathrm{pred}(n)$ satisfies the recursion equations $\mathrm{pred}(0) = 0$ and $\mathrm{pred}(n+1) = n$ and is thus primitive recursive. Then we obtain the arithmetical difference of two natural numbers as the binary primitive recursive function satisfying the recursion equations $n \dot{-} 0 = n$ and $n \dot{-} (x+1) = \mathrm{pred}(n \dot{-} x)$. The absolute difference between two natural numbers is the primitive recursive function defined by $|m - n| := (n \dot{-} m) + (m \dot{-} n)$.
2.1.5 Exercise Show that addition, multiplication and exponentiation of natural numbers are primitive recursive.
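The recursion equations for the auxiliary functions of this section can be checked mechanically. The following Python sketch assumes a helper `prim_rec(g, h)` (a name chosen here, not taken from the text) that builds the function defined by primitive recursion from a base function `g` and a step function `h`; the other names follow the text, with `cosg` standing for $\overline{\mathrm{sg}}$ and `monus` for $\dot{-}$.

```python
def prim_rec(g, h):
    """Function f with f(0, z...) = g(z...) and f(i + 1, z...) = h(i, f(i, z...), z...)."""
    def f(k, *z):
        u = g(*z)
        for i in range(k):
            u = h(i, u, *z)
        return u
    return f


# Case distinction functions and predecessor, each by its recursion equations.
sg = prim_rec(lambda: 0, lambda i, u: 1)
cosg = prim_rec(lambda: 1, lambda i, u: 0)
pred = prim_rec(lambda: 0, lambda i, u: i)

# The recursion for n -. x runs on x, so x is the first argument of the helper.
_monus_on_x = prim_rec(lambda n: n, lambda i, u, n: pred(u))


def monus(n, x):
    """Arithmetical difference: n -. 0 = n and n -. (x + 1) = pred(n -. x)."""
    return _monus_on_x(x, n)


def absdiff(m, n):
    """Absolute difference |m - n| := (n -. m) + (m -. n)."""
    return monus(n, m) + monus(m, n)


def bounded_sum(F):
    """x |-> F(0) + ... + F(x), following the recursion equations for bounded summation."""
    return prim_rec(lambda: F(0), lambda i, u: F(i + 1) + u)
```

For instance `monus(3, 5)` is 0 while `monus(5, 3)` is 2, and `bounded_sum(lambda i: i)(4)` is 10.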
2.2 Primitive Recursive Relations
In the following we identify $n$-ary relations on a set $N$ with subsets of $N^n$.
2.2.1 Definition An $n$-ary relation $R \subseteq \mathbb{N}^n$ is primitive recursive if and only if its characteristic function
$$\chi_R(z_1, \ldots, z_n) := \begin{cases} 1 & \text{if } (z_1, \ldots, z_n) \in R \\ 0 & \text{otherwise} \end{cases}$$
is primitive recursive.
2.2.2 Lemma The primitive recursive relations are closed under the boolean operations $\neg$, $\wedge$ and $\vee$, bounded quantification and substitution with primitive recursive functions.
Proof Closure under boolean operations is obvious since $\chi_{\neg P} = \overline{\mathrm{sg}} \circ \chi_P := \mathrm{Sub}(\overline{\mathrm{sg}}, \chi_P)$, $\chi_{P \wedge Q} = \chi_P \cdot \chi_Q := \mathrm{Sub}(\cdot, \chi_P, \chi_Q)$ and $\chi_{P \vee Q} = \mathrm{sg}(\chi_P + \chi_Q) := \mathrm{Sub}(\mathrm{sg}, \mathrm{Sub}(+, \chi_P, \chi_Q))$. To show closure under bounded quantification let $P$ be an $(n+1)$-ary relation and define $Q(x, z_1, \ldots, z_n) :\Leftrightarrow (\exists y \le x)P(y, z_1, \ldots, z_n)$. Then
$$\chi_Q(x, z_1, \ldots, z_n) = \mathrm{sg}\Big(\sum_{i=0}^{x} \chi_P(i, z_1, \ldots, z_n)\Big)$$
which shows that the primitive recursive relations are closed under bounded existential quantification. Because of the closure under negation this also entails the closure under bounded universal quantification. The closure under substitution with primitive recursive functions follows directly from the fact that the primitive recursive functions are closed under substitution.
Using the closure properties of primitive recursive functions we recognize many of the familiar relations on the natural numbers as primitive recursive. Since $x = y \Leftrightarrow |x - y| = 0$ we obtain $\chi_=(m, n) = \overline{\mathrm{sg}}(|m - n|)$. Equality is therefore a primitive recursive relation. More primitive recursive relations are listed in Table 2.1.
The next aim is to show that there is a primitive recursive coding machinery for tuples of natural numbers. Let $\mathbb{N}^* := \bigcup_{n \in \mathbb{N}} \mathbb{N}^n$ denote all finite tuples of natural numbers. A coding machinery is a one-to-one mapping
$$\langle\ \rangle\colon \mathbb{N}^* \xrightarrow{\;1\text{-}1\;} \mathbb{N}$$
for which we write $\langle x_1, \ldots, x_n \rangle$ instead of $\langle\ \rangle(x_1, \ldots, x_n)$. A coding machinery induces a relation $\mathrm{Seq} := \mathrm{rng}(\langle\ \rangle)$, the length function satisfying
Table 2.1 Some primitive recursive relations
Relation              Notation            Definition
Equality              $x = y$             $\chi_=(x, y) = \overline{\mathrm{sg}}(|x - y|)$
Less or equal than    $x \le y$           $(\exists z \le y)[y = x + z]$
Less than             $x < y$             $x \le y \wedge x \ne y$
x divides y           $x/y$               $(\exists z \le y)[y = x \cdot z]$
p is a prime          $\mathrm{Prime}(p)$ $p \ne 0 \wedge p \ne 1 \wedge (\forall z \le p)[\neg(z/p) \vee z = 1 \vee z = p]$
$\mathrm{Seq}(x) \Rightarrow \mathrm{lh}(x) = \min\{n \mid \langle\ \rangle^{-1}(x) \in \mathbb{N}^n\}$ and the decoding functions $(x)_i$ satisfying $(\langle x_0, \ldots, x_n \rangle)_i = x_i$ for all $i \le n$. We call a coding machinery primitive recursive if all restrictions $\langle\ \rangle \restriction \mathbb{N}^n$ of the encoding function, the induced relation Seq, the length function lh and the decoding functions are primitive recursive. The elements of Seq are the sequence numbers. Let $\mathrm{Pnb}(k)$ denote the $k$th prime. Defining
$$\langle\ \rangle = 0 \quad\text{and}\quad \langle x_0, \ldots, x_n \rangle = \prod_{i=0}^{n} \mathrm{Pnb}(i)^{x_i + 1} \qquad (2.1)$$
we obtain a one-to-one map from $\mathbb{N}^*$ into $\mathbb{N}$ by the unique prime decomposition of natural numbers. This induces a primitive recursive coding machinery (cf. Exercise 2.2.8). We define the concatenation of sequence numbers by
$$\langle x_0, \ldots, x_m \rangle {}^\frown \langle y_0, \ldots, y_n \rangle := \langle x_0, \ldots, x_m, y_0, \ldots, y_n \rangle.$$
For every function $F\colon \mathbb{N}^{n+1} \to \mathbb{N}$ we obtain its course-of-values function $[F]$ satisfying the recursion equations $[F](0, z_1, \ldots, z_n) := \langle\ \rangle$ (the empty sequence) and $[F](i+1, z_1, \ldots, z_n) := [F](i, z_1, \ldots, z_n) {}^\frown \langle F(i, z_1, \ldots, z_n) \rangle$. Observe that $[F](n+1) = \langle F(0), \ldots, F(n) \rangle$.¹ It follows from the recursion equations that $[F]$ is primitive recursive if $F$ is primitive recursive. The converse is also true since $F(x) = ([F](x+1))_x$. So $[F]$ is primitive recursive if and only if $F$ is primitive recursive.
2.2.3 Theorem (Course-of-values recursion) Let $G$ be an $(n+2)$-ary function. There is a uniquely determined function $F$ satisfying the equation $F(x, z_1, \ldots, z_n) = G(x, [F](x, z_1, \ldots, z_n), z_1, \ldots, z_n)$. If $G$ is primitive recursive then $F$ is also primitive recursive.
Proof We obtain the uniqueness by a simple induction on $x$. For the second claim assume that $G$ is primitive recursive. It suffices to show that $[F]$ is primitive recursive. According to the recursion equations for $[F]$ we obtain $[F](i+1, z_1, \ldots, z_n) = [F](i, z_1, \ldots, z_n) {}^\frown \langle G(i, [F](i, z_1, \ldots, z_n), z_1, \ldots, z_n) \rangle$, which entails that $[F]$ is primitive recursive.
1 If we regard a natural number $n+1$ as the set $\{0, \ldots, n\}$ then the image of $n+1$ under $F$ is the set $F[n+1] = \{F(i) \mid i \le n\}$. Therefore we sometimes use the notations $[F](n)$ and $F[n]$ interchangeably if the ordering of the values of $F$ below $n$ is unimportant.
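The coding by prime powers in (2.1), the concatenation of sequence numbers and course-of-values recursion can be tried out on small inputs. The following Python sketch makes several assumptions that are not in the text: the primes are counted from $\mathrm{Pnb}(0) = 2$, the function names `encode`, `decode`, `concat` and `course_of_values` are chosen here, and the Fibonacci numbers serve as a hypothetical sample recursion. The code only computes the functions; it does not show that they are primitive recursive.

```python
import math


def is_prime(p):
    return p >= 2 and all(p % d for d in range(2, math.isqrt(p) + 1))


def pnb(k):
    """The k-th prime, assuming the count starts with Pnb(0) = 2."""
    p, seen = 1, -1
    while seen < k:
        p += 1
        if is_prime(p):
            seen += 1
    return p


def encode(*xs):
    """<x_0, ..., x_n> as in (2.1); the empty sequence has code 0."""
    if not xs:
        return 0
    code = 1
    for i, x in enumerate(xs):
        code *= pnb(i) ** (x + 1)
    return code


def decode(code):
    """The tuple coded by a sequence number, or None if code is not in Seq."""
    if code == 0:
        return ()
    if code == 1:
        return None
    xs, i = [], 0
    while code > 1:
        p, e = pnb(i), 0
        while code % p == 0:
            code //= p
            e += 1
        if e == 0:              # a prime is missing, so code is not in the range
            return None
        xs.append(e - 1)
        i += 1
    return tuple(xs)


def lh(code):
    """Length of the sequence coded by a sequence number."""
    return len(decode(code))


def concat(a, b):
    """Concatenation of two sequence numbers."""
    return encode(*(decode(a) + decode(b)))


def course_of_values(G):
    """Return ([F], F) for the F with F(x, z...) = G(x, [F](x, z...), z...)."""
    def hist(x, *z):
        code = encode()         # [F](0, z...) is the empty sequence
        for i in range(x):
            code = concat(code, encode(G(i, code, *z)))
        return code

    def F(x, *z):
        return G(x, hist(x, *z), *z)

    return hist, F


def fib_step(x, code):
    """Hypothetical sample G: reads the two previous values from the history."""
    if x < 2:
        return x
    past = decode(code)
    return past[x - 1] + past[x - 2]


fib_hist, fib = course_of_values(fib_step)
```

Here `encode(1, 0, 2)` is $2^2 \cdot 3^1 \cdot 5^3 = 1500$, the number 10 is not a sequence number because the prime 3 is skipped, and `decode(fib_hist(6))` returns the first six Fibonacci numbers.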
The main application of course-of-values recursion is the definition of relations.
2.2.4 Definition We say that a relation $Q$ is primitive recursive in the relations $P_1, \ldots, P_n$ if there is a primitive recursive relation $R$ such that $\vec{x} \in Q \Leftrightarrow R(\chi_{P_1}(\vec{x}), \ldots, \chi_{P_n}(\vec{x}), \vec{x})$. For short we sometimes (in abuse of notation) express that by $Q = R(P_1, \ldots, P_n)$.
2.2.5 Observation If a relation $Q$ is obtained from relations $P_1, \ldots, P_n$ by Boolean operations, bounded quantification and substitution with primitive recursive functions, then $Q$ is primitive recursive in $P_1, \ldots, P_n$.
2.2.6 Theorem (Course-of-values recursion for relations) Assume that $R$ is a primitive recursive $(n+2)$-ary relation. Then there is a uniquely determined primitive recursive $(n+1)$-ary relation $Q$ satisfying $Q(k, x_1, \ldots, x_n) \Leftrightarrow R([\chi_Q](k, x_1, \ldots, x_n), k, x_1, \ldots, x_n)$.
Proof We use course-of-values recursion to define the characteristic function of $Q$. We obtain $\chi_Q(k, z_1, \ldots, z_n) = \chi_R([\chi_Q](k, z_1, \ldots, z_n), k, z_1, \ldots, z_n)$. Then $\chi_Q$ is primitive recursive according to Theorem 2.2.3. The uniqueness of $Q$ follows by an easy induction on $k$.
Theorem 2.2.6 opens the possibility to define primitive recursive relations implicitly, with the proviso that only smaller arguments may be used in the defining part. We will use that on many occasions. We will freely use the $\lambda$-notation for functions. I.e., if $f$ is a function with $n$ arguments then $\lambda x_i . f(x_1, \ldots, x_n)$ denotes the function $x_i \mapsto f(x_1, \ldots, x_n)$. Similarly, if $t$ is some term containing a free variable $x$ we denote by $\lambda x . t(x)$ the function $x \mapsto t(x)$.
2.2.7 Exercise Let $P$ be an $(n+1)$-ary primitive recursive relation. Show that the function defined by the bounded search operator
$$\mu y \le x . P(y, z_1, \ldots, z_n) := \begin{cases} \min\{y \le x \mid P(y, z_1, \ldots, z_n)\} & \text{if this exists} \\ x + 1 & \text{otherwise} \end{cases}$$
is primitive recursive.
2.2.8 Exercise Show that the encoding function defined in (2.1) induces a primitive recursive coding machinery. Hint: Use bounded search to show that Pnb is a primitive recursive function.
2.2.9 Exercise Show that the concatenation function is primitive recursive.
2.2.10 Exercise Prove Observation 2.2.5.
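The bounded quantifiers of Table 2.1 and the bounded search operator of Exercise 2.2.7 are easy to experiment with. The following Python sketch only computes these operations and is not a solution of the exercises, which ask for a proof of primitive recursiveness. The function names are chosen here. The search bound $2p$ for the next prime after $p$ relies on Bertrand's postulate, and the count starting at $\mathrm{Pnb}(0) = 2$ is an assumption; other bounds, for instance $p! + 1$, would serve equally well.

```python
def bounded_exists(P, x, *z):
    """(exists y <= x) P(y, z...)."""
    return any(P(y, *z) for y in range(x + 1))


def bounded_forall(P, x, *z):
    """(forall y <= x) P(y, z...)."""
    return all(P(y, *z) for y in range(x + 1))


def divides(x, y):
    """x/y defined as (exists z <= y)[y = x * z], as in Table 2.1."""
    return bounded_exists(lambda z: y == x * z, y)


def prime(p):
    """Prime(p) as in Table 2.1."""
    return p != 0 and p != 1 and bounded_forall(
        lambda z: not divides(z, p) or z == 1 or z == p, p)


def bounded_mu(P, x, *z):
    """mu y <= x . P(y, z...): the least such y, or x + 1 if there is none."""
    for y in range(x + 1):
        if P(y, *z):
            return y
    return x + 1


def pnb(k):
    """The k-th prime by iterated bounded search (assumes Pnb(0) = 2)."""
    p = 2
    for _ in range(k):
        # Bertrand's postulate: some prime q satisfies p < q <= 2p.
        p = bounded_mu(lambda y: y > p and prime(y), 2 * p)
    return p
```

For example `bounded_mu(lambda y: y * y >= 10, 5)` is 4, while `bounded_mu(lambda y: y > 10, 5)` returns the default value 6.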
Chapter 3
Ordinals
Ordinals play a predominant role in proof-theoretical research. We need ordinals to measure infinitary objects and the run time of infinitary processes. An example of an infinitary process is the stepwise construction of fixed-points in Sect. 6.3, an example for infinitary objects are infinite well-founded trees as introduced in Sect. 5. This chapter is purposely misplaced. Ordinals are not needed before Sect. 4.4 where we have to measure the complexity of infinite trees. Even there only a superficial knowledge of ordinals is needed. The chapter on inductive definitions (Chap. 6) requires a little more ordinal theoretic background. Only starting from Chap. 7 a more profound knowledge of ordinals and ordinal notations is needed. However, we found no better place for this chapter. In a first reading this chapter can, therefore, be omitted and revisited later, when ordinals start to play their predominant role.
3.1 Heuristics
Ordinals generalize the finite ordinals, first, second, third, . . . into the transfinite. A natural number has two aspects. One aspect is that of a cardinal which measures quantity or size. In that sense the natural number $n$ can be viewed as a representative for the finite sets having exactly $n$ elements. The second aspect is that of a finite ordinal, which represents the process of counting. Counting a set means to order its elements according to the way we count them. We take the first, second, etc. For finite sets, however, the difference between the cardinal and ordinal aspects is not immediately visible because there is, up to isomorphism, only one way to order a finite set. Therefore all counting processes for a finite set will lead to the same "ordinal" which is in fact its "cardinality". When passing to infinite sets the situation changes dramatically. As an example, take the set of natural numbers and count according to the canonical ordering of its elements starting with 0, 1, . . . etc. Now change the way of counting and start with 1, 2, . . . , and count 0 as the last element. This leads to an ordering that obviously cannot be isomorphic to the canonical one.
The canonical ordering possesses no last element while the new ordering has 0 as its last element. The segment below 0 in the new ordering, which is 1, 2, . . . , is an ordering which is apparently isomorphic to the canonical ordering of the natural numbers. The new ordering has therefore "one element more" than the canonical ordering. If we take $\omega$ as a symbol for the canonical ordering of $\mathbb{N}$, the new ordering can be characterized as $\omega + 1$, i.e., we have counted one step beyond the canonical ordering of $\mathbb{N}$. This procedure can be iterated by ordering the natural numbers as 2, 3, 4, . . . , 0, 1, an ordering that can be characterized as $\omega + 2$, or more generally $n, n+1, \ldots, 0, 1, \ldots, n-1$, which can be characterized as $\omega + n$. We may even order the natural numbers by first taking all even numbers in their canonical ordering and then all the odd numbers, i.e., $0, 2, 4, \ldots, 2n, \ldots, 1, 3, 5, \ldots, 2n+1, \ldots$, and thus obtain an ordering that can be characterized as $\omega + \omega$, etc. Observe that we never changed the basic set, which means that the cardinal aspect of all the ordered sets $\omega$, $\omega + 1$, $\omega + \omega$ remains unchanged. On the other hand, not every ordering is suited for counting. Take for example the non-negative rational numbers $\mathbb{Q}^+$ in their canonical ordering. Here we can start with 0, but there is no next element in the canonical ordering because it is a dense ordering, i.e., between 0 and any positive rational number $q > 0$ there is a rational number $q_0$ such that $0 < q_0 < q$. Therefore the canonical ordering of $\mathbb{Q}^+$ is not suited for counting. Only those orderings are suited for counting that have the property that, after having counted arbitrarily many elements, the remaining set of not yet counted elements possesses a least element, provided that it is not empty. Such orderings are called well-orderings.
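The reorderings of the natural numbers described above can be made concrete by sorting a finite sample of numbers under each of them. The following Python sketch is purely illustrative: the function names are chosen here, and each ordering is represented by a sort key, so that $a$ precedes $b$ exactly if the key of $a$ is lexicographically smaller than the key of $b$.

```python
def key_omega_plus(n):
    """Key for the ordering n, n+1, ..., 0, 1, ..., n-1 of type omega + n."""
    def key(x):
        # Numbers >= n form the initial copy of omega, the numbers below n come last.
        return (0, x) if x >= n else (1, x)
    return key


def key_omega_plus_omega(x):
    """Key for the ordering 0, 2, 4, ..., 1, 3, 5, ... of type omega + omega."""
    return (x % 2, x)


def precedes(key, a, b):
    """a comes strictly before b in the ordering given by key."""
    return key(a) < key(b)


sample = list(range(8))
print(sorted(sample, key=key_omega_plus(1)))       # 0 is counted last
print(sorted(sample, key=key_omega_plus(2)))       # 0 and 1 are counted last
print(sorted(sample, key=key_omega_plus_omega))    # evens first, then odds
```

On the finite sample the three orderings are of course isomorphic as orderings of eight elements. The different order types only appear on the whole set of natural numbers, where for instance every number $x \ge 1$ precedes 0 in the first ordering.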
To be more precise, a linear ordering is a pair $(A, \prec)$ where $A$ is a nonempty set, the field of the ordering, and $\prec\, \subseteq A \times A$ is a binary relation satisfying the following conditions:
• $(\forall x \in A)[\neg\, x \prec x]$ (irreflexivity)
• $(\forall x \in A)(\forall y \in A)(\forall z \in A)[x \prec y \wedge y \prec z \Rightarrow x \prec z]$ (transitivity)
• $(\forall x \in A)(\forall y \in A)[x \prec y \vee y \prec x \vee x = y]$ (linearity)
A linear ordering $(A, \prec)$ is a well-ordering if it is also well-founded, i.e., if it also satisfies
• $(\forall X)\big[X \subseteq A \wedge X \ne \emptyset \Rightarrow (\exists y \in X)(\forall x \in A)[x \prec y \Rightarrow x \notin X]\big]$.
The salient property of well-founded relations is that they admit induction. By induction on a relation $(A, \prec)$ we mean the scheme
$$(\forall x \in A)[(\forall y \prec x)F(y) \Rightarrow F(x)] \Rightarrow (\forall x \in A)F(x). \qquad (3.1)$$
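For a finite field $A$ the conditions for linear orderings and well-foundedness stated above can be checked by brute force. The following Python sketch assumes that the relation is given as a two-place predicate `less`; all function names are chosen here. It is only meant to make the quantifier structure of the conditions explicit, since the interesting cases in this book are infinite, where no such exhaustive check is available.

```python
from itertools import combinations


def is_linear_ordering(A, less):
    """Check irreflexivity, transitivity and linearity of less on the finite set A."""
    A = list(A)
    irreflexive = all(not less(x, x) for x in A)
    transitive = all(not (less(x, y) and less(y, z)) or less(x, z)
                     for x in A for y in A for z in A)
    linear = all(less(x, y) or less(y, x) or x == y for x in A for y in A)
    return irreflexive and transitive and linear


def least_elements(X, A, less):
    """All y in X such that every x in A with x less y lies outside X."""
    return [y for y in X if all(not less(x, y) or x not in X for x in A)]


def is_well_founded(A, less):
    """Every nonempty subset X of the finite set A has a less-least element."""
    A = list(A)
    subsets = (set(c) for r in range(1, len(A) + 1) for c in combinations(A, r))
    return all(least_elements(X, A, less) for X in subsets)


def cyclic(x, y):
    """Hypothetical counterexample on {0, 1, 2}: 0 before 1, 1 before 2, 2 before 0."""
    return (y - x) % 3 == 1
```

The usual ordering of `range(5)` passes both checks, whereas the cyclic relation fails them: the set $\{0, 1, 2\}$ has no least element, which corresponds to the infinite descending sequence obtained by running around the cycle backwards.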
The property expressed in (3.1) is obvious. If we assume $(\exists x \in A)\neg F(x)$ we obtain a least element $a \in \{x \in A \mid \neg F(x)\}$. But then $(\forall y \prec a)F(y)$ which, according to the premise of (3.1), would imply $F(a)$, contradicting $a \in \{x \in A \mid \neg F(x)\}$. A closer look at (3.1) shows that it is essentially the contraposition of well-foundedness.¹ Therefore a relation is well-founded if and only if it allows induction. Another equivalent formulation is the finiteness of descending sequences, which means that a relation $(A, \prec)$ is well-founded if and only if there is no infinite descending sequence $x_0 \succ x_1 \succ \cdots \succ x_n \succ x_{n+1} \succ \cdots$. This equivalence is again close to trivial. A relation $(A, \prec)$ with an infinite descending sequence is clearly not well-founded. On the other hand, if $(A, \prec)$ is not well-founded we have a nonempty subset $B \subseteq A$ which has no $\prec$-least element. Choosing $x_0 \in B$ there is an $x_1 \in B$ such that $x_1 \prec x_0$. Having chosen a descending sequence $x_n \prec \cdots \prec x_0$ of elements in $B$ we still find some $x_{n+1} \in B$ such that $x_{n+1} \prec x_n$. Iterating this procedure we obtain an infinite descending sequence. We call two orderings $(A_1, \prec_1)$ and $(A_2, \prec_2)$ equivalent if there is an order preserving map from $A_1$ onto $A_2$. This is obviously an equivalence relation on orderings. An equivalence class of orderings is called an order-type. As we have seen, every counting process well-orders a set and, vice versa, every well-ordering is suited to count the elements of its field. Therefore we call the equivalence class of a well-ordering an ordinal. Our previous examples $\omega$, $\omega + n$, $\omega + \omega$ can therefore be understood as ordinals. If $\alpha$ and $\beta$ are ordinals we define an order relation $\alpha < \beta$ iff there is a well-ordering $(A, \prec_A) \in \alpha$ and a well-ordering $(B, \prec_B) \in \beta$ such that $(A, \prec_A)$ is isomorphic to a proper initial segment of $(B, \prec_B)$, i.e., if there is an element $b \in B$ such that $(A, \prec_A)$ is isomorphic to $(B, \prec_B) \restriction b := (\{x \in B \mid x \prec_B b\}, \prec_B)$.
It is not difficult to check that this definition is independent of the representatives (A, ≺A ) ∈ α and (B, ≺B ) ∈ β . Moreover it can be shown that the relation < wellorders the class On of all ordinals. For an ordinal α ∈ On we see that the wellordering (On, <)α = ({ξ ξ < α }, <) is a representative of α . Therefore we can choose ({ξ ξ < α }, <) as a canonical representative for α . By ω1 we denote the least ordinal which cannot be represented by a well-ordering of the natural numbers. The least ordinal which cannot be represented by a decidable well-ordering of the natural number is commonly denoted by ω1CK and pronounced as ω1 –C HURCH–Kleene in honor of these pioneers of computability theory. From a set-theoretical point of view the just described approach is, however, problematic. The order-type of a well-ordering is, in general, not a set but a proper class in the sense defined below. To avoid this difficulty in a set theoretical framework one represents ordinals by canonical representatives, i.e., defines ordinals in such a way that α := {ξ ξ < α }. Then ordinals become sets. This can be obtained by requiring that an ordinal is a set that is well-ordered by the membership relation ∈. Regarding ordinals in the set theoretic sense has become so common in mathematical logic that we will adopt this standpoint. It has many technical advantages. We cannot present the complete set theoretical background. The background needed for this text is, however, very small. It does not exceed the 1
With X := {x ∈ A ¬F(x)}.
20
3 Ordinals
usual background set theory which is ubiquitous in mathematics. We will give all proofs on the basis of a naive set theory. There we will talk naively about classes and sets. Every set is also a class. A class is a set if it is a member of another class. Classes which are not sets are called proper classes. The reader who wants to know more about the set theoretical background should consult the first chapters of an introductory text book on set theory, e.g., [58].
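For finite relations the equivalence between well-foundedness and the absence of infinite descending sequences can be checked mechanically. The following is only a minimal sketch under that finiteness assumption: a relation is given as a set of pairs (x, y) meaning x ≺ y, and the function names are chosen here for illustration and do not come from the text. On a finite set an infinite descending sequence must revisit an element, so it exists exactly when the relation contains a cycle.

```python
from itertools import combinations

def has_minimal(X, prec):
    """True if some y in X has no x in X with x ≺ y."""
    return any(all((x, y) not in prec for x in X) for y in X)

def is_well_founded(A, prec):
    """Brute-force check: every non-empty X ⊆ A has a ≺-minimal element."""
    elems = list(A)
    for r in range(1, len(elems) + 1):
        for X in combinations(elems, r):
            if not has_minimal(X, prec):
                return False
    return True

def has_descending_cycle(A, prec):
    """Depth-first search along ≺-predecessors; a cycle yields an
    infinite descending sequence x0 ≻ x1 ≻ ... on a finite set."""
    preds = {a: [x for x in A if (x, a) in prec] for a in A}
    state = {}  # 1 = on the current search path, 2 = finished

    def visit(a):
        state[a] = 1
        for x in preds[a]:
            if state.get(x) == 1 or (x not in state and visit(x)):
                return True
        state[a] = 2
        return False

    return any(a not in state and visit(a) for a in A)
```

The subset check is exponential in the size of A and is meant only to mirror the definition literally; the cycle search is the practical test.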
3.2 Some Basic Facts about Ordinals
As mentioned at the end of the previous section, we will regard ordinals in the set theoretic sense. We assume that an ordinal is a transitive set which is well-ordered by the membership relation ∈, i.e., we define
α ∈ On :⇔ Tran(α) ∧ (α, ∈) is well-ordered (3.2)
where Tran(M) :⇔ (∀x ∈ M)(∀y ∈ x)[y ∈ M] denotes the fact that M is a transitive class, i.e., a class without gaps. By On we denote the class of ordinals. We define
α < β :⇔ α ∈ On ∧ β ∈ On ∧ α ∈ β
and use mostly lower case Greek letters to denote ordinals. As an immediate consequence of the definition we obtain
α ∈ On ⇒ Tran(α) ∧ (∀x ∈ α)[Tran(x)]. (3.3)
To see (3.3), we observe that α ∈ On entails Tran(α) by definition. If z ∈ y ∈ x ∈ α then z, y and x all belong to α by Tran(α), and z ∈ x follows because ∈ is transitive on the well-ordered set α. Hence Tran(x), which proves (3.3). For x ∈ α ∈ On we get by (3.3) Tran(x) and x ⊆ α. But then x is also well-ordered by ∈ and therefore an ordinal. So we have shown
α ∈ On ∧ x ∈ α ⇒ x ∈ On, i.e., Tran(On) (3.4)
and obtain
α ∈ On ⇒ α = {β | β < α} (3.5)
as an immediate consequence. In set theory we commonly assume that the universe is well-founded with respect to the membership relation. This is expressed by the foundation scheme
(∃x)F(x) → (∃x)[F(x) ∧ (∀y ∈ x)[¬F(y)]].
In the presence of the foundation scheme the opposite direction of (3.3) also holds true.
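For hereditarily finite sets the notions Tran(M) and "hereditarily transitive" from (3.3) can be tested directly. The sketch below is an illustration only: it models sets as Python frozensets, and the helper names (including `successor` for the set m ∪ {m}) are assumptions made here, not notation from the text.

```python
def is_transitive(m):
    """Tran(m): every element of an element of m is itself an element of m."""
    return all(y in m for x in m for y in x)

def is_hereditarily_transitive(m):
    """Tran(m) together with Tran(x) for every x in m, as in (3.3)."""
    return is_transitive(m) and all(is_transitive(x) for x in m)

def successor(m):
    """The set m ∪ {m}."""
    return m | frozenset([m])

# The first finite ordinals, built from the empty set.
zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
```

Under these assumptions `three` equals the set {zero, one, two}, in accordance with (3.5). The set {zero, one, {one}} is transitive but not hereditarily transitive, since its element {one} has a gap.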
3.2.1 Lemma Assume that the membership relation ∈ is well-founded, i.e., that the foundation scheme holds true. Then α is an ordinal if and only if α is a hereditarily transitive set.
Proof One direction of the claim is (3.3). For the opposite direction assume that α is hereditarily transitive, i.e., Tran(α) ∧ (∀x ∈ α)[Tran(x)]. By the foundation scheme ∈ is irreflexive and well-founded on α. Since α is hereditarily transitive the membership relation is also transitive on α. It remains to check linearity. Assume that β is also hereditarily transitive. We show
“If β is well-ordered by ∈ and α ⊆ β then α = β ∨ α ∈ β”. (3.6)
To prove (3.6) assume α ≠ β and let ξ be the ∈-minimal element in β \ α. Then ξ ⊆ α. For η ∈ α we get ξ ∉ η as well as η ≠ ξ and thus η ∈ ξ because β is linearly ordered by ∈. Hence α = ξ ∈ β. This finishes the proof of (3.6).
Observe that the contraposition of the foundation scheme is the scheme of ∈-induction
(∀x)[(∀y ∈ x)F(y) → F(x)] → (∀x)F(x).
We prove “If α is hereditarily transitive then α is linearly ordered by ∈” by ∈-induction. Let ξ, η ∈ α such that ξ ≠ η. Then γ := ξ ∩ η is hereditarily transitive. If γ = ξ ≠ η we get ξ = γ ∈ η by the induction hypothesis and (3.6). If γ ≠ ξ we get γ ∈ ξ by (3.6) and the induction hypothesis. In case of γ = η we are done and the remaining case γ ∈ η is excluded since this would entail γ ∈ η ∩ ξ = γ which contradicts foundation.
So we have seen that in the presence of the foundation scheme the ordinals are exactly the hereditarily transitive sets. Observe further that (3.6) gives the direction from left to right in the following useful equivalence
α ∈ On ∧ β ∈ On ⇒ (α ⊆ β ⇔ α ≤ β).
The other direction holds trivially since α < β implies α ⊆ β by the transitivity of β. But even if we do not assume the foundation scheme we obtain the following fact from the proof of Lemma 3.2.1.
3.2.2 Lemma Assume that ∈ is well-founded on a set a. Then a is an ordinal if and only if a is hereditarily transitive.
3.2.3 Theorem The class On is well-ordered by ∈.
Proof It follows from the definition that the relations ∈ and < coincide on ordinals. Any infinite descending sequence α0 > α1 > · · · of ordinals would also be an infinite descending sequence in α0 contradicting the well-foundedness of α0. Therefore ∈ is well-founded on On. Then ∈ is also irreflexive on On and transitive because all elements of On are transitive. It remains to check linearity. For α, β ∈ On we obtain
γ := α ∩ β as an ∈-well-founded hereditarily transitive set, i.e., γ ∈ On. By (3.6) we have γ ≤ α and γ ≤ β. If γ ≠ α and γ ≠ β we get γ ∈ α ∩ β = γ which is impossible. Therefore α = γ ≤ β or β = γ ≤ α and we are done.
Observe that On is a proper class. Since On is hereditarily transitive and well-founded by ∈ the assumption that On is a set would lead to the contradiction On ∈ On by Lemma 3.2.2. By Theorem 3.2.3 we have the principle of induction for On. This principle is commonly called transfinite induction.
3.2.4 Lemma Let M ⊆ On be transitive. Then M ∈ On or M = On.
Proof Since M ⊆ On the class M is well-ordered by < and all its members are transitive, therefore M is hereditarily transitive. If M is a set we get M ∈ On by Lemma 3.2.2. Otherwise we show On ⊆ M by transfinite induction. Let η ∈ On and assume by induction hypothesis η ⊆ M. Since M is well-ordered by ∈ we obtain η = M or η ∈ M by (3.6). But η = M is excluded because M is not a set. Hence η ∈ M.
3.2.5 Lemma Let M ⊆ On be a set. Then sup M := min {ξ ∈ On | (∀α ∈ M)[α ≤ ξ]} exists and
sup M = ⋃M := {ξ | (∃α ∈ M)[ξ ∈ α]}.
We call sup M the supremum of the set M.
Proof Let β := ⋃M. Then β is a hereditarily transitive subset of On. Hence β ∈ On by Lemma 3.2.4. For α ∈ M we obtain α ⊆ β, which implies α ≤ β by (3.6). Hence {ξ | (∀α ∈ M)[α ≤ ξ]} ≠ ∅. So sup M exists and sup M ≤ β. For ξ < β there is an η ∈ M such that ξ < η, which also proves β ≤ sup M.
To gain a feeling for ordinals in the set theoretical setting we enumerate the first ordinals. The least transitive set whose members are also transitive is apparently the empty set ∅. If x is hereditarily transitive we see immediately that x ∪ {x} is again a hereditarily transitive set. Therefore we get the first ordinals as 0 = ∅, 1 := 0 ∪ {0} = {0}, 2 = 1 ∪ {1} = {0, 1}, 3 = {0, 1, 2}, . . . .
3.2.6 Definition Let
α′ := min {β | α < β}.
We call α′ the successor of α. It is easily checked that α′ = α ∪ {α}. Let
Lim := {α ∈ On | α ≠ 0 ∧ (∀β < α)[β′ < α]}
denote the class of limit ordinals. The existence of limit ordinals is secured by the axiom of infinity. We define
ω := min Lim
and call an ordinal finite if it is less than ω. There are three types of ordinals:
• The ordinal 0
• Successor ordinals of the form α′
• Limit ordinals
According to the three types of ordinals we can reformulate transfinite induction on ordinals as
F(0) ∧ (∀α)[F(α) ⇒ F(α′)] ∧ (∀λ ∈ Lim)[(∀ξ < λ)F(ξ) ⇒ F(λ)] ⇒ (∀α)F(α).
Another important principle is the principle of transfinite recursion, which generalizes primitive recursion into the transfinite. To formulate the principle of transfinite recursion we recall some notions. A (class) relation (in the set theoretical sense) is a class of ordered pairs. The field of a relation R is defined as field(R) := {x | (∃y)[(x, y) ∈ R ∨ (y, x) ∈ R]}. A (class) function F is a relation in which the second component is uniquely determined by the first. We use the familiar notations:
• F(a) is the uniquely determined element such that (a, F(a)) ∈ F.
• The domain dom(F) of a function F is the class {a | (∃y)[(a, y) ∈ F]} and dually
• The range rng(F) of F is the class {y | (∃x)[(x, y) ∈ F]}.
• The restriction of F to a set a is the set F↾a := {(x, y) | x ∈ a ∧ (x, y) ∈ F}.
We use the letter V to denote the universe, i.e., the class of all sets. By F: N −→p M we denote that F is a partial function from N into M, i.e., dom(F) might be a proper subclass of N.
3.2.7 Theorem (Transfinite recursion on ordinals) Let G be a function mapping sets to sets. Then there is a uniquely defined function F: On −→ V such that F(α) = G(F↾α).
Proof We prove the theorem on the basis of naive set theory. Let
M := {f | f: On −→p V ∧ dom(f) ∈ On ∧ (∀α ∈ dom(f))[f(α) = G(f↾α)]}. (i)
Then we obtain
f ∈ M ∧ g ∈ M ∧ α ∈ dom(f) ∩ dom(g) ⇒ f(α) = g(α) (ii)
by induction on α. By the induction hypothesis we have f↾α = g↾α and obtain thus f(α) = G(f↾α) = G(g↾α) = g(α). Now we define
F := {(α, b) | (∃f ∈ M)[f(α) = b]}. (iii)
The class F is a function by (ii). We have to show
(∀α ∈ On)[(∃b)[F(α) = b] ∧ F↾α ∈ M]. (iv)
We prove (iv) by induction on α. Let α ∈ On. By the induction hypothesis we have f0 := F↾α ∈ M. Now we define
f(β) := f0(β) if β < α,
f(β) := G(f0↾α) if β = α.
Then dom(f) = α′ ∈ On and for all β ≤ α we have f(β) = G(f↾β), i.e., f ∈ M. Therefore F(α) = f(α) and F↾α′ = f and we are done. For the uniqueness of F we assume that H is a second function satisfying the properties of the theorem and show F(α) = H(α) straightforwardly by induction on α.
According to the three different types of ordinals we may again reformulate the principle of transfinite recursion in the following form. Let a be a set and S and L functions which map sets to sets. Then there is a uniquely defined function F: On −→ V satisfying
F(0) = a
F(α′) = S(F(α))
λ ∈ Lim ⇒ F(λ) = L(F↾λ).
The principle of transfinite recursion can be extended to well-founded relations (A, ≺) where we have to require that for a ∈ A the class ≺↾a := {b ∈ A | b ≺ a} is always a set. Theorem 3.2.7 holds also in the more general setting stated next.
3.2.8 Theorem Let (A, ≺) be a well-founded relation and G: V −→ V a function. Then there is a uniquely determined function F satisfying (∀x ∈ A)[F(x) = G(F↾{y ∈ A | y ≺ x})].
The proof is essentially the same as that of Theorem 3.2.7. However, since ≺ is not supposed to be transitive a rigorous proof requires some extra prerequisites such as the definition of the transitive closure of (A, ≺). Spelling out all these details would lead
us too far into set theory. Therefore we omit the proof. A rigorous proof can be found in practically any elementary textbook on set theory.²
3.2.9 Definition For a well-founded binary relation ≺ and s ∈ field(≺) we define
otyp≺(s) = sup {otyp≺(t)′ | t ≺ s}
by transfinite recursion. It follows by induction on ≺ that otyp≺(s) ∈ On for all s ∈ field(≺). We call otyp≺(s) the order-type of s in ≺. The order-type of the relation ≺ is
otyp(≺) := sup {otyp≺(s)′ | s ∈ field(≺)}.
We define
ω1^CK := sup {otyp(≺) | ≺ ⊆ ω × ω is primitive recursive}.
Also by transfinite recursion we define for a well-founded relation ≺ the Mostowski collapsing function
π≺: field(≺) −→ V
π≺(s) := {π≺(t) | t ≺ s}.
The Mostowski collapse of ≺
π≺[field(≺)] := {π≺(s) | s ∈ field(≺)}
is obviously a transitive set.
3.2.10 Lemma Let ≺ be a well-founded and transitive binary relation. Then π≺(s) = otyp≺(s) holds true for all s ∈ field(≺) and otyp(≺) = π≺[field(≺)].
Proof First we show that π≺(s) is an ordinal. Since ≺ is well-founded and π≺ is a ≺-∈ homomorphism the set π≺(s) is well-founded by ∈. For x ∈ y ∈ π≺(s) there is a t ≺ s and a t0 ≺ t such that y = π≺(t) and x = π≺(t0). Since ≺ is transitive we have t0 ≺ s and therefore x ∈ π≺(s). This shows that π≺(s) is transitive for all s ∈ field(≺) which implies that π≺(s) is hereditarily transitive and thus an ordinal by Lemma 3.2.2. Next we show
π≺(s) = otyp≺(s) (i)
by induction along ≺. For α ∈ otyp≺(s) we find a t ≺ s such that α ≤ otyp≺(t) = π≺(t) < π≺(s). Hence α ∈ π≺(s) and thus otyp≺(s) ≤ π≺(s). Conversely α ∈ π≺(s) implies α = π≺(t) = otyp≺(t) < otyp≺(s) for some t ≺ s which shows π≺(s) ≤ otyp≺(s).
For the second claim we obviously have π≺[field(≺)] ⊆ otyp(≺). To obtain also the opposite inclusion assume α ∈ otyp(≺). Then there is an s ∈ field(≺) such that α ≤ otyp≺(s) = π≺(s). Hence α ∈ π≺[field(≺)].
For a well-ordering ≺ and S ⊆ field(≺) we denote by µ≺ S the ≺-least element in S.
² Cf. also Theorem 11.4.10 in Chap. 11.
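On a finite well-founded relation both recursions of Definition 3.2.9 terminate and can be run directly. The following sketch assumes a relation given as a set of pairs (t, s) meaning t ≺ s; it represents finite ordinals as Python integers in `otyp` and as frozensets in `collapse`. The function names are chosen here for illustration.

```python
def collapse(s, prec):
    """π(s) := {π(t) | t ≺ s}, by recursion along the well-founded relation."""
    return frozenset(collapse(t, prec) for (t, u) in prec if u == s)

def otyp(s, prec):
    """otyp(s) := sup{otyp(t)′ | t ≺ s}, with finite ordinals as ints."""
    return max((otyp(t, prec) + 1 for (t, u) in prec if u == s), default=0)

def von_neumann(n):
    """The finite ordinal n as the set {0, ..., n-1}."""
    out = frozenset()
    for _ in range(n):
        out = out | frozenset([out])
    return out

# Proper divisibility on the divisors of 12: well-founded and transitive.
divisors = [1, 2, 3, 4, 6, 12]
divides = {(a, b) for a in divisors for b in divisors if a != b and b % a == 0}

# The successor steps 0 ≺ 1 ≺ 2 only: well-founded but not transitive.
steps = {(0, 1), (1, 2)}
```

For the transitive relation `divides` the collapse of each element coincides with the von Neumann ordinal of its order-type, as Lemma 3.2.10 states. For `steps` the two notions differ, which illustrates why transitivity is assumed in the lemma.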
3.2.11 Lemma Let ≺ be a well-ordering. Then the functions otyp≺ and π≺ coincide, are order preserving, hence one-one, and also onto otyp(≺). The inverse function en≺ := otyp≺⁻¹ satisfies
en≺: otyp(≺) −→ field(≺)
en≺(s) := µ≺ {t ∈ field(≺) | (∀x ≺ s)[en≺(x) ≺ t]}
and thus enumerates the elements in the field of ≺ increasingly. We call en≺ the enumerating function of ≺.
The proof is obvious from Lemma 3.2.10 and the definitions of otyp(≺) and π≺, respectively.
Let M ⊆ On. Then (M, ∈) is a well-ordering. Therefore all previous definitions apply to (M, ∈). In this case we do not mention the relation ∈, but just talk about otyp(M) and denote its enumerating function by enM.
3.2.12 Lemma For a class M ⊆ On we have either otyp(M) = On or otyp(M) ∈ On. We have otyp(M) ∈ On if and only if M is a set.
Proof This follows from Lemma 3.2.4 since otyp(M) is hereditarily transitive.
3.2.13 Lemma Let f: On −→p On be an order preserving function such that dom(f) is transitive. Then α ≤ f(α) for all α ∈ dom(f).
Proof Assume that there is an α ∈ dom(f) such that f(α) < α. Then there is a least such α. But then f(α) ∈ dom(f) and we obtain f(f(α)) < f(α) < α, contradicting the minimality of α.
Observe that, as a special case of Lemma 3.2.13, we always have α ≤ enM(α) for a class M ⊆ On.
Let us now turn to the cardinal aspect of sets. The cardinality of a set is the number of its elements. This is difficult to grasp in an absolute way. It is easier to compare the size of sets by bringing their elements in a one-one correspondence. Therefore we call two sets a and b equivalent if there is a one-one map from a onto b, i.e.,
a ∼ b :⇔ (∃f)[f: a −→ b is one-one and onto].
If we assume that every set can be well-ordered, a fact which is equivalent to the axiom of choice, there is a possibility to measure the size of a set by an ordinal.
3.2.14 Definition For a set a we define
a̅ := min {β ∈ On | (∃f)[f: a −→ β is one-one and onto]}
and call a̅ the cardinality of a. An ordinal α is a cardinal iff α̅ = α.
Let Card := {κ | κ is a cardinal}. Since
a ∼ b ⇔ a̅ = b̅
we can measure the size of a set by determining its cardinal which, in some sense, corresponds to “counting its elements”. Cardinals are “initial ordinals” in the sense that there is no smaller ordinal of the same size. All finite ordinals are cardinals. Also ω is a cardinal. The cardinal ω⁺, i.e., the first cardinal bigger than ω, is commonly denoted by ω1. Of special interest are initial ordinals κ which satisfy the additional closure condition that they can only be approximated by a set of smaller ordinals if this set has at least the size of κ. Such ordinals will be called regular.
3.2.15 Definition The class of regular ordinals is defined by
Reg := {α ∈ On | (∀x)[x ⊆ α ∧ x̅ < α ⇒ sup x < α]}.
Observe that all regular ordinals are cardinals. It is a folklore result of set theory that the class of all cardinals as well as the class of the regular cardinals are unbounded in On and thus proper classes. For any ordinal α there is a least cardinal α⁺ that is bigger than α. All cardinals of the form α⁺ are regular.
3.2.16 Definition Let κ be a regular ordinal. A class M ⊆ On is unbounded in κ if for all α < κ there is a β ∈ M ∩ κ such that α < β.
3.2.17 Definition Let κ be a regular ordinal. A class M ⊆ On is closed in κ if sup U ∈ M holds for all U ⊆ M ∩ κ such that U ≠ ∅ and U̅ < κ. We denote by “M is κ-club” that M is closed and unbounded in κ.
3.2.18 Definition Let κ be a regular ordinal and f: On −→p On be order preserving. We call f κ-continuous if dom(f) is closed in κ and sup f[U] = f(sup U) holds for all U ⊆ dom(f) ∩ κ such that U̅ < κ. An order preserving function f: On −→p On is a κ-normal function if f is κ-continuous and κ ⊆ dom(f).
3.2.19 Theorem Let κ be a regular ordinal. A class M ⊆ On is κ-club iff enM is a κ-normal function.
Proof “⇒:” Let M be κ-club. Then M is unbounded in κ, which implies otyp(M ∩ κ) = κ. Hence κ ⊆ dom(enM). Let U ⊆ dom(enM) ∩ κ such that U̅ < κ. Then enM[U] has the same cardinality as U, which is less than κ. Since κ is regular this implies sup U < κ and sup(enM[U]) < κ. Therefore we find an α < κ such that sup(enM[U]) = enM(α).
For ξ ∈ U we have enM(ξ) ≤ enM(α) which implies ξ ≤ α. Hence sup U ≤ α. Assume sup U < α. Then enM(sup U) < enM(α) = sup enM[U] and we obtain a ξ ∈ U such that enM(sup U) < enM(ξ). But then sup U < ξ ∈ U which is absurd. Hence sup U = α and enM(sup U) = sup enM[U].
“⇐:” Because of κ ⊆ dom(enM) we obtain otyp(M ∩ κ) = κ. Therefore M is unbounded in κ. Let U ⊆ M ∩ κ such that U ≠ ∅ and U̅ < κ, and let A := {ξ < κ | enM(ξ) ∈ U}. Then A̅ = U̅ < κ and, since κ is regular, sup A < κ. We obtain enM(sup A) = sup enM[A] = sup U which shows sup U ∈ M.
3.2.20 Exercise The open intervals (α, β) := {γ | α < γ ∧ γ < β} together with the class On form a basis of a topology on On, the order-topology. The order-topology induces a topology on every regular ordinal κ. Prove the following claims:
(a) A set M ⊆ κ is closed in the sense of Definition 3.2.17 iff it is closed in the order-topology on κ.
(b) Let M ⊆ κ be closed in κ. An order preserving function f: M −→ κ is continuous in the sense of Definition 3.2.18 iff it is continuous in the order-topology on κ.
(c) Let M ⊆ κ be closed in κ. Characterize the class of functions satisfying f(sup U) = sup {f(ξ) | ξ ∈ U} for all non-void sets U ⊆ M that are bounded in M.
3.2.21 Exercise Prove the following facts:
(a) Every regular ordinal is a cardinal.
(b) The class of cardinals is closed and unbounded.
(c) If λ and κ are cardinals and there is no cardinal in the interval (λ, κ) then κ is regular.
(d) The class of regular ordinals is unbounded.
3.3 Fundamentals of Ordinal Arithmetic
3.3.1 Definition For an ordinal α let Onα := {β ∈ On | α ≤ β} denote the class of ordinals ≥ α. Let α + ξ := enOnα(ξ) and call α + ξ the ordinal sum of α and ξ.
Since Onα is obviously club in any regular κ > α, the function λξ. α + ξ is a κ-normal function by Theorem 3.2.19.
3.3.2 Observation The function λξ. α + ξ satisfies the following recursion equations:
α + 0 = α
α + ξ′ = (α + ξ)′
α + λ = sup {α + ξ | ξ < λ} for λ ∈ Lim
We see from Observation 3.3.2 that α + ξ extends the addition of natural numbers into the transfinite. We easily check the following properties of ordinal addition.
3.3.3 Theorem The ordinal sum satisfies the following properties:
• ξ < η ⇒ α + ξ < α + η
• α ≤ α + ξ and ξ ≤ α + ξ
• α + (β + γ) = (α + β) + γ
Proof The first two properties follow directly from the fact that λξ. α + ξ is an enumerating function. The last property follows straightforwardly by induction on γ.
We obtain α + 1 = α + 0′ = (α + 0)′ = α′ and will therefore mostly write α + 1 instead of α′.
3.3.4 Definition Put H := {α ∈ On | α ≠ 0 ∧ (∀ξ < α)(∀η < α)[ξ + η < α]}. We call the ordinals in H principal or additively indecomposable.
3.3.5 Lemma Principal ordinals have the following properties:
(1) α ∉ H ⇔ (∃ξ < α)(∃η < α)[α = ξ + η].
(2) H ⊆ Lim ∪ {0′}.
(3) {0′, ω} ⊆ H and (0′, ω) ∩ H = ∅.
(4) Any infinite cardinal κ is principal.
(5) H is κ-club for all regular ordinals κ > ω.
Proof By the definition of H we get
α ∉ H ⇔ (∃ξ ∈ α)(∃η ∈ α)[α ≤ ξ + η]. (i)
To obtain (1) we have to show that for α ∉ H and ξ < α there is an η < α such that α = ξ + η. Since α ∈ Onξ there is an η0 such that α = ξ + η0. Together with (i) we obtain ξ + η0 = α ≤ ξ + η which implies η0 ≤ η < α.
Properties (2) and (3) are obvious. We prove (4). Let ξ < κ. Then Onξ ∩ κ has cardinality κ. Therefore we have ξ + η ∈ Onξ ∩ κ for all η < κ.
To prove (5) we first show that H is unbounded in κ. Let α < κ. Put α0 := α + 1 and αn+1 := αn + αn and M := {αn | n ∈ ω}. By (4) we have M ⊆ κ. Since ω < κ this implies sup M < κ. For ξ, η < sup M there is an n ∈ ω such that ξ, η < αn. Hence ξ + η < αn + αn = αn+1 ≤ sup M. Therefore we have α < sup M ∈ H and sup M < κ. To show that H is closed pick U ⊆ H ∩ κ such that U ≠ ∅ and U̅ < κ. Then sup U < κ. If sup U ∈ U we are done. Otherwise sup U ∈ Lim and for ξ, η < sup U we find an α ∈ U ⊆ H such that ξ, η < α. Hence ξ + η < α ≤ sup U, which shows sup U ∈ H.
Let
ω^ξ := enH(ξ). (3.7)
Then we obtain
ω^0 = 1, ω^1 = ω and λ ∈ Lim ⇒ ω^λ = sup {ω^ξ | ξ < λ}
and
α < β ⇒ ω^α < ω^β.³
3.3.6 Observation An ordinal α is additively indecomposable if and only if ξ + α = α holds true for all ξ < α.
Proof Let α ∈ H. If α = 1 the only ordinal less than 1 is 0 and 0 + 1 = 1. Otherwise we have α ∈ Lim and obtain ξ + α = sup {ξ + η | η < α} ≤ α ≤ ξ + α. The opposite direction follows because for ξ, η < α we get ξ + η < ξ + α = α.
The next lemma is a direct consequence of Observation 3.3.6.
3.3.7 Lemma Let {α1, . . . , αn} ⊆ H. Then there are {k1, . . . , km} ⊆ {1, . . . , n} such that ki < ki+1 and αki ≥ αki+1 for i = 1, . . . , m − 1, and α1 + · · · + αn = αk1 + · · · + αkm.
We define
α =NF α1 + · · · + αn :⇔ α = α1 + · · · + αn ∧ {α1, . . . , αn} ⊆ H ∧ αi ≥ αi+1 for i = 1, . . . , n − 1. (3.8)
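The selection of summands in Lemma 3.3.7 can be made concrete for ordinals below ω^ω. The sketch below rests on an assumption made only for this illustration: a principal ordinal ω^k with k a natural number is stored as the integer k, and a sum ω^k1 + · · · + ω^kn as the list [k1, . . . , kn]. By Observation 3.3.6 a summand is absorbed as soon as a strictly larger principal ordinal follows it.

```python
def normalize(exponents):
    """Drop every summand that is followed by a strictly larger one and
    return the weakly decreasing list that remains (a normal form as in (3.8))."""
    result = []
    for k in exponents:
        # ω^j + ω^k = ω^k whenever j < k, so smaller predecessors vanish.
        while result and result[-1] < k:
            result.pop()
        result.append(k)
    return result

def add(a, b):
    """Ordinal sum of two ordinals below ω^ω given by exponent lists."""
    return normalize(a + b)
```

With this encoding `add([0], [1])` yields `[1]`, i.e., 1 + ω = ω, whereas `add([1], [0])` yields `[1, 0]`, i.e., ω + 1, so the sum is not commutative.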
3.3.8 Theorem (Cantor normal-form) For all ordinals α ≠ 0 there are uniquely determined ordinals α1, . . . , αn such that α =NF α1 + · · · + αn.
³ Writing ω^ξ as an exponential is not by accident (cf. Exercise 3.3.11).
Proof We prove the existence by induction on α. If α ∈ H then α =NF α. Otherwise we have α = ξ + η with ξ, η < α. By induction hypothesis we get ξ =NF ξ1 + · · · + ξm and η =NF η1 + · · · + ηn. Then α =NF ξ1 + · · · + ξj + η1 + · · · + ηn where 1 ≤ j ≤ m is the biggest index such that ξj ≥ η1.
For uniqueness we assume α =NF β1 + · · · + βm and α =NF α1 + · · · + αn and show m = n and αi = βi for i = 1, . . . , n by induction on m. Let α1 = ω^ξ and β1 = ω^η. Then ω^ξ = α1 ≤ α < ω^(η+1) which entails ξ ≤ η. Dually we also obtain η ≤ ξ. Hence ξ = η, i.e., α1 = β1. But then α2 + · · · + αn = β2 + · · · + βm which by induction hypothesis implies m = n and αi = βi for i = 2, . . . , n.
Using the Cantor normal-form the following observation is obvious.
3.3.9 Observation If α =NF α1 + · · · + αm and β =NF β1 + · · · + βn then α < β holds iff one of the following conditions is satisfied:
• m < n and αi = βi for all i ≤ m,
• There is a j ≤ min{m, n} such that αi = βi for all i < j and αj < βj.
Besides the ω-power ω^α (which is in fact a power) we will later also need the power to the basis 2. This, however, is easily defined by the recursion equations
2^0 := 1
2^(α+1) := 2^α + 2^α
λ ∈ Lim ⇒ 2^λ := sup {2^ξ | ξ < λ}.
3.3.10 Exercise We define multiplication and exponentiation of ordinals by transfinite recursion:
α · 0 := 0
α · β′ := α · β + α
α · λ := sup {α · ξ | ξ < λ} for λ ∈ Lim
and
exp(α, 0) := 1
exp(α, β′) := exp(α, β) · α
exp(α, λ) := sup {exp(α, ξ) | ξ < λ} for λ ∈ Lim.
Prove or refute the following statements:
(a) α < β ∧ γ > 0 ⇔ γ · α < γ · β
(b) α < β ⇒ α · γ < β · γ
(c) α · (β + γ) = α · β + α · γ
(d) (α + β) · γ = α · γ + β · γ
(e) α · (β · γ) = (α · β) · γ
(f) β ≠ 0 ⇒ (∀α)(∃!γ)(∃!δ)[α = β · γ + δ ∧ δ < β]
(g) 1 < γ ∧ α < β ⇒ exp(γ, α) < exp(γ, β)
(h) exp(α, β) · exp(α, γ) = exp(α, β + γ)
(i) exp(exp(α, β), γ) = exp(α, β · γ)
(j) α < β ⇒ exp(α, γ) ≤ exp(β, γ)
(k) α > 0 ∧ β > 1 ⇒ (∃!δ)[exp(β, δ) ≤ α < exp(β, δ + 1)]
3.3.11 Exercise Show
(a) exp(ω, α) = ω^α
(b) α ∈ Lim ⇒ exp(2, α) ∈ H
(c) 2^α = exp(2, α) for all ordinals α.
Use these results to prove ω^α = 2^(ω·α). Hint: To prove (a) start by showing exp(ω, α) ∈ H.
3.3.12 Exercise Define the enumerating function of all ordinals which are not successor ordinals.
3.3.13 Exercise (Cantor normal-form with basis β) Let β > 1. Show that for all α > 0 there are uniquely determined ordinals α1 > · · · > αn and γ1, . . . , γn with 0 < γi < β for i = 1, . . . , n such that
α = β^α1 · γ1 + · · · + β^αn · γn,
where β^αi stands for exp(β, αi) (cf. Exercise 3.3.10).
3.3.14 Exercise Let α # 0 = 0 # α = α. For α =NF α1 + · · · + αm and β =NF αm+1 + · · · + αn let
α # β := απ(1) + · · · + απ(n)
where π is a permutation of the numbers {1, . . . , n} such that απ(i) ≥ απ(i+1) for i ∈ {1, . . . , n − 1}. We call α # β the symmetric sum of α and β. Prove the following properties of the symmetric sum:
• α # β = β # α.
• α # ξ < α # η for all ordinals ξ < η.
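The symmetric sum is easy to experiment with for ordinals below ω^ω. As an assumption made only for this sketch, a normal form ω^k1 + · · · + ω^kn with natural numbers k1 ≥ · · · ≥ kn is stored as the list [k1, . . . , kn]; the function name is hypothetical.

```python
def symmetric_sum(a, b):
    """a # b: rearrange all summands of a and b into weakly decreasing order."""
    return sorted(a + b, reverse=True)

# Sample ordinals: 1 = ω^0, ω = ω^1, ω + 1.
one = [0]
omega = [1]
omega_plus_one = [1, 0]
```

Under this encoding the ordering of ordinals described in Observation 3.3.9 coincides with Python's lexicographic comparison of lists, so the two properties of the exercise can at least be spot-checked on examples. Such checks are of course no substitute for the proofs the exercise asks for.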
3.3.15 Exercise (Multiplicatively indecomposable ordinals) An ordinal α ≠ 0 is multiplicatively indecomposable if (∀ξ < α)(∀η < α)[ξ · η < α]. Prove the following statements:
(a) An ordinal α > 1 is multiplicatively indecomposable iff ξ · α = α for all ξ < α.
(b) An ordinal α is multiplicatively indecomposable iff α ∈ {1, 2} or there is an ordinal δ such that α = ω^(ω^δ).
3.3.16 Exercise Prove the equation in (a) and characterize the classes in (b)–(d) similarly.
(a) M1^= := {α ∈ On | ω · α = α · ω} = H ∩ ω^ω ∪ {0}
(b) M2^= := {α ∈ On | ω^α = α^ω}
(c) M1^< := {α ∈ On | ω · α < α · ω}
(d) M2^< := {α ∈ On | ω^α < α^ω}
3.3.1 A Notation System for the Ordinals below ε0
We will later prove that all (κ-)normal functions possess fixed-points. The fixed-points of λξ. ω^ξ are commonly called ε-numbers. Let
ε0 := min {α | ω^α = α}.
For ordinals less than ε0 we get a normal-form α =NF ω^α1 + · · · + ω^αn with αi < α for i = 1, . . . , n. This opens the possibility to denote every ordinal < ε0 by a word formed from the alphabet {0, +, ω^·}. By Gödelizing these words, i.e., coding them by natural numbers, we obtain codes for the ordinals below ε0 in the natural numbers. We do that by simultaneously defining a set OT of ordinal codes and a mapping | |: OT −→ On that assigns the ordinal value |a| to every code a ∈ OT. First we define
(O1) 0 ∈ OT and |0| = 0
(O2) If a1, . . . , an ∈ OT and |a1| ≥ |a2| ≥ · · · ≥ |an| then ⟨a1, . . . , an⟩ ∈ OT and |⟨a1, . . . , an⟩| := ω^|a1| + · · · + ω^|an|.
Second we define a relation
a ≺ b :⇔ a ∈ OT ∧ b ∈ OT ∧ |a| < |b|
and claim that the set OT as well as the binary relation ≺ are both primitive recursive. To check that we show that both notions can be simultaneously defined by course-of-values recursion. We get
x ∈ OT ⇔ Seq(x) ∧ [x = 0 ∨ (∀i < lh(x))[(x)i ∈ OT ∧ (i + 1 < lh(x) → (x)i+1 ⪯ (x)i)]],
where a ⪯ b stands for a ≺ b ∨ a = b, and
a ≺ b ⇔ a ∈ OT ∧ b ∈ OT ∧ [(a = 0 ∧ b ≠ 0) ∨ (lh(b) = 1 ∧ a ⪯ (b)0) ∨ (∃j < min{lh(a), lh(b)})[(∀i < j)[(a)i = (b)i] ∧ (a)j ≺ (b)j] ∨ (lh(a) < lh(b) ∧ (∀i < lh(a))[(a)i = (b)i])].
3.3.17 Theorem The set OT and the binary relation ≺ are primitive recursive. For a ∈ OT we have |a| < ε0 and for every ordinal α < ε0 there is an a ∈ OT such that |a| = α.
Proof We have just seen that OT and ≺ are simultaneously definable by course-of-values recursion and are thus primitive recursive. Since ε0 is in H and closed under λξ. ω^ξ we obtain |a| < ε0 easily by induction on a. Conversely we show by induction on α < ε0 that there is an a ∈ OT such that |a| = α. This is obvious for α = 0. If α ≠ 0 we have α =NF ω^α1 + · · · + ω^αn with αi < α for i = 1, . . . , n. By induction hypothesis there are ordinal notations a1, . . . , an such that |ai| = αi. Then |a1| ≥ |a2| ≥ · · · ≥ |an| which implies a := ⟨a1, . . . , an⟩ ∈ OT and |a| = ω^|a1| + · · · + ω^|an| = ω^α1 + · · · + ω^αn = α.
3.3.18 Lemma For a ∈ OT we have otyp≺(a) = |a|. Hence ε0 = otyp(≺) < ω1^CK.
Proof Since ≺ is a well-ordering we have otyp≺(a) = π≺(a) for all a ∈ OT. We show |a| = π≺(a) by induction on ≺. By Theorem 3.3.17 we have |a| = {|b| | |b| < |a|} = {|b| | b ≺ a} = {π≺(b) | b ≺ a} = π≺(a).
3.3.19 Remark After having defined the notation system (OT, ≺) we may forget the set theoretical background we used to develop the system. We may consider OT as a syntactically defined set of natural numbers together with a syntactically defined relation ≺.⁴ This is of importance when ordinal notations are used to obtain consistency proofs as finitist as possible. In a finitist consistency proof we must not rely on the set theoretic background but have to argue purely combinatorially. The notation system can be viewed as purely combinatorially given. But without the set theoretical background we need to prove that (OT, ≺) is a well-ordering. The proof that it is a linear ordering is tedious but completely elementary. The proof of its well-foundedness, however, needs means that exceed the strength of Peano Arithmetic (cf. Sect. 7.1). The well-foundedness proof is given in Sect. 7.4. A discussion about its foundational status is in Sect. 7.5.4.
⁴ This will also be true for all the notation systems we are going to develop.
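The purely combinatorial character of (OT, ≺) can be illustrated by a short program. The sketch below is not the arithmetical coding of the text: as an assumption made for readability it represents the notation 0 by the empty tuple and ⟨a1, . . . , an⟩ by a Python tuple of notations, and it decides ≺ by the lexicographic comparison of Observation 3.3.9 rather than by the literal course-of-values definition. The names `less` and `in_OT` are chosen here.

```python
def less(a, b):
    """a ≺ b for notations a, b in OT: compare the exponents from the left;
    if one notation is a proper initial segment of the other, it is smaller."""
    for x, y in zip(a, b):
        if x != y:
            return less(x, y)
    return len(a) < len(b)

def in_OT(x):
    """x ∈ OT: a tuple of notations whose entries are weakly decreasing
    with respect to ≺, mirroring clauses (O1) and (O2)."""
    return (isinstance(x, tuple)
            and all(in_OT(c) for c in x)
            and all(not less(x[i], x[i + 1]) for i in range(len(x) - 1)))

# A few notations and the ordinals they denote.
zero = ()                       # 0
one = (zero,)                   # ω^0 = 1
omega = (one,)                  # ω^1 = ω
omega_plus_one = (one, zero)    # ω + 1
omega_times_two = (one, one)    # ω + ω
omega_omega = (omega,)          # ω^ω
```

No set theory enters the program: both functions manipulate finite nested tuples only, in the spirit of Remark 3.3.19.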
3.3.20 Exercise Let F be the least set of functions on the natural numbers that contains the function λx. 0 and satisfies:
• If f1, . . . , fn are in F then λx. x^f1(x) + · · · + x^fn(x) is in F.
For f, g ∈ F let f ≺ g :⇔ (∃k)(∀x ≥ k)[f(x) < g(x)]. Show that (F, ≺) is a well-ordering of order-type ε0. What is the order-type of the polynomials in one variable with respect to ≺?
3.4 The Veblen Hierarchy
To obtain notations for ordinals beyond ε0 we need decompositions also for additively indecomposable ordinals. To this end we have to study ordinals more profoundly. The main tool will be a hierarchy of closed and unbounded classes based on the class of additively indecomposable ordinals.
3.4.1 Preliminaries
3.4.1 Definition Let f: On −→p On be a partial function. Put Fix(f) := {η ∈ dom(f) | f(η) = η}. The derivative of the function f is defined by f′ := enFix(f). For a class M ⊆ On we define its derivative by M′ := Fix(enM) = {α ∈ M | enM(α) = α}.
3.4.2 Lemma Let κ > ω be a regular ordinal and f: On −→p On a κ-normal function. Then Fix(f) is club in κ and therefore f′ is again a κ-normal function.
Proof First we show that Fix(f) is unbounded in κ. Let α < κ and put β0 := α + 1 and βn+1 := f(βn). Then β := sup {βn | n ∈ ω} < κ and f(β) = sup {f(βn) | n ∈ ω} = sup {βn+1 | n ∈ ω} = β. Hence α < β ∈ Fix(f) ∩ κ and Fix(f) is unbounded in κ. Next we show that Fix(f) is closed in κ. Let U ⊆ Fix(f) ∩ κ such that U ≠ ∅ and U̅ < κ. We obtain f(sup U) = sup f[U] = sup U, i.e., sup U ∈ Fix(f).
3.4.3 Corollary The derivative M′ of a κ-club class M is again κ-club.
Proof Since M = Fix (enM ) we obtain the claim immediately from Lemma 3.4.2 and Theorem 3.2.19. 3.4.4 Lemma Let κ > ω be a regular ordinal and I < κ . If {Mι ι ∈ I} is a collec tion of classes which are club in κ then ι ∈I Mι is also club in κ .
Proof The intersection ⋂_{ι∈I} M_ι is obviously closed in κ. To visualize the fact that it is also unbounded in κ we choose α < κ and α < m_{0,0} ∈ M_0, arrange the classes M_ι in a row and choose elements m_{k,ι} ∈ M_ι increasingly as shown in the following figure:

M_0             M_1             · · ·   M_ι             · · ·   (ι ∈ I)
m_{0,0}       ≤ m_{0,1}       ≤ · · · ≤ m_{0,ι}       ≤ · · · ≤ sup_{ι∈I} m_{0,ι} ≤
  ⋮               ⋮                       ⋮
m_{k,0}       ≤ m_{k,1}       ≤ · · · ≤ m_{k,ι}       ≤ · · · ≤ sup_{ι∈I} m_{k,ι} ≤
  ⋮               ⋮                       ⋮
sup_k m_{k,0} = sup_k m_{k,1} = · · · = sup_k m_{k,ι} = · · ·

Since all the classes M_ι are unbounded in κ and |I| < κ we can choose the m_{k,ι} < κ in such a way that the ordinals in the figure are increasing from left to right and from top to bottom. But then sup_k m_{k,ι} ∈ M_ι for all ι ∈ I and since all these suprema coincide we see that sup_k m_{k,ι} ∈ ⋂_{ι∈I} M_ι.
3.4.5 Exercise Let κ > ω be a regular ordinal and {X_α | α < κ} be a family of sets which are club in κ. Show that their diagonal intersection
Δ_{α<κ} X_α := {ξ ∈ κ | ξ ∈ ⋂_{α<ξ} X_α}
is club in κ .
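The unboundedness argument of Lemma 3.4.2 can be traced on a familiar instance. The following worked example is only an illustration and assumes the normal function f(ξ) = ω^ξ and the starting point α = 0; neither choice is made in the text at this point.

```latex
% Iteration from the proof of Lemma 3.4.2 for f(\xi) = \omega^{\xi}, \alpha = 0
\begin{align*}
  \beta_0 &= \alpha + 1 = 1, &
  \beta_1 &= \omega^{1} = \omega, &
  \beta_2 &= \omega^{\omega}, &
  \beta_3 &= \omega^{\omega^{\omega}}, \ \dots \\
  \beta &= \sup_{n \in \omega} \beta_n = \varepsilon_0, &
  \omega^{\beta} &= \sup_{n \in \omega} \omega^{\beta_n}
                 = \sup_{n \in \omega} \beta_{n+1} = \beta .
\end{align*}
```

So the fixed point produced by the proof is ε_0, the least fixed point of ξ ↦ ω^ξ.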
3.4.2 The Veblen Hierarchy
Based on the class of additively indecomposable ordinals we will now establish a hierarchy of closed and unbounded classes using derivatives.
3.4.6 Definition We define the Veblen hierarchy of α-critical ordinals by
Cr(0) = H
Cr(α + 1) = Cr(α)′
λ ∈ Lim ⇒ Cr(λ) = ⋂_{ξ<λ} Cr(ξ)
and put
ϕ_α := en_{Cr(α)}.
3.4.7 Theorem The classes Cr(α) are club in all regular ordinals κ > max{α, ω}. The functions ϕ_α are, therefore, κ-normal functions for all regular κ > max{α, ω}.
Proof This follows immediately from Corollary 3.4.3, Lemma 3.4.4 and Theorem 3.2.19.
3.4.8 Lemma The Veblen functions ϕ_α have the following basic properties:
(1) ϕ_0(α) = ω^α
(2) ϕ_1(0) = ε_0
(3) β < α ⇒ ϕ_ξ(β) < ϕ_ξ(α)
(4) β ≤ ϕ_α(β)
(5) α < β ⇒ Cr(β) ⊊ Cr(α) ∧ ϕ_α(γ) ≤ ϕ_β(γ) ∧ ϕ_α(ϕ_β(γ)) = ϕ_β(γ)
Proof Properties (1) through (4) follow directly from the definition and Lemma 3.2.13. Only (5) needs a proof. By definition we have
α < β ⇒ Cr(β) ⊆ Cr(α) (i)
which already implies
ϕ_α(γ) ≤ ϕ_β(γ). (ii)
For α < β we get ϕ_β(γ) ∈ Cr(β) ⊆ Cr(α + 1) = Cr(α)′ which shows
ϕ_α(ϕ_β(γ)) = ϕ_β(γ). (iii)
Since 0 < β we obtain 0 < ϕ_β(0) and thus ϕ_α(0) < ϕ_α(ϕ_β(0)) = ϕ_β(0). Hence ϕ_α(0) ∈ Cr(α) \ Cr(β), and the inclusion is proper.
3.4.9 Theorem (A) We have ϕ_{α_1}(β_1) = ϕ_{α_2}(β_2) if and only if one of the following conditions is satisfied:
(1) α_1 < α_2 ∧ β_1 = ϕ_{α_2}(β_2)
(2) α_1 = α_2 ∧ β_1 = β_2
(3) α_2 < α_1 ∧ ϕ_{α_1}(β_1) = β_2.
(B) We have ϕ_{α_1}(β_1) < ϕ_{α_2}(β_2) if and only if one of the following conditions is satisfied:
(1) α_1 < α_2 ∧ β_1 < ϕ_{α_2}(β_2)
(2) α_1 = α_2 ∧ β_1 < β_2
(3) α_2 < α_1 ∧ ϕ_{α_1}(β_1) < β_2.
Proof We prove (A) and (B) simultaneously and distinguish the following cases:
1. α_1 < α_2. Then ϕ_{α_1}(ϕ_{α_2}(β_2)) = ϕ_{α_2}(β_2) and we obtain
ϕ_{α_1}(β_1) = ϕ_{α_2}(β_2) = ϕ_{α_1}(ϕ_{α_2}(β_2)) ⇔ β_1 = ϕ_{α_2}(β_2)
as well as
ϕ_{α_1}(β_1) < ϕ_{α_2}(β_2) = ϕ_{α_1}(ϕ_{α_2}(β_2)) ⇔ β_1 < ϕ_{α_2}(β_2).
2. α_1 = α_2. Then
ϕ_{α_1}(β_1) = ϕ_{α_2}(β_2) ⇔ β_1 = β_2
and
ϕ_{α_1}(β_1) < ϕ_{α_2}(β_2) ⇔ β_1 < β_2.
3. α_1 > α_2. Then we obtain by 1.
ϕ_{α_1}(β_1) = ϕ_{α_2}(β_2) ⇔ β_2 = ϕ_{α_1}(β_1)
and
ϕ_{α_1}(β_1) > ϕ_{α_2}(β_2) ⇔ β_2 < ϕ_{α_1}(β_1).
Hence
ϕ_{α_1}(β_1) < ϕ_{α_2}(β_2) ⇔ ϕ_{α_1}(β_1) < β_2.
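To see how the three cases of Theorem 3.4.9 are used, here are a few concrete comparisons. They are illustrative instances chosen for this note, not examples taken from the text; they rely only on ϕ_0(α) = ω^α and ϕ_1(0) = ε_0 from Lemma 3.4.8.

```latex
% Instances of Theorem 3.4.9, using \varphi_0(\alpha) = \omega^{\alpha}, \varphi_1(0) = \varepsilon_0
\begin{align*}
  &\text{(A)(1), } \alpha_1 = 0 < \alpha_2 = 1,\ \beta_1 = \varphi_1(0):
    && \varphi_0(\varphi_1(0)) = \varphi_1(0),
    \text{ i.e. } \omega^{\varepsilon_0} = \varepsilon_0 ; \\
  &\text{(B)(1), } \alpha_1 = 0 < \alpha_2 = 1,\ \beta_1 = \omega < \varphi_1(0):
    && \varphi_0(\omega) < \varphi_1(0),
    \text{ i.e. } \omega^{\omega} < \varepsilon_0 ; \\
  &\text{(B)(3), } \alpha_2 = 0 < \alpha_1 = 1,\ \varphi_1(0) < \beta_2 = \varepsilon_0 + 1:
    && \varphi_1(0) < \varphi_0(\varepsilon_0 + 1),
    \text{ i.e. } \varepsilon_0 < \omega^{\varepsilon_0 + 1} .
\end{align*}
```

In each case the comparison of two Veblen terms is reduced to a comparison involving a strictly smaller term, which is what makes the relation decidable on notations.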
From Theorem 3.4.9 we obtain especially
ϕ_α(0) < ϕ_β(0) ⇔ α < β (3.9)
which entails the following corollary. 3.4.10 Corollary The function λ ξ . ϕξ (0) is order preserving. Hence α ≤ ϕα (0) ≤ ϕα (β ) for all β . 3.4.11 Theorem For all principal ordinals α ∈ H there are uniquely determined ordinals ξ and η such that α = ϕξ (η ) and η < α . Proof We show first uniqueness. Let α = ϕξ (η ) and α = ϕν (µ ) such that η < α and µ < α . If ξ < ν we get the contradiction α = ϕξ (η ) < ϕξ (α ) = ϕξ (ϕν (µ )) = ϕν (µ ) = α . Hence ξ = ν which immediately also implies η = µ .
To show also the existence let ξ := min{µ | α < ϕ_µ(α)}. This minimum exists because α ≤ ϕ_α(0) < ϕ_α(α). For ξ = 0 we have α < ϕ_0(α) and because of α ∈ H there is an η such that ϕ_0(η) = α < ϕ_0(α). Hence η < α. If ξ ≠ 0 we have α = ϕ_ρ(α) for all ρ < ξ. Hence α ∈ ⋂_{ρ<ξ} Cr(ρ + 1) = Cr(ξ). Therefore there is an η such that ϕ_ξ(η) = α < ϕ_ξ(α) which implies η < α.
An ordinal β ∈ Cr(α) is closed under ordinal addition (since Cr(α) ⊆ H) and closed under all functions ϕ_ξ with ξ < α since β = ϕ_α(ρ) for some ρ and η < β = ϕ_α(ρ) implies ϕ_ξ(η) < ϕ_α(ρ) = β for all ξ < α by Theorem 3.4.9 (B)(1). The ordinals in Cr(α) are thus inaccessible by ordinal addition and the functions ϕ_ξ for ξ < α. Therefore we call the ordinals in Cr(α) α-critical. Ordinals α which are themselves α-critical are therefore closed under the function ϕ := λξη.ϕ_ξ(η) viewed as a binary function. We call these ordinals strongly critical and define
SC := {α | α ∈ Cr(α)} (3.10)
and
Γ_ξ := en_{SC}(ξ).
3.4.12 Lemma We have
(1) α ∈ SC ∧ ξ, η < α ⇒ ϕ_ξ(η) < α
and
(2) α ∈ SC ⇔ ϕ_α(0) = α.
Proof Property (1) is already clear. Property (2) holds because α ∈ SC implies α = ϕα (ξ ) for some ξ and α ≤ ϕα (0) ≤ ϕα (ξ ) = α implies α = ϕα (0). The opposite direction is obvious. 3.4.13 Lemma The ordinals in SC are exactly the ordinals which are closed under ϕ viewed as a binary function. Proof In Lemma 3.4.12 (1) we have already seen that strongly critical ordinals are closed under ϕ . For the opposite inclusion let α be closed under ϕ . We show
ξ < ϕα (0) ⇒ ξ < α
(i)
by induction on ξ. If ξ ∉ H we obtain (i) directly by the induction hypothesis and the fact that α ∈ H. So assume ξ = ϕ_{ξ_0}(ξ_1) with ξ_1 < ξ. From the premise of (i) and Theorem 3.4.9 we then obtain ξ_0 < α and ξ_1 < ϕ_α(0). But then ξ_1 < α by induction hypothesis and we obtain ξ = ϕ_{ξ_0}(ξ_1) < α because α is closed under ϕ. From (i) we get ϕ_α(0) = α which entails α ∈ SC.
3.4.14 Theorem The class SC is club in all regular ordinals κ > ω.
Proof We show unboundedness first. Let α < κ and put α_0 := α + 1 and α_{n+1} := ϕ_{α_n}(0). Then β := sup_{n∈ω} α_n < κ and for ξ < β there is a k such that ξ < α_k ≤ α_m for all m ≥ k. This implies ϕ_ξ(α_{m+1}) = ϕ_ξ(ϕ_{α_m}(0)) = ϕ_{α_m}(0) = α_{m+1} for all m ≥ k. So β ≤ ϕ_ξ(β) = ϕ_ξ(sup_{k<m<ω} α_m) = sup_{k<m<ω} ϕ_ξ(α_m) = sup_{k<m<ω} α_m ≤ β. Hence β ∈ ⋂_{ξ<β} Cr(ξ) = Cr(β) which shows α < β ∈ SC ∩ κ. To prove that SC is also closed let M ⊆ SC ∩ κ such that |M| < κ. Then α := sup M < κ. For ξ, η < α there is an ordinal β ∈ M ∩ κ such that ξ, η < β. Hence ϕ_ξ(η) < β ≤ α and α ∈ SC by Lemma 3.4.13.
3.4.15 Exercise Let κ > ω be a regular ordinal. Show that SC ∩ κ = Δ_{α<κ} Cr(α).
3.4.16 Exercise Define Cr_ξ(α) by the following clauses:
• Cr_0(0) = H.
• Cr_{α+1}(0) := Δ_{λ<κ} Cr_α(λ).
• Cr_λ(0) = ⋂_{ξ<λ} Cr_ξ(0) for λ ∈ Lim.
• Cr_ξ(α + 1) := Cr_ξ(α)′ and
• Cr_ξ(λ) := ⋂_{η<λ} Cr_ξ(η) for λ ∈ Lim
and define
ϕ_{ξ,η} := en_{Cr_ξ(η)}.
Show that the ϕ_{ξ,η} are all κ-normal functions and prove
(a) ϕ_{0,ξ}(α) = ϕ_ξ(α).
(b) ϕ_{1,0}(α) = Γ_α.
(c) α := ϕ_{α_1,α_2}(α_3) < ϕ_{β_1,β_2}(β_3) =: β holds true iff one of the following conditions is satisfied:
• α_1 = β_1 ∧ α_2 < β_2 ∧ α_3 < β.
• α_1 = β_1 ∧ α_2 = β_2 ∧ α_3 < β_3.
• α_1 = β_1 ∧ β_2 < α_2 ∧ α < β_3.
• α_1 < β_1 ∧ α_2 < β ∧ α_3 < β.
• β_1 < α_1 ∧ (α < β_2 ∨ (α = β_2 ∧ 0 < β_3) ∨ α < β_3).
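The iteration used in the proof of Theorem 3.4.14 can be made concrete. The following worked example is a sketch that assumes the starting point α = 0, a choice made here only for illustration.

```latex
% Iteration from the proof of Theorem 3.4.14 started at \alpha = 0
\begin{align*}
  \alpha_0 &= 1, &
  \alpha_1 &= \varphi_{1}(0) = \varepsilon_0, &
  \alpha_2 &= \varphi_{\varepsilon_0}(0), &
  \alpha_3 &= \varphi_{\varphi_{\varepsilon_0}(0)}(0), \ \dots \\
  \beta &= \sup_{n \in \omega} \alpha_n \in SC .
\end{align*}
% Every \alpha_n lies below the least strongly critical ordinal \Gamma_0
% by Lemma 3.4.12 (1), so \beta \le \Gamma_0, and \beta \in SC gives \beta = \Gamma_0.
```

The least strongly critical ordinal is therefore approximated from below by nesting the Veblen function in its index.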
3.4.3 A Notation System for the Ordinals below Γ_0
The first strongly critical ordinal is Γ_0. We are going to use the Veblen hierarchy to develop a system of notations for the ordinals below Γ_0. Besides the Cantor normal form for ordinals, which we already used in the development of the notations for the ordinals below ε_0, the key in the development is the following lemma.
3.4.17 Lemma For every ordinal α ∈ H \ SC there are uniquely determined ordinals ξ and η such that α = ϕ_ξ(η) and ξ, η < α.
Proof In Theorem 3.4.11 we have already seen that there are uniquely determined ordinals ξ and η such that α = ϕ_ξ(η) and η < α. Of course we have ξ ≤ α. So assume ξ = α. But then α ≤ ϕ_α(0) ≤ ϕ_α(η) = α in contradiction to α ∉ SC.
By Lemma 3.4.17 and the Cantor normal form we obtain for the ordinals below Γ_0 a uniquely determined normal form
α =_NF ϕ_{ξ_1}(η_1) + · · · + ϕ_{ξ_n}(η_n)
such that ξ_i, η_i < α for i = 1, . . . , n. This shows that we can represent every ordinal < Γ_0 as a word over the alphabet {0, +, ϕ.(.)}. This opens the possibility for a notation system. Again we define a set OT of ordinal notations and a subset PT of notations for principal ordinals together with an evaluation function | |.
(O1) 0 ∈ OT and |0| = 0.
(O2) If a_1, . . . , a_n ∈ PT and a_1 ⪰ · · · ⪰ a_n then ⟨1, a_1, . . . , a_n⟩ ∈ OT and |⟨1, a_1, . . . , a_n⟩| = |a_1| + · · · + |a_n|.
(O3) If a_1, a_2 ∈ OT then ⟨2, a_1, a_2⟩ ∈ PT and |⟨2, a_1, a_2⟩| = ϕ_{|a_1|}(|a_2|).
(O4) PT ⊆ OT.
Moreover we define a ≺ b :⇔ a ∈ OT ∧ b ∈ OT ∧ |a| < |b| and a ≡ b :⇔ a ∈ OT ∧ b ∈ OT ∧ |a| = |b|. By Theorem 3.4.9 and Lemma 3.4.17 it is easy to check that OT and the relations ≺ and ≡ are simultaneously definable by course-of-values recursion and thence primitive recursive. We leave the details and the following two claims as (easy) exercises. 3.4.18 Theorem The sets OT, PT and the relations ≺ and ≡ are primitive recursive. For every notation a ∈ OT we have |a| < Γ0 and for every ordinal α < Γ0 there is conversely an a ∈ OT such that |a| = α .
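The claim that ≺ and ≡ are computable from Theorem 3.4.9 can be sketched in code. The following is a minimal illustration, not the book's coding: the tuple encoding, the names `ZERO`, `phi` and `cmp_ord`, and the assumption that every sum has at least two summands, each a ϕ-term, listed in non-increasing order, are all choices made for this sketch.

```python
# Hypothetical term encoding (not the book's numeric codes):
#   ZERO                       denotes the ordinal 0
#   ("+", a1, ..., an), n >= 2 denotes |a1| + ... + |an|; every ai is a
#                              phi-term and |a1| >= ... >= |an| is assumed
#   ("phi", a, b)              denotes phi_{|a|}(|b|)
ZERO = ("0",)


def phi(a, b):
    """Build the term for phi_{|a|}(|b|)."""
    return ("phi", a, b)


def cmp_ord(s, t):
    """Return -1, 0 or 1 according to |s| < |t|, |s| = |t|, |s| > |t|."""
    if s == ZERO or t == ZERO:
        # Values of phi-terms and sums are never 0.
        return (s != ZERO) - (t != ZERO)
    if s[0] == "phi" and t[0] == "phi":
        (_, a1, b1), (_, a2, b2) = s, t
        c = cmp_ord(a1, a2)
        if c == 0:                 # Theorem 3.4.9, case (2)
            return cmp_ord(b1, b2)
        if c < 0:                  # case (1): compare b1 with the whole of t
            return cmp_ord(b1, t)
        return cmp_ord(s, b2)      # case (3): compare the whole of s with b2
    if s[0] == "phi":
        # s is principal, t is a proper sum: |s| <= |t1| implies |s| < |t|,
        # and |s| > |t1| implies |s| > |t| because |s| is indecomposable.
        return -1 if cmp_ord(s, t[1]) <= 0 else 1
    if t[0] == "phi":
        return -cmp_ord(t, s)
    # Two sums in non-increasing order: compare summands lexicographically.
    for x, y in zip(s[1:], t[1:]):
        c = cmp_ord(x, y)
        if c != 0:
            return c
    return (len(s) > len(t)) - (len(s) < len(t))


ONE = phi(ZERO, ZERO)    # phi_0(0) = omega^0 = 1
OMEGA = phi(ZERO, ONE)   # phi_0(1) = omega
EPS0 = phi(ONE, ZERO)    # phi_1(0) = epsilon_0
```

With these definitions `cmp_ord(phi(ZERO, EPS0), EPS0)` compares ω^{ε_0} with ε_0 through case (1) of Theorem 3.4.9. Each recursive call is made on a pair of terms of smaller total size, which is the course-of-values recursion mentioned above.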
This is all we need about ordinals for predicative proof theory. Later, however, before stepping into impredicative proof theory, we will have to return to the theory of ordinals.
3.4.19 Exercise Show that every cardinal is strongly critical.
3.4.20 Exercise We define a fixed-point-free version ϕ̄ of the Veblen function ϕ:
ϕ̄_α(β) := ϕ_α(β + 1) if β = γ + n for some ordinal γ such that ϕ_α(γ) = γ or ϕ_α(γ) = α,
ϕ̄_α(β) := ϕ_α(β) otherwise.
Show:
(a) ϕ̄_α(β) < ϕ̄_γ(δ) ⇔ (α < γ ∧ β < ϕ̄_γ(δ)) or (α = γ ∧ β < δ) or (γ < α ∧ ϕ̄_α(β) ≤ δ).
(b) α < ϕ̄_α(β) ∧ β < ϕ̄_α(β).
(c) ϕ̄_{α_1}(β_1) = ϕ̄_{α_2}(β_2) ⇔ α_1 = α_2 ∧ β_1 = β_2.
3.4.21 Exercise Prove Theorem 3.4.18.
Chapter 4
Pure Logic
To fix the formal framework we recall the notions of first- and second-order languages and introduce calculi for first- and weak second-order predicate logic.
4.1 Heuristics
To begin with, we will follow Hilbert's programme and, in a first step, try to formalize, if not the whole of mathematics, at least parts of it. To get a feeling for how this could be done, we start with some heuristic remarks. The object(ive)s of mathematical research are, broadly speaking, structures. The "working mathematician" wants to figure out which theorems hold in the structure of his or her interest. Therefore (s)he needs a language in which (s)he can formulate the theorems. This is usually done in English (or in another language) augmented with technical terms that are characteristic for the structure. But as we know from mathematical logic only the technical terms matter; the use of English or any other colloquial language can (in principle) be dispensed with. We can do with the formal language of logic. From a heuristic point of view the basic ingredients for a formal language are:
• symbols for the elements of the structure; fixed elements symbolized by constants and arbitrary elements symbolized by variables
• symbols for functions
• predicate symbols which describe relations between terms (e.g., equality, less than, . . . ) and thus form primitive propositions
• logical connectives by which we can compose propositions
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, © Springer-Verlag Berlin Heidelberg 2009
• quantifiers ranging over the elements of a structure that allow us to express properties which are satisfied by some or all elements of the structure.
Here, we can also imagine quantifiers which range over the functions, or the subsets, of the structure or even over sets of subsets etc. In these cases we talk about higher order languages. If the range is restricted to the elements of the structure alone we talk about a first-order language. First-order logic is the logic that deals with first-order languages. From the basic symbols we build two types of well-formed expressions.
• Terms which are built up recursively from constants and variables by function symbols and
• formulas which are built up recursively from primitive propositions by logical connectives and quantifiers.
Once we have established a formal language which is adequate for a structure, we can formulate sentences in this language. The problem is now to figure out which sentences hold in the structure. This could be done by pure intuition. But it may happen that our intuition about the truth of a sentence is erroneous or may not be shared by our colleagues. Therefore we have to prove the truth of a sentence. The next problem we have to deal with is therefore to settle the notion of "proof". Experience tells us that a proof consists of a series of inferences. But how can we characterize an inference? Inferences have to preserve the truth of sentences in a structure. Let S ⊨ F denote that the sentence F holds in the structure S. Call an inference A_1, . . . , A_n ⊨_S F adequate for the structure S if S ⊨ A_i for i = 1, . . . , n implies S ⊨ F. Adequateness in a special structure does not, however, provide any progress. To secure that an inference A_1, . . . , A_n ⊨_S F is adequate for the structure S we have to ensure S ⊨ F. But that is what we aimed for and therefore it is completely superfluous to check also S ⊨ A_i for the premises A_i.
Therefore we have to widen the notion of inference such that it becomes independent of a particular structure, i.e., we regard only those inferences which are adequate for all possible structures. We define
A_1, . . . , A_n ⊨_L F :⇔ (∀S)[A_1, . . . , A_n ⊨_S F]
and call that a logical inference. Logical inferences preserve truth in all possible structures.¹ It is one of the big achievements of mathematical logic to have shown that in the case of first-order languages, the notion ⊨_L of a logical inference can be replaced by a set of formal rules. A formal rule is syntactically given by a figure of the form

F_1, . . . , F_n
----------------
       G
¹ By narrowing the range of possible test-structures S we obtain stronger logics but lose the possibility to replace ⊨_L by a strictly formal rule. (Cf. Exercise 5.4.13).
where F_1, . . . , F_n, G is a finite set of formulas in the formal language. Let F_1, . . . , F_n ⊢ G denote the fact that G is deducible from F_1, . . . , F_n, i.e., derivable from F_1, . . . , F_n by a finite number of applications of formal rules. The theorems which connect logical inferences and deducibility are the Soundness and Completeness Theorems for first-order logic which state that there are formal rules such that
A_1, . . . , A_n ⊨_L F ⇔ A_1, . . . , A_n ⊢ F.
The direction from left to right is known as the Completeness Theorem while the opposite direction is called the Soundness or Correctness Theorem. Since formal rules are syntactically defined, it is decidable whether a formal rule is correctly applied. So it becomes decidable whether a finite series of formal inferences is correct. If we start from a set A_1, . . . , A_n of sentences, which are obviously true in the structure S and infer A_1, . . . , A_n ⊢ F we can replace ⊢ by ⊨_L and obtain S ⊨ F, i.e., that F is a theorem of S. It remains decidable whether the presented proof is correct, i.e., whether the rules have been applied correctly. This is one of the most important features of mathematics. The truth of its proved theorems is (in principle) machine-checkable on the basis of logically legitimated rules and does not depend on the knowledge of conventions and regulations which are only accessible to a selected group of insiders. This is probably one of the reasons for the broad applicability of mathematics. Of course no "working mathematician" works formally. Finding proofs is a matter of high intuition. But in writing up a proof she or he is (at least unconsciously) aware that mathematical proofs are in principle syntactically formalizable and thus decidable. Otherwise it would be impossible for the mathematical community to check the correctness of a proof. Apparently this has been common knowledge to the mathematical community long before Gödel was able to prove the completeness theorem which states that there are calculi, i.e., sets of formal rules, which are complete for logical reasoning.² The notion of logical inference is closely connected to that of logical validity. A sentence F is logically valid (denoted by ⊨ F) if it holds in all structures. According to the definition of a logical inference, a sentence is logically valid if and only if it is derivable from an empty set of premises, i.e., ⊨ F ⇔ ⊨_L F.
² Hilbert in [41] apparently already anticipated the existence of such calculi.
An immediate consequence of the definition of a logical inference is the deduction theorem stating
(A_1, . . . , A_n, G ⊨_L F) ⇔ (A_1, . . . , A_n ⊨_L G → F).
Therefore we obtain
A_1, . . . , A_n ⊨_L F ⇔ ⊨ A_1 ∧ · · · ∧ A_n → F.
Since logical validity and thence also logical deducibility rely on all suited structures we cannot expect to obtain information about a particular structure purely logically. If we want to prove something about a special structure we have to anticipate
basic facts which we consider to be characteristic of that structure. These facts form the axioms. We know, however, that for sufficiently complex structures it is impossible to specify axioms which characterize the structure completely. This incompleteness is twofold. First, it is impossible to characterize an infinite structure by a set of first-order axioms up to isomorphism. Even if we take as axioms all the sentences which are true in a structure S there are still structures which are not isomorphic to S but satisfy exactly the same first-order sentences. This is a consequence of the compactness theorem for first-order logic which says that an infinite set M of first-order sentences is satisfiable in a structure if and only if every finite subset of M is satisfiable. In full second-order logic there is an axiom system which characterizes the structure of natural numbers up to isomorphism but there is no complete deduction formalism for second-order logic. The correctness of a proof in second-order logic is therefore not machine-checkable. There are, however, formal calculi which are at least sound for second-order languages. From a logical point of view, these calculi are not really second-order calculi but rather two sorted first-order calculi. This distinction is a topic of an introductory text of Mathematical Logic and will not be discussed further here. We will, however, introduce a second-order language for arithmetic and set theory and (especially in the exercises) sometimes also work with formal calculi for second-order logic. If we insist on machine-verifiability of mathematical proofs it makes no sense to take all valid first-order sentences of a structure as axioms because it is in general not decidable whether a sentence is valid in a structure. To check whether A_1, . . . , A_n ⊢ F is a correct proof requires us also to check that A_1, . . . , A_n are axioms. The property of being an axiom must therefore be decidable.
Here comes another incompleteness which is known as Gödel's first incompleteness theorem. It is generally impossible to derive all first-order theorems of a structure from a decidable set of axioms. Exploring the limits of the deductive power of decidable axiom systems is one of the aims of this book. The following sections fix the formal logical framework for this book and recall some general background and results of Mathematical Logic.
4.2 First-Order and Second-Order Logics
Logical languages describe structures. In a formal language for logic we distinguish between logical and non-logical symbols. The logical symbols are common to all languages while the non-logical symbols are characteristic for the intended structures.
4.2.1 Definition (Logical symbols) The logical symbols are:
• Countably many free object variables, denoted by u, v, w, u_0, . . ..
• Countably many bounded object variables, denoted by x, y, z, x_0, . . ..
• For every nonzero natural number n countably many n-ary free relation variables, denoted by U, V, U_0, . . . .
• For every nonzero natural number n countably many n-ary bounded relation variables, denoted by X, Y, Z, X_0, . . . .
• The propositional connectives ¬, ∨, ∧, →.
• The quantifiers ∀ and ∃.
• Auxiliary symbols such as parentheses, square brackets etc.
4.2.2 Definition The non-logical symbols of a language comprise:
• A set C of constants.
• A set F of function symbols. Every function symbol f ∈ F is equipped with an arity #f which is a nonzero natural number.
• A set R of relation symbols. Every relation symbol R ∈ R comes with an arity #R which is a nonzero natural number.
4.2.3 Definition (Inductive definition of the L-terms)
• Every constant c is an L-term with FV(c) = ∅.
• Every free object variable u is an L-term with FV(u) = {u}.
• If t_1, . . . , t_n are L-terms and f is an n-ary function symbol then (f t_1 . . . t_n) is an L-term with FV(f t_1 . . . t_n) = FV(t_1) ∪ · · · ∪ FV(t_n).
We call FV(t) the set of variables which occur freely in the term t.
4.2.4 Definition (Inductive definition of the L-formulas)
• If t_1, . . . , t_n are L-terms and R is an n-ary relation symbol then (R t_1 . . . t_n) is an atomic formula with FV(R t_1 . . . t_n) = FV(t_1) ∪ · · · ∪ FV(t_n), BV(R t_1 . . . t_n) = ∅ and FV2(R t_1 . . . t_n) = BV2(R t_1 . . . t_n) = ∅.
• If U is an n-ary free relation variable and t_1, . . . , t_n are L-terms then (U t_1 . . . t_n) is an atomic formula with FV(U t_1 . . . t_n) = FV(t_1) ∪ · · · ∪ FV(t_n), BV(U t_1 . . . t_n) = BV2(U t_1 . . . t_n) = ∅ and FV2(U t_1 . . . t_n) = {U}. Instead of (U t_1 . . . t_n) we often also write (t_1, . . . , t_n) ε U.
• Every atomic formula is a formula.
• If F is a formula then (¬F) is a formula with FV(¬F) = FV(F), BV(¬F) = BV(F), FV2(¬F) = FV2(F) and BV2(¬F) = BV2(F).
• If F and G are formulas then (F ∧ G), (F ∨ G) and (F → G) are formulas with FV(F ◦ G) = FV(F) ∪ FV(G), BV(F ◦ G) = BV(F) ∪ BV(G), FV2(F ◦ G) = FV2(F) ∪ FV2(G) and BV2(F ◦ G) = BV2(F) ∪ BV2(G) for ◦ ∈ {∧, ∨, →}.
• If F is a formula and x ∉ BV(F) then (∀x)F_u(x) and (∃x)F_u(x) are formulas with FV((Qx)F_u(x)) = FV(F) \ {u}, BV((Qx)F_u(x)) = BV(F) ∪ {x}, FV2((Qx)F_u(x)) = FV2(F) and BV2((Qx)F_u(x)) = BV2(F) for Q ∈ {∀, ∃}.
• If F is a formula and X ∉ BV2(F) then (∀X)F_U(X) and (∃X)F_U(X) are formulas with FV((QX)F_U(X)) = FV(F), BV((QX)F_U(X)) = BV(F), FV2((QX)F_U(X)) = FV2(F) \ {U} and BV2((QX)F_U(X)) = BV2(F) ∪ {X} for Q ∈ {∀, ∃}.³
A language is characterized by its non-logical symbols. We call (C, F, R) together with the arities of the function and predicate symbols the signature of the language. To interpret a formal language we need a structure which matches the signature of the language. This is the following definition.
4.2.5 Definition Let L(C, F, R) be a formal language. An L-structure S = (S, S) is a nonvoid set S (the domain of the structure) together with a mapping S which assigns an element c^S ∈ S to every constant c ∈ C, a function f^S : S^{#f} → S to every function symbol f ∈ F and a relation R^S ⊆ S^{#R} to every relation symbol R ∈ R. Let S = (S, S) be an L-structure. An S-assignment is a family Φ = (Φ^1, Φ^2_n)_{n∈ω\{0}} of mappings Φ^1 : FV → S and Φ^2_n : FV2_n → Pow(S^n) for all positive natural numbers n where FV denotes the set of all free object variables and FV2_n the set of all n-ary relation variables. Usually we write shortly Φ(u) instead of Φ^1(u) and Φ(U) instead of Φ^2_n(U) since it is mostly obvious from the context which mapping applies.
4.2.6 Definition (Inductive definition of the value t^S[Φ] for an L-term t in a structure S with the S-assignment Φ)
• c^S[Φ] := c^S.
• u^S[Φ] := Φ(u).
• (f t_1 . . . t_n)^S[Φ] := f^S(t_1^S[Φ], . . . , t_n^S[Φ]).
³ Here F_u(x) and F_U(X) stand for the symbol string that is obtained from F by replacing all occurrences of u or U by x or X, respectively.
Observe that the interpretation t^S[Φ] of an L-term is an element of S. To define that a structure satisfies a formula with an assignment we introduce the notation
Φ ∼_u Ψ :⇔ Φ(v) = Ψ(v) for all v ≠ u
for S-assignments Φ and Ψ. We use the analogous notation also for the second-order variables.
4.2.7 Definition (Inductive definition of the satisfaction relation S ⊨ F[Φ] for an L-formula F in an L-structure S with an S-assignment Φ)
• S ⊨ (R t_1 . . . t_n)[Φ] iff (t_1^S[Φ], . . . , t_n^S[Φ]) ∈ R^S.
• S ⊨ (U t_1 . . . t_n)[Φ] iff (t_1^S[Φ], . . . , t_n^S[Φ]) ∈ Φ(U).
• S ⊨ ¬F[Φ] iff S ⊭ F[Φ].
• S ⊨ (F ∧ G)[Φ] iff S ⊨ F[Φ] and S ⊨ G[Φ].
• S ⊨ (F ∨ G)[Φ] iff S ⊨ F[Φ] or S ⊨ G[Φ].
• S ⊨ (F → G)[Φ] iff S ⊭ F[Φ] or S ⊨ G[Φ].
• S ⊨ (∀x)F_u(x)[Φ] iff S ⊨ F[Ψ] for all assignments Ψ ∼_u Φ.
• S ⊨ (∃x)F_u(x)[Φ] iff S ⊨ F[Ψ] for some assignment Ψ ∼_u Φ.
• S ⊨ (∀X)F_U(X)[Φ] iff S ⊨ F[Ψ] for all assignments Ψ ∼_U Φ.
• S ⊨ (∃X)F_U(X)[Φ] iff S ⊨ F[Ψ] for some assignment Ψ ∼_U Φ.
We read S ⊨ F[Φ] as "S satisfies F with the assignment Φ" or as "F is true in S with the assignment Φ". In the above definition we used S ⊭ F[Φ] to denote that S does not satisfy F with the assignment Φ.
We will often denote by F(v_1, . . . , v_n) that FV(F) ⊆ {v_1, . . . , v_n}. In general, however, it does not mean FV(F) = {v_1, . . . , v_n}. The same notation is also used for second-order variables.
Class terms of the form {x_1, . . . , x_n | A_{u_1,...,u_n}(x_1, . . . , x_n)} are not constituents of the formal language. We will, however, use them in the form (t_1, . . . , t_n) ε {x_1, . . . , x_n | A_{u_1,...,u_n}(x_1, . . . , x_n)} as an "abbreviation" for the formula A_{u_1,...,u_n}(t_1, . . . , t_n) (which in fact sometimes is easier to write and read). If F and A are formulas we define
F_U({x_1, . . . , x_n | A_{u_1,...,u_n}(x_1, . . . , x_n)})
as the formula which is obtained from F by replacing all occurrences (U t_1 . . . t_n) in F by A_{u_1,...,u_n}(t_1, . . . , t_n). Sometimes we write just F_U(A) or even shorter F(A) if it is clear which variable is to be replaced.
We have introduced L as a second-order language. We call FV(F) the free first-order variables of F, BV(F) the bounded first-order variables of F and analogously FV2(F) and BV2(F) the free or bounded second-order variables of F. A term t with FV(t) = ∅ is closed. A formula F is a sentence if FV(F) = FV2(F) = ∅. A formula F is called first-order if BV2(F) = ∅. Observe that the value t^S[Φ] and the satisfaction relation S ⊨ F[Φ] only depend on the values of Φ on the free variables which occur in t or F, respectively, i.e., we have
Φ↾FV(t) = Ψ↾FV(t) ⇒ t^S[Φ] = t^S[Ψ] (4.1)
and
Φ↾(FV(F) ∪ FV2(F)) = Ψ↾(FV(F) ∪ FV2(F)) ⇒ (S ⊨ F[Φ] ⇔ S ⊨ F[Ψ]). (4.2)
We say that an L-formula F is satisfiable if there is a structure S and an S-assignment Φ such that S ⊨ F[Φ]. We call F valid (or true) in a structure S if S ⊨ F[Φ] for all S-assignments Φ. It follows from (4.2) that sentences have a truth value in a structure which is independent of the choice of assignments.
4.2.8 Definition (Logical consequence) Let M be a set of L-formulas. We say that F is a logical consequence of M, denoted by M ⊨_L F, if for every L-structure S and every S-assignment Φ which satisfies S ⊨ G[Φ] for all G ∈ M we also have S ⊨ F[Φ]. A formula F is logically valid, denoted by ⊨ F, if it is a logical consequence of the empty set, i.e., if S ⊨ F[Φ] for all L-structures S and all S-assignments Φ. Two formulas F and G are logically equivalent, denoted by F ≡_L G, if F ⊨_L G and G ⊨_L F. We define F ↔ G :⇔ (F → G) ∧ (G → F) and obtain F ≡_L G if and only if ⊨ F ↔ G.
The notion of logical consequence can be reformulated in terms of satisfiability.
4.2.9 Theorem A formula F is a logical consequence of a set M of formulas if and only if the set M ∪ {¬F} is unsatisfiable.
Proof If M ⊨_L F then M ∪ {¬F} is unsatisfiable because every structure S and every S-assignment Φ which satisfies all formulas in M also satisfies F. Thus S ⊭ ¬F[Φ]. On the other hand if M ∪ {¬F} is unsatisfiable then every structure S and every S-assignment Φ which satisfies all formulas in M has to falsify ¬F and thus satisfies F.
An immediate consequence of Theorem 4.2.9 is the Deduction Theorem.
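The inductive clauses of Definitions 4.2.6 and 4.2.7 translate almost literally into a recursive evaluator. The sketch below is illustrative only and rests on several assumptions that are not in the text: the structure has a finite domain (so that quantifiers can be checked by enumeration), only first-order formulas are handled, formulas are encoded as nested tuples, and a quantifier binds a variable name directly instead of using the substitution notation F_u(x). All names (`term_value`, `satisfies`, `Z3`) are hypothetical.

```python
def term_value(t, struct, assign):
    """Value of a term under an assignment, following Definition 4.2.6."""
    kind = t[0]
    if kind == "const":
        return struct["consts"][t[1]]
    if kind == "var":
        return assign[t[1]]
    if kind == "app":
        args = [term_value(s, struct, assign) for s in t[2]]
        return struct["funcs"][t[1]](*args)
    raise ValueError(f"unknown term: {t!r}")


def satisfies(F, struct, assign):
    """Satisfaction relation of Definition 4.2.7, first-order part only."""
    kind = F[0]
    if kind == "rel":
        args = tuple(term_value(s, struct, assign) for s in F[2])
        return args in struct["rels"][F[1]]
    if kind == "not":
        return not satisfies(F[1], struct, assign)
    if kind == "and":
        return satisfies(F[1], struct, assign) and satisfies(F[2], struct, assign)
    if kind == "or":
        return satisfies(F[1], struct, assign) or satisfies(F[2], struct, assign)
    if kind == "imp":
        return (not satisfies(F[1], struct, assign)) or satisfies(F[2], struct, assign)
    if kind in ("forall", "exists"):
        _, x, body = F
        # Vary the assignment at x only, as in the relation between Psi and Phi.
        values = (satisfies(body, struct, {**assign, x: d}) for d in struct["domain"])
        return all(values) if kind == "forall" else any(values)
    raise ValueError(f"unknown formula: {F!r}")


# Hypothetical example structure: addition modulo 3 with equality.
Z3 = {
    "domain": range(3),
    "consts": {"0": 0},
    "funcs": {"+": lambda a, b: (a + b) % 3},
    "rels": {"=": {(d, d) for d in range(3)}},
}

x, u, zero = ("var", "x"), ("var", "u"), ("const", "0")


def plus(s, t):
    return ("app", "+", (s, t))


def eq(s, t):
    return ("rel", "=", (s, t))


NEUTRAL = ("forall", "x", eq(plus(x, zero), x))
IDEMPOTENT = ("exists", "x", ("and", eq(plus(x, x), x), ("not", eq(x, zero))))
DOUBLE_IS_ZERO = eq(plus(u, u), zero)  # has the free variable u
```

The sentence `NEUTRAL` needs no assignment, in line with (4.2), whereas the truth value of `DOUBLE_IS_ZERO` depends on the value assigned to `u`. For infinite structures such as the natural numbers the quantifier clauses cannot be evaluated by enumeration, so this sketch says nothing about decidability there.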
4.2.10 Theorem (Deduction Theorem) A formula F is a logical consequence of a set M ∪ {A_1, . . . , A_n} if and only if A_1 ∧ · · · ∧ A_n → F is a logical consequence of M.
Proof This is obvious by Theorem 4.2.9 since M ∪ {A_1, . . . , A_n} ∪ {¬F} is unsatisfiable iff M ∪ {¬(A_1 ∧ · · · ∧ A_n → F)} is unsatisfiable.
4.2.11 Exercise Let L be a first-order language.
(a) Define t_u(s) and F_u(s) inductively for L-terms s and t and L-formulas F.
(b) Prove that t_u(s) is again an L-term and F_u(s) an L-formula.
4.2.12 Exercise Give an inductive definition of F_U(A) for L-formulas F and A and prove that F_U(A) is again a formula.
4.2.13 Exercise Prove the following claims.
(a) ⊨ (∀x)F_u(x) → F_u(t) for any L-term t.
(b) ⊨ (∀X)F_U(X) → F_U(G) for any L-formula G such that F_U(G) is again a formula.
Assume that M is a set of L-formulas. Show
(c) M ⊨_L A → F implies M ⊨_L A → (∀x)F_u(x) if u ∉ FV(M ∪ {A}),
(d) M ⊨_L A → F implies M ⊨_L A → (∀X)F_U(X) if U ∉ FV2(M ∪ {A}).
4.2.14 Exercise Let L_N = ({0, 1}, {+, ·}, {<, =}) be the language of number theory and N := (N, {0, 1}, {+, ·}, {<, =}) be the L_N-structure of the natural numbers. Define formulas F(u) and G such that
(a) N ⊨ F[Φ] iff Φ(u) is a prime number.
(b) N ⊨ G iff there are infinitely many natural numbers n such that n and n + 2 are prime.
4.2.15 Exercise Let F(u, v) be an L-formula. Decide whether the following formulas are logically valid:
(a) (∀x)(∃y)F_{u,v}(x, y) → (∃x)(∀y)F_{u,v}(x, y)
(b) (∃x)(∀y)F_{u,v}(x, y) → (∀x)(∃y)F_{u,v}(x, y)
4.2.16 Exercise
(a) Let F and G be L-formulas. Show ⊨ (∀x)(F ∧ G)_u(x) → (∀x)F_u(x) ∧ (∀x)G_u(x).
(b) Let L be a first-order language which contains the constant 0 and the relation symbol =. Find L-formulas F and G such that ⊭ (∀x)(F ∨ G)_u(x) → (∀x)F_u(x) ∨ (∀x)G_u(x).
4.3 The Tait-Calculus
As mentioned in the previous section, there are formal calculi which are sound and complete for M ⊨_L F if M ∪ {F} is a set of first-order formulas. This section will introduce such a calculus which is especially adapted for proof-theoretic studies. In his pioneering papers [28] and [29] G. Gentzen formalized logical reasoning by his sequent calculus. A sequent has the form A_1, . . . , A_m ⇒ B_1, . . . , B_n, where A_i and B_j are first-order formulas. The intended meaning of this sequent is {A_1, . . . , A_m} ⊨_L B_1 ∨ · · · ∨ B_n. The sequence (A_1, . . . , A_m) is the antecedent, the sequence (B_1, . . . , B_n) the succedent of the sequent. Gentzen's striking result is his Hauptsatz which states that there is a sound and complete set of rules for the sequent calculus which is deviation free, i.e., every formula occurring in the premises of an inference already occurs, at least as a subformula, in its conclusion. The paradigmatic example for an inference which incorporates a deviation is the cut rule

F, A_1, . . . , A_m ⇒ B_1, . . . , B_n      A_1, . . . , A_m ⇒ B_1, . . . , B_n, F
--------------------------------------------------------------------------------
                      A_1, . . . , A_m ⇒ B_1, . . . , B_n

where the "deviation" via the cut-formula F is unrecoverable from the conclusion. Examples for the rules in the sequent calculus are

(∧-rules)
Δ, A ⇒ Γ
--------------
Δ, A ∧ B ⇒ Γ

Δ_1 ⇒ Γ_1, A      Δ_2 ⇒ Γ_2, B
--------------------------------
Δ_1, Δ_2 ⇒ Γ_1, Γ_2, A ∧ B

and

(∨-rules)
Δ_1, A ⇒ Γ_1      Δ_2, B ⇒ Γ_2
--------------------------------
Δ_1, Δ_2, (A ∨ B) ⇒ Γ_1, Γ_2

Δ ⇒ Γ, A
--------------
Δ ⇒ Γ, A ∨ B

These examples show that we need two inferences for every logical symbol, one for the antecedent and one for the succedent. Antecedent and succedent are connected via the rules for the negation symbol which are

Δ, A ⇒ Γ
------------
Δ ⇒ Γ, ¬A

Δ ⇒ Γ, A
------------
Δ, ¬A ⇒ Γ
It was W. Tait's observation that by removing the negation symbol from the basic propositional connectives, the distinction between antecedent and succedent in sequents becomes dispensable. This reduces the number of required inferences by half.⁴ To fix the syntactical framework for this book, we introduce a formal system for first-order logic (without identity) which is based on a one-sided sequent calculus à la Tait. Let L be a first-order language. We define the Tait language L_T of L by introducing for every relation variable U a new relation variable ¬U and for every relation symbol R a new relation symbol ¬R of the same arity as U or R, respectively. As logical symbols we only allow the positive boolean connectives ∧ and ∨ and the quantifiers ∀ and ∃. Terms and formulas are defined according to Definitions 4.2.3 and 4.2.4 with respect to the extended number of relation variables and symbols and the restricted number of logical connectives. We expand an L-structure M to an L_T-structure M_T by interpreting (¬R)^M as the complement of R^M. An M-assignment Φ is extended to an M_T-assignment Φ_T by defining Φ_T(¬U) := {(s_1, . . . , s_n) | (s_1, . . . , s_n) ∉ Φ(U)}. Negation is not among the logical symbols but becomes definable via de Morgan's laws.
4.3.1 Definition For a formula F in the Tait-language L_T we define ∼F by the following clauses.
• ∼(R t_1, . . . , t_n) :≡ (¬R t_1, . . . , t_n); ∼(¬R t_1, . . . , t_n) :≡ (R t_1, . . . , t_n).
• ∼(U s_1, . . . , s_n) :≡ (¬U s_1, . . . , s_n); ∼(¬U s_1, . . . , s_n) :≡ (U s_1, . . . , s_n).
• ∼(A ∧ B) :≡ ∼A ∨ ∼B; ∼(A ∨ B) :≡ ∼A ∧ ∼B.
• ∼(∀x)F(x) :≡ (∃x)∼F(x); ∼(∃x)F(x) :≡ (∀x)∼F(x).
• ∼(∀X)F(X) :≡ (∃X)∼F(X); ∼(∃X)F(X) :≡ (∀X)∼F(X).
For any L -structure M, any M-assignment Φ and any LT -formula F we then obtain, MT |= ∼F[ΦT ] ⇔ M |= ¬F[Φ ].
(4.3)
Therefore we mostly identify $\neg F$ and $\sim F$ although ¬ is not among the basic connectives of $\mathcal{L}_T$. Replacing ¬ by ∼ we obtain a translation $F \mapsto F^T$ from a language $\mathcal{L}$ into its TAIT-language $\mathcal{L}_T$. We will, however, commonly identify an $\mathcal{L}$-formula $F$ and its TAIT-translation $F^T$. It will always be clear from the context whether we talk about $F$ or $F^T$.

⁴ In a private talk William Tait told me that one of his objectives was to reduce the number of cases in proof theory.
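The clauses of Definition 4.3.1 translate directly into a recursive function on syntax trees. The following is a minimal sketch, not part of the original text: the tuple encoding of formulas (tags `"atom"`, `"natom"`, `"and"`, `"or"`, `"all"`, `"ex"`) and the names `dual` and `holds` are assumptions made for illustration only. The evaluator `holds` covers only the quantifier-free fragment, which is enough to observe the propositional content of (4.3): $\sim F$ is true exactly when $F$ is false.

```python
# Hypothetical encoding (an assumption of this sketch):
#   ("atom",  "R", args)   stands for  R t1 ... tn
#   ("natom", "R", args)   stands for  (not-R) t1 ... tn
#   ("and", A, B), ("or", A, B), ("all", x, A), ("ex", x, A)

def dual(f):
    """Return ~F following the clauses of Definition 4.3.1."""
    tag = f[0]
    if tag == "atom":
        return ("natom",) + f[1:]
    if tag == "natom":
        return ("atom",) + f[1:]
    if tag == "and":
        return ("or", dual(f[1]), dual(f[2]))
    if tag == "or":
        return ("and", dual(f[1]), dual(f[2]))
    if tag == "all":
        return ("ex", f[1], dual(f[2]))
    if tag == "ex":
        return ("all", f[1], dual(f[2]))
    raise ValueError(f"unknown formula tag: {tag!r}")


def holds(f, val):
    """Truth value of a quantifier-free formula.

    `val` maps (relation name, args) to a bool; a negated atom is read
    as the complement of the corresponding positive atom.
    """
    tag = f[0]
    if tag == "atom":
        return val[f[1:]]
    if tag == "natom":
        return not val[f[1:]]
    if tag == "and":
        return holds(f[1], val) and holds(f[2], val)
    if tag == "or":
        return holds(f[1], val) or holds(f[2], val)
    raise ValueError("quantifiers are not evaluated in this sketch")


# Example: F = R(a) and (not-Q or R(b))
F = ("and", ("atom", "R", ("a",)),
            ("or", ("natom", "Q", ()), ("atom", "R", ("b",))))
print(dual(F))
```

Since every clause swaps a symbol with its partner, applying `dual` twice returns the original formula, which matches the fact that $\sim\sim F$ and $F$ are syntactically identical.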
The TAIT-calculus for first-order logic derives finite sets (not sequences) of first-order formulas in the TAIT-language which are to be interpreted disjunctively. Unless otherwise mentioned we will only talk about first-order formulas in this and the following section. We use upper case Greek letters $\Gamma, \Delta, \Lambda, \Gamma_1, \dots$ as syntactical variables for finite sets of formulas. Instead of $\Delta \cup \Gamma$ we commonly write $\Delta, \Gamma$. Instead of $\Delta \cup \{F\}$ we write $\Delta, F$.

4.3.2 Definition (The TAIT-calculus) We define the derivation relation $\vdash_{\mathsf{T}}^{m} \Delta$ inductively by the following clauses.

(AxL) If $A$ is an atomic formula then $\vdash_{\mathsf{T}}^{m} \Delta, A, \sim A$ for all natural numbers $m$.

(∨) If $\vdash_{\mathsf{T}}^{m_0} \Delta, A_i$ for some $i \in \{1, 2\}$, then $\vdash_{\mathsf{T}}^{m} \Delta, A_1 \vee A_2$ for all $m > m_0$.

(∧) If $\vdash_{\mathsf{T}}^{m_i} \Delta, A_i$ and $m_i < m$ for all $i \in \{1, 2\}$, then $\vdash_{\mathsf{T}}^{m} \Delta, A_1 \wedge A_2$.

(∃) If $\vdash_{\mathsf{T}}^{m_0} \Delta, A_v(t)$, then $\vdash_{\mathsf{T}}^{m} \Delta, (\exists x)A_v(x)$ for all $m > m_0$.

(∀) If $\vdash_{\mathsf{T}}^{m_0} \Delta, A(u)$ and $u$ not free in $\Delta$, then $\vdash_{\mathsf{T}}^{m} \Delta, (\forall x)A_u(x)$ for all $m > m_0$.

The derivation relation $\vdash_{\mathsf{T}}^{m} \Delta$ should be read as: "There is a derivation of length $\leq m$ of the finite set $\Delta$".
As an immediate consequence of Definition 4.3.2 we obtain the structural rule

(STR) If $\vdash_{\mathsf{T}}^{m} \Delta$, $m \leq n$ and $\Delta \subseteq \Gamma$ then $\vdash_{\mathsf{T}}^{n} \Gamma$.

We leave the straightforward proof by induction on $m$ as an exercise.

First we observe that this calculus is sound. Let $\bigvee\Delta := \bigvee\{F \mid F \in \Delta\}$.

4.3.3 Theorem (Soundness Theorem) If $\vdash_{\mathsf{T}}^{m} \Delta$ then $\models \bigvee\Delta$.

Proof Let $\mathfrak{M}$ be an $\mathcal{L}$-structure and $\Phi$ an $\mathfrak{M}$-assignment. We show
$$\vdash_{\mathsf{T}}^{m} \Delta \text{ implies } \mathfrak{M} \models \bigvee\Delta[\Phi] \qquad (i)$$
by induction on $m$. If the premise of (i) holds by (AxL) then $\Delta$ contains a formula $A$ and its dual $\sim A$. According to (4.3) we obtain $\mathfrak{M} \models (A \vee \sim A)[\Phi]$, which implies $\mathfrak{M} \models \bigvee\Delta[\Phi]$. If the premise of (i) holds by a rule we obtain the claim from the induction hypothesis because the rules preserve validity. Let us just check the case of an inference (∀) to pinpoint the role of the variable condition in this rule. If $\mathfrak{M} \not\models (\forall x)A_u(x)[\Phi]$ then there is an assignment $\Psi \sim_u \Phi$ such that $\mathfrak{M} \not\models A[\Psi]$. From the premise $\vdash_{\mathsf{T}}^{m_0} \Delta, A$ we obtain $\mathfrak{M} \models (\bigvee\Delta \vee A)[\Psi]$ by the induction hypothesis. Hence $\mathfrak{M} \models \bigvee\Delta[\Psi]$. But, since $u$ does not occur in $\Delta$, we have $\mathfrak{M} \models \bigvee\Delta[\Psi]$ if and only if $\mathfrak{M} \models \bigvee\Delta[\Phi]$ and thus also $\mathfrak{M} \models (\bigvee\Delta \vee (\forall x)A_u(x))[\Phi]$.

4.3.4 Exercise Give an inductive definition of the translation $F \mapsto F^T$.
4.4 Trees and the Completeness Theorem We are now going to show that the TAIT-calculus is also complete.
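4.4.1 Theorem (Completeness Theorem) Assume $M \models_{\mathcal{L}} \bigvee\Delta$ for a countable set $M$ and a finite set $\Delta$ of $\mathcal{L}_T$-formulas. Then there is a finite subset $\Gamma \subseteq M$ and an $m \in \omega$ such that $\vdash_{\mathsf{T}}^{m} \neg\Gamma, \Delta$, where $\neg\Gamma := \{\neg G \mid G \in \Gamma\}$.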
We will prove Theorem 4.4.1 by the method of search trees introduced by Schütte. Therefore we have to describe trees mathematically. A tree can be visualized as shown in Fig. 4.1. A tree grows out of its root. We enumerate the immediate successors of a node from left to right. Then every node in a tree can be addressed by a finite sequence of numbers as indicated in Fig. 4.1. This leads to the following definition.

[Fig. 4.1 Visualization of a (mathematical) tree. The nodes are addressed by finite sequences such as ⟨⟩, ⟨0⟩, ⟨1⟩, ⟨2⟩, ⟨0,0⟩, ⟨1,0⟩, ⟨2,2⟩, ⟨0,0,0⟩, ⟨2,1,0⟩, ⟨2,1,2⟩ and ⟨0,0,0,4⟩.]

4.4.2 Definition A tree is a set of sequence numbers which is closed under initial segments, i.e.,
$$\mathrm{Tree}(T) :\Leftrightarrow T \subseteq \mathrm{Seq} \wedge (\forall s \in T)(\forall t)[t \subseteq s \to t \in T],$$
where for sequence numbers $s$ and $t$ we define
$$t \subseteq s :\Leftrightarrow \mathrm{lh}(t) \leq \mathrm{lh}(s) \wedge (\forall i < \mathrm{lh}(t))[(t)_i = (s)_i].$$
A node $s \in T$ is a leaf if it possesses no successors in $T$, i.e., if $(\forall x)[s^\frown\langle x\rangle \notin T]$. A path in a tree $T$ is a subset $P \subseteq T$ which is linearly ordered by ⊆ and closed under initial segments, i.e.,
$$P \text{ is a path in } T :\Leftrightarrow P \subseteq T \wedge (\forall s \in P)(\forall t \in P)[s \subseteq t \vee t \subseteq s] \wedge (\forall s)(\forall t \in P)[s \subseteq t \to s \in P].$$
Observe that an infinite path $P$ can be identified with a function $f : \mathbb{N} \to \mathbb{N}$ such that $P = \{[f](m) \mid m \in \omega\}$, where $[f](m) := \langle f(0),\dots,f(m-1)\rangle$ codes the course-of-values of $f$ below $m$. A tree $T$ is well-founded if and only if it possesses no infinite paths. For $s \in T$ we define
$$T_s := \{t \mid s^\frown t \in T\}$$
and call $T_s$ the subtree of $T$ above $s$.

4.4.3 Lemma Let $T$ be a well-founded tree. Then we have the principles of bar induction and bar recursion, which are

(BI) $(\forall s)\big[(\forall x)[s^\frown\langle x\rangle \in T \Rightarrow F(s^\frown\langle x\rangle)] \Rightarrow F(s)\big] \Rightarrow (\forall s \in T)[F(s)]$

and

(BR) Let $G$ be a function and $T$ a well-founded tree. Then there is a function $F$ with $\mathrm{dom}(F) = T$ such that $F(s) = G(F{\restriction}T_s)$.

We will not prove the lemma here completely. The proof of (BI) is straightforward. From the assumption $(\exists s \in T)[\neg F(s)]$ we construct an infinite path $s_0, s_1, \dots$ such that $\neg F(s_i)$ holds for all $i \in \mathbb{N}$. By assumption we have $\neg F(s_0)$ for some $s_0 \in T$. Having constructed $s_0,\dots,s_n$ there is an $i \in \mathbb{N}$ such that $s_n{}^\frown\langle i\rangle \in T$ and $\neg F(s_n{}^\frown\langle i\rangle)$, because otherwise we had $F(s_n)$ by the hypothesis of (BI); we put $s_{n+1} := s_n{}^\frown\langle i\rangle$. The resulting infinite path contradicts the well-foundedness of $T$.

It is easy to see that a tree $T$ is well-founded iff the tree relation, defined by $s \prec_T t :\Leftrightarrow t \subsetneq s$, is well-founded. The proof of (BR) then becomes a special case of transfinite recursion along the tree relation and can be given within the framework of set theory (cf. Theorem 3.2.8 and Exercise 4.4.13).
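For finite trees the notions of Definition 4.4.2 and the idea behind bar recursion can be tried out directly. The sketch below is an illustration under stated assumptions, not the book's construction: nodes are Python tuples instead of sequence numbers, all function names are hypothetical, and `bar_recursion` only implements the special case in which the value at a node depends on the values at its immediate successors.

```python
def is_tree(T):
    """Tree(T): every initial segment of a node of T is again in T."""
    return all(s[:k] in T for s in T for k in range(len(s)))


def successors(T, s):
    """The immediate successors s^<x> of s inside T, from left to right."""
    return sorted(t for t in T if len(t) == len(s) + 1 and t[:len(s)] == s)


def is_leaf(T, s):
    """A node is a leaf if it possesses no successors in T."""
    return not successors(T, s)


def subtree(T, s):
    """T_s = { t | s^t in T }, the subtree of T above s."""
    return {t[len(s):] for t in T if t[:len(s)] == s}


def bar_recursion(T, G, s=()):
    """Compute F(s) = G([F(s^<x>) for all successors]) on a finite tree.

    This terminates because a finite tree has no infinite path.
    """
    return G([bar_recursion(T, G, t) for t in successors(T, s)])


def height(T):
    """sup { height(successor) + 1 }, computed by bar recursion from the root."""
    return bar_recursion(T, lambda vals: max((v + 1 for v in vals), default=0))


T = {(), (0,), (1,), (0, 0), (0, 1), (0, 1, 0)}
print(is_tree(T), height(T), sorted(subtree(T, (0,))))
```

The recursion in `height` has the same shape as a definition by supremum over the immediate successors of a node, so on finite trees it computes the length of the longest path starting at the root.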
Since it is easy to visualize how a function defined by bar recursion is constructed, bar recursion is sometimes regarded as a constructive principle.⁵ Using bar recursion we define the order type of a node $s$ in a well-founded tree $T$ by
$$\mathrm{otyp}_T(s) := \sup\{\mathrm{otyp}_T(s^\frown\langle x\rangle) + 1 \mid s^\frown\langle x\rangle \in T\}$$
and $\mathrm{otyp}(T) := \mathrm{otyp}_T(\langle\rangle)$. Bar induction on a well-founded tree $T$ then corresponds to induction on $\mathrm{otyp}_T(s)$.

⁵ The constructiveness of a function defined by bar recursion depends of course on the complexity of the well-founded tree.

These are the preliminaries which we need in the definition of the search tree. To define the search tree for a finite set $\Delta$ of $\mathcal{L}_T$-formulas we order $\Delta$ arbitrarily to obtain a finite sequence $\langle\Delta\rangle$. Instead of $\langle\Delta\rangle^\frown\langle F\rangle$ we write shortly $\langle\Delta\rangle, F$. The first formula in $\langle\Delta\rangle$, i.e., the formula with lowest index in the sequence $\langle\Delta\rangle$, which is not atomic, is its redex, denoted by $R(\langle\Delta\rangle)$. We put $R(\langle\Delta\rangle) := \emptyset$ if $\langle\Delta\rangle$ possesses no redex. The sequence $\langle\Delta\rangle^r$ is obtained by discarding the redex $R(\langle\Delta\rangle)$ in $\langle\Delta\rangle$.

Assume that $M$ is a countable set of $\mathcal{L}_T$-formulas and fix an enumeration for the formulas in $M$. Observe that there are only countably many terms which can be built from the variables, constants and function symbols occurring in $M \cup \Delta$. Let $\{t_i \mid i \in \mathbb{N}\}$ be an enumeration of these terms. We define the search tree $S^M_\Delta$ together with the label function $\delta$ by the following clauses.

($S_{\langle\rangle}$) $\langle\rangle \in S^M_\Delta \wedge \delta(\langle\rangle) = \langle\Delta\rangle$.

For the coming clauses assume $s \in S^M_\Delta$ and that $\delta(s)$, viewed as a finite set, is not an axiom according to (AxL).

($S_{Id}$) If $R(\delta(s)) = \emptyset$ then $s^\frown\langle 0\rangle \in S^M_\Delta$ and $\delta(s^\frown\langle 0\rangle) = \delta(s), \neg G_i$, where $G_i$ is the first formula in the enumeration of the set $M$ such that $\neg G_i \notin \bigcup_{s_0 \subseteq s} \delta(s_0)$. If there is no such $G_i$ then $\delta(s^\frown\langle 0\rangle) = \delta(s)$.

($S_\wedge$) If $R(\delta(s))$ is a formula $(A_0 \wedge A_1)$ then $s^\frown\langle i\rangle \in S^M_\Delta$ and $\delta(s^\frown\langle i\rangle) = \delta(s)^r, A_i$ for $i = 0, 1$.

($S_\forall$) If $R(\delta(s))$ is a formula $((\forall x)A_v(x))$ then $s^\frown\langle 0\rangle \in S^M_\Delta$ and $\delta(s^\frown\langle 0\rangle) = \delta(s)^r, A_v(u)$, where $u$ is the first free variable in a given enumeration of the free variables such that $u$ does not occur in $\delta(s)$ and $M$.

($S_\vee$) If the redex $R(\delta(s))$ is a formula $(A_0 \vee A_1)$ then $s^\frown\langle 0\rangle \in S^M_\Delta$ and $\delta(s^\frown\langle 0\rangle) = \delta(s)^r, A_i, \neg G_j, R(\delta(s))$, where $A_i$ is the first formula in $A_0, A_1$ which does not occur in $\bigcup_{s_0 \subseteq s} \delta(s_0)$ and $G_j$ the first formula in the enumeration of $M$ such that $\neg G_j$ does not occur in $\bigcup_{s_0 \subseteq s} \delta(s_0)$. If such a formula $A_i$ or $G_j$ does not exist we put $A_i = \emptyset$ or $G_j = \emptyset$, respectively.

($S_\exists$) If $R(\delta(s))$ is a formula $(\exists x)A_v(x)$ then $s^\frown\langle 0\rangle \in S^M_\Delta$ and we define $\delta(s^\frown\langle 0\rangle) = \delta(s)^r, A_v(t), \neg G_i, R(\delta(s))$, where $G_i$ is the first formula in the enumeration of $M$ such that $\neg G_i \notin \bigcup_{s_0 \subseteq s} \delta(s_0)$, with $G_i = \emptyset$ if such a formula does not exist, and $t$ is the first term in the enumeration of the terms such that $A_v(t)$ does not occur in $\bigcup_{s_0 \subseteq s} \delta(s_0)$. If no such term exists then $\delta(s^\frown\langle 0\rangle) = \delta(s)^r, R(\delta(s)), \neg G_i$.
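We observe that the search tree is a binary tree. If $S^M_\Delta$ is well-founded then $S^M_\Delta$ is finite by König's lemma and therefore all nodes in $S^M_\Delta$ have finite order type. We first show:

4.4.4 Lemma (Syntactical Main Lemma) If the search tree $S^M_\Delta$ is well-founded then for every $s \in S^M_\Delta$ there is a finite set $\Gamma_s \subseteq M$ such that $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$, where $\mathrm{otyp}(s)$ denotes the order type of $s$ in the well-founded tree $S^M_\Delta$.

Proof We show the lemma by bar induction on $S^M_\Delta$. If $s$ is a leaf in $S^M_\Delta$ then none of the clauses ($S_{Id}$) through ($S_\exists$) apply. Therefore $\delta(s)$, viewed as a set, is an axiom according to (AxL) and we obtain $\vdash_{\mathsf{T}}^{0} \delta(s)$ and choose $\Gamma_s := \emptyset$.

So assume $s \in S^M_\Delta$ such that $\delta(s)$ is not an axiom according to (AxL). If $R(\delta(s)) = \emptyset$ then $s^\frown\langle 0\rangle \in S^M_\Delta$ and by induction hypothesis there is a finite set $\Gamma_{s^\frown\langle 0\rangle}$ such that $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle 0\rangle)} \neg\Gamma_{s^\frown\langle 0\rangle}, \delta(s^\frown\langle 0\rangle)$. But either $\delta(s^\frown\langle 0\rangle) = \delta(s)$ or $\delta(s^\frown\langle 0\rangle) = \delta(s), \neg G_i$ for some formula $G_i \in M$, and we obtain $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$ by putting $\Gamma_s := \Gamma_{s^\frown\langle 0\rangle}$ in the first case and $\Gamma_s := \Gamma_{s^\frown\langle 0\rangle} \cup \{G_i\}$ in the second.

For $R(\delta(s)) \neq \emptyset$ we have to distinguish cases according to the shape of $R(\delta(s))$. If $R(\delta(s))$ is a formula $A_0 \wedge A_1$ we are in the case of ($S_\wedge$). Then $s^\frown\langle i\rangle \in S^M_\Delta$ for $i = 0, 1$ and we obtain $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle i\rangle)} \neg\Gamma_{s^\frown\langle i\rangle}, \delta(s^\frown\langle i\rangle)$ by the induction hypothesis, where $\delta(s^\frown\langle i\rangle) = \delta(s)^r, A_i$. By an inference (∧) we then obtain $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$ for $\Gamma_s := \Gamma_{s^\frown\langle 0\rangle} \cup \Gamma_{s^\frown\langle 1\rangle}$.

In the case that $R(\delta(s))$ is a formula $(\forall x)A_v(x)$ we have $s^\frown\langle 0\rangle \in S^M_\Delta$ and obtain $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle 0\rangle)} \neg\Gamma_{s^\frown\langle 0\rangle}, \delta(s)^r, A_v(u)$ by the induction hypothesis, where $u$ does not occur in $\delta(s)$ and $\Gamma_{s^\frown\langle 0\rangle}$. Putting $\Gamma_s := \Gamma_{s^\frown\langle 0\rangle}$ we obtain $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$ by an inference (∀).

Now assume that $R(\delta(s))$ is a formula $A_0 \vee A_1$. Then $s^\frown\langle 0\rangle \in S^M_\Delta$ and we obtain $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle 0\rangle)} \neg\Gamma_{s^\frown\langle 0\rangle}, \delta(s), \neg G_j$ or $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle 0\rangle)} \neg\Gamma_{s^\frown\langle 0\rangle}, \delta(s), A_i, \neg G_j$ for some $i \in \{0, 1\}$ and some formula $G_j \in M$ by the induction hypothesis. Put $\Gamma_s := \Gamma_{s^\frown\langle 0\rangle} \cup \{G_j\}$. In the first case we get $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$ by the structural rule; in the second case we need an inference (∨) to get $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$.

The last case is that $R(\delta(s))$ is a formula $(\exists x)A_u(x)$. Then $s^\frown\langle 0\rangle \in S^M_\Delta$ and we get $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle 0\rangle)} \neg\Gamma_{s^\frown\langle 0\rangle}, \delta(s), \neg G_i$ or $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s^\frown\langle 0\rangle)} \neg\Gamma_{s^\frown\langle 0\rangle}, \delta(s), A_u(t), \neg G_i$ for some formula $G_i \in M$ and some $\mathcal{L}$-term $t$ by the induction hypothesis. Letting $\Gamma_s := \Gamma_{s^\frown\langle 0\rangle} \cup \{G_i\}$ we get $\vdash_{\mathsf{T}}^{\mathrm{otyp}(s)} \neg\Gamma_s, \delta(s)$ either by the structural rule or by an application of an inference (∃).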
4.4.5 Lemma (Semantical Main Lemma) If the search tree $S^M_\Delta$ is not well-founded then there is a model $\mathfrak{M}$ and an assignment $\Phi$ over $\mathfrak{M}$ such that $\mathfrak{M} \models G[\Phi]$ for all formulas $G \in M$ but $\mathfrak{M} \not\models F[\Phi]$ for all formulas $F$ in $\Delta$.

Proof Assume that $S^M_\Delta$ is not well-founded. Then there is an infinite path in $S^M_\Delta$. Such a path can be viewed as a function $f : \mathbb{N} \to \mathbb{N}$ such that $[f](n) := \langle f(0),\dots,f(n-1)\rangle \in S^M_\Delta$ for all $n \in \mathbb{N}$. Let $\delta(f) := \bigcup_{n \in \mathbb{N}} \delta([f](n))$. We pick an infinite path $f$ in $S^M_\Delta$ and check the following properties.

(Atm) If an atomic formula $A$ occurs in $\delta([f](n))$ then $A$ occurs in all $\delta([f](m))$ for $m \geq n$.

(M) $\neg G \in \delta(f)$ for all $G \in M$.

(Red) If a non-atomic formula $F$ occurs in $\delta([f](n))$ then there is an $m \geq n$ such that $F = R(\delta([f](m)))$.

(∧) If a formula $(A_0 \wedge A_1)$ occurs in $\delta(f)$ then there is an $i \in \{0, 1\}$ such that $A_i \in \delta(f)$.

(∀) If a formula $((\forall x)A_v(x))$ occurs in $\delta(f)$ then there is a variable $u$ such that $A_v(u)$ occurs in $\delta(f)$.

(∨) If a formula $(A_0 \vee A_1)$ occurs in $\delta(f)$ then $A_i \in \delta(f)$ for every $i \in \{0, 1\}$.

(∃) If a formula $((\exists x)A_v(x))$ occurs in $\delta(f)$ then $A_v(t) \in \delta(f)$ for every term $t$.

Property (Atm) is obvious because atomic formulas are never discarded. Property (M) follows because one of the rules ($S_{Id}$), ($S_\vee$) or ($S_\exists$) is applied infinitely often in $f$. We prove property (Red) by induction on the number of non-atomic formulas in $\delta([f](n))$ with lower index than $F$. If there is no such formula then $F$ is the redex in $\delta([f](n))$. If $\delta([f](n))$ possesses a redex with lower index than $F$ then this redex is discarded in $\delta([f](n+1))$ and by induction hypothesis there is an $m \geq n + 1$ such that $F = R(\delta([f](m)))$.

Property (∧) follows because by property (Red) there is an $m$ such that $R(\delta([f](m))) = (A_0 \wedge A_1)$. Then $[f](m+1) = [f](m)^\frown\langle i\rangle$ and $A_i \in \delta([f](m+1))$ for some $i \in \{0, 1\}$. Similarly we show property (∀). If $(\forall x)A_v(x)$ occurs in $\delta([f](n))$ there is an $m$ such that $R(\delta([f](m))) = (\forall x)A_v(x)$. Then $A_v(u) \in \delta([f](m+1))$, which proves (∀).

To prove property (∨) assume first $A_0 \notin \delta(f)$. There is an $m$ such that $R(\delta([f](m))) = (A_0 \vee A_1)$. Then we have $[f](m+1) = [f](m)^\frown\langle 0\rangle$ and obtain $\delta([f](m+1)) = \delta([f](m))^r, A_0, \neg G_j, A_0 \vee A_1$ because $A_0 \notin \bigcup_{k \leq m} \delta([f](k))$. Therefore the assumption $A_0 \notin \delta(f)$ is false. Now assume $A_0 \in \delta(f)$ but $A_1 \notin \delta(f)$. Then there is an $n_0$ such that $A_0 \in \delta([f](n_0))$. Since also $(A_0 \vee A_1) \in \delta([f](n_0))$ there is by (Red) an $n \geq n_0$ such that $(A_0 \vee A_1) = R(\delta([f](n)))$. But then $A_1 \in \delta([f](n+1))$, showing that also the assumption $A_1 \notin \delta(f)$ is false.

We finally regard property (∃) and show $A_v(t_i) \in \delta(f)$ by induction on the index $i$ in our fixed enumeration of the terms. Assume $\{A_v(t_j), (\exists x)A_v(x)\} \subseteq \bigcup_{l < m_0} \delta([f](l))$ for all $j < i$ but $A_v(t_i) \notin \bigcup_{l < m_0} \delta([f](l))$. Observe that $(\exists x)A_v(x)$ is never discarded in $\delta(f)$. By (Red) there is therefore an $m \geq m_0$ such that $R(\delta([f](m))) = (\exists x)A_v(x)$. Then $[f](m+1) = [f](m)^\frown\langle 0\rangle$ and $\{A_v(t_i), (\exists x)A_v(x)\} \subseteq \delta([f](m+1))$.

We define a model $\mathfrak{M}$ together with an $\mathfrak{M}$-assignment $\Phi$. The domain of $\mathfrak{M}$ is the set $\{t_i \mid i \in \mathbb{N}\}$ of terms, and the assignment is $\Phi(u) := u$ for all free variables $u$. Any constant $c$ is interpreted by $c^{\mathfrak{M}} := c$ and the interpretation of a function symbol $f$ is defined by $f^{\mathfrak{M}}(t_1,\dots,t_n) := (f(t_1,\dots,t_n))$. Then we obtain $t^{\mathfrak{M}}[\Phi] = t$ for all terms $t$ by a simple induction on the complexity of the term $t$ and define the interpretation of an $n$-ary relation symbol by
$$R^{\mathfrak{M}} := \{(t_1,\dots,t_n) \mid (\neg Rt_1,\dots,t_n) \in \delta(f)\}.$$
From the above properties we then obtain
$$F \in \delta(f) \Rightarrow \mathfrak{M} \not\models F[\Phi] \qquad (i)$$
by induction on the complexity of the formula $F$. If $F$ is an atomic formula $Rt_1,\dots,t_n$ then $(\neg Rt_1,\dots,t_n) \notin \delta(f)$ by (Atm), since otherwise there existed an $m$ such that $\{(Rt_1,\dots,t_n), (\neg Rt_1,\dots,t_n)\} \subseteq \delta([f](m))$, contradicting the infinity of the path $f$. Hence $(t_1^{\mathfrak{M}}[\Phi],\dots,t_n^{\mathfrak{M}}[\Phi]) \notin R^{\mathfrak{M}}$, i.e., $\mathfrak{M} \not\models F[\Phi]$. If $F$ is a formula $(\neg Rt_1,\dots,t_n)$ we obtain $(t_1^{\mathfrak{M}}[\Phi],\dots,t_n^{\mathfrak{M}}[\Phi]) \in R^{\mathfrak{M}}$ by definition, which in turn entails $\mathfrak{M} \not\models (\neg Rt_1,\dots,t_n)[\Phi]$. In case that $F$ is not atomic we obtain the claim immediately from the induction hypothesis using properties (∧) through (∃).

Since $\neg G \in \delta(f)$ for all $G \in M$ we obtain from (i) $\mathfrak{M} \models G[\Phi]$ for all $G \in M$. Since $\Delta = \delta([f](0))$ we finally get $\mathfrak{M} \not\models G[\Phi]$ for all $G \in \Delta$.

We can now prove Theorem 4.4.1. Let $M$ be countable and $\Delta$ finite and assume $\not\vdash_{\mathsf{T}}^{m} \neg\Gamma, \Delta$ for all $m \in \omega$ and all finite subsets $\Gamma \subseteq M$. Then $S^M_\Delta$ is not well-founded by the Syntactical Main Lemma. By the Semantical Main Lemma we therefore find a model $\mathfrak{M}$ and an $\mathfrak{M}$-assignment $\Phi$ such that $\mathfrak{M} \models M[\Phi]$ but $\mathfrak{M} \not\models \bigvee\Delta[\Phi]$. Hence $M \not\models_{\mathcal{L}} \bigvee\Delta$.

The next theorem is the combination of the Soundness and the Completeness Theorem.

4.4.6 Theorem Let $M$ be a countable set of $\mathcal{L}_T$-formulas. Then we have $M \models_{\mathcal{L}} F$ if and only if there is a finite subset $\Gamma \subseteq M$ and an $m \in \omega$ such that $\vdash_{\mathsf{T}}^{m} \neg\Gamma, F$.
4.4.7 Corollary Let $F$ be a first-order formula. Then $\models F$ if and only if there is an $m \in \omega$ such that $\vdash_{\mathsf{T}}^{m} F$.

Proof Let $M = \emptyset$ and $\Delta = \{F\}$ in Theorem 4.4.6.
By a theory we understand a set of sentences. We call a theory $T$ also an axiom system if $A \in T$ is decidable. A formula $F$ is provable from a theory $T$ if $T \models_{\mathcal{L}} F$. We obtain from Theorem 4.4.6 that a formula is provable from a first-order theory $T$ if and only if there are finitely many sentences $A_1,\dots,A_n$ in $T$ such that $\vdash_{\mathsf{T}}^{m} \neg A_1,\dots,\neg A_n, F$ for some $m < \omega$.

4.4.8 Definition For a first-order theory $T$, we define
$$T \vdash_{\mathsf{T}} F :\Leftrightarrow \text{there is an } m \text{ and there are sentences } A_1,\dots,A_n \text{ in } T \text{ such that } \vdash_{\mathsf{T}}^{m} \neg A_1,\dots,\neg A_n, F. \qquad (4.4)$$
(Remark: Don't confuse the $T$ standing for the theory $T$ with the subscript $\mathsf{T}$ standing for "Tait".)
4.4.9 Remark We have introduced the notion of logical consequence for pure predicate logic without identity. However, we will always assume that a language contains the binary identity symbol =, whose intended meaning is equality. Therefore we assume that any theory, formulated in a language $\mathcal{L} = \mathcal{L}(\mathcal{C}, \mathcal{F}, \mathcal{R})$, contains the defining axioms for =, i.e., the identity axioms IDEN, which are
• $(\forall x)[x = x]$
• $(\forall x)(\forall y)[x = y \to y = x]$
• $(\forall x)(\forall y)(\forall z)[x = y \wedge y = z \to x = z]$
• $(\forall x_1)\dots(\forall x_n)(\forall y_1)\dots(\forall y_n)\big[\bigwedge_{i=1}^{n} x_i = y_i \to f(x_1,\dots,x_n) = f(y_1,\dots,y_n)\big]$ for all $f \in \mathcal{F}$
and
• $(\forall x_1)\dots(\forall x_n)(\forall y_1)\dots(\forall y_n)\big[\bigwedge_{i=1}^{n} x_i = y_i \to (Rx_1,\dots,x_n) \to (Ry_1,\dots,y_n)\big]$ for all $R \in \mathcal{R}$.

Sometimes we also write $T \vdash_{\mathsf{T}}^{m} \Delta$ to denote that there are finitely many axioms $A_1,\dots,A_n$ of $T$ (including the identity axioms) such that $\vdash_{\mathsf{T}}^{m} \neg A_1,\dots,\neg A_n, \Delta$.
As another corollary of the Completeness Theorem we obtain the Compactness Theorem for countable sets $M$.

4.4.10 Corollary (Compactness Theorem) Let $M$ be a countable set of $\mathcal{L}$-formulas. If $M \models_{\mathcal{L}} F$ then there is already a finite subset $N \subseteq M$ such that $N \models_{\mathcal{L}} F$.

Proof If $M \models_{\mathcal{L}} F$ we get $\vdash_{\mathsf{T}} \neg N, F$ for a finite subset $N \subseteq M$ by Theorem 4.4.6. Then $\models \bigvee\neg N \vee F$ by the Soundness Theorem (Theorem 4.3.3). Using the Deduction Theorem (Theorem 4.2.10) we finally obtain $N \models_{\mathcal{L}} F$.

For a more familiar formulation of the Compactness Theorem we go back to Theorem 4.2.9 to obtain:

The theory $T \cup \{\neg F\}$ has no model ⇔ $T \models_{\mathcal{L}} F$ ⇔ $N \models_{\mathcal{L}} F$ for some finite set $N \subseteq T$ ⇔ there is a finite subset $N \subseteq T$ such that $N \cup \{\neg F\}$ has no model. (4.5)

Choosing $F$ as a false sentence, so that $\neg F$ holds in every structure, and taking contrapositions in (4.5), we obtain the Compactness Theorem in its more familiar formulation.

4.4.11 Theorem (Compactness Theorem) A countable theory $T$ possesses a model if and only if every finite subset of $T$ possesses a model.

We just want to remark that the Compactness Theorem is also true for uncountable theories. This needs, however, a modified proof which we omitted since only countable (even recursive) theories are proof-theoretically interesting.

4.4.12 Exercise Let $M$ be a countable set of formulas. Extend the definition of the TAIT-calculus to a calculus $M \vdash_{\mathsf{T}}^{m} \Delta$ by replacing $\vdash_{\mathsf{T}}^{m} \Delta$ everywhere in Definition 4.3.2 by $M \vdash_{\mathsf{T}}^{m} \Delta$ and adding a theory rule

(MR) If $M \vdash_{\mathsf{T}}^{m_0} \neg A, \Delta$ and $A \in M$ then $M \vdash_{\mathsf{T}}^{m} \Delta$ for all $m \geq m_0$.

Show that this definition coincides with the definition given in (4.4).

4.4.13 Exercise Let $T \subseteq \mathrm{Seq}$ be a tree. The tree-ordering $\prec_T$ is defined on $T$ by
$$s \prec_T t :\Leftrightarrow (\exists u \in \mathrm{Seq})[u \neq \langle\rangle \wedge t^\frown u = s].$$
Prove:
(a) The tree $T$ is well-founded iff the tree ordering $\prec_T$ is well-founded.
For a well-founded tree $T$ define $\mathrm{depth}(T)$ as the least ordinal $\alpha$ such that there is a function $f : T \to \alpha + 1$ such that $(\forall s, t \in T)[s \prec_T t \Rightarrow f(s) < f(t)]$.
(b) Show $\mathrm{depth}(T) = \mathrm{otyp}(T)$. Hint: Show first $\mathrm{depth}(T) = \sup\{\mathrm{depth}(T_{\langle a\rangle}) + 1 \mid \langle a\rangle \in T\}$.
The Kleene–Brouwer-ordering on a tree T is defined by, s
Show that $\mathrm{otyp}(T) < \omega_1^{CK}$ for every primitive recursive tree $T$.
Hint: Use the Kleene–Brouwer-ordering.
4.4.14 Exercise A hydra is a finite tree $H$ (cf. Definition 3.3.5). The heads of the hydra are the leaves of $H$ different from $\langle\rangle$. A hydra without heads is dead. A battle between Hercules and the hydra $H$ runs as follows. Hercules consecutively chops off one of the heads $s$ of the hydra. If the neck of $s$ was sufficiently long the hydra $H$ regrows $m$ many new necks, each of them carrying the remaining number of heads, i.e., it turns into the hydra $H(s, m)$ according to the following rules:
• If $\mathrm{lh}(s) = 1$ then $H(s, m) := H \setminus \{s\}$.
• If $\mathrm{lh}(s) = n + 1 > 1$ then
$$H(s, m) := H \setminus \{s\} \cup \{\langle (s)_0,\dots,(s)_{n-2}, l\rangle^\frown b \mid b \in H_{\langle (s)_0,\dots,(s)_{n-1}\rangle} \setminus \{\langle (s)_n\rangle\} \wedge k < l \leq k + m\},$$
where $k := \max\{i \mid \langle (s)_0,\dots,(s)_{n-2}, i\rangle \in H\}$.

[Fig. 4.2 A hydra, with one head marked $s$.]

A run in the battle between Hercules and a hydra $H$ is a sequence $\{H_i \mid i \in \omega\}$ such that
• $H_0 = H$,
• $H_{i+1} = H_i(s_i, m_i)$ for some head $s_i \in H_i$ and some $m_i \in \mathbb{N}$ if $H_i$ is alive, and $H_{i+1} = \{\langle\rangle\}$ if $H_i$ is dead.
Hercules wins the battle if the hydra eventually dies.
(a) Let $H$ be the hydra in Fig. 4.2. Draw a picture of $H(s, 2)$.
(b) Show that Hercules wins any battle against any hydra. Hint: Assign an ordinal notation $o(H) < \varepsilon_0$ from Sect. 3.3.1 to any hydra $H$ and show that $o(H(s, m)) < o(H)$ holds true for any hydra $H$. Use the symmetric sum (cf. Exercise 3.3.14) instead of the ordinary ordinal sum.
4.5 GENTZEN's Hauptsatz for Pure First-Order Logic

One of the consequences of the completeness theorem for the first-order TAIT-calculus is the admissibility of the cut rule.

4.5.1 Theorem (Weak GENTZEN's Hauptsatz) If $\vdash_{\mathsf{T}}^{m} \Delta, F$ and $\vdash_{\mathsf{T}}^{n} \Delta, \neg F$ then there is a $k$ such that $\vdash_{\mathsf{T}}^{k} \Delta$.

The proof is simple. From $\vdash_{\mathsf{T}}^{m} \Delta, F$ and $\vdash_{\mathsf{T}}^{n} \Delta, \neg F$ we get by the soundness of the calculus that $\bigvee\Delta \vee F$ as well as $\bigvee\Delta \vee \neg F$ are valid in any model with any assignment. But then $\bigvee\Delta$ has to be valid in all models with all assignments. Hence $\vdash_{\mathsf{T}}^{k} \bigvee\Delta$ for some $k$ by the completeness of the calculus. It is an easy exercise to show by induction on $k$ that this entails $\vdash_{\mathsf{T}}^{k} \Delta$.
Theorem 4.5.1 gives no information about the size of $k$. This information, however, might be crucial. To compute $k$ we define the rank $\mathrm{rnk}(F)$ of a formula $F$ as the number of logical symbols occurring in $F$ and augment the clauses in Definition 4.3.2 by an additional rule, the cut rule

(Cut) If $\mathrm{rnk}(F) < r$, $\vdash_{r}^{m} \Delta, F$ and $\vdash_{r}^{m} \Delta, \neg F$ then $\vdash_{r}^{n} \Delta$ for all $n > m$.

To make this definition correct we have to replace $\vdash_{\mathsf{T}}^{m} \Delta, \dots$ in all clauses in Definition 4.3.2 by $\vdash_{r}^{m} \Delta, \dots$. The subscript $r$ is thus a measure for the complexity of all cut formulas occurring in the derivation. Obviously we have $\vdash_{\mathsf{T}}^{m} \Delta \Leftrightarrow \vdash_{0}^{m} \Delta$.

We can now formulate GENTZEN's Hauptsatz in a version which gives information about the size of the cut-free derivation.

4.5.2 Theorem (GENTZEN's Hauptsatz) If $\vdash_{r}^{m} \Delta$ then $\vdash_{0}^{2_r(m)} \Delta$, where $2_r(x)$ is defined by $2_0(x) = x$ and $2_{n+1}(x) = 2^{2_n(x)}$.
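The two numerical ingredients of the Hauptsatz, the rank of a cut formula and the bound $2_r(m)$, are easy to compute, which makes the growth of the bound tangible. The sketch below is only an illustration; the tuple encoding of formulas and the names `rnk` and `two` are assumptions of this example. Literals count as rank 0 here because in the TAIT-language $\neg R$ is a relation symbol and not a logical symbol.

```python
def rnk(f):
    """Number of logical symbols (and, or, all, ex) occurring in a formula.

    Hypothetical encoding: ("atom", ...) and ("natom", ...) are literals,
    ("and", A, B), ("or", A, B), ("all", x, A), ("ex", x, A) are compound.
    """
    tag = f[0]
    if tag in ("atom", "natom"):
        return 0
    if tag in ("and", "or"):
        return 1 + rnk(f[1]) + rnk(f[2])
    if tag in ("all", "ex"):
        return 1 + rnk(f[2])
    raise ValueError(f"unknown formula tag: {tag!r}")


def two(r, x):
    """The iterated exponential: two(0, x) = x, two(n + 1, x) = 2 ** two(n, x)."""
    for _ in range(r):
        x = 2 ** x
    return x


# A derivation of length 3 with cuts of rank < 2 is bounded by 2_2(3) = 2**(2**3).
print(two(2, 3))
print(rnk(("all", "x", ("or", ("atom", "R", ("x",)), ("natom", "R", ("x",))))))
```

Each additional unit of cut rank costs one more exponential, so already small values of $r$ lead to bounds far beyond what could be written out as an explicit derivation.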
We need GENTZEN's Hauptsatz for pure logic, in fact only in its weak version (Theorem 4.5.1). We will, however, prove modifications of the strong version (Theorem 4.5.2) for different infinitary systems. Therefore we leave the proof of GENTZEN's Hauptsatz as an exercise which should be solved after having seen the cut-elimination for the semi-formal calculus which we are going to introduce in Definition 7.3.5.

4.5.3 Exercise Prove Theorem 4.5.2.

For the following exercises assume that the first-order language $\mathcal{L}$ contains at least one unary predicate symbol $C$ and a distinguished constant $c$. Put $\bot :\Leftrightarrow Cc \wedge \neg Cc$. Then $S \not\models \bot$ holds in all $\mathcal{L}$-structures $S$. Therefore $\bot$ represents the false formula. Dually $\top :\Leftrightarrow \sim\bot$ stands for the true sentence.

4.5.4 Exercise (a) Show that $\vdash_{0}^{2} \Delta$ holds for all finite sets $\Delta$ of $\mathcal{L}$-formulas containing $\top$.
(b) Show that $\vdash_{n}^{m} \Delta, \bot$ implies that there is a natural number $k$ such that $\vdash_{0}^{k} \Delta$. Compute an upper bound for $k$.
(c) Conclude that $\bot \models_{\mathcal{L}} F$ holds for all $\mathcal{L}$-formulas $F$. This rule is known as "ex falso quodlibet".

4.5.5 Exercise Let $\Delta$ and $\Gamma$ be finite sets of formulas such that $\vdash_{r}^{n} \Delta, \Gamma$. Show that there is a formula $F$ with $\mathrm{FV}(F) \subseteq \mathrm{FV}(\Delta) \cap \mathrm{FV}(\Gamma)$ containing only those relation variables and relation constants, with the possible exception of $C$ and $\neg C$, which occur in all formulas of $\Gamma$ and in all formulas of $\neg\Delta := \{\sim A \mid A \in \Delta\}$, and that there are natural numbers $m_0$ and $m_1$ such that $\vdash_{0}^{m_0} \Delta, F$ and $\vdash_{0}^{m_1} \neg F, \Gamma$. Hint: Use GENTZEN's Hauptsatz.
4.5.6 Exercise (Craig’s interpolation theorem). An interpolation formula for two L -formulas A and B is a formula F satisfying the following conditions. • FV(F) ⊆ FV(A) ∩ FV(B) and FV2 (F) ⊆ FV2 (A) ∩ FV2 (B). • All relation constants occurring in F occur in A as well as in B. • |= A → F and |= F → B hold true. Assume |= A → B. Show that there is an interpolation formula for A and B. Hint: Use the Completeness Theorem and Exercise 4.5.5.
4.5.7 Exercise (Interpolation theorem of Craig and Lyndon) We say that a relation symbol R or a relation variable X occurs positively in a formula F if R or X, respectively, occurs in F T . We say that R or X occurs negatively in F if ¬R or ¬X, respectively, occurs in F T . Strengthen Craig’s interpolation theorem (Exercise 4.5.6) in so far that the interpolation formula only contains positive (negative) occurrences of relation variables and -predicates which occur positively (negatively) as well in A as in B.
4.6 Second-Order Logic

We have already mentioned that there is no calculus for second-order logic which is complete. This is due to the fact that there is a categorical axiom system for arithmetic in second-order logic (cf. Exercise 7.1.1). Let $T$ denote this axiom system and $c$ be a new constant. Then every finite subset of $T' := T \cup \{c \neq \underline{n} \mid n \in \mathbb{N}\}$ has a model (just take $\mathbb{N}$, which is a model of $T$, and interpret the new constant $c$ by a natural number which is different from all $\underline{n}^{\mathbb{N}}$ occurring in the finite subset). If there were a compactness theorem for second-order logic we would also obtain a second-order model $\mathbb{N}' \models T'$ which cannot be isomorphic to $\mathbb{N}$ because $c^{\mathbb{N}'} \neq \underline{n}^{\mathbb{N}'}$ for all $n \in \mathbb{N}$. As we have just seen, a completeness theorem entails a compactness theorem. Therefore there cannot be a completeness theorem for second-order logic.

Nevertheless there are sound calculi for second-order logic. We obtain such a calculus if we extend Definition 4.3.2 by the rules for second-order quantifiers, which are

(∃²) If $\vdash_{\mathsf{T}}^{m_0} \Delta, A_U(V)$, then $\vdash_{\mathsf{T}}^{m} \Delta, (\exists X)A_U(X)$ for all $m > m_0$

and

(∀²) If $\vdash_{\mathsf{T}}^{m_0} \Delta, A(U)$ and $U$ not free in $\Delta$, then $\vdash_{\mathsf{T}}^{m} \Delta, (\forall X)A_U(X)$ for all $m > m_0$.

We often write $\vdash_{\mathsf{T}}^{2,m} \Delta$ to emphasize that we talk about a derivation in the sense of second-order logic. We transfer the definition of $T \vdash_{\mathsf{T}} \Delta$ given in (4.4) also to theories $T$ in second-order language.

4.6.1 Theorem Let $\vdash_{\mathsf{T}}^{2,m} \Delta$. Then $\models \bigvee\Delta$ in the sense of second-order logic.

Proof The proof is by induction on $m$ and continues the proof of the soundness theorem for first-order logic (Theorem 4.3.3). The new additional cases are inferences according to (∀²) and (∃²). We start with the case of an inference (∀²). Let $\mathfrak{M}$ be a structure and $\Phi$ an assignment. By induction hypothesis we have $\mathfrak{M} \models (\bigvee\Delta \vee A(U))[\Psi]$ for all assignments $\Psi$. This is especially true for all assignments $\Psi \sim_U \Phi$. Because of the variable condition $U \notin \mathrm{FV}_2(\Delta)$ we obtain $\mathfrak{M} \models \bigvee\Delta[\Phi] \Leftrightarrow \mathfrak{M} \models \bigvee\Delta[\Psi]$. If $\mathfrak{M} \not\models \bigvee\Delta[\Phi]$ we therefore obtain $\mathfrak{M} \not\models \bigvee\Delta[\Psi]$ and thus $\mathfrak{M} \models A(U)[\Psi]$ for all assignments $\Psi \sim_U \Phi$. Hence $\mathfrak{M} \models (\forall X)A_U(X)[\Phi]$.

In case of an inference (∃²) assume $\mathfrak{M} \not\models \bigvee\Delta[\Phi]$. By the induction hypothesis we then obtain $\mathfrak{M} \models A_U(V)[\Phi]$. Now define
$$\Psi(W) := \begin{cases} \Phi(W) & \text{for } W \neq U \\ \Phi(V) & \text{for } W = U. \end{cases}$$
Then $\Psi \sim_U \Phi$ and $\mathfrak{M} \models A(U)[\Psi]$. Hence $\mathfrak{M} \models (\exists X)A_U(X)[\Phi]$.
4.6.2 Note We introduced only a very weak form of a second-order calculus. Commonly much stronger calculi are regarded. They differ in so far that class terms become well-formed expressions of the second-order language. Whenever $F(u_1,\dots,u_n)$ is a second-order formula the expression $\{(x_1,\dots,x_n) \mid F_{u_1,\dots,u_n}(x_1,\dots,x_n)\}$ is a second-order term. The defining axiom for second-order terms then is
$$(t_1,\dots,t_n) \in \{(x_1,\dots,x_n) \mid F_{u_1,\dots,u_n}(x_1,\dots,x_n)\} \leftrightarrow F_{u_1,\dots,u_n}(t_1,\dots,t_n).$$
So far there is no real difference to our convention to use class terms as "abbreviations". The difference comes by extending the (∃²) rule to

(∃²) If $\vdash_{SO}^{m_0} \Delta, A_U(S)$ for a second-order term $S$, then $\vdash_{SO}^{m} \Delta, (\exists X)A_U(X)$ for all $m > m_0$.

Adapting the remaining clauses in Definition 4.3.2 and clause (∀²) above we get the calculus $\vdash_{SO} \Delta$ for simple second-order logic. By rule (∃²) we obtain full comprehension

(CA) $(\exists X)(\forall x_1)\dots(\forall x_n)[(x_1,\dots,x_n) \in X \leftrightarrow F_{u_1,\dots,u_n}(x_1,\dots,x_n)]$

for arbitrary formulas $F(u_1,\dots,u_n)$ in simple second-order logic, which shows that already a bit of mathematics sneaks into pure logic. Commonly the calculus is also equipped with a cut rule

(cut) If $\vdash_{SO} \Delta, F$ and $\vdash_{SO} \Gamma, \neg F$ then $\vdash_{SO} \Delta, \Gamma$.

It has been shown by Takeuti [101] that the eliminability of the cut rule in simple second-order logic already implies the consistency of full second-order arithmetic (cf. Sect. 1). Therefore the proof of the Hauptsatz for the calculus $\vdash_{SO}$ cannot be as simple as for the weak second-order calculus $\vdash_{\mathsf{T}}^{2}$ introduced above (cf. Exercise 4.6.4). The first proof of the Hauptsatz for a calculus similar to $\vdash_{SO}$ has been given by Tait in [97].

4.6.3 Exercise Prove the soundness of the simple second-order logic, i.e., show
$$\vdash_{SO} \Delta \Rightarrow S \models \bigvee\Delta[\Phi]$$
for any structure $S$ and any $S$-assignment $\Phi$.

4.6.4 Exercise Extend the definition of $\vdash_{r}^{n} \Delta$ to weak second-order logic, i.e., second-order logic without second-order terms. Prove
$$\vdash_{r}^{m} \Delta \Rightarrow \vdash_{0}^{2_r(m)} \Delta.$$
Hint: Postpone this exercise until you have seen the cut elimination procedure for the infinitary calculus in Sect. 7.3.
Chapter 5
Truth Complexity for $\Pi^1_1$-Sentences

We will now turn to the language of arithmetic. The first-order sentences of the language of arithmetic can be arranged in a hierarchy, the arithmetical hierarchy. Arranging the second-order sentences in the language of arithmetic into a hierarchy leads to the analytical hierarchy, which continues the arithmetical hierarchy. Of special interest in proof theory are the sentences at the $\Pi^1_1$-level of the analytical hierarchy.

5.1 The Language $\mathcal{L}(\mathrm{NT})$ of Arithmetic

To fix the language $\mathcal{L}(\mathrm{NT}) := \mathcal{L}(\mathcal{C}, \mathcal{F}, \mathcal{R})$ of arithmetic it suffices to define its signature. We put:
• $\mathcal{C} := \{0\}$.
• $\mathcal{F} := \{f \mid f \text{ is a symbol for a primitive recursive function term}\}$. The arity $\#f$ is the arity of the denoted primitive recursive function term.
• $\mathcal{R} := \{=\}$. The arity of = is 2.
5.1.1 Remark Since we have a primitive recursive coding machinery we lose no expressive power by restricting the relation variables to unary ones. Any atomic formula $(t_1, \ldots, t_n) \mathrel{\varepsilon} U^n$ for an n-ary relation variable $U^n$ can be expressed as $\langle t_1, \ldots, t_n \rangle \mathrel{\varepsilon} U^1$ with a unary "set variable" $U^1$ and vice versa. Sometimes it is more convenient to allow only set variables; sometimes it is more convenient to work with n-ary relation variables. We will switch between both concepts without always emphasizing it. This, however, means no loss of generality.

We call $\mathcal{L}(\mathrm{NT})$-terms also arithmetical terms. An $\mathcal{L}(\mathrm{NT})$-formula F is called arithmetical if $FV_2(F) = BV_2(F) = \emptyset$. The first-order formulas of $\mathcal{L}(\mathrm{NT})$, i.e., the formulas that also allow occurrences of free relation variables, are sometimes called bold face arithmetical formulas. They will play an important role in the ordinal analysis of axiom systems for arithmetic.

W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, © Springer-Verlag Berlin Heidelberg 2009

We are interested in the structure of natural numbers which form the standard model of $\mathcal{L}(\mathrm{NT})$. In abuse of notation we put $\mathbb{N} := (\mathbb{N}, \cdot^{\mathbb{N}})$, i.e., we denote the structure and its domain by the same symbol. We define
• $0^{\mathbb{N}} := 0$.
• $f^{\mathbb{N}}$ is the function represented by f, i.e., $f^{\mathbb{N}}(z_1, \ldots, z_n) := ev(f, z_1, \ldots, z_n)$ according to Definition 2.1.2.
• $z_1 =^{\mathbb{N}} z_2 :\Leftrightarrow z_1 = z_2$, i.e., $=^{\mathbb{N}}$ is interpreted as the identity on the natural numbers.

Observe that for every natural number z there is a closed term $\underline{z}$ such that $\underline{z}^{\mathbb{N}} = z$. Just put $\underline{z} := (S \cdots (S\,0) \cdots)$ with z occurrences of S.
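The numerals and the interpretation of closed terms can be illustrated by a minimal sketch. The tuple encoding of terms and the small symbol table `FUNCTIONS` are assumptions made for this illustration only; they stand in for the evaluation function ev of Definition 2.1.2, which covers all primitive recursive function terms.

```python
# Closed terms as nested tuples: ("0",) is the constant 0 and
# (f, t1, ..., tn) applies the function symbol f to the terms t1, ..., tn.
# FUNCTIONS is a hypothetical stand-in for ev(f, ...); only three symbols are listed.
FUNCTIONS = {
    "S": lambda x: x + 1,
    "+": lambda x, y: x + y,
    "*": lambda x, y: x * y,
}

ZERO = ("0",)


def numeral(z):
    """Return the closed term (S ... (S 0) ...) with z occurrences of S."""
    term = ZERO
    for _ in range(z):
        term = ("S", term)
    return term


def evaluate(term):
    """Compute the value of a closed term in the standard model."""
    if term == ZERO:
        return 0
    symbol, *args = term
    return FUNCTIONS[symbol](*(evaluate(arg) for arg in args))
```

Under these assumptions `evaluate(numeral(z))` returns `z` for every natural number `z`, which is the observation made above.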
5.1.2 Lemma Let t be an arithmetical term and $\Phi$ an $\mathbb{N}$-assignment such that $\Phi(v) = z$. Then $t^{\mathbb{N}}[\Phi] = t_v(\underline{z})^{\mathbb{N}}[\Phi]$.

Proof We induct on the length of the term t. If t = 0 there is nothing to show. If t is a free variable different from v we get $t_v(\underline{z}) = t$ and are done. If t is the free variable v we get $t^{\mathbb{N}}[\Phi] = \Phi(v) = z = t_v(\underline{z})^{\mathbb{N}}$. Now assume that t is a term $(f t_1 \ldots t_n)$. Then we obtain
$t^{\mathbb{N}}[\Phi] = f^{\mathbb{N}}(t_1^{\mathbb{N}}[\Phi], \ldots, t_n^{\mathbb{N}}[\Phi]) = f^{\mathbb{N}}(t_{1,v}(\underline{z})^{\mathbb{N}}[\Phi], \ldots, t_{n,v}(\underline{z})^{\mathbb{N}}[\Phi]) = (f t_1 \ldots t_n)_v(\underline{z})^{\mathbb{N}}[\Phi] = t_v(\underline{z})^{\mathbb{N}}[\Phi]$
using the induction hypothesis.

5.1.3 Lemma Let F be an $\mathcal{L}(\mathrm{NT})$-formula and $\Phi$ an $\mathbb{N}$-assignment such that $\Phi(v) = z$. Then $\mathbb{N} \models F[\Phi]$ if and only if $\mathbb{N} \models F_v(\underline{z})[\Phi]$.

Proof Induction is on the length of the formula F. The claim is obvious if $v \notin FV(F)$. If F is an atomic formula $s = t$ or $(t_1, \ldots, t_n) \mathrel{\varepsilon} U$ we obtain $\mathbb{N} \models F[\Phi]$ iff $s^{\mathbb{N}}[\Phi] = t^{\mathbb{N}}[\Phi]$ or $(t_1^{\mathbb{N}}[\Phi], \ldots, t_n^{\mathbb{N}}[\Phi]) \in \Phi(U)$, respectively. By the previous lemma (Lemma 5.1.2) we obtain either $\mathbb{N} \models F_v(\underline{z})[\Phi]$ iff $t_v(\underline{z})^{\mathbb{N}}[\Phi] = s_v(\underline{z})^{\mathbb{N}}[\Phi]$ iff $t^{\mathbb{N}}[\Phi] = s^{\mathbb{N}}[\Phi]$, or $\mathbb{N} \models F_v(\underline{z})[\Phi]$ iff $(t_{1,v}(\underline{z})^{\mathbb{N}}[\Phi], \ldots, t_{n,v}(\underline{z})^{\mathbb{N}}[\Phi]) \in \Phi(U)$ iff $(t_1^{\mathbb{N}}[\Phi], \ldots, t_n^{\mathbb{N}}[\Phi]) \in \Phi(U)$, respectively. In case that F is a composed formula we get the claim immediately from the induction hypothesis.

The next theorem shows that the somewhat weird looking definition of the satisfaction relation in case of first-order quantification matches in fact the intuitive meaning of the universal and existential quantifiers.

5.1.4 Theorem Let F(v) be an $\mathcal{L}(\mathrm{NT})$-formula. Then
$\mathbb{N} \models (\forall x)F_v(x)[\Phi] \Leftrightarrow \mathbb{N} \models F_v(\underline{z})[\Phi]$ for all $z \in \mathbb{N}$
and
$\mathbb{N} \models (\exists x)F_v(x)[\Phi] \Leftrightarrow \mathbb{N} \models F_v(\underline{z})[\Phi]$ for some $z \in \mathbb{N}$.

Proof By Definition 4.2.7 we have $\mathbb{N} \models (\forall x)F_v(x)[\Phi]$ if and only if $\mathbb{N} \models F[\Psi]$ for all assignments $\Psi \sim_v \Phi$. Let $z \in \mathbb{N}$ and define $\Psi \sim_v \Phi$ such that $\Psi(v) = z$. Then we obtain by Lemma 5.1.3 $\mathbb{N} \models F[\Psi] \Leftrightarrow \mathbb{N} \models F_v(\underline{z})[\Psi]$, which by (4.2) is equivalent to $\mathbb{N} \models F_v(\underline{z})[\Phi]$ because $v \notin FV(F_v(\underline{z}))$. This proves the direction from left to right. For the opposite direction let $\Phi \sim_v \Psi$ and put $z := \Psi(v)$. By hypothesis we have $\mathbb{N} \models F_v(\underline{z})[\Phi]$, which by (4.2) is equivalent to $\mathbb{N} \models F_v(\underline{z})[\Psi]$. Again by Lemma 5.1.3 this entails $\mathbb{N} \models F[\Psi]$. The proof for the existential quantifier follows the same pattern and is left as an exercise.

We just want to remark that an analogous theorem holds also for second-order quantifiers if we enrich the language by constants for subsets of $\mathbb{N}^n$.

A subset $A \subseteq \mathbb{N}$ is $\mathcal{L}(\mathrm{NT})$-definable if there are an $\mathcal{L}(\mathrm{NT})$-formula $F(u, v_1, \ldots, v_n)$ and natural numbers $z_1, \ldots, z_n$ such that $A = \{x \in \mathbb{N} \mid \mathbb{N} \models F(\underline{x}, \underline{z}_1, \ldots, \underline{z}_n)\}$. The definable subsets of $\mathbb{N}$ can be hierarchically ordered by the complexity of their defining formulas. To do that we first introduce a syntactically defined hierarchy for $\mathcal{L}$-formulas:
• The class of $\forall^0_0$-formulas comprises the quantifier free $\mathcal{L}$-formulas. We put $\exists^0_0 := \forall^0_0$.
• If G is a $\exists^0_n$-formula then $(\exists x)G_u(x)$ is a $\exists^0_n$-formula and $(\forall x)G_u(x)$ is a $\forall^0_{n+1}$-formula.
• If G is a $\forall^0_n$-formula then $(\forall x)G_u(x)$ is a $\forall^0_n$-formula and $(\exists x)G_u(x)$ is a $\exists^0_{n+1}$-formula.
A $\exists^0_n$-formula is therefore a first-order formula in prenex form which has n alternating blocks of quantifiers starting with a block of existential quantifiers. Dually a $\forall^0_n$-formula is a first-order formula in prenex form with n alternating blocks of quantifiers starting with a block of universal quantifiers. It is a theorem of Mathematical Logic that every first-order formula is logically equivalent to a $\forall^0_n$- or $\exists^0_n$-formula. We call a formula a $\Delta^0_n$-formula if it is logically equivalent to a $\exists^0_n$-formula and simultaneously also logically equivalent to a $\forall^0_n$-formula. We extend the hierarchy to second-order formulas by adding the following clauses.
• The class of $\forall^1_0$-formulas comprises the union of all $\forall^0_n$-formulas. We put $\Delta^1_0 := \exists^1_0 := \forall^1_0$.
• If G is a $\exists^1_n$-formula then $(\exists x)G_u(x)$, $(\forall x)G_u(x)$ and $(\exists X)G_U(X)$ are $\exists^1_n$-formulas and $(\forall X)G_U(X)$ is a $\forall^1_{n+1}$-formula.
• If G is a $\forall^1_n$-formula then $(\exists x)G_u(x)$, $(\forall x)G_u(x)$ and $(\forall X)G_U(X)$ are $\forall^1_n$-formulas and $(\exists X)G_U(X)$ is a $\exists^1_{n+1}$-formula.
The hierarchy of second-order formulas follows the same pattern as the hierarchy of first-order formulas but we count only blocks of second-order quantifiers and neglect first-order quantifiers. The primitive recursively definable sets form the basis of the hierarchy of definable sets of natural numbers. Due to the richness of the language $\mathcal{L}(\mathrm{NT})$ every primitive recursively definable set is already definable by a $\forall^0_0$-formula. We put:
• $\Pi^0_0 := \Sigma^0_0 := \Delta^0_0 = \{A \subseteq \mathbb{N} \mid A \text{ is definable by an arithmetical } \forall^0_0\text{-formula}\}$
as well as
• $\Pi^0_n := \{A \subseteq \mathbb{N} \mid A \text{ is definable by an arithmetical } \forall^0_n\text{-formula}\}$
and dually
• $\Sigma^0_n := \{A \subseteq \mathbb{N} \mid A \text{ is definable by an arithmetical } \exists^0_n\text{-formula}\}$.
We put:
• $\Delta^0_n := \Sigma^0_n \cap \Pi^0_n$.

A set $A \subseteq \mathbb{N}$ is called arithmetical if there is an n such that $A \in \Delta^0_n$. The hierarchy of arithmetical sets is cumulative and strict in the sense that for all n we have
$\Sigma^0_n \cup \Pi^0_n \subsetneq \Delta^0_{n+1} \subsetneq \Sigma^0_{n+1} \cup \Pi^0_{n+1}$.
These facts are known as the Arithmetical Hierarchy Theorem, which is a theorem of abstract recursion theory. Without proof we cite that for n > 0 the levels $\Sigma^0_n$ are closed under the positive boolean operations ∧ and ∨, bounded first-order ∀-quantification and unbounded ∃-quantification. Dually the classes $\Pi^0_n$ are closed under the positive boolean operations, bounded ∃-quantification and unbounded ∀-quantification. The classes $\Delta^0_n$ are closed under the propositional operations ¬, ∧ and ∨ and bounded ∀- and ∃-quantification. It is an easy observation that all subsets of $\mathbb{N}$ that are first-order definable in the language of arithmetic are members of the arithmetical hierarchy, i.e., they are definable by an arithmetical $\exists^0_n$- or $\forall^0_n$-formula for some finite n.

The arithmetical hierarchy can be expanded into the analytical hierarchy that comprises sets of natural numbers which can be defined by second-order formulas. We put:
• $\Sigma^1_0 := \Pi^1_0 := \Delta^1_0$ is the collection of all arithmetical sets.
For n ≥ 1 we define:
• $\Pi^1_n$ is the collection of sets of natural numbers which are definable by a $\forall^1_n$-formula
and dually
• $\Sigma^1_n$ is the collection of sets of natural numbers which are definable by a $\exists^1_n$-formula.
Again we put:
• $\Delta^1_n := \Pi^1_n \cap \Sigma^1_n$.
The sets $\Delta^1_n$ form the levels of the analytical hierarchy. It is easy to see that the collection of $\Pi^1_n$-sets is closed under the positive boolean operations, first-order quantification and second-order ∀-quantification. Dually the $\Sigma^1_n$-sets are closed under the positive boolean operations, first-order quantification and second-order ∃-quantification. The class of $\Delta^1_n$-sets is closed under all boolean operations and first-order quantification. The analytical hierarchy, too, is cumulative and strict. The study of this hierarchy is the subject of descriptive set theory. In this book we are only concerned with $\Pi^1_1$-sets.

In the arithmetic language, it is common to talk about $\Sigma^i_n$-formulas or $\Pi^i_n$-formulas instead of $\exists^i_n$-formulas or $\forall^i_n$-formulas, respectively. Due to the existence of a primitive recursive coding machinery we obtain for every $\Pi^0_n$- or $\Sigma^0_n$-formula F a $\Pi^0_n$- or $\Sigma^0_n$-formula $F'$ in which there are exactly n alternating first-order quantifiers followed by a quantifier-free kernel such that $\mathbb{N} \models F \leftrightarrow F'$. Similarly, we obtain for every $\Pi^1_n$- or $\Sigma^1_n$-formula F a $\Pi^1_n$- or $\Sigma^1_n$-formula $F'$ with exactly n alternating second-order quantifiers followed by an arithmetical kernel such that $\mathbb{N} \models F \leftrightarrow F'$.

5.1.5 Exercise Let $A \subseteq \mathbb{N}$ be a $\Pi^1_1$-set. Show that there is a first-order formula F(X, x) such that $A = \{x \in \mathbb{N} \mid \mathbb{N} \models (\forall X)F(X, \underline{x})\}$. Hint: Let $z \in (X)_y :\Leftrightarrow \langle z, y \rangle \in X$. Show that $(\exists y)(\forall X)G(X, y) \leftrightarrow (\forall Z)(\exists y)G((Z)_y, y)$ holds true in $\mathbb{N}$ and use this fact to obtain a block of universal second-order quantifiers followed by a first-order formula. Then show that $(\forall X)(\forall Y)G(X, Y) \leftrightarrow (\forall Z)G((Z)_0, (Z)_1)$ is true in $\mathbb{N}$ and use that to contract the universal second-order quantifiers.
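The slices $(Z)_y$ used in the hint of Exercise 5.1.5 can be tried out on finite sets. The following is only a sketch: the text does not fix a particular coding of pairs, so the Cantor pairing function used here, and the helper names `pair`, `unpair`, `slice_of` and `join`, are assumptions of this illustration.

```python
from math import isqrt


def pair(z, y):
    """Cantor pairing, one possible primitive recursive coding of <z, y>."""
    return (z + y) * (z + y + 1) // 2 + y


def unpair(code):
    """Inverse of pair: recover (z, y) from the code of <z, y>."""
    w = (isqrt(8 * code + 1) - 1) // 2   # w = z + y
    y = code - w * (w + 1) // 2
    return w - y, y


def slice_of(Z, y):
    """(Z)_y = {z | <z, y> in Z} for a finite set Z of codes."""
    return {z for z, second in map(unpair, Z) if second == y}


def join(X, Y):
    """A set Z with (Z)_0 = X and (Z)_1 = Y, as used to contract two quantifiers."""
    return {pair(x, 0) for x in X} | {pair(v, 1) for v in Y}
```

For finite sets `X` and `Y` the set `Z = join(X, Y)` satisfies `slice_of(Z, 0) == X` and `slice_of(Z, 1) == Y`, which is the idea behind replacing $(\forall X)(\forall Y)$ by a single $(\forall Z)$.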
5.2 The TAIT-Language for Second-Order Arithmetic

We adapt the definition of the TAIT-language of Sect. 4.3 to the language $\mathcal{L}(\mathrm{NT})$.

5.2.1 Definition The TAIT-language for arithmetic contains the following symbols:
• Bound number variables $x, y, z, x_0, \ldots$.
• Set variables $X, Y, X_0, \ldots$.
• The logical symbols ∧, ∨, ∀, ∃.
• The binary relation symbols $\varepsilon$, $\not\varepsilon$, $=$, $\neq$.
• The constant 0.
• Symbols for all primitive recursive functions.
Terms are constructed from the constant 0 using function symbols in the usual way. Atomic formulas are equations $s = t$, inequalities $s \neq t$ between terms and the formulas $t \mathrel{\varepsilon} X$ and $t \mathrel{\not\varepsilon} X$ where t is a term and X a set variable. Formulas are obtained from atomic formulas closing them under the logical operations. The definition of ∼F specializes as follows:
• $\sim(s = t) :\equiv s \neq t$; $\sim(s \neq t) :\equiv s = t$;
• $\sim(s \mathrel{\varepsilon} X) :\equiv s \mathrel{\not\varepsilon} X$; $\sim(s \mathrel{\not\varepsilon} X) :\equiv s \mathrel{\varepsilon} X$;
• $\sim(A \wedge B) :\equiv {\sim}A \vee {\sim}B$; $\sim(A \vee B) :\equiv {\sim}A \wedge {\sim}B$;
• $\sim(\forall x)F(x) :\equiv (\exists x){\sim}F(x)$; $\sim(\exists x)F(x) :\equiv (\forall x){\sim}F(x)$.

For any assignment $\Phi$ of subsets of $\mathbb{N}$ to the set variables occurring in F we again obtain
$\mathbb{N} \models {\sim}F[\Phi] \Leftrightarrow \mathbb{N} \models \neg F[\Phi]$. (5.1)
Therefore, we commonly write ¬F instead of ∼F.

Observe that we do not allow free number variables in the formulas of the TAIT-language of $\mathcal{L}(\mathrm{NT})$. Nevertheless, any formula in the language of arithmetic can be canonically translated into its Tait-form. This is done by replacing ¬ by ∼. However, we have to replace free number variables in an $\mathcal{L}(\mathrm{NT})$-formula by closed number terms to obtain a well-formed formula of the TAIT-language in the sense of Definition 5.2.1. This could be avoided by also allowing free number variables in the TAIT-language. Dispensing with free number variables in the TAIT-language of $\mathcal{L}(\mathrm{NT})$ has, however, technical advantages.
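The clauses defining ∼F amount to a recursive dualization of the formula. The following sketch spells this out; the tuple encoding with the tags "eq", "neq", "in", "notin", "and", "or", "all" and "ex" is a hypothetical representation chosen only for this illustration.

```python
# Hypothetical encoding: ("eq", s, t), ("neq", s, t), ("in", t, X), ("notin", t, X),
# ("and", A, B), ("or", A, B), ("all", x, F), ("ex", x, F).
DUAL = {
    "eq": "neq", "neq": "eq",
    "in": "notin", "notin": "in",
    "and": "or", "or": "and",
    "all": "ex", "ex": "all",
}


def tait_negation(formula):
    """Return ~F by swapping every symbol with its dual."""
    tag = formula[0]
    if tag in ("eq", "neq", "in", "notin"):
        # Atomic formulas: only the relation symbol changes.
        return (DUAL[tag],) + formula[1:]
    if tag in ("and", "or"):
        return (DUAL[tag], tait_negation(formula[1]), tait_negation(formula[2]))
    if tag in ("all", "ex"):
        # The bound variable is kept; the kernel is negated.
        return (DUAL[tag], formula[1], tait_negation(formula[2]))
    raise ValueError(f"unknown formula tag: {tag!r}")
```

Applying `tait_negation` twice returns the original formula, in line with the fact that ∼∼F and F are syntactically identical.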
5.3 Truth Complexities for Arithmetical Sentences

As a heuristic preparation we study first the truth complexities of arithmetical sentences, i.e., of formulas in the first-order TAIT-language of arithmetic that must not contain free set variables. Let Diag($\mathbb{N}$) be the diagram of $\mathbb{N}$, i.e., the set of true atomic sentences in the Tait-language.
5.3.1 Observation The true arithmetical sentences can be characterized by the following types:
• The sentences in Diag($\mathbb{N}$),
• The sentences of the form $(F_0 \vee F_1)$ or $(\exists x)F(x)$ where $F_i$ or $F(\underline{k})$ is true for some $i \in \{0, 1\}$ or $k \in \omega$, respectively,
• The sentences of the form $(F_0 \wedge F_1)$ or $(\forall x)F(x)$ where $F_i$ or $F(\underline{k})$ is true for all $i \in \{0, 1\}$ or $k \in \omega$, respectively.
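According to Observation 5.3.1 we divide the arithmetical sentences into two types.

5.3.2 Definition Let
$\bigwedge$-type := Diag($\mathbb{N}$) ∪ {sentences of the form $(F_0 \wedge F_1)$} ∪ {sentences of the form $(\forall x)F(x)$}.
We say that the sentences in $\bigwedge$-type are of conjunctive type. Dually put
$\bigvee$-type := $\{\neg F \mid F \in \bigwedge\text{-type}\}$ = ¬Diag($\mathbb{N}$) ∪ {sentences of the form $(F_0 \vee F_1)$} ∪ {sentences of the form $(\exists x)F(x)$}.
The sentences in $\bigvee$-type are of disjunctive type. Then we define the characteristic sequence CS(F) of sub-sentences of F.

5.3.3 Definition Let
$$CS(F) := \begin{cases} \emptyset & \text{if } F \text{ is atomic} \\ \langle F_0, F_1 \rangle & \text{if } F \equiv (F_0 \circ F_1) \\ \langle G(\underline{k}) \mid k \in \omega \rangle & \text{if } F \equiv (Qx)G(x) \end{cases}$$
for $\circ \in \{\wedge, \vee\}$ and $Q \in \{\forall, \exists\}$. The length of the type of a sentence F is the length of its characteristic sequence CS(F). From Observation 5.3.1 and Definition 5.3.2 we get immediately

5.3.4 Observation
$F \in \bigwedge\text{-type} \Rightarrow [\mathbb{N} \models F \Leftrightarrow (\forall G \in CS(F))(\mathbb{N} \models G)]$
and
$F \in \bigvee\text{-type} \Rightarrow [\mathbb{N} \models F \Leftrightarrow (\exists G \in CS(F))(\mathbb{N} \models G)]$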
We use Observation 5.3.4 to define the truth complexity of a sentence F.
5.3.5 Definition The infinitary verification calculus $\models^{\alpha} F$ is inductively defined by the following clauses:

($\bigwedge$) If $F \in \bigwedge$-type and $(\forall G \in CS(F))(\exists \alpha_G < \alpha)\ \models^{\alpha_G} G$ then $\models^{\alpha} F$.

($\bigvee$) If $F \in \bigvee$-type and $(\exists G \in CS(F))(\exists \alpha_G < \alpha)\ \models^{\alpha_G} G$ then $\models^{\alpha} F$.

We read $\models^{\alpha} F$ as "F is α-verifiable" and put
$$tc(F) := \min(\{\alpha \mid \models^{\alpha} F\} \cup \{\omega_1\}).$$
We call tc(F) the truth complexity of the sentence F. The next theorem is obvious from Observation 5.3.4 and Definition 5.3.5.

5.3.6 Theorem $\models^{\alpha} F$ implies $\mathbb{N} \models F$.

5.3.7 Observation Let rnk(F) be the number of logical symbols occurring in F. Then
$\mathbb{N} \models F \Rightarrow tc(F) \le rnk(F)$
and
$\mathbb{N} \models F \Leftrightarrow tc(F) < \omega$.

Proof For the first claim we prove
$\mathbb{N} \models F \Rightarrow\ \models^{rnk(F)} F$ (i)
by induction on rnk(F). If rnk(F) = 0 then F is atomic. Hence, F ∈ Diag($\mathbb{N}$) and we obtain $\models^{0} F$ by a clause ($\bigwedge$) with empty premise. If rnk(F) > 0 and $F \in \bigwedge$-type then $\mathbb{N} \models G$ for all G ∈ CS(F). Since rnk(G) < rnk(F) we obtain $\models^{rnk(G)} G$ for all G ∈ CS(F) by induction hypothesis and infer $\models^{rnk(F)} F$ by a clause ($\bigwedge$). If $F \in \bigvee$-type then $\mathbb{N} \models G$ for some G ∈ CS(F) and $\models^{rnk(G)} G$ by induction hypothesis. But then $\models^{rnk(F)} F$ by clause ($\bigvee$). The second claim follows from the fact that rnk(F) < ω together with the first claim and Theorem 5.3.6.

According to Observation 5.3.7 the notion of truth complexity is not very exciting for arithmetical sentences. This, however, will change if we extend it to the class of formulas containing free set variables.
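For quantifier-free sentences the characteristic sequences are finite, so the least α with $\models^{\alpha} F$ can be computed directly from Definition 5.3.5. The sketch below does this under simplifying assumptions that are not part of the text: atoms are encoded as `("eq", a, b)` or `("neq", a, b)` with `a` and `b` already the values of the closed terms, compound sentences as `("and", F0, F1)` or `("or", F0, F1)`, and `None` plays the role of "not verifiable".

```python
def tc(F):
    """Least alpha such that F is alpha-verifiable, or None if F is false."""
    tag = F[0]
    if tag in ("eq", "neq"):
        is_true = (F[1] == F[2]) == (tag == "eq")
        # True atoms are of conjunctive type with empty CS(F): verifiable at 0.
        # False atoms are of disjunctive type with empty CS(F): never verifiable.
        return 0 if is_true else None
    parts = [tc(G) for G in F[1:]]
    if tag == "and":
        # Every G in CS(F) needs some alpha_G < alpha.
        if any(p is None for p in parts):
            return None
        return max(p + 1 for p in parts)
    if tag == "or":
        # Some G in CS(F) needs some alpha_G < alpha.
        verifiable = [p + 1 for p in parts if p is not None]
        return min(verifiable) if verifiable else None
    raise ValueError(f"unknown tag: {tag!r}")


def rnk(F):
    """Number of logical symbols occurring in F."""
    if F[0] in ("eq", "neq"):
        return 0
    return 1 + sum(rnk(G) for G in F[1:])
```

On this fragment `tc(F)` is a natural number exactly when F is true, and then `tc(F) <= rnk(F)`, as stated in Observation 5.3.7; the bound need not be sharp, since a disjunction is verified through its cheapest true disjunct.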
5.4 Truth Complexity for $\Pi^1_1$-Sentences

The aim of this section is to extend the notion of truth complexity to $\Pi^1_1$-sentences, the first level of the analytical hierarchy. This is made possible by the ω-completeness theorem (Theorem 5.4.9 below). The ω-completeness theorem allows for an infinitary syntactical verification calculus that is $\mathbb{N}$-complete for $\Pi^1_1$-sentences. In the form in which we are going to present this verification calculus it derives finite sets of first-order formulas of arithmetic that may contain free set parameters but must not contain free number variables.

5.4.1 Definition We call an arithmetical formula that does not contain free number variables, but may contain free set parameters, a pseudo $\Pi^1_1$-sentence. For pseudo $\Pi^1_1$-sentences $F(\vec{X})$ we define
$\mathbb{N} \models F(\vec{X}) :\Leftrightarrow \mathbb{N} \models (\forall \vec{X})F(\vec{X}) \Leftrightarrow \mathbb{N} \models F(\vec{X})[\Phi]$ for all assignments $\Phi$.

We adopt the definition of $\bigwedge$-type and $\bigvee$-type for pseudo $\Pi^1_1$-sentences. Observe, however, that open atomic pseudo sentences, i.e., sentences of the form $(t \mathrel{\varepsilon} X)$ and $(s \mathrel{\not\varepsilon} X)$, do not belong to any type. Observation 5.3.4 fails for pseudo $\Pi^1_1$-sentences. But it can be weakened to the following observation.

5.4.2 Observation For any assignment $\Phi$ to the free set variables in a pseudo $\Pi^1_1$-sentence F we obtain
$F \in \bigwedge\text{-type} \Rightarrow (\mathbb{N} \models F[\Phi] \Leftrightarrow (\forall G \in CS(F))\ \mathbb{N} \models G[\Phi])$
$F \in \bigvee\text{-type} \Rightarrow (\mathbb{N} \models F[\Phi] \Leftrightarrow (\exists G \in CS(F))\ \mathbb{N} \models G[\Phi])$
The proof is a simple induction on the rank of the formula F.

Our aim is to use Observation 5.4.2 to obtain a more syntactic verification calculus for pseudo $\Pi^1_1$-sentences. However, Observation 5.4.2 says nothing about the verification of pseudo $\Pi^1_1$-sentences that have no type. Here we observe that a pseudo $\Pi^1_1$-sentence of the form $t \mathrel{\varepsilon} X \vee s \mathrel{\not\varepsilon} X$ is verifiable if the terms t and s yield the same value. Interpreting finite sets of pseudo $\Pi^1_1$-sentences as finite disjunctions we can use these observations to define a "syntactic" verification calculus that is similar to the Tait calculus for pure logic but uses inferences with infinitely many premises. This is made precise in the following definition.

5.4.3 Definition For a finite set Δ of pseudo $\Pi^1_1$-sentences we define the verification calculus $\models^{\alpha} \Delta$ inductively by the following clauses:

(Ax) If $s^{\mathbb{N}} = t^{\mathbb{N}}$ then $\models^{\alpha} \Delta, s \mathrel{\not\varepsilon} X, t \mathrel{\varepsilon} X$ holds true for all ordinals α.

($\bigwedge$) If $F \in (\Delta \cap \bigwedge\text{-type})$ and $(\forall G \in CS(F))(\exists \alpha_G < \alpha)\ \models^{\alpha_G} \Delta, G$ then $\models^{\alpha} \Delta$.

($\bigvee$) If $F \in (\Delta \cap \bigvee\text{-type})$ and $(\exists G \in CS(F))(\exists \alpha_G < \alpha)\ \models^{\alpha_G} \Delta, G$ then $\models^{\alpha} \Delta$.
We call the formulas $s \mathrel{\not\varepsilon} X$ and $t \mathrel{\varepsilon} X$ in a clause (Ax) and the formula F in a clause ($\bigwedge$) or ($\bigvee$) the critical formula(s) of the clause. As previously remarked, the finite set Δ should be read as a finite disjunction. Again we write $F_1, \ldots, F_n$ instead of $\{F_1, \ldots, F_n\}$, Δ, Γ instead of Δ ∪ Γ and Δ, F for Δ ∪ {F}. It is a simple observation that there is a "structural rule" also for the verification calculus.

(STR) If $\models^{\alpha} \Delta$ then $\models^{\beta} \Gamma$ for all β ≥ α and all Γ ⊇ Δ.

Proof We induct on α. If $\models^{\alpha} \Delta$ holds by (Ax) then there are terms s and t such that $s^{\mathbb{N}} = t^{\mathbb{N}}$ and $\{t \mathrel{\varepsilon} X, s \mathrel{\not\varepsilon} X\} \subseteq \Delta \subseteq \Gamma$. Hence, $\models^{\beta} \Gamma$ by (Ax). In case of an inference according to ($\bigwedge$) (or ($\bigvee$)) with critical formula F we have F ∈ Δ ⊆ Γ and the premise(s) $\models^{\alpha_G} \Delta, G$ for all (or some) G ∈ CS(F). If CS(F) = ∅ then $F \in \bigwedge$-type and we obtain $\models^{\beta} \Gamma$ by an inference ($\bigwedge$). If CS(F) ≠ ∅ we obtain $\models^{\alpha_G} \Gamma, G$ for all (or some) G ∈ CS(F) by the induction hypothesis. Since $\alpha_G < \alpha \le \beta$ we obtain $\models^{\beta} \Gamma$ by an inference ($\bigwedge$) or ($\bigvee$), respectively.
5.4.4 Lemma The verification calculus $\models^{\alpha} \Delta$ is sound for pseudo $\Pi^1_1$-sentences, i.e., if we assume that all free set variables occurring in Δ are among $\vec{X} := (X_1, \ldots, X_n)$ then
$\models^{\alpha} \Delta \Rightarrow \mathbb{N} \models (\forall \vec{X})\big(\bigvee \Delta(\vec{X})\big)$.

Proof Let $\Phi$ be any assignment. We show $\mathbb{N} \models (\bigvee \Delta)[\Phi]$ by induction on α. If α = 0 we are either in the situation of a rule according to (Ax) or there is an $F \in \Delta \cap \bigwedge$-type such that CS(F) = ∅, hence F ∈ Diag($\mathbb{N}$), which makes the claim trivial. In the first case we get $\mathbb{N} \models (\bigvee \Delta)[\Phi]$ because $\{t \mathrel{\not\varepsilon} X_i, s \mathrel{\varepsilon} X_i\} \subseteq \Delta$ and $s^{\mathbb{N}} \in \Phi(X_i) \vee t^{\mathbb{N}} \notin \Phi(X_i)$ since $s^{\mathbb{N}} = t^{\mathbb{N}}$.

Now assume α > 0 and that the claim holds for all β < α. Let F be the critical formula of the last inference in $\models^{\alpha} \Delta$. Then either $F \in \Delta \cap \bigvee$-type or $F \in \Delta \cap \bigwedge$-type and CS(F) ≠ ∅, and we have the premise(s) $\models^{\alpha_G} \Delta, G$ for some or for all G ∈ CS(F), respectively. By the induction hypothesis we, therefore, get $\mathbb{N} \models (\bigvee \Delta \vee G)[\Phi]$ for some or all G ∈ CS(F). If $\mathbb{N} \models (\bigvee \Delta)[\Phi]$ we are done. Otherwise this implies $\mathbb{N} \models G[\Phi]$ for some or all G ∈ CS(F). Hence, $\mathbb{N} \models F[\Phi]$ by Observation 5.4.2 and thus also $\mathbb{N} \models (\bigvee \Delta)[\Phi]$.

The next task is to show that the verification calculus is also complete. Similar to the completeness proof for pure logic this will be done by defining search trees. Let Δ be a finite set of pseudo $\Pi^1_1$-sentences. We order the formulas in Δ arbitrarily to transform them into a finite sequence $\langle\Delta\rangle$ of pseudo $\Pi^1_1$-sentences. The leftmost formula in the sequence $\langle\Delta\rangle$ that is in $\bigwedge$-type or in $\bigvee$-type is the redex $R(\langle\Delta\rangle)$ of $\langle\Delta\rangle$. The sequence $\langle\Delta\rangle^r$ is obtained from $\langle\Delta\rangle$ by canceling its redex $R(\langle\Delta\rangle)$. We put:
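$Ax(\Delta) :\Leftrightarrow \exists s, t, X\,[s^{\mathbb{N}} = t^{\mathbb{N}} \wedge \{t \mathrel{\varepsilon} X, s \mathrel{\not\varepsilon} X\} \subseteq \Delta]$.

Two pseudo $\Pi^1_1$-sentences are numerically equivalent if they only differ in terms whose evaluations yield the same value. To simplify notations we mostly identify numerically equivalent pseudo $\Pi^1_1$-sentences.

5.4.5 Definition For a finite sequence $\langle\Delta\rangle$ of pseudo $\Pi^1_1$-sentences, we define its search tree $S^{\omega}_{\langle\Delta\rangle}$ together with a label function δ that assigns finite sequences of pseudo $\Pi^1_1$-sentences to the nodes of $S^{\omega}_{\langle\Delta\rangle}$ inductively by the following clauses:

($S_{\langle\rangle}$) Let $\langle\rangle \in S^{\omega}_{\langle\Delta\rangle}$ and $\delta(\langle\rangle) = \langle\Delta\rangle$.

For the following clauses assume $s \in S^{\omega}_{\langle\Delta\rangle}$ and ¬Ax(δ(s)).

($S_{Id}$) If δ(s) has no redex then $s^\frown\langle 0\rangle \in S^{\omega}_{\langle\Delta\rangle}$ and $\delta(s^\frown\langle 0\rangle) = \delta(s)$.

($S_{\bigwedge}$) If the redex R(δ(s)) belongs to $\bigwedge$-type then $s^\frown\langle i\rangle \in S^{\omega}_{\langle\Delta\rangle}$ for every $F_i \in CS(R(\delta(s)))$ and $\delta(s^\frown\langle i\rangle) = \delta(s)^r, F_i$.

($S_{\bigvee}$) If the redex R(δ(s)) is in $\bigvee$-type then $s^\frown\langle 0\rangle \in S^{\omega}_{\langle\Delta\rangle}$ and $\delta(s^\frown\langle 0\rangle) = \delta(s)^r, F_i, R(\delta(s))$, where $F_i$ is the first formula in CS(R(δ(s))) that is not numerically equivalent to a formula in $\bigcup_{s_0 \subseteq s} \delta(s_0)$. If no such formula exists then $\delta(s^\frown\langle 0\rangle) = \delta(s)^r, R(\delta(s))$.

5.4.6 Remark The search tree $S^{\omega}_{\langle\Delta\rangle}$ and the label function δ are defined by course-of-values recursion. Therefore, $S^{\omega}_{\langle\Delta\rangle}$ as well as δ are primitive recursive. The order type of $S^{\omega}_{\langle\Delta\rangle}$, in case that it is well-founded, is therefore an ordinal $< \omega_1^{CK}$.

5.4.7 Lemma (Syntactical Main Lemma) If $S^{\omega}_{\langle\Delta\rangle}$ is well-founded then $\models^{otyp(s)} \delta(s)$ for all $s \in S^{\omega}_{\langle\Delta\rangle}$, where δ(s) is viewed as a finite set.

Proof We induct on otyp(s). If otyp(s) = 0 then s is a leaf. By definition of $S^{\omega}_{\langle\Delta\rangle}$ the only possibilities for $s \in S^{\omega}_{\langle\Delta\rangle}$ to become a leaf are that either Ax(δ(s)) holds true or $R(\delta(s)) \in \bigwedge$-type and CS(R(δ(s))) = ∅. In the first case we get $\models^{0} \delta(s)$ according to (Ax). In the second case we obtain $\models^{0} \delta(s)$ according to ($\bigwedge$). If otyp(s) > 0 we are in the case ($S_{\bigvee}$) or ($S_{\bigwedge}$). For the redex F := R(δ(s)) we get either $F \in \bigvee$-type or $F \in \bigwedge$-type, and obtain by the induction hypothesis and the structural rule
$\models^{otyp(s^\frown\langle k\rangle)} \delta(s), G_k$ (i)
for some or all $G_k \in CS(F)$, respectively. From (i), however, we get the claim by an inference according to ($\bigvee$) or ($\bigwedge$), respectively.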
5.4.8 Lemma (Semantical Main Lemma) If $S^{\omega}_{\langle\Delta\rangle}$ is not well-founded then there is an assignment $S_1, \ldots, S_n$ to the set variables occurring in Δ such that $\mathbb{N} \not\models F[S_1, \ldots, S_n]$ for all F ∈ Δ.

Proof The proof is very similar to the proof of Lemma 4.4.5 in Sect. 4.4. Pick an infinite path f in $S^{\omega}_{\langle\Delta\rangle}$. Let $\delta(f) := \bigcup_{m \in \omega} \delta([f](m))$. Observe

$F \notin (\bigwedge\text{-type} \cup \bigvee\text{-type}) \wedge F \in \delta([f](n)) \Rightarrow (\forall m \ge n)[F \in \delta([f](m))]$ (i)

$F \in \delta([f](n)) \cap (\bigwedge\text{-type} \cup \bigvee\text{-type}) \Rightarrow (\exists m \ge n)[F = R(\delta([f](m)))]$ (ii)

$F \in \delta(f) \cap \bigwedge\text{-type} \Rightarrow (\exists G \in CS(F))[G \in \delta(f)]$ (iii)

$F \in \delta(f) \cap \bigvee\text{-type} \Rightarrow (\forall G \in CS(F))[G \in \delta(f)]$. (iv)

Recall that we identify numerically equivalent formulas. The proof of (i) is obvious because there are no rules that change or discard formulas of the form $t \mathrel{\varepsilon} X$ or $t \mathrel{\not\varepsilon} X$. The proof of (ii) is by induction on the index of F in δ([f](n)). If there are no formulas in $\bigwedge$-type ∪ $\bigvee$-type that have a smaller index in δ([f](n)) then F = R(δ([f](n))). Otherwise, the redex G := R(δ([f](n))) has a smaller index than F. But f is infinite, so CS(G) ≠ ∅. Thus $\delta([f](n+1)) = \delta([f](n))^r, G_i$ for some $G_i \in CS(G)$ and F gets a smaller index in δ([f](n+1)). By induction hypothesis there is an m such that F = R(δ([f](m))). For the proof of (iii) we may by (ii) assume that F = R(δ([f](n))). Since f is infinite, CS(F) cannot be empty. Thus, by clause ($S_{\bigwedge}$), we find a G ∈ CS(F) such that G ∈ δ([f](n+1)). To show (iv) let G ∈ CS(F) and denote by i(G) its index in the sequence CS(F). By (ii) we may again assume F = R(δ([f](n))). Then $[f](n+1) = [f](n)^\frown\langle 0\rangle$ and $\delta([f](n+1)) = \delta([f](n))^r, G_0, F$ for some $G_0 \in CS(F)$. If $i(G) \le i(G_0)$ we are done. Otherwise, we proceed by induction on $i(G) - i(G_0)$ and obtain by (ii) an m such that F = R(δ([f](m))). Then $\delta([f](m+1)) = \delta([f](m))^r, G_1, F$ for some $G_1 \in CS(F)$. Since $i(G) - i(G_1) < i(G) - i(G_0)$ the claim follows by induction hypothesis.

We define an assignment
$\Phi(X) := \{t^{\mathbb{N}} \mid (t \mathrel{\not\varepsilon} X) \in \delta(f)\}$
and show by induction on rnk(F)
$(\forall F \in \delta(f))\ \mathbb{N} \not\models F[\Phi]$. (v)

If $F \equiv (t \mathrel{\varepsilon} X) \in \delta([f](n))$ then $(t \mathrel{\varepsilon} X) \in \delta([f](m))$ for all m ≥ n by (i). This, however, implies $(s \mathrel{\not\varepsilon} X) \notin \delta(f)$ for all s such that $s^{\mathbb{N}} = t^{\mathbb{N}}$ because, otherwise, we had by (i) Ax(δ([f](m))) for some m, contradicting the infinity of the path f. Hence $t^{\mathbb{N}} \notin \Phi(X)$, which means $\mathbb{N} \not\models (t \mathrel{\varepsilon} X)[\Phi]$. If $(t \mathrel{\not\varepsilon} X) \in \delta([f](n))$ we get by definition of $\Phi$ directly $\mathbb{N} \models (t \mathrel{\varepsilon} X)[\Phi]$, i.e., $\mathbb{N} \not\models (t \mathrel{\not\varepsilon} X)[\Phi]$. If $F \in \bigwedge$-type we get by (iii) an m such that G ∈ δ([f](m)) for some G ∈ CS(F). By the induction hypothesis this implies $\mathbb{N} \not\models G[\Phi]$ which, in turn, shows $\mathbb{N} \not\models F[\Phi]$. If $F \in \bigvee$-type we get for all G ∈ CS(F) an m such that G ∈ δ([f](m)) by (iv). By the induction hypothesis this implies $\mathbb{N} \not\models G[\Phi]$ for all G ∈ CS(F). Hence $\mathbb{N} \not\models F[\Phi]$. Since Δ = δ([f](0)) the claim follows from (v).

5.4.9 Theorem (ω-completeness Theorem) For all $\Pi^1_1$-sentences of the form $(\forall X_1) \ldots (\forall X_n)F(X_1, \ldots, X_n)$ we have
$\mathbb{N} \models (\forall X_1) \ldots (\forall X_n)F(X_1, \ldots, X_n) \Leftrightarrow (\exists \alpha < \omega_1^{CK})\ \models^{\alpha} F(X_1, \ldots, X_n)$.

Proof The direction from right to left is Lemma 5.4.4. For the opposite direction we assume
$\not\models^{\alpha} F(X_1, \ldots, X_n)$ (i)
for all $\alpha < \omega_1^{CK}$. Then $S^{\omega}_{\langle F(X_1, \ldots, X_n)\rangle}$ cannot be well-founded by the Syntactical Main Lemma (Lemma 5.4.7). Thus, by the Semantical Main Lemma, we obtain an assignment $\Phi$ to the set variables $X_1, \ldots, X_n$ such that $\mathbb{N} \not\models F(X_1, \ldots, X_n)[\Phi]$.
It is an easy exercise to prove that the verification calculus has the property of ∨-exportation, i.e., to prove
$\models^{\alpha} \Delta, A \vee B \Rightarrow\ \models^{\alpha} \Delta, A, B$
by induction on α. For a formula F let $F^{\nabla}$ denote the finite set that is obtained from F by exporting all disjunctions, i.e.,
$$F^{\nabla} := \begin{cases} \{F\} & \text{if } F \notin \bigvee\text{-type or } F \equiv (\exists y)G_v(y) \\ A^{\nabla} \cup B^{\nabla} & \text{if } F \equiv (A \vee B). \end{cases}$$
Then we obtain (see footnote 1)
$\models^{\alpha} F \Rightarrow\ \models^{\alpha} F^{\nabla}$.

5.4.10 Definition Let $(\forall \vec{X})F(\vec{X})$ be a $\Pi^1_1$-sentence. We put
$$tc\big((\forall \vec{X})F(\vec{X})\big) := \min(\{\alpha \mid \models^{\alpha} F(\vec{X})^{\nabla}\} \cup \{\omega_1^{CK}\})$$
and call tc(F) the truth complexity of F. For a pseudo $\Pi^1_1$-sentence $G(\vec{X})$ containing the free set parameters $\vec{X}$ we define
$tc(G(\vec{X})) := tc\big((\forall \vec{X})G(\vec{X})\big)$.

We can now reformulate Theorem 5.4.9 in the following form.

5.4.11 Theorem For any (pseudo) $\Pi^1_1$-sentence F we have
$\mathbb{N} \models F \Leftrightarrow tc(F) < \omega_1^{CK}$.

5.4.12 Exercise Let F be a pseudo $\Pi^1_1$-sentence. Show that the search tree $S^{\omega}_{\langle F\rangle}$ and the label function δ are primitive recursive.

5.4.13 Exercise Let $\mathcal{S} = (S, \ldots)$ be a structure for $\mathcal{L}(\mathrm{NT})$. We call $\mathcal{S}$ a weak second-order structure if we only allow $\mathcal{S}$-assignments to second-order variables that lie in some subset $\mathcal{A}$ of the full power-set of S. This implies that the range of second-order quantifiers is restricted to $\mathcal{A}$ (see footnote 2). We denote weak second-order structures by $\mathcal{S} = (S, \mathcal{A}, \ldots)$. Call a weak second-order structure $\mathcal{S}$ an ω-structure if $(S, \ldots)$, i.e., its first-order part, is isomorphic to the standard model $\mathbb{N}$. Define $M \models_{\omega} F$ iff $\mathcal{S} \models G[\Phi]$ for all G ∈ M implies $\mathcal{S} \models F[\Phi]$ for all ω-structures $\mathcal{S}$ and all $\mathcal{S}$-assignments $\Phi$. Show that for a $\Pi^1_1$-sentence F and a countable set M of $\Sigma^1_1$-sentences we have $M \models_{\omega} F$ iff there is an $\alpha < \omega_1^{CK}$ and a finite set Γ ⊆ M such that $\models^{\alpha} \neg\Gamma, F$.

Footnote 1: There is no profoundness behind the definition of $F^{\nabla}$. We just wanted to avoid that certain formulas, e.g., the scheme for Mathematical Induction, get weird looking truth complexities, caused by additional applications of $\bigvee$-rules.

Footnote 2: That means that we treat second-order variables as a new sort of first-order variables, the range of which is restricted to $\mathcal{A}$. Therefore, we are in fact in the realm of a two-sorted first-order logic.
Chapter 6
Inductive Definitions
Inductive definitions are ubiquitous in mathematics, especially in mathematical logic. In the following sections, we give a summary of the basic theory of inductive definitions and link it with the theory of truth complexity for $\Pi^1_1$-sentences of the previous chapter.
6.1 Motivation

We already introduced a sample of inductively defined notions in this book. Recall, for instance, the inductive definition of a term by the following clauses:
1. Every free variable and constant is a term.
2. If $t_1, \ldots, t_n$ are terms and f is an n-ary function symbol then $(f t_1 \ldots t_n)$ is a term.

Generally speaking, inductive definitions consist of a set of clauses. A clause has the form: If a ∈ X for all a ∈ A then b ∈ X. Put in abstract terms, a clause is a pair (A, b) where A is the set of premises and b the conclusion of the clause. Sometimes we write briefly A ⇒ b to denote a clause. The set of premises may well be infinite.

6.1.1 Definition An inductive definition is a collection $\mathcal{A}$ of clauses. Let $\mathcal{A}$ be an inductive definition. We say that a set X satisfies $\mathcal{A}$ if for every $(A \Rightarrow b) \in \mathcal{A}$ with A ⊆ X we also have b ∈ X. Synonymously, we say that the set X is closed under the clauses in $\mathcal{A}$ (or just $\mathcal{A}$-closed).

6.1.2 Definition A set X is inductively generated by an inductive definition $\mathcal{A}$ if X is the least set, with respect to set inclusion, which is closed under $\mathcal{A}$. We denote this set sometimes by $I(\mathcal{A})$.

Observe that the set $C_{\mathcal{A}} := \{b \mid (A \Rightarrow b) \in \mathcal{A}\}$ of conclusions of an inductive definition $\mathcal{A}$ is always $\mathcal{A}$-closed. Moreover the intersection of $\mathcal{A}$-closed sets is again $\mathcal{A}$-closed. We therefore obtain
$I(\mathcal{A}) = \bigcap \{X \mid X \text{ is } \mathcal{A}\text{-closed}\}$.
An inductive definition A is deterministic if (A ⇒ b) ∈ A and (B ⇒ b) ∈ A imply A = B, i.e., if there is only one way in which b can get into I(A ).
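For a finite collection of clauses with finite premise sets the notions of this section can be checked mechanically. The sketch below is an illustration under that finiteness assumption; the representation of a clause A ⇒ b as a pair `(frozenset(A), b)` and the function names are choices made here, not notation from the text.

```python
def inductively_generated(clauses):
    """Least set closed under the clauses, built by firing clauses until nothing changes."""
    generated = set()
    changed = True
    while changed:
        changed = False
        for premises, conclusion in clauses:
            if conclusion not in generated and premises <= generated:
                generated.add(conclusion)
                changed = True
    return generated


def is_closed(X, clauses):
    """X is closed if every clause whose premises lie in X has its conclusion in X."""
    return all(conclusion in X for premises, conclusion in clauses if premises <= X)


def is_deterministic(clauses):
    """Each conclusion may occur with only one premise set."""
    seen = {}
    for premises, conclusion in clauses:
        if seen.setdefault(conclusion, premises) != premises:
            return False
    return True


# Example clauses: (empty => 0) and ({n} => n + 2) for n = 0, ..., 7.
CLAUSES = [(frozenset(), 0)] + [(frozenset({n}), n + 2) for n in range(8)]
```

For `CLAUSES` the generated set consists of the even numbers 0, 2, 4, 6, 8, while the set of all conclusions is a strictly larger closed set; the clauses with odd premises never fire.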
6.2 Inductive Definitions as Monotone Operators

In this section we will show that inductive definitions can be viewed as monotone operators.

6.2.1 Definition Let an inductive definition $\mathcal{A}$ operate on a set X, i.e., A ∪ {a} ⊆ X holds for all $(A \Rightarrow a) \in \mathcal{A}$. We define an operator $\Phi_{\mathcal{A}} : Pow(X) \to Pow(X)$ by
$\Phi_{\mathcal{A}}(S) := \{b \in X \mid (\exists (A \Rightarrow b) \in \mathcal{A})[A \subseteq S]\}$.

For $S_0 \subseteq S_1 \subseteq X$ we then obviously have $\Phi_{\mathcal{A}}(S_0) \subseteq \Phi_{\mathcal{A}}(S_1)$, which means that the operator $\Phi_{\mathcal{A}}$ is monotone. Vice versa every monotone operator $\Phi : Pow(X) \to Pow(X)$ can be viewed as the collection $\{(A \Rightarrow a) \mid A \subseteq X \wedge a \in \Phi(A)\}$ of clauses. Inductive definitions and monotone operators are thus two sides of the same coin. So we simplify Definition 6.1.1 in the following way.

6.2.2 Definition A generalized inductive definition on a set A is a monotone operator $\Phi : Pow(A^n) \to Pow(A^n)$.

6.2.3 Definition Let $\Phi : Pow(A^n) \to Pow(A^n)$ be an inductive definition. A set $S \subseteq A^n$ is Φ-closed iff Φ(S) ⊆ S. We define
$I(\Phi) = \bigcap \{S \mid S \subseteq A^n \wedge \Phi(S) \subseteq S\}$.

6.2.4 Observation Let $\Phi : Pow(A^n) \to Pow(A^n)$ be an inductive definition. Then I(Φ) is the least fixed-point of the operator Φ.

Proof Let $M_\Phi := \{S \mid \Phi(S) \subseteq S\}$. Then $I(\Phi) = \bigcap M_\Phi$. For $S \in M_\Phi$ we have I(Φ) ⊆ S and thus Φ(I(Φ)) ⊆ Φ(S) ⊆ S by monotonicity. Thus
$\Phi(I(\Phi)) \subseteq \bigcap M_\Phi = I(\Phi)$. (i)
From (i) we obtain
$\Phi(\Phi(I(\Phi))) \subseteq \Phi(I(\Phi))$ (ii)
again by monotonicity. Hence $\Phi(I(\Phi)) \in M_\Phi$, which entails
$I(\Phi) \subseteq \Phi(I(\Phi))$. (iii)
But (i) and (iii) show that I(Φ) is a fixed-point and by definition of I(Φ) this has to be the least one.

We call I(Φ) the fixed-point of the inductive definition Φ.
6.3 The Stages of an Inductive Definition

The definition of a fixed-point as the intersection of all Φ-closed sets, which is sometimes called the definition of I(Φ) "from above", does, however, not really reflect the inductive nature of the fixed-point. An inductively defined set should be obtained step by step from below and not as the intersection of Φ-closed sets. This step by step construction of I(Φ) becomes visible when regarding the stages of an inductive definition Φ. The idea is to apply Φ successively starting with the empty set. In general it is not guaranteed that the fixed-point will be constructed in finitely many steps; the construction might well need transfinite iterations. Therefore we need ordinals to describe the stages of an inductive definition Φ.

For a set A we denote by $\overline{\overline{A}}$ its cardinality. It is a folklore result of set theory that $\overline{\overline{A}}^{+}$, the first cardinal bigger than $\overline{\overline{A}}$, is always a regular cardinal. We call a set a countable if $\overline{\overline{a}} \le \omega$. Observe that the first uncountable cardinal $\omega_1 := \omega^{+}$ is regular. Since $\sup\{\alpha \mid \alpha < \overline{\overline{A}}^{+}\} = \overline{\overline{A}}^{+}$ we obtain that $\{\alpha \mid \overline{\overline{\alpha}} \le \overline{\overline{A}}\} = \{\alpha \mid \alpha < \overline{\overline{A}}^{+}\} = \overline{\overline{A}}^{+}$, i.e., there are $\overline{\overline{A}}^{+}$-many ordinals of cardinality less or equal to $\overline{\overline{A}}$. Especially there are $\omega_1$-many countable ordinals.

6.3.1 Definition By transfinite recursion we define
$\Phi^{\alpha} := \Phi(\Phi^{<\alpha})$
where we used the abbreviation
$\Phi^{<\alpha} := \bigcup_{\xi < \alpha} \Phi^{\xi}$.
Then $\Phi^{0} = \Phi(\emptyset)$, $\Phi^{1} = \Phi(\Phi(\emptyset))$, $\Phi^{2} = \Phi(\Phi(\Phi(\emptyset)) \cup \Phi(\emptyset))$, ….

6.3.2 Lemma Let Φ be an inductive definition on a set A. Then there is an ordinal $\sigma < \overline{\overline{A}}^{+}$ such that $\Phi^{<\sigma} = \Phi^{\sigma}$.

Proof By definition and the monotonicity of Φ, we get
$\xi < \eta \Rightarrow \Phi^{\xi} \subseteq \Phi^{\eta}$.
All the sets $\Phi^{\xi}$ are subsets of A, hence of cardinality $\le \overline{\overline{A}}$. Since there are $\overline{\overline{A}}^{+}$-many ordinals below $\overline{\overline{A}}^{+}$, the hierarchy of stages cannot be strict for all ordinals below $\overline{\overline{A}}^{+}$. Therefore there exists an ordinal $\sigma < \overline{\overline{A}}^{+}$ such that $\Phi^{<\sigma} = \Phi^{\sigma}$.

6.3.3 Definition We define $|\Phi| := \min\{\xi \mid \Phi^{\xi} = \Phi^{<\xi}\}$ and call |Φ| the closure ordinal of the inductive definition Φ.

6.3.4 Theorem For an inductive definition Φ we have $I(\Phi) = \Phi^{<|\Phi|}$.

Proof We obtain
$\Phi^{\xi} \subseteq I(\Phi)$ (i)
easily by induction on ξ. The induction hypothesis yields $\Phi^{<\xi} \subseteq I(\Phi)$ and by the monotonicity of Φ we obtain $\Phi^{\xi} = \Phi(\Phi^{<\xi}) \subseteq \Phi(I(\Phi)) = I(\Phi)$. Since $\Phi(\Phi^{<|\Phi|}) = \Phi^{|\Phi|} = \Phi^{<|\Phi|}$ and I(Φ) is the least Φ-closed set we also have
$I(\Phi) \subseteq \Phi^{<|\Phi|}$. (ii)
Hence $I(\Phi) = \Phi^{<|\Phi|}$ by (i) and (ii), and we have a construction of the fixed-point I(Φ) from below.
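On a finite set all stages have finite index, so the construction from below can be carried out literally. The following sketch assumes a monotone operator on a finite universe; the reachability operator `phi` on the small graph `EDGES` is a hypothetical example, not one taken from the text.

```python
def stages(phi):
    """Return [Phi^0, Phi^1, ..., Phi^sigma], stopping at the least sigma with Phi^sigma = Phi^{<sigma}.

    Terminates only if phi is monotone on a finite universe.
    """
    below = frozenset()                   # Phi^{<0} is the empty set
    result = []
    while True:
        stage = frozenset(phi(below))     # Phi^xi = Phi(Phi^{<xi})
        result.append(stage)
        if stage == below:                # closure ordinal reached
            return result
        below = below | stage             # Phi^{<xi+1} = Phi^{<xi} united with Phi^xi


# Hypothetical example: nodes reachable from node 0 in a small directed graph.
EDGES = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}


def phi(S):
    return {0} | {w for v in S for w in EDGES[v]}


STAGES = stages(phi)
CLOSURE_ORDINAL = len(STAGES) - 1
FIXED_POINT = STAGES[-1]


def norm(n):
    """Least xi with n in Phi^xi."""
    return next(xi for xi, stage in enumerate(STAGES) if n in stage)
```

In this example the last computed stage equals the union of all earlier ones, as in Theorem 6.3.4, and the closure ordinal is 4: the last new element, node 4, enters at stage 3.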
6.3.5 Definition For an element n ∈ I(Φ) we define its inductive norm by $|n|_{\Phi} := \min\{\xi \mid n \in \Phi^{\xi}\}$.

6.3.6 Lemma For the closure ordinal of an inductive definition Φ we get $|\Phi| = \sup\{|n|_{\Phi} + 1 \mid n \in I(\Phi)\}$.

Proof Since $I(\Phi) = \Phi^{<|\Phi|}$ we have $\eta := \sup\{|n|_{\Phi} + 1 \mid n \in I(\Phi)\} \le |\Phi|$. Assuming η < |Φ| we get $\Phi^{\eta} \subsetneq \Phi^{\eta+1}$. But then there is an $n \in \Phi^{\eta+1} \setminus \Phi^{\eta}$, which contradicts the definition of η.

6.3.7 Exercise Let A be an arbitrary set and M ⊆ Pow(A). Show that there is an inductive definition $\Phi_M$ whose fixed-point is the sigma-algebra induced by M. Show that $|\Phi_M| \le \omega_1$.
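The inductive norm and Lemma 6.3.6 can be illustrated on a finite example. This is a sketch under stated assumptions: the function `inductive_norms`, the edge set `EDGES` and the operator `phi_reach` (reachability from node 0 in a small directed graph) are hypothetical and serve only as an illustration.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Operator = Callable[[FrozenSet[int]], FrozenSet[int]]


def inductive_norms(phi: Operator) -> Tuple[Dict[int, int], int]:
    """Return (norm, closure), where norm[n] = min{xi : n in Phi^xi}
    and closure = min{xi : Phi^xi == Phi^{<xi}} (finite universes only)."""
    below: FrozenSet[int] = frozenset()
    norm: Dict[int, int] = {}
    xi = 0
    while True:
        current = phi(below)
        if current == below:
            return norm, xi
        for n in current - below:            # n enters exactly at stage xi
            norm[n] = xi
        below = below | current
        xi += 1


# Hypothetical example: nodes reachable from 0 along the edges below.
EDGES = {(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (5, 6)}


def phi_reach(s: FrozenSet[int]) -> FrozenSet[int]:
    return frozenset({0} | {m for (n, m) in EDGES if n in s})


norm, closure_ordinal = inductive_norms(phi_reach)
print(norm, closure_ordinal)
```

Here the norm of a node is its distance from 0, and the closure ordinal equals the largest norm plus one, as Lemma 6.3.6 predicts. Nodes 5 and 6 never enter a stage and therefore have no norm.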
6.4 Arithmetically Definable Inductive Definitions

The computation of the closure ordinals of inductive definitions is one of the objectives of generalized recursion theory. In the general case, however, all we know is that these ordinals are below the cardinal $\overline{A}^{+}$. To say more, we need more information about the operator. In proof theory, we are primarily interested in inductively generated sets of natural numbers and will therefore concentrate on inductive definitions on natural numbers, more precisely on operators which are arithmetically definable, i.e., inductive definitions whose clauses can be described in the language of arithmetic.

6.4.1 Definition An operator $\Phi\colon \mathrm{Pow}(\mathbb{N}^n) \to \mathrm{Pow}(\mathbb{N}^n)$ is arithmetically definable if there is a formula $F(X, \vec{x}, y_1, \dots, y_m)$ in the first-order language of number theory, in which only the shown variables occur freely, together with number parameters $a_1, \dots, a_m$ such that
$$\Phi(S) = \{\vec{k} \in \mathbb{N}^n \mid \mathbb{N} \models F(X, \vec{x}, y_1, \dots, y_m)[S, \vec{k}, a_1, \dots, a_m]\}.$$
Given a formula $F(X, \vec{x}, y_1, \dots, y_m)$ and a tuple $a_1, \dots, a_m$ of number constants, we denote the induced operator by $\Phi_F[a_1, \dots, a_m]$. Usually we suppress mentioning the additional parameters $a_1, \dots, a_m$.

6.4.2 Remark Since there is a primitive recursive coding machinery in the language of arithmetic, it means no loss of generality to restrict inductive definitions to operators $\Phi\colon \mathrm{Pow}(\mathbb{N}) \to \mathrm{Pow}(\mathbb{N})$. For an n-ary relation variable X, we can introduce a unary relation variable $X'$ and replace $(u_1, \dots, u_n) \,\varepsilon\, X$ by $\langle u_1, \dots, u_n\rangle \,\varepsilon\, X'$. To simplify notation we will therefore mostly work with set variables, i.e., unary relation variables.

6.4.3 Definition Let F(X) be a formula in the language of arithmetic. We say that F(X) is X-positive if, after translating F(X) into the Tait-language, there are no occurrences of $t \,\not\varepsilon\, X$ in F(X).

6.4.4 Lemma The operator $\Phi_F$ which is induced by an X-positive formula F(X, x) is monotone. In this case we talk about positive operators or positive inductive definitions.

Proof
We show
$$S \subseteq T \;\Rightarrow\; (\forall x)[F(S, x) \to F(T, x)] \qquad \text{(i)}$$
by induction on the rank of the X-positive formula F(X, x). If X does not occur in F(X, x) then (i) holds trivially. If F(X, x) is a formula $x \,\varepsilon\, X$ we obtain $(\forall x)[F(S, x) \to F(T, x)]$ from the hypothesis S ⊆ T. In case that F(X, x) possesses sub-formulas we obtain the claim immediately from the induction hypothesis.

6.4.5 Definition If $\Phi_F$ is an inductive definition defined by the X-positive formula $F(X, x, a_1, \dots, a_n)$ we write $I_F[a_1, \dots, a_n]$ instead of $I(\Phi_F[a_1, \dots, a_n])$ and $|F|$ instead of $|\Phi_F|$, as well as $I_F^{<\xi}[a_1, \dots, a_n]$ and $I_F^{\xi}[a_1, \dots, a_n]$ instead of $\Phi_F[a_1, \dots, a_n]^{<\xi}$ and $\Phi_F[a_1, \dots, a_n]^{\xi}$, respectively. Again we suppress the extra parameters $a_1, \dots, a_n$ whenever they are inessential or obvious from the context.

6.4.6 Remark It follows from the Craig–Lyndon interpolation theorem that every definable operator whose monotonicity is provable in first-order logic is positive. This may not be the case if the proof of the monotonicity needs nonlogical axioms. All relevant monotone inductive definitions, however, are positively definable. (Cf. Exercise 6.4.9.)

For a set S and a natural number a, the a-slice is the set $S_a := \{x \mid \langle a, x\rangle \in S\}$.

6.4.7 Definition A set S ⊆ N is positive-inductively definable if there is an X-positive formula F(X, x), possibly with further numerical parameters, and a natural number a such that S is the a-slice of the fixed-point $I_F$.

By Observation 6.2.4 we see that the fixed-point of an arithmetically definable inductive definition is definable by a $\Pi^1_1$-formula. For an X-positive arithmetical formula $F(X, x, y_1, \dots, y_n)$ we obtain
$$x \in I_F[a_1, \dots, a_n] \;\Leftrightarrow\; (\forall X)\big[(\forall y)[F(X, y, a_1, \dots, a_n) \to y \,\varepsilon\, X] \to x \,\varepsilon\, X\big]. \qquad (6.1)$$
Therefore we have the following lemma.

6.4.8 Lemma All positive-inductively definable sets are in $\Pi^1_1$.

Proof This is obvious from (6.1) and Definition 6.4.7.
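On a finite universe the second-order quantifier in (6.1) ranges over finitely many sets, so the characterization "from above" can be checked by brute force and compared with the construction from below. The following sketch assumes a hypothetical operator `phi` on the universe {0, ..., 5}; none of the names are taken from the text.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Iterator

UNIVERSE = range(6)


# Hypothetical monotone operator: 0 is in, and n + 1 is in for n in s, below 4.
def phi(s: FrozenSet[int]) -> FrozenSet[int]:
    return frozenset({0} | {n + 1 for n in s if n + 1 < 4})


def all_subsets(universe: Iterable[int]) -> Iterator[FrozenSet[int]]:
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def fixed_point_from_above() -> FrozenSet[int]:
    """Intersection of all phi-closed sets X, i.e. all X with phi(X) <= X.
    This mirrors the right-hand side of (6.1)."""
    result = frozenset(UNIVERSE)             # the universe itself is closed
    for x in all_subsets(UNIVERSE):
        if phi(x) <= x:
            result &= x
    return result


def fixed_point_from_below() -> FrozenSet[int]:
    """Iterate phi from the empty set until nothing changes."""
    s: FrozenSet[int] = frozenset()
    while phi(s) != s:
        s = phi(s)
    return s


print(sorted(fixed_point_from_above()), sorted(fixed_point_from_below()))
```

Both computations yield the same set. Over the natural numbers the quantifier (∀X) in (6.1) is a genuine second-order quantifier and cannot be eliminated by enumeration, which is why (6.1) only yields membership in $\Pi^1_1$.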
We will see later that, conversely, all $\Pi^1_1$-sets are positive-inductively definable. An ordinal which is characteristic for the structure N of natural numbers is the ordinal
$$\kappa^{\mathbb{N}} := \sup\{|\Phi_F| \mid F(X, x, a_1, \dots, a_n) \text{ is an } X\text{-positive arithmetical formula and } a_1, \dots, a_n \in \mathbb{N}\}.$$
It is a folklore result of abstract recursion theory that $\kappa^{\mathbb{N}} = \omega_1^{CK}$, where $\omega_1^{CK}$ is the least ordinal which cannot be represented as the order-type of a primitive recursive (even $\Sigma^1_1$-definable) well-ordering, a result which we will reprove in Theorem 6.6.4. We will later even get a finer calibration regarding not only definability but also provability within certain axiom systems.

6.4.9 Exercise Let $\Phi\colon \mathrm{Pow}(\mathbb{N}) \to \mathrm{Pow}(\mathbb{N})$ be an arithmetically definable operator such that
$$\models (\forall X)(\forall Y)\big[X \subseteq Y \to (\forall x)[F(X, x) \to F(Y, x)]\big]$$
holds true for its defining formula F(X, x). Show that there is an X-positive formula which defines Φ. Hint: Use the Craig–Lyndon interpolation theorem (Exercise 4.5.7).
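6.4.10 Exercise (Simultaneous inductive definitions). Assume that F(X, Y, x) and G(X, Y, y) are X- and Y-positive formulas. We define
$$\Phi_F^{\alpha} := \{n \in \mathbb{N} \mid \mathbb{N} \models F[\Phi_F^{<\alpha}, \Phi_G^{<\alpha}, n]\}$$
and
$$\Phi_G^{\alpha} := \{n \in \mathbb{N} \mid \mathbb{N} \models G[\Phi_F^{<\alpha}, \Phi_G^{<\alpha}, n]\},$$
where we again abbreviate $\Phi_F^{<\alpha} := \bigcup_{\xi<\alpha} \Phi_F^{\xi}$ and $\Phi_G^{<\alpha} := \bigcup_{\xi<\alpha} \Phi_G^{\xi}$. Show that $\bigcup_{\xi \in On} \Phi_F^{\xi}$ as well as $\bigcup_{\xi \in On} \Phi_G^{\xi}$ are positive-inductively definable.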
6.4.11 Exercise (Transitivity theorem). Let F(X, Y, x) be an X,Y-positive and G(Z, z) a Z-positive formula. Show that the fixed-point of the operator $\Phi_{F(I_G)}$ defined by $\Phi_{F(I_G)}(S) := \{n \in \mathbb{N} \mid \mathbb{N} \models F[I_G, S, n]\}$ is positive-inductively definable.

6.4.12 Exercise (Stage comparison theorem). Let F(X, x) and G(X, x) be X-positive formulas. Show that the stage comparison relations
$$m \le^{*}_{F,G} n \;:\Leftrightarrow\; m \in I_F \wedge |m|_F \le |n|_G$$
and
$$m <^{*}_{F,G} n \;:\Leftrightarrow\; |m|_F < |n|_G$$
are positive-inductively definable. Here we put $|n|_F := \omega_1$ for $n \notin I_F$.
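6.5 Inductive Definitions, Well-Orderings and Well-Founded Trees

An important special case of inductive definitions are accessible parts, which are studied in this section.

6.5.1 Definition Let ≺ be an arithmetically definable binary relation on the natural numbers and regard the operator defined by the formula
$$\mathrm{Acc}(X, x, \prec) \;:\Leftrightarrow\; x \in \mathrm{field}(\prec) \wedge (\forall y)[y \prec x \to y \,\varepsilon\, X].$$
This is obviously an X-positive formula inducing a monotone operator $\mathrm{Acc}_{\prec}$. Its fixed-point is the accessible part of ≺, usually denoted by $\mathrm{Acc}_{\prec}$. By $\mathrm{Acc}_{\prec}^{\alpha}$ we denote the α-th stage of the operator $\mathrm{Acc}_{\prec}$. For $s \in \mathrm{Acc}_{\prec}$ we denote by $|s|_{\mathrm{Acc}_{\prec}}$ the inductive norm of s.

6.5.2 Lemma Let ≺ be an arithmetically definable binary well-founded relation on the natural numbers. Then $\mathrm{Acc}_{\prec} = \mathrm{field}(\prec)$ and for all s ∈ field(≺) we have $\mathrm{otyp}_{\prec}(s) = |s|_{\mathrm{Acc}_{\prec}}$.

Proof First we show
$$s \in \mathrm{field}(\prec) \wedge \mathrm{otyp}_{\prec}(s) = \alpha \;\Rightarrow\; s \in \mathrm{Acc}_{\prec}^{\alpha} \qquad \text{(i)}$$
by induction on α. By the induction hypothesis we have
$$(\forall t)[t \prec s \to t \in \mathrm{Acc}_{\prec}^{<\alpha}] \qquad \text{(ii)}$$
which immediately implies $s \in \mathrm{Acc}_{\prec}^{\alpha}$. From (i) we obtain $\mathrm{field}(\prec) \subseteq \mathrm{Acc}_{\prec}$ and $|s|_{\mathrm{Acc}_{\prec}} \le \mathrm{otyp}_{\prec}(s)$. The opposite inclusion $\mathrm{Acc}_{\prec} \subseteq \mathrm{field}(\prec)$ holds anyway. Therefore it suffices to show
$$s \in \mathrm{Acc}_{\prec}^{\alpha} \;\Rightarrow\; \mathrm{otyp}_{\prec}(s) \le \alpha. \qquad \text{(iii)}$$
We prove (iii) by induction on α. By the induction hypothesis we get
$$\mathrm{Acc}_{\prec}^{<\alpha} \subseteq \{x \in \mathrm{field}(\prec) \mid \mathrm{otyp}_{\prec}(x) < \alpha\}. \qquad \text{(iv)}$$
For $s \in \mathrm{Acc}_{\prec}^{\alpha}$ and t ≺ s we obtain $t \in \mathrm{Acc}_{\prec}^{<\alpha}$ and thence $\mathrm{otyp}_{\prec}(t) < \alpha$ by (iv). Hence $\mathrm{otyp}_{\prec}(s) \le \alpha$.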
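Lemma 6.5.2 can be checked on a finite well-founded relation, where order-types are natural numbers. The sketch below assumes a hypothetical relation `REL` (proper divisibility on the divisors of 12); the helpers `acc_norms` and `otyp` are illustrative names, with `otyp` computed by the recursion otyp(x) = sup{otyp(y) + 1 | y ≺ x}.

```python
from typing import Dict, FrozenSet, Set, Tuple

Relation = Set[Tuple[int, int]]              # (y, x) in the set means y ≺ x

# Hypothetical example: proper divisibility on the divisors of 12.
DIVISORS = [1, 2, 3, 4, 6, 12]
REL: Relation = {(y, x) for x in DIVISORS for y in DIVISORS
                 if y != x and x % y == 0}


def acc_norms(rel: Relation) -> Dict[int, int]:
    """Norm of each element of the accessible part: the least stage of the
    operator Acc at which the element appears."""
    field = {a for pair in rel for a in pair}
    below: FrozenSet[int] = frozenset()
    norm: Dict[int, int] = {}
    alpha = 0
    while True:
        # Acc(below): all x in the field whose predecessors lie in `below`.
        current = frozenset(
            x for x in field
            if all(y in below for (y, z) in rel if z == x)
        )
        if current == below:
            return norm
        for x in current - below:
            norm[x] = alpha
        below = below | current
        alpha += 1


def otyp(x: int, rel: Relation) -> int:
    """Order-type of x; terminates only for well-founded relations."""
    preds = [y for (y, z) in rel if z == x]
    return max((otyp(y, rel) + 1 for y in preds), default=0)


norms = acc_norms(REL)
print(norms)
print(all(norms[x] == otyp(x, REL) for x in DIVISORS))
```

For this relation every element of the field is accessible and its norm agrees with its order-type, as the lemma states.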
We see from Lemma 6.5.2 that induction along a well-ordering and induction on the definition of an inductively defined set are two sides of the same coin. We shall see that Bar Induction on well-founded trees belongs to the same category. Well-founded trees have been introduced in Definition 4.4.2. Let T be a tree which is arithmetically definable, i.e., we have an arithmetical formula T(x) without further free variables such that s ∈ T ⇔ T(s). Define
$$F_T(X, x) \;:\Leftrightarrow\; T(x) \wedge (\forall y)[T(x^\frown\langle y\rangle) \to x^\frown\langle y\rangle \,\varepsilon\, X].$$
Then $F_T(X, x)$ is an X-positive formula. Let $I_T$ denote its fixed-point and $I_T^{\alpha}$ the stages of the fixed-point.

6.5.3 Lemma Let T be a well-founded arithmetically definable tree. Then $s \in I_T^{\mathrm{otyp}_T(s)}$ for all s ∈ T.

Proof We induct on $\mathrm{otyp}_T(s)$. By the induction hypothesis we have $s^\frown\langle x\rangle \in$ [...]

[...] $\langle\rangle \in I_T$. If T is well-founded then $(\forall s \in T)[\mathrm{otyp}_T(s) = |s|_{I_T}]$ and $|I_T| = \mathrm{otyp}(T) + 1$.

Proof Let T be arithmetically definable. If T is well-founded we obtain $\langle\rangle \in I_T$ by Lemma 6.5.3. If conversely $\langle\rangle \in I_T$ then we obtain the well-foundedness of $T = T_{\langle\rangle}$ by Lemma 6.5.4. By Lemma 6.5.3 we obtain $|s|_{I_T} \le \mathrm{otyp}_T(s)$ and by Lemma 6.5.4 we get $\mathrm{otyp}_T(s) = \mathrm{otyp}(T_s) \le |s|_{I_T}$. Finally we observe that $|t|_{I_T} \le |s|_{I_T}$ for all s ⊆ t ∈ T. Hence $|I_T| = |\langle\rangle|_{I_T} + 1 = \mathrm{otyp}_T(\langle\rangle) + 1 = \mathrm{otyp}(T) + 1$ by Lemma 6.3.6.

6.5.6 Corollary The class of $\Pi^1_1$-sets and the class of positive-inductively definable sets coincide.

Proof We have already seen that all positive-inductively definable sets are in $\Pi^1_1$. If $P \in \Pi^1_1$, say $P = \{x \mid F_u(x)\}$, we obtain from the ω-Completeness Theorem (Theorem 5.4.9) that n ∈ P iff the search tree for $F_u(n)$ is well-founded. By Theorem 6.5.5 this is the case iff $\langle\rangle \in I_{S^{\omega}_{F_u(n)}}$. This shows that P is positive-inductively definable.
6.6 Inductive Definitions and Truth Complexities

The aim of this section is to connect the closure ordinal of a positive inductive definition to the truth complexity of its defining $\Pi^1_1$-sentence. To improve readability we introduce some abbreviations. Let
$$\mathrm{Cl}_F(X) \;:\Leftrightarrow\; (\forall x)[F(X, x) \to x \,\varepsilon\, X]$$
denote the fact that the set X is closed under the operator induced by the formula F(X, x). The $\Pi^1_1$-definition of $n \in I_F$ is then obtained as
$$n \in I_F \;\Leftrightarrow\; (\forall X)[\mathrm{Cl}_F(X) \to n \,\varepsilon\, X].$$
The first step is to show a stage theorem.

6.6.1 Theorem (Stage Theorem) Let F(X, x) be an X-positive arithmetical formula. Then $(\forall n \in I_F)\big[\,|n|_F < 2^{\mathrm{tc}(n \in I_F)}\big]$.

To prove the Stage Theorem we need the more general Stage Lemma. To formulate the lemma let F(X, x) be an X-positive arithmetical formula and put
$$F[t_1, \dots, t_k] \;:\Leftrightarrow\; F(X, x) \vee x = t_1 \vee \dots \vee x = t_k.$$
Then
$$s \in I_F^{\alpha} \;\Rightarrow\; I_{F[s]}^{\beta} \subseteq I_F^{\alpha+\beta}. \qquad (6.2)$$
Claim (6.2) is shown by induction on β. If $x \in I_{F[s]}^{\beta}$ then $F(I_{F[s]}^{<\beta}, x) \vee x = s$. By the induction hypothesis and X-positivity, this implies $F(I_F^{<\alpha+\beta}, x) \vee x = s$. Hence $x \in I_F^{\alpha+\beta}$ or $x = s \in I_F^{\alpha} \subseteq I_F^{\alpha+\beta}$, and (6.2) is shown.
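Claim (6.2) can be tested on finite stages of a concrete operator. The sketch below is only an illustration under assumptions: `phi_succ` is a hypothetical operator (0 is in, and n + 1 is in whenever n is, for numbers below 8), `with_extra` models the passage from F to F[s] on the level of operators, and only natural-number stages are considered.

```python
from typing import Callable, Dict, FrozenSet

Operator = Callable[[FrozenSet[int]], FrozenSet[int]]


def stage(phi: Operator, k: int) -> FrozenSet[int]:
    """The k-th stage Phi^k for a natural number k."""
    below: FrozenSet[int] = frozenset()
    current = phi(below)                     # Phi^0
    for _ in range(k):
        below = below | current
        current = phi(below)
    return current


def phi_succ(s: FrozenSet[int]) -> FrozenSet[int]:
    return frozenset({0} | {n + 1 for n in s if n + 1 < 8})


def with_extra(phi: Operator, s: int) -> Operator:
    """Operator of F[s]: the clause 'x = s' is added to the definition."""
    return lambda x: phi(x) | frozenset({s})


def norm(phi: Operator, n: int, limit: int = 32) -> int:
    for xi in range(limit):
        if n in stage(phi, xi):
            return xi
    raise ValueError("element not found below the given limit")


def claim_6_2_holds(phi: Operator, s: int, max_beta: int) -> bool:
    """Check I_{F[s]}^beta <= I_F^{alpha+beta} for alpha = |s|_F."""
    alpha = norm(phi, s)
    extended = with_extra(phi, s)
    return all(stage(extended, beta) <= stage(phi, alpha + beta)
               for beta in range(max_beta + 1))


results: Dict[int, bool] = {s: claim_6_2_holds(phi_succ, s, 10)
                            for s in range(8)}
print(results)
```

Such a check does not replace the induction on β, but it makes the role of the shift by α visible: an element s of norm α may be added as a new clause at the cost of delaying every stage by α.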
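The proof of the Stage Lemma uses the following Inversion Lemma.

6.6.2 Lemma (Inversion Lemma) Let $F \in \bigwedge$-type and $CS(F) \ne \emptyset$. Then $\models^{\alpha} \Delta, F$ implies $\models^{\alpha} \Delta, G$ for all G ∈ CS(F).

Proof We induct on α. Assume first that the last inference in verifying $\models^{\alpha} \Delta, F$ has a critical formula A different from F. Then A ∈ Δ and we have the premise(s) $\models^{\alpha_B} \Delta, F, B$ for some (or all) B ∈ CS(A). By the induction hypothesis we get $\models^{\alpha_B} \Delta, G, B$ for some (or all) B ∈ CS(A) and infer $\models^{\alpha} \Delta, G$ by the same inference. If the last clause was $(\bigwedge)$ with critical formula F then we have the premises
$$\models^{\alpha_G} \Delta, F, G \qquad \text{(i)}$$
for all G ∈ CS(F). From the induction hypothesis we then get $\models^{\alpha_G} \Delta, G$ for all G ∈ CS(F) and obtain the claim by the structural rule (STR) (cf. p. 78).

6.6.3 Lemma (Stage Lemma) Let F(X, x) be an X-positive arithmetical formula and $\Delta(X, \vec{Y})$ a finite set of pseudo $\Pi^1_1$-sentences in which X occurs at most positively. If
$$\models^{\alpha} \neg\mathrm{Cl}_F(X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}), \qquad \text{(i)}$$
then
$$\mathbb{N} \models \bigvee \Delta[I^{<2^{\alpha}}_{F[t_1, \dots, t_k]}, S_1, \dots, S_n] \qquad \text{(ii)}$$
holds true for all assignments $S_1, \dots, S_n$ to the set variables $Y_1, \dots, Y_n$ in Δ.

Proof We prove the lemma by induction on α and distinguish the following cases.

1) (i) holds by (Ax). Then there is a formula $s \,\varepsilon\, X$ in Δ such that $s^{\mathbb{N}} = t_i^{\mathbb{N}}$ for some i ∈ {1, . . . , k}. Then $s^{\mathbb{N}} \in I^{0}_{F[t_1, \dots, t_k]}$ and we obtain $\mathbb{N} \models \bigvee \Delta[I^{<1}_{F[t_1, \dots, t_k]}, S_1, \dots, S_n]$.

2) (i) holds by an application of $(\bigwedge)$. Then there is an $A \in \bigwedge$-type ∩ Δ. If CS(A) = ∅ then A ∈ Diag(N) and we are done. Otherwise we have the premises
$$\models^{\alpha_G} \neg\mathrm{Cl}_F(X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}),\; G(X, \vec{Y})$$
for all G ∈ CS(A). All $G(X, \vec{Y})$ are again X-positive and we obtain by the induction hypothesis
$$\mathbb{N} \models \bigvee \Delta[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]}, S_1, \dots, S_n] \vee G[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]}, S_1, \dots, S_n] \qquad \text{(iii)}$$
for all $G(X, \vec{Y}) \in CS(A)$. Since $I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]} \subseteq I^{<2^{\alpha}}_{F[t_1, \dots, t_k]}$ we obtain (ii) from (iii).

3) (i) holds by an application of $(\bigvee)$. Then there is a formula $A \in \bigvee$-type belonging also to $\{\neg\mathrm{Cl}_F(X), t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X\} \cup \Delta(X, \vec{Y})$ such that
$$\models^{\alpha_G} \neg\mathrm{Cl}_F(X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}),\; G \qquad \text{(iv)}$$
for some G ∈ CS(A) and $\alpha_G < \alpha$. If $A \in \Delta(X, \vec{Y})$ we obtain the claim from the induction hypothesis as in case 2. Therefore suppose that $A \equiv \neg\mathrm{Cl}_F(X) \equiv (\exists x)[F(X, x) \wedge x \,\not\varepsilon\, X]$. Then $G \equiv F(X, t) \wedge t \,\not\varepsilon\, X$ for some term t and we obtain from (iv)
$$\models^{\alpha_G} \neg\mathrm{Cl}_F(X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}),\; F(X, t) \qquad \text{(v)}$$
and
$$\models^{\alpha_G} \neg\mathrm{Cl}_F(X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; t \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}) \qquad \text{(vi)}$$
by inversion. From (v) we get
$$\mathbb{N} \models \bigvee \Delta[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]}, S_1, \dots, S_n] \vee F[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]}, t] \qquad \text{(vii)}$$
by the induction hypothesis. If $\mathbb{N} \models \bigvee \Delta[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]}, S_1, \dots, S_n]$ we are done because $I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]} \subseteq I^{<2^{\alpha}}_{F[t_1, \dots, t_k]}$. Otherwise, we get
$$\mathbb{N} \models F[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k]}, t], \quad \text{i.e.,} \quad t^{\mathbb{N}} \in I^{2^{\alpha_G}}_{F[t_1, \dots, t_k]}. \qquad \text{(viii)}$$
From (viii) and assertion (6.2), we obtain
$$I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k, t]} \subseteq I^{<2^{\alpha_G} + 2^{\alpha_G}}_{F[t_1, \dots, t_k]} \subseteq I^{<2^{\alpha}}_{F[t_1, \dots, t_k]}. \qquad \text{(ix)}$$
The induction hypothesis applied to (vi) yields
$$\mathbb{N} \models \bigvee \Delta[I^{<2^{\alpha_G}}_{F[t_1, \dots, t_k, t]}, S_1, \dots, S_n] \qquad \text{(x)}$$
and we obtain the claim from (x) and (ix).

To infer the Stage Theorem from the Stage Lemma let $\alpha := \mathrm{tc}(n \in I_F)$. Then $\models^{\alpha} \neg\mathrm{Cl}_F(X),\; n \,\varepsilon\, X$ and we obtain $\mathbb{N} \models n \in I_F^{<2^{\alpha}}$ by the Stage Lemma. Hence $|n|_F < 2^{\alpha}$.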
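6.6.4 Theorem $\kappa^{\mathbb{N}} = \omega_1^{CK}$.

Proof By Lemma 6.5.2 we have $\mathrm{otyp}(\prec) < \kappa^{\mathbb{N}}$ for every recursive well-ordering ≺. Hence $\omega_1^{CK} \le \kappa^{\mathbb{N}}$. If F(X, x) is an X-positive arithmetical formula and $n \in I_F$ we obtain
$$\alpha := \mathrm{tc}\big((\forall X)[(\forall x)[F(X, x) \to x \,\varepsilon\, X] \to n \,\varepsilon\, X]\big) < \omega_1^{CK} \qquad \text{(i)}$$
because the search tree $S_{(\forall x)[F(X,x) \to x \varepsilon X] \to n \varepsilon X}$ is primitive recursive. Since $\omega_1^{CK}$ is closed under ordinal addition and exponentiation we obtain $|n|_F \le 2^{\alpha} < \omega_1^{CK}$ for all formulas F and all $n \in I_F$. Hence $\kappa^{\mathbb{N}} \le \omega_1^{CK}$.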
6.6.5 Note The Stage Theorem is a theorem of Gentzen (cf. [31]) in disguise.
Gentzen showed (of course using a different terminology) otyp(≺) ≤ ω tc TI(≺) for a well-ordering ≺. In case of a well-ordering, however, there is a strengthening of Gentzen’s theorem due to Beckmann [8] which we are going to prove next.
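To sharpen the Stage Theorem we need some additional notations. Assume that ≺ is a binary relation. We extend the accessibility operator $\mathrm{Acc}_{\prec}$ defined by the formula
$$\mathrm{Acc}(X, x, \prec) \;:\Leftrightarrow\; x \in \mathrm{field}(\prec) \wedge (\forall y)[y \prec x \to y \,\varepsilon\, X]$$
to its inflation
$$\overline{\mathrm{Acc}}_{\prec}(X) := X \cup \mathrm{Acc}_{\prec}(X) = X \cup \{n \in \mathrm{field}(\prec) \mid (\forall y \prec n)[y \in X]\}.$$
The operator $\overline{\mathrm{Acc}}_{\prec}$ is called inflationary because of $X \subseteq \overline{\mathrm{Acc}}_{\prec}(X)$, which is in general not true for $\mathrm{Acc}_{\prec}$. The α-th iterate of the operator $\overline{\mathrm{Acc}}_{\prec}$ is defined by
$$\overline{\mathrm{Acc}}_{\prec}^{\alpha}(X) := \overline{\mathrm{Acc}}_{\prec}\Big(X \cup \bigcup_{\xi<\alpha} \overline{\mathrm{Acc}}_{\prec}^{\xi}(X)\Big).$$
It is then plain that, starting with the empty set, all iterations of $\mathrm{Acc}_{\prec}$ and $\overline{\mathrm{Acc}}_{\prec}$ coincide, i.e., we have
$$\overline{\mathrm{Acc}}_{\prec}^{\alpha}(\emptyset) = \mathrm{Acc}_{\prec}^{\alpha}. \qquad (6.3)$$
Let $X \subseteq \mathrm{Acc}_{\prec}$. We want to enumerate the stages of the elements in $\mathrm{Acc}_{\prec}$ which are different from the stages of elements in X. Therefore we define
$$\overline{X} := \{n \mid (\exists y \in X)[\,|y|_{\mathrm{Acc}_{\prec}} = |n|_{\mathrm{Acc}_{\prec}}]\}, \quad O(X) := \{|n|_{\mathrm{Acc}_{\prec}} \mid n \in X\} \quad \text{and} \quad en_X := en_{On \setminus O(X)}.$$
We use the enumerating function $en_X$ in the explicit definition of a new operator
$$R_{\prec}^{\alpha}(X) := X \cup \{x \in \mathrm{Acc}_{\prec} \mid |x|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha)\} = X \cup \mathrm{Acc}_{\prec}^{en_X(\alpha)}.$$
Since $en_{X \cup \{x\}}(\alpha) \le en_X(\alpha + 1)$ holds trivially, we obtain the innocent-looking but crucial property
$$R_{\prec}^{\alpha}(X \cup \{x\}) \subseteq R_{\prec}^{\alpha+1}(X) \cup \{x\}. \qquad (6.4)$$
Put
$$R_{\prec}^{<\alpha}(X) := X \cup \bigcup_{\xi<\alpha} R_{\prec}^{\xi}(X).$$
There is a close connection between the operators $R_{\prec}^{\alpha}(X)$ and the iterations $\overline{\mathrm{Acc}}_{\prec}^{\alpha}(X)$, which is expressed by the next lemma.

6.6.6 Lemma Assume that ≺ is a transitive binary relation and $X \subseteq \mathrm{Acc}_{\prec}$. Then $R_{\prec}^{\alpha}(X) = \overline{\mathrm{Acc}}_{\prec}(R_{\prec}^{<\alpha}(X))$.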
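Proof For the inclusion from left to right, let $s \in R_{\prec}^{\alpha}(X)$. Then either $s \in X \subseteq \overline{\mathrm{Acc}}_{\prec}(R_{\prec}^{<\alpha}(X))$ and we are done, or $|s|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha)$. Pick t ≺ s. If $en_X(\beta) < |t|_{\mathrm{Acc}_{\prec}} < |s|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha)$ for all β < α then $|t|_{\mathrm{Acc}_{\prec}} \notin \mathrm{rng}(en_X)$, which implies $t \in X \subseteq R_{\prec}^{<\alpha}(X)$. Otherwise, there is a β < α such that $|t|_{\mathrm{Acc}_{\prec}} \le en_X(\beta)$. Therefore, we have
$$(\forall t \prec s)[t \in R_{\prec}^{<\alpha}(X)]$$
which in turn implies $s \in \overline{\mathrm{Acc}}_{\prec}(R_{\prec}^{<\alpha}(X))$.

For the opposite inclusion, let $s \in \overline{\mathrm{Acc}}_{\prec}(R_{\prec}^{<\alpha}(X))$. Then $s \in R_{\prec}^{<\alpha}(X)$ and we are done, or
$$(\forall t \prec s)[t \in X \vee |t|_{\mathrm{Acc}_{\prec}} < en_X(\alpha)]. \qquad \text{(i)}$$
We show
$$(\forall t \prec s)[t \in X \vee |t|_{\mathrm{Acc}_{\prec}} < en_X(\alpha)] \;\Rightarrow\; |s|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha) \qquad \text{(ii)}$$
by induction on $|s|_{\mathrm{Acc}_{\prec}}$. Let p ≺ s. If p ∉ X we obtain $|p|_{\mathrm{Acc}_{\prec}} < en_X(\alpha)$ by the hypothesis of (ii). Therefore assume p ∈ X. If t ≺ p and t ∉ X we obtain t ≺ s by the transitivity of ≺. Hence $|t|_{\mathrm{Acc}_{\prec}} < en_X(\alpha)$ by (i). Using the induction hypothesis, we obtain $|p|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha)$. Since p ∈ X we cannot have $|p|_{\mathrm{Acc}_{\prec}} = en_X(\alpha)$. Hence $|p|_{\mathrm{Acc}_{\prec}} < en_X(\alpha)$. Therefore $|p|_{\mathrm{Acc}_{\prec}} < en_X(\alpha)$ holds true for all p ≺ s, which entails $|s|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha)$. From (i) and (ii) we finally get $|s|_{\mathrm{Acc}_{\prec}} \le en_X(\alpha)$ and thus $s \in R_{\prec}^{\alpha}(X)$.

6.6.7 Lemma Let ≺ be a transitive binary relation. For any set $X \subseteq \mathrm{Acc}_{\prec}$ and any ordinal α we have $R_{\prec}^{\alpha}(X) = \overline{\mathrm{Acc}}_{\prec}^{\alpha}(X)$.

Proof Proving the lemma by induction on α we have the induction hypothesis
$$R_{\prec}^{<\alpha}(X) = X \cup \bigcup_{\xi<\alpha} \overline{\mathrm{Acc}}_{\prec}^{\xi}(X) =: X \cup \overline{\mathrm{Acc}}_{\prec}^{<\alpha}(X). \qquad \text{(i)}$$
Together with Lemma 6.6.6 we obtain
$$R_{\prec}^{\alpha}(X) = \overline{\mathrm{Acc}}_{\prec}(R_{\prec}^{<\alpha}(X)) = \overline{\mathrm{Acc}}_{\prec}(X \cup \overline{\mathrm{Acc}}_{\prec}^{<\alpha}(X)) = \overline{\mathrm{Acc}}_{\prec}^{\alpha}(X).$$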
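6.6.8 Lemma (Boundedness Lemma) Let ≺ be an arithmetically definable binary transitive relation and $\Delta(X, \vec{Y})$ a finite set of X-positive pseudo $\Pi^1_1$-sentences such that
$$\models^{\alpha} \neg(\forall x)\big[(\forall y \prec x)[y \,\varepsilon\, X] \to x \,\varepsilon\, X\big],\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}). \qquad \text{(i)}$$
Then
$$\mathbb{N} \models \bigvee \Delta[R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}] \qquad \text{(ii)}$$
holds true for all assignments $\vec{S}$ of sets to the variables in $\vec{Y}$.

Proof Let $\mathrm{Prog}(\prec, X) :\Leftrightarrow (\forall x)\big[(\forall y \prec x)[y \,\varepsilon\, X] \to x \,\varepsilon\, X\big]$ express the progressiveness of the set X relative to ≺. We show the lemma by induction on α. There are the following cases.

1) Hypothesis (i) holds by (Ax). Then there is a formula $(s \,\varepsilon\, X) \in \Delta(X, \vec{Y})$ with $s^{\mathbb{N}} = t_i^{\mathbb{N}}$ for some i ∈ {1, . . . , k}. But $t_i^{\mathbb{N}} \in R_{\prec}^{0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\})$, which implies $\mathbb{N} \models \bigvee \Delta[R_{\prec}^{0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$.

2) Hypothesis (i) holds by an inference whose critical formula F belongs to $\Delta(X, \vec{Y})$. If CS(F) = ∅ then F ∈ Diag(N), which implies $\mathbb{N} \models \bigvee \Delta[R_{\prec}^{0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$. Otherwise all $G(X, \vec{Y}) \in CS(F)$ are again X-positive and we have the premise(s)
$$\models^{\alpha_G} \neg\mathrm{Prog}(\prec, X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}),\; G(X, \vec{Y}) \qquad \text{(iii)}$$
for some (or all) $G(X, \vec{Y}) \in CS(F)$. If $\mathbb{N} \not\models \bigvee \Delta[R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$ then also $\mathbb{N} \not\models \bigvee \Delta[R_{\prec}^{\alpha_G}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$ for all $\alpha_G < \alpha$. Therefore we get $\mathbb{N} \models G[R_{\prec}^{\alpha_G}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$ for some (or all) G ∈ CS(F) from (iii) by the induction hypothesis. Since F is X-positive this implies
$$\mathbb{N} \models F[R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$$
and this in turn entails $\mathbb{N} \models \bigvee \Delta[R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$.

3) Hypothesis (i) holds by an inference $(\bigvee)$ with premise
$$\models^{\alpha_0} \neg\mathrm{Prog}(\prec, X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}),\; (\forall y)[y \prec t \to y \,\varepsilon\, X] \wedge t \,\not\varepsilon\, X. \qquad \text{(iv)}$$
By inversion we obtain from (iv)
$$\models^{\alpha_0} \neg\mathrm{Prog}(\prec, X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}),\; (\forall y)[y \prec t \to y \,\varepsilon\, X] \qquad \text{(v)}$$
and
$$\models^{\alpha_0} \neg\mathrm{Prog}(\prec, X),\; t_1 \,\not\varepsilon\, X, \dots, t_k \,\not\varepsilon\, X,\; t \,\not\varepsilon\, X,\; \Delta(X, \vec{Y}). \qquad \text{(vi)}$$
Assume $\mathbb{N} \not\models \bigvee \Delta[R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$. Since $\alpha_0 < \alpha$ this implies also $\mathbb{N} \not\models \bigvee \Delta[R_{\prec}^{\alpha_0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$, which by the induction hypothesis for (v) yields
$$\mathbb{N} \models (\forall y)[y \prec t^{\mathbb{N}} \Rightarrow y \in R_{\prec}^{\alpha_0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\})] \qquad \text{(vii)}$$
and therefore by Lemma 6.6.6 $t^{\mathbb{N}} \in R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\})$, which in turn implies
$$\{t^{\mathbb{N}}\} \subseteq R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}). \qquad \text{(viii)}$$
From the induction hypothesis applied to (vi) we obtain
$$\mathbb{N} \models \bigvee \Delta[R_{\prec}^{\alpha_0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}, t^{\mathbb{N}}\}), \vec{S}]. \qquad \text{(ix)}$$
By the assertion (6.4) (on p. 95) and (viii) we obtain $R_{\prec}^{\alpha_0}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}, t^{\mathbb{N}}\}) \subseteq R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\})$, which together with (ix) implies $\mathbb{N} \models \bigvee \Delta[R_{\prec}^{\alpha}(\{t_1^{\mathbb{N}}, \dots, t_k^{\mathbb{N}}\}), \vec{S}]$.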
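From the Boundedness Lemma together with (6.3) and Lemma 6.6.7 we obtain the Boundedness Theorem, which strengthens the Stage Theorem in case of accessible parts for transitive binary relations.

6.6.9 Theorem (Boundedness Theorem for accessible parts of transitive relations) Let ≺ be an arithmetically definable and transitive binary relation and $\Delta(X, \vec{Y})$ a finite set of X-positive pseudo-$\Pi^1_1$-sentences which contain at most the shown free set variables. Then
$$\models^{\alpha} \neg\mathrm{Prog}(\prec, X),\; \Delta(X, \vec{Y})$$
implies
$$\mathbb{N} \models (\forall \vec{Y})\Big[\bigvee \Delta[\mathrm{Acc}_{\prec}^{\alpha}, \vec{Y}]\Big].$$

Proof From $\models^{\alpha} \neg\mathrm{Prog}(\prec, X),\; \Delta(X, \vec{Y})$ we obtain by the Boundedness Lemma $\mathbb{N} \models (\forall \vec{Y})\big[\bigvee \Delta[R_{\prec}^{\alpha}(\emptyset), \vec{Y}]\big]$, which by Lemma 6.6.7 implies $\mathbb{N} \models (\forall \vec{Y})\big[\bigvee \Delta[\overline{\mathrm{Acc}}_{\prec}^{\alpha}(\emptyset), \vec{Y}]\big]$. By (6.3) we therefore obtain $\mathbb{N} \models (\forall \vec{Y})\big[\bigvee \Delta[\mathrm{Acc}_{\prec}^{\alpha}, \vec{Y}]\big]$.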
6.6.10 Remark We may even allow free set variables in the defining formula of the relation ≺. Therefore we can extend Lemma 6.6.8 and Theorem 6.6.9 to binary relations which are Σ11 -definable. (Cf. [8]). Observe that the transitivity of the order relation is crucial (cf. Exercise 6.7.6 below).
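6.6.11 Corollary Let ≺ be a transitive order relation. Then $\models^{\alpha} \neg\mathrm{Prog}(\prec, X),\; n \,\varepsilon\, X$ implies $\mathrm{otyp}_{\prec}(n) \le \alpha$.

Proof From
$$\models^{\alpha} \neg\mathrm{Prog}(\prec, X),\; n \,\varepsilon\, X \qquad \text{(i)}$$
we obtain by the Boundedness Theorem for accessible parts
$$\mathbb{N} \models n \,\varepsilon\, \mathrm{Acc}_{\prec}^{\alpha}. \qquad \text{(ii)}$$
By Lemma 6.5.2 this implies $\mathrm{otyp}_{\prec}(n) \le \alpha$.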
6.7 The $\Pi^1_1$-Ordinal of a Theory

We defined a first-order theory as a set of first-order sentences. Liberalizing that to a set of pseudo-$\Pi^1_1$-sentences still leaves us in the realm of first-order logic and therefore allows for complete formal deduction formalisms. As pointed out in Sect. 4.6, second-order theories cannot possess a complete formal deduction formalism. However, in Sect. 4.4 we have shown that there are sound calculi for second-order theories. To cover also the situation of second-order theories we introduce the notion of formal systems. A formal rule is a syntactically given figure of the form
$$(R) \qquad F_1, \dots, F_n \vdash F.$$
The formulas $F_1, \dots, F_n$ are the premises, the formula F is the conclusion of the rule (R). A formal system is a decidable set T of axioms together with a set R of formal rules. The set of formulas which are derivable in (T, R) is inductively defined by the following clauses.
• If F ∈ T then $(T, R) \vdash F$.
• If $(T, R) \vdash F_1, \dots, (T, R) \vdash F_n$ and $F_1, \dots, F_n \vdash F$ is an instantiation of a rule (R) in R then $(T, R) \vdash F$.

Examples for formal systems are the Tait calculi described in Sect. 4.3. Gödel's Completeness Theorem (Theorem 4.4.6), formulated in terms of general formal systems, then takes the form: For a decidable set of pseudo-$\Pi^1_1$-sentences T there is a formal system (T, R) such that $T \models_L F$ if and only if $(T, R) \vdash F$. As remarked above we have only the direction
$$(T, R) \vdash F \;\Rightarrow\; T \models_L F$$
for second-order logic. In first-order logic the set of consequences of a theory is independent of the special choice of the formalism. The set of consequences of a formal system in second-order logic, however, depends also on its set of rules.

For the rest of this section let us assume that the language of a theory or a formal system comprises the language of arithmetic, either directly or by interpretation. Let
$$TI(\prec, X) \;:\Leftrightarrow\; \mathrm{Prog}(\prec, X) \to (\forall x \,\varepsilon\, \mathrm{field}(\prec))[x \,\varepsilon\, X]$$
and
$$TI(\prec) \;:\Leftrightarrow\; (\forall X)\,TI(\prec, X)$$
express the fact that the relation ≺ is well-founded. Inspired by Gentzen's work we define the proof-theoretic ordinal of a formal system (T, R) to be the ordinal
$$\|(T, R)\| := \sup\{\mathrm{otyp}(\prec) \mid\; \prec \text{ is primitive recursive} \wedge (T, R) \vdash TI(\prec)\}.$$
Since the set of consequences of a first-order theory is independent of the choice of the formalism, i.e., of the set of rules, provided that the formalism is complete, we define the proof-theoretic ordinal of a first-order theory by
$$\|T\| := \sup\{\mathrm{otyp}(\prec) \mid\; \prec \text{ is primitive recursive} \wedge T \models_L TI(\prec, X)\}. \qquad (6.5)$$
Equation (6.5) therefore requires that T is able to prove (pseudo-)$\Pi^1_1$-sentences.¹ For theories T and formal systems (T, R) which prove (pseudo-)$\Pi^1_1$-sentences we define the $\Pi^1_1$-ordinal by
$$\|T\|_{\Pi^1_1} := \sup\{\mathrm{tc}(F) + 1 \mid F \text{ is a (pseudo) } \Pi^1_1\text{-sentence} \wedge T \models_L F\}$$
and
$$\|(T, R)\|_{\Pi^1_1} := \sup\{\mathrm{tc}(F) + 1 \mid F \text{ is a (pseudo) } \Pi^1_1\text{-sentence} \wedge (T, R) \vdash F\}.$$
A theory or a formal system is $\Pi^1_1$-sound if
$$T \vdash F \;\Rightarrow\; \mathbb{N} \models F \quad \text{or} \quad (T, R) \vdash F \;\Rightarrow\; \mathbb{N} \models F$$
for all (pseudo-)$\Pi^1_1$-sentences F. The next theorem is then an immediate consequence of the definition of the $\Pi^1_1$-ordinal.

6.7.1 Theorem Let T be a theory or a formal system. Then T is $\Pi^1_1$-sound if and only if $\|T\|_{\Pi^1_1} < \omega_1^{CK}$.

We modify the Boundedness Theorem (Theorem 6.6.9) for accessible parts of primitive recursive transitive relations.

6.7.2 Theorem (Boundedness Theorem) Let ≺ be a primitive recursive, binary, transitive and well-founded relation. If otyp(≺) ∈ Lim then $\mathrm{otyp}(\prec) = \mathrm{tc}(TI(\prec))$.

¹ In Sect. 10 we discuss the situation for a theory which only proves arithmetical sentences.
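Proof Let $\alpha := \mathrm{tc}(TI(\prec))$. For n ∈ field(≺) let
$$TI(\prec, n) \;:\Leftrightarrow\; \neg\mathrm{Prog}(\prec, X) \vee n \,\varepsilon\, X$$
and $\alpha_n := \mathrm{tc}(TI(\prec, n))$. Then we have
$$\models^{\alpha_n} \neg\mathrm{Prog}(\prec, X),\; n \,\varepsilon\, X \qquad \text{(i)}$$
for all n ∈ field(≺), which by Corollary 6.6.11 entails $\mathrm{otyp}_{\prec}(n) \le \alpha_n \le \alpha$ for all n ∈ field(≺). Hence otyp(≺) ≤ α + 1 and thus otyp(≺) ≤ α since otyp(≺) ∈ Lim.

For s ∈ field(≺) we obtain by induction on $\mathrm{otyp}_{\prec}(s)$
$$\models^{5\cdot(\mathrm{otyp}_{\prec}(s)+1)} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; s \,\varepsilon\, X. \qquad (6.6)$$
For t ≺ s we have
$$\models^{5\cdot(\mathrm{otyp}_{\prec}(t)+1)} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; t \,\varepsilon\, X \qquad \text{(ii)}$$
by the induction hypothesis. If $t \not\prec s$ then (¬ t ≺ s) ∈ Diag(N) and we obtain
$$\models^{0} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; \neg\, t \prec s \qquad \text{(iii)}$$
by an inference $(\bigwedge)$ with empty premise. Hence
$$\models^{5\cdot(\mathrm{otyp}_{\prec}(t)+1)} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; \neg\, t \prec s,\; t \,\varepsilon\, X \qquad \text{(iv)}$$
for all t. From (iv) we get
$$\models^{5\cdot\mathrm{otyp}_{\prec}(s)+3} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; (\forall y)[\neg\, y \prec s \vee y \,\varepsilon\, X] \qquad \text{(v)}$$
by two inferences $(\bigvee)$ followed by an inference according to $(\bigwedge)$. From the axiom
$$\models^{0} s \,\not\varepsilon\, X,\; s \,\varepsilon\, X \qquad \text{(vi)}$$
and (v) we obtain by the structural rule (STR)² together with an inference $(\bigwedge)$
$$\models^{5\cdot\mathrm{otyp}_{\prec}(s)+4} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; (\forall y)[\neg\, y \prec s \vee y \,\varepsilon\, X] \wedge s \,\not\varepsilon\, X,\; s \,\varepsilon\, X. \qquad \text{(vii)}$$
From (vii), however, we obtain
$$\models^{5\cdot\mathrm{otyp}_{\prec}(s)+5} \neg(\forall x)[(\forall y \prec x)(y \,\varepsilon\, X) \to x \,\varepsilon\, X],\; s \,\varepsilon\, X$$
by an inference $(\bigvee)$. This finishes the proof of (6.6). But (6.6) implies
$$\models^{5\cdot(\mathrm{otyp}_{\prec}(s)+1)} \neg\mathrm{Prog}(\prec, X),\; s \,\not\varepsilon\, \mathrm{field}(\prec),\; s \,\varepsilon\, X$$
for all terms s, and since otyp(≺) ∈ Lim we get $5\cdot(\mathrm{otyp}_{\prec}(s) + 1) + 2 < \mathrm{otyp}(\prec)$ for all s ∈ field(≺) and thus
$$\models^{\mathrm{otyp}(\prec)} \neg\mathrm{Prog}(\prec, X),\; (\forall x \,\varepsilon\, \mathrm{field}(\prec))[x \,\varepsilon\, X].$$
Hence $\mathrm{tc}(TI(\prec)) \le \mathrm{otyp}(\prec)$.

² In fact we needed the structural rule also in the inferences before. Applications of the structural rule (STR) remain mostly unmentioned.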
The following theorem is an immediate consequence of the Boundedness Theorem.

6.7.3 Theorem Let T be a theory or a formal system. Then $\|T\| \le \|T\|_{\Pi^1_1}$.

The objective of an ordinal analysis for a theory is the computation of its proof-theoretic ordinal. As we will see in the remainder of this book this is commonly done in two steps. First we calculate an upper bound $\beta \ge \|T\|_{\Pi^1_1}$ and in the second step show that for all α < β there is a primitive recursive ordering ≺ such that $T \vdash TI(\prec)$ and otyp(≺) = α. This implies $\beta \le \|T\| \le \|T\|_{\Pi^1_1} \le \beta$ and therefore $\|T\| = \|T\|_{\Pi^1_1}$.

There is, however, also an abstract argument showing that $\|T\|_{\Pi^1_1}$ and $\|T\|$ coincide for certain theories. We formulate the theorem but give only a sketch of its proof because we will never apply the theorem.

6.7.4 Theorem Let T be a theory which is strong enough to prove the ω-Completeness Theorem. Then $\|T\| = \|T\|_{\Pi^1_1}$.

Proof (Sketch) We assume that T (or a conservative extension of T in which we add comprehension for elementary formulas) is strong enough³ to prove that a pseudo-$\Pi^1_1$-sentence F is equivalent to the well-foundedness of the search tree for F, i.e.,
$$T \vdash F(X) \leftrightarrow TI(\prec_F) \qquad \text{(i)}$$
where $\prec_F$ is the relation $s \prec_F t :\Leftrightarrow s \,\varepsilon\, S_F \wedge t \subsetneq s$. The relation $\prec_F$ is obviously primitive recursive and for $s \in S_F$ we have $\mathrm{otyp}_{S_F}(s) = \mathrm{otyp}_{\prec_F}(s)$. If $\alpha < \|T\|_{\Pi^1_1}$ then there is a pseudo-$\Pi^1_1$-sentence F such that $T \vdash F$ and $\alpha \le \mathrm{tc}(F)$. But then $T \vdash TI(\prec_F)$, which shows that $\alpha \le \mathrm{tc}(F) \le \mathrm{otyp}(\prec_F) < \|T\|$. So we have $\|T\|_{\Pi^1_1} \le \|T\|$, which together with Theorem 6.7.3 yields the claim.

The next theorem is accredited to Kreisel. It limits the importance of the $\Pi^1_1$-ordinal of a theory. But as we will see in the course of the book, the real importance of the $\Pi^1_1$-ordinal rests in its computation and not so much in the bare knowledge of its size. Nevertheless the proof-theoretic ordinal of a theory T has become a very common measure for its proof-theoretic strength.

³ All we need to prove the ω-Completeness Theorem (Theorem 5.4.9) is König's Lemma. According to [94] the theory (ACA)0 of arithmetical comprehension suffices. This theory is a conservative second-order extension of PA.
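6.7.5 Theorem Let T be a theory or a formal system for which we have $\|T\| = \|T\|_{\Pi^1_1}$ and F a $\Sigma^1_1$-sentence which is true in the standard structure N. Then $\|T\| = \|T + F\|$.

Proof Assume
$$T + F \models_L TI(\prec, X) \qquad \text{(i)}$$
for a primitive recursive order relation ≺. Then we obtain
$$T \models_L \neg F \vee TI(\prec, X) \qquad \text{(ii)}$$
which implies
$$\models^{\alpha} \neg F,\; \neg\mathrm{Prog}(\prec, X),\; (\forall x \,\varepsilon\, \mathrm{field}(\prec))[x \,\varepsilon\, X] \qquad \text{(iii)}$$
for some α < ||T||. By the Boundedness Theorem for accessible parts (Theorem 6.6.9) we then obtain
$$\mathbb{N} \models \neg F \vee (\forall x \in \mathrm{field}(\prec))[x \,\varepsilon\, \mathrm{Acc}_{\prec}^{\alpha}]. \qquad \text{(iv)}$$
Since N |= F this entails $(\forall x \in \mathrm{field}(\prec))[x \,\varepsilon\, \mathrm{Acc}_{\prec}^{\alpha}]$, which means otyp(≺) ≤ α. Hence ||T + F|| ≤ ||T||. The opposite inequality holds trivially.

6.7.6 Exercise (Beckmann) Let ≺ ⊆ N × N be defined by i ≺ k :⇔ k = i + 1.
(a) Compute $\mathrm{otyp}_{\prec}(n)$ for n ∈ N.
(b) Show $\models^{4\cdot n} \neg\mathrm{Prog}(\prec, X),\; 2^n \,\varepsilon\, X$.
Hint: Prove
$$\models^{4\cdot k} \neg\mathrm{Prog}(\prec, X),\; n \,\not\varepsilon\, X,\; n + 2^k - 1 \,\varepsilon\, X$$
for all n by induction on k. The base case is trivial. For the induction step you have the induction hypotheses
$$\models^{4\cdot k} \neg\mathrm{Prog}(\prec, X),\; n \,\not\varepsilon\, X,\; n + 2^k - 1 \,\varepsilon\, X$$
and
$$\models^{4\cdot k} \neg\mathrm{Prog}(\prec, X),\; n + 2^k \,\not\varepsilon\, X,\; n + 2^{k+1} - 1 \,\varepsilon\, X.$$
Combine these hypotheses to get the claim.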
Observe that Exercise 6.7.6 shows that the hypothesis of transitiveness in the Boundedness Lemma (Lemma 6.6.8), in the Boundedness Theorem (Theorem 6.6.9) and Theorem 6.7.2 is indispensable.
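The arithmetic behind this observation can be made concrete. The following sketch assumes the relation of Exercise 6.7.6 and takes the derivation bound $4 \cdot n$ from part (b) as given; the code does not verify that bound. The helper `otyp` is a hypothetical name and relies on the assumption that i ≺ k implies i < k, so that order-types can be computed bottom-up.

```python
from typing import Callable, Dict, List


def otyp(n: int, precedes: Callable[[int, int], bool]) -> int:
    """Order-type of n, computed by otyp(k) = sup{otyp(i) + 1 | i ≺ k}.
    Assumes that i ≺ k implies i < k."""
    ranks: Dict[int, int] = {}
    for k in range(n + 1):
        ranks[k] = max((ranks[i] + 1 for i in range(k) if precedes(i, k)),
                       default=0)
    return ranks[n]


# The non-transitive relation of Exercise 6.7.6: i ≺ k iff k = i + 1.
def successor(i: int, k: int) -> bool:
    return k == i + 1


# Exponents n for which otyp(2^n) exceeds the bound 4 * n from part (b).
violations: List[int] = [n for n in range(1, 8)
                         if otyp(2 ** n, successor) > 4 * n]
print(violations)                            # prints: [5, 6, 7]
```

For these exponents the conclusion $\mathrm{otyp}_{\prec}(n) \le \alpha$ of Corollary 6.6.11 would fail if the corollary were applied to this non-transitive relation.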
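6.7.7 Exercise Show that the boundedness property of generalized recursion theory follows from the Stage Theorem (Theorem 6.6.1), i.e., let
$$\mathrm{Rel}(X) \;:\Leftrightarrow\; (\forall x \,\varepsilon\, X)[\mathrm{Seq}(x) \wedge \mathrm{lh}(x) = 2]$$
denote that X codes a binary relation and
$$\mathrm{Wf}(X) \;:\Leftrightarrow\; \mathrm{Rel}(X) \wedge (\forall Y)\Big[Y \subseteq \mathrm{field}(X) \wedge (\exists x)(x \,\varepsilon\, Y) \to (\exists x)\big[x \,\varepsilon\, Y \wedge (\forall y)[\langle y, x\rangle \,\varepsilon\, X \to y \,\not\varepsilon\, Y]\big]\Big]$$
denote that the relation coded by X is well-founded.
(a) Assume that F(X) is a $\Sigma^1_1$-formula such that (∀X)[F(X) → Wf(X)]. Show $\sup\{\mathrm{otyp}(X) \mid F(X)\} < \omega_1^{CK}$ using the Stage Theorem.
(b) Let ≺ be a $\Sigma^1_1$-definable binary relation. Show that the order-type of ≺ is less than $\omega_1^{CK}$ provided that ≺ is well-founded. Hint: Cf. [8].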
Chapter 7
The Ordinal Analysis for PA
This chapter repeats Gentzen's ordinal analysis for arithmetic as a paradigmatic example for an ordinal analysis.

7.1 The Theory PA

The most familiar axiom system for number theory is the system PA, which is commonly accredited to Giuseppe Peano [67]. It is likely that Peano was inspired by Dedekind's work on natural numbers as presented in his article "Was sind und was sollen die natürlichen Zahlen" [20]. The language L(PA) of Peano Arithmetic comprises the constants 0 and 1 for the natural numbers 0 and 1 and the binary function symbols + for addition and · for multiplication. It is formulated in a first-order logic with identity whose nonlogical axioms are:

(NF) (∀x)[x + 1 ≠ 0]
(INJ) (∀x)(∀y)[x + 1 = y + 1 → x = y]
(PL0) (∀x)[x + 0 = x]
(PL1) (∀x)(∀y)[x + (y + 1) = (x + y) + 1]
(MU0) (∀x)[x·0 = 0]
(MU1) (∀x)(∀y)[x·(y + 1) = (x·y) + x]
(IND) F(0) ∧ (∀x)[F(x) → F(x + 1)] → (∀x)F(x)
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, c Springer-Verlag Berlin Heidelberg 2009
Here we have two groups of axioms. The first group comprises the defining axioms for the nonlogical symbols 0, 1, + and ·, the second group consists of the scheme (IND), which tries to fix the ontology of our structure N. It follows from (IND) that every natural number is either 0 or the successor of some other natural number. To see this, consider the formula F(u) :⇔ u = 0 ∨ (∃y)[u = y + 1]. Then we have F(0) and obtain F(u + 1) with u as a witness for y. Therefore we obtain (∀x)F(x) by (IND), which says that every member of any model M of PA is either $0^{M}$ or $(k + 1)^{M}$ for some k ∈ M.
7.1.1 Exercise Replace the scheme (IND) by the second-order axiom
\[ (\mathrm{Ind})^2 \quad (\forall X)\bigl[0 \mathrel{\varepsilon} X \wedge (\forall x)[x \mathrel{\varepsilon} X \rightarrow x + 1 \mathrel{\varepsilon} X] \rightarrow (\forall x)[x \mathrel{\varepsilon} X]\bigr] \]
and show that the axiom system $\mathsf{PA}^2$ so obtained is categorical for N, i.e., all models of $\mathsf{PA}^2$ are isomorphic to the standard structure N of natural numbers.
Hint: Show that for any second-order model $\mathcal{N} \models \mathsf{PA}^2$ the mapping $n \mapsto n^{\mathcal{N}}$ is an isomorphism.
7.2 The Theory NT
Instead of analyzing the axioms of PA we do that for a richer language that has constants for all primitive recursive functions. The language L(NT) is a first-order language which contains set parameters denoted by capital Latin letters X, Y, Z, X1, . . . and constants for 0 and all primitive recursive functions. We assume that the symbols for primitive recursive functions are built up from the symbols S for the successor function, $C^n_k$ for the constant function, $P^n_k$ for the projection on the k-th component by the substitution operator Sub and the recursion operator Rec (cf. Chap. 2). If F(x1, . . . , xn) is a formula that contains only the shown free variables we call the sentence (∀x1) . . . (∀xn)F(x1, . . . , xn) the universal closure of the formula F(x1, . . . , xn). The theory NT comprises the universal closures of the following formulas:
The successor axioms
(∀x)[¬0 = Sx]
(∀x)(∀y)[S(x) = S(y) ⇒ x = y].
The defining axioms for the function symbols are the universal closures of the following formulas:
$C^n_k(x_1, \dots, x_n) = k$
$P^n_k(x_1, \dots, x_n) = x_k$
$\mathrm{Sub}(g, h_1, \dots, h_m)(x_1, \dots, x_n) = g(h_1(x_1, \dots, x_n), \dots, h_m(x_1, \dots, x_n))$
$\mathrm{Rec}(g, h)(0, x_1, \dots, x_n) = g(x_1, \dots, x_n)$
$\mathrm{Rec}(g, h)(Sy, x_1, \dots, x_n) = h(y, \mathrm{Rec}(g, h)(y, x_1, \dots, x_n), x_1, \dots, x_n)$.
The scheme of Mathematical Induction
F(0) ∧ (∀x)[F(x) → F(S(x))] → (∀x)F(x)
for all L(NT)-formulas F(u).
The identity axioms IDEN
(∀x)[x = x]
(∀x)(∀y)[x = y → y = x]
(∀x)(∀y)(∀z)[x = y ∧ y = z → x = z]
$(\forall \vec{x})(\forall \vec{y})[x_1 = y_1 \wedge \dots \wedge x_n = y_n \rightarrow f\vec{x} = f\vec{y}]$
$(\forall \vec{x})(\forall \vec{y})[x_1 = y_1 \wedge \dots \wedge x_n = y_n \rightarrow (F_{v_1,\dots,v_n}(\vec{x}) \rightarrow F_{v_1,\dots,v_n}(\vec{y}))]$
where F(v1, . . . , vn) is an arbitrary formula in the language of NT and $\vec{x}$ and $\vec{y}$ stand for x1, . . . , xn and y1, . . . , yn, respectively.
The theory NT is an extension of the theory PA by definitions. That means that for every formula F in the language of NT there is a formula $F'$ in the language of PA such that $\mathsf{NT} \vdash F \leftrightarrow F'$. On the other hand, every formula in the language of PA which is provable in NT is already provable in PA. We will also freely use symbols for primitive recursive predicates in the language of NT although they are not among the basic symbols. This is possible since for every primitive recursive predicate R we have its characteristic function $\chi_R$ among the basic symbols and use (Rt1, . . . , tn) as an abbreviation for $\chi_R(t_1, \dots, t_n) = 1$. Recall the definition of $\mathsf{NT} \vdash F$ in (4.4). The following theorem restates Theorem 4.4.6.
7.2.1 Theorem A formula F of the language L(NT) is provable in NT iff there are finitely many axioms A1, . . . , An in NT and an m < ω such that $\vdash^{m}_{\mathsf{T}} \neg A_1, \dots, \neg A_n, F$.
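The defining equations for the function symbols can be read as an evaluation procedure. The following is a minimal sketch under that reading, not part of the theory NT itself: the Python names `S`, `C`, `P`, `Sub`, `Rec`, `add` and `mul` are hypothetical helpers chosen to mirror the symbols above, and function terms are modelled as Python callables on natural numbers.

```python
# Illustrative sketch only: primitive recursive function terms as Python callables.
# All names below are hypothetical helpers mirroring S, C^n_k, P^n_k, Sub and Rec.

def S(x):
    """Successor function."""
    return x + 1

def C(n, k):
    """Constant function C^n_k: n arguments, value k."""
    def const(*xs):
        assert len(xs) == n
        return k
    return const

def P(n, k):
    """Projection P^n_k on the k-th of n arguments (1-based)."""
    def proj(*xs):
        assert len(xs) == n and 1 <= k <= n
        return xs[k - 1]
    return proj

def Sub(g, *hs):
    """Substitution: Sub(g, h1..hm)(xs) = g(h1(xs), ..., hm(xs))."""
    def composed(*xs):
        return g(*(h(*xs) for h in hs))
    return composed

def Rec(g, h):
    """Recursion: Rec(g,h)(0, xs) = g(xs); Rec(g,h)(Sy, xs) = h(y, Rec(g,h)(y, xs), xs)."""
    def rec(y, *xs):
        value = g(*xs)              # value at recursion argument 0
        for i in range(y):          # unfold the second equation y times
            value = h(i, value, *xs)
        return value
    return rec

# Addition by recursion on the first argument: add(0, x) = x, add(Sy, x) = S(add(y, x)).
add = Rec(P(1, 1), Sub(S, P(3, 2)))
# Multiplication: mul(0, x) = 0, mul(Sy, x) = add(mul(y, x), x).
mul = Rec(C(1, 0), Sub(add, P(3, 2), P(3, 3)))
```

For instance, `add(3, 4)` unfolds the recursion equation three times starting from `P(1, 1)(4)`, and `mul` reuses `add` through `Sub`, just as compound function symbols of L(NT) are built from simpler ones.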
7.2.2 Exercise Let $(\Delta^1_0\text{–CA})$ be the second-order theory in the language L(NT) whose axioms comprise all the axioms of NT together with the axiom-scheme
\[ (\Delta^1_0\text{–CA}) \quad (\exists X)(\forall x)[x \mathrel{\varepsilon} X \leftrightarrow F(x)] \text{ for } \Delta^1_0\text{-formulas } F(x) \]
of arithmetical comprehension. Let $(\Delta^1_0\text{–CA})_0$ denote the theory $(\Delta^1_0\text{–CA})$ in which the scheme (IND) of Mathematical Induction is replaced by the axiom $(\mathrm{Ind})^2$ (cf. Exercise 7.1.1).
(a) Show that $(\Delta^1_0\text{–CA})_0$ is a conservative extension of NT.
(b) Is this also true for $(\Delta^1_0\text{–CA})$?
Hint: For (a) show that every model of NT is extendable to a model of $(\Delta^1_0\text{–CA})_0$. To answer (b) solve Exercise 7.4.8.
7.2.3 Remark The second-order theory in the language L(PA) that is based on the axioms of PA rather than those of NT, together with the scheme $(\Delta^1_0\text{–CA})$ of arithmetical comprehension and with the scheme (IND) replaced by the single second-order axiom $(\mathrm{Ind})^2$, is known as $(\mathrm{ACA})_0$. Obviously $(\Delta^1_0\text{–CA})_0$ and $(\mathrm{ACA})_0$ have the same proof-theoretical strength. $(\mathrm{ACA})_0$ plays a prominent role in the program of reverse mathematics (cf. [94]).
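7.3 The Upper Bound
We start this section with a simple observation. We call two pseudo-$\Pi^1_1$-sentences numerically equivalent if they differ only in number terms that yield the same values. It is plain that numerically equivalent formulas have the same rank.
7.3.1 Lemma Let F and G be numerically equivalent formulas. Then $\models^{\alpha} \Delta, F$ implies $\models^{\alpha} \Delta, G$.
Proof We prove
\[ s^{\mathbb{N}} = t^{\mathbb{N}} \text{ and } \models^{\alpha} \Delta(s) \ \Rightarrow\ \models^{\alpha} \Delta(t) \tag{7.1} \]
by induction on α. The lemma is an obvious consequence of (7.1). If
\[ \models^{\alpha} \Delta(s) \tag{i} \]
holds by (Ax) then $\{t_1 \mathrel{\varepsilon} X, t_2 \mathrel{\not\varepsilon} X\} \subseteq \Delta(s)$ such that $t_1^{\mathbb{N}} = t_2^{\mathbb{N}}$. If s is different from $t_i$ for i = 1, 2 we also have $\{t_1 \mathrel{\varepsilon} X, t_2 \mathrel{\not\varepsilon} X\} \subseteq \Delta(t)$. If s is $t_i$ for some i ∈ {1, 2}, say s is $t_1$, then $\{t \mathrel{\varepsilon} X, t_2 \mathrel{\not\varepsilon} X\} \subseteq \Delta(t)$ and $t^{\mathbb{N}} = s^{\mathbb{N}} = t_2^{\mathbb{N}}$. Hence $\models^{\alpha} \Delta(t)$ by (Ax).
If (i) holds by an instance of $(\bigvee)$ there is a formula $F(s) \in \Delta(s) \cap \bigvee\text{-type}$ and $\models^{\alpha_0} \Delta(s), A(s)$ for some $A(s) \in \mathrm{CS}(F(s))$. But then $A(t) \in \mathrm{CS}(F(t))$ and we have $\models^{\alpha_0} \Delta(t), A(t)$ by the induction hypothesis. By an inference $(\bigvee)$ we obtain the claim.
If (i) holds by an instance of $(\bigwedge)$ there is a formula $F(s) \in \Delta(s) \cap \bigwedge\text{-type}$ which is critical for this inference. If $\mathrm{CS}(F(s)) = \emptyset$ then $F(s) \in \mathrm{Diag}(\mathbb{N})$. But $t^{\mathbb{N}} = s^{\mathbb{N}}$ implies also $F(t) \in \mathrm{Diag}(\mathbb{N})$ and we obtain $\models^{\alpha} \Delta(t)$ by an inference $(\bigwedge)$. If $\mathrm{CS}(F(s)) \neq \emptyset$ then we have $\models^{\alpha_G} \Delta(s), G(s)$ for all $G(s) \in \mathrm{CS}(F(s))$, which entails $\models^{\alpha_G} \Delta(t), G(t)$ by the induction hypothesis. But $\mathrm{CS}(F(t)) = \{G(t) \mid G(s) \in \mathrm{CS}(F(s))\}$ and we obtain $\models^{\alpha} \Delta(t), F(t)$ by an inference $(\bigwedge)$. This ends the proof of (7.1).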
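Next we show that any derivation in first-order logic can be translated into a derivation in the infinitary verification calculus.
7.3.2 Theorem Let $\Delta(\vec{u})$ be a finite set of formulas in the language of arithmetic with all free number variables shown. Then $\vdash^{m}_{\mathsf{T}} \Delta(\vec{u})$ implies $\models^{m} \Delta(\vec{n})$ for all tuples $\vec{n}$ of numerals.
Proof We show the theorem by induction on m. Assume first that
\[ \vdash^{m}_{\mathsf{T}} \Delta(\vec{u}) \tag{i} \]
holds by (AxL). Then there is an atomic formula $A(\vec{u})$ such that $\{A(\vec{u}), \neg A(\vec{u})\} \subseteq \Delta(\vec{u})$. If $A(\vec{u})$ does not contain a free set variable we have $A(\vec{n}) \in \mathrm{Diag}(\mathbb{N})$ or $\neg A(\vec{n}) \in \mathrm{Diag}(\mathbb{N})$ and obtain $\models^{m} \Delta(\vec{n})$ by an inference $(\bigwedge)$ with empty premise. If $A(\vec{u})$ is a formula $(t(\vec{u}) \mathrel{\varepsilon} X)$ then $\{t(\vec{n}) \mathrel{\varepsilon} X, t(\vec{n}) \mathrel{\not\varepsilon} X\} \subseteq \Delta(\vec{n})$ and we obtain $\models^{m} \Delta(\vec{n})$ by (Ax).
Now assume that (i) holds by an inference (∨) or (∃). Let $F(\vec{u})$ be the critical formula. Then F is a formula $(A_1(\vec{u}) \vee A_2(\vec{u}))$ or $(\exists y)A(y, \vec{u})$. We have the premise
\[ \vdash^{m_0}_{\mathsf{T}} \Delta(\vec{u}), A_i(\vec{u}) \tag{ii} \]
or
\[ \vdash^{m_0}_{\mathsf{T}} \Delta(\vec{u}), A(t(\vec{u}, \vec{v}), \vec{u}) \tag{iii} \]
for some term $t(\vec{u}, \vec{v})$. Defining G to be the formula $A_i(\vec{n})$ or the formula $A(t(\vec{n}, \vec{0}), \vec{n})$ we get
\[ \models^{m_0} \Delta(\vec{n}), G \tag{iv} \]
by the induction hypothesis. Since $G \in \mathrm{CS}((A_1(\vec{n}) \vee A_2(\vec{n})))$ in the first case and $G \in \mathrm{CS}((\exists y)A(y, \vec{n}))$ in the second case, we obtain $\models^{m} \Delta(\vec{n})$ by an inference $(\bigvee)$.
Finally, if (i) is obtained by an inference (∧) whose critical formula is $(A_0(\vec{u}) \wedge A_1(\vec{u}))$ or by an inference (∀) with critical formula $(\forall y)A(y, \vec{u})$, we have the premise(s)
\[ \vdash^{m_0}_{\mathsf{T}} \Delta(\vec{u}), A_i(\vec{u}) \text{ for } i \in \{0, 1\} \tag{v} \]
or
\[ \vdash^{m_0}_{\mathsf{T}} \Delta(\vec{u}), A(v, \vec{u}) \tag{vi} \]
for a variable v that is different from all variables in the list $\vec{u}$. Let $G_i$ be the formula $A_i(\vec{n})$ for i = 0, 1 or the formula $A(i, \vec{n})$ for i ∈ ω. Abbreviating the critical formula by $F(\vec{u})$ we get $\mathrm{CS}(F(\vec{n})) = \{G_0, G_1\}$ or $\mathrm{CS}(F(\vec{n})) = \{G_i \mid i \in \omega\}$ and we have
\[ \models^{m_0} \Delta(\vec{n}), G_i \tag{vii} \]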
for all $G_i \in \mathrm{CS}(F(\vec{n}))$ by the induction hypothesis. We obtain the claim from (vii) by an inference $(\bigwedge)$.
It follows from Theorems 7.2.1 and 7.3.2 that for any provable pseudo-$\Pi^1_1$-sentence F of NT there are finitely many axioms A1, . . . , An ∈ NT such that
\[ \models^{m} \neg A_1, \dots, \neg A_n, F. \tag{7.2} \]
To determine the $\Pi^1_1$-ordinal of NT we have to compute tc(F). Our strategy will be the following. First we compute an upper bound, say α, for the truth complexities of all axioms in NT. This gives
\[ \models^{\alpha} A_i \tag{7.3} \]
for all axioms $A_i$. Then we extend the verification calculus to an infinitary calculus with cut and use the cut rule to get rid of all the axioms. In the next step, we have to develop a procedure to eliminate cuts, which allows us to keep control over the length of the infinite derivations. The depth of the resulting cut-free derivation then provides us with an upper bound for the truth complexity of F.
We start with the computation of the truth complexities of the axioms of NT. All numerical instances of the defining axioms for primitive recursive functions belong to the diagram Diag(N). Therefore we obtain their universal closure by a finite number of applications of the $\bigwedge$-rule. The same is true for all identity axioms except the last one. So far we have seen that all mathematical and identity axioms (but the last) of NT have truth complexities below ω. But we have not yet analyzed the last identity axiom and the induction scheme. Here we need a preparatory lemma.
7.3.3 Lemma (Tautology Lemma) Let F and $F'$ be numerically equivalent L(NT)-formulas. Then
\[ \models^{2\cdot \mathrm{rnk}(F)} \Delta, \neg F', F. \]
Proof Induction on rnk(F). If F is an atomic formula s = t then $F'$ is a formula $s' = t'$ such that $s^{\mathbb{N}} = s'^{\mathbb{N}}$ and $t^{\mathbb{N}} = t'^{\mathbb{N}}$. We either have $(s = t) \in \bigwedge\text{-type}$, hence $(s' = t') \in \bigwedge\text{-type}$, or $(s \neq t) \in \bigwedge\text{-type}$, hence $(s' \neq t') \in \bigwedge\text{-type}$, and obtain $\models^{0} \Delta, s' \neq t', s = t$ by an inference $(\bigwedge)$ with empty premise. If F is a formula $t \mathrel{\varepsilon} X$ then $F'$ is a formula $t' \mathrel{\varepsilon} X$ with $t^{\mathbb{N}} = t'^{\mathbb{N}}$. Then we obtain $\models^{0} \Delta, t \mathrel{\varepsilon} X, t' \mathrel{\not\varepsilon} X$ by (Ax). If $F \in \bigwedge\text{-type}$ and $\mathrm{CS}(F) \neq \emptyset$ we obtain by the induction hypothesis
\[ \models^{2\cdot \mathrm{rnk}(G)} \Delta, G, \neg G', F, \neg F' \tag{i} \]
for all $G \in \mathrm{CS}(F)$. Because of $\mathrm{CS}(\neg F') = \{\neg G' \mid G \in \mathrm{CS}(F)\}$ we obtain from (i)
\[ \models^{2\cdot \mathrm{rnk}(G)+1} \Delta, G, F, \neg F' \tag{ii} \]
for all $G \in \mathrm{CS}(F)$ by an inference $(\bigvee)$. Since $2\cdot \mathrm{rnk}(G) + 1 < 2\cdot \mathrm{rnk}(F)$ we finally obtain
\[ \models^{2\cdot \mathrm{rnk}(F)} \Delta, F, \neg F' \]
from (ii) by an inference $(\bigwedge)$. The remaining cases are completely symmetrical.
If F(v1, . . . , vn) is a formula of L(NT) and $k_1, \dots, k_n$ and $l_1, \dots, l_n$ are n-tuples of numerals we get
\[ \models^{2\cdot \mathrm{rnk}(F)} k_1 \neq l_1, \dots, k_n \neq l_n, \neg F_{v_1,\dots,v_n}(k_1, \dots, k_n), F_{v_1,\dots,v_n}(l_1, \dots, l_n) \]
either by an inference $(\bigwedge)$ with empty premise or by the Tautology Lemma. Abbreviating x1, . . . , xn by $\vec{x}$ and y1, . . . , yn by $\vec{y}$ we, therefore, obtain
\[ \models^{k} (\forall \vec{x})(\forall \vec{y})[x_1 = y_1 \wedge \dots \wedge x_n = y_n \rightarrow (F_{v_1,\dots,v_n}(\vec{x}) \rightarrow F_{v_1,\dots,v_n}(\vec{y}))] \]
for some k < ω.
7.3.4 Lemma (Induction Lemma) For any natural number n and any L(NT)-sentence F(n) we have
\[ \models^{2\cdot[\mathrm{rnk}(F(n))+n]} \neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], F(n). \]
Proof We induct on n. For n = 0 this is an instance of the Tautology Lemma. For the induction step we have
\[ \models^{2\cdot[\mathrm{rnk}(F(n))+n]} \neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], F(n) \tag{i} \]
by the induction hypothesis and obtain
\[ \models^{2\cdot \mathrm{rnk}(F(n))} \neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], \neg F(S(n)), F(S(n)) \tag{ii} \]
by the Tautology Lemma. From (i) and (ii) we get by $(\bigwedge)$
\[ \models^{2\cdot[\mathrm{rnk}(F(n))+n]+1} \neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], F(n) \wedge \neg F(S(n)), F(S(n)). \tag{iii} \]
By a clause $(\bigvee)$ we get from (iii)
\[ \models^{2\cdot[\mathrm{rnk}(F(n))+n]+2} \neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], F(S(n)). \]
Since the term S(n) and the numeral Sn have the same value, $S(n)^{\mathbb{N}} = (Sn)^{\mathbb{N}}$, this implies by Lemma 7.3.1
\[ \models^{2\cdot[\mathrm{rnk}(F(n))+n]+2} \neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], F(Sn). \]
If G is an instance F(0) ∧ (∀x)[F(x) → F(S(x))] → (∀x)F(x) of Mathematical Induction then $G^{\nabla}$ is the finite set $\{\neg F(0), \neg(\forall x)[F(x) \rightarrow F(S(x))], (\forall x)F(x)\}$ and we get tc(G) ≤ ω by Lemma 7.3.4. Thus, together with our previous remarks, we have
\[ \models^{\omega} A_i, \text{ i.e., } \mathrm{tc}(A_i) \leq \omega \tag{7.4} \]
for all identity and nonlogical axioms $A_i$ of NT.
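So far we have computed ω as an upper bound for the ordinal α in (7.3). To bridge (7.2) and (7.3) we extend the verification calculus $\models^{\alpha} \Delta$ to an infinitary calculus $\vdash^{\alpha}_{\rho} \Delta$, which we are going to call a semi-formal system.¹
7.3.5 Definition For a finite set ∆ of pseudo-$\Pi^1_1$-sentences we define the semi-formal derivability relation $\vdash^{\alpha}_{\rho} \Delta$ inductively by the following clauses:
(Ax) If $s^{\mathbb{N}} = t^{\mathbb{N}}$ then $\vdash^{\alpha}_{\rho} \Delta, s \mathrel{\varepsilon} X, t \mathrel{\not\varepsilon} X$ for all ordinals α and ρ.
($\bigwedge$) If $F \in \Delta \cap \bigwedge\text{-type}$ and $(\forall G \in \mathrm{CS}(F))(\exists \alpha_G < \alpha)\bigl[\vdash^{\alpha_G}_{\rho} \Delta, G\bigr]$ then $\vdash^{\alpha}_{\rho} \Delta$.
($\bigvee$) If $F \in \Delta \cap \bigvee\text{-type}$ and $(\exists G \in \mathrm{CS}(F))(\exists \alpha_G < \alpha)\bigl[\vdash^{\alpha_G}_{\rho} \Delta, G\bigr]$ then $\vdash^{\alpha}_{\rho} \Delta$.
(cut) If $\vdash^{\alpha_0}_{\rho} \Delta, F$, $\vdash^{\alpha_0}_{\rho} \Delta, \neg F$ and rnk(F) < ρ then $\vdash^{\alpha}_{\rho} \Delta$ for all α > α0.
We call F the critical formula of the clauses ($\bigwedge$) and ($\bigvee$). The critical formulas of an axiom (Ax) are $s \mathrel{\varepsilon} X$ and $t \mathrel{\not\varepsilon} X$. A cut possesses no critical formula. Observe that we have
\[ \vdash^{\alpha}_{0} \Delta \ \Leftrightarrow\ \models^{\alpha} \Delta. \tag{7.5} \]
Thus, by Theorem 7.3.2 we obtain for a finite set ∆
\[ \vdash^{m}_{\mathsf{T}} \Delta \ \Rightarrow\ \vdash^{m}_{0} \Delta. \tag{7.6} \]
There are two obvious properties of $\vdash^{\alpha}_{\rho} \Delta$.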
7.3.6 Lemma (Soundness) If $\vdash^{\alpha}_{\rho} F_1, \dots, F_n$ then $\mathbb{N} \models (F_1 \vee \dots \vee F_n)[\Phi]$ for every assignment Φ of subsets of N to the set parameters in F1, . . . , Fn.
Proof The proof is by induction on α. It is exactly the proof of Lemma 5.4.4 with the additional case that the last inference is a cut
\[ \vdash^{\alpha_0}_{\rho} F_1, \dots, F_n, A \text{ and } \vdash^{\alpha_0}_{\rho} F_1, \dots, F_n, \neg A \ \Rightarrow\ \vdash^{\alpha}_{\rho} F_1, \dots, F_n. \]
For an assignment Φ we then obtain $\mathbb{N} \models (F_1 \vee \dots \vee F_n \vee A)[\Phi]$ as well as $\mathbb{N} \models (F_1 \vee \dots \vee F_n \vee \neg A)[\Phi]$ from the induction hypothesis, which implies $\mathbb{N} \models (F_1 \vee \dots \vee F_n)[\Phi]$.
¹ There is a certain inconsistency of terminology. See Note 7.3.17 for an explanation.
7.3.7 Lemma (Structural Lemma) If $\vdash^{\alpha}_{\rho} \Delta$ then $\vdash^{\beta}_{\sigma} \Gamma$ holds true for all β ≥ α, σ ≥ ρ and Γ ⊇ ∆.
Proof Induction on α. If
\[ \vdash^{\alpha}_{\rho} \Delta \tag{i} \]
holds by (Ax) then $\{t \mathrel{\varepsilon} X, s \mathrel{\not\varepsilon} X\} \subseteq \Delta \subseteq \Gamma$, which entails $\vdash^{\beta}_{\sigma} \Gamma$ for all β and σ. If (i) is derived by an inference ($\bigwedge$) then there is a formula $F \in \Delta \cap \bigwedge\text{-type} \subseteq \Gamma \cap \bigwedge\text{-type}$ such that
\[ \vdash^{\alpha_G}_{\rho} \Delta, G \tag{ii} \]
with $\alpha_G < \alpha \leq \beta$ for all $G \in \mathrm{CS}(F)$. Since ∆, G ⊆ Γ, G we obtain
\[ \vdash^{\alpha_G}_{\sigma} \Gamma, G \tag{iii} \]
for all $G \in \mathrm{CS}(F)$ from the induction hypothesis and obtain $\vdash^{\beta}_{\sigma} \Gamma$ from (iii) by an inference ($\bigwedge$). The remaining case that (i) holds by an inference ($\bigvee$) is completely dual.
As a consequence of Theorem 7.3.2, (7.4) and (7.5) we obtain an embedding theorem for NT.
7.3.8 Theorem (Embedding theorem for NT) For every pseudo-$\Pi^1_1$-sentence F which is provable in NT there are finite ordinals m and r such that $\vdash^{\omega+m}_{r} F$.
Proof If $\mathsf{NT} \vdash F$ there are finitely many axioms A1, . . . , An of NT and a finite ordinal $m_0$ such that $\vdash^{m_0}_{\mathsf{T}} \neg A_1, \dots, \neg A_n, F$. By Theorem 7.3.2 and (7.5) this implies $\vdash^{m_0}_{0} \neg A_1, \dots, \neg A_n, F$. By (7.4) and (7.5) we have $\vdash^{\omega+4}_{0} A_i$ for i = 1, . . . , n. Using the Structural Lemma and n cuts we obtain $\vdash^{\omega+m}_{r} F$ for $r > \max\{\mathrm{rnk}(A_1), \dots, \mathrm{rnk}(A_n)\}$ and m ≥ 4 + n.
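We prepare a cut-elimination theorem for the semi-formal system by a few observations.
7.3.9 Lemma (Inversion Lemma) If $F \in \bigwedge\text{-type}$, $\mathrm{CS}(F) \neq \emptyset$ and $\vdash^{\alpha}_{\rho} \Delta, F$ then $\vdash^{\alpha}_{\rho} \Delta, G$ for all $G \in \mathrm{CS}(F)$.
Proof We induct on α. If the critical formula of the last inference is different from F, we have an inference of the form
\[ \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, F \text{ for } \iota \in I \ \Rightarrow\ \vdash^{\alpha}_{\rho} \Delta, F \tag{i} \]
and obtain $\vdash^{\alpha_\iota}_{\rho} \Delta_\iota, G$ for all ι ∈ I from the induction hypothesis. With the same inference we then get $\vdash^{\alpha}_{\rho} \Delta, G$.
Now assume that F is the critical formula of the last inference. Since $\mathrm{CS}(F) \neq \emptyset$ we have the premises
\[ \vdash^{\alpha_G}_{\rho} \Delta, F, G \tag{ii} \]
for all $G \in \mathrm{CS}(F)$. From (ii) we obtain
\[ \vdash^{\alpha_G}_{\rho} \Delta, G \tag{iii} \]
by the induction hypothesis and finally $\vdash^{\alpha}_{\rho} \Delta, G$ from (iii) by the Structural Lemma.
7.3.10 Lemma (∨-Exportation) If $\vdash^{\alpha}_{\rho} \Delta, F_1 \vee \dots \vee F_n$ then $\vdash^{\alpha}_{\rho} \Delta, F_1, \dots, F_n$.
Proof We show the lemma by induction on α. If the critical formula of the last inference is different from $F_1 \vee \dots \vee F_n$ we obtain the claim directly from the induction hypothesis. Therefore assume that $F \equiv (F_1 \vee \dots \vee F_n) :\equiv F_1 \vee (F_2 \vee \dots \vee F_n)$ is the critical formula of the last inference. Then we have the premise
\[ \vdash^{\alpha_0}_{\rho} \Delta, F, F_1 \quad\text{or}\quad \vdash^{\alpha_0}_{\rho} \Delta, F, (F_2 \vee \dots \vee F_n) \]
and obtain by the induction hypothesis
\[ \vdash^{\alpha_0}_{\rho} \Delta, F_1, \dots, F_n. \tag{i} \]
The claim then follows from (i) by the Structural Lemma.
7.3.11 Lemma If $F \in \mathrm{Diag}(\mathbb{N})$ and $\vdash^{\alpha}_{\rho} \Delta, \neg F$ then $\vdash^{\alpha}_{\rho} \Delta$.
Proof Induction on α. Since ¬F contains no logical symbols and is not in $\bigwedge\text{-type}$ it cannot be the critical formula of the last inference
\[ \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, \neg F \text{ for } \iota \in I \ \Rightarrow\ \vdash^{\alpha}_{\rho} \Delta, \neg F. \]
By the induction hypothesis we, therefore, get $\vdash^{\alpha_\iota}_{\rho} \Delta_\iota$ for all ι ∈ I and then by the same inference $\vdash^{\alpha}_{\rho} \Delta$.
7.3.12 Lemma (Reduction Lemma) Let $F \notin \bigvee\text{-type}$ and ρ = rnk(F). If $\vdash^{\alpha}_{\rho} \Delta, F$ and $\vdash^{\beta}_{\rho} \Gamma, \neg F$ then $\vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$.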
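Proof The proof is by induction on β. First assume that ¬F is not the critical formula of the last inference
\[ \vdash^{\beta_\iota}_{\rho} \Gamma_\iota, \neg F \text{ for } \iota \in I \ \Rightarrow\ \vdash^{\beta}_{\rho} \Gamma, \neg F. \tag{i} \]
If $I = \emptyset$ then either $\Gamma \cap \mathrm{Diag}(\mathbb{N}) \neq \emptyset$, which entails $\Delta, \Gamma \cap \mathrm{Diag}(\mathbb{N}) \neq \emptyset$, and we obtain $\vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$ by an inference ($\bigwedge$) with empty premise, or there are terms s and t such that $\{s \mathrel{\varepsilon} X, t \mathrel{\not\varepsilon} X\} \subseteq \Gamma$ and $s^{\mathbb{N}} = t^{\mathbb{N}}$, which immediately implies $\vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$ by (Ax). Otherwise we get
\[ \vdash^{\alpha+\beta_\iota}_{\rho} \Delta, \Gamma_\iota \tag{ii} \]
for all ι ∈ I by the induction hypothesis and obtain $\vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$ from (ii) by the same inference.
Now assume that ¬F is the critical formula. If ρ = 0 then ¬F is atomic. If $F \in \bigwedge\text{-type}$ we have $F \in \mathrm{Diag}(\mathbb{N})$ and obtain $\vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$ by Lemmas 7.3.11 and 7.3.7. If $F \equiv (s \mathrel{\varepsilon} X)$ we show
\[ \vdash^{\alpha}_{\rho} \Delta, \Gamma \tag{iii} \]
by a side induction on α. First we observe that there is a formula $t \mathrel{\varepsilon} X$ with $t^{\mathbb{N}} = s^{\mathbb{N}}$ in Γ since $\vdash^{\beta}_{\rho} \Gamma, \neg F$ holds by (Ax). If F is not the critical formula of $\vdash^{\alpha}_{\rho} \Delta, F$ then we have the premises $\vdash^{\alpha_\iota}_{\rho} \Delta_\iota, F$ for ι ∈ I. If $I = \emptyset$ we get $\vdash^{\alpha}_{\rho} \Delta, \Gamma$ directly and for $I \neq \emptyset$ from the induction hypothesis by the same inference. If F is the critical formula we are in the case of (Ax), which entails that there is a formula $r \mathrel{\not\varepsilon} X$ in ∆ with $r^{\mathbb{N}} = s^{\mathbb{N}} = t^{\mathbb{N}}$. But then we obtain $\vdash^{\alpha}_{\rho} \Delta, \Gamma$ by (Ax). The case $F \equiv (s \mathrel{\not\varepsilon} X)$ is symmetrical. From (iii) we get $\vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$ by the Structural Lemma.
Now assume ρ > 0. Then $\neg F \in \bigvee\text{-type}$ and we have the premise
\[ \vdash^{\beta_0}_{\rho} \Gamma, \neg F, \neg G \tag{iv} \]
for some $G \in \mathrm{CS}(F)$. Then we obtain
\[ \vdash^{\alpha+\beta_0}_{\rho} \Delta, \Gamma, \neg G \tag{v} \]
by the induction hypothesis. From $\vdash^{\alpha}_{\rho} \Delta, F$ we get
\[ \vdash^{\alpha+\beta_0}_{\rho} \Delta, \Gamma, G \tag{vi} \]
by the Inversion Lemma and the Structural Lemma. Since rnk(G) < rnk(F) = ρ we obtain the claim from (v) and (vi) by (cut).
7.3.13 Lemma (Basic Elimination Lemma) If $\vdash^{\alpha}_{\rho+1} \Delta$ then $\vdash^{2^{\alpha}}_{\rho} \Delta$.
Proof Induction on α. If the last inference is not a cut of complexity ρ we obtain the claim immediately from the induction hypothesis and the fact that $\lambda\xi.\,2^{\xi}$ is order preserving. The critical case is a cut
\[ \vdash^{\alpha_0}_{\rho+1} \Delta, F \text{ and } \vdash^{\alpha_0}_{\rho+1} \Delta, \neg F \ \Rightarrow\ \vdash^{\alpha}_{\rho+1} \Delta \]
with rnk(F) = ρ. By the induction hypothesis and the Reduction Lemma we obtain $\vdash^{2^{\alpha_0}+2^{\alpha_0}}_{\rho} \Delta$ and we have $2^{\alpha_0} + 2^{\alpha_0} = 2^{\alpha_0+1} \leq 2^{\alpha}$.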
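For finite derivation heights the arithmetic in the last step, and the effect of applying the Basic Elimination Lemma several times in a row, can be checked numerically. The following is a small sketch restricted to natural numbers (for infinite ordinals $2^{\alpha}$ means ordinal exponentiation, which this code does not model). The helper name `tower` is hypothetical and corresponds to iterating $\lambda\xi.\,2^{\xi}$.

```python
# Illustrative sketch for finite ordinals only; `tower` is a hypothetical helper name.

def tower(r, m):
    """Apply the map n -> 2**n to m exactly r times.

    Lowering the cut rank r times by the Basic Elimination Lemma turns a
    derivation of finite height m into one of height at most tower(r, m).
    """
    for _ in range(r):
        m = 2 ** m
    return m

def critical_step_holds(a0, a):
    """Check 2^a0 + 2^a0 = 2^(a0+1) <= 2^a for finite a0 < a."""
    return 2 ** a0 + 2 ** a0 == 2 ** (a0 + 1) <= 2 ** a

# The critical case of the lemma for all small finite heights a0 < a.
all_steps_ok = all(critical_step_holds(a0, a)
                   for a in range(1, 10) for a0 in range(a))
```

The bound grows as an exponential tower in the number of eliminated cut levels, which is why iterating the lemma finitely often on heights below ω + ω stays below ε0 in the proof of Theorem 7.3.16.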
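Observe that our language so far only comprises formulas of finite rank. But we have designed the semi-formal calculus in such a way that it will also work for languages with formulas of complexities ≥ ω. The following result masters this situation, too.
7.3.14 Lemma (Predicative Elimination Lemma) If $\vdash^{\alpha}_{\beta+\omega^{\rho}} \Delta$ then $\vdash^{\varphi_\rho(\alpha)}_{\beta} \Delta$.
Proof Induction on ρ with side induction on α. For ρ = 0 we obtain $\vdash^{2^{\alpha}}_{\beta} \Delta$ by the Basic Elimination Lemma. Since $2^{\alpha} \leq \omega^{\alpha} = \varphi_0(\alpha)$ this entails the claim. Now assume ρ > 0. If the last clause was not a cut of rank ≥ β we obtain the claim from the induction hypotheses and the fact that the function $\varphi_\rho$ is order preserving. Therefore assume that the last inference is
\[ \vdash^{\alpha_0}_{\beta+\omega^{\rho}} \Delta, F \text{ and } \vdash^{\alpha_0}_{\beta+\omega^{\rho}} \Delta, \neg F \ \Rightarrow\ \vdash^{\alpha}_{\beta+\omega^{\rho}} \Delta \]
such that $\beta \leq \mathrm{rnk}(F) < \beta + \omega^{\rho}$. But then there is an ordinal φ such that rnk(F) = β + φ, which, writing φ in Cantor normal form, means $\mathrm{rnk}(F) = \beta + \omega^{\sigma_1} + \dots + \omega^{\sigma_n} < \beta + \omega^{\rho}$. Hence $\sigma_1 < \rho$ and we get $\mathrm{rnk}(F) < \beta + \omega^{\sigma_1}\cdot(n+1)$. By the side induction hypothesis we have $\vdash^{\varphi_\rho(\alpha_0)}_{\beta} \Delta, F$ and $\vdash^{\varphi_\rho(\alpha_0)}_{\beta} \Delta, \neg F$. By a cut it follows that
\[ \vdash^{\varphi_\rho(\alpha_0)+1}_{\beta+\omega^{\sigma_1}\cdot(n+1)} \Delta. \]
If we define $\varphi^{0}_{\sigma_1}(\xi) := \xi$ and $\varphi^{n+1}_{\sigma_1}(\xi) := \varphi_{\sigma_1}(\varphi^{n}_{\sigma_1}(\xi))$ we obtain from $\sigma_1 < \rho$ by (n+1)-fold application of the main induction hypothesis
\[ \vdash^{\varphi^{n+1}_{\sigma_1}(\varphi_\rho(\alpha_0)+1)}_{\beta} \Delta. \]
Finally we show $\varphi^{n}_{\sigma_1}(\varphi_\rho(\alpha_0) + 1) < \varphi_\rho(\alpha)$ by induction on n. For n = 0 we have $\varphi^{0}_{\sigma_1}(\varphi_\rho(\alpha_0) + 1) = \varphi_\rho(\alpha_0) + 1 < \varphi_\rho(\alpha)$ since $\alpha_0 < \alpha$ and $\varphi_\rho(\alpha) \in Cr(0)$. For the induction step we have $\varphi^{n+1}_{\sigma_1}(\varphi_\rho(\alpha_0) + 1) = \varphi_{\sigma_1}(\varphi^{n}_{\sigma_1}(\varphi_\rho(\alpha_0) + 1)) < \varphi_\rho(\alpha)$, since $\sigma_1 < \rho$ and $\varphi^{n}_{\sigma_1}(\varphi_\rho(\alpha_0) + 1) < \varphi_\rho(\alpha)$ by the induction hypothesis. Hence $\vdash^{\varphi_\rho(\alpha)}_{\beta} \Delta$.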
By iterated application of the Predicative Elimination Lemma we obtain
7.3.15 Theorem (Elimination Theorem) Let $\vdash^{\alpha}_{\rho} \Delta$ and assume that $\rho =_{NF} \omega^{\rho_1} + \dots + \omega^{\rho_n}$. Then
\[ \vdash^{\varphi_{\rho_1}(\varphi_{\rho_2}(\cdots\varphi_{\rho_n}(\alpha)\cdots))}_{0} \Delta. \]
7.3.16 Theorem (The upper bound for NT) Let F be a pseudo-$\Pi^1_1$-sentence. If $\mathsf{NT} \vdash F$ then tc(F) < ε0. Hence $\|\mathsf{NT}\| \leq \|\mathsf{NT}\|_{\Pi^1_1} \leq \varepsilon_0$.
Proof If $\mathsf{NT} \vdash F$ we get by the Embedding Theorem (Theorem 7.3.8)
\[ \vdash^{\omega+\omega}_{r} F \tag{i} \]
for some finite r. By the Elimination Theorem (or just by the iterated application of the Basic Elimination Lemma) this entails
\[ \vdash^{\varphi^{r}_{0}(\omega+\omega)}_{0} F. \tag{ii} \]
Hence $\models^{\varphi^{r}_{0}(\omega+\omega)} F$ and we get tc(F) < ε0, since $\varphi^{r}_{0}(\omega + \omega) < \varepsilon_0$ holds true for all finite ordinals r.
7.3.17 Note Calculi whose rules may have infinitely many premises have already been proposed by Hilbert (cf. [44], [9]). Their systematic use in proof theory is due to Schütte who also used the term “semi-formal system” to describe such calculi. In his sense, a semi-formal system consists of a decidable² set T of axioms together with a collection R of rules. In contrast to a formal system, the rules R ∈ R of a semi-formal system may have infinitely many premises. In a stricter sense, however, one requires that the premises of a rule of a semi-formal system are primitive recursively enumerated, i.e., that there is a primitive recursive coding $\ulcorner\Delta\urcorner$ of the formulas ∆ and a primitive recursive function f such that for an infinitary rule with premises $\Delta_i$, i ∈ N, and conclusion ∆ we obtain $\ulcorner\Delta_i\urcorner = f(\ulcorner\Delta\urcorner, i)$. Semi-formal systems in this strict sense have subtle applications in proof theory (cf. e.g. [90]). It is not too difficult to see that the semi-formal system of Definition 7.3.5 can be replaced by a semi-formal system in the strict sense. Also cut-elimination holds true for strict semi-formal systems.
In this book, however, we will not need the properties of semi-formal systems in the strict sense. Therefore we only introduced the liberalized notion, omitting the requirement of primitive recursive enumerability of the premises of an infinite inference rule. In that sense also the verification calculus $\models^{\alpha}$ can be viewed as a semi-formal system. This becomes especially clear by property (7.5). But observe that some of the properties of the verification calculus may be lost by turning it into a semi-formal system in the strict sense. An example of such a property is in fact (7.5), where the direction from right to left becomes false for semi-formal systems in the strict sense.
Since there is, in principle, a qualitative difference between the verification calculus $\models^{\alpha}$, whose root is the truth definition for sentences, and the inference system $\vdash^{\alpha}_{\rho}$, which is a generalization of the finitary calculus for predicate logic, we distinguish between the verification calculus and the semi-formal calculus. This difference will become even more evident when we later introduce operator controlled derivations. In general, the verification calculus cannot be controlled by an operator.
² In some settings he also allowed axioms of the form $(\forall x)F_u(x)$ (or F(u)) provided that $F_u(z)$ is a true atomic formula for every instantiation z for u.
α 7.3.18 Exercise (a) Show that 1 ¬ClF (X),t1 ε X, . . . ,tn ε X, ∆ [X,Y ] for a set <2α ∆ [X,Y ] of X-positive pseudo-Π11 -sentences implies N |= ∆ [IF[t , S] for all 1 ,...,tn ] assignments S to the set variables Y in ∆ [X,Y ].
(b) Prove the Boundedness Lemma (Lemma 6.6.8) with $\models^{\alpha}$ replaced by $\vdash^{\alpha}_{1}$.
(c) Conclude that $\vdash^{\alpha}_{1} \mathrm{TI}(\prec, X)$ implies otyp(≺) ≤ α for transitive relations ≺.
Hint: Just add the additional case of an atomic cut to the proof of Lemmas 6.6.3 and 6.6.8.
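7.3.19 Exercise Let $\mathsf{PA}^{-}$ be the theory PA without the scheme of Mathematical Induction (IND). For each n ∈ N let $\mathsf{PA}_n$ be the theory $\mathsf{PA}^{-} + (\mathrm{IND})_n$, where $(\mathrm{IND})_n$ is the restriction of (IND) to $\Sigma^0_n$-formulas. These theories are better known under the name $I\Sigma_n$. We define the quantifier rank $\mathrm{rnk}^*(F)$ for L(PA)-formulas F by the following clauses:
\[ \mathrm{rnk}^*(F) := \begin{cases} 0 & \text{if } F \text{ contains no quantifiers} \\ \max\{\mathrm{rnk}^*(F_1), \mathrm{rnk}^*(F_2)\} + 1 & \text{if } F \equiv F_1 \circ F_2 \text{ with } \circ \in \{\wedge, \vee\} \text{ and } F \text{ contains quantifiers} \\ \mathrm{rnk}^*(F_0) + 1 & \text{if } F \equiv (\forall x)F_0(x) \text{ or } F \equiv (\exists x)F_0(x) \end{cases} \]
Let $\mathrm{IR}_n$ denote the rules of the Tait-calculus enlarged by the cut rule
(cut) If $\vdash^{m_0}_{r} \Delta, F$, $\vdash^{m_0}_{r} \Delta, \neg F$ and $\mathrm{rnk}^*(F) < r$ then $\vdash^{m}_{r} \Delta$ for all $m > m_0$
and the rule of $\Sigma^0_n$-induction
$(\mathrm{IR})_n$ If $\vdash^{m_0}_{r} \Delta, F(0)$, $\vdash^{m_0}_{r} \Delta, \neg F(u), F(Su)$ for a $\Sigma^0_n$-formula F(u) and $u \notin \mathrm{FV}(\Delta)$ then $\vdash^{m}_{r} \Delta, F(t)$ for all $m > m_0$ and all L(NT)-terms t.
Let $\mathsf{PA–IR}_n$ denote the formal system $(\mathsf{PA}^{-}, \mathrm{IR}_n)$. We use $\mathsf{PA–IR}_n \vdash \Delta$ as abbreviation for the fact that there is an m and an r such that $\mathsf{PA–IR}_n \vdash^{m}_{r} \Delta$. We finally define $\vdash^{*\,\alpha}_{\ \rho} \Delta$ like $\vdash^{\alpha}_{\rho} \Delta$ where rnk(F) is replaced by $\mathrm{rnk}^*(F)$.
(a) Show $\mathsf{PA–IR}_n \vdash F \Leftrightarrow \mathsf{PA}_n \vdash F$ for every L(PA)-formula F.
For the following assume that ∆ is a finite set of L(PA)-formulas.
(b) Show $\mathsf{PA–IR}_n \vdash^{m}_{n+1+r} \Delta \Rightarrow \mathsf{PA–IR}_n \vdash^{2_r(m)}_{n+1} \Delta$ and $\mathsf{PA–IR}_n \vdash^{m}_{n+1} \Delta(\vec{x}) \Rightarrow\ \vdash^{*\,\omega\cdot(m+1)}_{\ n+1} \Delta(\vec{k})$ for all $\vec{k} \in \mathbb{N}$, where $\vec{x}$ are all the first-order variables that are free in ∆.
(c) Show $\vdash^{*\,\alpha}_{\ 1+m+1} \Delta \Rightarrow\ \vdash^{*\,2^{\alpha}}_{\ 1+m} \Delta$ for all m ∈ N.
(d) Show $\vdash^{*\,\alpha}_{\ 1} \Delta \Rightarrow\ \vdash^{\omega\cdot\alpha}_{1} \Delta$.
(e) Use (d) to compute an upper bound for $\|\mathsf{PA}_n\|$ for each n ∈ N.
(f) Can you improve (d) in case that we work in the language L(NT)? Will this influence $\|\mathsf{NT}_n\|$ where $\mathsf{NT}_n$ is defined analogously to $\mathsf{PA}_n$?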
7.4 The Lower Bound
We want to show that the bound given in Theorem 7.3.16 is the best possible one. By Theorem 6.7.3 it suffices to prove Theorem 7.4.1 below because then we obtain $\varepsilon_0 \leq \|\mathsf{NT}\| \leq \|\mathsf{NT}\|_{\Pi^1_1} \leq \varepsilon_0$.
7.4.1 Theorem For every ordinal α < ε0 there is a primitive recursive well-ordering ≺ on the natural numbers of order type α such that $\mathsf{NT} \vdash \mathrm{TI}(\prec, X)$.
In Sect. 3.3.1, we developed a notation system for the ordinals below ε0. There we defined a primitive recursive set OT ⊆ N and a relation ≺ that corresponds to the ordinals below ε0, as summarized in Theorem 3.3.17. Thus we can talk about ordinals < ε0 in L(NT). To increase readability, we will not distinguish between ordinals and their representations in L(NT) and regard formulas (∀α)[. . .] as abbreviations for (∀x)[x ε OT → . . .] and formulas (∃α)[. . .] as abbreviations for (∃x)[x ε OT ∧ . . .]. We also write α < β instead of α ≺ β. We introduce the following formulas:
• α ⊆ X :⇔ (∀ξ)[ξ < α → ξ ε X]
• Prog(X) :⇔ (∀α)[α ⊆ X → α ε X]
• TI(α, X) :⇔ Prog(X) → α ⊆ X
Our aim is to show TI(α, X) for all α < ε0. Since $\varepsilon_0 = \sup\{\varphi^{n}_{0}(0) \mid n \in \omega\} = \sup\{\exp^{n}(\omega, 0) \mid n \in \omega\}$ and TI(0, X) holds trivially, we are done as soon as we succeed in proving
\[ \mathsf{NT} \vdash \mathrm{TI}(\alpha, X) \ \Rightarrow\ \mathsf{NT} \vdash \mathrm{TI}(\omega^{\alpha}, X) \tag{7.7} \]
because $\mathsf{NT} \vdash \mathrm{TI}(\alpha, X)$ and β < α obviously entail $\mathsf{NT} \vdash \mathrm{TI}(\beta, X)$.
As a preparation, we need a simple lemma about substitutions. Recall that if F(X) is a formula containing a set variable X and G(v) is a formula with a number variable v we denote by $F(\{x \mid G_v(x)\})$ the formula that is obtained from F(X) by replacing all occurrences of $t \mathrel{\varepsilon} X$ by $G_v(t)$ and those of $t \mathrel{\not\varepsilon} X$ by $\neg G_v(t)$.
7.4.2 Lemma Let ∆(X) be a finite set of formulas (in an arbitrary first-order language) containing a free set variable and G(v) a formula containing an object variable v. If $\vdash^{m}_{\mathsf{T}} \Delta(X)$ then there is a k such that $\vdash^{k}_{\mathsf{T}} \Delta(\{x \mid G_v(x)\})$.
Proof The proof is a simple induction on m. There are only two cases which need some attention. First the case of an axiom. If $\Delta(X) = \Delta_0(X), t \mathrel{\varepsilon} X, t \mathrel{\not\varepsilon} X$ we obtain $\Delta(\{x \mid G_v(x)\}) = \Delta_0(\{x \mid G_v(x)\}), G_v(t), \neg G_v(t)$. Here we have to satisfy ourselves that for an arbitrary formula A (and not only for an atomic formula A) there is a k such that $\vdash^{k}_{\mathsf{T}} \Delta, A, \neg A$. But this is shown as in the proof of the Tautology Lemma (Lemma 7.3.3), which also shows that k = 2·rnk(A). The second critical case is that of an application of the rule (∀). Here it might happen that the variable condition is violated. To avoid this, we have to show that $\vdash^{m}_{\mathsf{T}} \Delta(v)$ entails $\vdash^{m}_{\mathsf{T}} \Delta_v(u)$, which is obvious by induction on m. In the case of an inference according to (∀) we then have the premise $\vdash^{m_0}_{\mathsf{T}} \Delta_0(X), A(X, u)$ and choose a new free object variable w that does not occur in $G_v(t)$. Then we obtain $\vdash^{m_0}_{\mathsf{T}} \Delta_0(X), A(X, u)_u(w)$ and, using the induction hypothesis, $\vdash^{k_0}_{\mathsf{T}} \Delta_0(\{x \mid G_v(x)\}), A(\{x \mid G_v(x)\}, u)_u(w)$ for some $k_0$ and may now apply an inference (∀) to obtain $\vdash^{k}_{\mathsf{T}} \Delta_0(\{x \mid G_v(x)\}), (\forall x)A(\{x \mid G_v(x)\}, u)_u(x)$.
The remaining cases are all simple.
The next observation is a consequence of the previous lemma, which, nevertheless, still needs some care.
7.4.3 Lemma (Substitution for NT) Let F(X) and G(v) be formulas in the language of number theory. Then
\[ \mathsf{NT} \vdash F(X) \ \Rightarrow\ \mathsf{NT} \vdash F(\{x \mid G(x)\}). \tag{7.8} \]
Proof To prove (7.8) assume
\[ \mathsf{NT} \vdash F(X). \tag{i} \]
Then there are finitely many axioms A1, . . . , An ∈ NT and an m such that
\[ \vdash^{m}_{\mathsf{T}} \neg A_1(X), \dots, \neg A_n(X), F(X). \tag{ii} \]
By Lemma 7.4.2, we then obtain
\[ \vdash^{k}_{\mathsf{T}} \neg A_1(\{x \mid G(x)\}), \dots, \neg A_n(\{x \mid G(x)\}), F(\{x \mid G(x)\}). \tag{iii} \]
To conclude $\mathsf{NT} \vdash F(\{x \mid G(x)\})$, we have to ensure that $A_i(\{x \mid G(x)\}) \in \mathsf{NT}$ holds for i = 1, . . . , n. But this is obvious since we formulated the only critical axioms, the last identity axiom and Mathematical Induction, as schemes.³
³ It is possible to replace the last identity scheme by the single axiom (∀x)(∀y)[x = y → (x ε X → y ε X)]. Formulating Mathematical Induction as a scheme is, however, inevitable.
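Let $J(X) := \{\alpha \mid (\forall \xi)[\xi \subseteq X \rightarrow \xi + \omega^{\alpha} \subseteq X]\}$ denote the jump of X. Then, if we assume
\[ \mathsf{NT} \vdash \mathrm{Prog}(X) \rightarrow \mathrm{Prog}(J(X)), \tag{i} \]
we obtain
\[ \mathsf{NT} \vdash \mathrm{TI}(\alpha, J(X)) \rightarrow \mathrm{TI}(\omega^{\alpha}, X). \tag{ii} \]
To prove (ii) assume (working informally in NT) TI(α, J(X)), i.e.,
\[ \mathrm{Prog}(J(X)) \rightarrow \alpha \subseteq J(X) \tag{iii} \]
which entails
\[ \mathrm{Prog}(J(X)) \rightarrow \alpha \mathrel{\varepsilon} J(X). \tag{iv} \]
Choosing ξ = 0 in the definition of the jump turns (iv) into
\[ \mathrm{Prog}(J(X)) \rightarrow \omega^{\alpha} \subseteq X, \tag{v} \]
which, together with (i), gives
\[ \mathrm{Prog}(X) \rightarrow \omega^{\alpha} \subseteq X, \tag{vi} \]
i.e., $\mathrm{TI}(\omega^{\alpha}, X)$. Once we have (ii) we also get (7.7) because $\mathsf{NT} \vdash \mathrm{TI}(\alpha, X)$ implies $\mathsf{NT} \vdash \mathrm{TI}(\alpha, J(X))$ by (7.8). It remains to prove (i). Again we work informally in NT. Assume
\[ \mathrm{Prog}(X). \tag{vii} \]
We want to prove Prog(J(X)), i.e., $(\forall \alpha)[\alpha \subseteq J(X) \rightarrow \alpha \mathrel{\varepsilon} J(X)]$. Thus, assuming also
\[ \alpha \subseteq J(X), \tag{viii} \]
we have to show $\alpha \mathrel{\varepsilon} J(X)$, i.e., $(\forall \xi)[\xi \subseteq X \rightarrow \xi + \omega^{\alpha} \subseteq X]$. That means that for
\[ \eta < \xi + \omega^{\alpha} \tag{ix} \]
we have to prove $\eta \mathrel{\varepsilon} X$ under the additional hypothesis
\[ \xi \subseteq X. \tag{x} \]
If η < ξ, we obtain $\eta \mathrel{\varepsilon} X$ by (x). Let $\xi \leq \eta < \xi + \omega^{\alpha}$. If α = 0 then η = ξ and we obtain $\eta \mathrel{\varepsilon} X$ by (x) and (vii). If α > 0 then there is a σ < α and a natural number n (i.e., a numeral in NT) such that $\eta < \xi + \underbrace{\omega^{\sigma} + \dots + \omega^{\sigma}}_{n\text{-fold}} =: \xi + \omega^{\sigma}\cdot n$.⁴ We show
\[ \sigma < \alpha \rightarrow \xi + \omega^{\sigma}\cdot n \subseteq X \tag{xi} \]
by induction on n. For n = 0 this is (x). For n := m + 1 we have
\[ \xi + \omega^{\sigma}\cdot m \subseteq X \tag{xii} \]
by the induction hypothesis. From σ < α we obtain $\sigma \mathrel{\varepsilon} J(X)$ by (viii). This together with (xii) entails $\xi + \omega^{\sigma}\cdot n = \xi + \omega^{\sigma}\cdot m + \omega^{\sigma} \subseteq X$. This finishes the proof of (i), hence also that of (7.7), which in turn implies Theorem 7.4.1.
⁴ Cf. the proof of the Predicative Elimination Lemma (Lemma 7.3.14).
Summing up we have shown
7.4.4 Theorem (Ordinal Analysis of NT) $\|\mathsf{NT}\| = \|\mathsf{NT}\|_{\Pi^1_1} = \varepsilon_0$.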
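The ordinal arithmetic used in the proof of (i) can be made concrete for ordinals below ε0. The following sketch is not the notation system OT of Sect. 3.3.1 but a simplified stand-in under stated assumptions: an ordinal in Cantor normal form $\omega^{a_1} + \dots + \omega^{a_k}$ with $a_1 \geq \dots \geq a_k$ is represented by the tuple of its exponents, each exponent again being such a tuple. With this representation Python's built-in lexicographic tuple comparison coincides with the ordering of the ordinals. All names (`ZERO`, `omega_pow`, `add`, `times`, `bound_multiplier`) are hypothetical.

```python
# Illustrative sketch: ordinals below epsilon_0 as nested tuples of exponents
# (Cantor normal form with non-increasing exponents). Hypothetical helper names.

ZERO = ()

def omega_pow(a):
    """The ordinal omega^a."""
    return (a,)

def add(a, b):
    """Ordinal sum a + b: terms of a below the leading exponent of b are absorbed."""
    if not b:
        return a
    lead = b[0]
    return tuple(e for e in a if e >= lead) + b

def times(sigma, n):
    """The n-fold sum omega^sigma + ... + omega^sigma, i.e. omega^sigma * n."""
    result = ZERO
    for _ in range(n):
        result = add(result, omega_pow(sigma))
    return result

def bound_multiplier(eta, xi, sigma, limit=1000):
    """Least n < limit with eta < xi + omega^sigma * n, or None if none is found."""
    for n in range(limit):
        if eta < add(xi, times(sigma, n)):
            return n
    return None

ONE = omega_pow(ZERO)
TWO = add(ONE, ONE)
OMEGA = omega_pow(ONE)

# Instance of the case alpha > 0 with alpha = 2, xi = omega, eta = omega*3 + 5:
xi = OMEGA
eta = add(times(ONE, 3), times(ZERO, 5))
assert eta < add(xi, omega_pow(TWO))       # eta < xi + omega^alpha
n = bound_multiplier(eta, xi, ONE)         # sigma = 1 < alpha

# The sequence 0, 1, omega, omega^omega, ... whose supremum is epsilon_0.
towers = [ZERO]
for _ in range(4):
    towers.append(omega_pow(towers[-1]))
```

In the example, `bound_multiplier` finds a numeral n with η < ξ + ω^σ·n, which is exactly the witness whose existence the proof uses before starting the induction on n.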
The next theorem is a consequence of Theorem 7.3.16 and (the proof of) Theorem 7.4.1.
7.4.5 Theorem There is a $\Pi^1_1$-sentence (∀X)(∀x)F(X, x) that is true in the standard structure N such that $\mathsf{NT} \vdash F(X, n)$ for all n ∈ N but $\mathsf{NT} \nvdash (\forall x)F(X, x)$.
To prove the theorem choose F(X, x) :⇔ Prog(X) → x ε OT → x ε X.
7.4.6 Remark Theorem 7.4.5 can be regarded as a weakened form of Gödel’s first incompleteness theorem. Gödel’s theorem states that there is already a true $\Pi^0_1$-sentence $(\forall x)G(x)$ that is unprovable in NT while all its instantiations $G(\underline{n})$ are provable in NT. Theorem 7.4.5 can be read as saying that there is a $\Pi^0_3$-formula $(\forall x)F(x, X)$ with this property, which is a true pseudo-$\Pi^1_1$-sentence. Observe that the presence of the free set variable in the $\Pi^0_3$-formula is crucial for the proof given here, which is completely different from Gödel’s proof. Theorem 7.4.5 is stronger in the respect that provability of all instances $F(\underline{n}, X)$ is not obvious, while the provability of $G(\underline{n})$ follows from the fact that all true $\Sigma^0_1$-sentences are provable in NT. Getting rid of free set variables could, however, be of importance for the separation of theories in bounded arithmetic (cf. [7]). But even the application of highly sophisticated methods of impredicative proof theory to NT will only yield independence of $\Pi^0_2$-sentences (cf. Corollary 10.5.11 below).

7.4.7 Exercise Show that $\varphi_0^{n}(\omega) \le \|NT_n\|$ holds true for all $n \in \omega$. Compare the result with Exercise 7.3.19 to obtain exact bounds for $n \ge 1$. Hint: Use mathematical induction to show $NT_0 \vdash TI(\omega, X)$ and define $\mathcal{J}(X)$ more parsimoniously.
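7.4.8 Exercise Prove $\varphi_1(\varepsilon_0) \le \|(\Delta^1_0\text{–CA})\|$. Hint: Show

$$(\Delta^1_0\text{–CA}) \vdash (\forall X)TI(\alpha, X) \rightarrow (\forall X)TI(\varphi_0^{n}(\alpha), X)$$

using the scheme of Mathematical Induction. Observe that $\xi < \varphi_1(\alpha)$ and $\alpha \neq 0$ imply that there is a natural number $n$ and an ordinal $\gamma < \alpha$ (whose code is primitive recursively computable from the codes of $\xi$ and $\alpha$) such that $\xi < \varphi_0^{n}(\varphi_1(\gamma) + 1)$. Conclude that

$$(\Delta^1_0\text{–CA}) \vdash (\forall \xi < \alpha)(\forall X)TI(\varphi_1(\xi), X) \rightarrow (\forall X)TI(\varphi_1(\alpha), X).$$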
7.5 The Use of Gentzen’s Consistency Proof for Hilbert’s Programme

In this section, we want to discuss the results just obtained in the light of the original aims of proof theory as posed in Hilbert’s programme. We start with some general remarks on the consistency of formal systems.
7.5.1 On the Consistency of Formal and Semi-Formal Systems

7.5.1 Definition A (semi-) formal system $T$ is semantically consistent if there is no formula $A$ such that $T \vdash A$ and $T \vdash \neg A$.

If $T$ is a theory that has a model $M$ then $M \models A$ for all $A$ such that $T \vdash A$. Since $M \models A$ and $M \models \neg A$ is impossible it follows that $T$ is semantically consistent. Why is it not possible to satisfy Hilbert’s programme by just showing that $T$ has a model? For a theory $T$ that contains some kind of infinity axiom it is impossible to prove by finitist means that $T$ possesses a model. An infinite model cannot be constructed by finitist means. It is difficult to give an exact description of “finitist means”. Let $(\Sigma^0_1\text{–Ind})$ be the theory NT in which the scheme (IND) of mathematical induction is restricted to $\Sigma^0_1$-formulas, i.e., to formulas of the shape $(\exists x)A$ where $A$ is quantifier free. For our purposes, it is sufficient to regard everything that can be formalized within the system $(\Sigma^0_1\text{–Ind})$ as finitist.⁵ It is easy to see that the common definition of “$M \models NT$” cannot be carried out in NT, let alone within $(\Sigma^0_1\text{–Ind})$.⁶ Trying to construct full models is apparently the wrong way to come to finitist consistency proofs. Still following the ideas of Hilbert’s programme we define the notion of syntactical consistency.

7.5.2 Definition A (semi-) formal system $T$ is syntactically consistent iff there is a formula $A$ such that $T \nvdash A$.

For reasonable formal systems we can show that semantical and syntactical consistency coincide. To explain which formal systems are reasonable we need some preparations. Let us first recall some basic notions. A propositional atom is either an atomic formula or a formula whose outermost logical symbol is a quantifier. A propositional assignment is a map that assigns a truth value to every propositional atom. Here we require that atoms that are dual in the Tait-language (i.e., atoms $A$ and $B$ such that $(\sim A) = B$ and $(\sim B) = A$) obtain opposite truth values.
5 This system proves the same $\Pi^0_2$-sentences as the system in which Mathematical Induction is restricted to quantifier free formulas. There are good reasons why this system is regarded as finitist that have been widely discussed in the literature. We do not want to add further arguments to that discussion.
6 It follows by Gödel’s second incompleteness theorem that this is impossible in principle.
The propositional truth value of a formula under a propositional assignment is calculated according to the truth tables for the logical connectives ∧ and ∨. A formula is propositionally valid if it becomes true under all propositional assignments.

7.5.3 Definition
(a) Let $(T, R)$ be a (semi-) formal system. We say that a rule $A_1, \ldots, A_n \vdash F$ is permissible in $(T, R)$ if
$$(T, R) \vdash A_1, \;\ldots,\; (T, R) \vdash A_n \;\Rightarrow\; (T, R) \vdash F.$$
(b) A rule $A_1, \ldots, A_n \vdash F$ is a propositional rule if the formula $\neg A_1 \vee \cdots \vee \neg A_n \vee F$ is propositionally valid.
(c) A (semi-) formal system $(T, R)$ is propositionally closed if every propositional rule is a permissible rule of $(T, R)$.

7.5.4 Theorem A propositionally closed (semi-) formal system is semantically consistent if and only if it is syntactically consistent.

Proof If $(T, R)$ is semantically consistent we have $(T, R) \nvdash (A \wedge \neg A)$ for every formula $A$, since otherwise propositional closure would yield both $(T, R) \vdash A$ and $(T, R) \vdash \neg A$. So $(T, R)$ is syntactically consistent. If $(T, R)$ is semantically inconsistent there is a formula $A$ such that $(T, R) \vdash A \wedge \neg A$. Since $(A \wedge \neg A) \vdash F$ is a propositional rule for all formulas $F$ we obtain $(T, R) \vdash F$ for all formulas $F$. So $(T, R)$ is syntactically inconsistent.
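The notion of a propositional rule in Definition 7.5.3 (b) is decidable by truth tables, and a small sketch may make that concrete. The encoding of formulas as nested tuples and all function names below are assumptions made for this illustration only; in particular the explicit "not" node stands in for the dual atoms of the Tait-language.

```python
from itertools import product

# Hypothetical encoding, used only in this sketch:
#   ("atom", name), ("not", f), ("and", f, g), ("or", f, g)

def atoms(f):
    """Collect the names of the propositional atoms occurring in f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, assignment):
    """Truth value of f under a propositional assignment (dict: name -> bool)."""
    tag = f[0]
    if tag == "atom":
        return assignment[f[1]]
    if tag == "not":
        return not value(f[1], assignment)
    if tag == "and":
        return value(f[1], assignment) and value(f[2], assignment)
    if tag == "or":
        return value(f[1], assignment) or value(f[2], assignment)
    raise ValueError(f"unknown connective: {tag}")

def is_propositionally_valid(f):
    """True iff f is true under every assignment to its atoms."""
    names = sorted(atoms(f))
    return all(
        value(f, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )

def is_propositional_rule(premises, conclusion):
    """Check that (not A1) or ... or (not An) or F is propositionally valid."""
    f = conclusion
    for a in reversed(premises):
        f = ("or", ("not", a), f)
    return is_propositionally_valid(f)

A, B, F = ("atom", "A"), ("atom", "B"), ("atom", "F")
ex_falso = is_propositional_rule([("and", A, ("not", A))], F)
modus_ponens = is_propositional_rule([A, ("or", ("not", A), B)], B)
unsound = is_propositional_rule([A], B)
```

The first check is the rule $(A \wedge \neg A) \vdash F$ used in the proof of Theorem 7.5.4; the last one shows a rule that is not propositional.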
7.5.2 The Consistency of NT

A theory is of course propositionally closed. To obtain the semantical consistency of NT it therefore suffices to show that there is a formula $A$ that is not derivable from the axioms in NT. We will prove that via the syntactical consistency of the semi-formal system introduced in Definition 7.3.5.

7.5.5 Lemma If $\vdash^{\alpha}_{0} A$ for an atomic sentence $A$ then $A \in Diag(\mathbb{N})$.

Proof We show the claim by induction on $\alpha$. Since $CS(A) = \emptyset$ and the last inference cannot be a cut, the only possibility is an application of $(\bigwedge)$ with empty premises and critical formula $A$. But then $A \in \bigwedge$–type, i.e., $A \in Diag(\mathbb{N})$.

7.5.6 Corollary The semi-formal system given in Definition 7.3.5 is syntactically consistent.
Proof If $\neg A \in Diag(\mathbb{N})$ and we assume $\vdash^{\alpha}_{\rho} A$ then we obtain by the Elimination Theorem $\vdash^{\beta}_{0} A$ for some ordinal $\beta$. This, however, contradicts Lemma 7.5.5.
7.5.7 Lemma The semi-formal system given in Definition 7.3.5 is propositionally closed.

Proof Assume that $\neg A_1 \vee \cdots \vee \neg A_n \vee F$ is a propositionally valid formula and

$$\vdash^{\alpha_i}_{\rho_i} A_i \tag{i}$$

for $i = 1, \ldots, n$. Let $\rho = \max(\{\rho_i \mid i = 1, \ldots, n\} \cup \{rnk(A_i) \mid i = 1, \ldots, n\}) + 1$ and $\alpha := \max\{\alpha_1, \ldots, \alpha_n\}$. Since $\neg A_1 \vee \cdots \vee \neg A_n \vee F$ is propositionally valid it is also logically valid and by the Completeness Theorem for first-order logic (Theorem 4.4.1) and Theorem 7.3.2 there is a natural number $m$ such that⁷

$$\vdash^{m}_{0} \neg A_1 \vee \cdots \vee \neg A_n \vee F. \tag{ii}$$

From (i) and (ii) we obtain $\vdash^{\beta}_{\rho} F$ for some $\beta < \alpha + \omega$.
7.5.8 Theorem The semi-formal system given in Definition 7.3.5 is semantically consistent. Proof
This follows from Corollary 7.5.6, Lemma 7.5.7 together with Theorem 7.5.4.
7.5.9 Theorem The theory NT is semantically consistent.

Proof We already remarked that theories are always propositionally closed. So it remains to show that NT is syntactically consistent. Assuming that NT is syntactically inconsistent we obtain $NT \vdash A$ for all sentences $A$. By Theorem 7.3.8 this implies $\vdash^{\omega+\omega}_{r} A$, contradicting Corollary 7.5.6.

Having seen this consistency proof we have to answer the question: “What makes this proof more ‘constructive’ in comparison to the proof via the construction of a model for NT?” Since this leads us away from the main topic of this book, we will be quite sketchy. Nevertheless, we want to at least touch on this issue. First we observe that the infinitary proof tree constructed in the proof of Theorem 7.3.2 is primitive recursive, hence definable in $NT_0$, a theory in the language $L(NT)$ in which the scheme of Mathematical Induction is restricted to quantifier free formulas.⁸

7 The proof via the completeness theorem works only for first-order formulas. We will later extend the semi-formal system to a language with infinitely long formulas. To show that these systems are also propositionally closed we will then have to reprove that all propositionally valid formulas are derivable.
8 This system is also known as Primitive Recursive Arithmetic PRA.
Secondly, we have to secure that the cut-elimination operations, as given by the Reduction Lemma (Lemma 7.3.12) and Lemma 7.3.13, are primitive recursive, hence executable in $NT_0$. By a slight modification of the cut-elimination procedure it is even possible to define the reduction steps for non-well-founded trees. This method is known as Mints’ continuous cut-elimination.⁹ But since we have not introduced continuous cut-elimination, we need the well-foundedness of the infinitary proof tree to show that the cut-elimination procedure terminates. If we assume an arithmetization of the formulas we can define a predicate

$$e \vdash^{\alpha}_{\rho} \Delta$$

that formalizes the fact that $e$ is an index of a recursive proof tree whose nodes are tagged with notations for ordinals $\le \alpha$ and finite sets of codes for formulas in $L(NT)$ such that its root is tagged with (codes for) $\alpha$ and $\Delta$. Then there is a recursive function $f$ such that

$$NT + (TI(\exp^{n}(\omega, \alpha), X)) \vdash \; e \vdash^{\alpha}_{n} \Delta \;\rightarrow\; f(e) \vdash^{\exp^{n}(\omega,\alpha)}_{0} \Delta. \tag{i}$$

To formalize the embedding procedure let $T$ be a theory in the language of arithmetic. We assume that there is an elementary coding for the language of arithmetic and that there is a predicate $Prf_T(i, v) :\Leftrightarrow$ “$i$ codes a proof from $T$ of the formula coded by $v$”. By the above embedding procedure we then obtain

$$NT \vdash (\exists x)[Prf_{NT}(x, \ulcorner F \urcorner)] \rightarrow (\exists x)[(x)_0 \vdash^{\omega+\omega}_{(x)_1} \ulcorner F \urcorner]. \tag{ii}$$

From (ii) and (i) it follows

$$NT + (TI(\varepsilon_0, X)) \vdash (\exists x)[Prf_{NT}(x, \ulcorner F \urcorner)] \rightarrow (\exists x)[(x)_0 \vdash^{(x)_1}_{0} \ulcorner F \urcorner]. \tag{iii}$$

Formalizing the fact that no false atomic sentence is derivable in the semi-formal system yields

$$NT + (TI(\varepsilon_0, X)) \vdash \neg(\exists x)[(x)_0 \vdash^{(x)_1}_{0} \ulcorner A \urcorner] \tag{iv}$$

for all formulas $A$ such that $\neg A \in Diag(\mathbb{N})$. But from (iv) and (iii) we obtain

$$NT + (TI(\varepsilon_0, X)) \vdash \neg(\exists x)[Prf_{NT}(x, \ulcorner A \urcorner)]$$

for all false atomic sentences. So we have shown by transfinite induction along the notation system developed in Sect. 3.3.1 that NT is syntactically consistent and therefore also semantically consistent. On the other hand we have seen in Sect. 7.4 that NT proves the well-foundedness of all initial segments of the notation system. In this sense the result is optimal.

9 The details of a similar construction are in [12].
Using Mints’ continuous cut-elimination the result can be sharpened in so far that (iii) becomes formalizable in PRA. Since all paths in the cut-free derivation are primitive recursive it even suffices to know that every primitive recursively definable ≺-descending sequence is finite to get (iv). Thus, PRA together with the principle PRWO(≺), which says that every primitive recursively definable ≺-descending sequence is finite, suffices to prove the consistency of NT.
7.5.3 Kreisel’s Counterexample

Having in mind the fact that PRA + PRWO(≺) for a primitive recursive well-ordering ≺ proves the consistency of the theory NT while NT proves the well-foundedness of all initial segments of ≺, it is tempting to define the proof-theoretic ordinal of a theory $T$ as the order type of the shortest primitive recursive well-ordering that is needed to show the consistency of $T$ over the basis theory PRA. That this does not work has been pointed out by Kreisel, who presented the following counterexample. To sketch it let us work in PRA. A formula is $\Delta_0$ iff it only contains bounded quantifiers $(\forall x < a)$ or $(\exists x < a)$. For a theory $T$ we define the provability predicate $\Box_T x :\Leftrightarrow (\exists y)Prf_T(y, x)$. Let $\bot$ be a false atomic sentence, i.e., let $(\sim\bot) \in Diag(\mathbb{N})$, and define $Con(T) :\Leftrightarrow \neg\Box_T \ulcorner \bot \urcorner$.

7.5.10 Theorem (Kreisel) For any consistent theory $T$ there is a primitive recursive well-ordering $\prec_T$ of order type $\omega$ such that $PRA + PRWO(\prec_T) \vdash Con(T)$.

Proof Define

$$x \prec_T y :\Leftrightarrow \begin{cases} x < y & \text{if } (\forall i < x)[\neg Prf_T(i, \ulcorner \bot \urcorner)] \\ y < x & \text{otherwise} \end{cases} \tag{i}$$

and let

$$F(x) :\Leftrightarrow (\forall i \le x)[\neg Prf_T(i, \ulcorner \bot \urcorner)]. \tag{ii}$$

Now we obtain

$$PRA \vdash (\forall x \prec_T y)F(x) \rightarrow F(y) \tag{iii}$$

since, if we assume $\neg F(y)$, we have $(\exists i \le y)[Prf_T(i, \ulcorner \bot \urcorner)]$ and get $y + 1 \prec_T y$ and thus together with the premise of (iii) also $F(y + 1)$. But this implies $F(y)$, a contradiction. Since $\prec_T$ is primitive recursive we obtain from (iii)

$$PRA + PRWO(\prec_T) \vdash (\forall x)F(x) \tag{iv}$$

and thus

$$PRA + PRWO(\prec_T) \vdash Con(T). \tag{v}$$

Since $Con(T)$ is true we have $\prec_T \,=\, <$ and thus $otyp(\prec_T) = \omega$.
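The ordering in Kreisel's proof can be experimented with directly. The sketch below is only an illustration: `prf_bottom` is a hypothetical decidable predicate standing in for $Prf_T(i, \ulcorner\bot\urcorner)$, and the "inconsistent" example simply pretends that the number 3 codes a proof of $\bot$.

```python
def make_prec(prf_bottom):
    """Build x <_T y from a stand-in for Prf_T(i, code of bottom)."""
    def prec(x, y):
        # No proof of bottom with code below x: use the natural order.
        if all(not prf_bottom(i) for i in range(x)):
            return x < y
        # Otherwise the order is reversed.
        return y < x
    return prec

def F(prf_bottom, x):
    """F(x): no i <= x codes a proof of bottom."""
    return all(not prf_bottom(i) for i in range(x + 1))

# Consistent case: nothing codes a proof of bottom, so the order is just <.
consistent = make_prec(lambda i: False)

# Hypothetical inconsistent case: pretend that 3 codes a proof of bottom.
fake_prf = lambda i: i == 3
inconsistent = make_prec(fake_prf)

# From the first y with not F(y) onwards, y + 1 lies below y,
# which gives a descending sequence 3, 4, 5, ... in the reversed order.
descending = all(inconsistent(y + 1, y) for y in range(3, 30))
```

In the consistent case the relation coincides with the usual order on the natural numbers, which is why its order type is $\omega$; in the pretended inconsistent case the relation has an infinite descending sequence, so PRWO would fail for it.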
It follows from Kreisel’s counterexample that the naive definition of the proof-theoretic ordinal of $T$ will always yield proof-theoretic ordinals ≤ ω. It is therefore not possible to define the proof-theoretic ordinal of a theory or a formal system as the order type of the “shortest primitive recursive well-ordering” by which the consistency of the theory can be established in a “finitist” way. The well-ordering in the counterexample given here is admittedly extremely artificial. But whatever definition we choose, it is nearly impossible to avoid artificial counterexamples. The true reason for that is the fact that we have to refer to presentations of ordinals by primitive recursive well-orderings (or equivalently to ordinal notations) instead of ordinals themselves. Therefore these definitions become extremely sensitive to the way in which ordinals are presented. In all cases in which we have a “canonical” representation or a canonical notation system these abnormalities will not occur. But until today all attempts to give a mathematical definition of a “canonical well-ordering” have failed.
7.5.4 Gentzen’s Consistency Proof in the Light of Hilbert’s Programme

In the former sections, we did not do much more than repeat Gentzen’s consistency proof for pure number theory in a different language. Our motivation, however, was mainly ordinal analysis and not primarily consistency. When Gentzen first presented his proof it was viewed as a salvation of Hilbert’s programme. As cited in the introduction, Bernays regarded Gentzen’s proof as a proof in the guise of Hilbert’s programme. Bernays believed that only a tiny extension of Hilbert’s finitist standpoint was needed to justify Gentzen’s proof. We want to discuss how tiny the extension really is. For this purpose we imagine an opponent who seriously doubts the consistency of the theory NT (or an equivalent theory) and whom we try to convince by Gentzen’s proof. Of course we assume that our opponent is able to follow mathematical reasoning in an unbiased way. We present the proof as constructively as possible (which we did not do here) and he or she will probably accept all steps in the proof but finally utter his or her uneasiness with the induction up to ε0 that we needed in establishing the syntactical consistency. This, (s)he says, is a bit beyond his or her finitist horizon. We, therefore, try to substantiate this induction as “finitistly” as possible. We avoid talking about ordinals and just want to convince him or her that the order relation we used in showing the syntactical consistency is well-founded. Therefore we introduce the order relation as in Sect. 3.3.1 (without mentioning its “ordinal origin”) and, in showing its well-foundedness, will more or less inevitably end up with an argumentation that is essentially that of Sect. 7.4.
Given an $\alpha < \varepsilon_0$ we find an $n$ such that $\alpha < \exp^{n}(\omega, 0)$ and it suffices to secure the well-foundedness up to $\exp^{n}(\omega, 0)$, i.e., to prove $TI(\exp^{n}(\omega, 0), X)$. This is obtained by iterated application of (7.7). Analyzing the proof of (7.7) more closely we see that the crucial argument there is an iterated application of

$$NT \vdash TI(\alpha, \mathcal{J}(X)) \rightarrow TI(\omega^{\alpha}, X). \tag{i}$$

Defining $\mathcal{J}_0(X) = X$ and $\mathcal{J}_{k+1}(X) = \mathcal{J}(\mathcal{J}_k(X))$, we have to start with the trivial statement $TI(0, \mathcal{J}_n(X))$ and then decrease the number of jumps until we reach $TI(\exp^{n}(\omega, 0), X)$. The main point in proving (i), however, was to show

$$NT \vdash Prog(X) \rightarrow Prog(\mathcal{J}(X)) \tag{ii}$$

which in turn needed the proof of

$$\sigma < \alpha \rightarrow \xi + \omega^{\sigma} \cdot k \subseteq X \tag{iii}$$

by Mathematical Induction on $k$. So to show

$$NT \vdash Prog(\mathcal{J}_l(X)) \rightarrow Prog(\mathcal{J}_{l+1}(X)) \tag{iv}$$

we need Mathematical Induction for the formula

$$\sigma < \alpha \rightarrow \xi + \omega^{\sigma} \cdot k \subseteq \mathcal{J}_l(X). \tag{v}$$

This means that we need the induction scheme for formulas of the complexity of $\mathcal{J}_l(X)$. We did not pay attention to defining $\mathcal{J}(X)$ parsimoniously. But even under the most careful definition of $\mathcal{J}(X)$ we will have at least $l$ quantifiers in the defining formulas for $\mathcal{J}_l(X)$, which means that $\mathcal{J}_l(X)$ belongs at least to the $l$th level in the arithmetical hierarchy. To reach ε0 we, therefore, must not restrict the complexities of the formulas in the scheme of Mathematical Induction. At this point our opponent will argue that in doing so we exhaust full first-order number theory and even a bit more.¹⁰ But (s)he doubts full number theory. Therefore (s)he cannot accept the proof. We can hardly advance a mathematical argument against that. This situation is in complete accordance with Gödel’s second incompleteness theorem. If Gödel’s second incompleteness theorem is more than a mere formal triviality but has a genuine content one cannot expect to bypass it by a tiny extension of the finitist standpoint. Therefore one also cannot expect to obtain proof-theoretical results in the spirit of Hilbert’s programme in the narrow sense as he had put it in [41].

But Hilbert’s programme also has a more general aspect. On some occasions, e.g. in his talk “Über das Unendliche” [42], he talks about the elimination of ideal objects. In his opinion infinite sets (and similar mathematical objects) only serve as ideal elements that are needed to obtain information about concrete objects. This part of Hilbert’s programme is at least partially realized by Friedman’s and Simpson’s programme of reverse mathematics. Surprisingly much of mathematics can already be obtained in a theory $(RCA)_0$ whose proof-theoretical strength is that of $(\Sigma^0_1\text{–Ind})$ (hence incorporates what we wanted to call finitist).¹¹

Although Gentzen’s result is of little help in the spirit of Hilbert’s finitist programme it sheds some light on the consistency of number theory, which is more in the spirit of Brouwer’s approach. Looking more carefully at the consistency proof (as we have sketched it in Sect. 7.5.2) we see that the consistency proof for NT can be formalized within a formal system that is based on a constructive logic (i.e., a logic that avoids proofs by contradiction and, therefore, excludes the possibility of proving the existence of mathematical objects without having constructed them) and whose only nonfinitist feature is an induction over quantifier free formulas along an elementary definable ordering that possesses no infinite primitive recursively definable paths. Since such an ordering (which is essentially the ordering we introduced in Sect. 3.3.1) is easy to visualize it is intuitively plain that this system should be consistent (although its proof-theoretical ordinal is above ε0). This form of reductive proof theory is in full coherence with Gödel’s second incompleteness theorem.

10 The “bit more” is that we obtain $NT \vdash TI(\exp^{n}(\omega, 0))$ for all $n$ only outside of NT. The quantifier “for all $n$” cannot be formalized in NT.
11 One should, however, be aware that just combinatorial statements, e.g. Ramsey’s theorem in the strengthened form by Paris and Harrington, are the statements that turn out to be unprovable in NT and are thus not finitistly provable although they talk only about finite objects. This thwarts Hilbert’s idea of justifying the infinite by the finite.
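To see the descent through the jumps from Sect. 7.5.4 concretely, the following worked chain treats the case $n = 3$. It is a sketch under one assumption about notation that is not stated in this section, namely that the iterated exponential satisfies $\exp^{0}(\omega, 0) = 0$ and $\exp^{k+1}(\omega, 0) = \omega^{\exp^{k}(\omega, 0)}$. Each implication is an instance of (i) with the indicated jump substituted for $X$.

```latex
\begin{align*}
 & TI(0, \mathcal{J}_3(X)) && \text{trivial starting point} \\
\Rightarrow\ & TI(\omega^{0}, \mathcal{J}_2(X)) = TI(1, \mathcal{J}_2(X))
   && \text{(i) with } \mathcal{J}_2(X) \text{ for } X \\
\Rightarrow\ & TI(\omega^{1}, \mathcal{J}_1(X)) = TI(\omega, \mathcal{J}_1(X))
   && \text{(i) with } \mathcal{J}_1(X) \text{ for } X \\
\Rightarrow\ & TI(\omega^{\omega}, X) = TI(\exp^{3}(\omega, 0), X)
   && \text{(i) with } X \text{ itself}
\end{align*}
```

The step that uses $\mathcal{J}_l(X)$ in place of $X$ relies on Mathematical Induction for a formula of the complexity of $\mathcal{J}_l(X)$, which is why no fixed restriction of the induction scheme suffices for all $n$.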
Chapter 8
Autonomous Ordinals and the Limits of Predicativity
The ordinal analysis of NT is a paradigmatic example for predicative proof theory. We discuss the notion of predicativity in Sect. 8.3 below. The rest of the chapter is dedicated to the presentation of the famous result of Solomon Feferman and Kurt Schütte about the limits of predicativity.¹
8.1 The Language $L_\kappa$

8.1.1 Definition Let $\kappa$ be an ordinal. We introduce the language $L_\kappa$. The logical symbols of $L_\kappa$ are
• Countably many free set variables, denoted by $X, Y, Z, X_1, \ldots$
• The binary relation symbols $=$, $\neq$, $\varepsilon$, $\not\varepsilon$.
• The logical connectives $\bigwedge$ and $\bigvee$.

The nonlogical symbols of $L_\kappa$ are
• A constant $0$ for the natural number 0.
• Constants for all primitive recursive functions.

Terms are defined inductively by the clauses
• The constant $0$ is a term.
• If $t_1, \ldots, t_n$ are terms and $f$ is a symbol for an $n$-ary primitive recursive function then $(f t_1, \ldots, t_n)$ is a term.

Observe that the value $t^{\mathbb{N}}$ of a term $t$ is effectively, i.e., recursively, computable.

1 With the exception of the definition of the language $L_\kappa$ this chapter will not be used in the later chapters.
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, © Springer-Verlag Berlin Heidelberg 2009
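We define formulas and their types inductively by the following clauses.

• Let $s$ and $t$ be terms. If $s^{\mathbb{N}} = t^{\mathbb{N}}$ then $(s = t) \in \bigwedge$–type and $(s \neq t) \in \bigvee$–type. If $s^{\mathbb{N}} \neq t^{\mathbb{N}}$ then $(s = t) \in \bigvee$–type and $(s \neq t) \in \bigwedge$–type. It is $FV(s = t) = FV(s \neq t) = \emptyset$.
• If $s$ is a term and $X$ a set variable then $(s \mathrel{\varepsilon} X)$ and $(s \mathrel{\not\varepsilon} X)$ are atomic formulas. These atomic formulas have no type. It is $FV(s \mathrel{\varepsilon} X) = FV(s \mathrel{\not\varepsilon} X) = \{X\}$.
• If $\lambda < \kappa$ and $\langle F_\xi \mid \xi < \lambda \rangle$ is a sequence of formulas such that $\bigcup_{\xi<\lambda} FV(F_\xi)$ is finite then $(\bigwedge_{\xi<\lambda} F_\xi) \in \bigwedge$–type and $(\bigvee_{\xi<\lambda} F_\xi) \in \bigvee$–type and $FV(\bigwedge_{\xi<\lambda} F_\xi) = FV(\bigvee_{\xi<\lambda} F_\xi) = \bigcup_{\xi<\lambda} FV(F_\xi)$.
• Atomic formulas and all members of $\bigwedge$–type $\cup$ $\bigvee$–type are formulas.
• We define the characteristic sequence of a formula in $\bigwedge$–type $\cup$ $\bigvee$–type by

$$CS(F) = \begin{cases} \emptyset & \text{if } F \text{ is of the shape } s = t \text{ or } s \neq t \\ \langle F_\xi \mid \xi < \lambda \rangle & \text{if } F = \bigcirc_{\xi<\lambda} F_\xi \text{ for } \bigcirc \in \{\bigwedge, \bigvee\}. \end{cases}$$

The formal negation $\sim F$ of a formula $F$ is defined by
• $\sim(s = t)$ is $(s \neq t)$ and $\sim(s \neq t)$ is $(s = t)$.
• $\sim(s \mathrel{\varepsilon} X)$ is $(s \mathrel{\not\varepsilon} X)$ and $\sim(s \mathrel{\not\varepsilon} X)$ is $(s \mathrel{\varepsilon} X)$.
• $\sim(\bigwedge_{\xi<\eta} F_\xi)$ is $(\bigvee_{\xi<\eta} \sim F_\xi)$ and $\sim(\bigvee_{\xi<\eta} F_\xi)$ is $\bigwedge_{\xi<\eta} \sim F_\xi$.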
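8.2 Semantics for $L_\kappa$

Since we are only interested in the meaning of formulas of $L_\kappa$ in the standard model $\mathbb{N}$ we will only define $\mathbb{N} \models F[\Phi]$ for an assignment of subsets of $\mathbb{N}$ to the set variables in $FV(F)$.

8.2.1 Definition Let $\Phi : FV(F) \longrightarrow Pow(\mathbb{N})$ be an assignment. We define

$$\mathbb{N} \models F[\Phi] :\Leftrightarrow \begin{cases} F \text{ is a formula } (s \mathrel{\varepsilon} X) \text{ and } s^{\mathbb{N}} \in \Phi(X) \\ F \text{ is a formula } (s \mathrel{\not\varepsilon} X) \text{ and } s^{\mathbb{N}} \notin \Phi(X) \\ F \in \bigwedge\text{–type and } \mathbb{N} \models G[\Phi] \text{ for all } G \in CS(F) \\ F \in \bigvee\text{–type and } \mathbb{N} \models G[\Phi] \text{ for some } G \in CS(F). \end{cases}$$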
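For the fragment with finite sequences only, the clauses for types, characteristic sequences, negation and satisfaction can be turned into a small evaluator. The tuple encoding and the function names are assumptions of this sketch, and terms are represented by their already computed values; genuinely infinitary formulas are of course out of reach of such a program.

```python
# Hypothetical encoding (finite sequences only), used only in this sketch:
#   ("eq", s, t), ("neq", s, t)      s, t natural numbers (values of terms)
#   ("in", s, "X"), ("notin", s, "X")
#   ("and", [F0, ..., Fk]), ("or", [F0, ..., Fk])

def formula_type(f):
    """Return 'and' for the big-conjunction type, 'or' for the
    big-disjunction type, and None for the untyped set-variable atoms."""
    tag = f[0]
    if tag == "eq":
        return "and" if f[1] == f[2] else "or"
    if tag == "neq":
        return "or" if f[1] == f[2] else "and"
    if tag in ("and", "or"):
        return tag
    return None

def cs(f):
    """Characteristic sequence: empty for (in)equations."""
    return list(f[1]) if f[0] in ("and", "or") else []

def neg(f):
    """Formal negation, pushed through to the atoms."""
    dual = {"eq": "neq", "neq": "eq", "in": "notin", "notin": "in"}
    tag = f[0]
    if tag in dual:
        return (dual[tag],) + f[1:]
    if tag == "and":
        return ("or", [neg(g) for g in f[1]])
    if tag == "or":
        return ("and", [neg(g) for g in f[1]])
    raise ValueError(f"unknown formula: {f!r}")

def holds(f, phi):
    """Satisfaction in the standard model under the assignment phi
    (a dict from set-variable names to sets of natural numbers)."""
    tag = f[0]
    if tag == "in":
        return f[1] in phi[f[2]]
    if tag == "notin":
        return f[1] not in phi[f[2]]
    if formula_type(f) == "and":
        return all(holds(g, phi) for g in cs(f))
    return any(holds(g, phi) for g in cs(f))

phi = {"X": {0, 2}}
example = ("or", [("in", 1, "X"),
                  ("and", [("eq", 2, 2), ("notin", 3, "X")])])
```

A true equation is a conjunction over the empty sequence and a false one a disjunction over the empty sequence, so both are evaluated correctly without a separate clause. On such examples one can also check that a formula and its formal negation never receive the same truth value.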
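Again we get directly $\mathbb{N} \models \sim F[\Phi] \Leftrightarrow \mathbb{N} \not\models F[\Phi]$ for any assignment $\Phi$. Therefore we regularly use $\neg F$ as a synonym for $\sim F$. Observe that the Tait-language of arithmetic may be regarded as a sublanguage of $L_{\omega_1}$. We leave atomic formulas untouched and put recursively $(A \wedge B)^{*} := \bigwedge \langle A^{*}, B^{*} \rangle$, $(A \vee B)^{*} := \bigvee \langle A^{*}, B^{*} \rangle$, $((\forall x)F_u(x))^{*} := \bigwedge_{n \in \omega} F_u(\underline{n})^{*}$ and $((\exists x)F_u(x))^{*} := \bigvee_{n \in \omega} F_u(\underline{n})^{*}$. Then we get $\{F^{*} \mid F \in L(NT)\} \subseteq L_{\omega_1}$ and $\mathbb{N} \models F[\Phi] \Leftrightarrow \mathbb{N} \models F^{*}[\Phi]$ for all assignments $\Phi$. It follows from Definition 8.2.1 that we can transfer Definition 5.4.3 of the verification calculus for $L(NT)$ to the language $L_\kappa$. Therefore we define

$$\models^{\alpha} \Delta \tag{8.1}$$

for finite sets of $L_\kappa$-formulas literally as in Definition 5.4.3. The difference is that in the new definition we may have characteristic sequences which are longer than $\omega$. Then

$$\models^{\alpha} \Delta \;\Rightarrow\; \mathbb{N} \models \bigvee \Delta[\Phi] \text{ for every assignment } \Phi \tag{8.2}$$

follows directly from Definition 8.2.1. The opposite direction of (8.2), however, does not hold in general. It fails for $\omega_1 < \kappa$. For $\kappa \le \omega_1$, however, we can adapt the proof of Theorem 5.4.9. First we have to modify Definition 5.4.5. Since all ordinals below $\omega_1$ are countable we assume that for every ordinal $\alpha < \omega_1$ there is an enumeration $\langle \alpha_i \mid i \in \omega \rangle$ of the ordinals less than $\alpha$. We modify clauses $(S_{\bigwedge})$ and $(S_{\bigvee})$ in Definition 5.4.5 as follows:

$(S_{\bigwedge})$ If the redex of $\delta(s)$ is $\bigwedge_{\xi<\alpha} A_\xi$ then $s^\frown\langle i \rangle \in S_\Delta$ for all $i \in \omega$ and $\delta(s^\frown\langle i \rangle) := \delta(s)^{r}, A_{\alpha_i}$, where $\langle \alpha_i \mid i \in \omega \rangle$ enumerates the ordinals less than $\alpha$, and

$(S_{\bigvee})$ If the redex of $\delta(s)$ is $\bigvee_{\xi<\alpha} A_\xi$ then $s^\frown\langle 0 \rangle \in S_\Delta$ and we define $\delta(s^\frown\langle 0 \rangle) := \delta(s)^{r}, A_{\alpha_i}, \bigvee_{\xi<\alpha} A_\xi$ for the first number $i$ such that $A_{\alpha_i} \notin \bigcup_{r \subseteq s} \delta(r)$, where $\langle \alpha_i \mid i \in \omega \rangle$ enumerates the ordinals less than $\alpha$, or $\delta(s^\frown\langle 0 \rangle) := \delta(s)$ if such an $i$ does not exist.

It is obvious that this definition cannot be extended to languages $L_\kappa$ with $\omega_1 < \kappa$. The Syntactical Main Lemma carries over literally and so does the Semantical Main Lemma. What is changed is the order-type of the search trees. All search trees are still countably branching but in general not recursive. Since $\omega_1$ is a regular cardinal every countably branching well-founded tree has an order-type below $\omega_1$. The ω-completeness theorem (Theorem 5.4.9) is thus changed in the following way.

8.2.2 Theorem Let $F$ be an $L_{\omega_1}$-formula. Then $(\forall \Phi)\, \mathbb{N} \models F[\Phi] \;\Leftrightarrow\; (\exists \alpha < \omega_1)[\models^{\alpha} F]$.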
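8.2.3 Exercise Show that the Completeness Theorem fails for $L_\kappa$ with $\kappa > \omega_1$. Hint: For $M \subseteq \mathbb{N}$ define

$$F_M(n) := \begin{cases} \underline{n} \mathrel{\varepsilon} X & \text{if } n \in M \\ \underline{n} \mathrel{\not\varepsilon} X & \text{if } n \notin M \end{cases} \qquad \text{and} \qquad F_M^{\wedge} := \bigwedge \{F_M(n) \mid n \in \mathbb{N}\}.$$

Let $F := \bigvee \{F_M^{\wedge} \mid M \subseteq \mathbb{N}\}$. First show $\mathbb{N} \models F[\Phi]$ for all assignments $\Phi$. Then prove

• If $\models^{\alpha} \Delta$ then $\Delta$ cannot have the form $F, F^{\wedge}_{M_{i_1}}, \ldots, F^{\wedge}_{M_{i_m}}, F_{M_{j_1}}(k_1), \ldots, F_{M_{j_n}}(k_n)$ for natural numbers $i_1, \ldots, i_m$ and $j_1, \ldots, j_n$ and pairwise different natural numbers $k_1, \ldots, k_n$

by induction on the definition of $\models^{\alpha} \Delta$.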
8.3 Autonomous Ordinals

The basis for the following definition is the notion of predicativity. The notion of predicativity is still controversial. Therefore we define and discuss predicativity here in a purely mathematical, and thus perhaps oversimplified, setting. We consider a notion to be impredicative if it is defined under recourse to an entity to which it belongs itself. The standard (and evil) example of an impredicative definition is Russell’s paradoxical set $R := \{x \mid x \notin x\}$. The set $R$ is defined by recourse to all sets and is supposed to be a set itself. Therefore the paradoxical question $R \in R$ is permitted. But observe that our definition of the fixed-point of a monotone operator $\Gamma$ as the least $\Gamma$-closed set is also impredicative in this sense.

How can such impredicative notions be avoided? The safe way is to introduce ramifications. Every set gets a stage, say an ordinal $\alpha$, and all members of a set of stage $\alpha$ are supposed to have lower stages. This is essentially the way proposed by Russell. Such an approach, however, does not reflect mathematical practice. In everyday mathematics we do not talk about ramified objects, e.g. ramified real numbers. Therefore Russell introduced the axiom of reducibility to get rid of ramifications again. This axiom, however, has been regarded as unsatisfactory in different ways. A possible way to introduce ramified sets is given by the constructible hierarchy of sets introduced by Gödel. The constructible hierarchy has therefore attained a predominant role in (impredicative) proof theory. We postpone, however, the introduction of constructible sets until Sect. 11 and discuss here predicativity in the framework of the infinitary language $L_\kappa$.

Assume informally that we want to construct a mathematical universe of subsets of $\mathbb{N}$ from below. We start from scratch and accept only those sets which can be obtained by elementary methods. These, in our sense, are exactly the finite sets which can be defined by formulas in $L_\omega$. It is then at least plausible that this cannot lead us beyond $\omega$. To get further we therefore have to accept $\omega$, i.e., the axiom of infinity. But then we can also accept all objects which we can reach from $\omega$ in finitely many steps. Therefore we accept infinitary formulas of $L_{\omega_1}$ of length $\omega + n$ for arbitrary finite $n$ and also infinite proof trees of lengths $\omega + n$. That means that we may work in the semi-formal system of Definition 7.3.5 with formulas and proof trees of lengths below $\omega \cdot 2$. So we accept everything which we can prove within this segment of the semi-formal system. The obvious way to come to stronger infinities is then to define an ordering ≺ on $\omega$ and to prove its well-foundedness within the restricted semi-formal system. Whenever we succeed in showing $\vdash^{\omega+n}_{\omega+k} TI(\prec)$ for such an ordering ≺ we accept $\alpha := otyp(\prec)$ as a new infinity and also everything which can be reached from $\alpha$ in finitely many steps. That means that we can extend the semi-formal systems to formulas and proof trees of lengths below $\alpha + \omega$. Again we use the so extended semi-formal system to obtain stronger infinities and iterate this process as far as possible.

This approach is apparently perfectly predicative. To reach new infinities we only use the means which we have so far secured in a very strict sense. In a very strict sense because we do not only require that all notions in the definition of a new object are previously secured but also that a proof of the infinity, i.e., the well-foundedness, of the new objects only needs previously secured means.² We therefore talk about predicativity in the narrow sense. The obvious question is whether this procedure will eventually come to a standstill and, if so, how large a segment of the ordinals can be secured by this method. These questions will be answered in this section. To make these questions mathematically precise we introduce some notions.
8.3.1 Definition For a formula $F \in L_{\omega_1}$ we define

$$rnk(F) := \begin{cases} 0 & \text{if } F \text{ is atomic or a formula } s = t \text{ or } s \neq t \\ \sup\{rnk(F_\xi) + 1 \mid \xi < \lambda\} & \text{if } F = \bigcirc_{\xi<\lambda} F_\xi \text{ for } \bigcirc \in \{\bigwedge, \bigvee\}. \end{cases}$$

We extend Definition 7.3.5 to formulas of $L_{\omega_1}$ and call that semi-formal system $NT_{\omega_1}$. In Sect. 7.3 we have formulated and proved all theorems in such a way that they also hold for the system $NT_{\omega_1}$. To grasp the informal description of the construction of a universe from below we define the autonomous closure $Aut(\alpha)$ of an ordinal $\alpha$. First let $\alpha^{*} := \min\{\lambda \mid \lambda \in Lim \wedge \alpha < \lambda\}$ denote the first limit ordinal above $\alpha$. Since all ordinals below $\alpha^{*}$ can be accessed from $\alpha$ by finitely many steps we anticipate that all ordinals below $\alpha^{*}$ are predicatively accessible from $\alpha$.
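For formulas built from finite sequences only, the rank of Definition 8.3.1 is a natural number and can be computed by a direct recursion. The tuple encoding below is an assumption of this sketch (the same hypothetical shape as one might use for a finite fragment of the language), and the supremum of the empty set is taken to be 0.

```python
# Hypothetical encoding (finite sequences only), used only in this sketch:
#   ("eq", s, t), ("neq", s, t), ("in", s, "X"), ("notin", s, "X")
#   ("and", [F0, ..., Fk]), ("or", [F0, ..., Fk])

def rnk(f):
    """Rank: 0 for atomic formulas and (in)equations, otherwise the
    supremum of rnk(G) + 1 over the members G of the sequence."""
    if f[0] in ("and", "or"):
        return max((rnk(g) + 1 for g in f[1]), default=0)
    return 0

nested = ("or", [("in", 1, "X"),
                 ("and", [("eq", 2, 2), ("notin", 3, "X")])])
```

Here `nested` has rank 2, since its second disjunct is a conjunction of rank 1. For infinite sequences the supremum can be a limit ordinal, which this finite sketch does not capture.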
2 This is the difference to the familiar definition of the constructible hierarchy which is based on iterating definability and whose definition is therefore only locally predicative (cf. Sect. 11). A similar remark applies to the definition of the stages of an arithmetically definable inductive definition. Here too, every single step is predicatively defined. But to iterate the stages until the fixed-point is reached we simultaneously have to secure the well-foundedness of the next iteration steps only using the means of the so far obtained iterations. In that sense inductive definitions, too, are only locally predicative. This fact is used in the ordinal analysis of arithmetically definable inductive definitions (cf. Sect. 9).
8.3.2 Definition The autonomous closure $Aut(\alpha)$ of an ordinal $\alpha$ is inductively defined by the following clauses.
• $\alpha \in Aut(\alpha)$
• If $\beta \in Aut(\alpha)$ then $\beta^{*} \subseteq Aut(\alpha)$
• If $\lambda \in Aut(\alpha)$ and ≺ is an ordering of $\omega$ which is definable by a formula of rank less than $\lambda$ and there are ordinals $\xi$ and $\rho$ less than $\lambda$ such that $NT_{\omega_1} \vdash^{\xi}_{\rho} TI(\prec, X)$ then $otyp(\prec)$ belongs to $Aut(\alpha)$.
The subsets of N which are definable in LAut(ω ) are apparently the members of the predicative universe constructed on ω .
8.4 The Upper Bound for Autonomous Ordinals

To obtain an upper bound for $Aut(\alpha)$ we modify the definition of $Aut(\alpha)$ slightly and define

$$\Delta_{\omega_1}(\alpha) := \{tc(F) \mid F \in L_{\omega_1} \wedge rnk(F) < \alpha \wedge (\exists \xi < \alpha)(\exists \rho < \alpha)[NT_{\omega_1} \vdash^{\xi}_{\rho} F]\} \tag{8.3}$$

and put
• $\Delta^{*}_{\omega_1}(\alpha)$ is the least set which contains $\alpha$ as element and is closed under ordinal successor and the function $\Delta_{\omega_1}$.
• We say that an ordinal $\alpha$ is $\Delta_{\omega_1}$-closed if $\Delta_{\omega_1}(\xi) \subseteq \alpha$ holds for all $\xi < \alpha$.

8.4.1 Lemma We have $Aut(\alpha) \subseteq \Delta^{*}_{\omega_1}(\alpha)$ for all countable ordinals $\alpha$.

Proof We show that $\beta \in Aut(\alpha)$ implies $\beta \in \Delta^{*}_{\omega_1}(\alpha)$ by induction on the definition of $\beta \in Aut(\alpha)$. This is obvious for $\beta = \alpha$ and follows from the closure of $\Delta^{*}_{\omega_1}(\alpha)$ under ordinal successor if $\beta < \lambda^{*}$ for some $\lambda \in Aut(\alpha)$. Now let $\beta = otyp(\prec)$ and $NT_{\omega_1} \vdash^{\xi}_{\rho} TI(\prec, X)$ for some $\xi$ and $\rho$ less than $\lambda \in Aut(\alpha)$. Without loss of generality we can assume that $\beta$ is a limit ordinal and obtain $tc(TI(\prec, X)) = \beta$ by the Boundedness Theorem (Theorem 6.7.2). Hence $\beta \in \Delta^{*}_{\omega_1}(\alpha)$.

8.4.2 Theorem The ordinals in $\{\omega\} \cup SC$ are $\Delta_{\omega_1}$-closed.

Proof Assuming $\alpha \in \{\omega\} \cup SC$, $m, n < \omega$ and $NT_{\omega_1} \vdash^{m}_{n} F$, we obtain $NT_{\omega_1} \vdash^{\exp^{n}(2, m)}_{0} F$ by Lemma 7.3.13, which entails $tc(F) < \omega$ by the Boundedness Theorem. So $\omega$ is $\Delta_{\omega_1}$-closed. If $\alpha \in SC$, $\xi, \eta < \alpha$ and $NT_{\omega_1} \vdash^{\xi}_{\eta} F$ we obtain $tc(F) \le \varphi_\xi(\eta) < \alpha$ by Lemma 7.3.14 and the Boundedness Theorem. So $\alpha$ is $\Delta_{\omega_1}$-closed.
Since all ordinals in $\{\omega\} \cup SC$ are limit ordinals, the next two corollaries are immediate.

8.4.3 Corollary All ordinals in $\{\omega\} \cup SC$ are $\Delta^{*}_{\omega_1}$-closed.

8.4.4 Corollary We have $Aut(0) = \omega$ and $Aut(\omega) \subseteq \Gamma_0$.

8.4.5 Exercise We use class terms as abbreviations as introduced in Sect. 8.5 and extend the translation of the first-order Tait-language of arithmetic to the second-order language by defining

$$((\forall X)F_U(X))^{*} := \bigwedge \{F_U(\{x \mid A(x)\}) \mid rnk(A) < \omega\}$$

and

$$((\exists X)F_U(X))^{*} := \bigvee \{F_U(\{x \mid A(x)\}) \mid rnk(A) < \omega\}.$$

(a) Prove that for every $L(NT)$-formula $F(X_1, \ldots, X_k, x_1, \ldots, x_l)$ there is a finite ordinal $j$ such that $rnk(F(A_1, \ldots, A_k, \underline{n}_1, \ldots, \underline{n}_l)^{*}) \le \omega + j$ holds true for all tuples of arithmetical formulas $A_1, \ldots, A_k$ and all tuples $n_1, \ldots, n_l$ of natural numbers.

(b) Let $\Delta$ be a finite set of $L_{\omega_1}$-formulas. Use $\Delta_U(A)$ as an abbreviation for $\{F_U(\{x \mid A(x)\}) \mid F \in \Delta\}$. Assume $NT_{\omega_1} \vdash^{\alpha}_{0} \Delta$ and show that this implies $NT_{\omega_1} \vdash^{2 \cdot rnk(A) + \alpha}_{0} \Delta_U(A)$. Prove moreover that $NT_{\omega_1} \vdash^{\alpha}_{\rho} \Delta$ for $\rho \neq 0$ implies $NT_{\omega_1} \vdash^{2 \cdot rnk(A) + \alpha}_{rnk(A) + \rho} \Delta_U(A)$.

(c) Let $\vdash^{n}_{\mathsf{T}_2} \Delta(X_1, \ldots, X_k, x_1, \ldots, x_l)$ be a derivation in the sense of second-order logic (cf. Sect. 4.6). Show that $NT_{\omega_1} \vdash^{\omega+n}_{0} \Delta(A_1, \ldots, A_k, \underline{n}_1, \ldots, \underline{n}_l)^{*}$ holds true for all tuples $A_1, \ldots, A_k$ of arithmetical formulas and all tuples $n_1, \ldots, n_l$ of natural numbers.

(d) Show that $NT_{\omega_1} \vdash^{\omega+\omega+k}_{0} A^{*}$ holds true for all axioms of $(\Delta^1_0\text{–CA})$.

(e) Use the previous results to prove $\|(\Delta^1_0\text{–CA})\| \le \varphi_1(\varepsilon_0)$ and compare the result with Exercise 7.4.8.

8.4.6 Exercise Modify part (d) of Exercise 8.4.5 to obtain $\|(\Delta^1_0\text{–CA})_0\| \le \varepsilon_0$.

8.4.7 Exercise Let $(\Delta^1_0\text{–CA}) + (BR)$ be the formal system which comprises all axioms of $(\Delta^1_0\text{–CA})$, all rules of weak second-order logic $\vdash^{m}_{\mathsf{T}_2}$ together with a cut-rule and the bar-rule

(BR) $\vdash^{m}_{r} TI(\prec, X) \;\Rightarrow\; \vdash^{n}_{r} TI(\prec, F)$ for all $n > m$ and all formulas $F$ in the second-order language $L(NT)$ if ≺ is a primitive recursive ordering.

(a) Show that there is an embedding theorem of the form

$$(\Delta^1_0\text{–CA}) + (BR) \vdash F \;\Rightarrow\; (\exists k < \omega)(\exists l < \omega)\left[ NT_{\omega_1} \vdash^{\varphi_1^{l}(0)}_{\omega+k} F^{*} \right].$$

Hint: Use cut-elimination, boundedness and a modification of equation (6.6) to obtain a translation of the bar-rule.

(b) Use (a) to prove $\|(\Delta^1_0\text{–CA}) + (BR)\| \le \varphi_2(0)$. Is this bound exact?
8.5 The Lower Bound for Autonomous Ordinals
It follows from Corollary 8.4.4 that Γ0 is an upper bound for predicativity in the narrow sense delineated above. This was first observed by Schütte [85], [87] and [88] and independently by Feferman [23]. Both authors could, however, also show that
$$\Gamma_0 \subseteq \mathrm{Aut}(\omega). \tag{8.4}$$
So, once we have accepted ω, the ordinal Γ0 is the exact bound of predicativity in the narrow sense. The proof of (8.4) is the aim of the present section. In Sect. 3.4.3 we introduced a notation system for the ordinals below Γ0. Therefore we can talk about ordinals less than or equal to Γ0 in the language of arithmetic. Again we will identify ordinals and their notations. Lower case Greek letters will vary over ordinals. We use all the abbreviations introduced in Sect. 7.4. Recall especially the formulas α ⊆ β, Prog(X) and TI(α, X) defined there. Since we have constants for all characteristic functions of primitive recursive relations, all primitive recursive relations are available in $L_{\omega_1}$. To improve readability we will, however, always write $R(t_1,\ldots,t_n)$ instead of $(\chi_R t_1,\ldots,t_n) = 1$. We regard L(NT) as a sublanguage of $L_{\omega_1}$ and use the familiar notations. So A → B stands for $\bigvee\{\neg A, B\}$, (∀x)F(x) for $\bigwedge\{F(n) \mid n \in \omega\}$ etc. We will also freely use class terms of the form $\{x \mid F(x)\}$ although they do not belong to the language. The formula $t \mathrel{\varepsilon} \{x \mid F(x)\}$ is to be read as an abbreviation for the formula F(t). The rank of a class term $\{x \mid F(x)\}$ is the rank of the formula F(t). By $K_\sigma := \{T \mid T \text{ is a class term and } \mathrm{rnk}(T) < \sigma\}$ we denote the collection of all class terms of ranks less than σ. Let
$$\mathrm{TI}_{\sigma}(\alpha) :\Leftrightarrow \bigwedge\{\mathrm{TI}(\alpha, S) \mid S \in K_\sigma\}.$$
Since ordinal bounds for derivations are crucial in the following proofs we have to prove everything step by step. To facilitate proving (and reading) we collect some properties which are often used. By $\vdash^{\alpha}_{\rho} \Delta$ we always understand in this section $\mathrm{NT}_{\omega_1} \vdash^{\alpha}_{\rho} \Delta$. In the TAIT-calculus a claim F derived from a finite set ∆ of hypotheses appears in the form $\vdash^{\alpha}_{\rho} \neg\Delta, F$, where $\neg\Delta := \{\neg G \mid G \in \Delta\}$. First we prove a more or less obvious property of $L_\kappa$-sentences.
8.5.1 Lemma Let F be a true $L_\kappa$-sentence with rnk(F) = α. Then $\vdash^{\alpha}_{0} \Delta, F$ for any finite set of $L_\kappa$-formulas ∆.
Proof We induct on rnk(F). If rnk(F) = 0 then F ∈ ⋀-type and we obtain $\vdash^{0}_{0} \Delta, F$ by an inference (⋀) with empty premise. If rnk(F) = α > 0 then we obtain rnk(G) < α for all G ∈ CS(F). If F ∈ ⋁-type there is some G ∈ CS(F) such that N ⊨ G. Hence $\vdash^{\alpha_G}_{0} \Delta, G$ for $\alpha_G := \mathrm{rnk}(G) < \alpha$. By an inference (⋁) we then obtain $\vdash^{\alpha}_{0} \Delta, F$. If F ∈ ⋀-type we have N ⊨ G for all G ∈ CS(F) and thus $\vdash^{\alpha_G}_{0} \Delta, G$ for all G ∈ CS(F) and $\alpha_G := \mathrm{rnk}(G) < \alpha$. By an inference (⋀) we then obtain $\vdash^{\alpha}_{0} \Delta, F$.
8.5.2 Lemma From $\vdash^{\alpha}_{\rho} \Delta, F(t)$ for 2·rnk(F(t)) ≤ α and rnk(F) < ρ we obtain $\vdash^{\alpha+1}_{\rho} \Delta, s \neq t, F(s)$.
Proof If $s^{N} = t^{N}$ we obtain
$$\vdash^{2\cdot\mathrm{rnk}(F(t))}_{0} \Delta,\ s \neq t,\ \neg F(t),\ F(s) \tag{i}$$
by the Tautology Lemma (Lemma 7.3.3). From (i) and the hypothesis $\vdash^{\alpha}_{\rho} \Delta, F(t)$ we obtain the claim by cut. If $s^{N} \neq t^{N}$ then (s ≠ t) ∈ ⋀-type and we obtain $\vdash^{\alpha+1}_{\rho} \Delta, s \neq t, F(s)$ by an inference (⋀) with empty premises.
8.5.3 Lemma (Conjunction Lemma) Assume $\vdash^{\alpha_0}_{\rho_0} \Delta, A$ and $\vdash^{\alpha_1}_{\rho_1} \Gamma, B$. Then we obtain $\vdash^{\alpha}_{\rho} \Delta, \Gamma, A \wedge B$ for all $\alpha \ge \max\{\alpha_0, \alpha_1\}$ and all $\rho \ge \max\{\rho_0, \rho_1\}$.
Proof We obtain $\vdash^{\alpha}_{\rho} \Delta, \Gamma, A, A \wedge B$ and $\vdash^{\alpha}_{\rho} \Delta, \Gamma, B, A \wedge B$ by the structural rule (Lemma 7.3.7). The claim follows now by an inference (⋀).
8.5.4 Lemma If $\vdash^{\alpha}_{\rho_0} \Delta, F$ and $\vdash^{\beta}_{\rho_1} \Gamma, \neg F$ then $\vdash^{\gamma}_{\rho+1} \Delta, \Gamma$ holds true for all ordinals $\gamma > \max\{\alpha, \beta\}$ and $\rho \ge \max\{\rho_0, \rho_1, \mathrm{rnk}(F) + 1\}$.
Proof The proof is straightforward by the Structural Lemma and cut.
Similar applications of the Structural Lemma will no longer be explicitly mentioned. Combining Lemmas 8.5.4 and 8.5.1, we obtain the Detachment Rule.
8.5.5 Lemma If $\vdash^{\alpha}_{\rho} \Delta, F$ and F is a false sentence such that $\mathrm{rnk}(F) < \min\{\alpha + 1, \rho\}$, then $\vdash^{\alpha+1}_{\rho} \Delta$.
For the rest of this section λ will always denote a limit ordinal. A rule which will often be used (mostly without mentioning it) is stated in the following lemma.
8.5.6 Lemma (⋁-Importation) Assume $\vdash^{\alpha}_{\rho} \Delta, F_1, \ldots, F_n$ for n > 1. Then we obtain $\vdash^{\alpha+n}_{\rho} \Delta, F_1 \vee \cdots \vee F_n$.
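Proof This is obvious by the structural rule and repeated applications of (⋁)-inferences.
Lower case Greek letters in formulas are supposed to vary over ordinal notations. To improve the readability of the following proofs we agree upon the following convention. Whenever we write
$$\vdash^{\alpha}_{\rho} \Delta(\xi_1, \ldots, \xi_n)$$
we have to read it as
$$\vdash^{\alpha}_{\rho} \xi_1 \mathrel{\not\varepsilon} OT, \ldots, \xi_n \mathrel{\not\varepsilon} OT,\ \Delta(\xi_1, \ldots, \xi_n).$$
It would only be confusing to make all the "hypotheses $\xi_i \mathrel{\varepsilon} OT$" explicit. According to this agreement it suffices to have $\vdash^{\alpha_\xi}_{\rho} \Delta, F(\xi)$ with $\alpha_\xi + 2 < \alpha$ for all ordinals ξ to derive $\vdash^{\alpha}_{\rho} \Delta, (\forall\xi)F(\xi)$. This holds true because $\vdash^{\alpha_\xi}_{\rho} \Delta, F(\xi)$ stands for $\vdash^{\alpha_\xi}_{\rho} \Delta, \xi \mathrel{\not\varepsilon} OT, F(\xi)$. If ξ is not an ordinal notation we have $\vdash^{0}_{0} \Delta, \xi \mathrel{\not\varepsilon} OT, F(\xi)$ by Lemma 8.5.1. So we have $\vdash^{\alpha_\xi}_{\rho} \Delta, \xi \mathrel{\not\varepsilon} OT, F(\xi)$ for all ξ which by ⋁-Importation implies $\vdash^{\alpha_\xi+2}_{\rho} \Delta, \xi \mathrel{\not\varepsilon} OT \vee F(\xi)$. By an inference (⋀) we finally get $\vdash^{\alpha}_{\rho} \Delta, (\forall\xi)F(\xi)$.
Because of $\vdash^{0}_{0} \xi \mathrel{\not\varepsilon} OT, \xi \mathrel{\varepsilon} OT$ we can derive $\vdash^{\alpha+1}_{\rho} \Delta(\xi), \xi \mathrel{\varepsilon} On \wedge F$ from the hypothesis $\vdash^{\alpha}_{\rho} \Delta(\xi), F$. We are going to use these facts tacitly.
Put
$$\mathrm{Jp}(S) := \{\eta \mid \eta \mathrel{\varepsilon} OT \wedge (\forall\xi)[\xi \subseteq S \to \xi + \eta \subseteq S]\}. \tag{8.5}$$
We call Jp(S) the jump of the class S. Observe that $S \in K_\lambda$ entails also $\mathrm{Jp}(S) \in K_\lambda$.
8.5.7 Lemma For $S \in K_\lambda$ there are ordinals α and ρ less than λ such that $\vdash^{\alpha}_{\rho} \eta \mathrel{\not\varepsilon} \mathrm{Jp}(S),\ \eta \subseteq S$.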
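Proof By Tautology (Lemma 7.3.3) we obtain
$$\vdash^{2\cdot\beta}_{0} \eta \mathrel{\not\varepsilon} \mathrm{Jp}(S),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S) \tag{i}$$
for β = rnk(Jp(S)) < λ. Hence
$$\vdash^{2\cdot\beta}_{0} \eta \mathrel{\not\varepsilon} \mathrm{Jp}(S),\ \neg(0 \subseteq S),\ \eta \subseteq S \tag{ii}$$
by ⋀-Inversion and ⋁-Exportation (Lemma 7.3.10). But
$$\vdash^{3}_{0} \neg(\xi \mathrel{\varepsilon} OT) \vee \neg(\xi < 0) \vee \xi \mathrel{\varepsilon} S$$
holds by Lemma 8.5.1 for all ordinals ξ. This implies
$$\vdash^{4}_{0} 0 \subseteq S \tag{iii}$$
by an (⋀)-inference. From (ii) and (iii) the claim follows by a cut.
8.5.8 Lemma For $S \in K_\lambda$ there are ordinals α and ρ less than λ such that
$$\vdash^{\alpha}_{\rho} \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \eta \subseteq S$$
for all ordinals η.
Proof We use Tautology to prove
$$\vdash^{\alpha_0}_{0} \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \mathrm{Prog}(\mathrm{Jp}(S)) \tag{i}$$
for $\alpha_0 = 2\cdot\mathrm{rnk}(\mathrm{Prog}(\mathrm{Jp}(S))) < \lambda$. By ⋀-Inversion and ⋁-Exportation we obtain from (i)
$$\vdash^{\alpha_0}_{0} \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S). \tag{ii}$$
From (ii) and Lemma 8.5.7 we obtain by cut
$$\vdash^{\alpha}_{\rho} \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \eta \subseteq S$$
for α and ρ less than λ.
8.5.9 Lemma For $S \in K_\lambda$ there is an ordinal α less than λ such that
$$\vdash^{\alpha}_{0} \eta \mathrel{\not\varepsilon} \mathrm{Jp}(S),\ \neg(\xi \subseteq S),\ \xi + \eta \subseteq S$$
for all ordinals ξ and η.
Proof Use Tautology to derive $\vdash^{\alpha_0}_{0} \eta \mathrel{\not\varepsilon} \mathrm{Jp}(S),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S)$ and then ⋀-Inversion and ⋁-Exportation to obtain the claim.
8.5.10 Lemma For $S \in K_\lambda$ there are ordinals α and ρ less than λ such that
$$\vdash^{\alpha}_{\rho} \neg\mathrm{Prog}(S),\ \mathrm{Prog}(\mathrm{Jp}(S)).$$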
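Proof By Tautology, ⋀-Inversion and ⋁-Exportation we obtain ordinals $\alpha_0$, $\alpha_1$ and $\alpha_2$ less than λ such that
$$\vdash^{\alpha_0}_{0} \neg(\xi \subseteq S),\ \neg(\zeta < \xi),\ \zeta \mathrel{\varepsilon} S, \tag{i}$$
$$\vdash^{\alpha_1}_{0} \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\eta_0 < \eta),\ \eta_0 \mathrel{\varepsilon} \mathrm{Jp}(S) \tag{ii}$$
and
$$\vdash^{\alpha_2}_{0} \eta_0 \mathrel{\not\varepsilon} \mathrm{Jp}(S),\ \neg(\xi \subseteq S),\ \xi + \eta_0 \subseteq S \tag{iii}$$
for all ordinals $\eta_0$. Cutting (iii) and (ii) we therefore obtain
$$\vdash^{\alpha_3}_{\rho_0} \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\eta_0 < \eta),\ \xi + \eta_0 \subseteq S \tag{iv}$$
with $\alpha_3 < \lambda$ and $\rho_0 < \lambda$. Again by Tautology, ⋀-Inversion and ⋁-Exportation we get an ordinal $\alpha_4 < \lambda$ such that
$$\vdash^{\alpha_4}_{0} \neg\mathrm{Prog}(S),\ \neg(\xi + \eta_0 \subseteq S),\ \xi + \eta_0 \mathrel{\varepsilon} S. \tag{v}$$
From (iv) and (v) we obtain by cut
$$\vdash^{\alpha_5}_{\rho_1} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\eta_0 < \eta),\ \xi + \eta_0 \mathrel{\varepsilon} S \tag{vi}$$
for $\alpha_5$ and $\rho_1$ less than λ. From (vi) we obtain by Lemma 8.5.2
$$\vdash^{\alpha_5}_{\rho_1} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\eta_0 < \eta),\ \neg(\zeta = \xi + \eta_0),\ \zeta \mathrel{\varepsilon} S \tag{vii}$$
and thus by inferences (⋁) and (⋀)
$$\vdash^{\alpha_6}_{\rho_1} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\exists\eta_0)[\eta_0 < \eta \wedge \zeta = \xi + \eta_0],\ \zeta \mathrel{\varepsilon} S \tag{viii}$$
for $\alpha_6$ still less than λ. By Lemma 8.5.1 we obtain
$$\vdash^{n}_{0} \neg(\zeta < \xi + \eta),\ \zeta < \xi,\ (\exists\eta_0)[\eta_0 < \eta \wedge \zeta = \xi + \eta_0] \tag{ix}$$
for some n < ω ≤ λ. Cutting (viii) and (ix) yields
$$\vdash^{\alpha_7}_{\rho} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\zeta < \xi + \eta),\ \zeta < \xi,\ \zeta \mathrel{\varepsilon} S \tag{x}$$
with $\alpha_7 < \lambda$ and ρ < λ for all ordinals ζ. By (i) and (x) we obtain with an inference (⋀)
$$\vdash^{\alpha_8}_{\rho} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\zeta < \xi + \eta),\ \zeta < \xi \wedge \neg(\zeta < \xi),\ \zeta \mathrel{\varepsilon} S \tag{xi}$$
which yields an ordinal $\alpha_9$ less than λ such that
$$\vdash^{\alpha_9}_{\rho} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \neg(\zeta < \xi + \eta),\ \zeta \mathrel{\varepsilon} S \tag{xii}$$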
for all ordinals ζ by the Detachment Rule (Lemma 8.5.5). By (⋁)-Importation and an inference (⋀) we obtain
$$\vdash^{\alpha_{10}}_{\rho} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \neg(\xi \subseteq S),\ \xi + \eta \subseteq S \tag{xiii}$$
for all ξ from (xii). Hence
$$\vdash^{\alpha_{11}}_{\rho} \neg\mathrm{Prog}(S),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S) \tag{xiv}$$
by (⋁)-Importation and (⋀). Finally we obtain again by (⋁)-Importation and an inference (⋀) ordinals α < λ and ρ less than λ such that
$$\vdash^{\alpha}_{\rho} \neg\mathrm{Prog}(S),\ \mathrm{Prog}(\mathrm{Jp}(S)).$$
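8.5.11 Lemma If λ ≤ σ there is for every $S \in K_\lambda$ an ordinal α < λ such that
$$\vdash^{\alpha}_{0} \neg\mathrm{TI}_{\sigma}(\xi),\ \neg\mathrm{Prog}(S),\ \xi \subseteq S.$$
Proof Let $S \in K_\lambda$. By Tautology and ⋁-Exportation we obtain
$$\vdash^{\alpha_0}_{0} \neg(\mathrm{Prog}(S) \to \xi \subseteq S),\ \neg\mathrm{Prog}(S),\ \xi \subseteq S \tag{i}$$
for $\alpha_0 = 2\cdot\mathrm{rnk}(\mathrm{Prog}(S) \to \xi \subseteq S) < \lambda$. Hence
$$\vdash^{\alpha_1}_{0} \neg(\forall\xi)[\mathrm{Prog}(S) \to \xi \subseteq S],\ \neg\mathrm{Prog}(S),\ \xi \subseteq S \tag{ii}$$
by an inference (⋁). Since $(\forall\xi)[\mathrm{Prog}(S) \to \xi \subseteq S]$ is the formula TI(ξ, S) and $S \in K_\lambda \subseteq K_\sigma$ we finally conclude
$$\vdash^{\alpha}_{0} \neg\bigwedge\{\mathrm{TI}(\xi, S) \mid S \in K_\sigma\},\ \neg\mathrm{Prog}(S),\ \xi \subseteq S$$
by an inference (⋁).
8.5.12 Corollary If γ ≤ σ and γ is a limit ordinal then $\vdash^{\gamma}_{0} \neg\mathrm{TI}_{\sigma}(\eta),\ \mathrm{TI}_{\gamma}(\eta)$ holds true for all η.
Proof This is an immediate consequence of Lemma 8.5.11.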
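8.5.13 Lemma For all ordinals ξ and η we obtain
$$\vdash^{\lambda}_{\lambda} \neg\mathrm{TI}_{\lambda}(\xi),\ \neg\mathrm{TI}_{\lambda}(\eta),\ \mathrm{TI}_{\lambda}(\xi + \eta).$$
Proof Let $S \in K_\lambda$. Then also $\mathrm{Jp}(S) \in K_\lambda$ and by Lemma 8.5.11 we get ordinals $\alpha_0, \alpha_1 < \lambda$ such that
$$\vdash^{\alpha_0}_{0} \neg\mathrm{TI}_{\lambda}(\xi),\ \neg\mathrm{Prog}(S),\ \xi \subseteq S \tag{i}$$
and
$$\vdash^{\alpha_1}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \eta \subseteq \mathrm{Jp}(S). \tag{ii}$$
By Tautology, ⋀-Inversion and ⋁-Exportation we get
$$\vdash^{\alpha_2}_{0} \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \neg(\eta \subseteq \mathrm{Jp}(S)),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S) \tag{iii}$$
for some $\alpha_2 < \lambda$. But (ii) and (iii) imply
$$\vdash^{\alpha_3}_{\rho} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(\mathrm{Jp}(S)),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S). \tag{iv}$$
From (iv) and Lemma 8.5.10 we obtain by cut
$$\vdash^{\alpha_5}_{\rho} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(S),\ \eta \mathrel{\varepsilon} \mathrm{Jp}(S) \tag{v}$$
for $\alpha_5 < \lambda$ and ρ < λ sufficiently large. Statements (i) and (v) imply
$$\vdash^{\alpha_6}_{\rho} \neg\mathrm{TI}_{\lambda}(\xi),\ \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(S),\ \xi \subseteq S \wedge \eta \mathrel{\varepsilon} \mathrm{Jp}(S) \tag{vi}$$
for some $\alpha_6 < \lambda$ by an inference (⋀). Lemma 8.5.9 together with (vi) yields
$$\vdash^{\alpha_7}_{\rho} \neg\mathrm{TI}_{\lambda}(\xi),\ \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(S),\ \xi + \eta \subseteq S. \tag{vii}$$
Since (vii) holds for all $S \in K_\lambda$ we obtain
$$\vdash^{\alpha}_{\rho} \neg\mathrm{TI}_{\lambda}(\xi),\ \neg\mathrm{TI}_{\lambda}(\eta),\ \mathrm{TI}_{\lambda}(\xi + \eta)$$
for some α < λ by ⋁-Importation and an inference (⋀).
8.5.14 Lemma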
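$$\vdash^{\lambda}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg(\xi < \eta),\ \mathrm{TI}_{\lambda}(\xi).$$
Proof By Lemma 8.5.11, we obtain for $S \in K_\lambda$ an $\alpha_1 < \lambda$ such that
$$\vdash^{\alpha_1}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(S),\ \eta \subseteq S \tag{i}$$
which by ⋀-Inversion and ⋁-Exportation yields
$$\vdash^{\alpha_1}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(S),\ \neg(\zeta < \eta),\ \zeta \mathrel{\varepsilon} S. \tag{ii}$$
By Lemma 8.5.1 and ⋁-Exportation we have
$$\vdash^{n}_{0} \neg(\xi < \eta),\ \neg(\zeta < \xi),\ \zeta < \eta \tag{iii}$$
for some finite ordinal n. From (ii) and (iii) we get
$$\vdash^{\alpha_2}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg\mathrm{Prog}(S),\ \neg(\xi < \eta),\ \neg(\zeta < \xi),\ \zeta \mathrel{\varepsilon} S \tag{iv}$$
for $\alpha_2 = \alpha_1 + n$ by the Reduction Lemma (Lemma 7.3.12). From (iv) we get by ⋁-Importation and the (⋀)-rule an $\alpha_3 < \lambda$ with
$$\vdash^{\alpha_3}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg(\xi < \eta),\ \mathrm{Prog}(S) \to \xi \subseteq S.$$
Since this holds for all $S \in K_\lambda$ we obtain by an application of the (⋀)-rule
$$\vdash^{\lambda}_{0} \neg\mathrm{TI}_{\lambda}(\eta),\ \neg(\xi < \eta),\ \mathrm{TI}_{\lambda}(\xi).$$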
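8.5.15 Definition For $\alpha =_{NF} \alpha_1 + \cdots + \alpha_n$ put $h(\alpha) := \alpha_1$.
8.5.16 Theorem For a limit ordinal λ we obtain $\vdash^{\alpha}_{\rho} \neg\mathrm{TI}_{\lambda}(h(\eta)),\ \mathrm{TI}_{\lambda}(\eta)$ for α and ρ less than $\lambda^*$.
Proof Let $\eta =_{NF} \eta_1 + \cdots + \eta_n$. Then $\eta_i \le h(\eta)$ holds for i = 1, …, n. By Lemma 8.5.14 and the Detachment Rule we obtain
$$\vdash^{\lambda+1}_{1} \neg\mathrm{TI}_{\lambda}(h(\eta)),\ \mathrm{TI}_{\lambda}(\eta_i) \tag{i}$$
for i = 1, …, n. By Lemma 8.5.13 we therefore get
$$\vdash^{\lambda+3}_{\lambda+1} \neg\mathrm{TI}_{\lambda}(h(\eta)),\ \mathrm{TI}_{\lambda}(\eta_1 + \eta_2). \tag{ii}$$
Iterating this n-fold yields the claim.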
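Having shown that $\mathrm{TI}_{\lambda}$ is closed under ordinal addition, the next aim is to study the closure behavior of $\mathrm{TI}_{\lambda}$ under the VEBLEN function $\varphi_\sigma$. We introduce the VEBLEN σ-jump as a class term
$$\mathrm{VJp}_{\lambda}(\sigma) := \{\eta \mid \eta \mathrel{\varepsilon} OT \wedge \mathrm{TI}_{\lambda}(\eta) \to \mathrm{TI}_{\lambda}(\varphi_\sigma(\eta))\}.$$
Observe that $\mathrm{VJp}_{\lambda}(\sigma) \in K_{\lambda^*}$.
8.5.17 Lemma There is a finite ordinal n such that $\vdash^{n}_{0} \mathrm{TI}_{\lambda}(0)$.
Proof For $S \in K_\lambda$ we get $\vdash^{0}_{0} \neg\mathrm{Prog}(S),\ \neg(\xi < 0),\ \xi \mathrel{\varepsilon} S$ for all ξ by Lemma 8.5.1. Hence $\vdash^{4}_{0} \neg\mathrm{Prog}(S),\ 0 \subseteq S$ by ⋁-Importation and the (⋀)-rule. The claim now follows by another ⋁-Importation and an application of an (⋀)-inference.
The next (quite technical) lemma provides the induction step in the proof of Lemma 8.5.19 which will be proved by induction on the complexity of ordinal notations. We therefore define the norm of an ordinal α < Γ0 by
$$N(\alpha) := \begin{cases} 0 & \text{if } \alpha = 0 \\ \sum_{i=1}^{n} (N(\alpha_i) + N(\beta_i) + 1) & \text{if } \alpha =_{NF} \varphi_{\alpha_1}(\beta_1) + \cdots + \varphi_{\alpha_n}(\beta_n). \end{cases} \tag{8.6}$$
8.5.18 Lemma For every limit ordinal λ there are ordinals ρ and α less than $\lambda^*$ such that
$$\vdash^{\alpha}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\forall\eta<\tau)[\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta))],\ \neg(\forall\nu)[N(\nu) < N(\mu) \wedge \nu < \varphi_\sigma(\tau) \to \mathrm{TI}_{\lambda}(\nu)],\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\mu < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu)$$
holds true for all ordinals µ, σ and τ.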
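The norm (8.6) and the head function h of Definition 8.5.15 are purely syntactic operations on ordinal notations, so they can be sketched as a small program. The term representation below (tuples of pairs), and the names `phi`, `plus`, `norm` and `head`, are assumptions made for this illustration only; they are not the notation system of Sect. 3.4.3, and the sketch does not check that its input is in normal form.

```python
# Hypothetical term representation (not from the text): an ordinal notation
# below Gamma_0 is a tuple of (alpha_i, beta_i) pairs, read as
# phi_{alpha_1}(beta_1) + ... + phi_{alpha_n}(beta_n); the empty tuple is 0.
ZERO = ()

def phi(alpha, beta):
    """Notation for the single Veblen term phi_alpha(beta)."""
    return ((alpha, beta),)

def plus(*terms):
    """Concatenate summands; the caller must supply a sum in normal form."""
    return tuple(pair for term in terms for pair in term)

def norm(term):
    """N(0) = 0 and N(sum of phi_{a_i}(b_i)) = sum of (N(a_i) + N(b_i) + 1)."""
    return sum(norm(a) + norm(b) + 1 for (a, b) in term)

def head(term):
    """h(alpha): the first summand of the normal form (Definition 8.5.15)."""
    return term[:1]

ONE = phi(ZERO, ZERO)        # phi_0(0) = 1
OMEGA = phi(ZERO, ONE)       # phi_0(1) = omega
EPSILON_0 = phi(ONE, ZERO)   # phi_1(0) = epsilon_0

print(norm(ONE), norm(OMEGA), norm(EPSILON_0), norm(plus(OMEGA, ONE)))
```

Every notation has only finitely many notations of smaller norm built from its own subterms, which is what makes N a convenient measure for the meta-induction announced before Lemma 8.5.18.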
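Proof Let us abbreviate the premises of the lemma by
$$A_\lambda(\sigma) :\Leftrightarrow (\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)] \Leftrightarrow (\forall\xi<\sigma)(\forall\eta)[\mathrm{TI}_{\lambda}(\eta) \to \mathrm{TI}_{\lambda}(\varphi_\xi(\eta))],$$
$$B_\lambda(\sigma,\tau) :\Leftrightarrow (\forall\eta<\tau)[\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta))]$$
and
$$C_\lambda(\mu,\sigma,\tau) :\Leftrightarrow (\forall\nu)[N(\nu) < N(\mu) \wedge \nu < \varphi_\sigma(\tau) \to \mathrm{TI}_{\lambda}(\nu)].$$
If $\varphi_\sigma(\tau) \le \mu$ we obtain
$$\vdash^{0}_{0} \neg A_\lambda(\sigma),\ \neg B_\lambda(\sigma,\tau),\ \neg C_\lambda(\mu,\sigma,\tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\mu < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu) \tag{i}$$
by Lemma 8.5.1. Therefore assume $\mu < \varphi_\sigma(\tau)$. If µ = 0 we obtain from Lemma 8.5.17
$$\vdash^{\omega}_{0} \neg A_\lambda(\sigma),\ \neg B_\lambda(\sigma,\tau),\ \neg C_\lambda(\mu,\sigma,\tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\mu < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu). \tag{ii}$$
So assume µ > 0. According to Theorem 3.4.9 we distinguish the following cases.
1. We have $\mu = \varphi_{\mu_1}(\mu_2)$ such that $\mu_1 < \sigma$ and $\mu_2 < \varphi_\sigma(\tau)$. By Tautology, ⋀-Inversion and ⋁-Exportation we have
$$\vdash^{\alpha_1}_{0} \neg A_\lambda(\sigma),\ \neg(\mu_1 < \sigma),\ \neg\mathrm{TI}_{\lambda}(\mu_2),\ \mathrm{TI}_{\lambda}(\varphi_{\mu_1}(\mu_2)) \tag{iii}$$
for $\alpha_1 = 2\cdot\mathrm{rnk}(A_\lambda(\sigma)) < \lambda^*$. By the Detachment Rule and Lemma 8.5.2 we derive from (iii)
$$\vdash^{\alpha_2}_{\rho} \neg A_\lambda(\sigma),\ \neg\mathrm{TI}_{\lambda}(\mu_2),\ \mathrm{TI}_{\lambda}(\mu) \tag{iv}$$
for $\rho = \mathrm{rnk}(\mathrm{TI}_{\lambda}(\mu)) + 1 < \lambda^*$. By Tautology, ⋀-Inversion and ⋁-Exportation we also have
$$\vdash^{\alpha_3}_{0} \neg C_\lambda(\mu,\sigma,\tau),\ \neg(N(\mu_2) < N(\mu)),\ \neg(\mu_2 < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu_2) \tag{v}$$
for $\alpha_3 = 2\cdot\mathrm{rnk}(C_\lambda(\mu,\sigma,\tau)) < \lambda^*$. By the Detachment Rule we derive from (v)
$$\vdash^{\alpha_4}_{\rho} \neg C_\lambda(\mu,\sigma,\tau),\ \mathrm{TI}_{\lambda}(\mu_2) \tag{vi}$$
for $\alpha_4 < \lambda^*$. Cutting (iv) and (vi) we conclude
$$\vdash^{\alpha_5}_{\rho} \neg A_\lambda(\sigma),\ \neg C_\lambda(\mu,\sigma,\tau),\ \mathrm{TI}_{\lambda}(\mu) \tag{vii}$$
for $\alpha_5$ less than $\lambda^*$.
2. We have $\mu = \varphi_{\mu_1}(\mu_2)$ such that $\mu_1 = \sigma$ and $\mu_2 < \tau$. By Tautology, ⋀-Inversion and ⋁-Exportation we get
$$\vdash^{\alpha_6}_{0} \neg B_\lambda(\sigma,\tau),\ \neg(\mu_2 < \tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\mu_2)) \tag{viii}$$
for $\alpha_6 = 2\cdot\mathrm{rnk}(B_\lambda(\sigma,\tau)) < \lambda^*$. Using the Detachment Rule and Lemma 8.5.2 we infer from (viii)
$$\vdash^{\alpha_7}_{\rho} \neg B_\lambda(\sigma,\tau),\ \mathrm{TI}_{\lambda}(\mu) \tag{ix}$$
for $\rho = \mathrm{rnk}(\mathrm{TI}_{\lambda}(\mu)) + 1 < \lambda^*$ and $\alpha_7 < \lambda^*$.
3. If µ < τ we obtain by Lemma 8.5.14 and the Detachment Rule
$$\vdash^{\lambda+1}_{1} \neg\mathrm{TI}_{\lambda}(\tau),\ \mathrm{TI}_{\lambda}(\mu). \tag{x}$$
4. We have $\mu =_{NF} \mu_1 + \cdots + \mu_n$ for n > 1. Then $h(\mu) = \mu_1 < \mu$ and $N(\mu_1) < N(\mu)$. By Tautology, ⋀-Inversion, ⋁-Exportation and Lemma 8.5.2 we obtain
$$\vdash^{\alpha_8}_{\rho} \neg C_\lambda(\mu,\sigma,\tau),\ \neg(N(h(\mu)) < N(\mu)),\ \neg(h(\mu) < \mu),\ \mathrm{TI}_{\lambda}(h(\mu)) \tag{xi}$$
for $\alpha_8 = 2\cdot\mathrm{rnk}(C_\lambda(\mu,\sigma,\tau)) < \lambda^*$ and $\rho = \mathrm{rnk}(\mathrm{TI}_{\lambda}(h(\mu))) + 1 < \lambda^*$. By the Detachment Rule it follows
$$\vdash^{\alpha_9}_{\rho} \neg C_\lambda(\mu,\sigma,\tau),\ \mathrm{TI}_{\lambda}(h(\mu)) \tag{xii}$$
and by Theorem 8.5.16 we obtain
$$\vdash^{\alpha_{10}}_{\rho} \neg C_\lambda(\mu,\sigma,\tau),\ \mathrm{TI}_{\lambda}(\mu) \tag{xiii}$$
for $\alpha_{10} < \lambda^*$. Collecting (i), (ii), (vii), (x) and (xiii) yields the claim.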
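We are now going to discard step by step the superfluous premises in Lemma 8.5.18.
8.5.19 Lemma For every limit ordinal λ there is an ordinal $\rho < \lambda^*$ such that for all ordinals µ there is an ordinal $\alpha_\mu < \lambda^*$ with
$$\vdash^{\alpha_\mu}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\forall\eta<\tau)[\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta))],\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\mu < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu)$$
for all ordinals σ and τ.
Proof We prove the lemma by meta-induction on N(µ) and use the abbreviations of the proof of Lemma 8.5.18. If N(µ) = 0 then µ = 0 and we obtain the claim from Lemma 8.5.17. So assume N(µ) ≠ 0. Then
$$\vdash^{\alpha_{\mu_0}}_{\rho_1} \neg A_\lambda(\sigma),\ \neg B_\lambda(\sigma,\tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(N(\mu_0) < N(\mu)),\ \neg(\mu_0 < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu_0) \tag{i}$$
because either $N(\mu) \le N(\mu_0)$ and (i) holds by Lemma 8.5.1 or $N(\mu_0) < N(\mu)$ and (i) holds by induction hypothesis. From (i) we get
$$\vdash^{\alpha_1}_{\rho_1} \neg A_\lambda(\sigma),\ \neg B_\lambda(\sigma,\tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ C_\lambda(\mu,\sigma,\tau) \tag{ii}$$
by ⋁-Importation and an (⋀)-inference for $\alpha_1 = \sup\{\alpha_{\mu_0} + 6 \mid N(\mu_0) < N(\mu)\}$. Since there are only finitely many such $\mu_0$ we have $\alpha_1 < \lambda^*$. From (ii) and Lemma 8.5.18 we obtain by cut
$$\vdash^{\alpha_\mu}_{\rho} \neg A_\lambda(\sigma),\ \neg B_\lambda(\sigma,\tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\mu < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu)$$
for $\alpha_\mu = \max\{\alpha, \alpha_1\} + 1 < \lambda^*$ and $\rho = \max\{\rho_1, \rho_2, \mathrm{rnk}(C_\lambda(\mu,\sigma,\tau)) + 1\} < \lambda^*$ where α and $\rho_2$ are the ordinals stemming from Lemma 8.5.18.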
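8.5.20 Lemma There are ordinals α and ρ less than $\lambda^*$ such that
$$\vdash^{\alpha}_{\rho} \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ (\forall\eta<\tau)\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta))$$
for all ordinals σ and τ.
Proof By Tautology and ⋁-Exportation we obtain
$$\vdash^{\alpha_0}_{0} \eta \mathrel{\not\varepsilon} \mathrm{VJp}_{\lambda}(\sigma),\ \neg\mathrm{TI}_{\lambda}(\eta),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)) \tag{i}$$
for $\alpha_0 = 2\cdot\mathrm{rnk}(\mathrm{VJp}_{\lambda}(\sigma)) < \lambda^*$. By Lemma 8.5.14 we have
$$\vdash^{\lambda}_{0} \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\eta < \tau),\ \mathrm{TI}_{\lambda}(\eta). \tag{ii}$$
Cutting (i) and (ii) yields
$$\vdash^{\alpha_1}_{\rho} \eta \mathrel{\not\varepsilon} \mathrm{VJp}_{\lambda}(\sigma),\ \neg(\eta < \tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)) \tag{iii}$$
with $\alpha_1$ and ρ less than $\lambda^*$. By Lemma 8.5.1 we have
$$\vdash^{0}_{0} \eta < \tau,\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\eta < \tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)) \tag{iv}$$
and obtain from (iii) and (iv) by an inference (⋀)
$$\vdash^{\alpha_1+1}_{\rho} \eta < \tau \wedge \eta \mathrel{\not\varepsilon} \mathrm{VJp}_{\lambda}(\sigma),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\eta < \tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)). \tag{v}$$
From (v) we obtain by an inference (⋁)
$$\vdash^{\alpha_1+4}_{\rho} \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\eta < \tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)) \tag{vi}$$
and from (vi) by ⋁-Importation and an inference (⋀)
$$\vdash^{\alpha}_{\rho} \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ (\forall\eta<\tau)\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta))$$
for ordinals α and ρ less than $\lambda^*$.
8.5.21 Lemma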
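$$\vdash^{\lambda}_{\lambda} \neg(\forall\mu<\sigma)\mathrm{TI}_{\lambda}(\mu),\ \mathrm{TI}_{\lambda}(\sigma).$$
Proof For every class term $S \in K_\lambda$ there is by Lemma 8.5.11 an ordinal $\alpha_0 < \lambda$ such that
$$\vdash^{\alpha_0}_{0} \neg\mathrm{TI}_{\lambda}(\mu),\ \neg\mathrm{Prog}(S),\ \mu \subseteq S \tag{i}$$
and an ordinal $\alpha_1 < \lambda$ such that
$$\vdash^{\alpha_1}_{0} \neg\mathrm{Prog}(S),\ \neg(\mu \subseteq S),\ \mu \mathrel{\varepsilon} S. \tag{ii}$$
Since we also have
$$\vdash^{0}_{0} \neg(\mu < \sigma),\ \mu < \sigma \tag{iii}$$
by Lemma 8.5.1, we obtain from (i), (ii) and (iii) by cut and an inference (⋀)
$$\vdash^{\alpha_2}_{\lambda} \mu < \sigma \wedge \neg\mathrm{TI}_{\lambda}(\mu),\ \neg\mathrm{Prog}(S),\ \neg(\mu < \sigma),\ \mu \mathrel{\varepsilon} S \tag{iv}$$
with $\alpha_2 < \lambda$. From (iv) we get
$$\vdash^{\alpha_2+3}_{\lambda} \neg(\forall\mu<\sigma)\mathrm{TI}_{\lambda}(\mu),\ \neg\mathrm{Prog}(S),\ \neg(\mu < \sigma),\ \mu \mathrel{\varepsilon} S \tag{v}$$
by an inference (⋁) and from (v) by ⋁-Importation and an inference (⋀)
$$\vdash^{\alpha_3}_{\lambda} \neg(\forall\mu<\sigma)\mathrm{TI}_{\lambda}(\mu),\ \neg\mathrm{Prog}(S),\ \sigma \subseteq S. \tag{vi}$$
So we obtain $\alpha_4 < \lambda$ such that
$$\vdash^{\alpha_4}_{\lambda} \neg(\forall\mu<\sigma)\mathrm{TI}_{\lambda}(\mu),\ \mathrm{Prog}(S) \to \sigma \subseteq S \tag{vii}$$
for all $S \in K_\lambda$ from (vi) by ⋁-Importation. The claim follows from (vii) by an inference (⋀).
8.5.22 Lemma There are ordinals $\alpha < \lambda^{**}$ and $\rho < \lambda^*$ such that
$$\vdash^{\alpha}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg\mathrm{TI}_{\lambda^*}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\tau))$$
for all ordinals σ and τ.
Proof By Lemma 8.5.19, there is an ordinal $\rho_1 < \lambda^*$ and for all µ an ordinal $\alpha_\mu$ less than $\lambda^*$ such that
$$\vdash^{\alpha_\mu}_{\rho_1} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\forall\eta<\tau)\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \neg(\mu < \varphi_\sigma(\tau)),\ \mathrm{TI}_{\lambda}(\mu). \tag{i}$$
From (i) we obtain by ⋁-Importation and an (⋀)-inference
$$\vdash^{\lambda^*}_{\rho_1} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\forall\eta<\tau)\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ (\forall\mu<\varphi_\sigma(\tau))\mathrm{TI}_{\lambda}(\mu). \tag{ii}$$
By (ii) and Lemma 8.5.21 we obtain
$$\vdash^{\lambda^*+1}_{\rho_1} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\forall\eta<\tau)\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\tau)). \tag{iii}$$
By Lemma 8.5.20, we have
$$\vdash^{\alpha_1}_{\rho_1} \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ (\forall\eta<\tau)\mathrm{TI}_{\lambda}(\varphi_\sigma(\eta)) \tag{iv}$$
for $\alpha_1 < \lambda^*$. Cutting (iv) and (iii) yields
$$\vdash^{\alpha_2}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\tau)) \tag{v}$$
with $\alpha_2 < \lambda^{**}$ and $\rho < \lambda^*$ large enough to majorize also all cuts to come. From (v) we obtain by ⋁-Importation
$$\vdash^{\alpha_3}_{\rho_1} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \tau \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\sigma) \tag{vi}$$
for all ordinals τ. Again by ⋁-Importation and an (⋀)-inference we obtain from (vi)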
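$$\vdash^{\alpha_4}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \mathrm{Prog}(\mathrm{VJp}_{\lambda}(\sigma)) \tag{vii}$$
with $\alpha_4$ still less than $\lambda^{**}$. By Tautology, we have
$$\vdash^{\beta}_{0} \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \tau \subseteq \mathrm{VJp}_{\lambda}(\sigma) \tag{viii}$$
for $\beta = 2\cdot\mathrm{rnk}(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)) < \lambda^{**}$ and by the Conjunction Lemma (Lemma 8.5.3) we obtain from (viii) and (vii)
$$\vdash^{\alpha_5}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \mathrm{Prog}(\mathrm{VJp}_{\lambda}(\sigma)) \wedge \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \tau \subseteq \mathrm{VJp}_{\lambda}(\sigma),$$
i.e.,
$$\vdash^{\alpha_5}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\mathrm{Prog}(\mathrm{VJp}_{\lambda}(\sigma)) \to (\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma))),\ \tau \subseteq \mathrm{VJp}_{\lambda}(\sigma). \tag{ix}$$
Since $\mathrm{VJp}_{\lambda}(\sigma) \in K_{\lambda^*}$ we obtain from (ix) by an inference (⋁)
$$\vdash^{\alpha_5+1}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg\mathrm{TI}_{\lambda^*}(\tau),\ \tau \subseteq \mathrm{VJp}_{\lambda}(\sigma). \tag{x}$$
By Tautology, ⋀-Inversion and ⋁-Exportation we get
$$\vdash^{\beta_1}_{0} \neg\mathrm{Prog}(\mathrm{VJp}_{\lambda}(\sigma)),\ \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \tau \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\sigma) \tag{xi}$$
for $\beta_1 = 2\cdot\mathrm{rnk}(\mathrm{Prog}(\mathrm{VJp}_{\lambda}(\sigma))) < \lambda^{**}$. From (vii) and (xi) we get by cut
$$\vdash^{\alpha_6}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg(\tau \subseteq \mathrm{VJp}_{\lambda}(\sigma)),\ \tau \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\sigma) \tag{xii}$$
and from (x) and (xii) by cut
$$\vdash^{\alpha_7}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg\mathrm{TI}_{\lambda^*}(\tau),\ \tau \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\sigma). \tag{xiii}$$
By ⋁-Exportation we get from (xiii)
$$\vdash^{\alpha_7}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg\mathrm{TI}_{\lambda^*}(\tau),\ \neg\mathrm{TI}_{\lambda}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\tau)). \tag{xiv}$$
From Corollary 8.5.12 and (xiv) we obtain by cut
$$\vdash^{\alpha}_{\rho} \neg(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)],\ \neg\mathrm{TI}_{\lambda^*}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_\sigma(\tau))$$
for $\alpha < \lambda^{**}$. Checking all the cuts we see that we can keep ρ below $\lambda^*$.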
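Observe that Lemma 8.5.22 is the first lemma which essentially uses the infinitary language. Passing from $\mathrm{TI}_{\lambda}(\tau)$ to $\mathrm{TI}_{\lambda}(\varphi_\sigma(\tau))$ is the essential step in the well-ordering proof. To render this step possible we still have, however, to get rid of the hypothesis $(\forall\xi<\sigma)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)]$ and to remove the asymmetry between $\lambda^*$ and λ. This is prepared by the following lemmas.
8.5.23 Lemma Let λ be a limit ordinal and $\{\mu_\sigma \mid \sigma < \eta\}$ a set which is unbounded in λ. Then $\vdash^{\alpha_\sigma}_{\rho_\sigma} \Delta, \mathrm{TI}_{\mu_\sigma}(\tau)$ with $\alpha_\sigma < \alpha$ and $\rho_\sigma \le \rho$ for all σ < η implies $\vdash^{\alpha}_{\rho} \Delta, \mathrm{TI}_{\lambda}(\tau)$.
Proof Since $\{\mu_\sigma \mid \sigma < \eta\}$ is unbounded in λ we have for every class term $S \in K_\lambda$ a $\mu_\sigma$ with σ < η such that $S \in K_{\mu_\sigma}$. From the hypothesis $\vdash^{\alpha_\sigma}_{\rho_\sigma} \Delta, \mathrm{TI}_{\mu_\sigma}(\tau)$ we obtain $\vdash^{\alpha_\sigma}_{\rho_\sigma} \Delta, \mathrm{Prog}(S) \to \tau \subseteq S$ by ⋀-Inversion. Since $\alpha_\sigma < \alpha$ and $\rho_\sigma \le \rho$ we obtain $\vdash^{\alpha}_{\rho} \Delta, \mathrm{TI}_{\lambda}(\tau)$ by an inference (⋀).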
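8.5.24 Lemma For a limit ordinal η we have
$$\vdash^{\omega\cdot\eta}_{\omega\cdot\eta} \neg\mathrm{TI}_{\omega\cdot\eta}(\tau),\ \mathrm{TI}_{\omega\cdot\eta}(\varphi_0(\tau))$$
for all ordinals τ.
Proof By Lemma 8.5.1 we have
$$\vdash^{\alpha_0}_{0} (\forall\xi<0)(\forall\eta)[\eta \mathrel{\varepsilon} \mathrm{VJp}_{\lambda}(\xi)] \tag{i}$$
for $\alpha_0 < \lambda^{**}$. For limit ordinals λ we therefore obtain from (i) and Lemma 8.5.22 by cut
$$\vdash^{\alpha_1}_{\rho} \neg\mathrm{TI}_{\lambda^*}(\tau),\ \mathrm{TI}_{\lambda}(\varphi_0(\tau)) \tag{ii}$$
for $\alpha_1 < \lambda^{**}$ and $\rho < \lambda^*$. By Corollary 8.5.12 we have
$$\vdash^{\lambda^*}_{\lambda^*} \neg\mathrm{TI}_{\omega\cdot\eta}(\tau),\ \mathrm{TI}_{\lambda^*}(\tau) \tag{iii}$$
for all limit ordinals $\lambda^* \le \omega\cdot\eta$. For $\lambda = \omega\cdot\xi < \omega\cdot\eta$ we obtain from (ii) and (iii)
$$\vdash^{\alpha_\xi}_{\omega\cdot(\xi+1)} \neg\mathrm{TI}_{\omega\cdot\eta}(\tau),\ \mathrm{TI}_{\omega\cdot\xi}(\varphi_0(\tau)) \tag{iv}$$
with $\alpha_\xi < \omega\cdot(\xi+2) < \omega\cdot\eta$, since η ∈ Lim. The set $\{\omega\cdot\xi \mid \xi < \eta\}$ is unbounded in ω·η and we obtain by Lemma 8.5.23 and (iv)
$$\vdash^{\omega\cdot\eta}_{\omega\cdot\eta} \neg\mathrm{TI}_{\omega\cdot\eta}(\tau),\ \mathrm{TI}_{\omega\cdot\eta}(\varphi_0(\tau)).$$
8.5.25 Lemma (Euclidean division for ordinals) For ordinals σ and τ ≠ 0 there are ordinals ρ < τ and η such that σ = τ·η + ρ.
Proof Let $\eta_0 := \min\{\xi \mid \sigma < \tau\cdot\xi\}$. Then $\eta_0 \neq 0$. By minimality $\eta_0$ cannot be a limit ordinal. Hence $\eta_0 = \eta + 1$ and we obtain τ·η ≤ σ. But then there is a ρ such that τ·η + ρ = σ < τ·η + τ which entails ρ < τ.
8.5.26 Lemma Let ν and η ≠ 0 be ordinals. Then
$$\vdash^{\omega^{1+\nu+1}\cdot\eta}_{\omega^{1+\nu+1}\cdot\eta} \neg\mathrm{TI}_{\omega^{1+\nu+1}\cdot\eta}(\tau),\ \mathrm{TI}_{\omega^{1+\nu+1}\cdot\eta}(\varphi_\nu(\tau))$$
for all ordinals τ.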
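Proof We induct on ν. Since ω·η ∈ Lim we get for ν = 0 by Lemma 8.5.24
$$\vdash^{\omega^{2}\cdot\eta}_{\omega^{2}\cdot\eta} \neg\mathrm{TI}_{\omega^{2}\cdot\eta}(\tau),\ \mathrm{TI}_{\omega^{2}\cdot\eta}(\varphi_\nu(\tau)).$$
So assume ν ≠ 0. For µ < ν we obtain a δ such that ν = µ + 1 + δ. For ξ ≠ 0 we thus have $\omega^{1+\nu}\cdot\xi = \omega^{1+\mu+1}\cdot\omega^{\delta}\cdot\xi =: \omega^{1+\mu+1}\cdot\zeta$ for some ζ ≠ 0. Therefore we obtain by the induction hypothesis
$$\vdash^{\omega^{1+\nu}\cdot\xi}_{\omega^{1+\nu}\cdot\xi} \neg\mathrm{TI}_{\omega^{1+\nu}\cdot\xi}(\tau),\ \mathrm{TI}_{\omega^{1+\nu}\cdot\xi}(\varphi_\mu(\tau)) \tag{i}$$
which by two inferences (⋁) entails
$$\vdash^{\omega^{1+\nu}\cdot\xi+2}_{\omega^{1+\nu}\cdot\xi} \tau \mathrel{\varepsilon} \mathrm{VJp}_{\omega^{1+\nu}\cdot\xi}(\mu) \tag{ii}$$
for all ordinals τ. By an inference (⋀) this implies
$$\vdash^{\omega^{1+\nu}\cdot\xi+5}_{\omega^{1+\nu}\cdot\xi} (\forall\tau)[\tau \mathrel{\varepsilon} \mathrm{VJp}_{\omega^{1+\nu}\cdot\xi}(\mu)] \tag{iii}$$
for all µ < ν. By an inference (⋁) together with Lemma 8.5.1 (in case that ν ≤ µ) we then obtain
$$\vdash^{\omega^{1+\nu}\cdot\xi+6}_{\omega^{1+\nu}\cdot\xi} \neg(\mu < \nu) \vee (\forall\tau)[\tau \mathrel{\varepsilon} \mathrm{VJp}_{\omega^{1+\nu}\cdot\xi}(\mu)] \tag{iv}$$
for all ordinals µ. From (iv) we get
$$\vdash^{\omega^{1+\nu}\cdot\xi+9}_{\omega^{1+\nu}\cdot\xi} (\forall\mu<\nu)(\forall\tau)[\tau \mathrel{\varepsilon} \mathrm{VJp}_{\omega^{1+\nu}\cdot\xi}(\mu)] \tag{v}$$
with an inference (⋀). By Lemma 8.5.22 there is an ordinal $\alpha < \omega^{1+\nu}\cdot\xi + \omega\cdot 2$ and an ordinal $\rho < \omega^{1+\nu}\cdot\xi + \omega$ such that
$$\vdash^{\alpha}_{\rho} \neg(\forall\mu<\nu)(\forall\tau)[\tau \mathrel{\varepsilon} \mathrm{VJp}_{\omega^{1+\nu}\cdot\xi}(\mu)],\ \neg\mathrm{TI}_{\omega^{1+\nu}\cdot\xi+\omega}(\tau),\ \mathrm{TI}_{\omega^{1+\nu}\cdot\xi}(\varphi_\nu(\tau)) \tag{vi}$$
for all ordinals τ. From (vi) and (v) we obtain by cut
$$\vdash^{\alpha_1}_{\omega^{1+\nu}\cdot\xi+\omega} \neg\mathrm{TI}_{\omega^{1+\nu}\cdot\xi+\omega}(\tau),\ \mathrm{TI}_{\omega^{1+\nu}\cdot\xi}(\varphi_\nu(\tau)) \tag{vii}$$
for an ordinal $\alpha_1 < \omega^{1+\nu}\cdot\xi + \omega\cdot 2$. To apply Lemma 8.5.23 we show that the set
$$M := \{\zeta \mid (\exists\xi)[\xi \neq 0 \wedge \zeta = \omega^{1+\nu}\cdot\xi \wedge (\forall n<\omega)[\omega^{1+\nu}\cdot\xi + \omega\cdot n < \omega^{1+\nu+1}\cdot\eta]]\}$$
is unbounded in $\omega^{1+\nu+1}\cdot\eta$. Let $\sigma < \omega^{1+\nu+1}\cdot\eta$. We have to find an ordinal ξ ≠ 0 such that $\sigma < \omega^{1+\nu}\cdot\xi$ and $\omega^{1+\nu}\cdot\xi + \omega\cdot n < \omega^{1+\nu+1}\cdot\eta$ for all n ∈ ω. By Euclidean division there are ordinals $\rho < \omega^{1+\nu}$ and $\xi_0$ such that $\sigma = \omega^{1+\nu}\cdot\xi_0 + \rho < \omega^{1+\nu}\cdot(\xi_0 + 1)$. Since $\omega^{1+\nu}\cdot\xi_0 \le \sigma < \omega^{1+\nu+1}\cdot\eta$ we obtain $\xi_0 < \omega\cdot\eta$ and therefore also $\xi_0 + n < \omega\cdot\eta$ for all n < ω. Hence $\omega^{1+\nu}\cdot(\xi_0 + n) < \omega^{1+\nu}\cdot\omega\cdot\eta = \omega^{1+\nu+1}\cdot\eta$. Choosing $\xi := \xi_0 + 1$ we get $\sigma < \omega^{1+\nu}\cdot\xi$ and $\omega^{1+\nu}\cdot\xi + \omega\cdot n \le \omega^{1+\nu}\cdot\xi_0 + \omega^{1+\nu} + \omega\cdot n \le \omega^{1+\nu}\cdot(\xi_0 + n + 1) < \omega^{1+\nu+1}\cdot\eta$.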
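Now let ζ ∈ M. By (vii) and Corollary 8.5.12 we obtain
$$\vdash^{\zeta+\omega\cdot 2}_{\omega^{1+\nu+1}\cdot\eta} \neg\mathrm{TI}_{\omega^{1+\nu+1}\cdot\eta}(\tau),\ \mathrm{TI}_{\zeta}(\varphi_\nu(\tau)). \tag{viii}$$
Since $\zeta + \omega\cdot 2 < \omega^{1+\nu+1}\cdot\eta$ for all ζ ∈ M and M is unbounded in $\omega^{1+\nu+1}\cdot\eta$ we obtain
$$\vdash^{\omega^{1+\nu+1}\cdot\eta}_{\omega^{1+\nu+1}\cdot\eta} \neg\mathrm{TI}_{\omega^{1+\nu+1}\cdot\eta}(\tau),\ \mathrm{TI}_{\omega^{1+\nu+1}\cdot\eta}(\varphi_\nu(\tau)) \tag{ix}$$
for all ordinals τ by Lemma 8.5.23.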
The aim of this section is to prove (8.4). Therefore put
$$\zeta_0 := \varphi_1(0) = \varepsilon_0 \quad\text{and}\quad \zeta_{n+1} := \varphi_{\zeta_n}(0).$$
We show
$$\Gamma_0 = \sup\{\zeta_n \mid n \in \omega\}. \tag{8.7}$$
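First we prove $\zeta_n < \Gamma_0$ for all n by induction on n. Since Γ0 ∈ Lim we get by Corollary 3.4.10 and Lemma 3.4.12 (2) $\zeta_0 = \varphi_1(0) < \varphi_{\Gamma_0}(0) = \Gamma_0$. In the induction step we get from the induction hypothesis $\zeta_n < \Gamma_0$ by Corollary 3.4.10 and Lemma 3.4.12 (2) $\zeta_{n+1} = \varphi_{\zeta_n}(0) < \varphi_{\Gamma_0}(0) = \Gamma_0$. Next we show that $\{\zeta_n \mid n \in \omega\}$ is unbounded in Γ0. For σ < Γ0 we prove by induction on N(σ) that there is an n < ω such that $\sigma < \zeta_n$. This is obvious for σ = 0. If $\sigma =_{NF} \sigma_1 + \cdots + \sigma_m$ then there is by induction hypothesis an n such that $\sigma_1 < \zeta_n$. Since $\zeta_n \in H$ we obtain $\sigma < \zeta_n$. If $\sigma =_{NF} \varphi_{\sigma_1}(\sigma_2)$ then there is by induction hypothesis an n such that $\sigma_i < \zeta_n$ for i = 1, 2. But then $\sigma = \varphi_{\sigma_1}(\sigma_2) < \varphi_{\zeta_n}(\zeta_n) < \varphi_{\zeta_{n+1}}(0) = \zeta_{n+2}$. This finishes the proof of (8.7).
8.5.27 Lemma $\vdash^{\zeta_n\cdot\omega+1}_{\zeta_n\cdot\omega+1} \mathrm{TI}_{\zeta_n\cdot\omega+1}(\zeta_{n+1})$.
Proof Since $\omega^{1+\zeta_n+1} = \zeta_n\cdot\omega$ for all n ∈ ω we obtain from Lemma 8.5.26
$$\vdash^{\zeta_n\cdot\omega}_{\zeta_n\cdot\omega} \neg\mathrm{TI}_{\zeta_n\cdot\omega}(0),\ \mathrm{TI}_{\zeta_n\cdot\omega}(\varphi_{\zeta_n}(0)).$$
Cutting that with Lemma 8.5.17 yields the claim.
8.5.28 Lemma $\vdash^{\omega^2+\omega}_{\omega^2+\omega} \neg\mathrm{TI}_{\omega^2}(\zeta_n),\ \mathrm{TI}_{\omega^2}(\zeta_n\cdot\omega + 1)$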
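Proof By Theorem 8.5.16, we have ordinals $\alpha_0$ and ρ less than $\omega^2 + \omega$ such that
$$\vdash^{\alpha_0}_{\rho} \neg\mathrm{TI}_{\omega^2}(\zeta_n),\ \mathrm{TI}_{\omega^2}(\zeta_n + 1) \tag{i}$$
and from Lemma 8.5.24 we obtain by ⋁-Exportation
$$\vdash^{\omega^2}_{\omega^2} \neg\mathrm{TI}_{\omega^2}(\zeta_n + 1),\ \mathrm{TI}_{\omega^2}(\omega^{\zeta_n+1}). \tag{ii}$$
Again by Theorem 8.5.16, we obtain ordinals $\alpha_1$ and ρ less than $\omega^2 + \omega$ such that
$$\vdash^{\alpha_1}_{\rho} \neg\mathrm{TI}_{\omega^2}(\omega^{\zeta_n+1}),\ \mathrm{TI}_{\omega^2}(\zeta_n\cdot\omega + 1) \tag{iii}$$
because $h(\zeta_n\cdot\omega + 1) = \omega^{\zeta_n+1}$. Cutting (i), (ii) and (iii) yields the claim.
8.5.29 Lemma $\vdash^{7\cdot(\lambda+\alpha+1)}_{0} \mathrm{TI}_{\lambda}(\alpha)$.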
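Proof Let $S \in K_\lambda$ and ρ := rnk(S). We show
$$\vdash^{7\cdot(\rho+\alpha)}_{0} \neg\mathrm{Prog}(S),\ \alpha \subseteq S \tag{i}$$
by induction on α. By induction hypothesis and Lemma 8.5.1 we have
$$\vdash^{7\cdot(\rho+\beta)}_{0} \neg\mathrm{Prog}(S),\ \neg(\beta < \alpha),\ \beta \subseteq S \tag{ii}$$
for all ordinals β and by Tautology also
$$\vdash^{2\cdot\rho}_{0} \beta \mathrel{\not\varepsilon} S,\ \beta \mathrel{\varepsilon} S. \tag{iii}$$
From (ii) and (iii) we then obtain
$$\vdash^{7\cdot(\rho+\beta)+1}_{0} \neg\mathrm{Prog}(S),\ \beta \subseteq S \wedge \beta \mathrel{\not\varepsilon} S,\ \neg(\beta < \alpha),\ \beta \mathrel{\varepsilon} S \tag{iv}$$
for all β by the Conjunction Lemma (Lemma 8.5.3). By an inference (⋁) we obtain from (iv)
$$\vdash^{7\cdot(\rho+\beta)+2}_{0} \neg\mathrm{Prog}(S),\ \neg(\beta < \alpha),\ \beta \mathrel{\varepsilon} S \tag{v}$$
and finally
$$\vdash^{7\cdot(\rho+\alpha)}_{0} \neg\mathrm{Prog}(S),\ \alpha \subseteq S$$
from (v) by ⋁-Importation and an (⋀)-inference. This proves (i). From (i) we obtain the claim by ⋁-Importation and an inference (⋀).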
8.5.30 Lemma Γ0 ⊆ Aut(ω).
Proof We show
$$\zeta_n \in \mathrm{Aut}(\omega) \tag{i}$$
by induction on n. Since Aut(ω) is transitive by Lemma 8.5.14, we then obtain Γ0 ⊆ Aut(ω) by (8.7). In the proof of Theorem 7.4.1, we have shown NT ⊢ TI(α, X) for every ordinal α less than ε0, i.e., less than $\zeta_0$. By Lemma 7.4.3 this implies NT ⊢ TI(α, S) for every α < ε0 and every $S \in K_\omega$. By the Embedding Theorem for NT (Theorem 7.3.8) and Lemma 8.5.1 there are finite ordinals $n_S$ such that
$$\vdash^{\omega+n_S}_{\omega} \neg(\alpha < \varepsilon_0),\ \mathrm{TI}(\alpha, S) \tag{ii}$$
for all ordinals α and all $S \in K_\omega$. Hence ε0 ⊆ Aut(ω) and
$$\vdash^{\omega\cdot 2+3}_{\omega} (\forall\xi<\varepsilon_0)\mathrm{TI}_{\omega}(\xi). \tag{iii}$$
Together with Lemma 8.5.21 this entails
$$\vdash^{\omega\cdot 2+4}_{\omega\cdot 2} \mathrm{TI}_{\omega}(\varepsilon_0) \tag{iv}$$
which implies $\zeta_0 = \varepsilon_0 \in \mathrm{Aut}(\omega)$. Let $\zeta_n \in \mathrm{Aut}(\omega)$ by induction hypothesis. From Lemma 8.5.29 we obtain $\vdash^{\zeta_n+7}_{0} \mathrm{TI}_{\omega^2}(\zeta_n)$ which by Lemma 8.5.28 implies $\vdash^{\zeta_n+8}_{\omega^2+\omega} \mathrm{TI}_{\omega^2}(\zeta_n\cdot\omega + 1)$. Hence $\zeta_n\cdot\omega + 1 \in \mathrm{Aut}(\omega)$ by ⋀-Inversion and we obtain with Lemma 8.5.27 $\zeta_{n+1} \in \mathrm{Aut}(\omega)$.
8.5.31 Theorem (S. Feferman, K. Schütte) Aut(ω) = Γ0.
Proof This is immediate by Corollary 8.4.4 and Lemma 8.5.30.
It follows from Theorem 8.5.31 that the Schütte-Feferman ordinal Γ0 is the limit ordinal for predicativity in the strict sense.
8.5.32 Exercise* Let A(X, x) be an arithmetical formula and $\Phi_A$ the operator induced by A(X, x) (which need not be monotone). Let ≺ be a well-ordering. We define the iterations of the arithmetical operator $\Phi_A$ in the following way. For x ∈ field(≺) we put $\Phi^{x}_{A} := \{n \in N \mid N \models A[\Phi^{\prec x}_{A}, n]\}$ where $\Phi^{\prec x}_{A}$ stands for $\{n \in N \mid (\exists z \prec x)[n \in \Phi^{z}_{A}]\}$. These iterations can be described in the second-order language of NT. Let LO(X) express that X is a set of pairs such that the relation $x \prec_X y :\Leftrightarrow \langle x, y\rangle \in X$ is a linear ordering and WO(X) :⇔ LO(X) ∧ Wf(X) express that X represents a well-ordering. For a set X we denote by $X_a := \{x \mid \langle x, a\rangle \mathrel{\varepsilon} X\}$ the a-slice of X and define $X^{\prec a} := \{x \mid (\exists z \prec a)[x \in X_z]\}$. The formula
$$\mathrm{Hier}_A(X, Y) :\Leftrightarrow (\forall x)[x \in Y \leftrightarrow A(Y^{\prec_X (x)_1}, (x)_0)]$$
expresses that Y represents the iterations of the operator $\Phi_A$ along the ordering $\prec_X$. The scheme of arithmetical transfinite recursion (cf. [94]) is then expressed by
(ATR) $(\forall X)[\mathrm{WO}(X) \to (\exists Y)\mathrm{Hier}_A(X, Y)]$.
Let (ATR) denote the second-order theory which comprises all axioms of NT together with the scheme $(\Delta^1_0\text{–CA})$ for arithmetical comprehension and the scheme (ATR) of arithmetical transfinite recursion. By $(\mathrm{ATR})_0$ we denote the theory where the scheme (IND) of Mathematical Induction is replaced by the second-order axiom $(\mathrm{Ind})^2$.
(a) Show that for every α < Γ0 there is a primitive recursive order relation ≺ such that otyp(≺) = α and $(\mathrm{ATR})_0 \vdash \mathrm{TI}(\prec)$.
(b) Show that for every $\alpha < \Gamma_{\varepsilon_0} = \varphi_{1,0}(\varepsilon_0)$ there is a primitive recursive order relation ≺ such that otyp(≺) = α and (ATR) ⊢ TI(≺).
Hint: This is far from simple. One possibility is to show that $\mathrm{NT}_{\omega_1}$ is formalizable within $(\mathrm{ATR})_0$ and then refer to Sect. 8.5. Although tedious, this is quite straightforward. You have to represent the infinitary proof trees by arithmetically definable trees tagged with ordinal notations. Of course there are direct proofs in $(\mathrm{ATR})_0$ which are much more satisfying. The basic ideas of these proofs are essentially the same as in Sect. 8.5 but become much clearer in $(\mathrm{ATR})_0$ where they are not obscured by the tedious computations of derivation lengths. A well-ordering proof for ordinals below Γ0 in a different system IR is in [23] and another for the system $(\mathrm{ATR})_0$ in [27].
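The iteration $\Phi^{x}_{A}$ from Exercise 8.5.32 can be made concrete on a finite toy instance. The sketch below is only an illustration under strong assumptions that are not in the text: the universe is a finite range of numbers instead of N, the well-ordering is given as a finite list in increasing order, and the "formula" A is an arbitrary Python predicate; `iterate_operator` is a hypothetical name.

```python
def iterate_operator(A, order, universe):
    """Compute the stages Phi^x = {n | A(Phi^{<x}, n)} along a finite ordering.

    A        -- predicate A(X, n) on a frozenset X and a number n
    order    -- field of the ordering, listed in increasing order
    universe -- finite stand-in for the natural numbers
    """
    stages = {}
    below = set()  # Phi^{<x}: union of all stages at earlier points
    for x in order:
        stage = frozenset(n for n in universe if A(frozenset(below), n))
        stages[x] = stage
        below |= stage
    return stages

# A monotone example: n is 0 or the successor of an element of X.
def successor_closure(X, n):
    return n == 0 or (n - 1) in X

stages = iterate_operator(successor_closure, order=[0, 1, 2], universe=range(6))
print({x: sorted(s) for x, s in stages.items()})

# A non-monotone example: n is small and not yet in X.
def fresh_small(X, n):
    return n < 3 and n not in X

print({x: sorted(s) for x, s in
       iterate_operator(fresh_small, [0, 1], range(6)).items()})
```

The second predicate shows why the exercise does not require monotonicity: the stage at a point is computed from the union of the earlier stages, whatever shape A has.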
Chapter 9
Ordinal Analysis of the Theory for Inductive Definitions
Having exhausted the limits of predicativity, we are ready for the first step into impredicativity. This book will solely be restricted to this first step. The further steps are more complicated and will be treated in a forthcoming volume. As mentioned in the discussion on predicativity at the beginning of Sect. 8.3 the definition of the least fixed-point of a monotone inductive definition Γ as the intersection of all Γ -closed sets carries the features of an impredicative definition. An axiom system for a theory, which allows the formation of such fixed-points will therefore be the first (and simplest) example for an impredicative theory.
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, © Springer-Verlag Berlin Heidelberg 2009

9.1 The Theory $\mathrm{ID}_1$
We want to axiomatize the theory for positively definable inductive definitions over the natural numbers. According to Corollary 6.5.6, we can express $\Pi^1_1$-relations by inductively defined relations. Therefore we can dispense with set parameters in the theory, to save some case distinctions and also to give examples for some of the phenomena which are characteristic for impredicative proof theory. To define the language of the theory for inductive definitions, let us assume that we have $n$-ary relation variables (instead of only unary ones) in the language $\mathcal{L}(NT)$. Since there are primitive recursive coding functions in $\mathcal{L}(NT)$ this is not really necessary but it facilitates the introduction of new constants for fixed-points. We will, however, continue to talk about “set”-variables and “set”-parameters.
9.1.1 Definition The language $\mathcal{L}(ID)$ comprises the first-order language of NT. For every $X$-positive formula $F(X, x_1, \ldots, x_n)$ of $\mathcal{L}(NT)$, where $X$ is supposed to be an $n$-ary predicate variable, we introduce a new $n$-ary set constant $I_F$. The theory $\mathrm{ID}_1$ comprises NT including the scheme for Mathematical Induction for the extended language (without set variables) together with the defining axioms for the set constants
$(\mathrm{ID}^1_1)$ $(\forall x)[F(I_F, x) \rightarrow x \mathrel{\varepsilon} I_F]$
expressing that $I_F$ is closed under the operator defined by the $X$-positive formula $F(X,x)$ and
$(\mathrm{ID}^1_2)$ $(\forall x)[F(G, x) \rightarrow G(x)] \rightarrow (\forall x)[x \mathrel{\varepsilon} I_F \rightarrow G(x)]$,
expressing that it is the least such set. Here the notion $F(G,x)$ stands for the formula obtained from $F(X,x)$ by replacing all occurrences of $t \mathrel{\varepsilon} X$ by $G(t)$ and of $t \mathrel{\not\varepsilon} X$ by $\neg G(t)$. We frequently use the abbreviation
$Cl_F(G) :\Leftrightarrow (\forall x)[F(G,x) \rightarrow G(x)]$ (9.1)
to express that the “class” $\{x \mid G(x)\}$ is closed under the operator $\Gamma_F$ induced by $F(X,x)$. The standard interpretation for $I_F$ is the least fixed point $I_F$ of the operator $\Phi_F$ as introduced in Definition 6.4.1. The following two properties are left as exercises.
$\mathbb{N} \models n \mathrel{\varepsilon} I_F \Leftrightarrow \mathbb{N} \models (\forall X)[Cl_F(X) \rightarrow n \mathrel{\varepsilon} X]$ (9.2)
$\mathrm{ID}_1 \vdash (\forall x)[F(I_F, x) \leftrightarrow x \mathrel{\varepsilon} I_F]$. (9.3)
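As a finite illustration of the closure axiom, the leastness axiom and the fixed-point property (9.3), the following sketch computes the least fixed point of a monotone operator on a small finite universe by iterating from the empty set. The universe, the operator `phi` and all function names are assumptions made for this example and do not come from the text. Over the natural numbers the iteration may need transfinitely many stages, which this toy setting does not capture.

```python
from itertools import combinations

# Hypothetical finite stand-in for the natural numbers.
UNIVERSE = frozenset(range(10))


def phi(xs):
    """Operator induced by the X-positive clause 'x = 0 or x - 2 is in X'."""
    return frozenset(x for x in UNIVERSE if x == 0 or (x - 2) in xs)


def least_fixed_point(op):
    """Iterate op from the empty set until the stages stop growing."""
    stage = frozenset()
    while True:
        nxt = op(stage)
        if nxt == stage:
            return stage
        stage = nxt


def closed_sets(op):
    """Yield every subset G of UNIVERSE with op(G) contained in G."""
    elems = sorted(UNIVERSE)
    for size in range(len(elems) + 1):
        for combo in combinations(elems, size):
            g = frozenset(combo)
            if op(g) <= g:
                yield g


I_F = least_fixed_point(phi)
print(sorted(I_F))
```

In this toy case `I_F` is closed under `phi`, is contained in every closed subset of the universe, and satisfies `phi(I_F) == I_F`, which mirrors the two defining axioms and (9.3) respectively.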
In the following, we distinguish the theory $\mathrm{ID}_1(X)$, which allows set parameters in its language, from the theory $\mathrm{ID}_1$ without set parameters. Since there are no defining axioms for the set parameters, every model for $\mathrm{ID}_1$ can easily be expanded to a model of $\mathrm{ID}_1(X)$. Therefore $\mathrm{ID}_1(X)$ is a conservative extension of $\mathrm{ID}_1$. The definition of the proof-theoretic ordinal as well as the definition of the $\Pi^1_1$-ordinal of a theory require (pseudo-)$\Pi^1_1$-sentences in the language of the theory. Therefore only $\|\mathrm{ID}_1(X)\|$ and $\|\mathrm{ID}_1(X)\|_{\Pi^1_1}$ are meaningful. In the first step we want to show that for impredicative theories comprising $\mathrm{ID}_1$ set parameters become dispensable.
9.1.2 Definition We define the ordinal
$\kappa^{\mathrm{ID}_1} := \sup\{|n|_F + 1 \mid F(X,x) \text{ is } X\text{-positive} \wedge \mathrm{ID}_1 \vdash n \mathrel{\varepsilon} I_F\}$.
The set $\{|n|_F + 1 \mid F(X,x) \text{ is } X\text{-positive} \wedge \mathrm{ID}_1 \vdash n \mathrel{\varepsilon} I_F\}$ is a $\Sigma^1_1$-definable set of ordinals less than $\omega_1^{CK}$. This implies $\kappa^{\mathrm{ID}_1} < \omega_1^{CK}$ (see footnote 1). We are going to show that computation of $\kappa^{\mathrm{ID}_1}$ yields an ordinal analysis for $\mathrm{ID}_1(X)$. First we obtain
$\mathrm{ID}_1(X) \vdash F(X) \;\Rightarrow\; \mathrm{ID}_1(X) \vdash F(G)$ (9.4)
in the same way as in the case of NT in Lemma 7.4.2.
Footnote 1: For details consult [4]. Cf. also Sect. 11.
Then we claim
$\mathrm{ID}_1(X) \vdash TI(\prec, X) \;\Leftrightarrow\; \mathrm{ID}_1 \vdash (\forall x \mathrel{\varepsilon} \mathrm{field}(\prec))[x \mathrel{\varepsilon} Acc_\prec]$ (9.5)
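where $Acc_\prec$ has been defined in Definition 6.5.1 as the fixed-point of the formula $F_\prec :\Leftrightarrow Acc(X, x, \prec) \Leftrightarrow x \mathrel{\varepsilon} \mathrm{field}(\prec) \wedge (\forall y \prec x)[y \mathrel{\varepsilon} X]$. To check (9.5) observe that $TI(\prec, X)$ means $Cl_{F_\prec}(X) \rightarrow (\forall x \mathrel{\varepsilon} \mathrm{field}(\prec))[x \mathrel{\varepsilon} X]$. The assumption $\mathrm{ID}_1(X) \vdash TI(\prec, X)$ therefore entails
$\mathrm{ID}_1(X) \vdash Cl_{F_\prec}(Acc_\prec) \rightarrow (\forall x \mathrel{\varepsilon} \mathrm{field}(\prec))[x \mathrel{\varepsilon} Acc_\prec]$
by (9.4). But $Cl_{F_\prec}(Acc_\prec)$ is an axiom $(\mathrm{ID}^1_1)$ of $\mathrm{ID}_1(X)$. Hence
$\mathrm{ID}_1(X) \vdash (\forall x \mathrel{\varepsilon} \mathrm{field}(\prec))[x \mathrel{\varepsilon} Acc_\prec]$,
and since this sentence contains no set parameters and $\mathrm{ID}_1(X)$ is conservative over $\mathrm{ID}_1$, it is also provable in $\mathrm{ID}_1$.
For the opposite direction we notice that $Cl_{F_\prec}(X) \rightarrow (\forall x)[x \mathrel{\varepsilon} Acc_\prec \rightarrow x \mathrel{\varepsilon} X]$ is an instance of axiom $(\mathrm{ID}^1_2)$ of $\mathrm{ID}_1(X)$. From
$\mathrm{ID}_1(X) \vdash (\forall x \mathrel{\varepsilon} \mathrm{field}(\prec))[x \mathrel{\varepsilon} Acc_\prec]$
we therefore immediately get
$\mathrm{ID}_1(X) \vdash Cl_{F_\prec}(X) \rightarrow (\forall x)[x \mathrel{\varepsilon} \mathrm{field}(\prec) \rightarrow x \mathrel{\varepsilon} X]$,
i.e., $\mathrm{ID}_1(X) \vdash TI(\prec, X)$.
If $\prec$ is an arithmetically definable transitive relation such that $\mathrm{ID}_1(X) \vdash TI(\prec, X)$ we get by (9.5) and Lemma 6.5.2
$\mathrm{otyp}(\prec) = \sup\{\mathrm{otyp}_\prec(x) + 1 \mid x \mathrel{\varepsilon} \mathrm{field}(\prec)\} \le \sup\{|x|_{F_\prec} + 1 \mid \mathrm{ID}_1 \vdash x \mathrel{\varepsilon} Acc_\prec\} \le \kappa^{\mathrm{ID}_1}$, (9.6)
hence $\|\mathrm{ID}_1(X)\| \le \kappa^{\mathrm{ID}_1}$. But we also have
$\mathrm{ID}_1 \vdash t \mathrel{\varepsilon} I_F \;\Leftrightarrow\; \mathrm{ID}_1(X) \vdash Cl_F(X) \rightarrow t \mathrel{\varepsilon} X$. (9.7)
The direction from left to right is true since $Cl_F(X) \rightarrow t \mathrel{\varepsilon} I_F \rightarrow t \mathrel{\varepsilon} X$ is an instance of the axiom $(\mathrm{ID}^1_2)$, and the opposite direction follows because $\mathrm{ID}_1 \vdash t \mathrel{\varepsilon} I_F$ is obvious from the instantiation $Cl_F(I_F) \rightarrow t \mathrel{\varepsilon} I_F$ and axiom $(\mathrm{ID}^1_1)$, which says $Cl_F(I_F)$.
By the Stage Theorem (Theorem 6.6.1) we have $\kappa^{\mathrm{ID}_1} \le 2^{\|\mathrm{ID}_1(X)\|_{\Pi^1_1}}$. Together with (9.6) and (9.7) this implies
$\|\mathrm{ID}_1(X)\| \le \kappa^{\mathrm{ID}_1} \le 2^{\|\mathrm{ID}_1(X)\|_{\Pi^1_1}} = 2^{\|\mathrm{ID}_1(X)\|} = \|\mathrm{ID}_1(X)\|$ (9.8)
where we anticipated that $\|\mathrm{ID}_1(X)\|$ is an $\varepsilon$-number. This fact will follow only from the well-ordering proof in Sect. 9.6.2. However, reviewing the well-ordering proof in Sect. 7.4, we see even now that the proof-theoretic ordinal of any theory which comprises the scheme of Mathematical Induction is closed under $\omega$-powers, hence an $\varepsilon$-number.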
Equation (9.8) confirms our decision not to include set parameters in the language of $\mathrm{ID}_1$. Our aim is therefore to compute $\kappa^{\mathrm{ID}_1}$ (see footnote 2).
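The role of the accessible part $Acc_\prec$ in (9.5) and (9.6) can be made concrete on finite relations. The sketch below is only an illustration under that finiteness assumption: the function name `accessible_part`, the representation of the relation as a Python predicate `less`, and the sample relations are hypothetical and not part of the text. It iterates the operator given by "x is in the field and all predecessors of x are already in X" and records for each element the first stage at which it enters, which plays the role of the norm $|x|_{F_\prec}$.

```python
def accessible_part(field, less):
    """Return a dict mapping each accessible x to its stage (norm).

    At stage s the dict already holds all elements of norm below s; an
    element enters at stage s when all of its predecessors are present.
    """
    elems = list(field)
    norm = {}
    stage = 0
    while True:
        new = [x for x in elems
               if x not in norm
               and all(y in norm for y in elems if less(y, x))]
        if not new:
            return norm
        for x in new:
            norm[x] = stage
        stage += 1


# A finite linear order: every element is accessible and the norm of x
# is its order type, so sup{norm + 1} is the order type of the relation.
linear = accessible_part(range(4), lambda a, b: a < b)

# A relation with a two-element cycle: only the isolated point is accessible.
cyclic = accessible_part([0, 1, 2], lambda a, b: (a, b) in {(0, 1), (1, 0)})

# Proper divisibility on a few numbers (a well-founded partial order).
divis = accessible_part([1, 2, 3, 4, 6, 12],
                        lambda a, b: a != b and b % a == 0)
```

For the linear order the supremum of `norm + 1` is 4, the order type of the relation, which is the finite shadow of the first inequality in (9.6). For the cyclic relation the elements 0 and 1 never enter, so transfinite induction along it fails.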
9.2 The Language $\mathcal{L}_\infty(NT)$
The aim of this section is to provide a fragment of the language $\mathcal{L}_{\omega_1+1}$, which on the one side is large enough to allow an embedding of the theory $\mathrm{ID}_1$ and on the other side is simple enough to be handled proof-theoretically. The basic idea is to represent the stages of arithmetically definable monotone operators by infinitely long formulas.
9.2.1 Definition (The language $\mathcal{L}_\infty(NT)$) We define the language $\mathcal{L}_\infty(NT)$ as a fragment of the language $\mathcal{L}_{\omega_1+1}$ defined in Definition 8.1.1. We dispense with set parameters and restrict the formation of disjunctions and conjunctions to finite sequences, sequences of the form $\langle F_u(n) \mid n \in \omega\rangle$ and to sequences of the form
$\langle F(I_F^{<\xi}, t) \mid \xi < \lambda \le \omega_1\rangle$ and $\langle \neg F(I_F^{<\xi}, t) \mid \xi < \lambda \le \omega_1\rangle$
for an $X$-positive $\mathcal{L}(NT)$-formula $F(X,x)$ where $t \mathrel{\varepsilon} I_F^{<\xi}$ is recursively defined by
$t \mathrel{\varepsilon} I_F^{<\xi} :\Leftrightarrow \bigvee_{\eta < \xi} F(I_F^{<\eta}, t)$. (9.9)
Since $\mathcal{L}_\infty(NT)$ is a sublanguage of $\mathcal{L}_{\omega_1+1}$, the infinite verification calculus $\models^{\alpha} \Delta$ of Sect. 8 is also a correct verification relation for $\mathcal{L}_\infty(NT)$. We mentioned in Sect. 8.1 that there is no completeness theorem for $\mathcal{L}_{\omega_1+1}$. For the fragment $\mathcal{L}_\infty(NT)$, however, we get completeness nearly for free. The reason is the absence of set variables in this fragment. In $\mathcal{L}_\infty(NT)$ there are only sentences which belong either to $\bigwedge$–type or to $\bigvee$–type. Therefore we can dispense with the rule (Ax) in the definition of $\models^{\alpha} \Delta$. All we need are the following two clauses (cf. Definitions 5.4.3 and 8.2.1).
($\bigwedge$) If $F \in (\Delta \cap \bigwedge$–type$)$ and $(\forall G \in CS(F))(\exists \alpha_G < \alpha)\ \models^{\alpha_G} \Delta, G$ then $\models^{\alpha} \Delta$.
($\bigvee$) If $F \in (\Delta \cap \bigvee$–type$)$ and $(\exists G \in CS(F))(\exists \alpha_G < \alpha)\ \models^{\alpha_G} \Delta, G$ then $\models^{\alpha} \Delta$.
Footnote 2: As a word of warning we want to emphasize that the identity of $\kappa^{\mathrm{ID}_1}$ and $\|\mathrm{ID}_1(X)\|$ depends on the presence of axiom $(\mathrm{ID}^1_2)$. There are meta-predicative theories in which (among others) $(\mathrm{ID}^1_1)$ is replaced by $(\forall x)[F(I_F, x) \leftrightarrow x \mathrel{\varepsilon} I_F]$ and $(\mathrm{ID}^1_2)$ is omitted. In such theories $T$ the ordinals $\kappa^T$ and $\|T\|$ may differ. On the side of subsystems of set theories the ordinal $\kappa^T$ corresponds to the ordinal $\|T\|_{\Sigma^{\omega_1^{CK}}}$ (cf. Remark 11.7.3). A similar effect of incoherence of $\|T\|$ and $\|T\|_{\Sigma^{\omega_1^{CK}}}$ for subsystems of set theory with restricted foundation occurs in Theorem 12.6.14.
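The completeness of the verification relation is stated in the next lemma.
9.2.2 Lemma For $F \in \mathcal{L}_\infty(NT)$ we have
$\mathbb{N} \models F \;\Rightarrow\; \models^{\mathrm{rnk}(F)} F$.
Proof We induct on $\mathrm{rnk}(F)$. If $CS(F) = \emptyset$ then $F \in \bigwedge$–type, which immediately implies $\models^{0} F$. If $F \in \bigwedge$–type and $G \in CS(F)$ then we have $\mathbb{N} \models G$ and thus by induction hypothesis
$\models^{\mathrm{rnk}(G)} G$ (i)
for all $G \in CS(F)$. From (i) we obtain $\models^{\mathrm{rnk}(F)} F$ by an inference ($\bigwedge$). If $F \in \bigvee$–type then $CS(F) \neq \emptyset$ and there is a sentence $G \in CS(F)$ such that $\mathbb{N} \models G$. Thus
$\models^{\mathrm{rnk}(G)} G$ (ii)
by induction hypothesis, which implies $\models^{\mathrm{rnk}(F)} F$ by an inference ($\bigvee$).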
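There is an obvious embedding of the language $\mathcal{L}(NT)$ into the fragment $\mathcal{L}_\infty(NT)$.
9.2.3 Definition We define an embedding ${}^* : \mathcal{L}(NT) \longrightarrow \mathcal{L}_\infty(NT)$ by
• $(s = t)^* := (s = t)$ and $(s \neq t)^* := (s \neq t)$
• $(A \wedge B)^* := \bigwedge \langle A^*, B^* \rangle$
• $(A \vee B)^* := \bigvee \langle A^*, B^* \rangle$
• $((\forall x)F_u(x))^* := \bigwedge_{n<\omega} F_u(n)^*$
• $((\exists x)F_u(x))^* := \bigvee_{n<\omega} F_u(n)^*$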
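The language $\mathcal{L}(NT)$ can therefore be regarded as a sublanguage of $\mathcal{L}_\infty(NT)$. However, by (9.9) also the stages of an inductive definition over $\mathbb{N}$ can be easily expressed in $\mathcal{L}_\infty(NT)$.
9.2.4 Lemma Let $F(X,x)$ be a formula in $\mathcal{L}(NT)$. Recall the definition of $(t \mathrel{\varepsilon} I_F^{<\xi})$ in (9.9). As a shorthand we define
$t \mathrel{\varepsilon} I_F^{\alpha} :\Leftrightarrow F(I_F^{<\alpha}, t)$.
Then we obtain
$\mathbb{N} \models t \mathrel{\varepsilon} I_F^{\alpha} \;\Leftrightarrow\; t^{\mathbb{N}} \in I_F^{\alpha}$ (9.10)
where $I_F^{\alpha}$ on the right-hand side denotes the stages of the inductive definition induced by $F$ in the sense of Definition 6.4.5 and Definition 6.3.1.
Proof We prove the lemma by induction on $\alpha$. We have
$\mathbb{N} \models t \mathrel{\varepsilon} I_F^{<\alpha} \;\Leftrightarrow\; t^{\mathbb{N}} \in I_F^{<\alpha}$ (i)
by the induction hypothesis. But from (i) we immediately obtain
$\mathbb{N} \models F(I_F^{<\alpha}, t) \;\Leftrightarrow\; \mathbb{N} \models F(I_F^{<\alpha}, t) \;\Leftrightarrow\; t^{\mathbb{N}} \in \Phi_F(I_F^{<\alpha}) \;\Leftrightarrow\; t^{\mathbb{N}} \in I_F^{\alpha}$,
where in the first formula $I_F^{<\alpha}$ is the infinitary expression of (9.9) and in the second the set of stages below $\alpha$.
In short, we also use
$t \mathrel{\not\varepsilon} I_F^{\alpha} :\Leftrightarrow \neg F(I_F^{<\alpha}, t)$.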
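If $F(X,x)$ is an $X$-positive $\mathcal{L}(NT)$-formula, we get $|F| \le \omega_1^{CK}$ by Theorem 6.6.4. Hence $I_F = I_F^{<\omega_1^{CK}} = I_F^{<\omega_1}$. Let us use $\Omega$ as a symbol which can either be interpreted by $\omega_1^{CK}$ or $\omega_1$ (see footnote 3). We may now extend the embedding ${}^* : \mathcal{L}(NT) \longrightarrow \mathcal{L}_\infty(NT)$ to an embedding
${}^* : \mathcal{L}(ID) \longrightarrow \mathcal{L}_\infty(NT)$ (9.11)
by adding the clause
• $(t \mathrel{\varepsilon} I_F)^* := (t \mathrel{\varepsilon} I_F^{<\Omega})$ and $(t \mathrel{\not\varepsilon} I_F)^* := (t \mathrel{\not\varepsilon} I_F^{<\Omega})$
to Definition 9.2.3.
Footnote 3: Possible additional alternative interpretations for $\Omega$ are discussed in Sect. 9.7.
9.2.5 Lemma For any $\mathcal{L}(ID)$-sentence $G$ we get $\mathbb{N} \models G \Leftrightarrow \mathbb{N} \models G^*$.
Proof The proof is a straightforward induction on $\mathrm{rnk}(G)$. The only remarkable case is that $G$ is a formula $t \mathrel{\varepsilon} I_F$ or $t \mathrel{\not\varepsilon} I_F$, so that $G^*$ is $t \mathrel{\varepsilon} I_F^{<\Omega}$ or $t \mathrel{\not\varepsilon} I_F^{<\Omega}$. By Lemma 9.2.4 we obtain $\{n \mid \mathbb{N} \models n \mathrel{\varepsilon} I_F^{<\Omega}\} = I_F$ irrespective of the interpretation of $\Omega$. This settles also this case.
For an $\mathcal{L}(ID)$-sentence $F$ we define its truth complexity by
$tc(F) := tc(F^*) = \min\{\alpha \mid \models^{\alpha} F^*\}$ if this exists, and $tc(F) := \infty$ otherwise.
The following observation (9.12) shows that the “truth complexity” of the provable sentences of $\mathrm{ID}_1$ also provides an upper bound for $\kappa^{\mathrm{ID}_1}$.
$\models^{\alpha} n \mathrel{\varepsilon} I_F^{<\Omega}$ implies $\models^{\alpha} n \mathrel{\varepsilon} I_F^{<\alpha}$ and thus also $|n|_F < \alpha$. (9.12)
It is not difficult to prove (9.12) directly (cf. Exercise 9.3.18). However, we omit the proof because we need it in the modified form of Corollary 9.3.17 which will be proved there.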
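9.3 The Semi-Formal System for $\mathcal{L}_\infty(NT)$
In the previous chapters, we have seen that cut elimination is the main tool in the ordinal analysis of predicative theories. The aim of this introductory section is to study whether this is also true for impredicative theories.

9.3.1 Semantical Cut-Elimination
9.3.1 Definition We adapt Definition 7.3.5 of $\vdash^{\alpha}_{\rho} \Delta$ to finite sets $\Delta$ of sentences of the language $\mathcal{L}_\infty(NT)$. Since there are no free set variables, we can dispense with clause (Ax) in the definition of $\mathcal{L}_\infty(NT) \vdash^{\alpha}_{\rho} \Delta$. So only the following clauses are left.
($\bigwedge$) If $F \in \Delta \cap \bigwedge$–type and $(\forall G \in CS(F))(\exists \alpha_G < \alpha)\ \mathcal{L}_\infty(NT) \vdash^{\alpha_G}_{\rho} \Delta, G$ then $\mathcal{L}_\infty(NT) \vdash^{\alpha}_{\rho} \Delta$.
($\bigvee$) If $F \in \Delta \cap \bigvee$–type and $(\exists G \in CS(F))(\exists \alpha_G < \alpha)\ \mathcal{L}_\infty(NT) \vdash^{\alpha_G}_{\rho} \Delta, G$ then $\mathcal{L}_\infty(NT) \vdash^{\alpha}_{\rho} \Delta$.
(cut) If $\mathcal{L}_\infty(NT) \vdash^{\alpha_0}_{\rho} \Delta, F$ and $\mathcal{L}_\infty(NT) \vdash^{\alpha_0}_{\rho} \Delta, \neg F$ for some $\mathcal{L}_\infty(NT)$-formula $F$ such that $\mathrm{rnk}(F) < \rho$ then $\mathcal{L}_\infty(NT) \vdash^{\alpha}_{\rho} \Delta$ for all $\alpha > \alpha_0$.
The equivalence (7.5) on p. 112 remains true for $\mathcal{L}_\infty(NT) \vdash^{\alpha}_{0} \Delta$. Instead of $\mathcal{L}_\infty(NT) \vdash^{\alpha}_{\rho} \Delta$, we will from now on briefly write $\vdash^{\alpha}_{\rho} \Delta$. Our first observation is that, due to the fact that there are only sentences in $\mathcal{L}_\infty(NT)$, cut elimination for this infinitary system comes nearly for free.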
9.3.2 Theorem (Semantical cut elimination) Let $\Gamma$ be a finite set of $\mathcal{L}_\infty(NT)$-sentences. Then $\vdash^{\alpha}_{\rho} \Gamma$ already implies $\models^{\alpha} \Gamma$.
Proof We prove
$\vdash^{\alpha}_{\rho} \Gamma, \Delta$ and $\mathbb{N} \not\models F$ for all $F \in \Delta$ $\;\Rightarrow\; \models^{\alpha} \Gamma$ (i)
by induction on $\alpha$. The claim then follows from (i) taking $\Delta$ to be empty. In the case of a cut we have the premises
$\vdash^{\alpha_0}_{\rho} \Gamma, \Delta, F$ and $\vdash^{\alpha_0}_{\rho} \Gamma, \Delta, \neg F$. (ii)
Then either $\mathbb{N} \not\models F$ or $\mathbb{N} \not\models \neg F$. Using the induction hypothesis on the corresponding premise we get the claim. In case of an inference according to ($\bigwedge$) there is a formula $F \in \bigwedge$–type $\cap\, (\Gamma \cup \Delta)$ and we have the premises
$\vdash^{\alpha_G}_{\rho} \Gamma, \Delta, G$ (iii)
for all $G \in CS(F)$. If $F \in \Delta$ there is a $G \in CS(F)$ such that $\mathbb{N} \not\models G$ and we obtain $\models^{\alpha} \Gamma$ from (iii) by the induction hypothesis and the structural rule. If $F \in \Gamma$ we obtain
$\models^{\alpha_G} \Gamma, G$ (iv)
for all $G \in CS(F)$ by the induction hypothesis and $\models^{\alpha} \Gamma$ from (iv) by an inference ($\bigwedge$). In case of an inference according to ($\bigvee$) there is a formula $F \in \bigvee$–type $\cap\, (\Delta \cup \Gamma)$ and we have the premise
$\vdash^{\alpha_0}_{\rho} \Gamma, \Delta, G_0$ (v)
for some $G_0 \in CS(F)$. If $F \in \Delta$ we have $\mathbb{N} \not\models G$ for all $G \in CS(F)$ and obtain the claim from (v) by the induction hypothesis and the structural rule. If $F \in \Gamma$ we get $\models^{\alpha_0} \Gamma, G_0$ from (v) by the induction hypothesis and from that $\models^{\alpha} \Gamma$ by an inference ($\bigvee$).
9.3.3 Remark There is also a semantical proof of the Cut-Elimination Theorem for $NT_{\omega_1}$. From $NT_{\omega_1} \vdash^{\alpha}_{\rho} F$ we obtain by Lemma 7.3.6 $\mathbb{N} \models F[\Phi]$ for all assignments $\Phi$. By the Completeness Theorem (Theorem 8.2.2) we then get an ordinal $\delta < \omega_1$ such that $\models^{\delta} F$. On the one hand, no further information on the size of $\delta$ is provided by this proof (which makes it useless for ordinal analysis). On the other hand, this proof is much deeper than that of Theorem 9.3.2 since it uses the Completeness Theorem (which we regard as a deeper theorem; see footnote 4), although Theorem 9.3.2 provides a bound for $\delta$, namely $\alpha$ itself. This illuminates that cut elimination alone cannot play the same exclusive role in the ordinal analysis for $\mathrm{ID}_1$ as it does in the ordinal analysis of predicative theories.
Footnote 4: The Completeness Theorem for $\mathcal{L}_\infty(NT)$ is Lemma 9.2.2 which is also close to trivial. A completeness theorem for $\mathcal{L}_\infty(NT, X)$ with free set variables fails (cf. Exercise 8.2.3). But this can be rectified by replacing $\omega_1$ by $\omega_1^{CK}$. This is sufficient to embed $\mathrm{ID}_1$ and makes $\mathcal{L}_\infty(NT, X)$ a fragment of $\mathcal{L}_{\omega_1}$ for which Theorem 8.2.2 holds.
We have seen that the language $\mathcal{L}_\infty(NT)$ is sufficiently expressive to allow an embedding of the language of $\mathrm{ID}_1$. Theorem 9.3.2 tells us that the ordinal obtained by the embedding procedure is already an upper bound for the truth complexities of the provable sentences of $\mathrm{ID}_1$. No real cut elimination procedure is needed. This raises the suspicion that the ordinal bound obtained by embedding is too coarse. We are going to examine this phenomenon more closely.
In view of Remark 9.3.3 we have to check the embedding strategy carefully. First we notice that already the translation of pure logic needs some extra care. We cannot transfer Theorem 7.3.2 literally. The sentence $t \mathrel{\varepsilon} I_F$ is an atomic sentence of $\mathcal{L}(ID)$, while its translation $(t \mathrel{\varepsilon} I_F)^*$ is no longer an atomic sentence of $\mathcal{L}_\infty(NT)$. Because of $\mathrm{rnk}(t \mathrel{\varepsilon} I_F^{<\Omega}) = \Omega$ we obtain by the Tautology Lemma (Lemma 7.3.3)
$\models^{\Omega} \Delta, t \mathrel{\varepsilon} I_F^{<\Omega}, t \mathrel{\not\varepsilon} I_F^{<\Omega}$. (9.13)
Thus Theorem 7.3.2 can be modified to
9.3.4 Theorem If $\vdash^{m}_{T} \Delta(x)$ holds for a finite set of $\mathcal{L}(ID)$-formulas then $\models^{\Omega+m} \Delta(n)^*$ for all tuples $n$ of numerals.
The truth complexities of the defining axioms for primitive recursive functions are not altered. More caution is again needed for the last identity axiom $(\forall x)(\forall y)[x = y \rightarrow F(x) \rightarrow F(y)]$. But here we get
$\models^{\Omega+n} (\forall x)(\forall y)[x = y \rightarrow F(x) \rightarrow F(y)]$
for some $n < \omega$ again by the Tautology Lemma and the fact that all translations of $\mathcal{L}(ID)$-formulas have ranks $\Omega + m$ for some finite ordinal $m$. By the same fact and the Induction Lemma (Lemma 7.3.4) we also obtain
$\models^{\Omega+\omega+4} G^*$
for all instances $G$ of the scheme of Mathematical Induction in $\mathrm{ID}_1$. It remains to check the truth complexities for the axioms $(\mathrm{ID}^1_1)$ and $(\mathrm{ID}^1_2)$. The Induction Lemma can be generalized to
$\models^{2\cdot\mathrm{rnk}(G)+\omega\cdot(\alpha+1)} \neg(\forall x)[F(G,x) \rightarrow G(x)],\ t \mathrel{\not\varepsilon} I_F^{\alpha},\ G(t)$ (9.14)
which yields $\Omega\cdot 2+\omega$ as an upper bound for the truth complexities for translations of axiom $(\mathrm{ID}^1_2)$. We are not going to prove (9.14) here. It will follow from Lemma 9.5.3 given below.
Up to now we never used the peculiarities of the language $\mathcal{L}_\infty(NT)$. All the mentioned results hold for any language $\mathcal{L}_\kappa$ even with free set variables. Of another quality is the translation $(\mathrm{ID}^1_1)^*$ of axiom $(\mathrm{ID}^1_1)$. The sentence $(\mathrm{ID}^1_1)^*$ can be regarded as the defining axiom for the ordinal symbol $\Omega$ since it postulates that $\Omega$ represents an ordinal which is an upper bound for the closure ordinals of all arithmetically definable inductive definitions. We cannot expect to import the properties of $\Omega$ from outside into the semi-formal proof relation in a similar way as we did in the case of the Induction Lemma and its generalization (9.14). To obtain a verification for $(\mathrm{ID}^1_1)$ we have to fall back on Lemma 9.2.2 by which we obtain
$\models^{\Omega+n} Cl_F(I_F^{<\Omega})$
since $\mathrm{rnk}(Cl_F(I_F^{<\Omega})) = \Omega + n$ for some $n < \omega$. Combining that with Theorem 9.3.4 we obtain infinite derivations $\vdash^{\Omega\cdot 2+\omega}_{\Omega+m} F^*$ for sentences $F$ which are provable in $\mathrm{ID}_1$. It follows from Theorem 9.3.2 that we can also get a cut-free derivation $\models^{\Omega\cdot 2+\omega} F^*$ but this bound is much too big for an ordinal analysis. Since obviously $\kappa^{\mathrm{ID}_1} \le \kappa^{\mathbb{N}} = \omega_1^{CK} < \omega_1$ a better bound is already obtained by Theorem 6.6.4.
These observations show that the ordinal analysis for $\mathrm{ID}_1$ needs something new. We need a collapsing procedure that allows us to collapse the length of a derivation for the formula $n \mathrel{\varepsilon} I_F^{<\Omega}$ obtained by the canonical embedding into a derivation with length below $\Omega$. The hallmark of impredicative proof theory is thus no longer cut-elimination but the necessity of a collapsing procedure (see footnote 5). But ordinals, as hereditarily transitive sets, are in general not collapsible. A possible remedy is to use an ordinal notation system with gaps and to assign only ordinals from the notation system to the nodes of the derivation trees of the semi-formal system. This was the way originally used in [71] and [73]. Since this method heavily uses the fact that the stages of an inductive definition are locally predicatively defined we talk about the method of local predicativity. Here we are going to use a simplification of the method of local predicativity due to Buchholz [13]. We do not have to start with a notation system but assign ordinals which are controlled by operators. This again will leave enough gaps for a collapsing procedure. We will, however, not copy Buchholz’ proof directly but introduce a slight variant which pinpoints the role of collapsing even more sharply.
Footnote 5: We will, however, see that cut-elimination again plays the important role in the collapsing procedure.

9.3.2 Operator Controlled Derivations
Recall the notion of a strongly critical ordinal as defined in (3.10) on p. 39. Lemma 3.4.17 together with the Cantor normal-form for ordinals shows that every ordinal has a normal form
$\alpha =_{NF} \varphi_{\alpha_1}(\beta_1) + \cdots + \varphi_{\alpha_n}(\beta_n)$ (9.15)
such that $\beta_i < \alpha$ for $i = 1, \ldots, n$ and $\varphi_{\alpha_1}(\beta_1) \ge \cdots \ge \varphi_{\alpha_n}(\beta_n)$.
9.3.5 Definition We define the set $SC(\alpha)$ of strongly critical components of an ordinal $\alpha$ as follows.
$$SC(\alpha) := \begin{cases} \emptyset & \text{if } \alpha = 0,\\ \{\alpha\} & \text{if } \alpha \in SC,\\ SC(\alpha_1) \cup SC(\alpha_2) & \text{if } \alpha = \varphi_{\alpha_1}(\alpha_2) \text{ and } \alpha_1, \alpha_2 < \alpha,\\ \bigcup_{i=1}^{n} SC(\alpha_i) & \text{if } \alpha =_{NF} \alpha_1 + \cdots + \alpha_n \text{ and } n > 1. \end{cases}$$
9.3.6 Definition A Skolem-hull operator is a function $\mathcal{H}$, which maps sets of ordinals to sets of ordinals satisfying the conditions
• For all $X \subseteq On$ it is $X \subseteq \mathcal{H}(X)$
• If $Y \subseteq \mathcal{H}(X)$ then $\mathcal{H}(Y) \subseteq \mathcal{H}(X)$
• $|\mathcal{H}(X)| = \max\{|X|, \aleph_0\}$.
A Skolem–hull operator $\mathcal{H}$ is Cantorian closed if it satisfies
• $\alpha \in \mathcal{H}(X) \Leftrightarrow SC(\alpha) \subseteq \mathcal{H}(X)$
for any set $X$ of ordinals. We call $\mathcal{H}$ transitive if
• $\mathcal{H}(\emptyset) \cap \Omega$ is transitive.
The “least” Cantorian closed operator $\mathcal{B}_0$ is obtained by inductively defining $SC(X) \subseteq \mathcal{B}_0(X)$ and $SC(\alpha) \subseteq \mathcal{B}_0(X) \Rightarrow \alpha \in \mathcal{B}_0(X)$ where $SC(X)$ stands for $\bigcup_{\xi \in X} SC(\xi)$. Then $\mathcal{B}_0(X)$ satisfies
$X \subseteq \mathcal{B}_0(X)$ and $\alpha \in \mathcal{B}_0(X) \Leftrightarrow SC(\alpha) \subseteq \mathcal{B}_0(X)$. (9.16)
It follows from Lemma 3.4.17 that $\mathcal{B}_0(\emptyset) = \Gamma_0$ which shows that this operator is also transitive. It shows moreover that for any Cantorian closed operator $\mathcal{H}$ and any set $X$ the image $\mathcal{H}(X)$ contains all ordinals below $\Gamma_0$. To obtain “stronger” operators we have to equip them with stronger closure properties. For the theories treated in this book it suffices to assume that all operators satisfy
• $\Omega \in \mathcal{H}(\emptyset)$.
For a set $X \subseteq On$ and an operator $\mathcal{H}$ let
• $\mathcal{H}[X] := \lambda \Xi .\, \mathcal{H}(X \cup \Xi)$.
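The first two clauses of the definition of a Skolem-hull operator, and the relativised operator that adds a fixed set of parameters, are ordinary closure-operator properties and can be tried out on a finite toy model. The following sketch is an illustration only: it uses natural numbers below a bound instead of ordinals, closure under addition instead of the ordinal functions, and hypothetical names `make_hull` and `relativise`. The cardinality clause and Cantorian closure have no counterpart in this toy model.

```python
def make_hull(bound):
    """Return a toy hull operator: closure under addition below `bound`."""
    def hull(xs):
        closed = set(xs)
        frontier = list(closed)
        while frontier:
            a = frontier.pop()
            # Combine the popped element with everything found so far.
            for b in list(closed):
                c = a + b
                if c < bound and c not in closed:
                    closed.add(c)
                    frontier.append(c)
        return frozenset(closed)
    return hull


def relativise(hull, xs):
    """Return the operator that maps a set Xi to hull(xs united with Xi)."""
    fixed = frozenset(xs)
    return lambda extra: hull(fixed | frozenset(extra))


H = make_hull(20)
print(sorted(H({3})))
```

Relativising by parameters that already lie in the hull changes nothing: with `HX = relativise(H, {6})` one gets `HX({3}) == H({3})` because 6 is in `H({3})`. This is the set-level fact behind removing superfluous parameters from a controlling operator. Relativising by a new parameter such as 5 does enlarge the hull.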
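9.3.7 Definition For a sentence $G$ in the fragment $\mathcal{L}_\infty(NT)$ we define
$\mathrm{par}(G) := \{\alpha \mid I_F^{<\alpha} \text{ occurs in } G\}$ (see footnote 6).
For a finite set $\Delta$ of sentences of the fragment $\mathcal{L}_\infty(NT)$ we define
$\mathrm{par}(\Delta) := \bigcup_{F \in \Delta} \mathrm{par}(F)$.
Footnote 6: The symbol $I_F^{<\alpha}$ is in fact an abbreviation for $\bigvee_{\xi<\alpha} F(I_F^{<\xi}, \cdot)$ or $\bigwedge_{\xi<\alpha} \neg F(I_F^{<\xi}, \cdot)$, respectively. But only $\alpha$ is counted as parameter. The $\xi$’s are treated as “bounded”.
9.3.8 Definition Let $\mathcal{H}$ be a Skolem-hull operator. We define the relation $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$ by the clauses ($\bigwedge$), ($\bigvee$) and (cut) of Definition 7.3.5 with the additional conditions
• $\alpha \in \mathcal{H}(\mathrm{par}(\Delta))$
and for an inference
$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota$ for $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho} \Delta$
with finite $I$ also
• $\mathrm{par}(\Delta_\iota) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$ for all $\iota \in I$.
We introduce the following abbreviation
$\mathcal{H}_1 \subseteq \mathcal{H}_2 :\Leftrightarrow (\forall X \subseteq On)[\mathcal{H}_1(X) \subseteq \mathcal{H}_2(X)]$.
When writing $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$ we tacitly assume that $\mathcal{H}$ is a Cantorian closed Skolem–hull operator. The Structural Lemma of Sect. 7.3 extends to
9.3.9 Lemma If $\mathcal{H}_1 \subseteq \mathcal{H}_2$, $\alpha \le \beta$, $\rho \le \sigma$, $\Delta \subseteq \Gamma$, $\beta \in \mathcal{H}_2(\mathrm{par}(\Gamma))$ and $\mathcal{H}_1 \vdash^{\alpha}_{\rho} \Delta$ then $\mathcal{H}_2 \vdash^{\beta}_{\sigma} \Gamma$.
Proof We show the lemma by induction on $\alpha$. Let
$\mathcal{H}_1 \vdash^{\alpha_\iota}_{\rho} \Delta_\iota$ for $\iota \in I$ $\;\Rightarrow\; \mathcal{H}_1 \vdash^{\alpha}_{\rho} \Delta$
be the last inference. By the induction hypothesis we then obtain
$\mathcal{H}_2 \vdash^{\alpha_\iota}_{\sigma} \Gamma, \Delta_\iota$ for $\iota \in I$. (i)
For finite $I$ we have $\mathrm{par}(\Delta_\iota) \subseteq \mathcal{H}_1(\mathrm{par}(\Delta))$ and thus $\mathrm{par}(\Gamma, \Delta_\iota) \subseteq \mathcal{H}_1(\mathrm{par}(\Gamma)) \subseteq \mathcal{H}_2(\mathrm{par}(\Gamma))$. Because of $\rho \le \sigma$ and $\alpha \le \beta \in \mathcal{H}_2(\mathrm{par}(\Gamma))$ we obtain $\mathcal{H}_2 \vdash^{\beta}_{\sigma} \Gamma$ from (i) by the same inference.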
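9.3.10 Lemma Let $X \subseteq \mathcal{H}(\mathrm{par}(\Delta))$ and $\mathcal{H}[X] \vdash^{\alpha}_{\rho} \Delta$. Then $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$.
Proof Induction on $\alpha$. Assume that
$\mathcal{H}[X] \vdash^{\alpha_\iota}_{\rho} \Delta_\iota$ for all $\iota \in I$ $\;\Rightarrow\; \mathcal{H}[X] \vdash^{\alpha}_{\rho} \Delta$
is the last inference. All inference rules have the property $\mathrm{par}(\Delta) \subseteq \mathrm{par}(\Delta_\iota)$. By the induction hypothesis we therefore obtain $\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota$. But $X \subseteq \mathcal{H}(\mathrm{par}(\Delta))$ implies $\mathcal{H}[X](\mathrm{par}(\Delta)) = \mathcal{H}(\mathrm{par}(\Delta))$ and we obtain $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$ by the same inference.
9.3.11 Lemma (Inversion Lemma) Let $F \in \bigwedge$–type, $CS(F) \neq \emptyset$ and assume $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F$. Then $\mathcal{H}[\mathrm{par}(F)] \vdash^{\alpha}_{\rho} \Delta, G$ holds true for all $G \in CS(F)$.
Proof The proof parallels that of Lemma 7.3.9 but needs some extra care on the parameters. We induct on $\alpha$. If $F$ is not the critical formula of the last inference
$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, F$ for $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F$ (i)
we obtain
$\mathcal{H}[\mathrm{par}(F)] \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, G$ (ii)
by the induction hypothesis. In case that $I$ is finite, we have to check the parameter condition. But then we have $\mathrm{par}(\Delta_\iota, F) \subseteq \mathcal{H}(\mathrm{par}(\Delta, F))$ which entails $\mathrm{par}(\Delta_\iota, G, F) \subseteq \mathcal{H}(\mathrm{par}(\Delta, G, F))$. Since $\alpha \in \mathcal{H}(\mathrm{par}(\Delta, F)) \subseteq \mathcal{H}(\mathrm{par}(\Delta, F, G))$ we obtain the claim by the same inference.
Now assume that $F$ is the critical formula of the last inference. Then this was an inference according to ($\bigwedge$) and we have the premises
$\mathcal{H} \vdash^{\alpha_G}_{\rho} \Delta, F, G$
for all $G \in CS(F)$ and obtain by the induction hypothesis
$\mathcal{H}[\mathrm{par}(F)] \vdash^{\alpha_G}_{\rho} \Delta, G$. (iii)
Since $\alpha_G < \alpha \in \mathcal{H}(\mathrm{par}(\Delta, F)) \subseteq \mathcal{H}(\mathrm{par}(\Delta, G, F))$ we get the claim from (iii) by Lemma 9.3.9.
9.3.12 Lemma ($\vee$-Exportation) Assume $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F_1 \vee \cdots \vee F_n$. Then we also have $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F_1, \ldots, F_n$.
Proof The claim follows directly by induction on $\alpha$.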
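9.3.13 Lemma If $F \in \mathrm{Diag}(\mathbb{N})$ and $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, \neg F$ then $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$.
Proof Induction on $\alpha$. Since $F \in \mathrm{Diag}(\mathbb{N})$ we have $\mathrm{par}(F) = \emptyset$ and $\neg F \in \bigvee$–type. Therefore $\neg F$ cannot be the critical formula of the last inference
$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, \neg F$ for all $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho} \Delta, \neg F$. (i)
From the induction hypothesis we get
$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota$ (ii)
which entails $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$ by the same inference because $\alpha_\iota < \alpha \in \mathcal{H}(\mathrm{par}(\Delta, \neg F)) = \mathcal{H}(\mathrm{par}(\Delta))$ and, in case that $I$ is finite, $\mathrm{par}(\Delta_\iota) \subseteq \mathrm{par}(\Delta_\iota, \neg F) \subseteq \mathcal{H}(\mathrm{par}(\Delta, F)) = \mathcal{H}(\mathrm{par}(\Delta))$.
9.3.14 Lemma (Reduction Lemma) Let $F \in \bigwedge$–type, $\rho = \mathrm{rnk}(F)$ and $\mathrm{par}(F) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$. If $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F$ and $\mathcal{H} \vdash^{\beta}_{\rho} \Gamma, \neg F$ then $\mathcal{H} \vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$.
Proof The proof is very similar to that of Lemma 7.3.12 but we need extra care on the controlling operator. If $\rho = 0$, we obtain $\mathcal{H} \vdash^{\beta}_{0} \Gamma$ by Lemma 9.3.13, which entails $\mathcal{H} \vdash^{\alpha+\beta}_{0} \Gamma, \Delta$ by Lemma 9.3.9. Therefore assume $\rho > 0$ which means $CS(F) \neq \emptyset$. We induct on $\beta$. Let us first assume that $\neg F$ is not the critical formula of the last inference
(J) $\mathcal{H} \vdash^{\beta_\iota}_{\rho} \Gamma_\iota, \neg F$ for $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\beta}_{\rho} \Gamma, \neg F$.
If (J) is an inference according to ($\bigwedge$) with empty premises then we have also
$\mathcal{H} \vdash^{\alpha+\beta}_{0} \Delta, \Gamma$.
Therefore assume that $I \neq \emptyset$. We still have $\mathrm{par}(F) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$ and obtain $\mathcal{H} \vdash^{\alpha+\beta_\iota}_{\rho} \Delta, \Gamma_\iota$ by the induction hypothesis. Since $\alpha + \beta_\iota < \alpha + \beta$, and in the case of finite $I$ also $\mathrm{par}(\Delta, \Gamma_\iota) \subseteq \mathcal{H}(\mathrm{par}(\Delta, \Gamma))$, we obtain $\mathcal{H} \vdash^{\alpha+\beta}_{\rho} \Delta, \Gamma$ by an inference (J).
Now assume that $\neg F$ is the critical formula of the last inference in $\mathcal{H} \vdash^{\beta}_{\rho} \Gamma, \neg F$. Then we have the premise $\mathcal{H} \vdash^{\beta_0}_{\rho} \Gamma, \neg F, \neg G$ for some $G \in CS(F)$ with
$\mathrm{par}(\Gamma, F, G) \subseteq \mathcal{H}(\mathrm{par}(\Gamma, F))$ (i)
and obtain $\mathcal{H} \vdash^{\alpha+\beta_0}_{\rho} \Delta, \Gamma, \neg G$ by the induction hypothesis. By inversion, the Structural Lemma, the hypothesis $\mathrm{par}(F) \subseteq \mathcal{H}(\mathrm{par}(\Delta)) \subseteq \mathcal{H}(\mathrm{par}(\Delta, \Gamma))$ and Lemma 9.3.10 we obtain from the first hypothesis also $\mathcal{H} \vdash^{\alpha+\beta_0}_{\rho} \Delta, \Gamma, G$. It is $\alpha + \beta_0 < \alpha + \beta$ and $\mathrm{rnk}(G) < \rho$. To apply a cut we still have to check
$\mathrm{par}(\Delta, \Gamma, G) \subseteq \mathcal{H}(\mathrm{par}(\Delta, \Gamma))$. (ii)
But this is secured by (i) and the hypothesis $\mathrm{par}(F) \subseteq \mathcal{H}(\mathrm{par}(\Delta)) \subseteq \mathcal{H}(\mathrm{par}(\Delta, \Gamma))$.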
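9.3.15 Theorem (Cut elimination for controlled derivations) Let $\mathcal{H}$ be a Cantorian closed Skolem–hull operator. Then
(i) $\mathcal{H} \vdash^{\alpha}_{\rho+1} \Delta \;\Rightarrow\; \mathcal{H} \vdash^{2^{\alpha}}_{\rho} \Delta$
and
(ii) $\mathcal{H} \vdash^{\alpha}_{\beta+\omega^{\rho}} \Delta$ and $\rho \in \mathcal{H}(\mathrm{par}(\Delta))$ $\;\Rightarrow\; \mathcal{H} \vdash^{\varphi_\rho(\alpha)}_{\beta} \Delta$.
Proof We show (i) by induction on $\alpha$. If the last inference
$\mathcal{H} \vdash^{\alpha_\iota}_{\rho+1} \Delta_\iota$ for $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho+1} \Delta$
is not a cut of rank $\rho$, we have $\mathcal{H} \vdash^{2^{\alpha_\iota}}_{\rho} \Delta_\iota$ by induction hypothesis and $\mathrm{par}(\Delta_\iota) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$ in the case of finite $I$. So we get $\mathcal{H} \vdash^{2^{\alpha}}_{\rho} \Delta$ by the same inference. In case that the last inference is a cut
$\mathcal{H} \vdash^{\alpha_0}_{\rho+1} \Delta, F$ and $\mathcal{H} \vdash^{\alpha_0}_{\rho+1} \Delta, \neg F$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho+1} \Delta$
of rank $\rho$ we obtain $\mathcal{H} \vdash^{2^{\alpha_0}}_{\rho} \Delta, F$ and $\mathcal{H} \vdash^{2^{\alpha_0}}_{\rho} \Delta, \neg F$ by the induction hypothesis. But either $F \in \bigwedge$–type or $\neg F \in \bigwedge$–type and $\mathrm{par}(F) = \mathrm{par}(\neg F) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$. Therefore we may apply the Reduction Lemma (Lemma 9.3.14) and the fact that $2^{\alpha_0} + 2^{\alpha_0} \le 2^{\alpha}$ to obtain $\mathcal{H} \vdash^{2^{\alpha}}_{\rho} \Delta$.
Now we prove (ii) by induction on $\rho$ with side induction on $\alpha$. For $\rho = 0$ this is proved by (i) and Lemma 9.3.9. Thus, let $\rho > 0$ and
$\mathcal{H} \vdash^{\alpha_\iota}_{\beta+\omega^{\rho}} \Delta_\iota$ for all $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\beta+\omega^{\rho}} \Delta$ (i)
be the last inference. Then we have by the side induction hypothesis
$\mathcal{H} \vdash^{\varphi_\rho(\alpha_\iota)}_{\beta} \Delta_\iota$ for all $\iota \in I$. (ii)
From $\alpha \in \mathcal{H}(\mathrm{par}(\Delta))$ and $\rho \in \mathcal{H}(\mathrm{par}(\Delta))$ we obtain $\varphi_\rho(\alpha) \in \mathcal{H}(\mathrm{par}(\Delta))$. If the last inference was not a cut of rank $\ge \beta$ we get from (ii), and in the case of a finite index set $I$ together with $\mathrm{par}(\Delta_\iota) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$, the claim
$\mathcal{H} \vdash^{\varphi_\rho(\alpha)}_{\beta} \Delta$
by the same inference. Now assume that the last inference is a cut
$\mathcal{H} \vdash^{\alpha_0}_{\beta+\omega^{\rho}} \Delta, F$ and $\mathcal{H} \vdash^{\alpha_0}_{\beta+\omega^{\rho}} \Delta, \neg F$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\beta+\omega^{\rho}} \Delta$ (iii)
such that $\beta \le \mathrm{rnk}(F) =: \sigma < \beta + \omega^{\rho}$. Then $\mathrm{par}(F) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$ which entails $\sigma \in \mathcal{H}(\mathrm{par}(\Delta))$ and for $\sigma =_{NF} \beta + \omega^{\sigma_0} + \cdots + \omega^{\sigma_n} < \beta + \omega^{\rho}$ we also get $\beta, \sigma_i \in \mathcal{H}(\mathrm{par}(\Delta))$. Then $\sigma < \beta + \omega^{\sigma_0}\cdot(n+1)$ and $\sigma_0 < \rho$ and we obtain by cut
$\mathcal{H} \vdash^{\varphi_\rho(\alpha_0)+1}_{\beta+\omega^{\sigma_0}\cdot(n+1)} \Delta$ (iv)
with $\sigma_0 \in \mathcal{H}(\mathrm{par}(\Delta))$. From (iv) we get the claim by iterated application of the main induction hypothesis together with the fact that $\varphi_{\sigma_0}(\cdots(\varphi_\rho(\alpha_0) + 1)\cdots) < \varphi_\rho(\alpha)$.
We close this section by proving (9.12) on p. 163 for operator controlled derivations. Lemma 9.3.16 given below is one of the key properties of local predicativity. It plays an important role in the elimination of the impredicative axiom $(\mathrm{ID}^1_1)^*$. For a finite set $\Delta$ of formulas we denote by $\Delta(F)$ that the formula $F$ occurs as sub-formula in some of the formulas in $\Delta$.
9.3.16 Lemma (Boundedness) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta(t \mathrel{\varepsilon} I_F^{<\beta})$ then $\mathcal{H}[\{\beta\}] \vdash^{\alpha}_{\rho} \Delta(t \mathrel{\varepsilon} I_F^{<\gamma})$ holds true for all $\gamma$ such that $\alpha \le \gamma \le \beta$.
Proof We induct on $\alpha$. In the cases that $t \mathrel{\varepsilon} I_F^{<\beta}$ is not the critical formula of the last inference
$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota(t \mathrel{\varepsilon} I_F^{<\beta})$ for $\iota \in I$ $\;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho} \Delta(t \mathrel{\varepsilon} I_F^{<\beta})$ (i)
we get
$\mathcal{H}[\{\beta\}] \vdash^{\alpha_\iota}_{\rho} \Delta_\iota(t \mathrel{\varepsilon} I_F^{<\gamma})$ (ii)
by induction hypothesis. For a finite index set $I$ we have the additional hypothesis $\mathrm{par}(\Delta_\iota(t \mathrel{\varepsilon} I_F^{<\beta})) \subseteq \mathcal{H}(\mathrm{par}(\Delta(t \mathrel{\varepsilon} I_F^{<\beta})))$ which entails $\mathrm{par}(\Delta_\iota(t \mathrel{\varepsilon} I_F^{<\gamma})) \subseteq \mathcal{H}[\{\beta\}](\mathrm{par}(\Delta(t \mathrel{\varepsilon} I_F^{<\gamma})))$ and we obtain
$\mathcal{H}[\{\beta\}] \vdash^{\alpha}_{\rho} \Delta(t \mathrel{\varepsilon} I_F^{<\gamma})$ (iii)
from (ii) by the same inference.
If $t \mathrel{\varepsilon} I_F^{<\beta}$ is the critical formula, we are in the case of an ($\bigvee$) inference with premise
$\mathcal{H} \vdash^{\alpha_0}_{\rho} \Delta_0, t \mathrel{\varepsilon} I_F^{<\beta}, t \mathrel{\varepsilon} I_F^{\xi}$ (iv)
for some $\xi < \beta$. Applying the induction hypothesis twice, we obtain
$\mathcal{H}[\{\beta, \xi\}] \vdash^{\alpha_0}_{\rho} \Delta_0, t \mathrel{\varepsilon} I_F^{<\gamma}, t \mathrel{\varepsilon} I_F^{\alpha_0}$. (v)
From $\alpha_0 \in \mathcal{H}(\mathrm{par}(\Delta_0, t \mathrel{\varepsilon} I_F^{<\beta}, t \mathrel{\varepsilon} I_F^{\xi}))$ and $\xi \in \mathcal{H}(\mathrm{par}(\Delta_0, t \mathrel{\varepsilon} I_F^{<\beta}))$ we obtain $\alpha_0 \in \mathcal{H}(\mathrm{par}(\Delta_0, I_F^{<\beta})) \subseteq \mathcal{H}[\{\beta\}](\mathrm{par}(\Delta_0, I_F^{<\gamma}))$ and $\mathcal{H}[\{\beta, \xi\}](\mathrm{par}(\Delta_0, I_F^{<\gamma})) = \mathcal{H}[\{\beta\}](\mathrm{par}(\Delta_0, I_F^{<\gamma}))$. Since $\alpha_0 < \alpha \le \gamma$ we can apply Lemma 9.3.10 and an inference ($\bigvee$) to obtain $\mathcal{H}[\{\beta\}] \vdash^{\alpha}_{\rho} \Delta_0, t \mathrel{\varepsilon} I_F^{<\gamma}$.
9.3.17 Corollary If $\mathcal{H} \vdash^{\alpha}_{\rho} n \mathrel{\varepsilon} I_F^{<\beta}$ for $\alpha \le \beta \le \Omega$ then $|n|_F < \alpha$.
Proof Forgetting the controlling operator we get by the Boundedness Lemma $\vdash^{\alpha}_{\rho} n \mathrel{\varepsilon} I_F^{<\alpha}$, which by Theorem 9.3.2 implies $\models^{\alpha} n \mathrel{\varepsilon} I_F^{<\alpha}$. Hence $\mathbb{N} \models n \mathrel{\varepsilon} I_F^{<\alpha}$, which by Lemma 9.2.4 means $|n|_F < \alpha$.
9.3.18 Exercise Prove claim (9.12) on p. 163. Hint: Adapt the proof of the Boundedness Lemma.
9.3.19 Exercise Prove claim (9.14) on p. 165.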
9.4 The Collapsing Theorem for ID1 Let H be a Cantorian closed operator. We define its iterations Hα . 9.4.1 Definition For X ⊆ On let Hα (X) be the least set of ordinals containing X ∪ {0, Ω } which is closed under H and the collapsing function ψH α where
ψH (α ) := min {ξ ξ ∈ / Hα (0)}. / We need a few facts about the operators Hα . Here it is comfortable to interprete Ω as the first uncountable cardinal. Interpreting Ω as ω1CK makes the following considerations much harder. We return to alternative interpretations for Ω in Sect. 9.7. First we observe |Hα (X)| = max{|X|, ω },
(9.17)
which implies
ψH (α ) < Ω
(9.18)
showing that ψH is in fact collapsing. Clearly the operators Hα are Cantorian closed Skolem-hull operators which are cumulative, i.e.,
α ≤ β ⇒ Hα ⊆ Hβ and ψH (α ) ≤ ψH (β ).
(9.19)
Since for α ∈ Hβ (0) / ∩ β implies ψH (α ) ∈ Hβ (0) / we have
α ∈ Hβ (0) / ∩ β ⇒ ψH (α ) < ψH (β ).
(9.20)
From (9.20) we get Hα (0) / ∩ Ω = ψH (α ).
(9.21)
The “⊇”-direction follows from the definition of ψH (α ) and (9.18). For the opposite inclusion, observe that ψH (α ) is closed under all operations in H – which means especially ψH (α ) is strongly critical – and show
174
9 Ordinal Analysis of the Theory for Inductive Definitions
ξ ∈ Hα (0) / ∩ Ω ⇒ ξ < ψH (α ) by induction on the definition of ξ ∈ Hα (0). / If ξ is obtained by operations in H / ∩α we get ξ < ψH (α ) immediately. In case that ξ = ψH (η ) we have η ∈ Hα (0) which by (9.20) implies ξ = ψH (η ) < ψH (α ). From (9.21) we see that all the iterations Hα are again transitive operators. Another immediate property of the iterated operators is
λ ∈ Lim ⇒ Hλ (X) =
Hξ (X).
ξ <λ
This can even be focused to
λ ∈ Lim ⇒ Hλ (X) =
{Hξ (X) ξ ∈ Hξ (X) ∩ λ }.
(9.22)
The inclusion from right to left in (9.22) follows from the equation above. For the opposite inclusion, let ξ ∈ H_λ(X) and η0 := min {η | ξ ∈ H_η(X)}. Then η0 < λ and η0 ∉ Lim. Let η0 = η + 1. Then η ∈ H_η(X), hence η0 ∈ H_{η0}(X), because otherwise we would have ξ ∈ H_{η+1}(X) = H_η(X), contradicting the minimality of η0. This ends the proof of (9.22). By definition we have
ψ_{H_γ}(0) = min {ξ | ξ ∉ H_γ(∅)} = ψ_H(γ).
This implies that
γ ∈ H_γ(X) ⇒ (H_γ)_1(X) = H_{γ+1}(X)⁷
(9.23)
holds true for all sets X of ordinals. We show
ξ ∈ Hγ +1 (X) ⇒ ξ ∈ (Hγ )1 (X)
(i)
by induction on the definition of ξ ∈ H_{γ+1}(X). If ξ ∈ X or ξ is obtained by operations in H then (i) holds immediately. So assume ξ = ψ_H(η) for some η ∈ H_{γ+1}(X) ∩ (γ + 1). Then η ∈ (H_γ)_1(X) by induction hypothesis and for η < γ we get ξ ∈ H_γ(X) ⊆ (H_γ)_1(X). If η = γ we directly obtain ξ = ψ_H(γ) = ψ_{H_γ}(0) ∈ (H_γ)_1(X). For the opposite direction
(H_γ)_1(X) ⊆ H_{γ+1}(X)
(ii)
we observe that (H_γ)_1(X) is the closure of H_γ(X) ∪ {ψ_H(γ)} under H. But H_{γ+1}(X) is closed under H, too. Moreover, we have H_γ(X) ⊆ H_{γ+1}(X) and get ψ_H(γ) ∈ H_{γ+1}(X) from the hypothesis γ ∈ H_γ(X). Hence (ii) and thus also (9.23). The next lemma shows that property (9.23) can be extended.
⁷ The “normal form condition” γ ∈ H_γ(∅) will play an important role in Sect. 9.6.
9.4.2 Lemma Let H be a Cantorian closed operator. Then α + β ∈ H_{α+β}(X) implies (H_α)_β(X) = H_{α+β}(X) for all X.
Proof We prove the lemma by induction on β. The claim holds trivially for β = 0. If β = γ + 1 we get α + γ ∈ H_{α+γ}(X), since the assumption α + γ ∉ H_{α+γ}(X) implies H_{α+γ+1}(X) = H_{α+γ}(X), hence α + γ ∉ H_{α+γ+1}(X), in contradiction to α + γ + 1 ∈ H_{α+γ+1}(X). So we have
(H_α)_γ(X) = H_{α+γ}(X)
(i)
by induction hypothesis which implies γ ∈ (H_α)_γ(X). By (9.23) and (i) we therefore obtain
(H_α)_{γ+1}(X) = ((H_α)_γ)_1(X) = (H_{α+γ})_1(X) = H_{α+γ+1}(X).
If β ∈ Lim we get by (9.22) and the induction hypothesis
H_{α+β}(X) = ⋃{H_{α+ξ}(X) | α + ξ ∈ H_{α+ξ}(X) ∩ (α + β)} = ⋃{(H_α)_ξ(X) | ξ ∈ (H_α)_ξ(X) ∩ β} = (H_α)_β(X).
The next claim is a direct corollary of Lemma 9.4.2.
9.4.3 Corollary If α + β ∈ H_{α+β}(∅) then ψ_H(α + β) = ψ_{H_α}(β).
The following collapsing property is crucial for the ordinal analysis of ID1. Already in Sect. 9.3, we mentioned that axiom (ID_1^1) is the critical axiom of ID1. Its translation
Cl_F(I_F^{<Ω}) :⇔ (∀x)[F(I_F^{<Ω}, x) → x ε I_F^{<Ω}]
looks already formally impredicative. The complexity of its premise F(I_F^{<Ω}, x) is bigger than that of its conclusion x ε I_F^{<Ω}. As pointed out in Sect. 9.3.1, Cl_F(I_F^{<Ω}) can be viewed as the defining axiom for Ω as an ordinal satisfying Cl_F(I_F^{<Ω}). Since Ω loses this property when there are gaps below Ω, we can hardly expect to obtain an operator controlled derivation of Cl_F(I_F^{<Ω}). However, the truth complexity of true L∞(NT)-sentences which do not contain conjunctions of length ≥ Ω is certainly below Ω, which means that they possess infinitary derivations of length less than Ω. Let us characterize this class of formulas by the following definition.
9.4.4 Definition We say that a sentence in the fragment L∞(NT) is in Ω–type if it does not contain sub-formulas of the shape t ε̸ I_F^{<Ω}, i.e., if it does not contain conjunctions of length ≥ Ω.
Since infinitary proofs stemming from translations of formal proofs in ID1 show a high degree of uniformity, there is some hope that sentences in Ω–type that are translations of ID1-theorems possess an operator controlled derivation which does not use axiom (ID_1^1)* and whose length can be computed to be less than Ω. That this hope can be substantiated is prepared by the following collapsing lemma. It shows that an operator controlled derivation of an Ω–type sentence depending on instances of (ID_1^1)* can be transformed into an operator controlled derivation of the same sentence which no longer uses (ID_1^1)* and has a computable length less than Ω. The cost is that we have to iterate the controlling operator. A study of the proof of the collapsing theorem shows that the instances of (ID_1^1)* are replaced by a series of cuts. So there is a faint resemblance to Gentzen’s method of resolving applications of instances of the scheme of mathematical induction by a series of cuts.
9.4.5 Lemma (Collapsing Lemma) Let Δ ⊆ Ω–type such that par(Δ) ⊆ H(∅) and
H ⊢^{β}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), Δ.
Then
H_{ω^β+1} ⊢^{ψ_H(ω^β)}_{Ω} Δ.
Proof The proof is by induction on β. The key property is
β ∈ H(∅) and ω^β < γ ⇒ ψ_H(ω^β) < ψ_H(γ)
(i)
which is obvious by (9.20) since we have ω^β ∈ H(∅) ∩ γ ⊆ H_γ(∅) ∩ γ. Other observations are
H(par(Δ)) = H(∅)
(ii)
because par(Δ) ⊆ H(∅) and
β ∈ H(∅) ⇒ ω^β ∈ H_{ω^β+1}(∅) and ψ_H(ω^β) ∈ H_{ω^β+1}(∅)
(iii)
which is clear by (9.19) and the closure properties of H_{ω^β+1}(∅). Let
Θ_k :⇔ ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω})
and assume that the critical part of the last inference
H ⊢^{β_ι}_{Ω} Θ_k, Δ_ι for ι ∈ I ⇒ H ⊢^{β}_{Ω} Θ_k, Δ
(iv)
belongs to a sentence in Δ. Observe that par(Θ_k) = {Ω}. So we only have to bother about the parameters of Δ. We claim
par(Δ_ι) ⊆ H(∅).
(v)
If I is finite then we have par(Δ_ι) ⊆ H(par(Δ)) = H(∅) because par(Δ) ⊆ H(∅). If I is infinite and the critical formula has the shape ((∀x)F_u(x))*, then I = ω and Δ_n = Δ, F_u(n). But then par(Δ_n) = par(Δ) and (v) holds. If the critical formula of the inference is t ε̸ I_F^{<ξ} then ξ < Ω because Δ ⊆ Ω–type. Then Δ_ι = Δ, G for some G ∈ CS(t ε̸ I_F^{<ξ}), which means that par(Δ_ι) ⊆ par(Δ) ∪ {η} for some η < ξ. But ξ ∈ H(∅) ∩ Ω entails ξ ⊆ H(∅) by the transitivity of H. Hence par(Δ_ι) ⊆ H(∅) for all ι ∈ I and the proof of (v) is completed. Next we claim
Δ_ι ⊆ Ω–type.
(vi)
This follows from Δ ⊆ Ω–type for inferences that are not cuts. In case that the inference in (iv) is a cut, its cut-sentence is of rank < Ω which ensures that it belongs to Ω–type, too. Because of (v) and (vi) the induction hypothesis applies to the premises of (iv) and we obtain
H_{ω^{β_ι}+1} ⊢^{ψ_H(ω^{β_ι})}_{Ω} Δ_ι.
(vii)
From β_ι ∈ H(∅) ∩ β we obtain ψ_H(ω^{β_ι}) < ψ_H(ω^β) by (i) and from β ∈ H(∅) also ψ_H(ω^β) ∈ H_{ω^β+1}(∅). Since also par(Δ_ι) ⊆ H(∅) ⊆ H_{ω^β+1}(∅) we get
H_{ω^β+1} ⊢^{ψ_H(ω^β)}_{Ω} Δ
(viii)
from (vii) by the same inference. Now assume that the critical formula of the last inference is
¬Cl_{F_i}(I_{F_i}^{<Ω}), i.e., the formula (∃x)[F_i(I_{F_i}^{<Ω}, x) ∧ x ε̸ I_{F_i}^{<Ω}].
(ix)
Then we have the premise
H ⊢^{β_0}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), F_i(I_{F_i}^{<Ω}, t) ∧ t ε̸ I_{F_i}^{<Ω}, Δ
(x)
with β_0 ∈ H(par(Δ) ∪ {Ω}) = H(∅). By inversion we obtain from (x)
H ⊢^{β_0}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), F_i(I_{F_i}^{<Ω}, t), Δ
(xi)
and
H ⊢^{β_0}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), t ε̸ I_{F_i}^{<Ω}, Δ.
(xii)
Applying the induction hypothesis to (xi) and then using boundedness gives
H_{ω^{β_0}+1} ⊢^{ψ_H(ω^{β_0})}_{Ω} F_i(I_{F_i}^{<ψ_H(ω^{β_0})}, t), Δ,
(xiii)
i.e.,
H_{ω^{β_0}+1} ⊢^{ψ_H(ω^{β_0})}_{Ω} t ε I_{F_i}^{ψ_H(ω^{β_0})}, Δ.
(xiv)
From (xii) we obtain by inversion
H ⊢^{β_0}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), t ε̸ I_{F_i}^{ψ_H(ω^{β_0})}, Δ
(xv)
which entails
H_{ω^{β_0}+1} ⊢^{β_0}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), t ε̸ I_{F_i}^{ψ_H(ω^{β_0})}, Δ.
(xvi)
Since ψ_H(ω^{β_0}) ∈ H_{ω^{β_0}+1}(∅) the induction hypothesis applies to (xvi) and we obtain
(H_{ω^{β_0}+1})_{ω^{β_0}+1} ⊢^{ψ_{H_{ω^{β_0}+1}}(ω^{β_0})}_{Ω} t ε̸ I_{F_i}^{ψ_H(ω^{β_0})}, Δ.
(xvii)
By Lemma 9.4.2 and Corollary 9.4.3, this entails
H_{ω^{β_0}+1+ω^{β_0}+1} ⊢^{ψ_H(ω^{β_0}+1+ω^{β_0})}_{Ω} t ε̸ I_{F_i}^{ψ_H(ω^{β_0})}, Δ
(xviii)
and we obtain
H_{ω^β+1} ⊢^{ψ_H(ω^β)}_{Ω} Δ
from (xiv) and (xvii) by the Structural Lemma and (cut).
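The ordinal bookkeeping behind this final cut can be made explicit. The following is only a sketch and not part of the original proof: it assumes β_0 < β, and it takes for granted that the normal form hypotheses of Lemma 9.4.2 and Corollary 9.4.3 hold for the ordinals involved.

```latex
% Sketch under the assumption \beta_0 < \beta.
\begin{align*}
1 + \omega^{\beta_0} &=
  \begin{cases}
    2 & \text{if } \beta_0 = 0,\\
    \omega^{\beta_0} & \text{if } \beta_0 > 0,
  \end{cases}\\[2pt]
\omega^{\beta_0} + 1 + \omega^{\beta_0} + 1
  &\le \omega^{\beta_0}\cdot 3 + 1
  < \omega^{\beta_0 + 1}
  \le \omega^{\beta}.
\end{align*}
```

So the index of the iterated operator in (xviii) stays below ω^β, and the operator is contained in H_{ω^β+1} by (9.19). This is the point where the additive indecomposability of ω^β enters: a bound of the form β_0·2 + 1 would not in general stay below an arbitrary β.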
9.4.6 Remark Although we will not need it for the ordinal analysis of ID1 we want to remark that the Collapsing Lemma may be strengthened to
H ⊢^{β}_{Ω+1} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), Δ ⇒ H_{ω^β+1} ⊢^{ψ_H(ω^β)}_{ψ_H(ω^β)} Δ.
For k = 0 it can be modified to
H ⊢^{β}_{Ω} Δ ⇒ H_{β+1} ⊢^{ψ_H(β)}_{ψ_H(β)} Δ.
Proof We have to do three things. First we observe that in the case of a cut of rank < Ω, we have par(F) ⊆ H(∅) ∩ Ω ⊆ H_{ω^β}(∅) ∩ Ω = ψ_H(ω^β). Since rnk(F) < max par(F) + ω, we obtain rnk(F) < ψ_H(ω^β). If the cut rank is Ω + 1 we have the additional case of a cut of rank Ω. Then the cut sentence is t ε I_F^{<Ω} and we have the premises
H ⊢^{β_0}_{Ω+1} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), Δ, t ε I_F^{<Ω}
(i)
and
H ⊢^{β_0}_{Ω+1} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), Δ, t ε̸ I_F^{<Ω}.
(ii)
But we may apply the induction hypothesis to (i) and then proceed as in the last case in the proof of the Collapsing Lemma. The resulting cut sentence is t ε I_F^{ψ_H(ω^{β_0})}, which shows that the cut sentence has rank < ψ_H(ω^β). Finally we observe that only in the case that the critical formula of the last inference is one of the formulas Cl_{F_i}(I_{F_i}^{<Ω}) we needed the fact that ω^β is additively indecomposable. If k = 0 we may therefore replace ω^β by β.
9.5 The Upper Bound
To get an upper bound for κ^{ID1} we have to strengthen Theorem 9.3.4 in the following way.
9.5.1 Theorem If ⊢^{m}_{T} Δ(x) holds true for a finite set of L(ID)-formulas then we get H ⊢^{Ω+m}_{0} Δ(n) for all tuples n of numerals and all Cantorian closed Skolem-hull operators H.
Recall that two formulas are numerically equivalent if they only differ in terms s_1, …, s_n and t_1, …, t_n such that t_i^N = s_i^N for i = 1, …, n. The key in the proof of Theorem 9.5.1 is a refinement of the Tautology Lemma.
9.5.2 Lemma (Controlled Tautology) We have H ⊢^{2·rnk(F)}_{0} Δ, ¬F′, F for numerically equivalent L∞(NT)-sentences F and F′ and all Cantorian closed Skolem-hull operators H.
Proof The proof by induction on rnk(F) is easy. First observe that 2·rnk(F) ∈ H(par(F)) for every Cantorian closed Skolem-hull operator because rnk(F) = max par(F) + n for some n < ω. Assume without loss of generality that F ∈ ⋀–type. By induction hypothesis we have
H ⊢^{2·rnk(G)}_{0} Δ, ¬F′, F, G, ¬G′
(i)
for all G ∈ CS(F) where G′ corresponds to G in the same way as F′ to F. Since par(Δ, ¬F′, F, G, ¬G′) ⊆ H(par(Δ, ¬F′, F, G)) we obtain from (i)
H ⊢^{2·rnk(G)+1}_{0} Δ, ¬F′, F, G
(ii)
for all G ∈ CS(F) by an inference (⋁). From (ii) and 2·rnk(G) + 1 < 2·(rnk(G) + 1) ≤ 2·rnk(F), however, we immediately get
H ⊢^{2·rnk(F)}_{0} Δ, ¬F′, F
by an inference (⋀).
Now we prove Theorem 9.5.1 by induction on m. If ⊢^{m}_{T} Δ, ¬F(x), F(x) holds by (AxL) we get H ⊢^{Ω+m}_{0} Δ, ¬F(n), F(n) by Lemma 9.5.2 and possibly Lemma 9.3.9.
In case of an inference (∧) we have the induction hypotheses H ⊢^{Ω+m_i}_{0} Δ(n), A_i(n) for i ∈ {1, 2}. By Lemma 9.3.9 we obtain H ⊢^{Ω+m_i}_{0} Δ(n), A_i(n), (A_1 ∧ A_2)(n) and by an inference according to (⋀) finally H ⊢^{Ω+m}_{0} Δ(n), (A_1 ∧ A_2)(n).
In the case of an inference according to (∀) we have the premise ⊢^{m_0}_{T} Δ(x), A(u) where u does not occur in Δ(x) and obtain H ⊢^{Ω+m_0}_{0} Δ(n), A_u(k) for all numerals k. By Lemma 9.3.9 and an inference (⋀) we obtain the claim. The cases of inferences according to (∨) and (∃) are treated similarly.
It is obvious that all defining axioms and also all identity axioms, but the last, are controlled derivable with a derivation depth below ω. With controlled tautology we also immediately get cut free controlled derivations of depths below Ω + ω for all translations of the last identity axiom. Reviewing the proof of the Induction Lemma (Lemma 7.3.4) shows that this proof is controlled by any Cantorian closed Skolem-hull operator. Summing up we get
H ⊢^{Ω+ω+4}_{0} G*
(9.24)
for every axiom G of NT in the language L(ID) where H may be an arbitrary Cantorian closed Skolem-hull operator.
So it remains to check the schemes (ID_1^1) and (ID_1^2). By the Collapsing Lemma (Lemma 9.4.5) we have only to deal with (ID_1^2).
9.5.3 Lemma (Generalized Induction) Let F(X, x) be an X-positive NT formula. Then
H ⊢^{2·rnk(G)+ω·(α+1)}_{0} ¬Cl_F(G), n ε̸ I_F^{α}, G(n)
holds true for any sentence G(n) in the fragment L∞(NT) and for any Cantorian closed Skolem-hull operator H.
From the Generalized Induction Lemma we obtain
H ⊢^{Ω·2+3}_{0} ¬Cl_F(G), (∀x)[x ε I_F^{<Ω} → G(x)]
(9.25)
which is the translation of the scheme (ID_1^2). The proof of Lemma 9.5.3 still needs a preparatory lemma.
9.5.4 Lemma (Monotonicity Lemma) Assume H ⊢^{α}_{ρ} Δ, ¬G(n), H(n) for all tuples n of numerals and let F(X, x) be an X-positive L(NT)-formula. Then we obtain H ⊢^{α+2·rnk(F)}_{ρ} Δ, ¬F(G, n), F(H, n) for all tuples n.
Proof Induction on rnk(F). In the case that F is the formula (x ∈ X), we obtain the claim from the hypothesis H ⊢^{α}_{ρ} Δ, ¬G(n), H(n). The remaining cases are as in the proof of the Controlled Tautology Lemma.
Proof of the Generalized Induction Lemma. If α = 0 then
H ⊢^{2·rnk(G)+ω·α+1}_{0} ¬Cl_F(G), n ε̸ I_F^{<α}, G(n)
(i)
for all n by an inference (⋀) with empty premise. If α > 0 we get (i) by induction hypothesis. From (i) we obtain
H ⊢^{2·rnk(G)+ω·α+2·rnk(F)+1}_{0} ¬Cl_F(G), n ε̸ I_F^{α}, F(G, n)
(ii)
for all n by the Monotonicity Lemma. By controlled tautology we have
H ⊢^{2·rnk(G)}_{0} ¬Cl_F(G), n ε̸ I_F^{α}, ¬G(n), G(n).
(iii)
From (ii) and (iii) we get
H ⊢^{2·rnk(G)+ω·α+2·rnk(F)+2}_{0} ¬Cl_F(G), n ε̸ I_F^{α}, F(G, n) ∧ ¬G(n), G(n)
(iv)
by an inference (⋀). From (iv) we finally obtain
H ⊢^{2·rnk(G)+ω·α+2·rnk(F)+3}_{0} ¬Cl_F(G), n ε̸ I_F^{α}, G(n)
by an inference (⋁). Since 2·rnk(G) + ω·α + 2·rnk(F) + 3 < 2·rnk(G) + ω·(α + 1) we are done.
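The closing inequality is a short piece of ordinal arithmetic. The sketch below spells it out under the assumption that rnk(F) is finite, which holds when F(X, x) is an ordinary (finite) X-positive formula; it uses only that ordinal addition is strictly monotone in its right argument.

```latex
% Assumption: rnk(F) < \omega for the finite X-positive formula F.
\begin{align*}
2\cdot\mathrm{rnk}(F) + 3 &< \omega,\\
\omega\cdot\alpha + 2\cdot\mathrm{rnk}(F) + 3
  &< \omega\cdot\alpha + \omega = \omega\cdot(\alpha+1),\\
2\cdot\mathrm{rnk}(G) + \omega\cdot\alpha + 2\cdot\mathrm{rnk}(F) + 3
  &< 2\cdot\mathrm{rnk}(G) + \omega\cdot(\alpha+1).
\end{align*}
```

Note that rnk(G) may be infinite here; it only appears as a common left summand, so its size does not affect the comparison.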
9.5.5 Theorem Assume ID1 ⊢ F(x) for a formula F(x) whose free variables all occur in the list x. Then there are finitely many instances Cl_{F_1}(I_{F_1}^{<Ω}), …, Cl_{F_k}(I_{F_k}^{<Ω}) of translations of axiom (ID_1^1) and an n < ω such that
H ⊢^{Ω·2+ω}_{Ω+n} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), F*(m)
holds true for any tuple m of the length of x and for any Cantorian closed Skolem-hull operator.
Proof If ID1 ⊢ F(x), then there are finitely many axioms A_1, …, A_r and a natural number p such that ⊢^{p}_{T} ¬A_1, …, ¬A_r, F(x). By Theorem 9.5.1 this implies
H ⊢^{Ω+p}_{0} ¬A_1*, …, ¬A_r*, F*(m)
(i)
for any Cantorian closed Skolem-hull operator H. From (i), (9.24) and (9.25) we obtain the claim by some cuts.
Let B(X) be the least set Y ⊇ X such that {0, Ω} ⊆ Y and α ∈ Y ⇔ SC(α) ⊆ Y. Then B(0) is a Cantorian closed operator such that B(∅) ∩ Ω = Γ0, which shows that B(0) is also transitive. It is the smallest possible extension of the minimal operator B0 which contains Ω (cf. (9.16) on p. 167). We construct the hierarchy B_α of Cantorian closed operators based on B and put ψ(α) := ψ_B(α). The ordinal ψ(ε_{Ω+1}) is known as the Bachmann–Howard ordinal.
9.5.6 Theorem (The Upper Bound for ID1) We have κ^{ID1} ≤ ψ(ε_{Ω+1}).
Proof If ID1 ⊢ m ε I_F we obtain by Theorem 9.5.5
B ⊢^{Ω·2+ω}_{Ω+n} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), m ε I_F^{<Ω}.
(i)
By Theorem 9.3.15, we obtain an α ∈ B_{ε_{Ω+1}}(∅) ∩ ε_{Ω+1} such that
B ⊢^{α}_{Ω} ¬Cl_{F_1}(I_{F_1}^{<Ω}), …, ¬Cl_{F_k}(I_{F_k}^{<Ω}), m ε I_F^{<Ω}.
(ii)
From (ii) and the Collapsing Lemma (Lemma 9.4.5) it follows
B_{ω^α+1} ⊢^{ψ(ω^α)}_{Ω} m ε I_F^{<Ω}
which by Corollary 9.3.17 implies |m|_F < ψ(ω^α) < ψ(ε_{Ω+1}).
9.6 The Lower Bound
9.6.1 Coding Ordinals in L(NT)
It follows from the previous sections that B_{ε_{Ω+1}}(∅) ∩ ε_{Ω+1} is the set of ordinals that is relevant in the computation of an upper bound for κ^{ID1}. To prove that ψ(ε_{Ω+1}) is the exact bound it suffices to show that ID1 proves n ε Acc^α_≺ for some arithmetically definable relation ≺ and all α < ψ(ε_{Ω+1}). We are even going to show that for each α < ψ(ε_{Ω+1}) there is a primitive recursive relation ≺ such that otyp(≺) = α and ID1 proves (∀x ε field(≺))[x ∈ Acc_≺]. Therefore, we get by (9.5) (on p. 158) also ID1(X) ⊢ TI(≺, X) which entails ψ(ε_{Ω+1}) ≤ ||ID1(X)||. Hence ||ID1(X)|| = ψ(ε_{Ω+1}).
Since we cannot talk about ordinals in L(ID) we need codes for the ordinals in B_{ε_{Ω+1}}(∅). The only parameters occurring in B_{ε_{Ω+1}}(∅) are 0 and Ω. Therefore every ordinal in B_{ε_{Ω+1}}(∅) possesses a term notation which is built up from 0, Ω by the functions +, ϕ and ψ. This term notation, however, is not unique. To show that the set of term notations together with the induced <-relation on the terms are primitive recursive, we need a unique term notation. This forces us to inspect the set B_α(∅) more closely. We define
α =NF ψ(β) :⇔ α = ψ(β) ∧ β ∈ B_β(∅).
Then we obtain
α =NF ψ (β1 ) ∧ α =NF ψ (β2 ) ⇒ β1 = β2
(9.26)
since the assumption β1 < β2 would imply β1 ∈ B_{β1}(∅) ∩ β2 ⊆ B_{β2}(∅) ∩ β2 which by (9.20) entails ψ(β1) < ψ(β2), contradicting ψ(β1) = ψ(β2). The assumption β2 < β1 leads to a similar contradiction. We define a set of ordinals T by the clauses
(T0) {0, Ω} ⊆ T
(T1) α ∉ SC ∧ SC(α) ⊆ T ⇒ α ∈ T
(T2) β ∈ T ∧ α =NF ψ(β) ⇒ α ∈ T.
Let Γ := en_SC be the enumerating function of the strongly critical ordinals. Commonly we write Γ_ξ instead of Γ(ξ). Then we obtain Γ_{Ω+1} = min {α ∈ SC | Ω < α}. We want to prove
T = B_{Γ_{Ω+1}}(∅).
(9.27)
The inclusion ⊆ in (9.27) is obvious. Troublesome is the converse inclusion. The idea is of course to prove
ξ ∈ B_{Γ_{Ω+1}}(∅) ⇒ ξ ∈ T
(9.28)
by induction on the definition of ξ ∈ B_{Γ_{Ω+1}}(∅). We will therefore redefine the sets B_α(∅) more carefully by the following clauses.
(B0) {0, Ω} ⊆ B^n_α for all α and all n < ω
(B1) ξ ∉ SC ∧ SC(ξ) ⊆ B^n_α ⇒ ξ ∈ B^{n+1}_α
(B2) η ∈ B^n_α ∩ α ⇒ ψ(η) ∈ B^{n+1}_α
(B3) B_α := ⋃_{n∈ω} B^n_α ∧ ψ(α) := min {ξ | ξ ∉ B_α}.
We first check that
B_α = B_α(∅) for all α ≤ Γ_{Ω+1}
(9.29)
which justifies the use of the same symbol ψ to denote the functions ψ_B(α) := min {ξ | ξ ∉ B_α(∅)} and ψ_B(α) := min {ξ | ξ ∉ B_α}. The proof of (9.29) is by induction on α. By definition B_α(∅) is the least set which contains {0, Ω} and is closed under +, ϕ and ψ_B↾α. The two versions of ψ_B↾α coincide by induction hypothesis and B_α contains {0, Ω} and is closed under +, ϕ and ψ_B↾α by definition. So B_α(∅) ⊆ B_α. For the opposite direction we show
ξ ∈ B^n_α ⇒ ξ ∈ B_α(∅)
by side induction on n. This is obvious for the cases (B0) and (B1) and needs the induction hypothesis, that the two versions of ψ_B↾α coincide, in case (B2). So (9.28) can be shown by proving
ξ ∈ B^n_α ⇒ ξ ∈ T
(9.30)
for all α < Γ_{Ω+1} by induction on n. What remains troublesome in pursuing this strategy is case (B2). In this case, we do not know if ψ(η) is in normal form, i.e., if η ∈ B_η. First we show that a normal form always exists.
9.6.1 Lemma For every ordinal α < Γ_{Ω+1} there exists an ordinal α_nf such that ψ(α) =NF ψ(α_nf).
Proof Put α_nf := min {ξ | α ≤ ξ ∈ B_α}. For any ordinal α we have ϕΩn (0) ∈ B_α and Γ_{Ω+1} = sup_{n∈ω} ϕΩn (0). Therefore α_nf exists. But [α, α_nf) ∩ B_α = ∅ holds true by definition and this implies B_α = B_{α_nf} and thus also ψ(α) = ψ(α_nf). Since α_nf ∈ B_α = B_{α_nf} we have ψ(α) =NF ψ(α_nf).
Our troubles are solved as soon as we can show
η ∈ Bnα ⇒ ηnf ∈ Bnα .
(9.31)
Then we may argue in case (B2) that for η ∈ B^{n−1}_α we also have η_nf ∈ B^{n−1}_α and thus η_nf ∈ T which entails ψ(η) =NF ψ(η_nf) ∈ T. We obtain (9.31) as a special case of the following lemma whose proof is admittedly tedious. Also we cannot learn much from it. Therefore one commonly includes the normal-form condition into clause (B2), which then becomes
(B2) η ∈ B^{n−1}_α ∩ α ∧ η ∈ B_η ⇒ ψ(η) ∈ B^n_α.
The proof of (9.30) then becomes trivial.⁸
⁸ The importance of the normal form condition appeared already in the proof of Lemma 9.4.2. A possibility to include the normal form condition in the definition of the iterations of Skolem-hull operators is to define ψ_H(α) := min {ξ | ξ ∈ H_ξ(∅) ∧ ξ ∉ H_α(∅)}.
9.6.2 Lemma Let δ(α) := min {ξ | α ≤ ξ ∈ B_δ}. Then α ∈ B^n_β implies δ(α) ∈ B^n_β for all α < Γ_{Ω+1}.
Proof We show the lemma by induction on n. First observe that by the minimality of δ(α) we get
α ∈ H ⇒ δ(α) ∈ H and α ∈ SC ⇒ δ(α) ∈ SC.
(i)
The lemma is trivial if α ∈ B_δ. Then δ(α) = α. Therefore we assume
α ∉ B_δ.
(ii)
Then α < δ(α) and for α < Ω we get by (9.21), δ(α) = Ω ∈ B^n_δ for any n. Therefore we may also assume
Ω ≤ α.
(iii)
We have
ξ ∉ SC ∧ ξ ∈ B^n_β ⇒ SC(ξ) ⊆ B^{n−1}_β.
(iv)
Since (Ω, Γ_{Ω+1}) ∩ SC = ∅ we obtain by induction hypothesis
δ(SC(α)) := {δ(ξ) | ξ ∈ SC(α)} ⊆ B^{n−1}_β ∩ B_δ.
(v)
We are done if we can prove
SC(δ(α)) ⊆ B^{n−1}_β ∩ B_δ.
(vi)
To prove (vi) we extend Definition (8.6) (on p. 145) of the norm of ordinals below Γ0 to all ordinals by putting N(α) := 0 for α ∈ SC. We prove the more general claim
δ(SC(γ)) ⊆ B^{n−1}_β ∩ B_δ ⇒ SC(δ(γ)) ⊆ B^{n−1}_β ∩ B_δ
(vii)
by induction on N(γ ). 1. If γ ∈ SC then γ ≤ Ω and δ (γ ) ∈ {γ , Ω }. For δ (γ ) = Ω the claim holds trivially and for δ (γ ) = γ we get SC(δ (γ )) = {γ } = δ (SC(γ )) ⊆ Bn−1 β ∩ Bδ . 2. Let γ =NF γ1 + · · · + γk and put i := min { j γ j < δ (γ j )}. If i is undefined we have δ (γ ) =NF γ1 + · · · + γk = γ and δ (SC(γ )) = δ (SC(γ1 )) ∪ · · · ∪ δ (SC(γk )) ⊆ n−1 Bn−1 β ∩ Bδ and obtain SC(δ (γ )) = SC(δ (γ1 )) ∪ · · · SC(δ (γk )) ⊆ Bβ ∩ Bδ by induction hypothesis. If i is defined we claim
δ (γ ) = γ1 + · · · + γi−1 + δ (γi ) = δ (γ1 ) + · · · + δ (γi−1 ) + δ (γi ).
(viii)
From (viii) we obtain (vii) by induction hypothesis. Let η := γ1 + · · · + γi−1 . Then γ < η + δ (γi ) and thus δ (γ ) ≤ η + δ (γi ). Assuming δ (γ ) < η + δ (γi ) we get η + γi ≤ γ ≤ δ (γ ) = η + ε < η + δ (γi ) for some ε ∈ Bδ . Hence γi ≤ ε < δ (γi ) for ε ∈ Bδ which contradicts the definition of δ (γi ). 3. Next assume γ =NF ϕγ1 (γ2 ). Then γ = ϕγ1 (γ2 ) ≤ δ (γ ) = ϕξ (η ) ≤ ϕδ (γ1 ) (δ (γ2 )) for some ξ , η ∈ Bδ . If γ1 = δ (γ1 ) we get δ (γ2 ) = η by minimality of δ (γ2 ) and thus δ (γ ) = ϕδ (γ1 ) (δ (γ2 )) and (vii) follows from the induction hypothesis. If γ1 <
δ (γ1 ) and γ ≤ δ (γ2 ) we obtain δ (γ ) ≤ δ (γ2 ) ≤ δ (γ ) and (vii) follows by induction hypothesis. So assume γ1 < δ (γ1 ) and δ (γ2 ) < γ . Let γ3 := min {ζ γ ≤ ϕδ (γ1 ) (ζ )}.
(ix)
We claim
γ3 ∈ Bn−1 β ∩ Bδ .
(x)
From (x) we get δ (γ ) ≤ ϕδ (γ1 ) (γ3 ). If we assume δ (γ ) = ϕξ η < ϕδ (γ1 ) (γ3 ) we have γ = ϕγ1 (γ2 ) < ϕδ (γ1 ) (γ3 ). The assumption ξ = δ (γ1 ) yields γ ≤ δ (γ ) = ϕδ (γ1 ) (η ) < ϕδ (γ1 ) (γ3 ) and thus η < γ3 , contradicting the minimality of γ3 . Assuming δ (γ1 ) < ξ yields δ (γ ) < γ3 and γ ≤ δ (γ ) = ϕδ (γ1 ) (δ (γ )), again contradicting the minimality of γ3 . So it remains ξ < δ (γ1 ). But, since ξ ∈ Bδ , this implies ξ < γ1 which in turn entails γ ≤ η ∈ Bδ ∩ δ (γ ) contradicting the definition of δ (γ ). Therefore we have
δ (γ ) = ϕδ (γ1 ) (γ3 )
(xi)
and obtain (vi) from (xi) by induction hypothesis and (x). It remains to prove (x). We are done if γ3 = 0. If we assume γ3 ∈ Lim we get γ =NF ϕ_{γ1}(γ2) = ϕ_{δ(γ1)}(γ3) by the continuity of ϕ_{δ(γ1)}. Since γ1 < δ(γ1) we obtain γ2 = γ, contradicting γ =NF ϕ_{γ1}(γ2). It remains the case that γ3 = µ + 1. Then ϕ_{δ(γ1)}(µ) < γ =NF ϕ_{γ1}(γ2) ≤ ϕ_{δ(γ1)}(µ + 1). Because of γ1 < δ(γ1) this implies ϕ_{δ(γ1)}(µ) < γ2 ≤ δ(γ2) < γ = ϕ_{δ(γ1)}(µ + 1). Since δ(γ2) ∈ B^{n−1}_β ∩ B_δ we have shown
B_δ ∩ B^{n−1}_β ∩ (ϕ_{δ(γ1)}(µ), ϕ_{δ(γ1)}(µ + 1)) ≠ ∅.
(xii)
To finish the proof we show that in general, we have
B^n_β ∩ (ϕ_ζ(ν), ϕ_ζ(ν + 1)) ≠ ∅ ⇒ ν + 1 ∈ B^n_β.
(9.32)
From (xii) and (9.32) we then obtain γ3 ∈ B_δ ∩ B^{n−1}_β, i.e., (x). To prove (9.32) we first show
ρ ∈ [ϕ_ζ(ν), ϕ_ζ(ν + 1)) ⇒ SC(ν) ⊆ SC(ρ)
(xiii)
by induction on N(ρ). If ρ =NF ρ1 + · · · + ρk we have ρ1 ∈ [ϕ_ζ(ν), ϕ_ζ(ν + 1)) and obtain SC(ν) ⊆ SC(ρ1) ⊆ SC(ρ). Let ρ =NF ϕ_{ρ1}(ρ2). If ζ < ρ1 then ν ≤ ρ < ν + 1 and thus ρ = ν. If ζ = ρ1 then ν = ρ2 and SC(ν) = SC(ρ2) ⊆ SC(ρ). If ρ1 < ζ then ϕ_ζ(ν) ≤ ρ2 < ρ < ϕ_ζ(ν + 1) and we obtain SC(ν) ⊆ SC(ρ2) ⊆ SC(ρ) by induction hypothesis. If finally ρ ∈ SC then ϕ_ρ(0) = ρ = ϕ_ζ(ν) and thus ρ = ν or ρ = ζ and ν = 0 and the claim in both cases is obvious.
We prove (9.32) by induction on n. Let σ ∈ B^n_β ∩ (ϕ_ζ(ν), ϕ_ζ(ν + 1)). Then σ ∉ SC and we have SC(σ) ⊆ B^{n−1}_β. By (xiii) we get SC(ν) ⊆ SC(σ) ⊆ B^{n−1}_β. Since 0 ∈ B^{n−1}_β we also have SC(ν + 1) ⊆ B^{n−1}_β and thus obtain ν + 1 ∈ B^n_β.
Now we have all the material to prove (9.27).
9.6.3 Theorem We have B_{Γ_{Ω+1}}(∅) = B_{Γ_{Ω+1}} = T.
Proof We easily obtain α ∈ T ⇒ α ∈ B_{Γ_{Ω+1}}(∅) by induction on the definition of α ∈ T. For the opposite inclusion we use (9.29) (on p. 183) and prove
ξ ∈ B^n_α ⇒ ξ ∈ T,
i.e., (9.30), by induction on n. If ξ ∈ B^n_α by (B0) we obtain ξ ∈ T by (T0). If ξ ∈ B^{n+1}_α by (B1) we obtain ξ ∈ T by the induction hypothesis and (T1). So assume that ξ ∈ B^{n+1}_α by (B2). Then ξ = ψ(η) such that η ∈ B^n_α ∩ α. By Lemma 9.6.1 we obtain ψ(η) =NF ψ(η(η)) and by Lemma 9.6.2 η(η) ∈ B^n_α. So η(η) ∈ T by induction hypothesis and it follows ξ = ψ(η) =NF ψ(η(η)) ∈ T by (T2).
Having shown Theorem 9.6.3 we want to develop a primitive recursive notation system for the ordinals in T. But still annoying is the normal-form condition in clause (T2). To define a set On of notions for ordinals in T together with a
ξ ∈ Bβ ⇔ ξ = 0 ∨ ξ = Ω ∨ (ξ ∈ / SC ∧ SC(ξ ) ⊆ Bβ ) ∨ (ξ = ψ (η ) ∧ η ∈ Bβ ∩ β ).
(9.33)
From (9.33) we read off the following definition.
9.6.4 Definition Let
K(ξ) := ∅ if ξ = 0 or ξ = Ω,
K(ξ) := ⋃{K(η) | η ∈ SC(ξ)} if ξ ∉ SC,
K(ξ) := {η} ∪ K(η) if ξ = ψ(η).
From (9.33) and Definition 9.6.4 we immediately get
9.6.5 Lemma We have ξ ∈ B_β iff K(ξ) ⊆ β.
9.6.6 Corollary We have α =NF ψ(β) iff α = ψ(β) and K(β) ⊆ β.
Another obstacle for a notation system that assigns a uniquely determined term notation to the ordinals in T are the fixed points of the Veblen functions ϕ. This was tolerable in the development of the notation system for the ordinals below Γ0. The simultaneous definition of the relations ≺ and ≡ by course-of-values recursion was not too difficult. In developing a notation system for the ordinals in T it would, however, cause unnecessary complications. In Exercise 3.4.20 we introduced the fixed-point free versions ϕ̄ of the Veblen functions. For these functions we always have ϕ̄_{ξ1}(η1) = ϕ̄_{ξ2}(η2) ⇔ ξ1 = ξ2 ∧ η1 = η2. Therefore we are going to use ϕ̄ instead of the functions ϕ in the following sections.
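To see how Definition 9.6.4, Lemma 9.6.5 and Corollary 9.6.6 work together, here is a small worked computation. It is an illustrative sketch rather than part of the original text, and it uses only the clauses of the definition together with ψ(Ω) < Ω from (9.18).

```latex
% Values of K on a few simple ordinals.
\begin{align*}
K(0) &= K(\Omega) = \emptyset,\\
K(\psi(0)) &= \{0\} \cup K(0) = \{0\},\\
K(\psi(\Omega)) &= \{\Omega\} \cup K(\Omega) = \{\Omega\}.
\end{align*}
% Consequences via Lemma 9.6.5 and Corollary 9.6.6.
\begin{align*}
\psi(\Omega) \in B_\beta
  &\iff \{\Omega\} \subseteq \beta \iff \Omega < \beta,\\
K(\Omega) = \emptyset \subseteq \Omega
  &\implies \psi(\Omega) \text{ is in normal form},\\
K(\psi(\Omega)) = \{\Omega\} \not\subseteq \psi(\Omega)
  &\implies \psi(\psi(\Omega)) \text{ is not in normal form}.
\end{align*}
```

The last line shows why the normal form condition matters: the term ψ(ψ(Ω)) denotes an ordinal, but it is not the term that the notation system will use for it.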
9.6.7 Definition We use the known facts about ordinals in T to define sets SC ⊆ H ⊆ On ⊆ N of ordinal notations together with a finite set K(a) ⊆ On of sub-terms of a ∈ On, a relation ≺ ⊆ On × On and an evaluation function | |_O : On → T by the following clauses where we use a ⪯ b to denote a ≺ b ∨ a = b.
Definition of SC, H and On.
• Let ⟨0⟩ ∈ On, ⟨1⟩ ∈ SC, |⟨0⟩|_O := 0 and |⟨1⟩|_O := ω1
• If n > 1, a1, …, an ∈ H and a1 ⪰ · · · ⪰ an then we put ⟨1, a1, …, an⟩ ∈ On and define |⟨1, a1, …, an⟩|_O := |a1|_O + · · · + |an|_O
• If a, b ∈ On then ⟨2, a, b⟩ ∈ H and |⟨2, a, b⟩|_O = ϕ̄_{|a|_O}(|b|_O)
• If a ∈ On and b ≺ a for all b ∈ K(a) then ⟨3, a⟩ ∈ SC and |⟨3, a⟩|_O := ψ(|a|_O)
Definition of K(a).
• K(⟨0⟩) = K(⟨1⟩) = ∅
• K(⟨1, a1, …, an⟩) = K(a1) ∪ · · · ∪ K(an)
• K(⟨2, a, b⟩) = K(a) ∪ K(b)
• K(⟨3, a⟩) = {a} ∪ K(a)
Let a, b ∈ On. Then a ≺ b iff one of the following conditions is satisfied.
• a = ⟨0⟩ and b ≠ ⟨0⟩
• a = ⟨1, a1, …, am⟩, b = ⟨1, b1, …, bn⟩ and (∃i < m)(∀j ≤ i)[aj = bj ∧ a_{i+1} ≺ b_{i+1}] or m < n ∧ (∀j ≤ m)[aj = bj]
• a = ⟨1, a1, …, an⟩, b ∈ H and a1 ≺ b
• a ∈ H, b = ⟨1, b1, …, bn⟩ and a ⪯ b1
• a = ⟨2, a1, a2⟩, b = ⟨2, b1, b2⟩ and one of the following conditions is satisfied
  a1 ≺ b1 and a2 ≺ b
  a1 = b1 and a2 ≺ b2
  b1 ≺ a1 and a ⪯ b2
• a = ⟨2, a1, a2⟩, b ∈ SC and a1, a2 ≺ b
• a ∈ SC, b = ⟨2, b1, b2⟩ and a ⪯ b1 or a ⪯ b2
• a = ⟨3, a1⟩, b = ⟨3, b1⟩ and a1 ≺ b1
• a = ⟨3, a1⟩ and b = ⟨1⟩
Collecting all the known facts about T and observing that On, SC, H, K(a) and ≺ are defined by simultaneous course-of-values recursion, we have the following theorem.
9.6.8 Theorem The sets On, H and SC as well as the relations K and ≺ are primitive recursive. The map | |_O : On → T is one-one and onto such that a ≺ b iff |a|_O < |b|_O.
Proof Since we defined On, H and SC as well as the relations K and ≺ by simultaneous course-of-values recursion, all these sets and relations are primitive recursive. The fact that |a|_O ∈ T follows immediately from the definition of the map | |_O. It remains to show that | |_O is onto. We prove: “For every ordinal α ∈ T there is a notation ⌜α⌝ ∈ On such that
(a) α = |⌜α⌝|_O,
(b) α ∈ H ⇔ ⌜α⌝ ∈ H,
(c) α ∈ SC ⇔ ⌜α⌝ ∈ SC,
(d) K(⌜α⌝) = {⌜β⌝ | β ∈ K(α)},
(e) β < α ⇔ ⌜β⌝ ≺ ⌜α⌝ and
(f) α = β ⇔ ⌜α⌝ = ⌜β⌝”
by induction on the definition of α ∈ T. We put ⌜0⌝ := ⟨0⟩ and ⌜Ω⌝ := ⟨1⟩. If α =NF ϕ̄_{α1}(β1) + · · · + ϕ̄_{αn}(βn) for n > 1, we put ⌜α⌝ := ⟨1, ⟨2, ⌜α1⌝, ⌜β1⌝⟩, …, ⟨2, ⌜αn⌝, ⌜βn⌝⟩⟩. Then (a) follows directly from the induction hypothesis and the definition of | |_O. Claims (b) and (c) hold trivially and (d) follows from the induction hypothesis by comparing Definitions 9.6.4 and 9.6.7. If α = ϕ̄_{α1}(α2) then α ∉ SC and we define ⌜α⌝ := ⟨2, ⌜α1⌝, ⌜α2⌝⟩ and obtain claim (a) again from the induction hypothesis and the definition of | |_O. Claims (b) and (c) follow directly and claim (d) from the induction hypothesis and a comparison of Definitions 9.6.4 and 9.6.7. If α =NF ψ(α0) we have K(α0) ⊆ α0 by Corollary 9.6.6. Then we obtain K(⌜α0⌝) = {⌜β⌝ | β ∈ K(α0)} and thus b ≺ ⌜α0⌝ for all b ∈ K(⌜α0⌝). Hence ⌜α⌝ := ⟨3, ⌜α0⌝⟩ ∈ On and α = |⌜α⌝|_O since α0 = |⌜α0⌝|_O by the induction hypothesis. Claims (b) and (c) again follow directly and claim (d) from the induction hypothesis by comparing Definitions 9.6.4 and 9.6.7. Claims (e) and (f) follow from the induction hypothesis, Observation 3.3.9, Exercise 3.4.20 and the fact that, for α =NF ψ(α0) and β =NF ψ(β0), we have α < β iff α0 < β0.
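A few concrete codes may help to read Definition 9.6.7. The following sketch assumes that the notations are sequence codes written with angle brackets ⟨…⟩ (an assumption about the notation, introduced here for illustration) and applies only the clauses of the definition as stated above.

```latex
% Assumed notation: \langle \dots \rangle denotes the sequence codes of Definition 9.6.7.
\begin{align*}
\langle 0\rangle \prec \langle 1\rangle
  &\implies \langle 3,\langle 0\rangle\rangle \prec \langle 3,\langle 1\rangle\rangle
  && \text{(matching } \psi(0) < \psi(\Omega)\text{)},\\
K(\langle 1\rangle) = \emptyset
  &\implies \langle 3,\langle 1\rangle\rangle \in SC
  && \text{(with } |\langle 3,\langle 1\rangle\rangle|_O = \psi(\Omega)\text{)},\\
K(\langle 3,\langle 1\rangle\rangle) = \{\langle 1\rangle\},\quad
  \langle 1\rangle \not\prec \langle 3,\langle 1\rangle\rangle
  &\implies \langle 3,\langle 3,\langle 1\rangle\rangle\rangle \notin On .
\end{align*}
```

The third line is the coded form of the normal form condition: no clause of the definition of ≺ places ⟨1⟩ below ⟨3, ⟨1⟩⟩ (on the contrary, the last clause gives ⟨3, ⟨1⟩⟩ ≺ ⟨1⟩), so the side condition for forming ⟨3, ⟨3, ⟨1⟩⟩⟩ fails and the term ψ(ψ(Ω)) receives no code of this shape.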
9.6.9 Corollary ψ(Γ_{Ω+1}) < ω1^CK.
Proof We have ψ(Γ_{Ω+1}) = otyp(≺↾Ω). Since ≺ is primitive recursive we get ψ(Γ_{Ω+1}) < ω1^CK.
9.6.2 The Well-Ordering Proof
In view of Theorem 9.6.8 we may talk about the ordinals in B_{Γ_{Ω+1}}(∅) in L(NT) and L(ID). For the sake of better readability we will, however, not use the codes but identify ordinals in B_{Γ_{Ω+1}}(∅) and their codes. We denote (codes of) ordinals by lower case Greek letters and write α < β instead of α ≺ β. We use the abbreviations and conventions introduced in Sects. 7.4 and 8.5. The aim of this section is to show that there is a primitive recursive relation <0 such that for every α < ψ(ε_{Ω+1}) we get ID1 ⊢ n ε Acc_{<0} and |n|_{Acc_{<0}} = α. The strategy of the proof will be the following.
• We first define a relation <1 for which TI1(Ω, X) holds trivially. The relation <1 is no longer arithmetically definable but needs a fixed point in its definition. Then we use the well-ordering proof of Sect. 7.4 to obtain TI1(α, X) provable in ID1(X) for all α <1 ε_{Ω+1}.
• By a condensing argument we then show that TI1(α, X) implies ψ(α) ε Acc_{<0}.
9.6.10 Definition For ordinals α, β we define
• α <0 β :⇔ α < β < Ω.
Let F(u) be an L(ID) formula. By ξ ⊆0 F we denote the formula (∀η <0 ξ)[F_u(η)]. Let Acc be the fixed point of the operator induced by the formula (∀η < ξ)[ξ < Ω ∧ η ε X], i.e., Acc = Acc_{<0} ∩ Ω. Let M := {α | SC(α) ∩ Ω ⊆ Acc}. For α, β ∈ On we define
• α <1 β :⇔ α < β ∧ α ε M ∧ β ε M.
ξ ⊆1 F stands for (∀η <1 ξ )[Fu (η )]. Let • Progi (F) :⇔ (∀ξ ε field(
Observe that by the axioms of ID1 and (9.3) (on page 158) we have
ID1 ⊢ α ε Acc ↔ (α < Ω ∧ α ⊆ Acc)
(9.34)
ID1 ⊢ Prog0(Acc)
(9.35)
ID1 ⊢ Prog0(F) → (∀ξ)[ξ ε Acc → F(ξ)]
(9.36)
9.6.11 Lemma Let Prog(F) abbreviate the formula (∀α)[(∀ξ < α)F(ξ) → F(α)]. Then ID1 ⊢ Prog(F) → Prog0(F) and thus also ID1 ⊢ Prog(F) → (∀ξ ε Acc)F(ξ).
Proof (∀ξ <0 α)F(ξ) implies (∀ξ < α)F(ξ) for α < Ω. Together with Prog(F) we therefore get F(α), i.e., we have Prog0(F). Combining this with (9.36) we obtain the second claim.
9.6.12 Lemma (ID1) The class Acc is closed under ordinal addition.
Proof
Let Acc+ := {ξ | (∀η ε Acc)[η + ξ ε Acc]}. We claim
Prog0 (Acc+ ).
(i)
To prove (i) we have the hypothesis
α < Ω and (∀ξ < α )[ξ ε Acc+ ]
(ii)
and have to show α ε Acc+ , i.e., (∀η ε Acc)[η + α ε Acc].
(iii)
By (9.34) it suffices to have
η + α ⊆ Acc
(iv)
to get (iii). Let ξ < η + α . If ξ < η then we get ξ ε Acc from η ε Acc by (9.34). If η ≤ ξ < η + α , there is a ρ < α such that ξ = η + ρ . Then we obtain η + ρ ε Acc by (ii). From (i) we obtain (∀ξ ε Acc)[ξ ε Acc+ ]
(v)
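by (9.36) which means $(\forall \xi \mathrel{\varepsilon} Acc)(\forall \eta \mathrel{\varepsilon} Acc)[\xi + \eta \mathrel{\varepsilon} Acc]$.

9.6.13 Lemma $ID_1 \vdash Prog_1(F) \rightarrow Prog_0(F)$.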
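Proof We have the premises $Prog_1(F)$, $\alpha < \Omega$ and $(\forall \xi <_0 \alpha)F(\xi)$ and have to show $F(\alpha)$. If $\xi <_1 \alpha$ we get $\xi <_0 \alpha$ by $\alpha < \Omega$ and thus $F(\xi)$ by $(\forall \xi <_0 \alpha)F(\xi)$. Hence $(\forall \xi <_1 \alpha)F(\xi)$, which entails $F(\alpha)$ by $Prog_1(F)$.

9.6.14 Lemma ($ID_1$) The class Acc is closed under $\lambda \xi, \eta.\, \bar{\varphi}_\xi(\eta)$.

Define

$Acc_\varphi := \{\alpha \mid (\forall \xi \mathrel{\varepsilon} Acc)[\bar{\varphi}_\alpha(\xi) \mathrel{\varepsilon} Acc] \vee \alpha \notin M \vee \Omega \le \alpha\}$.   (i)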
We claim Prog1 (Accϕ ).
(ii)
To prove (ii) we have the hypothesis (∀ξ <1 α )[ξ ε Accϕ ]
(iii)
and have to show
α ε Accϕ .
(iv)
For $\alpha \notin M$ or $\Omega \le \alpha$ (iv) is obvious. Therefore assume

$\alpha \mathrel{\varepsilon} M \cap \Omega$.   (v)
We have to show (∀ξ ε Acc)[ϕ¯ α (ξ ) ε Acc].
(vi)
According to Lemma 9.6.11 we may assume that we have (∀η < ξ )[η ε Acc → ϕ¯ α (η ) ε Acc]
(vii)
and have to show
ξ ε Acc → ϕ¯ α (ξ ) ε Acc
(viii)
for which by (9.34) it suffices to prove
ρ < ϕ¯ α (ξ ) → ρ ε Acc.
(ix)
We show (ix) by Mathematical Induction on the length of the term notation of ρ . By (9.34) 0 ε Acc holds trivially. If ρ =NF ρ1 + · · · + ρn we have ρi ε Acc by induction hypothesis and obtain ρ ε Acc by Lemma 9.6.12. If ρ ∈ SC then we have ρ ≤ α or ρ ≤ ξ . If ρ ≤ ξ we get ρ ε Acc from ξ ε Acc. If ρ ≤ α we have ρ ≤ µ for some µ ∈ SC(α ). Since α ε M we have µ ε Acc and thence also ρ ε Acc. Now assume ρ ∈ H \ SC. Then ρ = ϕ¯ ρ1 (ρ2 ). There are the following cases. 1. ρ1 = α and ρ2 < ξ . Then we obtain ϕ¯ ρ1 (ρ2 ) ε Acc by (vii). 2. α < ρ1 and ρ ≤ ξ . Then ρ ε Acc follows from ξ ε Acc. 3. ρ1 < α and ρ2 < ϕ¯ α (ξ ). Then SC(ρ1 ) ∩ Ω is majorized by some µ ∈ SC(α ) ∩ Ω ⊆ Acc which means SC(ρ1 ) ∩ Ω ⊆ Acc and therefore ρ1 <1 α . By
(iii) we obtain ρ1 ε Accϕ. By induction hypothesis we have ρ2 ε Acc, which entails ϕ¯ρ1(ρ2) ε Acc. This finishes the proof of (ii). To prove the lemma we have to show
α , β ε Acc ⇒ ϕ¯ α (β ) ε Acc.
(x)
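From $\alpha, \beta \mathrel{\varepsilon} Acc$ we get $\alpha, \beta < \Omega$. Since $SC(\alpha) \subseteq \alpha + 1$ this implies $SC(\alpha) \cap \Omega \subseteq Acc$. Hence $\alpha \mathrel{\varepsilon} M \cap \Omega$. From (ii) and Lemma 9.6.13 we obtain $Prog_0(Acc_\varphi)$ and thence $Acc \subseteq Acc_\varphi$ by (9.36). Together with $\beta \mathrel{\varepsilon} Acc$ this implies $\bar{\varphi}_\alpha(\beta) \mathrel{\varepsilon} Acc$.

Put $Acc_\Omega := \{\alpha \mid \alpha \notin M \vee (\exists \xi \mathrel{\varepsilon} K(\alpha))[\alpha \le \xi] \vee \psi(\alpha) \mathrel{\varepsilon} Acc\}$.

9.6.15 Lemma $ID_1 \vdash Prog_1(Acc_\Omega)$.

Proof Assume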
α ε field(<1 ) and (∀η <1 α )[η ε AccΩ ].
(i)
We have to show
α ε AccΩ .
(ii)
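For $\alpha \notin M$ or $(\exists \xi \mathrel{\varepsilon} K(\alpha))[\alpha \le \xi]$ (ii) is obvious. Therefore assume $\alpha \mathrel{\varepsilon} M$ and $K(\alpha) \subseteq \alpha$. To prove (ii) it remains to show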
ψ (α ) ε Acc.
(iii)
For (iii) in turn it suffices to have
ρ < ψ (α ) → ρ ε Acc.
(iv)
We prove (iv) by Mathematical Induction on the length of the term notation of ρ . If ρ∈ / SC we get SC(ρ ) ⊆ Acc by induction hypothesis and thence ρ ε Acc by Lemma 9.6.12 and Lemma 9.6.14. If ρ ∈ SC then there is a ρ0 such that ρ =NF ψ (ρ0 ) and ρ0 < α which implies K(ρ0 ) ⊆ ρ0 < α . For ξ ∈ SC(ρ0 ) ∩ Ω we have ξ =NF ψ (η ) for some η . Then η ε K(ξ ) ⊆ K(ρ0 ) ⊆ α which implies ξ = ψ (η ) < ψ (α ). Hence SC(ρ0 ) ∩ Ω ⊆ ψ (α ). By induction hypothesis we therefore obtain SC(ρ0 ) ∩ Ω ⊆ Acc. Hence ρ0 <1 α and therefore ρ0 ε AccΩ by (i). Since K(ρ0 ) ⊆ ρ0 and we just showed ρ0 ε M, this implies ρ = ψ (ρ0 ) ε Acc. 9.6.16 Lemma (Condensation Lemma) Let α be an ordinal in BΓΩ +1 such that K(α ) ⊆ α , α ε M and ID1 TI1 (α , F). Then ID1 ψ (α ) ε Acc. Proof
We especially have

$ID_1 \vdash TI_1(\alpha, Acc_\Omega)$.   (i)
From (i) and Lemma 9.6.15 we obtain (∀ξ <1 α )[ξ ε AccΩ ]
(ii)
and from (ii) and Lemma 9.6.15
α ε AccΩ .
(iii)
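But (iii) together with the other hypotheses yields $\psi(\alpha) \mathrel{\varepsilon} Acc$.

9.6.17 Lemma $ID_1 \vdash TI_1(\Omega + 1, F) \wedge K(\Omega + 1) \subseteq \Omega + 1 \wedge \Omega + 1 \mathrel{\varepsilon} M$.

Proof Since $SC(\Omega + 1) \cap \Omega = \emptyset$ and $K(\Omega + 1) = \emptyset$ we trivially have $K(\Omega + 1) \subseteq \Omega + 1 \wedge \Omega + 1 \mathrel{\varepsilon} M$. Assuming $Prog_1(F)$ we have to show $(\forall \xi <_1 \Omega + 1)[F(\xi)]$. If $\xi <_1 \Omega$ we obtain $SC(\xi) \subseteq Acc$ and thus $\xi \mathrel{\varepsilon} Acc$ by Lemma 9.6.12 and Lemma 9.6.14. By Lemma 9.6.13 we get $Prog_0(F)$ which then by (9.36) entails $F(\xi)$. So we have $(\forall \xi <_1 \Omega)[F(\xi)]$ which by $Prog_1(F)$ also implies $F(\Omega)$.

9.6.18 Lemma Assume

$ID_1 \vdash TI_1(\alpha, F) \wedge K(\alpha) \subseteq \alpha \wedge \alpha \mathrel{\varepsilon} M$.

Then also

$ID_1 \vdash TI_1(\omega^\alpha, F) \wedge K(\omega^\alpha) \subseteq \omega^\alpha \wedge \omega^\alpha \mathrel{\varepsilon} M$

holds true.

Proof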
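We show

$ID_1 \vdash TI_1(\alpha, F) \;\Rightarrow\; ID_1 \vdash TI_1(\omega^\alpha, F)$   (i)

analogously to the proof of (7.7). Put

$JF(\alpha) :\Leftrightarrow \alpha \mathrel{\varepsilon} M \wedge (\forall \eta \mathrel{\varepsilon} M)[\eta \subseteq_1 F \rightarrow \eta + \omega^\alpha \subseteq_1 F]$.   (ii)

First we have to show

$ID_1 \vdash Prog_1(F) \rightarrow Prog_1(JF)$.   (iii)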
To prove (iii) we have the premises Prog1 (F)
(iv)
α ε M ∧ α ⊆1 J F
(v)
η ε M ∧ η ⊆1 F
(vi)
ξ <1 η + ω α
(vii)
and have to show F(ξ ).
(viii)
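If $\xi < \eta$ this follows from (vi). If $\xi = \eta$ this follows from (vi) together with (iv). So assume $\eta < \xi =_{NF} \eta + \omega^{\xi_1} + \cdots + \omega^{\xi_k} < \eta + \omega^\alpha$. From $\xi \mathrel{\varepsilon} M$ we also obtain $\xi_i \mathrel{\varepsilon} M$ and $\xi < \eta + \omega^{\xi_1} \cdot (k + 1)$. Therefore it suffices to prove

$\eta + \omega^{\xi_1} \cdot u \subseteq_1 F$   (ix)

by mathematical induction on $u$. For $u = 0$ this is (vi). From $\xi_1 <_1 \alpha$ we obtain $JF(\xi_1)$ by (v). Together with the induction hypothesis $\eta + \omega^{\xi_1} \cdot u \subseteq_1 F$ this yields $\eta + \omega^{\xi_1} \cdot (u + 1) \subseteq_1 F$. So we have (ix) and thus also (iii). Now assume

$ID_1 \vdash TI_1(\alpha, F)$   (x)

for all $L(ID)$-formulas $F$. This embodies especially

$ID_1 \vdash TI_1(\alpha, JF)$,   (xi)

which means

$ID_1 \vdash \alpha \mathrel{\varepsilon} M \wedge (Prog_1(JF) \rightarrow \alpha \subseteq_1 JF)$.   (xii)

By (xii) we get $\alpha \mathrel{\varepsilon} JF$. But this implies $\omega^\alpha \subseteq_1 F$ and we obtain by (iii)

$ID_1 \vdash \omega^\alpha \mathrel{\varepsilon} M \wedge (Prog_1(F) \rightarrow \omega^\alpha \subseteq_1 F)$,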
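i.e., $TI_1(\omega^\alpha, F)$. Because of $SC(\omega^\alpha) \cap \Omega = SC(\alpha) \cap \Omega$ and $K(\omega^\alpha) = K(\alpha)$ the remaining claims follow trivially.

9.6.19 Theorem (The lower bound for $ID_1$) For every ordinal $\alpha < \psi(\varepsilon_{\Omega+1})$ there is a primitive recursive ordering $\prec$ such that $ID_1 \vdash n \mathrel{\varepsilon} Acc_\prec$ and $\alpha \le |n|_{Acc_\prec}$. Hence $\psi(\varepsilon_{\Omega+1}) \le \kappa^{ID_1}$.

Proof We have outlined in Theorem 9.6.8 that $<$ is primitive recursive. Defining a sequence $\zeta_0 = \Omega + 1$ and $\zeta_{n+1} = \omega^{\zeta_n}$ we obtain by Lemma 9.6.17 and Lemma 9.6.18

$ID_1 \vdash TI_1(\zeta_n, F) \wedge K(\zeta_n) \subseteq \zeta_n \wedge \zeta_n \mathrel{\varepsilon} M$

for all $n$. By the Condensation Lemma (Lemma 9.6.16), this implies $\psi(\zeta_n) \mathrel{\varepsilon} Acc = Acc_{<_0} \cap \Omega$ for all $n$, which, according to our convention, actually means $\ulcorner\psi(\zeta_n)\urcorner \mathrel{\varepsilon} Acc$ for all $n$. From Lemma 6.5.2 we obtain $|\ulcorner\psi(\zeta_n)\urcorner|_{Acc_{<_0}} = otyp_{<_0}(\ulcorner\psi(\zeta_n)\urcorner) = |\ulcorner\psi(\zeta_n)\urcorner|_O = \psi(\zeta_n)$ for all $n$. Hence $|\psi(\zeta_n)|_{Acc_{<_0}} = \psi(\zeta_n)$ for all $n$ and the claim follows because $\sup_n \psi(\zeta_n) = \psi(\varepsilon_{\Omega+1})$.

9.6.20 Corollary (Ordinal analysis of $ID_1$) $\kappa^{ID_1} = \psi(\varepsilon_{\Omega+1})$ and $\|ID_1(X)\| = \|ID_1(X)\|_{\Pi^1_1} = \kappa^{ID_1(X)} = \psi(\varepsilon_{\Omega+1})$.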
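The way the lemmas of this section combine in the proof of Theorem 9.6.19 can be laid out as a short chain. This is only a summary sketch (it needs the amsmath package). The equality $\sup_n \zeta_n = \varepsilon_{\Omega+1}$ is the usual description of the first epsilon number above $\Omega$ and is stated here as an assumption; the text itself only uses its consequence $\sup_n \psi(\zeta_n) = \psi(\varepsilon_{\Omega+1})$.

```latex
\begin{align*}
&\zeta_0 = \Omega + 1, \quad \zeta_{n+1} = \omega^{\zeta_n},
  \quad \sup_{n<\omega} \zeta_n = \varepsilon_{\Omega+1}
  && \text{(assumed standard fact)} \\
&ID_1 \vdash TI_1(\zeta_0, F) \wedge K(\zeta_0) \subseteq \zeta_0
  \wedge \zeta_0 \mathrel{\varepsilon} M
  && \text{(Lemma 9.6.17)} \\
&ID_1 \vdash TI_1(\zeta_n, F) \wedge K(\zeta_n) \subseteq \zeta_n
  \wedge \zeta_n \mathrel{\varepsilon} M
  \;\Rightarrow\; \text{the same for } \zeta_{n+1}
  && \text{(Lemma 9.6.18)} \\
&ID_1 \vdash \psi(\zeta_n) \mathrel{\varepsilon} Acc
  \quad \text{for all } n
  && \text{(Lemma 9.6.16)} \\
&\psi(\varepsilon_{\Omega+1}) = \sup_{n<\omega} \psi(\zeta_n) \le \kappa^{ID_1}
  && \text{(Theorem 9.6.19)}
\end{align*}
```

Each line depends only on the one before it, so the lower bound is reached in countably many applications of the step lemma followed by one condensation per stage.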
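9.6.21 Exercise For $\alpha \in On$ and a set $X$ of ordinals let $C_\alpha(X)$ be the closure of $X \cup \{0, \Omega\}$ under $+$ and $\lambda \xi < \alpha.\, \lambda \eta.\, \vartheta_\xi(\eta)$ for $\vartheta_\gamma := en_{In(\gamma)}$ and $In(\gamma) := \{\xi \mid \xi \notin C_\gamma(\xi)\}$. Show the following properties.
a) $In(\alpha)$ is unbounded and $In(\alpha) \cap \Omega$ is unbounded in $\Omega$.
b) $\Omega$ is closed under $\vartheta_\alpha$.
c) $C_\alpha(\emptyset) \cap \Omega = \vartheta_\alpha(0)$.
d) If $\alpha < \vartheta_\Omega(0)$ then $In(\alpha) \cap \Omega = Cr(\alpha) \cap \Omega$.
e) If $\alpha < \vartheta_\Omega(0)$ and $\beta < \Omega$ then $\vartheta_\alpha(\beta) = \varphi_\alpha(\beta)$.
f) $\vartheta_\Omega(0) = \Gamma_0$.
g) $\psi(\varepsilon_{\Omega+1}) = \vartheta_{\vartheta_1(\Omega+1)}(0)$.
Hint: For d) show first $\alpha \in C_\Omega(0) \cap \Omega \Rightarrow \alpha \in C_\alpha(0)$.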
9.7 Alternative Interpretations for Ω In defining the iterated operators Bα , we interpreted the ordinal Ω as ω1 , the first uncountable ordinal. We already mentioned that another possible interpretation for Ω could be ω1CK , the first recursively regular ordinal. In view of the role of Ω in the ordinal analysis of the theory ID1 , the interpretation of Ω as ω1CK would even be the more natural one. The aim of this section (which is not needed for the further reading) is to show, that the “variable” Ω may be interpreted in many ways without altering the transitive part of the resulting operators Bα . We obtained the ordinal analysis of ID1 by iterating the operator B which essentially closes a set X ∪ {0, Ω } ⊆ On under ordinal addition + and the Veblen function ϕ viewed as a binary function. Since + and ϕ have a fixed meaning they do not depend on the interpretation of the ordinal Ω . The definition of the function ψB , however, depends on the operator B which in turn depends on the interpretation of the “variable” Ω . By assigning a value V (Ω ) to the “symbol” Ω we obtain an operator BV together with a function ψV := ψBV as defined in Definition 9.4.1. In the following, we will denote by Bα , the operators obtained by the standard interpretation V (Ω ) := ω1 . Recall that Bα ⊆ BΓω1 +1 for all ordinals α . This follows because all strongly critical ordinals which enter the sets Bα are of the form ψ (β ) < ω1 . The first strongly critical ordinal above ω1 , which is Γω1 +1 , is therefore inaccessible for the operations that form the sets Bα . In the next definition, we refer to the definition of ordinal terms On as presented in Definition 9.6.7. Here, however, we do not care that these terms are coded as natural numbers. To improve readability we will therefore write 0, Ω instead of 0,
1, a1 + · · · + an instead of 1, a1 , . . . , an , ϕ¯ a (b) instead of 2, a, b and ψ (a) instead of 3, a. We will also write < instead of ≺ etc. To distinguish ordinal terms, which may contain the variable Ω , from their interpretations in the ordinals we are, however, going to denote them (for the moment) by lower case Roman letters.
9.7.1 Definition Let $V(\Omega) \in On$ be an assignment of an ordinal to $\Omega$. We define the interpretation $a^V \in On$ for ordinal terms $a \in On$ by the following clauses.
• $0^V := 0$ and $\Omega^V := V(\Omega)$
• $(a_1 + \cdots + a_n)^V := a_1^V + \cdots + a_n^V$
• $(\bar{\varphi}_a(b))^V := \bar{\varphi}_{a^V}(b^V)$
• $(\psi(a))^V := \psi_V(a^V)$
Let $On^V := \{a^V \mid a \in On\}$. The standard interpretation of $\Omega$ is the interpretation $St(\Omega) := \omega_1$. For the standard interpretation we obviously have $a^{St} = |a|_O$ for all ordinal terms $a \in On$ and

$On^{St} = T = B_{\Gamma_{\omega_1+1}}(\emptyset) = B_{\Gamma_{\omega_1+1}}$   (9.37)
as shown in Theorems 9.6.3 and 9.6.8. If we try to develop the theory of iterations of the operator B as in the Sects. 9.4– 9.6, on the basis of a nonstandard interpretation for Ω we see that we already fail in proving equation (9.18) (in p. 173). Here we used that St(Ω ) = ω1 is a regular cardinal. The obvious remedy is to require equation (9.18) axiomatically and to introduce (AxΩ )
(∀ξ )[ψV (ξ ) < V (Ω )]
as the defining axiom for Ω . Ruminating Sect. 9.4 through Sect. 9.6, we will then notice that this (and at some places also V (Ω ) ∈ SC, but by Exercise 9.7.26 even this can be dispensed with) is all we need. In the rest of this section we are going to characterize the ordinals which satisfy AxΩ . 9.7.2 Definition An interpretation V is good relative to an ordinal Θ if (∀a)[aSt ∈ BΘ ∩ (Θ + 1) ⇒ ψV (aV ) < V (Ω )].
(9.38)
We call $V$ a good interpretation if it is good relative to the ordinal $\Gamma_{\omega_1+1}$. It follows from (9.18) on p. 173 that St is a good interpretation. We will continue to write $\psi(a)$ instead of $\psi_{St}(a)$. To avoid silly technicalities, we will from now onwards assume that any interpretation satisfies $V(\Omega) \in SC$. Then the following lemma is immediate.

9.7.3 Lemma Let $V$ be an interpretation. Then $a \in H$ implies $a^V \in H$ and $a \in SC$ implies $a^V \in SC$.

Recall that $\Gamma$ denotes the enumerating function of the class SC of strongly critical ordinals and $\Gamma'$ its derivative. For an ordinal $\alpha$ we define its strongly critical successor by $\alpha^{SC} := \min\{\eta \in SC \mid \alpha < \eta\}$. For an interpretation $V$, we abbreviate $B^V_\alpha(\emptyset)$ by $B^V_\alpha$.
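As a small illustration of the clauses of Definition 9.7.1 and of the strongly critical successor, the following sketch evaluates one sample term under an arbitrary interpretation $V$. The term $\Omega + \psi(\Omega)$ is chosen only for illustration and does not occur in the text. The last two lines use the values $\psi(\omega_1) = \Gamma'(0)$ (with $\Gamma'$ the derivative of $\Gamma$) and $\omega_1^{SC} = \Gamma_{\omega_1+1}$ that are quoted elsewhere in this section.

```latex
\begin{align*}
(\Omega + \psi(\Omega))^V
  &= \Omega^V + (\psi(\Omega))^V
  && \text{clause for } + \\
  &= V(\Omega) + \psi_V(\Omega^V)
  && \text{clauses for } \Omega \text{ and } \psi \\
  &= V(\Omega) + \psi_V(V(\Omega)) \\[4pt]
(\Omega + \psi(\Omega))^{St}
  &= \omega_1 + \psi(\omega_1) = \omega_1 + \Gamma'(0)
  && \text{standard interpretation} \\
\omega_1^{SC}
  &= \min\{\eta \in SC \mid \omega_1 < \eta\} = \Gamma_{\omega_1+1}
\end{align*}
```

Only the clauses for $\Omega$ and $\psi$ depend on $V$; addition and $\bar{\varphi}$ are evaluated as usual on the interpreted arguments.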
9.7.4 Lemma Let $V$ be an interpretation such that $\Gamma'(0) \le V(\Omega)$. Then

$\psi_V \restriction \Gamma'(0) = \Gamma \restriction \Gamma'(0)$.   (9.39)

Proof We show

$\xi < \Gamma'(0) \Rightarrow B^V_\xi \cap \Omega = \psi_V(\xi) = \Gamma(\xi)$

by induction on $\xi$. By induction hypothesis we have

$\eta < \xi \Rightarrow B^V_\eta \cap \Omega = \psi_V(\eta) = \Gamma(\eta) > \eta$.   (i)

Hence $\eta \in B^V_\eta \subseteq B^V_\xi$ and $\Gamma(\eta) \in B^V_\xi$ for all $\eta < \xi$. For $\rho < \Gamma(\xi)$ there is an $\eta < \xi$ such that $\Gamma(\eta)$ majorizes $SC(\rho)$ which implies $\rho \in B^V_\xi$, hence

$\Gamma(\xi) \subseteq B^V_\xi \cap \Omega \subseteq \psi_V(\xi)$.

But (i) and the hypothesis $\Gamma'(0) \le V(\Omega)$ imply $\Gamma(\xi) \notin B^V_\xi$ which shows also $\psi_V(\xi) \le \Gamma(\xi)$. Hence $\psi_V(\xi) = \Gamma(\xi) = B^V_\xi \cap \Omega$.

9.7.5 Corollary Let $V$ be an interpretation such that $\Gamma'(0) \le V(\Omega)$. Then $\psi_V \restriction \Gamma'(0) = \Gamma \restriction \Gamma'(0) = \psi_{St} \restriction \Gamma'(0)$ and $a^V = a^{St}$ for all ordinal terms $a \in On$ such that $a^{St} < \Gamma'(0)$.

Proof The first part of the claim is in Lemma 9.7.4. We prove the second part by induction on the length of the term $a$. If $a$ is not of the form $\psi_V(b)$ we obtain the claim directly from the induction hypothesis. If $a$ is a term $\psi(b)$ such that $a^{St} = \psi(b^{St}) < \Gamma'(0)$ then $b^{St} < \Gamma'(0)$ and we obtain $b^{St} = b^V$ and thus $a^{St} = \psi(b^{St}) = \psi_V(b^V) = a^V$.

9.7.6 Lemma Let $V$ be an interpretation which is good relative to an ordinal $\Theta$. For ordinal terms $a, b \in On$ we then obtain

$a^{St} \in B_{\Theta+1} \Rightarrow [a^{St} < b^{St} \Leftrightarrow a^V < b^V]$   (i)

$b^{St} \in B_\Theta \cap (\Theta + 1) \wedge a^{St} \in B_{b^{St}} \Rightarrow a^V \in B^V_{b^V}$   (ii)
Proof Let $lh(a)$ denote the length of the term $a \in On$. We prove (i) and (ii) simultaneously by induction on $2^{lh(a)} + 2^{lh(b)}$. In the proof of (i) we follow the distinction by cases of Definition 9.6.7. Of course it suffices to show the direction from left to right. The opposite direction is then an immediate consequence.

If $a^{St} = 0$ and $b^{St} \neq 0$ we have $a^V = 0$ and $b^V \neq 0$. Hence $a^V < b^V$.

Now suppose $a = a_1 + \cdots + a_m$. From $a^{St} \in B_{\Theta+1}$ we get $\{a_1^{St}, \ldots, a_m^{St}\} \subseteq B_{\Theta+1}$. Since $2^{lh(a_i)} + 2^{lh(a_{i+1})} < 2^{lh(a)} + 2^{lh(b)}$ we obtain by the induction hypothesis $a_i^V \ge a_{i+1}^V$, thence $a^V =_{NF} a_1^V + \cdots + a_m^V$. If $b = b_1 + \cdots + b_n$ we obtain $b^V =_{NF} b_1^V + \cdots + b_n^V$ by the induction hypothesis with the same argument. Since $a_i^{St} < b_i^{St} \Leftrightarrow a_i^V < b_i^V$ we get $a^V < b^V$. If $b \in H$ we have $a_1^{St} < b^{St}$. By the induction hypothesis it follows that $a_1^V < b^V$, which entails $a^V < b^V$ since $b^V \in H$ by Lemma 9.7.3.

Now suppose $a \in H$ and $b = b_1 + \cdots + b_n$. Then $a^{St} \le b_1^{St}$ and we obtain $H \ni a^V \le b_1^V$ and $b^V =_{NF} b_1^V + \cdots + b_n^V$ by the induction hypothesis. But this implies $a^V < b^V$.

Assume $a = \bar{\varphi}_{a_1}(a_2)$ and $b = \bar{\varphi}_{b_1}(b_2)$. If $a_1^{St} < b_1^{St}$ and $a_2^{St} < b^{St}$, we obtain $a_1^V < b_1^V$ and $a_2^V < b^V$ by the induction hypothesis, which implies $a^V < b^V$. If $a_1^{St} = b_1^{St}$ and $a_2^{St} < b_2^{St}$ then $a_1^V = b_1^V$ and $a_2^V < b_2^V$ by the induction hypothesis which implies $a^V < b^V$. Similarly we obtain from $b_1^{St} < a_1^{St}$ and $a^{St} \le b_2^{St}$ by induction hypothesis $b_1^V < a_1^V$ and $a^V \le b_2^V$, which in turn imply $a^V < b^V$.

If $a = \bar{\varphi}_{a_1}(a_2)$ and $b \in SC$ such that $a_1^{St} < b^{St}$ and $a_2^{St} < b^{St}$ then $b^V \in SC$ and $a_1^V < b^V$ and $a_2^V < b^V$ by the induction hypothesis. By Lemma 9.7.3 we have $b^V \in SC$ which entails $a^V = \bar{\varphi}_{a_1^V}(a_2^V) < b^V$.

If $a \in SC$ and $b = \bar{\varphi}_{b_1}(b_2)$ we have $a^{St} < b_1^{St}$ or $a^{St} < b_2^{St}$. By Lemma 9.7.3 we have $a^V \in SC$ and obtain $a^V < b_1^V$ or $a^V < b_2^V$ by the induction hypothesis. In both cases this implies $a^V < \bar{\varphi}_{b_1^V}(b_2^V)$.

Now let $a = \psi(a_0)$ and $b = \psi(b_0)$. Then $2^{lh(a_0)} + 2^{lh(a_0)} \le 2^{lh(a)} < 2^{lh(a)} + 2^{lh(b)}$ which by the induction hypothesis for (i) entails $a_0^V < b_0^V$. Moreover we have $c \in K(a_0) \Rightarrow c^{St} < a_0^{St}$, which entails $K(a_0^{St}) \subseteq a_0^{St}$ and that in turn implies $a_0^{St} \in B_{a_0^{St}}$ by Lemma 9.6.5. Since $B_{\Theta+1} \cap \omega_1 = \psi(\Theta + 1)$ and $\psi(a_0^{St}) < \omega_1$ we obtain $\psi(a_0^{St}) < \psi(\Theta + 1)$, which implies $a_0^{St} \le \Theta$. Hence $a_0^{St} \in B_\Theta \cap (\Theta + 1)$. By the induction hypothesis for (ii) we therefore obtain $a_0^V \in B^V_{a_0^V} \subseteq B^V_{b_0^V}$. But then $a^V = \psi_V(a_0^V) < \psi_V(b_0^V) = b^V$.

If finally $a = \psi(a_0)$ and $b = \Omega$ we get $a_0^{St} \in B_\Theta \cap (\Theta + 1)$ as before and thus $\psi_V(a_0^V) < \Omega^V = V(\Omega)$ since $V$ is good relative to $\Theta$.

We are now going to prove (ii). The claim is obvious for $a = 0$ and $a = \Omega$. If $a = a_1 + \cdots + a_n$ or $a = \bar{\varphi}_{a_0}(a_1)$ we obtain by the induction hypothesis for (i) and Lemma 9.7.3 $a^V =_{NF} a_1^V + \cdots + a_n^V$ or $a^V = \bar{\varphi}_{a_0^V}(a_1^V)$, respectively. By the induction hypothesis for (ii) we then have $SC(a^V) \subseteq B^V_{b^V}$, which entails $a^V \in B^V_{b^V}$.

If $a = \psi(a_0)$ then $a^{St} =_{NF} \psi(a_0^{St})$ and $a_0^{St} \in B_{b^{St}} \cap b^{St} \subseteq B_\Theta \cap (\Theta + 1)$. By the induction hypothesis for (ii) and (i) we then obtain $a^V =_{NF} \psi_V(a_0^V)$ and $a_0^V \in B^V_{b^V} \cap b^V$ and this implies $a^V =_{NF} \psi_V(a_0^V) \in B^V_{b^V}$.

We have seen in Theorem 9.6.8 that, for every ordinal $\alpha \in T$ there is an ordinal term $\ulcorner\alpha\urcorner \in On$ such that $\alpha = |\ulcorner\alpha\urcorner|_O = \ulcorner\alpha\urcorner^{St}$. The notation $\ulcorner\alpha\urcorner$ is uniquely defined because we used the fixed-point free versions of the Veblen functions.

9.7.7 Definition For an ordinal $\alpha \in T$ and an interpretation $V$ we define $\alpha^V := \ulcorner\alpha\urcorner^V$.

So if $V$ is an interpretation which is good relative to an ordinal $\Theta$ we obtain from Lemma 9.7.6 for ordinals $\alpha, \beta \in T$

$\alpha \in B_{\Theta+1} \Rightarrow [\alpha < \beta \Leftrightarrow \alpha^V < \beta^V] \wedge [\alpha = \beta \Leftrightarrow \alpha^V = \beta^V]$   (9.40)
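$\beta \in B_\Theta \cap (\Theta + 1) \wedge \alpha \in B_\beta \Rightarrow \alpha^V \in B^V_{\beta^V}$   (9.41)

$\beta \in B_{\Theta+1} \wedge \alpha \in Cr(\beta) \Rightarrow \alpha^V \in Cr(\beta^V)$.   (9.42)

Property (9.42) holds true since $\alpha \in Cr(\beta)$ implies that there is a $\beta_0 \ge \beta$ such that $\alpha \in Cr(\beta_0) \setminus Cr(\beta_0 + 1)$. Then $\alpha = \bar{\varphi}_{\beta_0}(\eta)$ for some $\eta$ and thus $\alpha^V = \ulcorner\alpha\urcorner^V = \bar{\varphi}_{\beta_0^V}(\eta^V) \in Cr(\beta_0^V) \subseteq Cr(\beta^V)$ since $\beta^V \le \beta_0^V$ by (9.40).

For $\beta \in B_\Theta \cap (\Theta + 1)$ the interpretation $V$ is thus an embedding from $B_\beta$ into $B^V_{\beta^V}$ which preserves principality, strong criticality and $\beta$-criticality. We will now prove that this embedding is also onto. The proof will need a relativized version of Lemma 9.6.2 saying that if $\psi_V(\alpha) \in B^{V,n}_\beta$ there is an $\alpha_0 \in B^{V,n}_\beta$ such that $\alpha_0 \in B^V_{\alpha_0}$ and $\psi_V(\alpha) = \psi_V(\alpha_0)$, where $B^{V,n}_\beta$ is defined analogously to $B^n_\beta$. Since the proof of Lemma 9.6.2 only needs $\psi(\alpha) < \Omega$ it relativizes easily to interpretations which are good relative to some $\Theta$.

9.7.8 Lemma Let $V$ be a good interpretation relative to $\Theta$ and $\beta \in B_\Theta \cap (\Theta + 1)$. Then for every $\alpha \in B^V_{\beta^V}$ there is a $\gamma \in B_\beta$ such that $\alpha = \gamma^V$. Moreover we have $\alpha \in SC$ iff $\gamma \in SC$ and $\alpha \in H$ iff $\gamma \in H$.

Proof Let $\alpha \in B^{V,n}_{\beta^V}$. We prove the lemma by induction on $\beta^V$ with side induction on $n$. If $\alpha = 0$ we put $\gamma := 0$ and if $\alpha = V(\Omega)$ we put $\gamma := \omega_1$. Now assume $\alpha =_{NF} \alpha_1 + \cdots + \alpha_n$. Then $H \ni \alpha_i < \alpha$ for $i = 1, \ldots, n$. By the main induction hypothesis there are ordinals $\gamma_i \in B_\beta$ such that $\alpha_i = \gamma_i^V$ and $\gamma_i \in H$. By equation (9.40) we obtain $\gamma_1 \ge \cdots \ge \gamma_n$ and put $\gamma := \gamma_1 + \cdots + \gamma_n$. Then $\gamma =_{NF} \gamma_1 + \cdots + \gamma_n$ and $\gamma^V = \gamma_1^V + \cdots + \gamma_n^V$, $\gamma \in B_\beta$ and $\gamma \notin H$.

Next assume $\alpha = \bar{\varphi}_{\alpha_1}(\alpha_2)$. Then $\alpha \in H \setminus SC$ and $\alpha_i < \alpha$. By the main induction hypothesis there are ordinals $\gamma_1, \gamma_2 \in B_\beta$ such that $\gamma_i^V = \alpha_i$ for $i = 1, 2$. Let $\gamma = \bar{\varphi}_{\gamma_1}(\gamma_2)$. Then $\gamma \in B_\beta$ and $\gamma_i < \gamma$ for $i = 1, 2$.

Let $\alpha = \psi_V(\eta)$ such that $\eta \in B^{V,n-1}_{\beta^V} \cap \beta^V$. Then $\alpha \in SC$. By Lemma 9.6.2 there is an $\eta_0 \in B^{V,n-1}_{\beta^V} \cap \beta^V$ such that $\eta_0 \in B^V_{\eta_0}$ and $\alpha = \psi_V(\eta_0)$. By induction hypothesis there is an $\alpha_0$ such that $\eta_0 = \alpha_0^V$, hence $\alpha_0^V \in B^V_{\alpha_0^V}$, which implies $\alpha_0 \in B_{\alpha_0}$. So $\gamma := \psi(\alpha_0)$ implies $\gamma \in SC$ and $\gamma =_{NF} \psi(\alpha_0)$ and we obtain $\alpha = \psi_V(\eta_0) = \psi_V(\alpha_0^V) = \psi(\alpha_0)^V = \gamma^V$.

9.7.9 Theorem Let $V$ be an interpretation which is good relative to $\Theta$ and $\beta \in B_\Theta \cap (\Theta + 1)$. Then $(B_\beta)^V = B^V_{\beta^V}$.

Proof From $\alpha \in B_\beta$ we obtain $\alpha^V \in B^V_{\beta^V}$ by (9.41). Hence $(B_\beta)^V := \{\alpha^V \mid \alpha \in B_\beta\} \subseteq B^V_{\beta^V}$. Conversely we obtain for $\alpha \in B^V_{\beta^V}$ a $\gamma \in B_\beta$ such that $\alpha = \gamma^V \in (B_\beta)^V$ by Lemma 9.7.8. Hence $B^V_{\beta^V} \subseteq (B_\beta)^V$.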
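9.7.10 Corollary For every good interpretation we have $(B_{\Gamma_{\omega_1+1}})^V = B^V_{V(\Omega)^{SC}}$.

Proof Define $\Delta_0 := \omega_1 + 1$ and $\Delta_{n+1} = \bar{\varphi}_{\Delta_n}(0)$. Then $\sup_{n \in \omega} \Delta_n = \Gamma_{\omega_1+1}$, $\Delta_n \in B_{\Delta_n}$ for all $n \in \omega$ and $B_{\Gamma_{\omega_1+1}} = \bigcup\{B_{\Delta_n} \mid n \in \omega\}$. The interpretation $V$ is good relative to all $\Delta_n$ for $n \in \omega$. Hence $(B_{\Gamma_{\omega_1+1}})^V = \bigcup\{(B_{\Delta_n})^V \mid n \in \omega\} = \bigcup\{B^V_{\Delta_n^V} \mid n \in \omega\} = B^V_{V(\Omega)^{SC}}$, since $\sup\{\Delta_n^V \mid n \in \omega\} = V(\Omega)^{SC}$.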
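The fundamental sequence from the proof of Corollary 9.7.10 can be unfolded term by term with the clauses of Definition 9.7.1. The sketch below assumes that the term for $1$ contains no occurrence of $\Omega$, so that $1^V = 1$. The last line restates the two suprema used in the proof, together with the identification of $\Gamma_{\omega_1+1}$ as the first strongly critical ordinal above $\omega_1$ mentioned at the beginning of this section.

```latex
\begin{align*}
\Delta_0 &= \omega_1 + 1,
  & \Delta_0^V &= V(\Omega) + 1, \\
\Delta_1 &= \bar{\varphi}_{\omega_1+1}(0),
  & \Delta_1^V &= \bar{\varphi}_{V(\Omega)+1}(0), \\
\Delta_{n+1} &= \bar{\varphi}_{\Delta_n}(0),
  & \Delta_{n+1}^V &= \bar{\varphi}_{\Delta_n^V}(0), \\
\sup_{n \in \omega} \Delta_n &= \Gamma_{\omega_1+1} = \omega_1^{SC},
  & \sup_{n \in \omega} \Delta_n^V &= V(\Omega)^{SC}.
\end{align*}
```

The right-hand column is obtained from the left-hand one by replacing $\omega_1$ with $V(\Omega)$, which is exactly what the interpretation does on terms that contain no $\psi$.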
Summing up, we obtain the following theorem.

9.7.11 Theorem If $V$ is an interpretation which is good relative to $\Theta$ and $\beta \in B_\Theta \cap (\Theta + 1)$ then $V$ is an order isomorphism from $B_\beta$ onto $B^V_{\beta^V}$. If $V$ is a good interpretation then $V$ is an order isomorphism from $B_{\Gamma_{\omega_1+1}}$ onto $B^V_{V(\Omega)^{SC}}$.

The main goal of the present section is to show that the transitive part of $B_{\Gamma_{\Omega+1}}$ does not depend on the interpretation of $\Omega$, i.e., we want to show that $B_{\Gamma_{\omega_1+1}} \cap \omega_1 = B^V_{V(\Omega)^{SC}} \cap V(\Omega)$ holds true for any good interpretation $V$.

9.7.12 Lemma Let $V$ be an interpretation which is good relative to $\Theta$. Then $B^V_{\alpha^V} \cap V(\Omega) = \psi_V(\alpha^V)$ holds for all $\alpha \in B_\Theta \cap (\Theta + 1)$.

Proof Since $V$ is good relative to $\Theta$, we have $\psi_V(\alpha^V) < V(\Omega)$ for all $\alpha \in B_\Theta \cap (\Theta + 1)$ by definition. Then we prove the lemma literally as we proved (9.21) on p. 173.

9.7.13 Theorem Let $V$ be an interpretation which is good relative to $\Theta$. Then
(i) $B^V_{\alpha^V} \cap V(\Omega) = B_\alpha \cap \omega_1$ for all $\alpha \in B_\Theta \cap (\Theta + 1)$ and
(ii) $\alpha^V = \alpha$ for all $\alpha < \psi(\Theta)$.

Proof Since $V$ is a good interpretation relative to $\Theta$ we know from Theorem 9.7.11 that $V$ is an order isomorphism from $B_\alpha$ onto $B^V_{\alpha^V}$ for all $\alpha \in B_\Theta \cap (\Theta + 1)$ which maps $\omega_1$ to $V(\Omega)$. By (9.21) and Lemma 9.7.12 $B_\alpha \cap \omega_1$ and $B^V_{\alpha^V} \cap V(\Omega)$ are transitive sets. Therefore $V$ is the identity on these sets. This proves (i). Since $\alpha < \psi(\Theta) = B_\Theta \cap \omega_1 \subseteq B_{\Gamma_{\omega_1+1}} = T$ we have a term notation for $\alpha$. We prove (ii) by induction on $lh(\alpha) := lh(\ulcorner\alpha\urcorner)$. We trivially have $0^V = 0$. If $\alpha =_{NF} \alpha_1 + \cdots + \alpha_n$ or $\alpha =_{NF} \bar{\varphi}_{\alpha_1}(\alpha_2)$ we get $\alpha_i < \psi(\Theta)$ and thus $\alpha_i^V = \alpha_i$ for $i = 1, \ldots, n$ or $i = 1, 2$, respectively. Hence $\alpha^V = \alpha_1^V + \cdots + \alpha_n^V = \alpha_1 + \cdots + \alpha_n = \alpha$ or $\alpha^V =_{NF} \bar{\varphi}_{\alpha_1^V}(\alpha_2^V) = \bar{\varphi}_{\alpha_1}(\alpha_2) = \alpha$. If $\alpha =_{NF} \psi(\alpha_0)$ then $\alpha_0 \in B_\Theta \cap \Theta$. By (i) we thus obtain $B^V_{\alpha_0^V} \cap V(\Omega) = B_{\alpha_0} \cap \omega_1$. Hence $\psi_V(\alpha_0^V) = B^V_{\alpha_0^V} \cap V(\Omega) = B_{\alpha_0} \cap \omega_1 = \psi(\alpha_0)$.
9.7.14 Corollary For a good interpretation V we have α V = α for all α ∈ BΓω1 +1 ∩ ω1 = ψ (Γω1 +1 ). Proof
This follows immediately from Theorem 9.7.13.
It follows from Corollary 9.7.14 that a reinterpretation of $\omega_1$ does not move the ordinals below $\omega_1$ in $B_{\Gamma_{\omega_1+1}}$ provided that this reinterpretation is good. This becomes, of course, false for ordinals above $\omega_1$. Our next aim is to characterize good interpretations.

9.7.15 Theorem Assume $\Theta \in B_\Theta$. Then $\psi(\Theta) < V(\Omega)$ for every interpretation which is good relative to $\Theta$.

Proof Assume $V(\Omega) \le \psi(\Theta)$. By Theorem 9.7.13 it follows $V(\Omega) \le \psi(\Theta) = B_\Theta \cap \omega_1 = B^V_{\Theta^V} \cap V(\Omega) = \psi_V(\Theta^V)$ contradicting the definition of "good relative to $\Theta$".

9.7.16 Corollary We have $\psi(\Gamma_{\omega_1+1}) \le V(\Omega)$ for any good interpretation $V$.

Proof Let $\Delta_n$ be the fundamental sequence for $\Gamma_{\omega_1+1}$ as introduced in the proof of Corollary 9.7.10. Then $V$ is good relative to all $\Delta_n$ and we have $\Delta_n \in B_{\Delta_n}$ for all $n \in \omega$. By Theorem 9.7.15 it follows $\psi(\Delta_n) < V(\Omega)$ for all $n$ which in turn implies $\psi(\Gamma_{\omega_1+1}) \le V(\Omega)$.

9.7.17 Lemma Let $V$ be an interpretation and $\Theta$ an ordinal such that $\Gamma'(0) \le \psi(\Theta) \le V(\Omega) \le \omega_1$. Then
(i) $\beta^V \le \beta$ holds for all $\beta \in B_\Theta$ and
(ii) $V$ is a good interpretation relative to all $\beta < \Theta$.

Proof We show both claims simultaneously by induction on $lh(\beta)$ with side induction on $\beta$. We start with proving (i). For $\beta = 0$ we have $\beta^V = 0 \le \beta$ and for $\beta = \omega_1$ by definition $\beta^V = V(\Omega) \le \omega_1$. For $\beta =_{NF} \beta_1 + \cdots + \beta_n$ or $\beta =_{NF} \bar{\varphi}_{\beta_1}(\beta_2)$ we obtain the claim immediately from the main induction hypothesis. Assume $\beta =_{NF} \psi(\beta_0)$. Then $\beta_0 \in B_\Theta \cap \Theta$ and $V$ is good relative to $\beta_0$ by the main induction hypothesis for (ii). But we also have $\beta_0 \in B_{\beta_0} \cap (\beta_0 + 1)$ by the normal form condition and obtain $\beta^V = \psi_V(\beta_0^V) = \psi(\beta_0) = \beta$ by Theorem 9.7.13. To prove (ii) we have to show

$\xi \in B_\beta \cap (\beta + 1) \Rightarrow \psi_V(\xi^V) < V(\Omega)$.   (i)
So assume ξ ∈ Bβ ∩ (β + 1). Then ψ (ξ ) ≤ ψ (β ). If β = 0, then ξ = 0 and we obtain ψV (ξ V ) = Γ0 < Γ (0) ≤ ψ (Θ ) ≤ V (Ω ) by Lemma 9.7.4.
For β ∈ Lim we distinguish the following cases. 1. ξ = β . Then ξ ∈ Bξ , which shows that ψ (ξ ) is in normal form. Hence ξ = β = ψ (β ) < ω1 . If β ≤ ψ (β ) we therefore obtain β < ψ (β ) ≤ ψ (ω1 ) = Γ (0). By Corollary 9.7.5, this implies ξ V = ξ = β and ψV (ξ V ) = Γξ = Γβ < Γ (0) ≤ ψ (Θ ) ≤ V (Ω ). If ψβ < β we obtain ψβ ∈ BΘ ∩ ψ (Θ ) from β ∈ BΘ ∩ Θ . By the side induction hypothesis for (ii) we then obtain ψV (ξ V ) = ψV (β V ) = ψ (β )V ≤ ψ (β ) < ψ (Θ ) ≤ V (Ω ). 2. ξ < β . Since Bβ = η <β Bη we find an η < β such that ξ ∈ Bη ∩ (η + 1). By the side induction hypothesis for (ii) we know that V is good relative to η and obtain ψV (ξ V ) < V (Ω ). Finally we assume β = β0 + 1. If β ≤ ψ (β ) we obtain ξ ≤ β < ψ (β ) ≤ ψ (ω1 ) = Γ (0) as in case 1. Hence ψV (ξ V ) = Γξ < Γ (0) ≤ V (Ω ). Now let ψ (β ) < β . There is an α ∈ Bω1 such that ψ (ξ ) =NF ψ (α ). In the terminology of Lemma 9.6.2 it is α = ξ (ξ ) ≥ ξ . If ξ = β then α = ξ as ξ ∈ Bβ = Bξ and therefore ξ (ξ ) = β (β ) = min {η β ≤ η ∈ Bβ } = β . From β ∈ BΘ ∩ Θ and ψ (β ) < β we obtain ψ (β ) ∈ BΘ ∩ β . By the side induction hypothesis for (i) it follows ϕ¯V (ξ V ) = ϕ¯V (β V ) ≤ ψ (β ) < ψ (Θ ) ≤ V (Ω ). If ξ < β then ψ (ξ ) =NF ψ (α ) ∈ Bβ which implies α ∈ Bβ ∩ β ⊆ BΘ . By the side induction hypothesis for (ii) we know that V is good relative to β0 . By (9.40) we therefore obtain ξ V ≤ α V from ξ ≤ α and as ψ (α ) < ψ (β ) < β finally also ψV (ξ V ) ≤ ψV (α V ) ≤ ψ (α ) ≤ ψ (β ) < ψ (Θ ) ≤ V (Ω ) by the side induction hypothesis for (i). 9.7.18 Lemma Suppose that Θ is a limit ordinal and V an interpretation such that Γ (0) ≤ ψ (Θ ) ≤ V (Ω ). Then ψV (ξ V ) = ψ (ξ )V = ψ (ξ ) < ψ (Θ ) holds true for all ξ ∈ BΘ ∩ Θ . Proof If ω1 ≤ V (Ω ) we obtain ψV (ξ ) < ω1 ≤ V (Ω ) for all ordinals ξ and V is a good interpretation. If V (Ω ) < ω1 and ξ ∈ BΘ ∩ Θ then there is an η < Θ such that ξ ∈ Bη ∩ η . 
By the hypothesis ψ (Θ ) ≤ V (Ω ) it follows from Lemma 9.7.17 that V is good relative to η which by Lemma 9.7.13 entails ψV (ξ V ) = ψ (ξ )V = ψ (ξ ). But ψ (ξ ) < ψ (Θ ) is clear from ξ ∈ BΘ ∩ Θ . Now we are prepared for a characterization of good interpretations. 9.7.19 Theorem An interpretation V is good if and only if ψ (Γω1 +1 ) ≤ V (Ω ). Proof If V is good then ψ (Γω1 +1 ) ≤ V (Ω ) by Corollary 9.7.16. For the opposite direction, assume ψ (Γω1 +1 ) ≤ V (Ω ) and ξ ∈ BΓω1 +1 ∩ (Γω1 +1 ). By Lemma 9.7.18 we obtain ψV (ξ V ) = ψ (ξ ) < ψ (Γω1 +1 ) ≤ V (Ω ). So V is a good interpretation. As a consequence of the characterization theorem for good interpretation we obtain the following observation. 9.7.20 Theorem The following interpretations are good interpretations. (A) The standard interpretation V (Ω ) := ω1 .
(B) The recursive standard interpretation V (Ω ) := ω1CK (C) The term interpretation V (Ω ) := ψ (Γω1 +1 ). Proof Theorem 9.7.19 together with Equation (9.18) implies that the standard interpretation is a good interpretation. From Corollary 9.6.9 and Theorem 9.7.19 it follows that the recursive standard interpretation is good and (C) follows immediately from Theorem 9.7.19. Let us now return to AxΩ . 9.7.21 Definition Let V be an interpretation. We call V a global model of AxΩ if (∀ξ )[ψV (ξ ) < V (Ω )]. If we only have (∀ξ ε BVV (Ω )SC )[ψV (ξ ) < V (Ω )], we talk about a local model of AxΩ . We already mentioned in the beginning of the section that for global models we can develop the theory of the iterations Bα as in the case of the standard interpretation. The ordinal Ω may therefore be viewed as a “virtual ordinal”, which has only to be bigger than the transitive part of all Bα . Since we cannot know its size in advance we have opted to interpret Ω as ω1 , which for cardinality reasons is safely outside of all the eventually obtained transitive parts. 9.7.22 Theorem The following statements are equivalent. (A) The interpretation V is a global model for AxΩ . (B) The interpretation V is good and ψ (Γω1 +1 ) < V (Ω ). (C) The interpretation V is a local model of AxΩ and ψ ((V (Ω ))SC ) < V (Ω ). Proof From (A) we obtain immediately that V is a good interpretation. Let ∆n be again the fundamental sequence for Γω1 +1 as defined in the proof of Corollary 9.7.10. Then ∆nV < V (Ω )SC and we obtain ψ (Γω1 +1 ) = sup {ψ (∆n ) n ∈ ω } = sup {ψV (∆nV ) n ∈ ω } ≤ ψV (V (Ω )SC ) < V (Ω ). Assume (B) and choose ξ ∈ BVV (Ω )SC . Then ξ ∈ B∆V V for some n ∈ ω . We have n ∆n ∈ B∆n ∩ (∆n + 1) and V is good relative to ∆n . By Lemma 9.7.8, η ∈ BΓ∆n ⊆ BΓω1 +1 such that ξ = η V . Hence ψV (ξ ) = ψV (η V ) < V (Ω ). So V is a local model for AxΩ . Now regard the sequence ∆nV . It is supn∈ω ∆nV = V (Ω )SC and we obtain ∆nV ∈ BξV ∩V (Ω )SC for any ordinal ξ . 
Now we claim

$\sup_{n \in \omega} \psi_V(\Delta_n^V) = \psi_V(V(\Omega)^{SC})$.   (i)
From (i) and Corollary 9.7.14 we then obtain $\psi_V(V(\Omega)^{SC}) = \sup_{n \in \omega} \psi_V(\Delta_n^V) = \sup_{n \in \omega} \psi(\Delta_n) = \psi(\Gamma_{\omega_1+1}) < V(\Omega)$. It remains to show (i). We already used the obvious fact that $\sup_{n \in \omega} \psi_V(\Delta_n^V) \le \psi_V(V(\Omega)^{SC})$. By Lemma 9.7.18 we obtain $\sigma := \sup_{n \in \omega} \psi_V(\Delta_n^V) = \sup_{n \in \omega} \psi(\Delta_n) = \psi(\Gamma_{\omega_1+1}) < V(\Omega)$. Towards a contradiction, we assume $\sigma < \psi_V(V(\Omega)^{SC}) = \bigcup_{n \in \omega} B^V_{\Delta_n^V} \cap V(\Omega)$. Then there is an $n \in \omega$ such that $\sigma \in B^V_{\Delta_n^V} \cap V(\Omega) = \psi_V(\Delta_n^V)$. Hence $\psi(\Gamma_{\omega_1+1}) = \sigma < \psi_V(\Delta_n^V) = \psi(\Delta_n) < \psi(\Gamma_{\omega_1+1})$, a contradiction.

Let us finally assume (C). We first observe

$\alpha \le \beta \wedge [\alpha, \beta) \cap B^V_\beta = \emptyset \Rightarrow B^V_\alpha = B^V_\beta \wedge \psi_V(\alpha) = \psi_V(\beta)$   (ii)

which is obvious from the definition of $B^V_\beta$ and $\psi_V(\beta)$. We also prove

$\xi \in B^V_\beta \Rightarrow \xi \in B^V_{V(\Omega)^{SC}} \cap V(\Omega)^{SC}$   (iii)

for any ordinal $\beta$ easily by induction on the definition of $\xi \in B^V_\beta$ using the fact that $V$ is a local model of $Ax_\Omega$. For $V(\Omega)^{SC} \le \xi$ we obtain by (ii) and (iii) $\psi_V(\xi) = \psi_V(V(\Omega)^{SC}) < V(\Omega)$ and for $\xi < V(\Omega)^{SC}$ by the weak monotonicity of $\psi_V$ also $\psi_V(\xi) \le \psi_V(V(\Omega)^{SC}) < V(\Omega)$.

As an immediate consequence of Theorem 9.7.19 and Theorem 9.7.22 we obtain

9.7.23 Corollary If $\psi(\Gamma_{\omega_1+1}) < V(\Omega)$ then $V$ is a global model of $Ax_\Omega$.

Corollary 9.7.23 together with Corollary 9.6.9 yield the following theorem.

9.7.24 Theorem The recursive standard interpretation $V(\Omega) := \omega_1^{CK}$ is a global model of $Ax_\Omega$.
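The characterizations obtained in this section can be collected in one display. This is only a summary sketch of Theorems 9.7.19, 9.7.20, 9.7.22 and 9.7.24 under the standing assumption $V(\Omega) \in SC$. The symbol $\sigma$ is used here as local shorthand for $\psi(\Gamma_{\omega_1+1})$, following its use in the proof of Theorem 9.7.22.

```latex
\begin{align*}
\sigma &:= \psi(\Gamma_{\omega_1+1}) \\
V \text{ is good}
  &\iff \sigma \le V(\Omega)
  && \text{(Theorem 9.7.19)} \\
V \text{ is a global model of } Ax_\Omega
  &\iff V \text{ is good and } \sigma < V(\Omega)
  && \text{(Theorem 9.7.22)} \\
V(\Omega) = \sigma
  &: \text{ good, but not a global model}
  && \text{(term interpretation)} \\
V(\Omega) \in \{\omega_1^{CK}, \omega_1\}
  &: \text{ good and a global model}
  && \text{(Theorems 9.7.20, 9.7.24)}
\end{align*}
```

The only good interpretation that fails to be a global model is therefore the boundary case $V(\Omega) = \sigma$.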
In Fig. 9.1 we have visualized the set of ordinals in $\bigcup_{\alpha \in On} B_\alpha = B_{\Gamma_{\omega_1+1}}$ and alternative interpretations for $\Omega$. For all interpretations $V(\Omega) \in (\psi(\Gamma_{\omega_1+1}), \omega_1)$ the transitive segment stays untouched but $\bigcup_{\alpha \in On} B^V_\alpha \subseteq \omega_1$. All these interpretations generate a global model of $Ax_\Omega$. The term interpretation $V(\Omega) = \psi(\Gamma_{\omega_1+1})$, however, does not generate a global model of $Ax_\Omega$. As indicated in Fig. 9.1 moving $V(\Omega)$ to $\psi(\Gamma_{\omega_1+1})$ enlarges the transitive part of $B^V_{V(\Omega)^{SC}}$. In fact this interpretation compresses all gaps such that $B^V_{V(\Omega)^{SC}}$ becomes a transitive set whose order type is $otyp(B_{\Gamma_{\omega_1+1}})$.

Fig. 9.1 Visualization of the ordinals generated by iterations of B and alternative interpretations for Ω.
9.7.25 Exercise Prove Lemma 9.7.3 9.7.26 Exercise (a) Show that Γ (α ) = ψV (α ) holds true for V (Ω ) ∈ Γ (0) \ SC. (b) Describe the behavior of ψV for V (Ω ) ∈ Γ (0) ∩ SC. For the following parts drop the hypothesis V (Ω ) ∈ SC in the definition of an interpretation. Call V good relative to an ordinal Θ if ψV (aV )V < sup {λ ∈ SC λ ≤ V (Ω )} holds true for all a such that aSt ∈ BΘ ∩ (Θ + 1). (c) Show that Theorem 9.7.19 and Theorem 9.7.22 remain true (d) Show that Lemma 9.7.6 (i) fails if we only require ψV (aV ) < V (Ω ) for aSt ∈ BΘ ∩ (Θ + 1) in the definition of a good interpretation, which may not be strongly critical. Hint: For (c) extend H to HV := H ∪ {V (Ω )}, SC to SCv := SC ∪ {V (Ω )} and modify the normalform conditions for ordinal terms replacing H and SC by HV and SCV accordingly. Then check that all claims of this section remain correct.
9.7.27 Exercise Show that $V(\Omega) =_{NF} \psi(\alpha)$ implies $\psi(\alpha) < \psi_V(\alpha^V)$.

9.7.28 Exercise Assume again $V(\Omega) \in SC$. Show that the following claims are equivalent.
(a) $V$ is a global model for $Ax_\Omega$.
(b) $\psi_V(V(\Omega)^{SC}) = \psi(\omega_1^{SC})$ and $\psi_V(\xi^V) = \psi(\xi)$ holds true for all $\xi \in B_{\omega_1^{SC}}$.
(c) $\psi_V(V(\Omega)^{SC}) = \psi(\omega_1^{SC})$ and $\psi(\xi)^V = \psi(\xi)$ holds true for all $\xi \in B_{\omega_1^{SC}}$.
(d) $\psi_V(V(\Omega)^{SC}) = B^V_{V(\Omega)^{SC}} \cap V(\Omega)$.
(e) $\psi_V(V(\Omega)^{SC}) < V(\Omega)$.
(f) The function $\psi_V$ is continuous and $\psi_V(\xi + 1) = \psi_V(\xi)^{SC}$.

9.7.29 Exercise Assume again $V(\Omega) \in SC$. Show that the following claims are equivalent.
(a) $V$ is a good interpretation.
(b) $\psi_V(\xi^V) = \psi(\xi)$ holds true for all $\xi \in B_{\omega_1^{SC}}$.
(c) $\psi(\xi)^V = \psi(\xi)$ holds true for all $\xi \in B_{\omega_1^{SC}}$.
(d) $B^V_\xi \subseteq V(\Omega)^{SC}$ holds true for all ordinals $\xi$.
(e) $\psi_V(\xi) \le V(\Omega)^{SC}$ holds true for all ordinals $\xi$.
(f) $B^V_{\xi^V} \subseteq V(\Omega)^{SC}$ holds true for all $\xi \in B_{\omega_1^{SC}}$.
(g) $\psi_V(\xi^V) \le V(\Omega)^{SC}$ holds true for all $\xi \in B_{\omega_1^{SC}}$.
(h) $\psi(\xi)^V \le V(\Omega)^{SC}$ holds true for all $\xi \in B_{\omega_1^{SC}}$.

9.7.30 Exercise Disprove the following statements.
(a) If $\psi_V$ is continuous then $V$ is a global model for $Ax_\Omega$.
(b) If $\psi_V(\xi + 1) \le \psi_V(\xi)^{SC}$ holds true for all ordinals $\xi$ then $V$ is a global model for $Ax_\Omega$.

9.7.31 Exercise Let $V$ be the term interpretation. Show that $\psi_V(V(\Omega)^{SC}) = otyp(B_{\omega_1^{SC}}) = \psi(\omega_1^{SC})^{SC}$.
Chapter 10
Provably Recursive Functions of NT
Kreisel once asked whether it is possible to obtain upper bounds for the proof-theoretic ordinal of a theory whose language does not include free predicate variables without using Gödel’s second incompleteness theorem. In Sect. 9.3 we proved in Theorem 9.3.2 that cut elimination is close to trivial for semi-formal systems which derive only sentences. Subsequently we discussed that no ordinal information can be expected from cut-elimination in such semi-formal systems. The proof of Theorem 9.3.2 fails, however, in the presence of set variables. For the standard procedure of obtaining an ordinal analysis of NT it is therefore crucial to include free set variables in its language. This is probably the background of Kreisel’s question. On the other hand, we have seen in Chap. 9 that controlling operators may allow us to obtain information also from semi-formal systems without free set variables in their languages. Weiermann observed that this is also true for “predicative” semi-formal systems. He could prove that the methods of impredicative proof theory are also applicable in predicative proof theory and lead there to better results. In particular he succeeded in (re)characterizing the provably recursive functions of NT (cf. [106] and [10]). In the following sections we present a variant of one of Weiermann’s approaches.
10.1 Provably Recursive Functions of a Theory
Let T be a theory whose language extends L(NT). In T we can express all primitive recursive predicates by their characteristic functions. Again we write $T \vdash P(x)$ instead of $T \vdash \chi_P(x) = 1$.
To explain what the provably recursive functions of a theory T are, we recall some notions and results of elementary recursion theory. Partial recursive function terms are defined by extending the definition of the primitive recursive function terms in Sect. 2. We replace in Definition 2.1.1, the
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, c Springer-Verlag Berlin Heidelberg 2009
phrase “primitive recursive function term” by “partial recursive function term” and add the clause
• If $t$ is an $n+1$-ary partial recursive function term, then $\mu x_i.\,t$ is for $1 \le i \le n+1$ an $n$-ary partial recursive function term.
The inductive definition of $ev(f, z_1, \dots, z_n) = z$ has to be altered to an inductive definition of $ev(f, z_1, \dots, z_n) \simeq z$, which means that $f$ is now to be interpreted by a partial function, i.e., a function whose domain $dom(f)$ is a subset of $\mathbb{N}^n$ and not necessarily the whole space $\mathbb{N}^n$. A partial function $f\colon \mathbb{N}^n \longrightarrow_p \mathbb{N}$ may be extended to a total function $\tilde f\colon \mathbb{N}^n \longrightarrow \mathbb{N} \cup \{\uparrow\}$ defined by
\[ \tilde f(z_1, \dots, z_n) := \begin{cases} f(z_1, \dots, z_n) & \text{if } (z_1, \dots, z_n) \in dom(f) \\ \uparrow & \text{otherwise} \end{cases} \]
where $f(z_1, \dots, z_n) = \uparrow$ should be read as “$f(z_1, \dots, z_n)$ is undefined”. Then we obtain $f(z_1, \dots, z_n) \simeq g(z_1, \dots, z_n)$ iff $\tilde f(z_1, \dots, z_n) = \tilde g(z_1, \dots, z_n)$. We mostly identify $f$ and $\tilde f$. We always simply write $f(z_1, \dots, z_n) \simeq z$ instead of $ev(f, z_1, \dots, z_n) \simeq z$. The clauses of the inductive definition of $f(z_1, \dots, z_n) \simeq z$ are those of the inductive evaluation of primitive recursive function terms as given in Definition 2.1.2 with $=$ replaced by $\simeq$ and the additional clause
\[ (\mu x_i.\,t)(z_1, \dots, z_n) :\simeq \begin{cases} \min\{x \mid t(z_1, \dots, z_{i-1}, x, z_i, \dots, z_n) \simeq 0\} & \text{if this exists and } t(z_1, \dots, z_{i-1}, y, z_i, \dots, z_n) \text{ is defined for all } y < x \\ \uparrow & \text{otherwise.} \end{cases} \]
A function is partial recursive iff it is the interpretation of a partial recursive function term. The partial recursive function terms can be coded by natural numbers. The predicate “$e$ codes a partial recursive function term” is definable by course-of-values recursion and hence primitive recursive. By $T$ we denote Kleene’s T-predicate. The meaning of $T(e, x_1, \dots, x_n, y)$ is that $y$ codes the evaluation of the term number $e$ at the arguments $x_1, \dots, x_n$. We cite two important theorems of elementary recursion theory.
The first is Kleene’s normal-form theorem, the second is the Recursion Theorem.
10.1.1 Theorem (Kleene’s Normal-form Theorem) There are $n+2$-ary primitive recursive predicates $T^n$ and a regressive primitive recursive function $U$, i.e., a function satisfying $U(y) \le y$, such that for every $n$-ary partial recursive function $f$ there is an $e \in \omega$ such that $f(\vec x) \simeq U(\mu y.\,T^n(e, \vec x, y))$.
We call $e$ an index for the $n$-ary partial recursive function $f$. It is a common abbreviation to put
\[ \{e\}^n(\vec x) :\simeq U(\mu y.\,T^n(e, \vec x, y)). \]
This notation is known as Kleene-bracket. The connection between indices for functions of different arities is stated in the $S^m_n$-Theorem.
10.1.2 Theorem ($S^m_n$-Theorem) For each $m, n \in \mathbb{N}$ there is an $m+1$-ary primitive recursive function $S^m_n$ such that $\{e\}^{m+n}(\vec y, \vec x) \simeq \{S^m_n(e, \vec y)\}^n(\vec x)$.
An immediate consequence of the $S^m_n$-Theorem is the following Recursion Theorem.
10.1.3 Theorem (Recursion Theorem) Let $g$ be an $n+1$-ary partial recursive function. Then there is an index $e$ such that $\{e\}^n(\vec x) \simeq g(e, \vec x)$.
Proof In order to emphasize that the Recursion Theorem is an immediate consequence of the purely combinatorial $S^m_n$-Theorem we give its simple proof. Define $h(y, \vec x) :\simeq g(S^1_n(y, y), \vec x)$ and let $e_0$ be an index for $h$. For $e := S^1_n(e_0, e_0)$ we then compute
\[ \{e\}^n(\vec x) \simeq \{S^1_n(e_0, e_0)\}^n(\vec x) \simeq \{e_0\}^{n+1}(e_0, \vec x) \simeq h(e_0, \vec x) \simeq g(S^1_n(e_0, e_0), \vec x) \simeq g(e, \vec x) \]
and are done.
Since $\{e\}^n(\vec x)$ is partial recursive in the arguments $e$ and $\vec x$ it follows from the Recursion Theorem that “nearly everything we can write down” is partial recursive. As an example we get an index $e$ such that $\{e\}^1(x) \simeq \{e\}^1(x) + 1$. But this is not a contradiction. It only shows that the function $\lambda x.\,\{e\}^1(x)$ is nowhere defined. We showed this example to stress the fact that the important part in introducing a recursive function is to show that this function is total, i.e., to show $(\forall \vec x)(\exists y)T^n(e, \vec x, y)$. This is the background of the following definition.
10.1.4 Definition Let T be a theory which comprises the language of arithmetic (either directly or by interpretation). If $e$ is an index for a partial recursive function $f$ and T proves $(\forall \vec x)(\exists y)T^n(e, \vec x, y)$ then we say that $f$ is a provably recursive function of T.
The aim of the following sections is to characterize the provably recursive functions of the theory NT. The theory NT here serves as a paradigmatic example of how an adaptation of the methods of impredicative proof theory can be used to obtain better results also in predicative proof theory. This method can be extended also to stronger theories (cf. [10]).
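The minimization clause for partial recursive function terms from this section can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: `None` stands in for the undefined value ↑, the helper name `mu` is hypothetical, and the `fuel` cutoff is an artificial bound that is not part of the definition (a genuine μ-search may simply diverge).

```python
from typing import Callable, Optional

# None plays the role of the "undefined" value (the up-arrow in the text).
Partial = Callable[..., Optional[int]]


def mu(t: Partial, i: int, fuel: int = 10_000) -> Partial:
    """Return the partial function (mu x_i . t) for an (n+1)-ary partial t.

    `fuel` is an illustrative cutoff only: after `fuel` unsuccessful steps
    the search gives up and reports None.
    """
    def searched(*z: int) -> Optional[int]:
        for x in range(fuel):
            # Insert the search variable at position i (1-based).
            args = z[: i - 1] + (x,) + z[i - 1:]
            value = t(*args)
            if value is None:   # t is undefined before a zero was found
                return None
            if value == 0:      # least x with t(..., x, ...) = 0
                return x
        return None             # search budget exhausted, treated as undefined
    return searched


# Integer square root as a minimization: least x with (x + 1)^2 > z.
isqrt = mu(lambda x, z: 0 if (x + 1) ** 2 > z else 1, 1)

# A term whose minimization is nowhere defined (within the fuel bound).
never = mu(lambda x: 1, 1, fuel=100)
```

The second example mirrors the remark above that writing down a partial recursive function is easy, while showing totality is the substantial part.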
10.2 Operator Controlled Derivations
In this section, we transfer the concept of operator controlled derivations to the semi-formal system defined in Definition 7.3.5. This time we only regard sentences in the language L(NT). Every sentence belongs either to $\bigwedge$-type or to $\bigvee$-type. Therefore we do not need clause (Ax) in the definition of $\vdash^{\alpha}_{\rho} \Delta$ for a finite set $\Delta$ of L(NT)-sentences. We want to refine $\vdash^{\alpha}_{\rho} \Delta$ by controlling operators. In contrast to the situation in the ordinal analysis of $ID_1$, we do not have ordinal parameters in the sentences of L(NT). The parameters which are of interest here are the natural numbers occurring in the sentences. Number parameters do not influence the computation of the $\Pi^1_1$-ordinal of a theory. Therefore, we could neglect them in the computation of $\|ID_1\|_{\Pi^1_1}$. However, in the computation of what we will call the $\Pi^0_2$-ordinal of a theory, they play a crucial role. For a finite set $\Delta$ of L(NT)-sentences we define
\[ par(\Delta) := \{ t^{\mathbb{N}} \in \omega \mid \text{the closed term } t \text{ occurs in some of the sentences in } \Delta \} \cup \{ rnk(F) \mid F \in \Delta \}. \]
We count also parameters inside of open terms. As an example, the number 7 is among the parameters of the sentence $(\exists x)(\exists y)[7 + x = y]$. To obtain controlling operators let $f\colon \mathbb{N} \longrightarrow \mathbb{N}$ be a strictly increasing function. The function $f$ canonically induces an operator
\[ \mathsf{F}_f\colon Pow(\mathbb{N}) \longrightarrow Pow(\mathbb{N}), \qquad \mathsf{F}_f(X) := f[X] := \{ f(x) \mid x \in X \}. \]
Since we will only deal with finite sets $X \subseteq \mathbb{N}$ and are only interested in upper bounds it would be overkill to work with the induced operators. Therefore, we define for an increasing function $f\colon \mathbb{N} \longrightarrow \mathbb{N}$ and a finite set $X \subseteq \mathbb{N}$ directly
\[ f(X) := \max \mathsf{F}_f(X) = f(\max X). \tag{10.1} \]
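Equation (10.1) can be checked on a concrete finite set. The following Python sketch is illustrative only; the function names `image` and `apply_operator` and the sample function `f` are assumptions made for this example and do not occur in the text.

```python
def image(f, X):
    """The induced operator F_f(X) = f[X] = { f(x) : x in X }."""
    return {f(x) for x in X}


def apply_operator(f, X):
    """The simplified reading f(X) := f(max X) from (10.1).

    Assumes X is a finite, nonempty set of natural numbers.
    """
    return f(max(X))


# A sample strictly increasing function (chosen only for illustration).
f = lambda x: x * x + 2

X = {1, 4, 7}
# For strictly increasing f the maximum of the image is attained at max X.
largest_image_value = max(image(f, X))
direct_value = apply_operator(f, X)
```

Because `f` is strictly increasing, both computations agree, which is why the text can dispense with the set-valued operator.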
To emphasize the fact that we think of operators rather than strictly increasing functions we keep on using sans-serif capitals to denote strictly increasing functions and call them strictly increasing operators. We abbreviate $(\forall x \in X)(\exists y \in Y)[x \le y]$ by $X \le Y$. Strictly increasing “operators” $\mathsf{F}$ are monotone in the sense that we have $X \le Y \Rightarrow \mathsf{F}(X) \le \mathsf{F}(Y)$. As in Sect. 9.3 we use the notation $\mathsf{F}[X] := \lambda \Xi.\,\mathsf{F}(X \cup \Xi)$. The next aim is to modify Definition 9.3.8 to operators induced by increasing functions. This is not completely obvious because $\mathsf{F}_f(X)$ only contains natural numbers, i.e., finite ordinals, while we have to assign infinite ordinals. Here we need an observation which turns out to be crucial. In describing this observation, we restrict ourselves to ordinals below $\Gamma_0$. There we have a normal form
\[ \alpha =_{NF} \varphi_{\alpha_1}(\beta_1) + \dots + \varphi_{\alpha_n}(\beta_n) \]
as shown in Sect. 3.4.3. Recall the norm of $\alpha$ which has been defined by
\[ N(\alpha) := \begin{cases} 0 & \text{if } \alpha = 0 \\ \sum_{i \le n} (N(\alpha_i) + N(\beta_i) + 1) & \text{if } \alpha =_{NF} \varphi_{\alpha_1}(\beta_1) + \dots + \varphi_{\alpha_n}(\beta_n). \end{cases} \]
Observe that $N(\alpha) < \omega$ for all ordinals $\alpha$ and $N(n) = n$ for $n < \omega$. For a finite set $X$ of ordinals let $N(X) := \{N(\alpha) \mid \alpha \in X\}$. For an operator $\mathsf{F}$ and a finite set $X \subseteq On$ we define
\[ O_{\mathsf{F}}(X) := \{\alpha < \Gamma_0 \mid N(\alpha) \le \mathsf{F}(N(X))\}. \]
Since there are only finitely many ordinals of the same norm, the set $O_{\mathsf{F}}(X)$ is always finite. Instead of $O_{\mathsf{F}}(\{\alpha\})$ we mostly write $O_{\mathsf{F}}(\alpha)$. There is no confusion as $O_{\mathsf{F}}$ takes only finite sets as arguments.
10.2.1 Definition Let $\mathsf{F}$ be an operator. We define the relation $\mathsf{F} \vdash^{\alpha}_{\rho} \Delta$ by the clauses ($\bigwedge$), ($\bigvee$) and (cut) of Definition 7.3.5 with the additional conditions
• $\alpha \in O_{\mathsf{F}}(par(\Delta))$
and for an inference
\[ \mathsf{F} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota \text{ for } \iota \in I \;\Rightarrow\; \mathsf{F} \vdash^{\alpha}_{\rho} \Delta \]
with finite $I$ also
• $par(\Delta_\iota) \le \mathsf{F}(par(\Delta))$.
Of course the modified system remains sound, i.e., we still have
\[ \mathsf{F} \vdash^{\alpha}_{\rho} F_1, \dots, F_n \;\Rightarrow\; \mathbb{N} \models F_1 \vee \dots \vee F_n. \]
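As a small worked illustration of the norm $N$ and of the finiteness of the sets $O_{\mathsf{F}}(X)$ used in Definition 10.2.1, the following computation lists ordinals of small norm. It is a sketch derived only from the defining clause for $N$; the identifications $1 = \varphi_0(0)$, $\omega = \varphi_0(1)$ and $\varepsilon_0 = \varphi_1(0)$ are the standard ones and are assumed here.

```latex
\begin{align*}
N(1) &= N(\varphi_0(0)) = N(0) + N(0) + 1 = 1,\\
N(\omega) &= N(\varphi_0(1)) = N(0) + N(1) + 1 = 2,\\
N(\varepsilon_0) &= N(\varphi_1(0)) = N(1) + N(0) + 1 = 2,\\
N(\omega\cdot k + m) &= k\cdot N(\omega) + m\cdot N(1) = 2k + m .
\end{align*}
```

Under these assumptions the ordinals of norm 2 are exactly $2$, $\omega$ and $\varepsilon_0$, so a bound on the norm leaves only finitely many candidates even though the ordinals themselves may be large.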
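We define $\mathsf{F} \le \mathsf{G} :\Leftrightarrow (\forall X)[\mathsf{F}(X) \le \mathsf{G}(X)]$ and say that the operator $\mathsf{G}$ extends the operator $\mathsf{F}$. The following lemma corresponds to Lemma 9.3.9.
10.2.2 Lemma (Structural Lemma) Let $\mathsf{F}$ and $\mathsf{G}$ be strictly increasing operators which are not trivial and $\Delta$ and $\Gamma$ be finite sets of sentences. Assume $\mathsf{F}[par(\Delta)] \le \mathsf{G}[par(\Delta)]$, $\Delta \subseteq \Gamma$, $\rho \le \sigma$, $\alpha \le \beta \in O_{\mathsf{G}}(par(\Gamma))$ and $\mathsf{F} \vdash^{\alpha}_{\rho} \Delta$. Then we also obtain $\mathsf{G} \vdash^{\beta}_{\sigma} \Gamma$.
The proof is a direct induction on $\alpha$.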
The operators $\mathsf{F}$ are not Skolem-hull operators. The property $(\mathsf{F} \circ \mathsf{F})(X) = \mathsf{F}(X)$ which is characteristic for Skolem-hull operators does not hold. Therefore Lemma 9.3.10 has to be modified. This has a series of consequences.
10.2.3 Lemma If $X \le \mathsf{F}(par(\Delta))$ and $\mathsf{F}[X] \vdash^{\alpha}_{\rho} \Delta$ then $\mathsf{F} \circ \mathsf{F} \vdash^{\alpha}_{\rho} \Delta$.
Proof If $X \le \mathsf{F}(par(\Delta))$ then $\mathsf{F}[X](par(\Delta)) \le \mathsf{F}^2(par(\Delta))$.
Next, we prove an inversion lemma for sentences in $\bigwedge$-type.
10.2.4 Lemma (Inversion Lemma) Let $F \in \bigwedge$-type, $CS(F) \ne \emptyset$ and assume $\mathsf{H} \vdash^{\alpha}_{\rho} \Delta, F$. Then $\mathsf{H}[par(F)] \vdash^{\alpha}_{\rho} \Delta, G$ holds true for all $G \in CS(F)$.
Proof Induction on $\alpha$. The proof completely parallels the proof of Lemma 9.3.11.
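There is also a corresponding lemma to Lemma 9.3.13.
10.2.5 Lemma (Detachment Lemma) Assume $\mathsf{F} \vdash^{\alpha}_{\rho} \Delta, F$ and $\neg F \in Diag(\mathbb{N})$. Then we already get $\mathsf{F}[par(F)] \vdash^{\alpha}_{\rho} \Delta$.
Proof Induction on $\alpha$. From $\neg F \in Diag(\mathbb{N})$ we conclude that $F$ is atomic and $F \in \bigvee$-type. Since $CS(F) = \emptyset$, the sentence $F$ cannot be the critical formula of the last inference
\[ (J) \quad \mathsf{F} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, F \text{ for } \iota \in J \;\Rightarrow\; \mathsf{F} \vdash^{\alpha}_{\rho} \Delta, F. \]
By the induction hypothesis, we get
\[ \mathsf{F}[par(F)] \vdash^{\alpha_\iota}_{\rho} \Delta_\iota \text{ for } \iota \in J. \tag{i} \]
For finite $J$ we have $par(\Delta_\iota, F) \le \mathsf{F}(par(\Delta, F))$, hence $par(\Delta_\iota) \le \mathsf{F}[par(F)](par(\Delta))$ and, because of $\alpha \in O_{\mathsf{F}}(par(\Delta, F)) \subseteq O_{\mathsf{F}[par(F)]}(par(\Delta))$, we obtain $\mathsf{F}[par(F)] \vdash^{\alpha}_{\rho} \Delta$ by an inference (J).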
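10.2.6 Lemma (Reduction Lemma) Let $\mathsf{F}$ be a controlling operator such that $2 \cdot x \le \mathsf{F}(x)$. Assume $\mathsf{F} \vdash^{\alpha}_{\rho} \Delta, F$ and $\mathsf{F} \vdash^{\beta}_{\rho} \Gamma, \neg F$ for a sentence $F \in \bigwedge$-type such that $rnk(F) = \rho$ and $par(F) \le \mathsf{F}(par(\Delta))$. Then $\mathsf{F}^3 \vdash^{\alpha + \beta}_{\rho} \Delta, \Gamma$.
Proof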
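We induct on $\beta$. Let us first check that $\alpha + \beta \in O_{\mathsf{F}^3}(par(\Delta, \Gamma))$. We have
\[ N(\alpha + \beta) \le N(\alpha) + N(\beta) \le 2 \cdot \max\{N(\alpha), N(\beta)\} \le 2 \cdot \max\{\mathsf{F}(par(\Delta, F)), \mathsf{F}(par(\Gamma, F))\} \le \mathsf{F}^2(par(\Delta, \Gamma, F)). \tag{i} \]
From $par(F) \le \mathsf{F}(par(\Delta))$ we obtain $\mathsf{F}^2(par(\Delta, \Gamma, F)) \le \mathsf{F}^3(par(\Delta, \Gamma))$. Hence $\alpha + \beta \in O_{\mathsf{F}^3}(par(\Delta, \Gamma))$.
If $\rho = 0$ then $F \in Diag(\mathbb{N})$ and we obtain $\mathsf{F}^3 \vdash^{\alpha + \beta}_{0} \Delta, \Gamma$ from the hypothesis $\mathsf{F} \vdash^{\beta}_{0} \Gamma, \neg F$ by the Detachment Lemma together with $par(F) \le \mathsf{F}(par(\Delta))$, Lemma 10.2.2 and Lemma 10.2.3. Thus suppose $\rho > 0$, which entails $CS(F) \ne \emptyset$. First assume that $\neg F$ is not the critical formula of the last inference
\[ (J) \quad \mathsf{F} \vdash^{\beta_\iota}_{\rho} \Gamma_\iota, \neg F \text{ for } \iota \in I \;\Rightarrow\; \mathsf{F} \vdash^{\beta}_{\rho} \Gamma, \neg F. \]
If (J) is an inference according to ($\bigwedge$) with empty premises, then we have $\mathsf{F}^3 \vdash^{\alpha + \beta}_{0} \Delta, \Gamma$.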
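Therefore assume that $I \ne \emptyset$. We have $par(F) \le \mathsf{F}(par(\Delta))$ and obtain $\mathsf{F}^3 \vdash^{\alpha + \beta_\iota}_{\rho} \Delta, \Gamma_\iota$ by the induction hypothesis. Since $\alpha + \beta_\iota < \alpha + \beta$, and in the case of finite $I$ also $par(\Delta, \Gamma_\iota) \subseteq par(\Delta, \Gamma_\iota, F) \le \mathsf{F}^3(par(\Delta, \Gamma))$, we obtain $\mathsf{F}^3 \vdash^{\alpha + \beta}_{\rho} \Delta, \Gamma$ by an inference (J).
Now assume that $\neg F$ is the critical formula of the last inference in $\mathsf{F} \vdash^{\beta}_{\rho} \Gamma, \neg F$. Then we have the premise $\mathsf{F} \vdash^{\beta_0}_{\rho} \Gamma, \neg F, \neg G$ for some $G \in CS(F)$ with
\[ par(\Gamma, F, G) \le \mathsf{F}(par(\Gamma, F)) \tag{ii} \]
and obtain
\[ \mathsf{F}^3 \vdash^{\alpha + \beta_0}_{\rho} \Delta, \Gamma, \neg G \tag{iii} \]
by the induction hypothesis. By inversion and Lemma 10.2.3 we obtain from the hypotheses $\mathsf{F} \vdash^{\alpha}_{\rho} \Delta, F$ and $par(F) \le \mathsf{F}(par(\Delta))$
\[ \mathsf{F} \circ \mathsf{F} \vdash^{\alpha}_{\rho} \Delta, G. \tag{iv} \]
From (iv) we get by the Structural Lemma $\mathsf{F}^3 \vdash^{\alpha + \beta_0}_{\rho} \Delta, \Gamma, G$. We have $\alpha + \beta_0 < \alpha + \beta$ and $rnk(G) < \rho$. To apply a cut we still have to check
\[ par(\Delta, \Gamma, G) \le \mathsf{F}^3(par(\Delta, \Gamma)). \tag{v} \]
But this is secured by (ii) and the hypothesis $par(F) \le \mathsf{F}(par(\Delta))$ which implies $par(\Delta, \Gamma, G) \le \mathsf{F}(par(\Delta, \Gamma, F)) \le \mathsf{F}^3(par(\Delta, \Gamma))$.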
10.3 Iterating Operators
The remarkable and, in comparison to Sect. 9.3, new fact is that we have to iterate the controlling operator in the Reduction Lemma. The crucial point is how to define transfinite iterations of operators. Here we will follow quite closely the pattern we used in defining the iterated Skolem-hull operators $\mathcal{H}_\alpha$.
10.3.1 Definition Let $\mathsf{F}$ be an operator. We define
\[ \psi_{\mathsf{F}}(\alpha) := \min\{\eta \mid (\forall \xi \in O_{\mathsf{F}}(\alpha) \cap \alpha)[\psi_{\mathsf{F}}(\xi) < \eta]\} = \sup(\{\psi_{\mathsf{F}}(\beta) + 1 \mid \beta \in O_{\mathsf{F}}(\alpha) \cap \alpha\} \cup \{0\}). \]
Since $O_{\mathsf{F}}(\alpha) \cap \alpha$ is finite, the supremum in Definition 10.3.1 is in fact a maximum and the following theorem is immediate.
10.3.2 Theorem For all ordinals $\alpha < \Gamma_0$ and all strictly increasing operators $\mathsf{F}$ we get $\psi_{\mathsf{F}}(\alpha) < \omega$.
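The finite ordinal $\psi_{\mathsf{F}}(\alpha)$ is the first ordinal majorizing all the “ordinals” which are reachable by $\psi_{\mathsf{F}}$ from $O_{\mathsf{F}}(\alpha) \cap \alpha$. Observe the strong analogy to the definition of $\psi_{\mathcal{H}}(\alpha)$. To simplify notation we introduce the abbreviation
\[ \alpha \ll_{\mathsf{F}} \beta :\Leftrightarrow \alpha \in O_{\mathsf{F}}(\beta) \cap \beta \Leftrightarrow \alpha < \beta \wedge N(\alpha) \le \mathsf{F}(N(\beta)). \]
Then we obtain
\[ \alpha \ll_{\mathsf{F}} \beta \;\Rightarrow\; \psi_{\mathsf{F}}(\alpha) < \psi_{\mathsf{F}}(\beta). \tag{10.2} \]
By induction on $n < \omega$ we obtain $\psi_{\mathsf{F}}(n) = n$. This is obvious for $n = 0$ and in the successor case we have
\[ \psi_{\mathsf{F}}(n + 1) = \max\{\psi_{\mathsf{F}}(m) + 1 \mid m \le n \wedge N(m) \le \mathsf{F}(n + 1)\} = n + 1. \]
Hence
\[ \psi_{\mathsf{F}} \restriction \omega = Id_\omega. \tag{10.3} \]
The essential property of the functions $\psi_{\mathsf{F}}$ is stated in the following lemma. We call an operator $\mathsf{F}$ strongly increasing if it satisfies $\mathsf{F}(x) + 1 < \mathsf{F}(x + 1)$. For strongly increasing operators we obtain
\[ n \ne 0 \;\Rightarrow\; n + \mathsf{F}(x) < \mathsf{F}(n + x) \tag{10.4} \]
by an easy induction on $n$.
10.3.3 Lemma Let $\mathsf{F}$ be a strongly increasing operator. Then
\[ \psi_{\mathsf{F}}(\alpha + \psi_{\mathsf{F}}(\beta)) \le \psi_{\mathsf{F}}(\alpha \mathbin{\#} \beta). \tag{10.5} \]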
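Proof We prove the lemma by induction on $\beta$. For $\alpha = 0$ or $\beta = 0$ this follows from $\psi_{\mathsf{F}} \restriction \omega = Id_\omega$. So assume $\alpha \ne 0$ and $\beta \ne 0$. Then $\psi_{\mathsf{F}}(\beta) = \psi_{\mathsf{F}}(\gamma) + 1$ for some $\gamma \in O_{\mathsf{F}}(\beta) \cap \beta$. Hence
\[ \psi_{\mathsf{F}}(\alpha + \psi_{\mathsf{F}}(\beta)) = \psi_{\mathsf{F}}(\alpha + \psi_{\mathsf{F}}(\gamma) + 1) \le \psi_{\mathsf{F}}(\alpha \mathbin{\#} \gamma + 1). \]
If $\alpha \mathbin{\#} \beta = \alpha \mathbin{\#} \gamma + 1$ then we are done. So assume $\alpha \mathbin{\#} \gamma + 1 < \alpha \mathbin{\#} \beta$. But $N(\gamma) \le \mathsf{F}(N(\beta))$ entails by (10.4) $N(\alpha \mathbin{\#} \gamma + 1) = N(\alpha) + N(\gamma) + 1 \le N(\alpha) + 1 + \mathsf{F}(N(\beta)) \le \mathsf{F}(N(\alpha) + N(\beta))$. Hence $\alpha \mathbin{\#} \gamma + 1 \in O_{\mathsf{F}}(\alpha \mathbin{\#} \beta) \cap (\alpha \mathbin{\#} \beta)$ and therefore $\psi_{\mathsf{F}}(\alpha \mathbin{\#} \gamma + 1) \le \psi_{\mathsf{F}}(\alpha \mathbin{\#} \beta)$.
As a first consequence of Lemma 10.3.3 we obtain
\[ \mathsf{F}^n(x) \le \psi_{\mathsf{F}}(\omega \cdot n + x), \tag{10.6} \]
where $\mathsf{F}^n$ denotes the $n$-th iteration of the strongly increasing operator $\mathsf{F}$. We prove (10.6) by induction on $n$. First, we observe that $\mathsf{F}(x) \ll_{\mathsf{F}} \omega + x$, which implies $\mathsf{F}(x) = \psi_{\mathsf{F}}(\mathsf{F}(x)) < \psi_{\mathsf{F}}(\omega + x)$. Hence $\omega \cdot n + \mathsf{F}(x) \in O_{\mathsf{F}}(\omega \cdot n + \psi_{\mathsf{F}}(\omega + x)) \cap (\omega \cdot n + \psi_{\mathsf{F}}(\omega + x))$, by which we obtain
\[ \mathsf{F}^{n+1}(x) = \mathsf{F}^n(\mathsf{F}(x)) \le \psi_{\mathsf{F}}(\omega \cdot n + \mathsf{F}(x)) \le \psi_{\mathsf{F}}(\omega \cdot n + \psi_{\mathsf{F}}(\omega + x)) \le \psi_{\mathsf{F}}(\omega \cdot (n + 1) + x). \]
Conversely we obtain
\[ \psi_{\mathsf{F}}(\omega \cdot n + x) \le \mathsf{F}^n(2 \cdot n + x) + n \tag{10.7} \]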
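which is shown by main induction on $n$ and side induction on $x$. The case $n = 0$ follows trivially from $\psi_{\mathsf{F}} \restriction \omega = Id_\omega$. So assume that the claim is correct for $n$. Let
\[ \eta_1 := \omega \cdot n + \mathsf{F}(2n + x) \tag{i} \]
and
\[ \eta_0 := \omega \cdot (n + 1) + x - 1 \tag{ii} \]
if $x \ne 0$. We claim
\[ \psi_{\mathsf{F}}(\omega \cdot (n + 1) + x) = \max\{\psi_{\mathsf{F}}(\eta_0), \psi_{\mathsf{F}}(\eta_1)\} + 1. \tag{iii} \]
Let $\alpha := \omega \cdot (n + 1) + x$. First we show that $\eta_i \in O_{\mathsf{F}}(\alpha) \cap \alpha$ holds true for $i = 0, 1$. We clearly have $\eta_i < \alpha$. We get $N(\eta_0) = 2n + x + 1 \le \mathsf{F}(2(n + 1) + x) = \mathsf{F}(N(\alpha))$ and $N(\eta_1) = 2n + \mathsf{F}(2n + x) \le \mathsf{F}(2(n + 1) + x)$ since $\mathsf{F}$ is strongly increasing. For $\omega \cdot (n + 1) \le \xi \in O_{\mathsf{F}}(\alpha) \cap \eta_0$ we get $N(\xi) \le 2 \cdot (n + 1) + x - 1 \le \mathsf{F}(2 \cdot (n + 1) + x - 1) = \mathsf{F}(N(\eta_0))$. Hence $\xi \in O_{\mathsf{F}}(\eta_0) \cap \eta_0$ and thus $\psi_{\mathsf{F}}(\xi) < \psi_{\mathsf{F}}(\eta_0)$. For $\xi \in O_{\mathsf{F}}(\alpha) \cap \omega \cdot (n + 1)$ we get $N(\xi) \le \mathsf{F}(N(\alpha)) = \mathsf{F}(2(n + 1) + x) \le \mathsf{F}(2n + \mathsf{F}(2n + x)) = \mathsf{F}(N(\eta_1))$ and $\xi = \omega \cdot k + m$. For $k < n$ we obviously have $\xi < \eta_1$ and for $k = n$ we get $m \le \mathsf{F}(2n + x)$, otherwise $N(\xi) \le \mathsf{F}(N(\alpha))$ would fail. Hence $\xi = \eta_1$ or $\xi \ll_{\mathsf{F}} \eta_1$ and thus $\psi_{\mathsf{F}}(\xi) \le \psi_{\mathsf{F}}(\eta_1)$.
If $\max\{\psi_{\mathsf{F}}(\eta_0), \psi_{\mathsf{F}}(\eta_1)\} = \psi_{\mathsf{F}}(\eta_0)$ we get
\[ \psi_{\mathsf{F}}(\omega \cdot (n + 1) + x) = \psi_{\mathsf{F}}(\eta_0) + 1 = \psi_{\mathsf{F}}(\omega \cdot (n + 1) + x - 1) + 1 \le \mathsf{F}^{n+1}(2 \cdot (n + 1) + x - 1) + n + 2 \le \mathsf{F}^{n+1}(2(n + 1) + x) + n + 1 \]
by the induction hypothesis for $x$. If $\max\{\psi_{\mathsf{F}}(\eta_0), \psi_{\mathsf{F}}(\eta_1)\} = \psi_{\mathsf{F}}(\eta_1)$ we get
\[ \psi_{\mathsf{F}}(\omega \cdot (n + 1) + x) = \psi_{\mathsf{F}}(\eta_1) + 1 = \psi_{\mathsf{F}}(\omega \cdot n + \mathsf{F}(2n + x)) + 1 \le \mathsf{F}^{n}(2n + \mathsf{F}(2n + x)) + n + 1 \le \mathsf{F}^{n}(\mathsf{F}(2(n + 1) + x)) + n + 1 = \mathsf{F}^{n+1}(2(n + 1) + x) + n + 1 \]
by the induction hypothesis for $n$. This finishes the proof of (10.7). Equations (10.6) and (10.7) show that the functions $\psi_{\mathsf{F}}$ are closely connected to iterations of the “operator” $\mathsf{F}$.
10.3.4 Definition In view of (10.6) and (10.7) it makes sense to define
\[ \mathsf{F}_\alpha := \lambda x.\,\psi_{\mathsf{F}}(\alpha + x) \tag{10.8} \]
and to call $\mathsf{F}_\alpha$ the $\alpha$-th “iteration” of $\mathsf{F}$.
However, Definition 10.3.4 is not completely stringent. So we get, for instance, $\mathsf{F}_n(x) = n + x$ for finite $n$, which has nothing to do with iterations of the operator $\mathsf{F}$. Defining $\mathsf{F}_\alpha(x) := \psi_{\mathsf{F}}(\omega \cdot \alpha + x)$ would have been more in the spirit of an iteration. This would have guaranteed $\mathsf{F}_0 = Id$, $\mathsf{F}_1 \approx \mathsf{F}$ and $\mathsf{F}_n \approx \mathsf{F}^n$. However, in computing upper bounds for the provably recursive functions of NT we will very quickly reach ordinals beyond $\omega^\omega$. But $\omega \cdot \alpha = \alpha$ holds for all additively indecomposable ordinals $\alpha \ge \omega^\omega$. Therefore we have opted for the technically much simpler definition given in (10.8). The next lemma is a corollary to (10.5).
10.3.5 Lemma Let $\mathsf{F}$ be a strongly increasing operator. Then $\mathsf{F}_\alpha \circ \mathsf{F}_\beta \le \mathsf{F}_{\alpha \mathbin{\#} \beta}$.
Proof We have $\mathsf{F}_\alpha(\mathsf{F}_\beta(x)) = \psi_{\mathsf{F}}(\alpha + \psi_{\mathsf{F}}(\beta + x)) \le \psi_{\mathsf{F}}(\alpha \mathbin{\#} \beta + x) = \mathsf{F}_{\alpha \mathbin{\#} \beta}(x)$.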
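For ordinals below $\omega^2$ the function $\psi_{\mathsf{F}}$ can be computed directly from Definition 10.3.1, which makes (10.3), (10.6), (10.7) and Lemma 10.3.5 checkable on small inputs. The following Python sketch is an illustration under assumptions: the sample operator `F(x) = 2x + 1`, the encoding of $\omega \cdot k + m$ as the pair `(k, m)`, and all function names are chosen for this example only; the norm uses $N(\omega \cdot k + m) = 2k + m$.

```python
from functools import lru_cache


def F(x: int) -> int:
    """Sample strongly increasing operator (an assumption of this sketch)."""
    return 2 * x + 1


def norm(alpha):
    """N(omega*k + m) = 2k + m, since N(omega) = 2 and N(1) = 1."""
    k, m = alpha
    return 2 * k + m


def predecessors(alpha):
    """All beta < alpha with N(beta) <= F(N(alpha)), i.e. O_F(alpha) ∩ alpha.

    Pairs are compared lexicographically, which is the ordinal ordering
    of omega*k + m; every ordinal below alpha is again below omega^2.
    """
    k, m = alpha
    bound = F(norm(alpha))
    return [(j, i)
            for j in range(k + 1)
            for i in range(bound - 2 * j + 1)
            if (j, i) < (k, m)]


@lru_cache(maxsize=None)
def psi(alpha):
    """psi_F(alpha) = sup({psi_F(beta) + 1 : beta in O_F(alpha) ∩ alpha} ∪ {0})."""
    return max((psi(beta) + 1 for beta in predecessors(alpha)), default=0)


def iterate(n: int, x: int) -> int:
    """The n-fold iteration F^n(x)."""
    for _ in range(n):
        x = F(x)
    return x


def F_sub(alpha, x: int) -> int:
    """F_alpha(x) := psi_F(alpha + x) as in (10.8), for alpha = omega*k + m."""
    k, m = alpha
    return psi((k, m + x))
```

For this particular operator the upper bound (10.7) turns out to be attained for the small values of `n` and `x` that were tried, while the lower bound (10.6) is not.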
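We will be forced to iterate iterations of operators, i.e., to regard iterations $(\mathsf{F}_\alpha)_\beta$. To find upper bounds for $(\mathsf{F}_\alpha)_\beta$ we have to compute $\psi_{\mathsf{F}_\alpha}(\beta)$. This computation needs the symmetric product of ordinals.
10.3.6 Definition We define the symmetric product $\alpha \times \beta$ in two steps. We start with additively indecomposable ordinals for which we put
\[ \omega^\alpha \times \omega^\beta := \omega^{\alpha \mathbin{\#} \beta} \]
where $\#$ denotes the symmetric sum (cf. Exercise 3.3.14). For ordinals $\alpha =_{NF} \alpha_1 + \dots + \alpha_m$ and $\beta =_{NF} \beta_1 + \dots + \beta_n$ in Cantor normal-form, we put
\[ \alpha \times \beta := \sum^{\#}_{1 \le i \le m} \Bigl( \sum^{\#}_{1 \le j \le n} \alpha_i \times \beta_j \Bigr) \]
and define
\[ \alpha \times 0 = 0 \times \alpha := 0. \]
First we have to show that the symmetric product behaves like a commutative product.
10.3.7 Lemma For all ordinals $\alpha$, $\beta$ and $\gamma$ we have
• $\alpha \times \beta = \beta \times \alpha$,
• $\alpha \times 1 = \alpha$,
• $\alpha \times (\beta \mathbin{\#} \gamma) = (\alpha \times \beta) \mathbin{\#} (\alpha \times \gamma)$,
• $\alpha \times \xi < \alpha \times \eta$ for all ordinals $\xi < \eta$ and
• $\alpha \times \lambda = \sup_{\eta < \lambda} (\alpha \times \eta)$ for limit ordinals $\lambda$.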
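Proof The first claim follows directly from the definition of the symmetric sum and Exercise 3.3.14. For the second claim, we first observe $\omega^\xi \times 1 = \omega^\xi \times \omega^0 = \omega^\xi$ and obtain $\alpha \times 1 = \sum^{\#}_{1 \le i \le n} \alpha_i \times 1 = \alpha$ for $\alpha =_{NF} \sum_{i=1}^{n} \alpha_i$. For the third claim, we assume $\alpha =_{NF} \sum_{i=1}^{l} \alpha_i$, $\beta =_{NF} \sum_{j=1}^{m} \beta_j$ and $\gamma =_{NF} \sum_{k=1}^{n} \gamma_k$. Then we obtain
\begin{align*}
\alpha \times (\beta \mathbin{\#} \gamma) &= \sum^{\#}_{1 \le i \le l} \Bigl( \sum^{\#}_{1 \le j \le m} (\alpha_i \times \beta_j) \mathbin{\#} \sum^{\#}_{1 \le k \le n} (\alpha_i \times \gamma_k) \Bigr) \\
&= \sum^{\#}_{1 \le i \le l} \sum^{\#}_{1 \le j \le m} (\alpha_i \times \beta_j) \mathbin{\#} \sum^{\#}_{1 \le i \le l} \sum^{\#}_{1 \le k \le n} (\alpha_i \times \gamma_k) = (\alpha \times \beta) \mathbin{\#} (\alpha \times \gamma).
\end{align*}
For the fourth claim let $\alpha = \sum^{\#}_{1 \le i \le l} \omega^{\alpha_i}$, $\xi =_{NF} \sum_{j=1}^{m} \omega^{\xi_j}$ and $\eta =_{NF} \sum_{k=1}^{n} \omega^{\eta_k}$. Since $\xi < \eta$ we either have $\xi_j = \eta_j$ for $j = 1, \dots, m$ and $m < n$ or there is an $r < n$ such that $\xi_j = \eta_j$ for $1 \le j \le r$ and $\xi_j < \eta_{r+1}$ for $r + 1 \le j \le m$. Then we obtain
\[ \alpha \times \xi = \sum^{\#}_{1 \le i \le l} \sum^{\#}_{1 \le j \le m} \omega^{\alpha_i \mathbin{\#} \xi_j} < \sum^{\#}_{1 \le i \le l} \sum^{\#}_{1 \le k \le n} \omega^{\alpha_i \mathbin{\#} \eta_k} = \alpha \times \eta \]
because either $\alpha_i \mathbin{\#} \xi_j = \alpha_i \mathbin{\#} \eta_j$ for $j = 1, \dots, m$ and $m < n$ or $\alpha_i \mathbin{\#} \xi_j = \alpha_i \mathbin{\#} \eta_j$ for $1 \le j \le r$ and $\alpha_i \mathbin{\#} \xi_j < \alpha_i \mathbin{\#} \eta_{r+1}$ for $r + 1 \le j \le m$. For the last claim we have
\[ \sup_{\eta < \lambda} (\alpha \times \eta) \le \alpha \times \lambda \tag{i} \]
by the previous claim. We first prove
\[ \omega^\alpha \times \omega^{\xi + 1} = \sup_{\eta < \omega^{\xi + 1}} (\omega^\alpha \times \eta). \tag{ii} \]
If $\mu < \omega^\alpha \times \omega^{\xi + 1} = \omega^{\alpha \mathbin{\#} (\xi + 1)}$ there is an $n < \omega$ such that
\[ \mu < \sum^{\#}_{1 \le i \le n} \omega^{\alpha \mathbin{\#} \xi} = \sum^{\#}_{1 \le i \le n} (\omega^\alpha \times \omega^\xi) = \omega^\alpha \times \sum^{\#}_{1 \le i \le n} \omega^\xi < \sup_{\eta < \omega^{\xi + 1}} (\omega^\alpha \times \eta) \]
because $\sum^{\#}_{1 \le i \le n} \omega^\xi < \sum^{\#}_{1 \le i \le n + 1} \omega^\xi < \omega^{\xi + 1}$. Hence $\omega^\alpha \times \omega^{\xi + 1} \le \sup_{\eta < \omega^{\xi + 1}} (\omega^\alpha \times \eta)$. The converse inequality holds by (i). Next we prove
\[ \omega^\alpha \times \omega^\lambda = \sup_{\eta < \omega^\lambda} (\omega^\alpha \times \eta) \tag{iii} \]
for limit ordinals $\lambda$. If $\mu < \omega^\alpha \times \omega^\lambda = \omega^{\alpha \mathbin{\#} \lambda}$ there is a $\zeta < \lambda$ such that $\mu < \omega^{\alpha \mathbin{\#} \zeta} = \omega^\alpha \times \omega^\zeta$. Since $\omega^\zeta < \omega^\lambda$ this implies $\omega^\alpha \times \omega^\lambda \le \sup_{\eta < \omega^\lambda} (\omega^\alpha \times \eta)$. The opposite inequality is again (i). But (ii) and (iii) easily generalize to
\[ \alpha \times \omega^\xi = \sup_{\eta < \omega^\xi} (\alpha \times \eta) \tag{iv} \]
for arbitrary ordinals $\alpha$ and $\xi \ne 0$. If $\lambda \in Lim$ we have $\lambda =_{NF} \sum_{j=1}^{n-1} \lambda_j + \lambda_n$ with $\lambda_n = \omega^\mu$ and $\mu \ne 0$. Then we obtain by (iv)
\begin{align*}
\alpha \times \lambda = \alpha \times \sum^{\#}_{1 \le i \le n} \lambda_i &= \sum^{\#}_{1 \le i \le n-1} (\alpha \times \lambda_i) \mathbin{\#} (\alpha \times \lambda_n) = \sum^{\#}_{1 \le i \le n-1} (\alpha \times \lambda_i) \mathbin{\#} \sup_{\xi < \lambda_n} (\alpha \times \xi) \\
&= \sup_{\xi < \lambda_n} \Bigl( \sum^{\#}_{1 \le i \le n-1} (\alpha \times \lambda_i) \mathbin{\#} (\alpha \times \xi) \Bigr) = \sup_{\xi < \lambda_n} \Bigl( \alpha \times \Bigl( \sum^{\#}_{1 \le i \le n-1} \lambda_i \mathbin{\#} \xi \Bigr) \Bigr) \le \sup_{\eta < \lambda} (\alpha \times \eta).
\end{align*}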
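The symmetric product can be made concrete for ordinals below $\omega^\omega$, where all Cantor normal form exponents are natural numbers and the symmetric sum of exponents is ordinary addition. The Python sketch below is an illustration under assumptions: the tuple encoding, the function names and the sample ordinals are chosen for this example, and `norm` uses $N(\omega^e) = e + 1$ for finite $e$, as given by the definition of the norm in Sect. 10.2.

```python
from itertools import product

# An ordinal below omega^omega is stored as the tuple of its Cantor normal
# form exponents in non-increasing order, e.g.
#   omega^2 + omega*2 + 1  ->  (2, 1, 1, 0).
# Python's lexicographic tuple ordering then agrees with the ordinal ordering.

ZERO, ONE, OMEGA = (), (0,), (1,)


def sym_sum(a, b):
    """Symmetric (natural) sum: merge the exponent multisets."""
    return tuple(sorted(a + b, reverse=True))


def sym_prod(a, b):
    """Symmetric product: omega^x times omega^y is omega^(x + y) for
    finite exponents, extended to sums via the symmetric sum."""
    return tuple(sorted((x + y for x, y in product(a, b)), reverse=True))


def norm(a):
    """N(alpha) as the sum of N(omega^e) = e + 1 over the summands."""
    return sum(e + 1 for e in a)


a = (1, 0)       # omega + 1
b = (2, 0, 0)    # omega^2 + 2
c = (1, 1)       # omega * 2

square = sym_prod(a, a)   # (omega + 1) x (omega + 1)
```

Here `square` encodes $\omega^2 + \omega \cdot 2 + 1$, in contrast to the ordinary ordinal product $(\omega + 1) \cdot (\omega + 1) = \omega^2 + \omega + 1$.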
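We need an upper bound for the norm of the symmetric product and want to show
\[ N(\alpha) + N(\beta) < N(\alpha \times \beta) \le N(\alpha) \cdot N(\beta). \tag{10.9} \]
For additively indecomposable ordinals $\alpha$ and $\beta$, we immediately get $N(\alpha \times \beta) = N(\alpha) + N(\beta) + 1$. If $N(\alpha) \le 2$ or $N(\beta) \le 2$, we check $N(\alpha \times \beta) \le N(\alpha) \cdot N(\beta)$ by looking at all possible cases. For $N(\alpha)$ and $N(\beta) > 2$, we obtain $N(\alpha) + N(\beta) + 1 \le N(\alpha) \cdot N(\beta)$ by induction on $N(\beta)$. Hence $N(\alpha \times \beta) \le N(\alpha) \cdot N(\beta)$ for additively indecomposable ordinals. Therefore, we obtain for $\alpha =_{NF} \alpha_1 + \dots + \alpha_m$ and $\beta =_{NF} \beta_1 + \dots + \beta_n$
\[ N(\alpha \times \beta) = \sum_{i=1}^{m} \Bigl( \sum_{j=1}^{n} N(\alpha_i \times \beta_j) \Bigr) \le \sum_{i=1}^{m} \sum_{j=1}^{n} (N(\alpha_i) \cdot N(\beta_j)) = N(\alpha) \cdot N(\beta). \]
10.3.8 Lemma Let $\mathsf{F}$ be a strongly increasing operator satisfying also $x^2 \le \mathsf{F}(x)$. Then
\[ \psi_{\mathsf{F}_\alpha}(\beta) \le \psi_{\mathsf{F}}(\alpha \times (\beta + 1) + N(\beta)). \tag{10.10} \]
Proof We prove the lemma by induction on $\beta$. We have
\[ \psi_{\mathsf{F}_\alpha}(\beta) = \psi_{\mathsf{F}_\alpha}(\eta) + 1 \tag{i} \]
for some $\eta < \beta$ such that $N(\eta) \le \mathsf{F}_\alpha(N(\beta)) = \psi_{\mathsf{F}}(\alpha + N(\beta))$. From (i) we obtain by the induction hypothesis
\[ \psi_{\mathsf{F}_\alpha}(\beta) \le \psi_{\mathsf{F}}(\alpha \times (\eta + 1) + N(\eta)) + 1. \tag{ii} \]
If $\beta = \eta + 1$ then (ii) directly implies $\psi_{\mathsf{F}_\alpha}(\beta) \le \psi_{\mathsf{F}}(\alpha \times (\beta + 1) + N(\beta))$ since clearly $\alpha \times \beta + N(\beta) \ll_{\mathsf{F}} \alpha \times (\beta + 1) + N(\beta)$. Otherwise we have
\[ \alpha \times (\eta + 1) + N(\eta) \ll_{\mathsf{F}} \alpha \times \beta + \psi_{\mathsf{F}}(\alpha + N(\beta)) \tag{iii} \]
since $\alpha \times (\eta + 1) + N(\eta) < \alpha \times \beta + \psi_{\mathsf{F}}(\alpha + N(\beta))$ and $N(\alpha \times (\eta + 1)) + N(\eta) \le N(\alpha) \cdot N(\eta) + N(\alpha) + N(\eta) \le (N(\alpha) + N(\eta))^2 \le \mathsf{F}(N(\alpha) + N(\eta)) \le \mathsf{F}(N(\alpha \times \beta) + \psi_{\mathsf{F}}(\alpha + N(\beta)))$.
But (iii) and (ii) imply
\begin{align*}
\psi_{\mathsf{F}_\alpha}(\beta) &\le \psi_{\mathsf{F}}(\alpha \times (\eta + 1) + N(\eta)) + 1 \le \psi_{\mathsf{F}}(\alpha \times \beta + \psi_{\mathsf{F}}(\alpha + N(\beta))) \\
&\le \psi_{\mathsf{F}}(\alpha \times \beta \mathbin{\#} \alpha + N(\beta)) = \psi_{\mathsf{F}}(\alpha \times (\beta + 1) + N(\beta)). \tag{iv}
\end{align*}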
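As a consequence of (10.10) we obtain an upper bound for $(\mathsf{F}_\alpha)_\beta$.
10.3.9 Lemma Let $\mathsf{F}$ be a strongly increasing operator satisfying $x^2 \le \mathsf{F}(x)$ and $\alpha$ be an infinite ordinal. Then
\[ (\mathsf{F}_\alpha)_\beta(x) \le \mathsf{F}_{\alpha \times (\beta \mathbin{\#} \omega)}(x). \tag{10.11} \]
Proof By (10.10) we have
\[ (\mathsf{F}_\alpha)_\beta(x) = \psi_{\mathsf{F}_\alpha}(\beta + x) \le \psi_{\mathsf{F}}(\alpha \times (\beta + x + 1) + N(\beta) + x). \tag{i} \]
We are done if we can show
\[ \alpha \times (\beta + x + 1) + N(\beta) + x \ll_{\mathsf{F}} \alpha \times (\beta \mathbin{\#} \omega) + x \tag{ii} \]
since then $(\mathsf{F}_\alpha)_\beta(x) \le \psi_{\mathsf{F}}(\alpha \times (\beta \mathbin{\#} \omega) + x) = \mathsf{F}_{\alpha \times (\beta \mathbin{\#} \omega)}(x)$. We clearly have
\[ \alpha \times (\beta + x + 1) + N(\beta) + x < \alpha \times (\beta \mathbin{\#} \omega) + x. \]
Moreover, we get $N(\alpha \times (\beta + x + 1) + N(\beta) + x) \le N(\alpha) \cdot (N(\beta) + x + 1) + N(\beta) + x \le (N(\alpha) + N(\beta) + x)^2 \le \mathsf{F}(N(\alpha) + N(\beta) + x) \le \mathsf{F}(N(\alpha \times (\beta \mathbin{\#} \omega) + x))$.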
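The following theorem is a first application of Lemma 10.3.9.
10.3.10 Theorem (Witnessing Theorem) Let $G$ be a $\exists^0_1$-sentence of L(NT), i.e., a sentence of the form $(\exists x)F_u(x)$ such that $F$ is quantifier free, and $\mathsf{F}$ a strongly increasing operator satisfying $x^2 \le \mathsf{F}(x)$. If $\mathsf{F} \vdash^{\alpha}_{0} (\exists x)F_u(x)$ then there is an $n < \mathsf{F}_{\omega^{\alpha \cdot 2 + 1}}(par(G))$ such that $\mathbb{N} \models F_u(n)$.
Proof Induction on $\alpha$. The only possibility for $\mathsf{F} \vdash^{\alpha}_{0} (\exists x)F_u(x)$ is an inference according to ($\bigvee$) whose premise is
\[ \mathsf{F} \vdash^{\alpha_0}_{0} (\exists x)F_u(x), F_u(t) \tag{i} \]
for some term $t$ such that $par(G, F_u(t)) \subseteq \mathsf{F}(par(G))$, hence $t^{\mathbb{N}} \le \mathsf{F}(par(G))$. In case of $\mathbb{N} \models F_u(t)$ we choose $n := t^{\mathbb{N}}$. Then $t^{\mathbb{N}} \le \mathsf{F}(par(G)) \le \mathsf{F}_{\omega^{\alpha \cdot 2 + 1}}(par(G))$ is true by the parameter condition for inferences with finitely many premises. Otherwise we get $\mathsf{F}[par(t)] \vdash^{\alpha_0}_{0} (\exists x)F_u(x)$ by the Detachment Lemma. Since $par(t) \le \mathsf{F}(par(G))$ this entails $\mathsf{F} \circ \mathsf{F} \vdash^{\alpha_0}_{0} (\exists x)F_u(x)$. Hence $\mathsf{F}_{\omega \cdot 2} \vdash^{\alpha_0}_{0} (\exists x)F_u(x)$. By the induction hypothesis there is an $n \le (\mathsf{F}_{\omega \cdot 2})_{\omega^{\alpha_0 \cdot 2 + 1}}(par(G))$ such that $\mathbb{N} \models F_u(n)$. By (10.11) we therefore obtain $n \le \mathsf{F}_{\omega \cdot 2 \times (\omega^{\alpha_0 \cdot 2 + 1} \mathbin{\#} \omega)}(par(G))$. Putting $\beta := \omega \cdot 2 \times (\omega^{\alpha_0 \cdot 2 + 1} \mathbin{\#} \omega) = \omega^{\alpha_0 \cdot 2 + 2} \cdot 2 \mathbin{\#} \omega^2 \cdot 2$ it remains to check
\[ \mathsf{F}_\beta(par(G)) \le \mathsf{F}_{\omega^{\alpha \cdot 2 + 1}}(par(G)). \tag{ii} \]
We certainly have $\beta < \omega^{\alpha \cdot 2 + 1}$. For $\alpha = \alpha_0 + 1$ we get $N(\beta) = 4 \cdot N(\alpha_0) + 12 \le (2N(\alpha_0) + 4)^2 \le \mathsf{F}(2N(\alpha_0) + 4) = \mathsf{F}(N(\omega^{\alpha \cdot 2 + 1}))$, hence $\beta \ll_{\mathsf{F}} \omega^{\alpha \cdot 2 + 1}$, which immediately entails (ii). Therefore, assume $\alpha_0 + 1 < \alpha$ and let $m := \max par(G)$. Since $\alpha_0 \in O_{\mathsf{F}}(par(G))$ we have $N(\alpha_0) \le \psi_{\mathsf{F}}(\omega + m)$ and thus obtain $\beta \ll_{\mathsf{F}} \omega^{\alpha \cdot 2} + 4 \cdot \psi_{\mathsf{F}}(\omega + m) + 12$. Hence
\[ \mathsf{F}_\beta(m) = \psi_{\mathsf{F}}(\beta + m) < \psi_{\mathsf{F}}(\omega^{\alpha \cdot 2} + 4 \cdot \psi_{\mathsf{F}}(\omega + m) + 12 + m) \le \psi_{\mathsf{F}}(\omega^{\alpha \cdot 2} \mathbin{\#} \omega \cdot 4 + 12 + 5m). \tag{iii} \]
But $\alpha \ge 1$ and $m \ge 1$ entail $\omega^{\alpha \cdot 2} \mathbin{\#} \omega \cdot 4 + 12 + 5m \ll_{\mathsf{F}} \omega^{\alpha \cdot 2 + 1} + m$ which together with (iii) implies (ii).
It follows from the Witnessing Theorem and the Inversion Lemma that operator-controlled cut free derivations provide upper bounds for the Skolem-functions of $\forall^0_2$-sentences that are provable in the operator-controlled semi-formal system. Whenever we have $\mathsf{F} \vdash^{\alpha}_{0} (\forall \vec x)(\exists y)F(\vec x, y, n_1, \dots, n_k)$ for a quantifier free formula $F(\vec x, y, n_1, \dots, n_k)$ we obtain by inversion
\[ \mathsf{F} \vdash^{\alpha}_{0} (\exists y)F(m_1, \dots, m_l, y, n_1, \dots, n_k), \]
which by the Witnessing Theorem shows that there is a natural number $n < \mathsf{F}_{\omega^{\alpha \cdot 2 + 1}}(\{m_1, \dots, m_l, n_1, \dots, n_k\})$ such that $\mathbb{N} \models F(m_1, \dots, m_l, n, n_1, \dots, n_k)$. A Skolem function for a true $\forall^0_2$-sentence $(\forall \vec x)(\exists y)F(\vec x, y)$ is the recursive function $\mu y.\,F(\vec x, y)$. If we succeed in characterizing the class of controlling operators for the $\forall^0_2$-sentences that are provable in NT we thus obtain a characterization of the recursive functions whose totality is provable in NT. Therefore, we have to prove cut-elimination for the semi-formal system with controlling operators.
10.3.11 Exercise The relation $\ll_{\mathsf{F}}$ is not transitive. To make it transitive define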
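\[ \alpha \ll^{0}_{\mathsf{F}} \beta :\Leftrightarrow \alpha = \beta, \qquad \alpha \ll^{k+1}_{\mathsf{F}} \beta :\Leftrightarrow (\exists \gamma)[\alpha \ll^{k}_{\mathsf{F}} \gamma \wedge \gamma \ll_{\mathsf{F}} \beta] \]
and
\[ \alpha \ll^{*}_{\mathsf{F}} \beta :\Leftrightarrow (\exists k)[\alpha \ll^{k}_{\mathsf{F}} \beta]. \]
Prove $\psi_{\mathsf{F}}(\alpha) = \max\{k \mid 0 \ll^{k}_{\mathsf{F}} \alpha\}$.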
10.4 Cut Elimination for Operator Controlled Derivations
We now have all the material to prove cut-elimination for operator controlled derivations in the semi-formal system. This will be the main tool in finding upper bounds for the provably recursive functions of NT. In this section we tacitly assume that all operators $\mathsf{F}$ satisfy $x^3 \le \mathsf{F}(x)$. This means no loss of generality because we will later see that the operators which result from the embedding procedure increase even more strongly.
10.4.1 Lemma (Elimination Lemma) Assume $\mathsf{F} \vdash^{\alpha}_{\rho + 1} \Delta$. Then $\mathsf{F}_{\omega^{\alpha + 1}} \vdash^{\omega^\alpha}_{\rho} \Delta$.
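Proof The proof is similar to the proofs of the analogous statements (Lemma 7.3.13, Theorem 9.3.15) proved before. We will, however, see that we have to be much more careful in calculating the controlling operators. In case that the last inference is not a cut of rank $\rho$, we have the premises
\[ \mathsf{F} \vdash^{\alpha_\iota}_{\rho + 1} \Delta_\iota \text{ for } \iota \in I \]
and obtain
\[ \mathsf{F}_{\omega^{\alpha_\iota + 1}} \vdash^{\omega^{\alpha_\iota}}_{\rho} \Delta_\iota \tag{i} \]
by the induction hypothesis. Now we show
\[ \mathsf{F}_{\omega^{\alpha_\iota + 1}}[par(\Delta_\iota)] \le \mathsf{F}_{\omega^{\alpha + 1}}[par(\Delta_\iota)]. \tag{ii} \]
Let $x \in \omega$, $d_\iota := \max par(\Delta_\iota)$ and $e_\iota := \max\{d_\iota, x\}$. Then $\mathsf{F}_{\omega^{\alpha_\iota + 1}}[par(\Delta_\iota)](x) = \mathsf{F}_{\omega^{\alpha_\iota + 1}}(e_\iota)$. From $\alpha_\iota \in O_{\mathsf{F}}(par(\Delta_\iota))$ we obtain $N(\omega^{\alpha_\iota + 1}) = N(\alpha_\iota) + 2 \le \mathsf{F}(d_\iota) + 2 \le \psi_{\mathsf{F}}(\omega + d_\iota) + 2$. Therefore $\omega^{\alpha_\iota + 1} \ll_{\mathsf{F}} \omega^\alpha + \psi_{\mathsf{F}}(\omega + d_\iota) + 2$ and we obtain
\begin{align*}
\mathsf{F}_{\omega^{\alpha_\iota + 1}}(e_\iota) &< \mathsf{F}_{\omega^\alpha + \psi_{\mathsf{F}}(\omega + d_\iota) + 2}(e_\iota) = \psi_{\mathsf{F}}(\omega^\alpha + \psi_{\mathsf{F}}(\omega + d_\iota) + 2 + e_\iota) \\
&\le \psi_{\mathsf{F}}(\omega^\alpha \mathbin{\#} \omega + 2e_\iota + 2) < \psi_{\mathsf{F}}(\omega^{\alpha + 1} + e_\iota) = \mathsf{F}_{\omega^{\alpha + 1}}(e_\iota), \tag{iii}
\end{align*}
where we used $\omega^\alpha \mathbin{\#} \omega + 2e_\iota + 2 \ll_{\mathsf{F}} \omega^{\alpha + 1} + e_\iota$, which is true because of $\omega^\alpha \mathbin{\#} \omega + 2e_\iota + 2 < \omega^{\alpha + 1} + e_\iota$ and $N(\omega^\alpha \mathbin{\#} \omega + 2e_\iota + 2) = N(\alpha) + 5 + 2e_\iota \le (N(\alpha) + 2 + e_\iota)^2 < \mathsf{F}(N(\omega^{\alpha + 1} + e_\iota))$. From (i) and (ii) we obtain
\[ \mathsf{F}_{\omega^{\alpha + 1}} \vdash^{\omega^{\alpha_\iota}}_{\rho} \Delta_\iota \text{ for } \iota \in I \tag{iv} \]
by the Structural Lemma and from (iv)
\[ \mathsf{F}_{\omega^{\alpha + 1}} \vdash^{\omega^{\alpha}}_{\rho} \Delta \]
with the same inference. In case of a finite index set $I$ it is obvious that $par(\Delta_\iota) \subseteq \mathsf{F}(par(\Delta))$ also entails $par(\Delta_\iota) \subseteq \mathsf{F}_{\omega^{\alpha + 1}}(par(\Delta))$.
So assume that the last inference is a cut of rank $\rho$. Then, we have the premises
\[ \mathsf{F} \vdash^{\alpha_0}_{\rho + 1} \Delta, F \quad\text{and}\quad \mathsf{F} \vdash^{\alpha_0}_{\rho + 1} \Delta, \neg F \tag{v} \]
with $rnk(F) = \rho$ and obtain by the induction hypothesis
\[ \mathsf{F}_{\omega^{\alpha_0 + 1}} \vdash^{\omega^{\alpha_0}}_{\rho} \Delta, F \quad\text{and}\quad \mathsf{F}_{\omega^{\alpha_0 + 1}} \vdash^{\omega^{\alpha_0}}_{\rho} \Delta, \neg F. \tag{vi} \]
Applying the Reduction Lemma (Lemma 10.2.6) in combination with Lemma 10.3.5 we obtain from (vi)
\[ \mathsf{F}_{\omega^{\alpha_0 + 1} \cdot 3} \vdash^{\omega^{\alpha_0} \cdot 2}_{\rho} \Delta. \tag{vii} \]
Let $d := \max par(\Delta)$ and $x \ge d$. Since $N(\alpha_0) \le \mathsf{F}(par(\Delta, F))$ and $par(F) \le \mathsf{F}(par(\Delta)) = \mathsf{F}(d)$ we obtain $N(\alpha_0) \le \mathsf{F}^2(d)$. Hence $N(\omega^{\alpha_0 + 1}) = N(\alpha_0) + 2 \le \mathsf{F}^2(d) + 2 \le \psi_{\mathsf{F}}(\omega \cdot 2 + d) + 2$, which in turn shows $\omega^{\alpha_0 + 1} \cdot 3 \ll_{\mathsf{F}} \omega^\alpha \cdot 3 + \psi_{\mathsf{F}}(\omega \cdot 2 + d) \cdot 3 + 6$. Therefore we obtain
\begin{align*}
\mathsf{F}_{\omega^{\alpha_0 + 1} \cdot 3}(x) &< \mathsf{F}_{\omega^\alpha \cdot 3 + \psi_{\mathsf{F}}(\omega \cdot 2 + d) \cdot 3 + 6}(x) = \psi_{\mathsf{F}}(\omega^\alpha \cdot 3 + \psi_{\mathsf{F}}(\omega \cdot 2 + d) \cdot 3 + 6 + x) \\
&\le \psi_{\mathsf{F}}(\omega^\alpha \cdot 3 + \omega \cdot 6 + 4x + 6) \le \psi_{\mathsf{F}}(\omega^{\alpha + 1} + x) = \mathsf{F}_{\omega^{\alpha + 1}}(x), \tag{viii}
\end{align*}
where in the last line we have used $\omega^\alpha \cdot 3 + \omega \cdot 6 + 4x + 6 \ll_{\mathsf{F}} \omega^{\alpha + 1} + x$, which follows because $N(\alpha) \ge 1$ entails $N(\omega^\alpha \cdot 3 + \omega \cdot 6 + 4x + 6) = N(\alpha) \cdot 3 + 4x + 21 < (N(\alpha) + 2 + x)^3 \le \mathsf{F}(N(\omega^{\alpha + 1} + x))$. By (vii) and (viii) we obtain by the Structural Lemma
\[ \mathsf{F}_{\omega^{\alpha + 1}} \vdash^{\omega^{\alpha}}_{\rho} \Delta. \]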
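Iterating the Elimination Lemma we obtain an Elimination Theorem which suffices to determine upper bounds for the provably recursive functions of NT.
10.4.2 Theorem (Elimination Theorem) If $\mathsf{F} \vdash^{\alpha}_{n} \Delta$ and $n > 0$ then $\mathsf{F}_{\varphi_0^n(\alpha + 1)} \vdash^{\varphi_0^n(\alpha)}_{0} \Delta$.
Proof We induct on $n$. For $n = 1$ the claim follows from the Elimination Lemma (Lemma 10.4.1). For $n + 1 > 1$ we obtain $\mathsf{F}_{\omega^{\alpha + 1}} \vdash^{\omega^\alpha}_{n} \Delta$ from the Elimination Lemma. By the induction hypothesis it follows $(\mathsf{F}_{\omega^{\alpha + 1}})_{\varphi_0^n(\omega^\alpha + 1)} \vdash^{\varphi_0^n(\omega^\alpha)}_{0} \Delta$ which implies
\[ \mathsf{F}_{\omega^{\alpha + 1} \times (\varphi_0^n(\omega^\alpha + 1) \mathbin{\#} \omega)} \vdash^{\varphi_0^{n+1}(\alpha)}_{0} \Delta \tag{i} \]
by Lemma 10.3.9. We prove
\[ \beta := \omega^{\alpha + 1} \times (\varphi_0^n(\omega^\alpha + 1) \mathbin{\#} \omega) \ll_{\mathsf{F}} \varphi_0^{n+1}(\alpha + 1) \tag{ii} \]
and then get the claim from (i) and (ii) by the Structural Lemma (Lemma 10.2.2). We obtain $\beta < \varphi_0^{n+1}(\alpha + 1)$ since $\varphi_0^{n+1}(\alpha + 1)$ is multiplicatively indecomposable and $\omega^{\alpha + 1} < \varphi_0^{n+1}(\alpha + 1)$ as well as $\varphi_0^n(\omega^\alpha + 1) \mathbin{\#} \omega < \varphi_0^{n+1}(\alpha + 1)$. Since $n \ge 1$ we finally obtain $N(\beta) \le (N(\alpha) + 2) \cdot (N(\alpha) + n + 4) \le (N(\alpha) + n + 2)^3 \le \mathsf{F}(N(\varphi_0^{n+1}(\alpha + 1)))$ which finishes the proof of (ii).
10.4.3 Exercise Formulate and prove (ii) of Theorem 9.3.15 for $\mathsf{F} \vdash^{\alpha}_{\beta + \omega^\rho} \Delta$.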
10.5 The Embedding of NT
To obtain an embedding of formal derivations in NT into operator controlled derivations we have to majorize all primitive recursive functions by iterations of a suitable strictly monotone operator. An essential role in the majorizing procedure is played by the diagonalizing operator $\lambda x.\,\mathsf{F}^{x+1}(x)$.
10.5.1 Lemma Let $\mathsf{F}$ be a strictly increasing operator satisfying $x^2 \le \mathsf{F}(x)$. Then $\lambda x.\,\mathsf{F}^{x+1}(x)$ is majorized by $\mathsf{F}_{\omega^2}$.
Proof We have $\mathsf{F}^{x+1}(x) \le \psi_{\mathsf{F}}(\omega \cdot (x + 1) + x)$. Since $\omega \cdot (x + 1) + x \ll_{\mathsf{F}} \omega^2 + x$ we obtain $\mathsf{F}^{x+1}(x) \le \psi_{\mathsf{F}}(\omega^2 + x) = \mathsf{F}_{\omega^2}(x)$.
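The relation between $\omega \cdot (x + 1) + x$ and $\omega^2 + x$ that is used in this proof can be checked directly. The following is a sketch that assumes $N(\omega) = 2$ and $N(\omega^2) = N(\varphi_0(2)) = 3$, as given by the definition of the norm in Sect. 10.2, together with the hypothesis $x^2 \le \mathsf{F}(x)$ of the lemma:

```latex
\begin{align*}
\omega\cdot(x+1) + x &< \omega^2 + x,\\
N(\omega\cdot(x+1) + x) &= 2(x+1) + x = 3x + 2,\\
\mathsf{F}(N(\omega^2 + x)) &= \mathsf{F}(x+3) \ge (x+3)^2 = x^2 + 6x + 9 \ge 3x + 2 .
\end{align*}
```

Both conditions in the definition of the relation are therefore met, and (10.2) yields $\psi_{\mathsf{F}}(\omega \cdot (x+1) + x) < \psi_{\mathsf{F}}(\omega^2 + x)$.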
10.5.2 Lemma Let $F$ be a strongly increasing operator satisfying $x^2 + 2 \le F(x)$. Then we obtain for every primitive recursive function $f$ a natural number $e_f$ such that $f(x_1,\dots,x_n) \le F_{\omega^{e_f}}(\{x_1,\dots,x_n\})$.

Proof Let $f$ be a primitive recursive function and $\underline{f}$ a primitive recursive function term representing $f$. We prove the lemma by induction on the definition of the primitive recursive function term $\underline{f}$. If $\underline{f}$ is the successor symbol $S$ then $f(x) = x+1 \le F_{\omega}(x)$. If $\underline{f}$ is the symbol $C^n_k$ for the constant function with value $k$ we have $C^n_k(x_1,\dots,x_n) = k \le F_{\omega^{k}}(\{x_1,\dots,x_n\})$. If $\underline{f}$ is the symbol $P^n_k$ then $P^n_k(x_1,\dots,x_n) = x_k \le F_{\omega}(\{x_1,\dots,x_n\})$.

Now assume that $\underline{f} = \mathrm{Sub}(g, h_1,\dots,h_m)$. Then by the induction hypothesis, we have numbers $e_g$ and $e_{h_i}$ such that
$g(x_1,\dots,x_m) \le F_{\omega^{e_g}}(\{x_1,\dots,x_m\})$ (i)
and
$h_i(x_1,\dots,x_n) \le F_{\omega^{e_{h_i}}}(\{x_1,\dots,x_n\})$ for $i = 1,\dots,m$. (ii)
Then
$g(h_1(x_1,\dots,x_n),\dots,h_m(x_1,\dots,x_n)) \le F_{\omega^{e_g}}(\{h_1(x_1,\dots,x_n),\dots,h_m(x_1,\dots,x_n)\}) \le F_{\omega^{e_g}}(F_{\omega^{e_h}}(\{x_1,\dots,x_n\})) \le F_{\omega^{e_g} \mathbin{\#} \omega^{e_h}}(\{x_1,\dots,x_n\}) \le F_{\omega^{e_f}}(\{x_1,\dots,x_n\})$
for $e_h := \max\{e_{h_1},\dots,e_{h_m}\}$ and $e_f := e_g + e_h + 1$.

If $\underline{f}$ is the term $\mathrm{Rec}(g,h)$ we have numbers $e_g$ and $e_h$ such that $g(x_1,\dots,x_n) \le F_{\omega^{e_g}}(\{x_1,\dots,x_n\})$ and $h(k,l,x_1,\dots,x_n) \le F_{\omega^{e_h}}(\{k,l,x_1,\dots,x_n\})$. Let $e_0 := \max\{e_g, e_h\}$ and put $u_0 := g(x_1,\dots,x_n)$ and $u_{k+1} := h(k,u_k,x_1,\dots,x_n)$. Then $f(k,x_1,\dots,x_n) = u_k$ and we prove
$u_k \le F^{k+1}_{\omega^{e_0}}(\{k,x_1,\dots,x_n\})$ (iii)
by induction on $k$. We have $u_0 := g(x_1,\dots,x_n) \le F_{\omega^{e_0}}(\{x_1,\dots,x_n\})$ and
$u_{k+1} := h(k,u_k,x_1,\dots,x_n) \le F_{\omega^{e_0}}(\{k,u_k,x_1,\dots,x_n\}) \le F_{\omega^{e_0}}(\{k, F^{k+1}_{\omega^{e_0}}(\{k,x_1,\dots,x_n\}), x_1,\dots,x_n\}) \le F^{k+2}_{\omega^{e_0}}(\{k,x_1,\dots,x_n\})$.
Let $m := \max\{k,x_1,\dots,x_n\}$. Since $2\cdot x + 1 \le x^2 + 2 \le F(x)$, we obtain by Lemma 10.5.1 and Lemma 10.3.9
$f(k,x_1,\dots,x_n) = u_k \le F^{k+1}_{\omega^{e_0}}(m) \le F^{m+1}_{\omega^{e_0}}(m) \le (F_{\omega^{e_0}})_{\omega^2}(m) \le F_{\omega^{e_0+3}}(m) = F_{\omega^{e_0+3}}(\{k,x_1,\dots,x_n\})$
and put $e_f := e_0 + 3$.
10.5.3 Theorem Let $F$ be a strongly increasing operator such that $x^2 + 2 \le F(x)$. Then every primitive recursive function is eventually majorized by $F_{\omega^{\omega}}$.

Proof Let $f$ be an $n$-ary primitive recursive function. By Lemma 10.5.2, there is a natural number $e_f$ such that $f(x_1,\dots,x_n) \le F_{\omega^{e_f}}(\{x_1,\dots,x_n\})$. For all tuples $x_1,\dots,x_n$ such that $x := \max\{x_1,\dots,x_n\} \ge e_f$ we then obtain $\omega^{e_f} + x \ll_F \omega^{\omega} + x$, which implies $f(x_1,\dots,x_n) \le F_{\omega^{e_f}}(x) \le F_{\omega^{\omega}}(\{x_1,\dots,x_n\})$.
Lemma 10.5.2 will be crucial in the proof of the following lemma which is based on the iterations of a strongly increasing operator $F$ fulfilling $x^2 + 2 \le F(x)$.

10.5.4 Lemma Let $\Delta(x_1,\dots,x_n) \subseteq \Gamma(x_1,\dots,x_n)$ be finite sets of $\mathcal{L}(NT)$-formulas which contain at most the free variables $x_1,\dots,x_n$. If $\vdash^{m}_{T} \Delta(x_1,\dots,x_n)$ then there is a $k < \omega$ such that $F_{\omega^{k}+m} \vdash^{m}_{0} \Gamma(z_1,\dots,z_n)$ for all $n$-tuples $z_1,\dots,z_n$ of numerals and all strongly increasing operators $F$ fulfilling $x^2 + 2 \le F(x)$.

Proof Induction on $m$. We proceed by distinction on cases according to Definition 4.3.2.

If $\vdash^{m}_{T} \Delta(x_1,\dots,x_n)$ holds by (Ax)$_L$ then there is an atomic formula $A(x_1,\dots,x_n)$ such that $\{A(x_1,\dots,x_n), \neg A(x_1,\dots,x_n)\} \subseteq \Delta(x_1,\dots,x_n)$. For an $n$-tuple $z_1,\dots,z_n$ of natural numbers either $A(z_1,\dots,z_n) \in \mathrm{Diag}(\mathbb{N})$ or $\neg A(z_1,\dots,z_n) \in \mathrm{Diag}(\mathbb{N})$. Hence $\Gamma(z_1,\dots,z_n) \cap \mathrm{Diag}(\mathbb{N}) \ne \emptyset$, and we obtain $F_{\omega} \vdash^{0}_{0} \Gamma(z_1,\dots,z_n)$ by an inference $(\bigwedge)$ with empty premises.

In case of an inference according to $(\vee)$ we have the premise
$\vdash^{m_0}_{T} \Delta_0(x_1,\dots,x_n), A_i(x_1,\dots,x_n)$ (i)
for some $i \in \{0,1\}$ and obtain
$F_{\omega^{k}+m_0} \vdash^{m_0}_{0} \Gamma(z_1,\dots,z_n), A_i(z_1,\dots,z_n), (A_0 \vee A_1)(z_1,\dots,z_n)$ (ii)
for any $n$-tuple $z_1,\dots,z_n$. From (ii), however, we obtain the claim by an inference $(\bigvee)$.

In case of an inference according to $(\wedge)$ we have the premises
$\vdash^{m_i}_{T} \Delta_0(x_1,\dots,x_n), A_i(x_1,\dots,x_n)$ (iii)
for $i = 0, 1$ and obtain
$F_{\omega^{k_0}+m_0} \vdash^{m_0}_{0} \Gamma(z_1,\dots,z_n), A_0(z_1,\dots,z_n), (A_0 \wedge A_1)(z_1,\dots,z_n)$ (iv)
and
$F_{\omega^{k_1}+m_1} \vdash^{m_1}_{0} \Gamma(z_1,\dots,z_n), A_1(z_1,\dots,z_n), (A_0 \wedge A_1)(z_1,\dots,z_n)$ (v)
by the induction hypothesis. From (iv) and (v) we then obtain the claim by an inference $(\bigwedge)$ with $k := \max\{k_0,k_1\}$.

In case of an inference according to $(\exists)$ we have the premise
$\vdash^{m_0}_{T} \Delta_0(x_1,\dots,x_n), A_u(f(x_1,\dots,x_m), x_1,\dots,x_n)$. (vi)
Without loss of generality, we may assume that $\{x_1,\dots,x_m\} \subseteq \{x_1,\dots,x_n\}$ because we can replace all variables that do not occur among $x_1,\dots,x_n$ by $0$ without destroying the derivation. By the induction hypothesis we then obtain
$F_{\omega^{k_0}+m_0} \vdash^{m_0}_{0} \Gamma(z_1,\dots,z_n), A_u(f(z_1,\dots,z_n), z_1,\dots,z_n), (\exists x)A_u(x,z_1,\dots,z_n)$. (vii)
By Lemma 10.5.2, there is a natural number $k_1$ such that $f(z_1,\dots,z_n) \le F_{\omega^{k_1}}(\{z_1,\dots,z_n\}) < F_{\omega^{k_1}+m}(\mathrm{par}(\Gamma(z_1,\dots,z_n)))$. Thus
$\mathrm{par}(\Gamma(z_1,\dots,z_n), A_u(f(z_1,\dots,z_n), z_1,\dots,z_n)) \subseteq F_{\omega^{k}+m}(\mathrm{par}(\Gamma(z_1,\dots,z_n)))$
for $k = \max\{k_0,k_1\}$ and we obtain the claim from (vii) by an inference $(\bigvee)$.

The last case is an inference according to $(\forall)$. Then we have the premise
$\vdash^{m_0}_{T} \Delta_0(x_1,\dots,x_n), A(v,x_1,\dots,x_n)$ (viii)
such that $v$ does not occur in $\Delta_0(x_1,\dots,x_n)$. Fix an $n$-tuple $z_1,\dots,z_n$ of natural numbers. Then we obtain from (viii)
$F_{\omega^{k}+m_0} \vdash^{m_0}_{0} \Gamma(z_1,\dots,z_n), A_v(z,z_1,\dots,z_n), (\forall x)A_u(x,z_1,\dots,z_n)$ (ix)
for all natural numbers $z$ by the induction hypothesis. From (ix) we get the claim by an inference $(\bigwedge)$.

It follows from Theorem 7.2.1 and Lemma 10.5.4 that for every $\mathcal{L}(NT)$-sentence $F$ that is provable in NT, there are axioms $A_1,\dots,A_n \in NT$ and natural numbers $k$ and $m$ such that
$F_{\omega\cdot k+m} \vdash^{m}_{0} \neg A_1,\dots,\neg A_n, F$. (10.12)
To obtain a characterization of the provably recursive functions of NT, we have to check if all the mathematical axioms of NT are also operator controlled derivable. If $(\forall x_1)\dots(\forall x_n)F(x_1,\dots,x_n)$ is an identity axiom or a mathematical axiom of NT different from an instance of Mathematical Induction and the last identity axiom then we have $F_{x_1,\dots,x_n}(z_1,\dots,z_n) \in \mathrm{Diag}(\mathbb{N})$ for every tuple of numerals $z_1,\dots,z_n$. For any nontrivial strictly increasing operator $F$ and $n \ge 1$ we obtain $N(\omega) = 2 \le F(\mathrm{par}((\forall x_1)\dots(\forall x_n)F(x_1,\dots,x_n)))$. Therefore
$F \vdash^{\omega}_{0} (\forall x_1)\dots(\forall x_n)F(x_1,\dots,x_n)$ (10.13)
is true for all strictly increasing operators $F$ and all axioms different from instances of Mathematical Induction and the last identity axiom. To check the operator controlled derivability of instances of Mathematical Induction, we need some preparations. The first is an operator controlled version of the Tautology Lemma.

10.5.5 Lemma Let $F$ be a strictly increasing operator which also satisfies $2\cdot x \le F(x)$ for all natural numbers $x$. Then $F \vdash^{2\cdot\mathrm{rnk}(F)}_{0} \Delta, F, \neg F'$ holds for all numerically equivalent sentences $F$ and $F'$.

Proof We show the lemma by induction on $\mathrm{rnk}(F)$. Without loss of generality we may assume that $F \in \bigwedge$-type. If $F$ is atomic we obtain $F \vdash^{0}_{0} \Delta, F, \neg F'$ by an inference $(\bigwedge)$ with empty premise. If $F$ is not atomic then $F$ is either a sentence $(F_0 \wedge F_1)$ or a sentence $(\forall x)G_u(x)$. In the first case, we obtain $F \vdash^{2\cdot\mathrm{rnk}(F_i)}_{0} \Delta, F, \neg F', F_i, \neg F_i'$ for $i = 0, 1$ and obtain
$F \vdash^{2\cdot\mathrm{rnk}(F_i)+1}_{0} \Delta, F, \neg F', F_i$ (i)
for $i = 0, 1$ by an inference $(\bigvee)$ and thence
$F \vdash^{2\cdot\mathrm{rnk}(F)}_{0} \Delta, F, \neg F'$ (ii)
by an inference $(\bigwedge)$. Both inferences can be applied because $2\cdot\mathrm{rnk}(F_i) < 2\cdot\mathrm{rnk}(F_i) + 1 < 2\cdot\mathrm{rnk}(F) \le F(\mathrm{par}(\Delta, F, \neg F'))$ and $\mathrm{par}(\Delta, F, F_i) \subseteq \mathrm{par}(\Delta, F)$. If $F$ is a sentence $(\forall x)G_u(x)$ we obtain
$F \vdash^{2\cdot\mathrm{rnk}(G)}_{0} \Delta, F, \neg F', G_u(n), \neg G_u'(n)$ (iii)
for all $n \in \omega$. By an inference $(\bigvee)$ we obtain from (iii)
$F \vdash^{2\cdot\mathrm{rnk}(G)+1}_{0} \Delta, F, \neg F', G_u(n)$ (iv)
for all $n \in \omega$. This inference can be applied because $2\cdot\mathrm{rnk}(G) < 2\cdot\mathrm{rnk}(G) + 1 < 2\cdot\mathrm{rnk}(F) \le F(\mathrm{par}(\Delta, F, G_u(n)))$ and $n \in \mathrm{par}(\Delta, F, G_u(n))$. From (iv) we then obtain
$F \vdash^{2\cdot\mathrm{rnk}(F)}_{0} \Delta, F, \neg F'$
by an inference $(\bigwedge)$.
As an immediate consequence of Lemma 10.5.5 we get $F \vdash^{\omega}_{0} A$ for all instances $A$ of the last identity axiom. Next we handle the scheme of Mathematical Induction.

10.5.6 Lemma (Mathematical Induction Lemma) Let $F$ be a strictly increasing operator such that $4\cdot n \le F(n)$. Then
$F \vdash^{2\cdot(\mathrm{rnk}(F)+n)}_{0} \neg F_u(0), \neg((\forall x)[F_u(x) \to F_u(x+1)]), F_u(n)$.

Proof We prove the lemma by induction on $n$. The proof is literally that of Lemma 7.3.4. We have only to keep an eye on the occurring number parameters. First we have $2\cdot(\mathrm{rnk}(F)+n) \in O_F(\mathrm{par}(F_u(n)))$ because $2\cdot(\mathrm{rnk}(F)+n) \le 4\cdot\max \mathrm{par}(F_u(n)) \le F(\mathrm{par}(F_u(n)))$. Next we obtain
$F \vdash^{2\cdot\mathrm{rnk}(F)}_{0} \neg F_u(0), \neg((\forall x)[F_u(x) \to F_u(x+1)]), F_u(0)$ (i)
by Lemma 10.5.5. Assume now
$F \vdash^{2\cdot(\mathrm{rnk}(F)+n)}_{0} \neg F_u(0), \neg((\forall x)[F_u(x) \to F_u(x+1)]), F_u(n)$
as induction hypothesis, i.e.,
$F \vdash^{2\cdot(\mathrm{rnk}(F)+n)}_{0} \neg F_u(0), (\exists x)[F_u(x) \wedge \neg F_u(x+1)], F_u(n)$. (ii)
By Lemma 10.5.5 we have
$F \vdash^{2\cdot\mathrm{rnk}(F)}_{0} \neg F_u(0), (\exists x)[F_u(x) \wedge \neg F_u(x+1)], \neg F_u(n+1), F_u(n+1)$. (iii)
From (ii) and (iii) we obtain by an inference $(\bigwedge)$
$F \vdash^{2\cdot(\mathrm{rnk}(F)+n)+1}_{0} \neg F_u(0), (\exists x)[F_u(x) \wedge \neg F_u(x+1)], F_u(n+1), F_u(n) \wedge \neg F_u(n+1)$ (iv)
and from (iv) finally
$F \vdash^{2\cdot(\mathrm{rnk}(F)+n)+2}_{0} \neg F_u(0), \neg(\forall x)[F_u(x) \to F_u(x+1)], F_u(n+1)$
by an inference according to $(\bigvee)$. In both inferences the occurring number parameters are obviously controlled.

From Lemma 10.5.6, we finally obtain all instances of Mathematical Induction.

10.5.7 Lemma (Mathematical Induction) Let $F$ be a strictly increasing operator such that $4\cdot n \le F(n)$. Then
$F \vdash^{\omega+4}_{0} \neg F_u(0) \vee \neg((\forall x)[F_u(x) \to F_u(x+1)]) \vee (\forall x)F_u(x)$
holds for any formula $F$.

Proof This follows from Lemma 10.5.6 by an inference $(\bigwedge)$ and inferences $(\bigvee)$.
Now we are prepared to prove the main theorem of this section.

10.5.8 Theorem Let $F$ be a strongly increasing operator such that $x^2 + 2 \le F(x)$ for all natural numbers $x$. If $NT \vdash (\forall\vec{x})(\exists y)F(\vec{x},y)$ for a quantifier free formula $F(\vec{u},v)$ then there is an $\alpha < \varepsilon_0$ such that $\mathbb{N} \models (\forall x_1)\dots(\forall x_n)(\exists y \in F_{\alpha}(\{x_1,\dots,x_n\}))F(x_1,\dots,x_n,y)$.

Proof Assume $NT \vdash (\forall\vec{x})(\exists y)F(\vec{x},y)$. Then there are finitely many NT-axioms $A_1,\dots,A_l$ such that
$\vdash^{m}_{T} \neg A_1,\dots,\neg A_l, (\forall\vec{x})(\exists y)F(\vec{x},y)$. (i)
By Lemma 10.5.4 there is a $k \in \omega$ such that
$F_{\omega^{k}+m} \vdash^{m}_{0} \neg A_1,\dots,\neg A_l, (\forall\vec{x})(\exists y)F(\vec{x},y)$. (ii)
Together with (10.13) (on p. 226) and Lemma 10.5.7 we obtain from (ii) finite ordinals $p, r < \omega$ such that
$F_{\omega^{p}} \vdash^{\omega\cdot 2}_{r} (\forall\vec{x})(\exists y)F(\vec{x},y)$. (iii)
We can choose $p$ so big that $x^3 \le F_{\omega^{p}}(x)$. Then the Elimination Theorem (Theorem 10.4.2) and (iii) imply
$(F_{\omega^{p}})_{\varphi_0^{r}(\omega\cdot 2+1)} \vdash^{\varphi_0^{r}(\omega\cdot 2)}_{0} (\forall\vec{x})(\exists y)F(\vec{x},y)$. (iv)
By Lemma 10.3.9 we obtain $(F_{\omega^{p}})_{\varphi_0^{r}(\omega\cdot 2+1)} \le F_{\omega^{p} \times (\varphi_0^{r}(\omega\cdot 2) \mathbin{\#} \omega)} \le F_{\varphi_0^{n}(0)}$ for $n$ large enough. Then (iv) implies
$F_{\varphi_0^{n}(0)} \vdash^{\varphi_0^{r}(\omega\cdot 2)}_{0} (\forall\vec{x})(\exists y)F(\vec{x},y)$ (v)
by the Structural Rule (Lemma 10.2.2). By the Inversion Lemma (Lemma ??) we get
$F_{\varphi_0^{n}(0)}[\{x_1,\dots,x_n\}] \vdash^{\varphi_0^{r}(\omega\cdot 2)}_{0} (\exists y)F(x_1,\dots,x_n,y)$. (vi)
Now let $n$ be so big that $\varphi_0^{n}(0) \times (\omega^{\varphi_0^{r}(\omega\cdot 2)\cdot 2+1} \mathbin{\#} \omega) \ll_F \varphi_0^{n}(0) =: \alpha$. Then $\alpha < \varepsilon_0$ and $\mathbb{N} \models (\exists y \in F_{\alpha}(\{x_1,\dots,x_n\}))F(x_1,\dots,x_n,y)$ by the Witnessing Theorem (Theorem 10.3.10) and Lemma 10.3.9.

10.5.9 Corollary The provably recursive functions of NT are eventually majorized by $F_{\varepsilon_0}$ for any strongly increasing operator satisfying $x^2 + 2 \le F(x)$.

Proof Let $f$ be a provably recursive function of NT. Then there is an index $e$ such that $f(x_1,\dots,x_n) \simeq U(\mu y.\,T(e,x_1,\dots,x_n,y))$ and $NT \vdash (\forall\vec{x})(\exists y)T(e,\vec{x},y)$. By Theorem 10.5.8 we obtain an $\alpha < \varepsilon_0$ such that $y < F_{\alpha}(\{x_1,\dots,x_n\})$ holds true for the least $y$ satisfying $T(e,x_1,\dots,x_n,y)$. Since $f(\vec{x}) = U(y) \le y$ we obtain $f(\vec{x}) \le F_{\varepsilon_0}(\{x_1,\dots,x_n\})$ for all $\vec{x} = (x_1,\dots,x_n)$ such that $\max\{x_1,\dots,x_n\} \ge N(\alpha)$.

On the other hand, we have shown in Sect. 3.3.1 that we can talk about ordinals below $\varepsilon_0$ in $\mathcal{L}(NT)$. For the moment, it is useful to emphasize the difference between an ordinal $\alpha < \varepsilon_0$ and its code $\ulcorner\alpha\urcorner$. It is obvious that the function $N(\ulcorner\alpha\urcorner) := N(\alpha)$ is primitive recursive. For a finite ordinal $n$ we get according to Sect. 3.3.1
$\ulcorner n\urcorner := \mu s.\,\{\mathrm{Seq}(s) \wedge \mathrm{lh}(s) = n \wedge (\forall i < n)[(s)_i = 0]\}$
and the search operator can obviously be bounded. Hence $n \mapsto \ulcorner n\urcorner$ is primitive recursive. The inverse mapping $|\ulcorner n\urcorner|$ is then simply given by $|\ulcorner n\urcorner| = n = \mathrm{lh}(\ulcorner n\urcorner)$, which shows that it is primitive recursive, too. For an ordinal $\alpha$ the function $\lambda n.\,\ulcorner\alpha + n\urcorner$ is primitive recursive because $\ulcorner\alpha + n\urcorner = \ulcorner\alpha\urcorner^{\frown}\ulcorner n\urcorner$. For a strictly increasing function $F$ the function $\psi_F$ induces a mapping
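$\ulcorner\psi_F\urcorner\colon \ulcorner On\urcorner \longrightarrow \{\ulcorner\xi\urcorner \in \ulcorner On\urcorner \mid \xi < \omega\}$ by putting $\ulcorner\psi_F\urcorner(\ulcorner\alpha\urcorner) := \ulcorner\psi_F(\alpha)\urcorner$. Hence
$\psi_F(\alpha) = |\ulcorner\psi_F\urcorner(\ulcorner\alpha\urcorner)| = \mathrm{lh}(\ulcorner\psi_F\urcorner(\ulcorner\alpha\urcorner))$.
We will first prove in NT that the function $\ulcorner\psi_F\urcorner(\ulcorner\alpha\urcorner)$ is defined for all $\alpha < \varepsilon_0$ provided that $F$ is a provably recursive function of NT (which is the case for all primitive recursive functions). We use the Recursion Theorem (which is a theorem of NT) to obtain an index $e$ satisfying
$\{e\}(\ulcorner\xi\urcorner) \simeq \max_{\prec}\{\{e\}(\ulcorner\eta\urcorner)^{\frown}\langle 0\rangle \mid \ulcorner\eta\urcorner \prec \ulcorner\xi\urcorner \wedge N(\ulcorner\eta\urcorner) \le F(N(\ulcorner\xi\urcorner))\}$. (10.14)
Then $e$ is an index for the function $\ulcorner\psi_F\urcorner$ and it remains to show that $\ulcorner\psi_F\urcorner$ is total. To this end we show by induction on $\ulcorner\beta\urcorner \prec \ulcorner\alpha\urcorner$, which is available in NT by Theorem 7.4.1,
$(\forall \ulcorner\beta\urcorner \prec \ulcorner\alpha\urcorner)(\exists y)[\{e\}(\ulcorner\beta\urcorner) \simeq y \wedge y = \ulcorner\psi_F\urcorner(\ulcorner\beta\urcorner)]$. (10.15)
We have the induction hypothesis $(\forall \ulcorner\xi\urcorner \prec \ulcorner\beta\urcorner)(\exists y)[\{e\}(\ulcorner\xi\urcorner) \simeq y \wedge y = \ulcorner\psi_F\urcorner(\ulcorner\xi\urcorner)]$ and obtain by (10.14)
$\{e\}(\ulcorner\beta\urcorner) \simeq \max_{\prec}\{\{e\}(\ulcorner\xi\urcorner)^{\frown}\langle 0\rangle \mid \ulcorner\xi\urcorner \prec \ulcorner\beta\urcorner \wedge N(\ulcorner\xi\urcorner) \le F(N(\ulcorner\beta\urcorner))\}$
$\simeq \max_{\prec}\{\ulcorner\psi_F(\xi) + 1\urcorner \mid \ulcorner\xi\urcorner \prec \ulcorner\beta\urcorner \wedge N(\ulcorner\xi\urcorner) \le F(N(\ulcorner\beta\urcorner))\}$
$\simeq \ulcorner\psi_F\urcorner(\ulcorner\beta\urcorner)$.
Since all $\{e\}(\ulcorner\xi\urcorner)$ are defined by induction hypothesis and NT proves that there are only finitely many $\ulcorner\xi\urcorner \prec \ulcorner\beta\urcorner$ such that $N(\ulcorner\xi\urcorner) \le F(N(\ulcorner\beta\urcorner))$ their $\prec$-maximum exists and we obtain that $\{e\}(\ulcorner\beta\urcorner)$ is defined, too.

From (10.15) we obtain
$(\exists y)[\ulcorner\psi_F\urcorner(\ulcorner\alpha\urcorner) = y]$ (10.16)
for every ordinal $\alpha < \varepsilon_0$ and defining $\ulcorner F\urcorner_x(y) = \mathrm{lh}(\ulcorner\psi_F\urcorner(x^{\frown}\ulcorner y\urcorner))$ we obtain
$F_{\alpha}(x) = \ulcorner F\urcorner_{\ulcorner\alpha\urcorner}(x)$ (10.17)
since $F_{\alpha}(x) = \psi_F(\alpha + x) = |\ulcorner\psi_F(\alpha + x)\urcorner| = \mathrm{lh}(\ulcorner\psi_F\urcorner(\ulcorner\alpha + x\urcorner)) = \mathrm{lh}(\ulcorner\psi_F\urcorner(\ulcorner\alpha\urcorner^{\frown}\ulcorner x\urcorner)) = \ulcorner F\urcorner_{\ulcorner\alpha\urcorner}(x)$. For $\alpha < \varepsilon_0$ we obtain $(\forall x)[\ulcorner\alpha + x\urcorner \prec \varepsilon_0]$ by Mathematical Induction on $x$. So we have $(\forall x)(\exists y)[F_{\alpha}(x) \simeq y]$ for any $\alpha < \varepsilon_0$ by (10.16) and (10.17). This shows that for $\alpha < \varepsilon_0$, every function $F_{\alpha}$ is a provably recursive function of NT. The provably recursive functions of NT are, however, clearly closed under "primitive recursive in." Therefore every function which is primitive recursive in some $F_{\alpha}$ with $\alpha < \varepsilon_0$ is a provably recursive function of NT.

On the other hand, if $f$ is a provably recursive function of NT then there is an index $e$ such that $NT \vdash (\forall\vec{x})(\exists y)T(e,\vec{x},y)$. Hence $\mathbb{N} \models (\forall\vec{x})(\exists y < F_{\alpha}(\{\vec{x}\}))T(e,\vec{x},y)$ for some $\alpha < \varepsilon_0$ by Theorem 10.5.8. This means that $f(\vec{x}) = \mu y < F_{\alpha}(\{\vec{x}\}).\,T(e,\vec{x},y)$, i.e., that $f$ is primitive recursive in some $F_{\alpha}$. So we have the following theorem.

10.5.10 Theorem The provably recursive functions of NT are exactly the functions which are primitive recursive in $F_{\alpha}$ for some $\alpha < \varepsilon_0$ and some strictly increasing operator $F$ which is provably recursive, e.g., primitive recursive, and satisfies $x^2 + 2 \le F(x)$.

10.5.11 Corollary There is a true $\Pi^0_2$-sentence $(\forall x)(\exists y)F(x,y)$ such that NT proves $(\exists y)F(n,y)$ for all natural numbers $n$ but $NT \nvdash (\forall x)(\exists y)F(x,y)$.

Proof Define $F(u,v) :\Leftrightarrow (u)_0 \in OT \to F_{(u)_0}((u)_1) = v$.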
10.6 Discussion

It follows from Theorems 10.5.8 and 10.5.10 that the Skolem functions for $\Pi^0_2$-sentences that are provable in NT form exactly the class of functions which are primitive recursive in $\{F_{\alpha} \mid \alpha < \varepsilon_0\}$ for suitable "simple" operators $F$. We call the collection $\{F_{\alpha} \mid \alpha < \varepsilon_0\}$ the subrecursive hierarchy with basis $F$. For a collection $\mathcal{F}$ of functions let
$\mathrm{PRH}(\mathcal{F}) := \{f \mid f\colon \mathbb{N} \longrightarrow \mathbb{N}$ and $f$ is primitive recursive in some $g \in \mathcal{F}\}$
denote the primitive recursive hull of the collection $\mathcal{F}$. By $SF_{\Pi^0_2}(T)$ we denote the class of Skolem functions of the provable $\Pi^0_2$-sentences of a theory $T$. We say that the Skolem functions of the provable $\Pi^0_2$-sentences of a theory $T$ are generated by the subrecursive hierarchy $\{F_{\alpha} \mid \alpha < \kappa\}$ or, equivalently, that $\{F_{\alpha} \mid \alpha < \kappa\}$ forms a generating class for the Skolem functions of the provable $\Pi^0_2$-sentences of the theory $T$ if $SF_{\Pi^0_2}(T) = \mathrm{PRH}(\{F_{\alpha} \mid \alpha < \kappa\})$. The ordinal $\kappa$ is the length of the generating class.

It has become common to call the length of the shortest subrecursive hierarchy which is needed to generate the Skolem functions for the provable $\Pi^0_2$-sentences of a theory its $\Pi^0_2$-ordinal. The length of the generating subrecursive hierarchy for $SF_{\Pi^0_2}(NT)$ is largely independent of its basic function as long as this function is a provably recursive function of NT with sufficient growth rate. Nevertheless the definition of the $\Pi^0_2$-ordinal of a theory is "less canonical" than the definition of its $\Pi^1_1$-ordinal. In defining the subrecursive hierarchy, we relied heavily on the norm of an ordinal, i.e., on some kind of unique term notation for ordinals. Therefore, the $\Pi^0_2$-ordinal of a theory is an ordinal equipped with some additional term structure rather than a pure ordinal in the set theoretic sense. Although we vaguely talk about the $\Pi^0_2$-ordinal of a theory, we actually mean an ordinal notation system equipped with a norm function. This causes no real problems for ordinals below $\Gamma_0$ because we have a relatively clear picture of a notation system for these ordinals. But even there we can observe surprising effects (cf. [107]). For bigger ordinals the situation becomes much worse. We have shown in Sect. 9.6 that there is a primitive recursive notation system for the Howard–Bachmann ordinal $\psi(\varepsilon_{\Omega+1})$. Here, however, it is not at all clear why this notation system should be regarded as "canonical". A feasible characterization of "what makes a notation system canonical" is still one of the big challenges.

The question "which function is a suitable basis for a subrecursive hierarchy generating the Skolem functions for the provable $\Pi^0_2$-sentences of a theory?" is not so important for NT. In Theorem 10.5.10, we have just shown that the subrecursive hierarchies which generate the provably recursive functions of NT are largely independent of the choice of their generating functions. In determining the $\Pi^0_2$-ordinals of subtheories of NT or rather PA, however, the choice of the basis functions becomes important because these ordinals are not $\varepsilon$-numbers. A difference of one or even more $\omega$-powers in the length of the generating class makes a big difference there. But the analysis of such small theories is not among the aims of this book.
It is, however, interesting in itself to find a basis for a generating class for the provably recursive functions of NT which is as simple as possible. To make "our"¹ theory of subrecursive hierarchies work we need a strongly increasing operator satisfying $x^2 + 2 \le F(x)$. The function $\lambda x.\,x^2 + 2$ is strongly increasing except for $x = 0$ (which would be sufficient to develop the theory) but already $\lambda x.\,x^2 + 3$ is everywhere strongly increasing and can therefore serve as a basis. The requirements "strongly increasing" and $x^2 + 2 \le F(x)$ are, however, somewhat odd and at least the second condition is apparently only technically motivated. They exclude the simplest nontrivial strictly increasing function, the successor function, as a basis. However, Arai in [2] and also Weiermann in [108] could show that the subrecursive hierarchy with basis $S$ (the operator induced by the successor function $S$) generates the Skolem functions of the provable $\Pi^0_2$-sentences of PA at level $\varepsilon_0$. This result, however, needs a detailed study of the properties of the subrecursive hierarchies which would lead us outside the scope of this book. We will therefore briefly sketch an alternative approach to subrecursive hierarchies, which is historically the original one.

¹ This theory is based on the work of Weiermann, mainly published in [14] and [108].

In this discussion, we restrict ourselves to ordinals below $\varepsilon_0$ and assign fundamental sequences to the ordinals below $\varepsilon_0$. We do that in two steps and define first fundamental sequences for additively indecomposable ordinals.

10.6.1 Definition Let $\alpha < \varepsilon_0$ be an additively indecomposable ordinal and $x < \omega$. Put
$\alpha[x] := \begin{cases} 0 & \text{if } \alpha = \omega^{0} \\ \omega^{\beta}\cdot(x+1) & \text{if } \alpha = \omega^{\beta+1} \\ \omega^{\lambda[x]} & \text{if } \alpha = \omega^{\lambda} \text{ and } \lambda \in \mathrm{Lim}. \end{cases}$
For additively decomposable ordinals $\alpha =_{NF} \alpha_1 + \dots + \alpha_n < \varepsilon_0$ let
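$\alpha[x] := \alpha_1 + \dots + \alpha_{n-1} + \alpha_n[x]$ and $0[x] := 0$. We call $\langle \alpha[x] \mid x \in \omega \rangle$ the fundamental sequence for $\alpha$ and $\alpha[x]$ the $x$-th member of the fundamental sequence for $\alpha$.

Observe that for $\alpha \in \mathbb{H} \setminus \{1\}$ we obtain $\alpha = \sup_{x\in\omega} \alpha[x]$. Hence $\alpha = \sup_{x\in\omega} \alpha[x]$ for all $\alpha \in \mathrm{Lim}$. If $\alpha \notin \mathrm{Lim}$ we obtain $\alpha[x]$ as the predecessor of $\alpha$, independent of $x$.

We extend the definition of $F^{n}$ to ordinals $< \varepsilon_0$ by putting
$F^{\alpha}(x) := \begin{cases} x & \text{if } \alpha = 0 \\ F^{\alpha[x]}(F(x)) & \text{if } 0 < \alpha. \end{cases}$

10.6.2 Lemma Let $\alpha =_{NF} \alpha_1 + \dots + \alpha_n$. Then $F^{\alpha} = F^{\alpha_1} \circ (\dots \circ F^{\alpha_n} \dots)$.

Proof The proof is by induction on $\alpha$. For $\alpha_n = 1$ we obtain $F^{\alpha} = F^{\alpha_1+\dots+\alpha_{n-1}} \circ F$ by definition which entails $F^{\alpha} = F^{\alpha_1} \circ (\dots (F^{\alpha_{n-1}} \circ F) \dots)$ by induction hypothesis. If $\alpha_n \in \mathrm{Lim}$ we obtain
$F^{\alpha}(x) = F^{\alpha[x]}(F(x)) = F^{\alpha_1+\dots+\alpha_{n-1}+\alpha_n[x]}(F(x)) = F^{\alpha_1}(\dots F^{\alpha_{n-1}}(F^{\alpha_n[x]}(F(x))) \dots) = F^{\alpha_1}(\dots F^{\alpha_{n-1}}(F^{\alpha_n}(x)) \dots)$.

For the operator $S$ we compute easily $S^{n}(x) = n + x$, $S^{\omega}(x) = 2\cdot x + 2$, $S^{\omega\cdot n}(x) = 2^{n}\cdot(x+2) - 2$ and $S^{\omega^2}(x) = 2^{x+1}\cdot(x+3) - 2$.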
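The closed forms for $S^{n}$, $S^{\omega}$, $S^{\omega\cdot n}$ and $S^{\omega^2}$ can be checked mechanically. The following is a minimal sketch, not part of the original text: it assumes a hypothetical encoding of ordinals below $\varepsilon_0$ as nested Python tuples (the tuple of Cantor normal form exponents, with `()` for 0), and the helper names `nat`, `omega_times`, `fund`, `iterate` and `S` are illustrative choices.

```python
# Assumed encoding (not from the text): an ordinal below epsilon_0 is the tuple
# of its Cantor normal form exponents in non-increasing order; () is 0.
ZERO = ()
ONE = (ZERO,)    # omega^0
OMEGA = (ONE,)   # omega^1


def nat(n):
    """The finite ordinal n, i.e. n summands omega^0."""
    return (ZERO,) * n


def omega_times(n):
    """The ordinal omega * n, i.e. n summands omega^1."""
    return (ONE,) * n


def fund(alpha, x):
    """x-th member alpha[x] of the fundamental sequence (Definition 10.6.1)."""
    if alpha == ZERO:
        return ZERO                       # 0[x] := 0
    head, gamma = alpha[:-1], alpha[-1]   # alpha = head + omega^gamma
    if gamma == ZERO:
        return head                       # omega^0[x] = 0, so alpha[x] is the predecessor
    if gamma[-1] == ZERO:                 # gamma = beta + 1
        beta = gamma[:-1]
        return head + (beta,) * (x + 1)   # omega^beta * (x + 1)
    return head + (fund(gamma, x),)       # omega^(gamma[x]) for limit gamma


def iterate(F, alpha, x):
    """F^alpha(x) with F^0(x) = x and F^alpha(x) = F^(alpha[x])(F(x))."""
    while alpha != ZERO:
        alpha, x = fund(alpha, x), F(x)   # alpha[x] uses the old x
    return x


def S(alpha, x):
    """The hierarchy over the successor function."""
    return iterate(lambda y: y + 1, alpha, x)


OMEGA_SQUARED = (nat(2),)

for x in range(5):
    print(x, S(nat(3), x), S(OMEGA, x), S(omega_times(3), x), S(OMEGA_SQUARED, x))
```

The loop in `iterate` is the defining recursion unrolled: each step passes to `alpha[x]` and applies `F` once, so evaluating `S(alpha, x)` takes exactly `S(alpha, x) - x` steps and is only practical for small ordinals and arguments.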
Therefore we have $x^2 + 2 \le S^{\omega^2}(x)$ and may use $F := S^{\omega^2}$ as basis for a subrecursive hierarchy. We claim that this hierarchy is eventually majorized by the hierarchy $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$. To prove the claim, we need a few observations. First, we remark that
$\alpha \in \mathrm{Lim} \Rightarrow x \le N(\alpha[x])$, (10.18)
which follows from the definition of $\alpha[x]$ by induction on $\alpha$. Next we observe
$\alpha < \beta \Rightarrow \alpha \le \beta[N(\alpha)]$. (10.19)
To secure (10.19) let $\beta =_{NF} \beta_1 + \dots + \beta_n$. If $\alpha \le \beta_1 + \dots + \beta_{n-1} + \beta_n[0] \le \beta[N(\alpha)]$ we are done. Therefore assume $\beta_1 + \dots + \beta_{n-1} + \beta_n[0] < \alpha < \beta_1 + \dots + \beta_n$. Then there is a $k$ such that $\beta_1 + \dots + \beta_{n-1} + \beta_n[k] < \alpha \le \beta_1 + \dots + \beta_{n-1} + \beta_n[k+1] = \beta[k+1]$. But then $\alpha = \beta_1 + \dots + \beta_{n-1} + \beta_n[k] + \gamma$ with $\gamma \ne 0$, which implies $N(\alpha) > k$ by (10.18). Hence $\alpha \le \beta[N(\alpha)]$.

We show that the fundamental sequences have the nesting property
$\alpha[x] < \beta < \alpha \Rightarrow \alpha[x] \le \beta[0]$. (10.20)
Let $\alpha =_{NF} \alpha_1 + \dots + \alpha_n$. Then $\alpha[x] = \alpha_1 + \dots + \alpha_{n-1} + \alpha_n[x] < \beta < \alpha_1 + \dots + \alpha_n$. Hence $\beta =_{NF} \alpha_1 + \dots + \alpha_{n-1} + \gamma_1 + \dots + \gamma_m$ such that $\alpha_n[x] < \gamma := \gamma_1 + \dots + \gamma_m$. Therefore there is a $\gamma_0 \ne 0$ such that $\gamma = \alpha_n[x] + \gamma_0$ and $\gamma[k] = \alpha_n[x] + \gamma_0[k]$ for all $k$. Hence $\alpha_n[x] \le \gamma[0]$, which implies $\alpha[x] \le \alpha_1 + \dots + \alpha_{n-1} + \gamma[0] = \beta[0]$.

We use the nesting property to show that the hierarchy $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$ is strongly increasing. We prove
$S^{\alpha}(x) + 1 < S^{\alpha}(x+1)$ for $\alpha \ne 0$ and $\alpha[y] < \beta < \alpha \Rightarrow S^{\alpha[y]}(x) < S^{\beta}(x)$ (10.21)
simultaneously by induction on $\alpha$ and side induction on $\beta$. If $\alpha = \alpha_0 + 1$, we obtain $S^{\alpha}(x) + 1 = S^{\alpha_0}(x+1) + 1 < S^{\alpha_0}(x+2) = S^{\alpha}(x+1)$ from the induction hypothesis for the first claim. If $\alpha \in \mathrm{Lim}$, we use the nesting property to obtain $\alpha[x] < \alpha[x+1] \le \beta[0] < \beta < \alpha$ and the induction hypothesis for both claims to obtain $S^{\alpha}(x) + 1 = S^{\alpha[x]}(x+1) + 1 < S^{\alpha[x]}(x+2) < S^{\alpha[x+1]}(x+2) = S^{\alpha}(x+1)$. To prove the second claim we again use the nesting property to obtain $\alpha[y] \le \beta[x]$, which, together with the induction hypothesis for both claims, implies $S^{\alpha[y]}(x) \le S^{\beta[x]}(x) < S^{\beta[x]}(x+1) = S^{\beta}(x)$.

The key property for the hierarchy is
$\alpha < \beta \Rightarrow (\forall x)[N(\alpha) \le x \Rightarrow S^{\alpha}(x+1) \le S^{\beta}(x)]$. (10.22)
We prove (10.22) by induction on $\beta$. From (10.19), we obtain $\alpha \le \beta[N(\alpha)] \le \beta[x]$. For $\alpha = \beta[x]$ we obtain $S^{\alpha}(x+1) = S^{\beta[x]}(x+1) = S^{\beta}(x)$. If $\alpha < \beta[x]$ we obtain from the induction hypothesis $S^{\alpha}(x+1) \le S^{\beta[x]}(x) < S^{\beta[x]}(x+1) = S^{\beta}(x)$.
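Properties (10.18), (10.19), (10.20) and (10.22) can be spot-checked on small ordinals. The sketch below is illustrative only and rests on three assumptions that are not stated in this section: ordinals are encoded as nested tuples of Cantor normal form exponents, the norm is $N(0) = 0$ and $N(\omega^{\alpha_1} + \dots + \omega^{\alpha_n}) = n + N(\alpha_1) + \dots + N(\alpha_n)$ (which agrees with $N(\omega) = 2$ as used above), and the ordinal order on such normal forms coincides with Python's lexicographic order on tuples. All function names are hypothetical.

```python
from itertools import product

ZERO = ()  # assumed encoding: tuple of CNF exponents, () is 0


def nat(n):
    """The finite ordinal n."""
    return (ZERO,) * n


def fund(alpha, x):
    """x-th member alpha[x] of the fundamental sequence (Definition 10.6.1)."""
    if alpha == ZERO:
        return ZERO
    head, gamma = alpha[:-1], alpha[-1]
    if gamma == ZERO:
        return head                       # predecessor of a successor
    if gamma[-1] == ZERO:
        return head + (gamma[:-1],) * (x + 1)
    return head + (fund(gamma, x),)


def norm(alpha):
    """Assumed norm: number of summands plus the norms of all exponents."""
    return len(alpha) + sum(norm(e) for e in alpha)


def S(alpha, x):
    """S^alpha(x) with S^0(x) = x and S^alpha(x) = S^(alpha[x])(x + 1)."""
    while alpha != ZERO:
        alpha, x = fund(alpha, x), x + 1
    return x


def is_limit(alpha):
    return alpha != ZERO and alpha[-1] != ZERO


def sample(bound):
    """Ordinals omega^2 * a + omega * b + c with a, b, c < bound."""
    return [(nat(2),) * a + (nat(1),) * b + nat(c)
            for a, b, c in product(range(bound), repeat=3)]


def check_structure(ordinals, xs):
    """Check (10.18), (10.19) and (10.20) on the given finite sample."""
    for alpha in ordinals:
        for x in xs:
            if is_limit(alpha) and not x <= norm(fund(alpha, x)):
                return False              # (10.18) fails
        for beta in ordinals:
            if alpha < beta and not alpha <= fund(beta, norm(alpha)):
                return False              # (10.19) fails
            for x in xs:
                if fund(alpha, x) < beta < alpha and not fund(alpha, x) <= fund(beta, 0):
                    return False          # (10.20) fails
    return True


def check_key_property(ordinals):
    """Check (10.22) for all pairs alpha < beta and a few x >= N(alpha)."""
    for alpha, beta in product(ordinals, repeat=2):
        if alpha < beta:
            for x in range(norm(alpha), norm(alpha) + 3):
                if not S(alpha, x + 1) <= S(beta, x):
                    return False
    return True


# (10.22) is only evaluated below omega^2, where the values of S stay small.
below_omega_squared = [o for o in sample(3) if o < (nat(2),)]
print(check_structure(sample(3), range(4)), check_key_property(below_omega_squared))
```

Such a finite check proves nothing, but it is a cheap guard against misreading the definitions. Property (10.21) is left out deliberately because evaluating $S^{\alpha}$ for $\alpha \ge \omega^2 \cdot 2$ is already infeasible with this naive loop.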
10.6.3 Lemma Let $F = S^{\omega^2}$. Then $\psi_F(\alpha) \le S^{\omega^2 \times (\alpha+1)}(N(\alpha) + 2)$.

Proof We induct on $\alpha$. For $\alpha = 0$ the claim is obvious. So assume $\alpha > 0$. Then $\psi_F(\alpha) = \psi_F(\eta) + 1$ for some $\eta < \alpha$ such that $N(\eta) \le F(N(\alpha))$. By the induction hypothesis we obtain $\psi_F(\eta) \le S^{\omega^2 \times (\eta+1)}(N(\eta) + 2)$. But $N(\omega^2 \times (\eta+1)) \le 3\cdot N(\eta) + 3$. Since $\omega^2 \times (\eta+1) \le \omega^2 \times \alpha$ we obtain $\psi_F(\eta) < S^{\omega^2 \times \alpha}(3\cdot N(\eta) + 3)$ by (10.22) and (10.21). Since $S^{\omega^2}(x) = 2^{x+1}(x+3) - 2$ we get $3\cdot S^{\omega^2}(x) + 3 < S^{\omega^2}(x+2)$ and therefore $3\cdot N(\eta) + 3 \le 3\cdot S^{\omega^2}(N(\alpha)) + 3 < S^{\omega^2}(N(\alpha) + 2)$ which implies
$\psi_F(\alpha) = \psi_F(\eta) + 1 \le S^{\omega^2 \times \alpha}(3\cdot N(\eta) + 3) \le S^{\omega^2 \times \alpha}(S^{\omega^2}(N(\alpha) + 2)) = S^{\omega^2 \times \alpha \mathbin{\#} \omega^2}(N(\alpha) + 2) = S^{\omega^2 \times (\alpha+1)}(N(\alpha) + 2)$.
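10.6.4 Theorem Let $F = S^{\omega^2}$. Then the subrecursive hierarchy $\{F_{\alpha} \mid \alpha < \varepsilon_0\}$ is eventually majorized by the hierarchy $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$.

Proof Because of
$F_{\alpha}(x) = \psi_F(\alpha + x) \le S^{\omega^2 \times (\alpha+x+1)}(N(\alpha) + x + 2) \le S^{\omega^2 \times (\alpha \mathbin{\#} \omega)}(3\cdot N(\alpha) + 3\cdot x + 3) = S^{\omega^2 \times (\alpha \mathbin{\#} \omega) + 3\cdot N(\alpha) + 3}(3x) \le S^{\omega^2 \times (\alpha \mathbin{\#} \omega) + 3\cdot N(\alpha) + 4}(x)$
we obtain that $\{F_{\alpha} \mid \alpha < \varepsilon_0\}$ is eventually majorized by $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$.

By Theorem 10.6.4, we have $SF_{\Pi^0_2}(NT) = \mathrm{PRH}(\{F_{\alpha} \mid \alpha < \varepsilon_0\}) \subseteq \mathrm{PRH}(\{S^{\alpha} \mid \alpha < \varepsilon_0\})$. Therefore $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$ generates at least all Skolem functions for the provable $\Pi^0_2$-sentences of NT. By induction on $\alpha < \varepsilon_0$ we obtain, however, that all the functions $S^{\alpha}$ are provably recursive functions of NT. Therefore $\mathrm{PRH}(\{S^{\alpha} \mid \alpha < \varepsilon_0\}) \subseteq SF_{\Pi^0_2}(NT) = \mathrm{PRH}(\{F_{\alpha} \mid \alpha < \varepsilon_0\})$ and we have seen that the primitive recursive hulls of both hierarchies coincide. Therefore we obtain the following theorem.

10.6.5 Theorem The hierarchy $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$ generates the class of Skolem functions of the provable $\Pi^0_2$-sentences of NT, i.e., $SF_{\Pi^0_2}(NT) = \mathrm{PRH}(\{S^{\alpha} \mid \alpha < \varepsilon_0\})$.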
The hierarchy $\{S^{\alpha} \mid \alpha < \varepsilon_0\}$ is a slight variant of the Hardy hierarchy. The original Hardy hierarchy is defined by
$H_{\alpha}(x) := \begin{cases} x & \text{if } \alpha = 0 \\ H_{\beta}(x+1) & \text{if } \alpha = \beta + 1 \\ H_{\alpha[x]}(x) & \text{if } \alpha \in \mathrm{Lim}. \end{cases}$
The difference is marginal and it is easy to show that the primitive recursive hulls of $S^{\alpha}$ and $H_{\alpha}$ coincide. It has become common to call hierarchies whose growth rate corresponds to the growth rate of the Hardy hierarchy "fast-growing" in contrast to the "slow-growing" hierarchy $G_{\alpha}$ which is defined by pointwise iteration of the successor function. Pointwise iteration of a function $P$ is defined by
$P^{p}_{\alpha}(x) := \begin{cases} x & \text{if } \alpha = 0 \\ P(P^{p}_{\beta}(x)) & \text{if } \alpha = \beta + 1 \\ P^{p}_{\alpha[x]}(x) & \text{if } \alpha \in \mathrm{Lim}. \end{cases}$
So we obtain $G_{\alpha+1}(x) = G_{\alpha}(x) + 1$ for $G_{\alpha}(x) := S^{p}_{\alpha}(x)$. Girard [34] was the first to show that the slow-growing hierarchy has to be iterated along the Howard–Bachmann ordinal $\psi(\varepsilon_{\Omega+1})$ to generate $SF_{\Pi^0_2}(NT)$. This result was later reobtained by others. Careful studies by Weiermann, however, have shown that the slow-growing hierarchy is extremely sensitive to variations in the definition of the fundamental sequences. There are assignments of fundamental sequences to limit ordinals which are sufficiently natural and leave the fast-growing hierarchies unchanged but alter the slow-growing hierarchies dramatically (cf. [107]). But these studies are far outside the scope of this book.
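The gap between the two hierarchies is already visible at $\omega^2$. The following sketch is an illustration under assumptions, not the book's construction: ordinals are encoded as nested tuples of Cantor normal form exponents, the fundamental sequences are those of Definition 10.6.1, and $G_0(x) = x$ is taken from the definition of pointwise iteration given above. The names `fund`, `hardy` and `slow` are hypothetical.

```python
ZERO = ()  # assumed encoding: tuple of CNF exponents, () is 0


def nat(n):
    """The finite ordinal n."""
    return (ZERO,) * n


def fund(alpha, x):
    """x-th member alpha[x] of the fundamental sequence (Definition 10.6.1)."""
    if alpha == ZERO:
        return ZERO
    head, gamma = alpha[:-1], alpha[-1]
    if gamma == ZERO:
        return head
    if gamma[-1] == ZERO:
        return head + (gamma[:-1],) * (x + 1)
    return head + (fund(gamma, x),)


def hardy(alpha, x):
    """Hardy hierarchy H_alpha(x): successor steps shift the argument."""
    while alpha != ZERO:
        if alpha[-1] == ZERO:             # H_{beta+1}(x) = H_beta(x + 1)
            alpha, x = alpha[:-1], x + 1
        else:                             # H_alpha(x) = H_{alpha[x]}(x)
            alpha = fund(alpha, x)
    return x


def slow(alpha, x):
    """Slow-growing G_alpha(x): successor steps add 1 to the result, x stays fixed."""
    steps = 0
    while alpha != ZERO:
        if alpha[-1] == ZERO:             # G_{beta+1}(x) = G_beta(x) + 1
            alpha, steps = alpha[:-1], steps + 1
        else:                             # G_alpha(x) = G_{alpha[x]}(x)
            alpha = fund(alpha, x)
    return x + steps                      # G_0(x) = x as in the text's definition


OMEGA_SQUARED = (nat(2),)
for x in range(6):
    print(x, hardy(OMEGA_SQUARED, x), slow(OMEGA_SQUARED, x))
```

With these conventions a short calculation gives $H_{\omega^2}(x) = 2^{x+1}(x+1) - 1$ but only $G_{\omega^2}(x) = x^2 + 3x + 1$: in the Hardy hierarchy every successor step increases the index used at the next limit, while in the slow-growing hierarchy the index $x$ never changes.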
Chapter 11
Ordinal Analysis for Kripke–Platek Set Theory with Infinity
The notion of “set” is the central notion of mathematics. All mathematical objects can be represented as sets. The set theoretical universe can therefore be regarded as the mathematical universe. The study of axiom systems for set theory should therefore be the central subject of proof theory. In the following section, we will present the first step into the study of axiom systems for set theory. The pioneering work in this direction is mainly due to Gerhard Jäger [48].
11.1 Naive Set Theory

We want to start with a rough heuristic. Set theory was introduced by Georg Cantor at the turn of the nineteenth to the twentieth century. He gave a definition of a set as a “collection of well-distinguished objects into a new object.” This is of course not a mathematical definition as we understand it today. The difficulties in defining “what is a set” are similar to the difficulties in defining “what is a vector.” Instead of defining vectors, we define vector spaces by their closure properties and say that a vector is an element of a vector space. In the same way, we will describe the set theoretical universe and declare a set to be a “member of the universe.” Following this line we try to describe the universe U.

The basic notion of set theory is the membership relation a ∈ b stating that the object a is a member of the object b. The formula a ∈ U then expresses that a is a member of the universe, which is synonymous with saying “a is a set.” The universe is then a “collection” of sets. A collection of sets, i.e., an object A having the property (∀x ∈ A)[x ∈ U], is commonly called a class. So we have to deal with two types of objects, classes and sets. One of the basic properties of the universe is that it is closed under the membership relation. All the members of a set are supposed to be sets themselves. Therefore, we postulate the axiom of transitivity

(Tran) a ∈ U ⇒ (∀x ∈ a)[x ∈ U]
W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, c Springer-Verlag Berlin Heidelberg 2009
which says that every set is also a class. The next thing to do is to define equality of classes. We say that two classes are equal if and only if they have the same elements, i.e., we postulate the axiom of extensionality

(Ext) A = B ⇔ (∀x ∈ A)[x ∈ B] ∧ (∀y ∈ B)[y ∈ A].

If F(x) is some property assigned to the set x, we can comprehend all the sets having this property into a class. As an abbreviation we write {x | F(x)} for this class and postulate the axiom of class comprehension

(Comp) a ∈ {x | F(x)} :⇔ a ∈ U ∧ F(a).

From (Comp) we see immediately that not all classes are sets. The famous counterexample is Russell’s class R := {x | x ∉ x}. The assumption R ∈ U leads to the contradiction R ∈ R ⇔ R ∉ R. So we have R ∉ U. This shows that the class R is somehow too big to be a set. Sets are classes which, in whatever sense, are “smaller than the universe.” So a subclass of a set, which is already a “small class,” should again be a set. Separating sets out of a set should again lead to a “small class.” Formulating this as an axiom gives the separation axiom

(Sep) a ∈ U ⇒ {x ∈ a | F(x)} ∈ U.

Another axiom in this guise is the collection axiom

(Coll) (∀a)[(∀x ∈ a)(∃y)F(x, y) ⇒ (∃z)(∀x ∈ a)(∃y ∈ z)F(x, y)]

saying that given a set a and, for every element x ∈ a, a witness y having the property F(x, y), there should already exist a “small class,” i.e., a set, z which contains such a witness for every x ∈ a. There are simple set operations such as pairing, i.e., forming the class {a, b} := {x | x = a ∨ x = b} from sets a, b, and union ⋃a := {x | (∃y ∈ a)[x ∈ y]}, which should not lead outside the universe. So we also introduce the axioms

(Pair) a ∈ U ∧ b ∈ U ⇒ {a, b} ∈ U

and

(Union) a ∈ U ⇒ ⋃a ∈ U.

Until now we do not have an axiom that postulates that the universe is inhabited. One possibility is to postulate that the empty class ∅ := {x | x ≠ x} is a set. But what we are really interested in are universes that contain infinite sets. Therefore, we introduce the axiom of infinity

(Inf) (∃x ∈ U)[x ≠ ∅ ∧ (∀y ∈ x)(∃z ∈ x)[y ∈ z]]

requiring the existence of a set x which is not empty and internally unbounded, i.e., for every element y of x there is still an element z ∈ x, which “majorizes y” in the sense of the ∈-relation. There is another requirement which is perhaps a bit harder to motivate. We want to exclude infinitely descending ∈-sequences, i.e., sequences of the form
11.1 Naive Set Theory
239
a0 # a1 # · · · # ai # an+1 # · · ·. This requirement is equivalent to the requirement that the ∈ relation is well-founded on U. So we have another axiom, the axiom of foundation (FOUND*) A ⊆ U ∧ (∃x) x ∈ A] ⇒ (∃y)[y ∈ A ∧ (∀z ∈ y)[z ∈ / A] . The foundation axiom especially excludes the existence of set that are “selfreferential”, i.e., the existence of a set a such that a ∈ a holds true. The axiomatization of set theory is connected with the names Ernst Zermelo and Abraham Fraenkel. We have essentially (still informally) introduced a part of the axioms, which today are known as Zermelo–Fraenkel set theory. However, two axioms are still lacking. The first is the power–set axiom. The class Pow(a) := {x (∀y ∈ x)[y ∈ a]} of all subsets of a set a is the power-class of a. The power-set axiom postulates that Pow(a) is again a set, i.e., (Pow)
a ∈ U ⇒ Pow(a) ∈ U.
From a foundational point of view the power-set axiom is more problematic. It is by no means clear why Pow(a) should again be a “small” class.1 But already in Cantor’s work Pow(a) was considered as a set. The second axiom, which for other reasons is somewhat problematic from a foundational point of view, is the axiom of choice postulating the existence of a choice function for every family of nonempty sets. Its formulation requires the definition of a function from sets into sets.
(AC) A ⊆ U ∧ (∀x ∈ A)(∃y)[y ∈ x] ⇒ (∃f)[Fun(f) ∧ (∀x ∈ A)[f(x) ∈ x]].
The formalization of the axioms (Ext), (Pair), (Union), (Sep), (Inf), (Pow), and (FOUND) in first-order logic (as presented in the forthcoming section) is commonly called Zermelo set theory. If (AC) is included one talks about Zermelo set theory with choice ZC. In Zermelo–Fraenkel set theory the axiom of separation (Sep) is replaced by (the formalization of) the axiom of replacement
(Repl) Fun(f) ∧ a ∈ U ⇒ {f(x) | x ∈ a} ∈ U
which again has the clear meaning that the image of a set a under a function f cannot be too big. By ZF we commonly denote Zermelo–Fraenkel set theory without the axiom of choice while ZFC stands for Zermelo–Fraenkel set theory with choice. If we want to state something for both ZF and ZFC we denote that by ZF(C). We want to mention that the axioms of ZFC are equivalent to the system of axioms (Ext), (Pair), (Union), (Sep), (Coll), (Pow), (Inf), (AC), and (Found), i.e., that the axiom of replacement is, on the basis of the other axioms, equivalent to separation plus collection.
1
Even ZFC is not sufficient to decide the size of the class Pow(a). It is still open which axioms are the right ones to decide the size of Pow(N) [113, 114]. The fact that Pow(a) is again a set is, however, not a debatable item among set theorists.
11 Ordinal Analysis for Kripke–Platek Set Theory with Infinity
11.2 The Language of Set Theory
Formalization of naive set theory as presented in the previous section would require a two-sorted language with a sort for classes and another sort for sets. However, classes play a role different from that of sets. Talking about all classes would mean to introduce a “superuniverse” containing all classes and thus “superclasses” as objects which collect classes but are not classes, i.e., elements of the superuniverse, themselves, etc. This shows that quantification over classes may cause problems. There are formalizations of set theory which involve quantifications over classes but the most common axiomatizations avoid classes as basic objects. In fact there is no need to talk about classes. We already introduced the notation {x | F(x)} as an “abbreviation” for the formula F(x). So it becomes possible to talk at least about definable classes and to formalize nearly all of the axioms of the previous section in terms of a one-sorted first-order logic. By Zermelo–Fraenkel set theory, one usually understands the axiomatization of set theory in a one-sorted first-order logic with identity. The only nonlogical symbol that is needed to formalize ZF(C) in a first-order language is a binary relation symbol ∈ for membership.
11.2.1 Definition The language L(∈) of set theory is the first-order language with identity not containing relation variables whose only nonlogical symbol is the binary relation constant ∈.
In the language L(∈) the axioms of the previous section take the following form:
(Ext) (∀x)(∀y)[x = y ↔ (∀z ∈ x)[z ∈ y] ∧ (∀z ∈ y)[z ∈ x]]
(Pair) (∀x)(∀y)(∃z)(∀u)[u ∈ z ↔ u = x ∨ u = y]
(Union) (∀x)(∃y)(∀u)[u ∈ y ↔ (∃z ∈ x)[u ∈ z]]
(Sep) (∀x)(∃y)(∀u)[u ∈ y ↔ u ∈ x ∧ F(u)]
(Coll) (∀a)[(∀x ∈ a)(∃y)F(x, y) → (∃z)(∀x ∈ a)(∃y ∈ z)F(x, y)]
where F is a formula in the language of set theory. If F contains further free variables we assume that (Sep) and (Coll) are the universal closure of these schemes.
(Inf) (∃x)[x ≠ ∅ ∧ (∀y ∈ x)(∃z ∈ x)[y ∈ z]]
where x ≠ ∅ can be expressed by (∃y)[y ∈ x] and
(FOUND) (∃x)F(x) → (∃y)[F(y) ∧ (∀z ∈ y)[¬F(z)]].
We leave the formulation of the power-set axiom and the axiom of choice in terms of L(∈) to the reader since we will not deal with these axioms in this book. Observe that the sacrifice of classes forced us to replace the axioms (Sep) and (Coll) by schemes. This is unavoidable. It can be shown that ZF(C) is not finitely
axiomatizable. In full ZF(C) it is possible to replace the foundation scheme (FOUND) by a single axiom
(Found) (∀a)[(∃y)(y ∈ a) → (∃z ∈ a)[(∀y ∈ z)(y ∉ a)]].
This is in general not true for subsystems of ZF(C). The remarks on the limits of first-order logic in Sect. 4.1 of course also apply to models of ZF(C). If M is a model of ZF(C) then M will have all the properties required in the informal Sect. 11.1 possibly except (Tran) and (FOUND*). Even if M |= (FOUND) we will not know that M is in fact well-founded (this cannot be secured in first-order logic). To remedy that one usually only regards well-founded and transitive models of ZF(C). It is no surprise that, similar to the situation in arithmetic, there is again no first-order axiomatization which fixes the set theoretical universe up to isomorphism. In the case of set theory, the situation is even worse because we do not have a picture of the “standard” set theoretical universe which is comparably clear to that of the standard structure of natural numbers.2 But there is a similarity. Every model of the axioms of NT contains N as an initial segment. Likewise every model of ZFC contains a least inner class model, i.e., a model L which contains all ordinals. This inner model, the constructible hierarchy invented by Gödel, will take over the role of the standard natural numbers in the proof-theoretical analysis of axiom systems for set theory.
2
This is of course only true as long as we restrict ourselves to the “first-order part” of the structure of natural numbers. That our intuition of its second-order part, i.e., the reals, is likewise unclear is often overlooked.
11.3 Constructible Sets
Whenever we have a set theoretical universe U, we can define a hierarchy V by the following clauses
V0 := ∅
Vα+1 := Pow(Vα)
Vλ := ⋃{Vξ | ξ < λ} for λ ∈ Lim
and
V := ⋃{Vξ | ξ ∈ On}.
This construction is due to von Neumann. The von Neumann-hierarchy is cumulative in the sense that α < β implies Vα ⊆ Vβ and Vλ is the union of all previous stages at limit ordinals λ. It is not too difficult to check that V, when constructed within a set theoretical universe U, is again a transitive model of ZF(C). It is moreover also easy to see that U = V if and only if U |= (FOUND). So every well-founded universe can be visualized as a von Neumann-hierarchy. But due to the unclear meaning of the power-set, we do not know too much about the von Neumann-hierarchy. Therefore, Gödel invented a hierarchy whose stages are more carefully built. Instead of taking the full power-set at the successor stage he only allows those members of the power-set which are constructible from the already obtained part of the universe. The key here is the notion of definability. If M is a set and F an L(∈)-sentence we write shortly M |= F for (M, ∈) |= F.
11.3.1 Definition Let M be a transitive set. A set a is definable from M if there is an L(∈)-formula F(x, y1, . . . , yn) and elements a1, . . . , an ∈ M such that a = {x ∈ M | M |= F(x, a1, . . . , an)}. Let Def(M) be the collection of all sets which are definable from M.
11.3.2 Definition Using definability we introduce the constructible hierarchy L by the following clauses.
L0 := ∅
Lα+1 := Def(Lα)
Lλ := ⋃{Lξ | ξ < λ} for λ ∈ Lim
and
L := ⋃{Lξ | ξ ∈ On}.
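The successor and union clauses can be tried out on the finite stages of the von Neumann-hierarchy. The following Python lines are only a sketch under stated assumptions: sets are modelled as nested `frozenset` values, only finite stages are computed, and the helper names `powerset`, `von_neumann_stage` and `is_transitive` are ours. For finite stages the two hierarchies coincide, because every subset of a finite transitive set is definable from its elements, so the same values also illustrate L0, . . . , L4.

```python
from itertools import chain, combinations


def powerset(s):
    """Return Pow(s): the frozenset of all subsets of the frozenset s."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def von_neumann_stage(n):
    """Return V_n for a finite n: V_0 is empty, V_{k+1} = Pow(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = powerset(stage)
    return stage


def is_transitive(a):
    """Tran(a): every element of an element of a is an element of a."""
    return all(y in a for x in a for y in x)


# Hypothetical illustration values: the first five stages and their sizes.
stages = [von_neumann_stage(n) for n in range(5)]
sizes = [len(v) for v in stages]
```

The list `sizes` shows how fast the full power-set grows, and `stages` can be used to check cumulativity and transitivity of the stages by direct inspection.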
Although we will not deal too much with the set theoretical properties of the constructible hierarchy, we are going to state some basic facts. For a more profound study, we recommend any textbook on set theory (e.g. [59, 21]).
11.3.3 Lemma For the stages of the constructible hierarchy we have
(A) (∀α ∈ On)[Tran(Lα)]
and
(B) α ≤ β ⇒ Lα ⊆ Lβ.
Proof We prove (A) and
(C) Tran(Lα) ⇒ Lα ⊆ Lα+1.
Property (B) then follows by induction on β. It is obvious for β = 0 and β ∈ Lim. For successor ordinals it follows from (A) and (C). To show (C) assume Tran(Lα) and x ∈ Lα. Then x ⊆ Lα and we immediately get x = {z ∈ Lα | z ∈ x} ∈ Lα+1. Hence Lα ⊆ Lα+1. We show (A) by induction on α. It is trivial for α = 0 and follows directly from the induction hypothesis for α ∈ Lim. If x ∈ y ∈ Lα+1 we have the induction hypothesis Tran(Lα) and obtain y = {z ∈ Lα | Lα |= F(z, a)} for some L(∈)-formula F and a tuple a of sets in Lα. Hence x ∈ Lα ⊆ Lα+1 by (C).
Once we have defined a set at stage α in the constructible hierarchy, we want to be sure that it will not change its meaning during the process of expanding L. This requires the notion of absoluteness. We call a formula absolute if it keeps its meaning in all higher stages. More generally we define absoluteness as follows.
11.3.4 Definition We call an L(∈)-formula F(x1, . . . , xn) upwards persistent if M |= F(a1, . . . , an) ⇒ N |= F(a1, . . . , an) for all transitive sets M, N such that M ⊆ N. We call F(x1, . . . , xn) downwards persistent if N |= F(a1, . . . , an) ⇒ M |= F(a1, . . . , an) and absolute if M |= F(a1, . . . , an) ⇔ N |= F(a1, . . . , an) is true for all transitive sets M, N with M ⊆ N. In all cases a1, . . . , an is an arbitrary tuple of elements in M.
Persistency and absoluteness are model-theoretically defined properties of formulas. Therefore, it would be convenient to have syntactical criteria for a formula being persistent or absolute. In the next definition, we introduce sufficient criteria for persistency and absoluteness in terms of the Lévy hierarchy of L(∈)-formulas. For these purposes, we consider bounded and unbounded quantifiers as different basic symbols. A quantifier Q is bounded if it only occurs in the form (Qx ∈ u). The meaning of (∀x ∈ u)[ · · · ] is of course (∀x)[x ∈ u → · · · ] while (∃x ∈ u)[ · · · ] means (∃x)[x ∈ u ∧ · · · ].
11.3.5 Definition The class of ∆0-formulas is the smallest class of L(∈)-formulas which contains all atomic formulas (u = v) and (u ∈ v) and is closed under the propositional connectives ¬, ∧, ∨, and bounded quantification.
The class of Σ -formulas is the smallest class which comprises all ∆0 -formulas and is closed under the positive propositional connectives ∧ and ∨, bounded quantification and unbounded ∃-quantification. Dually the Π -formulas form the smallest class, which comprises all ∆0 -formulas and is closed under the positive propositional connectives ∧ and ∨, bounded quantification and unbounded ∀-quantification. An L (∈)-formula is Σ1 if it has the form (∃x)Fu (x) for a ∆0 -formula F. Dually a formula is Π1 if it has the shape (∀x)Fu (x) with F a ∆0 -formula. It is obvious from the definition that every Σ1 -formula is also a Σ -formula and every Π1 -formula is also a Π -formula. The converse direction is in general false. A list of ∆0 -notions is displayed in Fig. 11.1. Recall that the ordered pair is definable by (x, y) := {{x}, {x, y}}.
Notion | Abbreviation | ∆0-definition
subset | a ⊆ b | (∀x ∈ a)[x ∈ b]
a is the pair {u, v} | a = {u, v} | (∀x ∈ a)[x = u ∨ x = v] ∧ u ∈ a ∧ v ∈ a
a is the ordered pair (u, v) | a = (u, v) | (∃x ∈ a)(∃y ∈ a)[x = {u} ∧ y = {u, v} ∧ a = {x, y}]
a = (x, y) for some y | x = P0(a) | (∃u ∈ a)(∃y ∈ u)[a = (x, y)]
a = (x, y) for some x | y = P1(a) | (∃u ∈ a)(∃x ∈ u)[a = (x, y)]
a is an ordered pair | Pair(a) | (∃u ∈ a)(∃x ∈ u)(∃y ∈ u)[a = (x, y)]
a is a relation | Rel(a) | (∀x ∈ a)[Pair(x)]
a is a function | Fun(a) | Rel(a) ∧ (∀x ∈ a)(∀y ∈ a)[P0(x) = P0(y) → P1(x) = P1(y)]
domain of a relation | dom(f) = a | Rel(f) ∧ (∀x ∈ f)[P0(x) ∈ a] ∧ (∀x ∈ a)(∃y ∈ f)[P0(y) = x]
range of a relation | rng(f) = a | Rel(f) ∧ (∀x ∈ f)[P1(x) ∈ a] ∧ (∀x ∈ a)(∃y ∈ f)[P1(y) = x]
field of a relation | field(f) = a | Rel(f) ∧ a = dom(f) ∪ rng(f)
image of an element | f(x) = a | Fun(f) ∧ (x, a) ∈ f
union of a set | a = ⋃b | (∀x ∈ a)(∃y ∈ b)[x ∈ y] ∧ (∀y ∈ b)(∀x ∈ y)[x ∈ a]
a is not empty | a ≠ ∅ | (∃x ∈ a)[x ∈ a]
a is transitive | Tran(a) | (∀x ∈ a)(∀y ∈ x)[y ∈ a]
a is an ordinal | a ∈ On | Tran(a) ∧ (∀x ∈ a)Tran(x)
a is a limit ordinal | a ∈ Lim | a ≠ ∅ ∧ a ∈ On ∧ (∀x ∈ a)(∃y ∈ a)[x ∈ y]
a belongs to a finite set | a ∈ {a1, . . . , an} | a = a1 ∨ · · · ∨ a = an
Fig. 11.1 Some ∆0-notions
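Several of these ∆0-notions can be evaluated mechanically on hereditarily finite sets, since all quantifiers range only over elements of the given sets. The following Python sketch assumes that sets are modelled as nested `frozenset` values and that the helper names (`numeral`, `components`, `is_function`, and so on) are ours; it mirrors the bounded-quantifier definitions of Pair, Rel, Fun, Tran and On from the figure for a few small examples.

```python
def numeral(n):
    """Return the finite von Neumann ordinal n = {0, ..., n-1}."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset({a})
    return a


def ordered_pair(u, v):
    """Kuratowski pair (u, v) := {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})


def components(a):
    """Return (x, y) with a = (x, y), searching only inside elements of a."""
    for u in a:
        for x in u:
            for y in u:
                if a == ordered_pair(x, y):
                    return (x, y)
    return None


def is_relation(a):
    """Rel(a): every element of a is an ordered pair."""
    return all(components(p) is not None for p in a)


def is_function(a):
    """Fun(a): a relation in which equal first components force equal second ones."""
    return is_relation(a) and all(
        components(p)[1] == components(q)[1]
        for p in a for q in a
        if components(p)[0] == components(q)[0]
    )


def is_transitive(a):
    """Tran(a): (for all x in a)(for all y in x)[y in a]."""
    return all(y in a for x in a for y in x)


def is_ordinal(a):
    """a in On: Tran(a) and every element of a is transitive."""
    return is_transitive(a) and all(is_transitive(x) for x in a)


zero, one, two = numeral(0), numeral(1), numeral(2)
f = frozenset({ordered_pair(zero, one), ordered_pair(one, one)})
g = frozenset({ordered_pair(zero, zero), ordered_pair(zero, one)})
```

Here `f` is the graph of a function on {0, 1} while `g` assigns two values to 0, so `is_function` separates them; `is_ordinal` accepts the numerals and rejects the set {1}, which is not transitive.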
11.3.6 Lemma Every Σ-formula is upwards persistent. Dually every Π-formula is downwards persistent. The ∆0-formulas are therefore absolute.
Proof It suffices to prove the lemma for Σ-formulas. The rest follows by dualization. So let F(x1, . . . , xn) be a Σ-formula, M, N transitive sets with M ⊆ N and a1, . . . , an ∈ M. We show
M |= F(a1, . . . , an) ⇒ N |= F(a1, . . . , an)   (i)
by induction on the complexity of the formula F(x1, . . . , xn). This is obvious for atomic formulas and follows directly from the induction hypothesis if F(x1, . . . , xn) is a boolean combination. So let F(x1, . . . , xn) be a formula (∀y ∈ x1)F0(y, x1, . . . , xn) and assume
M |= F0(b, a1, . . . , an) for all b ∈ a1 ∈ M.   (ii)
Since M is transitive we obtain from c ∈ a1 ∩ N already c ∈ M. Hence
N |= F0(c, a1, . . . , an) for all c ∈ a1 ∩ N   (iii)
by the induction hypothesis which means N |= (∀x ∈ a1)F0(x, a1, . . . , an). Finally assume that F(x1, . . . , xn) is a formula (∃y)F0(y, x1, . . . , xn) and
M |= (∃y)F0(y, a1, . . . , an).   (iv)
Then there is a b ∈ M ⊆ N such that M |= F0(b, a1, . . . , an) and we obtain
N |= F0(b, a1, . . . , an), i.e., N |= (∃y)F0(y, a1, . . . , an)   (v)
directly by the induction hypothesis. This includes of course also the case of a bounded existential quantifier.
11.3.7 Definition Let M be a set. We call F(x1, . . . , xn) a ∆-formula in M if there is a Σ-formula FΣ(x1, . . . , xn) and a Π-formula FΠ(x1, . . . , xn) such that M |= (∀x1) . . . (∀xn)[F(x1, . . . , xn) ↔ FΣ(x1, . . . , xn)] and M |= (∀x1) . . . (∀xn)[F(x1, . . . , xn) ↔ FΠ(x1, . . . , xn)].
Observe that the concept of a ∆-formula is no longer purely syntactical. In contrast to the purely syntactical concept of a ∆0-formula, it needs a proof to see that F is a ∆-formula in some set M.
11.3.8 Corollary Let M and N be transitive sets, M ⊆ N and F a ∆-formula in M as well as in N. Then F is absolute for M and N, i.e., we have M |= F(a1, . . . , an) if and only if N |= F(a1, . . . , an) for all a1, . . . , an ∈ M.
Proof This follows immediately from Lemma 11.3.6.
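A small finite example may help to see why unbounded existential quantifiers give only upward persistency while ∆0-formulas are absolute. The sketch below assumes sets are modelled as nested Python `frozenset` values; the names `sigma_holds` and `delta0_holds` and the choice of the formulas (∃y)[x ∈ y] and Tran(x) are ours, picked only for illustration.

```python
def numeral(n):
    """Return the finite von Neumann ordinal n = {0, ..., n-1}."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset({a})
    return a


def sigma_holds(universe, x):
    """Evaluate the Sigma_1-formula (exists y)[x in y], y ranging over `universe`."""
    return any(x in y for y in universe)


def delta0_holds(x):
    """Evaluate the Delta_0-formula Tran(x); its quantifiers are bounded by x,
    so the answer does not depend on the surrounding transitive universe."""
    return all(z in x for y in x for z in y)


M = numeral(2)   # the transitive set {0, 1}
N = numeral(3)   # the transitive set {0, 1, 2}, a superset of M
one = numeral(1)

in_M = sigma_holds(M, one)   # no element of M contains 1
in_N = sigma_holds(N, one)   # 2 = {0, 1} is a witness in N
```

The formula fails in M and holds in N for the parameter 1, so it is not absolute, but whenever it holds in M the same witness lies in N, as in case (iv) of the proof of Lemma 11.3.6.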
For the stage Lα in the constructible hierarchy we obtain Lα = {x ∈ Lα | Lα |= x = x}. Hence Lα ∈ Lα+1 and, since x = x is a ∆0-formula, this definition of Lα is absolute for all Lβ with β > α. For a set a ∈ L we define
rnkL(a) := min {η | a ∈ Lη}   (11.1)
and call rnkL(a) the L-rank of a.
In Sect. 3, we have shown that in the presence of the foundation scheme (FOUND) we can define
α ∈ On :⇔ Tran(α) ∧ (∀x ∈ α)Tran(x).
Being an ordinal is thus expressible by a ∆0-formula. The notion “α is an ordinal” is therefore absolute. We prove
Lα ∩ On = α and rnkL(α) = α + 1   (11.2)
by induction on α. Both claims are clear for α = 0. Next assume α ∈ Lim. Then
Lα ∩ On = ⋃{Lξ ∩ On | ξ < α} = ⋃{ξ | ξ < α} = α
by the induction hypothesis, and, since x ∈ On is absolute,
α = {x ∈ Lα | x ∈ On} = {x ∈ Lα | Lα |= x ∈ On} ∈ Lα+1.
Hence rnkL(α) ≤ α + 1, and rnkL(α) ≤ α would imply α ∈ Lη for some η ≤ α, contradicting the first claim just proved. In the successor case α = α0 + 1 we have the induction hypothesis Lα0 ∩ On = α0 and rnkL(α0) = α0 + 1, i.e., α0 ∈ Lα. Hence α = α0 ∪ {α0} ⊆ Lα ∩ On. Conversely, ξ ∈ Lα ∩ On implies ξ ⊆ Lα0 ∩ On = α0 which in turn implies ξ ≤ α0 < α. Hence Lα ∩ On ⊆ α and we have Lα ∩ On = α which is the first claim and entails α < rnkL(α). But by the first claim and the absoluteness of On we again have
α = {x ∈ Lα | x ∈ On} = {x ∈ Lα | Lα |= x ∈ On} ∈ Lα+1.
Hence rnkL(α) ≤ α + 1, which shows also the second claim.
The constructible hierarchy is obviously well-founded. Therefore, it follows immediately that all stages Lα satisfy the scheme (FOUND). But we have even more.
11.3.9 Theorem Let α > ω be a limit ordinal. Then Lα |= (Ext) ∧ (Pair) ∧ (Union) ∧ (Inf) ∧ (FOUND).
Proof Assume a, b ∈ Lα and a ≠ b. Then there is an x ∈ a such that x ∉ b or there is an x ∈ b such that x ∉ a. Since Lα is transitive we get (∃x ∈ Lα ∩ a)[x ∉ b] or (∃x ∈ Lα ∩ b)[x ∉ a]. Hence Lα |= a ≠ b → (∃x ∈ a)[x ∉ b] ∨ (∃x ∈ b)[x ∉ a], i.e., Lα |= (Ext). If a, b ∈ Lα then there is an η < α such that a, b ∈ Lη. But then {a, b} = {x ∈ Lη | Lη |= (x = a ∨ x = b)} ∈ Lη+1 ⊆ Lα. Hence Lα |= (Pair). If a ∈ Lα then there is an η < α such that a ∈ Lη and ⋃a = {x | (∃y ∈ a)[x ∈ y]}. Since Lη is transitive and (∃y ∈ a)[x ∈ y] is a ∆0-formula we obtain
⋃a = {x ∈ Lη | Lη |= (∃y ∈ a)[x ∈ y]} ∈ Lη+1 ⊆ Lα.
Hence Lα |= (Union). By (11.2) we have ω ∈ Lω+1 ⊆ Lα for every limit ordinal α > ω. Hence Lα |= (Inf). We already mentioned that Lα |= (FOUND) holds for all α.
Defining a set c := {x ∈ a | F(x, b1, . . . , bn)} by separation makes it dependent on the property expressed by the formula F(x, b1, . . . , bn). To make sure that this set remains unaltered during the expansion process of the universe, we must only allow absolute properties in defining sets by separation. Therefore, we restrict the separation scheme to properties which can be expressed by ∆0-formulas, a notion which is still syntactically checkable. Let us denote the scheme (Sep) in which the formulas F(u) are restricted to ∆0-formulas by (∆0–Sep). Analogously we denote by (∆0–Coll) the collection scheme restricted to ∆0-formulas F(x, y).
11.3.10 Theorem For any limit ordinal α we have Lα |= (∆0–Sep).
Proof Let a ∈ Lα, F(x, y1, . . . , yn) be a ∆0-formula and b1, . . . , bn ∈ Lα. Again there is an η < α such that a, b1, . . . , bn ∈ Lη. By absoluteness and transitivity of Lη we obtain {x ∈ a | F(x, b1, . . . , bn)} = {x ∈ Lη | Lη |= x ∈ a ∧ F(x, b1, . . . , bn)} ∈ Lη+1. But α ∈ Lim implies Lη+1 ⊆ Lα.
11.4 Kripke–Platek Set Theory
We have seen that the constructible hierarchy at limit levels already satisfies a certain amount of closure conditions. Axiomatizing these closure conditions we obtain a set BST of basic axioms, which are satisfied by any rudimentarily closed universe. The acronym BST stands for Basic Set Theory. Our aim is to analyze these axioms proof-theoretically. Therefore, we reformulate these axioms more parsimoniously. Since ∆0-separation is among the axioms of basic set theory, it suffices to require the existence of supersets of pair, union, etc., to obtain the axioms of pairing and union. The precise sets can then be separated by ∆0-separation. The axioms of BST comprise, therefore, the following sentences and schemes.
(Ext’) (∀u)(∀v)[(∀x ∈ u)[x ∈ v] ∧ (∀y ∈ v)[y ∈ u] → u = v]
(Pair’) (∀u)(∀v)(∃z)[u ∈ z ∧ v ∈ z]
(Union’) (∀u)(∃z)[(∀x ∈ u)[x ⊆ z]]
(∆0–Sep) (∀v1) . . . (∀vn)(∀u)(∃z)[(∀x ∈ z)[x ∈ u ∧ F(x, v1, . . . , vn)] ∧ (∀x ∈ u)[F(x, v1, . . . , vn) → x ∈ z]],
where F(x, v1, . . . , vn) is a ∆0-formula, and the scheme of foundation
(FOUND) (∀v)[(∃x)F(x, v) → (∃x)[F(x, v) ∧ (∀y ∈ x)[¬F(y, v)]]].
We have not included the axiom of infinity into BST. Therefore, we need an axiom that secures the existence of at least one set. We add the axiom
(Nullset) (∃x)(∀y)[y ∉ x]
securing the existence of the empty set. Augmenting BST by absolute collection, i.e.,
(∆0–Coll) (∀v)(∀u)[(∀x ∈ u)(∃y)F(x, y, v) → (∃z)(∀x ∈ u)(∃y ∈ z)F(x, y, v)]
for ∆0-formulas F(x, y, v), we obtain the axiom system KP of Kripke–Platek set theory. Adding also the axiom
(Inf’) (∃z)[z ∈ Lim]
which replaces the axiom of infinity (and makes axiom (Nullset) superfluous) we obtain the system KPω of Kripke–Platek set theory with infinity. These systems are profoundly studied in Barwise’s book [4]. A transitive set A is called admissible if A |= KP; if A |= KPω we call A admissible above ω. An ordinal α is called admissible if Lα |= KP and admissible above ω if Lα |= KPω. Admissible sets and ordinals are important because they possess enough closure properties to develop an abstract recursion theory. The theory of admissible sets and ordinals is not the topic of this book. Those who are interested in learning more about admissible sets and recursion theory on admissible ordinals are recommended to consult Barwise’s and Hinman’s books [4] and [45]. We will restrict ourselves to sketching the main features which are needed for our coming studies. First we check the closure properties of admissible sets.
11.4.1 Lemma Any admissible set is closed under pairs, ordered pairs, unions, intersections, and Cartesian products.
Proof Let A be an admissible set and u, v ∈ A. By (Pair’) there is an a ∈ A such that u, v ∈ a. By (∆0–Sep) we obtain {x ∈ a | x = u ∨ x = v} ∈ A. Hence {u, v} ∈ A. If u ∈ A we obtain by (Union’) a v ∈ A such that (∀x ∈ u)[x ⊆ v]. By (∆0–Sep) we get ⋃u = {x ∈ v | (∃y ∈ u)[x ∈ y]} ∈ A. Since (u, v) = {{u}, {u, v}} we obtain for u, v ∈ A also (u, v) ∈ A. So we have closure under ordered pairs. For sets a ∈ A such that a ≠ ∅ we get
⋂a = {x | (∀y ∈ a)[x ∈ y]} = {x ∈ ⋃a | (∀y ∈ a)[x ∈ y]}
which exists by closure under unions and (∆0–Sep). So we have closure under intersections. Observe that for a, b ∈ A also a ∪ b = ⋃{a, b} ∈ A and a ∩ b = ⋂{a, b} ∈ A, which shows that we also have closure under finite unions and intersections. Finally we want to show that admissible sets are closed under Cartesian products. This is not immediately clear from the defining axioms, since there is no power-set axiom in KP. For a, b ∈ A we have
a × b = {(u, v) | u ∈ a ∧ v ∈ b}.   (i)
If we succeed in finding a set c ∈ A such that
a × b = {x ∈ c | x = (u, v) ∧ u ∈ a ∧ v ∈ b}   (ii)
we obtain a × b ∈ A by (∆0–Sep). Since we have closure under ordered pairs we obtain for every x ∈ a
(∀y ∈ b)(∃z)[z = (x, y)]   (iii)
which by (∆0–Coll) entails
(∀x ∈ a)(∃wx)(∀y ∈ b)(∃z ∈ wx)[z = (x, y)].   (iv)
Applying (∆0–Coll) again we obtain
(∃c0)(∀x ∈ a)(∃w ∈ c0)(∀y ∈ b)(∃z ∈ w)[z = (x, y)].   (v)
Since A is closed under unions we have c := ⋃c0 ∈ A and obtain from (v)
(∀x ∈ a)(∀y ∈ b)[(x, y) ∈ c].
Hence a × b = {x ∈ c | x = (u, v) ∧ u ∈ a ∧ v ∈ b} ∈ A.
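The shape of the argument for Cartesian products (collect, collect again, take the union, then separate) can be imitated on finite sets. The following Python sketch is only an analogy under stated assumptions: sets are nested `frozenset` values, the two collection steps are replaced by explicit comprehensions that produce one possible choice of the sets wx and c0, and all function names are ours.

```python
def numeral(n):
    """Return the finite von Neumann ordinal n = {0, ..., n-1}."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset({a})
    return a


def ordered_pair(u, v):
    """Kuratowski pair (u, v) := {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})


def big_union(s):
    """Return the union of all elements of s."""
    return frozenset(x for y in s for x in y)


def cartesian_product(a, b):
    """Build a x b following steps (iii)-(v) and a final separation."""
    # For each x in a, one set w_x containing all pairs (x, y) with y in b.
    c0 = frozenset(frozenset(ordered_pair(x, y) for y in b) for x in a)
    # c := union of c0 contains every pair (x, y) with x in a and y in b.
    c = big_union(c0)
    # Separation: keep exactly the elements of c that are such pairs.
    return frozenset(
        p for p in c
        if any(p == ordered_pair(u, v) for u in a for v in b)
    )


product = cartesian_product(numeral(2), numeral(3))
```

In this finite setting the separation step removes nothing, because `c` was built to contain only the wanted pairs; in KP the set c obtained from collection may be larger, which is why (∆0–Sep) is needed there.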
For an L(∈)-formula F and a set a we denote by F^a the formula which is obtained from F by restricting all quantifiers to a. It is obvious that F^a is always a ∆0-formula. The fact that Σ-formulas are upwards persistent and Π-formulas downwards persistent can be formally expressed by
F^a ∧ a ⊆ b → F^b   (11.3)
and
F^a → F   (11.4)
for Σ-formulas F and
F^b ∧ a ⊆ b → F^a   (11.5)
and
F → F^a   (11.6)
for Π-formulas. Observe that (11.3) through (11.6) are theorems of pure logic. The proof is a straightforward induction on the complexity of the formula F.
11.4.2 Theorem (Σ-reflection) Let F(v) be a Σ-formula with only the shown variables free. Then
KP ⊢ (∀v)(∃a)[F(v) ↔ F(v)^a].
Proof As just stated the direction from right to left holds for logical reasons. We prove the opposite direction by induction on the complexity of the Σ-formula F and argue in an arbitrary model A of KP. If F(v) is ∆0 then F(v)^a is equal to F(v) and we have nothing to show. If F(v) is a Boolean combination, say F(v) ≡ F0(v) ∧ F1(v), we obtain by the induction hypothesis sets a0 and a1 in A such that F0(v) ↔ F0(v)^a0 and F1(v) ↔ F1(v)^a1. But then ai ⊆ a := a0 ∪ a1 ∈ A and the claim follows from (11.3). So assume that F(v) is a formula (∀x ∈ v1)F0(x, v). Let v ∈ A such that F(v) is true in A. Then we have by induction hypothesis
(∀x ∈ v1)(∃a0)F0(x, v)^a0.   (i)
By (∆0–Coll) we obtain from (i)
(∃z)(∀x ∈ v1)(∃a0 ∈ z)F0(x, v)^a0.   (ii)
But then a0 ⊆ a := ⋃z ∈ A and together with (11.3) we get
(∃a)(∀x ∈ v1)F0(x, v)^a.   (iii)
Let finally F(v) ≡ (∃y)F0(y, v) and v ∈ A such that A |= F(v). By induction hypothesis we have
(∃y)F0(y, v) → (∃y)(∃a0)F0(y, v)^a0.   (iv)
Let y ∈ A be a witness for (∃y)F0(y, v) and a := a0 ∪ {y}. Then we have a0 ⊆ a ∈ A and y ∈ a which imply (∃a)(∃y ∈ a)F0(y, v)^a.
11.4.3 Theorem (Σ-Collection) Let F(x, y, v) be a Σ-formula. Then KP proves
(∀v)[(∀x ∈ a)(∃y)[F(x, y, v)] → (∃b)[(∀x ∈ a)(∃y ∈ b)[F(x, y, v)] ∧ (∀y ∈ b)(∃x ∈ a)F(x, y, v)]].
Proof We argue in an arbitrary model A of KP. Since (∀x ∈ a)(∃y)[F(x, y, v)] is a Σ-formula we find a set c such that (∀x ∈ a)(∃y ∈ c)[F(x, y, v)^c] by Σ-reflection. Put b := {y ∈ c | (∃x ∈ a)[F(x, y, v)^c]}. Then b ∈ A by (∆0–Sep). Since F(x, y, v)^c → F(x, y, v) by (11.4) we finally get (∀x ∈ a)(∃y ∈ b)F(x, y, v) and (∀y ∈ b)(∃x ∈ a)F(x, y, v).
11.4.4 Theorem (∆-Separation) Let F(x, v) be a Σ-formula and G(x, v) a Π-formula such that KP ⊢ (∀v)(∀x)[F(x, v) ↔ G(x, v)]. Then KP proves that {x ∈ u | F(x, v)} is a set. I.e., more precisely,
KP ⊢ (∀v)(∀u)(∃z)[z = {x ∈ u | F(x, v)}].
Proof We argue in an arbitrary model A of KP. Let d := {x ∈ u | F(x, v)}. By Σ-reflection we find a set c ∈ A such that
(∀x ∈ u)[F(x, v)^c ∨ ¬G(x, v)^c].   (i)
Then
z := {x ∈ u | F(x, v)^c} ∈ A   (ii)
by (∆0–Sep). If x ∈ z we get F(x, v)^c which by upwards persistency entails F(x, v). Hence x ∈ d. If x ∈ d we have x ∈ u and F(x, v) which by hypothesis entails G(x, v). By downward persistency it follows G(x, v)^c which by (i) entails F(x, v)^c. Hence x ∈ z. So d = z ∈ A.
11.4.5 Theorem (Σ-replacement) Let F(x, y, v) be a Σ-formula. Then KP proves
(∀v)[(∀x ∈ a)(∃!y)[F(x, y, v)] → (∃f)[Fun(f) ∧ dom(f) = a ∧ (∀x ∈ a)[F(x, f(x), v)]]].
Proof We argue in an arbitrary model A of KP. Assume (∀x ∈ a)(∃!y)[F(x, y, v)]. By Σ-collection there is a b ∈ A such that (∀x ∈ a)(∃y ∈ b)[F(x, y, v)]. Because of the uniqueness condition we obtain
f := {(x, y) ∈ a × b | F(x, y, v)} = {(x, y) ∈ a × b | (∀z)[F(x, z, v) → z = y]}.
Hence f ∈ A by ∆-Separation and f is a function with domain a which satisfies the claim.
11.4.6 Theorem (Strong Σ-replacement) Let F(x, y, v) be a Σ-formula. Then KP proves
(∀v)[(∀x ∈ a)(∃y)[F(x, y, v)] → (∃f)[Fun(f) ∧ dom(f) = a ∧ (∀x ∈ a)[f(x) ≠ ∅] ∧ (∀x ∈ a)(∀y ∈ f(x))[F(x, y, v)]]].
Proof We argue in an arbitrary KP model A. Assume (∀x ∈ a)(∃y)[F(x, y, v)]. By Σ-collection there is a set b ∈ A such that
(∀x ∈ a)(∃y ∈ b)[F(x, y, v)]   (i)
and
(∀y ∈ b)(∃x ∈ a)[F(x, y, v)].   (ii)
By Σ-reflection we find a c ∈ A such that
(∀x ∈ a)(∃y ∈ b)[F(x, y, v)]^c and (∀y ∈ b)(∃x ∈ a)[F(x, y, v)]^c.   (iii)
For x ∈ a we obtain dx := {y ∈ b | F(x, y, v)^c} ∈ A. Since dx is uniquely determined by x ∈ a we obtain by Σ-replacement a function f ∈ A such that dom(f) = a and f(x) = dx. The function f then obviously satisfies the claim.
It is a useful observation that we may extend the language L(∈) by new relation and function symbols without altering the class of ∆- and Σ-formulas, provided that the new relation symbols have a ∆-definition and the new function symbols a Σ-definition. We state the theorem omitting its (more or less standard) proof, which can be found in [4].
11.4.7 Theorem Let F(v) be a Σ-formula and G(v) be a Π-formula such that KP proves (∀v)[F(v) ↔ G(v)]. Moreover let F′(v, y) be a Σ-formula such that KP proves (∀v)(∃!y)F′(v, y). We extend the language L(∈) to the language L(∈)∗ by adding new relation symbols R_F and new function symbols F_{F′}. Let KP∗ be the theory in the language L(∈)∗ which comprises all axioms of KP together with the defining axioms (∀v)[R_F(v) ↔ F(v)] and (∀v)F′(v, F_{F′}(v)) for the new symbols. We call R_F a ∆-relation symbol and F_{F′} a Σ-function symbol. The theory KP∗ is an extension by definitions of KP, i.e., if KP∗ ⊢ H for an L(∈)-formula H then also KP ⊢ H, and for every L(∈)∗-formula H there is an L(∈)-formula H′ such that KP∗ ⊢ H ↔ H′. Moreover, if H is a Σ-formula then H′ is again a Σ-formula and if H is a ∆-formula then H′ is also a ∆-formula, i.e., the classes of Σ-, Π- and ∆-formulas are preserved under such extensions.
Although class terms are not part of the language of KPω we will sometimes use class terms of the form {x | F(x)}. The formula s ε {x | F(x)} will then be an “abbreviation” for the formula F(s). We will sometimes even write s ∈ {x | F(x)}. The next theorem we are aiming at is the Σ-recursion theorem. It is the Σ-recursion theorem that accounts for the fact that admissible sets are good candidates for generalizations of recursion theory.
The first step is to show that KPω proves the existence of the transitive closure of a set.3 By the transitive closure TC(a) of a set a we mean the least transitive set b such that a ⊆ b.
11.4.8 Theorem The theory KPω proves that every set possesses a transitive closure.
3
This is even true for KP. The proof there needs, however, a bit more care [4].
Proof We work informally in KPω. Let
α = ω :⇔ α ∈ Lim ∧ (∀ξ ∈ α)[ξ ∉ Lim].
Using (Inf’) we obtain (∃!α)[α = ω], i.e., KPω proves the existence of ω. Let
P(f, n, a) :⇔ Fun(f) ∧ n ∈ ω ∧ dom(f) = n + 1 ∧ f(0) = a ∧ (∀m ∈ n)[f(m + 1) = ⋃f(m)].
Then we obtain
P(f, n, a) ∧ P(g, n, a) ⇒ f = g   (i)
by ∈-induction and again by ∈-induction
(∀n ∈ ω)(∃f)P(f, n, a)   (ii)
expanding f with domain n + 1 to f ∪ {(n + 1, ⋃f(n))}. By (i) and (ii) we have (∀n ∈ ω)(∃!f)P(f, n, a) and by Σ-replacement (Theorem 11.4.5) we obtain a function F such that dom(F) = ω, (∀n ∈ ω)P(F(n), n, a) and F(m) = F(n)↾(m + 1) for m < n. Now define TC(a) := ⋃{F(n)(n) | n ∈ ω}. Since a = F(0)(0) we obtain a ⊆ TC(a). If x ∈ y ∈ TC(a) there is an n ∈ ω such that y ∈ F(n)(n) and we obtain x ∈ ⋃F(n)(n) = F(n + 1)(n + 1), which shows x ∈ TC(a). Hence Tran(TC(a)). If a ⊆ b and Tran(b) we show
F(n)(n) ⊆ b   (iii)
by induction on n. F(0)(0) = a ⊆ b holds by hypothesis. By induction hypothesis we have F(n)(n) ⊆ b which entails F(n + 1)(n + 1) = ⋃F(n)(n) ⊆ ⋃b ⊆ b by transitivity of b. This proves that TC(a) is the least transitive set which comprises a.
11.4.9 Theorem (Induction along the transitive closure) The theory KP proves
(∀v)[(∀x)[(∀y ∈ TC(x))F(y, v) → F(x, v)] → (∀x)F(x, v)].
Proof By ∈-induction we prove
(∀x)(∀y ∈ TC(x))F(y, v).   (i)
The claim follows from (i) because x ∈ TC({x}). To prove (i) we have the induction hypothesis
(∀z ∈ x)(∀y ∈ TC(z))F(y, v).   (ii)
From (ii) we get (∀z ∈ x)F(z, v). Therefore, we have F(y, v) for all y in the set x ∪ ⋃{TC(z) | z ∈ x} = TC(x).
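For hereditarily finite sets the iteration f(0) = a, f(m + 1) = ⋃f(m) from the proof of Theorem 11.4.8 terminates after finitely many steps and can be run directly. The following Python sketch assumes sets are modelled as nested `frozenset` values; the names `big_union`, `transitive_closure` and `is_transitive` are ours. It also lets one check the identity TC(x) = x ∪ ⋃{TC(z) | z ∈ x} used in the proof of Theorem 11.4.9 on an example.

```python
def big_union(s):
    """Return the union of all elements of s."""
    return frozenset(x for y in s for x in y)


def transitive_closure(a):
    """Return TC(a) as the union of the levels a, U a, U U a, ...

    For hereditarily finite a the levels become empty eventually,
    so the loop terminates.
    """
    level = a
    closure = a
    while level:
        level = big_union(level)
        closure = closure | level
    return closure


def is_transitive(a):
    """Tran(a): every element of an element of a is an element of a."""
    return all(y in a for x in a for y in x)


# Hypothetical example: a = {{{0}}} with 0 the empty set.
x3 = frozenset()
x2 = frozenset({x3})
x1 = frozenset({x2})
a = frozenset({x1})
tc = transitive_closure(a)

# Right-hand side of TC(x) = x ∪ ⋃{TC(z) | z ∈ x} for x = a.
rhs = a | big_union(frozenset(transitive_closure(z) for z in a))
```

The set `a` itself is not transitive, while `tc` adds exactly the missing elements of elements.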
The next theorem is the most important theorem of Kripke–Platek set theory.
11.4.10 Theorem (Σ -recursion theorem). Let G be an n + 2-ary Σ -function sym∗ bol. Then KP proves the existence of an n + 1-ary Σ -function symbol F satisfying F(v, x) = G(v, x, FTC(x)), where FTC(x) := {(z, F(v, z)) z ∈ TC(x)}. ∗
Proof We work informally in KP∗. The proof is essentially the familiar proof of transfinite recursion, with attention paid to the restricted means of KP∗ and to the complexity of the definitions involved. Let
\[ P(f, y, \vec v, z) :\Leftrightarrow Fun(f) \land dom(f) = TC(y) \land (\forall u \in dom(f))[f(u) = G(\vec v, u, f{\restriction}TC(u))] \land z = G(\vec v, y, f). \]
First we show by induction on $TC(y)$
\[ P(f, y, \vec v, z) \land P(f', y, \vec v, z') \Rightarrow f = f' \land z = z'. \tag{i} \]
We have the induction hypothesis
\[ (\forall u \in TC(y))(\forall f)(\forall f')(\forall z)(\forall z')[P(f, u, \vec v, z) \land P(f', u, \vec v, z') \Rightarrow f = f' \land z = z']. \tag{ii} \]
From $u \in TC(y) \land P(f, y, \vec v, z)$ we obtain $P(f{\restriction}TC(u), u, \vec v, f(u))$ by definition of $P$. Therefore, we obtain by the induction hypothesis $f{\restriction}TC(u) = f'{\restriction}TC(u)$ and $f(u) = f'(u)$ for all $u \in TC(y)$. Hence $f = f'$, which implies $z = G(\vec v, y, f) = G(\vec v, y, f') = z'$.
Next we show
\[ (\forall y)(\exists f)(\exists z)P(f, y, \vec v, z) \tag{iii} \]
by induction on $TC(y)$. We have the induction hypothesis
\[ (\forall u \in TC(y))(\exists f_u)(\exists z_u)P(f_u, u, \vec v, z_u) \tag{iv} \]
which by (i) entails
\[ (\forall u \in TC(y))(\exists! f_u)(\exists! z_u)P(f_u, u, \vec v, z_u). \tag{v} \]
Using Σ-replacement we obtain the function $f := \{(u, z_u) \mid u \in TC(y)\}$. Let $u \in TC(y)$. For $v \in TC(u) \subseteq TC(y)$ we have $P(f_v, v, \vec v, z_v)$ and obtain $f_v = f_u{\restriction}TC(v)$ by induction on $TC(v)$, because $x \in TC(v)$ implies $f_v(x) = G(\vec v, x, f_v{\restriction}TC(x)) = G(\vec v, x, f_u{\restriction}TC(x)) = f_u(x)$. By (v) we now obtain $P(f_u{\restriction}TC(v), v, \vec v, z_v)$, i.e., $f_u(v) = z_v = f(v)$. Hence $f{\restriction}TC(u) = f_u$. Again by (v) we have $P(f{\restriction}TC(u), u, \vec v, z_u)$, which implies $f(u) = z_u = G(\vec v, u, f{\restriction}TC(u))$. Therefore, $f(u) = z_u = G(\vec v, u, f{\restriction}TC(u))$ is true for all $u \in TC(y)$, which shows $P(f, y, \vec v, G(\vec v, y, f))$ and terminates the proof of (iii). By (iii) and (i) we have
\[ (\forall y)(\exists! z)(\exists f)P(f, y, \vec v, z) \tag{vi} \]
and we may introduce a Σ-function symbol $F$ together with the defining axiom
\[ F(\vec v, y) = z \Leftrightarrow (\exists f)P(f, y, \vec v, z). \tag{vii} \]
Then we have
\[ F(\vec v, y) = G(\vec v, y, f) \quad\text{if } P(f, y, \vec v, G(\vec v, y, f)). \tag{viii} \]
But $P(f, y, \vec v, G(\vec v, y, f))$ implies $P(f{\restriction}TC(u), u, \vec v, f(u))$ for $u \in TC(y)$. Hence $f(u) = F(\vec v, u)$ for all $u \in TC(y)$, i.e., $f = \{(u, F(\vec v, u)) \mid u \in TC(y)\}$. So we have
\[ F(\vec v, y) = G(\vec v, y, \{(u, F(\vec v, u)) \mid u \in TC(y)\}) \]
which finishes the proof.

There are variations and consequences of the Σ-recursion theorem.
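As a finite illustration of the recursion scheme of Theorem 11.4.10, the following Python sketch evaluates a function defined by recursion along the transitive closure on hereditarily finite sets. It is only an analogy under stated assumptions: sets are modelled as `frozenset` values, the parameters are omitted, and the names `tc`, `recursor` and `G_rank` are hypothetical helpers that do not occur in the text.

```python
from functools import lru_cache

def tc(x):
    """Transitive closure TC(x): elements, elements of elements, and so on."""
    out = set()
    stack = list(x)
    while stack:
        z = stack.pop()
        if z not in out:
            out.add(z)
            stack.extend(z)
    return frozenset(out)

def recursor(G):
    """Return F with F(x) = G(x, {(z, F(z)) | z in TC(x)}).

    The restriction of F to TC(x) is passed to G as a dict.
    Termination relies on the well-foundedness of membership.
    """
    @lru_cache(maxsize=None)
    def F(x):
        return G(x, {z: F(z) for z in tc(x)})
    return F

# Example (an assumption for illustration): the set-theoretic rank,
# rank(x) = sup{rank(z) + 1 | z in x}.
def G_rank(x, f):
    return max((f[z] + 1 for z in x), default=0)

rank = recursor(G_rank)

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
print(rank(two))
```

Note that `G_rank` only consults the values of `f` on the elements of `x`, which corresponds to the variant stated in Theorem 11.4.11.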
11.4.11 Theorem Let $G$ be an $(n+2)$-ary Σ-function symbol. Then KP∗ proves the existence of an $(n+1)$-ary Σ-function symbol $F$ such that
\[ F(\vec v, y) = G(\vec v, y, \{(z, F(\vec v, z)) \mid z \in y\}). \]
Proof Define $G'(\vec v, y, f) := G(\vec v, y, f{\restriction}y)$. Then we obtain by Theorem 11.4.10 a Σ-function $F$ such that
\[ F(\vec v, y) = G'(\vec v, y, \{(z, F(\vec v, z)) \mid z \in TC(y)\}) = G(\vec v, y, \{(z, F(\vec v, z)) \mid z \in y\}). \]
Another special form is the definition of Σ-functions on ordinals. Here we start with a Σ-function $G$ and define
\[ F(\vec v, \alpha) = G(\vec v, \alpha, F{\restriction}\alpha). \tag{11.7} \]
Assertion (11.7) is mostly applied in the following form.
11.4.12 Theorem Let $G_0$, $G_S$, and $G_L$ be Σ-function symbols. Then KP∗ proves the existence of a Σ-function symbol $F$ such that $dom(F) = On$ and
(0) $F(\vec v, 0) = G_0(\vec v)$,
(S) $F(\vec v, \alpha + 1) = G_S(\vec v, \alpha + 1, F(\vec v, \alpha))$,
(L) $F(\vec v, \lambda) = G_L(\vec v, \lambda, F{\restriction}\lambda)$ for $\lambda \in Lim$.

Proof Working informally in KP∗ we define a function symbol $G$ by
\[ G(\vec v, \alpha, f) := \begin{cases} G_0(\vec v) & \text{if } \alpha = 0 \\ G_S(\vec v, \alpha, f(\beta)) & \text{if } \alpha = \beta + 1 \\ G_L(\vec v, \alpha, f) & \text{if } \alpha \in Lim. \end{cases} \]
It is obvious that $G$ is again a Σ-function symbol. Applying (11.7) to $G$ yields the claim.

It is also obvious that we can apply the Σ-recursion theorem in the form that for a Σ-function symbol $G$, we obtain a new Σ-function symbol $F$ satisfying
\[ F(y, a_1, \ldots, a_n) = G(F[y], y, a_1, \ldots, a_n) \tag{11.8} \]
where $F[y] := \{F(x, a_1, \ldots, a_n) \mid x \in y\}$. Just define $G'(a_1, \ldots, a_n, y, f) := G(a_1, \ldots, a_n, y, rng(f{\restriction}y))$ and apply the Σ-recursion theorem.

An important application of the Σ-recursion theorem is the implicit definition of ∆-predicate symbols.
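11.4.13 Theorem Let $P$ be a ∆-predicate symbol. Then KP∗ proves the existence of a ∆-predicate symbol $R$ satisfying
\[ R(\vec v, y) \Leftrightarrow P(\vec v, y, \{z \in TC(y) \mid R(\vec v, z)\}). \]
Proof Let
\[ G(\vec v, y, f) := \begin{cases} 0 & \text{if } P(\vec v, y, \{x \in TC(y) \mid f(x) = 0\}) \\ 1 & \text{if } \neg P(\vec v, y, \{x \in TC(y) \mid f(x) = 0\}). \end{cases} \]
Then
\[ G(\vec v, y, f) = z \Leftrightarrow (\exists u)\bigl[ (\forall w \in u)(f(w) = 0 \land w \in TC(y)) \land (\forall w \in TC(y))(f(w) = 0 \to w \in u) \land \bigl[(z = 0 \land P(\vec v, y, u)) \lor (z = 1 \land \neg P(\vec v, y, u))\bigr] \bigr]. \]
Hence $G$ is a Σ-function symbol. By the Σ-recursion theorem we obtain a Σ-function symbol $F$ such that $F(\vec v, y) \in 2$ and
\[ F(\vec v, y) = 0 \Leftrightarrow G(\vec v, y, F{\restriction}TC(y)) = 0 \Leftrightarrow P(\vec v, y, \{z \in TC(y) \mid F(\vec v, z) = 0\}). \tag{i} \]
Defining $R(\vec v, y) :\Leftrightarrow F(\vec v, y) = 0$ we see by (i) that $R$ satisfies the recursion condition of the theorem. Since $\neg R(\vec v, y) \Leftrightarrow F(\vec v, y) = 1$ we see that $R$ is a ∆-relation symbol.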
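The proof idea of Theorem 11.4.13, defining a predicate through a 0/1-valued function obtained by recursion, can be mimicked on hereditarily finite sets. The Python sketch below is an illustration under assumptions: sets are `frozenset` values, parameters are omitted, and `define_predicate` as well as the sample predicate `P` are hypothetical names chosen for this example.

```python
def tc(x):
    """Transitive closure TC(x) of a hereditarily finite set."""
    out = set()
    stack = list(x)
    while stack:
        z = stack.pop()
        if z not in out:
            out.add(z)
            stack.extend(z)
    return frozenset(out)

def define_predicate(P):
    """Return R with R(y) <=> P(y, {z in TC(y) | R(z)}).

    R is obtained from a function F with values in {0, 1}:
    F(y) = 0 if P holds of the already decided part, and 1 otherwise.
    """
    cache = {}

    def F(y):
        if y not in cache:
            good = frozenset(z for z in tc(y) if F(z) == 0)
            cache[y] = 0 if P(y, good) else 1
        return cache[y]

    return lambda y: F(y) == 0

# Sample P (an assumption): y has at most one element and all its
# elements already satisfy R. The resulting R says "hereditarily at
# most one element".
def P(y, good):
    return len(y) <= 1 and all(z in good for z in y)

R = define_predicate(P)

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
print(R(one), R(two))
```

Both `R(y)` and its negation are decided by the value of `F`, which mirrors the argument that $R$ is a ∆-relation.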
An important consequence of the Σ-recursion theorem is that the constructible hierarchy can be developed within the framework of KP. Recalling Definition 11.3.2 we see that the stages $L_\alpha$ of the constructible hierarchy are definable by Σ-recursion as soon as it is secured that $a \mapsto Def(a)$ is a Σ-function. There are different methods to secure that. One is to use Gödel functions; another is to gödelize the language of set theory, i.e., to code every $\mathcal{L}(\in)$-formula $F$ by a set $\ulcorner F \urcorner$, and then to show that the satisfaction predicate $Sat(a, \ulcorner F \urcorner, \vec b) :\Leftrightarrow (a, \in) \models F[\vec b]$ is a ∆-predicate. We will not give the details here. They can be found in the standard literature, e.g. [4] or [21]. The function $\alpha \mapsto L_\alpha$ is thus a Σ-function of KP.
11.5 ID1 as a Subtheory of KPω

In this section, we want to sketch that ID1 may be regarded as a subtheory of KPω. The first step is to ensure that all primitive recursive functions are sets in KPω. We take finite ordinals, i.e., ordinals $< \omega$, to represent natural numbers. Remember that $\omega$ is a set in KPω. Likewise all finite ordinals $k < \omega$ are sets in KPω; we put $0 := \emptyset$ and $k + 1 := \{0, \ldots, k\}$. In the remainder of this section $m, n, k, n_0, \ldots$ denote finite ordinals. We define $n$-tuples in the usual way by $(x_1, \ldots, x_{n+1}) := ((x_1, \ldots, x_n), x_{n+1})$ using the pairing function of set theory. We leave it as an exercise to check that the predicates
\[ a \text{ is an } n\text{-tuple} :\Leftrightarrow a = (x_1, \ldots, x_n) \text{ for some } x_1, \ldots, x_n \]
and
\[ P^n_i(a) = x_i :\Leftrightarrow a = (x_1, \ldots, x_n) \text{ for some } x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n \]
are again ∆0-notions. Likewise we define the $k$-fold Cartesian product of sets by
\[ a_1 \times \cdots \times a_k := \Bigl\{(x_1, \ldots, x_k) \Bigm| \bigwedge_{i=1}^{k} x_i \in a_i\Bigr\} = (a_1 \times \cdots \times a_{k-1}) \times a_k. \]
It follows by a meta-induction on $k$ that KP also proves closure under $k$-fold Cartesian products.

We are going to show that all primitive recursive functions are sets in KPω. We obtain
\[ S = \{z \in \omega \times \omega \mid (\exists x \in \omega)[z = (x, x \cup \{x\})]\}. \]
So the successor function $S$ on natural numbers is a set by ∆0-separation. Put
\[ C^n_k := \{z \in \omega^{n+1} \mid (\exists x_1 \in \omega) \ldots (\exists x_n \in \omega)[z = (x_1, \ldots, x_n, k)]\} \]
and
\[ P^n_k := \{z \in \omega^{n+1} \mid (\exists x_1 \in \omega) \ldots (\exists x_n \in \omega)[z = (x_1, \ldots, x_n, x_k)]\} \]
showing that $C^n_k$ and $P^n_k$ are sets by ∆0-separation. Now assume that we have an $m$-ary function $g$ and $n$-ary functions $h_1, \ldots, h_m$ on natural numbers as sets. Then let
\[ Sub(g, h_1, \ldots, h_m) := \Bigl\{z \in \omega^{n+1} \Bigm| (\exists x_1 \in \omega) \ldots (\exists x_n \in \omega)(\exists z_1 \in \omega) \ldots (\exists z_m \in \omega)(\exists y \in \omega)\Bigl[\bigwedge_{i=1}^{m} h_i(x_1, \ldots, x_n) = z_i \land g(z_1, \ldots, z_m) = y \land z = (x_1, \ldots, x_n, y)\Bigr]\Bigr\}. \]
So $Sub(g, h_1, \ldots, h_m)$ is a set by ∆0-separation. Finally assume that we have an $n$-ary function $g$ and an $(n+2)$-ary function $h$ on the natural numbers. Then put
\[ Rec(g, h) := \{z \in \omega^{n+2} \mid (\exists u \in \omega)(\exists x_1 \in \omega) \ldots (\exists x_n \in \omega)(\exists f)[Fun(f) \land dom(f) = \omega \land f(0) = g(x_1, \ldots, x_n) \land (\forall k \in \omega)[f(k + 1) = h(k, f(k), x_1, \ldots, x_n)] \land z = (u, x_1, \ldots, x_n, f(u))]\}. \]
Since then also
\[ Rec(g, h) = \{z \in \omega^{n+2} \mid (\exists u \in \omega)(\exists x_1 \in \omega) \ldots (\exists x_n \in \omega)(\forall f)[Fun(f) \land dom(f) = \omega \land f(0) = g(x_1, \ldots, x_n) \land (\forall k \in \omega)[f(k + 1) = h(k, f(k), x_1, \ldots, x_n)] \to z = (u, x_1, \ldots, x_n, f(u))]\} \]
we obtain $Rec(g, h)$ as a set by ∆-separation. Therefore, all primitive recursive functions are available in KPω.

This makes it easy to embed the first-order language $\mathcal{L}(NT)$ of number theory into the language of set theory. Atomic formulas $f(x_1, \ldots, x_n) = y$ and $f(x_1, \ldots, x_n) \neq y$ are translated by replacing the function symbol $f$ by the primitive recursive function $f$, which is available as a set in KPω. One asymmetry is caused by the fact that we have free predicate variables in $\mathcal{L}(NT)$, which we did not introduce in the language of set theory. We will therefore, for the moment, augment the language of set theory by free predicate variables, which we denote by capital letters $X, Y, \ldots$. Again we often write $(a_1, \ldots, a_n)\ \varepsilon\ X$ instead of $(X a_1, \ldots, a_n)$. In case that $X$ is a set we will also often just write $a \in X$. The new atomic formulas $(X a_1, \ldots, a_n)$ are counted among the ∆0-formulas of KPω(X).⁴ Let us denote by KPω(X) the theory in the extended language. Since there are no defining axioms for predicate variables, it is obvious that KPω(X) is a conservative extension of KPω. So far we obtain NT as a subtheory of KPω.

11.5.1 Theorem Let $F(\vec X)$ be an $\mathcal{L}(NT)$-formula such that NT $\vdash F(\vec X)$. Then KPω $\vdash F(\vec X)^\omega$.
Proof It suffices to check that all nonlogical axioms of NT are provable in KPω. Obviously, we always have $S(x) \neq 0$, and from $S(x) = x \cup \{x\} = y \cup \{y\} = S(y)$ we obtain $(\forall z \in x)[z \in y]$ and $(\forall z \in y)[z \in x]$, i.e., $x = y$. So the successor axioms are provable in KPω. For $x \in \omega$ we obtain $S(x) = x \cup \{x\} = x + 1 \in \omega$ by definition of $S$. Likewise we can prove that $x_1, \ldots, x_n \in \omega$ implies $C^n_k(x_1, \ldots, x_n) = k \in \omega$ as well as $P^n_k(x_1, \ldots, x_n) = x_k \in \omega$. If $g$ and $h_1, \ldots, h_m$ are given we obtain $Sub(g, h_1, \ldots, h_m)(x_1, \ldots, x_n) = g(h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n))$ by definition of $Sub(g, h_1, \ldots, h_m)$. For given functions $g$ and $h$ we obtain by definition of $Rec(g, h)$ also $Rec(g, h)(u, x_1, \ldots, x_n) = f(u)$, where $f$ is the uniquely determined function used in the definition of $Rec(g, h)$. Hence $Rec(g, h)(0, x_1, \ldots, x_n) = f(0) = g(x_1, \ldots, x_n)$ and $Rec(g, h)(k + 1, x_1, \ldots, x_n) = f(k + 1) = h(k, f(k), x_1, \ldots, x_n) = h(k, Rec(g, h)(k, x_1, \ldots, x_n), x_1, \ldots, x_n)$.
⁴ The predicate variables should rather be viewed as predicate constants. The substitution rule $F(X) \Rightarrow F(S)$ for any “class term” $S = \{x \mid G(x)\}$ does not hold for KPω(X). It is only true for class terms that are ∆-definable.
So all defining axioms for primitive recursive functions are provable in KPω. It remains to check the scheme of mathematical induction. So assume $F(0, \vec X)^\omega$ and $(\forall x \in \omega)[F(x, \vec X)^\omega \to F(x + 1, \vec X)^\omega]$. If $(\exists x \in \omega)[\neg F(x, \vec X)^\omega]$ then we obtain by (FOUND) a least $k < \omega$ such that $\neg F(k, \vec X)^\omega$. The hypothesis $F(0, \vec X)^\omega$ excludes $k = 0$. Since $\omega$ is the least limit ordinal it follows that $k = k_0 + 1$, which implies $F(k_0, \vec X)^\omega$ by minimality of $k$. The hypothesis $(\forall x \in \omega)[F(x, \vec X)^\omega \to F(x + 1, \vec X)^\omega]$, however, leads to the contradiction $F(k, \vec X)^\omega$. Hence $(\forall x \in \omega)F(x, \vec X)^\omega$.

To show that also ID1 is a subtheory of KPω we have to represent fixed-points of arithmetically definable inductive definitions in KPω. This is prepared by the following lemma.

11.5.2 Lemma Let $B(X, x, a_1, \ldots, a_n)$ be a ∆-formula of KPω. Then there is a Σ-function symbol $I_B$ of KPω such that
\[ I_B(\alpha, a_1, \ldots, a_n) = \{x \in a_1 \mid B(I_B[\alpha], x, a_1, \ldots, a_n)\} \]
where $I_B[\alpha] := \{z \in a_1 \mid (\exists \beta \in \alpha)[z \in I_B(\beta, a_1, \ldots, a_n)]\}$.
Proof
Let
\[ G_B(X, a_1, \ldots, a_n) := \{x \in a_1 \mid B(X, x, a_1, \ldots, a_n)\}. \tag{i} \]
Then
\[ G_B(X, a_1, \ldots, a_n) = y \Leftrightarrow (\forall z \in y)[z \in a_1 \land B(X, z, a_1, \ldots, a_n)] \land (\forall z \in a_1)[B(X, z, a_1, \ldots, a_n) \to z \in y] \]
which shows that $G_B$ is a Σ-function. By the Σ-recursion theorem (11.8) we therefore obtain a Σ-function $I_B$ such that
\[ I_B(\alpha, a_1, \ldots, a_n) = G_B(I_B[\alpha], a_1, \ldots, a_n) = \{z \in a_1 \mid B(I_B[\alpha], z, a_1, \ldots, a_n)\}. \]

If $X$ occurs positively in an $\mathcal{L}(\in)$-formula $B(X, x, \vec a)$ then the associated operator $G_{B,\vec a}(X) := \{z \in a_1 \mid B(X, z, \vec a)\}$ is monotone. We have seen in the proof of the previous lemma that $G_{B,\vec a}$ is a Σ-operator of KPω if $B(X, x, \vec a)$ is a ∆-formula. The fact that a class $S \subseteq a_1$ is closed under the operator $G_{B,\vec a}$ is abbreviated by
\[ Cl_{B,\vec a}(S) :\Leftrightarrow (\forall z \in a_1)[B(S, z, \vec a) \to z\ \varepsilon\ S]. \]
Next we show that the least fixed point of $G_{B,\vec a}$ is Σ-definable in KPω. Put
\[ I_{B,\vec a} := \{z \in a_1 \mid (\exists \xi)[z \in I_B(\xi, \vec a)]\}. \]
Notice that $I_{B,\vec a}$ is in general not a set in KPω. The notion $z\ \varepsilon\ I_{B,\vec a}$ is supposed to be an “abbreviation” for $(\exists \xi)[z \in I_B(\xi, \vec a)] \land z \in a_1$.
11.5.3 Theorem Let $B(X, x, \vec a)$ be an $X$-positive ∆-formula of KPω. Then KPω proves
\[ Cl_{B,\vec a}(I_{B,\vec a}) \tag{A} \]
and
\[ Cl_{B,\vec a}(S) \to I_{B,\vec a} \subseteq S. \tag{B} \]
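Proof Working in KPω assume $B(I_{B,\vec a}, x, \vec a)$ and $x \in a_1$. Since $B(X, x, \vec a)$ is $X$-positive, $B(I_{B,\vec a}, x, \vec a)$ is a Σ-formula. By Σ-reflection there is a $c$ such that
\[ B(\{z \in a_1 \mid (\exists \xi \in c)[z \in I_B(\xi, \vec a)]\}, x, \vec a). \tag{i} \]
Now define
\[ \beta := \bigcup \{\eta \in c \mid \eta \in On\} \quad\text{and}\quad \alpha := \beta \cup \{\beta\}. \]
Then $\alpha$ is a set by ∆0-separation, (Union) and (Pair) such that $c \cap On \subseteq \alpha$. By upwards persistency and monotonicity we obtain $B(\{z \in a_1 \mid (\exists \xi \in \alpha)[z \in I_B(\xi, \vec a)]\}, x, \vec a)$, which by Lemma 11.5.2 entails $x \in I_B(\alpha, \vec a)$ and thus $x\ \varepsilon\ I_{B,\vec a}$. This proves part (A).
For the proof of part (B) assume $Cl_{B,\vec a}(S)$. We prove
\[ I_B(\alpha, \vec a) \subseteq S \tag{ii} \]
by induction on $\alpha$. From the induction hypothesis we have $I_B[\alpha] \subseteq S$, which by $X$-positivity implies $B(I_B[\alpha], x, \vec a) \to B(S, x, \vec a)$, i.e.,
\[ x \in I_B(\alpha, \vec a) \to B(S, x, \vec a). \tag{iii} \]
From (iii) and $Cl_{B,\vec a}(S)$ we finally obtain $(\forall x)[x \in I_B(\alpha, \vec a) \to x\ \varepsilon\ S]$. Hence $I_{B,\vec a} \subseteq S$.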
If $F(X, x, \vec x)$ is an $X$-positive arithmetical formula then $F(X, x, \vec x)^\omega$ becomes an $X$-positive ∆-formula in $\mathcal{L}(\in)$, which defines a monotone operator $G_{F,\vec x}: Pow(\omega^n) \longrightarrow Pow(\omega^n)$. It follows from Theorem 11.5.3 that $I_{F,\vec x}$ is the least fixed point of $G_{F,\vec x}$. This shows that we can embed the theory ID1 into KPω. So we have the following theorem.
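The stage construction of Lemma 11.5.2 and the two properties (A) and (B) of Theorem 11.5.3 can be observed directly for a monotone operator on a finite set. The following Python sketch is a finite analogy only, under the assumptions that the operator is monotone and the ground set is finite; `least_fixed_point` and the sample operator `gamma` are hypothetical names introduced for this illustration.

```python
def least_fixed_point(gamma):
    """Iterate the stages I(alpha) = gamma(I[alpha]) of a monotone operator.

    `below` plays the role of I[alpha], the union of all earlier stages.
    Returns the union of all stages together with the list of stages.
    """
    stages = []
    below = frozenset()
    while True:
        stage = frozenset(gamma(below))
        stages.append(stage)
        new_below = below | stage
        if new_below == below:      # no new elements: the process is closed
            return below, stages
        below = new_below

# Sample monotone operator on {0, ..., 19} (an assumption for the demo):
# 0 is put in, and x + 2 is put in whenever x is already in.
def gamma(X):
    return {0} | {x + 2 for x in X if x + 2 < 20}

lfp, stages = least_fixed_point(gamma)

# (A): the result is closed under the operator.
print(frozenset(gamma(lfp)) <= lfp)
# (B): it is contained in any other closed set, e.g. the whole ground set.
closed = frozenset(range(20))
print(frozenset(gamma(closed)) <= closed and lfp <= closed)
```

In the finite case the iteration stabilizes after finitely many steps, whereas in KPω the stages are indexed by arbitrary ordinals and the union is in general only a Σ-definable class.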
11.5.4 Theorem The theory ID1 is a subtheory of KPω.
Proof We have already seen that all axioms of NT are provable in KPω. It follows from Theorem 11.5.3 that for any $X$-positive arithmetical formula $F(X, x, \vec x)$, we have a Σ-definable class $I_{F^\omega, \vec x}$ satisfying the axioms $ID^1_1$ and $ID^2_1$.
11.6 Variations of KPω and Axiom β

Among others this section prepares an ordinal analysis of the theory (ATR)0, which has been introduced in Exercise 8.5.32.⁵ The ordinal analysis will be accomplished in Chap. 12. Since (ATR)0 is a theory that is not in the focus of this book, we will be rather sketchy here and leave proofs as exercises (with extensive hints).

11.6.1 Exercise Extend the language $\mathcal{L}(\in)$ of set theory to the language $\mathcal{L}^*(\in)$ by adding constants for all primitive recursive functions. Interpreting natural numbers as ordinals less than $\omega$ we obtain $\mathcal{L}(NT)$ as a sublanguage of $\mathcal{L}^*(\in)$. Denote by KPN the theory in the language $\mathcal{L}^*(\in)$ which comprises KPω, whose schemes are all extended to the new language, and all defining axioms for primitive recursive functions restricted to $\omega$. The theory $KPN^0$ is obtained from KPN by replacing the scheme of foundation by the axiom (Ind).
(a) Show that NT $\vdash F$ implies KPN $\vdash F^\omega$.
(b) Show that KPN is an extension by definitions of KPω.

To calibrate theories by the amount of induction they provide we define
(∀x ∈ ω )[(∀y)(y
Let BSTω⁻ be the theory BST + (Inf’) without the foundation scheme (FOUND) and with (Union’) replaced by the axiom
(TranC) $(\forall x)(\exists y)[Tran(y) \land x \subseteq y]$
which entails that every set possesses a transitive hull. This axiom is necessary since we need foundation to prove the existence of the transitive hull. Because of $a \subseteq TC(a)$, we obtain unions by ∆0-separation. The axiom (TranC) therefore makes axiom (Union) superfluous.

⁵ This section will only be needed in Chap. 12, in which we revisit predicative proof theory. Readers who are only interested in the first step into impredicativity may therefore omit this section in a first reading.

11.6.2 Exercise Show BSTω⁻ $\vdash (\forall x)(\exists y)[y = TC(x)]$.
Hint: By (TranC) there is a $y$ such that $Tran(y)$ and $u \subseteq y$. By (Inf’) you get the existence of $\omega$. Define
\[ TC(u) := \{x \in y \mid (\exists n \in \omega)(\exists f)[Fun(f) \land dom(f) = n + 1 \land f(0) \in u \land (\forall k \in n)[f(k + 1) \in f(k)] \land f(n) = x]\} \]
and show that the existential quantifier can be bounded.
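The definition of $TC(u)$ in the hint says that $x$ belongs to $TC(u)$ exactly when some finite descending membership chain starts in $u$ and ends in $x$. A small Python sketch on hereditarily finite sets makes the witnessing chains explicit. It is an illustration under assumptions: sets are `frozenset` values, and the helper name `tc_with_witness` is hypothetical.

```python
def tc_with_witness(u):
    """Map each x in TC(u) to a chain (f(0), ..., f(n)) with
    f(0) in u, f(k+1) in f(k) for k < n, and f(n) = x."""
    chains = {x: (x,) for x in u}        # chains of length 1, i.e. n = 0
    frontier = list(u)
    while frontier:
        nxt = []
        for y in frontier:
            for x in y:
                if x not in chains:      # keep the first chain found
                    chains[x] = chains[y] + (x,)
                    nxt.append(x)
        frontier = nxt
    return chains

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
u = frozenset({two})

chains = tc_with_witness(u)
print(len(chains))
```

Each stored tuple is a finite function with domain $n + 1$ in the sense of the hint; that all its values lie in the transitive set $y \supseteq u$ is what allows the quantifier over $f$ to be bounded.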
By $KP\omega^0$ we denote the theory KPω in which the scheme of foundation is replaced by the axiom $(Ind)_\omega$ of Mathematical Induction. By $KP\omega^r$ we denote the theory KPω in which the scheme (FOUND) is replaced by the axiom (Found).

Axiom β, which we are now going to formulate, plays a key role in the ordinal analysis of (ATR)0. Express by $Wf(a, r)$ that $r$ is a well-founded binary relation on $a$, i.e.,
\[ Wf(a, r) :\Leftrightarrow Rel(r) \land field(r) = a \land (\forall x)\bigl[x \subseteq a \land x \neq \emptyset \to (\exists y \in x)(\forall z \in x)[(z, y) \notin r]\bigr]. \]
Axiom β postulates that every transitive well-founded relation possesses an order type, i.e., a bit more generally,
(Axβ) $(\forall a)(\forall r)\bigl[Wf(a, r) \to (\exists f)(\exists b)[Fun(f) \land dom(f) = a \land rng(f) = b \land (\forall x \in a)(\forall y \in a)[(x, y) \in r \to f(x) \in f(y)]]\bigr]$.
The function $f$ in Axiom β is uniquely defined and commonly called the collapsing function of the well-founded relation $r$. Its range is called the Mostowski collapse of the well-founded relation $r$ (cf. Definition 3.2.9). The Mostowski collapse of a transitive well-founded relation is an ordinal, which is the order type of the relation.

To formalize admissible sets, we introduce a unary relation symbol Ad with the defining axioms
(Ad 1) $(\forall x)[Ad(x) \to Tran(x)]$
(Ad 2) $(\forall x)(\forall y)[Ad(x) \land Ad(y) \to x \in y \lor x = y \lor y \in x]$
(Ad 3) $(\forall x)[Ad(x) \to (\text{Pair'})^x \land (\text{Union'})^x \land (\Delta_0\text{–Sep})^x \land (\Delta_0\text{–Coll})^x]$.
Of course we have to replace $(\text{Union'})^x$ by $(\text{TranC})^x$ for theories with restricted foundation. The formula $Ad(a)$ axiomatizes the fact that $a$ is an admissible set. Let Ad be the set {Ad 1, . . . , Ad 3}. By KPl we denote the theory which comprises BST, the axioms in Ad, and the limit axiom (Lim)
\[ (\forall x)(\exists y)[Ad(y) \land x \in y] \]
which postulates that the universe is a union of admissible sets. Even stronger is the theory KPi, which is KP + Ad + (Lim) and axiomatizes an admissible union of admissible universes, i.e., a recursively inaccessible universe. An ordinal analysis for these theories is beyond a first step into impredicativity. However, by restricting the amount of induction, or equivalently foundation, we obtain much weaker theories.

11.6.3 Exercise Let $u = A_0 :\Leftrightarrow Ad(u) \land (\forall x \in u)[\neg Ad(x)]$.
(a) Show that KPl proves $(\exists! u)[u = A_0]$.
(b) The formula $x \in A_0$ stands for $(\exists u)[u = A_0 \land x \in u]$. Show that KPl proves $(\forall x)\bigl[x \in A_0 \leftrightarrow (\forall u)[u = A_0 \to x \in u]\bigr]$.
(c) Show that KPl proves $(\forall x)\bigl[x \in A_0 \leftrightarrow (\forall u)[Ad(u) \to x \in u]\bigr]$.
Check if you need foundation for these results.
Hint: Observe that even in the absence of foundation you have $Ad(a) \to a \notin a$, because otherwise you would get $r := \{x \in a \mid x \notin x\} \in a$ by ∆0-separation, leading to the contradiction $r \in r \Leftrightarrow r \notin r$.
In the standard interpretation $A_0$ is interpreted as the set $L_\omega$ of hereditarily finite sets. The axiom
\[ (\forall u)\bigl[(\forall x \in X)[(\forall y \in x)[y \in X \to y \in u] \to x \in u] \to (\forall x \in X)[x \in u]\bigr] \]
of ∈-induction restricted to a class $X$ is equivalent to the axiom
(Found(X)) $(\forall u)\bigl[(\exists y \in u)[y \in X] \to (\exists y \in u)[y \in X \land (\forall z \in y)[z \in X \to z \notin u]]\bigr]$
saying that ∈ restricted to $X$ is well-founded on sets. Let $KPl^0$ and $KPi^0$ be the theories in which the scheme of foundation is replaced by the axiom (Found($A_0$)). This is a slight generalization of the axiom of Mathematical Induction (and is in fact equivalent to it). Observe that you did not need foundation to solve Exercise 11.6.3. Therefore $A_0$ is available in $KPl^0$ and thus also in $KPi^0$. In contrast to KPl and KPi, which are much stronger than KPω, the theories $KPl^0$ and $KPi^0$ are reducible to predicative theories. This will be handled in Sect. 12.7.

11.6.4 Exercise (a) Show that $KPl^0$ proves Axiom β.⁶ Show moreover that $KPl^0$ proves that the “collapsing function” $f$ for a well-founded relation $r$ on a set $a$ belongs to an admissible $u$ whenever $a$ and $r$ are in $u$.
(b) Show that the Mostowski collapse of a well-ordering is an ordinal.
Hint: Assume $Wf(a, r)$. By (Lim) you find an admissible set $u$ such that $a, r \in u$. Let
\[ TC_{a,r}(v) := \{x \in a \mid (\exists n \in \omega)(\exists f)[Fun(f) \land dom(f) = n + 1 \land f(0) = x \land (\forall k \in n)[(f(k + 1), f(k)) \in r] \land f(n) \in v]\} \]
denote the transitive closure of the $r$-predecessors of $v$. Define the Mostowski collapse of $r$ by the formula
\[ A(x, g) :\Leftrightarrow Fun(g) \land dom(g) = TC_{a,r}(\{x\}) \land (\forall y \in dom(g))\bigl[g(y) = \{g(z) \mid (z, y) \in r\}\bigr]. \]
Show by induction on $r$ that there is a uniquely determined $g$ with $A(x, g)$ and find a set $c \in u$ such that $f := \{(x, y) \in a \times c \mid (\exists g \in c)[A(x, g) \land y = g(x)]\}$.

⁶ Observe that in $KPl^0$ the existence of a collapsing function does not necessarily imply that $r$ is well-founded. The Mostowski collapse of $r$ is just a hereditarily transitive set. Only the presence of foundation makes it an ordinal. The “collapse” of a well-founded relation is, however, always an ordinal.
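The recursion $g(y) = \{g(z) \mid (z, y) \in r\}$ from the hint can be carried out literally for a well-founded relation on a finite set. The Python sketch below is a finite illustration under assumptions: the relation is given as a set of pairs and is assumed to be well-founded (otherwise the recursion would not terminate), and `mostowski_collapse` is a hypothetical helper name.

```python
def mostowski_collapse(a, r):
    """Collapsing function g with g(y) = {g(z) | (z, y) in r}.

    `a` is a finite collection, `r` a set of pairs over `a` that is
    assumed to be well-founded. Values are frozensets.
    """
    preds = {y: [z for (z, w) in r if w == y] for y in a}
    cache = {}

    def g(y):
        if y not in cache:
            cache[y] = frozenset(g(z) for z in preds[y])
        return cache[y]

    return {y: g(y) for y in a}

# Sample: the usual strict order on {0, 1, 2, 3} (an assumption for the demo).
a = range(4)
r = {(x, y) for x in a for y in a if x < y}
g = mostowski_collapse(a, r)

# The collapse turns the relation r into membership.
print(all(g[x] in g[y] for (x, y) in r))
```

For this transitive sample relation the values of `g` are the finite von Neumann ordinals, so the range of `g` is the ordinal 4, the order type of the relation.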
The ordinal $\omega = A_0 \cap On$ is a set in $KPl^0$. Since all primitive recursive functions are definable as sets in $KPl^0$, a canonical translation of the formulas of second-order arithmetic into the language of set theory is obtained by restricting all first-order quantifiers to $\omega$ and all second-order quantifiers to $Pow(\omega)$. The second-order language of $\mathcal{L}(NT)$ thus becomes a sublanguage of $\mathcal{L}(\in)$.

11.6.5 Exercise Show that $KPl^0$ comprises (ATR)0 (cf. Exercise 8.5.32).
Hint: The only axiom that needs checking is (ATR). Let $A(Z, x)$ be an arithmetical formula, assume $WO(\omega, X)$, i.e., that $X$ is a well-ordering on $\omega$, and put $H(a, Y) :\Leftrightarrow Hier_A(X{\restriction}a, Y)$ where $X{\restriction}a := \{\langle x, y \rangle \mid \langle x, y \rangle\ \varepsilon\ X \land \langle y, a \rangle\ \varepsilon\ X\}$. Use induction on $X$ to show that there is a uniquely determined $Y$ satisfying $H(a, Y)$.
In $KPl^0$ we can define all finite levels of the admissible hierarchy. Define
\[ u = A_{k+1} :\Leftrightarrow Ad(u) \land (\forall x \in u)\Bigl[\neg Ad(x) \lor \bigvee_{i=0}^{k} x = A_i\Bigr]. \]
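11.6.6 Exercise (a) Show that $KPl^0$ proves $(\exists! y)[y = A_k]$ for all natural numbers $k$.
(b) Define $x \in A_k :\Leftrightarrow (\exists u)[u = A_k \land x \in u]$ and show that $KPl^0$ proves
\[ (\forall x)\Bigl[x \in A_{n+1} \leftrightarrow (\forall u)\Bigl[Ad(u) \land \bigwedge_{i=0}^{n} A_i \in u \to x \in u\Bigr]\Bigr]. \]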
11.6.7 Remark The term Axiom β has its origin in the theory of models of second-order arithmetic. A structure for second-order arithmetic has the form $\mathcal{M} = (N, S, \cdots)$, where $N$ is the domain of the first-order quantifiers and $S \subseteq Pow(N)$ the domain of the second-order quantifiers (Sect. 4.2). A structure for second-order arithmetic is a β-structure if it is absolute for well-orderings⁷, i.e., $\mathcal{M} \models (\forall X)TI(\prec, X)$ iff $\prec$ is a well-ordering in the standard structure. Any model $\mathcal{M}$ of second-order arithmetic that satisfies the axiom or the scheme of Mathematical Induction thinks that $N$ is well-ordered by the “less than” relation on $N$. If $\mathcal{M}$ is a β-structure then “less than” is a well-ordering in the real world, which implies that $N$ is isomorphic to $\mathbb{N}$, the standard natural numbers, i.e., that $\mathcal{M}$ is an ω-model. However, Mostowski has shown that in general there are ω-models which are not β-structures (cf. [64]).

If $M$ is a transitive model of KPω + Axiom β then $M$ is absolute for well-orderings. This is obvious because the well-foundedness of a relation $\prec$ can then be expressed not only by the familiar Π-formula but also by saying that there is a collapsing function for $\prec$, which is a Σ-formula. So the notion of well-foundedness of relations (on sets) is expressible by a ∆-formula, hence absolute.

It is a theorem of abstract recursion theory that a model of second-order arithmetic is a β-model if and only if it is closed under hyperjumps, i.e., if it is closed under $\Pi^1_1$-comprehension. On the side of set theory one hyperjump corresponds to the passage to the next admissible set. Therefore, KPl axiomatizes a universe that is “closed under hyperjumps.” This is the recursion theoretic background for the fact that KPl proves Axiom β.

⁷ bon-ordre is French for well-ordering, hence β.
11.7 The Σ-Ordinal of KPω

In this section, we study the connection between the proof-theoretic ordinal of a theory $T$ in the language of set theory and the minimal constructible model for the Σ1-sentences provable in $T$. Let $a$ be a set and $\prec$ a linear ordering with $field(\prec) \subseteq a$. Let
\[ Prog_a(\prec, S) :\Leftrightarrow (\forall x \in a)[(\forall y \in a)[y \prec x \to y\ \varepsilon\ S] \to x\ \varepsilon\ S], \]
expressing that the class term $S$ is “progressive” with respect to $\prec$ inside of $a$, and
\[ TI_a(\prec, S) :\Leftrightarrow field(\prec) \subseteq a \land (Prog_a(\prec, S) \to (\forall x \in field(\prec))[x\ \varepsilon\ S]), \]
expressing the scheme of induction along $\prec$. For theories in the language of set theory, we therefore modify the definition of the proof-theoretic ordinal as defined in Sect. 6.7 to
\[ \|T\| := \sup\{otyp(\prec) \mid {\prec} \subseteq \omega \times \omega \text{ is } \Delta_0\text{-definable and } T \vdash TI_\omega(\prec, S)\}. \tag{11.9} \]
Now put
\[ Acc_a(X, x, \prec) :\Leftrightarrow x \in a \land (\forall y \in a)[y \prec x \to y \in X]. \]
Then $Acc_a(X, x, \prec)$ defines an $X$-positive operator. We obtain $Cl_{Acc_{a,\prec}}(S) \Leftrightarrow Prog_a(\prec, S)$. If we assume that KPω proves the scheme of induction along $\prec$, i.e., $TI_a(\prec, S)$ for any class term $S$, then KPω proves especially $TI_a(\prec, I_{Acc_{a,\prec}})$. From Theorem 11.5.3 we then obtain
\[ KP\omega \vdash (\forall x \in a)[x \in field(\prec) \to (\exists \beta)[x \in I_{Acc_{a,\prec}}(\beta)]]. \tag{11.10} \]
But $I_{Acc_{a,\prec}}(\beta)$ is exactly the $\beta$th stage in the accessible part of the well-ordering $\prec$. We have shown in Sect. 6.5 that the order type of $x \in field(\prec)$ is equal to its inductive norm in the accessible part (cf. Lemma 6.5.2). If we succeed in finding an upper bound for the ordinals $\beta$ in (11.10), we will also have an upper bound for the order type of $\prec$ and thus an upper bound for the proof-theoretic ordinal of KPω as defined in Sect. 6.7. This is the background of the following definition.
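The statement that the order type of an element equals its inductive norm in the accessible part can be checked by hand on a finite linear ordering. The Python sketch below is a finite illustration under assumptions: the ordering is passed as a Boolean function `less(y, x)`, stages are numbered by natural numbers only, and `accessible_part` is a hypothetical helper name.

```python
def accessible_part(a, less):
    """Compute the stages of the accessible part of `less` on the finite set `a`.

    An element enters at the first stage at which all of its
    predecessors have entered at earlier stages. The returned dict
    maps each accessible element to that stage (its inductive norm).
    """
    norm = {}
    stage = 0
    while True:
        # All predecessors must already have a norm from an earlier stage.
        new = [x for x in a
               if x not in norm and all(y in norm for y in a if less(y, x))]
        if not new:
            return norm
        for x in new:
            norm[x] = stage
        stage += 1

# Sample ordering on {0, ..., 5} (an assumption): evens before odds,
# i.e. 0 < 2 < 4 < 1 < 3 < 5.
a = range(6)
less = lambda y, x: (y % 2, y) < (x % 2, x)
norm = accessible_part(a, less)
print(norm)
```

For a finite linear ordering the norm of an element is simply its position, i.e., its order type. Elements lying on a cycle never enter any stage, which is the finite counterpart of a relation that is not well-founded.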
11.7.1 Definition Let
\[ \|KP\omega\|_\Sigma := \min\{\alpha \mid L_\alpha \models F \text{ for all } \Sigma\text{-sentences } F \text{ such that } KP\omega \vdash F\}. \]
We call $\|KP\omega\|_\Sigma$ the Σ-ordinal of KPω. We have just explained that the Σ-ordinal of KPω is an upper bound for the order types of well-orderings for which the scheme of transfinite induction is provable in KPω. Therefore, we have the following theorem.

11.7.2 Theorem The Σ-ordinal of KPω is an upper bound for its proof-theoretic ordinal.

11.7.3 Remark Theorem 11.7.2 is the first step in showing $\|KP\omega\| = \|KP\omega\|_\Sigma$. As a word of warning we want to emphasize that this situation is singular for KPω. The special status of KPω is due to the fact that $\omega_1^{CK}$ is the least admissible ordinal above $\omega$, i.e., that $L_{\omega_1^{CK}}$ is the least model of KPω in the constructible hierarchy. If $F$ is a Σ-sentence such that $KP\omega \vdash F$ then $L_{\omega_1^{CK}} \models F$ and by Σ-reflection there is an ordinal $\alpha_F < \omega_1^{CK}$ such that $L_{\alpha_F} \models F$. Since the function $\alpha \mapsto L_\alpha$ is $\omega_1^{CK}$-recursive, i.e., it is definable by a Σ-formula in $L_{\omega_1^{CK}}$ (see below), and $L_\alpha \models F$ is expressible as a ∆-relation, we obtain $F \mapsto \alpha_F$ as an $\omega_1^{CK}$-recursive function. The set $\{F \mid KP\omega \vdash F\}$ is recursively enumerable. Therefore, $\sup\{\alpha_F \mid F \text{ is a } \Sigma\text{-formula and } KP\omega \vdash F\} < \omega_1^{CK}$ and we obtain
\[ \|KP\omega\|_\Sigma < \omega_1^{CK}. \tag{11.11} \]
For stronger theories $T$, however, the Σ-ordinal is bigger than $\omega_1^{CK}$. There we have to introduce the notion of a $\Sigma^{\omega_1^{CK}}$-ordinal $\|T\|_{\Sigma^{\omega_1^{CK}}}$, which is the least stage in the constructible hierarchy that models the provable $\Sigma^{\omega_1^{CK}}$-sentences, i.e., sentences of the form $(\exists x \in L_{\omega_1^{CK}})F(x)$ where $F(x)$ is ∆0. This requires that $L_{\omega_1^{CK}}$ is definable in these theories. For such theories $T$, we obtain $\|T\| = \|T\|_{\Sigma^{\omega_1^{CK}}}$.⁸ The abstract background is the hyperarithmetical quantifier theorem, which states that every $\Pi^1_1$-relation is uniformly equivalent to a relation which is Σ-definable on $L_{\omega_1^{CK}}$. This theorem becomes provable in stronger theories such as KPl (cf. [52, 74]). It can even be extracted from the ω-completeness theorem (Theorem 5.4.9) as sketched in [8]. In the case of KPω the Σ- and $\Sigma^{\omega_1^{CK}}$-ordinals coincide.

⁸ Provided that enough foundation is available in $T$. Cf. Theorem 12.6.14 for an example in which these ordinals differ.

11.7.4 Theorem $\|KP\omega\|_\Sigma \geq \psi(\varepsilon_{\Omega+1})$.
Proof We have shown in Theorem 9.6.19 that ID1 proves that for every $\alpha < \psi(\varepsilon_{\Omega+1})$ there is an $n \in \omega$ such that $n \in Acc_\prec$ and $|n|_{Acc_\prec} \geq \alpha$ for a primitive recursively definable ordering $\prec$ on $\omega$. By Theorem 11.5.4 we therefore obtain $KP\omega \vdash (\exists \beta)[n \in I_{Acc_{\omega,\prec}}(\beta)]$. But then $L_\alpha \not\models (\exists \beta)[n \in I_{Acc_{\omega,\prec}}(\beta)]$ because otherwise $|n|_{Acc_\prec} < \alpha$. Hence $\alpha < \|KP\omega\|_\Sigma$.
For $\alpha := \|KP\omega\|_\Sigma$ the set $L_\alpha$ is by definition the least stage in the constructible hierarchy at which we have a partial model for the Σ-sentences which are provable in KPω. We are going to show that $L_\alpha$ is closed under the $\omega_1^{CK}$-recursive functions whose totality is provable in KPω. A partial function $F: L^n_{\omega_1^{CK}} \longrightarrow_p L_{\omega_1^{CK}}$ is $\omega_1^{CK}$-recursive if its graph $G_F(a_1, \ldots, a_n, b) :\Leftrightarrow F(a_1, \ldots, a_n) \simeq b$ is Σ-definable in $L_{\omega_1^{CK}}$. A partial function $F$ is provably total in KPω if
\[ KP\omega \vdash (\forall x_1) \ldots (\forall x_n)(\exists y)G_F(x_1, \ldots, x_n, y). \]
A Π2-formula is a formula of the shape $(\forall x)(\exists y)G(x, y, \vec v)$ where $G(x, y, \vec v)$ is a ∆0-formula. The next theorem states that the least model for the provable Σ-sentences of KPω is already the least model for the provable Π2-sentences of KPω.

11.7.5 Theorem Let $\alpha = \|KP\omega\|_\Sigma$ and let $(\forall x)(\exists y)F(x, y)$ be a Π2-sentence such that $KP\omega \vdash (\forall x)(\exists y)F(x, y)$. Then $L_\alpha \models (\forall x)(\exists y)F(x, y)$.
Proof For the proof we borrow from [4] that KPω proves that the constructible hierarchy is an inner class model of KPω. Therefore we obtain Σ-reflection in the form
\[ KP\omega \vdash G \to (\exists u)(\exists \xi)[u = L_\xi \land G^u] \tag{i} \]
for Σ-formulas $G$. Let $(\forall x)(\exists y)F(x, y)$ be a provable Π2-sentence of KPω, i.e., assume $KP\omega \vdash (\forall x)(\exists y)F(x, y)$, and let $a \in L_\alpha$. We have to show $L_\alpha \models (\exists y)F(a, y)$. Since $\alpha \in Lim$ there is a $\beta < \alpha$ such that $a \in L_\beta$. Since $\beta < \|KP\omega\|_\Sigma$ there is a Σ-sentence $G$ such that $KP\omega \vdash G$ but $L_\beta \not\models G$. By (i) we obtain
\[ (\exists \xi)(\exists u)[u = L_\xi \land G^u \land (\forall x \in u)(\exists y)F(x, y)]. \tag{ii} \]
Applying Σ-reflection again we obtain
\[ (\exists z)(\exists \xi)(\exists u)[u = L_\xi \land G^u \land (\forall x \in u)(\exists y \in z)F(x, y)] \tag{iii} \]
and thus
\[ L_\alpha \models (\exists z)(\exists \xi)(\exists u)[u = L_\xi \land G^u \land (\forall x \in u)(\exists y \in z)F(x, y)]. \tag{iv} \]
So there is a $\xi < \alpha$ such that $L_\xi \models G$ and $(\forall x \in L_\xi)(\exists y \in L_\alpha)F(x, y)$. Since $L_\beta \not\models G$ we have $\beta < \xi$ by upwards persistency. Hence $a \in L_\xi$ and we are done.

11.7.6 Corollary Let $\alpha := \|KP\omega\|_\Sigma$. Then $L_\alpha$ is closed under the provably $\omega_1^{CK}$-recursive functions of KPω.
Proof Let $f$ be a provably $\omega_1^{CK}$-recursive function of KPω. Let $F(x, y)$ be the Σ1-formula which defines $f$. Then $KP\omega \vdash (\forall x)(\exists y)F(x, y)$. Since $(\forall x)(\exists y)F(x, y)$ is a Π2-sentence we obtain $L_\alpha \models (\forall x)(\exists y)F(x, y)$ by Theorem 11.7.5. Hence $f(x) \in L_\alpha$ for all $x \in L_\alpha$.
11.8 The Theory (Π2–REF)

We have seen that KP proves the reflection principle for Σ-formulas. We will now briefly introduce a theory that seems to be stronger but has, as we will see in the following sections, the same proof-theoretical strength. To formulate the theory, we first introduce the scheme of Π2-reflection.
(Π2–REF) $(\forall \vec v)\bigl[F(\vec v) \to (\exists a)[a \neq \emptyset \land Tran(a) \land \vec v \in a \land F(\vec v)^a]\bigr]$,
where $F(\vec v)$ is a Π2-formula and $\vec v \in a$ stands for $v_i \in a$ for $i = 1, \ldots, n$ if $\vec v = (v_1, \ldots, v_n)$. To simplify notation we use the shorthand
\[ a \models F :\Leftrightarrow a \neq \emptyset \land Tran(a) \land F^a. \]
There are admissible sets that do not satisfy (Π2–REF). The following theorem, however, shows that for constructible sets this is always the case.

11.8.1 Theorem Let $\alpha$ be an admissible ordinal. Then $L_\alpha \models$ (Π2–REF).
Proof Let $G :\equiv (\forall x)(\exists y)F(x, y, \vec v)$ be a Π2-formula and $\vec b$ a tuple in $L_\alpha$ such that $L_\alpha \models G[\vec b]$. Since $\alpha \in Lim$ there is a $\beta_0 < \alpha$ such that $\vec b \in L_{\beta_0}$. Then
\[ L_\alpha \models (\forall x \in L_{\beta_0})(\exists y)F(x, y, \vec b). \tag{i} \]
Since $\alpha$ is admissible we obtain by Σ-reflection a set $b_0 \in L_\alpha$ such that $L_\alpha \models (\forall x \in L_{\beta_0})(\exists y \in b_0)F(x, y, \vec b)^{b_0}$. But then there is a $\beta_1 < \alpha$ such that $b_0 \in L_{\beta_1}$. Because of $Tran(L_{\beta_1})$ we also have $b_0 \subseteq L_{\beta_1}$, which implies
\[ L_\alpha \models (\forall x \in L_{\beta_0})(\exists y \in L_{\beta_1})F(x, y, \vec b)^{L_{\beta_1}}. \tag{ii} \]
We may now choose $\beta_1 > \beta_0$ minimal satisfying (ii). Then we get
\[ L_\alpha \models (\forall x \in L_{\beta_1})(\exists y)F(x, y, \vec b) \tag{iii} \]
and construct a $\beta_2 > \beta_1$ such that $L_\alpha \models (\forall x \in L_{\beta_1})(\exists y \in L_{\beta_2})F(x, y, \vec b)^{L_{\beta_2}}$ exactly as before. Iterating the procedure we obtain a sequence $\beta_0, \beta_1, \ldots$ of ordinals $< \alpha$. We show that the function $n \mapsto \beta_n$ is Σ-definable. Having defined $\beta_n$ we put
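\[ \beta_{n+1} = z \Leftrightarrow z \in On \land \beta_n < z \land (\exists u)\bigl[u = L_z \land (\forall x \in L_{\beta_n})(\exists y \in u)F(x, y, \vec b)^u \land (\forall \rho < z)(\forall v \in u)[v \neq L_\rho \lor (\exists x \in L_{\beta_n})(\forall y \in v)\neg F(x, y, \vec b)^v]\bigr]. \]
So we have $(\forall x \in \omega)(\exists z)[z = \beta_x]$ and obtain by Σ-collection and union that $\beta := \sup_{n \in \omega} \beta_n < \alpha$. But for $x \in L_\beta$ we find an $n < \omega$ such that $x \in L_{\beta_n}$. So there is a $y \in L_{\beta_{n+1}}$ such that $F(x, y, \vec b)^{L_{\beta_{n+1}}}$. Since $L_{\beta_{n+1}} \subseteq L_\beta$ we finally obtain $(\forall x \in L_\beta)(\exists y \in L_\beta)F(x, y, \vec b)^{L_\beta}$. Since $L_\beta \in L_\alpha$ we have a reflection point.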
It is obvious that ∆0-collection follows from (Π2–REF). If $(\forall x \in a)(\exists y)F(x, y, \vec b)$ for a ∆0-formula $F(x, y, \vec b)$, we reformulate it as $(\forall x)(\exists y)[x \in a \to F(x, y, \vec b)]$ and obtain by (Π2–REF) a transitive set $c$ such that $(\forall x \in c)(\exists y \in c)[x \in a \to F(x, y, \vec b)^c]$ and $a \in c$. From $Tran(c)$ we obtain $a \subseteq c$, which entails $(\forall x \in a)(\exists y \in c)[F(x, y, \vec b)]$ by absoluteness of $F(x, y, \vec b)$.

11.8.2 Definition The theory (Π2–REF) comprises the axioms of BST together with the scheme (Π2–REF).

We have already seen that (Π2–REF) proves ∆0-collection. Therefore, (Π2–REF) is at least as strong as KP. It is, however, easy to see that in (Π2–REF) we can also prove the axiom of infinity. Let $x$ be an ordinal. Then obviously $x \cup \{x\}$ is again a hereditarily transitive set, i.e., an ordinal, such that $x \in x \cup \{x\}$. So (Π2–REF) proves $(\forall x)(\exists y)[x \in On \to y \in On \land x \in y]$. By (Π2–REF) there is a transitive set $a$ such that $(\forall x \in a)(\exists y \in a)[x \in On \to y \in On \land x \in y]$. By ∆0-separation the set $\alpha := \{x \in a \mid x \in On\}$ exists and we have $(\forall x \in \alpha)(\exists y \in \alpha)[x \in y]$, i.e., $\alpha \in Lim$. Therefore, we have the following theorem.

11.8.3 Theorem The theory KPω is a subtheory of (Π2–REF).
11.9 An Infinitary Verification Calculus for the Constructible Hierarchy
In analogy to the infinitary verification calculus for $\Pi^1_1$-sentences in the language of arithmetic, we develop an infinitary verification calculus for the $\Sigma$-sentences in the language of the constructible hierarchy. The first step is to introduce a language for the constructible hierarchy. As a peculiarity of this and the following sections, we do not count the equality symbol among the basic symbols of $\mathcal{L}(\in)$ but view equations $s = t$ as abbreviations. We put
$$s = t \;:\Leftrightarrow\; (\forall x \in t)[x \in s] \wedge (\forall x \in s)[x \in t]. \tag{11.12}$$
This has some technical advantages. In analogy to Sect. 5 we introduce the Tait language of $\mathcal{L}(\in)$. Since we regard equality as defined, the only relation symbols of the Tait language of $\mathcal{L}(\in)$ are the binary relations $\in$ and $\notin$. Formulas are built from the atomic formulas $(t \in s)$ and $(t \notin s)$ by the boolean connectives $\wedge$ and $\vee$, bounded and unbounded quantification. Negation is not among the logical symbols. Instead we define ${\sim}F$ by
• ${\sim}(t \in s) :\equiv (t \notin s)$ and ${\sim}(t \notin s) :\equiv (t \in s)$
• ${\sim}(F_0 \vee F_1) :\equiv ({\sim}F_0 \wedge {\sim}F_1)$ and ${\sim}(F_0 \wedge F_1) :\equiv ({\sim}F_0 \vee {\sim}F_1)$
• ${\sim}((\exists x \in t)F(x)) :\equiv (\forall x \in t)[{\sim}F(x)]$ and ${\sim}((\forall x \in t)F(x)) :\equiv (\exists x \in t)[{\sim}F(x)]$
• ${\sim}((\exists x)F(x)) :\equiv (\forall x)[{\sim}F(x)]$ and ${\sim}((\forall x)F(x)) :\equiv (\exists x)[{\sim}F(x)]$
Again we have $\mathcal{S} \models {\sim}F \Leftrightarrow \mathcal{S} \models \neg F$ for any $\mathcal{L}(\in)$–structure $\mathcal{S}$. Therefore, we will mostly use ${\sim}F$ and $\neg F$ synonymously, although negation is not among the basic symbols. For the following definition of $\mathcal{L}_{RS}$-terms and sentences we assume that $\mathcal{L}(\in)$ is given as Tait language.
11.9.1 Definition (Terms of the language $\mathcal{L}_{RS}$ of ramified set theory) We define the terms of $\mathcal{L}_{RS}$ together with their stages inductively by the following clauses.
• For every ordinal $\alpha$ the symbol $L_\alpha$ is an (atomic) $\mathcal{L}_{RS}$-term of stage $\alpha$.
• If $a_1,\dots,a_n$ are $\mathcal{L}_{RS}$-terms of stages $< \alpha$ and $F(v_1,\dots,v_n)$ is an $\mathcal{L}(\in)$-formula then $\{x \in L_\alpha \mid F(x,a_1,\dots,a_n)^{L_\alpha}\}$ is a (composed) $\mathcal{L}_{RS}$-term of stage $\alpha$.
By $\mathrm{stg}(t)$ we denote the stage of an $\mathcal{L}_{RS}$-term $t$. By $T_\alpha$ we denote the set of $\mathcal{L}_{RS}$-terms of stages less than $\alpha$. We assume an ordering $<_{ST}$ on $T_\alpha$ such that all terms in $T_\alpha$ come before the terms in $T_\beta \setminus T_\alpha$ if $\alpha < \beta$.
11.9.2 Definition If $F(v_1,\dots,v_n)$ is an $\mathcal{L}(\in)$-formula which contains at most the shown free variables $v_1,\dots,v_n$ and $a_1,\dots,a_n$ are $\mathcal{L}_{RS}$-terms then $F_{v_1,\dots,v_n}(a_1,\dots,a_n)$ is an $\mathcal{L}_{RS}$-sentence. The stage $\mathrm{stg}(F)$ of an $\mathcal{L}_{RS}$-formula is the maximum of the stages of $\mathcal{L}_{RS}$-terms occurring in $F$. The notions of $\Delta_0$-, $\Sigma$-, $\Pi$-, $\Sigma_1$-, $\dots$ formulas carry over to $\mathcal{L}_{RS}$.
The semantics for LRS is defined in the obvious way. We are only interested in the standard meaning of LRS -terms and sentences and will therefore only define their meaning in the constructible hierarchy L. We put
• $L_\alpha^{L} = L_\alpha$,
• $\{x \in L_\alpha \mid F(x,a_1,\dots,a_n)^{L_\alpha}\}^{L} := \{x \in L_\alpha \mid L_\alpha \models F[x,a_1^L,\dots,a_n^L]\}$ and
• $L \models F(a_1,\dots,a_n) :\Leftrightarrow L \models F[a_1^L,\dots,a_n^L]$.
11.9.3 Lemma For every constructible set $a \in L$ there is an $\mathcal{L}_{RS}$-term $t$ such that $a = t^L$.
Proof We prove the lemma by induction on $\mathrm{rnk}_L(a)$. If $\mathrm{rnk}_L(a) = \alpha + 1$ we have $a = \{x \in L_\alpha \mid L_\alpha \models F[x,a_1,\dots,a_n]\}$ for constructible sets $a_1,\dots,a_n$. Since $\mathrm{rnk}_L(a_i) < \mathrm{rnk}_L(a)$ for $i = 1,\dots,n$ we have by induction hypothesis $\mathcal{L}_{RS}$-terms $t_1,\dots,t_n$ such that $a_i = t_i^L$. Then $t := \{x \in L_\alpha \mid F(x,t_1,\dots,t_n)^{L_\alpha}\}$ is an $\mathcal{L}_{RS}$-term and we obtain $t^L = \{x \in L_\alpha \mid L_\alpha \models F[x,t_1^L,\dots,t_n^L]\} = \{x \in L_\alpha \mid L_\alpha \models F[x,a_1,\dots,a_n]\} = a$.
The following corollary follows immediately from the proof of Lemma 11.9.3 and the definition of $t^L$.
11.9.4 Corollary For every $a \in L_\alpha$ there is an $\mathcal{L}_{RS}$-term $t \in T_\alpha$ such that $a = t^L$. Vice versa $t^L \in L_\alpha$ is true for all $t \in T_\alpha$.
We divide the $\mathcal{L}_{RS}$-sentences into two types.
11.9.5 Definition The $\bigvee$–type comprises
• Sentences of the form $(s \in t)$ for $\mathcal{L}_{RS}$-terms $s$ and $t$
• Sentences of the form $(F_0 \vee F_1)$
• Sentences of the form $(\exists x \in t)F(x)$
• Sentences of the form $(\exists x)F(x)$.
Dually the $\bigwedge$–type comprises
• Sentences of the form $(s \notin t)$ for $\mathcal{L}_{RS}$-terms $s$ and $t$
• Sentences of the form $(F_0 \wedge F_1)$
• Sentences of the form $(\forall x \in t)F(x)$
• Sentences of the form $(\forall x)F(x)$.
It is obvious that an $\mathcal{L}_{RS}$-sentence $G$ belongs to $\bigvee$–type if and only if ${\sim}G$ belongs to $\bigwedge$–type and vice versa.
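The Tait-style negation and the division into two types can be made concrete in a small sketch. The following Python fragment is only an illustration under simplifying assumptions: terms are plain strings, and the class names `Atom`, `Bin`, `Quant` as well as the functions `neg` and `sentence_type` are hypothetical names introduced here, not notation from the text. It mirrors the clauses for ${\sim}F$ and the type assignment of Definition 11.9.5.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """(s in t) if positive is True, (s not-in t) otherwise."""
    s: str
    t: str
    positive: bool = True


@dataclass(frozen=True)
class Bin:
    """Binary connective; op is "or" or "and"."""
    op: str
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    """Quantifier; kind is "exists" or "forall", bound None means unbounded."""
    kind: str
    var: str
    bound: Optional[str]
    body: "Formula"


Formula = Union[Atom, Bin, Quant]


def neg(f: Formula) -> Formula:
    """Defined negation ~F: dualize every symbol, push the negation to the atoms."""
    if isinstance(f, Atom):
        return Atom(f.s, f.t, not f.positive)
    if isinstance(f, Bin):
        dual = "and" if f.op == "or" else "or"
        return Bin(dual, neg(f.left), neg(f.right))
    dual = "forall" if f.kind == "exists" else "exists"
    return Quant(dual, f.var, f.bound, neg(f.body))


def sentence_type(f: Formula) -> str:
    """Return "or" for the disjunctive type and "and" for the conjunctive type."""
    if isinstance(f, Atom):
        return "or" if f.positive else "and"
    if isinstance(f, Bin):
        return f.op
    return "or" if f.kind == "exists" else "and"
```

In this sketch `neg(neg(f)) == f` holds for every formula, and `sentence_type(neg(f))` is always the dual of `sentence_type(f)`, which is the observation made after Definition 11.9.5.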
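11.9.6 Definition We define the characteristic sequence for sentences in $\bigvee$–type by the following clauses
$$\mathrm{CS}(s \in t) := \begin{cases} \langle s = s' \mid s' \in T_\alpha\rangle & \text{if } t = L_\alpha \\ \langle s = s' \wedge F(s') \mid s' \in T_\alpha\rangle & \text{if } t = \{x \in L_\alpha \mid F(x)\}, \end{cases}$$
$$\mathrm{CS}(F_0 \vee F_1) = \langle F_0, F_1\rangle,$$
$$\mathrm{CS}((\exists x \in t)F(x)) = \begin{cases} \langle F(s) \mid s \in T_\alpha\rangle & \text{if } t = L_\alpha \\ \langle F(s) \wedge G(s) \mid s \in T_\alpha\rangle & \text{if } t = \{x \in L_\alpha \mid G(x)\}, \end{cases}$$
$$\mathrm{CS}((\exists x)F(x)) = \Bigl\langle F(t) \Bigm| t \in \bigcup_{\alpha \in On} T_\alpha \Bigr\rangle.$$
Dually we define for sentences in $\bigwedge$–type
$$\mathrm{CS}(s \notin t) := \begin{cases} \langle {\sim}(s = s') \mid s' \in T_\alpha\rangle & \text{if } t = L_\alpha \\ \langle {\sim}(s = s') \vee {\sim}F(s') \mid s' \in T_\alpha\rangle & \text{if } t = \{x \in L_\alpha \mid F(x)\}, \end{cases}$$
$$\mathrm{CS}(F_0 \wedge F_1) = \langle F_0, F_1\rangle,$$
$$\mathrm{CS}((\forall x \in t)F(x)) = \begin{cases} \langle F(s) \mid s \in T_\alpha\rangle & \text{if } t = L_\alpha \\ \langle {\sim}G(s) \vee F(s) \mid s \in T_\alpha\rangle & \text{if } t = \{x \in L_\alpha \mid G(x)\}, \end{cases}$$
$$\mathrm{CS}((\forall x)F(x)) = \Bigl\langle F(t) \Bigm| t \in \bigcup_{\alpha \in On} T_\alpha \Bigr\rangle.$$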
For every member G ∈ CS(F) let i(G) ∈ Tα denote its index. In case of F ≡ (F0 ◦F1 ) for ◦ ∈ {∧, ∨} put i(F0 ) = 0 and i(F1 ) = 1. Observe that
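$$F \in \textstyle\bigvee\text{–type} \;\Leftrightarrow\; {\sim}F \in \bigwedge\text{–type}$$
and
$$\mathrm{CS}({\sim}F) = \langle {\sim}G \mid G \in \mathrm{CS}(F)\rangle$$
holds true for all $\mathcal{L}_{RS}$-sentences $F$.
11.9.7 Remark According to (11.12) we regard equations and inequalities as defined by extensionality. An alternative approach is to put $s = t$ into $\bigwedge$–type, $s \neq t$ into $\bigvee$–type and to define $\mathrm{CS}(s = t) := \langle (\forall x \in s)[x \in t], (\forall x \in t)[x \in s]\rangle$ and dually $\mathrm{CS}(s \neq t) := \langle (\exists x \in s)[x \notin t], (\exists x \in t)[x \notin s]\rangle$. To reduce the number of cases we opted for the approach via extensionality.
11.9.8 Lemma Let $F$ be an $\mathcal{L}_{RS}$-sentence. Then
$$F \in \textstyle\bigvee\text{–type} \;\Rightarrow\; (L \models F \Leftrightarrow (\exists G \in \mathrm{CS}(F))\, L \models G)$$
and
$$F \in \textstyle\bigwedge\text{–type} \;\Rightarrow\; (L \models F \Leftrightarrow (\forall G \in \mathrm{CS}(F))\, L \models G).$$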
Proof We prove the lemma only for sentences in $\bigvee$–type. This suffices because for $F \in \bigwedge$–type we have ${\sim}F \in \bigvee$–type and
$$L \models F \Leftrightarrow L \not\models {\sim}F \Leftrightarrow (\forall G \in \mathrm{CS}({\sim}F))[L \not\models G] \Leftrightarrow (\forall G \in \mathrm{CS}({\sim}F))[L \models {\sim}G] \Leftrightarrow (\forall G \in \mathrm{CS}(F))[L \models G].$$
If $F$ is a sentence $s \in L_\alpha$ we have $L \models (s \in L_\alpha)$ if and only if $s^L \in L_\alpha$ which holds if and only if $s^L = a$ for some set $a \in L_\alpha$. But then there is an $\mathcal{L}_{RS}$-term $t \in T_\alpha$ such that $s^L = t^L$. By absoluteness this holds if and only if $L \models s = t$ and obviously $(s = t) \in \mathrm{CS}(s \in L_\alpha)$.
If $F$ is a sentence $s \in \{x \in L_\alpha \mid F(x,a_1,\dots,a_n)^{L_\alpha}\}$ we get $L \models F$ if and only if $s^L \in \{x \in L_\alpha \mid L_\alpha \models F[x,a_1^L,\dots,a_n^L]\}$. This is the case if and only if $s^L \in L_\alpha$ and $L_\alpha \models F[s^L,a_1^L,\dots,a_n^L]$. By Corollary 11.9.4 there is a term $t \in T_\alpha$ such that $s^L = t^L$. Then also $L \models s = t$ and $L_\alpha \models F[t^L,a_1^L,\dots,a_n^L]$ by absoluteness. This entails $L \models s = t \wedge F(t,a_1,\dots,a_n)^{L_\alpha}$ and $(s = t \wedge F(t,a_1,\dots,a_n)^{L_\alpha}) \in \mathrm{CS}(F)$. If vice versa $L \models s = t \wedge F(t,a_1,\dots,a_n)^{L_\alpha}$ for some $t \in T_\alpha$ and terms $a_i \in T_\alpha$ then $s^L = t^L \in L_\alpha$ and, since $a_i^L \in L_\alpha$ for $i = 1,\dots,n$, we get by absoluteness also $s^L \in \{x \in L_\alpha \mid L_\alpha \models F[x,a_1^L,\dots,a_n^L]\}$.
If $F$ is a sentence $F_0 \vee F_1$ we immediately obtain $L \models F \Leftrightarrow L \models F_i$ for some $i \in \{0,1\}$.
Now let $F$ be a sentence $(\exists x \in t)F_0(x,a_1,\dots,a_n)$. First assume that $t = L_\alpha$. Then $L \models F$ if and only if there is a set $a \in L_\alpha$ such that $L \models F_0[a,a_1^L,\dots,a_n^L]$. By Corollary 11.9.4 there is a term $t' \in T_\alpha$ such that $a = t'^L$ and we obtain $L \models F_0[t'^L,a_1^L,\dots,a_n^L]$, i.e., $L \models F_0(t',a_1,\dots,a_n)$ and $F_0(t',a_1,\dots,a_n) \in \mathrm{CS}(F)$. If conversely $L \models F_0(t',a_1,\dots,a_n)$ for some $t' \in T_\alpha$ then $t'^L \in L_\alpha$ which implies $L \models (\exists x \in L_\alpha)F_0(x,a_1,\dots,a_n)$.
If $t = \{x \in L_\alpha \mid G(x,b_1,\dots,b_m)^{L_\alpha}\}$ we have $L \models F$ if and only if there is a set $a \in L_\alpha$ such that $L \models G[a,b_1^L,\dots,b_m^L] \wedge F_0[a,a_1^L,\dots,a_n^L]$. But then there is a term $t' \in T_\alpha$ such that $a = t'^L$ and we obtain $L \models G(t',b_1,\dots,b_m) \wedge F_0(t',a_1,\dots,a_n)$. If conversely $L \models G(t',b_1,\dots,b_m) \wedge F_0(t',a_1,\dots,a_n)$ for some $t' \in T_\alpha$ then we obtain $t'^L \in L_\alpha$ which in turn implies $L \models (\exists x \in L_\alpha)\bigl[G[x,b_1^L,\dots,b_m^L] \wedge F_0[x,a_1^L,\dots,a_n^L]\bigr]$. Hence $L \models F$.
The case that $F$ is a sentence $(\exists x)F_0(x,a_1,\dots,a_n)$ is even simpler and treated analogously.
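Having defined $\bigvee$–type and $\bigwedge$–type and the characteristic sequences, we obtain for finite sets $\Delta$ of $\Sigma$-sentences in the language $\mathcal{L}_{RS}$ a verification relation $\models^{\alpha} \Delta$ according to Definition 5.4.3. Since there are no free predicate variables in $\mathcal{L}_{RS}$ we can, however, dispense with clause (Ax). Therefore there are only the rules
($\bigwedge$) If $F \in \Delta \cap \bigwedge$–type and $\models^{\alpha_G} \Delta, G$ as well as $\alpha_G < \alpha$ for all $G \in \mathrm{CS}(F)$ then $\models^{\alpha} \Delta$.
($\bigvee$) If $F \in \Delta \cap \bigvee$–type and $\models^{\alpha_0} \Delta, G$ for some $G \in \mathrm{CS}(F)$ then $\models^{\alpha} \Delta$ for all $\alpha > \alpha_0$ such that $\mathrm{stg}(i(G)) < \alpha$.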
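11.9.9 Theorem Let $\Delta$ be a finite set of $\Sigma$-sentences in the language $\mathcal{L}_{RS}$. If $\models^{\alpha} \Delta$ then $L \models \bigvee \Delta^{L_\alpha}$.
Proof We induct on $\alpha$. If the last inference is an inference according to ($\bigwedge$) then there is a sentence $F \in \Delta \cap \bigwedge$–type such that $\models^{\alpha_G} \Delta, G$ for all $G \in \mathrm{CS}(F)$. If $L \models \bigvee \Delta^{L_\alpha}$ we are done. Otherwise we obtain by persistency also $L \not\models \bigvee \Delta^{L_{\alpha_G}}$ which by the induction hypothesis entails $L \models G^{L_{\alpha_G}}$ for all $G \in \mathrm{CS}(F)$. All sentences in $\mathrm{CS}(F)$ are again $\Sigma$-sentences. By persistency we get therefore $L \models G^{L_\alpha}$ for all $G \in \mathrm{CS}(F)$. But $\mathrm{CS}(F^{L_\alpha}) = \langle G^{L_\alpha} \mid G \in \mathrm{CS}(F)\rangle$. By Lemma 11.9.8, however, we then obtain $L \models F^{L_\alpha}$ contradicting the assumption $L \not\models \bigvee \Delta^{L_\alpha}$.
Now assume that the last inference is an $\bigvee$-inference. Then there is a sentence $F \in \bigvee$–type $\cap\, \Delta$, an $\mathcal{L}_{RS}$-term $t$ such that $\mathrm{stg}(t) < \alpha$, a sentence $G \in \mathrm{CS}(F)$ such that $i(G) = t$ and an ordinal $\alpha_0 < \alpha$ such that $\models^{\alpha_0} \Delta, G$. If $L \not\models \bigvee \Delta^{L_\alpha}$ we obtain by persistency and the induction hypothesis $L \models G^{L_{\alpha_0}}$. Since $\mathrm{stg}(t) < \alpha$ we get in any case $G^{L_\alpha} \in \mathrm{CS}(F^{L_\alpha})$ and obtain by Lemma 11.9.8 $L \models F^{L_\alpha}$ contradicting $L \not\models \bigvee \Delta^{L_\alpha}$.
11.9.10 Definition Let $\Delta$ be a set of $\Sigma$-sentences of $\mathcal{L}_{RS}$. Put $\mathrm{par}(\Delta) = \{\alpha \mid L_\alpha \text{ occurs in some formula of } \Delta\}$.
11.9.11 Corollary Assume $\models^{\alpha} F$ for a $\Sigma$-sentence $F$ such that $\mathrm{par}(F) \subseteq \alpha$. Then $L_\alpha \models F$.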
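Proof By Theorem 11.9.9 we obtain $L \models F^{L_\alpha}$. Since $\mathrm{par}(F) \subseteq \alpha$ we obtain $t^L \in L_\alpha$ for all terms occurring in $F$. Hence $L_\alpha \models F$ by absoluteness.
11.9.12 Definition (Truth complexity for $\Sigma$-sentences) For a $\Sigma$-sentence $F$ we define
$$\mathrm{tc}(F) := \min\bigl(\{\alpha \mid\; \models^{\alpha} F\} \cup \{\infty\}\bigr)$$
where $\infty$ is a symbol such that $\alpha < \infty$ and $\alpha + \infty = \infty$ as well as $\infty + \alpha = \infty$ hold true for all ordinals $\alpha$.
11.9.13 Theorem If $F$ is a $\Sigma$-formula such that $\alpha := \mathrm{tc}(F) < \infty$ and $\mathrm{par}(F) \subseteq \mathrm{tc}(F)$ then $L_\alpha \models F$.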
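Proof This is immediate from Corollary 11.9.11 and the definition of $\mathrm{tc}(F)$.
The next aim is to show that the sentences in the characteristic sequence of a sentence $F$ have lower complexity than $F$. By an $\mathcal{L}_{RS}$-expression we mean either an $\mathcal{L}_{RS}$-term or an $\mathcal{L}_{RS}$-sentence.
11.9.14 Definition (The rank of an $\mathcal{L}_{RS}$-expression) We define the rank of an $\mathcal{L}_{RS}$-expression by the following clauses
• $\mathrm{rnk}(L_\alpha) := \omega\cdot\alpha$
• $\mathrm{rnk}(\{x \in L_\alpha \mid F\}) := \max\{\mathrm{rnk}(L_\alpha), \mathrm{rnk}(F_x(L_0)) + 2\}$
• $\mathrm{rnk}(s \in t) := \mathrm{rnk}(s \notin t) := \max\{\mathrm{rnk}(s) + 6, \mathrm{rnk}(t) + 1\}$
• $\mathrm{rnk}(A \vee B) := \mathrm{rnk}(A \wedge B) := \max\{\mathrm{rnk}(A), \mathrm{rnk}(B)\} + 1$
• $\mathrm{rnk}((\exists x \in s)F(x)) := \mathrm{rnk}((\forall x \in s)F(x)) := \max\{\mathrm{rnk}(s), \mathrm{rnk}(F(L_0)) + 2\}$
• $\mathrm{rnk}((\exists x)F(x)) = \mathrm{rnk}((\forall x)F(x)) := \infty$.
11.9.15 Lemma Let $a$ be an $\mathcal{L}_{RS}$-expression. If $\mathrm{rnk}(a) < \infty$ there is an $n < \omega$ such that $\mathrm{rnk}(a) = \omega\cdot\mathrm{stg}(a) + n$.
Proof We prove the lemma by induction on $\mathrm{rnk}(a) < \infty$. If $a$ is a term $L_\alpha$ then $n := 0$. If $a$ is a composed term $\{x \in L_\alpha \mid F\}$ we get $\mathrm{rnk}(a) = \max\{\mathrm{rnk}(L_\alpha), \mathrm{rnk}(F_x(L_0)) + 2\}$. Then $\mathrm{stg}(F_x(L_0)) \le \alpha$ and by induction hypothesis there is an $n_0 < \omega$ such that $\mathrm{rnk}(F_x(L_0)) = \omega\cdot\mathrm{stg}(F_x(L_0)) + n_0$. If $\mathrm{stg}(F_x(L_0)) < \alpha$ put $n := 0$ otherwise put $n := n_0 + 2$.
If $a$ is a formula $(s \in t)$ or $(s \notin t)$ then $\mathrm{rnk}(a) = \max\{\mathrm{rnk}(s) + 6, \mathrm{rnk}(t) + 1\}$. By induction hypothesis there are $n_0, n_1 < \omega$ such that $\mathrm{rnk}(s) = \omega\cdot\mathrm{stg}(s) + n_0$ and $\mathrm{rnk}(t) = \omega\cdot\mathrm{stg}(t) + n_1$. If $\mathrm{stg}(s) < \mathrm{stg}(t)$ put $n := n_1 + 1$, if $\mathrm{stg}(t) < \mathrm{stg}(s)$ put $n := n_0 + 6$, if $\mathrm{stg}(s) = \mathrm{stg}(t)$ put $n := \max\{n_0 + 6, n_1 + 1\}$.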
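If $a$ is a formula $(A \vee B)$ or $(A \wedge B)$ then $\mathrm{rnk}(a) = \max\{\mathrm{rnk}(A), \mathrm{rnk}(B)\} + 1$. By induction hypothesis we have $n_0, n_1 < \omega$ such that $\mathrm{rnk}(A) = \omega\cdot\mathrm{stg}(A) + n_0$ and $\mathrm{rnk}(B) = \omega\cdot\mathrm{stg}(B) + n_1$. If $\mathrm{stg}(A) < \mathrm{stg}(B)$ put $n := n_1 + 1$, if $\mathrm{stg}(B) < \mathrm{stg}(A)$ put $n := n_0 + 1$ and if $\mathrm{stg}(A) = \mathrm{stg}(B)$ put $n := \max\{n_0 + 1, n_1 + 1\}$.
If $a$ is a sentence of the form $(\exists x \in s)F(x)$ or $(\forall x \in s)F(x)$ then we obtain $\mathrm{rnk}(a) = \max\{\mathrm{rnk}(s), \mathrm{rnk}(F(L_0)) + 2\}$. We have $\mathrm{rnk}(s) = \omega\cdot\mathrm{stg}(s) + n_0$ and $\mathrm{rnk}(F(L_0)) = \omega\cdot\mathrm{stg}(F(L_0)) + n_1$ for $n_0, n_1 < \omega$ by induction hypothesis. If $\mathrm{stg}(F(L_0)) < \mathrm{stg}(s)$ put $n := n_0$, if $\mathrm{stg}(s) < \mathrm{stg}(F(L_0))$ put $n := n_1 + 2$, if $\mathrm{stg}(s) = \mathrm{stg}(F(L_0))$ put $n := \max\{n_0, n_1 + 2\}$.
The cases that $a$ is a formula $(\exists x)F(x)$ or $(\forall x)F(x)$ are excluded by the hypothesis $\mathrm{rnk}(a) < \infty$.
11.9.16 Lemma Let $a$ and $b_x(L_0)$ be $\mathcal{L}_{RS}$-expressions. Then
$$\mathrm{stg}(a) < \mathrm{stg}(b_x(L_0)) \text{ implies } \mathrm{rnk}(b_x(a)) = \mathrm{rnk}(b_x(L_0)) \tag{A}$$
and
$$\mathrm{stg}(a) < \alpha \text{ implies } \mathrm{rnk}(b_x(a)) < \max\{\omega\cdot\alpha, \mathrm{rnk}(b_x(L_0)) + 1\}. \tag{B}$$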
Proof We show (A) by induction on $\mathrm{rnk}(b_x(L_0))$. The case that $b_x(L_0)$ is $L_0$ is excluded. So assume $b_x(L_0)$ is the term $\{y \in L_\alpha \mid F_x(L_0)\}$. First assume $\mathrm{stg}(b_x(L_0)) = \mathrm{stg}(F_x(L_0))$. Then $\mathrm{stg}(a) < \mathrm{stg}(F_x(L_0))$ and we obtain $\mathrm{rnk}(F_x(a)) = \mathrm{rnk}(F_x(L_0))$ by the induction hypothesis. But this immediately implies $\mathrm{rnk}(b_x(a)) = \mathrm{rnk}(b_x(L_0))$. Next assume $\mathrm{stg}(F_x(L_0)) < \mathrm{stg}(b_x(L_0))$. Because of $\mathrm{stg}(a) < \mathrm{stg}(b_x(L_0)) = \alpha$ we get $\mathrm{stg}(F_x(a)) < \alpha$ and thus $\mathrm{rnk}(b_x(a)) = \omega\cdot\alpha = \mathrm{rnk}(b_x(L_0))$.
Now assume that $b_x(L_0)$ is a formula $(s_x(L_0) \in t_x(L_0))$ or $(s_x(L_0) \notin t_x(L_0))$. If $\mathrm{stg}(s_x(L_0)) < \mathrm{stg}(t_x(L_0))$ then $\mathrm{rnk}(b_x(L_0)) = \mathrm{rnk}(t_x(L_0)) + 1$ and $\mathrm{stg}(a) < \mathrm{stg}(t_x(L_0))$. By induction hypothesis it follows $\mathrm{rnk}(t_x(a)) = \mathrm{rnk}(t_x(L_0))$ and, since still $\mathrm{stg}(s_x(a)) < \mathrm{stg}(t_x(a))$, also $\mathrm{rnk}(b_x(a)) = \mathrm{rnk}(b_x(L_0))$. If $\mathrm{stg}(t_x(L_0)) < \mathrm{stg}(s_x(L_0))$ we obtain the claim by a similar argument. If $\mathrm{stg}(s_x(L_0)) = \mathrm{stg}(t_x(L_0))$ we get the claim directly from the induction hypotheses. The remaining cases are all similar and left as an exercise.
If $\mathrm{stg}(a) < \mathrm{stg}(b_x(L_0))$ we obtain claim (B) directly from (A). If $\mathrm{stg}(b_x(L_0)) \le \mathrm{stg}(a) < \alpha$ we have $\mathrm{stg}(b_x(a)) < \alpha$ which implies $\mathrm{rnk}(b_x(a)) < \omega\cdot\alpha$ by Lemma 11.9.15.
11.9.17 Lemma Let $s$ and $t$ be $\mathcal{L}_{RS}$-terms. Then $\mathrm{rnk}(s = t) = \mathrm{rnk}(s \neq t) = \max\{9, \mathrm{rnk}(s) + 4, \mathrm{rnk}(t) + 4\}$.
Proof The sentence $s = t$ is a shorthand for $(\forall x \in s)[x \in t] \wedge (\forall x \in t)[x \in s]$. According to Definition 11.9.14 we get
$$\mathrm{rnk}((\forall x \in s)[x \in t]) = \max\{\mathrm{rnk}(s), 8, \mathrm{rnk}(t) + 3\}. \tag{i}$$
Hence $\mathrm{rnk}(s = t) = \max\{8, \mathrm{rnk}(t) + 3, \mathrm{rnk}(s) + 3\} + 1$. The case of a sentence $s \neq t$ is completely analogous.
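The rank clauses of Definition 11.9.14 and the value stated in Lemma 11.9.17 can be checked mechanically on small examples. The following Python fragment is a minimal sketch under stated assumptions, not part of the original development: it only treats expressions of finite stage without unbounded quantifiers, it stores the ordinal $\omega\cdot a + n$ as the pair `(a, n)`, and the class and function names (`Var`, `L`, `Comp`, `In`, `Bin`, `BQ`, `rnk`, `eq`, `plus`) are hypothetical. Bound variables are read as $L_0$, which is how the clauses for $F_x(L_0)$ and $F(L_0)$ are evaluated. The stage condition on parameters of composed terms is not enforced.

```python
from dataclasses import dataclass


def plus(o, k):
    """Add a natural number k to the ordinal omega*a + n stored as (a, n)."""
    return (o[0], o[1] + k)


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class L:
    alpha: int  # atomic term L_alpha with a finite stage alpha


@dataclass(frozen=True)
class Comp:
    alpha: int  # composed term {var in L_alpha | body}
    var: str
    body: object


@dataclass(frozen=True)
class In:
    s: object  # stands for (s in t) and (s not-in t); both have the same rank
    t: object


@dataclass(frozen=True)
class Bin:
    left: object  # stands for a conjunction or a disjunction
    right: object


@dataclass(frozen=True)
class BQ:
    var: str  # bounded quantifier (Q var in bound) body
    bound: object
    body: object


def rnk(e):
    """Rank as a pair (a, n) meaning omega*a + n; tuple order is ordinal order."""
    if isinstance(e, Var):
        return (0, 0)  # a bound variable is replaced by L_0
    if isinstance(e, L):
        return (e.alpha, 0)
    if isinstance(e, Comp):
        return max((e.alpha, 0), plus(rnk(e.body), 2))
    if isinstance(e, In):
        return max(plus(rnk(e.s), 6), plus(rnk(e.t), 1))
    if isinstance(e, Bin):
        return plus(max(rnk(e.left), rnk(e.right)), 1)
    if isinstance(e, BQ):
        return max(rnk(e.bound), plus(rnk(e.body), 2))
    raise TypeError(f"not an expression: {e!r}")


def eq(s, t):
    """The defined equation s = t, unfolded by extensionality."""
    return Bin(BQ("x", s, In(Var("x"), t)), BQ("x", t, In(Var("x"), s)))
```

For instance, with `s = L(2)` and `t = Comp(3, "x", In(Var("x"), Var("x")))` the value `rnk(eq(s, t))` agrees with `max((0, 9), plus(rnk(s), 4), plus(rnk(t), 4))`, as Lemma 11.9.17 predicts. In line with Lemma 11.9.15, the first component of the pair is the stage of the expression.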
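11.9.18 Lemma Let $F$ be a $\Delta_0$-sentence of $\mathcal{L}_{RS}$ and $G \in \mathrm{CS}(F)$. Then $\mathrm{rnk}(G) < \mathrm{rnk}(F)$.
Proof We run through the cases. First let $F$ be a formula $(s \in L_\alpha)$ or $(s \notin L_\alpha)$. Then $\alpha \neq 0$ and $G \in \mathrm{CS}(F)$ is of the form $s = s'$ or ${\sim}(s = s')$ for some $s' \in T_\alpha$. Hence $\mathrm{rnk}(G) = \max\{9, \mathrm{rnk}(s) + 4, \mathrm{rnk}(s') + 4\} < \max\{\mathrm{rnk}(s) + 6, \omega\cdot\alpha + 1\} = \mathrm{rnk}(F)$.
If $F$ is the sentence $s \in \{x \in L_\alpha \mid A\}$ or $s \notin \{x \in L_\alpha \mid A\}$ then $\alpha \neq 0$ and $G$ is a sentence $s = s' \wedge A_x(s')$ or ${\sim}(s = s') \vee {\sim}A_x(s')$ for some $s' \in T_\alpha$. Hence $\mathrm{rnk}(G) = \max\{10, \mathrm{rnk}(s) + 5, \mathrm{rnk}(s') + 5, \mathrm{rnk}(A_x(s')) + 1\} < \max\{\mathrm{rnk}(s) + 6, \omega\cdot\alpha + 1, \mathrm{rnk}(A_x(L_0)) + 3\} = \mathrm{rnk}(F)$ by Lemma 11.9.16.
The claim follows directly from Definition 11.9.14 if $F$ is a sentence $A \vee B$ or $A \wedge B$.
So assume that $F$ is of the form $(\forall x \in L_\alpha)A$ or $(\exists x \in L_\alpha)A$. Then $\alpha \neq 0$ and $G \equiv A_x(s')$ for some $s' \in T_\alpha$. By Lemma 11.9.16 (B) we obtain $\mathrm{rnk}(A_x(s')) < \max\{\omega\cdot\alpha, \mathrm{rnk}(A_x(L_0)) + 2\} = \mathrm{rnk}(F)$.
Finally assume that $F$ is a formula of the shape $(\exists x \in \{y \in L_\alpha \mid H\})A$ or a formula $(\forall x \in \{y \in L_\alpha \mid H\})A$. Then $\alpha \neq 0$ and $G$ is a sentence $H_y(s') \wedge A_x(s')$ or ${\sim}H_y(s') \vee A_x(s')$. Then we get by Lemma 11.9.16 (B) $\mathrm{rnk}(G) = \max\{\mathrm{rnk}(H_y(s')) + 1, \mathrm{rnk}(A_x(s')) + 1\} < \max\{\omega\cdot\alpha, \mathrm{rnk}(H_y(L_0)) + 2, \mathrm{rnk}(A_x(L_0)) + 2\} = \mathrm{rnk}(F)$.
11.9.19 Corollary For a $\Delta_0$-sentence $A$ which is true in the constructible hierarchy we have $\mathrm{tc}(A) \le \mathrm{rnk}(A)$.
Proof Using Lemma 11.9.18 and Lemma 11.9.8 we obtain by induction on $\mathrm{rnk}(A)$
$$L \models A \;\Rightarrow\; \models^{\mathrm{rnk}(A)} \Delta, A$$
for any finite set $\Delta$ of $\mathcal{L}_{RS}$-sentences. Hence $\models^{\mathrm{rnk}(A)} A$ which entails $\mathrm{tc}(A) \le \mathrm{rnk}(A)$.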
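As a small worked instance of Lemma 11.9.18 and Corollary 11.9.19 (an illustration added here, not taken from the original argument), consider the sentence $L_0 \in L_1$. Its characteristic sequence consists of the sentences $L_0 = s'$ for $s' \in T_1$. Every $s' \in T_1$ has stage $0$, hence by Lemma 11.9.15 a finite rank, written $n$ below. Using Definition 11.9.14 and Lemma 11.9.17 one may compute:

```latex
\begin{align*}
\mathrm{rnk}(L_0 \in L_1)
  &= \max\{\mathrm{rnk}(L_0) + 6,\ \mathrm{rnk}(L_1) + 1\}
   = \max\{6,\ \omega + 1\} = \omega + 1, \\
\mathrm{rnk}(L_0 = s')
  &= \max\{9,\ \mathrm{rnk}(L_0) + 4,\ \mathrm{rnk}(s') + 4\}
   = \max\{9,\ n + 4\} < \omega < \mathrm{rnk}(L_0 \in L_1).
\end{align*}
```

Since $L_0 \in L_1$ is true in the constructible hierarchy, Corollary 11.9.19 gives the bound $\mathrm{tc}(L_0 \in L_1) \le \omega + 1$ in this example.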
11.10 A Semiformal System for Ramified Set Theory
Our aim is the computation of the $\Sigma$-ordinal of KPω or even ($\Pi_2$–REF). We have already computed a lower bound for $\|\mathrm{KP}\omega\|_\Sigma$ in Theorem 11.7.4. According to Theorem 11.9.13 it suffices to compute an upper bound for the truth-complexities of the $\Sigma$-sentences which are provable in ($\Pi_2$–REF) to obtain also an upper bound for $\|(\Pi_2\text{–REF})\|_\Sigma$. We follow the pattern of Sect. 9.3 and design a semiformal system
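with operator controlled infinitary derivations. We repeat the definition of the semiformal system.
11.10.1 Definition We define the proof relation $\vdash^{\alpha}_{\rho} \Delta$ inductively by the following clauses:
($\bigwedge$) If $F \in \bigwedge$–type $\cap\, \Delta$ and $\vdash^{\alpha_G}_{\rho} \Delta, G$ as well as $\alpha_G < \alpha$ for all $G \in \mathrm{CS}(F)$ then $\vdash^{\alpha}_{\rho} \Delta$.
($\bigvee$) If $F \in \bigvee$–type $\cap\, \Delta$ and $\vdash^{\alpha_0}_{\rho} \Delta, G$ for some $G \in \mathrm{CS}(F)$ then $\vdash^{\alpha}_{\rho} \Delta$ for all $\alpha > \alpha_0$ such that $\mathrm{stg}(i(G)) < \alpha$.
(cut) If $\vdash^{\alpha_0}_{\rho} \Delta, F$ and $\vdash^{\alpha_0}_{\rho} \Delta, {\sim}F$ for some $\alpha_0 < \alpha$ and sentence $F$ such that $\mathrm{rnk}(F) < \rho$ then $\vdash^{\alpha}_{\rho} \Delta$.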
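First we obtain a theorem corresponding to Theorem 9.3.2.
11.10.2 Theorem Let $\Gamma$ be a finite set of $\mathcal{L}_{RS}$-sentences. Then
$$\vdash^{\alpha}_{\rho} \Gamma \;\Rightarrow\; \models^{\alpha} \Gamma.$$
Proof The proof is exactly that of Theorem 9.3.2 in which $\mathbb{N} \models F$ is replaced by $L \models F$.
We copy the following definition from Sect. 9.3.
11.10.3 Definition Let $\mathcal{H}$ be a Skolem–hull operator. We define the relation $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$ by the clauses ($\bigwedge$), ($\bigvee$) and (cut) of Definition 11.10.1 with the additional conditions
• $\alpha \in \mathcal{H}(\mathrm{par}(\Delta))$
and for an inference
$$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota \text{ for } \iota \in I \;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho} \Delta$$
with finite $I$ also
• $\mathrm{par}(\Delta_\iota) \subseteq \mathcal{H}(\mathrm{par}(\Delta))$.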
In spite of the slight modification in the ($\bigvee$)-rule the calculus $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta$ coincides with the calculus defined in Sect. 9.3. The properties 9.3.9 through 9.3.15 still hold true for the new calculus. For Lemma 9.3.13, however, observe that the only sentences of $\bigwedge$–type with empty characteristic sequences are formulas of the form $s \notin t$ with $\mathrm{stg}(t) = 0$, i.e., with terms $t$ representing the empty set. The first difference is in Lemma 9.3.16 which is replaced by the following boundedness lemma.
11.10.4 Lemma (Boundedness) Assume $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F$ for a $\Sigma$-sentence $F$. Then $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F^{L_\beta}$ for all $\beta \ge \alpha$.
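Proof First we observe that $\mathrm{par}(\Delta, F) \subseteq \mathrm{par}(\Delta, F^{L_\beta})$. This, in turn, implies $\alpha \in \mathcal{H}(\mathrm{par}(\Delta, F^{L_\beta}))$. We show the lemma by induction on $\alpha$. Assume that $F$ is not a sentence of the form $(\exists x)G(x)$ which is simultaneously also the critical formula of the last inference
$$\text{(I)}\qquad \mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, F, G_\iota \text{ for } \iota \in I \;\Rightarrow\; \mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F.$$
Then we obtain
$$\mathcal{H} \vdash^{\alpha_\iota}_{\rho} \Delta_\iota, F^{L_\beta}, G_\iota^{L_\beta} \tag{i}$$
for all $\beta \ge \alpha$ by the induction hypothesis. In case of a finite index set $I$ we also have $\mathrm{par}(\Delta_\iota, F^{L_\beta}, G_\iota^{L_\beta}) = \mathrm{par}(\Delta_\iota, F, G_\iota) \cup \{\beta\} \subseteq \mathcal{H}(\mathrm{par}(\Delta, F)) \cup \{\beta\} \subseteq \mathcal{H}(\mathrm{par}(\Delta, F^{L_\beta}))$. Therefore, we obtain $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F^{L_\beta}$ from (i) by an inference (I).
If $F$ is a $\Sigma$-sentence $(\exists x)G$ which is the critical formula of the last inference then this is an inference according to ($\bigvee$) whose premise is
$$\mathcal{H} \vdash^{\alpha_0}_{\rho} \Delta, F, G_x(t) \tag{ii}$$
for a term $t$ such that $\mathrm{stg}(t) < \alpha$. But then $G_x(t) \in \mathrm{CS}(F^{L_\beta})$ for $\beta \ge \alpha$ and $\mathrm{par}(\Delta, F, G_x(t)) \subseteq \mathcal{H}(\mathrm{par}(\Delta, F)) \subseteq \mathcal{H}(\mathrm{par}(\Delta, F^{L_\beta}))$ and we obtain $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F^{L_\beta}$ by an inference ($\bigvee$).
Derivations in the semiformal systems are bothersome because of the ordinal bounds and the controlling operators. But we have seen in Corollary 11.9.19 that an upper bound for the truth complexity of a sentence can be computed from its rank. In most cases also the extended identity operator will suffice to control the parameters. The extended identity operator is the operator defined by
$$\mathcal{I}(X) := \text{closure of } X \cup \{0, \Omega\} \text{ under the successor function and } \lambda\xi.\,\omega^{\xi}.$$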
We introduce an auxiliary calculus $\vdash^{\circ} \Delta$ for multisets $\Delta$ of $\Delta_0$-sentences of $\mathcal{L}_{RS}$. A multiset is a finite unordered sequence. Two multisets $F_1,\dots,F_n$ and $G_1,\dots,G_m$ are equal if $n = m$ and there is a permutation $\pi$ of the numbers $1,\dots,n$ such that $F_i = G_{\pi(i)}$ for $i = 1,\dots,n$. If $\Delta := F_1,\dots,F_n$ and $\Gamma$ are multisets we write $\Delta \subseteq \Gamma$ if there is a permutation $\pi$ such that $F_{\pi(1)},\dots,F_{\pi(n)}$ is an initial sequence of $\Gamma$, i.e., all $F_i$ occur in $\Gamma$ with at least the same multiplicity in which they occur in $\Delta$.
11.10.5 Definition The calculus $\vdash^{\circ} \Delta$ is defined by the two clauses:
($\bigvee$)$^{\circ}$ If $F \in \bigvee$–type and $\vdash^{\circ} \Delta, \Gamma$ for some nonvoid set $\Gamma \subseteq \mathrm{CS}(F)$ then $\vdash^{\circ} \Lambda$ holds true for all $\Lambda \supseteq \Delta, F$ (viewed as multisets)
and
($\bigwedge$)$^{\circ}$ If $F \in \bigwedge$–type and $\vdash^{\circ} \Delta, G$ for all $G \in \mathrm{CS}(F)$ then $\vdash^{\circ} \Lambda$ for all $\Lambda \supseteq \Delta, F$ (viewed as multisets)
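with the proviso that in case of a finite number of premises $\vdash^{\circ} \Delta, \Gamma$ or $\vdash^{\circ} \Delta, G$ we have $\mathrm{par}(\Delta, \Gamma) \subseteq \mathcal{I}(\mathrm{par}(\Lambda))$ or $\mathrm{par}(\Delta, G) \subseteq \mathcal{I}(\mathrm{par}(\Lambda))$, respectively. We call the formula $F$ in clauses ($\bigvee$)$^{\circ}$ and ($\bigwedge$)$^{\circ}$ the critical sentence of the inference. For a multiset $\Delta = F_1,\dots,F_n$ of $\Delta_0$-sentences we define its derivation rank
$$\mathrm{drnk}(\Delta) := \omega^{\mathrm{rnk}(F_1)} \mathbin{\#} \cdots \mathbin{\#} \omega^{\mathrm{rnk}(F_n)},$$
where $\#$ denotes the natural sum of ordinals.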
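11.10.6 Lemma If $\vdash^{\circ} \Lambda$ for a multiset $\Lambda$ then $\mathcal{H} \vdash^{\mathrm{drnk}(\Lambda)}_{0} \Lambda$ holds for any Cantorian closed Skolem–hull operator $\mathcal{H}$.
Proof We induct on the definition of $\vdash^{\circ} \Lambda$. First we observe that for a Cantorian closed operator $\mathcal{H}$ we always have $\mathrm{drnk}(\Lambda) \in \mathcal{H}(\mathrm{par}(\Lambda))$. Let $F$ be the critical sentence of the last inference
$$\text{(I)}\qquad \vdash^{\circ} \Delta, \Gamma \;\Rightarrow\; \vdash^{\circ} \Lambda.$$
If (I) is an inference according to ($\bigvee$)$^{\circ}$ we have the premise $\vdash^{\circ} \Delta, \Gamma$ for some nonvoid set $\Gamma \subseteq \mathrm{CS}(F)$ and $\Delta, F \subseteq \Lambda$. By the induction hypothesis we obtain
$$\mathcal{H} \vdash^{\mathrm{drnk}(\Delta,\Gamma)}_{0} \Delta, \Gamma. \tag{i}$$
But then $\mathrm{rnk}(G) < \mathrm{rnk}(F)$ for all $G \in \Gamma$ by Lemma 11.9.18. This, however, implies
$$\sum^{\#}_{G \in \Gamma} \omega^{\mathrm{rnk}(G)} + \omega < \omega^{\mathrm{rnk}(F)}$$
and thus $\mathrm{drnk}(\Delta, \Gamma) + \omega < \mathrm{drnk}(\Delta, F) \le \mathrm{drnk}(\Lambda)$. Here we see why it is important to have multisets, because $F$ may occur in $\Delta$ and thus has to be counted with multiplicity 2 in $\Delta, F$ to ensure $\mathrm{drnk}(\Delta, \Gamma) + \omega < \mathrm{drnk}(\Delta, F)$. From (i), however, we obtain $\mathcal{H} \vdash^{\mathrm{drnk}(\Delta,F)}_{0} \Delta, F$ and therefore also $\mathcal{H} \vdash^{\mathrm{drnk}(\Lambda)}_{0} \Lambda$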
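by the structural rule and finitely many applications of inferences ($\bigvee$) which are applicable because $\mathrm{par}(\Delta, \Gamma) \subseteq \mathcal{I}(\mathrm{par}(\Lambda)) \subseteq \mathcal{H}(\mathrm{par}(\Lambda))$ holds for any Cantorian closed Skolem–hull operator.
The case of an inference according to ($\bigwedge$)$^{\circ}$ is analogous with the simplification that we only need one application of an inference ($\bigwedge$) and do not have to bother about the parameters if there are infinitely many premises.
Another awkwardness of ramified set theory is the fact that sentences of the form $a \in b$, $a \notin b$, $(\exists x \in b)F(x)$ and $(\forall x \in b)F(x)$ have different characteristic sequences according to the shape of $b$. To unify the notation we introduce a modified membership relation⁹ $a \mathrel{\dot\in} b$ as well as its negation $a \mathrel{\dot\notin} b$ and assume that $a \mathrel{\dot\in} b$ and $a \mathrel{\dot\notin} b$ are empty sentences in case that $b$ is an atomic term $L_\alpha$ and denote the formula $F(a)$ or ${\sim}F(a)$, respectively, in case that $b$ is a composed term $\{x \in L_\alpha \mid F(x)\}$. More precisely we define
$$a \mathrel{\dot\in} b \wedge G(s) \;:\Leftrightarrow\; \begin{cases} G(s) & \text{if } b = L_\alpha \\ F(a) \wedge G(s) & \text{if } b = \{x \in L_\alpha \mid F(x)\} \end{cases}$$
$$a \mathrel{\dot\notin} b \vee G(s) \;:\Leftrightarrow\; \begin{cases} G(s) & \text{if } b = L_\alpha \\ {\sim}F(a) \vee G(s) & \text{if } b = \{x \in L_\alpha \mid F(x)\} \end{cases}$$
and for multisets also
$$\cdots, a \mathrel{\dot\notin} b, G(s), \cdots \;:\Leftrightarrow\; \begin{cases} \cdots, G(s), \cdots & \text{if } b = L_\alpha \\ \cdots, {\sim}F(a), G(s), \cdots & \text{if } b = \{x \in L_\alpha \mid F(x)\}. \end{cases}$$
This has the advantage that we obtain $\mathrm{CS}(a \in b) = \langle t \mathrel{\dot\in} b \wedge t = a \mid t \in T_{\mathrm{stg}(b)}\rangle$, $\mathrm{CS}(a \notin b) = \langle t \mathrel{\dot\notin} b \vee t \neq a \mid t \in T_{\mathrm{stg}(b)}\rangle$, $\mathrm{CS}((\exists x \in b)F(x)) = \langle t \mathrel{\dot\in} b \wedge F(t) \mid t \in T_{\mathrm{stg}(b)}\rangle$ as well as $\mathrm{CS}((\forall x \in b)F(x)) = \langle t \mathrel{\dot\notin} b \vee F(t) \mid t \in T_{\mathrm{stg}(b)}\rangle$ independent of the shape of $b$. There are some derived rules using this new notation which we will use quite frequently.
(Str) If $\vdash^{\circ} \Delta$ and $\Delta \subseteq \Gamma$ then $\vdash^{\circ} \Gamma$
(Taut) $\vdash^{\circ} \Delta, A, {\sim}A$
(Sent) If $\vdash^{\circ} \Delta, A$ then $\vdash^{\circ} \Delta, {\sim}B, A \wedge B$
($\in$) If $\vdash^{\circ} \Delta, t \mathrel{\dot\in} b \wedge t = a$ and $\mathrm{par}(t) \subseteq \mathrm{par}(\Delta, a \in b)$ for some $t \in T_{\mathrm{stg}(b)}$ then $\vdash^{\circ} \Delta, a \in b$
($\notin$) If $\vdash^{\circ} \Delta, t \mathrel{\dot\notin} b, t \neq a$ for all $t \in T_{\mathrm{stg}(b)}$ then $\vdash^{\circ} \Delta, a \notin b$
($\exists^{b}$) If $\vdash^{\circ} \Delta, t \mathrel{\dot\in} b \wedge F(t)$ and $\mathrm{par}(t) \subseteq \mathrm{par}(\Delta, (\exists x \in b)F(x))$ for some $t \in T_{\mathrm{stg}(b)}$ then $\vdash^{\circ} \Delta, (\exists x \in b)F(x)$
($\forall^{b}$) If $\vdash^{\circ} \Delta, t \mathrel{\dot\notin} b, F(t)$ for all $t \in T_{\mathrm{stg}(b)}$ then $\vdash^{\circ} \Delta, (\forall x \in b)F(x)$
⁹ This simplification as well as the simplified calculus $\vdash^{\circ} \Delta$ are due to Buchholz [13].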
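The structural rule (Str) has been built into the calculus $\vdash^{\circ}$. The tautology rule (Taut) is proved by induction on the rank of the sentence $A$. The sentential rule (Sent) follows from the tautology rule $\vdash^{\circ} \Delta, {\sim}B, B$ and the hypothesis $\vdash^{\circ} \Delta, A$ by an inference ($\bigwedge$)$^{\circ}$ since $\mathrm{par}(\Delta, A) \cup \mathrm{par}(\Delta, {\sim}B, B) \subseteq \mathrm{par}(\Delta, {\sim}B, A \wedge B)$. The remaining rules are special cases of ($\bigvee$)$^{\circ}$ or ($\bigwedge$)$^{\circ}$. We will now prove a series of sentences in the simplified calculus.
$$\vdash^{\circ} a \notin a \text{ for every } \mathcal{L}_{RS}\text{-term } a. \tag{11.13}$$
We show (11.13) by induction on $\mathrm{rnk}(a)$. Let $\alpha := \mathrm{stg}(a)$. For $\alpha = 0$ we obtain the claim by an inference ($\bigwedge$)$^{\circ}$ because then $\mathrm{CS}(a \notin a) = \emptyset$. If $\alpha > 0$ we have $\mathrm{rnk}(b) < \mathrm{rnk}(a)$ for all $b \in T_\alpha$ and obtain
$$\vdash^{\circ} b \notin b \tag{i}$$
for all $b \in T_\alpha$ by induction hypothesis and from (i)
$$\vdash^{\circ} b \mathrel{\dot\notin} a,\; b \mathrel{\dot\in} a \wedge b \notin b \tag{ii}$$
by (Sent). By ($\exists^{a}$) it then follows $\vdash^{\circ} b \mathrel{\dot\notin} a, (\exists x \in a)[x \notin b]$, i.e.,
$$\vdash^{\circ} b \mathrel{\dot\notin} a,\; b \neq a \tag{iii}$$
for all $b \in T_\alpha$. By ($\notin$) this entails $\vdash^{\circ} a \notin a$.
$$\vdash^{\circ} a \subseteq a, \text{ hence also } \vdash^{\circ} a = a, \text{ for all } \mathcal{L}_{RS}\text{-terms } a. \tag{11.14}$$
We prove (11.14) by induction on $\alpha := \mathrm{rnk}(a)$. For $\alpha = 0$ the characteristic sequence of $(\forall x \in a)[x \in a]$ is empty. Therefore, we obtain $\vdash^{\circ} a \subseteq a$ by an inference ($\bigwedge$)$^{\circ}$. For $\alpha > 0$ we obtain $\vdash^{\circ} b \subseteq b$ and, by symmetry, therefore also
$$\vdash^{\circ} b = b \tag{i}$$
for all $b \in T_\alpha$ by induction hypothesis. From (i) we obtain
$$\vdash^{\circ} b \mathrel{\dot\notin} a,\; b \mathrel{\dot\in} a \wedge b = b \tag{ii}$$
by (Sent) and from (ii) by ($\in$)
$$\vdash^{\circ} b \mathrel{\dot\notin} a,\; b \in a \text{ for all } b \in T_\alpha. \tag{11.15}$$
By ($\forall^{a}$) we get $\vdash^{\circ} a \subseteq a$ from (11.15).
As a corollary to (11.14) we get
$$\vdash^{\circ} a \in L_\alpha \text{ for all } a \in T_\alpha \tag{11.16}$$
and thus
$$\alpha \neq 0 \;\Rightarrow\; \vdash^{\circ} L_\alpha \neq \emptyset. \tag{11.17}$$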
By (Taut) we also have
$$\vdash^{\circ} a \neq b,\; a = b. \tag{11.18}$$
An important observation is
$$\vdash^{\circ} \mathrm{Tran}(L_\alpha). \tag{11.19}$$
Let $a \in T_\alpha$. If $b \in T_{\mathrm{stg}(a)}$ then $\mathrm{stg}(b) < \alpha$ and by (11.15) we get
$$\vdash^{\circ} b \mathrel{\dot\notin} a,\; b \in L_\alpha \tag{i}$$
for all $b \in T_{\mathrm{stg}(a)}$. By ($\forall^{a}$) (i) implies $\vdash^{\circ} (\forall y \in a)[y \in L_\alpha]$, i.e.
$$\vdash^{\circ} a \subseteq L_\alpha \tag{11.20}$$
for all $a \in T_\alpha$. But from (11.20) we obtain again by ($\forall^{L_\alpha}$)
$$\vdash^{\circ} (\forall x \in L_\alpha)(\forall y \in x)[y \in L_\alpha]$$
which is the claim.
This is all we need for the collapsing theorem. We will return to the simplified calculus $\vdash^{\circ}$ in Sect. 11.12 where we prove the axioms of KPω and ($\Pi_2$–REF) in ramified set theory.
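11.11 The Collapsing Theorem for Ramified Set Theory
This section is in principle a remake of Sect. 9.4. We will, therefore, use the notions of this section and also often rely on the results obtained there. The main difference is that we will prove the Collapsing Lemma for ($\Pi_2$–REF) while the direct conversion of Sect. 9.4 would only give us $\Sigma$-reflection. Recall the scheme
($\Pi_2$–REF) $(\forall v)\bigl[F(v) \to (\exists a)[a \models F(v)]\bigr]$
for $\Pi_2$-formulas $F(v) \equiv (\forall x)(\exists y)G(x,y,v)$. We start with a simple technical remark.
11.11.1 Remark If $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F^{L_\Omega}$ for a $\Sigma_1$-sentence $F$ then $\mathcal{H} \vdash^{\alpha}_{\rho} \Delta, F$.
The proof is an obvious induction on $\alpha$ using $\mathcal{H}(\mathrm{par}(\Delta, F^{L_\Omega})) = \mathcal{H}(\mathrm{par}(\Delta, F))$.
11.11.2 Theorem (Collapsing Theorem) Let $\Gamma$ be a finite set of instances of the scheme ($\Pi_2$–REF), $\Delta$ a finite set of $\Sigma$-sentences with $\mathrm{par}(\Delta) \subseteq \Omega$. Let moreover $\mathcal{H}$ be a Cantorian closed transitive Skolem–hull operator such that $\mathrm{par}(\Gamma, \Delta) \subseteq \mathcal{H}(\emptyset)$. Then
$$\mathcal{H} \vdash^{\beta}_{\Omega} \neg\Gamma^{L_\Omega}, \Delta^{L_\Omega} \quad\text{entails}\quad \mathcal{H}_{\omega^{\omega^{\beta}}+1} \vdash^{\psi_{\mathcal{H}}(\omega^{\omega^{\beta}})}_{\Omega} \Delta^{L_\Omega}.$$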
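Proof We adapt the key properties in the proof of Lemma 9.4.5. First we obtain
$$\beta \in \mathcal{H}(\emptyset) \text{ and } \omega^{\omega^{\beta}} < \gamma \;\Rightarrow\; \psi_{\mathcal{H}}(\omega^{\omega^{\beta}}) < \psi_{\mathcal{H}}(\gamma) \tag{i}$$
which is again obvious by (9.20) (on p. 173) since we have $\omega^{\omega^{\beta}} \in \mathcal{H}(\emptyset) \cap \gamma \subseteq \mathcal{H}_{\gamma}(\emptyset) \cap \gamma$. From $\mathrm{par}(\Gamma, \Delta) \subseteq \mathcal{H}(\emptyset)$ we get
$$\mathcal{H}(\mathrm{par}(\Delta)) = \mathcal{H}(\emptyset) \tag{ii}$$
and finally we have
$$\beta \in \mathcal{H}(\emptyset) \;\Rightarrow\; \omega^{\omega^{\beta}} \in \mathcal{H}_{\omega^{\omega^{\beta}}+1}(\emptyset) \text{ and } \psi_{\mathcal{H}}(\omega^{\omega^{\beta}}) \in \mathcal{H}_{\omega^{\omega^{\beta}}+1}(\emptyset) \tag{iii}$$
by (9.19) and the closure properties of $\mathcal{H}_{\omega^{\omega^{\beta}}+1}(\emptyset)$.
Let us first assume that the last inference
$$\text{(J)}\qquad \mathcal{H} \vdash^{\beta_\iota}_{\Omega} \neg\Gamma^{L_\Omega}, \Delta_\iota^{L_\Omega} \text{ for } \iota \in I \;\Rightarrow\; \mathcal{H} \vdash^{\beta}_{\Omega} \neg\Gamma^{L_\Omega}, \Delta^{L_\Omega}$$
does not have a critical sentence which belongs to $\neg\Gamma^{L_\Omega}$. We show first:
$$\text{“All sentences in } \Delta_\iota \text{ are } \Sigma\text{-sentences”} \tag{iv}$$
and
$$\mathrm{par}(\Delta_\iota) \subseteq \mathcal{H}(\emptyset). \tag{v}$$
The only case in which there are sentences in $\Delta_\iota$ which are not subsentences of sentences in $\Delta$ is a cut with cut-formula $H$, say. If $H$ contains unbounded quantifiers then $\mathrm{rnk}(H^{L_\Omega}) \ge \Omega$. So (iv) follows because of cut rank $\Omega$. In the case of a finite index set $I$ we obtain (v) from $\mathrm{par}(\Delta_\iota^{L_\Omega}) \subseteq \mathcal{H}(\mathrm{par}(\Gamma^{L_\Omega}, \Delta^{L_\Omega})) \subseteq \mathcal{H}(\emptyset)$. The only possibility for an infinite index set $I$ is an inference according to ($\bigwedge$). Then we have the critical sentence $(\forall x \in a)G(x)$ in $\Delta^{L_\Omega}$ and for every $\iota \in I$ there is an $\mathcal{L}_{RS}$-term $t \in T_{\mathrm{stg}(a)}$ such that $\mathrm{par}(\Delta_\iota) \subseteq \mathrm{par}(\Delta) \cup \mathrm{par}(t)$. But then $\mathrm{par}(t) \subseteq \mathrm{stg}(a) \in \mathcal{H}(\emptyset) \cap \Omega$. Since $\mathcal{H}$ is transitive it follows $\mathrm{par}(t) \subseteq \mathcal{H}(\emptyset)$. Because of (iv) and (v) the induction hypothesis applies to the premise(s) of (J). So we have
$$\mathcal{H}_{\omega^{\omega^{\beta_\iota}}+1} \vdash^{\psi_{\mathcal{H}}(\omega^{\omega^{\beta_\iota}})}_{\Omega} \Delta_\iota^{L_\Omega} \tag{vi}$$
and obtain with the aid of (i) and (iii)
$$\mathcal{H}_{\omega^{\omega^{\beta}}+1} \vdash^{\psi_{\mathcal{H}}(\omega^{\omega^{\beta}})}_{\Omega} \Delta^{L_\Omega} \tag{vii}$$
from (vi) by an inference (J).
Now assume that the critical formula of (J) belongs to $\neg\Gamma^{L_\Omega}$. All formulas in $\neg\Gamma$ have the shape
$$H(t_1,\dots,t_k) :\Leftrightarrow (\exists v_{k+1})\dots(\exists v_n)\bigl[(\forall x)(\exists y)G(x,y,t_1,\dots,t_k,v_{k+1},\dots,v_n) \wedge (\forall a)[a = \emptyset \vee \neg\mathrm{Tran}(a) \vee t_1,\dots,t_n \notin a \vee v_{k+1},\dots,v_n \notin a \vee \neg(\forall x \in a)(\exists y \in a)G(x,y,\vec{t},\vec{v})]\bigr]$$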
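for a $\Delta_0$-formula $G(x,y,\vec{t},\vec{v})$. Therefore, we have to distinguish the cases that (J) is an inference according to ($\bigvee$) (the case that $k < n$ in the critical sentence) or an inference according to ($\bigwedge$) (the case that $k = n$ in the critical sentence). We start with the first case. Then we have the premise
$$\mathcal{H} \vdash^{\beta_0}_{\Omega} \neg\Gamma^{L_\Omega}, \neg H(t_1,\dots,t_{k+1})^{L_\Omega}, \Delta^{L_\Omega}. \tag{viii}$$
By the parameter condition for ($\bigvee$) inferences we obtain $\mathrm{par}(H(t_1,\dots,t_{k+1})) \subseteq \mathcal{H}(\mathrm{par}(\Gamma, \Delta)) \subseteq \mathcal{H}(\emptyset)$ and can therefore apply the induction hypothesis to (viii). So we have
$$\mathcal{H}_{\omega^{\omega^{\beta_0}}+1} \vdash^{\psi_{\mathcal{H}}(\omega^{\omega^{\beta_0}})}_{\Omega} \Delta^{L_\Omega} \tag{ix}$$
and obtain $\mathcal{H}_{\omega^{\omega^{\beta}}+1} \vdash^{\psi_{\mathcal{H}}(\omega^{\omega^{\beta}})}_{\Omega} \Delta^{L_\Omega}$ from (ix) by the structural rule.
The really interesting case is that of an inference according to ($\bigwedge$). There we have the premises
$$\mathcal{H} \vdash^{\beta_0}_{\Omega} \neg\Gamma^{L_\Omega}, (\forall x \in L_\Omega)(\exists y \in L_\Omega)G(x,y,\vec{t}), \Delta^{L_\Omega} \tag{x}$$
and
$$\mathcal{H} \vdash^{\beta_0}_{\Omega} \neg\Gamma^{L_\Omega}, (\forall a \in L_\Omega)[a = \emptyset \vee \neg\mathrm{Tran}(a) \vee \vec{t} \notin a \vee \neg(\forall x \in a)(\exists y \in a)G(x,y,\vec{t})], \Delta^{L_\Omega} \tag{xi}$$
for some $\beta_0 < \beta$. By ($\bigwedge$) inversion we obtain from (x)
$$\mathcal{H} \vdash^{\beta_0}_{\Omega} \neg\Gamma^{L_\Omega}, (\exists y \in L_\Omega)G(t,y,\vec{t}), \Delta^{L_\Omega} \tag{xii}$$
for every term $t \in T_\Omega$. Let $\eta_n := \omega^{\omega^{\beta_0}}\cdot(n+1)$, $\eta_\omega := \omega^{\omega^{\beta_0}}\cdot\omega$ and $\eta := \psi_{\mathcal{H}}(\eta_\omega)$. Then (xii) is true for all $t \in T_\eta$. We have $\mathcal{H}_{\eta_\omega}(\emptyset) = \bigcup_{n \in \omega} \mathcal{H}_{\eta_n}(\emptyset)$ because both sets are closed under $\mathcal{H}$ and $\psi_{\mathcal{H}} \restriction \eta_\omega$. Since $\eta = \mathcal{H}_{\eta_\omega}(\emptyset) \cap \Omega$ we obtain for every $t \in T_\eta$ an $n \in \omega$ such that $\mathrm{par}(t) \subseteq \mathcal{H}_{\eta_n}(\emptyset) \cap \Omega = \psi_{\mathcal{H}}(\eta_n)$. From (xii) we get
$$\mathcal{H}_{\eta_n} \vdash^{\beta_0}_{\Omega} \neg\Gamma^{L_\Omega}, (\exists y \in L_\Omega)G(t,y,\vec{t}), \Delta^{L_\Omega} \tag{xiii}$$
by the structural rule and can now apply the induction hypothesis to (xiii) to obtain
$$\mathcal{H}_{\eta_{n+1}+1} \vdash^{\psi_{\mathcal{H}}(\eta_{n+1})}_{\Omega} (\exists y \in L_\Omega)G(t,y,\vec{t}), \Delta^{L_\Omega}. \tag{xiv}$$
By Remark 11.11.1 and the Boundedness Lemma (Lemma 11.10.4) we get from (xiv)
$$\mathcal{H}_{\eta_{n+1}+1} \vdash^{\psi_{\mathcal{H}}(\eta_{n+1})}_{\Omega} (\exists y \in L_\eta)G(t,y,\vec{t}), \Delta^{L_\Omega}.$$
This shows that for every $t \in T_\eta$ there is an $n \in \omega$ such that
$$\mathcal{H}_{\eta_\omega+1} \vdash^{\psi_{\mathcal{H}}(\eta_n)}_{\Omega} (\exists y \in L_\eta)G(t,y,\vec{t}), \Delta^{L_\Omega}$$
and this entails
$$\mathcal{H}_{\eta_\omega+1} \vdash^{\psi_{\mathcal{H}}(\eta_\omega)}_{\Omega} (\forall x \in L_\eta)(\exists y \in L_\eta)G(x,y,\vec{t}), \Delta^{L_\Omega} \tag{xv}$$
by an inference ($\bigwedge$). From (xi) we get by ($\bigwedge$)-inversion and $\vee$-exportation
$$\mathcal{H}_{\eta_\omega+1} \vdash^{\beta_0}_{\Omega} \neg\Gamma^{L_\Omega}, L_\eta = \emptyset, \neg\mathrm{Tran}(L_\eta), t_1 \notin L_\eta, \dots, t_n \notin L_\eta, \neg(\forall x \in L_\eta)(\exists y \in L_\eta)G(x,y,\vec{t}), \Delta^{L_\Omega}. \tag{xvi}$$
Since $\mathrm{par}(L_\eta) = \{\eta\} \subseteq \mathcal{H}_{\eta_\omega+1}(\emptyset)$ we can apply the induction hypothesis to (xvi) to obtain
$$\mathcal{H}_{\eta_\omega+\omega^{\omega^{\beta_0}}+1} \vdash^{\psi_{\mathcal{H}}(\eta_\omega+\omega^{\omega^{\beta_0}})}_{\Omega} L_\eta = \emptyset, \neg\mathrm{Tran}(L_\eta), t_1 \notin L_\eta, \dots, t_n \notin L_\eta, \neg(\forall x \in L_\eta)(\exists y \in L_\eta)G(x,y,\vec{t}), \Delta^{L_\Omega}. \tag{xvii}$$
Since $\mathrm{rnk}(\mathrm{Tran}(L_\eta)) = \eta + 6$ and $\mathrm{rnk}(L_\eta \neq \emptyset) = \eta + 3$ we obtain by (11.19) and (11.16) in p. 282 and Lemma 11.10.6
$$\mathcal{H} \vdash^{\eta\cdot\omega^{3}}_{0} L_\eta \neq \emptyset \tag{xviii}$$
and
$$\mathcal{H} \vdash^{\eta\cdot\omega^{6}}_{0} \mathrm{Tran}(L_\eta). \tag{xix}$$
Since $\mathrm{stg}(t_i) \in \mathcal{H}(\emptyset) \cap \Omega \subseteq \mathcal{H}_{\eta_\omega}(\emptyset) \cap \Omega = \eta$ we obtain $t_i \in T_\eta$. By (11.16) we therefore have $\vdash^{\circ} t_i \in L_\eta$ and thus
$$\mathcal{H} \vdash^{\eta\cdot\omega}_{0} t_i \in L_\eta \tag{xx}$$
for $i = 1,\dots,n$ by Lemma 11.10.6. Cutting (xvii), (xviii), (xix), (xx), and (xv) gives
$$\mathcal{H}_{\omega^{\omega^{\beta}}+1} \vdash^{\psi_{\mathcal{H}}(\omega^{\omega^{\beta}})}_{\Omega} \Delta^{L_\Omega}. \tag{xxi}$$
To ensure that these cuts are correct it suffices to check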
To ensure that these cuts are correct it suffices to check β0
{η ·ω k , ηω + ω ω + 1} ⊆ H
ωω
β
(0) / ∩ ωω
β
(xxii) β
because from (xxii) we obtain {η ·ω k , ψH (ηω + ω ω 0 )} ⊆ H
ψH
β (ω ω )
as well as H
ηω +ω ω
β0
+1
⊆H
β
ω ω +1
But (xxii) is obvious because β0 ∈ H (0) / ⊆H β ω ω 0 +1
H
ωω
β
β + ωω 0
∈H
ωω
β
β (0) / ∩ ωω .
β
β
(0) / ∩Ω =
ωω
β
(0) / =H
β
ω ω +1
(∆ LΩ ).
(0) / and thus also ηω + ω ω
β0
=
This in turn also implies η = ψH (ηω ) ∈
(0) / and η ·ω k = ω η +k < ω ψH (ω
nally also obtain ψH (ω ω ) ∈ H
β
which also imply that the para-
β
ω ω +1
meters of all cut formulas are controlled by H
ωω
ω ω +1
ωβ )
(0). /
β
= ψH (ω ω ). Since β ∈ H (0) / we fi
11.12 Ordinal Analysis for Kripke–Platek Set Theory

Recent proof-theoretical research has shown that, via the semiformal system for ramified set theory, the constructible hierarchy is the best suited instrument for ordinal analyses of stronger impredicative systems. In our first step into impredicativity, this does not yet become visible. The ordinal analysis of ID1 is still simpler than that of KPω or (Π2–REF). One of the reasons is the extensionality of sets, which makes the proof of the identity axioms cumbersome in ramified set theory. Their proof will therefore be the first step in the ordinal analysis of KPω. Let $A((v_1, \dots, v_n))$ denote that each of the free variables $v_1, \dots, v_n$ occurs at most once in the formula $A$.

11.12.1 Lemma (Identity Lemma) Let $A((v_1, \dots, v_n))$ be a $\Delta_0$-formula of $\mathcal{L}(\in)$. Then
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), \dots, \neg(s_n \subseteq t_n), \neg(t_n \subseteq s_n), \neg A(s_1, \dots, s_n), A(t_1, \dots, t_n). \]
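Proof We prove the lemma by induction on $\mathrm{rnk}(A(s_1, \dots, s_n)) = \mathrm{rnk}(A(t_1, \dots, t_n))$.

First let $A((v_1, \dots, v_n)) \equiv v_1 \in v_2$. If $\mathrm{stg}(s_2) = 0$ we obtain $\vdash^{\circ} s_1 \notin s_2$ by an inference $(\notin)$ and thus the claim by (Str). So assume $\mathrm{stg}(s_2) > 0$. If $\mathrm{stg}(t_2) = 0$ we obtain $\vdash^{\circ} b \notin t_2$ and by (11.15) on p. 282 also $\vdash^{\circ} b \notin s_2, b \neq s_1, b \in s_2$ for all $b \in T_{\mathrm{stg}(s_2)}$. Hence $\vdash^{\circ} b \notin s_2, b \neq s_1, b \in s_2 \wedge b \notin t_2$ for all $b \in T_{\mathrm{stg}(s_2)}$. By an inference $(\bigvee)^{\circ}$ and $(\notin)$ we then get $\vdash^{\circ} \neg(s_2 \subseteq t_2), s_1 \notin s_2$, which again implies the claim by (Str). Therefore assume $\mathrm{stg}(s_2) \neq 0$ as well as $\mathrm{stg}(t_2) \neq 0$. For $s \in T_{\mathrm{stg}(s_2)}$ and $t \in T_{\mathrm{stg}(t_2)}$ we obtain
\[ \mathrm{rnk}(s = s_1) = \max\{9, \mathrm{rnk}(s) + 4, \mathrm{rnk}(s_1) + 4\} < \max\{\mathrm{rnk}(s_1) + 6, \mathrm{rnk}(s_2)\} = \mathrm{rnk}(s_1 \in s_2) \]
and analogously $\mathrm{rnk}(t = t_1) < \mathrm{rnk}(t_1 \in t_2)$. So we obtain by the induction hypothesis
\[ \vdash^{\circ} \neg(s \subseteq t), \neg(t \subseteq s), \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), s \neq s_1, t = t_1. \tag{i} \]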
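The rank inequality used just before (i) can be checked term by term. The following is only a sketch under an assumption that is not restated in this section: that every term $a$ satisfies $\omega\cdot\mathrm{stg}(a) \le \mathrm{rnk}(a) < \omega\cdot(\mathrm{stg}(a)+1)$, as the rank clauses of Sect. 11.9 suggest.

```latex
% Assumption (not from this section): for every term a,
%   \omega\cdot stg(a) \le rnk(a) < \omega\cdot(stg(a)+1).
% Hypotheses of the case: stg(s_2) \neq 0 and s \in T_{stg(s_2)}, i.e. stg(s) < stg(s_2).
\begin{align*}
9 &< \omega \le \omega\cdot\mathrm{stg}(s_2) \le \mathrm{rnk}(s_2),\\
\mathrm{rnk}(s)+4 &< \omega\cdot(\mathrm{stg}(s)+1) \le \omega\cdot\mathrm{stg}(s_2) \le \mathrm{rnk}(s_2),\\
\mathrm{rnk}(s_1)+4 &< \mathrm{rnk}(s_1)+6 .
\end{align*}
% Each of the three arguments of the left-hand maximum is below one of the
% two arguments of the right-hand maximum, hence
\[
\max\{9,\ \mathrm{rnk}(s)+4,\ \mathrm{rnk}(s_1)+4\}
  < \max\{\mathrm{rnk}(s_1)+6,\ \mathrm{rnk}(s_2)\}.
\]
```

The first line is where the case assumption $\mathrm{stg}(s_2) \neq 0$ enters: without it the constant 9 could not be absorbed by $\mathrm{rnk}(s_2)$.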
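From (i) we obtain by (Sent)
\[ \vdash^{\circ} \neg(s \subseteq t), \neg(t \subseteq s), \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), t \notin t_2, s \neq s_1, t \in t_2 \wedge t = t_1. \tag{ii} \]
By $(\in)$ we obtain from (ii)
\[ \vdash^{\circ} \neg(s \subseteq t), \neg(t \subseteq s), \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), t \notin t_2, s \neq s_1, t_1 \in t_2 \tag{iii} \]
and from (iii)
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), s \neq t, t \notin t_2, s \neq s_1, t_1 \in t_2 \tag{iv} \]
for all $t \in T_{\mathrm{stg}(t_2)}$ by $(\bigvee)^{\circ}$. From (iv) we obtain
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), s \notin t_2, s \neq s_1, t_1 \in t_2 \tag{v} \]
by an inference $(\notin)$. By (v) we obtain by (Sent)
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), s \in s_2 \wedge s \notin t_2, s \notin s_2, s \neq s_1, t_1 \in t_2 \tag{vi} \]
which by $(\exists s_2)$ implies
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), (\exists x \in s_2)[x \notin t_2], s \notin s_2, s \neq s_1, t_1 \in t_2 \tag{vii} \]
for all $s \in T_{\mathrm{stg}(s_2)}$. From (vii), however, we get finally
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), \neg(s_2 \subseteq t_2), \neg(t_2 \subseteq s_2), s_1 \notin s_2, t_1 \in t_2 \]
by an inference $(\notin)$ and (Str).

Next let $A((v_1, \dots, v_n))$ be a conjunction $A_0((v_1, \dots, v_n)) \wedge A_1((v_1, \dots, v_n))$. By the induction hypothesis we obtain
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), \dots, \neg(s_n \subseteq t_n), \neg(t_n \subseteq s_n), \neg A_i(s_1, \dots, s_n), A_i(t_1, \dots, t_n) \]
for $i = 0, 1$. By $(\bigvee)^{\circ}$ we then obtain
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), \dots, \neg(s_n \subseteq t_n), \neg(t_n \subseteq s_n), \neg A_0(\vec{s}\,) \vee \neg A_1(\vec{s}\,), A_i(\vec{t}\,) \]
for $i = 0, 1$ and finally by an inference $(\bigwedge)^{\circ}$
\[ \vdash^{\circ} \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), \dots, \neg(s_n \subseteq t_n), \neg(t_n \subseteq s_n), \neg A(\vec{s}\,), A(\vec{t}\,). \]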
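Now assume $A((v_1, \dots, v_n)) \equiv (\exists x \in v_1)B((v_2, \dots, v_n))$. Let $\vec{s} := s_2, \dots, s_n$ and $\vec{t} := t_2, \dots, t_n$. For $s \in T_{\mathrm{stg}(s_1)}$ and $t \in T_{\mathrm{stg}(t_1)}$ we obtain
\[ \vdash^{\circ} \neg(s_2 \subseteq t_2), \neg(t_2 \subseteq s_2), \dots, \neg(s_n \subseteq t_n), \neg(t_n \subseteq s_n), \neg(s \subseteq t), \neg(t \subseteq s), \neg B(s, \vec{s}\,), B(t, \vec{t}\,) \tag{viii} \]
by the induction hypothesis. Putting
\[ \Delta := \neg(s_2 \subseteq t_2), \neg(t_2 \subseteq s_2), \dots, \neg(s_n \subseteq t_n), \neg(t_n \subseteq s_n) \]
we obtain from (viii)
\[ \vdash^{\circ} \Delta, \neg(s \subseteq t), \neg(t \subseteq s), t \notin t_1, \neg B(s, \vec{s}\,), t \in t_1 \wedge B(t, \vec{t}\,) \tag{ix} \]
by (Sent) and from (ix)
\[ \vdash^{\circ} \Delta, s \neq t, t \notin t_1, \neg B(s, \vec{s}\,), (\exists x \in t_1)B(x, \vec{t}\,) \tag{x} \]
for all $t \in T_{\mathrm{stg}(t_1)}$ by an inference $(\exists t_1)$ and an inference $(\bigvee)^{\circ}$. From (x) we obtain
\[ \vdash^{\circ} \Delta, s \notin t_1, \neg B(s, \vec{s}\,), (\exists x \in t_1)B(x, \vec{t}\,) \tag{xi} \]
by $(\notin)$. By an inference (Sent) we infer
\[ \vdash^{\circ} \Delta, s \notin t_1 \wedge s \in s_1, s \notin s_1, \neg B(s, \vec{s}\,), (\exists x \in t_1)B(x, \vec{t}\,) \]
from (xi) and by $(\exists s_1)$
\[ \vdash^{\circ} \Delta, \neg(s_1 \subseteq t_1), s \notin s_1, \neg B(s, \vec{s}\,), (\exists x \in t_1)B(x, \vec{t}\,) \tag{xii} \]
for all $s \in T_{\mathrm{stg}(s_1)}$. From (xii) we get finally
\[ \vdash^{\circ} \Delta, \neg(s_1 \subseteq t_1), \neg(t_1 \subseteq s_1), \neg((\exists x \in s_1)B(x, \vec{s}\,)), (\exists x \in t_1)B(x, \vec{t}\,) \]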
by $(\forall s_1)$ and (Str). This, however, is the claim. The cases that $A((v_1, \dots, v_n))$ is a formula $A_0((v_1, \dots, v_n)) \vee A_1((v_1, \dots, v_n))$ or a formula $(\forall x \in v_1)B((v_2, \dots, v_n))$ are symmetrical to the already treated cases. Therefore the lemma is proved.

The following identity theorem is now a corollary to Lemma 11.12.1.

11.12.2 Theorem Let $A(v_1, \dots, v_n)$ be an $\mathcal{L}(\in)$-formula. Then
\[ \vdash^{\circ} s_1 \neq t_1, \dots, s_n \neq t_n, \neg A(s_1, \dots, s_n)^{L_\Omega}, A(t_1, \dots, t_n)^{L_\Omega} \]
holds for all $\mathcal{L}_{RS}$-terms $s_1, \dots, s_n$ and $t_1, \dots, t_n$.

Proof Let $B((u_1, \dots, u_k))$ be a formula such that
\[ A(v_1, \dots, v_n) = B_{u_1, \dots, u_k}(v_1, \dots, v_1, v_2, \dots, v_2, \dots, v_n, \dots, v_n). \]
Then apply the identity lemma to obtain
\[ \vdash^{\circ} \neg(\tilde{s}_1 \subseteq \tilde{t}_1), \neg(\tilde{t}_1 \subseteq \tilde{s}_1), \dots, \neg(\tilde{s}_k \subseteq \tilde{t}_k), \neg(\tilde{t}_k \subseteq \tilde{s}_k), \neg B(\tilde{s}_1, \dots, \tilde{s}_k)^{L_\Omega}, B(\tilde{t}_1, \dots, \tilde{t}_k)^{L_\Omega} \tag{i} \]
where $\tilde{s}_1, \dots, \tilde{s}_k = s_1, \dots, s_1, \dots, s_n, \dots, s_n$ and analogously $\tilde{t}_1, \dots, \tilde{t}_k = t_1, \dots, t_1, \dots, t_n, \dots, t_n$. From (i) we infer the claim by successively applying $(\bigvee)^{\circ}$-inferences.

Next we show how logically valid formulas are translated into ramified set theory. Since we treat bounded quantifiers as basic symbols in the language of KPω, we have to extend the Tait calculus introduced in Definition 4.3.2 by rules for these basic symbols. We define

$(\exists)_b$ If $\vdash^{m_0}_{T} \Delta, v \in u \wedge A(v)$, then $\vdash^{m}_{T} \Delta, (\exists x \in u)A_v(x)$ for all $m > m_0$.

$(\forall)_b$ If $\vdash^{m_0}_{T} \Delta, v \notin u, A(v)$ and the variable $v$ is not free in any of the formulas in $\Delta, (\forall x \in u)A_v(x)$, then $\vdash^{m}_{T} \Delta, (\forall x \in u)A_v(x)$ for all $m > m_0$.
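11.12.3 Theorem Let $\Delta(x_1, \dots, x_k)$ be a finite set of $\mathcal{L}(\in)$-formulas which contain at most the variables $x_1, \dots, x_k$ free. If $\vdash^{m}_{T} \Delta(x_1, \dots, x_k)$ we obtain
\[ (\forall \alpha \le \Omega)(\exists n \in \omega)(\exists r \in \omega)(\forall a_1 \in T_\alpha) \dots (\forall a_k \in T_\alpha)\ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha} \]
for every Cantorian closed Skolem–hull operator $\mathcal{H}$.

Proof We prove the theorem by induction on $m$. In case of an axiom according to $(\mathrm{Ax})_L$ we get $\vdash^{\circ} \Delta(a_1, \dots, a_k)^{L_\alpha}$ by (Taut). Since $\mathrm{drnk}(\Delta(a_1, \dots, a_k)^{L_\alpha}) \le \omega^{\omega\cdot\alpha+n}$ for some $n < \omega$ we obtain the claim.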
The cases of inferences according to (∧) or (∨) follow immediately from the induction hypotheses. In the case of an inference according to (∃) or $(\exists)_b$ we have the premise
\[ \vdash^{m_0}_{T} \Delta(x_1, \dots, x_k), A(u, x_1, \dots, x_k) \tag{i} \]
or
\[ \vdash^{m_0}_{T} \Delta(x_1, \dots, x_k), u \in x_i \wedge A(u, x_1, \dots, x_k) \tag{ii} \]
and distinguish the following cases.

1. The variable $u$ is different from all variables $x_1, \dots, x_k$. Then we obtain
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, A_u(L_0, a_1, \dots, a_k)^{L_\alpha} \tag{iii} \]
by the induction hypothesis and obtain by an inference according to $(\bigvee)$
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0+1}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, (\exists x \in L_\alpha)A_u(x, a_1, \dots, a_k)^{L_\alpha} \]
in case of (∃) or
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0+1}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, (\exists x \in a_i)A_u(x, a_1, \dots, a_k)^{L_\alpha} \]
in case of $(\exists)_b$. Since the parameters of $L_0$ are trivial we don't have to bother about them.

2. The variable $u$ occurs in the list $x_1, \dots, x_k$. In case of (∃) we get
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, A_u(a_i, a_1, \dots, a_k)^{L_\alpha} \tag{iv} \]
by the induction hypothesis. Since $a_i \in T_\alpha$ and $a_i$ occurs in the conclusion we obtain
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0+1}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k), (\exists x \in L_\alpha)A_u(x, a_1, \dots, a_k) \]
by an inference $(\bigvee)$. In case of $(\exists)_b$ we obtain
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, a_i \in a_j \wedge A_u(a_i, a_1, \dots, a_k)^{L_\alpha} \tag{v} \]
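by the induction hypothesis. We have
\[ \vdash^{\circ} \Delta(a_1, \dots, a_k)^{L_\alpha}, t \notin a_j, t \neq a_i, \neg A_u(a_i, a_1, \dots, a_k)^{L_\alpha}, t \in a_j \wedge A_u(t, a_1, \dots, a_k)^{L_\alpha} \tag{vi} \]
for all $t \in T_{\mathrm{stg}(a_j)}$ by the Identity Theorem (Theorem 11.12.2). This implies
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_1}}_{0} \Delta(a_1, \dots, a_k)^{L_\alpha}, t \notin a_j, t \neq a_i, \neg A_u(a_i, a_1, \dots, a_k)^{L_\alpha}, t \in a_j \wedge A_u(t, a_1, \dots, a_k)^{L_\alpha} \tag{vii} \]
for some $n_1 < \omega$ by Lemma 11.9.15 and Lemma 11.10.6. From (v) we get
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, a_i \in a_j \tag{viii} \]
and
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, A_u(a_i, a_1, \dots, a_k)^{L_\alpha} \tag{ix} \]
by inversion. Cutting (vii) and (ix) yields natural numbers $n_2$ and $r_1$ such that
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_2}}_{\omega\cdot\alpha+r_1} \Delta(a_1, \dots, a_k)^{L_\alpha}, t \notin a_j, t \neq a_i, t \in a_j \wedge A_u(t, a_1, \dots, a_k)^{L_\alpha} \tag{x} \]
for all $t \in T_{\mathrm{stg}(a_j)}$. By an inference $(\bigvee)$ we obtain
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_2+1}}_{\omega\cdot\alpha+r_1} \Delta(a_1, \dots, a_k)^{L_\alpha}, t \notin a_j, t \neq a_i, (\exists x \in a_j)A_u(x, a_1, \dots, a_k)^{L_\alpha} \tag{xi} \]
for all $t \in T_{\mathrm{stg}(a_j)}$ and by inferences $(\bigvee)$ and $(\bigwedge)$
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_3}}_{\omega\cdot\alpha+r_1} \Delta(a_1, \dots, a_k)^{L_\alpha}, a_i \notin a_j, (\exists x \in a_j)A_u(x, a_1, \dots, a_k)^{L_\alpha}. \tag{xii} \]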
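Cutting (viii) and (xii) yields the claim.

In the case of an inference according to (∀) or $(\forall)_b$ we have the premises
\[ \vdash^{m_0}_{T} \Delta(x_1, \dots, x_k), A(u, x_1, \dots, x_k) \tag{xiii} \]
or
\[ \vdash^{m_0}_{T} \Delta(x_1, \dots, x_k), u \notin x_j, A(u, x_1, \dots, x_k), \tag{xiv} \]
respectively, such that $u$ is not among the variables $x_1, \dots, x_k$. From (xiii) and (xiv) we obtain by the induction hypothesis
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, A_u(t, a_1, \dots, a_k)^{L_\alpha} \tag{xv} \]
for all $t \in T_\alpha$ or
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_0}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k), t \notin a_j, A_u(t, a_1, \dots, a_k) \tag{xvi} \]
for all $t \in T_{\mathrm{stg}(a_j)}$, respectively. From (xv) we obtain by an inference $(\bigwedge)$ directly
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n}}_{\omega\cdot\alpha+r} \Delta(a_1, \dots, a_k)^{L_\alpha}, (\forall x \in L_\alpha)A_u(x, a_1, \dots, a_k)^{L_\alpha}. \]
By (11.15) (on p. 282) we have
\[ \vdash^{\circ} t \notin a_j, t \in a_j \tag{xvii} \]
for all $t \in T_{\mathrm{stg}(a_j)}$. Cutting (xvii) and (xvi) yields $n_1 < \omega$ and $r_0 < \omega$ such that
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_1}}_{\omega\cdot\alpha+r_0} \Delta(a_1, \dots, a_k), t \notin a_j \vee A_u(t, a_1, \dots, a_k) \tag{xviii} \]
for all $t \in T_{\mathrm{stg}(a_j)}$. By an inference $(\bigwedge)$ we finally get from (xviii)
\[ \mathcal{H} \vdash^{\omega^{\omega\cdot\alpha+n_2}}_{\omega\cdot\alpha+r_0} \Delta(a_1, \dots, a_k), (\forall x \in a_j)A_u(x, a_1, \dots, a_k). \]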
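Now we derive the axioms of (Π2–REF). We begin with the equality axioms IDEN. From (11.14) (on p. 282) we obtain
\[ \vdash^{\circ} (\forall x \in L_\alpha)[x = x] \tag{11.21} \]
for all $\alpha$ and from (11.18) also
\[ \vdash^{\circ} (\forall x \in L_\alpha)(\forall y \in L_\alpha)[x = y \to y = x] \tag{11.22} \]
and from the Identity Theorem (Theorem 11.12.2) also
\[ \vdash^{\circ} (\forall x \in L_\alpha)(\forall y \in L_\alpha)(\forall z \in L_\alpha)[x = y \wedge y = z \to x = z] \tag{11.23} \]
and
\[ \vdash^{\circ} (\forall x \in L_\alpha)(\forall y \in L_\alpha)(\forall a \in L_\alpha)(\forall b \in L_\alpha)[x = y \wedge a = b \to x \in a \to y \in b]. \tag{11.24} \]
These are all the identity axioms. Since $a = b$ is an abbreviation for the formula $(\forall x \in a)[x \in b] \wedge (\forall x \in b)[x \in a]$ we obtain
\[ \vdash^{\circ} (\mathrm{Ext}')^{L_\alpha} \tag{11.25} \]
from (Taut).

11.12.4 Lemma Let $\alpha$ be a limit ordinal. Then $\vdash^{\circ} (\mathrm{Pair}')^{L_\alpha}$.

Proof Let $a, b \in T_\alpha$ and put $\beta := \max\{\mathrm{stg}(a), \mathrm{stg}(b)\} + 1$. Then
\[ \vdash^{\circ} a \in L_\beta \wedge b \in L_\beta \tag{i} \]
by (11.16). Since $\alpha \in \mathrm{Lim}$ we have $\beta < \alpha$ and obtain
\[ \vdash^{\circ} (\exists z \in L_\alpha)[a \in z \wedge b \in z] \tag{ii} \]
for all $a, b \in T_\alpha$ by an inference $(\bigvee)^{\circ}$. From (ii), however, we obtain
\[ \vdash^{\circ} (\forall x \in L_\alpha)(\forall y \in L_\alpha)(\exists z \in L_\alpha)[x \in z \wedge y \in z]. \]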
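To see how the witness in the proof of Lemma 11.12.4 is chosen, consider one concrete instance. The choice $\alpha = \omega$, $a = L_2$, $b = L_5$ is purely illustrative and not taken from the text, and the sketch assumes $\mathrm{stg}(L_n) = n$.

```latex
% Illustrative instance (assumed): \alpha = \omega, a = L_2, b = L_5,
% with stg(L_n) = n.
\begin{align*}
\beta &= \max\{\mathrm{stg}(L_2), \mathrm{stg}(L_5)\} + 1 = \max\{2, 5\} + 1 = 6 < \omega,\\
&\vdash^{\circ} L_2 \in L_6 \wedge L_5 \in L_6
   && \text{by (11.16), as in step (i)},\\
&\vdash^{\circ} (\exists z \in L_\omega)[L_2 \in z \wedge L_5 \in z]
   && \text{with witness } L_6 \in T_\omega \text{, as in step (ii)}.
\end{align*}
```

The limit assumption on $\alpha$ is what guarantees $\beta < \alpha$: for a successor $\alpha = \gamma + 1$ and $\mathrm{stg}(a) = \gamma$ the candidate witness $L_{\gamma+1}$ would no longer be a term in $T_\alpha$.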
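11.12.5 Lemma Let $\alpha$ be a limit ordinal. Then $\vdash^{\circ} (\mathrm{Union}')^{L_\alpha}$.

Proof Let $a \in T_\alpha$ and $\beta = \mathrm{stg}(a)$. For $t \in T_\beta$ we obtain by (11.16) and (Str)
\[ \vdash^{\circ} s \notin t, s \in L_\beta \tag{i} \]
for all $s \in T_{\mathrm{stg}(t)}$ and thus
\[ \vdash^{\circ} (\forall y \in t)[y \in L_\beta] \tag{ii} \]
by an inference $(\bigwedge)^{\circ}$. From (ii) we obtain by (Str)
\[ \vdash^{\circ} t \notin a, (\forall y \in t)[y \in L_\beta] \tag{iii} \]
and from (iii) by $(\forall a)$
\[ \vdash^{\circ} (\forall x \in a)(\forall y \in x)[y \in L_\beta]. \tag{iv} \]
Hence
\[ \vdash^{\circ} (\exists z \in L_\alpha)(\forall x \in a)(\forall y \in x)[y \in z] \tag{v} \]
by $(\exists)_{L_\alpha}$ with $L_\beta$ as witness. Since this holds for all $a \in T_\alpha$ we finally obtain
\[ \vdash^{\circ} (\forall u \in L_\alpha)(\exists z \in L_\alpha)(\forall x \in u)(\forall y \in x)[y \in z] \]
by an inference $(\forall L_\alpha)$.

11.12.6 Lemma Let $\alpha$ be a limit ordinal. Then $\vdash^{\circ} (\Delta_0\text{–Sep})^{L_\alpha}$.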
Proof Let $F(x, x_1, \dots, x_n)$ be a $\Delta_0$-formula. Choose $a, a_1, \dots, a_n \in T_\alpha$ and define $\beta := \max\{\mathrm{stg}(a), \mathrm{stg}(a_1), \dots, \mathrm{stg}(a_n)\} + 1$. Put $b := \{x \in L_\beta \mid x \in a \wedge F(x, a_1, \dots, a_n)\}$. From (Taut) we get
\[ \vdash^{\circ} t \notin b, t \in a \wedge F(t, a_1, \dots, a_n) \tag{i} \]
for all $t \in T_\beta$. Hence
\[ \vdash^{\circ} (\forall x \in b)[x \in a \wedge F(x, a_1, \dots, a_n)] \tag{ii} \]
by $(\forall b)$. Conversely we have
\[ \vdash^{\circ} t \notin a, \neg F(t, a_1, \dots, a_n), t \in a \wedge F(t, a_1, \dots, a_n) \wedge t = t \]
for all $t \in T_{\mathrm{stg}(a)}$ by (11.15), (11.14) and (Sent). Using the definition of $t \in b$ this means
\[ \vdash^{\circ} t \notin a, \neg F(t, a_1, \dots, a_n), t \in b \wedge t = t. \tag{iii} \]
Since $\mathrm{stg}(t) < \mathrm{stg}(b)$ we obtain
\[ \vdash^{\circ} t \notin a, \neg F(t, a_1, \dots, a_n) \vee t \in b \tag{iv} \]
by $(\in)$ and $(\bigvee)^{\circ}$. Hence
\[ \vdash^{\circ} (\forall x \in a)[F(x, a_1, \dots, a_n) \to x \in b]. \tag{v} \]
By (ii) and (v) we therefore have
\[ \vdash^{\circ} (\forall x \in b)[x \in a \wedge F(x, a_1, \dots, a_n)] \wedge (\forall x \in a)[F(x, a_1, \dots, a_n) \to x \in b]. \tag{vi} \]
Since $b \in T_\alpha$ this yields
\[ \vdash^{\circ} (\exists z \in L_\alpha)\big[(\forall x \in z)[x \in a \wedge F(x, a_1, \dots, a_n)] \wedge (\forall x \in a)[F(x, a_1, \dots, a_n) \to x \in z]\big] \tag{vii} \]
by an inference $(\bigvee)^{\circ}$. By iterated applications of $(\forall L_\alpha)$ we finally obtain the provability of
\[ (\forall x_1 \in L_\alpha) \dots (\forall x_n \in L_\alpha)(\forall x \in L_\alpha)(\exists z \in L_\alpha)\big[z = \{u \in x \mid F(u, x_1, \dots, x_n)\}\big]. \]

To deal also with the foundation scheme we prove a foundation lemma.
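11.12.7 Lemma (Foundation Lemma) Let $\mathcal{H}$ be a Cantorian closed Skolem–hull operator and $F(x, x_1, \dots, x_n)$ an arbitrary $\mathcal{L}(\in)$-formula. Then
\[ \mathcal{H} \vdash^{2\cdot\rho+5\cdot(\mathrm{stg}(a)+1)}_{0} \neg F(a, a_1, \dots, a_n)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x, a_1, \dots, a_n)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y, a_1, \dots, a_n)^{L_\alpha}]\big] \]
for all $\alpha$ and $\mathcal{L}_{RS}$-terms $a, a_1, \dots, a_n \in T_\alpha$ with $\rho := \mathrm{rnk}(F(a, a_1, \dots, a_n)^{L_\alpha})$.

Proof Choose $a, a_1, \dots, a_n \in T_\alpha$. To improve readability we suppress mentioning the parameters $a_1, \dots, a_n$. We prove the lemma by induction on $\mathrm{stg}(a)$ and put $\rho_a := 2\cdot\mathrm{rnk}(F(a)^{L_\alpha})$. For $b \in T_{\mathrm{stg}(a)}$ we get
\[ \mathcal{H} \vdash^{\rho_b+5\cdot(\mathrm{stg}(b)+1)}_{0} \neg F(b)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y)^{L_\alpha}]\big] \tag{i} \]
by the induction hypothesis. From (i) we obtain structurally
\[ \mathcal{H} \vdash^{\rho_b+5\cdot(\mathrm{stg}(b)+1)}_{0} b \notin a, \neg F(b)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y)^{L_\alpha}]\big] \tag{ii} \]
which by two inferences $(\bigvee)$ and an inference $(\bigwedge)$ implies
\[ \mathcal{H} \vdash^{\rho_b+5\cdot\mathrm{stg}(a)+3}_{0} (\forall y \in a)\neg F(y)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y)^{L_\alpha}]\big]. \tag{iii} \]
By tautology we obtain
\[ \mathcal{H} \vdash^{\rho_a}_{0} \neg F(a)^{L_\alpha}, F(a)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y)^{L_\alpha}]\big]. \tag{iv} \]
Since $\rho_b \le \rho_a$ we obtain from (iii) and (iv) by $(\bigwedge)$
\[ \mathcal{H} \vdash^{\rho_a+5\cdot\mathrm{stg}(a)+4}_{0} \neg F(a)^{L_\alpha}, F(a)^{L_\alpha} \wedge (\forall y \in a)\neg F(y)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y)^{L_\alpha}]\big]. \tag{v} \]
Since $a \in T_\alpha$ we obtain from (v)
\[ \mathcal{H} \vdash^{\rho_a+5\cdot\mathrm{stg}(a)+5}_{0} \neg F(a)^{L_\alpha}, (\exists x \in L_\alpha)\big[F(x)^{L_\alpha} \wedge (\forall y \in x)[\neg F(y)^{L_\alpha}]\big] \tag{vi} \]
by an inference $(\bigvee)$. The parameter conditions are satisfied because $a$ occurs in the conclusion. Since $\rho_a + 5\cdot\mathrm{stg}(b) + 5 \le \rho_a + 5\cdot\mathrm{stg}(a)$, we obtain the claim from (vi).

As an immediate consequence of the Foundation Lemma we obtain the following foundation theorem.

11.12.8 Theorem (Foundation Theorem) Let $\mathcal{H}$ be a Cantorian closed Skolem–hull operator, $\alpha$ a limit ordinal and $F(x, x_1, \dots, x_n)$ an $\mathcal{L}(\in)$-formula. Then
\[ \mathcal{H} \vdash^{2\cdot\omega\cdot\alpha+\alpha}_{0} (\forall \vec{x} \in L_\alpha)\big[(\exists z \in L_\alpha)F(z, \vec{x}\,)^{L_\alpha} \to (\exists z \in L_\alpha)[F(z, \vec{x}\,)^{L_\alpha} \wedge (\forall y \in z)\neg F(y, \vec{x}\,)^{L_\alpha}]\big]. \]

Now we have collected all the facts which are needed to compute an upper bound for the Σ-ordinal of (Π2–REF).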
11.12.9 Theorem $\|(\Pi_2\text{–REF})\|_\Sigma \le \psi(\varepsilon_{\Omega+1})$.

Proof Assume $(\Pi_2\text{–REF}) \vdash F$ for a Σ-sentence $F$. Then there are finitely many axioms $A_1, \dots, A_n$ of (Π2–REF) such that
\[ \vdash^{m}_{T} \neg A_1, \dots, \neg A_n, F \tag{i} \]
in the Tait calculus augmented by the rules for the bounded quantifiers. According to Theorem 11.12.3 we obtain natural numbers $n$ and $r$ such that
\[ \mathcal{H} \vdash^{\Omega\cdot\omega^{n}}_{\Omega+r} \neg A_1^{L_\Omega}, \dots, \neg A_n^{L_\Omega}, F^{L_\Omega}. \tag{ii} \]
For axioms $A_i$ which are different from instances of (Π2–REF) we obtain by equations (11.21) through (11.25), Lemma 11.12.4 through 11.12.6 and Theorem 11.12.8 natural numbers $n_i$ such that
\[ \mathcal{H} \vdash^{\Omega\cdot\omega^{n_i}}_{0} A_i^{L_\Omega}. \tag{iii} \]
By cuts we therefore obtain natural numbers $m$ and $s$ such that
\[ \mathcal{H} \vdash^{\Omega\cdot\omega^{m}}_{\Omega+s} \neg B_1^{L_\Omega}, \dots, \neg B_m^{L_\Omega}, F^{L_\Omega} \tag{iv} \]
where the $B_i$'s are all the instances of the scheme (Π2–REF) which were needed in the formal proof of $F$. This holds for any Cantorian closed Skolem–hull operator and is especially true for the minimal operator $\mathcal{B}$ defined in Chap. 9. Applying cut elimination we obtain an ordinal $\alpha \in \mathcal{B}_{\varepsilon_{\Omega+1}}(\emptyset) \cap \varepsilon_{\Omega+1}$ such that
\[ \mathcal{B} \vdash^{\alpha}_{\Omega} \neg B_1^{L_\Omega}, \dots, \neg B_m^{L_\Omega}, F^{L_\Omega}. \tag{v} \]
By the Collapsing Theorem (Theorem 11.11.2) we obtain
\[ \mathcal{B}_{\omega^{\omega^{\alpha}}+1} \vdash^{\psi(\omega^{\omega^{\alpha}})}_{\Omega} F^{L_\Omega}. \tag{vi} \]
By semantical cut elimination (Theorem 11.10.2) we then obtain
\[ \models^{\psi(\omega^{\omega^{\alpha}})} F^{L_\Omega} \tag{vii} \]
which implies by Corollary 11.9.11 $L_{\psi(\omega^{\omega^{\alpha}})} \models F$. So we have $L_{\psi(\varepsilon_{\Omega+1})} \models F$ for all Σ-sentences provable in (Π2–REF). Hence $\|(\Pi_2\text{–REF})\|_\Sigma \le \psi(\varepsilon_{\Omega+1})$.
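The bounds $\Omega\cdot\omega^{n}$ and $\Omega + r$ in step (ii) of the proof are the instance $\alpha = \Omega$ of the general bounds $\omega^{\omega\cdot\alpha+n}$ and $\omega\cdot\alpha+r$ of Theorem 11.12.3. A short check, assuming only that $\Omega$ is an epsilon number (that is, $\omega^{\Omega} = \Omega$):

```latex
% Assumption: \Omega is an epsilon number, i.e. \omega^{\Omega} = \Omega.
% Note 1 + \Omega = \Omega because \Omega is infinite.
\begin{align*}
\omega\cdot\Omega &= \omega^{1}\cdot\omega^{\Omega} = \omega^{1+\Omega} = \omega^{\Omega} = \Omega,\\
\omega^{\omega\cdot\Omega+n} &= \omega^{\Omega+n} = \omega^{\Omega}\cdot\omega^{n} = \Omega\cdot\omega^{n},\\
\omega\cdot\Omega + r &= \Omega + r .
\end{align*}
```

So the derivation height $\Omega\cdot\omega^{n}$ and the cut rank $\Omega + r$ are the values at $\alpha = \Omega$ and not separate estimates.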
In combination with Theorem 11.7.4 we finally obtain the ordinal analysis of (Π2 –REF). 11.12.10 Theorem (Ordinal analysis of Kripke–Platek set theory) ||KPω || = ||KPω ||Σ = ψ (εΩ +1 ) and ||(Π2 –REF)|| = ||(Π2 –REF)||Σ = ψ (εΩ +1 ). Proof By Theorem 11.5.4 the theory ID1 is a subtheory of KPω which in turn is a subtheory of (Π2 –REF). Therefore we obtain ψ (εΩ +1 ) = ||ID1 || ≤ ||KPω || ≤ ||KPω ||Σ ≤ ||(Π2 –REF)||Σ ≤ ψ (εΩ +1 ) and therefore also ψ (εΩ +1 ) = ||KPω || ≤ ||(Π2 –REF)|| ≤ ||(Π2 –REF)||Σ = ψ (εΩ +1 ).
Chapter 12
Predicativity Revisited
The ordinal analyses of ID1 and (Π2–REF) show that their crucial "impredicative" axioms are $\mathrm{ID}_1^1$ and the reflection scheme. Their presence forced us to develop the collapsing machinery in the ordinal analysis. This, however, is not the complete truth. The impredicative character of these axioms only emerges in connection with foundation.¹ This observation is due to Jäger (cf. [48] and [52]). He has shown that theories which are considerably stronger² than KPω become reducible to predicative theories as soon as the foundation scheme is removed or restricted. The methods of predicative proof theory are not at the center of this book. However, we have introduced most of the main notions and techniques needed in Jäger's work. Therefore, we sketch Jäger's approach, leaving most of the proofs as exercises. One aim of this chapter is to compute Γ0 as an upper bound for the proof-theoretic ordinal of the theories $\mathrm{KPi}^0$ and $\mathrm{KPl}^0$ introduced in Sect. 11.6. Together with Exercise 11.6.5 this also yields Γ0 as an upper bound for the proof-theoretic ordinal of the theory (ATR)0, which in turn with Exercise 8.5.31 implies that Γ0 is the exact proof-theoretic ordinal for all these theories.
12.1 Admissible Extension

The recursion theoretic background of Jäger's research is the theory of the next admissible set. Let T be a theory in a language L(T) and M a model of T. Then we can build an admissible universe $A_M$ above M in which the elements of M act as "urelements", i.e., objects without elements, such that the domain M of M is an element of $A_M$. If T is a theory in the language of set theory and M a transitive model of T we may extend M to an admissible set that contains M as an element. Here, we will mainly deal with theories in the language of set theory and introduce the admissible extension $T^{+}$ of a theory T.³ Since we are going to analyze set theories without full foundation we can no longer rely on the Σ-recursion theorem which we used to make clear that the Σ-ordinal of a set theory is an upper bound for its proof-theoretic ordinal. Therefore, we may be forced to use pseudo $\Pi^1_1$-sentences to define the proof-theoretic ordinal of theories which do not contain enough foundation. To provide pseudo $\Pi^1_1$-sentences let us assume that there are free second-order variables in the language of set theory. Since there are no defining axioms or rules for second-order variables they are harmless. However, if we count formulas $t \in X$ or $t \notin X$ among $\Delta_0$-formulas, we must be aware that X stands only for classes which are $\Delta_0$-definable. Any of our theories formulated with second-order variables is a conservative extension of its first-order part.

For the definition of the admissible extension, recall the theory $\mathrm{BST}^{-}$ (cf. p. 261) which contains the axioms of basic set theory BST (cf. p. 247) but the foundation scheme and (Union') replaced by (TranC). Let $\mathrm{KP}^{-}$ be $\mathrm{BST}^{-} + (\Delta_0\text{–Coll})$ and $\mathrm{KP}^{r}$ be $\mathrm{BST}^{-} + (\mathrm{Found}) + (\Delta_0\text{–Coll})$.

12.1.1 Definition (Admissible extension) Let T be a theory in the language of set theory. To obtain the language $L(T^{+})$, we augment the language L(T) by a constant U whose intended meaning is a universe for the theory T. The axioms of $T^{+}$ comprise:
• For all axioms $A \in T$ the sentence $A^{U}$.
• The sentence Tran(U) together with all sentences $t \in U$ for closed L(T)-terms t.
• All axioms in $\mathrm{KP}^{-}$.

Observe that $T^{+} \vdash (\exists y)[y = U]$ and $T^{+} \vdash (\forall x \in U)(\exists y)[x \in y]$, i.e., U is a set and a transitive subset of the universe. If a structure A := (A, U, E) is a model of $T^{+}$ then $U^{A} = U$ is an element of A and A := (U, E) is an L(T)-structure such that A |= T. If A is well-founded and transitive and E the standard membership relation then A is an admissible structure above A in the sense of [4]. Obviously A is an end-extension of A. However, observe that we have not included the scheme of foundation in the axioms of $T^{+}$. As we will see, this makes the theory proof-theoretically much weaker.

1 This chapter is intended to give some background information and is not needed in the following sections. It may therefore be omitted in a first reading.
2 Their ordinal analysis needs a second or even further steps into impredicativity.

W. Pohlers, Proof Theory: The First Step into Impredicativity, Universitext, © Springer-Verlag Berlin Heidelberg 2009
3 This differs from Jäger's notation who called it $T^{e}$.

12.2 M-Logic

Let T be a theory and M be the intended standard structure for L(T). Our plan is to study $T^{+}$ by capturing an admissible segment of the constructible hierarchy above the "urelement structure" M by a semi-formal system $RS^{M}$. In Definition 7.3.5 we designed a semi-formal system which is ω-complete, i.e., complete with respect to the standard model N. To obtain a semi-formal system which is complete with respect to a countable L(T)-structure M we have to introduce a variant of this semi-formal system.

12.2.1 Definition Assume that M is a countable L(T)-structure such that for every element m in the domain of M there is an L(T)-term $\underline{m}$ representing m, i.e., we require $\underline{m}^{M} = m$. We put all sentences in the diagram of M into $\bigwedge$-type and alter Definition 5.3.3 by defining
\[ \mathrm{CS}_M((Qx)F(x)) := \langle F_x(\underline{m}) \mid m \in \mathrm{dom}(M) \rangle. \]
To get a verification calculus $M \models^{\alpha} \Delta$ similar to that in Definition 5.4.3 we have to modify (Ax) to $(\mathrm{Ax})_M$ below.

$(\mathrm{Ax})_M$ If $t^{M} = s^{M}$ then $M \models^{\alpha} \Delta, s \mathrel{\varepsilon} X, t \mathrel{\not\varepsilon} X$ for all ordinals α.

Finally, we define $M \models^{\alpha} \Delta$ using the rules $(\mathrm{Ax})_M$, $(\bigwedge)$ and $(\bigvee)$. Then $M \models^{\alpha}$ satisfies a counterpart of the ω-completeness theorem (Theorem 5.4.9). This is known as the M-completeness theorem. Extending the verification calculus as in Definition 7.3.5 by a cut-rule, we obtain a semi-formal system $M \vdash^{\alpha}_{\rho} \Delta$ for L(T). This semi-formal system is uniquely determined by the structure M. Therefore, we commonly do not distinguish the structure M and its associated semi-formal system.

12.2.2 Exercise (M-completeness theorem) Let M be a countable L(T)-structure. Show that $M \models (\forall X)F(X)$ holds true iff there is a countable ordinal α such that $M \models^{\alpha} F(X)$.
Hint: Modify the proof of the ω-completeness theorem.
12.2.3 Definition Let T be a theory, M a countable L(T)-structure. For a finite set ∆ of L(T)-formulas, we denote by $T_M \vdash^{\alpha}_{\rho} \Delta$ that there is a finite subset $\Gamma \subseteq T$ such that $M \vdash^{\alpha}_{\rho} \neg\Gamma, \Delta$.

12.2.4 Exercise Let T be a theory and M a countable L(T)-structure. Show that $T \vdash F$ implies $T_M \vdash^{<\omega}_{0} F$ for all L(T)-sentences F.
Hint: This is the standard embedding argument. Show by induction on n that $\vdash^{n}_{T} \Delta(\vec{x}\,)$ implies $M \vdash^{n}_{0} \Delta(\vec{m}\,)$ for all tuples $\vec{m}$ of elements in the domain of M.

12.2.5 Exercise Show that $M \vdash^{\alpha}_{\rho} \Delta, F$ for an L(M)-sentence F which is false in M implies $M \vdash^{\alpha}_{\rho} \Delta$.
Hint: This is an easy induction on α. In case that F is not the critical formula of the last inference you get the claim immediately from the induction hypothesis. In case that F is the critical formula use that $M \not\models F$ implies $M \not\models G$ for all (some) $G \in \mathrm{CS}(F)$ if $F \in \bigvee$-type ($F \in \bigwedge$-type).
12.3 Extending Semi-formal Systems

The main tools in studying the admissible extension of a theory are semi-formal systems for ramified set theory. To this end we will extend a given semi-formal system S in two different directions. One direction is to extend S to a semi-formal system $RS^{S}$ which formalizes a set universe above the domain of S. The other extension is a system $S^{U}$ which engrafts first-order logic into S. We start by fixing an abstract notion of semi-formal systems.

12.3.1 Definition A semi-formal system⁴ S over a countable structure M is given by a language L(S), comprising the language of M, whose formulas are arranged in $\bigwedge$-type, $\bigvee$-type, and possibly also atomic formulas without types such that every formula $F \in (\bigwedge\text{-type} \cup \bigvee\text{-type})$ is equipped with a characteristic sequence $\mathrm{CS}_S(F)$. Moreover, we assume that there is a well-defined rank function $\mathrm{rnk}_S(F)$ for the formulas in the language of S satisfying $\mathrm{rnk}_S(G) < \mathrm{rnk}_S(F)$ for $G \in \mathrm{CS}_S(F)$. The derivability relation $S \vdash^{\alpha}_{\rho} \Delta$ for finite sets ∆ of L(S)-formulas is given by the axioms

$(\mathrm{Ax})_L$ If A is an atomic formula not in $\bigwedge\text{-type} \cup \bigvee\text{-type}$ and $\{A, \neg A\} \subseteq \Delta$ then $S \vdash^{\alpha}_{\rho} \Delta$ holds true for all ordinals α and ρ

and

$(\mathrm{Ax})_M$ If s and t are M-terms such that $s^{M} = t^{M}$ then $S \vdash^{\alpha}_{\rho} \Delta, s \in X, t \notin X$ holds true for all ordinals α and ρ

together with the familiar rules $(\bigwedge)$, $(\bigvee)$ and (cut). If T is a theory in the language of S then we denote by $T_S \vdash^{\alpha}_{\rho} \Delta$ that there is a finite set $\Lambda \subseteq T \cup \mathrm{IDEN}$ such that $S \vdash^{\alpha}_{\rho} \neg\Lambda, \Delta$ where IDEN is the set of identity axioms (cf. p. 61). We say that ∆ is a consequence of T in the framework of S-logic. We view $T_S$ as a semi-formal system with additional axioms. When talking about semi-formal systems, we include semi-formal systems with additional axioms. For convenience we require that there are no function symbols in the language of S. This means no restriction for our studies.

12.3.2 Remark We did not, in general, require that the language L(S) includes the membership symbol. To avoid tedious case distinctions, we assume that L(S) always contains ∈ (and thus also ∉) and put $\mathrm{CS}_S(s \in t) := \mathrm{CS}_S(s \notin t) := \emptyset$ for L(S)-terms s and t if ∈ is not among the regular symbols of L(S). Observe that in languages without membership symbols $s \notin t$ is a sentence in $\bigwedge$-type with empty characteristic sequence. We therefore obtain $S \vdash^{0}_{0} s \notin t, (\exists x \in t)(\forall y \in x)[y \notin t]$ for all L(S)-terms s and t and thus $S \vdash^{<\omega}_{0} (\mathrm{Found})$.

4 Here we understand semi-formal systems in the liberated sense of Note 7.3.17.
Our plan is now to enhance S-logic by additional set theoretic axioms. These additional axioms are formulated in first-order logic. Therefore, we have to engraft first-order logic into S-logic. This is done in the following definition, where we allow S to be a semi-formal system with additional axioms.

12.3.3 Definition (The extension $S^{U}$) Let S be a semi-formal system over a countable structure M. We augment the language L(S) by a new constant U (for urelements) and free variables (if not already present in the language L(S)) to the language $L(S^{U})$. Terms and formulas of $L(S^{U})$ are defined by the following clauses:
• Every L(S)-term is an $L(S^{U})$-term.
• Every new free variable and the constant U are $L(S^{U})$-terms.
• If F is an L(S)-formula then $F^{U}$ is an $L(S^{U})$-formula. We put $F^{U}$ in $\bigwedge$-type ($\bigvee$-type) iff F is in $\bigwedge$-type ($\bigvee$-type) and define $\mathrm{CS}_{S^{U}}(F^{U}) := \langle G^{U} \mid G \in \mathrm{CS}_S(F) \rangle$.
• We put $U \in U$ as well as $U \neq U$ in $\bigvee$-type and dually $U \notin U$ and $U = U$ in $\bigwedge$-type. All these sentences have empty characteristic sequences.
• If t is an L(S)-term then $t \in U$ is in $\bigvee$-type and we put $\mathrm{CS}(t \in U) = \langle t = s \mid s \text{ is an } L(S)\text{-term} \rangle$. Dually $t \notin U$ is in $\bigwedge$-type with $\mathrm{CS}(t \notin U) = \langle t \neq s \mid s \text{ is an } L(S)\text{-term} \rangle$.
• If s and t are $L(S^{U})$-terms different from U then $s = t$, $s \neq t$, $s \in t$ and $s \notin t$ are atomic $L(S^{U})$-formulas. Every atomic $L(S^{U})$-formula is an $L(S^{U})$-formula. Atomic formulas which do not belong to L(S) and do not contain U are neither in $\bigwedge$-type nor in $\bigvee$-type.
• The $L(S^{U})$-formulas are closed under the positive boolean operations ∧ and ∨. Conjunctions belong to $\bigwedge$-type while disjunctions are in $\bigvee$-type. The characteristic sequences of conjunctions and disjunctions are defined in the obvious way.
• If F(u) is an $L(S^{U})$-formula where u is a free variable then $(\forall x)F_u(x)$ and $(\forall x \in U)F_u(x)$ are $L(S^{U})$-formulas in $\bigwedge$-type and $(\exists x)F_u(x)$ and $(\exists x \in U)F_u(x)$ are $L(S^{U})$-formulas in $\bigvee$-type. Let
\[ \mathrm{CS}_{S^{U}}((Qx \in U)F_u(x)) := \langle F_u(t) \mid t \text{ is an } L(S)\text{-term} \rangle \]
and
\[ \mathrm{CS}_{S^{U}}((Qx)F_u(x)) := \langle F_u(t) \mid t \text{ is an } L(S^{U})\text{-term} \rangle. \]
We define
\[ \mathrm{rnk}_U(F) := \begin{cases} 0 & \text{if } F \notin \bigwedge\text{-type} \cup \bigvee\text{-type} \\ \sup\{\mathrm{rnk}_U(G) + 1 \mid G \in \mathrm{CS}_{S^{U}}(F)\} & \text{if } F \in \bigwedge\text{-type} \cup \bigvee\text{-type}. \end{cases} \]
Generalizing Definition 11.10.1, we obtain the proof relation $S^{U} \vdash^{\alpha}_{\rho} \Delta$ for a finite set ∆ of $L(S^{U})$-formulas using $(\mathrm{Ax})_L$, $(\mathrm{Ax})_M$, $(\bigwedge)$, $(\bigvee)$ and (cut) where the cut rank ρ is computed according to $\mathrm{rnk}_U$. If S is a semi-formal system with additional axioms T, then the sentences $T^{U}$ become additional axioms of $S^{U}$. The obvious abbreviation $S^{U} \vdash^{<\alpha}_{<\rho} \Delta$ denotes that there are ordinals $\alpha_0 < \alpha$ and $\rho_0 < \rho$ such that $S^{U} \vdash^{\alpha_0}_{\rho_0} \Delta$. Similarly, we use $S^{U} \vdash^{<\alpha}_{\rho_0} \Delta$. Observe that $S^{U}$ is again a semi-formal system over M.
12.3.4 Exercise (a) Check that $\mathrm{rnk}_U(F)$ is well defined. (b) Show that the Predicative Elimination Lemma (Lemma 7.3.14) and the Elimination Theorem (Theorem 7.3.15) remain true for the semi-formal system $S^{U}$.
Hint: (a) Pick an additively indecomposable ordinal η such that $\mathrm{rnk}_S(F) < \eta$ holds true for all L(S)-formulas F. Then define $\widetilde{\mathrm{rnk}}(F^{U}) = \mathrm{rnk}_S(F)$ for L(S)-formulas F, $\widetilde{\mathrm{rnk}}(F) := \eta$ for all atomic formulas which are not L(S)-formulas, $\widetilde{\mathrm{rnk}}(A \circ B) := \max\{\widetilde{\mathrm{rnk}}(A), \widetilde{\mathrm{rnk}}(B)\} + 1$ and $\widetilde{\mathrm{rnk}}((Qx)F(x)) = \widetilde{\mathrm{rnk}}(F(u)) + 1$ and show that $G \in \mathrm{CS}_{S^{U}}(F)$ implies $\widetilde{\mathrm{rnk}}(G) < \widetilde{\mathrm{rnk}}(F)$. (b) Just modify the proofs of 7.3.14 and 7.3.15.
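As a small sanity check of the auxiliary rank proposed in the hint, one can evaluate it on a single formula. The formula below is a hypothetical example, not taken from the text; it only uses the clauses of the hint for atomic formulas outside L(S), for boolean connectives and for unbounded quantifiers.

```latex
% Hypothetical example formula: (\forall x)(\exists y)[x \in y \vee x = y],
% where u \in v and u = v (u, v free variables) are atomic formulas
% that are not L(S)-formulas, so they receive the value \eta.
\begin{align*}
\widetilde{\mathrm{rnk}}(u \in v) &= \widetilde{\mathrm{rnk}}(u = v) = \eta,\\
\widetilde{\mathrm{rnk}}(u \in v \vee u = v) &= \max\{\eta, \eta\} + 1 = \eta + 1,\\
\widetilde{\mathrm{rnk}}((\exists y)[u \in y \vee u = y]) &= \eta + 2,\\
\widetilde{\mathrm{rnk}}((\forall x)(\exists y)[x \in y \vee x = y]) &= \eta + 3 .
\end{align*}
```

Each member of a characteristic sequence is a substitution instance of the immediate subformula and therefore sits one step lower, which is the strict decrease the exercise asks for.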
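12.3.5 Exercise Let ∆ be a finite set of L(S)-formulas.
(a) Show that $S \vdash^{\alpha}_{\rho} \Delta$ implies $S^{U} \vdash^{\alpha}_{\rho} \Delta^{U}$ and conclude that $T_S \vdash^{\alpha}_{\rho} \Delta$ implies $T^{U}_{S^{U}} \vdash^{\alpha}_{\rho} \Delta^{U}$.
(b) Show that $S^{U} \vdash^{\alpha}_{0} \Delta^{U}$ implies $S \vdash^{\alpha}_{0} \Delta$. Hence $T^{U}_{S^{U}} \vdash^{\alpha}_{0} \Delta^{U}$ entails $T_S \vdash^{\alpha}_{0} \Delta$.
Hint: (a) holds essentially by definition. (b) This is an easy induction on α.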
The next step is to resolve the additional set theoretic axioms enhancing a semi-formal system S into a ramified set theory above U. The ramified set theory above S is formalized by the semi-formal system $RS^{S}$ introduced below.

12.3.6 Definition (The language $L(RS^{S})$) Assume again that S is a semi-formal system over a countable structure M, possibly with additional axioms. We define the $L(RS^{S})$-terms inductively by the following clauses.
• Every L(S)-term is an $L(RS^{S})$-term of stage −1.
• Every constant $L^{S}_{\alpha}$ is an $L(RS^{S})$-term of stage α.
• Let $G(u, \vec{u}\,)$ be an $L(S^{U})$-formula such that $u, \vec{u}$ comprises all its free variables and $\vec{a}$ a tuple of closed $L(RS^{S})$-terms of stages less than α. By $G^{\alpha}$ we denote the formula which is obtained from G by replacing U by $L^{S}_{0}$ and restricting all unbounded quantifiers in G to $L^{S}_{\alpha}$. Then $\{x \in L^{S}_{\alpha} \mid G^{\alpha}_{u,\vec{u}}(x, \vec{a}\,)\}$ is an $L(RS^{S})$-term of stage α.

We use $\mathrm{stg}^{S}_{RS}(t)$ as a token for the stage of an $L(RS^{S})$-term and denote by $T^{S}_{\alpha}$ the set of all $L(RS^{S})$-terms of stages less than α. Formulas are defined by the following clause.
• If $F(u_1, \dots, u_n)$ is an $L(S^{U})$-formula whose free variables all occur in the list $u_1, \dots, u_n$ then $F^{\alpha}_{u_1, \dots, u_n}(a_1, \dots, a_n)$ is an $L(RS^{S})$-formula for all n-tuples $a_1, \dots, a_n$ of $L(RS^{S})$-terms.

Observe that for an L(S)-formula F we obtain $F^{U}$ as an $L(S^{U})$-formula and thus $(F^{U})^{\alpha}$, i.e., $F^{L^{S}_{0}}$, as an $L(RS^{S})$-formula. Since we (may) have equations in the "basis language" L(S) we need the equality symbol as basic symbol anyway. Therefore, we regard equations $s = t$ and $s \neq t$ of $L(RS^{S})$ no longer as defined but as separate atomic formulas. This is in contrast to the language of ramified set theory as introduced in Sect. 11.9 and forces us to define also the type and the characteristic sequences for equations and inequalities. However, it becomes immediately clear that the difference is only of technical nature.
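12.3.7 Definition (The $\bigwedge$-type and $\bigvee$-type of $\mathcal{L}(RS_S)$ and the characteristic sequences of $\mathcal{L}(RS_S)$-sentences.) We define $\bigwedge$-type, $\bigvee$-type and $\mathrm{CS}^{RS}_S(F)$ for $\mathcal{L}(RS_S)$-sentences by the following clauses:
• If F is a formula $F_0^{L^S_0}$ where $F_0$ is an $\mathcal{L}(S)$-formula we put F in $\bigwedge$-type ($\bigvee$-type) iff $F_0$ is in $\bigwedge$-type ($\bigvee$-type) of the semi-formal system S and define $\mathrm{CS}^{RS}_S(F) = \langle G^{L^S_0} \mid G \in \mathrm{CS}_S(F_0) \rangle$.

For the following clauses, assume that F is not of the form $F_0^{L^S_0}$ for an $\mathcal{L}(S)$-formula $F_0$.
• Formulas $s = t$ belong to $\bigwedge$-type with $\mathrm{CS}^{RS}_S(s = t) := \langle (\forall x \in s)(x \in t),\ (\forall x \in t)(x \in s) \rangle$.
• Formulas $s \neq t$ belong to $\bigvee$-type with $\mathrm{CS}^{RS}_S(s \neq t) := \langle (\exists x \in s)(x \notin t),\ (\exists x \in t)(x \notin s) \rangle$.

For simplicity we put $\mathrm{CS}^{RS}_S(L^S_0 = L^S_0) = \mathrm{CS}^{RS}_S(L^S_0 \neq L^S_0) := \emptyset$.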
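Atomic formulas $s \in t$ belong to $\bigvee$-type and we define their characteristic sequences as

$$\mathrm{CS}^{RS}_S(s \in t) := \begin{cases} \emptyset & \text{if } s \text{ is } L^S_0 \text{ and } t \in T^S_0 \\ \emptyset & \text{if } t \text{ and } s \text{ are both the term } L^S_0 \\ \langle s = a \mid a \in T^S_\alpha \rangle & \text{if } t \text{ is a term } L^S_\alpha \\ \langle s = a \wedge a \in t \mid a \in T^S_0 \rangle & \text{if } t \in T^S_0 \text{ and } s \text{ is not } L^S_0\ \text{(see footnote 5)} \\ \langle s = a \wedge F(a) \mid a \in T^S_\alpha \rangle & \text{if } t = \{x \in L^S_\alpha \mid F(x)\}. \end{cases}$$

Dually, we count formulas $s \notin t$ among $\bigwedge$-type and define $\mathrm{CS}^{RS}_S(s \notin t) := \langle \neg F \mid F \in \mathrm{CS}^{RS}_S(s \in t) \rangle$. Quantifiers which are bounded by terms of stage −1 are regarded as defined. That is, for $t \in T^S_0$ we put

$$\mathrm{CS}^{RS}_S((Qx \in t)F(x)) = \begin{cases} \langle s \in t \wedge F(s) \mid s \in T^S_0 \rangle & \text{if } Q = \exists \\ \langle s \notin t \vee F(s) \mid s \in T^S_0 \rangle & \text{if } Q = \forall. \end{cases}$$

In the definition of the characteristic sequences $\mathrm{CS}^{RS}_S(F)$ of the remaining sentences F we follow the rules in Definition 11.9.6 where we replace $T_\alpha$ by $T^S_\alpha$. We define

$$\mathrm{rnk}^{RS}_S(F) := \begin{cases} 0 & \text{if } F \notin (\bigwedge\text{-type} \cup \bigvee\text{-type}) \\ \sup\{\mathrm{rnk}^{RS}_S(G) + 1 \mid G \in \mathrm{CS}^{RS}_S(F)\} & \text{if } F \in (\bigwedge\text{-type} \cup \bigvee\text{-type}). \end{cases}$$

We will, however, mostly omit the subscript and just write rnk(F) instead of $\mathrm{rnk}^{RS}_S(F)$ if there is no danger of confusion. It is mostly clear from the context (or inessential) to which rank we refer. Finally, we obtain the derivability relation $RS_S \vdash^{\alpha}_{\rho} \Delta$ using axioms $(Ax)_M$, $(Ax)_L$ and the rules $(\bigwedge)$, $(\bigvee)$ and (cut).

12.3.8 Exercise Show that $\mathrm{rnk}^{RS}_S(F)$ is well-defined.
Hint: Let η be an additively indecomposable ordinal such that $\mathrm{rnk}_S(F) \le \eta$ for all formulas in $\mathcal{L}(S)$. Define $\widetilde{\mathrm{rnk}}(F^{L^S_0}) = \mathrm{rnk}_S(F)$ for all formulas F in $\mathcal{L}(S)$. For $\mathcal{L}(S)$-terms t put $\widetilde{\mathrm{rnk}}(t) = \mathrm{rnk}_S(t)$, or $\widetilde{\mathrm{rnk}}(t) = 0$ in case that $\mathrm{rnk}_S(t)$ is undefined. Then define $\widetilde{\mathrm{rnk}}(L^S_\alpha) := \eta + \omega\cdot\alpha$, put, according to Lemma 11.9.17,
$$\widetilde{\mathrm{rnk}}(s = t) = \widetilde{\mathrm{rnk}}(s \neq t) := \max\{9,\ \widetilde{\mathrm{rnk}}(s) + 4,\ \widetilde{\mathrm{rnk}}(t) + 4\}$$
and then follow the clauses in Definition 11.9.14 to define $\widetilde{\mathrm{rnk}}(E)$ for all $\mathcal{L}(RS_S)$-expressions E. Finally, show that $\widetilde{\mathrm{rnk}}(G) < \widetilde{\mathrm{rnk}}(F)$ holds true for all $G \in \mathrm{CS}^{RS}_S(F)$.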
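The rank clause of Definition 12.3.7 can be illustrated on a toy example. The following is only a minimal sketch under strong simplifying assumptions: formulas are plain strings, the characteristic sequences are finite and supplied by a hypothetical dictionary `cs`, and a formula missing from `cs` is treated as belonging to neither type. In the text the sequences are in general infinite and the supremum may be a limit ordinal, which this finite sketch does not model.

```python
def rank(formula, cs):
    """Rank of `formula` relative to the finite table `cs`.

    `cs` maps a formula of one of the two types to the list of members of
    its characteristic sequence (a hypothetical stand-in for CS(F)).
    Formulas not listed in `cs` count as being of neither type.
    """
    members = cs.get(formula)
    if members is None:
        # F is neither of conjunctive nor of disjunctive type: rank 0.
        return 0
    # sup { rank(G) + 1 | G in CS(F) }; the supremum of the empty set is 0.
    return max((rank(g, cs) + 1 for g in members), default=0)


# Hypothetical toy table: "A", "B", "C" are atoms outside both types.
cs = {
    "A∧B": ["A", "B"],
    "(A∧B)∨C": ["A∧B", "C"],
    "⊥": [],  # a formula with empty characteristic sequence
}
```

In this table every member of a characteristic sequence has strictly smaller rank than the formula itself, which is the property Exercise 12.3.8 asks for in the general setting.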
12.3.9 Exercise (a) Show that for finite sets Δ of $\mathcal{L}(S)$-formulas $RS_S \vdash^{\alpha}_{0} \Delta^{L^S_0}$ implies $S \vdash^{\alpha}_{0} \Delta$.
(b) Show that $S \vdash^{\alpha}_{\rho} \Delta$ implies $RS_S \vdash^{\alpha}_{\rho} \Delta^{L^S_0}$.
Hint: Use induction on α.

Footnote 5: Strictly speaking, we should distinguish between the membership relation in the language $\mathcal{L}(S)$ and $\mathcal{L}(RS_S)$. It will, however, always be clear from the context which membership relation is meant.
The semi-formal system $RS_S$ is a generalization of ramified set theory RS as introduced in Sect. 11.9. The system RS represents the “pure part” of $RS_S$. According to [4] Chapter II 1, the pure part of an admissible set A consists of the elements a ∈ A with empty support, i.e., of those elements for which TC(a) contains no urelements. Since $T_0 = \emptyset$ we get $RS \vdash^{0}_{0} s \notin L_0$ for all $\mathcal{L}_{RS}$-terms s, i.e., $L_0$ represents the empty set in RS. In contrast to that, we get $RS_S \vdash^{\alpha}_{\rho} s \in L^S_0$ for those $s \in T^S_0$ for which $S \vdash^{\beta}_{\rho} s = s$ holds for some β < α. Therefore $L^S_0$ in $RS_S$ represents the set of all “urelements” (i.e., $\mathcal{L}(S)$-terms) for which S proves s = s. If we assume $S \vdash^{0}_{0} s = s$ for all $\mathcal{L}(S)$-terms s then $L^S_0$ contains all “urelements”. Defining $\emptyset := \{x \in L^S_0 \mid x \neq x\}$ we get $RS_S \vdash^{1}_{0} t \notin \emptyset$ for all $\mathcal{L}(RS_S)$-terms t. The role of $L_0$ in RS is therefore played by ∅ in $RS_S$. The elements in the “pure part” of $RS_S$ are those whose transitive closure has empty intersection with $L^S_0$ (cf. Fig. 12.1).

We want to transfer wide parts of the results of Sects. 11.10 and 11.12 to the system $RS_S$. This means going through all the proofs given there again. The situation is, however, not so bad since we do not need controlling operators, i.e., we do not have to care about the parameters occurring in formulas. In checking the properties of the empty set as defined in $RS_S$ we have already seen that the underlying semi-formal system S should fulfill some prerequisites, e.g., $S \vdash^{0}_{0} s = s$ for every $\mathcal{L}(S)$-term s. This is the background of the following definition.

12.3.10 Definition Call a semi-formal system S (possibly with additional axioms) chaste iff
• $S \vdash^{<\omega}_{0} s = t,\ s \neq t$, $S \vdash^{<\omega}_{0} s \in t,\ s \notin t$ and $S \vdash^{<\omega}_{0} s \notin s$ hold true for all $\mathcal{L}(S)$-terms s and t and
• $S \vdash^{<\omega}_{0} F$ for all instances F of identity axioms in IDEN.
12.3.11 Exercise Let M be a countable L (∈)-structure which is well-founded and transitive. (a) Show that M-logic is chaste. (b) Show that for any chaste semi-formal system S and any theory T in the language of SU the system TSU is chaste. Hint: (a) follows from the Π11 -completeness of M-logic. For (b) recall the convention that all theories are supposed to comprehend the identity axioms.
So assume that S is a chaste semi-formal system. First, we observe that the calculus $\vdash^{\circ} \Delta$ defined in Definition 11.10.5 carries over to the system $RS_S$ where we may add the rule

(S) $S \vdash^{<\omega}_{0} \Delta$ implies $\vdash^{\circ} \Delta$

and drop the parameter conditions. Also the derived rules (Str) through $(\forall_b)$ stay valid where in the case of (Taut) we need the chasteness of S in case that A is a formula s ∈ t for $\mathcal{L}(S)$-terms s and t. We easily check that Lemma 11.10.6 modifies to the semi-formal system $RS_S$, i.e., we have

$$RS_S \vdash^{\circ} \Delta \ \Rightarrow\ RS_S \vdash^{\omega + \mathrm{drnk}(\Delta)}_{0} \Delta. \tag{12.1}$$
Next, we check equation (11.13) on p. 282, i.e., $(a \notin a)$ for all $RS_S$-terms a. Here we have the additional case that a is in $T^S_0$. But then $RS_S \vdash^{\circ} a \notin a$ follows from the chasteness of S. Similarly we get (11.14), i.e., $(a = a)$, in the case of $a \in T^S_0$ by the chasteness of S. The lacking case $a \in T^S_0$ in (11.15) (on p. 282) is also covered by the chasteness of S. Equations (11.16) and (11.17) follow in the same way. The restriction $\alpha \neq 0$ in (11.17) can be dropped. In (11.19), we have to supplement $\mathrm{Tran}(L^S_0)$. To get it observe that $RS_S \vdash^{<\omega}_{0} s = s$ implies $RS_S \vdash^{<\omega}_{0} s \in L^S_0$ which in turn entails $RS_S \vdash^{<\omega}_{0} s \notin t \vee s \in L^S_0$, for all $\mathcal{L}(S)$-terms t and s. Hence $RS_S \vdash^{<\omega}_{0} (\forall y \in t)[y \in L^S_0]$ and thus $RS_S \vdash^{<\omega}_{0} (\forall x \in L^S_0)(\forall y \in x)[y \in L^S_0]$, i.e., $RS_S \vdash^{<\omega}_{0} \mathrm{Tran}(L^S_0)$. Here, we have also shown that (11.20) transfers to $RS_S$. A bit more tedious to show is the counterpart of Theorem 11.12.2

$$RS_S \vdash^{\circ} t_1 \neq s_1, \dots, t_n \neq s_n,\ \neg A(t_1,\dots,t_n),\ A(s_1,\dots,s_n).$$
Here we show first the counterpart of Lemma 11.12.1, where we need the chasteness of S to treat the case of terms of negative stages. Now it is easy to check that (11.21) – (11.25) and Lemmas 11.12.4 – 11.12.6 (pp. 291 and 293) become provable in $RS_S$. In the coming text, we refer to these results as properties of $RS_S$. Observe, however, that in general, we cannot transfer the Foundation Lemma 11.12.7. The reason is that $T^S_0$ is no longer empty. We will study in Sect. 12.5 what is needed to handle the axiom of foundation. Exercise 12.3.14 below displays a special situation in which $RS_S$ proves foundation.

12.3.12 Remark Another difference is that the system $RS_S$ may include free second-order parameters which are supposed to represent subsets of the domain of the urelement structure. According to the discussion in Sects. 9.1 and 11.7, we could dispense with second-order parameters in ramified set theory RS. They can, however, easily be added. To do so, we allow atomic formulas of the form $t \mathrel{\varepsilon} X$ and $t \mathrel{\not\varepsilon} X$ where t is an $\mathcal{L}_{RS}$-term and introduce the rule

$(Ax)_{RS}$ If $\{t \mathrel{\varepsilon} X,\ s \mathrel{\not\varepsilon} X\} \subseteq \Delta$ and $t^L = s^L$ then $RS(X) \vdash^{\alpha}_{\rho} \Delta$ holds true for all ordinals α and ρ.

Let $RS(X) \vdash^{\alpha} \Delta$ stand for $RS(X) \vdash^{\alpha}_{0} \Delta$. We obtain the following completeness theorem for the extended verification calculus.

12.3.13 Exercise Let (∀X)F(X) be a $\Pi^1_1$-sentence in the language of ramified set theory which contains only the parameters $L_\alpha$ for $\alpha < \omega_1$. Show that $L_{\omega_1} \models (\forall X)F(X)$ iff there is an ordinal $\alpha < \omega_1$ such that $RS(X) \vdash^{\alpha} F(X)$.
Hint: One direction is shown straightforwardly by induction on α and does not need the hypothesis of countability. For the opposite direction you have to define a search tree analogous to the search tree in Definition 5.4.5.
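Let us discuss the situation in which the basis semi-formal system is $L_\tau$-logic for a countable ordinal τ. For $a \in L_\tau$ we put $\mathrm{rnk}_{L_\tau}(a)$ as the first ordinal at which a enters $L_\tau$ and define for an $\mathcal{L}(RS_{L_\tau})$-term a

$$\mathrm{stg}^{RS}_{\tau+}(a) := \begin{cases} \mathrm{rnk}_{L_\tau}(a) & \text{for } a \in T^{L_\tau}_0 \\ \tau + \mathrm{stg}^{RS}_{L_\tau}(a) & \text{otherwise.} \end{cases}$$

In $RS_{L_\tau}$, we can prove a foundation axiom, even a foundation scheme.

12.3.14 Exercise Show
$$RS_{L_\tau} \vdash^{\alpha}_{0} b \notin a,\ (\exists x \in a)(\forall y \in x)[y \notin a]$$
for $\alpha = \mathrm{rnk}(b \in a) + 5\cdot(\mathrm{stg}^{RS}_{\tau+}(b) + 1)$.
Hint: Use induction on $\mathrm{stg}^{RS}_{\tau+}(b)$.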
The system RS Lτ represents a set universe above the urelement structure Lτ . We have seen in Sect. 11.10 that there is a semi-formal system building up Lτ from scratch. We are now going to unravel the urelement structure using ramified set theory (cf. Fig. 12.1). In Lemma 11.9.3, we have shown that for every set a ∈ Lτ there is an L (RS Lτ )-term ta representing a, i.e., satisfying taL = a. Choosing ta
[Fig. 12.1 Unraveling $L_\tau$ into $RS(X){\restriction}T_\tau$. Diagram labels: On; pure part of $RS_{L_\tau}$; $RS_{L_\tau}$; $L_0^{L_\tau}$ (= $L_\tau$).]
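12.3.17 Lemma Let τ be a countable additively indecomposable ordinal above $\omega^\omega$. Then $RS_{L_\tau} \vdash^{\alpha}_{\rho} \Delta$ implies $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+}$.

Proof The proof is by induction on α. We will not show all cases but restrict ourselves to the more delicate ones.

If $RS_{L_\tau} \vdash^{\alpha}_{\rho} \Delta$ holds by $(Ax)_{L_\tau}$ there are $L_\tau$-terms a and b such that $a^{L_\tau} = b^{L_\tau}$ and $\{a \in X,\ b \notin X\} \subseteq \Delta$. But then we get $(a^{\tau+})^L = t_a^L = a^{L_\tau} = b^{L_\tau} = t_b^L = (b^{\tau+})^L$, and we get $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+}$ by $(Ax)_{RS}$.

Now assume that $RS_{L_\tau} \vdash^{\alpha}_{\rho} \Delta$ holds by an inference $(\bigwedge)$. Then there is a formula $F \in \bigwedge\text{-type} \cap \Delta$ such that $RS_{L_\tau} \vdash^{\alpha_G}_{\rho} \Delta, G$ for all $G \in \mathrm{CS}_{RS_{L_\tau}}(F)$. If $\mathrm{CS}_{RS_{L_\tau}}(F) = \emptyset$ then F belongs to $\mathrm{Diag}(L_\tau)$ and we obtain $RS(X) \vdash^{\mathrm{rnk}(F)}_{0} \Delta$ by Lemma 11.9.18. So assume $\mathrm{CS}_{RS_{L_\tau}}(F) \neq \emptyset$. If F is a formula $s \notin s_0$, where $s_0$ is a term of stage −1 and s a term of stage ≥ 0, we have the premises

$$RS_{L_\tau} \vdash^{\alpha_G}_{\rho} \Delta, G \tag{i}$$

for all $G \in \mathrm{CS}_{RS_{L_\tau}}(F)$. If $s_0$ is $L_\beta$ for some β < τ the premises have the form $RS_{L_\tau} \vdash^{\alpha_t}_{\rho} \Delta,\ s \neq t \vee t \notin L_\beta$, hence

$$RS_{L_\tau} \vdash^{\alpha_t}_{\rho} \Delta,\ s \neq t,\ t \notin L_\beta \tag{ii}$$

for all $t \in T^{L_\tau}_0$. For $a \in T_\beta$ we get $a^L \in L_\beta$ and thus $RS_{L_\tau} \vdash^{\alpha_{a^L}}_{\rho} \Delta,\ s \neq a^L$ from (ii) by Exercise 12.2.5. By induction hypothesis we thus obtain $RS(X) \vdash^{\tau+3\cdot\alpha_{a^L}}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \neq a$ for all $a \in T_\beta$. Hence $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \notin L_\beta$.

If $s_0$ is a term $\{x \in L_\beta \mid H(x)\}$ for some β < τ the premises have the form

$$RS_{L_\tau} \vdash^{\alpha_t}_{\rho} \Delta,\ s \neq t,\ t \notin s_0 \tag{iii}$$

for all $t \in T^{L_\tau}_0$. Let $a \in T_\beta$. Then $a^L \in L_\beta \subseteq L_\tau = T^{L_\tau}_0$. If $L_\tau \models H(a^L)$ then $L_\tau \models a^L \in s_0^L$ and thus $RS(X) \vdash^{\tau+3\cdot\alpha_{a^L}}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \neq a,\ \neg H(a^L)^{\tau+}$ by (iii), Exercise 12.2.5 and the induction hypothesis. If $L_\tau \models \neg H(a^L)$ then we also obtain $L \models \neg H(a^L)^{\tau+}$ which by Corollary 11.9.19 implies $RS(X) \vdash^{\tau}_{0} \Delta^{\tau+},\ s^{\tau+} \neq a,\ \neg H(a^L)^{\tau+}$ since $\mathrm{rnk}(\neg H(a^L)^{\tau+}) < \tau$. But $\neg H(a^L)^{\tau+} = \neg H^{\tau+}(a)$ and we have

$$RS(X) \vdash^{\tau+3\cdot\alpha_a+2}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \neq a \vee \neg H^{\tau+}(a)$$

for all $a \in T_\beta$ which implies $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \notin \{x \in L_\beta \mid H^{\tau+}(x)\}$. In all other cases, we get $\mathrm{CS}(F^{\tau+}) = \langle G^{\tau+} \mid G \in \mathrm{CS}_{RS_{L_\tau}}(F) \rangle$ and obtain the claim directly from the induction hypothesis.

The next case is that of a $(\bigvee)$-inference. Here, we have a formula $F \in \bigvee\text{-type} \cap \Delta$ and a premise

$$RS_{L_\tau} \vdash^{\alpha_0}_{\rho} \Delta, G \tag{iv}$$

for some $G \in \mathrm{CS}_{RS_{L_\tau}}(F)$. The only remarkable case is that F is a formula $s \in s_0$ for a term $s_0$ of stage −1. If $s_0$ is $L_\beta$ for some β < τ then the premise has the form

$$RS_{L_\tau} \vdash^{\alpha_0}_{\rho} \Delta,\ s = a \wedge a \in L_\beta \tag{v}$$

for some $a \in T^{L_\tau}_0$. Then $a^{\tau+} = t_a \in T_\tau$. If $t_a \in T_\beta$ we get $RS(X) \vdash^{\tau+3\cdot\alpha_0}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} = t_a$, hence also $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \in L_\beta$, from (v) by ∧-inversion and induction hypothesis. If $t_a \notin T_\beta$ then $a = t_a^L \notin L_\beta$ and we get $RS_{L_\tau} \vdash^{\alpha_0}_{\rho} \Delta$ from (v) by ∧-inversion and Exercise 12.2.5. Hence $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \in L_\beta$ by induction hypothesis.

If $s_0 = \{x \in L_\beta \mid H(x)\}$ for some β < τ the premise has the form

$$RS_{L_\tau} \vdash^{\alpha_0}_{\rho} \Delta,\ s = a \wedge a \in s_0 \tag{vi}$$

for some $a \in T^{L_\tau}_0$. If $L_\tau \models a \in s_0^L$ we get $(a^{\tau+})^L = a \in s_0^L$ which entails $a^{\tau+} \in T_\beta$ by the minimality of $a^{\tau+}$ and $L \models H(a)^{\tau+}$. Hence $RS(X) \vdash^{\tau}_{0} \Delta^{\tau+},\ H^{\tau+}(a^{\tau+})$ by Corollary 11.9.19. From (vi) we get $RS(X) \vdash^{\tau+3\cdot\alpha_0}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} = a^{\tau+}$ by ∧-inversion and induction hypothesis. Hence $RS(X) \vdash^{\tau+3\cdot\alpha_0+1}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} = a^{\tau+} \wedge H^{\tau+}(a^{\tau+})$ by an inference $(\bigwedge)$ which entails $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \in \{x \in L_\beta \mid H^{\tau+}(x)\}$. If $L_\tau \not\models a \in s_0^L$ we get $RS_{L_\tau} \vdash^{\alpha_0}_{\rho} \Delta$ from (vi) and Exercise 12.2.5 which immediately implies $RS(X) \vdash^{\tau+3\cdot\alpha}_{\tau+\rho} \Delta^{\tau+},\ s^{\tau+} \in \{x \in L_\beta \mid H^{\tau+}(x)\}$ by the inductive hypothesis. In the remaining cases we again have $\mathrm{CS}(F^{\tau+}) = \langle G^{\tau+} \mid G \in \mathrm{CS}_{RS_{L_\tau}}(F) \rangle$ and obtain the claim easily from the induction hypothesis.

The last case to consider is a cut. But here we just observe that $\mathrm{rnk}(F^{\tau+}) \le \tau + \mathrm{rnk}_{RS_{L_\tau}}(F)$ and apply the induction hypothesis.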
12.4 Asymmetric Interpretations

The key tool in using ramified set theory in the ordinal analysis of predicative theories is a technique which is known as asymmetric interpretation. An interpretation fixes the ranges of set quantifiers. An interpretation is asymmetric if existential and universal quantifiers get different ranges. Our first application of asymmetric interpretations is the reduction of the admissible extension $T^+$ of a theory to the basis theory T in the framework of S-logic.

12.4.1 Definition Let F be a formula in the language $\mathcal{L}(S_U)$. By $F^{(\alpha,\beta)}$ we denote the $\mathcal{L}(RS_S)$-formula which is obtained from F by replacing all quantifiers $(Qx \in U)[\cdots x \cdots]$ by $(Qx \in L_0)[\cdots x \cdots]$, all unbounded quantifiers $(\forall x)[\cdots x \cdots]$ by $(\forall x \in L_\alpha)[\cdots x \cdots]$ and all quantifiers $(\exists x)[\cdots x \cdots]$ by $(\exists x \in L_\beta)[\cdots x \cdots]$. I.e., unbounded quantifiers are interpreted asymmetrically while quantifiers ranging over U are interpreted by quantifiers ranging over $L_0$. If Δ is the finite set $\{F_1,\dots,F_n\}$, we denote by $\mathcal{D}^{(\alpha,\beta)}$ the collection of all sets $\{F_1^{(\alpha_1,\beta_1)},\dots,F_n^{(\alpha_n,\beta_n)}\}$ with $\alpha_i \le \alpha$ and $\beta_i \ge \beta$ for i = 1, …, n. If A is a formula $(\forall x_1)\dots(\forall x_n)F_{v_1,\dots,v_n}(x_1,\dots,x_n)$ and $\vec{u} = u_1,\dots,u_n$ is a tuple of variables we call $F_{\vec{v}}(\vec{u})$ a specialization of A.

12.4.2 Lemma (a) Let S be a chaste semi-formal system. Assume moreover that every formula in the finite set $\Lambda(\vec{u})$ of $\mathcal{L}(S_U)$-formulas is a specialization of an axiom in $KP^-$ or an identity axiom or an axiom Tran(U). Let moreover $\Delta(\vec{u})$ be a set of $\mathcal{L}(S_U)$-formulas. Then

$$S_U \vdash^{\alpha}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}) \quad\text{implies}\quad RS_S \vdash^{\varphi_1(\beta+3^\alpha)}_{\omega\cdot(\beta+3^\alpha+1)} \Delta_{\vec{u}}(\vec{a})$$

for all ordinals β ≥ 2, all finite sets $\Delta \in \mathcal{D}^{(\beta,\beta+3^\alpha)}$ and all tuples $\vec{a}$ of $\mathcal{L}(RS_S)$-terms of stages less than β.
(b) If $\Lambda(\vec{u})$ also contains specializations of axiom (Found) and S is $L_\tau$-logic for some countable ordinal τ the claim remains true for β ≥ τ.

Proof The proof is by induction on α. Choose β and let $\sigma := \beta + 3^\alpha$ and $\rho := \omega\cdot(\sigma+1)$ and pick $\Delta \in \mathcal{D}^{(\beta,\sigma)}$.

Assume first that the critical formula(s) of the last inference (J) belong(s) to $\Delta(\vec{u})$. The claim follows trivially if (J) is an axiom $(Ax)_M$ or an axiom $(Ax)_L$ belonging to $\mathcal{L}(S)$. If (J) is an axiom $(Ax)_L$ not belonging to $\mathcal{L}(S)$ then there is an atomic formula $A(u_i,u_j)$ such that $\{\neg A(u_i,u_j),\ A(u_i,u_j)\} \subseteq \Delta(\vec{u})$ and we obtain $RS_S \vdash^{\delta}_{\rho} \Delta(\vec{a})$ for some ordinal $\delta < \varphi_1(\beta)$ by (Taut) (footnote 6: cf. p. 281).

If the critical formula of (J) is $(\forall x \in U)G(x,\vec{u})$ we have the premises

$$S_U \vdash^{\alpha_t}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ G(t,\vec{u})$$

for all $\mathcal{L}(S)$-terms t, i.e., for all $t \in T^S_0$. There are ordinals β′ ≤ β and σ′ ≥ σ such that $((\forall x \in U)G(x,\vec{a}))^{(\beta',\sigma')} \in \Delta$. By the induction hypothesis we get

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_t})}_{\rho} \Delta(\vec{a}),\ G(t,\vec{a})^{(\beta',\sigma')}$$

for all $t \in T^S_0$ and obtain $RS_S \vdash^{\varphi_1(\beta+3^\alpha)}_{\rho} \Delta(\vec{a}),\ (\forall x \in L_0)G(x,\vec{a})^{(\beta',\sigma')}$ by an inference $(\bigwedge)$. The case that the critical formula is $(\exists x \in U)G(x,\vec{u})$ is shown analogously.

If the critical formula F is $(\forall x)G(x,\vec{u})$ then we have the premises

$$S_U \vdash^{\alpha_a}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ G(t,\vec{u})$$

for all terms t. Choose t to be a variable which does not occur in $\vec{u}$. There are ordinals β′ ≤ β and σ′ ≥ σ such that $((\forall x)G(x,\vec{a}))^{(\beta',\sigma')} \in \Delta$. By the inductive hypothesis, we then obtain

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_a})}_{\rho} \Delta(\vec{a}),\ G(b,\vec{a})^{(\beta',\sigma')}$$

for all $\mathcal{L}(RS_S)$-terms b of stage less than β and by an inference $(\bigwedge)$ we get

$$RS_S \vdash^{\varphi_1(\beta+3^\alpha)}_{\rho} \Delta(\vec{a}),\ (\forall x \in L_{\beta'})G(x,\vec{a})^{(\beta',\sigma')},$$

i.e., $RS_S \vdash^{\varphi_1(\beta+3^\alpha)}_{\rho} \Delta(\vec{a})$.

Now assume that F is a formula $(\exists x)G(x,\vec{u})$. Let β′ ≤ β and σ′ ≥ σ be ordinals such that $((\exists x)G(x,\vec{a}))^{(\beta',\sigma')} \in \Delta$. We have the premise

$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ G(t,\vec{u})$$

for some term t. If t is a variable occurring in the list $\vec{u}$ we replace it by the corresponding term b in the list $\vec{a}$. Otherwise t is either an $\mathcal{L}(S)$-term, i.e., a term in $T^S_0$, the term U, which is to be replaced by $L_0$, or a variable not occurring in $\vec{u}$. In the latter case we replace t by any $\mathcal{L}(RS_S)$-term of stage less than β and obtain

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a}),\ G(b,\vec{a})^{(\beta',\sigma')}$$

by the inductive hypothesis. The claim follows by an inference $(\bigvee)$. The remaining cases that the critical formula belongs to $\Delta(\vec{u})$ are even simpler and can be treated analogously.

Assume next that the critical formula F of the last inference belongs to the set $\neg\Lambda(\vec{u})$. There are two main sub-cases. First assume that the formula F has the shape $(\exists x_{k+1})\dots(\exists x_n)G(\vec{u},x_{k+1})$ such that $(\forall x_{k+2})\dots(\forall x_n)\neg G(\vec{u},v)$ is still a specialization of one of the axioms. Then, we have the premise

$$S_U \vdash^{\alpha_0}_{0} (\exists x_{k+2})\dots(\exists x_n)G(\vec{u},t),\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

for some $\mathcal{L}(S_U)$-term t. If t is an $\mathcal{L}(S)$-term, we leave t unchanged. If t is U we replace t by $L_0$. If t is a variable in the list $\vec{u}$, we replace t by the corresponding term in the list $\vec{a}$. If t is a variable not occurring in $\vec{u}$, we replace t by any term of stage less than β and obtain by the induction hypothesis

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a})$$

for all $\Delta \in \mathcal{D}^{(\beta,\beta+3^{\alpha_0})} \supseteq \mathcal{D}^{(\beta,\beta+3^{\alpha})}$ and thus also $RS_S \vdash^{\varphi_1(\beta+3^\alpha)}_{\rho} \Delta(\vec{a})$.

Now assume that the formula(s) in the premise corresponding to the critical formula F are no longer specializations of an axiom. Here, we have to distinguish cases according to the axiom which F specializes.

(Nullset) If ¬F is a specialization of (Nullset) then F is $(\forall x)(\exists y)[y \in x]$ and we have a premise

$$S_U \vdash^{\alpha_0}_{0} (\exists y)[y \in w],\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

for a free variable w not occurring in the list $\vec{u}$. Let $b := \{x \in L_\beta \mid x \neq x\}$. Then b is a term of stage β. Applying the induction hypothesis to β′ := β + 1 we obtain

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\rho_0} (\exists y \in L_\beta)[y \in b],\ \Delta(\vec{a}).$$

By (11.14) (on p. 282) and the chasteness of S we get $RS_S \vdash^{<\varphi_1(\beta)}_{0} a = a$ for all terms a of stage less than β and thus

$$RS_S \vdash^{<\varphi_1(\beta)}_{0} (\forall y \in L_\beta)[y \notin b],\ \Delta(\vec{a})$$

and we obtain the claim by cut.

(Pair) Let ¬F be a specialization of (Pair′), i.e., F is a formula $(\forall z)[u \notin z \vee v \notin z]$. Then, we have a premise

$$S_U \vdash^{\alpha_0}_{0} u \notin w \vee v \notin w,\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

such that w does not occur in the list $\vec{u}$ while u and v are members of the list. Let $a_1$ and $a_2$ be the terms in the list $\vec{a}$ which correspond to u and v, respectively. Then $b := \{a_1, a_2\}$ is an $\mathcal{L}(RS_S)$-term of stage ≤ β. Let β′ := β + 1 and apply the inductive hypothesis to obtain

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\rho_0} a_1 \notin b \vee a_2 \notin b,\ \Delta(\vec{a}). \tag{i}$$

By (11.14) (cf. p. 282) we obtain by a few inferences

$$RS_S \vdash^{<\varphi_1(\beta)}_{0} a_1 \in b \wedge a_2 \in b. \tag{ii}$$

Since $\mathrm{rnk}(a_1 \in b \wedge a_2 \in b) < \omega\cdot\beta + \omega < \rho$ we get $RS_S \vdash^{\varphi_1(\beta+3^\alpha)}_{\rho} \Delta(\vec{a})$ from (i) and (ii) by cut.

(Union) Now assume that ¬F is a specialization of (Union′). Then F is a formula $(\forall z)(\exists x \in u)[\neg\, x \subseteq z]$ and we have a premise

$$S_U \vdash^{\alpha_0}_{0} (\exists x \in u)[\neg\, x \subseteq w],\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

where w is a variable not occurring in the list $\vec{u}$. Let $a_0$ be the term which corresponds to u. Then $a := \bigcup a_0 = \{x \in L_\beta \mid (\exists y \in a_0)[x \in y]\}$ is an $\mathcal{L}(RS_S)$-term of stage β. Applying the inductive hypothesis for β′ := β + 1 we get

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\rho} (\exists x \in a_0)[\neg\, x \subseteq a],\ \Delta(\vec{a}) \tag{iii}$$

for all $\Delta \in \mathcal{D}^{(\beta',\beta'+3^{\alpha_0})}$. Since

$$RS_S \vdash^{<\varphi_1(\beta)}_{0} (\forall x \in a_0)[x \subseteq a], \tag{iv}$$

$\mathcal{D}^{(\beta',\beta'+3^{\alpha_0})} \supseteq \mathcal{D}^{(\beta,\sigma)}$ and $\mathrm{rnk}((\forall x \in a_0)[x \subseteq a]) < \omega\cdot\beta + \omega < \rho$, we get the claim from (iii) and (iv) by cut.

(Transitive Closure) Now, assume that (Union′) has to be replaced by (TranC). Then F is a formula $(\forall y)[\neg\mathrm{Tran}(y) \vee \neg\, u \subseteq y]$ and we have a premise

$$S_U \vdash^{\alpha_0}_{0} \neg\mathrm{Tran}(w) \vee \neg\, u \subseteq w,\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

where w is a variable not occurring in the list $\vec{u}$. Let $a_0$ be the term which corresponds to u. Since $a_0$ is a term of stage less than β we obtain

$$RS_S \vdash^{<\varphi_1(\beta)}_{0} a_0 \subseteq L_\beta \tag{v}$$

by (11.20) on p. 283. Since $L_\beta$ is a term of stage β we can apply the inductive hypothesis for β′ := β + 1 and use ∨-exportation to obtain

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\rho} \neg\mathrm{Tran}(L_\beta),\ \neg\, a_0 \subseteq L_\beta,\ \Delta(\vec{a}) \tag{vi}$$

for all $\Delta \in \mathcal{D}^{(\beta',\beta'+3^{\alpha_0})} \supseteq \mathcal{D}^{(\beta,\beta+3^{\alpha})}$. By (11.19), we have

$$RS_S \vdash^{<\varphi_1(\beta)}_{0} \mathrm{Tran}(L_\beta) \tag{vii}$$

and get the claim from (v), (vi) and (vii) by cuts.

(Separation) Assume ¬F is an instance of $\Delta_0$-separation. Then F is a formula

$$(\forall z)\big[(\exists x \in z)[x \notin u \vee \neg G(x,\vec{u})] \vee (\exists x \in u)[G(x,\vec{u}) \wedge x \notin z]\big]$$

for a $\Delta_0$-formula $G(x,\vec{u})$ and we have a premise

$$S_U \vdash^{\alpha_0}_{0} (\exists x \in w)[x \notin u \vee \neg G(x,\vec{u})] \vee (\exists x \in u)[G(x,\vec{u}) \wedge x \notin w],\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

for a variable w not occurring in the list $\vec{u}$. Then $b := \{x \in L_\beta \mid x \in a \wedge G(x,\vec{a})\}$ is an $\mathcal{L}(RS_S)$-term of stage β. Applying the induction hypothesis to β′ := β + 1 we obtain

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\rho} (\exists x \in b)[x \notin a \vee \neg G(x,\vec{a})] \vee (\exists x \in a)[G(x,\vec{a}) \wedge x \notin b],\ \Delta(\vec{a}) \tag{viii}$$

for all $\Delta \in \mathcal{D}^{(\beta',\beta'+3^{\alpha_0})}$. But, as shown in the proof of Lemma 11.12.6, we get

$$RS_S \vdash^{<\varphi_1(\beta)}_{0} (\forall x \in b)[x \in a \wedge G(x,\vec{a})] \wedge (\forall x \in a)[\neg G(x,\vec{a}) \vee x \in b]. \tag{ix}$$

Since $\mathcal{D}^{(\beta,\sigma)} \subseteq \mathcal{D}^{(\beta',\beta'+3^{\alpha_0})}$ we obtain the claim cutting (ix) and (viii).

(Collection) The crucial case is that ¬F is an instance of $\Delta_0$-collection. Then F is a formula

$$\neg\big[(\forall x \in v)(\exists y)G(x,y,\vec{u}) \rightarrow (\exists z)(\forall x \in v)(\exists y \in z)G(x,y,\vec{u})\big].$$

Here, we need the asymmetric interpretation. We have the premises

$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ (\forall x \in v)(\exists y)G(x,y,\vec{u}) \tag{x}$$

and

$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ (\forall z)(\exists x \in v)(\forall y \in z)\neg G(x,y,\vec{u}). \tag{xi}$$

Let b be the term corresponding to v and put $\sigma_0 := \beta + 3^{\alpha_0}$. Applying the inductive hypothesis to (x) we obtain

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a}),\ (\forall x \in b)(\exists y \in L_{\sigma_0})G(x,y,\vec{a}) \tag{xii}$$

for all $\Delta \in \mathcal{D}^{(\beta,\beta+3^{\alpha_0})}$. Let $\beta' := \sigma_0 + 1$ and $\sigma' := \beta' + 3^{\alpha_0}$. Applying the inductive hypothesis to (xi) we get

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\omega\cdot(\sigma'+1)} \Delta(\vec{a}),\ (\forall z \in L_{\sigma_0+1})(\exists x \in b)(\forall y \in z)\neg G(x,y,\vec{a})$$

and thus by ∀-inversion

$$RS_S \vdash^{\varphi_1(\beta'+3^{\alpha_0})}_{\omega\cdot(\sigma'+1)} \Delta(\vec{a}),\ (\exists x \in b)(\forall y \in L_{\sigma_0})\neg G(x,y,\vec{a}) \tag{xiii}$$

for all $\Delta \in \mathcal{D}^{(\beta',\sigma')}$. Since β < β′ and $\sigma' = \beta' + 3^{\alpha_0} = \beta + 3^{\alpha_0} + 1 + 3^{\alpha_0} \le \beta + 3^{\alpha} = \sigma$ we get $\mathcal{D}^{(\beta',\sigma')} \cap \mathcal{D}^{(\beta,\sigma_0)} \supseteq \mathcal{D}^{(\beta,\sigma)}$. Since moreover $\omega\cdot(\sigma'+1) \le \omega\cdot(\sigma+1)$ and $\mathrm{rnk}((\exists x \in b)(\forall y \in L_{\sigma_0})\neg G(x,y,\vec{a})) \le \omega\cdot(\sigma_0+1) < \omega\cdot\sigma + \omega$ we get the claim from (xii) and (xiii) by cut.

(Identity) Assume that ¬F is a specialization of an identity axiom, say ¬F is the formula $u = v \wedge A(u) \rightarrow A(v)$ for atomic A(u). Then F is $u = v \wedge A(u) \wedge \neg A(v)$ and it is obtained from the premises

$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ u = v$$
$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ A(u)$$

and

$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ \neg A(v).$$

By the inductive hypothesis, we then get

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a}),\ a = b$$
$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a}),\ A(a)$$

and

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a}),\ \neg A(b)$$

where a and b are the terms corresponding to u and v if these variables occur in the list $\vec{u}$ or arbitrary terms of stage less than β otherwise. Either by Theorem 11.12.2 or by the fact that S is chaste we get

$$RS_S \vdash^{\eta}_{0} \Delta(\vec{a}),\ a \neq b,\ \neg A(a),\ A(b)$$

for some ordinal $\eta < \varphi_1(\beta + 3^{\alpha_0})$. The claim then follows by cuts. The remaining cases of identity axioms are similar and are left as exercises.

(Extensionality) The case that ¬F is the extensionality axiom follows immediately from (11.25) on p. 292.

(Transitivity of U) Finally, let ¬F be the sentence Tran(U). Then we have the premise

$$S_U \vdash^{\alpha_0}_{0} \neg\Lambda(\vec{u}),\ (\exists y \in t)[y \notin U],\ \Delta(\vec{u})$$

for some $\mathcal{L}(S)$-term t. Applying the inductive hypothesis yields

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} (\exists y \in t)[y \notin L_0],\ \Delta(\vec{a}).$$

Since S is chaste, we have $S \vdash^{<\omega}_{0} s = s$ for every $\mathcal{L}(S)$-term s and therefore $RS_S \vdash^{<\omega}_{0} s \notin t \vee s \in L_0$. This implies

$$RS_S \vdash^{<\omega}_{0} (\forall y \in t)[y \in L_0]$$

and we obtain the claim by (cut).

To prove part (b) assume that S is $L_\tau$-logic and ¬F is a specialization of axiom (Found). Then F is a formula $(\exists x)[x \in u] \wedge (\forall x \in u)(\exists y \in x)[y \in u]$ and we have the premises

$$(L_\tau)_U \vdash^{\alpha_0}_{0} (\exists x)[x \in u],\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$$

and

$$(L_\tau)_U \vdash^{\alpha_0}_{0} (\forall x \in u)(\exists y \in x)[y \in u],\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u}).$$

For $\sigma_0 := \beta + 3^{\alpha_0}$, we obtain by the inductive hypothesis

$$RS_{L_\tau} \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} (\exists x \in L_{\sigma_0})[x \in a],\ \Delta(\vec{a}) \tag{xiv}$$

and

$$RS_{L_\tau} \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} (\forall x \in a)(\exists y \in x)[y \in a],\ \Delta(\vec{a}). \tag{xv}$$

Since $\tau \le \beta < \sigma_0$ we get by Exercise 12.3.14

$$RS_{L_\tau} \vdash^{<\omega^{\sigma_0}}_{0} b \notin a,\ (\exists x \in a)(\forall y \in x)[y \notin a] \tag{xvi}$$

for all terms $b \in T^{L_\tau}_{\sigma_0}$ which implies

$$RS_{L_\tau} \vdash^{<\varphi_1(\beta+3^{\alpha})}_{0} (\forall x \in L_{\sigma_0})[x \notin a],\ (\exists x \in a)(\forall y \in x)[y \notin a] \tag{xvii}$$

by an inference $(\bigwedge)$. Since $(\forall x \in L_{\sigma_0})[x \notin a]$ and $(\exists x \in a)(\forall y \in x)[y \notin a]$ have ranks less than ρ we obtain

$$RS_{L_\tau} \vdash^{\varphi_1(\beta+3^{\alpha})}_{\rho} \Delta(\vec{a})$$

cutting (xvii), (xiv) and (xv).
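The ordinal bookkeeping in the (Collection) case rests on the inequality $\beta + 3^{\alpha_0} + 1 + 3^{\alpha_0} \le \beta + 3^{\alpha}$ for $\alpha_0 < \alpha$. As a sanity check, the following sketch verifies it for small finite ordinals only, where ordinal addition and exponentiation agree with ordinary integer arithmetic. The function name and the primed variable names are illustrative choices, not notation from the text, and the check says nothing about infinite ordinals.

```python
def collection_bounds_hold(beta: int, alpha0: int, alpha: int) -> bool:
    """Check the (Collection) bookkeeping for finite ordinals.

    For finite beta, alpha0 < alpha, ordinal arithmetic is plain integer
    arithmetic, so the inequalities can be tested directly.
    """
    sigma0 = beta + 3 ** alpha0          # existential bound used for premise (x)
    beta_prime = sigma0 + 1              # shifted universal bound for premise (xi)
    sigma_prime = beta_prime + 3 ** alpha0
    sigma = beta + 3 ** alpha            # bound claimed in the lemma
    return beta < beta_prime and sigma0 < sigma and sigma_prime <= sigma


# Exhaustive check over a small finite range (beta >= 2 as in the lemma).
checked = [
    collection_bounds_hold(beta, alpha0, alpha)
    for beta in range(2, 8)
    for alpha in range(1, 8)
    for alpha0 in range(alpha)
]
```

The base 3 is exactly what makes room for two copies of $3^{\alpha_0}$ plus one extra step: $2\cdot 3^{\alpha_0} + 1 \le 3^{\alpha_0+1} \le 3^{\alpha}$.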
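12.4.3 Remark Observe that we can extend Lemma 12.4.2 to $S_U$-derivations which may contain cuts whose cut formulas are at most $\Sigma_1$ or $\Pi_1$. Then we have the additional case that there are premises

$$S_U \vdash^{\alpha_0}_{\mu} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ (\exists x)G(x,\vec{u}) \tag{i}$$

and

$$S_U \vdash^{\alpha_0}_{\mu} \neg\Lambda(\vec{u}),\ \Delta(\vec{u}),\ (\forall x)\neg G(x,\vec{u}). \tag{ii}$$

By the induction hypothesis, we get from (i) for $\sigma_0 := \beta + 3^{\alpha_0}$

$$RS_S \vdash^{\varphi_1(\beta+3^{\alpha_0})}_{\rho} \Delta(\vec{a}),\ (\exists x \in L_{\sigma_0})G(x,\vec{a}) \tag{iii}$$

and, putting $\sigma' := \sigma_0 + 3^{\alpha_0}$, from (ii)

$$RS_S \vdash^{\varphi_1(\sigma')}_{\rho} \Delta(\vec{a}),\ (\forall x \in L_{\sigma_0})\neg G(x,\vec{a}). \tag{iv}$$

Since $\sigma' = \beta + 3^{\alpha_0} + 3^{\alpha_0} < \beta + 3^{\alpha}$, we get the claim cutting (iv) and (iii).

12.4.4 Definition Let T be a theory in the language of $\mathcal{L}(S)$. By $T^+_S \vdash^{\alpha}_{\rho} \Delta$ we denote that there is a finite subset Γ of T and a finite subset $\Lambda \subseteq KP^- + \mathrm{Tran}(U)$ such that $S_U \vdash^{\alpha}_{\rho} \neg\Gamma^U,\ \neg\Lambda,\ \Delta$. If T is empty, we just write $S^+ \vdash^{\alpha}_{\rho} \Delta$. For a theory T′ in the language of $\mathcal{L}(S_U)$, we denote by $T^+_S + T' \vdash^{\alpha}_{\rho} \Delta$ that there is a finite subset Σ of T′ such that $T^+_S \vdash^{\alpha}_{\rho} \neg\Sigma,\ \Delta$. The notions $T^+_S \vdash^{<\alpha}_{<\rho} \Delta$ are defined in the obvious way.

12.4.5 Theorem (a) Let S be a chaste semi-formal system and α be an ε-number. Then $T^+_S \vdash^{<\alpha}_{0} \Delta^U$ implies $T_S \vdash^{<\varphi_\alpha(0)}_{0} \Delta$.
(b) If τ is a countable ordinal and α an ε-number then $T^+_{L_\tau} + (\mathrm{Found}) \vdash^{<\alpha}_{0} \Delta^U$ implies $T_{L_\tau} \vdash^{<\varphi_\alpha(0)}_{0} \Delta$.

Proof (a) If $T^+_S \vdash^{<\alpha}_{0} \Delta^U$ there are an ordinal ξ < α and finite subsets Γ ⊆ T and Λ of $KP^- + \mathrm{Tran}(U)$ such that $S_U \vdash^{\xi}_{0} \neg\Lambda,\ \neg\Gamma^U,\ \Delta^U$. By Lemma 12.4.2, we obtain $RS_S \vdash^{\varphi_1(3^\xi)}_{\omega\cdot 3^\xi+\omega} \neg\Gamma^{L_0},\ \Delta^{L_0}$. Using predicative cut-elimination, we obtain $RS_S \vdash^{<\varphi_\alpha(0)}_{0} \neg\Gamma^{L_0},\ \Delta^{L_0}$ and finally $T_S \vdash^{<\varphi_\alpha(0)}_{0} \Delta$ by Exercise 12.3.9. (b) is proved analogously using Lemma 12.4.2(b).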
12.5 Reduction of $T^+$ to T

Theorem 12.4.5 shows that $T^+$ is reducible to T in the framework of the infinitary S-logic. It was Jäger's observation that there is even a reduction in the framework of ordinary first-order logic. The basic idea is the same as in the reduction within the framework of infinitary logic. The reduction chain “$T^+ \vdash F$ implies $S_U \vdash^{\xi}_{0} \neg\Gamma^U,\ \neg\Lambda,\ F^U$ for some finite Γ ⊆ T and finite $\Lambda \subseteq KP^-$ which, in turn, implies $RS_S \vdash^{\eta}_{0} \neg\Gamma^U,\ F^U$ and this implies $T_S \vdash^{\eta}_{0} F$” has to be modified in such a way that $RS_S \vdash^{\eta}_{0} \neg\Gamma,\ F$ allows to deduce $T \vdash \neg\Gamma,\ F$. In general, however, there is no passage from the infinitary system $RS_S$ to ordinary first-order logic. Jäger's idea was to use a finitary fragment $RS^k_T$ of $RS_S$ to make this passage feasible.

Of course we cannot start with an infinitary semi-formal system. Therefore we fix a theory T and regard first-order logic – as introduced in Definition 6.3.2 – as a semi-formal system where the characteristic sequence of a formula $(Qx)F_u(x)$ is $\langle F_u(t) \mid t \text{ is an } \mathcal{L}(T)\text{-term} \rangle$ (footnote 7). If $\mathcal{L}(T)$ does not include the membership relation, we add it to the language and define $\mathrm{CS}_k(s \in t) := \mathrm{CS}_k(s \notin t) := \emptyset$ for $\mathcal{L}(T)$-terms s and t (footnote 8). The role of k will become clear in a moment. To avoid unnecessary case distinctions, we again require that the language of T does not contain function symbols.

Footnote 7: It is obvious that the ∀-rule and ∃-rule are then obtainable from the $\bigwedge$-rule and $\bigvee$-rule, respectively.
Footnote 8: Since $s \notin t$ belongs to $\bigwedge$-type this makes $s \notin t$ an axiom.

Since pure first-order logic is in general not chaste, we require that the axioms in T suffice to make the system given by $T \vdash_T \Delta$ chaste. This means in particular that T contains all identity axioms. We call T chaste if the “semi-formal” system $T \vdash_T \Delta$ is chaste.

We use the language of $T^+$ to define the finitary fragment $RS^k_T$ of ramified set theory. For an $\mathcal{L}(T^+)$-formula F let $F^n$ be the formula which is obtained from F by replacing the constant U by $L_0$ and bounding all quantifiers by a constant $L_n$.

12.5.1 Definition Assume that $\{F_i \mid i \in \omega\}$ is an enumeration of all $\mathcal{L}(T^+)$-formulas. We define the $RS^k_T$-terms inductively by the following clauses.
• Every $\mathcal{L}(T)$-term is an $RS^k_T$-term of stage −1.
• Every constant $L_n$ with n < ω is an $RS^k_T$-term of stage n.
• Let $G(u,u_1,\dots,u_k)$ be an $\mathcal{L}(T^+)$-formula occurring in the list $\{F_0,\dots,F_k\}$, which contains at most the free variables $u,u_1,\dots,u_k$. If $\vec{a} := a_1,\dots,a_k$ is a tuple of $RS^k_T$-terms of stages less than n < ω then $\{x \in L_n \mid G^n(x,\vec{a})\}$ is an $RS^k_T$-term of stage n.
• If $F(u_1,\dots,u_n)$ is an $\mathcal{L}(T^+)$-formula and $a_1,\dots,a_n$ is a tuple of $RS^k_T$-terms then $F^m(a_1,\dots,a_n)$ is an $RS^k_T$-formula for any m < ω.

By $T^k_n$ we denote the set of $RS^k_T$-terms of stages less than n.
12.5.2 Definition We define simultaneously a relation ∼ on the $RS^k_T$-terms and the $RS^k_T$-formulas by the following clauses.
• a ∼ b if a and b are terms of stage −1.
• $L_n \sim L_n$ for all n < ω.
• $\{x \in L_m \mid F(x)^{L_m}\} \sim \{x \in L_n \mid G(x)^{L_n}\}$ iff m = n and $G(u)^{L_n} \sim F(u)^{L_n}$.
• A ∼ B iff there are an $\mathcal{L}(T^+)$-formula $F(u_1,\dots,u_n)$ and $RS^k_T$-terms $a_1,\dots,a_n$ and $b_1,\dots,b_n$ such that $a_i \sim b_i$ for i = 1, …, n and $A = F^m(a_1,\dots,a_n)$ and $B = F^m(b_1,\dots,b_n)$ for some finite m.

Let $[t]_\sim := \{s \mid s \sim t\}$ and $[F]_\sim := \{G \mid G \sim F\}$.
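The relation ∼ of Definition 12.5.2 identifies terms that differ only in their stage −1 subterms. The following is a minimal sketch of that idea on a hypothetical tuple encoding of terms (the encoding, the tag names and the function names are assumptions made for illustration, not part of the text): every stage −1 term is blanked out, and two terms are taken to be equivalent when the resulting shapes coincide. Identifying a comprehension term by the index of its defining formula in the list $F_0,\dots,F_k$ is a simplification.

```python
def shape(term):
    """Return the shape of a term with all stage -1 subterms blanked out.

    Hypothetical encoding:
      ("base", name)               an L(T)-term, stage -1
      ("L", n)                     the constant L_n
      ("comp", n, index, params)   {x in L_n | F_index^n(x, params)}
    """
    kind = term[0]
    if kind == "base":
        # All terms of stage -1 are equivalent, so the name is forgotten.
        return ("base", "*")
    if kind == "L":
        return term
    if kind == "comp":
        _, n, index, params = term
        return ("comp", n, index, tuple(shape(p) for p in params))
    raise ValueError(f"unknown term kind: {kind!r}")


def equivalent(s, t):
    """Sketch of s ~ t: same level, same formula, pairwise equivalent parameters."""
    return shape(s) == shape(t)


# Two comprehension terms over L_1 that differ only in a stage -1 parameter.
t1 = ("comp", 1, 0, (("base", "c"), ("L", 0)))
t2 = ("comp", 1, 0, (("base", "d"), ("L", 0)))
t3 = ("comp", 2, 0, (("base", "c"), ("L", 0)))
```

Because only finitely many formulas $F_0,\dots,F_k$ and finitely many levels below n are available, only finitely many shapes of stage less than n can occur in this encoding, which mirrors the finiteness statement of Exercise 12.5.3(d).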
12.5.3 Exercise (a) Show that for every class $[t]_\sim$ of $RS^k_T$-terms there is a term $\tilde{t}(\vec{u})$ such that $[t]_\sim = \{\tilde{t}_{u_1,\dots,u_n}(c_1,\dots,c_n) \mid c_1,\dots,c_n \in T^k_0\}$.
(b) Show that for every class $[F]_\sim$ of $RS^k_T$-formulas there is a formula $\tilde{F}(\vec{u})$ such that $[F]_\sim = \{\tilde{F}_{u_1,\dots,u_n}(c_1,\dots,c_n) \mid c_1,\dots,c_n \in T^k_0\}$.
(c) Show that ∼ is an equivalence relation on the $RS^k_T$-terms and -formulas.
(d) Conclude that the equivalence classes of ∼ provide a finite partition of the terms in $T^k_n$ for every n < ω.
Hint: Show (a) and (b) simultaneously by induction on rnk(t) and rnk(F), respectively. This is more tedious than you would expect. Observe that $\tilde{t}$ and $\tilde{F}$ are obtained from t or F, respectively, by successively marking all occurrences of terms of stage −1 by free variables. Pay attention that different occurrences even of the same term have to be marked by different variables. Then use (a) and (b) to prove (c) and then (d), where you need induction on n.

To define the relation $RS^k_T \vdash^{\alpha}_{\rho} \Delta$ for a finite set Δ of $RS^k_T$-formulas we modify Definition 12.3.7 of the characteristic sequences for $RS^k_T$-formulas replacing $T^S_\alpha$ by $T^k_n$ accordingly. Let $\mathrm{CS}_k(F)$ denote the characteristic sequence in the sense of $RS^k_T$. However, not every $RS^k_T$-formula is in ($\bigwedge$-type ∪ $\bigvee$-type). Let
• $\mathrm{rnk}_k(A) := 0$ for all atomic $\mathcal{L}(T)$-formulas and define
• $\mathrm{rnk}_k(F) := \sup\{\mathrm{rnk}_k(G) \mid G \in \mathrm{CS}_k(F)\} + 1$ for F ∈ ($\bigwedge$-type ∪ $\bigvee$-type).

Observe that $\mathrm{rnk}_k(F)$ is well-defined (cf. Exercise 12.3.8).

12.5.4 Exercise Show that A ∼ B implies $\mathrm{rnk}_k(A) = \mathrm{rnk}_k(B)$. Conclude that $\mathrm{rnk}_k(F) < \omega$ for all $RS^k_T$-formulas F.
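12.5.5 Definition We are now going to define the semi-formal system for $RS^k_T$. Since there is no underlying structure $M$ we do not need rule $(Ax)_M$ but only
$(Ax)^k_L$ If $A$ is an atomic $\mathcal{L}(RS^k_T)$-formula not in ($\bigwedge$-type $\cup$ $\bigvee$-type) and $\{A, \neg A\} \subseteq \Delta$ then $RS^k_T \vdash^{\alpha}_{\rho} \Delta$ for all ordinals $\alpha$ and $\rho$
and we define $RS^k_T \vdash^{\alpha}_{\rho} \Delta$ using rules $(Ax)^k_L$, $(\bigwedge)$, $(\bigvee)$ and (cut).
We denote by $T \vdash^{\alpha}_{RS^k_T,\,\rho} \Delta$ that there is a finite set $\Lambda \subseteq T$ of formulas such that $RS^k_T \vdash^{\alpha}_{\rho} \neg\Lambda^{L_0}, \Delta$.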
12.5.6 Exercise Show that $RS^k_T \vdash^{\alpha}_{\rho} \Delta$ implies that there are finite ordinals $m$ and $r$ such that $RS^k_T \vdash^{m}_{r} \Delta$.
Hint: Show first that $RS^k_T \vdash^{\alpha}_{\rho} \Delta(u)$ implies $RS^k_T \vdash^{\alpha}_{\rho} \Delta_u(t)$ for all terms $t$ of stage $-1$. Then prove the claim by induction on $\alpha$ using the fact that the characteristic sequence of any $RS^k_T$-formula contains only finitely many equivalence classes.
As a shorthand, we write $RS^k_T \vdash \Delta$ to denote that there are finite ordinals $m$ and $r$ such that $RS^k_T \vdash^{m}_{r} \Delta$ and $RS^k_T \vdash_{0} \Delta$ to denote that there is a finite ordinal $m$ such that $RS^k_T \vdash^{m}_{0} \Delta$. For a theory $T$ in the language of $RS^k_T$ we denote by $T \vdash_{RS^k_T} \Delta$ that there is a finite set $\Lambda \subseteq T$ such that $RS^k_T \vdash \neg\Lambda, \Delta$. We use the same notations for $RS^k_T \vdash_{0}$.
12.5.7 Exercise Show that $RS^k_T \vdash \Delta$ implies $RS^k_T \vdash_{0} \Delta$.
Hint: Adapt the Reduction Lemma (Lemma 7.3.12) to $RS^k_T$ and then copy the proof of the Basis Elimination Lemma (Lemma 7.3.13).
12.5.8 Exercise Let $\Delta$ be a finite set of $\mathcal{L}(T)$-formulas. Show that $RS^k_T \vdash_{0} \Delta^{L_0}$ implies $\vdash_T \Delta$. Conclude that $T \vdash_{RS^k_T,\,0} \Delta^{L_0}$ implies $T \vdash_T \Delta$.
Hint: Show by induction on $m$ that $RS^k_T \vdash^{m}_{0} \Delta^{L_0}$ implies $\vdash_T \Delta$. The only remarkable case is that the critical formula of the last inference in $RS^k_T \vdash^{m}_{0} \Delta^{L_0}$ is a formula $(\forall x \in L_0)F(x)^{L_0}$. Then you have the premises $RS^k_T \vdash^{m_0}_{0} \Delta^{L_0}, F(t)^{L_0}$ for all $\mathcal{L}(T)$-terms $t$. Choose a free $\mathcal{L}(T)$-variable not occurring in $\Delta, F$ and apply the inductive hypothesis followed by a universal inference.
Now we have to check to what extent the results of Sects. 11.10 and 11.12 can be transferred to $RS^k_T$. Recall our general proviso that $T$ is a chaste theory. Then we observe that in transferring the results of Sect. 11.10 and Sect. 11.12 to $RS_S$ we only used the chasteness of $RS_S$ but never the fact that there are infinitely many $RS_S$-terms. Therefore, we can assume that all these properties stay correct for $RS^k_T$.
In the next step we apply the method of asymmetric interpretations to the system $RS^k_T$. In analogy to Definition 12.4.1 let $F^{(m,s)}$ be the formula obtained from the $\mathcal{L}(T^+)$-formula $F$ by replacing $U$ by $L_0$ and bounding all unbounded universal quantifiers by $L_m$ and all unbounded existential quantifiers by $L_s$. For a finite set $\Delta = \{F_1,\dots,F_n\}$ of $\mathcal{L}(T^+)$-formulas let
$D^{(m,s)} := \{\{F_1^{(m_1,s_1)},\dots,F_n^{(m_n,s_n)}\} \mid m_i \le m \wedge s_i \ge s\}$.
12.5.9 Exercise Let $T$ be a chaste theory. Let moreover $\Lambda(\vec{u})$ and $\Delta(\vec{u})$ be finite sets of $\mathcal{L}(T^+)$-formulas not containing free variables other than those in the list $\vec{u}$. Assume that every formula in $\Lambda(\vec{u})$ is a specialization of an axiom in $KP^-$, an identity axiom, or an axiom $\mathrm{Tran}(U)$. Then $\vdash^{n}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u})$ implies that there is a $k$ such that $RS^k_T \vdash \Delta(\vec{a})$ holds true for all finite ordinals $m$, all $\Delta \in D^{(m,m+3^n)}$ and all $RS^k_T$-terms of stages less than $m$.
Hint: The proof by induction on $n$ is essentially the same as that of Lemma 12.4.2. Exercise 12.5.6 ensures that all derivations remain finite. To find the correct $k$ you have to secure that all comprehension formulas of specializations of $\Delta_0$-separation occurring in $\Lambda(\vec{u})$ as well as the formulas which are needed to handle specializations of (Nullset), (Pair') and (Union') or (TranC), respectively, are among the formulas in the enumeration $F_0,\dots,F_k$.
12.5.10 Theorem The theory $T^+$ is a conservative extension of $T$.
Proof If $T^+ \vdash F^U$ for an $\mathcal{L}(T)$-formula $F$ we obtain finite sets $\Gamma \subseteq T$ and $\Lambda \subseteq KP^- \cup \mathrm{Tran}(U)$ such that $\vdash_T \neg\Lambda, \neg\Gamma^U, F^U$. By Exercise 12.5.9 we obtain a $k$ such that $RS^k_T \vdash \neg\Gamma^{L_0}, F^{L_0}$. By Exercise 12.5.7, we get $RS^k_T \vdash_{0} \neg\Gamma^{L_0}, F^{L_0}$ and by Exercise 12.5.8 finally $T \vdash_T F$.
12.5.11 Remark Surprisingly, Theorem 12.5.10 states that the axioms for the next admissible set do not affect the strength of the basis theory $T$. Notice, however, that this crucially depends on the fact that there is no foundation in $T^+$. The main point in obtaining this conservation result is to get rid of the additional set theoretical axioms in $T^+$. The crucial axiom in this connection is the axiom of $\Delta_0$-collection. This axiom is removed by an asymmetric interpretation. Such an interpretation, however, cannot work in the presence of a foundation or induction scheme because there the same formula occurs positively and negatively.
It remains to study what happens in the presence of the foundation axiom. This is prepared by the following exercise.
12.5.12 Exercise Show that for every $RS^k_T$-formula $F$ there is an $\mathcal{L}(T)$-formula $F_T$ such that $RS^k_T \vdash (\forall x \in L_0)[F(x) \leftrightarrow F_T^{L_0}(x)]$.
Hint: The key in solving Exercise 12.5.12 is Exercise 12.5.3. Recall that for all formulas in $\bigwedge$-type $\cup$ $\bigvee$-type we have $CS(F) = \langle G(s) \mid s \in I \rangle$, where $I$ is a finite index set for $RS^k_T$-terms (in case that $F$ is a conjunction or disjunction $F_0 \circ F_1$ let $G(s)$ be $F_s$ for $s \in \{0, 1\}$). By Exercise 12.5.3 there is a term $s^\sim$ containing free variables $u_1,\dots,u_n$ such that $t \sim s$ iff there is a tuple $c_1,\dots,c_n$ of $RS^k_T$-terms such that $t$ is a term $s^\sim_{u_1,\dots,u_n}(c_1,\dots,c_n)$. Now define
$F_T := F$ if $F \notin$ ($\bigwedge$-type $\cup$ $\bigvee$-type) or $CS(F) = \emptyset$,
$F_T := \bigvee\{(\exists x)G(s^\sim_u(x))_T \mid G(s) \in CS(F)\}$ for $F \in \bigvee$-type,
$F_T := \bigwedge\{(\forall x)G(s^\sim_u(x))_T \mid G(s) \in CS(F)\}$ for $F \in \bigwedge$-type.
Let (FOUND(X)) be the scheme
$(\exists x)[x \in X \wedge F(x)] \rightarrow (\exists z)[F(z) \wedge z \in X \wedge (\forall y \in z)[y \in X \rightarrow \neg F(y)]]$
saying that $\in$ is well-founded on $X$ for definable classes, which is equivalent to the scheme
$(\forall x \in X)[(\forall y \in x)[y \in X \rightarrow F(y)] \rightarrow F(x)] \rightarrow (\forall x \in X)F(x)$
of $\in$-induction. Recall also the axiom (Found(X)) (introduced on p. 263) saying that $\in$ is well-founded on $X$ for sets.
12.5.13 Exercise Let $F$ be an $\mathcal{L}(T)$-formula. Show $T^+ + (\mathrm{Found}(X)) \vdash_T F^U$ iff $T + (\mathrm{FOUND}(X)) \vdash_T F$.
Hint: The direction from right to left is simple since $\{x \in U \mid F(x)^U\}$ is a set in $T^+$. The opposite direction is proved by extending Exercise 12.5.9 to the case that $\Lambda(\vec{u})$ may also contain a specialization of axiom (Found(X)) under the additional hypothesis $T \vdash_T (\mathrm{FOUND}(X))$. To this end you have to check
$RS^k_T \vdash b \notin a,\ b \notin X,\ (\exists x \in a)[x \in X \wedge (\forall y \in x)[y \notin X \vee y \notin a]]$   (i)
for all $b \in T^k_n$. This is done by induction on $n$. If $a$ and $b$ are in $T^k_0$ you get (i) from $T \vdash_T (\mathrm{FOUND})$. So assume that $b \in T^k_0$ and $\mathrm{stg}(a) \ge 0$. According to Exercise 12.5.12 there is an $\mathcal{L}(T)$-formula $F$ such that $RS^k_T \vdash (b \in a \wedge b \in X \leftrightarrow F^{L_0}(b))$. Using $T \vdash_T (\mathrm{FOUND})$ you get
$T \vdash_T \neg F(b),\ (\exists x)[F(x) \wedge (\forall y \in x)\neg F(y)]$,
which is (i) for $b \in T^k_0$. For $n > 0$ you get
$RS^k_T \vdash c \notin a,\ c \notin X,\ (\exists x \in a)[x \in X \wedge (\forall y \in x)[y \notin X \vee y \notin a]]$ $^{9}$   (ii)
for all $c \in T^k_{\mathrm{stg}(b)}$ by the induction hypothesis. Use (ii) to infer
$RS^k_T \vdash (\forall x \in b)[x \notin a \vee x \notin X],\ (\exists x \in a)[x \in X \wedge (\forall y \in x)[y \notin X \vee y \notin a]]$   (iii)
which implies
$RS^k_T \vdash (\forall x \in b)[x \notin a \vee x \notin X] \wedge b \in X \wedge b \in a,\ (\exists x \in a)[x \in X \wedge (\forall y \in x)[y \notin X \vee y \notin a]],\ b \notin a,\ b \notin X$   (iv)
by tautology. From (iv) you easily get (i) for all terms $b$.
Now you have to extend the proof of Exercise 12.5.9 by the case that $\neg F$ is a specialization of axiom (Found). Then $F$ is a formula $(\exists x)[x \in u \wedge x \in X] \wedge (\forall x \in u)[x \in X \rightarrow (\exists y \in x)[y \in X \wedge y \in u]]$ and you have the premises
$\vdash^{n_0}_{T} (\exists x)[x \in u \wedge x \in X],\ \neg\Lambda(\vec{u}),\ \Delta(\vec{u})$
and
$\vdash^{n_0}_{T} (\forall x \in u)[x \notin X \vee (\exists y \in x)[y \in X \wedge y \in u]],\ \neg\Lambda(\vec{u}),\ \neg(\forall x)[x \in U],\ \Delta(\vec{u})$.
For $m_0 := m + 3^{n_0}$ you get by the inductive hypothesis
$RS^k_T \vdash (\exists x \in L_{m_0})[x \in a \wedge x \in X],\ \Delta(\vec{a})$   (v)
and
$RS^k_T \vdash (\forall x \in a)[x \notin X \vee (\exists y \in x)[y \in X \wedge y \in a]],\ \Delta(\vec{a})$   (vi)
for $\Delta(\vec{a}) \in D^{(m,m+3^n)}$. Applying two $\bigvee$-inferences and a $\bigwedge$-inference to (i) and cutting the result with (v) and (vi) yields
$RS^k_T \vdash \Delta(\vec{a})$
and you have extended Exercise 12.5.9.
Now assume $T^+ + (\mathrm{Found}(X)) \vdash_T F^U$. Then conclude $RS^k_T + (\mathrm{FOUND}(X)) \vdash F^{L_0}$ by Exercise 12.5.9 and finally, use Exercises 12.5.7 and 12.5.8 to get $T + (\mathrm{FOUND}(X)) \vdash_T F$.
$^{9}$ For the notion $\in$ cf. p. 280.
12.6 The Theories $KP_n$ and $KP_n^0$
In this section we introduce the theories $KP_n$ which axiomatize set universes containing n-admissible sets. Their ordinal analyses are already outside of a first step into impredicativity (they need $n$ steps). However, restricting the amount of foundation which is available in these theories we obtain theories $KP_n^0$ which are in the realm of predicative proof theory.
12.6.1 Definition Recall the group Ad of axioms introduced on p. 262. By $KP_n$ we denote the theory in the language $\mathcal{L}(\emptyset, A_0,\dots,A_n, \in, \mathrm{Ad})$ which comprises the axioms
$KP + \mathrm{Ad} + \bigwedge_{i=0}^{n} \mathrm{Ad}(A_i) + (\forall x)[x \notin \emptyset] + \emptyset \in A_0 + \bigwedge_{i=0}^{n-1} A_i \in A_{i+1}$
axiomatizing that there are $n$ consecutive admissible universes which entails the existence of the admissible ordinals $\omega$ and $\omega_i^{CK}$ for $i \in \{1,\dots,n\}$. The canonical least model for the theory $KP_n$ in the constructible hierarchy is $L_{\omega^{CK}_{n+1}}$.
The theory $KP_n^0$ is obtained from $KP_n$ by replacing the foundation scheme (FOUND) by axiom (Found($A_0$)), the theory $KP_n^r$ is obtained from $KP_n$ by replacing the foundation scheme (FOUND) by axiom (Found).
If $T$ and $T'$ are theories, we write $T \preceq T'$ iff a model $M$ of $T$ is definable within any model $M'$ of $T'$. By $T \equiv T'$ we denote that $T \preceq T'$ and $T' \preceq T$ hold true.
12.6.2 Exercise (a) Show that $KP\omega \equiv KP_0$.
(b) Show that $KP\omega^0 \equiv KP_0^0$.
(c) Show that $KP\omega^r \equiv KP_0^r$.
(d) Show that $KP\omega^r + (\mathrm{IND})_\omega \equiv KP_0^r + (\mathrm{FOUND}(A_0))$.
Hints: (a) $KP\omega \preceq KP_0$ is obvious since you get $\omega = A_0 \cap On$. For the opposite direction you have to show that the hereditarily finite sets, which are definable in $KP\omega$ (cf. [4]), are an interpretation for $A_0$.
(b) Again you have to show that a model of $KP_0^0$ can be constructed within a model of $KP\omega^0$. This is, however, much more complicated since you do not have enough induction. But you have the existence of $\omega$ and the hereditarily finite sets can be coded into $\omega$.
(c) and (d) follow easily from (b).
To establish the connection between the theories $KP_n^0$ and iterations of the admissible extension we introduce the theories $\widetilde{KP}_n$. In defining the admissible extension $\widetilde{KP}_n^{+}$ we rename the new constant $U$ by $A_{n+1}$ in the obvious way.
12.6.3 Definition Let $\widetilde{KP}_{-1} := KP^-$ and $\widetilde{KP}_{n+1} := \widetilde{KP}_n^{+} + (\mathrm{Found}(A_0))$.
$\widetilde{KP}^r_{-1} = KP^-$ and $\widetilde{KP}^r_{n+1} := (\widetilde{KP}^r_n)^{+} + (\mathrm{Found})$.
12.6.4 Exercise Show that $\widetilde{KP}_n \equiv KP_n^0$ and $\widetilde{KP}^r_n \equiv KP_n^r$ hold true for all $n \ge 0$.
Hint: Roughly speaking you get
$\widetilde{KP}_0 = KP^- + (KP^-)^{A_0} + \emptyset \in A_0 + \mathrm{Tran}(A_0) + (\mathrm{Found}(A_0)) \equiv KP^- + \mathrm{Ad}(A_0) + (\mathrm{Found}(A_0)) = KP_0^0$
and by induction on $n$ also
$\widetilde{KP}_{n+1} = KP^- + (\widetilde{KP}_n)^{A_{n+1}} + \{A_i \in A_{n+1} \mid i \le n\} + \mathrm{Tran}(A_{n+1}) + (\mathrm{Found}(A_0))$
$\equiv KP^- + (KP_n^0)^{A_{n+1}} + \{A_i \in A_{n+1} \mid i \le n\} + \mathrm{Tran}(A_{n+1}) + (\mathrm{Found}(A_0))$
$\equiv KP^- + \mathrm{Ad}(A_{n+1}) + \emptyset \in A_0 + \bigwedge_{i=0}^{n} \mathrm{Ad}(A_i) + \bigwedge_{i=0}^{n} A_i \in A_{i+1} + (\mathrm{Found}(A_0))$
$\equiv KP_{n+1}^0$.
Render this sketch more precisely and apply it also to $KP_n^r$.
12.6.5 Remark Recall the semi-formal system $L_\omega$ defined according to Definition 12.2.1. The semi-formal system based on $L_\omega$ must not be confused with ramified set theory. Formulas $s \in t$ and $s = t$ are atomic in $L_\omega$. But observe that $s^{L_\omega} \in t^{L_\omega}$ and $s^{L_\omega} = t^{L_\omega}$ are still decidable (cf. [4] II.2). All formulas in $\mathcal{L}(L_\omega)$ have finite ranks. By a simple induction on the rank of an $\mathcal{L}(L_\omega)$-sentence we get the following lemma.
12.6.6 Lemma Let $F$ be an $\mathcal{L}(L_\omega)$-sentence which is true in $L_\omega$. Then $L_\omega \vdash^{\mathrm{rnk}(F)}_{0} F$.
In Sect. 11.5, we have shown that all primitive recursive functions are sets. This needed the existence of $\omega$. More generally we know that every recursive function is $\Delta_1$ definable on $L_\omega$ (cf. [4] II.2.3). Therefore there is a direct translation of arithmetical pseudo $\Pi^1_1$-sentences $F$ into pseudo $\Pi^1_1$-sentences $F^*$ of $\mathcal{L}(L_\omega)$ such that $\mathbb{N} \models (\forall X)F(X)$ implies $L_\omega \models (\forall X)[X \subseteq On \rightarrow F^*(X)]$. The role of natural numbers in $L_\omega$ is played by ordinals. We identify $F$ and $F^*$ if there is no danger of confusion.
12.6.7 Exercise Let $F$ be an arithmetical pseudo $\Pi^1_1$-sentence. Show that $L_\omega \vdash^{\alpha} F^*$ implies $\mathrm{tc}(F) \le \alpha$. Let $\prec$ be a transitive binary relation. Infer $\mathrm{otyp}(\prec) \le \alpha$ from $L_\omega \vdash^{\alpha} TI(\prec, X)^*$.
Hint: For an $L_\omega$-formula $F$ let $\overline{F}$ be the formula which is obtained from $F$ by replacing all closed terms $t$ by their evaluations $t^{L_\omega}$. Let $\Delta, F$ be a finite set of arithmetical formulas and show first
$L_\omega \vdash^{\alpha} \Delta^*, F^* \ \Rightarrow\ L_\omega \vdash^{\alpha} \overline{\Delta^*}, \overline{F^*}$
by induction on $\alpha$. The crucial case is that the critical formula of the last inference is $F^*$ for an atomic formula $F$. Observe that then $F^*$ is not necessarily atomic, too. However, if $F \in \mathrm{Diag}(\mathbb{N})$ then $\overline{F^*} \in \mathrm{Diag}(L_\omega)$ and you get $L_\omega \vdash^{\alpha} \overline{\Delta^*}, \overline{F^*}$ with an inference $(\bigwedge)$. Now assume that $F$ is a formula $t \in X$ for a term $t$ of the shape $f s_1 \dots s_n$ where $f$ is a symbol for an $n$-ary primitive recursive function. Then observe that $F^*$ is the formula $(\forall y \in \omega)(\forall x_1 \in \omega)\dots(\forall x_n \in \omega)[f x_1 \dots x_n = y \rightarrow y \in X]$. Since $L_\omega \vdash^{\alpha} \Delta^*, F^*$ you have the premises
$L_\omega \vdash^{\alpha_s} \Delta^*,\ s \notin On,\ (\forall x_1 \in \omega)\dots(\forall x_n \in \omega)[f x_1 \dots x_n = s \rightarrow s \in X]$
for all $s \in L_\omega$, which by inversion entails
$L_\omega \vdash^{\alpha_s} \Delta^*,\ s \notin On,\ t_1 \notin On, \dots, t_n \notin On,\ f t_1 \dots t_n \ne s,\ s \in X$.
(Observe that primitive recursive functions are not elements of $L_\omega$. However, $f t_1 \dots t_n = s$ is a sentence in the language of $L_\omega$.) For $s := t^{L_\omega}$ and $t_i \in On$ all the sentences $s \notin On$, $\overline{t_i} \notin On$, $f \overline{t_1} \dots \overline{t_n} \ne s$ become false and by Exercise 12.2.5 and the induction hypothesis you get $L_\omega \vdash^{\alpha} \overline{\Delta^*}, t^{L_\omega} \in X$. The case that $F$ is a formula $t \notin X$ is symmetrical. In a similar way you treat the case that $F$ is a false equation $s = t$ or inequality $s \ne t$ for terms $s$ and $t$ of the form $f s_1 \dots s_m$ and $g t_1 \dots t_m$, respectively. In the next step you prove that $L_\omega \vdash^{\alpha} \overline{\Delta^*}$ implies $\models^{\alpha} \Delta$ by a simple induction on $\alpha$.
12.6.8 Exercise Show that $L_\omega \vdash^{<\omega\cdot 2} (\mathrm{FOUND})$.
Hint: Show first $L_\omega \vdash^{2\cdot\mathrm{rnk}(F)+5\cdot(\mathrm{rnk}_{L_\omega}(F(a))+1)}_{0} \neg F(a), (\exists y)[F(y) \wedge (\forall x \in y)\neg F(x)]$ for all $a \in L_\omega$ by $\in$-induction on $a$. Here $\mathrm{rnk}_{L_\omega}(a)$ denotes the least stage at which $a$ enters $L_\omega$ (cf. the proofs of Lemma 11.12.7 and Theorem 11.12.8). Actually the proof here is simpler and even closer to the proof of Lemma 7.3.4.
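12.6.9 Definition Let $L_\omega^{(-1)} = L_\omega$ and $L_\omega^{(n+1)} = (L_\omega^{(n)})^{+}$. Instead of $U$, we denote the new constant in the admissible extension $(L_\omega^{(n)})^{+}$ by $A_{n+1}$.
Observe that the languages of $L_\omega^{(n)}$ and $\widetilde{KP}_n$ coincide and that all formulas in $\mathcal{L}(L_\omega^{(n)})$ have finite ranks.
12.6.10 Definition Let $\Phi^0(\alpha) := \alpha$ and $\Phi^{n+1}(\alpha) := \Phi^n(\varphi_\alpha(0))$.
12.6.11 Lemma Let $\alpha$ be an $\varepsilon$-number and $0 \le k \le n$. Then $L_\omega^{(n)} \vdash^{<\alpha}_{0} \Delta^{A_k}$ implies $L_\omega^{(k-1)} \vdash^{<\Phi^{n+1-k}(\alpha)}_{0} \Delta$ for all finite sets $\Delta$ of formulas in $\mathcal{L}(L_\omega^{(k-1)})$.
Proof We induct on $n$. Assume $L_\omega^{(n)} \vdash^{<\alpha}_{0} \Delta^{A_k}$. By Theorem 12.4.5 we obtain $L_\omega^{(n-1)} \vdash^{<\varphi_\alpha(0)}_{0} \Delta$ for $k = n$ and $L_\omega^{(n-1)} \vdash^{<\varphi_\alpha(0)}_{0} \Delta^{A_k}$ for $k < n$. The latter implies by the induction hypothesis $L_\omega^{(k-1)} \vdash^{<\Phi^{n-k}(\varphi_\alpha(0))}_{0} \Delta$, i.e., $L_\omega^{(k-1)} \vdash^{<\Phi^{n+1-k}(\alpha)}_{0} \Delta$.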
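The recursion in Definition 12.6.10 can be unfolded mechanically. The following is only a symbolic sketch: ordinals are represented as plain strings, the names `veblen_at_zero` and `big_phi` are hypothetical, and no ordinal arithmetic is performed. It merely shows how $\Phi^n(\alpha)$ nests the Veblen function $\varphi_{\cdot}(0)$ $n$ times.

```python
def veblen_at_zero(alpha: str) -> str:
    """Return a textual name for phi_alpha(0); purely symbolic, no evaluation."""
    return f"phi_{{{alpha}}}(0)"


def big_phi(n: int, alpha: str) -> str:
    """Unfold Phi^n(alpha) via Phi^0(a) = a and Phi^{n+1}(a) = Phi^n(phi_a(0))."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return alpha
    # Definition 12.6.10: apply phi_.(0) to the argument, then recurse with n - 1.
    return big_phi(n - 1, veblen_at_zero(alpha))


if __name__ == "__main__":
    for n in range(3):
        print(n, big_phi(n, "eps_0"))
```

For instance, `big_phi(2, "eps_0")` spells out $\Phi^2(\varepsilon_0) = \varphi_{\varphi_{\varepsilon_0}(0)}(0)$, the kind of bound that appears for $n = 2$ in the theorems of this section.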
12.6.12 Exercise (a) Show $L_\omega^{(n)} \vdash^{<\omega\cdot 2}_{0} (\mathrm{FOUND}(A_0))$.
(b) Show $L_\omega^{(n)} \vdash^{<\omega\cdot 2}_{<\omega} A$ for all axioms $A$ in $\widetilde{KP}_n$.
(c) Conclude that $\widetilde{KP}_n + (\mathrm{FOUND}(A_0)) \vdash \Delta$ implies $L_\omega^{(n)} \vdash^{<\omega\cdot 2}_{<\omega} \Delta$.
Hint: (a) Redo the proof of Exercise 12.6.8 bearing in mind that all formulas in $\mathcal{L}(L_\omega^{(n)})$ have finite ranks. Prove part (b) by induction on $n$. The case $n = -1$ is covered by Lemma 12.6.6. For the successor case you have $L_\omega^{(m)} \vdash^{<\omega\cdot 2}_{<\omega} A$ for all axioms $A$ in $\widetilde{KP}_m$ which entails $(L_\omega^{(m)})^{+} \vdash^{<\omega\cdot 2}_{<\omega} A^{A_{m+1}}$ for all axioms $A$ in $\widetilde{KP}_m$. Essentially by definition, you get $L_\omega^{(m+1)} \vdash^{<\omega}_{<\omega} A$ for all axioms in $KP^-$ and $\mathrm{Tran}(A_{m+1})$. You easily show $L_\omega^{(m+1)} \vdash^{<\omega\cdot 2}_{<\omega} A_m \in A_{m+1}$ and, as a special case of part (a), finally $L_\omega^{(m+1)} \vdash^{<\omega\cdot 2}_{<\omega} (\mathrm{Found}(A_0))$. For (c) combine claims (a) and (b) with Exercise 12.2.4.
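12.6.13 Theorem $\widetilde{KP}_n \vdash \Delta^{A_k}$ implies $L_\omega^{(k-1)} \vdash^{<\Phi^{n-k}(\varepsilon_0)}_{0} \Delta$ for $0 \le k \le n$.
Proof Using Exercise 12.5.13 we obtain $\widetilde{KP}_{n-1} + (\mathrm{FOUND}(A_0)) \vdash \Delta^{A_k}$ from $\widetilde{KP}_n \vdash \Delta^{A_k}$. By Exercise 12.6.12 this implies $L_\omega^{(n-1)} \vdash^{<\omega\cdot 2}_{<\omega} \Delta^{A_k}$. Predicative cut elimination yields $L_\omega^{(n-1)} \vdash^{<\varepsilon_0}_{0} \Delta^{A_k}$ and Lemma 12.6.11 finally $L_\omega^{(k-1)} \vdash^{<\Phi^{n-k}(\varepsilon_0)}_{0} \Delta$.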
Defining
$\|T\|_{\Sigma^{A_1}} := \min\{\alpha \mid F \text{ a } \Sigma_1\text{-formula and } L_\alpha \models F \text{ and } T \vdash F^{A_1}\}$
and
$\|T\|_{\Pi_2^{A_1}} := \min\{\alpha \mid F \text{ a } \Pi_2\text{-formula and } L_\alpha \models F \text{ and } T \vdash F^{A_1}\}$
we obtain the following theorem as a corollary of Theorem 12.6.13.
12.6.14 Theorem (a) For all $n$ we get $\|KP_n^0\| \le \Phi^n(\varepsilon_0)$.
(b) For $n \ge 1$ we have $\|KP_n^0\|_{\Sigma^{A_1}} \le \Phi^{n-1}(\varepsilon_0)$.
(c) For $n \ge 1$ we obtain $\|KP_n^0\|_{\Pi_2^{A_1}} \le \Phi^{n-1}(\varepsilon_0)$. The stage $L_{\Phi^{n-1}(\varepsilon_0)}$ of the constructible hierarchy is therefore closed under the provably $\omega_1^{CK}$-recursive functions
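of $KP_n^0$.
Proof (a) We have to determine an upper bound for the order type of orderings on $\omega$ whose well-foundedness is provable in $KP_n^0$. In $KP_n^0$, we obtain $\omega$ as $A_0 \cap On$. So assume that $KP_n^0 \vdash TI(\prec, X)^{A_0 \cap On}$. Then $\widetilde{KP}_n \vdash TI(\prec, X)^{A_0 \cap On}$ according to Exercise 12.6.4. By Theorem 12.6.13, we get $L_\omega \vdash^{\Phi^n(\varepsilon_0)}_{0} TI(\prec, X)^{On}$ and by Exercise 12.6.7 $\mathrm{otyp}(\prec) \le \Phi^n(\varepsilon_0)$.
(b) To prove part (b) assume $KP_n^0 \vdash (\exists x \in A_1)F(x)$ for a $\Delta_0$-formula $F(x)$. Thus, we get $L_\omega^{(0)} \vdash^{<\Phi^{n-1}(\varepsilon_0)}_{0} (\exists x)F(x)$ by Theorem 12.6.13, i.e., $L_\omega \vdash^{\eta}_{0} \neg\Lambda, ((\exists x)F(x))^{U}$ for some ordinal $\eta < \Phi^{n-1}(\varepsilon_0)$ and a finite set $\Lambda \subseteq KP^- + \mathrm{IDEN}$. By Lemma 12.4.2, we get $RS_{L_\omega} \vdash^{\varphi_1(3^\eta)}_{\omega\cdot 3^\eta+\omega} (\exists x \in L_{3^\eta})F(x)$. Applying Lemma 12.3.17 we therefore get $RS(X) \vdash^{\varphi_1(3^\eta)}_{\omega^{\eta}+2} (\exists x \in L_{\omega+3^\eta})F(x)$. Since $\omega + 3^\eta < \Phi^{n-1}(\varepsilon_0)$, we obtain $\|KP_n^0\|_{\Sigma^{A_1}} \le \Phi^{n-1}(\varepsilon_0)$.
(c) Assume $KP_n^0 \vdash (\forall x \in A_1)(\exists y \in A_1)F(x, y)$ for a $\Delta_0$-formula $F(x, y)$. By Theorem 12.6.13 we obtain $L_\omega^{(0)} \vdash^{\eta}_{0} (\forall x)(\exists y)F(x, y)$ for some $\eta < \Phi^{n-1}(\varepsilon_0)$. For $a \in L_{\Phi^{n-1}(\varepsilon_0)}$ there is a $\beta < \Phi^{n-1}(\varepsilon_0)$ such that $a \in L_\beta$ and by Lemma 12.4.2 we get
$RS_{L_\omega} \vdash^{\varphi_1(\beta+3^\eta)}_{\rho} (\exists y \in L_{\beta+3^\eta})F(a, y)$.
Hence
$RS_{L_\omega} \vdash^{<\Phi^{n-1}(\varepsilon_0)}_{<\Phi^{n-1}(\varepsilon_0)} (\forall x \in L_{\Phi^{n-1}(\varepsilon_0)})(\exists y \in L_{\Phi^{n-1}(\varepsilon_0)})F(x, y)$
which implies by Lemma 12.3.17
$RS(X) \vdash (\forall x \in L_{\Phi^{n-1}(\varepsilon_0)})(\exists y \in L_{\Phi^{n-1}(\varepsilon_0)})F(x, y)$.
Hence $L_{\Phi^{n-1}(\varepsilon_0)} \models (\forall x)(\exists y)F(x, y)$.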
A function $f$ is provably $\omega_1^{CK}$-recursive in $KP_n^0$ if there is a $\Sigma$-formula $F(x, y, z)$ such that $f(x) \simeq y \Leftrightarrow KP_n^0 \vdash (\forall x \in A_1)(\exists y \in A_1)(\exists z \in A_1)F(x, y, z)$ for a $\Delta_0$-formula $F(x, y, z)$ whose parameters are all definable in $KP_n^0$. Hence $L_{\Phi^{n-1}(\varepsilon_0)} \models (\forall x)(\exists y)(\exists z)F(x, y, z)$ which shows that $L_{\Phi^{n-1}(\varepsilon_0)}$ is closed under $f$.
12.6.15 Remark In $KP_n$, the set $A_0$ figures as the set $L_\omega$ of hereditarily finite sets and thus $A_1$ figures as the first admissible set above $L_\omega$, i.e., as $L_{\omega_1^{CK}}$. The ordinal $\|KP_n\|_{\Sigma^{A_1}}$ corresponds therefore to the ordinal $\|KP_n\|_{\Sigma^{\omega_1^{CK}}}$ mentioned in Remark 11.7.3. The theories $KP_n$ are therefore examples for theories in which the proof-theoretic ordinal and the $\|KP_n\|_{\Sigma^{\omega_1^{CK}}}$-ordinals differ.
In the following theorem, we collect some of the results of this section.
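12.6.16 Theorem $\|KP\omega^0\| = \|KP_0^0\| = \varepsilon_0$. $\|KP\omega^r\| = \varepsilon_0$. $\|KP\omega^r + (\mathrm{IND})_\omega\| = \|\widetilde{KP}^r_0\| \le \varphi_{\varepsilon_0}(0)$.
Proof $\|KP\omega^0\| = \|KP_0^0\| \le \varepsilon_0$ follows from Exercise 12.6.2 (b) and Theorem 12.6.14. If $KP\omega^r \vdash TI(\prec, X)^{A_0 \cap On}$ we get $\widetilde{KP}_0 + (\mathrm{Found}) \vdash TI(\prec, X)^{A_0 \cap On}$ and thus $KP^- + (\mathrm{FOUND}) \vdash TI(\prec, X)^{On}$ by Exercise 12.5.13. Hence $L_\omega \vdash^{<\omega\cdot 2}_{<\omega} TI(\prec, X)^{On}$ which implies $L_\omega \vdash^{<\varepsilon_0}_{0} TI(\prec, X)^{On}$ by cut-elimination and thus $\mathrm{otyp}(\prec) < \varepsilon_0$ by Exercise 12.6.7. Since NT is embeddable in both theories the bound $\varepsilon_0$ is exact.
If $KP\omega^r + (\mathrm{IND})_\omega \vdash TI(\prec, X)$ we get $\widetilde{KP}^r_0 + (\mathrm{FOUND}(A_0)) \vdash TI(\prec, X)^{A_0 \cap On}$. Hence, $L_\omega^{(0)} + (\mathrm{Found}) \vdash^{<\omega\cdot 2}_{<\omega} TI(\prec, X)^{A_0 \cap On}$ by Exercise 12.6.12(c) which by cut-elimination implies $L_\omega^{(0)} + (\mathrm{Found}) \vdash^{<\varepsilon_0}_{0} TI(\prec, X)^{A_0 \cap On}$. By Theorem 12.4.5(b), we get $L_\omega \vdash^{<\varphi_{\varepsilon_0}(0)}_{0} TI(\prec, X)^{On}$ and thus $\|KP\omega^r + (\mathrm{IND})_\omega\| = \|\widetilde{KP}^r_0\| \le \varphi_{\varepsilon_0}(0)$ from Exercise 12.6.7.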
12.6.17 Remark $\varphi_{\varepsilon_0}(0)$ is the proof-theoretic ordinal of the theory of $\Delta^1_1$-comprehension and also of the theory of the $\Sigma^1_1$ axiom of choice in second-order arithmetic, both equipped with the induction scheme. These results are scattered in the literature of predicative proof theory (e.g. [26]). Unfortunately, there is no comprehensive monograph treating these parts of predicative proof theory systematically. Schütte in his monograph [89] only treats his system DA which corresponds to a $\Delta^1_1$-comprehension rule and has proof-theoretic ordinal $\varphi_\omega(0)$. Since $KP\omega$ proves $\Delta$-separation and strong $\Sigma$-replacement it is not too difficult to see that ($\Delta^1_1$-CA) and ($\Sigma^1_1$-AC) can be embedded into $KP\omega^r + (\mathrm{IND})_\omega$ (which entails that $\varphi_{\varepsilon_0}(0)$ is the exact bound for $KP\omega^r + (\mathrm{IND})_\omega$) while ($\Delta^1_1$-CA)$_0$ and ($\Sigma^1_1$-AC)$_0$, the versions in which the induction scheme is replaced by the induction axiom, can be embedded into $KP\omega^0$ and thus have proof-theoretic ordinals $\varepsilon_0$. The result that these theories are conservative over NT is originally due to Schlipf (cf. [6]).
12.7 The Theories $KPl^0$ and $KPi^0$
Recall the theories $KPl^0$ and $KPi^0$ introduced in Sect. 11.6. The theory KPl comprises the axioms of BST + Ad together with the limit axiom
(Lim) $(\forall x)(\exists y)[\mathrm{Ad}(y) \wedge x \in y]$
which states that the universe is a union of admissible universes. Even stronger is the theory KPi, which is KP + BST + Ad + (Lim) and axiomatizes an admissible universe which is also a union of admissible universes. In Exercise 11.6.3 we have seen that the set $A_0$ of hereditarily finite sets is definable in KPl, hence also in KPi. We obtain the restrictions $KPl^0$ and $KPi^0$ replacing the foundation scheme (FOUND) by the axiom (Found($A_0$)).
We are going to reduce $KPi^0$, and thus also $KPl^0$, to the theories $KP_k^0$ via an asymmetric interpretation. To this end, we need an (asymmetric) interpretation for the formulas in $\mathcal{L}(KPi)$ into $\mathcal{L}(KP_k)$ which is given in the following definition.
12.7.1 Definition Let $F$ be an $\mathcal{L}(\in, \mathrm{Ad})$-formula. By $F^{(m,s)}$, we understand the $\mathcal{L}(\in, \emptyset, A_1,\dots,A_k, \mathrm{Ad})$-formula which is obtained from $F$ restricting all universal quantifiers $(\forall x)[\cdots x \cdots]$ to $(\forall x \in A_m)[\cdots x \cdots]$ and all existential quantifiers $(\exists x)[\cdots x \cdots]$ to $(\exists x \in A_s)[\cdots x \cdots]$. If $\Delta = \{F_1,\dots,F_n\}$ is a finite set of $\mathcal{L}(\in, \mathrm{Ad})$-formulas then $D^{(m,s)}$ is the collection of all finite sets $\{F_1^{(m_1,s_1)},\dots,F_n^{(m_n,s_n)}\}$ with $m_i \le m$ and $s_i \ge s$ for $i = 1,\dots,n$ (cf. Definition 12.4.1).
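The translation $F \mapsto F^{(m,s)}$ of Definition 12.7.1 is a purely syntactic operation, which the following sketch illustrates on a toy formula tree. The assumptions are mine and not from the text: formulas are taken to be in negation normal form (as in a Tait-style calculus, so negation occurs only inside atoms), atoms are opaque strings, and the class and function names (`Atom`, `BinOp`, `Quant`, `asym`, `show`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    text: str                      # atomic or negated atomic formula, kept opaque


@dataclass(frozen=True)
class BinOp:
    op: str                        # "and" or "or"
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    kind: str                      # "forall" or "exists"
    var: str
    bound: Optional[str]           # None means the quantifier is unbounded
    body: "Formula"


Formula = Union[Atom, BinOp, Quant]


def asym(f: Formula, m: int, s: int) -> Formula:
    """Bound unbounded universal quantifiers by A_m and existential ones by A_s."""
    if isinstance(f, Atom):
        return f
    if isinstance(f, BinOp):
        return BinOp(f.op, asym(f.left, m, s), asym(f.right, m, s))
    bound = f.bound
    if bound is None:              # already bounded quantifiers are left alone
        bound = f"A{m}" if f.kind == "forall" else f"A{s}"
    return Quant(f.kind, f.var, bound, asym(f.body, m, s))


def show(f: Formula) -> str:
    """Render a formula as a plain string."""
    if isinstance(f, Atom):
        return f.text
    if isinstance(f, BinOp):
        return f"{show(f.left)} {f.op} {show(f.right)}"
    head = f"({f.kind} {f.var}" + (f" in {f.bound}" if f.bound else "") + ")"
    if isinstance(f.body, Quant):
        return head + show(f.body)
    return f"{head}[{show(f.body)}]"


# The limit axiom (Lim): (forall x)(exists y)[Ad(y) and x in y]
LIM = Quant("forall", "x", None,
            Quant("exists", "y", None,
                  BinOp("and", Atom("Ad(y)"), Atom("x in y"))))

if __name__ == "__main__":
    print(show(asym(LIM, 2, 5)))
```

The asymmetry is visible in the output for (Lim): the universal quantifier ranges over the smaller universe $A_2$ while the witness may be found in the larger $A_5$.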
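12.7.2 Exercise Let $\Delta(\vec{u})$ be a finite set of $\mathcal{L}(\in, \mathrm{Ad})$-formulas whose free variables occur all in the list $\vec{u}$. Show that $KPi^0 \vdash^{n}_{T} \Delta(\vec{u})$ implies $KP_k^0 \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a})$ for all $m$, all $i \le m$, all $k \ge s \ge m + 3^n$, all $\Delta \in D^{(m,s)}$ and all tuples $\vec{a}$ of $\mathcal{L}(\emptyset, A_1,\dots,A_n, \in, \mathrm{Ad})$-terms.
Hint: This is again an asymmetric interpretation. The proof is essentially that of Lemma 12.4.2. Assume $KPi^0 \vdash^{n}_{T} \Delta(\vec{u})$. Fix $m$ and choose $s \ge m + 3^n$ and $k \ge s$. Let $M$ be an arbitrary $KP_k^0$-model and $\Phi$ an $M$-assignment. Prove
• "If $\vdash^{n}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u})$ for a finite set $\Lambda(\vec{u})$ of specializations of $KPi^0$-axioms, including the identity axioms, then $M \models (\vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}))[\Phi]$ holds true for all $\Delta \in D^{(m,s)}$, all $i \le m$ and all tuples $\vec{a}$ of $\mathcal{L}(KP_k^0)$-terms"
by induction on $n$. Fix $\Delta \in D^{(m,s)}$. To simplify the notation, suppress mentioning the assignment $\Phi$. The distinction of cases is as in the proof of Lemma 12.4.2. The main cases are that the critical formula of the last inference is an axiom $\neg(\mathrm{Lim})$ or $\neg(\Delta_0\text{-Coll})$. In case of $\neg(\mathrm{Lim})$, you have the premise
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), (\forall y)\neg[\mathrm{Ad}(y) \wedge v \in y]$   (i)
for some variable $v$. If $v$ occurs in the list $\vec{u}$ let $b$ be the corresponding term. Otherwise put $b := \emptyset$. Then $M \models (\vec{a} \in A_i)$ implies $b \in A_i$ for all $i \le m < m + 1$. Since also $k \ge s \ge m + 3^n \ge m + 1 + 3^{n_0}$ you can apply the induction hypothesis to (i) to get
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\forall y \in A_{m+1})\neg[\mathrm{Ad}(y) \wedge b \in y]$.   (ii)
Since $M \models b \in A_m \in A_{m+1}$ and $M \models \mathrm{Ad}(A_m)$ you get the claim from (ii). In case of an axiom $(\Delta_0\text{-Coll})$ you have the premise
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), (\forall x \in v)(\exists y)F(x, y, \vec{u}) \wedge (\forall z)(\exists x \in v)(\forall y \in z)\neg F(x, y, \vec{u})$
from which by $\wedge$-inversion you get
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), (\forall x \in v)(\exists y)F(x, y, \vec{u})$   (iii)
and
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), (\forall z)(\exists x \in v)(\forall y \in z)\neg F(x, y, \vec{u})$.   (iv)
Choose $b$ as in the previous case, put $s_0 := m + 3^{n_0}$ and apply the inductive hypothesis to (iii) to obtain
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\forall x \in b)(\exists y \in A_{s_0})F(x, y, \vec{a})$.   (v)
Letting $s_0 := m + 3^{n_0}$, $m' := s_0 + 1$ and $s' := m' + 3^{n_0}$ and applying the inductive hypothesis to (iv) you get
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\forall z \in A_{s_0+1})(\exists x \in b)(\forall y \in z)\neg F(x, y, \vec{a})$
for all $\Delta \in D^{(m',s')} \supseteq D^{(m,s)}$. Since $M \models A_{s_0} \in A_{s_0+1}$ this implies
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\exists x \in b)(\forall y \in A_{s_0})\neg F(x, y, \vec{a})$   (vi)
and the claim follows from (v) and (vi).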
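A case which does not appear in 12.4.2 is that the critical formula is $\neg\mathrm{Ad}$. Then you have a premise
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), \mathrm{Ad}(v) \wedge \neg B(v)$
or
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), \mathrm{Ad}(v_0) \wedge \mathrm{Ad}(v_1) \wedge \neg B'(\vec{v})$
where $B(v)$ is the formula $\mathrm{Tran}(v)$ or $((\mathrm{Pair'})^v \wedge (\mathrm{Union'})^v \wedge (\Delta_0\text{-Sep})^v \wedge (\Delta_0\text{-Coll})^v)$ and $B'(\vec{v})$ the formula $v_0 \in v_1 \vee v_0 = v_1 \vee v_1 \in v_0$. Indicating only the first case you get by $\wedge$-inversion
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), \mathrm{Ad}(v)$
and
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), \neg B(v)$.
If $v$ appears in the list $\vec{u}$ replace $v$ by $b$ accordingly. Otherwise replace $v$ by $b := \emptyset$. Then $M \models \vec{a} \in A_i$ implies $b \in A_i$. By induction hypothesis you obtain
$M \models \vec{a} \in A_i \rightarrow \mathrm{Ad}(b) \vee \bigvee\Delta(\vec{a})$   (vii)
and
$M \models \vec{a} \in A_i \rightarrow \neg B(b) \vee \bigvee\Delta(\vec{a})$.   (viii)
If $M \not\models \mathrm{Ad}(b)$ you get the claim from (vii) and $M \models \mathrm{Ad}(b)$ implies $M \models B(b)$ and you get the claim from (viii).
The case that the critical formula is the negation of (Found($A_0$)) is again different from that in Lemma 12.4.2. The problem is that $A_0$ has different definitions in $KPi^0$ and $KP_k^0$. To emphasize this difference let us temporarily write $U_0$ for $A_0$ as defined in $KPi^0$. Observe, however, that by Exercise 11.6.3(c) $(\forall x \in U_0)[\cdots x \cdots]$ can be expressed by $(\forall x)[(\forall u)(\mathrm{Ad}(u) \rightarrow x \in u) \rightarrow \cdots x \cdots]$. But if $M \models x \in A_0$ we always get $M \models (\forall u)[\mathrm{Ad}(u) \rightarrow x \in u]$. So $(\forall x \in U_0)[\cdots x \cdots]^{A_m}$ is in $M$ equivalent to $(\forall x \in A_0)[\cdots x \cdots]^{A_m}$. On the other hand $(\exists x \in U_0)[\cdots x \cdots]$ is a shorthand for the formula $(\exists x)(\exists u)[\mathrm{Ad}(u) \wedge (\forall y \in u)(\neg\mathrm{Ad}(y)) \wedge x \in u \wedge \cdots x \cdots]$. For $l > 0$ the set $A_0$ is a witness for $u$ in $(\forall x)[x \in A_0 \rightarrow (\exists u \in A_l)[\mathrm{Ad}(u) \wedge (\forall y \in u)(\neg\mathrm{Ad}(y)) \wedge x \in u]]$. Hence, $M \models (\exists x \in U_0)[\cdots x \cdots]^{A_l}$ iff $M \models (\exists x \in A_0)[\cdots x \cdots]^{A_l}$ for all $l > 0$. The negation of (Found($A_0$)) is the formula
$(\exists u)[(\forall x \in U_0)[(\forall y \in x)(y \in u) \rightarrow x \in u] \wedge (\exists x \in U_0)(x \notin u)]$
and from the premise you get by $\wedge$-inversion
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), (\forall x \in U_0)[(\forall y \in x)(y \in u) \rightarrow x \in u]$
and
$\vdash^{n_0}_{T} \neg\Lambda(\vec{u}), \Delta(\vec{u}), (\exists x \in U_0)[x \notin u]$
for a variable $u$. If $u$ occurs in the list $\vec{u}$ replace $u$ correspondingly by a term $b$. Otherwise, let $b$ be $\emptyset$. In any case you get $M \models \vec{a} \in A_i \rightarrow b \in A_i$. Using the above observations you get
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\forall x \in A_0)[(\forall y \in x)(y \in b) \rightarrow x \in b]$   (ix)
and
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\exists x \in A_0)(x \notin b)$   (x)
by the inductive hypothesis. Since $M \models (\mathrm{Found}(A_0))$ you get from (ix)
$M \models \vec{a} \in A_i \rightarrow \bigvee\Delta(\vec{a}) \vee (\forall x \in A_0)(x \in b)$
and the claim follows together with (x). The remaining cases follow the pattern of the proof of Lemma 12.4.2.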
Since $KPl^0 \subseteq KPi^0$, Exercise 12.7.2 also yields a reduction of $KPl^0$ to $\bigcup_{n \in \omega} KP_n^0$.
12.7.3 Theorem (a) $\|KPl^0\| \le \|KPi^0\| \le \Gamma_0$.
(b) $\|KPl^0\|_{\Sigma^{A_1}} \le \|KPi^0\|_{\Sigma^{A_1}} \le \Gamma_0$.
(c) $\|KPl^0\|_{\Pi_2^{A_1}} \le \|KPi^0\|_{\Pi_2^{A_1}} \le \Gamma_0$, i.e., the stage $L_{\Gamma_0}$ of the constructible hierarchy is the least model for the $\Pi_2^{A_1}$-sentences which are provable in $KPl^0$ or $KPi^0$ and hence closed under all $\omega_1^{CK}$-recursive functions which are provably total in $KPl^0$ or even $KPi^0$.
Proof (a) $\|KPl^0\| \le \|KPi^0\|$ is obvious. So assume $KPi^0 \vdash TI(\prec, X)^{U_0}$. Then, we get $KP_n^0 \vdash TI(\prec, X)^{A_0}$ for some $n$ which by Theorem 12.6.14 implies $\mathrm{otyp}(\prec) \le \Phi^n(\varepsilon_0) < \Gamma_0$.
(b) First observe that for $l > 0$ we get $U_0^{A_l} = \{x \mid (\forall u \in A_l)[\mathrm{Ad}(u) \rightarrow x \in u]\} = \{x \mid x \in A_0\} = A_0$ which implies that for a $\Sigma_1$-formula $(\exists x)F(x)$ the translation $((\exists x \in U_1)F(x))^{A_1}$, i.e., the formula $\big((\exists x)[(\forall u)[\mathrm{Ad}(u) \wedge U_0 \in u \rightarrow x \in u] \wedge F(x)]\big)^{A_1}$, becomes equivalent to $(\exists x \in A_1)[(\forall u \in A_1)[\mathrm{Ad}(u) \wedge A_0 \in u \rightarrow x \in u] \wedge F(x)]$, which, in turn, is equivalent to $(\exists x \in A_1)F(x)$. If $KPi^0 \vdash (\exists x)F(x)^{U_1}$ for a $\Sigma_1$-sentence $(\exists x)F(x)$ then we get from Exercise 12.7.2 $KP_n^0 \vdash (\exists x \in A_1)F(x)$ which by Theorem 12.6.14 (b) implies $(\exists x \in L_{\Phi^{n-1}(\varepsilon_0)})F(x)$ and by upwards persistency also $(\exists x \in L_{\Gamma_0})F(x)$. Hence $\|KPl^0\|_{\Sigma^{A_1}} \le \|KPi^0\|_{\Sigma^{A_1}} \le \Gamma_0$.
Claim (c) follows from (b) in the same way as in Theorem 12.6.14.
In Exercise 11.6.6, we have seen that the theory $(ATR)_0$ of autonomously iterated arithmetic comprehensions can be embedded into $KPl^0$. Hence $\|(ATR)_0\| \le \Gamma_0$ which shows that it is reducible to a theory which is predicative in the Feferman–Schütte sense as described in Chap. 8. On the other hand, $(ATR)_0$ proves that for every ordinal $\alpha$ less than $\Gamma_0$ there is a primitive recursive well-ordering of order type $\alpha$ whose well-foundedness is provable in $(ATR)_0$ (cf. Exercise 8.5.32). Summing up we get the following corollary.
12.7.4 Corollary $\|KPl^0\| = \|KPi^0\| = \|(ATR)_0\| = \Gamma_0$. This implies that the theory $(ATR)_0$ is predicatively reducible.
12.7.5 Remark The theory $(ATR)_0$ has been introduced by Friedman in the framework of reverse mathematics (cf. [94]). The first computation of $\|(ATR)_0\|$ by Friedman et al. is in [27].
The method of asymmetric interpretations is not limited to $KPi^0$ but also yields results for a series of theories which lie between $\Gamma_0$ and $\psi(\varepsilon_{\Omega+1})$. The analysis of these metapredicative theories is mostly due to Jäger and his students and collaborators. Two selected references are [53] and [54] where further references can be found.$^{10}$
$^{10}$ Cf. also the references at the end of the book.
Chapter 13
Nonmonotone Inductive Definitions
We have seen in Sect. 6.3 that the fixed-point of an inductive definition comes in stages. We used a cardinality argument to show that the hierarchy of stages of an inductive definition becomes eventually stationary. Monotonicity, however, is not really necessary in the cardinality argument. The cardinality argument only needs that the operator is inflationary, i.e., that the operator $\Phi$ satisfies the condition $X \subseteq \Phi(X)$. This is, however, not really a restriction since any operator $\Phi\colon \mathrm{Pow}(\mathbb{N}) \rightarrow \mathrm{Pow}(\mathbb{N})$ induces an operator $\Phi'(X) := X \cup \Phi(X)$ which is obviously inflationary.
13.1 Nonmonotone Inductive Definitions
We start with a brief synopsis of the theory of nonmonotone inductive definitions.
13.1.1 Definition Let $\Phi\colon \mathrm{Pow}(\mathbb{N}) \rightarrow \mathrm{Pow}(\mathbb{N})$ be an operator. The hierarchy of stages is defined by
$\Phi^{\alpha} := \Phi^{<\alpha} \cup \Phi(\Phi^{<\alpha})$
where again we abbreviate $\Phi^{<\alpha} := \bigcup_{\xi<\alpha} \Phi^{\xi}$. Then
$\Phi^{<\infty} := \bigcup_{\xi \in On} \Phi^{\xi}$
is a fixed-point of $\Phi'$. We call it the fixed-point generated by $\Phi$.
13.1.2 Lemma Let $\Phi\colon \mathrm{Pow}(\mathbb{N}) \rightarrow \mathrm{Pow}(\mathbb{N})$ be an operator. Then there is a countable ordinal $\sigma$ such that $\Phi^{<\sigma} = \Phi^{\sigma}$. For this ordinal $\sigma$, we have $\Phi^{<\infty} = \Phi^{<\sigma} = \Phi^{\sigma}$.
Proof Since all sets $\Phi^{\alpha}$ are countable, we obtain by cardinality reasons an ordinal $\sigma$ such that $\Phi^{<\sigma} = \Phi^{\sigma}$. By definition, we have $\Phi^{<\sigma} \subseteq \Phi^{<\infty}$. For the opposite inclusion, we prove
$\sigma \le \tau \ \Rightarrow\ \Phi^{\tau} = \Phi^{<\sigma}$   (i)
by induction on $\tau$. For $\sigma = \tau$, we obtain (i) by definition of $\sigma$. For $\sigma < \tau$ we obtain $\Phi^{<\tau} = \Phi^{<\sigma}$ by induction hypothesis. Hence $\Phi^{\tau} = \Phi(\Phi^{<\tau}) \cup \Phi^{<\tau} = \Phi(\Phi^{<\sigma}) \cup \Phi^{<\sigma} = \Phi^{\sigma} = \Phi^{<\sigma}$.
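13.1.3 Definition Let $|\Phi| := \min\{\sigma \mid \Phi^{\sigma} = \Phi^{<\sigma}\}$. We call $|\Phi|$ the closure ordinal of $\Phi$. As a consequence of Lemma 13.1.2, we obtain
$\Phi^{\infty} := \Phi(\Phi^{<\infty}) \cup \Phi^{<\infty} = \Phi(\Phi^{<|\Phi|}) \cup \Phi^{<|\Phi|} = \Phi^{|\Phi|} = \Phi^{<\infty}$.   (13.1)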
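On a finite universe the hierarchy of stages only needs finite ordinals, so it can be computed directly. The following sketch makes that assumption explicit (a four-element universe instead of $\mathbb{N}$); the operator `size_op` and all function names are hypothetical illustrations, not taken from the text. It computes the stages $\Phi^{\alpha} = \Phi^{<\alpha} \cup \Phi(\Phi^{<\alpha})$, the first stage at which nothing new is added, and the least stage at which a given element appears.

```python
UNIVERSE = range(4)  # finite stand-in for N, an assumption of this sketch


def size_op(xs):
    """A nonmonotone operator: it yields the one element equal to the size of xs."""
    return {n for n in UNIVERSE if n == len(xs)}


def stages(op):
    """Return [Phi^0, Phi^1, ...] up to the first stage that adds nothing new."""
    below = frozenset()                        # Phi^{<alpha}
    out = []
    while True:
        stage = below | frozenset(op(below))   # Phi^alpha
        out.append(stage)
        if stage == below:                     # Phi^alpha = Phi^{<alpha}: closed
            return out
        below = stage                          # stages increase, so Phi^{<alpha+1} = Phi^alpha


def closure_ordinal(op):
    """Least sigma with Phi^sigma = Phi^{<sigma} (finite here by assumption)."""
    return len(stages(op)) - 1


def first_stage(op, x):
    """Least alpha with x in Phi^alpha, or None if x never enters."""
    for alpha, stage in enumerate(stages(op)):
        if x in stage:
            return alpha
    return None


if __name__ == "__main__":
    print([sorted(s) for s in stages(size_op)])
    print(closure_ordinal(size_op))
```

The operator `size_op` is not monotone, since `size_op(set())` is `{0}` while `size_op({1})` is `{1}`, yet the stages still increase because each stage includes everything below it.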
By a norm on a set P we commonly understand a surjective mapping f : P −→ λ ∈ On. An operator Φ: Pow(N) −→ Pow(N) induces a mapping | |Φ : N −→ On defined by α <∞ |n|Φ := min {α x ∈ Φ } if x ∈ Φ ∞ otherwise which satisfies the following property. onto
13.1.4 Lemma We obtain $|\Phi| = \{|x|_\Phi \mid x \in \Phi^{<\infty}\}$. Hence $|\cdot|_\Phi\colon \Phi^{<\infty} \xrightarrow{\text{onto}} |\Phi|$.

Proof Let $\sigma := |\Phi|$. By Lemma 13.1.2, we have $\Phi^{<\infty} = \Phi^{<\sigma}$. Hence $\{|x|_\Phi \mid x \in \Phi^{<\infty}\} \subseteq \sigma$. If $\alpha < \sigma$ then $\Phi^{<\alpha} \subsetneq \Phi^{\alpha}$. Then, there is an $x \in \Phi^{\alpha} \setminus \Phi^{<\alpha}$. But then $|x|_\Phi = \alpha$ for that $x$.

By Lemma 13.1.4 $|\cdot|_\Phi$ is a norm on $\Phi^{<\infty}$. We call it the inductive norm induced by $\Phi$.

13.1.5 Remark The standard examples of nonmonotone operators are operators of the form $[\Gamma_0, \Gamma_1]$ where $\Gamma_0$ and $\Gamma_1$ are inductive definitions, i.e., monotone operators, and
$$[\Gamma_0, \Gamma_1](X) := \{x \in \mathbb{N} \mid x \in \Gamma_0(X) \lor [\Gamma_0(X) \subseteq X \land x \in \Gamma_1(X)]\}.$$
We obtain the stages of $[\Gamma_0, \Gamma_1]$ by first iterating $\Gamma_0$ until a $\Gamma_0$-closed set is obtained and then using $\Gamma_1$, then falling back to iterating $\Gamma_0$ until the next $\Gamma_0$-closed set is obtained, then using $\Gamma_1$, etc.

The origin of operators of this kind was the need for higher constructive number classes in abstract recursion theory. The first constructive number class comprises all ordinals which can be represented by recursive well-orderings on the natural numbers, i.e., the ordinals below $\omega_1^{CK}$. Kleene designed an abstract notation system $O$ for the ordinals in the first constructive number class using an abstract notion of fundamental sequence. Kleene's notation system consists of a set $O$ of ordinal notations together with an interpretation $|a|_O \in \mathrm{On}$ for $a \in O$. Originally, it is defined by the clauses

• $1 \in O$ and $|1|_O := 0$,
• if $a \in O$ then $2^a \in O$ and $|2^a|_O := |a|_O + 1$, and
• if $e$ is an index for a recursive fundamental sequence, i.e., if $(\forall n \in \omega)[\{e\}^1(n) \in O]$ and $|\{e\}^1(n)|_O < |\{e\}^1(n+1)|_O$, then $3 \cdot 5^e \in O$ and $|3 \cdot 5^e|_O := \sup_{n \in \mathbb{N}} |\{e\}^1(n)|_O$,

where $\{e\}^1$ denotes the Kleene bracket, i.e., $\{e\}^1(x) \simeq U(\mu y.\, T^1(e, x, y))$ (cf. Theorem 10.1.1 on p. 208). The second constructive number class $O_2$ is obtained by allowing fundamental sequences of length $\omega_1^{CK}$, i.e., by the limit clause

• if $(\forall x \in O)[\{e\}^1(x) \in O_2]$ and $|x|_O < |y|_O \Rightarrow |\{e\}^1(x)|_{O_2} < |\{e\}^1(y)|_{O_2}$ then $3^2 \cdot 5^e \in O_2$ and $|3^2 \cdot 5^e|_{O_2} := \sup_{x \in O} |\{e\}^1(x)|_{O_2}$.

Iterating this procedure, we obtain all finite constructive number classes. By $\omega_n^{CK}$, we denote the order type of the $n$th constructive number class. It is not too difficult to see that Kleene's $O$ can be defined by a positive arithmetical inductive definition. This implies that the order type of the ordinals denoted in Kleene's $O$ cannot exceed $\omega_1^{CK}$ (as we defined it in this book). Again, it is not too difficult to see that the order type is in fact equal to $\omega_1^{CK}$. The order type of the ordinals in $O_2$ therefore exceeds $\omega_1^{CK}$, which implies that already $O_2$ is unattainable by a positive arithmetical inductive definition but needs an inductive definition in which $O$ occurs negatively as given above. This indicates that nonmonotone inductive definitions are more powerful than monotone inductive definitions.

We will not pursue the abstract theory further. A careful study of the connections between large cardinals, their constructible counterparts in the constructible hierarchy $L$ and the closure ordinals of nonmonotone inductive definitions is due to Aczel and Richter [83].$^{1}$ From a proof-theoretical standpoint only definable nonmonotone operators (cf. Definition 6.4.1) can be interesting. Following Richter–Aczel, we define
$$[F_1, \ldots, F_n] := \{[\Gamma_1, \ldots, \Gamma_n] \mid \Gamma_i \text{ is positively } F_i\text{-definable}\},$$
where $F_1, \ldots, F_n$ are complexity classes.
The closure ordinal of the class $[\Pi^0_1, \Pi^0_1]$ is already a recursive Mahlo ordinal and thus widely outside the reach of a first step into impredicativity. But Richter and Aczel prove in [83] that $|\Pi^0_1| = |\Sigma^0_2| = \omega_1^{CK}$, a result which goes back to Gandy. An ordinal analysis of a theory for $\Pi^0_1$-definable nonmonotone inductive definitions should therefore still be in the realm of this book. We will give its ordinal analysis as an application of the ordinal analysis of $(\Pi_2\text{–REF})$.
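The way the combined operator $[\Gamma_0, \Gamma_1]$ of Remark 13.1.5 alternates between iterating $\Gamma_0$ and applying $\Gamma_1$ can be observed on a finite universe. The following Python sketch is an illustration only; the universe size `N`, the two monotone operators `gamma0` and `gamma1`, and the helper names are assumptions chosen for this example.

```python
def stages(op):
    """Return [Φ^0, ..., Φ^σ] with σ least such that Φ^σ = Φ^{<σ} (finite case)."""
    below = frozenset()
    result = []
    while True:
        current = frozenset(op(below)) | below   # Φ^α = Φ(Φ^{<α}) ∪ Φ^{<α}
        result.append(current)
        if current == below:
            return result
        below = current


def combine(gamma0, gamma1):
    """[Γ0, Γ1](X) = {x | x ∈ Γ0(X) or (Γ0(X) ⊆ X and x ∈ Γ1(X))}."""
    def op(X):
        g0 = frozenset(gamma0(X))
        if g0 <= X:                  # X is Γ0-closed, so Γ1 may contribute
            return g0 | frozenset(gamma1(X))
        return g0
    return op


N = 8  # hypothetical finite universe {0, ..., 7}


def gamma0(X):
    """Monotone: close under adding 2 (inside the universe)."""
    return {x + 2 for x in X if x + 2 < N}


def gamma1(X):
    """Monotone: contribute 0 and all successors (inside the universe)."""
    return {0} | {x + 1 for x in X if x + 1 < N}


if __name__ == "__main__":
    for alpha, stage in enumerate(stages(combine(gamma0, gamma1))):
        print(alpha, sorted(stage))
```

In this example $\Gamma_1$ fires at stage 0, then $\Gamma_0$ is iterated until the even numbers are exhausted, and only then does $\Gamma_1$ contribute again.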
$^{1}$ One of the most influential papers for ordinally informative proof theory.
13.2 Prewellorderings

The axiomatization of a theory of fixed-points $\Phi$ of $\Pi^0_1$-definable operators is not as simple as the axiomatization of positive inductive definitions. In the latter, the fixed-point could be easily described as the least $\Phi$-closed set. To define fixed-points of nonmonotone operators, we need ordinals which are, however, not available in the language of arithmetic. The solution is given by the stage comparison relations for inductive definitions which are defined by
$$\begin{aligned} x \preceq_\Phi y &:\Leftrightarrow (\exists\alpha)[x \in \Phi^{\alpha} \land y \notin \Phi^{<\alpha}] \\ x \prec_\Phi y &:\Leftrightarrow (\exists\alpha)[x \in \Phi^{\alpha} \land y \notin \Phi^{\alpha}]. \end{aligned} \tag{13.2}$$
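Our aim is to axiomatize $\Phi^{<\infty}$ via the stage comparison relations. Therefore, we have to study them more profoundly. We start with two observations.
$$x \preceq_\Phi y \;\Leftrightarrow\; x \in \Phi^{<\infty} \land [y \notin \Phi^{<\infty} \lor \neg(y \prec_\Phi x)] \tag{13.3}$$
and
$$x \prec_\Phi y \;\Leftrightarrow\; x \in \Phi^{<\infty} \land [y \notin \Phi^{<\infty} \lor \neg(y \preceq_\Phi x)]. \tag{13.4}$$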
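To prove the direction from left to right in (13.3) let $x \preceq_\Phi y$. Then there is an $\alpha$ such that $x \in \Phi^{\alpha}$ and $y \notin \Phi^{<\alpha}$. Hence $x \in \Phi^{<\infty}$ and we are done if $y \notin \Phi^{<\infty}$. If $y \in \Phi^{\beta}$, then $\alpha \le \beta$ which implies $x \in \Phi^{\beta}$. Hence $\neg(y \prec_\Phi x)$. For the opposite inclusion assume $x \in \Phi^{<\infty}$. If $y \notin \Phi^{<\infty}$ then $x \preceq_\Phi y$ is immediate. Otherwise, $y \in \Phi^{\alpha}$ implies $x \in \Phi^{\alpha}$ for all ordinals $\alpha$. For $\alpha := |x|_\Phi$, we therefore obtain $y \notin \Phi^{<\alpha}$ but $x \in \Phi^{\alpha}$. Hence $x \preceq_\Phi y$ and we have (13.3).

Equation (13.4) is proved similarly. From the left-hand side hypothesis, we obtain an ordinal $\alpha$ such that $x \in \Phi^{\alpha}$ but $y \notin \Phi^{\alpha}$. This implies $x \in \Phi^{<\infty}$. If $y \in \Phi^{\beta}$ then $\alpha < \beta$ which implies $x \in \Phi^{<\beta}$. Hence $\neg(y \preceq_\Phi x)$. For the opposite implication assume $x \in \Phi^{<\infty}$. If $y \notin \Phi^{<\infty}$, we obviously have $x \prec_\Phi y$. Otherwise, we obtain $x \in \Phi^{<\alpha}$ whenever $y \in \Phi^{\alpha}$. For $\alpha := |x|_\Phi$, we therefore have $x \in \Phi^{\alpha}$ but $y \notin \Phi^{\alpha}$. Hence $x \prec_\Phi y$ and we are done with (13.4), too.

The agreement $\alpha < \infty$ for all ordinals $\alpha$ allows for a simple characterization of the relations $\preceq_\Phi$ and $\prec_\Phi$. We obtain
$$\{x \mid x \preceq_\Phi y\} = \Phi^{|y|_\Phi} \tag{13.5}$$
and
$$\{x \mid x \prec_\Phi y\} = \Phi^{<|y|_\Phi}. \tag{13.6}$$
To prove (13.5) let $x \preceq_\Phi y$. Then there is an $\alpha$ such that $x \in \Phi^{\alpha}$ and $y \notin \Phi^{<\alpha}$. If $y \notin \Phi^{<\infty}$ then $|y|_\Phi = \infty$ which immediately implies $x \in \Phi^{<\infty} \subseteq \Phi^{|y|_\Phi}$. Otherwise we have $y \in \Phi^{|y|_\Phi}$ which implies $\alpha \le |y|_\Phi$. Hence, $x \in \Phi^{|y|_\Phi}$ and we have $\{x \mid x \preceq_\Phi y\} \subseteq \Phi^{|y|_\Phi}$ in any case. For the opposite inclusion let $x \in \Phi^{|y|_\Phi}$. If $|y|_\Phi < \infty$ we obtain $y \notin \Phi^{<|y|_\Phi}$ and thus $x \preceq_\Phi y$. If $|y|_\Phi = \infty$, we obtain $x \in \Phi^{<\infty}$ by (13.1) and $y \notin \Phi^{<\alpha}$ for all $\alpha$. Hence $x \preceq_\Phi y$.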
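To check (13.6) let $x \prec_\Phi y$. Then there is an $\alpha$ such that $x \in \Phi^{\alpha}$ and $y \notin \Phi^{\alpha}$. Hence $x \in \Phi^{<\infty}$. If $|y|_\Phi < \infty$, we have $y \in \Phi^{|y|_\Phi}$ which implies $\alpha < |y|_\Phi$ and thus $x \in \Phi^{<|y|_\Phi}$. Hence $\{x \mid x \prec_\Phi y\} \subseteq \Phi^{<|y|_\Phi}$. For the opposite inclusion let $x \in \Phi^{<|y|_\Phi}$. Then there is an $\alpha < |y|_\Phi$ such that $x \in \Phi^{\alpha}$. Since $y \notin \Phi^{<|y|_\Phi}$, we get $y \notin \Phi^{\alpha}$. Hence $x \prec_\Phi y$.

As a consequence of (13.6) and (13.5) we obtain
$$x \preceq_\Phi y \;\Leftrightarrow\; x \prec_\Phi y \lor x \in \Phi(\{z \mid z \prec_\Phi y\}). \tag{13.7}$$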
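This is immediate because for $\alpha := |y|_\Phi$ we have $\{x \mid x \preceq_\Phi y\} = \Phi^{\alpha} = \Phi^{<\alpha} \cup \Phi(\Phi^{<\alpha}) = \{x \mid x \prec_\Phi y\} \cup \Phi(\{x \mid x \prec_\Phi y\})$. The relations $\prec_\Phi$ and $\preceq_\Phi$ are in some sense the unique relations which satisfy (13.7). To make precise in which sense, we introduce the notion of a prewellordering.

13.2.1 Definition Let $P$ be a set and $(\preceq, \prec)$ a pair of relations. The triple $(P, \preceq, \prec)$ is a prewellordering if there is a function $f\colon P \xrightarrow{\text{onto}} \lambda \in \mathrm{On}$ such that
$$x \preceq y \;\Leftrightarrow\; x \in P \land [y \in P \Rightarrow f(x) \le f(y)]$$
and
$$x \prec y \;\Leftrightarrow\; x \in P \land [y \in P \Rightarrow f(x) < f(y)].$$
We call $f$ the associated norm of $(P, \preceq, \prec)$.

13.2.2 Theorem (Prewellordering Theorem) The triple $(\Phi^{<\infty}, \preceq_\Phi, \prec_\Phi)$ is the uniquely determined prewellordering which satisfies
$$(\mathrm{FP}_\Phi)\qquad x \preceq_\Phi y \;\Leftrightarrow\; x \prec_\Phi y \lor x \in \Phi(\{z \mid z \prec_\Phi y\}).$$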
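On a finite universe the stage comparison relations of (13.2) and the equivalence $(\mathrm{FP}_\Phi)$ can be checked by brute force. The following Python sketch is only an illustration under that finiteness assumption; `stages`, `preceq`, `prec`, `satisfies_fp`, the universe, and the sample operator `phi` are hypothetical names introduced for this example.

```python
def stages(op):
    """Return [Φ^0, ..., Φ^σ] with σ least such that Φ^σ = Φ^{<σ} (finite case)."""
    below = frozenset()
    result = []
    while True:
        current = frozenset(op(below)) | below   # Φ^α = Φ(Φ^{<α}) ∪ Φ^{<α}
        result.append(current)
        if current == below:
            return result
        below = current


def preceq(st, x, y):
    """x ⪯ y :⇔ ∃α [x ∈ Φ^α and y ∉ Φ^{<α}], using Φ^{<0} = ∅, Φ^{<α+1} = Φ^α."""
    return any(x in st[a] and (a == 0 or y not in st[a - 1])
               for a in range(len(st)))


def prec(st, x, y):
    """x ≺ y :⇔ ∃α [x ∈ Φ^α and y ∉ Φ^α]."""
    return any(x in s and y not in s for s in st)


def satisfies_fp(op, universe):
    """Check x ⪯ y ⇔ x ≺ y ∨ x ∈ Φ({z | z ≺ y}) for all x, y in the universe."""
    st = stages(op)
    for y in universe:
        below_y = frozenset(z for z in universe if prec(st, z, y))
        image = frozenset(op(below_y))
        for x in universe:
            if preceq(st, x, y) != (prec(st, x, y) or x in image):
                return False
    return True


UNIVERSE = range(7)   # 5 and 6 are never generated, so their norm is ∞


def phi(X):
    """Hypothetical nonmonotone operator: Φ(X) = {|X|} if |X| < 5, else ∅."""
    return {len(X)} if len(X) < 5 else set()


if __name__ == "__main__":
    print(satisfies_fp(phi, UNIVERSE))
```

Because the list returned by `stages` ends with the repeated closure stage, quantifying over its indices suffices to decide both relations in this finite setting.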
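Proof From (13.6) and (13.5), we obtain $x \preceq_\Phi y \Leftrightarrow x \in \Phi^{<\infty} \land [y \notin \Phi^{<\infty} \lor |x|_\Phi \le |y|_\Phi]$ and $x \prec_\Phi y \Leftrightarrow x \in \Phi^{<\infty} \land [y \notin \Phi^{<\infty} \lor |x|_\Phi < |y|_\Phi]$. So $(\Phi^{<\infty}, \preceq_\Phi, \prec_\Phi)$ is a prewellordering with associated norm $y \mapsto |y|_\Phi$. By (13.7), we know that it also satisfies $(\mathrm{FP}_\Phi)$.

To show the uniqueness of $(\Phi^{<\infty}, \preceq_\Phi, \prec_\Phi)$ let $(P, \preceq, \prec)$ be any prewellordering satisfying $(\mathrm{FP}_\Phi)$. Let moreover $f\colon P \xrightarrow{\text{onto}} \lambda \in \mathrm{On}$ be its associated norm. By induction on $f(y)$ we show
$$y \in P \;\Rightarrow\; \Phi^{f(y)} = \{z \mid z \preceq y\}. \tag{i}$$
From the induction hypothesis we obtain
$$\Phi^{<f(y)} = \bigcup_{x \in P} \{z \mid z \preceq x \land f(x) < f(y)\} = \{z \mid z \prec y\}. \tag{ii}$$
Hence $\Phi^{f(y)} = \Phi^{<f(y)} \cup \Phi(\Phi^{<f(y)}) = \{z \mid z \prec y\} \cup \Phi(\{z \mid z \prec y\}) = \{z \mid z \preceq y\}$. From (i) and $y \preceq y$ for $y \in P$, we obtain
$$y \in P \;\Rightarrow\; y \in \Phi^{f(y)}, \text{ hence } |y|_\Phi \le f(y). \tag{iii}$$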
If we assume $|y|_\Phi < f(y)$ then we obtain a $z \in P$ such that $|y|_\Phi = f(z)$ because $f$ is onto. Hence $y \in \Phi^{f(z)}$ which by (i) implies $y \preceq z$, i.e., $f(y) \le f(z) < f(y)$, a contradiction. Therefore we have
$$y \in P \;\Rightarrow\; f(y) = |y|_\Phi. \tag{iv}$$
Next, we show
$$\Phi^{|x|_\Phi} = \{z \mid z \preceq x\} \tag{v}$$
by induction on $|x|_\Phi$. By the induction hypothesis and (iv) we obtain $\Phi^{<|x|_\Phi} = \{z \mid z \prec x\}$. Hence $\Phi^{|x|_\Phi} = \Phi^{<|x|_\Phi} \cup \Phi(\Phi^{<|x|_\Phi}) = \{z \mid z \prec x\} \cup \Phi(\{z \mid z \prec x\}) = \{z \mid z \preceq x\}$. Since $x \in \Phi^{|x|_\Phi}$, we get from (v) $x \preceq x$ and thus $x \in P$ which proves $\Phi^{<\infty} \subseteq P$. Together with (i) we therefore get $P = \Phi^{<\infty}$. By (iv) we know that the associated norms coincide. Hence $(P, \preceq, \prec) = (\Phi^{<\infty}, \preceq_\Phi, \prec_\Phi)$.

The definition of a prewellordering still depends on its associated norm function which needs ordinals in its definition. To get rid of the explicit need for ordinals we develop a description of prewellorderings which only uses the language of arithmetic.

13.2.3 Theorem Let $P \subseteq \mathbb{N}$ be a set and $(\preceq, \prec)$ be a pair of transitive order relations. The triple $(P, \preceq, \prec)$ is a prewellordering if and only if it satisfies the following conditions:

(PWO1) $x \preceq y \;\Leftrightarrow\; x \in P \land [y \notin P \lor \neg(y \prec x)]$.

(PWO2) $x \prec y \;\Leftrightarrow\; x \preceq y \land \neg(y \preceq x)$.

(PWO3) The relation $\prec$ is well-founded.
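The easy direction of Theorem 13.2.3 can be spot-checked on a finite example: relations built from a norm as in Definition 13.2.1 satisfy (PWO1) and (PWO2), and the set $P$ is recovered as $\{x \mid x \preceq x\}$. The following Python sketch is an illustration only; the universe, the sample norm `NORM`, and the helper names are assumptions made for this example, and (PWO3) is not tested because it holds trivially for a finite norm.

```python
def relations_from_norm(f):
    """Build (⪯, ≺) from a norm f, given as a dict from elements of P to naturals.

    x ⪯ y :⇔ x ∈ P and (y ∈ P ⇒ f(x) ≤ f(y)),
    x ≺ y :⇔ x ∈ P and (y ∈ P ⇒ f(x) < f(y)).
    """
    def preceq(x, y):
        return x in f and (y not in f or f[x] <= f[y])

    def prec(x, y):
        return x in f and (y not in f or f[x] < f[y])

    return preceq, prec


def check_pwo(universe, P, preceq, prec):
    """Return the truth values of (PWO1) and (PWO2) over the whole universe."""
    pwo1 = all(preceq(x, y) == (x in P and (y not in P or not prec(y, x)))
               for x in universe for y in universe)
    pwo2 = all(prec(x, y) == (preceq(x, y) and not preceq(y, x))
               for x in universe for y in universe)
    return pwo1, pwo2


UNIVERSE = range(6)                  # hypothetical universe {0, ..., 5}
NORM = {0: 0, 1: 1, 2: 1, 3: 2}      # hypothetical norm on P = {0, 1, 2, 3}

if __name__ == "__main__":
    leq, lt = relations_from_norm(NORM)
    print(check_pwo(UNIVERSE, set(NORM), leq, lt))
    print({x for x in UNIVERSE if leq(x, x)})
```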
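Proof Assume that $(P, \preceq, \prec)$ is a prewellordering with associated norm $f$. Then $x \prec y \Leftrightarrow f(x) < f(y)$ holds for $x, y \in P$ and we obtain property (PWO1) since
$$\begin{aligned} x \preceq y &\Leftrightarrow x \in P \land [y \notin P \lor f(x) \le f(y)] \\ &\Leftrightarrow x \in P \land [y \notin P \lor \neg(f(y) < f(x))] \\ &\Leftrightarrow x \in P \land [y \notin P \lor \neg(y \prec x)]. \end{aligned} \tag{i}$$
We obtain property (PWO2) because
$$\begin{aligned} x \preceq y \land \neg(y \preceq x) &\Leftrightarrow x \in P \land [y \notin P \lor f(x) \le f(y)] \land (y \notin P \lor [x \in P \land f(x) < f(y)]) \\ &\Leftrightarrow x \in P \land [y \notin P \lor f(x) < f(y)] \\ &\Leftrightarrow x \prec y. \end{aligned}$$
Since $f$ is a norm, every infinite descending $\prec$-sequence will induce an infinite descending sequence in the ordinals. The relation $\prec$ is therefore well-founded.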
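Now assume properties (PWO1)–(PWO3). For $x \in P$ let $f(x) := \mathrm{otyp}_\prec(x)$. We show that $(P, \preceq, \prec)$ is a prewellordering with associated norm $f$. First, we observe
$$x \in P \land y \notin P \;\Rightarrow\; x \preceq y \land \neg(y \preceq x) \;\Rightarrow\; x \prec y \tag{ii}$$
by (PWO1) and (PWO2). By the definition of the order type, we obtain
$$x \in P \land y \in P \;\Rightarrow\; (x \prec y \Leftrightarrow f(x) < f(y)). \tag{iii}$$
Again by (PWO1) and (PWO2), we have
$$x \prec y \;\Rightarrow\; x \in P. \tag{iv}$$
Pulling together (ii), (iii), and (iv) we obtain
$$x \prec y \;\Leftrightarrow\; x \in P \land [y \in P \Rightarrow f(x) < f(y)]. \tag{v}$$
From (v) and (PWO1) we obtain
$$\begin{aligned} x \preceq y &\Leftrightarrow x \in P \land [y \notin P \lor \neg(y \prec x)] \\ &\Leftrightarrow x \in P \land [y \notin P \lor \neg(f(y) < f(x))] \\ &\Leftrightarrow x \in P \land [y \in P \Rightarrow f(x) \le f(y)]. \end{aligned} \tag{vi}$$
By (v) and (vi), we see that $(P, \preceq, \prec)$ is a prewellordering with associated norm $\lambda x.\, \mathrm{otyp}_\prec(x)$.

The next observation is that the set $P$ is completely determined by a prewellordering $(P, \preceq, \prec)$.

13.2.4 Lemma Let $(P, \preceq, \prec)$ be a triple satisfying (PWO1)–(PWO3). Then
$$P = \{x \mid x \preceq x\} =: D_\preceq. \tag{13.8}$$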
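We call $D_\preceq$ the diagonalization of $\preceq$.

Proof From $x \preceq x$, we obtain $x \in P$ from (PWO1). Conversely, we obtain from $\neg(x \preceq x)$ by (PWO1) $x \notin P \lor [x \in P \land x \prec x]$. Since $\prec$ is well-founded, we have $\neg(x \prec x)$. Hence $x \notin P$.

13.2.5 Lemma Let $(\preceq, \prec)$ be transitive orderings satisfying (PWO1)–(PWO3). Then we obtain
$$x \prec y \preceq z \;\Rightarrow\; x \prec z \tag{13.9}$$
and
$$x \preceq y \prec z \;\Rightarrow\; x \prec z. \tag{13.10}$$
Proof For (13.9) check $x \prec y \preceq z \Rightarrow x \preceq z$ by transitivity of $\preceq$ and (PWO2). The assumption $z \preceq x$ leads to $y \preceq z \preceq x$, i.e., $y \preceq x$, contradicting $x \prec y$.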
For (13.10), we again derive $x \preceq z$ from the hypotheses. The assumption $z \preceq x$ implies $z \preceq x \preceq y$ contradicting $y \prec z$.

13.2.6 Theorem Let $\Phi_F$ be a definable operator, say $\Phi_F(X) = \{x \in \mathbb{N} \mid F(X, x)\}$, and $(\preceq_F, \prec_F)$ a pair of transitive binary relations satisfying the following properties:

(FIX) $x \preceq_F y \;\Leftrightarrow\; x \prec_F y \lor F(\{z \mid z \prec_F y\}, x)$

(PWO1) $x \preceq_F y \;\Leftrightarrow\; x \preceq_F x \land [y \preceq_F y \to \neg(y \prec_F x)]$

(PWO2) $x \prec_F y \;\Leftrightarrow\; x \preceq_F y \land \neg(y \preceq_F x)$

(PWO3) $\prec_F$ is well-founded.

Let $D_F := D_{\preceq_F} = \{x \mid x \preceq_F x\}$. Then $(D_F, \preceq_F, \prec_F)$ is a prewellordering and therefore $D_F = \Phi_F^{<\infty}$.

Proof By Theorem 13.2.3, we obtain that $(D_F, \preceq_F, \prec_F)$ is a prewellordering. Since $x \preceq_F y \Leftrightarrow x \prec_F y \lor F(\{z \mid z \prec_F y\}, x)$, we obtain $D_F = \Phi_F^{<\infty}$ by the Prewellordering Theorem (Theorem 13.2.2).

13.2.7 Remark Observe that Theorem 13.2.6 implies that the conditions (FIX) and (PWO1)–(PWO2) already guarantee that the order type of $\prec_F$ is large enough to turn the diagonalization $D_F$ into the fixed point of the operator $\Phi_F$. This is a consequence of the Prewellordering Theorem. We can, however, also see this more directly. First, we observe that $x \notin D_F$ implies $D_F = \{y \mid y \prec_F x\}$. This holds because $x \notin D_F$, i.e., $\neg(x \preceq_F x)$, implies $y \preceq_F x \Leftrightarrow y \prec_F x$ for any $y$ by (PWO2), which in turn implies $y \prec_F x \Leftrightarrow y \preceq_F y$ by (PWO1). Trivially, we have $D_F \subseteq \Phi_F(D_F)$. Now assume $F(D_F, x)$ and $x \notin D_F$. Then $D_F = \{y \mid y \prec_F x\}$, and $F(D_F, x)$ implies $x \preceq_F x$ by (FIX), which contradicts $x \notin D_F$.
13.3 The Theory $(\Pi^0_1\text{–FXP})_0$

The aim of this section is to introduce the theory $(\Pi^0_1\text{–FXP})_0$ which axiomatizes the existence of fixed-points for $\Pi^0_1$-definable operators. The language is based on the language of second-order arithmetic. We introduce some abbreviations which are needed to formalize the results of the earlier sections. For a set $X$ let $\mathrm{Rel}(X) :\Leftrightarrow (\forall x \mathrel{\varepsilon} X)[\mathrm{Seq}(x) \land \mathrm{lh}(x) = 2]$ denote that $X$ codes a binary relation. For simplicity, we mostly write $x \mathrel{X} y$ instead of $\langle x, y\rangle \in X$. By
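$$\mathrm{Tran}(X) :\Leftrightarrow \mathrm{Rel}(X) \land (\forall x)(\forall y)(\forall z)[x \mathrel{X} y \land y \mathrel{X} z \to x \mathrel{X} z]$$
we express that $X$ codes a transitive binary relation. Let $\mathrm{field}(X) := \{x \mid (\exists y)[x \mathrel{X} y \lor y \mathrel{X} x]\}$ denote the field and
$$\mathrm{Wf}(X) :\Leftrightarrow (\forall Y)\bigl[Y \subseteq \mathrm{field}(X) \land (\exists x)(x \mathrel{\varepsilon} Y) \to (\exists x)[x \mathrel{\varepsilon} Y \land (\forall y)[y \mathrel{X} x \to \neg(y \mathrel{\varepsilon} Y)]]\bigr]$$
express the well-foundedness of the relation coded by $X$. By $X_x := \{y \mid \langle x, y\rangle \mathrel{\varepsilon} X\}$ we denote the $x$-slice of $X$. The formula
$$\mathrm{PO}(X) :\Leftrightarrow (\forall x)[x \mathrel{\varepsilon} X \leftrightarrow \mathrm{Seq}(x) \land \mathrm{lh}(x) = 2 \land ((x)_0 = 0 \lor (x)_0 = 1) \land \mathrm{Tran}(X_0) \land \mathrm{Tran}(X_1)]$$
formalizes that $X$ codes two transitive orderings. If $\mathrm{PO}(X)$ let $\preceq_X := X_0$ and $\prec_X := X_1$. We define
$$\mathrm{PRO}(X) :\Leftrightarrow \mathrm{PO}(X) \land (\forall x)(\forall y)\bigl[[x \preceq_X y \leftrightarrow x \preceq_X x \land (y \preceq_X y \to \neg(y \prec_X x))] \land [x \prec_X y \leftrightarrow x \preceq_X y \land \neg(y \preceq_X x)]\bigr].$$
The formula $\mathrm{PRO}(X)$ formalizes properties (PWO1) and (PWO2) to express that $(D_X, \preceq_X, \prec_X)$ is a pre-ordering. By
$$\mathrm{PRWO}(X) :\Leftrightarrow \mathrm{PRO}(X) \land \mathrm{Wf}(\prec_X),$$
we denote that $(D_X, \preceq_X, \prec_X)$ is a prewellordering. Finally, we use Theorem 13.2.6 to express that the diagonalization of a prewellordering represents the fixed-point of an operator defined by a formula $F(X, x)$. We define
$$\mathrm{FXP}_F(X) :\Leftrightarrow \mathrm{PRWO}(X) \land (\forall x)(\forall y)[x \preceq_X y \leftrightarrow x \prec_X y \lor F(\{z \mid z \prec_X y\}, x)].$$
It follows from Theorem 13.2.6 that the set satisfying $\mathrm{FXP}_F(X)$ is uniquely determined by the formula $F$. For $\mathrm{FXP}_F(X)$ we therefore introduce the notations $\preceq_F$ for $\preceq_X$, $\prec_F$ for $\prec_X$ and $\Phi_F$ for $D_X$. Now we have all the material to define $(\Pi^0_1\text{–FXP})_0$.

13.3.1 Definition The theory $(\mathrm{ACA})_0$ is a second-order theory$^{2}$ in the language of arithmetic which comprises:
• All axioms of NT.
• The axiom $(\mathrm{Ind})^2$ of Mathematical Induction.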
$^{2}$ In the weak sense of Sect. 4.6.
• The scheme of arithmetical comprehension
$$(\Delta^1_0\text{–CA})\qquad (\exists X)(\forall x)[x \mathrel{\varepsilon} X \leftrightarrow F(x)]$$
where $F(x)$ is an arbitrary first-order formula.

The theory $(\Pi^0_1\text{–FXP})_0$ is a second-order theory whose nonlogical axioms comprise all axioms of $(\mathrm{ACA})_0$ and
• for every $\Pi^0_1$-formula $F(X, x)$, in which $X$ is the only set variable, the axiom
$$(\mathrm{FXP}(F))\qquad (\exists X)[\mathrm{FXP}_F(X)].$$

13.3.2 Exercise Show that $(\Pi^0_1\text{–FXP})_0 \vdash (\exists! X)[\mathrm{FXP}_F(X)]$.
13.4 $\mathrm{ID}_1$ as Sub-Theory of $(\Pi^0_1\text{–FXP})_0$

To obtain a lower bound for the proof-theoretical ordinal of $(\Pi^0_1\text{–FXP})_0$ we want to show that $\mathrm{ID}_1$ is a sub-theory of $(\Pi^0_1\text{–FXP})_0$. The easiest way to obtain that is to use the fact that all positive-inductively definable relations are $\Pi^1_1$-definable. Then, we use the $\omega$-Completeness Theorem (Theorem 5.4.9) in the form that a $\Pi^1_1$-relation is valid if and only if the associated search tree is well-founded. By Theorem 6.5.5, we know that the well-foundedness of a primitive recursively definable tree is expressible by a relation which is positive-inductively definable by a $\Pi^0_1$-formula. Finally we only have to check that $(\Pi^0_1\text{–FXP})_0$ also comprises all positive $\Pi^0_1$-definable fixed-points.

13.4.1 Definition The theory $(\Pi^1_1\text{–CA})^-_0$ is a second-order theory whose nonlogical axioms are:
• all axioms of $(\mathrm{ACA})_0$;
• the scheme of parameter free $\Pi^1_1$-comprehension
$$(\Pi^1_1\text{–CA})^-\qquad (\exists X)(\forall x)[x \mathrel{\varepsilon} X \leftrightarrow F(x)]$$
where $F(x)$ is a $\Pi^1_1$-formula which must not contain free set parameters.

13.4.2 Theorem The theory $\mathrm{ID}_1$ is embeddable into $(\Pi^1_1\text{–CA})^-_0$.

Proof We first have to embed the language. Let $F(X, \vec{x})$ be an $X$-positive arithmetical formula which contains only the shown variables free. We define
$$I(F) := \{\vec{x} \mid (\forall X)[(\forall \vec{y})[F(X, \vec{y}) \to \vec{y} \mathrel{\varepsilon} X] \to \vec{x} \mathrel{\varepsilon} X]\}.$$
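Then $I(F)$ is a set by $(\Pi^1_1\text{–CA})^-$. Replacing all occurrences of $I_F$ in the language $L(\mathrm{ID})$ by $I(F)$ we obtain $L(\mathrm{ID})$ as a sub-language of second-order arithmetic. If $F(x)$ is a formula of $L(\mathrm{ID})$ then $\{x \mid F(x)\}$ is arithmetical in finitely many sets $I(F_i)$ and, since we have arithmetical comprehension with set parameters, also a set in $(\Pi^1_1\text{–CA})^-_0$. Any instance of the scheme of Mathematical Induction in the language $L(\mathrm{ID})$ is therefore a specialization of the axiom of Mathematical Induction in $(\Pi^1_1\text{–CA})^-_0$. All defining axioms for primitive recursive functions are of course also axioms of $(\Pi^1_1\text{–CA})^-_0$. So it remains to show that $\mathrm{ID}_1^1$ and $\mathrm{ID}_1^2$ are derivable in $(\Pi^1_1\text{–CA})^-_0$.

To prove first $\mathrm{ID}_1^2$ let $F(X, \vec{x})$ be an $X$-positive arithmetical formula. Put $M_F(X) :\Leftrightarrow (\forall \vec{x})[F(X, \vec{x}) \to \vec{x} \mathrel{\varepsilon} X]$. By definition of $I(F)$, we obtain
$$\vec{x} \mathrel{\varepsilon} I(F) \leftrightarrow (\forall X)[M_F(X) \to \vec{x} \mathrel{\varepsilon} X]. \tag{i}$$
If $G(\vec{x})$ is an $L(\mathrm{ID})$-formula then $S := \{\vec{x} \mid G(\vec{x})\}$ is a set in $(\Pi^1_1\text{–CA})^-_0$. Therefore (i) implies
$$M_F(S) \to \vec{x} \in I(F) \to \vec{x} \mathrel{\varepsilon} S \tag{ii}$$
and therefore
$$(\forall \vec{y})[F(S, \vec{y}) \to \vec{y} \mathrel{\varepsilon} S] \to (\forall \vec{x})[\vec{x} \mathrel{\varepsilon} I(F) \to \vec{x} \mathrel{\varepsilon} S].$$
This is the translation of $\mathrm{ID}_1^2$. To prove also the translation of $\mathrm{ID}_1^1$ we obtain from (i)
$$M_F(X) \to (\forall \vec{x})[\vec{x} \mathrel{\varepsilon} I(F) \to \vec{x} \mathrel{\varepsilon} X]. \tag{iii}$$
By the $X$-positivity of $F(X, \vec{x})$ we obtain from (iii)
$$M_F(X) \to F(I(F), \vec{x}) \to F(X, \vec{x}) \tag{iv}$$
for any tuple $\vec{x}$ of free variables. But
$$M_F(X) \to F(X, \vec{x}) \to \vec{x} \mathrel{\varepsilon} X \tag{v}$$
holds true by definition of $M_F(X)$ and from (v) and (iv), we obtain
$$F(I(F), \vec{x}) \to M_F(X) \to \vec{x} \mathrel{\varepsilon} X \tag{vi}$$
which entails $F(I(F), \vec{x}) \to (\forall X)[M_F(X) \to \vec{x} \mathrel{\varepsilon} X]$, i.e.,
$$F(I(F), \vec{x}) \to \vec{x} \in I(F). \tag{vii}$$
The universal closure of (vii) is axiom $\mathrm{ID}_1^1$.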
We check next that positive inductive $\Pi^0_1$-definitions are obtainable in $(\Pi^0_1\text{–FXP})_0$. Let $F(X, x)$ be a $\Pi^0_1$-formula. Then, we have by $\mathrm{FXP}(F)$ a prewellordering $X$ satisfying $\mathrm{FXP}_F(X)$. To simplify notations, we denote the prewellordering by $(\preceq_F, \prec_F)$ and put
$$\Phi_F^{x} := \{y \mid y \preceq_F x\} \quad\text{and}\quad \Phi_F^{<x} := \{y \mid y \prec_F x\}$$
as well as $\Phi_F := \{x \mid x \preceq_F x\}$. If $F(X, x)$ is an $X$-positive $\Pi^0_1$-formula, we obtain
$$F(\Phi_F, x) \to x \mathrel{\varepsilon} \Phi_F \tag{13.11}$$
and
$$(\forall y)[F(\{z \mid G(z)\}, y) \to G(y)] \to \Phi_F \subseteq \{z \mid G(z)\}. \tag{13.12}$$
To prove (13.11), assume $F(\Phi_F, x)$ and $\neg(x \mathrel{\varepsilon} \Phi_F)$. Then $\Phi_F = \Phi_F^{<x}$. But $F(\Phi_F^{<x}, x)$ implies $x \preceq_F x$. Hence $x \mathrel{\varepsilon} \Phi_F$, contradicting our assumption. To prove (13.12), we show
$$(\forall y)[F(\{z \mid G(z)\}, y) \to G(y)] \to \Phi_F^{x} \subseteq \{z \mid G(z)\}$$
by induction on $\prec_F$. By induction hypothesis, we have
$$(\forall y)[F(\{z \mid G(z)\}, y) \to G(y)] \to \Phi_F^{<x} \subseteq \{z \mid G(z)\}$$
which by $X$-positivity of $F(X, x)$ implies
$$(\forall y)[F(\{z \mid G(z)\}, y) \to G(y)] \to (\forall x)[F(\Phi_F^{<x}, x) \to F(\{z \mid G(z)\}, x)].$$
Hence $(\forall y)[F(\{z \mid G(z)\}, y) \to G(y)] \to \Phi_F^{x} \subseteq \{z \mid G(z)\}$ by $\mathrm{FXP}(F)$. Since this holds for all $x$, we obtain finally $(\forall y)[F(\{z \mid G(z)\}, y) \to G(y)] \to \Phi_F \subseteq \{z \mid G(z)\}$.

13.4.3 Theorem The theory $(\Pi^1_1\text{–CA})^-_0$ is a sub-theory of $(\Pi^0_1\text{–FXP})_0$.
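Proof We have only to show that parameter free $\Pi^1_1$-comprehension is provable in $(\Pi^0_1\text{–FXP})_0$. Let $(\forall X)F(X, x)$ be a $\Pi^1_1$-formula without further free set parameters. Inspecting the proofs of the Syntactical and Semantical Main Lemma (Lemmas 5.4.7 and 5.4.8) we see that these proofs are easily formalizable in $(\Pi^0_1\text{–FXP})_0$ if we replace induction on $\mathrm{otyp}(s)$ by bar induction. According to both lemmas we get $(\forall X)[F(X, n)]$ if and only if $S^{\omega}_{\{F(X,n)\}}$ is well-founded. The search tree $S^{\omega}_{\{F(X,n)\}}$ is defined by a primitive recursive formula. The formula
$$G(Y, y, n) :\Leftrightarrow (\forall z)[y^\frown\langle z\rangle \in S^{\omega}_{\{F(X,n)\}} \to y^\frown\langle z\rangle \mathrel{\varepsilon} Y]$$
is therefore a $Y$-positive $\Pi^0_1$-formula. According to Theorem 6.5.5 (which is provable in $(\Pi^0_1\text{–FXP})_0$) the search tree $S^{\omega}_{\{F(X,n)\}}$ is well-founded if and only if $\langle\rangle \mathrel{\varepsilon} \Phi_{G(n)}$. Hence
$$(\Pi^0_1\text{–FXP})_0 \vdash (\forall X)[F(X, n)] \leftrightarrow \langle\rangle \mathrel{\varepsilon} \Phi_{G(n)}. \tag{i}$$
By arithmetical comprehension, we obtain $\{y \mid \langle\rangle \mathrel{\varepsilon} \Phi_{G(y)}\}$ as a set which shows
$$(\Pi^0_1\text{–FXP})_0 \vdash (\exists Z)(\forall y)\bigl[y \mathrel{\varepsilon} Z \leftrightarrow (\forall X)[F(X, y)]\bigr].$$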
Summing up, we obtain the following theorem.

13.4.4 Theorem The theory $\mathrm{ID}_1$ is a sub-theory of $(\Pi^0_1\text{–FXP})_0$.

This implies that not only $\Pi^0_1$-definable but any arithmetically definable positive inductive definition is obtainable in $(\Pi^0_1\text{–FXP})_0$. As another consequence we obtain

13.4.5 Theorem $\psi(\varepsilon_{\Omega+1}) = \|\mathrm{ID}_1\| \le \|(\Pi^1_1\text{–CA})^-_0\| \le \|(\Pi^0_1\text{–FXP})_0\|$.
13.5 The Upper Bound for $\|(\Pi^0_1\text{–FXP})_0\|$

In Sect. 11.5, we have shown that the first-order language of arithmetic can be regarded as a sublanguage of set theory. We cannot directly extend that to the second-order language of $(\Pi^0_1\text{–FXP})_0$ because we know that the second-order variables of the theory $(\Pi^0_1\text{–FXP})_0$, which contains $(\Pi^1_1\text{–CA})^-_0$ as sub-theory, range over a domain which contains $\Pi^1_1$-sets. The attempt to translate second-order variables in the language of arithmetic by quantifiers ranging over subsets of $\omega$ must fail because in theories like $\mathrm{KP}\omega$ or $(\Pi_2\text{–REF})$, which have $L_{\omega_1^{CK}}$ as least constructible model, these quantifiers range over sets in $\mathrm{Pow}(\omega) \cap L_{\omega_1^{CK}}$, a narrower class than the class of $\Pi^1_1$-definable sets which corresponds to the class of sets which are $\Sigma_1$-definable over $L_{\omega_1^{CK}}$.

Therefore, we have to extend $(\Pi_2\text{–REF})$ to a second-order theory $(\Pi_2\text{–REF})^2$. Second-order variables, again denoted by capital Latin letters, are supposed to range over classes.

13.5.1 Definition The theory $(\Pi_2\text{–REF})^2$ is a second-order theory which comprises the axioms BST of Basic Set Theory in which the scheme of foundation is replaced by the axiom
$$(\mathrm{Found})^2\qquad (\forall X)\bigl[X \ne \emptyset \to (\exists x \in X)[(\forall y \in x)(y \notin X)]\bigr]$$
together with the comprehension scheme
$$(\mathrm{CS})\qquad (\exists X)(\forall y)[y \in X \leftrightarrow F(y)],$$
where $F(y)$ is a first-order formula not containing $X$, and the scheme $(\Pi_2\text{–REF})$.

The second-order variables in $(\Pi_2\text{–REF})^2$ range over classes which are first-order definable. Therefore, every model of $(\Pi_2\text{–REF})$ can easily be extended to a model of $(\Pi_2\text{–REF})^2$ which shows that $(\Pi_2\text{–REF})^2$ is a conservative extension of $(\Pi_2\text{–REF})$. The theories $(\Pi_2\text{–REF})$ and $(\Pi_2\text{–REF})^2$ have therefore the same $\Sigma$-ordinal.

We obtain a translation of the language of second-order arithmetic into the language of second-order set theory by translating first-order quantifiers $(Qx)[\cdots]$ into
$(Qx \in \omega)[\cdots]$ and second-order quantifiers $(\forall X)[\cdots]$ into $(\forall X)[X \subseteq \omega \to \cdots]$ and $(\exists X)[\cdots]$ into $(\exists X)[X \subseteq \omega \land \cdots]$. If we want to emphasize the difference between formulas in the language of arithmetic and formulas in the language of set theory, we denote the translation of an arithmetical formula $F$ by $F^{\omega}$, but to simplify notations we mostly identify $F$ and $F^{\omega}$. It will be clear from the context (or inessential) whether we mean the arithmetical formula or its translation into the language of set theory.

For any arithmetical formula $F(X, x, i_1, \ldots, i_n)$ we get by Lemma 11.5.2 a $\Sigma$-function symbol $I_F$ such that $(\Pi_2\text{–REF})$ proves
$$I_F(\alpha, i_1, \ldots, i_n) = \{x \in \omega \mid F(I_F^{<\alpha}, x, i_1, \ldots, i_n)\} \cup I_F^{<\alpha} \tag{13.13}$$
for
$$I_F^{<\alpha} := \{z \in \omega \mid (\exists\beta < \alpha)[z \in I_F(\beta, i_1, \ldots, i_n)]\}.$$
Put
$$I_F^{\alpha} := \{x \in \omega \mid (\exists\xi \le \alpha)[x \in I_F(\xi, i_1, \ldots, i_n)]\}$$
and define
$$x \preceq_F y :\Leftrightarrow (\exists\alpha)[x \in I_F^{\alpha} \land y \notin I_F^{<\alpha}]$$
as well as
$$x \prec_F y :\Leftrightarrow (\exists\alpha)[x \in I_F^{\alpha} \land y \notin I_F^{\alpha}].$$
The pair $(\preceq_F, \prec_F)$ are the stage comparison relations defined in (13.2) on p. 336. Let
$$\begin{aligned} X_F := \{x \in \omega \mid{} & \mathrm{Seq}(x) \land \mathrm{lh}(x) = 2 \land \mathrm{Seq}((x)_1) \land \mathrm{lh}((x)_1) = 2 \land (x)_0 \le 1 \\ & \land ((x)_0 = 0 \to (x)_{1,0} \preceq_F (x)_{1,1}) \land ((x)_0 = 1 \to (x)_{1,0} \prec_F (x)_{1,1})\}. \end{aligned}$$
We have shown in Sect. 11.5 that all primitive recursive functions on $\omega$ are definable in $\mathrm{KP}\omega$ and thus also in $(\Pi_2\text{–REF})^2$. Therefore, we obtain $X_F$ as a class in $(\Pi_2\text{–REF})^2$. We observe
$$x \preceq_F x \leftrightarrow (\exists\alpha)[x \in I_F^{\alpha}] \leftrightarrow (\exists\alpha)[x \in I_F(\alpha, i_1, \ldots, i_n)]$$
and define
$$I_F^{<\infty} := \{x \in \omega \mid (\exists\alpha)[x \in I_F(\alpha, i_1, \ldots, i_n)]\} = \{x \in \omega \mid x \preceq_F x\}. \tag{13.14}$$
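We have to show that $(\Pi_2\text{–REF})^2$ is strong enough to prove that $\preceq_F$ and $\prec_F$ fulfill properties (PWO1)–(PWO3) and (FIX) of Theorem 13.2.6. It is obvious that $\prec_F$ and $\preceq_F$ are transitive binary relations. It is also obvious that $\prec_F$ is well-founded. So, we have property (PWO3). Moreover, we can prove
$$\begin{aligned} x \preceq_F y &\leftrightarrow (\exists\alpha)[x \in I_F^{\alpha} \land (\forall\beta)[y \in I_F^{\beta} \to \alpha \le \beta]] \\ &\leftrightarrow x \in I_F^{<\infty} \land (y \in I_F^{<\infty} \to \neg(y \prec_F x)) \end{aligned}$$
which is (PWO1). In addition, we prove
$$\begin{aligned} x \prec_F y &\leftrightarrow (\exists\alpha)[x \in I_F^{\alpha} \land y \notin I_F^{<\alpha}] \land (\forall\beta)[y \in I_F^{\beta} \to x \in I_F^{<\beta}] \\ &\leftrightarrow x \preceq_F y \land \neg(y \preceq_F x) \end{aligned}$$
which is (PWO2). It remains to check (FIX). Clearly, we can define
$$|x|_F := \min\{\alpha \mid x \in I_F^{\alpha}\}$$
for $x \in I_F^{<\infty}$ and obtain $I_F^{|x|_F} = \{y \mid y \preceq_F x\}$ as well as $I_F^{<|x|_F} = \{y \mid y \prec_F x\}$. For $x \in I_F^{<\infty}$ we therefore obtain by (13.13)
$$\begin{aligned} x \preceq_F y &\leftrightarrow x \in I_F^{|y|_F} \\ &\leftrightarrow x \in I_F^{<|y|_F} \lor F(I_F^{<|y|_F}, x) \\ &\leftrightarrow x \prec_F y \lor F(\{z \mid z \prec_F y\}, x). \end{aligned} \tag{13.15}$$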
To check (FIX) completely we have to get rid of the hypothesis $x \in I_F^{<\infty}$. However, the cardinality argument which we used in the proof of Lemma 13.1.2 is not available in $(\Pi_2\text{–REF})^2$. We need some other argument and this is the point at which we have to exploit the fact that $F$ is $\Pi^0_1$. This new argument needs a few preparations.

13.5.2 Lemma Let $A(X, x, i_1, \ldots, i_n)$ be a formula which contains at most quantifiers which are bounded by ordinals $< \omega$. Then
$$A(I_F^{<\infty}, x, i_1, \ldots, i_n) \leftrightarrow (\forall\alpha)(\exists\beta)[\alpha \le \beta \land A(I_F^{<\beta}, x, i_1, \ldots, i_n)].$$
Proof Let $\tilde{A}(X^+, X^-)$ be the formula $A(X, x, i_1, \ldots, i_n)$, where we have separated the positive and negative occurrences of $X$ and suppressed the mentioning of the parameters $x, i_1, \ldots, i_n$. Then we have
$$\xi \ge \eta \;\Rightarrow\; \tilde{A}(X^+, I_F^{<\xi}) \to \tilde{A}(X^+, I_F^{<\eta}) \tag{13.16}$$
and
$$\xi \le \eta \;\Rightarrow\; \tilde{A}(I_F^{<\xi}, X^-) \to \tilde{A}(I_F^{<\eta}, X^-). \tag{13.17}$$
First we prove
$$(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\infty}, I_F^{<\beta})] \to \tilde{A}(I_F^{<\infty}, I_F^{<\infty}). \tag{i}$$
Assume that for all $\alpha$ there is a $\beta \ge \alpha$ and $\tilde{A}(I_F^{<\infty}, I_F^{<\beta})$. We show $\tilde{A}(I_F^{<\infty}, I_F^{<\infty})$ by induction on the complexity of the formula $\tilde{A}(X^+, X^-)$. The claim is trivial if $X^-$ does not occur in $\tilde{A}(X^+, X^-)$. If $\tilde{A}(X^+, X^-)$ is a formula $s \notin X^-$ then
$$s \notin I_F^{<\beta} \to s \notin I_F^{<\infty}$$
holds by definition of $I_F^{<\infty}$, since such $\beta$ exist above every $\alpha$.

Next let $\tilde{A}(X^+, X^-)$ be a formula $\tilde{A}_1(X^+, X^-) \land \tilde{A}_2(X^+, X^-)$. From the hypothesis $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\infty}, I_F^{<\beta})]$ we obtain $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}_i(I_F^{<\infty}, I_F^{<\beta})]$ for $i \in \{1, 2\}$. Hence $\tilde{A}_1(I_F^{<\infty}, I_F^{<\infty}) \land \tilde{A}_2(I_F^{<\infty}, I_F^{<\infty})$ by the induction hypothesis.
Now assume that $\tilde{A}(X^+, X^-)$ is a formula $\tilde{A}_1(X^+, X^-) \lor \tilde{A}_2(X^+, X^-)$. Then we get from $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\infty}, I_F^{<\beta})]$ that $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}_i(I_F^{<\infty}, I_F^{<\beta})]$ holds true for at least one $i \in \{1, 2\}$ because otherwise there would exist an $\alpha_1$ and an $\alpha_2$ such that $\tilde{A}_i(I_F^{<\infty}, I_F^{<\beta})$ would be false for all $\beta \ge \alpha_i$. But then the disjunction $\tilde{A}_1(I_F^{<\infty}, I_F^{<\beta}) \lor \tilde{A}_2(I_F^{<\infty}, I_F^{<\beta})$ would be false for all $\beta \ge \max\{\alpha_1, \alpha_2\}$. By the induction hypothesis, we therefore obtain $\tilde{A}_i(I_F^{<\infty}, I_F^{<\infty})$ for at least one $i \in \{1, 2\}$ which entails $\tilde{A}(I_F^{<\infty}, I_F^{<\infty})$.

In case that $\tilde{A}(X^+, X^-)$ is a formula $(\forall y \in n)\tilde{A}_0(X^+, X^-, y)$ we obtain from the hypothesis $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\infty}, I_F^{<\beta})]$ that $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}_0(I_F^{<\infty}, I_F^{<\beta}, y)]$ holds true for all $y$ less than $n$. From the induction hypothesis, we therefore obtain $\tilde{A}_0(I_F^{<\infty}, I_F^{<\infty}, y)$ for all $y < n$ and this implies $\tilde{A}(I_F^{<\infty}, I_F^{<\infty})$.

If $\tilde{A}(X^+, X^-)$ is a formula $(\exists y \in n)\tilde{A}_0(X^+, X^-, y)$ we obtain from the hypothesis that there is a $y < n$ such that $(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}_0(I_F^{<\infty}, I_F^{<\beta}, y)]$ holds true. Otherwise for every $y < n$ there would exist an $\alpha_y$ such that $\tilde{A}_0(I_F^{<\infty}, I_F^{<\beta}, y)$ becomes false for all $\beta \ge \alpha_y$ and $\tilde{A}(I_F^{<\infty}, I_F^{<\beta})$ would be false for all $\beta \ge \max\{\alpha_y \mid y < n\}$. From the induction hypothesis we therefore obtain that $\tilde{A}_0(I_F^{<\infty}, I_F^{<\infty}, y)$ holds true for some $y < n$ and this implies $\tilde{A}(I_F^{<\infty}, I_F^{<\infty})$.

As a consequence of (13.17) and (i) we have
$$(\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\beta}, I_F^{<\beta})] \to \tilde{A}(I_F^{<\infty}, I_F^{<\infty}). \tag{ii}$$
For the opposite direction we prove
$$\tilde{A}(I_F^{<\infty}, I_F^{<\infty}) \to (\exists\beta)\tilde{A}(I_F^{<\beta}, I_F^{<\infty}) \tag{iii}$$
by induction on the complexity of the formula $\tilde{A}(X^+, X^-)$. The claim is trivial if $X^+$ does not occur in $\tilde{A}(X^+, X^-)$. If $\tilde{A}(X^+, X^-)$ is a formula $s \in X^+$ then we obtain the claim from
$$s \in I_F^{<\infty} \to (\exists\xi)[s \in I_F^{<\xi}]. \tag{iv}$$
If we assume that $\tilde{A}(X^+, X^-)$ is a formula $\tilde{A}_1(X^+, X^-) \land \tilde{A}_2(X^+, X^-)$ or a formula $\tilde{A}_1(X^+, X^-) \lor \tilde{A}_2(X^+, X^-)$ we have the induction hypotheses
$$\tilde{A}_i(I_F^{<\infty}, I_F^{<\infty}) \to (\exists\beta_i)\tilde{A}_i(I_F^{<\beta_i}, I_F^{<\infty})$$
for $i = 1, 2$ or $i \in \{1, 2\}$ and obtain by (13.17) $\tilde{A}(I_F^{<\infty}, I_F^{<\infty}) \to (\exists\beta)[\tilde{A}(I_F^{<\beta}, I_F^{<\infty})]$ with $\beta := \max\{\beta_1, \beta_2\}$ or $\beta := \beta_i$ as witness.

If $\tilde{A}(X^+, X^-)$ is a formula $(\exists y \in n)\tilde{A}_0(X^+, X^-, y)$ then there is a $y < n$ and by induction hypothesis a $\beta$ such that $\tilde{A}_0(I_F^{<\beta}, I_F^{<\infty}, y)$. Hence $(\exists\beta)\tilde{A}(I_F^{<\beta}, I_F^{<\infty})$.

If $\tilde{A}(X^+, X^-)$ is a formula $(\forall y \in n)\tilde{A}_0(X^+, X^-, y)$ then there is for every $y < n$ by induction hypothesis a $\beta_y$ such that $\tilde{A}_0(I_F^{<\beta_y}, I_F^{<\infty}, y)$ and, by putting $\beta := \max\{\beta_0, \ldots, \beta_{n-1}\}$, we obtain $(\exists\beta)(\forall y \in n)\tilde{A}_0(I_F^{<\beta}, I_F^{<\infty}, y)$. This finishes the proof of (iii). But (iii) together with (13.16) and (13.17) show
$$\tilde{A}(I_F^{<\infty}, I_F^{<\infty}) \to (\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\beta}, I_F^{<\beta})]. \tag{v}$$
From (v) and (ii) we finally obtain
$$\tilde{A}(I_F^{<\infty}, I_F^{<\infty}) \leftrightarrow (\forall\alpha)(\exists\beta)[\alpha \le \beta \land \tilde{A}(I_F^{<\beta}, I_F^{<\beta})].$$
As a corollary of the proof of Lemma 13.5.2, we obtain its relativization to limit ordinals.

13.5.3 Lemma Let $A(X, x, i_1, \ldots, i_n)$ be a formula which contains at most quantifiers which are bounded by ordinals $< \omega$. Then
$$A(I_F^{<\gamma}, x, i_1, \ldots, i_n) \leftrightarrow (\forall\alpha < \gamma)(\exists\beta < \gamma)[\alpha \le \beta \land A(I_F^{<\beta}, x, i_1, \ldots, i_n)]$$
for all limit ordinals $\gamma$.

Proof The proof is a simple relativization of the proof of Lemma 13.5.2.
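13.5.4 Lemma Let $F(X, y)$ be the $\Pi^0_1$-formula $(\forall x)A(X, x, y)$ in the language of arithmetic. Then
$$(\Pi_2\text{–REF})^2 \vdash (\forall y \in \omega)\bigl[(\forall x \in \omega)A(I_F^{<\infty}, x, y) \to (\exists\gamma)(\forall x \in \omega)A(I_F^{<\gamma}, x, y)\bigr].$$
Proof Assume
$$(\forall x \in \omega)A(I_F^{<\infty}, x, y). \tag{i}$$
By Lemma 13.5.2 we obtain
$$(\forall x \in \omega)(\forall\alpha)(\exists\beta)[\alpha \le \beta \land A(I_F^{<\beta}, x, y)]. \tag{ii}$$
The formula $A(I_F^{<\beta}, x, y)$ is $\Delta_0$ and by $\Sigma$-reflection, we obtain
$$(\forall\alpha)(\exists b)(\forall x \in \omega)(\exists\beta \in b)[\alpha \le \beta \land A(I_F^{<\beta}, x, y)]. \tag{iii}$$
From (iii) and the fact that $(\Pi_2\text{–REF})^2$ proves $(\forall\gamma)(\exists\xi)[\gamma < \xi]$ we get by $(\Pi_2\text{–REF})$ a transitive nonvoid set $z$ such that
$$(\forall\alpha \in z)(\exists b \in z)(\forall x \in \omega)(\exists\beta \in b)[\alpha \le \beta \land A(I_F^{<\beta}, x, y) \land (\forall\gamma \in z)(\exists\xi \in z)[\gamma < \xi]]. \tag{iv}$$
Then $\gamma := \sup(z \cap \mathrm{On}) \in \mathrm{Lim}$ and
$$(\forall x \in \omega)(\forall\alpha < \gamma)(\exists\beta < \gamma)[\alpha \le \beta \land A(I_F^{<\beta}, x, y)]. \tag{v}$$
From (v) and Lemma 13.5.3 we obtain
$$(\forall x \in \omega)A(I_F^{<\gamma}, x, y).$$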
13.5.5 Corollary For any $\Pi^0_1$-formula $F(X, x)$ in the language of arithmetic we obtain
$$(\Pi_2\text{–REF})^2 \vdash (\forall x \in \omega)[F(I_F^{<\infty}, x) \to x \in I_F^{<\infty}].$$
Proof We obtain $F(I_F^{<\infty}, x) \to (\exists\alpha)F(I_F^{<\alpha}, x)$ according to Lemma 13.5.4. By (13.14) this implies $x \in I_F^{<\infty}$.

Putting
$$I_F^{\infty} := \{x \in \omega \mid F(I_F^{<\infty}, x)\} \cup I_F^{<\infty}$$
we obtain
$$I_F^{<\infty} = I_F^{\infty} \tag{13.18}$$
which corresponds to (13.1) with $\Phi(X) := \{x \in \omega \mid F(X, x)\}$. Together with (13.15) we get from Corollary 13.5.5 that $\prec_F$ and $\preceq_F$ fulfill (FIX). By already shown facts, we know that (PWO1)–(PWO3) are satisfied, too, and we obtain
$$(\Pi_2\text{–REF})^2 \vdash (\exists X)[X \subseteq \omega \land \mathrm{FXP}_F(X)]. \tag{13.19}$$
13.5.6 Theorem Let $F$ be a sentence in the second-order language of arithmetic such that $(\Pi^0_1\text{–FXP})_0 \vdash F$. Then $(\Pi_2\text{–REF})^2 \vdash F^{\omega}$.

Proof It suffices to show that all axioms of $(\Pi^0_1\text{–FXP})_0$ are provable in $(\Pi_2\text{–REF})^2$. We have already seen in Sect. 11.5 that all axioms of NT, except the axiom of Mathematical Induction, are provable in $\mathrm{KP}\omega$ and thence also in $(\Pi_2\text{–REF})^2$. The axiom $(\mathrm{Found})^2$ is apparently equivalent to
$$(\forall X)\bigl[(\forall x)[(\forall y \in x)(y \in X) \to x \in X] \to (\forall x)(x \in X)\bigr]. \tag{i}$$
Pick $X \subseteq \omega$ such that $0 \in X$ and $y \in X$ implies $y + 1 \in X$, and $x \in \omega$ such that $(\forall y \in x)[y \in X]$. If $x = 0$, we get $x \in X$ by hypothesis. Otherwise, there is a $y \in \omega$ such that $x = y + 1$ which implies $y \in x$ and therefore $y \in X$ by choice of $x$. By choice of $X$, we get $x = y + 1 \in X$. From (i) we then obtain $(\forall x \in \omega)[x \in X]$ and have shown
$$(\forall X)\bigl[X \subseteq \omega \land 0 \in X \land (\forall y \in \omega)[y \in X \to y + 1 \in X] \to (\forall x \in \omega)[x \in X]\bigr]$$
which is the translation of axiom $(\mathrm{Ind})^2$. The translation of the scheme $(\Pi^1_0\text{–CA})$ of arithmetical comprehension is covered by the scheme (CS) of class comprehension. Finally, we have seen in (13.19) that $(\Pi_2\text{–REF})^2$ also proves $\mathrm{FXP}(F)^{\omega}$.

To obtain the upper bound for the proof-theoretical ordinal of $(\Pi^0_1\text{–FXP})_0$ we adapt Definition 11.7.1 to the theory $(\Pi^0_1\text{–FXP})_0$ and define the ordinal
$$\sigma^{(\Pi^0_1\text{–FXP})_0}_{\Pi^0_1} := \sup\{|n|_F + 1 \mid F(X, x) \text{ is a } \Pi^0_1\text{-formula and } (\Pi^0_1\text{–FXP})_0 \vdash n \mathrel{\varepsilon} \Phi_F\}.$$
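Using the fact that every positive inductive definition can already be defined by a $\Pi^0_1$-definable positive inductive definition (a fact which we showed in the proof of Theorem 13.4.4) we apparently have
$$\kappa^{(\Pi^0_1\text{–FXP})_0} \le \sigma^{(\Pi^0_1\text{–FXP})_0}_{\Pi^0_1}.$$
If $(\Pi^0_1\text{–FXP})_0 \vdash n \mathrel{\varepsilon} \Phi_F$ we obtain $(\Pi_2\text{–REF})^2 \vdash (\exists\alpha)[n \in I_F(\alpha)]$ by Theorem 13.5.6 and therefore $|n|_F < \|(\Pi_2\text{–REF})^2\|_\Sigma$. Hence, $\sigma^{(\Pi^0_1\text{–FXP})_0}_{\Pi^0_1} \le \|(\Pi_2\text{–REF})^2\|_\Sigma$ and we obtain
$$\|(\Pi^0_1\text{–FXP})_0\| = \kappa^{(\Pi^0_1\text{–FXP})_0} \le \sigma^{(\Pi^0_1\text{–FXP})_0}_{\Pi^0_1} \le \|(\Pi_2\text{–REF})^2\|_\Sigma = \|(\Pi_2\text{–REF})\| = \psi(\varepsilon_{\Omega+1}). \tag{13.20}$$
Summing up, we have

13.5.7 Theorem $\|(\Pi^0_1\text{–FXP})_0\| = \|(\Pi^1_1\text{–CA})^-_0\| = \|\mathrm{ID}_1\| = \psi(\varepsilon_{\Omega+1})$.

Proof By Theorem 13.4.5 and (13.20) we have
$$\psi(\varepsilon_{\Omega+1}) = \|\mathrm{ID}_1\| \le \|(\Pi^1_1\text{–CA})^-_0\| \le \|(\Pi^0_1\text{–FXP})_0\| = \kappa^{(\Pi^0_1\text{–FXP})_0} \le \sigma^{(\Pi^0_1\text{–FXP})_0}_{\Pi^0_1} \le \psi(\varepsilon_{\Omega+1}).$$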
Chapter 14
Epilogue
The leitmotif of this book was to pursue and extend Gentzen’s work in proof theory. Gentzen’s research was guided by Hilbert’s program. Therefore, it is likely that already his work in pure logic was inspired by the observation that the consistency of a formal system could only be obtained by scrutinizing deviation free derivations. According to Theorem 7.5.4, it suffices to find a single formula which is underivable within a formal system to establish its consistency. The obvious strategy for finding such a formula is to show that an atomic formula, which is not among the axioms of the formal system, cannot be the end-formula of any derivation. This, however, is only obvious if the derivation is deviation free. The natural first step in pursuing this strategy is therefore to develop a system of pure logic which is deviation free. Thus, it is likely that similar considerations led to Gentzen’s papers [29] and [30] which contain his Hauptsatz (corresponding to Theorems 4.5.1 and 4.5.2 in this book). However, this result turned out to be still insufficient to establish the consistency of pure number theory. Here the presence of nonlogical axioms, especially the axiom of Mathematical Induction, spoils the cut-elimination procedure. Later Gentzen overcame these obstacles by replacing the induction axiom by an induction rule and subsequently eliminating deviations, not in all derivations, but at least in the end-piece (“Endstück”) of derivations of quantifier free sentences in a formal system of Peano Arithmetic (cf. [30] and [32]). In this way, he could show that the empty sequent is not derivable, which entailed the consistency of pure number theory. So his results were certainly meant as a step toward carrying out Hilbert’s program. However, we believe that Gentzen was aware that his work meant more.
This belief is corroborated by his paper [31] in which he showed the unprovability of induction up to $\varepsilon_0$ in Peano arithmetic without referring to Gödel’s incompleteness theorem, i.e., without referring to consistency. Simultaneously, he proved that this bound is exact. Roughly speaking, Gentzen showed that all applications of rules of Mathematical Induction occurring in the end-piece of a derivation of an atomic formula can be unraveled into a series of cuts, which are eliminated afterwards. This unraveling procedure is not fully visible in Gentzen’s original papers, where he works with finite derivations. It becomes, however, completely plain when turning to infinitary systems as they have been systematically used by Schütte. There it becomes apparent that unraveling a formal derivation from the axioms of number theory leads to a cut free derivation of length below $\varepsilon_0$, i.e., – in our terminology – to a verification of length below $\varepsilon_0$, showing that the truth complexities of the provable formulas of number theory are below $\varepsilon_0$. This unraveling process, i.e., the process of resolving abstract axioms into an iteration of elementary principles, is the central issue of infinitary proof theory.
This aspect of infinitary proof theory becomes even more transparent in the ordinal analysis of subsystems of set theory. There, we can study how “complicated” axioms, e.g., the reflection axiom, are resolved into the verification calculus for the constructible hierarchy. The verification calculus for the constructible hierarchy has the advantage over the verification calculus for $\Pi^1_1$-sentences that it needs no “deep” theorem, like the ω-completeness theorem (Theorem 5.4.9), to prove its completeness and also no “deep” theorem, like the Boundedness theorem (Theorem 6.6.10), to read off the information kept in a verification.¹ The definition of the constructible hierarchy is locally predicative. It is a well-known fact that the notion of definability can even be replaced by a finite set of “Gödel” functions, very elementary set theoretic functions, which suffice to generate the stage $L_{\alpha+1}$ from $L_{\alpha}$. The passage from $L_{\alpha}$ to $L_{\alpha+1}$ is therefore even more elementary than displayed in this book.² Resolving formal proofs into the verification calculus for the constructible hierarchy therefore means a reduction of abstract concepts to a (transfinite) iteration of very elementary principles.
So in some respect, we moved far away from Hilbert’s original program of a finitist consistency proof. Nonetheless ordinal analysis, as performed in this book, has a strong connection to another aspect of Hilbert’s program. One of the issues of his program concerned the elimination of ideal objects (cf. [43]). Actual³ statements (in Hilbert’s setting) are statements which are directly verifiable. He compares them to the sort of physical statements which are verifiable by experiments.⁴ Statements of this type are $\Sigma^0_1$-statements with parameters, i.e., essentially $\Pi^0_2$-sentences. If a $\Pi^0_2$-statement is true we can verify every instance of it. In Sect. 10, we have shown in the paradigmatic example of the theory NT that, in case there is an ordinal analysis of the theory which proves the $\Pi^0_2$-sentences, we even have an upper bound for the number of steps which we need for the verification.⁵ We have, however, also shown in Sect. 10 that the description of this upper bound in terms of iterations of an “elementary” function needs transfinitely many steps. In this respect Hilbert’s “dream” of obtaining a justification of the infinite by finite reasoning failed. Eliminating ideal objects in a proof of an actual statement needs in general infinitely many steps.
¹ Compare Theorem 11.9.9 and its consequences in Sect. 11.9 to the ω-completeness and Boundedness Theorem.
² Using Gödel functions instead of definability in the verification calculus would have made the verification calculus unnecessarily complicated.
³ In [43] Hilbert is talking (in German) about “reale Aussagen”. In order to not confuse real objects with real numbers, we opted to translate that by “actual” objects and “actual” statements.
⁴ Cf. [43] “. . . – sowie in meiner Beweistheorie nur die realen Aussagen unmittelbar einer Verifikation fähig sind.” i.e., “. . . – as well as my proof theory only allows a direct verification of actual statements.”
⁵ This is also true for stronger systems (cf. [10]).
In some other respects, however, Hilbert’s idea can be generalized. Instead of only regarding natural numbers as actual objects, we may also count infinite ordinals among the actual objects, if they can be coded by an effectively decidable well-ordering on the natural numbers, i.e., we may regard all ordinals below $\omega_1^{CK}$ as actual objects. In this generalized setting the actual statements are represented by $\Sigma_1^{L_{\omega_1^{CK}}}$-formulas with parameters. An example for an ideal object then is the ordinal $\omega_1^{CK}$ – or rather the stage $L_{\omega_1^{CK}}$ in the constructible hierarchy – whose defining axiom is the scheme of $\Pi_2$-reflection. Our ordinal analysis of $(\Pi_2\text{–REF})$ then corresponds exactly to an elimination of ideal objects in a proof of an actual statement. In this interpretation the hypothesis in the collapsing theorem (Theorem 11.11.2) – that $\Delta$ has to be a set of $\Sigma_1$-sentences – is therefore not only of technical nature. The ordinal analysis of subsystems of second-order arithmetic carries of course the same flavor. There we computed the truth complexities of the provable $\Pi^1_1$-sentences of a theory. Since a $\Pi^1_1$-sentence in the language of arithmetic corresponds to a $\Sigma_1^{L_{\omega_1^{CK}}}$-sentence in the language of set theory, we again computed an upper bound for the length of a verification of (generalized) actual statements. Using this generalized interpretation of actual objects, we could even show that Hilbert’s idea of justifying the infinite by finite reasoning, i.e., actual reasoning in our generalization, gains some substance. As shown in Sect. 9.7, the “ideal ordinal” $\Omega$ can be alternatively interpreted by an ordinal below $\omega_1^{CK}$, i.e., by an actual object.⁶
Leaving aside these speculations about Hilbert’s program, ordinal analysis has also a genuinely mathematical aspect which we could not fully handle in this book. As shown in Sect. 9.6.1, we can extract an ordinal notation system from the iterations of the (least) controlling operator. In this sense an ordinal analysis not only computes the proof-theoretic ordinal $\|T\|$ of a theory $T$ but also provides us with a notation system for the ordinals below $\|T\|$. According to Definition 10.3.1 and the discussion in Sect. 10.6, an ordinal notation system induces a subrecursive hierarchy based on some “operator”, i.e., on a strictly increasing number theoretic function, which eventually majorizes the Skolem functions for the provable $\Pi^0_2$-sentences of $T$. Moreover, our discussion in Sect. 10.6 has also shown that this hierarchy is pretty independent of the choice of the base function.
Most combinatorial principles lead to $\Pi^0_2$-sentences. A famous example stems from Ramsey Theory. For a set $M$ denote by $[M]^k := \{S \mid S \subseteq M \wedge |S| = k\}$ the set of subsets of size $k$. Let $c$ be a cardinal. A $c$-coloring of $[M]^k$ is a map $P\colon [M]^k \to c$. A subset $H \subseteq M$ is homogeneous if $P \restriction [H]^k$ is constant, i.e., if every $k$-sized subset of $H$ gets the same color.
⁶ This follows also already from Sect. 9.6.1, by which we obtain that the order type of the ordinals in the set $B_{\varepsilon_{\Omega+1}}(\emptyset) \cap \varepsilon_{\Omega+1}$ is below $\omega_1^{CK}$.
The finite version of Ramsey’s Theorem states that for all natural numbers $k$, $c$, and $n$ you can find a natural number $R$ such that for every finite set $M$ of cardinality $\ge R$ and any $c$-coloring of $[M]^k$ there is a homogeneous subset $H \subseteq M$ of cardinality $\ge n$. This can easily be formalized as a $\Pi^0_2$-sentence. The Ramsey number $R$ then depends on $k$, $c$, and $n$, i.e., $R = R(k, c, n)$. Erdős and Rado showed that this function has a growth rate in terms of exponential towers, a size which lies within the Skolem functions of the provable $\Pi^0_2$-sentences of NT. In fact, the finite Ramsey Theorem is a theorem of Peano arithmetic. Due to an observation by Paris and Harrington (cf. [66]) the situation changes dramatically if we require that the homogeneous set is relatively large, i.e., that $\min H \le |H|$. The then emerging Paris–Harrington Ramsey function gets the growth rate of $H_{\varepsilon_0}$ (cf. Sect. 10.6). Since $H_{\varepsilon_0}$ majorizes all Skolem functions of the provable $\Pi^0_2$-sentences of NT, this shows that the Paris–Harrington version of the finite Ramsey theorem cannot be provable within NT.⁷
As indicated in Chap. 10, in the example of the theory NT, the Skolem functions for the provable $\Pi^0_2$-sentences of a theory are governed by the subrecursive hierarchy generated by the notation system for its proof-theoretical ordinal. The combinatorial content of a theory is therefore reflected by the notation system generated by its ordinal analysis. This still holds true for stronger theories (cf. e.g. [11]).
This example may illuminate the importance of ordinal analysis. There are, however, other examples of applications of ordinal analysis. One should be aware that the analysis presented in this book is still quite rough. We only looked at the truth complexities of the provable $\Pi^1_1$-sentences. A finer analysis would also take into account the derivations themselves and not only their ordinal heights. An example of such a finer analysis is in [12].
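To give a feeling for the hierarchy $H_\alpha$ whose level $H_{\varepsilon_0}$ is mentioned above, the following Python sketch evaluates a Hardy-style hierarchy for ordinals below $\omega^2$ only. It assumes the common textbook recursion $H_0(x) = x$, $H_{\alpha+1}(x) = H_\alpha(x+1)$, $H_\lambda(x) = H_{\lambda[x]}(x)$ with the fundamental sequence $(\omega\cdot(a+1))[x] = \omega\cdot a + x$; the definition used in Sect. 10.6 may differ in detail, and the function name `hardy` as well as the encoding of $\omega\cdot a + b$ by the pair `(a, b)` are purely illustrative.

```python
def hardy(a: int, b: int, x: int) -> int:
    """Evaluate H_alpha(x) for alpha = omega*a + b (an ordinal below omega^2).

    Assumed recursion (a standard textbook variant, not taken from this book):
      H_0(x)         = x
      H_{alpha+1}(x) = H_alpha(x + 1)
      H_lambda(x)    = H_{lambda[x]}(x),  with (omega*(a+1))[x] = omega*a + x
    """
    if a < 0 or b < 0 or x < 0:
        raise ValueError("a, b and x must be non-negative")
    while True:
        if b > 0:
            # b successor steps at once: H_{omega*a + b}(x) = H_{omega*a}(x + b)
            x, b = x + b, 0
        elif a > 0:
            # limit step: replace omega*a by its x-th approximation omega*(a-1) + x
            a, b = a - 1, x
        else:
            return x


if __name__ == "__main__":
    for a in range(4):
        # under the assumptions above, H_{omega*a}(x) = 2**a * x
        print(f"H_(omega*{a})(3) =", hardy(a, 0, 3))
```

Under these assumptions each limit level $\omega\cdot a$ merely doubles the argument once more, so everything below $\omega^2$ stays linear in $x$. Growth of the size of $H_{\varepsilon_0}$ only arises by iterating such diagonal steps through all ordinals below $\varepsilon_0$, which is far beyond this toy evaluator.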
A broader discussion of the importance of ordinal analysis can be found in Rathjen’s paper [80].
In this book, we restricted ourselves to the first step into impredicativity. The further steps are in general more complicated. The reason is that here we needed only one collapsing step and could afterwards argue mainly semantically (e.g., using semantical cut-elimination (Theorem 9.3.1)). The analysis of stronger theories needs more, in general infinitely many, collapsing steps. Although this meant a big obstacle in the beginning of impredicative proof theory, most of the then occurring problems are solved today. So it was for more pedagogical reasons that we restricted ourselves to the first step into impredicativity. Nevertheless, we hope that we will soon be able to present a continuation of this book which includes the further steps.
⁷ The requirement $\min H \le |H|$ of relative largeness can be generalized to $f(\min H) \le |H|$ for an arithmetical function $f$. The Paris–Harrington result holds for $f = \mathrm{Id}$. It is an extremely interesting problem to classify the “threshold” function $f$ which is on the edge between provability and unprovability in NT. This is a research project of Weiermann, who already obtained a series of spectacular results also regarding other combinatorial principles. A survey is published in [109].
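For very small parameters the notions used in the Ramsey example (colorings of $[M]^k$, homogeneous sets, relative largeness) can be checked by exhaustive search. The following Python sketch enumerates all $c$-colorings of $[M]^k$ and looks for a homogeneous set of size at least $n$, optionally requiring $f(\min H) \le |H|$. All function and parameter names are illustrative assumptions and not taken from the text, and brute force of this kind says nothing about the growth rates discussed in this chapter.

```python
from itertools import combinations, product


def is_homogeneous(coloring, subset, k):
    """True if all k-element subsets of `subset` receive the same color."""
    colors = {coloring[s] for s in combinations(sorted(subset), k)}
    return len(colors) <= 1


def has_homogeneous_subset(points, k, n, coloring, f=None):
    """Look for a homogeneous H with |H| >= n.

    If `f` is given, H must also be relatively large: f(min H) <= |H|.
    """
    pts = sorted(points)
    for size in range(n, len(pts) + 1):
        for h in combinations(pts, size):
            if f is not None and f(h[0]) > size:
                continue  # h[0] is min H because h is sorted
            if is_homogeneous(coloring, h, k):
                return True
    return False


def every_coloring_has_homogeneous(points, k, c, n, f=None):
    """Brute-force check over all c-colorings P: [M]^k -> {0, ..., c-1}."""
    k_subsets = list(combinations(sorted(points), k))
    for values in product(range(c), repeat=len(k_subsets)):
        coloring = dict(zip(k_subsets, values))
        if not has_homogeneous_subset(points, k, n, coloring, f):
            return False
    return True


if __name__ == "__main__":
    identity = lambda m: m
    # Classical instance R(2, 2, 3) = 6: monochromatic triangles in a 2-colored K_6.
    print(every_coloring_has_homogeneous(range(6), 2, 2, 3))   # True
    print(every_coloring_has_homogeneous(range(5), 2, 2, 3))   # False
    # Relative largeness with f = Id for 2-colorings of the points of {2,3,4} and {2,3,4,5}.
    print(every_coloring_has_homogeneous(range(2, 5), 1, 2, 1, identity))  # False
    print(every_coloring_has_homogeneous(range(2, 6), 1, 2, 1, identity))  # True
```

The last two calls show the effect of relative largeness in the simplest case $k = 1$: on $\{2, 3, 4\}$ the coloring that separates $2$ from $\{3, 4\}$ has no homogeneous $H$ with $\min H \le |H|$, whereas on $\{2, 3, 4, 5\}$ every 2-coloring has one.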
References
1. Ackermann, W.: Begründung des ‘Tertium non datur’ mittels der Hilbertschen Theorie der Widerspruchsfreiheit. Mathematische Annalen 93 (1924)
2. Arai, T.: Variations on a theme by Weiermann. Journal of Symbolic Logic 63, 897–925 (1998)
3. Arai, T.: Proof theory for theories of ordinals I: Recursively Mahlo ordinals. Annals of Pure and Applied Logic 122, 163–208 (2003)
4. Barwise, J.: Admissible Sets and Structures. Perspectives in Mathematical Logic. Springer-Verlag, Berlin/Heidelberg/New York (1975)
5. Barwise, J. (ed.): Handbook of Mathematical Logic. North-Holland Publishing Company, Amsterdam (1977)
6. Barwise, J., Schlipf, J.S.: On recursively saturated models of arithmetic. In: D.H. Saracino, V.B. Weispfenning (eds.) Model Theory and Algebra, no. 498 in Lecture Notes in Mathematics, pp. 42–55. Springer-Verlag, Heidelberg/New York (1975)
7. Beckmann, A.: Dynamic ordinal analysis. Archive for Mathematical Logic 42, 303–334 (2003)
8. Beckmann, A., Pohlers, W.: Application of cut-free infinitary derivations to generalized recursion theory. Annals of Pure and Applied Logic 94, 1–19 (1998)
9. Bernays, P.: Hilberts Untersuchungen über die Grundlagen der Mathematik. In: D. Hilbert (ed.) Gesammelte Abhandlungen, vol. III, pp. 196–216. Springer-Verlag, Berlin (1935)
10. Blankertz, B., Weiermann, A.: How to characterize provably total functions by the Buchholz operator method. No. 6 in Lecture Notes in Logic. Springer-Verlag, Heidelberg/New York (1996)
11. Buchholz, W.: An independence result for $\Pi^1_1$-CA + BI. Annals of Pure and Applied Logic 33, 131–155 (1987)
12. Buchholz, W.: Notation systems for infinitary derivations. Archive for Mathematical Logic 30, 277–296 (1991)
13. Buchholz, W.: A simplified version of local predicativity. In: P. Aczel, H. Simmons, S.S. Wainer (eds.) Proof Theory, pp. 115–147. Cambridge University Press, Cambridge (1992)
14. Buchholz, W., Cichon, E.A., Weiermann, A.: A uniform approach to fundamental sequences and hierarchies. Mathematical Logic Quarterly 40, 273–286 (1994)
15. Buchholz, W., Feferman, S., Pohlers, W., Sieg, W. (eds.): Iterated Inductive Definitions and Subsystems of Analysis: Recent Proof-Theoretical Studies. No. 897 in Lecture Notes in Mathematics. Springer-Verlag, Heidelberg/New York (1981)
16. Buchholz, W., Schütte, K.: Syntaktische Abgrenzungen von formalen Systemen der $\Pi^1_1$-Analysis und $\Delta^1_2$-Analysis. Bayerische Akademie der Wissenschaften, Sitzungsberichte 1980 pp. 1–35 (1981)
17. Buchholz, W., Schütte, K.: Proof Theory of Impredicative Subsystems of Analysis. No. 2 in Studies in Proof Theory, Monographs. Bibliopolis, Naples (1988)
18. Buss, S.R. (ed.): Handbook of Proof Theory. Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company (1998)
19. Buss, S.R., Hájek, P., Pudlák, P. (eds.): Logic Colloquium ’98, no. 13 in Lecture Notes in Logic. Association for Symbolic Logic, Natick, Massachusetts (2000)
20. Dedekind, R.: Gesammelte mathematische Werke, vol. III. Friedr. Vieweg & Sohn, Wiesbaden (1932)
21. Devlin, K.J.: Constructibility. Perspectives in Mathematical Logic. Springer-Verlag, Heidelberg/New York (1984)
22. Erdős, P., Rado, R.: Combinatorial theorems on classifications of subsets of a given set. Proceedings of the London Mathematical Society 28, 417–439 (1952)
23. Feferman, S.: Systems of predicative analysis. Journal of Symbolic Logic 29, 1–30 (1964)
24. Feferman, S.: Formal theories for transfinite iteration of generalized inductive definitions and some subsystems of analysis. In: Kino et al. [60], pp. 303–326
25. Friedman, H.M.: Iterated inductive definitions and $\Sigma^1_2$-AC. In: Kino et al. [60], pp. 435–442
26. Friedman, H.M.: Subsystems of set theory and analysis. Ph.D. thesis, MIT Press, Boston (1987)
27. Friedman, H.M., McAloon, K., Simpson, S.G.: A finite combinatorial principle which is equivalent to the 1-consistency of predicative analysis. In: G. Metakides (ed.) Patras Logic Symposium, no. 109 in Studies in Logic and the Foundations of Mathematics, pp. 197–230. North-Holland Publishing Company, Amsterdam (1982)
28. Gentzen, G.: Untersuchungen über das logische Schließen I. Mathematische Zeitschrift 39, 176–210 (1934)
29. Gentzen, G.: Untersuchungen über das logische Schließen II. Mathematische Zeitschrift 39, 405–431 (1935)
30. Gentzen, G.: Die Widerspruchsfreiheit der reinen Zahlentheorie. Mathematische Annalen 112, 493–565 (1936)
31. Gentzen, G.: Beweisbarkeit und Unbeweisbarkeit von Anfangsfällen der transfiniten Induktion in der reinen Zahlentheorie. Mathematische Annalen 119, 140–161 (1943)
32. Gentzen, G.: Der erste Widerspruchsfreiheitsbeweis für die klassische Zahlentheorie. Archiv für Mathematische Logik und Grundlagenforschung 16, 97–118 (1974)
33. Girard, J.Y.: Une extension de l’interprétation de Gödel à l’analyse et son application à l’élimination des coupures dans l’analyse et la théorie des types. In: J.E. Fenstad (ed.) Proceedings of the 2nd Scandinavian Logic Symposium, no. 63 in Studies in Logic and the Foundations of Mathematics, pp. 63–92. North-Holland Publishing Company, Amsterdam (1971)
34. Girard, J.Y.: $\Pi^1_2$-logic I. Dilators. Annals of Mathematical Logic 21, 75–219 (1981)
35. Girard, J.Y.: Proof Theory and Logical Complexity, vol. 1. Bibliopolis, Naples (1987)
36. Gödel, K.: Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik 37, 349–360 (1930)
37. Gödel, K.: Über formal unentscheidbare Sätze der ‘Principia Mathematica’ und verwandter Systeme. Monatshefte für Mathematik und Physik 38, 173–198 (1931)
38. Gödel, K.: Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes. Dialectica 12, 280–287 (1958)
39. Herbrand, J.: Recherches sur la théorie de la démonstration. Société des Sciences et des Lettres de Varsovie, Sciences Mathématiques et Physiques 33, 128 (1930)
40. Herbrand, J.: Sur la non-contradiction de l’arithmétique. Journal für die reine und angewandte Mathematik 166 (1930)
41. Hilbert, D.: Neubegründung der Mathematik. Abhandlungen aus dem Math. Seminar d. Hamb. Univ. I, 157–177 (1922)
42. Hilbert, D.: Über das Unendliche. Mathematische Annalen 95, 161–190 (1926)
43. Hilbert, D.: Die Grundlagen der Mathematik. Vortrag gehalten auf Einladung des Mathematischen Seminars im Juli 1927 in Hamburg. Hamburger Mathematische Einzelschriften 5. Heft, 1–21 (1928)
44. Hilbert, D.: Die Grundlegung der elementaren Zahlenlehre. Mathematische Annalen 104, 485–494 (1931)
45. Hinman, P.G.: Recursion-Theoretic Hierarchies. Perspectives in Mathematical Logic. Springer-Verlag, Heidelberg/New York (1978)
46. Howard, W.A.: Assignment of ordinals to terms for primitive recursive functionals of finite type. In: Kino et al. [60], pp. 443–458
47. Howard, W.A.: A system of abstract constructive ordinals. Journal of Symbolic Logic 37, 355–374 (1972)
48. Jäger, G.: Die konstruktible Hierarchie als Hilfsmittel zur beweistheoretischen Untersuchung von Teilsystemen der Mengenlehre und Analysis. Dissertation, Ludwig-Maximilians-Universität, Munich (1979)
49. Jäger, G.: A well ordering proof for Feferman’s theory $T_0$. Archiv für Mathematische Logik und Grundlagenforschung 23, 65–77 (1983)
50. Jäger, G.: The strength of admissibility without foundation. Journal of Symbolic Logic 49, 867–879 (1984)
51. Jäger, G.: A version of Kripke-Platek set theory which is conservative over Peano arithmetic. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 30, 3–9 (1984)
52. Jäger, G.: Theories for Admissible Sets. A Unifying Approach to Proof Theory. No. 2 in Studies in Proof Theory, Lecture Notes. Bibliopolis, Naples (1986)
53. Jäger, G.: Metapredicative and explicit Mahlo: a proof-theoretic perspective. In: R. Cori, A. Razborov, S. Todorcevic, C. Wood (eds.) Logic Colloquium ’00, Lecture Notes in Logic, vol. 19, pp. 272–293. AK Peters, Wellesley, MA (2005)
54. Jäger, G.: Reflections on reflections in explicit mathematics. Annals of Pure and Applied Logic 66 (2005)
55. Jäger, G., Pohlers, W.: Eine beweistheoretische Untersuchung von ($\Delta^1_2$-CA) + (BI) und verwandter Systeme. Bayerische Akademie der Wissenschaften, Sitzungsberichte 1982 pp. 1–28 (1983)
56. Jäger, G., Schütte, K.: Eine syntaktische Abgrenzung der ($\Delta^1_1$-CA)-Analysis. Bayerische Akademie der Wissenschaften, Sitzungsberichte 1979 pp. 15–34 (1979)
57. Jäger, G., Strahm, T.: Upper bounds for metapredicative Mahlo in explicit mathematics and admissible set theory. Journal of Symbolic Logic 66(2), 935–958 (2001)
58. Jech, T.J.: The Axiom of Choice. No. 75 in Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, Amsterdam (1973)
59. Jech, T.J.: Set Theory. Academic Press, New York (1978)
60. Kino, A., Myhill, J., Vesley, R.E. (eds.): Intuitionism and Proof Theory, Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, Amsterdam (1970)
61. Kreisel, G.: Generalized inductive definitions. Stanford Report, mimeographed section III (1963)
62. Kreisel, G.: Notes concerning the elements of proof theory. Lecture Notes, University of California, Los Angeles, Los Angeles, CA (1967, 1968)
63. Lorenzen, P.: Algebraische und logistische Untersuchungen über freie Verbände. Journal of Symbolic Logic 16, 81–106 (1951)
64. Mostowski, A.: On ω-models which are not β-models. Fundamenta Mathematicae 65, 83–93 (1969)
65. Neumann, J.v.: Zur Hilbertschen Beweistheorie. Mathematische Annalen 26 (1927)
66. Paris, J.B., Harrington, L.: A mathematical incompleteness in Peano arithmetic. In: Barwise [5], pp. 1133–1142
67. Peano, G.: Arithmetices principia, nova methodo exposita. Bocca, Torino (1889)
68. Pohlers, W.: An upper bound for the provability of transfinite induction. In: J. Diller, G.H. Müller (eds.) ⊨ ISILC Proof Theory Symposium, no. 500 in Lecture Notes in Mathematics, pp. 271–289. Springer-Verlag, Heidelberg/New York (1975)
69. Pohlers, W.: Ordinals connected with formal theories for transfinitely iterated inductive definitions. Journal of Symbolic Logic 43, 161–182 (1978)
70. Pohlers, W.: Cut-elimination for impredicative infinitary systems I. Ordinal analysis for $\mathrm{ID}_1$. Archiv für Mathematische Logik und Grundlagenforschung 21, 113–129 (1981)
71. Pohlers, W.: Proof-theoretical analysis of $\mathrm{ID}_\nu$ by the method of local predicativity. In: Buchholz et al. [15], pp. 261–357
72. Pohlers, W.: Cut elimination for impredicative infinitary systems II. Ordinal analysis for iterated inductive definitions. Archiv für Mathematische Logik und Grundlagenforschung 22, 69–87 (1982)
73. Pohlers, W.: Proof Theory. An Introduction. No. 1407 in Lecture Notes in Mathematics. Springer-Verlag, Berlin/Heidelberg/New York (1989)
74. Pohlers, W.: Subsystems of set theory and second order number theory. In: Buss [18], pp. 209–335
75. Prawitz, D.: Hauptsatz for higher order logic. Journal of Symbolic Logic 33, 452–457 (1968)
76. Probst, D.: On the relationship between fixed points and iteration in admissible set theory without foundation. Archive for Mathematical Logic 44(5), 561–580 (2005)
77. Rathjen, M.: Proof-theoretic analysis of KPM. Archive for Mathematical Logic 30, 377–403 (1991)
78. Rathjen, M.: Eine Ordinalzahlanalyse der $\Pi_3$-Reflexion. Habilitationsschrift, Westfälische Wilhelms-Universität, Münster (1992)
79. Rathjen, M.: Proof theory of reflection. Annals of Pure and Applied Logic 68, 181–224 (1994)
80. Rathjen, M.: The realm of ordinal analysis. In: S. Cooper, J. Truss (eds.) Sets and Proofs, pp. 219–279. Cambridge University Press (1999)
81. Rathjen, M.: An ordinal analysis of parameter free $\Pi^1_2$-comprehension. Archive for Mathematical Logic 48/3, 263–362 (2005)
82. Rathjen, M.: An ordinal analysis of stability. Archive for Mathematical Logic 48/2, 1–62 (2005)
83. Richter, W.H., Aczel, P.: Inductive definitions and reflecting properties of admissible ordinals. In: J.E. Fenstad, P.G. Hinman (eds.) Generalized Recursion Theory I, no. 79 in Studies in Logic and the Foundations of Mathematics, pp. 301–381. North-Holland Publishing Company, Amsterdam (1974)
84. Rogers jun., H.: Theory of Recursive Functions and Effective Computability. McGraw-Hill Book Company, New York (1967)
85. Schütte, K.: Beweistheoretische Untersuchungen der verzweigten Analysis. Mathematische Annalen 124, 123–147 (1952)
86. Schütte, K.: Syntactical and semantical properties of simple type theory. Journal of Symbolic Logic 25, 305–326 (1960)
87. Schütte, K.: Eine Grenze für die Beweisbarkeit der transfiniten Induktion in der verzweigten Typenlogik. Archiv für Mathematische Logik und Grundlagenforschung 7, 45–60 (1965)
88. Schütte, K.: Predicative well-orderings. In: J.N. Crossley, M.A.E. Dummett (eds.) Formal Systems and Recursive Functions, Studies in Logic and the Foundations of Mathematics, pp. 280–303. North-Holland Publishing Company, Amsterdam (1965)
89. Schütte, K.: Proof Theory. No. 225 in Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Heidelberg/New York (1977)
90. Schwichtenberg, H.: Proof theory: Some applications of cut-elimination. In: Barwise [5], pp. 867–895
91. Schwichtenberg, H., Troelstra, A.: Basic Proof Theory. No. 43 in Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, UK (1996)
92. Shoenfield, J.R.: Mathematical Logic. Addison-Wesley Publishing Company, Reading (1967)
93. Sieg, W.: Hilbert’s Programs: 1917–1922. Bulletin of Symbolic Logic 5, 1–44 (1999)
94. Simpson, S.G.: Subsystems of Second Order Arithmetic. Springer-Verlag, Berlin/Heidelberg/New York (1999)
95. Spector, C.: Provably recursive functionals of analysis: a consistency proof of analysis by an extension of principles formulated in current intuitionistic mathematics. In: J.C.E. Dekker (ed.) Recursive Function Theory, no. 5 in Proceedings of Symposia in Pure Mathematics, pp. 1–27. American Mathematical Society, Providence (1962)
96. Szabo, M.E. (ed.): The Collected Papers of Gerhard Gentzen. North-Holland Publishing Company, Amsterdam (1969)
97. Tait, W.W.: A non constructive proof of Gentzen’s Hauptsatz for second order predicate logic. Bulletin of the American Mathematical Society 66, 980–983 (1966)
98. Tait, W.W.: Intensional interpretations of functionals of finite type. Journal of Symbolic Logic 32/2, 198–212 (1967)
99. Tait, W.W.: Applications of the cut elimination theorem to some subsystems of classical analysis. In: Kino et al. [60], pp. 475–488
100. Takahashi, M.o.: A proof of cut-elimination in simple type theory. Journal of the Mathematical Society of Japan 19, 399–410 (1967)
101. Takeuti, G.: On a generalized logic calculus. Japanese Journal of Mathematics 24, 149–156 (1953)
102. Takeuti, G.: Consistency proofs of subsystems of classical analysis. Annals of Mathematics 86, 299–348 (1967)
103. Takeuti, G.: Proof Theory, 2. edn. No. 81 in Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, Amsterdam (1987)
104. Takeuti, G., Yasugi, M.: Reflection principles of subsystems of analysis. In: H.A. Schmidt, K. Schütte, H.J. Thiele (eds.) Contributions to Mathematical Logic, Studies in Logic and the Foundations of Mathematics, pp. 255–273. North-Holland Publishing Company, Amsterdam (1968)
105. Takeuti, G., Yasugi, M.: The ordinals of the systems of second order arithmetic with the provably $\Delta^1_2$-comprehension axiom and with the $\Delta^1_2$-comprehension axiom respectively. Japanese Journal of Mathematics 41, 1–67 (1973)
106. Weiermann, A.: How to characterize provably total functions by local predicativity. Journal of Symbolic Logic 61, 52–69 (1996)
107. Weiermann, A.: Sometimes slow growing is fast growing. Annals of Pure and Applied Logic 83, 199–223 (1997)
108. Weiermann, A.: What makes a (pointwise) subrecursive hierarchy slow growing? In: B. Cooper, J.K. Truss (eds.) Sets and Proofs, no. 258 in London Mathematical Society Lecture Notes Series, pp. 403–423. Cambridge University Press, Cambridge (1999)
109. Weiermann, A.: Analytic combinatorics, proof-theoretic ordinals and phase transitions for independence results. Annals of Pure and Applied Logic 136, 189–218 (2005)
110. Weyl, H.: Das Kontinuum. Veit & Co., Leipzig (1918)
111. Weyl, H.: Der Circulus vitiosus in der heutigen Begründung der Mathematik. Jahresbericht der Deutschen Mathematiker-Vereinigung pp. 85–92 (1919)
112. Weyl, H.: Über die neue Grundlagenkrise der Mathematik. Mathematische Zeitschrift 10, 39–79 (1921)
113. Woodin, W.H.: The continuum hypothesis, Part I. Notices of the AMS 48/6, 567–576 (2001)
114. Woodin, W.H.: The continuum hypothesis, Part II. Notices of the AMS 48/7, 681–690 (2001)
115. Zucker, J.I.: Iterated inductive definitions, trees, and ordinals. In: A.S. Troelstra (ed.) Metamathematical Investigation of Intuitionistic Arithmetic and Analysis, no. 344 in Lecture Notes in Mathematics, pp. 392–453. Springer-Verlag, Heidelberg/New York (1973)
Index
Notations CHAPTER 2 S, 9 Ckn , 9 Pkn , 9 Sub, 10 Rec, 10 λ xi . f (x1 , . . . , xn ), 15 λ x . t(x), 15 CHAPTER 3 ω1 , 19 ω1CK , 19 Tran(M), 20 sup M, 22 α , 22 ω , 23 field(R), 23 F(a), 23 dom(F), 23 rng(F), 23 Fa, 23 V , 23 F: N −→ p M, 23 otyp≺ (s), 25 ω1CK , 25 µ≺ S, 25 en≺ , 26 otyp(M), 26
enM , 26 Onα , 28 α + ξ , 28 α = β , 32 ε0 , 33 Fix ( f ), 35 ∆α <κ Xα , 36 ϕα , 37 Cr (α ), 37 ϕξ ,η , 40 CHAPTER 4 (t1 , . . . ,tn ) ε U, 48 FU ({x1 , . . . , xn Au1 ,...,un (x1 , . . . , xn )}), 49 FU (A), 50 F(A), 50 M |=L F, 50 |= F, 50 m ∆ , 54 T
[ f ](m), 56 otypT (s), 57 otyp(T ), 57 R( ∆ ), 57 S M∆ , 57 T
T
F, 61
IDEN , 61 363
364 m
T T ∆ , 62 rnk(F), 65 CHAPTER 5 ∼F, 74 Diag(N), 74 –type , 75 –type , 75 CS(F), 75 α F, 76 α ∆ , 77 (STR), 78 S ω∆ , 79
F∇ , 81 tc F , 82
CHAPTER 6 MΓ , 85 ω1 , 85 ΦF [a1 , . . . , am ], 87 |F|, 88 κ N , 89 Acc(X, x, ≺), 90 Accα≺ , 90 |s|Acc≺ , 90 FT (X, x), 91 IT , 91 ITα , 91 ClF (X), 92 Prog(≺, X), 96 TI(≺), 100 Wf (X), 103 CHAPTER 7 PA, 105 L (PA), 105 (IND), 105 2 PA , 106 (Ind)2 , 106 L (NT), 106 NT, 106 (∆10 –CA), 107 (∆01 –CA), 107 (ACA)0 , 108
Index α ρ ∆ , 112 ϕσn1 (ξ ), 116
α ⊆ X, 119 Prog(X), 119 TI(α , X), 119 F({x G(x)}), 119 J (X), 120 Prf T (i, v), 126 ∆0 -formula, 127 T , 127 CHAPTER 8 α NTω1 ρ ∆ , 135 α ∗ , 135 ∆ω1 (α ), 136 (∆10 –CA) + (BR), 137 Kσ , 138 TIσ (α ), 138 Jp(S), 140 h(α ), 145 N(α ), 145 CHAPTER 9 L (ID), 157 IF , 157 ID1 , 157 (ID11 ), 158 (ID12 ), 158 ClF (G), 158 κ ID1 , 158 L∞ (NT), 160 α t ε IF , 162 α t ε IF , 162 Ω , 162 SC(α ), 167 B0 (X), 167 par(G), 168 par(∆ ), 168 α H ρ ∆ , 168 Hα , 173 ψ (α ), 173 H Ω –type, 175 B(X), 181 ψ (α ), 181
Index
365
Γ , 182 Γξ , 182 Bα , 182 Bnα , 182 αnf , 183 δ (α ), 183 K(ξ ), 186 ϕ¯ , 186 SC, 187 H, 187 On, 187 K(a), 187 a ≺ b, 187 a = b, 187 |a|O , 187 α <0 β , 189 Acc, 189 M, 189 α <1 β , 189 Progi (F), 189 TIi (α , F), 189 AccΩ , 192 α SC , 196 ∆n , 200 lh(α ), 200 CHAPTER 10 f (z1 , . . . , zn ) $ g(z1 , . . . , zn ), 208 {e}n (x), 208 par(∆ ) for ∆ ⊆ L (NT), 210 F f (X), 210 OF (X), 211 F
α ρ
∆ , 211
F ≤ G, 211 ψF (α ), 213 α &F β , 214 PRH (F ), 231 SFΠ 0 (T ), 231 α
2
F , 232 Pαp (x), 234 CHAPTER 11 Pow(a), 239 ZFC, 239
(Ext), 240 (Pair), 240 (Union), 240 (Sep), 240 (Coll), 240 (Inf), 240 (FOUND), 240 (Found), 241 rnkL (a), 245 (∆0 –Sep), 247 (∆0 –Coll), 247 BST, 247 (Ext’), 247 (Pair’), 247 (Union’), 247 (∆0 –Sep), 248 (∆0 –Coll), 248 KP, 248 (Inf’), 248 KPω , 248 F a , 249 ∗ KP , 252 ClB,a (X), 259 IB,a , 259 KPN, 261
GF , 267
(Π2 –REF), 268 (Π2 –REF), 269 stg(t), 270 Tα , 270 <ST , 270 stg(F), 270 –type for Σ -sentences of LRS , 271 –type for Σ -sentences of LRS , 271 i(G), 272 α H ρ ∆ for ∆ ⊆ LRS , 278 ◦ ( ) , 279 ( )◦ , 280 drnk(∆ ), 280 a ∈ b, 280 a∈ / b, 280 (Str), 281 (Taut), 281 (Sent), 281 A((v1 , . . . , vn )), 287 CHAPTER 12 − KP , 298 T + , 298 (Ax)M , 299
(Ax)L , 300 <α , 302 <ρ
D (α ,β ) , 310 D (m,n) , 320 (FOUND(X)), 321 KPn , 323
CHAPTER 13 Φα , 333 Φ<α , 333 Φ<∞ , 333 |Φ|, 334 |n|Φ , 334 x Φ y, 336 x ≺Φ y, 336 D , 339 Wf (X), 341 PO(X), 341 X , 341 ≺X , 341 PRO(X), 341 FXPF (X), 341 (Π01 –FXP)0 , 341 (ACA)0 , 341 0 σ (Π1 –FXP)0 , 350
Key–words
∆ -formula, 245 ∆ -relation symbol, 252 L-rank of a set, 245 M-completeness theorem, 299 Π2 -formula, 267 Π20 -ordinal of a theory, 231 Σ -function symbol, 252 Σ -recursion theorem, 254 α -verifiable, 76 κ -club, 27 ε -numbers, 33 ∨-Exportation, 81, 114, 169 ω -completeness theorem, 81 ∧-Inversion, 113 Hardy-hierarchy, 234 Kleene-bracket, 208 LRS -expression, 275 LRS -term atomic, 270 Peano arithmetic, 105 (Π2 –REF), 269 Σ -ordinal of KPω , 266 TAIT, 73 TAIT-language, 73 Bachmann–Howard ordinal, 181 Kripke–Platek set theory, 248 absoluteness, 243 accessible parts, 90 admissible extension, 298 admissible ordinal, 248 admissible set, 248 arithmetical transfinite recursion, 155 assignment propositional, 123 asymmetric interpretation, 310 atom propositional, 123 autonomous closure of an ordinal, 135 axiom of choice, 239
of collection, 238 of foundation, 239, 241 of pairing, 238 of power–set, 239 of replacement, 239 of separation, 238 of union, 238 Axiom β , 262 axiom system, 61 categorical for N, 106 bar induction, 56 bar recursion, 56 Basic Elimination Lemma, 115 Basic Set Theory, 247 basis of a subrecursive hierarchy, 230 Boundedness Lemma, 172, 278 cardinal, 27 characteristic function, 12 characteristic sequence, 75 of a Lκ -formula , 132 of an LRS -sentence, 272 chaste semi-formal system, 305 theory, 318 class, 237 transitive, 20 class terms as constituents of second-order logic, 67 in KPω , 252 in second-order logic, 49 closed in a regular κ , 27 closure ordinal of a nonmonotone operator, 334 of an inductive definition, 86 Collapsing Lemma, 176 collection for Σ -formulas, 250 restricted to ∆0 -formulas, 247
collection axiom, 238 Completeness Theorem for countable segments of L, 307 Condensation Lemma, 192 conjunctive type, 75 Controlled Tautology, 179 course of values of a function, 14 critical formula, 78 Deduction Theorem, 50, 51 definable from a set, 242 defining axioms for IF , 157 for primitive recursive functions, 106 derivation rank of a multiset, 280 derivative of a class, 35 of a function, 35 Detachment Rule, 140 diagonal intersection, 36 diagonalization of a reflexive relation, 339 diagram of the structure N, 74 disjunctive type, 75 domain of a function, 23 of a relation, 23
formal rule, 99 formal system, 99 Π11 -sound, 100 formula X-positive, 88 absolute, 243 arithmetical, 69 critical, 112 critical of a clause, 78 downwards persistent, 243 first-order, 50 logically valid, 50 propositionally valid, 124 provable in a theory, 61 satisfiable, 50 upwards persistent, 243 valid in a structure, 50 formulas logically equivalent, 50 foundation axiom, 241 Foundation Lemma, 294 foundation scheme, 240 Foundation Theorem, 294 function, 23 ω1CK -recursive, 267 partial, 23, 208 partial recursive, 208 provably ω1CK -recursive, 267 fundamental sequences, 232
Elimination Theorem, 116 enumerating function of a well-founded relation, 26 equivalent orderings, 19 evaluation of a partial recursive function term, 208 of a primitive recursive function term, 10 extended identity operator, 279 extension of operators, 211
Generalized Induction, 180 generating class for the provable Π20 -sentences, 231 global model of AxΩ , 203 Graph of a function, 267
field of a relation, 23 fixed-point the, 85 of a nonmonotone operator, 333
identity axioms, 61, 107 incompleteness, 122 index of a partial recursive function, 208
hierarchy constructible, 242 subrecursive, 230 von Neumann, 241 hydra, 63
induction, 107 Induction Lemma, 111 inductive definition nonmonotone, 333 positive, 88 inductive norm, 90 induced by a nonmonotone operator, 334 interpretation asymmetric, 310 good, 196 good relative to an ordinal, 196 Inversion Lemma, 169, 212 jump, 121 leaf of a tree, 56 length of a generating class, 231 limit ordinal, 23 linear ordering, 18 local model of AxΩ , 203 logical consequence, 50 Monotonicity Lemma, 180 Mostowski collapse of a well-founded relation, 25 Mostowski collapsing function, 25 multisets, 279 nesting property, 233 norm associated to a prewellordering, 337 on a set, 334 norm of an ordinal, 145, 210 Number Theory NT, 106 operator definable, 87 induced by a formula, 87 positive, 88 strongly increasing, 214 ordered pair of sets, 243 ordinal α -critical, 37, 39
additively indecomposable, 29 admissible, 248 finite, 23 limit, 23 multiplicatively indecomposable, 33 principal, 29 regular, 27 successor, 22, 23 ordinal analysis, 5 Partial recursive function term, 208 path in a tree, 56 power–set axiom, 239 Predicative Elimination Lemma, 116 prewellordering, 337 Primitive recursive function terms, 10 primitive recursive hull, 231 proof-theoretic ordinal of a formal system, 100 of a theory, 100 propositionally closed, 124 Provably recursive functions of a theory, 209 pseudo Π11 -sentence, 77 range of a relation, 23 rank of a formula, 65 recursor, 106 redex of a finite sequence of formulas, 57 Reduction Lemma, 170 reductive proof theory, 130 relation, 23 positive-inductively definable, 88 replacement Σ , 251 strong Σ , 251 restriction of a function, 23 of a relation, 23 rule permissible, 124 propositional, 124 scheme of ∆0 -collection, 248
of ∆0 -separation, 248 of Π11 -comprehension, 342 of arithmetical comprehension, 108, 342 of foundation, 240, 248 of full comprehension, 68 of Mathematical Induction, 105, 107 search tree, 79 semi-formal provability, 112 semi-formal system, 112 chaste, 305 with additional axioms, 302 sentence, 50 of LRS , 270 sentences numerically equivalent, 108 sentential rule, 282 separation ∆ , 251 restricted to ∆0 -formulas, 247 separation axiom, 238 sequence number, 14 set Φ closed, 84 admissible, 248 admissible above ω , 248 arithmetical, 72 hereditarily transitive, 21 positive-inductively definable, 88 transitive, 20 set of natural numbers definable, 71 signature of a first-order language, 48 Skolem-hull operator, 167 Cantorian closed, 167 transitive, 167 slice of a set, 88, 341 stage of an LRS -formula, 270 of an LRS -term, 270 stage comparison relations, 336 standard interpretation of an ordinal term, 196
strongly critical components, 167 Structural Lemma, 113 structural rule, 113, 168, 211 for the first-order TAIT calculus, 54 for the verification calculus, 78 substitution operator, 106 subtree, 56 successor, 22, 106 sum of ordinals, 28 symmetric, 32 symbols for primitive recursive functions, 106 symmetric product, 216 symmetric sum of ordinals, 32 Tait calculus for pure first-order logic, 54 Tait language of a language L , 53 Tautology Lemma, 110 tautology rule, 282 term, 47 closed, 50 The theory (Π01 –FXP)0 , 341 theory, 61 Π11 -sound, 100 transfinite induction, 22 transitive closure of a set, 252 transitive set, 20 tree, 56 tree-ordering, 63 truth complexity for arithmetical sentences, 76 for a (pseudo) Π11 -sentence, 82 unbounded in a regular κ , 27 universal closure of a formula, 106 verification calculus, 76, 77 well-ordering, 18 Zermelo–Fraenkel set theory, 239