Logic Colloquium 2005
LECTURE NOTES IN LOGIC

A Publication of The Association for Symbolic Logic

This series serves researchers, teachers, and students in the field of symbolic logic, broadly interpreted. The aim of the series is to bring publications to the logic community with the least possible delay and to provide rapid dissemination of the latest research. Scientific quality is the overriding criterion by which submissions are evaluated.

Editorial Board
Anand Pillay, Managing Editor, Department of Pure Mathematics, School of Mathematics, University of Leeds
Lance Fortnow, Department of Computer Science, University of Chicago
Shaughan Lavine, Department of Philosophy, The University of Arizona
Jeremy Avigad, Department of Philosophy, Carnegie Mellon University
Vladimir Kanovei, Institute for Information Transmission Problems, Moscow
Steffen Lempp, Department of Mathematics, University of Wisconsin

See end of book for a list of the books in the series. More information can be found at http://www.aslonline.org/books-lnl.html.
LECTURE NOTES IN LOGIC 28

Logic Colloquium 2005

Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, Held in Athens, Greece, July 28–August 3, 2005

Edited by
COSTAS DIMITRACOPOULOS Department of History and Philosophy of Science University of Athens
LUDOMIR NEWELSKI Mathematical Institute Wroclaw University
DAG NORMANN Department of Mathematics University of Oslo
JOHN R. STEEL Department of Mathematics and Computer Science University of California, Berkeley
association for symbolic logic
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521884259 © Association for Symbolic Logic 2007 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published in print format 2007 eBook (EBL) ISBN-13 978-0-511-35476-2 ISBN-10 0-511-35476-2 eBook (EBL) hardback ISBN-13 978-0-521-88425-9 hardback ISBN-10 0-521-88425-X
Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
CONTENTS
Introduction
Speakers and Titles
Jan A. Bergstra, Inge Bethke and Alban Ponse, Thread algebra and risk assessment services
Mário J. Edmundo, Covering definable manifolds by open definable subsets
Sergei S. Goncharov, Isomorphisms and definable relations on computable models
Deirdre Haskell, Independence for types in algebraically closed valued fields
Eric Jaligot, Simple groups of finite Morley rank
Hannes Leitgeb, Towards a logic of type-free modality and truth
Justin Tatch Moore, Structural analysis of Aronszajn trees
Sara Negri, Proof analysis in non-classical logics
Charles Parsons, Paul Bernays’ later philosophy of mathematics
Greg Restall, Proofnets for S5: Sequents and circuits for modal logic
Helmut Schwichtenberg, Recursion on the partial continuous functionals
Michael Sheard, A transactional approach to the logic of truth
Dieter Spreen, On some problems in computable topology
Sergei Tupailo, Monotone inductive definitions and consistency of New Foundations
INTRODUCTION
The 2005 European Summer Meeting of the Association for Symbolic Logic was held in Athens, Greece, July 28–August 3, 2005. The meeting was called Logic Colloquium 2005. Its sessions took place in the building of the Department of Mathematics of the University of Athens, except the opening one, which took place in the Main Building. It was attended by 198 participants (and 25 accompanying persons) from 29 different countries.

The organizing body was the Inter-Departmental Graduate Program in Logic and Algorithms (MPLA) of the University of Athens, the National Technical University of Athens and the University of Patras. Financial support was provided by the Association for Symbolic Logic, the Athens Chamber of Commerce and Industry, the Bank of Greece, the Graduate Program in Logic and Algorithms, IVI Loutraki Water Co., the Hellenic Parliament, Katoptro Publications, Kleos S. A., the Ministry of National Education and Religious Affairs, Mythos Beer Co., the National and Kapodistrian University of Athens, the National Bank of Greece and Sigalas Wine Co.

The Program Committee consisted of Chi Tat Chong (Singapore), Costas Dimitracopoulos (Athens), Hartry Field (New York), Gerhard Jäger (Bern), George Metakides (Patras), Ludomir Newelski (Wrocław), Dag Normann (Oslo), Rohit Parikh (New York), John Steel (Berkeley), Stevo Todorčević (Paris), John Tucker (Swansea), Frank Wagner (Lyon) and Stan Wainer (Leeds, Chair).

The Organizing Committee consisted of Dionysios Anapolitanos (Athens), Costas Dimitracopoulos (Athens, Chair), Lefteris Kirousis (Patras), George Koletsos (Athens), Michael Mytilinaios (Athens), Stavros Papastavridis (Athens), Thanases Pheidas (Iraklio), Panos Rondogiannis (Athens), George Stavrinos (Athens), Anneta Synachopoulos (Athens), Thanases Tzouvaras (Thessaloniki) and Stathis Zachos (Athens).

The program of the meeting is listed on the following pages.
All invited speakers were invited to submit a paper to the proceedings volume, but not all did. The submissions were all refereed and the editors would like to sincerely thank the referees for their work. The editors would like to express their deep gratitude to the Alexander S. Onassis Public Benefit Foundation for generously providing a grant towards the cost of publication of this volume.

The Editors
Costas Dimitracopoulos, Athens
Ludomir Newelski, Wrocław
Dag Normann, Oslo
John Steel, Berkeley
SPEAKERS AND TITLES
Tutorial Speakers
Peter Aczel, Constructive set theory. University of Manchester, UK.
Itay Ben-Yaacov, Model theory in positive and continuous logics. University of Wisconsin, Madison, USA.
Phokion G. Kolaitis, Constraint satisfaction, complexity, and logic. I.B.M. Almaden Research Center and U.C.S.C., USA.
Greg Restall, Proofnets for S5: Sequents and circuits for modal logic. University of Melbourne, Australia.
Plenary Speakers
Jan A. Bergstra, Inge Bethke and Alban Ponse, Thread algebra and risk assessment services. University of Amsterdam, The Netherlands.
Sergei S. Goncharov, Isomorphisms and definable relations on computable models. Novosibirsk State University, Russia.
Deirdre Haskell, Independence for types in algebraically closed valued fields. McMaster University, Hamilton, Ontario, Canada.
Eric Jaligot, Simple groups of finite Morley rank. University of Lyon 1, France.
Justin Tatch Moore, Structural analysis of Aronszajn trees. Boise State University, Idaho, USA.
André Nies, Algebras with finite descriptions. University of Auckland, New Zealand.
Charles Parsons, Paul Bernays’ later philosophy of mathematics. Harvard University, Cambridge, Massachusetts, USA.
Helmut Schwichtenberg, Recursion on the partial continuous functionals. University of Munich, Germany.
Michael Sheard, A transactional approach to the logic of truth. Saint Lawrence University, Canton, New York, USA.
Sergei Tupailo, Monotone inductive definitions and consistency of New Foundations. Tallinn University of Technology, Estonia, and Ohio State University, USA.
Klaus Weihrauch, Computable analysis. University of Hagen, Germany.
Jindrich Zapletal, Forcing idealized. University of Florida, Gainesville, USA.
Special Sessions

Computability in Analysis
Vasco Brattka, Computability on non-separable Banach spaces. University of Cape Town, South Africa.
Dieter Spreen, On some problems in computable topology. University of Siegen, Germany.

Computer Science Logic
Wiebe van der Hoek, Dynamic epistemic logic. University of Liverpool, UK.
Stephan Kreutzer, Gaifman’s theorem and approximation schemes. Humboldt University of Berlin, Germany.

Model Theory
Mário J. Edmundo, Covering definable manifolds by open definable subsets. University of Lisbon, Portugal.
Piotr Kowalski, Projective D-varieties over a Hasse field. University of Wrocław, Poland.

Philosophical Logic
Hannes Leitgeb, Towards a logic of type-free modality and truth. University of Salzburg, Austria, and Stanford University, La Jolla, California, USA.
Sara Negri, Proof analysis in non-classical logics. University of Helsinki, Finland.
THREAD ALGEBRA AND RISK ASSESSMENT SERVICES
JAN A. BERGSTRA, INGE BETHKE, AND ALBAN PONSE
Abstract. Threads as contained in a thread algebra emerge from the behavioral abstraction from programs in an appropriate program algebra. Threads may make use of services such as stacks, and a thread using a single stack is called a pushdown thread. Equivalence of pushdown threads is decidable. Using this decidability result, an alternative to Cohen’s impossibility result on virus detection is discussed and some results on risk assessment services are proved.
Logic Colloquium ’05. Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

§1. Introduction. This paper is about thread algebra [1, 5]. Threads are processes tailored to describe sequential program behaviour and emerge from the behavioral abstraction of sequential programs. A basic thread models a finite program behaviour to be controlled by some execution environment: upon each action (e.g., a request for some service), a reply true or false from the environment determines further execution. Any execution trace of a basic thread ends either in the (successful) termination state or in the deadlock state. Both these states are modeled as special thread constants. Regular threads extend basic threads by comprising loop behaviour, and are reminiscent of flowcharts [14, 12].

Threads may make use of services, i.e., devices that control (part of) their execution by consuming actions, providing the appropriate reply, and suppressing observable activity. Regular threads using the service of a single stack are called pushdown threads. Apart from the distinction between deadlock and termination, pushdown threads are comparable to pushdown automata or pushdown processes as described by Stirling [17] or Burkart and Steffen [9].

First, we recall from our companion paper [2] that equivalence of pushdown threads is decidable, and we provide a sketch of our proof. Then we elaborate on Cohen’s impossibility result on virus detection [10] (in that 1984 paper, the term computer virus was coined). Whereas Cohen showed that a test predicate that decides whether a program executes (and spreads) a virus cannot exist, we proposed in [8] a more modest test that can be used to forecast whether the execution of a thread has no security hazard. This is decidable for regular threads (as argued in [8]), and also for shrat-safe pushdown threads (as argued in this paper). In our approach, a security hazard is modeled as the occurrence
of a certain action in a thread. We define a service SHRAT (security hazard risk assessment tool) that provides the replies to such tests. The idea is as follows: a security hazard is modeled by an action risk and the security hazard risk test as sh.ok. In case SHRAT replies true to

if sh.ok then P else Q,

P will not execute risk and execution continues with P. In the other case (reply false), Q will be executed instead because P would execute risk (there is no security hazard risk assessment of Q). A major point is whether P itself may or may not execute sh.ok tests. If P is regular, this is not a problem and we prove that SHRAT is correct. In the case that P is a pushdown thread, correctness only follows if P is shrat-safe, i.e., contains no occurrences of both sh.ok and risk (this is a decidable property).

Our approach offers an alternative to that of Cohen in his well-known paper [10] which shows the impossibility of a test action that reacts on two arguments P and Q at the same time. More precisely, Cohen considers a decision procedure D (a predicate on program texts) that determines whether a program executes (and spreads) a virus. Then Cohen’s impossibility result is established by the program C defined by

\[ C = \mathbf{if}\ \neg D(C)\ \mathbf{then}\ P\ \mathbf{else}\ Q, \]

where P executes a virus, and Q is virus-free.

§2. Threads and services. In this section we recall the definitions of basic threads and regular threads. Furthermore we discuss services that may be used by a thread, and we consider the use-operator, which defines how a thread uses a service.

2.1. Threads. Basic thread algebra [5] (see footnote 1), BTA, is tailored for the description of sequential program behaviour. Based on a finite set of actions A, it has the following constants and operators:
• the termination constant S,
• the deadlock or inaction constant D,
• for each $a \in A$, a binary postconditional composition operator $\_ \unlhd a \unrhd \_$.
We use action prefixing $a \circ P$ as an abbreviation for $P \unlhd a \unrhd P$ and take $\circ$ to bind strongest.
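The signature just listed can be sketched in Python as a small term datatype. This is only an illustration under stated assumptions: the class names `Stop`, `Deadlock` and `Post`, and the helpers `prefix` and `run`, are hypothetical and do not come from the paper. `run` plays the role of the execution environment described in §1 by feeding a fixed list of boolean replies to a finite thread.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple, Union


@dataclass(frozen=True)
class Stop:
    """Termination constant S."""


@dataclass(frozen=True)
class Deadlock:
    """Deadlock (inaction) constant D."""


@dataclass(frozen=True)
class Post:
    """Postconditional composition: perform `action`, then continue as
    `on_true` or `on_false` depending on the reply."""
    on_true: "Thread"
    action: str
    on_false: "Thread"


Thread = Union[Stop, Deadlock, Post]

S, D = Stop(), Deadlock()


def prefix(action: str, p: Thread) -> Thread:
    """Action prefixing: the same continuation p for both replies."""
    return Post(p, action, p)


def run(p: Thread, replies: Iterable[bool]) -> Tuple[List[str], Thread]:
    """Execute a finite thread against a list of replies (hypothetical helper).

    Returns the trace of performed actions and the final constant (S or D).
    The caller must supply at least as many replies as actions are performed.
    """
    trace: List[str] = []
    it = iter(replies)
    while isinstance(p, Post):
        trace.append(p.action)
        p = p.on_true if next(it) else p.on_false
    return trace, p


# a followed by a test on b: terminate on reply true, deadlock on reply false
example = prefix("a", Post(S, "b", D))
```

For instance, `run(example, [True, False])` performs `a` and `b` and ends in the deadlock constant, because the reply to `b` is false.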
Footnote 1. In [4], basic thread algebra is introduced under the name basic polarized process algebra.

The operational intuition behind thread algebra is that each action represents a command which is to be processed by the execution environment of a thread. More specifically, an action is taken as a command for a service offered by the environment. The processing of a command may involve a change of state of this environment. At completion of the processing of the command, the service concerned produces a reply value true or false to the
thread under execution. The thread $P \unlhd a \unrhd Q$ will then proceed as P if the processing of a yielded the reply true indicating successful processing, and it will proceed as Q if the processing of a yielded the reply false.

BTA can be equipped with a partial order and an approximation operator in the following way:
1. $\sqsubseteq$ is the partial ordering on BTA generated by the clauses
(a) for all $P \in \mathrm{BTA}$, $D \sqsubseteq P$, and
(b) for all $P_1, P_2, Q_1, Q_2 \in \mathrm{BTA}$, $a \in A$, $P_1 \sqsubseteq Q_1 \ \&\ P_2 \sqsubseteq Q_2 \Rightarrow P_1 \unlhd a \unrhd P_2 \sqsubseteq Q_1 \unlhd a \unrhd Q_2$.
2. $\pi : \mathbb{N} \times \mathrm{BTA} \to \mathrm{BTA}$ is the approximation operator determined by the equations
(a) for all $P \in \mathrm{BTA}$, $\pi(0, P) = D$,
(b) for all $n \in \mathbb{N}$, $\pi(n+1, S) = S$, $\pi(n+1, D) = D$, and
(c) for all $P, Q \in \mathrm{BTA}$, $n \in \mathbb{N}$, $\pi(n+1, P \unlhd a \unrhd Q) = \pi(n, P) \unlhd a \unrhd \pi(n, Q)$.
We further write $\pi_n(P)$ instead of $\pi(n, P)$. The operator $\pi$ finitely approximates every thread in BTA. That is, for all $P \in \mathrm{BTA}$,
\[ \exists n \in \mathbb{N}\quad \pi_0(P) \sqsubseteq \pi_1(P) \sqsubseteq \cdots \sqsubseteq \pi_n(P) = \pi_{n+1}(P) = \cdots = P. \]
Every thread in BTA is finite in the sense that there is a finite upper bound to the number of consecutive actions it can perform. Following the metric theory of [11] in the form developed as the basis of the introduction of processes in [3], BTA has a completion $\mathrm{BTA}^\infty$ which comprises also the infinite threads. Standard properties of the completion technique yield that we may take $\mathrm{BTA}^\infty$ as the cpo consisting of all so-called projective sequences. That is,
\[ \mathrm{BTA}^\infty = \{ (P_n)_{n \in \mathbb{N}} \mid \forall n \in \mathbb{N}\ (P_n \in \mathrm{BTA} \ \&\ \pi_n(P_{n+1}) = P_n) \} \]
with
\[ (P_n)_{n \in \mathbb{N}} \sqsubseteq (Q_n)_{n \in \mathbb{N}} \Leftrightarrow \forall n \in \mathbb{N}\ P_n \sqsubseteq Q_n \]
and
\[ (P_n)_{n \in \mathbb{N}} = (Q_n)_{n \in \mathbb{N}} \Leftrightarrow \forall n \in \mathbb{N}\ P_n = Q_n. \]
For a detailed account of this construction see [1]. In this cpo structure, finite linear recursive specifications represent continuous operators having as unique fixed points regular threads, i.e., threads which can only reach finitely many states. A finite linear recursive specification over BTA is a set of equations
\[ X_i = t_i(\overline{X}) \]
for $i \in I$ with I some finite index set and all $t_i(\overline{X})$ of the form S, D, or $X_{i_l} \unlhd a_i \unrhd X_{i_r}$ for $i_l, i_r \in I$.

Example 2.1.1. We define the regular threads
1. $a \circ b \circ D$,
2. $a \circ b \circ S$, and
3. $(a \circ b)^\infty$ (this informal notation is explained below)
as the fixed points for $X_1$ in the specifications
1. $X_1 = a \circ X_2$, $X_2 = b \circ X_3$, $X_3 = D$,
2. $X_1 = a \circ X_2$, $X_2 = b \circ X_3$, $X_3 = S$,
3. $X_1 = a \circ X_2$, $X_2 = b \circ X_1$,
respectively. Both $a \circ b \circ D$ and $a \circ b \circ S$ are finite threads; $(a \circ b)^\infty$ is the infinite thread corresponding to the projective sequence $(P_n)_{n \in \mathbb{N}}$ with $P_0 = D$, $P_1 = a \circ D$ and $P_{n+2} = a \circ (b \circ P_n)$. Observe that $a \circ b \circ D \sqsubseteq a \circ b \circ S$, $a \circ b \circ D \sqsubseteq (a \circ b)^\infty$, but $a \circ b \circ S \not\sqsubseteq (a \circ b)^\infty$.

Convention 2.1.2. In reasoning with finite linear recursive specifications, we shall from now on identify variables and their fixed points. For example, we say that P is the regular thread defined by $P = a \circ P$ instead of stating that P equals the fixed point for X in $X = a \circ X$.

2.2. Services. A service is a component of an execution architecture for threads that can be used to determine the reply to an action. In [6] various services (called state machines in that paper) were considered, as well as their possible role in thread execution. A service is a pair $\langle \Sigma, F \rangle$ consisting of a set $\Sigma$ of so-called co-actions and a reply function F. The reply function F of a service $\langle \Sigma, F \rangle$ is a mapping that gives for each sequence of co-actions in $\Sigma^+$ the reply produced by the service. This reply is a boolean value true or false.

Example 2.2.1 (Stack). One of the services that will occur in what follows is the stack $\mathcal{S} = \langle \Sigma, F \rangle$ with $\Sigma = \{\texttt{push:}i,\ \texttt{topeq:}i,\ \texttt{empty},\ \texttt{pop} \mid i \in I\}$ for some finite set I, where push:i pushes i onto the stack and yields reply true, the action topeq:i tests whether i is on top of the stack, empty tests whether the stack is empty, and pop pops the stack if it is non-empty with reply true and yields the reply false otherwise (leaving the stack empty).
By $\mathcal{S}(\alpha)$ we denote a stack with contents $\alpha \in I^*$ with the leftmost element of $\alpha$ on top in case $\alpha \neq \epsilon$, with $\epsilon$ the empty stack contents. In Example 3.1.1 we return to the use of a stack as a service.

In order to provide a specific description of the interaction between a thread and a service, we will use for actions the general notation c.a where c is the so-called channel or focus and a is a co-action. For example, we write s.pop to denote the action which pops a stack via channel s.
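The stack of Example 2.2.1 can be sketched as a reply function on non-empty co-action sequences. The sketch below is hedged: the function names `stack_reply` and `split_action` and the string encoding of co-actions are hypothetical, and it assumes that the tests topeq:i and empty reply true exactly when the tested condition holds, which the text suggests but does not spell out. The reply to a sequence is obtained by replaying it on a list whose leftmost element is the top of the stack.

```python
from typing import List, Sequence, Tuple


def stack_reply(coactions: Sequence[str], initial: Sequence[str] = ()) -> bool:
    """Reply of the stack to the last co-action of a non-empty sequence.

    `initial` is the initial stack contents, leftmost element on top.
    """
    if not coactions:
        raise ValueError("the reply function is defined on non-empty sequences only")
    stack: List[str] = list(initial)
    reply = False
    for co in coactions:
        if co.startswith("push:"):
            stack.insert(0, co[len("push:"):])  # push always succeeds
            reply = True
        elif co.startswith("topeq:"):
            # assumption: reply true exactly when the given element is on top
            reply = bool(stack) and stack[0] == co[len("topeq:"):]
        elif co == "empty":
            # assumption: reply true exactly when the stack is empty
            reply = not stack
        elif co == "pop":
            reply = bool(stack)  # false on the empty stack, which stays empty
            if stack:
                stack.pop(0)
        else:
            raise ValueError(f"unknown co-action: {co}")
    return reply


def split_action(action: str) -> Tuple[str, str]:
    """Split an action c.a into its channel (focus) c and its co-action a."""
    channel, coaction = action.split(".", 1)
    return channel, coaction
```

With this encoding, `split_action("s.pop")` separates the channel `s` from the co-action `pop`, and `stack_reply(["push:1", "pop", "pop"])` is false because the second pop meets an empty stack.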
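For a service $\mathcal{S} = \langle \Sigma, F \rangle$ and a finite thread P, we define P using the service $\mathcal{S}$ via channel c, notation $P /_c\, \mathcal{S}$, by the following rules:
\[
\begin{array}{rcll}
S /_c\, \mathcal{S} & = & S, & \\
D /_c\, \mathcal{S} & = & D, & \\
(P \unlhd c'.a \unrhd Q) /_c\, \mathcal{S} & = & (P /_c\, \mathcal{S}) \unlhd c'.a \unrhd (Q /_c\, \mathcal{S}) & \text{if } c' \neq c, \\
(P \unlhd c.a \unrhd Q) /_c\, \mathcal{S} & = & P /_c\, \mathcal{S}' & \text{if } a \in \Sigma \text{ and } F(a) = \mathit{true}, \\
(P \unlhd c.a \unrhd Q) /_c\, \mathcal{S} & = & Q /_c\, \mathcal{S}' & \text{if } a \in \Sigma \text{ and } F(a) = \mathit{false}, \\
(P \unlhd c.a \unrhd Q) /_c\, \mathcal{S} & = & D & \text{if } a \notin \Sigma,
\end{array}
\]
where $\mathcal{S}' = \langle \Sigma, F' \rangle$ with $F'(\sigma) = F(a\sigma)$ for all co-action sequences $\sigma \in \Sigma^+$. Note that actions that use a service $\mathcal{S}$ are not observable. The use operator is expanded to infinite threads P by stipulating
\[ P /_c\, \mathcal{S} = (\pi_n(P) /_c\, \mathcal{S})_{n \in \mathbb{N}}. \]
As a consequence, $P /_c\, \mathcal{S} = D$ if for every n, $\pi_n(P) /_c\, \mathcal{S} = D$.

Example 2.2.2. We consider again the threads $a \circ b \circ D$, $a \circ b \circ S$ and $(a \circ b)^\infty$ from Example 2.1.1 but now in the versions $c.a \circ c.b \circ D$, $c.a \circ c.b \circ S$ and $(c.a \circ c.b)^\infty$ for some channel c and service $\mathcal{S} = \langle \{a, b\}, F \rangle$. Then $(c.a \circ c.b \circ D) /_c\, \mathcal{S} = D$ and $(c.a \circ c.b \circ S) /_c\, \mathcal{S} = S$, but $(c.a \circ c.b)^\infty /_c\, \mathcal{S} = D$.

§3. Pushdown threads and decidable equivalence. In this section we consider pushdown threads, i.e., regular threads that use a stack. Then, we recall from our paper [2] that equivalence of pushdown threads is decidable and sketch a proof of this fact.

3.1. Pushdown threads. In the next example we show that the use of services may turn regular threads into non-regular ones.

Example 3.1.1. Let $\{a,\ b,\ s.\texttt{push:}1,\ s.\texttt{pop}\} \subseteq A$, where the last two actions refer to the stack $\mathcal{S}$ defined in Example 2.2.1 with $I = \{1\}$. By the defining equations for the use operator it follows that for any thread P and $\beta \in \{1\}^*$,
\[ (s.\texttt{push:}1 \circ P) /_s\, \mathcal{S}(\beta) = P /_s\, \mathcal{S}(1\beta). \]
Furthermore, it easily follows that
\[
(P \unlhd s.\texttt{pop} \unrhd S) /_s\, \mathcal{S}(\beta) =
\begin{cases}
S & \text{if } \beta = \epsilon \text{ (the empty sequence)}, \\
P /_s\, \mathcal{S}(\beta') & \text{if } \beta = 1\beta'.
\end{cases}
\]
Now consider the regular thread Q defined by (see footnote 2)
\[ Q = (s.\texttt{push:}1 \circ Q) \unlhd a \unrhd R, \qquad R = b \circ R \unlhd s.\texttt{pop} \unrhd S. \]

Footnote 2. Note that a linear recursive specification of Q requires (at least) five equations.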
Then for all $\beta \in \{1\}^*$,
\[ Q /_s\, \mathcal{S}(\beta) = ((s.\texttt{push:}1 \circ Q) \unlhd a \unrhd R) /_s\, \mathcal{S}(\beta) = (Q /_s\, \mathcal{S}(1\beta)) \unlhd a \unrhd (R /_s\, \mathcal{S}(\beta)), \]
\[ R /_s\, \mathcal{S}(1\beta) = b \circ R /_s\, \mathcal{S}(\beta), \]
\[ R /_s\, \mathcal{S}(\epsilon) = S. \]
It is not hard to see that $Q /_s\, \mathcal{S}(\epsilon)$ is an infinite thread with the property that for all $n \in \mathbb{N}$, a trace of n+1 a-actions produced by n positive and one negative reply on a is followed by n b-actions and S. This yields a non-regular thread: if $Q /_s\, \mathcal{S}(\epsilon)$ were regular, it would be a fixed point of some finite linear recursive specification, say with k equations. But specifying a trace containing k b-actions followed by S already requires k+1 linear equations $X_1 = b \circ X_2, \ldots, X_k = b \circ X_{k+1}, X_{k+1} = S$, which contradicts the assumption. So $Q /_s\, \mathcal{S}(\epsilon)$ is not regular.

We call a regular thread that uses a stack as described in Example 2.2.1 a pushdown thread. In what follows we assume that pushdown threads are given with help of a distinguished identifier from a finite linear recursive specification F and a stack over some fixed alphabet. The equations in F may contain actions that address the stack via the use-application $/_s$.

3.2. Decidable equivalence. From our companion paper [2] we quote the following result:

Theorem 3.2.1. Equivalence of pushdown threads is decidable.

This theorem follows from a reduction to the dpda-equivalence problem whose decidability was proved by Sénizergues [15, 16]. Here we provide only a sketch; a detailed proof can be found in [2]. The idea is to use a transformation from pushdown threads to dpda’s such that the identity $P /_s\, \mathcal{S}(\alpha) = Q /_s\, \mathcal{S}(\beta)$ holds if and only if the identity $L(A, P\alpha) = L(A, Q\beta)$ holds, where the latter identity expresses that for the derived dpda A, the language accepted by ‘configuration’ $P\alpha$ equals the one accepted by configuration $Q\beta$. The transformation described in [2] consists of five steps and uses the dpda-equivalence result as formulated by Stirling [18] because this is closer to our setting:
1. Transform $P /_s\, \mathcal{S}(\alpha)$ and $Q /_s\, \mathcal{S}(\beta)$ such that initially the stacks are nonempty (also if one of $\alpha$ and $\beta$ is the empty string), and such that upon their termination the stack is empty. The reason for this step stems from the fact that language acceptance for dpda’s is defined on configurations of the form $R\alpha$ where R is a ‘state’ and $\alpha$ is a non-empty stack contents. A word w is in the accepted language iff the dpda in initial state R empties the stack by performing the transitions whose labels form w.
2. Replace occurrences of D by loops that fill the stack (e.g., replace $P_i = D$ by $P_i = s.\texttt{push:}j \circ P_i$ for some $j \in I$). The reason for this step is that D has no equivalent in the dpda-equivalence result.
3. Normalize infinite traces: replace each equation $P_i = P_l \unlhd a \unrhd P_r$ by $P_i = S \unlhd b \unrhd (P_l \unlhd a \unrhd P_r)$ with b an action that does not occur in P and Q. Here S is the thread that first empties the stack and then terminates (S is also used in step 1). The reason for this step is that each infinite trace becomes interlarded with exits b, and is thus characterized by finite traces which in turn are subject to dpda language acceptance.
4. Construction of an associated pushdown automaton (pda). The specifications of the so far transformed $P(\alpha)$ and $Q(\beta)$ admit a straightforward definition of a pda whose transitions are deterministic. The only remaining problem is that the $\epsilon$-transitions (that stem from stack actions) need not pop the stack, as required by the decidability result in [18].
5. Construction of a dpda in which the $\epsilon$-transitions only pop the stack. The pda thus obtained is transformed by changing its transition rules for $\epsilon$. Those that do not pop the stack are either swallowed by an observable transition and yield a new transition rule, or form a loop, in which case they can be omitted. This step preserves language acceptance and concludes the transformation.

We will exploit this decidability result by replacing certain equations in the definition of the regular thread that underlies a pushdown thread, i.e., in the definition of P when considering $P /_s\, \mathcal{S}(\alpha)$. For example, it is decidable whether a pushdown thread is normed, i.e., has the option to terminate (to end in S): let a linear recursive specification
\[ F = \{ P_i = t_i(\overline{P}) \mid i = 1, \ldots, n \} \]
be given (and thus a repertoire of stack actions and external actions). Replace each equation $P_i = S \in F$ by $\overline{P}_i = a \circ \overline{P}_i$ and overline all remaining identifiers. Then
\[ P_k /_s\, \mathcal{S}(\alpha) \text{ is normed} \iff P_k /_s\, \mathcal{S}(\alpha) \neq \overline{P}_k /_s\, \mathcal{S}(\alpha). \]

Remark 3.2.2. Interestingly, inclusion of pushdown threads is not decidable (although two pushdown threads are equivalent if they are included in each other). This follows from a reduction to the halting problem for Minsky machines, an approach also taken in Jančar et al. [13]. A detailed proof is recorded in [2].

§4. Security hazard risk assessment. In this section we consider the possibility that a pushdown thread uses a service that supports forecasting of certain future behaviour. In [7] various such services are studied (e.g., the
halting problem and “rational agents”) and in [8] we discuss a rather specific case: a service SHRAT (security hazard risk assessment tool). In this paper we provide a detailed construction of SHRAT for regular threads and a proof of its correctness. Finally, we consider SHRAT for pushdown processes and distinguish the case of shrat-safe threads.

4.1. A definition of SHRAT. We model a security hazard in a pushdown thread P as the execution of an action risk. Furthermore, P may contain a test action sh.ok that can use the service SHRAT to forecast whether risk will be executed: SHRAT replies true to $Q \unlhd \mathit{sh.ok} \unrhd R$ if Q does not execute risk, and false if Q does execute the action risk (and then R is executed instead). In order to model forecasting, we first define the residual thread of a pushdown thread P as the thread that remains after zero or more actions of P have been executed:

Definition 4.1.1. Let P be a pushdown thread. We write $Q \in \mathit{Res}(P)$ whenever Q is a residual thread of P:
• $P \in \mathit{Res}(P)$,
• $P \in \mathit{Res}(P \unlhd a \unrhd Q)$,
• $Q \in \mathit{Res}(P \unlhd a \unrhd Q)$, and
• if $R \in \mathit{Res}(Q)$ and $Q \in \mathit{Res}(P)$, then $R \in \mathit{Res}(P)$.
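The four clauses of Definition 4.1.1 amount to a reflexive and transitive closure of the "is a branch of" relation. For the special case of a regular thread given by a finite linear recursive specification, and identifying identifiers with their fixed points as in Convention 2.1.2, this closure can be sketched as a reachability computation. The encoding of a specification as a Python dictionary and the name `residuals` are hypothetical choices for this illustration; the stack of a pushdown thread is deliberately left out.

```python
from typing import Dict, Set, Tuple, Union

# Right-hand side of a linear equation: "S", "D", or (left, action, right),
# where the triple stands for the postconditional composition of two identifiers.
Equation = Union[str, Tuple[str, str, str]]


def residuals(spec: Dict[str, Equation], start: str) -> Set[str]:
    """Identifiers of all residual threads of `start` in the specification `spec`."""
    seen = {start}              # clause 1: a thread is a residual of itself
    todo = [start]
    while todo:
        rhs = spec[todo.pop()]
        if isinstance(rhs, tuple):
            left, _action, right = rhs
            # clauses 2 and 3 give both branches; the loop gives clause 4
            for nxt in (left, right):
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return seen


# Specifications 2 and 3 of Example 2.1.1, with action prefixing written out
spec_finite = {"X1": ("X2", "a", "X2"), "X2": ("X3", "b", "X3"), "X3": "S"}
spec_loop = {"X1": ("X2", "a", "X2"), "X2": ("X1", "b", "X1")}
```

In `spec_loop`, every identifier is a residual of every other one, which reflects that the thread can only reach finitely many states.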
Of course, the very idea of a service SHRAT that supports forecasting of the execution of future actions risk in a residual thread $Q \unlhd \mathit{sh.ok} \unrhd R$ of P, thus

(1) $(Q \unlhd \mathit{sh.ok} \unrhd R) /_{sh}\, \mathrm{SHRAT}$

requires that SHRAT is aware of the specification of Q. So, a reply function that only uses the current co-action and those processed before is in this case not sufficient. It seems most natural to model that SHRAT “gets to know and analyzes” Q’s specification upon the request sh.ok in the use-application (1) above. We describe this change of state of SHRAT and the resulting reply in the following definition.

Definition 4.1.2. Let a pushdown thread P be given by some specification $F_P$ and let sh.ok be the only action in P with focus sh. Then the service SHRAT is defined by the following two properties:
(1) for any residual thread $Q \unlhd \mathit{sh.ok} \unrhd R$ of P,
\[ (Q \unlhd \mathit{sh.ok} \unrhd R) /_{sh}\, \mathrm{SHRAT} = (Q \unlhd \mathit{sh.ok} \unrhd R) /_{sh}\, \mathrm{SHRAT}(F_P, Q), \]
where $\mathrm{SHRAT}(F_P, Q)$ is the instance of SHRAT that has loaded $F_P$ and analyzed Q, and
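(2)
\[
(Q \unlhd \mathit{sh.ok} \unrhd R) /_{sh}\, \mathrm{SHRAT}(F_P, Q) =
\begin{cases}
Q /_{sh}\, \mathrm{SHRAT} \ \text{(thus reply true)} & \text{if no risk-action will be executed in } Q /_{sh}\, \mathrm{SHRAT}, \\
R /_{sh}\, \mathrm{SHRAT} \ \text{(thus reply false)} & \text{if a risk-action will be executed in } Q /_{sh}\, \mathrm{SHRAT}.
\end{cases}
\]

The (instantiated) service $\mathrm{SHRAT}(F_P, Q)$ models a “security hazard risk assessment” in the sense that if a security hazard in Q is modeled by the execution of the action risk, the reply true to $Q \unlhd \mathit{sh.ok} \unrhd R$ ensures that in the residual thread $Q /_{sh}\, \mathrm{SHRAT}$ no security hazard will occur (cf. [8]). It can be the case that $\mathrm{SHRAT}(F_P, Q)$ replies true because SHRAT will reply false to a future sh.ok-test in $Q /_{sh}\, \mathrm{SHRAT}$. For example, in the regular thread $P_1$ given and depicted below, the various sh.ok-tests are evaluated as follows:
\[
\begin{array}{lll}
P_1 = P_2 \unlhd \mathit{sh.ok} \unrhd P_8 & \text{(true)} & \qquad P_5 = b \circ P_2 \\
P_2 = P_3 \unlhd a \unrhd P_4 & & \qquad P_6 = \mathit{risk} \circ P_1 \\
P_3 = P_5 \unlhd \mathit{sh.ok} \unrhd P_6 & \text{(true)} & \qquad P_7 = c \circ P_8 \\
P_4 = P_6 \unlhd \mathit{sh.ok} \unrhd P_7 & \text{(false)} & \qquad P_8 = S.
\end{array}
\]
[Figure: graph of the thread $P_1$ with nodes $P_1$: sh.ok, $P_2$: a, $P_3$: sh.ok, $P_4$: sh.ok, $P_5$: [b], $P_6$: [risk], $P_7$: [c] and $P_8$: S, where a node [a] with a single edge to P depicts $a \circ P$, and a node a with a left edge to $P_l$ and a right edge to $P_r$ depicts $P_l \unlhd a \unrhd P_r$.]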
Clearly, the thread $T = P_1 /_{sh}\, \mathrm{SHRAT}$ satisfies $T = b \circ T \unlhd a \unrhd c \circ S$. In the next section we discuss how to instantiate SHRAT for regular threads in an appropriate way.

4.2. SHRAT for regular threads. Following Convention 2.1.2, we assume that if a regular thread $P_1$ is given, it is given by a linear recursive specification $F_{P_1}$ that contains an equation $P_1 = t_1(\overline{P})$. Furthermore, we say that an equation $P_j = P_l \unlhd a \unrhd P_r$ in $F_{P_1}$ has a predecessor if $P_j$ occurs in the right-hand side of at least one equation. Finally, we restrict to specifications $F_{P_1}$
with the property that if $P_j = P_l \unlhd \mathit{sh.ok} \unrhd P_r \in F_{P_1}$, then $l \neq r$ (otherwise, the reply to sh.ok would be meaningless).

Starting from $P_1 /_{sh}\, \mathrm{SHRAT}$ with the regular thread $P_1$ specified in $F_{P_1}$, we provide an algorithm that upon each residual thread of the form
\[ (P_m \unlhd \mathit{sh.ok} \unrhd P_j) /_{sh}\, \mathrm{SHRAT} \]
constructs an instantiated service $\mathrm{SHRAT}(F_{P_1}, P_m)$ that gives the correct reply. Typical for this algorithm is that $\mathrm{SHRAT}(F_{P_1}, P_m)$ contains a copy of $F_{P_1}$ in which all sh.ok actions are annotated with the correct reply. To this end, $F_{P_1}$ is loaded into SHRAT and analyzed as follows: number each equation that contains a risk-occurrence starting from 1. Then, for each numbered equation label each predecessor equation with the next free number until a connecting sh.ok-equation is found, or a loop occurs, or an equation without predecessors is found. In the case that some sh.ok-equation is found and connects via its true-branch, its sh.ok-action is annotated false ($\mathit{sh.ok}^{\mathit{false}}$); if it connects via its false-branch, the equation is labeled with a fresh negative number (it may possibly lead to a risk-action, namely when a false-annotation is added in a future inspection). Then this procedure is repeated for equations labeled with a negative number, again instantiating first occurrences of sh.ok-actions with false if their true-branch leads to an action risk. Finally, all non-annotated sh.ok-actions are annotated true because their true-branch does not lead to a risk-action.

In Figure 1, we illustrate how the annotation proceeds: first the two lowest sh.ok actions are annotated false, and because of the arrow, the equation of the leftmost one is labeled with a fresh negative number. The combination of the false-annotation and this label leads to the false-annotation of the topmost sh.ok-action.

Figure 1. Annotating sh.ok actions. [Diagram not reproduced: a thread graph with three sh.ok tests and two [risk] nodes, shown before and after (⇓) the three tests are annotated $\mathit{sh.ok}^{\mathit{false}}$.]

Construction of $\mathrm{SHRAT}(F_{P_1}, P_m)$ for a regular thread $P_1$. Let $F_{P_1} = \{ P_i = t_i(\overline{P}) \mid i = 1, \ldots, n \}$ be a linear specification of the regular thread $P_1$. Upon a residual thread $P_m \unlhd \mathit{sh.ok} \unrhd P_w$, the service $\mathrm{SHRAT}(F_{P_1}, P_m)$ is constructed as follows: load $F_{P_1}$ in SHRAT. We further call this copy $F^{an}_{P_1}$. Label each equation in $F^{an}_{P_1}$ that contains risk in the right-hand side with a number, starting from 1, say $1, \ldots, k$. If no risk-actions occur in $F^{an}_{P_1}$, then apply step 3 below. In the other case, apply step 1:

1. On $F^{an}_{P_1}$ apply the procedure $\mathrm{Eval}^+(1)$, where $\mathrm{Eval}^+(i)$ for $i \geq 1$ is defined as follows:
$\mathrm{Eval}^+(i)$: If the equation labeled with number i has the form
(i) $P_j = P_l \unlhd a \unrhd P_r$,
then evaluate all $P_j$ occurrences in the right-hand sides of all equations, i.e., apply steps (1a) - (1e) below exhaustively, where evaluation goes with some bookkeeping: we will in some cases give equations a next free number and possibly annotate sh.ok-actions with false. The first free positive number is k+1 and the first free negative number is −1. Furthermore, the next free number for positive numbers is the smallest p > 0 not already used, and for negative numbers the largest p < 0 not already used:
(a) No non-evaluated $P_j$ occurrences left: if there is an equation numbered i+1 then apply $\mathrm{Eval}^+(i+1)$, else, if negative numbers are used, go to step 2; if none of these is the case, go to step 3,
(b) If $P_v = P_j \unlhd \mathit{sh.ok} \unrhd P_q$, then replace sh.ok by $\mathit{sh.ok}^{\mathit{false}}$ and search the next non-evaluated $P_j$ occurrence (a possible number of this equation is preserved),
(c) If $P_v = P_q \unlhd \mathrm{sh.ok} \unrhd P_j$ and this equation is not numbered, then give it the next free negative number and search the next non-evaluated $P_j$ occurrence; else just search the next non-evaluated $P_j$ occurrence,
(d) If $P_v = P_q \unlhd \mathrm{sh.ok}^{\mathrm{false}} \unrhd P_j$ and this equation is not numbered, then give it the next free negative number and search the next non-evaluated $P_j$ occurrence; else just search the next non-evaluated $P_j$ occurrence,
(e) All remaining cases, i.e., equations of the form $P_v = P_j \unlhd b \unrhd P_q$ or $P_v = P_q \unlhd b \unrhd P_j$: if not yet numbered, give this equation the next free positive number and search the next non-evaluated $P_j$ occurrence; else, just search the next non-evaluated $P_j$ occurrence.

2. On $F^{an}_{P_1}$ apply the procedure $\mathrm{Eval}^-(-1)$, where $\mathrm{Eval}^-(i)$ for $i \leq -1$ is defined as follows:

$\mathrm{Eval}^-(i)$:
• if the equation labeled with number $i$ has the form (i) $P_j = P_l \unlhd \mathrm{sh.ok} \unrhd P_r$, then apply $\mathrm{Eval}^-(i-1)$ if there is an equation numbered $i-1$, otherwise go to step 3;
• if the equation labeled with number $i$ has the form (i) $P_j = P_l \unlhd a \unrhd P_r$ for $a \neq \mathrm{sh.ok}$ (possibly $a = \mathrm{sh.ok}^{\mathrm{false}}$), then evaluate all $P_j$ occurrences in the right-hand sides of all equations, i.e., apply steps (2a)–(2e) below exhaustively, where evaluation again goes with some bookkeeping: in some cases we give equations the next free negative number and possibly annotate sh.ok-actions with false:

(a) No non-evaluated $P_j$ occurrences left: if there is an equation numbered $i-1$ then apply $\mathrm{Eval}^-(i-1)$, else go to step 3,
(b) If $P_v = P_j \unlhd \mathrm{sh.ok} \unrhd P_q$, then replace sh.ok by $\mathrm{sh.ok}^{\mathrm{false}}$ and search the next non-evaluated $P_j$ occurrence (a possible number of this equation is preserved),
(c) If $P_v = P_q \unlhd \mathrm{sh.ok} \unrhd P_j$, then search the next non-evaluated $P_j$ occurrence,
(d) If $P_v = P_q \unlhd \mathrm{sh.ok}^{\mathrm{false}} \unrhd P_j$ and this equation is not numbered, then give it the next free negative number and search the next non-evaluated $P_j$ occurrence; else just search the next non-evaluated $P_j$ occurrence,
(e) All remaining cases, i.e., equations of the form $P_v = P_j \unlhd b \unrhd P_q$ or $P_v = P_q \unlhd b \unrhd P_j$: if not yet numbered, give this equation the next free negative number and search the next non-evaluated $P_j$ occurrence; else, just search the next non-evaluated $P_j$ occurrence.
3. Replace all sh.ok occurrences in $F^{an}_{P_1}$ that are not yet annotated by $\mathrm{sh.ok}^{\mathrm{true}}$.

Now $\mathrm{SHRAT}(F_{P_1}, P_m)$ is defined as the service that replies to the residual thread $P_m \unlhd \mathrm{sh.ok} \unrhd P_w$ with the annotation $b$ found in the right-hand side $P_m \unlhd \mathrm{sh.ok}^{b} \unrhd P_w$ of its internal specification $F^{an}_{P_1}$.

Theorem 4.2.1. Let $P_1$ be a regular thread specified by the linear recursive specification $F_{P_1}$. Then, upon each residual thread of the form $P_m \unlhd \mathrm{sh.ok} \unrhd P_w$, the tool $\mathrm{SHRAT}(F_{P_1}, P_m)$ is sound, i.e., agrees with Definition 4.1.2. Hence,
\[
(P_m \unlhd \mathrm{sh.ok} \unrhd P_w)/_{sh}\, \mathrm{SHRAT}
= (P_m \unlhd \mathrm{sh.ok} \unrhd P_w)/_{sh}\, \mathrm{SHRAT}(F_{P_1}, P_m)
= \begin{cases}
P_m /_{sh}\, \mathrm{SHRAT} & \text{if } P_m /_{sh}\, \mathrm{SHRAT} \text{ does not execute risk},\\
P_w /_{sh}\, \mathrm{SHRAT} & \text{otherwise.}
\end{cases}
\]

Proof. Assume $P_m \unlhd \mathrm{sh.ok} \unrhd P_w$ is a residual thread of $P_1$. Clearly the algorithm for $\mathrm{SHRAT}(F_{P_1}, P_m)$ terminates and $P_m \unlhd \mathrm{sh.ok}^{b} \unrhd P_w$ occurs at least once as a right-hand side in $F^{an}_{P_1}$ (in case of multiple occurrences, $b$ has the same value). We argue that the boolean $b$ is the correct reply to $(P_m \unlhd \mathrm{sh.ok} \unrhd P_w)/_{sh}\, \mathrm{SHRAT}(F_{P_1}, P_m)$.

In case $F^{an}_{P_1}$ contains no risk action, all annotations are true (step 3), which obviously is correct. In case $F^{an}_{P_1}$ contains at least one risk action, it is clear that after all $\mathrm{Eval}^+(i)$'s have been applied (step 1), all true-branches of annotated $\mathrm{sh.ok}^{\mathrm{false}}$ actions lead to risk. Furthermore, the right-hand sides of all negatively numbered equations have a sh.ok action (possibly annotated false) of which the false-branch leads to risk. At $\mathrm{Eval}^-(i)$ (step 2), the negatively numbered equations with non-annotated action sh.ok will not be annotated false (as their true-branch does not lead to risk). The remaining labeled equations all have a residual thread that may lead to risk, and thus yield next (negative) numbers until a loop occurs, or an equation without a predecessor is found, or another sh.ok that connects via its true-branch occurs (in the latter case, this action is annotated false). Hence, after step 3, all annotations are correct.

4.3. SHRAT for pushdown threads. It is not clear how to define a (terminating) algorithm for SHRAT that is correct for arbitrary pushdown threads. However, in the particular case that either no test action sh.ok or no action risk is executed by a pushdown thread $P$, the correct reply of sh.ok in $(P \unlhd \mathrm{sh.ok} \unrhd Q)/_{sh}\, \mathrm{SHRAT}$
follows easily from Theorem 3.2.1 (i.e., equivalence of pushdown threads is decidable): consider a pushdown thread $P_k /_s S(\alpha)$ where $P_k$ is specified in $F$. Assuming that the action $a'$ does not occur in $F$, define $F^a$ by replacing in $F$ each occurrence of the action $a$ by $a'$ and replacing all identifiers $P_i$ by $P_i^a$. Then $P_k /_s S(\alpha)$ does not execute $a$ if and only if $P_k /_s S(\alpha) = P_k^a /_s S(\alpha)$, so this is decidable. Note that if $P_k /_s S(\alpha) = P_k^a /_s S(\alpha)$, then for any residual thread $P_l /_s S(\beta)$ of $P_k /_s S(\alpha)$, also $P_l /_s S(\beta) = P_l^a /_s S(\beta)$.

A pushdown thread $P = P_k /_s S(\alpha)$ is called shrat-safe if either $P = P_k^{\mathrm{risk}} /_s S(\alpha)$ or $P = P_k^{\mathrm{sh.ok}} /_s S(\alpha)$. In both cases the correct reply to sh.ok in $P \unlhd \mathrm{sh.ok} \unrhd Q$ can be found:
• if $P = P_k^{\mathrm{risk}} /_s S(\alpha)$, then this reply is true, thus $(P \unlhd \mathrm{sh.ok} \unrhd Q)/_{sh}\, \mathrm{SHRAT} = P /_{sh}\, \mathrm{SHRAT}$,
• if $P = P_k^{\mathrm{sh.ok}} /_s S(\alpha)$, then both replies can occur, thus
\[
(P \unlhd \mathrm{sh.ok} \unrhd Q)/_{sh}\, \mathrm{SHRAT}
= \begin{cases}
P /_{sh}\, \mathrm{SHRAT} \ \text{(reply true)} & \text{if } P_k /_s S(\alpha) = P_k^{\mathrm{risk}} /_s S(\alpha),\\
Q /_{sh}\, \mathrm{SHRAT} & \text{otherwise,}
\end{cases}
\]
where the latter case is only meaningful if $Q$ is also shrat-safe.

Although this is a much weaker setting, it is not unreasonable to consider shrat-safe pushdown threads. This situation can always be obtained: upon a residual thread $(P \unlhd \mathrm{sh.ok} \unrhd Q)/_{sh}\, \mathrm{SHRAT}$, rename all sh.ok actions in the specification of $P$, thus ignoring their forecasting effect and evaluating both their true- and false-branches. If SHRAT then replies true, this certainly comprises a security hazard risk assessment of $P$. The only problem is that if SHRAT replies false, it is not certain that $P$ will indeed execute risk.

§5. Digression and discussion. In this paper we presented some of our latest work on thread algebra and on security hazard risk assessment (as defined in [8]). We end the paper with a few comments on the latter subject.

5.1. Architecture-sensitive services. First, we propose to call services such as SHRAT architecture-sensitive services: in case SHRAT has to reply to a thread $Q \unlhd \mathrm{sh.ok} \unrhd R$, it first needs to analyze the future behaviour of $Q$ and therefore it needs to “know” both the specification and the particular execution state. Assuming
that $Q$ is specified in $F_P$, this idea is captured in Definition 4.1.2 by the equation
\[
(Q \unlhd \mathrm{sh.ok} \unrhd R)/_{sh}\, \mathrm{SHRAT} = (Q \unlhd \mathrm{sh.ok} \unrhd R)/_{sh}\, \mathrm{SHRAT}(F_P, Q),
\]
which characterizes the instantiation of SHRAT to $\mathrm{SHRAT}(F_P, Q)$. So, in the particular case of SHRAT (and similar services such as rational agents discussed in [7]), the reply in a use-application is architecture-sensitive and cannot be defined with a reply function that only depends on the current co-action and those processed before (such as the reply function for the stack defined in Example 2.2.1). Typically, different use-applications need not commute if architecture-sensitive services are involved, e.g.,
\[
([(\mathrm{risk} \circ S \unlhd \mathrm{s.pop} \unrhd S) \unlhd \mathrm{sh.ok} \unrhd D]/_{sh}\, \mathrm{SHRAT})/_s\, S() = D
\]
while
\[
([(\mathrm{risk} \circ S \unlhd \mathrm{s.pop} \unrhd S) \unlhd \mathrm{sh.ok} \unrhd D]/_s\, S())/_{sh}\, \mathrm{SHRAT} = S.
\]
Use-applications with services with a reply function that only depends on the current co-action and those processed before do commute if distinct foci are used (cf. [6]).

5.2. SHRAT for pushdown threads. At this stage, it is not clear how to define a (terminating) algorithm for SHRAT that is correct for all pushdown threads. One possibility may be to approximate pushdown threads by regular threads in such a way that a sound risk-analysis can be established. Given a linear specification $F_{P_1}$ of $P_1$ and a stack $S$, it seems likely that in $P_1 /_s S(\alpha)$ only finitely many stack configurations (uniformly depending on $F_{P_1}$ and $\alpha$) play a distinctive role with respect to SHRAT’s replies. Another approach is to start from a game theoretic characterization of SHRAT: in residual threads of the form
\[
(Q \unlhd \mathrm{sh.ok} \unrhd R)/_{sh}\, \mathrm{SHRAT}, \tag{2}
\]
the service SHRAT has to give the correct reply (according to its Definition 4.1.2), while the opponent replies to all other test actions and aims for the execution of risk. We do not (yet) know whether game theoretic results cover this particular game. Hence:

Open question: Is SHRAT decidable for all pushdown threads?

An interesting simplification may be the case of one-counter threads, i.e., regular threads that use a counter (a stack over a singleton datatype) instead of a stack, with s.push and s.pop as the only actions. Also for this case, the above question is still open.

Of course, security hazard risk assessment for computable threads is undecidable. In the setting of Turing machines, given a regular control program $P$ and tape configuration $\mathrm{Tape}(\alpha \hat{x})$ with head pointing at $x$, it is undecidable
whether some action of $P$ will be executed in $P /_{tmt}\, \mathrm{Tape}(\alpha \hat{x})$: there is a straightforward reduction to the halting problem (cf. [7]).

5.3. SHRAT and external services. In order to define security hazard risk assessment in precisely the same way as was done in [8], the results and explanations for both the regular and the pushdown case in Section 4 should be slightly modified. In [8], a thread can also engage in external communication with a service E (via actions with focus e). Such a communication blocks further assessment of SHRAT because E is beyond control of the thread under execution. It is not difficult to implement this modification in the algorithm for regular threads: in the evaluation step, simply stop evaluation upon an equation defined by a postconditional composition over e.m. However, for clarity of presentation we did not consider this possibility before.

REFERENCES
[1] J. A. Bergstra and I. Bethke, Polarized process algebra and program equivalence, Automata, Languages and Programming, Proceedings 30th ICALP, Eindhoven, The Netherlands (J. C. M. Baeten, J. K. Lenstra, J. Parrow, and G. J. Woeginger, editors), LNCS, vol. 2719, Springer-Verlag, 2003, pp. 1–21.
[2] J. A. Bergstra, I. Bethke, and A. Ponse, Decision Problems for Pushdown Threads, Electronic report PRG0502, Faculty of Science, University of Amsterdam, 2005, available at www.science.uva.nl/research/prog/publications.html.
[3] J. A. Bergstra and J. W. Klop, Process algebra for synchronous communication, Information and Control, vol. 60 (1984), no. 1/3, pp. 109–137.
[4] J. A. Bergstra and M. E. Loots, Program algebra for sequential code, Journal of Logic and Algebraic Programming, vol. 51 (2002), no. 2, pp. 125–156.
[5] J. A. Bergstra and C. A. Middelburg, A thread algebra with multi-level strategic interleaving, Proceedings CIE 2005 (S. B. Cooper, B. Loewe, and L. Torenvliet, editors), LNCS, vol. 3526, Springer-Verlag, 2005, pp. 35–48.
[6] J. A. Bergstra and A. Ponse, Combining programs and state machines, Journal of Logic and Algebraic Programming, vol. 51 (2002), no. 2, pp. 175–192.
[7] J. A. Bergstra and A. Ponse, Execution architectures for program algebra, Technical report Logic Group Preprint Series 230, Department of Philosophy, Utrecht University, 2004, to appear in the Journal of Applied Logic, prior version available at http://www.phil.uu.nl/preprints/lgps/?lang=en.
[8] J. A. Bergstra and A. Ponse, A bypass of Cohen’s impossibility result, Advances in Grid Computing - EGC 2005 (P. M. A. Sloot, A. G. Hoekstra, T. Priol, A. Reinefeld, and M. Bubak, editors), LNCS, vol. 3470, Springer-Verlag, 2005, pp. 1097–1106, also available as Electronic report PRG0501 at www.science.uva.nl/research/prog/publications.html.
[9] O. Burkart and B. Steffen, Pushdown processes: Parallel composition and model checking, CONCUR’94, LNCS, vol. 836, Springer-Verlag, August 1994, pp. 98–113.
[10] F. Cohen, Computer viruses - theory and experiments, Computers & Security, vol. 6 (1984), no. 1, pp. 22–35, also available at http://vx.netlux.org/lib/afc01.html.
[11] J. W. de Bakker and J. I. Zucker, Processes and the denotational semantics of concurrency, Information and Control, vol. 54 (1982), no. 1/2, pp. 70–120.
[12] S. A. Greibach, Theory of Program Structures: Schemes, Semantics, Verification, LNCS, vol. 36, Springer-Verlag, 1975.
[13] P. Jančar, F. Moller, and Z. Sawa, Simulation problems for one-counter machines, Proceedings of SOFSEM’99: The 26th Seminar on Current Trends in Theory and Practice of Informatics, LNCS, vol. 1725, Springer-Verlag, 1999, pp. 398–407.
[14] Z. Manna, Mathematical Theory of Computation, McGraw-Hill, New-York, 1974.
[15] G. Sénizergues, L(A) = L(B)?, Technical report 1161-97, LaBRI, Université Bordeaux, 1997, available at www.labri.u-bordeaux.fr.
[16] G. Sénizergues, L(A) = L(B)? decidability results from complete formal systems, Theoretical Computer Science, vol. 251 (2001), pp. 1–166.
[17] C. Stirling, Decidability of bisimulation equivalence for pushdown processes, Technical report EDI-INF-RR0005, Laboratory for Foundations of Computer Science, University of Edinburgh, 2000, available at http://www.inf.ed.ac.uk/research/lfcs/publications.html.
[18] C. Stirling, Decidability of DPDA equivalence, Theoretical Computer Science, vol. 255 (2001), pp. 21–31.

PROGRAMMING RESEARCH GROUP, FACULTY OF SCIENCE
UNIVERSITY OF AMSTERDAM, THE NETHERLANDS
and
APPLIED LOGIC GROUP, DEPARTMENT OF PHILOSOPHY
UTRECHT UNIVERSITY, THE NETHERLANDS
URL: www.science.uva.nl/~janb/

PROGRAMMING RESEARCH GROUP, FACULTY OF SCIENCE
UNIVERSITY OF AMSTERDAM, THE NETHERLANDS
URL: www.science.uva.nl/~inge/
URL: www.science.uva.nl/~alban/
COVERING DEFINABLE MANIFOLDS BY OPEN DEFINABLE SUBSETS
MÁRIO J. EDMUNDO
Abstract. Let N be an o-minimal expansion of a real closed field. We show that if X is a Hausdorff definable manifold, then X can be covered by finitely many open definable subsets which are definably homeomorphic to open balls and the intersection of any two open definable subsets of this covering is a finite union of elements of the covering. We also mention the importance of this result in the solution of the torsion point problem for definably compact definable groups.
2000 Mathematics Subject Classification. 03C64; 20E99.
Key words and phrases. O-minimal structures and definable groups.
With partial support from the FCT (Fundação para a Ciência e Tecnologia), program POCTI (Portugal/FEDER-EU).
Logic Colloquium ’05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

§1. Introduction. We work over a fixed, but arbitrary, o-minimal structure $N$, and definable means $N$-definable (possibly with parameters). By definition of o-minimality, every definable subset of $N$ is a finite union of points and intervals with endpoints in $N \cup \{-\infty, +\infty\}$.

One is often interested in studying definable groups in $N$. A definable group is a group whose underlying set is a definable set and whose group operations have definable graphs. The theory of definable groups in arbitrary o-minimal structures, which includes real algebraic groups and semi-algebraic groups, began with Anand Pillay’s paper [P] and has since then grown into a well developed branch of mathematics (see for example [E1], [PS], [PSt1], [PPS1] and [PPS2]). For example we have:

(TOP) every definable group $G$ has a unique definable manifold structure such that the group operations are continuous and the definable homomorphisms are also continuous;
(DCC) the descending chain condition for definable subgroups of a definable group $G$;
(QT) existence in the category of definable groups of the quotient of a definable group by a definable normal subgroup together with the existence of a corresponding definable section;
(AB) every definable group $G$ of positive dimension has a definable abelian subgroup of positive dimension;
(TOR) if $G$ is a definable group, then for all $m \in \mathbb{N}$, the subgroup $G[m]$ of $m$-torsion points of $G$ is a finite definable subgroup.

Properties (TOP), (DCC) and (AB) were proved in [P]. Property (QT) is from [E1] and (TOR) is from the paper [S]. Property (TOP) is used to define the notion of definably connected [P] and of definably compact [PS]: a definable group $G$ is definably connected if it has no proper nonempty open and closed (with respect to the topology given by (TOP)) definable subset; and $G$ is definably compact if for every continuous definable map $\sigma : (a, b) \subseteq [-\infty, +\infty] \to G$ (continuous with respect to the topology on $G$ given by (TOP)), the limits $\lim_{t \to a^+} \sigma(t)$ and $\lim_{t \to b^-} \sigma(t)$ exist in $G$.

In o-minimal expansions of fields (TOR) has a strong version for definably compact groups, namely:

Theorem 1.1. If $N$ is an o-minimal expansion of a field and $G$ is a definably connected, definably compact definable group, then for each $k \in \mathbb{N}$ the subgroup $G[k]$ of $k$-torsion points of $G$ is non trivial.

This result is a solution to a problem posed by Peterzil and Steinhorn in [PS] and was first proved in an early version of the unpublished preprint [E2]. In [E2] there are three proofs of Theorem 1.1: the first one follows from the fact that the o-minimal singular cohomology of $G$ is a non trivial Hopf algebra; the second one follows from the o-minimal version of the Lefschetz coincidence theorem for o-minimal expansions of fields and was later modified and simplified in [BO2]; the third proof, which now appears in [EO] in a simplified version, computes the o-minimal singular cohomology of $G$ and describes the subgroups $G[k]$ in the abelian case, namely, this data is the same as that of a compact connected abelian Lie group of dimension $\dim G$. All of these proofs of Theorem 1.1 rely heavily on o-minimal singular homology and cohomology, whose existence was established in [Wo].
In the first two one shows that the o-minimal Euler characteristic $E(G)$ of $G$ is zero and then applies a result from [S] to conclude the existence of the torsion points. There is now a different proof by Peterzil and Starchenko [PSt2] which avoids o-minimal singular cohomology and uses instead o-minimal Morse theory, exploring the method suggested by [BO1].

In these notes, for lack of space, we will avoid the language of o-minimal homology and cohomology and present instead the proof of the following result, which does not rely on this formalism and is nevertheless crucial in the o-minimal singular homology orientation theory for definable manifolds which is used in all of the three proofs of Theorem 1.1:

Theorem 1.2. Assume that $N$ is an o-minimal expansion of a field. If $X$ is a definable manifold of dimension $n$, then $X$ can be covered by finitely many definable subsets definably homeomorphic to open balls in $N^n$.

This result is related to [BO2, Theorem 4.3] (and can be read off from the proofs of [BO2, Lemmas 4.1 and 4.2]) and to Wilkie’s result in [W] which says that an open definable subset $X \subseteq N^n$ can be covered by finitely many open cells. Under the assumption of Hausdorffness, we can improve Theorem 1.2 as follows:

Theorem 1.3. Assume that $N$ is an o-minimal expansion of a field. If $X$ is a Hausdorff definable manifold of dimension $n$, then $X$ can be covered by finitely many open definable subsets which are definably homeomorphic to open balls in $N^n$ and the intersection of any two open definable subsets of this covering is a finite union of elements of the covering.

After developing the o-minimal singular homology orientation theory for definable manifolds using Theorem 1.2 one concludes that the homology group of $G$ over $\mathbb{Z}$ of degree $\dim G$ is non trivial. Using classical homological algebra arguments adapted to the o-minimal context, it follows from this that the o-minimal singular cohomology $H^*(G; \mathbb{Q})$ is isomorphic to the Hopf algebra $\bigwedge[w_1, \dots, w_r]_{\mathbb{Q}}$ with $w_1, \dots, w_r$ of odd degree and $\sum_{i=1}^{r} \deg w_i = \dim G$. From this information and classical computations, we also have that the Euler-Poincaré characteristic $\chi(G) = \sum_{i=0}^{\dim G} (-1)^i \operatorname{tr}(\mathrm{id}|_{H^i(G;\mathbb{Q})})$ of $G$ is actually zero. But by [BO2] (or by the construction of o-minimal homology [Wo]), we have $\chi(G) = E(G)$, so the o-minimal Euler characteristic $E(G)$ of $G$ is zero. Hence, by [S] we conclude the existence of the torsion points.

Below we work in an o-minimal expansion $N$ of a field $(N, 0, 1, +, \cdot, <)$ and assume that the reader is familiar with the basic notions and facts of o-minimal structures, in particular those involving the definable triangulation theorem. For a treatment of this, see van den Dries’s book [vdD].
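The vanishing of the Euler-Poincaré characteristic mentioned above can be made explicit with the standard Poincaré polynomial of an exterior algebra. The following is only a sketch, assuming $\dim G > 0$ (so that $r \geq 1$); the notation $P_G(t)$ and $d_i = \deg w_i$ is introduced here for illustration and does not appear in the text.

```latex
\[
P_G(t) \;=\; \sum_{i=0}^{\dim G} \dim_{\mathbb{Q}} H^i(G;\mathbb{Q})\, t^i
\;=\; \prod_{i=1}^{r} \bigl(1 + t^{d_i}\bigr),
\qquad
\chi(G) \;=\; P_G(-1) \;=\; \prod_{i=1}^{r} \bigl(1 + (-1)^{d_i}\bigr) \;=\; 0 .
\]
```

Here the trace of the identity on $H^i(G;\mathbb{Q})$ is its dimension, and every factor vanishes because each $d_i$ is odd.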
§2. Covering definable manifolds. As defined in [vdD], an abstract definable manifold, of dimension $n$, is a triple $(X, X_i, \phi_i)_{i \in I}$ where $\{X_i : i \in I\}$ is a finite cover of the set $X$ and for each $i \in I$:
(i) we have injective maps $\phi_i : X_i \to N^n$ such that $\phi_i(X_i)$ is an open definably connected definable set;
(ii) each $\phi_i(X_i \cap X_j)$ is an open definable subset of $\phi_i(X_i)$;
(iii) the map $\phi_{ij} : \phi_i(X_i \cap X_j) \to \phi_j(X_i \cap X_j)$ given by $\phi_{ij} = \phi_j \circ \phi_i^{-1}$ is a definable homeomorphism for all $j \in I$ such that $X_i \cap X_j \neq \emptyset$.

Let $(X, X_i, \phi_i)_{i \in I}$ be an abstract definable manifold. We say that a subset $S \subseteq X$ is definable (resp., open or closed) if $\phi_i(S \cap X_i)$ is a definable (resp., open or closed) subset of $\phi_i(X_i)$ for all $i \in I$. A definable map between abstract definable manifolds is a map whose graph is a definable subset of the product abstract definable manifold. $(X, X_i, \phi_i)_{i \in I}$ is called an affine definable manifold if $X$ is a definable subset of some $N^k$ and the topology on $X$ generated by the open definable subsets as defined above is the induced topology from $N^k$. As usual, we will often ignore the definable charts $(X_i, \phi_i)_{i \in I}$ of the abstract definable manifold $(X, X_i, \phi_i)_{i \in I}$ and simply say that $X$ is a definable manifold.

Before we proceed any further, recall that, given a simplicial complex $M$ in $N^n$ and an open $k$-simplex $s$ of $M$, the star $St_M s$ of $s$ in $M$ is the union of $\{s\}$ together with all open $l$-simplexes $t$ of $M$ such that $l > k$ and the geometric realization $|s|$ of $s$ in $N^n$ is a subset of the closure $\overline{|t|}$ of the geometric realization $|t|$ of $t$ in $N^n$. Note also that here, as in [vdD], the geometric realizations of the simplicial complexes are not necessarily closed.

Proof of Theorem 1.2. Let $(X, X_i, \phi_i)_{i \in I}$ be a definable manifold. For each $i$, let $(\Psi_i, M_i)$ be a definable triangulation of $\phi_i(X_i) \subseteq N^n$. Let $s$ be an open simplex of $M_i$. Then $|St_{M_i} s| \subseteq |M_i| \subseteq N^n$ are open definable subsets (by the invariance of domain (see [Wo]), $|M_i|$ is open in $N^n$ since it is definably homeomorphic to the open definable subset $\phi_i(X_i)$ of $N^n$). So, we need to show that $|St_{M_i} s|$ is definably homeomorphic to an open ball in $N^n$. But this is a consequence of the following claim:

Claim 2.1. Let $M$ be a simplicial complex in $N^n$ such that $|M|$ is an open definable subset of $N^n$. If $s$ is an open simplex of $M$, then $|St_M s|$ is definably homeomorphic to an open ball in $N^n$.

Take a barycentric subdivision of the simplicial complex $M$ and let $p$ be the barycentre of $s$. Since $|M|$ is an open definable subset of $N^n$, the set $|St_M s|$ is also an open definable subset of $N^n$. Hence, there is an open ball $B_n(p, \epsilon)$ in $N^n$ such that $B_n(p, \epsilon) \subseteq |St_M s|$. For each point $x$ in $S^{n-1}(p, \epsilon)$ (the boundary of $B_n(p, \epsilon)$) let $l_x^+(t)$ with $t \geq 0$ be the half line that starts at $p$ and passes through $x$. For each $x \in S^{n-1}(p, \epsilon)$, let $s_x$ be the unique element such that $l_x^+(s_x) \in \overline{|St_M s|} - |St_M s|$. This element exists and is unique because $\overline{|St_M s|}$ is closed and bounded and, since $B_n(p, \epsilon) \subseteq |St_M s|$, every such half line must intersect $\overline{|St_M s|} - |St_M s|$. Clearly, we have $|St_M s| = \{l_x^+(t) : 0 \leq t < s_x,\ x \in S^{n-1}(p, \epsilon)\}$ and for every $q \in |St_M s| - \{p\}$, there are unique $x \in S^{n-1}(p, \epsilon)$ and $0 \leq t < s_x$ such that $q = l_x^+(t)$. To finish the proof of the claim, let
\[
h : |St_M s| \to B_n(p, \epsilon)
\]
be the definable homeomorphism given by $h(l_x^+(t)) = l_x^+\bigl(\tfrac{\epsilon}{s_x}\, t\bigr)$.

By [vdD, Chapter VI, Lemma 3.5], affine definable manifolds are definably normal. By Proposition 2.2 below every abstract Hausdorff definable manifold $X$ is definably regular. Finally, by [vdD, Chapter X, Theorem 1.8], every definably regular abstract definable manifold is definably homeomorphic to an affine definable manifold.
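Returning briefly to the radial map $h$ from the proof of Claim 2.1, its effect can be written out as follows. This is only a sketch under the assumption that the half line is parametrised with unit speed, $l_x^+(t) = p + t\,(x - p)/\epsilon$, a convention the text does not fix; with it, $l_x^+(\epsilon) = x$.

```latex
\[
h\bigl(l_x^+(t)\bigr) \;=\; l_x^+\!\Bigl(\tfrac{\epsilon}{s_x}\, t\Bigr)
\;=\; p + \frac{t}{s_x}\,(x - p),
\qquad 0 \le t < s_x ,
\]
\[
\bigl| h\bigl(l_x^+(t)\bigr) - p \bigr| \;=\; \frac{t}{s_x}\,|x - p| \;=\; \frac{\epsilon\, t}{s_x} \;<\; \epsilon ,
\qquad
h^{-1}\bigl(l_x^+(u)\bigr) \;=\; l_x^+\!\Bigl(\tfrac{s_x}{\epsilon}\, u\Bigr), \quad 0 \le u < \epsilon .
\]
```

The sketch only checks that $h$ maps $|St_M s|$ into $B_n(p, \epsilon)$ and has the displayed inverse; continuity of $x \mapsto s_x$ is taken as given.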
The argument in the following proof is contained in that of [BO1, Lemma 10.4].

Proposition 2.2. Every abstract Hausdorff definable manifold $X$ is definably regular, hence, affine.

Proof. For each $i \in I$ and $x, y \in X_i$, let $d_i(x, y) = |\phi_i(x) - \phi_i(y)|$. Let $K$ be a closed definable subset of $X$ and $a_0 \in X \setminus K$. For $\epsilon \in N$ with $\epsilon > 0$, define $K_\epsilon$ to be the set of all points $y \in X$ such that if $y \in X_i$ then there is a point $x$ in $K_i = X_i \cap K$ with $d_i(x, y) < \epsilon$. Clearly, $K_\epsilon$ is an open definable subset containing $K$. Similarly we define $L_\epsilon$ containing $a_0$ to be the open definable subset of all points $y \in X$ such that if $y \in X_i$ and $a_0 \in X_i$, then $d_i(a_0, y) < \epsilon$. If for some $\epsilon \in N$ with $\epsilon > 0$ we have $K_\epsilon \cap L_\epsilon = \emptyset$, then we are done. Otherwise, $K_\epsilon \cap L_\epsilon \neq \emptyset$ for all sufficiently small $\epsilon > 0$. Now by definable choice ([vdD, Chapter VI, Proposition 1.2]) and o-minimality, there is a definable continuous map $a : (0, \epsilon_0) \to X$ such that $a(\epsilon) \in K_\epsilon \cap L_\epsilon$ for all $0 < \epsilon < \epsilon_0$. Since $X$ is Hausdorff, the limit $\lim_{\epsilon \to 0} a(\epsilon)$ is unique and must be $a_0$. We reach a contradiction by showing that $a_0 \in K$. Choose $i$ such that $a_0 \in X_i$. Then, since $X_i$ is open, for all sufficiently small $\epsilon \in N$ with $\epsilon > 0$ we have $a(\epsilon) \in X_i$. So $d_i(a(\epsilon), K_i)$ is well defined and must be less than $\epsilon$ since $a(\epsilon)$ belongs to $K_\epsilon$. Therefore, $\lim_{\epsilon \to 0} d_i(a(\epsilon), K_i) = 0$, i.e., $d_i(a_0, K_i) = 0$ and $a_0 \in K$.

For the rest of this section we will assume that $(X, X_i, \phi_i)_{i \in I}$ is an abstract Hausdorff definable manifold of dimension $n$, hence affine. Since $X$ is affine, we have $X \subseteq N^k$ for some $k$, and so, by [vdD, Chapter VIII, (1.7)], we can definably triangulate the definable set $X$. But for the proof of Theorem 1.3 we will be interested in a modification of this notion.

Proof of Theorem 1.3. Let $(X, X_i, \phi_i)_{i \in I}$ be an affine definable manifold. Suppose that $V_1, \dots, V_n$ are non empty definable subsets of $X$. Let $I = \{1, 2, \dots, k\}$ be a numbering of $I$. Put $V_0 = X$ and define inductively $(K_i, N_i, (\Psi_i, M_i))$ for $i \in I$ by:

$K_1 = \{X_1 \cap X_j \cap V_l : j \in I,\ l = 0, \dots, n\}$, $(\Psi_1, M_1)$ is a definable triangulation of $\phi_1(X_1)$ compatible with the definable subsets in $\{\phi_1(B) : B \in K_1\}$ and $N_1 = \{C \subseteq X : \Psi_1(\phi_1(C))$ is the geometric realization of an open simplex of $M_1\}$;

$K_{i+1} = \{X_{i+1} \cap X_j \cap V_l \cap C : j \in I,\ l = 0, \dots, n$ and $C \in N_1 \cup \dots \cup N_i\}$, $(\Psi_{i+1}, M_{i+1})$ is a definable triangulation of $\phi_{i+1}(X_{i+1})$ compatible with the definable sets in $\{\phi_{i+1}(B) : B \in K_{i+1}\}$ and $N_{i+1} = \{C \subseteq X : \Psi_{i+1}(\phi_{i+1}(C))$ is the geometric realization of an open simplex of $M_{i+1}\}$.

By a definable triangulation of the charts $(X_i, \phi_i)_{i \in I}$ of $X$ compatible with $V_1, \dots, V_n$ we mean a sequence $(\Psi_i, M_i)_{i \in I}$ as above.

For each $i \in I$ and for each open $k$-simplex $s$ of $M_i$, let $St_{M_i} s$ be the star of $s$ in $M_i$. Let $1, \dots, m_i$ be an enumeration of all open simplexes of $M_i$ and,
for each $l \in \{1, \dots, m_i\}$, let $W_l^i = \phi_i^{-1}(\Psi_i^{-1}(|St_{M_i} s|))$ where $s$ is the open simplex of $M_i$ corresponding to $l$. The following claims hold for the collection $\{W_l^i : i = 1, \dots, m,\ l = 1, \dots, m_i\}$ where $I = \{1, \dots, m\}$.

(1) If $i, j \in \{1, \dots, m\}$ and $j > i$, then for every $l \in \{1, \dots, m_i\}$ and $k \in \{1, \dots, m_j\}$ we have that $W_l^i \cap W_k^j$ is a finite union of elements from $\{W_s^j : s \in S_k^j\}$ where $S_k^j = \{s \in \{1, \dots, m_j\} : W_s^j \subseteq W_k^j\}$.
(2) For every $i \in \{1, \dots, m\}$, if $j, l \in \{1, \dots, m_i\}$, then we have that $W_j^i \cap W_l^i$ is an element from $\{W_s^i : s \in S_j^i\} \cap \{W_s^i : s \in S_l^i\}$.

Claim (1) follows easily from the definition of a definable triangulation of the charts $(X_i, \phi_i)_{i \in I}$ of $X$ compatible with $V_1, \dots, V_n$. In fact, if $t$ is an open simplex of $M_j$ and $\phi_j^{-1}(\Psi_j^{-1}(|t|))$ intersects $X_i$, then there is an open simplex $s$ of $M_i$ such that $\phi_j^{-1}(\Psi_j^{-1}(|t|))$ is a definable subset of $\phi_i^{-1}(\Psi_i^{-1}(|s|))$. Hence, $W_l^i \cap W_k^j$ is a definable subset of $W_k^j$ which is a finite union of subsets of the form $\phi_j^{-1}(\Psi_j^{-1}(|t|))$ where $t$ is an open simplex of $M_j$. Now, since $W_l^i \cap W_k^j$ is open, if a subset of the form $\phi_j^{-1}(\Psi_j^{-1}(|t|))$ (with $t$ an open simplex of $M_j$) is contained in $W_l^i \cap W_k^j$, then $\phi_j^{-1}(\Psi_j^{-1}(|St_{M_j} t|))$ is also contained in $W_l^i \cap W_k^j$ and claim (1) holds.

On the other hand, (2) follows from the fact that given two open simplexes $s$ and $t$ of $M_i$, the intersection of the stars $St_{M_i} s$ and $St_{M_i} t$ is either empty or equals the star $St_{M_i} r$, where $r$ is the open simplex of $M_i$ generated by $s$ and $t$. This is easy to see. In fact, an open simplex $l$ of $M_i$ is contained in $St_{M_i} s \cap St_{M_i} t$ if and only if $|s|, |t| \subseteq \overline{|l|}$ if and only if $s$ and $t$ generate an open simplex $r$ of $M_i$ and $|s|, |t|, |r| \subseteq \overline{|l|}$ if and only if $s$ and $t$ generate an open simplex $r$ of $M_i$ and $l$ is contained in $St_{M_i} r$.

Thus, it remains to show that each $W_j^i$ is definably homeomorphic to an open ball in $N^n$. Let $s$ be the open simplex of $M_i$ corresponding to $j$. Then $W_j^i$ is definably homeomorphic to $|St_{M_i} s|$ and $|St_{M_i} s| \subseteq |M_i| \subseteq N^n$ are open definable subsets (by the invariance of domain (see [Wo]), $|M_i|$ is open in $N^n$ since it is definably homeomorphic to the open definable subset $\phi_i(X_i)$ of $N^n$). So, we need to show that $|St_{M_i} s|$ is definably homeomorphic to an open ball in $N^n$. But this is a consequence of Claim 2.1.

We call the finite collection $(W_l, \theta_l)_{l \in L}$ of open definable subsets $W_l$ of $X$ together with the definable homeomorphisms $\theta_l : W_l \to B_n(0, \epsilon_l) \subseteq N^n$ given by Theorem 1.2 (resp., Theorem 1.3) definable charts of $X$ by open balls (resp., special definable charts of $X$ by open balls). In this context it is natural to call each $W_l$ a definable sub-ball of $X$ and a definable subset $U$ of $X$ of the form $\theta_l^{-1}(B_n(0, \epsilon))$ with $0 < \epsilon < \epsilon_l$ a definable proper sub-ball of $W_l$ (or of $X$)
since we will have a definable homeomorphism from the closure $\overline{U}$ of $U$ in $X$ into the closed unit ball in $N^n$ sending $\overline{U} - U$ into the unit $(n-1)$-sphere.

Theorem 1.2 easily implies that if $A \subseteq X$ is a definably compact definable subset of $X$, then $A$ can be covered by finitely many definable proper sub-balls of $X$. See [BO2] for details. This fact shows that we could not obtain Theorem 1.3 using the usual definable triangulation theorem instead of the modified version. As pointed out in [BO2] (see also [T]), a counterexample occurs already in the classical case: the double suspension $\Sigma\Sigma P$ of the Poincaré dodecahedral space $P$ is a compact, triangulated topological manifold homeomorphic to $S^5$ such that the star of each of the suspension points is not homeomorphic to an open subset $V$ of $\Sigma\Sigma P$ whose closure $\overline{V}$ in $\Sigma\Sigma P$ is compact and for which there is a homeomorphism from $\overline{V}$ into the unit closed ball sending the boundary of $V$ to the boundary of the unit closed ball.

We could not find in the literature classical analogues of Theorems 1.2 and 1.3 except for the trivial case of Theorem 1.2 that holds for compact topological manifolds.

REFERENCES
[BO1] A. Berarducci and M. Otero, Intersection theory for o-minimal manifolds, Annals of Pure and Applied Logic, vol. 107 (2001), no. 1-3, pp. 87–119.
[BO2] A. Berarducci and M. Otero, Transfer methods for o-minimal topology, The Journal of Symbolic Logic, vol. 68 (2003), no. 3, pp. 785–794.
[E1] M. Edmundo, Solvable groups definable in o-minimal structures, Journal of Pure and Applied Algebra, vol. 185 (2003), no. 1-3, pp. 103–145.
[E2] M. Edmundo, O-minimal cohomology and definably compact definable groups, RAAG preprint n. 24 (2004) (http://ihp-raag.org/).
[EO] M. Edmundo and M. Otero, Definably compact abelian groups, Journal of Mathematical Logic, vol. 4 (2004), no. 2, pp. 163–180.
[PPS1] Y. Peterzil, A. Pillay, and S. Starchenko, Definably simple groups in o-minimal structures, Transactions of the American Mathematical Society, vol. 352 (2000), no. 10, pp. 4397–4419.
[PPS2] Y. Peterzil, A. Pillay, and S. Starchenko, Linear groups definable in o-minimal structures, Journal of Algebra, vol. 247 (2002), no. 1, pp. 1–23.
[PSt1] Y. Peterzil and S. Starchenko, Definable homomorphisms of abelian groups in o-minimal structures, Annals of Pure and Applied Logic, vol. 101 (2000), no. 1, pp. 1–27.
[PSt2] Y. Peterzil and S. Starchenko, Computing o-minimal topological invariants using differential topology, preprint, 2005.
[PS] Y. Peterzil and C. Steinhorn, Definable compactness and definable subgroups of o-minimal groups, Journal of the London Mathematical Society, vol. 59 (1999), no. 3, pp. 769–786.
[P] A. Pillay, On groups and fields definable in o-minimal structures, Journal of Pure and Applied Algebra, vol. 53 (1988), no. 3, pp. 239–255.
[S] A. Strzebonski, Euler characteristic in semialgebraic and other o-minimal groups, Journal of Pure and Applied Algebra, vol. 96 (1994), no. 2, pp. 173–201.
[T] W. P. Thurston, Three-Dimensional Geometry and Topology, Princeton University Press, Princeton, 1997.
[vdD] L. van den Dries, Tame Topology and o-Minimal Structures, Cambridge University Press, 1998.
[W] A. Wilkie, Covering open definable sets by open cells, O-Minimal Structures (M. Edmundo, D. Richardson, and A. Wilkie, editors), Proceedings of the RAAG Summer School Lisbon 2003, Lecture Notes in Real Algebraic and Analytic Geometry, Cuvillier Verlag, 2005.
[Wo] A. Woerheide, O-Minimal Homology, Ph.D. thesis, University of Illinois, Urbana-Champaign, 1996.

CMAF UNIVERSIDADE DE LISBOA
AV. PROF. GAMA PINTO 2
1649-003 LISBOA, PORTUGAL
E-mail:
[email protected]
ISOMORPHISMS AND DEFINABLE RELATIONS ON COMPUTABLE MODELS
S. S. GONCHAROV
We are interested in computable structures and in their different computable representations. The basic definitions, results, and problems on this topic can be found in [1, 5, 4]. In the present paper, we consider problems about the algorithmic complexity of isomorphisms. We also study definability in models and its connections with the Scott rank. The results were obtained in collaboration with J. Knight, W. Calvert, V. Harizanov, C. McCoy, R. Solomon, R. Shore, A. Morozov, and D. Tusupov.

Throughout the paper, we adopt the following conventions.
1. Languages are computable, and every structure has a subset of $\omega$ as its universe.
2. The complexity of a structure A is identified with that of its atomic diagram D(A).
3. Sentences are identified with their Gödel numbers.

Under these conventions, a structure A is said to be computable (arithmetical or hyperarithmetical) if its diagram D(A), considered as a subset of $\omega$, is computable (arithmetical or hyperarithmetical).

There are known examples of computable structures of different computable Scott ranks. There are also structures, for example, the Harrison ordering, of Scott rank $\omega_1^{CK} + 1$. Makkai [19] constructed a structure of Scott rank $\omega_1^{CK}$, which can be made computable [14] and simplified so that it is a computable tree [3]. In [2], further computable structures of Scott rank $\omega_1^{CK}$ were constructed in the following classes: undirected graphs, fields of any characteristic, and linear orderings. These structures share the strong approximability property with the Harrison ordering and the tree in [3]. These results give us examples of computable structures whose isomorphism problem has different complexity for different computable representations.
Partially supported by grant RFBR-05-01-00819 and President grant of Scientific School 2112.2003.01.
Logic Colloquium ’05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic
§1. Introduction. In this section, we recall some definitions and known results. The Scott rank is a measure of model-theoretic complexity. The notion comes from the Scott Isomorphism Theorem (see [22]).

Theorem 1.1 (Scott Isomorphism Theorem). For every countable structure A (in a countable language L) there is an $L_{\omega_1\omega}$ sentence whose countable models are exactly the isomorphic copies of A.

In Scott's proof, countable ordinals were assigned to the tuples in A and to A itself. There are several different definitions of the Scott rank. We begin with a family of equivalence relations. We write $A \cong B$ if the models A and B are isomorphic.

Definition 1.2. Let a, b be tuples in A.
1. We write $a \equiv_0 b$ if a and b satisfy the same quantifier-free formulas.
2. For α > 0 we write $a \equiv_\alpha b$ if for all β < α, for each c there exists d, and for each d there exists c, such that $a, c \equiv_\beta b, d$.

Definition 1.3.
1. The Scott rank of a tuple a in A is the least β such that for all b the relation $a \equiv_\beta b$ implies $(A, a) \cong (A, b)$.
2. The Scott rank of A, denoted by SR(A), is the least ordinal α greater than the ranks of all tuples in A.

Let us recall the definition of Kleene’s system O. The system consists of a set O of notations equipped with a partial ordering
Theorem 1.4 (Barwise–Kreisel Compactness). Let Γ be a $\Pi^1_1$ set of computable infinitary sentences. If every $\Delta^1_1$ subset of Γ has a model, then Γ has a model.

Barwise–Kreisel Compactness differs from ordinary Compactness in that it can be used to produce computable structures.

Corollary 1.5. Let Γ be a $\Pi^1_1$ set of computable infinitary sentences. If every $\Delta^1_1$ subset has a computable model, then Γ has a computable model.

The following two corollaries give evidence of the expressive power of computable infinitary formulas.

Corollary 1.6. If A, B are computable structures satisfying the same computable infinitary sentences, then $A \cong B$.

Corollary 1.7. Suppose that a, b are tuples satisfying the same computable infinitary formulas in a computable structure A. Then there is an automorphism of A taking a to b.

This corollary yields a bound on Scott ranks for computable structures [21].

Proposition 1.8. For a computable structure A we have $SR(A) \le \omega_1^{CK} + 1$.

The well-known Barwise–Kreisel Compactness Theorem and its three corollaries can be found in [1]. One point in the proof of the Barwise–Kreisel Compactness Theorem is expanded in [15]. We list the following properties of computable structures.

Proposition 1.9. For a computable structure A,
1. $SR(A) < \omega_1^{CK}$ if there is a computable ordinal β such that the orbits of all tuples are defined by computable $\Pi_\beta$ formulas,
2. $SR(A) = \omega_1^{CK}$ if the orbits of all tuples are defined by computable infinitary formulas, but there is no computable bound on the complexity of these formulas,
3. $SR(A) = \omega_1^{CK} + 1$ if there is a tuple whose orbit is not defined by any computable infinitary formula.

Low Scott rank is associated with simple Scott sentences. A Scott sentence for A is a sentence whose countable models are just the isomorphic copies of A (as in the Scott Isomorphism Theorem). Nadel [21], [20] established the following assertion.

Theorem 1.10 (Nadel). For a computable structure A, the Scott rank SR(A) is computable if and only if A has a computable infinitary Scott sentence.

Proposition 1.11 (Ressayre). Suppose that A is a hyperarithmetical structure. Let Γ be a $\Pi^1_1$ set of computable infinitary sentences in a finite expansion of the language of A. Suppose that for each $\Delta^1_1$ set Γ′ ⊆ Γ the model A can be expanded to a model of Γ′. Then A can be expanded to a model of Γ.
1.1. Structures of Scott rank $SR(A) < \omega_1^{CK}$.

Proposition 1.12. All computable members of the following classes of structures have computable Scott rank:
1. well orderings,
2. superatomic Boolean algebras,
3. reduced Abelian p-groups.

1.2. Structures of Scott rank $SR(A) = \omega_1^{CK} + 1$. There are several well-known examples of computable structures of Scott rank $\omega_1^{CK} + 1$. J. Harrison [16] has shown that there is a computable ordering of type $\omega_1^{CK}(1 + \eta)$. This ordering, the Harrison ordering, gives rise to some other computable structures with similar properties. The Harrison Boolean algebra is the interval algebra of the Harrison ordering. The Harrison Abelian p-group has length $\omega_1^{CK}$, with all infinite Ulm invariants, and the divisible part has infinite dimension.

Proposition 1.13. The Harrison ordering, Harrison Boolean algebra, and Harrison Abelian p-groups all have Scott rank $\omega_1^{CK} + 1$.

1.3. Structures of Scott rank $SR(A) = \omega_1^{CK}$. For Scott rank $\omega_1^{CK}$ it is not easy to find computable examples. There is an arithmetical example constructed by M. Makkai [19].

Theorem 1.14 (Makkai). There is an arithmetical structure A of rank $\omega_1^{CK}$.

§2. Isomorphism problem. In contrast to the Harrison ordering, the set of computable infinitary sentences true of Makkai’s example is $\aleph_0$-categorical; thus the conjunction of these sentences is a Scott sentence for the structure. The particular Harrison ordering, as originally constructed, has the feature that there is no infinite hyperarithmetical decreasing sequence. As a consequence, there is no non-trivial hyperarithmetical automorphism. However, there are other computable orderings of type $\omega_1^{CK}(1 + \eta)$ with a non-trivial computable automorphism. The group-tree construction gives a computable "group-tree" A(T) corresponding to a computable tree T such that if the tree has a path but no hyperarithmetical path, then A(T) has no non-trivial hyperarithmetical automorphism. If, in addition, the tree is thin, then the group-tree has Scott rank $\omega_1^{CK}$.

From the results of [19], [14], and [18], we obtain that there is a computable structure of Scott rank $\omega_1^{CK}$. J. Knight, J. Millar, and W. Calvert showed that there is a computable tree of Scott rank $\omega_1^{CK}$. The idea is to take trees similar to the Knight–Yang trees and add a homogeneity property. J. Knight, J. Millar, and W. Calvert constructed a tree of rank $\omega_1^{CK}$ as follows. This construction is very interesting and helpful in many applications.
Let T be a subtree of $\omega^{<\omega}$. We have a top node ∅. Below, we define the tree rank for σ ∈ T, and then for T itself. We use the notation rk(σ), rk(T).

Definition 2.1.
1. rk(σ) = 0 if σ is terminal,
2. for α > 0, rk(σ) = α if all successors of σ have ordinal rank, and α is the first ordinal greater than these ordinals,
3. rk(σ) = ∞ if σ does not have ordinal rank.
We let rk(T) = rk(∅).

Theorem 2.2. [3]
1. There is a computable, thin, rank-homogeneous tree T such that rk(T) = ∞, but T has no hyperarithmetical path.
2. If T is a computable, thin, rank-homogeneous tree such that rk(T) = ∞, but T has no hyperarithmetical path, then $SR(T) = \omega_1^{CK}$.

We give new examples of computable structures of Scott rank $\omega_1^{CK}$.

Theorem 2.3. [2] Each of the following classes contains computable structures of Scott rank $\omega_1^{CK}$:
1. undirected graphs,
2. linear orderings,
3. Boolean algebras,
4. fields of any characteristic.

2.1. Barwise rank. Recall the notion of the quantifier rank of a formula. (We assume that the implication ⇒ is expressed in terms of ¬ and ∧ and thereby does not occur in our formulas.)
\[
qr(\varphi) =
\begin{cases}
0 & \text{if } \varphi \text{ is quantifier-free;}\\
qr(\psi) & \text{if } \varphi \text{ is } \neg\psi;\\
qr(\psi) + 1 & \text{if } \varphi \text{ is } \exists v\,\psi \text{ or } \forall v\,\psi;\\
\sup\{qr(\psi) \mid \psi \in \Phi\} & \text{if } \varphi \text{ is } \bigwedge\Phi \text{ or } \bigvee\Phi.
\end{cases}
\]
Let α be an ordinal. Models M and N are said to be α-equivalent if they satisfy the same sentences whose quantifier rank does not exceed α. We denote this by $M \equiv_\alpha N$. Two tuples $a, b \in M^{<\omega}$ are called α-equivalent if they satisfy the same formulas whose quantifier rank does not exceed α. We denote this by $a \equiv_\alpha b$. Models M and N are said to be equivalent if they satisfy the same sentences. We denote this by $M \equiv N$. We say that a tuple $a \in M^{<\omega}$ has quantifier rank α in M if for all tuples $b \in M^{<\omega}$ the implication $(a \equiv_\alpha b \Rightarrow a \equiv b)$ holds. The Barwise rank of a model M, br(M), is the minimal ordinal α such that for all $a, b \in M^{<\omega}$, $(a \equiv_\alpha b \Rightarrow a \equiv_{\alpha+1} b)$. It is well known that the Barwise rank of a model M ∈ HYP does not exceed $\omega_1^{CK}$.
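The two recursive definitions above (tree rank in Definition 2.1 and quantifier rank) can be illustrated on finite objects. The following is only a sketch under simplifying assumptions that are not part of the paper: trees are finite sets of tuples closed under prefixes (so every rank is a natural number and the value ∞ never occurs), and formulas are finite nested tuples with hypothetical tags such as "atom" and "exists".

```python
from typing import Set, Tuple

Node = Tuple[int, ...]


def tree_rank(tree: Set[Node], node: Node = ()) -> int:
    """Rank of `node` in a FINITE tree (a prefix-closed set of tuples).

    A terminal node has rank 0; otherwise the rank is the least number
    greater than the ranks of all immediate successors.
    """
    successors = [t for t in tree
                  if len(t) == len(node) + 1 and t[:len(node)] == node]
    if not successors:
        return 0
    return max(tree_rank(tree, s) for s in successors) + 1


def quantifier_rank(phi) -> int:
    """Quantifier rank of a finite formula encoded as nested tuples.

    Hypothetical encoding: ("atom", name), ("not", f),
    ("exists", v, f), ("forall", v, f), ("and", [f, ...]), ("or", [f, ...]).
    """
    op = phi[0]
    if op == "atom":
        return 0
    if op == "not":
        return quantifier_rank(phi[1])
    if op in ("exists", "forall"):
        return quantifier_rank(phi[2]) + 1
    if op in ("and", "or"):
        # sup over the members; the empty conjunction has rank 0
        return max((quantifier_rank(p) for p in phi[1]), default=0)
    raise ValueError(f"unknown connective: {op!r}")


# The top node () has successors (0,) and (1,); the branch through (1,)
# has length 3, so rk(T) = rk(()) = 3.
T = {(), (0,), (1,), (1, 0), (1, 0, 0)}
phi = ("exists", "x",
       ("and", [("atom", "P"), ("forall", "y", ("atom", "Q"))]))
```

In the infinitary setting of the paper the maximum becomes a supremum over ordinals, and a tree with an infinite path has rank ∞; neither case is modelled in this finite sketch.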
The following theorem about the existence of hyperarithmetical isomorphisms between different computable representations of models gives a close connection between this problem and the $\Pi^1_1$-definability of relations on computable models. We write $A \cong_h B$ if there is a hyperarithmetical isomorphism between the models A and B.

Theorem 2.4. [12] Let M be a hyperarithmetical model. Then the following conditions are equivalent:
1. there exist tuples $a, b \in M^{<\omega}$ such that $(M, a) \cong (M, b)$, but $(M, a) \not\cong_h (M, b)$;
2. there exists a tuple $a \in M^{<\omega}$ such that there exists an infinite family $(\bar a_i)_{i<\omega}$ of tuples in $M^{<\omega}$ possessing the following properties:
(a) $(M, a) \cong (M, \bar a_i)$ for all $i < \omega$,
(b) $(M, \bar a_i) \not\cong_h (M, \bar a_j)$ for all $i < j < \omega$;
3. the Barwise rank of M is equal to $\omega_1^{CK}$;
4. $I_M \notin \Pi^1_1$, where $I_M = \{\langle a, b\rangle \in M^{<\omega} \times M^{<\omega} \mid a \cong b\}$.

We see that it is important to have a description of relations of $\Pi^1_1$ complexity.

§3. Intrinsically $\Pi^1_1$ relations. The first results about relations of analytic complexity were obtained by I. Soskov.

Proposition 3.1. [24] Suppose that A is computable and R is a $\Delta^1_1$ relation that is invariant under automorphisms of A. Then R is definable in A by a computable infinitary formula, with no parameters.

Corollary 3.2. For a computable structure A and a relation R on A the following assertions are equivalent:
1. R is intrinsically $\Delta^1_1$ on A,
2. R is relatively intrinsically $\Delta^1_1$ on A,
3. R is definable in A by a computable infinitary formula, with finitely many parameters.

Definition 1. A relation R on A is formally $\Pi^1_1$ on A if it is defined in A by a $\Pi^1_1$ disjunction of computable infinitary formulas, with finitely many parameters.

I. Soskov’s result can be reformulated as follows.

Proposition 3.3. [25] For a computable (or hyperarithmetical) structure A and a relation R on A the following assertions are equivalent:
1. R is relatively intrinsically $\Pi^1_1$ on A,
2. R is formally $\Pi^1_1$ on A.
Theorem 3.4. [13] Suppose that A is a computable structure and R is a relation on A that is $\Pi^1_1$ and is invariant under automorphisms of A. Then R is formally $\Pi^1_1$. Moreover, there is a definition with no parameters.

Corollary 3.5. [13] For a computable structure A and a relation R the following assertions are equivalent:
1. R is intrinsically $\Pi^1_1$ on A,
2. R is relatively intrinsically $\Pi^1_1$ on A,
3. R is formally $\Pi^1_1$ on A.

A relation is properly $\Pi^1_1$ if it is $\Pi^1_1$ and not $\Sigma^1_1$.

Corollary 3.6. [13] If a relation R on a computable structure A is invariant and properly $\Pi^1_1$, then the image of R in any computable copy is also properly $\Pi^1_1$.

Here are some examples of computable structures with intrinsically $\Pi^1_1$ relations.

Example 1. A Harrison ordering is a computable ordering of type $\omega_1^{CK}(1 + \eta)$. Harrison showed the existence of such orderings. In fact, he showed that for any computable tree $T \subseteq \omega^{<\omega}$, if T has paths but no hyperarithmetical paths, then the Kleene-Brouwer ordering on T is a computable ordering of type $\omega_1^{CK}(1 + \eta) + \alpha$, for some computable ordinal α. Let A be a Harrison ordering, and let R be the initial segment of type $\omega_1^{CK}$. This set is intrinsically $\Pi^1_1$, since it is defined by the disjunction of computable infinitary formulas saying that the interval to the left of x has order type β, for computable ordinals β.

Example 2. The Harrison Boolean algebra is the interval algebra of the Harrison ordering. Let A be a Harrison Boolean algebra, and let R be the set of superatomic elements, that is, those contained in one of the Fréchet ideals. This set is intrinsically $\Pi^1_1$, since it is defined by the disjunction of computable infinitary formulas saying that x is a finite union of α-atoms, for computable ordinals α.

Example 3. Recall that a countable Abelian p-group G is determined up to isomorphism by its Ulm sequence $(u_\alpha(G))_{\alpha<\lambda(G)}$ and the dimension of the divisible part. A Harrison p-group is a computable Abelian p-group G such that $\lambda(G) = \omega_1^{CK}$, $u_\alpha(G) = \infty$ for all $\alpha < \omega_1^{CK}$, and the divisible part D has infinite dimension. A Harrison group is a Harrison p-group for some p. Let A be a Harrison group, and let R be the set of elements that have computable ordinal height, i.e., the complement of the divisible part. Then
R is intrinsically $\Pi^1_1$ on A, since it is defined by the disjunction of computable infinitary formulas saying that x has height α, for computable ordinals α.

Theorem 3.7. Each of the Harrison groups, the Harrison Boolean algebra, and the Harrison ordering has two computable representations with no hyperarithmetical isomorphism between them.

§4. Notions related to computable categoricity. Let A be a computable structure. We say that A is computably categorical if for all computable $B \cong A$ there is a computable isomorphism from A onto B. Similarly, A is $\Delta^0_\alpha$ categorical if for all computable $B \cong A$ there is a $\Delta^0_\alpha$ isomorphism. We say that A is relatively computably categorical if for all $B \cong A$ there is an isomorphism that is computable relative to B, and A is relatively $\Delta^0_\alpha$ categorical if for all $B \cong A$ there is a $\Delta^0_\alpha(B)$ isomorphism.

A Scott family for A is a set Φ of formulas, with a fixed tuple of parameters c in A, such that
1. each tuple in A satisfies some ϕ ∈ Φ, and
2. if a, b are tuples in A satisfying the same formula ϕ ∈ Φ, then there is an automorphism of A taking a to b.

A formally c.e. Scott family is a c.e. Scott family made up of finitary existential formulas. A formally $\Sigma^0_\alpha$ Scott family is a $\Sigma^0_\alpha$ Scott family made up of “computable $\Sigma_\alpha$” formulas.

Proposition 4.1. For a structure A the set {a : A ⊨ ϕ(a)} is $\Sigma^0_\alpha(A)$ if ϕ is computable $\Sigma_\alpha$, and $\Pi^0_\alpha(A)$ if ϕ is computable $\Pi_\alpha$. Moreover, this is so with all imaginable uniformity, over structures and formulas.

It is easy to see that if A has a formally c.e. Scott family, then it is relatively computably categorical, so it is computably categorical. More generally, if A has a formally $\Sigma^0_\alpha$ Scott family, then it is relatively $\Delta^0_\alpha$ categorical, so it is $\Delta^0_\alpha$ categorical. Goncharov showed that, under some additional effectiveness conditions (on a single copy), if A is computably categorical, then it has a formally c.e. Scott family. Ash showed that, under some effectiveness conditions (on a single copy), if A is $\Delta^0_\alpha$ categorical, then it has a formally $\Sigma^0_\alpha$ Scott family. For the relative notions the effectiveness conditions disappear. Ash–Knight–Manasse–Slaman and Chisholm proved the following assertion.

Proposition 4.2. A computable structure A is relatively $\Delta^0_\alpha$ categorical if and only if it has a formally $\Sigma^0_\alpha$ Scott family. In particular, A is relatively computably categorical if and only if it has a formally c.e. Scott family.
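To make the two defining conditions of a Scott family concrete, here is a brute-force check on a finite graph. It is only an illustration under assumptions that are not in the paper: the structure is finite, formulas are represented by Python predicates on tuples rather than by syntactic objects, no parameters are used, and only tuples up to a chosen length `max_len` are inspected (a genuine Scott family must handle tuples of every length). All function names are hypothetical.

```python
from itertools import permutations, product


def automorphisms(universe, edges):
    """All automorphisms of a finite directed graph, found by brute force."""
    autos = []
    for image in permutations(universe):
        f = dict(zip(universe, image))
        if all(((f[a], f[b]) in edges) == ((a, b) in edges)
               for a in universe for b in universe):
            autos.append(f)
    return autos


def is_scott_family(universe, edges, family, max_len=1):
    """Check conditions 1 and 2 of a Scott family for tuples up to max_len."""
    autos = automorphisms(universe, edges)
    for k in range(1, max_len + 1):
        tuples = list(product(universe, repeat=k))
        # Condition 1: every tuple satisfies some formula of the family.
        if any(not any(phi(a) for phi in family) for a in tuples):
            return False
        # Condition 2: tuples satisfying the same formula are automorphic.
        for phi in family:
            satisfying = [a for a in tuples if phi(a)]
            for a in satisfying:
                images = {tuple(f[x] for x in a) for f in autos}
                if any(b not in images for b in satisfying):
                    return False
    return True


# Undirected path 0 - 1 - 2, stored as a symmetric edge set.
universe = [0, 1, 2]
edges = {(0, 1), (1, 0), (1, 2), (2, 1)}


def neighbours(x):
    return {y for y in universe if (x, y) in edges}


def phi_middle(a):
    # "x has two distinct neighbours" (an existential condition)
    return len(a) == 1 and len(neighbours(a[0])) >= 2


def phi_end(a):
    # "x has a neighbour with two distinct neighbours" (also existential)
    return len(a) == 1 and any(len(neighbours(y)) >= 2
                               for y in neighbours(a[0]))
```

Here `[phi_middle, phi_end]` passes the check for single elements, because the only non-trivial automorphism swaps the two endpoints, while a family consisting of one formula true of every element fails condition 2.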
§5. Basic results from numbering theory. For $S \subseteq P(\omega)$ a numbering is a binary relation ν such that $S = \{\nu(i) : i \in \omega\}$, where $\nu(i) = \{x : (i, x) \in \nu\}$. A Friedberg numbering of S is a numbering ν that is 1-1 in the sense that $i \ne j$ implies $\nu(i) \ne \nu(j)$. Suppose that ν and μ are two numberings of the same family S. We write $\nu \le \mu$ if there is a computable function f such that for all i we have $\nu(i) = \mu(f(i))$; that is, we can effectively pass from a ν-index to a μ-index for the same set. We say that ν and μ are computably equivalent if $\nu \le \mu$ and $\mu \le \nu$. Note that if ν and μ are Friedberg numberings of S, then $\nu \le \mu$ implies $\mu \le \nu$.

A family $S \subseteq P(\omega)$ is discrete if for each A ∈ S there exists $\sigma \in 2^{<\omega}$ such that for all B ∈ S we have $\sigma \subseteq \chi_B$ if and only if B = A. The family is effectively discrete if there is a c.e. set $E \subseteq 2^{<\omega}$ such that
(a) for each A ∈ S there is σ ∈ E such that $\sigma \subseteq \chi_A$, and
(b) for all σ ∈ E and A, B ∈ S, if $\sigma \subseteq \chi_A, \chi_B$, then A = B.

Proposition 5.1. [22] There exists a family $S \subseteq P(\omega)$ that has a unique, up to computable equivalence, computable Friedberg numbering and is discrete but not effectively discrete.

Proposition 5.2. [8] For each finite n ≥ 1 there is a family of sets with just n computable Friedberg numberings, up to computable equivalence.

Proposition 5.3. [26, 23] There is a family $S \subseteq P(\omega)$ with numberings in all non-computable degrees but no computable numbering.

The results of Selivanov, Goncharov, and Wehner about computable numberings can be relativized.

§6. Turning a family of sets into a graph. Let S be a family of sets. For each A ∈ S we can construct a daisy graph $G_A$, and from these a graph G(S) possessing the following properties.
(a) G(S) is a rigid graph.
(b) If S has a unique computable Friedberg numbering, then G(S) is computably categorical.
(c) If S has just n computable Friedberg numberings, up to computable equivalence, then G(S) has computable dimension n.
(d) If S is discrete, then every element of G(S) has a finitary existential definition with no parameters.
(e) Suppose that S has a computable Friedberg numbering and is discrete but not effectively discrete. Then G(S) does not have a formally c.e. defining family.

Here we will state the basic results that we plan to lift.

Proposition 6.1. [6, 4] There is a rigid graph structure G that is computably categorical with no formally c.e. defining family.
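The numbering notions of §5 can be illustrated on a toy example. The sketch below makes assumptions that are not in the paper: the family is finite and consists of finite sets, a numbering is a finite dictionary from indices to sets, and a string σ ∈ 2^{<ω} is modelled as a tuple of bits compared with an initial segment of a characteristic function. In this finite setting every family is trivially discrete, so the code only shows what the definitions say, not the effectivity phenomena of Propositions 5.1 to 5.3. All names are hypothetical.

```python
def is_friedberg(nu):
    """A numbering is Friedberg if distinct indices name distinct sets."""
    values = list(nu.values())
    return len(set(values)) == len(values)


def reduces_via(nu, mu, f):
    """Check nu <= mu via f, i.e. nu(i) = mu(f(i)) for every index i."""
    return all(nu[i] == mu[f(i)] for i in nu)


def chi_prefix(A, length):
    """Initial segment of the characteristic function of A."""
    return tuple(1 if x in A else 0 for x in range(length))


def separating_string(A, family, bound):
    """Shortest prefix of chi_A (up to `bound`) extended by no other member."""
    for n in range(bound + 1):
        sigma = chi_prefix(A, n)
        if all(chi_prefix(B, n) != sigma for B in family if B != A):
            return sigma
    return None


family = [frozenset({0}), frozenset({0, 1}), frozenset({2})]
nu = {0: family[0], 1: family[1], 2: family[2]}
mu = {0: family[2], 1: family[0], 2: family[1]}


def shift(i):
    # index translation witnessing nu <= mu
    return (i + 1) % 3


def unshift(i):
    # inverse translation; for Friedberg numberings it witnesses mu <= nu
    return (i - 1) % 3
```

For this family the string (1, 0) isolates {0}: it is an initial segment of the characteristic function of {0} and of no other member.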
Proposition 6.2. [19] There is a computable structure A with a relation R that is intrinsically c.e., but not relatively intrinsically c.e.

Proof. Consider the cardinal sum of two disjoint computable copies of the graph structure G from Proposition 6.1. Let R be the unique isomorphism between these two copies.

Proposition 6.3. [9, 10, 4, 5] For each finite n there is a rigid graph structure G with computable dimension n.

Proposition 6.4. [23, 26] There is a structure A with copies in just the non-computable degrees.

§7. Coding a $\Delta^0_\alpha$ structure in a computable one. To lift the basic results of Goncharov and Manasse, we first relativize, producing a $\Delta^0_\alpha$ graph. We then pass to a computable structure using a pair of structures to code the arrow relation. For a graph G and a pair of structures $B_1$, $B_2$ for the same relational language, we set $G^* = (G \cup U, G, U, Q, \ldots)$, where
1. G is the universe of G,
2. G and U are disjoint,
3. Q is a ternary relation assigning to each pair a, b ∈ G an infinite set $U_{(a,b)}$,
4. the sets $U_{(a,b)}$ form a partition of U,
5. each relation in . . . is the union of its restrictions to the sets $U_{(a,b)}$, and
6. for each pair a, b ∈ G,
\[
(U_{(a,b)}, \ldots) \cong
\begin{cases}
B_1 & \text{if } G \models a \to b,\\
B_2 & \text{otherwise.}
\end{cases}
\]
Theorem 7.1. [11] Suppose that G is a graph structure. Assume that $G^*$ is constructed from G and $B_i$ in the way described at the beginning of this section. Then G has a $\Delta^0_\alpha$ copy if and only if $G^*$ has a computable copy. More generally, for any X, G has a $\Delta^0_\alpha(X)$ copy if and only if $G^*$ has an X-computable copy. In addition,
(a) if G has just one $\Delta^0_\alpha$ copy, up to a $\Delta^0_\alpha$ isomorphism, then $G^*$ is $\Delta^0_\alpha$ categorical,
(b) if G has just n $\Delta^0_\alpha$ copies, up to a $\Delta^0_\alpha$ isomorphism, then $G^*$ has $\Delta^0_\alpha$ dimension n,
(c) if G has no $\Sigma^0_\alpha$ Scott family made up of finitary existential formulas, then $G^*$ has no formally $\Sigma^0_\alpha$ Scott family.

We describe the construction for the reducibility of this result to graphs and some other algebraic structures.

Theorem 7.2. Suppose that M is a countable structure with signature σ such that the arity of all predicate and functional symbols from σ is bounded by some
number k. There exists a partial ordering (graph) $M^*$ possessing the following properties: The model M has a computable copy if and only if $M^*$ has a computable copy. More generally, for any X, M has an X-computable copy if and only if $M^*$ has an X-computable copy. In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then $M^*$ is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then $M^*$ has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then $M^*$ has no formally $\Sigma^0_\alpha$ Scott family.

The proof of this theorem follows from the constructions in [7] concerning some categories of computable algebraic systems. We consider a computable signature $\sigma = \langle P_0^{n_0}, P_1^{n_1}, \ldots, P_k^{n_k}, \ldots\rangle$ such that σ is countable or finite. By the category $\mathrm{Mod}_\sigma$ we mean the category whose objects are models of signature σ and whose morphisms are their isomorphisms. Let us define the subcategory $\mathrm{Mod}^{com}_\sigma$ of the category $\mathrm{Mod}_\sigma$. Note that a model M is computable if it is computably isomorphic to a computable model M′ with computable basic set |M′|, which is a computable subset of some set of words over a finite alphabet, such that the set $\{\langle i, m_1, \ldots, m_{n_i}\rangle \mid M' \models P_i(m_1, \ldots, m_{n_i})\}$ is computable. If $M_1$ and $M_2$ are computable models, then the isomorphism ϕ from $M_1$ onto $M_2$ is computable provided that the function ϕ is partially computable. The objects of the subcategory $\mathrm{Mod}^{com}_\sigma$ are computable models of signature σ and the morphisms are computable isomorphisms. If M is a model of signature σ, then we denote by $\mathrm{Mod}^{com}_\sigma(M)$ the complete subcategory of $\mathrm{Mod}^{com}_\sigma$ whose objects are the computable models isomorphic to M. If $K_0$ is a subcategory of a category K and F is a functor from K into $K_1$, then the restriction of the functor F to the subcategory $K_0$ will be denoted by $F \upharpoonright K_0$. A signature σ is said to be bounded if there exists a number k such that the arities of the predicates from this signature satisfy $n_i \le k$ for every i.

Proposition 7.3. For an arbitrary bounded signature σ there exist a finite signature $\sigma_0$ and a completely univalent functor $F_1$ of the category $\mathrm{Mod}_\sigma$ into the category $\mathrm{Mod}_{\sigma_0}$ such that
(i) $F_1 \upharpoonright \mathrm{Mod}^{com}_\sigma$ is a completely univalent functor of the category $\mathrm{Mod}^{com}_\sigma$ into the category $\mathrm{Mod}^{com}_{\sigma_0}$,
(ii) for an arbitrary model M of signature σ, the functor $F_1 \upharpoonright \mathrm{Mod}^{com}_\sigma(M)$ realizes an equivalence of the categories $\mathrm{Mod}^{com}_\sigma(M)$ and $\mathrm{Mod}^{com}_{\sigma_0}(F_1(M))$.
In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then $F_1(M)$ is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then $F_1(M)$ has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then $F_1(M)$ has no formally $\Sigma^0_\alpha$ Scott family.
Proof. Let = P0n0 , P1n1 , . . . , Pknk , . . . . Suppose that the set ni |i ∈ N is bounded by a number k. For each k ≤ K we consider all the predicates Pi k 0 , Pi k 1 , . . . , Pi k l , . . . , l ∈ Nk from of arity k, where Nk either is equal to N or is an initial segment of N . We set 0 = {=, P01 , P12 , . . . , Pkk+1 , . . . , Pkk+1 , A12 , }. We first define a functor F1 on the objects of the category Mod . Let M be an arbitrary model of signature . If M is the basic set of the model M, then for the basic set of the model M0 F1 (M) we take the set M0 which is equal to M ∪ {a0 , a1 , . . . , an , . . . }, where {a0 , a1 , . . . , an , . . . } ∩ M = ∅ and ai = aj for i = j. We define the predicates as follows: 1. AM0 {a0 , a1 , . . . , an , . . . }, 2. x y if x = an and y = an+1 for some n, i 3. (x0 , x1 , . . . , xs ) ∈ (Ps )M0 if x0 = ad , xj ∈ M and M |= Piss d (x1 , . . . , xs ). j=1
It is easy to see that M0 is a computable model if M is a computable model. If M and M0 are objects of the category Mod 0 and ϕ is an isomorphism of the model M onto M0 , then we define F1 (M, M0 )(ϕ). We define it only in the case where the basic sets of both models are subsets of N . In the remaining cases, everything is done in the same way. Thus, we introduce x, if x ∈ {a0 , a1 , . . . , an , . . . }; 0 F1 (M, M )(ϕ) (x) ϕ(x), if x ∈ M. It is clear that F1 (M, M0 )(ϕ) is a computable isomorphism relative to the Turing degree a if ϕ is } a computable isomorphism relative to the Turing degree a. To prove that a function F1 of a category K1 into K2 is completely univalent, it suffices, by definition, to show that the mapping F1 (A, B) : Hom(A, B) −→ Hom(F1 (A), F1 (B)) is bijective for each pair A, B of objects of K1 . (M) are completely univalent functors. We can Thus, F1 and F1 Modcom prove that F1 Modcom (M) realizes an equivalence by showing that for each 0 (F1 (M)) there exists an object M0 of the object M of the category Modcom category Modcom (F1 )(M) such that M and F1 (M0 ) are isomorphic in the (F1 )(M). The case of a finite category Modcom (F1 (M)). Let M ∈ Modcom model is trivial. Let M be an infinite model. 1−1 M \ AM . Since M is a We can consider a computable function f : N onto computable model, it follows that N \ AM is computable and the function exists. Let a be an element of AM that does not have a —predecessor. We define predicates of signature on N : (n1 , . . . , nmk ) ∈ Pimk k if and only if mk +1 , where l0 , l1 , l2 , . . . , lk are such that li li+1 for (l, f(n1 ), . . . , f(nmi )) ∈ Pm k
0 ≤ i < k, a = l0 and lk = l . It is easy to see that the model M of signature defined in such a way is computable. We show that F (M ) is computably isomorphic to M . For this purpose, we consider a function g defined as follows: f(m), if m ∈ N ; g(m) = l, if m = ak and there exist l0 , l1 , . . . , lk k−1 such that M |= i=0 li li+1 and l0 = a & lk = l. It is clear that g is an isomorphism and a computable function. The additional properties about Scott families from our proposition can be proved from the definability of basic predicates and their negations by ∃–formulas. Proposition 7.4. For an arbitrary finite signature 0 there exist a signature 1 that consists of a single predicate symbol P and a completely univalent functor F2 of the category Mod 0 into the category Mod 1 such that 0 0 1 (i) F2 Modcom is a completely univalent functor from Modcom into Modcom ; 0 (ii) for each model M of signature 0 the functor F2 Modcom (M) realizes an 0 1 (M) and Modcom (F2 (M)). equivalence of the categories Modcom
In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then $F_2(M)$ is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then $F_2(M)$ has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then $F_2(M)$ has no formally $\Sigma^0_\alpha$ Scott family.

Proof. Our goal is to define the functor $F_2$. Let $\sigma_0 = \langle P_0^{n_0}, P_1^{n_1}, \ldots, P_k^{n_k}\rangle$. Suppose that M is a model of the finite signature $\sigma_0$. Consider a predicate symbol P of arity $n = \sum_{i=0}^{k} n_i$ and a signature $\sigma_1 = \langle P^n\rangle$. We start by defining $F_2$ on the objects of $\mathrm{Mod}_{\sigma_0}$. Suppose that M is a model of signature $\sigma_0$ with basic set |M|. Consider {∞} ∪ |M| as the basic set M′ of the model $F_2(M)$. We define P on the set M′ as follows:
$\langle x_1, \ldots, x_n\rangle \in P$ if and only if one of the following conditions is fulfilled:
a) $x_1 = x_2 = \cdots = x_n = \infty$,
b) there exist $i \le k$ and $y_1, \ldots, y_{n_i}$ such that $y_j = x_{j+m_i}$ for any j such that $1 \le j \le n_i$, and $M \models P_i(y_1, \ldots, y_{n_i})$, while $x_j = \infty$ for any j such that $1 \le j \le m_i$ or $m_i + n_i + 1 \le j \le n$.
Here we put $m_0 = 0$ and $m_i = \sum_{l=0}^{i-1} n_l$ for $i \ge 1$.
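The passage from finitely many predicates to a single predicate of arity n can be tried out on a finite model. The following sketch rests on assumptions that are not in the paper: relations are finite sets of tuples, the new element ∞ is represented by a string constant `PAD` assumed not to occur in the model, and the function names are hypothetical. It builds P from conditions a) and b) and then recovers each P_i by reading off the i-th block of coordinates.

```python
PAD = "INF"  # stands for the new element; assumed not to be in the model


def merge_predicates(relations):
    """Merge predicates P_0..P_k into one predicate of arity n_0 + ... + n_k.

    `relations` is a list of pairs (n_i, set of n_i-tuples).
    """
    arities = [n_i for n_i, _ in relations]
    n = sum(arities)
    offsets = [sum(arities[:i]) for i in range(len(arities))]  # the m_i
    merged = {(PAD,) * n}  # condition a): the all-padding tuple
    for (n_i, tuples), m_i in zip(relations, offsets):
        for y in tuples:
            # condition b): block i carries y, every other position is padded
            merged.add((PAD,) * m_i + tuple(y) + (PAD,) * (n - m_i - n_i))
    return merged


def recover_predicate(merged, arities, i):
    """Read P_i back from the merged predicate."""
    m_i, n_i = sum(arities[:i]), arities[i]
    recovered = set()
    for x in merged:
        block = x[m_i:m_i + n_i]
        rest = x[:m_i] + x[m_i + n_i:]
        if all(v == PAD for v in rest) and all(v != PAD for v in block):
            recovered.add(block)
    return recovered


# A unary predicate P_0 = {1} and a binary predicate P_1 = {(1, 2), (2, 2)}.
relations = [(1, {(1,)}), (2, {(1, 2), (2, 2)})]
P = merge_predicates(relations)
```

Since each P_i can be read back from P in this way, no information about the original model is lost, which is the informal reason the coding preserves isomorphisms. The sketch assumes every arity is at least 1.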
If M is a computable model, then $F_2(M)$ is also a computable model. Let M and N be two models of signature $\sigma_0$. Then we define a mapping
$F_2(M, N) : \mathrm{Hom}(M, N) \to \mathrm{Hom}(F_2(M), F_2(N))$ as follows:
\[
F_2(M, N)(\varphi)(x) =
\begin{cases}
\infty, & \text{if } x = \infty;\\
\varphi(x), & \text{if } x \ne \infty.
\end{cases}
\]
It is easy to see that $F_2(M, N)(\varphi)$ is an isomorphism if ϕ is an isomorphism, and it is computable if ϕ is computable. Since the verification of the remaining conditions can be carried out in the same way as in Proposition 7.3, we omit the proof. The additional properties from our proposition can be proved by induction.

Now, we consider the categories of graphs and partial orders. Let us consider a signature $\sigma^*$ that consists of a single binary predicate Q and the category $\mathrm{Mod}_{\sigma^*}$. We call the category $\mathrm{Mod}_{\sigma^*}$ the category of graphs, Graph. We denote by Ord the complete subcategory of the category $\mathrm{Mod}_{\sigma^*}$ whose objects are the models ⟨M, Q⟩, where Q defines a partial order on M.

Proposition 7.5. For each signature $\sigma_1$ with a single predicate of arity n ≥ 3 there exists a completely univalent functor $F_3$ from the category $\mathrm{Mod}_{\sigma_1}$ into the category Graph with binary predicate R such that
(i) $F_3 \upharpoonright \mathrm{Mod}^{com}_{\sigma_1}$ is a completely univalent functor from $\mathrm{Mod}^{com}_{\sigma_1}$ into $\mathrm{Graph}^{com} = \mathrm{Mod}^{com}_{\sigma^*}$,
(ii) for each model $M \in \mathrm{Ob}\,\mathrm{Mod}_{\sigma_1}$ the functor $F_3 \upharpoonright \mathrm{Mod}^{com}_{\sigma_1}(M)$ realizes an equivalence between the categories $\mathrm{Mod}^{com}_{\sigma_1}(M)$ and $\mathrm{Mod}^{com}_{\sigma^*}(F_3(M))$.
In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then $F_3(M)$ is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then $F_3(M)$ has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then $F_3(M)$ has no formally $\Sigma^0_\alpha$ Scott family.

Proof. We give a direct construction of a functor $F_3$ from $\mathrm{Mod}_{\sigma_1}$ into Graph. Let ⟨M, P⟩ be a model of signature $\sigma_1$, where P is a predicate of arity n. Consider the sets I = {0, 1, . . . , n} and $M_1 = (I \times M^n) \cup M$. We define the set $M' = M_1 \cup \{a_0, a_1, a_2, b_0, b_1, b_2, c_0, c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8\}$ as the basic set $|F_3(M)|$ of the model. We suppose that all elements of the set $\{a_0, a_1, a_2, b_0, b_1, b_2, c_0, c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8\}$ are different and new.
We fix elements a0, a1, a2, b0, b1, b2, c0, c1, c2, c3, c4, c5, c6, c7, c8, which will be referred to as basic elements for the definability of F3 on M. Now we can define a predicate R on M0 as follows. Let x, y ∈ M0. We set ⟨x, y⟩ ∈ R if one of the following conditions is fulfilled:
a) x = a_i & y = c_j, where 0 ≤ i ≤ 2 and (i = 0 & j ∈ {0, 1}) ∨ (i = 1 & j ∈ {2, 3, 4}) ∨ (i = 2 & j ∈ {5, 6, 7, 8}),
b) x = c_j & y = b_i, where 0 ≤ i ≤ 2 and (i = 0 & j ∈ {0, 1}) ∨ (i = 1 & j ∈ {2, 3, 4}) ∨ (i = 2 & j ∈ {5, 6, 7, 8}),
c) x ∈ M & y ∈ I × M^n & y = ⟨i, x1, ..., xn⟩ & x = x_i, where n ≥ i ≥ 1,
d) x, y ∈ I × M^n & x = ⟨i, x1, ..., xn⟩ & y = ⟨i + 1, x1, ..., xn⟩, where i ≥ 1,
S. S. GONCHAROV
e) x = a1 & y = ⟨0, y1, ..., yn⟩ ∈ I × M^n & M ⊭ P(y1, ..., yn),
f) x = a0 & y = ⟨0, y1, ..., yn⟩ ∈ I × M^n & M ⊨ P(y1, ..., yn),
g) x = a2 & y ∈ M.
Thus, we have constructed a graph on the set M0.

Let M and M0 be two models of signature σ1, and let ϕ be an isomorphism from M onto M0. Then we define the function [F3(M, M0)](ϕ) directly on the elements:

$$[F_3(M, M_0)(\varphi)](x) = \begin{cases} \varphi(x), & \text{if } x \in M; \\ \langle i, \varphi(x_1), \dots, \varphi(x_n) \rangle, & \text{if } x = \langle i, x_1, \dots, x_n \rangle \in I \times M^n; \\ x, & \text{otherwise.} \end{cases}$$

A successive verification of all the cases shows that F3(M, M0)(ϕ) is an isomorphism, and that it is computable if the isomorphism ϕ is computable. We only prove that the functor F3 is completely univalent. Let Ψ be an isomorphism from the model F3(M) onto F3(M0). Then the restriction of Ψ to the subset M, which is definable in F3(M) by an existential formula, induces an isomorphism Ψ0 of the models M and M0. Since all the elements in the model
⟨M0, P⟩ are of type ⟨i, x1, ..., xn⟩ and are definable over elements of M by existential formulas, it is easy to see that F3(M, M0)(Ψ0) = Ψ.

We now prove assertion (ii). Let M be a model of signature σ1, and let N ∈ $\mathrm{Mod}^{\sigma^*}_{\mathrm{com}}(F_3(M))$. Since the elements a0, a1, a2, b0, b1, b2, c0, c1, c2, c3, c4, c5, c6, c7, c8 are definable by an existential formula over elements in F3(M), we select the corresponding elements in N and denote them by a0′, a1′, a2′, b0′, b1′, b2′, c0′, c1′, c2′, c3′, c4′, c5′, c6′, c7′, c8′. We choose the elements that are connected by the basic binary predicate with a2′. This is exactly the definable set X0 which is isomorphic to M in F3(M). Let us define the predicate P^n on X0 as follows:

$$\langle x_1, \dots, x_n \rangle \in P^n \iff N \models (\exists y_1, \dots, y_n)\Bigl(y_1 R \dots R\, y_n \;\&\; \bigwedge_{1 \le i \le j \le n} x_i R\, y_j \;\&\; a_0 R\, y_1\Bigr).$$

It is easy to see that

$$\langle x_1, \dots, x_n \rangle \notin P^n \iff N \models (\exists y_1, \dots, y_n)\Bigl(y_1 R \dots R\, y_n \;\&\; \bigwedge_{1 \le i \le j \le n} x_i R\, y_j \;\&\; a_1 R\, y_1\Bigr).$$

Therefore, ⟨X0, P^n⟩ is a computable model of signature σ1, and a direct verification shows that the model F3(⟨X0, P^n⟩) is computably isomorphic to N.

It remains to prove the additional properties of our proposition. All the elements of the set (I × M^n) ∪ {a0, a1, a2, b0, b1, b2, c0, c1, c2, c3, c4, c5, c6, c7, c8} are definable in F3(M) over elements from M by existential formulas from a computable set of these formulas. Thus, we can construct a formally $\Sigma^0_\alpha$ Scott family for the model F3(M) from a formally $\Sigma^0_\alpha$ Scott family for the model M. Conversely, suppose there is a formally $\Sigma^0_\alpha$ Scott family for the model F3(M). We can see that the model F3(M) is Δ-definable in the model M with the basic set

$$M \;\cup\; \bigcup_{i=1}^{n+1} M^{n+i}/\Theta_i \;\cup\; \bigcup_{i=1}^{15} M^{2n+1+i}/\Delta_i.$$

We set ⟨X, Y⟩ ∈ Θ_i if X = ⟨x1, ..., x_{n+i}⟩, Y = ⟨y1, ..., y_{n+i}⟩ and x_j = y_j for any 1 ≤ j ≤ n. For the other equivalence relation we set ⟨X, Y⟩ ∈ Δ_i for any elements X, Y from M^{2n+1+i}. Since this model is Δ-definable, we can define a formally $\Sigma^0_\alpha$ Scott family for the model M.

Proposition 7.6. For each signature σ1 with a single binary predicate R there exists a completely univalent functor F4 from the category $\mathrm{Mod}_{\sigma_1}$ into the category Ord such that
(i) $F_4 \upharpoonright \mathrm{Mod}^{\sigma_1}_{\mathrm{com}}$ is a completely univalent functor from $\mathrm{Mod}^{\sigma_1}_{\mathrm{com}}$ into $\mathrm{Mod}^{\sigma^*}_{\mathrm{com}}$,
(ii) for each model M ∈ Ob $\mathrm{Mod}_{\sigma_1}$ the functor $F_4 \upharpoonright \mathrm{Mod}^{\sigma_1}_{\mathrm{com}}(M)$ realizes an equivalence between the categories $\mathrm{Mod}^{\sigma_1}_{\mathrm{com}}(M)$ and $\mathrm{Mod}^{\sigma^*}_{\mathrm{com}}(F_4(M))$.
In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then F4(M) is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then F4(M) has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then F4(M) has no formally $\Sigma^0_\alpha$ Scott family.

Proof. We construct a functor F4 from the category $\mathrm{Mod}_{\{R\}}$ into the category Ord that satisfies the assumptions of our proposition. Let M = ⟨M, R⟩ be a model with a single binary predicate R. We define a partially ordered set ⟨M0, ≤⟩ with a basic set M0 that will be the image of M under the functor F4. We set

M0 = M ∪ (M^2 × {0, 1}) ∪ {a1, a2, a3, a4, a5} ∪ {b1, ..., b7, b8},

where the elements of the set {a1, a2, a3, a4, a5} ∪ {b1, ..., b7, b8} are new. Let us define a relation ≤ on this set M0 such that its transitive closure is the desired partial order on M0:
1) a1 ≤ a2, a2 ≤ a4, a2 ≤ a3, a4 ≤ a5,
2) b1 ≤ b2, b2 ≤ b3, b3 ≤ b4, b4 ≤ b5, b5 ≤ b6, b5 ≤ b7, b7 ≤ b8,
3) if x1, x2 ∈ M and x1 ≠ x2, then ⟨x1, x2, 0⟩ ≤ x1, and ⟨x1, x2, i⟩ ≤ x2 for i ∈ {0, 1},
4) if x1 ≠ x2 ∈ M and M ⊨ R(x1, x2), then a5 ≤ ⟨x1, x2, 0⟩,
5) if x1 ≠ x2 ∈ M and M ⊭ R(x1, x2), then a3 ≤ ⟨x1, x2, 0⟩,
6) if x1 ∈ M and M ⊨ R(x1, x1), then b6 ≤ ⟨x1, x1, 0⟩,
7) if x1 ∈ M and M ⊭ R(x1, x1), then b8 ≤ ⟨x1, x1, 0⟩.
We define F4(M, M0) on isomorphisms ϕ in the same manner as in the case of the functor F3. The proof of all the properties of this functor follows the same idea as in Proposition 7.5.

Using the idea of the proof of Proposition 7.5, it is easy to construct a functor from a category of arbitrary signature into a category of bounded signature.

Proposition 7.7. For each signature Σ there exist a bounded signature Σ0 and a completely univalent functor F5 from the category $\mathrm{Mod}_{\Sigma}$ into $\mathrm{Mod}_{\Sigma_0}$ such that
(i) $F_5 \upharpoonright \mathrm{Mod}^{\Sigma}_{\mathrm{com}}$ is a completely univalent functor from the category $\mathrm{Mod}^{\Sigma}_{\mathrm{com}}$ into $\mathrm{Mod}^{\Sigma_0}_{\mathrm{com}}$,
(ii) for each model M of signature Σ the functor $F_5 \upharpoonright \mathrm{Mod}^{\Sigma}_{\mathrm{com}}(M)$ realizes an equivalence of the categories $\mathrm{Mod}^{\Sigma}_{\mathrm{com}}(M)$ and $\mathrm{Mod}^{\Sigma_0}_{\mathrm{com}}(F_5(M))$.
In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then F5(M) is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then F5(M) has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then F5(M) has no formally $\Sigma^0_\alpha$ Scott family.

Proof. We introduce a new signature σ*. In σ*, we take all predicates from σ with arity n ≤ 2. If a predicate symbol P_n has arity m_n ≥ 3, then we add three new predicate symbols to σ*: a binary predicate symbol R_n and two unary predicate symbols A_n and B_n. We also add one new unary predicate symbol U. Now, for each m_n ≥ 3, we consider the reduct M_n of the model M to the signature $\Sigma_n = \langle P_n^{m_n} \rangle$. We consider a model L_n with M ⊆ |L_n| that is isomorphic to the model F3(M_n) from Proposition 7.5 by an isomorphism ϕ_n from F3(M_n) onto L_n such that ϕ_n(m) = m for any element m ∈ M, and such that |L_n| ∩ |L_k| = M for any n ≠ k. The basic set |F5(M)| of the model F5(M) is equal to the union $\bigcup_n |L_n|$. We put U = M. We interpret each predicate symbol P from σ with arity n ≤ 2 as it is interpreted in M. Now we define the remaining predicate symbols from the signature σ*. Let R_n on |F5(M)| be exactly the binary predicate from L_n. Let A_n be the set |L_n| \ M, and let B_n be the set {ϕ_n(a0), ϕ_n(a1), ϕ_n(a2), ϕ_n(b0), ϕ_n(b1), ϕ_n(b2), ϕ_n(c0), ϕ_n(c1), ϕ_n(c2), ϕ_n(c3), ϕ_n(c4), ϕ_n(c5), ϕ_n(c6), ϕ_n(c7), ϕ_n(c8)}, where {a0, a1, a2, b0, b1, b2, c0, c1, c2, c3, c4, c5, c6, c7, c8} is the set of basic elements for the definability of F3 on M_n. Thus, we obtain the desired functor F5 on the objects of the category. Using the construction of Proposition 7.5, we can easily define it also on the morphisms of our category. The verification of all the conditions is the same as in the proof of Proposition 7.5.
Remark. In the case of a signature with function symbols, we can consider a new signature with predicates for the graphs of these functions.

Thus, we have proved the following assertion.

Theorem 7.8. For each signature Σ there exist a signature Σ0 with only one binary predicate R and a completely univalent functor F from the category $\mathrm{Mod}_{\Sigma}$ into $\mathrm{Mod}_{\Sigma_0}$ such that
(i) $F \upharpoonright \mathrm{Mod}^{\Sigma}_{\mathrm{com}}$ is a completely univalent functor from the category $\mathrm{Mod}^{\Sigma}_{\mathrm{com}}$ into $\mathrm{Mod}^{\Sigma_0}_{\mathrm{com}}$,
(ii) for each model M of signature Σ the functor $F \upharpoonright \mathrm{Mod}^{\Sigma}_{\mathrm{com}}(M)$ realizes an equivalence of the categories $\mathrm{Mod}^{\Sigma}_{\mathrm{com}}(M)$ and $\mathrm{Mod}^{\Sigma_0}_{\mathrm{com}}(F(M))$.
In addition,
(a) if M is $\Delta^0_\alpha$ categorical, then F(M) is $\Delta^0_\alpha$ categorical,
(b) if M has $\Delta^0_\alpha$ dimension n, then F(M) has $\Delta^0_\alpha$ dimension n,
(c) if M has no formally $\Sigma^0_\alpha$ Scott family, then F(M) has no formally $\Sigma^0_\alpha$ Scott family.

This assertion means that it suffices to study the problems connected to computable equivalence and self-equivalence only on partially ordered sets or graphs, since nothing new can appear in more complex signatures. The above two theorems lead to the following results.

Theorem 7.9 (S. Goncharov and D. Tusupov). Suppose that G is a graph structure and the partial ordering (graph) ∆(G) is constructed from G, Bi in the way described in Theorem 7.1 and then by 7.2. Then G has a $\Delta^0_\alpha$ copy if and only if ∆(G) has a computable copy. More generally, for any X, G has a $\Delta^0_\alpha(X)$ copy if and only if ∆(G) has an X-computable copy. In addition,
(a) if G has just one $\Delta^0_\alpha$ copy, up to a $\Delta^0_\alpha$ isomorphism, then ∆(G) is $\Delta^0_\alpha$ categorical,
(b) if G has just n $\Delta^0_\alpha$ copies, up to a $\Delta^0_\alpha$ isomorphism, then ∆(G) has $\Delta^0_\alpha$ dimension n,
(c) if G has no $\Sigma^0_\alpha$-computable Scott family made up of finitary existential formulas, then ∆(G) has no formally $\Sigma^0_\alpha$ Scott family.

§8. Lifting the basic results. Here is our lifting of the result by Goncharov on structures that are computably categorical, but not relatively computably categorical.

Theorem 8.1. [11] For each computable successor ordinal α there is a structure that is $\Delta^0_\alpha$ categorical, but not relatively $\Delta^0_\alpha$ categorical (and without $\Sigma^0_\alpha$ Scott family).

Corollary 8.2 (S. Goncharov and D. Tusupov). For each computable successor ordinal α there is a partial ordering (graph) that is $\Delta^0_\alpha$ categorical, but not relatively $\Delta^0_\alpha$ categorical (and without $\Sigma^0_\alpha$ Scott family).

Here is our lifting of the result of Manasse on relations that are intrinsically c.e. but not relatively intrinsically c.e.

Theorem 8.3. [11] For each computable successor ordinal α, there is a computable structure with a relation that is intrinsically $\Sigma^0_\alpha$, but not relatively intrinsically $\Sigma^0_\alpha$.

Corollary 8.4 (S. S. Goncharov, D. Tusupov). For each computable successor ordinal α, there is a computable partial ordering (graph) with a relation that is intrinsically $\Sigma^0_\alpha$ but not relatively intrinsically $\Sigma^0_\alpha$.
Here is our lifting of the result of Goncharov on structures with finite computable dimension.

Theorem 8.5. [11] For each computable successor ordinal α and each finite n there is a computable structure with $\Delta^0_\alpha$ dimension n.

Corollary 8.6 (S. S. Goncharov, D. Tusupov). For each computable successor ordinal α and every finite n there is a computable partial ordering (graph) with $\Delta^0_\alpha$ dimension n.

Here is our lifting of the result of Slaman and Wehner.

Theorem 8.7. [11] For each computable successor ordinal α there is a structure with copies in just the degrees of sets X such that $\Delta^0_\alpha(X)$ is not $\Delta^0_\alpha$. In particular, for each finite n there is a structure with copies in just the non-low_n degrees.

Corollary 8.8 (S. S. Goncharov, D. Tusupov). For each computable successor ordinal α there is a partial ordering (graph) with copies in just the degrees of sets X such that $\Delta^0_\alpha(X)$ is not $\Delta^0_\alpha$. In particular, for each finite n there is a structure with copies in just the non-low_n degrees.

Starting from these examples of computable graphs, and using the construction of [17], we can now obtain many other algebraic structures with the same properties as in Theorems 8.1, 8.3, 8.5, and 8.7.

REFERENCES
[1] C. J. Ash and J. F. Knight, Computable Structures and the Hyperarithmetical Hierarchy, Studies in Logic and the Foundations of Mathematics, vol. 144, North-Holland Publishing Co., Amsterdam, 2000.
[2] W. Calvert, S. S. Goncharov, and J. F. Knight, Computable structures of Scott rank $\omega_1^{CK}$ in familiar classes, preprint.
[3] W. Calvert, J. F. Knight, and J. M. Millar, Computable trees of Scott rank $\omega_1^{CK}$ and computable approximation, preprint.
[4] Yu. L. Ershov and S. S. Goncharov, Constructive Models, Siberian School of Algebra and Logic, vol. 6, Consultants Bureau, New York, 2000.
[5] Yu. L. Ershov, S. S. Goncharov, A. Nerode, and J. Remmel, Handbook of Recursive Mathematics, Studies in Logic and the Foundations of Mathematics, vol. 138-139, North-Holland, Amsterdam, 1998.
[6] S. S. Goncharov, The quantity of non-autoequivalent constructivizations, Algebra and Logic, vol. 16 (1977), pp. 257–282.
[7] S. S. Goncharov, The quantity of non-autoequivalent constructivizations, Algebra and Logic, vol. 16 (1977), pp. 169–185, (English translation).
[8] S. S. Goncharov, Computable single-valued numerations, Algebra and Logic, vol. 19 (1980), pp. 325–356, (English translation).
[9] S. S. Goncharov, Problem of the number of non-self-equivalent constructivizations, Algebra and Logic, vol. 19 (1980), pp. 401–414, (English translation).
[10] S. S. Goncharov, Problem of the number of non-self-equivalent constructivizations, Soviet Doklady Mathematics, vol. 21 (1980), pp. 411–414, (English translation).
[11] S. S. Goncharov, V. S. Harizanov, J. F. Knight, C. McCoy, R. G. Miller, and R. Solomon, Enumerations in computable structure theory, Annals of Pure and Applied Logic, vol. 136 (2005), no. 3, pp. 219–246.
[12] S. S. Goncharov, V. S. Harizanov, J. F. Knight, A. Morozov, and A. Romina, On automorphic tuples of elements in computable models, preprint.
[13] S. S. Goncharov, V. S. Harizanov, J. F. Knight, and R. Shore, $\Pi^1_1$ relations and paths through O, The Journal of Symbolic Logic, vol. 69 (2004), no. 2, pp. 585–611.
[14] S. S. Goncharov and B. Khoussainov, Complexity of categorical theories with computable models, Algebra and Logic, vol. 43 (2004), no. 6, pp. 365–373, (English translation).
[15] S. S. Goncharov and J. F. Knight, Computable structure/non-structure theorems, Algebra and Logic, vol. 41 (2002), pp. 351–373, (English translation).
[16] J. Harrison, Recursive pseudo-well-orderings, Transactions of the American Mathematical Society, vol. 131 (1968), pp. 526–543.
[17] D. R. Hirschfeldt, B. Khoussainov, R. A. Shore, and A. M. Slinko, Degree spectra and computable dimensions in algebraic structures, Annals of Pure and Applied Logic, vol. 115 (2002), no. 1-3, pp. 71–113.
[18] J. F. Knight and J. M. Millar, Computable structures of rank $\omega_1^{CK}$, submitted to J. Math. Logic.
[19] M. Makkai, An example concerning Scott heights, The Journal of Symbolic Logic, vol. 46 (1981), no. 2, pp. 301–318.
[20] M. Nadel, $L_{\omega_1 \omega}$ and admissible fragments, Model-Theoretic Logics (K. J. Barwise and S. Feferman, editors), Springer, New York, 1985, pp. 271–316.
[21] M. E. Nadel, Scott sentences and admissible sets, Annals of Pure and Applied Logic, vol. 7 (1974), pp. 267–294.
[22] V. L. Selivanov, The numerations of families of general recursive functions, Akademiya Nauk SSSR. Sibirskoe Otdelenie. Institut Matematiki. Algebra i Logika, vol. 15 (1976), no. 2, pp. 205–226, 246.
[23] T. Slaman, Relative to any nonrecursive set, Proceedings of the American Mathematical Society, vol. 126 (1998), no. 7, pp. 2117–2122.
[24] I. N. Soskov, Intrinsically hyperarithmetical sets, Mathematical Logic Quarterly, vol. 42 (1996), no. 4, pp. 469–480.
[25] I. N. Soskov, Intrinsically $\Pi^1_1$ relations, Mathematical Logic Quarterly, vol. 42 (1996), no. 1, pp. 109–126.
[26] S. Wehner, Enumerations, countable structures and Turing degrees, Proceedings of the American Mathematical Society, vol. 126 (1998), no. 7, pp. 2131–2139.

INSTITUTE OF MATHEMATICS OF SB RAS
PR. KOPTUGA 4
NOVOSIBIRSK 630090, RUSSIA
E-mail:
[email protected]
INDEPENDENCE FOR TYPES IN ALGEBRAICALLY CLOSED VALUED FIELDS
DEIRDRE HASKELL
Logic Colloquium '05, Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

Introduction: the historical context. My goal in this article, as it was in the lecture at the Logic Colloquium in Athens, is to survey the notion of independence of types, a fundamental tool in the area of stability theory, and see different ways in which it can be realized in a particular example of an unstable theory, the theory of algebraically closed valued fields. I thank the anonymous referee for many comments which have significantly improved this article. Of course, all remaining errors are my own.

The notion of independence was first formulated by Shelah [Sh] in the 1970's in the context of classification theory. The motivating problem was the following.

Problem. Given a theory T and a cardinal λ > |T|, let I(T, λ) be the number of models of T of cardinality λ, up to isomorphism. What can the function I(T, λ) be?

The first observation is that there is the following fundamental dichotomy.
• Suppose T is unstable; that is, for every uncountable λ, there is a parameter set A with |A| ≤ λ such that the number of types over A is 2^λ. In this case, I(T, λ) = 2^λ, the maximum possible value.
• Suppose T is stable. Then there are different possibilities for I(T, λ), so one can look for further conditions on the theory which will serve to determine the value that I(T, λ) takes.

Examples of unstable theories include any theory of a structure which defines an infinite ordering or a non-trivial valuation onto an ordered group; examples of stable theories include algebraically closed fields, separably closed fields and differentially closed fields. This dichotomy of stable/unstable theories resulted, for a while, in a dichotomy between pure and applied model theorists. Pure model theorists in the 1970's and 80's developed tools to study stable theories, the most fundamental being forking and its attendant concepts of independence, dimension, orthogonality, and canonical base. Of course, many mathematically interesting theories are unstable, and these continued to be studied by applied model theorists, using various methods including quantifier elimination. An exciting development of the last fifteen years is that the two branches of model theory have moved back together, as it has become clear that some of the ideas and tools of stability theory can be used also in the unstable context.

The following setting is the context for discussing notions of independence. Fix a theory T, and a large, sufficiently saturated model U (the monster model). All sets of parameters will be taken from U, and will be small relative to the size of U. Mostly, a will be a tuple from U, and B ⊇ C will be sets of parameters from U. An n-type over C is a consistent set of formulae in an n-tuple of variables with parameters from C. The type of a over C, tp(a/C), is the set of all formulae over C which hold at a. The set of realizations in U of tp(a/C) is an orbit of the group of automorphisms of U fixing C. (Of course, if a ∈ C then this orbit has size 1.)

Informally, to say that (the type of) a is independent from B over C, written a ⫝_C B, should say that B knows no more about a than C does. In other words, allowing parameters from B (and hence potentially increasing the expressibility of the formulae) does not decrease the set of realisations of tp(a/C). Considered purely as an abstract ternary relation with this informal definition, there are properties one might expect the relation to have. For example:

Monotonicity: if a ⫝_C B and C ⊆ C′ ⊆ B′ ⊆ B then a ⫝_{C′} B′,
Transitivity: if a ⫝_C B and a ⫝_B B′ then a ⫝_C B′,
Symmetry: if a ⫝_C B and b ∈ B then b ⫝_C a ∪ C.

The next two properties come from the fact that we work in a first-order language, and formulae involve only finitely many parameters.

Finite character: a ⫝_C B if and only if a ⫝_C C ∪ B0 for every finite B0 ⊂ B.
Local character: a ⫝_C B if for every C0 ⊆ C of cardinality at most |T|, a ⫝_{C0} B.

The last three properties that I want to mention are somewhat more complicated.

Independence theorem (or amalgamation property): Suppose M is a model, M ⊆ A ∩ B, A ⫝_M B, a ≡_M b and a ⫝_M A, b ⫝_M B. Then there is c such that c ≡_A a, c ≡_B b and c ⫝_M A ∪ B.
Existence of invariant extensions: Given a and C ⊆ B, there is a′ ≡_C a with a′ ⫝_C B.
Uniqueness of invariant extensions (stationarity): Suppose M is a model, M ⊆ B, a ≡_M a′ and a ⫝_M B, a′ ⫝_M B. Then a ≡_B a′.
The following remarkable theorem ties the axiomatic properties of the independence notion back into the more global property of a theory being stable.

Theorem 1.
1. If ⫝ satisfies all of the above properties then it is the independence derived from non-forking and the theory is stable.
2. [KP] If ⫝ satisfies all but the last of the above properties then it is the independence derived from dividing and the theory is simple.

An account of this theorem can be found in the review article [KP2]. Examples of simple theories include pseudofinite fields and the model companion of the theory of fields with an automorphism. This theorem can be taken as either a positive or a negative result. Positively, it says that one only needs to check the axiomatic properties in order to prove stability or simplicity of a theory. Negatively, it says that unless a theory is at least simple, it has no hope of having a well-behaved notion of independence. However, we can still consider possible ternary relations measuring the potential information content of a set of parameters, and find out how many of the favored axiomatic properties a selected definition actually has. This is the approach I will take for examining theories of valued fields.

Definition and examples of valued fields. I begin by recalling the definition of a valued field.

Definition 2. A valuation is a map v : K → Γ ∪ {∞} from a field K onto an ordered group Γ such that for all a, b ∈ K:
v(ab) = v(a) + v(b),
v(a + b) ≥ min{v(a), v(b)},
v(a) = ∞ ⇔ a = 0.

There are three fundamental sets associated to the valuation: the valuation ring O = {x : v(x) ≥ 0}, the maximal ideal M = {x : v(x) > 0} and the residue field k = O/M. The value group, valuation ring, maximal ideal and residue field of a given valued field (K, v) will be denoted respectively by Γ_K, O_K, M_K and k_K. The characteristic of K as a valued field is the ordered pair (char(K), char(k_K)).
The possibilities for the characteristic of a valued field are (0, 0), (0, p) and (p, p), where p is a prime number.

Fields of formal Laurent series provide a family of examples of fields naturally equipped with a valuation. Given a field F, define

$$K = F((t)) = \Bigl\{ \sum_{i \in \mathbb{Z},\, i \ge M} q_i t^i : q_i \in F,\ q_M \neq 0,\ M \in \mathbb{Z} \Bigr\}.$$

Then the function v : F((t)) → Z defined by $v\bigl(\sum_{i \ge M} q_i t^i\bigr) = M$ is a valuation. In this case, Γ_K = Z and k_K = F.
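As a small illustration of Definition 2 in this example, the following Python sketch represents an element of Q((t)) by finitely many of its terms, stored as a dictionary from exponents to coefficients, and computes the valuation as the least exponent with a nonzero coefficient. The names `valuation`, `add`, `mul` and `normalize` are hypothetical helpers introduced only for this sketch, and working with finitely many terms over Q is an assumption made to keep the example runnable; it is not part of the article.

```python
from fractions import Fraction

def normalize(series):
    """Drop zero coefficients from an {exponent: coefficient} dictionary."""
    return {e: c for e, c in series.items() if c != 0}

def valuation(series):
    """Least exponent with a nonzero coefficient; the zero series has valuation infinity."""
    s = normalize(series)
    return min(s) if s else float("inf")

def add(f, g):
    """Termwise sum of two finitely supported series."""
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) + c
    return normalize(out)

def mul(f, g):
    """Cauchy product of two finitely supported series."""
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return normalize(out)

# f = 3t^-2 + 1 - t^5 and g = -3t^-2 + 2t, both of valuation -2.
f = {-2: Fraction(3), 0: Fraction(1), 5: Fraction(-1)}
g = {-2: Fraction(-3), 1: Fraction(2)}

# v(fg) = v(f) + v(g)
assert valuation(mul(f, g)) == valuation(f) + valuation(g)
# v(f + g) >= min{v(f), v(g)}; here the leading terms cancel, so the inequality is strict.
assert valuation(add(f, g)) >= min(valuation(f), valuation(g))
```

The second assertion shows why the axiom is an inequality: the sum f + g has valuation 0, although both summands have valuation −2.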
If F is algebraically closed of characteristic 0, then it is a well-known result that the field of Puiseux series

$$L = F\langle\langle t \rangle\rangle = \bigcup_{n=1}^{\infty} F((t^{1/n}))$$

is algebraically closed. The elements of the field are power series as above in fractional powers of t; the valuation is defined as above. Here Γ_L = Q and k_L = F. Thus, taking F = C gives an example of an algebraically closed valued field of characteristic (0, 0).

To get an example of characteristic (p, p) is a little more subtle. Write $\tilde{F}_p$ for the algebraic closure of F_p, the finite field with p elements. Then $\tilde{F}_p\langle\langle t \rangle\rangle$ is not algebraically closed, but $L = \tilde{F}_p((Q))$, the field of formal power series whose support is a well-ordered subset of Q, is algebraically closed. Here Γ_L = Q and $k_L = \tilde{F}_p$.

To complete the picture, we need an example of an algebraically closed valued field of characteristic (0, p). The elements of $\tilde{Q}_p$, the algebraic closure of the field Q_p of p-adic numbers, can be identified with their expansions in fractional powers of p with coefficients in $\tilde{F}_p$. These expansions are not strictly power series, as p interacts with the coefficients in the definitions of addition and multiplication, but it gives the same tree-like picture for the elements of the field.

A useful picture of a valued field is to think of it as a set of infinite paths through a tree. The value group is the set of nodes on each branch. Consider for example $F_p[[t]] = \{\sum_{i=0}^{\infty} a_i t^i : a_i \in F_p\}$. This is represented by a p-branching tree with a root at valuation 0 distinguishing the subsets $F_r = \{r + \sum_{i=1}^{\infty} a_i t^i\}$, for r = 0, ..., p − 1. At valuation 1, each of these sets branches again in p-many directions. The valuation ring of the field F_p((t)) is the tree just described. The field is the infinitely descending tree, with the same pattern at any finite valuation. The maximal ideal is the set F_0, and the elements of the residue field can be identified with the sets F_r, r = 0, ..., p − 1. In general, the residue field and the set of branches at each node of the tree have the same cardinality.
If the value group is divisible, then the nodes of the tree will be densely ordered on each branch of the tree. It is straightforward to write down the axioms for a non-trivially valued field, say in a language of fields with a predicate for the relation v(x) ≥ v(y). We can add the usual axiom scheme to enforce that the field is algebraically closed, and the consequences form ACVF; the theory of algebraically closed valued fields. To get a complete theory one only has to specify the field and residue field characteristic. This theory was studied in the mid 1950’s by Abraham Robinson, who proved the result that we now understand to imply: Theorem 3. [R] ACVF has quantifier elimination. As for all quantifier elimination results, we immediately get further statements about the character of the definable sets in any model of the theory.
Corollary 4. Let K be an algebraically closed valued field. Definable sets are boolean combinations of sets of the form {x : f(x) = 0} and {x : v(f(x)) ≥ v(g(x))} where f, g are polynomials over K. In particular, a definable set in one variable is either finite or has nonempty interior in the valuation topology.

In fact, more is true. Since K is algebraically closed, the polynomials can be factored into linear terms. Hence an infinite definable set in one variable is a boolean combination of balls:
{x ∈ K : v(x − c) ≥ γ} is the closed ball of radius γ around c,
{x ∈ K : v(x − c) > γ} is the open ball of radius γ around c.
In the definition of a closed ball, we allow γ = ∞ in order to let a single point be a closed ball.

More general quantifier elimination results, for the theory of algebraically closed valued fields in a language with sorts for the value group and residue field as well as the field itself, were proved by Delon [D]. It follows that the residue field k and value group Γ are stably embedded; that is, that any set definable in k or Γ using parameters from the field structure is definable already in k or Γ itself.

Ten years after Robinson's result, Ax and Kochen, and independently Eršov, studied the relationship between the theory of a valued field and the associated theories of its value group and residue field.

Theorem 5. [AK, E] Let K, L be henselian valued fields of residual characteristic 0. Then K is elementarily equivalent to L if and only if their value groups are elementarily equivalent and their residue fields are elementarily equivalent.

This theorem led to significant applications in number theory. From the point of view of this article, we are looking for a theory neither stable nor simple which might nevertheless have a reasonable notion of independence.
The Ax-Kochen-Eršov theorem suggests that ACVF is a prime candidate for this purpose, as
• the residue field is algebraically closed and stably embedded, hence stable and strongly minimal,
• the value group is divisible and stably embedded, hence o-minimal and thus as nice as an unstable theory can be, and
• the full theory seems to be controlled in some way by the theories of its residue field and value group.

Independence in an algebraically closed valued field. For a first approach, we proceed by analogy with independence in a theory which, at least naively, appears to be closely related: that of an algebraically closed field. There are many equivalent ways to formulate the definition of independence; the three we look at here come from Morley rank, genericity, and polynomial ideals.
I recall the different definitions, to see how each might be recast in the context of a valuation. All of the results about valued fields in this section can be found in the papers [HHM1] and [HHM2].

Morley rank. First recall the definition of Morley rank, RM, for definable sets. Let D be a definable set in some (ℵ0-saturated) structure.

Definition 6.
• RM(D) ≥ 0 iff D ≠ ∅;
• RM(D) ≥ α + 1 iff there is an infinite family D_i of pairwise disjoint definable sets contained in D with RM(D_i) ≥ α for all i;
• for α a limit ordinal, RM(D) ≥ α iff RM(D) ≥ β for every β < α.

Thus, if D = ∅ then RM(D) = −1. If RM(D) ≥ α and RM(D) ≱ α + 1 then RM(D) = α. If RM(D) ≥ α for all ordinals α then RM(D) = ∞. We also define RM(a/C) = inf{RM(D) : D is a C-definable set containing a}, and hence the following notion of independence:

a ⫝_C B ⟺ RM(a/B) = RM(a/C).

Morley proved that a theory is ω-stable if and only if every type has a Morley rank which is an ordinal. Thus we do not expect to be able to use Morley rank to define independence for all types in an algebraically closed valued field. But we could ask whether it might be possible to use it for some types. However, the answer is no, as we see from the following proposition.

Proposition 7. Let K be an algebraically closed valued field, D an infinite definable subset of K^1. Then RM(D) = ∞.

Proof. Suppose RM(D) = α. Then by definition, D does not contain an infinite disjoint family of subsets D_i of rank α. Since D is infinite, it contains a closed or open ball; without loss of generality, we may assume that D = {x ∈ K : v(x − c) > γ}. Let γ < γ_1 < γ_2 < ··· be an infinite increasing sequence from Γ. Let c_i ∈ D be such that v(c − c_i) = γ_i, and let D_i = {x ∈ D : v(x − c_i) > γ_i}. By the superadditivity of the valuation, for i < j,

v(c_i − c_j) = min{v(c_i − c), v(c − c_j)} = γ_i.

Thus D_i ∩ D_j = ∅, since, if v(x − c_i) > γ_i, then

v(x − c_j) = min{v(x − c_i), v(c_i − c_j)} = γ_i < γ_j.

But D_i is isomorphic to D, by the map $x \mapsto \frac{d}{d_i}(x - c_i) + c$, where v(d) = γ, v(d_i) = γ_i. As isomorphisms preserve Morley rank, RM(D_i) = RM(D), which is a contradiction.
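The disjointness step of this proof uses only the ultrametric inequality, so it can be checked numerically in any valued field. The following Python sketch does this in Q with the 3-adic valuation, which is an assumption made purely for illustration: Q is not algebraically closed, so this is not a model of ACVF, and the names `vp`, `in_open_ball`, `centers` and `balls_containing` are hypothetical. It takes c = 0, γ = 0, γ_i = i and c_i = 3^i, and checks on a finite sample that no point lies in two of the balls D_i and that each D_i sits inside D.

```python
from fractions import Fraction

P = 3  # hypothetical choice of prime; any prime would do

def vp(x):
    """P-adic valuation of a rational number, with v(0) = infinity."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % P == 0:
        num //= P
        v += 1
    while den % P == 0:
        den //= P
        v -= 1
    return v

def in_open_ball(x, center, radius):
    """Membership in the open ball {x : v(x - center) > radius}."""
    return vp(Fraction(x) - center) > radius

# D is the open ball of radius 0 around c = 0; gamma_i = i and c_i = P**i,
# so that v(c - c_i) = gamma_i as in the proof.
centers = {i: Fraction(P) ** i for i in range(1, 6)}

def balls_containing(x):
    """Indices i such that x lies in D_i = {x : v(x - c_i) > gamma_i}."""
    return [i for i, c in centers.items() if in_open_ball(x, c, i)]

sample = [Fraction(n, d) for n in range(-60, 61) for d in (1, 2, 4, 5)]
for x in sample:
    hits = balls_containing(x)
    assert len(hits) <= 1              # the balls D_i are pairwise disjoint
    if hits:
        assert in_open_ball(x, 0, 0)   # each D_i is contained in D
```

A finite sample cannot prove disjointness, of course; the sketch only makes the computation v(x − c_j) = γ_i < γ_j from the proof concrete.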
Genericity (More precise statements of the following definitions and results can be found in Sections 2.3 and 2.5 of [HHM1] and Section 7 of [HHM2].) In an algebraically closed field F, we have a natural notion of genericity.

Definition 8. A point a ∈ F^n is generic in an irreducible algebraic variety V ⊆ F^n defined over C if a ∈ V and for any other irreducible algebraic variety W ⊆ F^n defined over C, if a ∈ W then V ⊆ W.

This gives the following definition of independence in an algebraically closed field:

a ⫝_C B ⟺ whenever a is generic over C in a variety V defined over C, then a remains generic in V over B.

In an algebraically closed field, genericity and Morley rank are related as follows: if a is generic in V over C then RM(a/C) = RM(V). Thus a ⫝_C B if and only if RM(a/B) = RM(V), which happens if and only if a remains generic in V over B.

Varieties are fundamental sets in an algebraically closed field; to see which sets might play an analogous role in a valued field, we look at the role the varieties play from the point of view of model theory. Since the theory of algebraically closed fields has quantifier elimination in the language of rings, all definable sets are boolean combinations of those defined by positive quantifier-free formulae. These are precisely the varieties. Since ACVF also has quantifier elimination, the sets in one variable which are analogous to the varieties are those defined by positive quantifier-free formulas; that is, the balls. A closed ball around 0 can be thought of as an O-submodule of K, in which case we write it as γO = {x : v(x) ≥ γ}. Similarly, an open ball around 0 is also an O-submodule of K, γM = {x : v(x) > γ}. In the process of studying 1-types in a valued field, one comes to realise that tp(a + γO) and tp(a + γM) should morally be thought of as 1-types (or, as we prefer to call them, unary types). Thus we work more generally not just with balls but with unary sets; that is, sets of the form γO/δO, γO/δM, γM/δO and γM/δM.

Definition 9. An element a in an algebraically closed valued field K is generic over C in a unary set V ⊆ K defined over acl(C) if a ∈ V and for any other definable unary set W ⊆ K defined over acl(C), if a ∈ W then V ⊆ W.

We therefore make the following definition of generic independence.

Definition 10. Suppose a is generic in V over C. Then a ⫝^g_C B if a is still generic in V over B.

The following examples, in which generic independence holds, help to form an intuition for what it means. In particular, we see that generic independence is not, in general, symmetric.
Example 11. First, suppose that $a$ is generic in $O$ over $C = \mathrm{acl}(\emptyset)$. Then $v(a) \geq 0$, but $a \notin M$, so $v(a) = 0$. Indeed, $\Gamma_{\mathrm{acl}(C(a))} = \Gamma_C$. Also, $a$ is not in any open $C$-definable ball with radius 0, so the residue of $a$ is not in the residue field of $C$. Now, let $a'$ be another generic element of $O$ over $a$; $a' ⫝^{g}_{C} C \cup \{a\}$. Then $v(a') = 0$, and also $v(a - a') = 0$. Furthermore, $v(f(a')) = 0$ for any monic polynomial $f(X) \in \mathrm{acl}(C(a))[X]$, thus $a ⫝^{g}_{C} C \cup \{a'\}$.

Example 12. Now suppose that $a$ is generic in $M$ over $C = \mathrm{acl}(\emptyset)$. Then $v(a) > 0$, but if $\gamma \in \Gamma_C$ with $\gamma > 0$, $a$ is not in the ball around 0 of radius $\gamma$, so $v(a) < \gamma$. Thus $v(a)$ is not in $\Gamma_C$. Let $a'$ be another generic element of $M$ over $a$; $a' ⫝^{g}_{C} C \cup \{a\}$. Then $0 < v(a') < v(a)$. Thus $a \not⫝^{g}_{C} C \cup \{a'\}$.

There is one fundamental difference between genericity in an algebraically closed field and the definition I have given here for an algebraically closed valued field. That is, the former is defined naturally for subsets of cartesian powers of the field, whereas for the latter it is essential that we are thinking of singletons. It may be possible to understand intrinsically the basic definable sets in $K^n$, and hence give a similar definition of genericity which is immediately appropriate for $n$-tuples. However, we use instead a sequential definition.

Definition 13. Let $a = (a_1, \ldots, a_n)$, where each $a_i$ is a singleton, and let $U = (U_1, \ldots, U_n)$. We say that $a ⫝^{g}_{C} B$ via $U$ if either $a \in \mathrm{acl}(C)$ or, for each $i$, $U_i$ is an $\mathrm{acl}(Ca_1 \ldots a_{i-1})$-definable unary set and $a_i$ is generic in $U_i$ over $Ba_1 \ldots a_{i-1}$.

We now have a definition of independence for which we can examine the axiomatic properties. It turns out that sequential independence satisfies transitivity and monotonicity. Symmetry fails, as we have seen. Finite character holds, as do existence and uniqueness of invariant extensions.

Polynomial ideals

(The results of this section can be found in Section 14 of [HHM2].) This brings us to the third method for formulating independence in an algebraically closed field that I mentioned; that is, via polynomial ideals. Given a type or a variety defined over the parameter set $C$, we can define the following two ideals of polynomials:

$I(a/C) = \{f(X) : f(a) = 0\}$, $I(V) = \{f(X) : \forall x \in V\ f(x) = 0\}$.

In an algebraically closed field it is a standard result that, for $a$ generic in $V$ over $C$, $I(a/C) = I(V)$. Hence the following formulation of independence is again equivalent:

$a ⫝_C B \iff I(a/B) = I(a/C)$.
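As a small worked illustration of the ideal formulation (not taken from the source; the particular choices of $C$, $a$ and $b$ below are assumptions made only for the example), consider the simplest case of a single transcendental element on the affine line:

```latex
% Worked example in an algebraically closed field F; C is an algebraically closed subfield.
% Assumption for illustration: a is transcendental over C, and V = F^1 is the affine line.
\begin{align*}
&a \text{ transcendental over } C \ \Longrightarrow\ a \text{ is generic in } V = F^1 \text{ over } C,
  \qquad I(a/C) = \{0\} = I(V). \\[4pt]
&\text{(i)}\ \ B = C \cup \{a\}: \quad X - a \in I(a/B), \text{ so } I(a/B) \neq I(a/C)
  \text{ and } a \text{ is not independent from } B \text{ over } C. \\[4pt]
&\text{(ii)}\ \ B = C \cup \{b\} \text{ with } a \text{ transcendental over } C(b): \quad
  I(a/B) = \{0\} = I(a/C), \text{ so } a \text{ is independent from } B \text{ over } C.
\end{align*}
```

In case (i) the element $a$ is no longer generic in the line over $B$, since it lies in the smaller variety $\{a\}$ defined over $B$; this matches the genericity formulation of independence.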
We again use quantifier elimination to decide what might be the appropriate analogue in a valued field to the ideal of polynomials vanishing on the type. The terms are still polynomials, but the relations in the language include inequalities of the valuation, so the following set of polynomials, which turns out to be an $O_K$-module, is more intrinsic. From it we get the corresponding notion of independence.

Definition 14. $J(a/C) = \{f(X) : v(f(a)) \geq 0\}$ and

$a ⫝^{J}_{C} B \iff J(a/C) = J(a/B)$.

The following example helps to form an intuition for the definition.

Example 15. As in Example 12 above, take $a$ generic in $M$ and $a'$ generic over $a$ in $M$, both over $C = \mathrm{acl}(\emptyset)$. Then

$a' ⫝^{g}_{C} C \cup \{a\}$, $\quad a \not⫝^{g}_{C} C \cup \{a'\}$, $\quad$ and $0 < v(a') < v(a)$.

Let $f(X) = \frac{1}{a'}(X - a')$. Then $\mathrm{tp}(a/a') \vdash v(f(x)) = 0$, hence $f(X) \in J(a/a')$. But it is possible to have $b$ with the same type as $a$ over $C$ such that $0 < v(b) < v(a')$. Then $v(f(b)) < 0$, so $f(X) \notin J(a/\emptyset)$. Thus

$a \not⫝^{J}_{C} C \cup \{a'\}$.
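The valuation arithmetic in Example 15 can be checked numerically in a toy setting. The sketch below is only an illustration under stated assumptions: it uses the 5-adic valuation on the rationals (which is not algebraically closed, so nothing here is generic in the sense of Definition 9), it reads the polynomial of the example as f(X) = (X − a′)/a′, and the function names and the particular values of a, a′ and b are hypothetical choices made so that 0 < v(b) < v(a′) < v(a).

```python
from fractions import Fraction


def padic_valuation(x: Fraction, p: int) -> int:
    """Return the p-adic valuation v_p(x) of a nonzero rational x."""
    if x == 0:
        raise ValueError("v(0) is +infinity; not represented in this sketch")
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:   # each factor p in the numerator adds 1
        num //= p
        v += 1
    while den % p == 0:   # each factor p in the denominator subtracts 1
        den //= p
        v -= 1
    return v


def f_value(x: Fraction, a_prime: Fraction) -> Fraction:
    """Evaluate f(X) = (X - a') / a' at X = x (assumed reading of Example 15)."""
    return (x - a_prime) / a_prime


p = 5
a_prime = Fraction(p**2)   # v(a') = 2
a = Fraction(p**3)         # v(a)  = 3, so 0 < v(a') < v(a)
b = Fraction(p)            # v(b)  = 1, so 0 < v(b) < v(a')

v_fa = padic_valuation(f_value(a, a_prime), p)   # v(a - a') - v(a') = 2 - 2
v_fb = padic_valuation(f_value(b, a_prime), p)   # v(b - a') - v(a') = 1 - 2

print("v(f(a)) =", v_fa)
print("v(f(b)) =", v_fb)
```

The point of the computation is the ultrametric rule v(x − y) = min(v(x), v(y)) when v(x) ≠ v(y): the condition v(f(x)) ≥ 0 holds at a but fails at an element b of smaller valuation than a′, which is the mechanism by which f separates J(a/a′) from J(a/∅) in the example.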
In this example, we were able to make J-independence fail by having generic independence fail also, and indeed this is the easiest way to construct such examples. It turns out that this is true for all 1-types, though for n-types the notions are not equivalent in general.

Proposition 16. Let $a \in K^1$. Then $a ⫝^{J}_{C} B \iff a ⫝^{g}_{C} B$.

We again have a notion of independence for which we can consider the axiomatic properties, and it turns out that monotonicity and transitivity hold, symmetry fails, and finite character holds.

Stable domination

(A more detailed exposition can be found in Part A of [HHM2].) So far, we have proceeded by analogy. A rather different approach is to work with the strongly minimal structure that is already present in an algebraically closed valued field, and build on the notion of independence that it already has. The residue field $k$ is stable and stably embedded, but even more of the structure is stable. For a parameter set $C$, define the stable reduct of $C$ to be:
$\mathrm{St}_C$ = the multisorted structure whose sorts are the $C$-definable stable, stably embedded subsets of $U$ (the monster model), and whose relations are all the $C$-definable relations on these sorts.

For $A \subset U$, $\mathrm{St}_C(A) = \mathrm{St}_C \cap \mathrm{dcl}(A)$. If $C$ is a model of the theory, then $\mathrm{St}_C$ is just $k_U^{eq}$ with a family of $k$-vector spaces. But if $C$ is not a model, then there can be stable, stably embedded sets defined over $C$ which cannot be identified with sets definable over the residue field. We use the independence in the stable structure $\mathrm{St}_C$ to lift to a definition of independence for some types in the full structure.

Definition 17. Let $A$, $B$, $C$ be subsets of $U$. We say $A ⫝^{dom}_{C} B$ if $\mathrm{St}_C(A) ⫝_C \mathrm{St}_C(B)$ and $\mathrm{tp}(\mathrm{St}_C(A)/CB) \vdash \mathrm{tp}(A/CB)$.

Definition 18. Also, $\mathrm{tp}(A/C)$ is stably dominated if, whenever $\mathrm{St}_C(A) ⫝_C \mathrm{St}_C(B)$, we have $A ⫝^{dom}_{C} B$.

The axiomatic properties of domination independence are inherited directly from independence in the stable structure $\mathrm{St}_C$. Thus symmetry, monotonicity, finite character, local character and existence and uniqueness of invariant extensions all hold.

We now have three different definitions of independence in an algebraically closed valued field. It is clear from the fact that they do not all satisfy the same axiomatic properties that they are not equivalent to each other. Our intuition from the Ax-Kochen-Eršov theorem is that the instability in the theory ought to arise from the value group, and therefore the types for which the different kinds of independence will exhibit different behaviour ought to arise from interaction with the value group. We therefore formulate the following notion of orthogonality to the value group.

Definition 19. Assume $C$ is algebraically closed. We say that $\mathrm{tp}(a/C)$ is orthogonal to $\Gamma$ if for any model $M \supseteq C$, if $a ⫝^{g}_{C} M$ then $\Gamma(M(a)) = \Gamma(M)$.

Theorem 20. The type $\mathrm{tp}(a/C)$ is stably dominated if and only if $\mathrm{tp}(a/C)$ is orthogonal to $\Gamma$.

The resulting theorem seems to confirm our intuition.

Theorem 21. [HHM2, Theorem 14.9] If $\mathrm{tp}(a/C)$ is orthogonal to $\Gamma$ then all notions of independence are equivalent.

Conclusions. We have investigated three different notions of independence in an algebraically closed valued field; the last theorem states explicitly how they relate to each other. There are still questions to ask about these definitions; J-independence in particular is rather hard to work with. For example, we do not even know whether J-independent extensions exist. However, it is perhaps of more interest to see ways in which independence can be used.
Stable domination is the notion that might most easily have applications. Hrushovski has proved some results about the structure of definable groups with a stably dominated type. Such groups can arise in an algebraically closed valued field, for example as the set in the field which is the inverse image under the residue map of an algebraic group over the residue field. Furthermore, the notion of stable domination can be defined in any theory, but is more likely to be of interest if the theory has a reasonably large stable part. A natural place to look is in a differentially closed valued field. The notion of stable domination can also be generalized to look at types which are dominated by some other ‘nice’ part of the theory. Very recently, Hrushovski, Peterzil and Pillay have announced results about groups definable in o-minimal structures, using the concept of compactly dominated types. Finally, I note that recently, Shelah has been looking at the notion of ‘dependent theories’, which include all the ‘relation-minimal’ theories: o-minimal, P-minimal, and C-minimal (of which algebraically closed valued fields are an example). There seems to be a notion of independence, and it would be very interesting to know how it compares with the ones we have looked at here.

REFERENCES
[AK] J. Ax and S. Kochen, Diophantine problems over local fields. I, American Journal of Mathematics, vol. 87 (1965), pp. 605–630.
[D] F. Delon, Model theory of Henselian valued fields, Logic Colloquium ’87 (H.-D. Ebbinghaus, J. Fernandez-Prida, M. Garrido, D. Lascar, and M. Rodríguez Artalejo, editors), Studies in Logic and the Foundations of Mathematics, vol. 129, North-Holland, Amsterdam, 1989, pp. 1–10.
[E] Ju. Eršov, On elementary theory of maximal normalized fields, Akademiya Nauk SSSR. Sibirskoe Otdelenie. Institut Matematiki. Algebra i Logika, vol. 4 (1965), no. 3, pp. 31–70 (Russian).
[HHM1] D. Haskell, E. Hrushovski, and D. Macpherson, Definable sets in algebraically closed valued fields: elimination of imaginaries, to appear in Journal für die Reine und Angewandte Mathematik.
[HHM2] D. Haskell, E. Hrushovski, and D. Macpherson, Stable domination and independence in algebraically closed valued fields, preprint.
[KP] B. Kim and A. Pillay, Simple theories, Joint AILA-KGS Model Theory Meeting (Florence, 1995), Annals of Pure and Applied Logic, vol. 88 (1997), no. 2-3, pp. 149–164.
[KP2] B. Kim and A. Pillay, From stability to simplicity, The Bulletin of Symbolic Logic, vol. 4 (1998), no. 1, pp. 17–36.
[R] A. Robinson, Complete Theories, North-Holland, Amsterdam, 1956.
[Sh] S. Shelah, Classification Theory and the Number of Nonisomorphic Models, Studies in Logic and the Foundations of Mathematics, vol. 92, North-Holland, Amsterdam-New York, 1990.

DEPARTMENT OF MATHEMATICS AND STATISTICS
MCMASTER UNIVERSITY
HAMILTON ON L8S 4K1, CANADA
E-mail:
[email protected]
SIMPLE GROUPS OF FINITE MORLEY RANK
ERIC JALIGOT
§1. Historical motivations. Modern model theory started when M. Morley [Mor65] proved his famous theorem on the categoricity in any uncountable cardinal of first order theories categorical in one uncountable cardinal. He introduced for that purpose an ordinal valued rank on types of such a theory, later shown to be finite by J. Baldwin [Bal73].

Fact 1.1. An uncountably categorical first order theory has finite Morley rank.

This was the beginning of the fantastic development of the classification theory by S. Shelah on the number of non-isomorphic uncountable models of a first order theory [She90], and more precisely the developments of stability theory with all subsequent generalizations of linear or algebraic independence in classical mathematical structures such as vector spaces or fields. In the meantime, it appeared interesting to study structures with such a given model-theoretic property. The first result of this kind was obtained by A. Macintyre [Mac71].

Macintyre’s Theorem. An infinite field of finite Morley rank is algebraically closed.

On the other hand, B. Zilber showed in his work on uncountably categorical structures the following result on simple groups of finite Morley rank [Zil77].

Fact 1.2. An infinite simple group of finite Morley rank is uncountably categorical.

Naturally, this result gave the feeling that simplicity was certainly hiding stronger structural properties. Hence, motivated by a sense that most structures already exist “in nature”, G. Cherlin and B. Zilber proposed independently the following conjecture in the late seventies.

Algebricity Conjecture. An infinite simple group of finite Morley rank is isomorphic, as an abstract group, to the group of K-rational points of an algebraic group over an algebraically closed field K.

Logic Colloquium ’05. Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.
Other conjectures of the same flavour were formulated at the same time by B. Zilber. One of the most famous, the Trichotomy Conjecture, was refuted in the mid-eighties by E. Hrushovski, whose constructions of “generic” structures produced numerous new examples of structures satisfying some stability properties. Nevertheless, the Algebricity Conjecture for simple groups of finite Morley rank remains entirely open, as nobody sees at present how to apply Hrushovski’s method in a pure group-theoretic context. This is certainly a major challenge for the future, but for the moment we rather tend to focus on algebraic properties of groups of finite Morley rank, with the Algebricity Conjecture as an inexhaustible source of inspiration.

In the eighties, A. Borovik suggested adapting the architecture of the Classification of the Finite Simple Groups to the context of groups of finite Morley rank. Indeed, groups of finite Morley rank have some properties of finite groups, such as the Descending Chain Condition on definable subgroups, and the finiteness of Morley rank allows one to make some proofs by induction. First, it was worth finding an axiomatic treatment of Morley rank easily understandable by algebraists, and which emphasizes how the Morley rank generalizes the Zariski dimension of varieties in algebraic geometry over an algebraically closed field. This was accomplished by B. Poizat [Poi87]. Indeed, groups of finite Morley rank can be defined as groups equipped with a function “rk”, which assigns to each nonempty definable set $A$ a nonnegative integer (its “rank”, or rather its dimension) and which is mostly defined by the following property:

$\mathrm{rk}(A) \geq n + 1$ if and only if $A$ contains infinitely many pairwise disjoint definable subsets $A_i$ with $\mathrm{rk}(A_i) \geq n$.

The book [BN94] contains the basic developments implied by this purely combinatorial definition, as well as basic results needed here, and outlines the classification project of simple groups of finite Morley rank.

§2. Results on simple groups. The whole classification project of simple groups would not have been possible in the finite case without the Feit-Thompson Theorem [FT63], which gave elements of order 2 in any finite nonabelian simple group. These involutions played the main role throughout the whole classification in the finite case. Similarly, they turn out to be extremely useful also for groups of finite Morley rank, though there is no known analog of the Feit-Thompson Theorem in this context. Before entering into these considerations involving 2-Sylow theory, we review the notion of connected group, which most of the time allows one to smooth arguments from finite group theory.

2.1. Connected groups. By the Descending Chain Condition on definable subgroups, each group $G$ of finite Morley rank has a smallest definable subgroup of finite index, the intersection of all of them, which is denoted by
$G^\circ$ and called the connected component of $G$. It is obviously normal in $G$, and $G$ is connected if $G = G^\circ$. The main property of connected groups of finite Morley rank is that they contain a unique generic type, or equivalently that they cannot be partitioned into two definable generic subsets [Che79]. One consequence of this is the following elementary but very useful fact.

Fact 2.1. A connected group of finite Morley rank acting definably on a finite set fixes it pointwise.

There is also a notion of connected component for non-definable subgroups. By the Descending Chain Condition on definable subgroups again, each subset $X$ of a group $G$ of finite Morley rank is contained in a smallest definable subgroup, denoted by $d(X)$ and called the definable closure of $X$. This allows one to define the generalized connected component of $X$ as $X^\circ = d(X)^\circ \cap X$. If $X$ is a subgroup, then one may check easily that $X^\circ$ is also a normal subgroup of $X$ of finite index in $X$.

2.2. Types and characteristic. The Classification of the Finite Simple Groups was roughly divided into two main chapters for the identification of finite simple groups of Lie type, those of characteristic 2 type and those of characteristic not 2 type. In the context of groups of finite Morley rank, such a distinction is merely given by the structure of Sylow 2-subgroups, first studied in [BP90].

Fact 2.2. In any group of finite Morley rank, Sylow 2-subgroups are conjugate and if $S$ is one of them, then $S^\circ = T * U$ is nilpotent and a central product, with finite intersection, of a 2-torus $T$ and a 2-unipotent subgroup $U$.

Here, a 2-torus is a divisible abelian 2-group, and a 2-unipotent group is a definable connected 2-group of bounded exponent. Accordingly, we say that a group $G$ of finite Morley rank has the following type, depending on whether $T$ or $U$ is trivial or not.
            U ≠ 1      U = 1
T ≠ 1       Mixed      Odd
T = 1       Even       Degenerate

By analogy with the structure of Sylow 2-subgroups in the algebraic case, an infinite simple group of finite Morley rank should not be of mixed type, and should be algebraic over an algebraically closed field of characteristic 2 or different from 2, depending on whether it is of even or of odd type. The existence of an infinite simple group of finite Morley rank of degenerate type, i.e. with finite Sylow 2-subgroups, would correspond to a failure of the Feit-Thompson Theorem in this context of infinite groups; this is the most widely open question for the moment.
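The four-way case distinction above can be written as a small lookup. This is only a sketch of the classification table: the function name and the boolean encoding of "T is nontrivial" and "U is nontrivial" are hypothetical conveniences, and nothing here computes Sylow 2-subgroups.

```python
def sylow_2_type(torus_nontrivial: bool, unipotent_nontrivial: bool) -> str:
    """Name the type of a group of finite Morley rank from the connected
    component S° = T * U of a Sylow 2-subgroup.

    torus_nontrivial     -- True when the 2-torus T is nontrivial
    unipotent_nontrivial -- True when the 2-unipotent part U is nontrivial
    """
    if torus_nontrivial and unipotent_nontrivial:
        return "mixed"
    if torus_nontrivial:
        return "odd"          # T nontrivial, U trivial
    if unipotent_nontrivial:
        return "even"         # T trivial, U nontrivial
    return "degenerate"       # both trivial: S° = 1, so S is finite


# Print the table row by row (rows: T nontrivial, then T trivial).
for t in (True, False):
    print([sylow_2_type(t, u) for u in (True, False)])
```

The degenerate case corresponds to finite Sylow 2-subgroups because the connected component of a Sylow 2-subgroup has finite index in it.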
2.3. Groups of even type. For groups of even type there is an absolute answer to the Algebricity Conjecture.

Theorem 2.3. A simple group of finite Morley rank and of even type is algebraic over an algebraically closed field of characteristic 2.

This theorem is proved by T. Altınel, A. Borovik, and G. Cherlin, together with a few important contributions from other authors. The full proof is contained in the book [ABC05] in preparation. Of course, it makes intensive use of the presence of infinite elementary abelian 2-subgroups, or more generally the presence of many involutions. Theorem 2.3 is proved by induction, considering thus a minimal counterexample, called a simple $L^*$-group. In such a group $G$, every proper infinite simple definable section (i.e. infinite simple quotient $H/L$, where $L \trianglelefteq H < G$ are two definable subgroups) of even type is algebraic over an algebraically closed field of characteristic 2. It is noteworthy that the full proof of Theorem 2.3 is handled without assuming any analog of the Feit-Thompson Theorem, so with the potential presence of infinite simple definable sections of degenerate type.

2.4. Groups of mixed type. As the class of groups of finite Morley rank is closed under finite direct product, it contains products of simple algebraic groups over algebraically closed fields of different characteristics, for example 2 and different from 2. In particular, it contains groups of mixed type. Under a simplicity assumption, this cannot happen.

Theorem 2.4. A simple group of finite Morley rank cannot be of mixed type.

A preliminary version of this theorem was proved in [Jal99] under the additional assumption that the group $G$ was a minimal counterexample to the Algebricity Conjecture. The proof there proceeded as if $G$ was a direct product of a group of even type and one of odd type.
More precisely, it consisted in showing that two conjugacy classes of involutions, those corresponding to the even part and to the odd part respectively, commute, which contradicted simplicity. Then, T. Altınel noticed that there was some asymmetry in the proof, more precisely that the inductive assumption was used only for the even type part, and hence that Theorem 2.4 for any simple group was indeed a corollary of the substantial Theorem 2.3.

2.5. Groups of odd type. For groups of odd type there is unfortunately no absolute answer to the Algebricity Conjecture as for groups of even type. In this case it is known that the connected component of a Sylow 2-subgroup $S$ is a direct product of finitely many copies of the Prüfer 2-group $\mathbb{Z}_{2^\infty}$, i.e. $S^\circ = \mathbb{Z}_{2^\infty} \times \cdots \times \mathbb{Z}_{2^\infty}$, and the number of copies is called the Prüfer 2-rank of the ambient group. The best which can be said at the moment for simple groups of finite Morley rank of odd type is the following.
Theorem 2.5. A minimal counterexample of odd type to the Algebricity Conjecture has Prüfer 2-rank at most 2.

The full proof of this theorem follows from a series of papers which culminates in [BCJ05]. Indeed, it is shown in [Bur05a] that a minimal counterexample $G$ to the Algebricity Conjecture of Prüfer 2-rank at least three necessarily contains a proper 2-generated core $M$, i.e. the definable closure $M$ of the subgroup generated by all subgroups $N_G(V)$, for all elementary abelian 2-subgroups $V$ of rank 2 of a Sylow 2-subgroup $S$, is proper in $G$. This result relies on identification theorems of groups of large Prüfer 2-rank. Then it is shown in [BBN04] that $G$ is a minimal connected simple group, i.e. all its proper definable connected subgroups are solvable, and that the proper 2-generated core is indeed strongly embedded in $G$, i.e. $M \cap M^g$ has no involutions for every $g \in G \setminus M$. Then it is shown in [BCJ05] that a minimal connected simple group of odd type with a strongly embedded subgroup has Prüfer 2-rank exactly one, proving thus Theorem 2.5.

2.6. Groups of degenerate type. For groups of degenerate type there is not so much theory. Infinite simple groups of finite Morley rank should have infinite Sylow 2-subgroups according to the Algebricity Conjecture. There are nevertheless potential configurations of groups of finite Morley rank without involutions, such as bad groups, whose nonexistence seems hardly provable. In their most extreme forms, these groups can be thought of as groups $G$ with a malnormal subgroup $B$ (i.e. satisfying $B \cap B^g = 1$ for every $g \in G \setminus B$) and such that $G = \bigcup_{g \in G} B^g$. The paper [Jal01] deals with these configurations. At least, the following result from [BBC05] excludes some pathological configurations of connected groups with finite and nontrivial Sylow 2-subgroups. Notice that the only assumption in this statement is the connectedness of the ambient group, instead of simplicity.

Theorem 2.6. Let $G$ be a connected group of finite Morley rank of degenerate type. Then $G$ has no involutions.

Surprisingly, the proof of Theorem 2.6 uses some methods from probabilistic finite group theory. There are also some results of the same flavour for odd primes. Anyway, the limitation of available methods in the absence of involutions leads us to develop further other theories which do not depend on the presence of involutions. Undoubtedly, the most interesting one is that of Carter subgroups.

§3. Carter subgroups. A Carter subgroup of a group of finite Morley rank is a definable connected nilpotent subgroup of finite index in its normalizer. This definition is designed to approximate maximal tori of algebraic groups, which are abelian and of finite index in their normalizers. The existence of these subgroups in the algebraic context follows mostly from the Jordan
decomposition. As we will see, the main result about Carter subgroups in the abstract context of groups of finite Morley rank is their existence. As no Jordan decomposition is available in this abstract context, we have to develop a theory of semisimplicity and unipotence from scratch. The only question we can ask in our category of groups of finite Morley rank is how much a group can act on another, or, conversely, how much it can be acted upon by another. Most of the time, the study of actions reduces to the

Zilber Field Theorem. Let $U \rtimes T$ be a group of finite Morley rank, with $U$ and $T$ infinite abelian definable subgroups, $C_T(U) = 1$, and such that $U$ contains no proper infinite $T$-invariant subgroup. Then an algebraically closed field $K$ is interpreted, $U \cong K_+$, $T$ embeds as a definable subgroup of $K^\times$, and $T$ acts on $U$ by multiplication.

The possibility that $T < K^\times$, or in other words the existence of a so-called bad field of finite Morley rank, is still an open question in the subject. Such an existence seems highly improbable in characteristic $p > 0$ [Wag03], but very probable in characteristic 0 [Poi01]. This complicates the theory enormously, especially when considering the action of a torsion-free group on another. But if $T < K^\times$ in the situation of the Zilber Field Theorem, then $\mathrm{rk}(T) < \mathrm{rk}(U)$. In this sense $U$ is “more” unipotent than $T$, as a conjugate of it cannot act on $T$ in the same way. This leads to the gradual notion of unipotence in characteristic 0 developed in the thesis of J. Burdges [Bur04].

3.1. Semisimplicity versus unipotence. Before explaining the Burdges notion of unipotence in characteristic 0, we look at easier cases. Following [Che05], we call decent torus any group $T$ of finite Morley rank which is the definable closure of a divisible abelian torsion subgroup $S$, i.e. $T = d(S)$.
By finiteness of Morley rank, it is known that, if $S$ is exactly the torsion subgroup of $T$, then $S = \bigoplus_{p \text{ prime}} \left( \bigoplus_{r_p} \mathbb{Z}_{p^\infty} \right)$, where the Prüfer $p$-ranks $r_p$ of $T$ are all finite. In particular, there are only finitely many elements of order $n$ in $T$ for each integer $n$, and Fact 2.1 gives a rigidity property.

Fact 3.1 (Action and decent tori). Let $T$ be a definable decent torus in a group of finite Morley rank. Then $N^\circ(T) = C^\circ(T)$.

Hence decent tori can never be acted upon seriously, and in this sense they are less unipotent than any other connected group. At the opposite extreme from decent tori, a $p$-unipotent group ($p$ a prime) is a definable connected nilpotent $p$-group of bounded exponent. The following fact, whose proof boils down to the Zilber Field Theorem, tells us that these groups can never act seriously on a connected nilpotent group, and hence that they are more unipotent than any other connected group.
Fact 3.2 (Action and p-unipotence). Let $H \cdot U$ be a group of finite Morley rank, with $H$ and $U$ definable connected nilpotent subgroups, $U$ $p$-unipotent, and $H$ normalized by $U$. Then $HU$ is nilpotent.

Of course, the proofs of the two preceding facts rely heavily on torsion elements, but most groups are torsion free! So we now develop the theory of unipotence of Burdges in characteristic 0, or equivalently for elements of infinite order. Given a group $G$ of finite Morley rank and an integer $r \geq 1$, we define

$U_{(\infty,r)}(G) = \langle A \leq G : A$ is a definable indecomposable subgroup, $A/\Phi(A)$ is torsion-free and of rank $r \rangle$.

Here, a group $A$ of finite Morley rank is indecomposable if it is abelian and cannot be written as a sum of two proper definable subgroups. Such a group has to be connected and has a unique maximal proper definable connected subgroup $\Phi(A)$. A Burdges $U_{(\infty,r)}$-group, or merely a $U_{(\infty,r)}$-group, is a group $G$ of finite Morley rank such that $G = U_{(\infty,r)}(G)$. This definition is mostly designed to yield the following property, whose proof boils down to the Zilber Field Theorem in characteristic 0 [Bur05b].

Fact 3.3 (Action and 0-unipotence). Let $U_1U_2$ be a group of finite Morley rank, with $U_i$ two definable nilpotent $U_{(\infty,r_i)}$-groups ($0 < r_i < \infty$). Assume that $U_1$ is normal and that $r_1 \leq r_2$. Then $U_1U_2$ is nilpotent.

Hence, the parameter $r$ can be seen as the “unipotence degree” of the groups involved. It is now natural to say that decent tori have unipotence degree 0 and that $p$-unipotent groups have an infinite unipotence degree. To have a uniform notation, we consider couples $\tilde{p} = (p, r)$ with $p$ prime or $\infty$, $r$ a nonnegative integer or $\infty$, and with $p = \infty$ if and only if $r < \infty$. We call $\tilde{p}$-group (of finite Morley rank) any group of the following form, depending on the value of $\tilde{p} = (p, r)$.
• $(\infty, 0)$-groups: (abelian) decent tori,
• $(\infty, r)$-groups, with $0 < r < \infty$: nilpotent Burdges $U_{(\infty,r)}$-groups, and
• $(p, \infty)$-groups, with $p$ prime: (nilpotent) $p$-unipotent groups.
Notice that nilpotency of these groups is imposed by definition. Facts 3.1, 3.2, and 3.3 can now be restated uniformly, including all possible values of the unipotence degree [FJ05a].

Proposition 3.4 (Action and $\tilde{p}$-groups). Let $U_1U_2$ be a group of finite Morley rank, with $U_i$ two definable $(p_i, r_i)$-groups ($0 \leq r_i \leq \infty$). Assume that $U_1$ is normal and that $r_1 \leq r_2$. Then $U_1U_2$ is nilpotent.

3.2. Sylow $\tilde{p}$-subgroups. The notion of $U_{(\infty,r)}$-group was developed by Burdges in order to get a unipotence theory in characteristic 0, and for that reason he preferred the term “$U_{0,r}$-group”. But it also gives a good notion of Sylow theory for elements of infinite order, which justifies the term “$(\infty, r)$-group”.
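The bookkeeping behind the couples (p, r) and the hypothesis of Proposition 3.4 can be sketched in a few lines. This is a minimal illustration, not part of the source: the class and function names are hypothetical, infinity is encoded as `math.inf`, primality of p is not checked, and the code only compares unipotence degrees (it does not model any group).

```python
import math
from dataclasses import dataclass

INF = math.inf  # stands for the symbol "infinity" in the couples (p, r)


@dataclass(frozen=True)
class UnipotenceParameter:
    """A couple (p, r): p is a prime or INF, r is a nonnegative integer or INF."""
    p: float
    r: float

    def __post_init__(self):
        if self.r < 0:
            raise ValueError("the unipotence degree r must be nonnegative")
        # The text requires: p = infinity if and only if r < infinity.
        if (self.p == INF) != (self.r < INF):
            raise ValueError("need p = infinity exactly when r is finite")

    def kind(self) -> str:
        """Name the class of groups attached to this couple."""
        if self.r == 0:
            return "decent torus"
        if self.r < INF:
            return "nilpotent Burdges U_(inf,r)-group"
        return "p-unipotent group"


def proposition_3_4_applies(normal_factor: UnipotenceParameter,
                            other_factor: UnipotenceParameter) -> bool:
    """True when r1 <= r2 for the normal factor U1 and the factor U2, i.e. when
    Proposition 3.4 yields that U1U2 is nilpotent. False means only that the
    proposition gives no information."""
    return normal_factor.r <= other_factor.r


torus = UnipotenceParameter(INF, 0)
burdges = UnipotenceParameter(INF, 3)
two_unipotent = UnipotenceParameter(2, INF)

print(torus.kind(), "|", burdges.kind(), "|", two_unipotent.kind())
print(proposition_3_4_applies(torus, two_unipotent))    # normal torus: hypothesis holds
print(proposition_3_4_applies(two_unipotent, burdges))  # hypothesis fails
```

Ordering by r alone reflects the remark in the text that the unipotence degree, rather than the prime, is what matters: a normal decent torus (degree 0) always satisfies the hypothesis, and so does any normal factor acted upon by a p-unipotent group (infinite degree).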
While decent tori and $p$-unipotent groups cover all kinds of basic torsion subgroups which can occur in a finite Morley rank context, $(\infty, r)$-groups cover the torsion-free ones. So we consider couples $\tilde{p} = (p, r)$ as before, with $p$ prime or infinite. As seen above, what is really important is not so much the prime $p$, usually $\infty$, but rather the unipotence degree $r$. Given a group $G$ of finite Morley rank, we call Sylow $\tilde{p}$-subgroup of $G$ any definable $\tilde{p}$-subgroup of $G$ which is maximal with respect to these properties. Notice that these subgroups are nilpotent by definition. There are many results showing that these groups behave like Sylow $p$-subgroups in finite group theory [Bur05b]. The first one is an analog of the decomposition of finite nilpotent groups as a direct product of their Sylow $p$-subgroups.

Nilpotent Decomposition. A connected nilpotent group of finite Morley rank is the central product of its Sylow $\tilde{p}$-subgroups.

Notice however that the product here is just central, not necessarily direct as in finite group theory. The question of the conjugacy of Sylow $\tilde{p}$-subgroups in any group of finite Morley rank is a natural conjecture. It is verified for $\tilde{p} = (\infty, 0)$ [Che05] and for any $\tilde{p}$ if the ambient group is solvable [Bur05b]. The reader can find in [FJ05a] a complete survey on conjugacy results around these notions of unipotence, Carter subgroups, and Sylow subgroups.

3.3. Existence of Carter subgroups. The main result concerning Carter subgroups is their existence in full generality.

Theorem 3.5. Any group of finite Morley rank contains a Carter subgroup.

This existence theorem is proved in [FJ05b]. Roughly, the proof goes as follows. If $G$ denotes the ambient group, then one can take a nontrivial Sylow $\tilde{p}$-subgroup $U_1$ of $G$ of minimal unipotence degree, and consider $N_1 = N^\circ(U_1)$. Then one can take a nontrivial Sylow $\tilde{p}$-subgroup $U_2$ of $N_1$ of minimal unipotence degree greater than that of $U_1$, and consider $N_2 = N^\circ(U_1U_2)$.
Then $U_1U_2$ is nilpotent by Proposition 3.4, and $U_1$ and $U_2$ are Sylow subgroups of it. In particular $N_2 \leq N_1$. Then we can repeat the same process at each step $i$, building thus nilpotent groups $U_1U_2 \cdots U_i$, and looking at $N_i = N^\circ(U_1U_2 \cdots U_i)$. By finiteness of Morley rank, this process is stationary at some step $i_0$, and one can check that $U_1U_2 \cdots U_{i_0}$ is of finite index in its normalizer. Hence Carter subgroups are built from their least unipotent part to their most unipotent part, which is not surprising as they are intended to approximate maximal algebraic tori, which are semisimple in algebraic groups. However, we do not know if each Carter subgroup can be obtained by the
same process. More generally, we do not know if Carter subgroups are always conjugate. But we will nevertheless see now that they are conjugate under a genericity assumption.

3.4. Conjugacy of generous Carter subgroups. We say that a definable subset $X$ of a group $G$ of finite Morley rank is generous in $G$ if $\mathrm{rk}(X^G) = \mathrm{rk}(G)$, i.e. if the union of $G$-conjugates of $X$ forms a generic subset of the ambient group $G$. It is shown in [Jal06] that there is at most one conjugacy class of generous Carter subgroups.

Theorem 3.6. In any group of finite Morley rank, generous Carter subgroups are conjugate.

The main feature of the proof of Theorem 3.6 is that it uses no group theory as developed before. It is a mere mixture of genericity arguments and finiteness conditions, culminating in an application of Fact 2.1. The first claim is that if $H$ is a definable connected generous subgroup of a group $G$ of finite Morley rank, then, generically, an element of $H$ is in only finitely many conjugates of $H$. This can be seen by considering the geometry naturally associated to the problem, whose points are elements of $H^G$ and lines are the conjugates of $H$, with the natural incidence relation. The second claim is the following important application of Fact 2.1.

Fundamental Lemma. Let $G$ be a group of finite Morley rank, $H$ a definable subgroup of $G$, $Y$ the definable subset of those elements of $H$ contained in only finitely many conjugates of $H$, and $U$ a definable subset of $H$ meeting $Y$ in a nonempty subset. Then $N^\circ(U) \leq N^\circ(H)$.

Proof. Let $U_1 = Y \cap U$. By assumption, $U_1$ is contained in $H$ and in only finitely many conjugates of $H$, say $H, H^{g_1}, \ldots, H^{g_n}$. Now $N^\circ(U)$ normalizes $U_1$ and hence acts definably on this finite set of conjugates of $H$. By Fact 2.1,
it fixes all of them, and in particular $N^\circ(U)$ normalizes $H$.

To prove the conjugacy of generous Carter subgroups, we consider now two generous Carter subgroups $C_1$ and $C_2$. For $i = 1$ and $2$, the definable subset $Y_i$ of elements of $C_i$ contained in only finitely many conjugates of $C_i$ is generic in $C_i$, and it can be deduced from the fact that $C_i$ is generous in $G$ that $Y_i$ is also generous in $G$. As all this can be done in the connected component of $G$, which cannot be partitioned into two disjoint definable generic subsets, one gets $Y_1^G \cap Y_2^G$ nonempty. After conjugacy, one can thus assume $Y_1 \cap Y_2$ nonempty. We are then in a position to apply the Fundamental Lemma with $H = C_1$ and $U = C_1 \cap C_2$. This gives $N^\circ(U) \le N^\circ(C_1) = C_1$, as Carter subgroups are of finite index in their normalizers. Similarly, $N^\circ(U) \le C_2$. But now Carter subgroups satisfy the Normalizer Condition as they are nilpotent, which easily gives $C_1 = C_2$, and hence our conjugacy result.

This proof uses each of the two main properties of Carter subgroups only once: their nilpotence on the one hand and the fact that they are of finite
ERIC JALIGOT
index in their normalizers on the other. Hence it looks like a very minimal proof! Furthermore, it emphasizes a property of algebraic groups: a generic element of a maximal torus is in no other conjugate of this maximal torus, and a generic element of a parabolic subgroup is in only finitely many conjugates of this parabolic subgroup.

3.5. Genericity conjectures. Unfortunately, we do not know yet whether Carter subgroups are generous in general. We nevertheless strongly hope for one of the following (successively weaker) conjectures.

Genericity Conjecture. One of the following is true in any group $G$ of finite Morley rank:
• Any Borel subgroup of $G$ is generous in $G$.
• Any Carter subgroup of $G$ is generous in $G$.
• There exists a generous Carter subgroup in $G$.

There are even weaker conjectures, such as the generic covering of the ambient group by Carter subgroups, or merely by definable connected nilpotent subgroups. This last one has been verified in [BBC05] for minimal connected simple groups.

Finally, it is worth mentioning that the verification of any of the Genericity Conjectures above would have a strong impact on the overall structure of groups of finite Morley rank. It gives hope of generalizing the $BN$-pair theory of Tits in the context of algebraic groups, with all relevant notions of root groups, Borel subgroups, parabolic subgroups and so on, and this independently of the much stronger Algebricity Conjecture. Hence, these Genericity Conjectures are certainly a major challenge in the near future.

REFERENCES
[ABC05] T. Altınel, A. Borovik, and G. Cherlin, Simple groups of finite Morley rank, Book in preparation, 2005.
[Bal73] J. T. Baldwin, $\alpha_T$ is finite for $\aleph_1$-categorical $T$, Transactions of the American Mathematical Society, vol. 181 (1973), pp. 37–51.
[BBC05] A. Borovik, J. Burdges, and G. Cherlin, Involutions in groups of finite Morley rank of degenerate type, Submitted, 2005.
[BBN04] A. Borovik, J. Burdges, and N. Nesin, Uniqueness cases in odd type groups of finite Morley rank, Preprint, 2004.
[BN94] A. Borovik and A. Nesin, Groups of Finite Morley Rank, The Clarendon Press, Oxford University Press, New York, 1994, Oxford Science Publications.
[BP90] A. V. Borovik and B. P. Poizat, Tores et p-groupes, The Journal of Symbolic Logic, vol. 55 (1990), no. 2, pp. 478–491.
[Bur04] J. Burdges, Odd and Degenerate Types Groups of Finite Morley Rank, Doctoral Dissertation, Rutgers University, 2004.
[Bur05a] J. Burdges, Signalizers and balance in groups of finite Morley rank, Submitted, 2005.
[Bur05b] J. Burdges, Sylow theory for p = 0 in solvable groups of finite Morley rank, To appear in Journal of Group Theory, 2005.
[BCJ05] J. Burdges, G. Cherlin, and E. Jaligot, Minimal connected simple groups of finite Morley rank with strongly embedded subgroups, Submitted, 2005.
[Che79] G. Cherlin, Groups of small Morley rank, Annals of Mathematical Logic, vol. 17 (1979), no. 1-2, pp. 1–28.
[Che05] G. Cherlin, Good tori in groups of finite Morley rank, Journal of Group Theory, vol. 8 (2005), no. 5, pp. 613–622.
[FT63] W. Feit and J. G. Thompson, Solvability of groups of odd order, Pacific Journal of Mathematics, vol. 13 (1963), pp. 775–1029.
[FJ05a] O. Frécon and E. Jaligot, Conjugacy in groups of finite Morley rank, Model Theory at Newton Institute, Submitted, 2005.
[FJ05b] O. Frécon and E. Jaligot, The existence of Carter subgroups in groups of finite Morley rank, Journal of Group Theory, vol. 8 (2005), no. 5, pp. 623–644.
[Jal99] E. Jaligot, Groupes de type mixte, Journal of Algebra, vol. 212 (1999), no. 2, pp. 753–768.
[Jal01] E. Jaligot, Full Frobenius groups of finite Morley rank and the Feit-Thompson theorem, The Bulletin of Symbolic Logic, vol. 7 (2001), no. 3, pp. 315–328.
[Jal06] E. Jaligot, Generix never gives up, The Journal of Symbolic Logic, vol. 71 (2006), no. 2, pp. 599–610.
[Mac71] A. Macintyre, On $\omega_1$-categorical theories of fields, Fundamenta Mathematicae, vol. 71 (1971), no. 1, pp. 1–25. (errata insert).
[Mor65] M. Morley, Categoricity in power, Transactions of the American Mathematical Society, vol. 114 (1965), pp. 514–538.
[Poi87] B. Poizat, Groupes stables, Bruno Poizat, Lyon, 1987, Une tentative de conciliation entre la géométrie algébrique et la logique mathématique.
[Poi01] B. Poizat, L’égalité au cube, The Journal of Symbolic Logic, vol. 66 (2001), no. 4, pp. 1647–1676.
[She90] S. Shelah, Classification Theory and the Number of Nonisomorphic Models, second ed., North-Holland Publishing Co., Amsterdam, 1990.
[Wag03] F. O. Wagner, Bad fields in positive characteristic, The Bulletin of the London Mathematical Society, vol. 35 (2003), no. 4, pp. 499–502.
[Zil77] B. I. Zil’ber, Groups and rings whose theory is categorical, Fundamenta Mathematicae, vol. 95 (1977), no. 3, pp. 173–188.

INSTITUT CAMILLE JORDAN
UNIVERSITÉ LYON 1
43 BD. DU 11 NOVEMBRE 1918
69622 VILLEURBANNE CEDEX
FRANCE
TOWARDS A LOGIC OF TYPE-FREE MODALITY AND TRUTH
HANNES LEITGEB
Abstract. We develop a type-free theory of modality and truth in which modalities are treated syntactically in the same way as truth, i.e., as predicates of sentences. In contrast to Tarski and Montague, we do not conclude from the well-known inconsistency results for the unrestricted truth scheme and for the predicate versions of certain systems of modal logic that truth predicates have to be typed or that modal predicates are to be replaced by sentential operators. Instead we suggest holding on to an unrestricted syntax while restricting the admissible instances of the truth scheme and the modal axiom schemes and rules. We are going to present a possible worlds semantics and corresponding logical systems for modal predicates in which these restrictions are taken care of. The possible worlds semantics for the reflexive frame that contains exactly one world yields the theory of truth investigated by Leitgeb [14].
Key words and phrases. Modality, predicates, self-referentiality, paradox, truth.

I would like to thank Philip Welch, Volker Halbach, Leon Horsten, Sol Feferman, Johan van Benthem, Aldo Antonelli, Albert Visser, Michael Sheard, Greg Restall, Lev Beklemishev, Byeong-Uk Yi, the participants of seminars on this topic at Stanford, Minneapolis, Berkeley, Oxford, Bristol, and the ASL Meeting in Athens, and finally an anonymous referee.

Logic Colloquium ’05. Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

In modal logic, modalities are traditionally expressed by means of sentential operators. On the syntactic level these operators are characterized by the fact that they are applied to formulas, such that any such application yields another formula. E.g., since $2 = 1 + 1$ is a sentence,
• $\Box\, 2 = 1 + 1$
is another sentence, which can be translated into natural language as
• it is necessary that $2 = 1 + 1$
But there is also a different way of formalizing modalities: we can express necessity (and, for that matter, knowledge, belief, tense, obligation, and so forth) by predicates; e.g., if $\ulcorner 2 = 1 + 1 \urcorner$ is a name of the sentence $2 = 1 + 1$, then
• $\Box(\ulcorner 2 = 1 + 1 \urcorner)$
is a sentence that can be translated into natural language by
• ‘2 = 1 + 1’ is necessary
The syntactic difference between the two accounts is that predicates take singular terms rather than formulas as their arguments. Since in natural language we find instances of both modal operators and modal predicates, one would think that modal logicians were interested in both of them and that they were finally aiming at a satisfying axiomatic and semantic treatment of (i) modal operators, (ii) modal predicates, and (iii) their logical relationship. However, while the logic of modal operators is well-studied, the logic of modal predicates has been more or less neglected (with some noteworthy exceptions: see the corresponding references in [7] and [8]).

In this article we will pursue the second option, i.e., modal predicates. We will concentrate on modalities that apply to names of syntactic items, although the logic of modal predicates for (structured or unstructured) propositions would be of equal interest; some of the methods introduced below might also be relevant to the study of the latter. Furthermore, we focus mainly on necessity predicates, which we are going to denote by the $\Box$ symbol. But actually our theory is not just applicable to alethic modalities such as necessity but really to a broad range of notions that are typically expressed by operators and which are usually thought of as creating non-extensional linguistic contexts.

Section 1 summarizes the particular merits of the predicate approach. In section 2 we analyze its most urgent logical problem: inconsistency. The presence of diagonal constructions along the lines of the Liar sentences for truth predicates seems to yield paradoxical results if combined with standard axiomatic or semantic systems of modal logic for necessity as a predicate. Section 3 presents a logical treatment of modal predicates that is free from such paradoxical consequences. One instantiation of the theory yields the theory of truth in [14] for a semantically closed language.

§1. $\Box$ as a predicate.
We start by pointing out some of the advantages of not abandoning modal predicates in favor of modal operators right from the start. First of all, there are several concepts that are intimately connected to necessity for which we think we have good reasons not to express them by operators. The paradigm case is truth: it is generally accepted that the concept of truth ought to be expressed by a predicate of sentences or propositions. The standard argument for this choice is that truth predicates can be easily combined with bound variables and definite descriptions. E.g., we would like to say that every sentence (of a given language) is either true or its negation is true. The obvious way of doing so is by means of a predicate Tr, i.e.:

$\forall x(\mathrm{Sent}(x) \to (\mathrm{Tr}(x) \vee \mathrm{Tr}(\underset{\cdot}{\neg}x)))$

(we use Feferman’s dot notation in order to distinguish logical signs such as $\neg$ from function signs such as $\underset{\cdot}{\neg}$; the latter denotes the syntactic mapping that takes sentences or codes of sentences as inputs and maps them to their
negations or the codes of their negations, respectively). Since truth can be regarded as the borderline case of necessity for a logical space of one and only one “actual” world, it is certainly attractive to treat necessity in a similar syntactic manner. Accordingly, we are used to expressing logical truth, provability, and (in philosophical contexts) analyticity by predicates. Each of these concepts may be considered as a special form of necessity, of a logical, mathematical, or linguistic kind. One virtue of the predicate approach is that it leaves these notions with the logical form that they seem to have at first glance. Moreover, using predicates for each of these notions allows us to combine them in a straightforward manner. E.g., we are easily able to claim that everything that is provable in Peano arithmetic is necessary, i.e.,

$\forall x(\mathrm{Prov}_{\mathrm{PA}}(x) \to \Box(x))$

as well as that everything that is necessary is also true:

$\forall x(\Box(x) \to \mathrm{Tr}(x))$

Modal predicates are certainly the most convenient syntactic tools for making statements like these syntactically precise.

Secondly, when we use modal predicates in order to express modalities we are not forced to leave the standard extensional semantics of first-order languages, which at least those logicians will count as a positive feature who ascribe some sort of primacy to first-order logic. In a sentence such as $\Box(\ulcorner 2 = 1 + 1 \urcorner)$, it might seem that the sentence $2 = 1 + 1$ occurs in a quotational and thus non-extensional context, but actually $2 = 1 + 1$ is not a proper logical part of $\Box(\ulcorner 2 = 1 + 1 \urcorner)$ at all. $\ulcorner 2 = 1 + 1 \urcorner$ is just some singular term which denotes $2 = 1 + 1$, and the predicate $\Box$ is applied to this singular term in the standard first-order manner. Accordingly, we can understand the sentence $\Box(\ulcorner 2 = 1 + 1 \urcorner)$ as a standard first-order atomic formula, i.e., as expressing that the object denoted by $\ulcorner 2 = 1 + 1 \urcorner$ (the sentence $2 = 1 + 1$) is a member of the extension of $\Box$.
We may still think of $\Box$ as a distinguished logical predicate if we want, since its extension is determined in a special “logical” or “semantic” way, but this does not change the fact that the standard syntactic and semantic rules of first-order languages apply just as well to formulas involving $\Box$ as a predicate. This is in contrast with modal operators: none of the first-order semantic rules for predicates, propositional connectives, or quantifiers can be applied to them. In this sense, modal predicates enable us to be syntactically and semantically conservative with respect to the first-order languages that we want to extend by modalities. For the same reason, the modal predicate logic of necessity predicates is simply given by the predicate logic of a first-order language with a distinguished predicate $\Box$ the extension of which is characterized by additional eigenaxioms and rules, much as the extensions of mathematical predicates are given by axiomatic systems that are added to first-order logic.
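The claim that a necessity predicate is evaluated like any other first-order predicate can be illustrated with a toy model. The following Python sketch is an illustration only, under invented assumptions: sentences stand in for their own names as plain strings, and the helper names `neg` and `holds`, the two sample sentences, and the chosen extensions are hypothetical and not taken from the text.

```python
# Toy model (all data invented for illustration): sentences are strings that
# stand in for their own names or codes.

def neg(s: str) -> str:
    """Syntactic function playing the role of the dotted negation sign:
    it maps a sentence to its negation."""
    return f"¬({s})"

# Hypothetical finite stock of sentences.
base = ["2 = 1 + 1", "0 = 1"]
domain = base + [neg(s) for s in base]

# Extensions of the predicates are ordinary sets of objects of the domain.
tr_ext = {"2 = 1 + 1", neg("0 = 1")}   # extension of Tr
box_ext = {"2 = 1 + 1"}                # extension of the necessity predicate

def holds(ext: set, obj: str) -> bool:
    """Standard atomic clause: P(t) is true iff the object denoted by t
    belongs to the extension of P."""
    return obj in ext

# "Every sentence is true or its negation is true", checked over `base` only,
# because the toy extension of Tr is not closed under double negation.
bivalence = all(holds(tr_ext, s) or holds(tr_ext, neg(s)) for s in base)

# "Everything that is necessary is also true", checked over the whole domain.
nec_implies_true = all(holds(tr_ext, s) for s in domain if holds(box_ext, s))

print(bivalence, nec_implies_true)
```

Nothing in the evaluation treats the necessity predicate specially: both quantified claims reduce to membership tests in extensions, which is the sense in which the predicate approach stays within extensional first-order semantics.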
Finally, modal predicates are extremely powerful devices with regard to their expressiveness. Of course, we can use them to state claims such as
• $\Box(\ulcorner 2 = 1 + 1 \urcorner)$, $\Box(\ulcorner \Box(\ulcorner 2 = 1 + 1 \urcorner) \urcorner)$, . . .
but these statements do not yet indicate the special assets of modal predicates; after all it would be easy (indeed, easier) to use modal operators for the same purpose. The next example makes the usefulness of modal predicates as linguistic devices more transparent:
• $\forall x \forall y(\Box(x) \wedge \Box(y) \to \Box(x \underset{\cdot}{\wedge} y))$
In standard modal predicate logic the only way to express the claim that the conjunction of two arbitrary necessary sentences is itself necessary is in terms of an axiom scheme of the form $\Box A \wedge \Box B \to \Box(A \wedge B)$. In this way, i.e., by shifting the original object-linguistic quantification over formulas into the metalanguage, modal operators can still be used to express universal quantifications. However, this strategy of circumventing modal predicates ceases to be applicable in cases where existential quantifiers or mixtures of universal and existential quantifiers are used. E.g., we might want to say that there are negation sentences which are not necessary, or that for every necessary sentence $x$ there is a sentence $y$ such that the statement that the necessity of $y$ implies $x$ is itself necessary. Such sentences can be formalized by means of modal predicates as follows:
• $\exists x(\mathrm{Sent}(x) \wedge \neg\Box(\underset{\cdot}{\neg}x))$
• $\forall x(\Box(x) \to \exists y\, \Box(\underset{\cdot}{\Box}\dot{y} \underset{\cdot}{\to} x))$
(here the dot in $\dot{y}$ is a function sign that denotes a function which maps sentences or codes of sentences to their names or the codes of their names, respectively). This is not to say that it is impossible in principle to express such claims on the basis of sentential operators: it is just impossible to do so in the standard languages of modal first-order predicate logic; instead one would have to turn to fragments of second-order modal predicate logic with quantifiers of the form $\exists p$ and $\forall p$ which would be supposed to run over propositions. But even if one does so, in order to mimic sentences such as
• $\forall x(\mathrm{Prov}_{\mathrm{PA}}(x) \to \Box(x))$, $\forall x(\Box(x) \to \mathrm{Tr}(x))$
• $\forall x(\mathrm{Sent}_{\mathrm{arith}}(x) \to (\Box(x) \vee \Box(\underset{\cdot}{\neg}x)))$
(the latter of which expresses that for every arithmetical sentence either the sentence itself or its negation is necessary) one would either have to introduce additional operators (for provability in the axiom system PA, for being expressible in the language of arithmetic, and so forth) or one would allow variables to occur at the same time in places for singular terms and in places for formulas. In any case, the usual operator account of modal logic would have to be strengthened by some strong additional formal machinery, whereas on
the predicate side it is sufficient to combine modal predicates with standard first-order quantification and with a first-order theory of syntax as strong as first-order arithmetic (modulo coding). In the latter, notions such as provability in Peano arithmetic or being an arithmetical sentence are actually definable. Since philosophers of mathematics and metaphysicians should be able to express claims on the modal status of arithmetical sentences, and since epistemologists ought to be capable of quantifying over infinitely many possible contents of knowledge and belief, languages with modal predicates should be at their disposal.

Because of advantages like these the predicate approach to modalities was initially regarded as attractive. E.g., the early Carnap used it in The Logical Syntax of Language [2] as a means of reducing (seemingly) “philosophical” or “metaphysical” sentences to syntactic ones. Quine suggested employing predicates in order to express logical necessity [19] and, more importantly, belief (see e.g. [20]) and regarded this as one aspect of his programme of translating all scientific idioms into the canonical notation of first-order languages. But mainstream modal logic definitely turned to sentential operators as its primary means of expressing modalities. This is for several reasons: First of all, modal logicians of more recent generations may conceive of the merits that we ascribed to modal predicates above as somewhat old-fashioned. For many of them the attractiveness of modal operator languages is constituted by the fact that they are not as expressive as full first-order or second-order languages; modal logic is rather assumed to study fragments of first- or second-order languages that are metatheoretically “manageable” (e.g., guarded fragments).
For other modal logicians the question of the logical form of modal expressions may seem idle in view of Solovay’s famous theorem that systems for sentential operators such as the provability operator of propositional provability logic can be proved sound and complete modulo realization mappings with respect to the provability predicate for first-order arithmetic. But all of these different motivations for manners and styles of doing modal logic in one way or another should not actually give rise to any sort of antagonism. Rather there ought to be room for investigating and assessing the logic of all kinds of modalities that can be found in natural and scientific language. In many cases, sentential operators will turn out to be the syntactically and semantically most parsimonious means of expressing modal claims. However, if modal logic is to be applied in a domain where a high degree of expressiveness is absolutely necessary, such as in certain areas of philosophy, then modal predicates might be the superior choice. There are no morals in modal logic either.

§2. Modal predicates and their paradoxes. So why was the predicate approach to modal logic abandoned after all? Instead of elaborating on the corresponding historical developments in modal logic, we straightforwardly
turn to what is perhaps the main logical reason for the decline of the predicate approach (there are also philosophical worries, such as the one expressed by Church [3], which we are not going to discuss further). While the discovery of the possible worlds semantics for $\Box$ as an operator enabled modal logicians to single out the rich and well-behaved class of normal modal logics and to prove these systems sound and complete, Myhill [18], Kaplan & Montague [10], Montague [16], and their successors showed that if modal predicates are used in order to formalize corresponding systems of modal logic, this seems to commit us to paradoxes: Let e.g. $L_\Box$ be the first-order language of arithmetic extended by a necessity predicate $\Box$. In this language we formulate the system

$\mathrm{T}\colon\ \Box(\ulcorner\varphi\urcorner) \to \varphi, \qquad \mathrm{Nec}\colon\ \dfrac{\varphi}{\Box(\ulcorner\varphi\urcorner)}$

which can be regarded as the obvious translation of the standard modal operator system T into its modal predicate version (actually, the translation of the operator scheme for K is missing, but it is not needed for the subsequent result). As Montague [16] proved, this system is inconsistent with first-order arithmetic. This is particularly bewildering since arithmetical truth is usually regarded to be one of the paradigm cases of necessity, and consequently there should not be anything wrong with adding the T scheme to arithmetic and with closing all logical and arithmetical axioms under necessitation (and first-order deduction). Moreover, if $\Box$ is interpreted as expressing knowledge, then the same system seems to solely consist of fundamental and seemingly unproblematic epistemological assumptions; its inconsistency is therefore sometimes also referred to as the “Knower paradox” (see Égré [4] for an extensive survey of different versions of this paradox).

Note that the axioms and the necessitation rule of the system are meant to hold for unrestricted $\varphi \in L_\Box$. Hence, “Necessity-Liar” sentences $\lambda$ for which $\lambda \leftrightarrow \neg\Box(\ulcorner\lambda\urcorner)$ is arithmetically derivable can be used to instantiate the T-axiom and the necessitation rule. What Montague observed was that weakening Tarski’s truth scheme by (i) taking an axiom scheme for its left-right direction and (ii) turning its right-left direction into a rule was still sufficient to derive a contradiction. Since the existence of such Liar sentences is implied by the Diagonalization lemma for arithmetic, the modal predicate approach seems to be stuck with a serious problem.

Montague’s way out was perfectly straightforward: he simply suggested expressing necessity in terms of a sentential operator, by which all paradoxical constructions could be banned right from the start. It is illuminating to compare this move to Tarski’s response to his discovery of semantic paradoxes
for truth: both Tarski and Montague turned to syntactic restrictions in order to avoid paradoxical consequences. The only difference is that where Tarski relied on a hierarchy of typed truth predicates in order to exclude Liar-like sentences as not being well-formed, Montague preferred the even more severely restricted expressive means of a necessity operator. In the latter case, modalities do not have to be indexed in any way, but the “type” of a modal sentence can be reconstructed from the modal depth of its recursive structure. Kripke [11] has given strong arguments against Tarski’s syntactic restrictions on truth. E.g., there are sentences in natural language which contain the truth predicate, which seem perfectly meaningful, and which we should therefore be able to represent in a formal language. However, the type-theoretic level on which these sentences ought to be located by Tarski’s conception of truth — if there is such a level at all — can be shown to depend on empirical states of affairs. But the determination of the manner in which sentences in natural language are to be logically formalized should be independent of, and indeed prior to, empirical investigations on the truth values of these sentences. Hence Tarski’s way out is not the right one. As far as applications of modal predicates in natural language are concerned, arguments like these also apply to the syntactic restrictions suggested by Montague. Instead of fixing the “level” or “rank” of formulas with truth or necessity predicates syntactically, we should rather aim at a logical theory for untyped predicates in which formulas are assigned “levels” or “ranks” by some semantic construction, such as Kripke’s transfinitely recursive approximation of his least Strong Kleene fixed point valuation in [11]. 
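The inconsistency that motivates these syntactic restrictions can be reconstructed in a few lines. The derivation below is a standard sketch rather than a quotation from [16]; the letter $\lambda$ for the diagonal sentence, the corner quotes for naming, and the step numbering are conventions chosen here for illustration.

```latex
\begin{align*}
&(1)\quad \lambda \leftrightarrow \neg\Box(\ulcorner\lambda\urcorner)
  && \text{Diagonalization lemma}\\
&(2)\quad \Box(\ulcorner\lambda\urcorner) \to \lambda
  && \text{instance of T}\\
&(3)\quad \Box(\ulcorner\lambda\urcorner) \to \neg\Box(\ulcorner\lambda\urcorner)
  && \text{from (1) and (2)}\\
&(4)\quad \neg\Box(\ulcorner\lambda\urcorner)
  && \text{from (3), propositional logic}\\
&(5)\quad \lambda
  && \text{from (1) and (4)}\\
&(6)\quad \Box(\ulcorner\lambda\urcorner)
  && \text{Nec applied to the theorem (5)}\\
&(7)\quad \bot
  && \text{from (4) and (6)}
\end{align*}
```

Only one instance of the T scheme and one application of necessitation are used, both for the single sentence $\lambda$, which suggests that a restriction on admissible instances, rather than on the syntax as a whole, is the natural point of attack.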
However, contrary to Kripke’s theory of truth (or recent extensions of it such as Field’s [5]), we intend to hold on to standard first-order logic, i.e., we do not want to change our background logic into a non-classical one. This is one of our desiderata in this paper; we do not argue for it except for pointing out that otherwise all modal logicians who prefer classical logic over non-classical ones would not be able to regard the predicate approach as a proper alternative to standard modal operator logic.

Summing up: we are looking for a uniform account in which necessity, truth, and various other modalities are expressed by predicates in classical first-order languages. We neither accept any type-theoretic restrictions (apart from the ones that are “built” into the syntactic rules of first-order languages), nor do we want to exclude strong theories of syntax which allow for, or even imply, the existence of ungrounded sentences. But how can we thus avoid the paradoxes?

Let us reconsider $L_\Box$, i.e., the first-order language of arithmetic with a necessity predicate $\Box$: a first possible diagnosis of paradoxes such as the one above is that they “inform” us about the inadequacy of some of our traditional choices of modal axiom systems. E.g., Löb’s theorem may be interpreted to tell us that the modal system T is not the logic of formal arithmetical provability.
Accordingly, T is perhaps not the logic of metaphysical necessity or knowledge either and the paradoxes can be “cured” by simply choosing the “right” logics of necessity and knowledge. If this were the right interpretation, other choices would be inadequate as well: e.g., the predicate variant of the operator system K+D+4+Necessitation is inconsistent with arithmetic by Gödel’s second incompleteness theorem; the predicate version of the modal system K+D+Barcan formula+Necessitation is $\omega$-inconsistent by McGee’s theorem (see [15]); the predicate variant of the minimal temporal logic of past tense and future tense operators is internally inconsistent given arithmetic (see Horsten & Leitgeb [9]); and so forth (see Friedman & Sheard [6] and Leitgeb [12] for further examples of theories that are inconsistent with arithmetic or that at least exclude the standard model of arithmetic). The deficiencies of these systems for modal predicates would have to be interpreted to show that none of their corresponding operator systems were actually acceptable as modal logics, because translating the axiom schemes and rules of the latter to their modal predicate counterparts leads to inconsistency. At this point the most urgent question would thus be: the axiom schemes and rules of which logics for modal operators can be translated consistently to logics for modal predicates? The results in Halbach, Leitgeb, and Welch [7] may be interpreted as showing that provability seems to be the “only” modality that can be expressed by an unrestricted predicate of sentences in the context of a sufficient amount of arithmetic (we will return to this result below). More precisely: if we want to preserve the axiom schemes and rules of modal operator logics without any restrictions whatsoever on the formulas of $L_\Box$ by which these schemes and rules can be instantiated, then the only modal operator logic for which this can be done consistently is modal provability logic (together with its subsystems).
But many of our standard systems in modal logic are simply not intended to involve provability-type modalities at all. This can be seen from the fact that their intended possible worlds frames have properties such as reflexivity or seriality which are excluded from the transitive converse well-founded frames of provability logic. The advocates of modal operators could argue justifiably that the schemes and rules of their sound systems of modal logic should not be blamed for the defects of their predicate analogues. If provability is the only consistent modal predicate there is, then the predicate approach is simply not able to match the great variety of modal notions and applications that can be dealt with by the operator account. Modal predicates may still be used to study formal provability, but they do not yield an alternative treatment of modalities in general in the same way as sentential operators do. The scope of modal predicates would thus end up being severely constrained.

Fortunately, there is a different diagnosis available and, correspondingly, a different way out. Rather than preserving the axiom schemes and rules of modal operators unrestrictedly as we did above, we might impose “natural”
restrictions on their permissible instantiations. From the viewpoint of expressive languages with modal predicates such as $L_\Box$, this is the very reason why the systems for modal operators are not affected by paradoxes: since their languages are much more restricted syntactically than languages with modal predicates are, the former can only yield instantiations of T and of necessitation which are equally restricted. E.g., these languages allow for sentences that say about arithmetic sentences that they are necessary, or for sentences that claim about the latter sentences that they are not necessary, and so on, and each of these sentences can be “plugged” into the T axiom scheme and the necessitation rule in order to derive theorems. But e.g. no Liar-like sentences are available by which axiom schemes or rules could be instantiated, for the simple reason that such sentences cannot be constructed syntactically from the non-modal part of the language by iterated application of sentential operators.

Accordingly, in the syntactically more liberal case of languages with modal predicates, we might search for plausible sets of restricted instances of the T axiom scheme and the necessitation rule. Such sets of instances would at least have to satisfy the following postulates:
1. All “paradoxical” instantiations are excluded (otherwise the systems would be threatened by inconsistency again).
2. We can derive the predicate versions of all sentences that are derivable in the combination of the operator logic T with first-order arithmetic (otherwise the predicate system would be weaker than its operator counterpart), and accordingly for all other standard systems of modal logic.
3. We are able to derive a rich variety of theorems additional to the ones mentioned in 2 (otherwise the predicate approach would not be superior to the operator account and we could have used the latter from the start).
4. There is a plausible semantic guideline or principle by which the “natural” and intended instances of modal schemes and rules can be distinguished from the “unnatural” and unintended ones (otherwise the logic of modal predicates would be accidental and arbitrary).
5. One and the same semantic guideline applies to all sorts of modal predicates at the same time (otherwise the logical interaction of different modal predicates in multi-modal axiom systems would be hampered).

This strategy of coping with the paradoxes for modal predicates can be regarded as being analogous to the now generally accepted response to the foundational crisis in mathematics at the beginning of the last century: first the unrestricted comprehension scheme for sets was used to derive contradictions; then type-theoretic restrictions were imposed on the syntax of set theory in order to avoid the set-theoretic paradoxes; subsequently, the latter syntactic restrictions were replaced by imposing restrictions on the permissible instances of the comprehension scheme where each of these restrictions could be expressed within the first-order language of set theory. Indeed, except for
TYPE-FREE MODALITY AND TRUTH
extensionality and choice, all of the standard axioms of modern axiomatic set theory may be regarded as “natural” instances of comprehension. Finally, the “naturalness” of these instances can at least to some extent be justified on the basis of conceptions such as the cumulatively ranked universe of sets or the idea of the “smallness” of sets. If we were able to develop a corresponding approach of saving modal predicates from paradox in a systematic and non-arbitrary way, the languages and logics for modal predicates would be acceptable both from a logical and from a philosophical point of view. Their formal development would extend the scope of modal logic and might finally add to the great success that modal operator logics have enjoyed in the last few decades.
As far as the semantics of modal predicates is concerned, restrictions on modal axiom schemes and rules should correspond to restrictions on the semantic rules for these predicates. In particular, there should be a possible worlds semantics for necessity predicates in which the satisfaction clause for sentences of the form □(ϕ), i.e.,
w ⊨ □(ϕ) iff for all w′: if wRw′ then w′ ⊨ ϕ
only holds for restricted ϕ ∈ L_□. As Halbach, Leitgeb, and Welch (see [7] and [8]) have shown, the only frames for which this satisfaction clause can hold unrestrictedly are “essentially” those given by converse-wellfounded accessibility relations R (by “essentially” we mean: up to the possible occurrence of converse-illfounded worlds w at places where the set of converse-wellfounded worlds seen by w can be assigned an admissible ordinal rank; see [7] for the details). This is the semantic result which we used above in order to support our claim that the axiom schemes and rules for standard provability logic (or systems weaker than provability logic) would be the only ones that could be translated to their predicate counterparts unrestrictedly.
If “paradoxical” instances were not excluded from the rules of possible worlds semantics for modal predicates, many of the standard graph-theoretical properties of R that we make use of (successfully) in the possible worlds semantics for modal operators would be excluded from modal frames. Note that postulate 2 from our list of desiderata above shows up semantically in the way that our intended restrictions to plausible formulas ϕ should be independent of our choice of accessibility relation. Our strategy in the next section will be to develop a possible worlds semantics for □ as a predicate in which such “natural” restrictions on ϕ are taken care of. This possible worlds semantics is going to determine axiomatic systems the axiom schemes and rules of which are restricted accordingly.
§3. Towards a modal logic for □ as a predicate. What are “natural instances” of modal axiom schemes and rules if modalities are expressed by predicates?
HANNES LEITGEB
How can we restrict the semantic rules for the possible worlds semantics of □ as a predicate to their “natural instances”? In the following we want to define a set Φ_□ of sentences of L_□ whose truth values are determined directly or indirectly by the necessity of sentences without □, i.e., by the necessity of purely arithmetical sentences. The members of Φ_□, and only the members of Φ_□, will be allowed to be inserted into modal schemes and rules such as the ones for the modal system T. It will follow that none of these instantiations has any paradoxical effects, that all (predicate translations of) sentences that are derivable in the combination of the operator logic T with first-order arithmetic are among the instantiations given by Φ_□, and that a rich variety of additional instantiations is available that we could not even express in standard modal operator languages. The “naturalness” of our restriction to members of Φ_□ will (hopefully) show up in the semantic construction of this set.
The basic idea behind this construction can be explained in terms of simple examples such as atomic sentences of the form □(ϕ): by the semantic rules of first-order languages, the truth value of □(ϕ) only depends on whether the sentence ϕ is a member of the extension of □: if yes, then □(ϕ) is true, else □(ϕ) is false. We will express this by saying that the sentence □(ϕ) (completely) depends on the set {ϕ}. If ϕ is an arithmetical sentence, then we will say that □(ϕ) depends directly on arithmetical, i.e., on non-semantic states of affairs. For analogous reasons, the sentence □(□(ϕ)) depends on {□(ϕ)}. In a case where ϕ is an arithmetical sentence again, we will say that □(□(ϕ)) depends indirectly on arithmetical, i.e., on non-semantic states of affairs, since it depends on {□(ϕ)}, the only member of which depends on {ϕ} for arithmetic ϕ.
In order to make this precise we have to state a definition of a binary dependency relation between sentences of L_□ and sets of such sentences. We choose the second relata of this relation to be sets of sentences rather than sentences, because the truth values of some sentences may be seen to depend on more than just one sentence at a time; e.g., if ψ[x] is an open arithmetical formula, then ∀x(ψ[x] → □(x)) and ∃x(ψ[x] ∧ □(x)) provably depend on the set of (codes of) sentences that satisfy ψ[x]. For ϕ ∈ L_□, Φ ⊆ L_□, we make use of the following terminology: ‘Val_Φ(ϕ)’ shall denote the truth value of ϕ in the expansion of the standard model of first-order arithmetic in which Φ is taken to be the extension of □. For simplicity, we will not consider any vocabulary additional to arithmetic expressions and □; therefore, once the arithmetic vocabulary is interpreted in the standard way and a set Φ is fixed to be the interpretation of □, the truth value of every sentence ϕ ∈ L_□ is determined semantically. Accordingly, we will assume later that each of our possible worlds corresponds to one of these expansions of the standard model of arithmetic. But actually the same methods that we are going to employ below are also applicable to models for languages with additional vocabulary (in particular, contingent vocabulary), as long as
a coding procedure for the syntactic items of this language is available. Finally, we will restrict ourselves to the study of modal predicates for sentences rather than open formulas. Since we operate in expansions of standard models of arithmetic, this is unproblematic because we can rely on substitutional quantification (i.e., quantification understood in terms of substituting numerals) both within and outside of modal contexts, and de re modalities can thus be expressed in terms of a necessity predicate for sentences. In order to express de re modalities in other structures, we would employ the modal counterparts of satisfaction predicates in order to express, e.g.,
∃x⃗ □(x⃗, ϕ[x⃗])
i.e., that the open formula ϕ[x⃗] is necessary of the sequence x⃗ of objects. A theory analogous to the one below can be stated for such modal satisfaction predicates. However, for the sake of simplicity, we shall restrict ourselves to unary necessity predicates of sentences. Whenever a sentence of L_□ is said to quantify over or refer to sentences, it is actually meant to quantify over or refer to the natural number codes of these sentences (relative to some fixed recursive coding).
This is what our formal definition of dependency looks like for arbitrary ϕ ∈ L_□, Φ ⊆ L_□:
Definition 3.1. ϕ depends on Φ :iff for all Ψ1, Ψ2 ⊆ L_□: if Ψ1 ∩ Φ = Ψ2 ∩ Φ, then Val_{Ψ1}(ϕ) = Val_{Ψ2}(ϕ).
Read epistemically, this means that ϕ depends on Φ if and only if complete knowledge about the necessity of the members of Φ is (in principle) sufficient to determine the truth value of ϕ. E.g., □(ϕ) obviously depends on {ϕ} for every ϕ ∈ L_□. Moreover, if □(ϕ) depends on some other set Ψ, then Ψ must actually be a superset of {ϕ}. In general, however, there are sentences for which there is no least set of sentences on which they depend, i.e., every set on which such a sentence depends gives rise to an infinitely descending chain of subsets on which the sentence depends as well.
For every set Φ ⊆ L_□ let us now consider the set of sentences ϕ ∈ L_□ that depend on Φ. We call the corresponding operator ‘D’ and define:
D(Φ) := {ϕ ∈ L_□ | ϕ depends on Φ}
It is easy to see that D is monotonic, i.e., if Φ ⊆ Ψ, then D(Φ) ⊆ D(Ψ). Thus, by the Knaster-Tarski fixed point theorem (or the theory of induction on abstract structures in [17]), there is a least fixed point Φ_□ of D that can be approximated from below by means of:
• Φ_0 := ∅
• Φ_{α+1} := D(Φ_α)
• Φ_λ := ⋃_{α<λ} Φ_α (for λ a limit)
Let α* be the least ordinal for which this transfinite recursion reaches its fixed point, i.e., Φ_{α*} = Φ_□. One can show that Φ_□ may also be reached by starting from the purely arithmetic part L of L_□, i.e., defining Φ_0 to be identical to L rather than ∅ would generate the same fixed point; indeed, for α > 0 the same sequence would be determined. Note that by means of the sequence (Φ_α), ordinal ranks can be assigned to the members of Φ_□ according to their first occurrence in the sequence. Φ_□ is subject to nice closure conditions: it is closed under all propositional operations (e.g., if ϕ ∈ Φ_□ then ¬ϕ ∈ Φ_□), substitutional quantification, logical equivalence, dependency (if ϕ depends on Φ_□ then ϕ ∈ Φ_□), and applications of □ (if ϕ ∈ Φ_□ then □(ϕ) ∈ Φ_□). Furthermore, by the definition of the sequence (Φ_α) and the fact that Φ_0 could be defined to be the pure language L of arithmetic, the members of Φ_α either depend directly on the arithmetic fragment of L_□ (in case α = 1) or they do so indirectly, i.e., by means of intermediate steps (in case α > 1). So we can define:
Definition 3.2. For ϕ ∈ L_□:
1. ϕ depends (directly or indirectly) on non-semantic states of affairs :iff ϕ ∈ Φ_□.
2. ϕ is ungrounded :iff ϕ ∉ Φ_□.
E.g., □(2 = 1 + 1), □(□(2 = 1 + 1)), ∀x(Prov_PA(x) → □(x)) depend on non-semantic states of affairs. On the other hand, every □-Liar λ that is identical to ¬□(t), where t is an arithmetic expression that denotes (the code of) λ in the standard model of arithmetic, is ungrounded. Accordingly, ∀x(□(x) → ¬□(¬̇x)) is ungrounded.
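The approximation of Φ_□ from below can be tried out on a finite toy fragment. The following Python sketch is only an illustration under strong assumptions: the five "sentences" and their truth functions are hypothetical stand-ins (there are no quantifiers and no arithmetic), dependence in the sense of Definition 3.1 is checked by brute force over all candidate extensions of □, and the iteration starts from ∅.

```python
from itertools import combinations

# Hypothetical toy language: sentence name -> truth value, given a set `ext`
# of sentence names that is taken to be the extension of Box.
SENTENCES = {
    "2=1+1": lambda ext: True,                          # purely arithmetical
    "Box(2=1+1)": lambda ext: "2=1+1" in ext,
    "Box(Box(2=1+1))": lambda ext: "Box(2=1+1)" in ext,
    "Liar": lambda ext: "Liar" not in ext,              # stands for: not Box(Liar)
    "Teller": lambda ext: "Teller" in ext,              # stands for: Box(Teller)
}

# Every candidate extension of Box over the toy language.
EXTENSIONS = [
    frozenset(c)
    for r in range(len(SENTENCES) + 1)
    for c in combinations(SENTENCES, r)
]

def depends_on(phi, base):
    """Definition 3.1 on the toy language: extensions that agree on `base`
    must give `phi` the same truth value."""
    seen = {}
    for ext in EXTENSIONS:
        value = SENTENCES[phi](ext)
        if seen.setdefault(ext & base, value) != value:
            return False
    return True

def D(base):
    """The operator D: all toy sentences that depend on `base`."""
    return frozenset(p for p in SENTENCES if depends_on(p, base))

def approximate():
    """Iterate D from the empty set until the least fixed point is reached."""
    stages = [frozenset()]
    while D(stages[-1]) != stages[-1]:
        stages.append(D(stages[-1]))
    return stages

STAGES = approximate()
GROUNDED = STAGES[-1]
# Rank of a grounded sentence = index of the first stage containing it.
RANK = {p: next(i for i, s in enumerate(STAGES) if p in s) for p in GROUNDED}
UNGROUNDED = set(SENTENCES) - GROUNDED
```

In this toy run the Liar-like and truth-teller-like items never enter any stage, so they come out ungrounded in the sense of Definition 3.2, while the iterated Box sentences receive increasing ranks.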
Ungrounded sentences can be further classified into those which are self-referential and those which are not:
Definition 3.3. For ϕ ∈ L_□: ϕ is self-referential :iff for all Φ ⊆ L_□: if ϕ depends on Φ, then ϕ ∈ Φ.
In this way the intuitive notion of self-referentiality is explicated in terms of a formal notion of self-dependency. This is particularly useful in view of the fact that our pre-theoretic concept of self-referentiality is unclear and ambiguous; compare the controversial discussion (see Leitgeb [13] for an analysis) on whether the members of Yablo’s [21] infinite sequence of paradoxical sentences are self-referential or not. According to the above definitions, the latter prove to be ungrounded though not self-referential (if they are expressed by means of infinitely many predicates P_n which are added to the language L_□ and the interpretations of which are chosen to be of the form {∀x(P_{n+1}(x) → ¬□(x)), ∀x(P_{n+2}(x) → ¬□(x)), ...}). Liar sentences
as described above are of course self-referential. Every self-referential sentence is ungrounded but not necessarily vice versa.
Now we are going to show that by restriction to the members of Φ_□ a possible worlds semantics for □ as a predicate can be introduced. We will explain the procedure in terms of an example: Let ⟨W, R⟩ be an arbitrary T-frame, i.e., let R ⊆ W × W be reflexive. For all w ∈ W, we will approximate the final extension of □ in w along R in the following way:
• Γ^w_0 := ∅
• Γ^w_{α+1} := {ϕ ∈ Φ_{α+1} | for all w′ ∈ W: if wRw′, then Val_{Γ^{w′}_α}(ϕ) = 1}
• Γ^w_λ := ⋃_{α<λ} Γ^w_α (for λ a limit).
The sequence (Γ^w_α) is thus defined by a “jump” operation for valuations similar to the one Kripke used to define his three-valued evaluation mappings for partial truth predicates, but of course the jump operation in our theory is defined on classical models. In order to end up with a monotonically increasing sequence of extensions for □ in w, the successor stage clause is restricted to members of Φ_{α+1}. Indeed, by simultaneous transfinite induction over α and β one shows that
Lemma 3.4. For all w ∈ W, for all α ∈ Ord, for all β ∈ Ord with β < α: Γ^w_α ∩ Φ_β = Γ^w_β.
(cf. Theorem 16 in [14]) and thus that for all w, (Γ^w_α) is monotonic and converges to a fixed point Γ^w_□.
By the fixed point property of Γ^w_□ it follows that if Γ^w_□ is used as the extension of □ in w for all worlds w in W, the possible worlds satisfaction clause for □ is satisfied for all members of Φ_□; consequently, all instantiations of the predicate versions of K, T, and necessitation with members of Φ_□ can be shown to hold in the following sense:
Theorem 3.5. For all ϕ, ψ ∈ Φ_□, for all w ∈ W:
• ϕ ∈ Γ^w_□ iff Val_{Γ^w_□}(□(ϕ)) = 1 iff for all w′ ∈ W: if wRw′ then Val_{Γ^{w′}_□}(ϕ) = 1.
• Val_{Γ^w_□}(□(ϕ → ψ) → (□(ϕ) → □(ψ))) = 1.
• Val_{Γ^w_□}(□(ϕ) → ϕ) = 1.
• If for all w ∈ W, Val_{Γ^w_□}(ϕ) = 1, then for all w ∈ W, Val_{Γ^w_□}(□(ϕ)) = 1.
Here are some examples of members of Γ^w_□ (for arbitrary w ∈ W):
• 2 = 1 + 1, □(2 = 1 + 1), □(□(2 = 1 + 1)), ...
• ∀x(Prov_PA(x) → □(x))
• □(□(2 = 1 + 1)) → □(2 = 1 + 1)
• □(∀x(Sent_L(x) ∧ □(□̇(x)) → □(x)))
(where Sent_L(x) is arithmetically definable and expresses that x is the code of a purely arithmetical sentence). □-Liars are not members of Γ^w_□, neither are ∀x(□(□̇(x)) → □(x)) or ∀x(□(x) → ¬□(¬̇x)).
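The jump construction and the first clause of Theorem 3.5 can be illustrated on a finite toy model. The sketch below rests on several assumptions that are not part of the text: the six "sentences", the three-world reflexive frame, and the helper names are all hypothetical, and the toy language has no quantifiers or arithmetic. It iterates the dependency stages and the extensions of Box in each world side by side until both stabilise.

```python
from itertools import combinations

# Hypothetical toy language: sentence name -> truth value under an extension of Box.
SENTENCES = {
    "A": lambda ext: True,                        # a true arithmetical sentence
    "F": lambda ext: False,                       # a false arithmetical sentence
    "Box(A)": lambda ext: "A" in ext,
    "Box(F)": lambda ext: "F" in ext,
    "Box(Box(A))": lambda ext: "Box(A)" in ext,
    "Liar": lambda ext: "Liar" not in ext,        # stands for: not Box(Liar)
}
# Which toy sentence plays the role of Box(p) for a given p.
BOX_OF = {"A": "Box(A)", "F": "Box(F)", "Box(A)": "Box(Box(A))"}

# A made-up reflexive frame (a T-frame).
W = [0, 1, 2]
R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2)}

EXTENSIONS = [
    frozenset(c)
    for r in range(len(SENTENCES) + 1)
    for c in combinations(SENTENCES, r)
]

def depends_on(phi, base):
    """Brute-force version of Definition 3.1 on the toy language."""
    seen = {}
    for ext in EXTENSIONS:
        value = SENTENCES[phi](ext)
        if seen.setdefault(ext & base, value) != value:
            return False
    return True

def D(base):
    return frozenset(p for p in SENTENCES if depends_on(p, base))

def jump(gamma, phi_next):
    """Successor clause: keep the members of the next dependency stage that
    are true at every accessible world under that world's current extension."""
    return {
        w: frozenset(
            p for p in phi_next
            if all(SENTENCES[p](gamma[v]) for v in W if (w, v) in R)
        )
        for w in W
    }

def run():
    phi = frozenset()
    gamma = {w: frozenset() for w in W}
    while True:
        phi_next = D(phi)
        gamma_next = jump(gamma, phi_next)
        if phi_next == phi and gamma_next == gamma:
            return phi, gamma
        phi, gamma = phi_next, gamma_next

GROUNDED, GAMMA = run()

def box_clause_holds():
    """First clause of Theorem 3.5, checked for the pairs listed in BOX_OF."""
    for w in W:
        for p, box_p in BOX_OF.items():
            member = p in GAMMA[w]
            box_true = SENTENCES[box_p](GAMMA[w])
            everywhere = all(SENTENCES[p](GAMMA[v]) for v in W if (w, v) in R)
            if not (member == box_true == everywhere):
                return False
    return True
```

Since the toy language has no contingent vocabulary, every world ends up with the same extension, which agrees with the remark on serial frames made in the text.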
All of our postulates from above can be seen to be satisfied by this semantic construction. The same approach also works for other modal systems like D, S4, S5, ..., for which different or additional constraints on R would have to be introduced. Furthermore, the same methods can be applied if “contingent” vocabulary is used, the interpretation of which is allowed to vary from one world to another (in this way different possible worlds could be distinguishable as first-order models even if they were indistinguishable in terms of their places in the accessibility relation; without such contingent vocabulary, e.g., every world in a serial frame R, i.e., where for all w ∈ W there is a w′ ∈ W such that wRw′, can in fact be shown to correspond to one and the same first-order model for the trivial reason that the same jump operation is applied to the same initial model in each world¹). Accordingly, sound axiomatic systems can be formulated in terms of (i) a (primitive) predicate for Φ_□ and (ii) the necessity predicate □. These axiomatic systems do not have complete axiomatizations of course, since not even their arithmetic fragments have complete axiomatizations (according to Gödel’s incompleteness theorems).
¹ We want to thank Philip Welch for pointing this out to us.
If we choose a (trivial) reflexive frame with precisely one world, i.e.,
• W = {w_0}, R = {⟨w_0, w_0⟩}
then the necessity predicate □ for this world is actually nothing but a truth predicate Tr and the same procedure as above leaves us with a definition of truth for the semantically closed language L_Tr: Φ_Tr and Γ_Tr = Γ^{w_0}_Tr are such that for all ϕ ∈ Φ_Tr: Val_{Γ_Tr}(Tr(ϕ) ↔ ϕ) = 1. The corresponding theory of truth is the one introduced and studied by Leitgeb [14]. Φ_Tr and Γ_Tr can be shown to be Π^1_1 but not Δ^1_1; the least fixed point ordinal α* is identical to the least non-recursive ordinal ω_1^CK (the closure ordinal for induction on the standard model of first-order arithmetic).
Since classical logic is presupposed, Γ_Tr is incomparable with Kripke’s least fixed point for the Strong Kleene scheme, whereas Γ_Tr can be proved to be a proper subset of the least Cantini fixed point (see Cantini [1]) for the supervaluation scheme by which partial valuations are extended to classical ones. The “gap” between Γ_Tr and Cantini’s supervaluation fixed point can be closed by a slight variation of our theory according to which a conditional notion of dependency is used that may be defined as follows (see Leitgeb [14], Section 5.4, for a weaker notion of conditional dependence):
Definition 3.6. ϕ depends on Φ given Γ :iff for all Ψ1, Ψ2 ⊆ L_Tr with (i) Γ ⊆ Ψ1, Ψ2 and (ii) Ψ1, Ψ2 consistent: if Ψ1 ∩ Φ = Ψ2 ∩ Φ then Val_{Ψ1}(ϕ) = Val_{Ψ2}(ϕ).
If we define the sequences (Φ_α) and (Γ_α) co-recursively in the way that
• Φ_0 := ∅, Γ_0 := ∅
• Φ_{α+1} := {ϕ | ϕ depends on Φ_α given Γ_α}, Γ_{α+1} := {ϕ ∈ Φ_{α+1} | Val_{Γ_α}(ϕ) = 1}
• Φ_λ := ⋃_{α<λ} Φ_α, Γ_λ := ⋃_{α<λ} Γ_α (for λ a limit)
then the fixed point of the Γ_α sequence is just Cantini’s least supervaluation fixed point, as can be seen by observing that ϕ ∈ Γ_{α+1} iff for all consistent extensions Ψ of Γ_α: Val_Ψ(ϕ) = 1.
REFERENCES
[1] A. Cantini, A theory of formal truth arithmetically equivalent to ID1, The Journal of Symbolic Logic, vol. 55 (1990), pp. 244–259.
[2] R. Carnap, The Logical Syntax of Language, Kegan Paul, London, 1937.
[3] A. Church, On Carnap’s analysis of statements of assertion and belief, Analysis, vol. 10 (1950), pp. 298–304.
[4] P. Égré, The knower paradox in the light of provability interpretations of modal logic, Journal of Logic, Language and Information, vol. 14 (2005), pp. 13–48.
[5] H. Field, A revenge-immune solution to the semantic paradoxes, Journal of Philosophical Logic, vol. 32 (2003), pp. 139–177.
[6] H. Friedman and M. Sheard, An axiomatic approach to self-referential truth, Annals of Pure and Applied Logic, vol. 33 (1987), pp. 1–21.
[7] V. Halbach, H. Leitgeb, and P. Welch, Possible worlds semantics for modal notions conceived as predicates, Journal of Philosophical Logic, vol. 32 (2003), pp. 179–223.
[8] V. Halbach, H. Leitgeb, and P. Welch, Possible worlds semantics for predicates, Intensionality (R. Kahle, editor), Lecture Notes in Logic, vol. 22, Association for Symbolic Logic, 2005, pp. 20–41.
[9] L. Horsten and H. Leitgeb, No future, Journal of Philosophical Logic, vol. 30 (2001), pp. 259–265.
[10] D. Kaplan and R. Montague, A paradox regained, Notre Dame Journal of Formal Logic, vol. 1 (1960), pp. 79–90.
[11] S. Kripke, An outline of a theory of truth, Journal of Philosophy, vol. 72 (1975), pp. 690–716.
[12] H. Leitgeb, Theories of truth which have no standard models, Studia Logica, vol. 68 (2001), pp. 69–87.
[13] H. Leitgeb, What is a self-referential sentence? Critical remarks on the alleged (non-)circularity of Yablo’s paradox, Logique et Analyse, vol. 177–178 (2002), pp. 3–14.
[14] H. Leitgeb, What truth depends on, Journal of Philosophical Logic, vol. 34 (2005), pp. 155–192.
[15] V. McGee, How truthlike can a predicate be? A negative result, Journal of Philosophical Logic, vol. 14 (1985), pp. 399–410.
[16] R. Montague, Syntactic treatments of modality with corollaries on reflexion principles and finite axiomatizability, Acta Philosophica Fennica, vol. 16 (1963), pp. 153–167.
[17] Y. N. Moschovakis, Elementary Induction on Abstract Structures, Studies in Logic and the Foundations of Mathematics, vol. 77, North-Holland, Amsterdam, 1995.
[18] J. Myhill, Some remarks on the notion of proof, Journal of Philosophy, vol. 57 (1960), pp. 461–471.
[19] W. V. Quine, Three grades of modal involvement, Ways of Paradox and Other Essays, Harvard University Press, Cambridge, MA, 1976, pp. 158–176.
[20] W. V. Quine, From Stimulus to Science, Harvard University Press, Cambridge, MA, 1995.
[21] S. Yablo, Paradox without self-reference, Analysis, vol. 53 (1993), pp. 251–252.

DEPARTMENTS OF PHILOSOPHY AND MATHEMATICS
UNIVERSITY OF BRISTOL
9 WOODLAND ROAD, BRISTOL BS8 1TB, UK
E-mail: [email protected]
STRUCTURAL ANALYSIS OF ARONSZAJN TREES
JUSTIN TATCH MOORE
Abstract. In this paper I will survey some recent developments in the combinatorics of Aronszajn trees. I will cover work on coherent and Lipschitz trees, the basis problem for uncountable linear orderings, subtree bases for Aronszajn trees, and the existence of minimal Aronszajn types.
2000 Mathematics Subject Classification. Primary: 03E05, 03E75, 06A07; Secondary: 03E55, 03E65.
Key words and phrases. Aronszajn tree, basis problem, coherent tree, Countryman line, forcing axiom, Lipschitz tree, non-Suslin base, Shelah’s conjecture, ladder system uniformization.
I would like to acknowledge support provided by NSF grant DMS–0401893. I would also like to acknowledge partial travel support provided by the ASL to present some of the material surveyed in this article at the 2005 Logic Colloquium in Athens, Greece.
Logic Colloquium ’05. Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

§1. Introduction. An Aronszajn tree is an uncountable tree in which all levels and chains are countable. These objects were first constructed by Aronszajn and Kurepa in the course of analyzing Souslin’s Hypothesis. Their study, both in and outside of the context of Souslin’s Hypothesis, has played an important role in the development of set theory ever since. For example, the complete solution of Souslin’s problem represented both some of the pioneering work on the fine structure of the constructible universe by Jensen (see [8]) as well as the birth of the forcing axioms in Solovay and Tennenbaum’s [25]. Todorcevic’s analysis of Aronszajn trees in [29] led to his method of minimal walks, which has seen wide and varied applications (see [34]).
In [28], Todorcevic presented a survey of trees and linear orderings. At the time it appeared, it captured essentially all of the existing knowledge on Aronszajn trees and lines. The purpose of this article is to survey some of the developments in the study of Aronszajn trees which have occurred since [28]. I will focus on Aronszajn trees which are not Souslin, with an emphasis on results which address the structure of these trees both as individuals and as a class.
The topics are organized in approximate chronological order. I will begin with some preliminaries on trees in Section 2. Todorcevic’s analysis of coherent and Lipschitz trees is detailed in Sections 3 and 4. Section 5 discusses Shelah’s conjecture for Aronszajn orders and its proof in [17]. Section 6 surveys Baumgartner’s, Hanazawa’s, and Todorcevic’s work on bases for the subtrees of an Aronszajn tree. Section 7 presents joint work with König, Larson, and Veličković in [12] on the consistency strength of Shelah’s conjecture. Aronszajn tree uniformization and its relation to the consistency of “ω₁ and ω₁* are the only minimal uncountable order types” is discussed in Section 8. The paper closes with some open questions in Section 9.
The reader of this paper is assumed to have some fluency in set theory; [11] and [13] are standard references for background material.
§2. Preliminaries. In this section I will present some of the prerequisites on trees and fix some notation and terminology. Further reading can be found in [28]. The end of this section will also provide some discussion of the axioms which I will be quoting from time to time.
Recall that a tree is a partial ordering (T, <) in which the predecessors of any given t in T are well ordered by <. The order type of this set is called the height of t. All trees in this paper are assumed to be Hausdorff: whenever s ≠ t both have limit height, they have different sets of predecessors. The set of all elements of T of a given height α is denoted T_α and called the α-th level of T. The least strict upper bound for the heights of elements of T is known as the height of T. Trees of height ω₁ in which all levels are countable are known as ω₁-trees. Those which moreover have no uncountable chains are known as Aronszajn trees or simply A-trees.
The following operations are central to the analysis of an A-tree T.
Definition 2.1. If t is in T and α is an ordinal, then t ↾ α is t if α is at least the height of t and otherwise is the element s of T such that s < t and the height of s is α.
Definition 2.2. If s and t are incomparable elements of T, then ∆(s, t) is the greatest ordinal ξ such that s ↾ ξ = t ↾ ξ.
Definition 2.3. If s and t are incomparable elements of T, then s ∧ t is the restriction s ↾ ∆(s, t) = t ↾ ∆(s, t).
Frequently it will be helpful to utilize the following notation. Typically we will be interested in the cases F = ∆(·, ·) and F = ∧(·, ·).
Definition 2.4. Suppose that F is defined on a subset of T^k for some k. If A is a subset of T, then F(A) is the set of all F(ā) such that ā is in the intersection of A^k and the domain of F.
Collections of pairwise incomparable elements in T play an important role in the study of A-trees; they are known as antichains. An important
subclass of the A-trees are the special A-trees: those which can be covered by countably many antichains.
The study of A-trees is connected to the study of A-lines (those uncountable linear orderings which do not contain a real type, ω₁, or ω₁*) by the following definition.
Definition 2.5. If (T, <) is a tree, then a linear ordering ≤lex on T is a lexicographical ordering if, for distinct s and t in T, s ≤lex t is equivalent to s ↾ (ξ + 1) ≤lex t ↾ (ξ + 1), where ξ = ∆(s, t) if s and t are incomparable and ξ = min{ht(s), ht(t)} otherwise.
Any lexicographical ordering on an A-tree is an A-line and every A-line can be represented as a suborder of a lexicographical ordering on an A-tree (see [28]). Frequently we will be interested in the notion of a subtree of an A-tree. There are different ways one can make this precise; we will use the following as our definition.
Definition 2.6. A subtree of an A-tree T is an uncountable subset of T which is closed under the ∧-operation.
Definition 2.7. A subset U of an A-tree T is downward closed if t ≤ u and u in U implies t is in U.
It is worth noting that if A is an uncountable subset of an A-tree T, then ∧(A) is a subtree of T (in particular, ∧(A) is uncountable). Notice that if A is a downward closed subset of T, then ∧(A) is the set of all members of A which have two or more immediate successors in A.
Since many statements about A-trees are independent of ZFC, one cannot have a serious discussion of modern work on A-trees without some mention of additional axioms of set theory. That said, I will not define these axioms here and will refer the interested reader to other standard sources. The additional axioms we will consider fit into two classes.
The axioms CH, ♦, and ♦⁺ are progressively stronger axioms which are all consequences of V = L. They are sometimes referred to as enumeration principles, as they enable the construction of objects with second order properties (such as Souslin trees) by diagonalizations of length ω₁. Moreover, ♦⁺ settles most statements about ω₁. Each of these can be forced by a σ-closed forcing (see [13]). Further reading can be found in [8], [11], and [13].
The axioms MA_ℵ₁, PFA(κ), and PFA also represent a progressively stronger list of axioms which act as alternatives to the enumeration principles above. These are examples of forcing axioms. They can be viewed as postulating forms of Σ₁-absoluteness between V and its generic extensions. The weakest of these axioms, MA_ℵ₁, was introduced in [25] in the course of proving the consistency of Souslin’s Hypothesis. The methods of [25] establish its
consistency and it in turn implies Souslin’s Hypothesis. MA_ℵ₁ negates CH; in general, forcing axioms serve to limit diagonalization constructions of length ω₁ to those which can be carried out in ZFC.
The Proper Forcing Axiom (PFA) has considerable consistency strength; the best known upper bound is the existence of a supercompact cardinal. If κ is an uncountable cardinal, then PFA(κ) is a weakening of PFA introduced in [9] which essentially captures the consequences of PFA for statements about subsets of κ. PFA(ℵ₁) is also known as the Bounded Proper Forcing Axiom (BPFA). While BPFA itself has large cardinal strength (it is equiconsistent with a reflecting cardinal [9]), a number of its consequences cited here have no large cardinal strength. I will use BPFA* to mean that BPFA is used as a hypothesis and that the stated consequence is consistent relative only to ZFC. Further information on MA_ℵ₁ and PFA can be found in [6], [22], [30]. Reading on PFA(κ) can be found in [9], [15], and [32].
The reader who wishes to simplify the axiomatic picture can replace all the assumptions in this paper with either ♦⁺ or PFA and not lose too much information. Still, the results in Section 7 cannot be appreciated without a finer stratification of the forcing axioms.
§3. Coherent trees and Countryman orders. One of the most important notions in the modern analysis of A-trees is that of a coherent tree.
Definition 3.1. [29] A coherent sequence (of length ω₁) is a sequence e_β (β < ω₁) such that the range of each e_β is contained in ω and if β ≤ β′ < ω₁, then
{ξ < β : e_β(ξ) ≠ e_β′(ξ)}
is finite. For such a sequence ē, we can consider
T(ē) = {e_β ↾ α : α ≤ β < ω₁}
as a tree ordered by end extension. Trees having this form are said to be coherent.
Not all coherent trees are Aronszajn: Let T(∅) be the tree of all countable length sequences of elements of ω with finite support. Every uncountable ∧-closed subset of T(∅) has an uncountable chain. It is easily seen, however, that every coherent tree with a co-final branch can be embedded into T(∅). A natural condition on ē which ensures the non-existence of an uncountable branch in T(ē) is that each e_β is finite-to-one. While such sequences can be routinely constructed by transfinite induction, we will see momentarily that natural examples can simply be defined by a recursive formula from an appropriate parameter. We will see in the next section that coherent trees are, in a sense that can be made precise, the irreducible A-trees.
When lexicographically ordered, such sequences give natural examples of Countryman lines — uncountable linear orderings C such that the coordinatewise partial order on C 2 is the union of countably many chains. The existence of such orderings was first proved in [21]. As we will see in Section 5, the existence of Countryman lines has important implications for the basis problem for uncountable linear orders. Theorem 3.2. [29] If C = {e : < 1 } is a coherent sequence of finite-toone functions, then C is Countryman when given the lexicographical ordering. Now we will give the standard example of a coherent sequence of finite-toone functions. First we will need a definition. Definition 3.3. A C -sequence (on 1 ) is a sequence Cα (α < 1 ) such that Cα ⊆ α is co-final in α and if < α, then Cα ∩ is finite for each α < 1 . Example 3.4. [29] If α ≤ , define 1 (α, ) recursively by 1 (α, α) = 0,
1 (α, ) = max 1 α, min(C \ α) , |C ∩ α| .
Alternately, 1 (α, ) = max |Ci ∩ α| i
where 0 = , i+1 = min(Ci ∩ α) if i > α and l is such that l = α. Set e (α) = 1 (α, ). Then e ( < 1 ) is a coherent sequence of finite-to-one functions. While it does not quite fall within the scope of this article, let me also point out that the ZFC construction of an L space in [18] is a consequence of the study of certain A-trees such as T ( 1 ). A wide variety of other constructions and applications of the method of minimal walks is presented in [34]. §4. The structure of the class of Lipschitz trees. The class of trees themselves are equipped with a number of natural quasi-orderings. The one of interest to us here will be defined by S ≤ T iff there is a strictly increasing function from S into T . The restriction of this ordering to the class A of A-trees is already an interesting object to study. While it was shown in [1] that every two A-trees are consistently club isomorphic, the order ≤ and the corresponding equivalence ≡ allowed for a finer study of A-trees. For example, Laver asked in [14] whether the class of A-trees was well quasi-ordered1 by the stronger quasiorder in which the embedding is 1 A reflexive, transitive relation ≤ is a well quasiorder if whenever A (i < ) is a sequence of i its elements, there are i < j such that Ai ≤ Aj . This is equivalent to ≤ being both well founded and not containing infinite antichains.
JUSTIN TATCH MOORE
required to preserve meets after showing that the class of $\sigma$-scattered trees is w.q.o. under this quasiorder. Over the course of several years, Todorcevic produced a number of results regarding this question. The culmination of this work is [26], in which he proves Theorems 4.2 and 4.3 below, giving a strong negative answer to Laver's question. Todorcevic showed that coherent trees can be used to construct examples of trees $S$ and $T$ such that $S < T$; before this it was not even known that the collection of $\equiv$-equivalence classes was infinite.

Example 4.1. [26] Suppose that $T$ is a coherent tree. If $t$ is in $T$, define $t^+$ by $t^+(\xi + 1) = t(\xi)$ whenever $\xi + 1$ is in the domain of $t$ and setting $t^+(\xi) = 0$ if $\xi$ is a limit ordinal. Put $T^+ = \{t^+ : t \in T\}$. Then $t \mapsto t^+$ witnesses that $T \le T^+$. It is furthermore possible to show that $T < T^+$.

Theorem 4.2. [26] There are coherent A-trees $S_m$ ($m \in \mathbb{Z}$) such that $m < n$ implies $S_m < S_n$.

Theorem 4.3. [26] There is a family $\mathcal{F}$ of cardinality $2^{\aleph_1}$ which consists of pairwise incomparable A-trees.

The first result is obtained by finding a tree $S$ for which the inverse of the shift operation in Example 4.1 gives a meaningful output and which can moreover be iterated. The latter result is obtained by carefully "gluing" a number of coherent trees together. Coherent trees turn out to play a special role in the study of $(\mathcal{A}, \le)$. The following definitions play a central role in [26].

Definition 4.4. [26] If $T$ is an A-tree and $f$ is a partial map from $T$ to $T$ which is level preserving, then we say $f$ is Lipschitz if $\Delta(s, t) \le \Delta(f(s), f(t))$ for all $s, t$ in the domain of $f$.

Definition 4.5. [26] An $\omega_1$-tree $T$ is Lipschitz if whenever $f : T \to T$ is a partial level preserving map with uncountable domain, there is an uncountable restriction of $f$ which is Lipschitz. Let $\mathcal{C}$ denote the collection of all Lipschitz trees.
While the limitation of this definition to A-trees is in part to prevent trivial discussion, any meaningful extension of this notion already implies that the tree is Aronszajn. For instance, if the tree is uncountable and has the property that every two elements have incomparable extensions, then the above condition implies that the tree is Aronszajn.
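The Lipschitz condition of Definition 4.4 and the shift of Example 4.1 can be tried out on a finite toy tree. The following is only a sketch under stated assumptions: nodes are finite binary sequences (so this is not an Aronszajn tree), $\Delta(s, t)$ is taken to be the height of the meet of $s$ and $t$, and index 0 is treated as a limit stage. The names `delta`, `shift`, `flip_some` and `NODES` are hypothetical and are introduced here purely for illustration.

```python
from itertools import product


def delta(s, t):
    """Height of the meet of s and t: the first index where they differ,
    or the shorter length if one is an initial segment of the other."""
    n = min(len(s), len(t))
    for i in range(n):
        if s[i] != t[i]:
            return i
    return n


def shift(t):
    """Finite analogue of t -> t^+ (assumption: index 0 plays the role of a
    limit stage). Same length as t, shift(t)[0] == 0, shift(t)[i + 1] == t[i]."""
    return (0,) + t[:-1] if t else ()


def is_level_preserving(f, nodes):
    """f keeps every node on its own level (here: same length)."""
    return all(len(f(t)) == len(t) for t in nodes)


def is_lipschitz(f, nodes):
    """Definition 4.4 restricted to a finite set of nodes:
    delta(s, t) <= delta(f(s), f(t)) for all s, t."""
    return all(delta(s, t) <= delta(f(s), f(t)) for s in nodes for t in nodes)


def is_strictly_increasing(f, nodes):
    """If s is a proper initial segment of t, then f(s) is a proper
    initial segment of f(t)."""
    def below(s, t):
        return len(s) < len(t) and t[:len(s)] == s
    return all(below(f(s), f(t)) for s in nodes for t in nodes if below(s, t))


def flip_some(t):
    """A level preserving map that is not Lipschitz: it flips the first
    coordinate, but only for nodes whose last entry is 1."""
    if t and t[-1] == 1:
        return (1 - t[0],) + t[1:]
    return t


# Hypothetical test data: all binary sequences of length at most 4.
NODES = [seq for n in range(5) for seq in product((0, 1), repeat=n)]
```

On `NODES`, the map `shift` is level preserving, strictly increasing and Lipschitz, while `flip_some` is level preserving but fails the Lipschitz condition (compare the pair `(0, 0, 1)` and `(0, 0, 0)`). Nothing here reflects the uncountable combinatorics that make Lipschitz trees interesting; it only makes the inequality in Definition 4.4 concrete.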
STRUCTURAL ANALYSIS OF ARONSZAJN TREES
While Definition 4.5 is appealing from an aesthetic point of view, the following lemma gives an equivalent formulation which is more useful in applications.

Lemma 4.6. [26] Suppose that $T$ is a Lipschitz tree, $\Xi \subseteq \omega_1$ is uncountable, and $a_\xi$ ($\xi \in \Xi$) are such that $a_\xi \subseteq T_\xi$ has a fixed finite size $k$ for $\xi \in \Xi$. There is an uncountable $\Xi' \subseteq \Xi$ such that if $\xi \ne \eta$ are in $\Xi'$, then $\Delta(a_\xi(i), a_\eta(i))$ does not depend on $i$ for $i < k$.

Here and elsewhere $a(i)$ is the $i$th least element of $a$ in some fixed lexicographical ordering on $T$. This lemma is often combined with the following standard lemma.

Lemma 4.7. Suppose that $T$ is an A-tree, $\Xi \subseteq \omega_1$ is uncountable, and $a_\xi$ ($\xi \in \Xi$) are such that $a_\xi \subseteq T_\xi$ has a fixed finite size $k$ for $\xi \in \Xi$. There is an uncountable $\Xi' \subseteq \Xi$ such that if $\xi \ne \eta$ are in $\Xi'$, then
$$\Delta(a_\xi \cup a_\eta) \setminus \bigl(\Delta(a_\xi) \cup \Delta(a_\eta)\bigr) = \{\Delta(a_\xi(i), a_\eta(i)) : i < k\}.$$

The following theorem shows that there is a rather natural necessary and sufficient condition for a coherent tree to be Lipschitz.

Theorem 4.8. [26] A coherent tree is Lipschitz iff every uncountable subset contains an uncountable antichain.

In the presence of MA$_{\aleph_1}$, the converse is true as well.

Theorem 4.9. [26] Assuming MA$_{\aleph_1}$, every Lipschitz tree is isomorphic to a coherent tree.

As mentioned above, under an appropriate hypothesis, Lipschitz trees are irreducible objects.

Theorem 4.10. [26] (MA$_{\aleph_1}$) If $S$ is a downward closed subtree of a Lipschitz tree $T$, then $S \equiv T$.

The following combinatorial object is useful both as an invariant and as a tool in the study of Lipschitz trees.

Definition 4.11. [26] Suppose that $T$ is a Lipschitz tree. Define
$$\mathcal{U}(T) = \{X \subseteq \omega_1 : \exists S \subseteq T\ (S \text{ is a subtree}) \text{ and } (\Delta(S) \subseteq X)\}.$$

Theorem 4.12. [26] If $T$ is a Lipschitz tree, then $\mathcal{U}(T)$ is a filter.

This is a routine consequence of Lemma 4.6. Furthermore, if one makes a rather mild set theoretic assumption, then $\mathcal{U}(T)$ measures all subsets of $\omega_1$.

Theorem 4.13. [26] MA$_{\aleph_1}$ implies that $\mathcal{U}(T)$ is an ultrafilter for every Lipschitz tree $T$.
Hence, assuming MA$_{\aleph_1}$, $\mathcal{U}(T)$ is an example of a uniform ultrafilter on $\omega_1$ which is $\Sigma_1$-definable over $H(\aleph_1^+)$, a fact of independent interest. It is well known, for instance, that there is no uniform ultrafilter on $\omega$ which is $\Sigma_1$-definable over $H(\aleph_0^+)$. Notice that the number of elements required to generate $\mathcal{U}(T)$ is at most the cardinality of a subtree base for $T$ (see Section 6 for a definition of subtree base). Hence if $T$ has a subtree base of cardinality $\aleph_1$, $\mathcal{U}(T)$ is not an ultrafilter. Rather remarkably, assuming MA$_{\aleph_1}$ the ultrafilters $\mathcal{U}(T)$ provide a complete invariant for $(\mathcal{C}, \le)$.

Theorem 4.14 (MA$_{\aleph_1}$). Suppose that $S$ and $T$ are Lipschitz trees. The following are equivalent:
(1) $S \le T$,
(2) $S \not> T$,
(3) There is an $f : \omega_1 \to \omega_1$ such that $\xi \le f(\xi)$ for all $\xi < \omega_1$ and $f(\mathcal{U}(S)) = \mathcal{U}(T)$.

In particular, two elements of $\mathcal{C}$ are equivalent iff their corresponding ultrafilters are equal. Also, if we make the following definition, we can work in the broader context of A-trees.

Definition 4.15. If $T$ is an A-tree, let $\mathcal{F}(T)$ be the collection of all $X \subseteq \omega_1$ such that $T$ can be covered by countably many sets $Z$ such that $\Delta(Z) \subseteq X$.

Since the collection of partitions of a set into countably many pieces is directed when given the order of refinement, $\mathcal{F}(T)$ is a filter. Notice that if $t$ has height $\delta$, then $\Delta(\{s \in T : s \text{ is comparable with } t\}) \subseteq \omega_1 \setminus \delta$. Since $T_\delta$ is countable, $\mathcal{F}(T)$ contains every co-countable subset of $\omega_1$. Clearly $\mathcal{F}(T) \subseteq \mathcal{U}(T)$, even when $T$ is not Lipschitz and $\mathcal{U}(T)$ is not necessarily a filter. If $T$ is a Souslin tree, then $\mathcal{F}(T)$ is exactly the co-countable subsets of $\omega_1$. The method of proof of Theorem 4.13 can be used to prove the following theorem.

Theorem 4.16 (MA$_{\aleph_1}$). If $T$ is a Lipschitz tree, then $\mathcal{F}(T)$ is an ultrafilter and therefore is equal to $\mathcal{U}(T)$.

Hence, in the context of MA$_{\aleph_1}$, $\mathcal{F}(T)$ can be viewed as a generalization of $\mathcal{U}(T)$ to the class of all A-trees.
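The remark after Definition 4.15 that $\mathcal{F}(T)$ is a filter can be unpacked as a short computation. What follows is a sketch rather than a quotation from [26]: it assumes that $\Delta(Z)$ denotes $\{\Delta(s, t) : s \ne t \text{ in } Z\}$, and the names $Z_n$, $W_m$, $X'$ are introduced here only for illustration.

```latex
% Sketch: F(T) is closed under finite intersections and supersets.
% Assumption: \Delta(Z) = \{\Delta(s,t) : s \neq t \text{ in } Z\}.
\begin{align*}
&\text{Let } X, Y \in \mathcal{F}(T), \text{ witnessed by }
  T = \bigcup_{n<\omega} Z_n \text{ with } \Delta(Z_n) \subseteq X
  \text{ and } T = \bigcup_{m<\omega} W_m \text{ with } \Delta(W_m) \subseteq Y. \\
&\text{Then } T = \bigcup_{n,m<\omega} (Z_n \cap W_m)
  \text{ is again a countable cover, and} \\
&\Delta(Z_n \cap W_m) \subseteq \Delta(Z_n) \cap \Delta(W_m) \subseteq X \cap Y,
  \text{ so } X \cap Y \in \mathcal{F}(T). \\
&\text{If } X \subseteq X' \subseteq \omega_1, \text{ the same cover }
  \{Z_n : n < \omega\} \text{ witnesses } X' \in \mathcal{F}(T).
\end{align*}
```

The common refinement $\{Z_n \cap W_m\}$ is exactly the directedness under refinement that the text refers to.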
Finally, we have the following result, which contrasts with Theorem 4.3 and shows that the amalgamation of coherent trees is necessary to obtain that result.

Theorem 4.17. [26] (BPFA$^*$) The collection $\mathcal{C}$ is a chain which is both cofinal and co-initial in $(\mathcal{A}, \le)$ and which has neither a maximal nor a minimal element.
§5. Partitions of A-trees and Shelah's Conjecture.

In this section we will discuss a conjecture of Shelah and its recent solution.

Conjecture 5.1. [21] It is consistent that every Aronszajn line contains a Countryman suborder.

This conjecture will subsequently be referred to in this paper as Shelah's Conjecture. First let us consider the motivation for this conjecture. In [2], Baumgartner showed that it is consistent with the usual axioms of set theory that every two $\aleph_1$-dense sets of reals are isomorphic. In fact, he showed that this conclusion follows from PFA. In particular, BPFA$^*$ implies that the uncountable separable linear orderings have a single element basis consisting of an arbitrary set of reals of size $\aleph_1$. It is tempting to believe that an analogous result might follow for the class of Aronszajn lines as well. Stated so simply, however, this is false. Shelah proved in [21] that there is a Countryman line: an uncountable linear ordering $C$ such that $C^2$ is the union of countably many chains. This refuted a conjecture made by Countryman a few years prior [7]. Countryman orders are necessarily Aronszajn; I will leave this as an exercise to the interested reader. A key property of Countryman orders is that, unlike $\aleph_1$-dense real types, they have distinct notions of "left" and "right." If $f$ is an order reversing partial map from $C$ into $C$, then $f$ meets every chain in $C^2$ in at most a singleton and therefore must be countable. It follows that $C$ and $C^*$ are not near: no uncountable linear order embeds into both of them. In [21], it was conjectured however that consistently every two Countryman orders are either near or co-near, meaning that one is near the converse of the other. The analysis of this conjecture then developed in the folklore for some time (see, e.g., [1, p. 79], [3], [4]). At some point it was proved that, assuming MA$_{\aleph_1}$, the Countryman lines have a two element basis, proving the second conjecture in [21].
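The claim above that an order reversing partial map of a Countryman line into itself must be countable follows from a short computation. The sketch below assumes that "order reversing" means strictly order reversing and fixes a decomposition $C^2 = \bigcup_{n<\omega} D_n$ into chains; the names $D_n$ are introduced only for this illustration.

```latex
% Sketch: a strictly order reversing partial map f on a Countryman line C
% is countable. Assumptions: C^2 = \bigcup_{n<\omega} D_n with each D_n a
% chain in the coordinatewise order, and x < y implies f(x) > f(y).
\begin{align*}
&\text{Suppose } (x, f(x)) \text{ and } (y, f(y)) \text{ both lie in } D_n
  \text{ and } x < y. \\
&\text{Since } D_n \text{ is a chain and } y \not\le x,\ 
  (x, f(x)) \le (y, f(y)) \text{ coordinatewise, so } f(x) \le f(y). \\
&\text{But } f \text{ is order reversing, so } f(x) > f(y),
  \text{ a contradiction.} \\
&\text{Hence } |f \cap D_n| \le 1 \text{ for every } n < \omega, \text{ and }
  |f| \le \sum_{n<\omega} |f \cap D_n| \le \aleph_0 .
\end{align*}
```

The same argument applied to the graph of an embedding into $C$ composed with one into $C^*$ is what lies behind the assertion that $C$ and $C^*$ are not near.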
The following theorem finally appeared in full in [26], with some of these equivalences being original to [26].

Theorem 5.2 (BPFA$^*$). The following are equivalent:
(1) The uncountable linear orderings have a five element basis consisting of $X$, $\omega_1$, $\omega_1^*$, $C$, and $C^*$ whenever $X$ is a set of reals of cardinality $\aleph_1$ and $C$ is a Countryman line.
(2) Every Aronszajn order contains a Countryman suborder.
(3) For every A-tree $T$ and every $K \subseteq T$, there is a subtree of $T$ which is either contained in or disjoint from $K$.
(4) There is an A-tree $T$ such that for every $K \subseteq T$, there is a subtree of $T$ which is either contained in or disjoint from $K$.
(5) For every pair $S$ and $T$ of A-trees and every uncountable partial level preserving $f$ from $S$ into $T$, either $f$ or $f^{-1}$ has an uncountable Lipschitz restriction.
(6) If $T$ is a Lipschitz tree, then $T^+$ is the immediate successor of $T$ in $(\mathcal{A}, \le)$.
Remark 5.3. The implication (2) implies (3) is a theorem of ZFC.

In [17] I proved that Item (4) follows from PFA and in fact from the conjunction of BPFA and the Mapping Reflection Principle (MRP) introduced in [16]. Combining this with the above theorem yields the following results.

Theorem 5.4. [17] (PFA) Every Aronszajn line contains a Countryman suborder.

Corollary 5.5. [2] [17] [26] (PFA) The uncountable linear orderings have a five element basis consisting of $X$, $\omega_1$, $\omega_1^*$, $C$, and $C^*$ whenever $X$ is a set of reals of cardinality $\aleph_1$ and $C$ is a Countryman line.

In the remainder of this section I will present some of the motivation and insight which led to the proof. To this end, let $T$ be a special coherent binary A-tree which is closed under finite modifications. Adding the finite modifications if necessary, the tree $T(\rho_3)$ of [34] is such an example. The first standard approach toward building a forcing to introduce a subtree $S \subseteq K$ is the following.

Definition 5.6. $\mathcal{H}(K)$ is the collection of all finite $X \subseteq T$ such that $\wedge(X)$ is contained in $K$.

$\mathcal{H}(K)$ is considered as a forcing notion by giving it the order of reverse inclusion. In fact, if one assumes that $K$ is a union of levels of $T$, then either $\mathcal{H}(K)$ is c.c.c. (in which case it forces that $K$ contains a subtree) or else there is a subtree of $T$ which is disjoint from $K$ [26]. This is a direct consequence of Lemmas 4.6 and 4.7 and is how Theorem 4.13 is proved. Moreover, this shows that if $\mathcal{U}(T)$ is not an ultrafilter, then the countable chain condition is not productive.

We will now examine how $\mathcal{H}(K)$ may fail to be c.c.c. for an arbitrary $K \subseteq T$. For convenience we will let $\mathcal{E}$ denote the collection of all clubs $E \subseteq [H(\aleph_1^+)]^{\aleph_0}$ which consist of elementary submodels which contain $T$ and $K$ as elements. Let $E_0$ denote the element of $\mathcal{E}$ which consists of all such submodels which have $T$ and $K$ as elements. The following object is a local version of $\mathcal{U}(T)$.

Definition 5.7.
If $P$ is in $E_0$, then $\mathcal{I}_P(T)$ is the collection of all $I \subseteq \omega_1$ which are disjoint from some set of the form $\{\Delta(t, u) : u \in X\}$ where $X$ is an uncountable subset of $T$ in $P$, and $t$ is a fixed element of the downward closure of $X$ of height $P \cap \omega_1$.

A proof similar to that for Theorem 4.12 shows that $\mathcal{I}_P(T)$ is an ideal. The same argument shows that the dual filter $\mathcal{U}_P(T)$, which consists of complements of elements of $\mathcal{I}_P(T)$, extends $\mathcal{U}(T) \cap P$.
Definition 5.8. If $P$ is in $E_0$ and $X$ is a finite subset of $T$, then we say that $P$ rejects $X$ if
$$\bigcap_{t \in X} \{\xi < \omega_1 : t \restriction \xi \in K\}$$
is in $\mathcal{I}_P(T)$.

The following observation is what motivates the definition of rejection.

Observation 1. Suppose that $X$ is in $\mathcal{H}(K)$ and that $P$ is in $E_0$. $X$ is rejected by $P$ iff there is an uncountable antichain $A \subseteq \mathcal{H}(K)$ in $P$ such that $X$ is in $A$, where $\delta = P \cap \omega_1$. In particular, $\mathcal{H}(K)$ satisfies the countable chain condition iff no element of $\mathcal{H}(K)$ is rejected by any element of $E_0$.

This follows from applications of Lemmas 4.6 and 4.7 and the observation that, since $T_\delta$ is countable, there are $\Xi \subseteq \omega_1$ and $\langle a_\xi : \xi \in \Xi \rangle$ in $P$ with $\delta \in \Xi$ and $a_\delta = X$. Notice that $\mathcal{H}(K)$ contains all of the singletons of elements of $T$. This provides an important special case of Observation 1.

Observation 2. If $t$ is in $T$ and some $P$ in $E_0$ rejects $\{t\}$, then there is a subtree of $T$ which is disjoint from $K$.

A general form of Observation 1 can be phrased as follows.

Observation 3. If $E$ is in $\mathcal{E}$, $k < \omega$, and $F_\xi$ ($\xi < \omega_1$) is a sequence of disjoint $k$-element subsets of $T$ such that no element of $E$ rejects any member of the sequence, then there are $\xi < \eta$ such that $F_\xi(i) \wedge F_\eta(i)$ is in $K$ for all $i < k$.

Notice that any sequence $F_\xi$ ($\xi < \omega_1$) as in Observation 3 can be refined using Lemma 4.7 so that, for any $\xi, \eta < \omega_1$, any element of
$$\wedge(F_\xi \cup F_\eta) \setminus \bigl(\wedge(F_\xi) \cup \wedge(F_\eta)\bigr)$$
is of the form $F_\xi(i) \wedge F_\eta(i)$ for some $i < k$. Hence, while elements of the sequence may themselves have meets outside of $K$, the new meets in $\wedge(F_\xi \cup F_\eta)$ can be arranged to all be in $K$. The next observation shows that, assuming an appropriate hypothesis, the set of $P$ in $E_0$ which reject a given set $X$ satisfies a dichotomy.

Observation 4 (MRP). There is a closed and unbounded subset $D$ of $H(2^{\aleph_1^+})$ such that if $X$ is a finite subset of $T$ and $N$ is in $D$, then there is an $E \in \mathcal{E} \cap N$ such that either
(1) every $P$ in $E \cap N$ rejects $X$ or
(2) no $P$ in $E \cap N$ rejects $X$.
Here MRP is the Mapping Reflection Principle introduced in [16] in the course of showing that BPFA implies $|\mathbb{R}| = \aleph_2$. For us, it suffices to know that Observation 4 follows from MRP, which in turn follows from PFA; that MRP has considerable large cardinal strength, well beyond that of BPFA; and that its consequences, such as the above observation, cannot generally be accomplished by a broad class of examples of proper forcings built by Todorcevic's method of using models as side conditions [27] (see below).

These observations taken together suggest the consideration of another notion of compatibility based upon the definition of rejection. This is built into the following forcing via Todorcevic's method of using models as side conditions.

Definition 5.9. $\partial(K)$ consists of all pairs $p = (X_p, \mathcal{N}_p)$ such that:
(1) $\mathcal{N}_p$ is a finite $\in$-chain of elements of $D$.
(2) $X_p \subseteq T$ is a finite set and if $N$ is in $\mathcal{N}_p$, then there is an $E$ in $\mathcal{E} \cap N$ such that $X_p$ is not rejected by any element of $E \cap N$.

A variation of this forcing which is also relevant is $\partial\mathcal{H}(K) = \{p \in \partial(K) : X_p \in \mathcal{H}(K)\}$. The proof of Theorem 5.4 can now be summarized as follows.

Lemma 5.10. [17] (BPFA) If $\partial\mathcal{H}(K)$ is canonically proper,$^2$ then there is a subtree $S$ of $T$ which is either contained in or disjoint from $K$.

Lemma 5.11. [17] (BPFA) If $\partial\mathcal{H}(K)$ is not canonically proper, then neither is $\partial(K)$.

Lemma 5.12. [17] (MRP) If $\partial(K)$ is not canonically proper, then there is a c.c.c. forcing $Q$ which does not have property K (in particular BPFA is false).

Observation 2 is used to show that if $\partial\mathcal{H}(K)$ fails to meet density conditions, then there is a subtree $S$ which is disjoint from $K$; this is the content of the proof of Lemma 5.10. Lemma 5.11 is proved by showing that if $\partial\mathcal{H}(K)$ is not canonically proper but $\partial(K)$ is, then $\partial(K)$ can be used to force a failure of Observation 3 which, by BPFA, would exist in $V$, giving a contradiction.
In the proof of Lemma 5.12, Observation 4 is used in the verification of the countable chain condition in the forcing $Q$. The forcing $Q$ is a collection of finite approximations to a failure of Observation 3 and therefore cannot have property K.

We will now turn to the necessity of Observation 4. The following axiom gives a canonical failure of MRP at the level of $\omega_1$.

$\mho$: There are continuous functions $f_\alpha : \alpha \to \omega$ for each $\alpha < \omega_1$ such that if $E \subseteq \omega_1$ is closed and unbounded, then there is a limit point $\delta$ in $E$ such that $f_\delta$ takes all values in $\omega$ on any final segment of $E \cap \delta$.

2 $\partial\mathcal{H}(K)$ is canonically proper if $p$ is $(M, \partial\mathcal{H}(K))$-generic whenever $M$ is a countable elementary submodel of a suitable $H(\theta)$ and $M \cap H(2^{\aleph_1^+})$ is in $\mathcal{N}_p$.
Theorem 5.13. [19] $\mho$ implies that there is an Aronszajn order with no Countryman suborder.

The significance of this construction is that instances of $\mho$ are quite robust. For example, in [19] it is shown that forcings which negate an instance of $\mho$ cannot be within the class of -collapse*c.c.c. forcings of [33]. These forcings capture most examples of proper forcings built using Todorcevic's method [27]. This class is moreover sufficient for nearly all applications of PFA: MA$_{\aleph_1}$, the isomorphism of all $\aleph_1$-dense sets of reals, the club-isomorphism of A-trees, the non-existence of S spaces and Kurepa trees, the failure of $\square(\kappa)$, OCA, and $2^{\aleph_0} = \aleph_2$. The end of [17] presents an abstract form of Observation 4 which may be useful in handling this difficulty in future applications of PFA.

§6. A basis for the subtrees of an A-tree.

Before moving on to an analysis of the consistency strength of Shelah's Conjecture, it will be worthwhile recalling some older results of Baumgartner, Hanazawa, and Todorcevic concerning subtrees of A-trees. A basic question to ask regarding the subtrees of a given A-tree $T$ is their co-initiality: what is the minimum cardinality of a collection $\mathcal{F}$ of subtrees of $T$ such that every subtree contains an element of $\mathcal{F}$ as a subset? In the case of Souslin trees, this cardinal is trivially $\aleph_1$ since every subtree of a Souslin tree contains one of the form
$$T[s] = \{t \in T : s \le t \text{ or } t \le s\}.$$
I will refer to a co-initial family of subtrees as a subtree base for $T$.$^3$ In the papers discussed in this section, the notion of a subtree which was considered is that of a downward closed subtree. It is easily seen, though, that if $\mathcal{F}$ is co-initial in the downward closed subtrees, then $\{\wedge(S) : S \in \mathcal{F}\}$ is co-initial in the subtrees. The first paper to study the minimum cardinality of a subtree base is [10], in which the following result was proved.

Theorem 6.1. [10] $\diamondsuit^+$ implies that there is an A-tree which does not have a subtree base of size $\aleph_1$.
3 Subtree bases are sometimes referred to as anti-Souslin bases in the literature.

In fact the following example, due to Todorcevic (see [5]), shows that the existence of a Kurepa tree (an $\omega_1$-tree with at least $\aleph_2$ branches) suffices as a hypothesis.

Example 6.2. Let $T$ be an A-tree and $U$ be an $\omega_1$-tree. Consider
$$T \otimes U = \{(t, u) \in T \times U : \mathrm{ht}(t) = \mathrm{ht}(u)\}.$$
Since the projection of a chain in $T \otimes U$ onto the first coordinate is a chain in $T$, $T \otimes U$ has no uncountable chains. Since both $T$ and $U$ have countable levels, so does $T \otimes U$. Hence $T \otimes U$ is Aronszajn. If $b$ is an uncountable branch in $U$, let
$$S_b = \{(t, u) \in T \otimes U : u \in b\}.$$
For such a $b$, $S_b$ is an uncountable subtree of $T \otimes U$ and if $b \ne b'$ are uncountable branches through $U$, then $S_b \cap S_{b'}$ is countable. In particular any subtree base for $T \otimes U$ must have at least the cardinality of the set of uncountable branches through $U$.

Remark 6.3. In [1] it was shown that if $2^{\aleph_0} = \aleph_1$ and $2^{\aleph_1} = \aleph_2$, then there is a cardinal preserving proper forcing extension in which all A-trees are club-isomorphic. Hence, if there is a Kurepa tree in the ground model, no A-tree is saturated in the extension. Here a tree is said to be saturated if every family of subtrees with pairwise countable intersection has size at most $\aleph_1$. This observation is due to Todorcevic and shows that no $\Sigma_1$-property of an A-tree $T$, such as coherence, can imply even that the subtrees of $T$ fail to contain an almost disjoint family of size $\aleph_2$.

On the other hand, we have the following result, discovered by Todorcevic and then independently by Baumgartner.

Theorem 6.4. [28, 8.13] [5] Suppose that $V[G]$ is a $\sigma$-closed forcing extension of $V$. If $T$ is an A-tree in $V$ and $S$ is a subtree of $T$ which is in $V[G]$, then there is a subtree $S_0$ of $S$ with $S_0$ in $V$.

An immediate consequence is the following result, which contrasts with Theorem 6.1.

Theorem 6.5. [5] After Levy collapsing an inaccessible cardinal to $\aleph_2$, every A-tree has a subtree base of size $\aleph_1$.

The proof of Theorem 6.4, like the result itself, can be considered as an extension of Silver's argument that $\sigma$-closed forcings do not add new branches through $\omega_1$-trees [24]. Using an appropriate closing off argument, it is possible to show that if $p$ is an element of a $\sigma$-closed forcing and $p$ forces that $\dot{S}$ is a downward closed subset of $\check{T}$, then there is a $q \le p$ and a $\delta < \omega_1$ such that $q$ forces $\dot{S} \cap T_\delta$ is empty. While this was not the emphasis of [5], Theorem 6.4 reveals a rather uncommon phenomenon.
Typically enumeration principles, especially those as strong as $\diamondsuit^+$, can be used to construct substructures with strong second order properties. For example, Sierpinski has shown that CH implies that every uncountable set of reals $X$ contains a suborder $Y$ such that there is no monotonic function from $X$ into $Y$. This theorem shows that, in the case of an arbitrary A-tree $T$, there are considerable limitations on the types of subtrees of $T$ which can be constructed using even $\diamondsuit^+$. For instance, if $V$ is a $\sigma$-closed extension of a model of MA$_{\aleph_1}$, then there is an A-tree which club-embeds into all of its subtrees; any A-tree which admits a Countryman lexicographical ordering suffices (see Section 5). The implications of this will be discussed further in Section 8.

In the next section we will be interested in the weaker assertion that no A-tree contains an almost disjoint family of subtrees of size $\aleph_2$. This assertion will be referred to as A-tree saturation. This statement shows up in the analysis of the consistency strength of Shelah's Conjecture. The above argument shows that the consistency strength of this statement is exactly that of an inaccessible cardinal. Obtaining the consistency strength of this statement together with, e.g., MA$_{\aleph_1}$, however, is a more subtle matter.

§7. The consistency strength of Shelah's Conjecture and A-tree saturation.

Unlike many of the applications of PFA to statements about $H(\aleph_1^+)$ given in the past (see [30], [31]), the large cardinals in the proof of Theorem 5.4 cannot be immediately eliminated. In fact, even though Shelah's Conjecture represents a conjunction of $\Sigma_1$-formulas in the language of $H(\aleph_1^+)$, the assertion that $\partial(K)$ is proper is not such a sentence in this language, and it may be that the existence of a proper forcing which adds a $K$-homogeneous subtree for a given $K \subseteq T$ is not a theorem of ZFC. Hence it is not even clear that PFA($\aleph_1$) (a.k.a. BPFA), which asserts that $(H(\aleph_1^+), \in)$ is $\Sigma_1$-elementary in every proper forcing extension, suffices to imply Shelah's Conjecture.

Moreover, while PFA($\aleph_1$) is equiconsistent with a reflecting cardinal [9], the best upper bound on the consistency strength of Theorem 5.4 provided by [17] is that of a cardinal $\theta$ which is $H(2^{\theta^+})$-reflecting. The former is weaker in consistency strength than a Mahlo cardinal; the latter is, for instance, sufficient to force SRP($\aleph_2$) and to prove the existence of inner models with many Woodin cardinals [32]. The search for a better bound led to [12], in which we investigated the consistency strength of Shelah's Conjecture and, in particular, the hypothesis necessary to prove Observation 4.
The following results are the culmination of this work.

Theorem 7.1. [12] PFA($\aleph_2$) implies A-tree saturation.

Theorem 7.2. [12] The conjunction of PFA($\aleph_1$) and A-tree saturation implies Shelah's Conjecture. In particular, PFA($\aleph_2$) implies Shelah's Conjecture.

Theorem 7.3. [12] If $\kappa$ is Mahlo, then there is a set forcing extension of $L_\kappa$ which satisfies Shelah's Conjecture.

The consistency strength of PFA($\aleph_2$) is exactly that of a cardinal $\theta$ which is $H(\theta^+)$-reflecting [15]. Such cardinals are weakly compact but weaker in consistency strength than the existence of $0^\sharp$. By methods mentioned below, Theorem 7.2 provides us with an even sharper bound on the consistency strength: a cardinal which is both reflecting and Mahlo suffices. If the proper class ordinal is 2-Mahlo, then there is a proper class of such cardinals.
Further examination of the proof shows that something less suffices; even Theorem 7.3 is not optimally stated. It is remarked in the conclusion of [12] that the forcing extension of Theorem 7.3 can be arranged so that it satisfies "$\omega_2$ is not a reflecting cardinal in $L$." It is necessarily the case, however, that in this generic extension there are cofinally many $\delta < \omega_2$ which are inaccessible in $L$ and for which $L_\delta$ satisfies "there is a reflecting cardinal." While it is not clear at all that Shelah's Conjecture has any large cardinal strength, the current upper bound seems quite satisfactory until a non-trivial lower bound is established, if this is possible at all.

While Theorem 7.3 is beyond the scope of this note, I will now sketch the proofs of Theorems 7.1 and 7.2. A collection $\mathcal{F}$ of subtrees of a given A-tree $T$ is predense if whenever $S$ is a subtree of $T$, there is a $U$ in $\mathcal{F}$ which has uncountable intersection with $S$. For a collection $\mathcal{F}$ of subtrees of an A-tree $T$, consider the following assertion.

$\psi(\mathcal{F})$: There exist $S_\xi$ ($\xi < \omega_1$) in $\mathcal{F}$ and a club $E \subseteq \omega_1$ such that if $\nu$ is in $E$ and $t$ is in $T_\nu$, then there is a $\nu_t < \nu$ such that if $\xi$ is in $(\nu_t, \nu) \cap E$, then there is an $\eta < \xi$ with $t \restriction \xi$ in $S_\eta$.

If $S_\xi$ ($\xi < \omega_1$) witnesses $\psi(\mathcal{F})$, then it can be verified that $\mathcal{F}_0 = \{S_\xi : \xi < \omega_1\}$ is predense. It is also not difficult to show that, in the presence of PFA($\aleph_1$), A-tree saturation is equivalent to the assertion that $\psi(\mathcal{F})$ holds for every predense family $\mathcal{F}$ of subtrees of arbitrary A-trees. Furthermore, assuming PFA($\aleph_1$), is equivalent to the following statement holding for arbitrary collections $\mathcal{F}$ of subtrees of A-trees.

$\varphi(\mathcal{F})$: There is a closed unbounded set $E \subseteq \omega_1$ and a continuous chain $\langle N_\nu : \nu \in E \rangle$ of countable subsets of $\mathcal{F} \cup \mathcal{F}^\perp$ such that for every $\nu$ in $E$ and $t$ in $T_\nu$ either
(1) there is a $\nu_t < \nu$ such that if $\xi \in (\nu_t, \nu) \cap E$, then there is $A \in \mathcal{F} \cap N_\xi$ such that $t \restriction \xi$ is in $A$, or
(2) there is a $B$ in $\mathcal{F}^\perp \cap N_\nu$ such that $t$ is in $B$.

Here $\mathcal{F}^\perp$ is the collection of all subtrees $S$ which have countable intersection with every element of $\mathcal{F}$.
There is an important caveat though. Unlike the situation with $\mathrm{NS}_{\omega_1}$, the maximality of an antichain of size $\aleph_1$ of subtrees need not be upwards absolute, as the next example shows. In particular, is stronger than A-tree saturation.

Example 7.4. Suppose that $U$ is an $\omega_1$-tree with the following properties:
(1) $U$ has exactly $\aleph_1$ many uncountable branches.
(2) There is an $\aleph_1$-preserving forcing extension in which $U$ gains a new uncountable branch.
(3) $U$ has no Aronszajn subtrees.
Such a tree can be forced by countable approximations; see p. 282 of [28] with $\kappa = \aleph_1$. The point is that the forcing extension $V[G]$ constructed
following 8.12 of [28] can be viewed as an iteration $V[G_0][b^*]$ where $b^*$ is a branch of the generic $\omega_1$-tree and all branches of this tree except $b^*$ are in $V[G_0]$. Following Example 6.2 above, let $T$ be any A-tree and consider the family $\mathcal{F}$ of all subtrees $S_b$ of $T \otimes U$ such that $b$ is an uncountable branch through $U$. Clearly $\mathcal{F}$ has size $\aleph_1$ and, since $U$ has no Aronszajn subtrees, $\mathcal{F}^\perp$ is empty. However, adding $b^*$ causes $\mathcal{F}$ to no longer be maximal since then $S_{b^*}$ is in $\mathcal{F}^\perp$.

Theorem 7.2 is derived by showing that if $K$ is a subset of $T$, then there is a countable sequence $\mathcal{R}_n$ ($n < \omega$) of families of subtrees of finite powers of $T$ such that if $\varphi(\mathcal{R}_n)$ is true for each $n$, then Observation 4 holds. The arguments in [12] show that, for a fixed $\mathcal{F}$, $\psi(\mathcal{F})$ can be forced with a proper forcing (this proves Theorem 7.1). With appropriate book-keeping and ground model assumptions one can use this to establish the consistency of PFA($\aleph_1$) together with the assertion that $\varphi(\mathcal{F})$ holds for every family of subtrees of an A-tree, starting from the existence of a reflecting Mahlo cardinal.

§8. Ladder system colorings, A-trees, and minimal uncountable order types.

In this section, we will consider a companion result to Corollary 5.5. Suppose for a moment that $B$ is a finite basis for the uncountable linear orderings which is, moreover, as small as possible. Because of the minimality of $B$, any element of $B$ must embed into all of its uncountable suborders: it must be a minimal uncountable order type. Hence, assuming PFA, $\omega_1$, $\omega_1^*$, $X$, $C$, and $C^*$ are all minimal uncountable order types if $X$ is a set of reals of size $\aleph_1$ and $C$ is a Countryman line. By classical results, both $\omega_1$ and $\omega_1^*$ are minimal without any axiomatic assumptions: if $X \subseteq \omega_1$ is uncountable, the collapsing map is an isomorphism between $X$ and $\omega_1$. It is reasonable to ask whether there are any other ZFC examples of minimal uncountable order types.
This is implicit in Baumgartner's [4], though it seems to have been present in the folklore before that article. Regarding real types, Sierpinski proved the following result, which implies that there are no minimal real types if one assumes CH.

Theorem 8.1. [23] If $X \subseteq \mathbb{R}$ and $|X| = |\mathbb{R}|$, then there is a $Y \subseteq X$ with $|Y| = |\mathbb{R}|$ such that if $f \subseteq Y^2$ is monotonic, then $\{y \in Y : f(y) \ne y\}$ has cardinality less than $|Y|$.

An analogous result for A-lines, however, is a more subtle matter, as the following result of Baumgartner shows.

Theorem 8.2. [4] If $\diamondsuit^+$ is true, then there is an Aronszajn line which is minimal.

This left open the question of whether there could be a ZFC example of a minimal A-line. In [20], I proved that this is not the case.
Theorem 8.3. [20] It is consistent with CH that there are no minimal Aronszajn lines.

This is a consequence of a study of a variation of ladder system uniformization which may be of independent interest.

Definition 8.4. Suppose $T$ is an A-tree and $f_\delta$ ($\delta \in \lim(\omega_1)$) is a coloring of a ladder system$^4$ $C_\delta$ ($\delta \in \lim(\omega_1)$). A $T$-uniformization of $\langle f_\delta : \delta < \omega_1 \rangle$ is a function $g$ defined on a downward closed subtree $U$ of $T$ such that if $u$ is an element of $U$ of limit height $\delta$, then $f_\delta(\xi) = g(u \restriction \xi)$ for all but finitely many $\xi$ in $C_\delta$.

Theorem 8.3 now follows immediately from the next two theorems.

Theorem 8.5. [20] There is a proper forcing extension which satisfies CH in which every ladder system coloring can be $T$-uniformized for every A-tree $T$.

Theorem 8.6. [20] Suppose that $T$ is an A-tree and that
(1) Every ladder system coloring can be $T$-uniformized.
(2) There is a lexicographical ordering on a subset of $T$ which is a minimal Aronszajn line.
Then $2^{\aleph_0} = 2^{\aleph_1}$.

These theorems also provide an example related to Woodin's question on maximization of $\Pi_2$ sentences for $H(\aleph_1^+)$ in the presence of CH (Question 21 of [35]) and Steel's question on $\Sigma^2_2$-absoluteness.

Example 8.7. Let $\mathcal{L}$ be the language of set theory and let $\mathcal{L}_C$ denote the language of set theory expanded to add a predicate for a $C$-sequence. Let ZFC$_C$ be the extension of ZFC by adding an axiom asserting that there is a unique $C$-sequence on $\omega_1$ which is equal to the predicate. Since ZFC implies the existence of $C$-sequences on $\omega_1$, this yields a conservative extension of ZFC. This is a rather mild logical maneuver, as the family of $C$-sequences on $\omega_1$ is $\Delta_0$-definable with parameter $\omega_1$. If we let $\psi_1$ be the assertion that every ladder system coloring can be uniformized relative to every A-tree, then $\psi_1$ is a $\Pi_2$-sentence in $\mathcal{L}$. If we let $\psi_2$ be the assertion that the coherent sequence $\{e_\beta : \beta < \omega_1\}$, defined in Example 3.4 using the predicate, is a minimal Aronszajn type, then $\psi_2$ is a $\Pi_2$-sentence in $\mathcal{L}_C$.
By Theorem 8.5, the conjunction of $\psi_1$ and CH can be forced with a proper forcing. By Theorem 8.6, the conjunction of $\psi_1$ and $\psi_2$ implies CH is false. Finally, $\psi_2$ is a consequence of MA$_{\aleph_1}$ (see [26]) and, by Theorem 6.4, is preserved by $\sigma$-closed forcing. Therefore, the conjunction of $\psi_2$ and $\diamondsuit^+$ can be forced with a proper forcing. On the other hand, by Theorems 8.5 and 8.6, we can always go into a proper forcing extension in which $\psi_2$ is false.

P. Larson noted that the proof of [1, 2.3] can be adapted to show that $2^{\aleph_0} < 2^{\aleph_1}$ implies the existence of a non-minimal Aronszajn type which is a lexicographical ordering on a coherent sequence. While it is not completely clear whether the coherent sequence can be constructed as in Example 3.4, there seems to be little hope of improving Example 8.7 by removing the predicate for $C$.

4 For our purposes, a ladder system is a $C$-sequence defined only on the limit ordinals.

§9. Questions.

The work exposited above both leaves open and suggests a number of questions. I have collected a few which I hope will yield interesting mathematics. I will begin with a question related to the consistency strength of Shelah's Conjecture and motivated by the work in [12]. If $K$ is a subset of an A-tree $T$, let
$$\mathcal{F}_K = \{S \subseteq K : S \text{ is a subtree of } T\}.$$
Assuming Shelah's Conjecture, $\mathcal{F}_K^\perp$ is $\Sigma_1$-definable: $U$ is in $\mathcal{F}_K^\perp$ iff there exists a subtree $S$ of $T$ such that $S \subseteq U$ and $S \cap K = \emptyset$. Hence Shelah's Conjecture implies that $\varphi(\mathcal{F}_K)$ is a $\Sigma_1$-formula in the language of $H(\aleph_1^+)$. It follows from this and the fact that $\varphi(\mathcal{F}_K)$ can always be forced by a proper forcing that, in the presence of BPFA, Shelah's Conjecture is equivalent to the assertion that $\varphi(\mathcal{F}_K)$ holds for all A-trees $T$ and all $K \subseteq T$.

Question 9.1. Assume BPFA. If $T$ is an A-tree and $K \subseteq T$, must $\mathcal{F}_K \cup \mathcal{F}_K^\perp$ contain a predense family of size $\aleph_1$?

This lends some plausibility to Shelah's Conjecture having large cardinal strength. At present at least, Example 6.2 provides us with essentially the only construction of a non-saturated A-tree. This motivates a variety of questions.

Question 9.2. Is the statement that all A-trees are saturated preserved by forcings which do not add subsets of $\omega_1$?

Notice that, if this question has a negative answer, then the forcing extension witnessing this would contain a non-saturated tree but no Kurepa trees.
The ground model, and consequently the extension, must contain a tree without a subtree base of size ℵ_1. It would be interesting to have an internal construction of non-saturation inside members of a reasonably definable class of A-trees.

Question 9.3. Is there a Σ_1-definable class 𝒮 of A-trees such that if S is in 𝒮, then L[S] correctly computes ω_1 and satisfies that S is non-saturated?
In models such as the Levy collapse, A-trees are saturated for the simple reason that there are essentially very few subtrees of a given tree. It is reasonable to ask whether the saturation of an A-tree which has an abundance of subtrees requires more substantial large cardinals.

Question 9.4. Suppose that T is a saturated coherent A-tree and U(T) is an ultrafilter. Must ω_2 be Mahlo in L?

The methods of [17] definitely entail that |R| = ℵ_2 holds in the resulting models of Shelah's Conjecture.

Question 9.5. Does Shelah's Conjecture imply that |R| ≤ ℵ_2?

This question would have a positive answer if follows from |R| > ℵ_2. At present, this seems plausible. It is also reasonable to ask if there is a consistent higher dimensional analogue of Item (3) of Theorem 5.2.

Question 9.6. Does PFA imply that whenever T is an A-tree and T^[2] = {{s, t} : s, t ∈ T and s < t} is partitioned into infinitely many sets K_i (i < ω), then there is an i < ω and a subtree S of T such that S^[2] is disjoint from K_i?

It seems likely that if this question has a positive answer, then it is possible to replace ω with some finite n and also obtain a positive answer. It is worth noting that there is a canonical counterexample if n = 2: if T is a subtree of 2^{<ω_1}, put {s, t}_< in K_i iff t(ht(s)) = i. Džamonja and J. Larson have modified this partition to show that there is also a counterexample if n = 3.

A natural question to ask after the results of [20] is the following. This is at least implicit in [3].

Question 9.7. Is it consistent that if L is a linear order which is not scattered, then there is a suborder X ⊆ L which is not σ-scattered such that L does not embed in X?

Aronszajn and real types are examples of linear orders which are not scattered; another type of example is constructed in [3]. The construction mentioned at the end of Section 8 leaves the following question open.

Question 9.8. If C is a Countryman line, is there a proper forcing which makes C minimal and which does not add reals?
Notice that it is possible to go into a forcing extension in which C is minimal and in which even ♦+ holds. Finally, I will mention the following question offered by Todorcevic. Question 9.9 (PFA). What is the co-initiality of (C , ≤)?
REFERENCES
[1] U. Abraham and S. Shelah, Isomorphism types of Aronszajn trees, Israel Journal of Mathematics, vol. 50 (1985), no. 1-2, pp. 75–113.
[2] James E. Baumgartner, All ℵ1-dense sets of reals can be isomorphic, Polska Akademia Nauk. Fundamenta Mathematicae, vol. 79 (1973), no. 2, pp. 101–106.
[3] James E. Baumgartner, A new class of order types, Annals of Pure and Applied Logic, vol. 9 (1976), no. 3, pp. 187–222.
[4] James E. Baumgartner, Order types of real numbers and other uncountable orderings, Ordered Sets (Banff, Alta., 1981), NATO Adv. Study Inst. Ser. C: Math. Phys. Sci., vol. 83, Reidel, Dordrecht, 1982, pp. 239–277.
[5] James E. Baumgartner, Bases for Aronszajn trees, Tsukuba Journal of Mathematics, vol. 9 (1985), no. 1, pp. 31–40.
[6] M. Bekkali, Topics in Set Theory, Lecture Notes in Mathematics, vol. 1476, Springer-Verlag, Berlin, 1991, Lebesgue measurability, large cardinals, forcing axioms, ρ-functions, Notes on lectures by Stevo Todorčević.
[7] R. Countryman, Spaces having a σ-monotone base, preprint, 1970.
[8] Keith J. Devlin, Constructibility, Perspectives in Mathematical Logic, Springer-Verlag, Berlin, 1984.
[9] Martin Goldstern and Saharon Shelah, The bounded proper forcing axiom, The Journal of Symbolic Logic, vol. 60 (1995), no. 1, pp. 58–73.
[10] Masazumi Hanazawa, On Aronszajn trees with a non-Souslin base, Tsukuba Journal of Mathematics, vol. 6 (1982), no. 2, pp. 177–185.
[11] Thomas Jech, Set Theory, Perspectives in Mathematical Logic, Springer-Verlag, Berlin, 1997.
[12] Bernhard König, Paul Larson, Justin Tatch Moore, and Boban Veličković, Bounding the consistency strength of a five element linear basis, to appear in Israel Journal of Mathematics.
[13] Kenneth Kunen, An Introduction to Independence Proofs, Studies in Logic and the Foundations of Mathematics, vol. 102, North-Holland, 1983.
[14] Richard Laver, Better-quasi-orderings and a class of trees, Studies in Foundations and Combinatorics, Adv. in Math. Suppl. Stud., vol. 1, Academic Press, New York, 1978, pp. 31–48.
[15] Tadatoshi Miyamoto, A note on weak segments of PFA, Proceedings of the Sixth Asian Logic Conference (Beijing, 1996), World Scientific Publishing, New Jersey, 1998, pp. 175–197.
[16] Justin Tatch Moore, Set mapping reflection, Journal of Mathematical Logic, vol. 5 (2005), no. 1, pp. 87–97.
[17] Justin Tatch Moore, A five element basis for the uncountable linear orders, Annals of Mathematics. Second Series, vol. 163 (2006), no. 2, pp. 669–688.
[18] Justin Tatch Moore, A solution to the L space problem, Journal of the American Mathematical Society, vol. 19 (2006), no. 3, pp. 717–736.
[19] Justin Tatch Moore, Persistent counterexample to basis conjectures, unpublished note, September 2004.
[20] Justin Tatch Moore, ω_1 and ω_1^* may be the only minimal uncountable order types, preprint, October 2005.
[21] Saharon Shelah, Decomposing uncountable squares to countably many chains, Journal of Combinatorial Theory. Series A, vol. 21 (1976), no. 1, pp. 110–114.
[22] Saharon Shelah, Proper and Improper Forcing, 2nd ed., Springer-Verlag, Berlin, 1998.
[23] W. Sierpiński, Sur un problème concernant les types de dimensions, Fundamenta Mathematicae, vol. 19 (1932), pp. 65–71.
[24] Jack Silver, The independence of Kurepa's conjecture and two-cardinal conjectures in model theory, Axiomatic Set Theory (Proc. Sympos. Pure Math., Vol. XIII, Part I, Univ. California, Los Angeles, Calif., 1967), American Mathematical Society, Providence, R.I., 1971, pp. 383–390.
[25] Robert Solovay and S. Tennenbaum, Iterated Cohen extensions and Souslin's problem, Annals of Mathematics. Second Series, vol. 94 (1971), pp. 201–245.
[26] Stevo Todorcevic, Lipschitz maps on trees, report 2000/01 number 13, Institut Mittag-Leffler.
[27] Stevo Todorcevic, A note on the proper forcing axiom, Axiomatic Set Theory (Boulder, Colo., 1983), Contemporary Mathematics, vol. 31, American Mathematical Society, Providence, RI, 1984, pp. 209–218.
[28] Stevo Todorcevic, Trees and linearly ordered sets, Handbook of Set-Theoretic Topology, North-Holland, Amsterdam, 1984, pp. 235–293.
[29] Stevo Todorcevic, Partitioning pairs of countable ordinals, Acta Mathematica, vol. 159 (1987), no. 3-4, pp. 261–294.
[30] Stevo Todorcevic, Partition Problems in Topology, American Mathematical Society, Providence, RI, 1989.
[31] Stevo Todorcevic, A classification of transitive relations on ω_1, Proceedings of the London Mathematical Society. Third Series, vol. 73 (1996), no. 3, pp. 501–533.
[32] Stevo Todorcevic, Localized reflection and fragments of PFA, Logic and Scientific Methods, DIMACS Ser. Discrete Math. Theoret. Comput. Sci., vol. 259, American Mathematical Society, Providence, RI, 1997, pp. 145–155.
[33] Stevo Todorcevic, Countable chain condition in partition calculus, Discrete Mathematics, vol. 188 (1998), no. 1-3, pp. 205–223.
[34] Stevo Todorcevic, Coherent sequences, Handbook of Set Theory, North-Holland, forthcoming.
[35] W. Hugh Woodin, The Axiom of Determinacy, Forcing Axioms, and the Nonstationary Ideal, de Gruyter Series in Logic and its Applications, vol. 1, Walter de Gruyter & Co., Berlin, 1999.

DEPARTMENT OF MATHEMATICS
BOISE STATE UNIVERSITY
BOISE, IDAHO 83725–1555, USA
PROOF ANALYSIS IN NON-CLASSICAL LOGICS
SARA NEGRI
Logic Colloquium '05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

Introduction. The development of sequent systems for non-classical, in particular modal, logics started in the 1950s with the work of Curry [1952], who provided a system with cut elimination and a decision procedure for S4, and Kanger [1957], who gave sequent calculi and decision procedures for T, S4, S5 with the use of "spotted formulas", i.e., formulas indexed by natural numbers. Difficulties in the Gentzen-style formalization of modal logic were, however, encountered at a very elementary level, for instance in the search for an adequate cut-free sequent calculus for the modal logic S5.¹ These difficulties are well witnessed by the ongoing interest in the problem, with two more proposals presented in this Colloquium (Restall [2005], Stouppa [2005]).

1. In 1957 Ohnishi and Matsumoto presented sequent calculi with cut elimination for various modal logics, but no cut elimination for S5. Mints [1968] gives a sequent calculus for S5 with quantifiers that enjoys cut elimination but not the subformula property. The same limitation is encountered in Sato [1980]. Shvarts [1989] gave an indirect proof of cut elimination, showing that A is provable in S5 iff □A is provable in a suitable cut-free calculus. A similar idea, translated in terms of tableau systems, is exploited in Fitting [1999]. Braüner [2000] proved cut elimination for a calculus for S5 that cannot be appropriately called a sequent system because of the non-locality of its rules.

The lack of a general solution has justified an overall pessimistic attitude towards the possibility of applying Gentzen's systems to non-classical logics, as is shown in the following passages:

    Gentzen's methods do not provide anything like a universal approach to logic . . . There are certain standard logics to which these methods do not apply in as direct a fashion . . . For example, consider the logics B and S5. The Kripke models for these are symmetric . . . Such things effectively destroy all possibility of a good, simple cut-free Gentzen system.
    Fitting (1983, p. 4)

    The other tradition that should be mentioned is that of proof theory. Gentzen methods have never really flourished in modal logic.
    Bull and Segerberg (1984, p. 7)

    In modal logic the situation is much more delicate; there are significant technical problems . . . to be faced when translating one style into another. We step over these problems by choosing . . . a Hilbert type proof system.
    Sally Popkorn (1994, p. 97)

and, in the most recent textbook on modal logic, in the section "What this book is not about":

    The omission of proof theory and automated reasoning techniques calls for a little more explanation. . . . as is often the case in modal logic, the proof systems discussed are basically Hilbert-style axiomatic systems. There is no discussion of natural deduction, sequent calculi, labelled deductive systems, resolution, or display calculi. . . . Why is this? Essentially because modal proof theory and automated reasoning are still relatively youthful enterprises; they are exciting and active fields, but yet there is little consensus about methods and few general results.
    Blackburn, de Rijke, and Venema (2001, p. xvi)

Good sequent calculi should satisfy certain design principles. In Wansing (1994, 2002) explicit philosophical, methodological and computational requirements for sequent systems for modal logic have been laid down. In short, they are the following:

1. Separation: The rules for each connective/modality should be given through a purely structural account of its meaning, in the sense that they should be independent of any other connective/modality.
2. Weak symmetry: Each rule should either be a left or a right rule, introducing the connective either into the left or into the right hand side of the sequent arrow in the conclusion. The requirement can be strengthened to symmetry if there are both left and right rules for each connective/modality.
3. Weak explicitness (resp. explicitness): The connective/modality appears only (resp. only and once) in the conclusion of the rule.
4. The two modalities □ and ◇ should both be primitive but interderivable.
5. Uniqueness: Each connective should be uniquely characterized by its rules in a given system.
6. Different systems are obtained by changing only the structural rules, while leaving the logical rules unaltered.
7. Cut elimination.
8. Subformula property.

After reviewing the earlier attempts at defining sequent systems for certain non-classical logics, Wansing is led to the conclusion that

    No uniform way of presenting . . . the most important normal modal and temporal propositional logics as ordinary Gentzen calculi is known. Further, the standard approach fails to be modular. . . . Each of the ordinary sequent systems presented . . . fails to satisfy some of the more philosophical requirements mentioned . . .
    there are thus not only technical but also philosophical reasons for investigating generalizations of the notion of a Gentzen sequent.

The generalizations of traditional Gentzen sequent calculi presented include systems such as higher-level sequents, higher-dimensional, higher-arity, multiple sequent systems, hypersequents, display logic.

In addition to these generalizations, in recent years an approach based on the internalization of the Kripke semantics into the calculus has gained prominence. This idea, with early precursors as far back as Kanger [1957], has been developed in several forms. Inference systems have been presented that incorporate possible worlds in the form of sequents (Mints 1997, Viganò 2000, Kushida and Okada 2003, Castellini and Smaill 2002, Castellini 2005), in the form of tableaux (Fitting 1983, Catach 1991, Nerode 1991, Goré 1999, Massacci 2000), and in the form of natural deduction (Fitch 1966, Simpson 1994, Basin, Matthews, Viganò 1998). The use of a syntax that includes the relational semantics has been central also in the work on first-order encodings of modal logic (Ohlbach 1993, Schmidt and Hustadt 2003) and in what is called hybrid logic (Blackburn 2000). Internalization of the algebraic, rather than relational, semantics into a natural deduction style presentation is instead mainly used in Labelled Deductive Systems (Gabbay 1996).

Despite their impact, labelled proof systems have been criticized as impure, in contrast to the more traditional proof systems, and difficult to use in practice:

    a deductive treatment congenial to modal logic is yet to be found, for Hilbert systems are not suited for the purpose of actual deductions, and in Hintikka/Kripke systems the alternativeness relation introduces an alien element which, moreover, can become quite unmanageable in special cases.
    Bull and Segerberg (1984, 2001)

Furthermore, the more goal-oriented labelled tableau systems do not translate into elegant Gentzen sequent calculi.
    Tableau calculi, translated into sequent system do not possess all the elegant properties usually demanded of (Gentzen) systems. . . . Elegant modal sequent systems respecting the ideals of Gentzen have proved elusive.
    Goré [1999]

Our aim is to provide a general approach to the proof theory of non-classical logics through labelled sequent calculi that obey all the principles of good design usually required of traditional sequent systems. In particular, the calculi we shall present have all the structural rules (weakening, contraction, and cut) admissible; they support, whenever possible, proof search, and have a simple and uniform syntax that allows easy proofs of metatheoretic results. These calculi all stem from a systematic development, started with Negri and von Plato [1998], of a method for converting axioms into rules to be
added to cut- and contraction-free sequent systems while maintaining all the structural properties in the resulting extension. In previous work the method has been applied to extensions of logic, that is, to certain mathematical theories such as theories of order (Negri, von Plato, and Coquand 2001), lattice theory (Negri and von Plato 2004, Negri 2005a), linear Heyting algebras (Dyckhoff and Negri 2006), real closed fields (Negri 2001), projective and affine geometry (von Plato 2005a), and to the so-called geometric and cogeometric theories (Negri 2003, Negri and von Plato 2005). Recently, the method has been applied inside logic, for the generation of sequent systems for all those logics that can be characterized in terms of a Kripke-style relational semantics. These include the most standard normal modal logics and provability logic, treated in Negri [2005], and also intermediate logics, relevant logic, and, in general, substructural logics.

In Section 1 we shall review the background on sequent calculus and its extensions with rules. In Section 2, starting from a G3-style labelled sequent calculus for basic modal logic, we shall present the application of the method to modal logics characterized by universal and geometric frame properties. There are certain modal logics, such as the provability logic of Gödel-Löb, that are characterized by frame properties that are not first-order. In Section 3 it is shown how to deal with such extensions through a semantically justified definition of the rules for the modality. In Section 4 we present a sequent calculus with internalized Kripke semantics for intuitionistic logic. It turns out that all the properties characterizing the Kripke frames for the seven interpolable intermediate logics are geometric axioms, and thus fall under the scope of our method. Relevant logics and, in general, substructural logics can also be characterized through a suitable relational semantics, with properties following the form of geometric axioms. A uniform proof-theoretic treatment for substructural logic is presented in Section 5. Finally, in the conclusion we indicate how the calculi presented relate to the above requirements for sequent calculi and how they can serve as calculi establishing decidability through terminating proof search.

§1. Background. In Negri and von Plato (1998, 2001) and in Negri [2003] a general method was presented for extending sequent calculi with rules for axiomatic theories while preserving all the structural properties of the logical calculus. We recall here the general ideas of the method and the main results. For extensions of classical predicate logic the starting point is the contraction- and cut-free sequent calculus G3c. We recall that all the rules of G3c are invertible and all the structural rules are admissible, that is, whenever their premisses are derivable, then so is their conclusion. Weakening and contraction are in addition height-preserving admissible, that is, whenever their premisses are derivable with derivation height bounded by n, then also
is their conclusion, with the same bound on the derivation height (the height of a derivation is its height as a tree, that is, the length of its longest branch). Moreover, the calculus enjoys height-preserving admissibility of substitution. Also, invertibility of the rules of G3c is height-preserving (see Chapters 3 and 4 of Negri and von Plato [2001] for detailed proofs).

These remarkable structural properties of G3c are maintained in extensions of the logical calculus with suitably formulated rules that represent axioms for specific theories. Universal axioms are first transformed, through the rules of G3c, into conjunctive normal form, that is, conjunctions of formulas of the form

  P1 & . . . & Pm ⊃ Q1 ∨ · · · ∨ Qn,

where the consequent is ⊥ if n = 0 and all Pi, Qj are atomic. (Any such formula, universally quantified, is called a regular formula.) We abbreviate the multiset P1, . . . , Pm as P̄. Each conjunct is then converted into a schematic rule, called the regular rule scheme, of the form

  Q1, P̄, Γ ⇒ ∆    . . .    Qn, P̄, Γ ⇒ ∆
  ----------------------------------------- Reg
  P̄, Γ ⇒ ∆

By this method, all universal theories can be formulated as contraction- and cut-free systems of sequent calculi.

In Negri [2003], the method is extended to cover also geometric theories, that is, theories axiomatized by geometric implications. We recall that a geometric formula is a formula not containing ⊃ or ∀, and a geometric implication is a sentence of the form ∀z̄(A ⊃ B) where A and B are geometric formulas. Geometric implications can be reduced to a normal form consisting of conjunctions of formulas, called geometric axioms, of the form

  ∀z̄(P1 & . . . & Pm ⊃ (∃x̄1 M1 ∨ · · · ∨ ∃x̄n Mn)),

where each Mj is a conjunction of atomic formulas, Qj1, . . . , Qjkj. Without loss of generality, no xi is free in any Pj. Note that regular formulas are geometric implications, with neither conjunctions nor existential quantifications to the right of the implication. The left rule scheme for geometric axioms takes the form

  Q̄1(ȳ1/x̄1), P̄, Γ ⇒ ∆    . . .    Q̄n(ȳn/x̄n), P̄, Γ ⇒ ∆
  --------------------------------------------------------- GRS
  P̄, Γ ⇒ ∆

where Q̄j and P̄ indicate the multisets of atomic formulas Qj1, . . . , Qjkj and P1, . . . , Pm, respectively, and the eigenvariables ȳi of the premisses are not free in the conclusion.

In order to maintain admissibility of contraction in the extensions with regular or geometric rules, the formulas P1, . . . , Pm in the antecedent of the
conclusion of the scheme have (as indicated) to be repeated in the antecedent of each of the premisses. In addition, whenever an instantiation of free parameters in atoms produces a duplication (two identical atoms) in the conclusion of a rule instance, say P1, . . . , P, P, . . . , Pm, Γ ⇒ ∆, there is a corresponding duplication in each premiss and in the conclusion of the rule. The closure condition imposes the requirement that the rule with the duplication P, P contracted into a single P is added to the system of rules. For each axiom system, there is only a bounded number of possible cases of contracted rules to be added, very often none at all, so the condition is unproblematic.

The main result for such extensions is the following (Theorems 4 and 5 from Negri [2003]):

Theorem 1.1. The structural rules of weakening, contraction, and cut are admissible in all extensions of G3c with the geometric rule-scheme and satisfying the closure condition. Weakening and contraction are moreover height-preserving admissible.

§2. Basic modal logic and its extensions. In this section we shall present a sequent system for the basic modal logic K with rules for the modalities □ and ◇ obtained through a meaning explanation, in terms of the possible worlds semantics, and an inversion principle. The modal logic K is characterized by arbitrary frames. Restricting the class of frames characterizing a given modal logic amounts to adding certain frame properties to the calculus. These properties are added in the form of mathematical rules, following the development outlined in Section 1. All the extensions are thus obtained in a modular way. As a consequence, the structural properties of the resulting calculi can be established in one theorem for all systems.

2.1. Basic modal logic. Basic modal logic is formulated as a labelled sequent calculus through an internalization of the possible worlds semantics into the syntax.
The way to achieve this is the following: First we enrich the language so that sequents are expressions of the form Γ ⇒ ∆ where the multisets Γ and ∆ consist of relational atoms xRy and labelled formulas x : A (corresponding to the forcing x ⊩ A of Kripke models), with x, y ranging in a set W of labels/possible worlds and A any formula in the language of propositional logic extended with the modal operators of necessity and possibility, □ and ◇. The rules for each connective/modality are obtained from their meaning explanation in terms of the relational semantics: From the inductive definition of forcing for a modal formula

  x ⊩ □A iff for all y, xRy implies y ⊩ A

we obtain

  If y : A can be derived for an arbitrary y accessible from x, then x : □A can be derived

that is formalized into the rule

  xRy, Γ ⇒ ∆, y : A
  ------------------- R□
  Γ ⇒ ∆, x : □A

where arbitrariness of y becomes the variable condition y not in Γ, ∆. Through the inversion principle² we obtain the rule

  xRy, Γ ⇒ ∆    y : A, Γ ⇒ ∆
  ---------------------------- L□
  x : □A, Γ ⇒ ∆

that can be equivalently given as a one-premiss rule in the following form

  y : A, x : □A, xRy, Γ ⇒ ∆
  --------------------------- L□
  x : □A, xRy, Γ ⇒ ∆

The rules for ◇ are obtained similarly from the semantic explanation

  x : ◇A iff for some y, xRy and y : A

The semantic explanation of the classical propositional connectives is flat, so the result of the above procedure is just a labelling with the same variable of the active formulas in the premisses and conclusion of each rule of the calculus G3c. The sequent calculus for basic modal logic G3K thus obtained is given in Table 1 below.

2.2. Extensions. We present, by way of an example, a table of modal systems with their characterizing Hilbert-style axioms and corresponding frame properties.
Name  Axiom                      Frame property
T     □A ⊃ A                     ∀x xRx (reflexivity)
4     □A ⊃ □□A                   ∀xyz(xRy & yRz ⊃ xRz) (transitivity)
E     ◇A ⊃ □◇A                   ∀xyz(xRy & xRz ⊃ yRz) (euclideanness)
B     A ⊃ □◇A                    ∀xy(xRy ⊃ yRx) (symmetry)
3     □(□A ⊃ B) ∨ □(□B ⊃ A)      ∀xyz(xRy & xRz ⊃ yRz ∨ zRy) (connectedness)

D     □A ⊃ ◇A                    ∀x∃y xRy (seriality)
2     ◇□A ⊃ □◇A                  ∀xyz(xRy & xRz ⊃ ∃w(yRw & zRw)) (directedness)

W     □(□A ⊃ A) ⊃ □A             no infinite R-chains + transitivity
2. There are several formulations of the inversion principle. Here we follow the inversion principle in the form: Whatever follows from a proposition must follow from the direct grounds for asserting that proposition. This form uniquely determines the elimination rules in natural deduction and the left rules in sequent calculus, as shown in detail in Negri and von Plato 2001.
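The correspondence between axioms and frame properties tabulated above can be spot-checked on small finite frames. The following Python sketch is only an illustration and not part of the calculus: the tuple encoding of formulas and the names `forces`, `subsets` and `valid_on_frame` are assumptions introduced here. It implements the forcing clauses for □ and ◇ directly and tests an axiom at every world under every valuation of its atoms.

```python
from itertools import product

# Hypothetical encoding of formulas as nested tuples (not from the paper):
#   ("atom", "p"), ("imp", A, B), ("box", A), ("dia", A)

def forces(x, formula, R, val):
    """Return True iff world x forces `formula` in the model (R, val)."""
    tag = formula[0]
    if tag == "atom":
        return x in val[formula[1]]
    if tag == "imp":
        return (not forces(x, formula[1], R, val)) or forces(x, formula[2], R, val)
    if tag == "box":  # for all y, xRy implies y forces A
        return all(forces(y, formula[1], R, val) for (w, y) in R if w == x)
    if tag == "dia":  # for some y, xRy and y forces A
        return any(forces(y, formula[1], R, val) for (w, y) in R if w == x)
    raise ValueError(f"unknown connective: {tag}")

def subsets(worlds):
    """Enumerate all subsets of a finite collection of worlds."""
    worlds = list(worlds)
    for bits in product([False, True], repeat=len(worlds)):
        yield {w for w, b in zip(worlds, bits) if b}

def valid_on_frame(formula, worlds, R, atoms):
    """Check `formula` at every world under every valuation of `atoms`."""
    for choice in product(list(subsets(worlds)), repeat=len(atoms)):
        val = dict(zip(atoms, choice))
        if not all(forces(x, formula, R, val) for x in worlds):
            return False
    return True

p = ("atom", "p")
axiom_T = ("imp", ("box", p), p)                    # box A implies A
axiom_4 = ("imp", ("box", p), ("box", ("box", p)))  # box A implies box box A

print(valid_on_frame(axiom_T, [0, 1], {(0, 0), (1, 1), (0, 1)}, ["p"]))
print(valid_on_frame(axiom_T, [0, 1], {(0, 1)}, ["p"]))
```

A check of this kind only refutes or confirms validity on one given finite frame; it does not establish the correspondence itself, which quantifies over all frames.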
Table 1. The sequent calculus G3K.

Initial sequents:

  x : P, Γ ⇒ ∆, x : P                     xRy, Γ ⇒ ∆, xRy

Propositional rules:

  x : A, x : B, Γ ⇒ ∆                     Γ ⇒ ∆, x : A    Γ ⇒ ∆, x : B
  --------------------- L&                ------------------------------ R&
  x : A&B, Γ ⇒ ∆                          Γ ⇒ ∆, x : A&B

  x : A, Γ ⇒ ∆    x : B, Γ ⇒ ∆            Γ ⇒ ∆, x : A, x : B
  ------------------------------ L∨       --------------------- R∨
  x : A ∨ B, Γ ⇒ ∆                        Γ ⇒ ∆, x : A ∨ B

  Γ ⇒ ∆, x : A    x : B, Γ ⇒ ∆            x : A, Γ ⇒ ∆, x : B
  ------------------------------ L⊃       --------------------- R⊃
  x : A ⊃ B, Γ ⇒ ∆                        Γ ⇒ ∆, x : A ⊃ B

  ---------------- L⊥
  x : ⊥, Γ ⇒ ∆

Modal rules:

  y : A, x : □A, xRy, Γ ⇒ ∆               xRy, Γ ⇒ ∆, y : A
  --------------------------- L□          ------------------- R□
  x : □A, xRy, Γ ⇒ ∆                      Γ ⇒ ∆, x : □A

  xRy, y : A, Γ ⇒ ∆                       xRy, Γ ⇒ ∆, x : ◇A, y : A
  ------------------- L◇                  ---------------------------- R◇
  x : ◇A, Γ ⇒ ∆                           xRy, Γ ⇒ ∆, x : ◇A
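Read from conclusion to premisses, the rules of Table 1 suggest a simple root-first proof search. The Python sketch below is one possible reading under stated assumptions, not the procedure of the paper: formulas are nested tuples, labels are integers, Γ and ∆ are kept as sets rather than multisets, and the `used` parameter (a device of this sketch) records which instances of L□ and R◇ have already been applied so that they are not repeated. The function names `prove` and `derivable` are hypothetical.

```python
# Hypothetical encoding: ("atom", "p"), ("bot",), ("and", A, B), ("or", A, B),
# ("imp", A, B), ("box", A), ("dia", A). A labelled formula is (label, formula).

def prove(rel, left, right, used=frozenset()):
    """Root-first search for a derivation of the sequent rel, left => right."""
    # Initial sequents and the zero-premiss rule for bottom.
    for (x, f) in left:
        if f[0] == "bot" or (f[0] == "atom" and (x, f) in right):
            return True
    labels = {x for (x, _) in left | right} | {z for pair in rel for z in pair}
    fresh = max(labels, default=0) + 1  # a label not occurring in the sequent
    # Left rules for the propositional connectives and for dia.
    for (x, f) in left:
        rest = left - {(x, f)}
        if f[0] == "and":
            return prove(rel, rest | {(x, f[1]), (x, f[2])}, right, used)
        if f[0] == "or":
            return (prove(rel, rest | {(x, f[1])}, right, used)
                    and prove(rel, rest | {(x, f[2])}, right, used))
        if f[0] == "imp":
            return (prove(rel, rest, right | {(x, f[1])}, used)
                    and prove(rel, rest | {(x, f[2])}, right, used))
        if f[0] == "dia":
            return prove(rel | {(x, fresh)}, rest | {(fresh, f[1])}, right, used)
    # Right rules for the propositional connectives and for box.
    for (x, f) in right:
        rest = right - {(x, f)}
        if f[0] == "and":
            return (prove(rel, left, rest | {(x, f[1])}, used)
                    and prove(rel, left, rest | {(x, f[2])}, used))
        if f[0] == "or":
            return prove(rel, left, rest | {(x, f[1]), (x, f[2])}, used)
        if f[0] == "imp":
            return prove(rel, left | {(x, f[1])}, rest | {(x, f[2])}, used)
        if f[0] == "box":
            return prove(rel | {(x, fresh)}, left, rest | {(fresh, f[1])}, used)
    # Left box: the principal formula is kept; each instance is used once.
    for (x, f) in left:
        if f[0] == "box":
            for (w, y) in rel:
                key = ("L", x, f, y)
                if w == x and key not in used:
                    return prove(rel, left | {(y, f[1])}, right, used | {key})
    # Right dia: the principal formula is kept; each instance is used once.
    for (x, f) in right:
        if f[0] == "dia":
            for (w, y) in rel:
                key = ("R", x, f, y)
                if w == x and key not in used:
                    return prove(rel, left, right | {(y, f[1])}, used | {key})
    return False  # no rule applies and the sequent is not initial

def derivable(formula):
    """Search for a derivation of => 0 : formula."""
    return prove(frozenset(), frozenset(), frozenset({(0, formula)}))

p, q = ("atom", "p"), ("atom", "q")
axiom_K = ("imp", ("box", ("imp", p, q)), ("imp", ("box", p), ("box", q)))
print(derivable(axiom_K))
print(derivable(("imp", ("box", p), p)))
```

The sketch commits to the first applicable rule without backtracking, which relies on the rules being invertible; the use of sets in place of multisets relies on contraction being admissible. It covers only the rules of Table 1, so the rules for frame properties would need additional loop control.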
The frame properties in the first group (T, 4, E, B, 3) are universal axioms, those in the second group are geometric implications, as defined in Section 1, whereas the last one is not expressible as a first-order property.

The systems T, K4, KB, S4, B, S5, . . . are obtained by adding one or more axioms to the system K. Sequent calculi are obtained by adding to the system G3K the rule(s) corresponding to the properties of the accessibility relation characterizing their frames. For instance, a sequent calculus for S4 is obtained by adding to G3K the rules corresponding to the axioms of reflexivity and transitivity of the accessibility relation

  xRx, Γ ⇒ ∆                 xRz, xRy, yRz, Γ ⇒ ∆
  ------------ Ref           ---------------------- Trans
  Γ ⇒ ∆                      xRy, yRz, Γ ⇒ ∆

and a system for S5 by adding also the rule corresponding to symmetry

  yRx, xRy, Γ ⇒ ∆
  ----------------- Sym
  xRy, Γ ⇒ ∆

Extensions are obtained in a modular way for all possible combinations of properties:

  G3T = G3K + Ref
  G3K4 = G3K + Trans
  G3KB = G3K + Sym
  G3S4 = G3K + Ref + Trans
  G3TB = G3K + Ref + Sym
  G3S5 = G3K + Ref + Trans + Sym

A system for deontic logic is obtained by adding the geometric rule

  xRy, Γ ⇒ ∆
  ------------ Ser
  Γ ⇒ ∆

with the variable condition y ∉ Γ, ∆. Directedness is another property that follows the pattern of a geometric implication, and it is converted into the rule

  yRu, zRu, xRy, xRz, Γ ⇒ ∆
  --------------------------- Dir
  xRy, xRz, Γ ⇒ ∆

with the variable condition u ∉ xRy, xRz, Γ, ∆. The treatment of a modal logic with a frame property not expressible as a first-order sentence, namely provability logic, is postponed to the following section.

2.3. Structural properties. Let G3K* be any extension of G3K with rules for the accessibility relation following the regular rule scheme or the more general geometric rule scheme. The following properties of any system belonging to the class G3K* can all be established uniformly. We refer to Negri [2005] for the details.

Lemma 2.1. Sequents of the form x : A, Γ ⇒ ∆, x : A with A an arbitrary modal formula (not just atomic) are derivable in G3K*.

In order to prove the correspondence between our systems and their Hilbert-style presentations it is necessary to show that the characteristic axioms are derivable and the systems closed under the rules of necessitation and modus ponens.

Lemma 2.2. For arbitrary A and B, the sequent ⇒ x : □(A ⊃ B) ⊃ (□A ⊃ □B) is derivable in G3K*.

The rule of necessitation

  ⇒ x : A
  ----------
  ⇒ x : □A

is a context-dependent rule, as it requires both the antecedent and succedent contexts to be empty. As an explicit rule it would destroy the flexibility of the systems in the permutations needed to prove cut elimination; however, we do
not need to add any such rule because we can show that it is admissible. In order to prove this we exploit the first-order features of the system in proving a lemma about substitution. Substitution of labels is defined in the obvious way as follows for relational atoms and labelled formulas:

  xRy(z/w) ≡ xRy   if w ≠ x and w ≠ y
  xRy(z/x) ≡ zRy   if x ≠ y
  xRy(z/y) ≡ xRz   if x ≠ y
  xRx(z/x) ≡ zRz
  x : A(z/y) ≡ x : A   if y ≠ x
  x : A(z/x) ≡ z : A

and is extended to multisets componentwise. We have

Lemma 2.3. If Γ ⇒ ∆ is derivable in G3K*, then Γ(y/x) ⇒ ∆(y/x) is also derivable, with the same derivation height.

An immediate consequence is

Corollary 2.4. The necessitation rule is admissible in G3K*.

We also obtain a desirable property of a sequent calculus, namely:

Proposition 2.5. All the rules of G3K* are height-preserving invertible.

Finally, we have:

Theorem 2.6. All the structural rules (weakening, contraction, and cut) are admissible in the system G3K*.

2.4. Equality and undefinability. The syntax for system G3K* can be extended with equality. The treatment of equality as a left rule system, following Negri and von Plato (2001, Section 6.5), is easily implemented in the context of labelled calculi. We shall not give the details here, which can be found in Negri (2005, Section 7), but just observe by way of an example that the modal axiom

  ◇(A & □B) ⊃ □(A ∨ ◇A ∨ B)

corresponding to the frame property

  ∀xyz(xRy & xRz ⊃ z = y ∨ zRy ∨ yRz)

converts to the rule

  z = y, xRy, xRz, Γ ⇒ ∆    zRy, xRy, xRz, Γ ⇒ ∆    yRz, xRy, xRz, Γ ⇒ ∆
  -------------------------------------------------------------------------
  xRy, xRz, Γ ⇒ ∆

The corresponding sequent system is obtained by adding the above rule to the system G3K augmented with the rules for equality. All the structural properties of the resulting system hold as a consequence of the general results.

The use of proof systems that unify the syntax and semantics of modal logic permits very simple proofs of negative results in correspondence theory.
These results state that certain frame properties (such as irreflexivity
PROOF ANALYSIS IN NON-CLASSICAL LOGICS
and intransitivity) do not have any modal correspondent. The usual proofs are based on model extension methods: in order to prove that a frame property is not modally definable it is shown that the corresponding class of frames is not closed under the constructions of disjoint union, generated subframes, bounded morphic images, and ultrafilter extensions (cf. Blackburn, de Rijke, and Venema 2001, Section 3.3; see also Van Benthem 1984). In our systems, the lack of a modal correspondent is an immediate consequence of a conservativity theorem. Consider, for instance, the frame property of irreflexivity ∀x ∼xRx, which corresponds to the rule

\[ \frac{}{xRx, \Gamma \Rightarrow \Delta}\ \mathit{Irref} \]
By a straightforward proof analysis (see Theorem 7.1 of Negri [2005] for the complete, five-line proof) we observe

Theorem 2.7. The system G3K+Irref is conservative over G3K.

It follows that the property of irreflexivity does not have any modal correspondent, because, if it had, there would be some formula that is provable in the extension G3K+Irref but not in G3K. The result is easily generalized to any property, generalizing intransitivity, of the form ∼(P1 & . . . & Pn) where Pi is xi R yi and for some i, j, yi = yj. A similar result holds for ∃x xRx and ∀x∃y(xRy & yRy).

§3. Provability logic. After Solovay’s landmark paper (1976), which presented GL axiomatically as the logic of arithmetic provability and characterized its Kripke models as the transitive and Noetherian frames, much interest has been directed to the search for an adequate, cut-free sequent system for GL. Semantic proofs of closure of a certain system with respect to cut, based on completeness arguments, were presented in Sambin and Valentini [1982] and in Avron [1984]. Syntactic proofs, aimed at providing explicit proof transformations that would describe a procedure of cut elimination, were proposed by Leivant [1981], Valentini [1983], and Borga [1983]. Valentini [1983] gave a counterexample to the proof presented by Leivant. More recently, Moen [2001] observed that the proof by Valentini assumes as a starting point a reduction of a cut on □A to a detour cut, which is not fully justified in a calculus with explicit contraction. However, in all the proofs given in the 1980s (and also in more recent proposals, see Sasaki 2001) calculi with contexts-as-sets have been used. There are good reasons for objecting to such an approach to sequent calculus that would deserve a more thorough discussion, but we shall not go into this issue here.
Another problematic aspect of the proposed calculi for provability logic is a so-called lack of harmony:³ in fact, there is only one rule (both left and right) for □,

\[ \frac{\Box\Gamma, \Gamma, \Box A \Rightarrow A}{\Box\Gamma, \Gamma \Rightarrow \Delta, \Box A} \]

which does not respect any of the design requirements of separation, symmetry, and uniqueness recalled in the Introduction. Here we shall show how a calculus with admissible contraction for sequents labelled by possible worlds, with harmonic, semantically originated left and right rules for □, permits a transparent proof of cut elimination.

In the Kripke frames for provability logic the accessibility relation R is irreflexive, transitive, and Noetherian (every R-chain eventually becomes stationary). Equivalently, we can say that R is transitive and all R-chains are finite. Clearly, this characterizing frame condition is not first order, so the method of universal/geometric extensions exploited in Section 2 cannot be applied directly. However, the condition can be internalized in the explanation of the meaning of the modality as follows:

Lemma 3.1. In irreflexive, transitive, and Noetherian Kripke frames,

x ⊩ □A iff for all y, xRy and y ⊩ □A implies y ⊩ A.

Proof. See Negri [2005].

The right-to-left direction of the implication stated above gives the right rule for □,

\[ \frac{xRy, y : \Box A, \Gamma \Rightarrow \Delta, y : A}{\Gamma \Rightarrow \Delta, x : \Box A}\ R\Box\text{-}L \]

with the variable condition that y is not in the conclusion. The left-to-right direction gives the left rule

\[ \frac{x : \Box A, xRy, \Gamma \Rightarrow \Delta, y : \Box A \qquad y : A, x : \Box A, xRy, \Gamma \Rightarrow \Delta}{x : \Box A, xRy, \Gamma \Rightarrow \Delta}\ L\Box\text{-}L \]

The system G3GL thus determined is given in Table 2 below. All the properties that have been established for G3K* hold for G3GL, namely:

Theorem 3.2.
1. The axioms of the Hilbert-type system for GL are derivable in G3GL.
2. The rules of substitution, weakening, and necessitation are height-preserving admissible in G3GL.

3 Cf.
Read 2005 for a discussion of this notion in the context of modal logic and von Plato 2005 for an application of (harmonic) general elimination rules to the solution of the problem of normal form for S4.
Table 2. The sequent calculus G3GL.
Initial sequents: x : P, Γ ⇒ ∆, x : P        x : □A, Γ ⇒ ∆, x : □A
Logical rules: As in G3K for &, ∨, ⊃, ⊥; L□-L, R□-L
Mathematical rules: Irref, Trans

3. The rules of contraction are admissible in G3GL. Elimination of contraction does not introduce new worlds in the derivation.

The last item in the theorem introduces a new notion that was not needed before, namely the notion of range of a label in a derivation. Roughly, the range of a label x in a derivation D is the set of labels belonging to the transitive closure of all the relations xRy occurring in the left-hand side of sequents of D. The need for this new notion becomes clear from the proof of cut elimination for G3GL. We shall not give all the details here (for which we refer to Negri 2005), but just focus on the main ideas.

A typical procedure of cut elimination for G3-like systems considers topmost cuts and performs reductions that decrease either the height of one of the two premisses of cut (for permutation cuts, that is, cuts in which the cut formula is not principal in at least one of the premisses) or the size of the cut formula (for detour, or principal, cuts, that is, cuts in which the cut formula is principal in both premisses). The reductions are repeated until cuts reach initial sequents and disappear. This procedure does not work for G3GL in the case of detour cuts on x : □A. Consider a principal cut on x : □A:

\[
\dfrac{
  \dfrac{xRy, y : \Box A, \Gamma \Rightarrow \Delta, y : A}{\Gamma \Rightarrow \Delta, x : \Box A}\ R\Box\text{-}L
  \qquad
  \dfrac{xRz, x : \Box A, \Gamma' \Rightarrow \Delta', z : \Box A \qquad z : A, xRz, x : \Box A, \Gamma' \Rightarrow \Delta'}{xRz, x : \Box A, \Gamma' \Rightarrow \Delta'}\ L\Box\text{-}L
}{xRz, \Gamma, \Gamma' \Rightarrow \Delta, \Delta'}\ \mathit{Cut}
\]

This is transformed into four cuts as follows:

\[
\dfrac{
  \dfrac{
    \begin{array}{c} \mathcal{D}_1 \\ \vdots \\ xRz, xRz, \Gamma', \Gamma, \Gamma \Rightarrow \Delta, \Delta, \Delta', z : A \end{array}
    \qquad
    \begin{array}{c} \mathcal{D}_2 \\ \vdots \\ xRz, z : A, \Gamma', \Gamma \Rightarrow \Delta, \Delta' \end{array}
  }{xRz, xRz, xRz, \Gamma', \Gamma', \Gamma, \Gamma, \Gamma \Rightarrow \Delta, \Delta, \Delta, \Delta', \Delta'}\ \mathit{Cut}
}{xRz, \Gamma, \Gamma' \Rightarrow \Delta, \Delta'}\ \mathit{Ctr}^*
\]
where D1 and D2 are the following two derivations:

\[
\mathcal{D}_1:\quad
\dfrac{
  \dfrac{\Gamma \Rightarrow \Delta, x : \Box A \qquad xRz, x : \Box A, \Gamma' \Rightarrow \Delta', z : \Box A}{xRz, \Gamma', \Gamma \Rightarrow \Delta, \Delta', z : \Box A}\ \mathit{Cut}
  \qquad
  xRz, z : \Box A, \Gamma \Rightarrow \Delta, z : A
}{xRz, xRz, \Gamma, \Gamma', \Gamma \Rightarrow \Delta, \Delta', \Delta, z : A}\ \mathit{Cut}
\]

\[
\mathcal{D}_2:\quad
\dfrac{\Gamma \Rightarrow \Delta, x : \Box A \qquad xRz, x : \Box A, z : A, \Gamma' \Rightarrow \Delta'}{xRz, z : A, \Gamma', \Gamma \Rightarrow \Delta, \Delta'}\ \mathit{Cut}
\]

Here the right premiss of the second cut in D1 is obtained from the premiss of R□-L by the height-preserving substitution of z for the eigenvariable y. Observe that the cuts on x : □A and on z : A are all reduced according to the standard procedure, whereas the cut on z : □A is not, because neither the complexity of the cut formula nor the height of the cut is reduced. However, if the range of z in the new derivation is strictly smaller than the range of x in the original derivation, then we have for all the cuts in the transformed derivation a reduced inductive parameter given by the triple consisting of the complexity of the cut formula, the range of its label, and the height of the cut, ordered lexicographically.

In order to prove the reduction in range, two extra assumptions are needed, namely, that there be no cuts with xRx or xRx1, . . . , xnRx in the antecedents of their conclusions and that eigenvariables be pure, i.e., appear only in the subtree above the step introducing them. The first condition is met by observing that if there are cuts of that form, they are eliminated using Irref and Trans, the second by a fresh renaming of eigenvariables. It then follows that no x can be in the range of itself, that if y is in the range of x then the range of y is properly included in the range of x, and that if y, z are in the range of x and y is an eigenvariable, then the union of the range of y and the range of z is properly included in the range of x.

We conclude by observing that the Löb axiom can be derived in this system, and Gödel’s second incompleteness theorem follows as an immediate consequence of cut elimination. See Negri [2005].

§4. Intermediate logics.
It is well known that intuitionistic logic can be embedded into the classical modal logic S4, and actually all the intermediate logics between intuitionistic and classical logic can be embedded into the intermediate modal logics between S4 and S5. The analogy between these two families of logics is best seen at the level of their Kripke semantics. The explanation of the meaning of implication in intuitionistic logic reflects the explanation of the modality in K. As for normal modal logics, we can internalize the inductive definition of validity in a Kripke frame for obtaining uniform G3-style sequent calculi for intermediate logics. The accessibility relation for intuitionistic logic is a partial order. By requiring additional properties, logics above intuitionistic logic are obtained. We observe that all the properties of the accessibility relation characterizing the interpolable propositional logics
fall under the geometric rule scheme. By applying the results on geometric extensions we can therefore obtain complete calculi with good structural properties. In addition, the uniformity in the syntax allows immediate proofs of the faithfulness of the embeddings. The details of the proofs can be found in Dyckhoff and Negri [2005].

From the inductive definition of validity of implication in a Kripke frame,

x ⊩ A ⊃ B iff for all y, x ≤ y and y ⊩ A implies y ⊩ B

we obtain the left and right rules for intuitionistic implication. Arbitrariness in y in the right rule is again expressed by a variable condition. The rules for the other connectives are exactly as the rules in G3K. The initial sequents of G3K are instead modified in order to guarantee the property of monotonicity of forcing. In compliance with the features of the G3-style calculi, it is enough to have monotonicity with respect to atomic formulas to have full monotonicity admissible. The mathematical rules for the accessibility relation are the rules Ref and Trans, expressing that ≤ is a partial order. We have thus determined the following system G3I for intuitionistic propositional logic:

Table 3. The sequent calculus G3I.

Initial sequents: x ≤ y, x : P, Γ ⇒ ∆, y : P

Logical rules: As in G3K for &, ∨, ⊥;

\[ \frac{x \le y, x : A \supset B, \Gamma \Rightarrow y : A, \Delta \qquad x \le y, x : A \supset B, y : B, \Gamma \Rightarrow \Delta}{x \le y, x : A \supset B, \Gamma \Rightarrow \Delta}\ L{\supset} \]

\[ \frac{x \le y, y : A, \Gamma \Rightarrow \Delta, y : B}{\Gamma \Rightarrow \Delta, x : A \supset B}\ R{\supset} \]

Order rules:

\[ \frac{x \le x, \Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta}\ \mathit{Ref} \qquad\qquad \frac{x \le z, x \le y, y \le z, \Gamma \Rightarrow \Delta}{x \le y, y \le z, \Gamma \Rightarrow \Delta}\ \mathit{Trans} \]
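The forcing clause for implication and the monotonicity property behind the initial sequents of G3I can be checked on a small finite Kripke model. The following Python sketch is only an illustration and not part of the calculus: the tuple encoding of formulas, the names `forces` and `is_monotone`, and the two-world model are all assumptions made for the example.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("atom", "P"), ("bot",), ("and", A, B), ("or", A, B), ("imp", A, B)

def forces(x, formula, worlds, leq, val):
    """Decide whether world x forces formula in a finite Kripke model.

    worlds: list of worlds; leq: set of pairs (x, y) meaning x <= y;
    val: dict mapping each world to the set of atoms forced there.
    """
    tag = formula[0]
    if tag == "atom":
        return formula[1] in val[x]
    if tag == "bot":
        return False
    if tag == "and":
        return (forces(x, formula[1], worlds, leq, val)
                and forces(x, formula[2], worlds, leq, val))
    if tag == "or":
        return (forces(x, formula[1], worlds, leq, val)
                or forces(x, formula[2], worlds, leq, val))
    if tag == "imp":
        # x forces A > B iff for all y with x <= y, y forces A implies y forces B
        return all(
            (not forces(y, formula[1], worlds, leq, val))
            or forces(y, formula[2], worlds, leq, val)
            for y in worlds if (x, y) in leq
        )
    raise ValueError(f"unknown connective: {tag!r}")


# Hypothetical two-world model: 0 <= 1, with P forced only at world 1.
WORLDS = [0, 1]
LEQ = {(0, 0), (0, 1), (1, 1)}
VAL = {0: set(), 1: {"P"}}

P = ("atom", "P")
NOT_P = ("imp", P, ("bot",))                      # negation as A > bottom
EXCLUDED_MIDDLE = ("or", P, NOT_P)
DOUBLE_NEG_ELIM = ("imp", ("imp", NOT_P, ("bot",)), P)


def is_monotone(formula):
    """Check that x <= y and x forces formula imply y forces formula."""
    return all(
        (not forces(x, formula, WORLDS, LEQ, VAL))
        or forces(y, formula, WORLDS, LEQ, VAL)
        for (x, y) in LEQ
    )
```

In this model world 0 forces neither P ∨ ∼P nor ∼∼P ⊃ P, while all four formulas are monotone along ≤, as the admissibility of full monotonicity would lead one to expect.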
Let G3I* be any extension of G3I with rules following the geometric rule scheme. Following the method presented in Section 3, the structural properties of G3I* are proved uniformly for any extension. We summarize the results in the following
Theorem 4.1.
1. G3I* ⊢ x ≤ y, x : A, Γ ⇒ ∆, y : A (monotonicity).
2. If G3I* ⊢ₙ Γ ⇒ ∆, then G3I* ⊢ₙ Γ(y/x) ⇒ ∆(y/x) (height-preserving substitution).
3. Weakening and contraction are height-preserving admissible.
4. All the rules of G3I* are height-preserving invertible.
5. Cut is admissible.

We obtain at once that each of the seven interpolable intermediate logics (cf. Maksimova 1979, Chagrov and Zakharyaschev 1997) belongs to the class G3I*: the point is simply that all of them have frame conditions expressible geometrically.

1. Int Intuitionistic Logic: as already built in above, the accessibility relation is reflexive and transitive, i.e., ∀x(x ≤ x) and ∀xyz(x ≤ y & y ≤ z ⊃ x ≤ z).

2. Jan Jankov-De Morgan Logic (cf. Jankov 1968): The relation ≤ is directed or convergent, i.e.,
∀xyz(x ≤ y & x ≤ z ⊃ ∃w(y ≤ w & z ≤ w)).
This logic, also known as KC (cf. Chagrov and Zakharyaschev 1997) and as the “logic of weak excluded middle,” is axiomatised by either ∼A ∨ ∼∼A or ∼(A & B) ⊃ ∼A ∨ ∼B.

3. GD Gödel-Dummett Logic: The accessibility relation is linear, i.e., ∀xy(x ≤ y ∨ y ≤ x). This logic (also known as LC, for “linear chains”) has as characteristic axiom scheme either (A ⊃ B) ∨ (B ⊃ A) or ((A ⊃ B) ⊃ C) ⊃ (((B ⊃ A) ⊃ C) ⊃ C).

4. Bd2: The accessibility relation has depth at most 2, i.e., it satisfies ∀xyz(x ≤ y & y ≤ z ⊃ z ≤ y ∨ y ≤ x). This logic is axiomatised by, for example, A ∨ (A ⊃ (B ∨ ∼B)).

5. Sm: Smetanich logic, also known as LC2 (cf. Chagrov and Zakharyaschev 1997) or the “logic of here and there.” The accessibility relation is linear and has depth at most 2, i.e., it satisfies the conditions for GD and Bd2. It is axiomatised by the GD axiom plus the Bd2 axiom, or, equivalently, by (∼B ⊃ A) ⊃ (((A ⊃ B) ⊃ A) ⊃ A).

6. GSc: The accessibility relation has depth at most 2 and at most 2 final elements, i.e., the following holds in addition to the frame condition for Bd2:
∀xyzw(x ≤ y & x ≤ z & x ≤ w ⊃ w ≤ y ∨ w ≤ z).
The logic is axiomatised by (A ⊃ B) ∨ (B ⊃ A) ∨ ((A ⊃ ∼B) & (∼B ⊃ A)) and A ∨ (A ⊃ B ∨ ∼B).
7. Cl Classical logic: The accessibility relation is symmetric, i.e., ∀xy(x ≤ y ⊃ y ≤ x). The logic is axiomatised by A ∨ ∼A or by ∼∼A ⊃ A.

There are the following containments between these logics: Int ⊂ Jan ⊂ GD ⊂ Sm, Int ⊂ Bd2 ⊂ GSc ⊂ Sm and Sm ⊂ Cl.

We recall the standard translation □ of Int into S4, a variant (cf. Troelstra and Schwichtenberg 2000) of the translation given in Gödel [1933]:

\[
\begin{array}{rcl}
P^{\Box} & \equiv & \Box P \\
\bot^{\Box} & \equiv & \bot \\
(A \supset B)^{\Box} & \equiv & \Box(A^{\Box} \supset B^{\Box}) \\
(A \,\&\, B)^{\Box} & \equiv & A^{\Box} \,\&\, B^{\Box} \\
(A \lor B)^{\Box} & \equiv & A^{\Box} \lor B^{\Box} \\
(A_1, \ldots, A_n)^{\Box} & \equiv & A_1^{\Box}, \ldots, A_n^{\Box}
\end{array}
\]
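The clauses of the translation can be read directly as a recursive function on formulas. The sketch below is a minimal illustration in Python; the tuple encoding of formulas and the names `box_translate` and `box_translate_all` are assumptions of this example, and labels are left out, since the translation acts only on the formula part of a labelled formula.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("atom", "P"), ("bot",), ("and", A, B), ("or", A, B), ("imp", A, B), ("box", A)

def box_translate(formula):
    """Translate an intuitionistic formula into a modal one, clause by clause."""
    tag = formula[0]
    if tag == "atom":
        # atoms are boxed
        return ("box", formula)
    if tag == "bot":
        # falsity is left unchanged
        return formula
    if tag == "imp":
        # implication is boxed after translating both sides
        return ("box", ("imp", box_translate(formula[1]), box_translate(formula[2])))
    if tag in ("and", "or"):
        # conjunction and disjunction commute with the translation
        return (tag, box_translate(formula[1]), box_translate(formula[2]))
    raise ValueError(f"unknown connective: {tag!r}")


def box_translate_all(formulas):
    """Extend the translation componentwise to a list standing for a multiset."""
    return [box_translate(f) for f in formulas]
```

For instance, under this encoding `box_translate(("imp", ("atom", "P"), ("atom", "Q")))` yields the encoding of □(□P ⊃ □Q).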
We obtain a uniform proof of the faithful embeddings of intermediate logics between Int and Cl into intermediate modal logics between S4 and S5.

Theorem 4.2. Given an extension G3I* of G3I with rules for ≤, let G3S4* be the corresponding extension of G3S4. We then have G3I* ⊢ Γ ⇒ ∆ iff G3S4* ⊢ Γ^□ ⇒ ∆^□.

§5. Substructural logics. Among the logics that can be characterized in terms of a relational semantics is the family of relevant, and, in more generality, substructural logics. Here we shall show how our method can be successfully applied for obtaining sequent calculi for these logics. For a general background, history, motivations, applications, and references to the vast literature on the field we refer to the survey by Dunn and Restall [2002] and to the two recent monographs Restall [2000] and Mares [2004].

Our starting point for the development of uniform calculi for substructural logics is given by the Routley-Meyer relational semantics. This semantics is a generalization of the standard relational semantics for intuitionistic and modal logic: instead of a binary accessibility relation, we have a ternary relation R on a set of worlds W. A distinguished element 0 of W defines a projection of R, namely a ≤ b ≡ R0ab, that turns out to be a partial order. For basic relevant logic, R satisfies the properties:

Ref     R0xx
Mon1    R0x′x & Rxyz ⊃ Rx′yz
Mon2    R0y′y & Rxyz ⊃ Rxy′z
Mon3    R0zz′ & Rxyz ⊃ Rxyz′
Following the method recalled in Section 2, all the above properties can be given as rules for the accessibility relation to be added to an appropriate labelled calculus. As for intuitionistic logic, the only connective with a non-trivial semantics is implication, with validity defined inductively by

x ⊩ A ⊃ B ≡ for all y, z, Rxyz and y ⊩ A implies z ⊩ B

This semantic explanation justifies the rules

\[ \frac{Rxyz, x : A \supset B, \Gamma \Rightarrow \Delta, y : A \qquad Rxyz, x : A \supset B, z : B, \Gamma \Rightarrow \Delta}{Rxyz, x : A \supset B, \Gamma \Rightarrow \Delta}\ L{\supset} \]

\[ \frac{Rxyz, y : A, \Gamma \Rightarrow \Delta, z : B}{\Gamma \Rightarrow \Delta, x : A \supset B}\ R{\supset} \]

where the latter has the variable condition y, z ∉ Γ, ∆, x : A ⊃ B. A cut-free complete sequent calculus for basic relevance logic is obtained, with initial sequents given by

R0xy, x : P, Γ ⇒ ∆, y : P

The logical rules for implication are as above, the rules for & and ∨ are as in G3K and G3I, and the mathematical rules are given by the monotonicity properties of R. Besides cut, the other structural rules (weakening and contraction) are also admissible. We observe that this does not contradict the substructural nature of these logics. These admissible rules are what could be called (borrowing terminology from hypersequents) external structural rules. In fact, we can easily verify that the axiom A ⊃ (B ⊃ A) that corresponds to weakening is not derivable in the above system despite the admissibility of weakening.

Logics extending the basic relevant logic can be obtained by assuming additional properties for the accessibility relation. We recall some correspondences between axioms and frame properties for a variety of relevant logics. First, define

R²abcd ≡ R²(ab)cd ≡ ∃x(Rabx & Rxcd) and R²a(bc)d ≡ ∃x(Raxd & Rbcx)

Axiom                                      Frame property                Name
A & (A ⊃ B) ⊃ B                            Raaa or R0ab ⊃ Raab           idempotence
(A ⊃ B) & (B ⊃ C) ⊃ (A ⊃ C)                Rabc ⊃ R²a(ab)c               transitivity
(A ⊃ B) ⊃ ((B ⊃ C) ⊃ (A ⊃ C))              R²abcd ⊃ R²b(ac)d             suffixing
(A ⊃ B) ⊃ ((C ⊃ A) ⊃ (C ⊃ B))              R²abcd ⊃ R²a(bc)d             associativity
(A ⊃ (A ⊃ B)) ⊃ (A ⊃ B)                    Rabc ⊃ R²abbc                 contraction
((A ⊃ A) ⊃ B) ⊃ B                          Ra0a                          specialized assertion
A ⊃ ((A ⊃ B) ⊃ B)                          Rabc ⊃ Rbac                   commutativity
A ⊃ (A ⊃ A)                                Rabc ⊃ (R0ac ∨ R0bc)          mingle
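The derived relations R²(ab)cd and R²a(bc)d and some of the frame properties in the table can be tested mechanically on a finite ternary relation. The following Python sketch is purely illustrative: the function names, the representation of R as a set of triples, and the two-world frame are assumptions of this example, and only three of the listed properties are covered.

```python
def r2_left(R, a, b, c, d, worlds):
    """R2(ab)cd: there is some x with Rabx and Rxcd."""
    return any((a, b, x) in R and (x, c, d) in R for x in worlds)


def r2_right(R, a, b, c, d, worlds):
    """R2a(bc)d: there is some x with Raxd and Rbcx."""
    return any((a, x, d) in R and (b, c, x) in R for x in worlds)


def satisfies_commutativity(R):
    """Rabc implies Rbac."""
    return all((b, a, c) in R for (a, b, c) in R)


def satisfies_contraction(R, worlds):
    """Rabc implies R2abbc, i.e. there is x with Rabx and Rxbc."""
    return all(r2_left(R, a, b, b, c, worlds) for (a, b, c) in R)


def satisfies_mingle(R, zero):
    """Rabc implies R0ac or R0bc, where zero is the distinguished world."""
    return all((zero, a, c) in R or (zero, b, c) in R for (a, b, c) in R)


# Hypothetical frame on two worlds with distinguished element 0.
WORLDS = [0, 1]
ZERO = 0
R = {(0, 0, 0), (0, 1, 1), (1, 0, 1)}
```

On this frame commutativity and mingle hold, whereas contraction fails at the triple (0, 1, 1), because no world x satisfies both R01x and Rx11. The disjunctive shape of the mingle condition is visible in `satisfies_mingle`, which needs an `or` in its body where the other checks do not.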
Observe that all the properties of R are geometric. As a consequence, the basic calculus can be extended by rules representing the frame properties and its structural properties follow from the general result on extensions with the geometric rule scheme.

A similar approach to substructural logics is presented in Viganò [2000]. The main difference with respect to our method consists in the use of a basic sequent calculus with explicit structural rules and in a presentation of mathematical rules for the accessibility relation in the form of rules with a single conclusion (Horn clauses) that cannot be extended beyond Harrop theories (theories that do not have disjunctions in positive parts of axioms). This excludes, for instance, the treatment of the last frame property in the above table.

§6. Concluding remarks. We have presented a uniform way of generating sequent calculi with good structural properties for a variety of non-classical logics, including most standard normal modal logics, provability logic, intermediate logics, and substructural logics. The calculi are all in the form of Gentzen sequent calculi, with an extra syntactic element given by the labels and rules governing them that formally encode the Kripke semantics into the sequent systems.

We can now relate the proposed solution to the requirements on good sequent systems that we quoted in the Introduction. The first property, separation, is satisfied as each connective/modality has rules given through its meaning explanation, independent of any other connective. Symmetry also clearly holds, and, in particular for the case of provability logic, it simplifies previous proofs of cut elimination that were based on a non-symmetric rule for the modality. As for the third property, we observe that there are rules in which the connective/modality appears also in the premisses. This is an unavoidable feature of certain calculi, such as G3i, and it is needed for obtaining admissibility of contraction.
Some extra care is therefore needed in proofs of termination of proof search. The fourth and fifth requirements are clearly satisfied. As for the sixth, we observe that our calculi satisfy a similar requirement: We have a core basic logical calculus and different systems are obtained by modifying only the mathematical rules added to the ground calculus, that is, the rules for the accessibility relation. The structural rules, instead, are absent from all our calculi, because they are admissible. In particular, cut is admissible, so also property 7 is satisfied. As for property 8, the subformula property, we do not have a priori a full subformula property. In rules for frame properties there are relational atoms that disappear from the premisses to the conclusion. However, we can prove in most cases a suitable version of the subformula property, adequate for proving syntactic decidability, as consequence of the structural properties of the calculi. Our calculi all clearly satisfy a weak subformula property, that is, all formulas in a derivation are either subformulas of (formulas in) the endsequent or atomic formulas
of the form xRy. By considering minimal derivations, that is, derivations in which shortenings are not possible, the weak subformula property can be strengthened by restricting the labels that can appear in the relational atoms to those in the conclusion. The subterm property states that all terms (variables, worlds) in a derivation are either eigenvariables or terms (variables, worlds) in the conclusion. This property, together with height-preserving admissibility of contraction, ensures the consequences of the full subformula property and it has been used for establishing decidability through terminating proof search for logics extending basic modal logic in Section 6 of Negri [2005]. The same approach to proof-theoretic decidability can be extended to the other non-classical logics treated in this article. REFERENCES
A. Avron [1984], On modal systems having arithmetical interpretations, The Journal of Symbolic Logic, vol. 49, no. 3, pp. 935–942.
D. Basin, S. Matthews, and L. Viganò [1998], Natural deduction for non-classical logics, Studia Logica, vol. 60, no. 1, pp. 119–160.
P. Blackburn [2000], Representation, reasoning, and relational structures: a hybrid logic manifesto, Logic Journal of the IGPL, vol. 8, no. 3, pp. 339–365.
P. Blackburn, M. de Rijke, and Y. Venema [2001], Modal Logic, Cambridge University Press, Cambridge.
M. Borga [1983], On some proof theoretical properties of the modal logic GL, Studia Logica, vol. 42, no. 4, pp. 453–459 (1984).
T. Braüner [2000], A cut-free Gentzen formulation of the modal logic S5, Logic Journal of the IGPL, vol. 8, no. 5, pp. 629–643.
R. Bull and K. Segerberg [1984], Basic modal logic, Handbook of Philosophical Logic, Vol. 2 (D. Gabbay and F. Guenthner, editors), Kluwer, Dordrecht, pp. 1–88.
C. Castellini [2005], Automated Reasoning in Quantified Modal and Temporal Logic, Ph.D. thesis, School of Informatics, University of Edinburgh.
C. Castellini and A. Smaill [2002], A systematic presentation of quantified modal logics, Logic Journal of the IGPL, vol. 10, no. 6, pp. 571–599.
L. Catach [1991], Tableaux: A general theorem prover for modal logics, Journal of Automated Reasoning, vol. 7, no. 4, pp. 489–510.
A. Chagrov and M. Zakharyaschev [1997], Modal Logic, Oxford University Press, New York.
H. B. Curry [1952], The elimination theorem when modality is present, The Journal of Symbolic Logic, vol. 17, pp. 249–265.
J. M. Dunn and G. Restall [2002], Relevance logic, Handbook of Philosophical Logic (D. Gabbay and F. Guenthner, editors), vol. 6, Kluwer, Dordrecht, pp. 1–128.
R. Dyckhoff and S. Negri [2005], Proof analysis in intermediate propositional logics, ms.
R. Dyckhoff and S. Negri [2006], Decision methods for linearly ordered Heyting algebras, Archive for Mathematical Logic, vol. 45, no. 4, pp. 411–422.
F. B. Fitch [1966], Tree proofs in modal logic, The Journal of Symbolic Logic, vol. 31, p. 152.
M. Fitting [1983], Proof Methods for Modal and Intuitionistic Logics, Synthese Library, vol. 169, Kluwer, Dordrecht.
M. Fitting [1999], A simple propositional S5 tableau system, Annals of Pure and Applied Logic, vol. 96, no. 1-3, pp. 107–115.
D. Gabbay [1996], Labelled Deductive Systems, Oxford University Press, New York.
R. Goré [1999], Tableau methods for modal and temporal logics, Handbook of Tableau Methods (M. D’Agostino et al., editors), Kluwer, Dordrecht, pp. 297–396.
K. Gödel [1933], Eine Interpretation des intuitionistischen Aussagenkalküls, Ergebnisse eines mathematischen Kolloquiums, vol. 4, pp. 39–40, English translation in Gödel’s Collected Works, vol. I (1986), pp. 300–303.
V. A. Jankov [1968], The calculus of the weak “law of excluded middle”, Math. USSR Izvestija, vol. 2, pp. 997–1004.
S. Kanger [1957], Provability in Logic, Almqvist and Wiksell.
H. Kushida and M. Okada [2003], A proof-theoretic study of the correspondence of classical logic and modal logic, The Journal of Symbolic Logic, vol. 68, no. 4, pp. 1403–1414.
D. Leivant [1981], On the proof theory of the modal logic for arithmetic provability, The Journal of Symbolic Logic, vol. 46, no. 3, pp. 531–538.
L. Maksimova [1979], Interpolation properties of superintuitionistic logics, Studia Logica, vol. 38, no. 4, pp. 419–428.
E. D. Mares [2004], Relevant Logic. A Philosophical Interpretation, Cambridge University Press.
F. Massacci [2000], Single step tableaux for modal logics: computational properties, complexity and methodology, Journal of Automated Reasoning, vol. 24, no. 3, pp. 319–364.
G. Mints [1968], Cut-free calculi of type S5, Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov (LOMI), vol. 8, pp. 166–174 (in Russian). Translated into English in Studies in constructive mathematics and mathematical logic, vol. 8, Part II (A. O. Slisenko, editor), Leningrad, 1970.
G. Mints [1997], Indexed systems of sequents and cut-elimination, Journal of Philosophical Logic, vol. 26, no. 6, pp. 671–696.
A. Moen [2001], The proposed algorithms for eliminating cuts in the provability calculus GLS do not terminate, Nordic Workshop on Programming Theory.
S. Negri [2001], A sequent calculus for constructive ordered fields, Reuniting the Antipodes: Constructive and Nonstandard Views of the Continuum (P. Schuster et al., editors), Kluwer, Dordrecht, pp. 143–155.
S. Negri [2003], Contraction-free sequent calculi for geometric theories with an application to Barr’s theorem, Archive for Mathematical Logic, vol. 42, no. 4, pp. 389–401.
S. Negri [2005], Proof analysis in modal logic, Journal of Philosophical Logic, vol. 34, no. 5-6, pp. 507–544.
S. Negri [2005a], Permutability of rules for linear lattices, Journal of Universal Computer Science, vol. 11, no. 12, pp. 1986–1995.
S. Negri and J. von Plato [1998], Cut elimination in the presence of axioms, The Bulletin of Symbolic Logic, vol. 4, no. 4, pp. 418–435.
S. Negri and J. von Plato [2001], Structural Proof Theory, Cambridge University Press, Cambridge.
S. Negri and J. von Plato [2004], Proof systems for lattice theory, Mathematical Structures in Computer Science, vol. 14, no. 4, pp. 507–526.
S. Negri and J. von Plato [2005], The duality of classical and constructive notions and proofs, From Sets and Types to Topology and Analysis: Towards Practicable Foundations for Constructive Mathematics (L. Crosilla and P. Schuster, editors), Oxford University Press, Oxford, pp. 149–161.
S. Negri, J. von Plato, and Th. Coquand [2001], Proof-theoretical analysis of order relations, Archive for Mathematical Logic, vol. 43, no. 3, pp. 297–309.
A. Nerode [1991], Some lectures on modal logic, Logic, Algebra, and Computation (F. L. Bauer, editor), NATO ASI Series, Springer, pp. 149–161.
H. J. Ohlbach [1993], Translation methods for non-classical logics: an overview, Bulletin of IGPL, vol. 1, no. 1, pp. 69–89.
M. Ohnishi and K. Matsumoto [1957], Gentzen method in modal calculi, Osaka Journal of Mathematics, vol. 9, pp. 113–130.
J. von Plato [2005], Normal derivability in modal logic, Mathematical Logic Quarterly, vol. 51, pp. 632–638.
J. von Plato [2005a], The decision problem in projective and affine geometry, ms.
S. Popkorn (Harold Simmons) [1994], First Steps in Modal Logic, Cambridge University Press, Cambridge.
S. Read [2004], Harmony and modality, ms.
G. Restall [2000], An introduction to substructural logics, Routledge.
G. Restall [2005], Short course, Logic Colloquium 2005, Athens, Greece.
G. Sambin and S. Valentini [1982], The modal logic of provability. The sequential approach, Journal of Philosophical Logic, vol. 11, no. 3, pp. 311–342.
K. Sasaki [2001], Löb’s axiom and cut-elimination theorem, Journal of the Nanzan Academic Society Mathematical Sciences and Information Engineering, vol. 1, pp. 91–98.
M. Sato [1980], A cut-free Gentzen-type system for the modal logic S5, The Journal of Symbolic Logic, vol. 45, no. 1, pp. 67–84.
R. Schmidt and U. Hustadt [2003], A principle for incorporating axioms into the first-order translation of modal formulae, Automated Deduction-CADE-19 (F. Baader, editor), Lecture Notes in Artificial Intelligence, vol. 2741, Springer, pp. 412–426.
G. Shvarts [1989], Gentzen style systems for K45 and K45D, Logic at Botik ’89 (Pereslavl Zalesskiy, 1989), Lecture Notes in Computer Science, vol. 363, Springer, Berlin, pp. 245–256.
A. Simpson [1994], Proof Theory and Semantics of Intuitionistic Modal Logic, Ph.D. thesis, School of Informatics, University of Edinburgh.
R. M. Solovay [1976], Provability interpretations of modal logic, Israel Journal of Mathematics, vol. 25, no. 3-4, pp. 287–304.
P. Stouppa [2005], A deep inference system for the modal logic S5, ms.
A. Troelstra and H. Schwichtenberg [2000], Basic Proof Theory, 2nd ed., Cambridge University Press, Cambridge.
S. Valentini [1983], The modal logic of provability: cut-elimination, Journal of Philosophical Logic, vol. 12, no. 4, pp. 471–476.
J. Van Benthem [1984], Correspondence theory, Handbook of Philosophical Logic (D. Gabbay and F. Guenthner, editors), vol. 2, Kluwer, Dordrecht, pp. 167–247.
L. Viganò [2000], Labelled Non-Classical Logics, Kluwer, Dordrecht.
H. Wansing [1994], Sequent calculi for normal modal propositional logics, Journal of Logic and Computation, vol. 4, no. 2, pp. 125–142.
H. Wansing [1996], Proof theory of modal logic, Proceedings of the Workshop Held at the University of Hamburg, Hamburg, November 19–20, 1993 (Heinrich Wansing, editor), Applied Logic Series, vol. 2, Kluwer, Dordrecht.
H. Wansing [2002], Sequent systems for modal logics, Handbook of Philosophical Logic (D. Gabbay and F. Guenthner, editors), vol. 8, Kluwer, Dordrecht, 2nd ed., pp. 61–145.

DEPARTMENT OF PHILOSOPHY, PL 9, 00014 UNIVERSITY OF HELSINKI, FINLAND
E-mail: sara.negri@helsinki.fi
PAUL BERNAYS’ LATER PHILOSOPHY OF MATHEMATICS
CHARLES PARSONS
§1. The name of Paul Bernays (1888-1977) is familiar probably first of all for his contributions to mathematical logic. Many of those were in the context of his position as David Hilbert’s junior collaborator in his proof-theoretic program inaugurated after the first World War. For those of us starting out in logic in the mid-twentieth century, the monumental Grundlagen der Mathematik of Hilbert and Bernays was one of the basic works in mathematical logic that we were obliged to study. That was the more true for those like me who aspired to work in proof theory. Bernays’ collaboration with Hilbert ended with the publication in 1939 of volume II of that work. Because the Nazis had forced his removal from his position in Göttingen in 1933, the collaboration ceased to be face-to-face in 1934, when Bernays moved to Zürich, where he lived for the rest of his life.¹

Bernays is also known, though less well known, as a philosophical writer. The fact that he had some philosophical training was apparently a reason why Hilbert chose him as his collaborator. Constance Reid tells the story that when Hilbert visited Zürich in 1917, he went for a walk with two local mathematicians, Bernays and Georg Pólya. The conversation turned to philosophy, and although Pólya was usually rather voluble and Bernays was not, this time Bernays did most of the talking. On the spot Hilbert invited Bernays to Göttingen to work with him, and Bernays accepted.²

This paper results in considerable measure from my (somewhat marginal) participation in the project of preparing a collected edition [200?] of Bernays’ papers in the philosophy of mathematics. In §6 I draw on one of the introductions I wrote for that edition. I have learned much about Bernays from the other participants, especially (in fact over a longer period) Wilfried Sieg. §7 was prompted by comments made by Albert Visser at the Athens meeting.
I am also indebted to an anonymous referee and to correspondence with Miriam Franchella, although I have not gone into many questions about Bernays’ relation to Nelson and Gonseth suggested by her [2005].
1 I do not know exactly when Bernays’ move took place, but it appears to have been in April 1934.
2 Reid [1970: 150-51].

Logic Colloquium ’05, Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel, Lecture Notes in Logic, 28, © 2006, Association for Symbolic Logic
CHARLES PARSONS
This story is too neat to be quite correct. In fact, in a letter to Reid commenting on a draft of the biography, Bernays writes that the walk indeed occurred in the spring of 1917 but that the offer was made and accepted when Hilbert returned to Zürich in September for his lecture “Axiomatisches Denken” [1918].3
Although Bernays’ self-identification was clearly as a mathematician, and his academic career, such as it was, was in mathematics, he maintained an interest in philosophy throughout his life and continued to write philosophical essays. His writing was not confined to the philosophy of mathematics, but naturally enough that is where its main weight lies. Near the end of his life he put together a collection Abhandlungen zur Philosophie der Mathematik [1976], which contains the most important such essays.4 A more comprehensive collection, but still of essays in the philosophy of mathematics, is in preparation and will have the originals and English translations on facing pages, with introductions by various hands.5
Bernays’ philosophical formation is of interest from a contemporary point of view because it would have to be described as neither “analytic” nor “continental” in the sense in which those terms are now used to classify philosophical styles and tendencies. His principal teacher in philosophy was Leonard Nelson (1882-1927), who would be described as a neo-Kantian but who did not belong to either of the principal neo-Kantian schools, the Marburg school and the southwestern school. However, Bernays writes that earlier in Berlin he heard lectures of Alois Riehl and Ernst Cassirer.6 Riehl had been one of the founders of neo-Kantianism, and Cassirer was the leading representative in the twentieth century of the Marburg school. Edmund Husserl was a professor in Göttingen during Bernays’ studies there and led an active school, but it seems that he and his philosophy had little impact on Bernays. Husserl is not mentioned in Bernays’ autobiographical sketch, and the rather few comments about Husserl and phenomenology in Bernays’ writings are mostly critical. Nonetheless there are indications that eventually they exercised some positive
3 Bernays to Reid, 27 November 1968, copy in the Bernays papers in the Archive of the ETH, document Hs 975: 3775. Later correspondence shows that Bernays was concerned about this inaccuracy after the publication of the book. However, the book was later reprinted unrevised. Also in [1976a, xv], Bernays states that the offer was made when Hilbert came for the lecture, which occurred on 11 September 1917. The interval would have allowed Hilbert not only to think about the idea of engaging Bernays but to inquire with people who knew his work, such as his Doktorvater Edmund Landau, Leonard Nelson (see below), Ernst Zermelo, and in Zürich probably Adolf Hurwitz and Hermann Weyl.
4 [1935] and [1946] are translated from French, the latter by Bernays himself.
5 All essays reprinted in [1976], as well as [1928], [1937], [1976a], and a number of items not cited here, will be included in this collection. Quotations from these essays (other than [1976a]) in the present paper are in the current draft translations for this collection, in some cases based on published translations. Translations of quotations from other writings of Bernays are my own.
6 [1976a, xiv].
PAUL BERNAYS’ LATER PHILOSOPHY OF MATHEMATICS
influence on Bernays. Nelson sought to revive the philosophical approach of J. F. Fries (1773-1843), and Bernays published three early essays in the Abhandlungen der Fries’schen Schule, which Nelson had revived as the organ of his school.
Bernays published no more philosophy until several years after his return to Göttingen in 1917, and most of the papers published during his second sojourn in Göttingen reflect rather directly his collaboration with Hilbert. I will make some remarks later about [1930], the most significant and original of these papers.
By his “later” philosophy I have in mind principally what we find in publications after 1945. In the later period he was evidently much influenced by his Zürich colleague Ferdinand Gonseth (1890-1975), with whom he collaborated actively in organizational matters. Remarks of Bernays himself indicate a significant change in his philosophical views between the early 1930s and the post-war period.
I had come close to the views of [Ferdinand] Gonseth on the basis of the engagement of my thinking (meinen gedanklichen Auseinandersetzungen) with the philosophy of Kant, Fries, and Nelson, and so I attached myself to his philosophical school [1976a, xvi].
. . . during the period in which these articles were published, my views on the relevant questions have changed almost exclusively in response to new insights gained from research in the foundations of mathematics [1976, vii].7
There is a tension between these two remarks if they are applied to the transitional period between 1934 and 1945, since the first suggests that reflections on the philosophy of Kant, Fries, and Nelson, probably differences with them, led to his coming close to Gonseth’s views, while the second attributes change in his views to developments in the foundations of mathematics. It seems to me likely that the problems the Hilbert program encountered in the wake of Gödel’s theorem, and perhaps other mathematical developments of the time, crystallized some dissatisfactions on Bernays’ part with his Kantian legacy.
7 Bernays wrote “ . . . meine Ansichten . . . sich fast nur insoweit gewandelt haben, als es durch die in der mathematischen Grundlagenforschung gewonnenen neuen Einsichten bedingt wurde.” One might also translate this as “My views . . . have changed almost exclusively to the extent that is required by the new insights gained from research in the foundations of mathematics.” The remark might be taken to minimize not only philosophical motives for changes in his views, but the changes themselves. I find that difficult to reconcile with the totality of Bernays’ philosophical writing. But it also suggests that whatever other reasons for changes there may have been, they were adequately motivated by developments in foundational research. I find that hard to reconcile with Bernays’ actual evolution. But he very likely did think that his later views yielded a better understanding of the results of foundational research. (I am indebted here to an anonymous referee.)
We will see shortly what those dissatisfactions might have been. I will comment on aspects of two essays from the transitional period, the well-known [1935] and the almost unknown [1937].
Bernays’ style of philosophical writing was essayistic, and it is unlikely that he thought of himself as developing a systematic view. I do not find evidence of the drive toward system that one finds in so many of the best philosophers. That is evidently connected with his extremely modest personality and is reflected in his tendency to be a disciple of others, earlier Nelson and Hilbert and later Gonseth. For our purposes, it is most illuminating to single out certain themes that run through his writings. I will concentrate on two: what might be called anti-foundationalism, which summarizes his later epistemological attitude, including his rejection of the a priori; and structuralism. That leaves out much. A lot of his philosophical writing consists of comment on developments in logical research in foundations, and I will also have little to say about that. His subtle but somewhat elusive conception of intuition and its role is omitted because his most developed presentation belongs to the earlier period, and I am far from sure I understand it. A related issue is Bernays’ view that the conception of the continuum is basically geometric; he sees the predicative and intuitionistic theories as striving for a complete arithmetization that does not do justice to the conception and the standard set-theoretical treatment as a workable but not intuitively entirely satisfying compromise. These issues are large enough to be the subject of another paper.
§2. Anti-foundationalism. One might introduce Bernays’ anti-foundationalism by a probably oversimple representation of the epistemology lying behind the Hilbert program as a kind of indirect foundationalism.
It did not seek to make the theories actual mathematics works with rest on evident foundations in the sense that mathematical theorems would be seen to be obtainable by self-evident elementary deductive steps from axioms that are also self-evident, that is would need no further justification and would perhaps be incapable of it. But it offered a justification of mathematical theories by means of a consistency proof, where the method of the consistency proof, Hilbert’s finitary method, seems to have the character just described. In [1930] Bernays seems to view the consistency of arithmetic as obtaining such a justification from a finitist proof. However, as far as the justification of arithmetic itself goes, what is accomplished is to give “full, evident certainty . . . that it cannot come to grief through the incompatibility of its consequences” (55); however, Bernays emphasizes that arithmetic and analysis are extremely well confirmed by mathematical experience and application. However, applying our simple picture of a foundationalist view of a domain of mathematical knowledge to the finitary method is complicated by the fact that Hilbert and Bernays do not view the method as embodied by an axiomatic
theory; on the contrary the emphasis is on the intuitive character of the concepts and claims that the method gives rise to. I would express this by saying that what is attained by finitary concept formation and proof is intuitive knowledge. Still, one could infer from their expositions that the method at least allows what is formalizable in primitive recursive arithmetic. Elsewhere I have attributed to them what I call
Hilbert’s Thesis: A proof of a proposition according to the finitary method yields intuitive knowledge of that proposition. In particular, this is true of proofs in primitive recursive arithmetic.8
The key to an argument for Hilbert’s thesis is the claim that if a function is defined by primitive recursion from functions that can be seen intuitively to be well-defined, then it too can be seen intuitively to be well-defined. Hilbert and Bernays do offer an argument for this, which generalizes one given earlier by Bernays for exponentiation [1934, 25-27; 1930, 38-39]. In the paper just referred to I argue that these arguments are the weak point of the case for Hilbert’s Thesis.9 It may be that Bernays already came to doubt them in 1934. In [1935, 61] he questions whether it is intuitively evident that the number $67^{257^{729}}$ has a decimal expansion, on the ground that it is a number that is “far larger than any occurring in experience.”10 What is significant is not this obvious observation about the number, but his taking it as a reason for questioning whether the knowledge that it has a decimal expansion is intuitive.
It is not certain that intuitive knowledge as Hilbert and Bernays understood it is a case of perfectly evident premises leading to perfectly evident conclusions. It does seem, however, that that is the way Bernays’ mentor Leonard Nelson viewed Kantian “pure intuition”, and Bernays does not question it in [1928], the exposition of Nelson’s philosophy of mathematics that Bernays published just after Nelson’s death. That may, however, just reflect the essentially expository character of the essay. Bernays’ remark in [1930] could mean only that intuitive finitist proof offers simply the greatest and most reflective certainty that can be attained. A motive for the method was surely that anything proved by it would be acceptable to any of the currents of opinion in foundations at the time, in particular the positions of Brouwer and Weyl.
Whatever may have been Bernays’ view in 1928 or 1930, it is clear that by 1934 he was questioning this understanding of finitist mathematics and with it, one might say, the last refuge of foundationalism in the epistemology of mathematics. If he held such a view in the 1920s, it would have to be on the
8 Parsons [1998: 254]. It has been pointed out recently that in practice inferences that in fact went beyond PRA (whether or not this was known) were allowed as finitist. See Zach [2003]. This does not contradict the attribution to Hilbert and Bernays of Hilbert’s Thesis.
9 [1998: §5].
10 I date the doubts to 1934 because [1935] derives from a lecture given in Geneva on 18 June 1934.
basis of something like Hilbert’s Thesis. The remark in [1935] already calls this into question. And in the postscript to the reprint of [1930] in Abhandlungen, he writes that “the sharp distinction of the intuitive and the nonintuitive, as it is applied in the treatment of the problem of the infinite, can apparently not be carried through so strictly” [1976, 61]. That is a philosophical reflection of the mathematical fact that after Gödel’s theorem proof-theoretic arguments had to be conducted with assumptions of increasing strength, and it was already apparent in the 1930s that the sense in which the method of Gentzen’s consistency proof is “intuitive” is not so clear.
What he had in mind in the late remark is indicated by that of 1934, but it is clear that he had carried the matter further. The major step was the introduction in [1946] of the notion of acquired evidence. The German term Evidenz and its French equivalent évidence are often translated as “self-evidence”. This does not fit well the intentions of philosophers such as Brentano and Husserl who have given the concept a prominent role. It certainly does not fit the intention of Bernays. Acquired evidence is the evident character a proposition comes to have for someone who uses it in the context of a developed conceptual scheme,11 where the sentence involved may have had a use as the scheme developed. So acquired evidence is not something a proposition has by itself; Bernays explicitly says that evidence may be relative to “implicit suppositions” which the conceptual scheme involves. Bernays says that the evidences in mathematics are almost all acquired and goes so far as to remark:
What distinguishes this case is that the dialectic is established in our mind in such a penetrating manner that it influences our intuitive imagination, that is to say that it influences the way in which we represent intuitively certain categories of objects. . . . In this way one also understands that intuition can derive notions that surpass the possibilities of a complete effective control and whose conceptual analysis gives rise to infinite structures [1946, 323].
About the concept of natural number (with induction and recursion), he remarks a little later that it is “a full dialectic, which certainly did not exist from the beginning for the mind but which had to be tried out and dared at a certain stage” (324). That is to say, our conception of natural number had to be developed in a way that involved trial and (very likely) error.
The concept of acquired evidence offers a natural way to interpret the view commonly expressed by set theorists that the axiom of choice is evident,
11 Bernays uses the term dialectique derived from Gonseth. But it appears that what he has in mind is very close to what we would call a conceptual scheme. The relevant sense is the first of two senses of dialectique that Bernays distinguishes in [1947]. He says this can be taken as a generalization of the concept of dialectic as logic, and that one might say that logic is the first example of the scientific fixation of such a dialectic (173).
obvious, or follows from the concept of set. Bernays himself gives a more elementary example in a brief description of the obviousness that arithmetic comes to have: First we are conscious of the freedom we have to advance from one position arrived at in the process of counting to the next one. But then we take the step of a connection, through which a function that associates a successor with each and every number is posited. Here a progressus in infinitum replaces the progressus in indefinitum. But it is not immediately obvious that this idea of the infinite number series can be realized; the intellectual experience of its successful realization is then essential for developing a feeling of familiarity, even of obviousness, as acquired evidence [1955, 111]. §3. Questioning the a priori. A second aspect of Bernays’ anti-foundationalism is his rejection of the idea of a priori knowledge as he had come to understand it from Kant and Nelson. This is indicated already in the paper [1937], although it is principally directed against the idea of a priori principles in physics (including the idea that the geometry of space is known a priori, which Nelson had continued to hold). The objections he canvasses to the “aprioristic point of view” first of all concern the difficulty of maintaining that view while doing justice to developments in physics. But at the end of the paper, he draws the quite general conclusion that the aprioristic view and “pure empiricism” have in common a presupposition (which Bernays rejects): If reason is essential for empirical knowledge, then it must play its role through principles that are knowable a priori. What is most significant about mathematics is what Bernays does not say: he might have said that mathematics has a priori principles, which, however, do not have implications for physics or other sciences that allow them to come into conflict with new developments. 
Although his statement of the common presupposition of apriorism and empiricism suggests that he already rejected the view that mathematical knowledge is a priori, his silence could reflect rather uncertainty or suspension of judgment. In fact it is difficult to pin down what Bernays thought earlier about the a priori character of mathematics in general, beyond finitary mathematics. In a paper evidently addressed to fellow members of Nelson’s neo-Friesian school, Bernays describes analysis as rational knowledge, of the same nature as Fries attributed to pure natural science.12 But the issue seems to have been complicated for him by the Hilbertian conception of the axiomatic method, according to which the status of an axiom is not that of a known truth or even (as in Hilbert’s Grundlagen der Geometrie) as in itself true or false.13
As we shall see, later Bernays is quite definite in rejecting the strictly a priori character of mathematics. Before turning to this, I should explain a difficulty that a contemporary philosopher might have with my regarding the rejection of the a priori character of mathematics as an aspect of Bernays’ anti-foundationalism. Philosophers from Gödel to the present have defended the idea of propositions whose justification is a priori but which are neither certain nor unrevisable. That possibility seems not to occur to Bernays, because his model of a priori knowledge comes from a certain interpretation of Kant. Thus he describes the “theory of knowledge a priori” as holding
. . . that we have certain cognitions of natural reality lying in Reason, to be sure first actualized through sensory stimulation, which, when they are brought to full consciousness, can be formulated in a definitive way in the form of general laws. In addition this theory claims that these a priori knowable laws contain the principles for all research in exact natural science and that in particular the method of physical theory formation is fixed in a unique and definitive way [1937, 279].
In some ways this is excessively rigid even as an interpretation of Kant.14 But Bernays is clearly guided by a general Kantian picture of at least synthetic a priori cognition as determined by the nature of the mind. Since the nature of the mind does not change, it is entirely natural to expect that available synthetic a priori cognition will not change and will thus be unrevisable.15
Already in [1939] Bernays manifests directly a dissatisfaction with the idea of the a priori applied to mathematics:
In general the present-day problem of foundations . . . points to the requirement of revising our epistemological conceptions (Begriffsbildungen). In particular, that widely held opposition, according to which there is in science on one side the self-evident (“logical”)
12 [1930a, 110]. He mentions the use of ideas of totalities and quantification over them.
13 See the remark on Hilbert’s geometry in [1922], Mancosu [1998: 192]. Cf. Bernays’ discussion remark published with Nelson [1928]. The polemic against formalism in Nelson’s reply may have been one factor convincing Bernays that he had to distance himself from a formalistic interpretation of Hilbert’s views and program, as he does in [1930, II §4].
14 Bernays goes on to say “so that after the uncovering of these principles there is strictly speaking no further development of physical speculation.” He seems to assume that according to Kant a priori principles in physics are limited to mechanics. This leaves no room for what Michael Friedman finds in the Opus postumum, a development of the theory to accommodate Lavoisier’s discoveries in chemistry. See Friedman [1992: chapter 5]. That development might also consist of “definitive” a priori principles, but Bernays’ reading still leaves it a little difficult to understand Kant’s continuing concern about empirical laws.
15 Although the position Bernays takes here is critical of Nelson, one might see Bernays as responding to an internal conflict in Nelson’s philosophy, between his orthodox Kantian view of geometry and his view that philosophy should pay close attention to developments in the sciences and should not contradict its results.
and [on the other] accommodation to perception (the “empirical”), appears inadequate for the characterization of the scientific situation. The extension of this dichotomy carried out by Kant with the concept of a priori knowledge does not give a satisfying viewpoint [1939, 87, emphasis Bernays’].16
Interestingly, Bernays mentions the “problem of foundations”, evidently as it has developed in logical research. But it is noteworthy that [1937] is very unlikely to reflect any influence of Gonseth, whereas the setting of the publication (and probably the writing) of [1939] suggests the contrary. In the important [1950] Bernays is clearer in distancing himself from the idea of an a priori evident foundation of mathematics.
Instead, one can adopt the epistemological viewpoint of Gonseth’s philosophy which does not restrict the character of a duality, due to a combination of rational and empirical factors, to knowledge in the natural sciences, but rather finds it in all areas of knowledge. For the abstract fields of mathematics and logic that means specifically that thought-formations are not determined purely a priori but grow out of a kind of intellectual experimentation [1950, 102].
The intellectual experimentation he refers to is evidently an instance of his notion of intellectual experience (geistige Erfahrung), of which more later. What he wishes to substitute for the a priori as he understands it is explained a little more fully in the essay [1961] on Carnap.17 In effect Bernays proposes replacing the notion of the a priori by the more modest notion of the antecedent (vorgängig, translating Gonseth’s term préalable). He applies this term to “ideas, opinions, and beliefs to which we either consciously or instinctively hold on in our questions, considerations, and methods” (161). But such beliefs might be revised in the development of the science in question.
In fact, Bernays writes:
The scientific method also requires that we make ourselves aware of the antecedent premises, and even make them the object of an investigation, respectively include them in the subject matter of an investigation (ibid.).
The antecedent, in contrast to the a priori, is related either to a state of knowledge or to a discipline.18 He seems in general terms to be following Gonseth.
Although the final push may have been delivered by Gonseth, Bernays’ rejection of the a priori character of mathematics seems clearly to have been motivated by the experience of foundational research in mathematics, where different conceptual constructions competed with none being able to make convincing any claim to definitive character. After introducing the idea of intellectual experimentation, he discerns such experimentation in the different constructions in foundational research and says that even unsuccessful attempts have their value. He sees the competing foundational programs as analogous to competing theories in science [1950, 102].
Bernays’ notion of intellectual experience (geistige Erfahrung) deserves further exploration. The term seems first to occur in Bernays’ publications in 1948, in an obscure passage that does not relate the notion to mathematics.19 But its origin is certainly earlier. In a letter to Gödel of 7 September 1942, Bernays mentions distinguishing “different layers and kinds of evidence,”20 as he did already in [1935]. He goes on to say:
Here the point of view is to be added that the certainty of a conceptual system (a “dialectic” in the sense of Gonseth) is not given beforehand, but is acquired through use, in the light of a kind of “intellectual experience”.21
Thus the notion of acquired evidence is also invoked.22
The notion is evidently implicitly present in [1950] and is put into a more general context in some remarks to the third of Gonseth’s Zürich conferences, which concern the Gonsethian theme of the intimate relation of rational and empirical factors in knowledge. Illustrating the proposal that experience be thought of more broadly than sense-experience, he says that a field of “experiences of a more general kind” arises in the working out of theories. He goes on to say:
Even in the domain of mathematical thought one can speak of experience. A proof that proceeds according to the current methods of a mathematical discipline can be considered as a definite construction whose existence is shown, in a way analogous to that in which the result of a numerical computation can be exhibited.
16 Evidently Bernays was thinking of the synthetic a priori. The passage strikes me as rather carelessly written, rather uncharacteristic of Bernays.
17 Although the journal issue in which this essay appeared is dated 1961, its actual publication must have been later, because Bernays wrote to Gödel on 31 December 1961 that he was “presently busy” with the paper (Gödel [2003: 204-05]). As the title suggests, the essay is only incidentally concerned with the philosophy of mathematics.
18 The notion of the antecedent might be instructively compared to Hilary Putnam’s notion of the contextually a priori or to the idea of constitutive principles, derived by Michael Friedman from some logical positivist writings. There will be significant differences in both cases. For Putnam, a proposition is contextually a priori in a given epistemic setting only if it is impossible to conceive how it could be false. (He holds that to have been true for Euclidean geometry in the eighteenth century.) This element does not enter into Bernays’ conception, nor so far as I know into Gonseth’s.
19 [1948, 276-77].
20 Gödel [2003: 138], trans. p. 139.
21 Ibid.
22 Bernays mentions that he had advocated the notion of intellectual experience in a discussion remark at the first (1938) of Gonseth’s Zürich conferences; see Gonseth [1941: 78-79]. There is a hint there of the idea of acquired evidence.
Thus there is a sort of intellectual experience with respect to the deductive working out of mathematical ideas [1952, 133-34]. In another place he again introduces the idea in opposition to the idea of certain a priori knowledge: It seems necessary to concede that we also have to learn in the fields of mathematics and that we here too have an experience sui generis (we might call it “intellectual experience”). This does not diminish the rationality of mathematics. Rather, the assumption that rationality is necessarily connected with certainty appears to be a prejudice [1969, 174-75]. It is probably with what he has earlier called intellectual experimentation in mind that he ends [1970], the most Gonsethian of his essays in the philosophy of mathematics, with the remark: Gonseth proclaims ouverture a` l’exp´erience as a general method; and as a requirement it is not restricted to research in natural science, but it is equally important in the field of intellectual experience (188). What Bernays says about intellectual experience would encourage the idea that he would admit that mathematical knowledge is a priori once that thesis is dissociated from the Kantian paradigm. Commenting on Quine’s criticism of the analytic-synthetic distinction, Bernays says that mathematical propositions are justified in a different sense from physical [1969, 172].23 But in general he proves elusive on this question. An essay tantalizingly entitled “Some empirical aspects of mathematics” offers little enlightenment. He repeats the proposal to replace “the doctrines of the a priori, the analytical, the self-evident” with Gonseth’s of the pr´ealable [1965, 127] and says that mathematics is antecedent to natural science.24 But what he says elsewhere about the antecedent certainly does not rule out the possibility that essentially 23 It
is of some interest that throughout his career Bernays never seems greatly moved by the analytic-synthetic distinction, although he does comment on it in various places. His Kantian background no doubt predisposed him against the view held by many philosophers that the a priori character of mathematics could be defended by showing mathematical propositions to be analytic. Already in [1930, I §2] he criticizes the Fregean logicist analysis of number on quite different grounds. In [1969] he criticizes Quine in saying that Carnapian definitions of analyticity do characterize mathematical propositions as distinguished from those of empirical science, but he seems to mean no more than that they give an extensional characterization. Bernays discusses Kant’s distinction at some length in [1955a, 27-30]. Like many other writers, he says that Kant’s distinction is not the same as that discussed in his own time. Referring to the discussion of the foundations of mathematics, he says that it ”is surely questionable whether a sharp general defined version” of the distinction has been arrived at (30). 24 This paper is a translation. The translator states that he renders pr´ ealable as ‘preliminary’. However, I continue to follow the Bernays Project translations in rendering it as ‘antecedent’. This paper was evidently originally written in French. A copy of the typescript is in the Bernays papers, document Hs 973: 34. (I am indebted to Wilfried Sieg for calling my attention to
140
CHARLES PARSONS
empirical grounds might lead to the modification of mathematical principles. It would be natural to read in that way an expression of agreement with what he (rightly) takes to be a fundamental point in Quine’s critique of the analytic-synthetic distinction: . . . it is inappropriate to divide the validity of a judgment into a linguistic and a factual component. This outlook points in a direction similar to that of the principe de dualit´e of Ferdinand Gonseth [1957, 238]. That Bernays understood Gonseth’s principle as having this implication is suggested by the following remark: Moreover he [Gonseth] does not divide disciplines into purely empirical and purely rational; rather, he assumes that a kind of “duality” of the empirical and the rational is to be found in all parts [of knowledge] [1977, 119]. Gonseth himself emphasized the genesis of mathematical concept formations (especially geometrical) in ordinary experience, and this element does appear in Bernays. But others have thought that a genetic role of this kind is compatible with the science in its developed form having a justification without appeal to the empirical. There is some indication that this was not Bernays’ view, but it is far from decisive. An element of Bernays’ epistemology that may be relevant here but is not developed very much is the appeal that he makes at times to phenomenology. He even goes so far as to suggest regarding mathematics as “the theoretical phenomenology of formal structure” [1965, 126].25 He offers phenomenology as an example of objectivity distinct from that referring to “the real objects of nature.” His example is colors, about which he says that observations of them “do not lose their objectivity even though we have discovered that colors must be eliminated from physical theories” (ibid.).26 In this characterization of mathematics, “theoretical” is important. 
Bernays may think of mathematical thought as beginning with noting structural (as opposed to qualitative) aspects of the phenomenal, but it quickly idealizes and so goes beyond the strictly phenomenal. Thus he more frequently refers to mathematics as the study of idealized structures.

this document.) This is not surprising since the volume in which the translation appeared is the proceedings of a meeting in Brussels of the Académie Internationale de Philosophie des Sciences.

25 In his essay on Wittgenstein, Bernays criticizes Wittgenstein for rejecting any phenomenology [1959, 124, 141]; the second passage overlaps the present one in content.

26 There follows the most positive comment about Husserl that I have found in Bernays’ writings. In general (also here) Bernays criticizes Husserl’s claims for intuition, and in an essay about Gonseth he mentions Husserl’s transcendental phenomenology as an example of “first philosophy” that he and Gonseth reject [1960, 151].
PAUL BERNAYS’ LATER PHILOSOPHY OF MATHEMATICS
§4. The stratification of mathematical theories. Both Bernays’ rejection of the a priori and his relations with Gonseth might lead one to think he holds a holistic view at least structurally analogous to that of W. V. Quine. I say “structurally analogous” because I have already made clear that he did not share Quine’s empiricism. There are indeed indications of such a view, but the sketchy idea of phenomenology would suggest a qualification. Another, more developed and more interesting to logicians, is the stratification of mathematics laid out in [1935], which is one of his most influential contributions. A central idea of that paper, which so far as I know was quite original, is that platonism in mathematics admits of degrees. “Platonism” as he describes it is first of all a methodological tendency, of which his initial illustrations are the use of the law of the excluded middle and the existential form of axioms (in Hilbert’s geometry as opposed to that of Euclid, who speaks of figures to be constructed). . . . the tendency of which we are speaking consists in viewing the objects as cut off from all links with the reflecting subject. Since this tendency asserted itself especially in the philosophy of Plato, allow me to call it “platonism” [2002, 380]. Platonist mathematical conceptions, he goes on to say, are distinguished by their simplicity and logical strength. They form “representations which extrapolate from certain regions of experience and intuition.” Bernays goes on to make clear that the conceptions he calls platonist can be introduced at some levels of mathematics and not others, and he uses this in a now familiar way to classify stances in foundations. Thus admitting “the totality of natural numbers” allows classical quantificational reasoning about natural numbers but is silent about further steps. Analysis introduces such a conception of set of natural numbers, function of natural numbers, and real number.
(It is in this connection that Bernays gives his well known “quasicombinatorial” motivation of classical impredicative reasoning about sets of numbers (53-54).) Set theory, because it involves iteration of this conception, represents a much greater “platonist” commitment, how great depending on what axioms are accepted. Bernays attributes to Kronecker and Brouwer the rejection of even the initial platonist step of admitting the totality of numbers. Although this is no longer a matter of platonism as he understands it, he notes a stratification here, in that intuitionism goes beyond Hilbert’s finitism in reasoning with an abstract concept of proof. The remark about $67^{257^{729}}$, though directed against Brouwer, would equally imply that Hilbert’s finitism goes beyond the intuitive.27 The kind of stratification described by Bernays is a commonplace today, because

27 Thus Hao Wang places strict finitism, which he calls “anthropologism”, lower in this hierarchy than finitism. See [1958: 472-73]. He attributes his stratification to Bernays; probably he found it implicit in [1935].
it has been ratified and refined by many proof-theoretic results and by related enterprises such as the study of hierarchies based on notions of definability. Bernays already takes note in the paper of the first result showing that “platonistic” commitments are not the only dimension on which mathematical conceptions are stratified, the relative consistency proof of classical first-order arithmetic to intuitionistic. §5. Platonism. Philosophers will naturally ask whether and in what sense Bernays was a platonist. Since that word is used in many different ways in philosophy of mathematics and elsewhere in philosophy, we can’t be sure of a unique answer. It is clear that the platonism he describes in [1935] is primarily a methodological stance, and furthermore he says that different levels of platonism are appropriate for different domains of mathematics; he suggests for example that the “intuitive concept of number” is the most natural for number theory [1935, 64]. From his description, it seems clear that it is Hilbert’s finitary method that he has in mind, so that he probably means the most elementary number theory rather than more developed number theory such as analytic number theory. Clearly, however, in saying that “today platonism reigns in mathematics” (56) he does not mean to offer a general objection. What he does reject is what he calls “absolute platonism,” characterized as “conceptual realism, postulating the existence of a world of ideal objects containing all the objects and relations of mathematics” (ibid.). This view he considers refuted by the paradoxes.28 Elsewhere I describe Bernays’ view in this paper as methodological platonism [200?, 380]. In later writings Bernays largely avoids the term “platonism”; he may well have thought it gave rise to misunderstanding. 
There’s no doubt that he continued to accept what has been called default realism or default platonism,29 which amounts to taking the language of classical mathematics at face value and accepting what has been proved by standard methods as true. Some such statements are of the existence of mathematical objects, which would make Bernays a platonist in the sense often used in American philosophy, where ‘platonism’ is contrasted with nominalism (as espoused by Goodman and later by Field) rather than with constructivism, as was the case in [1935]. In fact, a broadly realistic attitude was part of his general approach to knowledge in the post-war years. In one description of the common position of the group around Gonseth, he emphasizes that the position is one of trust in our cognitive faculties. He also introduces the French term connaissance de fait; the idea

28 This sounds like a view that Gödel would have endorsed from 1944 on. Some of Gödel’s difficulties about the notion of concept could be taken to arise from the problem of formulating the view so as to avoid the sort of refutation Bernays has in mind.

29 Tait [2001: 102-03].
is that one should in epistemology take as one’s point of departure the fact of knowledge in established branches of science.30 The stance is similar to the naturalism of later philosophers, though closer to that of Penelope Maddy or to the “substantial factualism” of Hao Wang than to that of Quine. (Wang was certainly influenced by Bernays.) With respect to the physical world, Bernays’ realism is limited by his acceptance of Gonseth’s thesis that theories stand in a “schematic correspondence” to reality.31 This does not directly limit realism about mathematics for Bernays, because according to him mathematics is on the side of the schemata in this correspondence. But he repeatedly describes the structures mathematics is concerned with as idealized. §6. Bernays and structuralism. This point brings us to the other main theme in Bernays’ later philosophy I want to emphasize, his structuralism. In some sense or other, structuralism was part of Hilbert’s view of mathematics and seems to have been adopted by Bernays early in their collaboration and held to for the remainder of his career. When he gave a brief characterization of his philosophy, he typically included “the view according to which mathematics is the science of idealized structures” [1970, 188].32 A structuralist outlook is not distinctive of his later philosophy, although we shall see that in later writings he took important steps toward working it out. There are nowadays many versions of structuralism, and I will not attempt here to characterize Hilbert’s or to relate it to current versions. I will say something later about Bernays’ relation to contemporary structuralism. Bernays’ view of existence in mathematics is laid out most fully in [1950]. Its point of departure, the thesis that “existence, in the mathematical sense, means nothing but consistency” is found in Poincaré’s writings and in Hilbert’s early writings and was in a rough way a motivating idea of Hilbert’s program.
Bernays sees the thesis as opposed to a traditional form of platonism according to which mathematical entities have “ideal being” independent of being thought of and of being “determinations” of something real.33 He criticizes

30 [1948, 275]. Generally use of a French term indicates that the concept is taken from Gonseth, but in this case Bernays says that he is making an analogy with a notion of liberté de fait introduced by Gonseth in writing on freedom. But naturalism in the sense of the text was certainly Gonseth’s view as well.

31 Bernays [1970], esp. part 1.

32 Bernays quotes this in [1976, x], i.e. in the Preface.

33 As we shall see Bernays wishes to avoid this view; that may be a reason why he eschews the label ‘platonism’ for his own view. But he doesn’t use the term at all in this paper. The reason might be that he does not want to repudiate the views of [1935], although he may have thought that some of the formulations of that paper suggested something beyond the methodological stance he was concerned with.
this thesis as doing no methodological work. Although he does not immediately address the vexed question what is meant by ‘independent’, it appears that he gives it quite a strong interpretation. This is indicated by the fact that so much else of what he says is compatible with platonism characterized in roughly this way. To begin with he rejects the nominalist program of eliminating reference to what he calls theoretical (ideell) entities, which in the context we might call abstract entities.34 Bernays goes on to sketch four interpretations of existence statements about theoretical entities, none of which he takes to imply independent existence. (c), existence relative to a structure, will be Bernays’ own central concept. I will leave aside (a) and (b) because Bernays says that they involve “a kind of contentual reduction” (95). (d) is described as follows: Existence of theoretical entities may mean that one is led to such entities in the course of certain reflections. For example, the statement that there are judgments in which relations appear as subjects expresses the fact that we are also led to such “second-order” judgments (as they are called) when forming judgments (95). Since “being led” is in the sense of the objectively appropriate, Bernays denies that there is reduction in this case. Why, then, does he deny that it involves independent “ideal” existence? He remarks: The existence statement is kept within the particular conceptual context, and no philosophical (ontological) question of modality which goes beyond this context is entered into (ibid.). That suggests that in all these cases he has in mind something formally analogous to Carnap’s distinction of internal and external questions, and an affirmation of “independent existence” would be a metaphysician’s affirmative answer to an external question of existence.35 In this general discussion sense (c) of existence for theoretical entities is one of the key concepts of the paper. 
It is existence relative to a structure 34 Like Husserl, Bernays uses the terms ideal and ideell. Following the Bernays project, I render the first as ‘ideal’ and the second as ‘theoretical’; however, its meaning is often closer to ‘abstract’. Bernays’ usage is not particularly close to Husserl’s, and there is no particular reason to think it derived from the latter. 35 When writing the present essay, Bernays is unlikely to have known Carnap [1950], the locus classicus of the distinction of internal and external questions; however, the idea is certainly foreshadowed in earlier writings of Carnap. In his essay [1961] on Carnap, Bernays does not comment on the distinction. However, in [1957], also largely about Carnap, he comments that it is in general incorrect to pose the question “Is there such and such?” in an “absolute sense” and that here the criticism by logical empiricism is justified. As in the present essay, he maintains that questions of existence are sensible only in a specific conceptual context. That suggests that he viewed Carnap’s distinction sympathetically. A relation of Bernays’ and Carnap’s thought at this point is already proposed by Heinzmann [2001: 27].
(in a broad sense), what he calls relative existence (bezogene Existenz). But his route to the development of this idea goes by way of a more direct discussion of the equation of mathematical existence and consistency. Applied in general to existence statements individually, of individual objects or of objects satisfying some condition, the thesis is criticized on generally familiar grounds (96-99). Bernays suggests that its appeal may have rested on a too simple understanding of consistency. Bernays proceeds to consider the case of existence axioms of an axiomatic theory. Here there is an obvious objection to equating existence with consistency, that consistency is a property of the system of axioms as a whole (99). But since the axiom system may be regarded as a description of a structure, the existence claims of the axioms can be understood as statements of relative existence. Thus Bernays comes to the point, much less familiar at the time than it is now, that typical existence statements in mathematics are in the context of a structure. (His example is Euclidean geometry.) The philosophical question is in his view only shifted by this observation, since the question then arises of the existence of the structure. Here, there is some justification for equating existence with consistency, as in the case of non-Euclidean geometry (100). But typically consistency is made out in such cases by the construction of a model, which leads to a potential regress of relative existences. This is the point where Hilbert shortly after 1900 saw the necessity of a syntactic consistency proof as a way to end such a regress. At this point Bernays does not turn so quickly to proof theory. He writes: We finally reach the point at which we make reference to a theoretical framework (ideeller Rahmen). It is a thought-system that involves a kind of methodological attitude; in the final analysis, the mathematical existence posits relate to this thought-system (100). 
He says that mathematical experience has tested the consistency of this framework to such an extent that there is “de facto no doubt about it”, and that is what is needed for the existence posits made within the framework, which are presumably of relative existence. Bernays is not as explicit as one would wish as to what this framework might be. He probably thinks there are different options even within classical mathematics; a little later he speaks of “indeterminacies” in demarcating a framework (101). The somewhat obscure remarks about the number system (ibid.) are clearly meant to be illustrative. His point is perhaps that it is not so clear as appears at first sight what positing the number system consists in. Although in the first instance it is a matter of the existence of objects, conceptual understanding and with it logic are essentially involved. Bernays’ thesis that mathematical practice operates within a framework that is not uniquely determined is another point of contact between his thought and that of Carnap. For Carnap, a framework is essentially linguistic and
specifiable precisely with the methods of mathematical logic, a formal system at least in a generalized sense, since the characterization may involve infinitary rules. Bernays’ description of the framework as a “thought-system” implies that he thinks of it in a more informal and perhaps mentalistic way. However, he clearly thinks that a framework can be delineated more or less precisely and that axiomatization and formalization are means of attaining greater precision. The importance of the question how he precisely thinks of the concept of framework is somewhat diminished by a point on which he places some emphasis, that a precisely delineated framework for mathematics intends “a certain domain of mathematical reality” that is at least to a certain degree independent of the “particular configuration” of the framework (102). What he has specifically in mind is the fact that different axiomatizations and constructions (e.g. of the real numbers) can be equivalent even if they differ in their ontology. Bernays seems to recognize that the concept of relative existence loses some of its sharpness by its being viewed in this way (104). He adds the observation that mathematical reality is not fully exhausted by a delimited framework, what Gödel called the inexhaustibility of mathematics. Bernays’ understanding of typical mathematical existence statements in terms of relative existence makes his view a structuralist view in a general sense. However, he does not pursue an idea central to later structuralist views of mathematical objects, which are based on the idea that such objects have no more in the way of a “nature” than is given by the basic relations of a structure to which they belong.36 Bernays mentions two difficulties for his perspective on mathematical existence, that it might reinstate the “ideal existence” that he has rejected, and that it might be a form of relativism.
He considers what he has already said about the independence of mathematical reality of the particulars of a framework as a sufficient answer to the second objection. About the first, he makes the observation that in natural science the central assertions of existence are of the “factually real”, and other talk of existence, for example of laws, “appears as mere improper existence” (104). There is no such contrast in mathematics. “It is not a question of being (Dasein) but of relational, structural connections and the emergence (being induced) of theoretical entities from other such entities” (ibid.). To me that leaves the rejected sense of independent existence somewhat elusive. I don’t believe that Bernays would have taken the Carnapian line that, to the extent that they make sense at all, answers to external questions are simply pragmatic recommendations of one type of framework over another. 36 Thus I would no longer affirm what I wrote in [1990: 303], that the paper gives a “clear general statement” of such a view.
I have gone into some detail about this paper in part because I consider it the best of Bernays’ post-war philosophical papers and because it is little known, although it is surely one of the most important contributions to mathematical ontology of its time, not as sharply formulated as the contemporary writings of Carnap and Quine but more sensitive to actual mathematics. Even to this day it has excited little comment in the literature.37 What is distinctive about Bernays’ structuralism? As I have said, he does not say that mathematical objects have no more of a nature than is given by the basic relations of a structure to which they belong. It is not clear how much he considered that view.38 His paper develops what could be regarded as a major difficulty: that basic structures of arithmetic, analysis, and geometry can be constructed in ways that are fundamentally equivalent but differ in the choice of basic relations. A point on which contemporary structuralisms differ from each other concerns how to understand the concept of structure and what role to give the concept. Bernays talks freely of structures in his philosophical writing but does not say how he understands the existence of structures in relation to the discussion in [1950]. That might make it tempting to attribute to him the view that structures are primitive mathematical entities, as seems to be the case for Stewart Shapiro. But he does not commit himself on this question, and I do not know what he would have thought of the views on this question that have been advanced in the contemporary discussion of structuralism. §7. What can we say about Bernays’ achievement as a philosopher? The present discussion cannot be the basis for a complete answer to this question, both because of the period it concentrates on and because of its selectiveness as to topics. 
Bernays played a major role in developing and articulating the philosophical views accompanying Hilbert’s program, but that has been largely outside our purview. I have also done little to explore the question of his debt to others, notably Nelson and Gonseth. Bernays’ virtues as a writer on philosophy of mathematics are evident: contact with actual mathematics, especially mathematical logic, combined with familiarity with the issues that concern philosophers and sensitivity to 37 I speculate on the reasons for this neglect in my introduction to the paper in Bernays [200?], on which the present discussion draws. 38 Wilfried Sieg [2002: 378] points out that in [1930, 21] Bernays sketches the logical move characteristic of if-thenism or deductivism, the simplest of the forms of structuralism in this sense aspiring to eliminate reference to mathematical objects, and, by way of motivating concern with consistency, notes the problem of vacuity (the nonexistence of a system of objects satisfying given axioms) that this proposal faces. (Cf. Parsons [1990: 311-13].) Sieg may overinterpret in writing that Bernays “presents the standard account” of if-thenism. First, he does not propose it as a method for eliminating reference to mathematical objects. Second, although he does describe the knowledge that a proof from certain axioms yields as expressible as a statement of pure logic, he also describes it as a statement about predicates, without saying anything about what sort of entities predicates are.
the difficulties that philosophical positions are prone to. Probably the last is the most difficult for a mathematician to achieve.39 In the aspects of his work I have canvassed in this essay, one can discern some philosophical contributions that have proved enduring: his stratification of mathematical conceptions in [1935], the observation in that paper that points to a distinction between finitism and intuitionism on one side and “strict finitism” on the other (although he made no attempt then or later to articulate strict finitism as a foundational stance), possibly the “quasi-combinatorial” description of the conception of arbitrary subset of an infinite set. Others have not been widely attended to but stand up well in the light of discussion in the philosophy of mathematics, for example the idea of acquired evidence and the accompanying idea of intellectual experience, and the discussion of mathematical ontology in [1950] with its thoughtful exploration of issues that would nowadays surround different forms of structuralism. On the other hand the essayistic mode of writing and the absence of a systematic drive mean that many ideas are not very much worked out. His judiciousness in discussing different views and his loyalty to earlier associations sometimes make his stance border on eclecticism. Fortunately for us this tendency is more noticeable in general philosophical discussion than when Bernays is focused on mathematics.40 One can hardly question the claim that Bernays was an acute and well-informed commentator on issues in the foundations of mathematics, with a high degree of philosophical literacy. I have claimed some genuine philosophical contributions, but their extent might be disappointing given the amount Bernays wrote. I will not attempt to assess whether that disappointment would be overcome by exploring the issues I have not taken up here.

WRITINGS OF PAUL BERNAYS
[1922] Die Bedeutung Hilberts für die Philosophie der Mathematik, Die Naturwissenschaften, vol. 10 (1922), no. 4, pp. 93–99. Translation, Mancosu [1998].
[1928] Über Nelsons Stellungnahme in der Philosophie der Mathematik, Die Naturwissenschaften, vol. 16 (1928), no. 9, pp. 142–145.
[1930] Die Philosophie der Mathematik und die Hilbertsche Beweistheorie, Blätter für deutsche Philosophie, vol. 4 (1930), pp. 326–367. Reprinted in [1976] with a postscript. Cited according to reprint. Translation, Mancosu [1998].
[1930a] Die Grundgedanken der Fries’schen Philosophie in ihrem Verhältnis zum heutigen Stand der Wissenschaften, Abhandlungen der Fries’schen Schule N. F., vol. 5 (1930), pp. 99–113.
[1934] (with David Hilbert) Grundlagen der Mathematik, Erster Band, Springer, Berlin, 1934. Cited according to 2d ed., 1968.

39 Bernays’ membership in the schools of Nelson and later Gonseth probably sharpened his abilities in this respect by engaging him in philosophical debates.

40 What seems to me an example is the attempt in [1955a, §§8-9] to reconstruct Kantian transcendental idealism with the help of Gonseth’s idea of “schematic correspondence” of theories and reality.
[1935] Sur le platonisme dans les mathématiques, L’enseignement mathématique, vol. 34 (1935), pp. 52–69. Translation, Philosophy of mathematics: selected readings (Paul Benacerraf and Hilary Putnam, editors), 2d ed., Cambridge University Press, 1983.
[1937] Grundsätzliche Betrachtungen zur Erkenntnistheorie, Abhandlungen der Fries’schen Schule, N. F., vol. 6 (1937), pp. 275–290.
[1939] Bemerkungen zur Grundlagenfrage, appendix iv to Ferdinand Gonseth, Philosophie mathématique, Actualités scientifiques et industrielles, vol. 837, Hermann, Paris, 1939.
[1946] Quelques points de vue concernant le problème de l’évidence, Synthese, vol. 5 (1946), pp. 321–326.
[1947] Zum Begriff der Dialektik, Dialectica, vol. 1 (1947), pp. 172–175.
[1948] Grundsätzliches zur “philosophie ouverte”, Dialectica, vol. 2 (1948), pp. 273–279.
[1950] Mathematische Existenz und Widerspruchsfreiheit, Etudes de philosophie des sciences, en hommage à Ferdinand Gonseth à l’occasion de son soixantième anniversaire, Editions du Griffon, Neuchâtel, 1950, pp. 11–25. Reprinted in [1976]; cited according to reprint.
[1952] Dritte Gespräche von Zürich, Dialectica, vol. 6 (1952), pp. 130–140.
[1955] Die Mathematik als ein zugleich Vertrautes und Unbekanntes, Synthese, vol. 9 (1955), pp. 465–471. Reprinted in [1976]; cited according to reprint.
[1955a] Zur Frage der Anknüpfung an die kantische Erkenntnistheorie, Dialectica, vol. 9 (1955), pp. 23–65, 195–221.
[1957] Von der Syntax der Sprache zur Philosophie der Wissenschaften, Dialectica, vol. 11 (1957), pp. 233–246.
[1959] Betrachtungen zu Ludwig Wittgensteins “Bemerkungen über die Grundlagen der Mathematik”, Ratio, vol. 1 (1959), pp. 1–18. Reprinted in [1976]; cited according to reprint. Translation in the simultaneous English edition of Ratio.
[1960] Charakterzüge der Philosophie Gonseths, Dialectica, vol. 14 (1960), pp. 151–156.
[1961] Zur Rolle der Sprache in erkenntnistheoretischer Hinsicht, Synthese, vol. 13 (1961), pp. 185–200. Reprinted in [1976]; cited according to reprint.
[1965] Some empirical aspects of mathematics, Information and Prediction in Science (S. Dockx and Bernays, editors), Academic Press, New York and London, 1965, pp. 123–128.
[1969] Bemerkungen zur Philosophie der Mathematik, Akten des XIV. Internationalen Kongresses für Philosophie, Herder, Wien, 1969, pp. 192–198. Reprinted in [1976]; cited according to reprint.
[1970] Die schematische Korrespondenz und die idealisierten Strukturen, Dialectica, vol. 24 (1970), pp. 53–66. Reprinted in [1976]; cited according to reprint.
[1976] Abhandlungen zur Philosophie der Mathematik, Wissenschaftliche Buchgesellschaft, Darmstadt, 1976.
[1976a] Kurze Biographie, Sets and Classes: On the Work by Paul Bernays (Gert H. Müller, editor), North-Holland, Amsterdam, 1976, pp. xiv–xvi. (It should be noted that the English version in this volume is abbreviated and not always accurate.)
[1977] Überlegungen zu Ferdinand Gonseths Philosophie, Dialectica, vol. 31 (1977), pp. 119–128.
[200?] Selected essays in the philosophy of mathematics, (Wilfried Sieg and W. W. Tait, editors), Open Court, Chicago and La Salle, Ill., forthcoming.

OTHER WRITINGS
Rudolf Carnap [1950], Empiricism, semantics, and ontology, Revue internationale de philosophie, vol. 4, pp. 20–24. Reprinted in 2d ed. of Meaning and necessity, University of Chicago Press, 1956.
Miriam Franchella [2005], Paul Bernays’ philosophical way, Grazer philosophische Studien, vol. 70, pp. 47–66. (Published 2006.)
Michael Friedman [1992], Kant and the Exact Sciences, Harvard University Press, Cambridge, Mass.
Kurt Gödel [2003], Collected Works, volume IV, Correspondence A-G (Solomon Feferman et al., editors), Clarendon Press, Oxford.
Ferdinand Gonseth (editor) [1941], Les entretiens de Zurich sur les fondements et la méthode des sciences mathématiques, 6-9 décembre 1938, Leeman, Zurich.
Gerhard Heinzmann [2001], Paul Bernays et la philosophie ouverte, Logic and Set Theory in 20th Century Switzerland (James Gasser and Henri Volken, editors) Bern. Online at www.philosophie.ch.
David Hilbert [1918], Axiomatisches Denken, Mathematische Annalen, vol. 78, pp. 405–415.
Paolo Mancosu (editor) [1998], From Brouwer to Hilbert: The Debate on the Foundations of Mathematics in the 1920s, Oxford University Press.
Leonard Nelson [1928], Kritische Philosophie und mathematische Axiomatik, Unterrichtsblätter für Mathematik und Naturwissenschaften, vol. 34, pp. 108–115, 136–142 (with preface by David Hilbert). Reprinted with further preface and notes by Wilhelm Ackermann in Beiträge zur Philosophie der Logik und Mathematik, Verlag Öffentliches Leben, Frankfurt a. M., 1959, and in Gesammelte Schriften (Paul Bernays et al., editors), vol. III, Meiner, Hamburg, 1974.
Charles Parsons [1990], The structuralist view of mathematical objects, Synthese, vol. 84, pp. 303–346.
Charles Parsons [1998], Finitism and intuitive knowledge, The Philosophy of Mathematics Today (Matthias Schirn, editor), Clarendon Press, Oxford, pp. 249–270.
Charles Parsons [2002], Realism and the debate on impredicativity, 1917–1944, Reflections on the Foundations of Mathematics: Essays in Honor of Solomon Feferman (Richard Sommer, Wilfried Sieg, and Carolyn Talcott, editors), Lecture Notes in Logic 15, Association for Symbolic Logic, Urbana, and A.K. Peters, Natick, pp. 372–389.
Constance Reid [1970], Hilbert, Springer-Verlag, Berlin.
Wilfried Sieg [2002], Beyond Hilbert’s reach?, Reading Natural Philosophy: Essays in the History and Philosophy of Science and Mathematics (David B. Malament, editor), Open Court, Chicago and La Salle, Ill., pp. 363–405.
W. W. Tait [2001], Gödel’s unpublished papers on foundations of mathematics, Philosophia mathematica, series III, vol. 9, pp. 87–126.
Hao Wang [1958], Eighty years of foundational studies, Dialectica, vol. 12, pp. 466–497.
Richard Zach [2003], The practice of finitism: Epsilon calculus and consistency proofs in Hilbert’s program, Synthese, vol. 137, pp. 211–259.

DEPARTMENT OF PHILOSOPHY
EMERSON HALL
HARVARD UNIVERSITY
CAMBRIDGE, MA 02138, USA
PROOFNETS FOR S5: SEQUENTS AND CIRCUITS FOR MODAL LOGIC
GREG RESTALL
Abstract. In this paper I introduce a sequent system for the propositional modal logic S5. Derivations of valid sequents in the system are shown to correspond to proofs in a novel natural deduction system of circuit proofs (reminiscent of proofnets in linear logic [9, 15], or multiple-conclusion calculi for classical logic [22, 23, 24]). The sequent derivations and proofnets are both simple extensions of sequents and proofnets for classical propositional logic, in which the new machinery (to take account of the modal vocabulary) is directly motivated in terms of the simple, universal Kripke semantics for S5. The sequent system is cut-free (the proof of cut-elimination is a simple generalisation of the systematic cut-elimination proof in Belnap’s Display Logic [5, 21, 26]) and the circuit proofs are normalising.
Logic Colloquium ’05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.

Thanks to Conrad Asmus, Lloyd Humberstone and Allen Hazen for many helpful discussions when I was developing the material in this paper, and to Bryn Humberstone, Adrian Pearce, Graham Priest and the rest of the audience of the University of Melbourne Logic Seminar for encouraging feedback. Thanks also to audiences at Logic Colloquium 2005 in Athens and ESSLLI’05 in Edinburgh, where I presented this material in courses on proof theory, to seminar audiences at Bath, Nottingham, Oxford, Paris and Utrecht, and in particular to Samson Abramsky, Denis Bonnay, Bob Coecke, Matthew Collinson, Giovanna Corsi, Melvin Fitting, Alessio Guglielmi, Barteld Kooi, Hannes Leitgeb, Neil Leslie, Øystein Linnebo, Richard McKinley, Sara Negri, Luke Ong, David Pym, Helmut Schwichtenberg, Benjamin Simmenauer, Phiniki Stouppa, Albert Visser and Richard Zach for fruitful conversations on these topics, and to the Philosophy Faculty at the University of Oxford and Dan Isaacson, where this paper was written. Thanks to an anonymous referee for comments that helped clarify a number of issues. Comments posted at http://consequently.org/writing/s5nets are most welcome. This research is supported by the Australian Research Council, through grant DP0343388.

This paper arises out of the lectures on philosophical logic I presented at Logic Colloquium 2005. Instead of presenting a quick summary of the material in the course, I have decided to write up in a more extended fashion the results on proofnets for S5. I think that this is the most original material covered in the lectures, and the techniques and ideas presented here give a flavour of the approach to proof theory I took in the rest of the material in those lectures.

The modal logic S5 is the most straightforward propositional modal logic, at least when you consider its models. The Kripke semantics for S5 is just about the smallest modification to classical propositional logic that you can
make once you add the idea that propositions may vary in truth value from context to context. We add just one new operator, □, with the proviso that □A is true in a context when and only when A is true in every context. (The dual operator ◇ is definable in terms of □ in the usual way. We could start with ◇ as primitive, and then □ is the definable connective. Nothing hangs here on the choice of □ as primitive.)

The modal logic S5 has very simple models. A (universal) S5 frame is a non-empty set P of points. An evaluation relation ⊩ is an arbitrary relation between points and atomic formulas. A (universal) S5 model ⟨P, ⊩⟩ is a frame together with an evaluation relation on that frame. Given a model, the evaluation relation may be extended to the entire modal language as follows:
• x ⊩ A ∧ B iff x ⊩ A and x ⊩ B.
• x ⊩ ¬A iff x ⊮ A.
• x ⊩ □A iff for every y ∈ P, y ⊩ A.
A formula □A is true at a point just when A is true at all points. In this case, A is not merely contingently true, but is unavoidably, or necessarily true. (We utilise the primitive vocabulary {∧, ¬, □}, leaving ∨ and → as defined connectives in the usual manner. In addition, the modal operator ◇ for possibility is definable as ¬□¬.)

A formula A is S5-valid if and only if for every model ⟨P, ⊩⟩, for every point x ∈ P, we have x ⊩ A. An argument from premises X to a conclusion A is S5-valid if and only if for each model ⟨P, ⊩⟩, for every x ∈ P, if x ⊩ B for each B ∈ X, then x ⊩ A also. Clearly every classical tautology, and every classically valid argument, is S5-valid. Here are some examples of distinctively modal S5 validities.

    □(A → B) ⊢ □A → □B
    □A ∧ □B ⊢ □(A ∧ B)
    ⊢ □(A ∨ ¬A)
    □A ⊢ A
    □A ⊢ □□A
    A ⊢ □¬□¬A

When it comes to models, S5 is simple. Models for other modal logics complicate things by relativising possibility. (A point y is possible from the point of view of the point x, and to evaluate □A at point x, we consider merely the points that are possible relative to x.) You can then find interesting modal logics by constraining the behaviour of relative possibility in some way or other (is it reflexive, transitive, etc.). The logic S5 can be seen as a system in which relative possibility has disappeared (possibility is unrelativised) or equivalently, as one in which relative possibility has a number of conditions governing it: typically, reflexivity, transitivity and symmetry. Once relative possibility is an equivalence relation, from the perspective of a point inside some equivalence class you can ignore the points outside that class with no effect on the satisfaction of formulas, and the model may as well be universal. In other words, you can consider S5 as a logic in which there is not much machinery at all (there is no relation of relative possibility) or as one in which there is quite a bit of machinery (we have a notion of relative possibility with a number of conditions governing it). This difference in perspectives plays a role when it comes to the proof theory for S5.

Despite the simplicity of the formal semantics, providing a natural account of proof in S5 has proved to be a difficult task. We have little idea of what a natural account of proofs in S5 might look like. There are sequent systems for S5, but the most natural and straightforward of these are not cut-free [20]. The cut-free sequent systems in the literature tend to be quite complicated [10, 19], partly because they treat S5 as a logic with many rules (that is, the systems cover many modal logics and S5 is treated as a logic in which relative possibility has a number of features, so we have many different rules governing the behaviour of relative possibility), or they are quite some distance from Gentzen’s straightforward sequent system for classical propositional logic [5, 26].¹ On the other hand, sequent systems can be modified by multiplying the kind or number of sequents that are considered [3, 16], or by keeping a closer eye on how formulas are used in a deduction [7]. These approaches are closest to the one that I shall follow here, but the present approach brings something new to the discussion.

¹ Display logic is a fruitful way of constructing sequent systems for a vast range of logical systems, but it comes at the cost of a significant distance from traditional sequent systems. We do not extend the sequent system for classical logic with new machinery to govern modality. We must strike at the heart of the sequent system to replace the rules for negation, at the cost of a proliferation of the number of sequent derivations.

In this paper I introduce and defend a simple sequent system for S5. The main novelty is that the generalisation of sequents in this system (superficially similar, at least, to hypersequents [3]) has a straightforward interpretation both in terms of the models for S5, and in terms of natural deduction proofs for this modal logic. Sequent derivations are, in a clear and principled manner, descriptions of underlying proofs.

§1. Motivations. Our aim is to defend a simple, cut-free sequent calculus for the modal logic S5, in which derivations correspond in some meaningful way to constructions of proofs. The guiding idea for this quest looks back to the original motivation of the sequent system for intuitionistic propositional logic [13]. For Gentzen, a derivation of an intuitionistic sequent of the form X ⊢ A is not merely a justification of the inference from X to A, and the sequent system is not merely a collection of rules with some pleasing formal properties (each connective having a left rule and a right rule, the subformula property, etc.). Instead, the derivation can be seen as a recipe for the construction of a natural deduction proof of the conclusion A from the premises X. For example, the derivation of the sequent
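A → B ⊢ (C → A) → (C → B):

    A ⊢ A    B ⊢ B
    ──────────────
    A → B, A ⊢ B        C ⊢ C
    ─────────────────────────
    A → B, C → A, C ⊢ B
    ────────────────────
    A → B, C → A ⊢ C → B
    ─────────────────────────
    A → B ⊢ (C → A) → (C → B)

may be seen to guide the construction of the following natural deduction proof.

                 [C → A]$    [C]∗
                 ────────────────
        A → B           A
        ─────────────────
                B
            ───────── (∗)
              C → B
        ───────────────── ($)
        (C → A) → (C → B)
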
However, a proof may be constructed in more than one way. The first three lines of the proof (from A → B, C → A, C to B) may be analysed by the different derivation

    A ⊢ A    C ⊢ C
    ──────────────
    C → A, C ⊢ A        B ⊢ B
    ─────────────────────────
    A → B, C → A, C ⊢ B

In this case, the natural deduction proof constructed is no different, but the analysis varies. Instead of thinking of the tree as starting with a proof from A → B and A to B (that is, A → B, A ⊢ B) and then justifying the premise A by means of the addition of the two extra premises C → A and C, we think of the proof as starting with the proof from C → A and C to A, and then we add the premise A → B to deduce B. So, the sequent rules

    X ⊢ A    Y, B ⊢ C              X, A ⊢ B
    ───────────────── [→L]         ───────── [→R]
    X, Y, A → B ⊢ C                X ⊢ A → B

can be seen as being motivated and justified by considerations of natural deduction inferences. The rule [→L] can be motivated by the thought that if we have a proof π1 of A from X and another proof π2 from B to C (with extra premises Y) then we may use π1 to deduce A from X, and use the new premise A → B to deduce (using an implication elimination in the natural deduction system) B. Now using Y and the newly justified B, we may add the proof π2 to deduce the desired conclusion C. In other words, X, Y, A → B ⊢ C. The rule [→R] is motivated similarly. If we have a proof from X, A to B, then we may discharge A to deduce X ⊢ A → B.²

² There are niceties here about how many instances of A are discharged, and whether sequents have lists, multisets, or sets of formulas on the left-hand side. Most likely the structural rule of contraction will play a role at this point.

These two derivations of the sequent A → B, C → A, C ⊢ B differ in the order of the application of the [→L] rules. In some sense, this difference is merely “bureaucratic”: the sequent system imposes a difference (you must apply either this rule or that rule first) when the natural deduction proof does not (the rules are applied; the order is only imposed when we decide to read the proof from top to bottom, or from bottom to top, or from the inside out or in some other way). There is an important sense in which the sequent system, as a theory of proof, is parasitic on a prior notion of proof found in natural deduction. Some of the merely bureaucratic differences in the sequent calculus are absent from the natural deduction system.

This increase in bureaucracy is not without its virtues, of course. The sequent system makes explicit what is implicit in natural deduction proofs. The sequent A → B, C → A, C ⊢ B tells us quite explicitly that at the stage of the proof at which B is the conclusion, the premises A → B, C → A and C are all undischarged. This can only be “read off” the natural deduction proof with some skill. You must look down from B to notice that the two discharges (∗) and ($) occur below, and hence that at the point of the proof where B is deduced, C → A and C are still active.

In the rest of this paper, I aim to do the same thing for the modal logic S5. Instead of taking the sequent calculus for classical propositional logic and modifying it, we will first endeavour to construct a natural deduction proof theory for S5, and from this, reconstruct a sequent calculus that makes explicit the kinds of implicit inferential relationships between premises and conclusions that are found in our proofs.

§2. Classical circuits. The sequent calculus for classical logic uses sequents with multiple formulas on each side of the turnstile: they have the form X ⊢ Y where both X and Y may involve more than one (or fewer than one) formula. If a derivation of the sequent X ⊢ A constructs a proof from premises X to conclusion A, then it is natural to think of a derivation ending in X ⊢ Y as constructing a proof with the formulas in X as premises or inputs and the formulas in Y as conclusions, or outputs. We could think of a proof as having a shape like this:

[Figure: a proof with the premises A1, A2, ···, An as inputs and the conclusions B1, B2, ···, Bm as outputs.]

This is a very natural idea. It goes back at least to William Kneale, who introduced his tables of development in the 1950s [17]. The simple natural deduction rules for conjunction and negation are these:

    A    B        A ∧ B        A ∧ B        A    ¬A
    ──────        ─────        ─────        ───────        ───────
    A ∧ B           A            B                         A    ¬A

Tables of development are found by chaining basic inferences together formula-to-formula. Here is a proof of the conclusion ¬(A ∧ ¬A).

    ────────────────────        ────────────────────
    ¬(A ∧ ¬A)     A ∧ ¬A        A ∧ ¬A     ¬(A ∧ ¬A)
                  ──────        ──────
                    A             ¬A
                    ────────────────

Notice that it has two instances of the one conclusion ¬(A ∧ ¬A). (This phenomenon is just like the case of the simple Gentzen-style natural deduction proof of A ∧ ¬A ⊢ ⊥, which has two instances of the premise A ∧ ¬A: one to justify A and the other to justify ¬A, which are then combined to infer the falsum ⊥.) In what follows, we will call this proof of ¬(A ∧ ¬A) ‘π’. The proof corresponds to a derivation of the sequent ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A). In the sequent calculus we may chain two instances of this derivation together with an application of a [∧R] rule, to derive ⊢ ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A).

    ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A)             ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A)
    ────────────────────── [WR]        ────────────────────── [WR]
    ⊢ ¬(A ∧ ¬A)                        ⊢ ¬(A ∧ ¬A)
    ───────────────────────────────────────────────── [∧R]
    ⊢ ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A)

This (essentially) utilises the rule of contraction on the right of the turnstile. (The steps labelled “WR”.) There is no corresponding move in the natural deduction system. If we want to introduce a conjunction, we are free to paste together two instances of π

[Figure: two copies of the proof π side by side, each with two conclusions ¬(A ∧ ¬A); one conclusion from each copy is joined to give ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A), and the other two conclusions are left over.]

but as you can see, we have leftover conclusions ¬(A ∧ ¬A). Each time we add another proof to provide another conjunct for one conclusion, we add another unconjoined instance of ¬(A ∧ ¬A). This would not matter if there were a proof which concluded in merely one instance of ¬(A ∧ ¬A), but it is easy to see that with these rules there is no such proof. (Proceed by way of an induction on the construction of a proof: every proof has at least either two conclusions, or two premises, or one premise and one conclusion. So, each proof with no premises has at least two conclusions.) Tables of development, as defined here, are incomplete for classical logic.³

Tables of development face a more prosaic problem: they are difficult to typeset. It turns out that we can solve both of our problems, the notational problem and the contraction problem, in one go. It is much more flexible to change our notation completely. Instead of taking proofs as connecting formulas in inference steps, in which formulas are represented as characters on a page, ordered in a tree, think of proofs as taking inputs and outputs, where we represent the inputs and outputs as wires. Wires can be rearranged willy-nilly (we are all familiar with the tangle of cables behind the stereo or under the computer desk), so we can exploit this to represent cut straightforwardly. In our pictures, then, formulas label wires. This change of representation will afford another insight: instead of thinking of the rules as labelling transitions between formulas in a proof, we will think of inference steps (instances of our rules) as nodes with wires coming in and wires going out. Proofs are then circuits composed of wirings of nodes. The nodes for the connectives are then:

[Figure: the nodes for the connectives. ¬I has no input wires and two output wires, A and ¬A. ¬E has two input wires, A and ¬A, and no output wires. ∧I has input wires A and B and output wire A ∧ B. ∧E1 has input wire A ∧ B and output wire A. ∧E2 has input wire A ∧ B and output wire B.]

The proof for ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A) is now represented as follows:

[Figure: a circuit with two ¬I nodes, each with output wires ¬(A ∧ ¬A) and A ∧ ¬A. One A ∧ ¬A wire enters an ∧E1 node, whose output is A; the other enters an ∧E2 node, whose output is ¬A. The wires A and ¬A enter a ¬E node. The two wires labelled ¬(A ∧ ¬A) are the outputs of the circuit.]

(The arrow notation for wires allows us to lay proofs out in a way that inference need not go from the top of the page to the bottom of the page.)

³ Patching the system is not a simple matter. The canonical references here are Shoesmith and Smiley’s Multiple Conclusion Logic [23] and Ungar’s Normalisation, Cut-Elimination, and the Theory of Proofs [24].

We can construct a circuit with one conclusion wire by contracting the two original conclusions like this:

[Figure: the same circuit, with its two output wires ¬(A ∧ ¬A) plugged into a WI node, whose single output wire is ¬(A ∧ ¬A).]

The new WI node corresponds to the contraction of the two conclusions into one in the sequent proof. We can then combine these proofs to obtain the proof of the desired conclusion: ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A).

[Figure: two copies of the contracted circuit, each ending in a WI node with output wire ¬(A ∧ ¬A). The two output wires enter an ∧I node, whose single output wire is ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A).]

There is much more that one can say about classical circuits. The first detailed presentation of classical proofnets is found in Robinson’s 2003 paper [22]. Our style of presentation here follows Blute, Cockett, Seely and Trimble’s work on weakly (or linearly) distributive categories [6]. I will leave the detail for the next section, in which we introduce modal operators.

§3. S5 circuits. We hope to find rules for introducing a □-formula, and for eliminating a □-formula. If these rules are to be anything like the rules in a natural deduction system, they should step from □A to A, and vice versa:

    □A                A
    ─── [□E]          ─── [□I]
     A                □A

From □A, we can infer A. Similarly, from A (at least, sometimes) we can infer □A. The analogy with rules for the universal quantifier should be clear. From ∀xFx we infer Fa, and if we have derived Fa in a special way (the a is arbitrary) we may infer ∀xFx. In the modal setting, we do not have something playing the role of names. So, we need some other way to ensure that [□E] is stronger than it appears (in the quantifier case, we may infer Fa for any object a) and that [□I] is weaker than it appears (what is the restriction on its application, corresponding to the condition on names for ∀x?).

Consider models for the modal operators: if □A is true at a point, what can we infer about A? It follows that A is true at every point: not just the point at which we derived (or assumed) □A. So, if we infer A from □A, we are free to infer A not only here (in this context) but also there (whatever other context “there” might be). So, we can think of the output A wire in the [□E] node as freely ‘applying to’ a context other than the one in which we have evaluated □A. This is the sense in which [□E] is stronger than the inference that merely steps from □A to A without allowing a change of context.

Consider [□I]. Under what conditions can our inference of A justify the step to □A? We can infer □A when our inference to A is general, that is, when we have inferred A at an arbitrary context. What does it mean for a context to be arbitrary? Here we take our cue from the proof theory for predicate logic. We can infer ∀xFx from some proof of Fa just when the conclusion Fa is the only part of the proof (premises or conclusion) to contain information about a (that is, to be a formula containing the name a). We can do the same thing here. If we have all of the other premises and conclusions in our proof applying to a collection of contexts, and only the conclusion A applies to some further context, then we can infer □A, since that context was arbitrary. We have the conclusion of A generally, in a manner which is appropriate for any context.

But contexts are not like names in predicate logic: they do not explicitly show up in the syntax of the logic S5. All that this talk of contexts requires is that we pay attention to whether or not a formula in a proof occurs in the same context as another formula. We can make these suggestive ideas more precise in the following way. We start by defining the class of inductively generated circuits, and the equivalence relation of nearness on wires in a circuit.

Definition (inductively generated circuits, nearness). Inductively generated circuits are defined in the following manner.
• An identity wire labelled A, for any formula A, is an inductively generated circuit. The sole input type for this circuit is A and its output type is also (the very same instance) A. As there is only one wire in this circuit, it is near to itself.
• Each boolean connective node presented below is an inductively generated circuit.

[Figure: the boolean connective nodes. ¬E has input wires A and ¬A and no output wires. ¬I has no input wires and output wires A and ¬A. ∧I has input wires A and B and output wire A ∧ B. ∧E1 has input wire A ∧ B and output wire A. ∧E2 has input wire A ∧ B and output wire B.]

The inputs of a node are those wires pointing into the node, and the outputs of a node are those wires pointing out. The input and output wires of each of these nodes are in the same nearness equivalence class.
• Given an inductively generated circuit π with an output wire labelled A, and an inductively generated circuit π′ with an input wire labelled A, we obtain a new inductively generated circuit in which the output wire of π is plugged in to the input wire of π′. The output wires of the new circuit are the output wires of π (except for the indicated A wire) and the output wires of π′, and the input wires of the new circuit are the input wires of π together with the input wires of π′ (except for the indicated A wire). A wire in the new circuit is near another wire if and only if either those two wires are near in π or near in π′, or one wire is near to the output A in π and the other wire is near to the input A in π′. (In other words, the equivalence classes for nearness on the new circuit are those classes in the old circuits, except for the classes for the wire at the point of composition. The two classes for this wire are merged.)
• Given an inductively generated circuit π with two input wires A, a new inductively generated circuit is formed by plugging both of those input wires into the input contraction node WE. In the new circuit, the nearness relation is the same as the original relation, except that the classes for the two contracted input wires are merged, and the new single input A is in the same class. Similarly, two output wires with the same label may be extended with a contraction node WI. The two output wires are now near in the new circuit, as before.
• Given an inductively generated circuit π, we may form a new circuit with the addition of a new input or output wire (with an arbitrary label) using a weakening node KI or KE.⁴

[Figure: the circuit π with input wires X and output wires Y, extended either with a KI node supplying a new output wire B, or with a KE node taking a new input wire B.]

The new wires are not near any other wires in the proof. (They are arbitrary extra conclusions or premises, and they could well be in any context.)
• A □E node is also an inductively generated circuit. In this node, the input wire □A is not nearby the output wire A.

[Figure: the □E node, with input wire □A and output wire A.]

• Given an inductively generated circuit π in which a conclusion wire A is not nearby any other conclusion wire, and is not nearby any premise wire, then the result of plugging in □I to the conclusion wire A is a new inductively generated circuit. The new conclusion □A is not nearby any other wire of the circuit.

[Figure: the □I node, with input wire A and output wire □A.]

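The bookkeeping that this definition asks for can be sketched in code. The Python sketch below is an illustration only, under stated assumptions: the names `Circuit`, `node`, `plug`, `box_intro` and `wire_of`, the use of integers as wire identifiers, and the use of plain strings as formula labels are all hypothetical choices, not part of the definition. The sketch records only the periphery of a circuit and its nearness classes, and it omits identity wires, contraction and weakening. It builds the circuit with input A and output □¬□¬A from a ¬I node, a □E node and a ¬E node, and then applies [□I].

```python
import itertools

_fresh = itertools.count()  # supplies fresh wire identifiers (an assumption of this sketch)


class Circuit:
    """Periphery of a circuit: labelled input/output wires and nearness classes."""

    def __init__(self, inputs, outputs, classes):
        self.inputs = dict(inputs)                # wire id -> formula label
        self.outputs = dict(outputs)              # wire id -> formula label
        self.classes = [set(c) for c in classes]  # partition of all wire ids

    def cls(self, wire):
        """The nearness class containing the given wire."""
        return next(c for c in self.classes if wire in c)

    def near(self, w1, w2):
        return w2 in self.cls(w1)


def node(in_labels, out_labels, near=True):
    """A single node; its wires share one class unless near=False (as for □E)."""
    ins = {next(_fresh): lab for lab in in_labels}
    outs = {next(_fresh): lab for lab in out_labels}
    wires = list(ins) + list(outs)
    classes = [set(wires)] if near else [{w} for w in wires]
    return Circuit(ins, outs, classes)


def plug(c1, out_wire, c2, in_wire):
    """Compose: plug an output of c1 into an equally labelled input of c2."""
    if c1.outputs[out_wire] != c2.inputs[in_wire]:
        raise ValueError("wire labels must match")
    ins = {**c1.inputs, **{w: l for w, l in c2.inputs.items() if w != in_wire}}
    outs = {**{w: l for w, l in c1.outputs.items() if w != out_wire}, **c2.outputs}
    merged = c1.cls(out_wire) | c2.cls(in_wire)   # the two classes are merged
    rest = [c for c in c1.classes + c2.classes if not (c & merged)]
    return Circuit(ins, outs, rest + [merged])


def box_intro(c, out_wire):
    """[□I]: allowed only if the conclusion is near no other peripheral wire."""
    periphery = set(c.inputs) | set(c.outputs)
    if (c.cls(out_wire) & periphery) != {out_wire}:
        raise ValueError("[□I] blocked: conclusion is near another peripheral wire")
    new = next(_fresh)
    outs = {w: l for w, l in c.outputs.items() if w != out_wire}
    outs[new] = "□" + c.outputs[out_wire]  # prefixing is enough for the labels used here
    return Circuit(c.inputs, outs, c.classes + [{new}])


def wire_of(wires, label):
    """Look up a wire id by its label (labels are unique in this example)."""
    return next(w for w, l in wires.items() if l == label)


neg_i = node([], ["□¬A", "¬□¬A"])           # ¬I on the formula □¬A
box_e = node(["□¬A"], ["¬A"], near=False)   # □E: input and output are not near
neg_e = node(["A", "¬A"], [])               # ¬E

c = plug(neg_i, wire_of(neg_i.outputs, "□¬A"), box_e, wire_of(box_e.inputs, "□¬A"))
c = plug(c, wire_of(c.outputs, "¬A"), neg_e, wire_of(neg_e.inputs, "¬A"))
full = box_intro(c, wire_of(c.outputs, "¬□¬A"))
print(list(full.inputs.values()), list(full.outputs.values()))
```

In this sketch the [□I] step succeeds because the only nearness link between the two halves of the circuit would have to pass through the □E node, whose input and output are kept in separate classes. Applying `box_intro` to the output of a lone ∧E1 node raises an error instead, since that output is near the node's input.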
This completes our definition of the proofs for S5. Inductively generated circuits represent valid reasoning in S5. Here is an example, showing how one can derive □¬□¬A from A. The circuit below has A as its only input, and □¬□¬A as its only output.

[Figure: a circuit read from left to right. The input wire A and a wire ¬A enter a ¬E node. The wire ¬A is the output of a □E node whose input is □¬A. The wires □¬A and ¬□¬A are the two outputs of a ¬I node. The wire ¬□¬A enters a □I node, whose output wire is □¬□¬A.]

It is a useful exercise to show that this circuit may be inductively generated from left to right. The sub-circuit

[Figure: the same circuit without the final □I node, with input wire A and output wire ¬□¬A.]

is inductively generated, because each of the nodes is itself a circuit. In this circuit, the nearness relation relates the A and ¬A wires, and it relates the □¬A and ¬□¬A wires. But the nearness relation does not relate the wires on the left to the wires on the right. As a result, we may apply [□I], since the output wire ¬□¬A is not near to any other wire on the periphery of the circuit. The result is the complete circuit with input A and output □¬□¬A.

⁴ Using an unlinked weakening node like this makes some circuits disconnected. It also forces a great number of different sequent derivations to be represented by the same circuit. Any derivation of a sequent of the form X ⊢ Y, B in which B is weakened in at the last step will construct the same circuit as a derivation in which B is weakened in at an earlier step. If this identification is not desired, then a more complicated presentation of weakening, using the ‘supporting wire’ of Blute, Cockett, Seely and Trimble [6], is possible. Here, I opt for a simple presentation of circuits rather than a comprehensive account of “proof identity.”

This proof tells us more than simply that in any model, in any world where A is true, □¬□¬A is true (though it does tell us this too). Since the output wire □¬□¬A is not near the input wire A, it tells us that there is no model at all where there is a world where A is true and a world where □¬□¬A is not true. Those worlds need not be the same. To speak in terms of contexts, it is incoherent to assert A in one context and to deny □¬□¬A in another context. This is an example of the following general result, on the soundness of inductively generated circuits.

Theorem (soundness). Given an inductively generated circuit with input wires X1, …, Xn and output wires Y1, …, Yn, where each Xi ∪ Yi is an equivalence class for the nearness relation, then for any S5 model, there is no set w1, …, wn of worlds where each Xi is true at wi and each Yi is false at wi.

Proof. The proof is a trivial induction on the construction of the circuit. Identity, boolean nodes, contraction and weakening are all immediate. The cut rule is a simple consequence of the transitivity of consequence in S5 models. For [□E] we note that there is no model in which there is a pair of worlds, where □A is true in one and A is false in the other. For [□I], we note that if there is no model satisfying some condition (concerning the rest of the wires in the proof except for the one output A which is near no other wire in the periphery) where there is a world in which A is false, then in these models there is no world in which A is false, and hence, there is no world in which □A is false either. But this is the condition for [□I].
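The condition in the soundness theorem can be tested by brute force on small models. The following Python sketch is an illustration under stated assumptions, not part of the proof: formulas are encoded as nested tuples, a universal S5 model is a tuple of worlds, each world being the set of atoms true there, and the function names `holds`, `refuted`, `small_models` and `no_small_countermodel` are hypothetical. A search bounded by a number of worlds is only a sanity check, although in a universal model worlds with the same valuation satisfy the same formulas, so for a single atom two worlds already suffice.

```python
from itertools import combinations, product


def holds(model, w, f):
    """Truth of formula f at world w (an index) of a universal S5 model."""
    op = f[0]
    if op == "atom":
        return f[1] in model[w]
    if op == "not":
        return not holds(model, w, f[1])
    if op == "and":
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if op == "box":  # true iff the body is true at every world
        return all(holds(model, v, f[1]) for v in range(len(model)))
    raise ValueError(f"unknown connective {op!r}")


def refuted(model, zones):
    """True if some choice of worlds, one per zone (Xi, Yi), makes every
    formula in Xi true and every formula in Yi false at the chosen world."""
    worlds = range(len(model))
    return any(
        all(
            all(holds(model, w, f) for f in xs)
            and not any(holds(model, w, f) for f in ys)
            for w, (xs, ys) in zip(choice, zones)
        )
        for choice in product(worlds, repeat=len(zones))
    )


def small_models(atoms, max_worlds):
    """All universal models over the given atoms with at most max_worlds worlds."""
    valuations = [frozenset(s) for r in range(len(atoms) + 1)
                  for s in combinations(atoms, r)]
    for n in range(1, max_worlds + 1):
        yield from product(valuations, repeat=n)


def no_small_countermodel(zones, atoms, max_worlds=3):
    return not any(refuted(m, zones) for m in small_models(atoms, max_worlds))


A = ("atom", "A")
target = ("box", ("not", ("box", ("not", A))))  # □¬□¬A

# Input A and output □¬□¬A in different nearness classes: two zones.
print(no_small_countermodel([([A], []), ([], [target])], ["A"]))
# By contrast, A as input in one zone and A as output in another is refutable.
print(no_small_countermodel([([A], []), ([], [A])], ["A"]))
```

The first check finds no countermodel, matching the claim that A cannot be true at one world while □¬□¬A fails at another. The second check fails, since a two-world model can make A true at one world and false at the other.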
So, circuits encode valid reasoning in our models. To show that they encode all of the validities of our models, we need a completeness proof. To discuss the completeness proof, we will examine another way of representing the behaviour of circuits.

§4. S5 sequents. We may represent the periphery of a circuit as a general sequent, in which the input wires are formulas in antecedent position, and the output wires are formulas in consequent position. However, this leaves out the nearness relation, which we need to model the behaviour of modal operators. So, in a sequent, we will keep track of the nearness of formulas. One way to do this is by segregating formulas into equivalence classes, and in those classes, into antecedent and consequent position. The picture, then, is of a hypersequent⁵

    X1 ⊢ Y1 | ··· | Xn ⊢ Yn

a multiset of sequents, in which each Xi and Yi is a multiset of formulas.⁶ We think of the sequent Xi ⊢ Yi as forming one of the zones of the hypersequent. The hypersequent calculus for S5 has the following connective rules:⁷

    X ⊢ A, Y | Δ                  X, A ⊢ Y | Δ
    ────────────── [¬L]           ────────────── [¬R]
    X, ¬A ⊢ Y | Δ                 X ⊢ ¬A, Y | Δ

    X, A ⊢ Y | Δ                  X, B ⊢ Y | Δ
    ───────────────── [∧L1]       ───────────────── [∧L2]
    X, A ∧ B ⊢ Y | Δ              X, A ∧ B ⊢ Y | Δ

    X ⊢ A, Y | Δ     X′ ⊢ B, Y′ | Δ′
    ───────────────────────────────── [∧R]
    X, X′ ⊢ A ∧ B, Y, Y′ | Δ | Δ′

    X, A ⊢ Y | Δ                  ⊢ A | Δ
    ────────────────── [□L]       ───────── [□R]
    □A ⊢ | X ⊢ Y | Δ              ⊢ □A | Δ

which are motivated by way of the rules for constructing circuits. For [¬L], if we have a circuit in which A is an output formula, then we may expand the circuit by adding a [¬E] node, plugged in at the A wire, which will give us a circuit in which ¬A is an input wire. It is nearby all and only the formulas that are nearby to the A, and so, in the hypersequent, it is a part of the same zone. Similarly, for [□R], if we have a circuit in which A is an output wire, adjacent to no other wires on the periphery of the circuit (so, we have a sequent in which A is in a zone of its own), then we may add a [□I] node at this point, and the new output □A is nearby no other point in the circuit, that is, □A is in a zone of its own. The appropriate rules for identity and cut are straightforward:

    A ⊢ A

    X ⊢ A, Y | Δ     X′, A ⊢ Y′ | Δ′
    ───────────────────────────────── [Cut]
    X, X′ ⊢ Y, Y′ | Δ | Δ′

⁵ These are hypersequents due to Arnon Avron [1, 2, 3, 27]. However, the account here differs in two ways from Avron’s presentation. First, hypersequents are motivated in terms of an underlying deductive machinery. Second, the behaviour of the modal operators is captured by a single pair of left and right rules. There is no special “modal splitting rule” connecting hypersequents and the modal operators.
⁶ In other words, the one hypersequent may be presented as p ⊢ q, r | s, t ⊢ u or as t, s ⊢ u | p ⊢ r, q, but this is not the same as the hypersequent p, p ⊢ q, r | s, t ⊢ u | s, t ⊢ u. The order of formulas or zones in a hypersequent does not matter (in just the same way that the order of wires does not matter in a circuit) but the number of instances of formulas does (just as it does in a circuit).
⁷ To save space, I present the rules for conjunction, but not disjunction. You can think of disjunction as a defined connective, or you can use the obvious rules for disjunction, dual to these rules for conjunction.

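With the system as it stands, we may make a number of derivations.

    A ⊢ A
    ─────── [¬L]
    ¬A, A ⊢
    ─────────── [□L]
    A ⊢ | □¬A ⊢
    ──────────── [¬R]
    A ⊢ | ⊢ ¬□¬A
    ───────────── [□R]
    A ⊢ | ⊢ □¬□¬A

    A ⊢ A                           B ⊢ B
    ────────── [□L]                 ────────── [□L]
    □A ⊢ | ⊢ A                      □B ⊢ | ⊢ B
    ─────────────── [∧L]            ─────────────── [∧L]
    □A ∧ □B ⊢ | ⊢ A                 □A ∧ □B ⊢ | ⊢ B
    ─────────────────────────────────────────────── [∧R]
    □A ∧ □B ⊢ | □A ∧ □B ⊢ | ⊢ A ∧ B
    ────────────────────────────────── [□R]
    □A ∧ □B ⊢ | □A ∧ □B ⊢ | ⊢ □(A ∧ B)
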
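The first of these derivations can be replayed mechanically. The Python sketch below is a minimal illustration under stated assumptions: the encoding (atoms as strings, compound formulas as pairs such as `("¬", "A")`, zones as pairs of sorted tuples, hypersequents as sorted tuples of zones) and the function names `zone`, `hyper`, `neg_left`, `neg_right`, `box_left` and `box_right` are hypothetical. Only the four rules used in the derivation of A ⊢ | ⊢ □¬□¬A are implemented, each acting on one named zone of a hypersequent.

```python
def zone(left=(), right=()):
    """A zone X ⊢ Y, with X and Y stored as sorted tuples (multisets)."""
    return (tuple(sorted(left, key=repr)), tuple(sorted(right, key=repr)))


def hyper(*zones):
    """A hypersequent: a multiset of zones, stored as a sorted tuple."""
    return tuple(sorted(zones, key=repr))


def remove_one(items, x):
    """Remove a single occurrence of x (raises ValueError if x is absent)."""
    items = list(items)
    items.remove(x)
    return items


def neg_left(h, z, a):
    """[¬L]: from X ⊢ A, Y | Δ infer X, ¬A ⊢ Y | Δ."""
    left, right = z
    return hyper(zone(left + (("¬", a),), remove_one(right, a)), *remove_one(h, z))


def neg_right(h, z, a):
    """[¬R]: from X, A ⊢ Y | Δ infer X ⊢ ¬A, Y | Δ."""
    left, right = z
    return hyper(zone(remove_one(left, a), right + (("¬", a),)), *remove_one(h, z))


def box_left(h, z, a):
    """[□L]: from X, A ⊢ Y | Δ infer □A ⊢ | X ⊢ Y | Δ."""
    left, right = z
    return hyper(zone([("□", a)]), zone(remove_one(left, a), right), *remove_one(h, z))


def box_right(h, z, a):
    """[□R]: from ⊢ A | Δ infer ⊢ □A | Δ; A must be alone in its zone."""
    if z != zone(right=[a]):
        raise ValueError("[□R] needs A alone in its zone")
    return hyper(zone(right=[("□", a)]), *remove_one(h, z))


nA = ("¬", "A")       # ¬A
bnA = ("□", nA)       # □¬A
nbnA = ("¬", bnA)     # ¬□¬A

h0 = hyper(zone(["A"], ["A"]))                 # A ⊢ A
h1 = neg_left(h0, zone(["A"], ["A"]), "A")     # ¬A, A ⊢
h2 = box_left(h1, zone(["A", nA]), nA)         # A ⊢ | □¬A ⊢
h3 = neg_right(h2, zone([bnA]), bnA)           # A ⊢ | ⊢ ¬□¬A
h4 = box_right(h3, zone(right=[nbnA]), nbnA)   # A ⊢ | ⊢ □¬□¬A
print(h4 == hyper(zone(["A"]), zone(right=[("□", nbnA)])))
```

Storing zones and hypersequents in sorted form is one simple way to respect the point that order does not matter while the number of instances does. Trying `box_right` on the axiom A ⊢ A raises an error, since there A shares its zone with an antecedent formula.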
Clearly, to be able to derive all of the valid sequents, we must add a few structural rules. To mimic the behaviour of circuits closely, we allow contraction inside zones in a circut, and weakening into a new zone. X A, A, Y | ∆
X, A, A Y | ∆ X, A Y | ∆
[WL]
X A, Y | ∆
∆ [WR]
A | ∆
∆ [KL]
A | ∆
[KR]
Finally, to ensure that we can derive all of the valid hypersequents, we need to be able to throw away information by merging zones in sequents. X Y | X Y | ∆ X, X Y, Y | ∆
[merge]
This rule in a sequent proof has no parallel node in the structure of a circuit.8 It corresponds to taking a circuit and merging two zones, or taking two equivalence classes to coalesce. One simple example is taking the circuit consisting of a [2E] node alone, with input 2A and output A to prove for us 2A A (that there’s no model with a world w in which 2A is true and A is false). This is throwing away information, as the circuit can also be read as telling us that 2A | A (that there’s no model with a world w at which 2A is true and w where A is false). This is a more general fact. There is no harm in throwing away information, and it is helpful to have a rule such as this for when it comes to proving completeness, to the effect that any valid hypersequent is provable.9 Before moving on to consider completeness, we will state, without proof, the fact that motivated the construction of this sequent system. 8 Actually, the effect of a merge can be found by contracting two instances of A in different zones in the proof. Then X, A Y | X , A Y merge to be come X, X , A Y, Y . It seemed too confusing to introduce contraction in this more general form. It can be modelled straightforwardly as an application of merge and then [WL]. 9 The situation is somewhat analagous with the role of weakening in the sequent system for intuitionistic propositional logic and the natural deduction system. There is no normal natural deduction proof from premises p, q to conclusion p, but there is a sequent derivation of p, q p. We take the identity proof from p to p (consisting of the formula itself) to tell us not only that p p, but also that p, X p for any collection of formulas X .
Definition (decoration). A hypersequent X₁ ⊢ Y₁ | ⋯ | Xₙ ⊢ Yₙ decorates a circuit if and only if the input wires for the circuit are X₁, …, Xₙ, the output wires are Y₁, …, Yₙ, and if two wires are close in the circuit, they appear in the same zone in the hypersequent.¹⁰
Theorem (translation). For each inductively generated circuit, and for any hypersequent decorating that circuit, there is a derivation of that hypersequent. Conversely, for any derivation of a hypersequent, there is an inductively generated circuit decorated by that hypersequent.
§5. Completeness and cut elimination. In this section, I will cover quite quickly some properties of the sequent system. The discussion is necessarily (for reasons of space) compressed. The aim is to explore the behaviour of this presentation of S5.
Definition (validity). A hypersequent X₁ ⊢ Y₁ | ⋯ | Xₙ ⊢ Yₙ is valid in a model if and only if there are no worlds w₁, …, wₙ in that model in which each formula in Xᵢ is true at wᵢ and each formula in Yᵢ is false at wᵢ.
The soundness theorem, proved in the section before last, then, may be restated as saying that the hypersequent corresponding to an inductively generated circuit (that is, a derivable hypersequent) is valid. The completeness theorem is the converse.
Theorem (completeness). A valid hypersequent is derivable.
This result may be proved in a number of ways. One is simple, but it relies upon a prior completeness result.
Proof (internalisation). (i) Convert each hypersequent into a formula which is derivable if and only if the hypersequent is derivable, and valid if and only if the hypersequent is valid, and then show that (ii) every axiom in some axiomatisation of S5 is derivable, and the rules in that axiomatisation preserve derivability. Stage (i) is simple. Convert each sequent X ⊢ Y inside a hypersequent to ⊢ ¬(⋀X ∧ ¬⋁Y). The resulting hypersequent is derivable if and only if the original hypersequent is derivable, and valid if and only if the original hypersequent is valid. Then, encode a hypersequent of the form ⊢ A₁ | ⋯ | ⊢ Aₙ as a particular formula in the form A₁ ∨ □A₂ ∨ ⋯ ∨ □Aₙ and this, too, is co-derivable and co-valid with the original hypersequent.¹¹ For the second part, show that every axiom in your favourite axiomatisation of S5 is derivable in the sequent system. The verification of this part is routine. To show that modus ponens (say in the form of the inference from ¬(A ∧ ¬B) and A to B) preserves derivability, we must use the rule cut, to extend the derivations as follows:
\[
\dfrac{
  \begin{matrix}\vdots\\ \vdash A\end{matrix}
  \qquad
  \dfrac{
    \begin{matrix}\vdots\\ \vdash \neg(A \wedge \neg B)\end{matrix}
    \qquad
    \dfrac{
      \dfrac{A \vdash A \qquad \dfrac{B \vdash B}{\vdash \neg B, B}\,[\neg R]}
            {A \vdash A \wedge \neg B, B}\,[\wedge R]
    }{\neg(A \wedge \neg B), A \vdash B}\,[\neg L]
  }{A \vdash B}\,[\mathit{Cut}]
}{\vdash B}\,[\mathit{Cut}]
\]
¹⁰ This allows □A ⊢ A to decorate the single [□E] node, as well as □A ⊢ | ⊢ A.
¹¹ Why the boxes on all formulas other than one? First, to make the translation of a hypersequent with a single zone the identity translation. Second, the valid hypersequent ⊢ ¬□A | ⊢ A may be translated as ¬□A ∨ □A, which is also valid.
That proof is simple, but it does not tell us much about the proof system. It is more interesting to prove completeness directly.
Proof (model construction). Given an underivable hypersequent, we construct a model in which that hypersequent is invalid. One way to do this is to show that any underivable sequent must have an unsuccessful derivation search, from which a model can be constructed. This technique can succeed without the use of the cut rule. Firstly, notice that the following rules can be derived on the basis of the connective rules (and contractions, merges and weakenings).
\[
\frac{X, \neg A \vdash A, Y \mid \Delta}{X, \neg A \vdash Y \mid \Delta}\,[\neg L_s]
\qquad
\frac{X, A \vdash \neg A, Y \mid \Delta}{X \vdash \neg A, Y \mid \Delta}\,[\neg R_s]
\]
\[
\frac{X, A, B, A \wedge B \vdash Y \mid \Delta}{X, A \wedge B \vdash Y \mid \Delta}\,[\wedge L_s]
\qquad
\frac{X \vdash A, A \wedge B, Y \mid \Delta \qquad X \vdash B, A \wedge B, Y \mid \Delta}{X \vdash A \wedge B, Y \mid \Delta}\,[\wedge R_s]
\]
\[
\frac{X, \Box A \vdash Y \mid X', A \vdash Y' \mid \Delta}{X, \Box A \vdash Y \mid X' \vdash Y' \mid \Delta}\,[\Box L_s]
\qquad
\frac{X \vdash \Box A, Y \mid {} \vdash A \mid \Delta}{X \vdash \Box A, Y \mid \Delta}\,[\Box R_s]
\]
Now consider what happens with an underivable hypersequent. If a hypersequent is underivable, and it has the form of one of the lower hypersequents in the table above, then one of the hypersequents above that line must also be underivable. In particular, that means that we do not get a hypersequent in which the same formula finds itself on both sides of a turnstile in the one zone. (Any hypersequent containing a zone of the form X, A ⊢ A, Y is derivable, using weakenings and merges.) So, we can think of an underivable hypersequent as a partial description of a model. Each zone partially describes a world. Antecedent formulas are true, and consequent formulas are false. The search rules above tell us that if we have a negation true, its negand is false,
if a negation is false, its negand is true. Similarly for conjunction, and for necessity, if □A is true, then A is true in each zone, and if □A is false, then there is some zone in which A is false. So, search for a derivation, by taking a hypersequent and whenever we have a formula in a zone that is ‘unprocessed’ (a negation whose negand is not on the opposite side of its zone, □A true in a zone, but A not appearing in some zone), process it by means of the rules we have seen. (This might require branching in the case of a conjunction in consequent position.) Continue this process. If the original hypersequent is underivable, the result will be a partial description of a model in which each zone describes a world. The model will falsify the original hypersequent.
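The search procedure just described can be sketched in code. What follows is only an illustrative sketch under stated assumptions, not the author's implementation: the encoding of formulas as nested tuples and the names `neg`, `conj`, `box` and `derivable` are hypothetical, and the sketch also adds A to the zone in which □A itself is true (an assumption justified by the derivability of □A ⊢ A noted earlier). A zone is a pair (antecedent, consequent); the search closes a branch when some formula sits on both sides of one zone, and reports failure when a hypersequent is fully processed but still open.

```python
# Formulas: atoms are strings; compound formulas are tuples (hypothetical encoding).
def neg(a): return ("not", a)
def conj(a, b): return ("and", a, b)
def box(a): return ("box", a)

def derivable(zones):
    """zones: list of (antecedent, consequent) pairs of iterables of formulas."""
    return _closes([(frozenset(x), frozenset(y)) for x, y in zones])

def _with(zones, i, left=(), right=()):
    # Copy of the hypersequent with extra formulas added to zone i.
    x, y = zones[i]
    new = list(zones)
    new[i] = (x | frozenset(left), y | frozenset(right))
    return new

def _closes(zones):
    # A zone with the same formula on both sides closes the branch.
    if any(x & y for x, y in zones):
        return True
    for i, (x, y) in enumerate(zones):
        for f in x:                       # formulas taken to be true
            if isinstance(f, str):
                continue
            if f[0] == "not" and f[1] not in y:
                return _closes(_with(zones, i, right=[f[1]]))
            if f[0] == "and" and not {f[1], f[2]} <= x:
                return _closes(_with(zones, i, left=[f[1], f[2]]))
            if f[0] == "box":             # A must be true in every zone
                for j, (xj, _) in enumerate(zones):
                    if f[1] not in xj:
                        return _closes(_with(zones, j, left=[f[1]]))
        for f in y:                       # formulas taken to be false
            if isinstance(f, str):
                continue
            if f[0] == "not" and f[1] not in x:
                return _closes(_with(zones, i, left=[f[1]]))
            if f[0] == "and" and f[1] not in y and f[2] not in y:
                # Branching: both premises must be derivable.
                return (_closes(_with(zones, i, right=[f[1]]))
                        and _closes(_with(zones, i, right=[f[2]])))
            if f[0] == "box" and not any(f[1] in yj for _, yj in zones):
                # Some zone must falsify A: open a new one.
                return _closes(zones + [(frozenset(), frozenset([f[1]]))])
    # Fully processed and still open: the zones describe a countermodel.
    return False
```

Under this encoding, `derivable([([box("A")], ["A"])])` and `derivable([([box("A")], []), ([], ["A"])])` both succeed, while `derivable([(["A"], [box("A")])])` fails, leaving two open zones that describe a two-world countermodel. Each step only ever adds subformulas of the original hypersequent, which is why the search terminates.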
This technique (which is, in effect, constructing a tableau system from this sequent calculus) has the advantage of not requiring the cut rule. A corollary of soundness and completeness proved in this way is that cut is admissible. That is, since we know that the cut rule preserves validity in models, and since we know that validity in models is captured exactly by the hypersequents with cut-free derivations, we know that if the premise hypersequents of a cut rule are derivable, so is the endsequent. This proof tells us nothing about how to convert a proof involving cuts into one that does not use cut. We can adopt the standard cut-elimination technique [13]. My presentation follows Belnap’s systematic account in his Display Logic [5, 21], which in turn follows Curry’s formulation of the proof [8, page 250]. First, we check that the rules of the hypersequent calculus satisfy a number of conditions.
Cut/Identity. That is, the Cut on an identity sequent is redundant:
\[
\frac{A \vdash A \qquad X', A \vdash Y' \mid \Delta'}{X', A \vdash Y' \mid \Delta'}\,[\mathit{Cut}]
\]
Clearly, a cut on an identity sequent may be left out completely. Parameter conditions. Next we have conditions on parameters in rules. In our case, a parameter in an inference falling under a rule is every formula except for the major formulas in a connective rule (the formula with the connective introduced below the line and its ancestor formulas above the line), and the cut formulas in a cut rule. Every other formula is a parameter. Parameters may appear both above and below the line. A parametric class is a collection of instances of a formula in a proof. Two formulas are a part of the same parametric class if they are represented by the same letter in a presentation of the rule (the instances of A in an inference of contraction, for example) or if they occur in the same place in a structure (such as an antecedent X or a hypersequent term ∆). Regularity. The regularity condition is that if a cut formula is parametric in an inference immediately before the cut, the cut may be permuted above that
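inference. For example the segment
\[
\dfrac{\dfrac{X \vdash A, A, Y \mid \Delta}{X \vdash A, Y \mid \Delta}\,[WR] \qquad X', A \vdash Y' \mid \Delta'}{X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\,[\mathit{Cut}]
\]
can be replaced by this segment, in which cuts take place on the top sequents, at the cost of duplicating material in the derivation.
\[
\dfrac{\dfrac{\dfrac{X \vdash A, A, Y \mid \Delta \qquad X', A \vdash Y' \mid \Delta'}{X, X' \vdash A, Y, Y' \mid \Delta \mid \Delta'}\,[\mathit{Cut}] \qquad X', A \vdash Y' \mid \Delta'}{X, X', X' \vdash Y, Y', Y' \mid \Delta \mid \Delta' \mid \Delta'}\,[\mathit{Cut}]}{X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\,[W \text{ and merge}]
\]
And similarly,
\[
\dfrac{X \vdash A, Y \mid \Delta \qquad \dfrac{X', A, B \vdash Y' \mid \Delta'}{\Box B \vdash {} \mid X', A \vdash Y' \mid \Delta'}\,[\Box L]}{\Box B \vdash {} \mid X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\,[\mathit{Cut}]
\]
becomes
\[
\dfrac{\dfrac{X \vdash A, Y \mid \Delta \qquad X', A, B \vdash Y' \mid \Delta'}{X, X', B \vdash Y, Y' \mid \Delta \mid \Delta'}\,[\mathit{Cut}]}{\Box B \vdash {} \mid X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\,[\Box L]
\]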
Position-alikeness of parameters. Two formulas in the same parametric class are in the same position (either antecedent position or consequent position). This is straightforward to check.¹²
Non-proliferation of parameters. Parametric classes have only one member below the line of an inference. This is straightforward to check. The previous conditions all concern permuting cuts over inferences when one side or other is parametric.
Single principal constituents. A formula is principal in a rule if it is not parametric. The single principal constituent condition is that each inference has only one principal formula below the line. This is immediate.
¹² This condition rules out inferences such as “matched weakening”, leading from X ⊢ Y to X, A ⊢ A, Y, in which the parametric class for A would appear in both antecedent and consequent position.
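Eliminability of matching principal constituents. An instance of cut in which the cut formula is principal in both inferences immediately before the cut may be traded in for a cut (or cuts) on subformulas of the cut formula. The interesting case in our system is for □A. We have:
\[
\dfrac{\dfrac{{} \vdash A \mid \Delta}{{} \vdash \Box A \mid \Delta}\,[\Box R] \qquad \dfrac{X, A \vdash Y \mid \Delta'}{\Box A \vdash {} \mid X \vdash Y \mid \Delta'}\,[\Box L]}{X \vdash Y \mid \Delta \mid \Delta'}\,[\mathit{Cut}]
\]
Clearly we could have made the cut before the introduction:
\[
\dfrac{{} \vdash A \mid \Delta \qquad X, A \vdash Y \mid \Delta'}{X \vdash Y \mid \Delta \mid \Delta'}\,[\mathit{Cut}]
\]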
Given that our system satisfies these conditions, we may eliminate cuts from derivations.
Theorem (elimination of cuts). Given a derivation in which the rule [Cut] is applied, we may effectively transform this derivation into one in which cut is not used.
Proof. We perform an induction on the complexity of the cut formula. The hypothesis is that for every subformula of A (and for every X, X′, Y, Y′, Δ, Δ′) if X ⊢ A, Y | Δ and X′, A ⊢ Y′ | Δ′ are derivable, so is X, X′ ⊢ Y, Y′ | Δ | Δ′, and we wish to show that this is the case for the formula A also. So, suppose we have derivations δ and δ′ of X ⊢ A, Y | Δ and X′, A ⊢ Y′ | Δ′ respectively. If the cut-formula A indicated in the concluding inferences of δ and δ′ is principal, then we may apply the eliminability of matching principal constituents condition and our induction hypothesis to eliminate the cut. If, on the other hand, A is parametric in either δ or δ′, we proceed as follows. Without loss of generality, suppose A is parametric in δ. Consider the class 𝒜 of occurrences of A in δ found by tracing up the derivation and selecting each parametric instance of A congruent with the A in the conclusion of δ. We commute the cut on A (with the other premise X′, A ⊢ Y′ | Δ′) past each inference in which an instance in 𝒜 features, using regularity. The result is a derivation in which there may be many more cuts, but for each cut on A introduced, there are no parametric instances of A in consequent position. For each copy of δ′ introduced, we may form the set 𝒜′ of instances of A congruent with the A in antecedent position in the cut inference. We commute the cut with each inference crossing the set 𝒜′ to construct a derivation in which the cut on A occurs only on principal instances of A, and this case has already been covered.
§6. Looking ahead. We will end by looking at a number of ways to extend this approach.
Normalisation. Elimination of cuts corresponds quite directly to the normalisation of circuits, by way of the translation between derivations and circuits. The circuit presentation of this system gives us scope for examining other ways in which proofs may be normalised.
Correctness. Not every plugging of wires into nodes produces a circuit. (Consider the putative “inference” in which the output wires of [∨E] are plugged into the input wires of [∧I]. This does not tell us that we may infer A ∧ B from A ∨ B.) The literature on proofnets has introduced the notion of a correctness criterion [9, 15]. It is an open question as to what might be an appropriate correctness criterion for these circuits.
Terms. Natural deduction systems lend themselves to a representation in a term calculus, according to which proofs correspond to terms, where formulas are types. An appropriate term calculus for these circuits is, also, an open question. It seems that Philip Wadler’s recent work on term calculi for classical linear logic will provide a useful starting point [25].
Identity of proofs. We have not said when two circuits represent the same proof. Clearly, these circuits are not the last word for proof identity. Even in the classical case, proof identity is a complicated business. There are many proposals in the literature [4, 11, 12, 18]. The key idea in this literature is that a theory of proofs has the structure of a category. A proof from A to B is, essentially, an arrow in that category. It is less clear that this is what we want in the case of modal reasoning. In the category-centred approach, we take a proof for X ⊢ Y to be an arrow f : X → Y. In the case of hypersequents, we do not have an obvious translation in terms of formulas. Take the hypersequent A ⊢ | ⊢ B. It can be thought of from the perspective of A (so it tells us that A ⊢ □B) or from B (it tells us that ◇A ⊢ B).
The proof from A to □B cannot be the same as the proof from ◇A to B, as the source formulas differ, and the target formulas differ.¹³ So, which arrow in the category is the proof? Could a more natural model for these deductions be a different generalisation of a category? If we quotient our proofnets with some congruence relation (respecting the kind of identities we might expect, given our preferences about the way to go here) then what kind of “category-like” structure do we find? This is an open question.
Other systems. Finally, it is clear that we need to generalise this account to cover modal logics other than S5. To do this, we need to step from a simple relation of nearness, which ignores anything other than the identity and difference of contexts for wires in a proof, to something more subtle. In an inference [□E] from □A, we step not to an arbitrary context, but to a successor context. The rule [□I] must similarly be modified. The aim, of course, is an account of proof in which the rules for the modal operators are untouched, and the structural rules (in this case, the behaviour of nearness and the relations of ancestor/descendant) play the role of determining which modal logic is found. Exploring these matters must be left for another time.
¹³ They are not only different, they will not be isomorphic in the categories, as they have different inferential roles.
REFERENCES
[1] Arnon Avron, A constructive analysis of RM, The Journal of Symbolic Logic, vol. 52 (1987), no. 4, pp. 939–951.
[2] Arnon Avron, Using hypersequents in proof systems for non-classical logics, Annals of Mathematics and Artificial Intelligence, vol. 4 (1991), pp. 225–248.
[3] Arnon Avron, The method of hypersequents in the proof theory of propositional non-classical logics, Logic: from Foundations to Applications (W. Hodges, M. Hyland, C. Steinhorn, and J. Truss, editors), Oxford University Press, New York, 1996, pp. 1–32.
[4] Gianluigi Bellin, Martin Hyland, Edmund Robinson, and Christian Urban, Categorical proof theory of classical propositional calculus, Theoretical Computer Science, to appear, 200+.
[5] Nuel D. Belnap, Display logic, Journal of Philosophical Logic, vol. 11 (1982), no. 4, pp. 375–417.
[6] Richard Blute, J. R. B. Cockett, R. A. G. Seely, and T. H. Trimble, Natural deduction and coherence for weakly distributive categories, Journal of Pure and Applied Algebra, vol. 113 (1996), no. 3, pp. 229–296, available from ftp://triples.math.mcgill.ca/pub/rags/nets/nets.ps.gz.
[7] Torben Braüner, A cut-free Gentzen formulation of the modal logic S5, Logic Journal of the IGPL, vol. 8 (2000), no. 5, pp. 629–643.
[8] Haskell B. Curry, Foundations of Mathematical Logic, Dover, New York, 1977, originally published in 1963.
[9] Vincent Danos and Laurent Regnier, The structure of multiplicatives, Archive for Mathematical Logic, vol. 28 (1989), no. 3, pp. 181–203.
[10] Kosta Došen, Sequent-systems for modal logic, The Journal of Symbolic Logic, vol. 50 (1985), no. 1, pp. 149–168.
[11] Kosta Došen and Zoran Petrić, Proof-Theoretical Coherence, KCL Publications, London, 2004.
[12] Carsten Führmann and David Pym, Order-enriched categorical models of the classical sequent calculus, Journal of Pure and Applied Algebra, vol. 204 (2006), no. 1, pp. 21–78.
[13] Gerhard Gentzen, Untersuchungen über das logische Schließen, Mathematische Zeitschrift, vol. 39 (1934), no. 1, pp. 176–210 and 405–431, translated in The Collected Papers of Gerhard Gentzen [14].
[14] Gerhard Gentzen, The Collected Papers of Gerhard Gentzen, North-Holland, Amsterdam, 1969, edited by M. E. Szabo.
[15] Jean-Yves Girard, Linear logic, Theoretical Computer Science, vol. 50 (1987), no. 1, pp. 1–101.
[16] Andrzej Indrzejczak, Cut-free double sequent calculus for S5, Logic Journal of the IGPL, vol. 6 (1998), no. 3, pp. 505–516.
[17] William C. Kneale, The province of logic, Contemporary British Philosophy: Third Series (H. D. Lewis, editor), George Allen and Unwin, 1956, pp. 237–261.
[18] F. Lamarche and L. Straßburger, Naming proofs in classical propositional logic, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science, vol. 3461, Springer, Berlin, 2005, pp. 246–261.
[19] Grigori Mints, Indexed systems of sequents and cut-elimination, Journal of Philosophical Logic, vol. 26 (1997), no. 6, pp. 671–696.
[20] M. Ohnishi and K. Matsumoto, Gentzen method in modal calculi, Osaka Journal of Mathematics, vol. 9 (1957), pp. 113–130.
[21] Greg Restall, An Introduction to Substructural Logics, Routledge, 2000.
[22] Edmund Robinson, Proof nets for classical logic, Journal of Logic and Computation, vol. 13 (2003), no. 5, pp. 777–797.
[23] D. J. Shoesmith and T. J. Smiley, Multiple-Conclusion Logic, Cambridge University Press, Cambridge, 1978.
[24] A. M. Ungar, Normalization, Cut-Elimination and the Theory of Proofs, CSLI Lecture Notes, vol. 28, Stanford University Center for the Study of Language and Information, Stanford, 1992.
[25] Philip Wadler, Down with the bureaucracy of syntax! Pattern matching for classical linear logic, available at http://homepages.inf.ed.ac.uk/wadler/papers/dual-revolutions/dual-revolutions.pdf, 2004.
[26] Heinrich Wansing, Displaying Modal Logic, Kluwer Academic Publishers, Dordrecht, 1998.
[27] Heinrich Wansing, Translation of hypersequents into display sequents, Logic Journal of the IGPL, vol. 6 (1998), no. 5, pp. 719–733.
PHILOSOPHY DEPARTMENT
THE UNIVERSITY OF MELBOURNE
VICTORIA, 3010, AUSTRALIA
E-mail:
[email protected] URL: http://consequently.org
RECURSION ON THE PARTIAL CONTINUOUS FUNCTIONALS
HELMUT SCHWICHTENBERG
Logic Colloquium ’05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.
§1. Introduction. We describe a constructive theory of computable functionals, based on the partial continuous functionals as their intended domain. Such a task had long ago been started by Dana Scott [30], under the well-known abbreviation LCF. However, the prime example of such a theory, Per Martin-Löf’s type theory [23], in its present form deals with total (structural recursive) functionals only. An early attempt of Martin-Löf [24] to give a domain theoretic interpretation of his type theory has not even been published, probably because it was felt that a more general approach, such as formal topology [13], would be more appropriate. Here we try to make a fresh start, and do full justice to the fundamental notion of computability in finite types, with the partial continuous functionals as underlying domains. The total ones then appear as a dense subset [20, 15, 7, 31, 27, 21], and seem to be best treated in this way.
Computable functionals and logic. Types are built from base types by the formation of function types, ρ ⇒ σ. As domains for the base types we choose non-flat (cf. Figure 2) and possibly infinitary free algebras, given by their constructors. The main reason for taking non-flat base domains is that we want the constructors to be injective and with disjoint ranges. The naive model of such a finitely typed theory is the full set theoretic hierarchy of functionals of finite types. However, this immediately leads to higher cardinalities, and does not lend itself well to a theory of computability. A more appropriate semantics for typed languages has its roots in work of Kreisel [20] (who used formal neighborhoods) and Kleene [19]. This line of research was taken up and developed in a mathematically more satisfactory way by Scott and Ershov [28, 16]. Today this theory is usually presented in the context of abstract domain theory [31, 3]; it is based on classical logic. The present work can be seen as an attempt to develop a constructive theory of formal neighborhoods for continuous functionals, in a direct and intuitive style. The task is to replace abstract domain theory by a more concrete and (in case of finitary free algebras) finitary theory of representations. As a framework we use Scott’s information systems [29, 22, 31]. It turns out
that we only need to deal with “atomic” and “coherent” information systems (abbreviated acis), which simplifies matters considerably. In this setup the basic notion is that of a “token”, or unit of information. The elements of the domain appear as abstract or “ideal” entities: possibly infinite sets of tokens, which are “consistent” and “deductively closed”.
Total functionals. One reason to be interested in total functionals is that for base types, that is free algebras, we can prove properties of total objects by structural induction. This is also true for the more general class of structure-total objects, where the arguments at parameter positions in constructor terms need not be total. An example is a list whose length is determined, but whose elements need not be total. We show that the standard way to single out the total functionals from the partial ones works with non-flat base domains as well, and that Berger’s proof [7] of Kreisel’s [20] density theorem can be adapted.
Terms and their denotational and operational semantics. Since we have introduced domains via concrete representations, it is easy to define the computable functionals, simply as recursively enumerable ideals (= sets of tokens). However, this way to deal with computability is too general for concrete applications. In practice, one wants to define computable functionals by recursion equations. We show that, and how, computation rules [11, 8] can be used to achieve this task. The meaning $[\![\lambda_{\vec x} M]\!]$ of a term M (with free variables in $\vec x$) involving constants D defined by computation rules will be an inductively defined set of tokens $(\vec U, b)$, of the type of $\lambda_{\vec x} M$. So we extend the term language of Plotkin’s PCF [25] by constants defined via “computation rules”. One instance of such rules is the definition of the fixed point operators $Y_\rho$ of type $(\rho \Rightarrow \rho) \Rightarrow \rho$, by $Y_\rho f = f(Y_\rho f)$. Another instance is the structural recursion operator $R_N$, defined by
\[
R_N(f, g, 0) = f, \qquad R_N(f, g, Sn) = g(n, R_N(f, g, n)).
\]
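To make the two computation rules above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: natural numbers are modelled by Python's non-negative integers (so the partial, non-flat elements of the paper's domains are not represented), and the names `rec_nat` and `fix` are hypothetical. Because Python evaluates arguments eagerly, the fixed point equation Y f = f(Y f) is eta-expanded so that unfolding happens only on demand.

```python
def rec_nat(f, g, n):
    """Structural recursion: R(f, g, 0) = f and R(f, g, S n) = g(n, R(f, g, n))."""
    acc = f                      # value at 0
    for k in range(n):
        acc = g(k, acc)          # value at S k, computed from the value at k
    return acc

def fix(f):
    """Fixed point operator: fix(f) behaves like f(fix(f)).

    The lambda delays the recursive unfolding until an argument arrives,
    which is needed in a strict language.
    """
    return lambda x: f(fix(f))(x)

# Addition and factorial by structural recursion on the last argument.
def add(m, n):
    return rec_nat(m, lambda k, r: r + 1, n)

def factorial(n):
    return rec_nat(1, lambda k, r: (k + 1) * r, n)

# The same factorial, defined through the fixed point operator instead.
factorial_fix = fix(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
```

The contrast between the two definitions of factorial mirrors the text: `rec_nat` always terminates on a numeral, whereas a function obtained from `fix` terminates only if the recursion it encodes happens to be well founded.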
Operationally, the term language provides some natural conversion rules to “simplify” terms: β, η, and (for every defined constant D) the defining equations $D\vec P \to M$ with non-overlapping constructor patterns $\vec P$; the equivalence generated by these conversions is called operational semantics. We show that the (denotational) values are preserved under conversions, including computation rules.
Computational adequacy. Clearly we want to know that the conversions mentioned above give rise to a “computationally adequate” operational semantics: If [[M]] = k, then the conversion rules suffice to actually reduce M to the numeral k. We show that this holds true in our somewhat extended setting as well, with computation rules and non-flat base domains.
Structural recursion. An important example of computation rules are those of the (Gödel) structural recursion operators. We prove their totality, by
showing that the rules are strongly normalizing. A predicative proof of this fact has been given in [1], based on Aczel’s notion of a set-based relation. Our proof is predicative as well, but, being based on an extension of Tait’s method of strong computability predicates, more along the standard line of such proofs. Moreover, it extends the result to the present setting.
Related work. The development of constructive theories of computable functionals of finite type began with Gödel’s [18]. There the emphasis was on particular computable functionals, the structural (or primitive) recursive ones. In contrast to what was done later by Kreisel, Kleene, Scott and Ershov, the domains for these functionals were not constructed explicitly, but rather considered as described axiomatically by the theory. Denotational semantics for PCF-like languages is well-developed, and usually (as in Plotkin’s [25]) done in a domain-theoretic setting. The study of the semantics of non-overlapping higher type recursion equations, called here computation rules, has been initiated by Berger [11], again in a domain-theoretic setting. Recently [8] he has introduced a “strict” variant of this domain-theoretic semantics, and used it to prove strong normalization of extensions of Gödel’s T by different versions of bar recursion. Information systems have been conceived by Scott [29], as an intuitive approach to domains for denotational semantics. The idea to consider atomic information systems is due to Ulrich Berger (unpublished work); coherent information systems have been introduced by Plotkin [26, p. 210]. Taking up Kreisel’s [20] idea of neighborhood systems, Martin-Löf developed in unpublished (but somewhat distributed) notes [24] a domain theoretic interpretation of his type theory. The intersection type discipline of Coppo and Dezani [5] can be seen as a different style of presenting the idea of a neighborhood system. The desire to have a more general framework for these ideas has led Martin-Löf, Sambin and others to develop a formal topology [13]. It seems likely that the method in [21, Section 3.5] (which is based on an idea of Ulrich Berger) can be used to prove density in the present case, but this would require some substantial rewriting. The first proof of an adequacy theorem (not under this name) is due to Plotkin [25, Theorem 3.1]; Plotkin’s proof is by induction on the types, and uses a computability predicate. A similar result in a type-theoretic setting is in Martin-Löf’s notes [24, Second Theorem]. Adequacy theorems have been proved in many contexts [2, 4, 5, 24]. Coquand [14], building on the work of Martin-Löf [24] and Berger [8], observed that the adequacy result even holds for untyped languages, hence also for dependently typed ones. The problem of proving strong normalization for extensions of typed calculi by higher order rewrite rules has been studied extensively in the literature [32, 17, 33, 12, 1, 8]. Most of these proofs use impredicative methods (e.g., by reducing the problem to strong normalization of second order propositional logic, called system F by Girard [17]). Our definition of the strong
computability predicates and also the proof are related to Zucker’s [34] proof of strong normalization of his term system for recursion on the first three number or tree classes. However, Zucker uses a combinatory term system and defines strong computability for closed terms only. Following some ideas in an unpublished note of Berger, Benl (in his diploma thesis [6]) transferred this proof to terms in simply typed λ-calculus, possibly involving free variables. Here it is adapted to the present context.
Organization of the paper. In Section 2 atomic coherent information systems are defined, and used as a concrete representation of the relevant domains, based on non-flat and possibly infinitary free algebras. Section 3 deals with total and structure-total ideals; it is shown that the density theorem holds. Section 4 introduces the term language, extending Plotkin’s PCF by defined constants and computation rules. The denotational and operational semantics are defined, the former by an inductive definition of a relation $(\vec U, b) \in [\![\lambda_{\vec x} M]\!]$, the latter by conversions which include the computation rules. We prove preservation of values under conversions. Section 5 contains the proof of the adequacy theorem. The structural recursion operators are taken up in Section 6, as an example of computation rules defining total objects. The paper concludes in Section 7 with remarks on an implementation of some of its ideas, in the Minlog proof assistant www.minlog-system.de under development in Munich.
§2. Partial continuous functionals. Information systems have been introduced by Scott [29], as an intuitive approach to deal constructively with ideal, infinite objects in function spaces, by means of their finite approximations. One works with atomic units of information, called tokens, and a notion of consistency for finite sets of tokens. Finally there is an entailment relation, between consistent finite sets of tokens and single tokens.
The ideals (or objects) of an information system are defined to be the consistent and deductively closed sets of tokens; we write |A| for the set of ideals of A. One shows easily that |A| is a domain w.r.t. the inclusion relation. Conversely, every domain with countable basis can be represented as the set of all ideals of an appropriate information system [22]. Here we take Scott’s notion of an information system as a basis to introduce the partial continuous functionals. Call an information system atomic if the entailment relation U ⊢ b is given by ∃a∈U ({a} ⊢ b) and hence determined by a transitive relation on A (namely {a} ⊢ b, written a ≥ b). Call it coherent [26, p. 210] when a finite set U of tokens is consistent iff each two-element subset of it is. We will show below that if B is atomic (coherent), then so is the “function space” A → B. Since our algebras will be given by atomic coherent information systems, this is the only kind of information systems we will have to deal with.
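The notions just introduced (tokens, pairwise consistency, entailment by a single token, and ideals) are simple enough to sketch for a finite token set. The following Python is a hedged illustration only: the class name `Acis`, its methods, and the three-token example are all hypothetical and not taken from the paper, and only finite systems are handled, whereas the paper allows countably many tokens and infinite ideals.

```python
from itertools import combinations

class Acis:
    """Finite toy model of an atomic coherent information system (assumed encoding)."""

    def __init__(self, tokens, consistent_pairs, geq):
        self.tokens = frozenset(tokens)
        # Coherence: consistency is determined by two-element subsets.
        self.pairs = {frozenset(p) for p in consistent_pairs}
        # Atomicity: entailment is a reflexive, transitive relation on tokens.
        rel = set(geq) | {(a, a) for a in self.tokens}
        changed = True
        while changed:                       # naive transitive closure
            changed = False
            for (a, b) in list(rel):
                for (c, d) in list(rel):
                    if b == c and (a, d) not in rel:
                        rel.add((a, d))
                        changed = True
        self.geq = rel

    def is_consistent(self, u):
        return all(frozenset(p) in self.pairs for p in combinations(u, 2))

    def entails(self, u, b):
        # U entails b iff some single token of U entails b.
        return any((a, b) in self.geq for a in u)

    def is_ideal(self, x):
        closed = all(b in x for a in x for b in self.tokens if (a, b) in self.geq)
        return self.is_consistent(x) and closed

    def ideals(self):
        toks = sorted(self.tokens)
        subsets = (frozenset(c) for r in range(len(toks) + 1)
                   for c in combinations(toks, r))
        return {x for x in subsets if self.is_ideal(x)}

# Hypothetical example: "0", "S*" (some successor) and "S0" (the successor of 0).
toy = Acis(tokens=["0", "S*", "S0"],
           consistent_pairs=[("S*", "S0")],
           geq=[("S0", "S*")])
```

In this toy system the ideals are the empty set, {0}, {S*} and {S*, S0}; the set {S0} is consistent but not deductively closed, which illustrates why ideals are required to be both.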
2.1. Types. A free algebra is given by its constructors, for instance zero and successor for the natural numbers. We want to treat other data types as well, like lists and binary trees. When dealing with inductively defined sets, it will also be useful to explicitly refer to the generation tree. Such trees are quite often infinitely branching, and hence we allow infinitary free algebras. The freeness of the constructors is expressed by requiring that their ranges are disjoint and that they are injective. To allow for partiality (which is mandatory when we want to deal with computable objects), we have to embed our algebras into domains. Both requirements together imply that we need “lazy domains”. Our type system is defined by two type forming operations: arrow types $\rho \Rightarrow \sigma$ and the formation of inductively generated types $\mu_{\vec\alpha}\,\vec\kappa$, where $\vec\alpha = (\alpha_j)_{j=1,\dots,N}$ is a list of distinct “type variables”, and $\vec\kappa = (\kappa_i)_{i=1,\dots,k}$ is a list of “constructor types”, whose argument types contain $\alpha_1, \dots, \alpha_N$ in strictly positive positions only. For instance, $\mu_\alpha(\alpha, \alpha \Rightarrow \alpha)$ is the type of natural numbers; here the list $(\alpha, \alpha \Rightarrow \alpha)$ stands for two generation principles: $\alpha$ for “there is a natural number” (the 0), and $\alpha \Rightarrow \alpha$ for “for every natural number there is a next one” (its successor).
Definition 2.1. Let $\vec\alpha = (\alpha_j)_{j=1,\dots,N}$ be a list of distinct type variables. Types $\rho, \sigma, \tau, \mu \in T$ and constructor types $\kappa \in KT(\vec\alpha)$ are defined inductively by
\[
\frac{\vec\rho, \vec\sigma_1, \dots, \vec\sigma_n \in T}{\vec\rho \Rightarrow (\vec\sigma_1 \Rightarrow \alpha_{j_1}) \Rightarrow \dots \Rightarrow (\vec\sigma_n \Rightarrow \alpha_{j_n}) \Rightarrow \alpha_j \in KT(\vec\alpha)}\ (n \ge 0)
\]
\[
\frac{\kappa_1, \dots, \kappa_n \in KT(\vec\alpha)}{(\mu_{\vec\alpha}(\kappa_1, \dots, \kappa_n))_j \in T}\ (n \ge 1)
\qquad
\frac{\rho, \sigma \in T}{\rho \Rightarrow \sigma \in T}
\]
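Here $\vec\rho \Rightarrow \sigma$ means $\rho_1 \Rightarrow \dots \Rightarrow \rho_m \Rightarrow \sigma$, associated to the right. We reserve $\mu$ for types of the form $(\mu\vec\alpha\,(\kappa_1, \dots, \kappa_k))_j$. The parameter types of $\mu$ are the members of all $\vec\rho$ appearing in its constructor types $\kappa_1, \dots, \kappa_k$.
Examples.
$U := \mu\alpha\,\alpha$ (Unit)
$B := \mu\alpha\,(\alpha, \alpha)$ (Booleans)
$N := \mu\alpha\,(\alpha, \alpha \Rightarrow \alpha)$ (Natural numbers)
$L(\rho) := \mu\alpha\,(\alpha, \rho \Rightarrow \alpha \Rightarrow \alpha)$ (Lists)
$\rho \otimes \sigma := \mu\alpha\,\rho \Rightarrow \sigma \Rightarrow \alpha$ ((Tensor) product)
$\rho + \sigma := \mu\alpha\,(\rho \Rightarrow \alpha, \sigma \Rightarrow \alpha)$ (Sum)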
HELMUT SCHWICHTENBERG
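$(\mathrm{tree}, \mathrm{tlist}) := \mu(\alpha, \beta)\,(N \Rightarrow \alpha,\ \beta \Rightarrow \alpha,\ \beta,\ \alpha \Rightarrow \beta \Rightarrow \beta)$
$\mathrm{Bin} := \mu\alpha\,(\alpha, \alpha \Rightarrow \alpha \Rightarrow \alpha)$ (Binary trees)
$O := \mu\alpha\,(\alpha, \alpha \Rightarrow \alpha, (N \Rightarrow \alpha) \Rightarrow \alpha)$ (Ordinals)
$T_0 := N$, $T_{n+1} := \mu\alpha\,(\alpha, (T_n \Rightarrow \alpha) \Rightarrow \alpha)$ (Trees)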
Notice that there are many equivalent ways to define these types. For instance, we could take U + U to be the type of booleans, and L(U) to be the type of natural numbers.
A type is called finitary if it is a μ-type with all its parameter types finitary, and in all its constructor types
\[ \vec\rho \Rightarrow (\vec\sigma_1 \Rightarrow \alpha_{j_1}) \Rightarrow \dots \Rightarrow (\vec\sigma_n \Rightarrow \alpha_{j_n}) \Rightarrow \alpha_j \tag{1} \]
the $\vec\sigma_1, \dots, \vec\sigma_n$ are all empty. In the examples above U, B, N, tree, tlist and Bin are all finitary, whereas O and $T_{n+1}$ are not. L(ρ), ρ ⊗ σ and ρ + σ are finitary provided their parameter types are. An argument position in a type is called finitary if it is occupied by a finitary type.
2.2. Function spaces via atomic coherent information systems.
Definition 2.2. An atomic coherent information system (abbreviated acis) is a triple (A, Con, ≥) with A a countable set (the tokens, denoted a, b, . . . ), Con a nonempty set of finite subsets of A (the consistent sets or formal neighborhoods, denoted U, V, . . . ), and ≥ a transitive and reflexive relation on A (the entailment relation) which satisfy
(a) ∅ ∈ Con, and {a} ∈ Con for every a ∈ A,
(b) U ∈ Con iff every two-element subset of U is in Con, and
(c) if {a, b} ∈ Con and b ≥ c, then {a, c} ∈ Con.
We write U ≥ a for $\exists_{b \in U}\, b \ge a$, and U ≥ V for $\forall_{a \in V}\, U \ge a$. Any acis is an information system in the sense of [29]; this follows from
Lemma 2.3. Let A = (A, Con, ≥) be an acis. $U \ge V_1, V_2$ implies $V_1 \cup V_2 \in Con$.
Proof. Let $b_1 \in V_1$, $b_2 \in V_2$. Then we have $a_1, a_2 \in U$ such that $a_i \ge b_i$. From $\{a_1, a_2\} \in Con$ we obtain $\{a_1, b_2\} \in Con$ by (c), hence $\{b_1, b_2\} \in Con$ again by (c).
Definition 2.4. Let $A = (A, Con_A, \ge_A)$ and $B = (B, Con_B, \ge_B)$ be acis’s. Define A → B = (C, Con, ≥) by
\[ C := Con_A \times B, \]
\[ \{(U_1, b_1), \dots, (U_n, b_n)\} \in Con \;:\leftrightarrow\; \forall_{i,j}\,\bigl(U_i \cup U_j \in Con_A \to \{b_i, b_j\} \in Con_B\bigr), \]
\[ (U, b) \ge (V, c) \;:\leftrightarrow\; V \ge_A U \wedge b \ge_B c. \]
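Definitions 2.2 and 2.4 can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the names `Acis`, `arrow`, `flat` and the sample tokens are hypothetical and not taken from the text, an acis is represented by its two binary predicates (pairwise consistency and entailment), and the flat two-token system is chosen purely as test data.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Hashable, Iterable

Token = Hashable


@dataclass(frozen=True)
class Acis:
    """An acis given by pairwise consistency and token entailment (illustrative)."""
    consistent2: Callable[[Token, Token], bool]  # {a, b} in Con
    entails: Callable[[Token, Token], bool]      # a >= b

    def con(self, tokens: Iterable[Token]) -> bool:
        # Coherence (b): a finite set is consistent iff every pair in it is.
        return all(self.consistent2(a, b) for a, b in combinations(list(tokens), 2))

    def set_entails(self, U, V) -> bool:
        # U >= V iff every a in V is entailed by some b in U.
        return all(any(self.entails(b, a) for b in U) for a in V)


def arrow(A: Acis, B: Acis) -> Acis:
    """Function space A -> B of Definition 2.4; tokens are pairs (U, b), U a frozenset."""
    def consistent2(t1, t2):
        (U1, b1), (U2, b2) = t1, t2
        # U1 ∪ U2 in Con_A  implies  {b1, b2} in Con_B
        return (not A.con(U1 | U2)) or B.consistent2(b1, b2)

    def entails(t1, t2):
        (U, b), (V, c) = t1, t2
        # (U, b) >= (V, c) iff V >=_A U and b >=_B c
        return A.set_entails(V, U) and B.entails(b, c)

    return Acis(consistent2, entails)


# Assumed test data: a flat system with tokens "tt" and "ff".
flat = Acis(lambda a, b: a == b, lambda a, b: a == b)
F = arrow(flat, flat)
t1 = (frozenset({"tt"}), "ff")
t2 = (frozenset({"ff"}), "tt")
t3 = (frozenset(), "tt")
```

Here `t1` and `t2` are consistent because their argument neighborhoods cannot be joined, whereas `t1` and `t3` are not: the empty neighborhood already forces the value `"tt"`.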
Lemma 2.5. Let A = (A, ConA , ≥A ) and B = (B, ConB , ≥B ) be acis’s. Then A → B is an acis again. Proof. Clearly ≥ is transitive and reflexive, and the conditions (a) and (b) of an acis hold; it remains to check (c). So let {(U1 , b1 ), (U2 , b2 )} ∈ Con and (U2 , b2 ) ≥ (V, c), hence V ≥ U2 and b2 ≥ c. We must show {(U1 , b1 ), (V, c)} ∈ Con. So assume U1 ∪ V ∈ Con; we must show {b1 , c} ∈ Con. Now U1 ∪ V ∈ Con and V ≥ U2 by the previous lemma imply U1 ∪ U2 ∈ Con. But then {b1 , b2 } ∈ Con, hence {b1 , c} ∈ Con by (c). Scott [29] introduced the notion of an approximable map from A to B. Such a map is given by a relation r between ConA and B, where r(U, b) intuitively means that whenever we are given the information U ∈ ConA on the argument, then we know that at least the token b appears in the value. Definition 2.6. Let A and B be acis’s. A relation r ⊆ ConA × B is an approximable map from A to B (written r : A → B) iff (a) if r(U, b1 ) and r(U, b2 ), then {b1 , b2 } ∈ ConB , and (b) if r(U, b), V ≥A U and b ≥B c, then r(V, c). Call a (possibly infinite) set x of tokens consistent if U ∈ Con for every finite subset U ⊆ x, and deductively closed if ∀a∈x ∀b≤a b ∈ x. The ideals (or objects) of an information system are defined to be the consistent and deductively closed sets of tokens; we write |A| for the set of ideals of A. Theorem 2.7. Let A and B be acis’s. The ideals of A → B are exactly the approximable maps from A to B. Proof. We show that r ∈ |A → B| satisfies the axioms for approximable maps. (a). Let r(U, b1 ) and r(U, b2 ). Then {b1 , b2 } ∈ ConB by the consistency of r. (b). Let r(U, b), V ≥A U and b ≥B c. Then (U, b) ≥ (V, c) by definition, hence r(V, c) by the deductive closure of r. For the other direction suppose r : A → B is an approximable map. We must show that r ∈ |A → B|. Consistency of r: Suppose r(U1 , b1 ), r(U2 , b2 ) and U = U1 ∪ U2 ∈ ConA . We must show that {b1 , b2 } ∈ ConB . 
Now by definition of approximable maps, from $r(U_i, b_i)$ and $U \ge_A U_i$ we obtain $r(U, b_i)$, and hence $\{b_1, b_2\} \in Con_B$. Deductive closure of r: Suppose r(U, b) and (U, b) ≥ (V, c), i.e., $V \ge_A U \wedge b \ge_B c$. Then r(V, c) by definition of approximable maps.
The set |A| of ideals for A carries a natural topology (the Scott topology), which has the cones $\tilde U = \{ z \mid z \supseteq U \}$ generated by the formal neighborhoods U as basis. The continuous maps f : |A| → |B| and the ideals r ∈ |A → B| are in a bijective correspondence. With any r ∈ |A → B| we can associate a continuous |r| : |A| → |B|: |r|(z) := { b ∈ B | r(U, b) for some U ⊆ z },
and with any continuous f : |A| → |B| we can associate $\hat f \in |A \to B|$:
\[ \hat f(U, b) \;:\Longleftrightarrow\; b \in f(\overline{U}) \]
(where $\overline{U}$ is the deductive closure of U). These assignments are inverse to each other, i.e., $f = |\hat f|$ and $r = \widehat{|r|}$. We will usually write r(z) for |r|(z), and similarly f(U, b) for $\hat f(U, b)$. It will be clear from the context where the mods and hats should be inserted.
2.3. Algebras with approximations. We can now define the acis of an algebra $\mu_j$, given by constructors $C_i$.
• The tokens are all type correct constructor expressions with an outermost $C_i$, such that at any finitary argument position we have either a special symbol, written ∗, which carries no information, or else a token, and at any other argument position we have a formal neighborhood of the appropriate type. By an extended token $a^*$ we mean a token or ∗, and $a^* \ge b^*$ means that $b^*$ is ∗, or both are tokens and the entailment relation holds.
• Two tokens are in the entailment relation ≥ if they start with the same constructor, and for every argument position the arguments located there are either extended tokens $a^*, b^*$ such that $a^* \ge b^*$, or formal neighborhoods U, V such that U ≥ V, as defined above (notice that this is an inductive definition).
• A finite set of tokens is consistent if every two-element subset is; two tokens are consistent if both start with the same constructor and have consistent extended tokens resp. formal neighborhoods at corresponding argument positions.
For example, the tokens for the algebra N are as shown in Figure 1. A token a entails another one b iff there is a path from a (up) to b (down). In this case (and similarly for every finitary algebra) a finite set U of tokens is consistent iff it has an upper bound.
[Figure 1. Tokens and entailment for N: the tokens 0, S0, S(S0), S(S(S0)), . . . and S∗, S(S∗), S(S(S∗)), . . . , with ∗ at the bottom.]
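The inductive definition of entailment and consistency for the algebra N can be checked mechanically on a finite fragment. The sketch below is illustrative only: the tuple encoding of tokens and the helper names (`S`, `tokens`, `has_upper_bound`) are assumptions made for this example, not notation from the text.

```python
STAR = "*"          # the special symbol carrying no information
ZERO = ("0",)       # the token 0


def S(arg):
    """Token S(arg), where arg is an extended token (a token or STAR)."""
    return ("S", arg)


def entails(a, b):
    """Extended-token entailment a* >= b* for N, following the inductive definition."""
    if b == STAR:
        return True
    if a == STAR or a[0] != b[0]:
        return False
    return all(entails(x, y) for x, y in zip(a[1:], b[1:]))


def consistent(a, b):
    """Two extended tokens are consistent iff constructors agree and arguments are consistent."""
    if a == STAR or b == STAR:
        return True
    if a[0] != b[0]:
        return False
    return all(consistent(x, y) for x, y in zip(a[1:], b[1:]))


def tokens(depth):
    """All tokens S^n 0 (0 <= n <= depth) and S^n * (1 <= n <= depth)."""
    out = []
    for n in range(depth + 1):
        t0, ts = ZERO, STAR
        for _ in range(n):
            t0, ts = S(t0), S(ts)
        out.append(t0)
        if n >= 1:
            out.append(ts)
    return out


def has_upper_bound(a, b, candidates):
    """True iff some candidate token entails both a and b."""
    return any(entails(c, a) and entails(c, b) for c in candidates)
```

On the tokens up to a fixed depth, pairwise consistency coincides with having a common upper bound, which is the property stated above for finitary algebras.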
Every constructor C generates
\[ r_C := \{\, (\vec U, C\vec b^*) \mid \vec U \ge \vec b^* \,\} \]
with $b_i^*$ extended tokens or formal neighborhoods. The continuous map $|r_C|$ is defined by
\[ |r_C|(\vec z) := \{\, b \mid (\vec U, b) \in r_C \text{ for some } \vec U \subseteq \vec z \,\}. \]
Hence the (continuous maps corresponding to) constructors are injective and their ranges are disjoint, which is what we wanted to achieve.
The ideals x for μ are, as for any information system, the consistent and deductively closed sets of tokens. Clearly all tokens in x begin with the same constructor. For instance, {S(S0), S(S∗), S∗}, {S(S∗), S∗}, {0} are ideals for N, but also the infinite set $\{ S^n{*} \mid n > 0 \}$. The ideals for N and their inclusion relation are pictured in Figure 2. Here we have denoted the ideals ∅, {0}, $\{ S^n{*} \mid n > 0 \}$ by ⊥, 0, ∞, respectively, and any other ideal by applications of (the continuous map corresponding to) the constructor S to 0 or ⊥. The ambiguous notation (S denotes a symbol in constructor expressions and also the continuous map $|r_S|$) should not lead to confusion.
[Figure 2. Ideals and inclusion for N, i.e., its domain.]
Let $C_{\rho \Rightarrow \sigma} := C_\rho \to C_\sigma$. The ideals $x \in |C_\rho|$ are called partial continuous functionals of type ρ.
§3. Total functionals. Total ideals are important because one can prove their properties by (structural) induction. We also introduce the concept of structure-total ideals, first for a free algebra and then for arbitrary types. They are more general, because ideals at parameter positions need not be total, but still allow one to argue by induction. An example of the latter notion is a list whose structure (number of Cons’s) is known, but whose elements may be partial. This is of interest, because for such “structure-total” objects an obvious induction principle holds.
In [20] Kreisel states the important density theorem, which says that any finite functional can be extended to a total one. Full proofs of various versions of the density theorem are in [15, 7, 31, 27, 21]. Here we give a proof for the practically important case where the base domains are not just the flat domain of natural numbers, but non-flat and possibly parametrized free algebras.
3.1. Total and structure-total ideals. It is well known how one can single out the total functionals from the partial ones. One good reason to be interested in total functionals is that for base types, that is free algebras, we can prove properties of total objects by structural induction. This is also true for the slightly more general class of structure-total objects, where the arguments at parameter positions in constructor terms need not be total. An example is a list whose length is determined, but whose elements may be partial.
Definition 3.1. The total ideals of type ρ are defined inductively.
• Case μ. For an algebra μ, the total ideals x are those of the form $C\vec z$ with C a constructor of μ and $\vec z$ total (C denotes the continuous function $|r_C|$).
• Case ρ ⇒ σ. An ideal r of type ρ ⇒ σ is total iff for all total z of type ρ, the result |r|(z) of applying r to z is total.
The structure-total ideals are defined similarly; the difference is that in case μ the ideals at parameter positions of C need not be total. We write $x \in G_\rho$ to mean that x is a total ideal of type ρ.
For instance, for N the ideals 0, S0, S(S0) etc. in Figure 2 are total, but ⊥, S⊥, S(S⊥), . . . , ∞ are not. For L(ρ), precisely all ideals of the form $\mathrm{Cons}(x_1, \dots \mathrm{Cons}(x_n, \mathrm{Nil}) \dots)$ are structure-total. The total ones are those where in addition all list elements $x_1, \dots, x_n$ are total.
For non-flat base domains it is easy to see that there are maximal but not total ideals: ∞ is an example for N. This is less easy for flat base domains; a counterexample has been given by Yuri Ershov in [16]; a more perspicuous one (at type (N ⇒ N) ⇒ N) is in [31]. Conversely, the total continuous functionals need not be maximal ideals in $C_\rho$: a counterexample is $\{ (S^n 0, 0) \mid n \in N \}$, which clearly is a total object of type N ⇒ N representing the constant function with value 0. However, addition of the pair (∅, 0) yields a different total object of type N ⇒ N. Still, it is easy to show that both functionals are “equivalent” in the sense that they have the same behaviour on total arguments.
3.2. Induction. The principle of induction over (the total ideals of) simultaneous free algebras $\vec\mu = \mu\vec\alpha\,\vec\kappa$ can now be formulated as follows; it clearly holds in our domains. For readability let the variables $x_i, y_j$ range over total ideals only. For the constructor type
\[ \kappa_i = \vec\rho \Rightarrow (\vec\sigma_1 \Rightarrow \alpha_{j_1}) \Rightarrow \dots \Rightarrow (\vec\sigma_n \Rightarrow \alpha_{j_n}) \Rightarrow \alpha_j \in KT(\vec\alpha) \]
we have the step formula
\[ D_i := \forall_{y_1^{\rho_1}, \dots, y_m^{\rho_m},\, y_{m+1}^{\vec\sigma_1 \Rightarrow \mu_{j_1}}, \dots, y_{m+n}^{\vec\sigma_n \Rightarrow \mu_{j_n}}} \bigl( \forall_{\vec x^{\vec\sigma_1}} \hat P_{j_1}(y_{m+1}\vec x) \to \dots \to \forall_{\vec x^{\vec\sigma_n}} \hat P_{j_n}(y_{m+n}\vec x) \to \hat P_j(C_i(\vec y)) \bigr). \]
Here $\vec y = y_1^{\rho_1}, \dots, y_m^{\rho_m}, y_{m+1}^{\vec\sigma_1 \Rightarrow \mu_{j_1}}, \dots, y_{m+n}^{\vec\sigma_n \Rightarrow \mu_{j_n}}$ are the components of the ideal $C_i(\vec y)$ of type $\mu_j$ under consideration, and
\[ \forall_{\vec x^{\vec\sigma_1}} \hat P_{j_1}(y_{m+1}\vec x),\ \dots,\ \forall_{\vec x^{\vec\sigma_n}} \hat P_{j_n}(y_{m+n}\vec x) \]
are the hypotheses available in the induction step. The induction axiom $\mathrm{Ind}^{\vec x, \vec A}_{\mu_j}$ with $\vec x = (x_j^{\mu_j})_{j=1,\dots,N}$ and $\vec A = (A_j)_{j=1,\dots,N} = (\hat P_j(x_j^{\mu_j}))_{j=1,\dots,N}$ then proves the formula
\[ D_1 \to \dots \to D_k \to \forall_{x_j^{\mu_j}} \hat P_j(x_j). \]
We will often write $\mathrm{Ind}^{\vec x, \vec A}_j$ for $\mathrm{Ind}^{\vec x, \vec A}_{\mu_j}$, and omit the upper indices $\vec x, \vec A$ when they are clear from the context. In case of a non-simultaneous free algebra, i.e. of type $\mu\alpha\,\vec\kappa$, for $\mathrm{Ind}^{x,A}_\mu$ we normally write $\mathrm{Ind}_{x,A}$.
Examples. Again all variables are supposed to range over total ideals.
$\mathrm{Ind}_{p,A} : A[p := \mathrm{tt}] \to A[p := \mathrm{ff}] \to \forall_{p^B} A$,
$\mathrm{Ind}_{n,A} : A[n := 0] \to \forall_n (A \to A[n := Sn]) \to \forall_{n^N} A$,
$\mathrm{Ind}_{l,A} : A[l := \mathrm{Nil}] \to \forall_{x,l} (A \to A[l := \mathrm{Cons}(x, l)]) \to \forall_{l^{L(\alpha)}} A$,
$\mathrm{Ind}_{x,A} : \forall_{y_1} A[x := \mathrm{Inl}(y_1)] \to \forall_{y_2} A[x := \mathrm{Inr}(y_2)] \to \forall_{x^{\rho_1 + \rho_2}} A$.
Induction over the structure-total ideals is defined similarly. For instance, in the formula above expressing list induction $\mathrm{Ind}_{l,A}$ we can let x range over arbitrary ideals, and l over the structure-total ones.
3.3. Dense and separating sets. We now prove the density theorem, which says that any finitely generated functional (i.e., any $\overline{U}$ with $U \in Con_\rho$) can be extended to a total functional. However, we need some assumptions on the base types for this theorem to hold. Otherwise, density might fail for the trivial reason that there are no total ideals at all (e.g., in $\mu\alpha\,(\alpha \to \alpha)$). A type $\mu\alpha_1, \dots, \alpha_N\,(\kappa_1, \dots, \kappa_n)$ is said to have total ideals if for every j (1 ≤ j ≤ N) there is a constructor type $\kappa_{i_j}$ of form (1) with $j_1, \dots, j_n < j$. Then clearly for every j we have a total ideal of type $\alpha_j$; call it $z_j$. Moreover, we assume that all base types are finitary. Then their total ideals are finite and maximal, which will be used in the proof.
Theorem 3.2 (Density). Assume that all base types are finitary and have total ideals. Then for any $U \in Con_\rho$ we can find an $x \in G_\rho$ such that U ⊆ x.
Proof. Call a type ρ dense if
\[ \forall_{U \in Con_\rho} \exists_{x \in G_\rho}\, U \subseteq x, \]
and separating if
\[ \forall_{U_1, U_2 \in Con_\rho} \bigl( U_1 \cup U_2 \notin Con_\rho \to \exists_{\vec z \in G}\, \mathrm{InCon}(U_1(\vec z) \cup U_2(\vec z)) \bigr). \]
Here $\vec z \in G$ means that $\vec z$ is a sequence of total $z_i$ such that $U_j \vec z$ is of a base type μ. We prove by simultaneous induction on ρ that any type ρ is dense and separating. This extended claim is needed for the inductive argument.
For base types μ both claims are easy: the fact that μ is separating is obvious, and density for μ can be inferred from the induction hypothesis, as follows. For simplicity of notation assume that μ is non-simultaneously defined. Let $U \in Con_\mu$. Then (since μ is finitary) $\exists_b \forall_{a \in U}\, b \ge a$. In the token b, replace every constructor symbol by its corresponding continuous function, every token at a parameter argument position by a total ideal of its type (which exists by induction hypothesis), and every ∗-token at a type-μ-position by the total ideal z of type μ (which exists by assumption). The result is the required total ideal.
ρ ⇒ σ is separating: This will follow from the inductive hypotheses that ρ is dense and σ is separating. So let $W, W' \in Con_{\rho \Rightarrow \sigma}$ such that $W \cup W' \notin Con_{\rho \Rightarrow \sigma}$. Then there are (U, a) ∈ W and (U', a') ∈ W' such that $U \cup U' \in Con_\rho$ but $\{a, a'\} \notin Con_\sigma$. Since ρ is dense, we have a $z \in G_\rho$ such that U ∪ U' ⊆ z. Hence a ∈ W(z) and a' ∈ W'(z). Now since σ is separating there are $\vec z \in G$ such that
\[ \mathrm{InCon}\bigl(\{a\}(\vec z) \cup \{a'\}(\vec z)\bigr), \]
hence also
\[ \mathrm{InCon}\bigl(W(z, \vec z) \cup W'(z, \vec z)\bigr). \]
This concludes the proof that ρ ⇒ σ is separating.
ρ ⇒ σ is dense: This will follow from the inductive hypotheses that ρ is separating and σ is dense. So fix $W = \{ (U_i, a_i) \mid i \in I \} \in Con_{\rho \Rightarrow \sigma}$. Consider i, j such that $\{a_i, a_j\} \notin Con_\sigma$. Then $U_i \cup U_j \notin Con_\rho$. Since ρ is separating, there are $\vec z_{ij} \in G$ and $k_{ij}, l_{ij} \in G$ such that with $k_{ij} := U_i(\vec z_{ij})$ and $l_{ij} := U_j(\vec z_{ij})$
\[ \mathrm{InCon}(k_{ij} \cup l_{ij}). \]
We clearly may assume that $\vec z_{ij} = \vec z_{ji}$ and $(k_{ij}, l_{ij}) = (l_{ji}, k_{ji})$. Now define for any $U \in Con_\rho$ a set $I_U$ of indices i ∈ I such that “U behaves as $U_i$ with respect to the $\vec z_{ij}$”. More precisely, let
\[ I_U := \{\, i \in I \mid \forall_j\, (\{a_i, a_j\} \notin Con_\sigma \to U(\vec z_{ij}) = k_{ij}) \,\}. \]
We first show that
\[ \{\, a_i \mid i \in I_U \,\} \in Con_\sigma. \tag{2} \]
It suffices to show that $\{a_i, a_j\} \in Con_\sigma$ for all $i, j \in I_U$. So let $i, j \in I_U$ and assume $\{a_i, a_j\} \notin Con_\sigma$. Then $U(\vec z_{ij}) = k_{ij}$ as $i \in I_U$ and $U(\vec z_{ji}) = k_{ji}$ as $j \in I_U$, and because of $\vec z_{ij} = \vec z_{ji}$ and $\mathrm{InCon}(k_{ij} \cup k_{ji})$ (recall $l_{ij} = k_{ji}$) we could conclude that $U(\vec z_{ij})$ would be inconsistent. This contradiction proves $\{a_i, a_j\} \in Con_\sigma$ and hence (2).
Since (2) holds and σ is dense by induction hypothesis, we can find $y_{I_U} \in G_\sigma$ such that $a_i \in y_{I_U}$ for all $i \in I_U$. Define $r \subseteq Con_\rho \times C_\sigma$ by
\[ r(U, a) \;:\Longleftrightarrow\; \begin{cases} a \in y_{I_U}, & \text{if } U(\vec z_{ij}) \text{ is finite and maximal for all } \vec z_{ij}; \\ \exists_{i \in I_U}\, a_i \ge a, & \text{otherwise.} \end{cases} \]
We will show that $r \in G_{\rho \Rightarrow \sigma}$ and W ⊆ r.
For W ⊆ r we have to show $r(U_i, a_i)$ for all i ∈ I. But this holds, since clearly $i \in I_{U_i}$ and also $a_i \in y_{I_{U_i}}$.
We now show that r is an approximable map, i.e., that $r \in |C_{\rho \Rightarrow \sigma}|$. To prove this we have to verify the defining properties of approximable maps.
(a). $r(U, b_1)$ and $r(U, b_2)$ implies $\{b_1, b_2\} \in Con_\sigma$. If $U(\vec z_{ij})$ is finite and maximal for all $\vec z_{ij}$, the claim follows from the consistency of $y_{I_U}$. If not, the claim follows from Lemma 2.3.
(b). r(U, b), $V \ge_A U$ and $b \ge_B c$ implies r(V, c). First assume that $U(\vec z_{ij})$ is finite and maximal for all $\vec z_{ij}$. Then also $V(\vec z_{ij})$ is maximal for all $\vec z_{ij}$. From r(U, b) we get $b \in y_{I_U}$. We have to show that $c \in y_{I_V}$. But since $U(\vec z_{ij})$ and $V(\vec z_{ij})$ are maximal for all $\vec z_{ij}$ and V ≥ U, they must have the same values on the $\vec z_{ij}$, hence $I_U = I_V$, so $y_{I_U} = y_{I_V}$ and therefore $c \in y_{I_V}$ by deductive closure. Now assume the contrary. From r(U, b) we get $a_i \ge b$ for some $i \in I_U$. From V ≥ U we can conclude $I_U \subseteq I_V$, by the definition of $I_U$. Hence $i \in I_V$, and also $b \in y_{I_V}$ (since $a_i \in y_{I_U}$ for all $i \in I_U$, and $y_{I_V}$ is deductively closed). Therefore r(V, b) and hence r(V, c). This concludes the proof that r is an approximable map.
It remains to prove $r \in G_{\rho \Rightarrow \sigma}$. So let $x \in G_\rho$. We must show
\[ |r|(x) = \{\, a \in C_\sigma \mid \exists_{U \subseteq x}\, r(U, a) \,\} \in G_\sigma. \]
Now $x(\vec z_{ij})$ is total for all i, j, hence by our assumption on base types finite and maximal. So there is some $U_{ij} \subseteq x$ such that $U_{ij}(\vec z_{ij}) = x(\vec z_{ij})$. Let U ⊆ x be the union of all the $U_{ij}$. Then by definition r(U, a) for all $a \in y_{I_U}$. Therefore $y_{I_U} \subseteq |r|(x)$ and hence $|r|(x) \in G_\sigma$.
§4. Terms; denotational and operational semantics. For every type ρ, we have defined what a partial continuous functional of type ρ is: an ideal consisting of tokens at this type. These tokens or rather the formal neighborhoods formed from them are syntactic in nature; they are reminiscent of Kreisel’s “formal neighborhoods” [20, 24, 14]. However, in contrast to [24],
we do not have to deal separately with a notion of consistency for formal neighborhoods: this concept is built into information systems.
Let us now turn our attention to a formal (functional programming) language, in the style of Plotkin’s PCF [25], and see how we can provide a denotational semantics (that is, a “meaning”) for the terms of this language. A closed term M of type ρ will denote a partial continuous functional of this type, that is, a consistent and deductively closed set of tokens of type ρ. We will define this set inductively. It will turn out that these sets are recursively enumerable. In this sense each closed term M of type ρ denotes a computable partial continuous functional of type ρ. However, it is not a good idea to define a computable functional in this way, by providing a recursive enumeration of its tokens. We rather want to be able to use recursion equations for such definitions. Therefore we extend the term language by constants D defined by certain “computation rules”, as in [11, 8]. Our semantics will cover these as well.
There are some natural questions one can have for such a term language:
1. Preservation of values under conversion (as in [24, First Theorem]). Here we need to include applications of computation rules.
2. An adequacy theorem (as in [25, Theorem 3.1] or [24, Second Theorem]), which in our setting says that whenever a closed term has a proper token in the ideal it denotes, then it evaluates to a constructor term entailing this token.
3. Strong normalization, that is, termination of arbitrary conversions, provided the only defined constants are the structural recursion operators.
Properties 1, 2 and 3 will be proved in the present section, in Section 5 and in Section 6, respectively.
Coquand [14] observed that the types play only a somewhat minor role in this setup. It suffices to know the arity (a natural number) of the constants (constructors and defined constants), to guide the definitions. An interesting consequence is that one can use this approach for dependently typed languages as well, for instance, the terms of Martin-Löf’s type theory.
4.1. Terms. Terms are built from (typed) variables and (typed) constants (constructors C or defined constants D, see below) by (type-correct) application and abstraction:
\[ M, N ::= x^\rho \mid C^\rho \mid D^\rho \mid (\lambda_{x^\rho} M^\sigma)^{\rho \Rightarrow \sigma} \mid (M^{\rho \Rightarrow \sigma} N^\rho)^\sigma. \]
Every defined constant comes with a system of computation rules, consisting of finitely many equations $D\vec P_i = M_i$ (i = 1, . . . , n) with constructor patterns $\vec P_i$, such that $\vec P_i$ and $\vec P_j$ (i ≠ j) are non-unifiable. Constructor patterns are lists of applicative terms with distinct variables, defined inductively as follows (we write $\vec P(\vec x)$ to indicate all variables in $\vec P$; notice that x can be a variable for a formal neighborhood, and that all expressions must be type-correct):
• x(x) is a constructor pattern.
• If C is a constructor and $\vec P(\vec x)$ a constructor pattern, then $(C\vec P)(\vec x)$ is a constructor pattern.
• If $\vec P(\vec x)$ and $\vec Q(\vec y)$ are constructor patterns whose variables $\vec x$ and $\vec y$ are disjoint, then $(\vec P, \vec Q)(\vec x, \vec y)$ is a constructor pattern.
4.2. Ideals as meaning of terms. How can we use computation rules to define an ideal z in a function space? The general idea is to inductively define the set of tokens (U, b) that make up z. However, since arbitrary terms are allowed on the right, we need to define the value $[[\lambda_{\vec x} M]]$, where M is a term with free variables among $\vec x$. Since this value is a token set, we can define inductively the relation $(\vec U, b) \in [[\lambda_{\vec x} M]]$.
We use the following notation. $(\vec U, b)$ means $(U_1, \dots (U_n, b) \dots)$, and at argument positions of constructors we use $b^*$ for extended tokens as well as for formal neighborhoods. $(\vec U, V) \subseteq [[\lambda_{\vec x} M]]$ means $(\vec U, b) \in [[\lambda_{\vec x} M]]$ for all (finitely many) b ∈ V. For a constructor C, let
\[ C(V) := \begin{cases} \{C{*}\} & \text{if } V = \emptyset, \\ \{\, Ca \mid a \in V \,\} & \text{otherwise.} \end{cases} \]
Definition 4.1 (Inductive, of $(\vec U, b) \in [[\lambda_{\vec x} M]]$).
\[ \frac{U_i \ge b}{(\vec U, b) \in [[\lambda_{\vec x}\, x_i]]}\ (V), \qquad \frac{(\vec U, V) \subseteq [[\lambda_{\vec x} N]] \quad (\vec U, V, c) \in [[\lambda_{\vec x} M]]}{(\vec U, c) \in [[\lambda_{\vec x}.MN]]}\ (A). \]
For every constructor C and defined constant D we have
\[ \frac{\vec V \ge \vec b^*}{(\vec U, \vec V, C\vec b^*) \in [[\lambda_{\vec x} C]]}\ (C), \qquad \frac{(\vec U, \vec V, b) \in [[\lambda_{\vec x, \vec y} M]]}{(\vec U, \vec P(\vec V), b) \in [[\lambda_{\vec x} D]]}\ (D), \]
with one such rule (D) for every computation rule $D\vec P(\vec y) = M$.
Here are some simple consequences of this definition. First we show a useful property of constructors:
Lemma 4.2. $(\vec U, b) \in [[\lambda_{\vec x}.C\vec N]]$ iff there are $\vec c^* \ge \vec b^*$ such that $b = C\vec b^*$ and $(\vec U, c_i^*) \in [[\lambda_{\vec x} N_i]]$ (i = 1, . . . , n).
Proof. We may assume that $\vec b^*, \vec c^*$ are tokens $\vec b, \vec c$. Let $(\vec U, c_i) \in [[\lambda_{\vec x} N_i]]$ ($c_i \ge b_i$, i = 1, . . . , n). For j = 0, . . . , n we show $(\vec U, \{c_{j+1}\}, \dots, \{c_n\}, C\vec b) \in [[\lambda_{\vec x}.CN_1 \dots N_j]]$. In case j = 0 use (C):
\[ \frac{\{c_1\} \ge b_1 \quad \dots \quad \{c_n\} \ge b_n}{(\vec U, \{c_1\}, \dots, \{c_n\}, C\vec b) \in [[\lambda_{\vec x} C]]}. \]
In the step from j − 1 to j use (A):
\[ \frac{(\vec U, c_j) \in [[\lambda_{\vec x} N_j]] \quad (\vec U, \{c_j\}, \dots, \{c_n\}, C\vec b) \in [[\lambda_{\vec x}.CN_1 \dots N_{j-1}]]}{(\vec U, \{c_{j+1}\}, \dots, \{c_n\}, C\vec b) \in [[\lambda_{\vec x}.CN_1 \dots N_j]]}. \]
For j = n the claim follows. For the other direction, observe that only (A) could have been applied. Hence the argument can be read backwards.
Using the fact that the left hand sides of computation rules are non-unifiable we can prove:
Lemma 4.3. $[[\lambda_{\vec x} M]]$ is an ideal, i.e., consistent and deductively closed.
Proof. Induction on $(\vec U, b) \in [[\lambda_{\vec x} M]]$.
(1) Consistency. Case (V). Assume $(\vec U_1, b_1), (\vec U_2, b_2) \in [[\lambda_{\vec x}\, x_i]]$, and that $\vec U_1$ and $\vec U_2$ are pairwise consistent. We must show $\{b_1, b_2\} \in Con$. By (V), $U_{1i} \ge b_1$ and $U_{2i} \ge b_2$. Now $\{b_1, b_2\} \in Con$ follows from $U_{1i} \cup U_{2i} \in Con$.
Case (C). Assume $(\vec U_1, \vec V_1, C\vec b_1^*), (\vec U_2, \vec V_2, C\vec b_2^*) \in [[\lambda_{\vec x} C]]$, and that $\vec U_1, \vec V_1$ and $\vec U_2, \vec V_2$ are pairwise consistent. We must show $\{C\vec b_1^*, C\vec b_2^*\} \in Con$. By (C), $\vec V_i \ge \vec b_i^*$ (i = 1, 2). From the pairwise consistency of $\vec V_1$ and $\vec V_2$ we obtain the pairwise consistency of $\vec b_1^*$ and $\vec b_2^*$. Hence $\{C\vec b_1^*, C\vec b_2^*\} \in Con$.
Case (A). Let $(\vec U_1, c_1), (\vec U_2, c_2) \in [[\lambda_{\vec x}.MN]]$, with $\vec U_1$ and $\vec U_2$ pairwise consistent. We show $\{c_1, c_2\} \in Con$. By (A), $(\vec U_1, V_1), (\vec U_2, V_2) \subseteq [[\lambda_{\vec x} N]]$, so by IH $V_1 \cup V_2 \in Con$. Similarly, again by (A), $(\vec U_1, V_1, c_1), (\vec U_2, V_2, c_2) \in [[\lambda_{\vec x} M]]$, hence $\{c_1, c_2\} \in Con$ by IH.
Case (D). Let $(\vec U_i, \vec P_i(\vec V_i), b_i) \in [[\lambda_{\vec x} D]]$ (i = 1, 2), and assume that $\vec U_1, \vec P_1(\vec V_1)$ and $\vec U_2, \vec P_2(\vec V_2)$ are pairwise consistent. From the fact that the left hand sides of computation rules are non-unifiable we can infer $\vec P_1 = \vec P_2$, and that $\vec V_1$ and $\vec V_2$ are pairwise consistent. Then $\{b_1, b_2\} \in Con$ by IH.
(2) Closure under ≥. Case (V). Assume $\vec V \ge \vec U$ and b ≥ c. We must show $(\vec V, c) \in [[\lambda_{\vec x}\, x_i]]$. By (V) it suffices to show $V_i \ge c$. But this follows from $V_i \ge U_i \ge b \ge c$.
Case (C). Assume $\vec U_1 \ge \vec U$, $\vec V_1 \ge \vec V$ and $C\vec b^* \ge C\vec c^*$. We must show $(\vec U_1, \vec V_1, C\vec c^*) \in [[\lambda_{\vec x} C]]$. By (C) it suffices to show $\vec V_1 \ge \vec c^*$. But this follows from $\vec V_1 \ge \vec V \ge \vec b^* \ge \vec c^*$.
Case (A). The IH clearly suffices here.
Case (D). Assume $\vec U_1 \ge \vec U$, $\vec Z \ge \vec P(\vec V)$ and $b \ge b_1$. Notice that $\vec Z \ge \vec P(\vec V)$ implies $\vec Z = \vec P(\vec V_1)$ with $\vec V_1 \ge \vec V$, so we must show $(\vec U_1, \vec P(\vec V_1), b_1) \in [[\lambda_{\vec x} D]]$. By IH we have $(\vec U_1, \vec V_1, b_1) \in [[\lambda_{\vec x, \vec y} M]]$. Now use (D).
4.3. Preservation of values. We now prove that our definition above of the meaning of a term is reasonable in the sense that an application of the standard (β- and η-) conversions and also of a computation rule does not change the meaning of a term. For the β-conversion part of this proof it is helpful to first introduce a more standard notation, which involves variable environments.
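Definition 4.4. Assume that all free variables in M are among $\vec x$.
\[ [[M]]^{\vec U}_{\vec x} := \{\, b \mid (\vec U, b) \in [[\lambda_{\vec x} M]] \,\}, \qquad [[M]]^{\vec u}_{\vec x} := \bigcup_{\vec U \subseteq \vec u} [[M]]^{\vec U}_{\vec x}. \]
We have a useful monotonicity property, which follows from the deductive closure of $[[\lambda_{\vec x} M]]$.
Lemma 4.5. (a) If $\vec V \ge \vec U$, b ≥ c and $b \in [[M]]^{\vec U}_{\vec x}$, then $c \in [[M]]^{\vec V}_{\vec x}$.
(b) If $\vec v \supseteq \vec u$, b ≥ c and $b \in [[M]]^{\vec u}_{\vec x}$, then $c \in [[M]]^{\vec v}_{\vec x}$.
Proof. (a). By the deductive closure of $[[\lambda_{\vec x} M]]$, $\vec V \ge \vec U$, b ≥ c and $(\vec U, b) \in [[\lambda_{\vec x} M]]$ together imply $(\vec V, c) \in [[\lambda_{\vec x} M]]$. (b) follows from (a).
Lemma 4.6. (a) $[[x_i]]^{\vec u}_{\vec x} = u_i$.
(b) $[[\lambda_y M]]^{\vec u}_{\vec x} = \{\, (V, b) \mid b \in [[M]]^{\vec u, V}_{\vec x, y} \,\}$.
(c) $[[MN]]^{\vec u}_{\vec x} = [[M]]^{\vec u}_{\vec x}\, [[N]]^{\vec u}_{\vec x}$.
Proof. (b). It suffices to prove this with $\vec U$ for $\vec u$. But $(V, b) \in [[\lambda_y M]]^{\vec U}_{\vec x}$ and $b \in [[M]]^{\vec U, V}_{\vec x, y}$ are both equivalent to $(\vec U, V, b) \in [[\lambda_{\vec x, y} M]]$.
(c).
\begin{align*}
c \in [[M]]^{\vec u}_{\vec x}\, [[N]]^{\vec u}_{\vec x}
 &\leftrightarrow \exists_{V \subseteq [[N]]^{\vec u}_{\vec x}}\, (V, c) \in [[M]]^{\vec u}_{\vec x} && \text{(application in acis's)} \\
 &\leftrightarrow \exists_{V \subseteq [[N]]^{\vec u}_{\vec x}} \exists_{\vec U \subseteq \vec u}\, (V, c) \in [[M]]^{\vec U}_{\vec x} \\
 &\leftrightarrow \exists_{\vec U_1 \subseteq \vec u} \exists_{V \subseteq [[N]]^{\vec U_1}_{\vec x}} \exists_{\vec U \subseteq \vec u}\, (V, c) \in [[M]]^{\vec U}_{\vec x} \\
 &\leftrightarrow^{(*)} \exists_{\vec U \subseteq \vec u} \exists_{V \subseteq [[N]]^{\vec U}_{\vec x}}\, (V, c) \in [[M]]^{\vec U}_{\vec x} \\
 &\leftrightarrow \exists_{\vec U \subseteq \vec u} \exists_V \bigl( (\vec U, V) \subseteq [[\lambda_{\vec x} N]] \wedge (\vec U, V, c) \in [[\lambda_{\vec x} M]] \bigr) \\
 &\leftrightarrow \exists_{\vec U \subseteq \vec u}\, (\vec U, c) \in [[\lambda_{\vec x}.MN]] && \text{(by (A))} \\
 &\leftrightarrow \exists_{\vec U \subseteq \vec u}\, c \in [[MN]]^{\vec U}_{\vec x} \\
 &\leftrightarrow c \in [[MN]]^{\vec u}_{\vec x}.
\end{align*}
Here is the proof of the equivalence marked (∗). The upwards direction is obvious. For the downwards direction we use monotonicity. Assume $\vec U_1 \subseteq \vec u$, $V \subseteq [[N]]^{\vec U_1}_{\vec x}$, $\vec U \subseteq \vec u$ and $(V, c) \in [[M]]^{\vec U}_{\vec x}$. Let $\vec U_2 := \vec U_1 \cup \vec U \subseteq \vec u$. Then by monotonicity $V \subseteq [[N]]^{\vec U_2}_{\vec x}$ and $(V, c) \in [[M]]^{\vec U_2}_{\vec x}$.
Corollary 4.7. $[[\lambda_y M]]^{\vec u}_{\vec x}\, v = [[M]]^{\vec u, v}_{\vec x, y}$.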
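Proof.
\begin{align*}
b \in [[\lambda_y M]]^{\vec u}_{\vec x}\, v
 &\leftrightarrow \exists_{V \subseteq v}\, (V, b) \in [[\lambda_y M]]^{\vec u}_{\vec x} && \text{(application in acis's)} \\
 &\leftrightarrow \exists_{V \subseteq v}\, b \in [[M]]^{\vec u, V}_{\vec x, y} && \text{(by the lemma)} \\
 &\leftrightarrow b \in [[M]]^{\vec u, v}_{\vec x, y}.
\end{align*}
Lemma 4.8 (Substitution). $[[M]]^{\vec u, [[N]]^{\vec u}_{\vec x}}_{\vec x, z} = [[M[z := N]]]^{\vec u}_{\vec x}$.
Proof. Case $\lambda_y M$. For readability we leave out $\vec x$ and $\vec u$.
\begin{align*}
[[\lambda_y M]]^{[[N]]}_{z}
 &= \{\, (V, b) \mid b \in [[M]]^{[[N]], V}_{z, y} \,\} \\
 &= \{\, (V, b) \mid b \in [[M[z := N]]]^{V}_{y} \,\} && \text{(by IH)} \\
 &= [[\lambda_y.M[z := N]]] && \text{(by the lemma)} \\
 &= [[(\lambda_y M)[z := N]]].
\end{align*}
The other cases are easy.
Lemma 4.9. $[[(\lambda_y M)N]]^{\vec u}_{\vec x} = [[M[y := N]]]^{\vec u}_{\vec x}$.
Proof. For readability we leave out $\vec x$ and $\vec u$. By the last two lemmas and the corollary, $[[(\lambda_y M)N]] = [[\lambda_y M]]\,[[N]] = [[M]]^{[[N]]}_{y} = [[M[y := N]]]$.
Lemma 4.10. $[[\lambda_y.My]]^{\vec u}_{\vec x} = [[M]]^{\vec u}_{\vec x}$, if $y \notin FV(M)$.
Proof. For readability we leave out $\vec x$ and $\vec u$.
\begin{align*}
(V, b) \in [[\lambda_y.My]]
 &\leftrightarrow b \in [[My]]^{V}_{y} \\
 &\leftrightarrow b \in [[M]]\,\overline{V} \\
 &\leftrightarrow \exists_{U \subseteq \overline{V}}\, (U, b) \in [[M]] && \text{(application in acis's)} \\
 &\leftrightarrow (V, b) \in [[M]],
\end{align*}
where in the last step we have used monotonicity.
To prove preservation of values under computation rules, the following observation will be needed (it removes the need for “(generalized) predecessor functions” [11, 8]):
Lemma 4.11. $(\vec U, \vec V, b) \in [[\lambda_{\vec x, \vec y}.M[z := C\vec y]]]$ iff $(\vec U, C\vec V, b) \in [[\lambda_{\vec x, z} M]]$.
Proof. Induction on $(\vec U, \vec V, b) \in [[\lambda_{\vec x, \vec y}.M[z := C\vec y]]]$, and cases on the form of M.
Case MN.
\begin{align*}
(\vec U, \vec V, c) &\in [[\lambda_{\vec x, \vec y}.M[z := C\vec y]\,N[z := C\vec y]]] \\
 &\leftrightarrow \exists_Z \bigl( (\vec U, \vec V, Z) \subseteq [[\lambda_{\vec x, \vec y} N[z := C\vec y]]] \wedge (\vec U, \vec V, Z, c) \in [[\lambda_{\vec x, \vec y} M[z := C\vec y]]] \bigr) \\
 &\leftrightarrow \exists_Z \bigl( (\vec U, C\vec V, Z) \subseteq [[\lambda_{\vec x, z} N]] \wedge (\vec U, C\vec V, Z, c) \in [[\lambda_{\vec x, z} M]] \bigr) && \text{(by IH)} \\
 &\leftrightarrow (\vec U, C\vec V, c) \in [[\lambda_{\vec x, z}.MN]] && \text{(by (A)).}
\end{align*}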
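Case z.
\begin{align*}
(\vec U, \vec V, c) \in [[\lambda_{\vec x, \vec y}.C\vec y]] = [[\lambda_{\vec x} C]]
 &\leftrightarrow \exists_{\vec b^*} \bigl( \vec V \ge \vec b^* \wedge C\vec b^* \ge c \bigr) \\
 &\leftrightarrow C\vec V \ge c \\
 &\leftrightarrow (\vec U, C\vec V, c) \in [[\lambda_{\vec x, z}\, z]].
\end{align*}
In all other cases both sides are clearly equivalent.
We can now prove preservation of values under computation rules:
Lemma 4.12. For every computation rule $D\vec P(\vec y) = M$ of a defined constant D, $[[\lambda_{\vec y}.D\vec P(\vec y)]] = [[\lambda_{\vec y} M]]$.
Proof. The following are equivalent:
\begin{align*}
& (\vec V, b) \in [[\lambda_{\vec y}.D\vec P(\vec y)]] \\
& (\vec P(\vec V), b) \in [[D]] = [[\lambda_{\vec z}.D\vec z]] && \text{(by Lemma 4.11)} \\
& (\vec V, b) \in [[\lambda_{\vec y} M]],
\end{align*}
where the last step is by definition.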
4.4. Examples. We consider the doubling function D : N ⇒ N, addition + : N ⇒ N ⇒ N and the fixed point operators Y. Structural recursion could be treated as well.
Doubling. D : N ⇒ N is defined by the computation rules
\[ D0 = 0, \qquad D(Sn) = S(S(Dn)). \]
One can show easily that all tokens
\[ (\{0\}, 0), \qquad (\{S^{n+1}0\}, S^{2n+2}0), \qquad (\{S^{n+1}{*}\}, S^{2n+2}{*}) \]
are in [[D]], and that any token (V, c) ∈ [[D]] is entailed by one of these.
Addition. + : N ⇒ N ⇒ N is defined by the computation rules
\[ n + 0 = n, \qquad n + Sm = S(n + m). \]
As above one shows that all tokens
\[ (\{0\}, 0), \qquad (\{S^{n+1}0\}, S^{n+1}0), \qquad (\{S^{n+1}{*}\}, S^{n+1}{*}) \]
are in $[[\lambda_m.0 + m]]$, and that any token $(V, c) \in [[\lambda_m.0 + m]]$ is entailed by one of these. So we can conclude that $[[\lambda_m.0 + m]] = [[\lambda_m m]]$. This is of interest, because it allows us to replace 0 + M by M for an arbitrary (not necessarily total) term M without affecting the values.
Fixed points. The computation rule Y f = f(Y f) defines the fixed point operator Y of type (ρ ⇒ ρ) ⇒ ρ.
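The generating tokens listed for doubling can be applied to ideals of N by the formula |r|(z) := { b | r(U, b) for some U ⊆ z } from Section 2.2. The sketch below is illustrative only and rests on assumptions made for this example: tokens are encoded as pairs, ("0", n) for $S^n 0$ and ("*", n) for $S^n{*}$ with n ≥ 1, the function names are hypothetical, and the deductive closure of a single token is computed in closed form from the entailment relation for N.

```python
def ideal(token):
    """Deductive closure of one token of N: the token itself plus all S^m * with 1 <= m <= n."""
    kind, n = token
    below = {("*", m) for m in range(1, n + 1)}
    return below | ({token} if kind == "0" else set())


def doubling_generators(bound):
    """The generating tokens of [[D]] listed in the text, for n < bound."""
    gens = [(frozenset({("0", 0)}), ("0", 0))]
    for n in range(bound):
        gens.append((frozenset({("0", n + 1)}), ("0", 2 * n + 2)))
        gens.append((frozenset({("*", n + 1)}), ("*", 2 * n + 2)))
    return gens


def apply(gens, z):
    """|r|(z) for the ideal r generated by gens: fire every generator (U, b) with U ⊆ z.

    Closing each result downwards accounts for the deductive closure of r.
    """
    out = set()
    for U, b in gens:
        if U <= z:
            out |= ideal(b)
    return out


gens = doubling_generators(3)
```

With this encoding, applying the generators to the ideal S(S0) yields the ideal $S^4 0$, applying them to S⊥ yields S(S⊥), and ⊥ is mapped to ⊥, so partial arguments produce partial but informative results.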
§5. Adequacy. The adequacy theorem of Plotkin [25, Theorem 3.1] says that whenever the value of a closed term M is a numeral, then M head-reduces to this numeral. So in this sense the (denotational) semantics is (computationally) “adequate”. Plotkin’s proof is by induction on the types, and uses a computability predicate. We prove an adequacy theorem in our setting, for arbitrary computation rules.
5.1. Operational semantics. Recall that a token of a base type μ is either ∗ or a constructor expression (possibly involving ∗) whose outermost constructor is for μ. We use B to denote both, constructors C and defined constants D.
Definition 5.1 ($M \succ_1 N$, M head-reduces to N).
\[ (\lambda_x M)N \succ_1 M[x := N], \]
\[ D\vec P(\vec N) \succ_1 M[\vec y := \vec N] \quad \text{for } D\vec P(\vec y) = M \text{ a computation rule,} \]
\[ \frac{M \succ_1 M'}{MN \succ_1 M'N}, \qquad \frac{M \succ_1 M'}{Ba_1 \dots a_n M \succ_1 Ba_1 \dots a_n M'} \quad \text{for } n < \mathrm{ar}(B). \]
$\succ$ denotes the reflexive transitive closure of $\succ_1$. Clearly for every term M there is at most one M' such that $M \succ_1 M'$; call M normal if there is no such M'.
We define an “operational interpretation” [24] of formal neighborhoods U. To this end we define a notion M ∈ [a], for M closed, by induction on the type of the token a, and write M ∈ [U] for $\forall_{a \in U}\, M \in [a]$.
Definition 5.2. (a) For a of base type μ, M ∈ [a] iff $\exists_{b \ge a}\, M \succ b$.
(b) M ∈ [(U, b)] iff $M \succ \lambda_x M'$ or $M \succ B\vec M$ with length of $\vec M$ less than ar(B), and $\forall_{N \in [U]}\, MN \in [b]$.
Lemma 5.3. If $M \succ N$, N ∈ [V] and V ≥ U, then M ∈ [U].
Proof. Let a ∈ U. We show M ∈ [a], i.e., $\exists_{K \ge a}\, M \succ K$. Because of V ≥ U we have c ∈ V such that c ≥ a. Because of N ∈ [c] we have a term K such that $N \succ K \ge c$. Hence $M \succ N \succ K \ge c \ge a$.
Theorem 5.4 (Adequacy). If $(\vec U, b) \in [[\lambda_{\vec x} M]]$ with b a proper token, then $\lambda_{\vec x} M \in [(\vec U', b')]$ for some $(\vec U', b') \ge (\vec U, b)$.
Proof. By induction on the rules defining $(\vec U, b) \in [[\lambda_{\vec x} M]]$, and cases on the form of M.
Case $x_i$.
\[ \frac{U_i \ge b}{(\vec U, b) \in [[\lambda_{\vec x}\, x_i]]}\ (V). \]
We need $(\vec U', b') \ge (\vec U, b)$ such that $\lambda_{\vec x}\, x_i \in [(\vec U', b')]$, i.e., $\forall_{\vec K \in [\vec U']}\, K_i \in [b']$. Take $\vec U' = \vec U$, $b' = b$. Let $K_i \in [U_i]$. Then by definition $K_i \in [b']$.
RECURSION ON THE PARTIAL CONTINUOUS FUNCTIONALS
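Case $MN$.
$$\frac{(\vec{U}, V) \subseteq [\![\lambda_{\vec{x}} N]\!] \qquad (\vec{U}, V, c) \in [\![\lambda_{\vec{x}} M]\!]}{(\vec{U}, c) \in [\![\lambda_{\vec{x}}.MN]\!]}\ (A).$$
We need to find some $(\vec{U}', c') \ge (\vec{U}, c)$ such that $\lambda_{\vec{x}}.MN \in [(\vec{U}', c')]$, i.e.,
$$\forall_{\vec{K} \in [\vec{U}']}\, (MN)[\vec{x} := \vec{K}] \in [c'].$$
By IH, for all $b \in V$ we have some $(\vec{U}_1, b') \ge (\vec{U}, b)$ such that $\lambda_{\vec{x}} N \in [(\vec{U}_1, b')]$, i.e., $\forall_{\vec{K} \in [\vec{U}_1]}\, N[\vec{x} := \vec{K}] \in [b']$. Recall that $(\vec{U}_1, b') \ge (\vec{U}, b)$ means $\vec{U} \ge \vec{U}_1$ and $b' \ge b$. Hence we can pick the same $\vec{U}_1$ for all $b \in V$, and
$$\forall_{\vec{K} \in [\vec{U}_1]}\, N[\vec{x} := \vec{K}] \in [V].$$
Also, by IH we have $(\vec{U}_2, V', c') \ge (\vec{U}, V, c)$ such that $\lambda_{\vec{x}} M \in [(\vec{U}_2, V', c')]$, i.e.,
$$\forall_{\vec{K} \in [\vec{U}_2]}\, M[\vec{x} := \vec{K}] \in [(V', c')].$$
Recall that $(\vec{U}_2, V', c') \ge (\vec{U}, V, c)$ means $\vec{U} \ge \vec{U}_2$, $V \ge V'$ and $c' \ge c$.

Let $\vec{U}' := \vec{U}_1 \cup \vec{U}_2$ (componentwise union), and fix $\vec{K} \in [\vec{U}']$. Clearly $\vec{K} \in [\vec{U}_1]$ and $\vec{K} \in [\vec{U}_2]$. From $M[\vec{x} := \vec{K}] \in [(V', c')]$ we know that $M[\vec{x} := \vec{K}] \succ \lambda_x M'$ or $M[\vec{x} := \vec{K}] \succ B\vec{M}$ with length of $\vec{M}$ less than $\mathrm{ar}(B)$, and also
$$\forall_{L \in [V']}\, M[\vec{x} := \vec{K}]L \in [c'].$$
Since $N[\vec{x} := \vec{K}] \in [V]$ and hence $\in [V']$ we obtain $(MN)[\vec{x} := \vec{K}] \in [c']$, as required.

Case $D$.
$$\frac{(\vec{U}, \vec{V}, b) \in [\![\lambda_{\vec{x}, \vec{y}} M]\!]}{(\vec{U}, \vec{P}(\vec{V}), b) \in [\![\lambda_{\vec{x}} D]\!]}\ (D),$$
with $D\vec{P}(\vec{y}) = M$ a computation rule. We need $(\vec{U}', \vec{Z}, b') \ge (\vec{U}, \vec{P}(\vec{V}), b)$ such that $\lambda_{\vec{x}} D \in [(\vec{U}', \vec{Z}, b')]$. Recall that $(\vec{U}', \vec{Z}, b') \ge (\vec{U}, \vec{P}(\vec{V}), b)$ means $\vec{U} \ge \vec{U}'$, $\vec{P}(\vec{V}) \ge \vec{Z}$ and $b' \ge b$.

By IH we have $(\vec{U}', \vec{V}', b') \ge (\vec{U}, \vec{V}, b)$ such that $\lambda_{\vec{x}, \vec{y}} M \in [(\vec{U}', \vec{V}', b')]$, i.e.,
$$\forall_{\vec{K} \in [\vec{U}']} \forall_{\vec{N} \in [\vec{V}']}\, M[\vec{y} := \vec{N}] \in [b'].$$
Recall that $(\vec{U}', \vec{V}', b') \ge (\vec{U}, \vec{V}, b)$ means $\vec{U} \ge \vec{U}'$, $\vec{V} \ge \vec{V}'$ and $b' \ge b$.

Pick the required $\vec{U}'$, $b'$ as the ones provided by the IH, and $\vec{Z} := \vec{P}(\vec{V}')$. We must show $\lambda_{\vec{x}} D \in [(\vec{U}', \vec{P}(\vec{V}'), b')]$, i.e.,
$$\forall_{\vec{K} \in [\vec{U}']} \forall_{\vec{L} \in [\vec{P}(\vec{V}')]}\, D\vec{L} \in [b'].$$
Now fix $\vec{K} \in [\vec{U}']$ and $\vec{L} \in [\vec{P}(\vec{V}')]$. Then $\vec{L} = \vec{P}(\vec{N})$ with $\vec{N} \in [\vec{V}']$. From $D\vec{P}(\vec{N}) \succ_1 M[\vec{y} := \vec{N}]$ and $M[\vec{y} := \vec{N}] \in [b']$ the claim follows.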
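Case $C$.
$$\frac{\vec{V} \ge \vec{b}^*}{(\vec{U}, \vec{V}, C\vec{b}^*) \in [\![\lambda_{\vec{x}} C]\!]}\ (C).$$
We need $(\vec{U}', \vec{V}', b') \ge (\vec{U}, \vec{V}, C\vec{b}^*)$ such that $\lambda_{\vec{x}} C \in [(\vec{U}', \vec{V}', b')]$. Recall that $(\vec{U}', \vec{V}', b') \ge (\vec{U}, \vec{V}, C\vec{b}^*)$ means $\vec{U} \ge \vec{U}'$, $\vec{V} \ge \vec{V}'$ and $b' \ge C\vec{b}^*$. Pick $\vec{U}' := \vec{U}$, $\vec{V}' := \vec{V}$ and $b' := C\vec{b}^*$. We must show $\lambda_{\vec{x}} C \in [(\vec{U}, \vec{V}, C\vec{b}^*)]$, i.e.,
$$\forall_{\vec{K} \in [\vec{U}]} \forall_{\vec{L} \in [\vec{V}]}\, C\vec{L} \in [C\vec{b}^*].$$
This follows from $\vec{V} \ge \vec{b}^*$.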
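Returning to Definition 5.1, a small worked head reduction may help fix ideas. This is only a sketch: it assumes addition $+$ as a defined constant $D$ with the two computation rules $y + 0 = y$ and $y_1 + \mathrm{S}y_2 = \mathrm{S}(y_1 + y_2)$, an illustrative choice that is not fixed anywhere in this section, and it writes $\succ_1$ for one-step head reduction.

```latex
% Assumed computation rules (illustrative, not taken from the text):
%   y + 0 = y,    y_1 + S y_2 = S(y_1 + y_2)
\begin{align*}
(\lambda_x.\, x + \mathrm{S}0)(\mathrm{S}0)
  &\succ_1 \mathrm{S}0 + \mathrm{S}0
     && \text{($\beta$-rule)} \\
  &\succ_1 \mathrm{S}(\mathrm{S}0 + 0)
     && \text{(rule for $D = {+}$ with $\vec{P}(\vec{y}) = (y_1, \mathrm{S}y_2)$)} \\
  &\succ_1 \mathrm{S}(\mathrm{S}0)
     && \text{(rule for $B = \mathrm{S}$ with $n = 0 < \mathrm{ar}(\mathrm{S})$, as $\mathrm{S}0 + 0 \succ_1 \mathrm{S}0$)}
\end{align*}
```

At each stage exactly one rule applies, in line with the remark that every term has at most one head-reduct, and the final term $\mathrm{S}(\mathrm{S}0)$ is normal.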
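§6. Strong normalization for structural recursion. It is well-known that in a system of simultaneously defined free algebras every term (possibly involving recursion operators) is strongly normalizing. However, the standard proof reduces the problem to strong normalization of second order propositional logic (called system F by Girard [17]). This latter result requires a method not formalizable in analysis. Here we give a much simpler proof, which only uses predicative methods.

6.1. Structural recursion. The inductive structure of the types $\vec{\mu} = \mu\vec{\alpha}\,\vec{\kappa}$ corresponds to two sorts of constants. With the constructors $C_i^{\vec{\mu}} : \kappa_i[\vec{\mu}]$ we can construct elements of a type $\mu_j$, and with the recursion operators $\mathcal{R}_{\mu_j}^{\vec{\mu},\vec{\tau}}$ we can construct total functionals from $\mu_j$ to $\tau_j$ by recursion on the structure of $\vec{\mu}$. In order to define the type of the recursion operators w.r.t. $\vec{\mu} = \mu\vec{\alpha}\,\vec{\kappa}$ and result types $\vec{\tau}$, we first define for
$$\kappa_i = \vec{\rho} \Rightarrow (\vec{\sigma}_1 \Rightarrow \alpha_{j_1}) \Rightarrow \dots \Rightarrow (\vec{\sigma}_n \Rightarrow \alpha_{j_n}) \Rightarrow \alpha_j \in \mathrm{KT}(\vec{\alpha})$$
the step type
$$\delta_i^{\vec{\mu},\vec{\tau}} := \vec{\rho} \Rightarrow (\vec{\sigma}_1 \Rightarrow \mu_{j_1}) \Rightarrow \dots \Rightarrow (\vec{\sigma}_n \Rightarrow \mu_{j_n}) \Rightarrow (\vec{\sigma}_1 \Rightarrow \tau_{j_1}) \Rightarrow \dots \Rightarrow (\vec{\sigma}_n \Rightarrow \tau_{j_n}) \Rightarrow \tau_j.$$
Here $\vec{\rho}, (\vec{\sigma}_1 \Rightarrow \mu_{j_1}), \dots, (\vec{\sigma}_n \Rightarrow \mu_{j_n})$ correspond to the components of the object of type $\mu_j$ under consideration, and $(\vec{\sigma}_1 \Rightarrow \tau_{j_1}), \dots, (\vec{\sigma}_n \Rightarrow \tau_{j_n})$ to the previously defined values. The recursion operator $\mathcal{R}_{\mu_j}^{\vec{\mu},\vec{\tau}}$ has type
$$\mathcal{R}_{\mu_j}^{\vec{\mu},\vec{\tau}} : \delta_1^{\vec{\mu},\vec{\tau}} \Rightarrow \dots \Rightarrow \delta_k^{\vec{\mu},\vec{\tau}} \Rightarrow \mu_j \Rightarrow \tau_j$$
(recall that $k$ is the total number of constructors for all types $\mu_1, \dots, \mu_N$). We will often write $\mathcal{R}_j^{\vec{\mu},\vec{\tau}}$ for $\mathcal{R}_{\mu_j}^{\vec{\mu},\vec{\tau}}$, and omit the upper indices $\vec{\mu},\vec{\tau}$ when they are clear from the context. In case of a non-simultaneous free algebra, i.e., of type $\mu\alpha\,\kappa$, for $\mathcal{R}_\mu^{\mu,\tau}$ we write $\mathcal{R}_\mu^\tau$.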
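Examples.
$$\mathrm{tt}^{\mathbf{B}} := C_1^{\mathbf{B}}, \qquad \mathrm{ff}^{\mathbf{B}} := C_2^{\mathbf{B}},$$
$$\mathcal{R}_{\mathbf{B}}^\tau : \tau \Rightarrow \tau \Rightarrow \mathbf{B} \Rightarrow \tau,$$
$$0^{\mathbf{N}} := C_1^{\mathbf{N}}, \qquad \mathrm{S}^{\mathbf{N} \Rightarrow \mathbf{N}} := C_2^{\mathbf{N}},$$
$$\mathcal{R}_{\mathbf{N}}^\tau : \tau \Rightarrow (\mathbf{N} \Rightarrow \tau \Rightarrow \tau) \Rightarrow \mathbf{N} \Rightarrow \tau,$$
$$\mathrm{Nil}^{\mathbf{L}(\alpha)} := C_1^{\mathbf{L}(\alpha)}, \qquad \mathrm{Cons}^{\alpha \Rightarrow \mathbf{L}(\alpha) \Rightarrow \mathbf{L}(\alpha)} := C_2^{\mathbf{L}(\alpha)},$$
$$\mathcal{R}_{\mathbf{L}(\alpha)}^\tau : \tau \Rightarrow (\alpha \Rightarrow \mathbf{L}(\alpha) \Rightarrow \tau \Rightarrow \tau) \Rightarrow \mathbf{L}(\alpha) \Rightarrow \tau,$$
$$\mathrm{Inl}^{\rho \Rightarrow \rho + \sigma} := C_1^{\rho + \sigma}, \qquad \mathrm{Inr}^{\sigma \Rightarrow \rho + \sigma} := C_2^{\rho + \sigma},$$
$$\mathcal{R}_{\rho + \sigma}^\tau : (\rho \Rightarrow \tau) \Rightarrow (\sigma \Rightarrow \tau) \Rightarrow \rho + \sigma \Rightarrow \tau,$$
$$(\otimes^+)^{\rho \Rightarrow \sigma \Rightarrow \rho \otimes \sigma} := C_1^{\rho \otimes \sigma},$$
$$\mathcal{R}_{\rho \otimes \sigma}^\tau : (\rho \Rightarrow \sigma \Rightarrow \tau) \Rightarrow \rho \otimes \sigma \Rightarrow \tau.$$

6.2. Conversion. To define the conversion relation, it will be useful to employ the following notation. Let $\vec{\mu} = \mu\vec{\alpha}\,\vec{\kappa}$ and
$$\kappa_i = \rho_1 \Rightarrow \dots \Rightarrow \rho_m \Rightarrow (\vec{\sigma}_1 \Rightarrow \alpha_{j_1}) \Rightarrow \dots \Rightarrow (\vec{\sigma}_n \Rightarrow \alpha_{j_n}) \Rightarrow \alpha_j \in \mathrm{KT}(\vec{\alpha}),$$
and consider $C_i\vec{N}$. Then we write $\vec{N}^P = N_1^P, \dots, N_m^P$ for the parameter arguments $N_1^{\rho_1}, \dots, N_m^{\rho_m}$ and $\vec{N}^R = N_1^R, \dots, N_n^R$ for the recursive arguments $N_{m+1}^{\vec{\sigma}_1 \Rightarrow \mu_{j_1}}, \dots, N_{m+n}^{\vec{\sigma}_n \Rightarrow \mu_{j_n}}$, and $n^R$ for the number $n$ of recursive arguments.

We define a conversion relation $\mapsto_\rho$ between terms of type $\rho$ by

(3) $(\lambda_x M)N \mapsto M[x := N]$

(4) $\lambda_x.Mx \mapsto M$ if $x \notin \mathrm{FV}(M)$ ($M$ not an abstraction)

(5) $(\mathcal{R}_j\vec{M})^{\mu_j \Rightarrow \tau_j}(C_i^{\vec{\mu}}\vec{N}) \mapsto M_i\vec{N}\,\bigl((\mathcal{R}_{j_1}\vec{M}) \circ N_1^R\bigr) \dots \bigl((\mathcal{R}_{j_n}\vec{M}) \circ N_n^R\bigr)$

Here we have written $\mathcal{R}_j$ for $\mathcal{R}_{\mu_j}^{\vec{\mu},\vec{\tau}}$.

The one step reduction relation $\to$ can now be defined as follows. $M \to N$ if $N$ is obtained from $M$ by replacing a subterm $M'$ in $M$ by $N'$, where $M' \mapsto N'$. The reduction relations $\to^+$ and $\to^*$ are the transitive and the reflexive transitive closure of $\to$, respectively. For $\vec{M} = M_1, \dots, M_n$ we write $\vec{M} \to \vec{M}'$ if $M_i \to M_i'$ for some $i \in \{1, \dots, n\}$ and $M_j = M_j'$ for all $i \ne j \in \{1, \dots, n\}$. A term $M$ is normal (or in normal form) if there is no term $N$ such that $M \to N$. Clearly normal closed terms are of the form $C_i\vec{N}$.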
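As a sanity check on conversion rule (5), the following sketch specializes it to the natural numbers, writing $\mapsto$ for conversion and $\to$ for one-step reduction. The constructor $\mathrm{S}$ has no parameter arguments and a single recursive argument with empty $\vec{\sigma}_1$, so the composition $(\mathcal{R}\vec{M}) \circ N_1^R$ is simply $\mathcal{R}\vec{M}N$. The step term $\lambda_{k,p}.\mathrm{S}p$ and the variable $m$ are chosen here purely for illustration; together they compute addition of $m$ to a numeral.

```latex
% Rule (5) for the algebra N (constructors 0 and S):
\begin{align*}
\mathcal{R}_{\mathbf{N}} M_1 M_2\, 0 &\mapsto M_1, \\
\mathcal{R}_{\mathbf{N}} M_1 M_2\, (\mathrm{S}N) &\mapsto M_2\, N\, (\mathcal{R}_{\mathbf{N}} M_1 M_2\, N).
\end{align*}
% Illustration with M_1 := m and M_2 := \lambda_{k,p}. S p (hypothetical choice):
\begin{align*}
\mathcal{R}_{\mathbf{N}}\, m\, (\lambda_{k,p}.\mathrm{S}p)\, (\mathrm{S}(\mathrm{S}0))
 &\to (\lambda_{k,p}.\mathrm{S}p)\,(\mathrm{S}0)\,
      \bigl(\mathcal{R}_{\mathbf{N}}\, m\, (\lambda_{k,p}.\mathrm{S}p)\,(\mathrm{S}0)\bigr) \\
 &\to^{+} \mathrm{S}\bigl(\mathcal{R}_{\mathbf{N}}\, m\, (\lambda_{k,p}.\mathrm{S}p)\,(\mathrm{S}0)\bigr) \\
 &\to^{+} \mathrm{S}\bigl(\mathrm{S}(\mathcal{R}_{\mathbf{N}}\, m\, (\lambda_{k,p}.\mathrm{S}p)\, 0)\bigr) \\
 &\to \mathrm{S}(\mathrm{S}\,m).
\end{align*}
```

The first and last steps are instances of rule (5); the two $\to^+$ steps combine $\beta$-conversions (3) with one further use of (5) inside the term.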
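6.3. Strong computability predicates.

Definition 6.1. The set SN of strongly normalizable terms is inductively defined by

(6) $(\forall N.\, M \to N \Rightarrow N \in \mathrm{SN}) \Rightarrow M \in \mathrm{SN}$

Note that with $M$ clearly every subterm of $M$ is strongly normalizable.

Definition 6.2. We define strong computability predicates $\mathrm{SC}^\rho$ by induction on $\rho$.

Case $\mu_j = (\mu\vec{\alpha}\,\vec{\kappa})_j$. Then $M \in \mathrm{SC}^{\mu_j}$ if

(7) $\forall N.\, M \to N \Rightarrow N \in \mathrm{SC}$, and

(8) $M = C_i\vec{N} \Rightarrow \vec{N}^P \in \mathrm{SC} \wedge \bigwedge_{p=1}^{n^R} \forall_{\vec{K} \in \mathrm{SC}}\, N_p^R\vec{K} \in \mathrm{SC}^{\mu_{j_p}}$.

Case $\rho \Rightarrow \sigma$.
$$M \in \mathrm{SC}^{\rho \Rightarrow \sigma} :\Longleftrightarrow \forall_{N \in \mathrm{SC}^\rho}\, MN \in \mathrm{SC}^\sigma.$$
Notice that the reference to $\vec{N}^P \in \mathrm{SC}$ and $\vec{K} \in \mathrm{SC}$ in (8) is legal, because the types $\vec{\rho}$, $\vec{\sigma}_i$ of $\vec{N}$, $\vec{K}$ must have been generated before $\mu_j$. Note also that by (8) $C_i\vec{N} \in \mathrm{SC}$ implies $\vec{N} \in \mathrm{SC}$.

We now set up a sequence of lemmas leading to a proof that every term is strongly normalizing.

Lemma 6.3. If $M \in \mathrm{SC}^\rho$ and $M \to M'$, then $M' \in \mathrm{SC}$.

Proof. Induction on $\rho$. Case $\mu$. By (7). Case $\rho \Rightarrow \sigma$. Assume $M \in \mathrm{SC}^{\rho \Rightarrow \sigma}$ and $M \to M'$; we must show $M' \in \mathrm{SC}$. So let $N \in \mathrm{SC}^\rho$; we must show $M'N \in \mathrm{SC}^\sigma$. But this follows from $MN \to M'N$ and $MN \in \mathrm{SC}^\sigma$ by induction hypothesis (IH) on $\sigma$.

Lemma 6.4. $\forall_{\vec{M} \in \mathrm{SN}}.\ \vec{M} \in \mathrm{SC} \Rightarrow (x\vec{M})^\mu \in \mathrm{SC}$.

Proof. Induction on $\vec{M} \in \mathrm{SN}$. Assume $\vec{M} \in \mathrm{SN}$ and $\vec{M} \in \mathrm{SC}$; we must show $(x\vec{M})^\mu \in \mathrm{SC}$. So assume $x\vec{M} \to N$; we must show $N \in \mathrm{SC}$. Now by the form of the conversion rules $N$ must be of the form $x\vec{M}'$ with $\vec{M} \to \vec{M}'$. But $\vec{M}' \in \mathrm{SC}$ by Lemma 6.3, hence $x\vec{M}' \in \mathrm{SC}$ by IH for $\vec{M}'$.

Lemma 6.5. (a) $\mathrm{SC}^\rho \subseteq \mathrm{SN}$, (b) $x \in \mathrm{SC}^\rho$.

Proof. By simultaneous induction on $\rho$. Case $\mu_j = (\mu\vec{\alpha}\,\vec{\kappa})_j$. (a). We show $M \in \mathrm{SC}^{\mu_j} \Rightarrow M \in \mathrm{SN}$ by (side) induction on $M \in \mathrm{SC}^{\mu_j}$. So assume $M \in \mathrm{SC}^{\mu_j}$; we must show $M \in \mathrm{SN}$. But for every $N$ with $M \to N$ we have $N \in \mathrm{SC}$ by (7), hence $N \in \mathrm{SN}$ by the side induction hypothesis SIH. (b). $x \in \mathrm{SC}^{\mu_j}$ holds trivially. Case $\rho \Rightarrow \sigma$. (a). Assume $M \in \mathrm{SC}^{\rho \Rightarrow \sigma}$; we must show $M \in \mathrm{SN}$. By IH(b) for $\rho$ we have $x \in \mathrm{SC}^\rho$, hence $Mx \in \mathrm{SC}^\sigma$, hence $Mx \in \mathrm{SN}$ by IH(a) for $\sigma$.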
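But $Mx \in \mathrm{SN}$ clearly implies $M \in \mathrm{SN}$. (b). Let $\vec{M} \in \mathrm{SC}^{\vec{\rho}}$ with $\rho_1 = \rho$; we must show $x\vec{M} \in \mathrm{SC}^\mu$. But this follows from Lemma 6.4, using IH(a) for $\vec{\rho}$.

Corollary 6.6. $\vec{N} \in \mathrm{SC} \Rightarrow C_i\vec{N} \in \mathrm{SC}$, i.e. $C_i \in \mathrm{SC}$.

Proof. First show $\forall_{\vec{N} \in \mathrm{SN}}.\ \vec{N} \in \mathrm{SC} \Rightarrow C_i\vec{N} \in \mathrm{SC}$ by induction on $\vec{N} \in \mathrm{SN}$ as in Lemma 6.4, and then use Lemma 6.5(a).

Lemma 6.7. $\forall_{M, N, \vec{N} \in \mathrm{SN}}.\ M[x := N]\vec{N} \in \mathrm{SC}^\mu \Rightarrow (\lambda_x M)N\vec{N} \in \mathrm{SC}^\mu$.

Proof. By induction on $M, N, \vec{N} \in \mathrm{SN}$. Let $M, N, \vec{N} \in \mathrm{SN}$ and assume $M[x := N]\vec{N} \in \mathrm{SC}$; we must show $(\lambda_x M)N\vec{N} \in \mathrm{SC}$. Assume $(\lambda_x M)N\vec{N} \to K$; we must show $K \in \mathrm{SC}$. Case $K = (\lambda_x M')N'\vec{N}'$ with $M, N, \vec{N} \to M', N', \vec{N}'$. Then $M[x := N]\vec{N} \to^* M'[x := N']\vec{N}'$, hence by (7) from our assumption $M[x := N]\vec{N} \in \mathrm{SC}$ we can infer $M'[x := N']\vec{N}' \in \mathrm{SC}$, therefore $(\lambda_x M')N'\vec{N}' \in \mathrm{SC}$ by IH. Case $K = M[x := N]\vec{N}$. Then $K \in \mathrm{SC}$ by assumption.

Corollary 6.8. $\forall_{M, N, \vec{N} \in \mathrm{SN}}.\ M[x := N]\vec{N} \in \mathrm{SC}^\rho \Rightarrow (\lambda_x M)N\vec{N} \in \mathrm{SC}^\rho$.

Proof. By induction on $\rho$, using Lemma 6.5(a).

Lemma 6.9. $\forall_{N \in \mathrm{SC}^{\mu_j}} \forall_{\vec{M}, \vec{L} \in \mathrm{SN}}.\ \vec{M}, \vec{L} \in \mathrm{SC} \Rightarrow \mathcal{R}_j\vec{M}N\vec{L} \in \mathrm{SC}^\mu$.

Proof. By main induction on $N \in \mathrm{SC}^{\mu_j}$, and side induction on $\vec{M}, \vec{L} \in \mathrm{SN}$. Assume
$$\mathcal{R}_j\vec{M}N\vec{L} \to L.$$
We must show $L \in \mathrm{SC}$.

Case 1. $\mathcal{R}_j\vec{M}'N\vec{L}' \in \mathrm{SC}$ by the SIH.

Case 2. $\mathcal{R}_j\vec{M}N'\vec{L} \in \mathrm{SC}$ by the main induction hypothesis (IH).

Case 3. $N = C_i\vec{N}$ and
$$L = M_i\vec{N}\,\bigl((\mathcal{R}_{j_1}\vec{M}) \circ N_1^R\bigr) \dots \bigl((\mathcal{R}_{j_n}\vec{M}) \circ N_n^R\bigr)\vec{L}.$$
$\vec{M}, \vec{L} \in \mathrm{SC}$ by assumption. $\vec{N} \in \mathrm{SC}$ follows from $N = C_i\vec{N} \in \mathrm{SC}$ by (8). Note that for all recursive arguments $N_p^R$ of $N$ and all strongly computable $\vec{K}$ by (8) we have the IH for $N_p^R\vec{K}$ available. It remains to show $(\mathcal{R}_{j_p}\vec{M}) \circ N_p^R = \lambda_{\vec{x}_p}.\mathcal{R}_{j_p}\vec{M}(N_p^R\vec{x}_p) \in \mathrm{SC}$. So let $\vec{K}, \vec{Q} \in \mathrm{SC}$ be given. We must show $(\lambda_{\vec{x}_p}.\mathcal{R}_{j_p}\vec{M}(N_p^R\vec{x}_p))\vec{K}\vec{Q} \in \mathrm{SC}$. By IH for $N_p^R\vec{K}$ we have $\mathcal{R}_{j_p}\vec{M}(N_p^R\vec{K})\vec{Q} \in \mathrm{SC}$, since by Lemma 6.5(a) $\vec{K}, \vec{Q} \in \mathrm{SN}$. Now Corollary 6.8 yields the claim.

Corollary 6.10. $\mathcal{R}_j \in \mathrm{SC}$.

Definition 6.11. A substitution $\xi$ is strongly computable, if $\xi(x) \in \mathrm{SC}$ for all variables $x$. A term $M$ is strongly computable under substitution, if $M\xi \in \mathrm{SC}$ for all strongly computable substitutions $\xi$.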
Theorem 6.12. Every term is strongly computable under substitution.

Proof. Induction on the term $M$. Case $x$. $x\xi \in \mathrm{SC}$, since $\xi$ is strongly computable. Case $C_i$. By Corollary 6.6. Case $\mathcal{R}_j$. By Corollary 6.10. Case $MN$. By IH $M\xi, N\xi \in \mathrm{SC}$, hence $(MN)\xi = (M\xi)(N\xi) \in \mathrm{SC}$. Case $\lambda_x M$. Let $\xi$ be a strongly computable substitution; we must show $(\lambda_x M)\xi = \lambda_x M\xi_x^x \in \mathrm{SC}$. So let $N \in \mathrm{SC}$; we must show $(\lambda_x M\xi_x^x)N \in \mathrm{SC}$. By IH $M\xi_x^N \in \mathrm{SC}$, hence $(\lambda_x M\xi_x^x)N \in \mathrm{SC}$ by Corollary 6.8.

Corollary 6.13. Every term is strongly normalizable.
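The step from Theorem 6.12 to Corollary 6.13 is left implicit. One way to spell it out, using only Lemma 6.5, is sketched below; the name $\mathrm{id}$ for the identity substitution is introduced here and does not appear in the text.

```latex
\begin{align*}
&\mathrm{id}(x) = x \in \mathrm{SC} \ \text{for all variables } x
   && \text{(Lemma 6.5(b)), so $\mathrm{id}$ is strongly computable;} \\
&M = M\,\mathrm{id} \in \mathrm{SC}
   && \text{(Theorem 6.12 applied to $\mathrm{id}$);} \\
&M \in \mathrm{SN}
   && \text{(Lemma 6.5(a): $\mathrm{SC} \subseteq \mathrm{SN}$).}
\end{align*}
```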
§7. Implementation. This “logic for computable functionals” is the basis for the Minlog proof assistant www.minlog-system.de, under development in Munich. It treats partial functionals as first class citizens: variables range over all partial continuous functionals of a given type. Since these functionals are viewed as sets of tokens, we in fact quantify over sets, so we have a second order theory. However, the existence axioms (here in the form of which terms are allowed in ∀-elimination) are weak in the sense that these terms involve quantifiers over functionals, so our theory remains predicative.

In contrast to [23], formulas and types are kept separate. This makes it possible to avoid dependent types, which simplifies the theory considerably. More importantly, by separating the logic rules from type theory one avoids the well-known difficulty that when propositions are viewed as types and types as domains, then (as every domain is inhabited by its bottom element) every proposition would have a proof.

Types are built from base types (non-flat and possibly infinitary free algebras, with type parameters) by forming function spaces; this suffices for our intended mathematical applications. For more metamathematical subjects one may also add universe formation processes, as in [9]. Decidable predicates are viewed as boolean valued functions (and hence the rewrite mechanism described below applies to them), and inductive definitions are the common way to introduce undecidable predicates. In addition to free type variables also free predicate variables are allowed. They are viewed as placeholders for formulas (or more precisely, comprehension terms, that is formulas with some variables abstracted).
However, in comprehension terms quantification over predicate variables is not allowed, since this would form a glaring impredicativity: we then would define a predicate (by the comprehension term) with reference to the totality of all predicates, to which the one to be defined belongs. A central application domain for the Minlog proof assistant is program extraction from constructive (and also classical [10]) proofs. This is done by means of a realizability interpretation, which requires — when the formula to be realized is given by an inductively defined predicate — a (possibly non-finitary) free algebra as domain of the realizers.
Computable functionals are defined by “computation rules” [11, 8]; these rules are added to the standard conversion rules of typed λ-calculus. To simplify equational reasoning, the system identifies terms with the same normal form. Then it clearly is desirable to use other equations as rewrite rules as well; for instance, we not only want to rewrite $M + 0$ into $M$ (which is an instance of a computation rule), but also $0 + M$ into $M$. To justify this we need to prove $0 + \hat{m} = \hat{m}$, where $\hat{m}$ ranges over all (possibly partial) objects of type $\mathbf{N}$. The standard way to prove such equations is of course induction. However, induction is only valid for total objects (or, for types with parameters, “structure-total” objects; cf. Section 3.1), hence cannot be used for equations involving partial variables. Here the approach developed in the present paper helps: one can prove the equality of the values of the two terms, by showing that both contain the same tokens, and then use reflection to conclude that the terms must be equal. The present paper aims at preparing the ground for such proofs.

REFERENCES
[1] Andreas Abel and Thorsten Altenkirch, A predicative strong normalization proof for a calculus with interleaving inductive types, Types for Proofs and Programs, International Workshop, Types ’99, Lökeberg, Sweden, June 1999, Lecture Notes in Computer Science, vol. 1956, Springer Verlag, Berlin, Heidelberg, New York, 2000, pp. 21–40.
[2] Samson Abramsky, Domain theory in logical form, Annals of Pure and Applied Logic, vol. 51 (1991), no. 1-2, pp. 1–77.
[3] Samson Abramsky and Achim Jung, Domain theory, Handbook of Logic in Computer Science, Vol. 3 (S. Abramsky, D. M. Gabbay, and T. S. E. Maibaum, editors), Oxford Univ. Press, New York, 1994, pp. 1–168.
[4] Roberto M. Amadio and Pierre-Louis Curien, Domains and Lambda-Calculi, Cambridge University Press, Cambridge, 1998.
[5] Henk Barendregt, Mario Coppo, and Mariangiola Dezani-Ciancaglini, A filter lambda model and the completeness of type assignment, The Journal of Symbolic Logic, vol. 48 (1983), no. 4, pp. 931–940.
[6] Holger Benl, Konstruktive Interpretation Induktiver Definitionen, Master’s thesis, Mathematisches Institut der Universität München, 1998.
[7] Ulrich Berger, Total sets and objects in domain theory, Annals of Pure and Applied Logic, vol. 60 (1993), no. 2, pp. 91–117.
[8] Ulrich Berger, Continuous semantics for strong normalization, Proc. CiE 2005, Lecture Notes in Computer Science, vol. 3526, 2005, pp. 23–34.
[9] Ulrich Berger, Stefan Berghofer, Pierre Letouzey, and Helmut Schwichtenberg, Program extraction from normalization proofs, Studia Logica, vol. 82 (2006), no. 1, pp. 25–49.
[10] Ulrich Berger, Wilfried Buchholz, and Helmut Schwichtenberg, Refined program extraction from classical proofs, Annals of Pure and Applied Logic, vol. 114 (2002), no. 1-3, pp. 3–25.
[11] Ulrich Berger, Matthias Eberl, and Helmut Schwichtenberg, Term rewriting for normalization by evaluation, Information and Computation, vol. 183 (2003), no. 1, pp. 19–42.
[12] Frédéric Blanqui, Jean-Pierre Jouannaud, and Mitsuhiro Okada, The calculus of algebraic constructions, Rewriting Techniques and Applications (Trento, 1999), Lecture Notes in Computer Science, vol. 1631, Springer, Berlin, 1999, pp. 301–316.
[13] Thierry Coquand, Giovanni Sambin, Jan Smith, and Silvio Valentini, Inductively generated formal topologies, Annals of Pure and Applied Logic, vol. 124 (2003), no. 1-3, pp. 71–106.
[14] Thierry Coquand and Arnaud Spiwack, Proof of normalisation using domain theory, Slides of a talk, October 2005.
[15] Yuri L. Ershov, Everywhere defined continuous functionals, Algebra i Logika, vol. 11 (1972), no. 6, pp. 656–665.
[16] Yuri L. Ershov, Maximal and everywhere defined functionals, Algebra i Logika, vol. 13 (1974), no. 4, pp. 374–397.
[17] Jean-Yves Girard, Une extension de l’interprétation de Gödel à l’analyse, et son application à l’élimination des coupures dans l’analyse et la théorie des types, Proceedings of the Second Scandinavian Logic Symposium (J. E. Fenstad, editor), North-Holland, Amsterdam, 1971, pp. 63–92.
[18] Kurt Gödel, Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dialectica, vol. 12 (1958), pp. 280–287.
[19] Stephen C. Kleene, Countable functionals, Constructivity in Mathematics (A. Heyting, editor), North-Holland, Amsterdam, 1959, pp. 81–100.
[20] Georg Kreisel, Interpretation of analysis by means of constructive functionals of finite types, Constructivity in Mathematics, North-Holland, Amsterdam, 1959, pp. 101–128.
[21] Lill Kristiansen and Dag Normann, Total objects in inductively defined types, Archive for Mathematical Logic, vol. 36 (1997), no. 6, pp. 405–436.
[22] Kim G. Larsen and Glynn Winskel, Using information systems to solve recursive domain equations, Information and Computation, vol. 91 (1991), no. 2, pp. 232–258.
[23] Per Martin-Löf, Intuitionistic Type Theory, Bibliopolis, 1984.
[24] Per Martin-Löf, The domain interpretation of type theory, Talk at the workshop on semantics of programming languages, Chalmers University, Göteborg, August 1983.
[25] Gordon D. Plotkin, LCF considered as a programming language, Theoretical Computer Science, vol. 5 (1977/78), no. 3, pp. 223–255.
[26] Gordon D. Plotkin, T^ω as a universal domain, Journal of Computer and System Sciences, vol. 17 (1978), no. 2, pp. 209–236.
[27] Helmut Schwichtenberg, Density and choice for total continuous functionals, Kreiseliana. About and Around Georg Kreisel (P. Odifreddi, editor), A.K. Peters, Wellesley, Massachusetts, 1996, pp. 335–362.
[28] Dana Scott, Outline of a Mathematical Theory of Computation, Technical Monograph PRG–2, Oxford University Computing Laboratory, 1970.
[29] Dana Scott, Domains for denotational semantics, Automata, Languages and Programming (Aarhus, 1982) (E. Nielsen and E. M. Schmidt, editors), Lecture Notes in Computer Science, vol. 140, Springer, Berlin, Heidelberg, New York, 1982, A corrected and expanded version of a paper prepared for ICALP’82, Aarhus, Denmark, pp. 577–613.
[30] Dana Scott, A type-theoretical alternative to ISWIM, CUCH, OWHY, Theoretical Computer Science, vol. 121 (1993), no. 1-2, pp. 411–440.
[31] Viggo Stoltenberg-Hansen, Edward Griffor, and Ingrid Lindström, Mathematical Theory of Domains, Cambridge Tracts in Theoretical Computer Science, vol. 22, Cambridge University Press, Cambridge, 1994.
[32] William W. Tait, Normal form theorem for bar recursive functions of finite type, Proceedings of the Second Scandinavian Logic Symposium, North-Holland, Amsterdam, 1971, pp. 353–367.
[33] Anne S. Troelstra (editor), Metamathematical Investigation of Intuitionistic Arithmetic and Analysis, Lecture Notes in Mathematics, vol. 344, Springer-Verlag, Berlin, Heidelberg, New York, 1973.
[34] Jeffrey Zucker, Iterated inductive definitions, trees, and ordinals, Metamathematical Investigation of Intuitionistic Arithmetic and Analysis (A. S. Troelstra, editor), Lecture Notes in Mathematics, vol. 344, Springer, Berlin, Heidelberg, New York, 1973, pp. 392–453.

MATHEMATISCHES INSTITUT DER
LUDWIG-MAXIMILIANS-UNIVERSITÄT MÜNCHEN
THERESIENSTR. 39
D-80333 MÜNCHEN, GERMANY
A TRANSACTIONAL APPROACH TO THE LOGIC OF TRUTH
MICHAEL SHEARD
Abstract. We survey and evaluate recent discussions about axiomatic theories of truth, with special attention to deflationary approaches. Then we propose a new account of the use of truth theories, called a transactional analysis. In this analysis, information is communicated between intelligent agents, which are modeled as individual axiomatic theories. We note the need in the course of communication to distinguish whether or not new information is considered trustworthy.
Logic Colloquium ’05
Edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel
Lecture Notes in Logic, 28
© 2006, Association for Symbolic Logic

To say that what is is not, or that what is not is, is false; but to say that what is is, and what is not is not, is true; and therefore also he who says that a thing is or is not will say either what is true or what is false.
– Aristotle, Metaphysics, 1011b¹

¹ Aristotle [1], pp. 200-201.

This paper consists of three parts. First is a brief introduction; probably most of it will be very familiar material. Then I will describe and discuss some recent work on axiomatic theories of truth. Finally, I will suggest an alternative way of thinking about axiomatic theories of truth, which I call a transactional approach. The famous quotation from Aristotle (shown above, and chosen in honor of the conference at which this paper was presented) is not really the starting point, but includes one little feature which deserves attention for later reference: the use of the word “say”.

§1. Introduction. My starting point is a familiar problem. Let L be a first-order language, with sufficient resources to define combinatorial notions needed for representation of the syntax of formal languages, including L itself. Phrasing it this way is a bit inaccurate, since I have not yet specified what interpretation is to be provided for L, but in all cases we will have an additional framework²: either a semantic interpretation for L in which those resources really do represent the appropriate syntactic elements, or else a core of basic axioms sufficient to prove all necessary facts about the syntactic elements. In particular, we will assume that there is a canonical mechanism for creating or coding names for all sentences of L and of simple extensions of L; if φ is a sentence, we will indicate its canonical name with quotations as ‘φ’. One convenient and frequent choice is to take L to be L(PA), the language of Peano Arithmetic, with the mechanism of Gödel numbers available for the formalization of syntax; then ‘φ’ is the numeral for the Gödel number of φ.

We form the language L_T by adding a new unary predicate T to L, a “truth predicate”, with the intended meaning of T(x) being “x is [the name, code, Gödel number, etc. of] a true sentence in L_T”. Note that the truth predicate applies to sentences of the expanded language L_T, and not just L. Naïvely, we would like to add to any theory in L_T the so-called Tarski biconditionals or T-sentences: all sentences of the form T(‘φ’) ↔ φ, where φ is a sentence of L_T. Unfortunately, as is well known from Tarski’s theorem on the undefinability of truth, this yields an inconsistent theory, since our assumptions about L allow a formalized version of the Paradox of the Liar. By the usual fixed-point construction one can construct a sentence λ, the Liar Sentence, such that some basic combinatorics or arithmetic formulated in L_T proves λ ↔ ¬T(‘λ’). Coupled with the T-sentence λ ↔ T(‘λ’) this yields a contradiction.

So what are we to do? One possibility, consistent with Tarski’s insistence on an object-language/metalanguage distinction, is to allow T to apply only to sentences of L, and so to assume instances of the T-sentences only for sentences of L. Such an approach yields more interesting questions, both technical and philosophical, than one might expect at first.
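The inconsistency produced by the unrestricted T-sentences can be laid out step by step. The following is a sketch of the standard argument, writing $\lambda$ for the Liar Sentence, corner quotes $\ulcorner\cdot\urcorner$ for the canonical names ‘…’, and $S$ for any theory in the language with truth predicate that proves the fixed-point equivalence and contains the T-sentence for $\lambda$; the name $S$ is introduced here only for illustration.

```latex
% \ulcorner and \urcorner require the amssymb package.
\begin{align*}
&(1)\quad S \vdash \lambda \leftrightarrow \neg T(\ulcorner\lambda\urcorner)
   && \text{fixed-point construction} \\
&(2)\quad S \vdash T(\ulcorner\lambda\urcorner) \leftrightarrow \lambda
   && \text{T-sentence for } \lambda \\
&(3)\quad S \vdash T(\ulcorner\lambda\urcorner) \leftrightarrow \neg T(\ulcorner\lambda\urcorner)
   && \text{from (1) and (2)} \\
&(4)\quad S \vdash \bot
   && \text{from (3) by classical propositional logic}
\end{align*}
```

Only a single instance of the T-scheme is used, which is why restricting the scheme, rather than dropping it, is a natural response.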
However, this is not the road I want to go down here. Instead, I want to look at “type-free” or “self-referential” systems, where the truth predicate is allowed to apply to at least some sentences of L_T involving T itself. One obvious attempt is to assume only “unproblematic” instances of the T-sentences, and to try to assume as many of them as possible. Unfortunately, this formulation of the problem is too spare to allow any meaningful progress. McGee [24] has demonstrated that if a maximal consistent set of T-sentences is the only goal, then the range of possibilities is so wide that any set of sentences whatsoever may result as the extension of the predicate T. Even worse, one may consistently assume T-sentences that prove false statements of arithmetic. In other words, many perfectly consistent choices of a set of T-sentences lead to theories that do not resemble a theory of truth at all.

In any case, to assume a particular set of T-sentences for no reason other than because one can do so consistently seems ad hoc and unmotivated. A far better goal is to establish criteria and justifications for assumptions that will allow us to develop a robust and well-motivated logical theory.

² I like the use of the word “framework”, to mean either a semantic structure or a set of axioms to provide a (partial) definition of elements of a formal language, which is found in Feferman [8].

At this point it becomes prudent to raise a philosophical question: Exactly what problem are we trying to solve? Are we trying to uncover the “real” theory of truth? Are we approximating a theory of truth? Are we refining a theory of truth? Are we diagnosing and correcting a faulty theory of truth? All of these possible characterizations seem to find some support in the literature.

I am struck by a claim of Burgess [4] that most philosophers of this subject believe, or act as if they believe, that there is a consistent and useful naïve theory of truth under which rational people operate, which philosophy has so far failed to uncover satisfactorily. For such philosophers, the problem to be solved is to refine our formalization of a truth theory, either in the terms that I have defined above or in some other logical setting, to create a consistent formal system which coheres with our naïve understanding.³ Burgess, however, maintains that the naïve theory of truth is really inconsistent, and in ordinary discourse people avoid confronting the inconsistency by applying different tenets of the theory in different places, and by just not pushing any line of reasoning under the theory too far.⁴

I will lay one of my philosophical cards on the table early. Without commenting on the empirical question of what most philosophers believe, I concur with the second half of Burgess’s claim: I believe that the naïve theory of truth as it is envisioned or used in practice is actually inconsistent. Indeed, I consider this to be almost self-evident after reflection on the Liar and similar paradoxes. It is relatively easy to get most people to assent to assertions about truth which, if laid side-by-side, lead to a contradiction.
It is also likely that people actually use the principles expressed by these assertions, whether consciously or unconsciously, in situations in which they have to reason about truth. So in practice what is applied is merely a convenient and accessible portion of an inconsistent theory.

If we accept this thesis, that a naïve understanding of truth is formally inconsistent, then the goal of the logician is to capture some useful and consistent fragment of the naïve theory.⁵ Of course, any theory in the language is a subtheory of the inconsistent theory, so what I mean is that we need to try to create a framework that validates some of our fundamental intuitions about truth. This provides an open-ended problem, since no consistent theory will embody all of our intuitions, and any fragment will have to be judged according to less-than-obvious criteria. Indeed, we may find that the best plan is to create a suite of consistent formalizations, each identified for use in an appropriate context.

³ There are obviously many subtle shadings here. Probably most philosophers of the viewpoint Burgess is describing would not describe it that way themselves, and would instead place themselves at various points along a spectrum. It is also likely that different philosophers mean different things by “the naïve theory of truth.”
⁴ In part Burgess here follows Tarski, who famously argued that our informal notion of truth in natural language is inconsistent.
⁵ An alternative approach is provided by dialetheism, whose advocates argue that some sentences, such as the Liar, can be both true and false simultaneously. A system of paraconsistent logic is employed to avoid the classical consequence that any other sentence whatsoever must follow. This is another interesting avenue which will not be pursued here.

Historically the majority of work on this problem has been on the semantic⁶ side, here taking “semantic” in a very broad sense to refer to the assignment of truth values from among ‘t’, ‘f’, and (necessarily) one or more others to all sentences in L_T in a fashion that “makes sense”. (If a semantic scheme refuses to assign a truth value to a particular sentence, as is the case in several well-known examples, or maintains that an apparently syntactically correct sentence actually does not say anything, then we can re-interpret it as assigning some special designated truth value, such as ‘u’ for “undetermined”.) The best known semantic framework is the family of fixed-point models introduced by Kripke [20] and Martin and Woodruff [22], in which “truth gaps” are allowed; that is, many sentences, including the Liar, receive an indeterminate truth value. Other well-known approaches include the revision theory due to Herzberger, Gupta, and Belnap (e.g., [18, 14]) and the Russellian and Austinian models of Barwise and Etchemendy [2].

Semantic approaches are not my main topic here, but I do want to note one point for a later comparison. All semantic approaches are subject to, or at least must deal with, some version of the so-called Revenge Paradox. In its simplest form, it goes like this: We can prove λ ↔ ¬T(‘λ’), and at a minimum our intuition about truth predicates says that if a sentence φ is true then T(‘φ’) should also be true. Thus, we cannot assign the classical truth value ‘t’ to λ.
Whatever else we may do to make our semantics useful and consistent, an obviously correct statement about those semantics is that λ is not true. But on any reasonable reading, this is what λ itself asserts. So, we cannot simultaneously (i) demand that T(‘φ’) be true if φ is true, (ii) give a classical reading of negation and the biconditional on classical truth values, and (iii) interpret T(‘φ’) as meaning that φ receives a truth value of ‘t’. Something has to give. There are attempts at “revenge-proof” semantics⁷, and these are both interesting and laudable, but necessarily each gives up at least one of these principles. I for one find giving up any of them to be a serious loss, so I consider the Revenge Paradox to be an unavoidable part of any semantic approach.

⁶ In this paper I will make a conscious distinction between “semantical”, referring to general philosophical issues involving meaning, reference, truth, etc., and “semantic”, referring to the process in formal logic of assigning interpretations to logical elements. Thus a truth predicate is by definition a semantical object, which may or may not be given a semantic analysis. No consistent distinction of terminology is observed in the literature, where the words appear to be used interchangeably for either concept.
⁷ Hartry Field has been a prominent recent advocate of revenge-proof approaches; see for example [10].

§2. An overview of recent work on some axiomatic theories. In contrast to semantic approaches, axiomatic approaches renounce the search for an assignment of truth values to all sentences, and begin with basic axioms, and possibly auxiliary rules of inference, about the truth predicate. The systems considered here are in classical logic, although several systems have been formulated in various non-classical logics. (See, for example, Feferman [8], Cantini [7], and Halbach and Horsten [17].)

Within classical first-order logic, three main axiomatic systems have been explored in the literature. Each of these takes Peano Arithmetic (PA) as the underlying non-semantical theory and overlays a truth mechanism on top of it, although certainly the same truth mechanism could be added to other theories, such as Zermelo-Fraenkel set theory. I will take a brief look at each one in turn, and indicate some of the most recent discussion about them.

The first and weakest of the three theories is known as FS. It begins with a basic and straightforward set of axioms, of which the most central assert that the truth predicate commutes in the more-or-less obvious way with the logical connectives and quantifiers. In particular, it includes the axiom ∀x(Sent(x) → (T(¬x) ↔ ¬T(x)))⁸, which implies two important principles of the system:

¬(T(‘φ’) ∧ T(‘¬φ’))    (T-consistency)
T(‘φ’) ∨ T(‘¬φ’)    (T-completeness)
Each of these can be taken to be quantified over all instances of φ, rather than just presenting a scheme. The axioms are then augmented with two auxiliary rules of inference:
From φ, deduce T(‘φ’) (semantic ascent, or by analogy to modal logic, necessitation).
From T(‘φ’), deduce φ (semantic descent, or co-necessitation).
(Like many axioms and rules in this subject, these are typically formulated with parameters, suppressed here for ease of reading.) Note that the rules of inference represent a weaker principle than unrestricted assumption of the T-sentences; that is, in general we do not have T(‘φ’) ↔ φ for all instances of φ in LT. By an old result of Friedman and myself [11], FS is consistent; by a result of Halbach [15], it is proof-theoretically equivalent to ramified analysis through all finite levels, RA<ω. There has also been some work on subsystems and other variants of FS, which we can call FS-like, the defining feature being the presence of one or both of the auxiliary rules of inference.
8 The first “¬” is actually an operator on Gödel numbers rather than a logical connective.
A TRANSACTIONAL APPROACH TO THE LOGIC OF TRUTH
By construction, obviously, FS has the property that it proves φ if and only if it proves T(‘φ’). In this sense FS makes φ and T(‘φ’) equivalent, in that they have the same status: provable or not provable. In a recent paper Halbach and Horsten [16] have argued that this feature, among others, makes FS especially appropriate for what is known as a deflationary theory of truth. I will say more about deflationary theories later, but one tenet of the deflationary viewpoint is that truth operates on the surface, at the logical level, without recourse to deep definitions. The two auxiliary rules of FS are about as close to the surface as you can get.
Two recent papers have criticized FS-like theories (although in neither case was that the main point of the paper). One strain of criticism can be disposed of fairly quickly. Christopher Gauker [12] finds something objectionable in the fact that the rules of semantic ascent and descent cannot be used within conditional subproofs in a natural deduction system. But of course if we allowed use of the rules within conditional subproofs, we could immediately prove all of the T-sentences, and thus recover the inconsistent theory. So, to adopt these rules with restriction is to accept the choice between doing something and doing nothing. More generally, if we were unwilling to accept auxiliary rules of inference that can be applied only if there are no open assumptions, then we would have to throw out much of modal logic and almost all of provability logic.9
Michael Glanzberg [13] sees a Revenge phenomenon with respect to FS and FS-like theories.10 The argument goes like this: FS does not prove the Liar sentence λ. Thus λ is not true on the account provided by FS, so an accurate statement about truth-under-FS is given by ¬T(‘λ’). But FS does not prove ¬T(‘λ’) (since this is equivalent to λ). So FS fails to describe its own conception of truth adequately. This criticism would appear to be off target on several levels.
Since truth is a complicated concept recursion-theoretically (even in the most basic arithmetical case, the set of true sentences in L(PA) is Δ¹₁), we cannot expect a complete axiomatization. So, if we start with the assumption that any axiomatization is at best an approximation of a theory of truth, rather than an implicit definition of truth, then an axiomatization is only a way of getting at some statements that we can know to be true, not an absolute determination of what is and is not true. To expect that a theory that fails to prove some sentence φ should therefore prove ¬T(‘φ’) is to ask too much. (On the other hand, it is perfectly reasonable to ask that a theory that proves ¬φ should also prove ¬T(‘φ’), which FS does.) Such a demand misunderstands the commitment
9 In fairness, the underlying intuition behind the restriction is not so clear-cut for truth as for necessity or provability operators.
10 Technically, Glanzberg criticizes a subtheory of FS, which he calls TP, but the difference is irrelevant for the argument.
of an axiomatic theory, and treats an axiomatic theory as though it were a semantic analysis. Failure of a theory to prove λ does not commit one to an observation that λ is not true; there is no claim that truth lies solely in provability within the system. Glanzberg’s argument misses one important aspect of the very partialness that he seeks to analyze.11
Glanzberg’s main goal is to defend a hierarchical view of truth as both inevitable and salutary. If he wants to argue that FS does not completely capture our commitments in the use of the truth concept, and wishes to find additional principles to express in the language, he will get no argument out of me. But notice that if this is a defect, then it is a defect of any essentially incomplete axiomatic theory, whether that theory has anything to do with truth predicates or not. To take the most obvious example, certainly one of our commitments in using any theory is that the theory is consistent, a statement which is not provable in the theory itself. In the current setting having a truth predicate allows us to state especially crisp forms of our commitments and their consequences, but it does not follow thereby that all such consequences should be provable.
There are, however, some other, older criticisms of FS that rightly give one pause. First of all, by a result of McGee [23], full FS is ω-inconsistent. A corollary of the proof is that FS is inconsistent with some instances of its own Uniform Reflection Principle, ∀x(ProvFS(‘φ(x)’) → φ(x)). Some philosophers consider these results to be a very serious flaw in FS.12 However, it is important to note that this ω-inconsistency involves only sentences containing the truth predicate: FS proves no sentences of L(PA) that are false in N. I have argued elsewhere [29], as have Halbach and Horsten [16] on somewhat different grounds, that for this reason the alleged flaw may not be as bad as it seems at first, if viewed in the appropriate light.
A second axiomatization is KF, which was created by Feferman with an eye toward describing the basic Kripke fixed-point model mentioned earlier. It is most conveniently formulated with a falsity predicate F in addition to the truth predicate T. Most of the truth axioms of KF are compositional. Some typical examples:
T(‘¬φ’) ↔ F(‘φ’)
F(‘¬φ’) ↔ T(‘φ’)
T(‘φ ∧ ψ’) ↔ T(‘φ’) ∧ T(‘ψ’)
F(‘φ ∧ ψ’) ↔ F(‘φ’) ∨ F(‘ψ’)
(and similarly)
11 Obviously my response applies generally: axiomatic theories should be considered immune from the Revenge Paradox.
12 For example, Jeffrey Ketland (private communication).
T(‘∀xφ(x)’) ↔ ∀xT(‘φ(x)’) (and similarly)
T(‘T(‘φ’)’) ↔ T(‘φ’) (and similarly)
T(‘φ’) ↔ φ when φ is arithmetical atomic
F(‘φ’) ↔ ¬φ when φ is arithmetical atomic
(All of these examples except the last two are actually quantified over variable sentence names φ and ψ.) In addition, KF includes the T-Consistency axiom, ¬(T(‘φ’) ∧ T(‘¬φ’)), mentioned above.13 An important feature which is rarely remarked is that inclusion of the T-Consistency axiom permits proof, by induction on the build-up of formulas, of T(‘φ’) → φ for all sentences φ of LT.
KF has been studied extensively by several authors. Results of Feferman [9] and Cantini [5] show that proof-theoretically KF is equivalent to RA<ε0, the system of ramified analysis up to level ε0. Generally KF is regarded as an excellent axiomatization of reasoning about truth gaps; certainly it inherits many of the nice features of the fixed-point models that inspired it. Historically the main philosophical criticism of KF has centered on the incongruity of the “inner logic” and the “outer logic”. That is, if we look at the set of all φ such that KF proves T(‘φ’) (the inner logic), we see a set very different from the consequences of KF itself (the outer logic). Most notably, the inner logic is partial; KF does not prove T(‘λ ∨ ¬λ’), even though the sentence inside the predicate is a validity of propositional logic. Reinhardt [26], in a detailed analysis of the philosophical significance of KF, argued that the inner logic should be regarded as the real theory of truth which the system embodies, with the outer logic as a formalist superstructure for supporting the inner logic. In a similar spirit, Halbach and Horsten [17] have recently proposed restricting the outer logic to a formal system of inference that makes it cohere with the inner logic.
The third and strongest axiomatic system is known as VF, first defined in its full form by Cantini [6]. In the same way that KF was intended to describe the structure of the basic fixed-point models, VF was motivated by the supervaluation fixed-point models, also defined by Kripke [20], building on earlier work of van Fraassen.
The central axiom scheme of VF is T(‘φ’) → φ, which is to say one direction of the T-sentences. In addition, and what distinguishes it most markedly from KF, the system VF includes a partial reflection axiom, ∀x(ProvPA(T)(x) → T(x)), asserting that any sentence provable in Peano Arithmetic formulated in LT is true. A few other axioms on iteration of the
13 Some presentations omit the T-Consistency axiom. Indeed, there is considerable disparity in the early literature about which of several variants is labeled “KF”, but consensus seems to have settled on the version described here.
truth predicate round out the axiomatization. Like KF, the system VF does not have the property that if it proves φ then it proves T(‘φ’); such closure under semantic ascent would immediately yield the famous Montague/Kaplan version of the Liar Paradox.14 In terms of proof-theoretic strength, Cantini [6] showed that VF has the same arithmetical consequences as ID1, the theory of arithmetical inductive definitions. There does not appear to have been much recent work on VF, which may be unfortunate in light of some of the discussion in the next section.
§3. A Transactional approach. Having taken an overview of several axiomatic systems, I would like to step back and ask a more fundamental question: What is it that these axiomatic systems are intended to do? I believe that the answer is fairly clear: each is an attempt to axiomatize some understanding of the set of true sentences. In the cases of KF and VF this is the explicit motivation: they axiomatize the basic Kripke fixed-point models and the supervaluation fixed-point models, respectively. As already noted, there is no ω-model of FS,15 but it is easy to construe the theory as being motivated by the intuition that a sentence T(‘φ’) has the same logical status as φ itself. In other words, these theories intend to capture how we might reason about truth, rather than how we reason with truth.
As an alternative, I want to propose that we focus on how a truth predicate is used in the ordinary process of communication of information. When you look anywhere in the philosophical literature on truth, one thing that stands out is how often the questions are framed in terms of a person saying that some statement, or collection of statements, is true. That is, the motivating examples are typically not phrased in some abstract formulation (what does it mean for a sentence to be true?) but rather in terms of communication: what information is conveyed when someone says that something is true?
An analysis based on communication picks up some themes of the deflationary approach to truth. Deflationism about truth is a family of philosophical theories characterized by their insistence that truth is a concept that works at the surface level and that is not in need of definition in terms of other philosophical concepts, such as reference, meaning, correspondence, and the like. A radical deflationist (and I’m not prepared to swear that anyone actually holds so absolute a view) might maintain that truth is not even a meaningful concept at all, but rather a name created for a linguistic mechanism to increase the ease and flexibility of communication.16 From such a point
14 Let λ be the usual Liar sentence. Then we can prove λ ↔ ¬T(‘λ’), which together with the axiom T(‘λ’) → λ proves λ. Closure under semantic ascent would give T(‘λ’), but since λ (which has already been proved) is equivalent to ¬T(‘λ’), this is a contradiction. See [25].
15 There is however a natural semantic interpretation for FS: theorems of FS are eventually true in the finite levels of revision semantics. See Gupta and Belnap [14].
16 Certainly there are deflationists who insist that there is no such property as truth.
of view, a semantic analysis of truth is a pointless or even irrational exercise, because there is no underlying concept to explicate. Instead, what is needed is a clarification of the linguistic transformations that take place at or near the syntactic level. While most deflationists might not endorse so strong a rejection of semantic analysis, there is clearly an argument to be made that an axiomatic analysis is more in keeping with the deflationist enterprise, and some philosophers have maintained exactly that.17 However, I would push the point one step further. I want to suggest that a deflationary outlook favors not just an axiomatic description of truth, but one that focuses specifically on the role of truth in communication.
In that spirit, I want to propose a transactional approach to the logic of truth. It is characterized by replacing any attempt to axiomatize what truth “is” by an analysis of the principles which people employ when they use truth talk as a means of communication. When we use a truth predicate in our communication, we employ some sort of principles for encoding and decoding our messages. I want to urge that we focus on uncovering and formalizing those principles as far as consistently possible.
To set this up, I will borrow some terminology from the field of artificial intelligence and describe the situation in terms of a community of intelligent agents.18 Each agent has a certain amount of information about the world, which again borrowing computer science terminology we might refer to as a database, and which we will model as an incomplete but consistent theory in some shared formal language which includes a truth predicate. Perhaps an even better term might be world-view, since we want to consider the possibility that each agent’s theory, while internally consistent, might be inconsistent with other agents’ theories.
(We can remain agnostic for now as to whether one is right and the other wrong, or whether this is a model of relativism.) When communication takes place from X to Y, a message (a sentence of the formal language) is transmitted from X to Y. One can debate whether an agent’s database should be an actual theory closed under logical consequence: human beings do not actually draw all inferences from the set of premises which they know or believe, any one human being’s actual set of thoughts presumably being finite. However, as an exercise in modeling with first-order logic, it seems best to treat a database as the set of facts which could be deduced from the available information. It seems a much more complicated task, beyond the scope of this analysis, to try to determine which possible inferences will or will not in fact be drawn. Alternatively, if one wished, the database could be taken to be a consistent set of sentences not necessarily closed under logical consequence, like a set
17 See for example Halbach and Horsten [16].
18 See, for example, the unifying structure in Russell and Norvig [27].
of axioms, but since any processing of additional information will involve drawing inferences from the given information, the distinction makes little difference to the logical analysis.
A first, simple pass at what takes place when a sentence is transmitted from X to Y is that the sentence is taken from X’s theory and added to Y’s theory, and then Y’s theory is closed under deducibility under the rules of logic of an appropriate system. If the additional sentence from X leads to Y’s theory becoming inconsistent (possible since we assume that agents’ world-views can be inconsistent with each other), then Y simply rejects the new information from X and returns to the status quo ante. (Alternative models, involving revision of previous views with some form of non-monotonic reasoning, suggest an interesting avenue for exploration.) While this is an obvious model of communication, it is too simple for some important kinds of situations, one of which will be suggested below in some detail. Possibly something more complicated needs to take place: perhaps some pre-processing of a message when it is received, for example.
So the question becomes, what features of a theory of truth do we need in our logical system in order to allow our agents to encode and decode messages involving a truth predicate in a competent manner? It is entirely possible, indeed likely, that there is no single best answer to this question. To head in the right direction, however, let me lay out some of the properties which should characterize a transactional theory:
It is non-semantic. By this I mean that any axiomatization involved is not intended to describe any real or imagined model or semantic interpretation for the language. I have suggested most of the arguments for this principle already, but let me add just one more piece specific to the notion of truth as a tool of communication.
Usually we do not talk about truth itself as the subject of the conversation, even in situations in which we apply a type-free truth predicate to sentences involving the truth predicate itself. A familiar example comes from Kripke’s paper [20], in which we consider trying to untangle conflicting testimony in the Watergate hearings. The underlying subject of that sample dialogue is a sequence of historical events, not the nature of truth. Similarly, in a transactional analysis, the information ultimately conveyed is about the state of affairs of the world, not semantics. Now obviously there are specific situations in which people do talk about truth as a subject of discussion — I have presented this paper in public — but if that needs to be given up on the first pass at a transactional theory, it seems a small enough price to pay.19
19 Indeed, it may be that further analysis would reveal that the best choice to represent discussions among semanticists would treat ordinary uses of truth to facilitate communication and truth as a subject of study in two very different ways. Be that as it may, the moral is that we need
It is non-reflective. This point continues the theme of the preceding one. While this is hardly a new idea, it seems worth a reminder: when we use a formal system, not every assumption or commitment which underlies the system needs to be, or ought to be, expressly validated in that formal system, even if the language makes it possible. Non-truth-theoretic examples are familiar, and I have already suggested one: if we choose to use Peano Arithmetic as our system of arithmetic, then one underlying assumption is that PA is consistent. Of course PA does not prove this fact, and if we add it to our system, then we adopt a different, stronger system and are implicitly committed to its consistency. This dog will never catch its tail, leading to what has been called the inexhaustibility principle. As good a place as any to stop is never to start, and typically we work with PA as is, leaving the implicit assumption of consistency unstated. Similarly, I have already argued that it is not a failing of a formal system of truth if the system does not have among its consequences an expression of some particular interesting fact about itself. From a transactional point of view, if we share an implicit assumption in our use of truth (even one that we rely upon extensively) we do not necessarily need to state it explicitly. If we recognize that we share it, why would we need to convey it like a separate piece of new information?
We may need to distinguish trustworthy and untrustworthy sources of communication. This point emerges from some of the more specific analysis which will be described in more detail below. But in short, it becomes apparent that a critical distinction in a transactional approach needs to be made about the status of information when it is received, in order to determine how it should be processed. If the receiver considers the source of a message to be trustworthy, then the content of the message can be added to the receiver’s database directly.
If, however, the receiver considers the source to be untrustworthy, then the information will need to be evaluated before it is accepted. If trustworthy source A says φ, then φ (or some processed version of φ) is added to the database directly; if untrustworthy source B says φ, then only “B says ‘φ’ ” is added to the database. The consequences of these varying assumptions may be quite different, and we should even consider the possibility that they could require two different systems of inference. Since the distinction here concerns how received information is processed, the determination of whether a source is trustworthy or untrustworthy is an a priori decision. But interestingly, when I presented an early version of these ideas at a university philosophy colloquium, the question of how one
to avoid the trap of trying to provide a definition of truth — either explicit or implicit — instead of representing the features of its use in context.
distinguishes trustworthy from untrustworthy sources was a major topic of interest in the discussion that followed. Can a formerly trustworthy source be discovered to be untrustworthy? If so, what adjustments need to be made? What happens if two sources of information, both considered trustworthy, are found to conflict? It even branched into a discussion of modeling psychological development in childhood, in which our earliest teachers (parents or others) are taken as trustworthy sources of information, and the information they convey is accepted without question. Later sources come to be considered untrustworthy, and information which they convey is rejected if it conflicts with information already provided by trustworthy sources. This suggests an interesting model of the process of human learning and value formation.
We may need to distinguish actively between direct attributions of truth and truth used for generalizations. A central theme of many truth theorists (one that has been stressed emphatically by Halbach, for example) is that a central purpose (indeed, possibly the central purpose) for having a truth predicate in the language at all is to allow the expression of generalizations. The basic pattern has the form ∀x(R(x) → T(x)), where R(x) is a predicate which distinguishes some set of sentences. In the interesting case the set of sentences in question is infinite and not equivalent to any single sentence of the language, so that the generalization can be interpreted as expressing a kind of infinite conjunction of the sentences.20 In contrast, a more mundane use of the truth predicate is a direct attribution of truth to a single particular sentence.
It may be that the way we process messages involving the truth predicate varies depending on whether the message is more like “It is true that Jones visited Athens” or more like “Everything Jones says about logic is true.” The first example feels like an exercise in the redundancy theory of truth (strip off the truth predicate and be done with it), whereas successful analysis of the second may depend on the specifics of the logical system we employ. The possibility of formalizing this difference in a transactional account suggests yet another way in which a truth predicate is different from other predicates. Ordinarily, when we compare two applications of a predicate in first-order logic, one in an unquantified context
20 Some philosophers (but not all) might make a distinction between the notion of generalization and the related notion of blind attribution of truth. I can make a useful generalization by saying that every axiom of Peano Arithmetic is true, and it is clear to which sentences I wish to attribute truth. On the other hand, if I make a genuinely blind attribution by asserting that everything Kripke has ever said about philosophy is true, neither the speaker nor audience can know with certainty the exact list of sentences to which truth is attributed. The distinction is probably not as straightforward as these examples would suggest: if I say that all theorems of Peano Arithmetic are true, it certainly seems clear enough what I mean to assert, but in fact the set of theorems of PA is not decidable.
and one quantified (for example, P(c) where c is a constant vs. ∀xP(x)), we recognize the logical differences in the full formulas, but do not imagine that there is any difference in the uses of the predicate P. In a transactional account, it may be that the truth predicate T needs to be treated differently in T(‘φ’) from the way it is treated in ∀x(R(x) → T(x)).
It is not a problem if the theory is not conservative over the underlying non-truth-theoretic theory. One of the most prominent objections to deflationary analysis hinges on the problem of conservativeness. Several authors, notably Ketland [19] and Shapiro [28], have argued that deflationary axiomatizations fall victim to a two-pronged dilemma. If the assumed truth-theoretic axioms are too weak, then the theory cannot prove the generalizations which theorists of truth would want. If, on the other hand, the truth-theoretic theory is sufficiently strong to prove the desired generalizations, then it is no longer conservative over the underlying non-semantical theory. In the latter case, though, the truth apparatus does indeed add something new to what we have warrant to assert about the underlying situation, which is to say that it no longer meets the deflationist’s requirement that truth not be a substantive property. Notice that on a transactional account, this version of the objection evaporates. Axioms and rules of inference involving the truth predicate which allow us to derive new non-semantical facts from a message are perfectly acceptable, as long as those non-semantical facts represent an unpacking of information which the sender (is assumed to have) intended to convey. There may be assumptions about the world built into the semantical apparatus, but if these underlying assumptions are shared by both sender and receiver, then no new information is really being added.
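The first-pass model of transmission and the distinction between trustworthy and untrustworthy sources can be made concrete with a small Python sketch. It is only an illustration under strong simplifying assumptions that are not part of the account above: databases are finite sets of propositional sentences rather than theories in a language with a truth predicate, consistency is checked by brute-force truth tables instead of closure under deduction, and the names `consistent` and `receive` are hypothetical. Rejecting a trusted message that would make the database inconsistent follows the "status quo ante" rule of the simple model.

```python
from itertools import product

# Sentences are nested tuples (an illustrative encoding, not from the paper):
#   ("atom", name), ("not", s), ("and", s, t), ("or", s, t)

def atoms(s):
    """Collect the atom names occurring in sentence s."""
    if s[0] == "atom":
        return {s[1]}
    return set().union(*(atoms(t) for t in s[1:]))

def holds(s, v):
    """Evaluate sentence s under the valuation v (dict: atom name -> bool)."""
    op = s[0]
    if op == "atom":
        return v[s[1]]
    if op == "not":
        return not holds(s[1], v)
    if op == "and":
        return holds(s[1], v) and holds(s[2], v)
    if op == "or":
        return holds(s[1], v) or holds(s[2], v)
    raise ValueError(f"unknown connective: {op}")

def consistent(db):
    """True if some valuation satisfies every sentence in db."""
    names = sorted(set().union(*(atoms(s) for s in db)))
    for values in product([True, False], repeat=len(names)):
        v = dict(zip(names, values))
        if all(holds(s, v) for s in db):
            return True
    return False

def receive(db, sender, sentence, trusted):
    """Return the receiver's database after `sender` transmits `sentence`."""
    if trusted:
        # Trustworthy source: the content itself is added.
        candidate = db | {sentence}
    else:
        # Untrustworthy source: only the fact of the utterance is recorded,
        # here as an opaque atom standing for "sender says 'sentence'".
        candidate = db | {("atom", f"{sender} says {sentence!r}")}
    # If the result is inconsistent, reject and keep the old database.
    return candidate if consistent(candidate) else db

if __name__ == "__main__":
    p = ("atom", "JonesVisitedAthens")
    db = {p}
    print(receive(db, "A", ("not", p), trusted=True))   # rejected, db unchanged
    print(receive(db, "B", ("not", p), trusted=False))  # only "B says ..." added
```

The sketch deliberately leaves out what the receiver later does with a recorded "says" atom; that evaluation step is where the choice of truth axioms would matter.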
Having established the motivation and some criteria for a transactional theory, we have reached the problem of determining what axioms and rules of inference such a theory should include. I chose to approach this initially as an empirical question. How is a truth predicate used in practice? There are traditional examples in the literature, but one can also decide to have a little fun with the project. For an extensive supply of examples, I turned to that font of logic puzzles, Raymond Smullyan’s What is the Name of This Book? [30]. Most of the puzzles in the book involve deciphering short conversations which explicitly or implicitly involve a truth predicate. It provides a large set of examples for which one can try to uncover the logical principles needed to extract the information being conveyed.
One early observation is that the set of principles needed to decode seemingly tangled webs of cross-references often tends to be remarkably weak. Indeed, for many situations, all that is needed is one of the directions of the T-sentences, either T(‘φ’) → φ or φ → T(‘φ’), where φ contains no occurrences of “T” (that is, φ is in the base language L). Almost everyone considers this
pair of restricted principles completely unobjectionable, and it is well known that they do not lead to contradiction. To take a very simple example, here is Smullyan’s Problem #28 (paraphrased): There is an island on which every inhabitant is either a Knight or a Knave. Knights always tell the truth and Knaves always lie, but they are otherwise indistinguishable. You encounter two of them; one says “At least one of us is a Knave.” What are they? An easy examination of cases provides the answer (the speaker is a Knight, his companion is a Knave); either of the two directions of the T-sentences is sufficient to justify the reasoning.21
There are, of course, situations in which the two directions of the T-sentences are not interchangeable. A simple example arises in a direct attribution, “That’s not true!”, where the reference of “that” is crystal clear from context. Obviously the speaker intends simply to assert the negation of whatever has been said, which we can derive by an application of modus tollens to φ → T(‘φ’). Similarly, other examples may require T(‘φ’) → φ. Since virtually all systems of truth validate the T-sentence T(‘φ’) ↔ φ for any φ not involving T, all such simple cases will be covered.
For more complicated examples, we have to be prepared to handle type-free applications of the truth predicate. Here the distinction between trustworthy and untrustworthy sources begins to matter. In the special case that all sources are considered trustworthy, it appears that FS, the proof-theoretically weakest of the three systems mentioned, is adequate to meet the requirements of transactional decoding, at least in regard to basic transmission of information.22 Indeed, all that is required is the rule that deduces φ from T(‘φ’). Since all sources are assumed to be trustworthy, all that needs to be considered is the movement of information from one database to another, either as individual LT-sentences or as generalizations.
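The examination of cases for Smullyan’s Problem #28 above can be checked mechanically. The following Python sketch is a minimal illustration, not the paper’s formal treatment: it enumerates the four possible assignments of Knight/Knave to the speaker and the companion and keeps those in which the speaker’s type agrees with the truth value of the statement. It uses the full two-valued biconditional (speaker is a Knight if and only if the statement is true) as a simplifying assumption, and the function name `solve_problem_28` is hypothetical.

```python
from itertools import product

def solve_problem_28():
    """Return all (speaker_is_knight, companion_is_knight) pairs consistent
    with the speaker saying "At least one of us is a Knave"."""
    solutions = []
    for speaker_is_knight, companion_is_knight in product([True, False], repeat=2):
        # Truth value of the statement in this candidate situation.
        statement = (not speaker_is_knight) or (not companion_is_knight)
        # A Knight's statement must be true; a Knave's must be false.
        if speaker_is_knight == statement:
            solutions.append((speaker_is_knight, companion_is_knight))
    return solutions

if __name__ == "__main__":
    for speaker, companion in solve_problem_28():
        print("speaker:", "Knight" if speaker else "Knave",
              "| companion:", "Knight" if companion else "Knave")
```

Only one assignment survives the check: the speaker is a Knight and the companion is a Knave, in agreement with the answer stated in the text.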
In the single-sentence case, if φ is a member of A’s database and A wants to convey it to B, then A simply sends φ to B. Since A is trustworthy, φ can be added to B’s database directly. If instead A wishes to convey an infinite list of sentences φn, where the list is defined by the formula R(x), then A transmits the generalization ∀x(R(x) → T(x)). For any given n, if B’s database includes R(‘φn’), then B can infer T(‘φn’)
21 A little care must be taken to justify this claim. If we wish to use principles of the form T(‘φ’) → φ, then we need to render “φ is a lie” as T(‘¬φ’), rather than ¬T(‘φ’). Such a treatment is the usual way to represent falsity. If instead we wish to use the principle φ → T(‘φ’), then “φ is a lie” must be ¬T(‘φ’), or else we need also to invoke the T-Consistency principle, ¬(T(‘φ’) ∧ T(‘¬φ’)).
22 Strictly speaking, this is not really FS, since FS is formulated in the language of arithmetic, which does not include expressions for intelligent agents, knights and knaves, etc. Instead, we consider the truth mechanism of FS laid on top of an appropriate expansion of the language and extension of the underlying theory. Similar remarks apply to further references to KF and VF as well.
and thus φn. If B’s database does not include R(‘φn’), then B will have to await further developments before concluding φn, but this in itself is a common enough phenomenon in the process of ordinary communication. (The trustworthiness assumption guarantees that ¬R(‘φn’) is not in B’s database.) Since both KF and VF prove all instances of the scheme T(‘φ’) → φ, which subsumes the relevant rule of inference of FS, they will work equally well as FS for the decoding of information from trustworthy sources.
However, it is also apparent that FS is not strong enough to derive the available information needed to evaluate an untrustworthy source. For example, consider a minor modification of the previous example from Smullyan: There is an island on which every inhabitant either always tells the truth or always lies, but the two types are otherwise indistinguishable. You encounter two of them; one says “At least one of us always lies.” Which type is each of them? The same analysis by cases works as before, but this time we need T(‘φ’) → φ where φ contains occurrences of the truth predicate.23 FS does not prove all instances of this scheme; the corresponding rule (from T(‘φ’) deduce φ) is not adequate, because it cannot be applied under open assumptions in a proof by cases. Indeed, a little fiddling with revision semantics for FS reveals that FS is not strong enough to validate the desired conclusion. Both KF and VF include the needed scheme, and are adequate to complete the proof here and in many similar examples. Thus decoding messages when sources are untrustworthy requires a stronger axiomatic theory than is needed in the fully trustworthy setting. At the moment I have no intuition as to which of KF and VF would be a more suitable theory for a transactional approach.
As a kind of final exam for this analysis, I turned to George Boolos’s wonderful little paper, “The Hardest Logic Puzzle Ever” [3], which analyzes another of Smullyan’s puzzles.
I will not repeat the problem in detail here, but it involves some characters who always tell the truth, some who always lie, some who tell the truth or lie at random, and all of whom speak in a language for which we do not know which word means “Yes” and which means “No”. The problem is to find a sequence of questions whose answers will allow the determination of which kind of speaker each character is. An analysis of Boolos’s solution shows that either KF or VF is sufficient to justify the reasoning which he employs. Whether or not this really is the hardest logic puzzle ever, its solution suggests that these axiomatizations are sufficient for reasoning at the limits of the kind of work which people expect truth predicates to do.

²³ Instances of the other direction of the T-sentences, φ → T(‘φ’), would also work, but this possibility is a vastly inferior choice: in most truth logics the unrestricted scheme φ → T(‘φ’) permits a proof of T(‘φ’) for all sentences φ.

This is not the end of the story, however, since there are easily understood uses of the truth predicate which cannot be handled adequately by KF or VF
MICHAEL SHEARD
alone. Consider the earlier example, “That’s not true!”, where “that” may now refer to a sentence containing the truth predicate. The intent of the speaker is as clear as it was before: to assert the negation of an earlier sentence. Yet this obvious inference cannot be drawn in KF or VF, since neither system proves all instances of the scheme φ → T(‘φ’), to which modus tollens would need to be applied. This may be an excellent occasion for an application of the preprocessing idea suggested above: in the specific case that B sends the message ¬T(‘φ’), add “B says ‘¬φ’” to the database instead of “B says ‘¬T(‘φ’)’”, before beginning the process of logical inference. (An intriguing possibility is to add both versions to the database. Then, for example, if B says “This sentence is not true”, call it λ, we will add both ¬T(‘λ’) and ¬λ to the database and derive a contradiction. We will conclude that B is speaking nonsense, which seems a reasonable conclusion when someone actually goes to the trouble to assert the Liar sentence.)

As a last thought on the selection of an appropriate theory, I will tentatively suggest one respect in which the theories considered here may be stronger than strictly necessary to fulfill the demands of a transactional approach. The systems FS, KF, and VF all include the axiom

∀n T(‘φ(x/n)’) → T(‘∀xφ(x)’),

where “x/n” indicates that the variable x has been replaced by the numeral for n.²⁴ While the principle behind this axiom is widely accepted as a principle of truth, it seems fairly superfluous for the purpose of decoding actual communication. The obvious use for the axiom occurs if the sender transmits the message ∀n T(‘φ(x/n)’), so that the receiver may infer first T(‘∀xφ(x)’) and then ∀xφ(x). But why would anyone choose to transmit such a message in the first place?
Rather than asserting that every substitution instance of a particular formula is true, it is both more natural and less complicated to assert the universally quantified sentence, avoiding an extraneous application of the truth predicate. Of course, there may be less obvious situations in which the axiom does play a crucial role, so it may be best to keep it in the mix pending further investigation.

So far this discussion has focused on the decoding of messages. The matter of how encoding takes place seems fraught with interesting philosophical questions. Some examples are easy: a single direct attribution of truth appears to be just a convenient way either to give emphasis to a sentence or (in an appropriate context) to transmit a sentence without the trouble of re-stating it.²⁵

²⁴ This axiom is the analogue of the Barcan Formula of quantified modal logic.
²⁵ Indeed, direct attributions without restating the sentence are sometimes called lazy uses of truth, although the terminology seems uncharitable.

Use of the truth predicate to express generalizations raises more complex issues. If I employ a truth predicate to express an infinite conjunction, what warrants my belief in the infinite list of sentences, so that I would look for a way to assert them all simultaneously in a single package? In some situations
my choice to use a truth predicate in this fashion may be a reflection more of the lack of expressive power of first-order logic than of my own insight. For example, I might say that every instance of the induction axiom scheme of Peano Arithmetic is true; in that case I am getting around my inability to state the single sentence of second-order logic which expresses my acceptance of the principle of induction.

Truly blind attributions are even more problematic. Why would I say, “Everything Jones said about chemistry is true”? Is it because I believe I have heard every statement Jones has ever made about chemistry and endorse every one of them? Or is it an expression of faith: I know Jones to be both so knowledgeable and so honest that I am sure that he would never state an untruth, at least on the subject of chemistry? This speculation probably takes us beyond the realm of logical analysis.

So far, the transactional approach which I have outlined really is an approach rather than a theory. As my tentative comments about the encoding process make clear, there is much work to be done even to define the boundaries of the philosophical task at hand. At the present moment, though, this open-ended nature of the problem is one of its appealing features. The transactional point of view in framing these questions reveals some intriguing areas for further study concerning the way that logic, and specifically the logic of truth, can be used to model the process of human communication.

Acknowledgment. Thanks to John Burgess, Hartry Field, Volker Halbach, Jeffrey Ketland, Robin Lock, Vann McGee, the anonymous referee, and the faculty and students of the St. Lawrence University Department of Philosophy for valuable discussions, reactions, and answers to my questions in the preparation of this paper.

REFERENCES
[1] Aristotle, The Metaphysics, G. P. Putnam’s Sons, New York, 1933, translated by Hugh Tredennick, Loeb Classical Library.
[2] J. Barwise and J. Etchemendy, The Liar: An Essay on Truth and Circularity, Oxford University Press, London and New York, 1987.
[3] G. Boolos, The hardest logic puzzle ever, The Harvard Review of Philosophy, vol. 6 (1996), pp. 62–65, reprinted in G. Boolos, Logic, Logic, and Logic, Harvard University Press, Cambridge, 1998, pp. 406–410.
[4] J. Burgess, Is there a problem about the deflationary theory of truth?, Principles of Truth (V. Halbach and L. Horsten, editors), Hänsel-Hohenausen, Frankfurt am Main, 2002, pp. 37–55.
[5] A. Cantini, Notes on formal theories of truth, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, vol. 35 (1989), no. 2, pp. 97–130.
[6] A. Cantini, A theory of formal truth arithmetically equivalent to ID₁, The Journal of Symbolic Logic, vol. 55 (1990), no. 1, pp. 244–259.
[7] A. Cantini, Partial truth, Principles of Truth (V. Halbach and L. Horsten, editors), Hänsel-Hohenausen, Frankfurt am Main, 2002, pp. 37–55.
[8] S. Feferman, Toward useful type-free theories. I, The Journal of Symbolic Logic, vol. 49 (1984), no. 1, pp. 75–111, reprinted in Martin [21], pp. 237–287.
[9] S. Feferman, Reflecting on incompleteness, The Journal of Symbolic Logic, vol. 56 (1991), no. 1, pp. 1–49.
[10] H. Field, A revenge-immune solution to the semantic paradoxes, Journal of Philosophical Logic, vol. 32 (2003), no. 2, pp. 139–177.
[11] H. Friedman and M. Sheard, An axiomatic approach to self-referential truth, Annals of Pure and Applied Logic, vol. 33 (1987), no. 1, pp. 1–21.
[12] C. Gauker, Partial truth, Deflationism and Paradox (J. C. Beall and Brad Armour-Garb, editors), Oxford University Press, 2005.
[13] M. Glanzberg, Truth, reflection, and hierarchies, Synthese, vol. 142 (2004), no. 3, pp. 289–315.
[14] A. Gupta and N. Belnap, The Revision Theory of Truth, MIT Press, Cambridge, MA, 1993.
[15] V. Halbach, A system of complete and consistent truth, Notre Dame Journal of Formal Logic, vol. 35 (1994), no. 3, pp. 311–327.
[16] V. Halbach and L. Horsten, The deflationist’s axioms for truth, Deflationism and Paradox (J. C. Beall and Brad Armour-Garb, editors), Oxford University Press, 2005.
[17] V. Halbach and L. Horsten, Axiomatizing Kripke’s theory of truth, to appear.
[18] H. G. Herzberger, Notes on naive semantics, Journal of Philosophical Logic, vol. 11 (1982), no. 1, pp. 61–102, reprinted in Martin [21], pp. 133–174.
[19] J. Ketland, Deflationism and Tarski’s paradise, Mind. A Quarterly Review of Philosophy, vol. 108 (1999), no. 429, pp. 69–94.
[20] S. Kripke, Outline of a theory of truth, Journal of Philosophy, vol. 72 (1975), pp. 690–716.
[21] R. L. Martin, Recent Essays on Truth and the Liar Paradox, Oxford University Press, London and New York, 1987.
[22] R. L. Martin and P. W. Woodruff, On representing “true-in-L” in L, Philosophia, vol. 5 (1975), pp. 217–221, reprinted in Martin [21], pp. 47–51.
[23] V. McGee, How truthlike can a predicate be? A negative result, Journal of Philosophical Logic, vol. 14 (1985), no. 4, pp. 399–410.
[24] V. McGee, Maximal consistent sets of instances of Tarski’s schema (T), Journal of Philosophical Logic, vol. 21 (1992), no. 3, pp. 235–241.
[25] R. Montague, Syntactical treatments of modality, with corollaries on reflexion principles and finite axiomatizability, Acta Philosophica Fennica, vol. 16 (1963), pp. 153–167, reprinted in R. Montague, Formal Philosophy: Selected Papers of Richard Montague, Yale University Press, New Haven, 1974, pp. 286–302.
[26] W. N. Reinhardt, Some remarks on extending and interpreting theories with a partial predicate for truth, Journal of Philosophical Logic, vol. 15 (1986), no. 2, pp. 219–251.
[27] Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 2nd ed., Prentice-Hall, Upper Saddle River, 2003.
[28] S. Shapiro, Proof and truth: through thick and thin, The Journal of Philosophy, vol. 95 (1998), no. 10, pp. 493–521.
[29] M. Sheard, Truth, provability, and naïve criteria, Principles of Truth (V. Halbach and L. Horsten, editors), Hänsel-Hohenausen, Frankfurt am Main, 2002, pp. 169–181.
[30] Raymond Smullyan, What Is the Name of This Book? The Riddle of Dracula and Other Logical Puzzles, Prentice-Hall, Inc., Englewood Cliffs, 1978.

DEPARTMENT OF MATHEMATICS, COMPUTER SCIENCE, AND STATISTICS
ST. LAWRENCE UNIVERSITY
CANTON, NY 13617, USA
E-mail:
[email protected]
ON SOME PROBLEMS IN COMPUTABLE TOPOLOGY
DIETER SPREEN
Abstract. Computations in spaces like the real numbers are not done on the points of the space itself but on some representation. If one considers only computable points, i.e., points that can be approximated in a computable way, finite objects such as the natural numbers can be used for this. In the case of the real numbers such an indexing can, e.g., be obtained by taking the Gödel numbers of those total computable functions that enumerate a fast Cauchy sequence of rational numbers. Obviously, the numbering is only a partial map. It will be seen that this is not a consequence of a bad choice, but is so by necessity. The paper will discuss some consequences. All is done in a rather general topological framework.
§1. Introduction. The wish and the need to compute with real numbers have been one of the driving forces for the development of large parts of analytical mathematics. Following the Grundlagenkrise, approaches to developing analysis in a constructive way were also put forward, based on one or another of the newly found formalizations of the notion of algorithm. Indeed, the problem of defining what a computable real number is was Turing’s [43, 44] main motivation for the introduction of his machine model.

Today computable analysis is an active research area in theoretical computer science. Among other things, the goal is to develop a way of computing with real numbers which does not suffer from such deficiencies as the unpredictable propagation of rounding errors. To achieve this one tries to extend the usual operations on the reals to a class of rational intervals approximating the reals. This larger structure is then also used to interpret the data type real used in programming languages. The aim is to use formal verification tools for reasoning about programs that compute with real numbers.

Every decreasing sequence of rational intervals, one properly contained in the other, so that the associated sequence of interval lengths tends to 0, uniquely determines a real number. Rational intervals are easy to code.
This research has partially been supported by the German Science Association (DFG) under grant 446 CHV 113/240/0-1 “Algorithmic Foundations of Numerical Computation, Computability and Computational Complexity”.

Logic Colloquium ’05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.
If there is a total recursive function that enumerates the codes of such a sequence, the corresponding real number is computable and any Gödel number of the recursive function is called an index of it. As follows from the definition, the operation of taking the limit of a decreasing sequence of rational intervals is effective with respect to the numbering (or indexing) thus obtained. Moreover, we can enumerate all intervals containing a given computable real, uniformly in any of its indices. These are two useful properties. On the other hand, the numbering is only a partial map. Its domain of definition is at least $\Pi^0_2$-hard. We will see that this is not a consequence of a clumsy definition: any indexing of the computable reals having the two properties just mentioned must be partial.

A large part of the present-day theory of numberings has been developed by the Russian school of computability theory (cf. [11, 13, 15, 16, 17]). In these studies only total numberings have been considered so far. This is legitimate as long as one has only numberings of algebraic structures in mind. As we have just seen, the situation is completely different in the case of topological spaces such as the computable reals. Here, the canonical numberings are only partial maps.

The situation with partial numberings is, in many respects, more complicated than the one with only total indexings. Typical notions in numbering theory require the existence of certain witness functions. Such a function can behave well also in the case of arguments for which a given condition is not satisfied. The numbering may, e.g., be undefined for such a number. Depending on how we deal with cases like this we obtain notions of different strength, which collapse when restricted to total numberings.
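The representation of a computable real by a total function enumerating nested rational intervals, described at the start of this passage, can be made concrete with a small sketch. The choice of $\sqrt{2}$, the bisection strategy, and the name `sqrt2_interval` are illustrative assumptions; in the terminology of the text, any Gödel number of such a program would serve as an index of the real it determines.

```python
from fractions import Fraction

def sqrt2_interval(n):
    """Return the n-th interval (lo, hi) of a nested sequence shrinking to sqrt(2)."""
    lo, hi = Fraction(1), Fraction(2)   # 1 < sqrt(2) < 2
    for _ in range(n):
        mid = (lo + hi) / 2
        # mid is rational, so mid * mid == 2 never happens.
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return lo, hi

# The interval lengths are 2**(-n), so they tend to 0.
lo, hi = sqrt2_interval(20)
print(float(lo), float(hi))
```

The function is total and each interval is contained in the previous one. Deciding whether an arbitrary program has these properties is a different matter, which is the source of the partiality discussed in the text.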
As is well known, given two numberings of a set, one is reducible to the other if there is a computable function translating, for every element of the numbered set, an index of this element with respect to the first numbering into an index of the same element with respect to the second numbering. In the case of a partial numbering this function can also map a natural number that is not an index of any element onto an index of some element. Thus, in general, knowing that the result of the translation is an index of some element, we cannot conclude that the argument of the translating function is an index of the same element, this time with respect to the first numbering. We can require the translation function to do only such translations (in which case we speak of strong reducibility), but we do not have to. Hence, we obtain two reducibility notions of different strength. As we will see, the degree structures of the partial numberings of a given set with respect to these two reducibility notions are completely different.

By Rice’s Theorem all nontrivial properties of the computable real numbers are undecidable. In order to study how difficult they are one considers their index sets. But the index sets have to be taken with respect to a partial
numbering. So, if one succeeds, e.g., in determining the level of the index set of some set $X$ of computable real numbers in the arithmetical hierarchy, say $\Sigma^0_n$, one cannot conclude that the index set of the complement of $X$ is in $\Pi^0_n$. All one knows is that the complement of the index set of $X$ is in $\Pi^0_n$. But in addition to indices of elements in the complement of $X$, this set contains natural numbers with no computational significance: they do not name any computable real number. As a way out of this dilemma Shapiro [29] suggested studying, instead of index sets, pairs of index sets, one of $X$ and one of the complement of $X$. But as we will see, in some cases one is happy to have only a classification of the index set of $X$.

The space of all computable real numbers with the induced Euclidean topology is just a special case of the general framework considered in this paper. The framework also includes the more general recursive metric spaces studied by Moschovakis [24] as well as constructive versions of Eršov’s A- and f-spaces [9, 12, 13, 14] and Scott’s directed-complete partial orders [28, 1, 38]. In contrast to metric spaces the latter classes of spaces satisfy only the rather weak $T_0$ separation axiom.

We consider second-countable $T_0$ spaces and assume that there has already been a way to define what their computable elements are. Our spaces contain only countably many elements and come with a numbering of these. As we have already seen, in general we can only expect that such numberings are partial. Moreover, we follow M. B. Smyth’s approach [31] and think of the basic open sets as easy-to-encode observations that can be made about the computational process determining the elements. Therefore, we let the topological basis be indexed in a total way. By making better and better observations we want finally to be able to determine every element. Thus, we need a relation of definite refinement between the basic open sets which in many cases will be stronger than set inclusion.
In most applications it will be recursively enumerable. As it turns out in these cases, the refinement relation is a relation between the codes of the basic open sets and not between the sets themselves. Therefore, we assume that the indexing of the basic open sets is such that there is a transitive relation on the indices so that the property of being a topological basis holds with respect to this relation instead of just set inclusion. The property of being a base of the topology is a ∀∃ statement. We require it to be realized by a computable function on the involved indices. This leads us to the notion of an effective space.

Note that we think of the topological basis with its numbering and the associated refinement relation as being part of the structure under consideration. We will encounter properties which are not invariant under a change of these givens, though of course the topology remains the same. This seems
to be a typical feature of constructive approaches: constructive notions may depend on how objects are represented.

The paper is organized as follows. Section 2 contains basic definitions. Different kinds of reducibilities between partial numberings are presented.

In Section 3 effective spaces are introduced and conditions on the numberings of their points are discussed. They require that the collection of all basic open sets containing a given point can be enumerated, uniformly in any index of that point. Moreover, from an enumeration of a filter base of basic open sets one can compute an index of the point the filter converges to. This leads us to the notion of an acceptable numbering. Numberings with only the first property are called computable. The partial recursive function involved in the second requirement can of course be defined for other arguments as well and still have indices of points as values. Sometimes one has to demand that it cannot do so. In this case the numbering is called strongly acceptable.

In Section 4 standard examples that satisfy the requirements set up in the preceding section are considered. They include constructive A- and f-spaces, constructive domains, recursive metric spaces, and the computable real numbers. As is shown in Section 5, the class of effective spaces with (strongly) acceptable numberings is closed under the construction of subspaces, Cartesian products, disjoint unions, as well as inverse limits.

In Section 6 the difficulty of some decision problems is studied. First the membership problem for nonopen sets is considered. The index set of such a set is always productive. As a consequence one obtains Rice’s Theorem for connected spaces. Then the problem of deciding for a given point whether it is nonfinite is examined. Here, a point is finite if its neighbourhood filter has a finite base. For a large class of spaces including the computable reals this problem is $\Pi^0_2$-complete.
In the case of strongly acceptably indexed spaces, the index set of any set containing a nonfinite point below which there is no finite point is $\Pi^0_2$-hard. Note here that any $T_0$ space comes with a canonical partial order, the specialization order. It follows that no strongly acceptable numbering of a space with such a nonfinite point can be total. In particular, we have that the above-mentioned numbering of the computable real numbers is necessarily partial.

In Section 7 the behaviour of the two reducibility relations for partial numberings mentioned earlier is investigated. To this end we consider all partial numberings on a fixed set and the induced degree structures. As already said, they are quite different. If the reducibility is employed that straightforwardly extends the one used for total numberings, then for each numbering there are uncountably many reducible to it. Furthermore, the degrees of partial numberings form a distributive lattice, which in the case of an effective topological space contains the degrees of computable numberings as an ideal. Remember
that only countably many total numberings can be reduced to a given numbering and the collection of their degrees is an upper semilattice, which in general is not a lattice. Moreover, in the total case, the degree of any Friedberg numbering is minimal in the semilattice ordering. Now, if the fixed set is infinite, there is an infinite descending chain of degrees of Friedberg numberings as well as an uncountable antichain of such degrees below every degree.

In the case of the strong reducibility relation the situation is similar to that of total numberings: the degrees form only a semilattice. By applying Eršov’s completion construction [11] to a partial numbering one obtains a complete total one. This can be used to establish an isomorphism between the upper semilattice of the degrees of partial numberings with respect to strong reducibility and the upper semilattice of all degrees of complete total numberings.

As we have seen so far, in many cases working with partial numberings is less easy than working with total ones and requires special attention. In addition, when conditions like acceptability have to be satisfied we cannot expect that every partial numbering extends to a total numbering of the same set also fulfilling the conditions. On the other hand, Eršov’s completion construction allows one to extend partial numberings, the result being not only total but also complete. The construction requires enlarging the given set by a finite element. Unfortunately, it does not preserve acceptability. So, the question comes up whether it is possible to totalize a given acceptable partial numbering by embedding the corresponding space into a larger one, containing more finite elements. A construction of this kind is presented in Section 8. The larger space is an algebraic constructive domain containing the given space as a homeomorphic image. If the given space is constructively complete and $T_1$, it corresponds exactly to the subspace of maximal elements of the new space.
Representations of topological spaces by domains have been considered by many authors, mostly in a nonconstructive environment and under different motivations (cf., e.g., [28, 47, 20, 39, 3, 40, 7, 18, 4, 8, 21, 25, 5]). Concluding remarks will be found in Section 9.

§2. Basic definitions. In what follows, let $\langle\,,\rangle : \omega^2 \to \omega$ be a recursive pairing function with corresponding projections $\pi_1$ and $\pi_2$ such that $\pi_i(\langle a_1, a_2 \rangle) = a_i$, and let $D$ be a standard coding of all finite subsets of natural numbers. Moreover, let $P^{(n)}$ ($R^{(n)}$) denote the set of all $n$-ary partial (total) recursive functions, and let $W_i$ be the domain of the $i$th partial recursive function $\varphi_i$ with respect to some Gödel numbering $\varphi$. We let $\varphi_i(a)\downarrow$ mean that the computation of $\varphi_i(a)$ stops and $\varphi_i(a)\downarrow \in C$ that it stops with value in $C$.

Let $S$ be a nonempty set. A (partial) numbering $\nu$ of $S$ is a partial map $\nu : \omega \rightharpoonup S$ (onto) with domain $\mathrm{dom}(\nu)$. The value of $\nu$ at $n \in \mathrm{dom}(\nu)$ is denoted by $\nu_n$. Note that instead of numbering we also say indexing.
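The text only assumes that some recursive pairing function with projections is fixed. One standard choice is the Cantor pairing function; the sketch below uses it purely for illustration, and the names `pair`, `pi1`, and `pi2` are assumptions made here, not notation taken from the paper.

```python
import math

def pair(a, b):
    """Cantor pairing: a bijection from pairs of naturals onto the naturals."""
    s = a + b
    return s * (s + 1) // 2 + b

def unpair(c):
    """Inverse of pair: recover (a, b) from the code c."""
    # s is the largest natural number with s * (s + 1) / 2 <= c.
    s = (math.isqrt(8 * c + 1) - 1) // 2
    b = c - s * (s + 1) // 2
    return s - b, b

def pi1(c):
    """First projection of a code."""
    return unpair(c)[0]

def pi2(c):
    """Second projection of a code."""
    return unpair(c)[1]

code = pair(3, 5)
print(code, pi1(code), pi2(code))
```

Both directions use only integer arithmetic, so the pairing and its projections are total computable functions, as the definition requires.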
Definition 2.1. A numbering $\nu$ of set $S$ is said to be
1. negative, if the set $\{\langle m, n \rangle \mid m, n \in \mathrm{dom}(\nu) \wedge \nu_m \neq \nu_n\}$ is recursively enumerable (r.e.),
2. precomplete, if for any function $g \in P^{(1)}$ there is a function $f \in R^{(1)}$ such that $f(n) \in \mathrm{dom}(\nu)$ and $\nu_{f(n)} = \nu_{g(n)}$, for $n \in \mathrm{dom}(g)$ with $g(n) \in \mathrm{dom}(\nu)$,
3. complete, if there is some element $e \in S$, called special element, such that for any function $g \in P^{(1)}$ there is a function $f \in R^{(1)}$ such that $f(n) \in \mathrm{dom}(\nu)$, for all $n \in \omega$ with either $n \notin \mathrm{dom}(g)$, or $n \in \mathrm{dom}(g)$ and $g(n) \in \mathrm{dom}(\nu)$, and
$$\nu_{f(n)} = \begin{cases} \nu_{g(n)} & \text{if } n \in \mathrm{dom}(g) \text{ with } g(n) \in \mathrm{dom}(\nu), \\ e & \text{if } n \in \omega \setminus \mathrm{dom}(g). \end{cases}$$

As is shown in [11], $\nu$ is precomplete if and only if the recursion theorem holds with respect to $\nu$. (Note that in [11] only total numberings are considered, but the proof is valid also in the partial case.)

A subset $X$ of $S$ is completely enumerable, if there is an r.e. set $W_n$ such that $\nu_i \in X$ if and only if $i \in W_n$, for all $i \in \mathrm{dom}(\nu)$. Set $M^\nu_n = X$, for any such $n$ and $X$, and let $M^\nu_n$ be undefined, otherwise. Then $M^\nu$ is a numbering of the class CE of completely enumerable subsets of $S$. If $W_n$ is recursive, $X$ is called completely recursive. $X$ is enumerable, if there is an r.e. set $A \subseteq \mathrm{dom}(\nu)$ such that $X = \{\nu_i \mid i \in A\}$. Thus, $X$ is enumerable if we can enumerate a subset of the index set of $X$ which contains at least one index for every element of $X$, whereas $X$ is completely enumerable if we can enumerate all indices of elements of $X$ and perhaps some numbers which are not used as indices by the numbering $\nu$.

Definition 2.2. Let $\nu$ and $\kappa$ be numberings of set $S$.
1. $\nu \le \kappa$, read $\nu$ is reducible to $\kappa$, if there is some witness function $f \in P^{(1)}$ such that $\mathrm{dom}(\nu) \subseteq \mathrm{dom}(f)$, $f(\mathrm{dom}(\nu)) \subseteq \mathrm{dom}(\kappa)$, and $\nu_a = \kappa_{f(a)}$, for all $a \in \mathrm{dom}(\nu)$.
2. $\nu \le_s \kappa$, read $\nu$ is strongly reducible to $\kappa$, if $\nu \le \kappa$ via $f \in P^{(1)}$ so that $\mathrm{dom}(\nu) = f^{-1}(\mathrm{dom}(\kappa))$.
3. $\nu \equiv \kappa$, read $\nu$ is equivalent to $\kappa$, if $\nu \le \kappa$ and $\kappa \le \nu$. Similarly for strong equivalence $\equiv_s$.
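The difference between the two reducibility notions of Definition 2.2 shows up already on a finite toy model. In the sketch below, partial numberings and witness functions are represented as Python dictionaries with finitely many keys; this representation and the names `reduces` and `strongly_reduces` are illustrative assumptions, and computability of the witness plays no role at this scale.

```python
def reduces(nu, kappa, f):
    """nu <= kappa via f: f maps every nu-index to a kappa-index of the same element."""
    return all(a in f and f[a] in kappa and nu[a] == kappa[f[a]] for a in nu)

def strongly_reduces(nu, kappa, f):
    """nu <=_s kappa via f: additionally dom(nu) equals the preimage of dom(kappa) under f."""
    preimage = {a for a in f if f[a] in kappa}
    return reduces(nu, kappa, f) and preimage == set(nu)

nu = {0: "a", 1: "b"}               # 2 is not a nu-index of anything
kappa = {0: "a", 1: "b", 2: "a"}    # 2 is a kappa-index of "a"

identity = {0: 0, 1: 1, 2: 2}       # also sends the non-index 2 to a kappa-index
restricted = {0: 0, 1: 1}           # undefined outside dom(nu)

print(reduces(nu, kappa, identity), strongly_reduces(nu, kappa, identity))
print(reduces(nu, kappa, restricted), strongly_reduces(nu, kappa, restricted))
```

The identity witnesses reducibility but not strong reducibility, because it maps the non-index 2 of the first numbering onto an index of the second. The restricted witness avoids this and so satisfies the stronger condition.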
As follows from the definition, if $\nu$ is reducible to $\kappa$ via $f \in P^{(1)}$, then we have that $\nu_a = \kappa_{f(a)}$, for all $a \in \mathrm{dom}(\nu)$, whereas $\nu$ is strongly reducible to $\kappa$ via $f \in P^{(1)}$ exactly if $\nu = \kappa \circ f$, where, if read pointwise, this equality means that either both sides are defined and equal, or both sides are undefined. It follows that $\nu \le_s \kappa$ via $f \in P^{(1)}$ if and only if for every $s \in S$ and all $i \in \omega$,
$$i \in \nu^{-1}(\{s\}) \Leftrightarrow f(i)\downarrow \in \kappa^{-1}(\{s\}),$$
and that $\nu \le \kappa$ via $f \in P^{(1)}$ if and only if for every $s \in S$ and all $i \in \omega$,
$$i \in \nu^{-1}(\{s\}) \Rightarrow f(i)\downarrow \in \kappa^{-1}(\{s\}).$$
This shows that strong reducibility extends Eršov’s notion of pm-reducibility for sets and families of sets [11] to partial numberings. Moreover, we see that in the case that $\nu \le \kappa$ it is only required that the witness function $f$ behaves correctly when transforming indices $i$ of elements $s \in S$ with respect to $\nu$ into indices $f(i)$ of $s$ with respect to $\kappa$. We do not demand that if $f(i)$ is an index of some $s$ with respect to $\kappa$, then $i$ must be an index of $s$ with respect to $\nu$. In some cases, though, we need to be able to reason in this way.

Definition 2.3. Let $(S, \nu)$ and $(S', \nu')$ be numbered sets and $F : S \to S'$. Then $F$ is effective if there is a function $f \in P^{(1)}$ such that $f(a)\downarrow \in \mathrm{dom}(\nu')$ and $F(\nu_a) = \nu'_{f(a)}$, for all $a \in \mathrm{dom}(\nu)$.

Now, let $\mathcal{T} = (T, \tau)$ be a topological $T_0$ space with countable basis $\mathcal{B}$. We also write $\tau = \langle \mathcal{B} \rangle$ to express that $\mathcal{B}$ is a countable basis of $\tau$. For any subset $X$ of $T$, $\mathrm{int}_\tau(X)$ and $\mathrm{cl}_\tau(X)$, respectively, are the interior and the closure of $X$. An open set $X$ is regular open, if $X = \mathrm{int}_\tau(\mathrm{cl}_\tau(X))$, and $\mathcal{T}$ is semi-regular, if all $X \in \mathcal{B}$ are regular open. (Note that in [6] a topology is called semi-regular, if it has a basis of regular open sets. Here, we think of a topology as being represented by a fixed basis.)

As is well known, each point $y$ of a $T_0$ space is uniquely determined by its neighbourhood filter $\mathcal{N}(y)$ and/or a base of it. A point $y$ is called finite, if $\mathcal{N}(y)$ has a finite and hence a singleton base. Moreover, on $T_0$ spaces there is a canonical partial order, the specialization order, which we denote by $\le_\tau$.

Definition 2.4. Let $\mathcal{T} = (T, \tau)$ be a $T_0$ space, and $y, z \in T$. Then $y \le_\tau z$ if $\mathcal{N}(y) \subseteq \mathcal{N}(z)$.

Let $B$ be a numbering of $\mathcal{B}$. By definition each open set is the union of certain basic open sets. In the context of effective topology one is only interested in enumerable unions. We call an open set $O \in \tau$ a Lacombe set, if there is an r.e. set $A \subseteq \mathrm{dom}(B)$ such that $O = \bigcup \{ B_a \mid a \in A \}$.
Set $L^\tau_n = \bigcup \{ B_a \mid a \in W_n \}$, if $W_n \subseteq \mathrm{dom}(B)$, and let $L^\tau_n$ be undefined, otherwise. Then $L^\tau$ is a numbering of the Lacombe sets of $\tau$. Obviously, $B \le L^\tau$. Now, we can effectively compare second-countable topologies and say what it means that a map is effectively continuous.

Definition 2.5. Let $\tau = \langle \mathcal{B} \rangle$ and $\eta = \langle \mathcal{C} \rangle$ be topologies on $T$, and $B$ and $C$, respectively, be numberings of $\mathcal{B}$ and $\mathcal{C}$.
1. $\eta \subseteq_e \tau$, read $\tau$ is effectively finer than $\eta$, if $C \le L^\tau$.
2. $\eta =_e \tau$, read $\eta$ and $\tau$ are effectively equivalent, if both $\eta \subseteq_e \tau$ and $\tau \subseteq_e \eta$.

Definition 2.6. For $i = 1, 2$, let $\mathcal{T}_i = (T_i, \tau_i)$ be a second-countable topological $T_0$ space with basis $\mathcal{B}_i$ and associated numbering $B^i$. Then a map $F : T_1 \to T_2$ is effectively continuous, if there is a function $v \in P^{(1)}$ such that for all $n \in \mathrm{dom}(B^2)$, $v(n)\downarrow \in \mathrm{dom}(L^{\tau_1})$ and $F^{-1}(B^2_n) = L^{\tau_1}_{v(n)}$.

Note that the usual continuity notion is strengthened here in two stages: first it is required that the preimage of a basic open set is not just open, but a Lacombe set, and then it is demanded that its index depends uniformly on the index of the given basic open set. It would perhaps be more appropriate to call such maps effectively Lacombe continuous. But for brevity we prefer the above notion. In case $F$ is an embedding, i.e., one-to-one, it is called effectively homeomorphic, if both $F$ and its partial inverse $F^{-1} : F(T_1) \to T_1$ are effectively continuous.

Every indexing of a countable set induces a family of natural topologies on this set. Let $T$ be a countable set with numbering $x$. A topology $\eta$ on $T$ is a Mal’cev topology [22], if it has a basis $\mathcal{C}$ of completely enumerable subsets of $T$. Any such basis is called a Mal’cev basis. $E = \langle \mathrm{CE} \rangle$ is called the Eršov topology. All Mal’cev bases on $T$ can be indexed in a uniform canonical way. Let $M^\eta_n = M^x_n$, if $M^x_n \in \mathcal{C}$, and let it be undefined, otherwise. Then $E$ is the effectively finest Mal’cev topology on $T$.

Besides the Eršov topology there are other important Mal’cev topologies. Obviously, CE is a distributive lattice with respect to union and intersection. For $U \in \mathrm{CE}$, let $U^*$ denote its pseudocomplement, that is, the greatest completely enumerable subset of $T \setminus U$, if it exists. $U$ is called regular, if $U^*$ and $U^{**}$ both exist and $U^{**} = U$. We say that a topology is a bi-Mal’cev topology, if it has a basis of regular sets. Any such basis is called a bi-Mal’cev basis.
Since the class REG of all regular subsets of $T$ is closed under intersection, it also generates a bi-Mal’cev topology on $T$, which we denote by $R$. Let $\eta$ be a bi-Mal’cev topology on $T$. Just as in the general case of all Mal’cev bases, also all bi-Mal’cev bases on $T$ can be indexed in a uniform way. Let to this end $R^\eta_{\langle m, n \rangle} = M^\eta_m$, if $m \in \mathrm{dom}(M^\eta)$, $n \in \mathrm{dom}(M^x)$ and $(M^\eta_m)^* = M^x_n$, and let it be undefined, otherwise. Then $R$ is the effectively finest bi-Mal’cev topology on $T$.

The reason for the introduction of bi-Mal’cev topologies is that in certain cases one needs to be able to enumerate not only each basic open set, but to a certain extent also its complement. In general, one cannot expect that the whole complement of a basic open set is completely enumerable. So, one has to decide for which part of it this should be the case.
ON SOME PROBLEMS IN COMPUTABLE TOPOLOGY
If is a topology on T , then a subset X of T is called weakly decidable, if its interior and its exterior are both completely enumerable. Obviously, every regular set is weakly decidable with respect to the Erˇsov topology. This leads to another choice of which part of the complements of basic open sets should be completely enumerable. We say that is complemented, if all of its basic open sets are weakly decidable. The class of all weakly decidable regular open sets of a topology on T generates a topology ∗ which is coarser than , but the effectively finest complemented semi-regular topology generated by open sets of topology ; it is said to be the complemented semi-regular topology associated with . Proposition 2.7. [34] R is the complemented semi-regular topology associated with the Erˇsov topology on T , that is, R = E ∗ . §3. Effective spaces. In what follows, let T = (T, ) be a countable topological T0 space with countable basis B. At first sight the requirement that T is countable seems quite restrictive. We think of T as being the subspace of computable elements of some larger space. There are several approaches to topology that come with natural computability notions for points and maps (cf. e.g. [30, 41, 45, 5]). It allows to assign indices to the computable points in a canonical way so that important properties become computable. In general the notion of computable point is rather complex, mainly harder than Σ01 . Consequently, the indexings of the computable points thus obtained are only partial maps. Contrary to this, in most applications the basic open sets have a simple finite description. By coding the descriptions one obtains a total numbering of the topological basis. For us basic open sets are predicates. Each point is uniquely determined by the collection of all predicates it satisfies, thus the T0 requirement. Usually, set inclusion between basic open sets is not completely enumerable. 
But in the applications we have in mind there is a canonical relation between the descriptions of the basic open sets (respectively, their code numbers), which in many cases is stronger than set inclusion. This relation is r.e. We assume that the topological basis B comes with a numbering B of its elements and such a relation between the codes. Definition 3.1. Let ≺B be a transitive binary relation on . We say that: 1. ≺B is a strong inclusion, if for all m, n ∈ dom(B), from m ≺B n it follows that Bm ⊆ Bn . 2. B is a strong basis, if ≺B is a strong inclusion and for all z ∈ T and m, n ∈ dom(B) with z ∈ Bm ∩ Bn there is a number a ∈ dom(B) such that z ∈ Ba , a ≺B m and a ≺B n.
DIETER SPREEN
For what follows we assume that $\prec_B$ is a strong inclusion with respect to which $\mathcal{B}$ is a strong basis.

Definition 3.2. Let $T = (T, \tau)$ be a countable topological $T_0$ space with countable basis $\mathcal{B}$, and let $x$ and $B$ be numberings of $T$ and $\mathcal{B}$, respectively. Then $T$ is effective, if $B$ is total and the property of being a strong basis holds effectively, which means that there exists a function $sb \in P^{(3)}$ such that for $i \in \mathrm{dom}(x)$ and $m, n \in \omega$ with $x_i \in B_m \cap B_n$, $sb(i, m, n)\downarrow$, $x_i \in B_{sb(i,m,n)}$, $sb(i, m, n) \prec_B m$, and $sb(i, m, n) \prec_B n$.

Obviously, the effectivity of $T$ is invariant under the equivalence of numberings of $T$. Note that very often the totality of $B$ can easily be achieved, if the space is recursively separable, which means that it has a dense enumerable subset, called its dense base. In applications canonical indexings usually have the important property that all basic open sets $B_n$ are completely enumerable, uniformly in $n$.

Definition 3.3. Let $T = (T, \tau)$ be a countable topological $T_0$ space with countable basis $\mathcal{B}$, and let $x$ and $B$ be numberings of $T$ and $\mathcal{B}$, respectively. We say that $x$ is computable if there is some r.e. set $L$ such that for all $i \in \mathrm{dom}(x)$ and $n \in \mathrm{dom}(B)$, $\langle i, n\rangle \in L \Leftrightarrow x_i \in B_n$.

As is readily verified, $T$ is effective if $x$ is computable, $B$ is total and the strong inclusion relation is r.e. Since we work with strong inclusion instead of set inclusion, we had to adjust the notion of a topological basis. In the same way we have to modify that of a filter base.

Definition 3.4. Let $\mathcal{H}$ be a filter. A nonempty subset $\mathcal{F}$ of $\mathcal{H}$ is called strong base of $\mathcal{H}$ if the following two conditions hold:
1. For all $m, n \in \mathrm{dom}(B)$ with $B_m, B_n \in \mathcal{F}$ there is some index $a \in \mathrm{dom}(B)$ such that $B_a \in \mathcal{F}$, $a \prec_B m$, and $a \prec_B n$.
2. For all $m \in \mathrm{dom}(B)$ with $B_m \in \mathcal{H}$ there is some index $a \in \mathrm{dom}(B)$ such that $B_a \in \mathcal{F}$ and $a \prec_B m$.

If $x$ is computable, a strong base of basic open sets can effectively be enumerated for each neighbourhood filter. For effective spaces this can always be done in a normed way [34].
Definition 3.5. An enumeration $(B_{f(a)})_{a \in \omega}$ with $f : \omega \to \omega$ such that $\mathrm{range}(f) \subseteq \mathrm{dom}(B)$ is said to be normed if $f$ is decreasing with respect to $\prec_B$. If $f$ is recursive, the enumeration is also called recursive, and any Gödel number of $f$ is said to be an index of it. In case $(B_{f(a)})$ enumerates a strong base of the neighbourhood filter of some point, we say it converges to that point.
We want not only to be able to generate normed recursive enumerations of basic open sets that converge to a given point, but conversely, we need also to be able to pass effectively from such enumerations to the point they converge to. Definition 3.6. Let x be a numbering of T . We say that: 1. x allows effective limit passing if there is a function pt ∈ P (1) such that, if m is an index of a normed recursive enumeration of basic open sets which converges to some point y ∈ T , then pt(m)↓ ∈ dom(x) and xpt(m) = y. 2. x is acceptable if it allows effective limit passing and is computable. Note that numbering x is precomplete exactly if there is a total function pt ∈ R(1) witnessing that x allows effective limit passing [33]. By definition the function pt is only required to behave correctly on indices of normed converging enumerations of basic open sets. If for some m ∈ we have that pt(m)↓ ∈ dom(x), we cannot conclude that m is an index of such an enumeration. Sometimes (Bϕm (a) ) should at least enumerate a strong base of N (xpt(m) ). Definition 3.7. A numbering x of T is strongly acceptable, if it is computable and there is a function pt ∈ P (1) which witnesses that x allows effective limit passing and is such that for m ∈ it follows from pt(m)↓ ∈ dom(x) that the collection of all sets Bϕm (a) (a ∈ dom(ϕm )) is a strong base of N (xpt(m) ). As is shown in [32], not every acceptable numbering is also strongly acceptable. But as we will see later, it is always equivalent to such a numbering. They exist under conditions usually satisfied in applications. If x is computable, each neighbourhood filter N (y) has a completely enumerable strong base of basic open sets, namely the set of all Ba with y ∈ Ba . In approaches to computable topology like [30, 45] this is the requirement for being a computable point. If, in addition, ≺B is r.e., one can construct a strongly acceptable numbering of T that is even precomplete and makes T effective. Proposition 3.8. 
Let $T$ be such that the neighbourhood filter of each point has an enumerable strong base of basic open sets. Moreover, let $\prec_B$ be r.e. Then $T$ has a precomplete and strongly acceptable numbering $\hat{x}$. If $B$ is total, $T$ is effective with respect to this numbering.

Note that the result is a consequence of [32, Proposition 2.11] and [34, Lemma 2.15]. The definition of $\hat{x}$ is straightforward: if $m$ is an index of a normed recursive enumeration of basic open sets that converges to a point $y \in T$, set $\hat{x}_m = y$. Otherwise let $\hat{x}_m$ be undefined. In some results we have to require that, up to strong equivalence, $T$ is indexed by $\hat{x}$.

Definition 3.9. A numbering $x$ is said to be a standard numbering of $T$, if it is strongly equivalent to the numbering $\hat{x}$.
Indexings which are computable and/or allow effective limit passing are related to each other in the following way.

Lemma 3.10. [34] Let $T$ be effective. Then for any two numberings $x'$ and $x''$ of $T$ the following hold:
1. If $x'$ is computable and $x''$ allows effective limit passing, then $x' \le x''$.
2. If $x'$ is computable and $x'' \le x'$, then $x''$ is computable.
3. If $x'$ allows effective limit passing and $x' \le x''$, then $x''$ allows effective limit passing.

Corollary 3.11. Let $T$ be effective and $x$ be acceptable. Then for any numbering $x'$ of $T$ the following hold:
1. $x'$ is computable if and only if $x' \le x$.
2. $x'$ allows effective limit passing if and only if $x \le x'$.
3. $x'$ is acceptable if and only if $x' \equiv x$.

We see that the acceptability notion is invariant under equivalence. The same is true for strong acceptability with respect to strong equivalence. If $\prec_B$ is r.e., it follows from Proposition 3.8 that for each computable numbering $x$ of $T$ there is a strongly acceptable numbering $x'$ of $T$ with $x \le x'$. If $x$ is acceptable, we even have that $x$ and $x'$ are equivalent.

For neighbourhood filters of points having an enumerable strong base, we can always construct a normed enumeration of a strong base of the same filter. But not every such enumeration needs to converge. This gives rise to the following completeness notion.

Definition 3.12. A $T_0$ space $T = (T, \tau, \mathcal{B}, B, \prec_B)$ with a countable strong basis is constructively complete, if each normed recursive enumeration of basic open sets converges.

As we shall see in the next section, the constructive completeness of a space may depend on the choice of the topological basis $\mathcal{B}$ (as well as the numbering $B$ and the strong inclusion relation $\prec_B$ belonging to it).

Proposition 3.13. Let $T$ be effective and constructively complete such that all basic open sets are nonempty. Let $\prec_B$ be r.e. and $x$ allow effective limit passing. Then $T$ is recursively separable.

Proof. Since $\prec_B$ is r.e., there is some function $t \in R^{(1)}$ so that $W_{t(m)} = \{ n \in \omega \mid n \prec_B m \}$. Note that each such set is nonempty because all basic open sets are nonempty and $\mathcal{B}$ is strong. Let $k \in R^{(1)}$ with
$$\varphi_{k(m)}(0) = m, \qquad \varphi_{k(m)}(i + 1) = \varphi_{r(t(\varphi_{k(m)}(i)))}(0).$$
Here, r ∈ R(1) so that ϕr(i) enumerates Wi . Again, since all basic open sets are nonempty and B is strong, ϕk(m) is total. Thus, k(m) is an index of a normed recursive enumeration of basic open sets. As T is constructively complete, it converges to xpt(k(m)) . By the construction of function k we have that xpt(k(m)) ∈ Bm . This shows that the set of all points xpt(k(m)) (m ∈ ) is an enumerable dense base of T . §4. Special cases. In this section we will consider some important standard examples of effective T0 spaces. 4.1. Constructive domains. Let Q = (Q, ) be a partial order with least element. A nonempty subset S of Q is directed, if for all y1 , y2 ∈ S there is some u ∈ S with y1 , y2 u. The way-below relation on Q is defined as follows: y1 y2 if for every directedsubset S of Q the least upper bound of which exists in Q, the relation y2 S implies the existence of an element u ∈ S with y1 u. Note that is transitive. Elements y ∈ Q with y y are called compact. A subset Z of Q is a basis of Q, if for any y ∈ Q the set Zy = { z ∈ Z | z y } is directed and y = Zy . A partial order that has a basis is called continuous. If all elements of Z are compact, Q is said to be algebraic and Z is called algebraic basis. Now, assume that Q is countable and let x be an indexing of Q. Then Q is constructively d-complete, if each of its enumerable directed subsets has a least upper bound in Q. Let Q be constructively d-complete and continuous with basis Z. Moreover, let be a total numbering of Z. Then (Q, , Z, , x) is said to be a constructive domain, if the restriction of the way-below relation to Z as well as all sets Zy , for y ∈ Q, are completely enumerable with respect to the indexing and ≤ x. The numbering x of Q is said to be admissible, if the set { i, j | i xj } is r.e. and there is a function d ∈ R(1) such that for all indices i ∈ for which
(Wi ) is directed, xd (i) is the least upper bound of (Wi ). As shown in [46], such numberings always exist. They can even be chosen as total. Partial orders come with several natural topologies. In the applications we have in mind, one is mainly interested in the Scott topology : a subset X of Q is open in , if it is upwards closed with respect to the partial order and intersects each enumerable directed subset of Q of which it contains the least upper bound. In the case of a constructive domain this topology is generated by the sets Bn = {y ∈ Q | n y} with n ∈ . It follows that Q = (Q, ) is a countable T0 -space with countable basis. Observe that the partial order on Q coincides with the specialization order defined by the Scott toplogy [19]. Moreover, compactness matches with finiteness. Obviously, every admissible numbering is computable. Since Z is dense in Q we also obtain that Q is recursively separable.
Define m ≺B n ⇔ n m . Then ≺B is a strong inclusion with respect to which the collection of all Bn is a strong basis. Because the restriction of to Z is completely enumerable, ≺B is r.e. It follows that Q is effective. Moreover, it is constructively complete and each admissible indexing allows effective limit passing, i.e., it is acceptable. Conversely, every acceptable numbering of Q is admissible. Note here that since we have to make use of the effectivity characteristics of the basis, these properties can only be verified if we choose the strong inclusion relation as above and do not use simple set inclusion instead. The next result is a special case of a characterization theorem for effective spaces in [34] and generalizes the well-known Rice/Shapiro Theorem [27]. Theorem 4.1. For any admissibly indexed constructive domain, =e E. A partial order Q is bounded-complete, if every bounded subset of Q has a least upper bound in Q. Algebraic, bounded-complete constructive domains are called constructive Scott domains, if the restriction of the domain order to Z as well as the boundedness of two elements of Z are completely recursive, and there is a function su ∈ R(2) such that for any two bounded elements m and n , su(m,n) is their least upper bound. 4.2. Constructive A- and f-spaces. A- and f-spaces have been introduced by Erˇsov [9, 10, 12, 13, 16] as a more topologically oriented approach to domain theory. They are not required to be complete. Let Y = (Y, ) be a topological T0 space. For elements y, z ∈ Y define y z if z ∈ int ({u ∈ Y | y ≤ u}). Then y is finite if and only if y y. Y is an A-space, if there is a subset Y0 of Y satisfying the following three properties: 1. Any two elements of Y0 which are bounded in Y with respect to the specialization order have a least upper bound in Y0 . 2. The collection of sets int ({u ∈ Y | y ≤ u}), for y ∈ Y0 , is a basis of topology . 3. For any y ∈ Y0 and u ∈ Y with y u there is some z ∈ Y0 such that y z and z u. 
Any subset Y0 of Y with these properties is called basic subspace. Let Y be countable and Y0 have a numbering . For m, n ∈ dom( ) set Bn = int ({u ∈ Y | n ≤ u }) and define m ≺B n ⇔ n m . Then ≺B is a strong inclusion with respect to which {Bn | n ∈ dom( )} is a strong basis. The A-space Y with basic subspace Y0 is constructive, if the numbering is total, the restriction of to Y0 is completely enumerable, and the neighbourhood filter of each point has an enumerable strong base of basic
open sets. As follows from Proposition 3.8, Y has a precomplete standard numbering x such that Y is effective. Moreover, it is recursively separable with dense basis Y0 . Theorem 4.2. [34] For every constructive A-space (Y, ), =e E. Since the topology of a constructive A-space is not required to be the Scott topology (with respect to ≤ ), constructive d-completeness is too weak a completeness notion in this case. Definition 4.3. A constructive A-space Y is effectively complete, if every enumerable directed subset S of Y with the property that for every z ∈ S there is some z ∈ S with z z , has an upper bound y ∈ Y which is also a limit point of S. Obviously, given such a set S we can enumerate a subset S such that any two elements of S are comparable with respect to and for every z ∈ S there is some z ∈ S with z z . This gives us the following result. Proposition 4.4. A constructive A-space Y is constructively complete if and only if it is effectively complete. Let Y = (Y, ) be again an arbitrary topological T0 -space. An open set V is an f-set, if there is an element zV ∈ V such that V = {y ∈ Y | zV ≤ y}. The uniquely determined element zV is called an f-element. Y is an f-space, if the following two conditions hold: 1. If U and V are f-sets with nonempty intersection, then U ∩ V is also an f-set. 2. The collection of all f-sets is a basis of topology . An f-space is constructive, if the set of all f-elements has a total numbering α such that the restriction of the specialization order to this set as well as the boundedness of two f-elements are completely recursive and there is a function su ∈ R(2) such that in the case that αn and αm are bounded, αsu(n,m) is their least upper bound, and if the neighbourhood filter of each point has an enumerable base of f-sets. Every f-space is an A-space with basic subspace the set of all f-elements, which are exactly the finite elements of the space. 
Moreover, for $y, z \in Y$ with $y$ or $z$ being an f-element, $y \ll z$ if and only if $y \le_\tau z$. It follows that also every constructive f-space is a constructive A-space.

4.3. Constructive metric spaces. Let $\mathbb{R}$ denote the set of all real numbers, and let $\nu$ be some canonical total indexing of the rational numbers. Then a real number $z$ is said to be computable, if there is a function $f \in R^{(1)}$ such that for all $m, n \in \omega$ with $m \le n$, the inequality $|\nu_{f(m)} - \nu_{f(n)}| < 2^{-m}$ holds and $z = \lim_m \nu_{f(m)}$. Any Gödel number of the function $f$ is called an index of $z$. This defines a partial indexing $\gamma$ of the set $\mathbb{R}_c$ of all computable real numbers.
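The definition of a computable real can be illustrated by a small sketch. It is not taken from the paper: the function names `sqrt2_approx` and `is_fast` are hypothetical, exact rationals stand in for the indexing of the rationals, and the convergence condition can of course only be tested on a finite range of indices.

```python
from fractions import Fraction


def sqrt2_approx(m: int) -> Fraction:
    """Rational lower approximation q of sqrt(2) with sqrt(2) - q <= 2**-(m+1).

    Plain bisection on [1, 2]; plays the role of m -> nu_{f(m)} in the text.
    """
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo > Fraction(1, 2 ** (m + 1)):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


def is_fast(seq, bound: int) -> bool:
    """Check |seq(m) - seq(n)| < 2**-m for all m <= n < bound (finite test only)."""
    return all(
        abs(seq(m) - seq(n)) < Fraction(1, 2 ** m)
        for m in range(bound)
        for n in range(m, bound)
    )


if __name__ == "__main__":
    print(is_fast(sqrt2_approx, 12))
    print(is_fast(lambda m: Fraction(1, m + 1), 12))
```

The second call uses the sequence 1/(m+1), which converges to 0 but too slowly: it violates the bound already for m = 2, so it is not an admissible name of 0 in the sense of the definition.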
Now, let $M = (M, \delta)$ be a separable metric space with $\mathrm{range}(\delta) \subseteq \mathbb{R}_c$, and let $\beta$ be a total numbering of the dense subset $M_0$. A sequence $(y_a)_{a \in \omega}$ of elements of $M_0$ is said to be fast, if $\delta(y_m, y_n) < 2^{-m}$, for all $m, n \in \omega$ with $m \le n$. Moreover, $(y_a)$ is recursive, if there is some function $f \in R^{(1)}$ such that $y_a = \beta_{f(a)}$, for all $a \in \omega$. Any Gödel number of $f$ is called an index of $(y_a)$. $M$ is said to be constructive, if the restriction of the distance function to $M_0$ is effective, i.e., if there is some function $d \in R^{(2)}$ such that for all $i, j \in \omega$, $\delta(\beta_i, \beta_j) = \gamma_{d(i,j)}$, where $\gamma$ is the partial indexing of the computable real numbers, and each element $y$ of $M$ is the limit of a fast recursive sequence of elements of $M_0$. If $m$ is the index of such a sequence, set $x_m = y$. Otherwise, let $x_m$ be undefined. Then $x$ is a numbering of $M$ with respect to which, together with the indexing of the computable real numbers, the distance function is effective [32].

As is well known, the collection of sets $B_{\langle i,m\rangle} = \{ y \in M \mid \delta(\beta_i, y) < 2^{-m} \}$ ($i, m \in \omega$) is a basis of the canonical Hausdorff topology $\Delta$ on $M$. Because the usual less-than relation on the computable real numbers is completely enumerable [23], it follows that $x$ is computable. A point $y \in M$ is finite if and only if it is isolated [32]. Define
$$\langle i, m\rangle \prec_B \langle j, n\rangle \Leftrightarrow \delta(\beta_i, \beta_j) + 2^{-m} < 2^{-n}.$$
Using the triangle inequality it is readily verified that $\prec_B$ is a strong inclusion and the collection of all $B_a$ is a strong basis. Moreover, $\prec_B$ is r.e. It follows that $M$ is effective. The numbering $x$ is precomplete and standard [32].

Theorem 4.5. [32, 34] Let $M$ be a constructive metric space. Then the following statements hold:
1. $M$ is constructively complete if and only if every fast recursive sequence of elements of the dense subset converges.
2. $\Delta =_e \mathcal{R}$.

Well-known examples of constructive metric spaces include $\mathbb{R}^n_c$ with the Euclidean or the maximum norm, Baire space, that is, the set $R^{(1)}$ of all total recursive functions with the Baire metric [27], and the set $\omega$ with the discrete metric.
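The relation between strong inclusion and ordinary set inclusion for metric balls can be made concrete on the rational line. The following is a minimal sketch under stated assumptions: balls are represented as pairs (centre, exponent) rather than by coded indices, and the helper names `strongly_included` and `set_included` are illustrative, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def radius(m: int) -> Fraction:
    """Radius 2**-m of a basic ball with exponent m."""
    return Fraction(1, 2 ** m)


def strongly_included(a, b) -> bool:
    """(c, m) strongly included in (d, n) iff |c - d| + 2**-m < 2**-n."""
    (c, m), (d, n) = a, b
    return abs(c - d) + radius(m) < radius(n)


def set_included(a, b) -> bool:
    """Ordinary inclusion of the open intervals (c - 2**-m, c + 2**-m)."""
    (c, m), (d, n) = a, b
    return d - radius(n) <= c - radius(m) and c + radius(m) <= d + radius(n)


if __name__ == "__main__":
    centres = [Fraction(k, 4) for k in range(-4, 5)]
    balls = list(product(centres, range(4)))
    # Strong inclusion implies set inclusion on this finite sample ...
    print(all(set_included(a, b)
              for a in balls for b in balls if strongly_included(a, b)))
    # ... but not conversely: no ball is strongly included in itself.
    ball = (Fraction(0), 1)
    print(set_included(ball, ball), strongly_included(ball, ball))
```

The point of the design is that the strong relation is decided by a strict inequality between computable quantities, which is what makes it r.e., whereas set inclusion between balls need not be.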
By using an effective version of Weierstraß’s approximation theorem [26] and Sturm’s theorem [42] it can be shown that Cc [0, 1], the space of all computable functions from [0, 1] to R [26] with the supremum norm, is a constructive metric space. A proof of this result and further examples can be found in Blanck [4]. As follows from Theorem 4.5 (2), the metric topology on Baire space is also generated by the completely enumerable subsets with completely enumerable complement. 4.4. The computable real numbers. When starting our presentation of the theory of effective topological spaces we took for granted that the spaces under
consideration come with a fixed countable strong basis $\mathcal{B}$ of the topology, a total numbering $B$ of $\mathcal{B}$ and a strong inclusion relation. An assumption of this kind seems to be unavoidable in a constructive approach. As a consequence, certain notions may not be invariant under basis change.

Let us consider the space $\mathbb{R}_c$ of all computable real numbers with the induced Euclidean topology. Then $\mathbb{R}_c$ is a constructive metric space, which is constructively complete with respect to the basis introduced above. When dealing with the real numbers we usually use other bases, e.g. the collection $\mathcal{I}$ of all open intervals with rational endpoints. As we shall show now, $\mathbb{R}_c$ is not constructively complete with respect to this basis.

Let again $\nu$ be a canonical indexing of the rational numbers and $f \in R^{(1)}$ enumerate the set $\{ \langle m, n\rangle \mid \nu_m < \nu_n \}$. Set $I_a = (\nu_{\pi_1(f(a))}, \nu_{\pi_2(f(a))})$ and let
$$a \prec_I b \Leftrightarrow \nu_{\pi_1(f(b))} < \nu_{\pi_1(f(a))} \wedge \nu_{\pi_2(f(a))} < \nu_{\pi_2(f(b))}.$$
Then $\prec_I$ is an r.e. strong inclusion with respect to which $\mathcal{I}$ is a strong basis. Moreover, for every $y \in \mathbb{R}_c$, $N(y)$ has an enumerable strong base of basic open sets. It follows that also in this case $\mathbb{R}_c$ has a precomplete standard numbering so that it is an effective space. In addition, $\mathbb{R}_c$ is recursively separable. Now, let $m \in \omega$ with $I_{\varphi_m(n)} = (-\frac{1}{n}, 1 + \frac{1}{n})$. Then $m$ is an index of a normed recursive enumeration of basic open sets. But this enumeration does not converge: every interval in it contains all of $[0, 1]$, so it is not a strong base of the neighbourhood filter of any single point.

§5. New spaces from old. In this section we study how properties like the effectivity of spaces or the acceptability of numberings are inherited by spaces constructed from given ones.

5.1. Base reduction. By definition a topological basis is allowed to contain the empty set, but sometimes it is useful to exclude this. We will show that under the assumption that $\{ n \in \omega \mid B_n \neq \emptyset \}$ is r.e., we can define a numbering $B'$ of the basis $\mathcal{B}' = \mathcal{B} \setminus \{\emptyset\}$ and a strong inclusion relation $\prec_{B'}$ so that $\mathcal{B}'$ is strong again and $T$ is effective with respect to the new basis, if it was so with respect to the old one.
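Returning to the counterexample of Section 4.4, the behaviour of the nested intervals can be checked on a finite prefix. This is a sketch under assumptions: intervals are given directly by their rational endpoints instead of by indices, the index is shifted by one to avoid a division by zero at n = 0, and the names `interval`, `strictly_inside` and `common_points` are hypothetical.

```python
from fractions import Fraction


def interval(n: int):
    """The n-th interval (-1/k, 1 + 1/k) with k = n + 1 (index shift is an assumption)."""
    k = n + 1
    return (Fraction(-1, k), 1 + Fraction(1, k))


def strictly_inside(a, b) -> bool:
    """Strong inclusion on intervals: both endpoints of a lie strictly inside b."""
    return b[0] < a[0] and a[1] < b[1]


def normed_prefix(length: int) -> bool:
    """Each interval is strongly included in its predecessor, for the first `length` steps."""
    return all(strictly_inside(interval(n + 1), interval(n)) for n in range(length))


def common_points(candidates, length: int):
    """Candidates that lie in every one of the first `length` intervals."""
    return [y for y in candidates
            if all(interval(n)[0] < y < interval(n)[1] for n in range(length))]


if __name__ == "__main__":
    print(normed_prefix(100))
    print(common_points([Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)], 100))
```

On the tested prefix the enumeration is normed, yet several distinct points (0, 1/2 and 1 among the candidates) belong to every interval, which matches the observation that the enumeration cannot single out one limit point.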
Let $f \in R^{(1)}$ enumerate the set of all $n \in \omega$ for which $B_n$ is not empty. Then $f$ has a right inverse $f' \in P^{(1)}$ defined by $f'(a) = \mu n : f(n) = a$. Set $B'_n = B_{f(n)}$ and let $m \prec_{B'} n \Leftrightarrow f(m) \prec_B f(n)$.

Lemma 5.1. Let $T = (T, \tau)$ be a countable $T_0$ space with countable basis $\mathcal{B}$, and let $x$ and $B$ be numberings of $T$ and $\mathcal{B}$, respectively. Moreover, let $\prec_B$ be a strong inclusion relation. Then $\prec_{B'}$ is a strong inclusion as well and the following statements hold:
1. The identity on $T$ induces an effective homeomorphism between $T = (T, \tau, \mathcal{B}, B, \prec_B, x)$ and $T' = (T, \tau, \mathcal{B}', B', \prec_{B'}, x)$.
2. If $\mathcal{B}$ is strong, the same is true for $\mathcal{B}'$.
3. If $T$ is effective, then $T'$ is effective as well.
4. If $T$ is constructively complete, then also $T'$ is constructively complete.

Proof. For the proof of the second statement let $z \in T$ with $z \in B'_m \cap B'_n$. Then $z \in B_{f(m)} \cap B_{f(n)}$. Hence, there is some $a \prec_B f(m), f(n)$ with $z \in B_a$. It follows that $B_a$ is not empty. Thus $f'(a)$ is defined, $z \in B'_{f'(a)}$ and $f'(a) \prec_{B'} m, n$. The proof of the other statements is obvious.

Note that the above assumption about $\{ n \in \omega \mid B_n \neq \emptyset \}$ is always fulfilled, if $T$ is recursively separable and $x$ is computable. When we introduced computability and similar notions for the numbering $x$ of $T$, we did that with respect to a fixed topological basis $\mathcal{B}$ with numbering $B$ and strong inclusion $\prec_B$. All these properties remain unchanged, if we use $\mathcal{B}'$, $B'$ and $\prec_{B'}$ instead.

Lemma 5.2. Let $T = (T, \tau)$ be a countable $T_0$ space with countable basis $\mathcal{B}$, and let $x$ and $B$ be numberings of $T$ and $\mathcal{B}$, respectively. Moreover, let $\prec_B$ be a strong inclusion relation. Then the following statements hold:
1. If $x$ is computable with respect to $B$, it is computable with respect to $B'$ as well.
2. If $x$ allows effective limit passing with respect to $B$ and $\prec_B$, it does so also with respect to $B'$ and $\prec_{B'}$.
3. If $x$ is (strongly) acceptable with respect to $B$ and $\prec_B$, it is also (strongly) acceptable with respect to $B'$ and $\prec_{B'}$.
4. If $x$ is a standard numbering with respect to $B$ and $\prec_B$, it is also a standard numbering with respect to $B'$ and $\prec_{B'}$.

Proof. The proof of the first statement is obvious. For the proof of the second one let $k \in R^{(1)}$ with $\varphi_{k(m)} = f \circ \varphi_m$. Then, if $m$ is an index of a normed recursive enumeration of basic open sets with respect to $B'$ and $\prec_{B'}$, $k(m)$ is an index of the same recursive enumeration, which is now normed with respect to $B$ and $\prec_B$. It follows that $pt' = pt \circ k$ witnesses that $x$ allows effective limit passing with respect to $B'$ and $\prec_{B'}$. The third statement is now an easy consequence.

For the proof of the last statement define numberings $\bar{x}$ and $\hat{x}$ as follows. If $(B_{\varphi_m(a)})_{a \in \omega}$ is a normed enumeration of a strong base of $N(y)$, for $y \in T$, set $\bar{x}_m = y$; likewise, if $(B'_{\varphi_m(a)})_{a \in \omega}$ is such an enumeration, set $\hat{x}_m = y$. Otherwise let the respective numbering be undefined. Then $\hat{x} \le_s \bar{x}$ via the just defined function $k \in R^{(1)}$. Conversely, if $(B_{\varphi_m(a)})$ is a normed enumeration of a strong base of $N(\bar{x}_m)$, then $B_{\varphi_m(a)}$ is not empty, for all $a \in \omega$. Hence $f'(\varphi_m(a))$ is always defined. Let $k' \in R^{(1)}$ with $\varphi_{k'(m)} = f' \circ \varphi_m$. Then $k'$ witnesses that $\bar{x} \le_s \hat{x}$. Thus, $\bar{x} \equiv_s \hat{x}$.
Now, if $x$ is a standard numbering with respect to $B$ and $\prec_B$, then $x \equiv_s \bar{x}$. It follows that $x \equiv_s \hat{x}$, which shows that $x$ is also standard with respect to $B'$ and $\prec_{B'}$.

5.2. Subspaces. Let $S$ be a subset of $T$ and $\tau_S$ be the induced topology on $S$. Define $\mathcal{B}^S$ to be the collection of all sets $X \cap S$ with $X \in \mathcal{B}$, set $B^S_n = B_n \cap S$ and let $m \prec_{B^S} n \Leftrightarrow m \prec_B n$. Finally, let $x^S_i = x_i$, if $i \in \mathrm{dom}(x)$ and $x_i \in S$. In any other case let $x^S$ be undefined. Set $T^S = (S, \tau_S, \mathcal{B}^S, B^S, \prec_{B^S}, x^S)$.

Lemma 5.3. Let $T = (T, \tau)$ be a countable $T_0$ space with countable basis $\mathcal{B}$, numberings $x$ and $B$ of $T$ and $\mathcal{B}$, respectively, and strong inclusion $\prec_B$. Moreover, let $S \subseteq T$. Then $\prec_{B^S}$ is also a strong inclusion and the following statements hold:
1. The canonical embedding of $S$ in $T$ is both effective and effectively continuous.
2. If $\mathcal{B}$ is strong, $\mathcal{B}^S$ is also strong.
3. If $T$ is effective, $T^S$ is effective as well.
4. If $x$ is computable, allows effective limit passing, is (strongly) acceptable, or a standard numbering, the same holds for $x^S$.

Note here that for numberings $\bar{x}$ and $\hat{x}$ of $T$ with $\bar{x} \le_s \hat{x}$, we have that $\bar{x}^S \le_s \hat{x}^S$.

5.3. Disjoint unions. For $\iota = 1, 2$, let $T^\iota = (T^\iota, \tau^\iota)$ be a countable $T_0$ space with countable basis $\mathcal{B}^\iota$, numberings $x^\iota$ and $B^\iota$ of $T^\iota$ and $\mathcal{B}^\iota$, respectively, and strong inclusion relation $\prec_{B^\iota}$. Set
$$T^1 \oplus T^2 = (\{1\} \times T^1) \cup (\{2\} \times T^2),$$
$$B^\oplus_{\langle n,m\rangle} = \begin{cases} \{1\} \times B^1_m & \text{if } n \text{ is odd,} \\ \{2\} \times B^2_m & \text{otherwise} \end{cases} \qquad (n, m \in \omega),$$
$$\langle n, m\rangle \prec_{B^\oplus} \langle n', m'\rangle \Leftrightarrow [n, n' \text{ odd} \wedge m \prec_{B^1} m'] \vee [n, n' \text{ even} \wedge m \prec_{B^2} m'].$$
Let $\mathcal{B}^\oplus$ be the collection of all sets $B^\oplus_a$, for $a \in \omega$. Then $\mathcal{B}^\oplus$ is a basis of the canonical topology $\tau^\oplus$ on $T^1 \oplus T^2$. Moreover, $\prec_{B^\oplus}$ is a strong inclusion relation. Obviously, $T^1 \oplus T^2 = (T^1 \oplus T^2, \tau^\oplus)$ is a countable $T_0$ space again. Set
$$x^\oplus_{\langle n,i\rangle} = \begin{cases} (1, x^1_i) & \text{if } n \text{ is odd and } i \in \mathrm{dom}(x^1), \\ (2, x^2_i) & \text{if } n \text{ is even and } i \in \mathrm{dom}(x^2). \end{cases}$$
In any other case let $x^\oplus$ be undefined.
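The numbering of the disjoint union can be sketched for toy data. The following rests on assumptions that are not in the paper: partial numberings are modelled as finite dictionaries, the pairing function is taken to be Cantor pairing (the text only presupposes some fixed recursive pairing), and the names `pair`, `unpair` and `x_sum` are hypothetical.

```python
from math import isqrt


def pair(n: int, m: int) -> int:
    """Cantor pairing, used here as a stand-in for the paper's pairing function."""
    return (n + m) * (n + m + 1) // 2 + m


def unpair(a: int):
    """Inverse of `pair`."""
    w = (isqrt(8 * a + 1) - 1) // 2
    m = a - w * (w + 1) // 2
    return w - m, m


def x_sum(x1: dict, x2: dict, a: int):
    """Point of the disjoint union named by index a, or None where undefined.

    Odd first component selects the first space, even selects the second.
    """
    n, i = unpair(a)
    if n % 2 == 1 and i in x1:
        return (1, x1[i])
    if n % 2 == 0 and i in x2:
        return (2, x2[i])
    return None


if __name__ == "__main__":
    x1 = {0: "p"}   # toy partial numbering of the first space
    x2 = {3: "q"}   # toy partial numbering of the second space
    print(x_sum(x1, x2, pair(1, 0)))
    print(x_sum(x1, x2, pair(0, 3)))
    print(x_sum(x1, x2, pair(1, 3)))
```

Tagging by parity rather than by a single fixed value means that each point of either summand has infinitely many names, which is harmless for numberings.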
Lemma 5.4. For = 1, 2, let T = (T , ) be a countable T0 space with countable basis B , numberings x and B of T and B , respectively, and strong inclusion relation ≺B . Then the following statements hold : 1. For = 1, 2, the canonical inclusion of T into T 1 ⊕ T 2 is both effective and effectively continuous. 2. If B 1 and B 2 are strong, then B ⊕ is strong too. 3. If T 1 and T 2 are constructively complete, then also T 1 ⊕T 2 is constructively complete. 4. If T 1 and T 2 are effective, then T 1 ⊕ T 2 is effective as well. 5. If x 1 and x 2 are both computable, allow effective limit passing, are (strongly) acceptable, or standard numberings, the same holds for x ⊕ . The above results can be extended to countable unions in a straightforward way. In the case of the last two statements one has to require that the corresponding effectivity conditions hold uniformly, which e.g. means that the witnessing functions in the definition of a standard numbering computably depend on the space index. × = 5.4. Cartesian products. Let T be as above, for = 1, 2. Set Bm,n 1 2 Bm × Bn , for m, n ∈ , and define m, n ≺B × m , n ⇔ m ≺B 1 m ∧ n ≺B 2 n . Then ≺B × is a strong inclusion. Moreover B × , the collection of all sets Ba× with a ∈ , is basis of the canonical topology × on T 1 × T 2 . T 1 × T 2 = (T 1 × T 2 , × ) is again a countable T0 space. For (i, j) ∈ dom(x 1 ) × dom(x 2 ), × set xi,j = (xi1 , xj2 ). Otherwise, let x × be undefined. Lemma 5.5. For = 1, 2, let T = (T , ) be a countable T0 space with countable basis B , numberings x and B of T and B , respectively, and strong inclusion relation ≺B . Then the following statements hold : 1. For = 1, 2, the canonical projection of T 1 × T 2 onto T is both effective and effectively continuous. 2. If B 1 and B 2 are strong, then B × is strong as well. 3. If T 1 and T 2 are constructively complete, then also T 1 ×T 2 is constructively complete. 4. If T 1 and T 2 are effective, then T 1 × T 2 is effective too. 5. 
If $x^1$ and $x^2$ are both computable, allow effective limit passing, are (strongly) acceptable, or standard numberings, the same holds for $x^\times$.

5.5. Countable products. Let $f \in R^{(1)}$ enumerate the set $\{ a \in \omega \mid (\forall c, c' \in D_a)[\pi_1(c) = \pi_1(c') \Rightarrow \pi_2(c) = \pi_2(c')] \}$ and for $\iota \in \omega$, let $T^\iota = (T^\iota, \tau^\iota)$ be a countable $T_0$ space with countable basis $\mathcal{B}^\iota$, numberings $x^\iota$ and $B^\iota$ of $T^\iota$ and
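$\mathcal{B}^\iota$, respectively, and strong inclusion relation $\prec_{B^\iota}$. Set
$$T^\Pi = \Big\{ \bar{z} \in \prod_{\iota \in \omega} T^\iota \;\Big|\; (\exists t \in R^{(1)})(\forall \iota \in \omega)\ \bar{z}_\iota = x^\iota_{t(\iota)} \Big\}$$
and let $P_\iota : T^\Pi \to T^\iota$ be the projection on the $\iota$-th component. Moreover, for $a, b \in \omega$, let
$$B^\Pi_a = \bigcap \{ P_\iota^{-1}(B^\iota_n) \mid \langle \iota, n\rangle \in D_{f(a)} \}$$
and define
$$a \prec_{B^\Pi} b \Leftrightarrow \pi_1(D_{f(b)}) \subseteq \pi_1(D_{f(a)}) \wedge (\forall \langle \iota, n\rangle \in D_{f(b)})(\exists \langle \iota', n'\rangle \in D_{f(a)})\ [\iota = \iota' \wedge n' \prec_{B^\iota} n].$$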
Then B Π , the collection of all sets BaΠ , for a ∈ , is a basis of the canonical topology Π on T Π . Furthermore, ≺B Π is a strong inclusion. T Π = (T Π , Π ) is again a countable T0 space. If i ∈ such that ϕi ( )↓ ∈ dom(x ), for all ∈ , define xiΠ by xiΠ ( ) = xϕ i ( ) . In any other case let x Π be undefined. Lemma 5.6. For ∈ , let T = (T , ) be a countable T0 space with countable basis B , numberings x and B of T and B , respectively, and strong inclusion relation ≺B . Then the following statements hold : 1. For ∈ , the canonical projection of T Π onto T is both effective and effectively continuous, uniformly in . 2. If B is strong, for all ∈ , then B Π is also strong. 3. If x allows effective limit passing, uniformly in , and T is constructively complete, for every ∈ , then T Π is constructively complete too. 4. If each T is effective, uniformly in , then T Π is effective as well. 5. If all x are computable, allow effective limit passing, are (strongly) acceptable, or standard numberings, always uniformly in , the same holds for x Π . Proof. The statements follow in a more or less straightforward way. We only show (3). Note that given an index m of a normed recursive enumeration of basic open sets BaΠ and an index , we can effectively find some a ∈ so that ∈ 1 (Df(ϕm (a)) ). By the definition of ≺B Π , we have that ∈ 1 (Df(ϕm (a )) ), for all a ≥ a. Let k( , c) = n : , n ∈ Df(c) and n ∈ be such that ϕn (c) = k( , ϕm (c + a)). Then n is an index of a normed recursive enumeration of basic open sets in space T . By assumption there is a function p ∈ R(1) so that ϕp( ) witnesses that x allows effective limit passing. Thus, the recursive enumeration with index n converges to xϕ (n) . Let h ∈ R(1) p( )
with ϕh(m) ( ) = ϕp( ) (n). Then the enumeration with index m converges to Π xh(m) . 5.6. Inverse limits. Let ETOP (ETOPe ) be the category of effective T0 spaces with (effective and) effectively continuous maps. ETOPco , ETOPlp , ETOPac , ETOPsa , and ETOPst , respectively, be the full subcategories of spaces
with indexings that are computable, allow effective limit passing, are acceptable, strongly acceptable, or standard. Analogous subcategories are defined for ETOPe . Denote the collection of these categories by E. F F F An -cochain T = T 0 ←0 T 1 ←1 T 2 ←2 . . . in ETOP is called effective, if the effectivity of the T and the effective continuity of the maps F hold uniformly in ; analogously for the other categories. Let C ∈ E and T be an -cochain in C. A cone (T , (P ) ∈ ) to T is effective, if the effective continuity of the maps P holds uniformly in . If C is the category ETOPe or one of its full subcategories, also the effectivity of the P has to hold uniformly. Then an effective limit to T is a terminal object in the category of effective cones to T in C. Now, let T = (T , F ) ∈ be an effective cochain in C and define T ∞ by
T ∞ = z¯ ∈ T Π | (∀ ∈ )z¯ = F (z¯ +1 ) . The corresponding topology is induced by the product space topology and its basis with numbering and associated strong inclusion, as well as the indexing of space T ∞ are given in the same way as in Section 5.2. Let P denote again the canonical projection of T ∞ onto T . Then it follows from Lemmata 5.3 and 5.6 that (T ∞ , (P ) ∈ ) is an effective cone to T in C. Proposition 5.7. Let C ∈ E and T = (T , F ) ∈ be a cochain in C. Then (T ∞ , (P ) ∈ ) is an effective limit to T in C. §6. Some decision problems. If is a total numbering of some set S, then for each subset X of S there is a predicate on the natural numbers which is true for natural number n, if n ∈ X , and false, otherwise. In case that is a partial numbering there is only a partial predicate, being true for n ∈ , if n ∈ X , false, if n ∈ X , and undefined, otherwise. We represent such partial predicates by pairs of disjoint sets of natural numbers. Let Ω(X ) = {n ∈ dom() | n ∈ X }, then (Ω(X ), Ω(X )) where X is the complement of X is the partial predicate corresponding to X . If is a total numbering, we identify the predicate (Ω(X ), Ω(X )) with the set Ω(X ). Definition 6.1. Let (A1 , A2 ) and (B1 , B2 ) be partial predicates. (A1 , A2 ) ≤m (B1 , B2 ), read (A1 , A2 ) is many-one reducible to (B1 , B2 ), if there is a function f ∈ P (1) such that for i = 1, 2 and all a ∈ A1 ∪ A2 , if a ∈ Ai then f(a)↓ ∈ Bi . This definition is due to Shapiro [29]. In the case that A2 and B2 respectively are the complements of A1 and B1 , it reduces to the well-known many-one reducibility. We denote both kinds of reducibility by ≤m . Moreover, ≤1 denotes one-one reducibility, ≡m many-one equivalence, and ≡1 one-one equivalence. Definition 6.2. Let S be a class of sets of natural numbers and (A1 , A2 ) a partial predicate.
1. (A1, A2) is S-hard, if for every B ∈ S we have that (B, B̄) ≤m (A1, A2).
2. (A1, A2) is potentially in S, if there is some B ∈ S with A1 ⊆ B and A2 ⊆ B̄.
3. (A1, A2) is potentially S-complete, if it is both S-hard and potentially in S.

Here B̄ denotes the complement of B. If A2 is the complement of A1, this definition reduces to the usual notion of A1 being S-complete.

Now, for the remainder of this section, let T = (T, τ) be a countable T0 space with countable strong basis, total numbering B of the basis, and numbering x of the space. As is readily verified, the index set of any Lacombe set is r.e., if x is computable. Let us therefore consider the case of nonopen subsets of T.

Theorem 6.3. [33] Let T be effective and x acceptable. Moreover, let X be a nonopen subset of T such that its complement X̄ contains an enumerable dense set. Then (K̄, K) ≤1 (Ω(X), Ω(X̄)). In particular, Ω(X) is productive.

Here, K denotes the halting set.

Corollary 6.4. [33] Let T be effective and x acceptable. For any nonopen subset X of T with completely enumerable complement, (Ω(X), Ω(X̄)) ≡1 (K̄, K). In particular, (Ω(X), Ω(X̄)) is potentially Π^0_1-complete.

An additional consequence of Theorem 6.3 is a generalization of Rice's Theorem.

Theorem 6.5. [33] Let T be effective and x acceptable. Moreover, let T be connected and X be a subset of T. Then X is completely recursive if and only if X is either empty or the whole space.

Eršov [11] has shown that Rice's theorem is true for arbitrary precomplete numbered sets. (He considers only total numberings; however, the proof remains valid for partial numberings.) But it is not known whether acceptable numberings of effective topological spaces are always precomplete.

Let us next consider the problem of deciding whether a given point is nonfinite. Let Fin and NFin, respectively, be the sets of all finite and nonfinite points of T.

Lemma 6.6. [32] Let T be effective, B negative, and x acceptable. Then (Ω(NFin), Ω(Fin)) is potentially in Π^0_2.
Note that in [32] a slightly stronger assumption on B was used, but as can be seen from the proof, the above requirement suffices. As we shall see now, in important cases Ω(NFin) is even Π02 -complete, if it is not empty. Let to this end, for y ∈ T , ↓y = {z ∈ T | z ≤ y}. Theorem 6.7. [32] Let T be effective and x strongly acceptable. Moreover, let X be a subset of T which contains a nonfinite point y such that X ∩↓y has no finite elements. Then Ω(X ) is Π02 -hard. It is not clear, how a similar result can be obtained for (Ω(X ), Ω(X )).
Corollary 6.8. [32] Let T be effective and contain a point below which there are no finite points. Moreover, let x be strongly acceptable. Then x cannot be total.

The assumption holds in particular when T contains no finite points at all, as in the case of the computable reals. Since the numbering of Rc introduced in Section 4.3 is strongly acceptable (it is even a standard numbering), that numbering is necessarily partial. It follows that we cannot expect a constructive metric space to have a total strongly acceptable numbering.

As a further consequence of Theorem 6.7, the index set of any nonempty subset of NFin is Π^0_2-hard; in particular, Ω(NFin) is Π^0_2-hard. Note that it does not follow from Lemma 6.6 that Ω(NFin) ∈ Π^0_2. All we know is that Ω(NFin) and Ω(Fin) are separated by a Π^0_2 set. For a large class of spaces and indexings, including the computable real numbers with the numbering just mentioned, this is however true.

Theorem 6.9. [32] Let T be effective, constructively complete and contain a nonfinite element. Moreover, let B be negative, ≺B be r.e. and x be a standard numbering. Then Ω(NFin) is Π^0_2-complete.

Since all computable real numbers are nonfinite, this means that the domain of the numbering of Rc is Π^0_2-complete.

§7. Degree structures. As we have just seen, numberings of topological spaces can in general not be assumed to be total. In numbering-theoretic studies, however, totality has so far always been assumed. We encountered examples where results carry over smoothly. The situation is different in the case of degrees. Let S be a countable set and Nump(S) be the set of all partial numberings of S. The set of all total numberings of S is denoted by Num(S). Let
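\[ \deg_p(\nu) = \{\kappa \in \mathrm{Num}_p(S) \mid \nu \equiv \kappa\} \quad (\nu \in \mathrm{Num}_p(S)), \]
\[ \deg(\nu) = \{\kappa \in \mathrm{Num}(S) \mid \nu \equiv \kappa\} \quad (\nu \in \mathrm{Num}(S)), \]
respectively, be the partial and the total degree of ν. The reduction relation ≤ can be lifted to the sets of degrees in the usual way. Thus we obtain the two partial orders
\[ L_p(S) = (\{\deg_p(\nu) \mid \nu \in \mathrm{Num}_p(S)\}, \le), \]
\[ L(S) = (\{\deg(\nu) \mid \nu \in \mathrm{Num}(S)\}, \le). \]
As is well known, L(S) is an upper semilattice in which the least upper bound of the degrees of ν and κ is induced by the join ν ⊕ κ, defined as follows: for a ∈ ω,
\[ (\nu \oplus \kappa)_{2a} = \begin{cases} \nu_a & \text{if } a \in \mathrm{dom}(\nu), \\ \text{undefined} & \text{otherwise,} \end{cases} \qquad (\nu \oplus \kappa)_{2a+1} = \begin{cases} \kappa_a & \text{if } a \in \mathrm{dom}(\kappa), \\ \text{undefined} & \text{otherwise.} \end{cases} \]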
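The join can be illustrated on finite fragments of numberings. The following Python sketch is only an illustration under stated assumptions: a partial numbering is modelled as a dictionary from indices to elements (indices absent from the dictionary are outside the domain), the names `join` and `reduces_via` are hypothetical, and reducibility is read in the usual way, i.e. the first numbering reduces to the second via f if f maps every index in the domain of the first to an index of the same element under the second.

```python
from typing import Callable, Dict, Hashable

# Assumption: a finite dict stands in for a partial numbering.
Numbering = Dict[int, Hashable]

def join(nu: Numbering, kappa: Numbering) -> Numbering:
    """Even indices 2a carry nu_a, odd indices 2a+1 carry kappa_a."""
    out: Numbering = {}
    for a, s in nu.items():
        out[2 * a] = s
    for a, s in kappa.items():
        out[2 * a + 1] = s
    return out

def reduces_via(nu: Numbering, kappa: Numbering, f: Callable[[int], int]) -> bool:
    """Check that f witnesses nu <= kappa on the (finite) domain of nu."""
    return all(f(i) in kappa and kappa[f(i)] == s for i, s in nu.items())

nu = {0: "a", 2: "b"}      # index 1 is outside dom(nu)
kappa = {1: "b", 3: "c"}
j = join(nu, kappa)

# Both numberings reduce to the join, via a -> 2a and a -> 2a+1.
print(reduces_via(nu, j, lambda a: 2 * a))
print(reduces_via(kappa, j, lambda a: 2 * a + 1))
```

The two checks only confirm that the join is an upper bound; that it is the least one is the content of the statement in the text and is not tested here.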
If S contains more than one element, L(S) is not a lattice [11]. In the case of Lp (S) also the greatest lower bound of two degrees exists. It is induced by the meet κ of the numberings and κ. For m, n ∈ , m if m ∈ dom(), n ∈ dom(κ) and m = κn , ( κ)m,n = undefined otherwise. Proposition 7.1. [2] Let S be a countable set. Then the following two statements hold : 1. Lp (S) is a distributive lattice. 2. L(S) is embeddable into Lp (S) as upper subsemilattice. The map deg() → degp () is an order-isomorphism which preserves finite least upper bounds. For a total numbering the set {κ ∈ Num(S) | κ ≤ } is countable, as there are only countably many witness functions f. In the case of partial numberings any function that agrees with f, but has a smaller domain of definition may be used as witness function. Lemma 7.2. [2] For any ∈ Nump (S), {κ ∈ Nump (S) | κ ≤ } is uncountable. Proof. Consider the cylindrification c() defined by i if i ∈ dom(), c()i,j = (i, j ∈ ). undefined otherwise, Then c() ≡ and for each s ∈ S, c()−1 ({s}) is infinite. Let κ ∈ Nump (S) with κ −1 ({s}) ⊆ c()−1 ({s}), for each s ∈ S. Then κ ≤ c() via the identity function and there are uncountably many such κ. A Friedberg numbering is a one-to-one numbering. In the case of total numberings they are known to be minimal with respect to the reducibility preorder. Proposition 7.3. [2] Let S be infinite. Then the degree of any partial Friedberg numbering is not minimal in Lp (S). Given a Friedberg numbering , by using a similar argument as above combined with a diagonalization construction a Friedberg numbering is produced which is reducible to , but not vice versa.
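The meet and the cylindrification used in the proof of Lemma 7.2 can be made concrete in the same spirit. The Python sketch below is an illustration only. It assumes that partial numberings are given as finite dictionaries, that the Cantor pairing may stand in for the pairing function ⟨·, ·⟩ (the text does not fix one here), and that a numbering reduces to another via f if f maps each index in its domain to an index of the same element. The names `pair`, `unpair`, `meet`, `cylinder` and `reduces_via` are hypothetical.

```python
from math import isqrt

def pair(i, j):
    """Cantor pairing, used here as an assumed stand-in for <i, j>."""
    return (i + j) * (i + j + 1) // 2 + j

def unpair(c):
    """Inverse of the Cantor pairing."""
    w = (isqrt(8 * c + 1) - 1) // 2
    j = c - w * (w + 1) // 2
    return w - j, j

def meet(nu, kappa):
    """Index <m, n> is defined exactly if nu_m and kappa_n agree."""
    return {pair(m, n): s
            for m, s in nu.items()
            for n, t in kappa.items() if s == t}

def cylinder(nu, width):
    """Finite fragment of c(nu): index <i, j> names nu_i for all j < width."""
    return {pair(i, j): s for i, s in nu.items() for j in range(width)}

def reduces_via(nu, kappa, f):
    """Check that f witnesses nu <= kappa on the finite domain of nu."""
    return all(f(i) in kappa and kappa[f(i)] == s for i, s in nu.items())

nu = {0: "a", 2: "b"}
kappa = {1: "b", 3: "c"}

m = meet(nu, kappa)                     # only "b" is named by both
first = lambda c: unpair(c)[0]
second = lambda c: unpair(c)[1]
print(reduces_via(m, nu, first), reduces_via(m, kappa, second))

cyl = cylinder(nu, 3)
print(reduces_via(cyl, nu, first))                   # c(nu) <= nu
print(reduces_via(nu, cyl, lambda i: pair(i, 0)))    # nu <= c(nu)

# Thinning out the index sets of the cylindrification still gives a
# numbering reducible to it via the identity, as in the proof of Lemma 7.2.
thin = {k: s for k, s in cyl.items() if k % 2 == 1}
print(reduces_via(thin, cyl, lambda k: k))
```

In the infinite setting every element has infinitely many indices under c(ν), so there are uncountably many ways of thinning; the finite fragment above can only hint at this.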
An infinite set has uncountably many pairwise incomparable total Friedberg numberings [11]. For partial Friedberg numberings an improved statement holds. Proposition 7.4. Let S be infinite and ∈ Nump (S). Then there are uncountably many pairwise incomparable partial Friedberg numberings reducible to . By applying the construction used in Proposition 7.3 to any of them and iterating it, one can strengthen that statement. Corollary 7.5. [2] Let S be infinite. Then there is an infinite descending chain generated by partial Friedberg numberings below every degree in Lp (S). Let us now return to the case of effective T0 spaces. As a consequence of Lemma 3.10 we have that the degree of a computable numbering consists only of computable numberings. Moreover, join and meet of computable numberings are computable again. Let
C(T ) = degp () | ∈ Nump (T ) ∧ computable . If existing, the acceptable numberings form a single degree which is largest in C(T ). It consists only of such numberings. Proposition 7.6. [37] Let T be an effective T0 space. Then the following statements hold : 1. C(T ) is an ideal in Lp . In particular, C(T ) is a distributive lattice. 2. If T has a computable numbering and an r.e. strong inclusion, C(T ) has a greatest element. Statement (2) is a result of Proposition 3.8. As we have seen, partial numberings are much different from total ones. The above results mainly follow by the fact that given a numbering and a reducibility function f, one can define new numberings by using for each s ∈ S only some of the indices in f −1 ( −1 ({s})). This can no longer be done in case of strong reducibility. For ∈ Nump (S) let degs () = {κ ∈ Nump (S) | κ ≡s } be the strong degree of . As with the other reducibility relations, strong reducibility can be lifted to a partial order on the collection of all strong degrees: Ls (S) = ({degs () | ∈ Nump (S)}, ≤s ). Lemma 7.7. [37] Ls (S) is an upper semilattice with the least upper bound of the degrees of numberings and κ being given by the degree of ⊕ κ. With respect to strong reducibility partial numberings behave more like total ones. In order to see this we will apply Erˇsov’s completion construction for
total numberings to partial numberings. It allows to set up a correspondence between partial numberings and complete total numberings. Let to this end S = {{s} | s ∈ S} ∪ {∅} and : S → S with (s) = {s} be the corresponding canonical embedding. Moreover, let u ∈ P (1) be a universal function and u ∈ R(1) defined by u (c) = a : u(a) = c be its right inverse. For ∈ Nump (S) set {u(a) } if a ∈ u −1 (dom()), ˆa = (a ∈ ). ∅ otherwise, Lemma 7.8. [37] Let ∈ Nump (S). Then the following two statements hold : 1. ˆ is a complete total numbering of S with special element ∅. 2. ˆu (c) = (c ), for all c ∈ dom(). Thus, as well as its partial inverse are effective maps. Moreover, the corestriction of ˆ to (S) is equivalent to ◦ . Now, let = ({deg( ) | ∈ Num(S) ∧ complete with special element ∅}, ≤). C∅ (S) Note that the join of complete total numberings need not be complete again. Nevertheless, we have the following result. Theorem 7.9. [37] Let S be a countable set. Then the following statements hold : 1. Ls (S) is a monotone retract of L(S). is an upper semilattice. 2. C∅ (S) are isomorphic. 3. The two upper semilattices Ls (S) and C∅ (S) §8. Totalization. In a certain sense, working with total numberings is much easier than working with partial numberings. Remember e.g. the discussion about the difficulty of classifying decision problems in the case of partially numbered sets. Therefore, the question comes up, whether given a partially indexed set (S, ), there is a total indexing of S which extends . In applications one is of course not interested in just a total extension of a given partial indexing: important properties such as acceptability should be preserved. As we have already seen, important spaces like the computable reals have canonical strongly acceptable indexings, but do not have a total indexing of this kind. In the last section we presented a construction that allows totalizing a given partial numbering. 
To this end we had to enlarge the given set, but just by one point. The new total indexing was even complete and both the canonical embedding of the old space into the new one as well as its partial inverse were effective. But does this construction preserve properties like acceptability?
Let T be an effective T0 space again with numbering x, and let T̂ and x̂, respectively, be obtained from T and x as in the last section. Set
\[ \hat B_0 = \hat T, \qquad \hat B_{n+1} = \{\{y\} \mid y \in B_n\}, \]
so that B̂_{n+1} is the image of B_n under the canonical embedding. Let B̂ be the collection of the sets B̂_a (a ∈ ω), and define
\[ m \prec_{\hat B} n \iff n = 0 \lor (m \neq 0 \land n \neq 0 \land m - 1 \prec_B n - 1). \]
Then B̂ is a strong basis of a T0 topology τ̂ on T̂. Moreover, T̂ with this topology, basis, strong inclusion, and the numbering x̂ is effective.

We will show now that in general, x̂ is not computable, even if x is. Assume to this end that both x and x̂ are computable. Then there is some r.e. set L such that x̂_i ∈ B̂_n exactly if ⟨i, n⟩ ∈ L, for i, n ∈ ω. Let
\[ A = \{ j \in \omega \mid (\exists n > 0)\ \langle j, n \rangle \in L \}. \]
Then A is r.e. Moreover, i ∈ A holds exactly if x̂_i lies in the image of T under the canonical embedding, which in turn holds exactly if i ∈ u^{-1}(dom(x)). Hence, dom(x) = u(A), which shows that dom(x) is r.e. This is not true in general: in the case of the computable real numbers with their canonical indexing we know by Theorem 6.9 that the domain of the indexing is Π^0_2-complete.

In the above construction we enlarged T by one more finite element. As we have seen in Corollary 6.8, an effective space can only have a total strongly acceptable numbering, if below each of its nonfinite points there is a finite point. By definition the neighbourhood filter of a finite point z ∈ T is generated by a single basic open set, say B_n. Then B_n = {y ∈ T | z ≤ y}. We will now consider spaces such that all basic open sets are pointed in a certain way. For n ∈ ω, let
\[ \mathrm{hl}(B_n) = \bigcap \{ B_m \mid n \prec_B m \}. \]

Definition 8.1. Let T = (T, τ) be a countable T0 space with a countable strong basis B, and let x and B be numberings of T and B, respectively. We say that T is effectively pointed, if there is a function pd ∈ P^(1) such that for all n ∈ dom(B) for which B_n is not empty, pd(n)↓ ∈ dom(x), x_pd(n) ∈ hl(B_n) and x_pd(n) ≤ z, for all z ∈ B_n.

Note that
\[ B_n \subseteq \{ z \in T \mid x_{pd(n)} \le z \} \subseteq \mathrm{hl}(B_n). \]
Clearly, constructive A-spaces and domains are effectively pointed. Conversely, effectively pointed spaces have typical properties of domains. Lemma 8.2. [32] Let T be effective and effectively pointed, and let x be computable. Moreover, let y ∈ T and n ∈ . Then the following hold : 1. T is recursively separable with dense base {xa | a ∈ range(pd )}. 2. The set {xpd (a) | y ∈ Ba } is directed and y is its least upper bound.
3. If m is an index of a converging normed recursive enumeration of basic open sets, then the enumeration converges to the least upper bound of
xpd (ϕm (a)) a∈ . 4. If y is finite, then y ∈ {xa | a ∈ range(pd )}. 5. If xpd (n) is finite, then hl(Bn ) = {z ∈ T | xpd (n) ≤ z}. For effectively pointed spaces the existence of a total acceptable numbering has very strong structural consequences. Theorem 8.3. [35] Let T be effective and effectively pointed with computable numbering x. Then T has a total acceptable numbering, if and only if T is constructively d-complete and is the Scott topology. The numbering can be chosen as complete, exactly if T has also a smallest element. It agrees with the special element of the numbering. It is easy to see that a constructive A-space is effectively complete, just if it is constructively d-complete and its topology is the Scott topology. Thus, we have that exactly the effectively complete constructive A-spaces have total acceptable numberings. Effectively complete constructive A-spaces that have a least element coincide with the bounded-complete constructive domains. The last theorem tells us that we can totalize an acceptable numbering x such that the resulting numbering xˆ is acceptable as well, if we embed the given space into a constructive domain. Theorem 8.4. [35] Let T be effective with r.e. strong inclusion so that all basic open sets are not empty. Moreover, let x be acceptable. Then there is an algebraic constructive domain T with a total acceptable complete numbering xˆ and an effectively homeomorphic embedding F : T → Tˆ such that both F and its partial inverse are effective and F (T ) is a dense subset of Tˆ . The construction is as follows: for m, n ∈ let m n ⇔ n = 0 ∨ m = n ∨ m = 0 ∧ n = 0 ∧ m − 1 ≺B n − 1 . Then the relation is obviously r.e., reflexive, and transitive with 0 as greatest element. Define T to be the set of all r.e. filters of , i.e., the collection of all nonempty r.e. subsets of which are upwards closed with respect to and which with any two elements m and n contain an element a such that a m, n, and order it by set inclusion. 
Then (T , ⊆) is a partial order with the filter {0} as smallest element. Let s ∈ R(1) be an enumeration of all indices i such that Wi is not empty. Since is r.e., a function h ∈ R(1) can be constructed so that ϕh(i) tries to enumerate a longest descending chain in Ws(i) . Set
xˆ i = m ∈ | (∃a)ϕh(i) (a) m . Then xˆ is a total numbering of T . As is readily verified, (Tˆ , ⊆) is constructively
d-complete with respect to this numbering. Note that the least upper bound of a directed enumerable subset of Tˆ is the union of all filters in this set. For n ∈ , let zˆn = {m ∈ | n m}. Then the collection of these elements is an algebraic basis of T . An easy verification shows that T is an algebraic constructive domain. The Scott topology on Tˆ has as canonical basis the collection of all sets ˆ Bˆ n = {yˆ ∈ Tˆ | zˆn ⊆ y}. It is not hard now to show that xˆ is acceptable. Moreover, it is complete with the least element of T as special element. Since for any y ∈ T the set of all Bn such that y ∈ Bn is a strong base of the neighbourhood filter of y, it follows that {n + 1 | y ∈ Bn } ∪ {0} is a filter with respect to . Define F : T → Tˆ by F (y) = {n + 1 | y ∈ Bn } ∪ {0}. Then F is one-to-one and both F and its partial inverse are effective. Note that xi ∈ Bn if and only if F (xi ) ∈ Bn+1 . Hence, both F and F −1 are effectively continuous. Since the basic open sets Bn are not empty, we also have that F (T ) is dense in T . Note that if the strong inclusion relation ≺B is recursive, it is decidable in m and n whether Bm and Bn are disjoint, and B is effectively closed under nonempty finite meets, then T is even a constructive Scott domain. Here effective closure under nonempty finite meets means that there is a function d ∈ P (2) such that for nondisjoint Bm and Bn , d (m, n)↓, Bd (m,n) = Bm ∩ Bn and d (m, n) ≺B m, n. Furthermore in Theorem 8.4, if T is also constructively complete, we obtain with Proposition 3.13 that F (T ) is even effectively dense in T , which means that given Bn we can effectively find a point xˆ i ∈ F (T ) ∩ Bn . Embeddings of topological spaces in domains have been much studied in the literature. Mostly one is interested in the case that the embedded space coincides with the subspace of maximal domain elements. Let Max(T ) be the set of maximal points of T with respect to the specialization order. Proposition 8.5. 
[35] Let T be effective and constructively complete so that all basic open sets are nonempty and the strong inclusion relation is r.e. Moreover, let x be acceptable, T be the algebraic constructive domain constructed in Theorem 8.4, and F : T → T be the embedding of T in T . Then F (Max(T )) = Max(T ). As is well-known, for T1 spaces the specialization order coincides with the identity. Thus, if in addition T is T1 , then T is effectively homeomorphic to the subspace of maximal elements of T .
In Section 4.4 we have already pointed out that the constructive completeness notion depends on the choice of the strong inclusion relation. The same is of course true for the construction of the domain T . Consider again the space Rc of all computable real numbers with the induced Euclidean topology. As a constructive metric space Rc is constructively complete. Thus, it is effectively homeomorphic to the subspace of maximal elements of the domain constructed as above with respect to the strong basis and the strong inclusion relation introduced for constructive metric spaces in Section 4.3. If the collection I of all open intervals with rational endpoints is chosen as topological basis together with the strong inclusion relation ≺I defined in Section 4.4, Rc is no longer constructively complete. Nevertheless, it is effectively homeomorphic to the subspace of maximal elements of the domain constructed as above with respect to I and ≺I . Now, instead of ≺I consider the following relation a ≺I b ⇔ 1 (f(b)) ≤ 1 (f(a)) ∧ 2 (f(a)) ≤ 2 (f(b)) . Then also ≺I is an r.e. strong inclusion with respect to which I is a strong basis. Again, Rc is not constructively complete. But in this case, space T is a constructive Scott domain, in which the embedded computable reals are no longer maximal. To see this, let q ∈ Q. By definition, F (q) = {a +1 | 1 (f(a)) < q < 2 (f(a)) }∪ {0}. As is easily verified the following sets
Jq1 = a + 1 | 1 (f(a)) < q ≤ 2 (f(a)) ∪ {0}
Jq2 = a + 1 | 1 (f(a)) ≤ q < 2 (f(a)) ∪ {0} are also filters with respect to the preorder derived from ≺I . Neither is Jq1 contained in Jq2 nor is Jq2 contained in Jq1 , but F (q) is properly contained in both of them. §9. Final remarks. In this paper we have presented a general approach to effectively given topological spaces. A class of spaces with suitable indexings was defined that includes a large variety of well-known examples and is closed under important constructions of new spaces from given ones, thus showing the robustness of the concept. Numberings of the kind considered here exist under very natural conditions usually satisfied in applications. With respect to these numberings important topological operations like limit passing become effective. Canonical bases of the neighbourhood filters of the points can uniformly be enumerated. Typically such numberings are only partially defined. We have seen that this is so by necessity. We have also seen that dealing with partial numberings is more difficult than doing so with total numberings. Notions introduced for the latter do
not always carry over easily. There are refinements of different strength. Therefore, the question of whether such numberings can be totalized was considered. As has already been said when introducing effective spaces, in this approach we assume that the spaces we are interested in come equipped with a notion of computable point. We consider only the subspaces induced by these points. Doing so allows to represent these points by natural numbers and to use classical computability theory over the natural numbers. Other approaches to computable topology like Weihrauch’s Type Two Theory of Effectivity (TTE) [45] do not restrict themselves to subspaces of computable elements. The elements can now no longer be represented by natural numbers. In the TTE approach elements of Baire space, i.e. functions on the natural numbers are used instead. The relationship between this approach and the one presented here has been studied in [36]. REFERENCES
[1] S. Abramsky and A. Jung, Handbook of Logic in Computer Science. Vol. 3, The Clarendon Press, Oxford University Press, New York, 1994.
[2] S. Badaev and D. Spreen, A note on partial numberings, Mathematical Logic Quarterly, vol. 51 (2005), no. 2, pp. 129–136.
[3] U. Berger, Total sets and objects in domain theory, Annals of Pure and Applied Logic, vol. 60 (1993), no. 2, pp. 91–117.
[4] J. Blanck, Domain representability of metric spaces, Annals of Pure and Applied Logic, vol. 83 (1997), no. 3, pp. 225–247.
[5] J. Blanck, Domain representations of topological spaces, Theoretical Computer Science, vol. 247 (2000), no. 1-2, pp. 229–255.
[6] N. Bourbaki, Elements of Mathematics. General Topology. Part 1, Hermann, Paris, 1966.
[7] P. Di Gianantonio, Real number computability and domain theory, Information and Computation, vol. 127 (1996), no. 1, pp. 11–25.
[8] A. Edalat, Domains for computation in mathematics, physics and exact real arithmetic, The Bulletin of Symbolic Logic, vol. 3 (1997), no. 4, pp. 401–452.
[9] Yu. L. Eršov, Computable functionals of finite types, Akademiya Nauk SSSR. Sibirskoe Otdelenie. Institut Matematiki. Algebra i Logika, vol. 11 (1972), pp. 367–437, English translation: Algebra and Logic 11 (1972) 203–242.
[10] Yu. L. Eršov, Continuous lattices and A-spaces, Doklady Akademii Nauk SSSR, vol. 207 (1972), pp. 523–526, English translation: Soviet Mathematics Doklady 13 (1972) 1551–1555.
[11] Yu. L. Eršov, Theorie der Numerierungen. I, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, vol. 19 (1973), pp. 289–388.
[12] Yu. L. Eršov, Theory of A-spaces, Akademiya Nauk SSSR. Sibirskoe Otdelenie. Institut Matematiki. Algebra i Logika, vol. 12 (1973), pp. 369–416, English translation: Algebra and Logic 12 (1973) 209–232.
[13] Yu. L. Eršov, Theorie der Numerierungen. II, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, vol. 21 (1975), no. 6, pp. 473–584.
[14] Yu. L. Eršov, Model C of partial continuous functionals, Logic Colloquium 76 (Oxford, 1976) (R. Gandy et al., editors), North-Holland, Amsterdam, 1977, pp. 455–467.
[15] Yu. L. Eršov, Theorie der Numerierungen. III, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, vol. 23 (1977), no. 4, pp. 289–371.
[16] Yu. L. Eršov, Theory of Numberings, Nauka, Moscow, 1977.
[17] Yu. L. Eršov, Theory of numberings, Handbook of Computability Theory (E. R. Griffor, editor), Elsevier, Amsterdam, 1999, pp. 473–503.
[18] M. H. Escardó, PCF extended with real numbers, Theoretical Computer Science, vol. 162 (1996), no. 1, pp. 79–115.
[19] G. Gierz, K. H. Hofmann, K. Keimel, J. D. Lawson, M. Mislove, and D. S. Scott, Continuous Lattices and Domains, Cambridge University Press, Cambridge, 2003.
[20] T. Kamimura and A. Tang, Total objects of domains, Theoretical Computer Science, vol. 34 (1984), no. 3, pp. 275–288.
[21] J. Lawson, Spaces of maximal points, Mathematical Structures in Computer Science, vol. 7 (1997), no. 5, pp. 543–555.
[22] A. I. Mal'cev, The Metamathematics of Algebraic Systems. Collected Papers: 1936–1967, North-Holland Publishing Co., Amsterdam, 1971.
[23] Y. N. Moschovakis, Recursive analysis, Ph.D. thesis, University of Wisconsin, Madison, 1963.
[24] Y. N. Moschovakis, Recursive metric spaces, Fundamenta Mathematicae, vol. 55 (1964), pp. 215–238.
[25] D. Normann, Categories of domains with totality, Oslo Preprint in Mathematics, no. 4, University of Oslo, 1997.
[26] M. B. Pour-El and J. I. Richards, Computability in Analysis and Physics, Springer-Verlag, Berlin, 1989.
[27] H. Rogers, Jr., Theory of Recursive Functions and Effective Computability, McGraw-Hill Book Co., New York, 1967.
[28] D. Scott, Outlines of a mathematical theory of computation, Proc. 4th Annual Princeton Conf. on Information Sciences and Systems, Princeton University Press, 1970, pp. 169–176.
[29] N. Shapiro, Degrees of computability, Transactions of the American Mathematical Society, vol. 82 (1956), pp. 281–299.
[30] I. Sigstam, Formal spaces and their effective presentations, Archive for Mathematical Logic, vol. 34 (1995), no. 4, pp. 211–246.
[31] M. B. Smyth, Finite approximation of spaces (extended abstract), Category Theory and Computer Programming (Guildford, 1985) (D. Pitt et al., editors), Lecture Notes in Computer Science, vol. 240, Springer, Berlin, 1986, pp. 225–241.
[32] D. Spreen, On some decision problems in programming, Information and Computation, vol. 122 (1995), no. 1, pp. 120–139, Corrigendum 148 (1999) 241–244.
[33] D. Spreen, Effective inseparability in a topological setting, Annals of Pure and Applied Logic, vol. 80 (1996), no. 3, pp. 257–275.
[34] D. Spreen, On effective topological spaces, The Journal of Symbolic Logic, vol. 63 (1998), no. 1, pp. 185–221, Corrigendum 65 (2000) 1917–1918.
[35] D. Spreen, Can partial indexings be totalized?, The Journal of Symbolic Logic, vol. 66 (2001), no. 3, pp. 1157–1185.
[36] D. Spreen, Representations versus numberings: on the relationship of two computability notions, Theoretical Computer Science, vol. 262 (2001), no. 1-2, pp. 473–499.
[37] D. Spreen, Strong reducibility of partial numberings, Archive for Mathematical Logic, vol. 44 (2005), no. 2, pp. 209–217.
[38] V. Stoltenberg-Hansen, I. Lindström, and E. R. Griffor, Mathematical Theory of Domains, Cambridge University Press, Cambridge, 1994.
[39] V. Stoltenberg-Hansen and J. Tucker, Complete local rings as domains, The Journal of Symbolic Logic, vol. 53 (1988), no. 2, pp. 603–624.
[40] V. Stoltenberg-Hansen and J. Tucker, Algebraic and fixed point equations over inverse limits of algebras, Theoretical Computer Science, vol. 87 (1991), no. 1, pp. 1–24.
[41] V. Stoltenberg-Hansen and J. Tucker, Effective algebras, Handbook of Logic in Computer Science, vol. 4, Oxford University Press, New York, 1995, pp. 357–526.
[42] C. F. Sturm, Mémoire sur la résolution des équations numériques, Annales de Mathématiques Pures et Appliquées, vol. 6 (1835), pp. 271–318.
[43] A. M. Turing, On computable numbers, with an application to the "Entscheidungsproblem", Proceedings of the London Mathematical Society, vol. 42 (1936), pp. 230–265.
[44] A. M. Turing, On computable numbers, with an application to the "Entscheidungsproblem". A correction, Proceedings of the London Mathematical Society, vol. 43 (1937), pp. 544–546.
[45] K. Weihrauch, Computable Analysis, Springer-Verlag, Berlin, 1998.
[46] K. Weihrauch and T. Deil, Berechenbarkeit auf cpo's, Schriften zur Angewandten Mathematik und Informatik, vol. 63, Aachen University of Technology, 1980.
[47] K. Weihrauch and U. Schreiber, Embedding metric spaces into cpo's, Theoretical Computer Science, vol. 16 (1981), no. 1, pp. 5–24.

THEORETISCHE INFORMATIK
FACHBEREICH MATHEMATIK
UNIVERSITÄT SIEGEN
57068 SIEGEN, GERMANY
E-mail:
[email protected]
MONOTONE INDUCTIVE DEFINITIONS AND CONSISTENCY OF NEW FOUNDATIONS
SERGEI TUPAILO
Abstract. In this paper we reduce the consistency problem for NF to consistency of a certain extension of Jensen’s NFU. Working in NFU + Pairing, which is known to be consistent relative to Zermelo set theory, due to Jensen [17], we define a certain monotone operation pw and conclude that existence of its least fixpoint is sufficient to model NF.
§1. Introduction. New Foundations. New Foundations, NF, is a system of set theory named after Quine's 1937 article [18] "New foundations for mathematical logic", where it was introduced. The language L∈ of NF is the simple set-theoretic language, i.e. the usual first-order language whose only constants are = and ∈. The logic is classical first-order with equality. The only nonlogical axioms are Extensionality and Stratified Comprehension as described below.

Extensionality is the axiom
\[ \mathrm{Ext}:\quad \forall x \forall y\,\bigl(\forall z (z \in x \leftrightarrow z \in y) \to x = y\bigr). \]

Definition 1.1. A stratification of a formula ϕ is an assignment of natural numbers to the variables (both free and bound) in ϕ such that every atomic subformula x = y of ϕ receives an assignment x^n = y^n, for some n, and every atomic subformula x ∈ y of ϕ receives an assignment x^m ∈ y^{m+1}, for some m. A formula ϕ is stratified iff there exists a stratification of ϕ.

Examples. The formula x ∈ y ∧ y ∈ z is stratified, but the formula x ∈ y ∧ y ∈ x is not.

Stratified Comprehension is the axiom scheme
\[ \mathrm{SCA}:\quad \exists y \forall x\,\bigl(x \in y \leftrightarrow \varphi[x]\bigr), \]
for every stratified formula ϕ with y not free in ϕ.

Received by the editors January 8, 2006.
A part of this work was done during the author's visit to The Ohio State University, USA, whose support is gratefully acknowledged. A thorough review by an anonymous referee is gratefully acknowledged.
Logic Colloquium '05, edited by C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. Lecture Notes in Logic, 28. © 2006, Association for Symbolic Logic.
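Whether a formula is stratified in the sense of Definition 1.1 depends only on its atomic subformulas, so it can be checked mechanically. The following Python sketch is one possible way to do this and is not taken from the paper. It assumes that the formula is given as a list of its atomic subformulas, encoded as hypothetical tuples `("in", x, y)` for x ∈ y and `("eq", x, y)` for x = y, and that bound variables have been renamed apart so that each name denotes a single variable. Each atom constrains the difference of two levels, and the constraints are propagated by breadth-first search.

```python
from collections import defaultdict, deque

def stratify(atoms):
    """Return a dict variable -> natural number, or None if no stratification exists."""
    graph = defaultdict(list)
    for kind, x, y in atoms:
        d = 1 if kind == "in" else 0        # required value of level(y) - level(x)
        graph[x].append((y, d))
        graph[y].append((x, -d))
    level = {}
    for start in list(graph):
        if start in level:
            continue
        level[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w, d in graph[v]:
                if w not in level:
                    level[w] = level[v] + d
                    component.append(w)
                    queue.append(w)
                elif level[w] != level[v] + d:
                    return None             # two constraints contradict each other
        # Shift the component so that its smallest level is 0 (natural numbers only).
        shift = min(level[v] for v in component)
        for v in component:
            level[v] -= shift
    return level

# The two examples from the text.
print(stratify([("in", "x", "y"), ("in", "y", "z")]))
print(stratify([("in", "x", "y"), ("in", "y", "x")]))
```

On the first example the sketch assigns the levels 0, 1, 2 to x, y, z; on the second it reports that no assignment exists, since y would have to lie both one level above and one level below x.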
It is known that NF is at least as strong as Simple Type Theory with Infinity, but NF is not known to be consistent relative to any known extension of Zermelo-Fraenkel Set Theory; see e.g. [21, 22, 24, 25, 17, 6, 11, 3, 14, 9, 10, 15, 16, 23, 8, 13]. There are a number of subsystems of NF which are known to be consistent. Perhaps the most famous of them is NFU, the so-called “NF with Urelements”, introduced by Jensen in 1969 [17]. NFU results from NF by restricting extensionality to non-empty sets, i.e. by replacing the axiom Ext by the following axiom:
Ext′ : ∀x∀y (∃z(z ∈ x ∨ z ∈ y) ∧ ∀z(z ∈ x ↔ z ∈ y) → x = y).
NFU, however, is surprisingly weak: a model of NFU can be constructed within Peano Arithmetic. One of the drawbacks of NFU is that, contrary to NF, it doesn’t prove the axiom of Infinity. On the other hand, it was also shown by Jensen [17] that NFU is consistent with Infinity, as well as with Infinity and Choice, AC, notwithstanding NF refuting AC, according to Specker [24]. This time the consistency results are relative to a much stronger theory, Zermelo Set Theory with Separation restricted to ∆0 formulae (also known as Mac Lane Set Theory), or, equivalently, Simple Type Theory with Infinity (see [17, Theorem 1 and Lemma 4]). There are further consistent extensions of NFU, forming a kind of “large cardinals program” in this set theory; see e.g. [17, 6, 16, 23]. It is worth noting that appropriate NF-large cardinal axioms, when added to NF, or even to NFU, do allow one to model ZF: a good reference is [15]. This paper is an attempt to apply the so-called bisimulation method in order to model NF in an appropriate extension of NFU. This method has been used in many different situations where there was a need to satisfy Extensionality in a non-extensional, non-wellfounded framework: basic references here are [1] and [2].¹
On the language side, in order to carry out the necessary constructions, the only required addition to L∈ is a built-in type-preserving ordered pairing function ⟨·, ·⟩. The fact that this extension is equivalent (in NF) to having the Infinity axiom was shown first by Rosser [22] (but see also Quine [19]), and in the context of NFU it was employed by Holmes [14, 15]. With this kind of pairing, it is easy to talk about finite sequences, trees and bisimulations, which are the key preparatory notions in the present paper. Working in
1 The referee has pointed out a similarity of our method to the development of the isomorphism classes of well-founded extensional relations with top, which was used more often in the NF literature, in particular for modelling ZF in extensions of NFUP. However, there are trees which are bisimilar but don’t give rise to isomorphic relations in the above sense, so it’s not clear to which extent these two methods realize “the same” structure.
TOWARDS NEW FOUNDATIONS
NFU + Pairing, NFUP,² we define a certain monotone operation pw acting on sets of trees and conclude the following: Lemma 3.4. Any set models all Equality axioms of NF; Lemma 3.19. Any fixpoint of the pw operation models Stratified Comprehension; and Lemma 3.18. Any least fixpoint, in addition, models Extensionality. Thus, existence of a pw-least fixpoint is sufficient to model NF. This connects us with the well-known MID principle, which asserts existence of least fixpoints of monotone operations and has been studied extensively in different areas of Mathematical Logic. For example, in Set Theory, many ZF-large cardinal axioms can be seen as the MID principle for particular monotone operations; in Proof Theory, much research has been done on the MID principle over Peano Arithmetic and subsystems of Analysis, for a start see [4]; in Computer Science, one manifestation of MID is the various μ-calculi. Related to all of the above, including New Foundations, is the study of MID in Feferman’s Explicit Mathematics, EM: one can start from [7, 12, 20, 28]. Explicit Mathematics can be seen as an extension of the restriction of NFUP containing only two types, cf. [5]; for this reason the only set operations f one can talk about in EM are type-preserving (or type level), i.e. such that x and f(x) must have the same type. However, since EM postulates many more set existence principles than just those provided for by Stratified Comprehension, the very question of consistency and strength of MID becomes highly non-trivial; this question has been answered positively. In NFUP in general, as well, MID for type level operations easily follows from Stratified Comprehension, but the consistency question seems to be much more difficult if the operation is not type level. In any case, for our operation pw, a positive answer would imply Consis(NF).³
§2. Preliminary developments in NFUP: sequences, trees and bisimulations.
Throughout this paper, NFUP will mean the extension of NFU, as described in the Introduction, by a built-in ordered pairing operation. The Stratified Comprehension (SCA) and restricted Extensionality (Ext′) axioms remain as above; now we describe a mechanism to include ordered pairing. To do this, we add to the language L∈ the ordered pairing function constant ⟨·, ·⟩ and adjoin to the theory the following Pairing axiom:
Pair : ⟨x, u⟩ = ⟨y, v⟩ → x = y ∧ u = v.
Using Pairing, we can conservatively define projection functions p0 and p1. Namely, translate every atomic formula ψ as follows:
ψ[p0(t)] :⇔ ∃x∃y (t = ⟨x, y⟩ ∧ ψ[x]),
ψ[p1(t)] :⇔ ∃x∃y (t = ⟨x, y⟩ ∧ ψ[y]).
From this translation we see that p0 and p1 are inverses of ⟨·, ·⟩:
Unpair : p0(⟨x, y⟩) = x ∧ p1(⟨x, y⟩) = y.
² This theory apparently was first considered in Feferman [6], where its consistency, and that of a strong extension thereof, was proved. Holmes [14] offered a way to implement the Quine pair in the NFU environment, resulting in an interpretation of NFUP in NFU + Infinity. Conversely, Infinity is deducible from the type level pair by the device due to Rosser [22].
³ Observation made by the referee: For some other monotone operations, existence of least fixpoints is inconsistent with NFUP. For our operation pw, an inconsistency can be deduced if one adds unrestricted Choice to NFUP. Therefore, this could lead to a model of NF only after building very special models of NFUP which violate Choice.
The new extended language will be called LP. The notion of stratification is adjusted in such a way that in the term ⟨s, t⟩ the components s and t must have the same type n, and then the whole term ⟨s, t⟩ is also assigned the type n. The requirements for x^n = y^n and x^m ∈ y^(m+1) of Definition 1.1 are left intact, now relating to terms s, t instead of mere variables x, y. It follows that the type of p0(t), p1(t) must be the same as the type of t. Keep in mind that in the SCA axiom of NFUP the formula ϕ must be stratified in the new sense. NFUP is formulated in LP and based on classical logic with equality. We set NFUP := Ext′ + SCA + Pair. In this paper by default we will be reasoning in NFUP. V will denote the universal set {x | x = x}, and Λ the empty set {x | x ≠ x}. We customarily define ⟨x1, …, xn⟩ := ⟨⟨x1, …, xn−1⟩, xn⟩ for n ≥ 3. Having ordered pairs at our disposal, we can define the Cartesian product, relations and functions. Namely,
Definition 2.1.
x × y := {⟨u, v⟩ | u ∈ x ∧ v ∈ y};
Rel := {R | ∀x∈R ∃y∃z x = ⟨y, z⟩};
dom(R) := {x | ∃y ⟨x, y⟩ ∈ R};
rng(R) := {y | ∃x ⟨x, y⟩ ∈ R};
Fun := {f ∈ Rel | ∀x∈f ∀y∈f (p0(x) = p0(y) → p1(x) = p1(y))};
f : x → y :⇔ f ∈ Fun ∧ dom(f) = x ∧ rng(f) ⊆ y;
f(x) := “the unique y s.t. ⟨x, y⟩ ∈ f” for f ∈ Fun and x ∈ dom(f).
We define Frege integers in the standard way (see [15, p. 79–80]). Namely, set
(1) 0 := {Λ},
(2) S(x) := {y ∪ {z} | y ∈ x ∧ z ∉ y},
and, finally,
(3) N := ⋂{x | 0 ∈ x ∧ ∀y∈x S(y) ∈ x}.
All Peano axioms hold for N so defined. We use 1 := S(0). Addition +, subtraction −, etc., can be defined to satisfy the standard properties. For details of those developments, see e.g. [15, Chapter 12]. Equally, we have access to (primitive) recursion and induction on N:
Lemma 2.2 (Induction on N, see [15, p. 81]). If X ⊆ N, 0 ∈ X and ∀y∈X S(y) ∈ X, then X = N.
Lemma 2.3 (Recursion on N, see [15, p. 83]). If X is a set, x is an element of X, and f : X × N → X, then there exists a unique function g : N → X s.t. g(0) = x and g(S(k)) = f(g(k), k) for each k ∈ N.
We can define a set Seq of sequences so that
Seq = {⟨x, y⟩ | (x = 0 ∧ y = 0) ∨ (x = 1 ∧ y = ⟨p0(y), p1(y)⟩ ∧ p0(y) ∈ Seq)}.
To do this, by SCA one defines a set Seq0 s.t.
(4) Seq0 := {⟨0, 0⟩},
and a function SeqS s.t.
(5) SeqS(Y) := {⟨1, ⟨y, z⟩⟩ | y ∈ Y}.
Then by recursion on N (X := {y | ∃z z ∈ y}, x := Seq0, f(Y, n) := SeqS(Y)) one defines a function sq s.t.
(6) sq(0) := Seq0, sq(S(n)) := SeqS(sq(n)).
Finally by SCA one sets
(7) Seq := ⋃ rng(sq).
We abbreviate nil := ⟨0, 0⟩. Since the definition of Seq is inductive, we have the standard principles of induction and recursion on Seq:
Lemma 2.4 (Induction on Seq). If X ⊆ Seq, nil ∈ X and ∀y∈X ∀u ⟨1, ⟨y, u⟩⟩ ∈ X, then X = Seq.
Proof. By induction on n we prove sq(n) ⊆ X, for every n ∈ N. Therefore Seq ⊆ X. Since additionally we are given X ⊆ Seq, by Ext′ we obtain X = Seq.
Lemma 2.5 (Recursion on Seq). If X is a set, x is an element of X, and f : X × Seq × V → X, then there exists a unique function g : Seq → X s.t. g(nil) = x and g(⟨1, ⟨y, u⟩⟩) = f(g(y), y, u) for each y ∈ Seq, u ∈ V.
Proof. Define a function H : 𝒳 → 𝒳, where 𝒳 := {Y ⊆ Seq × X | ∃z z ∈ Y}, so that H(A) := {⟨⟨1, ⟨y, u⟩⟩, f(v, y, u)⟩ | ⟨y, v⟩ ∈ A}. By recursion on N (taking in Lemma 2.3 the set 𝒳, the element {⟨⟨0, 0⟩, x⟩} and the function (Y, n) ↦ H(Y)) define
(8) F(0) := {⟨⟨0, 0⟩, x⟩}, F(S(n)) := {⟨⟨1, ⟨y, u⟩⟩, f(v, y, u)⟩ | ⟨y, v⟩ ∈ F(n)}.
Claim 1. ∀y∈sq(n) ∃!v ⟨y, v⟩ ∈ F(n). /- By induction on n. Obvious when n = 0. Assume n > 0 and y ∈ sq(n). Then by (6) and (5) y = ⟨1, ⟨p0(p1(y)), p1(p1(y))⟩⟩ with p0(p1(y)) ∈ sq(n−1). By IH ∃!v0 ⟨p0(p1(y)), v0⟩ ∈ F(n−1). Consequently, by (8), ∃!v ⟨y, v⟩ ∈ F(n). -/
Claim 2. ⟨y, v⟩ ∈ F(n) → y ∈ sq(n). /- By induction on n, using (8). -/
Claim 3. y ∈ sq(n) ∧ m > n → y ∉ sq(m). /- By induction on n. Obvious when n = 0, since 0 ≠ 1. Assume n > 0. If y ∈ sq(n) ∧ y ∈ sq(m), then p0(p1(y)) ∈ sq(n−1) ∧ p0(p1(y)) ∈ sq(m−1), which contradicts the IH. -/
Claims 1–3 show ∀y∈Seq ∃!n∈N ∃!v ⟨y, v⟩ ∈ F(n), or
∀y∈Seq ∃!v ⟨y, v⟩ ∈ ⋃_{n∈N} F(n).
Therefore
g := {⟨y, v⟩ | y ∈ Seq ∧ ⟨y, v⟩ ∈ ⋃_{n∈N} F(n)}
is a function defined on Seq. rng(g) ⊆ X, g(nil) = x and g(⟨1, ⟨y, u⟩⟩) = f(g(y), y, u) follow from (8). Uniqueness of such a g is proved by induction on Seq.
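The recursion principle of Lemma 2.5 can be illustrated outside NFUP by modelling ordered pairs as Python tuples. The following is only a sketch under that assumption: `NIL`, `cons`, `is_seq` and `rec_seq` are illustrative names not taken from the paper, and Python's native recursion stands in for the set-theoretic construction of g from the stages F(n).

```python
NIL = (0, 0)  # models nil = <0, 0>


def cons(y, u):
    """Extend the sequence y by a new last member u: <1, <y, u>>."""
    return (1, (y, u))


def is_seq(c):
    """Check that c is built from NIL by finitely many applications of cons."""
    while c != NIL:
        if not (isinstance(c, tuple) and len(c) == 2 and c[0] == 1
                and isinstance(c[1], tuple) and len(c[1]) == 2):
            return False
        c = c[1][0]
    return True


def rec_seq(x, f):
    """Return g with g(NIL) = x and g(cons(y, u)) = f(g(y), y, u)."""
    def g(c):
        if c == NIL:
            return x
        y, u = c[1]
        return f(g(y), y, u)
    return g


# Length as an instance of the recursor: start at 0, add 1 per member.
ln = rec_seq(0, lambda k, a, b: k + 1)

if __name__ == "__main__":
    c = cons(cons(NIL, "a"), "b")
    print(is_seq(c), ln(c))
```

Defining the length function as `rec_seq(0, lambda k, a, b: k + 1)` mirrors the choice X := N, x := 0, f(k, a, b) := k + 1.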
One defines the length function ln : Seq → N by recursion on Seq to satisfy the following equations: ln(nil) := 0, ln(⟨1, ⟨a, b⟩⟩) := ln(a) + 1; take in Lemma 2.5 X := N, x := 0, and f(k, a, b) := k + 1. From the definition of ln above we immediately have (by induction on Seq, Lemma 2.4), for c ∈ Seq,
(9) ln(c) = 0 ↔ c = nil.
By recursion on N (Lemma 2.3) one defines the result of erasing the last k members from a sequence c. For this, we set
(10) rem_0 := {⟨x, x⟩ | x ∈ Seq};
rem_{k+1} := {⟨x, nil⟩ | x ∈ Seq ∧ ln(x) ≤ k + 1} ∪ {⟨⟨1, ⟨a, b⟩⟩, y⟩ | a ∈ Seq ∧ ln(a) ≥ k + 1 ∧ ⟨a, y⟩ ∈ rem_k}.
By induction on k we prove ∀k∈N ∀c∈Seq ∃!d∈Seq ⟨c, d⟩ ∈ rem_k, i.e. all rem_k are functions Seq → Seq. We denote rem(c, k) = d for ⟨c, d⟩ ∈ rem_k. By induction on k, (10) also gives us
rem(c, k) = nil if ln(c) ≤ k.
The operation rem allows us to define the k-th last element (c)_k of a sequence c, 1 ≤ k ≤ ln(c):
(c)_k := p1(p1(rem(c, k − 1))).
We also define, for c ≠ nil,
head(c) := rem(c, ln(c) − 1).
The operation head, applied to a non-empty sequence c, gives a one-element sequence head(c) consisting of the first (from the beginning) member of c. We will also need a complementary operation, bodyt(c), the remainder of c after the head is “cut off”. By recursion on Seq, taking in Lemma 2.5 X := Seq, x := nil and
f(d, a, b) := nil if a = nil, ⟨1, ⟨d, b⟩⟩ otherwise,
one defines bodyt(c) for c ∈ Seq in the following way:
bodyt(nil) := nil, bodyt(⟨1, ⟨a, b⟩⟩) := f(bodyt(a), a, b).
Note that this definition yields
bodyt(⟨1, ⟨a, b⟩⟩) = nil if a = nil, ⟨1, ⟨bodyt(a), b⟩⟩ otherwise.
Now we define the concatenation operation x ∗ y by recursion on y ∈ Seq (X := Seq, f(z, y, u) := ⟨1, ⟨z, u⟩⟩):
x ∗ nil := x, x ∗ ⟨1, ⟨y, u⟩⟩ := ⟨1, ⟨x ∗ y, u⟩⟩,
for x ∈ Seq. Observe that x ∗ y is a homogeneous function: all three variables must have the same type in any stratification of “x ∗ y = z”. It’s also a routine check that for any c ∈ Seq, c ≠ nil,
(11) head(c) ∗ bodyt(c) = c.
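The operations rem, head, bodyt and ∗ can be tried out on small examples by modelling pairs as Python tuples. The sketch below is an illustration under that assumption only; the function names follow the text, while `NIL`, `cons`, `concat` and `kth_last` are hypothetical helper names, and `rem` is written as a loop rather than via the sets rem_k.

```python
NIL = (0, 0)  # models nil = <0, 0>


def cons(a, b):
    """The sequence a extended by the last member b: <1, <a, b>>."""
    return (1, (a, b))


def ln(c):
    """Length of a sequence."""
    n = 0
    while c != NIL:
        c = c[1][0]
        n += 1
    return n


def rem(c, k):
    """Erase the last k members of c (yields NIL when ln(c) <= k)."""
    for _ in range(k):
        if c == NIL:
            break
        c = c[1][0]
    return c


def kth_last(c, k):
    """(c)_k = p1(p1(rem(c, k - 1))), for 1 <= k <= ln(c)."""
    return rem(c, k - 1)[1][1]


def head(c):
    """One-element sequence holding the first member of a non-empty c."""
    return rem(c, ln(c) - 1)


def bodyt(c):
    """What remains of c after its first member is cut off."""
    if c == NIL:
        return NIL
    a, b = c[1]
    return NIL if a == NIL else cons(bodyt(a), b)


def concat(x, y):
    """x * y, by recursion on y."""
    if y == NIL:
        return x
    a, u = y[1]
    return cons(concat(x, a), u)


if __name__ == "__main__":
    c = cons(cons(cons(NIL, "a"), "b"), "c")
    print(concat(head(c), bodyt(c)) == c)
```

The final line checks identity (11) on one three-element sequence; it is a spot check, not a proof.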
Lemma 2.6. Concatenation is associative, i.e. ∀x ∈Seq∀y ∈Seq∀z ∈Seq x ∗ (y ∗ z) = (x ∗ y) ∗ z. Proof. By induction on z.
Definition 2.7 (cf. [27, Def. 2.1] and [26, Def. 2]). By SCA sets ≽ and ≽¹ are defined as below:
≽ := {⟨x, y⟩ | x ∈ Seq ∧ y ∈ Seq ∧ ∃z∈Seq (y ∗ z = x)},
≽¹ := {⟨x, y⟩ | x ∈ Seq ∧ y ∈ Seq ∧ ∃z∈Seq (ln(z) = 1 ∧ y ∗ z = x)}.
We will use x ≽ y and x ≽¹ y in place of ⟨x, y⟩ ∈ ≽ and ⟨x, y⟩ ∈ ≽¹, resp.
Lemma 2.8. ∀x∈Seq ∀y∈Seq ∀z∈Seq (y ≽ z → x ∗ y ≽ x ∗ z).
Proof. By associativity (Lemma 2.6).
A tree is a non-empty set of sequences, downwards closed with respect to the ≽-relation:
Definition 2.9 (cf. [27, Def. 2.3] and [26, Def. 3]). By SCA we define
Tree := {T ⊆ Seq | nil ∈ T ∧ ∀y∈T ∀z (y ≽ z → z ∈ T)}.
If T ∈ Tree, x ≽_T y and x ≽¹_T y will mean x ∈ T ∧ y ∈ T ∧ x ≽ y and x ∈ T ∧ y ∈ T ∧ x ≽¹ y, resp. With these notations we will make a familiar use of bounded quantifiers: e.g. ∀x′ ≽¹_T x ϕ[x′] will mean ∀x′ (x′ ≽¹_T x → ϕ[x′]).
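For finite examples, the tree condition of Definition 2.9 is easy to test. The sketch below assumes a simplified representation that is not the paper's: a sequence is a Python tuple of its members in order (so nil is the empty tuple), and a tree is a finite set of such tuples. The names `extends`, `extends_by_one` and `is_tree` are illustrative.

```python
def extends(x, y):
    """True iff y is an initial segment of x, i.e. y * z == x for some z."""
    return x[:len(y)] == y


def extends_by_one(x, y):
    """True iff x is y followed by exactly one further member."""
    return len(x) == len(y) + 1 and extends(x, y)


def is_tree(T):
    """A tree contains the empty sequence and every initial segment
    of each of its members."""
    return () in T and all(x[:k] in T for x in T for k in range(len(x)))


if __name__ == "__main__":
    print(is_tree({(), ("a",), ("a", "b")}))
    print(is_tree({(), ("a", "b")}))
```

The second set fails the test because the initial segment `("a",)` of `("a", "b")` is missing.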
Definition 2.10. If T, T′ ∈ Tree we say that R is a bisimulation between T and T′, written BS(R, T, T′), iff R ⊆ T × T′, ⟨nil, nil⟩ ∈ R, and the following holds:
(12) ∀x∈T ∀y∈T′ (⟨x, y⟩ ∈ R → ∀x′ ≽¹_T x ∃y′ ≽¹_T′ y ⟨x′, y′⟩ ∈ R ∧ ∀y′ ≽¹_T′ y ∃x′ ≽¹_T x ⟨x′, y′⟩ ∈ R).
Definition 2.11. We define T ≅ T′ :⇔ T ∈ Tree ∧ T′ ∈ Tree ∧ ∃R BS(R, T, T′).
Lemma 2.12. ≅ is an equivalence relation on Tree, i.e. for every T, T′, T″ ∈ Tree the following hold:
(13) T ≅ T;
(14) T ≅ T′ → T′ ≅ T;
(15) T ≅ T′ ∧ T′ ≅ T″ → T ≅ T″.
Proof. (13) is provided by the identity relation on T: {⟨x, x⟩ | x ∈ T}. (14) is provided by the inverse relation: when BS(R, T, T′), set R⁻¹ := {⟨y, x⟩ | ⟨x, y⟩ ∈ R}. (15) is provided by the composition: when BS(R1, T, T′) ∧ BS(R2, T′, T″), set R2 ◦ R1 := {⟨x, z⟩ | ∃y(⟨x, y⟩ ∈ R1 ∧ ⟨y, z⟩ ∈ R2)}.
Definition 2.13. For T ∈ Tree and x ∈ T we define T_x := {y ∈ Seq | x ∗ y ∈ T}.
Lemma 2.14. If T ∈ Tree and x ∈ T then T_x ∈ Tree.
Proof. By Definition 2.9 we need to prove T_x ⊆ Seq ∧ nil ∈ T_x ∧ ∀y∈T_x ∀z (y ≽ z → z ∈ T_x). T_x ⊆ Seq is immediate from Definition 2.13. nil ∈ T_x follows from x ∗ nil = x ∈ T. Now assume y ∈ T_x ∧ y ≽ z. We then have x ∗ y ∈ T and by Lemma 2.8 x ∗ y ≽ x ∗ z. Since T ∈ Tree, it must hold that x ∗ z ∈ T, i.e. z ∈ T_x.
Lemma 2.15. If T, T′ ∈ Tree, BS(R, T, T′) and ⟨x, y⟩ ∈ R then T_x ≅ T′_y.
Proof. T_x, T′_y ∈ Tree by Lemma 2.14. Consider R′ := {⟨x′, y′⟩ | ⟨x ∗ x′, y ∗ y′⟩ ∈ R}.
From R ⊆ T × T′ we have R′ ⊆ T_x × T′_y. From ⟨x, y⟩ ∈ R we have ⟨nil, nil⟩ ∈ R′. Finally,
∀x′∈T_x ∀y′∈T′_y (⟨x′, y′⟩ ∈ R′ → ∀x″ ≽¹_{T_x} x′ ∃y″ ≽¹_{T′_y} y′ ⟨x″, y″⟩ ∈ R′ ∧ ∀y″ ≽¹_{T′_y} y′ ∃x″ ≽¹_{T_x} x′ ⟨x″, y″⟩ ∈ R′)
follows from the condition (12), so that we can conclude BS(R′, T_x, T′_y).
Lemma 2.16. If T, T′ ∈ Tree and T ≅ T′ then
∀x (⟨nil, x⟩ ∈ T → ∃y (⟨nil, y⟩ ∈ T′ ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)) ∧
∀y (⟨nil, y⟩ ∈ T′ → ∃x (⟨nil, x⟩ ∈ T ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)).
Proof. Let T, T′ ∈ Tree and BS(R, T, T′). By Definition 2.10 we have ⟨nil, nil⟩ ∈ R and
∀x ≽¹_T nil ∃y ≽¹_T′ nil ⟨x, y⟩ ∈ R ∧ ∀y ≽¹_T′ nil ∃x ≽¹_T nil ⟨x, y⟩ ∈ R.
The claim now follows from Lemma 2.15.
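For finite trees, bisimilarity in the sense of Definitions 2.10 and 2.11 can be decided by computing the largest relation satisfying condition (12) and checking whether it relates the two roots. The sketch below assumes, purely for illustration, that sequences are Python tuples and trees are finite sets of tuples; `children`, `subtree`, `greatest_bisimulation` and `bisimilar` are hypothetical names, and the refinement loop is a standard technique rather than anything taken from the paper.

```python
def children(T, x):
    """Members of T that extend x by exactly one further member."""
    return [y for y in T if len(y) == len(x) + 1 and y[:len(x)] == x]


def subtree(T, x):
    """T_x = {y | x * y in T}."""
    return {y[len(x):] for y in T if y[:len(x)] == x}


def greatest_bisimulation(T1, T2):
    """Largest R within T1 x T2 satisfying the back-and-forth condition (12)."""
    R = {(x, y) for x in T1 for y in T2}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(R):
            forth = all(any((x1, y1) in R for y1 in children(T2, y))
                        for x1 in children(T1, x))
            back = all(any((x1, y1) in R for x1 in children(T1, x))
                       for y1 in children(T2, y))
            if not (forth and back):
                R.discard((x, y))
                changed = True
    return R


def bisimilar(T1, T2):
    """True iff some bisimulation relates the two roots."""
    return ((), ()) in greatest_bisimulation(T1, T2)


if __name__ == "__main__":
    two_leaves = {(), ("a",), ("b",)}
    one_leaf = {(), ("c",)}
    chain = {(), ("c",), ("c", "d")}
    print(bisimilar(two_leaves, one_leaf), bisimilar(two_leaves, chain))
```

The first pair is bisimilar although the trees are not isomorphic, which is the situation mentioned in footnote 1; the last assertion of the test below is a finite instance of Lemma 2.15.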
Lemma 2.17. ∀T∈Tree ∀T′∈Tree ((∀x (⟨nil, x⟩ ∈ T → ∃y (⟨nil, y⟩ ∈ T′ ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)) ∧ ∀y (⟨nil, y⟩ ∈ T′ → ∃x (⟨nil, x⟩ ∈ T ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩))) → T ≅ T′).
Proof. Given T ∈ Tree ∧ T′ ∈ Tree and
(16) ∀x (⟨nil, x⟩ ∈ T → ∃y (⟨nil, y⟩ ∈ T′ ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)),
(17) ∀y (⟨nil, y⟩ ∈ T′ → ∃x (⟨nil, x⟩ ∈ T ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)),
set
(18) R := {⟨nil, nil⟩} ∪ {⟨x, y⟩ | x ∈ T−{nil} ∧ y ∈ T′−{nil} ∧ T_x ≅ T′_y}.
Claim. R is a bisimulation between T and T′. /- From (18) we immediately have R ⊆ T × T′ and ⟨nil, nil⟩ ∈ R. We must now show
(19) ∀x∈T ∀y∈T′ (⟨x, y⟩ ∈ R → ∀x′ ≽¹_T x ∃y′ ≽¹_T′ y ⟨x′, y′⟩ ∈ R ∧ ∀y′ ≽¹_T′ y ∃x′ ≽¹_T x ⟨x′, y′⟩ ∈ R).
Fix x ∈ T, y ∈ T′. First consider the case x = nil = y. Fix x′ ≽¹_T nil. By (16) ∃y′ ≽¹_T′ nil T_x′ ≅ T′_y′. By (18) ⟨x′, y′⟩ ∈ R for these x′, y′. Similarly if we start with y′ ≽¹_T′ nil. Observe that (18) implies ⟨x, y⟩ ∈ R → (x = nil ↔ y = nil). So it remains to consider the case ⟨x, y⟩ ∈ R ∧ x ≠ nil ≠ y. Assuming x ≠ nil ≠ y, ⟨x, y⟩ ∈ R yields T_x ≅ T′_y. By Lemma 2.16
∀x′ (⟨nil, x′⟩ ∈ T_x → ∃y′ (⟨nil, y′⟩ ∈ T′_y ∧ T_⟨x,x′⟩ ≅ T′_⟨y,y′⟩)) ∧
∀y′ (⟨nil, y′⟩ ∈ T′_y → ∃x′ (⟨nil, x′⟩ ∈ T_x ∧ T_⟨x,x′⟩ ≅ T′_⟨y,y′⟩)),
i.e.
∀x′ ≽¹_T x ∃y′ ≽¹_T′ y T_x′ ≅ T′_y′ ∧ ∀y′ ≽¹_T′ y ∃x′ ≽¹_T x T_x′ ≅ T′_y′,
which yields the conclusion of (19). -/
Now, if
T = {nil} ∪ {⟨nil, y1, …, yn⟩ ∈ T} ∈ Tree,
by T˘ we want to denote a tree
{nil} ∪ {⟨nil, {y1}, …, {yn}⟩ | ⟨nil, y1, …, yn⟩ ∈ T}.
For establishing properties of T˘, we will use the NFU-fact
(20) ∀x∀y (x = y ↔ {x} = {y}).
The exact definitions are below.
Definition 2.18. Set
=_0 := {⟨nil, nil⟩},
=_1 := {⟨{p}, q⟩ | p ∈ Seq ∧ q ∈ Seq ∧ ln(p) = 1 ∧ ln(q) = 1 ∧ {(p)_1} = (q)_1},
=_{k+2} := {⟨{p}, q⟩ | p ∈ Seq ∧ q ∈ Seq ∧ ln(p) > 1 ∧ ln(q) > 1 ∧ {(p)_1} = (q)_1 ∧ ⟨{rem(p, 1)}, rem(q, 1)⟩ ∈ =_{k+1}}.
By recursion on N (Lemma 2.3) there exists a function g s.t. ∀k∈N g(k) = =_k. Finally we set
p⁺ = q :⇔ p = nil ∧ q = nil ∨ ∃k∈N−{0} ⟨{p}, q⟩ ∈ g(k).
Definition 2.19. For T ∈ Tree we define T˘ := {q ∈ Seq | ∃p∈T p⁺ = q}.
Lemma 2.20. ∀p∈Seq ∃!q∈Seq p⁺ = q.
Proof. By induction on Seq, using the facts (9), (20) and the axiom Pair.
Lemma 2.21. ∀T∈Tree ∃!U∈Tree U = T˘.
Proof. Use Definition 2.19, Lemma 2.20, Definition 2.9 and the Equality axioms of NFUP.
Lemma 2.22. For T1, T2 ∈ Tree it holds: T1 ≅ T2 ↔ T˘1 ≅ T˘2.
Proof. It suffices to use the equivalence
BS(R, T1, T2) ↔ BS(R˘, T˘1, T˘2),
where R˘ := {⟨q1, q2⟩ | ⟨p1, p2⟩ ∈ R ∧ p1⁺ = q1 ∧ p2⁺ = q2}, and note that both R and R˘ are definable from each other in a stratified way.
§3. Modelling NF.
Definition 3.1. We define
S ∈˘ T :⇔ S ∈ Tree ∧ T ∈ Tree ∧ ∃x (⟨nil, x⟩ ∈ T ∧ S˘ ≅ T_⟨nil,x⟩).
Lemma 3.2. For S, S′, T, T′ ∈ Tree the following hold:
(1) S ≅ S′ ∧ S ∈˘ T → S′ ∈˘ T;
(2) T ≅ T′ ∧ S ∈˘ T → S ∈˘ T′.
Proof. (1) follows from Lemmata 2.22 and 2.12. (2) follows from Definition 3.1, Lemma 2.16 and Lemma 2.12.
Definition 3.3. Let ϕ be an L∈-formula and Z be a set. By ϕ^Z we denote the formula obtained from ϕ by replacing = by ≅, ∈ by ∈˘, and all quantifiers Qz by QZ∈Z. When ϕ is a statement, we say that Z satisfies ϕ, Z |= ϕ, iff ϕ^Z holds.
Lemma 3.4. Let ϕ(x, y1, …, yk) be a formula of L∈ with all free variables shown and Z be a set. Let Yi ∈ Tree for all 1 ≤ i ≤ k. Let X1, X2 ∈ Tree and X1 ≅ X2. Then ϕ^Z[X1] ↔ ϕ^Z[X2]. In other words, any set Z satisfies the Equality axioms of NF.
Proof. By induction on ϕ. The atomic case follows from Lemmata 2.12 and 3.2.
Lemma 3.5. The defining formulae in Definitions 2.11 and 3.1 are stratified. In any stratification of T ≅ T′, T and T′ must have the same type, and in any stratification of S ∈˘ T, the type of T must be 1 higher than the type of S.
Proof. By inspection.
Lemma 3.6. ϕ^Z satisfies Separation for any stratified ϕ, i.e. if ϕ[x] is a stratified formula of L∈ and Z is a set, then
(21) ∃Y ∀X (X ∈ Y ↔ X ∈ Z ∧ ϕ^Z[X]).
Proof. In view of Lemma 3.5, the only reason why the formula X ∈ Z ∧ ϕ^Z[X] could be unstratified is that it might contain several occurrences of the variable Z. Let ψ^{Z1…Zn}[X] be a new formula, obtained from X ∈ Z ∧ ϕ^Z[X] by replacing each occurrence of Z by an occurrence of a new variable Zi. Then the formula ψ^{Z1…Zn}[X] is stratified. By SCA, we have
(22) ∀Z1 … ∀Zn ∃Y ∀X (X ∈ Y ↔ ψ^{Z1…Zn}[X]).
Substituting now Z for Z1, …, Zn, we obtain (21).
Now we introduce the following construction. If
T = {nil} ∪ {⟨nil, y1, …, yn⟩ ∈ T} ∈ Tree,
by T̂ we want to denote a tree
{nil} ∪ {⟨nil, T, {y1}, …, {yn}⟩ | ⟨nil, y1, …, yn⟩ ∈ T}.
The exact definition is below.
Definition 3.7. For T ∈ Tree we define: T̂ := {nil} ∪ {⟨nil, T⟩ ∗ q | q ∈ T˘}.
Lemma 3.8. For any T ∈ Tree it holds T̂ ∈ Tree ∧ T̂_⟨nil,T⟩ = T˘.
Proof. Straightforward, using Definitions 2.19, 3.7 and the axiom Ext′.
Definition 3.9. For any Y ⊆ Tree we define Y∗ := {nil} ∪ ⋃{T̂ | T ∈ Y}.
Lemma 3.10. For any Y ⊆ Tree we have Y∗ ∈ Tree and
(23) ∀T (⟨nil, T⟩ ∈ Y∗ → Y∗_⟨nil,T⟩ = T˘ ∧ T ∈ Y).
Proof. Y∗ ∈ Tree is obvious from the definition of Y∗. For (23) we additionally employ Lemma 3.8.
Lemma 3.11. ∀Y⊆Tree ∃!T∈Tree T = Y∗.
Proof. Existence follows from Lemma 3.10. Uniqueness follows from the Equality axioms of NFUP.
Definition 3.12. For any Z ⊆ Tree we define pw(Z) := {Y∗ | Y ⊆ Z}.
Lemma 3.13. ∀Z⊆Tree ∃!W⊆Tree W = pw(Z).
Proof. Existence follows from SCA and Lemma 3.11. Uniqueness follows from the Equality axioms of NFUP.
Lemma 3.14. The operation pw is monotone on Tree, i.e. ∀Z1⊆Tree ∀Z2⊆Tree (Z1 ⊆ Z2 → pw(Z1) ⊆ pw(Z2)).
Proof. To show {Z∗ | Z ⊆ Z1} ⊆ {Z∗ | Z ⊆ Z2}, we observe that if Z ⊆ Z1 then Z ⊆ Z2, so Z∗ ∈ pw(Z2).
Definition 3.15.
1. A set Z ⊆ Tree is called a (pw-)fixpoint iff pw(Z) ⊆ Z.
2. A set Z ⊆ Tree is called a (pw-)least fixpoint iff it is a fixpoint and ∀Y⊆Tree (pw(Y) ⊆ Y → Z ⊆ Y).
Lemma 3.16. If Z is a least fixpoint then Z = pw(Z).
Proof. Since Λ∗ ∈ pw(Z), by Ext′ it’s sufficient to show
(24) pw(Z) ⊆ Z
and
(25) Z ⊆ pw(Z).
(24) follows from the fact that Z is a fixpoint. Since the operation pw is monotone (Lemma 3.14), applying it to (24) we obtain pw(pw(Z)) ⊆ pw(Z), i.e. pw(Z) is also a fixpoint. But since Z is a least fixpoint, we obtain (25).
Lemma 3.17. A least fixpoint, if it exists, is unique.
Proof. Let Z1 and Z2 be two least fixpoints. By Lemma 3.16 Z1 = pw(Z1) and Z2 = pw(Z2). Then we also have Λ∗ ∈ Z1 and Λ∗ ∈ Z2. Since Z1 and Z2 are both least fixpoints, Z1 ⊆ Z2 and Z2 ⊆ Z1 both hold. It remains to apply the Ext′ axiom of NFUP.
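The notions of Definition 3.15 have a simple finite analogue, which may help in reading Lemmas 3.16 and 3.17. The sketch below is not a model of pw: it assumes an arbitrary monotone operation on subsets of a finite universe, where a least fixpoint always exists and is reached by iterating from the empty set. The names `least_fixpoint` and `step` are illustrative; in NFUP the existence of a least fixpoint of pw is exactly what remains open.

```python
def least_fixpoint(op):
    """Least Z with op(Z) <= Z, for a monotone op on subsets of a
    finite universe, obtained by iterating from the empty set."""
    Z = frozenset()
    while True:
        nxt = frozenset(op(Z))
        if nxt <= Z:          # Z is a fixpoint in the sense op(Z) subset of Z
            return Z
        Z = Z | nxt


def step(Z):
    """A sample monotone operation on subsets of {0, ..., 9} (hypothetical)."""
    return {0} | {n + 2 for n in Z if n + 2 < 10}


if __name__ == "__main__":
    lfp = least_fixpoint(step)
    print(sorted(lfp))
    # Analogue of Lemma 3.16: the least fixpoint satisfies Z = op(Z).
    print(frozenset(step(lfp)) == lfp)
```

Every stage of the iteration is contained in any Y with op(Y) ⊆ Y, by monotonicity, which is why the result is least.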
Lemma 3.18. If Z is a least fixpoint then the following holds:
∀T∈Z ∀T′∈Z (∀S∈Z (S ∈˘ T ↔ S ∈˘ T′) → T ≅ T′).
In other words, any least fixpoint satisfies the Extensionality axiom of NF.
Proof. Given
(26) T ∈ Z ∧ T′ ∈ Z ∧ ∀S∈Z (S ∈˘ T ↔ S ∈˘ T′),
first we observe, since Z ⊆ Tree, that
(27) T ∈ Tree ∧ T′ ∈ Tree.
Now we aim to show
(28) ∀x (⟨nil, x⟩ ∈ T → ∃y (⟨nil, y⟩ ∈ T′ ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)),
(29) ∀y (⟨nil, y⟩ ∈ T′ → ∃x (⟨nil, x⟩ ∈ T ∧ T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩)).
From (26) we have
(30) ∀S∈Z (S ∈˘ T ↔ S ∈˘ T′).
In order to prove (28), assume ⟨nil, x⟩ ∈ T. Since T ∈ Z and Z = pw(Z) (Lemma 3.16), we have T ∈ pw(Z), i.e.
(31) ∃Y⊆Z Y∗ = T.
By Lemma 3.10
(32) x ∈ Y ∧ Y∗_⟨nil,x⟩ = x˘,
which implies
(33) x ∈ Z ∧ T_⟨nil,x⟩ = x˘.
Then we must have x ∈˘ T, and then by (30) x ∈˘ T′, i.e.
(34) ∃y (⟨nil, y⟩ ∈ T′ ∧ x˘ ≅ T′_⟨nil,y⟩).
From (33) and (34) we obtain
T_⟨nil,x⟩ ≅ T′_⟨nil,y⟩
for the abovementioned x, y. For (29), we proceed in a similar manner, now employing the direction ← of (30). This establishes (28) and (29), and hence, by Lemma 2.17, T ≅ T′.
Comment. Does the operation pw have fixpoints? Yes; for example the sets Tree, pw(Tree), pw(pw(Tree)), …. But we don’t know whether it’s consistent to assume that it has a least fixpoint.
Lemma 3.19. Any fixpoint satisfies SCA of NF.
Proof. Let Z be a fixpoint. Let ϕ(x, y1, …, yk) be a stratified formula of L∈ with all free variables shown. Let Yi ∈ Z for all 1 ≤ i ≤ k. We need to prove
(35) ∃Y∗∈Z ∀X∈Z (X ∈˘ Y∗ ↔ ϕ^Z(X, Y1, …, Yk)).
By Lemma 3.6 set
(36) Y := {X ∈ Z | ϕ^Z[X]}.
Defining Y∗ as in Definition 3.9 and using that Z is a fixpoint, we conclude Y∗ ∈ Z. Now, assuming T ∈ Z, it remains to prove
T ∈˘ Y∗ ↔ ϕ^Z[T].
In the → direction, assume T ∈˘ Y∗. By Definition 3.1 this means
(37) ∃T′ (⟨nil, T′⟩ ∈ Y∗ ∧ T˘ ≅ Y∗_⟨nil,T′⟩),
which by Lemma 3.10 implies
(38) ∃T′∈Y T˘ ≅ Y∗_⟨nil,T′⟩ = T′˘.
By Lemma 2.22 we have now
(39) T ≅ T′.
Now from (36) and Lemma 3.4 we conclude ϕ^Z[T]. In the converse direction, assume ϕ^Z[T]. Then by (36)
(40) T ∈ Y,
and by Definition 3.9 and Lemma 3.8
(41) T ∈˘ Y∗.
Definition 3.20. Let MID(pw) be the axiom saying: There exists a least fixpoint of the pw operation.
Theorem 1. NF is consistent relative to NFUP + MID(pw).
Proof. Follows from Lemmata 3.4, 3.19 and 3.18.
REFERENCES
[1] P. Aczel, Non-Well-Founded Sets, CSLI Lecture Notes, vol. 14, Stanford University, Stanford, 1988.
[2] J. Barwise and L. Moss, Vicious Circles, CSLI Lecture Notes, vol. 60, Stanford University, Stanford, 1996.
[3] M. Boffa, The consistency problem for NF, The Journal of Symbolic Logic, vol. 42 (1977), no. 2, pp. 215–220.
[4] W. Buchholz, S. Feferman, W. Pohlers, and W. Sieg, Iterated Inductive Definitions and Subsystems of Analysis, Lecture Notes in Mathematics, vol. 897, Springer-Verlag, Berlin, 1981.
[5] A. Cantini, Relating Quine’s NF to Feferman’s EM, Studia Logica, vol. 62 (1999), no. 2, pp. 141–162.
[6] S. Feferman, Some formal systems for the unlimited theory of structures and categories, Unpublished manuscript, 52 pp., available at http://math.stanford.edu/~feferman/papers/Unlimited.pdf, Abstract in The Journal of Symbolic Logic, vol. 39 (1974), pp. 374–375.
[7] S. Feferman, Monotone inductive definitions, The L. E. J. Brouwer Centenary Symposium (A. S. Troelstra and D. van Dalen, editors), North-Holland, Amsterdam, 1982, pp. 77–89.
[8] S. Feferman, Typical ambiguity: trying to have your cake and eat it too, One Hundred Years of Russell’s Paradox (G. Link, editor), de Gruyter, Berlin, 2004, pp. 135–151.
[9] T. E. Forster, Set Theory with a Universal Set, 2nd ed., Clarendon Press, New York, 1995.
[10] T. E. Forster, Quine’s NF – 60 years on, The American Mathematical Monthly, vol. 104 (1997), no. 9, pp. 838–845.
[11] H. Friedman, One hundred and two problems in mathematical logic, The Journal of Symbolic Logic, vol. 40 (1975), pp. 113–129.
[12] T. Glaß, M. Rathjen, and A. Schlüter, On the proof-theoretic strength of monotone induction in explicit mathematics, Annals of Pure and Applied Logic, vol. 85 (1997), no. 1, pp. 1–46.
[13] M. R. Holmes, New foundations home page, http://math.boisestate.edu/~holmes/holmes/nf.html.
[14] M. R. Holmes, The axiom of anti-foundation in Jensen’s “new foundations with ur-elements”, Bulletin de la Société Mathématique de Belgique. Série B, vol. 43 (1991), no. 2, pp. 167–179.
[15] M. R. Holmes, Elementary Set Theory with a Universal Set, Cahiers du Centre de Logique, vol. 10, Université Catholique de Louvain, Département de Philosophie, Louvain, 1998.
[16] M. R. Holmes, Strong axioms of infinity in NFU, The Journal of Symbolic Logic, vol. 66 (2001), no. 1, pp. 87–116.
[17] R. B. Jensen, On the consistency of a slight(?) modification of Quine’s NF, Synthese, vol. 19 (1969), pp. 250–263.
[18] W. V. Quine, New Foundations for Mathematical Logic, The American Mathematical Monthly, vol. 44 (1937), no. 2, pp. 70–80.
[19] W. V. Quine, On ordered pairs, The Journal of Symbolic Logic, vol. 10 (1945), pp. 95–96.
[20] M. Rathjen, Explicit mathematics with monotone inductive definitions: a survey, Reflections on the Foundations of Mathematics: Essays in Honour of Solomon Feferman, Lecture Notes in Logic, vol. 15, Association for Symbolic Logic, Urbana, IL, 2002, pp. 329–346.
[21] J. B. Rosser, On the consistency of Quine’s new foundations for mathematical logic, The Journal of Symbolic Logic, vol. 4 (1939), pp. 15–24.
[22] J. B. Rosser, The axiom of infinity in Quine’s New Foundations, The Journal of Symbolic Logic, vol. 17 (1952), pp. 238–242.
[23] R. Solovay, The consistency strength of NFUB, Unpublished manuscript, 39 pp., available at http://front.math.ucdavis.edu/author/Solovay-R*&c=LO, 1997.
[24] E. P. Specker, The axiom of choice in Quine’s new foundations for mathematical logic, Proceedings of the National Academy of Sciences of the USA, vol. 39 (1953), pp. 972–975.
[25] E. P. Specker, Typical ambiguity, Logic, Methodology and Philosophy of Science (E. Nagel, editor), Stanford University Press, Stanford, 1962, pp. 116–124.
[26] S. Tupailo, On non-wellfounded constructive set theory: construction of non-wellfounded sets in explicit mathematics, Games, Logic, and Constructive Sets (G. Mints and R. Muskens, editors), CSLI Publications, Stanford, 2003, pp. 109–125.
[27] S. Tupailo, Realization of constructive set theory into Explicit Mathematics: a lower bound for impredicative Mahlo universe, Annals of Pure and Applied Logic, vol. 120 (2003), no. 1-3, pp. 165–196.
[28] S. Tupailo, On the intuitionistic strength of monotone inductive definitions, The Journal of Symbolic Logic, vol. 69 (2004), no. 3, pp. 790–798.
TALLINN UNIVERSITY OF TECHNOLOGY, INSTITUTE OF CYBERNETICS, TALLINN, ESTONIA
E-mail:
[email protected]
Lecture Notes in Logic
1. Recursion Theory. J. R. Shoenfield. (1993, reprinted 2001; 84 pp.)
2. Logic Colloquium ’90; Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Helsinki, Finland, July 15–22, 1990. Eds. J. Oikkonen and J. Väänänen. (1993, reprinted 2001; 305 pp.)
3. Fine Structure and Iteration Trees. W. Mitchell and J. Steel. (1994; 130 pp.)
4. Descriptive Set Theory and Forcing: How to Prove Theorems about Borel Sets the Hard Way. A. W. Miller. (1995; 130 pp.)
5. Model Theory of Fields. D. Marker, M. Messmer, and A. Pillay. (First edition, 1996; 154 pp. Second edition, 2006; 155 pp.)
6. Gödel ’96; Logical Foundations of Mathematics, Computer Science and Physics; Kurt Gödel’s Legacy. Brno, Czech Republic, August 1996, Proceedings. Ed. P. Hájek. (1996, reprinted 2001; 322 pp.)
7. A General Algebraic Semantics for Sentential Objects. J. M. Font and R. Jansana. (1996; 135 pp.)
8. The Core Model Iterability Problem. J. Steel. (1997; 112 pp.)
9. Bounded Variable Logics and Counting. M. Otto. (1997; 183 pp.)
10. Aspects of Incompleteness. P. Lindström. (First edition, 1997; 133 pp. Second edition, 2003; 163 pp.)
11. Logic Colloquium ’95; Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Haifa, Israel, August 9–18, 1995. Eds. J. A. Makowsky and E. V. Ravve. (1998; 364 pp.)
12. Logic Colloquium ’96; Proceedings of the Colloquium held in San Sebastian, Spain, July 9–15, 1996. Eds. J. M. Larrazabal, D. Lascar, and G. Mints. (1998; 268 pp.)
13. Logic Colloquium ’98; Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Prague, Czech Republic, August 9–15, 1998. Eds. S. R. Buss, P. Hájek, and P. Pudlák. (2000; 541 pp.)
14. Model Theory of Stochastic Processes. S. Fajardo and H. J. Keisler. (2002; 136 pp.)
15. Reflections on the Foundations of Mathematics; Essays in Honor of Solomon Feferman. Eds. W. Sieg, R. Sommer, and C. Talcott. (2002; 444 pp.)
16. Inexhaustibility; A Non-exhaustive Treatment. T. Franzén. (2004; 255 pp.)
17. Logic Colloquium ’99; Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Utrecht, Netherlands, August 1–6, 1999. Eds. J. van Eijck, V. van Oostrom, and A. Visser. (2004; 208 pp.)
18. The Notre Dame Lectures. Ed. P. Cholak. (2005; 185 pp.)
19. Logic Colloquium 2000; Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Paris, France, July 23–31, 2000. Eds. R. Cori, A. Razborov, S. Todorčević, and C. Wood. (2005; 408 pp.)
20. Logic Colloquium ’01; Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Vienna, Austria, August 1–6, 2001. Eds. M. Baaz, S. Friedman, and J. Krajíček. (2005; 486 pp.)
21. Reverse Mathematics 2001. Ed. S. Simpson. (2005; 401 pp.)
22. Intensionality. Ed. R. Kahle. (2005; 265 pp.)
23. Logicism Renewed: Logical Foundations for Mathematics and Computer Science. P. Gilmore. (2005; 230 pp.)
24. Logic Colloquium ’03: Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Helsinki, Finland, August 14–20, 2003. Eds. V. Stoltenberg-Hansen and J. Väänänen. (2006; 407 pp.)
25. Nonstandard Methods and Applications in Mathematics. Eds. N. J. Cutland, M. Di Nasso, and D. Ross. (2006; 248 pp.)
26. Logic in Tehran: Proceedings of the Workshop and Conference on Logic, Algebra, and Arithmetic, held October 18–22, 2003. Eds. A. Enayat, I. Kalantari, and M. Moniri. (2006; 341 pp.)
27. Logic Colloquium ’02: Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic and the Colloquium Logicum, held in Münster, Germany, August 3–11, 2002. Eds. Z. Chatzidakis, P. Koepke, and W. Pohlers. (2006; 359 pp.)
28. Logic Colloquium ’05: Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic, held in Athens, Greece, July 28–August 3, 2005. Eds. C. Dimitracopoulos, L. Newelski, D. Normann, and J. Steel. (2007; 272 pp.)