PROCEEDINGS OF THE THIRD SCANDINAVIAN LOGIC SYMPOSIUM
Edited by
Stig KANGER, Professor of Philosophy, University of Uppsala, Sweden
1975
NORTH-HOLLAND PUBLISHING COMPANY, AMSTERDAM · OXFORD
AMERICAN ELSEVIER PUBLISHING COMPANY, INC., NEW YORK
© NORTH-HOLLAND PUBLISHING COMPANY, 1975
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
Library of Congress Catalog Card Number 74-80113
North-Holland ISBN S 0720422000, 072042283 3
American Elsevier ISBN 0444 106790
Published by:
North-Holland Publishing Company- Amsterdam North-Holland Publishing Company, Ltd.- Oxford
Sole distributors for the U.S.A. and Canada: American Elsevier Publishing Company, Inc. 52 Vanderbilt Avenue New York, N.Y. 10017
PRINTED IN EAST GERMANY
PREFACE
The Third Scandinavian Logic Symposium was held at the University of Uppsala, Sweden, April 9-11, 1973. About 70 persons attended the meeting and 18 papers were presented. The symposium was organized by a committee consisting of J. E. Fenstad, J. Hintikka, B. Mayoh and S. Kanger. It was sponsored by the Division of Logic, Methodology and Philosophy of Science of the International Union of History and Philosophy of Science, by the Swedish Natural Science Research Council, and by Fornanderska Fonden.
Uppsala, January 1974
Stig Kanger
QUANTIFIERS, GAMES AND INDUCTIVE DEFINITIONS Peter ACZEL University of Manchester, Manchester, Great Britain
0. Introduction
We give a game theoretic interpretation of infinite strings of quantifiers $(Qx_0\,Qx_1\,Qx_2 \cdots)$, where $Q$ is any non-trivial monotone quantifier. These are shown to be closely related to certain inductive definitions. In the last section we give a generalisation of Moschovakis's characterisation, using his game quantifier, of the positive first order inductive relations (see [5]). Finally we relate our work to an earlier construction in descriptive set theory, Kolmogorov's R-operator. The results in this paper were first announced in an unpublished paper 'Stage comparison theorems and game playing with inductive definitions' and a supplement 'The classical R-operator and games'.

0.1. NOTATION. If $R$ is a set, $R(u)$ means $u \in R$. ${}^{n}A$ denotes the set of $n$-tuples of elements of $A$. $(\ )$ is the only element of ${}^{0}A$. ${}^{(\omega)}A = \bigcup_n {}^{n}A$. ${}^{\omega}A$ is the set of all infinite sequences of elements of $A$. $x, y, z, \ldots$ range over $A$; $s, t$ range over ${}^{(\omega)}A$, and $\alpha, \beta$ range over ${}^{\omega}A$. Let $\alpha \restriction n = (\alpha(0), \ldots, \alpha(n-1)) \in {}^{n}A$ for $\alpha \in {}^{\omega}A$ and $n \in \omega$.
1. Quantifiers
By a quantifier on the set $A$ we mean a set $Q$ of subsets of $A$. We sometimes prefer the notation $Qx\,R(x)$ to $R \in Q$. $Q$ is monotone if $X \supseteq Y \in Q \Rightarrow X \in Q$, and $Q$ is non-trivial if $Q \neq \emptyset$ and $\emptyset \notin Q$. Throughout this paper, all quantifiers will be understood to be monotone and non-trivial, unless otherwise indicated. The usual quantifiers are $\exists = \{X \subseteq A \mid X \neq \emptyset\}$ and $\forall = \{A\}$. All quantifiers have certain properties in common with $\exists$ and $\forall$. In particular,

1.1. PROPERTY. If $S$ is a sentence and $R \subseteq A$, then
$$Qz\,(S \wedge R(z)) \Leftrightarrow S \wedge Qz\,R(z),$$
$$Qz\,(S \vee R(z)) \Leftrightarrow S \vee Qz\,R(z).$$

The aim of this paper is to investigate a natural interpretation of infinite strings of quantifiers and to relate it to some earlier ideas. Certain infinite strings of quantifiers have already been interpreted. For example
$$(\exists x_0\,\exists x_1 \cdots), \quad (\forall x_0\,\forall x_1 \cdots), \quad (\forall x_0\,\exists x_1\,\forall x_2\,\exists x_3 \cdots).$$
We shall give an interpretation to $(Qx_0\,Qx_1 \cdots)$ for any quantifier $Q$. This will generalise the three examples $Q = \exists$, $\forall$ and $\forall\exists$, where $\forall\exists$ is the quantifier on $A \times A$ given by
$$\forall\exists xy\,R(x, y) \Leftrightarrow \forall x\,\exists y\,R(x, y).$$
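The definitions of this section can be tried out on a small finite set. The following is only an illustrative sketch and is not part of the paper: the helper names (`powerset`, `dual`, `holds`) and the extra quantifier `MAJORITY` are assumptions introduced here. A quantifier is represented as a Python set of frozensets, `dual` anticipates the notion of dual quantifier introduced later in this section, and the checks exercise Property 1.1.

```python
from itertools import chain, combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

def is_monotone(Q, universe):
    """X ⊇ Y ∈ Q implies X ∈ Q."""
    return all(X in Q for Y in Q for X in powerset(universe) if Y <= X)

def is_nontrivial(Q):
    """Q is non-empty and does not contain the empty set."""
    return len(Q) > 0 and frozenset() not in Q

def dual(Q, universe):
    """The dual quantifier: X is in the dual iff the complement of X is not in Q."""
    full = frozenset(universe)
    return {X for X in powerset(universe) if (full - X) not in Q}

def holds(Q, universe, pred):
    """Qx pred(x): the set of x satisfying pred belongs to Q."""
    return frozenset(z for z in universe if pred(z)) in Q

# Hypothetical example universe and quantifiers (not from the paper).
A = {0, 1, 2}
EXISTS = {X for X in powerset(A) if X}            # all non-empty subsets
FORALL = {frozenset(A)}                           # only A itself
MAJORITY = {X for X in powerset(A) if len(X) >= 2}
```

On this three-element universe, `MAJORITY` is monotone, non-trivial and its own dual, so it behaves like neither of the two usual quantifiers.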
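More elaborate strings of quantifiers may be considered without extra effort except for notational complexity (see §5). Using these strings, we can define a quantifier $Q^*$ on ${}^{\omega}A$ by
$$Q^*\alpha\,R(\alpha) \Leftrightarrow (Qx_0\,Qx_1 \cdots)\,R(x_0, x_1, \ldots).$$
Also we can define quantifiers $Q^{\vee}$ and $Q^{\wedge}$ on ${}^{(\omega)}A$ by
$$Q^{\vee}s\,R(s) \Leftrightarrow (Qx_0\,Qx_1 \cdots) \bigvee_n R(x_0, \ldots, x_{n-1}),$$
$$Q^{\wedge}s\,R(s) \Leftrightarrow (Qx_0\,Qx_1 \cdots) \bigwedge_n R(x_0, \ldots, x_{n-1}).$$
For example, $\exists^*$ and $\forall^*$ are the usual quantifiers on ${}^{\omega}A$, while $\exists^{\vee}$ and $\forall^{\wedge}$ are the usual quantifiers on ${}^{(\omega)}A$. More interestingly, $\forall^{\vee}$ is the Souslin quantifier $\mathcal{S}$,
$$\mathcal{S}s\,R(s) \Leftrightarrow \forall\alpha \bigvee_n R(\alpha \restriction n),$$
and $\exists^{\wedge}$ is the classical $\mathcal{A}$-quantifier,
$$\mathcal{A}s\,R(s) \Leftrightarrow \exists\alpha \bigwedge_n R(\alpha \restriction n).$$
Recently, Moschovakis has used the game quantifier $\mathcal{G} = (\forall\exists)^{\vee}$.

The dual $\breve{Q}$ of a quantifier $Q$ is given by
$$\breve{Q}x\,R(x) \Leftrightarrow \neg Qx\,\neg R(x).$$
$\exists$ and $\forall$ are duals. So are $\mathcal{S}$ and $\mathcal{A}$. Also the dual of $\mathcal{G}$ is $(\exists\forall)^{\wedge}$. One of the results of this paper will be that $\breve{Q}^{\wedge}$ is always the dual of $Q^{\vee}$.

2. Games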
The key to our interpretation of $(Qx_0\,Qx_1 \cdots)$ is the following triviality.

2.1. $Qz\,R(z) \Leftrightarrow (\exists X \in Q)(\forall z \in X)\,R(z)$.

Now
$$(\exists X_0 \in Q)(\forall z_0 \in X_0)(\exists X_1 \in Q)(\forall z_1 \in X_1) \cdots$$
has a natural interpretation in terms of games $G(Q, R)$ for $R \subseteq {}^{\omega}A$ between two players $\exists$ and $\forall$ who make alternate moves, starting with $\exists$, who always picks elements of $Q$, while $\forall$ always responds by picking an element from the last set chosen by $\exists$. As $Q$ is non-trivial, each player can always move. A play of the game produces an infinite sequence $X_0, z_0, X_1, z_1, \ldots, X_n, z_n, \ldots$ such that $z_n \in X_n \in Q$ for all $n$. Such a play is a win for $\exists$ if $R(z_0, z_1, \ldots)$; otherwise it is a win for $\forall$.

2.2. MAIN DEFINITION. $(Qx_0\,Qx_1 \cdots)\,R(x_0, x_1, \ldots)$ if player $\exists$ has a winning strategy in the game $G(Q, R)$.

First notice that $Q^*$, $Q^{\wedge}$ and $Q^{\vee}$ as defined in §1 are all non-trivial monotone quantifiers. Also, the special cases when $Q$ is $\exists$, $\forall$ or $\forall\exists$ agree with the earlier interpretations.
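The role of 2.1 can be seen already for finite strings of quantifiers. The sketch below is an illustration only: it replaces the infinite game $G(Q, R)$ by a hypothetical game with a fixed finite number of rounds, and the names `iterated` and `exists_wins` are introduced here. It evaluates $Qx_0 \cdots Qx_{n-1}\,R(x_0, \ldots, x_{n-1})$ once by nesting the quantifier and once by letting player $\exists$ choose sets in $Q$ and player $\forall$ choose elements of them. The two agree because $Q$ is assumed monotone.

```python
from itertools import product

def iterated(Q, universe, pred, depth, prefix=()):
    """Evaluate Qx_0 ... Qx_{depth-1} pred(x_0, ..., x_{depth-1}) by nesting Q."""
    if depth == 0:
        return pred(prefix)
    good = frozenset(z for z in universe
                     if iterated(Q, universe, pred, depth - 1, prefix + (z,)))
    return good in Q

def exists_wins(Q, universe, pred, depth, prefix=()):
    """Finite-round game: each round, player E picks X in Q, then player A picks z in X.
    Player E wins if pred holds of the sequence of A's moves."""
    if depth == 0:
        return pred(prefix)
    return any(all(exists_wins(Q, universe, pred, depth - 1, prefix + (z,)) for z in X)
               for X in Q)

# Hypothetical example: the "at least two elements" quantifier on a 3-element set.
A = (0, 1, 2)
MAJORITY = {frozenset(c) for c in [(0, 1), (0, 2), (1, 2), (0, 1, 2)]}
```

For instance, with this quantifier the two-round string applied to "$x_0 \neq x_1$" holds, while applied to "$x_0 = x_1$" it fails, under either reading.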
Our first result appears obvious from the notation, but does require a proof.

2.3. THEOREM.
$$Qz\,(Qz_0\,Qz_1 \cdots)\,R(z, z_0, z_1, \ldots) \Leftrightarrow (Qz_0\,Qz_1 \cdots)\,R(z_0, z_1, \ldots).$$

PROOF. Suppose that the right-hand side is true. Then player $\exists$ has a winning strategy $\sigma$ in $G(Q, R)$. Let $X \in Q$ be the first move of $\exists$ following this strategy. Then to each move $z \in X$ of player $\forall$, the strategy $\sigma$ determines a strategy $\sigma_z$ for $\exists$ in the game $G(Q, R_z)$, where $R_z(z_0, z_1, \ldots) \Leftrightarrow R(z, z_0, z_1, \ldots)$. Clearly, $\sigma_z$ is a winning strategy, so that
$$(\exists X \in Q)(\forall z \in X)(Qz_0\,Qz_1 \cdots)\,R_z(z_0, z_1, \ldots).$$
Hence by 2.1, the left-hand side is true.
Conversely, suppose that the left-hand side is true. Then by 2.1, there is an $X \in Q$ such that for each $z \in X$, $\exists$ has a winning strategy $\sigma_z$ in the game $G(Q, R_z)$. Now $\exists$ has the following winning strategy for $G(Q, R)$: start by choosing $X$ and then, if $\forall$ chooses $z \in X$, follow the strategy $\sigma_z$. Hence the right-hand side holds.

If $s, t$ are two sequences, let $s^\frown t$ denote the result of concatenating them. $(\ )$ denotes the empty sequence.

2.4. COROLLARY.
(i) $R(\ ) \vee Qz\,Q^{\vee}s\,R((z)^\frown s) \Leftrightarrow Q^{\vee}s\,R(s)$,
(ii) $R(\ ) \wedge Qz\,Q^{\wedge}s\,R((z)^\frown s) \Leftrightarrow Q^{\wedge}s\,R(s)$.
PROOF.
(ii) As in (i), except use '$\wedge$' instead of '$\vee$'.

If $R \subseteq {}^{(\omega)}A$, let
$$\bigvee R = \{\alpha \in {}^{\omega}A \mid \bigvee_n R(\alpha \restriction n)\}, \qquad \bigwedge R = \{\alpha \in {}^{\omega}A \mid \bigwedge_n R(\alpha \restriction n)\}.$$
Then
$$Q^{\vee}s\,R(s) \Leftrightarrow Q^*\alpha\,\bigvee R(\alpha),$$
$$Q^{\wedge}s\,R(s) \Leftrightarrow Q^*\alpha\,\bigwedge R(\alpha).$$
The games $G(Q, \bigvee R)$ and $G(Q, \bigwedge R)$ are open and closed, respectively, in the terminology of Gale-Stewart [2], and hence by a result of that paper, they are determined, i.e., one of the players has a winning strategy. Hence

2.5. THEOREM. [...]
where $W_\exists(Q, R)$ and $W_\forall(Q, R)$ mean that $\exists$ and $\forall$, respectively, have a winning strategy in the game $G(Q, R)$.
3. Fans
If $S \subseteq {}^{(\omega)}A$ and $s \in {}^{(\omega)}A$, let $S_s = \{z \in A \mid s^\frown(z) \in S\}$. For more insight into the games $G(Q, R)$ we need:

3.1. DEFINITION. $S \subseteq {}^{(\omega)}A$ is a $Q$-fan if
(i) $(\ ) \in S$,
(ii) $s \in S \Rightarrow S_s \in Q$.

We shall see that a $Q$-fan is essentially a strategy for $\exists$ in $G(Q, R)$ while a $\breve{Q}$-fan is a strategy for $\forall$ in the same game.
3.2. THEOREM.
(i) $W_\exists(Q, R) \Leftrightarrow (\exists S)\,[S \text{ is a } Q\text{-fan} \;\&\; \bigwedge S \subseteq R]$,
(ii) $W_\forall(Q, R) \Leftrightarrow (\exists S)\,[S \text{ is a } \breve{Q}\text{-fan} \;\&\; \bigwedge S \subseteq \neg R]$,
where $\neg R = {}^{\omega}A - R$.

PROOF. Given a strategy $\sigma$ for either player, let $S$ be the set of sequences $(z_0, \ldots, z_{n-1})$ of possible moves of player $\forall$ when this strategy is being followed. If $\sigma$ is a winning strategy for $\exists$, then clearly $\bigwedge S \subseteq R$, and if it is a winning strategy for $\forall$, then $\bigwedge S \subseteq \neg R$. Hence in order to prove the implications $\Rightarrow$ in (i) and (ii) it is sufficient to show that $S$ is a $Q$-fan in (i) and a $\breve{Q}$-fan in (ii).
So first let $\sigma$ be a strategy for $\exists$. Trivially, $(\ ) \in S$. If $s \in S$, then $S_s$ is the set of possible moves of $\forall$ after $\exists$ has made his move according to his strategy, given the sequence $s$ of moves of $\forall$. Hence $S_s$ itself must be the set chosen by $\exists$, so that $S_s \in Q$ and hence $S$ is a $Q$-fan.
Now let $\sigma$ be a strategy for $\forall$. Again, $(\ ) \in S$. Given $s \in S$, suppose that $S_s \notin \breve{Q}$. Then $\neg S_s = A - S_s \in Q$ and hence is a possible move for $\exists$ after the sequence $s$ of moves of $\forall$. If $\exists$ makes this move, the strategy $\sigma$ determines an element $z \in \neg S_s$ as $\forall$'s next move. So $s^\frown(z) \notin S$, but this contradicts the fact that $s^\frown(z)$ is a sequence of possible moves of $\forall$ following the strategy $\sigma$. Hence we must have $S_s \in \breve{Q}$, so that $S$ is a $\breve{Q}$-fan.
For the converse implications, let $S$ be a $Q$-fan with $\bigwedge S \subseteq R$. Consider the following strategy for $\exists$. If $\forall$ has played a sequence of moves $s \in S$, then $\exists$ should play $S_s$ as next move. Any response $z \in S_s$ of $\forall$ still gives a sequence $s^\frown(z) \in S$. If $\exists$ follows this strategy, then the moves of $\forall$ will form a sequence in $\bigwedge S \subseteq R$. Hence it is a winning strategy for $\exists$. Thus (i) $\Leftarrow$ is proved.
Finally, let $S$ be a $\breve{Q}$-fan with $\bigwedge S \subseteq \neg R$. Consider the following strategy for $\forall$. If $\forall$ has played a sequence $s \in S$ of moves, and $X \in Q$ is the last move of $\exists$, then as $S_s \in \breve{Q}$, $\neg S_s \notin Q$, so that $X \not\subseteq \neg S_s$, i.e., $X \cap S_s \neq \emptyset$. $\forall$'s strategy is to choose an element of $X \cap S_s$ for his next move. In following this strategy, $\forall$'s moves will form a sequence in $\bigwedge S \subseteq \neg R$. Hence the strategy is a winning strategy for $\forall$, proving (ii) $\Leftarrow$.
We have the following immediate corollaries.
3.4. THEOREM. $Q^{\vee}$ is the dual of $\breve{Q}^{\wedge}$.

This follows from 2.5 and 3.3.

3.5. THEOREM.
$$Q^{\vee}s\,R(s) \Leftrightarrow (\exists S)\,[S \text{ is a } Q\text{-fan} \;\&\; \bigwedge S \subseteq \bigvee R],$$
$$Q^{\wedge}s\,R(s) \Leftrightarrow (\exists S \subseteq R)\,[S \text{ is a } Q\text{-fan}].$$
The last equivalence holds because
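[...] when $S$ is a $Q$-fan.

4. Inductive definitions
Moschovakis [5] has shown that the game quantifier $\mathcal{G} = (\forall\exists)^{\vee}$ can be expressed in terms of an inductive definition. We shall do the same for arbitrary $Q^{\vee}$. An operator $\Phi$ on $A$, mapping subsets of $A$ to subsets of $A$, is monotone if $X \subseteq Y \subseteq A \Rightarrow \Phi(X) \subseteq \Phi(Y)$. The set inductively defined by such $\Phi$ is $\Phi^{\infty} = \bigcap \{S \mid \Phi(S) \subseteq S\}$. More generally, $\Phi^{\infty}(X) = \bigcap \{S \mid X \cup \Phi(S) \subseteq S\}$ if $X \subseteq A$.

4.1. THEOREM. If $\Phi$ is the operator on ${}^{(\omega)}A$ given by $\Phi(S) = \{s \in {}^{(\omega)}A \mid S_s \in Q\}$ for $S \subseteq {}^{(\omega)}A$, then
$$Q^{\vee}s\,R(t^\frown s) \Leftrightarrow t \in \Phi^{\infty}(R).$$

PROOF. Let $X$ be the set of $t \in {}^{(\omega)}A$ satisfying the left-hand side. We must show that $X = \Phi^{\infty}(R)$. By 2.4 (i), $R \cup \Phi(X) \subseteq X$. Hence $\Phi^{\infty}(R) \subseteq X$. To show that $X \subseteq \Phi^{\infty}(R)$, let $t \in X$. Then there is a $Q$-fan $S$ such that $\bigwedge S \subseteq \bigvee \{s \mid R(t^\frown s)\}$. Suppose that $t \notin \Phi^{\infty}(R)$. We shall obtain a contradiction by defining a sequence $z_0, z_1, \ldots$ in $\bigwedge S$ such that $t^\frown(z_0, z_1, \ldots, z_{n-1}) \notin \Phi^{\infty}(R)$ for all $n$.
Let
$$S_0 = \{t^\frown(z) \mid z \in S_{(\ )}\}.$$
Then as $S_{(\ )} \in Q$, $t \in \Phi(S_0)$, so that as $t \notin \Phi^{\infty}(R)$ we must have $S_0 \not\subseteq \Phi^{\infty}(R)$. Hence there is $z_0 \in S_{(\ )}$ such that $t^\frown(z_0) \notin \Phi^{\infty}(R)$. Let
$$S_1 = \{t^\frown(z_0, z) \mid z \in S_{(z_0)}\}.$$
Then as $S_{(z_0)} \in Q$, $t^\frown(z_0) \in \Phi(S_1)$, so that $S_1 \not\subseteq \Phi^{\infty}(R)$. Repeating, we may define $z_0, z_1, \ldots$ such that $z_n \in S_{(z_0, \ldots, z_{n-1})}$ and $t^\frown(z_0, \ldots, z_{n-1}) \notin \Phi^{\infty}(R)$ for all $n$. Hence
$$(z_0, z_1, \ldots) \in \bigwedge S \subseteq \bigvee \{s \mid R(t^\frown s)\},$$
so that $R(t^\frown(z_0, \ldots, z_{n-1}))$ for some $n$, contradicting $t^\frown(z_0, \ldots, z_{n-1}) \notin \Phi^{\infty}(R)$.

4.2. COROLLARY. If $\Phi$ is as in 4.1, then
$$Q^{\vee}s\,R(s) \Leftrightarrow (\ ) \in \Phi^{\infty}(R).$$

5. Generalisations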
Given a sequence $\mathbf{Q} = Q_0, Q_1, \ldots$ of quantifiers on $A$, the infinite string $(Q_0z_0\,Q_1z_1 \cdots)$ may be interpreted using games $G(\mathbf{Q}, R)$ which are like the earlier games $G(Q, R)$, except that player $\exists$ must choose $X_n \in Q_n$ for his $(n+1)$st move. All the results will generalise and we may define quantifiers $\mathbf{Q}^*$, $\mathbf{Q}^{\vee}$ and $\mathbf{Q}^{\wedge}$ as before. The notion of $\mathbf{Q}$-fan is defined in the obvious way; i.e., if $S$ is a $\mathbf{Q}$-fan and $s \in S$ is a sequence of length $n$, then $S_s \in Q_n$.
Another generalisation is perhaps more interesting. Let $\mathbf{Q} = \{Q_a\}_{a \in A}$ be a family of quantifiers on $A$. Then $(Q_az_0\,Q_{z_0}z_1\,Q_{z_1}z_2 \cdots)$ may be interpreted using games $G_a(\mathbf{Q}, R)$ which are like the games $G(Q, R)$, except that $\exists$ starts by choosing $X_0 \in Q_a$, and if $z_n$ is the $(n+1)$st move of player $\forall$, then $\exists$ must choose $X_{n+1} \in Q_{z_n}$ for his next move. Again, all the results generalise, and quantifiers $a$-$\mathbf{Q}^*$, $a$-$\mathbf{Q}^{\vee}$ and $a$-$\mathbf{Q}^{\wedge}$ may be defined for each $a \in A$. Also the notion of an ($a$-$\mathbf{Q}$)-fan $S$ may be defined. This time, $S_{(\ )} \in Q_a$, and if $s^\frown(z) \in S$, then $S_{s^\frown(z)} \in Q_z$.
If $\Phi(X) = \{x \in A \mid X \in Q_x\}$, then $\Phi$ is a monotone operator on $A$ such that $\Phi(\emptyset) = \emptyset$ and $\Phi(A) = A$. As in the proof of 4.1, we can prove

5.1. THEOREM. If $U \subseteq A$, then
$$a \in \Phi^{\infty}(U) \Leftrightarrow a \in U \vee (Q_az_0\,Q_{z_0}z_1 \cdots) \bigvee_n z_n \in U.$$
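On a finite set $A$, Theorem 5.1 can be checked by brute force. The following is a sketch under assumptions that are not in the paper: $A$ is finite, the example family `family` is built from a hypothetical successor relation `succ`, and the infinite open game is replaced by a game with at most $|A|$ rounds, which suffices here because the stages of the induction stabilise after at most $|A|$ steps. `least_fixed_point` computes $\Phi^{\infty}(U)$ by iterating $X \mapsto U \cup \Phi(X)$, and `wins_within` decides whether $\exists$ can force some $z_n \in U$ within the given number of rounds.

```python
from functools import lru_cache
from itertools import chain, combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

def phi(family, universe, X):
    """Phi(X) = {x in A | X in Q_x}."""
    return frozenset(x for x in universe if X in family[x])

def least_fixed_point(family, universe, U):
    """Phi^infinity(U): the least S with U ∪ Phi(S) ⊆ S, computed by iteration."""
    stage = frozenset(U)
    while True:
        nxt = frozenset(U) | phi(family, universe, stage)
        if nxt == stage:
            return stage
        stage = nxt

def wins_within(family, U, a, rounds):
    """Player E starts with some X in Q_a; after A's move z, E continues from Q_z.
    E wins if one of A's moves lands in U within `rounds` rounds."""
    @lru_cache(maxsize=None)
    def win(x, k):
        if k == 0:
            return False
        return any(all(z in U or win(z, k - 1) for z in X) for X in family[x])
    return win(a, rounds)

# Hypothetical example: Q_0 is "every successor of 0", Q_a is "some successor of a" otherwise.
A = {0, 1, 2, 3}
succ = {0: {1, 2}, 1: {3}, 2: {2}, 3: {3}}
family = {a: {X for X in powerset(A) if (succ[a] <= X if a == 0 else succ[a] & X)}
          for a in A}
```

With `U = {3}` the induction stops at `{1, 3}`: element `0` is excluded because player $\forall$ may answer with `2`, from which `3` is never reached.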
More generally, we may characterise arbitrary inductive definitions on $A$ in terms of these strings of quantifiers. Let $\Phi$ be any monotone operator on $A$ and define $\mathbf{Q} = \{Q_a\}_{a \in A}$ by [...]
If $U \subseteq A$, let $U' = U \cup \Phi(\emptyset)$.

5.2. THEOREM.
$$a \in \Phi^{\infty}(U) \Leftrightarrow a \in U' \vee (Q_az_0\,Q_{z_0}z_1 \cdots) \bigvee_n z_n \in U'.$$
6. Applications
The ordinary quantifiers are often used to specify interesting classes of relations on a set (e.g. the $\Sigma^0_n$ and $\Pi^0_n$ relations on $\omega$). Other quantifiers may also be useful in this way. In order to use the quantifiers constructed earlier, we shall assume given a coding of ${}^{(\omega)}A$ in $A$; i.e., an injective mapping ${}^{(\omega)}A \to A$ that assigns $\langle s \rangle \in A$ to each $s \in {}^{(\omega)}A$. Using this coding, we shall assume that $Q^{\wedge}$ and $Q^{\vee}$ are quantifiers on $A$ defined by
$$Q^{\wedge}x\,R(x) \Leftrightarrow (Qx_0\,Qx_1 \cdots) \bigwedge_n R(\langle x_0, \ldots, x_{n-1} \rangle),$$
$$Q^{\vee}x\,R(x) \Leftrightarrow (Qx_0\,Qx_1 \cdots) \bigvee_n R(\langle x_0, \ldots, x_{n-1} \rangle).$$
Also, if $Q_1, \ldots, Q_n$ are quantifiers on $A$, then $Q_1 \cdots Q_n$ is the quantifier on $A$ defined by [...]
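Before quantifiers can be used to specify classes of relations on $A$, we need a basic class of relations to start with. We assume that $\mathcal{R}$ is a class of relations on $A$ satisfying the following conditions.
(1) $\mathcal{R}$ is closed under the boolean operations.
(2) If $R \in \mathcal{R}$ and
$$S(x_1, \ldots, x_n) \Leftrightarrow R(x_{\pi(1)}, \ldots, x_{\pi(m)})$$
for some $\pi : \{1, \ldots, m\} \to \{1, \ldots, n\}$, then $S \in \mathcal{R}$.
(3) The coding $\lambda s\,\langle s \rangle$ is an $\mathcal{R}$-acceptable coding.
To spell out what we mean by this, let us call a function $f : {}^{n}A \to A$ an $\mathcal{R}$-function if, whenever $R \in \mathcal{R}$, then so is $S$, where
$$S(x_1, \ldots, x_n, y_1, \ldots, y_m) \Leftrightarrow R(f(x_1, \ldots, x_n), y_1, \ldots, y_m).$$
Then $\lambda s\,\langle s \rangle$ is $\mathcal{R}$-acceptable if
(3a) For each $n$, $\lambda s\,\langle s \rangle \restriction {}^{n}A$ is an $\mathcal{R}$-function.
(3b) For each $n$, there is an $\mathcal{R}$-function $\lambda x\,(x)_n$ such that $(\langle x_0, \ldots, x_{m-1} \rangle)_n = x_n$ if $n < m$.
(3c) There is an $\mathcal{R}$-function $*$ such that $\langle s \rangle * \langle t \rangle = \langle s^\frown t \rangle$ for $s, t \in {}^{(\omega)}A$.
(3d) If $S \subseteq {}^{m+1}A$ is in $\mathcal{R}$ and $f : {}^{m+1}A \to A$ is an $\mathcal{R}$-function, then $S' \in \mathcal{R}$, where [...]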
This is certainly not the most elegant set of conditions, but it is sufficient for our purposes. The examples we have in mind are
(A) $\mathcal{R}$ is the set of recursive relations on $A = \omega$, in which case the $\mathcal{R}$-functions are just the recursive functions.
(B) $\mathcal{R}$ is the set of elementary (i.e., first order definable) relations on an acceptable structure $\mathfrak{A} = (A, R_1, \ldots, R_l)$ in the sense of Moschovakis [6], in which case the $\mathcal{R}$-functions are just those functions whose graph is elementary.

6.1. DEFINITION. If $Q$ is a quantifier on $A$ and $R \subseteq {}^{n}A$, then $R$ is $Q$-explicit if
$$R(x_1, \ldots, x_n) \Leftrightarrow Qy\,S(y, x_1, \ldots, x_n)$$
for some $S \in \mathcal{R}$.

6.2. EXAMPLES. (A) If $\mathcal{R}$ is the set of recursive relations on $\omega$, then $\exists$-explicit = r.e. = $\Sigma^0_1$, $\forall$-explicit = $\Pi^0_1$, $\forall\exists$-explicit = $\Pi^0_2$, $\exists\forall$-explicit = $\Sigma^0_2$, etc. Recall that $\mathcal{A} = \exists^{\wedge}$ and $\mathcal{G} = (\forall\exists)^{\vee}$. Then $\mathcal{A}$-explicit = $\Sigma^1_1$, $\mathcal{G}$-explicit = $\mathcal{S}$-explicit = $\Pi^1_1$.
(B) If $\mathcal{R}$ is the set of elementary relations on an acceptable structure $\mathfrak{A}$, then the main result of [5] is

6.3. THEOREM (Moschovakis). $R$ is positive first-order inductive iff $R$ is $\mathcal{G}$-explicit.

Our aim here is to generalise this result.

6.4. DEFINITION. (i) $\Phi : P(A) \to P(A)$ is a $Q$-operator if there is an $R \in \mathcal{R}$ and some $\mathcal{R}$-function $f$ such that
$$x \in \Phi(X) \Leftrightarrow Qx_1 \cdots Qx_n\,[R(x, x_1, \ldots, x_n) \vee X(f(x, x_1, \ldots, x_n))].$$
(ii) A relation $R$ is $Q$-inductive if there is a $Q$-operator $\Phi$ and an $\mathcal{R}$-function $f$ such that
$$R(x_1, \ldots, x_n) \Leftrightarrow f(x_1, \ldots, x_n) \in \Phi^{\infty}.$$

Moschovakis's canonical form for positive first-order inductive definitions implies that they are essentially just the $\forall\exists$-operators. Hence we get

6.5. THEOREM. $R$ is positive first-order inductive iff $R$ is $\forall\exists$-inductive.

We can now state our generalisation of 6.3.
6.6. THEOREM. For any quantifier $Q$,
$$R \text{ is } Q\text{-inductive} \Leftrightarrow R \text{ is } Q^{\vee}\text{-explicit}.$$

PROOF. Let $R$ be $Q$-inductive. Then [...] where $g$ is an $\mathcal{R}$-function and
$$x \in \Phi(X) \Leftrightarrow Qy_1 \cdots Qy_m\,(S(x, y_1, \ldots, y_m) \vee X(f(x, y_1, \ldots, y_m))),$$
where $S \in \mathcal{R}$ and $f$ is an $\mathcal{R}$-function. Then, as in the proof of 4.1, we can show that [...] where [...]
Then $\Psi$ may easily be seen to be a $Q$-operator and
$$\langle x, x_1, \ldots, x_n \rangle \in \Psi^{\infty} \Leftrightarrow x \in \Phi^{\infty}(\{x \mid S(x, x_1, \ldots, x_n)\}).$$
Hence [...] so that $R$ is $Q$-inductive.

In [6], Moschovakis has generalised his work on positive first-order inductive relations to what he calls positive $\mathcal{L}^{\mathfrak{A}}(Q)$-inductive relations. Essentially, $\mathcal{L}^{\mathfrak{A}}(Q)$ is the first-order language for $\mathfrak{A}$ augmented with quantifier symbols $Qx$ and $\breve{Q}x$, so that if $\varphi$ is a formula, so are $Qx\,\varphi$ and $\breve{Q}x\,\varphi$. By putting formulae of $\mathcal{L}^{\mathfrak{A}}(Q)$ in prenex form using 1.1, the positive $\mathcal{L}^{\mathfrak{A}}(Q)$-inductive definitions may be seen to be the $\forall\exists Q\breve{Q}$-operators. If $Q$ subsumes $\exists$, i.e., there is an $\mathcal{R}$-function $f$ such that [...] then the $\forall\exists Q\breve{Q}$-operators are just the $Q\breve{Q}$-operators. Hence we have the result
6.7. THEOREM. If $Q$ subsumes $\exists$, then $R$ is positive $\mathcal{L}^{\mathfrak{A}}(Q)$-inductive iff $R$ is $(Q\breve{Q})^{\vee}$-explicit.

An example of such a $Q$ is $\mathcal{A}$, because $\exists x\,R(x) \Leftrightarrow \mathcal{A}x\,R((x)_0)$.
Finally, we wish to relate the ideas in this paper to some earlier work in descriptive set theory. Let $Q$ be a quantifier on $\omega$. If $F = F(0), F(1), \ldots$ is a sequence of subsets of a set $A$, let
$$\Gamma_Q(F) = \{a \in A \mid Qn\,(a \in F(n))\}.$$
Operators of the form $\Gamma_Q$ are called positive analytic (see [3] for details). In the 1920's, Kolmogorov introduced an operator $R$ on positive analytic operations which had the property that $R\,\Gamma_\exists = \Gamma_{\mathcal{A}}$. In fact, it turns out that $R\,\Gamma_Q = \Gamma_{Q^{\wedge}}$ for all $Q$. Our terminology '$Q$-fan' has been taken from the terminology used for the same idea in the definition of $R$. An operator $R^*$ similar to $R$ was later introduced by Ljapunov. This time $R^*\,\Gamma_Q = \Gamma_{(Q\breve{Q})^{\wedge}}$ for all $Q$. In particular, $R^*\,\Gamma_\exists = \Gamma_{\breve{\mathcal{G}}}$, so that the game quantifier is not a stranger to descriptive set theory. For further results relevant to the R-operator, see [4] as well as [3]. For further results on $Q$-inductive relations on $\omega$, see [1].
References
[1] P. Aczel, Representability in some systems of second order arithmetic, Israel J. Math. 8 (1970) 309-328.
[2] D. Gale and F. M. Stewart, Infinite games of perfect information, Ann. Math. Studies 28 (1953) 245-266.
[3] P. G. Hinman, Hierarchies of effective descriptive set theory, Trans. Am. Math. Soc. 142 (1969) 111-140.
[4] P. G. Hinman, The finite levels of the hierarchy of effective R-sets, to appear.
[5] Y. N. Moschovakis, The game quantifier, Proc. Am. Math. Soc. 31 (1971) 245-250.
[6] Y. N. Moschovakis, Elementary induction on abstract structures (North-Holland, Amsterdam, 1974).
SOME CONNECTIONS BETWEEN ELEMENTARY AND MODAL LOGIC
Kit FINE
University of Edinburgh, Edinburgh, Scotland
0. Introduction
A common way of proving completeness in modal logic is to look at the canonical frame. This paper shows that the method is applicable to any complete logic whose axioms express a $\Sigma\Delta$-elementary condition or to any logic complete for a $\Delta$-elementary class of frames. We also prove two mild converses to this result.¹ The first is that any finitely axiomatized logic has axioms expressing an elementary condition if it is complete for a certain class of natural subframes of the canonical frame. The second result is obtained from the first by dropping 'finitely axiomatized', and weakening 'elementary' to '$\Delta$-elementary'. Classical logic is used in the formulation and proof of these results. The proofs are not hard, but they do show that there may be a fruitful and non-superficial contact between modal and elementary logic. Hopefully, more work along these lines can be carried out.
§1 outlines some basic notions and results of modal logic. For simplicity, this is taken to be mono-modal. However, the results can be readily extended to multi-modal logics and, in particular, to tense logic. §2 proves the first of the above results and a related result as well; §3 proves the second of the above results; and finally, §4 constructs counterexamples to some plausible looking converse results.

¹ After writing this paper, I discovered that A. H. Lachlan had already proved the first of these 'mild converses' in [5]. His proof uses Craig's interpolation theorem, whereas mine uses the algebraic characterization of elementary classes. R. I. Goldblatt [4] independently hit upon this latter proof at about the same time as I did. He also has a counter-example to the converse of this result. It is similar to the one in §4. I should like to thank Steve Thomason for the references above and for some helpful comments on the paper.
1. Preliminaries
The reader is assumed to be familiar with elementary logic. [1], for example, contains all of the relevant information. We shall briefly review some of the basic notions and results of modal logic, including its translation into elementary logic. Proofs are straightforward or standard and are therefore omitted. Further details may be found in [7] or [9].

1.1. Syntax
The language $L_\alpha$, $\alpha$ an ordinal, is the set $\{p_\xi : \xi < \alpha\}$ of distinct sentence-letters. Usually only countable languages are considered. In [2], there are languages with only finitely many sentence letters. In this paper, we will consider languages of any cardinality. The (modal) formulas of $L_\alpha$ are constructed from $L_\alpha$ with the help of truth-functional connectives ($\to$ and $\bot$) and the modal connective $\Box$ (necessity). $\Diamond$ (possibility) abbreviates $-\Box-$. All formulas and derivative notions will be relative to a fixed language $L_\alpha$, unless otherwise stated.
A (modal) theory $\Delta$ is a set of modal formulas such that
(i) all tautologous formulas $\in \Delta$;
(ii) all formulas $\Box(A \to B) \to (\Box A \to \Box B) \in \Delta$;
(iii) $A,\ A \to B \in \Delta \Rightarrow B \in \Delta$ (Modus Ponens).
A theory $\Delta$ is normal if
(iv) $A \in \Delta \Rightarrow \Box A \in \Delta$;
and substitutive if
(v) $A \in \Delta \Rightarrow A[B/p_\xi] \in \Delta$, where $A[B/p_\xi]$ is the result of substituting a formula $B$ for $p_\xi$ in $A$.
A theory $\Delta$ is consistent if not both $A$ and $-A \in \Delta$, complete if $A$ or $-A \in \Delta$, and maximally consistent if both consistent and complete. For $\Delta$ a set of formulas, a $\Delta$-theory is simply a theory containing $\Delta$. A modal logic $L$ is a normal substitutive theory. A logic is finitely axiomatizable if it is the smallest logic to contain some finite set of formulas.
Suppose that $A$ is a formula of $L_\alpha$ with sentence letters [...]
Then we may let $A^*$ be the result of substituting $p_i$ for $p_{\xi_i}$, $i = 0, 1, \ldots, n$, or be $A$ itself if $A$ contains no sentence-letters. To each logic $L$ on $L_\omega$ there corresponds a logic $L_\alpha$ on $L_\alpha$, where $L_\alpha = \{\text{formulas } A \text{ of } L_\alpha : A^* \in L\}$. By substitution ((v) above), $L_\alpha$ is the only logic on $L_\alpha$ to conservatively extend $L$, $\alpha \geq \omega$. For this reason we shall often restrict our attention to logics on $L_\omega$.
1.2. Semantics
A model $\mathfrak{A}$ for $L_\alpha$ is a triple $(X, R, v)$, where $X$ (worlds) is a non-empty set, $R$ (accessibility) $\subseteq X^2$, and $v$ (valuation) is a function from $L_\alpha$ into $P(X)$. A frame is a pair $(X, R)$, where $X$ is non-empty and $R \subseteq X^2$. Thus a frame is essentially a model on $L_0$. Given a model $\mathfrak{A} = (X, R, v)$, the domain of $\mathfrak{A}$, $\mathrm{Dom}(\mathfrak{A})$, is $X$ and the frame of $\mathfrak{A}$, the frame upon which it is based, is $(X, R)$. Usually, we shall let the models $\mathfrak{A}$ and $\mathfrak{B}$ be $(X, R, v)$ and $(Y, S, w)$, respectively.
A point is a pair $(\mathfrak{A}, a)$, where $\mathfrak{A}$ is a model and $a$ is a member of $\mathrm{Dom}(\mathfrak{A})$. The truth-definition for modal logic holds between points and formulas. The non-obvious clause is:
$$(\mathfrak{A}, a) \models \Box A \Leftrightarrow (\mathfrak{A}, b) \models A \text{ for each } b \text{ such that } a\,R\,b.$$
For $\Delta$ a set of formulas, $(\mathfrak{A}, a) \models \Delta$, i.e. $\Delta$ is true at $(\mathfrak{A}, a)$, if $(\mathfrak{A}, a) \models A$ for each member $A$ of $\Delta$. A model $\mathfrak{A}$ verifies $A$ ($\Delta$) if $(\mathfrak{A}, a) \models A$ ($\Delta$) for each point $(\mathfrak{A}, a)$. A frame $\mathfrak{F}$ is a frame for $A$ ($\Delta$), or validates $A$ ($\Delta$), if each model based upon $\mathfrak{F}$ verifies $A$ ($\Delta$). $\mathrm{Fr}(\Delta)$ is the class of frames for $\Delta$. A logic $L$ is complete for a class of frames $K$ (alternatively, $K$ determines $L$) if $A \in L$ iff each frame in $K$ validates $A$. A logic $L$ is complete if it is complete for $\mathrm{Fr}(L)$.
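The truth-definition can be made concrete for finite models. The following is a minimal sketch, not taken from the paper: formulas are encoded as nested tuples with the hypothetical tags `"p"`, `"bot"`, `"imp"` and `"box"`, a model is a triple `(X, R, v)` with `R` a set of pairs and `v` a dictionary from indices of sentence-letters to sets of worlds, and `neg` and `dia` are the usual abbreviations.

```python
def holds(model, a, formula):
    """Truth of a modal formula at the point (model, a)."""
    X, R, v = model
    tag = formula[0]
    if tag == "p":                      # sentence-letter p_i
        return a in v.get(formula[1], set())
    if tag == "bot":                    # falsum
        return False
    if tag == "imp":                    # A -> B
        return (not holds(model, a, formula[1])) or holds(model, a, formula[2])
    if tag == "box":                    # A holds at every b with a R b
        return all(holds(model, b, formula[1]) for b in X if (a, b) in R)
    raise ValueError(f"unknown connective: {tag!r}")

def neg(A):
    """-A, defined as A -> bot."""
    return ("imp", A, ("bot",))

def dia(A):
    """Possibility, defined as -box-A."""
    return neg(("box", neg(A)))

def verifies(model, formula):
    """A model verifies a formula if the formula is true at each of its points."""
    return all(holds(model, a, formula) for a in model[0])

# Hypothetical two-world model: 0 sees 1, and p_0 is true only at 1.
example = ({0, 1}, {(0, 1)}, {0: {1}})
```

In this example `("box", ("p", 0))` is true at both worlds (vacuously at `1`, which has no successors), while `dia(("p", 0))` is true only at `0`.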
1.3. Results
We shall require some results on canonical models, generated submodels, and p-morphisms. Each normal theory $T$ determines a model $\mathfrak{A}_T = (X_T, R_T, v_T)$, called the canonical model for $T$, where
$$X_T = \{\Delta : \Delta \text{ is a maximally consistent } T\text{-theory}\};$$
$$R_T = \{(a, b) \in X_T^2 : \{A : \Box A \in a\} \subseteq b\};$$
$$v_T = \{(\xi, Y) \in L_\alpha \times P(X_T) : Y = \{a \in X_T : p_\xi \in a\}\}.$$
The canonical frame $\mathfrak{F}_T$ is $(X_T, R_T)$.
LEMMA 1 (Lindenbaum). Any consistent $T$-theory is included in a member of $X_T$, for $T$ a normal theory.

LEMMA 2 (Truth-Lemma). $(\mathfrak{A}_T, a) \models A \Leftrightarrow A \in a$, for $T$ a normal theory and $a$ in $X_T$.

Given a model $\mathfrak{A} = (X, R, v)$ and $Y \subseteq X$, let $\mathfrak{A} \restriction Y$, the restriction of $\mathfrak{A}$ to $Y$, be $(Y, R \restriction Y, v \restriction Y)$, where the internal restrictions have their obvious sense. $\mathfrak{B} = (Y, S, w)$ is a generated submodel of $\mathfrak{A}$ if $\mathfrak{B} = \mathfrak{A} \restriction Y$ and $b \in Y$ whenever $a \in Y$ and $a\,R\,b$. The notion of generated subframe is similar.
LEMMA 3 (On Generated Submodels). Suppose $\mathfrak{B}$ is a generated submodel of $\mathfrak{A}$. Then $(\mathfrak{B}, a) \models A$ iff $(\mathfrak{A}, a) \models A$ for each $a$ in $\mathrm{Dom}(\mathfrak{B})$.

LEMMA 4. $\mathfrak{F}$ is a frame for $L$ if, for each $a$ in $\mathfrak{F}$, some generated subframe of $\mathfrak{F}$ contains $a$ and is a frame for $L$.

$f$ is a p-morphism from frame $\mathfrak{G} = (Y, S)$ onto frame $\mathfrak{F} = (X, R)$ if
(i) $f$ is a function from $Y$ into $X$;
(ii) $f$ is onto;
(iii) $a\,S\,b \Rightarrow f(a)\,R\,f(b)$ for $a, b \in Y$;
(iv) $f(a)\,R\,f(b) \Rightarrow a\,S\,c$ and $f(c) = f(b)$ for some $c$ in $Y$.
$f$ is a p-morphism from $\mathfrak{B} = (Y, S, w)$ onto $\mathfrak{A} = (X, R, v)$ if, in addition,
(v) $v(\xi) = \{f(a) : a \in w(\xi)\}$.
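For finite frames the four frame conditions can be checked mechanically. The sketch below is only an illustration: the function name `is_p_morphism`, the representation of a frame as a pair `(worlds, relation)` and of $f$ as a dictionary are all assumptions made here. Condition (iv) is checked in the form "whenever $f(a)\,R\,x$ there is a $c$ with $a\,S\,c$ and $f(c) = x$", which agrees with the formulation above when $f$ is onto.

```python
def is_p_morphism(f, G, F):
    """Check conditions (i)-(iv) for f (a dict) from frame G = (Y, S) onto frame F = (X, R)."""
    (Y, S), (X, R) = G, F
    into = all(f[a] in X for a in Y)                          # (i)
    onto = {f[a] for a in Y} == set(X)                        # (ii)
    forth = all((f[a], f[b]) in R for (a, b) in S)            # (iii)
    back = all(any((a, c) in S and f[c] == x for c in Y)      # (iv)
               for a in Y for x in X if (f[a], x) in R)
    return into and onto and forth and back

# Hypothetical examples: a two-element cycle collapses onto a single reflexive point,
# but a single point with no successors does not.
two_cycle = ({0, 1}, {(0, 1), (1, 0)})
dead_end = ({0}, set())
loop = ({"*"}, {("*", "*")})
```

The map sending both points of `two_cycle` to `"*"` satisfies all four conditions, whereas the map from `dead_end` fails (iv), since the image point sees itself but the source point sees nothing.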
LEMMA 5. Suppose that $f$ is a p-morphism from $\mathfrak{B}$ onto $\mathfrak{A}$. Then $(\mathfrak{B}, a) \models A$ iff $(\mathfrak{A}, f(a)) \models A$, for $a$ in $\mathrm{Dom}(\mathfrak{B})$.

LEMMA 6. Suppose that there is a p-morphism $f$ from $\mathfrak{G}$ onto $\mathfrak{F}$. Then $\mathfrak{F}$ is a frame for logic $L$ if $\mathfrak{G}$ is.
1.4. The translation
There is a standard translation from a modal into an elementary language. For this purpose, we may regard the sentence-letters of $L_\alpha$ as monadic predicate letters. The elementary formulas of $L_\alpha$ are then constructed from $L_\alpha$ with the help of variables $x_0, x_1, \ldots$, truth-functional connectives, quantifiers, the identity symbol $=$, and the binary predicate letter $R$ (for accessibility). Given a modal formula $A$ of $L_\alpha$ and a term $t$, we let $A'(t)$ be an elementary formula of $L_\alpha$, where
(i) $p_\xi'(t) = p_\xi t$, $\xi < \alpha$;
(ii) $\bot' = \bot$;
(iii) $(A \to B)'(t) = (A'(t) \to B'(t))$;
(iv) $(\Box A)'(t) = (\forall y)(t\,R\,y \to A'(y))$, where $y$ is the first variable not to occur in $t$.
The models $\mathfrak{A}$ of modal logic are also, of course, models for elementary logic. The standard notions of elementary logic will be used. In particular, $\bar{a}$ will be used in an unrigorous, but obvious, way as a name for the element $a$. The context will always make it clear whether notions of modal or elementary logic are intended.
The semantic significance of the translation is given by:

THEOREM 1. For any point $(\mathfrak{A}, a)$ and modal formula $A$,
$$(\mathfrak{A}, a) \models A \text{ iff } \mathfrak{A} \models A'(\bar{a}).$$
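Clauses (i)-(iv) of the translation amount to a short recursive procedure. The following sketch is an illustration under assumptions not made in the paper: modal formulas are encoded as nested tuples with hypothetical tags `"p"`, `"bot"`, `"imp"` and `"box"`, terms are strings, and the output is a plain ASCII rendering of the elementary formula. Since the term $t$ is always a single variable or constant, "the first variable not to occur in $t$" is `x1` when $t$ is `x0` and `x0` otherwise, so only two variables are ever used.

```python
def translate(formula, t="x0"):
    """Return the elementary formula A'(t) as a string."""
    tag = formula[0]
    if tag == "p":                       # (i)  p_i'(t) = P_i t
        return f"P{formula[1]}({t})"
    if tag == "bot":                     # (ii) bot' = bot
        return "bot"
    if tag == "imp":                     # (iii) (A -> B)'(t) = (A'(t) -> B'(t))
        return f"({translate(formula[1], t)} -> {translate(formula[2], t)})"
    if tag == "box":                     # (iv) (box A)'(t) = (forall y)(t R y -> A'(y))
        y = "x1" if t == "x0" else "x0"  # first variable not occurring in t
        return f"(forall {y})({t} R {y} -> {translate(formula[1], y)})"
    raise ValueError(f"unknown connective: {tag!r}")
```

For example, `translate(("box", ("box", ("p", 0))))` reuses `x0` inside the inner quantifier, which is why two variables suffice however deeply the boxes are nested.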
2. From elementary to canonical
We shall sometimes think of a modal logic in terms of the class of its frames. Thus we say a logic $L$ is elementary ($\Delta$-elementary, $\Sigma\Delta$-elementary) if $\mathrm{Fr}(L)$ is elementary ($\Delta$-elementary, $\Sigma\Delta$-elementary), i.e. if its axioms express the appropriate first-order property. Many of the standard logics are known to be elementary. A logic $L$ is canonical if, for each ordinal $\alpha$, $\mathfrak{F}_{L_\alpha}$ is a frame for $L$. From Lemmas 1 and 2, it follows that every canonical logic is complete. The first result of this section is:

THEOREM 2. Any complete $\Sigma\Delta$-elementary logic is canonical.

This result shows that the success of the canonical frame proofs in [7] is no accident. For all of the logics considered are $\Sigma\Delta$-elementary (indeed, elementary). So the logics, if complete, are also canonical.
The proof proceeds through a series of lemmas. Recall that a set $\Delta$ of elementary formulas in the free variable $x_0$ is satisfied in the model $\mathfrak{A}$ if there is an $a$ in $X = \mathrm{Dom}(\mathfrak{A})$ such that $\varphi(\bar{a})$ is true in $\mathfrak{A}$ for each $\varphi(x_0)$ in $\Delta$; that $\Delta$ is finitely satisfied in $\mathfrak{A}$ if each finite subset of $\Delta$ is satisfied in $\mathfrak{A}$; and that $\mathfrak{A}$ is $\alpha$-saturated, $\alpha$ a cardinal, if any set of formulas $\Delta$ in the free variable $x_0$ and with fewer than $\alpha$ constants $\bar{a}$, $a \in X$, is satisfied in $\mathfrak{A}$ whenever $\Delta$ is finitely satisfied in $\mathfrak{A}$.

LEMMA 7. Any model $\mathfrak{A}$ on $L_\alpha$ has an $\omega$-saturated elementary extension $\mathfrak{B}$.

The proof of this result and its generalisations is well-known from classical model theory. For example, $\mathfrak{B}$ can be obtained from $\mathfrak{A}$ as an ultralimit in the sense of Kochen [6]. In the sequel we shall only use the fact that $\mathfrak{B}$ is 2-saturated.
There are two senses of saturation that are relevant for modal logic. We say that a set of modal formulas $\Delta$ is satisfied in $\mathfrak{A}$ if there is an $a$ in $X = \mathrm{Dom}(\mathfrak{A})$ such that $(\mathfrak{A}, a) \models A$ for each $A$ in $\Delta$; and that $\Delta$ is finitely satisfied in $\mathfrak{A}$ if any finite subset of $\Delta$ is satisfied in $\mathfrak{A}$. Then $\mathfrak{A}$ is modally saturated$_1$ if a set of formulas $\Delta$ is satisfied in $\mathfrak{A}$ whenever it is finitely satisfied in $\mathfrak{A}$. Given the set of modal formulas $\Delta$, let
$$\Diamond\Delta = \{\Diamond(A_0 \wedge \cdots \wedge A_n) : A_0, \ldots, A_n \in \Delta\}.$$
Similarly, let
$$\Box\Delta = \{\Box(A_0 \wedge \cdots \wedge A_n) : A_0, \ldots, A_n \in \Delta\}.$$
Then $\mathfrak{A}$ is modally saturated$_2$ if, whenever $\Diamond\Delta$ is true at a point $(\mathfrak{A}, a)$, then, for some $b$ in $X$, $a\,R\,b$ and $\Delta$ is true at $(\mathfrak{A}, b)$. $\mathfrak{A}$ is modally saturated if it is modally saturated$_1$ and modally saturated$_2$.
LEMMA 8. Any 2-saturated model $\mathfrak{A}$ is also modally saturated.

PROOF. Suppose that $\mathfrak{A}$ is 2-saturated. To show that $\mathfrak{A}$ is modally saturated$_1$, let $\Delta$ be any set of modal formulas that is finitely satisfied in $\mathfrak{A}$. By Theorem 1, $\{A'(x_0) : A \in \Delta\}$ is finitely satisfied in $\mathfrak{A}$. Since $\mathfrak{A}$ is 1-saturated, $\{A'(x_0) : A \in \Delta\}$ is satisfied in $\mathfrak{A}$. So by Theorem 1 again, $\Delta$ is satisfied in $\mathfrak{A}$.
To show that $\mathfrak{A}$ is modally saturated$_2$, let $\Delta$ be any set of modal formulas such that $\Diamond\Delta$ is true at the point $(\mathfrak{A}, a)$. By Theorem 1, [...] is true in $\mathfrak{A}$. So $\{\bar{a}\,R\,x_0\} \cup \{A'(x_0) : A \in \Delta\}$ is finitely satisfied in $\mathfrak{A}$. Since $\mathfrak{A}$ is 2-saturated, there is a $b$ in $\mathrm{Dom}(\mathfrak{A})$ such that $\{\bar{a}\,R\,\bar{b}\} \cup \{A'(\bar{b}) : A \in \Delta\}$ is true in $\mathfrak{A}$. So by Theorem 1 again, $a\,R\,b$ and $\Delta$ is true at $(\mathfrak{A}, b)$.

For each model $\mathfrak{A}$, let the (modal) theory of $\mathfrak{A}$ be $\{A : (\mathfrak{A}, a) \models A \text{ for each } a \text{ in } X\}$. Clearly, the theory of $\mathfrak{A}$ is normal.
LEMMA 9. There is a p-morphism $f$ from any modally saturated model $\mathfrak{B} = (Y, S, w)$ onto the canonical model $\mathfrak{A}_T$ for the theory $T$ of $\mathfrak{B}$.

PROOF. For each $a$ in $Y = \mathrm{Dom}(\mathfrak{B})$, let $f(a) = \{A : (\mathfrak{B}, a) \models A\}$. We verify that $f$ satisfies the five conditions for being a p-morphism.
(i) $f$ is a function from $Y$ into $X_T = \mathrm{Dom}(\mathfrak{A}_T)$. It is easy to show that $f(a)$ is a theory. $f(a)$ is maximally consistent since $-A \in f(a)$ iff $A \notin f(a)$ by the clause for negation. Finally, $f(a) \supseteq T$ since $T$ is the theory of $\mathfrak{B}$.
(ii) $f$ is a function onto $X_T$. Let $\Delta$ be any member of $X_T$. Then $\Delta$ is finitely satisfied in $\mathfrak{B}$. For suppose otherwise, so that there are formulas $A_0, \ldots, A_n$ in $\Delta$ such that $(\mathfrak{B}, a) \models -(A_0 \wedge \cdots \wedge A_n)$ for each $a$ in $Y$. Since $T$ is the theory of $\mathfrak{B}$, $-(A_0 \wedge \cdots \wedge A_n) \in T$. So $-(A_0 \wedge \cdots \wedge A_n) \in \Delta$ by $\Delta$ a theory containing $T$, and $-A_i \in \Delta$, for some $i \leq n$, by $\Delta$ complete. But then $A_i \notin \Delta$ by $\Delta$ consistent, contrary to the original supposition. $\mathfrak{B}$ is modally saturated$_1$, and so $(\mathfrak{B}, a) \models \Delta$ for some $a$ in $Y$. But $\Delta$ is maximally consistent and so $f(a) = \Delta$.
(iii) $a\,S\,b \Rightarrow f(a)\,R_T\,f(b)$ for each $a, b$ in $Y$. Suppose $a\,S\,b$ and $\Box A \in f(a)$. Then $(\mathfrak{B}, a) \models \Box A$. So $(\mathfrak{B}, b) \models A$ and $A \in f(b)$.
(iv) $f(a)\,R_T\,f(b) \Rightarrow (\exists c)(a\,S\,c \text{ and } f(c) = f(b))$. Suppose $f(a)\,R_T\,f(b) = \Delta$. Then $\Diamond\Delta \subseteq f(a)$ and $(\mathfrak{B}, a) \models \Diamond\Delta$. By $\mathfrak{B}$ modally saturated$_2$, there is a $c$ in $Y$ such that $a\,S\,c$ and $(\mathfrak{B}, c) \models \Delta$. But $\Delta$ is maximally consistent and so $f(c) = \Delta = f(b)$.
(v) $v_T(\xi) = \{f(a) : a \in w(\xi)\}$.
$$v_T(\xi) = \{b \in X_T : p_\xi \in b\} \text{ (by Lemma 2)} = \{f(a) : (\mathfrak{B}, a) \models p_\xi\} \text{ (by (ii) and the definition of } f) = \{f(a) : a \in w(\xi)\}.$$

We are now in a position to put all the lemmas together. Suppose that $L$ is a $\Sigma\Delta$-elementary and complete logic (defined on the language $L_\omega$). Let $L_\alpha$ be the corresponding logic on $L_\alpha$, $\alpha$ an ordinal. Then we must show that $\mathfrak{F}_{L_\alpha}$ is a frame for $L$.
Let $\{A_i : i \in I\}$ enumerate all of the $L_\alpha$-consistent formulas. Since $L$ is complete, there is, for each $i$ in $I$, a model $\mathfrak{A}_i = (X_i, R_i, v_i)$ and an element $a_i$ of $X_i$ such that $(\mathfrak{A}_i, a_i) \models A_i$ and $\mathfrak{F}_i = (X_i, R_i)$ is a frame for $L$. Clearly, we can suppose that the sets $\{X_i : i \in I\}$ are pairwise disjoint. Now let $\mathfrak{A} = (X, R, v) = \bigcup \mathfrak{A}_i$, i.e., $X = \bigcup X_i$, $R = \bigcup R_i$ and $v(\xi) = \bigcup v_i(\xi)$. Then by Lemma 4, $\mathfrak{F} = (X, R)$ is a frame for $L$, and the theory of $\mathfrak{A}$ is $L_\alpha$. By Lemma 7, $\mathfrak{A}$ has an $\omega$-saturated elementary extension $\mathfrak{B} = (Y, S, w)$. By $L$ $\Sigma\Delta$-elementary, $\mathfrak{G} = (Y, S)$ is also a frame for $L$; and by Theorem 1, the theory of $\mathfrak{B}$ is also $L_\alpha$. Now $\mathfrak{B}$ is modally saturated by Lemma 8. So by Lemma 9, there is a p-morphism $f$ from $\mathfrak{B}$ onto $\mathfrak{A}_{L_\alpha}$. But then by Lemma 6, $\mathfrak{F}_{L_\alpha}$ is a frame for $L$ and the proof is complete.

A logic $L$ is quasi-elementary (quasi-$\Delta$-elementary, quasi-$\Sigma\Delta$-elementary) if some elementary ($\Delta$-elementary, $\Sigma\Delta$-elementary) class of frames determines the same logic as $\mathrm{Fr}(L)$. Clearly, every elementary ($\Delta$-elementary, $\Sigma\Delta$-elementary) logic is quasi-elementary (quasi-$\Delta$-elementary, quasi-$\Sigma\Delta$-elementary).
Later we shall give an example of a logic that is quasi-elementary but not elementary (or even $\Sigma\Delta$-elementary). We shall now prove:
SOME CONNECTIONS BETWEEN ELEMENTARY
THEOREM 3. Any quasi-$\Delta$-elementary and complete logic L is canonical.
PROOF. Suppose that L is a quasi-$\Delta$-elementary and complete logic, so that L is complete for some $\Delta$-elementary class K of frames. We wish to show that $\mathfrak{F}_{L_\alpha}$ is a frame for L. Let $\Delta_0$ be an arbitrary member of $\mathfrak{F}_{L_\alpha}$, i.e., a consistent and complete $L_\alpha$-theory. Then by Lemma 4, it suffices to show that some generated subframe of $\mathfrak{F}_{L_\alpha}$ contains $\Delta_0$ and is a frame for L. Let $\{A_i : i \in I\}$ enumerate the formulas of $\Delta_0$. Then for each $i$ in $I$ there is a model $\mathfrak{A}_i = (X_i, R_i, v_i)$ and an element $a_i$ of $X_i$ such that $(\mathfrak{A}_i, a_i) \models A_i$ and $\mathfrak{F}_i = (X_i, R_i) \in K$. Let $J_i = \{j \in I : (\mathfrak{A}_j, a_j) \models A_i\}$, $i \in I$, and $F_0 = \{J_i : i \in I\}$. Then $F_0$ has the finite intersection property and so can be extended to an ultrafilter $F$. Now let $\mathfrak{A} = (X, R, v) = \prod \mathfrak{A}_i / F$. By K $\Delta$-elementary, $\mathfrak{F} = (X, R) \in K$; and by Theorem 1 and Łoś's theorem, the theory of $\mathfrak{A}$ contains $L_\alpha$. By Lemma 7, $\mathfrak{A}$ has an $\omega$-saturated elementary extension $\mathfrak{B} = (Y, S, w)$. By K $\Delta$-elementary, $\mathfrak{G} = (Y, S) \in K$; and by Theorem 1 again, the theory of $\mathfrak{B}$ contains $L_\alpha$. Now $\mathfrak{B}$ is modally saturated by Lemma 8; and so by Lemma 9, there is a p-morphism $f$ from $\mathfrak{B}$ onto a submodel $\mathfrak{A}_{L_\alpha} \restriction X'$ of $\mathfrak{A}_{L_\alpha}$. By Lemma 6, $\mathfrak{F}_{L_\alpha} \restriction X'$ is a frame for L. Let $f_0$ be the function on $I$ such that $f_0(i) = a_i$, $i \in I$. Then by Łoś's theorem and Theorem 1, $f(f_0/{\sim}) = \Delta_0$. Finally, $\mathfrak{F}_{L_\alpha} \restriction X'$ is a generated subframe of $\mathfrak{F}_{L_\alpha}$. For suppose that $f(a) = \Delta$ and $\Delta R_{L_\alpha} \Gamma$ for $a$ in $Y$. Then $\Diamond\Gamma$ is finitely satisfied at $(\mathfrak{B}, a)$. So by $\mathfrak{B}$ modally saturated, there is a $b$ in $Y$ such that $a S b$ and $(\mathfrak{B}, b) \models \Gamma$. But $\Gamma$ is maximally consistent and so $f(b) = \Gamma$.

3. From natural to elementary
A canonical model $\mathfrak{A} = (X, R, v)$ has the following two properties: (i) for any distinct $a, b$ in $X$, there is a formula $A$ such that $(\mathfrak{A}, a) \models A$ but $(\mathfrak{A}, b) \models -A$; (ii) if not $a R b$, then there is a formula $A$ such that $(\mathfrak{A}, a) \models \Box A$ and $(\mathfrak{A}, b) \models -A$. We shall call a model with the first property differentiated, a model with the second property tight, and a model with both properties natural. This follows the terminology of [2]. Thomason [11] calls an analogue of natural models 'refined'. Segerberg [9] calls differentiated models 'distinguishable' and uses 'natural' for the canonical logics of §2.

A modal logic is natural if any natural model (on any language) that verifies the logic is based upon a frame for the logic. Any natural logic is canonical. Indeed, a model is natural if it is isomorphic to a submodel of a canonical model that satisfies Lemma 2, i.e., $(\mathfrak{A}, a) \models A$ iff $A \in a$. We know from §2 that any elementary (and complete) logic is canonical. In this section we prove a mild converse to this result:

THEOREM 4. Any finitely axiomatizable natural logic is elementary.

PROOF. The proof uses Kochen's [6] characterization of elementary classes. We could use Shelah's stronger characterization [10], but this hardly simplifies the proof.

LEMMA 10. A class of models K is elementary if (i) it is closed under isomorphism; (ii) it is closed under the formation of ultraproducts; (iii) it is closed under the formation of ultralimits; (iv) its complement (in the class of similar models) is also closed under the formation of ultraproducts; (v) its complement is closed under the formation of ultralimits.

Suppose that L is a natural logic (on the language $\mathcal{L}_\omega$). Then we need to show that K = Fr(L) satisfies the conditions (i)-(v). (i) is trivial. (The scrupulous reader can derive it from Lemma 6.) (ii) and (iii) call for the following lemma:

LEMMA 11. Suppose that each $\mathfrak{A}_i$, $i \in I$, is a model on $\mathcal{L}_\alpha$, that $(I, F)$ is an ultrafilter pair, and that $\mathfrak{A}$ is the ultraproduct $\prod \mathfrak{A}_i / F$. Then there are models $\mathfrak{B}_i$ on a language $\mathcal{L}_\beta$, $\beta \ge \alpha$, such that each $\mathfrak{A}_i$ is the restriction of $\mathfrak{B}_i$ to $\mathcal{L}_\alpha$, $i \in I$, and $\mathfrak{B} = \prod \mathfrak{B}_i / F$ is natural.
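PROOF. For each $a$ in $Y = \mathrm{Dom}(\mathfrak{B})$, select an $f$ in $a$ and let $p_f$ and $q_f$ be fresh and distinct sentence-letters. Choose $\beta > \alpha$ so that each $p_f$ and $q_f$ can be identified with a distinct $p_\xi$, $\alpha \le \xi < \beta$. Suppose $\mathfrak{A}_i = (X_i, R_i, v_i)$, $i \in I$. Then we let $\mathfrak{B}_i = (X_i, R_i, w_i)$, where
$$w_i(\xi) = \begin{cases} v_i(\xi) & \text{for } \xi < \alpha; \\ \{f(i)\} & \text{for } p_\xi = p_f; \\ \{b \in X_i : f(i) R_i b\} & \text{for } p_\xi = q_f. \end{cases}$$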
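Clearly, each $\mathfrak{A}_i$ is the restriction of $\mathfrak{B}_i$ to $\mathcal{L}_\alpha$. So it remains to show that $\mathfrak{B}$ is natural.

First we show that $\mathfrak{B}$ is differentiated. Suppose $a$ and $b$ are distinct members of $Y$. Let $f$ and $g$ be the selected members from $a$ and $b$, respectively. Since $a \ne b$, $F_{a,b} = \{i \in I : f(i) \ne g(i)\} \in F$. Recall that $w_i(\xi) = \{f(i)\}$ for $p_\xi = p_f$. So $(\mathfrak{B}_i, f(i)) \models p_\xi$ for each $i$ in $I$ and $(\mathfrak{B}_i, g(i)) \models -p_\xi$ for each $i$ in $F_{a,b}$. But then by Łoś's theorem and Theorem 1, $(\mathfrak{B}, f/{\sim}) \models p_\xi$ and $(\mathfrak{B}, g/{\sim}) \models -p_\xi$.

We now show that $\mathfrak{B}$ is tight. Suppose that not $a R b$ for $a$ and $b$ in $Y$. Let $f$ and $g$ be the selected members from $a$ and $b$, respectively. Since not $a R b$, $F'_{a,b} = \{i \in I : \text{not } f(i) R_i g(i)\} \in F$. Recall that $w_i(\xi) = \{b \in X_i : f(i) R_i b\}$ for $p_\xi = q_f$. So $(\mathfrak{B}_i, f(i)) \models \Box p_\xi$ for each $i$ in $I$ and $(\mathfrak{B}_i, g(i)) \models -p_\xi$ for each $i$ in $F'_{a,b}$. But then by Łoś's theorem and Theorem 1, $(\mathfrak{B}, f/{\sim}) \models \Box p_\xi$ and $(\mathfrak{B}, g/{\sim}) \models -p_\xi$.

We can now establish (ii). Suppose that $\mathfrak{F}_i \in \mathrm{Fr}(L)$, $i \in I$, and that $\mathfrak{F} = \prod \mathfrak{F}_i / F$ for some ultrafilter pair $(I, F)$. Putting $\mathfrak{A}_i = \mathfrak{F}_i$, $\mathfrak{A} = \mathfrak{F}$ and $\alpha = 0$ in Lemma 11, there are models $\mathfrak{B}_i$ on a language $\mathcal{L}_\beta$ such that $\mathfrak{F}_i$ is the frame of $\mathfrak{B}_i$ and $\mathfrak{B} = \prod \mathfrak{B}_i / F$ is natural. Each $\mathfrak{F}_i$ is a frame for L and so $\mathfrak{B}_i$ verifies L. By Łoś's theorem and Theorem 1, $\mathfrak{B}$ verifies L. But $\mathfrak{F}$ is the frame of $\mathfrak{B}$; and so by $\mathfrak{B}$ natural, $\mathfrak{F} \in \mathrm{Fr}(L)$.

(iii) requires an additional lemma. Let us say that $\{\mathfrak{A}_n : n \in \omega\}$ is an elementary chain of models if, for each $n \in \omega$, the language of $\mathfrak{A}_{n+1}$ contains the language $\mathcal{L}_n$ of $\mathfrak{A}_n$ and $\mathfrak{A}_{n+1}$ restricted to the language $\mathcal{L}_n$ is an elementary extension of $\mathfrak{A}_n$. An elementary chain of models has a union $\bigcup \mathfrak{A}_n$, in the obvious sense, on the language $\bigcup \mathcal{L}_n$.

LEMMA 12. Suppose that $\{\mathfrak{A}_n : n \in \omega\}$ is an elementary chain of models and that $\mathfrak{A} = \bigcup \mathfrak{A}_n$. Then $\mathfrak{A}$ is natural if each $\mathfrak{A}_n$ is natural, $n \in \omega$.

PROOF. $\mathfrak{A}$ is differentiated. For suppose that $a$ and $b$ are distinct members of $\mathrm{Dom}(\mathfrak{A})$. Then for some $n$ in $\omega$, $a, b \in \mathrm{Dom}(\mathfrak{A}_n)$. Since $\mathfrak{A}_n$ is differentiated, there is a formula $A$ of $\mathcal{L}_n$ such that $(\mathfrak{A}_n, a) \models A$ and $(\mathfrak{A}_n, b) \models -A$. So by Theorem 1 and the union of chains lemma applied to $\{\mathfrak{A}_m \restriction \mathcal{L}_n : n \le m < \omega\}$, $(\mathfrak{A}, a) \models A$ and $(\mathfrak{A}, b) \models -A$. The proof that $\mathfrak{A}$ is tight is similar.

We can now establish (iii). Suppose that $\mathfrak{F} \in \mathrm{Fr}(L)$ and that $\{(I_n, F_n) : n \in \omega\}$ is a sequence of ultrafilter pairs. Let $\mathfrak{F}_0 = \mathfrak{F}$, $\mathfrak{F}_{n+1} = \mathfrak{F}_n^{I_n} / F_n$; and let $\mathfrak{F}^\omega$ be the ultralimit of $\mathfrak{F}$ with respect to the ultrafilter sequence $\{(I_n, F_n) : n \in \omega\}$. We wish to show that $\mathfrak{F}^\omega \in \mathrm{Fr}(L)$. By repeated applications of Lemma 11, there is a set of natural models $\{\mathfrak{A}_n : n \in \omega\}$ such that $\mathfrak{F}_n$ is the frame of $\mathfrak{A}_n$, $n \in \omega$. It follows that there is an elementary chain of natural models $\{\mathfrak{B}_n : n \in \omega\}$ such that $\mathfrak{B}_n$ is isomorphic to $\mathfrak{A}_n$, $n \in \omega$, and the frame of $\mathfrak{B} = \bigcup \mathfrak{B}_n$ is isomorphic to $\mathfrak{F}^\omega$. By Lemma 12, $\mathfrak{B}$ is natural. $\mathfrak{B}_0$ verifies L since $\mathfrak{F}_0$ is a frame for L; and so $\mathfrak{B}$ verifies L by $\{\mathfrak{B}_n : n \in \omega\}$ an elementary chain and Theorem 1. Therefore, by L natural, $\mathfrak{F}^\omega$, which is isomorphic to the frame of $\mathfrak{B}$, is a frame for L.

To establish (iv), suppose that $\mathfrak{F}_i \notin \mathrm{Fr}(L)$, $i \in I$, and that $\mathfrak{F} = \prod \mathfrak{F}_i / F$ for some ultrafilter pair $(I, F)$. Since L is finitely axiomatizable, we can suppose it is the smallest logic to contain some formula $A$. So, for each $i \in I$, there is a model $\mathfrak{A}_i$ based upon $\mathfrak{F}_i$ and an element $a_i \in \mathrm{Dom}(\mathfrak{A}_i)$ such that $(\mathfrak{A}_i, a_i) \models -A$. Let $\mathfrak{A} = \prod \mathfrak{A}_i / F$ and $f$ be such that $f(i) = a_i$ for each $i \in I$. Then by Łoś's theorem and Theorem 1, $(\mathfrak{A}, f/{\sim}) \models -A$ and $\mathfrak{F}$ is not a frame for L.

Finally, we establish (v). Suppose $\mathfrak{F} \notin \mathrm{Fr}(L)$ and let $\mathfrak{F}^\omega$ be the ultralimit of $\mathfrak{F}$ with respect to the ultrafilter sequence $\{(I_n, F_n) : n \in \omega\}$. Then there is a formula $A$ of L, a model $\mathfrak{A}$ with frame $\mathfrak{F}$, and a point $a$ in $\mathfrak{A}$ such that $(\mathfrak{A}, a) \models -A$. Let $\mathfrak{A}^\omega$ be the ultralimit of $\mathfrak{A}$ with respect to the ultrafilter sequence $\{(I_n, F_n) : n \in \omega\}$, where $\sim$ is the equivalence relation on $\bigcup \mathrm{Dom}(\mathfrak{A}_n)$ that generates the domain of $\mathfrak{A}^\omega$. Then by Theorem 1 and $\mathfrak{A}^\omega$ elementarily equivalent to $\mathfrak{A}$, $(\mathfrak{A}^\omega, a/{\sim}) \models -A$. But $\mathfrak{F}^\omega$ is the frame of $\mathfrak{A}^\omega$, and so $\mathfrak{F}^\omega \notin \mathrm{Fr}(L)$. This completes the proof of Theorem 4.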
It is worth noting that the proof of (iv) and (v) did not use the fact that $A$ was a modal formula. Any second-order universal closure of an elementary formula would do instead. The proof of Theorem 4 only used the fact that L was finitely axiomatized to establish condition (iv). But K is $\Delta$-elementary if it satisfies conditions (i)-(iii) and (v) of Lemma 10. It immediately follows that:
THEOREM 5. Any natural logic is $\Delta$-elementary.
4. Some counter-examples
The following implications follow from the last two sections:

Natural $\Rightarrow$ $\Delta$-elementary and complete $\Rightarrow$ $\Sigma\Delta$-elementary and complete $\Rightarrow$ canonical.
It would be nice if the first or last of these implications could be reversed. Unfortunately, such results are too good to be true. Let us begin by constructing an elementary and complete logic L that is not natural. L is the smallest logic to contain $\Box p \to p$, $\Box p \to \Box\Box p$, and $\Box\Diamond p \to \Diamond\Box p$. Thus L is the logic S4.1, introduced by McKinsey in [8]. It is easy to verify that Fr(L) is the class of frames satisfying the conditions:

$(\forall x)(x R x)$;
$(\forall x, y, z)(x R y \;\&\; y R z \to x R z)$;
$(\forall x)(\exists y)(x R y \;\&\; (\forall z)(y R z \to y = z))$.

So (1) L is elementary. Now let $\mathfrak{A}$ be the model $(X, R, v)$ on $\mathcal{L}_\omega$ such that $X = \omega$, $R = {\le}$, and $v(n) = \{m \in \omega : m \ge n\}$. The diagram for $\mathfrak{A}$ is:

    p_0        p_0 p_1      p_0 p_1 p_2
     0  ----->    1   ----->    2   -----> ...
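The three frame conditions above can be tested mechanically on finite frames. The following is only an illustrative sketch, not part of the original argument: the names `is_s41_frame`, `holds`, `valid_on_frame`, `CHAIN` and `CLUSTER` are our own, formulas are encoded as nested tuples, and validity is checked by brute force over all valuations of the single letter $p_0$. A finite chain has a final point and validates the McKinsey axiom $\Box\Diamond p \to \Diamond\Box p$, whereas a two-point cluster does neither; the frame $(\omega, \le)$ has no final point at all, since every $n$ has the proper successor $n + 1$.

```python
from itertools import combinations

def is_s41_frame(X, R):
    """Check the three first-order conditions on a finite frame (X, R)."""
    reflexive = all((x, x) in R for x in X)
    transitive = all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)
    # Every point sees a "final" point whose only successor is itself.
    final = all(
        any((x, y) in R and all(z == y for (y2, z) in R if y2 == y) for y in X)
        for x in X
    )
    return reflexive and transitive and final

def holds(model, point, formula):
    """Truth of a formula (nested tuples) at a point of a model (X, R, v)."""
    X, R, v = model
    op = formula[0]
    if op == "p":
        return point in v[formula[1]]
    if op == "not":
        return not holds(model, point, formula[1])
    if op == "imp":
        return (not holds(model, point, formula[1])) or holds(model, point, formula[2])
    if op == "box":
        return all(holds(model, y, formula[1]) for y in X if (point, y) in R)
    if op == "dia":
        return any(holds(model, y, formula[1]) for y in X if (point, y) in R)
    raise ValueError(op)

def subsets(X):
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def valid_on_frame(X, R, formula):
    """Validity for formulas in the single letter p_0: all valuations, all points."""
    return all(holds((X, R, {0: s}), x, formula) for s in subsets(X) for x in X)

P0 = ("p", 0)
MCKINSEY = ("imp", ("box", ("dia", P0)), ("dia", ("box", P0)))

# A finite truncation of (omega, <=) and a two-point cluster.
CHAIN = ([0, 1, 2, 3], {(a, b) for a in range(4) for b in range(4) if a <= b})
CLUSTER = ([0, 1], {(0, 0), (0, 1), (1, 0), (1, 1)})

print(is_s41_frame(*CHAIN), valid_on_frame(*CHAIN, MCKINSEY))
print(is_s41_frame(*CLUSTER), valid_on_frame(*CLUSTER, MCKINSEY))
```

The brute-force check is exponential in the size of the frame and is meant only for very small examples.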
We can establish the following facts about $\mathfrak{A}$.
(a) $\mathfrak{A}$ is differentiated.
PROOF. Suppose $k \ne l$, say $k < l$. Then $(\mathfrak{A}, l) \models p_l$ and $(\mathfrak{A}, k) \models -p_l$.
(b) $\mathfrak{A}$ is tight.
PROOF. Suppose not $l R k$. Then $k < l$ and so $(\mathfrak{A}, l) \models \Box p_l$ but $(\mathfrak{A}, k) \models -p_l$.
(c) $(\mathfrak{A}, l) \models A \Leftrightarrow (\mathfrak{A}, n) \models A$ for $l \ge n \ge 0$ and $A$ a formula of $\mathcal{L}_n$.
PROOF. By induction on the construction of $A$.
(d) $\mathfrak{A}$ verifies L.
PROOF. $(X, R)$ is reflexive and transitive and so $\mathfrak{A}$ verifies each substitution-instance of $\Box p \to p$ and $\Box p \to \Box\Box p$. Also, $\mathfrak{A}$ verifies any substitution-instance $\Box\Diamond A \to \Diamond\Box A$ of $\Box\Diamond p \to \Diamond\Box p$. For suppose $A$ is a formula of $\mathcal{L}_l$ and let $k$ be an arbitrary member of $X$. Set $n = \max(k, l)$. Now either $(\mathfrak{A}, n) \models A$ or $(\mathfrak{A}, n) \models -A$. So by (c) above, $(\mathfrak{A}, m) \models A$ for all $m \ge n$ or $(\mathfrak{A}, m) \models -A$ for all $m \ge n$. But then $(\mathfrak{A}, n) \models \Box A \vee \Box{-A}$, and so $(\mathfrak{A}, k) \models \Box\Diamond A \to \Diamond\Box A$.

By (a), (b) and (d), $\mathfrak{A}$ is a natural model that verifies L. But $(X, R) = (\omega, \le)$ is not a frame for L. Therefore (2) L is not natural. It is easy to show: (3) L is complete; and (1), (2) and (3) complete the counter-example. It is worth noting that, by Theorem 2, (1) and (3) imply that L is canonical. A direct proof of this and, consequently, (3) is given in [7].

Let us now construct a canonical logic L that is not $\Sigma\Delta$-elementary. L is the smallest logic to contain
$$\Diamond\Box p \to (\Diamond\Box(p \wedge q) \vee \Diamond\Box(p \wedge -q)).$$
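Let $\mathfrak{A} = (X, R, v)$ be the canonical model for $L_\alpha$, $\alpha$ an ordinal. Then we can establish the following facts about $\mathfrak{A}$:
(a) For $a \in X$,
$$\Diamond\Box\Delta \subseteq a \;\Rightarrow\; \Diamond\Box(\Delta \cup \{A\}) \subseteq a \ \text{ or } \ \Diamond\Box(\Delta \cup \{-A\}) \subseteq a.$$
PROOF. Suppose $\Diamond\Box\Delta \subseteq a$ but $\Diamond\Box(\Delta \cup \{A\}) \nsubseteq a$. Then for some formulas $B_1, \ldots, B_n \in \Delta$, $\Diamond\Box(B_1 \wedge \ldots \wedge B_n \wedge A) \notin a$. Take any formulas $C_1, \ldots, C_m \in \Delta$. Then
$$\Diamond\Box(B_1 \wedge \ldots \wedge B_n \wedge C_1 \wedge \ldots \wedge C_m \wedge A) \notin a.$$
But
$$\Diamond\Box(B_1 \wedge \ldots \wedge B_n \wedge C_1 \wedge \ldots \wedge C_m) \in a,$$
since $\Diamond\Box\Delta \subseteq a$, and so, by $a \supseteq L_\alpha$,
$$\Diamond\Box(B_1 \wedge \ldots \wedge B_n \wedge C_1 \wedge \ldots \wedge C_m \wedge -A) \in a.$$
Therefore $\Diamond\Box(C_1 \wedge \ldots \wedge C_m \wedge -A) \in a$ and $\Diamond\Box(\Delta \cup \{-A\}) \subseteq a$.
(b) Suppose $\{\Delta_i : i \in I\}$ is a $\subseteq$-chain of sets of formulas and $a \in X$. Then $\Diamond\Box\Delta_i \subseteq a$ for each $i \in I$ $\Rightarrow$ $\Diamond\Box\bigcup\Delta_i \subseteq a$.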
PROOF. Straightforward.
(c) The frame $(X, R)$ of $\mathfrak{A}$ satisfies the condition:
$$(*)\quad (\forall x, y)\bigl(x R y \to (\exists z)(x R z \wedge (\forall u, v)(z R u \wedge z R v \to u = v \wedge y R u))\bigr).$$
PROOF. Suppose $a R b$ for $a, b$ in $X$. Let $\Delta = \{A : \Box A \in b\}$. We can suppose $b R f$ for some $f$ (otherwise, put $z = b$). But then $\Delta$ is consistent, and so $\Diamond\Box\Delta \subseteq a$. From (a) and (b) above and Zorn's lemma, there is a maximally consistent $\Gamma \supseteq \Delta$ such that $\Diamond\Box\Gamma \subseteq a$. So, for some $c$ in $X$, $a R c$ and $\Box\Gamma \subseteq c$. Now suppose $c R d$ and $c R e$. Then $\Gamma \subseteq d, e$ and so, by $\Gamma$ maximally consistent, $d = e = \Gamma$. Also $\Delta \subseteq \Gamma \subseteq d$, and so $b R d$.

Any frame satisfying condition (*) is a frame for L; and so by (c) above:
(1) L is canonical.

To show that L is not $\Sigma\Delta$-elementary, pick a non-principal ultrafilter $F$ in $\omega$. Let $\mathfrak{F} = (X, R)$, where
$$X = \{F\} \cup F \cup \omega; \qquad R = \{(a, b) \in X^2 : b \in a\}.$$
Thus in the frame $\mathfrak{F}$, $F$ is related to its members, which are related to their members.
(a) $\mathfrak{F}$ is a frame for L.
PROOF. Let $\mathfrak{A} = (X, R, v)$ be any model defined upon $\mathfrak{F}$ and suppose that $(\mathfrak{A}, a) \models \Diamond\Box p_0$ for $a$ in $X$. If $a \ne F$, then $(\mathfrak{A}, a) \models \Diamond\Box(p_0 \wedge p_1)$. So suppose $a = F$. Then for some $M$ in $F$, $(\mathfrak{A}, M) \models \Box p_0$; and so $v(0) \cap \omega \in F$. Now either $\omega \cap v(1)$ or $\omega - v(1) \in F$. Assume that $\omega \cap v(1) \in F$ (the other case is similar). Then $M' = \omega \cap v(0) \cap v(1) \in F$. Hence $(\mathfrak{A}, M') \models \Box(p_0 \wedge p_1)$ and $(\mathfrak{A}, F) \models \Diamond\Box(p_0 \wedge p_1)$.
(b) In any elementary submodel $\mathfrak{G} = (Y, R \restriction Y)$ of $\mathfrak{F}$: (i) $F \in Y$; (ii) $F \cap Y$ is non-empty; (iii) $\{n \in \omega \cap Y : M R n \text{ and } M' R n\}$ is infinite for any $M, M'$ in $Y \cap F$.
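PROOF. (i) holds since $F$ is the unique object to satisfy $(\exists y, z)(x R y \wedge y R z)$. (ii) holds since $(\exists y)(F R y)$ is true in $\mathfrak{F}$. (iii) holds since, for each natural number $n > 0$, the elementary sentence asserting that any two $R$-successors of $F$ have at least $n$ common $R$-successors is true in $\mathfrak{F}$.
(c) No countable elementary submodel $\mathfrak{G}$ of $\mathfrak{F}$ is a frame for L.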
PROOF. Suppose that $\mathfrak{G} = (Y, R \restriction Y)$ is a countable elementary submodel of $\mathfrak{F}$, with $\{M_\xi : \xi < \alpha\}$ an enumeration of $Y \cap F$. By b(ii), we can assume that $0 < \alpha \le \omega$. Let $a_0, a_1, \ldots$ and $b_0, b_1, \ldots$ be two sequences in $M_0 \cap Y$ such that $M_n R a_n, b_n$ and $a_n$ and $b_n$ are distinct and not members of $\{a_0, b_0, \ldots, a_{n-1}, b_{n-1}\}$. Such sequences exist by b(iii). Let $\mathfrak{B}$ be a model $(Y, R \restriction Y, v)$ such that $v(0) = M_0 \cap Y$ and $v(1) = \{a_0, a_1, \ldots\}$. $F \in Y$ by b(i); $F R M_0$; $(\mathfrak{B}, M_0) \models \Box p_0$; and so $(\mathfrak{B}, F) \models \Diamond\Box p_0$. Now take any $M$ in $Y$ such that $F R M$. Then $M = M_n$ for some $n < \alpha$. $(\mathfrak{B}, M_n) \not\models \Box(p_0 \wedge p_1)$ since $M_n R b_n$ and $b_n \notin v(1)$; and $(\mathfrak{B}, M_n) \not\models \Box(p_0 \wedge -p_1)$ since $M_n R a_n$ and $a_n \in v(1)$. Hence
$$(\mathfrak{B}, F) \not\models \Diamond\Box p_0 \to (\Diamond\Box(p_0 \wedge p_1) \vee \Diamond\Box(p_0 \wedge -p_1))$$
and $\mathfrak{G}$ is not a frame for L.

By the Skolem-Löwenheim theorem, $\mathfrak{F}$ has a countable elementary submodel $\mathfrak{G}$. So by (a) and (c) above: (2) L is not $\Sigma\Delta$-elementary. This completes the counter-example. The proof that L is canonical also shows that L is quasi-elementary. Consequently, quasi-elementary does not imply $\Sigma\Delta$-elementary.

Some questions remain. Is every natural logic the union of finitely axiomatized natural logics? Say that a logic L is $\alpha$-canonical if $\mathfrak{F}_{L_\xi}$ is a frame for L whenever $\xi < \alpha$. [2] presents an $\omega$-canonical logic that is not $\omega^+$-canonical. But can an $\omega^+$-canonical logic fail to be $\alpha$-canonical for some cardinal $\alpha > \omega$? Does being canonical or being $\Sigma\Delta$-elementary imply being quasi-$\Delta$-elementary? The last two questions are connected. For suppose every $\omega^+$-canonical logic were quasi-$\Delta$-elementary. Then a modification of the proof of Theorem 3 would show that every $\omega^+$-canonical logic was canonical.
References
[1] J. L. Bell and A. B. Slomson, Models and ultraproducts, an introduction (North-Holland, Amsterdam, 1969).
[2] K. Fine, Logics containing K4 I, J. Symbolic Logic, to appear.
[3] K. Fine, Compactness in modal logic, to appear.
[4] R. I. Goldblatt, First-order definability in modal logic, unpublished.
[5] A. H. Lachlan, A note on Thomason's refined structures for tense logic, Theoria, to appear.
[6] S. Kochen, Ultraproducts in the theory of models, Ann. of Math. 74, pp. 221-261.
[7] E. J. Lemmon and D. Scott, Intensional Logic, Preliminary draft of initial chapters by E. J. Lemmon, Dittoed, Stanford University (1966).
[8] J. C. C. McKinsey, On the syntactical construction of modal logic, J. Symbolic Logic 10 (1945) 83-96.
[9] K. Segerberg, An essay in classical modal logic (Uppsala, 1971).
[10] S. Shelah, Every two elementarily equivalent models have isomorphic ultrapowers, Israel J. Math. 10 (1971) 224-233.
[11] S. Thomason, Semantic analysis of tense logics, J. Symbolic Logic 37 (1972) 150-158.
FILTRATIONS AND THE FINITE FRAME PROPERTY IN BOOLEAN SEMANTICS
Bengt HANSSON and Peter GÄRDENFORS
University of Lund, Lund, Sweden
In modal logic, it is often interesting to know whether a certain logic has the so-called finite model property (abbreviated fmp), because it then immediately follows that it is decidable, provided it is finitely axiomatizable. Lemmon and Scott used so-called filtrations [4] to prove that many well-known modal logics have the fmp. The method is presented by Segerberg [6, 7]. For a logic to have the fmp means to be characterized by a class of finite models or, equivalently, that each non-theorem is rejected by some finite model for the logic in question. (We assume familiarity with the basic concepts of modal semantics, in particular with the concepts of a frame and a model, the latter being a frame with an added valuation. Our terminology is explained in detail in [3].) Prima facie the concept of fmp is relative to the kind of models employed, i.e., relational or neighbourhood models in the case of Lemmon and Scott and Segerberg. Nevertheless it can be shown (cf. [3]) that a logic has the fmp in the relational sense iff it has it in the neighbourhood sense and iff it has it in the boolean sense.

It is also possible to define a somewhat stronger variant of the fmp, where for each non-theorem there exists a computable upper bound on the size of the model that falsifies the formula in question. At the cost of this complication we no longer need to know that the logic is finitely axiomatizable in order to conclude that it is decidable. For if a logic has this stronger property, we only need to check whether a certain formula is true in all models smaller than the given bound in order to know whether it is a theorem. In fact, most proofs that have been given that a certain logic has the fmp suffice to show that it has the stronger variant too.

We will not be directly concerned with the fmp, but rather study the finite frame property (ffp). It means that every non-theorem is rejected by some finite frame for the logic. It is trivial that the ffp entails the fmp. In fact, the converse also holds, as shown in [7]. Although the fact that a logic has the ffp is independent of which kind of frames we use, the techniques for proving this may differ in complexity. We will use the boolean semantics developed in [3] to describe a comparatively simple filtration method. In many respects it is a generalization of McKinsey's methods in [5].

A central point is that we know that each classical modal logic is characterized by a single boolean frame, i.e., a pair $(\mathfrak{B}, f)$, where $\mathfrak{B}$ is a boolean algebra and $f$ a function from elements of $\mathfrak{B}$ to elements of $\mathfrak{B}$. Such a characteristic frame can be constructed in the following way: as elements of $\mathfrak{B}$ we take the equivalence classes of provably equivalent formulas in the logic in question, with the boolean operations defined in the obvious way. The value of $f$ for the equivalence class $|A|$ determined by the formula $A$ is the equivalence class $|\Box A|$.

We now fix our attention on a given logic L. For each non-theorem $A$ of L we want to construct a finite frame for L which rejects $A$. Let us therefore look at $A$ and all its subformulas. Each of them determines an equivalence class, i.e., an element of the boolean algebra just mentioned. We form the set of all boolean compounds of these elements and thus obtain a finite subalgebra of $\mathfrak{B}$ which we call $\mathfrak{B}_A$. This algebra is to be the first component of the finite frame $(\mathfrak{B}_A, f_A)$ which we are looking for. As for $f_A$, we look at $f(x)$ for $x$ in $\mathfrak{B}_A$. It may happen that $f(x)$ too is in $\mathfrak{B}_A$, and we then define $f_A(x)$ to be $f(x)$.
If $f(x)$ is not in $\mathfrak{B}_A$, we leave the value of $f_A(x)$ undetermined for the moment. We now have the following lemma:

LEMMA 1. If $\mathfrak{B}_A$ is as above, and $f_A$ is any function on $\mathfrak{B}_A$ which agrees with $f$ whenever $f(x)$ is in $\mathfrak{B}_A$, and $B$ is a formula such that $|B|$ is in $\mathfrak{B}_A$ and $B$ is not valid in $(\mathfrak{B}, f)$, then $B$ is not valid in $(\mathfrak{B}_A, f_A)$.

PROOF. Consider the model obtained from $(\mathfrak{B}, f)$ by adding the valuation $V$ defined by $V(p) = |p|$ for any propositional variable $p$. A formula is true in this model iff it is a theorem of L. But $(\mathfrak{B}, f)$ is characteristic for L and therefore the $B$ of Lemma 1 is not true in the model, i.e., $V(B)$ is not the boolean unit element. $V$ restricted to the variables occurring in $A$ is obviously a valuation on $(\mathfrak{B}_A, f_A)$, which we call $V_A$. When we compute the values of $V_A$ for compound formulas, everything goes as in $(\mathfrak{B}, f)$ as far as propositional connectives are concerned. When we compute $V_A(\Box C)$ given $V_A(C)$, we note that $f(|C|) = |\Box C|$ and since $|\Box C|$ is in $\mathfrak{B}_A$, $f_A(|C|)$ must agree with $f(|C|)$, so $V_A(\Box C) = V(\Box C)$. Therefore, $V_A(B)$ cannot be the unit element.

THEOREM. If, for each formula $A$, there is a function $f_A$ on $\mathfrak{B}_A$ agreeing with $f$ whenever $f(x)$ is in $\mathfrak{B}_A$, such that $(\mathfrak{B}_A, f_A)$ is a frame for L (i.e., a frame which validates all theorems of L), then L has the ffp in the strong sense.

PROOF. Suppose $A$ is a non-theorem of L. Let $(\mathfrak{B}_A, f_A)$ be as in the assumption of the theorem. By Lemma 1, $A$ is not valid in $(\mathfrak{B}_A, f_A)$. Since this frame validates all theorems of L and since the cardinality of $\mathfrak{B}_A$ is at most $2^{2^n}$ if $n$ is the number of subformulas of $A$, the proof is complete.

An application where it is easy to find a suitable $f_A$ is the following example.
EXAMPLE 1. E, the smallest classical modal logic, has the ffp. Since $(\mathfrak{B}_A, f_A)$ is a frame for E for any $f_A$, the result follows immediately from the theorem.

In general, more ingenuity is required to find an adequate $f_A$ for a given logic. A straightforward method is to try to approximate the value of $f$ as closely as possible. In principle this can be done in two ways: either we take the smallest element in $\mathfrak{B}_A$ above $f(x)$ or the greatest one below (note that 'smallest' and 'greatest' have a definite meaning since $\mathfrak{B}_A$ is finite; they simply denote the intersection of all elements above and the union of all those below). The following lemma will provide reason for approximating from below.

LEMMA 2. Let $\mathfrak{B}$ be an arbitrary boolean algebra and $S$ an arbitrary finite subset of $\mathfrak{B}$ closed under intersection. Let $m(x)$ be the union of all $y$'s in $S$ such that $y \le x$. Then $m(x \cap y) = m(x) \cap m(y)$.
PROOF. $m(x \cap y)$ is the union of all $S$-elements below $x \cap y$. Each of these is of course also below $x$. Therefore this union is below or equal to the union of all $S$-elements below $x$, i.e., $m(x)$. Similarly, we obtain $m(x \cap y) \le m(y)$ and hence $m(x \cap y) \le m(x) \cap m(y)$. We now turn to the opposite inclusion. $m(x)$ is the union of all $u$'s in $S$ below $x$ and $m(y)$ the union of all $v$'s in $S$ below $y$. $m(x) \cap m(y)$ is thus an intersection of two unions, which by the distributive laws is equal to the union of all elements of the form $u \cap v$. These are all in $S$ and each of them is below $x \cap y$. Therefore their union is below or equal to the union of all $S$-elements below $x \cap y$, i.e., $m(x \cap y)$. This completes the proof.

We get our approximation from below if we take for $S$ the set of elements of $\mathfrak{B}_A$. This construction is sufficient for many standard logics.
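Lemma 2 can be checked by brute force on a small set algebra. The sketch below is only an illustration under our own assumptions: the boolean algebra is the powerset of a four-element set, elements are Python `frozenset`s, and the helper names (`powerset`, `close_under_intersection`, `m`, `lemma2_holds`) are hypothetical. It also shows that closure of $S$ under intersection cannot simply be dropped.

```python
from itertools import combinations

def powerset(universe):
    """All elements of the powerset algebra of the given finite set."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def close_under_intersection(sets):
    """Smallest family containing the given sets and closed under intersection."""
    closed = set(sets)
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                if a & b not in closed:
                    closed.add(a & b)
                    changed = True
    return closed

def m(S, x):
    """Union of all members of S lying below x (approximation from below)."""
    result = frozenset()
    for y in S:
        if y <= x:
            result |= y
    return result

def lemma2_holds(S, algebra):
    """Check m(x & y) == m(x) & m(y) for all x, y in the algebra."""
    return all(m(S, x & y) == m(S, x) & m(S, y)
               for x in algebra for y in algebra)

UNIVERSE = frozenset(range(4))
ALGEBRA = powerset(UNIVERSE)
S_BAD = {frozenset({0, 1}), frozenset({1, 2})}      # not closed under intersection
S_GOOD = close_under_intersection(S_BAD | {frozenset({2, 3}), UNIVERSE})

print(lemma2_holds(S_GOOD, ALGEBRA))  # the lemma applies
print(lemma2_holds(S_BAD, ALGEBRA))   # fails at x = {0, 1}, y = {1, 2}
```

For `S_BAD` the failure is at $x = \{0, 1\}$, $y = \{1, 2\}$: $m(x \cap y)$ is empty, while $m(x) \cap m(y) = \{1\}$.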
EXAMPLE 2. K, the smallest normal modal logic, has the ffp. The function $f$ in the Lindenbaum frame for K has the following properties:
$$f(1) = 1, \qquad f(x \cap y) = f(x) \cap f(y).$$
We now define $f_A(x)$ as $m(f(x))$ in the sense of Lemma 2 (with $S$ as $\mathfrak{B}_A$). It is clear that $f_A(x) = f(x)$ if $f(x)$ is in $\mathfrak{B}_A$. We proceed to show that $f_A$ fulfils the same conditions as $f$ above. It is immediate that $f_A(1) = 1$ since 1 belongs to $\mathfrak{B}_A$. The other condition follows directly from Lemma 2, so our theorem is applicable.

EXAMPLE 3. The modal logic T has the ffp. In addition to the properties mentioned above, the function $f$ in the Lindenbaum frame for T fulfils
$$f(x) \le x.$$
If we take $f_A(x)$ as $m(f(x))$ again, we only have to show that $f_A(x) \le x$ in order to prove that $(\mathfrak{B}_A, f_A)$ is a frame for T. Since $m(f(x)) \le f(x)$ holds generally, this is immediate.
EXAMPLE 4. The Brouwerian system B has the ffp. Frames for B are characterized by the following condition
$$x \le f(-f(-x))$$
in addition to those for T. With the same definition of $f_A$ we only have to show that $f_A$ fulfils the new condition. By the definition of $m(f(x))$ we get $f_A(-x) \le f(-x)$. Hence $-f(-x) \le -f_A(-x)$. By the general rule that $f(x) \le f(y)$ follows from $x \le y$, which holds in all extensions of K, we conclude that $f(-f(-x)) \le f(-f_A(-x))$. But our assumption $x \le f(-f(-x))$ implies that $x \le f(-f_A(-x))$. $f_A(-f_A(-x))$ is the union of all elements in $\mathfrak{B}_A$ below $f(-f_A(-x))$. One of these is evidently $x$. Hence $x \le f_A(-f_A(-x))$.

In our framework, generalization to many-place operators is quite straightforward and our theorem will work as before. As a simple example we take the following system QP of qualitative probability with the two-place operator (to be interpreted as 'at least as probable as').
+
Modal axioms
Inference rules substitution, modus ponens, replacement of provable equivalents.
EXAMPLE 5. QP has the ffp. A frame for QP is characterized by the following properties:
$f(x, 0) = 1$,
$f(x, 1) \le x$,
$f(x, y) \cap f(y, z) \le f(x, z)$,
if $x \cap z = 0$ and $y \cap z = 0$ then $f(x, y) = f(x \cup z, y \cup z)$.
The problem is to find an $f_A$ which has these properties. As before, $f_A(x, y) = m(f(x, y))$ will do. Only transitivity is not completely trivial. By definition, $f_A(x, y) \cap f_A(y, z)$ is $m(f(x, y)) \cap m(f(y, z))$, which, according to Lemma 2, equals $m(f(x, y) \cap f(y, z))$. It follows that
$$m(f(x, y) \cap f(y, z)) \le m(f(x, z)) = f_A(x, z).$$

When it comes to S4 and its extensions, the iterated modal operators complicate the picture. In order to take care of them, we have to make a slightly more sophisticated choice of $S$ in Lemma 2. Following an idea of McKinsey's, we choose for $S$ the set of elements of $\mathfrak{B}_A$ which are in the range of the function $f$. It is closed under intersection as soon as we are dealing with extensions of K.

EXAMPLE 6. S4 has the ffp. A frame for S4 is characterized by the condition
$$f(x) = f(f(x))$$
in addition to those for T. We define $f_A(x)$ as $m(f(x))$ with the choice of $S$ as above. In principle, we have to check all the T-conditions again, since we have changed the definition of $f_A$, but the intersection property follows directly from Lemma 2 and the other ones go as before. As for the specific S4 condition, we note that $f_A(x)$ is the union of several elements in $S$, say $f(x_1), \ldots, f(x_n)$. For each $i$ we have $f(x_i) \le f_A(x)$ and hence
$$f(x_i) = f(f(x_i)) \le f(f_A(x)).$$
Therefore
$$f_A(x) = \bigcup f(x_i) \le f(f_A(x)).$$
The inverse inclusion follows trivially, so in fact we have $f_A(x) = f(f_A(x))$. Since $f_A$ coincides with $f$ when its value is in $\mathfrak{B}_A$, we also have $f_A(f_A(x)) = f_A(x)$ and we are ready.
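The construction of Example 6 can be tried out on a small concrete case. The sketch below rests on assumptions that are ours, not the paper's: instead of the Lindenbaum frame it starts from the powerset algebra of a four-element chain, with $f$ the box operator of the reflexive and transitive relation $\le$, and it takes for $\mathfrak{B}_A$ the subalgebra generated by two arbitrarily chosen sets. All names (`box`, `generate_subalgebra`, `f_A`, `conditions_hold`) are hypothetical.

```python
from itertools import combinations

W = frozenset(range(4))
R0 = {(a, b) for a in W for b in W if a <= b}   # reflexive and transitive

def box(x):
    """The operator f on the powerset algebra of W induced by R0."""
    return frozenset(w for w in W if all(v in x for v in W if (w, v) in R0))

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def generate_subalgebra(generators):
    """Close the generators under complement and intersection."""
    algebra = {frozenset(), W} | set(generators)
    changed = True
    while changed:
        changed = False
        for a in list(algebra):
            for b in list(algebra):
                for c in (W - a, a & b):
                    if c not in algebra:
                        algebra.add(c)
                        changed = True
    return algebra

B_A = generate_subalgebra({frozenset({1, 3}), frozenset({0})})
RANGE_OF_F = {box(x) for x in subsets(W)}
S = B_A & RANGE_OF_F            # elements of B_A in the range of f

def f_A(x):
    """m(f(x)): the union of all members of S below f(x)."""
    result = frozenset()
    for y in S:
        if y <= box(x):
            result |= y
    return result

def conditions_hold():
    """K-, T- and S4-conditions for f_A, plus agreement with f where possible."""
    return f_A(W) == W and all(
        f_A(x) in B_A
        and f_A(x) <= x
        and f_A(f_A(x)) == f_A(x)
        and (box(x) not in B_A or f_A(x) == box(x))
        and all(f_A(x & y) == f_A(x) & f_A(y) for y in B_A)
        for x in B_A
    )

print(conditions_hold())
print(sorted(box(frozenset({1, 3}))), sorted(f_A(frozenset({1, 3}))))
```

In this example $f(\{1, 3\}) = \{3\}$ lies outside $\mathfrak{B}_A$, so $f_A$ has to approximate it from below, here by the empty set.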
Up to now we have explicitly based our constructions on the Lindenbaum frame for the relevant logic. This is, however, not necessary. We need not even start with a characteristic frame, for any frame for the logic and any model on that frame which falsifies $A$ will do. We then let $\mathfrak{B}_A$ be the boolean closure of the set of elements assigned to $A$ and its subformulas according to this model. Our theorem will hold for this $\mathfrak{B}_A$ too. We use this generalization in the following example.
EXAMPLE 7. S4.3 has the ffp. We know from [1] that there is a frame for S4.3 where all the $f$-values are linearly ordered. (The Lindenbaum frame does not have this property, though.) Conversely, any frame in which the $f$-values are linearly ordered is a frame for S4.3. We therefore continue to prove that they are so when $f_A$ is as in Example 6. Suppose $f(x) \le f(y)$. $f_A(x)$ is the union of all $f$-values in $\mathfrak{B}_A$ below $f(x)$. All of these are below $f(y)$ too, so $f_A(y)$ is the union of all these $f$-values and perhaps some more. Hence $f_A(x) \le f_A(y)$.

This is perhaps also the place to point out the connection with Bull's famous result [2] that all normal extensions of S4.3 have the ffp (note that a 'model' for Bull is a 'frame' for us). What Bull does is, in our terminology, to give a general method of constructing $f_A$, a method which he proves to work for every normal extension of S4.3.

Finally, we would like to call attention to the possibility of using the fmp and ffp in proving completeness results. Usually, a logic is first proved to be complete and a specific model (e.g. the so-called canonical model) is then used for obtaining the fmp. If we work in a boolean framework we are always guaranteed that we have a characteristic boolean frame to start with. Since it is proved in [3] that a finite boolean frame is isomorphic to a finite relational frame (if the logic is normal), this fact is sufficient for us to conclude that a logic with the fmp is complete with respect to a class of relational frames. The relational frames can be explicitly constructed by the following rule: the elements (worlds) are to be the atoms of the boolean algebra (they exist since the algebra is finite). The relation $R$ is to hold between $x$ and $y$ if and only if $y \le z$ holds for each $z$ in the boolean algebra such that $x \le f_A(z)$.

We thus see that e.g. T is characterized by a class of reflexive relational frames, since the fact that $f_A(x) \le x$ holds entails that $x \le z$ holds whenever $x \le f_A(z)$ holds, which by the rule above means that $x R x$ holds. Since the T-axioms trivially hold in any reflexive frame, it follows that T is characterized by the set of all reflexive frames. Similarly, the transitivity of $R$ follows from the S4 axiom. For assume $x R y$ and $y R z$. Then $y \le f_A(w)$ whenever $x \le f_A^2(w)$, and $z \le w$ whenever $y \le f_A(w)$, i.e., $z \le w$ whenever $x \le f_A^2(w) = f_A(w)$ (by the typical S4 condition on $f_A$), whence $x R z$. As above we conclude that S4 is characterized by the set of all transitive and reflexive frames.
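The rule for passing from a finite boolean frame to a relational frame can be made concrete as follows. This is a hedged sketch under our own assumptions: the finite boolean frame is the full powerset algebra of a three-element set with $f$ the box operator of a chosen relation `R0`, so that the construction should give back `R0` itself; the function names (`box_operator`, `atoms`, `relational_frame`) are hypothetical.

```python
from itertools import combinations

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def box_operator(universe, relation):
    """The operator f(z) = {w : every successor of w lies in z}."""
    def f(z):
        return frozenset(w for w in universe
                         if all(v in z for v in universe if (w, v) in relation))
    return f

def atoms(algebra):
    """Minimal non-empty elements of a finite algebra of sets."""
    return [x for x in algebra if x and not any(y and y < x for y in algebra)]

def relational_frame(algebra, f):
    """Worlds are the atoms; x R y iff y <= z for every z with x <= f(z)."""
    worlds = atoms(algebra)
    relation = {(x, y) for x in worlds for y in worlds
                if all(y <= z for z in algebra if x <= f(z))}
    return worlds, relation

UNIVERSE = frozenset(range(3))
R0 = {(0, 1), (1, 2), (2, 2)}
ALGEBRA = subsets(UNIVERSE)

worlds, relation = relational_frame(ALGEBRA, box_operator(UNIVERSE, R0))
# Each atom is a singleton here, so it can be named by its only member.
recovered = {(min(x), min(y)) for (x, y) in relation}
print(sorted(recovered))
```

In the paper the rule is applied to $(\mathfrak{B}_A, f_A)$; the powerset example is chosen only because the expected answer is known in advance.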
It has sometimes been argued that the Lindenbaum frame is uninteresting since it is so uninformative. That it is uninformative can hardly be denied, but we hope that this paper has given some reason to think that it is not uninteresting, because it can serve as a starting point for completeness and decidability results. It can be compared with the conceptually more complicated canonical model (in neighbourhood semantics) which is also to a large extent ad hoc, but nevertheless quite useful in many applications.
References
[1] R. A. Bull, An algebraic study of Diodorean modal systems, J. Symbolic Logic 30 (1965) 58-64.
[2] R. A. Bull, That all normal extensions of S4.3 have the finite model property, Z. Math. Logik Grundlagen Math. 12 (1966) 341-344.
[3] B. Hansson and P. Gärdenfors, A guide to intensional semantics, in: Modality, morality and other problems of sense and nonsense, Essays dedicated to Sören Halldén (Lund, 1973) 151-167.
[4] E. J. Lemmon and D. Scott, Intensional logic, Preliminary draft of initial chapters by E. J. Lemmon (dittoed), Stanford University (1966).
[5] J. C. C. McKinsey, A solution to the decision problem for the Lewis systems S2 and S4, with an application to topology, J. Symbolic Logic 6 (1941) 117-134.
[6] K. Segerberg, Decidability of S4.1, Theoria 34 (1968) 7-20.
[7] K. Segerberg, An essay in classical modal logic, Filosofiska studier utgivna av Filosofiska föreningen och Filosofiska institutionen vid Uppsala universitet nr. 13, Uppsala (1971).
SYSTEMATIZING DEFINABILITY THEORY
Jaakko HINTIKKA and Veikko RANTALA
Academy of Finland, Helsinki, Finland
1. Preliminaries

There exists a fairly extensive literature on the different kinds of definability (identifiability)¹ in first-order theories. (See, e.g., [12].) Especially during the last few decades there has been progress in the questions of definability. This progress is mostly due to some logicians, as is the clarifying of the concept of definability. The purpose of this work is to show that it is possible to systematize (at least to some extent) the questions of the different kinds of definability in first-order (finitary) logic. The main syntactical tool used in this enterprise will be the theory of distributive normal forms developed by Hintikka. (See [3, 4, 6, 7].) The import of distributive normal forms for these questions can be seen in [7]. Here this line of thought will be continued.

¹ In econometrics, questions of definability are customarily referred to as problems of identifiability. In fact, our main problems may be considered as generalizations of econometricians' identifiability problems. (For these problems, see e.g. [2].) In this work, we shall not examine the interrelations between logicians' and econometricians' problems beyond borrowing the handy term 'identifiability' from the latter.

For a given first-order theory, one can construct certain expressions which have the form of a constituent but which can be inconsistent. In spite of their inconsistency, these expressions can be interpreted model-theoretically so that we can obtain model-theoretic arguments for results concerning certain familiar kinds of identifiability, instead of the syntactic proofs in [7]. A fully explicit treatment of the model-theoretical import of such inconsistent expressions would obviously require recourse to ideas somewhat different from those of traditional model theory. A set of tools appropriate to this task is in fact found in the surface semantics of Hintikka [8], suitably extended. It would take us too far in this survey to review Hintikka's theory in any detail. For this reason, the model-theoretic significance of the ideas developed here will remain partly implicit. Although the focus of our concepts will thus seem to be syntactic, it is hoped that the main outlines of their model-theoretical applications are apparent enough. In any case, our methods enable us to obtain a convenient overview of a number of earlier results concerning different kinds of definability (identifiability) in first-order theories. As a preparation for this overview, a brief summary of some of the relevant results is presented below in Section 4.

In most of what follows, we shall be considering an arbitrary fixed first-order theory T. For the sake of expositional simplicity, we shall usually formulate our definitions and results first on the basis of a number of simplifying assumptions concerning T. They are the following:

(a) T is finitely axiomatizable.
(b) T does not contain individual constants.
(c) T does not contain free individual variables.
(d) T contains a finite number of non-logical constants.
Of these, (c) is trivial and needs no special comment, while (d) will be adhered to through most of the discussion. As to the others, it will be indicated in each case what changes in our approach are necessitated by giving them up.

We shall normally study the identifiability of one particular one-place predicate $P$ occurring in $T$ in terms of the other constants of $T$, the set of which will be called $\varphi$. Occasionally, the presence of these different constants in a theory or in a sentence will be indicated by writing them out as arguments in the obvious way, e.g. $T(\varphi, P)$.

The number of layers of quantifiers in a sentence or other expression will be called its depth. In other words, the depth of $F$ is the maximum length of nested sequences of quantifiers in $F$. (In [9, pp. 18-19, 142 (note 33)] and elsewhere, it is indicated how this definition of depth can be sharpened somewhat. However, these refinements are not needed in the present work.) We shall often indicate the depth of our expressions by superscripts in parentheses, e.g. $T^{(d)}$ would be a (finitely axiomatizable) theory of depth $d$.

The assumption that $P$ is monadic is not restrictive, since the results in this work can be generalized to the case in which $P$ is not monadic.

2. Distributive normal forms
Distributive normal forms may be thought of as operating as follows: Each sentence $S$ is split up into a number of disjuncts called constituents, each of which describes one of the several alternatives concerning the world that $S$ admits of. A constituent $C$ does this by giving us a ramified list of all the different kinds of finite sequences (of a fixed length) of individuals that can be found in a model which satisfies $C$. This length equals the depth of $C$. Moreover, each sentence of depth $d$ can be transformed into a disjunction of constituents of the same depth with the same non-logical constants. When depth is increased, each constituent is split into a disjunction of deeper constituents. Given a finitely axiomatized theory $T$ of depth $d$, we thus obtain the kind of tree-like analysis of $T$ shown in Fig. 1.

Fig. 1.

When $T$ is a tautology of depth zero, the tree diagram of Fig. 1 gives us an analysis of the whole first-order language using the same non-logical constants as $T$. For simplicity, we shall often assume that all the inconsistent constituents have been eliminated from Fig. 1. Although this cannot be accomplished effectively, we can use this assumption in defining our metalogical concepts.
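The splitting of a language into constituents can be made concrete in the simplest possible case. The following Python sketch is only an illustration under assumptions not made in the text: the vocabulary is purely monadic, the depth is 1, and a model is encoded as a hypothetical dictionary from individuals to the set of predicates they satisfy. Under these assumptions a "kind of individual" is a full choice of signs for the predicates, and a constituent of depth 1 is a non-empty set of kinds, read as "exactly these kinds are realized". All function names are illustrative.

```python
from itertools import combinations, product

def kinds(predicates):
    """All kinds of individuals: one sign (true/false) per monadic predicate."""
    return [frozenset(p for p, s in zip(predicates, signs) if s)
            for signs in product([True, False], repeat=len(predicates))]

def constituents(predicates):
    """All depth-1 constituents: the non-empty sets of realized kinds."""
    ks = kinds(predicates)
    return [frozenset(c) for r in range(1, len(ks) + 1)
            for c in combinations(ks, r)]

def constituent_of(model):
    """The depth-1 constituent that is true in a given non-empty model."""
    return frozenset(frozenset(props) for props in model.values())

# Hypothetical example: vocabulary {Q, P}, three individuals.
model = {"a": {"Q", "P"}, "b": {"Q"}, "c": {"Q", "P"}}
all_constituents = constituents(["Q", "P"])
true_here = [c for c in all_constituents if c == constituent_of(model)]
```

With two monadic predicates there are 4 kinds and 15 constituents, and exactly one of them is true in the sample model, in line with the informal description of constituents as mutually exclusive alternatives.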
Any two constituents of the same depth are mutually incompatible. (That is, one of them always implies the negation of the other. Of course, if inconsistent constituents are admitted, two constituents can none the less be equivalent, viz. if they are both inconsistent.) Hence all the constituents in Fig. 1 which are compatible with a given one, say $C_0^{(d+e)}$, lie in branches passing through $C_0^{(d+e)}$. Each complete consistent theory compatible with $T$ is formulated by the sentences of some branch of Fig. 1. If inconsistent constituents are assumed to be eliminated, the converse relation also holds: the sentences of any one branch constitute a complete consistent theory compatible with $T$. Thus many questions concerning $T$ are reduced to questions concerning the branches of Fig. 1. In the sequel, it will be assumed that the following is an arbitrary branch (of consistent constituents) of this kind:

\[
C_0^{(d)} \leftarrow C_0^{(d+1)} \leftarrow \cdots \leftarrow C_0^{(d+e)} \leftarrow \cdots \tag{1}
\]

Arrows indicate here logical implications. Normally, they cannot be strengthened into equivalences.

When we are considering a theory $T$ which is not finitely axiomatizable, perhaps not axiomatizable at all, pretty much the same things can still be said, provided that only a finite number of non-logical constants occur in $T$. Then $T$ is no longer equivalent to the disjunction of all the constituents $C^{(d+e)}$ of each fixed depth $d+e$ in Fig. 1. Rather, the constituents of a given depth $d+e$ occurring in Fig. 1 are all the different constituents of this depth (with the same non-logical constants as $T$) logically compatible with $T$. Again, all the different complete theories compatible with $T$ correspond one-to-one with the branches of (consistent) constituents in Fig. 1.

It remains to explain the structure of constituents in less informal terms. We shall first consider the case in which no individual constants are present. For later purposes, we shall highlight the role of the special monadic predicate $P$ whose definability we are studying here. Then an arbitrary constituent $C_0^{(d+e)}$ can be said to be of the following form:

\[
\begin{aligned}
&(Ex_1)\,(Px_1 \wedge Ct_{\alpha_1}^{(d+e-1)}(\varphi, P, x_1)) \wedge (Ex_1)\,(Px_1 \wedge Ct_{\alpha_2}^{(d+e-1)}(\varphi, P, x_1)) \wedge \cdots \\
&\wedge (Ex_1)\,(\sim Px_1 \wedge Ct_{\beta_1}^{(d+e-1)}(\varphi, P, x_1)) \wedge (Ex_1)\,(\sim Px_1 \wedge Ct_{\beta_2}^{(d+e-1)}(\varphi, P, x_1)) \wedge \cdots \\
&\wedge (x_1)\,[(Px_1 \wedge Ct_{\alpha_1}^{(d+e-1)}(\varphi, P, x_1)) \vee (Px_1 \wedge Ct_{\alpha_2}^{(d+e-1)}(\varphi, P, x_1)) \vee \cdots \\
&\qquad\quad \vee (\sim Px_1 \wedge Ct_{\beta_1}^{(d+e-1)}(\varphi, P, x_1)) \vee (\sim Px_1 \wedge Ct_{\beta_2}^{(d+e-1)}(\varphi, P, x_1)) \vee \cdots].
\end{aligned} \tag{2}
\]
Here each conjunction $\pm Px_1 \wedge Ct^{(d+e-1)}(\varphi, P, x_1)$ is called an attributive constituent (in the vocabulary $\varphi + \{P\}$).¹ Their structure resembles closely that of (2). In fact, an arbitrary attributive constituent of depth $d+e-i+1$ with the set of constants $\varphi + \{P\}$ and with the free individual variables $x_1, x_2, \ldots, x_{i-1}$ is of the following form:

\[
\begin{aligned}
&(Ey)\,(Py \wedge Ct_{\alpha_1}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \wedge (Ey)\,(Py \wedge Ct_{\alpha_2}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \wedge \cdots \\
&\wedge (Ey)\,(\sim Py \wedge Ct_{\beta_1}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \wedge (Ey)\,(\sim Py \wedge Ct_{\beta_2}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \wedge \cdots \\
&\wedge (y)\,[(Py \wedge Ct_{\alpha_1}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \vee (Py \wedge Ct_{\alpha_2}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \vee \cdots \\
&\qquad\quad \vee (\sim Py \wedge Ct_{\beta_1}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \vee (\sim Py \wedge Ct_{\beta_2}^{(d+e-i)}(\varphi, P, x_1, x_2, \ldots, x_{i-1}, y)) \vee \cdots] \\
&\wedge (\pm A_1 \wedge \pm A_2 \wedge \cdots).
\end{aligned} \tag{3}
\]

¹ Our notation is therefore somewhat deviant, for usually the symbol $Ct$ is used for the attributive constituents themselves, not for certain parts of theirs, as here. We do not see any serious danger of confusion here, however. The deviation is simply due to our desire to highlight the special role of the predicate $P$.
Here the $A_i$ are all the atomic formulas which we can build up of the members of $\varphi$, $P$, and $x_1, x_2, \ldots, x_{i-1}$ and which contain $x_{i-1}$. The symbol $\pm$ is a placeholder for the negation-symbol or for nothing at all, in all the combinations compatible with the general restrictions that may be put on constituents. In order to make an overview of (3) easier, $x_i$ has been replaced by $y$ in (3).

(3) may be considered a (course-of-values) recursive definition of an attributive constituent of a given depth in terms of shallower attributive constituents. These have of course the same constants, but in (3) they contain more free individual variables, whence the course-of-values character of the definition. When $d+e-i+1 = 0$, only the last conjunction (that of unnegated or negated atomic formulas) occurs in (3). This gives us a basis for the inductive definition. This definition is of course relative to the vocabulary $\varphi + \{P\}$. For other selections of non-logical constants, attributive constituents are defined analogously.
We shall disregard the order of conjuncts in an attributive constituent as well as possible repetitions of conjuncts. Furthermore, the naming of bound variables (barring conflicts between them) is likewise considered inessential. This suffices to define an attributive constituent.

From the concept of an attributive constituent we obtain readily that of a constituent (with free individual variables or with individual constants). All we have to do is to conjoin (3) with a conjunction of the form

\[
\pm B_1 \wedge \pm B_2 \wedge \cdots
\]

where $\pm$ again stands for a negation-sign or for nothing at all and where the $B_i$ are all the atomic formulas that can be built from the members of $\varphi$, from $P$, and from the variables $x_1, x_2, \ldots, x_{i-1}$. This gives us constituents with free individual variables. Constituents with individual constants are obtained from them by simple substitution.

Our notation for constituents with individual constants is very simple: whenever necessary, we just write out the constants as arguments. Likewise free individual variables may be written out, as may non-logical constants. Thus (1) may be written out more explicitly:

\[
C_0^{(d)}(\varphi, P) \leftarrow C_0^{(d+1)}(\varphi, P) \leftarrow \cdots \leftarrow C_0^{(d+e)}(\varphi, P) \leftarrow \cdots
\]
It is perhaps not very hard to see how the formal structure of constituents hangs together with the informal explanation given earlier in this section in terms of the different sequences of individuals one can find in a world (model) in which the constituent in question is true. Let us look at (2) and (3) for the purpose, assuming that the latter occurs as a part of the former. Clearly the different initial existential quantifiers of (2) list all the different kinds of individuals (sequences of individuals of length one) that one can find in a world in which (2) is true. Likewise, in (3) the different initial quantifiers may be thought of as listing the different kinds of next step (next individual to be found) after we have already found the individuals which are the values of $x_1, x_2, \ldots, x_{i-1}$. Assuming that a constituent has been fully written out, we can thus say that each sequence of nested quantifiers occurring in it corresponds to a sequence of individuals which one can find in a world in which the constituent is true. This sequence is described by all the atomic formulae in the constituent whose individual variables are bound to these quantifiers. Conversely, each different sequence of $d$ individuals that can be found in a world in which (3) is true will have to correspond in this sense to a nested sequence of $d$ quantifiers in (3). That is, the atomic formulae whose variables are bound to these quantifiers describe the interrelations of the given sequence of $d$ individuals.

From this intuitive idea it is seen at once that certain operations can be performed on constituents and on attributive constituents so that the result is implied by the original. For instance, since the descriptions of different sequences of individuals that the constituents give us can be formulated in vocabularies of different strength, we can simply omit all the atomic sentences containing certain fixed constants from a constituent and obtain a constituent in the poorer vocabulary which is implied by the given one. We shall normally use the convention that a designation for the new constituent is obtained simply by omitting the constants in question from the designation of the original one. For instance, $C_i^{(d+e)}(\varphi, P, x_1, \ldots, x_i)$ being given, $C_i^{(d+e)}(\varphi, x_1, \ldots, x_i)$ is the result of omitting from it all atomic formulas which contain $P$. ('Omitting $P$', we shall say in the sequel.) Intuitively, the latter describes the same sequences that can be found in a world in which the former is true, but without the use of the predicate $P$.

Likewise, we may omit any layer of quantifiers from a constituent and obtain (apart from repetitions) a constituent which is implied by the former. It obviously lists precisely the same sequences of individuals that the former said to exist in the world, but only truncated by one step. For a more formal motivation for these observations, see [4].
The idea that the different sequences of nested quantifiers in a constituent describe the different sequences of individuals that can be found in a model in which the constituent is true can be made more formal in several ways. One of them is Hintikka's game-theoretical interpretation of first-order logic [5], another his 'surface semantics' [8]. For the time being, we shall only use the idea in an informal way, however.

Attributive constituents have pretty much the same properties as constituents. For instance, any two different attributive constituents of the same depth with the same constants and with the same free individual variables are incompatible. Of two compatible attributive constituents (with the same constants and the same free individual variables) the shallower results from the deeper by omitting layers of quantifiers. Obtainability from another attributive constituent by this procedure defines a relation on attributive constituents with the same non-logical constants and the same free individual variables. This relation clearly imposes a tree structure on the set of all such attributive constituents. The branches of this tree are countable (have the order type $\omega$), and only a finite number of branches diverge at each element.

When identities are admitted to our language, the former definition of constituents and attributive constituents can remain unchanged. However, quantifiers will then have to be given what Hintikka [4, Sections 11 and 13] has called an exclusive interpretation. This interpretation is easily explained in terms of the idea that nested sequences of quantifiers in a constituent $C_0^{(d+e)}$ represent the kinds of individuals one can draw from a model of $C_0^{(d+e)}$. In the case of an exclusive interpretation (unlike the normal or 'inclusive' one) these draws are performed without replacement. Because of this difference in interpretation, constituents will in the case with identities behave somewhat differently from their namesakes without identity. The most important differences are noted in [4]. We shall refer to these differences in the sequel whenever they are relevant.

As a special case we shall admit of an exceptional constituent in which there are no existentially quantified conjuncts in (2) and the disjunction in (2) becomes inconsistent. Such a constituent (which of course could simply be ruled out) is satisfiable in an empty domain only. We shall call it an empty constituent and think of it as containing no quantifiers. (All such constituents are of course equivalent.) In principle, the same thing might of course happen in (3). We shall then use the same conventions as in the case when (2) becomes empty.
A constituent which then becomes ‘empty’ only in some of its inner parts like (3) is easily seen to be inconsistent, however. Since we shall later employ inconsistent constituents for legitimate purposes, we must nevertheless take into account such partially empty constituents (constituents in which some branches stop before their allotted depth). The case in which all the inner parts of a given depth become empty can occur in consistent constituents, satisfied in a finite domain, if the quantifiers are interpreted exclusively.
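The contrast between the inclusive and the exclusive interpretation, described above as draws with and without replacement, can be illustrated by a small enumeration. The Python sketch below is only an illustration: the function name `draws` and the two-element domain are hypothetical, and the sequences are taken over a bare domain with no predicates interpreted on it.

```python
from itertools import permutations, product

def draws(domain, length, exclusive):
    """Sequences of individuals that `length` nested quantifiers can pick.

    Inclusive reading: draws with replacement.
    Exclusive reading: draws without replacement.
    """
    if exclusive:
        return list(permutations(domain, length))
    return list(product(domain, repeat=length))

domain = ["a", "b"]
inclusive_3 = draws(domain, 3, exclusive=False)  # with replacement
exclusive_2 = draws(domain, 2, exclusive=True)   # without replacement
exclusive_3 = draws(domain, 3, exclusive=True)   # domain is exhausted
```

In a two-element domain there are eight inclusive sequences of length three but no exclusive ones, which is the situation in which all the inner parts of a given depth become empty in a constituent satisfied in a finite domain.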
3. Uncertainty sets

From (2) and (3) we can see at once one kind of index of uncertainty concerning $P$ left by $C_0^{(d+e)}$. Omitting from each of the expressions $Ct_i^{(d+e-1)}(\varphi, P, x_1)$ all the atomic formulas containing $P$ (with the associated connectives) we obtain an expression which we shall call $Ct_i(\varphi, x_1)$ or, more explicitly, $Ct_i^{(d+e-1)}(\varphi, x_1)$. It is clearly an attributive constituent with the constants $\varphi$ and with the single free individual symbol $x_1$. We can then put

\[
\alpha(x_1) = \{Ct_{\alpha_1}(\varphi, x_1),\ Ct_{\alpha_2}(\varphi, x_1),\ \ldots\}, \tag{4}
\]
\[
\beta(x_1) = \{Ct_{\beta_1}(\varphi, x_1),\ Ct_{\beta_2}(\varphi, x_1),\ \ldots\}, \tag{5}
\]
\[
\gamma(x_1) = \alpha(x_1) \cap \beta(x_1). \tag{6}
\]

Let us assume that

\[
\gamma(x_1) = \{Ct_{\gamma_1}(\varphi, x_1),\ Ct_{\gamma_2}(\varphi, x_1),\ \ldots\}.
\]

Likewise we can form from (3) the sets

\[
\alpha(x_1, \ldots, x_{i-1}, y) = \{Ct_{\alpha_1}(\varphi, x_1, \ldots, x_{i-1}, y),\ Ct_{\alpha_2}(\varphi, x_1, \ldots, x_{i-1}, y),\ \ldots\}, \tag{7}
\]
\[
\beta(x_1, \ldots, x_{i-1}, y) = \{Ct_{\beta_1}(\varphi, x_1, \ldots, x_{i-1}, y),\ Ct_{\beta_2}(\varphi, x_1, \ldots, x_{i-1}, y),\ \ldots\}, \tag{8}
\]
\[
\gamma(x_1, \ldots, x_{i-1}, y) = \alpha(x_1, \ldots, x_{i-1}, y) \cap \beta(x_1, \ldots, x_{i-1}, y). \tag{9}
\]

Let us assume that here

\[
\gamma(x_1, \ldots, x_{i-1}, y) = \{Ct_{\gamma_1}(\varphi, x_1, \ldots, x_{i-1}, y),\ Ct_{\gamma_2}(\varphi, x_1, \ldots, x_{i-1}, y),\ \ldots\}.
\]
The depth of the $\alpha$'s, $\beta$'s and $\gamma$'s may be indicated by a superscript. The members of any $\alpha$ are called $P$-positive and the members of any $\beta$ $P$-negative. The sets (6) and (9) are called uncertainty sets. Their model-theoretical meaning may be seen as follows: Assume that $M$ is a model of $C_0^{(d+e)}$. Assume also that we cannot 'see' directly which elements of the domain $D$ of $M$ have the property $P$ but that we know how the members of $\varphi$ are interpreted in $D$. What can we tell on the basis of this knowledge of the distribution of $P$ in $D$? It is assumed here that only sequences of individuals of a fixed finite length $d+e$ (at most) may be used in answering this question.

According to formula (2), each $a \in D$ satisfies one of the expressions $Ct_{\alpha_i}^{(d+e-1)}(\varphi, x_1)$ or $Ct_{\beta_i}^{(d+e-1)}(\varphi, x_1)$. There are three possibilities here:

(i) it satisfies a member of $\alpha(x_1) \cap \overline{\beta(x_1)}$,
(ii) it satisfies a member of $\overline{\alpha(x_1)} \cap \beta(x_1)$,
(iii) it satisfies a member of $\gamma(x_1)$.

In case (i), we must have $Pa$. In case (ii), we must have $\sim Pa$. It is precisely in case (iii), i.e., when $a$ satisfies a member of the uncertainty set $\gamma(x_1)$, that we cannot, by considering those characteristics which can be expressed by means of $d+e-1$ layers of quantifiers at most, tell whether it has the property $P$ or not. (Hence the term 'uncertainty set'.) More than this we cannot say without considering longer sequences of individuals.

Suppose likewise that we have chosen certain elements $a_1, \ldots, a_{i-1} \in D$ which in this order satisfy the attributive constituent $Ct(\varphi, x_1, \ldots, x_{i-1})$ obtained from those attributive constituents in which (3) occurs in (2). Let us suppose further that although we could not initially tell whether $a_1, \ldots, a_{i-1}$ have $P$ or $\sim P$, we have somehow assigned one of these two to each of $a_1, \ldots, a_{i-1}$ and even decided that $a_1, \ldots, a_{i-1}$ satisfy a certain constituent (with free variables) $C(\varphi, P, x_1, \ldots, x_{i-1})$, which of course will be obtainable from (3) and from the attributive constituents in which (3) occurs in (2). Given now one more element $a_i \in D$, can we tell (on the basis of its characterization in terms of $\varphi$, of $a_1, \ldots, a_{i-1}$, and of $d+e-i$ layers of quantifiers) whether we have $Pa_i$ or $\sim Pa_i$? By the same reasoning as before, we can see from (3) that we cannot tell this (on the basis indicated) precisely in case $a_1, \ldots, a_{i-1}, a_i$ satisfy an attributive constituent

\[
Ct^{(d+e-i)}(\varphi, x_1, \ldots, x_{i-1}, y) \in \gamma(x_1, \ldots, x_{i-1}, y).
\]

Thus the uncertainty set $\gamma(x_1, \ldots, x_{i-1}, y)$ again defines those cases in which we remain in uncertainty as to whether $P$ or $\sim P$ applies to $a_i$, at the level of analysis on which we are moving in (3).
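The three cases (i)-(iii) can be traced on a toy example. The following Python sketch is illustrative only and assumes the simplest setting, which the text does not single out: a monadic vocabulary and a constituent of depth 1, encoded (hypothetically) as a set of kinds, each kind being the set of predicates true of an individual. Omitting $P$ from a kind gives its reduct, and the function returns the sets corresponding to (4)-(6).

```python
def uncertainty_sets(constituent, target="P"):
    """Return (alpha, beta, gamma) for a depth-1 monadic constituent.

    alpha: reducts of the kinds that contain `target` (P-positive),
    beta:  reducts of the kinds that lack `target` (P-negative),
    gamma: their intersection, i.e. the uncertainty set.
    """
    alpha = {kind - {target} for kind in constituent if target in kind}
    beta = {kind - {target} for kind in constituent if target not in kind}
    return alpha, beta, alpha & beta

# Hypothetical constituent in the vocabulary {Q, R, P}.
c = {frozenset({"Q", "P"}), frozenset({"Q"}),
     frozenset({"R", "P"}), frozenset()}
alpha, beta, gamma = uncertainty_sets(c)
```

In this example an individual whose reduct is {R} must have $P$ (case (i)), one with the empty reduct must lack $P$ (case (ii)), and one with reduct {Q} falls into the uncertainty set (case (iii)).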
4. Summary of some known results on definability
Several earlier results on different kinds of definability can be formulated in terms of the uncertainty sets. They illustrate the systematizations which are made possible by our concepts, especially by the notion of uncertainty set. The following is a summary of some main results:

(A) Piecewise definability (Svenonius). $P$ is piecewise definable in $T$ iff in each branch (1) of Fig. 1, $\gamma^{(d+e-1)}(x_1) = \emptyset$ for some finite $e$. This means that $\alpha^{(d+e-1)}(x_1)$ and $\beta^{(d+e-1)}(x_1)$ are separated in that branch by at least one set (possibly by several) $\delta = \delta^{(d+e-1)}(x_1)$:

\[
\alpha^{(d+e-1)}(x_1) \subseteq \delta, \qquad \beta^{(d+e-1)}(x_1) \subseteq \overline{\delta}.
\]

If

\[
\delta = \{Ct_{\delta_1}(\varphi, x_1),\ Ct_{\delta_2}(\varphi, x_1),\ \ldots\},
\]

$T$ will imply

\[
\bigvee_{\delta \in \Delta} (x_1)\,[Px_1 \equiv (Ct_{\delta_1}(\varphi, x_1) \vee Ct_{\delta_2}(\varphi, x_1) \vee \cdots)]
\]

where $\Delta$ is the set of all the different $\delta$'s obtained in the different branches (1) of Fig. 1.

Here piecewise definability means the same as Hintikka-Tuomela's 'conditional definability' [10]. It is easily seen to be equivalent to definability in each model of $T$ (see [16], where this was first pointed out). Conversely, it is easily shown (cf. [10]) that whenever $P$ is definable in the complete theory constituted by the sequence (1), then this will have to be betrayed by the separation of $\alpha(x_1)$ and $\beta(x_1)$ at the depth which equals that of the shallowest explicit definition of $P$ in terms of $\varphi$ implied by (1). Definability in a given model $M$ means of course that an explicit definition of $P$ of the form

\[
(x_1)\,[Px_1 \equiv (Ct_{\delta_1}(\varphi, x_1) \vee Ct_{\delta_2}(\varphi, x_1) \vee \cdots)]
\]

(where $\delta \in \Delta$) is true in $M$. What is characteristic of piecewise definability is that it cannot be gathered from the way the members of $\varphi$ are interpreted in the domain $D$ of $M$ which definition (which $\delta \in \Delta$) applies in $M$.
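We so to speak have to know how $P$ is interpreted in the domain of $M$ in order to decide how it is to be defined there, although we knew ahead of time that one of a finite list of explicit definitions must be applicable.

(B) Explicit definability. We obtain explicit definability as a special case of piecewise definability when the $\alpha$'s and $\beta$'s are uniformly separable at some depth in all the different branches (1) of Fig. 1, that is to say, there is a set $\delta_0(x_1)$ such that for all the pairs $\alpha_i(x_1)$ and $\beta_i(x_1)$ obtained from the different branches (1) we have

\[
\alpha_i(x_1) \subseteq \delta_0(x_1), \qquad \beta_i(x_1) \subseteq \overline{\delta_0(x_1)}.
\]

Then $T$ implies the explicit definition

\[
(x_1)\,[Px_1 \equiv (Ct_{\delta_{01}}(\varphi, x_1) \vee Ct_{\delta_{02}}(\varphi, x_1) \vee \cdots)]
\]

where

\[
\delta_0(x_1) = \{Ct_{\delta_{01}}(\varphi, x_1),\ Ct_{\delta_{02}}(\varphi, x_1),\ \ldots\}.
\]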
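The difference between cases (A) and (B) can be checked mechanically in a toy setting. The Python sketch below is a simplification under stated assumptions: the vocabulary is monadic, the depth is fixed at 1 (the text allows any finite $e$), and a theory is given by a hypothetical list of the depth-1 constituents compatible with it, each encoded as a set of kinds. Piecewise definability then amounts to an empty uncertainty set in every constituent, and explicit definability to the existence of one separating set for all of them. The function names are illustrative.

```python
def reducts(constituent, target="P"):
    """P-positive and P-negative reducts of a depth-1 monadic constituent."""
    alpha = {k - {target} for k in constituent if target in k}
    beta = {k - {target} for k in constituent if target not in k}
    return alpha, beta

def piecewise_definable(constituents, target="P"):
    """The uncertainty set is empty in every compatible constituent."""
    return all(not (a & b)
               for a, b in (reducts(c, target) for c in constituents))

def explicitly_definable(constituents, target="P"):
    """A single set delta_0 separates every alpha_i from every beta_i."""
    pairs = [reducts(c, target) for c in constituents]
    delta0 = set().union(*(a for a, _ in pairs))  # smallest candidate
    return all(not (delta0 & b) for _, b in pairs)

# Hypothetical theory: P coincides with Q in one alternative (c1)
# and with not-Q in the other (c2).
c1 = {frozenset({"Q", "P"}), frozenset()}
c2 = {frozenset({"P"}), frozenset({"Q"})}
```

For the theory with the alternatives `c1` and `c2`, the predicate is piecewise but not explicitly definable: each alternative fixes a definition, yet no single separating set serves both. With `c1` alone, the definition is explicit.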
(C) Restricted identifiability (Chang and Makkai). Now that we know what it means for separation to take place in the outmost attributive constituents of the sequence (1), we are naturally led to ask what happens if a separation is effected deeper in the constituents of this sequence. Again the answer is quite clear cut. If in each of the branches (1) of Fig. 1 some $\gamma(x_1, \ldots, x_k)$ (not necessarily the same in different branches) eventually disappears, then $P$ is what Hintikka [7] calls restrictedly identifiable. It might almost be called countably identifiable, for what we have is that whenever the interpretation of the members of $\varphi$ in any countably infinite domain $D$ is fixed, there are at most $\aleph_0$ different interpretations of $P$ which make $T$ true in $D$. More generally, if the cardinality of $D$ is $\xi$, the interpretation of $\varphi$ in any infinite $D$ leaves at most $\xi$ choices open for $P$. Chang [1] and Makkai [13] have shown (cf. also [12], p. 430) that this is equivalent with the following:

(i) $P$ is identifiable in any infinite domain $D$ to a degree less than $2^{\xi}$.
(ii) There are formulas $F_1, \ldots, F_n$ with the constants $\varphi$ and the free individual variables $x_1, \ldots, x_{k-1}, y$ but without $P$ such that $T$ implies

\[
\bigvee_{i=1}^{n} (Ex_1) \cdots (Ex_{k-1})\,(y)\,(Py \equiv F_i(x_1, \ldots, x_{k-1}, y)). \tag{12}
\]
Hintikka [7] indicates how to show that (ii) is equivalent to the disappearance of some $\gamma(x_1, \ldots, x_k)$ in each sequence (1) of Fig. 1 (not necessarily the same one in different sequences). In fact, if

\[
\delta = \{Ct_{\delta_1}(\varphi, x_1, \ldots, x_k),\ Ct_{\delta_2}(\varphi, x_1, \ldots, x_k),\ \ldots\}
\]

separates $\alpha(x_1, \ldots, x_k)$ and $\beta(x_1, \ldots, x_k)$ in a given branch, then the constituent in which $\gamma(x_1, \ldots, x_k)$ disappears implies

\[
(Ex_1) \cdots (Ex_{k-1})\,(y)\,[Py \equiv (Ct_{\delta_1}(\varphi, x_1, \ldots, x_{k-1}, y) \vee Ct_{\delta_2}(\varphi, x_1, \ldots, x_{k-1}, y) \vee \cdots)]. \tag{13}
\]
Hence $T$ implies (in virtue of König's lemma) a disjunction of the form (12). The converse implication (i.e., implication from (ii) to the disappearance of some $\gamma$ in each (1)) can be established by studying the constituents of Fig. 1 at the depth of (12). It is easily seen that in each branch a separation must take place at this depth if (12) is to be implied by $T$. These observations can be considerably sharpened, as we shall point out later in this study.

(D) Finite identifiability [12]. It may also happen that not only does some $\gamma(x_1, \ldots, x_k)$ disappear in each given sequence (1), but that all the uncertainty sets $\gamma_i(x_1, \ldots, x_k)$ disappear that are derived from 'indistinguishable' (in $\varphi$) constituents

\[
C^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k)
\]

occurring as (not necessarily consecutive) parts of the constituent $C_0^{(d+e)}$ of (1) at which the disappearance takes place. Two such constituents

\[
C_i^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k), \qquad C_j^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k)
\]

will be called indistinguishable in $\varphi$ iff the reducts

\[
C_i^{(d+e-k)}(\varphi, x_1, \ldots, x_k), \qquad C_j^{(d+e-k)}(\varphi, x_1, \ldots, x_k)
\]

are identical. That

\[
C_i^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k) = Ct_i^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k) \wedge (\pm B_1 \wedge \pm B_2 \wedge \cdots)
\]

occurs as a part of $C_0^{(d+e)}(\varphi, P)$ means that $Ct_i^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k)$ occurs there as a consecutive part and that the $\pm B_1, \pm B_2, \ldots$ (with the appropriate 'signs') occur there with their variables bound to the same quantifiers as the variables of $Ct_i^{(d+e-k)}(\varphi, P, x_1, \ldots, x_k)$.

As shown in [7], in this case $C_0^{(d+e)}$ also satisfies the following condition of Kueker's:

(i)$_n$ There are expressions $S$ and $F_i$ ($i = 1, \ldots, n$) in the vocabulary $\varphi$ (but without $P$) with the free individual variables $x_1, \ldots, x_k$ and $x_1, \ldots, x_k, y$ respectively, such that $C_0^{(d+e)}$ implies the following:

\[
(Ex_1) \cdots (Ex_k)\, S(x_1, \ldots, x_k), \tag{14}
\]
\[
(x_1) \cdots (x_k)\,\Bigl[S(x_1, \ldots, x_k) \supset \bigvee_{i=1}^{n} (y)\,(Py \equiv F_i(x_1, \ldots, x_k, y))\Bigr]. \tag{15}
\]

Here $n$ is the number of the different separating sets needed in the different indistinguishable constituents. It is easily seen that if this happens in each sequence (1), then $T$ also implies expressions of the form (14) and (15) (with $n$ now the maximum of the similar parameters in the several branches). Kueker [12] (cf. also [14, 15]) shows that (i)$_n$ is equivalent to at most $n$-fold identifiability of $P$ in $T$.

5. Uncertainty descriptions
By means of uncertainty sets, we can define certain first-order expressions which may be said to describe, in a rather vivid sense, the uncertainty which the theory $T$ leaves to the predicate $P$ (in those cases in which it is not definable in $T$). We shall first explain the formal definition of an uncertainty description. Such a description is relative to a given depth. Hence we have to start from some given constituent, say from $C_0^{(d+e)}(\varphi, P)$. It is of the form (2). The corresponding uncertainty description will be called

\[
\mathrm{Unc}\ C_0^{(d+e)}(\varphi, P). \tag{16}
\]

It is reached by stages as follows:

Stage 1. Omit from (2) all the attributive constituents

\[
\pm Px_1 \wedge Ct^{(d+e-1)}(\varphi, P, x_1)
\]

which do not yield a member of $\gamma(x_1)$ when $P$ is omitted from them.

...

Stage $i$ ($1 < i \le d+e$). Assuming that (3) occurs in the expression obtained at stage $i-1$, omit from it all the attributive constituents

\[
\pm Py \wedge Ct^{(d+e-i)}(\varphi, P, x_1, \ldots, x_{i-1}, y)
\]

which do not yield a member of $\gamma(x_1, \ldots, x_{i-1}, y)$ when $P$ is omitted from them.

...

This process comes to an end after a finite number of stages. The outcome is $\mathrm{Unc}\ C_0^{(d+e)}(\varphi, P)$. From $\mathrm{Unc}\ C_0^{(d+e)}(\varphi, P)$ we can form $\mathrm{Unc}\ C_0^{(d+e)}(\varphi)$ by omitting from it all the atomic formulas containing $P$ (together with the associated sentential connectives). The latter will be called a reduced uncertainty description.

It is easily seen that the expressions $\mathrm{Unc}\ C_0^{(d+e)}(\varphi, P)$ and $\mathrm{Unc}\ C_0^{(d+e)}(\varphi)$ have the syntactic form of a constituent (in the vocabulary $\varphi + \{P\}$ and $\varphi$, respectively). However, they are often inconsistent. In particular, they may contain empty parts in the sense explained in Section 2 above. (That is, some of their branches may come to an end before the depth $d+e$ even though other branches do not.)

In spite of their inconsistency, such expressions $\mathrm{Unc}\ C_0^{(d+e)}(\varphi, P)$ and $\mathrm{Unc}\ C_0^{(d+e)}(\varphi)$ can be put to a perfectly good use and even given an intuitive model-theoretic explanation. In order to see what this explanation might be, let us assume once again that we are given a model $M$ of $T$ with the domain $D$. How much can we tell of the definition of $P$ on $D$ on the basis of the way the members of $\varphi$ are interpreted in $D$? The answer is relative to the depth to which we are following the interrelations of the elements of $D$. Let us therefore fix this depth at $d+e$.
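The stage-by-stage construction can be sketched as a recursive pruning of a tree. The Python fragment below is a minimal illustration, not the construction itself: it assumes a hypothetical encoding in which each attributive constituent records the sign of $P$ on its newest variable, a set of arbitrary labels standing for the remaining atomic formulas, and the attributive constituents of the next layer. At each layer the uncertainty set is computed from the unpruned layer, and only the parts whose reduct belongs to it are kept.

```python
from typing import FrozenSet, NamedTuple

class Ct(NamedTuple):
    """Hypothetical encoding of one attributive constituent."""
    has_p: bool                           # sign of P on the newest variable
    atoms: FrozenSet[str]                 # the remaining (P-free) atomic facts
    parts: FrozenSet["Ct"] = frozenset()  # next layer of quantifiers

def reduct(ct):
    """Omit P at every level; the result is a nested, hashable tuple."""
    return (ct.atoms, frozenset(reduct(c) for c in ct.parts))

def gamma(parts):
    """Reducts shared by P-positive and P-negative parts of one layer."""
    alpha = {reduct(c) for c in parts if c.has_p}
    beta = {reduct(c) for c in parts if not c.has_p}
    return alpha & beta

def unc(parts):
    """Keep the parts whose reduct lies in gamma, then prune each
    surviving part's next layer by the same rule (Stages 1, 2, ...)."""
    g = gamma(parts)
    return frozenset(Ct(c.has_p, c.atoms, unc(c.parts))
                     for c in parts if reduct(c) in g)

# Hypothetical depth-2 example; the atom names are arbitrary labels.
r, q, empty = frozenset({"R"}), frozenset({"Q"}), frozenset()
inner = frozenset({Ct(True, r), Ct(False, r), Ct(True, empty)})
top = frozenset({Ct(True, q, inner), Ct(False, q, inner), Ct(True, empty)})
result = unc(top)
```

In the example the third outer part is dropped at the first stage, and in each surviving part the inner alternative without a $P$-negative counterpart is dropped at the second stage. A surviving part whose next layer has an empty uncertainty set ends up with no parts at all, which corresponds to a branch that stops before its allotted depth.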
We may again consider the kind of situation mentioned above when we defined the uncertainty sets: We are given a model M of CAd+" ( y , P ) with the domain D.We know how the members of y are interpreted in D, and on the basis of this we are trying to see precisely when it is that P that is undefined on a member a, of D,on a second member a2 of D, and so on. If we choose one member a, ED, we do not know whether it has P or -P precisely when it satisfies a member of y(x,), in other words, satisfies an attributive constituent Ct(d+e-l ) ( y , x l ) which is preserved at Stage 1 of the definition of an uncertainty description. We can thus assign to such an a, E D either P or -P and investigate M further. At each stage, a member a, E D is chosen, and the question is whether we can tell between Pa, and -Pal on the basis of the available information on a l , ...,ai - I. This information consists of course of our knowing which point in the tree Unc CAd+e)( y , P) we have reached, which means knowing which constituent C(d+e-i+l)
ty, P,~
1 - *,- , Xi-1)
,
is satisfied by a, ,...,a j - (in this order).' From the definition of uncertainty descriptions it is seen that we do not know, on the basis of the
-
This information includes knowing (having decided) whether each of the a l , ...,aL-l has P or P. It is not exhausted by the latter information, however, as is readily seen from the following example: Consider a theory T (it can be formulated as a single constituent of depth 3) which contains a two-place predicate R which imposes a discrete linear order without endpoints on the domain and a one-place predicate P which is carried over to the left in the sense that we have (x) ( y ) (Rry A Py 3 Px). We are interested in the definability of P. Then, if al is the rightmost individual with P, no uncertainty remains with respect to the other individuals, characterized in terms of their relationship to a l , whether they have P or P. In other words, we have here a n instance of restricted identifiability yielding the definition-like statement (cf. (13)) (Ex) (y) ( Py = Ryx). However, such an a, cannot be recognized on the sole basis of the interpretation of R plus knowledge whether we have Pal or P a l . We have to know which constituent CCz) (R,P,xl)with P is satisfied by nl . Whether less information than knowing which constituent is satisfied by al , ..., suffices for restricted definability can be seen by comparing the different branches of CAdfe)(p,P ) which yields a quasi-definition of the form (13). In this way we can, e.g., see when it suffices to know whether each of the a , , ..., a i Y lhas P or -P. The fact that the latter does not always suffice may perhaps be considered a kind of analogue to the peculiarity of piecewise definability which was registered above.
-
N
57
SYSTEMATIZING DEFINABlLlTY THEORY
definition of the members of y on D and on the basis of the already available information just described, whether Pai or Pai precisely when a , , ...,a,- a, satisfy one of those attributive constituents N
Ct(d+e-i ) ( r p Y x1,
... X i - 1 , Y ) Y
(cf. (3)) which are preserved at Stage i above. Hence the different sequences of nested quantifiers in Unc Cz+e'(rp, P ) or Unc Cid+e)(y)describe in a sense all the different kinds of sequences (of length ~d e) of individuals that can be chosen from D one by one preserving all the time the uncertainty as to whether the new individual chosen has P or P . They thus constitute in a rather vivid sense a description of the uncertainty which Cid+" (rp, P ) leaves open for P.' (Notice that Unc CF+e)(q, P ) is independent of M.) This description is accomplished in a way which closely resembles the way an ordinary constituent describes its models. In the latter case, too, what is specified is what kinds of sequences of individuals we can successively draw from a model. This similarity can be spelled out more technically, for instance, in terms of a suitable game-theoretical interpretation of first-order logic. From this point of view, we can also see the reasons for the inconsistency of many of the uncertainty descriptions. It is part and parcel of the usual interpretation of quantified expressions that the draws of the individuals from a domain which quantifiers game-theoreticallyrepresent are draws from a constant domain (cf. [8, Section lo]). Now the peculiarity of uncertainty descriptions is precisely that the successive draws they describe are not draws from a domain independent of the draw. For the range of individuals of which we do not know whether they have P or P can change.2This uncertainty range can decrease, for in the case of later individuals we have some further information at our disposal which we did not have to begin with, viz. information concerning the individuals chosen earlier. Conversely, our uncertainty may be greater at later stages
¹ If the quantifiers are interpreted inclusively, then (and only then) an individual may occur repeatedly in the same sequence. This merely reflects the fact that the considered information (when identity is not present) is not sufficient for saying whether an individual chosen was chosen earlier.
² That is, the set of the $i$th coordinates of the sequences described by Unc $C_0^{(d+e)}(\varphi, P)$ may change when $i$ is increasing in the interval $1 \le i \le d + e$.
JAAKKO HINTIKKA A N D VEIKKO RANTALA
than at earlier ones, for the later individuals, when they are described in terms of the members of $\varphi$ only, are described by means of fewer layers of quantifiers than earlier ones, thus allowing for less firm a decision between $P$ and $\sim P$.
6. Uncertainty descriptions and different kinds of definability
In terms of uncertainty descriptions, we can reformulate some of the results explained in Section 4 in a way that brings out more clearly the underlying situation and also yields proofs of most of the results. It is assumed in this chapter that we are dealing with first-order logic with identity, i.e., that the quantifiers have been given an exclusive interpretation. Let us examine a given sequence (1). Let the corresponding sequence of uncertainty descriptions be
\[ \text{Unc } C_0^{(d)}(\varphi, P),\ \text{Unc } C_0^{(d+1)}(\varphi, P),\ \ldots,\ \text{Unc } C_0^{(d+e)}(\varphi, P),\ \ldots \qquad (17) \]
It is easily seen that if the last layer of quantifiers is omitted from Unc $C_0^{(d+e+1)}(\varphi, P)$, the result is either identical with (16) or can be obtained from it by omitting some attributive constituents. This simply reflects the fact that one's uncertainty about $P$ grows smaller when a new layer of quantifiers is admitted to the description of individuals in terms of the defining constants $\varphi$. Thus each branch of (16) is either continued or cut off in Unc $C_0^{(d+e+1)}(\varphi, P)$. What are now the most important things that can happen in (17) and what do they tell us about the definability of $P$ in the complete theory (1)? The most important things that can happen here are clearly the following:
(a) The members of (17) disappear altogether from some point on, say from (16) on. Then obviously $P$ is definable explicitly in (1), and if this happens in each branch (1) of Fig. 1, $P$ is definable piecewise in the given theory. This is case (A) of Section 4.
(b) All the branches of the members of (17) stop growing from some point on, say from (16) on. Then the description given in Section 4 of case (D) applies a fortiori, and we have a case of finite identifiability. Described in the way done here, however, this case yields Kueker's
SYSTEMATIZING DEFINABILITY THEORY
conditions (14), (15) much more easily than in [7] and in fact yields several parallel conditions. What we obtain immediately is the existence of a number of expressions (in the vocabulary of $\varphi$)
\[ F_1(x_1, \ldots, x_k, y),\ \ldots,\ F_m(x_1, \ldots, x_k, y) \]
such that $C_0^{(d+e)}(\varphi, P)$ implies
\[ \bigwedge_{j=1}^{j=m} (Ex_1) \cdots (Ex_k)\, S_j(x_1, \ldots, x_k), \qquad (18) \]
\[ (x_1) \cdots (x_k) \bigvee_{j=1}^{j=m} \bigl[ S_j(x_1, \ldots, x_k) \wedge (y)\bigl(Py \equiv F_j(x_1, \ldots, x_k, y)\bigr) \bigr]. \qquad (19) \]
Here some of the $S_j$ may be identical. Any two different $S_j$ are incompatible, however. The Kueker conditions (14), (15) are obtained from any equivalence class of identical $S_j$'s. It is an interesting result that if (14), (15) hold for one $S_j$, they hold for each of them. In order to see what these $S_j$ and $F_j$ are, let $Ct^{k}(\varphi, P, x_1, \ldots, x_k)$ (it is of the form (3), with $i - 1 = k$) occur in $C_0^{(d+e)}(\varphi, P)$ and correspond to a point in Unc $C_0^{(d+e)}(\varphi, P)$ where a branch comes to an end, so that the resulting uncertainty set $\gamma(x_1, \ldots, x_k, y)$ is empty. Then we obtain an $S_j$ as the conjunction
where $Ct^{k-1}(\varphi, P, x_1, \ldots, x_{k-1}), \ldots, Ct^{1}(\varphi, P, x_1)$ are all the attributive constituents in which $Ct^{k}(\varphi, P, x_1, \ldots, x_k)$ occurs in $C_0^{(d+e)}(\varphi, P)$. The corresponding $F_j$ is obtained as the disjunction of the members of any $\delta$ which separates the $P$-positive and $P$-negative attributive constituents shown in $Ct^{k}(\varphi, P, x_1, \ldots, x_k)$.
Conversely, if one branch in the uncertainty descriptions (17) keeps growing indefinitely, it means that there exists in each infinite model $M$ of (1) a countable sequence of individuals such that at each individual $a$, $P$'s applying or not applying to $a$ is not determined by the earlier individuals and at least one of the two choices gives rise to a further uncertainty.¹ But this means that $P$ is not finitely identifiable. Thus the conditions given above are not only sufficient but also necessary conditions of finite identifiability on the basis of (1), and we obtain a simple argument for the Kueker-type criterion of finite identifiability.² If this kind of situation occurs in each branch of Fig. 1, $P$ is finitely identifiable on the basis of the given theory. It is readily seen how conditions (14), (15) or (18), (19) obtained in the different branches are to be combined so as to yield conditions of the same form for the whole theory. (It is easily seen that it does not matter whether in (18), (19) quantifiers are interpreted exclusively or inclusively.) We can read off further syntactical criteria of finite identifiability from what has been said. Perhaps the simplest is that $T$ should imply, for a suitable $k$ and for suitable $F_j(x_1, \ldots, x_k, y)$ (in the vocabulary of $\varphi$) with $j = 1, 2, \ldots, m$,
\[ (x_1) \cdots (x_k) \bigvee_{j=1}^{j=m} (y)\bigl(Py \equiv F_j(x_1, \ldots, x_k, y)\bigr). \qquad (20) \]
On the inclusive interpretation this could be written, as one can easily see,
\[ (x_1) \cdots (x_k) \Bigl[ (x_1 \neq x_2 \wedge \cdots \wedge x_{k-1} \neq x_k) \supset \bigvee_{j=1}^{j=m} (y)\bigl(Py \equiv G_j(x_1, \ldots, x_k, y)\bigr) \Bigr] \qquad (21) \]
with suitable $G_j(x_1, \ldots, x_k, y)$ (in the vocabulary of $\varphi$) with $j = 1, 2, \ldots, m$.³ As a further consequence, it is seen that $P$ is finitely identifiable on the basis of $T$ iff it is finitely definable (identifiable) in each model $M$ of $T$, in an obvious sense of finite definability in a model.
¹ Of course, for each stage of choices there must be at least two individuals such that $P$ can be given to one of them and $\sim P$ to the other. It should also be noted that here identity is supposed to be present.
² That the terminating of every branch is the necessary condition can also be proved syntactically by showing that if some branch does not terminate in (17), then the condition in Section 4 (in the very beginning of part (D)) is not satisfied.
³ It is obvious that (20) is also a sufficient condition for finite identifiability when identity is present, since (21) is of the form (15) and $(Ex_1) \cdots (Ex_k)(x_1 \neq x_2 \wedge \cdots \wedge x_{k-1} \neq x_k)$ holds in each infinite model of $T$.
(c) The third main question concerning (17) is whether any branch terminates (stops growing) in the uncertainty descriptions (17). It was already seen in Section 4, part (C), that in this case we have restricted identifiability. The Chang-Makkai condition (12) was also seen to result trivially in this case. Conversely, if no branch terminates in (17), it means that we have in any model $M$ of (1) a countable sequence of choices whether to assign $P$ or $\sim P$ to certain individuals. Each choice is independent of the earlier ones and either alternative leads to further choices. This clearly means that $P$ cannot be identifiable to any degree smaller than $2^{\aleph_0}$, however.¹ By extending this argument somewhat, it can also be shown (we shall not do it here) that if no branch terminates in (17), the Chang-Makkai condition (i) (Section 4, part (C)) is not satisfied for infinite domains $D$ with cardinality $\xi > \aleph_0$. Again, it is seen that $P$ is restrictedly identifiable on the basis of a theory iff it is restrictedly identifiable in each model of this theory. Thus the most important known results on identifiability find their niche in terms of uncertainty descriptions.
¹ This can also be proved syntactically by showing that if no branch terminates in (17), then no uncertainty set disappears in (1), either.

References
[1] C. C. Chang, Some new results in definability, Bull. Am. Math. Soc. 70 (1964) 808-813.
[2] F. M. Fisher, The identification problem in econometrics (McGraw-Hill, New York, 1966).
[3] J. Hintikka, Distributive normal forms and deductive interpolation, Z. Math. Logik Grundlagen Math. 10 (1964) 185-191.
[4] J. Hintikka, Distributive normal forms in first-order logic, in: Formal systems and recursive functions, J. N. Crossley and M. A. E. Dummett, Eds. (North-Holland, Amsterdam, 1965) 47-90.
[5] J. Hintikka, Language-games for quantifiers, in: Studies in logical theory, American Philosophical Quarterly, Monograph Series No. 2 (Blackwell's, Oxford, 1968) pp. 46-72. Reprinted in [9].
[6] J. Hintikka, Surface information and depth information, in: Information and inference, J. Hintikka and P. Suppes, Eds. (D. Reidel, Dordrecht, 1970) 263-297.
[7] J. Hintikka, Constituents and finite identifiability, J. Phil. Logic 1 (1972) 45-52.
[8] J. Hintikka, Surface semantics: Definition and its motivation, in: Truth, syntax and modality, H. Leblanc, Ed. (North-Holland, Amsterdam, 1973).
[9] J. Hintikka, Logic, language-games and information: Kantian themes in the philosophy of logic (The Clarendon Press, Oxford, 1973).
[10] J. Hintikka and R. Tuomela, Towards a general theory of auxiliary concepts and definability in first-order theories, in: [11].
[11] J. Hintikka and P. Suppes, Eds., Information and inference (Reidel, Dordrecht, 1970) 298-330.
[12] D. W. Kueker, Generalized interpolation and definability, Ann. Math. Logic 1 (1970) 423-468.
[13] M. Makkai, A generalization of a theorem of E. W. Beth, Acta Math. Acad. Sci. Hungar. 15 (1964) 227-235.
[14] G. E. Reyes, Local definability theory, Ann. Math. Logic 1 (1970) 95-137.
[15] S. Shelah, Remarks to 'Local definability theory' of Reyes, Ann. Math. Logic 2 (1971) 441-447.
[16] L. Svenonius, A theorem about permutation in models, Theoria 25 (1959) 173-178.
CONSERVATIVE ENDEXTENSIONS AND THE QUANTIFIER ‘THERE EXIST UNCOUNTABLY MANY’
Herman Ruge JERVELL
University of Tromsø, Tromsø, Norway
0.
In this paper, we will give some new connections between ordered models and the quantifier $Qx$, ‘there exist uncountably many $x$’. For some of the known results, see the papers by Fuhrken [2] and Keisler [1]. We work with the languages $L$, $L_1$, $LR$ and $LQ$. $L$ is an ordinary first-order countable relational language. $L_1$ is the language we get from $L$ by adding a new binary relation $Rxy$. $LQ$ is obtained from $L$ by adding the quantifier $Qx$. Below we will define $LR$ as a certain sublanguage of $L_1$. Formulas are defined as usual. A sentence is a formula without free variables. We map the formulas of $LQ$ into $L_1$ as follows:
\[
\begin{aligned}
(F)^* &= F \quad \text{for } F \text{ atomic},\\
(\neg F)^* &= \neg (F)^*,\\
(F \supset G)^* &= F^* \supset G^*,\\
(F \wedge G)^* &= F^* \wedge G^*,\\
(F \vee G)^* &= F^* \vee G^*,\\
(\forall x\, Fx)^* &= \forall x\, (Fx)^*,\\
(\exists x\, Fx)^* &= \exists x\, (Fx)^*,\\
(Qx\, Fx)^* &= \forall y\, \exists x\, [Ryx \wedge (Fx)^*].
\end{aligned}
\]
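The translation $*$ is a plain structural recursion, and can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of formulas, the tag names, and the helper names `variables`, `fresh_var` and `star` are assumptions made here, not notation from the paper. The only clause that does any work is the one for $Q$, which introduces a bound variable not already in use.

```python
import itertools

# Hypothetical encoding (not from the paper): formulas are nested tuples.
#   ("atom", "P", "x")   ("not", F)   ("imp", F, G)   ("and", F, G)   ("or", F, G)
#   ("all", "x", F)      ("ex", "x", F)               ("Q", "x", F)

def variables(f):
    """Collect every variable name occurring in the formula f."""
    tag = f[0]
    if tag == "atom":
        return set(f[2:])
    if tag == "not":
        return variables(f[1])
    if tag in ("imp", "and", "or"):
        return variables(f[1]) | variables(f[2])
    return {f[1]} | variables(f[2])  # "all", "ex", "Q"

def fresh_var(used):
    """Return the first of y0, y1, ... that does not occur in `used`."""
    for i in itertools.count():
        name = f"y{i}"
        if name not in used:
            return name

def star(f):
    """Map an LQ-formula to an L1-formula, clause by clause."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return ("not", star(f[1]))
    if tag in ("imp", "and", "or"):
        return (tag, star(f[1]), star(f[2]))
    if tag in ("all", "ex"):
        return (tag, f[1], star(f[2]))
    if tag == "Q":
        x, body = f[1], star(f[2])
        y = fresh_var(variables(body) | {x})
        # (Qx Fx)* = forall y exists x [Ryx and (Fx)*]
        return ("all", y, ("ex", x, ("and", ("atom", "R", y, x), body)))
    raise ValueError(f"unknown connective: {tag}")

simple = star(("Q", "x", ("atom", "P", "x")))
nested = star(("Q", "x", ("Q", "z", ("atom", "F", "x", "z"))))
```

Here `simple` encodes $\forall y_0\, \exists x\, [Ry_0x \wedge Px]$, and in `nested` the outer quantifier receives a different bound variable from the inner one.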
$LQ^*$ is the $*$-image of $LQ$ in $L_1$. Whenever the meaning is clear from the context, we omit $*$ from $LQ^*$ or from $F^*$. $LR$ is the sublanguage of $L_1$ consisting of $LQ^*$-formulas and their subformulas. In $LR$, we have formulas of the following four types ($Fx$ is an $LQ^*$-formula, $s$ and $t$ are terms):
(1) $LQ^*$-formulas,
(2) $\exists x\, [Rsx \wedge Fx]$,
(3) $Rst \wedge Ft$,
(4) $Rst$.
Let $\mathfrak{A}$ be a model. If we extend the language $L$ with names for the individuals of $\mathfrak{A}$, we get the language $L\mathfrak{A}$. Similarly we get $LR\mathfrak{A}$ and $LQ\mathfrak{A}$. An $LQ$-model is an uncountable model in the language $L$. If we wanted to, we could have admitted countable models as $LQ$-models. This would have necessitated some trivial changes in the theory below. An ordered model is a model in the language $LR$ where $Rxy$ is a total, irreflexive, linear ordering. Let $\mathfrak{A}$ and $\mathfrak{B}$ be two ordered models. We then say that
(i) $\mathfrak{B}$ is a proper extension of $\mathfrak{A}$ if $\mathfrak{B}$ is an extension of $\mathfrak{A}$ and there is $b \in \mathfrak{B}$ such that for all $a \in \mathfrak{A}$, $\mathfrak{B} \models Rab$;
(ii) $\mathfrak{B}$ is a conservative extension of $\mathfrak{A}$ if $\mathfrak{B}$ is an extension of $\mathfrak{A}$ and for all $F \in LR\mathfrak{A}$,
\[ \mathfrak{A} \models F \Leftrightarrow \mathfrak{B} \models F; \]
(iii) $\mathfrak{B}$ is an endextension of $\mathfrak{A}$ if it is an extension of $\mathfrak{A}$ such that for all $a \in \mathfrak{A}$, $b \in \mathfrak{B}$, if $\mathfrak{B} \models Rba$, then $b \in \mathfrak{A}$.
We could have changed the definition in (ii) to: ‘For all $F \in L_1\mathfrak{A}$, $\mathfrak{A} \models F \Leftrightarrow \mathfrak{B} \models F$’. The first part of the theory below (results (A) and (B)) will still go through with only minor changes, but we then get into difficulties when we try to tie our results up with the result of Keisler (result (C)). Let $\mathcal{M}$ be the class of all countable ordered models which have proper conservative endextensions. We can now formulate the main results of this paper:
(A) We give a necessary and sufficient criterion for a countable model to be an element of $\mathcal{M}$. This criterion can be used to axiomatize ‘true in all models of $\mathcal{M}$’.
(B) For all $F \in LQ$: $F$ is valid iff $F^*$ is true in all models of $\mathcal{M}$.
(C) In [1], Keisler gave another axiomatization of $LQ$-validity, and a completeness proof of it. His proof is in two parts. First, he proved completeness of ‘true in all weak models satisfying certain schemata’ using an ordinary Henkin construction. Then, and this is the hard part, he proved that from each such weak model he could construct an $LQ$-elementarily equivalent $LQ$-model. Below we prove that there is an easy connection between such weak models and the models in $\mathcal{M}$. From this we get a new proof of Keisler's completeness result.
The results are all for first-order logic. It is straightforward to extend them to $\omega$-logic and to $L_{\omega_1\omega}$.
1.
We now want to characterize the class $\mathcal{M}$ of all countable ordered models which have proper conservative endextensions. A model $\mathfrak{A}$ in the language $L_1$ is ordered if and only if it satisfies M1-M3.
\[
\begin{aligned}
&\text{M1} \quad \forall x\, \forall y\, [Rxy \vee x = y \vee Ryx],\\
&\text{M2} \quad \forall x\, \forall y\, \forall z\, [Rxy \wedge Ryz \supset Rxz],\\
&\text{M3} \quad \forall x\, \neg Rxx.
\end{aligned}
\]
Then we note that an ordered model $\mathfrak{A}$ has a proper conservative extension if and only if it satisfies
\[ \text{M4} \quad \forall x\, \exists y\, Rxy. \]
In fact, we get by compactness that $\mathfrak{A}$ has then a proper $L_1$-conservative extension. The claim now is that a countable model $\mathfrak{A}$ in the language $L_1$ is in $\mathcal{M}$ if and only if it satisfies M1-M5, where
\[ \text{M5} \quad [Qx\, \exists y\, Fxy \supset Qy\, \exists x\, Fxy \vee \exists y\, Qx\, Fxy]^* \]
for all $LQ$-formulas $Fxy$.
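The first-order axioms M1-M4 can be evaluated by brute force on a finite structure. The Python sketch below does this; the function name `check_axioms` and the encoding of $R$ as a set of pairs are assumptions made for illustration only. It also makes one small point concrete: a finite strict linear order satisfies M1-M3 but never M4, since its greatest element has nothing above it, so the models of M1-M4 are all infinite.

```python
from itertools import product

def check_axioms(domain, R):
    """Evaluate M1-M4 over a finite domain; R is a set of ordered pairs (illustrative encoding)."""
    domain = list(domain)
    r = lambda x, y: (x, y) in R
    return {
        # M1: any two elements are comparable or equal
        "M1": all(r(x, y) or x == y or r(y, x) for x, y in product(domain, repeat=2)),
        # M2: transitivity
        "M2": all(not (r(x, y) and r(y, z)) or r(x, z)
                  for x, y, z in product(domain, repeat=3)),
        # M3: irreflexivity
        "M3": all(not r(x, x) for x in domain),
        # M4: every element has something above it
        "M4": all(any(r(x, y) for y in domain) for x in domain),
    }

points = range(4)
less_than = {(x, y) for x in points for y in points if x < y}
linear = check_axioms(points, less_than)

# A 4-cycle: every element has a successor, but the relation is not a linear order.
cycle = check_axioms(points, {(x, (x + 1) % 4) for x in points})
```

For the usual order on four points, `linear` reports M1-M3 as true and M4 as false; the cycle satisfies M3 and M4 but fails M1 and M2.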
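First, we prove the necessity. Let $\mathfrak{A}$ be an ordered model and $\mathfrak{B}$ a proper conservative endextension of $\mathfrak{A}$. $\mathfrak{A}$ will then obviously satisfy M1-M4. We will prove that it also satisfies M5. Let $Fxy$ be a formula in $LQ\mathfrak{A}$. Assume
\[ \mathfrak{A} \models Qx\, \exists y\, Fxy. \]
Then
\[ \mathfrak{B} \models Qx\, \exists y\, Fxy, \]
\[ \mathfrak{B} \models \exists y\, Fby \quad \text{for some } b \in \mathfrak{B} - \mathfrak{A}, \]
\[ \mathfrak{B} \models Fbc \quad \text{for some } c \in \mathfrak{B}. \]
There are now two cases.
Case 1. $c \in \mathfrak{A}$. To each $a \in \mathfrak{A}$,
\[ \mathfrak{B} \models Rab \wedge Fbc, \]
\[ \mathfrak{B} \models \exists x\, (Rax \wedge Fxc), \]
\[ \mathfrak{A} \models \exists x\, (Rax \wedge Fxc). \]
Hence
\[ \mathfrak{A} \models \forall y\, \exists x\, (Ryx \wedge Fxc), \]
\[ \mathfrak{A} \models Qx\, Fxc, \]
\[ \mathfrak{A} \models \exists y\, Qx\, Fxy. \]
Case 2. $c \in \mathfrak{B} - \mathfrak{A}$. To each $a \in \mathfrak{A}$,
\[ \mathfrak{B} \models Rac \wedge Fbc, \]
\[ \mathfrak{B} \models Rac \wedge \exists x\, Fxc, \]
\[ \mathfrak{B} \models \exists y\, [Ray \wedge \exists x\, Fxy], \]
\[ \mathfrak{A} \models \exists y\, [Ray \wedge \exists x\, Fxy]. \]
Hence
\[ \mathfrak{A} \models Qy\, \exists x\, Fxy. \]
From the two cases we conclude
\[ \mathfrak{A} \models \exists y\, Qx\, Fxy \vee Qy\, \exists x\, Fxy, \]
and $\mathfrak{A}$ satisfies M5.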
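LEMMA 1. $\mathfrak{A} \in \mathcal{M} \Rightarrow \mathfrak{A}$ satisfies M1-M5.
We use the remainder of this section to show the converse. Assume that $\mathfrak{A}$ is a countable model satisfying M1-M5. By M1-M3, we get that $\mathfrak{A}$ is ordered. By M4, $\mathfrak{A}$ has a proper conservative extension. We must prove that $\mathfrak{A}$ has a proper conservative endextension. Let $\mathfrak{B}$ be a proper conservative extension of $\mathfrak{A}$. A subset $N$ of $\mathfrak{B}$ is small (relative to $\mathfrak{A}$) if $N \subseteq \{ b \in \mathfrak{B} \mid \mathfrak{B} \models Fb \}$, where $Fx \in LQ\mathfrak{A}$ and $\mathfrak{A} \not\models Qx\, Fx$. $b \in \mathfrak{B}$ is small (relative to $\mathfrak{A}$) if it is included in a small set (relative to $\mathfrak{A}$). $b \in \mathfrak{B}$ is large (relative to $\mathfrak{A}$) if for all $a \in \mathfrak{A}$, $\mathfrak{B} \models Rab$.
LEMMA 2. No element of $\mathfrak{B}$ is both small and large.
PROOF. Assume $b \in \mathfrak{B}$ is small. Then for some $Fx \in LQ\mathfrak{A}$, $\mathfrak{B} \models Fb$ and $\mathfrak{A} \not\models Qx\, Fx$, and there is $a \in \mathfrak{A}$ such that
\[ \mathfrak{A} \not\models \exists x\, (Rax \wedge Fx), \]
\[ \mathfrak{B} \not\models \exists x\, (Rax \wedge Fx). \]
Since $\mathfrak{B} \models Fb$, we have $\mathfrak{B} \not\models Rab$ and $b$ is not large.
Let $\mathfrak{C}$ be an ordered model with linear order $R$. Let $R^*$ be another linear order obtained from $R$ by permuting the elements $\{ c \mid \mathfrak{C} \models Rcd \}$, where $d$ is a fixed element. Let $\mathfrak{C}^*$ be the model obtained from $\mathfrak{C}$ by exchanging $R$ with $R^*$. Then by a straightforward argument, $\mathfrak{C}$ and $\mathfrak{C}^*$ are $LQ\mathfrak{C}$-elementarily equivalent. It is more difficult to give results on the preservation of $LR\mathfrak{C}$-formulas. The $LR\mathfrak{C}$-formulas are of the following four types:
(1) $LQ\mathfrak{C}$-formulas,
(2) $\exists x\, [Rsx \wedge Fx]$, where $Fx \in LQ\mathfrak{C}$,
(3) $Rst \wedge Ft$, where $Ft \in LQ\mathfrak{C}$,
(4) $Rst$.
The problem is to get control over formulas of the second type. We call such formulas bound formulas.
LEMMA 3. There is a proper conservative extension of $\mathfrak{A}$ such that each element in the extension is either small or large.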
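PROOF. Let $\mathfrak{B}$ be a proper conservative extension of $\mathfrak{A}$. We divide the elements of $\mathfrak{B}$ into three disjoint classes:
$S$: the small elements. (Note that $\mathfrak{A} \subseteq S$.)
$L$: the large elements.
$C = \mathfrak{B} - (S \cup L)$.
We define a new binary relation $R^*xy$ on $\mathfrak{B}$ as follows:
\[
\begin{aligned}
R^*cd \Leftrightarrow{} & c \in L \wedge Rcd\\
{}\vee{} & d \in L \wedge Rcd\\
{}\vee{} & c \in S \wedge d \in S \wedge Rcd\\
{}\vee{} & c \in C \wedge d \in C \wedge Rcd\\
{}\vee{} & c \in S \wedge d \in C.
\end{aligned}
\]
$\mathfrak{B}^*$ is the model we get from $\mathfrak{B}$ by exchanging $R$ with $R^*$. It is straightforward that $\mathfrak{B}^*$ is a proper extension of $\mathfrak{A}$ and that $\mathfrak{B}$ and $\mathfrak{B}^*$ are $LQ$-elementarily equivalent. We shall prove that $\mathfrak{B}^*$ is a conservative extension of $\mathfrak{A}$. The only problem comes with the bound formulas. It suffices to prove: Let $Fx \in LQ\mathfrak{A}$ and $a \in \mathfrak{A}$. Then
\[ (*) \qquad \mathfrak{B} \models \exists x\, [Rax \wedge Fx] \Leftrightarrow \mathfrak{B}^* \models \exists x\, [Rax \wedge Fx]. \]
Assume
\[ \mathfrak{B} \models \exists x\, [Rax \wedge Fx], \]
\[ \mathfrak{B} \models Rab \wedge Fb \quad \text{for some } b. \]
Since $a \in \mathfrak{A}$, $a \in S$. From the definition of $R^*$, $R^*ab$. By $LQ\mathfrak{B}$-elementary equivalence, $\mathfrak{B}^* \models Fb$. Hence
\[ \mathfrak{B}^* \models Rab \wedge Fb, \]
\[ \mathfrak{B}^* \models \exists x\, [Rax \wedge Fx]. \]
Now assume
\[ \mathfrak{B} \not\models \exists x\, [Rax \wedge Fx]. \]
Let $b \in \mathfrak{B}$ be such that $\mathfrak{B}^* \models Rab$. If $\mathfrak{B} \models Rab$, then $\mathfrak{B} \models \neg Fb$ and $\mathfrak{B}^* \models \neg Fb$. We then get $\mathfrak{B}^* \models Rab \supset \neg Fb$.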
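Assume $\mathfrak{B} \not\models Rab$. Then $b \in C$. By assumption $\mathfrak{B} \not\models \exists x\, [Rax \wedge Fx]$, hence $\mathfrak{A} \not\models Qx\, Fx$. Since $b \notin S$, $\mathfrak{B} \models \neg Fb$. By $LQ\mathfrak{B}$-elementary equivalence, $\mathfrak{B}^* \models \neg Fb$. In this case we also get $\mathfrak{B}^* \models Rab \supset \neg Fb$. We conclude $\mathfrak{B}^* \not\models \exists x\, [Rax \wedge Fx]$. This proves (*). It only remains to observe that the small elements of $\mathfrak{B}^*$ are exactly $S$, and the large elements are exactly $C \cup L$. $\mathfrak{B}^*$ is a proper conservative extension of $\mathfrak{A}$ such that each element is either small or large.
We have given the model $\mathfrak{A}$, satisfying M1-M5. If we now apply Lemma 3 $\omega$ times, then we get models $\mathfrak{B}_0, \mathfrak{B}_1, \ldots, \mathfrak{B}_n, \ldots$, $n < \omega$, such that $\mathfrak{B}_0$ is the model $\mathfrak{A}$, and for each $i$, $\mathfrak{B}_{i+1}$ is a proper conservative extension of $\mathfrak{B}_i$, where each element is either small or large (relative to $\mathfrak{B}_i$). Some of the properties of the chain of models $\mathfrak{B}_0, \mathfrak{B}_1, \ldots$ are preserved under permutations. (We put $\mathfrak{B} = \bigcup_{n<\omega} \mathfrak{B}_n$.)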
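The reordering $R^*$ used in the proof of Lemma 3 can be tried out on a small finite toy. In the Python sketch below, the names `reorder` and `as_sequence` are made up for illustration, and the labels S, C and L are assigned by hand rather than computed from the definitions of small and large (which need the quantifier $Q$ and an infinite model). The toy only shows how the five clauses move the elements of $C$ above all of $S$ while leaving every other comparison unchanged.

```python
def reorder(elements, cls):
    """Build R* from the five clauses.

    elements: list in R-order (earlier means R-smaller).
    cls: dict sending each element to 'S', 'C' or 'L' (hand-assigned toy labels).
    """
    pos = {e: i for i, e in enumerate(elements)}
    R = lambda c, d: pos[c] < pos[d]
    rstar = set()
    for c in elements:
        for d in elements:
            if ((cls[c] == "L" and R(c, d))
                    or (cls[d] == "L" and R(c, d))
                    or (cls[c] == "S" and cls[d] == "S" and R(c, d))
                    or (cls[c] == "C" and cls[d] == "C" and R(c, d))
                    or (cls[c] == "S" and cls[d] == "C")):
                rstar.add((c, d))
    return rstar

def as_sequence(elements, rel):
    """Return the elements listed in rel-order if rel is a strict linear order, else None."""
    for x in elements:
        if (x, x) in rel:
            return None
        for y in elements:
            if x != y and ((x, y) in rel) == ((y, x) in rel):
                return None
            for z in elements:
                if (x, y) in rel and (y, z) in rel and (x, z) not in rel:
                    return None
    # In a strict linear order, sorting by number of predecessors recovers the order.
    return sorted(elements, key=lambda e: sum((d, e) in rel for d in elements))

# Toy R-order: small elements a0, a1; "middle" elements c0, c1 below a1; large l0, l1.
elements = ["a0", "c0", "c1", "a1", "l0", "l1"]
cls = {"a0": "S", "a1": "S", "c0": "C", "c1": "C", "l0": "L", "l1": "L"}
new_order = as_sequence(elements, reorder(elements, cls))
```

In the toy, `c0` and `c1` start out below `a1` and end up above it, which is the step that turns the elements of $C$ into large elements of $\mathfrak{B}^*$.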
DEFINITION. A chain permutation $\pi$ is a permutation of the elements of $\mathfrak{B} = \bigcup_n \mathfrak{B}_n$ such that
(i) for only a finite number of elements $b$, $\pi b \neq b$,
(ii) for $b \in \mathfrak{B}_0$, $\pi b = b$,
(iii) for $b \in \mathfrak{B}_i$, $\pi b \in \mathfrak{B}_i$.
DEFINITION. A chain permutation $\pi$ acts strongly on a set $S$ of elements if for all $b \in S$,
(i) if $b$ is not large in some $\mathfrak{B}_{i+1}$ relative to $\mathfrak{B}_i$, then $\pi b = b$;
(ii) if $b$ is large in $\mathfrak{B}_{i+1}$ relative to $\mathfrak{B}_i$, then so is also $\pi b$.
A chain permutation which acts strongly on $\mathfrak{B}$ is called a strong chain permutation. A chain permutation $\pi$ acts on $\mathfrak{B}$ in the following way: Let $\pi R$ be the linear order defined by
\[ \pi Rxy \Leftrightarrow R\, \pi x\, \pi y, \]
where $R$ is the linear order in $\mathfrak{B}$. We then get $\pi\mathfrak{B}$ from $\mathfrak{B}$ by exchanging $R$ with $\pi R$. Similarly, $\pi\mathfrak{B}_0, \pi\mathfrak{B}_1, \ldots$ . It is straightforward to prove
LEMMA 4. Suppose $\pi$ is a strong chain permutation. Then $\pi\mathfrak{B}_0, \pi\mathfrak{B}_1, \ldots$ is a chain of models, where $\pi\mathfrak{B}_0$ is the model $\mathfrak{A}$, and for each $i$, $\pi\mathfrak{B}_{i+1}$ is a proper conservative extension of $\pi\mathfrak{B}_i$, where each element is either small or large relative to $\pi\mathfrak{B}_i$. Furthermore, $\pi\mathfrak{B}_i$ and $\mathfrak{B}_i$ are $LQ\mathfrak{B}_i$-elementarily equivalent.
The chain permutation acts also on formulas and sets of formulas. Let $\pi$ be a chain permutation, $F \in LR\mathfrak{B}$. $\pi F$ is the formula obtained from $F$ by exchanging each element $c$ of $F$ with $\pi c$. Similarly we get $\pi\Gamma$ from $\Gamma$, for $\Gamma$ a set of formulas in $LR\mathfrak{B}$. Let $\mathfrak{C}$ be an ordered model. We define
\[ T_{\mathfrak{C}} = \{ F \in LR\mathfrak{C} \mid \mathfrak{C} \models F,\ F \text{ is a sentence} \}, \]
\[ \bot_{\mathfrak{C}} = \{ F \in LR\mathfrak{C} \mid \mathfrak{C} \not\models F,\ F \text{ is a sentence} \}. \]
DEFINITION. Let $(T_1, \bot_1)$ and $(T_2, \bot_2)$ be two pairs of sets of sentences. $(T_2, \bot_2)$ is an extension of $(T_1, \bot_1)$ if $T_1 \subseteq T_2$ and $\bot_1 \subseteq \bot_2$. We also say that $(T_2, \bot_2)$ extends $(T_1, \bot_1)$.
DEFINITION. $(T, \bot)$ is a consistent pair if it can be extended to $(T_{\mathfrak{C}}, \bot_{\mathfrak{C}})$ for some ordered model $\mathfrak{C}$.
DEFINITION. A consistent pair $(T, \bot)$ is a model pair when it contains at least one term, and
(1) If $\neg F \in T$, then $F \in \bot$.
(2) If $\neg F \in \bot$, then $F \in T$.
(3) If $F \wedge G \in T$, then $F \in T$ and $G \in T$.
(4) If $F \wedge G \in \bot$, then $F \in \bot$ or $G \in \bot$.
(5) If $F \vee G \in T$, then $F \in T$ or $G \in T$.
(6) If $F \vee G \in \bot$, then $F \in \bot$ and $G \in \bot$.
(7) If $\forall x\, Fx \in T$ and $t$ is a term in $(T, \bot)$, then $Ft \in T$.
(8) If $\forall x\, Fx \in \bot$, then $Ft \in \bot$ for some term $t$.
(9) If $\exists x\, Fx \in T$, then $Ft \in T$ for some term $t$.
(10) If $\exists x\, Fx \in \bot$ and $t$ is a term in $(T, \bot)$, then $Ft \in \bot$.
If $(T, \bot)$ is a model pair, then the term model, got from the atomic formulas in $(T, \bot)$, gives a model making all the formulas in $T$ true and all those in $\bot$ false. If $(T_1, \bot_1)$ is a countable consistent pair, then we can by a well-known procedure extend it to a model pair.
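The propositional closure conditions (1)-(6) on a model pair are easy to check mechanically. The following Python sketch reports which of them fail for a given pair of finite sets; the tuple encoding of formulas and the name `violations` are assumptions made for illustration, and the quantifier conditions (7)-(10) as well as the consistency requirement are deliberately left out.

```python
# Hypothetical encoding: ("atom", "p"), ("not", F), ("and", F, G), ("or", F, G).

def violations(true_set, false_set):
    """Return (condition number, formula) for each failure of conditions (1)-(6)."""
    bad = []
    for f in true_set:
        if f[0] == "not" and f[1] not in false_set:
            bad.append((1, f))
        if f[0] == "and" and not (f[1] in true_set and f[2] in true_set):
            bad.append((3, f))
        if f[0] == "or" and not (f[1] in true_set or f[2] in true_set):
            bad.append((5, f))
    for f in false_set:
        if f[0] == "not" and f[1] not in true_set:
            bad.append((2, f))
        if f[0] == "and" and not (f[1] in false_set or f[2] in false_set):
            bad.append((4, f))
        if f[0] == "or" and not (f[1] in false_set and f[2] in false_set):
            bad.append((6, f))
    return bad

p, q = ("atom", "p"), ("atom", "q")

# Saturated pair: p and not-q on the true side, q on the false side.
closed = violations({("and", p, ("not", q)), p, ("not", q)}, {q})

# Unsaturated pair: a disjunction on the true side with neither disjunct decided.
open_pair = violations({("or", p, q)}, set())
```

The first pair passes all six conditions; the second fails condition (5), which is exactly the kind of gap that the extension procedure has to close by adding one of the disjuncts.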
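By an appropriate modification of this procedure we shall start with the consistent pair $(T_{\mathfrak{A}}, \bot_{\mathfrak{A}})$ and extend it to a model pair giving a model which is a proper conservative endextension of $\mathfrak{A}$. We need the following notion in the construction of the proper conservative endextension of $\mathfrak{A}$.
DEFINITION. $(T, \bot, \tau)$ is a consistent triple if
(i) $(T, \bot)$ extends $(T_{\mathfrak{A}}, \bot_{\mathfrak{A}})$;
(ii) $T - T_{\mathfrak{A}}$ and $\bot - \bot_{\mathfrak{A}}$ are finite sets;
(iii) $\tau$ is a nonempty finite sequence of elements $b_1, \ldots, b_N$ such that $b_i$ ($1 \le i \le N$) is large in $\mathfrak{B}_i$ relative to $\mathfrak{B}_{i-1}$;
(iv) the elements of $(T, \bot)$ are either in $\mathfrak{A}$ or in $\tau$ or in $(T \cap LQ\mathfrak{B}, \bot \cap LQ\mathfrak{B})$;
(v) each element of $(T, \bot)$ which is not in $\mathfrak{A}$ is a large element of $\mathfrak{B}_i$ relative to $\mathfrak{B}_{i-1}$ for some $i$ ($1 \le i \le N$);
(vi) there are strong chain permutations $\pi$, $\sigma$ such that $(T_{\sigma\mathfrak{B}}, \bot_{\sigma\mathfrak{B}})$ extends $(\pi T, \pi\bot)$;
(vii) if there is a chain permutation $\sigma$ such that $\sigma$ acts strongly on the elements of $\tau$ and $(T_{\mathfrak{B}}, \bot_{\mathfrak{B}})$ extends $((\sigma T \cap LQ\mathfrak{B}), (\sigma\bot \cap LQ\mathfrak{B}))$, then $\sigma$ acts strongly on all the elements of $(T, \bot)$;
(viii) if $\exists x\, (Rbx \wedge Fx) \in T - LR\mathfrak{A}$, then $Qx\, Fx \in T$;
(ix) if $\exists x\, (Rbx \wedge Fx) \in \bot - LR\mathfrak{A}$, then $b$ is large in some $\mathfrak{B}_i$ ($i \ge 1$) relative to $\mathfrak{B}_{i-1}$ and $Qx\, Fx \in \bot \cap LQ\mathfrak{B}_{i-1}$.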
LEMMA 5. Let $b_1$ be a large element of $\mathfrak{B}_1$ relative to $\mathfrak{B}_0$. Then $(T_{\mathfrak{A}}, \bot_{\mathfrak{A}}, \langle b_1 \rangle)$ is a consistent triple.
This is obvious. In Lemma 6 we summarize the standard results in this type of construction.
LEMMA 6.
(i) If $(T \cup \{\neg F\}, \bot, \tau)$ is a consistent triple, then so is also $(T \cup \{\neg F\}, \bot \cup \{F\}, \tau)$.
(ii) If $(T, \bot \cup \{\neg F\}, \tau)$ is a consistent triple, then so is also $(T \cup \{F\}, \bot \cup \{\neg F\}, \tau)$.
(iii) If $(T \cup \{F \wedge G\}, \bot, \tau)$ is a consistent triple, then so is also $(T \cup \{F \wedge G, F, G\}, \bot, \tau)$.
(iv) If $(T, \bot \cup \{F \wedge G\}, \tau)$ is a consistent triple, then so is also either $(T, \bot \cup \{F \wedge G, F\}, \tau)$ or $(T, \bot \cup \{F \wedge G, G\}, \tau)$.
(v) If $(T \cup \{F \vee G\}, \bot, \tau)$ is a consistent triple, then so is also $(T \cup \{F \vee G, F\}, \bot, \tau)$ or $(T \cup \{F \vee G, G\}, \bot, \tau)$.
(vi) If $(T, \bot \cup \{F \vee G\}, \tau)$ is a consistent triple, then so is also $(T, \bot \cup \{F \vee G, F, G\}, \tau)$.
(vii) If $(T \cup \{\forall x\, Fx\}, \bot, \tau)$ is a consistent triple and $b$ is an element of the triple, then $(T \cup \{\forall x\, Fx, Fb\}, \bot, \tau)$ is a consistent triple.
(viii) If $(T, \bot \cup \{\exists x\, Fx\}, \tau)$ is a consistent triple and $b$ is an element of the triple, then $(T, \bot \cup \{\exists x\, Fx, Fb\}, \tau)$ is a consistent triple.
(ix) Suppose $\forall x\, Fx \in LR\mathfrak{A}$. If $(T, \bot \cup \{\forall x\, Fx\}, \tau)$ is a consistent triple, then for some element $a$ of $\mathfrak{A}$, $Fa \in \bot$.
(x) Suppose $\exists x\, Fx \in LR\mathfrak{A}$. If $(T \cup \{\exists x\, Fx\}, \bot, \tau)$ is a consistent triple, then for some element $a$ of $\mathfrak{A}$, $Fa \in T$.
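LEMMA 7. Suppose $\forall x\, Fx \in LQ\mathfrak{B} - LR\mathfrak{A}$, $Fx \notin LQ\mathfrak{B}$, and
\[ (T, \bot \cup \{\forall x\, Fx\}, \langle b_1, \ldots, b_N \rangle) \]
is a consistent triple. Let $b_{N+1}$ be a large element of $\mathfrak{B}_{N+1}$ relative to $\mathfrak{B}_N$. Then
\[ (T, \bot \cup \{\forall x\, Fx, Fb_{N+1}\}, \langle b_1, \ldots, b_N, b_{N+1} \rangle) \]
is a consistent triple.
PROOF. $\forall x\, Fx$ must be of the form $\forall x\, \exists y\, [Rxy \wedge Gy]$. Let $\pi$, $\sigma$ be strong chain permutations such that $(T_{\sigma\mathfrak{B}}, \bot_{\sigma\mathfrak{B}})$ extends $(\pi T, \pi(\bot \cup \{\forall x\, Fx\}))$. Then $\pi b_{N+1}$ is a large element of $\sigma\mathfrak{B}_{N+1}$ relative to $\sigma\mathfrak{B}_N$ and
\[ \sigma\mathfrak{B} \not\models \pi\, \forall x\, Fx, \]
\[ \sigma\mathfrak{B}_N \not\models \pi\, \forall x\, Fx, \]
\[ \sigma\mathfrak{B}_N \not\models \pi\, \forall x\, \exists y\, [Rxy \wedge Gy], \]
\[ \sigma\mathfrak{B}_{N+1} \not\models \pi\, \exists y\, [Rb_{N+1}y \wedge Gy], \]
\[ \sigma\mathfrak{B}_{N+1} \not\models \pi\, Fb_{N+1}. \]
Hence
\[ (T, \bot \cup \{\forall x\, Fx, Fb_{N+1}\}, \langle b_1, \ldots, b_N, b_{N+1} \rangle) \]
is a consistent triple.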
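LEMMA 8. Suppose that $\exists x\, Fx \notin (LQ\mathfrak{B} \cup LR\mathfrak{A})$, and $(T \cup \{\exists x\, Fx\}, \bot, \langle b_1, \ldots, b_N \rangle)$ is a consistent triple. Then there is $b_{N+1}$ such that
\[ (T \cup \{\exists x\, Fx, Fb_{N+1}\}, \bot, \langle b_1, \ldots, b_N, b_{N+1} \rangle) \]
is a consistent triple.
PROOF. $\exists x\, Fx$ must be of the form $\exists x\, [Rbx \wedge Gx]$, where $Qx\, Gx \in T$. Let $\pi$, $\sigma$ be strong chain permutations such that $(T_{\sigma\mathfrak{B}}, \bot_{\sigma\mathfrak{B}})$ extends $(\pi(T \cup \{\exists x\, Fx\}), \pi\bot)$. Then $\sigma\mathfrak{B}_{N+1} \models \pi\, Qx\, Gx$. There is $b_{N+1}$ which is large in $\sigma\mathfrak{B}_{N+1}$ relative to $\sigma\mathfrak{B}_N$ such that
\[ \sigma\mathfrak{B}_{N+1} \models \pi\, (Rbb_{N+1} \wedge Gb_{N+1}), \]
\[ \sigma\mathfrak{B}_{N+1} \models \pi\, Fb_{N+1}. \]
Hence
\[ (T \cup \{\exists x\, Fx, Fb_{N+1}\}, \bot, \langle b_1, \ldots, b_N, b_{N+1} \rangle) \]
is a consistent triple.
LEMMA 9. Suppose $\exists x\, Fx$ and $Fx$ are in $LQ\mathfrak{B}$, and $(T \cup \{\exists x\, Fx\}, \bot, \tau)$ is a consistent triple. Then for some element $b$, $(T \cup \{\exists x\, Fx, Fb\}, \bot, \tau)$ is a consistent triple.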
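PROOF. Say $\tau$ is $\langle b_1, \ldots, b_N \rangle$. Let $M$ be the least number such that there exists a chain permutation $\sigma$ acting strongly on $\langle b_1, \ldots, b_N \rangle$ and $(T_{\mathfrak{B}}, \bot_{\mathfrak{B}})$ extends $(\sigma(T \cup \{\exists x\, Fx, Fb\}) \cap LQ\mathfrak{B}, \sigma(\bot \cap LQ\mathfrak{B}))$, where $b$ is an element with $\sigma b \in \mathfrak{B}_M$. Then $0 \le M \le N$. We shall prove:
(*) Suppose $M \ge 1$ and $b \in \mathfrak{B}_M$, and $\sigma$ a chain permutation as above. Then $\sigma b$ is large in $\mathfrak{B}_M$ relative to $\mathfrak{B}_{M-1}$.
Assume $\sigma b$ is small in $\mathfrak{B}_M$ relative to $\mathfrak{B}_{M-1}$. There is $Nx \in LQ\mathfrak{B}_{M-1}$ such that $\mathfrak{B}_{M-1} \not\models \sigma\, Qx\, Nx$ and $\mathfrak{B}_M \models \sigma\, Nb$. Let $F_1, \ldots, F_K$ be the sentences in $(T \cap LQ\mathfrak{B}) - LR\mathfrak{A}$, $G_1, \ldots, G_L$ the sentences in $(\bot \cap LQ\mathfrak{B}) - LR\mathfrak{A}$, and $K$ a sentence which expresses that the elements of $\exists x\, Fx, F_1, \ldots, F_K, G_1, \ldots, G_L$ are distinct. We write $\exists x\, Hx$ for $\exists x\, [Fx \wedge K \wedge F_1 \wedge \cdots \wedge F_K \wedge \neg G_1 \wedge \cdots \wedge \neg G_L]$.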
or Since SM- I# o Qx Nx, the first alternative cannot be realized. Hence there is c such that
,
and ac 6 BM- . There exists d and a chain permutationnacting strongly on ( b , ,...,bN> such that B knHd, ndE'BM-1.
(TB, IS) extends (n ((T u (3x Fx, Fd}) n LQB), n (In LQB)). This contradicts the choice of M,and we have proved (*).
Now back to the proof of Lemma 9. Let $\sigma$ be a chain permutation acting strongly on $\tau$ such that $(T_{\mathfrak{B}}, \bot_{\mathfrak{B}})$ extends $(\sigma((T \cup \{\exists x\, Fx, Fb\}) \cap LQ\mathfrak{B}), \sigma(\bot \cap LQ\mathfrak{B}))$, where $b$ is an element with $\sigma b \in \mathfrak{B}_M$. By property (vii) in the definition of consistent triple and by (*), we can assume that $\sigma$ is a strong chain permutation and $b$ is either a large element of $\mathfrak{B}_M$ relative to $\mathfrak{B}_{M-1}$ or $b$ is in $\mathfrak{B}_0$. There is a strong chain permutation $\pi$ such that $(T_{\pi\mathfrak{B}}, \bot_{\pi\mathfrak{B}})$ extends the quantifier-free formulas of $(\sigma T, \sigma\bot)$. By Lemma 4,
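We shall prove that $(T_{\pi\mathfrak{B}}, \bot_{\pi\mathfrak{B}})$ also extends the bound formulas of $(\sigma T, \sigma\bot)$. Assume $\exists x\, (Rcx \wedge Gx) \in \sigma T$. If $\exists x\, (Rcx \wedge Gx) \in LR\mathfrak{A}$, then obviously $\exists x\, (Rcx \wedge Gx) \in T_{\pi\mathfrak{B}}$. Assume $\exists x\, (Rcx \wedge Gx) \notin LR\mathfrak{A}$. Then by the definition of consistent triple, $Qx\, Gx \in \sigma T$. Hence $Qx\, Gx \in T_{\pi\mathfrak{B}}$ and $\exists x\, (Rcx \wedge Gx) \in T_{\pi\mathfrak{B}}$. We get $\sigma(T \cup \{\exists x\, Fx, Fb\}) \subseteq T_{\pi\mathfrak{B}}$.
Assume $\exists x\, (Rcx \wedge Gx) \in \sigma\bot$. If $\exists x\, (Rcx \wedge Gx) \in LR\mathfrak{A}$, then obviously $\exists x\, (Rcx \wedge Gx) \in \bot_{\pi\mathfrak{B}}$. Assume $\exists x\, (Rcx \wedge Gx) \notin LR\mathfrak{A}$. Then by the definition of consistent triple, there is $i$ such that $c \in \mathfrak{B}_{i+1}$ and $Qx\, Gx \in \sigma\bot \cap LQ\mathfrak{B}_i$. Hence $\pi\mathfrak{B}_i \not\models \sigma\, Qx\, Gx$. There is $d \in \mathfrak{B}_i$ such that $\pi\mathfrak{B}_i \not\models \exists x\, (Rdx \wedge Gx)$. Since $Rdc$, $\pi\mathfrak{B}_{i+1} \not\models \exists x\, (Rcx \wedge Gx)$ and $\exists x\, (Rcx \wedge Gx) \in \bot_{\pi\mathfrak{B}}$. Hence $\sigma\bot \subseteq \bot_{\pi\mathfrak{B}}$, and $(T_{\pi\mathfrak{B}}, \bot_{\pi\mathfrak{B}})$ extends $(\sigma(T \cup \{\exists x\, Fx, Fb\}), \sigma\bot)$. This proves that $(T \cup \{\exists x\, Fx, Fb\}, \bot)$ satisfies (vi) in the definition of consistent triple.
We prove (vii) in the definition of consistent triple. Assume there is a chain permutation $\rho$ such that $\rho$ acts strongly on the elements of $\tau$, and $(T_{\mathfrak{B}}, \bot_{\mathfrak{B}})$ extends
\[ (\rho(T \cup \{\exists x\, Fx, Fb\}) \cap LQ\mathfrak{B},\ \rho\bot \cap LQ\mathfrak{B}). \]
Since $(T \cup \{\exists x\, Fx\}, \bot, \tau)$ is a consistent triple, $\rho$ acts strongly on the elements of $(T \cup \{\exists x\, Fx\}, \bot)$. By (*), $\rho$ acts also strongly on $b$, and (vii) is proved. We conclude that $(T \cup \{\exists x\, Fx, Fb\}, \bot, \tau)$ is a consistent triple and Lemma 9 is proved.
LEMMA 10. Suppose $\forall x\, Fx$ and $Fx$ are in $LQ\mathfrak{B}$ and $(T, \bot \cup \{\forall x\, Fx\}, \tau)$ is a consistent triple. Then for some element $b$, $(T, \bot \cup \{\forall x\, Fx, Fb\}, \tau)$ is a consistent triple.
PROOF. The Lemma is proved in the same way as Lemma 9.
From Lemma 5-Lemma 10 we conclude
THEOREM 1. $\mathfrak{A}$ has a (countable) proper conservative endextension if and only if $\mathfrak{A}$ satisfies M1-M5.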
PROOF. We have already proved the necessity. Assume that $\mathfrak{A}$ satisfies M1-M5. Then by Lemma 5-Lemma 10 we construct a model pair $(T, \bot)$ extending $(T_{\mathfrak{A}}, \bot_{\mathfrak{A}})$ such that
(i) there is at least one element of (T, I)not in (TB, La), (ii) if b is an element of
B satisfies Ml-M5 and is countable.
THEOREM 3. Let $F$ be a formula in $LR$. $F$ is true in all models of $\mathcal{M}$ if and only if $F$ can be derived from M1-M5 in ordinary first-order logic.
Theorem 1-Theorem 3 give Result A.
2.
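Let $\mathfrak{A}$ be a model in $\mathcal{M}$. Then there is a countable proper conservative endextension $\mathfrak{B}$ of $\mathfrak{A}$. Now $\mathfrak{B}$ is ordered and hence satisfies M1-M3. M4 is equivalent to $(Qx\, (x = x))^*$. Both M4 and M5 are equivalent to formulas in $(LQ)^*$. Hence $\mathfrak{B}$ satisfies M1-M5 and is in $\mathcal{M}$. Continuing in this way we get a chain
\[ \mathfrak{A}_0, \mathfrak{A}_1, \ldots, \mathfrak{A}_\alpha, \ldots, \quad \alpha < \Omega \]
($\Omega$ is the first uncountable ordinal) such that $\mathfrak{A}_0 = \mathfrak{A}$, and each $\mathfrak{A}_\alpha$ is countable and for $\alpha < \beta$, $\mathfrak{A}_\beta$ is a proper conservative endextension of $\mathfrak{A}_\alpha$. Let
\[ \mathfrak{A}_\Omega = \bigcup_{\alpha} \mathfrak{A}_\alpha. \]
Then $\mathfrak{A}_\Omega$ is an uncountable model such that for each $Fx \in LR\mathfrak{A}_\Omega$, if
\[ \mathfrak{A}_\Omega \models (\neg Qx\, Fx)^*, \]
then $\{ a \in \mathfrak{A}_\Omega \mid \mathfrak{A}_\Omega \models Fa \}$ is countable, and if
\[ \mathfrak{A}_\Omega \models (Qx\, Fx)^*, \]
then $\{ a \in \mathfrak{A}_\Omega \mid \mathfrak{A}_\Omega \models Fa \}$ is uncountable. Besides, $\mathfrak{A}_\Omega$ is a proper conservative endextension of $\mathfrak{A}$. We conclude that for each $F \in LQ$,
\[ \mathfrak{A}_\Omega \models F \Leftrightarrow \mathfrak{A} \models F^*. \]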
The converse turns out to be quite easy. Let $\mathfrak{C}$ be an $LQ$-model. $\mathfrak{C}$ is uncountable. By a Skolem-Löwenheim argument, we get $\mathfrak{C}_0$, with $\mathfrak{C}_0$ $LQ$-elementarily equivalent to $\mathfrak{C}$, and $\mathfrak{C}_0$ has cardinality $\aleph_1$. On $\mathfrak{C}_0$, we then impose a total, linear, irreflexive ordering $R$ such that each initial segment is countable. Let $\mathfrak{C}_0'$ be the ordered model we then obtain. Obviously, for each $F \in LQ$, $\mathfrak{C} \models F \Leftrightarrow \mathfrak{C}_0' \models F^*$. We then use the ordinary Löwenheim-Skolem theorem to get a countable model $\mathfrak{C}^*$ which is $LR$-elementarily equivalent to $\mathfrak{C}_0'$. Now we observe that $\mathfrak{C}_0'$ satisfies M1-M5. Hence $\mathfrak{C}^*$ is in $\mathcal{M}$. We conclude that for each $F \in LQ$,
\[ \mathfrak{C} \models F \Leftrightarrow \mathfrak{C}^* \models F^*. \]
By the above,
THEOREM 4. For each $F \in LQ$ we have: $F$ is valid $\Leftrightarrow$ $F^*$ is true in all models of $\mathcal{M}$.
THEOREM 5. $LQ$-validity is recursively enumerable.
PROOF. We can axiomatize $LQ$-validity of a formula $F$ by using derivability of $F^*$ from M1-M5.
This gives Result B.
3.
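In [1], Keisler introduced the axioms
\[
\begin{aligned}
&\text{K1} \quad \forall y\, \forall z\, \neg Qx\, (x = y \vee x = z),\\
&\text{K2} \quad \forall x\, (Fx \supset Gx) \supset (Qx\, Fx \supset Qx\, Gx),\\
&\text{K3} \quad Qx\, Fx \leftrightarrow Qy\, Fy,\\
&\text{K4} \quad Qx\, (x = x),\\
&\text{K5} \quad Qy\, \exists x\, Fyx \supset \exists x\, Qy\, Fyx \vee Qx\, \exists y\, Fyx.
\end{aligned}
\]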
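He showed that this system together with first-order logic axiomatized $LQ$-validity. Let $\mathfrak{A}$ be an $L$-model and $W \subseteq P(\mathfrak{A})$ (powerset of $\mathfrak{A}$). The pair $\langle \mathfrak{A}, W \rangle$ is called a weak $LQ$-model. Satisfaction in weak $LQ$-models is defined as usual with the extra condition
\[ \langle \mathfrak{A}, W \rangle \models Qx\, Fx \Leftrightarrow \{ a \mid \langle \mathfrak{A}, W \rangle \models Fa \} \in W. \]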
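Satisfaction in a weak $LQ$-model can be sketched on a small finite example. The Python sketch below assumes the usual clause for the quantifier, namely that $Qx\, Fx$ holds exactly when the set defined by $Fx$ belongs to $W$; the tuple encoding, the function name `holds`, and the sample model are illustrative assumptions. A finite model of this kind does not satisfy Keisler's axioms and serves only to show how $W$ enters the truth definition.

```python
def holds(f, dom, interp, W, env=None):
    """Truth of formula f in the weak model (dom, interp, W) under the assignment env."""
    env = env or {}
    tag = f[0]
    if tag == "atom":
        return tuple(env[v] for v in f[2:]) in interp[f[1]]
    if tag == "not":
        return not holds(f[1], dom, interp, W, env)
    if tag == "and":
        return holds(f[1], dom, interp, W, env) and holds(f[2], dom, interp, W, env)
    if tag == "or":
        return holds(f[1], dom, interp, W, env) or holds(f[2], dom, interp, W, env)
    if tag in ("ex", "all", "Q"):
        # The set of elements satisfying the body when substituted for the bound variable.
        defined = frozenset(a for a in dom
                            if holds(f[2], dom, interp, W, {**env, f[1]: a}))
        if tag == "ex":
            return len(defined) > 0
        if tag == "all":
            return defined == frozenset(dom)
        return defined in W  # assumed clause: Qx Fx holds iff the defined set is in W
    raise ValueError(f"unknown connective: {tag}")

dom = {0, 1, 2}
interp = {"P": {(0,), (1,)}}
W = {frozenset({0, 1}), frozenset({0, 1, 2})}

q_p = holds(("Q", "x", ("atom", "P", "x")), dom, interp, W)
q_not_p = holds(("Q", "x", ("not", ("atom", "P", "x"))), dom, interp, W)
```

With this choice of $W$, the set defined by $Px$ counts as "many" while its complement does not.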
Keisler showed by a straightforward Henkin construction:
THEOREM (Keisler [1]). For $F \in LQ$, $F$ is valid in all countable weak $LQ$-models satisfying K1-K5 if and only if $F$ is derivable from K1-K5 and first-order logic.
In this section we will show that there is a connection between countable weak $LQ$-models satisfying K1-K5, and models in $\mathcal{M}$. Let $\langle \mathfrak{A}, W \rangle$ be a countable weak $LQ$-model satisfying K1-K5. Let
\[ V_0 = \{ \{ a \mid \langle \mathfrak{A}, W \rangle \models Fa \} \mid Fa \in LQ\mathfrak{A},\ \langle \mathfrak{A}, W \rangle \not\models Qx\, Fx \}. \]
Then obviously, $\langle \mathfrak{A}, W \rangle$ and $\langle \mathfrak{A}, P(\mathfrak{A}) - V_0 \rangle$ are $LQ\mathfrak{A}$-elementarily equivalent. $V_0$ is countable and has the following properties.
(1) All finite subsets of $\mathfrak{A}$ are in $V_0$,
(2) $S \in V_0$ and $T \in V_0 \Rightarrow S \cup T \in V_0$,
(3) $S \in V_0$ and $\{ a \mid \langle \mathfrak{A}, W \rangle \models Fa \} \subseteq S \Rightarrow \{ a \mid \langle \mathfrak{A}, W \rangle \models Fa \} \in V_0$,
(4) $\mathfrak{A} \notin V_0$.
Property (3) is a consequence of K2. In [1], Keisler gave an elementary proof that $Qx\, (Fx \vee Gx) \leftrightarrow Qx\, Fx \vee Qx\, Gx$ is derivable from K1-K5 [1, Lemma 1.9]. This gives property (2). Property (1) follows then from property (2) and K1. Since $V_0$ is countable, we can enumerate it by, say, $S_0, S_1, \ldots, S_n, \ldots$, $n < \omega$. Let
\[ T_n = \bigcup_{i<n} S_i. \]
We then get an w-chain To E TIE T,
E
TnE
... ,
n <w
u
of sets from Vo such that Tn= %, and For each S E Vo there is an n such that S c Tn.Since each TnE Vo, Tn# 8. On 3 we now impose a total, linear, irreflexive ordering R such that each Ti is an initial segment of the ordening. The claim now is that (W, R) E "Iand for each F E LQ,
(a, R) != F* .s(a, W )=! F First, we prove that (%, R) and (a, W >are LQ-elementary equivalent. This is done by induction over length of formula. The only non-trivial part is the following:
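$(\mathfrak{A}, W) \not\models Qx\,Fx$
$\Leftrightarrow (\mathfrak{A}, P(\mathfrak{A}) - V_0) \not\models Qx\,Fx$
$\Leftrightarrow \{a \mid (\mathfrak{A}, W) \models Fa\} \in V_0$
$\Leftrightarrow$ there is an N such that $\{a \mid (\mathfrak{A}, W) \models Fa\} \subseteq T_N$
$\Leftrightarrow \{a \mid (\mathfrak{A}, W) \models Fa\}$ is bounded in $(\mathfrak{A}, R)$
$\Leftrightarrow$ (induction hypothesis) $\{a \mid (\mathfrak{A}, R) \models Fa^*\}$ is bounded in $(\mathfrak{A}, R)$
$\Leftrightarrow (\mathfrak{A}, R) \not\models (Qx\,Fx)^*$.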
This proves that $(\mathfrak{A}, W)$ and $(\mathfrak{A}, R)$ are LQ-elementary equivalent. $(\mathfrak{A}, R)$ is a countable ordered model that satisfies K1-K5. Therefore it satisfies M1-M5, and $(\mathfrak{A}, R)$ is in $\mathcal{M}$.
Conversely, if we have a model $(\mathfrak{A}, R)$ in $\mathcal{M}$, we let
$W = \{\{a \mid (\mathfrak{A}, R) \models Fa^*\} \mid (\mathfrak{A}, R) \models (Qx\,Fx)^*\}.$
It is straightforward that $(\mathfrak{A}, W)$ is a countable weak LQ-model satisfying K1-K5 and $LQ_{\mathfrak{A}}$-elementary equivalent to $(\mathfrak{A}, R)$. This gives
THEOREM 6. For any $F \in LQ$, the following are equivalent:
(1) F is true in all LQ-models.
(2) F is true in all weak LQ-models satisfying K1-K5.
(3) F is derivable from K1-K5 in first-order logic.
(4) $F^*$ is true in all models in $\mathcal{M}$.
(5) $F^*$ is derivable from M1-M5.
This gives Result C.
References
[1] H. J. Keisler, Logic with the quantifier 'there exist uncountably many', Ann. Math. Logic 1 (1970) 1-93.
[2] G. Fuhrken, Languages with the added quantifier 'there exist at least $\aleph_\alpha$', in: The theory of models, J. W. Addison, L. Henkin and A. Tarski, Eds. (North-Holland, Amsterdam, 1967) pp. 121-131.
ABOUT MODELS FOR INTUITIONISTIC TYPE THEORIES AND THE NOTION OF DEFINITIONAL EQUALITY
Per MARTIN-LÖF
University of Stockholm, Stockholm, Sweden
0. Introduction
This paper consists of two parts. The first is devoted to the formulation of what seems to me to be the most natural notion of model for intuitionistic theories which either are type theories by their very definition, or else may be viewed as such because of the correspondence between formulae and type symbols discovered by Curry and Feys [2] and Howard [9]. The second analyzes the notion of definitional equality and its formal counterpart, convertibility, and advocates a change in the current definition of convertibility for systems in which explicit definitions are represented by means of lambda abstraction rather than the introduction of constants or the special constants called combinators. Because of the correspondence between lambda terms and natural deductions, this change is equally called for in Prawitz’s definition of convertibility [19] (or equivalence, as he says) for natural deductions.
1. Models
The notion of model with which we shall be concerned can be formulated at least for
(1) the positive implicational calculus,
(2) intuitionistic propositional logic,
(3) intuitionistic first order predicate logic,
(4) the system of primitive recursive functions,
(5) primitive recursive arithmetic,
(6) intuitionistic first order arithmetic,
(7) the system of primitive recursive functionals of finite type,
(8) Gödel’s theory T,
(9) intuitionistic arithmetic of finite type,
(10) intuitionistic ramified analysis,
(11) intuitionistic theories of generalized inductive definitions as formalized by Kreisel and Troelstra [14], Howard [11] and Martin-Löf [16],
(12) the intuitionistic theory of types of Martin-Löf [17],
(13) the system F of Girard [6] or, what amounts to essentially the same, intuitionistic second order logic with 0-ary predicate variables only,
(14) the theory of species,
(15) intuitionistic simple type theory.
The most important omissions in this list are systems containing axioms for choice sequences such as those of Kleene and Vesley [12] and Kreisel and Troelstra [14]. In the study of models of intuitionistic theories, one has the choice between classical and intuitionistic abstractions on the metalevel. Examples of classically described notions of model are the algebraic and topological interpretations, the Beth and Kripke semantics, Läuchli’s abstract notion of realizability and the models of Stenlund [20] and Girard [7]. Examples of intuitionistic models are Kleene’s realizability interpretation and the closely related model of convertible terms, first constructed by Tait [21] for Gödel’s theory T. An obstacle to the formulation of a general intuitionistic notion of model has been the lack of a sufficiently well-developed intuitionistic notion of set. Using the type-theoretic abstractions described in [17], I intend in the following to formulate an intuitionistic notion of model which is applicable to any one of the theories listed above and which is wide enough to include the realizability interpretation as well as the term model of the theory in question.
The transition to intuitionistic abstractions on the metalevel is both essential and nontrivial. Essential, because in what seems to me to be the most fruitful notion of model, the interpretation of the convertibility relation conv is standard, that is, it is interpreted as definitional equality $=_{\mathrm{def}}$ in the model, and definitional equality is a notion which is unmentionable within the classical set theoretic framework. Nontrivial, because of certain novelties which I would like to exemplify at once. In the realizability interpretation, when described classically, one puts
$\bar{A} =_{\mathrm{def}}$ the set of natural numbers that realize the formula or type symbol A,
$\mathrm{Ap}(e, m) =_{\mathrm{def}} \{e\}(m)$,
where, in the latter definition, it is supposed that m and e are natural numbers that realize A and $A \to B$, respectively. Intuitionistically, this no longer works, because there is no function in the intuitionistic sense which takes m and e into $\{e\}(m)$. Instead, we have to put
$\bar{A} =_{\mathrm{def}}$ the species of natural numbers that realize A,
$\mathrm{Obj}(\bar{A}) =_{\mathrm{def}} (\Sigma m \in N)\,\bar{A}(m) =_{\mathrm{def}}$ the type of pairs whose first component is a natural number m and whose second component is a proof that m realizes A,
$\mathrm{Ap}(b, a) =_{\mathrm{def}} p(q(b, a))$.
Here it is supposed that a and b are objects of types $\mathrm{Obj}(\bar{A})$ and $\mathrm{Obj}(\overline{A \to B})$, respectively, and that
e realizes $A \to B =_{\mathrm{def}} (\forall x \in \mathrm{Obj}(\bar{A}))(\exists y \in \mathrm{Obj}(\bar{B}))(\{e\}(p(x)) \simeq p(y))$
which is logically equivalent (but not definitionally equal) to the more usual form
$(\forall m \in N)(m \text{ realizes } A \to (\exists n \in N)(n \text{ realizes } B \;\&\; \{e\}(m) \simeq n))$.
The functions p and q are the left and right projections of types
$(\Sigma x \in A)\,B(x) \to A$ and $(\Pi z \in (\Sigma x \in A)\,B(x))\,B(p(z))$,
respectively, which are defined by the schema
$p((a, b)) =_{\mathrm{def}} a$, $q((a, b)) =_{\mathrm{def}} b$,
$\Sigma$ being replaced by $\exists$ if we think of B(a) for a of type A as a proposition rather than a type. Similarly, in the term model, when described classically, one puts
$\bar{A} =_{\mathrm{def}}$ the set of closed normal terms with type symbol A,
$\mathrm{Ap}(b, a) =_{\mathrm{def}}$ the normal form of b(a)
which exists and is unique by virtue of the normalization theorem and the Church-Rosser property. Again, this does not work intuitionistically, because the normal form of b(a) is not a function of a and b alone. Instead, we have to put
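$\bar{A} =_{\mathrm{def}} C_A =_{\mathrm{def}}$ the species of computable terms with type symbol A,
$\mathrm{Obj}(\bar{A}) =_{\mathrm{def}} (\Sigma a \in \mathrm{Term}(A))\,C_A(a) =_{\mathrm{def}}$ the type of pairs whose first component is a closed term a with type symbol A and whose second component is a proof that a is computable,
$\mathrm{Ap}(b, a) =_{\mathrm{def}} p(q(b, a))$.
Here $C_{A \to B}(b)$ is defined as $(\forall x \in \mathrm{Obj}(\bar{A}))(\exists y \in \mathrm{Obj}(\bar{B}))(b(p(x)) \;\mathrm{red}\; p(y))$, which is logically equivalent to
$(\forall a \in \mathrm{Term}(A))(C_A(a) \to (\exists d \in \mathrm{Term}(B))(C_B(d) \;\&\; b(a) \;\mathrm{red}\; d))$.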
Since the number of clauses needed in order to define what is a model of a certain theory grows in proportion to the number of clauses that specify the theory in question, I shall limit myself from now on to the positive implicational calculus and intuitionistic second order logic with 0-ary predicate variables only. Having given the complete definition for these, it should be clear how to extend it to the other theories listed above.
1.1. The positive implicational calculus
A model for the positive implicational calculus consists of the following data.
(a) A type Typ.
(b) A function Obj which to an arbitrary object A of type Typ assigns a type Obj(A).
(c) A function F of type $\mathrm{Typ} \to \mathrm{Typ} \to \mathrm{Typ}$. Here and in the following parentheses are associated to the right. When there is no risk of confusion, I shall allow myself to write $A \to B$ instead of F(A, B).
(d) A function Ap of type $\mathrm{Obj}(F(A, B)) \to \mathrm{Obj}(A) \to \mathrm{Obj}(B)$ for every pair of objects A and B of type Typ. $\mathrm{Ap}(\ldots \mathrm{Ap}(\mathrm{Ap}(b, a_1), a_2) \ldots, a_n)$ will be abbreviated $b(a_1, \ldots, a_n)$.
(e) Closure under explicit definitions. For every finite sequence of objects $A_1, \ldots, A_n$ and B of type Typ and every term $b[x_1, \ldots, x_n]$ of type Obj(B) built up from variables $x_1, \ldots, x_n$ of types $\mathrm{Obj}(A_1), \ldots, \mathrm{Obj}(A_n)$, respectively, by means of the operation Ap, there shall exist an object $f \in \mathrm{Obj}(A_1 \to \ldots \to A_n \to B)$ such that
$f(a_1, \ldots, a_n) =_{\mathrm{def}} b[a_1, \ldots, a_n]$.
By the combinatorial completeness property, it suffices in fact to have, for every triple of objects A, B and C of type Typ, objects I, K and S of types $\mathrm{Obj}(A \to A)$, $\mathrm{Obj}(A \to B \to A)$ and $\mathrm{Obj}((A \to B \to C) \to (A \to B) \to A \to C)$, respectively, such that
$I(a) =_{\mathrm{def}} a$,
$K(a, b) =_{\mathrm{def}} a$,
$S(c, b, a) =_{\mathrm{def}} c(a, b(a))$.
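The three defining equations for I, K and S can be tried out with a minimal sketch, assuming that curried Python functions stand in for the objects of the model. The name `identity_via_SK` is hypothetical, and Python can only test the equations on particular arguments; it does not capture the requirement that the equalities be definitional.

```python
# Curried Python functions standing in for the objects I, K and S.
# b(a1, ..., an) in the text abbreviates iterated application, so
# S(c, b, a) is written S(c)(b)(a) here.

def I(a):
    # I(a) = a
    return a

def K(a):
    # K(a, b) = a
    return lambda b: a

def S(c):
    # S(c, b, a) = c(a, b(a))
    return lambda b: lambda a: c(a)(b(a))

# Combinatorial completeness in miniature: the explicitly defined
# function f(a) = a is also obtained from S and K alone.
# (identity_via_SK is a hypothetical name used only in this sketch.)
identity_via_SK = S(K)(K)

if __name__ == "__main__":
    print(identity_via_SK("a"))
```

Here `S(K)(K)(a)` unfolds to `K(a)(K(a))`, which is `a`, so the combination behaves like I on every argument it is applied to.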
Besides the use of the intuitionistic type theoretic abstractions instead of the classical set theoretic ones on the metalevel, the most important difference between this notion of model and the models defined by Stenlund [20] and Girard [7] is the requirement that the equality in the equation $f(a_1, \ldots, a_n) =_{\mathrm{def}} b[a_1, \ldots, a_n]$ be definitional and not merely set-theoretic equality or equality with respect to some arbitrary equivalence relation.
Suppose now that we are given an assignment of an object $\bar{A}$ of type Typ to every atomic formula A of the positive implicational calculus. Extend this assignment to composite formulae by putting $\overline{A \to B} =_{\mathrm{def}} F(\bar{A}, \bar{B})$. In classical model theory one verifies that a formula which is formally derivable is true in an arbitrary model of the theory. For us, this step corresponds to showing how to assign to a closed term a with type symbol A an object $\bar{a}$ of type $\mathrm{Obj}(\bar{A})$. The definition of $\bar{a}$ is by induction on the construction of a. However, during the induction we have to consider open terms as well. We put
$\bar{x} =_{\mathrm{def}}$ a variable of type $\mathrm{Obj}(\bar{A})$, provided x is a variable with type symbol A,
$\overline{b(a)} =_{\mathrm{def}} \mathrm{Ap}(\bar{b}, \bar{a})$,
$\bar{f} =_{\mathrm{def}}$ the object of type $\mathrm{Obj}(\bar{A}_1 \to \ldots \to \bar{A}_n \to \bar{B})$ such that $\bar{f}(a_1, \ldots, a_n) =_{\mathrm{def}} \bar{b}[a_1, \ldots, a_n]$ for $a_1, \ldots, a_n$ of types $\mathrm{Obj}(\bar{A}_1), \ldots, \mathrm{Obj}(\bar{A}_n)$, respectively, provided f is the constant with type symbol $A_1 \to \ldots \to A_n \to B$ introduced by the schema $f(a_1, \ldots, a_n)$ conv $b[a_1, \ldots, a_n]$.
The assignment of $\bar{a}$ to a is clearly such that,
if a conv b, then $\bar{a} =_{\mathrm{def}} \bar{b}$.
Thus the interpretation of the convertibility relation is standard.
1.1.1. EXAMPLE. Intended interpretation.
(a) Typ $=_{\mathrm{def}}$ the type of propositions.
(b) Obj(A) $=_{\mathrm{def}}$ the type of proofs of the proposition A.
(c) F(A, B) $=_{\mathrm{def}}$ the proposition A implies B.
(d) Ap $=_{\mathrm{def}}$ modus ponens.
(e) Closure under explicit definitions is trivially fulfilled.
1.1.2. EXAMPLE. Realizability interpretation.
(a) Typ $=_{\mathrm{def}}$ the type of species of natural numbers.
(b) $\mathrm{Obj}(A) =_{\mathrm{def}} (\Sigma m \in N)\,A(m)$.
(c) F(A, B) $=_{\mathrm{def}}$ the species of all natural numbers e such that $(\forall x \in \mathrm{Obj}(A))(\exists y \in \mathrm{Obj}(B))(\{e\}(p(x)) \simeq p(y))$.
(d) $\mathrm{Ap}(b, a) =_{\mathrm{def}} p(q(b, a))$.
(e) Closure under explicit definitions is verified in very much the same way as in the term model which we shall construct next.
1.1.3. EXAMPLE. Term model.
(a) Typ $=_{\mathrm{def}}$ the type of pairs $(A, \varphi)$, where A is a type symbol and $\varphi$ a species of closed terms with type symbol A.
(b) $\mathrm{Obj}((A, \varphi)) =_{\mathrm{def}} (\Sigma a \in \mathrm{Term}(A))\,\varphi(a)$.
(c) $F((A, \varphi), (B, \psi)) =_{\mathrm{def}}$ ($A \to B$, the species of all closed terms b with type symbol $A \to B$ such that $(\forall x \in \mathrm{Obj}((A, \varphi)))(\exists y \in \mathrm{Obj}((B, \psi)))(b(p(x)) \;\mathrm{red}\; p(y))$).
(d) $\mathrm{Ap}(b, a) =_{\mathrm{def}} p(q(b, a))$.
(e) We shall verify closure under explicit definitions by considering a typical case, namely, we shall show how to interpret the constant K with type symbol $A \to B \to A$ which is defined by the schema
K(a, b) conv a.
Its interpretation $\bar{K}$ is simply the pair consisting of the constant K and the usual proof that K is computable (see Tait [21]),
$\bar{K} =_{\mathrm{def}} (K, (\lambda x)((K(p(x)), (\lambda y)(x, \text{the proof that } K(p(x), p(y)) \;\mathrm{red}\; p(x))), \text{the proof that } K(p(x)) \;\mathrm{red}\; K(p(x))))$
where, of course, the use of the lambda notation is informal. For this $\bar{K}$ and a and b of types $\mathrm{Obj}(\bar{A})$ and $\mathrm{Obj}(\bar{B})$, respectively, we have
$\bar{K}(a, b) =_{\mathrm{def}} \mathrm{Ap}(\mathrm{Ap}(\bar{K}, a), b) =_{\mathrm{def}} p(q(p(q(\bar{K}, a)), b)) =_{\mathrm{def}} a$
as desired. This finishes the construction of the term model for the positive implicational calculus.
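The computation of the interpretation of K applied to a and b can be traced step by step in a small sketch. It assumes that pairs are Python tuples, that formal terms are encoded by a hypothetical helper `app`, and that proofs are mere placeholder strings; none of this is part of the construction itself, which takes place in intuitionistic type theory.

```python
def p(pair):
    # Left projection.
    return pair[0]

def q(pair):
    # Right projection.
    return pair[1]

def Ap(b, a):
    # Ap(b, a) = p(q(b, a)), where q(b, a) abbreviates q(b) applied to a.
    return p(q(b)(a))

def app(*terms):
    # Hypothetical encoding of the formal term t0(t1, ..., tn) as a tuple.
    return ("app",) + terms

# The interpretation of K: the constant K paired with (a stand-in for)
# the proof that K is computable. Proofs are placeholder strings only.
K_bar = (
    "K",
    lambda x: (
        (app("K", p(x)),
         lambda y: (x, "proof that K(p(x), p(y)) red p(x)")),
        "proof that K(p(x)) red K(p(x))",
    ),
)

def interpret_K(a, b):
    # K_bar(a, b) = Ap(Ap(K_bar, a), b)
    return Ap(Ap(K_bar, a), b)

if __name__ == "__main__":
    a = ("t_a", "proof that t_a is computable")
    b = ("t_b", "proof that t_b is computable")
    print(interpret_K(a, b) is a)
```

The result is the object a itself, term and proof together, and not merely some object with the same first component.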
1.2. Intuitionistic second order logic
We shall now extend the notion of model just introduced for the positive implicational calculus to the second order. A model for the fragment of intuitionistic second order logic in which only 0-ary predicate variables are allowed or, what amounts to essentially the same, the system F of Girard [6] consists of the following data.
(a) A type Typ.
(b) A function Obj which to an object A of type Typ assigns a type Obj(A).
(c) An assignment to every closed formula A of the extended language, obtained by adding the objects of type Typ as constants, of an object $\bar{A}$ of type Typ such that
$\bar{A} =_{\mathrm{def}} A$
if A is (the constant for) an object of type Typ, and, furthermore, the substitution property
$\overline{B[A]} =_{\mathrm{def}} \overline{B[\bar{A}]}$
is fulfilled. Observe that the equality here is definitional. In particular, the function F required for the positive implicational calculus is given by
$F(A, B) =_{\mathrm{def}} \overline{A \to B}$
where, of course, $A \to B$ is the formula of the extended language obtained by applying the connective $\to$ to the constants A and B.
(d) For all closed formulae in the extended language of the forms $A \to B$ and $(\forall X)\,B[X]$, there shall be functions
$\mathrm{Ap} \in \mathrm{Obj}(\overline{A \to B}) \to \mathrm{Obj}(\bar{A}) \to \mathrm{Obj}(\bar{B})$,
$\mathrm{Ap} \in \mathrm{Obj}(\overline{(\forall X)\,B[X]}) \to (\Pi X \in \mathrm{Typ})\,\mathrm{Obj}(\overline{B[X]})$,
respectively.
(e) Closure under explicit definitions. If
$(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)$
is a closed second order formula, and $c[X_1, \ldots, X_m, y_1, \ldots, y_n]$ is a term of type $\mathrm{Obj}(\overline{C[X_1, \ldots, X_m]})$ built up from variables $X_1, \ldots, X_m$ of type Typ, terms of type Typ of the form $\overline{A[X_1, \ldots, X_m]}$, where $A[X_1, \ldots, X_m]$ is a second order formula, and variables $y_1, \ldots, y_n$ of types
$\mathrm{Obj}(\overline{B_1[X_1, \ldots, X_m]}), \ldots, \mathrm{Obj}(\overline{B_n[X_1, \ldots, X_m]})$,
respectively, by means of the functions Ap, there shall exist an object
$f \in \mathrm{Obj}(\overline{(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)})$
such that
$f(A_1, \ldots, A_m, b_1, \ldots, b_n) =_{\mathrm{def}} c[A_1, \ldots, A_m, b_1, \ldots, b_n]$.
Here it has been assumed, for notational simplicity, that all the universal quantifiers precede all the implications in the formula
$(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)$.
In general, they may occur in an arbitrary order. The mapping of a closed term a with type symbol A into an object $\bar{a}$ of type $\mathrm{Obj}(\bar{A})$ already defined for the positive implicational calculus is extended in the obvious way to the second order. That is, we add the new clause
$\overline{b(A)} =_{\mathrm{def}} \mathrm{Ap}(\bar{b}, \bar{A})$
and change the third of the previous clauses to
$\bar{f} =_{\mathrm{def}}$ the object of type $\mathrm{Obj}(\overline{(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)})$ such that $\bar{f}(A_1, \ldots, A_m, b_1, \ldots, b_n) =_{\mathrm{def}} \bar{c}[A_1, \ldots, A_m, b_1, \ldots, b_n]$ which we have required to exist, provided f is a constant with type symbol $(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)$ introduced by the schema $f(A_1, \ldots, A_m, b_1, \ldots, b_n)$ conv $c[A_1, \ldots, A_m, b_1, \ldots, b_n]$.
There is, however, one essential novelty that arises and which was overlooked by Stenlund [20]. Namely, we have to verify that $\overline{b(A)} =_{\mathrm{def}} \mathrm{Ap}(\bar{b}, \bar{A})$ is an object of type $\mathrm{Obj}(\overline{B[A]})$ provided $\bar{A}$ and $\bar{b}$ are the objects of types Typ and $\mathrm{Obj}(\overline{(\forall X)\,B[X]})$ associated with the formula A and the term b with type symbol $(\forall X)\,B[X]$, respectively. One sees immediately that $\bar{b}(\bar{A})$ is an object of type $\mathrm{Obj}(\overline{B[\bar{A}]})$. But $\overline{B[\bar{A}]} =_{\mathrm{def}} \overline{B[A]}$ by the substitution property and hence $\overline{b(A)}$ is indeed an object of type $\mathrm{Obj}(\overline{B[A]})$. Note that the last step in the argument is an application of the informal counterpart of the formal rule of type conversion formulated in [17]. Just as in the case of the positive implicational calculus, it is clear that the interpretation of the convertibility relation is standard, that is, that a conv b implies $\bar{a} =_{\mathrm{def}} \bar{b}$.
1.2.1. EXAMPLE. Intended interpretation. Typ and Obj are defined as in the case of the positive implicational calculus. Moreover, for every formula A of the extended language, we put
$\bar{A} =_{\mathrm{def}}$ the proposition denoted by A.
We take the first of the functions Ap to be modus ponens just as before and the second to be
Ap $=_{\mathrm{def}}$ universal instantiation.
Closure under explicit definitions is trivially fulfilled.
1.2.2. EXAMPLE. Realizability interpretation. The extension to the second order is due to Kreisel and Troelstra [14].
(a) Typ $=_{\mathrm{def}}$ the type of species of natural numbers.
(b) $\mathrm{Obj}(A) =_{\mathrm{def}} (\Sigma m \in N)\,A(m)$.
(c) For a closed formula A of the extended language, the definition of the species $\bar{A}$ is by induction on the construction of A. If A is a constant, then $\bar{A}$ is the object of type Typ which it is a constant for.
$\overline{A \to B}(e) =_{\mathrm{def}} (\forall x \in \mathrm{Obj}(\bar{A}))(\exists y \in \mathrm{Obj}(\bar{B}))(\{e\}(p(x)) \simeq p(y))$,
which is equivalent (but not definitionally equal) to the definition of Kreisel and Troelstra [14]. The verification of the substitution property $\overline{B[A]} =_{\mathrm{def}} \overline{B[\bar{A}]}$ is immediate by induction on the construction of the formula B[X].
(d) The first of the functions Ap is defined just as in the case of the positive implicational calculus and the second by
$\mathrm{Ap}(b, A) =_{\mathrm{def}} p(q(b, A))$.
(e) Closure under explicit definitions. Let the constant f of type
$(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)$
$C[X_1, \ldots, X_m]$,
with free variables and assumptions as indicated. In the proof that every derivable formula is realizable, one shows how to associate with such a derivation a Gödel number e and a proof that, for all species of natural numbers $A_1, \ldots, A_m$, if $e_j$ realizes $B_j[X_1, \ldots, X_m]$ relative to $A_1, \ldots, A_m$ for $j = 1, \ldots, n$, then $\{e\}(e_1, \ldots, e_n)$ is defined and realizes $C[X_1, \ldots, X_m]$ relative to $A_1, \ldots, A_m$. The number e together with this proof is essentially the object $\bar{f}$ of type $\mathrm{Obj}(\overline{(\forall X_1) \ldots (\forall X_m)(B_1 \to \ldots \to B_n \to C)})$ that interprets the constant f. The verification that
$\bar{f}(A_1, \ldots, A_m, b_1, \ldots, b_n) =_{\mathrm{def}} \bar{c}[A_1, \ldots, A_m, b_1, \ldots, b_n]$
will be omitted since it is completely analogous to the corresponding verification for the term model.
1.2.3. EXAMPLE. Term model. The extension to the second order is due to Girard [6].
(a) Typ $=_{\mathrm{def}}$ the type of pairs of the form $(A, \varphi)$, where A is a closed second order formula and $\varphi$ a species of closed terms with type symbol A.
(b) $\mathrm{Obj}((A, \varphi)) =_{\mathrm{def}} (\Sigma a \in \mathrm{Term}(A))\,\varphi(a)$, where Term(A) denotes the type of closed terms with type symbol A.
(c) For a closed formula A of the extended language, the definition of the object $\bar{A}$ of type Typ is by induction on the construction of A.
$\bar{A} =_{\mathrm{def}} A$
if A is (the constant for) an object of type Typ.
For an implication the definition is as in the case of the positive implicational calculus.
$\overline{(\forall X)\,B[X]} =_{\mathrm{def}}$ ($(\forall X)\,B[X]$, the species of all closed terms b with type symbol $(\forall X)\,B[X]$ such that
Here parameters have been suppressed for the sake of notational simplicity.
The fact that $\overline{B[A]} =_{\mathrm{def}} \overline{B[\bar{A}]}$, which is seen by induction on the construction of the formula B[X], is essentially the content of the substitution lemma in Girard [6]. However, it has to be observed as in [15] that the equality in the substitution lemma is definitional and not merely extensional.
(d) Application. $\mathrm{Ap}(b, a) =_{\mathrm{def}} p(q(b, a))$.
(e) Closure under explicit definitions. Just as in the case of the positive implicational calculus, we shall verify this by considering a typical case. Let the constant I with type symbol $(\forall X)(X \to X)$ be defined by the schema I(A, b) conv b. We construct its interpretation $\bar{I}$ of type $\mathrm{Obj}(\overline{(\forall X)(X \to X)})$ by taking the pair consisting of the constant I and the proof that I is computable,
for A of type Typ and b of type Obj(A) as desired.
2. Definitional equality
By definitional equality, I mean the relation which is used on almost every page of an informal mathematical text and which is denoted by $\equiv$, $=_{\mathrm{def}}$ or most often but less felicitously simply $=$. As an example, one can take the first part of this paper where it has been used more than fifty times. Being informal, it occurs in the left column of the following dictionary which shows the relation between certain informal notions and their formal counterparts.

informal                  formal
proposition               formula
proof                     derivation, proof figure
type                      type symbol
mathematical object       term
defining equation         rule of conversion
definiendum               redex
definiens                 contractum
definitional equality     convertibility
Thus the formal counterpart of definitional equality is the relation of convertibility studied in combinatory logic and proof theory. Definitional equality is a relation between linguistic expressions and not between the abstract entities which they denote and which are the same. This is the view that Frege [3] took of the relation of equality of content (Inhaltsgleichheit) which enters into his Begriffsschrift but which he later abandoned. I claim that the relation of definitional equality is determined by the following three principles and by these principles alone.
(i) A definiens is always definitionally equal to its definiendum.
(ii) Definitional equality is preserved under substitution. That is, if we substitute two definitionally equal expressions for a variable in a given expression, then the resulting expressions are also definitionally equal.
(iii) Definitional equality is an equivalence relation, that is, it is reflexive, symmetric and transitive.
This claim is supported by the following heuristic evidence. The only place where the relation of definitional equality is used in a crucial way except in the definitional schemata themselves is in arguments of the form
if a is an object of type A and $A =_{\mathrm{def}} B$, then a is an object of type B,
and, correspondingly for propositions and proofs,
if a is a proof of the proposition A and $A =_{\mathrm{def}} B$, then a is a proof of the proposition B.
This principle is accepted on the basis that if A =def B, then A and B are merely notational variants of one and the same abstract type or proposition, as the case may be. Detailed case studies show that the relation of definitional equality with respect to which this principle is applied has to satisfy precisely the above three conditions. Here is a typical example. Define a type valued function F by the schema
Then, given a function f of type $(\Pi n \in N)\,F(n)$, we can define a function g of the same type by putting
$g(n) =_{\mathrm{def}} f(n + 1, f(n))$.
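The definition of g can be checked mechanically in a proof assistant based on type theory. The following Lean 4 sketch is only an illustration: the base clause of F is an assumption made here (F(0) is taken to be Nat), and only the successor clause matters for the argument. Note also that Lean's built-in conversion relation is more generous than the one advocated in this paper.

```lean
-- Assumption of this sketch: F(0) is taken to be Nat. Only the
-- successor clause F(n + 1) = F(n) → F(n) is used in the argument.
def F : Nat → Type
  | 0     => Nat
  | n + 1 => F n → F n

-- f (n + 1) has type F (n + 1), which unfolds by definition to
-- F n → F n, so it can be applied to f n without any explicit coercion.
def g (f : (n : Nat) → F n) : (n : Nat) → F n :=
  fun n => f (n + 1) (f n)
```

The application `f (n + 1) (f n)` is accepted only because the checker may replace the type `F (n + 1)` by the definitionally equal type `F n → F n`, which is an instance of the principle stated above for objects and types.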
Indeed, if n is an arbitrary natural number, f(n) and f(n + 1) are objects of types F(n) and F(n + 1), respectively. But $F(n + 1) =_{\mathrm{def}} F(n) \to F(n)$ and hence f(n + 1) is a function of type $F(n) \to F(n)$, so that we can apply it to f(n), thereby getting an object f(n + 1, f(n)) of type F(n).
Let us now see what corresponds to the three principles determining the relation of definitional equality on the formal level. Clearly, they are turned into the conversion rules
redex conv contractum,
$\dfrac{a \;\mathrm{conv}\; c}{b[a] \;\mathrm{conv}\; b[c]}$,
$a \;\mathrm{conv}\; a$, $\quad \dfrac{a \;\mathrm{conv}\; b}{b \;\mathrm{conv}\; a}$, $\quad \dfrac{a \;\mathrm{conv}\; b \qquad b \;\mathrm{conv}\; c}{a \;\mathrm{conv}\; c}$,
in the second of which the terms a and c must have the same type symbol as the variable x in b[x] for which they are substituted. The corresponding reduction relation is obtained by omitting the symmetry rule and the strict reduction relation by omitting the reflexivity as well. In particular, in the positive implicational calculus (or, what amounts to the same, the basic theory of functionality in [2]) when formulated with constants, the conversion rules are equivalent to the following
$f(a_1, \ldots, a_n) \;\mathrm{conv}\; b[a_1, \ldots, a_n]$,
$\dfrac{a \;\mathrm{conv}\; c}{b(a) \;\mathrm{conv}\; b(c)}$, $\quad \dfrac{b \;\mathrm{conv}\; d}{b(a) \;\mathrm{conv}\; d(a)}$,
$a \;\mathrm{conv}\; a$, $\quad \dfrac{a \;\mathrm{conv}\; b}{b \;\mathrm{conv}\; a}$, $\quad \dfrac{a \;\mathrm{conv}\; b \qquad b \;\mathrm{conv}\; c}{a \;\mathrm{conv}\; c}$
which generate the convertibility and, if symmetry is left out, reduction relations which are called weak in combinatory logic. When only the special constants called combinators are allowed, the first rule of conversion specializes to
I(a) conv a, K(a, b) conv a, S(c, b, a) conv c(a, b(a)).
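On the other hand, if we consider the typed lambda calculus, which is isomorphic to the natural deduction formulation of the positive implicational calculus, then the above rules of conversion reduce to
$(\lambda x)\,b[x](a) \;\mathrm{conv}\; b[a]$, $\quad \dfrac{a \;\mathrm{conv}\; c}{b[a] \;\mathrm{conv}\; b[c]}$,
$a \;\mathrm{conv}\; a$, $\quad \dfrac{a \;\mathrm{conv}\; b}{b \;\mathrm{conv}\; a}$, $\quad \dfrac{a \;\mathrm{conv}\; b \qquad b \;\mathrm{conv}\; c}{a \;\mathrm{conv}\; c}$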
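Read as reduction rules, the rules above allow a redex to be contracted inside a term b[a] only when it arises by substitution for a variable of b[x], that is, only when none of its free variables is bound by an enclosing lambda. The following sketch illustrates this side condition on untyped terms encoded as tuples; the encoding, the function names and the `restricted` flag are assumptions made for this illustration, and the naive substitution is safe only if all bound variable names are distinct.

```python
# Terms: ("var", x), ("lam", x, body), ("app", function, argument).

def free_vars(t):
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    # Naive substitution of s for x in t; assumes distinct bound names.
    kind = t[0]
    if kind == "var":
        return s if t[1] == x else t
    if kind == "lam":
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, s))
    return ("app", subst(t[1], x, s), subst(t[2], x, s))

def step(t, bound=frozenset(), restricted=True):
    """Contract the leftmost-outermost admissible redex, or return None.

    With restricted=True a redex is admissible only if none of its free
    variables is bound by an enclosing lambda; with restricted=False
    every redex is admissible (general reduction).
    """
    kind = t[0]
    if kind == "app":
        f, a = t[1], t[2]
        if f[0] == "lam" and not (restricted and free_vars(t) & bound):
            return subst(f[2], f[1], a)
        r = step(f, bound, restricted)
        if r is not None:
            return ("app", r, a)
        r = step(a, bound, restricted)
        if r is not None:
            return ("app", f, r)
        return None
    if kind == "lam":
        r = step(t[2], bound | {t[1]}, restricted)
        return None if r is None else ("lam", t[1], r)
    return None

if __name__ == "__main__":
    ident = ("lam", "y", ("var", "y"))
    blocked = ("lam", "x", ("app", ident, ("var", "x")))
    print(step(blocked), step(blocked, restricted=False))
```

In the term `blocked` the inner redex contains the bound variable x, so it cannot be contracted in the restricted sense, whereas general reduction contracts it under the lambda.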
The corresponding reduction relation, which is obtained by leaving out the symmetry rule, is precisely the restricted (as opposed to general) reduction relation introduced by Howard [10]. But, and this is the important point, the convertibility relation generated by these rules is not (the typed version of) the usual convertibility relation between lambda terms as defined in [1] and [2], because the rule
$(\xi) \qquad \dfrac{b[x] \;\mathrm{conv}\; d[x]}{(\lambda x)\,b[x] \;\mathrm{conv}\; (\lambda x)\,d[x]}$
which cannot be derived from the others, has been left out. Similarly, the reduction relation between natural deductions introduced by Prawitz [18] corresponds not to the restricted but to the general reduction relation, because in a reduction step as defined by him
\[
\dfrac{\begin{matrix} \vdots \\ A \end{matrix} \qquad \dfrac{\begin{matrix} A \\ \vdots \\ B \end{matrix}}{A \to B}}{B}
\quad \mathrm{red} \quad
\begin{matrix} \vdots \\ A \\ \vdots \\ B \end{matrix}
\]
there may be open assumptions in the subderivation
\[
\dfrac{\begin{matrix} \vdots \\ A \end{matrix} \qquad \dfrac{\begin{matrix} A \\ \vdots \\ B \end{matrix}}{A \to B}}{B}
\]
which become closed (discharged or cancelled) further down in the derivation, that is, below the downmost occurrence of the formula B. The outcome of the foregoing analysis is that the rule $(\xi)$ is unacceptable as a rule of conversion. Of course, we are free to define many different relations between terms and call them convertibility relations, but my claim is that only one of these correctly formalizes the informal notion of definitional equality. And a correct definition of convertibility is vital in all systems, in particular, in all higher type systems like Gödel’s T, intuitionistic arithmetic of finite type and intuitionistic
simple type theory, whose formulae alias type symbols are not necessarily in normal form, because we then need the rule of inference
$\dfrac{A \qquad A \;\mathrm{conv}\; B}{B}$
alias the rule of term formation
if a is a term with type symbol A and A conv B, then a is a term with type symbol B.
Hence a change in the definition of convertibility may change the stock of derivations alias terms of the theory and even the derivability relation. This difficulty does not arise in systems whose formulae alias type symbols are all in normal form, because then the derivations alias terms can be generated separately, that is, without reference to the convertibility relation, whose definition can wait until afterwards. Examples of such systems are intuitionistic first order predicate logic and Girard’s system F.
2.1. Positive effects of abolishing the conversion rule $(\xi)$
2.1.1. For the models described in the first part of this paper, we achieve that a conv b implies $\bar{a} =_{\mathrm{def}} \bar{b}$. In particular, in the realizability interpretation, we achieve that, if a and b are two interconvertible derivations of the formula A, then the corresponding numbers which realize A as well as the proofs which show that they do so are definitionally equal. Similarly, in the term model, we achieve that, if a conv b, then the normal forms of a and b as well as the proofs which show that they are computable (hereditarily normalizable) are definitionally equal.
2.1.2. When defining his notion of model for functional systems up to the level of intuitionistic simple type theory, Girard [7] introduces a relation reduction* between closed terms of an extended language obtained by adding as new constants elements of certain sets of arbitrary cardinality. The relation reduction* is obtained from the usual reduction relation for lambda terms by not allowing a redex to be contracted unless it is closed. Now, since we are only dealing with closed terms, this is precisely the same restriction as the one that has been advocated above.
Girard [7] also notes that the realizability interpretation alias the model of the hereditarily recursive operations is not a model with respect to the general reduction relation for lambda terms. This is his reason for introducing the restricted relation reduction*.
2.1.3. For the natural transformations from lambda terms to terms built up from constants or combinators and vice versa, denoted by the superscripts $^\circ$ and $^\bullet$, respectively, we achieve that, for lambda terms a and b,
a conv b implies $a^\circ$ conv $b^\circ$
while preserving the property that, for terms a and b built up from constants or combinators,
a conv b implies $a^\bullet$ conv $b^\bullet$.
The transformations $^\circ$ and $^\bullet$ are defined as follows.
2.1.3.1. For a variable in the lambda calculus, we put
$x^\circ =_{\mathrm{def}} x$.
Furthermore,
where $a_1, \ldots, a_n$ are the (necessarily disjoint) maximal subterms of $b[a_1, \ldots, a_n, x]$ that do not contain any free occurrences of the variable x, and f is the function constant introduced by the schema $f(a_1, \ldots, a_n, a)$ conv $b^\circ[a_1, \ldots, a_n, a]$.
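If we only allow the special constants I, K and S, we have to put instead
$((\lambda x)\,b[x])^\circ =_{\mathrm{def}} (\lambda x)\,b^\circ[x]$,
where the abstraction on the right-hand side is defined by
$(\lambda x)\,x =_{\mathrm{def}} I$,
$(\lambda x)\,a =_{\mathrm{def}} K(a)$
provided x does not occur free in a, and
$(\lambda x)(b[x](a[x])) =_{\mathrm{def}} S((\lambda x)\,b[x], (\lambda x)\,a[x])$
provided x occurs free in at least one of the terms a[x] and b[x] (otherwise the previous clause is applicable).
2.1.3.2. Conversely,
$x^\bullet =_{\mathrm{def}} x$,
$(b(a))^\bullet =_{\mathrm{def}} b^\bullet(a^\bullet)$
and, if f is a function constant introduced by the schema $f(a_1, \ldots, a_n)$ conv $b[a_1, \ldots, a_n]$, then
$f^\bullet =_{\mathrm{def}} (\lambda x_1) \ldots (\lambda x_n)\,b^\bullet[x_1, \ldots, x_n]$
so that, in particular,
$I^\bullet =_{\mathrm{def}} (\lambda x)\,x$, $\quad K^\bullet =_{\mathrm{def}} (\lambda x)(\lambda y)\,x$, $\quad S^\bullet =_{\mathrm{def}} (\lambda x)(\lambda y)(\lambda z)\,x(z, y(z))$.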
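The three clauses for abstraction by means of I, K and S, together with the weak reduction rules for these combinators, can be sketched as follows. The encoding of terms (strings for atoms, tuples for applications) and the function names are assumptions of this illustration, and `normalize` need not terminate on untyped terms.

```python
def app(f, *args):
    # b(a1, ..., an) abbreviates iterated application.
    for a in args:
        f = ("app", f, a)
    return f

def occurs(x, t):
    # Does the variable x occur in the term t?
    if isinstance(t, str):
        return t == x
    return occurs(x, t[1]) or occurs(x, t[2])

def abstract(x, t):
    """Abstraction of x from a combinator term t, by the three clauses."""
    if t == x:
        return "I"
    if not occurs(x, t):
        return app("K", t)
    # t is an application b(a) with x occurring in b or in a.
    return app("S", abstract(x, t[1]), abstract(x, t[2]))

def spine(t):
    # Split t into its head atom and the list of its arguments.
    args = []
    while not isinstance(t, str):
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def normalize(t):
    """Weak reduction: I(a) red a, K(a, b) red a, S(c, b, a) red c(a, b(a))."""
    head, args = spine(t)
    if head == "I" and len(args) >= 1:
        return normalize(app(args[0], *args[1:]))
    if head == "K" and len(args) >= 2:
        return normalize(app(args[0], *args[2:]))
    if head == "S" and len(args) >= 3:
        c, b, a = args[:3]
        return normalize(app(app(c, a, app(b, a)), *args[3:]))
    return app(head, *[normalize(a) for a in args])

if __name__ == "__main__":
    body = app("f", "x", app("g", "x"))   # the term f(x, g(x))
    print(normalize(app(abstract("x", body), "a")))
```

Applying the abstracted term to an argument a and reducing weakly gives back the body with a in place of x, which is the property that makes the abstraction behave like lambda abstraction.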
2.1.4. The proof of normalization for my intuitionistic type theory (see [17]) becomes locally formalizable in the theory itself. When the dubious rule of lambda conversion was allowed, I could not carry out the proof of normalization for every specific term in the theory itself, contrary to what one would expect from one's experience with other full scale formal theories. The reason for this failure was that, when A conv B in the old sense, I was only able to prove $C_A$ and $C_B$ to be extensionally equal, whereas one would like to have $C_A =_{\mathrm{def}} C_B$. Here $C_A$ and $C_B$ are the computability predicates associated with the type symbols A and B, respectively.
2.1.5. By forbidding the rule
$\dfrac{b[x] \;\mathrm{conv}\; d[x]}{(\lambda x)\,b[x] \;\mathrm{conv}\; (\lambda x)\,d[x]}$,
Howard [10] was able to achieve, for his unique assignment of ordinals to the terms of Gödel’s T, that if a reduces strictly to b, then $\alpha > \beta$, where $\alpha$ and $\beta$ are the ordinals $< \varepsilon_0$ assigned to the terms a and b, respectively. For general reductions, this property is no longer known to hold.
2.2. Further rules of conversion which do not correctly formalize the notion of definitional equality as understood in this paper
2.2.1. Curry’s rule of $\eta$-conversion,
$(\lambda x)(b(x)) \;\mathrm{conv}\; b$
provided the variable x does not occur free in the term b, and the combinatory axioms which correspond to it. Equally unacceptable is the corresponding rule for Cartesian products,
$(p(c), q(c)) \;\mathrm{conv}\; c$,
although, as shown below, the abstract objects denoted by (p(c), q(c)) and c can be proved to be identical.
2.2.2. In systems of natural deduction, the following rules which are all formulated in [19]. First, the permutative rules for $\lor$ and $\exists$,
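\[
\dfrac{\dfrac{A \lor B \qquad \begin{matrix} A \\ \vdots \\ C \end{matrix} \qquad \begin{matrix} B \\ \vdots \\ C \end{matrix}}{C}}{D}
\quad \mathrm{conv} \quad
\dfrac{A \lor B \qquad \dfrac{\begin{matrix} A \\ \vdots \\ C \end{matrix}}{D} \qquad \dfrac{\begin{matrix} B \\ \vdots \\ C \end{matrix}}{D}}{D}
\]
and
\[
\dfrac{\dfrac{(\exists x)\,B[x] \qquad \begin{matrix} B[x] \\ \vdots \\ C \end{matrix}}{C}}{D}
\quad \mathrm{conv} \quad
\dfrac{(\exists x)\,B[x] \qquad \dfrac{\begin{matrix} B[x] \\ \vdots \\ C \end{matrix}}{D}}{D}
\]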
provided the inference from C to D neither binds any free variable nor discharges any assumption in the derivation of $A \lor B$ and $(\exists x)\,B[x]$, respectively. Second, the simplification rules which are used to get rid of redundant applications of the elimination rules for $\lor$ and $\exists$. Third, the expansion rules, one for each of the logical operations, which in the case of implication reads
\[
\dfrac{\dfrac{\begin{matrix} \vdots \\ A \to B \end{matrix} \qquad A}{B}}{A \to B}
\quad \mathrm{conv} \quad
\begin{matrix} \vdots \\ A \to B \end{matrix}
\]
and corresponds to Curry’s rule of $\eta$-conversion.
2.3. Definitional equality versus identity

It is necessary to distinguish carefully between, on the one hand, the relation of definitional equality which, according to what has been said above, is a relation between linguistic expressions and, on the other hand, the relation of identity between the abstract entities that they denote. I regard the identity relation a = b between objects a and b of some type A as defined by the axiom of =-introduction

$$a = a$$

alias the object

$$r(a) \ \text{of type}\ a = a$$

analogously to the way in which the logical operations &, ∨ and ∃ and the type N are defined by their respective introduction rules. The corresponding axiom of =-elimination is

$$(\forall x \in A)\, C[x, x, r(x)] \to (\forall z \in a = b)\, C[a, b, z]$$

which, in case the predicate C[a, b, c] does not depend on the proof c of the proposition a = b, reduces to

$$(\forall x \in A)\, C[x, x] \to (a = b \to C[a, b]).$$

This, in turn, is equivalent (modulo the axioms of implication and universal quantification) to the usual eliminatory axiom of identity

$$a = b \to (C[a] \to C[b]).$$

In one direction, the relation between definitional equality and identity is as follows. If $a =_{\mathrm{def}} b$, then a = b holds, and, on the formal level, if a conv b, then a = b is derivable.
Informally, we argue that a = a is an axiom and that $a =_{\mathrm{def}} b$ implies $(a = a) =_{\mathrm{def}} (a = b)$ so that a = a and a = b have the same meaning and we can conclude a = b. The last step in the argument amounts on the formal level to an application of the (indispensable) rule of formula alias type conversion formulated in [17].

In the other direction, there seems to be little hope of showing that, if a = b holds, then $a =_{\mathrm{def}} b$ or even that, if a and b are terms of a formally delimited theory, the validity of a = b should imply a conv b. Little hope, because to say that a = b holds intuitionistically means only that we suppose that we have a completely arbitrary abstract proof of a = b, and it seems too much to hope for that we should be able to pass from such an abstract proof to the sequence of combinatorial transformations that would establish a conv b. However, if we assume not only that a = b holds but that a = b is derivable in a formally delimited theory, then we have the following precise answer.

THEOREM. If there exists a closed derivation of a = b, then the terms a and b are interconvertible.

PROOF. This follows, for any one of the theories listed in the first part of this paper, as a combinatorial corollary to the normalization theorem for the theory in question. Suppose namely that there exists a closed derivation of a = b. (Then the terms a and b are necessarily closed too.) Using the normalization theorem, we can reduce it to normal form. Now, a closed normal derivation must necessarily have introduction form. In particular, when the end formula is a = b, it must have the form

$$\frac{c = c \qquad (a = b) \ \mathrm{conv}\ (c = c)}{a = b}.$$
The assumption that there is precisely one application of the rule of formula conversion between the axiom c = c and the end formula a = b implies no essential restriction of generality, because, if there were several, we could condense them into one, and, if there were none, we could insert a redundant application of the rule in question. From (a = b) conv (c = c) the Church-Rosser theorem allows us to conclude that a = b and c = c have a common reduct. Hence so do, on the one hand, a and c, and, on the other hand, b and c. Therefore, the terms a and b are interconvertible as was to be proved.

The theorem is not so interesting as it may seem. In particular, it proves nothing about the adequacy of the definition of convertibility (cf. [13]), because the relation with respect to which we conclude that a and b are interconvertible is just the convertibility relation which we put into the theory via the rule of formula alias type conversion.

The following counterexample shows that the theorem is no longer true for open terms. Consider a Cartesian product A × B and let p and q be the associated projections with type symbols A × B → A and A × B → B, respectively, defined by the schema

$$p((a, b)) \ \mathrm{conv}\ a, \qquad q((a, b)) \ \mathrm{conv}\ b.$$
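The conversion schema for the projections can be illustrated by a small term rewriter. This is only a sketch under stated assumptions: the class names `Var`, `Pair`, `P`, `Q` and the functions `normalize` and `conv` are hypothetical and not part of the paper, and interconvertibility is tested by comparing normal forms, which is adequate for this tiny terminating and confluent fragment only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Pair:
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class P:  # first projection p
    arg: "Term"


@dataclass(frozen=True)
class Q:  # second projection q
    arg: "Term"


Term = Union[Var, Pair, P, Q]


def normalize(t: Term) -> Term:
    """Apply p((a, b)) -> a and q((a, b)) -> b until no redex is left."""
    if isinstance(t, Var):
        return t
    if isinstance(t, Pair):
        return Pair(normalize(t.left), normalize(t.right))
    arg = normalize(t.arg)
    if isinstance(arg, Pair):
        # A projection applied to an explicit pair is a redex.
        return arg.left if isinstance(t, P) else arg.right
    # A projection applied to anything else (e.g. a variable) is stuck.
    return type(t)(arg)


def conv(s: Term, t: Term) -> bool:
    """Interconvertibility, decided here by equality of normal forms."""
    return normalize(s) == normalize(t)


x, y, z = Var("x"), Var("y"), Var("z")
print(conv(P(Pair(x, y)), x))              # the schema for p
print(conv(Pair(P(z), Q(z)), z))           # a variable z is not an explicit pair
```

With a variable z in place of an explicit pair, the term (p(z), q(z)) is already in normal form and is syntactically different from z, so the two are not interconvertible under these rules.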
Then, for free variables x and y with type symbols A and B, respectively,

$$(x, y) = (x, y)$$

is an instance of the law of identity from which we can infer

$$(p((x, y)), q((x, y))) = (x, y)$$

by formula conversion. The latter formula taken together with the axiom

$$(\forall x \in A)(\forall y \in B)\, C[(x, y)] \to (\forall z \in A \times B)\, C[z]$$

yields

$$(p(z), q(z)) = z$$

for a free variable z with type symbol A × B, although the term (p(z), q(z)) does not convert into z with respect to the above rules of conversion. On the other hand, if c is a closed term with type symbol A × B, then (p(c), q(c)) conv c, because a closed term with type symbol A × B necessarily reduces to one of the form (a, b).

2.4. Discussion of the conjecture about identity of proofs formulated by Prawitz [19].
The conjecture was that two derivations represent the same proof if and only if they are equivalent. Here equivalent means interconvertible. Clearly, the conjecture hinges upon what we understand by two proofs being the same. Two, and only two, interpretations seem possible. Either we mean by saying that two proofs are the same that they are definitionally equal which, according to what has been said above, is an assertion about the proofs thought of as linguistic expressions. In that case, the conjecture is turned into the thesis which has been advocated above, namely, that the relation of convertibility as defined in this paper correctly formalizes the notion of definitional equality. Or else we really have the abstract proofs in mind and not their linguistic representations. Then sameness must mean identity, and the conjecture is turned into the assertion that two derivations are interconvertible if and only if the abstract proofs that they represent are identical. As was argued above, there seems to be little hope of proving the conjecture in this form unless identical is replaced by provably identical in which case the theorem and the remarks following it give a complete answer.

2.5. On the treatment of equality in Frege's writings

As was mentioned earlier, equality appears in § 8 of Frege's Begriffsschrift as equality of content (Inhaltsgleichheit) which he denotes by ≡ and which is a relation between names and not between their contents. It seems reasonable to identify Frege's equality of content (provided one disregards the geometrical example that he gives) with definitional equality
or, on the formal level, convertibility as understood in the present paper. So far so good, but later, in § 20 and § 21, the axioms of identity are written (in modern notation)

$$a \equiv b \to (A(a) \to A(b)), \qquad a \equiv a.$$

This is no longer compatible with the analysis of the relation ≡ given earlier, because if ≡ is viewed as a relation between names, then a ≡ b is not a proposition on a par with the propositions inside the formal theory like A(a) and A(b) which we prove by means of possibly logically complicated proofs. In particular, it cannot be combined with these into compound propositions by means of the logical operations. Thus, for example, a ≡ b → (A(a) → A(b)) is meaningless because in a ≡ b the entities a and b are names, that is, they stand for themselves, whereas in A(a) and A(b) they stand for their contents. This caused Frege [4, 5] to abandon the relation of equality of content ≡ and replace it by the relation of identity =. Similarly, something like

for all natural numbers n, $n =_{\mathrm{def}} n$

is meaningless and, accordingly, (∀x ∈ N)(x conv x) is not a wellformed formula, because the variable x ranges over the natural numbers and not over the numerical terms of some formal theory. On the other hand,

for all natural numbers n, n = n

is a meaningful and true proposition which is expressed by the formula (∀x ∈ N)(x = x). Also meaningful and true is the proposition

for all numerical terms a, a conv a,

but it can only be expressed in the formal theory after arithmetization.
2.6. Equality in Gödel's T

The relation of equality enters into Gödel's theory T in two different ways, on the one hand in the definitional schemata of the primitive recursive functionals, and on the other in the associated deductive theory. In the definitional schemata, the equality relation is clearly definitional which implies that they should be written formally

$$f(a_1, \ldots, a_n, 0) \ \mathrm{conv}\ a[a_1, \ldots, a_n],$$

$$f(a_1, \ldots, a_n, a') \ \mathrm{conv}\ b[a_1, \ldots, a_n, a, f(a_1, \ldots, a_n, a)].$$

In the deductive theory, on the other hand, the equality relation has to be understood as identity and not as intensional or definitional equality as suggested by Gödel [8], because there we prove equalities by means of the axioms of identity

$$a = a, \qquad \frac{a = b \qquad A[a]}{A[b]},$$

and the induction schema

$$\frac{A[0] \qquad \begin{array}{c} A[x] \\ \vdots \\ A[x'] \end{array}}{A[a]}$$

of whose validity we cannot convince ourselves unless, when reading the formulae, we associate with the terms not themselves but the abstract objects which they denote. To complete the formulation of the deductive part of the theory, we only have to add the rule of formula conversion

$$\frac{A \qquad A \ \mathrm{conv}\ B}{B}.$$

If we allow as formulae not only equalities between terms of arbitrary finite type but also propositional combinations of such, then we shall have to add the rules of intuitionistic propositional logic, because we have no right to assert

$$(a = b) \lor \neg(a = b)$$

intuitionistically except at the lowest type in which case it is derivable from the other axioms, provided ¬A is defined as A → 0 = 1.
This should be compared with the fact that, as proved by Tait [21], for all terms a and b of an arbitrary finite type,

$$(a \ \mathrm{conv}\ b) \lor \neg(a \ \mathrm{conv}\ b).$$

However, the decidability of the convertibility relation provides no evidence whatever for the decidability of the identity relation, the former being a relation between terms and the latter a relation between the abstract objects which the terms denote. The identity relation on an arbitrary type is decidable if and only if there exists a numerical valued equality functional E such that

$$E(a, b) = 0 \leftrightarrow a = b,$$

and hence we have just as little right to postulate the existence of such a functional as the decidability of the identity relation except at the lowest type where we can put $E(a, b) =_{\mathrm{def}} |a - b|$. On the other hand, Tait's proof of the decidability of the convertibility relation provides us for every finite type with a function E such that

$$E(a, b) = 0 \leftrightarrow a \ \mathrm{conv}\ b.$$

However, this function E is defined for the terms and not for the abstract objects of the type in question, and hence it is not on a par with the functions that are defined by the ordinary definitional schemata of explicit definition and recursion. Because of what has been said above, the system of intuitionistic arithmetic of finite type formulated by Tait [21] is not intuitionistically acceptable unless the axiom (a = b) ∨ ¬(a = b) is abolished at higher types. And, in the intensional version of the system formulated by Troelstra [22], not only this axiom but also the equality functional has to be thrown out.
Acknowledgments
I am grateful to Peter Hancock and Sören Stenlund for pointing out errors in the first draft of this paper.
References

[1] A. Church, The calculi of lambda-conversion, Annals of Mathematics Studies No. 6 (Princeton University Press, Princeton, N. J., 1941).
[2] H. B. Curry and R. Feys, Combinatory logic, Vol. I (North-Holland, Amsterdam, 1958).
[3] G. Frege, Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens (Verlag von Louis Nebert, Halle, 1879).
[4] G. Frege, Über Sinn und Bedeutung, Z. Philosophie und philosophische Kritik 100 (1892) 25-50.
[5] G. Frege, Grundgesetze der Arithmetik, begriffsschriftlich abgeleitet, I. Band (Verlag von H. Pohle, Jena, 1893).
[6] J. Y. Girard, Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et dans la théorie des types, in: Proceedings of the second Scandinavian logic symposium, J. E. Fenstad, Ed. (North-Holland, Amsterdam, 1971) pp. 63-92.
[7] J. Y. Girard, Interprétation fonctionnelle et élimination des coupures de l'arithmétique d'ordre supérieur, Thèse, Université Paris VII (1972).
[8] K. Gödel, Über eine bisher noch nicht benutzte Erweiterung des finiten Standpunktes, Dialectica 12 (1958) 280-287.
[9] W. A. Howard, The formulae-as-types notion of construction, privately circulated notes.
[10] W. A. Howard, Assignment of ordinals to terms for primitive recursive functionals of finite type, in: Intuitionism and proof theory, J. Myhill, A. Kino and R. E. Vesley, Eds. (North-Holland, Amsterdam, 1970) pp. 443-458.
[11] W. A. Howard, A system of abstract constructive ordinals, J. Symb. Logic 37 (1972) 355-374.
[12] S. C. Kleene and R. E. Vesley, The foundations of intuitionistic mathematics (North-Holland, Amsterdam, 1965).
[13] G. Kreisel, A survey of proof theory II, in: Proceedings of the second Scandinavian logic symposium, J. E. Fenstad, Ed. (North-Holland, Amsterdam, 1971) pp. 109-170.
[14] G. Kreisel and A. S. Troelstra, Formal systems for some branches of intuitionistic analysis, Ann. Math. Logic 1 (1970) 229-387.
[15] P. Martin-Löf, Hauptsatz for intuitionistic simple type theory, in: Logic, methodology and philosophy of science IV, P. Suppes et al. Eds. (North-Holland, Amsterdam, 1973) pp. 279-290.
[16] P. Martin-Löf, Hauptsatz for the intuitionistic theory of iterated inductive definitions, in: Proceedings of the second Scandinavian logic symposium, J. E. Fenstad, Ed. (North-Holland, Amsterdam, 1971) pp. 179-216.
[17] P. Martin-Löf, An intuitionistic theory of types, mimeographed.
[18] D. Prawitz, Natural deduction, a proof-theoretical study (Almqvist and Wiksell, Stockholm, 1965).
[19] D. Prawitz, Ideas and results in proof theory, in: Proceedings of the second Scandinavian logic symposium, J. E. Fenstad, Ed. (North-Holland, Amsterdam, 1971) pp. 235-307.
[20] S. Stenlund, Combinators, λ-terms and proof theory (D. Reidel Publ. Co., Dordrecht, 1972).
[21] W. W. Tait, Intensional interpretations of functionals of finite type I, J. Symbolic Logic 32 (1967) 198-212.
[22] A. S. Troelstra, Notions of realizability for intuitionistic arithmetic and intuitionistic arithmetic in all finite types, in: Proceedings of the second Scandinavian logic symposium, J. E. Fenstad, Ed. (North-Holland, Amsterdam, 1971) pp. 369-405.
COMPLETENESS AND CORRESPONDENCE IN THE FIRST AND SECOND ORDER SEMANTICS FOR MODAL LOGIC*
Henrik SAHLQVIST University of Oslo, Oslo, Norway
0.
This paper is a small step towards answering questions like: What kinds of completeness results are there in modal logic? What properties of models and frames are ‘modally axiomatisable’? We concentrate on normal logics and first order properties of the accessibility relation. There are three main themes, a positive, a negative, and a comparative. We sketch the background. Segerberg [19] says: ‘At the center of interest have been questions of the following type: (1) Given a condition π, is it possible to axiomatize the set Σ of formulas valid in all structures ⟨U, R⟩ such that R satisfies π over U? ... In addition to questions of type (1), there is also another type which is important ... but which has so far received less attention: (2) Given a condition π, does there exist a set Σ of formulas that are valid in a structure ⟨U, R⟩ if and only if R has π over U? In particular, is there a finite Σ of this kind?’ And both problems may be turned round: Given Σ, find a π. Problem (1) and its converse is the problem of completeness. Problem (2) and its converse we call the problem of correspondence.
* This paper is a revised version of parts of the author’s cand. real.-thesis submitted to the University of Oslo, Dept. of Mathematics, Spring 1973.
FIRST AND SECOND SEMANTICS FOR MODAL LOGIC
Thomason [20] asks for a characterisation of the finite Σ's corresponding to first order conditions π. Five open problems or conjectures are stated in [11]. They are grouped together at the end of Section 5, and constitute a discussion of completeness results in general and the limitations on the techniques of the preceding pages.

A (Conjecture). Every normal modal logic is determined by a class of relational frames. (I.e., there are completeness results for all such logics in terms of ordinary relational frames.) This is proved to be wrong in [24] and [1]. Gerson [6] shows that the corresponding conjecture for classical modal logics and (ordinary) neighbourhood frames is false, too.

B (Problem). Is there a completeness result for the logic KM, where M is the schema □◇A → ◇□A? Fine [3] gives a positive solution to this.

In [11], two kinds of completeness results are given, one in terms of classes of (relational) models, obtained by considering so-called characteristic conditions of formulas, the other in terms of classes of (relational) frames. The problem is then raised:

C (Problem). Is it possible to derive the second type of completeness results from the first one directly? We show how the two types of results are interrelated, by reformulating the first type in what Thomason [20] calls the first order semantics. This is the comparative theme.

D (Conjecture). The modal logic formed by adding to K the schema

$$\Diamond^{m_1}\Box^{n_1} A_1 \wedge \cdots \wedge \Diamond^{m_k}\Box^{n_k} A_k \to \Phi(A_1, \ldots, A_k)$$

(where Φ is an affirmative formula) is complete for a certain condition which may be ‘read off’ from the schema. We prove this and generalise the result considerably. We also have correspondence results for these schemas. This is the positive theme.

E (Conjecture). There is no schema Σ such that KΣ is a stronger logic than K and KΣ is determined by the asymmetric (antisymmetric) frames. The corresponding result for irreflexivity was announced but not proved in [11]. In Section 3, we prove the corresponding result for a lot
of other ‘negative’ properties. No formula (or schema) corresponds to these properties. (There are some remarks on this in [12, p. 45].) This is the negative theme. The simple method used also solves the following problem of Segerberg [18]:

F (Problem). Give simple proofs that the logics K4, D4, S4 are determined by the classes of irreflexive transitive trees, irreflexive transitive trees with infinite branches, and reflexive transitive trees, respectively. We give such proofs, sharpen the theorems and prove similar results for other logics in Section 4.
1. Introductory observations
Terminology and notation is essentially as in [15, 16, 17, 18]. Readers unfamiliar with the notions ‘normal logic’, ‘generated model’, ‘p-morphism’, ‘canonical model’, ‘fundamental theorem’ etc. should consult these references. Let Γ(A) be the set of substitution instances of A. A schema is a set of formulas of the form Γ(A), and KΓ(A) is the system obtained from K by adding Γ(A) as axioms.

A second order relational frame is an ordered pair ⟨U, R⟩ such that U is a set and R a binary relation over U. The elements of U are called points (or worlds). If $\mathfrak{F} = \langle U, R\rangle$ is a frame, then ⟨U, R, V⟩ is a model on $\mathfrak{F}$, and V a valuation on $\mathfrak{F}$, if V is a function from the natural numbers to $\mathcal{P}U$ (the power set of U). If $\mathfrak{U} = \langle U, R, V\rangle$ is a model and u ∈ U, truth of A in $\mathfrak{U}$ at u, $\models^{\mathfrak{U}}_u A$, is defined in the usual way. We write $\|A\|^{\mathfrak{U}}$ for $\{u \mid u \in U \wedge \models^{\mathfrak{U}}_u A\}$.

Following Thomason [20], we define a first order relational frame to be an ordered triple ⟨U, Π, R⟩, where U and R are as before and Π is a subset of $\mathcal{P}U$, closed under (i) the Boolean operations, (ii) the operation $M_R$ defined by

$$M_R(S) = \{x \mid (\forall y)(x R y \to y \in S)\}.$$
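If $\mathfrak{F} = \langle U, \Pi, R\rangle$ is a first order frame, then $\langle \mathfrak{F}, V\rangle$ is a model on $\mathfrak{F}$, and V a valuation on $\mathfrak{F}$, if V is a function from the natural numbers to Π. ‘Truth’ etc. is then defined as before. The closure properties of Π guarantee that $\|A\|^{\langle \mathfrak{F}, V\rangle} \in \Pi$ for all formulas A. A modal formula A corresponds to a class $\mathcal{U}$ of frames, and vice versa, iff

$$(\forall \mathfrak{F})(\models^{\mathfrak{F}} A \leftrightarrow \mathfrak{F} \in \mathcal{U}).$$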
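The definitions of truth and validity above can be tried out on finite frames. The following is a minimal sketch, not taken from the paper: formulas are encoded as nested tuples (`("p", i)`, `("bot",)`, `("imp", A, B)`, `("box", A)`), and the function names `truth_set`, `valid` and `powerset` are assumptions made for illustration. Validity in a first order frame ⟨U, Π, R⟩ is checked by letting valuations range over Π only; taking Π to be the full power set gives second order validity.

```python
from itertools import product


def truth_set(formula, U, R, V):
    """Return the set of points of U at which the formula is true under V."""
    op = formula[0]
    if op == "p":
        return frozenset(V[formula[1]])
    if op == "bot":
        return frozenset()
    if op == "imp":
        a = truth_set(formula[1], U, R, V)
        b = truth_set(formula[2], U, R, V)
        return frozenset(u for u in U if u not in a or u in b)
    if op == "box":
        a = truth_set(formula[1], U, R, V)
        return frozenset(u for u in U
                         if all(v in a for v in U if (u, v) in R))
    raise ValueError(f"unknown connective: {op}")


def letters(formula):
    """Indices of the propositional letters occurring in the formula."""
    if formula[0] == "p":
        return {formula[1]}
    return set().union(*(letters(sub) for sub in formula[1:]))


def valid(formula, U, R, Pi):
    """True iff the formula holds at every point under every valuation into Pi."""
    ps = sorted(letters(formula))
    for values in product(Pi, repeat=len(ps)):
        V = dict(zip(ps, values))
        if truth_set(formula, U, R, V) != frozenset(U):
            return False
    return True


def powerset(U):
    points = sorted(U)
    return [frozenset(p for p, keep in zip(points, bits) if keep)
            for bits in product([False, True], repeat=len(points))]


# Two points that see each other but not themselves.
U = {0, 1}
R = {(0, 1), (1, 0)}
T = ("imp", ("box", ("p", 0)), ("p", 0))            # box p0 -> p0
print(valid(T, U, R, powerset(U)))                   # second order frame
print(valid(T, U, R, [frozenset(), frozenset(U)]))   # first order frame, small Pi
```

On this frame the two notions come apart: the family consisting of the empty set and U is closed under the Boolean operations and under $M_R$, and every valuation into it verifies the formula, while a valuation sending p0 to {1} falsifies it at the point 0.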
The notion may also be defined for schemas. (We will usually treat classes $\mathcal{U}$ of those frames satisfying a certain sentence or having a certain property α, and then say: A corresponds to α, and vice versa.) In analogy with the notion ‘first order property’ and the like in predicate logic, we might say that a property of frames is a modal property if there is a modal formula corresponding to it, or that it is (finitely) modally axiomatisable. The notion ‘general modal property’ also suggests itself in analogy with ‘general first order property’. To say that A corresponds to α is to say that the frames of KΓ(A) are characterised by α, since if every modal axiom is valid in a class of frames, the logic is sound for the class.

If L is a normal logic, then $|A|_L$ is the set of L-maximal sets containing A, $U_L = |\top|_L$, $R_L$ is defined by

$$u R_L v \ \text{ iff for all } A, \text{ if } \Box A \in u, \text{ then } A \in v,$$

and $V_L$ is defined by

$$V_L(n) = |p_n|_L, \ \text{ for each natural number } n.$$

With $\mathfrak{M}_L = \langle U_L, R_L, V_L\rangle$,

$$\models^{\mathfrak{M}_L}_u A \ \text{ iff } \ A \in u.$$
A semantics is said to be adequate for a logic if every nontheorem of the logic is falsified by some frame (in the semantics) of the logic. I.e., a semantics is adequate for a logic if the class of frames of the logic determines it.
Thomason [20] proves that the first order relational semantics is adequate for all tense logics, by showing that the algebraic semantics is, and that there is an equivalent algebra for each first order relational frame. The same may be done for normal modal logics. And by defining a first order neighbourhood semantics we may again, via an algebraic semantics, prove adequacy for all classical modal logics. The first part of this is given in Hansson and Gärdenfors [22]. The second part follows from Stone's representation theorem. This theorem is strong enough because we do not require any connections between the modal and Boolean operations. For tense logic or normal modal logic we need stronger representation theorems, e.g. [8, Theorem 3.10, p. 933], to which Thomason refers, or [10, Theorem 32, p. 206] (which does not refer to [8]). These algebraic proofs employ the Lindenbaum-Tarski algebra of the logic, which is shown to determine it.

We can as well do the whole thing within the relational (or neighbourhood) semantics. Let the frame of the (second order) canonical model be called the (second order) canonical frame. Define the first order canonical frame $\mathfrak{F}_L = \langle U_L, \Pi_L, R_L\rangle$ by taking $U_L$ and $R_L$ as in the second order case and $\Pi_L = \{\|B\|^{\mathfrak{M}_L} \mid B \text{ a formula}\}$.

THEOREM 1. $\models^{\mathfrak{F}_L} A \leftrightarrow \vdash_L A$.

PROOF. One way is obvious:
For the other direction, suppose there is a V on $\mathfrak{F}_L$ such that $\not\models^{\langle \mathfrak{F}_L, V\rangle} A$. Since V is on $\mathfrak{F}_L$, it takes its values in $\Pi_L = \{\|B\|^{\mathfrak{M}_L} \mid B \text{ a formula}\}$. Let $B_i$ be a formula such that $V(i) = \|B_i\|^{\mathfrak{M}_L}$. Let A' be the substitution-instance of A under the substitutions $B_i/p_i$. Then, by a theorem on substitution, [11, Theorem 2.15] or [2, Lemma 1.1], A' is falsified in $\mathfrak{M}_L$ (at the same point) and
For each individual variable x, define the translation $^{\Delta x}$ from the modal language into $L_2$ as follows:

$$p_i^{\Delta x} = x \in S_i \quad \text{for all } i,$$

$$\bot^{\Delta x} = \bot,$$

$$(A \to B)^{\Delta x} = (A^{\Delta x} \to B^{\Delta x}),$$

$$(\Box A)^{\Delta x} = (\forall y)(x R y \to A^{\Delta y}),$$

where y is the first variable not occurring in $A^{\Delta x}$. We write $\hat{A}$ for the universal closure of $A^{\Delta x}$ in $L_1$. (This is ambiguous since x may vary, but it does not matter.) We write ∀A for the universal closure of $\hat{A}$ in $L_2$. $A^{\Delta x}$ is what Lemmon and Scott [11] call the characteristic condition of A on x, and write $c_A(x)$. When $\mathfrak{U} = \langle \mathfrak{F}, V\rangle$ is a relational model for modal logic (first or second order), it may also be regarded as an ordinary relational structure for $L_1$ (or $L_2$, when free predicate letters are regarded as ‘constants’). And $\mathfrak{F}$ may be regarded as a structure for $L_2$ (when free predicate letters are regarded as ‘variables’). Then we have
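$$\mathfrak{U} \models A^{\Delta x}[u] \quad\text{iff}\quad \models^{\mathfrak{U}}_u A,$$

$$\mathfrak{U} \models \hat{A} \quad\text{iff}\quad \models^{\mathfrak{U}} A,$$

$$\mathfrak{F} \models \forall A \quad\text{iff}\quad \models^{\mathfrak{F}} A.$$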
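The translation defined above is a straightforward recursion on the modal formula. As a rough sketch only: the tuple encoding of formulas, the ASCII rendering of the output, and the fresh-variable naming `y1, y2, ...` are all assumptions made here for illustration; the paper only requires that the bound variable be new.

```python
from itertools import count


def translate(formula, x, fresh=None):
    """Render the characteristic condition of a modal formula on the variable x.

    Formulas are nested tuples: ("p", i), ("bot",), ("imp", A, B), ("box", A).
    The predicate letter for p_i is written S{i}, so "x in S_i" becomes "Si(x)".
    """
    if fresh is None:
        fresh = count(1)          # supplies y1, y2, ... (a hypothetical naming scheme)
    op = formula[0]
    if op == "p":
        return f"S{formula[1]}({x})"
    if op == "bot":
        return "False"
    if op == "imp":
        left = translate(formula[1], x, fresh)
        right = translate(formula[2], x, fresh)
        return f"({left} -> {right})"
    if op == "box":
        y = f"y{next(fresh)}"     # a variable not used so far
        body = translate(formula[1], y, fresh)
        return f"(forall {y}. ({x} R {y} -> {body}))"
    raise ValueError(f"unknown connective: {op}")


# box p0 -> p0, translated on the variable x
print(translate(("imp", ("box", ("p", 0)), ("p", 0)), "x"))
```

Prefixing a universal quantifier over x to the output gives the first order closure, and additionally quantifying over the predicate letters gives the second order sentence written ∀A in the text.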
We introduce some abbreviations. Let V(α, B) (‘V’ for ‘valid’) abbreviate

$$(\forall \mathfrak{F})(\mathfrak{F} \models \alpha \to \models^{\mathfrak{F}} B).$$

When we want to make it explicit that the frames $\mathfrak{F}$ are of first or second order, we write $V_1(\alpha, B)$ or $V_2(\alpha, B)$, respectively. Let S(L, α) (‘S’ for ‘sound’) abbreviate

$$(\forall \mathfrak{F})(\mathfrak{F} \models \alpha \to \models^{\mathfrak{F}} L),$$

or equivalently,

$$(\forall B)[\vdash_L B \to V(\alpha, B)],$$

and similarly for $S_1(L, \alpha)$ and $S_2(L, \alpha)$.
Let C(L, α) (‘C’ for ‘complete’) abbreviate

$$(\forall B)[V(\alpha, B) \to \vdash_L B],$$

and similarly for $C_1(L, \alpha)$ and $C_2(L, \alpha)$. Let A(L) (‘A’ for ‘adequate’) abbreviate

$$(\forall B)[(\forall \mathfrak{F})(\models^{\mathfrak{F}} L \to \models^{\mathfrak{F}} B) \to \vdash_L B],$$

and similarly for $A_1(L)$ and $A_2(L)$. Let R(A, α) (‘R’ for ‘reflect’, see below) abbreviate

$$(\forall \mathfrak{F})(\models^{\mathfrak{F}} A \to \mathfrak{F} \models \alpha),$$

and similarly for $R_1(A, \alpha)$ and $R_2(A, \alpha)$. (Prior [12, p. 45] used ‘reflect’ for V(α, A), the converse of R(A, α).) Then S(L, α) ∧ C(L, α) iff L is determined by α and V(α, A) ∧ R(A, α) iff A and α correspond to each other, and

$$V(\alpha, A) \leftrightarrow S(K\Gamma(A), \alpha),$$

$$R(A, \alpha) \wedge A(K\Gamma(A)) \to C(K\Gamma(A), \alpha),$$

$$A(K\Gamma(A)) \leftrightarrow C(K\Gamma(A), \forall A),$$

$$C(L, \alpha) \wedge S(L, \alpha) \to A(L),$$

$$R_1(A, \alpha) \to C_1(K\Gamma(A), \alpha),$$

$$C_1(K\Gamma(A), \forall A) \wedge S_1(K\Gamma(A), \forall A).$$
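Since a second order frame ⟨U, R⟩ can be regarded as the first order frame $\langle U, \mathcal{P}U, R\rangle$,

$$V_1(\alpha, B) \to V_2(\alpha, B).$$

Hence we have

THEOREM 2.

$$C_2(L, \alpha) \to C_1(L, \alpha), \qquad S_1(L, \alpha) \to S_2(L, \alpha).$$

When $\alpha \in L_0$, it has no predicate letter, so

$$\langle U, R\rangle \models \alpha \quad\text{iff}\quad \langle U, \Pi, R\rangle \models \alpha.$$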
Consequently, if $\alpha \in L_0$,

$$V_2(\alpha, B) \to V_1(\alpha, B).$$

Hence

THEOREM 3. If $\alpha \in L_0$, we have

$$C_1(L, \alpha) \to C_2(L, \alpha), \qquad S_2(L, \alpha) \to S_1(L, \alpha).$$

Trivially, A corresponds to ∀A. But there are far better results in the second order relational semantics for some important formulas A. They correspond to sentences α in $L_0$. If these results held also in the first order semantics, we would immediately have second order completeness results for systems axiomatised in these formulas:

$$R_1(A, \alpha) \to C_1(K\Gamma(A), \alpha) \to C_2(K\Gamma(A), \alpha),$$

$$V_1(\alpha, A) \to V_2(\alpha, A) \to S_2(K\Gamma(A), \alpha).$$

Or in words, if α is in $L_0$, and A first order corresponds to α, then KΓ(A) is second order determined by α. But since there are few interesting examples of A and α satisfying the hypothesis, this is of little use. We would like to weaken the hypothesis by requiring only that A second order corresponds to α. Essentially this is Problem C.

Let $\Sigma_{\mathfrak{U}} = \{\|A\|^{\mathfrak{U}} \mid A \in Fm\}$. We say that $\mathfrak{U}' = \langle \mathfrak{F}, V'\rangle$ is substitution-related to $\mathfrak{U} = \langle \mathfrak{F}, V\rangle$ iff $V'(i) \in \Sigma_{\mathfrak{U}}$ for all i. (In Fine's terminology: $\mathfrak{U}'$ is a definable variant of $\mathfrak{U}$.) Any model on $\mathfrak{F}_L$ is substitution-related to $\mathfrak{M}_L$. We say that $\mathfrak{U}$ strongly satisfies α iff α is true in any $\mathfrak{U}'$ substitution-related to $\mathfrak{U}$. In [11] it is proved for the second order relational semantics that a formula is a theorem in the system K + the single axiom A iff the formula is true in every $\mathfrak{U}$ in which $\hat{A}$ is true, and, more interesting, a formula is a theorem in KΓ(A) iff it is true in every $\mathfrak{U}$ strongly satisfying $\hat{A}$. Problem C is to derive the classical completeness results in terms of first order properties of (second order) frames (obtained by studying the canonical frame, and generalised in Section 5) from this completeness result.
For the first order semantics we now know that theoremhood in KΓ(A) is equivalent to validity in a class of frames, the class of frames in which $\hat{A}$ is valid, by the adequacy of the semantics. The proof of this, however, is almost the same as the proof in Lemmon and Scott. In fact, the proof may be edited so as to correspond word by word to the proof in [11]. And the two results are equivalent. This is seen by showing

LEMMA 4. $V_1(\alpha, B)$ ↔ B is true in every second order model strongly satisfying α.

PROOF. (1) Given $V_1(\alpha, B)$ and a $\mathfrak{U} = \langle U, R, V\rangle$ strongly satisfying α. Then $\langle U, \Sigma_{\mathfrak{U}}, R\rangle \models \alpha$, so by $V_1(\alpha, B)$, $\models^{\langle U, \Sigma_{\mathfrak{U}}, R\rangle} B$, hence $\models^{\langle U, R, V\rangle} B$. (2) Assume the right-hand side, let ⟨U, Π, R⟩ be given in which α is valid and assume $\not\models^{\langle U, \Pi, R\rangle} B$, i.e., there is a V on ⟨U, Π, R⟩ such that $\not\models^{\langle U, \Pi, R, V\rangle} B$. ⟨U, R, V⟩ must strongly satisfy α, so $\models^{\langle U, R, V\rangle} B$, hence $\models^{\langle U, \Pi, R, V\rangle} B$, which is a contradiction.

2. Correspondence
□A → A second order corresponds to reflexivity of R, □A → □□A to transitivity of R, and A → □◇A to symmetry of R. Some such results are well known, but the only references we have seen explicitly stating the problem of correspondence or correspondence results are [20] and [19]. Hanson [7] comes near to it; he says that □A → A ‘is equivalent to the assumption that R is reflexive’. But his argument only shows that □A → A is valid in
(We omit → for simplicity.) If φ and ψ are n-ary propositional functions, φ ∧ ψ is the n-ary propositional function which maps $(A_1, \ldots, A_n)$ into $\varphi(A_1, \ldots, A_n) \wedge \psi(A_1, \ldots, A_n)$, etc. The notation should be self-explanatory. A modality is a unary propositional function composed of only the unary basic functions ¬, □ and ◇. Two propositional functions φ and ψ are equivalent in a logic L iff for all formulas $A_1, \ldots, A_n$,

$$\vdash_L \varphi(A_1, \ldots, A_n) \leftrightarrow \psi(A_1, \ldots, A_n).$$
If φ and ψ are equivalent in K, we may say that φ can be rewritten as ψ. When φ is written without →, the dual of φ is what we get when ∧ and ∨, ⊥ and ⊤, and □ and ◇ are interchanged throughout φ. We define affirmative and negative occurrences of propositional letters in a formula: $p_i$ occurs affirmatively in $p_i$. If $p_i$ occurs affirmatively (negatively) in A, it occurs affirmatively (negatively) in B → A and negatively (affirmatively) in A → B, and accordingly affirmatively (negatively) in A ∧ B, A ∨ B, B ∧ A and B ∨ A, negatively (affirmatively) in ¬A. φ is said to be affirmative if all occurrences of $p_1, \ldots, p_n$ in $\varphi(p_1, \ldots, p_n)$ are affirmative and negative if all the occurrences are negative. It is seen that φ is affirmative iff it may be written without (→ and) ¬. And a negative φ may be written as ¬ψ or ψ(¬, ..., ¬), where ψ is affirmative.

For each propositional function φ and frame ⟨U, R⟩, we shall give a function $R^{\varphi}$ from $(\mathcal{P}U)^n$ into $\mathcal{P}U$. For nonmodal φ, the $R^{\varphi}$ are the usual Boolean set operations: $R^{\wedge} = \cap$, $R^{\vee} = \cup$, $R^{\neg} = -$, or to indicate the composition of functions:

$$R^{\varphi \wedge \psi}(S_1, \ldots, S_n) = R^{\varphi}(S_1, \ldots, S_n) \cap R^{\psi}(S_1, \ldots, S_n),$$

etc. Complement is of course taken relative to U. Define the operations $M_R$ and $P_R$ by

$$P_R(S) = \{x \mid (\exists y)(x R y \wedge y \in S)\},$$

$$M_R(S) = -P_R(-S).$$
$P_R(S)$ is the set of immediate R-predecessors of S, and $M_R(S)$ is the set of those points which have all their immediate R-successors in S. For the converse of R, $\breve{R}$, $P_{\breve{R}}$ and $M_{\breve{R}}$ are defined similarly. We now put $R^{\Box}(S) = M_R(S)$ and $R^{\Diamond}(S) = P_R(S)$, or

$$R^{\Box\varphi}(S_1, \ldots, S_n) = M_R(R^{\varphi}(S_1, \ldots, S_n)).$$

Finally, for the projections $o_i^n$ we let $R^{o_i^n}$ be the corresponding projection:

$$R^{o_i^n}(S_1, \ldots, S_n) = S_i.$$

We often write $u R^{\varphi}(S)$ or $u R^{\varphi} S$ for $u \in R^{\varphi}(S)$. Observe that $P_{R^n}(S) = P_R^n(S)$, and similarly for $M_R$, $P_{\breve{R}}$ and $M_{\breve{R}}$. In [11] and [18] the notation ‘$u R^{\varphi} t$’, for φ an affirmative modality, is used for what we would write ‘$t R^{\varphi} P_{\breve{R}}\{u\}$’, but the notation is not extended to other propositional functions φ.

LEMMA 5. If $p_i$ occurs only affirmatively in $\varphi(p_1, \ldots, p_n)$, $R^{\varphi}$ is monotone in the i-th place, i.e., if $S_i \subseteq S_i'$, then

$$R^{\varphi}(S_1, \ldots, S_i, \ldots, S_n) \subseteq R^{\varphi}(S_1, \ldots, S_i', \ldots, S_n).$$

If $p_i$ occurs only negatively in $\varphi(p_1, \ldots, p_n)$, $R^{\varphi}$ is antitone in the i-th place, i.e., if $S_i \subseteq S_i'$, then

$$R^{\varphi}(S_1, \ldots, S_i', \ldots, S_n) \subseteq R^{\varphi}(S_1, \ldots, S_i, \ldots, S_n).$$

PROOF. The operations ∩, ∪, $M_R$ and $P_R$ are monotone. − is antitone.

LEMMA 6.

$$X \subseteq M_R(Y) \leftrightarrow P_{\breve{R}}(X) \subseteq Y,$$

and similarly,

$$X \subseteq M_{\breve{R}}(Y) \leftrightarrow P_R(X) \subseteq Y.$$

PROOF. Suppose, for all x, $x \in X \to x \in M_R(Y)$, i.e. $x \in X \to (\forall y)(x R y \to y \in Y)$. Suppose $y \in P_{\breve{R}}(X)$, i.e., $(\exists x)(x R y \wedge x \in X)$. Then y ∈ Y. The other way: Suppose $(\exists x)(x R y \wedge x \in X) \to y \in Y$ for all y, and suppose x ∈ X. Then, given y, if x R y then y ∈ Y. The lemma is obvious when one considers that $P_{\breve{R}}(X)$ is the set of immediate successors of X, and $M_R(Y)$ is the set of those points which have all their immediate successors in Y.
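The two set operations and the equivalence of Lemma 6 are easy to check mechanically on small frames. The following sketch is illustrative only: the function names (`P`, `M`, `converse`, `subsets`, `lemma6_holds`) are chosen here and do not come from the paper, and a finite brute-force check is of course no substitute for the proof.

```python
from itertools import product


def P(R, S):
    """P_R(S): points with at least one immediate R-successor in S."""
    return frozenset(x for (x, y) in R if y in S)


def M(R, S, U):
    """M_R(S): points of U all of whose immediate R-successors lie in S."""
    return frozenset(x for x in U
                     if all(y in S for (x2, y) in R if x2 == x))


def converse(R):
    return frozenset((y, x) for (x, y) in R)


def subsets(U):
    points = sorted(U)
    return [frozenset(p for p, keep in zip(points, bits) if keep)
            for bits in product([False, True], repeat=len(points))]


def lemma6_holds(U, R):
    """Check X <= M_R(Y)  iff  P_{converse R}(X) <= Y for all X, Y <= U."""
    Rc = converse(R)
    return all((X <= M(R, Y, U)) == (P(Rc, X) <= Y)
               for X in subsets(U) for Y in subsets(U))


U = frozenset({0, 1, 2})
R = frozenset({(0, 1), (1, 2), (2, 2)})
print(P(R, frozenset({2})))        # predecessors of 2
print(M(R, frozenset({2}), U))     # points whose successors all equal 2
print(lemma6_holds(U, R))
```

The duality $M_R(S) = -P_R(-S)$ from the definition can be checked in the same way by comparing `M(R, S, U)` with `U - P(R, U - S)`.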
Note: It is not true that $X \subseteq P_R(Y)$ is equivalent to $M_{\breve R}(X) \subseteq Y$; neither implication holds.

To give the idea of the proof, we treat first a special case of the result we are aiming at.

THEOREM 7. For all $m$ and $n$, and any affirmative modality $\varphi$, there is a sentence in $L$ which second order corresponds to the schema
$$\Diamond^m\Box^n A \to \varphi(A),$$
namely the sentence which expresses that
$$(\forall u)(\forall x)[uR^m x \to uR^{\varphi}P_{\breve R}^n\{x\}].$$
PROOF. (1) Assume $\models^{\langle U,R\rangle} \Diamond^m\Box^n A \to \varphi(A)$, i.e.,
$$(\forall S \subseteq U)(\forall u)[u \in P_R^m M_R^n(S) \to uR^{\varphi}S],$$
or
$$(\forall S \subseteq U)(\forall u)(\forall x)[uR^m x \wedge x \in M_R^n(S) \to uR^{\varphi}S],$$
which has as a special case
$$(\forall u)(\forall x)[uR^m x \wedge x \in M_R^n P_{\breve R}^n\{x\} \to uR^{\varphi}P_{\breve R}^n\{x\}].$$
But in general $S \subseteq M_R^n P_{\breve R}^n(S)$, so we get the desired result.

(2) Assume
$$(\forall u)(\forall x)[uR^m x \to uR^{\varphi}P_{\breve R}^n\{x\}].$$
Given $S \subseteq U$, $u$ and $x$ such that $uR^m x$ and $x \in M_R^n(S)$. Then $u \in R^{\varphi}P_{\breve R}^n\{x\}$, and since $x \in M_R^n(S) \leftrightarrow P_{\breve R}^n\{x\} \subseteq S$ by Lemma 6, Lemma 5 gives $u \in R^{\varphi}S$.

Notice that this works also for $m$ or $n$ equal to 0. ($R^0$ is $=$ and $P_{\breve R}^0(S)$ is $S$.) Generalising this we get

THEOREM 8. Let $S$ be a formula that may be rewritten as
$$\Box^n(\psi(p_1,\dots,p_k) \to \varphi(p_1,\dots,p_k)),$$
or a conjunction of such, such that:
(i) $\varphi$ is affirmative,
(ii) in $\psi$, projections are brought innermost and negations are brought inside all other connectives ($\to$ is eliminated), and each $p_i$ occurs in $\psi(p_1,\dots,p_k)$ only under sequences of connectives where no $\Box$ precedes any $\wedge$, $\vee$ or $\Diamond$.
Then $S$ corresponds to a sentence $s$ of $L$, effectively obtainable from $S$.

PROOF.
$\models^{\langle U,R\rangle} S$ means
$$(\forall S)\qquad (\forall S_1,\dots,S_k \subseteq U)(\forall v,u)(vR^n u \wedge uR^{\psi}(S_1,\dots,S_k) \to uR^{\varphi}(S_1,\dots,S_k)).$$
We rewrite this according to the following equivalences, where $\Phi$ and $\Psi$ are any formulas, $\psi$, $\psi_1$ and $\psi_2$ any propositional functions:

(1) $(\forall\dots)(\Psi \wedge xR^{\psi_1\wedge\psi_2}(S_1,\dots,S_k) \to \Phi) \leftrightarrow (\forall\dots)(\Psi \wedge xR^{\psi_1}(S_1,\dots,S_k) \wedge xR^{\psi_2}(S_1,\dots,S_k) \to \Phi)$.

(2) $(\forall\dots)(\Psi \wedge xR^{\psi_1\vee\psi_2}(S_1,\dots,S_k) \to \Phi) \leftrightarrow \bigwedge_{i=1}^{2}(\forall\dots)(\Psi \wedge xR^{\psi_i}(S_1,\dots,S_k) \to \Phi)$.

(3) $(\forall\dots)(\Psi \wedge xR^{\Diamond\psi}(S_1,\dots,S_k) \to \Phi) \leftrightarrow (\forall\dots y)(\Psi \wedge xRy \wedge yR^{\psi}(S_1,\dots,S_k) \to \Phi)$, where $y$ is a new variable.

By assumption, $\psi$ is then either a projection, $\bot$ or $\top$, preceded by $\Box$'s only, or all $S_i$'s occurring in $R^{\psi}(S_1,\dots,S_k)$ occur negatively. In the first case, $xR^{\psi}(S_1,\dots,S_k)$ equals $xM_R^n S_i$, $xM_R^n\emptyset$ or $xM_R^n U$ for some $n$ and $i$. Then rewrite as
$$(\forall\dots)(\Psi \wedge xM_R^n S_i \to \Phi),$$
or
$$(\forall\dots)(\Psi \to xP_R^n U \vee \Phi), \qquad (\forall\dots)(\Psi \to \Phi),$$
respectively. In the second case we rewrite as

(4) $(\forall\dots)(\Psi \wedge xR^{\neg\psi^{d}}(S_1,\dots,S_k) \to \Phi)$

and apply

(5) $(\forall\dots)(\Psi \wedge xR^{\neg\psi}(S_1,\dots,S_k) \to \Phi) \leftrightarrow (\forall\dots)(\Psi \to xR^{\psi}(S_1,\dots,S_k) \vee \Phi)$.
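We use this only when $\psi$ is a projection or comes from (4). In both cases $\psi$ is affirmative. It should be clear that by this rewriting we will finally reach a sentence of the form
$$(\forall S')\qquad \bigwedge_{g=1}^{h}(\forall\dots)\Bigl(\Phi_g \wedge \bigwedge_{j=1}^{k}\bigwedge_{i=1}^{m_j} x_{g,i,j}M_R^{n_{g,i,j}}S_j \to \bigvee_{j=1}^{l_g} u_{g,j}R^{\varphi_{g,j}}(S_1,\dots,S_k)\Bigr),$$
where all the variables $u_{g,j}$, $1 \le j \le l_g$, occur in $\Phi_g$, and each $\Phi_g$ is a quantifier-free formula of $L$, in fact a conjunction which may be read as ordering its variables in a certain tree (each variable occurs to the right of $R$ only once). Now, Lemma 6 tells us that
$$\bigwedge_{i=1}^{m_j} x_{g,i,j}M_R^{n_{g,i,j}}S_j \leftrightarrow \bigcup_{i=1}^{m_j} P_{\breve R}^{n_{g,i,j}}\{x_{g,i,j}\} \subseteq S_j,$$
so by universal instantiation for the $S$'s we obtain as in the proof of the preceding theorem:
$$\bigwedge_{g=1}^{h}(\forall\dots)\Bigl(\Phi_g \to \bigvee_{j=1}^{l_g} u_{g,j}R^{\varphi_{g,j}}\Bigl(\bigcup_{i=1}^{m_1} P_{\breve R}^{n_{g,i,1}}\{x_{g,i,1}\},\dots,\bigcup_{i=1}^{m_k} P_{\breve R}^{n_{g,i,k}}\{x_{g,i,k}\}\Bigr)\Bigr),$$
which by the monotonicity of the $R^{\varphi_{g,j}}$'s conversely implies $\forall S'$.

We shall give an example. The schema $Alt_n$ is treated in [18]:
$$Alt_n\qquad \Box A_1 \vee \Box(A_1 \to A_2) \vee \dots \vee \Box(A_1 \wedge \dots \wedge A_n \to A_{n+1}).$$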
It is proved that $KAlt_n$ is determined by the frames where each point has $\le n$ alternatives. $Alt_n$ may be rewritten
$$Alt_n\qquad \Diamond A_1 \wedge \Diamond(\neg A_1 \wedge A_2) \wedge \dots \wedge \Diamond(\neg A_1 \wedge \dots \wedge \neg A_{n-1} \wedge A_n) \to \Box(A_1 \vee \dots \vee A_n \vee A_{n+1}),$$
which by our method corresponds to
$$(\forall u\,x_1 \dots x_n)\bigl(uRx_1 \wedge (uRx_2 \wedge x_2 \ne x_1) \wedge \dots \wedge (uRx_n \wedge x_n \ne x_{n-1} \wedge \dots \wedge x_n \ne x_1) \to (\forall y)(uRy \to y = x_1 \vee \dots \vee y = x_n)\bigr),$$
i.e., each point has $\le n$ alternatives. The method of proof even shows that $A_{n+1}$ may be dropped (replaced by $\bot$) in the schema. The resulting schema generalises the schema $\Diamond A \to \Box A$ treated in [11]. ($Alt_0$ reduces to $\top \to \Box\bot$, by conventions about empty conjunctions and disjunctions.)

The proof of this general correspondence theorem does not work for the first order relational semantics. However, as we shall see in section 5, there is a class of first order relational frames in which it can be carried through and which is wide enough to admit completeness results for all normal logics.

3. Properties which lack a corresponding modal formula
Since validity of formulas is preserved under generated subframes, a minimum requirement for a sentence (in $L$) to correspond to some modal formula is that it be preserved under generated subframes. But the results here show that preservation under generated subframes is not sufficient. Preservation under subframes, however, is obviously too strong a requirement. It would be interesting to have a preservation theorem of the type: A sentence $\alpha$ in $L$ (or $L_2$) is preserved under such-and-such an operation on frames iff there is a sentence $\forall A$ such that for all frames $\mathcal{F}$, $\mathcal{F} \models \alpha \leftrightarrow\ \models^{\mathcal{F}} A$.

We shall present here a simple proof that no schema corresponds to irreflexivity, asymmetry etc., nor to a certain generalisation of the notion of intransitivity, or any property entailed by it. The proof was originally given in [13]. We have learnt that Fine had found the construction, too. We state everything for the second order relational semantics. The results carry over to the first order case. For typographical reasons we write $u \models A$ for $\models^{\mathfrak{U}}_{u} A$ and $u^* \models^* A$ for $\models^{\mathfrak{U}^*}_{u^*} A$ in this section.
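We shall give a method of transforming frames and models. The method, which we call the unravelling technique, turns generated models into equivalent generated models. Let $\mathfrak{U} = \langle U,R,V\rangle$ and $u_0 \in U$. Construct a new model $\mathfrak{U}^* = \langle U^*,R^*,V^*\rangle$ as follows:

(i) $U^*$ consists of all finite sequences $\langle u_0,u_1,\dots,u_n\rangle$, where $u_i \in U$ for $0 \le i \le n$ and $u_iRu_{i+1}$ for $0 \le i \le n-1$. (Call them $R$-sequences.)
(ii) $\langle u_0,\dots,u_m\rangle R^* \langle u_0,\dots,u_n\rangle$ holds iff $\langle u_0,\dots,u_n\rangle = \langle u_0,\dots,u_m\rangle * \langle u_n\rangle$, where $*$ is the concatenation operation of sequences.
(iii) For atomic formulas $A$, $\langle u_0,\dots,u_n\rangle \in V^*(A)$ iff $u_n \in V(A)$.

We shall prove that if $\langle u_0,\dots,u_n\rangle \in U^*$, then $\langle u_0,\dots,u_n\rangle \models^* A$ iff $u_n \models A$, or, when $u'$ denotes the last element of $u^* \in U^*$: $u^* \models^* A$ iff $u' \models A$, for all formulas $A$. The proof goes by induction:

(at.) For atomic formulas $A$: $u^* \models^* A \leftrightarrow u^* \in V^*(A) \leftrightarrow u' \in V(A) \leftrightarrow u' \models A$,

($\neg$) $u^* \models^* \neg A \leftrightarrow \neg\, u^* \models^* A \leftrightarrow \neg\, u' \models A \leftrightarrow u' \models \neg A$,

($\wedge$) $u^* \models^* A \wedge B \leftrightarrow u^* \models^* A \ \&\ u^* \models^* B \leftrightarrow u' \models A \ \&\ u' \models B \leftrightarrow u' \models A \wedge B$,

($\Box$) $u^* \models^* \Box A \leftrightarrow \forall\langle u_0,\dots,u_n\rangle \in U^*\,(u^*R^*\langle u_0,\dots,u_n\rangle \to \langle u_0,\dots,u_n\rangle \models^* A)$
$\leftrightarrow \forall\langle u_0,\dots,u_n\rangle \in U^*\,(\langle u_0,\dots,u_n\rangle = u^* * \langle u_n\rangle \to u_n \models A)$
$\leftrightarrow \forall u_n \in U\,(u'Ru_n \to u_n \models A) \leftrightarrow u' \models \Box A$.

Hence $\mathfrak{U}^*$ and $\mathfrak{U}_{u_0}$ are equivalent. Note that if we started with a first order frame, we could change (iii) to:

(iii)' For all $S \in \Pi$, define $S^*$ by $\langle u_0,\dots,u_n\rangle \in S^*$ iff $u_n \in S$. Define $\Pi^* = \{S^* \mid S \in \Pi\}$.

Then the proof above shows that $u' \in S \leftrightarrow u^* \in S^*$, and it is clear that $\Pi^*$ is closed under the formula-forming operations, so the new first order frame is equivalent to the frame
$$\langle U_{u_0},\ \Pi|U_{u_0},\ R|U_{u_0}\rangle.$$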
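The unravelling construction and the truth-preservation claim can be tried out on a small model. The sketch below is an illustration only: it assumes an acyclic relation so that the set of $R$-sequences from the root is finite, and the formula encoding (nested tuples) as well as the example model `R`, `V` are hypothetical choices made for this example, not part of the paper.

```python
def unravel(rel, root):
    """Unravel a finite acyclic frame from `root`.

    Points of the new frame are tuples (R-sequences starting at `root`);
    a sequence is R*-related exactly to its one-step extensions.
    """
    points = [(root,)]
    rel_star = set()
    frontier = [(root,)]
    while frontier:
        seq = frontier.pop()
        for (x, y) in sorted(rel):
            if x == seq[-1]:
                ext = seq + (y,)
                points.append(ext)
                rel_star.add((seq, ext))
                frontier.append(ext)
    return points, rel_star

def holds(formula, point, rel, val):
    """Evaluate a formula at a point.

    Formulas: ('p', name), ('not', A), ('and', A, B), ('box', A).
    """
    op = formula[0]
    if op == 'p':
        return point in val[formula[1]]
    if op == 'not':
        return not holds(formula[1], point, rel, val)
    if op == 'and':
        return (holds(formula[1], point, rel, val)
                and holds(formula[2], point, rel, val))
    if op == 'box':
        return all(holds(formula[1], y, rel, val)
                   for (x, y) in rel if x == point)
    raise ValueError(op)

# Hypothetical example model: a "diamond" with root 0.
R = {(0, 1), (0, 2), (1, 3), (2, 3)}
V = {'p': {3}, 'q': {1}}

points, R_star = unravel(R, 0)
# Clause (iii): a sequence satisfies a letter iff its last element does.
V_star = {name: {s for s in points if s[-1] in ext}
          for name, ext in V.items()}

formulas = [
    ('box', ('box', ('p', 'p'))),
    ('not', ('box', ('p', 'q'))),
    ('and', ('box', ('p', 'p')), ('not', ('p', 'q'))),
]
agree = all(holds(f, s, R_star, V_star) == holds(f, s[-1], R, V)
            for f in formulas for s in points)
print(len(points), agree)
```

In this example the point 3 is reached along two different paths, so it receives two copies in the unravelled frame, which is what turns the diamond into a tree.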
Instead of the induction we could have used the p-morphism theorem. It is easy to check that the mapping from $U^*$ to $U_{u_0}$ which takes $u^*$ into $u'$ (the last element of $u^*$) is a p-morphism, reliable on every propositional letter. See [18] or [17]. But the proof of the p-morphism theorem is in turn an induction like the above.

The model constructed above has the form of a tree, and, of course, is not just asymmetric and intransitive. It has all properties which can be expressed by a sentence of the form:
$$(1)_{m,p}\qquad \forall xy\,\neg(xR^m y \wedge xR^p y) \quad\text{for } m \ne p,$$
or of the form
$$(2)_m\qquad \forall x\,y\,z_1 \dots z_{m-1}\,u_1 \dots u_{m-1}\Bigl(xRz_1 \wedge z_1Rz_2 \wedge \dots \wedge z_{m-1}Ry \wedge xRu_1 \wedge u_1Ru_2 \wedge \dots \wedge u_{m-1}Ry \to \bigwedge_{0<i<m}(z_i = u_i)\Bigr) \quad\text{for } 2 \le m.$$
If we restrict ourselves to generated frames, the class of frames obtained from arbitrary frames by the construction above is determined up to isomorphism by the $L_{\omega_1\omega}$-sentence
$$(3)\qquad \bigwedge_{m \ne p}(1)_{m,p} \wedge \bigwedge_{2 \le m}(2)_m,$$
i.e., we have:

LEMMA 9. $\mathcal{F}$ is generated and $\mathcal{F} \models (3)$ iff $\mathcal{F}$ is isomorphic to a frame obtained by the unravelling technique.

PROOF. If $\mathcal{F}$ is isomorphic to a frame obtained by the construction, then it certainly is generated and satisfies (3). If $\mathcal{F}$ is generated and satisfies (3), then use the construction on $\mathcal{F}$ itself, starting with some $u_0$ from which all points in the structure can be reached, and obtain $\mathcal{F}^*$. Then $\mathcal{F} \cong \mathcal{F}^*$ is shown by the mapping $f\colon u_n \mapsto \langle u_0,\dots,u_n\rangle$.
This means that we can state the full strength of the proof above as: LEMMA 10. For every generated model, there is an equivalent model satisfying (3).
THEOREM 11. K is determined by the property of satisfying (3) ('determined by (3)').

COROLLARY 12. There is no schema $Z$ such that $K \subsetneq KZ$ and $KZ$ is determined by (*), for any sentence (*) implied by (3). In particular, we can take as (*) the sentence expressing irreflexivity, or asymmetry, antisymmetry or intransitivity.

COROLLARY 13. Every class of frames between the class of all $\mathcal{F}$ and the class of $\mathcal{F}$ satisfying (3) determines the logic K.

THEOREM 14. There is no schema $Z$ corresponding to any property which can be expressed by a sentence (*) implied by (3), except the trivial property of just being a frame.

PROOF. Suppose there was such a schema $Z$. There exists a model not satisfying (*). It must be a countermodel for some instance of $Z$. Hence, by Lemma 10, this instance has a countermodel satisfying (*), which contradicts the assumption.
We point out that, in a way, Theorem 11 has long been known, although it is not noted in the literature as far as we know. Kripke [9] gave completeness results for T, S4, B and S5 using semantic tableaux methods (in two versions, the 'R- and S-formulations'). Hanson [7] extended this (in the 'S-formulation') to other systems, among them K (which he called F). This completeness proof for K proves more than what Hanson states (that K is determined by the class of all frames): it proves Theorem 11. Thus, when conjecture E was presented in [11], there was already in the literature a way of answering it. Kit Fine has noted that while $K4Alt_n$ is determined by the transitive frames where no point has more than $n$ alternatives,
is determined by the irreflexive such frames. But some of our negative results may be relativised to certain systems stronger than K , in different ways. One way is shown in the next section.
4. Trees and minimal trees
Segerberg [18, p. 83] suggests as a research problem to give simple proofs of the following:

S4 is determined by the reflexive trees,
D4 is determined by the (irreflexive) trees with infinite branches,
K4 is determined by the (irreflexive) trees.

Segerberg's trees are all transitive, but of course the notion of tree may be generalised. With this more general sense of tree, Schumm [14] indicates how one may prove that

T is determined by the reflexive trees,
S4 is determined by the reflexive transitive trees,

by modifying the ordinary completeness proof using the canonical model.
It is rather obvious how one may prove these results by the unravelling technique. We want to extend these results here, but without trying to answer the general question of what logics are determined by a class of trees. We also compare our trees with Kripke's tree-model structures.

What do we mean by a tree? We don't want to be restrictive, but we do want our trees to be trees in some intuitive or 'graphic' sense. We think that the latter implies that one should not accept e.g. Kripke's S5-tree-model structures as trees. A frame $\langle U,R\rangle$ is a tree iff there is a relation $S$ on $U$ and a $t \in U$ such that for all $u \in U$

(i) $\neg uSt$,
(ii) $(\exists! v)(vSu)$, for $u \ne t$,
(iii) $tS^{anc}u$,
(iv) $S \subseteq R \subseteq SC(S^{anc})$,

where $S^{anc}$ is the ancestral of $S$, i.e.,
$$uS^{anc}v \leftrightarrow (\exists n \ge 0)(uS^n v),$$
and $SC(S^{anc})$ is the symmetric closure of $S^{anc}$. Equivalently, we could replace (i) and (ii) by the clause: (3).
Note that $S^{anc}$ is always reflexive and transitive, $SC(S^{anc})$ always reflexive and symmetric. We think that we cannot liberalise (iv) without violating intuitions about trees, since $SC(S^{anc})$ is the largest relation $R$ containing $S$ such that whenever $uRv$, then $u$ and $v$ are in the same $S$-branch. A frame $\langle U,R\rangle$ is minimal with respect to a property (e.g. a minimal reflexive tree) iff [...]

$R$ is serial iff $(\forall x)(\exists y)(xRy)$,
$R$ is $n,1$-transitive iff $(\forall x,y)(xR^n y \to xRy)$,
$R$ is $n$-reflexive iff $(\forall x)(xR^n x)$,
$R$ is $m,n$-symmetric iff $(\forall x,y)(xR^m y \to yR^n x)$,
$R$ is $\le n$-branching iff $(\forall x)$($x$ has at most $n$ alternatives),
$R$ has no $n+1$-sequences iff $\neg(\exists x_1,\dots,x_{n+1})(x_1Rx_2,\dots,x_nRx_{n+1})$.
(We say '$n,1$-transitive' since Lemmon used '$n$-transitive' for another property.) One easily proves that $\langle U,R\rangle$ is a minimal reflexive ($n,1$-symmetric, $n,1$-transitive) tree iff there is a minimal tree $\langle U,S\rangle$ such that $R$ is the reflexive ($n,1$-symmetric, $n,1$-transitive) closure of $S$.
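For finite relations, the properties and closures used in this section can be computed directly. The following is a minimal sketch under that finiteness assumption; all function names are hypothetical, and the closure is computed by a simple fixed-point iteration that repeatedly adds the pairs of $R^n$ until $xR^ny \to xRy$ holds.

```python
def compose(r, s):
    """Relational composition: x (r;s) z iff x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def power(rel, n, universe):
    """R^n, with R^0 the identity relation on the universe."""
    result = {(x, x) for x in universe}
    for _ in range(n):
        result = compose(result, rel)
    return result

def is_serial(rel, universe):
    """(forall x)(exists y)(xRy)."""
    return all(any(a == x for (a, _) in rel) for x in universe)

def is_n1_transitive(rel, n, universe):
    """(forall x, y)(x R^n y -> x R y)."""
    return power(rel, n, universe) <= set(rel)

def is_mn_symmetric(rel, m, n, universe):
    """(forall x, y)(x R^m y -> y R^n x)."""
    rn = power(rel, n, universe)
    return all((y, x) in rn for (x, y) in power(rel, m, universe))

def n1_transitive_closure(rel, n, universe):
    """Least n,1-transitive relation containing rel (fixed-point iteration)."""
    closure = set(rel)
    while True:
        extra = power(closure, n, universe) - closure
        if not extra:
            return closure
        closure |= extra

# Hypothetical example: a chain 0 -> 1 -> 2 -> 3, a minimal tree with root 0.
U = {0, 1, 2, 3}
S = {(0, 1), (1, 2), (2, 3)}

print(sorted(n1_transitive_closure(S, 2, U)))
print(sorted(n1_transitive_closure(S, 3, U)))
```

For $n = 2$ the closure is the ordinary transitive closure; for $n = 3$ only the pair $(0, 3)$ is added to the chain.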
Thus, apart from the use of a distinguished element, the 'actual world', Kripke's tree M-model structures are exactly the minimal reflexive trees, his tree S4-model structures are exactly the minimal reflexive and transitive trees, and his tree Brouwersche-model structures are exactly the minimal reflexive and symmetric trees. (See [9].)

We write

$D$ for the schema $\Box A \to \Diamond A$,
$4^{n,1}$ for the schema $\Box A \to \Box^n A$ ($n > 1$),
$T^{n,0}$ for the schema $\Box^n A \to A$ ($n \ne 0$),
$B^{m,n}$ for the schema $\Diamond^m\Box^n A \to A$ ($n \ne 0$),
$Alt_n$ for the schema given under that name in Section 2,
$Q^n$ for the schema $\Box^n\bot$.

(We use '$4^{n,1}$' instead of '$4^n$' because Lemmon used the latter for another schema. We use '$T^{n,0}$' instead of '$T^n$' since Segerberg uses the latter for another schema.) Note that $T^{n,0} = B^{0,n}$. We do not treat $T^{n,0}$ separately. The following logics are determined by the corresponding properties. $D$: serialness, $K4^{n,1}$: $n,1$-transitivity, $KB^{m,n}$: $m,n$-symmetry; in particular: $T^{n,0}$: $n$-reflexivity, $KAlt_n$: $\le n$-branching, $KQ^n$: no $n+1$-sequences.
THEOREM 15. The following logics are determined by the respective classes of trees:

(a) $K$: all (minimal) trees.
(b) $D$: (minimal) trees with infinite branches.
(c) $K4^{n,1}$: (minimal) $n,1$-transitive trees.
(d) $KB^{m,n}$: $m,n$-symmetric trees. In case $n = 1$ we may add 'minimal'.
(e) $KAlt_n$: (minimal) $\le n$-branching trees.
(f) $KQ^n$: minimal trees with branches of length $\le n$.
PROOF. The soundness part is clear from the results above. Given a non-theorem of $K$ ($D$, etc.), it has a generated countermodel (that is serial, etc.). Unravel this, from (one of) its 'root(s)'. Call the original model $\langle U,R,V\rangle$, the unravelled one $\langle U^*,S,V^*\rangle$. The cases (a)-(f) are proved as follows:

(a) $\langle U^*,S\rangle$ is a minimal tree.
(b) In this case, $\langle U^*,S\rangle$ is serial and hence has infinite branches. For if $\langle U^*,S\rangle$ had a 'dead end', $u^*$, there could be no $u$ in $\langle U,R\rangle$ such that $u'Ru$, so $R$ would not be serial.
(c) In this case, we may let $R^*$ be the $n,1$-transitive closure of $S$. $\langle U^*,R^*\rangle$ is then a minimal $n,1$-transitive tree. Since we connect only copies of points that were connected by $R$, we see (by the p-morphism theorem) that $\langle U^*,S,V^*\rangle$ is equivalent to $\langle U^*,R^*,V^*\rangle$.
(d) In case $n = 1$ we may let $R^*$ be the $m,1$-symmetric closure of $S$. As in (c) above this gives the result. In case $n \ne 1$ we do not bother about minimality, so we may as well be liberal in defining $R^*$: For each $u^*$ and $v^*$ that are in the same $S$-branch in $\langle U^*,S\rangle$, if $u'Rv'$, put $u^*S'v^*$. Then take $R^* = S \cup S'$. $R^*$ is $m,n$-symmetric, since if $u^*R^{*m}v^*$, then $u'R^mv'$, so $v'R^nu'$, i.e., there exist $x_1,\dots,x_{n-1}$ such that $v'Rx_1,\dots,x_{n-1}Ru'$, so obviously $v^*S'^{\,n}u^*$ and thus $v^*R^{*n}u^*$. As in (c) above $\langle U^*,S,V^*\rangle$ is equivalent to $\langle U^*,R^*,V^*\rangle$.
(e) $\langle U^*,S\rangle$ is already a minimal $\le n$-branching tree.
(f) $\langle U^*,S\rangle$ is a minimal tree with branches of length $\le n$. Note that in this case we cannot drop 'minimal'.

Some of these results are obviously additive. In particular we get such results for K, D, T; K4, D4, S4; KB, DB and B. (But as we do not accept transitive and symmetric 'trees', we do not get results of this kind for K4B.) Since we regard the proofs as rather simple, we think we have solved Segerberg's research problem.

Segerberg [18] proves the following:

The bulldozer theorem. For every transitive (transitive and reflexive) model, there is an equivalent strict partially (partially) ordered model.
We can prove it this way: Let $\mathfrak{U} = \langle U,R,V\rangle$ be transitive (and reflexive). For $u \in U$, let $\mathfrak{U}_u = \langle U_u,R_u,V_u\rangle$ be the submodel of $\mathfrak{U}$ generated by $u$. Let $\mathfrak{U}_u^* = \langle U_u^*,R_u^*,V_u^*\rangle$ be the unravelling of $\mathfrak{U}_u$. Define $\mathfrak{U}^* = \langle U^*,R^*,V^*\rangle$ by
$$U^* = \bigcup_{u \in U} U_u^*,$$
$$R^* = \text{the transitive (and reflexive) closure of } \bigcup_{u \in U} R_u^*,$$
$$V^*(i) = \bigcup_{u \in U} V_u^*(i).$$
Then $R^*$ is a strict partial (partial) ordering, and $\mathfrak{U}^*$ is equivalent to $\mathfrak{U}$.

This is only one way to prove the bulldozer theorem by the unravelling technique. Other ways:

(1) $U^*$ is the set of all finite $R$-sequences from $U$. We could define another relation on this set by putting $(u^*, v^*)$ in the relation iff $u^*$ is a proper part (a part) of $v^*$. This relation includes $R^*$, and is immediately seen to have the desired properties. The new model is equivalent to $\mathfrak{U}$.
In this, and many other ways, we can connect the disconnected parts of $\mathfrak{U}^*$ above.

(2) We could make the bulldozed or unravelled model smaller by taking only one $\mathfrak{U}_u^*$ for each cluster in $\mathfrak{U}$. (A cluster is an equivalence class under the relation $\sim$ defined by: $u \sim v$ holds iff either $uRv$ and $vRu$, or $u = v$.)

(3) Instead of unravelling the whole generated submodel above a point we could unravel just $\langle C, R|C, V|C\rangle$ for each cluster $C$, and then connect these tree models as the clusters were.

Segerberg's proof also yields the following version of the bulldozer theorem: if the original model is connected ($x \ne y \to xRy \vee yRx$), the resulting model will be connected too, and hence linearly instead of partially ordered. It is easy to change our proof to obtain this result, but then our construction becomes isomorphic to Segerberg's.

Segerberg [18, p. 84] gives a schema W, and proves (pp. 86ff.) that

K4W is determined by the class of all finite strict partial orderings.

If $\mathfrak{U}$ is a finite strict partial ordering, where $A$ is false, unravel it from a point where $A$ is false, and take the transitive closure. Since $\mathfrak{U}$ is finite and has no cycles, the new model is finite. Thus we have proved the completeness part of:

K4W is determined by the class of finite irreflexive transitive trees.

(The soundness part follows from the result above.) The reader should compare Segerberg's remarks following [18, Corollary 2.3, p. 88].

On p. 96, Segerberg gives a schema Grz, and he proves (pp. 101ff.) that

S4Grz is determined by the class of all finite partial orderings.

Let $\mathfrak{U}$ = [...] is shown to be equivalent to $\mathfrak{U}_{u_0}$, exactly as before. Since $\mathfrak{U}$ is finite, $\mathfrak{U}^*$ is too. Hence we have proved the completeness part of:

S4Grz is determined by the class of finite reflexive transitive trees.

(The soundness part follows from the above.)
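5. Completeness results by way of canonical frames

The discussion in Section 1 left us with the problem: Given C1 (KF(A), $\forall A$), i.e.,
$$\nvdash_{KF(A)} B \to (\exists\langle U,\Pi,R\rangle)(\langle U,\Pi,R\rangle \models \forall A \wedge \not\models^{\langle U,\Pi,R\rangle} B),$$
we want to show C2 (KF(A), $\forall A$), i.e.,
$$\nvdash_{KF(A)} B \to (\exists\langle U,R\rangle)(\langle U,R\rangle \models \forall A \wedge \not\models^{\langle U,R\rangle} B).$$
Given that $\nvdash_{KF(A)} B$, we find a certain $\langle U,\Pi,R\rangle$ in which $A$ is true and $B$ is false. Could it be that we could use the same $U$ and $R$ and have $\models^{\langle U,R\rangle} A$ and $\not\models^{\langle U,R\rangle} B$? We trivially get $\not\models^{\langle U,R\rangle} B$, but when is it possible to conclude $\models^{\langle U,R\rangle} A$? That depends both on $\langle U,\Pi,R\rangle$ and on $A$. We show that C1 (KF(A), $\forall A$) may be strengthened by restricting the range of the first order frames. The class of $A$ such that $\models$ [...]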
Then he proves that every frame has an equivalent refined frame, from which it follows that any normal logic is determined by the class of its refined frames. The method is rather similar to the filtration technique of Lemmon and Scott [11] and Segerberg [15, 18], but with important differences.

When we prove the adequacy of the first order relational semantics for all normal logics by way of canonical frames, this result is immediate. For the canonical frame satisfies (1) and (2): If $u \ne x$, there is an $A$ such that $A \in u$ and $A \notin x$, so $u \in |A|_L$ and $x \notin |A|_L$. And $\Pi_L$ consists of all sets $|A|_L$. If $\neg xR_Lu$, there is an $A$ such that $\Box A \in x$ and $A \notin u$, so $u \notin |A|_L$. And $\Box A \in x$ gives $(\forall v)(xR_Lv \to A \in v)$, i.e., $x \in M_{R_L}|A|_L$.
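Thus $\mathcal{F}_L$ is refined and Thomason's theorem follows. We generalise the notion 'refined' through the following steps:

(3) For all $n$,
$$(\forall S \in \Pi)(u \in S \to xP_R^n S) \to xR^n u,$$
or equivalently,
$$(\forall S \in \Pi)(xM_R^n S \to u \in S) \to uP_{\breve R}^n\{x\}.$$
($n = 0$ gives (1), $n = 1$ gives (2).)

(4) For all $m$ and $n_1,\dots,n_m$,
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to u \in S\Bigr) \to u \in \bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\}.$$

(5) For all $m$, $n_1,\dots,n_m$ and affirmative functions $\varphi$,
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to uR^{\varphi}(\dots,S,\dots)\Bigr) \to uR^{\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr)$$
(for all parameters that can be put for '...').

(6) For all $l$, $m$, $n_1,\dots,n_m$, affirmative $\varphi_1,\dots,\varphi_l$,
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to \bigvee_{j=1}^{l} u_jR^{\varphi_j}(\dots,S,\dots)\Bigr) \to \bigvee_{j=1}^{l} u_jR^{\varphi_j}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr).$$

We say that a first order frame is simple if it satisfies (6) for all $x_1,\dots,x_m$, $u_1,\dots,u_l$.

We prove that the first order canonical frame $\mathcal{F}_L$ of a normal logic is simple by going through these steps, each time using the previous step and a direct proof or an induction on $n$ or $\varphi$. Some parts of the proof work for all frames. We note what is needed to have the other parts go through, and show that $\mathcal{F}_L$ has these properties. The following properties are needed: For all $l$, $m$, $n$, $n_1,\dots,n_m$, $x$, $x_1,\dots,x_m$, $u$, $u_1,\dots,u_l$, $\varphi$, $\varphi_1,\dots,\varphi_l$ (affirmative):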
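(a) $(\forall S \in \Pi)(x \in S \to u \in S) \to u = x$,

(b) $(\forall S \in \Pi)\bigl((\forall t)(xRt \to tM_R^n S) \to u \in S\bigr) \to (\exists t)\bigl(xRt \wedge (\forall S \in \Pi)(tM_R^n S \to u \in S)\bigr)$,

(c) $(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to u \in S\Bigr) \to \bigvee_{i=1}^{m}(\forall S \in \Pi)(x_iM_R^{n_i}S \to u \in S)$,

(d) $(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to (\exists t)(uRt \wedge tR^{\varphi}(\dots,S,\dots))\Bigr) \to (\exists t)\Bigl(uRt \wedge (\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to tR^{\varphi}(\dots,S,\dots)\Bigr)\Bigr)$,

(e) $(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to \bigvee_{j=1}^{l} u_jR^{\varphi_j}(\dots,S,\dots)\Bigr) \to \bigvee_{j=1}^{l}(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to u_jR^{\varphi_j}(\dots,S,\dots)\Bigr)$.

Any simple frame has these properties.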
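THEOREM 16. A first order frame is simple if it has properties (a)-(e).

PROOF. (3) is shown by induction on $n$. The basis ($n = 0$) is exactly property (a). For the induction step, assume
$$(\forall S \in \Pi)(xM_R^{n+1}S \to u \in S),$$
i.e., the antecedent of (b). By IH, the consequent of (b) gives
$$(\exists t)(xRt \wedge tR^n u),$$
i.e., $xR^{n+1}u$ or $u \in P_{\breve R}^{n+1}\{x\}$.

Given (3), (4) follows directly by (c).

We now prove (5) by induction on $\varphi$.

Basis. Trivial if $\varphi$ is truth or falsity. If $\varphi$ is a projection, we have to show either
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to u \in S'\Bigr) \to u \in S',$$
which is trivial since $U \in \Pi$, or (4), which is already shown.

Induction step. (1) The propositional function has the form $\Diamond\varphi$. Then we assume
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to uR^{\Diamond\varphi}(\dots,S,\dots)\Bigr),$$
i.e., the antecedent of (d). By IH, the consequent of (d) gives
$$(\exists t)\Bigl(uRt \wedge tR^{\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr)\Bigr),$$
i.e.,
$$uR^{\Diamond\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr).$$

(2) The propositional function has the form $\Box\varphi$. Then we assume
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to uR^{\Box\varphi}(\dots,S,\dots)\Bigr),$$
i.e.,
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to (\forall t)(uRt \to tR^{\varphi}(\dots,S,\dots))\Bigr),$$
which by IH gives
$$(\forall t)\Bigl(uRt \to tR^{\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr)\Bigr),$$
or
$$uR^{\Box\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr).$$

(3) The propositional function has the form $\varphi \vee \psi$. Assume
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to uR^{\varphi\vee\psi}(\dots,S,\dots)\Bigr),$$
i.e., a special case of the antecedent of (e). The corresponding instance of the consequent of (e) together with IH gives
$$uR^{\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr) \vee uR^{\psi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr),$$
or
$$uR^{\varphi\vee\psi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr).$$

(4) The propositional function has the form $\varphi \wedge \psi$. Assume
$$(\forall S \in \Pi)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to uR^{\varphi\wedge\psi}(\dots,S,\dots)\Bigr).$$
By predicate logic and IH,
$$uR^{\varphi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr) \wedge uR^{\psi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr),$$
or
$$uR^{\varphi\wedge\psi}\Bigl(\dots,\bigcup_{i=1}^{m} P_{\breve R}^{n_i}\{x_i\},\dots\Bigr).$$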
Finally, given (5), (6) follows directly by (e).

LEMMA 17. For all affirmative propositional functions $\varphi$,
$$\vdash_K \varphi(\dots, A_1 \wedge \dots \wedge A_k, \dots) \to \varphi(\dots,A_1,\dots) \wedge \dots \wedge \varphi(\dots,A_k,\dots).$$

PROOF. The basis and the induction steps for $\Box$ and $\Diamond$ are indicated in [11, Theorem 5.1(d)]. The induction steps for $\wedge$ and $\vee$ are trivial.

THEOREM 18. The first order canonical frame of a normal logic has the properties (a)-(e).
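PROOF. Points in $\mathcal{F}_L$ are $L$-maximal sets of formulas, so we translate from talk of sets in $\Pi_L$ to talk of formulas, by the fundamental theorem. For example,
$$(\forall S \in \Pi_L)(xM_{R_L}^n S \to uR^{\varphi}S) \leftrightarrow u \in \bigcap\{R^{\varphi}S \mid S \in \Pi_L,\ xM_{R_L}^n S\} \leftrightarrow u \in \bigcap\{|\varphi A|_L \mid \Box^n A \in x\} \leftrightarrow \{\varphi A \mid \Box^n A \in x\} \subseteq u.$$

(a) By maximality of $u$ and $x$, if $u \ne x$, there is an $A$ such that $A \in u$, $A \notin x$, so $u \in |A|_L$, $x \notin |A|_L$. And $\Pi_L$ consists of all sets $|A|_L$.

(b) We want to show
$$\{A \mid \Box^{n+1}A \in x\} \subseteq u \to (\exists t)(xR_Lt \wedge \{A \mid \Box^n A \in t\} \subseteq u).$$
It is easily shown that
$$\{A \mid \Box^n A \in t\} \subseteq u \leftrightarrow \{\Diamond^n A \mid A \in u\} \subseteq t.$$
Consider the set
$$\Gamma = \{A \mid \Box A \in x\} \cup \{\Diamond^n A \mid A \in u\}.$$
If $\Gamma$ is $L$-inconsistent, there are formulas $B_1,\dots,B_k$ with $\Box B_1,\dots,\Box B_k \in x$ and $C_1,\dots,C_l \in u$ such that
$$\{B_1,\dots,B_k,\ \Diamond^n C_1,\dots,\Diamond^n C_l\} \vdash_L \bot,$$
or
$$\vdash_L B_1 \wedge \dots \wedge B_k \to \neg\Diamond^n C_1 \vee \dots \vee \neg\Diamond^n C_l,$$
which gives
$$\vdash_L \Box B_1 \wedge \dots \wedge \Box B_k \to \Box^{n+1}(\neg C_1 \vee \dots \vee \neg C_l).$$
Since $x$ contains $L$, $\Box^{n+1}(\neg C_1 \vee \dots \vee \neg C_l) \in x$, hence by hypothesis $\neg C_1 \vee \dots \vee \neg C_l \in u$, so $\neg C_i \in u$ for some $i$, which yields a contradiction. So $\Gamma$ is $L$-consistent and can be extended to an $L$-maximal set $t$. $\{\Diamond^n A \mid A \in u\} \subseteq t$, and $\{A \mid \Box A \in x\} \subseteq t$ means by definition that $xR_Lt$. (This is reproduced from [11, Theorem 2.7].)

(c) We want to show
$$\Bigl\{A \Bigm| \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\Bigr\} \subseteq u \to \bigvee_{i=1}^{m}\{A \mid \Box^{n_i}A \in x_i\} \subseteq u.$$
Assume the antecedent and suppose no disjunct of the consequent holds. Then there are formulas $B_1,\dots,B_m \notin u$ such that $\Box^{n_i}B_i \in x_i$, $1 \le i \le m$. But then $\Box^{n_i}(B_1 \vee \dots \vee B_m) \in x_i$, $1 \le i \le m$, since $L$ is normal, and hence $B_1 \vee \dots \vee B_m \in u$, which contradicts the maximality of $u$.

(d) We want to show
$$\Bigl\{\Diamond\varphi(\dots,A,\dots) \Bigm| \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\Bigr\} \subseteq u \to (\exists t)\Bigl(uR_Lt \wedge \Bigl\{\varphi(\dots,A,\dots) \Bigm| \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\Bigr\} \subseteq t\Bigr).$$
Consider
$$\Gamma = \{A \mid \Box A \in u\} \cup \Bigl\{\varphi(\dots,A,\dots) \Bigm| \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\Bigr\}.$$
If $\Gamma$ is $L$-inconsistent, there are formulas $B_1,\dots,B_k$ with $\Box B_1,\dots,\Box B_k \in u$, $C_1,\dots,C_l$ with $\Box^{n_i}C_1,\dots,\Box^{n_i}C_l \in x_i$, $1 \le i \le m$, such that
$$\{B_1,\dots,B_k,\ \varphi(\dots,C_1,\dots),\dots,\varphi(\dots,C_l,\dots)\} \vdash_L \bot,$$
or
$$\vdash_L B_1 \wedge \dots \wedge B_k \to \neg(\varphi(\dots,C_1,\dots) \wedge \dots \wedge \varphi(\dots,C_l,\dots)).$$
Since $\varphi$ is affirmative, Lemma 17 then gives
$$\vdash_L B_1 \wedge \dots \wedge B_k \to \neg\varphi(\dots,C_1 \wedge \dots \wedge C_l,\dots).$$
Since $u$ contains $L$ and $\Box(B_1 \wedge \dots \wedge B_k) \in u$, $\Box\neg\varphi(\dots,C_1 \wedge \dots \wedge C_l,\dots) \in u$. But since $\Box^{n_i}(C_1 \wedge \dots \wedge C_l) \in x_i$, our assumption gives $\Diamond\varphi(\dots,C_1 \wedge \dots \wedge C_l,\dots) \in u$. Contradiction. So $\Gamma$ is $L$-consistent and can be extended to an $L$-maximal set $t$. $\{\varphi(\dots,A,\dots) \mid \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\} \subseteq t$, and $\{A \mid \Box A \in u\} \subseteq t$ means by definition that $uR_Lt$.

(e) The antecedent of (e) means that there is a union $\Pi^1 \cup \dots \cup \Pi^l = \Pi$ such that
$$\bigwedge_{j=1}^{l}(\forall S \in \Pi^j)\Bigl(\bigwedge_{i=1}^{m} x_iM_R^{n_i}S \to u_jR^{\varphi_j}(\dots,S,\dots)\Bigr);$$
or equivalently in the canonical frame, there is a union $Fm^1 \cup \dots \cup Fm^l = Fm$ such that
$$\bigwedge_{j=1}^{l}\Bigl\{\varphi_j(\dots,A,\dots) \Bigm| A \in Fm^j \wedge \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\Bigr\} \subseteq u_j.$$
Given this, we want to show
$$\bigvee_{j=1}^{l}\Bigl\{\varphi_j(\dots,A,\dots) \Bigm| \bigwedge_{i=1}^{m}(\Box^{n_i}A \in x_i)\Bigr\} \subseteq u_j,$$
so suppose for a reductio argument that there are formulas $B_1,\dots,B_l$ such that $\Box^{n_i}B_1,\dots,\Box^{n_i}B_l \in x_i$, $1 \le i \le m$, but $\varphi_j(\dots,B_j,\dots) \notin u_j$, $1 \le j \le l$. Since $\Box^{n_i}(B_1 \wedge \dots \wedge B_l) \in x_i$, $1 \le i \le m$, and $B_1 \wedge \dots \wedge B_l$ must belong to some $Fm^j$, we must have $\varphi_j(\dots,B_1 \wedge \dots \wedge B_l,\dots) \in u_j$ for this $j$. And by Lemma 17 and the affirmativity of $\varphi_j$, this implies $\varphi_j(\dots,B_j,\dots) \in u_j$. Contradiction.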
One sees that any simple frame has the further property: (7) For all affirmative $\varphi$'s etc.,
where $\Phi$ is any formula not containing $S_1,\dots,S_k$. The converse holds in any first order frame by the monotonicity of the $R^{\varphi_j}$'s.

By a simple formula we understand a formula of the type described in Theorem 8, by a simple schema a schema constructed from a simple formula, and by a simple logic a logic axiomatisable with only the rule of necessitation and simple schemas.

THEOREM 19. Let $S$ be a simple schema. Then $KS$ is determined by $s$.

PROOF. We know that $\mathcal{F}_{KS} \models \forall S$, hence $\mathcal{F}_{KS} \models \forall S'$, and by (7), $\mathcal{F}_{KS} \models s$. Since $s$ is expressible in $L$, $\langle U_{KS},R_{KS}\rangle \models s$. Soundness was proved in Theorem 8.
This generalises the completeness result for KH' in [11] and their conjecture D.

In the style of Thomason [20], we might conclude that any normal logic is determined by its simple first order frames, define $F$ as the class of formulas $A$ such that whenever $\langle U,\Pi,R\rangle$ is simple and $\models^{\langle U,\Pi,R\rangle} A$, then $\models^{\langle U,R\rangle} A$, and show that the second order semantics is adequate for any normal logic axiomatisable with only the rule of necessitation and axiom schemas from $F$. We see that all simple formulas are in $F$.

Among interesting formulas which are simple we mention $I_n$ of Fine [2], similar generalisations of B and Lemmon-Scott's H and H+, etc., Segerberg's $B_n$ and a similar generalisation of Sobocinski's H, Fine's $L_n$, the axioms Segerberg [18] treats under the names P, Zem, R, F, and one of the axioms used to axiomatise S4Sch.

In [18, Theorem 3.7, Chapter IV], Segerberg derives completeness results for regular logics from completeness results for normal logics. But completeness results for the logics E2(S) and E3(S), which may be obtained by direct methods, do not come out as a corollary to this general theorem, since
$$\Diamond(\Diamond\top \wedge \Box A) \to A$$
is not a 'Hintikka schema'. But it is simple, and Segerberg's [18, Theorem 3.7] is immediately improved when completeness results for normal logics are improved. Thus E2(S) and E3(S) are brought under the same principle as the other regular logics Segerberg treats in [18, Chapter IV, Section 3]. So he is right in assuming that his observations on p. 223 'may not be a coincidence'.

The formula $\Box\Diamond p_1 \to \Diamond\Box p_1$ is not in $F$. This is shown by the frame $\langle N,\Pi,<\rangle$, where $N$ is the set of natural numbers, $\Pi$ the set of finite and cofinite sets and $<$ the ordinary ordering. $\Pi$ is closed under $P_R$ since
$$P_R(S) = \begin{cases} \emptyset & \text{if } S = \emptyset, \\ \{x \mid x < \max S\} & \text{if } S \text{ finite}, \\ N & \text{if } S \text{ infinite}. \end{cases}$$
The frame is simple since $P_{\breve R}(X) = \{x \mid \min X < x\}$ is cofinite for any $X \subseteq N$ and hence is in $\Pi$. It is seen that the formula is valid in $\langle N,\Pi,<\rangle$, but taking $V(1)$ = the set of odd numbers shows that it is not valid in $\langle N,<\rangle$. Fine has shown that KM is (second order) complete, but it is presumably not complete for any first order property. We note that the result in [11] about KM" may be generalised.

6. Open problems
(1) Are all formulas in $F$ simple?
(2) Do all simple logics have the finite frame property? Do all logics axiomatisable in $F$ have it? Some of them have, according to Theorems 6 and 7 in [5]. Hopefully, Gabbay's results can be generalised.
(3) Do we have V2 ($\alpha$, $A$) $\wedge$ R2 ($A$, $\alpha$) $\wedge$ $\alpha \in L$ $\to$ C2 (KF(A), $\alpha$)?
(4) Is there a simple classification of those $A$ and $\alpha \in L$ for which C2 (KF(A), $\alpha$) and V2 ($\alpha$, $A$) but not R2 ($A$, $\alpha$)?
(5) Can the method in [20] for constructing an equivalent refined frame for each first order frame be generalised to give an equivalent simple frame?¹

Added in proof
(1) The properties (c) and (e) (pp. 135, 138f.) hold in all first order frames, not only in $\mathcal{F}_L$.
(2) "Open problem 3": The answer is no. The incomplete logic in [24] corresponds to reflexivity and transitivity.
(3) Independently of the author's results reported above, R. I. Goldblatt proved conjecture D, and J. F. A. K. van Benthem proved the correspondence result for the formula in that conjecture.
(4) Van Benthem [21] has proved that M does not correspond to any first order property. Goldblatt [22] proved that, too, and also that KM is not determined by any first order property.

¹ At the Third Scandinavian Logic Symposium the author presented an alleged construction to this effect, but later Krister Segerberg found a mistake in one of the proofs.

References

[1] K. Fine, An incomplete logic containing S4, Theoria 40 (1974) 23-29.
[2] K. Fine, Logics containing K4, J. Symbolic Logic 39 (1973) 31-42.
[3] K. Fine, Normal forms in modal logic, unpublished.
[4] F. B. Fitch, A correlation between modal reduction principles and properties of relations, J. Philosophical Logic 2 (1973) 97-101.
[5] D. M. Gabbay, A general filtration method for modal logics, J. Philosophical Logic 1 (1972) 29-34.
[6] M. Gerson, The inadequacy of the neighbourhood semantics for modal logic (January, 1973).
[7] W. H. Hanson, Semantics for deontic logic, Logique et Analyse 8 (1965) 177-190.
[8] B. Jónsson and A. Tarski, Boolean algebras with operators I, Amer. J. Math. 73 (1951) 891-939.
[9] S. A. Kripke, Semantical analysis of modal logic I, Normal modal propositional calculi, Z. Math. Logik Grundlagen Math. 9 (1963) 67-96.
[10] E. J. Lemmon, Algebraic semantics for modal logic I and II, J. Symbolic Logic 31 (1966) 46-65, 191-218.
[11] E. J. Lemmon and D. Scott, Intensional logic, Preliminary Draft of Initial Chapters by E. J. Lemmon, Stanford (July, 1966).
[12] A. Prior, Past, present and future (Clarendon, Oxford, 1967).
[13] H. Sahlqvist, A note on irreflexive, asymmetric and intransitive Kripke models, Preprint Series No. 11, Math., University of Oslo (April 27, 1972).
[14] G. Schumm, T, S4 and Henkinesque trees (abstract), J. Symbolic Logic 37 (1972) 446.
[15] K. Segerberg, Decidability of S4.1, Theoria 34 (1968) 7-20.
[16] K. Segerberg, Decidability of four modal logics, Theoria 34 (1968) 21-25.
[17] K. Segerberg, Modal logics with linear alternative relations, Theoria 36 (1970) 301-322.
[18] K. Segerberg, An essay in classical modal logic, Filosofiska studier utgivna av Filosofiska Föreningen och Filosofiska Institutionen vid Uppsala Universitet, Uppsala (1971).
[19] K. Segerberg, Review of M. K. Rennie: On postulates for temporal order, J. Symbolic Logic 37 (1972) 629.
[20] S. K. Thomason, Semantic analysis of tense logic, J. Symbolic Logic 37 (1972) 155-158.
Additional references
[21] J. F. A. K. van Benthem, A note on modal formulas and relational properties, J. Symbolic Logic, to appear.
[22] R. I. Goldblatt, Metamathematics of modal logic (Unpublished doct. diss., Victoria University of Wellington, February 1974).
[23] B. Hansson and P. Gärdenfors, A guide to intensional semantics, in: Modality, morality, and other problems of sense and nonsense, Essays dedicated to Sören Halldén (C. W. K. Gleerup, Lund, 1973) pp. 151-167.
[24] S. K. Thomason, An incompleteness theorem in modal logic, Theoria 40 (1974) 30-34.
ON SOME DECIDABILITY PROBLEMS CONCERNING DEVELOPMENTAL LANGUAGES
Arto SALOMAA
University of Turku, Turku, Finland
1. Introduction
Developmental languages were defined by Lindenmayer [7] in connection with a theory proposed to model the development of filamentous organisms. Each letter in a word is interpreted as a state of a biological cell. Thus, stages of development are represented by strings of letters corresponding to filaments of cells. The developmental instructions which are presumed to generate the organism are modelled by rewriting rules or productions. These productions are applied simultaneously to all letters to reflect the simultaneity of growth in the organism. Such generating devices, generally referred to as L-systems, have been extensively studied from the point of view of formal language theory during the past few years. Thus, the essential characteristic of an L-system is that at every step of a derivation a production is applied to every letter in the word considered.
In this paper, we use the customary abbreviations for the different kinds of L-systems. Thus, OL, 1L and 2L mean context-free, one-sided context-sensitive and two-sided context-sensitive, respectively. D means that rewriting is deterministic, and P that it is propagating, i.e., no letter goes into the empty word. Systems with tables are abbreviated by T, and extended systems (i.e., a terminal alphabet is allowed) by E. The reader is referred to [10] for an introductory discussion, and to [9] for a recent bibliography of the field.
The resistance of some L-families against closure operations is mainly due to the fact that there is no terminal alphabet and, consequently, also
all intermediate words are in the language. Families generated by extended systems possess strong closure properties. E.g., the family TOL is an anti-AFL, whereas ETOL is a full AFL. An important observation (made in [2]) as regards decidability problems is that all OL-systems (including tables and extended systems) can be simulated by index grammars of Aho [1].
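The parallel rewriting described in the Introduction can be illustrated by a minimal Python sketch. The function names `step` and `derive` and the two-letter PDOL-system used here are hypothetical illustrations, not taken from the paper; the only point is that every letter of the current word is rewritten in the same step.

```python
def step(word, productions):
    """One derivation step: rewrite every letter of the word simultaneously."""
    return "".join(productions[letter] for letter in word)


def derive(axiom, productions, steps):
    """Return the words P_0, ..., P_steps of a deterministic OL-system."""
    words = [axiom]
    for _ in range(steps):
        words.append(step(words[-1], productions))
    return words


# Hypothetical PDOL-system (illustration only): a -> ab, b -> a, axiom a.
example_productions = {"a": "ab", "b": "a"}
example_sequence = derive("a", example_productions, 5)
example_lengths = [len(w) for w in example_sequence]
```

Because there is no terminal alphabet, every word of `example_sequence` belongs to the generated language, which is the feature the text singles out as the source of the weak closure properties.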
2. Basic decidability problems
Since membership, emptiness and finiteness are decidable for index languages [1] and [5], the following theorem is now immediately obtained.
THEOREM 1. Membership, emptiness and finiteness are decidable for all OL-families, in particular for the family ETOL.
THEOREM 2. Consider any family XYOL among the OL-families. There is an algorithm for deciding, of an arbitrary language L in XYOL and an arbitrary word P, whether or not P occurs as a subword of a word in L.
PROOF. Let G be an index grammar for L, with the terminal alphabet $V_T$. Since the index languages form an AFL, also the language $L(G) \cap V_T^{*} P V_T^{*}$ is an index language. Moreover, an index grammar $G_1$ for it can be effectively found. Theorem 2 now follows because we can decide the emptiness of $L(G_1)$. □
In many cases (e.g., for POL-systems), the decision method of Theorem 2 can be replaced by a simpler device of checking through all 'direct ancestors' in derivation trees.
THEOREM 3. Consider any family XYOL among the OL-families. There is an algorithm for deciding, of an arbitrary language L in XYOL and an arbitrary word P, whether or not P occurs infinitely many times as a subword in L.
Theorem 3 is obtained in the same way as the previous theorem, by deciding the finiteness of $L(G_1)$. Because the equivalence of sentential forms of context-free grammars is undecidable [12], the following theorem is obtained.
THEOREM 4. The equivalence of OL-systems (even of POL-systems) is undecidable. Also the equivalence of DTOL-systems is undecidable.
Theorem 4 leaves open the equivalence problem for DOL- and PDOL-languages. In fact, these are the longest-standing open problems in this area. Also the equivalence problems for DOL- and PDOL-sequences are open (i.e., given two DOL-systems, one has to decide whether they generate the same sequence of words). It seems likely that all of these problems are decidable. In fact [M. Nielsen, personal communication], if we can solve the equivalence problem for PDOL-sequences, we can solve it for PDOL-languages. On the other hand, to solve the former problem, it suffices to consider two PDOL-sequences $P_i$ and $Q_i$, $i = 1, 2, \ldots$, such that, for every i and every letter a, the number of occurrences of a in $P_i$ equals the number of occurrences of a in $Q_i$. (The condition is obviously necessary for equivalence and it can be decided by the theory of growth functions considered in the next section.) One then has to determine a bound k such that $P_i = Q_i$ for $i \le k$ implies $P_i = Q_i$ for $i > k$. Results concerning subword complexity [4] might also be relevant for these problems.
Decision methods of an arithmetic nature can be obtained for the case where the alphabet consists of one letter only. (Such systems are called UL-systems. Now it is obviously irrelevant whether the system is OL, 1L or 2L.) Herman et al. [6] have shown that the generated language is always either regular or of the form $\{a^{jk^{i}} \mid i \ge 0\}$, for some k and j. All their constructions are effective and, thus, there is an algorithm for deciding whether a given UL-language is regular. Herman et al. [6] propose as an open problem the existence of a converse algorithm. Such an algorithm can indeed be found [13] by considering the minimal automaton accepting the given regular language:
THEOREM 5. There is an algorithm for deciding whether a given regular language is UL.
3. Growth functions
Consider any deterministic L-system defining a unique sequence of words $P_0, P_1, P_2, \ldots$, where $P_0$ is the axiom. (Thus, systems with tables
are excluded.) The function
$$f(n) = |P_n|, \quad n = 0, 1, 2, \ldots,$$
giving the length of the word $P_n$, is termed the growth function associated with the system. E.g., for the DOL-system determined by the axiom a and productions
$$a \to abd^{6}, \quad b \to bcd^{11}, \quad c \to cd^{6}, \quad d \to d,$$
we have $f(n) = (n + 1)^{3}$. The basic paper in this area is by Szilard [14]. Growth functions of DOL-systems fit in the framework of the theory of integral sequential word functions [8]. The latter have been extensively studied in the past in connection with probabilistic automata.
Assume that the alphabet of a DOL-system consists of the letters $a_1, \ldots, a_k$. Let $\pi$ be the k-dimensional row vector such that its ith component equals the number of occurrences of $a_i$ in the axiom $P_0$, for $i = 1, \ldots, k$. Let $\eta$ be the k-dimensional column vector with all components equal to 1. Let M be the k-dimensional square matrix whose (i, j)th entry equals the number of occurrences of $a_j$ on the right-hand side of the production with $a_i$ on the left-hand side. Then the growth function can be given the matrix representation $f(n) = \pi M^{n} \eta$. This representation gives rise to the following theorem [14, 8]. (By the growth equivalence problem we mean the problem of deciding of two L-systems of a given type whether or not they possess the same growth function.)
THEOREM 6. The generating function of the growth function f of a DOL-system equals $\pi (I - Mx)^{-1} \eta$, where I is the identity matrix. Consequently, the growth equivalence problem of DOL-systems is decidable.
By changing $\eta$, one gets the same result also for the case where only the number of occurrences of some letters in a DOL-sequence is considered. Theorem 6 also gives a solution to the 'growth analysis' problem of DOL-systems: given a system, one has to determine its growth function. Another more practical solution is based on difference equations [11, 16, 17]. The converse 'growth synthesis' problem (i.e., given a function, one has to realize it, if possible, as the growth function of a system of some previously specified type) is much more difficult. The following result holds [8].
THEOREM 7. There is an effective procedure A with the following property. Given a function f(n) and an upper bound r for the Hankel rank of f, A produces a DOL-system whose growth function equals f, provided such a system exists. If no such system exists, A runs forever.
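Returning to the matrix representation that precedes Theorem 6, the following Python sketch computes the growth function of the cubic example of this section in two ways: by direct parallel rewriting, and as the product of the axiom vector, the n-th power of the production matrix, and the all-ones column vector. The function names are hypothetical, and the letters are ordered alphabetically as an arbitrary choice; it is a small numerical check under these assumptions, not a decision procedure.

```python
def step(word, productions):
    """Rewrite all letters of the word in parallel."""
    return "".join(productions[x] for x in word)


def growth_by_rewriting(axiom, productions, n_max):
    """Lengths |P_0|, ..., |P_n_max| obtained by actually deriving the words."""
    lengths, word = [], axiom
    for _ in range(n_max + 1):
        lengths.append(len(word))
        word = step(word, productions)
    return lengths


def growth_by_matrix(axiom, productions, n_max):
    """Same values via (axiom vector) * M^n * (all-ones vector)."""
    alphabet = sorted(productions)
    pi = [axiom.count(x) for x in alphabet]                      # row vector
    M = [[productions[x].count(y) for y in alphabet] for x in alphabet]
    v = [1] * len(alphabet)                                      # column vector of ones
    values = []
    for _ in range(n_max + 1):
        values.append(sum(p * c for p, c in zip(pi, v)))         # pi * (M^n * ones)
        v = [sum(row[j] * v[j] for j in range(len(v))) for row in M]  # v <- M v
    return values


# The example system of this section: a -> abd^6, b -> bcd^11, c -> cd^6, d -> d.
PRODUCTIONS = {"a": "ab" + "d" * 6, "b": "bc" + "d" * 11, "c": "c" + "d" * 6, "d": "d"}
by_words = growth_by_rewriting("a", PRODUCTIONS, 7)
by_matrix = growth_by_matrix("a", PRODUCTIONS, 7)
```

Replacing the all-ones vector by a 0/1 vector counts only selected letters, which is the variation mentioned after Theorem 6.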
It is a consequence of Theorem 6 that DOL growth functions are always exponential, polynomial or combinations of the two. In the DOL case, one can give an explicit characterization of the different types of growth. Denote by 3 the exponential growth (i.e., there are $n_0$ and $t > 1$ such that the growth function satisfies the condition $f(n) \ge t^{n}$ for $n \ge n_0$), by 2 the growth bounded by a polynomial but not bounded by a constant, by 1 the growth bounded by a constant but not becoming ultimately 0, and by 0 the growth becoming ultimately 0. (These exhaust all the possibilities in the DOL case.) Given a semi-DOL-system, i.e., a DOL-system without the axiom, different types of growth may result by different choices of the axiom. Thus, the productions
$$a \to a^{2}b, \quad b \to bc, \quad c \to cd, \quad d \to \varepsilon$$
give rise to the type combination 3210. The following theorem [16] tells which combinations may occur in DOL-systems.
THEOREM 8. Type 2 never occurs without type 1. All other combinations are possible.
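The type combination 3210 of the semi-DOL-system above can be observed experimentally. The Python sketch below (function and variable names are hypothetical) derives the first few word lengths from each one-letter axiom; a finite prefix of lengths only suggests the growth type and does not prove it.

```python
def step(word, productions):
    """Rewrite all letters of the word in parallel."""
    return "".join(productions[x] for x in word)


def lengths(axiom, productions, steps):
    """Word lengths |P_0|, ..., |P_steps| for the given axiom."""
    out, word = [], axiom
    for _ in range(steps + 1):
        out.append(len(word))
        word = step(word, productions)
    return out


# The semi-DOL-system of the text: a -> aab, b -> bc, c -> cd, d -> empty word.
SEMI_SYSTEM = {"a": "aab", "b": "bc", "c": "cd", "d": ""}

growth_from = {axiom: lengths(axiom, SEMI_SYSTEM, 4) for axiom in "abcd"}
# axiom d: lengths die out (type 0); axiom c: bounded (type 1);
# axiom b: linear (type 2);          axiom a: at least 2^n letters a (type 3).
```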
Assume that f(n) is the growth function of some (deterministic) L-system and, furthermore, that f(n) is not bounded by a constant. Then f(n) is at most exponential and at least logarithmic. These bounds give some restrictions for the growth realizable by an L-system. Consider the following example due to Lindenmayer. A filament of cells grows in such a way that, at each step of the growth process, the first cell remains undivided, the second cell is divided into two new cells, the third cell into three new cells, and so forth. It is easy to see that this growth is more than exponential and, hence, not realizable by any L-system. On the other hand, the possibilities for D1L and D2L growth are much richer than those for DOL growth. Consequently, the former growth functions are much harder to characterize than the latter. (In fact, no general characterizations are known.) The following theorem gives some examples.
THEOREM 9. For D1L-systems without the axiom, all combinations among the types 3, 2, 1, 0 are possible. There are D1L-systems with a logarithmic growth function. For each natural number t, there is a D2L-system whose growth function is asymptotically equal to $\sqrt[t]{n}$.
PROOF. The first sentence follows by Theorem 8 and by considering the semi-D1L-system with the productions $a \to a$, ${}^{g}a \to a^{2}$, where g is the input from the environment. An example establishing the second sentence is due to Herman, cf. [8]. The example given before [8, Theorem 35] has a growth function asymptotically equal to $n^{1/2}$. This is due to the fact that the lengths of the constant intervals grow in a linear fashion. The growth $n^{1/3}$ is obtained similarly by making these lengths grow in a quadratic fashion. This is achieved by letting a messenger travel back and forth in the string, always going one step further than at the previous time. Growth occurs only when the messenger has reached the right end of the string. A PD2L-system with these properties is defined as follows. The axiom is bca, g is the input from the environment, and the productions are (a left superscript names the required left neighbour, a right superscript the required right neighbour):
$a \to a$ except ${}^{b}a \to b$, ${}^{h}a \to c$, ${}^{f}a \to c$, $a^{d} \to d$, $a^{e} \to e$,
$b \to a$,
$c \to c$ except ${}^{b}c \to h$, ${}^{b}c^{g} \to ae$,
$d \to a$ except ${}^{g}d \to b$,
$e \to a$ except ${}^{g}e \to f$,
$f \to b$, $h \to d$.
Then the generated sequence is: bca, aha, adc, dac, bac, abc, aaae, aaea, aeaa, eaaa, faaa, bcaa, ahaa, adca, daca, baca, abca, aaha, aadc, adac, daac, baac, abac, aabc, aaaae, aaaea, aaeaa, aeaaa, eaaaa, faaaa, bcaaa, .... Other fractional powers are obtained by iterating this procedure. (Thus, to get $n^{1/4}$, each of the messengers d and b has to travel back and forth.)
Theorem 9 gives many examples of growth functions of context-dependent L-systems which are not realizable by any DOL-system. An interesting open problem is whether or not there exists an L-system with
growth type 2½, i.e., growth is faster than polynomial but slower than exponential, as exemplified by the function $n^{\log n}$. (With biological connotations, type 3 growth has been called 'malignant' and type 2 growth 'normal'. Thus, type 2½ could be considered as growth which is neither malignant nor normal.)
We mention finally an example showing that, for D1L-systems, very small changes in the number of occurrences of some letter in the axiom can cause big changes in the growth type, a phenomenon not possible for DOL-systems. Consider the productions $b \to c$, $c \to c$, $a \to a^{2}$ except ${}^{b}a \to b$. Then the axiom $ba^{2}$ gives rise to linear growth whereas the axiom $ba^{3}$ gives rise to exponential growth. By letting also c grow, other similar changes are obtained.
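The two context-dependent examples of this section can be replayed with a short Python sketch. The encoding is an assumption of this sketch rather than something stated in the paper: each rule is a function of the left neighbour, the letter and the right neighbour, the environment symbol g is padded onto both ends of the word, and the rule tables are read off from the productions together with the generated sequence listed in the proof of Theorem 9. All function names are hypothetical.

```python
ENV = "g"  # stands for the environment at both ends of the filament


def step(word, rule):
    """One parallel step of a context-sensitive (1L or 2L) deterministic system."""
    padded = ENV + word + ENV
    return "".join(rule(padded[i - 1], padded[i], padded[i + 1])
                   for i in range(1, len(padded) - 1))


def pd2l_rule(left, x, right):
    """Messenger system of the proof of Theorem 9 (axiom bca), as read here."""
    if x == "a":
        if left == "b":
            return "b"
        if left in ("h", "f"):
            return "c"
        if right in ("d", "e"):
            return right            # a copies the messenger arriving from the right
        return "a"
    if x == "c":
        if left == "b":
            return "ae" if right == ENV else "h"   # growth only at the right end
        return "c"
    if x == "d":
        return "b" if left == ENV else "a"
    if x == "e":
        return "f" if left == ENV else "a"
    return {"b": "a", "f": "b", "h": "d"}[x]


def d1l_rule(left, x, right):
    """Final example: b -> c, c -> c, a -> aa, except a -> b just right of b."""
    if x == "a":
        return "b" if left == "b" else "aa"
    return "c"


def run(axiom, rule, steps):
    words = [axiom]
    for _ in range(steps):
        words.append(step(words[-1], rule))
    return words


messenger = run("bca", pd2l_rule, 11)
linear = [len(w) for w in run("baa", d1l_rule, 4)]
exponential = [len(w) for w in run("baaa", d1l_rule, 4)]
```

With these rules the first twelve words agree with the sequence printed in the proof, and one extra letter a in the axiom of the second system is enough to switch the observed lengths from linear to roughly doubling.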
4. Macro systems and Lindenmayer AFL's
Let $\mathcal{L}$ be a family of languages. By definition, $\mathcal{L}$MOL consists of languages L such that L is the result of an $\mathcal{L}$-substitution into a OL-language. Languages in $\mathcal{L}$MOL are referred to as $\mathcal{L}$-macro-OL-languages. The family $\mathcal{L}$MTOL is defined similarly by considering $\mathcal{L}$-substitutions into TOL-languages. Macro systems were introduced in [3]. They correspond to the biological situation where one first observes longer segments ('macros') in a growth process and, finally, the structure of each macro is inserted. If $\mathcal{L}$ is the family of finite (resp. regular) languages, the corresponding macro families are denoted by FMOL and FMTOL (resp. RMOL and RMTOL). As usual, a cone is a family of languages (containing at least one nonempty language) closed under rational transductions (or, equivalently, under homomorphism, inverse homomorphism and intersection with regular languages).
THEOREM 10. If $\mathcal{L}$ is a cone, then $\mathcal{L}_1 = \mathcal{L}$MOL and $\mathcal{L}_2 = \mathcal{L}$MTOL are full AFL's.
PROOF. We prove the theorem for $\mathcal{L}_1$, the proof for $\mathcal{L}_2$ being similar. It suffices [10, p. 135] to prove that $\mathcal{L}_1$ is closed under union, star, regular substitution, and intersection with regular languages. Closure under union and star is obvious, and closure under regular substitution
follows immediately from the fact that $\mathcal{L}$ is closed under regular substitution. To show closure under intersection with regular languages, we assume that $\mathcal{L}_1$ contains the language L and that R is a regular language, accepted by the finite automaton M with the state set Q, initial state $q_0$ and final state set F. For $q_i, q_j \in Q$, denote by $K_{ij}$ the regular language consisting of all words which move M from $q_i$ to $q_j$. Let L be the result of substituting languages $L_i$, $i = 1, \ldots, k$, belonging to $\mathcal{L}$ into the language generated by the OL-system G whose alphabet consists of the letters $a_1, \ldots, a_k$. Without loss of generality, we assume that there is only one production $X \to \varepsilon$ (i.e., X is one of the $a_i$'s) in G with the empty word $\varepsilon$ on the right side. (Otherwise, we modify G by introducing a new letter X with productions $X \to X$ and $X \to \varepsilon$, removing all other productions $A \to \varepsilon$, and replacing 0 or more occurrences of such letters A on the right-hand sides of the original productions in all possible ways by X.) We may also assume without loss of generality that $a_1$ is the axiom of G. A new OL-system $G_1$ is now defined as follows. The letters are S (axiom) and triples of the form $(q_i, a, q_j)$, where $q_i, q_j \in Q$ and a is a letter of G. The productions of $G_1$ consist of all of the following:
(i) $S \to (q_0, a_1, q_F)$, where $q_F \in F$.
(ii) If $A \to B_1 B_2 \cdots B_n$, $n \ge 1$, $B_i$'s are letters, is a production of G, then
$$(q_i, A, q_j) \to (q_i, B_1, q_{l_1})(q_{l_1}, B_2, q_{l_2}) \cdots (q_{l_{n-1}}, B_n, q_j)$$
is a production of $G_1$ for all states $q_i, q_j, q_{l_1}, \ldots, q_{l_{n-1}}$. (For n = 1 the production is $(q_i, A, q_j) \to (q_i, B_1, q_j)$.)
(iii) $(q_i, X, q_i) \to \varepsilon$, for all states $q_i$.
The language substituted for S is empty. The language substituted for $(q_i, a_t, q_j)$ is $L_t \cap K_{ij}$. Since $\mathcal{L}$ is a cone and $K_{ij}$ is regular, the language $L_t \cap K_{ij}$ is in $\mathcal{L}$. □
Similarly, one can prove that if $\mathcal{L}$ is a faithful cone, then $\mathcal{L}_1$ and $\mathcal{L}_2$ are AFL's. $\mathcal{L}_1$ and $\mathcal{L}_2$ are termed the Lindenmayer AFL's associated with $\mathcal{L}$. In [2], a different method is used to show that if $\mathcal{L}$ is a full AFL, then so are $\mathcal{L}_1$ and $\mathcal{L}_2$.
THEOREM 11. FMOL = EOL.
PROOF. As in the previous proof, one can show that FMOL is closed under intersection with regular languages. This implies the inclusion EOL $\subseteq$ FMOL. The reverse inclusion follows from the fact [15] that EOL is closed under arbitrary homomorphisms.
REMARK. In the study of FMOL-languages, one has to be careful in distinguishing between $\varepsilon$-productions in the OL-system and $\varepsilon$-substitutions. The closure of FMOL under arbitrary homomorphisms is obvious by definition, whereas the proof of this closure for EOL is rather complicated. Thus, Theorem 11 is not as immediate as claimed in [2]. All of the following facts are rather hard to prove directly, but any of them implies easily the other two:
(i) EOL is closed under homomorphism.
(ii) FMOL $\subseteq$ EOL.
(iii) The two definitions of FMOL languages given in [2] and [3] (one as above and the other involving a finite set of terminal productions) are equivalent.
Along similar lines, one can show that FMTOL = RMTOL = ETOL.
References
[1] A. Aho, Indexed grammars - an extension of context-free grammars, J. Assoc. Comput. Mach. 15 (1968) 647-671.
[2] K. Culik II, On some families of languages related to developmental systems, Internat. J. Comput. Math., to appear.
[3] K. Culik II and J. Opatrny, Macro OL systems, to appear.
[4] A. Ehrenfeucht and G. Rozenberg, Subwords in deterministic TOL systems, to appear.
[5] T. Hayashi, On derivation trees of indexed grammars, an extension of the uvwxy-theorem, Kyoto University, Tech. Rep. RIMS-122 (1972).
[6] G. Herman, K. Lee, J. v. Leeuwen and G. Rozenberg, Characterization of unary developmental languages, Discrete Math. 6 (1973) 235-247.
[7] A. Lindenmayer, Mathematical models for cellular interactions in development, Parts I-II, J. Theoret. Biol. 18 (1968) 280-315.
[8] A. Paz and A. Salomaa, Integral sequential word functions and growth equivalence of Lindenmayer systems, Information and Control, to appear.
[9] G. Rozenberg and D. Wood, Generative models for parallel processes, McMaster University, Hamilton, Ont., Computer Science Tech. Rept. 73/6 (1973).
[10] A. Salomaa, Formal languages (Academic Press, New York, 1973).
[11] A. Salomaa, On exponential growth in Lindenmayer systems, Indag. Math. 35 (1973) 23-30.
[12] A. Salomaa, On sentential forms of context-free grammars, Acta Informatica 2 (1973) 40-49.
[13] A. Salomaa, Solution of a decision problem concerning unary Lindenmayer systems, Discrete Math., to appear.
[14] A. Szilard, Growth functions of Lindenmayer systems, to appear.
[15] J. van Leeuwen, Pre-set push-down automata, University of California, Berkeley, Calif., Computer Science Tech. Rept. 10 (1973).
[16] P. Vitanyi, Structure of growth in Lindenmayer systems, Tech. Rept. No. 1/73, Mathematisch Centrum, Amsterdam (1973).
[17] P. Vitanyi, Growth of strings in parallel rewriting systems, unpublished.
LAMBDA CALCULUS AND RECURSION THEORY (Preliminary Version)
Dana SCOTT Oxford University, Oxford, England
0. Introduction
A lattice-theoretic and topological method for defining a class of models of the Church-Curry λ-calculus was discovered by the author in 1969 (cf. [8] and [9]). Though we knew then the connections with Recursion Theory, the details were not sufficiently satisfactory to be published. The situation now is completely changed through the subsequent discovery of a new construction of a 'universal' model which by its very nature gives an immediate connection with standard theory ('standard', say, in the sense of Rogers' book [7]). A further advantage of the present construction is the elimination of any previous discussion of abstract lattices and topological spaces, since the structure of the model can be defined directly.
One point should be made at once about the model: the connection with recursive functions is not at all like that of the so-called combinatory arithmetic, because the integers are taken as primitive rather than as defined. (Cf. [2] and [4] about the combinatory approach.) This is hardly a defect, since the aim is to construct a model out of known objects (in this case the space of sets of integers). Further, what is taken as primitive in the corresponding formal language is a very brief selection of arithmetic notions - just the ones that would naturally present themselves - and the combinators (or pure λ-terms) are then used to define everything else. And since no combinator is excluded, the iterators used previously to introduce integers can be studied as well in the present context. Thus nothing has been lost; but much has been
gained, because now we can see the possibility of a real cooperation between arithmetic (and later set-theoretic) notions and those of the λ-calculus. Because the original program of relative combinatory logic could never be carried out, these connections were never clear before.
It is rather strange that the present model was not discovered earlier, for quite sufficient hints are to be found in the early paper of Myhill and Shepherdson [5] and in Rogers' book [7] (especially §§ 9.7-9.8). These two sources introduce effective enumeration operators and indicate that there is a certain amount of algebra about that gives these operators a pleasant theory, but no one seemed ever to take the trouble to find out what it was. Part of the problem was the restriction to effective operators, since the definitions apply to arbitrary continuous operators. Another oversight was in the lack of any definition of λ-abstraction. Rogers defines application [7, p. 147], and we shall use the same definition, but he never defines the corresponding abstraction - though he notes special cases such as composition [7, Theorem XX, given without proof].
Credit for discovery of the model is due to G. Plotkin in his unpublished memorandum [6]. However, his discussion is not entirely satisfactory. Plotkin does not introduce a formal language and so never really writes down the definition of abstraction (though comprehension is 'demonstrated' on pp. 9-10). Secondly, he thinks only of the pure λ-calculus and does not connect the model with recursion-theoretic ideas. He calls his work a 'set-theoretical definition of application', but does not tie up the theory of application with formal set theory either. The crucial point is of course the relation between recursion done via the 'paradoxical' combinator and standard recursion. This problem was originally solved by David Park for the earlier models, and the solution given below is even easier here.
In view of the complex history of the idea behind the construction, it is suggested that we do not call this 'Scott's version of the Myhill-Shepherdson-Rogers-Plotkin model' but simply the graph model. The point of the construction in any case is that a continuous operator can be adequately represented by its graph (in a sense to be made precise below). Now, as these graphs are objects of the same type as the arguments of the operators, this is how a consistent meaning can be given to self-application. In the end it has turned out to be all very straightforward.
A final general point must be made before we turn to the details. The present model does not satisfy strong extensionality in the form of the principle of η-conversion:
(η)  $a = \lambda x.\, a(x)$.
This axiom would say: every object is a function - which incidentally is uniquely determined by its values. Instead, we must be satisfied with ordinary extensionality in the form of this schema:
(ξ)  $\lambda x.\, \tau(x) = \lambda x.\, \sigma(x) \leftrightarrow \forall x.\, \tau(x) = \sigma(x)$.
Here $\tau(x)$ and $\sigma(x)$ are any two terms of the formal language to be introduced. Informally, we can read (ξ) as saying that those objects which are functions are uniquely determined by their values (better: by the argument-value correspondence). Besides extensionality, we can expect the axioms corresponding to α-β-conversion:
(α)  $\lambda x.\, \tau(x) = \lambda y.\, \tau(y)$,
(β)  $(\lambda x.\, \tau(x))(a) = \tau(a)$.
On top of this, there will be a principle of induction to be formulated below, which has not been previously considered in the logical studies of λ-calculus. It involves the paradoxical combinator, since every form of recursion ought to have a corresponding principle of induction. Such an expansion of the axiomatics of the λ-calculus is another advantage of the model construction: because we do not intend to reduce logic to λ-calculus, we feel free to use logic in our formalization. This makes our study look rather different from, say, Curry's.
The failure of (η) should not be considered a defect. Though it is algebraically very elegant to have all objects functions, in applications it is neither here nor there. A few type distinctions never hurt anyone. The function notion is still that of extensional (and continuous!) function. But what, we may ask, are those objects which are not to represent functions? From a formal point of view, this may seem to be a difficult question. But the model will make everything clear and natural. And if the reader is unconvinced, he can easily return to the earlier inverse-limit models, which do satisfy (η) but which are more difficult to relate to ordinary notions.
1. Continuous functions
The domain of elements of the model can be introduced at once: it is the set
$$P\omega = \{x \mid x \subseteq \omega\},$$
the space of all sets of integers. It is a valid question whether we need all $x \subseteq \omega$, which will be discussed later, but at the moment there is no reason to exclude any. We shall regard $P\omega$ as a topological space (and as a complete lattice under $\subseteq$), but the topological structure is so immediate that no general theory of topology is needed - just as with the real numbers. (This is not to exclude the possibility that at a later time some abstract theory might be useful.)
Theoretically speaking we use the weak (or positive) topology; this is the product topology where $P\omega$ is identified with the infinite product space $\{\bot, \top\}^{\omega}$. Here $\{\bot, \top\}$ is not the usual discrete two-point space $\{0, 1\}$. Rather the topology on $\{\bot, \top\}$ makes $\{\bot\}$ closed and $\{\top\}$ open, but $\bot$ and $\top$ are not interchangeable. In $\{0, 1\}$ we allow complete symmetry, and $\{0, 1\}^{\omega}$ - as is well known - becomes a Hausdorff space; in fact, the Cantor space. The present topology on $P\omega$ is not Hausdorff - only $T_0$, in the jargon. But all these theoretical terms are unnecessary.
To make the topology 'visible', introduce the standard enumeration $E = \{e_n \mid n \in \omega\}$ of all finite subsets of $\omega$. Thus n is to be the code number of the finite set $e_n$, where if $k_0 < k_1 < \cdots < k_{m-1}$, then we have
$$e_n = \{k_0, k_1, \ldots, k_{m-1}\} \quad \text{iff} \quad n = \sum_{i<m} 2^{k_i}.$$
That is to say, the elements of $e_n$ correspond exactly to the exponents that occur in the binary expansion of the integer n. This is a well-known one-one enumeration of finite sets, which, note, is well founded in the sense that
$$k \in e_n \text{ always implies } k < n.$$
Note too that the enumeration is effective, and that $\max(e_n)$ is (primitive) recursive in n. Similarly with such relationships as $e_n \subseteq e_m$.
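The standard enumeration of finite sets is easy to make concrete. In the Python sketch below, `e` and `code` are hypothetical names for the decoding and encoding maps; the sketch merely reads off the exponents of the binary expansion, and it notes that, under this coding, inclusion of finite sets reduces to a bitwise test on the code numbers.

```python
def e(n):
    """The finite set e_n: positions of the 1-bits in the binary expansion of n."""
    return frozenset(k for k in range(n.bit_length()) if (n >> k) & 1)


def code(finite_set):
    """Inverse map: the code number n with e_n equal to the given finite set."""
    return sum(1 << k for k in finite_set)


def is_subset_by_code(n, m):
    """e_n is included in e_m exactly when every 1-bit of n is also a 1-bit of m."""
    return n & m == n


sample = {n: sorted(e(n)) for n in range(8)}
# sample[0] is the empty set, sample[5] is {0, 2}, sample[7] is {0, 1, 2}
```

The well-foundedness property of the text (every element of a coded set is smaller than the code) is visible here because a set containing k has code at least 2 to the power k.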
Topologically speaking, the elements of E turn out to be dense in $P\omega$. This is obvious in a way from the equation
$$x = \bigcup \{e_n \mid e_n \subseteq x\},$$
which simply states in symbols the fact that every set x - including infinite ones - is the limit (union) of its finite subsets. We can make this more definite by using the $e_n$ to determine the neighborhoods of $P\omega$ as a topological space. In fact, the subsets
$$U_n = \{x \in P\omega \mid e_n \subseteq x\}$$
are taken as a basis for the open subsets of $P\omega$ by definition. One can then work out that the above union is indeed a limit. In general, a subset $U \subseteq P\omega$ is open if and only if it is a subset finitely determined in the sense that, for all $x \subseteq \omega$, we have
$$x \in U \text{ iff } \exists e_n \subseteq x.\ e_n \in U.$$
This implies that if U is open, $x \in U$ and $x \subseteq y$, then $y \in U$ also. Thus we see why $P\omega$ is not like the Cantor space: when we give a neighborhood of a point $x \in P\omega$, we tell a little of what is in x but nothing of what is out of x. This is why we say that the topology is 'positive'.
A continuous function is one that preserves limits. We need not consider arbitrary limits, however, as the following characterization is sufficient:
DEFINITION. A function $f : P\omega \to P\omega$ is continuous iff for all $x \in P\omega$ we have
$$f(x) = \bigcup \{f(e_n) \mid e_n \subseteq x\}.$$
Just as open sets are 'positive' as noted above, continuous functions are 'monotonic' in the sense that
$$x \subseteq y \text{ always implies } f(x) \subseteq f(y).$$
This is obvious from the definition adopted. It is also easy to verify that 'continuous' in the sense of the definition is the same as that of the usual topological definition: inverse images of open sets are open. But for our purposes it is better to have the equivalent 'ε-δ' definition, which can be given in two forms.
PROPOSITION 1.1. A function $f : P\omega \to P\omega$ is continuous iff for all $x \in P\omega$ and all $m \in \omega$ we have
(i)  $m \in f(x)$ iff $\exists e_n \subseteq x.\ m \in f(e_n)$.
Equivalently we can write
(ii)  $e_m \subseteq f(x)$ iff $\exists e_n \subseteq x.\ e_m \subseteq f(e_n)$.
PROOF. Condition (i) is just a way of writing in logical symbols what was written as an equation in the definition; so there is nothing to prove. Now (i) and (ii) are not equivalent for particular m, but if (i) holds for all $m \in \omega$, then so does (ii), and conversely. The converse is clear because we can specialize $e_m$ to a singleton set to obtain any instance of (i). Suppose then that (i) holds for all m. To prove (ii), for a particular m, note first that the implication from right to left in (ii) holds because f is monotonic. Assume then that $e_m \subseteq f(x)$. Now we can apply (i) for each $k \in e_m$, obtaining a corresponding $e_{n_k} \subseteq x$, where $k \in f(e_{n_k})$. We have but to let
$$e_n = \bigcup \{e_{n_k} \mid k \in e_m\}$$
to find $e_n \subseteq x$ such that $e_m \subseteq f(e_n)$, again by the monotonicity of f.
Another way of reading the characterization of continuous functions in (i) is to say that these functions are completely determined by the pairs of integers n, m such that $m \in f(e_n)$. This is much like saying that a continuous function of a real variable is determined by pairs of rationals q, r such that $r < f(q)$. We can make the point of this remark more concrete by use of a pairing function on the integers. We write
$$(n, m) = \tfrac{1}{2}(n + m)(n + m + 1) + m,$$
and note that $n \le (n, m)$ and $m \le (n, m)$, with the strict inequalities holding except in the cases (0, 0) = 0 and (1, 0) = 1. This pairing function has $\omega$ as its range and has primitive recursive inverse functions, but we do not need any notation for them.
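As a small check of the pairing function just introduced, here is a Python sketch. The name `pair` follows the formula of the text; the inverse `unpair` is a hypothetical helper added for illustration only (the paper deliberately introduces no notation for the inverse functions), computed by locating the diagonal n + m with an integer square root.

```python
import math


def pair(n, m):
    """The pairing (n, m) = (n + m)(n + m + 1)/2 + m."""
    return (n + m) * (n + m + 1) // 2 + m


def unpair(p):
    """Hypothetical inverse: recover n and m from the code p."""
    s = (math.isqrt(8 * p + 1) - 1) // 2      # s = n + m, the diagonal containing p
    m = p - s * (s + 1) // 2
    return s - m, m


# The first ten diagonals are mapped exactly onto the initial segment 0, ..., 54.
codes = sorted(pair(s - m, m) for s in range(10) for m in range(s + 1))

# Pairs (n, m) where n or m equals the code, so that a strict inequality fails.
non_strict = [(n, m) for n in range(20) for m in range(20)
              if not (n < pair(n, m) and m < pair(n, m))]
```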
DEFINITION. The graph of a continuous function $f : P\omega \to P\omega$ is defined by the equation:

$$\mathrm{graph}(f) = \{ (n, m) \mid m \in f(e_n) \}.$$

DEFINITION. The function determined by any set $u \subseteq \omega$ is defined by the equation:

$$\mathrm{fun}(u)(x) = \{ m \mid \exists e_n \subseteq x .\ (n, m) \in u \}.$$

PROPOSITION 1.2. Every continuous function is uniquely determined by its graph in the sense that

(i) $\mathrm{fun}(\mathrm{graph}(f)) = f$.

Conversely, every set of integers determines a continuous function, where we have

(ii) $u \subseteq \mathrm{graph}(\mathrm{fun}(u))$.

Equality holds in (ii) just in case $u$ satisfies the condition that

(iii) $(k, m) \in u$ and $e_k \subseteq e_n$ always imply $(n, m) \in u$.
PROOF. Given $f$, to prove (i) we calculate

$$\mathrm{fun}(\mathrm{graph}(f))(x) = \{ m \mid \exists e_n \subseteq x .\ (n, m) \in \mathrm{graph}(f) \} = \{ m \mid \exists e_n \subseteq x .\ m \in f(e_n) \} = f(x),$$

because $f$ is continuous. Next, given $u$, we see from the definition that $(n, m) \in u$ always implies $m \in \mathrm{fun}(u)(e_n)$; whereas $m \in \mathrm{fun}(u)(e_n)$ implies $\exists e_k \subseteq e_n .\ (k, m) \in u$. It is thus easy to deduce that

$$m \in \mathrm{fun}(u)(x) \iff \exists e_n \subseteq x .\ m \in \mathrm{fun}(u)(e_n),$$

which shows that $\mathrm{fun}(u)$ is continuous. Part (iii) should now also be obvious.

By these quite elementary arguments, we now have a good grasp of the nature of continuous functions on $P\omega$ into $P\omega$; examples of particular functions are forthcoming in the next section. Aside from functions of one argument, we shall also have to consider functions of several arguments.
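The passage between a function and its graph can be tried out on finite sets. The following is only a sketch under stated assumptions: the enumeration $e_n$ is taken to be the binary coding (the set $e_n$ contains $k$ exactly when bit $k$ of $n$ is 1), which is assumed here and not restated in this section; all helper names are hypothetical; and `graph` is cut down to a finite universe so that it can be computed.

```python
from itertools import combinations
from math import isqrt

def pair(n, m):
    return (n + m) * (n + m + 1) // 2 + m

def unpair(p):
    s = (isqrt(8 * p + 1) - 1) // 2
    m = p - s * (s + 1) // 2
    return s - m, m

def code(s):
    """Index n with e_n = s, under the ASSUMED binary coding of finite sets."""
    return sum(2 ** k for k in s)

def finite_subsets(x):
    """Yield every subset of the finite set x as a frozenset."""
    xs = sorted(x)
    for r in range(len(xs) + 1):
        for c in combinations(xs, r):
            yield frozenset(c)

def fun(u, x):
    """fun(u)(x) = {m | some e_n contained in x has (n, m) in u}; u, x finite."""
    indices = {code(s) for s in finite_subsets(x)}
    return frozenset(m for n, m in map(unpair, u) if n in indices)

def graph(f, universe):
    """The finite part of graph(f) given by arguments inside `universe`."""
    return frozenset(pair(code(s), m)
                     for s in finite_subsets(universe) for m in f(s))
```

For a monotone `f` and any `x` inside the chosen universe, `fun(graph(f, universe), x)` should agree with `f(x)`, which is part (i) of the Graph Theorem restricted to that universe.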
DEFINITION. A function $f : (P\omega)^k \to P\omega$ of $k$ variables is continuous iff it is continuous in each of its variables separately.

This definition would not be correct for functions of real variables, but topology in $P\omega$ is rather easier than the classical theory. Thus according to the definition, a function $f(x, y)$ of two variables is continuous just in case the following two equations hold:

$$f(x, y) = \bigcup \{ f(e_n, y) \mid e_n \subseteq x \},$$

$$f(x, y) = \bigcup \{ f(x, e_k) \mid e_k \subseteq y \}.$$

If we like, these can be replaced by one equation:

$$f(x, y) = \bigcup \{ f(e_n, e_k) \mid e_n \subseteq x,\ e_k \subseteq y \}.$$

Those familiar with the product topology will now see that this is the same as having $f$ continuous on the product space $P\omega \times P\omega$, but we need not enter into details. More important is the question of substitution.
PROPOSITION 1.3. Continuous functions of several variables on $P\omega$ are closed under substitution.

PROOF. By ‘substitution’ here we understand generalized composition. We can analyze the process into two steps: first with distinct variables, then by identification of variables. By way of example we might have first

$$f(g(x, y), h(z, w, u)),$$

and then pass to, say,

$$f(g(x, y), h(y, x, x)).$$

Since each variable is treated separately, the proof can be reduced to two special cases simply by neglecting the remaining variables. The special cases are ordinary composition, $f(g(x))$, and identification of two variables, $h(x, x)$, where $f$, $g$ and $h$ are given continuous functions. For the sake of completeness we show the main steps of the easy proofs. In the first case we use characterization (ii) in Proposition 1.1:

$$e_m \subseteq f(g(x)) \iff \exists e_n \subseteq g(x) .\ e_m \subseteq f(e_n)$$
$$\iff \exists e_k \subseteq x .\ \exists e_n \subseteq g(e_k) .\ e_m \subseteq f(e_n)$$
$$\iff \exists e_k \subseteq x .\ e_m \subseteq f(g(e_k)).$$
In the second case:

$$e_m \subseteq h(x, x) \iff \exists e_n \subseteq x .\ e_m \subseteq h(e_n, x) \iff \exists e_n \subseteq x .\ \exists e_k \subseteq x .\ e_m \subseteq h(e_n, e_k).$$

Now if we think of $e_j = e_n \cup e_k$ and use the monotonicity of $h$, we find

$$e_m \subseteq h(x, x) \iff \exists e_j \subseteq x .\ e_m \subseteq h(e_j, e_j),$$

which shows that $h(x, x)$ is continuous in $x$.

It is no surprise that the theory of functions of several variables is closely related to that of one variable - especially as the product space $P\omega \times P\omega$ is homeomorphic to $P\omega$. Without the use of general topology, however, we shall exhibit a more immediate connection, well known to the practitioners of $\lambda$-calculus, in the next section when we discuss iterated application. Before we turn to the algebra of continuous functions, however, there are two general results involving limits which it will be useful to record here: the theorem on fixed points and the theorem on extending continuous functions to larger domains. Both of these theorems can be given in greater generality, but the plan here is to give very elementary proofs.
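PROPOSITION 1.4. Every continuous function $f : P\omega \to P\omega$ has a least fixed point given by the formula

$$\mathrm{fix}(f) = \bigcup \{ f^n(\emptyset) \mid n \in \omega \},$$

where $\emptyset$ is the empty set and $f^n$ is the $n$-fold composition of $f$ with itself.

PROOF. This argument is well known. Suppose $f$ has a fixed point $a = f(a)$. Then since $\emptyset \subseteq a$ and $f$ is monotonic, we argue inductively that $f^n(\emptyset) \subseteq a$ for all $n \in \omega$. Thus $\mathrm{fix}(f) \subseteq a$; hence, if $\mathrm{fix}(f)$ is itself a fixed point, it must be the least one. Now we remark that $f^n(\emptyset) \subseteq f^{n+1}(\emptyset)$; hence, if $e_k \subseteq \mathrm{fix}(f)$, we have $e_k \subseteq f^n(\emptyset)$ for some $n$, because $e_k$ is finite. Thus

$$m \in f(\mathrm{fix}(f)) \iff \exists e_k \subseteq \mathrm{fix}(f) .\ m \in f(e_k)$$
$$\iff \exists n \in \omega .\ \exists e_k \subseteq f^n(\emptyset) .\ m \in f(e_k)$$
$$\iff \exists n \in \omega .\ m \in f^{n+1}(\emptyset)$$
$$\iff m \in \mathrm{fix}(f),$$

and so $f(\mathrm{fix}(f)) = \mathrm{fix}(f)$.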
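The formula for the least fixed point suggests a direct iteration from the empty set. The sketch below assumes that the iterates become stationary after finitely many steps, which holds for the illustrative function chosen here but not for continuous functions in general (where the fixed point is only the union of the whole chain). The names `fix`, `max_steps` and `evens_below_ten` are hypothetical.

```python
def fix(f, max_steps=1000):
    """Iterate f from the empty set until the chain f^n(empty) stops growing."""
    x = frozenset()
    for _ in range(max_steps):
        y = f(x)
        if y == x:          # f(x) = x: the chain has reached its union
            return x
        x = y
    raise RuntimeError("no finite fixed point reached within max_steps")

def evens_below_ten(x):
    """A monotone example: always emit 0, and emit n + 2 for each n in x."""
    return frozenset({0}) | frozenset(n + 2 for n in x if n + 2 < 10)

print(sorted(fix(evens_below_ten)))
```

Each pass adds at most the next even number, so the successive iterates form the increasing chain used in the proof.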
The remaining results in this section require some knowledge of general topology, but not much; we shall indicate the elementary content of the theorems.

PROPOSITION 1.5. Let $X$ and $Y$ be arbitrary topological spaces, where $X \subseteq Y$ as a subspace. Every continuous function $f : X \to P\omega$ can be extended to a continuous function $\bar{f} : Y \to P\omega$ defined by means of the formula

$$\bar{f}(y) = \bigcup \{ \bigcap \{ f(x) \mid x \in X \cap U \} \mid y \in U \},$$

where $y \in Y$ and $U$ ranges over the open sets of $Y$.
PROOF. Recalling the base for the topology of $P\omega$, we note that

$$U_n = \{ z \in P\omega \mid e_n \subseteq z \} = \bigcap \{ \{ z \mid m \in z \} \mid m \in e_n \},$$

a finite intersection. Thus to prove that $\bar{f}$ is continuous on $Y$, we need only show that the inverse images of the sets $\{ z \mid m \in z \}$ are open in $Y$. To this end, calculate

$$m \in \bar{f}(y) \iff \exists U\, [\, y \in U \wedge m \in \bigcap \{ f(x) \mid x \in X \cap U \} \,],$$

so that

$$\{ y \in Y \mid m \in \bar{f}(y) \} = \bigcup \{ U \mid m \in \bigcap \{ f(x) \mid x \in X \cap U \} \}.$$

This set is obviously open, thus $\bar{f}$ is continuous. To prove that $\bar{f}$ extends $f$, note first that when $x \in X$, it is obvious from the definition that $\bar{f}(x) \subseteq f(x)$. For the opposite inclusion, suppose that $m \in f(x)$. Since $f$ is continuous, there is an open subset $V$ of $X$ such that $x' \in V$ always implies $m \in f(x')$. But $X$ is a subspace of $Y$, so $V = X \cap U$ for some open subset $U$ of $Y$. Therefore,

$$m \in \bigcap \{ f(x') \mid x' \in X \cap U \},$$

and we have $m \in \bar{f}(x)$, as was to be shown.

The purpose of stressing the Extension Theorem (Proposition 1.5) is to show that there are many $P\omega$-valued continuous functions. The reason is that there are so many subspaces of $P\omega$, as we shall see in the Embedding Theorem (Proposition 1.6). Continuous functions between subspaces can always be regarded as restrictions of continuous functions $f : P\omega \to P\omega$, as Proposition 1.5 shows. This remark justifies our interest in and concentration on the continuous functions defined over $P\omega$.

For readers not as familiar with general topology we may remark that Proposition 1.5 can be turned into a definition. Suppose that $X \subseteq P\omega$ is a subset of $P\omega$. We can of course regard it as a subspace with the relative topology, but what interests us are the continuous functions defined on $X$. From Proposition 1.5, we can see that a necessary and sufficient condition for $f : X \to P\omega$ to be continuous is that for $x \in X$ and $m \in \omega$ we have

$$m \in f(x) \iff \exists e_n \subseteq x\ \forall x' \in X\, [\, e_n \subseteq x' \rightarrow m \in f(x') \,].$$

We note that for this definition, because $e_n \notin X$ in general, we have to replace the clause $m \in f(e_n)$ in (i) of Proposition 1.1 by something more complicated. In this way we have an elementary characterization which does not employ general notions.
PROPOSITION 1.6. Every $T_0$-space $X$ with a countable basis for its topology can be embedded as a subspace of $P\omega$.

PROOF. The proof is actually quite trivial. Let the basis for the topology of $X$ be the countable family of open sets

$$\{ V_n \mid n \in \omega \}.$$

That $X$ is a $T_0$-space means that each point of $X$ is uniquely determined by the $V_n$ that contain it. Thus if we define a function $f : X \to P\omega$ by the equation

$$f(x) = \{ n \in \omega \mid x \in V_n \},$$

then the function will be one-one from $X$ to a subset of $P\omega$. Note next that $\{ x \in X \mid n \in f(x) \} = V_n$; hence $f : X \to P\omega$ is continuous. Finally we must show that $f$ maps open subsets of $X$ to relatively open subsets of $f(X) \subseteq P\omega$. It is enough to show that the image of each $V_n$ is open. But in view of the last equation we can easily check that

$$f(V_n) = f(X) \cap \{ y \in P\omega \mid n \in y \},$$

which shows that $f(V_n)$ is open in $f(X)$.
The simplicity of the proof of Proposition 1.6 is nothing to be held against it: now that we know that $P\omega$ is rich in subspaces, we can study them in more detail. All ‘classical’ spaces are included, and such nonclassical ones as the continuous lattices [S] have been shown to be of some interest also. In this paper, however, we shall not continue this line of study since our main purpose is to establish the link with Recursion theory. Nevertheless, to grasp the significance of the model $P\omega$, we needed some background on continuous functions. In summary we can list our results indicating each with a short descriptive name:

Proposition 1.1, the Characterization Theorem;
Proposition 1.2, the Graph Theorem;
Proposition 1.3, the Substitution Theorem;
Proposition 1.4, the Fixed Point Theorem;
Proposition 1.5, the Extension Theorem;
Proposition 1.6, the Embedding Theorem.
By giving names to these propositions, we do not mean to claim any originality or depth for the results. What makes them pleasant is rather that they are easy to prove and together show the coherence and naturalness of the theory. But still all the discussion has been quite abstract in that no applications for the theory have been illustrated. We have restricted attention to one space $P\omega$, for the most part, but we have never said what the elements of $P\omega$ could be used for. This is like introducing real numbers as the completion of the rationals without ever mentioning geometry or measurement. It is mathematics, but is it life? Clearly not, and we must hasten to establish the inevitability and indispensability of the model by showing how to interpret the elements of $P\omega$ and how to do ‘algebra’ on them.

2. Computability and definability

Let $x \in P\omega$. What can $x$ represent? If $x = \emptyset$, there is not much $x$ can represent - without some artificial convention. But if $x = \{n\}$, a singleton, there is something - some quantity - for $x$ to represent: the number $n$ itself. This is so natural that we shall assume the identification as part of our mathematical background; that is, we suppose it to be a fact that for all $n \in \omega$,

$$n = \{n\}.$$

As a consequence, we can then write

$$\omega \subseteq P\omega,$$
because every integer is by assumption a set of integers. We note that this identification is at variance with the construction of the integers in certain systems of set theory where we may find that
The point here is that we do not care how the integers are constructed; we feel we know enough about them to take them for granted - or better: to take them as given axiomatically. Thus we are free to identify them with whatever sets we wish, since we are interested now in finding more ways of using them.

So much for the singletons; what about other sets? The answer will not be unique. In the first place every integer $n$ can be taken as a pair $n = (n_0, n_1)$ for suitable $n_0, n_1$. Therefore every set of integers can be construed as a relation between integers (a set of ordered pairs). We shall make full use of this familiar idea shortly. Note too the correspondence between finite sets and integers ($e_n$ corresponds to $n$); thus a set of integers could just as well represent a set of (finite) sets - each of which represents a (finite) set of (finite) sets. As we shall see, the relation between finite sets and single integers (which can be represented by sets) will play a central role.

The moral of this discussion is that even though a set is a set, it has many aspects or layers. We can X-ray it to find internal structure, and pictures taken from different directions will reveal different structures. Thus the same set can have many different meanings, and the meaning of the moment is only going to be clear from the context of use. Fortunately the result of this situation is richness and variety and not incoherence.

Before we get on with the X-raying, we should ask what can be seen on the surface. The answer is obvious: a set is made up of its elements. We can write:

$$x = \{ n \mid n \in x \} = \bigcup \{ n \mid n \in x \} = \bigcup_{n \in x} n.$$
In case $x$ is empty or a singleton, that is all there is to say (on the surface). Otherwise we may find

$$x = 0 \cup 1 \cup 2 \cup 666$$

in the finite case, or in an infinite case

$$x = 1 \cup 2 \cup 4 \cup 8 \cup 16 \cup 32 \cup \cdots \cup 2^n \cup \cdots.$$
Not just one but a multitude of integers. A set is in general a multiplicity - a multiple integer, if you like. (Caution: each integer that occurs does so only once, but many integers are allowed.) The idea can be put in another way. We are very used to thinking of single-valued number-theoretic functions

$$p : \omega \to \omega.$$

Since $\omega \subseteq P\omega$ according to our agreement above, we may also regard

$$p : \omega \to P\omega,$$

but $p$ is special because $p(n) \in \omega$ for all $n \in \omega$. What can be said for an arbitrary mapping

$$q : \omega \to P\omega,$$

where $q(n)$ need not be a singleton? Such a function is very conveniently regarded as a multi-valued function. Thus instead of writing

$$q(n) = m$$

we write

$$m \in q(n),$$

and there may very well be many such $m$ (even none). In this way partial functions are also included since we can read the equation $q(n) = \emptyset$ as saying that $q$ is undefined at $n$.

We are beginning to see a connection between ordinary functions and functions on $P\omega$, but there is still an imbalance. Namely, we have multi-valued functions (of an integer variable), but we have not yet connected these with multi-argumented functions (that is, functions $f : P\omega \to P\omega$). That is not quite true, actually, for remember that $\omega \subseteq P\omega$. Indeed, as
a subspace of $P\omega$, the set $\omega$ gets the proper discrete topology (all points are open). By the Extension Theorem, any continuous function can be continuously extended; and, since $\omega$ is discrete, arbitrary functions are continuous. The extension of $q : \omega \to P\omega$ to $\bar{q} : P\omega \to P\omega$ is not all that interesting, however; for by the formula of Proposition 1.5 we find that if $x \in P\omega$, then

$$\bar{q}(x) = \begin{cases} \bigcap \{ q(n) \mid n \in \omega \} & \text{if } x = \emptyset; \\ q(n) & \text{if } x = n \in \omega; \\ \omega & \text{otherwise.} \end{cases}$$
The function $\bar{q}$ is continuous but rather extreme in its behaviour. A better connection defines $\hat{q} : P\omega \to P\omega$ by the equation:

$$\hat{q}(x) = \bigcup \{ q(n) \mid n \in x \},$$

which is easily seen to be a continuous extension of $q$. (In fact, in a suitable sense $\hat{q}$ is the minimal extension while $\bar{q}$ is the maximal one.) It is also easy to see that a function $f : P\omega \to P\omega$ is of the form $\hat{q}$ for some $q$ iff $f$ satisfies the equations

$$f(\bigcup \{ x_n \mid n \in \omega \}) = \bigcup \{ f(x_n) \mid n \in \omega \},$$

$$f(\emptyset) = \emptyset,$$

for all systems where $x_n \in P\omega$ for all $n \in \omega$. Such functions we call (infinitely) distributive; they are a very special case of continuous function. (A continuous function which distributes over finite unions is infinitely distributive, by the way.)

In view of this discussion, then, we can feel free to concentrate on the arbitrary continuous $f : P\omega \to P\omega$, since these encompass ordinary (multi-valued) functions. Further we know what to look for to single out the ordinary functions. Clearly it is better to have arguments and values of the same type for the sake of symmetry, but is the added generality really significant? To answer the question, we must imagine how functions of this kind are to be computed.

Consider computing $f(n) \in P\omega$. The function is given its argument $n$ which causes it to start a process of generating all the $m \in f(n)$. An infinite set cannot be calculated at once: each element must be produced in turn. (And that does not mean in numerical order!) Of course, we are considering functions abstractly in extension, so we do
not record the steps of the process but just collect the results into a set $f(n)$. So much for values, now what about arguments? When we calculate $f(x)$ where $x \in P\omega$, we cannot say that $x$ is given all in one go: it must be generated. Again, the order of generation must not count, only the elements. As each $n \in x$ is brought forth, a part of the process for $f$ can be set in motion. And now we can see the difference between distributive $f$ and the more general continuous $f$. A distributive $f$ treats each $n$ independently because it is characterized as satisfying the equation

$$f(x) = \bigcup \{ f(n) \mid n \in x \}.$$
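The contrast between distributive and merely continuous functions can be made concrete on finite sets. This is an illustrative sketch only: `distributive_extension`, `divisors` and `needs_both` are hypothetical names, and `divisors` is just a convenient multi-valued function that happens to be undefined (empty) at 0.

```python
def distributive_extension(q):
    """Return q-hat, where q-hat(x) is the union of q(n) over n in x."""
    def q_hat(x):
        out = set()
        for n in x:
            out |= q(n)      # each element of x is treated independently
        return frozenset(out)
    return q_hat

def divisors(n):
    """A multi-valued integer function: all positive divisors of n."""
    return frozenset(d for d in range(1, n + 1) if n % d == 0)

def needs_both(x):
    """Monotone and finitely determined, but not distributive:
    it emits 0 only after both 1 and 2 have appeared in x."""
    return frozenset({0}) if {1, 2} <= set(x) else frozenset()

div_hat = distributive_extension(divisors)
print(sorted(div_hat(frozenset({4, 6}))))
```

For `needs_both`, the value on `{1, 2}` is not the union of the values on `{1}` and `{2}`, which is exactly what distributivity would require.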
A continuous $f$ on the other hand allows for cooperation between finitely many of the elements of its argument. Thus it can wait for a finite subset $e_n \subseteq x$ before it even gives out $m \in f(e_n) \subseteq f(x)$. The calculation - if it is to be effective - can only depend on finite portions of $x$, but the dependence must obey the monotonic law. Why? Because we never know that a particular integer is to be excluded from $x$ since we generate only what is in $x$. In case the process puts $m \in f(e_n)$, it must also put $m \in f(e_k)$ in case $e_n \subseteq e_k$; because $e_k \subseteq x$ might also be possible - it cannot be excluded in finite time, so to speak. This is why computable $f : P\omega \to P\omega$ are continuous, but so far we have only a few bare hints as to why continuous functions are interesting. To get the full impact of the idea we need examples, and the best examples have to go below the surface.

Recall the Graph Theorem (Proposition 1.2). Every continuous function could be recaptured from a set; in fact, if $u = \mathrm{graph}(f)$, then $f(x) = \mathrm{fun}(u)(x)$. Let us now take a closer look at the binary operation on sets that we are going to call application:

$$u(x) = \mathrm{fun}(u)(x).$$

We knew this was continuous in $x$, but it is also easy to prove it is continuous in $u$ (indeed it is distributive in $u$). With suitable choices of $u$ we obtain any desired continuous function, but with combinations involving application we can define new functions. A well worn example is composition: $u(v(x))$, where $u$, $v$ are fixed, and $x$ is the variable. A new and relatively bizarre example is self-application: $x(x)$, where $x$ is the variable. Note here how $x$ is being used in two different ways: first as a
graph (where we go below the surface) and then as an argument (where $x$ is taken at face value). This shows how the same object can be given different ‘meanings’. But this is not very odd: an integer has a different meaning according as the occurrence is in the numerator or denominator of a fraction, though we must admit that $x(x)$ is rather more difficult to understand than $n/n$.

Consider now any such combination $[\cdots x \cdots]$ which is continuous in the variable $x$. Mathematically this defines a function $f : P\omega \to P\omega$, where

$$f(x) = [\cdots x \cdots],$$

for all $x \in P\omega$. By Proposition 1.2 we can reduce this function to a set:

$$u = \mathrm{graph}(f),$$

where in terms of the newly defined operation of application we find:

$$u(x) = [\cdots x \cdots],$$

for all $x \in P\omega$. In this way we have made the mathematical notion (or at least: notation) of function redundant in the continuous case, since they can all be represented by sets. Going a step further, we introduce a nonalphabetical name for this set as follows:

$$\lambda x . [\cdots x \cdots] = \mathrm{graph}(f).$$

That is, the denotation of the $\lambda$-expression is the graph of the function defined by abstraction on the variable indicated. In outline, this is the graph model for the $\lambda$-calculus, and it is in terms of this notation that we will begin to see how interesting the continuous functions can be.

To make this more precise we introduce a formal language of terms (called LAMBDA) which consists of the expressions of the pure $\lambda$-calculus augmented by arithmetic primitives appropriate to our plan of constructing the model from sets of integers. Explanations follow, but it will be seen at once how the arithmetic primitives are distributive extensions of well-known point functions. Also to make the definition clearer, we write out in full the denotations of application and abstraction. One could imagine more primitives for the language, but we shall establish later a result that explains the scope of those chosen here.
DEFINITION. The syntax and semantics of the term language LAMBDA is indicated by these six equations:

$$0 = \{0\},$$
$$x + 1 = \{ n + 1 \mid n \in x \},$$
$$x - 1 = \{ n \mid n + 1 \in x \},$$
$$z \supset x, y = \{ n \in x \mid 0 \in z \} \cup \{ m \in y \mid \exists k .\ k + 1 \in z \},$$
$$u(x) = \{ m \mid \exists e_n \subseteq x .\ (n, m) \in u \},$$
$$\lambda x . \tau = \{ (n, m) \mid m \in \tau[e_n/x] \}.$$

The definition is somewhat informal, but it was thought that too much formality would make the plan too difficult to understand. On the left we find the LAMBDA-notation: there is one constant, two unary operations, one ternary operation, one binary operation, and finally one variable-binding operator. The notation is perfectly algebraic and these operations can be combined in any order to give compound terms in the familiar way. (In $\lambda x . \tau$ we had to be a bit more formal to allow $\tau$ to be an arbitrary compound term; in the other equations variables were sufficient to convey the ideas. Of course, in place of the ‘$x$’ we could use any other variable.) On the right we see the denotations of the terms. Strictly speaking we have a confusion here between the form of the variables and the denotation or value of the variables. Form on the left, value on the right; or if you like, object language on the left, metalanguage on the right. (There are standard ways to resolve the confusion, but since LAMBDA is such a simple language the extra complication would not be fruitful.)

Let us read the equations. The symbol 0 denotes the number 0 (remember: $\{0\} = 0$ by convention in $P\omega$). For sets $x$, the operation $x + 1$ adds one to all elements of the set. (There is then no ambiguity about $n + 1$ whether we think of $n \in \omega$ or $n \in P\omega$.) In the case of $x - 1$, we subtract one from all positive members of $x$. (Note: $0 - 1 = \emptyset$.) The fourth equation defines the meaning of the conditional expression. It
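will be clearer written out by cases:

$$z \supset x, y = \begin{cases} \emptyset & \text{if } z = \emptyset; \\ x & \text{if } z = 0; \\ y & \text{if } 0 \notin z \text{ but } k + 1 \in z \text{ for some } k; \\ x \cup y & \text{otherwise.} \end{cases}$$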
Finally, application and abstraction are defined as before, except we have been more formal about the $\tau$. On the right, $\tau[e_n/x]$ indicates that $x$ should be replaced by $e_n$ (or better: valued as $e_n$) throughout $\tau$. Since $x$ is a bound variable, it does not really occur on the right. By virtue of the definition, every LAMBDA-term has a denotation or value once the values of the free variables are given: the value is a function of the values of the variables. Functions (of several variables) defined in this way by LAMBDA-terms are obviously called LAMBDA-definable functions. Here is the first result.
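The three arithmetic primitives and the conditional can be tried out on finite sets. This is a minimal sketch, assuming sets of integers are modelled as Python `frozenset` values; the function names `suc`, `pred` and `cond` are chosen here for illustration.

```python
ZERO = frozenset({0})          # the number 0, identified with the set {0}

def suc(x):
    """x + 1: add one to every element of x."""
    return frozenset(n + 1 for n in x)

def pred(x):
    """x - 1: subtract one from every positive element of x."""
    return frozenset(n - 1 for n in x if n > 0)

def cond(z, x, y):
    """z > x, y: take x if 0 is in z, take y if some k + 1 is in z."""
    out = set()
    if 0 in z:
        out |= x
    if any(k > 0 for k in z):
        out |= y
    return frozenset(out)

a, b = frozenset({10, 11}), frozenset({20})
print(sorted(cond(frozenset({0, 3}), a, b)))
```

The four cases listed above correspond to `z` being empty, `z` equal to `ZERO`, `z` nonempty without 0, and `z` containing both 0 and a positive integer.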
THEOREM 2.1. All LAMBDA-definable functions are continuous.

PROOF. A function defined by a constant or a single variable is obviously continuous. It is left to the reader to check in detail that $x + 1$, $x - 1$, $z \supset x, y$, and $u(x)$ are continuous. For $\lambda$-abstraction we do a special case. Suppose $\tau(x, y)$ is continuous in $x$ and $y$ (this is an informal notation to display variables). What we must show is that $\lambda x . \tau(x, y)$ remains continuous in $y$. We calculate

$$\lambda x . \tau(x, y) = \{ (n, m) \mid m \in \tau(e_n, y) \}$$
$$= \{ (n, m) \mid \exists e_k \subseteq y .\ m \in \tau(e_n, e_k) \}$$
$$= \bigcup \{ \{ (n, m) \mid m \in \tau(e_n, e_k) \} \mid e_k \subseteq y \}$$
$$= \bigcup \{ \lambda x . \tau(x, e_k) \mid e_k \subseteq y \}.$$

(Note that we did not use the continuity in $x$ to prove it about $y$; the assumption on $x$ is used in Theorem 2.2.) Finally, we appeal to Proposition 1.3 to take care of compound terms formed by substitution.

The fundamental Graph Theorem (Proposition 1.2) can now be stated in LAMBDA-notation thus justifying the rules of conversion:
THEOREM 2.2. Axioms ($\alpha$), ($\beta$), ($\xi$) are all satisfied in the model. Stated more formally we have:

($\alpha$) $\lambda x . \tau = \lambda y . \tau[y/x]$, provided $y$ is not free in $\tau$.

($\beta$) $(\lambda x . \tau)(y) = \tau[y/x]$,

($\xi$) $\lambda x . \tau = \lambda x . \sigma$ iff $\forall x .\ \tau = \sigma$.
Actually, Proposition 1.2 (the Graph Theorem) is needed only for ($\beta$) and ($\xi$) because ($\alpha$) is obvious from the definition. A well-known consequence of the rules of conversion can be obtained in a sharpened form with the aid of Theorem 2.1.

THEOREM 2.3. If $f$ is a continuous function of $k$ variables, then there is a set $u \in P\omega$ such that

$$u(x_0)(x_1) \cdots (x_{k-1}) = f(x_0, x_1, \ldots, x_{k-1})$$

holds for all $x_0, x_1, \ldots, x_{k-1} \in P\omega$.

PROOF. In the proof of Theorem 2.1 we could have included $f$ as a new $k$-ary primitive. This would allow us to define:

$$u = \lambda x_0 . \lambda x_1 \cdots \lambda x_{k-1} .\ f(x_0, x_1, \ldots, x_{k-1}).$$

The theorem then results by applying ($\beta$) $k$ times.

In this way we show that functions of several variables can be reduced to functions of one variable with the help of iterated application. Another well-known consequence introduces the combinators.

THEOREM 2.4. Every LAMBDA-definable function can be defined by iterated application starting from variables and these six constants:

$$0 = 0,$$
$$\mathrm{suc} = \lambda x . x + 1,$$
$$\mathrm{pred} = \lambda x . x - 1,$$
$$\mathrm{cond} = \lambda x . \lambda y . \lambda z .\ z \supset x, y,$$
$$K = \lambda x . \lambda y . x,$$
$$S = \lambda u . \lambda v . \lambda x .\ u(x)(v(x)).$$
The proof can be taken over from [1], [2], or [4]. An improvement of the result special to this model is given below in Section 3. More interesting at the moment is the behaviour of other combinators.

DEFINITION. The paradoxical combinator is defined by the equation:

$$Y = \lambda u . (\lambda x . u(x(x)))(\lambda x . u(x(x))).$$
THEOREM 2.5. If $u$ is the graph of a continuous function $f$, then $Y(u)$ is the least fixed point of $f$.

PROOF. What we must prove is that

$$(\lambda x . u(x(x)))(\lambda x . u(x(x))) = \mathrm{fix}(f).$$

By way of short-hand, let

$$a = \mathrm{fix}(f), \qquad d = \lambda x . u(x(x)).$$

Calculate first by ($\beta$):

$$d(d) = u(d(d)) = f(d(d)).$$

Thus $d(d) = Y(u)$ is a fixed point of $f$, and so $a \subseteq d(d)$. Suppose for the sake of contradiction that $a \ne d(d)$. By continuity of application we note

$$d(d) = \bigcup \{ e_l(e_l) \mid e_l \subseteq d \}.$$

Let $l$ then be the least integer such that $e_l \subseteq d$ but $e_l(e_l) \not\subseteq a$. There must then be an integer $k \in e_l(e_l)$ where $k \notin a$. By definition of application there is an $e_n \subseteq e_l$ where $(n, k) \in e_l$. But then $e_n \subseteq d$ and $(n, k) \in d$. By definition of abstraction, $k \in u(e_n(e_n))$, and so

$$u(e_n(e_n)) \not\subseteq a.$$

Moreover,

$$e_n(e_n) \not\subseteq a,$$

since otherwise, by monotonicity, $u(e_n(e_n)) \subseteq u(a) = f(a) = a$. But $n \le (n, k)$ and $(n, k) < l$. Thus $n$ is a smaller integer which satisfies the conditions that gave us the choice of $l$ as the least such.

This is the last time we shall distinguish a continuous function from its graph. It is now clear that they are interchangeable. Every set $u$ can represent a function, because $u(x)$ is continuous in $x$. Those that are graphs satisfy

$$u = \lambda x . u(x),$$

a condition which is written out in more primitive terms in (iii) of Proposition 1.2. From now on $f$ is just another variable which we may single out stylistically when we are thinking of functions.

As a corollary of Theorem 2.5 we may remark that $Y(f) = \mathrm{fix}(f)$ is continuous in $f$, where we now regard $f$ as just an element of $P\omega$. This could have been proved directly from the definition of fix in Proposition 1.4 if we had given a topology to the function space. Instead we derive a topology from that of $P\omega$ by thinking of the function space as a subset of $P\omega$:

$$\{ u \mid u = \lambda x . u(x) \} \subseteq P\omega.$$

A complete analysis of the function-space topology can be given, but we shall not repeat it here. We do note that it is the topology of pointwise convergence, that it is a $T_0$-topology, and that the space is naturally partially ordered in a pointwise fashion. This can be expressed as a stronger form of extensionality:

$$\lambda x . \tau \subseteq \lambda x . \sigma \iff \forall x .\ \tau \subseteq \sigma.$$
Theorem 2.5 is far from being a curiosity as we shall make it the basis of our representation of recursion within LAMBDA. Of course, if it by chance had not turned out valid, we would have taken fix as a primitive since it is continuous and computable. Then $Y$ would have been just a curiosity. Before we turn to a precise definition of computability, however, we need to show how pairs, triples, and sequences are represented in the language. A group of specific definitions and lemmata are required to reach the goal.
DEFINITION. $\bot = (\lambda x . x(x))(\lambda x . x(x))$.

LEMMA 1. $\bot = \emptyset$.

PROOF. By definition, $\bot$ is the least fixed point of the identity function in view of Theorem 2.5 and the definition of $Y$. The least fixed point is obviously the empty set for which $\bot$ is a more dramatic name.

LEMMA 2. $0, 1 \in \lambda x . 0$.

PROOF. By definition of abstraction,

$$\lambda x . 0 = \{ (n, m) \mid m \in 0[e_n/x] \} = \{ (n, m) \mid m \in 0 \} = \{ (n, 0) \mid n \in \omega \}.$$

By virtue of our chosen coding of integer pairs we find that $0 = (0, 0)$ and $1 = (1, 0)$.
LEMMA 3. $x \cup y = (\lambda x . 0) \supset x, y$.

This last is an immediate corollary of Lemma 2 and shows that (binary) union of sets is LAMBDA-definable. It needs hardly to be mentioned that the separate integers $1 = 0 + 1$, $2 = 1 + 1$, $3 = 2 + 1$, etc., are also definable; hence all finite subsets are definable. Of course, $\lambda x . 0$ is an infinite set. What about other infinite sets?
DEFINITION. $\top = Y(\lambda f . \lambda x .\ x \cup f(x + 1))(0)$.

LEMMA 4. $\top = \omega$.

PROOF. Let $g$ be the indicated fixed point in the definition, so that $\top = g(0)$. To prove the result, we must prove something more general. Namely, that for all $m \in \omega$ we have

$$\forall x .\ x + m \subseteq g(x),$$

where we can define

$$x + m = \{ n + m \mid n \in x \}.$$

We argue by induction. For $m = 0$ we have

$$g(x) = x \cup g(x + 1) \supseteq x.$$

Assume the result for $m$. Specialize $x$ to $x + 1$, so that

$$(x + 1) + m \subseteq g(x + 1) \subseteq x \cup g(x + 1) = g(x).$$
Thus $x + (m + 1) \subseteq g(x)$ follows. We can then conclude that $m \in g(0)$ for all $m \in \omega$.

The trick just used of specializing a more general recursion can be used in many places. We prefer the more symmetrical (better: dual) notation of $\bot$, $\top$ to the conventional set-theoretical notation $\emptyset$, $\omega$, but this is not an important point. The elements $\bot$, $\top$ are (graphs of) functions, by the way; but as the next lemma shows, they play a rather special role.
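The recursion behind Lemma 4 can be watched unfolding by iterating on functions instead of on graphs. In this sketch, which is only an approximation of the construction, Python closures stand in for elements of the function space, and `approx(k)` (a hypothetical name) is the $k$-th iterate of the functional starting from the everywhere-empty function.

```python
def shift(x):
    """x + 1 on finite sets of integers."""
    return frozenset(n + 1 for n in x)

def step(f):
    """One unfolding of the functional f |-> (x |-> x U f(x + 1))."""
    return lambda x: frozenset(x) | f(shift(x))

def approx(k):
    """The k-th approximation to the least fixed point g."""
    g = lambda x: frozenset()      # the bottom function: always empty
    for _ in range(k):
        g = step(g)
    return g

ZERO = frozenset({0})
print(sorted(approx(5)(ZERO)))
```

Each further unfolding contributes one more integer to the value at 0, so the union of all approximations at 0 contains every integer, in line with the lemma.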
LEMMA 5. The only fixed points of $K$ are $\bot$ and $\top$.

PROOF. We must show, $a = \lambda x . a$ iff $a = \bot$ or $a = \top$. In one direction we calculate

$$\lambda x . \bot = \{ (n, m) \mid m \in \bot \} = \bot,$$

$$\lambda x . \top = \{ (n, m) \mid m \in \top \} = \top.$$

In the other direction, suppose $a = \lambda x . a$ but $a \ne \bot$. We note that

$$a = \{ (n, m) \mid m \in a \}.$$

This means that first

$$m \in a \text{ always implies } (n, m) \in a,$$

and secondly,

$$k \in a \text{ always implies } k = (n, m) \text{ for some } m \in a.$$

Let $k$ be the least element of $a$. By the second implication, $k = (n, m)$ and $m \le k$. Since $m \in a$, it follows that $m = k$. But this implies $k = 0$ by the nature of our integer pairing function; thus, $0 \in a$. By the first implication, $(n, 0) \in a$ for all $n$. In particular, $1 = (1, 0) \in a$; therefore, $(n, 1) \in a$ for all $n$. In particular, $2 = (0, 1) \in a$; therefore, $(n, 2) \in a$ for all $n$. Backtracking, $3 = (2, 0) \in a$; therefore, $(n, 3) \in a$ for all $n$. Continuing in this way, all pairs belong to $a$, and so $a = \top$.

We now turn to the definitions of tuples and sequences.
DEFINITION.

$$\langle\rangle = \bot,$$

$$\langle x_0, x_1, \ldots, x_k \rangle = \lambda z .\ z \supset x_0, \langle x_1, \ldots, x_k \rangle (z - 1).$$
LEMMA 6.

$$\langle x_0, x_1, \ldots, x_{k-1} \rangle (n) = \begin{cases} x_n & \text{if } n < k; \\ \bot & \text{otherwise.} \end{cases}$$

PROOF. The proof is by a double induction: first on $k$, and then on $n$. The case $k = 0$ is clear in view of our definition of the empty or 0-tuple. Assume the result for $k$ (and all $n$). We now prove it for $k + 1$ by induction on $n$. In case $n = 0$, it is clear from the definition. Assume it for $n$ and do $n + 1$. By the definition this will fall back on what we assumed about $k$.

The definition of ordered tuples was arranged for uniformity. Note such convenient facts as

$$\langle x_0, \ldots, x_{k-1} \rangle = \langle x_0, \ldots, x_{k-1}, \bot \rangle,$$

so every $k$-tuple is at the same time a $(k + 1)$-tuple. Also,

The point of Lemma 6 is that a $k$-tuple is a function such that application to $n$ gives the $n$th coordinate. This is so convenient that we give the subscript notation an official definition.

DEFINITION. $u_x = u(x)$.

But we shall usually only employ this when $x \in \omega$. If the reader does not enjoy the confusion between 2-tuples and 3-tuples, he can use $\langle 2, x, y \rangle$ and $\langle 3, x, y, z \rangle$, respectively; and similarly in other dimensions. Caution: Not every element of $P\omega$ is an ordered couple in the sense of the definition. If the full homeomorphism between $P\omega \times P\omega$ and $P\omega$ is desired, we must use some function as

$$[x, y] = \{ 2n \mid n \in x \} \cup \{ 2m + 1 \mid m \in y \}.$$
This is not a LAMBDA definition, but one will be seen to be possible after we discuss primitive recursion. Another caution: Do not confuse $\langle x, y \rangle$ and $(n, m)$, since the latter is but an integer function. If we wanted to extend this to $P\omega$, the natural definition would read:

$$(x, y) = \{ (n, m) \mid n \in x,\ m \in y \}.$$

This is more like a Cartesian product and is not satisfactory as a pairing function on all of $P\omega$, because $(\bot, y) = (x, \bot) = \bot$ for all $x, y \in P\omega$. Turning now to infinite sequences, we define the sequential combinator.
DEFINITION. $\$ = Y(\lambda s.\lambda u.\lambda z.\; z \supset u_0,\ s(\lambda t.\, u(t + 1))(z - 1))$.
This generalizes the idea of our definition of tuples, though its formal expression may hide its significance.
LEMMA 7. For all $u \in P\omega$ we have:
$\$(u) = \$(\$(u)) = \lambda z.\, \bigcup \{u_n \mid n \in z\}$,
and so the range of $\$$ consists exactly of the distributive functions.

PROOF (outline). By definition, $\$$ is the least function satisfying the equation:
$\$(u)(z) = z \supset u_0,\ \$(\lambda t.\, u(t + 1))(z - 1)$.
By induction on $n$ one shows first that
$\forall u\, \forall z\, [n \in z \Rightarrow u_n \subseteq \$(u)(z)]$.
This establishes that
$\bigcup \{u_n \mid n \in z\} \subseteq \$(u)(z)$
and so
$\lambda z.\, \bigcup \{u_n \mid n \in z\} \subseteq \$(u)$.
For the other inclusion, define
$s = \lambda u.\lambda z.\, \bigcup \{u_n \mid n \in z\}$,
which is reasonable since the expression is continuous in $u$ and $z$. Then check that $s$ satisfies the fixed-point equation for $\$$. But then $\$ \subseteq s$ as desired.
The proof that $\$(u) = \$(\$(u))$ rests on the fact that $\$(u)_n = u_n$ for all $n \in \omega$. Finally, note that the range of $\$$ is the same as the set of fixed points of $\$$. Now the equation
$u = \lambda z.\, \bigcup \{u_n \mid n \in z\}$,
as we have noted before, exactly characterizes the distributive functions.
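The behaviour of the sequential combinator on finite sets can be illustrated by a small sketch. It rests on assumptions that go beyond the text here: sets of integers are modelled as Python `frozenset`s, the integer `n` is read as the singleton `{n}` when it is used as an argument, and the names `dollar` and `sums` are hypothetical.

```python
def dollar(u):
    """Sketch of the map z |-> union of u({n}) for n in z."""
    def distributive(z):
        out = set()
        for n in z:
            out |= u(frozenset({n}))  # u_n, with n read as the singleton {n}
        return frozenset(out)
    return distributive


def sums(z):
    """A monotone map that is not distributive: all pairwise sums from z."""
    return frozenset(a + b for a in z for b in z)


z = frozenset({1, 2})
once = dollar(sums)(z)
twice = dollar(dollar(sums))(z)
```

Here `sums(z)` contains the mixed sum 3, whereas `dollar(sums)(z)` only collects the values on singletons; applying `dollar` a second time changes nothing, which mirrors the idempotence stated in Lemma 7.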
DEFINITION. $\lambda n \in \omega.\, \tau = \$(\lambda z.\, \tau[z/n])$.

The point of this definition of $\lambda$-abstraction relativized to integers is that it is LAMBDA-definable, and it produces the expected distributive function that is the way we represent functions of an ordinary integer variable in $P\omega$. This has been a long sequence of lemmas, but the result allows us to transcribe all of recursion theory on integers over into our LAMBDA-notation in a wholesale way. In particular, we can now easily do all primitive recursions.

LEMMA 8. If $p : \omega \to \omega$ is a primitive recursive function, then the corresponding distributive function $\hat{p} : P\omega \to P\omega$ is LAMBDA-definable.
PROOF (outline). Again to establish this specific result (about functions of one variable) we have to prove something more general (about functions of several variables). Now the primitive recursive functions are generated from the zero, successor, and identity functions by substitution and the scheme of primitive recursion. Only the last will give us any trouble. Let us take an example. Suppose
$p(0) = k$,
$p(n + 1) = q(n, p(n))$,
where $q$ is a function of two variables that we already know about. That is,
$\hat{q} = \lambda x.\lambda y.\, \{q(n, m) \mid n \in x,\ m \in y\}$
is LAMBDA-definable. The trick with $\hat{p}$ is to note that
$\hat{p} = Y(\lambda f.\, \lambda n \in \omega.\; n \supset k,\ \hat{q}(n - 1)(f(n - 1)))$,
or in less formal terms, $\hat{p}$ is the least solution of the equation:
$\hat{p} = \lambda n \in \omega.\; n \supset k,\ \hat{q}(n - 1)(\hat{p}(n - 1))$.
This will work with more variables also, and it shows why $\hat{p}$ is LAMBDA-definable - and the LAMBDA definition can be written down directly from its recursive definition.

Having encompassed this part of recursion theory, we must ask a more expansive question: when is an arbitrary continuous function to be called computable?
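The passage from a primitive recursive definition to a fixed-point equation can be mimicked on ordinary integers. The following is a minimal sketch, not the LAMBDA definition itself: `fix` is a hypothetical stand-in for the fixed-point combinator, and the conditional "if the argument is 0 take $k$, otherwise take the second branch" is written as a Python conditional expression.

```python
def fix(F):
    """Fixed point of F, unfolded lazily one call at a time."""
    return lambda n: F(fix(F))(n)


def prim_rec(k, q):
    """Solve p(0) = k, p(n + 1) = q(n, p(n)) as a fixed point."""
    return fix(lambda f: lambda n: k if n == 0 else q(n - 1, f(n - 1)))


# Example: p(0) = 1, p(n + 1) = (n + 1) * p(n), i.e. the factorial function.
factorial = prim_rec(1, lambda n, m: (n + 1) * m)
values = [factorial(n) for n in range(6)]
```

The functional passed to `fix` is copied directly from the recursion equations, which is the sense in which the definition can be written down from the recursive one.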
DEFINITION. A continuous function $f$ of $k$ variables is computable iff the relationship
$m \in f(e_{n_0}, e_{n_1}, \ldots, e_{n_{k-1}})$
is recursively enumerable in the integer variables $m, n_0, n_1, \ldots, n_{k-1}$.

The definition is chosen on the one hand to be a direct generalization of that of partial recursive function. For as we know a partial function $p : \omega \to \omega \cup \{\bot\}$ is partial recursive iff the relationship
$m = p(n)$
is r.e. (recursively enumerable) in $n$ and $m$. Passing now to the distributive function $\hat{p} : P\omega \to P\omega$, we note that:
$m \in \hat{p}(e_k)$ iff $\exists n \in e_k.\ m = p(n)$,
and so this relationship is r.e. iff $p$ is partial recursive.

On the other hand, the definition is naturally motivated. How to compute $y = f(x)$? If $x$ and $y$ are infinite, we can do it only little by little. Thus start generating the $e_n \subseteq x$. As you find these (better: as they are given you), hand them over to $f$. The function starts a process of generating the elements $m \in f(e_n)$. We are saying that this process is effective iff the generation is r.e. in the usual finitary sense. In this way computations with infinite objects are reduced back to computations with finite objects (numbers). The definition of the computability of $f$ is thus not an independent one, because it is given in terms of the already understood notion r.e. However, in Theorem 2.6 we shall see that a direct, nonreductive definition is at hand. This is interesting when we recall the point of making our generalization (which is not just to have a model for
the $\lambda$-calculus). In ordinary recursion theory, when we write $m = p(n)$ there is a distinction in type between $n$, $m$ and $p$. The first are finitary integers, whereas the second is a function, an infinite object. In the present theory, when we write $y = f(x)$, there is no distinction in type between $x$, $y$ and $f$: they are all sets of integers. Of course, there can be a distinction in kind: some sets are finite, others are r.e., others are neither. It is very similar to the situation with real numbers: some are rational, others algebraic, others transcendental. But it is not for the sake of this analogy that we pursue the generalization. Rather it is the realization that the ideas of recursion theory apply just as well to computations on functions as on integers. The distinction between finite and infinite is not as important as getting at the idea of a computable process. These computation procedures can just as well take other procedures as arguments as take integers. Our aim then is to show that the unified, 'type-free' theory is not only possible but better.
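The idea that a function on sets is itself a set, determined by its values on finite sets, can be sketched on a finite fragment. The sketch makes several assumptions that are not stated in this part of the text: the finite set $e_n$ is taken to consist of the positions of the 1-bits of $n$, pairs are kept as Python tuples instead of being coded as integers, and only indices below a fixed bound are considered. The names `e`, `graph` and `apply_graph` are hypothetical.

```python
def e(n):
    """Assumed coding of finite sets: positions of the 1-bits of n."""
    return frozenset(i for i in range(n.bit_length()) if (n >> i) & 1)


def graph(f, bound):
    """Finite part of {(n, m) | m in f(e_n)} for indices n < bound."""
    return {(n, m) for n in range(bound) for m in f(e(n))}


def apply_graph(g, x):
    """Application: collect m whenever (n, m) is in g for some e_n inside x."""
    return frozenset(m for (n, m) in g if e(n) <= x)


def successor_image(s):
    """A distributive example: the image of s under n |-> n + 1."""
    return frozenset(n + 1 for n in s)


g = graph(successor_image, 16)
result = apply_graph(g, frozenset({0, 2, 3}))
```

Feeding finite pieces of the argument to the graph recovers the value of the function, which is the "little by little" computation described in the text, restricted here to a finite bound.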
THEOREM 2.6. Let $f$ be a continuous function of $k$ variables. The following three conditions are equivalent:
(i) $f$ is computable;
(ii) $\lambda x_0 \lambda x_1 \cdots \lambda x_{k-1}.\, f(x_0, x_1, \ldots, x_{k-1})$ as a set is r.e.;
(iii) $\lambda x_0 \lambda x_1 \cdots \lambda x_{k-1}.\, f(x_0, x_1, \ldots, x_{k-1})$ is LAMBDA-definable.
PROOF. For the equivalence of (i) and (ii), we take a special case of functions of two variables. By definition,
$\lambda x.\lambda y.\, f(x, y) = \lambda x.\, \{(k, m) \mid m \in f(x, e_k)\}$
$= \{(n, m') \mid m' \in \{(k, m) \mid m \in f(e_n, e_k)\}\}$
$= \{(n, (k, m)) \mid m \in f(e_n, e_k)\}$.
The equivalence is now obvious. For the equivalence of (ii) and (iii) we have to prove two things: LAMBDA-definable functions are computable (actually that amounts to: (iii) implies (i)), and every r.e. set is LAMBDA-definable. For the first fact (say in the case of two variables), consider the logical form of the predicate
$m \in \tau[e_n/x, e_k/y]$,
where $\tau$ is a LAMBDA-term. By the definition of the semantics of our language, this can be written as an arithmetic predicate involving such recursive predicates as $j = i + 1$, $j \in e_i$, $k = (i, j)$, conjunction and disjunction, existential number quantifiers, and bounded universal quantifiers $(\forall j \in e_i)$. Thus the predicate is r.e.

To prove the second fact, suppose $a$ is r.e. If $a = \bot$, we use Lemma 1. If $a$ is nonempty, there is a primitive recursive function $p : \omega \to \omega$ such that $a = \{p(n) \mid n \in \omega\}$. But this is the same as:
$a = \hat{p}(\top)$,
and by Lemmas 4 and 8, we see that $a$ is LAMBDA-definable.

Let $\mathrm{RE} \subseteq P\omega$ be the class of r.e. sets. Of course, the finite sets $e_n \in \mathrm{RE}$. An interesting corollary to Theorem 2.6 is the fact that the countable
class RE forms a model for the $\lambda$-calculus. Indeed, RE is the least class containing the combinators of Theorem 2.4 and closed under application, and any such class satisfies $(\alpha)$, $(\beta)$ and $(\xi^*)$. The first two are obvious. The implication from left to right in $(\xi^*)$ is also obvious. Suppose then that
$\lambda x.\, \tau(x) \not\subseteq \lambda x.\, \sigma(x)$.
There is then some set $x \in P\omega$ such that $\tau(x) \not\subseteq \sigma(x)$, but this $x$ need not belong to the subclass. But in any case, there will be some $m \in \omega$, where $m \in \tau(x)$, but $m \notin \sigma(x)$. By continuity, $m \in \tau(e_n)$, where $e_n \subseteq x$. By monotonicity, $m \notin \sigma(e_n)$, and $e_n$ does belong to the subclass. In Section 3 we shall note how many closed classes there are.

This completes the foundations of our recursion theory. In summary we list short descriptive names for the results of this section:
Theorem 2.1, the Continuity Theorem;
Theorem 2.2, the Conversion Theorem;
Theorem 2.3, the Reduction Theorem;
Theorem 2.4, the Combinator Theorem;
Theorem 2.5, the Recursion Theorem (or: Fixed-Point Theorem);
Theorem 2.6, the Definability Theorem.

Needless to say, along the way we have noted many related and auxiliary results too numerous to name. In the next section we discuss some standard applications of recursion theory in the new context to give more weight to the argument that $\lambda$-calculus and recursion theory make a good combination.
3. Enumeration and degrees
To be able to enumerate things we shall have to introduce Gödel numbers. A system of numbering can sometimes be rather complex, but in our theory the algebraic notation of the combinators makes everything easy. In fact, all we need is one combinator and iterated application. The proof is facilitated by a lemma on ordered pairs.

LEMMA 1. $\mathrm{cond}(x)(y)(\mathrm{cond}(x)(y)) = y$.
PROOF. In case $x = y = \bot$, we note that $\mathrm{cond}(\bot)(\bot) = \bot$ and $\bot(\bot) = \bot$, so the equation checks in this case. Recall
$\mathrm{cond}(x)(y) = \lambda z.\; z \supset x,\ y$.
Thus $0 \notin \mathrm{cond}(x)(y)$, because $0 = (n, m)$ iff $n = 0 = m$. Furthermore, $e_0 = \bot$ and $\bot \supset x, y = \bot$; but $0 \notin \bot$. Also
$\mathrm{cond}(x)(y)(0) = x$ and $\mathrm{cond}(x)(y)(1) = y$,
so if either $x \ne \bot$ or $y \ne \bot$, then $\mathrm{cond}(x)(y) \ne \bot$. In this case, $\mathrm{cond}(x)(y)$ must contain positive elements. The result now follows.

THEOREM 3.1. There is a single combinator $G$ such that all other LAMBDA-definable elements can be obtained from it by iterated application.
PROOF. The combinator $G = \mathrm{cond}(a)(0)$, where $a$ is yet to be determined. By Lemma 1 we see
$G(G) = 0$, $G(G(G)) = a$.
So let
$a = \langle \mathrm{suc}, \mathrm{pred}, \mathrm{cond}, K, S \rangle$
and all the combinators of Theorem 2.4 can be generated.

The advantage of having a single generator and the one binary operation of application is that the recursion equations for the effective enumeration of all LAMBDA-definable elements are particularly simple.
THEOREM 3.2. There are combinators val and apply such that for all $n, m \in \omega$,
(i) $\mathrm{val}(0) = G$;
(ii) $\mathrm{apply}(n)(m) \in \omega$;
(iii) $\mathrm{val}(\mathrm{apply}(n)(m)) = \mathrm{val}(n)(\mathrm{val}(m))$.
Therefore val enumerates all LAMBDA-definable elements; that is,
(iv) $\mathrm{RE} = \{\mathrm{val}(n) \mid n \in \omega\}$.
PROOF. The pairing function $(n, m)$ is primitive recursive; and, by the method of Lemma 8 of Section 2, it is represented by a combinator. Just which is unimportant. We let
$\mathrm{apply} = \lambda n \in \omega.\, \lambda m \in \omega.\, (n + 1, m)$.
We also need the combinators corresponding to the primitive recursive functions fst and snd, where
$\mathrm{fst}((n, m)) = n$, $\mathrm{snd}((n, m)) = m$.
By the fixed-point theorem we define val so that
$\mathrm{val} = \lambda k \in \omega.\; \mathrm{fst}(k) \supset G,\ \mathrm{val}(\mathrm{fst}(k) - 1)(\mathrm{val}(\mathrm{snd}(k)))$.
Properties (i)-(iii) are immediate.

We can apply Theorem 3.2 in a standard way to get Kleene's Second Recursion Theorem. First we need one other combinator.
LEMMA 2. There is a combinator num such that for all $n \in \omega$ we have
(i) $\mathrm{num}(n) \in \omega$,
(ii) $\mathrm{val}(\mathrm{num}(n)) = n$.

PROOF. We must refer back to the proof of Theorem 3.1. Since $(1, 0) = 1$, we have
$\mathrm{val}(1) = 0$.
Then, since $(1, 1) = 4$, we have
$\mathrm{val}(4) = a$.
Next, $(5, 1) = 22$, so
$\mathrm{val}(22) = \mathrm{suc}$.
Now we can define the combinator num by the fixed-point theorem so that
$\mathrm{num} = \lambda n \in \omega.\; n \supset 1,\ \mathrm{apply}(22)(\mathrm{num}(n - 1))$.
Properties (i) and (ii) are then proved by induction.
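The integer bookkeeping behind val, apply and num can be checked with a short sketch. Two things in it are assumptions rather than part of the text: the pairing formula, chosen because it reproduces the values $(1,0) = 1$, $(1,1) = 4$ and $(5,1) = 22$ quoted above, and the symbolic treatment of val, which builds application terms and simplifies them using only the facts $G(G) = 0$, $G(0) = a$, $a(0) = \mathrm{suc}$ and $\mathrm{suc}(n) = n + 1$. The helper names `unpair` and `evaluate` are hypothetical.

```python
from math import isqrt


def pair(n, m):
    """Assumed pairing function on integers."""
    return (n + m) * (n + m + 1) // 2 + m


def unpair(k):
    """Inverse of pair: returns (fst(k), snd(k))."""
    w = (isqrt(8 * k + 1) - 1) // 2
    m = k - w * (w + 1) // 2
    return w - m, m


def apply(n, m):
    return pair(n + 1, m)


def num(n):
    return 1 if n == 0 else apply(22, num(n - 1))


def val(k):
    """Symbolic val: the generator G or an application node."""
    n, m = unpair(k)
    if n == 0:
        return "G"
    return ("app", val(n - 1), val(m))


def evaluate(term):
    """Simplify a term using only the facts listed in the lead-in."""
    if not isinstance(term, tuple):
        return term
    f, x = evaluate(term[1]), evaluate(term[2])
    if f == "G" and x == "G":
        return 0
    if f == "G" and x == 0:
        return "a"
    if f == "a" and x == 0:
        return "suc"
    if f == "suc" and isinstance(x, int):
        return x + 1
    raise ValueError("no rule for this application in the sketch")


numerals = [evaluate(val(num(n))) for n in range(4)]
```

The list `numerals` recovers the integers from their codes, in line with property (ii) of Lemma 2, and `val(apply(n, m))` is by construction the application node of `val(n)` and `val(m)`.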
THEOREM 3.3. Given any LAMBDA-definable element $u \in P\omega$, we can effectively find an integer $n \in \omega$ such that
$\mathrm{val}(n) = u(n)$.

PROOF. Since $u$ is definable, so is
$w = \lambda m \in \omega.\, u(\mathrm{apply}(m)(\mathrm{num}(m)))$.
By looking at the definition, we can find an integer $k \in \omega$, where $\mathrm{val}(k) = w$, by virtue of Theorem 3.2. Let
$n = \mathrm{apply}(k)(\mathrm{num}(k))$,
and then calculate:
$\mathrm{val}(n) = \mathrm{val}(k)(\mathrm{val}(\mathrm{num}(k))) = w(k) = u(\mathrm{apply}(k)(\mathrm{num}(k))) = u(n)$,
as desired.

The effective character of the proof of Theorem 3.3 can be embodied in a combinator rec (which we leave to the reader to define explicitly) which is analogous to $Y$, except that it is an integer function. The properties of rec are that $\mathrm{rec}(k) \in \omega$ for all $k \in \omega$, and that
$\mathrm{val}(\mathrm{rec}(k)) = \mathrm{val}(k)(\mathrm{rec}(k))$.
Theorem 3.3 has many applications (cf. [7, §§ 11.2-11.8]). We give a few hints. By specializing $u$, we can of course find $n$ such that
$\mathrm{val}(n) = n$,
or
$\mathrm{val}(n) = n + 100$.
The point being that the enumeration by val is exceptionally repetitious and stumbles all over itself countless times. It is a bit more difficult to get an effective infinite sequence of hits. For example, can we have a recursive function $r$, where for all $i \in \omega$ we have $r(i) \in \omega$ and
$\mathrm{val}(r(i)) = r(i + 1)$,
where, say, $r(i) < r(i + 1)$? Let us make a guess that
$r = \lambda i \in \omega.\, \mathrm{apply}(k)(\mathrm{num}(i))$
for some choice of $k$, and see what comes out. If this were correct, then for all $i \in \omega$,
$\mathrm{val}(r(i)) = \mathrm{val}(k)(i)$.
So, we should like
$\mathrm{val}(k) = \lambda i \in \omega.\, \mathrm{apply}(k)(\mathrm{num}(i + 1))$.
Does such an integer $k$ exist? Yes, by Theorem 3.3. (This proof seems to be rather more understandable than that of [7, p. 186].)

The apply combinator is very handy. For example, if $u \in \mathrm{RE}$, we can at once find a primitive recursive function $s$ such that for all $m \in \omega$
$\mathrm{val}(s(m)) = u(m)$.
Because, let $u = \mathrm{val}(k)$, then what we want is
$s = \lambda m \in \omega.\, \mathrm{apply}(k)(\mathrm{num}(m))$.
The guiding idea behind these arguments is that of Gödel numbering; but since our syntax is so simple, the use of ordered pairs of integers is sufficient - and easy. As another application of Theorem 3.2, we give a version of the incompleteness theorem.
THEOREM 3.4. The relationship
$\mathrm{val}(n) = \mathrm{val}(m)$
is not r.e. in $n, m \in \omega$. Therefore, there can be no formal system with effectively given rules and axioms that will generate all true equations $\tau = \sigma$ between LAMBDA-terms.

In fact, it can be shown that the set
$b = \{n \in \omega \mid \mathrm{val}(n) = \bot\}$
is not r.e. The proof can be given along standard lines using the usual trick of diagonalization. We do not include the details here.

We turn now to a short discussion of degrees of sets in $P\omega$. First some definitions.

DEFINITION. A subclass $A \subseteq P\omega$ is a subalgebra iff $\mathrm{RE} \subseteq A$ and $A$ is closed under application.

DEFINITION. For $x, y \in P\omega$, we write
$\mathrm{Deg}(x) \le \mathrm{Deg}(y)$
to mean that for some $u \in \mathrm{RE}$
$x = u(y)$.

In other words, in this last definition we want $x$ to be computable from $y$. These are exactly the enumeration degrees of Rogers [7, pp. 146-147]. Note that
$\{x \mid \mathrm{Deg}(x) \le \mathrm{Deg}(y)\} = \{u(y) \mid u \in \mathrm{RE}\}$;
hence, we could simply define:
$\mathrm{Deg}(y) = \{u(y) \mid u \in \mathrm{RE}\}$,
and the ordering of degrees above would be simple class inclusion. We suppose this is done. (Caution: $\mathrm{Deg}(y)$ consists of all objects reducible to $y$, not just those equivalent to, or of the same degree as, $y$.) Degrees and subalgebras are very closely related.
THEOREM 3.5. The (enumeration) degrees are exactly the finitely generated subalgebras of $P\omega$ and every one can be generated by a single element.

PROOF. First let $A = \mathrm{Deg}(y)$. We want to show that it is a subalgebra. Let $u \in \mathrm{RE}$, then $K(u) \in \mathrm{RE}$ also. But then
$u = K(u)(y) \in A$.
So $\mathrm{RE} \subseteq A$. Next consider two elements of $A$. They are of the form $u(y)$ and $v(y)$ where $u, v \in \mathrm{RE}$. But then
$u(y)(v(y)) = S(u)(v)(y)$
is of the same form; hence, $A$ is closed under application. But clearly $G, y \in A$, and they generate $A$ as a subalgebra; therefore $A$ is finitely generated.

In the other direction, let $A$ be any finitely generated subalgebra. Suppose the generators are $z_0, z_1, \ldots, z_{k-1}$. Using the same trick as in the proof of Theorem 3.1, let
By a similar argument, $y$ generates $A$ and indeed
$A = \mathrm{Deg}(y)$,
as desired.
This theorem seems to explain something about degrees. They form a semilattice with the join operation characterized by
$\mathrm{Deg}(x) \sqcup \mathrm{Deg}(y) = \mathrm{Deg}(\langle x, y \rangle)$.
This is just the semilattice of finitely generated subalgebras of $P\omega$ which is a part of the complete lattice of all subalgebras. Since $P\omega$ is uncountable and each finitely generated subalgebra is countable, it is trivial to prove there exists a chain of degrees:
$A_0 \subseteq A_1 \subseteq \cdots \subseteq A_n \subseteq \cdots$
such that the union is not finitely generated. However, if we let
$A_n = \mathrm{Deg}(x_n)$,
then
$\bigcup_{n \in \omega} A_n \subseteq \mathrm{Deg}(\lambda n \in \omega.\, x_n)$.
Otherwise put: any countable subalgebra is contained in a finitely generated one. The proof that the intersection of two finitely generated subalgebras need not be finitely generated is rather more complicated. Not much else seems to be known about the structure of enumeration degrees. (Rogers in [7, pp. 151-153] relates enumeration degrees to Turing degrees.) Perhaps one should study the lattice of subalgebras of $P\omega$ as the completion of the semilattice of degrees and try to find out the structure of the lattice.

Our last application of the techniques of enumeration concerns the so-called effective operations and the Myhill-Shepherdson Theorem ([5, p. 313] or [7, p. 359]). Suppose $q = \mathrm{val}(k)$ and consider the function
$p = \lambda n \in \omega.\, \mathrm{apply}(k)(n)$.
This function has the following extensionality property:
(*) $\mathrm{val}(n) = \mathrm{val}(m)$ always implies $\mathrm{val}(p(n)) = \mathrm{val}(p(m))$,
because by construction we have for all $n \in \omega$:
$\mathrm{val}(p(n)) = q(\mathrm{val}(n))$.
Can there be other such extensional mappings on Gödel numbers? No, not if they are recursive.

THEOREM 3.6. Suppose that $p(n) \in \omega$ for all $n \in \omega$ and that $p$ satisfies (*). If $p \in \mathrm{RE}$, then there is a function $q \in \mathrm{RE}$ such that
$\mathrm{val}(p(n)) = q(\mathrm{val}(n))$
for all $n \in \omega$.
PROOF. It is boring, but there is no difficulty, in finding a combinator fin such that:
(i) $\mathrm{fin}(n) \in \omega$,
(ii) $\mathrm{val}(\mathrm{fin}(n)) = e_n$,
for all $n \in \omega$. We can then define $q$ by analogy with our definition of $\lambda$-abstraction:
$q = \{(n, m) \mid m \in \mathrm{val}(p(\mathrm{fin}(n)))\}$.
Clearly, if $p \in \mathrm{RE}$, then so is $q$. This is obviously what we want if $p$ and $q$ are related by the equation of the theorem, but the equation remains to be proved. The desired connection is a consequence of the fact that any effective mapping satisfying (*) is necessarily 'continuous' in the sense of:
(**) $\mathrm{val}(p(n)) = \bigcup \{\mathrm{val}(p(\mathrm{fin}(j))) \mid e_j \subseteq \mathrm{val}(n)\}$,
which is to hold for all $n \in \omega$.

Suppose by way of contradiction that (**) does not hold, for a particular $n$. There are two cases. Suppose first that we have $k \in \mathrm{val}(p(n))$, but for all $j \in \omega$ with $e_j \subseteq \mathrm{val}(n)$ we have
$k \notin \mathrm{val}(p(\mathrm{fin}(j)))$.
The trick here is that the single integer $k$ forces a distinction between finite and infinite in a way that is too effective to be true. Let $r$ be a (primitive) recursive function whose range is not recursive. Thus,
$\{m \mid m \notin \{r(i) \mid i \in \omega\}\} \notin \mathrm{RE}$.
Note that the relationship
$j \in \mathrm{val}(n) \wedge m \notin \{r(i) \mid i < j\}$
is r.e. in the variables $m$ and $j$. This means we can define a (primitive) recursive function $s$ such that
$\mathrm{val}(s(m)) = \{j \in \mathrm{val}(n) \mid m \notin \{r(i) \mid i < j\}\}$
for all $m \in \omega$. Note that $\mathrm{val}(s(m))$ is always a subset of the infinite set $\mathrm{val}(n)$; further it is finite if $m$ is in the range of $r$, otherwise it is equal to $\mathrm{val}(n)$. By virtue of (*) and what we assumed about $k$ above, we can conclude:
$k \in \mathrm{val}(p(s(m)))$ iff $m \notin \{r(i) \mid i \in \omega\}$.
But this is impossible, because the predicate on the left is r.e. in $m$, and that on the right is not.

That was the first case; suppose next that instead we had $e_j \subseteq \mathrm{val}(n)$, and for some $k \in \omega$
$k \in \mathrm{val}(p(\mathrm{fin}(j)))$ but $k \notin \mathrm{val}(p(n))$.
Again, we see an integer $k$ forcing a decision. Let
$t = \lambda m \in \omega.\; e_j \cup (\mathrm{val}(m) \supset \mathrm{val}(n),\ \mathrm{val}(n))$.
This function has the property that for $m \in \omega$,
$t(m) = e_j$ if $\mathrm{val}(m) = \bot$;
$t(m) = \mathrm{val}(n)$ otherwise.
Let the (primitive) recursive function $u$ be chosen so that for $m \in \omega$:
$\mathrm{val}(u(m)) = t(m)$.
We then find that, by virtue of (*) again,
$k \in \mathrm{val}(p(u(m)))$ iff $t(m) = e_j$ iff $\mathrm{val}(m) = \bot$.
But as we noted in the proof of Theorem 3.4, the predicate on the right is not r.e. in $m$; while on the other hand, the predicate on the left is r.e.

Theorem 3.6 has a very satisfying interpretation stated syntactically. Consider the closed LAMBDA-terms (that is, no free variables). Suppose that $\pi[\tau]$ is an effectively defined syntactical mapping that maps closed terms to closed terms. Suppose further that the mapping is extensional in the sense that whenever the equation $\tau = \sigma$ is true in $P\omega$, then so is $\pi[\tau] = \pi[\sigma]$. We can conclude from Theorem 3.6 that in this case there is a closed LAMBDA-term $\rho$ such that the equation
$\pi[\tau] = \rho(\tau)$
is true in $P\omega$ for all closed terms $\tau$. Informally this means that if a mapping effectively defined by 'symbol manipulation' outside the language has good semantical sense, then an extensionally equivalent mapping can already be defined inside the language. Thus we have a completeness theorem for definability for the language LAMBDA.

This completes our short survey of the theory of enumerability as based on the $\lambda$-calculus. Here is a summary of the theorems just proved:

Theorem 3.1, the Generator Theorem;
Theorem 3.2, the Enumeration Theorem;
Theorem 3.3, the Second Recursion Theorem;
Theorem 3.4, the Incompleteness Theorem;
Theorem 3.5, the Subalgebra Theorem;
Theorem 3.6, the Completeness Theorem (for Definability).

This would seem to be justification enough for the program of passing from the pure $\lambda$-calculus to the applied calculus based on LAMBDA. What remains to be done is to show how this new calculus relates to recursion theory on domains other than $\omega$ or $P\omega$ and to discuss the appropriate proof theory.
Note

An expanded and revised version of this paper will be published under the title Data Types as Lattices for the Kiel Summer School in Logic (July 1974).

References

[1] A. Church, The calculi of lambda-conversion, Princeton (1941).
[2] H. B. Curry et al., Combinatory logic, Vols. 1 and 2 (North-Holland, Amsterdam, 1958, 1972).
[3] E. Elgot and S. Eilenberg, Recursiveness (Academic Press, New York, 1970).
[4] J. R. Hindley et al., Introduction to Combinatory Logic, London Math. Soc. Lecture Notes, vol. 7, CUP (1972).
[5] J. Myhill and J. C. Shepherdson, Effective operations on partial recursive functions, Z. Math. Logik Grundlagen Math. 1 (1955) 310-317.
[6] G. D. Plotkin, A set-theoretical definition of application, School of A.I. Memo. MIP-R-95, Edinburgh (1972).
[7] H. Rogers, A theory of recursive functions and effective computability (McGraw-Hill, New York, 1967).
[8] D. Scott, Continuous lattices, Lecture Notes in Mathematics, Vol. 274 (Springer, Berlin, 1972) pp. 97-136.
[9] D. Scott, Models for various type-free calculi, in: Logic, methodology and philosophy of science IV, Ed. P. Suppes (North-Holland, Amsterdam, 1973) pp. 157-187.
THAT EVERY EXTENSION OF S4.3 IS NORMAL
Krister SEGERBERG
Åbo Academy, Åbo, Finland
Schiller Joe Scroggs's paper [5] is one of the classical contributions to modal logic; it has been a source of inspiration for many later modal logicians. One way of putting the main result of his paper is this:

SCROGGS'S FIRST THEOREM. For every proper normal extension L of S5 there is some finite positive index frame $i$ (where thus $0 < i < \omega$) such that L is determined by $i$. S5 is determined by the class of all finite positive index frames of length 1, as well as by the single index frame $\omega$.

It is that result that is most often associated with Scroggs's name. However, his paper also contains another result which is not without interest if one considers the discovery of McKinsey and Tarski, published in [4], of a non-normal extension of S4:

SCROGGS'S SECOND THEOREM. Every extension of S5 is normal.

Yet another celebrated contribution to modal logic is R. A. Bull's result, first proved in [1], that every normal extension of S4.3 has the finite model property. That Bull's result may be considered as a generalization of Scroggs's First Theorem is evident if it is given a formulation such as the following:
BULL'S THEOREM. For every normal extension L of S4.3 there is some class C of finite positive index frames $\langle i_0, \ldots, i_{n-1} \rangle$ of finite length (where thus $n > 0$ and $0 < i_0, \ldots, i_{n-1} < \omega$) such that L is determined by C. S4.3 is determined by the class of all finite positive index frames of finite length, as well as by the single index frame $\langle \omega, \omega, \omega, \ldots \rangle$.

For terminology not explained here, see [7]. Indices were first introduced in [6].
The purpose of this note is to show that also Scroggs's Second Theorem can be generalized:

THEOREM. Every extension of S4.3 is normal.

In order to keep down the length of the paper we shall make use of two results in the literature. One is [7, Lemma 3.5.1]: If C is a class of transitive connected Kripke frames no cluster of which is degenerate except possibly the first one, then C determines a normal logic. (By a Kripke frame we understand a structure $\mathfrak{F} = \langle t, U, R \rangle$ such that $\langle U, R \rangle$ is a frame (in the ordinary sense) generated by $t$. A formula $\alpha$ is valid in $\mathfrak{F}$ if, for every model $\mathfrak{A} = \langle t, U, R, V \rangle$, $\mathfrak{A} \models_t \alpha$.) The other result is due to Kit Fine and Håkan Franzén, independently of one another, and may be extracted from either [2] or [8]: If L is an extension of S4.3, not necessarily normal, and $\alpha$ is a formula not a theorem of L, then there exists a finite Kripke frame $\mathfrak{F} = \langle t, U, R \rangle$ such that the following conditions are satisfied:
(i) $\mathfrak{F}$ is a Kripke frame for L; that is, every theorem of L is valid in $\mathfrak{F}$.
(ii) There is some $u \in U$ and some model $\mathfrak{A}$ on $\mathfrak{F}$ such that $\mathfrak{A} \not\models_u \alpha$.
(iii) If $u R t$, then $u = t$.

PROOF OF THE THEOREM. Let L and $\alpha$ be as specified. By the former result it is more than enough to show that there exists some transitive, strongly connected Kripke frame for L in which $\alpha$ fails to be valid. Let $\mathfrak{F}$, $\mathfrak{A}$ and $u$ be as in the formulation of Fine's and Franzén's result. Note that since $\mathfrak{F}$ is a Kripke frame for L, and L $\supseteq$ S4.3, $\mathfrak{F}$ will be transitive and strongly connected. If $u = t$, there is nothing more to prove. Suppose therefore that $u \ne t$. By (iii), not $u R t$. We define a function $f$ on $U$ as follows:
$f(x) = x$ if $u R x$,
$f(x) = u$ if not $u R x$.
Let $\mathfrak{A}^u = \langle u, U^u, R^u, V^u \rangle$ be the submodel generated by $u$ from $\mathfrak{A}$. It is easy to see that $f$ is a p-morphism from $\langle U, R, V \rangle$ to $\langle U^u, R^u, V^u \rangle$ stable on every propositional letter, and since $f(t) = u$ we may therefore conclude that, for every $\beta$, $\mathfrak{A} \models_t \beta$ if and only if $\mathfrak{A}^u \models_u \beta$. Hence by virtue
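of the P-morphism Theorem (suitably modified) and (i), $\mathfrak{F}^u = \langle u, U^u, R^u \rangle$ is a Kripke frame for L. Moreover, by (ii) $\alpha$ is not valid in $\mathfrak{F}^u$.

Fine has improved Bull's result considerably (see [3]). Our result can also be extended, but not nearly as widely. For example, whereas Scroggs's First Theorem can be extended to D4.3 - this was proved by Fine and, independently, by Franzén - Scroggs's Second Theorem cannot. For consider the Kripke frame $\mathfrak{F} = \langle t, U, R \rangle$, where $t = 0$, $U = \{0, 1, 2\}$, and $R = \{\langle 0, 0 \rangle, \langle 0, 1 \rangle, \langle 0, 2 \rangle, \langle 1, 2 \rangle, \langle 2, 2 \rangle\}$, and let $L(\mathfrak{F})$ be the set of formulas valid in $\mathfrak{F}$. Then $L(\mathfrak{F})$ is a logic that extends D4.3. But $L(\mathfrak{F})$ is not normal, for $\Box p \to p$ is a theorem of $L(\mathfrak{F})$ but $\Box(\Box p \to p)$ is not. For the benefit of matrix-minded readers we point out that the following matrix, where 1, 2, 3, 4 are the designated elements, is characteristic for $L(\mathfrak{F})$:

$$\begin{array}{c|cccccccc}
\wedge & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
1 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
2 & 2 & 2 & 4 & 4 & 6 & 6 & 8 & 8 \\
3 & 3 & 4 & 3 & 4 & 7 & 8 & 7 & 8 \\
4 & 4 & 4 & 4 & 4 & 8 & 8 & 8 & 8 \\
5 & 5 & 6 & 7 & 8 & 5 & 6 & 7 & 8 \\
6 & 6 & 6 & 8 & 8 & 6 & 6 & 8 & 8 \\
7 & 7 & 8 & 7 & 8 & 7 & 8 & 7 & 8 \\
8 & 8 & 8 & 8 & 8 & 8 & 8 & 8 & 8
\end{array}$$

$$\begin{array}{c|cccccccc}
x & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
\neg x & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \\
\Box x & 1 & 8 & 5 & 8 & 5 & 8 & 5 & 8
\end{array}$$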
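The failure of necessitation on the three-point frame can be confirmed by brute force over all valuations of a single propositional letter. This is a minimal sketch rather than part of the original argument; it uses the definition of validity given earlier (truth at the generating point $t$ in every model on the frame), and the helper names `box`, `implies` and `valid_at_t` are hypothetical.

```python
from itertools import product

U = [0, 1, 2]
R = {(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)}
T = 0  # the generating point t of the frame


def box(truth):
    """Points where 'box phi' holds, given the set of points where phi holds."""
    return {w for w in U if all(v in truth for v in U if (w, v) in R)}


def implies(a, b):
    """Points where 'a implies b' holds."""
    return {w for w in U if w not in a or w in b}


def valid_at_t(formula):
    """True if formula(p) holds at T for every valuation p of one letter."""
    for bits in product([False, True], repeat=len(U)):
        p = {w for w, bit in zip(U, bits) if bit}
        if T not in formula(p):
            return False
    return True


axiom_holds = valid_at_t(lambda p: implies(box(p), p))
boxed_axiom_holds = valid_at_t(lambda p: box(implies(box(p), p)))
```

The first check succeeds because the point 0 sees itself; the second fails because the irreflexive point 1 falsifies the implication when the letter is true at 2 only.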
References

[1] R. A. Bull, That all normal extensions of S4.3 have the finite model property, Z. Math. Logik Grundlagen Math. 12 (1966) 341-344.
[2] K. Fine, The logics containing S4.3, Z. Math. Logik Grundlagen Math. 17 (1971) 371-376.
[3] K. Fine, Logics containing K4, J. Symbolic Logic 39 (1974) 31-42.
[4] J. C. C. McKinsey and A. Tarski, Some theorems about the sentential calculi of Lewis and Heyting, J. Symbolic Logic 13 (1948) 1-15.
[5] S. J. Scroggs, Extension of the Lewis system S5, J. Symbolic Logic 16 (1951) 112-128.
[6] K. Segerberg, On the extensions of S4.4, J. Symbolic Logic 36 (1971) 590-591.
[7] K. Segerberg, An essay in classical modal logic, Dissertation, Stanford University (1971), mimeographed edition, Uppsala (1971).
[8] K. Segerberg, Franzén's proof of Bull's Theorem, Ajatus 35 (1973) 216-221.
DESCRIPTIONS IN INTUITIONISTIC LOGIC
Sören STENLUND
University of Uppsala, Uppsala, Sweden
1. Introduction

The theory of descriptions I presented at the Third Scandinavian Logic Symposium is laid out in detail in [2]. That treatment is carried out within the framework of classical first order logic. The purpose of this paper is to present a similar treatment within intuitionistic or Heyting first order predicate logic and also to summarize some of the results in [2]. As pointed out in [2], descriptions are, in a certain sense, less problematic from an intuitionistic point of view. The reason for this is that intuitionistically truth is understood in terms of proof and proofs have to be treated as mathematical objects. A consequence of this is that descriptive phrases receive a stronger interpretation. Take for example the phrase 'a so-and-so'. Intuitionistically a number x such that A(x) is given not only as a number but as a number n together with a proof of A(n). Similarly the number x such that A(x) is given as a number n together with a proof of A(n) and a proof of the uniqueness of n. Does this interpretation make it impossible to use empty descriptions or descriptions whose existence and uniqueness conditions are not known to be true? The answer is no, and the reason is the same as in the classical case: Descriptions can be introduced on the basis of assumptions and we are free to adopt any assumptions we want. The fact that empty descriptions can be significant and meaningfully used in informal intuitionistic reasoning is therefore explained as in the introduction of [2]: The introduction of a description presupposes the truth of the corresponding
existence and uniqueness conditions. A difference from the classical case is, however, that such presuppositions or assumptions take a different form intuitionistically. To assume that a certain proposition $A$ is true intuitionistically is to assume that we have a proof $y$ of $A$, i.e., to introduce a variable $y$ intended to range over proofs of $A$. Following [1] we indicate this by writing $y \in A$. Assuming that our individuals are natural numbers, to have a proof of the proposition $\exists_1 x\, A(x)$ (that there exists exactly one $x$ such that $A(x)$) is to have three things: a number $n$, a proof $a$ of $A(n)$ and a proof $f$ of $\forall x\, \forall y\, (A(x) \mathbin{\&} A(y) \to x = y)$. A proof of the proposition $\exists_1 x\, A(x)$ can therefore be represented as an ordered pair $\langle\langle n, a \rangle, f \rangle$, where $n$, $a$ and $f$ are as just explained, and we indicate this by writing
$\langle\langle n, a \rangle, f \rangle \in \exists_1 x\, A(x)$. (1)
A consequence of this is that if we are given a proof $b$ of $\exists_1 x\, A(x)$, we can always find the natural number $n$ which is the $x$ such that $A(x)$ by twice applying the left projection $p$ (i.e., the function $p$ such that $p(\langle x, y \rangle) = x$ for all $x$ and $y$) and remembering that $b$ is an object of the form $\langle\langle n, a \rangle, f \rangle$. As an example consider the statement
the natural number $x$ such that $A(x)$ is a prime number. (2)
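Reading the described number off a proof object amounts to two left projections, which a few lines can illustrate. This is only a sketch under obvious simplifications: proofs are represented by placeholder strings, nested Python tuples play the role of the ordered pairs, and the names `left` and `described_number` are hypothetical.

```python
def left(pair):
    """Left projection p: p(<x, y>) = x."""
    return pair[0]


def described_number(proof):
    """Read the witness n off a proof object of the form <<n, a>, f>."""
    return left(left(proof))


# Hypothetical proof object for "there is exactly one x with x + 2 == 5".
proof = ((3, "evidence that 3 + 2 == 5"), "evidence of uniqueness")
n = described_number(proof)
```

The point of the representation is that the number is recovered from the proof alone, without any separate description operator.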
Let $Px$ mean that $x$ is a prime number. To formalize (2) along the lines explained, we must first introduce the assumption that $A(x)$ holds for exactly one number $x$:
$y \in \exists_1 x\, A(x)$, (3)
i.e., let $y$ be a proof of $\exists_1 x\, A(x)$. Since $y$ ranges over objects of the form (1), $p(p(y))$ is a natural number and (2) then takes the form
$P(p(p(y)))$.
The meaningfulness of this formula hinges of course on the assumption that we have a proof $y$ of $\exists_1 x\, A(x)$. If we want to interpret (2) as a closed statement depending on no assumptions we must discharge the
assumption just mentioned and interpret (2) as the universal statement
$\forall y \in \exists_1 x\, A(x)\ \ P(p(p(y)))$, (4)
which informally means that for each proof $y$ of the proposition $\exists_1 x\, A(x)$, the unique number $p(p(y))$ which can be read off from this proof is a prime number. If there were no number or more than one number satisfying $A(x)$ and hence no proof of $\exists_1 x\, A(x)$, the formula (4) is of course vacuously true. This is in agreement with the ideas of [2] and as argued there it seems to be the only reasonable interpretation of (2) as far as truth-conditions are concerned. In a more complete formalization of intuitionistic abstractions such as [1], where proofs are treated as objects, there is therefore no need to introduce the description operator as an independent primitive notion since any statement containing descriptions can be translated into the system using the left projection function as in the example above. This is no doctrine of elimination of descriptions but a consequence of the stronger intuitionistic interpretation of descriptive phrases. In systems where this stronger interpretation can be expressed, there is thus no special problem about descriptions.

Difficulties arise, however, in intuitionistic formal systems like Heyting's predicate logic which differ from classical systems essentially only by lacking the law of the excluded middle. In such systems proofs are not treated as objects, i.e. as possible values of bound variables, and the explanation above cannot be carried out. There are also other difficulties. In the usual formulations of Heyting's predicate logic it is not possible to introduce terms of the kind $\iota x\, A(x)$ for any formula $A(x)$ and add the $\iota$-rule
$\dfrac{\exists_1 x\, A(x)}{A(\iota x\, A(x))}$
or the $\iota$-axiom
$\exists_1 x\, A(x) \supset A(\iota x\, A(x))$,
not worrying about non-denoting terms as is sometimes done in classical logic. This would make the following formula provable:
$(A \to \exists_1 x\, B(x)) \to \exists x\, (A \to B(x))$,
where x is not free in A. But this formula is not intuitionistically valid. This is true in spite of the fact that the ι-rule and the ι-axiom are obviously intuitionistically valid when ιx A(x) is interpreted as the unique individual which can be read off from a proof of ∃₁x A(x). This shows that the usual formulations of Heyting’s predicate calculus have to be modified. As will be shown below, it is possible to give a treatment of descriptions in Heyting’s predicate calculus using the ideas of [2]. This treatment of descriptions is in agreement with the intuitionistic interpretation of descriptions explained above as far as it can be expressed within the language of Heyting’s predicate logic.

The disposition of the remainder of this paper is parallel to the presentation in [2], and the reader should consult [2] for informal motivation and for a more rigorous presentation of formal details and proofs. In the next section I shall introduce a formal language and a deductive system which is an extension and modification of Gentzen’s system of natural deduction. I shall then show that the eliminability results of [2] hold also in the present context by indicating the differences needed to make the proofs of [2] go through. Certain complications arise in connection with disjunction and existential quantification. In the final section I shall briefly describe a classical model theory for the present formal system using Kripke structures.

2. Rules of inference

We consider a language of first order predicate logic based on the following logical symbols: ⊃ (implication), & (conjunction), ∨ (disjunction), ⊥ (absurdity), ∀ (universal quantification), ∃ (existential quantification), ι (description operator), = (identity). There may be a list of n-place function symbols, n ≥ 0. A 0-place function symbol is an individual constant or a name. There is a list of n-place predicate symbols and an infinite list of individual variables. The atomic term-expressions are the variables and the individual constants, and the composite term-expressions are formed from the atomic ones by means of the function symbols and also from the formula-expressions by means of the description operator. So when A(x) is a formula-expression, ιx A(x) is a term-expression. The atomic formula-expressions are also built up as in [2] from term-expressions by means of the predicate symbols, and the composite formula-expressions are built up as usual from the atomic ones using the logical symbols. We shall follow the same conventions as in [2] regarding free and bound variables, substitution and omission of parentheses. Also we shall write ∃₁x A(x) as an abbreviation for
∃x A(x) & ∀x ∀y (A(x) & A(y) → x = y).
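The formation rules just described can be sketched as a small abstract syntax. The following Python fragment is only an illustration and not part of the formal development: the class names, the ASCII spellings (`iota`, `forall`, `exists`, `->`) and the helper `exists_unique` are assumptions introduced here. It shows that terms and formulas are mutually recursive (a description term contains a formula) and how the abbreviation ∃₁x A(x) unfolds.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:                      # f t1 ... tn; n = 0 gives an individual constant
    symbol: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Iota:                    # description term: iota x A(x)
    var: str
    body: "Formula"


@dataclass(frozen=True)
class Pred:                    # atomic formula P t1 ... tn
    symbol: str
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class Eq:                      # t = s
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Bot:                     # absurdity
    pass


@dataclass(frozen=True)
class BinOp:                   # op is one of "&", "v", "->"
    op: str
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:                   # q is "forall" or "exists"
    q: str
    var: str
    body: "Formula"


Term = Union[Var, Fn, Iota]
Formula = Union[Pred, Eq, Bot, BinOp, Quant]


def exists_unique(x: str, y: str, a: Callable[[Term], Formula]) -> Formula:
    """Unfold the abbreviation: exists x A(x) & forall x forall y (A(x) & A(y) -> x = y)."""
    uniqueness = BinOp("->", BinOp("&", a(Var(x)), a(Var(y))), Eq(Var(x), Var(y)))
    return BinOp("&", Quant("exists", x, a(Var(x))),
                 Quant("forall", x, Quant("forall", y, uniqueness)))


def show(e) -> str:
    """Render a term-expression or formula-expression as a string."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Fn):
        return e.symbol if not e.args else f"{e.symbol}({', '.join(map(show, e.args))})"
    if isinstance(e, Iota):
        return f"iota {e.var}. {show(e.body)}"
    if isinstance(e, Pred):
        return f"{e.symbol}({', '.join(map(show, e.args))})"
    if isinstance(e, Eq):
        return f"{show(e.left)} = {show(e.right)}"
    if isinstance(e, Bot):
        return "_|_"
    if isinstance(e, BinOp):
        return f"({show(e.left)} {e.op} {show(e.right)})"
    if isinstance(e, Quant):
        return f"{e.q} {e.var}. {show(e.body)}"
    raise TypeError(e)
```

For instance, `show(exists_unique("x", "y", lambda t: Pred("P", (t,))))` renders the unfolded abbreviation for a one-place predicate symbol P.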
2.1. In order to present the rules of proof I shall introduce the following metamathematical symbols
I,   F,   ∈.
The derivations of the deductive system are certain tree arrangements of expressions of the following forms
t ∈ I,   A ∈ F,   A,
where t is a term-expression and A is a formula-expression. The rules of inference allow us to infer expressions of each of these forms. The informal meaning of t ∈ I is that ‘t’ denotes an individual or ‘t’ has a reference. A ∈ F means that ‘A’ expresses a proposition. This means that when we have A ∈ F, we know what it means to be a proof of A, because intuitionistically to understand a proposition A is to understand what it means to be a proof of A. When we have a derivation ending with A, we interpret this as meaning that the proposition A has been proved and the derivation will be said to codify a proof of A.

A derivation is started from an axiom or by making an assumption and proceeds downwards by means of the rules of inference. Some of the rules of inference are such that when one passes from the premisses to the conclusion certain assumptions are discharged. This will be indicated by enclosing the assumption in question in square brackets. An assumption is of one of two forms: either it is of the form x ∈ I, meaning ‘let x be an individual’, or else it is of the form A, and in the latter case the rules of inference tell us when we are allowed to introduce such an assumption. A derivation is closed if all of its assumptions have been discharged. Otherwise it is open. We say that a term-expression t is a term if we have a closed derivation of t ∈ I. A is a formula if we have a closed derivation of A ∈ F, and if there is a closed derivation with conclusion A we say that A is a (formal) theorem.

2.2. As in [2], the rules of inference will be divided into three groups depending on the form of the conclusion:
I-rules:
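(Each rule is written on one line as premisses / conclusion; an assumption discharged by the inference stands in square brackets before the premiss that depends on it.)

(f-rule)   t₁ ∈ I, ..., tₙ ∈ I / f t₁ ... tₙ ∈ I
(ι-rule)   ∃₁x A(x) / ιx A(x) ∈ I

F-rules:

⊥ ∈ F
t₁ ∈ I, ..., tₙ ∈ I / P t₁ ... tₙ ∈ F
t ∈ I, s ∈ I / t = s ∈ F
A ∈ F, B ∈ F / A & B ∈ F
A ∈ F, B ∈ F / A ∨ B ∈ F
A ∈ F, [A] B ∈ F / A → B ∈ F
[x ∈ I] A(x) ∈ F / ∀x A(x) ∈ F
[x ∈ I] A(x) ∈ F / ∃x A(x) ∈ F

Introduction and elimination rules:

∃₁x A(x) / A(ιx A(x))
⊥ / A
A, B / A & B
A & B / A
A & B / B
A, B ∈ F / A ∨ B
A ∈ F, B / A ∨ B
A ∨ B, [A] C, [B] C / C
A ∈ F, [A] B / A → B
A → B, A / B
[x ∈ I] A(x) ∈ F, t ∈ I, A(t) / ∃x A(x)
∃x A(x), [x ∈ I, A(x)] C / C
t ∈ I / t = t
t = s, A(t) / A(s)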
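The bookkeeping of open and discharged assumptions described in 2.1 can be sketched as follows. This is an illustration only: the class `Derivation`, its fields and the string encoding of expressions are assumptions made for this sketch, and no rule checking is performed. Each premiss carries the set of assumptions that the inference discharges in it, and a derivation is closed when no assumption remains open.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class Derivation:
    conclusion: str
    # Each premiss is paired with the assumptions discharged in it by this inference.
    premisses: List[Tuple["Derivation", FrozenSet[str]]] = field(default_factory=list)
    is_assumption: bool = False


def assume(expression: str) -> Derivation:
    """Start a derivation by making an assumption."""
    return Derivation(expression, is_assumption=True)


def open_assumptions(d: Derivation) -> FrozenSet[str]:
    """Collect the assumptions of d that have not been discharged."""
    if d.is_assumption:
        return frozenset({d.conclusion})
    result: FrozenSet[str] = frozenset()
    for sub, discharged in d.premisses:
        result = result | (open_assumptions(sub) - discharged)
    return result


def is_closed(d: Derivation) -> bool:
    return not open_assumptions(d)


# Example: from [x in I] derive "x = x in F", then "forall x (x = x) in F",
# discharging the assumption "x in I" at the last step.
X_IN_I = "x in I"
eq_is_formula = Derivation(
    "x = x in F",
    [(assume(X_IN_I), frozenset()), (assume(X_IN_I), frozenset())],
)
forall_is_formula = Derivation(
    "forall x (x = x) in F",
    [(eq_is_formula, frozenset({X_IN_I}))],
)
```

Here `eq_is_formula` is an open derivation depending on the assumption x ∈ I, whereas `forall_is_formula` is closed, so on the reading of 2.1 the expression ∀x (x = x) counts as a formula.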
The rules of inference in which one of the premisses depends on an assumption of the form x ∈ I which is discharged by that inference are subject to the following restriction: x must not occur free in any assumption on which the premiss in question depends other than the ones exhibited. In addition x must not occur free in the formula-expression C of the existential elimination rule.

2.3. In the case of the f-rule, it is to be understood that when f is a 0-place function symbol, i.e., an individual constant, then the f-rule takes the form of an axiom:
f ∈ I.
In our formulation of the f-rule it is understood that the function symbols represent total functions. The f-rule can be given a more general formulation so as to include the cases when the function symbols represent partial functions. In that case an additional premiss must be introduced which says when the function is defined. Let f be a function symbol representing an n-place partial function and let A_f(x₁, ..., xₙ) be a formula not containing f which represents the domain of definition of f; then the f-rule takes the form (premisses / conclusion)
t₁ ∈ I, ..., tₙ ∈ I, A_f(t₁, ..., tₙ) / f t₁ ... tₙ ∈ I
In most cases, however, it seems unnecessary to introduce function symbols for partial functions since we have the description operator. For example, if our individuals are real numbers, the inverse function x⁻¹ is already represented by the term-expression
ιy (x · y = 1)
in the object language. There is also another reason not to include the f-rule for partial functions in the present context. It would not be compatible with the intuitionistic notion of a function, according to which a function is understood as a rule of correspondence or a method for finding an individual as a value when one is given certain individuals as arguments. With the general f-rule for a partial function f, if we are given individuals t₁, ..., tₙ we may never know whether A_f(t₁, ..., tₙ) holds or not, and hence we have no method for finding a value given only t₁, ..., tₙ. If, in addition, we have a proof of A_f(t₁, ..., tₙ), then we can find a value. But then, thought of as a rule or method, f depends on n + 1 arguments, the (n + 1)th argument being proofs of A_f(t₁, ..., tₙ), and this would take us outside the formalism of the present paper.

As already pointed out, the informal meaning of
A ∈ F
is that the formula-expression A expresses a proposition or is well-defined. The premisses A ∈ F and B ∈ F of the introduction rules for disjunction assure us therefore that a disjunction A ∨ B holds only if A and B are both well-defined. Similarly, the premiss A(x) ∈ F of the introduction rule for existential quantification assures us that ∃x A(x) holds only if A(x) is well-defined for all x ∈ I.¹

From a purely formal point of view, disjunction and existential quantification could be given a different treatment. If we omit all of the premisses just mentioned and change the corresponding F-rules to the following (premisses / conclusion):
A ∈ F / A ∨ B ∈ F
B ∈ F / A ∨ B ∈ F
… / ∃x A(x) ∈ F
and change some of the definitions accordingly, then the proofs would go through with few changes. On this second approach, a disjunction A ∨ B would mean that at least one of A and B is defined and true, and an existential statement ∃x A(x) would mean that there is an x such that A(x) is defined and true. To take an example, suppose that we have the language of arithmetic. Then the following formula-expression
¹ There is an error in the formulation of this introduction rule in [2], in that this premiss is lacking.
expresses a well-defined and true proposition. Using the second notion of existential quantification we can infer
∃x (ιy (y < x) < 2).   (5)
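The behaviour of the term-expression ιy (y < x) occurring in (5) can be checked on an initial segment of the natural numbers. The sketch below is illustrative only; the function names and the use of `None` for "has no reference" are assumptions of the sketch, and the search is finite because y < x forces y to lie among 0, ..., x − 1.

```python
from typing import Callable, Optional


def describe(pred: Callable[[int], bool], bound: int) -> Optional[int]:
    """Return the unique y in range(bound) satisfying pred, or None if there is
    no such y or more than one."""
    witnesses = [y for y in range(bound) if pred(y)]
    return witnesses[0] if len(witnesses) == 1 else None


def iota_less_than(x: int) -> Optional[int]:
    """Value of the description iota y (y < x) over the natural numbers."""
    return describe(lambda y: y < x, bound=max(x, 0))


def some_value_below_two(limit: int) -> bool:
    """Is there an x < limit for which the description refers and its value is < 2?"""
    return any(v is not None and v < 2 for v in map(iota_less_than, range(limit)))


def defined_for_all(limit: int) -> bool:
    """Does the description refer for every x < limit?"""
    return all(v is not None for v in map(iota_less_than, range(limit)))
```

On this finite check the description refers only for x = 1, where its value is 0; for x = 0 nothing satisfies y < x, and for x ≥ 2 more than one number does.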
This formula is not provable on the first approach, since ιy (y < x) < 2 is not defined for all natural numbers x. If we make explicit what is really said in the statement (5), we arrive at the following: ‘There is a number x such that the term-expression ιy (y < x) refers to a unique number and this number is strictly less than 2.’ Using the first meaning of existential quantification, this can be expressed as follows:
I shall not deal with the second notions of disjunction and existential quantification any further, since they seem to be less basic and have a more complicated informal meaning than the notions introduced first. It seems to me that any statement which can be expressed using the second meaning of disjunction and existential quantification can also be expressed by means of the first as in the example just given.

2.4. When we have a derivation ending with a formula-expression A, we interpret this as meaning that we have a proof of A. Now, the things which can have proofs are by definition propositions. Hence we must require that when we have a derivation of A, there should also be a derivation of A ∈ F. This follows from the next result.
THEOREM. If we have a derivation of A, then we can find a derivation of A ∈ F.

This is proved as in [2] by induction on the length of the derivation of A. The only complication which arises now is in the treatment of the elimination rules for disjunction and existential quantification, where one has to use a subsidiary induction on the construction of the formula-expression C.
3. Eliminability

3.1. The eliminability results of [2] extend to the present system. We associate to each formula-expression A a description-free formula-expression A° as follows. The translation ° is compatible with all logical operations except ι, so we need only consider atomic formula-expressions. If t₁, ..., tₙ in P t₁ ... tₙ contain no term-expressions of the form ιxC, then
(P t₁ ... tₙ)° = P t₁ ... tₙ.
Suppose, then, that this is not the case and let
P(ιx A₁(x), ..., ιy Aₘ(y))
stand for P t₁ ... tₙ, where ιx A₁(x), ..., ιy Aₘ(y) are all the term-expressions of the form ιxC which have at least one occurrence not within a term-expression of the form ιxC, taken from left to right. Then we put
(P t₁ ... tₙ)° = ∀x ... ∀y (A₁°(x) & ... & Aₘ°(y) → P(x, ..., y)).
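The translation ° can be sketched as a recursive function on a tuple encoding of formula-expressions. This is a hedged illustration rather than the construction of [2]: the tuple encoding, the function names and, in particular, the use of fresh variables `z1`, `z2`, ... in place of the description's own bound variable are assumptions of the sketch (the fresh names are assumed not to occur in the input). Identity is treated like any other predicate symbol.

```python
import itertools

# Terms:    ("var", name) | ("fn", symbol, [terms]) | ("iota", var, formula)
# Formulas: ("pred", symbol, [terms]) | ("bot",)
#           | ("and" | "or" | "imp", A, B) | ("forall" | "exists", var, A)


def subst_term(t, old, new):
    """Rename free occurrences of variable `old` to `new` in a term."""
    if t[0] == "var":
        return ("var", new) if t[1] == old else t
    if t[0] == "fn":
        return ("fn", t[1], [subst_term(a, old, new) for a in t[2]])
    # description: stop if it rebinds `old`
    return t if t[1] == old else ("iota", t[1], subst(t[2], old, new))


def subst(f, old, new):
    """Rename free occurrences of variable `old` to `new` in a formula."""
    kind = f[0]
    if kind == "bot":
        return f
    if kind == "pred":
        return ("pred", f[1], [subst_term(a, old, new) for a in f[2]])
    if kind in ("and", "or", "imp"):
        return (kind, subst(f[1], old, new), subst(f[2], old, new))
    return f if f[1] == old else (kind, f[1], subst(f[2], old, new))


def outermost_descriptions(terms):
    """Descriptions with an occurrence not inside another description, left to right."""
    found = []

    def visit(t):
        if t[0] == "iota":
            if t not in found:
                found.append(t)
        elif t[0] == "fn":
            for a in t[2]:
                visit(a)

    for t in terms:
        visit(t)
    return found


def translate(f, fresh=None):
    """Description-free translation of a formula-expression (sketch)."""
    if fresh is None:
        fresh = itertools.count(1)
    kind = f[0]
    if kind == "bot":
        return f
    if kind in ("and", "or", "imp"):
        return (kind, translate(f[1], fresh), translate(f[2], fresh))
    if kind in ("forall", "exists"):
        return (kind, f[1], translate(f[2], fresh))
    descs = outermost_descriptions(f[2])          # atomic case
    if not descs:
        return f
    names = [f"z{next(fresh)}" for _ in descs]    # ASSUMPTION: fresh names

    def strip(t):
        if t[0] == "iota":
            return ("var", names[descs.index(t)])
        if t[0] == "fn":
            return ("fn", t[1], [strip(a) for a in t[2]])
        return t

    atom = ("pred", f[1], [strip(a) for a in f[2]])
    conds = [translate(subst(d[2], d[1], z), fresh) for d, z in zip(descs, names)]
    hyp = conds[0]
    for c in conds[1:]:
        hyp = ("and", hyp, c)
    out = ("imp", hyp, atom)
    for z in reversed(names):
        out = ("forall", z, out)
    return out
```

For example, the atomic formula-expression P(ιx Q(x)) is sent to ∀z1 (Q(z1) → P(z1)), and nested descriptions are handled by the recursive call on each condition.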
We can then establish the following result.
THEOREM 1. Given a derivation of A in the original system, we can find a derivation of A° in the description-free system.

The proof is by induction on the length of the derivation of A. Since the °-transformation is compatible with the logical operations, the result follows immediately by the induction hypothesis for all inference rules except the following ones (written as premisses / conclusion):
∃₁x A(x) / A(ιx A(x))
t = s, A(t) / A(s)
…
[x ∈ I] A(x) ∈ F, t ∈ I, A(t) / ∃x A(x)
where t and s contain occurrences of descriptions. In [2], it was shown how to replace the first three of these rules by the following one:
∃₁x A(x) / ∀z₁ ... ∀zₙ (∀x (A(x) → B(x)) ↔ B(ιx A(x)))
which is easier to deal with. This rule can also be used to derive the last one and the proof can then be completed as in [2].

The following result is also proved as in [2].

THEOREM 2. For each formula-expression A, if we have a derivation of A ∈ F, then we can find a derivation of A ↔ A°.

Combining these two theorems we have:
ELIMINABILITY. If there is a derivation of A ∈ F, then A is derivable in the original system if and only if A° is derivable in the description-free system.

3.2. It is also possible to give a characterization of the metalogical predicates I and F in terms of the derivability predicate. To each term-expression t and each formula-expression A we associate the description-free formula-expressions I(t) and F(A), respectively, as follows:
I(t) = (t = t) if t is a variable or a constant,
I(f t₁ ... tₙ) = I(t₁) & ... & I(tₙ),
I(ιx A(x)) = ∀x F(A(x)) & ∃₁x A°(x),
F(⊥) = ⊥,
F(P t₁ ... tₙ) = I(t₁) & ... & I(tₙ),
F(A & B) = F(A) & F(B),
F(A ∨ B) = F(A) & F(B),
F(A ⊃ B) = F(A) & (A° → F(B)),
F(∀x A(x)) = ∀x F(A(x)),
F(∃x A(x)) = ∀x F(A(x)).
Then the following result is proved by induction over t and A.

THEOREM 3. For any term-expression t and any formula-expression A we have
t ∈ I iff I(t) is derivable,
A ∈ F iff F(A) is derivable.
4. Model theory
In this final section, I shall briefly sketch a model theory for the formal system of Section 2 using Kripke structures. A structure G = (T, ≤, Ψ, ρ, Φ) consists of the following components:
(1) A non-empty set T.
(2) A reflexive and transitive relation ≤ on T.
(3) A function Ψ which assigns to each α ∈ T a (possibly empty) domain Ψ(α) of individuals such that if α, β ∈ T and α ≤ β, then Ψ(α) ⊆ Ψ(β).
(4) A function ρ which assigns to each α ∈ T an equivalence relation ρ(α) on Ψ(α) such that if α, β ∈ T and α ≤ β, then ρ(α) ⊆ ρ(β). (ρ(α) will be the interpretation of = on Ψ(α). If i, j ∈ Ψ(α) and (i, j) ∈ ρ(α), we say that i and j are ρ-identical.)
(5) A function Φ which assigns to each n-place predicate symbol P and each α ∈ T a set Φ(P, α) ⊆ (Ψ(α))ⁿ, such that if α, β ∈ T and α ≤ β, then Φ(P, α) ⊆ Φ(P, β). Φ is also such that for all i₁, ..., iₙ ∈ Ψ(α) and all j₁, ..., jₙ ∈ Ψ(α) such that iₖ and jₖ are ρ-identical, 1 ≤ k ≤ n, and (i₁, ..., iₙ) ∈ Φ(P, α), we have (j₁, ..., jₙ) ∈ Φ(P, α).

To each function symbol f with n argument places and each α ∈ T, the function Φ assigns a function Φ(f, α) : (Ψ(α))ⁿ → Ψ(α) such that if α, β ∈ T and α ≤ β, then Φ(f, α)(i₁, ..., iₙ) = Φ(f, β)(i₁, ..., iₙ), for all i₁, ..., iₙ ∈ Ψ(α). Φ is also such that for all i₁, ..., iₙ and all j₁, ..., jₙ in Ψ(α) such that iₖ and jₖ are ρ-identical, 1 ≤ k ≤ n, we have
Φ(f, α)(i₁, ..., iₙ) = Φ(f, α)(j₁, ..., jₙ).
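For a finite structure, conditions (2) to (5) can be checked mechanically. The following sketch is illustrative only: the dictionary-and-set representation and the function name `check_structure` are assumptions made here, function symbols are left out, and `phi` is assumed to be given for every predicate symbol at every element of T.

```python
from itertools import product


def check_structure(T, leq, dom, rho, phi):
    """Check conditions (2)-(5) for a finite structure without function symbols.

    T:   iterable of nodes          leq: set of pairs (a, b) meaning a <= b
    dom: node -> set of individuals rho: node -> set of pairs of individuals
    phi: (predicate symbol, node) -> set of tuples of individuals
    """
    checks = {}
    checks["leq reflexive"] = all((a, a) in leq for a in T)
    checks["leq transitive"] = all(
        (a, c) in leq for (a, b) in leq for (b2, c) in leq if b == b2)
    checks["domains increase"] = all(dom[a] <= dom[b] for (a, b) in leq)
    checks["rho is an equivalence"] = all(
        all((i, i) in rho[a] for i in dom[a])
        and all((j, i) in rho[a] for (i, j) in rho[a])
        and all((i, k) in rho[a]
                for (i, j) in rho[a] for (j2, k) in rho[a] if j == j2)
        for a in T)
    checks["rho increases"] = all(rho[a] <= rho[b] for (a, b) in leq)
    checks["phi increases"] = all(
        phi[(P, a)] <= phi[(P, b)]
        for (P, a) in phi for (a2, b) in leq if a2 == a)
    checks["phi respects rho"] = all(
        tuple(js) in phi[(P, a)]
        for (P, a) in phi
        for tup in phi[(P, a)]
        for js in product(dom[a], repeat=len(tup))
        if all((i, j) in rho[a] for i, j in zip(tup, js)))
    return checks


# A two-node example: the individual "b" and the fact P(a) appear only at node 1.
T = [0, 1]
LEQ = {(0, 0), (0, 1), (1, 1)}
DOM = {0: {"a"}, 1: {"a", "b"}}
RHO = {0: {("a", "a")}, 1: {("a", "a"), ("b", "b")}}
PHI = {("P", 0): set(), ("P", 1): {("a",)}}
```

Calling `check_structure(T, LEQ, DOM, RHO, PHI)` returns a dictionary of named conditions, all of which hold for this example; shrinking a domain along ≤ makes the "domains increase" entry fail.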
We shall define a partial function V_G of two arguments, the first argument being a closed term-expression or formula-expression and the second an element of T. The values of V_G will be either individuals or one of the truth-values 1 (truth) or 0 (falsity). We extend our language by introducing names for the individuals in ⋃_{α∈T} Ψ(α). a, b, c, ... will be used as notations for such names and ā, b̄, c̄, ... will denote the corresponding individuals. We introduce, of course, exactly one name for each individual. Let α ∈ T. The names of the individuals of Ψ(α) are 0-place function symbols, so we add as axioms
a ∈ I
for each name a such that ā ∈ Ψ(α), and in the definition of V_G we let the notions of a term, term-expression, formula, etc. refer to this enlargement of the language relative to G and α. V_G is then defined by induction on the construction of term-expressions and formula-expressions as follows:
(i) V_G(a, α) is defined if ā ∈ Ψ(α), and then V_G(a, α) = ā.
(ii) V_G(f t₁ ... tₙ, α) is defined if V_G(t₁, α), ..., V_G(tₙ, α) are all defined, and then
V_G(f t₁ ... tₙ, α) = Φ(f, α)(V_G(t₁, α), ..., V_G(tₙ, α)).
(iii) V_G(ιx A(x), α) is defined if for each β ∈ T such that α ≤ β and each ā ∈ Ψ(β), V_G(A(a), β) is defined, and the set
contains only ρ-identical individuals. If V_G(ιx A(x), α) is defined, it is equal to one of these individuals in Ψ(α).
(iv) V_G(P t₁ ... tₙ, α) is defined if V_G(t₁, α), ..., V_G(tₙ, α) are all defined, and then
V_G(P t₁ ... tₙ, α) = 1 if (V_G(t₁, α), ..., V_G(tₙ, α)) ∈ Φ(P, α),
= 0 otherwise.
V_G(t = s, α) is defined if V_G(t, α) and V_G(s, α) are both defined, and then
V_G(t = s, α) = 1 if V_G(t, α) and V_G(s, α) are ρ-identical,
= 0 otherwise.
(v) V_G(⊥, α) is defined and = 0.
(vi) V_G(A & B, α) is defined if V_G(A, α) and V_G(B, α) are both defined, and then
V_G(A & B, α) = 1 if V_G(A, α) = V_G(B, α) = 1,
= 0 otherwise.
(vii) V_G(A ∨ B, α) is defined if V_G(A, α) and V_G(B, α) are both defined, and then
V_G(A ∨ B, α) = 1 if V_G(A, α) = 1 or V_G(B, α) = 1,
= 0 otherwise.
(viii) V_G(A ⊃ B, α) is defined if for each β ∈ T such that α ≤ β
(1) V_G(A, β) is defined,
(2) V_G(A, β) = 1 only if V_G(B, β) is defined.
If V_G(A ⊃ B, α) is defined, then
V_G(A ⊃ B, α) = 1 if for each β ∈ T such that α ≤ β, V_G(A, β) = 0 or V_G(B, β) = 1,
= 0 otherwise.
(ix) V_G(∀x A(x), α) is defined if for each β ∈ T such that α ≤ β and each ā ∈ Ψ(β), V_G(A(a), β) is defined, and then
V_G(∀x A(x), α) = 1 if for all β ∈ T such that α ≤ β and all ā ∈ Ψ(β), V_G(A(a), β) = 1,
= 0 otherwise.
(x) V_G(∃x A(x), α) is defined if for each β ∈ T such that α ≤ β and each ā ∈ Ψ(β), V_G(A(a), β) is defined, and then
V_G(∃x A(x), α) = 1 if for some ā ∈ Ψ(α), we have V_G(A(a), α) = 1,
= 0 otherwise.
This completes the definition of V_G. It is easy to verify that for each structure G = (T, ≤, Ψ, ρ, Φ) and for all α, β ∈ T such that α ≤ β, if V_G(t, α) is defined, then V_G(t, β) is defined and V_G(t, α) = V_G(t, β) ∈ Ψ(α). Also, if V_G(A, α) is defined, then V_G(A, β) is defined, and if V_G(A, α) = 1, then V_G(A, β) = 1. A closed term-expression t is said to have a reference in G if V_G(t, α) is defined for all α ∈ T. A closed formula-expression A is said to express a proposition in G if V_G(A, α) is defined for all α ∈ T, and A is said to be valid in G if V_G(A, α) = 1 for all α ∈ T. It is then possible to establish the following (classical) soundness and completeness results as in [2]:
THEOREM 4. (i) t is a term iff t has a reference in each structure G. (ii) A is a formula iff A expresses a proposition in each structure G. (iii) A is a theorem iff A is valid in each structure G. Here t and A range over closed term-expressions and formula-expressions in the original language of Section 2.
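The partial valuation V_G can be sketched for finite structures without function symbols other than names. The code below is a hedged illustration, not the definition itself: quantifier and description bodies are represented as Python functions from a name to a formula, `None` stands for "undefined", names are identified with the individuals they denote, and the atomic clauses use the natural reading in terms of Φ and ρ. In particular, the set in clause (iii) is ASSUMED here to be the set of individuals of Ψ(α) satisfying A at α, and it is ASSUMED that this set must be non-empty.

```python
def up(G, a):
    """All nodes b with a <= b."""
    return [b for b in G["T"] if (a, b) in G["leq"]]


def val_term(G, t, a):
    if t[0] == "name":                                   # clause (i)
        return t[1] if t[1] in G["dom"][a] else None
    body = t[1]                                          # clause (iii): ("iota", body)
    if any(val(G, body(("name", d)), b) is None
           for b in up(G, a) for d in G["dom"][b]):
        return None
    # ASSUMPTION: satisfiers are taken at the node a and must exist.
    sat = sorted(d for d in G["dom"][a] if val(G, body(("name", d)), a) == 1)
    if not sat or any((d, e) not in G["rho"][a] for d in sat for e in sat):
        return None
    return sat[0]


def val(G, f, a):
    kind = f[0]
    if kind == "bot":                                    # clause (v)
        return 0
    if kind == "pred":                                   # clause (iv), assumed reading
        vs = [val_term(G, t, a) for t in f[2]]
        if None in vs:
            return None
        return 1 if tuple(vs) in G["phi"][(f[1], a)] else 0
    if kind == "eq":
        v, w = val_term(G, f[1], a), val_term(G, f[2], a)
        if v is None or w is None:
            return None
        return 1 if (v, w) in G["rho"][a] else 0
    if kind in ("and", "or"):                            # clauses (vi), (vii)
        v, w = val(G, f[1], a), val(G, f[2], a)
        if v is None or w is None:
            return None
        return min(v, w) if kind == "and" else max(v, w)
    if kind == "imp":                                    # clause (viii)
        pairs = []
        for b in up(G, a):
            v = val(G, f[1], b)
            if v is None:
                return None
            w = val(G, f[2], b)
            if v == 1 and w is None:
                return None
            pairs.append((v, w))
        return 1 if all(v == 0 or w == 1 for v, w in pairs) else 0
    body = f[1]                                          # clauses (ix), (x)
    table = {(b, d): val(G, body(("name", d)), b)
             for b in up(G, a) for d in G["dom"][b]}
    if None in table.values():
        return None
    if kind == "forall":
        return 1 if all(v == 1 for v in table.values()) else 0
    return 1 if any(table[(a, d)] == 1 for d in G["dom"][a]) else 0


# Example structure: Q(a) and P(a) become true only at node 1.
G = {
    "T": [0, 1],
    "leq": {(0, 0), (0, 1), (1, 1)},
    "dom": {0: {"a"}, 1: {"a", "b"}},
    "rho": {0: {("a", "a")}, 1: {("a", "a"), ("b", "b")}},
    "phi": {("Q", 0): set(), ("Q", 1): {("a",)},
            ("P", 0): set(), ("P", 1): {("a",)}},
}
Q = lambda t: ("pred", "Q", [t])
P_OF_DESC = ("pred", "P", [("iota", Q)])                 # P(iota x Q(x))
UNIQUE_Q = ("and", ("exists", Q),
            ("forall", lambda x: ("forall", lambda y:
                ("imp", ("and", Q(x), Q(y)), ("eq", x, y)))))
GUARDED = ("imp", UNIQUE_Q, P_OF_DESC)
```

In this example P(ιx Q(x)) is undefined at node 0, where nothing satisfies Q, but defined and true at node 1; the implication with antecedent ∃₁x Q(x) is nevertheless defined and true at node 0, since clause (viii) only requires the consequent to be defined where the antecedent is true.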
References
[1] P. Martin-Löf, An intuitionistic theory of types, mimeographed, Stockholm (1973).
[2] S. Stenlund, The logic of description and existence, Philosophical Studies (Philosophical Society and the Department of Philosophy, University of Uppsala, Uppsala, 1973).