HARTRY FIELD
SAVING THE TRUTH SCHEMA FROM PARADOX Received 10 July 2001; received in revised version 2 August 2001
ABSTRACT. The paper shows how we can add a truth predicate to arithmetic (or formalized syntactic theory), and keep the usual truth schema Tr(A) ↔ A (understood as the conjunction of Tr(A) → A and A → Tr(A)). We also keep the full intersubstitutivity of Tr(A)) with A in all contexts, even inside of an →. Keeping these things requires a weakening of classical logic; I suggest a logic based on the strong Kleene truth tables, but with → as an additional connective, and where the effect of classical logic is preserved in the arithmetic or formal syntax itself. Section 1 is an introduction to the problem and some of the difficulties that must be faced, in particular as to the logic of the →; Section 2 gives a construction of an arithmetically standard model of a truth theory; Section 3 investigates the logical laws that result from this; and Section 4 provides some philosophical commentary. KEY WORDS: conditionals, law of excluded middle, paradoxes, truth
1. I NTRODUCTION
The existence of sentences that assert their own untruth poses a wellknown conflict between classical logic and the classical theory of truth. Let Q be such a sentence: more precisely, a sentence for which the background syntactic theory (or it plus uncontroversial empirical assumptions) implies (1) Q if and only if ¬Tr(Q), where Q is a name of Q and ‘Tr’ is the truth predicate. The classical theory of truth includes each sentence of form (T) Tr(A) if and only if A, and in particular (2) Tr(Q) if and only if Q, But (1) is inconsistent with (2) in classical logic. One might be tempted to try to challenge the assumptions about syntax that lead to (1), but there are many different routes to self-referential sentences, and there appears to be no reasonable way to block them all. Besides, there are rather similar paradoxes (e.g., heterologicality) that don’t Journal of Philosophical Logic 31: 1–27, 2002. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
2
HARTRY FIELD
depend on self-reference in this sense. The only live possibilities, I believe, are modifying classical truth theory and modifying classical logic. The approach that keeps classical logic while modifying classical truth theory has been well explored; see especially Friedman and Sheard [9]. I think it is fair to say that the weakenings of classical truth theory required by this approach are not very appealing. (It isn’t just that no such weakening seems to reflect the ordinary concept of truth; that wouldn’t be such a big deal. The problem is that no such weakening provides a satisfying substitute concept.) For instance, an initially attractive idea (suggested by supervaluational approaches to vagueness) is to weaken (T) to (T ) If Tr(A) then A, and supplement this with partial converses. Unfortunately, the Q-instance of (T ), together with (1), is enough to imply both Q and ¬Tr(Q); and while there’s no inconsistency here, it is certainly odd to assert a sentence while at the same time denying its truth. Unless one is willing to swallow this, one must give up even (T ). One possibility is to abandon both directions of (T), but replace them with the following four inferential rules I II III IV
True(A) A, A True(A), ¬A ¬True(A), ¬True(A) ¬A.
(These aren’t by themselves enough to allow the intersubstitutivity of A with Tr(A) even in truth functional embeddings, which in classical logic would lead to paradox.) But these rules preclude many natural additional laws. Indeed, the first two rules alone are incompatible with laws that assert that they and modus ponens are truth-preserving (indeed, which assert just the instances of these, even if they don’t license the generalizations).1 Similar results hold for some other combinations of the above rules. See [9] for much more detail about the consistent options. McGee [12, p. 29] shows that the options are further constrained if we demand ω-consistency in the underlying syntactic theory, as we surely should. The lack of appeal of the classical logic solutions suggests that it might be better to keep classical truth theory while weakening classical logic. But as far as I have been able to ascertain, there are no satisfactory ways to do this in the literature. Kripke [11] showed that if we weaken classical logic to a logic appropriate to Kleene’s (strong) 3-valued truth tables, we can preserve the intersubstitutivity of A and Tr(A) in all contexts. That
SAVING THE TRUTH SCHEMA FROM PARADOX
3
sounds pretty good, until one realizes that the logic in question does not validate the schema If A then A. (Because of that, one can also not derive either (T ) or its converse.) Restricting ourselves to such a weak logic would require a drastic and awkward revision of our normal inferential practices. One appealing idea is to add a new conditional to the language. The Kleene “conditional” A ⊃ B is defined from negation and disjunction in the usual way (¬A ∨ B); since it is essential to the Kleene approach to give up excluded middle as a general law, it is essential to give up A ⊃ A as a general law. But maybe we could keep the Kleene logic for ¬, ∧, ∨, ∀ and ∃, but add a new conditional A → B, not definable from them, for which we could consistently add A → A as a general law? (Also a corresponding biconditional ↔, with A ↔ B the conjunction of A → B and B → A.) Of course, for this to be of interest we’d need the conditional to have other nice properties. And we’d want to have Tr(A) intersubstitutable with A even inside the conditional; since we have A ↔ A, such intersubstitutivity would give us the schema (T), stated in terms of the new biconditional. If we are to do this, an important obstacle that must be overcome is a paradox variously attributed to Curry, Löb and Henkin. For any sentence B, no matter how dubious (e.g., ‘Santa Claus exists’, or an ordinary Liar sentence), consider a sentence CB that says ‘If CB is true then B’. (Or ‘If the sentence in quotation marks on lines 22–24 of page 3 of “Saving the Truth Schema from Paradox” is true, then B’.) Using the truth schemas plus seemingly innocent principles involving ‘if. . . then’ (and without using negation or any other connectives except for conjunction) we can prove B. The simplest derivation: 1. 2. 3. 4. 5. 6.
Tr(CB ) ↔ (Tr(CB ) → B) Tr(CB ) → (Tr(CB ) → B) Tr(CB ) → B
(Tr(CB ) → B) → Tr(CB ) Tr(CB ) B
(Truth schema applied to CB ) (1, defn of ↔, and ∧-elimination) (2, contraction) (1, defn of ↔, and ∧-elimination) (3, 4, modus ponens) (3, 5, modus ponens)
So any approach to the paradoxes that adds a new conditional to Kleene logic must do it in a way that blocks the reasoning to this and similar paradoxes.2 There is no difficulty in getting a simple and natural logic for restricted applications of the →: applications in which neither the antecedent nor
4
HARTRY FIELD
the consequent of the →-sentence contains an ‘→’. The problem is to get reasonable laws governing the embeddings of an ‘→’ inside the scope of an ‘→’. One might be tempted to simply disallow such embeddings (or more weakly, to disallow those →-sentences that have ‘→’s in their antecedents; or alternatively, those with ‘→’s in their consequents). Such a drastic restriction would be extremely awkward in practice; in addition, it would be incompatible with the goal of keeping schema (T) (with ‘if and only if’ read as the conjunction of two ‘→’-statements). For the instances of (T) where A contains an unembedded ‘→’ are conjunctions of two conditionals with embedded ‘→’s (one with the embedding in the antecedent and the other in the consequent). How else might we block the paradox? One possibility might be to restrict modus ponens (taken as the rule A, A → B B) to the case where A does not itself contain an →, thereby invalidating Step 5 of the above derivation. This too would be awkward in practice, for we would need to keep track of which sentences that we are representing as sentence letters really have ‘→’s hidden in them; but one could learn to live with that. More serious is that since the goal is to keep the equivalence of Tr(A) to A, we would need to also rule out applying the rule when A contains a predication of ‘Tr’ to a sentence with an arrow. (Thus if we take the usual arithmetic approach to achieving self-reference, Step 6 as well as Step 5 of the above derivation would be invalidated.) We would presumably also need to rule out application of the rule when the premise applies ‘Tr’ to all members of a class that may contain sentences with an ‘→’, e.g., ‘All sentences on the blackboard are true’. This would very substantially restrict the use of modus ponens in reasoning. I think that by far the least drastic course is to give up “contraction”, i.e. the inference from A → (A → C) to A → C. (This is the only one of the principles used in the derivation that is explicitly a principle about how embeddings of ‘→’ function.) Giving up contraction isn’t as unappealing as it may sound: there’s no need to give up the inference from A ∧ A → C to A → C; it’s just that A → (B → C) won’t be equivalent to A ∧ B → C. I think that abandoning contraction is the only serious possibility for avoiding the paradox while keeping the unrestricted truth schema. Of course, one needs to show that there is a reasonable logic of the conditional that follows this route and that avoids all the semantic paradoxes, not just this one. There have been efforts to develop a logic of a conditional that can be added onto Kleene logic while avoiding such paradoxes as Curry’s. The most notable – because it comes so tantalizingly close to success – involves the conditional of Łukasiewicz continuum-valued semantics, aka “fuzzy
SAVING THE TRUTH SCHEMA FROM PARADOX
5
logic”. Here the possible “semantic values” or “degrees of truth” are the members of the interval [0, 1], with 1 “best”. The semantic value |X → Y | of a conditional X → Y is taken to be 1 if the “defect” |X| − |Y | is 0 or negative, and otherwise is 1 minus the defect. |A∧B| is the minimum of |A| and |B|; |A ∨ B|, the maximum; and |¬A| is 1 − |A|. (The strong Kleene semantics can be viewed as the result of dropping the → and restricting the allowed semantic values to {0, 12 , 1}.)3 Given this, it is easy to see that the Liar sentence can be consistently assigned value 12 , and that whatever can be convalue p in [0, 1] is assigned to the sentence B the value p+1 2 sistently assigned to the corresponding Curry sentence CB . According to this semantics for the conditional, the Curry argument goes wrong because the contraction principle is invalid: for instance, the contraction in Step . The second modus ponens 3 lowers the degree of truth from 1 to p+1 2 lowered it further, to p, but modus ponens is still valid in the sense of delivering value 1 when both premises have value 1; contraction is thus the sole culprit in the argument. So the Łukasiewicz semantics handles the Curry paradox nicely. More generally, there is a simple and appealing proof, based on a slight generalization of the fact that there is no way to get continuously from one side of a square to the opposite side without crossing the diagonal, that any “ordinary” paradoxical sentence can be consistently assigned a value in that semantics; where an “ordinary” paradoxical sentence is one in which the only role of the quantifiers is to achieve self-reference.4 Unfortunately, there are less ordinary paradoxical sentences which can not be handled with Łukasiewicz semantics, as noted in Restall [15] and Hajek et al. [10]. The latter paper shows that if you construe classical truth theory narrowly, as simply the result of adding all instances of the truth schema to the underlying syntactic theory, then classical truth theory is consistent in the Łukasiewicz semantics. But that piece of good news is put in perspective by two pieces of bad news. First, although we have consistency, we do not have ω-consistency: Restall’s paradoxical sentence,5 also used by Hajek et al., shows that the truth schema can only be satisfied in a nonstandard model of the underlying arithmetic or syntax, i.e. it can only be satisfied by assuming the existence of sentences that are not genuinely finite in length. That is clearly unsatisfactory. In case that piece of bad news isn’t enough, Hajek et al. also show that if we take classical truth theory more broadly, to include not only the schema but the intersubstitutivity of Tr(A) with A in all contexts, we lose even bare consistency. In what follows I will try to do better on both counts. I will prove that a certain logic that extends the logic of the strong Kleene truth tables, but contains an added fully embeddable conditional, fits with classical truth
6
HARTRY FIELD
theory. (1) Here “fits with” will be understood as requiring not merely consistency but ω-consistency: my requirement of fit is that there be a standard model of the syntax. Since I will follow the usual procedure of assuming the syntax defined in arithmetic, the requirement will be that there is an arithmetically standard model. In addition to this, (2) I will interpret ‘classical truth theory’ in a broad sense, that includes both (i) the instances of (T) (with ‘if and only if’ understood as the new biconditional ↔) and (ii) the full intersubstitutivity of Tr(A) with A in all contexts (even within the scope of an →). Indeed, given the underlying logic that I will validate for the conditional, (ii) will follow from (i). (The implication in the other direction is trivial, given that the logic contains A ↔ A.) In fact, I will interpret classical truth theory to include considerably more: I will take it also to include appropriate generalizations, such as “Every sentence of the form A → A is true”. But it is impossible to specify which generalizations are appropriate until the logic itself is specified. Fortunately, the method of construction I’ll employ gives you the generalizations automatically, once you have the instances and the logic, so there is no need to worry about the generalizations for a while. To repeat, my task is to prove that a certain logic that extends the logic of the strong Kleene truth tables, but contains an added fully embeddable conditional, fits with classical truth theory, in a strong sense of ‘classical truth theory’ and a strong sense of ‘fits with’. I make no claim to the optimality of the chosen logic, but it is the strongest one (with or without contraction) for which I am aware of such a proof.6 Like the Łukasiewicz “near miss”, this will be a logic in which contraction fails. It would be nice to have not only a reasonable and sufficiently powerful logic for ‘if . . . then’ that allows us to keep classical truth theory, but also some sort of interpretation of ‘if . . . then’ that “explains” the failure of contraction (and any other classical principles governing ‘if . . . then’ that need to be abandoned). The Łukasiewicz account, though unsuccessful on other grounds, could be viewed as satisfying such a demand: it could be viewed as explaining the failure of contraction in terms of degrees of truth. Here, I must admit, I do less well: I do not offer a serious proposal for how to understand ‘if . . . then’ in terms of which the failure of contraction could be justified, but merely offer a proof of the ω-consistency of truth theory in the logic. The ω-consistency proof does rely on an interpretation of ‘if . . . then’, in terms of provability in a certain arithmetic theory.7 But this interpretation as it stands is too specialized to provide a plausible general account of the meaning of ‘if . . . then’ – at the very least, one would need to generalize it to provability in some broader theory – and it is not intended as such. I suspect that it is possible to prove ω-consistency via a more
SAVING THE TRUTH SCHEMA FROM PARADOX
7
generally applicable interpretation of ‘if . . . then’, but the attempt raises issues I will not address. Even without that, though, the paper does provide a reasonable logic that allows the preservation of classical truth theory, and seems plausibly to accord with the legitimate meaning of ‘if . . . then’; it is only the justification of the logic about which it would be desirable to say more. A final preliminary remark is needed, to clarify “the logic of the Kleene truth tables”. There are several ways to read an implication relation off the (strong) Kleene truth tables. In the weak sense, implies B iff for any valuation v, (W) if all members of have value 1 in v, B has value 1 in v. (I use 0, 12 and 1 as the three semantic values, where the highest value 1 is the one often called “truth” and the lowest value 0 the one often called “falsity”; the middle value 12 is the one often called “undefined”.)8 There is also a strong sense of ‘implies’, where (W) is replaced by the condition (S) the greatest lower bound of the semantic values of all members of in v is less than or equal to the semantic value of B in v. Clearly, A strongly implies B iff both A weakly implies B and ¬B weakly implies ¬A. While the distinction between these notions of implication is well-known, I’m not sure that there is a standard terminology; but let K3 be the logic obtained from the Kleene tables using (W) as the implication relation, and K3− the logic obtained by using (S). Note that because K3 employs the weaker implication relation, it is the stronger logic, in the sense that more claims of implication are correct in it. Although the construction that follows will yield the ω-consistency of classical truth theory even in the stronger logic K3 , it will be important to pay attention to a theory built upon the weaker logic K3− in doing the construction. 2. T HE C ONSTRUCTION
Let L0 be a standard language of arithmetic (say with ¬, ∧, ∨, ∀ and ∃ the primitive logical operators) with an additional one place predicate ‘Tr’. Let L be L0 with a new binary connective →. I will regard → as binding variables in its scope: thus A → B will count as a sentence, even if A and B contain free variables. (I will make some remarks late in Section 3 on an alternative construction that takes a variable to be free in A → B if and only if it is free in A or in B.) Fix a Gödel numbering for L, and for any expression e let e be the numeral for its Gödel number.
8
HARTRY FIELD
I take a quantified version of K3− as a minimal logic for L. This minimal logic is “blind to the meaning of →”: it treats sentences whose main connective is → as essentially atomic (or better, it treats occurrences of such sentences that aren’t themselves in the scope of an → as atomic.) For our purposes it isn’t really necessary to formalize the logic but, for the record, one way to do so [13, §4.6, §6.4] is to base it on a quantified version of a slightly simpler logic FDE.9 This is given by a natural deduction system with the usual introduction and elimination rules for ∧, ∨, ∀ and ∃, together with (1) the new law ∀x(A ∨ Bx) A ∨ ∀x Bx when x is not free in A, and (2) the interdeducibility of each of the following pairs: (i) (ii) (ii*) (iii) (iii*)
¬¬A with A, ¬(A ∧ B) with ¬A ∨ ¬B, ¬∀xA with ∃x¬A, ¬(A ∨ B) with ¬A ∧ ¬B, ¬∃xA with ∀x¬A.
K3− results from FDE by adding the rule A ∧ ¬A B ∨ ¬B; K3 results by adding the stronger rule A ∧ ¬A B. FDE could be used in place of K3− in what follows, without affecting the main results that I will discuss (though A ∧ ¬A → B ∨ ¬B will come out valid if we use K3− in the construction but not if we use FDE, and later on I will note a less obvious validity involving quantifiers that we get from K3− but not from FDE). I now expand quantified K3− (alternatively, quantified FDE) to a fuller deductive theory I’ll call N − . N − is to have the effect of classical logic for the purely arithmetic fragment, so we’ll add all (closures of) formulas of form A ∨ ¬A for which A doesn’t contain ‘Tr’ or →, and we’ll add the K3 rule (A ∧ ¬A B) for such A also. We’ll also add the usual axioms of Peano arithmetic other than the induction axiom, and add the induction rule F (0) ∧ ∀x[¬F (x) ∨ F (x )] ∀xF (x); we allow F (x) to be any formula of L, even one containing ‘Tr’ and ‘→’. (The usual axiom schema for induction, restricted to the purely arithmetic sublanguage, follows from this, given that we have excluded middle for that sublanguage.)10 In addition, we add the axiom ‘∀x[Sent(x) ∨ ¬Tr(x)]’, where ‘Sent’ is the usual formulation of the property of being the Gödel number of a sentence, and the rules Tr(A) A, A Tr(A), ¬Tr(A) ¬A and ¬A ¬Tr(A) when A is any sentence of L, even one containing ‘Tr’ and ‘→’. That’s N − . N0− will be the subtheory obtained from N − by eliminating all sentences containing →. It may seem that → plays no role in the axioms or rules of N − and that therefore the difference is pedantic, but this isn’t quite
SAVING THE TRUTH SCHEMA FROM PARADOX
9
so: both the induction rule and the truth rules of N − go beyond those of N0− , for they include →-formulas and →-sentences in their scope. Finally, I’ll let N and N0 be the results of adding the full K3 rule to N − and N0− ; in other words, N and N0 are to be theories in K3 , whereas N − and N0− are theories in K3− (or FDE). There is a natural notion of arithmetically standard Kleene-model of L0 : such a model has the natural numbers as domain, assigns the usual numbers and functions to the corresponding arithmetical constants and function symbols of L0 , and assigns to the identity symbol the function that takes n, m to 1 if n = m and 0 otherwise. It also assigns to ‘Tr’ some function M from the natural numbers to {0, 12 , 1}, giving the semantic values of sentences of form ‘Tr(t)’. M is the only ingredient that varies from one arithmetically standard model to the next. For any sentence B of L0 , let |B|M be the result of evaluating B by the usual Kleene rules on the basis of the model determined by M. Clearly, |B|M will be 1 if B is either an arithmetical instance of excluded middle or an axiom of classical number theory. And the rules of K3− “preserve goodness”, i.e. if B is a rule of K3− then |B|M ≥ |A|M , for at least one A in . The K3 rule also “preserves maximal goodness”, in the sense that if the premises have value 1 so does the conclusion; and restricted to arithmetic language it preserves goodness more generally. The only remaining axioms and rules of N0 or N0− involve ‘Tr’. Let us say that the arithmetically standard model determined by M validates N0 (or equivalently, validates N0− ) if (i) ∀n [if n is not (the Gödel number of) a sentence of L0 , M(n) = 0]; and (ii) for any sentence A of L0 , |Tr(A)|M = |A|M . The terminology is appropriate: this is just what is required of an arithmetically standard model if it is to give value 1 to the axiom of N0 that involves ‘Tr’ and to make all the truth rules of N0 preserve either goodness or maximal goodness. (Since the set of truth rules is closed under contraposition, the distinction between their collectively preserving maximum goodness and their collectively preserving goodness vanishes.) It is clear from Kripke [11] that there are arithmetically standard models of L0 that validate N0 , but that doesn’t take us far enough: we want to have some sort of model for L, which validates the truth rules for sentences containing → as well as for →-free ones (and similarly, though more trivially, for the induction rule); and we want it to reflect something about the logic of →, so that it validates a substantial extension of the full N. How might we do this?
10
HARTRY FIELD
Well, N − is a formal theory, so there is an arithmetic RE-formula Pr(x, y), true of numbers n and m iff n and m are the Gödel numbers of formulas A and B for which A N − B. (It is crucial to use a 2-place provability predicate rather than simply provability of the material conditional in the construction to follow, since not all sentences of form A ⊃ A are provable in N − (or N). The construction to follow would make sense with a 2-place predicate representing provability in the full N rather than in N − , but there is a good reason for preferring the predicate for N − ; it will emerge in the next section.11 ) Let us use this provability predicate to specify a mapping of formulas of L into formulas of L0 ; this will allow us to use models of L0 as models of L. Here’s the mapping: If A is atomic, A is A, (¬A) is ¬A , (A ∧ B) is A ∧ B , (A ∨ B) is A ∨ B , (∀vA) is ∀vA , (∃vA) is ∃vA , (A → B) is Pr(A , B ). (Note that (A → B) never contains either free variables or an →, even when A and B do, so the mapping does indeed take sentences of L into sentences of L0 . Also, if the only occurrences of ‘Tr’ in A are “sealed” within the scope of an →, A doesn’t contain ‘Tr’, so full classical logic applies to A in that case.) So given a valuation function M for L0 , we can extend the definition of |B|M to sentences of L: if B is a sentence of L0 L, |B|L M is just |B |M . (We could similarly extend a valuation of formulas relative to an assignment to the free variables, but there’s no need for this since every object in the domain has a name.) We now need to prove the Kripke result in a slightly extended form: we need to show that there are arithmetically standard models of L that validate N. Validating N as opposed to N0 will require only an obvious modification of (i) and (ii): we need (i*) ∀n[if n is not (the Gödel number of) a sentence of L, M(n) = 0], L0 L (ii*) For any sentence A of L: |Tr(A)|L M = |A|M ; that is, |Tr(A)|M = 0 |A |L M . 0 (Since A is just A , (ii*) implies that for any sentence A of L, |Tr(A)|L M L0 = |Tr(A )|M .) The arithmetical standardness of the model will guarantee that the induction rule extends to the full N. Indeed, since classical logic
SAVING THE TRUTH SCHEMA FROM PARADOX
11
applies to a formula F (x) whenever F (x) contains no occurrences of ‘Tr’ that isn’t sealed inside an →, we get that even the induction axiom applies to such formulas F (x), by the reasoning of note 10. (I will have more to say on induction at the end of Section 3.) To get the model, let’s define an operator J on sets X of sentences of L. For any such set X, we inductively define JX (by complexity) as follows:
0. Only sentences are in JX. 1a. An identity sentence (the only kind of atomic sentence in the language of arithmetic itself) is in JX iff the denotations of its terms are the same; 1b. The negation of an identity sentence is in JX iff the denotations of its terms are different. 2a. A sentence of form Tr(t) is in JX iff the denotation of t is (the Gödel number of) something that is both a sentence and in X; 2b. A sentence of form ¬Tr(t) is in JX iff the denotation of t is either not (the Gödel number of) a sentence or (the Gödel number of) a sentence whose negation is in X. 3a. A sentence of form A → B is in JX iff (A → B) is in X. [Note that this makes sense even if A and B aren’t themselves sentences.] 3b. A sentence of form ¬(A → B) is in JX iff ¬(A → B) is in X. 4. ‘¬¬A’ ∈ JX iff A ∈ JX. 5a. ‘A ∧ B’ ∈ JX iff A ∈ JX and B ∈ JX. 5b. ‘¬(A ∧ B)’ ∈ JX iff ‘¬A’ ∈ JX or ‘¬B’ ∈ JX. 6a. ‘∀xA’ ∈ JX iff for all closed t, ‘A(x/t)’ ∈ JX; 6b. ‘¬∀xA’ ∈ JX iff for some closed t, ‘¬A(x/t)’ ∈ JX. (Regard ∨ and ∃ as defined in the usual way, or introduce corresponding rules for them.) Obviously, J is monotone; so by the argument of Kripke [11], we get that if X is J-sound in the sense that X ⊆ JX, then there is a smallest Y such that X ⊆ Y and Y = JY ; call this J X. It is also clear that if a sound X contains no arithmetic falsehoods and is contradiction-free in the sense of never containing both A and ¬A for any A, the same is true of J X. For our purposes we can stick to the case where the initial set X is ∅ and so trivially sound. Let K be J ∅ (the minimal fixed point of the operator J). Note that a sentence A is in K iff A is in K.
12
HARTRY FIELD
We’re now in a position to specify the valuation function M for Tr that determines the model: 1 if n is (the Gödel number of) a member of K; 0 if n is not (the Gödel number of) a sentence of L or is (the M(n) = Gödel number of) a sentence of L whose negation is in K; 1 otherwise. 2 A straightforward induction on complexity on sentences of L0 (using the fact that JK = K) establishes that for any such sentence A, |A|L0 is 1 iff A ∈ K and 0 iff ¬A ∈ K. (This covers all sentences of form Tr(n), even when n happens to be the Gödel number of a sentence with an →.) It follows that for any sentence A of L, |A|L is 1 iff A ∈ K and |A|L is 0 iff ¬A ∈ K; for, |A|L is |A |L0 , A ∈ K iff A ∈ K, ¬A ∈ K iff ¬A ∈ K. For any sentence A of L, Tr(A) ∈ K iff A ∈ K, so Tr(A) has the same value as A. This validates (ii*), and (i*) is trivial. So we have an arithmetically standard model of N. But what laws of the conditional (besides those in N) are validated by the model?
3. W HAT L AWS OF → ARE VALIDATED ?
Let’s look first at sentential logic. Given that the Łukasiewicz continuumvalued approach to the paradoxes is such a near miss, it is natural to compare the results of the construction to that. A first observation is that under this model → “seals off non-classicalness”: if the only occurrences of the non-arithmetical predicate ‘Tr’ in A are within the scope of an →, then A ∨ ¬A is in K (since A is arithmetical). This is enough to give us a large variety of special cases of excluded middle, in particular (SEM) (A → B) ∨ ¬(A → B), (SEM) is certainly not valid in the Łukasiewicz system, so this model validates some things that the Łukasiewicz semantics doesn’t. Of course, there must also be sentences valid in Łukasiewicz and not validated in this model, given that this model is arithmetically standard and validates the truth schema, whereas no arithmetically standard model of the Łukasiewicz system does so. It will be illuminating to consider an axiomatization of the Łukasiewicz sentential logic and see which of its sentential axioms and rules are validated in the current model.12 A useful complete axiomatization of (the-
SAVING THE TRUTH SCHEMA FROM PARADOX
13
oremhood for) this logic is given by Priest [14], and is as follows (I’ve included one redundant axiom in brackets): A1. A → A, A2a. A → A ∨ B, A2b. B → A ∨ B, A3a. A ∧ B → A, A3b. A ∧ B → B, A4. A ∧ (B ∨ C) → (A ∧ B) ∨ (A ∧ C), A5. ((A → B) ∧ (A → C)) → (A → (B ∧ C)), A6. ((A → C) ∧ (B → C)) → ((A ∨ B) → C), A7. ¬¬A → A, A8. (A → ¬B) → (B → ¬A), A9. (A → B) → ((B → C) → (A → C)), [A10. (A → B) → ((C → A) → (C → B))], A11. A → ((A → B) → B), A13. A → (B → A), A14. ((A → B) → B) → (A ∨ B), R1. A, A → B B, R2. A, B A ∧ B. The missing A12 is an important non-axiom of Łukasiewicz: the contraction principle, which was argued in Section 1 to be the obvious culprit in the Curry paradox. A12. (A → (A → B)) → (A → B). Which of these are validated by the construction of Section 2? It’s straightforward to check that any instance of A1–A4 or A7 is validated: e.g., an instance of A4 is in K iff Pr(A ∧ (B ∨ C ), (A ∧ B ) ∨ (A ∧ C ))
is in K; but arithmetical sentences are in K whenever they are true, and this is a truth about derivability in N − . (Nothing requires that A, B and C be sentences; the proof is the same if they contain free variables. Similarly in all the cases below.) A5 and A6 aren’t much harder. Take an instance of A5; it’s in K if and only if Pr[Pr(A , B ) ∧ Pr(A , C ), Pr(A , B ∧ C )]
14
HARTRY FIELD
is in K, i.e. if and only if the latter is true. But that is clearly so: it is a theorem of classical number theory, hence of N − , that if there is an N − derivation of B from A and an N − -derivation of C from A then there is an N − -derivation of B ∧ C from A . A6 is analogous. The verification of A8 is similar, but deserves a special comment. An instance of A8 is in K iff Pr(Pr(A , ¬B ), Pr(B , ¬A ))
is in K, which holds iff it’s true. But it’s a theorem of classical number theory, hence of N − , that if there is an N − -derivation of ¬B from A then there is an N − -derivation of ¬A from B . The comment is that this depends on the fact that N − -derivations are closed under contraposition: if A N − ¬B then B N − ¬A. This explains my choice of a provability predicate for N − over one for N. A9 and A10 are only slightly more difficult. Take an instance of A9; it’s in K if and only if Pr[Pr(A , B ), Pr[Pr(B , C ), Pr(A , C )]]
is in K, i.e. iff it’s true, i.e. iff Pr(A , B ) N − Pr[Pr(B , C ), Pr(A , C )].
Letting Bew be the standard predicate for arithmetic theoremhood, this is equivalent to Pr(A , B ) N − Bew[Pr(B , C ) ⊃ Pr(A , C )],
since Pr(B , C ) and Pr(A , C ) are purely arithmetic claims. But Pr(A , B ) N − Bew[Pr(A , B )], so it suffices to show Bew[Pr(A , B )] N − Bew[Pr(B , C ) ⊃ Pr(A , C )];
and that’s obvious, since N − Pr(A , B ) ⊃ [Pr(B , C ) ⊃ Pr(A , C )]. A10 (which, though redundant in Łukasiewicz, is not in the system we have validated so far) is analogous. Turning to the rules, R2 is trivial. So is R1 (modus ponens for →): if A and A → B are in K, so are A and Pr(A , B ); but Pr(A , B ) is arithmetical and only true arithmetical claims are in K, so A N − B ; and K is closed under N − , in which case B is in K, and so B is in K. So far, we’ve validated the system called TW by Priest [14, p. 193], plus (SEM). Greg Restall called my attention to another axiom that goes beyond TW, and which is not valid in Łukasiewicz. It’s called conjunctive syllogism, and is a symmetric variant of A9 and A10:
SAVING THE TRUTH SCHEMA FROM PARADOX
15
(CS) (A → B) ∧ (B → C) → (A → C). It’s valid here, and the proof is actually simpler than that of A9: an instance is equivalent to an instance of Bew(Bew(A ⊃ B ) ∧ Bew(B ⊃ C ) ⊃ Bew(A ⊃ C )),
which is obviously true and so in K. From (CS) we can easily obtain (GCSR) [X → (A → B)] ∧ [X → (B → C)] X → (A → C).13 We can also validate another law that goes beyond TW (a weakened form of A13): A13*. (A → B) → [C → (A → B)]. An instance of this translates to Pr[Pr(A , B ), Pr[C , Pr(A , B )]].
This is effectively equivalent to the more readable Pr[Bew(A ⊃ B ), Bew[C ⊃ Bew(A ⊃ B )]],
which is a true arithmetical sentence and hence in K. TW together with (CS) and A13* is strong enough to prove all sentences of form (A ↔ B) → (A ↔ B ), where A contains A as a subformula and B results from it by substituting B for any number of occurrences of A.14 (We must for now assume that the substituted occurrences of A are not in the scope of quantifiers in A , since we don’t yet have quantifier axioms, but the result will extend to the general case once such axioms are added. Even now, the formulas can contain free variables.) The proof for a single substitution is a straightforward induction on the complexity of the embedding of A in A .15 (It doesn’t require CS.) The proof for an arbitrary number of substitutions is an induction on the number of occurrences substituted for; the basis (n = 0) is by A1 and A13*, and the induction step results from the single step case by using (GCSR). As an important corollary, we get as theorems all sentences of form ↔ † , where † results from by replacing certain instances of sentences Ai by Tr(Ai ) or vice versa (for one or more i). This corollary seems to me to be very important to the philosophical interest of the construction. It remains to look at A11, A13, and A14; also A12, which as I’ve said is not an axiom of Łukasiewicz. These all fail on the present construction,
16
HARTRY FIELD
and in each case the failure stems from Gödel’s second incompleteness theorem and related results such as Löb’s theorem. For the failure of A12, let A be ‘0=0’ and B be arithmetical. Then (A → B) is equivalent to Bew(B), where again Bew is a provability predicate for classical arithmetic. So the -translation of an instance of contraction is Pr[Bew(Bew(B)), Bew(B)]. But by Löb’s theorem that implies Bew(Bew(B)), which is certainly not an arithmetical theorem when B is ‘0=1’, so nothing that implies it is in K. There are weaker forms of contraction that lead to variants of the Curry paradox. When m > n > 1, consider the principle (A →m B) → (A →n B), where A →m B abbreviates (A → (A → . . . (A → B))), with m arrows. This too is invalidated, by the same choice of A and B: for the -translation is Pr[Bewm (B), Bewn (B)], and (when m > n) that implies Bewn+1 (B) (by Löb’s theorem in the case where m = n + 1, and using in addition that Bewk entails Bewk+1 in the case of higher m); again, an arithmetic falsehood and so not in K. Moving on to A14 (a law of less than transparent appeal, though it is valid for the classical ⊃), it fails too. This time let A be ‘0=1’ and B be arithmetical. A → B will translate to a theorem, so (A → B) → B will translate to an equivalent of Bew(B), and since ¬A is a theorem, ((A → B) → B) → (A ∨ B) will translate to Pr[Bew(B), B]. Again this fails for some B, e.g., ‘0=1’, by the second incompleteness theorem. A11 fails too. Let A be ‘0=0’ and B be ‘0=1’. A11 then reduces to Pr[Bew(0 = 1), 0 = 1]. A11 is equivalent to the permutation law [A → (B → C)] → [B → (A → C)], given the axioms validated so far.16 I was initially disappointed that A11 failed, but as Restall pointed out to me, its failure is inevitable, given that the system validates the highly desirable (CS) and avoids contraction (and contains the other axioms we’ve validated). For an instance of A11 is B → [(B → B) → B], which yields B ∧ [B → C] → [(B → B) → B] ∧ [B → C]. And an instance of (CS) is [(B → B) → B] ∧ [B → C] → [(B → B) → C)]. These together yield B ∧ [B → C] → [(B → B) → C)]. By permutation and A1 we can eliminate the B → B, getting (∗) B ∧ [B → C] → C.
SAVING THE TRUTH SCHEMA FROM PARADOX
17
And (∗) suffices for contraction: for [B → (B → C)] → [B ∧ B → B ∧ (B → C)]; so (#) [B → (B → C)] → [B → B ∧ (B → C)]; but by (∗), [B → B ∧ [B → C]] → [B → C], so [B → (B → C)] → [B → C]. The loss of permutation isn’t so bad after all! (I should also point out that we still get (A ∧ B → C) → (B ∧ A → C); the reason that permutation fails is that A → (B → C) isn’t equivalent to A ∧ B → C, as we knew already from the failure of contraction). Incidentally, the distinctive laws for the Łukasiewicz finite-valued logics, ((A →n ¬A) → A) → A, are also all invalidated. (One might have worried that the 3-valued law and therefore the others would hold, given the three-valued starting point.) Letting A be ‘0=1’, (A →n ¬A) is a theorem (for each n), so [(A →n ¬A) → A] is in effect Bew(0 = 1); so each of the characteristic laws for n-valued logic translates to the false claim Pr[Bew(0 = 1), 0 = 1]. We’ve proved a special case of A13, but A13 fails in full generality: let B be ‘0=0’; and let A be the Gödel sentence, or the sentence that declares itself true, or any other →-free sentence (whether in the language of arithmetic or not) that is not provable in N0 and doesn’t prove its own provability. A13 then translates to Pr[A, Bew(A)], which for any such A is false. Turning to quantification, the fact that we have taken → to bind all variables makes it easy to verify the obvious axioms and rules. In particular, we get ∀x(Ax → Bx) → (∀xAx → ∀xBx). (Since the quantification in the antecedent is vacuous, the -translation of this is just: Pr[Pr(A x, B x), Pr(∀xA x, ∀xB x)],
which is obvious by the rules of the underlying derivation procedure.) We also validate all the axioms and rules of K3 ; for if the inference from A1 , . . . , An to An+1 is licensed by the rules of K3 , so is the inference from A1 , . . . , An to An+1 , and for each i, Ai is in K iff Ai is in K. (And as
18
HARTRY FIELD
remarked above, → also “seals off non-classicalness”: in reasoning among formulas in which there is no application of ‘Tr’ outside of the scope of an →, we can reason with quantifiers according to full classical logic.) However, a caution is needed about universal instantiation: the policy of taking → to bind variables means that certain inferences which may look like instances of universal instantiation aren’t . That is so in particular for certain inferences which we would certainly expect to be valid, such as the inference from A → B to Aj → B j , where j is any scheme for replacing variables by terms and Aj and B j the results of performing such replacements on the free variables of A and B (changing bound variables if necessary to avoid conflicts). But though we can’t justify this as a universal instantiation, there is no difficulty in justifying it by another route, namely: if A → B is in K, Pr(A , B ) is in K; so by obvious properties of provability, Pr((A )j , (B )j ) is in K. But (A )j is (Aj ) , and similarly for B. (Reason: if A has → as main connective, A and A have no free variables, so (A )j and (Aj ) are both A ; and in other cases the result is trivial.) So Pr((Aj ) , (B j ) ) is in K, so Aj → B j is in K. We do not of course get analogous justifications for all other “pseudo instantiations”, but we shouldn’t expect them: e.g., from ¬(A → B) we wouldn’t expect to get ¬(Aj → B j ), given our interpretation of →. Nor should we expect to get from (A → B) → C to (Aj → B j ) → C j ; only to (A → B)j → C j , which is (A → B) → C j . It’s also worth considering the special quantifier rule for the Łukasiewicz continuum-valued semantics: (D) ∀x(¬Ax → Ax) → (¬∀xAx → ∀xAx). Whether this is validated depends on whether we use K3− or FDE in the theory N − on which the provability predicate is based. If we use K3− , (D) is validated: from a derivation of Ax from ¬Ax, we can get a derivation of Ax∧¬Ax from ¬Ax, and so by the K3− -rule a derivation of Ay∨¬Ay from ¬Ax. We can also get, by change of variables, a derivation of Ay from ¬Ay and hence from Ay ∨¬Ay. Putting these together, we get a derivation of Ay from ¬Ax. Ordinary quantifier rules then yield a derivation of ∀yAy from ∃x¬Ax, and so of ∀xAx from ¬∀xAx. But if we take FDE as the underlying logic of N − , (D) has counterexamples. Let (0 = 0)n be the conjunction of n conjuncts, each ‘0=0’, and let F (n, x) be a primitive recursive function symbol for a function that applied to a formula x yields the conjunction of that formula with (0 = 0)n . Let A(n) be the Tarskian fixed point of ¬Tr(F (n, x)); i.e. it says “the conjunction of myself with (0 = 0)n isn’t true”. So each A(n) is a different Liar sentence. There is in N − a general derivation of A(x) from ¬A(x), even
SAVING THE TRUTH SCHEMA FROM PARADOX
19
if the underlying logic is FDE, so the antecedent of (D) is provable; but without the K3− rule, the different Liar sentences A(n) are independent, and there is no proof of the claim that all of them hold from the claim that not all of them hold. A semantic valuation in which two of the Liar sentences have incomparable semantic values will bear this out. I mentioned earlier that there is a quite different way to handle quantification, which takes a variable to occur free in A → B iff it occurs free in A or in B. This requires a different provability translation. To facilitate it, it’s convenient to stipulate that the basic arithmetic language is to be primitive recursive arithmetic (PRA). Then for any formula, let’s introduce the symbol [A]. Whereas A is a constant for the Gödel number of the formula A, [A] is to be a term fA (x1 , . . . , xk ) with the same free variables as A has (fA being a primitive recursive function symbol); and relative to an assignment of numbers n1 , . . . , nk to the free variables, [A] denotes the Gödel number of the sentence that results from substituting the numerals for those numbers for the corresponding free variables. For sentences (formulas with no free variables), [A] is effectively equivalent to A: identity sentences of form [A] = A are provable in PRA. We can now use a modified provability-from predicate, Pr , holding only among sentences; and we can modify the -mapping so that (A → B) is Pr ([A ], [B ]). The construction then goes through essentially as before. On this alternative treatment of quantification, we can validate many of the axioms we’d expect: ∀xA → A(x/t) and A(x/t) → ∃xA, when the substitution of t for x is legitimate, ∀x(A ∨ B) → (∀xA ∨ B) and ∃xA ∧ B → ∃x(A ∧ B), when x isn’t free in B, ∃xA → ¬∀x¬A and its converse. We also get ∀-introduction, since a universal quantification gets into K whenever all of its instances do. But we have a problem with rules that involve a more serious interaction between the quantifiers and →; for instance, the rule ∀x(A → B) → (∀xA → ∀xB). For taking A to be x = x and B to have no free variables beyond x, the translation reduces to Pr [∀x(Bew ([B (x)])), Bew (∀xB (x))],
that is, in effect, to Pr [∀n(Bew ([B (n)])), Bew (∀xB (x))],
20
HARTRY FIELD
which obviously fails. Because of this failure (which would require a restriction in the substitutivity theorem), I think the original treatment of quantification preferable.17 Finally, a note on mathematical induction. There are six forms that need distinguishing, depending (i) on whether the connective in the induction step is a material conditional (defined from ¬ and ∨) or an →, and (ii) on whether induction is viewed as an axiom schema or rule, and if as an axiom schema whether its main connective is ⊃ or →. Consider first the three cases where the connective in the induction step is ⊃: I. F (0) ∧ ∀x[F (x) ⊃ F (x )] ∀xF x, II. F (0) ∧ ∀x[F (x) ⊃ F (x )] ⊃ ∀xF x, III. F (0) ∧ ∀x[F (x) ⊃ F (x )] → ∀xF x. I have already remarked that [I] holds without restriction, and that [II] holds whenever the induction predicate F contains no “unsealed” occurrences of ‘Tr’ (occurrences outside of the scope of an →). This restriction on [II] is needed. For instance, suppose F (x) is x = 0 ∨ Q, where Q is the liar sentence. The second premises of [II] is correct for this F , as is ∀x[x = 0 ⊃ F x], so [II] effectively reduces in this case to F (0) ⊃ F (0), hence to Q ⊃ Q, i.e. ∼Q ∨ Q. [III], unlike [II], also holds without restriction: it reduces to the provability in N of ∀xF x from F (0) ∧ ∀x[F (x) ⊃ F (x + 1)]. It was to get the unrestricted versions of [I] and [III] that I took pains to build the general inductive rule, rather than just the induction schema restricted to arithmetic language, into N − . (Since F doesn’t contain →, it was actually unnecessary for the induction rule in N − to cover →-sentences.) Now let’s look at the corresponding cases with → instead of ⊃ in the induction step: I*. F (0) ∧ ∀x[F (x) → F (x )] ∀xF x, II*. F (0) ∧ ∀x[F (x) → F (x )] ⊃ ∀xF x, III*. F (0) ∧ ∀x[F (x) → F (x )] → ∀xF x. [I*] is generally valid. For its failure would require that F (0) and ∀x[F (x) → F (x )] both be in K while ∀xF (x) isn’t. But in order that ∀x[F (x) → F (x )] be in K, each instance F (n) → F (n + 1) must be (by “pseudo-instantiation” on my preferred reading of quantification, or more directly on the alternative). But since F (0) is in too, it follows by induction that each F (n) is in K, using R1. The construction of K then requires that ∀xF (x) is in K.
SAVING THE TRUTH SCHEMA FROM PARADOX
21
[II*] requires the restriction that there be no unsealed occurrences of ‘Tr’: without this restriction, we’d have the same counterexample as for [II]. But the validity of [II*] with the restriction follows from that of [II] with the restriction, for given the restriction, ∀x[F (x) → F (x )] entails ∀x[F (x) ⊃ F (x )]. The reason for the entailment is that the following is valid: (C) A ∨ ¬A, A → B A ⊃ B.18 [Proof: A, A → B B, so A, A → B ¬A ∨ B. Also, ¬A ¬A ∨ B. So, A ∨ ¬A, A → B ¬A ∨ B.]19 And taking A to be F (x) and B to be F (x ), the assumption on F (x) gives A ∨ ¬A; so we get F (x) → F (x ) F (x) ⊃ F (x ), and we can universally generalize simply by quantified K3 , whichever of the two treatments of quantification we prefer. How about [III*]? Even in the case where F (x) contains no unsealed occurrences of ‘Tr’, we can’t argue to [III*] from [III] in the way that we argued to [II*] from [II]: the analogous argument would require not merely (C) but (C?) A ∨ ¬A (A → B) → (A ⊃ B), which is clearly invalid. And indeed, [III*] is easily seen to fail, even when ‘Tr’ contains no unsealed occurrences of ‘Tr’. Just let F (x) be ‘x = 0’: the -translation of this instance of [III*] says in effect that ∀x(x = 0) is provable from Bew(0 = 1), which of course it isn’t. It isn’t really surprising that [III*], unlike the others, should fail: it involves just the sort of embedding of →s that we have seen leads to trouble in other contexts. 4. C ONCLUDING R EMARKS
How does the theory developed here apply to “paradoxical” sentences? Consider first the Liar sentence Q. As foreshadowed in the opening paragraph, we can accept both Q ↔ ¬Tr(Q) (as the construction of Q via ones favorite mechanism for self-reference demands) and Tr(Q) ↔ Q
(as the truth schema demands). These imply Q ↔ ¬Q, and Q ∨ ¬Q → Q ∧ ¬Q, and many more classically contradictory statements; but unlike
22
HARTRY FIELD
the situation with classical logic, these are not contradictory here. The alteration of the logic is based on Kleene’s, but expands it, for no such conditional → or biconditional ↔ was available in Kleene. A reasonably strong and natural logic for the → emerges, and the theory obtained from number theory (or formal syntax) by adding all instances of Tr(A) ↔ A, or indeed, by adding the full intersubstitutivity of Tr(A) with A, is not only consistent but has arithmetically standard models. Although the → does not obey some natural apparent laws for “if . . . then”, such as contraction, this is inevitable given the Curry paradox; and I have supplied a “reading” of the → that should make the failure of contraction seem less surprising. On this account one cannot accept Q: that would lead to Q ∧ ¬Q, which, since the logic includes the K3 rule, would imply anything. (If we were to weaken the logic to K3− or FDE, then we could choose a “low threshold of acceptance” that would allow us to accept Q ∧ ¬Q; but let’s stick to K3 .) For the same reason, one cannot accept ¬Q. But the inability to accept either isn’t incompleteness in the usual sense. We have incompleteness in the usual sense when neither A nor ¬A is a theorem, but we accept A ∨ ¬A (e.g., when it is a theorem); in accepting A ∨ ¬A we in effect commit ourselves to there being a fact of the matter as to whether A, so the inability to decide between A and ¬A is something of a defect. When we don’t accept A ∨ ¬A, though, there’s no reason to say that there is a fact of the matter, so the inability to either prove A or prove ¬A is no defect.20 Just as the theory doesn’t allow us to assert Q ∨ ¬Q, it doesn’t allow us to assert its negation. Again, this would only be a defect if it allowed us to assert (Q∨ ¬Q) ∨ ¬(Q∨ ¬Q); but it doesn’t. Similarly, the theory doesn’t allow us to assert Q ∧ ¬Q, and doesn’t allow us to assert its negation; but since it doesn’t allow us to assert their disjunction, this is no defect. Finally, the theory doesn’t allow us to assert one way or the other whether Q has a truth value, i.e. is either true or false. For given the intersubstitutivity, that would be equivalent to asserting Q∨ ¬Q or asserting its negation. (This holds whether one interprets falsity as non-truth or as truth of negation; for these are equivalent given the intersubstitutivity of Tr(B) and B.) For the same reason, the theory doesn’t allow us to assert one way or the other whether Q is both true and false. But again, it is misleading to call these “incompletenesses” in the theory: you shouldn’t expect to be able to prove A or to prove ¬A when you don’t accept their disjunction, and in these cases the disjunctions (e.g., that Q either has a truth value or doesn’t, or that it either has both truth values or doesn’t) aren’t to be accepted either.
SAVING THE TRUTH SCHEMA FROM PARADOX
23
The situation is interestingly different for Curry sentences such as C0=1 , due to the presence of the law (SEM). C0=1 is equivalent to Tr(C0=1 ) → 0 = 1, which in turn is equivalent to C0=1 → 0 = 1 by the intersubstitutivity of T (A) with A. So we can derive 0 = 1 from C0=1 , by this equivalence plus modus ponens. But since C0=1 is equivalent to a conditional, (SEM) leads to C0=1 ∨ ¬C0=1, which with the previous leads to ¬C0=1 ; that is, we can assert the falsity of C0=1 . (Arguing directly from the construction: the reason that C0=1 is false is that under the -translation it asserts the derivability of 0 = 1 from Tr(C0=1 ) in N − ; but there are Kleene models for the language of N − in which Tr(C0=1 ) gets value 12 , so the derivability claim must fail.) On a different point: I remarked earlier that an adequate truth theory involves more than the Tarski biconditionals; it involves appropriate generalizations about truth, such as “For every arithmetical sentence, either it or its negation is true”. Which generalizations are appropriate is initially up for grabs, in that it will depend on the logic we decide on, but the fixed point construction we’ve considered validates just the ones appropriate to the logic: e.g., it validates such generalizations as Every universal generalization of a formula provable from A1–A10, A13*, (CS) and (SEM) using R1 and R2 is true. When I say that the construction validates such generalizations I simply mean that they are members of K. Why are they in K? If I had used the second version of quantification I could say that it is because K is closed under ω-proof. But even on the preferred first interpretation, K is closed under ω-proof among sentences not containing an →; and the generalizations under discussion do not contain →, though they are generalizations about sentences that do. There are a number of ways in which the results offered here might usefully be expanded. First, I have not produced a formal semantics, in any normal sense, for the logic I’ve recommended. The provability interpretation serves much the same purposes – it motivates the axioms and is useful in demonstrating the invalidity of inferences – but there is room for a formal semantics as well. I suspect that it would be possible (and worthwhile) to provide a formal semantics with some resemblance to the usual modal semantics for classical provability logics (see [4] or [16], and [3]), but I have not investigated this. It would also be nice to have an axiomatic theory that yields appropriate generalizations about truth directly, without the set-theoretic construction; again, I will not attempt to provide this.21
24
HARTRY FIELD
I make no claim that the principles of sentential logic validated above (A1–A10, A13*, (CS), and (SEM), with R1 and R2) are “anywhere near complete”: there could well be important principles that are guaranteed by the construction that don’t follow from these. (A relatively unimportant one is A ∧ ¬A → B ∨ ¬B, which we get if we use K3− rather than FDE in the theory N − .) Finally, I should repeat that I am by no means convinced that the construction itself is optimal. Indeed, in Section 1 I mentioned a specific ground for dissatisfaction with it: the “reading” of the conditional that I have proposed is fine as a tool for producing an ω-consistency proof, but it is too specialized to serve as a general account of the meaning of “if . . . then”. For a less specialized account, we would at least need to switch from arithmetic derivability to derivability in a more comprehensive theory; but an interpretation more substantially different from the one offered here may be preferable. Such an interpretation might lead to a logic that differs in details from the one that results from the provability interpretation. (In particular, it is not clear to me that we should expect the law (SEM) to hold on subtler accounts of the conditional; if it failed, we might end up with more of a parallel between Curry sentences and Liar sentences than we ended up with on the present proposal.) What I’ve offered here, then, is not offered as the ultimately most satisfactory theory. But it is a start.
ACKNOWLEDGEMENTS I am extremely grateful to J. C. Beall, Graham Priest, and Greg Restall for their encouragement, observations, and advice, and for information about the relevant literature. Also to Ross Brady and Josh Schechter and a referee for useful comments at a later stage.
N OTES 1 By (the instantial version of) the law that modus ponens is truth-preserving, I mean True(A ⊃ B) ⊃ (True(A) ⊃ True(B)); by (the instantial versions of) the laws that rules I and II are truth-preserving, I mean True(True(A)) ⊃ True(A) and True(A) ⊃ True(True(A)). For a proof that these three laws are classically inconsistent with rules I and II, see [9, p. 16]. 2 For further discussion of Curry’s paradox, with historical references, see [2]. 3 Although the set of semantic values in Łukasiewicz semantics is larger than in Kleene semantics, the logic is still a conservative extension of the Kleene logic in that the valid inferences among sentences built up out of ¬, ∧, ∨, ∀ and ∃ are the same. (As I’ll discuss shortly, there are two natural notions of validity for Kleene semantics, and there are two
corresponding ones for Łukasiewicz; the conservative extension claim is true for either notion of validity, provided that the corresponding notion is used for each logic.)
4 The “ordinary” paradoxical sentences are the ones that arise in a system that results from propositional logic by adding a diagonalization operator in the sense of [16, p. 72]. The key to showing that such sentences are consistently evaluable in Łukasiewicz is that the function determining the semantic value of a quantifier-free formula from the semantic values of its atomic constituents is continuous. In the simplest cases of paradox, we start from a quantifier-free formula D[p1, . . . , pk, Tr(x)], built from ordinary atomic sentences or formulas p1, . . . , pk (which may or may not contain x free) plus the predicate ‘Tr’; and we construct (using quantifiers) a sentence F provably equivalent to D(p1(F), . . . , pk(F), Tr(F)). Satisfying the truth schema amounts to finding a value v for Tr(F) satisfying v = gD(a1, . . . , ak, v), where gD is the continuous function corresponding to D and the ai are the semantic values of the pi(F). Since every continuous function on [0, 1] has fixed points (a special case of the principle about the square mentioned in the text), this is always possible. We can also handle more general fixed point constructions, where more than one Fi is introduced at a time (e.g., where F1 says that if F2 is true then Santa Claus exists, and F2 says that if F1 isn’t true then Santa Claus exists). In this case we use the more general principle that any continuous function on [0, 1]^n has a fixed point. (I believe that the argument sketched here was first noted by Skolem.)
5 A slight variant of Restall’s example: let h be a primitive recursive function such that h(0, x) is x and, when h(n, x) is A, h(n + 1, x) is ¬(A → ¬A). Then construct R to be equivalent to ∃n ¬Tr(h(n, R)).
6 For instance, it is much stronger than the Aczel–Feferman system reported in §11 of [6]: that one contains no axioms about the embedding of → (other than instances of the truth schema for sentences that contain →). And its logic is in important respects stronger than that of Brady [5]: his logic does not include the principles A9, A10 and A13* below, which are important for the strong intersubstitutivity result whose proof is sketched below; nor does it include what I shall call the K3 rule. (The Brady logic does include principles that mine doesn’t, e.g., (A → B) ∨ (B → A); but I judge them less intuitive, and less important, than A9, A10 and A13*.)
7 To motivate in advance that something like this might work, suppose that A → B were simply to abbreviate Bew(A ⊃ B), i.e., “A ⊃ B is provable in arithmetic”. On this reading of A → B (which is not the one I will adopt), the formalized version of Löb’s theorem [16, p. 11] entails that for each arithmetic sentence B (e.g., ‘0=1’), Bew(B) ↔ [Bew(B) → B] is a theorem of arithmetic. Consequently, an ordinary truth predicate for arithmetic would justify Tr(Bew(B)) ↔ [Tr(Bew(B)) → B] when B is in the arithmetic language; so that in this case Bew(B) has the key property of the Curry sentence CB. Bew(B) → B is not a theorem (unless B is), and is false if B is false, so the contraction to Tr(Bew(B)) → B would be illegitimate. This interpretation of the → is too restrictive to handle the case where B itself contains ‘Tr’, so I will adopt a broader one, but this one is enough to give the general flavor of how a provability interpretation could avoid the Curry paradox.
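To unpack the appeal to Löb’s theorem in note 7: with → read as Bew(· ⊃ ·), the biconditional Bew(B) ↔ [Bew(B) → B] comes from the following standard derivation, sketched here in LaTeX. The layout and the compressed justifications are mine; the derivability conditions and the formalized Löb theorem are the standard ones (see [16]), and the Gödel-numbering notation is suppressed, as it is in the note itself.

    \begin{align*}
    &\vdash B \supset (\mathrm{Bew}(B) \supset B) && \text{tautology}\\
    &\vdash \mathrm{Bew}(B) \supset \mathrm{Bew}(\mathrm{Bew}(B) \supset B) && \text{necessitation, then distribution of } \mathrm{Bew} \text{ over } \supset\\
    &\vdash \mathrm{Bew}(\mathrm{Bew}(B) \supset B) \supset \mathrm{Bew}(B) && \text{formalized L\"ob's theorem}\\
    &\vdash \mathrm{Bew}(B) \leftrightarrow \mathrm{Bew}(\mathrm{Bew}(B) \supset B) && \text{i.e. } \mathrm{Bew}(B) \leftrightarrow [\mathrm{Bew}(B) \to B]
    \end{align*}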
8 In my view the usual readings are objectionable: they lead to the assertion that a paradoxical sentence like Q (which is constrained to have the middle value 1/2) is neither true nor false; but since we want classical truth theory, such an assertion should be equivalent to the assertion ¬(Q ∨ ¬Q), which is equivalent in the logic to Q ∧ ¬Q and is not assertable. For more on this see [7, pp. 145–6]; I will also have something to say about it in the final section of the paper.
9 (Quantified) FDE is the logic of arbitrary complete deMorgan lattices, using the implication relation (S) above; where a deMorgan lattice is a distributive lattice with an added operation of negation that obeys the deMorgan laws and where ¬¬b is equivalent to b.
10 A ⊢ B gives A ⊢ ¬A ∨ B, and ¬A ⊢ ¬A ∨ B, so A ∨ ¬A ⊢ ¬A ∨ B, so ⊢ ¬A ∨ B; i.e., ⊢ A ⊃ B.
11 I could have avoided explicitly introducing N−, by saying that Pr(x, y) is true of numbers n and m iff n and m are the Gödel numbers of sentences A and B for which both A ⊢N B and ¬B ⊢N ¬A. This would depend on using the K3− version of N− rather than the FDE version.
12 I’ll consider the Łukasiewicz logic based on the weak implication relation (W) from Section 1, since I’m taking the underlying logic to be K3 rather than K3−.
13 The conjunction of the premises yields X → [(A → B) ∧ (B → C)], by A5, which
with (CS) and A9 yields X → (A → C).
14 More generally, (A → B) → (C_A → C_B), when all substituted occurrences of A in C_A are positive; and (A → B) → (C_B → C_A), when all substituted occurrences of A in C_A are negative.
15 For instance, in the case of the conjunction clause, we need (i) that if (A ↔ B) → (C_A → C_B) then (A ↔ B) → (C_A ∧ D → C_B ∧ D), and (ii) that if (A ↔ B) → (C_B → C_A) then (A ↔ B) → (C_B ∧ D → C_A ∧ D). [These easily give that if (A ↔ B) → (C_A ↔ C_B) then (A ↔ B) → (C_A ∧ D ↔ C_B ∧ D), using A5.] Note that even without (CS) we have the rule X → Y, Y → Z ⊢ X → Z, by A9 and R1; so to prove (i) it suffices to prove (C_A → C_B) → (C_A ∧ D → C_B ∧ D). But by A9 we can easily get (C_A → C_B) → (C_A ∧ D → C_B), and A13* yields (C_A → C_B) → (C_A ∧ D → D); so by A5, (C_A → C_B) → [(C_A ∧ D → C_B) ∧ (C_A ∧ D → D)]. But also by A5, [(C_A ∧ D → C_B) ∧ (C_A ∧ D → D)] → (C_A ∧ D → C_B ∧ D); using this and the previous result, we get (C_A → C_B) → (C_A ∧ D → C_B ∧ D), as desired. (ii) is analogous.
16 In one direction it’s obvious: given permutation, A11 is equivalent to an instance of A1. A proof of the unobvious direction is given in Anderson and Belnap [1, pp. 79–80].
17 The rule (D) also fails on this interpretation of the quantifiers, whether the base logic is FDE or K3−: take Ax to be arithmetical; then (D) translates to Pr[∀n Bew(A(n)), Bew(∀xAx)], and this fails for (e.g.) “x is not a derivation of ‘0=1’ ”.
18 Incidentally, (C) together with (SEM) gives rise to another law that fails in Łukasiewicz: (A → B) → C ⊢ (A → B) ⊃ C. (A brute-force check of this failure is sketched below.)
19 The use of side-formulas in a disjunction-elimination is unproblematic in K3, and could be eliminated by use of the distribution law.
20 Not only is there no reason to say that there is a fact of the matter, there is good reason to reject this in certain cases; but this must be in a sense of rejection other than assertion of the negation. The proper account of rejection, I think, is in terms of a kind of degree of belief appropriate to the nonclassical logic in question. I have developed this point of view elsewhere, in connection with the theory of vagueness and indeterminacy: see [7], pp. 307–9, and for more detail [8] (though the theory of degrees of belief developed there would need to be expanded to cover degrees of belief in sentences involving the new conditional). We should expect the account of rejection that works for vagueness and indeterminacy to apply here too, for it is plausible that the Liar paradox results from a vagueness or indeterminacy in the term ‘true’.
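As a small mechanical illustration of the failure recorded in note 18, the following Python sketch searches a grid of Łukasiewicz values for a countermodel. The definitions are my own glosses, not the paper’s exact formulations: → is the Łukasiewicz conditional min(1, 1 − x + y), ⊃ is the material conditional ¬x ∨ y, and the inference is counted as failing when the premise can take value 1 while the conclusion takes a value below 1.

    # Brute-force search for a Lukasiewicz countermodel to
    #   (A -> B) -> C   entails   (A -> B) superset C       (cf. note 18).
    # Assumed glosses: -> is the Lukasiewicz conditional, superset is the
    # material conditional (not x) or y, and the inference fails if the
    # premise can take value 1 while the conclusion takes a value below 1.
    from itertools import product

    def luk(x, y):                 # Lukasiewicz conditional
        return min(1.0, 1.0 - x + y)

    def material(x, y):            # material conditional: (not x) or y
        return max(1.0 - x, y)

    values = [i / 4 for i in range(5)]    # 0, 0.25, 0.5, 0.75, 1

    for a, b, c in product(values, repeat=3):
        premise = luk(luk(a, b), c)
        conclusion = material(luk(a, b), c)
        if premise == 1.0 and conclusion < 1.0:
            print("countermodel:", a, b, c,
                  "premise", premise, "conclusion", conclusion)
            break

One countermodel it prints is v(A) = 0.25, v(B) = 0, v(C) = 0.75: there (A → B) → C has value 1 while (A → B) ⊃ C has value 0.75.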
21 Instead of an ordinary axiomatization of the generalizations, we might use schematic variables to generalize the schema (T), in the manner discussed in [7, pp. 114–5, 141–3].
REFERENCES

1. Anderson, A. R. and Belnap, N. D.: Entailment: The Logic of Relevance and Necessity, Vol. 1, Princeton University Press, Princeton, 1986.
2. Beall, J. C.: Curry’s paradox, in E. N. Zalta (ed.), (Online) Stanford Encyclopedia of Philosophy, CSLI, Stanford University, 2001.
3. Boolos, G.: Provability, truth and modal logic, J. Philos. Logic 9 (1980), 1–7.
4. Boolos, G.: The Logic of Provability, Cambridge University Press, Cambridge, 1993.
5. Brady, R. T.: The non-triviality of dialectical set theory, in G. Priest, R. Routley and J. Norman (eds), Paraconsistent Logic: Essays on the Inconsistent, Philosophia Verlag, 1989, pp. 437–470.
6. Feferman, S.: Toward useful type-free theories, I, J. Symbolic Logic 49 (1984), 75–111.
7. Field, H.: Truth and the Absence of Fact, Oxford University Press, Oxford, 2001.
8. Field, H.: Mathematical undecidables, metaphysical realism and equivalent descriptions, in L. Hahn (ed.), The Philosophy of Hilary Putnam, Open Court, 2003.
9. Friedman, H. and Sheard, M.: An axiomatic approach to self-referential truth, Ann. Pure Appl. Logic 33 (1987), 1–21.
10. Hajek, P., Paris, J. and Shepherdson, J.: The liar paradox and fuzzy logic, J. Symbolic Logic 65 (2000), 339–346.
11. Kripke, S.: Outline of a theory of truth, J. Philos. 72 (1975), 690–716.
12. McGee, V.: Truth, Vagueness, and Paradox, Hackett, Indianapolis, 1991.
13. Priest, G.: Paraconsistent logic, in D. M. Gabbay and F. Günthner (eds), Handbook of Philosophical Logic, 2nd edn, Vol. 3, D. Reidel, Dordrecht, 2000, pp. 437–470.
14. Priest, G.: An Introduction to Non-Classical Logic, Cambridge University Press, Cambridge, 2001.
15. Restall, G.: Arithmetic and truth in Łukasiewicz’s infinitely valued logic, Logique et Analyse 139–140 (1992), 303–312.
16. Smorynski, C.: Self-Reference and Modal Logic, Springer-Verlag, New York, 1985.
New York University (e-mail: [email protected])