Philosophical Perspectives, 17, Language and Philosophical Linguistics, 2003

Author:
Shapiro Stewart

ALL SETS GREAT AND SMALL: AND I DO MEAN ALL

Stewart Shapiro
The Ohio State University
The University of St. Andrews

I want it all.
Queen

You can’t always get what you want. But if you try sometime, you just might find, you get what you need.
Rolling Stones

1. Wither generality?

Timothy Williamson [2003] has made a compelling, prima facie case against the view he calls ‘‘generality relativism’’, the thesis that it is not possible for first-order variables to range over everything at once. He and others have pointed out that one cannot state the relativist position without violating it. For example, the relativist might say, or try to say, that for any quantifier used in a proposition of English, there is something outside of its range. What is the range of the word ‘‘something’’ at the end? Or suppose we ask the relativist if there is some one thing that cannot appear in the range of any bound variable. The likely response would be: ‘‘No. For each object o, it is possible to include o in the range of quantifiers, but one cannot quantify over everything at once.’’ This sentence contains an unrestricted quantifier, or so it seems, pending some clever move from a generality relativist.

Truth be told, I am not particularly interested in whether it is coherent to have bound variables ranging over absolutely everything. When it comes to the world of subatomic physics, for example, who knows if it is best to talk about objects at all, let alone all objects? The same may go for ordinary talk of items with vague boundaries, such as clouds, mountains, and seas. In response to this, I can hear a Quinean generalist protesting.1 The trouble, if there is trouble, lies with predicates like ‘‘particle’’ and ‘‘cloud’’, not with ‘‘exists’’ or, what is the same thing, with quantification. If there are such things as particles and clouds, then they fall within the range of our bound variables, and can do so all at once. However, I am not sure matters are this straightforward. On some views of
vagueness, the boundaries of vague terms vary with context. Any context can change and, with this change, new objects might be found within the range of the quantifiers. It may not make sense to have a super-context that includes every context.

These matters of anti-realism and metaphysics can be put aside here. I am concerned with the special case of whether one can have bound variables ranging over all pure sets, or all pure set-like-totalities. And I am interested in whether one can have bound variables ranging over all ordinals and all cardinals, or over all well-ordering-types and cardinality-types. I presume that those are the interesting cases anyway, given the role of the Russell, Cantor, and Burali-Forti paradoxes in the literature on this topic.

Toward the end of his paper, Williamson puts his finger on what I take to be the main semi-formal sticking point for the generality absolutist: how are we to understand second-order quantifiers when the corresponding first-order quantifiers have unrestricted range? Prima facie, second-order quantifiers have a range too, and by Russell’s paradox, this range cannot lie entirely in the range of the first-order variables. So there seems to be something that cannot lie in the range of first-order quantifiers. So first-order quantifiers cannot be completely unrestricted. The same goes for any quantifier at any level, provided that there is a higher level.

Mea culpa. In my book on second-order logic (Shapiro [1991]), I took the second-order variables of set theory to range over proper classes, which I called ‘‘logical sets’’. The lesson of Russell’s paradox, I said, is that in the context of set theory, there are logical sets that do not correspond to any member of the iterative hierarchy. In the meta-theory, or perhaps the meta-meta-theory, or the mathematical English that I used to write the book, I suppose I took classes to be ‘‘things’’.
Along with many (but not all) set-theorists, I used singular terms, like ‘‘V’’ and ‘‘Ω’’, that denote proper classes, and I had informal meta-variables ranging over proper classes. Clearly, proper classes are set-like things, having only pure iterative sets as members. So the first-order variables of second-order ZFC do not range over all pure set-like things. But consider George Boolos’s [1998, 35] retort to a similar suggestion: ‘‘Wait a minute! I thought that set theory was supposed to be a theory about all, ‘absolutely’ all, the collections that there were and that ‘set’ was synonymous with ‘collection’.’’2 At the time, I might have responded with the above line from the Rolling Stones: ‘‘You can’t always get what you want’’, perhaps adding that we can get what we need. I see now, if not then, that there is something fishy about claiming that second-order ZFC is the most inclusive theory of pure sets that there is, and then using informal variables ranging over pure set-like things that outrun the first-order variables of ZFC on its intended interpretation.

To solve this problem, Boolos proposed his celebrated plural interpretation of monadic, second-order quantifiers. Williamson raises five points against that resolution, at least four of which apply in the cases that interest me: pure sets, ordinals, and cardinals. I have my own doubts as to whether our independent or pre-theoretic grasp of plural quantifiers is sufficiently determinate to ground
second-order theories with infinite, let alone unbounded, domains. Consider a statement of second-order real analysis of the form:

∀X∃YΦ(X,Y).

There is no issue concerning the existence of sets of real numbers, or at least none that is relevant here (and none that moved Boolos himself). So the opening second-order quantifiers can be given both a plural and an ordinary, set-theoretic interpretation. It had better be the case that if we read the quantifiers as plurals, we will get exactly the same truth value, in general, as we would if we understand the quantifiers as ranging over sets of real numbers. In effect, there needs to be a ‘‘plurality’’ (if you will excuse the expression) for each set of real numbers. Does the English plural construction have that determinate a meaning? Of course, the pluralist can always stipulate that she intends it to have such a meaning in cases like that of real analysis—but one needs some set theory to make the stipulation. This move might sustain Michael Resnik’s [1988] and my [1993] complaint that the sophisticated understanding of the plural construction used in justifying second-order logic is mediated by set theory. In any case, the important use of the plural construction is for cases where we would rather not—or cannot—speak of pluralities as things. Second-order Zermelo-Fraenkel set theory is the main case in point (as confirmed in conversation with Boolos). Is there reason to think that the plural construction is sufficiently determinate in such cases? Is there a ‘‘plurality’’ corresponding to each and every proper class? The pluralist is surely not in position to stipulate this, unless she recognizes proper classes as objects. I do not claim to have presented a knock-down objection against the plural rescue. Moreover, there are a number of other proposals for higher-order quantification that need to be digested. Williamson suggests that we can directly engender an understanding of the generality involved in second-order quantification, an understanding that is not mediated by set theory or any other construal of the range of the quantifiers. 
Nevertheless, this direct understanding is supposedly equivalent to the set-theoretic interpretation whenever the domain is a set (with a powerset). If the Williamson plan succeeds, and does not beg any questions, then we can wax homophonic in giving truth conditions. Perhaps.

I take it as agreed that the antinomies provide the basic motivation for generality relativism. If the absolutist gets past those, his remaining problematic feature is the use of unrestricted second-order quantification. However, even if first-order languages are too weak, we may not need full second-order languages to do whatever work we want our grand theory to do. Here, I float a compromise, building on the framework proposed in Zermelo [1930], ‘‘Über Grenzzahlen und Mengenbereiche’’ (‘‘On boundary numbers and domains of sets’’). It allows unrestricted, absolute first-order quantification, and it allows at least restricted second-order quantification. To echo the Rolling Stones, extendibility principles give us what we need. Although similar frameworks have been proposed by
Charles Parsons and Geoffrey Hellman, they are generality-relativists, after a fashion. Let’s see how much of the cake we can have if we eat it too. We turn first to one of the antinomies.
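
Before moving on, the Russell-style reasoning invoked earlier in this section, to the effect that the range of the second-order quantifiers cannot lie entirely within the range of the first-order ones, can be laid out step by step. What follows is only a schematic sketch. The map $\mathrm{ext}$ is a hypothetical device, not notation from the text: it expresses the assumption (made for reductio) that every value $X$ of a second-order variable corresponds to an object $\mathrm{ext}(X)$ in the first-order range, with distinct (non-coextensive) values sent to distinct objects.

```latex
% Requires amsmath. ext is an assumed injective map from second-order values to objects.
\begin{align*}
&\text{Assumption:} && \forall X\,\forall Y\,\bigl(\mathrm{ext}(X) = \mathrm{ext}(Y) \rightarrow \forall z\,(Xz \leftrightarrow Yz)\bigr)\\
&\text{Comprehension:} && \exists R\,\forall x\,\bigl(Rx \leftrightarrow \exists X\,(x = \mathrm{ext}(X) \wedge \neg Xx)\bigr)\\
&\text{Let } r = \mathrm{ext}(R).\\
&\text{If } Rr: && \text{some } X \text{ has } r = \mathrm{ext}(X) \text{ and } \neg Xr;\ \text{by the assumption } X \text{ and } R \text{ agree, so } \neg Rr.\\
&\text{If } \neg Rr: && r = \mathrm{ext}(R) \wedge \neg Rr, \text{ so } R \text{ itself witnesses the right-hand side, hence } Rr.
\end{align*}
```

Either way a contradiction follows, so under these assumptions no such map exists: some value of a second-order variable answers to no object in the first-order range. The plural interpretation and the other proposals discussed above are, in effect, ways of denying that second-order values are objects at all.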

2. Pesky Burali-Forti and indefinite extensibility

Russell’s ‘‘On some difficulties in the theory of transfinite numbers and order types’’ [1906] begins with an examination of the now standard paradoxes, and concludes:

…the contradictions result from the fact that…there are what we may call self-reproductive processes and classes. That is, there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question. Hence we can never collect all of the terms having the said property into a whole; because, whenever we hope we have them all, the collection which we have immediately proceeds to generate a new term also having the said property.

Citing this passage, Michael Dummett [1993, 441] writes that an ‘‘indefinitely extensible concept is one such that, if we can form a definite conception of a totality all of whose members fall under the concept, we can, by reference to that totality, characterize a larger totality all of whose members fall under it’’ (emphasis mine). According to Dummett, an indefinitely extensible property P has a ‘‘principle of extension’’ that takes any definite totality t of objects each of which has P, and produces an object that also has P, but is not in t (see also Dummett [1991, 316–319]). Let us say that a property P is Definite if it is not indefinitely extensible. Dummett’s remarks won’t do as a definition, since he uses the word ‘‘definite’’ to explain what it is to be indefinitely extensible. Nevertheless, what he says seems correct.

Let us focus on the Burali-Forti paradox. Let O be any Definite collection of ordinal numbers. Let O′ be the collection of all ordinals α such that there is a β∈O and α≤β. That is, α is in O′ if α is smaller than, or equal to, something in O. Since O′ is well-ordered, let γ be its order type. Let γ′ be the order type of O′ ∪ {γ}, the order of O′ with one item added ‘‘at the end’’. Then γ′ is an ordinal number, and γ′ is not a member of O. So the property of being an ordinal is indefinitely extensible. One can, of course, challenge the set-theoretic principles (union, pairing, etc.) used here, but the reasoning does appear natural. As Dummett [1991, 316] puts it,

if we have a clear grasp of any totality of ordinals, we thereby have a conception of what is intuitively an ordinal number greater than any member of that totality. Any [D]efinite totality of ordinals must therefore be so circumscribed as to forswear comprehensiveness, renouncing any claim to cover all that we might intuitively recognise as being an ordinal.

This is the sort of thing that motivates generality relativism. Russell [1906, 144] wrote that it ‘‘is probable’’ that if P is any property which demonstrably does not have an extension (that obeys extensionality) then ‘‘we can actually construct a series, ordinally similar to the series of all ordinals, composed entirely of terms having the property’’ P. In present terms, Russell’s conjecture is that if P is indefinitely extensible, then there is a one-to-one function from the ordinals into P. Russell does not provide an argument for this, but I think there is one:

Let α be an ordinal and assume that we have a one-to-one function f from the ordinals smaller than α to objects that have the property P. Consider the collection {fβ | β < α}. This is Definite. Since P is indefinitely extensible, there is an object a such that P holds of a, but a is not in this set. Set fα = a.
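
The two arguments of this section can be displayed schematically. This is a sketch in ordinary set-theoretic notation rather than the author's own formulation: the notation $\mathrm{ot}(\cdot)$ for order type and the choice operator $\varepsilon$ are assumptions introduced here for illustration only.

```latex
% Requires amsmath. ot(.) and the epsilon choice operator are illustrative assumptions.
% (1) Burali-Forti step: from a Definite collection O of ordinals to an ordinal outside O.
\begin{align*}
O' &= \{\alpha : \exists \beta \in O\ (\alpha \le \beta)\}, \qquad
\gamma = \mathrm{ot}(O'), \qquad
\gamma' = \mathrm{ot}\bigl(O' \cup \{\gamma\}\bigr),\\
&\text{and for every } \beta \in O:\quad \beta \in O', \text{ hence } \beta < \gamma < \gamma', \text{ so } \gamma' \notin O.
\end{align*}
% (2) Russell's conjecture: a one-to-one f from the ordinals into an
%     indefinitely extensible P, defined by transfinite recursion.
\begin{align*}
f(\alpha) &= \varepsilon a\,\bigl[\,P(a) \wedge a \notin \{f(\beta) : \beta < \alpha\}\,\bigr],\\
&\text{so for } \beta < \alpha:\quad f(\alpha) \notin \{f(\beta') : \beta' < \alpha\}, \text{ hence } f(\alpha) \ne f(\beta).
\end{align*}
```

In (2) the object picked at stage $\alpha$ exists only if the collection $\{f(\beta) : \beta < \alpha\}$ counts as Definite, which is exactly where a replacement-style principle is needed.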

This argument uses transfinite recursion on ordinals and a version of replacement: if a totality t is equinumerous with an ordinal, then t is Definite.3 Both of those seem beyond reproach, but one does need special care in areas like this. It is clear that some intuitive principles have to be dropped. Nevertheless, if the argument (or at least its conclusion) is correct, then ‘‘ordinal’’ is the basic indefinitely extensible notion.

In any case, the Burali-Forti paradox is robust. The very definition of well-ordering suffices to generate ever more ordinals, or at least what look like well-ordering types, without using ‘‘external’’ resources like the powerset or set-comprehension principles invoked in the Cantor and Russell paradoxes. Let Ω(x) be the property of being an ordinal. It is, of course, routine to show that Ω is itself a well-ordering (i.e., has the requisite property of relations). That is, the Ω’s are well-ordered. But, alas, Ω has no order type. We are used to that. We can define a relation that is a well-ordering strictly longer than Ω: Let α and β be ordinals. Say that α ≺₁ β if α ≠ 0 and either α [...] be the ordered pair of x and y. If α, β, γ, δ are ordinals, then let [...] ≺₃ [...] if either α [...]

to do transfinite recursions on L, which are also of length Ω. So, in effect, we have a transfinite recursion of length 2Ω. The following passage appears in a survey article called ‘‘The ABC’s of mice’’, by Ernest Schimmerling:

We begin by constructing L level by level. The first ω levels are exactly the hereditarily finite sets, the next ω₁^L levels are exactly the sets that are hereditarily countable in L, and so on. Now we ask ourselves what comes next. (Schimmerling [2001, 486–7])

Next? I thought that in defining L, we were to go through all of the ordinals, i.e., to carry it as far as possible. How can something come next? If there is a ‘‘next’’, we have not gone far enough; we have not gone through the ordinals.4 Schimmerling continues:

For although we have climbed up to the minimal transitive proper class model of ZFC, foundational considerations that fall under the category of large cardinals have tempted us to adopt certain theories that extend ZFC. These extensions are not true in L, for they imply that there exists a non-trivial elementary embedding j: L → L, which is known to fail in L. So how do we continue or revise the construction in a way that buys us the existence of such an embedding? One naïve idea is to continue the construction past all the ordinals and throw in the proper class j at stage Ω or beyond, but this approach leads to some obvious metamathematical problems that we find annoying.

We are to go past all of the ordinals? I can see why the working set theorist would find the meta-mathematical problems annoying. We do need some care when reasoning in this area; the Burali-Forti paradox is annoying. In the words of Graham Priest [2002], we seem to have gone beyond the limits of thought, and I think most of us would rather avoid the drastic measures that Priest recommends. He says, for example, that Ω both is and is not less than Ω+1, the next ordinal after all ordinals. Typically, the ‘‘care’’ here is to replace the long transfinite recursions with codings. Nevertheless, it seems to me that in this case at least, the grand transfinite recursion, of length 2Ω, is as coherent as can be, or at least as coherent as anything else in set theory. Of course, all this talk of ‘‘constructing’’ is only a metaphor. We cannot actually construct anything to stage ω without performing a super-task, and even if we could do super-tasks, they would not take us much beyond ω. The more literal characterization of L is a perfectly good definition by transfinite recursion, and once we have a well-defined predicate, we can use it to do further transfinite recursions. Using variables or schematic letters, we can even do a transfinite recursion along the order type ≺₃ above, of length Ω². And of course, this is not as far as we can go, not by a long shot.
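
For concreteness, here is one way to write down well-orderings of the lengths mentioned in this section. It is a sketch under assumptions: the specific definitions below (moving 0 to the end, tagging ordinals with 0 or 1, and ordering pairs lexicographically) are standard constructions chosen here for illustration, and they may not coincide with the definitions intended in the text. The label $\prec_2$ in particular is hypothetical. Throughout, $<$ is the usual order on the ordinals, and ``$2\Omega$'' is read as two copies of the ordinals laid end to end.

```latex
% Requires amsmath. All three definitions are illustrative assumptions.
\begin{align*}
&\text{Length } \Omega + 1: &&
  \alpha \prec_1 \beta \iff \alpha \neq 0 \wedge (\alpha < \beta \vee \beta = 0)\\
&\text{Length } 2\Omega\ (= \Omega + \Omega): &&
  \langle i,\alpha\rangle \prec_2 \langle j,\beta\rangle \iff i < j \vee (i = j \wedge \alpha < \beta),
  \quad i, j \in \{0, 1\}\\
&\text{Length } \Omega^2: &&
  \langle \alpha,\beta\rangle \prec_3 \langle \gamma,\delta\rangle \iff \alpha < \gamma \vee (\alpha = \gamma \wedge \beta < \delta)
\end{align*}
```

Each relation is definable from $<$ alone and well-orders its field, yet none of them is isomorphic to an initial segment of the ordinals. That is the sense in which such well-orderings are explicitly definable but have no order type.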

At this point, one might protest, paraphrasing Boolos: Wait a minute! I thought that set theory was supposed to include a theory about all, ‘‘absolutely’’ all, the well-orderings and transfinite recursions that there are and that ‘‘well-ordering-type’’ was synonymous with (or at least coextensive with or isomorphic to) ‘‘ordinal’’.

Here, however, I do not see a clean way out for the generality-absolutist. What is she to make of these long transfinite recursions and inductions? The absolutist can claim that there are no ‘‘objects’’, no ordinals, that correspond to these explicitly definable long well-ordering properties. These properties simply have no order types. This rejection of Ω, 2Ω, and Ω² is just an instance of Boolos’s own rejection of proper classes. But what, then, is wrong with the long transfinite recursions, and the transfinite inductions that go along with them? Well, I guess I should not say ‘‘them’’, since these long well-orderings are not objects. But what should I say? Moreover, the move seems ad hoc. What, exactly, is wrong with the long transfinite recursions and inductions? Why can’t we just introduce such ‘‘ordinals’’, or names for them, by expanding ontology, just as the generality relativist contends? We just introduce a singular term, like ‘‘2Ω’’, that denotes one of these well-orderings. In doing so, we are just giving names to well-orderings that we are capable of understanding and using, and treating those well-orderings as objects. What’s wrong with that?

3. Williamson on indefinite extensibility

In a recent article on Dummett, Williamson [1998] himself gives an intriguing analysis of indefinite extensibility, or at least what goes for that notion. For most of the paper, he focuses on semantic notions like ‘‘name’’ and ‘‘truth’’. His idea is that when we reflect on a paradoxical situation involving a key word, or words, those words might shift in meaning, not just extension. Here is the idea [1998, §6]:

We start with one set of correlative meanings for ‘‘say’’, ‘‘true’’ and ‘‘false’’; we use them to construct a sentence that says nothing in that sense of ‘‘say’’; but reflection on that sentence causes normal speakers to give ‘‘say’’, ‘‘true’’, and ‘‘false’’ a new set of correlative meanings, much like the previous ones except that the sentence in question says something in the new sense of ‘‘say’’; the process can be repeated indefinitely. Normal speakers are not aware of the change, just as they are not aware of many ordinary processes of gradual change. They feel themselves to be going on in the same way, but they are not.

Williamson adds that sentences are to be ‘‘treated as sequences of sounds or marks, which have no intrinsic interpretation, but receive different interpretations in different languages; for these purposes, a change of meaning is ipso facto
a change of language’’. So when we do some reflection on a supposedly paradoxical sentence or construction, like the Liar, the whole language changes a little. Some words get new meanings, and the corresponding predicates get new extensions. Williamson shows how there is no foothold for so-called ‘‘extended’’ or ‘‘strengthened’’ versions of the various paradoxes: ‘‘Whatever our present understanding, encountering a paradox can cause us to reach a new understanding. We cannot construct a Strengthened Liar for this approach because we cannot anticipate our future understanding in our present meaning.’’ There is no unequivocal meaning or extension we can give to ‘‘true in some successor of our language obtained by reflection on paradoxical sentences’’. This analysis of paradox is nicely consonant with the account of language and meaning found in Williamson’s work on vagueness (e.g., Williamson [1994]) and other matters. Nevertheless, I do wonder how it fits into the generality absolutism on the table today. Toward the end of the Dummett paper (§7), Williamson briefly turns to set theory: Isn’t the consistency and coherence of the iterative conception of sets intuitively obvious? Someone might argue otherwise, on the grounds that even the iterative concept of set is indefinitely extensible. In particular, it may be suggested, however large we suppose the universe of sets to be, our ability to enlarge it by adding proper classes shows that we have not really exhausted the full import of the iterative conception, and therefore that that conception is incoherent after all. This argument assumes that, on the iterative conception, proper classes are really sets under another name, so that denying them the name ‘‘set’’ is an ad hoc verbal evasion, inconsistent with the underlying conception. The Burali-Forti paradox can be used likewise against the concept of an ordinal, which the iterative conception needs to index the stages. 
The present view has the resources to interpret the iterative conception more charitably, while helping to accommodate the intuitions behind the argument against it. For given any reasonable assignment of meaning to the word ‘‘set’’ we can assign it a more inclusive meaning while feeling that we are going on in the same way, and make correlative changes to the words in an iterative account of sets, to preserve it too. The inconsistency is not in any one meaning we assign the iterative account; it is in the attempt to combine all the different meanings that we could reasonably assign it into a single super-meaning.

The idea, I take it, is that at any given time in our recent history, we have, or may have, assigned a coherent meaning and extension to the words ‘‘pure set’’ and ‘‘ordinal’’ (in the context of iterative set theory). Thereafter, whenever we reflect on the Russell, Cantor, or Burali-Forti paradoxes, and try to talk about proper classes like V and Ω, we silently and subtly assign new meanings to those words. We are then speaking a new language. The process can be repeated indefinitely. To avoid ambiguity, let us call one of the old languages L1, which contains the words ‘‘pure set1’’ and ‘‘ordinal1’’; and let L2 be a new language obtained by
reflection on L1. The new language L2 contains the locutions ‘‘pure set2’’ and ‘‘ordinal2’’. The key point here is that the change in language results in adding items to the extensions of the corresponding terms. There are ordinals2 which are much longer than any ordinal1. There are pure sets2 that are much larger than any pure set1. Since both languages are second-order, Zermelo’s theorem applies. The pure sets1 are isomorphic to a proper initial segment of the pure sets2. As speakers of the language, we can then reflect on L2 and produce a third language L3, with its words ‘‘pure set3’’ and ‘‘ordinal3’’. According to this approach, what we cannot do is put everything together, to produce a ‘‘supermeaning’’ for a word like ‘‘set’’, whose extension includes everything we might ever mean by that term as we keep changing our language. There is no term for ‘‘pure set like entity in some successor to our language’’. Fair enough. What do the expanded extensions for ‘‘set’’ and ‘‘ordinal’’ say about generality absolutism? Consider the word ‘‘everything’’ in the original language L1. Call the word ‘‘everything1’’, or the phrase ‘‘absolutely1 everything1’’. When speaking the language L1, the generality absolutist asserts that his ‘‘everything’’ includes absolutely everything. And let us suppose that indeed it does. He is not deluding himself with that word. Now suppose he reflects on the Burali-Forti paradox and ends up speaking the new language L2, with the new words ‘‘pure set2’’ and ‘‘ordinal2’’. Of course, this new language also has a word ‘‘everything’’. Call this word ‘‘everything2’’. And of course, our generality absolutist maintains that everything2 is absolutely everything. The switch to a new language has not cured him of his generality absolutism (especially since he might not have noticed the shift in meaning). 
The compelling question is this: Does (or did) ‘‘everything1’’ include, or apply to, the new ordinals, the items to which the new word ‘‘ordinal2’’ applies? The whole point of the exercise is that the word ‘‘ordinal’’ in the language L2 applies to an item, an object, that is isomorphic to all of the ordinals1. Call this L2 object Ω₂. Then Ω₂ itself, as well as 2Ω₂, (Ω₂)², (Ω₂)^Ω₂, etc. are all perfectly good ordinals in L2. But none of those are ordinals1, nor does the language L1 recognize well-ordering types that long. But are those items nevertheless included in ‘‘everything1’’? When, before the shift to L2, the generality absolutist asserted that by everything, he meant absolutely everything, did this include Ω₂, 2Ω₂, (Ω₂)², (Ω₂)^Ω₂, etc.?

There are two possibilities. Suppose, first, that it does not. So the English word ‘‘everything’’ or, indeed, the phrase ‘‘absolutely everything’’ itself shifts its meaning in the transition to the new language, in exactly the way that the words ‘‘pure set’’ and ‘‘ordinal’’ do. In other words, ‘‘everything2’’ includes much more stuff than ‘‘absolutely1 everything1’’ did. For example, it includes the ordinals2 to which ‘‘everything1’’ does not apply: Ω₂, 2Ω₂, (Ω₂)², (Ω₂)^Ω₂, etc. In other words ‘‘absolutely everything’’ has that same extendibility that ‘‘set’’ and ‘‘ordinal’’ have. I submit that this would be at least a moral victory for the generality relativist. It is what she has been trying to say all along. The relativist might note: ‘‘All right, sure, I grant that my first-order variables range over ‘absolutely
everything’, as that term is used in my present language, right now. But we both know a way to reflect on the use of certain words, by considering the Burali-Forti paradox, and we know that this reflection will result in a new language, and its phrase ‘absolutely everything’ covers a lot more than our current phrase does. That is enough relativity for me.’’ I take it that this option is a non-starter for the generality absolutist.5

So now suppose that the original word ‘‘everything’’ in L1 does include, or apply to the items to which the new word ‘‘ordinal2’’ applies, and also the items to which ‘‘ordinal3’’ applies (or will apply once L3 is generated), etc. After all, how can we change what exists, the furniture of the universe, just by speaking a new language? So the ontology of L1, ‘‘absolutely1 everything1’’, covers Ω₂, 2Ω₂, (Ω₂)², (Ω₂)^Ω₂, etc. It is just that L1 does not recognize these things, these ordinals2, to be ordinals. They are not ordinals1. They have members2, but not members1. They are well-orderings2, but not well-orderings1. On this option, the words ‘‘ordinal’’, ‘‘set’’, ‘‘member’’, and ‘‘well-order’’ shift their meanings (and extensions) through the language changes, but the word ‘‘everything’’ does not. In L2, the phrase ‘‘every ordinal’’ is (much) more extensive than the same phrase in L1, but the difference is chalked up to a change in the word ‘‘ordinal’’, not to a change in the word ‘‘every’’.

One might think that this, too, gives a small moral victory to the generality relativist. Speaking in L1, the relativist might say: ‘‘All right, sure, I grant that the phrase ‘absolutely all ordinals’ does cover absolutely all ordinals, all ordinals1, which are all the ordinals there are. But we both know a way to reflect on our use of certain words, and we know that this reflection will result in a new language in which the word ‘ordinal’ has a meaning much like ours, applying to well-ordered set-like things, but the new word has a much expanded range.
That is enough relativity for me.’’ In fact, however, this is no victory for the relativist at all. When the speaker asserted in L1 that ‘‘all ordinals’’ includes absolutely all ordinals, she only means all ordinals1. What else can she mean? She has no other word for ordinal. No language contains a word whose extension includes all and only the items that are in the extension of the word ‘‘ordinal’’ in some language or other that we might speak one day, unless that word is ‘‘everything’’. For all we know, we might end up speaking a language in which the word ‘‘ordinal’’ means what we mean by ‘‘grasshopper’’. Recall that Williamson treats sentences and terms ‘‘as sequences of sounds or marks, which have no intrinsic interpretation’’. More seriously, perhaps, no language can contain a word whose extension includes all and only the items that are in the extension of the word ‘‘ordinal’’ in a language obtained from L1 by iterated applications of the paradoxes. The reason is that we could do a Burali-Forti reflection on that language. It seems that we have only limited control over what our words might mean as the language morphs into other languages as we reflect on its expressive resources or do the Burali-Forti thing—even though the reflections into new languages are regular and well-ordered, indexable by ordinals.

In other words, the absolutist insists on a language-independent meaning and extension of the phrase ‘‘absolutely all’’. In present terms, the absolutist maintains that the phrase ‘‘absolutely all’’ is not indefinitely extensible. The word ‘‘ordinal’’, however, remains our paradigm case of an indefinitely extensible notion. According to Williamson, no one, whether generality relativist or generality absolutist, can insist on a fixed, language-independent meaning of the word ‘‘ordinal’’ or the phrase ‘‘well-ordering’’, a meaning that encompasses everything we might ever mean by the term. There is no language-independent concept or extension for what we might call an ‘‘ordinal-like entity’’, a term that covers every possible well-order-type encompassed into a von Neumann-style set. There is similarly no language-independent concept for ‘‘pure-set-like entity’’.

Of course, the relativist is not happy with the asymmetry between ‘‘everything’’ and ‘‘ordinal’’ or ‘‘ordinal-like entity’’.6 Both camps hold that ‘‘pure set’’ and ‘‘ordinal’’ are open-ended, changing as we reflect on the language’s expressive resources and continue previous operations. The relativist suggests that we should similarly regard locutions like ‘‘exists’’, ‘‘object’’, ‘‘entity’’, and ‘‘is self-identical’’ to be indefinitely extensible. If we try to conceive of an absolute plurality of existing things, we have the mathematical-linguistic capacity to consider the collection of just those things, and on it goes from there. The relativist might add that wholes inherit the indefinite extensibility of their parts. Every ordinal and every pure set is an entity. Since the former are indefinitely extensible, the word ‘‘entity’’ is just as indefinitely extensible as the mathematical notions. In response, the Williamson-style absolutist claims that this just begs the question.
His view just is that ‘‘everything’’ is not indefinitely extensible, and so there is an asymmetry between ‘‘everything’’ and ‘‘ordinal’’, like it or not. Consider the second language L2. Cardinality considerations (after a fashion) entail that most of the ordinals2 cannot be pure sets1. After all, V2 has a powerset2, and by the axiom of choice, this pure set2 can be well-ordered2 and has an ordinal2. And this ordinal has a powerset which can be well-ordered and so has an ordinal2. Powerset, replacement, and union take us higher and higher. So what are these ordinals2 in the original language L1? Since most of them are not pure sets, they (or something in their transitive closures under membership1) are urelements—we should say urelements1. It follows that on this combination of views, Vann McGee’s [1997] urelement set axiom, stating that the urelements form a set, fails and fails badly in any language which has a set theory like ZFC and a quantifier ranging over absolutely everything. The word ‘‘everything’’ in L1 must apply to the ordinals2, the ordinals3, the ordinals4, etc. All of those are urelements in L1, and all but a few of those are urelements in L2, etc. I might add that it also follows that in L1, it is not the case that each proper class (so to speak) is equinumerous to every other. The ordinals3 are much more than the ordinals2, and these are much more than the ordinals1, but all three are proper classes in L1. The point is general. In any language that we happen to be

speaking, or in any language that we can speak, the pure sets and the ordinals are a tiny, tiny portion of the vast, wild, and wonderful world of absolutely everything. Well, I do not have a problem with this, if the generality absolutist does not. Set theories with urelements, where the urelements form a proper class (so to speak) are not well developed.7 The program might put a damper on the use of the pure iterative hierarchy, V, as the realm for model-theoretic semantics. There are interpretations of the language that are not isomorphic to a pure set, since they are (much) larger than any pure set. For model theory, V is not big enough—not by a long shot.8
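
The cardinality reasoning from earlier in this section can be sketched as follows. Subscripts mark the language in which a term is used. The sketch assumes that the pure sets$_1$ form a set$_2$, here called $A$. The symbols $A$, $P_2$, $|\cdot|_2$, and $\beta$ are shorthand introduced for illustration only: $P_2$ stands for powerset$_2$, and $|\cdot|_2$ for the least ordinal$_2$ that choice assigns to a set$_2$.

```latex
\begin{align*}
& |A|_2 < |P_2(A)|_2 < |P_2(P_2(A))|_2 < \cdots
  && \text{(Cantor's theorem, applied in } L_2\text{)} \\
& \beta := |P_2(A)|_2 = \{\gamma : \gamma \text{ is an ordinal}_2,\ \gamma < \beta\}
  && \text{(von Neumann ordinals)} \\
& \text{there is no injection from } \beta \text{ into } A
  && \text{(since } |A|_2 < \beta\text{)}
\end{align*}
```

On these assumptions, the ordinals$_2$ below $\beta$ already outnumber the pure sets$_1$, and each further powerset only widens the gap.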

4. Zermelo’s better idea

Let me turn to a different suggestion for handling notions like ‘‘ordinal’’, ‘‘cardinal’’, and ‘‘set’’. It is based on the original Russell-Dummett notion of indefinite extensibility, where the language is kept fixed. Words do not shift in meaning. Zermelo [1930] presents a version of second-order ZFC with urelements, in pretty much its contemporary form. If there are no urelements, then each model of second-order ZFC is isomorphic to a rank $V_\kappa$, in which $\kappa$ is a strong inaccessible. In what follows, I use ‘‘inaccessible’’ for ‘‘strong inaccessible’’. Zermelo [1930, 1233] proposes an axiom stating the existence of ‘‘an unbounded sequence’’ of models of the theory, each larger than its predecessors. Each such model has subsets (like the collection of ordinals in the model) which are not members of the structure. Within the given model, these subsets are proper classes, and act as indefinitely extensible properties. However,

[w]hat appears as an ‘‘ultra-finite non- or super-set’’ in one model is, in the succeeding model, a perfectly good, valid set with both a cardinal number and an ordinal type…To the unbounded series of Cantor ordinals there corresponds a similarly unbounded…series of essentially different set-theoretic models.
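
For reference, the standard definition of a strong inaccessible, and the categoricity fact cited before the quotation, can be sketched as follows. The notation ($\operatorname{cf}$ for cofinality, $\mathrm{ZFC}^2$ for second-order ZFC) is introduced here and is not Zermelo’s, and the second display is restricted to the case with no urelements.

```latex
% Requires amsmath for \operatorname and \text.
% kappa is (strongly) inaccessible iff it is uncountable, regular, and a strong limit.
\[
\kappa \text{ is inaccessible} \iff
\kappa > \aleph_0
\;\wedge\; \operatorname{cf}(\kappa) = \kappa
\;\wedge\; \forall \lambda < \kappa \,\bigl(2^{\lambda} < \kappa\bigr).
\]
% Categoricity up to height, with no urelements:
\[
M \models \mathrm{ZFC}^2 \;\Longrightarrow\;
M \cong (V_\kappa, \in) \text{ for some inaccessible } \kappa.
\]
```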

‘‘Model’’ might not be the best word here. Zermelo refers to what set-theorists call ‘‘intended models’’ or ‘‘standard models’’. The structures he has in mind are inaccessible ranks in an iterative hierarchy with urelements. His proposed axiom of extendibility entails that the inaccessibles are unbounded in the universe. In present terms, then, Zermelo’s proposed axiom is that the series of inaccessible cardinals is itself indefinitely extensible. Each inaccessible is a Definite collection, but any inaccessible, or indeed any set of inaccessibles, gives rise to further, larger inaccessible sets, cardinals, and ordinals. So there is no set of all inaccessible cardinals, or all such ranks. Parsons [1977] and Hellman [1989, Chapter 2], [2002] provide accounts of set theory similar to Zermelo’s, but they think in terms of possible structures (or

possible extensions of the universe). One can follow Hellman [2002] and interpret Zermelo that way, by inserting boxes and diamonds into his text at crucial places.9 However, Zermelo’s own language seems to take the talk of inaccessible ranks (‘‘models of set theory’’) at face value. As noted, Zermelo says that for each inaccessible rank, there is a larger. In contrast, Hellman [1989, 72] makes a modal assertion:

(MOD-EXT) $\Box\forall X\forall f\,[(\wedge ZF^2)^X[\in/f] \rightarrow \Diamond\exists Y\exists g\,((\wedge ZF^2)^Y[\in/g] \mathbin{\&} (X,f) < (Y,g))]$,

where ‘‘$\wedge ZF^2$’’ is the conjunction of the axioms of second-order set theory; ‘‘$(\wedge ZF^2)^X[\in/f]$’’ is the restriction of those axioms to the monadic, second-order variable $X$, substituting the binary relation variable $f$ for the membership symbol ‘‘$\in$’’; and $(X,f) < (Y,g)$ says that $\forall x(Xx \rightarrow Yx)$ and that $f$ is the restriction of $g$ to $X$. So (MOD-EXT) is a sentence in a second-order language, with no non-logical terminology. It asserts that, necessarily, for every model of second-order set theory (and so for every inaccessible rank), there could be another model such that $X$ is a proper subclass of $Y$ and $f$ is the restriction of $g$ to $X$. The Parsons-Hellman program is an attractive resolution of the paradoxes for a nominalist. I presume that there is no serious problem with unrestricted quantification—or at least none motivated by the paradoxes—for someone who does not believe in the existence of abstract objects. For a nominalist, the unrestricted quantifiers do not range over any ordinals, or any other abstract objects, since there aren’t any such things (or so says the nominalist). Hellman says instead that such things might exist, and he provides a compelling modal theory to support this. I suggest that the main problem for this approach is in making sense of the modality involved (see Shapiro [1993]), but considering that would take us too far afield. Let us stick to Zermelo’s own program, with the language taken literally, without modal terminology. 
The main ideas are not uncommon among set theorists (or at least those with realist tendencies). The program represents a breathtaking extension of the old Aristotelian dichotomy. Each inaccessible is an actual infinity, but the ‘‘collection’’ of all inaccessibles is only a potential infinity. The ‘‘process’’ of generating more inaccessibles never ‘‘finishes’’. To be sure, this talk stretches the notion of ‘‘process’’ beyond recognition (see Parsons [1977]). Zermelo [1930, 1233] writes:10 Scientific reactionaries and anti-mathematicians have so eagerly and lovingly appealed to the ‘‘ultra-finite antinomies’’ in their struggle against set theory. But these are only apparent ‘‘contradictions’’, and depend solely on confusing set theory itself…with individual models representing it…The two polar opposite tendencies of the thinking spirit, the idea of creative advance and that of collection and completion, ideas which also lie behind the Kantian ‘‘antinomies’’, find their symbolic reconciliation in the transfinite number series based on the

concept of well-ordering. This series reaches no true completion in its unrestricted advance, but possesses only relative stopping-points, just those [strong inaccessibles] which separate the higher model types from the lower. Thus the set-theoretic ‘‘antinomies’’, when correctly understood, do not lead to a cramping and mutilation of mathematical science, but rather to an, as yet, unsurveyable unfolding and enriching of that science.

In present terms, Zermelo claims that the proper classes of a given inaccessible rank become sets ‘‘in the succeeding model’’—in the next inaccessible rank. This may not be correct, depending on how many urelements there are, and depending on what ‘‘succeeding’’ means. Let $\lambda$ be the third inaccessible, and suppose that there are exactly $\lambda$-many urelements. In Zermelo’s ‘‘canonical’’ iteration, the ‘‘first’’ model of set theory would be the first inaccessible rank. In that structure, the urelements are a proper class. The urelements are also a proper class in the second and third inaccessible rank. Only at the fourth inaccessible rank do the urelements become a set. The proofs of the main theorems in Zermelo [1930] indicate that he was aware of this. So perhaps we need to be careful about what ‘‘succeeding’’ means: if M is an inaccessible rank in Zermelo’s (canonical) hierarchy, then define ‘‘the succeeding model’’ of M to be the smallest inaccessible rank in which the universe of M is a set. Of course, we would then need an axiom to the effect that each inaccessible rank has a succeeding model. Zermelo’s key assertion, the extendibility principle, is that the proper classes of a given inaccessible rank become sets in the succeeding model. This amounts to a thesis that the proper classes of a given rank become sets in some later model. Moreover, his ‘‘first development theorem’’ manipulates the collection of urelements using ordinary set-theoretic constructions, such as replacement and union. This presupposes, or seems to presuppose, that for each inaccessible rank, there is a ‘‘succeeding model’’ in the hierarchy, in the foregoing sense of ‘‘succeeding’’. The extendibility principle is thus an analogue of McGee’s urelement set axiom. It entails that there is a model whose rank is a cardinal $\kappa$ and which contains a function whose domain is a member of $V_\kappa$ and whose range includes every urelement. 
By replacement and separation, the corresponding rank contains a set containing all and only the urelements. Zermelo does explicitly envision inaccessible ranks in which the urelement set axiom is false (e.g., p. 1227), but the extendibility principle, as interpreted here, is true ‘‘eventually’’. This version of the urelement set axiom is the crucial item separating the Zermelo program from Williamson’s generality absolutism. According to Williamson, the language we are speaking at the moment (and any language we will be speaking at any foreseeable future moment) contains a phrase, ‘‘absolutely everything’’, whose extension is indeed absolutely everything. As we saw, if one accepts Williamson’s account of indefinite extensibility, this ‘‘absolutely everything’’ includes the sets and ordinals in languages obtained

from this language by Burali-Forti reflection (and any other way we have of talking about new things under new names and new concepts). To speak roughly, for Williamson, ‘‘absolutely everything’’ is a proper class par excellence. There can be no language which contains a word ‘‘set’’ governed by the axioms of ZFC (with urelements) and which contains a set of absolutely everything. Such a set would have no powerset. We would encounter genuine contradiction if we did a Burali-Forti reflection on that language, producing a successor language. So for Williamson, it is decidedly not true that the proper classes (so to speak) in the present language become sets in some future language or some successor to that language. They can never form a set (in the sense of ZFC). To relate the Zermelo program to the present generality absolutism, we focus attention on Zermelo’s own language, the language in which the program is described and the lovely theorems proved. What are we to make of Zermelo’s own talk of ‘‘models’’, ‘‘normal domains’’ (i.e., inaccessible ranks), ‘‘order types’’, and the like? It seems clear that the Zermelo and Williamson programs agree that there is, or can be, unrestricted first-order quantification, or at least first-order quantification over all inaccessible ranks (i.e., all standard models of set theory). In other words, there can be unrestricted first-order quantification over all sets, which amounts to unrestricted first-order quantification over all objects.11 For what it is worth, I do not see a problem with this quantification. I have realist tendencies, and would like to go as far as I can with them. The Zermelo program invokes bound variables ranging over an indefinitely extensible notion (i.e., ‘‘standard model of second-order ZFC’’ or ‘‘inaccessible rank’’). Dummett claims that such locutions must be interpreted with Heyting semantics, and sanction only intuitionistic logic. We need not broach that issue here. 
The more interesting question, I think, concerns unrestricted second-order quantifiers. Like Williamson, Richard Cartwright [1994] argues for unrestricted first-order quantification, but unlike Williamson, Cartwright does not say anything about higher-order languages.12 Maybe we can get by without variables that range over unrestricted pluralities, or whatever one wants to call them, in the Zermelo program. To be frank, I am not sure. Within each inaccessible rank M, the second-order variables range over proper classes in M, but with Zermelo’s extendibility principle (i.e., his version of McGee’s urelement set axiom), the ‘‘proper classes’’ of M are all sets in a later inaccessible rank $M'$ (and in all subsequent models beyond that). So the second-order quantifiers in M can be given their ordinary set-theoretic treatment in $M'$. That is, the second-order quantifiers have the usual sort of range in $M'$. So the later structures have the wherewithal to give the semantics for previous models. So within each rank, the second-order quantifiers are not really unrestricted, just as Zermelo claims. At the outset, Zermelo writes that we do apply ordinary set-theoretic concepts to the realm of inaccessibles:

…we call a ‘‘normal domain’’ a domain consisting of ‘‘sets’’ and ‘‘urelements’’ which satisfies the [ZF] system with regard to the ‘‘basic relation’’ $a \in b$. We will treat ‘‘domains’’ of this kind, their ‘‘elements’’, their ‘‘subdomains’’, their ‘‘sums’’ and ‘‘intersections’’ exactly like sets, and thus according to the general set-theoretic concepts and axioms, for there is no means of distinguishing them from sets in any way which essentially matters. However, we will always denote them as ‘‘domains’’ and not as ‘‘sets’’ in order to distinguish them from the ‘‘sets’’ which are the elements of the domain in question. (second emphasis mine)

The ‘‘set-theoretic axioms’’ include separation and replacement, both of which are second-order in Zermelo’s formulation. So if we are to take this talk literally, we will have to countenance unrestricted second-order quantification. But if we do recognize such quantifiers, what do they range over? Absolutely proper classes? We are in danger of giving back all the gains made by the program. For what it is worth, the axiom of separation is not a problem. Suppose that x is a set or ‘‘domain’’ and we wish to use separation on x using some property or formula. By extendibility, x is a member of an inaccessible rank M. We can thus apply separation on x, using that property or formula, in the ‘‘domain’’ of M. This can be carried out within any model larger than M. This generalizes a bit. For the most part, the Zermelo plan—and his text—only requires second-order sentences that are restricted to a particular rank. With the extendibility principle in place, one need not invoke unrestricted second-order quantification in stating the plan or proving the categoricity theorems. For the most part. The axiom of replacement is a tougher nut to crack. Stay tuned. We can get a bit beyond the restriction by invoking a Russell-style systematic ambiguity. One can state that a given higher-order sentence is true in each inaccessible rank of set theory. For example, we note that each inaccessible rank satisfies the separation and replacement axioms. A statement like that can be interpreted as a single, unrestricted first-order sentence in the background language that Zermelo uses. This is not to say that unrestricted higher-order quantification has no place. I’ll close with some brief, and tentative, remarks on the desirability of unrestricted second-order quantification. 
Williamson argues that we need something like unrestricted second-order quantification to give the semantics and define the consequence relation for ordinary, first-order logic, when unrestricted quantifiers are allowed (see also Rayo and Williamson [2004]). In particular, the straightforward, Tarskian [1935] definition of logical consequence invokes a variable ranging over all interpretations. Thus, to define consequence this way, we need some way of specifying ‘‘interpretations’’ or ‘‘extensions’’ of the predicate letters. Some of these ‘‘interpretations’’ and ‘‘extensions’’ are proper classes, such as ‘‘all ordinals’’ or ‘‘all pure sets’’. Perhaps, but to return to the Rolling Stones, we might find that we get what we need—thanks to the completeness theorem. A well-known

argument, due to Kreisel [1967], provides some confidence that the ordinary conception of logical consequence, which is restricted to models whose domain is a set, gets it right at least for first-order languages. Admittedly, this is not a completely comfortable resolution. If we do not have unrestricted second-order quantification, or something like that, we may not be able to state the correctness of the ordinary conception of logical consequence for languages with unrestricted quantification. In other words, we can’t say what Kreisel’s argument proves. The ‘‘informal rigor’’ may be too informal. McGee [1997] provides another interesting application of unrestricted second-order quantification. Let M1 and M2 be any interpretations that satisfy second-order ZFC and the urelement set axiom, and assume that the universe of M1 is equinumerous with the universe of M2. McGee shows that the pure sets of M1 are isomorphic to the pure sets of M2. This result is part of an argument for the determinacy of the language of pure mathematics. The relevant cases for that argument are those in which the quantifiers of both M1 and M2 range over absolutely everything. McGee concludes that it might be indeterminate what singular terms like ‘‘$\pi$’’, ‘‘$\omega$’’, and ‘‘the natural numbers’’ refer to, but his result entails that every sentence of pure mathematics has a unique and determinate truth value. In other words, there is indeterminacy of reference but no indeterminacy of truth value. McGee’s conclusion, of course, turns on the fact that pure mathematics can be interpreted in the pure iterative hierarchy, and that isomorphic structures are equivalent. Notice, incidentally, that McGee’s program for determinacy is not available to an advocate of the Williamson program, if the latter includes Williamson’s account of indefinite extensibility. As we saw, the urelement set axiom fails on that program (miserably). 
McGee’s result is not available on the Zermelo program either, unless unrestricted second-order quantification is allowed. To get McGee’s philosophical conclusion, one has to apply his categoricity theorem to ‘‘interpretations’’ that are the ‘‘size’’ of the universe, the entire iterative hierarchy. Restricted second-order quantification will not allow this. Even if unrestricted second-order quantification is allowed, however, Zermelo himself seems to demur from fixing truth values the way that McGee suggests. Zermelo proposes a ‘‘general hypothesis that every categorically determined domain can also be interpreted as a set in some way, i.e., can appear as an element of’’ an inaccessible rank (p. 1232).13 This principle is inconsistent with a categorical characterization of the entire iterative hierarchy, and such a characterization is a key element in McGee’s framework. Nevertheless, McGee’s categoricity theorem is a piece of mathematics that neither Zermelo, nor anyone else, is in a position to challenge. However, McGee’s philosophical interpretation of his result depends on the presence of a quantifier which, as a matter of logic, ranges over everything (in all interpretations). I suspect that Zermelo would balk at this. One cannot set the range of quantifiers as a matter of logic.
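
McGee’s categoricity theorem, as described above, can be put schematically as follows. The abbreviations are introduced here for illustration and are not McGee’s: $\mathrm{ZFC}^2$ for second-order ZFC, $\mathrm{USA}$ for the urelement set axiom, $|M|$ for the size of the universe of $M$, and $\mathrm{Pure}(M)$ for the pure sets of $M$.

```latex
% Requires amsmath for \bigl, \bigr and \mathrm in display math.
\[
\bigl(M_1 \models \mathrm{ZFC}^2 + \mathrm{USA}\bigr)
\;\wedge\;
\bigl(M_2 \models \mathrm{ZFC}^2 + \mathrm{USA}\bigr)
\;\wedge\;
|M_1| = |M_2|
\;\Longrightarrow\;
\mathrm{Pure}(M_1) \cong \mathrm{Pure}(M_2).
\]
```

The philosophical application takes the universes of both $M_1$ and $M_2$ to be absolutely everything, and that is the step at which unrestricted second-order quantification is needed.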

An advocate of the Zermelo program can interpret McGee’s theorem as a systematic ambiguity, applying to each model of set theory. Let P, Q be any two models of set theory whose universes are equinumerous. Let M1 be an interpretation within P of second-order ZFC plus the urelement set axiom in which the first-order quantifiers range over all of P, and let M2 be an interpretation of second-order ZFC plus the urelement set axiom within Q in which the first-order quantifiers range over all of Q. Then the pure sets of M1 are isomorphic to the pure sets of M2. However, this does not establish the determinacy of mathematical language. It only entails that each sentence of pure mathematics has the same truth value no matter how it is interpreted within P (or within Q). Indeed, let P, $P'$ be two models of set theory whose universes are not equinumerous. Then there may be sentences that are true in the indicated models of ZFC in P but false in the intended models of ZFC in $P'$. Consider, for example, a sentence stating that there is a largest inaccessible. This is true in some models in the Zermelo hierarchy, but false in others (see also Hellman [2002]). One area that seems to require, or at least strongly suggest, full, second-order replacement is Zermelo’s extendibility principle. Informally, the principle is that the inaccessibles are unbounded in the universe. As Zermelo puts it, to each inaccessible rank, there is a higher inaccessible rank (with the same urelements). It follows that for each inaccessible, there is a greater inaccessible. Suppose that the inaccessibles form an $\omega$-sequence: $\kappa_0, \kappa_1, \ldots$ One would think that the informal extendibility principle would yield an inaccessible $\kappa_\omega$ greater than all of those. However, the existence of such an inaccessible does not follow from the extendibility principle alone, at least as it is formulated so far. 
If the universe consists of the union of the ranks corresponding to $\kappa_0, \kappa_1, \ldots$, then the inaccessibles are unbounded in the universe, and so we cannot derive the existence of any more. The replacement axiom, applied to the universe, gives us what we want here. The enumeration $\kappa_0, \kappa_1, \ldots$ amounts to a function from $\omega$ to the inaccessibles. Replacement entails that there is a set whose members are the $\kappa_0, \kappa_1, \ldots$ By the extendibility principle, this set $\{\kappa_0, \kappa_1, \ldots\}$, and its union, is a member of an inaccessible rank. This amounts to the existence of $\kappa_\omega$. Extendibility would then yield the existence of $\kappa_{\omega+1}, \kappa_{\omega+2}, \ldots$ Then replacement would yield the existence of $\kappa_{2\omega}$, etc. One way to avoid unrestricted second-order variables would be to invoke a first-order replacement scheme, the move typically made in first-order set theory. That is, we take, as axioms, each formula obtained from the replacement axiom by substituting the second-order variable with a formula from the first-order language. This ploy falls prey to the compelling arguments against schemes in Shapiro [1991, Chapter 5]. What do the instances of the scheme have in common? Do we have to justify them individually? If we are going to adopt replacement, the honest move is to go for the full, second-order version.14
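
The step from a countable sequence of inaccessibles to a further inaccessible can be laid out as a short derivation. This is only a sketch of the reasoning in the paragraph above. It writes $\kappa_n$ for the $n$-th inaccessible in the sequence, and the function symbol $F$ and the set name $S$ are introduced here for illustration.

```latex
\begin{align*}
& F : \omega \to \{\text{inaccessibles}\}, \quad F(n) = \kappa_n
  && \text{(the given enumeration)} \\
& S = \{\kappa_n : n < \omega\} \text{ is a set}
  && \text{(replacement, applied to the universe)} \\
& S \in V_\kappa \text{ for some inaccessible } \kappa
  && \text{(extendibility)} \\
& \kappa_n < \kappa \text{ for all } n < \omega, \text{ and } \sup S < \kappa
  && \text{(} V_\kappa \text{ is transitive and closed under union)} \\
& \kappa_\omega := \min\{\kappa : \kappa \text{ inaccessible},\ \kappa > \sup S\}
  && \text{(the ordinals are well-ordered)}
\end{align*}
```

Without the second line, the universe could simply be the union of the ranks $V_{\kappa_n}$, and the extendibility step would have nothing to apply to.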

Recall Zermelo’s ‘‘general hypothesis’’ that every ‘‘categorically determined domain can also be interpreted as a set in some way’’. He applies it to the issue at hand here: from each infinite sequence of different [inaccessible ranks] with a common basis [i.e., with the same urelements], which are such that of any two one always contains the other as a canonical segment, there arises…a categorically determined domain of sets which again can be extended to [an inaccessible rank]. Thus, to each categorically determined totality of [inaccessible cardinals], there follows a greater such number, and the series of ‘‘all’’ [inaccessible cardinals] is unbounded in the same way as the number series itself. Thus to each transfinite index there corresponds in a one-to-one fashion a determinate [inaccessible cardinal]. (pp. 1232–1233)

The last sentence in this passage provides a work-around for our problem concerning replacement and second-order logic. Zermelo proposes ‘‘the existence of an unbounded sequence of [inaccessible ranks] as a new axiom of ‘meta-set theory’.’’ In effect, the principle states that for each ordinal $\alpha$, there is a unique inaccessible cardinal $\kappa_\alpha$. This stronger extendibility principle is first-order: it states that the inaccessibles are isomorphic to the ordinals. To be sure, our work-around is not a replacement for the full replacement axiom (pardon the pun). Suppose, for example, that there is an $\omega$-sequence of Mahlo cardinals. We would need replacement to get the existence of a set that contains the members of this sequence. But at least we can state Zermelo’s extendibility principle, with its intended strength, in the relevant first-order language.15 In any case, the Zermelo program does not, by itself, rule out unrestricted second-order quantification, and I have no desire to do so either, just because I am at a loss to see how to understand it. Let the flower bloom, if it can. There is, however, one consideration that might militate against unrestricted second-order quantification. We are not quite finished with the Burali-Forti paradox.16 Consider, again, the language used in the Zermelo program, the language in which we speak about the unbounded models of set theory. In this language, there is a first-order formula that says that a given set is a von Neumann ordinal (in one of the inaccessible ranks). The statement that a set is well-ordered under membership is a straightforward, first-order formula, and a pure set is a von Neumann ordinal if it is transitive and well-ordered under membership. Or, to invoke standard theorems, a set is an ordinal if it is transitive and all of its members are transitive. So let $\Omega(x)$ be the property of being a von Neumann ordinal. 
This is an absolutely proper class, in the sense that there is no model in the Zermelo hierarchy that contains all of the $\Omega$’s (as per the usual Burali-Forti reasoning). But we note that the $\Omega$’s are themselves well-ordered under membership. A fortiori, there is no model of set theory with an ordinal of order-type $\Omega$. Well,

that’s life. As noted in §2 above, however, the object language of the Zermelo program has the wherewithal to define relations characterizing well-orderings of type $\Omega+1$, $2\Omega$, $\Omega^\Omega$, etc. Can we do transfinite recursions and inductions as long as those ‘‘well-orderings’’? Well, why not? Nevertheless, we cannot, on pain of contradiction, allow models that contain ordinals isomorphic to these well-orderings. Zermelo’s claim that the proper classes become sets in other, larger ranks does not go that far. It cannot apply to properties definable in the very language he is speaking when he says things like this. We have here a strengthened Burali-Forti paradox. For what it is worth, the reasoning can be resisted if we reject unrestricted higher-order quantification outright. To get the extended Burali-Forti reasoning started, one notes that the $\Omega$’s are well-ordered under membership. This cannot be expressed in the first-order language. Although we can state that a given set is well-ordered under membership in the indicated first-order language, the $\Omega$’s do not form a set (in any rank). That’s the point. Let $X$ be a monadic second-order variable, and let $R$ be a binary relation variable. Then there is a straightforward second-order formula, with no non-logical terminology, that states that the $X$’s are well-ordered under $R$:

$\forall x\,\neg Rxx \mathbin{\&} \forall x\forall y\forall z((Rxy \mathbin{\&} Ryz) \rightarrow Rxz) \mathbin{\&} \forall Y((\exists x\,Yx \mathbin{\&} \forall x(Yx \rightarrow Xx)) \rightarrow \exists y(Yy \mathbin{\&} \forall z(Yz \rightarrow (z = y \vee Ryz))))$.

However, compactness considerations entail that there is no adequate formulation of the general notion of well-ordering in a first-order language (see Shapiro [1991, Chapter 5, §5.1.3]). So if the object language used to carry out the Zermelo program is first-order, then the reasoning behind the extended Burali-Forti paradox cannot get started. We cannot even state that the property $\Omega$ is well-ordered under membership. Intuitively, to say that $\Omega$ is well-ordered, we have to say something about all of its sub-properties—namely, that each has a least element. This is exactly what we cannot do. To be sure, one can easily define the relations corresponding to $\Omega+1$, $2\Omega$, etc. Each of these can be characterized with a first-order formula with two free variables. Also, one can give explicit definitions that may amount to transfinite recursions over these relations. But in a first-order language, we cannot state, much less prove, that the relations are well-orderings. Without this, we cannot do transfinite inductions over those orderings, and we cannot show that the aforementioned explicit definitions are well-defined. Admittedly, this feels like a cheat, especially in light of my longstanding defense of second-order logic. Intuitively, it seems manifest that $\Omega$ is well-ordered, or, better, that the $\Omega$’s are well-ordered. Giving up that intuition is a bitter pill to swallow, at least for me. Thus my ambivalence. I might add that the straw we are trying to grasp is flimsy. Since the ordinals (and $\Omega$ itself) are all transitive, the only way a collection (or class or property) of

them could fail to be well-ordered would be for it to fail to be well-founded. Suppose that we did have a descending $\omega$-sequence of ordinals (i.e., of $\Omega$’s): $\alpha_1, \alpha_2, \alpha_3, \ldots$, where $\alpha_2 \in \alpha_1$, $\alpha_3 \in \alpha_2$, etc. By the extendibility principle, $\alpha_1$ has to be a member of an inaccessible rank $V_\kappa$. By transitivity, $\alpha_2, \alpha_3, \ldots$ are all in $V_\kappa$. So $V_\kappa$ itself violates foundation. This is a contradiction, since by hypothesis, $V_\kappa$ satisfies full, second-order ZFC. So using the first-order resources of Zermelo’s own language, we can rule out the possibility that the $\Omega$’s are not well-founded, and thus we can rule out the possibility that they are not well-ordered. If we are not allowed unrestricted second-order quantification, all that we cannot do, it seems, is say that the ordinals are well-ordered, even though it is quite clear that they are, and that, intuitively, one can prove that they are (if only we could state the theorem). The inability to formulate an unrestricted notion of well-ordering does seem to prevent the long transfinite inductions—technically. But this runs against intuitions. Well, so does the Burali-Forti reasoning. To sum up, if unrestricted higher-order quantification can be made coherent, then I see no reason why it should not be invoked, and we will have to live with the strengthened Burali-Forti phenomenon. But that is a big ‘‘if’’. Clearly, there are tradeoffs involved in the use of unrestricted higher-order quantification. It seems that Queen is frustrated, and the older and wiser Rolling Stones are right. We can’t always get what we want. Thanks to the genius of the early set theorists, however, it looks like we can get what we need. It is a fact of life in philosophy that some intuitions must be given up. The trick is to figure out which ones those are. The issue of unrestricted higher-order quantification is internal to the generality absolutist camp. 
I submit that whether there is unrestricted second-order quantification or not, the prospects for generality absolutism are good.

Notes

1. Thanks to Geoffrey Hellman here.

2. Boolos continues, ‘‘If one admits that there are proper classes at all, oughtn’t one to take seriously the possibility of an iteratively generated hierarchy of collection-theoretic universes in which the sets which ZF recognizes play the role of ground-floor objects? I can’t believe that any such view of the nature of ‘$\in$’ can possibly be correct. Are the reasons for which one believes in [proper] classes really strong enough to make one believe in the possibility of such a hierarchy?’’

3. The argument might require a (global) choice principle. It depends on the exact formulation of the notion of indefinite extensibility. Suppose that we define a property P to be indefinitely extensible if, for every Definite collection C of P’s, there is an object c such that Pc but not Cc. Then choice is needed in the above argument. Recall, however, that Dummett writes that each indefinitely extensible concept has a ‘‘principle of extension’’ that takes any definite totality t of objects each of which has P, and produces an object that also has P, but is not in t. Similarly, recall the clause in Russell’s definition: ‘‘there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question’’. If these assertions are taken (more or less) literally, then the ability to ‘‘produce’’ or ‘‘define’’ the new term, given any Definite collection of them, is part of the notion of indefinite extensibility. If so, then choice is not needed in the above argument. The ‘‘principle of extension’’ does the choosing.

4. Thanks to Timothy Bays here.

5. This was confirmed in conversation with a few notable generality absolutists.

6. I am indebted to Geoffrey Hellman here. Kit Fine made a similar suggestion.

7. One notable exception is Menzel [1986], who proposes a set theory in which the urelements do not form a set. I do not think that particular system is of much help here, however. Menzel suggests that there is a single, fixed range of ordinals, or well-ordering types. These are called ‘‘real ordinals’’ and are not identified with pure sets like the von Neumann ordinals. According to Menzel’s set theory, there is a set of all real ordinals, but no set of all von Neumann ordinals, even though the real ordinals are equinumerous with the von Neumann ordinals. The replacement principle is restricted to functions whose domain is a pure set. There is also a powerset of the set of all real ordinals, and by Zermelo’s theorem, this powerset can be well-ordered (as can its powerset). So Menzel’s set theory allows for well-ordered sets that are much larger than any ordinal.

8. The absolutist can safely restrict model-theoretic semantics to the pure sets (V) if there is a reflection principle to the effect that if an argument is invalid, then there is a pure-set model in which its premises are true and its conclusion false. If the language is first-order, such a reflection principle is an immediate consequence of the Löwenheim-Skolem theorem. If the language is higher-order, the reflection principle has ramifications concerning large cardinals (see Shapiro [1991, Chapter 6]).

9. Hellman [2002] provides a rational reconstruction of Zermelo’s [1930] program, and does not make exegetical claims concerning the real Zermelo. Hellman’s character is called ‘‘Zermelo*’’, which we might read ‘‘Zermelo superstar’’. Thanks to Hellman for clarification of several matters.

10. I am at a loss to understand the phrase ‘‘as yet’’ in the last sentence of this passage (which is also the last sentence of the article, except for a brief acknowledgment). Are we to look forward to the day when we can ‘‘survey’’ the entire iterative hierarchy? Perhaps the problem is in what Zermelo means by ‘‘unsurveyable’’.

11. Thanks to Timothy Williamson here.

12. In the opening paragraph, Cartwright does use the plural construction, and perhaps plural quantification: ‘‘the natural numbers, the pure sets, and the trees in the garden—all of them, along with any other objects there are—can simultaneously be the values of the variables in a first-order language.’’ I am not sure that he needs locutions like this. Thanks to Timothy Williamson for pointing out this passage.

13. I do not know how to rigorously formulate this ‘‘general hypothesis’’, especially if unrestricted second-order quantification is not allowed.

14. One possibility would be to invoke the schemes themselves (as in Feferman [1991]). This gives us some of the expressive power of second-order languages, but (perhaps) without a need to invoke proper classes or some other ranges to the higher-order variables. The issues would take us too far afield here.

15. The issue concerning unrestricted second-order quantification has an interesting analogue in Hellman’s modal program. Invoking the metaphor of possible worlds, Hellman allows second-order quantification within each world. This is the counterpart of Zermelo’s use of second-order separation and replacement within each inaccessible rank. For Hellman, the ‘‘proper classes’’ in each world are sets in another, larger possible world. Ditto for Zermelo. So for Hellman, there is no world that houses every possible ordinal, or every possible set. So there is no analogue to absolutely proper classes. In a sense, Hellman’s ‘‘quantifier’’ $\Box\forall x$ covers all objects in all worlds. In contrast, if $X$ is monadic, the locution $\Box\forall X$ covers the proper classes of each world, but each of those is a set in another world. There are no such things as what may be called ‘‘absolutely proper classes’’, collections (like $V$ or $\Omega$) that are ‘‘too large’’ to be in any one world. The analogue of absolutely unrestricted second-order quantification is indeed ruled out. Hellman thus has a direct counterpart of the above issue concerning an $\omega$-sequence of inaccessible ranks. In response, he proposes a replacement scheme (Hellman [1989, 78]). Although Hellman does not invoke it, an analogue of Zermelo’s stronger extendibility principle is also available. One can state in the allowed formal language that, necessarily, for each ordinal $\alpha$, it is possible for there to be a sequence of inaccessibles of length $\alpha$.

16. Thanks to Graham Priest here.

References

Boolos, G. [1998], ‘‘Reply to Charles Parsons’ ‘Sets and classes’’’, in G. Boolos, Logic, logic, and logic, Cambridge, Massachusetts, Harvard University Press, 30–36.
Cartwright, Richard L. [1994], ‘‘Speaking of everything’’, Noûs 28, 1–20.
Dummett, M. [1991], Frege: Philosophy of Mathematics, Cambridge, Massachusetts, Harvard University Press.
Dummett, M. [1993], The Seas of Language, Oxford, Oxford University Press.
Feferman, S. [1991], ‘‘Reflections on incompleteness’’, Journal of Symbolic Logic 56, 1–49.
Hellman, G. [1989], Mathematics Without Numbers, Oxford, Oxford University Press.
Hellman, G. [2002], ‘‘Maximality vs. extendability: reflections on structuralism and set theory’’, in D. Malament (editor), Reading Natural Philosophy, La Salle, Illinois, Open Court, 335–361.
Kreisel, G. [1967], ‘‘Informal rigour and completeness proofs’’, Problems in the Philosophy of Mathematics, edited by I. Lakatos, Amsterdam, North Holland, 138–186.
McGee, Vann [1997], ‘‘How we learn mathematical language’’, Philosophical Review 106, 35–68.
Menzel, Christopher [1986], ‘‘On the iterative explanation of the paradoxes’’, Philosophical Studies 49, 37–61.
Parsons, C. [1977], ‘‘What is the iterative conception of set?’’, Logic, Foundations of Mathematics and Computability Theory, edited by R. Butts and J. Hintikka, Dordrecht, Holland, D. Reidel, 335–367; reprinted in P. Benacerraf and H. Putnam (editors), Philosophy of Mathematics, second edition, Cambridge, Cambridge University Press, 1983, 503–529; and in C. Parsons, Mathematics in Philosophy, Ithaca, New York, Cornell University Press, 1983, 268–297.
Priest, G. [2002], Beyond the Limits of Thought, second edition, Oxford, Oxford University Press.
Rayo, A. and T. Williamson [2004], ‘‘A completeness theorem for unrestricted first-order languages’’, J. C. Beall and M. Glanzberg, editors, Liars and Heaps, Oxford, Oxford University Press, forthcoming.
Resnik, M. [1988], ‘‘Second-order logic still wild’’, Journal of Philosophy 85, 75–87.
Russell, B. [1906], ‘‘On some difficulties in the theory of transfinite numbers and order types’’, Proceedings of the London Mathematical Society 4, 29–53; reprinted in Bertrand Russell, Essays in Analysis, London, George Allen and Unwin Ltd., 1973, 135–164.
Schimmerling, Ernest [2001], ‘‘The ABC’s of mice’’, Bulletin of Symbolic Logic 7, 485–503.
Shapiro, S. [1991], Foundations Without Foundationalism: A Case for Second-order Logic, Oxford, Oxford University Press.
Shapiro, S. [1993], ‘‘Modality and ontology’’, Mind 102, 455–481.
Tarski, A. [1935], ‘‘On the concept of logical consequence’’, Logic, Semantics and Metamathematics, by A. Tarski, Oxford, Clarendon Press, 1956, 417–429.
Williamson, T. [1994], Vagueness, London and New York, Routledge Publishing Company.
Williamson, T. [1998], ‘‘Indefinite extensibility’’, in Johannes L. Brandl and Peter Sullivan, editors, New Essays on the Philosophy of Michael Dummett, Grazer Philosophische Studien 55, 1–24.
Williamson, T. [2003], ‘‘Everything’’, this volume.
Zermelo, E. [1930], ‘‘Über Grenzzahlen und Mengenbereiche: Neue Untersuchungen über die Grundlagen der Mengenlehre’’, Fundamenta Mathematicae 16, 29–47; translated as ‘‘On boundary numbers and domains of sets: new investigations in the foundations of set theory’’, in From Kant to Hilbert: A Source Book in the Foundations of Mathematics, Volume 2, edited by William Ewald, Oxford, Oxford University Press, 1996, 1219–1233.

ALL SETS GREAT AND SMALL: AND I DO MEAN ALL

Stewart Shapiro The Ohio State University The University of St. Andrews I want it all. Queen You can’t always get what you want. But if you try sometime, you just might find, you get what you need. Rolling Stones

1. Wither generality?

Timothy Williamson [2003] has made a compelling, prima facie case against the view he calls ‘‘generality relativism’’, the thesis that it is not possible for first-order variables to range over everything at once. He and others have pointed out that one cannot state the relativist position without violating it. For example, the relativist might say, or try to say, that for any quantifier used in a proposition of English, there is something outside of its range. What is the range of the word ‘‘something’’ at the end? Or suppose we ask the relativist if there is some one thing that cannot appear in the range of any bound variable. The likely response would be: ‘‘No. For each object o, it is possible to include o in the range of quantifiers, but one cannot quantify over everything at once.’’ This sentence contains an unrestricted quantifier, or so it seems, pending some clever move from a generality relativist.

Truth be told, I am not particularly interested in whether it is coherent to have bound variables ranging over absolutely everything. When it comes to the world of subatomic physics, for example, who knows if it is best to talk about objects at all, let alone all objects? The same may go for ordinary talk of items with vague boundaries, such as clouds, mountains, and seas. In response to this, I can hear a Quinean generalist protesting.1 The trouble, if there is trouble, lies with predicates like ‘‘particle’’ and ‘‘cloud’’, not with ‘‘exists’’ or, what is the same thing, with quantification. If there are such things as particles and clouds, then they fall within the range of our bound variables, and can do so all at once. However, I am not sure matters are this straightforward. On some views of vagueness, the boundaries of vague terms vary with context. Any context can change and, with this change, new objects might be found within the range of the quantifiers. It may not make sense to have a super-context that includes every context.

These matters of anti-realism and metaphysics can be put aside here. I am concerned with the special case of whether one can have bound variables ranging over all pure sets, or all pure set-like-totalities. And I am interested in whether one can have bound variables ranging over all ordinals and all cardinals, or over all well-ordering-types and cardinality-types. I presume that those are the interesting cases anyway, given the role of the Russell, Cantor, and Burali-Forti paradoxes in the literature on this topic.

Toward the end of his paper, Williamson puts his finger on what I take to be the main semi-formal sticking point for the generality absolutist: how are we to understand second-order quantifiers when the corresponding first-order quantifiers have unrestricted range? Prima facie, second-order quantifiers have a range too, and by Russell’s paradox, this range cannot lie entirely in the range of the first-order variables. So there seems to be something that cannot lie in the range of first-order quantifiers. So first-order quantifiers cannot be completely unrestricted. The same goes for any quantifier at any level, provided that there is a higher level.

Mea culpa. In my book on second-order logic (Shapiro [1991]), I took the second-order variables of set theory to range over proper classes, which I called ‘‘logical sets’’. The lesson of Russell’s paradox, I said, is that in the context of set theory, there are logical sets that do not correspond to any member of the iterative hierarchy. In the meta-theory, or perhaps the meta-meta-theory, or the mathematical English that I used to write the book, I suppose I took classes to be ‘‘things’’. Along with many (but not all) set-theorists, I used singular terms, like ‘‘V’’ and ‘‘Ω’’, that denote proper classes, and I had informal meta-variables ranging over proper classes. Clearly, proper classes are set-like things, having only pure iterative sets as members. So the first-order variables of second-order ZFC do not range over all pure set-like things. But consider George Boolos’s [1998, 35] retort to a similar suggestion: ‘‘Wait a minute! I thought that set theory was supposed to be a theory about all, ‘absolutely’ all, the collections that there were and that ‘set’ was synonymous with ‘collection’.’’2 At the time, I might have responded with the above line from the Rolling Stones: ‘‘You can’t always get what you want’’, perhaps adding that we can get what we need. I see now, if not then, that there is something fishy about claiming that second-order ZFC is the most inclusive theory of pure sets that there is, and then using informal variables ranging over pure set-like things that outrun the first-order variables of ZFC on its intended interpretation.

To solve this problem, Boolos proposed his celebrated plural interpretation of monadic, second-order quantifiers. Williamson raises five points against that resolution, at least four of which apply in the cases that interest me: pure sets, ordinals, and cardinals. I have my own doubts as to whether our independent or pre-theoretic grasp of plural quantifiers is sufficiently determinate to ground second-order theories with infinite, let alone unbounded, domains. Consider a statement of second-order real analysis of the form:

∀X∃Y Φ(X,Y).
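For a statement of this form, the two readings under comparison can be set side by side. This is only a sketch: the doubled variables xx and yy for pluralities, the symbol \prec for ‘‘is one of’’, and the letter S are notational assumptions introduced here, not the author’s own notation.

```latex
\begin{align*}
\text{set-theoretic reading:}\quad
  & \forall X \subseteq \mathbb{R}\ \exists Y \subseteq \mathbb{R}\ \Phi(X, Y) \\
\text{plural reading:}\quad
  & \forall xx\ \exists yy\ \Phi(xx, yy)
  \qquad (xx,\ yy \text{ some real numbers}) \\
\text{what agreement requires:}\quad
  & \forall S \subseteq \mathbb{R}\ \bigl( S \neq \emptyset \rightarrow
    \exists xx\ \forall r\ ( r \prec xx \leftrightarrow r \in S ) \bigr)
\end{align*}
```

The third line is the demand that there be a ‘‘plurality’’ for each (nonempty) set of real numbers; the empty set has to be handled separately on the plural reading, since there are no ‘‘zero reals’’ to serve as a value of xx.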

There is no issue concerning the existence of sets of real numbers, or at least none that is relevant here (and none that moved Boolos himself). So the opening second-order quantifiers can be given both a plural and an ordinary, set-theoretic interpretation. It had better be the case that if we read the quantifiers as plurals, we will get exactly the same truth value, in general, as we would if we understand the quantifiers as ranging over sets of real numbers. In effect, there needs to be a ‘‘plurality’’ (if you will excuse the expression) for each set of real numbers. Does the English plural construction have that determinate a meaning? Of course, the pluralist can always stipulate that she intends it to have such a meaning in cases like that of real analysis—but one needs some set theory to make the stipulation. This move might sustain Michael Resnik’s [1988] and my [1993] complaint that the sophisticated understanding of the plural construction used in justifying second-order logic is mediated by set theory. In any case, the important use of the plural construction is for cases where we would rather not—or cannot—speak of pluralities as things. Second-order Zermelo-Fraenkel set theory is the main case in point (as confirmed in conversation with Boolos). Is there reason to think that the plural construction is sufficiently determinate in such cases? Is there a ‘‘plurality’’ corresponding to each and every proper class? The pluralist is surely not in position to stipulate this, unless she recognizes proper classes as objects. I do not claim to have presented a knock-down objection against the plural rescue. Moreover, there are a number of other proposals for higher-order quantification that need to be digested. Williamson suggests that we can directly engender an understanding of the generality involved in second-order quantification, an understanding that is not mediated by set theory or any other construal of the range of the quantifiers. 
Nevertheless, this direct understanding is supposedly equivalent to the set-theoretic interpretation on domains in which the domain is a set (with a powerset). If the Williamson plan succeeds, and does not beg any questions, then we can wax homophonic in giving truth conditions. Perhaps.

I take it as agreed that the antinomies provide the basic motivation for generality relativism. If the absolutist gets past those, his remaining problematic feature is the use of unrestricted second-order quantification. However, even if first-order languages are too weak, we may not need full second-order languages to do whatever work we want our grand theory to do. Here, I float a compromise, building on the framework proposed in Zermelo [1930], ‘‘Über Grenzzahlen und Mengenbereiche’’ (‘‘On boundary numbers and domains of sets’’). It allows unrestricted, absolute first-order quantification, and it allows at least restricted second-order quantification. To echo the Rolling Stones, extendibility principles give us what we need. Although similar frameworks have been proposed by Charles Parsons and Geoffrey Hellman, they are generality-relativists, after a fashion. Let’s see how much of the cake we can have if we eat it too. We turn first to one of the antinomies.

2. Pesky Burali-Forti and indefinite extensibility

Russell’s ‘‘On some difficulties in the theory of transfinite numbers and order types’’ [1906] begins with an examination of the now standard paradoxes, and concludes:

…the contradictions result from the fact that…there are what we may call self-reproductive processes and classes. That is, there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question. Hence we can never collect all of the terms having the said property into a whole; because, whenever we hope we have them all, the collection which we have immediately proceeds to generate a new term also having the said property.

Citing this passage, Michael Dummett [1993, 441] writes that an ‘‘indefinitely extensible concept is one such that, if we can form a definite conception of a totality all of whose members fall under the concept, we can, by reference to that totality, characterize a larger totality all of whose members fall under it’’ (emphasis mine). According to Dummett, an indefinitely extensible property P has a ‘‘principle of extension’’ that takes any definite totality t of objects each of which has P, and produces an object that also has P, but is not in t (see also Dummett [1991, 316–319]). Let us say that a property P is Definite if it is not indefinitely extensible. Dummett’s remarks won’t do as a definition, since he uses the word ‘‘definite’’ to explain what it is to be indefinitely extensible. Nevertheless, what he says seems correct.

Let us focus on the Burali-Forti paradox. Let O be any Definite collection of ordinal numbers. Let O′ be the collection of all ordinals α such that there is a β∈O and α≤β. That is, α is in O′ if α is smaller than, or equal to, something in O. Since O′ is well-ordered, let γ be its order type. Let γ′ be the order type of O′ ∪ {γ}—the order of O′ with one item added ‘‘at the end’’. Then γ′ is an ordinal number, and γ′ is not a member of O. So the property of being an ordinal is indefinitely extensible. One can, of course, challenge the set-theoretic principles—union, pairing, etc.—used here, but the reasoning does appear natural. As Dummett [1991, 316] puts it,

if we have a clear grasp of any totality of ordinals, we thereby have a conception of what is intuitively an ordinal number greater than any member of that totality. Any [D]efinite totality of ordinals must therefore be so circumscribed as to forswear comprehensiveness, renouncing any claim to cover all that we might intuitively recognise as being an ordinal.
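The Burali-Forti reasoning above can be laid out step by step. The following is a sketch only: it writes O' and \gamma' for the collection and order type formed from a Definite collection O of ordinals, the notation \operatorname{ot} for ‘‘order type of’’ is an assumption made here, and the identification of \gamma' with \gamma + 1 is a gloss that presupposes the usual theory of ordinals.

```latex
\begin{align*}
O' &= \{\, \alpha : \exists \beta \in O\ (\alpha \le \beta) \,\}
   && \text{(everything at or below a member of } O\text{)} \\
\gamma &= \operatorname{ot}(O', <)
   && \text{(} O' \text{ is well-ordered, so it has an order type)} \\
\gamma' &= \operatorname{ot}\bigl(O' \cup \{\gamma\}, <\bigr) = \gamma + 1
   && \text{(one item added at the end)} \\
\beta \in O &\implies \beta \in O' \implies \beta < \gamma < \gamma'
   && \text{(} O' \text{ is an initial segment of order type } \gamma\text{)} \\
&\therefore\ \gamma' \notin O
\end{align*}
```

The fourth line is where the work is done: because O' is closed downward, its members are exactly the ordinals smaller than \gamma, so nothing in O can be as large as \gamma'.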

This is the sort of thing that motivates generality relativism. Russell [1906, 144] wrote that it ‘‘is probable’’ that if P is any property which demonstrably does not have an extension (that obeys extensionality) then ‘‘we can actually construct a series, ordinally similar to the series of all ordinals, composed entirely of terms having the property’’ P. In present terms, Russell’s conjecture is that if P is indefinitely extensible, then there is a one-to-one function from the ordinals into P. Russell does not provide an argument for this, but I think there is one:

Let α be an ordinal and assume that we have a one-to-one function f from the ordinals smaller than α to objects that have the property P. Consider the collection {fβ | β < α}. This is Definite. Since P is indefinitely extensible, there is an object a such that P holds of a, but a is not in this set. Set fα = a.
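The recursion behind this argument can be written out explicitly. In this sketch the letter e is a hypothetical name, introduced here, for whatever supplies the new object (for instance, Dummett’s ‘‘principle of extension’’); without such an operation the choice of a at each stage would need a choice principle.

```latex
\begin{align*}
&\text{Assumption: for every Definite collection } C \text{ of } P\text{'s,}\quad
  P\bigl(e(C)\bigr) \ \text{and}\ e(C) \notin C. \\
&\text{Recursion: } f(\alpha) = e\bigl(\{\, f(\beta) : \beta < \alpha \,\}\bigr)
  \quad \text{for every ordinal } \alpha. \\
&\text{Injectivity: } \beta < \alpha
  \implies f(\beta) \in \{\, f(\delta) : \delta < \alpha \,\}
  \implies f(\alpha) \ne f(\beta).
\end{align*}
```

The recursion is legitimate at stage \alpha only if the collection of earlier values is Definite, which is what the replacement-style principle (a totality equinumerous with an ordinal is Definite) is meant to secure.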

This argument uses transfinite recursion on ordinals and a version of replacement: if a totality t is equinumerous with an ordinal, then t is Definite.3 Both of those seem beyond reproach, but one does need special care in areas like this. It is clear that some intuitive principles have to be dropped. Nevertheless, if the argument (or at least its conclusion) is correct, then ‘‘ordinal’’ is the basic indefinitely extensible notion. In any case, the Burali-Forti paradox is robust. The very definition of well-ordering suffices to generate ever more ordinals, or at least what look like well-ordering types, without using ‘‘external’’ resources like the powerset or set-comprehension principles invoked in the Cantor and Russell paradoxes. Let Ω(x) be the property of being an ordinal. It is, of course, routine to show that Ω is itself a well-ordering (i.e., has the requisite property of relations). That is, the Ω’s are well-ordered. But, alas, Ω has no order type. We are used to that. We can define a relation that is a well-ordering strictly longer than Ω: Let α and β be ordinals. Say that α ≺₁ β if α ≠ 0 and either α

to do transfinite recursions on L, which are also of length Ω. So, in effect, we have a transfinite recursion of length 2Ω. The following passage appears in a survey article called ‘‘The ABC’s of mice’’, by Ernest Schimmerling:

We begin by constructing L level by level. The first ω levels are exactly the hereditarily finite sets, the next ω₁^L levels are exactly the sets that are hereditarily countable in L, and so on. Now we ask ourselves what comes next. (Schimmerling [2001, 486–7])

Next? I thought that in defining L, we were to go through all of the ordinals, i.e., to carry it as far as possible. How can something come next? If there is a ‘‘next’’, we have not gone far enough—we have not gone through the ordinals.4 Schimmerling continues:

For although we have climbed up to the minimal transitive proper class model of ZFC, foundational considerations that fall under the category of large cardinals have tempted us to adopt certain theories that extend ZFC. These extensions are not true in L, for they imply that there exists a non-trivial elementary embedding j: L → L, which is known to fail in L. So how do we continue or revise the construction in a way that buys us the existence of such an embedding? One naïve idea is to continue the construction past all the ordinals and throw in the proper class j at stage Ω or beyond, but this approach leads to some obvious metamathematical problems that we find annoying.

We are to go past all of the ordinals? I can see why the working set theorist would find the meta-mathematical problems annoying. We do need some care when reasoning in this area; the Burali-Forti paradox is annoying. In the words of Graham Priest [2002], we seem to have gone beyond the limits of thought, and I think most of us would rather avoid the drastic measures that Priest recommends. He says, for example, that Ω both is and is not less than Ω+1, the next ordinal after all ordinals. Typically, the ‘‘care’’ here is to replace the long transfinite recursions with codings. Nevertheless, it seems to me that in this case at least, the grand transfinite recursion, of length 2Ω, is as coherent as can be, or at least as coherent as anything else in set theory. Of course, all this talk of ‘‘constructing’’ is only a metaphor. We cannot actually construct anything to stage ω without performing a super-task, and even if we could do super-tasks, they would not take us much beyond ω. The more literal characterization of L is a perfectly good definition by transfinite recursion, and once we have a well-defined predicate, we can use it to do further transfinite recursions. Using variables or schematic letters, we can even do a transfinite recursion along the order type ≺₃ above, of length Ω². And of course, this is not as far as we can go—not by a long shot.

At this point, one might protest, paraphrasing Boolos: Wait a minute! I thought that set theory was supposed to include a theory about all, ‘‘absolutely’’ all, the well-orderings and transfinite recursions that there are and that ‘‘well-ordering-type’’ was synonymous with (or at least coextensive with or isomorphic to) ‘‘ordinal’’.

Here, however, I do not see a clean way out for the generality-absolutist. What is she to make of these long transfinite recursions and inductions? The absolutist can claim that there are no ‘‘objects’’—no ordinals—that correspond to these explicitly definable long well-ordering properties. These properties simply have no order types. This rejection of Ω, 2Ω, and Ω² is just an instance of Boolos’s own rejection of proper classes. But what, then, is wrong with the long transfinite recursions, and the transfinite inductions that go along with them? Well, I guess I should not say ‘‘them’’, since these long well-orderings are not objects. But what should I say? Moreover, the move seems ad hoc. What, exactly, is wrong with the long transfinite recursions and inductions? Why can’t we just introduce such ‘‘ordinals’’, or names for them, by expanding ontology—just as the generality relativist contends? We just introduce a singular term, like ‘‘2Ω’’, that denotes one of these well-orderings. In doing so, we are just giving names to well-orderings that we are capable of understanding and using, and treating those well-orderings as objects. What’s wrong with that?

3. Williamson on indefinite extensibility

In a recent article on Dummett, Williamson [1998] himself gives an intriguing analysis of indefinite extensibility, or at least what goes for that notion. For most of the paper, he focuses on semantic notions like ‘‘name’’ and ‘‘truth’’. His idea is that when we reflect on a paradoxical situation involving a key word, or words, those words might shift in meaning, not just extension. Here is the idea [1998, §6]:

We start with one set of correlative meanings for ‘‘say’’, ‘‘true’’ and ‘‘false’’; we use them to construct a sentence that says nothing in that sense of ‘‘say’’; but reflection on that sentence causes normal speakers to give ‘‘say’’, ‘‘true’’, and ‘‘false’’ a new set of correlative meanings, much like the previous ones except that the sentence in question says something in the new sense of ‘‘say’’; the process can be repeated indefinitely. Normal speakers are not aware of the change, just as they are not aware of many ordinary processes of gradual change. They feel themselves to be going on in the same way, but they are not.

Williamson adds that sentences are to be ‘‘treated as sequences of sounds or marks, which have no intrinsic interpretation, but receive different interpretations in different languages; for these purposes, a change of meaning is ipso facto a change of language’’. So when we do some reflection on a supposedly paradoxical sentence or construction, like the Liar, the whole language changes a little. Some words get new meanings, and the corresponding predicates get new extensions. Williamson shows how there is no foothold for so-called ‘‘extended’’ or ‘‘strengthened’’ versions of the various paradoxes: ‘‘Whatever our present understanding, encountering a paradox can cause us to reach a new understanding. We cannot construct a Strengthened Liar for this approach because we cannot anticipate our future understanding in our present meaning.’’ There is no unequivocal meaning or extension we can give to ‘‘true in some successor of our language obtained by reflection on paradoxical sentences’’. This analysis of paradox is nicely consonant with the account of language and meaning found in Williamson’s work on vagueness (e.g., Williamson [1994]) and other matters. Nevertheless, I do wonder how it fits into the generality absolutism on the table today. Toward the end of the Dummett paper (§7), Williamson briefly turns to set theory:

Isn’t the consistency and coherence of the iterative conception of sets intuitively obvious? Someone might argue otherwise, on the grounds that even the iterative concept of set is indefinitely extensible. In particular, it may be suggested, however large we suppose the universe of sets to be, our ability to enlarge it by adding proper classes shows that we have not really exhausted the full import of the iterative conception, and therefore that that conception is incoherent after all. This argument assumes that, on the iterative conception, proper classes are really sets under another name, so that denying them the name ‘‘set’’ is an ad hoc verbal evasion, inconsistent with the underlying conception. The Burali-Forti paradox can be used likewise against the concept of an ordinal, which the iterative conception needs to index the stages.
The present view has the resources to interpret the iterative conception more charitably, while helping to accommodate the intuitions behind the argument against it. For given any reasonable assignment of meaning to the word ‘‘set’’ we can assign it a more inclusive meaning while feeling that we are going on in the same way, and make correlative changes to the words in an iterative account of sets, to preserve it too. The inconsistency is not in any one meaning we assign the iterative account; it is in the attempt to combine all the different meanings that we could reasonably assign it into a single super-meaning.

The idea, I take it, is that at any given time in our recent history, we have, or may have, assigned a coherent meaning and extension to the words ‘‘pure set’’ and ‘‘ordinal’’ (in the context of iterative set theory). Thereafter, whenever we reflect on the Russell, Cantor, or Burali-Forti paradoxes, and try to talk about proper classes like V and Ω, we silently and subtly assign new meanings to those words. We are then speaking a new language. The process can be repeated indefinitely. To avoid ambiguity, let us call one of the old languages L₁, which contains the words ‘‘pure set₁’’ and ‘‘ordinal₁’’; and let L₂ be a new language obtained by reflection on L₁. The new language L₂ contains the locutions ‘‘pure set₂’’ and ‘‘ordinal₂’’. The key point here is that the change in language results in adding items to the extensions of the corresponding terms. There are ordinals₂ which are much longer than any ordinal₁. There are pure sets₂ that are much larger than any pure set₁. Since both languages are second-order, Zermelo’s theorem applies. The pure sets₁ are isomorphic to a proper initial segment of the pure sets₂. As speakers of the language, we can then reflect on L₂ and produce a third language L₃, with its words ‘‘pure set₃’’ and ‘‘ordinal₃’’. According to this approach, what we cannot do is put everything together, to produce a ‘‘supermeaning’’ for a word like ‘‘set’’, whose extension includes everything we might ever mean by that term as we keep changing our language. There is no term for ‘‘pure set like entity in some successor to our language’’. Fair enough.

What do the expanded extensions for ‘‘set’’ and ‘‘ordinal’’ say about generality absolutism? Consider the word ‘‘everything’’ in the original language L₁. Call the word ‘‘everything₁’’, or the phrase ‘‘absolutely₁ everything₁’’. When speaking the language L₁, the generality absolutist asserts that his ‘‘everything’’ includes absolutely everything. And let us suppose that indeed it does. He is not deluding himself with that word. Now suppose he reflects on the Burali-Forti paradox and ends up speaking the new language L₂, with the new words ‘‘pure set₂’’ and ‘‘ordinal₂’’. Of course, this new language also has a word ‘‘everything’’. Call this word ‘‘everything₂’’. And of course, our generality absolutist maintains that everything₂ is absolutely everything. The switch to a new language has not cured him of his generality absolutism (especially since he might not have noticed the shift in meaning).
The compelling question is this: Does (or did) ‘‘everything1’’ include, or apply to, the new ordinals, the items to which the new word ‘‘ordinal2’’ applies? The whole point of the exercise is that the word ‘‘ordinal’’ in the language L2 applies to an item, an object, that is isomorphic to all of the ordinals1. Call this L2 object $\Omega_2$. Then $\Omega_2$ itself, as well as $2\Omega_2$, $(\Omega_2)^2$, $(\Omega_2)^{\Omega_2}$, etc. are all perfectly good ordinals in L2. But none of those are ordinals1, nor does the language L1 recognize well-ordering types that long. But are those items nevertheless included in ‘‘everything1’’? When, before the shift to L2, the generality absolutist asserted that by everything, he meant absolutely everything, did this include $\Omega_2$, $2\Omega_2$, $(\Omega_2)^2$, $(\Omega_2)^{\Omega_2}$, etc.? There are two possibilities. Suppose, first, that it does not. So the English word ‘‘everything’’ or, indeed, the phrase ‘‘absolutely everything’’ itself shifts its meaning in the transition to the new language, in exactly the way that the words ‘‘pure set’’ and ‘‘ordinal’’ do. In other words, ‘‘everything2’’ includes much more stuff than ‘‘absolutely1 everything1’’ did. For example, it includes the ordinals2 to which ‘‘everything1’’ does not apply: $\Omega_2$, $2\Omega_2$, $(\Omega_2)^2$, $(\Omega_2)^{\Omega_2}$, etc. In other words, ‘‘absolutely everything’’ has that same extendibility that ‘‘set’’ and ‘‘ordinal’’ have. I submit that this would be at least a moral victory for the generality relativist. It is what she has been trying to say all along. The relativist might note: ‘‘All right, sure, I grant that my first-order variables range over ‘absolutely
everything’, as that term is used in my present language, right now. But we both know a way to reflect on the use of certain words, by considering the Burali-Forti paradox, and we know that this reflection will result in a new language, and its phrase ‘absolutely everything’ covers a lot more than our current phrase does. That is enough relativity for me.’’ I take it that this option is a non-starter for the generality absolutist.5 So now suppose that the original word ‘‘everything’’ in L1 does include, or apply to, the items to which the new word ‘‘ordinal2’’ applies, and also the items to which ‘‘ordinal3’’ applies (or will apply once L3 is generated), etc. After all, how can we change what exists (the furniture of the universe) just by speaking a new language? So the ontology of L1, ‘‘absolutely1 everything1’’, covers $\Omega_2$, $2\Omega_2$, $(\Omega_2)^2$, $(\Omega_2)^{\Omega_2}$, etc. It is just that L1 does not recognize these things, these ordinals2, to be ordinals. They are not ordinals1. They have members2, but not members1. They are well-orderings2, but not well-orderings1. On this option, the words ‘‘ordinal’’, ‘‘set’’, ‘‘member’’, and ‘‘well-order’’ shift their meanings (and extensions) through the language changes, but the word ‘‘everything’’ does not. In L2, the phrase ‘‘every ordinal’’ is (much) more extensive than the same phrase in L1, but the difference is chalked up to a change in the word ‘‘ordinal’’, not to a change in the word ‘‘every’’. One might think that this, too, gives a small moral victory to the generality relativist. Speaking in L1, the relativist might say: ‘‘All right, sure, I grant that the phrase ‘absolutely all ordinals’ does cover absolutely all ordinals (all ordinals1), which are all the ordinals there are. But we both know a way to reflect on our use of certain words, and we know that this reflection will result in a new language in which the word ‘ordinal’ has a meaning much like ours, applying to well-ordered set-like things, but the new word has a much expanded range. 
That is enough relativity for me.’’ In fact, however, this is no victory for the relativist at all. When the speaker asserted in L1 that ‘‘all ordinals’’ includes absolutely all ordinals, she only means all ordinals1. What else can she mean? She has no other word for ordinal. No language contains a word whose extension includes all and only the items that are in the extension of the word ‘‘ordinal’’ in some language or other that we might speak one day, unless that word is ‘‘everything’’. For all we know, we might end up speaking a language in which the word ‘‘ordinal’’ means what we mean by ‘‘grasshopper’’. Recall that Williamson treats sentences and terms ‘‘as sequences of sounds or marks, which have no intrinsic interpretation’’. More seriously, perhaps, no language can contain a word whose extension includes all and only the items that are in the extension of the word ‘‘ordinal’’ in a language obtained from L1 by iterated applications of the paradoxes. The reason is that we could do a Burali-Forti reflection on that language. It seems that we have only limited control over what our words might mean as the language morphs into other languages as we reflect on its expressive resources or do the Burali-Forti thing—even though the reflections into new languages are regular and well-ordered, indexable by ordinals.

In other words, the absolutist insists on a language-independent meaning and extension of the phrase ‘‘absolutely all’’. In present terms, the absolutist maintains that the phrase ‘‘absolutely all’’ is not indefinitely extensible. The word ‘‘ordinal’’, however, remains our paradigm case of an indefinitely extensible notion. According to Williamson, no one, generality relativist or generality absolutist, can insist on a fixed, language-independent meaning of the word ‘‘ordinal’’ or the phrase ‘‘well-ordering’’, a meaning that encompasses everything we might ever mean by the term. There is no language-independent concept or extension for what we might call an ‘‘ordinal-like entity’’, a term that covers every possible well-order-type encompassed into a von Neumann-style set. There is similarly no language-independent concept for ‘‘pure-set-like entity’’. Of course, the relativist is not happy with the asymmetry between ‘‘everything’’ and ‘‘ordinal’’ or ‘‘ordinal-like entity’’.6 Both camps hold that ‘‘pure set’’ and ‘‘ordinal’’ are open-ended, changing as we reflect on the language’s expressive resources and continue previous operations. The relativist suggests that we should similarly regard locutions like ‘‘exists’’, ‘‘object’’, ‘‘entity’’, and ‘‘is self-identical’’ as indefinitely extensible. If we try to conceive of an absolute plurality of existing things, we have the mathematical-linguistic capacity to consider the collection of just those things, and on it goes from there. The relativist might add that wholes inherit the indefinite extensibility of their parts. Every ordinal and every pure set is an entity. Since the former are indefinitely extensible, the word ‘‘entity’’ is just as indefinitely extensible as the mathematical notions. In response, the Williamson-style absolutist claims that this just begs the question. 
His view just is that ‘‘everything’’ is not indefinitely extensible, and so there is an asymmetry between ‘‘everything’’ and ‘‘ordinal’’, like it or not. Consider the second language L2. Cardinality considerations (after a fashion) entail that most of the ordinals2 cannot be pure sets1. After all, $\Omega_2$ has a powerset2, and by the axiom of choice, this pure set2 can be well-ordered2 and has an ordinal2. And this ordinal has a powerset which can be well-ordered and so has an ordinal2. Powerset, replacement, and union take us higher and higher. So what are these ordinals2 in the original language L1? Since most of them are not pure sets, they (or something in their transitive closures under membership1) are urelements, or, as we should say, urelements1. It follows that on this combination of views, Vann McGee’s [1997] urelement set axiom, stating that the urelements form a set, fails and fails badly in any language which has a set theory like ZFC and a quantifier ranging over absolutely everything. The word ‘‘everything’’ in L1 must apply to the ordinals2, the ordinals3, the ordinals4, etc. All of those are urelements in L1, and all but a few of those are urelements in L2, etc. I might add that it also follows that in L1, it is not the case that each proper class (so to speak) is equinumerous to every other. The ordinals3 are much more than the ordinals2, and these are much more than the ordinals1, but all three are proper classes in L1. The point is general. In any language that we happen to be
speaking, or in any language that we can speak, the pure sets and the ordinals are a tiny, tiny portion of the vast, wild, and wonderful world of absolutely everything. Well, I do not have a problem with this, if the generality absolutist does not. Set theories with urelements, where the urelements form a proper class (so to speak) are not well developed.7 The program might put a damper on the use of the pure iterative hierarchy, V, as the realm for model-theoretic semantics. There are interpretations of the language that are not isomorphic to a pure set, since they are (much) larger than any pure set. For model theory, V is not big enough—not by a long shot.8

4. Zermelo’s better idea

Let me turn to a different suggestion for handling notions like ‘‘ordinal’’, ‘‘cardinal’’, and ‘‘set’’. It is based on the original Russell-Dummett notion of indefinite extensibility, where the language is kept fixed. Words do not shift in meaning. Zermelo [1930] presents a version of second-order ZFC with urelements, in pretty much its contemporary form. If there are no urelements, then each model of second-order ZFC is isomorphic to a rank $V_\kappa$, in which $\kappa$ is a strong inaccessible. In what follows, I use ‘‘inaccessible’’ for ‘‘strong inaccessible’’. Zermelo [1930, 1233] proposes an axiom stating the existence of ‘‘an unbounded sequence’’ of models of the theory, each larger than its predecessors. Each such model has subsets (like the collection of ordinals in the model) which are not members of the structure. Within the given model, these subsets are proper classes, and act as indefinitely extensible properties. However, [w]hat appears as an ‘‘ultra-finite non- or super-set’’ in one model is, in the succeeding model, a perfectly good, valid set with both a cardinal number and an ordinal type…To the unbounded series of Cantor ordinals there corresponds a similarly unbounded…series of essentially different set-theoretic models.

‘‘Model’’ might not be the best word here. Zermelo refers to what set-theorists call ‘‘intended models’’ or ‘‘standard models’’. The structures he has in mind are inaccessible ranks in an iterative hierarchy with urelements. His proposed axiom of extendibility entails that the inaccessibles are unbounded in the universe. In present terms, then, Zermelo’s proposed axiom is that the series of inaccessible cardinals is itself indefinitely extensible. Each inaccessible is a Definite collection, but any inaccessible, or indeed any set of inaccessibles, gives rise to further, larger inaccessible sets, cardinals, and ordinals. So there is no set of all inaccessible cardinals, or all such ranks. Parsons [1977] and Hellman [1989, Chapter 2], [2002] provide accounts of set theory similar to Zermelo’s, but they think in terms of possible structures (or
possible extensions of the universe). One can follow Hellman [2002] and interpret Zermelo that way, by inserting boxes and diamonds into his text at crucial places.9 However, Zermelo’s own language seems to take the talk of inaccessible ranks (‘‘models of set theory’’) at face value. As noted, Zermelo says that for each inaccessible rank, there is a larger. In contrast, Hellman [1989, 72] makes a modal assertion: (MOD-EXT) $\Box\forall X\forall f\,[(\wedge\mathrm{ZF}^2)^X[\in/f] \rightarrow \Diamond\exists Y\exists g((\wedge\mathrm{ZF}^2)^Y[\in/g] \;\&\; (X,f) < (Y,g))]$, where ‘‘$\wedge\mathrm{ZF}^2$’’ is the conjunction of the axioms of second-order set theory; ‘‘$(\wedge\mathrm{ZF}^2)^X[\in/f]$’’ is the restriction of those axioms to the monadic, second-order variable X, substituting the binary relation variable f for the membership symbol ‘‘$\in$’’; and $(X,f) < (Y,g)$ says that $\forall x(Xx \rightarrow Yx)$ and that f is the restriction of g to X. So (MOD-EXT) is a sentence in a second-order language, with no non-logical terminology. It asserts that, necessarily, for every model

concept of well-ordering. This series reaches no true completion in its unrestricted advance, but possesses only relative stopping-points, just those [strong inaccessibles] which separate the higher model types from the lower. Thus the set-theoretic ‘‘antinomies’’, when correctly understood, do not lead to a cramping and mutilation of mathematical science, but rather to an, as yet, unsurveyable unfolding and enriching of that science.

In present terms, Zermelo claims that the proper classes of a given inaccessible rank become sets ‘‘in the succeeding model’’, that is, in the next inaccessible rank. This may not be correct, depending on how many urelements there are, and depending on what ‘‘succeeding’’ means. Let $\lambda$ be the third inaccessible, and suppose that there are exactly $\lambda$-many urelements. In Zermelo’s ‘‘canonical’’ iteration, the ‘‘first’’ model of set theory would be the first inaccessible rank. In that structure, the urelements are a proper class. The urelements are also a proper class in the second and third inaccessible rank. Only at the fourth inaccessible rank do the urelements become a set. The proofs of the main theorems in Zermelo [1930] indicate that he was aware of this. So perhaps we need to be careful about what ‘‘succeeding’’ means: if M is an inaccessible rank in Zermelo’s (canonical) hierarchy, then define ‘‘the succeeding model’’ of M to be the smallest inaccessible rank in which the universe of M is a set. Of course, we would then need an axiom to the effect that each inaccessible rank has a succeeding model. Zermelo’s key assertion, the extendibility principle, is that the proper classes of a given inaccessible rank become sets in the succeeding model. This amounts to a thesis that the proper classes of a given rank become sets in some later model. Moreover, his ‘‘first development theorem’’ manipulates the collection of urelements using ordinary set-theoretic constructions, such as replacement and union. This presupposes, or seems to presuppose, that for each inaccessible rank, there is a ‘‘succeeding model’’ in the hierarchy, in the foregoing sense of ‘‘succeeding’’. The extendibility principle is thus an analogue of McGee’s urelement set axiom. It entails that there is a model whose rank is a cardinal $\kappa$ and which contains a function whose domain is a member of $V_\kappa$ and whose range includes every urelement. 
By replacement and separation, the corresponding rank contains a set containing all and only the urelements. Zermelo does explicitly envision inaccessible ranks in which the urelement set axiom is false (e.g., p. 1227), but the extendibility principle, as interpreted here, is true ‘‘eventually’’. This version of the urelement set axiom is the crucial item separating the Zermelo program from Williamson’s generality absolutism. According to Williamson, the language we are speaking at the moment (and any language we will be speaking at any foreseeable future moment) contains a phrase, ‘‘absolutely everything’’, whose extension is indeed absolutely everything. As we saw, if one accepts Williamson’s account of indefinite extensibility, this ‘‘absolutely everything’’ includes the sets and ordinals in languages obtained
from this language by Burali-Forti reflection (and any other way we have of talking about new things under new names and new concepts). To speak roughly, for Williamson, ‘‘absolutely everything’’ is a proper class par excellence. There can be no language which contains a word ‘‘set’’ governed by the axioms of ZFC (with urelements) and which contains a set of absolutely everything. Such a set would have no powerset. We would encounter genuine contradiction if we did a Burali-Forti reflection on that language, producing a successor language. So for Williamson, it is decidedly not true that the proper classes (so to speak) in the present language become sets in some future language or some successor to that language. They can never form a set (in the sense of ZFC). To relate the Zermelo program to the present generality absolutism, we focus attention on Zermelo’s own language, the language in which the program is described and the lovely theorems proved. What are we to make of Zermelo’s own talk of ‘‘models’’, ‘‘normal domains’’ (i.e., inaccessible ranks), ‘‘order types’’, and the like? It seems clear that the Zermelo and Williamson programs agree that there is, or can be, unrestricted first-order quantification, or at least first-order quantification over all inaccessible ranks (i.e., all standard models of set theory). In other words, there can be unrestricted first-order quantification over all sets, which amounts to unrestricted first-order quantification over all objects.11 For what it is worth, I do not see a problem with this quantification. I have realist tendencies, and would like to go as far as I can with them. The Zermelo program invokes bound variables ranging over an indefinitely extensible notion (i.e., ‘‘standard model of second-order ZFC’’ or ‘‘inaccessible rank’’). Dummett claims that such locutions must be interpreted with Heyting semantics, and sanction only intuitionistic logic. We need not broach that issue here. 
The more interesting question, I think, concerns unrestricted second-order quantifiers. Like Williamson, Richard Cartwright [1994] argues for unrestricted first-order quantification, but unlike Williamson, Cartwright does not say anything about higher-order languages.12 Maybe we can get by without variables that range over unrestricted pluralities, or whatever one wants to call them, in the Zermelo program. To be frank, I am not sure. Within each inaccessible rank M, the second-order variables range over proper classes in M, but with Zermelo’s extendibility principle (i.e., his version of McGee’s urelement set axiom), the ‘‘proper classes’’ of M are all sets in a later inaccessible rank $M'$ (and in all subsequent models beyond that). So the second-order quantifiers in M can be given their ordinary set-theoretic treatment in $M'$. That is, the second-order quantifiers have the usual sort of range in $M'$. So the later structures have the wherewithal to give the semantics for previous models. So within each rank, the second-order quantifiers are not really unrestricted, just as Zermelo claims. At the outset, Zermelo writes that we do apply ordinary set-theoretic concepts to the realm of inaccessibles:

…we call a ‘‘normal domain’’ a domain consisting of ‘‘sets’’ and ‘‘urelements’’ which satisfies the [ZF] system with regard to the ‘‘basic relation’’ $a \in b$. We will treat ‘‘domains’’ of this kind, their ‘‘elements’’, their ‘‘subdomains’’, their ‘‘sums’’ and ‘‘intersections’’ exactly like sets, and thus according to the general set-theoretic concepts and axioms, for there is no means of distinguishing them from sets in any way which essentially matters. However, we will always denote them as ‘‘domains’’ and not as ‘‘sets’’ in order to distinguish them from the ‘‘sets’’ which are the elements of the domain in question. (second emphasis mine)

The ‘‘set-theoretic axioms’’ include separation and replacement, both of which are second-order in Zermelo’s formulation. So if we are to take this talk literally, we will have to countenance unrestricted second-order quantification. But if we do recognize such quantifiers, what do they range over? Absolutely proper classes? We are in danger of giving back all the gains made by the program. For what it is worth, the axiom of separation is not a problem. Suppose that x is a set or ‘‘domain’’ and we wish to use separation on x using a property or formula $\Phi$. By extendibility, x is a member of an inaccessible rank M. We can thus apply separation on x using $\Phi$ in the ‘‘domain’’ of M. This can be carried out within any model larger than M. This generalizes a bit. For the most part, the Zermelo plan, and his text, only requires second-order sentences that are restricted to a particular rank. With the extendibility principle in place, one need not invoke unrestricted second-order quantification in stating the plan or proving the categoricity theorems. For the most part. The axiom of replacement is a tougher nut to crack. Stay tuned. We can get a bit beyond the restriction by invoking a Russell-style systematic ambiguity. One can state that a given higher-order sentence is true in each inaccessible rank of set theory. For example, we note that each inaccessible rank satisfies the separation and replacement axioms. A statement like that can be interpreted as a single, unrestricted first-order sentence in the background language that Zermelo uses. This is not to say that unrestricted higher-order quantification has no place. I’ll close with some brief, and tentative, remarks on the desirability of unrestricted second-order quantification. 
Williamson argues that we need something like unrestricted second-order quantification to give the semantics and define the consequence relation for ordinary, first-order logic, when unrestricted quantifiers are allowed (see also Rayo and Williamson [2004]). In particular, the straightforward, Tarskian [1935] definition of logical consequence invokes a variable ranging over all interpretations. Thus, to define consequence this way, we need some way of specifying ‘‘interpretations’’ or ‘‘extensions’’ of the predicate letters. Some of these ‘‘interpretations’’ and ‘‘extensions’’ are proper classes, such as ‘‘all ordinals’’ or ‘‘all pure sets’’. Perhaps, but to return to the Rolling Stones, we might find that we get what we need, thanks to the completeness theorem. A well-known
argument, due to Kreisel [1967], provides some confidence that the ordinary conception of logical consequence, which is restricted to models whose domain is a set, gets it right at least for first-order languages. Admittedly, this is not a completely comfortable resolution. If we do not have unrestricted second-order quantification, or something like that, we may not be able to state the correctness of the ordinary conception of logical consequence for languages with unrestricted quantification. In other words, we can’t say what Kreisel’s argument proves. The ‘‘informal rigor’’ may be too informal. McGee [1997] provides another interesting application of unrestricted second-order quantification. Let M1 and M2 be any interpretations that satisfy second-order ZFC and the urelement set axiom, and assume that the universe of M1 is equinumerous with the universe of M2. McGee shows that the pure sets of M1 are isomorphic to the pure sets of M2. This result is part of an argument for the determinacy of the language of pure mathematics. The relevant cases for that argument are those in which the quantifiers of both M1 and M2 range over absolutely everything. McGee concludes that it might be indeterminate what singular terms like ‘‘$\pi$’’, ‘‘$\omega$’’, and ‘‘the natural numbers’’ refer to, but his result entails that every sentence of pure mathematics has a unique and determinate truth value. In other words, there is indeterminacy of reference but no indeterminacy of truth value. McGee’s conclusion, of course, turns on the fact that pure mathematics can be interpreted in the pure iterative hierarchy, and that isomorphic structures are equivalent. Notice, incidentally, that McGee’s program for determinacy is not available to an advocate of the Williamson program, if the latter includes Williamson’s account of indefinite extensibility. As we saw, the urelement set axiom fails on that program (miserably). 
McGee’s result is not available on the Zermelo program either, unless unrestricted second-order quantification is allowed. To get McGee’s philosophical conclusion, one has to apply his categoricity theorem to ‘‘interpretations’’ that are the ‘‘size’’ of the universe, the entire iterative hierarchy. Restricted second-order quantification will not allow this. Even if unrestricted second-order quantification is allowed, however, Zermelo himself seems to demur from fixing truth values the way that McGee suggests. Zermelo proposes a ‘‘general hypothesis that every categorically determined domain can also be interpreted as a set in some way, i.e., can appear as an element of’’ an inaccessible rank (p. 1232).13 This principle is inconsistent with a categorical characterization of the entire iterative hierarchy, and such a characterization is a key element in McGee’s framework. Nevertheless, McGee’s categoricity theorem is a piece of mathematics that neither Zermelo, nor anyone else, is in a position to challenge. However, McGee’s philosophical interpretation of his result depends on the presence of a quantifier which, as a matter of logic, ranges over everything (in all interpretations). I suspect that Zermelo would balk at this. One cannot set the range of quantifiers as a matter of logic.

An advocate of the Zermelo program can interpret McGee’s theorem as a systematic ambiguity, applying to each model of set theory. Let P, Q be any two models of set theory whose universes are equinumerous. Let M1 be an interpretation within P of second-order ZFC plus the urelement set axiom in which the first-order quantifiers range over all of P, and let M2 be an interpretation of second-order ZFC plus the urelement set axiom within Q in which the first-order quantifiers range over all of Q. Then the pure sets of M1 are isomorphic to the pure sets of M2. However, this does not establish the determinacy of mathematical language. It only entails that each sentence of pure mathematics has the same truth value no matter how it is interpreted within P (or within Q). Indeed, let P, $P'$ be two models of set theory whose universes are not equinumerous. Then there may be sentences that are true in the indicated models of ZFC in P but false in the intended models of ZFC in $P'$. Consider, for example, a sentence stating that there is a largest inaccessible. This is true in some models in the Zermelo hierarchy, but false in others (see also Hellman [2002]). One area that seems to require, or at least strongly suggest, full, second-order replacement is Zermelo’s extendibility principle. Informally, the principle is that the inaccessibles are unbounded in the universe. As Zermelo puts it, to each inaccessible rank, there is a higher inaccessible rank (with the same urelements). It follows that for each inaccessible, there is a greater inaccessible. Suppose that the inaccessibles form an $\omega$-sequence: $\kappa_0, \kappa_1, \ldots$ One would think that the informal extendibility principle would yield an inaccessible $\kappa_\omega$ greater than all of those. However, the existence of such an inaccessible does not follow from the extendibility principle alone, at least as it is formulated so far. 
If the universe consists of the union of the ranks corresponding to $\kappa_0, \kappa_1, \ldots$, then the inaccessibles are unbounded in the universe, and so we cannot derive the existence of any more. The replacement axiom, applied to the universe, gives us what we want here. The enumeration $\kappa_0, \kappa_1, \ldots$ amounts to a function from $\omega$ to the inaccessibles. Replacement entails that there is a set whose members are the $\kappa_0, \kappa_1, \ldots$ By the extendibility principle, this set $\{\kappa_0, \kappa_1, \ldots\}$, and its union, is a member of an inaccessible rank. This amounts to the existence of $\kappa_\omega$. Extendibility would then yield the existence of $\kappa_{\omega+1}, \kappa_{\omega+2}, \ldots$ Then replacement would yield the existence of $\kappa_{2\omega}$, etc. One way to avoid unrestricted second-order variables would be to invoke a first-order replacement scheme, the move typically made in first-order set theory. That is, we take, as axioms, each formula obtained from the replacement axiom by substituting the second-order variable with a formula from the first-order language. This ploy falls prey to the compelling arguments against schemes in Shapiro [1991, Chapter 5]. What do the instances of the scheme have in common? Do we have to justify them individually? If we are going to adopt replacement, the honest move is to go for the full, second-order version.14
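
The interplay between replacement and extendibility in the passage above can be set out step by step. What follows is only a sketch of that reasoning, writing $\kappa_n$ for the n-th inaccessible; the letters F and s are labels introduced here for illustration, and identifying the least inaccessible obtained in the final step with $\kappa_\omega$ is an assumption about how the indexing is meant to continue.

```latex
\begin{align*}
& F : \omega \to \text{inaccessibles}, \qquad F(n) = \kappa_n
  && \text{(the given enumeration)} \\
& \exists s\, \forall y\, \bigl( y \in s \leftrightarrow \exists n \in \omega\; y = \kappa_n \bigr)
  && \text{(replacement applied to } F \text{)} \\
& \textstyle\bigcup s = \sup_{n < \omega} \kappa_n
  && \text{(union of a set of ordinals)} \\
& \exists \kappa\, \bigl( \kappa \text{ inaccessible} \;\wedge\; s \in V_\kappa \bigr)
  && \text{(extendibility)} \\
& \kappa_n < \kappa \ \text{ for every } n < \omega
  && \text{(} \kappa_n \in s \in V_\kappa \text{ and } V_\kappa \text{ is transitive)}
\end{align*}
```

Only the second step calls on replacement applied to the universe as a whole. Without it, the fourth step has no set s to which extendibility can be applied.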

Recall Zermelo’s ‘‘general hypothesis’’ that every ‘‘categorically determined domain can also be interpreted as a set in some way’’. He applies it to the issue at hand here: from each infinite sequence of different [inaccessible ranks] with a common basis [i.e., with the same urelements], which are such that of any two one always contains the other as a canonical segment, there arises…a categorically determined domain of sets which again can be extended to [an inaccessible rank]. Thus, to each categorically determined totality of [inaccessible cardinals], there follows a greater such number, and the series of ‘‘all’’ [inaccessible cardinals] is unbounded in the same way as the number series itself. Thus to each transfinite index there corresponds in a one-to-one fashion a determinate [inaccessible cardinal]. (pp. 1232–1233)

The last sentence in this passage provides a work-around for our problem concerning replacement and second-order logic. Zermelo proposes ‘‘the existence of an unbounded sequence of [inaccessible ranks] as a new axiom of ‘meta-set theory’.’’ In effect, the principle states that for each ordinal $\alpha$, there is a unique inaccessible cardinal $\kappa_\alpha$. This stronger extendibility principle is first-order: it states that the inaccessibles are isomorphic to the ordinals. To be sure, our workaround is not a replacement for the full replacement axiom (pardon the pun). Suppose, for example, that there is an $\omega$-sequence of Mahlo cardinals. We would need replacement to get the existence of a set that contains the members of this sequence. But at least we can state Zermelo’s extendibility principle, with its intended strength, in the relevant first-order language.15 In any case, the Zermelo program does not, by itself, rule out unrestricted second-order quantification, and I have no desire to do so either, just because I am at a loss to see how to understand it. Let the flower bloom, if it can. There is, however, one consideration that might militate against unrestricted second-order quantification. We are not quite finished with the Burali-Forti paradox.16 Consider, again, the language used in the Zermelo program, the language in which we speak about the unbounded models of set theory. In this language, there is a first-order formula that says that a given set is a von Neumann ordinal (in one of the inaccessible ranks). The statement that a set is well-ordered under membership is a straightforward, first-order formula, and a pure set is a von Neumann ordinal if it is transitive and well-ordered under membership. Or, to invoke standard theorems, a set is an ordinal if it is transitive and all of its members are transitive. So let $\Omega(x)$ be the property of being a von Neumann ordinal. 
This is an absolutely proper class, in the sense that there is no model in the Zermelo hierarchy that contains all of the $\Omega$’s (as per the usual Burali-Forti reasoning). But we note that the $\Omega$’s are themselves well-ordered under membership. A fortiori, there is no model of set theory with an ordinal of order-type $\Omega$. Well,
that’s life. As noted in §2 above, however, the object language of the Zermelo program has the wherewithal to define relations characterizing well-orderings of type $\Omega+1$, $2\Omega$, $\Omega^\Omega$, etc. Can we do transfinite recursions and inductions as long as those ‘‘well-orderings’’? Well, why not? Nevertheless, we cannot, on pain of contradiction, allow models that contain ordinals isomorphic to these well-orderings. Zermelo’s claim that the proper classes become sets in other, larger ranks does not go that far. It cannot apply to properties definable in the very language he is speaking when he says things like this. We have here a strengthened Burali-Forti paradox. For what it is worth, the reasoning can be resisted if we reject unrestricted higher-order quantification outright. To get the extended Burali-Forti reasoning started, one notes that the $\Omega$’s are well-ordered under membership. This cannot be expressed in the first-order language. Although we can state that a given set is well-ordered under membership in the indicated first-order language, the $\Omega$’s do not form a set (in any rank). That’s the point. Let X be a monadic second-order variable, and let R be a binary relation variable. Then there is a straightforward second-order formula, with no non-logical terminology, that states that the X’s are well-ordered under R: $\forall x\,\neg Rxx \;\&\; \forall x\forall y\forall z((Rxy \,\&\, Ryz) \rightarrow Rxz) \;\&\; \forall Y((\exists x\,Yx \,\&\, \forall x(Yx \rightarrow Xx)) \rightarrow \exists y(Yy \,\&\, \forall z(Yz \rightarrow (z = y \vee Ryz))))$.
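
In a finite toy case, the content of this formula can be checked by brute force. The sketch below is an illustration only and forms no part of Zermelo's or Hellman's apparatus: the function names, the finite `domain`, and the representation of X as a Python set and R as a set of pairs are all assumptions made here. The loop over every nonempty sub-collection Y does the work of the second-order quantifier, and that is exactly the step with no first-order counterpart when the X's do not form a set.

```python
from itertools import chain, combinations


def nonempty_subsets(xs):
    """Return an iterator over every nonempty sub-collection Y of xs."""
    items = list(xs)
    return chain.from_iterable(
        combinations(items, n) for n in range(1, len(items) + 1)
    )


def is_well_ordered(domain, X, R):
    """Brute-force check of the three conjuncts on a finite toy universe.

    domain: finite set the first-order variables range over (hypothetical)
    X:      subset of domain, standing in for the monadic variable X
    R:      set of pairs (x, y), read as Rxy
    """
    # Conjunct 1: R is irreflexive.
    irreflexive = all((x, x) not in R for x in domain)
    # Conjunct 2: R is transitive.
    transitive = all(
        (x, z) in R for (x, y) in R for (w, z) in R if y == w
    )
    # Conjunct 3: every nonempty Y included in X has an R-least member.
    least_exists = all(
        any(all(z == y or (y, z) in R for z in Y) for y in Y)
        for Y in nonempty_subsets(X)
    )
    return irreflexive and transitive and least_exists
```

On the three-element domain {0, 1, 2} with R the usual less-than relation, the check succeeds. With R empty and X = {0, 1} it fails, since the pair has no least member. The number of sub-collections surveyed doubles with each new element of X, which gives some feel for how much the quantifier over Y is asked to do.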

However, compactness considerations entail that there is no adequate formulation of the general notion of well-ordering in a first-order language (see Shapiro [1991, Chapter 5, §5.1.3]). So if the object language used to carry out the Zermelo program is first-order, then the reasoning behind the extended Burali-Forti paradox cannot get started. We cannot even state that the property $\Omega$ is well-ordered under membership. Intuitively, to say that $\Omega$ is well-ordered, we have to say something about all of its sub-properties, namely, that each has a least element. This is exactly what we cannot do. To be sure, one can easily define the relations corresponding to $\Omega+1$, $2\Omega$, etc. Each of these can be characterized with a first-order formula with two free variables. Also, one can give explicit definitions that may amount to transfinite recursions over these relations. But in a first-order language, we cannot state, much less prove, that the relations are well-orderings. Without this, we cannot do transfinite inductions over those orderings, and we cannot show that the aforementioned explicit definitions are well-defined. Admittedly, this feels like a cheat, especially in light of my longstanding defense of second-order logic. Intuitively, it seems manifest that $\Omega$ is well-ordered, or, better, that the $\Omega$’s are well-ordered. Giving up that intuition is a bitter pill to swallow, at least for me. Thus my ambivalence. I might add that the straw we are trying to grasp is flimsy. Since the ordinals (and $\Omega$ itself) are all transitive, the only way a collection (or class or property) of
them could fail to be well-ordered would be for it to fail to be well-founded. Suppose that we did have a descending o-sequence of ordinals (i.e., of V’s): a1, a2, a3…, where a22a1, a32a2, etc. By the extendibility principle, a1 has to be a member of an inaccessible rank Vk. By transitivity, a2, a3… are all in Vk. So Vk itself violates foundation. This is a contradiction, since by hypothesis, Vk satisfies full, second-order ZFC. So using the first-order resources of Zermelo’s own language, we can rule out the possibility that the V’s are not well-founded, and thus we can rule out the possibility that they are not well-ordered. If we are not allowed unrestricted second-order quantification, all that we cannot do, it seems, is say that the ordinals are well-ordered, even though it is quite clear that they are, and that, intuitively, one can prove that they are (if only we could state the theorem). The inability to formulate an unrestricted notion of well-ordering does seem to prevent the long transfinite inductions—technically. But this runs against intuitions. Well, so does the Burali-Forti reasoning. To sum up, if unrestricted higher-order quantification can be made coherent, then I see no reason why it should not be invoked, and we will have to live with the strengthened Burali-Forti phenomenon. But that is a big ‘‘if’’. Clearly, there are tradeoffs involved in the use of unrestricted higher-order quantification. It seems that Queen is frustrated, and the older and wiser Rolling Stones are right. We can’t always get what we want. Thanks to the genius of the early set theorists, however, it looks like we can get what we need. It is a fact of life in philosophy that some intuitions must be given up. The trick is to figure out which ones those are. The issue of unrestricted higher-order quantification is internal to the generality absolutist camp. 
I submit that whether there is unrestricted second-order quantification or not, the prospects for generality absolutism are good.

Notes

1. Thanks to Geoffrey Hellman here.
2. Boolos continues, ‘‘If one admits that there are proper classes at all, oughtn’t one to take seriously the possibility of an iteratively generated hierarchy of collection-theoretic universes in which the sets which ZF recognizes play the role of ground-floor objects? I can’t believe that any such view of the nature of ‘∈’ can possibly be correct. Are the reasons for which one believes in [proper] classes really strong enough to make one believe in the possibility of such a hierarchy?’’
3. The argument might require a (global) choice principle. It depends on the exact formulation of the notion of indefinite extensibility. Suppose that we define a property P to be indefinitely extensible if, for every Definite collection C of P’s, there is an object c such that Pc but not Cc. Then choice is needed in the above argument. Recall, however, that Dummett writes that each indefinitely extensible concept has a ‘‘principle of extension’’ that takes any definite totality t of objects each of which has P, and produces an object that also has P, but is not in t. Similarly, recall the clause in Russell’s definition: ‘‘there are some properties such that, given any class of terms all having such a property, we can always define a new term also having the property in question’’. If these assertions are taken (more or less) literally, then the ability to ‘‘produce’’ or ‘‘define’’ the new term, given any Definite collection of them, is part of the notion of indefinite extensibility. If so, then choice is not needed in the above argument. The ‘‘principle of extension’’ does the choosing.
4. Thanks to Timothy Bays here.
5. This was confirmed in conversation with a few notable generality absolutists.
6. I am indebted to Geoffrey Hellman here. Kit Fine made a similar suggestion.
7. One notable exception is Menzel [1986], who proposes a set theory in which the urelements do not form a set. I do not think that particular system is of much help here, however. Menzel suggests that there is a single, fixed range of ordinals, or well-ordering types. These are called ‘‘real ordinals’’ and are not identified with pure sets like the von Neumann ordinals. According to Menzel’s set theory, there is a set of all real ordinals, but no set of all von Neumann ordinals, even though the real ordinals are equinumerous with the von Neumann ordinals. The replacement principle is restricted to functions whose domain is a pure set. There is also a powerset of the set of all real ordinals, and by Zermelo’s theorem, this powerset can be well-ordered (as can its powerset). So Menzel’s set theory allows for well-ordered sets that are much larger than any ordinal.
8. The absolutist can safely restrict model-theoretic semantics to the pure sets (V) if there is a reflection principle to the effect that if an argument is invalid, then there is a pure-set model in which its premises are true and its conclusion false. If the language is first-order, such a reflection principle is an immediate consequence of the Löwenheim-Skolem theorem. If the language is higher-order, the reflection principle has ramifications concerning large cardinals (see Shapiro [1991, Chapter 6]).
9. Hellman [2002] provides a rational reconstruction of Zermelo’s [1930] program, and does not make exegetical claims concerning the real Zermelo. Hellman’s character is called ‘‘Zermelo*’’, which we might read ‘‘Zermelo superstar’’. Thanks to Hellman for clarification of several matters.
10. I am at a loss to understand the phrase ‘‘as yet’’ in the last sentence of this passage (which is also the last sentence of the article, except for a brief acknowledgment). Are we to look forward to the day when we can ‘‘survey’’ the entire iterative hierarchy? Perhaps the problem is in what Zermelo means by ‘‘unsurveyable’’.
11. Thanks to Timothy Williamson here.
12. In the opening paragraph, Cartwright does use the plural construction, and perhaps plural quantification: ‘‘the natural numbers, the pure sets, and the trees in the garden – all of them, along with any other objects there are – can simultaneously be the values of the variables in a first-order language.’’ I am not sure that he needs locutions like this.
13. Thanks to Timothy Williamson for pointing out this passage.
14. I do not know how to rigorously formulate this ‘‘general hypothesis’’, especially if unrestricted second-order quantification is not allowed. One possibility would be to invoke the schemes themselves (as in Feferman [1991]). This gives us some of the expressive power of second-order languages, but (perhaps) without a need to invoke proper classes or some other ranges to the higher-order variables. The issues would take us too far afield here.
15. The issue concerning unrestricted second-order quantification has an interesting analogue in Hellman’s modal program. Invoking the metaphor of possible worlds, Hellman allows second-order quantification within each world. This is the counterpart of Zermelo’s use of second-order separation and replacement within each inaccessible rank. For Hellman, the ‘‘proper classes’’ in each world are sets in another, larger possible world. Ditto for Zermelo. So for Hellman, there is no world that houses every possible ordinal, or every possible set. So there is no analogue to absolutely proper classes. In a sense, Hellman’s ‘‘quantifier’’ □∀x covers all objects in all worlds. In contrast, if X is monadic, the locution □∀X covers the proper classes of each world, but each of those is a set in another world. There are no such things as what may be called ‘‘absolutely proper classes’’, collections (like V or Ω) that are ‘‘too large’’ to be in any one world. The analogue of absolutely unrestricted second-order quantification is indeed ruled out. Hellman thus has a direct counterpart of the above issue concerning an ω-sequence of inaccessible ranks. In response, he proposes a replacement scheme (Hellman [1989, 78]). Although Hellman does not invoke it, an analogue of Zermelo’s stronger extendibility principle is also available. One can state in the allowed formal language that, necessarily, for each ordinal α, it is possible for there to be a sequence of inaccessibles of length α.
16. Thanks to Graham Priest here.

References

Boolos, G. [1998], ‘‘Reply to Charles Parsons’ ‘Sets and classes’’’, in G. Boolos, Logic, logic, and logic, Cambridge, Massachusetts, Harvard University Press, 30–36.
Cartwright, Richard L. [1994], ‘‘Speaking of everything’’, Noûs 28, 1–20.
Dummett, M. [1991], Frege: Philosophy of Mathematics, Cambridge, Massachusetts, Harvard University Press.
Dummett, M. [1993], The Seas of Language, Oxford, Oxford University Press.
Feferman, S. [1991], ‘‘Reflections on incompleteness’’, Journal of Symbolic Logic 56, 1–49.
Hellman, G. [1989], Mathematics Without Numbers, Oxford, Oxford University Press.
Hellman, G. [2002], ‘‘Maximality vs. extendability: reflections on structuralism and set theory’’, in D. Malament (editor), Reading Natural Philosophy, La Salle, Illinois, Open Court, 335–361.
Kreisel, G. [1967], ‘‘Informal rigour and completeness proofs’’, Problems in the Philosophy of Mathematics, edited by I. Lakatos, Amsterdam, North Holland, 138–186.
McGee, Vann [1997], ‘‘How we learn mathematical language’’, Philosophical Review 106, 35–68.
Menzel, Christopher [1986], ‘‘On the iterative explanation of the paradoxes’’, Philosophical Studies 49, 37–61.
Parsons, C. [1977], ‘‘What is the iterative conception of set?’’, Logic, Foundations of Mathematics and Computability Theory, edited by R. Butts and J. Hintikka, Dordrecht, Holland, D. Reidel, 335–367; reprinted in P. Benacerraf and H. Putnam (editors), Philosophy of Mathematics, second edition, Cambridge, Cambridge University Press, 1983, 503–529; and in C. Parsons, Mathematics in Philosophy, Ithaca, New York, Cornell University Press, 1983, 268–297.
Priest, G. [2002], Beyond the Limits of Thought, second edition, Oxford, Oxford University Press.
Rayo, A. and T. Williamson [2004], ‘‘A completeness theorem for unrestricted first-order languages’’, J. C. Beall and M. Glanzberg, editors, Liars and Heaps, Oxford, Oxford University Press, forthcoming.
Resnik, M. [1988], ‘‘Second-order logic still wild’’, Journal of Philosophy 85, 75–87.
Russell, B. [1906], ‘‘On some difficulties in the theory of transfinite numbers and order types’’, Proceedings of the London Mathematical Society 4, 29–53; reprinted in Bertrand Russell, Essays in Analysis, London, George Allen and Unwin Ltd., 1973, 135–164.
Schimmerling, Ernest [2001], ‘‘The ABC’s of mice’’, Bulletin of Symbolic Logic 7, 485–503.
Shapiro, S. [1991], Foundations Without Foundationalism: A Case for Second-order Logic, Oxford, Oxford University Press.
Shapiro, S. [1993], ‘‘Modality and ontology’’, Mind 102, 455–481.
Tarski, A. [1935], ‘‘On the concept of logical consequence’’, Logic, Semantics and Metamathematics, by A. Tarski, Oxford, Clarendon Press, 1956, 417–429.
Williamson, T. [1994], Vagueness, London and New York, Routledge Publishing Company.
Williamson, T. [1998], ‘‘Indefinite extensibility’’, in Johannes L. Brandl and Peter Sullivan, editors, New Essays on the Philosophy of Michael Dummett, Grazer Philosophische Studien 55, 1–24.
Williamson, T. [2003], ‘‘Everything’’, this volume.
Zermelo, E. [1930], ‘‘Über Grenzzahlen und Mengenbereiche: Neue Untersuchungen über die Grundlagen der Mengenlehre’’, Fundamenta Mathematicae 16, 29–47; translated as ‘‘On boundary numbers and domains of sets: new investigations in the foundations of set theory’’, in From Kant to Hilbert: A Source Book in the Foundations of Mathematics, Volume 2, edited by William Ewald, Oxford, Oxford University Press, 1996, 1219–1233.
